Springtech Day 1 - PDF Tags


Crikey, the late morning and afternoon went by in a flash, so sorry that I didn't write an update.

Well, the afternoon session compared Acrobat with some of the other PDF products out there, and we then moved on to the various security features available in Acrobat.

It was a really useful session overall, and I have just one comment to make here about some of the other PDF products out there. We looked at how these other systems create PDF files and at the quality of the finished article. Can I just say how genuinely stunned I was at the quite awful job some of these products did? I'll pick on just one element... Tagging!

PDF files can contain many things. At a minimum, a PDF contains the text, graphics, bookmarks, links and other elements of content that go to make up an electronic document.

In addition to content, PDF files may also include "structure". Structure is the term for a set of instructions that define the logic that binds the content together: the correct reading order, for example, the language being used in the document, and the presence and meaning of significant elements such as figures, lists, tables, and so on.
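To make "structure" a bit more concrete, here is a minimal Python sketch of a tag tree. It is only an illustration of the idea, not how Acrobat or any PDF library stores tags: the Tag class and the reading_order function are names I have made up, and the sample text is invented. The tag names themselves (Document, H1, P, L, LI, Lbl, LBody, Figure) are standard PDF structure types.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Tag:
    """Hypothetical stand-in for one node in a PDF structure tree."""
    kind: str                      # standard tag name, e.g. "H1" or "P"
    text: str = ""                 # content (or alt text, for a Figure)
    lang: Optional[str] = None     # document or element language
    children: List["Tag"] = field(default_factory=list)


def reading_order(tag: Tag) -> Iterator[str]:
    """Yield content depth-first, which is the logical reading order."""
    if tag.text:
        yield tag.text
    for child in tag.children:
        yield from reading_order(child)


# Invented sample document: a heading, a paragraph, a list and a figure.
doc = Tag("Document", lang="en-GB", children=[
    Tag("H1", "Quarterly report"),
    Tag("P", "Sales rose this quarter."),
    Tag("L", children=[
        Tag("LI", children=[Tag("Lbl", "1."), Tag("LBody", "Europe")]),
        Tag("LI", children=[Tag("Lbl", "2."), Tag("LBody", "Asia")]),
    ]),
    Tag("Figure", "Bar chart of sales by region"),  # alt text for the image
])

for line in reading_order(doc):
    print(line)
```

The point is that a screen reader, or any other system, can walk a tree like this and know what each piece of content is, what order to read it in, and what language it is in, without guessing from the page layout.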

Tagging is a vital part of creating a PDF that will be genuinely useful for business, publishing, accessibility and, well, just about everything.


Acrobat has a large number of tag types, over 30 I think. These types identify specific parts and elements of a document and tag them accordingly, so that the document structure is recorded and can be used by other systems. The tags fall into block-level and inline-level elements. Block-level elements are page elements that consist of text laid out in paragraph-like forms, and they are part of a document's logical structure. Inline-level elements mark a span of text within those blocks. Taken together, the tag types are classified as

Container elements
Heading and paragraph elements
Label and list elements
Special text elements
Table elements
Inline-level elements
Special inline-level elements

So, we have a way of expressing all of a document's constituent elements to either Acrobat itself or any other system.

I compared this with what I saw some other tools produce, where every element was just called a Paragraph or, worse, the tool didn't correctly recognise any structure within the document at all. Well, I was really surprised, and not in a good way.
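As a rough illustration of the "everything is a Paragraph" problem, here is a small Python sketch that counts the tag types in a structure tree and flags a suspiciously flat one. It assumes you have already got the tags out of the PDF somehow as nested (tag, children) tuples; the tag_counts and looks_flat functions, the 90% threshold and both sample trees are my own inventions for the example, not the output of any real tool.

```python
from collections import Counter


def tag_counts(tree):
    """Count how often each tag name appears in a (tag, children) tree."""
    kind, children = tree
    counts = Counter([kind])
    for child in children:
        counts += tag_counts(child)
    return counts


def looks_flat(tree, threshold=0.9):
    """Return True if nearly every tag below the root is a plain "P"."""
    counts = tag_counts(tree)
    counts[tree[0]] -= 1           # ignore the root element itself
    total = sum(counts.values())
    return total > 0 and counts["P"] / total >= threshold


# Hypothetical output of a tool that tags everything as a paragraph.
flat = ("Document", [("P", []), ("P", []), ("P", []), ("P", [])])

# Hypothetical output of a tool that records real structure.
rich = ("Document", [
    ("H1", []),
    ("P", []),
    ("L", [("LI", [("Lbl", []), ("LBody", [])])]),
    ("Table", [("TR", [("TH", []), ("TD", [])])]),
])

print(looks_flat(flat))
print(looks_flat(rich))
```

A check like this says nothing about whether the tags are correct, only whether the tool has bothered to distinguish headings, lists and tables from body text in the first place.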

Put it this way: if you have any interest in accessibility or document structure, I'd seriously ask questions about what's happening under the bonnet if you're looking at a non-Adobe PDF tool.

...erm.... rant over :-)

More later this morning as Day 2 commences!
