Flat PDF vs. XFA Form PDFs

A frequent mistake that is made is to assume that, since XFA Forms can be saved as PDFs, they will behave like any other PDF. Truth is: XFA PDFs and flat PDFs are entirely different beasts.

1) About PDF

“PDF” stands for Portable Document Format. Initially, PDFs were meant to be a digital counterpart to printable documents. You can open a PDF, and see the layout exactly as the page designer intended, with pictures and page breaks in the right places, ergonomic page margins, and most noticeably, with the original fancy fonts preserved. This is the original flat PDF.

Flat PDFs contain the page render data – a binary encoding of how the document should visually be drawn on screen or on paper (minus interactive items such as videos and flash animations).

PDF has come a long way since. It really has embraced the idea of “portable document,” the idea of the distribution of a published, polished document, and seeks to be all that printed documents could never be.

You can embed videos, flash animations and 3D spaces, protect them with encryption, limit their usage with DRM solutions (Adobe’s own is called LiveCycle Rights Management), annotate through comments and highlighting, measure elements, digitally sign them, make form fields to be filled in – and make forms that change according to the data inside them. Wow.

2) About forms – AcroForms

The first iteration of interactive form filling came as AcroForms.

At the most basic, an AcroForm is a flat PDF form that has some additional elements – the interactive fields – layered above the flat render, that allow users to enter information, and allow developers to extract data from.

You can create these using Adobe Acrobat, or any third-party PDF creation application that allows creation of PDF forms.

Flat PDFs can be annotated (comments, highlighting, and various other scribbles as desired), as these annotations can be mapped to an {x,y} location on the page.

Flat PDFs can have their pages extracted, as each page is already defined in the render.

Flat PDFs can be linearly optimized, for fast web viewing, which ensure that data for the first page all occurs before the data for the second page, in turn being before the data of the third page, and so on.

Such features are not available to XFA PDFs.

3) About forms – Dynamic XFA Forms

Dynamic forms are based on an XML specification known as XFA, the “XML Forms Architecture”. The information about the form (as far as PDF is concerned) is very vague – it specifies that fields exist, with properties, and JavaScript events, but does not specify any rendering. This allows them to redraw as much as necessary, with subforms repeating on the page, sections appearing and disappearing as appropriate, the ability to pull in form fragments stored in different files, and objects re-arranging as you (the developer) dictate.

This also means that some features of AcroForms and flat PDFs are lost.

XFA Foms cannot be annotated. Reader (or Acrobat) cannot know whether all your custom code may change the layout. As such, without any render data, and a chance that the render data may be drastically altered on the fly, local annotations cannot be implemented. An annotation only has sense at an {x,y} location, but if the item you are annotate changes location, your annotation becomes meaningless, if not misleading.

XFA Forms cannot have their pages extracted. There is no render data to determine pages. Change some data in the form, and the layout of the pages may change drastically. You must “flatten” the PDF before extracting pages, thereby losing interactive properties.

XFA Forms cannot be optimized for fast web viewing. There is no render information. Data at the end of the document may affect displays on the first page of the document.

4) Acrobat is not a word processor

A final common misconception is that you can edit pages in Acrobat as you could in Word. This is not the case. Acrobat focuses on the integrity of the layout – as such, if you have a page with 2 paragraphs, each 5 lines long, that is what Acrobat’s PDF engine will commit to showing.

The editing tools available are provided for minor cosmetic changes – nothing more. It is essential to keep the original documents – Word, PowerPoint, OpenOffice, TIFF or otherwise – for your editing process. Once converted to flat PDF, the intent is that the overall layout should never change.

For the same reason, XFA Forms are in conflict with Acrobat’s editing tools – the latter operate on the render layer, which does not exist in the saved XFA PDF. Such editing tools lose meaning faced with XFA Forms.

I hope this helps clear some of the confusions around what XFA PDFs can and cannot do in Acrobat and other PDF manipulation tools.

2 Responses to Flat PDF vs. XFA Form PDFs

  1. Dimitri says:

    Nice post Tai. Isn’t it amazing though, that there is still confusion on this after how many years now since LiveCycle Designer was added to each box of Acrobat??

    • Tai says:

      Hi Dimitri – thanks. I think more to the point the people who only use it occasionally or lightly won’t necessarily know the difference once they have “saved as PDF” – though I still get power users (using scripting model and everything!) wondering why there are differences, and it is to help them too that I wrote this article :-)