Archive for November, 2009

Linked vs. Embedded Template Images

Hey, it has been a while since I wrote a blog entry.  I just spent a week visiting a customer site and getting familiar with a major form deployment.  Lots of learning happened in both directions.  And lots of stuff for me to report back on at this blog — starting today.

Images and PDF sizes

We do not support linked images from PDF files.  Remember, the "P" in PDF means "Portable".  If the file has references to external content, it isn’t exactly portable.  Another issue with linked images is that a PDF file with an external reference cannot reliably embed a digital signature.

In your XFA template you have the choice to link or embed images.  But the final PDF will always have the images embedded — even if the template referenced them with a link.

Given that, what factors do you consider when you choose whether to embed or link images in your XFA templates?  Choosing between the two has size implications in the generated PDF.  Here is a bit of detail on how they get processed:

Embedded Images

Embedded images are stored in the XFA template XML as a base64-encoded value.  As with all base64 encoded binary data, the size expands by a factor of 4/3.  When Adobe Reader renders a dynamic form, it extracts the image data from the template and draws it to the screen.  If a template embeds the same image multiple times, we carry all copies of the duplicated image in the template and consequently, inside the PDF.

Linked Images

When we create a PDF from an XFA template with linked images, the images are stored in a PDF resource area.  We create an indexed name for the image based on the image file reference.  There are two efficiencies gained here:

  1. The images are stored in binary format — not base64 encoded
  2. Multiple references to the same image are reconciled to a single copy of that image in the PDF

So clearly, if you are including images, and especially if you are including multiple copies of the same image in your XFA form, your final PDF will be smaller when you include those images as links instead of by embedding.
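
To put rough numbers on that, here is a minimal sketch of the arithmetic. The function names are made up for illustration, and the estimate only counts image payload: it ignores any compression the PDF may apply and all other overhead.

```javascript
// Size in bytes of nBytes of binary data once base64 encoded (with padding).
function base64Length(nBytes) {
    return 4 * Math.ceil(nBytes / 3);
}

// Embedded: every copy is carried in the template, each one base64 encoded.
function embeddedPayload(nBytes, nCopies) {
    return nCopies * base64Length(nBytes);
}

// Linked: repeated references are reconciled to a single binary copy.
function linkedPayload(nBytes) {
    return nBytes;
}
```

For a hypothetical 30000-byte logo used on 5 pages, embeddedPayload(30000, 5) gives 200000 bytes, while linkedPayload(30000) stays at 30000.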

Working with multiple datasets

One of the formfeed blog commenters ("mo") asked about preserving data during an import operation.   I gave her (him?) a flippant reply with some hand-waving about saving/restoring data before/after import. Then I tried it myself and discovered it was not nearly as easy as I thought.

Here’s the problem description:  Your data arrives in two separate data files.  You need to import them both into your form.  Problem is that importing new data replaces existing data.  Loading the second data file will discard the data from your first import. 

Let’s set up a specific example — Suppose my data looks like:

  <set1> … </set1>
  <set2> … </set2>

We want to be able to load set1 and set2 from different data files.

There are a couple of solutions to this problem.  But first, some review on dataset handling within XFA/PDF forms.  Normally form data gets stored under $data, which is a shortcut to xfa.datasets.data.  If the root node of your form is "multidata", then your data appears under xfa.datasets.data.multidata.

The XML hierarchy looks like this:

<xfa:datasets xmlns:xfa="">
  <xfa:data>
    <multidata>
      <set1> … </set1>
      <set2> … </set2>
    </multidata>
  </xfa:data>
</xfa:datasets>

When Acrobat performs a data import, it replaces the <xfa:data> element. But it *appends to* any other datasets. 

Solution 1: Preserve/Restore data before/after import

Suppose that during import you could arrange your data to look like:

<xfa:datasets xmlns:xfa="">
  <xfa:data>
    <multidata>
      <set2> … </set2>
    </multidata>
  </xfa:data>
  <set1> … </set1>
</xfa:datasets>

In this case, <set1> would be preserved and only <set2> would be replaced.  Then after the import is complete,  you’d move <set1> back where it belonged.  The way to control the import is to use the Acrobat script function: importXFAData();  Here’s the outline of the script to import set2:

  1. Copy set1 data to be a child of xfa:datasets
  2. Remove the set1 subform
  3. Call importXFAData() to load set2
  4. Move the set1 data back under <multidata>
  5. Re-add the set1 subform

There are a number of tricky parts:

  • When importXFAData() is successful, it causes a remerge.  When remerge happens, any script commands after the call to importXFAData() will not execute.  The workaround is to perform steps 4 and 5 in a separate form:ready script.
  • importXFAData() does not return a status.  You have no way of knowing if the user cancelled.  If they did cancel, you need to restore set1 without depending on the form:ready script.
  • If the user is running in Reader, then importXFAData() will throw an error.  We need to catch this error and restore set1 data.
  • If the user imports data from the Acrobat menu (Forms/Manage Form Data/Import Data) then all your clever script won’t run and set1 data will get cleared.  You need to figure out how to remove this option from the Acrobat menu.
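
The outline and the first three tricky parts suggest a control flow roughly like the one below. This is a minimal sketch, not the script from the sample form. importSet2, stashSet1 and restoreSet1 are hypothetical names, and oDoc stands in for the Acrobat document object so the flow can be exercised outside Acrobat. It relies on the behaviour described above: a successful import causes a remerge and nothing after the call runs, so still being in the script after importXFAData() is taken to mean the import did not happen.

```javascript
// oDoc:   object exposing importXFAData() (assumed: the Acrobat doc object)
// oHooks: hypothetical helpers supplied by the form
//         stashSet1()   - steps 1 and 2: copy set1 under xfa:datasets,
//                         remove the set1 subform
//         restoreSet1() - steps 4 and 5: move set1 back, re-add the subform
function importSet2(oDoc, oHooks) {
    oHooks.stashSet1();
    try {
        oDoc.importXFAData();          // step 3
    } catch (e) {
        // For example, running in Reader: the call throws, so undo the stash.
        oHooks.restoreSet1();
        return "failed";
    }
    // Still running, so no remerge happened. Assume the user cancelled
    // and restore now rather than waiting for form:ready.
    oHooks.restoreSet1();
    return "cancelled";
}
```

On a successful import this function never gets to return. Steps 4 and 5 then have to come from the form:ready script, which could call the same restore helper.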

Note that this specific example assumes you are loading data in the order set1 then set2.  The form could be coded more generally to load the data in any order.  It would just be a bit more complicated.  You’d need to move both set1 and set2 and then after the load you’d figure out which one(s) need to be moved back.

Here is a sample form.  Here is sample set2.xml data you can load.  Have a look at the button click and form:ready events for all the gory details.

Solution 2: Bind set1 outside of xfa:data

Instead of temporarily arranging your data so that set1 is under <xfa:datasets>, you could permanently arrange your data this way.  In the binding expression for the set1 subform, specify "!set1", which is a shortcut for xfa.datasets.set1.  Now whenever you import data for set2, it will leave set1 untouched.  However, this introduces a new problem.  Whenever you import new set1 data you will end up with multiple copies of set1.  You need a form:ready script that will delete all but the last copy.  This also means that the data file holding set1 needs to include the <xfa:datasets> element so that it can correctly specify the location for set1.

Here is a sample file with set1 data and set2 data. The script to trim back the extra copies of set1 data is found in the multidata form:ready event.
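
The "delete all but the last copy" logic can be sketched on its own. This is an illustration rather than the script in the sample form: findStaleCopies is a hypothetical name, and it works on a plain array of objects with a name property instead of the real children of xfa:datasets. It assumes the last copy in document order is the newest, since import appends.

```javascript
// aChildren: array of objects with a name property, in document order
//            (a plain-array stand-in for the children of xfa:datasets)
// Returns the entries named sName that should be deleted: all but the last.
function findStaleCopies(aChildren, sName) {
    var aMatches = [];
    for (var i = 0; i < aChildren.length; i++) {
        if (aChildren[i].name === sName) {
            aMatches.push(aChildren[i]);
        }
    }
    return aMatches.slice(0, -1);   // keep only the last copy
}

// Usage: two copies of set1, only the first one is reported as stale.
var aStale = findStaleCopies(
    [{name: "data"}, {name: "set1", id: 1}, {name: "set1", id: 2}],
    "set1");
```

In the form itself, each stale node would then be removed from the datasets in the form:ready event.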

My personal preference would be to use Solution 2.  The script is simpler.  The user can use the menu commands for loading the data.  But this approach might not be possible if your data is bound to a schema.

Editable Floating Fields V2

This is a follow-up to a previous blog entry that you probably should read first.

After doing the first version of the floating field editor, I tackled some issues/enhancements:

  1. A bug in the script where if you tabbed out and tabbed back in, the editor stopped working.
  2. Enforce constraints associated with the referenced fields
  3. When the editor does not have focus, display the field values using the formatted values of the referenced fields
  4. When the editor has focus, display the field values using the edit values of the referenced fields

I have updated the previous sample form — as well as the Editor fragment.

Enforcing Field Constraints

Since the floating fields are all presented inside a single text field, there were originally no constraints on any of the user input.  Now the form will look at the referenced fields and will restrict user input:

  • Respect the max chars constraint of text fields (in the sample, they’re all limited to 10 characters)
  • For numeric fields, limit input to valid numeric characters
  • For choice list fields, limit input to the set of valid choices

Locale-sensitive Numeric Fields

When restricting the set of valid characters for numeric input, it is tempting to just go with the obvious set: [0-9\-\.]  However, for many locales, the radix (decimal) and minus symbols will be different.  In order to know which symbols to use, the form queries the locale definition.  You XML source peepers will be aware of the <localeSet> packet in your XDP files.  This has all the data for the locales that are explicitly referenced on the form.

The symbols are stored in a format that looks like:

<localeSet xmlns="">
   <locale name="de_DE" desc="German (Germany)">
      <calendarSymbols name="gregorian"> … </calendarSymbols>
      <datePatterns> … </datePatterns>
      <timePatterns> … </timePatterns>
      <numberPatterns> … </numberPatterns>
      <numberSymbols>
         <numberSymbol name="decimal">,</numberSymbol>
         <numberSymbol name="grouping">.</numberSymbol>
         <numberSymbol name="percent">%</numberSymbol>
         <numberSymbol name="minus">-</numberSymbol>
         <numberSymbol name="zero">0</numberSymbol>
      </numberSymbols>
      <currencySymbols> … </currencySymbols>
      <typefaces> … </typefaces>
   </locale>
   <locale> … </locale>
</localeSet>

I was able to extract the number symbols with this function:

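function findLocaleNumberSymbols(vRefField) {
    // Defaults, used when the field's locale is not in <localeSet>.
    var oSymbols = {decimal: ".",
                    minus: "-"
                   };
    var vLocale = localeSet[vRefField.locale];
    if (typeof(vLocale) !== "undefined") {
        var vNumberSymbols = vLocale["#numberSymbols"].nodes;

        // Copy each <numberSymbol> into the result, keyed by its name.
        for (var i = 0; i < vNumberSymbols.length; i++) {
            // Assumption: the symbol text is exposed as the node's value.
            oSymbols[vNumberSymbols.item(i).name] =
                vNumberSymbols.item(i).value;
        }
    }
    return oSymbols;
}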

Using the numeric symbols the form is able to more accurately restrict input for numeric fields.
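
As an illustration of how those symbols might feed the input restriction, here is a minimal sketch that builds a character filter from an object shaped like the one findLocaleNumberSymbols() returns. The names escapeForCharClass and buildNumericFilter are hypothetical and are not taken from the sample form. The sketch assumes ASCII digits (it ignores the "zero" symbol) and only restricts which characters may appear, not where they appear.

```javascript
// Escape characters that have a special meaning inside a regex character class.
function escapeForCharClass(sChar) {
    return sChar.replace(/[\\\]\[\^\-]/g, "\\$&");
}

// oSymbols: object with decimal and minus properties.
// Returns a RegExp that matches strings made up only of digits,
// the locale's minus symbol and the locale's decimal symbol.
function buildNumericFilter(oSymbols) {
    return new RegExp("^[0-9" +
                      escapeForCharClass(oSymbols.minus) +
                      escapeForCharClass(oSymbols.decimal) +
                      "]*$");
}

// Usage with the de_DE symbols shown earlier.
var reGerman = buildNumericFilter({decimal: ",", minus: "-"});
var bCommaOk = reGerman.test("-12,5");   // true
var bDotOk = reGerman.test("12.5");      // false
```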

Choice List Fields

The last field in the sample is a reference to a choice list with the American states.  Try out the editing experience here.  It’s pretty cool:

  • Input characters are limited to the set of valid choices
  • As soon as you type enough characters to uniquely identify a state, the rest of the input is completed automatically
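
The two behaviours above can be sketched as a single prefix match. This is an illustration under assumptions, not the code in the editor fragment: completeChoice is a hypothetical name, and the case-insensitive comparison is a choice made for the sketch.

```javascript
// sTyped:   what the user has typed so far
// aChoices: array of valid choice strings
// Returns null when no choice starts with sTyped (reject the input),
// the full choice when exactly one matches (auto-complete),
// and sTyped unchanged while the input is still ambiguous.
function completeChoice(sTyped, aChoices) {
    var sPrefix = sTyped.toLowerCase();
    var aMatches = [];
    for (var i = 0; i < aChoices.length; i++) {
        if (aChoices[i].toLowerCase().indexOf(sPrefix) === 0) {
            aMatches.push(aChoices[i]);
        }
    }
    if (aMatches.length === 0) {
        return null;
    }
    if (aMatches.length === 1) {
        return aMatches[0];
    }
    return sTyped;
}
```

With the choices "Alabama", "Alaska" and "Arizona", typing "Al" is still ambiguous, "Alab" completes to "Alabama", and "X" is rejected.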

Use Formatted and Edit Values

In the updated sample, the editing field now behaves like any other widget.  When you tab in, referenced field values display in their edit format.  When you tab out, the referenced fields display their formatted value.  In the sample you will notice that the currency and date values change when you tab in/out.  This is functionality that happens automatically on normal fields but had to be emulated in script for this sample.

You don’t have to read through all the script to figure out how it works, but it is worth noting that you can access a field value in three different ways:

  • field.rawValue — the canonical value as it is stored in the data
  • field.formattedValue — the value with the display pattern applied
  • field.editValue — the value with the edit pattern applied

Note that if a format or edit picture/mask is not supplied, there are default patterns for numeric and date values.
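
The tab in/out behaviour comes down to picking one of those values based on focus. A minimal sketch, assuming a hypothetical helper named displayText and using a plain object in place of a real field:

```javascript
// oField:    any object exposing editValue and formattedValue
// bHasFocus: true while the editor field has focus
function displayText(oField, bHasFocus) {
    return bHasFocus ? oField.editValue : oField.formattedValue;
}

// Usage with made-up values standing in for a currency field.
var oAmount = {
    rawValue: "1234.5",
    formattedValue: "$1,234.50",
    editValue: "1234.5"
};
var sFocused = displayText(oAmount, true);
var sBlurred = displayText(oAmount, false);
```

Here sFocused picks up the edit value and sBlurred picks up the formatted value.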

Hook up via the enter event

Previously, the editor field tapped into the script object by delegating its initialize, change and exit events.  It now also needs to delegate the enter event:

form1.LetterEdit::change – (JavaScript, client)

form1.LetterEdit::enter – (JavaScript, client)

form1.LetterEdit::exit – (JavaScript, client)

form1.LetterEdit::initialize – (JavaScript, client)
scEditFF.initialize(this, Letter, "#c0c0c0", 10);


There are more constraints that could be enforced, e.g. digits before/after decimal but those