PaperForms (2D) Barcodes with Repeating Subforms


One of our support engineers brought an issue with 2D barcodes to my attention this week.  I was able to help her with a solution, so I thought I'd share it here as well.

These barcodes encode form field data the user has typed in.  When the form is printed (or perhaps faxed) the barcode can be scanned and the form data retrieved.  This workflow was originally envisioned on static forms with fixed amounts of data -- after all, a 2D barcode holds a limited amount of data.  As a result, the design process does not allow the inclusion of repeating subforms in the barcode data.  However, there certainly are cases where we could safely allow some repeating data in a barcode without overflowing the storage capacity.

I'll show you how to fix the problem first, and then if I still have your attention I'll give you the details on how it works.

The Fix

When you add your PaperForm barcode to your form, you also define a collection of fields and subforms to include in the barcode value.  If you add any repeating subforms to this list, then at runtime only the first subform instance will be included in the barcode.  In order to have the barcode include more instances, we need to modify the collection.  In the sample form, I've done this with an initialization script on the barcode field.  When you re-use this script, you need to change the last line so that it points to your collection:

makeManifestRepeat(BC_Collection);

You need to replace "BC_Collection" with the name of the collection used by your barcode.   There is one other detail needed to make this work for Reader 8.  When you add a new instance of a subform, you need to explicitly fire the barcode calculation.  The sample does this in the button click event:

PaperFormsBarcode1.execCalculate();
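
A click handler along the following lines would cover both steps. This is only a sketch, not the sample's actual script: the function name addRowAndRefresh and the stand-in objects are invented so the logic can run outside Reader. The two XFA calls it relies on are instanceManager.addInstance() and execCalculate().

```javascript
// Sketch of a button click handler: add one subform instance, then
// explicitly recalculate the barcode (the extra step needed for Reader 8).
function addRowAndRefresh(instanceManager, barcodeField) {
  var newInstance = instanceManager.addInstance(true);
  barcodeField.execCalculate();
  return newInstance;
}

// Stand-ins so the sketch runs outside Reader. In a form you would pass
// something like Subform1.instanceManager and PaperFormsBarcode1 instead.
var callLog = [];
var fakeInstanceManager = {
  count: 1,
  addInstance: function (merge) {
    this.count += 1;
    callLog.push("addInstance");
    return { index: this.count - 1 };
  }
};
var fakeBarcode = {
  execCalculate: function () {
    callLog.push("execCalculate");
  }
};

addRowAndRefresh(fakeInstanceManager, fakeBarcode);
```

The order matters: the barcode is recalculated only after the new instance exists, so the new row's fields are included in the encoded value.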

How it Works

The UI terminology for the fields included in a barcode is "Collection".  But the grammar calls it a manifest.  The manifest in the sample looks like this:

<manifest name="BC_Collection" id="2ae4d4a5-5e5d-4dba-b50e-e069b91533ce">
  <ref>xfa[0].form[0].bcTest[0].p1[0].Subform1[0].TextField1[0].dataNode</ref>
  <ref>xfa[0].form[0].bcTest[0].p1[0].Subform1[0].TextField2[0].dataNode</ref>
  <ref>xfa[0].form[0].bcTest[0].p1[0].Subform1[0].TextField3[0].dataNode</ref>
</manifest>

The manifest is a list of SOM expressions to the values to be included in the barcode.  Note that our repeating subform explicitly references the first instance: "Subform1[0]".  To make this manifest include all instances, we need to change it to "Subform1[*]":

<manifest name="BC_Collection" id="2ae4d4a5-5e5d-4dba-b50e-e069b91533ce">
  <ref>xfa[0].form[0].bcTest[0].p1[0].Subform1[*].TextField1[0].dataNode</ref>
  <ref>xfa[0].form[0].bcTest[0].p1[0].Subform1[*].TextField2[0].dataNode</ref>
  <ref>xfa[0].form[0].bcTest[0].p1[0].Subform1[*].TextField3[0].dataNode</ref>
</manifest>

 

We could have fixed this by editing the XML source in Designer, but that would be awkward.  Instead, we update the manifest when the form is opened in Reader -- that's the function of the initialization script.
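
The sample's makeManifestRepeat() is not reproduced here, but the string rewrite at the heart of the idea can be sketched in plain JavaScript. The helper name toRepeatingRef is hypothetical, and the sketch works on an array of SOM strings rather than on the live manifest nodes.

```javascript
// Hypothetical helper: rewrite one manifest <ref> SOM expression so that
// the named subform matches every instance instead of only the first.
function toRepeatingRef(somExpr, subformName) {
  var firstInstance = "." + subformName + "[0].";
  var allInstances = "." + subformName + "[*].";
  return somExpr.split(firstInstance).join(allInstances);
}

// The three <ref> values from the sample manifest.
var manifestRefs = [
  "xfa[0].form[0].bcTest[0].p1[0].Subform1[0].TextField1[0].dataNode",
  "xfa[0].form[0].bcTest[0].p1[0].Subform1[0].TextField2[0].dataNode",
  "xfa[0].form[0].bcTest[0].p1[0].Subform1[0].TextField3[0].dataNode"
];

var repeatingRefs = manifestRefs.map(function (ref) {
  return toRepeatingRef(ref, "Subform1");
});
```

Only the index on the repeating subform changes. The other "[0]" indexes (page, field) are left alone, which is why the helper matches on the subform name rather than replacing every "[0]".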

Big and Complex Forms


How Big and Complex can you make your form?

I get asked this question often. Customers or partners develop very complex or large dynamic forms with many pages and large amounts of script. At what point do we cross the line and reach a level of complexity where Reader/PDF is no longer the right tool for the job?

There is no easy answer. The answer will be different for different users. But it is helpful to look at some of the stress points you’ll encounter with large forms.

Note that these notes apply to forms opened in Acrobat/Reader. The stress points for forms rendered on the server are much different.

  1. Number of pages to render
  2. File Size
  3. Script size, complexity and development methodology
  4. Script performance

Number of pages to render

One of the great properties of regular PDF files is that the file open time is constant no matter how large the PDF. The time to open a two thousand page PDF is pretty much the same as for a one page PDF. This is because Reader doesn’t load the whole PDF into memory and doesn’t read the bytes for page <n> until the user navigates to page <n>.

Dynamic XFA/PDF forms offer a different value proposition. The pages are shaped at form open time by the form data. Of course, there are great advantages to dynamic forms. But there are also associated processing costs. At form open time the entire form definition is loaded into memory. The entire set of data is loaded and merged with the form template. Reader performs enough of the layout to determine how many pages will be rendered. Then when you navigate to page <n>, Reader renders that page from the in-memory structures.

How many pages can Reader handle for a dynamic document? This depends on the complexity of the template. I’ve seen five page forms that take forever to open. I’ve seen a hundred page form open in a second. The limit is more related to the density/complexity of template and data rather than the actual number of pages.

Some form authors attempt to reduce file open time by hiding inactive pages. This strategy was effective in reducing form open time in Acrobat/Reader 7. But since Reader 8.1, when the form open algorithm was improved, the ‘page hiding’ strategy no longer makes a significant difference.

File Size

Dynamic XFA/PDF forms tend to be smaller than static documents because of their template nature. For example, a hundred-page static PDF carries a hundred pages of PDF mark-up, whereas the dynamic equivalent could be one page of XFA mark-up that gets replicated a hundred times when merged with data. The latter will be a much smaller file.

Nonetheless, dynamic documents can grow to the point where they begin to stress your system. Reading and parsing the document happens very quickly, even for very large templates. However, the size of the template becomes more of a factor when there are security components in play. Operations such as certification, Reader extensions and signatures perform comparison operations on ‘before-and-after’ versions of the form. The cost of these comparisons is proportional to the size of the template.

So while there is no absolute threshold on file size, you will find the threshold is lower for certified/extended/signed forms.

Script size and complexity and development methodology

I have seen XFA/PDF files with tens of thousands of lines of JavaScript. Given that there is no debugger, you have to be pretty persistent to create this amount of script. If your big script library is well written, it may perform well enough, but the stress comes with the maintenance of the script:

  • When you change the script, do you have the ability to rigorously test your changes? When you modify fields or subforms, will your script still work? Do you have test collateral that gives you code coverage for all the edge cases in your script? Do you have some form of automated testing? QTP anyone?
  • Is your script maintainable? Or is the code ‘write-only’? Unless you have been disciplined in the creation of your library, you will have longer term maintenance issues when a new developer comes along to update an existing form.
  • When you encounter problems with your script, are you able to isolate the problem when you ask for help? Your friends in our support organization are much better at solving problems with small, simple forms than with large, complex ones. If your script is modular and isolated into components then you’ll be able to ask for help much more easily than if your script is a tangled mess.
  • When you change script, do you preserve previous versions of your form? You need the ability to roll back changes.

Again, there are no absolutes here, but if you want/need to write lots of script, you need to have the associated discipline in your development environment to make it maintainable.

Script performance

Large amounts of script do not necessarily imply poor performance. But poorly written script of any amount can kill form performance. A script that traverses the entire form hierarchy will have a run time that is proportional to the number of objects in the form. As the form grows, the script slows down. There are many 'best practices' for writing efficient script. It is very important to pay close attention to the contents of frequently executed loops.
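
To make the loop advice concrete, here is a small sketch that uses plain objects as a stand-in for a form hierarchy (it does not touch the XFA object model, and all names in it are made up). It counts how many nodes get visited when a lookup is repeated inside a loop versus resolved once up front.

```javascript
// Build a stand-in form hierarchy: one root with `rows` child nodes.
function buildForm(rows) {
  var children = [];
  for (var i = 0; i < rows; i++) {
    children.push({ name: "Row" + i, value: i, children: [] });
  }
  return { name: "form", children: children };
}

// Depth-first search by name; counter.visited records the work done.
function findByName(node, name, counter) {
  counter.visited += 1;
  if (node.name === name) {
    return node;
  }
  for (var i = 0; i < node.children.length; i++) {
    var hit = findByName(node.children[i], name, counter);
    if (hit) {
      return hit;
    }
  }
  return null;
}

var form = buildForm(1000);

// Slow pattern: a full traversal inside a frequently executed loop.
var slowCounter = { visited: 0 };
var slowTotal = 0;
for (var i = 0; i < 10; i++) {
  slowTotal += findByName(form, "Row999", slowCounter).value;
}

// Faster pattern: resolve the node once, then reuse the reference.
var fastCounter = { visited: 0 };
var cachedRow = findByName(form, "Row999", fastCounter);
var fastTotal = 0;
for (var j = 0; j < 10; j++) {
  fastTotal += cachedRow.value;
}
```

Both loops compute the same total, but the first visits ten times as many nodes, and that gap widens as the form grows.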

Conclusion

Before you make a big investment in a form, make sure you consider the alternatives. You might be better off with a Flash form or an AIR application.  If you choose Reader/PDF, the maximum size and complexity of your form depend primarily on your own tolerances.  You need to decide whether the runtime experience is responsive enough.  You need to decide if you are getting the return on investment for your cost to develop and maintain the form.

Designer ES2 Macros


Hey, it has been a while since I posted.  Lots of stuff on my plate.  Occasionally they make me do real work around here. 

Have you installed ES2 Designer yet?  If you have, then there is a new experimental feature that you can play with.  Macros.  I want to tell you all about them.  But first the caveats:

  1. ES2 Designer macros are an experimental (prototype) feature
  2. ES2 Designer macros are not officially supported
  3. Macros developed for Designer ES2 are not guaranteed to work in the next release of Designer.

We figured out the architecture for this feature fairly late in the ES2 development cycle.  We think macros have lots of potential, but we didn't have the resources to finish the job in ES2.  So we've put it out there as a prototype.  You get to kick the tires.  Tell us if you like it.  Let us know what enhancements are needed.

Overview

Design Macros provide an external plugin interface to Designer, so that 3rd parties like partners or customers can extend the functionality of Designer. Some examples:

  • Rename a field or subform and update all the script references
  • Add metadata to form objects (<extras> or <desc>)
  • Find all scripts that consist entirely of comments
  • Add an onEnter script to all fields

Macro Script

The macro itself consists of a JavaScript file.  The JavaScript in the macro has full access to the template model. (I have a previous blog post that talks about scripting to the template: Template Transformation.)  In short, the scripting knowledge you gained from writing form script transfers nicely to the Designer environment.

In addition to the template DOM, there's an object in the root namespace called "designer" that has methods that you can use to communicate directly with the Designer application.

Flash Dialogs

One of the methods on the designer object allows you to launch a Flash dialog (.SWF) and exchange strings with the dialog.  This allows you to build a custom UI.

Installing a Plugin

To install a plugin:

  1. Create a folder named "scripts" in the Designer install directory.
  2. In the scripts folder, create a folder for your plugin, e.g. "MyPlugin"
  3. In the scripts\MyPlugin folder, create a JavaScript file (this is the actual plugin), e.g. "MyPlugin.js"
  4. Place any SWF files used by the plugin in the same directory.

When Designer starts up, it searches its install directory for a folder called scripts. If this folder is found, Designer will then search each child folder of scripts looking for *.js files.

Every *.js file found (and there can be more than one in the same folder) will appear as a menu entry under the Tools | Scripts menu. This menu entry on the Tools menu appears only if the \scripts directory exists and there is at least one *.js file in a subdirectory of the \scripts folder.

To run the plugin, select the plugin you want to run from the Tools | Scripts menu.

The Designer API

Here follows a description of the methods that are available on the designer object.

/**
* Output a message to the log window in Designer.
* (Note that the log window in designer will not emit any
* duplicate strings)
* @param sMsg The text to push to the log window
*/
void designer.println(sMsg)


/**
* returns the object (or objects) currently selected
* on the canvas or in the hierarchy dialog. If nothing is
* currently selected, the list returned will be empty.
* @return a nodelist
*/
nodelist designer.getSelection()

 

/**
* Create a new modal dialog window from a provided SWF.
*
* @param sSWF The name of a *.swf file to load.
* Note that the *.swf file must be in
* the same directory that the plugin is installed in. 
* The sSWF parameter should only contain a file name,
* no path information.
*
* @param nWidth The width of the Flex dialog
* @param nHeight The height of the Flex dialog
*
* @return a string that comes from the
* Flex application.  When the Flex application terminates, it
* can pass back a string. Commonly used to send back its
* closing status, e.g. "OK" or "Cancel"
*/
string designer.showFlexDialog(sSWF, nWidth, nHeight)


/**
* This method writes out a text file (showTextWindow.txt) to
* the system's temporary directory with the content of sText
* then launches the system's default *.txt file editor with
* that file as a parameter.
*
* This method allows a non-modal way of showing output.
* The Flex dialog and the alert dialog are both modal - this
* makes it impossible for a user to interact with the output
* of a plugin at the same time they interact with Designer.
*
* @param sText The text to show in the system's default
* text editor.
*/
void designer.showTextWindow(sText)


/**
* This function creates an XDP data file from the 
* supplied data and will launch a PDF file with that data.
* This allows rich reporting from a plugin script.
* Note that this function looks for an installed version of
* Acrobat, and will not work with Reader.
*
* @param dataPacketString The XML data to be written out
* @param pdfName The base name of the PDF file to display,
* which must be in the plugin directory.
*/
void designer.showXDPinAcrobat(dataPacketString, pdfName);



/**
* This function is used to get data out of the Flex dialog
* invoked by designer.showFlexDialog(). The Flex dialog can
* send data to Designer by calling:
* ExternalInterface.call("setDialogString",
*                          "VariableName", "VariableValue");
*
* If the Flex dialog makes this external call to Designer, then
* once the dialog is dismissed, "VariableName" is available for
* inspection in the plugin through a call to:
* designer.getDialogString();
* In this particular example, the call would be:
* designer.getDialogString("VariableName");
* The return value would be "VariableValue".
*
* @param sFieldName The name of the field to inspect.  
* This field is available for inspection only if the Flex
* application made the appropriate ExternalInterface call.
*
* @return The value of sFieldName or empty if the Flex
* application did not set that value.
*/
string designer.getDialogString(string sFieldName);



/**
* This method is used to push data into the Flex dialog before
* calling designer.showFlexDialog(). If the plugin wants to set
* data inside the Flex dialog, it needs to call
* designer.setDialogString();
* with the data before invoking designer.showFlexDialog().
*
* The Flex application, in turn, needs to call
*
* ExternalInterface.call("getDialogString", "sFieldName")
*
* @param sFieldName The name of the variable to set
*
* @param sValue The value of sFieldName.
*/
void designer.setDialogString(sFieldName, sValue)


/**
* Show a message box in Designer with sMsg as the text.
* @param sMsg The message to display in the message box.
*/
void designer.alert(sMsg);
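
To show how these methods fit together, here is a rough sketch of a macro that round-trips a string through a dialog. It is not the refactor.js sample. The dialog file name "rename.swf" and the string names "oldName" and "newName" are hypothetical, and the sketch assumes the nodelist returned by getSelection() exposes length and item() like other XFA lists. In Designer the designer object is global; it is passed in as a parameter here so the sketch can also run against a stand-in.

```javascript
// Sketch of a macro body built on the designer methods described above.
function promptForNewName(designer) {
  var selection = designer.getSelection();
  if (selection.length === 0) {
    designer.alert("Select a field first.");
    return null;
  }
  var node = selection.item(0);

  // Push the current name into the dialog before showing it.
  designer.setDialogString("oldName", node.name);

  // "rename.swf" is a hypothetical dialog stored next to the macro.
  var status = designer.showFlexDialog("rename.swf", 400, 200);
  if (status !== "OK") {
    return null;
  }

  // Read back the value the dialog stored with "setDialogString".
  var newName = designer.getDialogString("newName");
  designer.println("Renaming " + node.name + " to " + newName);
  return newName;
}

// Stand-in for Designer so the sketch can run on its own.
var fakeStrings = {};
var fakeLog = [];
var fakeDesigner = {
  getSelection: function () {
    var field = { name: "TextField1" };
    return { length: 1, item: function (index) { return field; } };
  },
  setDialogString: function (key, value) { fakeStrings[key] = value; },
  getDialogString: function (key) { return fakeStrings[key] || ""; },
  showFlexDialog: function (swf, width, height) {
    // Pretend the user typed a new name and pressed OK.
    fakeStrings.newName = "CustomerName";
    return "OK";
  },
  println: function (msg) { fakeLog.push(msg); },
  alert: function (msg) { fakeLog.push("ALERT: " + msg); }
};

var chosenName = promptForNewName(fakeDesigner);
```

Because showFlexDialog() is modal, the call to getDialogString() only runs after the dialog has been dismissed, which matches the order described in the API notes.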

An Example

Over time I hope to share a bunch of sample macros.  But to get started, here is a fairly simple macro that should whet your appetite.

The macro refactor.js will rename a field object.  In addition to renaming the field, it will find all occurrences of that field in scripts and will rename it there as well. It uses refactor.swf as a UI to modify the scripts.

Step 1.

Install the macro.  Place the .js and .swf files below the Designer install.  On my system this looked like:

[Screenshot: refactorScreen2]

Step 2.

Open a PDF in Designer ES2.  I used this file to test.

Step 3.

Select the field to rename

Step 4.

Launch the macro:

[Screenshot: refactorScreen1]

Step 5.

When the Flash dialog pops up, enter a new name for the field. Then use the buttons to find/replace the field name in scripts.

[Screenshot: refactorScreen3]

When the dialog is dismissed, the form will be updated with all the changes.

Here is the rest of the collateral you'll need (right click to download):

 

Whew.  That's a lot to absorb.  I hope to offer some more samples soon.

Null Data Handling


We run into scenarios where we want to control how null values are represented in our XML instance data.  There are three different ways that null data can be represented in XML data:

  1. Exclude the element.  If the value has no data, do not write it out to the XML file
  2. Use XML Schema's nil attribute:
    <spouseName xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                xsi:nil="true"/>
  3. Write out an empty element.  e.g. <spouseName/>

Null Handling in XML Schema

When determining which strategy to use, we look to the form's XML Schema for hints:

  1. If a leaf element (no child elements, no attributes) is marked as optional (minOccurs="0") then it will be excluded when null.
  2. If an element in the schema is marked as nillable="true", then the data will be marked with the xsi:nil attribute.
  3. In all other cases we write out null values as empty elements.
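
The three rules can be sketched as a small decision function. This is an illustration only: the shape of the decl object is invented, and checking the optional-leaf rule before the nillable rule is an assumption based on the order in which the rules are listed.

```javascript
// Picks the null representation from schema hints. `decl` is a plain
// object standing in for a schema element declaration (hypothetical shape).
function nullStrategy(decl) {
  var isLeaf = !decl.hasChildren && !decl.hasAttributes;
  if (isLeaf && decl.minOccurs === 0) {
    return "exclude"; // rule 1: optional leaf elements are left out
  }
  if (decl.nillable) {
    return "xsi"; // rule 2: nillable elements get xsi:nil="true"
  }
  return "empty"; // rule 3: everything else becomes an empty element
}

var optionalLeaf = nullStrategy({
  minOccurs: 0, nillable: false, hasChildren: false, hasAttributes: false
});
var nillableLeaf = nullStrategy({
  minOccurs: 1, nillable: true, hasChildren: false, hasAttributes: false
});
var requiredLeaf = nullStrategy({
  minOccurs: 1, nillable: false, hasChildren: false, hasAttributes: false
});
var optionalGroup = nullStrategy({
  minOccurs: 0, nillable: false, hasChildren: true, hasAttributes: false
});
```

Note that an optional element with child elements falls through to "empty", because rule 1 applies only to leaf elements.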

For the rest of this blog entry I'll describe exactly how the XFA processor deals with null data, and offer some tips on how you can further customize the behaviour.  In particular, I'd like to show how to control null handling at the data group level.  

Let's look at the problem using a specific example.  Here's a sample purchase order schema:

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:element name="purchaseOrder" type="poType"/>
<xsd:complexType name="poType">
<xsd:sequence>

  <xsd:element name="emptyItem" minOccurs="0">
   <xsd:complexType>
    <xsd:sequence>
     <xsd:element name="partNum" type="xsd:string"/>
     <xsd:element name="description" type="xsd:string"/>
     <xsd:element name="quantity" type="xsd:positiveInteger"/>
     <xsd:element name="unitPrice" type="xsd:float"/>
    </xsd:sequence>
   </xsd:complexType>
  </xsd:element>

  <xsd:element name="excludeItem" minOccurs="0">
   <xsd:complexType>
    <xsd:sequence>
     <xsd:element name="partNum" type="xsd:string"/>
     <xsd:element name="description" type="xsd:string"/>
     <xsd:element name="quantity" type="xsd:positiveInteger"/>
     <xsd:element name="unitPrice" type="xsd:float"/>
    </xsd:sequence>
   </xsd:complexType>
  </xsd:element>

  <xsd:element name="xsiItem" minOccurs="0">
   <xsd:complexType>
    <xsd:sequence>
     <xsd:element name="partNum" type="xsd:string"/>
     <xsd:element name="description" type="xsd:string"/>
     <xsd:element name="quantity" type="xsd:positiveInteger"/>
     <xsd:element name="unitPrice" type="xsd:float"/>
    </xsd:sequence>
   </xsd:complexType>
  </xsd:element>

  <xsd:element name="comment1" type="xsd:string" minOccurs="0"/>
  <xsd:element name="comment2" type="xsd:string" nillable="true"/>

</xsd:sequence>
</xsd:complexType>
</xsd:schema>

 

Note the two comment elements at the bottom.  When comment1 is saved to data, it will be excluded when null.  When comment2 is saved to data it will be annotated with the xsi:nil attribute.

But what some form authors want is for empty groups (purchase order items in our example) to be excluded when their contents are null.  I can show you how -- but we are now getting into the deep end.

Data Description

When we save a PDF/XDP file that is based on a schema or sample XML or WSDL connection, we generate a data description.  This data description is really a distilled version of the schema.  It takes the form of a sample XML annotated with special namespaced attributes.  The data description for the sample above looks like:

<xfa:datasets
      xmlns:xfa="http://www.xfa.org/schema/xfa-data/1.0/">
<dd:dataDescription
      xmlns:dd="http://ns.adobe.com/data-description/"
      dd:name="purchaseOrder">
  <purchaseOrder>

   <emptyItem dd:minOccur="0">
    <partNum/>
    <description/>
    <quantity/>
    <unitPrice/>
   </emptyItem>

   <excludeItem dd:minOccur="0">
    <partNum/>
    <description/>
    <quantity/>
    <unitPrice/>
   </excludeItem>

   <xsiItem dd:minOccur="0">
    <partNum/>
    <description/>
    <quantity/>
    <unitPrice/>
   </xsiItem>

   <comment1 dd:minOccur="0" dd:nullType="exclude"/>
   <comment2 dd:nullType="xsi"/>

  </purchaseOrder>
</dd:dataDescription>
</xfa:datasets>

 

Note how the comment elements are annotated.  We use dd:nullType to specify the null handling behaviour:

  1. dd:nullType="exclude" don't write out elements where the value is null.  Note that this option may be used only if the element is marked in the schema as optional (minOccurs="0").
  2. dd:nullType="xsi" Use the XML schema nil attribute.  As described in the W3C definition:

    "XML Schema: Structures introduces a mechanism for signaling that an element should be accepted as valid when it has no content despite a content type which does not require or even necessarily allow empty content. An element may be valid without content if it has the attribute xsi:nil with the value true. An element so labeled must be empty, but can carry attributes if permitted by the corresponding complex type."
  3. dd:nullType="empty" (the default) save null values as empty elements
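
As an illustration of what each setting produces, here is a simplified writer for a single leaf element. It is a sketch rather than the XFA processor's serializer: the function name writeElement is made up, an empty string is assumed to count as null, and no XML escaping is done.

```javascript
var XSI_NS = "http://www.w3.org/2001/XMLSchema-instance";

// Serialize one leaf element according to its dd:nullType setting.
function writeElement(name, value, nullType) {
  var isNull = value === null || value === undefined || value === "";
  if (!isNull) {
    return "<" + name + ">" + value + "</" + name + ">";
  }
  if (nullType === "exclude") {
    return ""; // leave the element out entirely
  }
  if (nullType === "xsi") {
    return "<" + name + " xmlns:xsi=\"" + XSI_NS + "\" xsi:nil=\"true\"/>";
  }
  return "<" + name + "/>"; // "empty", the default
}

var excluded = writeElement("comment1", null, "exclude");
var nilled = writeElement("comment2", null, "xsi");
var emptied = writeElement("partNum", null, "empty");
var filled = writeElement("partNum", "A1", "exclude");
```

The nullType setting only matters when the value is null. An element that has a value is written the same way under all three settings.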

The dd:nullType attribute can also be placed on grouping elements.  When it is placed on a grouping element, then the setting gets applied to all children of the group.  While there is nothing in XML schema that will do this for us, we can do it by hand-editing the item elements in XML source view:

<emptyItem dd:minOccur="0" dd:nullType="empty">
...
<excludeItem dd:minOccur="0" dd:nullType="exclude">
...
<xsiItem dd:minOccur="0" dd:nullType="xsi">

And here is the obligatory sample form.  And the sample schema.  The sample form is bound to the sample schema, and has a field that shows the current state of the XML instance data.  Try typing values into the three different partNum fields and the comment fields, and watch how the data reacts according to the instructions in the data description.

The Really Deep End

The disadvantage to hand-editing your data description is that when your schema changes and the data description gets refreshed, your edits will be lost.

There is a way that you can write script to update your data description automatically.  The general technique is described at this blog post.  The script below will update all your data descriptions and for every element that is marked minOccur="0", we'll add nullType="exclude".  Note that the changes will be applied when saving as PDF or when generating a PDF using LiveCycle.

purchaseOrder::initialize - (JavaScript, server)
if (xfa.host.name === "XFAPresentationAgent") {
    var vDataDescriptionList = xfa.datasets.dataDescription.all;
    for (var i = 0; i < vDataDescriptionList.length; i++) {
        var vDD = vDataDescriptionList.item(i);
        var sDD = vDD.saveXML();
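        // Add dd:nullType="exclude" to every optional element that does
        // not already carry a dd:nullType attribute.
        sDD = sDD.replace(
             /dd:minOccur="0"(?!\s+dd:nullType=)/g,
             "dd:minOccur=\"0\" dd:nullType=\"exclude\"");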
        xfa.datasets.nodes.remove(vDD);
        xfa.datasets.loadXML(sDD, false, false);
    }
}
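
The string rewrite can be tried outside the form on a fragment of the data description. The sketch below uses a whitespace-tolerant variant of the pattern, which is my own choice rather than something taken from the sample form: it skips any element that already has a dd:nullType attribute right after dd:minOccur="0". The function name addExcludeToOptional is hypothetical.

```javascript
// Add dd:nullType="exclude" after every dd:minOccur="0" that is not
// already followed by a dd:nullType attribute.
function addExcludeToOptional(xml) {
  return xml.replace(
    /dd:minOccur="0"(?!\s+dd:nullType=)/g,
    'dd:minOccur="0" dd:nullType="exclude"');
}

var before =
  '<excludeItem dd:minOccur="0"><partNum/></excludeItem>' +
  '<comment1 dd:minOccur="0" dd:nullType="exclude"/>';

var after = addExcludeToOptional(before);
```

Running the function a second time leaves the text unchanged, so it is safe if the initialize script fires more than once.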

Linked vs. Embedded Template Images


Hey, it has been a while since I wrote a blog entry.  I just spent a week visiting a customer site and getting familiar with a major form deployment.  Lots of learning happened in both directions.  And lots of stuff for me to report back on at this blog -- starting today.

Images and PDF sizes

We do not support linked images from PDF files.  Remember, the "P" in PDF means "Portable".  If the file has references to external content, it isn't exactly portable.  Another issue with linked images is that a PDF file with an external reference cannot reliably embed a digital signature.

In your XFA template you have the choice to link or embed images.  But the final PDF will always have the images embedded -- even if the template referenced them with a link.

Given that, what factors do you consider when you choose whether to embed or link images in your XFA templates?  Choosing between the two has size implications in the generated PDF.  Here is a bit of detail on how they get processed:

Embedded Images

Embedded images are stored in the XFA template XML as a base64-encoded value.  As with all base64 encoded binary data, the size expands by a factor of 4/3.  When Adobe Reader renders a dynamic form, it extracts the image data from the template and draws it to the screen.  If a template embeds the same image multiple times, we carry all copies of the duplicated image in the template and consequently, inside the PDF.

Linked Images

When we create a PDF from an XFA template with linked images, the images are stored in a PDF resource area.  We create an indexed name for the image based on the image file reference.  There are two efficiencies gained here:

  1. The images are stored in binary format -- not base64 encoded
  2. Multiple references to the same image are reconciled to a single copy of that image in the PDF

So clearly, if you are including images, and especially if you are including multiple copies of the same image in your XFA form, your final PDF will be smaller when you include those images as links instead of by embedding.
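
A back-of-envelope calculation makes the difference visible. The sketch below is an estimate only: it counts image bytes, ignores compression and all other PDF and XML overhead, and the 30,000-byte image size is an arbitrary example.

```javascript
// Base64 turns every 3 bytes into 4 characters (rounded up to a full group).
function base64Length(byteCount) {
  return 4 * Math.ceil(byteCount / 3);
}

// Embedded: every copy is stored, each one base64 encoded.
function embeddedImageBytes(imageBytes, copies) {
  return copies * base64Length(imageBytes);
}

// Linked: one binary copy in the PDF, however many times it is referenced.
function linkedImageBytes(imageBytes, copies) {
  return imageBytes;
}

var logoBytes = 30000; // arbitrary example size
var embeddedTotal = embeddedImageBytes(logoBytes, 5);
var linkedTotal = linkedImageBytes(logoBytes, 5);
```

With five uses of the same image, the embedded estimate is 200,000 bytes against 30,000 for the linked case: the 4/3 expansion from base64 multiplied by the five duplicate copies.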
