Archive for January, 2009

Populating list boxes

One of the Reader 9 enhancements was a new API call to populate list boxes: field.setItems(). The motivation for the new API is to provide better performance for populating lists.

Prior to Reader 9, the standard way to populate a list box was to call:

field.addItem(displayValue [, boundValue])

for each item in the list.  The new API looks like:

field.setItems(itemListString [, numColumns])

The first parameter is a comma-separated list of values; the second parameter is an integer telling the field how many columns are in the data (defaults to one).  The second parameter is designed for future extensibility, should we someday choose to implement a multi-column list box.

Examples

A call to populate a listbox with currencies might look like:

Currency.setItems(
  "US Dollar,Canadian Dollar,Euro,United Kingdom Pounds");

Or if there were a bound value, it would look like:

Currency.setItems("US Dollar,USD,Canadian Dollar,CAD,Euro,EUR,United Kingdom Pounds,GBP", 2);

Prior to Reader 9, this second variation would have been coded as:

Currency.clearItems();
Currency.addItem("US Dollar", "USD");
Currency.addItem("Canadian Dollar", "CAD");
Currency.addItem("Euro", "EUR");
Currency.addItem("United Kingdom Pounds", "GBP");
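
When the list contents already live in a script array, the string for setItems() can be assembled with a small helper. The following is only a sketch: buildItemList() and the currencies array are hypothetical names, and it assumes that no value contains a comma, since this post does not describe any escaping.

```javascript
// Hypothetical helper: flatten [displayValue, boundValue] pairs into the
// comma-separated string that setItems() expects when numColumns is 2.
// Assumption: no value contains a comma (no escaping is attempted).
function buildItemList(pairs) {
    var flat = [];
    for (var i = 0; i < pairs.length; i++) {
        flat.push(pairs[i][0], pairs[i][1]);
    }
    return flat.join(",");
}

var currencies = [
    ["US Dollar", "USD"],
    ["Canadian Dollar", "CAD"],
    ["Euro", "EUR"],
    ["United Kingdom Pounds", "GBP"]
];
var itemList = buildItemList(currencies);

// In a form script the result would then be handed to the field:
// Currency.setItems(itemList, 2);
```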

Alternative

There is a third method for populating listboxes: binding them to data.  Designer allows you to point your field at a location in your instance data where list box contents will be stored.  While this method has very good performance, it has two disadvantages: a) your data is not always in the correct format for binding, and b) the listbox gets populated from data only during the initial data load.

Performance

If you are using listboxes only casually you probably will not notice the difference in performance between the two methods. But if you are using listboxes intensively, the new method is a life-saver.

I have attached a form where I compare the old performance to the new. On my laptop, I populate a listbox with 500 items in 125 milliseconds using addItem() calls, and in 16 milliseconds using setItems(). Neither of these numbers may seem significant, but we have customers with forms containing many list boxes with many, many entries where the difference in performance is critical.
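
A comparison along these lines could be timed with a pair of functions like the ones below. This is a sketch and not necessarily what the attached form does: timeAddItem() and timeSetItems() are hypothetical names, and "field" stands for the list box (any object exposing clearItems(), addItem() and setItems()).

```javascript
// Time the pre-Reader 9 approach: one addItem() call per entry.
// Returns the elapsed time in milliseconds.
function timeAddItem(field, count) {
    var start = new Date().getTime();
    field.clearItems();
    for (var i = 0; i < count; i++) {
        field.addItem("Item " + i);
    }
    return new Date().getTime() - start;
}

// Time the Reader 9 approach: build one string, then a single setItems() call.
// Returns the elapsed time in milliseconds.
function timeSetItems(field, count) {
    var start = new Date().getTime();
    var items = [];
    for (var i = 0; i < count; i++) {
        items.push("Item " + i);
    }
    field.setItems(items.join(","));
    return new Date().getTime() - start;
}
```

Calling each function with the same list box and a count of 500 would give two numbers that can be compared directly.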

Compatibility

If you are designing a form to use this new API, be sure to set your target version (in Form Properties/Default) to "Acrobat and Adobe Reader 9.0 or later".  Unless you do this, calls to setItems() will not work, even if you open the form in Reader 9.

Position a Subform at Page Bottom

A common layout task I’ve heard requested is the ability to place a flowed subform at the bottom of a page.  Picture a series of detail subforms followed by a summary. Instead of having the summary positioned immediately below the last detail subform, we want it anchored to the bottom of the page.

There is nothing in the XFA markup that allows this kind of subform placement.  However, we can achieve this via script.  The solution involves placing a "spacer" subform immediately before the summary subform.  Initially the spacer subform has a height of zero.  After the initial layout is complete, we calculate how much space is left at the bottom of the page. We then set the height of the spacer subform to the remainder amount and force a re-layout.

With the size taken up by the newly expanded spacer subform, the summary subform will now be positioned at page bottom.  To see this work, have a look at the sample form.  Notice that as you add/remove detail subforms the summary subform remains at a fixed position.  One important detail to make this work is that we added script to the add/remove buttons to clear the current spacer.  Before changing the layout we set the height of the spacer subform back to zero and set a form variable to indicate that the spacer needs to be re-evaluated.
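
The arithmetic behind the spacer can be sketched as follows. computeSpacerHeight() is a hypothetical helper, not code taken from the sample form. It assumes all three measurements are in the same unit and were taken while the spacer height was zero, for example from layout calls such as xfa.layout.y() and xfa.layout.h().

```javascript
// Hypothetical helper; every argument uses the same unit (points, say).
//   contentAreaHeight - height of the content area on the page
//   summaryTop        - y position of the summary subform after the first layout
//   summaryHeight     - height of the summary subform
function computeSpacerHeight(contentAreaHeight, summaryTop, summaryHeight) {
    var remaining = contentAreaHeight - (summaryTop + summaryHeight);
    // Never return a negative height when the page is already full.
    return remaining > 0 ? remaining : 0;
}
```

In a form script the result would be assigned to the spacer subform's height before forcing the re-layout.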

Paragraph Breaks in Plain Text Fields

The vast majority of text fields we create on our forms hold plain text. Downstream systems that receive data from our forms handle plain text much more easily than they deal with rich text expressed in XHTML.

Obviously this is a bit of a compromise, since plain text is much less expressive than rich text. However, one area where we can express some richness in our plain text is by handling paragraph breaks — specifically by differentiating them from line breaks. This means that paragraph properties on your field such as space before/after and indents can be applied to paragraphs within your plain text. The main difficulty is how to differentiate a paragraph break from a line break in plain text and what keystrokes are used to enter the two kinds of breaks.

Keystrokes

Most authoring systems have keystrokes to differentiate a line break from a paragraph break. The prevalent convention is that pressing “return” adds a paragraph break, and pressing “shift return” adds a line break. However, that convention seems to be enforced only when the text storage format is a rich format. E.g. it works this way in Microsoft Word, but it doesn’t work this way in Notepad. The same holds in our forms. When entering boilerplate text in Designer or when entering data in a rich text field we follow this keystroke convention. Entering “return” generates a <p/> element, and entering “shift return” generates a <br/> element. However, when entering data in a plain text field there is no difference between return and shift-return. Both keystrokes generate a linefeed, which is interpreted as a line break.

Characters

You might assume that in plain text we could simply use the linefeed (U+000A) and the carriage return (U+000D) to differentiate between a line break and a paragraph break. However, it is not so easy. We store our data in XML, and the XML specification does not support differentiating these characters. XML end-of-line handling dictates that conforming parsers must convert each U+000D, U+000A sequence to a single U+000A, and also convert any U+000D not followed by U+000A to U+000A.
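
To see why the distinction cannot survive, here is a small sketch that mimics the end-of-line normalization an XML parser applies. normalizeXmlLineEnds() is a hypothetical name, and the function only imitates the rule; it is not a parser.

```javascript
// Mimic XML end-of-line handling: a CR LF pair and a lone CR
// both become a single LF.
function normalizeXmlLineEnds(text) {
    return text.replace(/\r\n?/g, "\n");
}

var withLineBreak = "first\nsecond";       // intended line break (LF)
var withParagraphBreak = "first\rsecond";  // intended paragraph break (CR)

// After normalization the two strings are identical, so the data can no
// longer tell a paragraph break from a line break.
var sameAfterParse =
    normalizeXmlLineEnds(withLineBreak) === normalizeXmlLineEnds(withParagraphBreak);
```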

As of Reader 9, we have a solution by using Unicode characters U+2028 (line break) and U+2029 (paragraph break). When these characters are found in our data, they will correctly generate the desired line/paragraph breaking behaviours.

The problem now is one of generating these characters from keystrokes. We can’t just change the default behaviour of Reader to start inserting a U+2029 character from a carriage return. Legacy systems would be surprised to find Unicode characters outside the 8-bit range in their plain text.

However, the form author can explicitly add this behaviour. The trick is to add a simple change event script to your multi-line plain text field:

testDataEntry.#subform[0].plainTextField[0]::change – (JavaScript)

// Modify carriage returns so that they insert Unicode characters
if (xfa.event.change == '\u000A')
{
    if (xfa.event.shift)
        xfa.event.change = '\u2028';  // line break
    else
        xfa.event.change = '\u2029';  // paragraph break
}

As you can see in the sample form, entering text into these fields will now generate the desired paragraph breaks in your plain text.

Transpromo: the sequel

Before the break I wrote an entry describing how to place transpromo content on your form.  The sample was fairly restricted in that the advertisement could be inserted only in specific spots in the subform hierarchy.  Today’s sample allows the advertisement to be placed anywhere on the form.

Note that this sample is not an interactive PDF but a print form, as you’d expect for a transpromo application.  Here is the XDP form, the sample data and the result of a print operation.  In order to generate some interesting white space to fill up, I’ve added a conditional break on the form.  Every time the date field value changes, I force a new page.

The strategy behind this sample is that we use a dynamic subform to place the advertisement on the master page rather than in the form hierarchy.  This isn’t as easy as it sounds, so I’ll explain the steps to build the form.

In a nutshell, the strategy involves:

  1. find the place on the page where the ad will fit
  2. create an instance of the advertisement subform
  3. modify the (x,y) coordinate to position the ad correctly. 

However, there are some challenges to make this work.  The main problem is that when we re-paginate, we might throw away any existing master page instances (pageAreas) and re-create them.  Any subform instances and (x,y) positions would be lost in the process.  The solution involves persisting the ad placement information in the form data and binding the advertisement subforms to the data.  The specific mechanics need a fair bit of explaining…

Processing Steps

In the previous post I described these processing steps:

  1. Loading content
  2. Merging data with template (create Form DOM)
  3. Executing calculations and validations
  4. Layout (pagination)
  5. Render

This isn’t the complete story.  There are some intermediate steps as well.  Specifically, when doing the pagination, we create pageArea objects — along with their associated content.  We then attempt to merge this page content with form data.  The expanded processing steps are:

  1. Loading content
  2. Merging data with template (create Form DOM)
  3. Executing calculations and validations
  4. Layout (pagination)
    4a. Merge data with page content
    4b. Execute calculations and validations on page content
  5. Render

Settings stored in Data

The way to customize page content at runtime is to put all the page settings in the form data and have the page content subform bind to the data.  You need to find a place in the data that will not interfere with the rest of the form data.  This is especially important if your form is based on an XML schema.  Any page data intermixed with form data would violate the schema — or could inadvertently merge with other elements in the template.  The solution is to place the page data in a separate dataset.  The default location for form data is under <xfa:data>.  We’ll place the page data under <pageData>:

<xfa:datasets xmlns:xfa="http://www.xfa.org/schema/xfa-data/1.0/">
   <xfa:data>
       … form data goes here …
   </xfa:data>
   <pageData>  <!-- data for transpromo ads goes here -->
      <advertisement>  <!-- placement data for one ad -->
         <x>0.25in</x> <!-- x,y position -->
         <y>2.89in</y>

                       <!-- bind to presence property -->
         <presence>visible</presence>
         <Subform7in>  <!-- this will select the right sized ad -->
                       <!-- populate info/debug field -->
            <available>7.85in</available>
         </Subform7in>
      </advertisement>
      <advertisement> … next page ad … </advertisement>
   </pageData>
</xfa:datasets>

Once we generate the data, we need to make sure that the subforms on the master pages bind to the data accordingly.  The advertisement subform uses the binding expression: "!pageData.advertisement[*]". (The "!" character is a shortcut in SOM that brings you to a child of xfa.datasets).  The rest of the bindings are "Normal" (based on a name match).

Now any time a repagination happens, the page content will re-bind to the data and all the settings will be restored.
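
The data above names a Subform7in element to select an ad of the right size for the available space. One way such a choice might be made is sketched below. Only the name Subform7in comes from the sample data; the other subform names, the adSizes table and pickAdSubform() are hypothetical.

```javascript
// Hypothetical catalogue of ad subforms and the height (in inches) each needs.
// Only "Subform7in" appears in the sample data; the other names are made up.
var adSizes = [
    { name: "Subform7in", height: 7 },
    { name: "Subform4in", height: 4 },
    { name: "Subform2in", height: 2 }
];

// Return the name of the tallest ad that fits in the available space,
// or null when no ad fits.
function pickAdSubform(availableInches, sizes) {
    var best = null;
    for (var i = 0; i < sizes.length; i++) {
        if (sizes[i].height <= availableInches &&
            (best === null || sizes[i].height > best.height)) {
            best = sizes[i];
        }
    }
    return best === null ? null : best.name;
}
```

With 7.85in available, as in the sample data, this sketch picks Subform7in.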

Use Dynamic Properties

Setting properties such as the x and y coordinates via data is done using dynamic properties.  Designer supports setting properties such as captions, error messages and choice list contents from data.  While Designer exposes the set of commonly used properties, in reality almost all properties can be set this way.  For example, to populate the x and y properties from data, I used XML source view to add the necessary <setProperty> elements:

<subform name="advertisement" w="203.2mm" layout="tb">
   <occur min="0" max="-1"/>
   <setProperty target="x" ref="$.x"/>
   <setProperty target="y" ref="$.y"/> 
   <setProperty target="presence" ref="$.presence"/>

   …
</subform>

The presence property is set to "hidden" by default so that the advertisement subforms do not clutter up the design view.  In addition to setting the x and y properties via data, we also set the subform presence property to "visible".

Note: There was one other case where I needed to use XML source view to design this form.  In order to make the "advertisement" subform optional, I added the <occur> element:

<subform name="advertisement">
   <occur min="0" max="1"/>

Building the data in script

Adding data can be done using createNode() as we’ve done in previous samples.  The code to add the data and specify the x property would look something like:

var vPageData = xfa.datasets.createNode("dataGroup","pageData");
var vAds      = xfa.datasets.createNode("dataGroup","ads");
var vAdvert   = xfa.datasets.createNode("dataGroup","advertisement");
var vX        = xfa.datasets.createNode("dataValue","x");

xfa.datasets.nodes.append(vPageData);
vPageData.nodes.append(vAds);
vAds.nodes.append(vAdvert);
vAdvert.nodes.append(vX);
vX.value = vCA.x;

Or we can use the assignNode() method and do the whole thing in one command:

xfa.datasets.assignNode("pageData.ads.advertisement.x", vCA.x, 0);

The way assignNode() works is that it traverses the SOM expression and creates any intermediate nodes that don’t exist. In this example it would create "advertisement" as a dataGroup (since it can tell that it’s a grouping node) and then create "x" as a dataValue (since it can tell that it is a leaf node).  Once the nodes are created, the value in the second parameter is assigned to the leaf node.  The last parameter dictates how to create nodes (0 == "create/replace").

The other benefit to using assignNode() is that if your method inadvertently gets called twice, it won’t create another instance of the data.  It will overwrite the data added previously.
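
The create/replace behaviour can be illustrated with a plain-object analogy. This is not how XFA implements assignNode(); assignPath() and the datasets object below are hypothetical stand-ins that only mirror the traversal described above.

```javascript
// Plain-object analogy of "create/replace": walk a dotted path, create any
// missing intermediate groups, then create or overwrite the leaf value.
function assignPath(root, path, value) {
    var names = path.split(".");
    var node = root;
    for (var i = 0; i < names.length - 1; i++) {
        // Intermediate names act as grouping nodes; create only if missing.
        if (typeof node[names[i]] !== "object" || node[names[i]] === null) {
            node[names[i]] = {};
        }
        node = node[names[i]];
    }
    // The last name is the leaf; a second call simply replaces its value.
    node[names[names.length - 1]] = value;
    return root;
}

var datasets = {};
assignPath(datasets, "pageData.ads.advertisement.x", "0.25in");
assignPath(datasets, "pageData.ads.advertisement.x", "0.5in"); // overwrites, no duplicate
```

After both calls there is still a single advertisement entry, holding the value from the second call.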

Triggering re-layout

In this sample we’ve placed our code in the docReady event of the adControl field.  The docReady event fires after layout and render are complete.  Once we have generated form data for each page, we add an advertisement subform (merging it with the new data) and then call xfa.layout.relayout(). I encountered a bug along the way — the relayout() call ought to have remerged without an explicit call to add the advertisement subform. Since it didn’t, the workaround was to add the subform explicitly.

Limits

1. As mentioned already, this form is intended for print.  It is not intended for (nor will it work in) an interactive environment where subforms are added or removed.  In interactive forms we can count on more layout:ready events firing.  In fact, in the first sample transpromo form I made the mistake of relying too much on the layout:ready event.  Consequently, that form works only in interactive mode and not for print.  But given the greater flexibility of this second sample, I recommend staying with this approach for transpromo form design.

2. This form currently works only on page areas that have a single content area. All object positions are computed relative to the content area, but it isn’t possible to determine which content area you are currently in.

Re-Use

In order to re-use the code from this sample, simply take the content from page1 (the adHolder subform) and place it on each master page where you’d like to add content.  A cautionary note: if you copy/paste the adHolder subform or if you create a custom object from it, Designer will remove the binding information. You will need to re-specify it when you bring it into a new form.

Next Steps

The sample could be extended to search for other kinds of white space.  Currently we find only the leftover vertical white space at the end of each page.  There should be enough information in the layout tree for us to also discover white space inside positioned content.

Reader Survey

Happy new year!

The Adobe Reader team is taking a poll on priorities for upcoming Reader releases.
You are encouraged to go and complete a survey at the Reader blog.