
May 21, 2009

Debug merge and layout

It has been quiet for a while. That is because I took on a more ambitious task over the last couple weeks.  The result is a new XFA/PDF debugger tool.

In the past I've posted samples (tools) that help users debug their merge (form DOM) and view their form layout. The new tool consolidates and extends those capabilities and implements the debugger in Flash -- a SWF embedded in a PDF.

This tool will be useful to anyone having problems designing dynamic forms:

  • Subforms aren't being created from data where you expected they would
  • Layout has a mysterious blank page
  • Garbled or overlapping layout
  • Layout did not appear in the order you expected
  • Leaders and trailers aren't being created
  • Leaders and trailers are created too often
  • Content seems to be missing

If you're encountering any of these symptoms then this tool could help.

Usage

You need to be running Acrobat (not Reader) to use the tool.  And you need to be using version 9. 

To start a debug session, populate your PDF form with data and save it.  Then open the PDF from the XFADebugger.pdf tool.  You will see a snapshot of the form represented in three vertical boxes:

  • Form DOM tree
  • Data DOM tree
  • Layout -- page display (with content areas rendered as grey boxes)

From there on it is hopefully self-explanatory.  You can expand/collapse the trees.  Nodes in the trees are colour coded:

  • grey -- node is not bound
  • black -- node is bound once
  • blue -- data node is bound to more than one form node
  • green -- represents a subform leader/trailer that has been added by the layout process

Selecting a node in the form tree will:

  • if bound, highlight the corresponding node in the data tree
  • if not hidden, highlight the area on the page display where that node is rendered.
  • Display any interesting attributes/properties that impact merge and layout

Selecting a node in the data tree will (if bound):

  • select all the form node(s) that are bound to this data
  • display the attributes and highlight the corresponding form node(s)

Warnings

The tool reports on suspicious conditions on the form that could impact merge or layout.  Clicking on the "Find Warnings" button will cycle through any warnings found in the form.  For each warning, the corresponding nodes are highlighted and the warning text appears in red.

These are the warning conditions detected:

Object extends outside its parent container
Just what it says.  When the offending form node is highlighted on the page layout in dark blue, its parent node is highlighted in pale blue.
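
As a rough illustration of this kind of check, here is a sketch of a containment test with a small tolerance. The rectangle shape, the shared coordinate space and the default tolerance are all assumptions; the tool's own test may well differ.

```javascript
// Rectangles are { x, y, w, h } in one common unit and coordinate space,
// origin at top-left. `tolerance` absorbs rounding noise; its default
// value here is an assumption.
function extendsOutside(child, parent, tolerance = 0.01) {
  return (
    child.x < parent.x - tolerance ||
    child.y < parent.y - tolerance ||
    child.x + child.w > parent.x + parent.w + tolerance ||
    child.y + child.h > parent.y + parent.h + tolerance
  );
}
```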

Leader/trailer subforms must not be flowed
Leader and trailer subforms must always have positioned content.  As you might imagine having a variable-sized leader/trailer would make it pretty difficult for the layout algorithm to reserve space for the leader/trailer subform.

Subform is splittable but cannot split because the parent is not splittable
Designer also warns about this condition.

Object is growable, but not-splittable.  With enough data, it could grow too large for its content area
If an object can grow vertically without any upper limit, it could eventually grow too big to be rendered inside a content area.

Object is growable but its parent is not.  With enough data, it could grow too large for its parent
If you place a growable object inside a subform with a fixed layout size, you could end up with an object that's too big for its container.

'Keep with previous' conflicts with 'break after' on previous element
Conflicting break/keep directives. (By the way, the keep will trump the break -- but this condition is a leading cause of mysterious blank pages in your output.)

'Break before' conflicts with 'keep with next' on previous element
Same as previous except the other way around.
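
Both keep/break warnings come down to comparing each element with the sibling before it. A minimal sketch, assuming hypothetical boolean flags on each element record (the flag names are mine, not XFA property names):

```javascript
// Each element is a hypothetical record such as
// { name, keepPrevious, keepNext, breakBefore, breakAfter }.
function findKeepBreakConflicts(siblings) {
  const warnings = [];
  for (let i = 1; i < siblings.length; i++) {
    const prev = siblings[i - 1];
    const curr = siblings[i];
    if (curr.keepPrevious && prev.breakAfter) {
      warnings.push(curr.name + ": 'keep with previous' conflicts with 'break after' on " + prev.name);
    }
    if (curr.breakBefore && prev.keepNext) {
      warnings.push(curr.name + ": 'break before' conflicts with 'keep with next' on " + prev.name);
    }
  }
  return warnings;
}
```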

Repeating subforms should not specify keep with next or keep with previous
Having a keep on a repeating subform will result in the entire group of subforms being un-splittable.

Subforms with repeating children should be splittable
If a subform has a repeating child, it is likely to require a split.

Multiple repeating form nodes are bound to the same data. This might cause a different merge result when the form is re-opened
This takes more explanation.  Consider this template definition:

<subform name="S0"><occur min="1" max="10"/><bind ref="S[*]"/></subform>
<subform name="S1"><occur min="1" max="1"/><bind match="once"/></subform>

When this form is first opened without any data, we will create two instances of <S> in the data -- one for S0 and one for S1. Then when we save/close/reopen, subform S0 will bind to both instances of <S> and subform S1 will create a new instance of <S>. i.e. after save/close/reopen there is one more subform than there was before.  This is a form design issue that crops up occasionally and can be very confusing for novice form authors.

Rows with more than one multi-line field might have difficulty splitting

If you have a splittable table row with more than one multi-line field, you might find that it does not split.  The algorithm for splitting rows requires finding a common font baseline between rows on the sibling cells.  For current shipping product, the check for the baseline is very exact.  If there is any difference between the fields that can cause the lines to be offset slightly, then the split algorithm will not find a split point.  Some of the attributes that affect the position of the baselines include: top margin, paragraph space before/space after, line spacing, vertical justification, typeface, font size, vertical scale... and probably a couple more I haven't thought of.
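
To see why an exact baseline match is so fragile, consider this much-simplified model. It assumes each field's baselines are evenly spaced and can be described by a hypothetical record `{ topMargin, lineHeight, lines }` in integer units; the real layout algorithm takes far more attributes into account.

```javascript
// Baselines are measured from the top of the row in integer units
// (for example thousandths of a point) so exact comparison is meaningful.
// The field record { topMargin, lineHeight, lines } is hypothetical.
function baselines(field) {
  const result = [];
  for (let i = 1; i <= field.lines; i++) {
    result.push(field.topMargin + i * field.lineHeight);
  }
  return result;
}

// In this model a split point is a baseline shared by every cell in the row.
function commonSplitPoints(fields) {
  if (fields.length === 0) return [];
  const [first, ...rest] = fields.map(baselines);
  return first.filter((b) => rest.every((other) => other.includes(b)));
}
```

With two identical fields every baseline lines up. Nudge the top margin of one field by a fraction of a point and, in this model, no common split point remains.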

 

As is the nature of warnings, not all warnings are problems that need to be fixed.  Your form might report warnings that are innocuous. 

Here is a sample of a very badly designed form that manages to have (at least) one instance of each warning.

How the Tool Works

The sample has two parts.  There is a base PDF with a document-level JavaScript defining a function, PDFLoader(), that will:

  • Select and open an XFA-based PDF
  • Extract an XML snapshot representing the state of the form after it has opened

The form has a page-sized embedded SWF which holds the implementation of the debugger.  The SWF has a button that calls the document-level JavaScript using a call to ExternalInterface.call("PDFLoader").

Once the SWF has the XML snapshot of the form, it renders it and doesn't communicate with the base PDF anymore.

Other Uses

Educational

Loading up a form and seeing the form/data/layout graphically displayed can help to get insight on how the merge and layout processes work.

Quality Assurance

There are two ways that this tool can be used or adapted to maintain quality in your forms. 

1) Loading and viewing your dynamic form in the debugger lets you verify that merge and layout are happening as designed.  Just because your form looks OK on screen doesn't necessarily mean that your data merged correctly or that your layout is behaving as planned.  You might be surprised by what you see.  You should make it a habit to check for warnings.

2) Adapt the XML snapshot to produce 'gold data' for your form.  When you are satisfied that your form is working correctly, produce a snapshot of the form that you can save as a baseline.  Then if your form gets modified -- perhaps some cosmetic changes -- you can compare the new snapshot to the baseline and confirm that any changes are as expected.
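
The baseline comparison could be as simple as a line-by-line check of the two snapshots. The sketch below is only that: it treats the snapshots as plain text, and the function name is my own. An XML-aware comparison would be more robust against whitespace or attribute-order changes.

```javascript
// Compare two snapshots line by line and report the lines that differ.
// Snapshots are treated as plain text here; this is not XML-aware.
function compareSnapshots(baseline, current) {
  const a = baseline.split("\n");
  const b = current.split("\n");
  const differences = [];
  const length = Math.max(a.length, b.length);
  for (let i = 0; i < length; i++) {
    if (a[i] !== b[i]) {
      differences.push({ line: i + 1, expected: a[i], actual: b[i] });
    }
  }
  return differences;
}
```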

Futures

There are undoubtedly more form design problems that could be flagged by this tool. If you have suggestions for other conditions to detect, please let me know.

The form DOM could include more objects -- instance managers and draw elements.  For now I've left them out because they clutter the form tree too much.

Updates

June 1, 2009

  • Fixed bug where field splittable status was reported incorrectly
  • Increased the tolerance when checking for objects outside their extent
  • Added a new warning: "Rows with more than one multi-line field might have difficulty splitting"

March 6, 2009

A Form to Design a Form

There is a class of form designs where the form fill-in experience follows a basic pattern with a small number of variations.  A survey form is the prime example.  The variable pieces are the question text and a set of response types.  A user who wants to create one of these forms should not have to learn a complicated form design tool.  Given the relatively small number of properties that need to be specified to make a survey work, it should be possible to design a single dynamic form that can be shaped by any survey definition -- i.e. one form that can render all variations of a survey.

We can design that survey form, but then we need to figure out an easy way for the author to define their survey.  This is really just another data capture problem -- one that can be handled by a form.  So we use a form to design our survey.  A form to design a form.  Kind of like a play within a play.

To accomplish the form-within-a-form, there are two sets of data.  The survey-designer-form captures information for question text and response types and saves this information as its form data.  The fill-in-form has built-in variability (dynamic subforms) whose appearance and behaviour are customized from the data produced by the designer form.  The design data moves from being form data for the designer form to become metadata for the fill-in form.  When the fill-in version of the form runs, it produces its own data -- the survey results.

Two Forms in One

Ideally, design and fill would be two separate forms.  But two separate forms means moving data between forms.  And even a relatively simple operation such as moving data is probably more than we can expect from our target users' skill set.  As well, any multi-step process gets in the way of quickly previewing a survey.  To keep the experience as simple as possible, I've taken the approach of combining the design and fill-in experience into the same PDF.   The advantage is that the user deals with only one form and doesn't have to manage their data.  There are likely better ways to do this.  If the experience were tethered to a server (LiveCycle for example :), it would be easier to manage the data on behalf of the user and keep the forms separate.  That would also make it easier to use a cool flash UI for the survey-design piece. 

But for now, to make the sample easy to distribute, I've combined them into one PDF.

Here's the sample in several stages of processing:

An XFA form can fairly easily house two separate design experiences.  In my example, I had two optional top-level subforms: DesignSurvey and FillSurvey.  During survey design, the DesignSurvey subform exists.  During preview, DesignSurvey and FillSurvey both exist.  During fill-in, only the FillSurvey subform exists.  Which subform(s) appear is controlled by the form data and by script logic.
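
The mode-to-subform rule is easy to state as a lookup. In this sketch the mode names ("design", "preview", "fill") are my own labels; only the subform names come from the sample.

```javascript
// Which top-level subforms should exist in each mode.
// The mode names are assumptions chosen for illustration.
function subformsForMode(mode) {
  switch (mode) {
    case "design":
      return ["DesignSurvey"];
    case "preview":
      return ["DesignSurvey", "FillSurvey"];
    case "fill":
      return ["FillSurvey"];
    default:
      throw new Error("Unknown mode: " + mode);
  }
}
```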

The Design mode allows you to create sections and questions within sections.  The data to define a simple one-question survey looks like this:

<DefineSurvey> 
  <SurveyTitle>A survey about surveys</SurveyTitle>

  <ChoiceList>
    <ChoiceName>YesNo</ChoiceName>
    <Choice>
      <ChoiceText>Yes</ChoiceText>
      <Value>1</Value>
    </Choice>
    <Choice>
      <ChoiceText>No</ChoiceText>
      <Value>0</Value>
    </Choice>
  </ChoiceList>

  <Section>
    <SectionName>Personal Information</SectionName>

    <Question>
      <QuestionNumber>3</QuestionNumber>
      <QuestionText>Are you married?</QuestionText>
      <QuestionType>multipleChoice</QuestionType>
      <Required>0</Required>
      <ChoiceListName>YesNo</ChoiceListName>
      <MinSelections>0</MinSelections>
      <MaxSelections>1</MaxSelections>
    </Question>

  </Section>
</DefineSurvey>

When designing the form, this data resides in the normal place for form data under:

<xfa:datasets xmlns:xfa="http://www.xfa.org/schema/xfa-data/1.0/">
   <xfa:data>
      <DefineSurvey>...</DefineSurvey>
   </xfa:data>
</xfa:datasets>

When we switch to "fill-mode", we move the form definition (<DefineSurvey>) to a separate dataset and the fill-in data then lives under <xfa:data>:

<xfa:datasets xmlns:xfa="http://www.xfa.org/schema/xfa-data/1.0/">
  <DefineSurvey>...</DefineSurvey>
  <xfa:data>
    <Survey>
      <Section>
        <Question>
          <QuestionNumber>3</QuestionNumber>
          <QuestionText>Are you married?</QuestionText>
          <Answer>1</Answer>
        </Question>
      </Section>
    </Survey>
  </xfa:data>
</xfa:datasets>
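
To make the relationship between the two data sets concrete, here is a sketch that derives an empty fill-in skeleton from a survey definition. It works on plain JavaScript objects that mirror the XML above, not on the XFA data DOM, and the function name is hypothetical.

```javascript
// `define` mirrors the <DefineSurvey> data as a plain object, with
// Section and Question held as arrays. Not the sample's actual script.
function buildFillData(define) {
  return {
    Survey: {
      Section: define.Section.map((section) => ({
        Question: section.Question.map((q) => ({
          QuestionNumber: q.QuestionNumber,
          QuestionText: q.QuestionText,
          Answer: null, // supplied later by the person filling in the survey
        })),
      })),
    },
  };
}
```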

Once the form is in "fill mode", the PDF can be distributed to users.  Enable usage rights so they can save the results.  Or better yet, host your survey on acrobat.com.

Next Steps

The form design could expose more options. e.g. conditional logic, more response types, more constraints on responses, styling options.  It's all just SMOP (small matter of programming). 

Submit

I added a submit button to the form in order to return the survey results.  There are a couple of things that are interesting about the handling of the submission. The survey definition includes a target email address.  The submit button gets updated with target and subject with this bit of code:

var vEmailTarget = "mailto:" + xfa.datasets.DefineSurvey.Email.value
                 + "?subject=" + xfa.datasets.DefineSurvey.SurveyTitle.value;
EmailSubmit.event__click.submit.target = vEmailTarget;
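
If the survey title can contain spaces or punctuation, it may be safer to URL-encode the subject. This is a sketch of the string-building step on its own; the helper name is hypothetical, and whether the encoding is needed in Acrobat is something I have not verified.

```javascript
// Build a mailto: target with a URL-encoded subject.
// encodeURIComponent is standard JavaScript; the helper is illustrative.
function buildMailtoTarget(email, subject) {
  return "mailto:" + email + "?subject=" + encodeURIComponent(subject);
}
```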

The other thing I did with submit was make use of the new event cancel capability.  When the user clicks on the "submit" button, the pre-submit event fires.  I put this script there:

// Survey.FillSurvey.#subform[2].EmailSubmit::preSubmit:form - (JavaScript, client)
if (scValidate.formHasErrors())
{
    scValidate.showErrors();
    xfa.host.messageBox("The survey is incomplete.  Please fill in the highlighted questions.");
    xfa.event.cancelAction = true;
    xfa.host.setFocus(scValidate.getFirstError());
}

The xfa.event.cancelAction property is new in Acrobat/Reader 9.  It allows you to cancel the upcoming action in prePrint, preSubmit, preExecute, preOpen, preSign events.

Validation Framework

The form makes extensive use of the validation framework I defined in previous blog entries -- most notably, the exclusion group objects.  The framework is contained in the three script objects at the top of the form: scValidate, scGeneral and scGroup. These are re-usable objects that can be used in forms where you want fine-tuned control over the form validation experience.

For those who have used previous versions of this framework, I added some enhancements to suit my needs for this sample:

New function: scValidate.hideErrors()

After this is called, any fields with errors are not highlighted until...

New function: scValidate.showErrors(subform)

This function causes error fields that are descendants of the input subform to be highlighted.  If subform is omitted, errors are displayed on the entire form.

New function: scValidate.getFirstError(subform)

Returns the first descendant field under the given subform that has an error.  If subform is not specified, returns the first error field on the whole form.
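
The framework's source is not reproduced here, but the idea behind a function like this is a depth-first search in document order. A sketch on plain objects, assuming a hypothetical node shape of `{ name, hasError, children }`:

```javascript
// Depth-first search for the first descendant flagged with an error.
// The node shape { name, hasError, children } is an assumption; the
// framework itself works on XFA form nodes.
function getFirstError(node) {
  for (const child of node.children || []) {
    if (child.hasError) return child;
    const nested = getFirstError(child);
    if (nested) return nested;
  }
  return null;
}
```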

scGeneral.assignRichText(targetField, source)

Where targetField is a rich text field and the source is either a dataValue with rich text or another rich text field.

I also changed the code that highlights invalid fields.  Instead of mucking with borders, I simply set the fill colour.

February 23, 2009

Form DOM Debugging Tool

One area where a novice form designer often needs help is in figuring out how to bind their form to data.  Ok, scratch that.  Advanced form designers often need help in this area as well.  There is lots of reading you can do in the XFA specification to learn how the template DOM gets merged with the data DOM to produce the form DOM.  Designer does a great job of generating binding SOM expressions for you.  But even still, when you are dealing with a complex schema, it can be hard to figure out where things went wrong.

A good way to debug this problem is to visualize the resulting DOMs.  Since we have full scripting access to the form DOM and to the data DOM, we can add a visual display of the DOMs to our form.  That's the approach that today's sample takes.  I took a work-in-progress purchase order form and added a "debugging subform" (domDump) at the end of the form in order to display the DOMs.  When you open the form and look at the last page you will see two side-by-side subforms.  The left side shows a tree view of the form DOM, the right side shows a tree view of the data DOM.  Some of the things you should note:

  • Entries in the tree are color-coded depending on their bind status
  • The display is cut off at one page.  To see more, use the scroll buttons at the top
  • Collapse and expand sections of the trees using the +/- buttons on the rows
  • Set focus in a field on either side of the tree and the display will highlight that entry and the corresponding entry(s) in the other DOM in red.
  • Set focus on a row in the form DOM, and some binding information is displayed in the top right of the display
  • Shift+click on a row in the form DOM will set focus to the corresponding field on the form.
  • By default we display the current state of the form at the docReady event.  If you want to refresh/rebuild the tree views, click the refresh button.
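
At its core, a tree view like this is a recursive walk that emits one row per node, indented by depth. A bare-bones sketch on plain objects (the `{ name, children }` shape is an assumption; the real domDump script walks XFA nodes and does a good deal more):

```javascript
// Produce one indented line per node, parents before children.
// The node shape { name, children } is hypothetical.
function dumpTree(node, depth = 0) {
  const lines = ["  ".repeat(depth) + node.name];
  for (const child of node.children || []) {
    lines.push(...dumpTree(child, depth + 1));
  }
  return lines;
}
```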

To try this out on your own forms, take the domDump subform and add it to your object library in Designer.  When you want to debug a form, drag the domDump subform onto your form.  (The subform needs to be displayed on a new master page with a content area that is at least 8x10.5 inches.) After debugging, when you are happy with your form, remove domDump.

There are some restrictions on the usage of this tool:

  • It works only on interactive forms
  • It works only on dynamic forms
  • Because of the extensive use of JavaScript objects, it does not work on forms with version 8.1 strict-scoping turned on

There is code in the script to check that these conditions have been met.

I won't go into an in-depth description of the JavaScript that makes this debugger work.  It's complicated :-) The intent is that form designers can use it without understanding the internals.  The one area where users might be tempted to tweak the script is to customize the information that gets displayed when a row in the form DOM is highlighted.  If that interests you, look for the script function: debugDisplay().

For interest's sake, I've included one of my previous samples with the debugging subform added.  When I first added the domDump subform I had to press the "refresh" button in order to see the completed DOMs.  That's because the transpromo content and the debugging content are both populated from the docReady event -- and the transpromo happens last.

Futures

  1. Admittedly, displaying tree views using dynamic subforms is non-trivial and a bit clunky.  One possible enhancement would be that rather than display the tree view on the form, we could export all the data for the tree views.  Then we could write a cool Flash app to load the data and give a proper rich user experience.  The only drawbacks with that approach are that a) it becomes a multi-step process and b) you lose the ability to "shift+click" to the corresponding form field.
  2. It would be great to have a similar debugging capability for WSDL connections.

March 19 Update

After using the form DOM debugger with several forms, I'm hooked.  I couldn't resist making a couple improvements to it.  I've lifted most of the restrictions as to what flavour of forms it can be used in.  It now works with a broader range of template versions and strict scoping variations.  It also works for non-interactive documents.  For non-interactive, the tree display will spill over multiple pages and give a full dump -- rather than windowing the content on a single page.  The only remaining restriction is that it works only with dynamic forms.

June Update

There is a follow-up debugger effort that supersedes the sample in this entry. Please have a look here.

February 12, 2009

Working with Data Fragments

Many of you will be familiar with the idea of constructing a form using template fragments. Template fragments are a powerful way to construct a form experience with modular parts. But there are some workflows where template fragments do not (yet) have all the functionality we might like. In cases where the content of the form is determined at runtime, template fragments might not have the flexibility you need. Take the transpromo examples from earlier posts (here and here). In those samples, the advertisement was baked into the template.  But in real life the actual ad that gets inserted will change. On starting a new marketing campaign, a company may want to issue a new set of ads to embed in their output. The actual ad chosen will vary depending on data in the form. Is the client single? Insert the sports car ad. Do they have a new baby? Insert the minivan ad. Some of our customers are building these kinds of applications using what we call “stitching solutions”. They have written Java libraries that assemble XFA templates on-demand. Writing a Java solution is fine for some, but eventually we have to make this easier.

One of the solutions available using today’s LiveCycle products leverages the notion of data fragments (instead of template fragments). I use the term “data fragments” to refer to the idea of embedding rich content in data. You might be surprised at how much you can customize the look of your document via XML instance data. You can add images, rich text, hyperlinks, positioned text and even floating fields.

A solution that uses data fragments to place ads in statements might look like this:

  1. An application for authoring data fragments representing the advertisements
  2. An application for adding metadata to the ads and storing them in a repository
  3. A rules engine for selecting ads from the repository based on correlating transaction data with ad metadata
  4. Print engine to render the statements with the ads

I can't give you the whole solution, but can offer a sample that would help you get started with parts 1 and 4:

  • AdGenerator.pdf: This is a PDF for generating a data fragment and adding it to statement data. (Ideally we’d define something fancier for designing data fragments – maybe a slick flash app side-by-side with the PDF.)
  • statement.xdp: A sample credit card statement that includes a placeholder subform to render the ad data fragment.

Here is a copy of AdGenerator.pdf, populated and ready to export data. (Please give special notice to the image artwork that I worked so hard on.)

Here is a copy of the resulting statement generated with the ad.

How AdGenerator.pdf works

The advertisement is a growable subform that holds:

  • An image field
  • A repeating subform housing rich text

The form has various field and button controls to add the image and to add and position the rich text.  To understand how to use the form, read the instructions on the form itself. The scripts are also well documented and a good source for discovering the techniques used.

Dynamic properties

The (x,y) placement of rich text and the size of the ad subform are controlled by dynamic properties. Authoring these means going into XML Source view and adding the appropriate <setProperty> elements. E.g.:

<field name="Image" w="203.2mm" h="25.4mm">
   <setProperty target="h" ref="height"/>
   <setProperty target="w" ref="width"/>

Repeating, positioned text

Another case where we had to use XML source mode: Having added the TextData subform as a positioned child of advertisement, add an <occur/> element to allow it to repeat:

<occur min="0" max="-1"/>

The buttons and fields that position the text will update both the text coordinates as well as the data that the text coordinates are dynamically bound to.

Data References

You can personalize the text on the ad by injecting data values inline with the text.

On loading transaction data, we populate a listbox with all the data values found in the instance data. Adding the data reference uses the same mechanism as floating fields in Designer. We inject a special <span/> element into the xhtml with an embedded data reference. E.g.:

<span xfa:embed="$data.Statement[0].AccountNumber[0]" />
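
Generating that element from a SOM expression is a one-liner plus some attribute escaping. A sketch, with a hypothetical helper name; the escaping is a precaution I added and is not taken from the sample's script.

```javascript
// Build the <span/> that embeds a data reference in rich text.
// The helper name and the attribute escaping are illustrative additions.
function embedSpan(som) {
  const escaped = som
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;");
  return '<span xfa:embed="' + escaped + '" />';
}
```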

Styling Rich Text

In order to style your rich text in Acrobat, you need to bring up the properties editor (Ctrl+E). To add a hyperlink, select some text and choose “Hyperlink…” from the context menu. (By the way, there seems to be a bug here.  Rich text editing never works for me in Designer preview.  I use standalone Acrobat -- with Designer shut down).

January 15, 2009

Transpromo: the sequel

Before the break I wrote an entry describing how to place transpromo content on your form.  The sample was fairly restricted in that the advertisement could be inserted only in specific spots in the subform hierarchy.  Today's sample allows the advertisement to be placed anywhere on the form. 
Note that this sample is not an interactive PDF, but is a print form -- as you'd expect for a transpromo application.  Here is the XDP form, the sample data and the result of a print operation.  In order to generate some interesting white space to fill up, I've added a conditional break on the form.  Every time the date field value changes, I force a new page.

The strategy behind this sample is that we use a dynamic subform to place the advertisement on the master page rather than in the form hierarchy.  This isn't as easy as it sounds, so I'll explain the steps to build the form.

In a nutshell, the strategy involves:

  1. find the place on the page where the ad will fit
  2. create an instance of the advertisement subform
  3. modify the (x,y) coordinate to position the ad correctly. 

However, there are some challenges to make this work.  The main problem is that when we re-paginate, we might throw away any existing master page instances (pageAreas) and re-create them.  Any subform instances and (x,y) positions would be lost in the process.  The solution involves persisting the ad placement information in the form data and binding the advertisement subforms to the data.  The specific mechanics need a fair bit of explaining...
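
Leaving the persistence problem aside for a moment, the placement decision in steps 1 to 3 can be sketched as a small calculation. Everything here is an assumption made for illustration: the record shapes, the use of millimetres, and the rule of picking the tallest ad that fits. The sample's own script works from the layout DOM.

```javascript
// All measurements are in millimetres. `contentArea` is { x, y, w, h },
// `usedHeight` is how much of it the page content already occupies, and
// `adHeights` lists the heights of the ad variants on offer.
function placeAd(contentArea, usedHeight, adHeights) {
  const available = contentArea.h - usedHeight;
  // Pick the tallest ad that still fits in the leftover space.
  const fitting = adHeights.filter((h) => h <= available);
  if (fitting.length === 0) return null; // nothing fits on this page
  const height = Math.max(...fitting);
  return {
    x: contentArea.x,
    y: contentArea.y + usedHeight, // directly below the existing content
    height: height,
    available: available,
  };
}
```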

Processing Steps

In the previous post I described these processing steps:

  1. Loading content
  2. Merging data with template (create Form DOM)
  3. Executing calculations and validations
  4. Layout (pagination)
  5. Render

This isn't the complete story.  There are some intermediate steps as well.  Specifically, when doing the pagination, we create pageArea objects -- along with their associated content.  We then attempt to merge this page content with form data.  The expanded processing steps are:

  1. Loading content
  2. Merging data with template (create Form DOM)
  3. Executing calculations and validations
  4. Layout (pagination)
    4a. Merge data with page content
    4b. Execute calculations and validations on page content
  5. Render

Settings stored in Data

The way to customize page content at runtime is to put all the page settings in the form data and have the page content subform bind to the data.  You need to find a place in the data that will not interfere with the rest of the form data.  This is especially important if your form is based on an XML schema.  Any page data intermixed with form data would violate the schema -- or could inadvertently merge with other elements in the template.  The solution is to place the page data in a separate dataset.  The default location for form data is under <xfa:data>.  We'll place the page data under <pageData>:

<xfa:datasets xmlns:xfa="http://www.xfa.org/schema/xfa-data/1.0/">
   <xfa:data>
       ... form data goes here ...
   </xfa:data>
   <pageData>  <!-- data for transpromo ads goes here --> 
      <advertisement>  <!-- placement data for one ad -->  
         <x>0.25in</x> <!-- x,y position --> 
         <y>2.89in</y>

                       <!-- bind to presence property -->   
         <presence>visible</presence>  
         <Subform7in>  <!-- this will select the right sized ad --> 
                       <!-- populate info/debug field --> 
            <available>7.85in</available>  
         </Subform7in>  
      </advertisement>  
      <advertisement> ... next page ad ... </advertisement>
   </pageData>
</xfa:datasets>

Once we generate the data, we need to make sure that the subforms on the master pages bind to the data accordingly.  The advertisement subform uses the binding expression: "!pageData.advertisement[*]". (The "!" character is a shortcut in SOM that brings you to a child of xfa.datasets).  The rest of the bindings are "Normal" (based on a name match).

Now any time a repagination happens, the page content will re-bind to the data and all the settings will be restored.

Use Dynamic Properties

Setting properties such as the x and y coordinates via data is done using dynamic properties.  Designer supports setting properties such as caption, error messages and choice list contents from data.  While Designer exposes the set of commonly used properties, in reality almost all properties can be set this way.  For example, to populate the x and y properties from data, I used XML source view to add the necessary <setProperty> elements:

<subform name="advertisement" w="203.2mm" layout="tb">
   <occur min="0" max="-1"/>
   <setProperty target="x" ref="$.x"/>
   <setProperty target="y" ref="$.y"/> 
   <setProperty target="presence" ref="$.presence"/>

   ...
</subform>

The presence property is set to "hidden" by default so that the advertisement subforms do not clutter up the design view.  In addition to setting the x and y properties via data, we also set the subform presence property to "visible".

Note: There was one other case where I needed to use XML source view to design this form.  In order to make "advertisement" subform optional, I added the <occur> element:

<subform name="advertisement">
   <occur min="0" max="1"/>

Building the data in script

Adding data can be done using createNode() as we've done in previous samples.  The code to specify the x property would look something like:

// Create the nodes: three nested data groups and one leaf value
var vPageData = xfa.datasets.createNode("dataGroup","pageData");
var vAds      = xfa.datasets.createNode("dataGroup","ads");
var vAdvert   = xfa.datasets.createNode("dataGroup","advertisement");
var vX        = xfa.datasets.createNode("dataValue","x");

// Link them together under xfa.datasets, then assign the value
// (vCA is defined elsewhere in the sample's script)
xfa.datasets.nodes.append(vPageData);
vPageData.nodes.append(vAds);
vAds.nodes.append(vAdvert);
vAdvert.nodes.append(vX);
vX.value = vCA.x;

Or we can use the assignNode() method and do the whole thing in one command:

xfa.datasets.assignNode("pageData.ads.advertisement.x", vCA.x, 0);

The way assignNode() works is that as it traverses the SOM expression, it creates any intermediate nodes that don't exist. In this example it would create "advertisement" as a dataGroup (since it can tell that it's a grouping node) and then create "x" as a dataValue (since it can tell that it is a leaf node).  Once the nodes are created, the value in the second parameter is assigned to the leaf node.  The last parameter dictates how to create nodes (0 == "create/replace").

The other benefit to using assignNode() is that if your method inadvertently gets called twice, it won't create another instance of the data.  It will overwrite the data added previously.
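
The create-or-replace behaviour is easy to picture with plain objects. The sketch below is an analogy only: it is not how XFA implements assignNode(), and the `assignPath` name is hypothetical. It does show why calling it twice leaves a single copy of the data.

```javascript
// Walk a dotted path on a plain object, creating intermediate groups that
// do not exist and overwriting the leaf value. An analogy for the
// "create/replace" mode, not the XFA implementation.
function assignPath(root, path, value) {
  const names = path.split(".");
  let node = root;
  for (const name of names.slice(0, -1)) {
    if (typeof node[name] !== "object" || node[name] === null) {
      node[name] = {}; // stands in for an intermediate dataGroup
    }
    node = node[name];
  }
  node[names[names.length - 1]] = value; // stands in for the leaf dataValue
  return root;
}
```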

Triggering re-layout

In this sample we've placed our code in the docReady event of the adControl field.  The docReady event fires after layout and render are complete.  Once we have generated form data for each page, we add an advertisement subform (merging it with the new data) and then call xfa.layout.relayout(). I encountered a bug along the way -- the relayout() call ought to have remerged without an explicit call to add the advertisement subform. Since it didn't, the workaround was to add the subform explicitly.

Limits

1. As mentioned already, this form is intended for print.  It is not intended for (nor will it work in) an interactive environment where subforms are added or removed.  In interactive forms we can count on more layout:ready events firing.  In fact, in the first sample transpromo form I made the mistake of relying too much on the layout:ready event.  Consequently, that form works only in interactive mode and not for print.  But given the greater flexibility of this second sample, I recommend staying with this approach for transpromo form design.

2. This form currently works only on page areas that have a single content area. All object positions are computed relative to the content area, but it isn't possible to determine which content area you are currently in.

Re-Use

In order to re-use the code from this sample, simply take the content from page1 (the adHolder subform) and place it on each master page where you'd like to add content.  A cautionary note: if you copy/paste the adHolder subform or if you create a custom object from it, Designer will remove the binding information. You will need to re-specify it when you bring it into a new form.

Next Steps

The sample could be extended to search for other kinds of white space.  Currently we find only the leftover vertical white space at the end of each page.  There should be enough information in the layout tree for us to also discover white space inside positioned content.

December 12, 2008

Canadian/US Address Data Capture

When I fill out an interactive form that prompts for an address that could be Canadian or US, I am constantly disappointed with the data capture experience. Usually the form uses a single field to capture both state and province: State/Province:________ with a drop down that lists all states and provinces. And then a single field to capture both zip code and postal code: Zip/postal Code:__________ . Or worse, the captions on the field are biased toward a US address, but allow you to enter values for a Canadian address. I.e. you get prompted for a zip code, but it allows you to type in a postal code.

The exercise for this blog entry is to come up with a data entry experience that is tailored according to country. The samples build on the work from the previous blog entry that dealt with validating a Canadian postal code.

Single Schema

The premise of the exercise is that you want to have only one place in your data where you store an address – whether Canadian or US. The samples are based on generating this XML data:

<Address>
   <Country/>
   <City/>
   <StateProv/>
   <PostZip/>
</Address>

Validate a zip code

To be fair, I thought I should try to offer advanced validation for Zip codes.  After all, I did a whole blog entry on Canadian postal codes.  No offence to my American friends, but zip codes are not nearly as interesting as postal codes. When I poked around to see if I could do more advanced validation beyond the standard Zip or Zip+4, I was pretty disappointed. The only thing I found was that there is a correlation between the first digit and the state. For example, for zip codes starting with a “1”, the state must be one of: Delaware, New York or Pennsylvania. Better than nothing. The sample forms have a utility function to validate a zip code:

/**
* validateZipCode() - validate whether a field holds a valid zip code
* @param vZip -- the zip code field. If the validation fails, this field
* will be modified with an updated validation message
* @param vState (optional)-- a corresponding field value holding the
* state abbreviation.  This method will make sure the first digit of
* the zip code is valid for this state.
* @return boolean -- true if valid, false if not.
*/
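The body of the utility function is not reproduced here. As a rough sketch of the checks described above, the following works on plain strings rather than fields. checkZip() and STATES_BY_FIRST_DIGIT are hypothetical names, and the table holds only the one row quoted in the text; digits without a row are simply not checked against the state.

```javascript
// Hypothetical helper, not the function shipped in the sample forms.
// Only the "1" row comes from the text above; other digits are left unchecked.
var STATES_BY_FIRST_DIGIT = {
    "1": ["DE", "NY", "PA"]
};

function checkZip(zip, state) {
    // Accept Zip (99999) or Zip+4 (99999-9999) only
    if (!/^[0-9]{5}(-[0-9]{4})?$/.test(zip)) {
        return {
            valid: false,
            message: "Expected a Zip (99999) or Zip+4 (99999-9999) code."
        };
    }
    // If a state was supplied and we know the first digit, cross-check them
    var allowed = STATES_BY_FIRST_DIGIT[zip.charAt(0)];
    if (state && allowed && allowed.indexOf(state) === -1) {
        return {
            valid: false,
            message: "Zip code " + zip + " does not belong to " + state + "."
        };
    }
    return { valid: true, message: "" };
}
```

Returning the message instead of writing it to a field keeps the sketch independent of the form; a field-based version would copy the message into the field's validation message as the comment above describes.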

Keystroke validation

For the Canadian postal code validation, I introduced a change event that forced the entry to upper case. This time around, I have extended that concept to disallow keystrokes that would cause an invalid zip or postal code. A few words of explanation about some of the xfa.event properties that were used:

  • xfa.event.change – represents the delta between the previous change and the current one. Usually this is the content of a single keystroke. However it can also contain the contents of the clipboard from a paste operation. This property can be updated in the change event script to modify the user’s input. It can be set to an empty string to swallow/disallow user input.
  • xfa.event.newText – represents what the field contents will look like after the changes have been applied. Modifying this property has no effect.
  • xfa.event.selEnd – The end position of the changed text. Usually when the user is typing, we are positioned at the end of string, but the user could be inserting characters at any position.

Here is the change event script for the zip code:

Address.Address.USAddress.Zip::change - (JavaScript, client)
// restrict entry to digits and a dash
if (xfa.event.change.match(/[0-9\-]/) == null)
    xfa.event.change = "";

// Allow the hyphen at the 6th character only
if (xfa.event.change == "-" && xfa.event.selEnd != 5)
    xfa.event.change = "";

// If the 6th character is a digit, and they're typing at the end, insert the hyphen
if (xfa.event.change.match(/[0-9]/) != null &&
    xfa.event.newText.length == 6 &&
    xfa.event.selEnd == 5) 
{

    xfa.event.change = "-" + xfa.event.change;
}

var vMax = 9;
if (xfa.event.newText.indexOf("-") != -1)
    vMax = 10;

// don't allow any characters past 9 (10 with a hyphen)
if (xfa.event.newText.length > vMax)
    xfa.event.change = "";

In hindsight, I could have done a better job with this script.  It is still possible to enter invalid data.  e.g. after adding the hyphen at the 6th character, the user could cursor back and insert digits, forcing the hyphen beyond the 6th character.  A better approach might be to modify the validateZipCode() method so that it will validate partial zip codes.  Then block any user input that doesn't result in a correct partial zip code.

There is a similar block of logic for the postal code change event.
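The partial-validation idea for the zip code could be sketched as follows. isPartialZip() and filterZipChange() are hypothetical names, not part of the sample forms, and the sketch assumes that any prefix of 99999 or 99999-9999 counts as a valid partial entry.

```javascript
// Hypothetical helpers; the names are not part of the sample forms.
// A partial zip is any prefix of 99999 or 99999-9999.
function isPartialZip(text) {
    return /^[0-9]{0,5}$/.test(text) || /^[0-9]{5}-[0-9]{0,4}$/.test(text);
}

// Returns the text to keep as the change: the original change if the
// resulting field contents are still a valid prefix, otherwise "".
function filterZipChange(change, newText) {
    return isPartialZip(newText) ? change : "";
}
```

In a change event this would be used as xfa.event.change = filterZipChange(xfa.event.change, xfa.event.newText);. Because the check looks at the whole resulting string, inserting digits in front of an existing hyphen is rejected. Unlike the script above, this sketch does not insert the hyphen automatically.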

Customizing Data Capture

The really hard part of this data capture problem is how to tailor the input by country. I have two samples that take different approaches.

Sample 1: Different subforms for each country

In this sample, the strategy is to use two subforms for data capture. One subform that has a province and postal code field for Canadian addresses. One that is tailored for capturing a US address.

To make this design work, we create two subforms (CanadianAddress and USAddress), set the binding of each subform to “none”. Then bind the individual fields to the address data. The reason for this approach is that we want both subforms to populate the same data. Multiple fields are allowed to bind to the same XML data element, but you cannot bind multiple subforms to the same XML data.

Show/hide logic. It is not enough to simply set the presence of the subforms to visible/invisible. A hidden field will still run its validation script. We want to make the subforms optional and add/remove them as appropriate. To make this exercise a little more interesting, I assumed that we were not in a flowed layout. Now the problem is that unless you’re in a flowed context, Designer does not allow you to make the subform optional (under the binding tab). However, the XFA runtime does not have this restriction. There are two workarounds: 1) Modify the source in the XML view 2) fix it in script. I chose the latter approach. Subform occurrences are managed by the <occur> element. By default, the address subforms will be defined as:

<occur initial="1" max="1" min="1"/>

We can change the minimum via script in order to make them optional:

Address.Address.Country::initialize - (JavaScript, client)
USAddress.occur.min = "0";
CanadianAddress.occur.min = "0";

Once the subforms are defined, simply place them on top of each other at the same page location. When changing country from the country drop down list, the subforms will toggle on/off accordingly:

Address.Address.Country::validate - (JavaScript, client)
// Choose which subform address block to use depending on the country
_USAddress.setInstances(this.rawValue == "U.S." ? 1 : 0);
_CanadianAddress.setInstances(this.rawValue == "Canada" ? 1 : 0);
true;

Sample 2: One Subform, change the field properties

In this sample, the strategy is to create one set of dual-purpose fields. One field to capture either a postal code or a zip code and one field to capture either a state or a province. When the country changes, we modify the field definitions so that they behave appropriately. The changed properties included the caption text, the picture clauses and the contents of the state/province drop down lists. The validation that happens in the change event and in the validation script needs to branch to accommodate the appropriate context.

The logic to toggle the field definitions looks like:

Address.Address.Country::validate - (JavaScript, client)
if (this.rawValue == "U.S." && 
    ZipPostal.caption.value.text.value != "Zip Code:")
{
    ZipPostal.caption.value.text.value = "Zip Code:";
    ZipPostal.ui.picture.value = "";
    ZipPostal.format.picture.value = "";

    StateProv.caption.value.text.value = "State:";

    StateProv.clearItems();
    StateProv.addItem("Alabama", "AL");
    StateProv.addItem("Alaska", "AK");
    StateProv.addItem("Arizona", "AZ");
  . . .
    StateProv.addItem("Wyoming", "WY");
} else if (this.rawValue == "Canada" &&
           ZipPostal.caption.value.text.value != "Postal Code:")
{
    ZipPostal.caption.value.text.value = "Postal Code:";
    ZipPostal.format.picture.value = "text{A9A 9A9}";
    ZipPostal.ui.picture.value = "text{OOO OOO}";

    StateProv.caption.value.text.value = "Province:";

    StateProv.clearItems();
    StateProv.addItem("Alberta", "AB");
    StateProv.addItem("British Columbia", "BC");
    StateProv.addItem("Manitoba", "MB");
. . .
    StateProv.addItem("Yukon", "YT");
}
true;
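If more countries were added, the branches above could be driven from a lookup table instead. This is only a sketch of that design choice: ADDRESS_SETTINGS, settingsFor() and applySettings() are hypothetical names, the property paths are the ones used in the script above, and the item lists are truncated exactly as they are in that listing.

```javascript
// Hypothetical lookup table built from the values in the script above.
// The item lists are truncated exactly as in the original listing.
var ADDRESS_SETTINGS = {
    "U.S.": {
        zipCaption: "Zip Code:", formatPicture: "", uiPicture: "",
        stateCaption: "State:",
        items: [["Alabama", "AL"], ["Alaska", "AK"], ["Arizona", "AZ"],
                /* . . . elided in the original */ ["Wyoming", "WY"]]
    },
    "Canada": {
        zipCaption: "Postal Code:",
        formatPicture: "text{A9A 9A9}", uiPicture: "text{OOO OOO}",
        stateCaption: "Province:",
        items: [["Alberta", "AB"], ["British Columbia", "BC"], ["Manitoba", "MB"],
                /* . . . elided in the original */ ["Yukon", "YT"]]
    }
};

// Returns the settings for a country, or null if the country is unknown.
function settingsFor(country) {
    return Object.prototype.hasOwnProperty.call(ADDRESS_SETTINGS, country)
        ? ADDRESS_SETTINGS[country] : null;
}

// Applies the settings to two field-like objects.
// Returns false when the fields are already configured for this country.
function applySettings(zipPostal, stateProv, s) {
    if (zipPostal.caption.value.text.value === s.zipCaption) {
        return false;
    }
    zipPostal.caption.value.text.value = s.zipCaption;
    zipPostal.format.picture.value = s.formatPicture;
    zipPostal.ui.picture.value = s.uiPicture;
    stateProv.caption.value.text.value = s.stateCaption;
    stateProv.clearItems();
    for (var i = 0; i < s.items.length; i++) {
        stateProv.addItem(s.items[i][0], s.items[i][1]);
    }
    return true;
}
```

With this shape, supporting another country would mean adding one table entry rather than another else-if branch.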

Comparing the approaches

  • Both samples work in Reader versions 7, 8 and 9
  • Sample 2 is easier to design, even though it requires more script.
  • Sample 1 is easier to extend in the event that you want your address block to handle more than just two countries.
  • Sample 1 requires dynamic forms.
  • Sample 2 could be modified to work for forms with fixed-pages. You would need to change the form design so that the caption is represented by a protected field (captions can be modified only on dynamic documents).

October 30, 2008

Data Binding with Predicates

Have you ever come across “inverted XML”? Take this example:

<addresses>
  <row>
    <column name="first name">Halla</column> 
    <column name="last name">Ayers</column> 
    <column name="street address">7466 Etiam Avenue</column>
    <column name="city">Arlington</column>
    <column name="state">BC</column>
    <column name="postal">T4K4O7</column>
  </row>
</addresses>

The question is, how do we bind a “firstName” field to the data value: <column name="first name">?

We would much prefer to deal with this data in the form:

<addresses>
  <row>
    <firstName>Halla</firstName>
    <lastName>Ayers</lastName> 
    <streetAddress>7466 Etiam Avenue</streetAddress>
    <city>Arlington</city>
    <state>BC</state>
    <postal>T4K4O7</postal> 
  </row>
</addresses>

Then we would bind our firstName field to addresses.row.firstName. Easy.

But we do not always have control over the format of the data that we receive. If we are given the inverted form of the XML we have two choices:

  1. Convert to non-inverted XML using XSLT
  2. Bind to the data using predicate expressions

My preference is to avoid XSLT when at all possible. It would be very tricky with this example because the names contain spaces, so they cannot be re-used as XML element names. Instead we use predicate expressions for binding.

Predicates

We find predicate filtering in both XPath and E4X.
The XPath expression to identify the “first name” value from the sample is:

addresses/row/column[@name="first name"]

The E4X expression is:

addresses.row.column.(@name=="first name")

The XFA SOM expression is either:

addresses.row.column.[name=="first name"]

Or

addresses.row.column.(name.value=="first name")

The difference between the two XFA variations is that when brackets [] are used, the expression inside the brackets is evaluated as FormCalc. When parentheses () are used, the expression is evaluated as JavaScript.

In all these examples, predicates are evaluated the same way. We take each candidate instance of the <column> element and evaluate the predicate expression in the context of that element. When the expression returns true, the element is selected.
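That evaluation rule can be modelled in a few lines of plain JavaScript. The objects below stand in for data nodes and selectByPredicate() is a hypothetical name; this is a sketch of the idea, not of how the XFA processor is implemented.

```javascript
// Plain objects stand in for <column> data nodes (values from the sample data).
var columns = [
    { name: "first name", value: "Halla" },
    { name: "last name", value: "Ayers" },
    { name: "city", value: "Arlington" }
];

// Evaluate the predicate once per candidate; keep the candidates
// for which it returns true. selectByPredicate is a hypothetical name.
function selectByPredicate(candidates, predicate) {
    var selected = [];
    for (var i = 0; i < candidates.length; i++) {
        if (predicate(candidates[i])) {
            selected.push(candidates[i]);
        }
    }
    return selected;
}

// Equivalent in spirit to column.(name.value == "first name")
var firstName = selectByPredicate(columns, function (column) {
    return column.name === "first name";
});
```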

Data Binding

This sample form (and sample data) has fields with binding expressions that display the set of addresses.

The (repeatable) address subform binds to addresses.row[*]

The firstName field binds to $.column.[name == "first name"]

The lastName field binds to $.column.(name.value == "last name")

Data Creation Problem

One significant issue with using predicates for binding is that we cannot easily create new instance data. i.e. it is not possible to infer the XML structure that corresponds to the binding expression.  If we were to simply create a new instance of the address subform, the data we create would look like:

<row>
  <column name=""/>
  <column name=""/>
  <column name=""/>
  <column name=""/>
  <column name=""/>
  <column name=""/>
  <column name=""/>
</row>

To get around that, we create our (empty) data before we create a new instance of the address subform. The script looks like this:

addresses.addRow::click - (JavaScript, client)
// define the data for a new row
var vNewRowData =
  <row>
    <column name="first name"/>
    <column name="last name"/>
    <column name="street address"/>
    <column name="city"/>
    <column name="state"/>
    <column name="country"/>
    <column name="postal"/>
  </row>;

// load the new row of data into the data DOM
addresses.dataNode.loadXML(vNewRowData.toString(), false, false);

// add a new instance of the address subform,
// and tell it to bind to the new data 
_address.addInstance(true);

Notice that the address row has been specified according to the data schema and has been inserted into our data DOM before the address subform was created.  Then when we create a new instance of the address subform, it will bind to the data we pre-created.
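If the list of column names changes often, the empty row could also be generated from an array of names. The sketch below builds the same markup as a plain string; buildEmptyRow() and escapeAttribute() are hypothetical helpers, not part of the sample form.

```javascript
// Escape the characters that are not allowed inside a double-quoted attribute.
function escapeAttribute(text) {
    return text.replace(/&/g, "&amp;")
               .replace(/</g, "&lt;")
               .replace(/"/g, "&quot;");
}

// Build an empty <row> with one <column name="..."/> per name.
// buildEmptyRow is a hypothetical helper name.
function buildEmptyRow(columnNames) {
    var xml = "<row>";
    for (var i = 0; i < columnNames.length; i++) {
        xml += '<column name="' + escapeAttribute(columnNames[i]) + '"/>';
    }
    return xml + "</row>";
}

var vNewRowXml = buildEmptyRow([
    "first name", "last name", "street address",
    "city", "state", "country", "postal"
]);
```

The resulting string could then be passed to loadXML() in place of vNewRowData.toString() in the script above.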

The Deep End

Using this same data set, it is interesting to apply predicates in a slightly different way. In this example I wanted my form to display the US addresses first and the Canadian addresses second. The form has separate subform definitions for each. If we had been working with the non-inverted form of the data this would have resulted in a pretty simple binding expression: $.row.[country == "US"]

But because of the inverted form of the data, the binding expression gets very complicated. We want to qualify the row element based on the column where name="country" and the value is "US". To get there we use a nested predicate to bind the row:

$.row.[ $.resolveNode("column.[name==""country""]") == "US" ]

The outer predicate filters <row> and the inner predicate filters <column>.
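The two levels of filtering can be modelled on plain objects as well. In this sketch the rows are made-up illustration data (the second row in particular is invented), and columnValue() and rowsForCountry() are hypothetical names.

```javascript
// Made-up rows in the inverted shape: each row is a list of named columns.
var rows = [
    { columns: [ { name: "first name", value: "Halla" },
                 { name: "country", value: "Canada" } ] },
    { columns: [ { name: "first name", value: "Pat" },
                 { name: "country", value: "US" } ] }
];

// Inner predicate: find the column with the given name and return its value.
function columnValue(row, columnName) {
    for (var i = 0; i < row.columns.length; i++) {
        if (row.columns[i].name === columnName) {
            return row.columns[i].value;
        }
    }
    return null;
}

// Outer predicate: keep the rows whose "country" column has the given value.
function rowsForCountry(allRows, country) {
    var selected = [];
    for (var i = 0; i < allRows.length; i++) {
        if (columnValue(allRows[i], "country") === country) {
            selected.push(allRows[i]);
        }
    }
    return selected;
}

var usRows = rowsForCountry(rows, "US");
```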