Working with multiple datasets

One of the formfeed blog commenters ("mo") asked about preserving data during an import operation.   I gave her (him?) a flippant reply with some hand-waving about saving/restoring data before/after import. Then I tried it myself and discovered it was not nearly as easy as I thought.

Here’s the problem description:  Your data arrives in two separate data files.  You need to import them both into your form.  Problem is that importing new data replaces existing data.  Loading the second data file will discard the data from your first import. 

Let’s set up a specific example — Suppose my data looks like:

<multidata>
  <set1> … </set1>
  <set2> … </set2>
</multidata>

We want to be able to load set1 and set2 from different data files.

There are a couple of solutions to this problem.  But first, some review of dataset handling within XFA/PDF forms.  Normally form data gets stored under $data, which is a shortcut for xfa.datasets.data.  If the root element of your data is "multidata", then your data appears under xfa.datasets.data.multidata.

The XML hierarchy looks like this:

<xfa:datasets xmlns:xfa="http://www.xfa.org/schema/xfa-data/1.0/">
  <xfa:data>
    <multidata>
      <set1> … </set1>
      <set2> … </set2>
    </multidata>
  </xfa:data>
</xfa:datasets>

When Acrobat performs a data import, it replaces the <xfa:data> element. But it *appends to* any other datasets. 

Solution 1: Preserve/Restore data before/after import

If during import you could arrange your data to look like:

<xfa:datasets xmlns:xfa="http://www.xfa.org/schema/xfa-data/1.0/">
  <xfa:data>
    <multidata>
      <set2> … </set2>
    </multidata>
  </xfa:data>
  <set1> … </set1>
</xfa:datasets>

In this case, <set1> would be preserved and only <set2> would be replaced.  Then after the import is complete,  you’d move <set1> back where it belonged.  The way to control the import is to use the Acrobat script function: importXFAData();  Here’s the outline of the script to import set2:

  1. Copy set1 data to be a child of xfa:datasets
  2. Remove the set1 subform
  3. Call importXFAData() to load set2
  4. Move the set1 data back under <multidata>
  5. Re-add the set1 subform
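
The first three steps might look roughly like this in the button's click event. This is a sketch, not the actual sample form's script: it only runs inside an XFA form in Acrobat, and it assumes the data root is named multidata and that the set1 subform has an instance manager (_set1) whose minimum occurrence is zero.

```javascript
// click event of the "import set2" button (XFA JavaScript)

// 1. Park a copy of the set1 data directly under xfa:datasets,
//    where a data import will not touch it
var set1Data = xfa.datasets.data.multidata.set1;
xfa.datasets.nodes.append(set1Data.clone(1)); // 1 = deep clone

// 2. Remove the set1 subform (and with it, the set1 data under xfa:data)
_set1.setInstances(0);

// 3. Prompt the user for a data file.  On success Acrobat remerges the
//    form and no script after this call runs; in Reader the call throws.
try {
  event.target.importXFAData();
} catch (e) {
  // Import unavailable (e.g. Reader) or failed: restore set1 right away
  // rather than waiting for the form:ready script.
}
```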

There are a number of tricky parts:

  • When importXFAData() is successful, it causes a remerge.  When remerge happens, any script commands after the call to importXFAData() will not execute.  The workaround is to perform steps 4 and 5 in a separate form:ready script.
  • importXFAData() does not return a status.  You have no way of knowing if the user cancelled.  If they did cancel, you need to restore set1 without depending on the form:ready script.
  • If the user is running in Reader, then importXFAData() will throw an error.  We need to catch this error and restore set1 data.
  • If the user imports data from the Acrobat menu (Forms/Manage Form Data/Import Data) then all your clever script won’t run and set1 data will get cleared.  You need to figure out how to remove this option from the Acrobat menu.

Note that this specific example assumes you are loading data in the order set1 then set2.  The form could be coded more generally to load the data in any order.  It would just be a bit more complicated.  You’d need to move both set1 and set2 and then after the load you’d figure out which one(s) need to be moved back.

Here is a sample form.  Here is sample set2.xml data you can load.  Have a look at the button click and form:ready events for all the gory details.

Solution 2: Bind set1 outside of xfa:data

Instead of temporarily arranging your data so that set1 is under <xfa:datasets>, you could permanently arrange it this way.  In the binding expression for the set1 subform, specify "!set1", which is a shortcut for xfa.datasets.set1.  Now whenever you import data for set2, the import will leave set1 untouched.  However, this introduces a new problem: whenever you import new set1 data you will end up with multiple copies of set1, so you need a form:ready script that deletes all but the last copy.  This also means that the data file holding set1 needs to include the <xfa:datasets> element so that it can correctly specify the location for set1.
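
The trimming logic is simple enough to show outside the form. The sketch below uses a plain array as a stand-in for xfa.datasets.nodes; in the real form:ready script you would walk xfa.datasets.nodes and call nodes.remove() on the stale copies.

```javascript
// Keep only the most recent top-level "set1" node; all other nodes survive.
function keepLastSet1(nodes) {
  var copies = nodes.filter(function (n) { return n.name === "set1"; });
  var stale = copies.slice(0, copies.length - 1); // all but the last import
  return nodes.filter(function (n) { return stale.indexOf(n) === -1; });
}

// After importing set1 data twice, xfa:datasets holds two <set1> copies
var datasets = [
  { name: "data", payload: "set2 …" },
  { name: "set1", payload: "first import" },
  { name: "set1", payload: "second import" }
];
var trimmed = keepLastSet1(datasets);
// trimmed holds the "data" node plus only the second "set1"
```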

Here is a sample file with set1 data and set2 data. The script to trim back the extra copies of set1 data is found in the multidata form:ready event.

My personal preference would be to use Solution 2.  The script is simpler.  The user can use the menu commands for loading the data.  But this approach might not be possible if your data is bound to a schema.

10 Responses to Working with multiple datasets

  1. Christopher Till says:

    Hi John, I’m stumped. I created a form with a flowable layout in LiveCycle ES 8.2. The scope of the form is to duplicate as many forms as I have records. Each form has flowable subforms that will adjust depending on the imported data. The end result is a 2 page form for each record. It is bound to an xml file which feeds the data.

    My test xml file has 12 records. When I pull it into Adobe Acrobat 9 Pro it appears as a single blank page (as it should). I import my xml data file which instantly populates 12 forms (24 pages) one after another (as it should).

    The goal: I want to extract each form into a folder with a specific name (nameonform current date.pdf). I want to do this as an automated process since extracting individually would 1. be too time consuming and 2. not be possible through the menu with a dynamic xml pdf created in LiveCycle. The information I have found on doing a batch process does not indicate anywhere that this type of extraction and rename cannot be done with my document, and yet nothing happens when I run my basic script.

    // A regular expression to acquire the base name of the file.
    var re = /.*\/|\.pdf$/ig;
    var filename = this.path.replace(re, "");
    try { for ( var i = 0; i

  2. Christopher: Unfortunately, you cannot extract pages from a dynamic PDF. One thing you could try is to save 12 copies of the same form, but with just the data for the pages you want displayed, i.e. break up your data into 12 fragments and embed one fragment with each copy of the form. Good luck! John

  3. Jason says:

    Hi John, I’m not too quick in the programming stakes, but have been scouting for a way to import multiple LiveCycle generated .xml files into one LiveCycle document (essentially so we can reuse aspects of data gathered from various forms submitted by the client). Is that “in essence” what is being achieved in the “Working with multiple datasets” example above (as I understood it, you’re getting two .xml files into one form, which is about what we really need)? And if so, could that approach be extended to permit say 4 .xml data sets to come into one document?

    For (budget and) workflow reasons our forms are not reader extended, so all data will be .xml coming back (hence also aggregating the data by a web service function won’t work).

    Sorry to bring the forum IQ down (quite) a bit. Any pointers much appreciated.

  4. Jason: Generally speaking, your challenge is to aggregate multiple sets of instance data into a single document. There are (at least) 3 ways you can do this:

     1) Aggregate the instances as a repeating element in a single record, e.g.:

        <xfa:data>
          <aggregatedData>
            <formData>first form data</formData>
            <formData>second form data</formData>
            <formData>third form data</formData>
            …
          </aggregatedData>
        </xfa:data>

     2) Multiple records (see: http://blogs.adobe.com/formfeed/2009/09/working_with_multiple_data_rec.html):

        <xfa:data>
          <formData>first form data</formData>
          <formData>second form data</formData>
          <formData>third form data</formData>
          …
        </xfa:data>

     3) Use multiple data sets:

        <xfa:datasets>
          <xfa:data>
            <formData>first form data</formData>
          </xfa:data>
          <formData>second form data</formData>
          <formData>third form data</formData>
        </xfa:datasets>

     With each option you ought to be able to add as many sets of instance data as needed. It’s hard to know which technique to recommend without knowing more about your specific application, although if your data is homogeneous (same schema) I’d lean toward options 1 or 2.

     Good luck.
     John

  5. jason says:

    Thanks John. For what we are working with, the data will derive from varying forms (wholly different content) and not be multiple instances of the same sets, so I’m guessing method three above is the best way to explore? Cheers

  6. Jason: If each of the xml fragments is different, then option 2 should probably be eliminated. You can probably make 1 or 3 work. Option 1 would likely be cleaner in the long run, especially if you can aggregate the data outside of Reader. John

  7. jason says:

    Thanks John :)

  8. Jason says:

    Hi John, I’ve been hacking away at option 1 for a couple of weeks (hampered by having patience, but not intelligence, as my primary virtue). Is there any chance you have an example pdf for option 1 that is able to load say 2 or 3 sets of disparate (non-homogeneous) data that I could look at/see working? That would be a massive help. Thanks

  9. Gary Cooper says:

    Let’s see if I understand this. Either way you are adding to the original dataset. You are not using one to import data and one to export data. Is this correct? How would you be able to deal with the two datasets independently? One as an input and one as an output?

    • Gary:

      There’s no distinction in the data as to what’s input or output. You’d have to set this up yourself somehow. If you wanted to export a subset of data, there are two possibilities I can think of:
      – generate the output XML using saveXML() or E4X; create a data object: createDataObject(); then export it: exportDataObject()
      – Use Web services. The WSDL bindings let you differentiate input from output
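
      The first route might look roughly like this from inside the form. This is a sketch that only runs inside Acrobat; the node path and file name are illustrative.

      ```javascript
      // Serialize just the set1 branch of the data DOM
      var xml = xfa.datasets.data.multidata.set1.saveXML("pretty");
      // Attach it to the document as a named data object...
      event.target.createDataObject("set1.xml", xml);
      // ...and prompt the user to save it to disk
      event.target.exportDataObject({ cName: "set1.xml", nLaunch: 0 });
      ```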

      good luck.

      John