Null Data Handling

We often run into scenarios where we want to control how null values are represented in our XML instance data.  There are three ways to represent a null value in XML:

  1. Exclude the element.  If the value has no data, do not write it out to the XML file.
  2. Use XML Schema’s nil attribute:
    <spouseName xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:nil="true"/>
  3. Write out an empty element, e.g. <spouseName/>
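
To make the three options concrete, here is a minimal sketch in plain JavaScript of what each one writes for a null value.  The function name serializeNull and the strategy names are my own for illustration (the strategy names mirror the dd:nullType values used in the data description); this is not how the XFA processor is implemented.

```javascript
// Hypothetical helper: returns the XML text written for a null value,
// or an empty string when the element is left out entirely.
var XSI_NS = "http://www.w3.org/2001/XMLSchema-instance";

function serializeNull(elementName, strategy) {
    switch (strategy) {
        case "exclude":
            // Option 1: nothing is written at all.
            return "";
        case "xsi":
            // Option 2: an empty element flagged with xsi:nil="true".
            return "<" + elementName + " xmlns:xsi=\"" + XSI_NS +
                   "\" xsi:nil=\"true\"/>";
        case "empty":
            // Option 3: a plain empty element.
            return "<" + elementName + "/>";
        default:
            throw new Error("Unknown null strategy: " + strategy);
    }
}
```

Calling serializeNull("spouseName", "empty") returns the string <spouseName/>, and the "exclude" strategy returns an empty string.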

Null Handling in XML Schema

When determining which strategy to use, we look to the form’s XML Schema for hints:

  1. If a leaf element (no child elements, no attributes) is marked as optional (minOccurs="0"), then it will be excluded when null.
  2. If an element in the schema is marked as nillable="true", then a null value will be written with the xsi:nil attribute.
  3. In all other cases we write out null values as empty elements.
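
The three rules above can be sketched as a small decision function.  This is an illustration only: the decl object and its property names (isLeaf, minOccurs, nillable) are hypothetical, and I am assuming the rules are checked in the order listed, which the list does not state outright.

```javascript
// Hypothetical model of the three schema rules; not an XFA API.
// decl: { isLeaf: boolean, minOccurs: number, nillable: boolean }
function chooseNullStrategy(decl) {
    // Rule 1: an optional leaf element is excluded when null.
    if (decl.isLeaf && decl.minOccurs === 0) {
        return "exclude";
    }
    // Rule 2: a nillable element is written with xsi:nil.
    if (decl.nillable) {
        return "xsi";
    }
    // Rule 3: everything else is written as an empty element.
    return "empty";
}
```

Note that an optional group (minOccurs="0" but with child elements) falls through to rule 3, because rule 1 applies to leaf elements only.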

For the rest of this blog entry I’ll describe exactly how the XFA processor deals with null data, and offer some tips on how you can further customize the behaviour.  In particular, I’d like to show how to control null handling at the data group level.  

Let’s look at the problem using a specific example.  Here’s a sample purchase order schema:

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:element name="purchaseOrder" type="poType"/>
<xsd:complexType name="poType">
<xsd:sequence>

  <xsd:element name="emptyItem" minOccurs="0">
   <xsd:complexType>
    <xsd:sequence>
     <xsd:element name="partNum" type="xsd:string"/>
     <xsd:element name="description" type="xsd:string"/>
     <xsd:element name="quantity" type="xsd:positiveInteger"/>
     <xsd:element name="unitPrice" type="xsd:float"/>
    </xsd:sequence>
   </xsd:complexType>
  </xsd:element>

  <xsd:element name="excludeItem" minOccurs="0">
   <xsd:complexType>
    <xsd:sequence>
     <xsd:element name="partNum" type="xsd:string"/>
     <xsd:element name="description" type="xsd:string"/>
     <xsd:element name="quantity" type="xsd:positiveInteger"/>
     <xsd:element name="unitPrice" type="xsd:float"/>
    </xsd:sequence>
   </xsd:complexType>
  </xsd:element>

  <xsd:element name="xsiItem" minOccurs="0">
   <xsd:complexType>
    <xsd:sequence>
     <xsd:element name="partNum" type="xsd:string"/>
     <xsd:element name="description" type="xsd:string"/>
     <xsd:element name="quantity" type="xsd:positiveInteger"/>
     <xsd:element name="unitPrice" type="xsd:float"/>
    </xsd:sequence>
   </xsd:complexType>
  </xsd:element>

  <xsd:element name="comment1" type="xsd:string" minOccurs="0"/>
  <xsd:element name="comment2" type="xsd:string" nillable="true"/>

</xsd:sequence>
</xsd:complexType>
</xsd:schema>

 

Note the two comment elements at the bottom.  When comment1 is saved to data, it will be excluded when null.  When comment2 is saved to data it will be annotated with the xsi:nil attribute.

But some form authors want empty groups (purchase order items in our example) to be excluded when their contents are null.  I can show you how, but we are now getting into the deep end.

Data Description

When we save a PDF/XDP file that is based on a schema, sample XML, or a WSDL connection, we generate a data description.  This data description is really a distilled version of the schema.  It takes the form of sample XML annotated with special namespaced attributes.  The data description for the sample above looks like:

<xfa:datasets
      xmlns:xfa="http://www.xfa.org/schema/xfa-data/1.0/">
<dd:dataDescription
      xmlns:dd="http://ns.adobe.com/data-description/"
      dd:name="purchaseOrder">
  <purchaseOrder>

   <emptyItem dd:minOccur="0">
    <partNum/>
    <description/>
    <quantity/>
    <unitPrice/>
   </emptyItem>

   <excludeItem dd:minOccur="0">
    <partNum/>
    <description/>
    <quantity/>
    <unitPrice/>
   </excludeItem>

   <xsiItem dd:minOccur="0">
    <partNum/>
    <description/>
    <quantity/>
    <unitPrice/>
   </xsiItem>

   <comment1 dd:minOccur="0" dd:nullType="exclude"/>
   <comment2 dd:nullType="xsi"/>

  </purchaseOrder>
</dd:dataDescription>
</xfa:datasets>

 

Note how the comment elements are annotated.  We use dd:nullType to specify the null handling behaviour:

  1. dd:nullType="exclude": don’t write out elements where the value is null.  Note that this option may be used only if the element is marked as optional (minOccurs="0" in the schema, dd:minOccur="0" in the data description).
  2. dd:nullType="xsi": use the XML Schema nil attribute.  As described in the W3C definition:

    "XML Schema: Structures introduces a mechanism for signaling that an element should be accepted as valid when it has no content despite a content type which does not require or even necessarily allow empty content. An element may be valid without content if it has the attribute xsi:nil with the value true. An element so labeled must be empty, but can carry attributes if permitted by the corresponding complex type."

  3. dd:nullType="empty" (the default): save null values as empty elements.

The dd:nullType attribute can also be placed on grouping elements.  When it is placed on a grouping element, the setting is applied to all children of the group.  While there is nothing in XML Schema that will do this for us, we can do it by hand-editing the item elements of the data description in XML source view:

<emptyItem dd:minOccur="0" dd:nullType="empty">

<excludeItem dd:minOccur="0" dd:nullType="exclude">

<xsiItem dd:minOccur="0" dd:nullType="xsi">
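
One way to picture the group-level setting is as a lookup that walks up from an element to its ancestors.  The sketch below is a plain JavaScript model with hypothetical node objects ({ name, nullType, parent }); it assumes that an element’s own dd:nullType, when present, takes precedence over the group’s, which is my assumption rather than something stated above.

```javascript
// Hypothetical tree node: { name, nullType (optional), parent (optional) }.
// Returns the null handling that applies to the given node.
function effectiveNullType(node) {
    for (var n = node; n; n = n.parent) {
        if (n.nullType) {
            // Nearest setting wins (assumed precedence).
            return n.nullType;
        }
    }
    // No setting anywhere up the tree: fall back to the default.
    return "empty";
}

// Model of <excludeItem dd:minOccur="0" dd:nullType="exclude"> and one child.
var excludeItem = { name: "excludeItem", nullType: "exclude" };
var partNum = { name: "partNum", parent: excludeItem };
var inherited = effectiveNullType(partNum);
```

Here partNum carries no setting of its own, so it picks up "exclude" from excludeItem.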

And here is the obligatory sample form.  And the sample schema.  The sample form is bound to the sample schema, and has a field that shows the current state of the XML instance data.  Try typing values into the three different partNum fields and the comment fields, and watch how the data reacts according to the instructions in the data description.

The Really Deep End

The disadvantage to hand-editing your data description is that when your schema changes and the data description gets refreshed, your edits will be lost.

There is a way to write script that updates your data description automatically.  The general technique is described in this blog post.  The script below updates all of your data descriptions: for every element that is marked dd:minOccur="0", it adds dd:nullType="exclude".  Note that the changes will be applied when saving as PDF or when generating a PDF using LiveCycle.

purchaseOrder::initialize - (JavaScript, server)
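// Run only when the form is hosted by XFAPresentationAgent.
if (xfa.host.name === "XFAPresentationAgent") {
    var vDataDescriptionList = xfa.datasets.dataDescription.all;
    for (var i = 0; i < vDataDescriptionList.length; i++) {
        var vDD = vDataDescriptionList.item(i);
        // Serialize the data description so it can be edited as a string.
        var sDD = vDD.saveXML();
        // Add dd:nullType="exclude" after every dd:minOccur="0" that is
        // not already followed by a dd:nullType attribute.
        sDD = sDD.replace(
             /dd:minOccur="0"(?!\s+dd:nullType=)/g,
             "dd:minOccur=\"0\" dd:nullType=\"exclude\"");
        // Replace the original data description with the edited copy.
        xfa.datasets.nodes.remove(vDD);
        xfa.datasets.loadXML(sDD, false, false);
    }
}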
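
The string rewrite at the heart of the script can be tried outside the form in any JavaScript engine.  The sketch below is a stand-alone variant under a few assumptions of mine: the serialized data description writes dd:minOccur="0" with no spaces around the equals sign, any existing dd:nullType comes immediately after dd:minOccur, and elements that already carry a dd:nullType should be left alone.  The function name addExcludeToOptional and the sample string are hypothetical.

```javascript
// Stand-alone version of the string rewrite; no XFA objects involved.
function addExcludeToOptional(ddXml) {
    // Match dd:minOccur="0" unless a dd:nullType attribute already follows it.
    return ddXml.replace(
        /dd:minOccur="0"(?!\s+dd:nullType=)/g,
        "dd:minOccur=\"0\" dd:nullType=\"exclude\"");
}

// Hand-written fragment resembling part of the data description.
var sample =
    "<excludeItem dd:minOccur=\"0\"><partNum/></excludeItem>" +
    "<comment1 dd:minOccur=\"0\" dd:nullType=\"exclude\"/>" +
    "<comment2 dd:nullType=\"xsi\"/>";

var updated = addExcludeToOptional(sample);
```

In this sample only excludeItem gains the new attribute; comment1 already has one, and comment2 is not optional.  Running the function on its own output changes nothing, so it is safe if the script fires more than once.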