Archive for August, 2009

Base64 Encode a PDF attachment

This blog entry has been re-published with updated information.

Some time ago I experimented with PDF attachments — trying to add them to my XML data.  I wasn’t happy with the outcome at that time, and I was going to leave it.  But then I saw a customer scenario that called for this capability, and then one of my regular commenters (Bruce) brought up the topic as well.  So I’ve tried again and had a little more success this time. 

The end goal is to copy a PDF attachments into XML data.  If you can do so, it opens up a couple of interesting possibilities:

  • Take any attachments a user has added to the PDF and include it in a form submission or in a web service request
  • Take image attachments and use them to populate image fields

Acrobat methods for attachments

The acrobat document object has a property: dataObjects that returns an array of all the attachments in the current document.  Then are a set of methods for dealing with attachments: openDataObject, getDataObject, createDataObject, importDataObject, removeDataObject, getDataObjectContents, and setDataObjectContents.

The interesting method in our case is getDataObjectContents().  It returns a stream object with the contents of an attachment.  If your attachment happens to be textual, then you can use util.stringFromStream() to convert to a string value:

var inputStream = event.target.getDataObjectContents("MyNotes.txt");
Notes.rawValue = util.stringFromStream(inputStream);

The default encoding for binary attachments is a hex-encoding (each byte written as a 2 digit hex value).  However, when your attachment is in a binary format, the standard way to include it in an XML file is to encode it as base64.  To convert to a base64 encoding, use the Acrobat Net.streamEncode() method:

// Get a stream for the image attachment
var inputStream = event.target.getDataObjectContents("Smile.jpg");

// Get a new stream with the image encoded as base64
var vEncodedStream = Net.streamEncode(inputStream, "base64");

// Get a string from the stream
var sBase64 = util.stringFromStream(vEncodedStream);

// assign the base64 encoded value to an image field
sampleImage.rawValue = sBase64;

We know there are issues with Net.streamEncode() failing where content has null bytes.  However, when used in the context of encoding an attachment, it seems to work fine.

When I first looked at this problem I assumed that the Net.StreamEncode() method wasn’t working so I wrote a base64 encoding JavaScript.  It works fine — but it is slow!  On a 140K image, it takes 10 seconds to encode.  I’ve included this code in the sample just for interest sake.

Conversion to Base64

The attached sample form has an initialization script that displays a subform for each attachment.  There are two buttons that will take the corresponding attachment, convert it to base64 and assign it to the image field value.  One button uses the (slow) JavaScript encoding algorithm the other button uses Net.streamEncode() and works pretty quickly. 

The easiest and most reliable way to encode attachments in your XML data is to keep them in a hex encoding.  But of course, for this to work the consumer of your XML needs to be able to handle hex encoding as well.

Updated Message Handler and Debug Fragments

As I mentioned, I need to update a number of previously distributed script objects as fragments.  The reasons for the updates are:

  1. Make sure the code passes the lint check
  2. Add version information so that we can check whether your copy of the fragment/sample is stale or not
  3. New functionality/bug fixes

Message Handling

This entry discussed handling exception objects in a JavaScript catch block.

Here is the updated fragment.  You ought to be able to simply copy it into your fragment library.

The updates to the script include:

Fixed line end handling in Designer

Scripts marked runAt="server" will run when you save a form in Designer (this is explained here).  Any log messages issued by those scripts are emitted to the log tab in Designer.  For line breaks to appear correctly in this window, the line endings need to include both a carriage return and line feed.  The getServerMessage() method now makes sure to format the message accordingly.

Handle String Exceptions

The message handler functions can now handle the case where the code throws a string.

To see how this is useful, suppose you’re writing code for a click event.  Part way through your 200 lines of code you encounter a condition where you want to give an error message and exit.  But you can’t.  You’re not in the context of a function call, so you can’t simply say "return;" and get out.  You end up putting a big conditional branch in your code:

try {
    if (errorCondition) {
        xfa.host.messageBox("This event failed");
    } else {
        … 150 lines of remaining code …
    }
} catch(err) {
    xfa.host.messageBox( scMessage.handleError(err) );
}

The alternative now is to simply throw a message string and get out.  Your revised event code looks like:

try {
    if (errorCondition) {
        throw "This event failed";
    }
    … 150 lines of remaining code …
} catch(err) {
    xfa.host.messageBox( scMessage.handleError(err) );
}

Debug Trace

This entry showed how to easily add debug trace information to your JavaScript functions.  The trace() method has been updated to include handling for function parameters with null values.

Here is the updated fragment.

Dependency Tracking

One of the really cool aspects of XFA forms is the dependency tracking mechanism.  Dependency tracking is the feature where your field (and subform) calculations and validation re-fire whenever any of their dependents change.  Today I’ll explain a bit about the mechanics of how dependency tracking is implemented as well as have a look at possible design issues related to this mechanism.

Simple Example

Suppose my purchase order form has order rows with fields: price, quantity and subtotal along with a total field.

The subtotal field has a JavaScript calculation: "price.rawValue * quantity.rawValue"

The total field has a FormCalc calculation: "sum(order[*].subtotal])"

Now whenever price or quantity fields change, the corresponding subtotal field will be recalculated automatically.
Whenever any of the subtotal field values change, the total calculation will re-fire.

Discovering Dependencies

When the form is first opened, all the calculations and validations are executed.  While executing a calculation or validation, the XFA engine keeps track of each object (field, subform, data node) that gets referenced during the script execution.  Each referenced object is added to its dependency list.  In our example, each subtotal field becomes dependent on the price and quantity fields from its row.  The total field becomes dependent on all the order subforms and on all the subtotal fields.  The mechanism is robust enough that even if you reference your objects indirectly, we still find the dependency.  e.g. the subtotal calculation could be written as:

parent.nodes.item(0).rawValue + parent.nodes.item(1).rawValue

Since the nodes we visited were still price and quantity, the dependencies are established just the same.

In some cases, the dependency list will grow as values change.  Consider this calculation:

if (A.rawValue < 100) {
    A.rawValue;
} else {
    B.rawValue;
}

Suppose the first time it executes, field A has a value of 20.  This means the code will not execute the else clause and will not access field B.  The initial dependency list will include only A.  However, if the A of changes to 200, the calculation will re-fire; the else clause will be evaluated and field B will be added to the dependency list.

If your calculation or validation calls a script object, dependencies will continue to be tracked during the execution of the script object method.

Dependent Properties

What constitutes a field change? What changes cause a calculation to re-fire?  I don’t have a complete list.  But changing a field’s visual properties (colours, captions, presence) do not cause a dependent to recalculate.  Changing the value or changing the items in a list box will cause a dependent calculation/validation to re-fire.

Turn off Automatic Calculations

If, for some reason, you’re not happy with the default dependency tracking, then you can turn it off. There are two switches to control calculations and validations independently:

xfa.host.calculationsEnabled = false;
xfa.host.validationsEnabled = false;

Note that turning validationsEnabled off disables not only script validations, but also turns off mandatory field and picture clause validations.

Dependency Loops

It is possible to get into a situation where sequence of dependency changes goes into an infinite loop. In these cases, the XFA engine will allow the calculation or validation to re-fire a fixed number of times (10) and then will quietly stop re-firing the calculation or validation until the next time a dependent changes.  While the 10 times limit is ok in most cases, I have seen forms where this has been the root of performance issues.  If you have a dependency loop and your calculations are re-firing, you want to be aware of it and you need to fix it. 

Dependency loops are typically introduced when a calculation or validation modifies another field.

I should highlight that statement.  If your calculation or validation minds its own business and never modifies another field you shouldn’t run into any loops.  There are lots of cases where a calculation or validation modifies another field and everything works fine — but tread carefully.

Looping example

In order to track when my validations were changing, I wrote validation scripts write to a debug field:

field T1 validation:

debugTrace.rawValue += this.rawValue;
true;

field T2 validation:

debugTrace.rawValue += this.rawValue;
true;

Both T1 and T2 now have a dependency on field: debugTrace

The sequence of events:

  1. Field T1 changes
  2. T1 validation fires and modifies debugTrace
  3. T2 validation fires and modifies debugTrace
  4. T1 validation fires and modifies debugTrace
  5. T2 validation fires and modifies debugTrace

Here is a sample form to illustrate this example.

Note that if we remove the validation from T2, the circularity stops.  The validation on T1 modifies debugTrace, but after the validation completes, debugTrace does not change and T1’s validation does not re-fire.

Version Control for Forms and Fragments

It’s a little late in the season for spring cleaning, but better late than never. I have struggled with how best to deliver and (more importantly) update the samples delivered on this blog. Normally what I do is add a note to the blog entry where the sample was introduced and then hope that people notice (do RSS readers notify you when an entry gets updated?) But then there’s the problem where a framework has evolved over a series of blog entries. Which entry has the most recent version of the validation framework? Dunno. How do you know if you’re using the most recent version of the debugger or the lint checker?  What if I find a bug in the tool and want to send you an update? How do I best communicate that to you?

I am fairly certain that I’m not the first to struggle with these problems. Many of you will have similar issues in distributing forms to your end-users. Given that, I am working on a methodical way to track content. The result is a framework that you could adapt for your own form distribution.

Here’s my revised plan for distributing content:

  • Distribute script objects as fragments (in addition to embedding them inside the samples)
  • Embed version information in both the sample forms and fragments
  • Post an inventory XML file with version information on the blog site
  • Embed script in the sample forms so they can “call home” and check their version number(s)

Distribute as fragments

From now on when I distribute a sample I will also distribute any re-usable pieces (subforms or scripts) as downloadable fragments (XDP files).

The sample forms will have their fragment references embedded. This is to make it easier to open the sample in Designer — without the need to re-connect the fragments.

Embed Version History

Once version information is embedded in a form or fragment, then we have the basis for checking whether there is a newer version available.  The question is how best to embed the version information.

My first thought was to embed the information in the form metadata. But there is not yet enough tooling and support around metadata to make this work well. In the long run, I think this would be the right solution. But in the short term we can’t easily get there from here.

The next best solution is to embed the version information in script object variables. Here is the script object variable for the fragment used in the sample that allows formcalc to be called from JavaScript:

var oVersionInfo = {
  identifier: "FormCalcFromJavaScript",
  assetType: "fragment",
  description: "JavaScript access to FormCalc built-in functions.", 
  currentVersion: 2,
};

Some interesting things to note:

  • The same script object is used to describe both fragments and forms.
  • The variable must be named “oVersionInfo” in order to be detected by the version checking mechanism.
  • A form that uses multiple fragments will have several of these “oVersionInfo” variables embedded.

Inventory

Having version information in the form is the first step.  The second step is to have a central location to store the up to date version information. To this end, I’ve posted an "inventory" at: http://blogs.adobe.com/formfeed/Samples/Inventory.xml

The inventory will be used to compare the version information in a given form against the most recent available. The xml for the inventory looks very similar to the JavaScript object:

<inventory xmlns="http://blogs.adobe.com/formfeed/">
  <asset>
    <identifier>PhoneHome</identifier>
    <assetType>fragment</assetType>
    <description>Compare version info to the latest</description>
    <currentVersion>1</currentVersion>
    <filename>checkVersion.xdp</filename>
  </asset>

  <asset>
    <identifier>FormCalcFromJavaScript</identifier>
    <assetType>fragment</assetType>
    <description>Support for FormCalc functions</description>
    <currentVersion>2</currentVersion>
    <filename>scFormCalc.xdp</filename>
  </asset>

  <asset>
    <identifier>ExceptionHandler</identifier>
    <assetType>fragment</assetType>
    <description>Handle exception objects</description>
    <currentVersion>2</currentVersion>
    <filename>scMessage.xdp</filename>
  </asset>

  <asset>
    <identifier>FCfromJSSample</identifier>
    <assetType>form</assetType>
    <description>Demonstrates how to call FC from JS.</description>
    <currentVersion>2</currentVersion>
    <filename>FormCalcFromJS.pdf</filename>
  </asset>
  …
</inventory>

Phone Home

Now to put it all together.  We need some script to check whether a form has the most recent versions of all its content.

The first step is to update the sample from the “FormCalc from JavaScript” entry.  I added support for FormCalc’s Get() function.   In order to retrieve the inventory, the form performs a call:

fc.func.Get(“http://blogs.adobe.com/formfeed/Samples/Inventory.xml”);

(Acrobat/Reader will warn you about this request)

The result then loaded into an E4X XML object. Meanwhile we scan the rest of the form looking for script objects with an “oVersionInfo” variable. We compare the version information embedded in the form against the version information in the inventory and notify regarding any available updates.

The fragments to make this all possible are:

scFormCalc.xdp : The subform holding the FormCalc functionality.

scMessage.xdp : Updated exception handling methods

scVersion.xdp : Contains the logic for the "phone home" capability — compare the form version information against the inventory.

scXML.xdp : Has a method for trimming an XML string so that E4X can load i
t.

Futures

Server-side logic

I chose to store the version checking logic in the sample form itself.  A better solution would be for that logic to reside on the server.  That way, the version checking logic could be updated without changing all the distributed forms.  But given my limitation of distributing via a blog site, I went with the client side logic.

Tools

I have a couple house-keeping tools I’m using that are not ready for prime time:

  • A form to harvest version information from a fragment/form and update Inventory.xml
  • A form to externally introspect a form for the latest version information (as opposed to embedding this logic in the form itself)

If there’s interest, I will consider polishing these to the point where they can be shared.

Source Control

I’ve looked at the versioning problem primarily from a distribution point of view.  I didn’t try to do anything from the perspective of source control.  There are things that could be done to make source code control easier:

  • Store revision details — e.g. Store comments, dates, author info for each change made to a form or fragment
  • A tool for tracking differences between versions of forms. e.g. report on all added/removed/modified fields/subforms/scripts etc between versions.
  • Integration of version information with form metadata

(No promises that I’ll manage to get to these topics.)

Samples

In the short term my challenge is to update a bunch of the samples I’ve previously distributed.