Track PDF Forms with Omniture

No doubt you noticed that Adobe acquired Omniture — a company that provides online business optimization software — starting with web analytics.  One of the integration possibilities is to help companies track the activity inside their PDF documents — including forms.  What pages did they view? Did they print? save? add annotations? sign?  In the case of a form: what fields did they fill in? What buttons did they click?  How far into the form did they get before they abandoned their session?  Today we’ll work through a sample of adding tracking code to a PDF form.

Tracking code

When you want to track activity on a web site, the Omniture tools offer assistance for instrumenting your HTML pages.  You supply your tokens and the tools return the appropriate script to embed in your source.  Similarly, we can generate code to add to ActionScript, Java and other environments.  In this blog entry I’ll show you how to do the same for your PDF.

A word about privacy

Tracking activity is a sensitive business.  End-users have the right to know that their actions are being tracked.  They also have the right to opt out of tracking.  Adobe Reader has a security policy that protects users.  In practise, what this means is that when a PDF is hosted in the browser, the document may post data as long as it adheres to the cross-domain restrictions.  When a PDF is open stand-alone, it can perform HTTP operations only if there is a level of trust.  Two ways to establish trust are to certify the PDF, or to have the user explicitly allow HTTP access via the "phone-home" dialog.

Today’s sample limits the tracking experience to PDF forms that are open in the browser.  Tracking a PDF in standalone Reader isn’t really recommended, because the phone-home dialog is too ugly:

[Image: the phone-home warning dialog]

Data Insertion API

The API used by HTML JavaScript to do tracking is based on issuing an HTTP GET request for an image resource.  However, there are other APIs.

Omniture exposes a Data Insertion API where you can HTTP POST simple XML fragments to the server.  Once you’re logged in with a developer account, you can find this API described at:

https://developer.omniture.com/documentation/datainsert/understanding

The XML grammar used is fairly simple.  The sample form constructs XML ‘pulse’ transactions that look like:

<request>
   <prop1>Acrobat9.3:WIN</prop1>
   <language>en_CA</language>
   <visitorID>27585603</visitorID>
   <pageURL>51cb51b4-535d-49a4-b6bd-1a975cc94f69</pageURL>
   <pageName>firstname:changed</pageName>
   <channel>PDF Form</channel>
   <reportSuiteID>FormTracker</reportSuiteID>
</request>

Of course, you can format this data any way you like — as long as the reportSuiteID and the URL that you post to are correct.
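
As a rough illustration of how such a fragment might be assembled in JavaScript, here is a minimal sketch.  The helper names escapeXml and buildPulse are hypothetical (they are not taken from the sample form), and the field values are simply the ones from the fragment above.

```javascript
// Hypothetical helper: escape characters that are significant in XML text.
function escapeXml(vValue) {
    return String(vValue)
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;");
}

// Hypothetical helper: turn a plain object into a <request> fragment,
// with one child element per property of the object.
function buildPulse(oFields) {
    var aLines = ["<request>"];
    for (var sName in oFields) {
        if (Object.prototype.hasOwnProperty.call(oFields, sName)) {
            aLines.push("   <" + sName + ">" +
                        escapeXml(oFields[sName]) +
                        "</" + sName + ">");
        }
    }
    aLines.push("</request>");
    return aLines.join("\n");
}

// Rebuild the pulse shown above.
var sPulse = buildPulse({
    prop1: "Acrobat9.3:WIN",
    language: "en_CA",
    visitorID: "27585603",
    pageURL: "51cb51b4-535d-49a4-b6bd-1a975cc94f69",
    pageName: "firstname:changed",
    channel: "PDF Form",
    reportSuiteID: "FormTracker"
});
```

Escaping the values matters because field names and user-supplied text could contain characters such as & or <, which would otherwise produce malformed XML.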

A few notes about the various fields we populated:

prop1

The API allows us to include up to 50 user-defined properties: <prop1> to <prop50>.  In the sample, I’ve included some information about the version of Reader/Acrobat and the platform.  I originally wanted to put this information under <userAgent>, but that value is applicable only to browsers.

visitorID

When tracking from a web page, the way to identify a visitor is with an IP address or with a script-generated id stored in a cookie.  However, inside the Acrobat object model there is no equivalent property to uniquely identify a visitor.  Ideally we’d be able to establish a constant visitorID between the user’s session in the browser and their session in Adobe Reader.  There’s some more discussion about establishing a unique visitorID below in "The Deep End".

pageURL

We need something to identify the PDF.  Using the PDF name is not reliable, since PDFs are easily renamed.  The sample below uses the xfa.uuid property.  This value remains constant even if the form is renamed.  For non-XFA PDFs we could use the doc.docID[0] property.
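
The fallback just described could be wrapped in a small helper.  This is only a sketch: getDocumentId is a hypothetical name, and the objects are passed in as arguments so that the logic does not assume the xfa object exists in a non-XFA document.

```javascript
// Hypothetical helper: pick a stable identifier for the PDF.
// oXfa is the xfa root object (or null for a non-XFA PDF);
// oDoc is the Acrobat Doc object.
function getDocumentId(oXfa, oDoc) {
    if (oXfa && oXfa.uuid) {
        return oXfa.uuid;          // XFA form: constant across renames
    }
    if (oDoc && oDoc.docID && oDoc.docID.length > 0) {
        return oDoc.docID[0];      // non-XFA PDF: first element of docID
    }
    return "";                     // nothing usable was found
}
```

Inside a form, a call might look like getDocumentId(typeof xfa !== "undefined" ? xfa : null, event.target).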

pageName

The form uses pageName to encode the action that has taken place.  I adopted a scheme where the string is a combination of "field name : event : additional information"
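
A minimal sketch of that encoding follows.  The function name makePageName is hypothetical; the ":" separator without surrounding spaces is taken from the sample XML above ("firstname:changed"), and treating the additional information as optional is my assumption.  The second call uses made-up values purely for illustration.

```javascript
// Hypothetical helper: join the parts of a pageName with ":".
// The additional-information part is optional and omitted when empty.
function makePageName(sField, sEvent, sInfo) {
    var aParts = [sField, sEvent];
    if (sInfo !== undefined && sInfo !== null && sInfo !== "") {
        aParts.push(String(sInfo));
    }
    return aParts.join(":");
}

var sChanged = makePageName("firstname", "changed");
var sInvalid = makePageName("email", "validation", "failed");
```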

channel

A way to categorize groups of transactions for better reporting.

The Sample Form

Unfortunately I couldn’t include a fully functioning sample form.  I have an Omniture sandbox set up for my own testing, but would rather not expose it to the world.  The visitor namespace used in the example is fictitious.  Instead of posting data, the sample populates a field with the XML that would otherwise have been sent to the server.  To see the sample work, follow the link above and open it in the browser, or download it and open it in Designer ES2 preview mode.

Detecting a browser

As stated earlier, the sample form will track user activity only when hosted in the browser.  To detect when we are in the browser we look at the document path from the AcroForm object model: event.target.path.  If the path begins with a protocol scheme (e.g. http:) then we know we are hosted in the browser.  (As an aside, Designer uses the browser plugin mechanism for hosting Acrobat/Reader when in preview mode, so when you test the sample form in Designer preview it behaves as if it were loaded in the browser.  This also explains why the form itself doesn’t close when you close your Designer preview; it stays open until the next preview.  We get the browser behaviour where the document is kept open for a while in case the user navigates back to the page hosting that PDF.)
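
The check on the document path can be sketched as a small function.  The name isHostedInBrowser is hypothetical, and accepting any URL scheme (rather than only http: and https:) is an assumption on my part.

```javascript
// Hypothetical helper: true when the document path starts with a URL
// scheme such as "http:" or "https:".  The scheme must be at least two
// characters long so that a Windows drive letter ("C:") does not match.
function isHostedInBrowser(sPath) {
    return /^[A-Za-z][A-Za-z0-9+.\-]+:/.test(String(sPath));
}
```

In the form, the call would be isHostedInBrowser(event.target.path), and the tracking code would simply do nothing when it returns false.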

Designer Macro

When you look at the sample form you’ll see that I’ve injected lots of script to gather and emit pulse data:

  • A hidden Tracker subform that contains a script object, and several other events
  • enter and exit events on every field in order to track when field values change

Manually adding script for tracking would get very tedious.  To make it easier, I wrote a Designer macro that will instrument my form for tracking.  The macro dialog looks like:

[Image: the macro dialog]

Once you select the options you want, the macro injects the required script.  If you want to remove the tracking code from your form, de-select all the tracking options and press "Ok".

Here is a zip file with the macro JavaScript, SWF, and MXML.

HTTP Post

Posting from an XFA form is pretty straightforward, given that FormCalc includes a built-in post() function.  However, posting from a non-XFA form is not so easy.  I tried a number of options:

doc.submitForm() — While this uses HTTP post, it also displays the server response.  In this case the Omniture server returns: <status>SUCCESS</status>.

Net.HTTP.request() — cannot be called from within a document. This function is available only in folder-level JavaScript.

Net.SOAP.request() — The documentation makes it look like it could be dumbed down to do a raw post, but in practise this is not the case.

The method I eventually cobbled together was to embed an XFA-based PDF as an attachment to the document I wanted to track.  When the document wanted to initiate tracking, it opened the attachment in the background and called into the tracking functions defined there.

The Deep End

There are several interesting things about the markup injected into the form:

HTTP Post

The call to post data is made using the FormCalc post() function.  In order to call post(), I added a "full" event to the tracker subform.  We use xfa.event properties to hold the parameters to post() and invoke it with a call to Tracker.execEvent("full").  This technique is described at: Calling FormCalc From JavaScript.

Multiple events

You might think that adding an enter and exit event to every field object would be a problem if the form happened to have its own enter and exit events.  However, the XFA spec allows fields to have multiple events with the same activity, so there’s no problem having two enter events; they’ll both fire.  Be aware, though, that Designer will show you only one enter event.

Protos

To keep the markup as terse as possible, I made use of protos when injecting script.  The tracking subform contained the source code for the enter and exit events:

<proto>
  <event activity="enter" name="Track_enter" id="Track_enter">
    <script contentType="application/x-javascript">
      Tracker.Track.FieldEnterExit();
    </script>
  </event>
  <event activity="exit" name="Track_exit" id="Track_exit">
    <script contentType="application/x-javascript">
      Tracker.Track.FieldEnterExit();
    </script>
  </event>
</proto>

Then when adding these events to field objects, the syntax is very terse:

<field>
   <event use="#Track_enter"/>
   <event use="#Track_exit"/>
</field>
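
The body of Tracker.Track.FieldEnterExit() isn’t shown here.  As a guess at the general idea only, a handler shared by both events could remember the value on enter and compare it on exit.  Everything in this sketch (the function name, its parameters, and the oEnterValues cache) is hypothetical, and it takes plain arguments rather than reading xfa.event so that it stands alone.

```javascript
// Values captured when each field was entered, keyed by field name.
var oEnterValues = {};

// Hypothetical shared handler for the enter and exit events.
// Returns a pageName string when the value changed, otherwise null.
function fieldEnterExit(sActivity, sFieldName, vValue) {
    if (sActivity === "enter") {
        oEnterValues[sFieldName] = vValue;
        return null;
    }
    // On exit, compare against the value seen on enter.
    var bChanged = oEnterValues.hasOwnProperty(sFieldName) &&
                   oEnterValues[sFieldName] !== vValue;
    delete oEnterValues[sFieldName];
    return bChanged ? sFieldName + ":changed" : null;
}
```

A non-null return value is what would be sent as the pageName of a pulse.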

Propagating Events

Instead of adding enter and exit events to every field, I could have used a single propagating enter/exit event for all fields.  But since propagating events are available only since 9.1, I chose to add individual events so that the form would work in older releases of Acrobat/Reader.

Tracking validation errors is a different matter.  In this case there is no easy workaround for older versions of Reader, unless you’ve implemented some kind of validation framework.  In order to track validation failures the form uses the validation state change event.  Any time it fires, the form posts to the Omniture tracking server.  Note that the state change event also uses syntax not exposed by Designer:

<event activity="validationState" ref="$form"
       name="event__validationState" listen="refAndDescendents">
   <script contentType="application/x-javascript">
   …
   </script>
</event>

Notice the attribute ref="$form".  Designer doesn’t expose the ref attribute, so it would normally default to "$", the current node.  In our example we’re able to house this logic inside the Tracker subform, but have it monitor validation activity in the rest of the form by pointing it at the root form model.

Ideally the Designer macro would be able to query the target version and then would control whether logic to track validation failures is feasible.

Unique Visitor ID

There is one way to create a persistent id using the Acrobat object model — by way of the global object.  I won’t bore you with all the details about how the global object works, but I will show you how I used it to create a persistent id:

/**
* Effective reporting needs a persistent visitor id — across
* all PDF documents.
* @return a persistent visitor id
*/
function getVisitorID() {
    var sVisitorID = "";
    // We use the global object to store/retrieve a visitor id.
    for (var sVariable in global) {
        // The global object security policy doesn’t
        // allow us to examine the contents
        // of all global variables, but it does allow us
        // to enumerate them.
        // We’re looking for a variable named:
        // _OmnitureTracking_*
        // The trailing digits will be our visitor id.
        if (sVariable.indexOf("_OmnitureTracking_") === 0) {
            sVisitorID = sVariable;
            break;
        }
    }
    if (sVisitorID !== "") {
        // Strip off the prefix
        sVisitorID = sVisitorID.replace(/^_\S*_/, "");

    } else {
        // Create a new visitor id (kept as a string, like the stored one)
        sVisitorID = String(Math.ceil(Math.random() * 100000000));
        var sVisitorVar = "_OmnitureTracking_" + sVisitorID;
        // Add this visitorID as a global, and make it persist
        // so that it will be available next time in as well.
        global[sVisitorVar] = "x";
        global.setPersistent(sVisitorVar, true);
    }
    return sVisitorID;
}

In the scenario where the PDF is being tracked in the context of a web site, we might consider embedding the user’s web site visitorID into the form data.  Then for the PDF tracking we’d concatenate the two values.
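
A minimal sketch of that concatenation, assuming the web site visitorID has already been read from the form data.  The function name combineVisitorIds and the "-" separator are assumptions, not something the sample form does.

```javascript
// Hypothetical helper: combine the web site visitor id with the
// id generated inside the PDF.  The "-" separator is an assumption.
function combineVisitorIds(sWebVisitorID, sPdfVisitorID) {
    if (!sWebVisitorID) {
        return String(sPdfVisitorID);   // no web id embedded in the form
    }
    return String(sWebVisitorID) + "-" + String(sPdfVisitorID);
}
```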

 

7 Responses to Track PDF Forms with Omniture

  1. Jeff K says:

    Hi John, Cool example! So if I am using Designer 8.2, does a newer version exist or is it just in Beta testing right now? Will the update be free? Thanks, Jeff in Columbus

  2. Jeff: Designer gets released both in Acrobat and in LiveCycle. The latest version that I used for this sample was the Designer that shipped with LiveCycle ES2 (but note that the macro capability in Designer ES2 is not a formally supported feature). I don’t know the answer on the upgrade cost. John

  3. Ian C, UK says:

    Hi – if you are using Acrobat 9 Pro or Pro Extended, you’ll be on LCD ES (v8.2.1……). You can upgrade to LCD ES2 (v9.0.0. …even more redundant digits…) for a very reasonable price of around USD/GBP 30 – just visit the Adobe Store, order and you receive a disk via mail. HTH

  4. Dave says:

    Hi John,

    The use of proto’s has been great in simplifying my forms; I have used them to implement field focus highlighting and to ensure all my fields have the same borders, fonts, etc. The problem I have with them is when trying to create fragments, this means the proto the controls in the fragment refer to has to be resolved in the source form, the one including the fragment. My goal is to have a fragment that can be styled by the document that is using it, all my forms have very similar parts but always different branding. This seems to be possible from my reading of the “XML Forms Architecture (XFA) Specification Version 3.0” on page 221, but I have had no success in implementing it. Hopefully I am just doing it wrong or is this just my wishful thinking when reading this document. Maybe proto’s could be added to your list of blog topics?

    Thanks Dave

    • Dave:
      I’m glad you’re poking at this. protos are a very powerful construct. I support your idea to use them to style your content. I’ll put it on my list.

      John

  5. Dave says:

    Hi John,

    Thanks for your support, we are finding more and more ways of using protos. But we have noticed that Designer 10 has changed the way protos are handled, it seems that any events that are defined on a proto are now generated where the proto is used … as if it was an override.

    It has also been pointed out to us that the Designer 9.0 help refers to protos as deprecated.

    So we aren’t sure if this is a bug with Designer 10 or not. Would you expect Designer 10 to generate overrides for proto properties?

    Dave

    • Dave:

      Historically Designer doesn’t support building documents with proto relationships. However, protos are well-supported by the runtime. I’m unaware of any changes there.
      The one big change in the ADEP 10 designer is that the new stylesheet capability is based on protos. This may have caused other Designer behavior changes wrt protos.
      As for whether Designer will generate overrides on protos or not — I’m not sure what the current behavior is. But as I said, Designer does not support arbitrary authoring based on protos. The best we can hope for is that it respects established proto relationships.
      It’s entirely possible that you might have to clean up some overrides after the fact, perhaps with a macro.
      good luck
      John