Shared Data in Packages Part 1

In a previous post I warned against using doc.disclosed = true; there’s just too much risk from some rogue form that you might happen to open.  Still, there are some very compelling applications we could develop if we could selectively disclose a document to another trusted PDF.  The application I have in mind is one where we share access to the various documents that are open in a package.

I want to be able to propagate shared data between a package document and its attached forms.  When multiple forms capture the same data (e.g. name, address) you’d like to be able to capture it once and have the values automatically shared with each form in the package.

The shared data problem has two parts:

  1. Establishing a trusted connection between a package document and its
    embedded documents
  2. Propagating field values between documents

The first problem was very difficult.  It’s the topic of today’s post.
There will be a part 2 where I provide a solution for the data sharing
mechanics.

Today? We’re pretty much in the deep end.

Opening an embedded document

A package could establish communications with its embedded documents by opening each of them with doc.openDataObject().  But this solution doesn’t scale.  Your package could have dozens of embedded forms, and we can’t assume that opening all of them will work.  Eventually we’ll run into performance problems.

What we want is to get a handle to the embedded documents that the user has launched.  And there is nothing in the Acrobat object model that will tell you which of your child documents are open.

Step 1.  Modify the app object

We need a technique where the host/package document can expose an API that selectively discloses itself, just to its children.  The way to expose an API that all open documents can share is to modify the app object.  For example, if I write this code in one document:

app.myfunc = function() {return "hello world";};

then any other open document can call app.myfunc();

Step 2.  Add a disclosedPackages object

The code outline below shows how to add a disclosure API to the app object:

// At docReady, the host document/package adds a package
// disclosure function/object to the Acrobat app object
var fDiscloseObject = new function() {
    // private list of disclosed package documents
    var disclosedPackagesList = [];
    this.disclose = function(packageDoc) { /* ... */ };
    this.undisclose = function(packageDoc) { /* ... */ };
    this.findPackage = function(embeddedDoc) { /* ... */ };
};
// Install the API only if no other document has already done so
if (typeof app.disclosedPackages === "undefined") {
    app.disclosedPackages = fDiscloseObject;
}

After executing this code, any PDF can call:

app.disclosedPackages.disclose(packageDoc); 
app.disclosedPackages.undisclose(packageDoc); 
app.disclosedPackages.findPackage(embeddedDoc);

The specific usage:

On docReady, a package will disclose itself to its attachments by calling: app.disclosedPackages.disclose(event.target);

On docReady an attached document will try to locate its package document by calling:

app.disclosedPackages.findPackage(event.target);

Once it has a handle to its package document, it can call synchronize methods found in the package.

At docClose the package will call: app.disclosedPackages.undisclose(packageDoc);
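
Here is a minimal sketch of how the three methods might be filled in and called over a document’s lifetime. It is an illustration under stated assumptions, not the actual implementation: mockApp stands in for the Acrobat app object so the code runs outside Acrobat, and makeDisclosureRegistry and the isChildOf callback are hypothetical names. The toy validator in the usage lines simply trusts a parent property; it is only there to exercise the API and is not a real check.

```javascript
// Stand-in for Acrobat's global app object (assumption, so the sketch runs anywhere).
var mockApp = {};

// Hypothetical factory. isChildOf(packageDoc, candidateDoc) is an assumed
// validation hook that decides whether a candidate is really a child.
function makeDisclosureRegistry(isChildOf) {
    return new function () {
        // private list of disclosed package documents
        var disclosedPackagesList = [];

        this.disclose = function (packageDoc) {
            if (disclosedPackagesList.indexOf(packageDoc) === -1) {
                disclosedPackagesList.push(packageDoc);
            }
        };
        this.undisclose = function (packageDoc) {
            var i = disclosedPackagesList.indexOf(packageDoc);
            if (i !== -1) { disclosedPackagesList.splice(i, 1); }
        };
        this.findPackage = function (embeddedDoc) {
            for (var i = 0; i < disclosedPackagesList.length; i++) {
                if (isChildOf(disclosedPackagesList[i], embeddedDoc)) {
                    return disclosedPackagesList[i];
                }
            }
            return null; // no disclosed package accepted this candidate
        };
    };
}

// Toy validator for the demo only: the child names its parent explicitly.
var registry = makeDisclosureRegistry(function (pkg, child) {
    return child.parent === pkg;
});
if (typeof mockApp.disclosedPackages === "undefined") {
    mockApp.disclosedPackages = registry;
}

var packageDoc = { name: "package.pdf" };
var childDoc = { name: "form1.pdf", parent: packageDoc };
var strangerDoc = { name: "other.pdf", parent: null };

mockApp.disclosedPackages.disclose(packageDoc);                    // package docReady
var found = mockApp.disclosedPackages.findPackage(childDoc);       // child docReady
var notFound = mockApp.disclosedPackages.findPackage(strangerDoc); // unrelated PDF
mockApp.disclosedPackages.undisclose(packageDoc);                  // package docClose
var afterClose = mockApp.disclosedPackages.findPackage(childDoc);
```

Because the list lives in a closure, other documents can only reach it through the three methods.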

Step 3. Validate an embedded child

The app.disclosedPackages object maintains a private list of disclosed package documents.  We need to selectively disclose the package to any of the children that call app.disclosedPackages.findPackage(embeddedDoc);

The problem is: How do we determine whether the candidate document is
actually a child?

There are a couple of tests we apply:

1) Check whether the candidate path is consistent with an attachment.  The doc.path property of an embedded PDF is constructed to look like:

|<packagedocpath>|U:<byte-order-mark><childfilename>

Loop through the parent’s embedded objects (doc.dataObjects) and look for any that have the same path as the candidate object.

Once we’re satisfied that the paths match, we perform test 2.

2) The package opens the embedded object that has a path matching the candidate. The package now has two doc objects and needs to confirm that they are the same.  The technique we use is to modify one and see if the modification shows up in the other.
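
The two tests can be sketched roughly as follows. This is an illustration only, with several assumptions: the byte-order mark is taken to be the single character U+FEFF, the child file name is compared against the path property of each entry in the package’s dataObjects list, and read/write are hypothetical accessors for whatever piece of document state you choose to probe. The documents here are plain mock objects so the code runs outside Acrobat.

```javascript
var BOM = "\uFEFF"; // assumption: the byte-order mark shows up as U+FEFF

// Test 1: does the candidate path look like
//   |<packagedocpath>|U:<byte-order-mark><childfilename>
// and does the child name match one of the package's embedded objects?
function findEmbeddedByPath(packagePath, dataObjects, candidatePath) {
    var prefix = "|" + packagePath + "|U:" + BOM;
    if (candidatePath.indexOf(prefix) !== 0) { return null; }
    var childName = candidatePath.substring(prefix.length);
    for (var i = 0; i < dataObjects.length; i++) {
        if (dataObjects[i].path === childName) { return dataObjects[i]; }
    }
    return null;
}

// Test 2: write a one-off token through one handle and look for it through
// the other. The original value is restored afterwards.
function isSameDocument(docA, docB, read, write) {
    var original = read(docA);
    var token = "probe-" + new Date().getTime() + "-" + Math.random();
    write(docA, token);
    var same = (read(docB) === token);
    write(docA, original);
    return same;
}

// Mock data for test 1.
var packagePath = "/C/forms/package.pdf";
var dataObjects = [
    { name: "form1", path: "form1.pdf" },
    { name: "form2", path: "form2.pdf" }
];
var match = findEmbeddedByPath(packagePath, dataObjects,
    "|/C/forms/package.pdf|U:" + BOM + "form1.pdf");
var noMatch = findEmbeddedByPath(packagePath, dataObjects, "/C/forms/form1.pdf");

// Mock handles for test 2: two handles share one state, a third does not.
var stateOne = { marker: "" };
var stateTwo = { marker: "" };
var handleA = { state: stateOne };
var handleB = { state: stateOne };
var handleC = { state: stateTwo };
function read(h) { return h.state.marker; }
function write(h, v) { h.state.marker = v; }

var sameDoc = isSameDocument(handleA, handleB, read, write);
var differentDoc = isSameDocument(handleA, handleC, read, write);
```

A path match alone is not proof, since a path is just a string the candidate reports. That is why the second test probes the document itself.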

Step 4. Spoof-proof the code

Unfortunately we live in a world where we have to take extra precautions to protect ourselves from hackers.  After all, that’s why we don’t use doc.disclosed in the first place.  In step 2 we added a JavaScript API accessible to every PDF open on our system.  How do we ensure that this API cannot be replaced by malicious JavaScript? After all, if we can modify the app object, another document can overwrite our code with its own script.

Check your source

var fDisclose = new function() {
    // private list of disclosed package documents
    var disclosedPackagesList = [];
    this.disclose = function(packageDoc) { /* ... */ };
    this.undisclose = function(packageDoc) { /* ... */ };
    this.findPackage = function(embeddedDoc) { /* ... */ };
};
if (typeof app.disclosedPackages === "undefined") {
    app.disclosedPackages = fDisclose;
}
// Make sure the disclosedPackages object has not been
// replaced/spoofed

if (app.disclosedPackages.toSource() === fDisclose.toSource()) {
    app.disclosedPackages.disclose(event.target);
}

Note that we disclose ourselves only after making sure the source is the same as the original.

The clever JavaScript coder will point out that the toSource() method can be overridden. Not so.  The Acrobat JavaScript implementation does not allow overriding the toSource() and toString() methods of objects.
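
If you want to experiment with the same idea outside Acrobat, note that toSource() is not part of standard JavaScript. The sketch below swaps in a different technique: it compares the source text of each method using Function.prototype.toString, called directly so that an overridden toString on the function itself is ignored. hasSameSource and buildDisclose are hypothetical names, and this says nothing about how Acrobat itself behaves.

```javascript
var fnToString = Function.prototype.toString;

// Returns true when every member of "expected" exists on "installed" with the
// same type, every method has identical source text, and "installed" carries
// no extra members.
function hasSameSource(installed, expected) {
    if (!installed) { return false; }
    var name;
    for (name in expected) {
        if (typeof installed[name] !== typeof expected[name]) { return false; }
        if (typeof expected[name] === "function" &&
            fnToString.call(installed[name]) !== fnToString.call(expected[name])) {
            return false;
        }
    }
    for (name in installed) {
        if (!(name in expected)) { return false; }
    }
    return true;
}

// The code each well-behaved PDF would run (bodies are placeholders).
function buildDisclose() {
    return new function () {
        var disclosedPackagesList = [];
        this.disclose = function (packageDoc) { disclosedPackagesList.push(packageDoc); };
    };
}

var mine = buildDisclose();
var installedByPeer = buildDisclose(); // another PDF ran identical code first
var spoof = { disclose: function (packageDoc) { return packageDoc; } };

var peerOk = hasSameSource(installedByPeer, mine);
var spoofOk = hasSameSource(spoof, mine);
```

The comparison only establishes that the installed methods were written with the same text as ours, which is the same property the toSource() check relies on.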

Naming Conflicts

Something to be careful about is managing different versions of this code.  If there are two variations of this code in different PDFs and both use the same name (app.disclosedPackages), one of them will fail.  So if you write your own version of this, use a unique name.  Better yet, use a unique name that incorporates a version number so you can manage the code over time.
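
One way to apply that advice is sketched below. The names are made up for illustration: mockApp stands in for the Acrobat app object, registerVersioned is a hypothetical helper, and "acmeDisclosedPackages" is just an example of a vendor-specific prefix.

```javascript
// Stand-in for Acrobat's global app object (assumption).
var mockApp = {};

// Install an API under a name that carries its version number, unless that
// exact version is already present. Returns whatever ends up installed.
function registerVersioned(appObject, baseName, version, api) {
    var key = baseName + "_v" + version;
    if (typeof appObject[key] === "undefined") {
        appObject[key] = api;
    }
    return appObject[key];
}

var v1 = registerVersioned(mockApp, "acmeDisclosedPackages", 1, { version: 1 });
var v2 = registerVersioned(mockApp, "acmeDisclosedPackages", 2, { version: 2 });
// A second PDF carrying version 1 reuses the copy that is already installed.
var v1Again = registerVersioned(mockApp, "acmeDisclosedPackages", 1, { version: 1 });
```

With this scheme, PDFs carrying different versions of the code each find the API they were written against, and neither overwrites the other.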

What? No Sample?

Next post.  I promise.

4 Responses to Shared Data in Packages Part 1

  1. Umesh says:

    Hello,

    Please help me. I am new to Adobe JavaScript; please tell me where I can get good resources.

    I want to set the Text box property value through javascript.

    Thanks in advance!!!
    Umesh

    • John Brinkman says:

      Umesh:
      If you’re new in JavaScript, this packaging sample is probably not a good place to start. It’s pretty complex. But if you’re interested in learning more, the best way forward is to:
      a) become a good JavaScript programmer — lots of resources on the web to help
      b) learn the JavaScript object model exposed by reader/acrobat — Designer help should be a good resource.

      good luck!

      John

  2. Rob says:

    Hi John – genius stuff you have on your blog, and you have already helped with table sorting. I have this situation: my customer uses a case management system. That system holds data. My PDFs (Acroform and XFA) are used in their workflow, and they store these PDFs locally (within their server domain). When they load one of these PDFs, they want to extract a field list – manually match that to their database fields – then pass that matched data back into the PDF, and pass the remerged PDF back to the user. The user has only Reader – all forms have Reader Extensions applied, with data input and output enabled. What can I tell my customers? Is there any widget or app that does this interface easily? Thanks in advance

    • John Brinkman says:

      Rob:
      Not sure I get the scenario exactly… Sounds like the operations include:
      – extract a field list from the form
      – match fields to database data
      – repopulate the PDF with database data
      And the only tool available is Reader…
      Offhand, it seems like the best way to do this would be with a web service. Of course, to make that work you’d have to modify every form. But if Reader is your only tool, then that seems inevitable.
      good luck.

      John