Big and Complex Forms

How Big and Complex can you make your form?

I get asked this question often. Customers or partners develop very complex or large dynamic forms with many pages and large amounts of script. At what point do we cross the line and reach a level of complexity where Reader/PDF is no longer the right tool for the job?

There is no easy answer. The answer will be different for different users. But it is helpful to look at some of the stress points you’ll encounter with large forms.

Note that these points apply to forms opened in Acrobat/Reader. The stress points for forms rendered on the server are quite different.

  1. Number of pages to render
  2. File Size
  3. Script size, complexity and development methodology
  4. Script performance

Number of pages to render

One of the great properties of regular PDF files is that the file open time is constant no matter how large the PDF. The time to open a two thousand page PDF is pretty much the same as for a one page PDF. This is because Reader doesn’t load the whole PDF into memory and doesn’t read the bytes for page <n> until the user navigates to page <n>.

Dynamic XFA/PDF forms offer a different value proposition. The pages are shaped at form open time by the form data. Of course, there are great advantages to dynamic forms. But there are also associated processing costs. At form open time the entire form definition is loaded into memory. The entire set of data is loaded and merged with the form template. Reader performs enough of the layout to determine how many pages will be rendered. Then when you navigate to page <n>, Reader renders that page from the in-memory structures.

How many pages can Reader handle for a dynamic document? This depends on the complexity of the template. I’ve seen five page forms that take forever to open. I’ve seen a hundred page form open in a second. The limit is more related to the density/complexity of template and data rather than the actual number of pages.

Some form authors attempt to reduce file open time by hiding inactive pages. This strategy was effective in Acrobat/Reader 7. But the form open algorithm was improved in Reader 8.1, and since then the ‘page hiding’ strategy no longer makes a significant difference.

File Size

Dynamic XFA/PDF forms tend to be smaller than static documents. This is because of the template property of forms. For example, a hundred page static PDF will have a hundred pages of PDF mark-up, whereas in the dynamic case this could be one page of XFA mark-up that gets replicated a hundred times when merged with data. The latter will be a much smaller file.

Nonetheless, dynamic documents can grow to the point where they begin to stress your system. Reading and parsing the documents happens very quickly, even for very large templates. However, the size of the template becomes more of a factor when there are security components in play. Operations such as Certification, Reader Extensions and Signatures perform comparison operations on ‘before-and-after’ versions of the form. The costs of these comparisons are proportional to the size of the template.

So while there is no absolute threshold on file size, you will find the threshold is lower for certified/extended/signed forms.
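
To make the template idea concrete, here is a minimal sketch in plain JavaScript. It is not how Reader implements the merge, and it does not use the XFA object model: the `template` object, the `merge` function and the `item`/`price` fields are all hypothetical names I made up for illustration. It only shows why the stored definition stays small while the merged, in-memory result grows with the data.

```javascript
// Hypothetical one-row template: a stand-in for a repeating subform definition.
const template = { subform: "invoiceRow", fields: ["item", "price"] };

// Merge: create one instance of the template for each data record.
function merge(template, records) {
  return records.map((record, index) => ({
    subform: template.subform,
    index,
    values: template.fields.map((field) => record[field]),
  }));
}

// Build 100 made-up data records.
const records = [];
for (let i = 0; i < 100; i++) {
  records.push({ item: "item" + i, price: i });
}

const instances = merge(template, records);
console.log(instances.length, "instances from 1 template definition");
console.log("stored template size (chars):", JSON.stringify(template).length);
console.log("merged size (chars):", JSON.stringify(instances).length);
```

The file on disk carries only the template (plus the data); the replicated structure exists only after the merge at form open time.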

Script size, complexity and development methodology

I have seen XFA/PDF files with tens of thousands of lines of JavaScript. Given that there is no debugger, you have to be pretty persistent to create this amount of script. If your big script library is well written, it may perform well enough, but the stress comes with the maintenance of the script:

  • When you change the script, do you have the ability to rigorously test your changes? When you modify fields or subforms, will your script still work? Do you have test collateral that gives you code coverage for all the edge cases in your script? Do you have some form of automated testing? QTP anyone?
  • Is your script maintainable? Or is the code ‘write-only’? Unless you have been disciplined in the creation of your library, you will have longer term maintenance issues when a new developer comes along to update an existing form.
  • When you encounter problems with your script, are you able to isolate the problem when you ask for help? Your friends in our support organization are much better at solving problems with small, simple forms than with large, complex ones. If your script is modular and isolated into components then you’ll be able to ask for help much more easily than if your script is a tangled mess.
  • When you change script, do you preserve previous versions of your form? You need the ability to roll-back changes.

Again, there are no absolutes here, but if you want/need to write lots of script, you need to have the associated discipline in your development environment to make it maintainable.

Script performance

Large amounts of script do not necessarily imply poor performance. But poorly written script of any amount can kill form performance. A script that traverses the entire form hierarchy will have a running time proportional to the number of objects in the form: as the form grows, the script slows down. There are many ‘best practices’ for writing efficient script. In particular, pay close attention to the contents of frequently executed loops.
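
As a rough illustration of the loop problem, here is a sketch in plain JavaScript. It runs outside Reader and does not touch the real XFA object model: `makeForm`, `findByName`, `sumSlow` and `sumFast` are hypothetical helpers, and the form is just a tree of plain objects. The slow version searches the whole hierarchy for the `total` node on every pass through the loop; the faster version resolves it once and reuses the reference.

```javascript
// Hypothetical stand-in for a form hierarchy: plain objects, not the XFA DOM.
function makeForm(rowCount) {
  const root = { name: "form1", children: [] };
  for (let i = 0; i < rowCount; i++) {
    root.children.push({
      name: "row",
      children: [{ name: "amount", value: i, children: [] }],
    });
  }
  root.children.push({ name: "total", value: 0, children: [] });
  return root;
}

// Depth-first search from the given node. Every node it touches is counted
// in stats.visits, so the cost of a lookup grows with the size of the form.
function findByName(node, name, stats) {
  stats.visits++;
  if (node.name === name) return node;
  for (const child of node.children) {
    const hit = findByName(child, name, stats);
    if (hit) return hit;
  }
  return null;
}

// Slow: resolves "total" from the root on every pass through the loop.
function sumSlow(form) {
  const stats = { visits: 0 };
  for (const row of form.children) {
    if (row.name !== "row") continue;
    findByName(form, "total", stats).value += row.children[0].value;
  }
  return stats.visits;
}

// Faster: resolves "total" once, then reuses the reference inside the loop.
function sumFast(form) {
  const stats = { visits: 0 };
  const total = findByName(form, "total", stats);
  for (const row of form.children) {
    if (row.name !== "row") continue;
    total.value += row.children[0].value;
  }
  return stats.visits;
}

console.log("slow visits:", sumSlow(makeForm(100)));
console.log("fast visits:", sumFast(makeForm(100)));
```

Both versions compute the same total, but the number of nodes visited by the slow one grows with the square of the row count, while the faster one grows linearly. The same idea applies to any expensive lookup inside a frequently executed loop.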

Conclusion

Before you make a big investment in a form, make sure you consider the alternatives. You might be better off with a Flash form or an AIR application. If you choose Reader/PDF, the maximum size and complexity of your form depends primarily on your own tolerances. You need to decide whether the runtime experience is responsive enough, and whether the return on investment justifies the cost to develop and maintain the form.

10 Responses to Big and Complex Forms

  1. Hi John,

    Thank you for a very interesting post.

    Over the years I have learnt to appreciate features that affect performance in our forms and tried to limit their use. While I am generally happy with the forms, there are a couple that are real resource hogs if the user ticks all of the available options in the form. Multiple repeating subforms can start off okay, but performance dives once the number of pages extends past 20.

    Your linkage of ROI to forms is very interesting. I develop forms for clients. Simple process:
    - brief;
    - develop solution;
    - client happy €€€ or client not so happy, more work and less €;
    - repeat process.

    Apart from hardware/software costs, my investment over the last four years has been in building competencies to develop forms/solutions that our clients need. At times it has been a tough road, starting from zero and trying to understand how to bend a monster (LC Designer) to our will. LC Designer is such a powerful tool that in some cases it is like using a sledgehammer to crack a nut.

    To date I have considered the ROI to be our ability to win work and deliver bespoke solutions. The trouble is that these tend to be one-off solutions and while we can apply the lessons learnt, we tend not to sell on these solutions to other clients.

    We are operating in a small market and our investment in LC Designer has led us down a single road. I don’t regret it, but we have been banging our head against the glass ceiling that is the step up to the full enterprise suite. We have made several attempts (on our own and with clients), but it requires a level of investment that cannot be justified.

    This brings me to the door that you have opened: alternatives to Acrobat/Reader. This intrigues me and terrifies me at the same time! I appreciate that these tools have been available for some time now and while I have looked at these (and bought the books!), I have tended to stay away because of the steep learning curve and the level of further investment required.

    Recently I have started going through Bruce Eckel and James Ward’s First Steps in Flex (http://www.firststepsinflex.com/), which is very helpful. For example I did not appreciate that Flex was drag and drop.

    Now some people may think that if I can’t stand the heat, I should stay out of the kitchen. But in this small market you need to be a Jack of all trades. I can’t afford to have a string of specialists for each area that we need in order to deliver to clients. I appreciate that I will never be a master of any of these tools.

    What I think would be very useful is to get a full and clear understanding of what restrictions or hurdles are in front of us, if we were to start down the route of developing forms/solutions in Flex/AIR. For example, there are several features in an XFA form that get turned on and off depending on whether the form is reader enabled or not (and how it is enabled). While you have to bear this in mind when developing a solution, it is not a problem once you know in advance. The trouble starts when you invest in a solution and then find that it can’t be used because you are missing another element or there are license restrictions.

    We have started using other tools such as Xcelsius Engage 2008 to develop solutions that allow a data connection and export to AIR. This has highlighted mistakes that we have made in the past. For example, while it is possible to implement charts in XFA forms (thanks to your post http://blogs.adobe.com/formfeed/2009/05/diy_column_chart.html), after investing in this, we are now wondering if there were better tools to achieve a more robust solution.

    So your post has come at a critical time for our business. Do we stick with the LC route we are on? Or do we branch off onto other tools that may in the long run help us deliver better solutions? I appreciate these are questions that only I can answer.

    To help us with this decision:
    - Are there license restrictions with a Flex/AIR form/solution?
    - Can a Flex/AIR solution connect directly into a database, or would it require LC Data Services?
    - Are there security issues or restrictions, such as transmitting data or accessing user data?
    - What would be the tipping point to move from LC to Flex/AIR?

    Thanks again for a very thought provoking post.

    Regards,
    Niall

  2. Dear John,

    As we did some development for forms with Adobe LiveCycle Output ES, there is one area to look into, which is the number of scripts being executed. E.g. the page x of y calculation is often done on the layout:ready event for two fields, currentPage and numberOfPages. That can be enhanced by putting the calculation on a subform and making the fields part of that. This way only one script is fired instead of two. When it comes to large documents, that has a significant impact.

    Kind regards
    Maruan

  3. Maruan:

    You must have very many pages if you’ve chased down this optimization. Of course, if you want it to run even faster you should code in FormCalc rather than JavaScript.

    Instead of:

        this.rawValue = xfa.layout.page(this);

    use:

        $ = xfa.layout.page(ref($));

    John

  4. John:

    The document is about 60000 pages. There are also some complex calculations as we are producing paper handling marks (OMR) for envelope stuffing.

    Maruan

  5. Maruan:

    Obviously you’re processing this in LiveCycle Output. Server print has a different set of considerations for size and performance. 60,000 pages is reasonable for print, but not for interactive, dynamic PDF.

    And as you point out, attention to details such as script performance is very important with a job of this size.

    John

  6. Niall:

    First of all, my apologies for the delay in getting your comment posted. Comments containing web URLs often get flagged as spam and I don’t discover them until I check for them.

    You raise an interesting stress point that I didn’t address: the scenario where forms are developed without the backing of a LiveCycle server. I certainly agree that the threshold for complexity will be lower when delivering solutions without a server to either offload some of the processing or to enhance the functionality: Reader Extensions, PaperForms barcodes.

    As for your specific questions:

    - Are there license restrictions with a Flex/AIR form/solution?

    I have to say, I am not an AIR expert (It’s on my “to-do” list :) But AFAIK, there are no client runtime-license restrictions. The AIR runtime is free and gets installed with Reader or can be installed from Adobe.com.

    - Can a Flex/AIR solution connect directly into a database, or would it require LC Data Services?

    You need to purchase a server license to use LCDS. However, in Flex/AIR you have access to http operations that would allow you to post data to a server.

    - Are there security issues or restrictions, such as transmitting data or accessing user data?

    Here I am not as sure. I know that when you use Flash inside Reader/PDF, the Flash component inherits the Reader security model. I also know that when you use Flash in a browser, it implements a cross-domain security model. Aside from that you need to speak to the experts.

    - What would be the tipping point to move from LC to Flex/AIR?

    First of all, it’s important to note that if you move to Flex/AIR, you won’t write less code. In fact, you’ll undoubtedly write more, but it will be MXML and ActionScript instead of JavaScript. However, the development tools allow managing a much larger code base in these environments. On top of that, compiled ActionScript runs much, much faster than interpreted JavaScript. And the benefits you gain are that you have far greater control over the user experience, and for data capture, your users will undoubtedly appreciate the well-crafted richness that you can provide as you move away from the paper metaphor.

    So unfortunately I can’t point you to a specific tipping point. But I’d encourage you to:
    - try a couple of simple AIR apps and find out first-hand how hard/easy they are to develop
    - monitor the blogs and forums to get a sense of how well other users are doing with AIR applications

    Good luck!

    John

  7. @Niall:

    I think when it comes to very dynamic solutions where data capture experience is key you can also combine Flex with PDF using the JavaScript bridge (needs Reader Extensions to be enabled). This way you can use the flexibility of Flex to capture the data and PDF to render the output.

    Maruan

  8. John,

    Thank you very much for the direction. I feel there is definitely something there to explore, so I am going to make time and try to get to grips with Flex. I think James Ward’s book will be a good starting point. It just seems that Flash is raising its head everywhere at the moment and there appears to be a huge amount of debate. To give credit, the Flex/AIR community are an active bunch, with a huge amount of code available online and in Tour de Flex, an example to the rest of us!

    Maruan,

    Back in v6 or v7 (I think) I started with Form Guides and then realised I could not deploy them without the server components. So we dropped that interaction between Flex and LC Designer at that stage. As the Flex/PDF combination requires LC RE, it is outside our reach. I am very interested in the Flex/AIR approach, as I think it will help us get the user experience that you refer to, but without the enterprise suite overheads.

    Thank you,
    Niall

  9. Fabio Oliveira says:

    Hi John.

    I’m designing a somewhat complex quality certificate, and there are two issues that are preventing me from finishing my form.

    One is how to put a footer only on the last page without wasting space on previous pages and without overflowing the data. If I put the footer in the master page I have the overflow problem on the last page, and if I put it on the body page I can’t make it stay at the end of the page. I’m trying to find how much blank space I have on the last page and then manipulate the height of a blank subform to force it to the end of the page.

    The other issue is that I have a “table inside table” subform and I need to repeat all data if a page break occurs in the last table. I saw your last post (Duplicating subform structures), but I could not open your PDF example and didn’t understand very well whether it would solve my case.

    I don’t know if you understood my problem, but I would be grateful for any help.

    Regards.
    Fabio

  10. Fabio:

    I have a blog post that describes how to anchor a subform at the bottom of a page. Have a look at: http://blogs.adobe.com/formfeed/2009/01/position_a_subform_at_page_bot.html

    For your second issue, it sounds like the sample from “Duplicating subform structures” might help, so I’d encourage you to make sure your copies of Reader/Acrobat and Designer are up-to-date so that you can work with the sample.

    Good luck.

    John