Converting Web Pages with Flash Content to PDF in Acrobat 9: Demystified

With the release of Acrobat 9, we completely revamped the way that we convert HTML files to PDF (WebCapture) and the results are better than ever. We even embed SWF files and then play them in place on the PDF page.

However, there will be cases where the SWF doesn’t play as you would expect it to – if at all. This is actually an unfortunate side effect of authoring the SWF to make the browser experience faster.

When Acrobat 9 converts these HTML pages to a PDF file, it can only embed the SWF that is referenced in the HTML <object> or <embed> tag. So, assuming all of the resources that are required to play the SWF properly are embedded in that one file, everything will work properly when embedded in the PDF file.
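To make that limitation concrete, here is a minimal Python sketch of what a capture tool can see by reading the HTML alone. It is not Acrobat's actual implementation; the class name, function name and sample markup are all hypothetical. It only collects SWF URLs from the `data` attribute of `<object>`, the `movie` `<param>`, and the `src` attribute of `<embed>`.

```python
from html.parser import HTMLParser


class SwfReferenceFinder(HTMLParser):
    """Collect SWF URLs referenced by <object>, <param> and <embed> tags."""

    def __init__(self):
        super().__init__()
        self.swf_urls = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        candidate = None
        if tag == "object":
            candidate = attr.get("data")
        elif tag == "param" and (attr.get("name") or "").lower() == "movie":
            candidate = attr.get("value")
        elif tag == "embed":
            candidate = attr.get("src")
        # Ignore any query string when checking the file extension.
        if candidate and candidate.split("?")[0].lower().endswith(".swf"):
            if candidate not in self.swf_urls:
                self.swf_urls.append(candidate)


def find_swf_references(html):
    """Return the SWF URLs that are visible in the HTML markup itself."""
    finder = SwfReferenceFinder()
    finder.feed(html)
    return finder.swf_urls


# Hypothetical page: one shell SWF referenced twice (param and embed).
SAMPLE_HTML = """
<object width="550" height="400">
  <param name="movie" value="shell.swf?lang=en">
  <embed src="shell.swf?lang=en" width="550" height="400">
</object>
"""

print(find_swf_references(SAMPLE_HTML))  # ['shell.swf?lang=en']
```

The scan finds the one SWF named in the markup and nothing else. Anything that SWF loads once it is running never appears in the HTML, so a tool working from the page source has no way to know about it.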

Here’s an example of the Adobe.com Home Page with a fully functional embedded SWF. To play the SWF, you’ll need to trust the domain (if you opened the PDF in the browser) or the file (if you downloaded it), and then click the blank-looking area, which is where the paused SWF is.

Unfortunately, more often than not, SWFs with high resolution graphics, video, fonts, database connections and other resources are authored as components that load dynamically when needed. This makes the initial load time in the browser much faster than if all of the resources were embedded into a single, monolithic SWF. In many cases, the SWF that is referenced in the HTML <object> tag is pretty much an empty shell that loads its resources as needed; different language strings and streaming videos, for example. Typically, the resources are addressed via URLs that are relative to the top level SWF, or FlashVars are used to direct the SWF to its resources.

There’s no way, short of decompiling the SWF, for Acrobat to detect what additional resources are required by the playing SWF. When you save that PDF file to your desktop, the relative URLs that the SWF needs to load its assets from are now meaningless. There’s also the possibility that the server that is serving up the resources has same-domain policy rules in place that allow only SWFs running inside its domain to access resources. When a SWF is embedded in a PDF file and the file is running on your desktop, it is no longer in the same domain as the resources and the server requests are denied.
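Both failure modes can be illustrated with a short Python sketch using the standard `urllib.parse` module. The URLs and file paths are made up for illustration, and `same_origin` is a deliberate simplification (scheme and host comparison only), not the actual Flash Player or server policy logic.

```python
from urllib.parse import urljoin, urlparse


def same_origin(url_a, url_b):
    """Simplified check: two URLs share an origin if scheme and host match."""
    a, b = urlparse(url_a), urlparse(url_b)
    return (a.scheme, a.netloc) == (b.scheme, b.netloc)


# Hypothetical locations: where the shell SWF lives on the web,
# and where the captured PDF ends up on the user's machine.
WEB_BASE = "http://www.example.com/store/shell.swf"
LOCAL_BASE = "file:///Users/me/Desktop/store.pdf"

# A resource the shell SWF asks for with a relative URL.
RESOURCE = "assets/strings_en.xml"

on_web = urljoin(WEB_BASE, RESOURCE)
on_desktop = urljoin(LOCAL_BASE, RESOURCE)

print(on_web)      # http://www.example.com/store/assets/strings_en.xml
print(on_desktop)  # resolves to a local path where the asset does not exist

# On the web, the SWF and its resource share an origin; from the desktop
# they do not, so a same-domain rule on the server would reject the request.
print(same_origin(WEB_BASE, on_web))
print(same_origin(LOCAL_BASE, on_web))
```

The same relative string points at the server in one case and at a folder on the desktop in the other, which is why the embedded SWF comes up empty.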

Here’s an example of the Adobe Store Home Page, which is basically just a SWF in a small HTML wrapper. The site is completely data driven and won’t work properly when embedded in a PDF file, for all of the reasons that I stated above.

Needless to say, if you’re trying to WebCapture a site that’s not set up to be PDF friendly and you don’t know the web designer who put it together, there’s nothing you can do to make it work. However, below are a few tips that web designers can keep in mind if they think users of their sites may want to convert their HTML pages to PDF files.

Creating PDF friendly SWFs:
If you want your SWF files to convert to PDF properly, there are a few things that you can do. The easiest is to author the SWF so that it contains all of the resources it needs as embedded assets. This works pretty well for small SWFs that load into the browser quickly, since you probably want a good web experience first and conversion to PDF is a secondary consideration. It isn’t a good solution for large SWFs or SWFs that stream content.

Alternatively, you could author the SWF so that resources are referenced by fully qualified URLs. This way, when the top level SWF that’s embedded in the PDF file starts up, it can find the additional resources and play them. There will be a security alert for each resource unless the user selects the "remember" check box in the security warning or the domain is trusted in the Acrobat preferences.
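If you keep a list of the resource URLs your SWF loads, a quick authoring-time check can flag the ones that are not fully qualified. The following Python sketch assumes such a list exists; the helper names and the sample URLs are hypothetical, and "fully qualified" is taken here to mean an http or https URL with a host.

```python
from urllib.parse import urlparse


def is_fully_qualified(url):
    """True if the URL has an http(s) scheme and a host name."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def relative_resources(urls):
    """Return the URLs that would break once the SWF leaves its web server."""
    return [u for u in urls if not is_fully_qualified(u)]


# Hypothetical resource list for a shell SWF.
RESOURCES = [
    "http://www.example.com/video/intro.flv",
    "assets/strings_en.xml",
    "/images/banner.jpg",
]

print(relative_resources(RESOURCES))
# ['assets/strings_en.xml', '/images/banner.jpg']
```

Anything the check reports is a candidate for rewriting as a fully qualified URL before the page is likely to be captured.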

As I mentioned, short of decompiling the SWF, there’s just no way for Acrobat to detect all of the resources that the SWF requires.

If you need to do this sort of thing, my recommendation is to go ahead and capture the HTML using WebCapture and delete the embedded SWFs. Then, if you can collect the source materials (the main SWF and all of its resources), you can add them back in their original positions using the Flash tool in Acrobat 9. When placing the SWF, use the resources tab to embed all of the resources that the SWF needs to play properly. You’ll end up with a well functioning version of the HTML page with embedded SWF content.