Batch Conversion to PDF/A

In my two previous articles on PDF for Archiving, or PDF/A (see the archives), I discussed the benefits of this archival format and how to create and validate PDF/A files.

One challenge facing government agencies, law firms and their clients is the conversion of large numbers of legacy files to PDF/A.

In this article, I’ll cover how to use the Batch Processing facility in Acrobat 8 Professional to transform common file formats found in the legal market to PDF/A-1b.

Read on to learn more…

What document types can be converted in batch to PDF/A-1b?

I’ve tested this method with the following common legal market document types.

  • Microsoft Excel
  • Microsoft PowerPoint
  • Microsoft Visio
  • Microsoft Word
  • TIFF
  • PDF image-only documents

If you need to convert a text file to PDF/A, open the file in Word and save it as a .doc file.

As noted in my previous articles, it is very difficult to convert PDF Normal documents without embedded fonts to PDF/A. 
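Before setting up a batch, it can help to check which files in a source folder actually match the tested types listed above. The following is a minimal Python sketch of such a pre-check, not part of Acrobat. The extension mapping is an assumption (the list above names formats, not extensions), and the folder name doc-source is only an example.

```python
from pathlib import Path

# Assumed extensions for the file types listed above. The article names
# formats, not extensions, so adjust this mapping to match your files.
ELIGIBLE = {
    ".doc": "Microsoft Word",
    ".xls": "Microsoft Excel",
    ".ppt": "Microsoft PowerPoint",
    ".vsd": "Microsoft Visio",
    ".tif": "TIFF",
    ".tiff": "TIFF",
    ".pdf": "PDF (image-only documents only)",
}


def classify(source_folder):
    """Split files under source_folder (including subfolders) into
    files with a tested extension and everything else."""
    eligible, other = [], []
    for path in sorted(Path(source_folder).rglob("*")):
        if not path.is_file():
            continue
        if path.suffix.lower() in ELIGIBLE:
            eligible.append(path)
        else:
            other.append(path)
    return eligible, other


if __name__ == "__main__":
    ok, needs_review = classify("doc-source")
    print(f"{len(ok)} file(s) match a tested type")
    for path in needs_review:
        print(f"Check before converting: {path}")
```

Note that the script only looks at file extensions. It cannot tell an image-only PDF from a PDF Normal document without embedded fonts, so PDF files still need a manual check.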

Converting Office Files to PDF/A-1b in Batch

Converting Office application files to PDF involves two steps:

  1. Change conversion preferences
  2. Create a Batch Sequence for the conversion

Note that you will need the corresponding authoring application installed to create PDF/A-compatible files. For example, if PowerPoint is not installed, Acrobat cannot convert PPT files to PDF/A.

As a practical matter, you should install all authoring applications first, and then install Acrobat 8 Professional. Successful PDF/A conversion requires the installation of the one-button PDFMakers supplied by Acrobat Professional.

Changing Conversion settings to PDF/A-1b

For PDF/A-1b conversions, you can change settings easily in Acrobat Preferences.

Follow these steps to change your default conversion settings to PDF/A-1b:

  1. Edit—>Preferences
  2. In the Categories list at left, click on Convert to PDF
    Conversion Preferences
  3. Select the file type(s) you need to convert from the list. E.g.
    - Microsoft Word
    - Microsoft Excel
    - Microsoft PowerPoint
  4. Click the Edit Settings button
  5. Change the conversion setting to PDF/A-1b:
    Choosing PDF/A-1b
  6. Repeat as necessary for all document types.
  7. Close the Preferences Window

Creating the Batch Sequence

Follow these steps to create a Batch Sequence in Acrobat 8 Professional:

  1. First, move the documents you wish to convert into a folder. Nested folders (subfolders) are OK.

    In this example, we’ll use a folder named doc-source which contains Word, Excel, PowerPoint, TIFF and PDF image-only files.
    Folder of Files

  2. Create a destination folder on your hard drive for the converted documents. In this example, we’ll use a folder named doc-destination.
  3. Choose Advanced—>Document Processing—>Batch Processing
  4. Click the New Sequence button
    Batch Sequence Window
  5. Give the sequence a descriptive name
    Name the sequence
  6. Change the following in the Edit Batch Sequence window:
    A   Click the Browse button and locate the doc-source folder
    B   Select the file types you wish to convert. You should deselect any file types which cannot be converted to PDF/A directly. Consult the list of eligible file types earlier in this article.
    C   Select your doc-destination folder
    D   Click Output Options and choose PDF/A as the type
    Setting up the conversion
  7. Click OK

Running the Batch Sequence

  1. Choose Advanced—>Document Processing—>Batch Processing
  2. Select the sequence you created from the list and click the Run Sequence button.
    Running the sequence window
  3. Click the OK button to start.
  4. Authoring applications will open as necessary. You may see progress bars:
    Progress bar

Batch Conversion Issues

By default, Acrobat’s batch processor will stop if it cannot convert a file to PDF/A. An onscreen warning will appear.

You will need to click OK on each warning before Acrobat will continue and process the next file.

The primary reason existing PDF files cannot be converted to PDF/A is that fonts are not embedded. Unfortunately, there is no way to add the fonts to a PDF after it has been created.

I’m hopeful that future releases of Acrobat will allow for post facto font embedding.
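After a long run with several warnings, it can be hard to remember which files were skipped. The following is a minimal Python sketch, not an Acrobat feature, that compares the two folders from the example and lists source files with no matching PDF in the destination. It assumes that each converted file keeps its base name and gets a .pdf extension (for example, brief.doc becomes brief.pdf); verify that against your own output before relying on it.

```python
from pathlib import Path


def find_unconverted(source_folder, destination_folder):
    """Return source files that have no same-named PDF in the destination.

    Assumption: a converted file keeps its base name and gets a .pdf
    extension. Names are compared case-insensitively and subfolders are
    searched on both sides.
    """
    converted = {
        p.stem.lower()
        for p in Path(destination_folder).rglob("*")
        if p.is_file() and p.suffix.lower() == ".pdf"
    }
    return [
        p
        for p in sorted(Path(source_folder).rglob("*"))
        if p.is_file() and p.stem.lower() not in converted
    ]


if __name__ == "__main__":
    for path in find_unconverted("doc-source", "doc-destination"):
        print(f"No PDF found for: {path}")
```

Because the match is by base name only, two source files that share a name (say, report.doc and report.xls) cannot be told apart, and the presence of a PDF does not prove it conforms to PDF/A. Validation is still a separate step.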

12 Responses to Batch Conversion to PDF/A

  1. Tsui says:

    This teaches how to batch convert MS Office documents (Word, Excel, PowerPoint) to PDF/A. How about batch converting single-page TIFF files to PDF/A (for example, to -1b)? Thanks.
    Rick’s Reply:
    The instructions should work the same. TIFF is like any other source file to Acrobat, and it converts quickly to PDF. You might want to add an OCR step so that the files are searchable.

  2. Punchy Sandoval says:

    This process did not work for the vast majority of my PDF files: errors and failures on nearly every document. However, I found much greater success by setting the batch process to run the Preflight command “Convert to PDF/A-1b (sRGB)” (and setting Output Options to “Don’t Save Changes” because Preflight has its own output options). The process takes much longer, but it at least had some success. There are still plenty of failures, reporting “Syntax problem: real value out of range (too high)” and always also the same error with “(too low).” I haven’t yet figured out the cause or remedy of those conversion errors.
    Rick’s Reply:
    I’ve had better success with Acrobat X in this regard. If fonts are not embedded in the original document, and you don’t have the fonts, you cannot conform to PDF/A.

  3. Tsui says:

    Your link “see the archives” for http://blogs.adobe.com/acrolaw/pdfa_pdf_for_archiving/ does not exist.

  4. Susan McClure says:

    Is there a way to preserve the “last modified date” or “Date Created” of the original file? For an Archives this is helpful metadata to maintain.
    thanks!

  5. George Dee says:

    HELP! Please. In batch converting tif files to searchable pdf, I have not figured out how to RETAIN the original File Creation date of the tif files. Hence, search is a nightmare, leaving only Bates numbers or file names to work with. Please Advise. THANK YOU

    • Rick Borstein says:

      When OCRing, you are creating a new document which will have a new creation date. You would need to use some sort of database system to manage that.

  6. raghav says:

    Is there a way to run a batch PDF to PDF/A conversion from a command-line utility with arguments?

  7. DB says:

    I know this is a very old post, but am wondering if you have come up with a way to batch Lotus Notes to PDF/A?