Reducing the File Size of Scanned PDFs

It seems like a lot of folks are struggling with the size of scanned PDFs. Below are excerpts from two emails I received recently:

My [Fujitsu] ScanSnap makes PDFs that are too big . . . like around 60K per page! What can I do to make these smaller in Acrobat?

I have to eFile [with the Federal Court] and am having to split the filings into many segments to go through the [Court] gateway. The issue seems to be with documents that are scanned on our network scanner. PDFs produced directly from Word are a lot smaller. Is there some trick to reduce the size of scanned files?

Before covering how to reduce the size of scanned documents in detail, let’s discuss four factors that affect the size of scanned images:

  1. Scanning Resolution
    A scan at 600 dpi results in a much larger file than at 300 dpi.
  2. Color Space
    Color and grayscale scans are much larger than black and white scans.
  3. Physical dimensions of the scanned page
    A legal-size scan will be larger than a letter-size scan, with all other factors being equal.
  4. Compression
    Raw scan data can be compressed to make it smaller.
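To see how the first three factors multiply together, here is a rough Python sketch that estimates the raw (uncompressed) size of a single scanned page. The function name is my own, and the bit depths are the typical values (1 bit per pixel for black and white, 8 for grayscale, 24 for color); a real scanner may pad rows or add headers, so treat the numbers as ballpark figures only.

```python
def raw_scan_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Approximate uncompressed size, in bytes, of one scanned page."""
    width_px = round(width_in * dpi)    # pixels across the page
    height_px = round(height_in * dpi)  # pixels down the page
    return width_px * height_px * bits_per_pixel // 8


# Typical bit depths (an assumption): 1 = black and white, 8 = grayscale, 24 = color.
letter_bw_300 = raw_scan_bytes(8.5, 11, 300, 1)
letter_bw_600 = raw_scan_bytes(8.5, 11, 600, 1)
letter_gray_300 = raw_scan_bytes(8.5, 11, 300, 8)
letter_color_300 = raw_scan_bytes(8.5, 11, 300, 24)
legal_bw_300 = raw_scan_bytes(8.5, 14, 300, 1)

for label, size in [
    ("Letter, B&W, 300 dpi", letter_bw_300),
    ("Letter, B&W, 600 dpi", letter_bw_600),
    ("Letter, grayscale, 300 dpi", letter_gray_300),
    ("Letter, color, 300 dpi", letter_color_300),
    ("Legal, B&W, 300 dpi", legal_bw_300),
]:
    print(f"{label:<28}{size / 1024:>10.0f} KB")
```

A letter-size black and white page at 300 dpi works out to 1,051,875 bytes (about 1 MB) before compression. Doubling the resolution to 600 dpi quadruples that, and switching to grayscale multiplies it by eight, which is why compression matters so much.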

 

Compression Types

Lossless compression retains the exact appearance of the original.

Two common types of lossless compression are ZIP and CCITT Group 4.

Lossy compression makes some (hopefully) non-noticeable visual trade-offs to further reduce file size.

JPEG is a common lossy compression method.

Ideally, you would control all of the above factors yourself by scanning at 300 dpi in black and white and using an efficient compression algorithm.

Unfortunately, you may not have that option. Many desktop and network scanners offer limited or confusing options, or the scanned PDFs may have arrived from outside your firm.

Legal Scanning Recommendations
In almost all situations, scan at 300 dpi, black and white.

For the purpose of this article we will make a couple of assumptions:

  1. You have a black and white scanned document of unknown dpi and compression
  2. You have already OCR’d the document, or don’t need OCR

Read on to learn how to reduce the file size of scanned documents using Acrobat.


Black and White Image Compression

There are three common types of compression used on black and white scanned images:

Compression Type   Avg Size per Page   Notes
CCITT G4           50K                 Most commonly used type of compression
JBIG2 Lossless     36K                 Good lossless alternative to CCITT G4 compression
JBIG2 Lossy        15K                 A lossy compression scheme which often does a good job on typical legal documents

 

For most 300 dpi black and white scans, it is very difficult to spot any visual difference between the three compression types.

[Figure: Comparison of compression, 300 dpi, 200% enlargement]
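To put the per-page averages from the table above in practical terms, here is a minimal Python sketch that estimates how many scanned pages fit under an upload size limit. The 5 MB limit is a made-up value for illustration, not the actual limit of any court gateway, and the per-page figures are only the averages quoted above; your documents will vary.

```python
# Average size per page, in KB, taken from the table in this article.
AVG_KB_PER_PAGE = {
    "CCITT G4": 50,
    "JBIG2 Lossless": 36,
    "JBIG2 Lossy": 15,
}

# Hypothetical upload limit (an assumption for illustration only).
HYPOTHETICAL_LIMIT_KB = 5 * 1024


def max_pages(limit_kb, kb_per_page):
    """Whole pages that fit under the limit at a given average page size."""
    return limit_kb // kb_per_page


estimates = {
    name: max_pages(HYPOTHETICAL_LIMIT_KB, kb)
    for name, kb in AVG_KB_PER_PAGE.items()
}

for name, pages in estimates.items():
    print(f"{name:<16} about {pages} pages per file")
```

Under these assumptions, a filing that has to be split into several segments with CCITT G4 could fit roughly three times as many pages per segment with JBIG2 Lossy.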

Using "Optimize Scanned Image" in Acrobat Standard and Pro

The Optimize Scanned Image feature performs various image clean-up tasks (de-skewing, edge enhancement) and also nicely compresses files.

Here’s how to use this feature:

  1. Open the PDF you wish to optimize
  2. Choose Document > Optimize Scanned PDF...
  3. The Optimize Scanned Image window appears.
  4. Choose the appropriate level of compression and click OK.

What do the settings mean?

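The slider at the top of the window has six clickable positions, referred to below as a through f:

[Figure: Optimize Scanned Image window]
For 300 dpi black and white scans, the six positions produce only three distinct file sizes: a through d give the same result, while e and f each give a different one.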

Results for a 4-page scanned document

Setting     Original   a     b     c     d     e      f
File size   199K       55K   55K   55K   55K   132K   199K

a, b, c and d = JBIG2 Lossy
e = JBIG2 Lossless
f = CCITT G4
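The savings are easier to compare as percentages. This small Python sketch simply redoes the arithmetic on the results above; the function and variable names are mine.

```python
ORIGINAL_KB = 199

# File size in KB at each slider position, from the results above.
RESULT_KB = {"a": 55, "b": 55, "c": 55, "d": 55, "e": 132, "f": 199}


def percent_saved(original_kb, new_kb):
    """Size reduction relative to the original, as a percentage."""
    return round(100 * (original_kb - new_kb) / original_kb, 1)


savings = {pos: percent_saved(ORIGINAL_KB, kb) for pos, kb in RESULT_KB.items()}

for pos, pct in savings.items():
    print(f"{pos}: {RESULT_KB[pos]}K ({pct}% smaller)")
```

For this document, the JBIG2 Lossy positions (a through d) cut the file by about 72%, JBIG2 Lossless (e) by about 34%, and CCITT G4 (f) left the size unchanged.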

Using Acrobat’s PDF Optimizer to Compress Scanned PDFs

The PDF Optimizer can be used to analyze and selectively compress documents. Sorry, Acrobat Standard users: this feature is in Acrobat Pro and Pro Extended only.

Analyzing File Size of Scanned Documents

To better understand why a document is big, view the statistics available via the PDF Optimizer.

  1. Open the PDF you wish to analyze
  2. Choose Advanced > PDF Optimizer...
  3. Click the Audit Space Usage... button
  4. The Audit Space Usage window appears:
    [Figure: Audit Space Usage window]

The window above shows the statistics for a 4-page scanned document:
A) Total file size was about 200K
B) Over 190K was allocated to images!

We can do a lot better than that . . .

Reducing the Size of an Individual Scanned PDF using the PDF Optimizer

  1. Open the PDF you wish to compress
  2. Choose Advanced > PDF Optimizer...
    The PDF Optimizer window appears:

    [Figure: PDF Optimizer window]

  3. In the list on the left, ensure that only Images and Clean Up are checked:
    [Figure: Choosing PDF Optimizer categories]
  4. At the bottom of the window, set the following for black and white documents:
    a) Set the downsampling resolution to 300 ppi
    b) Set the "for images above" value to 300 ppi
    c) Set the compression to JBIG2
    d) Choose Lossy or Lossless quality

    [Figure: PDF Optimizer settings for B&W files]

  5. Save your settings so you can easily recall them:

    [Figure: Saving PDF Optimizer settings]
    a) Click the Save button at the top of the window
    b) Give the setting a name and click OK

 

Note: The PDF Optimizer may be used in batch mode, which allows you to process hundreds of files. See my article on Batch OCR with Acrobat Pro.
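Before setting up a batch run, it can help to know which files are worth processing. The Python sketch below does not drive Acrobat in any way; it only walks a folder and lists the PDFs at or above a size threshold, largest first. The function name, the default 1 MB threshold, and the use of the current folder are all assumptions for illustration.

```python
from pathlib import Path


def large_pdfs(folder, min_kb=1024):
    """Return (path, size_kb) pairs for PDFs of at least min_kb, largest first."""
    hits = []
    for path in Path(folder).rglob("*"):
        # Match .pdf and .PDF alike, and skip directories.
        if path.is_file() and path.suffix.lower() == ".pdf":
            size_kb = path.stat().st_size / 1024
            if size_kb >= min_kb:
                hits.append((path, size_kb))
    return sorted(hits, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # "." (the current folder) is a placeholder; point this at your scan folder.
    for path, size_kb in large_pdfs("."):
        print(f"{size_kb:>10.0f} KB  {path}")
```

The files at the top of the list are the best candidates for the batch process described in the note above.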

 
