NOTE: I wrote this article for Acrobat 9. In Acrobat X, exporting to Excel is super simple and works great. Just choose File> Save As> Spreadsheet. It’s worth the upgrade for this feature alone!
I received this email from a paralegal at a large law firm recently:
Help! An attorney has asked me to convert PDFs we received in discovery to Excel. The PDFs are tabular in nature (probably originated in Excel). Some are scanned in from paper and others appear to be converted electronically. How do I do this?
Fortunately, Acrobat 9 offers a couple of different ways to export to Excel.
- Select table and open in Excel
This allows you to select a portion of a page and open it in Excel.
Works best when you only need a small part of the table
Gives better results if the file didn’t originate from a spreadsheet
- Export as Tables in Excel
This method uses some artificial intelligence to convert multiple page PDF documents to multiple worksheets in an XML-based spreadsheet file. It works best on files which were converted directly from Excel to PDF.
To open the XML-based file output generated using method 2 above, you’ll need either:
Acrobat will usually do a pretty good job converting the text, but formatting and column widths will look different from the original. Acrobat only copies over the text. Formulas will not convert. Do not expect 100% fidelity.
In the full article, you’ll receive my usual step-by-step instructions.
Converting to Excel from PDF: Copy Table as Spreadsheet
I’ve had better luck using this method for scanned documents and documents which were not originally spreadsheets.
How to use it:
- Open a PDF and run OCR if it was originally scanned
Document—> OCR Text Recognition
- Select the Select Text tool (cursor)
- Hold down the ALT (CMD on the Mac) key to make a rectangular selection over a table in the document.
Your cursor will change shape to indicate a rectangular selection.
- With the text still selected, right-click and choose “Open Table in Spreadsheet”
- The table data will open in Excel
What are the other options?
Mac Users: Only Copy as Table and Save as Table are available.
Converting to Excel from PDF: Save As Tables in Excel Spreadsheet
This method allows you to export a multiple-page PDF to multiple tables in an Excel file. It seems to work best on documents which were:
- Converted directly to PDF from Excel
- Converted using Acrobat (rather than a clone)
Save as Tables works better in Acrobat 9.1
How to use it:
- Open the PDF you want to convert
- OCR the document if it was originally scanned.
Choose Document—> OCR Text Recognition
- Choose File—> Save As
- From the Type list at the bottom of the window, choose Tables in Excel Spreadsheet
- Click Save
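If you want to check what Acrobat wrote before opening the file in Excel, a short script can list the worksheets and their cell text. This is only a rough sketch, and it rests on an assumption the article does not confirm: that the saved file follows Microsoft's XML Spreadsheet 2003 layout (Workbook, Worksheet, Table, Row, Cell, Data) in the `urn:schemas-microsoft-com:office:spreadsheet` namespace. The function name `read_worksheets` is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: the export uses the XML Spreadsheet 2003 namespace.
SS = "urn:schemas-microsoft-com:office:spreadsheet"
NS = {"ss": SS}


def read_worksheets(source):
    """Return {worksheet name: rows}, where each row is a list of cell strings.

    `source` is a file name or a binary file object. Cell positions set with
    ss:Index are ignored, so cells are returned in the order they appear.
    Worksheets that share a name overwrite one another.
    """
    root = ET.parse(source).getroot()
    sheets = {}
    for worksheet in root.findall("ss:Worksheet", NS):
        name = worksheet.get(f"{{{SS}}}Name", "")
        rows = []
        for row in worksheet.findall("ss:Table/ss:Row", NS):
            cells = []
            for cell in row.findall("ss:Cell", NS):
                data = cell.find("ss:Data", NS)
                # A cell without a Data element is treated as empty text.
                cells.append("".join(data.itertext()) if data is not None else "")
            rows.append(cells)
        sheets[name] = rows
    return sheets
```

Calling `read_worksheets("export.xml")` (a hypothetical file name) and printing `len(rows)` for each worksheet gives a quick count to compare against the pages in the PDF. Every value comes back as plain text, which matches the fact that only text, not formulas, is carried over.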
How do I open the file in Excel?
Where are all the pages?
Batch Converting PDF to Excel
Have a lot of PDFs you want to convert to Excel? No problem! This works in any version of Acrobat 9.
- Choose File—> Export—> Export Multiple Files
- Click the Add Files button at the top of the window and locate your source PDFs
- The Output Options window appears:
A) Click Browse to select a folder for the Excel output
B) If desired, add a prefix or suffix to the filename
C) Change Export to “Tables in Excel”
- Click OK
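After a batch run you may end up with a folder full of exported files. The sketch below walks such a folder and writes one CSV file per worksheet, which can be handy for a quick review outside Excel. It is not part of Acrobat and makes two assumptions: the exported files carry an `.xml` extension, and they use the XML Spreadsheet 2003 layout. The function name `export_folder_to_csv` and the `_sheetN.csv` naming are invented for this example.

```python
import csv
import xml.etree.ElementTree as ET
from pathlib import Path

# Assumption: batch output is XML Spreadsheet 2003 saved with an .xml extension.
SS = "urn:schemas-microsoft-com:office:spreadsheet"
NS = {"ss": SS}


def export_folder_to_csv(folder):
    """Write one CSV per worksheet for each .xml file in `folder`.

    Returns the list of CSV paths written, in processing order.
    """
    written = []
    for xml_path in sorted(Path(folder).glob("*.xml")):
        root = ET.parse(str(xml_path)).getroot()
        worksheets = root.findall("ss:Worksheet", NS)
        for index, worksheet in enumerate(worksheets, start=1):
            # Hypothetical naming scheme: <source name>_sheet<N>.csv
            csv_path = xml_path.with_name(f"{xml_path.stem}_sheet{index}.csv")
            with open(csv_path, "w", newline="", encoding="utf-8") as handle:
                writer = csv.writer(handle)
                for row in worksheet.findall("ss:Table/ss:Row", NS):
                    values = []
                    for cell in row.findall("ss:Cell", NS):
                        data = cell.find("ss:Data", NS)
                        # Cells without a Data element become empty strings.
                        values.append("".join(data.itertext()) if data is not None else "")
                    writer.writerow(values)
            written.append(csv_path)
    return written
```

Point the function at the folder you chose under Browse in the Output Options window. As with the export itself, only cell text is written, so do not expect formulas or formatting in the CSV files.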