Don’t say more than you intended
We’ve all seen those government or legal documents (like the image below) where the author has removed sensitive content from the prying eyes of the public. This process is called redaction, and it allows, for example, the National Security Agency to publish content per the Freedom of Information Act and still keep the nation’s secrets secret. But what happens when users don’t follow best practices and accidentally leak more than they intended? More after the break…
In Acrobat 8, we introduced a set of features which allow you to publish your content with more confidence. With Redaction, you can remove visible content (like text or images…or any content) from your PDF that you don’t wish to publish. You can optionally replace that content with blocks or labels indicating that content was removed. And with Examine Document, you can search through your PDF for hidden content (like metadata, comments, file attachments, etc.) and remove those items as well. In Acrobat 9, we improved on both of these features, including the ability to search for and redact entire lists of words/phrases or redact patterns (like social security numbers). Rick Borstein did a great demo of these new features here. And Donna Baker has a step-by-step tutorial on how to use Acrobat redaction here.
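To make the idea of pattern-based redaction concrete, here is a minimal Python sketch that works on plain text. It is only an illustration, not how Acrobat implements the feature: the pattern, the `redact_pattern` helper, and the `[REDACTED]` label are all assumptions made up for this example, and the pattern only matches social security numbers written in the 123-45-6789 form.

```python
import re

# Hypothetical pattern: U.S. social security numbers written as 123-45-6789.
# Other formats (no dashes, spaces) would need additional patterns.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact_pattern(text, pattern=SSN_PATTERN, label="[REDACTED]"):
    """Replace every match with a label, so the matched characters
    no longer exist anywhere in the returned string."""
    return pattern.sub(label, text)


sample = "Employee 078-05-1120 filed the claim."
print(redact_pattern(sample))
```

The point to notice is that the matched digits are replaced rather than covered, so nothing is left behind to recover.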
The Acrobat Team has been investing in this area because there can, unfortunately, be real business or security consequences when an organization publishes content that reveals more than intended. Many such incidents have been mentioned in the news, but let me highlight a few of them.
In May 2008, as reported in the Connecticut Law Tribune, General Electric was involved in a class-action lawsuit involving sexual discrimination which sought damages of $500 million. As part of the legal proceedings, GE’s counsel filed paperwork in PDF to the federal court electronic filing system called PACER, with a portion of that paperwork sealed by court order and therefore redacted. However, the information was not properly redacted, and the hidden information was revealed by “[copying] the black bars that cover the text on the screen and [pasting] them into a Word document.” [Oftentimes, users mistakenly highlight text in black in Word and then convert that Word file to PDF. However, that only covers the text, and the text can later be retrieved. I assume that’s what may have occurred in this case.] The result was that information which was “supposed to be sealed by court order, [appeared] with little technical savvy required.” Therefore, the case may be jeopardized due to “revelations that there’s a large leak of information in the case.”
Also in May 2008, as reported on Matt Blaze’s blog, the U.S. Department of Justice released a PDF report detailing wiretapping and the measures the DOJ is taking to make sure that wiretapping cannot be easily defeated by the bad guys with technical countermeasures. Pieces of the document were marked “REDACTED – FOR PUBLIC RELEASE”. But again, the marks used simply covered the text underneath and, as Matt reported, the “extra layer can be removed easily with Adobe’s own Acrobat software or by just cutting and pasting text.” When the marks are removed, they reveal details such as the FBI’s financial arrangements with Verizon regarding wiretapping as well as a survey of law enforcement agencies on the problems with wiretapping. Matt includes a copy of the original PDF with the bad redaction marks, and you can see for yourself how easy it is to recover this content.
I’ll go back a bit for my third example. In May 2005, as reported in both Government Computer News and Washington Technology, the U.S. military issued a PDF report detailing the accidental shooting death of an Italian journalist by U.S. forces in Iraq, again with portions improperly redacted. When the improper redaction marks were removed, additional information was revealed, including details of the telecommunications breakdowns which may have been partially responsible for the death.
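The failure described in these incidents, where a mark covers the text instead of removing it, can be shown with a toy model in Python. This is a sketch under loud assumptions: the `Page` class and its methods are hypothetical and do not represent the PDF format or any Acrobat API. It only models a page as named runs of text, some of which have a bar drawn over them.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    """Toy stand-in for a page: named text runs plus bars drawn over some."""
    text_runs: dict = field(default_factory=dict)   # run name -> text
    black_bars: set = field(default_factory=set)    # names of covered runs

    def cover(self, name):
        """Improper redaction: draw a bar over the run; the text stays stored."""
        self.black_bars.add(name)

    def redact(self, name, label="[REDACTED]"):
        """Proper redaction: replace the stored text itself with a label."""
        self.text_runs[name] = label
        self.black_bars.discard(name)

    def visible_text(self):
        """What a reader sees on screen: covered runs appear as solid bars."""
        return " ".join(
            "#" * len(text) if name in self.black_bars else text
            for name, text in self.text_runs.items()
        )

    def extract_text(self):
        """What copy-and-paste sees: the bars are ignored entirely."""
        return " ".join(self.text_runs.values())


covered = Page({"intro": "Payment terms:", "secret": "sealed figure"})
covered.cover("secret")
print(covered.visible_text())   # the secret looks hidden on screen
print(covered.extract_text())   # but it is still stored in the page

redacted = Page({"intro": "Payment terms:", "secret": "sealed figure"})
redacted.redact("secret")
print(redacted.extract_text())  # the secret is no longer stored anywhere
```

In the covered page the sensitive run looks hidden but comes back with a simple text extraction; in the redacted page there is nothing left to extract.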
There are numerous other examples. But the bottom line is that the Acrobat Team wanted to make sure there were effective tools available in Acrobat so that you could publish your content with confidence without concern that sensitive information was still there.
Let us know what you think, and thanks for reading.
Dave Stromfeld, Acrobat Product Manager