The open data movement is pushing organizations, in particular government agencies, to make the raw data they collect openly available to everyone for the common good. Open data has been characterized as the “new oil” that is driving the digital economy. Gartner claims: “Open data strategies support outside-in business practices that generate growth and innovation.”
The W3C is sponsoring what promises to be a very interesting workshop, “Open Data on the Web,” in London on April 23-24, 2013. I will be attending and will present a talk entitled “The Role of PDF and Open Data,” which explores how PDF (Portable Document Format, standardized as ISO 32000-1) can be used effectively to deliver raw data.
There is a widespread belief that once data has been rendered into PDF, any hope of accessing or using that data for purposes other than the original presentation is lost. The PDF/raw-data question arises because raw data is usually best represented as comma-separated values (CSV) or in a specific, well-documented XML language.
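To make the two raw-data representations mentioned above concrete, here is a minimal sketch that writes the same records as CSV and as XML using only the Python standard library. The sample records, the field names (`station`, `reading`), and the XML element names are hypothetical, invented purely for illustration; they do not come from any real dataset or schema.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample records, invented for illustration only.
records = [
    {"station": "A1", "reading": "12.5"},
    {"station": "B2", "reading": "9.8"},
]


def to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=["station", "reading"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def to_xml(rows):
    """Serialize the rows as XML using an assumed, ad hoc vocabulary."""
    root = ET.Element("readings")
    for row in rows:
        item = ET.SubElement(root, "reading", station=row["station"])
        item.text = row["reading"]
    return ET.tostring(root, encoding="unicode")


csv_text = to_csv(records)
xml_text = to_xml(records)
print(csv_text)
print(xml_text)
```

Either form can be read back by a program without any knowledge of how the data was originally laid out on a page, which is the property that a purely visual rendering is commonly assumed to lack.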
PDF is arguably the most widely used file format for representing information in a portable and universally deliverable manner. The ability to capture the exact appearance of output from nearly any computer application has made it invaluable for the presentation of author-controlled content.
The challenge has been to find ways to have your cake and eat it too: to have a highly controlled and crafted final presentation and yet keep the ability to reshape the same content into some other form. We know of no perfect solution or format for this problem, but there are several ways in which PDF can contribute to solutions. I have explored these in previous blog posts and will expand on them in my presentation at the workshop. I hope to see you there.
James C. King
Senior Principal Scientist