DocEng 07: Mars — PDF in XML

The Mars Project at Adobe aims to create an XML representation for PDF documents. We use a component-based model to represent the different aspects of a document, and we use the Universal Container Format (a Zip-based packaging format) to hold the pieces. Mars uses XML to represent individual components where that makes sense and industry-standard formats for the rest: fonts (OpenType), images (PNG, GIF, JPEG, JPEG2000), color (ICC color profiles), and so on. We use SVG to represent page content, since it is both an XML format and an industry standard.
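To make the component model a little more concrete, here is a minimal Python sketch that opens a Zip-based package and groups its members by the kinds of components described above. It is only an illustration: the file names, directory layout, and extension mapping are assumptions made up for this example, not the actual Mars package structure, and the demo package is built in memory rather than read from a real Mars file.

```python
import io
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath

# Extension-to-component mapping, based on the formats named in the post.
# The extensions a real Mars package uses are an assumption here.
COMPONENT_KINDS = {
    ".xml": "xml",
    ".svg": "page content (SVG)",
    ".otf": "font (OpenType)",
    ".png": "image",
    ".gif": "image",
    ".jpg": "image",
    ".jpeg": "image",
    ".jp2": "image",
    ".icc": "color profile (ICC)",
}


def classify_components(package):
    """Group the members of a Zip-based package by component kind."""
    groups = defaultdict(list)
    with zipfile.ZipFile(package) as archive:
        for name in archive.namelist():
            if name.endswith("/"):
                continue  # skip directory entries
            suffix = PurePosixPath(name).suffix.lower()
            groups[COMPONENT_KINDS.get(suffix, "other")].append(name)
    return dict(groups)


def build_demo_package():
    """Build a tiny in-memory package with hypothetical member names."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("page/1/pg.svg", "<svg xmlns='http://www.w3.org/2000/svg'/>")
        archive.writestr("font/body.otf", b"")
        archive.writestr("image/logo.png", b"")
        archive.writestr("color/srgb.icc", b"")
        archive.writestr("info.xml", "<info/>")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for kind, names in classify_components(build_demo_package()).items():
        print(f"{kind}: {names}")
```

Because the container is Zip-based, ordinary Zip tools should be enough to look inside a package; only the interpretation of each component depends on its format.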

We presented a paper [PDF] on Mars at the ACM Symposium on Document Engineering. The presentation [PDF] [Mars] covered the basics of Mars and gave some examples of converting a PDF to Mars. I gave the entire presentation using Mars as the presentation format and Adobe Acrobat® to display it, and I have included both the PDF and Mars versions above.

Some files that were shown on the day are not yet available here, but we will try to include them in the near future.

2 Responses to DocEng 07: Mars — PDF in XML

  1. Will Pollard says:

    The link to the paper seems to point to ACM at the moment. I think yesterday there was an Adobe version there, possibly a draft. Are you allowed to distribute this? Any guidance on what to do if you have a copy? The ACM site has a fairly complex routine for logging on, then tells you that full text is restricted….

  2. Yes, I wasn’t sure whether I was allowed to post a direct link to the draft, so when I found the DOI for it on the ACM site, I changed the link to match. Let me see if I am allowed to link directly to a draft and, if so, I will put that back.