The Mars Project at Adobe aims to create an XML representation for PDF documents. We use a component-based model to represent different aspects of the document, and we use the Universal Container Format (a Zip-based packaging format) to hold the pieces. Mars uses XML to represent individual components where that makes sense and industry-standard formats for the rest: fonts (OpenType), images (PNG, GIF, JPEG, JPEG2000), and color (ICC color profiles), among others. We use SVG to represent page content, since it is both an XML format and an industry standard.
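Because the container is Zip-based, a package of this kind can be inspected with ordinary Zip tooling. The following is a minimal sketch, not the Mars specification: the member names (`page/0/pg.svg`, `font/body.otf`, and so on), the extension mapping, and the helper functions `classify_components` and `build_sample_package` are all hypothetical and chosen only for illustration. It builds a small in-memory Zip archive and groups its members by the component formats named above.

```python
import io
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath

# Hypothetical extension-to-kind mapping, based only on the formats
# named in the text (XML, SVG, OpenType, PNG/GIF/JPEG/JPEG2000, ICC).
KINDS = {
    ".xml": "XML component",
    ".svg": "page content (SVG)",
    ".otf": "font (OpenType)",
    ".png": "image",
    ".gif": "image",
    ".jpg": "image",
    ".jpeg": "image",
    ".jp2": "image",
    ".icc": "color profile (ICC)",
}


def classify_components(package):
    """Group the member names of a Zip-based package by component kind."""
    groups = defaultdict(list)
    with zipfile.ZipFile(package) as zf:
        for name in zf.namelist():
            if name.endswith("/"):
                continue  # skip directory entries
            suffix = PurePosixPath(name).suffix.lower()
            groups[KINDS.get(suffix, "other")].append(name)
    return dict(groups)


def build_sample_package():
    """Create an in-memory Zip archive with made-up member names.

    The layout is an assumption for illustration; a real Mars package
    may be organized differently.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("page/0/pg.svg", "<svg xmlns='http://www.w3.org/2000/svg'/>")
        zf.writestr("font/body.otf", b"")
        zf.writestr("image/logo.png", b"")
        zf.writestr("color/srgb.icc", b"")
        zf.writestr("doc_info.xml", "<info/>")
    buf.seek(0)
    return buf


if __name__ == "__main__":
    for kind, names in sorted(classify_components(build_sample_package()).items()):
        print(kind, names)
```

Passing a path to a real package file instead of the in-memory sample should work the same way, provided the file is a valid Zip archive; the grouping by file extension is a convenience of this sketch and not something the format itself defines.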
We presented a paper [PDF] on Mars at the ACM Symposium on Document Engineering. The presentation [PDF] [Mars] covered the basics of Mars and gave some examples of converting a PDF to Mars. I gave the entire presentation using Mars as the presentation format, with Adobe Acrobat® to display it, and have included both the PDF and Mars versions above.
Some files that were shown on the day are not yet available here, but we will try to include them in the near future.