The Most Important New Feature

XML is a beautiful thing…primarily for geeky people who are passionate about topics like workflow automation and industrial strength publishing solutions. Being able to describe document geometry and document content in a non-binary file format that can be edited and manipulated by databases, scripts, and other processes enables a vast universe of vertical market solutions that can save publishers a lot of time and money, as well as offer opportunities for new types of publishing products.

For the last few versions, InDesign has enabled you to export page objects or entire documents in an XML file format called .INX. It was designed primarily for backward compatibility, and it enables you to take documents from, for example, InDesign CS3 to InDesign CS2. What it was not designed to be was human readable…or easy to deconstruct, edit, and put back together again. Nevertheless, many third-party developers expended substantial effort to do just that, simply because of the workflow opportunities afforded by a text-only XML file format that would enable applications other than InDesign to create and/or edit document files.

Third-party solution providers wanted to do things like the following with .INX:

  • Programmatically create .INX outside InDesign, i.e., create InDesign document files using a database or some other application
  • Edit or transform documents
  • Programmatically replace old with new content
  • Extract and recombine document subcomponents
  • Include and preserve their own proprietary data in the .INX structure
  • Validate the .INX document structure
  • Pre-flight the .INX code
  • Execute processes on .INX using industry-standard tools like XSLT, XQuery, E4X, RelaxNG validators, and converters
  • Build rich internet applications that serve as front ends to a publishing system that uses InDesign Server as the layout engine.

Because INX was not designed to be used in these ways, and because Adobe did not officially support it, building workflows around .INX was challenging at best. Many dedicated developers, however, did just that, doing their best to work around the format’s obstacles and limitations.

IDML

With the release of InDesign CS4, Adobe has delivered a new document XML file format that is actually designed to be a developer tool. IDML is the best thing since sliced bread for third-party solution developers who have been wrestling with the inadequacies of INX for building web-to-print and other workflow automation solutions. It’s designed to be everything that .INX was not when it comes to supporting third-party development.

DESIGN GOALS

The primary design goals of IDML are:

  • Completeness: Any object, attribute, or preference can be represented in IDML. Complete "round trip" compatibility is expected of IDML files.
  • Readability: The IDML format is human-readable. IDML is designed to be read and written by virtually any program or tool capable of reading and writing XML.
  • Robustness: Developers have more visibility into errors and more flexibility in handling them.
  • Backward compatibility: A user will be able to take an IDML file generated for version X and open it in version X-1.
  • Performance: IDML aims to maintain or exceed the performance of INX.

We’ve designed IDML to make it a key part of automated workflows. Using IDML, you can:

  • Programmatically generate or modify IDML documents
  • Re-use parts of IDML documents in other documents
  • Break a document into components
  • Transform document elements using XSLT
  • Find data in InDesign documents using XPath.

READABILITY

Here’s a small sample of IDML code that illustrates its human readability. It is part of the description of a text frame:

<ItemGeometry NumPath="1" GeometricBounds="31 31 571 751" TransformationMatrix="1 0 0 1 5 -391">
<GeometryPath NumPoint="4" IsOpen="false">
<PathPoint AnchorPoint="31 31" LeftDirectionPoint="31 31" RightDirectionPoint="31 31"/>
<PathPoint AnchorPoint="31 751" LeftDirectionPoint="31 751" RightDirectionPoint="31 751"/>
<PathPoint AnchorPoint="571 751" LeftDirectionPoint="571 751" RightDirectionPoint="571 751"/>
<PathPoint AnchorPoint="571 31" LeftDirectionPoint="571 31" RightDirectionPoint="571 31"/>
</GeometryPath>
</ItemGeometry>
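
Because the fragment above is plain XML, any XML-capable tool can read it. As a minimal sketch (not an Adobe-provided API), the following Python uses only the standard library to pull the bounds and anchor points out of that exact fragment. The function names are made up for illustration, and the code deliberately does not interpret the order of the four GeometricBounds values, since that is not stated here.

```python
import xml.etree.ElementTree as ET

# The text-frame geometry fragment quoted above, copied verbatim.
SAMPLE = """<ItemGeometry NumPath="1" GeometricBounds="31 31 571 751" TransformationMatrix="1 0 0 1 5 -391">
<GeometryPath NumPoint="4" IsOpen="false">
<PathPoint AnchorPoint="31 31" LeftDirectionPoint="31 31" RightDirectionPoint="31 31"/>
<PathPoint AnchorPoint="31 751" LeftDirectionPoint="31 751" RightDirectionPoint="31 751"/>
<PathPoint AnchorPoint="571 751" LeftDirectionPoint="571 751" RightDirectionPoint="571 751"/>
<PathPoint AnchorPoint="571 31" LeftDirectionPoint="571 31" RightDirectionPoint="571 31"/>
</GeometryPath>
</ItemGeometry>"""


def parse_numbers(value):
    """Split a space-separated attribute value into a list of floats."""
    return [float(part) for part in value.split()]


def read_geometry(xml_text):
    """Return (bounds, anchors) from an ItemGeometry fragment (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    bounds = parse_numbers(root.get("GeometricBounds"))
    # ElementTree supports a limited XPath subset; ".//PathPoint" matches
    # every PathPoint element below the root, in document order.
    anchors = [
        tuple(parse_numbers(point.get("AnchorPoint")))
        for point in root.findall(".//PathPoint")
    ]
    return bounds, anchors


bounds, anchors = read_geometry(SAMPLE)
print(bounds)
print(anchors)
```

The same approach extends to editing: change an attribute with `set()` and serialize the tree again with `ET.tostring()`.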

SUPPORT FOR THIRD PARTY DATA

IDML supports the inclusion of new scripting objects and properties added by third-party InDesign plug-ins. This means that third parties can embed their own proprietary data: any features added by plug-ins that support InDesign scripting can be included in the IDML package.

IDML EXPORT PACKAGE

When you export a document as IDML, InDesign creates a Zip archive containing multiple XML files.

The InDesign document is split into separate files representing different aspects of an InDesign document so that you can more easily identify and perform operations on the objects and properties you need. Document resources, spreads (page geometries), and stories are stored in different XML files within the zipped package.
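
Since the package is an ordinary Zip archive, you can inspect it with standard tools. The sketch below, in Python with only the standard library, groups the XML entries of a package by their top-level folder. The entry names in the demo archive are illustrative assumptions, not taken from the IDML specification, and `group_package_entries` is a hypothetical helper name. To inspect a real export, pass the path of the exported file instead of the in-memory demo.

```python
import io
import zipfile
from collections import defaultdict


def group_package_entries(package):
    """Group the XML entries of a Zip package by top-level folder.

    `package` may be a file path or a file-like object.
    """
    groups = defaultdict(list)
    with zipfile.ZipFile(package) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".xml"):
                continue  # skip anything that is not an XML file
            folder, _, rest = name.partition("/")
            # Entries without a folder component are filed under "(root)".
            key = folder if rest else "(root)"
            groups[key].append(name)
    return dict(groups)


def build_demo_package():
    """Build a tiny in-memory archive with made-up entry names."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("Stories/Story_1.xml", "<Story/>")
        archive.writestr("Stories/Story_2.xml", "<Story/>")
        archive.writestr("Spreads/Spread_1.xml", "<Spread/>")
        archive.writestr("Resources/Styles.xml", "<Styles/>")
        archive.writestr("top.xml", "<Document/>")
    buffer.seek(0)
    return buffer


demo = group_package_entries(build_demo_package())
for folder, names in sorted(demo.items()):
    print(folder, names)
```

Grouping by folder mirrors the split described above: a workflow that only touches story text can open the story files and leave the rest of the package alone.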

LEGACY FORMATS

So, what becomes of INX and all its variants in past InDesign and InCopy workflows? First, here’s a list of XML formats supported by InDesign and InCopy CS4:

Terminology

  • IDML – InDesign Markup Language
  • ICML – InCopy Markup Language, the format for InCopy stories
  • INCD – old style InCopy story format used from InCopy 2.0 through CS2
  • INCX – CS4 will have import/export support for INCX
  • INX traditional – the old style format that the INX clients have used from CS through CS3

INX will continue to be used for backward compatibility for InDesign versions prior to InDesign CS4.

IDML will be used for backward compatibility between InDesign CS4 and future versions.

Note: CS4 has no export support for INCD; however, we will continue to support INCD import.

The advent of IDML as a developer technology is a major development in the history of InDesign. We expect IDML to be leveraged extensively by third parties to deliver innovative and powerful publishing solutions to the market.

If you’re a developer interested in learning more about IDML, you can do that through our developer partner program.

8 Responses to The Most Important New Feature

  1. seuzo says:

    IDML is wonderful architecture.
    I will become the prisoner of IDML.
    I look forward to the arrival of InDesign CS4.

    [TC: Seuzo, IDML is supposed to set you free. We’re hoping you get addicted to IDML, not imprisoned by it. ;^) ]

  2. When will the InDesign CS4 SDK be available to the public? Will it include a specification for IDML and IDPP (“Live Preflight” profiles)?

    I’m also wondering why there’s still INDD and INDT…

    [TC: Frank, I’ll double check for you, but I believe all of the specs should be available now through our developer partner programs. As for the binary formats, I expect they’ll be around for the foreseeable future as well.]

  3. Stefan Gentz says:

    Obviously the IDML concept is a great step forward and has many advantages over INX. When I started playing with IDML a few months ago I was enthusiastic about the simple, straight forward and “organized” code. I was even able to integrate simple IDML source code into a SDL TRADOS translation workflow simulation.

    However, considering Frank’s post, when I heard of it for the first time a few months ago, I thought: Cool, they finally get rid of the old binary file format. I thought, that Adobe would follow the ZIP Container / XML concept that Microsoft introduced in 2007 Office and that OpenOffice has for some time, too. I guess it did not make it into CS4 due to performance considerations (opening binary indd still seems to be just faster). I hope that INDD will be replaced with IDML 2.0 in CS7 (I take Tim’s “foreseeable future” for CS5 and 6) …

    [TC: Stefan, as you point out, binary is faster, and there’s a reason for that. IDML is interpreted by InDesign’s scripting engine to create a document, not open one. There’s no one stopping you or anyone else from using IDML as a way to save files rather than the native binary format. Feel free to re-map your cmd/ctrl S keyboard shortcut to the Export command…]

  4. Andrew Meit says:

    Wow, it’s like Adobe rediscovered its PostScript roots. 😉 Visual view and code view models should always exist and be able to roundtrip both, now finally they can. Thank you. Please post the link to the SDK stuff when you can. Btw, I assume the raw text for the stories is stored as unicode or is it an xml thingy too?

    [TC: Hi Andrew, I just posted a link to the SDK pages on our devnet site. Answer on the raw text question will be forthcoming.]

    [TC: Andrew, the story text content is UTF-8 encoded Unicode values.]

  5. We are a translation agency that receives hordes of Indd files to translate and that sends hordes of inx files to various translators throughout the world. The problem at the moment is that not all translators have upgraded to the latest Trados version – Trados 6.5 users cannot work with CS2 or CS3 inx files so we have to save CS2 files down to CS1 and create story-collector ISC files for them. Trados 7.5 users cannot work with CS3 inx files so we have to save down to CS2 (from CS3) to generate CS2-inx files for Trados 7.5 users. Trados 8 users can work with CS3 inx files. What will the situation be with CS4 IDML files? Will we have problems with translators who have not upgraded to the latest Trados version? Hope to read you before we decide to upgrade to CS4. Thank you.

    [TC: Hi Dani. I think you’ll need to direct this question to the Trados people. I’m not familiar with this product. What I can tell you is that IDML will be substantially easier for third party developers to use in their workflow systems, as it’s specifically designed for that purpose…in contrast to .inx which was not. At this point only the Trados people can tell you what their product will be able to do with IDML.]

  6. Kaoru says:

    We are also a translation company, and use an INX workflow based on a little filter programme we wrote for ourselves. IDML sounds like easier to work with than INX, but I do hope so for our purpose. It will largely depend on whether we can pick and change translatable texts and get away with it without adjusting other parts. I would like this and later InDesign versions to be generous enough to accept the modified IDML files, just like CS/CS2/CS3 versions happily accept such INX files.

    [TC: Kaoru, what you describe is exactly why IDML was developed. The goal was to deliver an XML file format that made processes like the one you describe much easier to implement, and much more reliable. INX wasn’t designed to be edited or deconstructed…but people did it anyway because they could. IDML will be forward and backward compatible from ID CS4 onward, and it will be a far easier format in which to pick and choose which parts of the document get modified.]

  7. Our software produces and automatically imports InDesign tagged text files because there were many limitations of InDesign’s XML import, not the least of which was performance with big files. Do you consider that IDML is a replacement for tagged text too?

    [TC: No, not at this time…although for many users it could be.

    Performance is always potentially an issue when you’re dealing with code that needs to be interpreted…the larger the file, the more complex the document formatting, the longer it’s going to take. IDML can be used to describe much more than text formatting.

    One of the main benefits of IDML is the fact that an InDesign document can be created and edited outside of InDesign. InDesign is only required to assemble, print/export/preview the finished document. It opens up a lot of opportunities for different types of automated workflows.

    All that said, I’m not familiar with exactly what you’re doing with your tagged text import, so I’m not sure whether IDML is a good fit for you.]

  8. Serge Basharov says:

    I haven’t worked with IDML yet, but I wonder if it is possible to check, for example, whether my script-generated story will have overset text (without launching InDesign).
    Thanks.

    [TC: Serge, you can estimate the length of the text outside of InDesign (editorial systems using Word rather than InCopy have been doing that for years), but only by having InDesign actually compose the text and return that composition information to you would you know definitively if there was overset text, and if so, how much. In an automated workflow, this is normally done by handing the story off to InDesign Server.]