by lrosenth

Created

November 3, 2007

This just might be apocryphal, but to the best of my knowledge it explains the heritage of the Adobe Acrobat Distiller. The story also motivates one of the key ways that PDF deviates from PostScript.

When you make a printer, you need to have some pages to print that really show off the great features of the printer. At least this was true in the very early ’80s, when small desktop copiers were being experimented with as printers. Adobe had invented PostScript as a device- and resolution-independent way to get sophisticated text, image and graphics output on a laser printer.

Dr. John Warnock, one of the founders of Adobe, had decided to use the classic US 1040 tax form as a great example of a complex and graphically rich document to print. It involves a lot of text, lines and shaded areas, the type of material that a laser printer can do a great job on but that typical printers of the day could not handle. Now remember that PostScript is a programming language and you can write subroutines to do various imaging jobs. So John made heavy use of subroutines as he hand-crafted the US 1040 tax form. This was a lot of work, but by carefully building up a nice hierarchical library of line-drawing, box-drawing and shading subroutines he was able to get a very close 1040 facsimile to print on an experimental PostScript printer. Trouble was, it took many minutes to print, many minutes. Not great for the intended purpose of demos.

John was pretty sure that most of the processing time was due to the extensive subroutine use and not to the actual rendering of the lines, shades and text. Also, being a typical programmer who would rather have a computer do the work, he set out to see if he couldn’t automatically convert his 1040 PostScript program into one without subroutines at all. One of the features of PostScript is that you can redefine all of the operators in the language. So John wrote a PostScript program that, when loaded just before printing the 1040 form, redefined all the graphic rendering primitives to write out a text string that represented the PostScript for what that operator was being asked to do. Did I forget to mention that PostScript can also write out strings that get sent back to the computer to which the printer is attached? Well, it can.

This may be a little subtle, so let me try to make it more specific and understandable. He wrote a PostScript program to run on a PostScript printer whose purpose was to transform PostScript print jobs and send the transformed versions back to the computer to which the printer was attached. Suppose that one of the uses of the graphics primitives was buried deep in a subroutine stack and was written like this: “X1 Y1 moveto X2 Y2 lineto X3 Y3 lineto stroke”, where the Xs and Ys were variables computed in some way, probably within a loop in the subroutines. What would get sent back to the computer in one pass of the loop might look like: “200 200 moveto 250 200 lineto 250 250 lineto stroke”. In the second pass of the loop the same PostScript might send back: “300 300 moveto 350 300 lineto 350 350 lineto stroke”. If the subroutine loop was executed 15 times, then 15 strings like those would get sent back to the computer, but each would have different numbers to draw lines in different spots on the 1040.

The interesting thing was that the operator calls in this new PostScript used no variables, and the numbers they contained were the results of executing the original 1040 PostScript program. So if we do this for everything on the page, we end up with a rather unraveled and perhaps voluminous set of basic graphic utterances. If those are subsequently sent down to the printer, they will produce the exact same 1040 form but much faster, since no programming-language constructs are left to execute.
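To make the trick a bit more tangible, here is a minimal sketch of the same idea in Python rather than PostScript. It is an analogy only, not John's program: the names `make_recorder`, `corner`, `form` and `distill` are all made up for illustration, and the "operators" are passed in as a dictionary because Python has no direct equivalent of redefining PostScript operators. The replacement operators do no drawing at all; they just write out their name preceded by the numbers they were handed.

```python
from typing import Callable, Dict, List


def make_recorder(log: List[str]) -> Dict[str, Callable[..., None]]:
    """Build stand-in 'operators' that log their operands instead of drawing."""

    def make_op(name: str) -> Callable[..., None]:
        def record(*args: float) -> None:
            # PostScript is postfix: operands first, then the operator name.
            log.append(" ".join([*(format(a, "g") for a in args), name]))

        return record

    return {name: make_op(name) for name in ("moveto", "lineto", "stroke")}


def corner(ops: Dict[str, Callable[..., None]], x: float, y: float,
           size: float = 50) -> None:
    """Hypothetical subroutine that computes coordinates from variables."""
    ops["moveto"](x, y)
    ops["lineto"](x + size, y)
    ops["lineto"](x + size, y + size)
    ops["stroke"]()


def form(ops: Dict[str, Callable[..., None]]) -> None:
    """Hypothetical stand-in for the hand-written form: a loop over a subroutine."""
    for origin in (200, 300):
        corner(ops, origin, origin)


def distill(program: Callable[[Dict[str, Callable[..., None]]], None]) -> List[str]:
    """Run the program against the recording operators and return the flat log."""
    log: List[str] = []
    program(make_recorder(log))
    return log


distilled = distill(form)
print(" ".join(distilled))
```

The log that comes back has no loop, no subroutine and no variables, only operator calls with literal numbers, which is the "distilled" form the story describes. A real distiller would also have to cover every other rendering operator, fonts and graphics state, none of which this sketch attempts.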

The result was a 1040 form that could be used very effectively to demo the capabilities of PostScript and laser printers. But a side benefit of the work was the PostScript program itself, which John called a “distiller”: you could load it in front of any print job and it would convert that job into a “distilled version” and send it back to the computer. The derived or distilled PostScript would produce the same results but didn’t use any of PostScript’s (slower) programming constructs or variables.

Glenn Reid picked up John’s program and fleshed it out to be more complete and to make it more widely available for people to experiment with. This exercise proved two things: that you could automatically “distill” PostScript programs into a more primitive form that didn’t use variables or subroutines, and that the resulting simple PostScript was significantly faster to process and print. The distilled program might be larger, but that was a less serious problem, one of communicating with the printer more quickly.

If you are following this at all, by now you should realize that this distiller made by John and Glenn went on to become the Adobe Acrobat Distiller, and that the idea of simplifying PostScript by removing variables and subroutines became a mainstay of PDF. Doug Brotz next wrote the actual first version of what is now Acrobat Distiller by taking the Adobe PostScript interpreter, which was written in C, and modifying it to do the distiller function. With that, you no longer needed a PostScript printer to run the PostScript version of the distiller, as John and Glenn had done. Interestingly, Doug did his work on a NeXT machine. That machine’s imaging model was based upon Display PostScript. It all swirls around!

Contact me at: jking@adobe.com