Archive for January, 2006

Sony BBeB and eBook formats

Josh Carter commented to ask “how do you see Sony’s BBeB format splintering the market? If their store is halfway successful … the bickering over open formats may be largely irrelevant to the average consumer.” Good question!
My viewpoint is that BBeB is simply one of the best of a number of what I would call “compiled from XML” derivative formats. The vast majority of eBook content that is not PDF starts life as OEBPS XHTML-based XML. But no eBook reading systems directly consume this XML. ETI uses a minimal ZIP-based container file wrapper. Microsoft Reader’s .LIT format encodes the OEB content into a DRM-protected container file. Mobipocket and Sony BBeB can be created via a “compile to bytecode” process on the OEB XML source. BBeB has some additional capabilities that overlap into the sphere of final-form paginated PDF, and the Japanese Librie supported a print driver creation utility as well as (naturally) strong support for the Japanese writing system. But given the nature of the 6-inch Sony Reader screen, and the presence of PDF on the device, I expect the BBeB-based eBooks for sale in the U.S. to be reflowable, and the results of the OEB translation pipeline.
Assuming the industry successfully establishes an XHTML-based reflowable document format based on the evolution of OEB, with an associated single-file container package with pluggable DRM, I see no strong raison d’être for Mobi, BBeB, or any of the other OEB-derivative eBook formats to hang around forever. That doesn’t mean BBeB will go away overnight, and indeed Sony has discussed taking steps to make it more open and accessible, but in the long run I believe the momentum behind interoperable XML-based formats is unstoppable.
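To make the "single-file container" idea concrete, here is a minimal sketch of wrapping XHTML-style content files in one ZIP archive. The file names (`book.opf`, `chapter1.html`) and the flat layout are purely hypothetical; this is not ETI's actual wrapper nor any container specification, just an illustration of how thin such a wrapper can be compared with a compile-to-bytecode step.

```python
import io
import zipfile


def build_container(files: dict[str, bytes]) -> bytes:
    """Pack content files into a single in-memory ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, data in files.items():
            archive.writestr(name, data)
    return buffer.getvalue()


def read_container(container: bytes) -> dict[str, bytes]:
    """Unpack the archive back into a name -> bytes mapping."""
    with zipfile.ZipFile(io.BytesIO(container)) as archive:
        return {name: archive.read(name) for name in archive.namelist()}


# Hypothetical book contents, for illustration only.
book = {
    "book.opf": b"<package><!-- hypothetical package file --></package>",
    "chapter1.html": b"<html><body><p>Chapter one.</p></body></html>",
}
container = build_container(book)
print(sorted(read_container(container)))
```

The source XHTML survives the round trip untouched, which is the point: any reading system that understands ZIP and XHTML could consume it, and a DRM layer could in principle be plugged in around the individual entries.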

More on PDF/A

There seems to be some serious confusion about the role of PDF/A. David Rothman of TeleRead recently wrote that “PDF/A… is not a consumer standard. It is for archivists, not the typical consumers who buy commercial e-books and other publications.” This is way wrong, although I will admit that Adobe may have contributed to the confusion by so far failing to fully promote the significance of the core of PDF becoming a full ISO standard.
First, the mobile/device implementation of PDF supplied by Adobe to partners like Sony, NTT DoCoMo, Access, Nokia, and others is effectively a small superset of PDF/A, you might say a “PDF/A+”, with the only major addition being its support for (simple) password-protected encryption. This means that anyone creating PDF/A content can be sure that content can be consumed on mobile and embedded devices as well as desktop systems.
Secondly, PDF/A sets a clear lower bar for alternative implementations of PDF viewing technology, such as the Quartz imaging model in Apple’s Mac OS X. This cross-platform cross-implementation portability is of value to businesses and consumers, not just archivists.
And, PDF/A highlights the underlying nature of the PDF architecture, in which by and large documents which use new features degrade gracefully in older/subset implementations, with support for features like alternative image representations for new formats like JPEG 2000. Probably 99% of PDF files in existence today can be viewed reasonably in a PDF/A implementation. A few features, notably DRM, don’t afford backwards compatibility but so far “open DRM” has proved an oxymoron – Adobe’s seems no more or less open than any other DRM scheme out there.
Finally, there is a good reason for the subset model, beyond PDF and PDF/A. The reason is to constrain content from having programmatic application content which may be inherently unportable, and is almost certainly not going to have longevity, because it depends on particular VM versions and/or external resources like web services. Programmatic content also presents major security issues. Certainly there’s a role for highly programmatic interactive application content, but having a format that guarantees a declarative nature seems quite sensible, for consumers as well as archivists, whether that format is fixed and paginated a la PDF or reflowable a la XHTML/OEBPS.
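As a rough illustration of what "guaranteeing a declarative nature" could mean in practice, the sketch below scans the raw bytes of a PDF for a few names associated with programmatic content (`/JavaScript`, `/JS`, `/Launch`). This is a naive heuristic under stated assumptions, not a PDF/A validator: it will miss names hidden inside compressed object streams and may report false positives from binary data. The function name and marker list are my own choices for illustration.

```python
# Names associated with scripted or launch actions in PDF files.
PROGRAMMATIC_MARKERS = (b"/JavaScript", b"/JS", b"/Launch")


def find_programmatic_markers(pdf_bytes: bytes) -> list[str]:
    """Return the markers that appear anywhere in the raw file bytes."""
    return [
        marker.decode("ascii")
        for marker in PROGRAMMATIC_MARKERS
        if marker in pdf_bytes
    ]


# Tiny hand-written fragments standing in for real files.
scripted = b"%PDF-1.4\n1 0 obj << /S /JavaScript /JS (app.alert(1)) >> endobj"
declarative = b"%PDF-1.4\n1 0 obj << /Type /Catalog >> endobj"

print(find_programmatic_markers(scripted))
print(find_programmatic_markers(declarative))
```

A real conformance check would parse the document structure rather than search bytes, but even this crude test shows how a subset format lets a consumer or archivist decide up front whether a file stays within the declarative core.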

The Fork in the Road Is Paved With Good Intentions

Jon Noring replied very graciously to my earlier post questioning how open the OpenReader project really is. Clearly Jon Noring and I agree on much, and I thank him for the positive spirit in which he took my rather pointed comments. But nevertheless I think we still have a bit of a disconnect on openness. While we may have some minor disagreements on other points, the biggie for me is the “fork and take over” move that the OpenReader group seems to expect us all to buy into. So I think this whole fork issue merits a bit more discussion.
Jon stated clearly that “we will consider submitting the OpenReader Framework Specification … when the specification is stable and proven ‘in the field’ with commercial implementation(s).” His colleague David Rothman further noted that they are “shopping around” for the right standards-group. To me this is a Microsoft-style takeover: embrace an existing format/protocol/language, extend, and then declare that your “fork” is the right stuff which you will kindly “consider” submitting for standardization to your choice of venue – not necessarily the folks who are responsible for the underlying standard you’ve “borrowed” (C# anyone?). And whether or not a pliant standards body is found, the fait accompli of first putting commercial implementations in the field creates a strong disincentive to do anything other than bless the particular dialect that you’ve already unilaterally established.
I’m not suggesting these aren’t reasonable business tactics under some circumstances, only that we call a spade a spade. Or, rather, a fork a fork. And hold the we-are-more-open rhetoric, please.
Of course forks happen all the time, with formats as well as open source projects. There can be valid reasons for forking, although in e.g. the Linux community it seems ego and vested interests have a lot to do with it, with the issue of “control” looming a lot larger than that of “openness”. Indeed it’s widely accepted that forking is undesirable, “a kind of plagiarism that is not supported by the community without significant reasons” [1].
And in this case I’m left scratching my head to find reasons: the reinvigorated IDPF has a head of steam up to get extensions to OEBPS worked out, including the critical container format, so why not focus on helping get the job done? Membership fees are a red herring: as Jon Noring mentioned, the IDPF has extended Invited Expert status to interested parties, including himself.
To me OpenReader appears eerily similar to the WHATWG splinter effort to advance web formats, started by Opera, Mozilla, and Safari folks back in 2004, about which I feel pretty much the same way. But while WHATWG hasn’t gotten very far with this approach, at least they have submitted their proposals to the group (W3C) responsible for the underlying standards which they propose to extend. And, they at least started off with inherent credibility as constituting all the major non-Microsoft web browser vendors. Finally, they are up-front that it’s an invitation-only party, and don’t lay it on thick about being more-open-than-thou.
Yet as far as I can tell, the WHATWG effort has after close to two years accomplished nothing more than taking wind out of the sails of W3C’s own standards efforts, for example increasing FUD around adoption of open standards like XForms. That only helps Microsoft as it seeks to promote IE7 and proprietary alternatives to Web standards. Exactly the opposite of the WHATWG’s expressed objectives! I lay the majority of the blame on egos (doubtless some of them on the W3C side of the fence), and have some theories about how the vested interests in Redmond may have helped along this debacle.
I really really don’t want to see this same scenario play out in the ePublishing space. That’s the reason I’m spending cycles to push on this issue: not hostility to OpenReader’s goals, which I largely share, but the desire not to see them backfire.
So while I appreciate Jon’s invitation for Adobe to join his “splinter group”, at this point I see no good reason to do a “WHATWG-style fork” on the IDPF. I submit to Jon and the OpenReader group that rather than set themselves up to be a fork, they work with the rest of us to define an “OEBPS 2.0” and then to establish an early open source implementation thereof. That would truly be an OpenReader accomplishment to be proud of. And, I’m not a patient guy: if the IDPF doesn’t keep moving briskly, I promise I will be jostling to be first in line in search of alternative venues for accomplishing our shared goals.

OpenReader or NoringOSoftReader?

Despite Jon Noring and David Rothman of TeleRead incessantly banging the “open” drum, and dissing PDF and other formats as proprietary, the so-called OpenReader effort at this point seems to be somewhat mis-named. Its definition of “open” appears to amount to “whatever Jon Noring and vendor OSoft think”. I support the principles that OpenReader espouses, so I challenge Jon, David and their colleagues to either lay off on the openness window dressing, or better yet become worthy of the “open” moniker they are so fond of using by pursuing their principles openly within the IDPF organization whose format they want to “embrace and extend”, instead of as a separate splinter activity behind closed doors.
First, a bit more on “open”. To me “open” means that design is not controlled by a single entity or closed set of entities: that proposals are worked out in an open, transparent process rather than via fait accompli. “open” also implies that no patent licenses are required to practice the technology. “open” formats should ideally be de jure standards, under the auspices of a recognized standards body and evolving via well-defined processes. Finally, and most important for adopters, there should be multiple compatible implementations, lest there be lock-in to a single vendor.
For example PDF/A, aka ISO 19005, is about as open as it gets. PDF/A is defined by an AIIM committee in an open process with multiple commercial and non-commercial participants. Adobe asserts no patent or other IP rights against PDF implementors. An approved ISO standard is about as “de jure” as it gets on this planet. Finally, there are literally dozens of interoperable implementations of PDF creation and viewing software, including many open source packages.
OpenReader, by contrast, at this point is being defined by a closed set of self-selected contributors in an unclear manner, in conjunction with a single commercial implementor that asserts patent rights on its DRM implementation, and their standardization roadmap seems to amount to an intended Microsoft-esque “we’ll submit it when it’s already completed” fait accompli. While it’s admirable that they plan to build on an XHTML+CSS foundation leveraging prior OEBPS efforts, that is no different than many a Microsoft “embrace and extend” move. Open source is also admirable, but again a single vendor-controlled open source implementation does not openness make. And just because you’re a non-profit doesn’t make you neutral: there are plenty of nonprofits out there with personal axes to grind, or who are effectively controlled by corporate interests.
Again, I respect and support the vision and principles underlying the OpenReader effort. And, I don’t truly think that the overall group involved in OpenReader wants to simply end up with yet another eBook format and associated reader, in effect just a NoringOSoftReader. But if they are going to be relevant to the future of ePublishing, and a positive force for that future, I urge the OpenReader crowd to realize that there’s a bigger picture, with no room for yet another one-off eBook format, and that they don’t have a corner on the market for good ideas on how to achieve the unified format the industry needs. Standards groups and industry consortia have their flaws, no doubt about it. And they afford less room for ego fulfillment than self-selected Executive Directorships and being able to unilaterally make decisions. But at the end of the day open forums beat the heck out of smoke-filled rooms, hands down.

More on Sony Reader’s PDF, DRM, and reflow

A couple of comments on my post on Sony Reader and Adobe PDF indicate there’s still a bit of confusion.
To answer Rob McDougall’s question: yes, absolutely, just stick a PDF on a MemoryStick or SD card, insert, and read (Sony’s increased openness is shown by their supporting the latter media type). In general the implementation just ignores advanced features like JavaScript that aren’t supported by the mobile/device PDF subset so pretty much any unencrypted PDF works. And you don’t need to use “memory card sneakernet” – Sony’s companion PC application makes it simple to drag&drop both BBeB and PDF documents to the device via USB (which also recharges the battery). As simple as iYouKnowWhat.
Henrique’s questions are both very interesting. His first question is whether this means that “the sony reader will only be able to read encrypted pdfs from the sony store and not pdfs from other online sellers?”. As I said in my earlier post, the PDF implementation initially shipping in the device doesn’t support DRM at all. So I believe Sony initially plans to sell only DRM-protected BBeB-format eBooks from the Sony Connect eBookStore, not PDF format content at all. However there’s nothing to stop Sony or other vendors from offering unencrypted documents (PDF or BBeB) under the “rule of law”. Baen is among a number of publishers experimenting with business models that are not predicated on heavyweight DRM. As well, one might imagine that down the road the Adobe PDF implementation for the Sony PRS product line could conceivably support DRM.
Henrique also asked “does this version of PDF offer any reformatting characteristics?” noting that “although there are some ways to put a PDF on Librie (usually printing to pictures), the text is usually a little small.” The version of mobile/device PDF shipping with the Sony Reader 1.0 does not utilize advanced “Tagged PDF” structure information, which the desktop Adobe Reader uses to implement accessibility and reflow features. So, no, PDFs on this device are fixed-format. Admittedly, fixed-format pages are not a great match to a 6-inch display, especially when many PDFs are essentially created as print masters. Indeed this is arguably the primary reason that Sony had to support BBeB: although it has structure capabilities, PDF doesn’t really solve the problem of efficiently and reliably representing flowable content that doesn’t necessarily even have a single canonical paginated representation. But Adobe is working with its partners, including Sony as well as the broader publishing industry, to solve these issues. We envision expanding our platform to incorporate first-class standards-based support for reflowable content. More on this soon.
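For readers wondering how their own files might fare, the two conditions discussed in this post (whether a PDF is encrypted, and whether it carries Tagged PDF structure) can be approximated with a quick byte scan. This is only a sketch: `/Encrypt` and `/StructTreeRoot` are real PDF dictionary keys, but the `triage_pdf` helper is hypothetical, the scan can miss keys stored in compressed object streams, and it says nothing about what the device software actually checks.

```python
from dataclasses import dataclass


@dataclass
class PdfTriage:
    encrypted: bool  # an /Encrypt entry suggests password or DRM protection
    tagged: bool     # a /StructTreeRoot entry suggests Tagged PDF structure


def triage_pdf(pdf_bytes: bytes) -> PdfTriage:
    """Naively inspect raw PDF bytes for encryption and structure keys."""
    return PdfTriage(
        encrypted=b"/Encrypt" in pdf_bytes,
        tagged=b"/StructTreeRoot" in pdf_bytes,
    )


# Hand-written fragments standing in for real files.
plain = b"%PDF-1.4\n1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj"
tagged = b"%PDF-1.4\n1 0 obj << /Type /Catalog /StructTreeRoot 5 0 R >> endobj"
locked = b"%PDF-1.4\ntrailer << /Root 1 0 R /Encrypt 9 0 R >>"

for sample in (plain, tagged, locked):
    print(triage_pdf(sample))
```

Under the behavior described above, a file flagged `encrypted` is the case to worry about, while `tagged` simply indicates structure information that this first device release does not use.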

Sony Reader and Adobe PDF

There has been quite a bit of speculation on the nature of the PDF support in the Sony Reader PRS-500 product announced by Sony at CES. To clarify, the Sony Reader supports true Adobe PDF via Adobe software. This is an extension of a previously announced partnership that has delivered PDF support into several Sony consumer electronics products, including a car navigation system.
Adobe has developed a profile of PDF for consumer devices and mobile handset platforms (such as Nokia Series 60 and NTT DoCoMo FOMA). Because its implementation is optimized for limited resource devices and consumption-oriented use cases appropriate for smaller screens, our mobile PDF profile omits a number of features from the complete PDF 1.6 specification that is implemented by Adobe Reader 7.x for desktop operating systems. For example, at this time neither of the document protection (DRM) technologies incorporated into Reader 7.x (that support Adobe Content Server and Adobe Policy Server) are implemented in our device-oriented PDF software. Although it predates it, our current mobile profile turns out to be well-aligned with the recently finalized ISO 19005 standard, aka PDF/A.
I expect the Sony PRS product line to be a great success. More on this soon.