RTF, Word, and FB2 file conversion

EPUBGen is a project started by Peter Sorotokin. It’s a conversion utility for RTF files, Word files, and FictionBook (FB2) files. The output in each case is, of course, epub.

The project is open source, and available for download.

Note that there is also a .jar file for rtf2epub and it should work, but the main intent of this project is to provide source code and examples of the way things could be done. In other words, there’s plenty of room for developers to improve and enhance the conversion.

The project includes code to convert a couple of different formats to ePub, including generating all the required files and creating the package. The project also shows how to mangle embedded fonts and how to subset those fonts (thus reducing the size of the ePub).
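To give a feel for what font mangling involves, here is a minimal sketch of the XOR-based obfuscation described by the IDPF scheme (“http://www.idpf.org/2008/embedding”) as I understand it. It is not taken from the EPUBGen source, which implements the Adobe method instead. The class name, method names, and the 1040-byte prefix length are my own assumptions for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of EPUBGen.
final class FontObfuscationSketch {
    // Assumption: only the first 1040 bytes of the font are XORed.
    static final int OBFUSCATED_PREFIX = 1040;

    // Key = SHA-1 digest of the publication's unique identifier,
    // with spaces, tabs, and line breaks removed first.
    static byte[] deriveKey(String uniqueIdentifier) throws NoSuchAlgorithmException {
        String cleaned = uniqueIdentifier.replaceAll("[ \\t\\r\\n]", "");
        MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
        return sha1.digest(cleaned.getBytes(StandardCharsets.UTF_8));
    }

    // XOR the leading bytes of the font with the key, cycling through the key.
    // Applying the same call twice restores the original bytes.
    static byte[] obfuscate(byte[] font, byte[] key) {
        byte[] out = font.clone();
        int limit = Math.min(out.length, OBFUSCATED_PREFIX);
        for (int i = 0; i < limit; i++) {
            out[i] ^= key[i % key.length];
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        byte[] key = deriveKey("urn:uuid:00000000-0000-0000-0000-000000000000");
        byte[] font = new byte[2048];           // stand-in for real font data
        byte[] mangled = obfuscate(font, key);
        byte[] restored = obfuscate(mangled, key);
        System.out.println("round trip ok: " + java.util.Arrays.equals(font, restored));
    }
}
```

Because XOR is its own inverse, a reading system that knows the publication identifier can undo the mangling with the same routine. The rest of the font file is left untouched.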

Location

The code is hosted with the epub-tools site on code.google.com. Thanks to Liza Daly for organizing a location for ePub tools.

Building

I created an Ant file for building the rtf2epub converter, and it’s included. I expect there’s a need for build scripts for the others, but it’s really straightforward. FB2 conversion does have some additional dependencies; I’ll cover that in more detail in an upcoming entry to this blog.

Once you have the project building, it should be straightforward. You’ll find the ‘build.xml’ in the com.adobe.conv.rtf2epub folder. Of course, you’ll need Ant installed to use the Ant script.

ant

This should give you results similar to the following:

Buildfile: build.xml
createDistributionDir:
buildSrcZip:
removeClasses:
compile:
    [javac] Compiling 9 source files to .../com.adobe.conv.rtf2epub/bin
    [javac] Compiling 47 source files to .../com.adobe.conv.rtf2epub/bin
    [javac] Compiling 78 source files to .../com.adobe.conv.rtf2epub/bin
    [javac] Compiling 2 source files to .../com.adobe.conv.rtf2epub/bin
buildJar:
    [jar] Building jar: .../com.adobe.conv.rtf2epub/dist/rtf2epub-0.1.0.jar
buildBinZip:
    [delete] Deleting:  .../com.adobe.conv.rtf2epub/dist/rtf2epub-0.1.0.zip
    [zip] Building zip: .../com.adobe.conv.rtf2epub/dist/rtf2epub-0.1.0.zip
buildrtf2epub:
BUILD SUCCESSFUL
Total time: 3 seconds

Running

After building, or if you download the pre-built jar file, you can run the converter at the command line. Arguments are the path to the file to convert and the path to the epub file.

java -jar rtf2epub-0.1.0.jar In.rtf Out.epub
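If you have several files to convert, the same command line can be driven from a small Java wrapper. The following is only a sketch: the class and method names are made up for illustration, it assumes the jar sits in the current directory under the name shown above, and it simply reports whatever exit code the process returns without assuming what that code means.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical wrapper, not part of the project.
final class Rtf2EpubRunner {
    // Builds the same command shown above: java -jar <jar> <input> <output>
    static List<String> buildCommand(Path jar, Path input, Path output) {
        return Arrays.asList("java", "-jar", jar.toString(),
                input.toString(), output.toString());
    }

    // Derives the output name by swapping a trailing ".rtf" for ".epub".
    static Path epubPathFor(Path rtf) {
        String name = rtf.getFileName().toString();
        String base = name.toLowerCase().endsWith(".rtf")
                ? name.substring(0, name.length() - 4)
                : name;
        return rtf.resolveSibling(base + ".epub");
    }

    // Runs the converter for one file and returns the process exit code.
    static int convert(Path jar, Path input) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(buildCommand(jar, input, epubPathFor(input)))
                .inheritIO()
                .start();
        return p.waitFor();
    }

    public static void main(String[] args) throws Exception {
        Path jar = Paths.get("rtf2epub-0.1.0.jar"); // assumed location of the jar
        for (String arg : args) {
            int exit = convert(jar, Paths.get(arg));
            System.out.println(arg + " -> exit code " + exit);
        }
    }
}
```

Each path passed on the command line is converted in turn, with the epub written next to its source file.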

Some items worth noting

Of course Main.java is the entry for each of the converters, but much of the interesting code is in the Publication.java file.

FB2 conversion requires some additional dependencies, which I’ll note in an upcoming entry.

3 Responses to RTF, Word, and FB2 file conversion

  1. I suppose more tools is better than fewer, but what’s the motivation for a new conversion tool given calibre’s existing support for RTF and FB2 to EPUB conversion?

  2. Paul Norton says:

    This project isn’t really about rtf specifically; it’s more about how to handle certain things, like font sub-setting. We also did this as a way to demonstrate font mangling, something the IDPF ePub working group asked us to do.

  3. Well why didn’t you say so! (Ok, you did, and I just missed that part.) I never even considered the possibility of font sub-setting. Nifty. Do you know of any other open source OTF sub-setting code, or is this a first? And I notice that the font-mangling method implemented is “http://ns.adobe.com/pdf/enc#RC” and not “http://www.idpf.org/2008/embedding”. Is the plan still to support the latter in DE?