Untangling RSS in Both Java and ColdFusion

Anyone who has dealt with RSS extensively knows that programmatically accounting for all four versions at the same time can be a little tricky. If you are dealing with a known version of RSS, there is generally no problem, as any one particular version is easy enough to parse. What is far more challenging, however, is reliably parsing any arbitrary RSS feed from any arbitrary source. The first problem you run into is that the four versions of RSS in active use (0.91, 0.92, 1.0 and 2.0) are all somewhat different, and in some cases considerably different. The second problem is that RSS supports several optional tags, so it's difficult to consistently rely on specific pieces of data. Problem number three is that RSS has been extended to include modules, like Dublin Core. And finally, RSS is generated in so many different ways by so many different people and pieces of software that many feeds are just plain broken.
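
To make the first two problems concrete, here is a minimal Java sketch using only the JDK's built-in XML parser. It is not RSSU code; the class and method names (RssVersionSniffer, detectVersion, channelText) are hypothetical and exist only for illustration. It relies on two facts about the formats: RSS 0.91, 0.92 and 2.0 feeds have an rss root element with a version attribute, while RSS 1.0 feeds have an rdf:RDF root. It also shows the kind of fallback you need when an optional tag is missing.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only; not part of RSSU.
final class RssVersionSniffer {

    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    // Parse the feed text and return its root element.
    static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed to recognize rdf:RDF reliably
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            // Broken feeds are common, so callers should expect this.
            throw new IllegalArgumentException("Feed is not well-formed XML", e);
        }
    }

    // Return "1.0" for RDF-based feeds, the version attribute for <rss> feeds,
    // and "unknown" for anything else.
    static String detectVersion(String xml) {
        Element root = parseRoot(xml);
        if ("RDF".equals(root.getLocalName()) && RDF_NS.equals(root.getNamespaceURI())) {
            return "1.0";
        }
        if ("rss".equals(root.getLocalName())) {
            String version = root.getAttribute("version").trim();
            return version.isEmpty() ? "unknown" : version;
        }
        return "unknown";
    }

    // Find the first direct child element with the given local name, or null.
    static Element firstChild(Element parent, String localName) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && localName.equals(n.getLocalName())) {
                return (Element) n;
            }
        }
        return null;
    }

    // Read a channel-level tag, returning the fallback when the tag
    // (or the channel itself) is absent or empty.
    static String channelText(String xml, String tag, String fallback) {
        Element channel = firstChild(parseRoot(xml), "channel");
        if (channel == null) {
            return fallback;
        }
        Element child = firstChild(channel, tag);
        if (child == null) {
            return fallback;
        }
        String text = child.getTextContent().trim();
        return text.isEmpty() ? fallback : text;
    }

    public static void main(String[] args) {
        String rss2 = "<rss version=\"2.0\"><channel><title>Example</title></channel></rss>";
        String rdf = "<rdf:RDF xmlns:rdf=\"" + RDF_NS + "\" xmlns=\"http://purl.org/rss/1.0/\">"
                + "<channel><title>RDF feed</title></channel></rdf:RDF>";
        System.out.println(detectVersion(rss2));
        System.out.println(detectVersion(rdf));
        System.out.println(channelText(rss2, "description", "(no description)"));
    }
}
```

Even this much only covers well-formed feeds and channel-level tags. Modules and outright broken XML would each need their own handling on top of it.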

So what is an RSS aggregator to do?

RSS Untangle, or RSSU, addresses these problems by encapsulating the complexity of parsing and making sense of all versions of RSS. From the DRK 4 product page on Macromedia’s site:

RSS Untangle is a Java library that encapsulates the process of parsing RSS XML feeds. RSSU takes any version of RSS (0.91, 0.92, 1.0, or 2.0) and parses it into a straightforward, intuitive Java object model. Integrate RSSU into your ColdFusion applications through its custom tag interface or use it directly in your Java applications.

RSSU is implemented entirely in Java, but it comes with both a ColdFusion custom tag interface and a ColdFusion component interface, which makes working with RSS as easy as this:

<cfscript>
    // Create the RSSU parser component and parse a feed by URL.
    rssParser = createObject("component", "com.macromedia.rssu.Parser");
    channel = rssParser.parseUrl("http://www.macromedia.com/go/ccantrell_rss");
</cfscript>
<cfoutput>
The title of this feed is #channel.getTitle()#.<br>
Here is the description: #channel.getDescription()#.<br>
This feed's category is #channel.getCategory()#.<br>
This feed has #channel.getItems().size()# items in it.<br>
Here are all the items...<br>
</cfoutput>

You get the idea. RSSU can load feeds by URL or from the local file system. It supports and exposes all known tags in all RSS versions, including image tags, and will even retrieve the image bytes referenced in feeds for you. RSSU provides simple Java, ColdFusion custom tag, and ColdFusion component interfaces, and can just as easily generate RSS-style or RDF-style feeds as parse them. Finally, RSSU was tested against more than 11,000 arbitrary feeds from all over the web to ensure its ability to adapt to the huge variety of formats in use.

If there is a more robust, comprehensive, and versatile RSS parser out there, I sure don’t know about it.