Optimizing RSS Delivery


My colleague Darrick asked during lunch if there was any market interest in "push" technology for RSS subscriptions, rather than the somewhat hokey "pull" that all aggregators do today.

My first thought was that most end users don’t really know or care how their aggregator is getting data. It all looks like push to them, so any change in the underlying technology will be invisible and seen as adding no value.

But setting that aside, we talked about how good aggregators and blog servers are using conditional GETs today to minimize wasteful bandwidth usage.  Then we talked about how a true "push" model would imply a persistent connection, which is really what the Atom over XMPP proposal is all about.  But going from pull to push is a fairly big leap, since it demands a lot more of both the server and the client.
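For anyone who hasn't looked at a conditional GET up close, here is a minimal sketch of both sides of the exchange. The function names (`build_conditional_headers`, `serve_feed`) are made up for illustration, and the server side only checks the ETag; a real server would also honor `If-Modified-Since`.

```python
from typing import Optional


def build_conditional_headers(etag: Optional[str],
                              last_modified: Optional[str]) -> dict:
    """Aggregator side: echo back the validators saved from the last fetch."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def serve_feed(request_headers: dict, current_etag: str, feed_body: str) -> tuple:
    """Server side (hypothetical, ETag-only): 304 if nothing changed, else the full feed."""
    if request_headers.get("If-None-Match") == current_etag:
        return 304, ""          # Not Modified: no body is sent
    return 200, feed_body       # anything changed: the whole feed goes out again


# Unchanged feed costs almost nothing; a single new entry costs the whole feed.
headers = build_conditional_headers('"v1"', None)
print(serve_feed(headers, '"v1"', "<rss>...</rss>"))
print(serve_feed(headers, '"v2"', "<rss>...</rss>"))
```

The second call is the wasteful case: one change, and the entire feed is transferred again.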

Fortunately, there is another potential optimization that I will call conditional deltas, which basically means that for itemized data streams like RSS, an HTTP GET can indicate that it only wants changes since the last successful GET.  (Bob Wyman calls it RFC3229+feed, but try saying that in a conversation!)

The beauty of a conditional delta GET is that the bandwidth used to transport an RSS feed is never more than the size of the RSS feed, no matter how many requests or changes are made.  Today, when you read this blog entry in your aggregator, you should note that your aggregator probably had to download the entire RSS feed again and manually parse it to determine this was the new entry.  With a conditional delta GET, each entry is transported exactly once, saving oodles of bandwidth.
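Here is a rough sketch of how a server might answer a conditional delta GET. As I understand RFC3229+feed, the aggregator adds an `A-IM: feed` header next to its usual `If-None-Match`, and a server that can comply answers `226 IM Used` with `IM: feed` and only the newer entries. Everything else here is an assumption for illustration: the `serve_delta` name, the in-memory entry list, and especially the trick of using an integer version number as the ETag so the server can tell what the client has already seen.

```python
def serve_delta(request_headers: dict, entries: list) -> tuple:
    """Hypothetical feed server.

    entries: list of (version, title) pairs, oldest first.
    The ETag is assumed to be the latest version number in quotes.
    Returns (status, response_headers, titles_sent).
    """
    latest = entries[-1][0] if entries else 0
    current_etag = f'"{latest}"'
    client_etag = request_headers.get("If-None-Match")

    # Plain conditional GET behavior: nothing new, nothing sent.
    if client_etag == current_etag:
        return 304, {}, []

    # Delta behavior: the client asked for the "feed" instance manipulation.
    wants_delta = "feed" in request_headers.get("A-IM", "")
    if wants_delta and client_etag:
        try:
            seen = int(client_etag.strip('"'))
        except ValueError:
            seen = None  # unrecognized ETag: fall through to a full response
        if seen is not None:
            new_titles = [title for version, title in entries if version > seen]
            return 226, {"IM": "feed", "ETag": current_etag}, new_titles

    # Fallback: an ordinary full response.
    return 200, {"ETag": current_etag}, [title for _, title in entries]


feed = [(1, "First post"), (2, "Second post"), (3, "Optimizing RSS Delivery")]
print(serve_delta({"A-IM": "feed", "If-None-Match": '"2"'}, feed))
```

An aggregator that doesn't send `A-IM` still gets the full feed with a 200, so old clients keep working unchanged.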

At the end of our lunch conversation, I was struck by the fact that major blog hosting and RSS serving vendors really should WANT aggregators to support conditional deltas.  Companies like Google (Blogger), SixApart (Typepad, LiveJournal), FeedBurner, MySpace, and eBay collectively serve up millions of RSS feeds over and over again, re-sending each feed in full whenever just one new entry triggers the conditional GET.  Sure, they support conditional GETs today, but if aggregators supported conditional delta GETs, these RSS servers could save tremendously on bandwidth costs.

As it turns out, more and more aggregators and blog servers are supporting conditional deltas, including WordPress, FeedDemon, Bloglines, and Vista.  So, what are these RSS servers waiting for?