Try RTMFP and Client-to-Client Direct Streaming, With FP10 and Cocomo, Today!

One of my favorite features in Flash Player 10 is the ability to stream live audio/video via RTMFP, including “client-to-client direct” streaming, along with a sweet new voice codec, Speex. If you want access to these features, there’s a way to try them all out in your Flex apps, right now.


First, let’s talk a little about RTMFP, “client-to-client” (C2C), and Speex. Lots of new goodies to try, and they all have some subtleties. You can read the RTMFP FAQ for more detail, but let me try to summarize (and then, we’ll build an app) :

:: Why RTMFP? Well, the Real Time Media Flow Protocol is UDP-based, rather than TCP-based. Why does it matter? Let’s review our old CS Networking classes (way simplified, leaving a lot out) :

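TCP and UDP both sit on top of IP, but TCP does a lot more for you. Among the features it adds are guarantees on packet delivery and order of delivery. This is accomplished via a system of acknowledgement messages (ACKs) and retransmission of anything that goes unacknowledged. Strictly speaking, TCP has no explicit negative-acknowledgement message; the sender infers a loss from duplicate ACKs or a timeout. The effect is the same as if the ACKs had evil brothers, NACKs, reporting delivery failures: lost data gets sent again. UDP doesn’t really have much of this – packets come in when they come in, and are released into the application level of the network stack without guarantees.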

So why is UDP a better choice than TCP for live streaming? TCP is a *must* for something like file transfer, where losing any bytes leaves you with junk. For a live stream, however, if I (the receiver) lose a block of packets (say, 250ms worth), TCP insists that the missing data be retransmitted, and makes me wait for that retransmission before playing *anything* further. Even if subsequent blocks arrive in the meantime, I’m not playing them. So for the sake of avoiding a 250ms gap in audio, I’ve backed up the *whole stream* until I get it back. What’s worse, on congested networks, which can lead to packet loss, all this ACKing and NACKing can actually *make the problem worse*, adding retransmitted traffic to an already congested network, and leading you down a vicious “NACK hell” spiral. Using a UDP-based protocol, such as RTMFP, allows for packet losses to be ignored (for cases like live a/v streaming), and for users to get on with their lives. Simply by using RTMFP, latency of audio drops significantly.
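To put rough numbers on that, here is a toy simulation in Python (not Flash code, and not a model of the real RTMP or RTMFP wire behavior). It assumes 50ms packets, a 40ms one-way delay, and 300ms to detect a loss and receive the resend; all three figures and all function names are made up for illustration. It compares when packets are released to the application under in-order delivery versus release-on-arrival.

```python
# Toy model: each packet carries 50 ms of audio and is sent every 50 ms.
# All numbers are illustrative assumptions, not measurements.

PACKET_MS = 50        # audio carried per packet
ONE_WAY_MS = 40       # network delay, sender to receiver
RETRANSMIT_MS = 300   # extra time to detect a loss and receive the resend


def arrival_times(num_packets, lost):
    """First-attempt arrival time of each packet; None if it was lost."""
    return [
        None if seq in lost else seq * PACKET_MS + ONE_WAY_MS
        for seq in range(num_packets)
    ]


def ordered_release(arrivals):
    """TCP-like: a lost packet is resent, and nothing after it is released
    to the application until the resend has arrived."""
    released, latest = [], 0
    for seq, t in enumerate(arrivals):
        if t is None:
            t = seq * PACKET_MS + ONE_WAY_MS + RETRANSMIT_MS
        latest = max(latest, t)  # a packet cannot overtake an earlier one
        released.append(latest)
    return released


def lossy_release(arrivals):
    """UDP-like: packets are released on arrival; lost ones are skipped."""
    return [t for t in arrivals if t is not None]


arrivals = arrival_times(20, lost={5, 6, 7, 8, 9})  # 250 ms of audio lost
ordered = ordered_release(arrivals)
stall = max(r - (seq * PACKET_MS + ONE_WAY_MS) for seq, r in enumerate(ordered))
played = lossy_release(arrivals)

print(f"ordered delivery: worst extra delay {stall} ms, nothing missing")
print(f"lossy delivery: no extra delay, {20 - len(played)} packets missing")
```

In this model the in-order receiver also holds back packets that arrived on time (packet 10 shows up at 540ms but is not released until 790ms), while the lossy receiver simply plays through a 250ms gap.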

:: Why Speex? Speex is a much more focused codec than the Nellymoser codec in older Flash Players. It specifically targets voice encoding, meaning it can encode voice with higher quality, at a lower bitrate (win-win!). What’s more, it’s designed to be especially effective over UDP, meaning it handles lossiness relatively gracefully.

:: So, what is C2C streaming? Well, they won’t let me say “P2P”, because then people get visions of file pirating networks (and no, you can’t get there from here). But essentially, RTMFP allows clients to stream live audio/video directly from themselves to the recipient, without going through a server. This has the obvious benefit of reducing latency, since you’re likely to reduce the number of hops from source to destination. It also means that a hosted service, like Cocomo, doesn’t have to maintain as much infrastructure to allow you to stream.

All this said, RTMFP/C2C isn’t a panacea, and there are issues you’ll likely encounter in the real world. But before discussing that, let’s build an app! If you haven’t yet, go get Cocomo, set up your FlexBuilder for Cocomo and FP10, and enter the following (use your accountname/roomname, username, and password) :

audioCode.png

(download the file here)

Notice the magic here – in order to attempt RTMFP, enter protocol="rtmfp" in your AdobeHSAuthenticator. This will route your traffic to one of our super-secret experimental RTMFP servers, if it’s got enough capacity left. Eventually, all Cocomo servers will be of the super-secret RTMFP variety, and you won’t need the parameter, but for now we wanted to separate pre-pre-pre-beta machines in our cluster out so that they need to be explicitly requested.

:: Pitfalls of RTMFP and Solutions (in Cocomo)

There are a couple of very interesting problems in using RTMFP. First of all, being UDP-based means that a lot of firewalls just don’t allow it. It’s pretty common to attempt the connection and fail. Cocomo works around this by attempting a couple of connections at the same time, with a couple of protocols. Whichever succeeds wins the prize of being your session’s connection.
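The “try several, first success wins” idea can be sketched in a few lines of Python. This is only an illustration of the pattern, not Cocomo’s implementation: try_connect is a hypothetical stand-in that simulates a connection attempt with a sleep, and the protocol names and timings are assumptions.

```python
import asyncio


async def try_connect(protocol, delay_s, succeeds):
    """Hypothetical stand-in for a real connection attempt (simulated)."""
    await asyncio.sleep(delay_s)
    if not succeeds:
        raise ConnectionError(f"{protocol} blocked")
    return protocol


async def first_successful(attempts):
    """Start every attempt at once; return the first one that succeeds."""
    pending = {asyncio.create_task(a) for a in attempts}
    try:
        while pending:
            done, pending = await asyncio.wait(
                pending, return_when=asyncio.FIRST_COMPLETED
            )
            for task in done:
                if task.exception() is None:
                    return task.result()
        raise ConnectionError("all protocols failed")
    finally:
        for task in pending:  # the losers are no longer needed
            task.cancel()


async def demo():
    return await first_successful([
        try_connect("rtmfp", 0.01, succeeds=False),  # e.g. UDP blocked by a firewall
        try_connect("rtmp", 0.05, succeeds=True),
    ])


winner = asyncio.run(demo())
print(winner)
```

A failed attempt does not end the race; the loop keeps waiting for the remaining attempts and only gives up when all of them have failed.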

Secondly, even if RTMFP succeeds, this doesn’t mean every client can use C2C streaming. Firewalls can allow UDP but get in the way of C2C, or, more commonly, publishers can run out of bandwidth; since RTMFP doesn’t yet support application multicasting, the publisher essentially needs to pump out more bytes for every subscriber. At a certain point, the publisher’s uplink just won’t be able to handle it if too many people are on the receiving end.
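The bandwidth math is simple enough to write down. The sketch below uses invented figures (a 300 kbps stream and a 1000 kbps uplink) and hypothetical function names; it ignores protocol overhead and anything else competing for the uplink.

```python
def publisher_uplink_kbps(stream_kbps, subscribers):
    """Without multicast, the publisher sends one full copy per subscriber."""
    return stream_kbps * subscribers


def max_direct_subscribers(uplink_kbps, stream_kbps):
    """How many full copies of the stream fit into the publisher's uplink."""
    return uplink_kbps // stream_kbps


# Assumed figures: a 300 kbps a/v stream on a 1000 kbps uplink.
print(publisher_uplink_kbps(300, 5))      # kbps needed for five subscribers
print(max_direct_subscribers(1000, 300))  # subscribers the uplink can feed
```

With these numbers, five subscribers would need 1500 kbps, so the 1000 kbps uplink tops out at three direct receivers.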

Cocomo has a bunch of built-in smarts to handle these situations. It can tell if all your clients can handle RTMFP and/or C2C. If they all can, Cocomo’s client classes come to this agreement via messaging, and all of them switch to C2C. If someone new enters the room who can’t handle it, again agreement is reached among the clients, and they switch down to server-based streaming. Ditto if the number of receivers becomes too high. This is all done adaptively, as various conditions change, and you as a developer don’t need to worry about it at all.
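As a rough picture of the rule being applied, here is a Python sketch. It is a simplification under stated assumptions: Cocomo’s clients reach agreement by messaging each other, whereas this just evaluates the rule in one place, and the Client fields, the choose_streaming_mode function, and the receiver cap of 3 are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Client:
    name: str
    supports_rtmfp: bool
    supports_c2c: bool


def choose_streaming_mode(clients, max_receivers=3):
    """Return "c2c" only if every client can do it and the room is small
    enough; otherwise fall back to "server". max_receivers is an assumed cap."""
    receivers = len(clients) - 1  # everyone except the publisher
    everyone_capable = bool(clients) and all(
        c.supports_rtmfp and c.supports_c2c for c in clients
    )
    if everyone_capable and receivers <= max_receivers:
        return "c2c"
    return "server"


room = [Client("ann", True, True), Client("bob", True, True)]
print(choose_streaming_mode(room))

room.append(Client("cat", True, False))  # a newcomer who cannot do C2C
print(choose_streaming_mode(room))
```

Re-running the function whenever someone joins, leaves, or changes capability gives the adaptive behavior described above: the room moves up to C2C when everyone qualifies and back down to the server otherwise.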

So, yeah, it’s been fun playing with all the new toys. We still have some outstanding issues to resolve (flaky connections can still lead to sub-par experiences), but we’re actively working on these right now. Eventually, we want to see a world where live streaming is a staple capability in any Flex app, and this new goodness brings us one step closer =).