(Cross-posted from the forums.)
There’s a lot about OSMF for newcomers to digest, so I thought I’d share the thinking behind the core design of the media framework. Let’s start with a use case: how would you build a video experience like that of mtv.com, Hulu, or some other premium content publisher? The first thing to acknowledge is that consumer video sites like Hulu are not just about video. They provide multimedia experiences: video is choreographed with images (companion ads), overlays (SWFs), and even the branding of the underlying HTML page. And the experience is seldom confined to visual media alone; typically, the playback of the media happens in conjunction with CDN requests, ad server interactions, and tracking & reporting calls. There’s a lot happening under the hood.
So the core of a media framework needs to be a) supportive of any and all media types, and b) flexible in how it can integrate backend server calls that enable and complement the media experience.
With OSMF, we’re attempting to solve this problem by defining three key classes at the heart of the framework. First, a MediaElement represents a unified media experience. A MediaElement might represent something as simple as a single video, or it could represent the entire choreographed experience of a web video site. But given the dynamic nature of a site like MTV’s, if a MediaElement is going to represent the media experience as a whole, it too needs to be very dynamic. That’s where the second key class in the framework, IMediaTrait, comes in. A media trait represents a fundamental capability of a piece of media, without making any assumptions about the type of the media or how the media achieves that capability. Examples of traits include IPlayable, IViewable, IAudible, and ITemporal. A MediaElement that represents an image (ImageElement) would only contain the IViewable trait, whereas a MediaElement for a video (VideoElement) would have that trait plus IPlayable, IAudible, and ITemporal. An audio-specific MediaElement (AudioElement) would have IPlayable, IAudible, and ITemporal, but not IViewable (since it has no visual representation). Traits can come and go dynamically over time, which is the key to representing a unified media experience as a MediaElement. Taken as a whole, the complete set of traits makes up the vocabulary of the system. If you can map the behavior of a media type into this vocabulary, then it can leverage all of the functionality of the framework.
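To make the trait vocabulary concrete, here's a rough sketch of how client code might query an element's traits before acting on it. This assumes the trait-lookup style described in the developer documentation (a getTrait call keyed by a MediaTraitType constant); the exact names may differ from what ships, so treat this as illustrative:

```actionscript
// Sketch: acting on a MediaElement through its traits, without knowing
// whether it's a video, an image, or something else entirely.
// getTrait()/MediaTraitType usage is assumed from the developer docs.
function playIfPossible(element:MediaElement):void
{
    // An ImageElement won't expose PLAYABLE; a VideoElement will.
    var playable:IPlayable = element.getTrait(MediaTraitType.PLAYABLE) as IPlayable;
    if (playable != null)
    {
        playable.play();
    }

    // Likewise, only temporal media (video, audio) report a duration.
    var temporal:ITemporal = element.getTrait(MediaTraitType.TEMPORAL) as ITemporal;
    if (temporal != null)
    {
        trace("duration: " + temporal.duration);
    }
}
```

The point of the pattern: the caller never branches on the media's concrete type, only on which traits it currently exposes, which is what lets traits come and go dynamically.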
But what about non-visual media, such as integration of CDNs or tracking servers? Here’s the key: the trait-based vocabulary of the system applies to both visual and non-visual media. In other words, everything that a CDN or ad server needs to do can be expressed through one or more traits. For example, one of the traits in the framework is ILoadable, which represents the process needed to transform an input (such as a URL) into ready-to-play media — i.e. the load process. If a CDN plugin needs to do authentication or other custom logic, all it needs to do is map that custom logic into the ILoadable API. Under the hood, the load process (as represented by ILoadable) can work with NetConnections and NetStreams, or with Flash’s Loader class, or with the Sound/SoundChannel API, or with custom RTMP or HTTP requests and responses. In a sense, OSMF is taking all of the idiosyncrasies and incompatibilities of the different media-specific Flash APIs and abstracting them into a common API.
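As a sketch of what mapping custom logic into the load process might look like, here's a hypothetical CDN loader. VideoLoader and ILoadable come from the post; the subclass, the override signature, and the signResource helper are assumptions for illustration, not the exact OSMF plugin API:

```actionscript
// Sketch: a CDN plugin folding authentication into the load process.
// Everything beyond VideoLoader and ILoadable is illustrative.
public class AuthenticatingLoader extends VideoLoader
{
    override public function load(loadable:ILoadable):void
    {
        // Custom CDN logic: sign the resource's URL before loading.
        // signResource() is a hypothetical helper that appends an
        // auth token obtained from the CDN to the resource URL.
        var signed:ILoadable = signResource(loadable);

        // Hand off to the standard video load. The rest of the
        // framework only ever sees ILoadable's load-state changes,
        // so the custom auth step is invisible to it.
        super.load(signed);
    }
}
```

Because consumers interact only with the ILoadable trait, swapping this loader in for the stock VideoLoader requires no changes anywhere else.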
Hopefully by now it’s at least somewhat clear how to represent a single piece of media. But for complex media experiences with many moving parts, we need the third key class in the framework. A CompositeElement is a MediaElement that represents a composition of multiple MediaElements. The two specific examples are SerialElement, which represents a set of MediaElements that play in sequence; and ParallelElement, which represents a set of MediaElements that play simultaneously. These two classes allow you to build complex media experiences with many different MediaElements. I could go on about this, but it’s probably more instructive to post a code snippet:
// Create a root-level parallel element.
var parallel:ParallelElement = new ParallelElement();
// Add a sequence of videos to the root.
var videoSequence:SerialElement = new SerialElement();
videoSequence.addChild(new VideoElement(new VideoLoader(),new URLResource("http://www.example.com/video1.flv")));
videoSequence.addChild(new VideoElement(new VideoLoader(),new URLResource("http://www.example.com/ad.flv")));
videoSequence.addChild(new VideoElement(new VideoLoader(),new URLResource("http://www.example.com/video2.flv")));
parallel.addChild(videoSequence);
// Add a sequence of rotating banners in parallel:
// - The first banner doesn't appear until five seconds have passed.
// - Each banner shows for 20 seconds.
// - There is a 15 second delay before a subsequent image shows.
var imageSequence:SerialElement = new SerialElement();
imageSequence.addChild(new TemporalProxyElement(5));
imageSequence.addChild(new TemporalProxyElement(20, new ImageElement(new ImageLoader(),new URLResource("http://www.example.com/image1.jpg"))));
imageSequence.addChild(new TemporalProxyElement(15));
imageSequence.addChild(new TemporalProxyElement(20, new ImageElement(new ImageLoader(),new URLResource("http://www.example.com/image2.jpg"))));
imageSequence.addChild(new TemporalProxyElement(15));
imageSequence.addChild(new TemporalProxyElement(20, new ImageElement(new ImageLoader(),new URLResource("http://www.example.com/image3.jpg"))));
parallel.addChild(imageSequence);
// Add the whole thing to the MediaPlayer.
var player:MediaPlayer = new MediaPlayer();
player.media = parallel;
There, in about twenty lines of code, is your (basic) multimedia experience. (Note that I haven’t covered everything that’s in the code in this post — check out the developer documentation to learn about the rest.) Yes, we’re still a far cry from reproducing Hulu or MTV. But this is just a warmup. Hope you’ll stick around to see where we’re going with this, and help us get there.