Posts in Category "Video"

Is video so hard?

The VIDEO tag isn’t as hard to understand as media campaigning would have you believe.

A good example is at TechCrunch today: “MeFeedia Reports 63 Percent Of Web Video Is HTML5-Friendly”. Back in May 2010 the same TechCrunch writer wrote “H.264 Already Won — Makes Up 66 Percent Of Web Videos”. This in itself is pretty confusing…. 😉

The background is that an encoding site measured what codecs their clients desire, then equated the popularity of H.264 (and any VP3/VP8) with “HTML5”, which advocacy sites then equated with “iPad”.

The presence of an H.264-encoded video does not mean the site has a VIDEO tag to invoke it… retagging a site and providing a control UI does not come automatically with video compression. The numbers, as presented, mean nothing. But the headlines have already attracted further confusion, with weak headlines like “Apple Won The War Against Flash”.

My October post “Who Needs War?” still contains background on how video works, and the title was a soft allusion to those who need to posit some form of conflict to justify their position. Mike Chambers also explained how these basic technical aspects are being misrepresented.

There are so many blindspots and contradictions with this persuasion campaign. Take a look at, and their maps of popular browsers across the world. Firefox is the biggest “HTML5” desktop browser, yet doesn’t decode H.264 video. Opera is the biggest “HTML5” mobile browser, yet also doesn’t do VIDEO/H.264. Apple is just one small part of the total “HTML5 VIDEO” discussion. H.264 != “HTML5” != Apple.

More confusion: “The choices between Flash, H.264, Ogg, and VP8 means that if a video publisher wants full user support (and they should), they’ll need to support several formats for each video.”

Makes no sense. Adobe Flash Player has used H.264 for three-and-a-half years now, reaching +80% consumer support within six months. There is no “choice between H.264 and Flash”, just as there’s no real comparison between your groceries and their shopping cart. One contains the other. This is very simple to understand.

In the real world, to show video to everyone, you need Flash, and then something for Apple devices. Doesn’t require re-encoding the video, just re-working the site. Maybe provide something for older devices too.
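Here is a rough sketch of that idea in Python, purely as illustration: one H.264 file sits on the server, and the site only decides which invoking markup to emit for each visitor. The `Client` flags, the `choose_markup` function, and the file name are all hypothetical, invented for this example rather than taken from any real site or library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Client:
    """Hypothetical description of what a visitor's browser can do."""
    has_flash: bool
    video_tag_h264: bool  # VIDEO tag support AND an H.264 decoder behind it


# Hypothetical file name; the point is that it is the same file in every branch.
VIDEO_FILE = "movie.mp4"


def choose_markup(client: Client) -> str:
    """Return a label for the invoking markup to emit for this client."""
    if client.has_flash:
        return f"OBJECT/EMBED -> Flash UI -> {VIDEO_FILE}"
    if client.video_tag_h264:
        return f"VIDEO tag -> {VIDEO_FILE}"
    # Assumed fallback for older devices: just link to the file.
    return f"plain download link -> {VIDEO_FILE}"
```

Every branch points at the same encoded file. Only the markup around it changes, which is the "re-working the site" part.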

And to understand the real world, do we need techblogs? The evidence they’re giving doesn’t lend confidence….

Balancing diverse needs

Some heat around video features today… browser vendors’ VIDEO plans don’t include a doorlock, and rebuttal… not all content providers can afford to provide for less-capable audiences, and rebuttals.

Made me think of a quote from Adobe co-founder Charles Geschke a few years ago, at a Kendall Whitehouse interview for Knowledge@Wharton:

One of the things I talk a lot about is the necessity to juggle all of the constituencies that have an interest in the business: shareholders, customers, employees, vendors, and the communities in which we operate. Those constituencies are all mildly in conflict with one another in terms of what’s best for them. Your job as a leader in a company is to find an appropriate way to juggle those conflicting interests so everybody feels like they’re getting a fair deal, without letting any one dominate the others because they’ll drag your company down.

Sustainable technological solutions work for more people… balancing the needs of consumers, AND creators, AND investors, AND all the other diverse groups which are affected. If any constituency feels slighted or oppressed, then things won’t move forward as easily.

Asking video creators to create multiple interfaces for intentionally-hobbled devices, or telling creators that they can’t even install a lock on their front door… that’s as unfriendly as saying “use another browser” or “install this new plugin to watch” would be to consumers. Finding solutions which work for diverse groups is harder, but, in the long run, more fun.

VIDEO debate, cutting to the chase

Earlier this week Google’s Chrome team announced that they’d no longer be including an H.264 video decoder in their browser. I haven’t seen any updates from them responding to the massive third-party conversation following their use of the fluffy and prone-to-dispute “because it’s open” explanation.

But that massive conversation seems to hide more than it reveals — burying us all under word fatigue. Here are some simple basics:

  • The VIDEO tag was simply not well-considered at the outset. Its original rationale was: “You don’t require a plug-in to view images… video is the next natural evolution of that.” But from the very start the practical questions about use were swept under the rug… at least until the rug started piling up too high. It wasn’t sustainable.
  • The VIDEO tag serves two different constituencies: those with an “Open Web” banner who wanted to expand their own scope (lookin’ at ya, Mozilla), and those who have invested in Apple and their devices. These two groups have had different needs. The original proponents from Mozilla and Opera saw their desires for a royalty-free codec (for royalty-free tooling) hijacked by Apple fans’ needs for an H.264-baseline implementation. It wasn’t sustainable.
  • Video publishers need the VIDEO tag for one purpose only: to support Apple’s non-standard HTML browser and its denial of third-party extensibility via APPLET, OBJECT, and EMBED. [I’m copying that linked comment below, because TechCrunch’s Disqus commenting system doesn’t seem very web-friendly.] Flash’s popularization of H.264 meant that much video did not need to be re-worked for Apple’s standard-breaking devices, just the tagging and — significantly — the interactive and adjunctive features of anything beyond plain linear video playback. This may have been endurable, but the demand from the start was not sustainable.
  • Does Chrome’s H.264 move affect Chrome users? I don’t know of any (non-ideological, non-Apple-only) video which uses only VIDEO/H.264. The only real effect it seems to have is to blunt Apple’s campaigning by removing a nominal incentive. iPad users may be most affected, but impact on Chrome users seems minimal. Its removal does not seem unsustainable.
  • In this week when we’ve seen Arizona shootings and the easy (and incorrect) apportionment of blame, it’s quite unsettling to see how techblogs go on about the “war” and “blood feud” and other speech which is meant to incite, and earn more ad revenue. For goshsakes, consider how you’re speaking. Reflexive hatred is not sustainable.

Beyond the pettifoggery of This Week’s Blog Outrage, a simple truth remains: We humans are now witnessing a migration of video interactivity to pocket devices. People all around the world, from varied economic strata, can now capture what they see and share it with others. It’s right up there with the invention of printing and the invention of the Internet. Instead of arguing about branding issues we should be thinking of how people will want to use video, how they will need to use video. This would be more sustainable. Would be more useful, and kinder, too.

Who needs war?

How does web video work? You’ve got a video file, compressed as On2 VP6 or VP8 or H.264 or whatever. You’ve got some type of interface layer, whether a standalone Real or QuickTime controller, or a Flash-based UI (OSMF, custom, etc). You’ve also got some markup in the HTML page to invoke the whole thing (OBJECT/EMBED, VIDEO). Then you’ve got any backend services, such as adaptive streaming, random access, access controls, clustering, advertising integration, analytics services, annotation layers.

Basically four parts: the compressed linear video itself, the user controls, the invoking markup, and any backend work.
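One way to keep those four parts apart in your head is to write them down as separate fields. This is only a sketch in Python. The `WebVideo` class and the three example sites are hypothetical, made up to show that the codec and the markup vary independently.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class WebVideo:
    """Hypothetical record of the four parts of one web video deployment."""
    codec: str    # the compressed linear video itself, e.g. "H.264"
    ui: str       # the user controls
    markup: str   # the invoking markup in the HTML page
    backend: List[str] = field(default_factory=list)  # any backend services


# Three made-up deployments, for illustration only.
flash_site = WebVideo("H.264", "Flash (OSMF)", "OBJECT/EMBED",
                      ["adaptive streaming", "analytics"])
tablet_site = WebVideo("H.264", "browser default controls", "VIDEO")
theora_site = WebVideo("Ogg Theora", "browser default controls", "VIDEO")

# Same codec, different markup: an H.264 file does not imply a VIDEO tag.
same_codec_other_markup = (flash_site.codec == tablet_site.codec
                           and flash_site.markup != tablet_site.markup)

# Same markup, different codec: a VIDEO tag does not imply H.264.
same_markup_other_codec = (tablet_site.markup == theora_site.markup
                           and tablet_site.codec != theora_site.codec)
```

Counting files by codec tells you about the first field only. It says nothing about the other three.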

Techmeme’s frothy again today about a blogpost from a firm which indexes videos hosted on a set of video sites. The followup headlines are rather dramatic, but here’s what was measured: “Our final tally included only video that can be delivered within HTML5’s ‘video’ tag. In the vast majority of cases, this means videos were encoded in H.264.”

As I read that, and from checking the graph’s caption, the core idea sounds like “Across a range of video-hosting sites, 54% of the H.264 files which had a SWF-based UI also make some use of the VIDEO tag.” (open to correction)

If so, that’s reasonable… Apple’s devices have dominated the press the past year, and the world’s existing H.264 content would be invisible to that new audience without using the VIDEO tag. Considering the marketing pressure, it’s surprising this isn’t higher.

But some of the blogposts with takeaways like “Apple wins the Web” and “Victory in HTML5 war” are just over-the-top, particularly when they’re still confusing a codec (usually H.264 among these folks) with a presentation format (usually Flash).

There’s also still confusion between the VIDEO tag and a codec… Firefox and Opera are very popular on desktop and mobile, but their VIDEO tag does not equate to H.264.

“54 percent of Web video is now compatible with HTML5”… what could that mean? It seems more a phrase about branding than technology. Branding needs wars, technology doesn’t.

The reality is that we humans are gaining _far_ more communicational abilities with video now… screens on the desk, screens in the pocket, screens on the wall. What we choose to watch will be “out there”, available to all our screens. We expect to have a consistent personal experience with what we watch, regardless of the current device.

We’ll also need a diversity of backend services to create a consistent personal experience across screens, to connect those screens.

Finding ways to bring about sustainable ecologies in these new technologies… that’s more interesting and useful than a lot of the talk out there these days.

Different parts to “online video”… a compressed video file, the interface used to control it, the markup used to invoke it, and any backend services in use. They work together. Creating wars among them is more an exercise in branding than anything real.

Tip: Keep an eye on OSMF

Summary: If you help people make choices in web technology, then it would likely be profitable to get the new Open Source Media Framework onto your personal radar now. OSMF is an industry-wide collaborative effort to make it easier, faster, cheaper and more reliable to develop advanced video interfaces for desktop and mobile.

How we got here: Real Networks started web video in 1997, before Apple and Microsoft expanded The First Codec Wars from CD-ROM to Web… in 2002 the ubiquitous Macromedia Flash cross-browser extension added video, and although fragmentation remained an issue for a while, people like Jens Brynildsen clearly saw the trend… by 2005 phenomena like YouTube started showing how useful and popular play-on-demand video could be.

Popularity of web video has exploded since then… demand has gone viral. Meanwhile feature requests have increased too, from download-and-play to progressive streaming to live streaming to adaptive streaming to rights-management to advertising revenue to analytics to DVR functions to multi-feed to social annotations to… the list goes on. Content providers needed to continually reduce their increasing delivery costs, while the complexity of serving the video also increased.

How to reconcile delivery costs and feature costs? One tack has been to move to ordinary HTTP servers, rather than dedicated media servers. But this requires that much of the “smarts” in a dedicated media server be replicated in the client for a cheaper HTTP server. This increases development costs. But the Open Source Media Framework is designed to slash those development costs — tapping into the whole industry for best practices for a clientside presentation layer, making a framework which all stakeholders can expand.

Check out this June 10 post from Kevin Towes… he gives a deeper overview of the feature requirements and the trends. Then read Greg Hamer’s Devnet article on how to approach OSMF. Click on some of the links that interest you. After reading both these essays you’ll have a much clearer view of where video growth is going than will most of the other people who might advise your friends.

I think OSMF will be very useful in the real world. Lots of producers are now figuring how to minimize “The iPad Tax” of multiple deployment paths, and OSMF workflows will naturally integrate with the most efficient solutions. When large numbers of browsers start supporting the VIDEO/H.264 and VIDEO/VP8 approaches, the “HTML5” UIs will likely integrate or parallel the OSMF methodologies to tap into its broader ecology. Mobile delivery adds multiple complexities, and OSMF efforts are explicitly designed to deal with them. And, at the leading edge, the community approach of OSMF will just make it easier to deliver better features, cheaper. Just as with Jens’ piece back in 2003, the trends are clear if you look at them.

Anyway, that’s my pitch… if you ever advise people about video at all, then spending a few minutes now examining the full release of the Open Source Media Framework will guarantee your video expertise into the future…. 😉

CES 2010 thoughts

This week’s Consumer Electronics Show should deliver on some early guidance given last year, about the home screen finally becoming an interactive communications device. I don’t know what the announcements will be, but here are some tips to put them in context.

Main theme: As phones and televisions become computers, nearly all manufacturers are optimizing for SWF as an interface layer.

  • This is only the very first generation. The widespread adoption by manufacturers signals a good future, but it will take us all a few generations to really understand multi-device interface design.
  • The early announcements may not make much mention of Flash. That’s normal — they’re announcing their new device, not a universal runtime. For the rest of us the big news is common cross-device capability, but most of the press material should be about device differences.
  • The early shipments will likely have differences in what’s available when — many, many schedules are being cross-plotted to each other, across an exceptionally large range of companies. But a key requirement in Open Screen Project is over-the-air updating. Player fragmentation should be relatively low.
  • I don’t know what the business opportunities will be, what types of stores and financial arrangements will come to pass. Apple’s App Store did a good thing by cutting developers a check. We’ll learn more of different types of contracts over the coming year.
  • Some devices may use Flash mainly in-the-browser, while others use it as a native interface layer, or as a user-application layer, or perhaps even as a video overlay layer. Particularly in this very first generation, different manufacturers may make different choices.
  • Most of the “small screen” news should hit next month, at Mobile World Congress. One of the difficulties here is that today’s “World Wide Web” has been designed for workstation screens. Some sites do try to degrade-to-mobile, while others make a special mobile site, but webdesign-for-devices has in general been a moving target. Adobe has been doing outreach to many key websites to improve the user experience, but The Web as a whole may be a little rocky at first.
  • Many of these devices will have HTML renderers too. Brands and versions — and therefore capabilities — will vary. Flash will offer more advanced capabilities, more predictably, more widely.
  • There will be a very strong tendency in popular conversation to port today’s workstation use-cases to new devices. But your TV probably doesn’t need an email program, nor your car a WWW browser. We have to figure out how people can use the entire Internet, most appropriately, when they’re using the new device. We humans tend to see new things in terms of the past. Follow your instinct, not the crowd.
  • My own instinct is that the big home screen will take off when it adds a social layer atop viewing, when it’s used as a two-way communication device with distant people you already know. Early social networks like Twitter, Facebook, even Digg give only a hint of how we’ll naturally interact with our TVs. Think outside the box.

Summary: We’re in a transition year. Very exciting time, very promising, but 摸着石头过河 — we must cross the river by feeling the stones with our feet, it’s hard to predict the exact path beforehand. The other side sure does look nice, though…. 😉

Helping video understanding

Apple-oriented pundit John Gruber writes of his understanding of showing video in browsers. Here are some tidbits which may help:

  • HTML 4.0 did deprecate the realworld EMBED tag, but the Apple/Google/WhatWG “HTML5” proposals bless it again.
  • VIDEO tag is not a Standard, nor a W3C Recommendation… this shouldn’t stop your experimentation, but neither should the label “standard” stop you.
  • Congratulations on moving past the very proprietary “QuickTime”… the stewardship of that proposal has been disappointing.
  • If thinking of VIDEO tag, you must think of codecs too… Ogg Theora decoders and VIDEO tag parsing do not completely overlap.
  • It’s fortunate that this experimentation was performed on a site where two-thirds of the visitors use Apple-branded browsers. Most sites don’t enjoy the luxury of such near-monoculture.
  • Yes, buffering and startup behavior, like codecs, are issues which were not adequately addressed in the spec. Specifying a VIDEO tag was an easy first step, but does not suffice… captioning, for instance, has been a particularly divisive omission. The WHATWG proposals make more sense as a “blessing” of what Apple was going to do anyway.
  • “I think the HTML5 spec should be changed such that the value of the autobuffer attribute may not be ignored.” May be difficult, due to last call.
  • “Why would I publish content using a technology that I personally block by default?” Because it works.

Two incidental followups: I’m still seeking the factual basis of that surprising claim “We know for a fact that Adobe has no interest in the Mac implementation’s quality,” and if you could correct that prior “arrogance” claim to reflect what I actually said then that would be great, thanks.

[Comments: Ideas are good, but trolls won’t be fed.]

Macintosh enfranchised in Japan

Japan Broadcasting Corporation (NHK) has moved its On Demand website to Flash video. Previously it used Windows Media, which placed barriers in front of Mac users… better than QuickTime, which placed barriers in front of Windows users, but still not ideal.

I don’t have much to add to the story myself, just trying to bring the news into the English-only weblogs. There’s a Japanese entry in an Adobe blog (Google translation), and a few Japanese news results, of which this one (translation) seems the source.

The video producers want their work to reach their audience. Their audience chooses a diverse assortment of browsers and operating systems. Flash works with all of them. It’s a pretty simple calculus.

(Flame-arresters: “HTML5” and either Ogg Theora or H.264 would not help the existing audience. The garden wall around Apple’s iPhone still disenfranchises those audience members, but is not significant cross-culturally.)

How does video work?

When you open up a web browser and type in a URL, how do you then see a video in the webpage?

There are a couple of different levels of requirements, but it’s not hard to understand how they stack together. You’ve got to have a computer of some type, of course, as well as an operating system. This lets you load applications like WWW browsers.

The HTML markup that a browser processes could be written in any of several ways, but at some point will call up a locally-installed “codec” (short for “encoder/decoder”) to process the incoming video stream, and display it within the browser window.

Most of the time the markup asks the browser to ask Adobe Flash Player to render the video, because Player is already on (nearly) all the world’s machines, and Adobe has paid the licensing costs to distribute a good set of codecs. Sometimes the markup asks browsers to use other cross-browser plugins, such as QuickTime or Windows Media or Real Video. Sometimes the markup asks the browser to call a codec directly, as with the VIDEO tag or iPhone.

But that’s the gist of it — a given video file requires a corresponding local video decoder. It’s a pretty simple thing.

Of course, there are ways to make the story more complicated, by adding more details…. 😉

For instance, each codec has different strong points, different angles to appreciate. Some of the older codecs have been donated by their creators for any type of redistribution. The more efficient codecs usually charge a license for redistribution to recoup their development costs. Flamewars among codec fans have historically been amusing, but are a separate topic from understanding what a codec is and does.

The server can introduce additional complicating details too. For instance, some servers can deliver different video streams to different client requests, depending on what codecs and transfer rates the clients have. Other servers can confirm that the request comes from a paid-up subscriber (“DRM”) or offer playback amenities such as scrubbing and variable-speed playback. Still other video servers can make use of both the encoder and decoder on the playback machine, for videochat and such.
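The first of those server tricks, picking a stream to match the client, can be sketched in a few lines of Python. Everything here is an assumption for illustration: the `Rendition` class, the `select_rendition` function, the catalog entries and the bitrates are made up, and real media servers are far more involved.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class Rendition:
    """Hypothetical encoded version of one video held on the server."""
    name: str
    codec: str
    kbps: int  # bitrate needed to play this rendition smoothly


def select_rendition(renditions: Iterable[Rendition],
                     client_codecs: Set[str],
                     client_kbps: int) -> Optional[Rendition]:
    """Pick the highest-bitrate rendition the client can decode and sustain."""
    playable = [r for r in renditions
                if r.codec in client_codecs and r.kbps <= client_kbps]
    # default=None covers the case where nothing on the server fits this client.
    return max(playable, key=lambda r: r.kbps, default=None)


# Made-up catalog for illustration.
CATALOG = [
    Rendition("clip-low.mp4", "H.264", 400),
    Rendition("clip-high.mp4", "H.264", 1500),
    Rendition("clip-low.ogv", "Ogg Theora", 500),
]
```

A client with only an Ogg Theora decoder would get `clip-low.ogv` no matter how fast its connection is, because the decoder is the hard requirement.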

And, of course, the above is talking about only the most basic type of video, the linear playback stream that we’re used to seeing on movies and TV. Usually you’d also expect, at minimum, some type of playback interface in the webpage (stop/pause/play etc), and there are other interactive features such as captioning, bookmarking, branching, realtime editing, the list goes on. The codec handles just the linear video stream, which is usually only one part of the overall video experience.

But those are the basics: you’ve got one or more video files on the server, and the markup asks the browser to find an appropriate local decoder. That’s how your video gets viewed.
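If it helps, here is that matching step as a small Python sketch. It is a simplified model and not how any particular browser is implemented. The `pick_source` function, the file names and the decoder sets are all hypothetical.

```python
from typing import Optional, Sequence, Set, Tuple

# (file name, codec) pairs in the order the markup lists them. Hypothetical files.
Source = Tuple[str, str]


def pick_source(sources: Sequence[Source],
                local_decoders: Set[str]) -> Optional[str]:
    """Return the first listed file for which a local decoder exists."""
    for filename, codec in sources:
        if codec in local_decoders:
            return filename
    return None  # no matching decoder on this machine: nothing plays


SOURCES = [("clip.mp4", "H.264"), ("clip.ogv", "Ogg Theora")]

# Illustrative decoder sets, not a claim about any specific browser version.
h264_only = pick_source(SOURCES, {"H.264"})
theora_only = pick_source(SOURCES, {"Ogg Theora"})
neither = pick_source(SOURCES, set())
```

The markup can list as many files as it likes. What actually plays depends on the decoder sitting on the local machine.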

(My apologies if you already knew the above info, and I ended up wasting your reading time… I’m hoping it will be read and understood by more of the weblogs listed on Techmeme, so that in the future all of us readers will end up wasting less time reading soap-opera dramas atop a “web video” storyline.)

What drove Flash?

Michael Calore, at WIRED Webmonkey, has some current estimates of possible adoption dates for different features within “HTML5”. A useful read.

I’m more interested in a minor quote in there: “What’s driving the most successful [browser] plug-in, which is [Adobe Flash Player], is video support.”

I suspect that might be the other way ’round… Macromedia Flash Player had been solidly above 90% consumer support for many years before video was introduced in 2002. Early adopters started using video via Flash in 2003, but it wasn’t until 2004 that we started seeing businesses built atop it, and by 2006 there was widespread awareness.

Why? Video took off only after the production costs were lowered: once producers did not have to multiply-encode video for different audiences, and once support costs for consumer installations were removed. Adobe Flash Player added video in early 2002, then became a practical choice towards late 2004, after consumer support levels rose above 90%.

The same kind of dynamic occurred with “Ajax” a few years back… consumer support was already high for Microsoft browsers, and as soon as browsers from Mozilla and Apple added support for live XML requests, developers could immediately build websites which large audiences could immediately view. When Jesse James Garrett coined the name on Feb 18 2005, those startling new “Ajax” projects would magically “just work” for their audiences.

Both Ajax and Flash video were considered “overnight sensations”, even though the groundwork had actually taken many years. The hype started only after the capability was already there.

Anyway, linear video playback on a notebook is certainly a lucrative area right now… lots of firms are making lots of money from massive audiences via their video content — popular video is certainly “a shiny object” these days — so I can understand the mental shortcut of thinking that video drove Flash.

But history shows that it was Flash’s total ecology of creators and audiences — all the exceptionally diverse people who found value in using Flash — which successfully drove the later practicality of in-browser video. In a sense, sites like JibJab and NewGrounds made sites like YouTube possible.

Adobe today? The company still establishes publishing technologies, then profits within these new, wider ecologies. That pattern is embedded deep within its corporate culture. Yesterday’s view of video will not be tomorrow’s view of video, and Adobe is trying to solve newer, harder problems.