Diary Of An x264 Developer

02/22/2010 (3:05 pm)

Flash, Google, VP8, and the future of internet video

Filed under: google,H.264,HTML5,Theora,VP8,x264 ::

This is going to be a much longer post than usual, as it’s going to cover a lot of ground.

The internet has been filled for quite some time with an enormous number of blog posts complaining about how Flash sucks–so much that it’s sounding as if the entire internet is crying wolf.  But, of course, despite the incessant complaining, they’re right: Flash has terrible performance on anything other than Windows x86 and Adobe doesn’t seem to care at all.  But rather than repeat this ad nauseam, let’s be a bit more intellectual and try to figure out what happened.

Flash became popular because of its power and flexibility.  At the time it was the only option for animated vector graphics and interactive content (stuff like VRML hardly counts).  Furthermore, before Flash, the primary video options were Windows Media, Real, and Quicktime: all of which were proprietary, had no free software encoders or decoders, and (except for Windows Media) required the user to install a clunky external application, not merely a plugin.  Given all this, it’s clear why Flash won: it supported open multimedia formats like H.263 and MP3, used an ultra-simple container format that anyone could write (FLV), and worked far more easily and reliably than any alternative.

Thus, Adobe (actually, at the time, Macromedia) got their 98% install base.  And with that, they began to become complacent.  Any suggestion of a competitor was immediately shrugged off; how could anyone possibly compete with Adobe, given their install base?  It’d be insane, nobody would be able to do it.  They committed the cardinal sin of software development: believing that a competitor being better is excusable.  At x264, if we find a competitor that does something better, we immediately look into trying to put ourselves back on top.  This is why x264 is the best video encoder in the world.  But at Adobe, this attitude clearly faded after they became the monopoly.  This is the true danger of monopolies: they stymie development because the monopolist has no incentive to improve their product.

In short, they drank their own Kool-aid.  But they were wrong about a few critical points.

The first mistake was assuming that Linux and OS X didn’t matter. Linux is an operating system used by a tiny, tiny minority of end-users, yet those users make up a huge portion of the world’s software developers and web developers.  Merely going by user count suggests that Linux isn’t worth optimizing for; accordingly, Adobe allocated just one developer, one single developer, to the entire Linux platform.  And in terms of OS X, Macs have become much more popular in recent years — especially among that same group of developers.  Furthermore, Apple is a huge company; Flash performing terribly on their platform is a very good incentive for Apple to want to position themselves in opposition.  Thus, Adobe made enemies of Apple and developers alike.

The second mistake was attacking free software. Practically all the websites on the internet use free software solutions on their servers — not merely limited to LAMP-like stacks.  Youtube, Facebook, Hulu, and Vimeo all use ffmpeg and x264.  Adobe’s H.264 encoder in Flash Media Encoder is so utterly awful that it is far worse than ffmpeg’s H.263 or Theora; they’re practically assuming users will go use x264 instead.  For actual server software, the free software Red5 is extraordinarily popular for RTMP-based systems.  And yet, despite all this, Adobe served a Cease&Desist order to servers hosting RTMPdump, claiming (absurdly) that it violated the DMCA due to allowing users to save video streams to their hard disk.  RTMPdump didn’t die, of course, and it was just one program, but this attack lingered in the minds of developers worldwide.  It made clear to them that Adobe was no friend of free software.

The third mistake was not supporting a free software Flash implementation. The lack of a good free software Flash client is not really Adobe’s fault; it has become clear that the Gnash folks are completely incompetent and nobody else seems interested.  Cody Brocious wrote his own Flash rendering code in a matter of days for the purposes of a Flash->iPhone app converter; he only stopped because Adobe released their own mere days before he had intended to release his.  The Flash spec is open, and there are existing free software implementations of every single codec in Flash: there’s really nothing stopping a good free implementation.  But Adobe’s mistake is one of inaction: they didn’t push for it because it wasn’t important to them.

By comparison, look at Moonlight, the free software implementation of Silverlight.  Microsoft has actively worked with the free software community to help produce Moonlight.  Think about how absurd that sounds; Microsoft — the bane of free software, if one goes by Slashdot — has been actively supporting an LGPL free software project, while Adobe has not!  The biggest problem this creates is one of monopoly: people feel insecure using Flash because there is only one implementation, leaving them at the mercy of Adobe.  In any situation, once there are multiple popular implementations of a file format, it’s far more difficult for any one party to commit abuse.  Of course, this is intentional by Adobe: they wanted to have that power of abuse, which is why they didn’t support an alternative implementation.

Now it becomes clear why Flash is so disliked.  It’s nowhere near the most insecure of popular browser plugins; Java has had far more vulnerabilities according to Secunia.  It’s certainly not the least reliable, nor is it completely proprietary; as previously mentioned, the spec is public.  Yet because of the above three mistakes, Adobe has made enemies of developers worldwide.

So, what now?  Flash is crap, we hate Flash, but how do we get rid of Flash, at least for purposes of internet video?

Let’s start with HTML5 <video>.  It’s quite clear that, barring an act of God (or Google, more on that later), if Flash is replaced in the near future, this will be how.  But at the moment there are many serious problems, most of which must be solved for it to even have a chance:

1.  Missing features.  Developers who haven’t worked with Flash often underestimate its capabilities and assume that displaying video is as simple as displaying images.  But there are many things that are useful to control.  Flash lets you tell the client how long to buffer before playing a stream (critical for reliable playback of any live video).  It signals packet loss rates back to the server, so that the server can throttle bandwidth accordingly.

There are dozens more; these are just a few.  But this is the core problem mentioned at the start of this article, the problem that hit Adobe so hard: “believing that a competitor being better is excusable”.  Many free software advocates promote HTML5 while declaring that these missing features are not a big issue and that we can do without them.  This is not excusable!  If you want to outcompete Adobe, you need to provide a superset of all commonly used features.

2.  Video/audio/container format issues.  Theora is a seriously hard sell to most companies, given its compression is much worse than x264’s and H.264 has no royalties for web video; as before, it’s hard to market something with at most nebulous benefits.  Being “patent-free” may sound nice to free software advocates like us, but most business types only care about the bottom line, and if being “patent-free” doesn’t benefit the bottom line, they’re probably not going to care. (NB: a commenter noted that H.264 is only royalty free for free content, not paid content.  This probably is not a big issue, since the royalty % is very small for paid content, and if you’re charging for content, you can probably afford to pay the small fees.  But it obviously is a slightly different situation.)

But even if you ignore the compression issue, most companies don’t like storing multiple versions of every video — they still need H.264 for iPhone support.  As a side note, Dirac is a potential patent-free option as well, and may provide better compression, but is slower to decode than Theora.  It’s definitely an option to consider though, and one which is way too often ignored when considering formats for HTML5 video.

Youtube, for example, has thrown away petabytes of bandwidth in the pursuit of fewer versions of each video: the default “low quality H.264” format, which now uses x264, is Baseline Profile-only.  Providing a High Profile alternative could save them 30-50% bandwidth for desktop users, which make up the vast majority of Youtube users.  But it would require storing yet another copy of the video (since Baseline is needed for iPhones), which is too costly for them.  Duplicating each video again would require there to be some serious benefit to doing so — and Google apparently believes that a 30-50% compression improvement is not sufficient, though there seems to be something weird going on with the 360p/480p madness they recently rolled out (neither is High Profile though).

Of course, despite having $50 million dumped on their doorstep each year by Google, Mozilla will never pay the H.264 license fees, nor will they probably ever support users installing their own codecs.  Thus, we are at an impasse.  If Microsoft supports HTML5 <video> in IE9, which is quite possible, they will almost certainly support H.264 and probably not Theora.  Thus, even ignoring the case of mobile devices like the iPhone, neither H.264 nor Theora will span the whole market.  So which will web companies pick?  Most likely, neither — they’ll see the split market as a reason to avoid HTML5 altogether, and drop back to Flash.

3.  Ubiquity.  Flash has the 98% market installation base on its side, a powerful force.  Until Internet Explorer becomes a low-popularity browser (unlikely in the near term) or supports HTML5, Flash simply won’t be replaced.  Furthermore, this effectively forces websites into using H.264: if they want to support both Flash and HTML5, using Theora would force them to store two redundant copies of the video, since Flash can’t play Theora.

4.  Quality of implementations.  Existing HTML5 implementations range from “bad” to “atrocious”; despite years of developers ragging on Flash, many of the existing implementations are still far slower than native media players, use terrible pixelated scaling (esp. Chrome), are outright buggy, or some combination of the above.  Not only are the implementations often bad, but they’re inconsistently bad!  Even if some work well, it does no good if many others don’t.

With all these problems, HTML5 <video> looks to be in serious danger despite its promise.  And this brings us to the main topic: what about Google, On2, and VP8?  If, as the FSF frantically pleads, Google opens VP8, what problems does it solve and what problems does it create?  And what benefits would this bring Google?

VP8 solves the compression problem: while still probably not as good as x264 (see the Addendum at the end for more details on this prediction), the gap is far smaller than with Theora, enough so that compression is far less of an issue.  But it also brings up a host of new problems.

1.  A few years ago, Microsoft re-released the proprietary WMV9 as the open VC-1, which they claimed to be royalty-free.  Only months later, dozens of companies had come out of the woodwork claiming patents on VC-1.  Within a year, a VC-1 licensing company was set up, and the “royalty-free” claim was no more.  Any assumption that VP8 is completely free of patents is likely a bit premature.  Even if this does not immediately happen, many companies will not want to blindly include VP8 decoders in their software until they are confident that it isn’t infringing.  Theora has been around for 6 years and there are still many companies (notably Nokia and Apple) who still refuse to include it!  Of course this attitude may seem absurd, but one must understand who one is marketing to.  One cannot get rid of businesspeople scared of patents by ignoring them.

2.  VP8 is proprietary, and thus even if opened, would still have many of the problems of a proprietary format.  There may be bugs in the format that were never uncovered because only one implementation was ever written (see RealVideo for an atrocious example of this).  There will be only one implementation for quite some time; Theora has been around for 6 years now and there’s still only one encoder.  Lack of competing implementations breeds complacency and stagnates progress.  And given the quality of On2’s source releases in the past, I don’t have much hope for the actual source code of VP8; it will likely have to be completely rewritten to get a top-quality free software implementation.

3.  It does nothing to solve the problems of hardware compatibility: most mobile devices use ASICs for video decoding, most of which probably cannot be easily repurposed for VP8.  This might be less of a problem if they’re targeting software implementations though; while it would eat more battery and be limited to mobile devices with powerful CPUs, it would not be unreasonable to play back VP8 on a fast ARM chip (see the Addendum for more on this).

The big advantage of VP8 is that it solves a problem that is unsolvable for Theora: Theora is forever crippled by its outdated technology and weak feature set.  With state-of-the-art RD and psy optimization, as in x264, Theora can likely become competitive with Xvid or maybe even WMV9, but probably not x264.  The only way to fix this would be a “Theora 2”, and attempting to ensure Theora’s “patent-free” status while adding new features would be extraordinarily difficult in today’s software patent environment.  VP8, on the other hand, offers an immediate jump to what is hopefully an H.264-comparable level of compression.

But now for the big question: why would Google want to open VP8, and if they did, how would they do it?  Google probably doesn’t pay a cent in license fees for Youtube; H.264 is free until at least 2016 for internet distribution and encoder fees only apply if you have more than 100,000 encoding servers.  The cost of the license fees for Chrome are minimal (a few million dollars a year, capped).  But despite that, there are actually some very good reasons.

1.  Control.  Google may view the control of other companies over H.264 as a threat: even though H.264 is licensed under RAND terms (Reasonable and Non-Discriminatory, they legally cannot be anti-competitive), there are many reasons for Google to want more control.  If they push VP8, they not only compete with Flash via HTML5, but they also prevent Flash from playing their video streams.  As it is unlikely (for the reasons mentioned at the start of the article) that Adobe will immediately jump ship to VP8, this creates a window of opportunity for Google to steal control from Adobe.

2.  Blitzkrieg.  The most risky, but most powerful thing Google could do is switch Youtube over to exclusively VP8 and roll out a new browser plugin to play it (but support HTML5 if available).  Given Youtube’s popularity, this would likely get them 80%+ install base in a matter of a month or two; effectively a “blitzkrieg” targeting Adobe’s market share.  This would be powerful because it wouldn’t rely on waiting for existing browsers (especially Internet Explorer) to switch over to VP8.

3.  Trump card.  Google may be worried about the future; if H.264 does succeed in eliminating all competition in the web marketplace, it would be quite possible that MPEG-LA would attempt to abuse their position and start charging fees for web usage.  Perhaps MPEG-LA needs a good “scare” to make sure they never consider such a thing.  Software monoculture is dangerous.

These seem like good enough reasons, albeit somewhat insidious ones (especially 2), for Google to launch such an attack.  Do we really want Google having this much control?  I’m not sure, but it’s sure as hell a better option than Adobe.  Will it actually happen?  Quite possibly; the only other sane purpose of an acquisition for $100 million would be to acquire patents to use as leverage in patent lawsuits.  Would it succeed?  Depends on how they do it and what other companies they rope into their plan.  It also depends on what their target is: would they try to push hardware support too?

Where does x264 fit in all this?  H.264 is certainly not going away, not for quite a while.  In most sane parts of the world, software patents are a non-issue.  But in the end, none of it matters for x264: we will continue our quest to create the best video compression software on Earth.  Unlike Adobe, we don’t sit complacent when we are the best; we keep trying to become better.  We add new features, improve compression, support new platforms, improve performance, and there’s far more to come.  We don’t care that many H.264 encoders are so bad that they can be beaten by Theora or Xvid.  We don’t care if VP8 comes out; that’s just another encoder to beat.  We are here to ensure that the best choice is always free software by constantly making free software better.

But, of course, we wholeheartedly support the quest for royalty-free, free-software multimedia formats.  There are many use-cases in which being free of patents is more important than compression, quality, performance, or even features.  Bink Video is a staggeringly popular example: used in tens of thousands of games despite having compression 10 times or more worse than modern video formats — almost entirely because of its royalty-free (albeit proprietary) nature.  If the day comes when Bink is replaced by a free software alternative, we will know the quest for a widely-accepted, free software, patent-free video format has succeeded.  Until then — I wish luck to those pursuing such a goal.

Addendum: VP8’s feature set and compression capabilities

Many people have been wondering what the reality behind VP8 is, behind the usual marketing bullshit that these sorts of companies put out.  As there is no public specification and even the encoder itself still isn’t released, this is all an educated guess based on what information I do have.

VP8 has been marketed in press releases as basically an “improved VP7″, primarily with the intent of being faster to decode in software on mobile devices, especially those without SIMD (e.g. ARM11).  Thus, it is likely reasonable to start approaching VP8 by commenting on VP7.  VP7 was released in ~2003, around the same time as H.264.  It made waves due to being dramatically better than practically all H.264 encoders at the time.  The reason wasn’t that VP7 was better than H.264, but rather that On2 had a far more mature codebase: they had been developing their encoder for years, while most H.264 encoders were slapped together in the months following the finalization of the specification.  However, VP7 never caught on because it was completely proprietary; nobody wants to rely on a proprietary codec anymore.  Over the years, as far as I can tell On2 never updated VP7 and the best H.264 encoders, like x264, moved well ahead.  VP7 was mostly forgotten except for a few apps like Skype that licensed it.

Now let’s look at VP7 technically.  While I don’t know too much about the internals, VP7 is notable for relying very heavily on strong postprocessing filters.  This is not unique; practically all On2 codecs have this.  Even Theora has an optional postprocessing filter that it inherited from VP3 in addition to its in-loop deblocker.  On2’s postprocessing filters usually fall into three categories: deblocking, sharpening, and dithering/grain.  The dithering filter is useful for avoiding blocking in dark and flat areas, similar to the effects of the gradfun mplayer filter.  The sharpening filter helps compensate for the natural blurring effect of the quantizer in encoders that are not very psychovisually optimized.  The deblocking filter is also notorious for blurring out tremendous amounts of texture and detail (example: vp7, x264).  But this also provides a significant advantage: by moving many features of the codec into postprocessing, the video format becomes scalable; a decoder can do “less work” while still playing back the file, albeit at a lower quality.  This doesn’t work if all the steps are mandatory.

Since VP8 is marketed as less complex than VP7, it likely still does not contain arithmetic coding, B-frames, or other computationally intensive features.  We know from marketing material that one of the big promotions was it allowing the encoder to pick between interpolation modes to allow faster decoding if necessary.  Clearly, VP8 is big on speed — which means they likely have not added a lot of new compression-related features.  If it improved greatly over VP7, it would most likely be due to psychovisual optimizations in the encoder.  But given their last press release showing a “comparison with x264”, it’s clear that they haven’t done this.  Their “VP8” image is a blurry disaster with nearly no detail at all, as opposed to the artifacty-but-detailed x264 image, which actually looked better to many commenters at Doom9, despite the obviously staged test.

Overall, I expect VP8 will surely be better than MPEG-4 ASP (Xvid/DivX) and WMV9/VC-1.  It will likely be nearly as good as Mainconcept’s H.264 encoder (one of the best non-x264 encoders), but assuming they still believe that blurring out the entire image is a good idea, probably still significantly inferior to x264.

Update: According to gmaxwell, a Theora dev, this seems quite likely: “What I’d heard from ex-on2 folks was that there is some philosophical disagreement about how to optimize [encoder] tuning, and the tune for PSNR camp mostly won out. Apparently around the time of VP6, On2 went the full-retard route and optimized purely for PSNR, completely ignoring visual considerations.”  This explains quite well why VP7 looked so blurry and ugly.

If there’s anything to take away from this, it’s that psy optimizations are the single most critical feature of a modern video encoder.  They’re the reason that Vorbis beat MP3 for audio, and now they’re just as important for video.
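The PSNR-vs-psy point can be made concrete with a toy sketch (illustrative numbers only, not a claim about any particular encoder): an output that throws texture away entirely can score a higher PSNR than one that preserves the texture’s energy at the wrong phase, even though the latter retains the grain a viewer actually sees.

```python
import math

def psnr(ref, test):
    """PSNR in dB for 8-bit samples."""
    mse = sum((a - b) ** 2 for a, b in zip(ref, test)) / len(ref)
    return 10 * math.log10(255 ** 2 / mse)

# Source: flat gray carrying fine +/-20 texture (think film grain).
src  = [128 + (20 if i % 2 == 0 else -20) for i in range(64)]

# "PSNR-tuned" output: blurs the texture away completely.
blur = [128] * 64

# "Psy-tuned" output: keeps the texture's energy, but at the wrong phase,
# so every sample is numerically further from the source.
psy  = [128 + (20 if i % 2 == 1 else -20) for i in range(64)]

print(round(psnr(src, blur), 2))  # ~22.11 dB: the blur "wins" on PSNR
print(round(psnr(src, psy), 2))   # ~16.09 dB: lower score, but the grain survives
```

A metric that rewards per-sample closeness inevitably favors the blur, which is exactly why pure PSNR tuning produces the smeared look described above.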

95 Responses to “Flash, Google, VP8, and the future of internet video”

  1. kierank Says:

I would say that some reasonable level of “content protection” in the form of RTMP is a big advantage for Flash for sites like Hulu/iPlayer etc. To record current streams one has to jump through a lot of hoops with one-time keys, obfuscated URLs and the like.

    I would also say that 80%+ distribution of VP8 through Youtube within a few months is very high, ~50% within 6 months or so would be more reasonable in my opinion.

  2. Ian Hickson Says:

    Regarding HTML5, in particular point #1 (lack of features): we don’t think the competition being better is excusable, we just want to fix #4 (bugs) as well. Either we add features at a slow rate, and have the implementations fix bugs as we go, or we add features at a high rate, and don’t let the implementations fix bugs. In the latter case, we end up with very buggy implementations, each with their own subset of the spec’s feature set, and IMHO that is significantly worse than features missing in all implementations.

    Basically, moving slowly paradoxically gets us to the end point faster than moving fast. Fixing bugs is easier when there are few of them and they just got added, than when there are a lot of them and they are all old and baked into assumptions in the code.

  3. Dark Shikari Says:

    @Ian

    The features, though, have to go in the spec in order to be useful. We don’t want a disaster of non-standard features only supported by some browsers (IE all over again). Which means they have to be decided on sooner rather than later, even if they’re not implemented just yet.

  4. skal Says:

    Good to see Mainconcept mentioned in a non-too-negative way for a change…

  5. Dark Shikari Says:

    @skal

    So we have our local Google/Youtube representative here, now you have to make a statement ;)

  6. George Bray Says:

    YouTube is Google’s ace, but only after they’ve got some sort of hardware support across future Android platforms.

    They’re competing with Apple on the iPod, iPhone and iPad, where hardware and codec are optimised together, a position Google may yet achieve.

    The ingrained nature of H.264 in the broadcast world has momentum. As wired and wireless bandwidths increase there’s a good argument to stay with a codec family that runs across the spectrum (broadcast feed to mobile).

    Google’s use of VP8 could be anything, as you say. From just keeping it aside until MPEG-LA determine the license framework for internet broadcasts; or full-on development with hardware support and conversion of the masses via a mandatory YouTube update.

  7. Josh Says:

    “…and Adobe doesn’t seem to care at all.”

    The upcoming 10.1 update to Flash Player will have much-improved performance on Mac to the point where it may actually be faster than the Windows implementation in some cases. Adobe’s CTO, Kevin Lynch, recently commented on the Mac performance situation:

    “In Flash Player 10.1 we are moving to CoreAnimation, which will further reduce CPU usage and we believe will get us to the point where Mac will be faster than Windows for graphics rendering”

  8. Anthony Says:

    @Dark Shikari:

    Given that HTML5 (or just HTML?) is now in an unversioned model [1,2], I would assume new features won’t take ages to add and get into the specification. Though it still will take forever for IE to catch up and I don’t see MS switching to a faster browser release model instead of once every 30 months. I have a hard time imagining that would fit well in corporate environments who wouldn’t want to extensively test every .x release.

    [1] http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-December/024477.html
    [2] http://blog.whatwg.org/whats-next-in-html-episode-1

  9. Peter Says:

    Missing features 1: Control over buffering

    This is not a missing feature. I don’t want the web developer to have control over buffering. 95% of the time, the web developer will mess it up. Youtube is the only video playback site that does kinda okay.

    The browser is in a much better position to evaluate bandwidth available and to cache than the server is. See embedded playback with mplayer in Firefox, QuickTime, or just about anything else. Works much better than Flash.

    At some point, the web went AJAX. Before AJAX, web sites worked okay — HTML 2 gave a consistent way to define content, and my browser gave a consistent way to render it. Now, arrows don’t work for scrolling in Horde or most other e-mail programs — they switch between messages. News sites pop up ads when I highlight text. Google Doc Reader doesn’t let me wget a mirror of documentation to use with limited internet connectivity on a plane. The back button doesn’t work in Slashdot, and the vast majority of web sites don’t give me a nice spinner on my tab bar when loading pages. I won’t even get into ADA-mandated handicapped accessibility.

    This is much less usable. Web-based word processors? AJAX is the way to go. Delivering normal content? 98% of web sites fuck it up, and it’s much more usable to have Firefox developers do it for me.

  10. D.J. Capelis Says:

    “it has become clear that the Gnash folks are completely incompetent and nobody else seems interested”

    I’m willing to believe this without a whole lot of convincing, but I wonder if you might be willing to elaborate. From what I understand implementing a basic flash spec is about as hard as falling off a log, but doing it right, compatibly with existing stuff for all the versions of the spec, is more difficult. Do you have more information about how the gnash team is incompetent?

  11. Torossian Says:

    You touched on a point that’s rarely discussed in the context of H.264 adoption and that’s the size of the stick MPEG-LA is going to wield if the war between H.264 and Flash is tipped in favor of the former.

    It makes me uneasy that the amnesty is extended only to 2016 without much explanation as to what the next step is going to be.

    I have a strange feeling that we’re getting rid of one dictator just to install another one, despite my strong desire to phase out Flash as soon as humanly possible.

  12. Gregory Maxwell Says:

    Some random thoughts—

    “We don’t care that many H.264 encoders are so bad that they can be beaten by Theora or Xvid. We don’t care if VP8 comes out; that’s just another encoder to beat. We are here to ensure that the best choice is always free software by constantly making free software better.”

    But an unfortunate side effect is that you’re also ensuring that the best choice is always a deeply patent encumbered format… Formats have a lot more impact than individual applications. If someone I’m communicating with uses encumbered software, I’m still free to choose… but if they use encumbered formats my choices are far more limited.

    Clearly doing that isn’t your goal, but it’s a material and unavoidable side effect. I think this diminishes the value of your “We’re just making free software better” position— no man is an island, and all that jazz. Though it certainly doesn’t diminish the impressive engineering you’ve done on x264.

  13. Dark Shikari Says:

    @Peter

    On the topic of buffering, the modern buffer model used in MPEG formats (and pretty much universally borrowed even outside of the MPEG world) depends on the encoder knowing how long the client has buffered in order to be able to send video data and know that the client will be able to decode it in time without buffering extra. This buffer model, called the VBV, is a leaky-bucket model where the client is assumed to have a buffer of some size. Decoding a frame reduces the data in the buffer; during the time that the frame is decoded, the client downloads more data, thus filling up the buffer more. The encoder’s job is to ensure that the buffer never, ever empties (as an empty buffer would result in playback stopping).

    If the encoder does not know the size of the buffer, it is completely, utterly impossible for it to make any intelligent decision at all. It has to guess wildly. Do note that there are ways of signaling this in some video formats; for example, H.264 allows the encoder to specify HRD information which informs the client not only of the buffer size, but also of the expected state of the buffer at any point in the stream. This extra information lets the decoder buffer only what is necessary when tuning into a live stream, rather than the whole buffer size. This is particularly important for broadcast television, where channel change times need to be kept to a minimum.

    Note this primarily applies to streaming, not progressive download, the latter being a case where this is much less important, and in which you are correct–the client is generally smart enough to buffer correctly.
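    The leaky-bucket constraint above can be sketched in a few lines of Python (the function name and all numbers are illustrative, not any real encoder’s parameters):

    ```python
    def vbv_underflows(frame_bits, buffer_bits, fill_rate_bits, fps, initial_fullness):
        """Return True if the decoder's buffer ever empties (playback stalls).

        The buffer starts at `initial_fullness` bits.  During each frame
        interval the client downloads fill_rate/fps bits (capped at the
        buffer size); decoding then removes that frame's bits.
        """
        fullness = initial_fullness
        per_frame_fill = fill_rate_bits / fps
        for bits in frame_bits:
            fullness = min(fullness + per_frame_fill, buffer_bits)  # download
            fullness -= bits                                        # decode the frame
            if fullness < 0:
                return True  # underflow: the client would stall
        return False

    # A steady 100 kbit/frame stream fits a 500 kbit buffer at 2.5 Mbit/s, 25 fps...
    steady = [100_000] * 50
    print(vbv_underflows(steady, 500_000, 2_500_000, 25, 400_000))  # False

    # ...but one oversized burst frame drains the buffer and stalls playback.
    bursty = [100_000] * 10 + [600_000] + [100_000] * 10
    print(vbv_underflows(bursty, 500_000, 2_500_000, 25, 400_000))  # True
    ```

    The encoder’s rate control runs exactly this simulation ahead of time, which is why it must know the buffer size and fill rate to guarantee the second case never happens.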

    @D.J. Capelis
    >Do you have more information about how the gnash team is incompetent?

    See http://news.ycombinator.com/item?id=1144097 on Hacker News.

    @Gregory

    The best choice will always be a deeply patent-encumbered format until someone gets rid of software patents. There is no other feasible solution.

    Also, the world isn’t the United States. In most of the rest of the world, we don’t have to worry about these things ;)

  14. Steve Says:

    Very well thought out article. Thank you.

    One note about Microsoft helping *anybody*, many of us still remember similar “helpfulness” to use Microsoft technologies in useful ways only to have them turn on you and demand massive licensing fees once general acceptance had been achieved. See terminal services and FAT, for simple examples. Terminal Services used to not require CALs, until enough Citrix connections came about. Same with FAT where Microsoft encouraged every flash memory manufacturer to use it and suddenly smacked everyone with an invoice.

    If Silverlight or derivatives achieve any penetration, I fully expect that behavior to repeat. To that end, I’m highly skeptical of their cooperation and advise everyone to run the other way as fast as possible. Perhaps it’s cynical to the extreme, but it’s an expectation they’ve earned. In any case, it’s a lesson on what really happens if any portion of software is even fractionally proprietary.

  15. kosmonaut Says:

    DS, thanks for this article, it’s basically what I’ve been hoping you would write since I posted my questions about HTML5 and VP8 on doom10. ;)

    Given all the technical insight you’ve posted here, the main question remains what MPEG-LA has planned for after 2015. I’m sure they don’t precisely know yet, but that’s an awful lot of uncertainty hanging over something as important as the future of internet video.

  16. Asa Dotzler Says:

    I think your introduction is a bit revisionist. Flash got its ubiquity because Microsoft included it in Windows XP, which was ubiquitous.

  17. Dark Shikari Says:

    @Asa

    Hmm, maybe my memory is failing me then. I recall Flash being quite popular even before XP.

  18. Asa Dotzler Says:

    Dark, popular, not ubiquitous.

    And as for video, I don’t think Flash even had video when it shipped in XP (Flash 5, if I remember correctly). Windows Update later delivered newer versions of Flash that did include video capabilities.

  19. kierank Says:

    @Asa

    Flash had both ExpressInstall and Windows Update by which to deliver new versions. Both are probably as simple as it’s going to get in terms of user update methods. It was inevitable that it would end up being the standard for video; it’s just that Microsoft assumed that it would be a WMP/Real/QT battle.

  20. SeanC Says:

    Regarding missing features, I’ve been watching the full-screen issue closely for HTML5. The spec doesn’t require this feature and suggests browsers handle it with their own APIs. This could be another sticking point for Flash if the browser vendors don’t provide enough functionality here or disallow custom DOM elements for security purposes.

  21. 23Skidoo Says:

    Regarding Bink, a lot of games use Theora nowadays (StarCraft 2, for example).

  22. rm Says:

    > Youtube, for example, has thrown away petabytes of bandwidth in the pursuit of fewer versions of each video

    Uhh… YouTube has thrown away petabytes of bandwidth, storage and encoding time for no good reason at all. I’m pretty sure they choose their settings by rolling dice. I hope there’s some hidden deep reason mere mortals can’t grasp, but it’s getting really difficult to believe.

    Case in point: 720p video, uploaded about 10 hours ago. Brand-new, couldn’t have matched an old video. Versions available:

    1 (“Flash 7”). FLV, 400×226 ~240Kbps Sorenson Spark, 22050Hz stereo ~60Kbps MP3
    2 (“iPhone”). MP4, 480×270 ~512Kbps h.264, 44100Hz stereo ~110Kbps AAC
    3 (“360p”). FLV, 640×360 ~620Kbps h.264, 44100Hz stereo ~110Kbps AAC
    4 (“480p”). FLV, 854×480 ~1200Kbps h.264, 44100Hz stereo ~110Kbps AAC
    5 (“720p HD”). MP4, 1280×720 ~1920Kbps h.264, 44100Hz stereo ~125Kbps AAC

    Note that the audio track of 2, 3 and 4 has been encoded separately each time rather than re-used: the compressed data is substantially different but almost the same size (±3 bytes), and when decompressed and compared it has high PSNR but doesn’t match exactly, with lots of ±1 differences.

    There goes the “low quality sucks because of the iPhone” theory: there’s a special version for the iPhone and friends. It’s baseline profile and I don’t think the encoder was x264: quality is garbage, it’s significantly worse pixel-for-pixel than the 360p one despite having 45% more bitrate per pixel.

    Now, the 360p and 480p are incomprehensibly Main profile. The 720p is High and uses the 8×8 transform (so it never uses baseline for desktop users). All of them use 3 refs. But the shocking thing is that none of them use B-frames. You mention B-frames as an “expensive” feature but without weighted prediction I haven’t seen a decoder that lost more than 5% speed for them. Not using B-frames is atrocious in my opinion, I hope it’s not intentional. Maybe they have a problem with encoder delay?

    Other than that, it looks like an old x264 version (with AQ for sure, and psy-RD I think, but not MB-tree). All encodes (incl. the Spark and the iPhone ones) were two-pass. There’s no encoder identification string in any of them.
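    The re-encode check described above (decode the tracks, compare the PCM, look for high but imperfect PSNR) can be sketched as follows. The audio here is synthetic; in practice you would first decode the real tracks to PCM with an external tool:

```python
import math

def psnr(a, b, peak=32767):
    """PSNR in dB between two equal-length lists of 16-bit PCM samples."""
    mse = sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    if mse == 0:
        return float("inf")   # bit-identical: the same encode was reused
    return 10 * math.log10(peak * peak / mse)

# Synthetic "original" track, and a copy with tiny +-1 rounding differences,
# the signature of an independent re-encode rather than a reused stream.
orig = [int(10000 * math.sin(i / 20)) for i in range(48000)]
reenc = [s + (1 if i % 3 == 0 else -1 if i % 3 == 1 else 0)
         for i, s in enumerate(orig)]

print(psnr(orig, orig))    # inf: identical data
print(psnr(orig, reenc))   # high but finite: consistent with a re-encode
```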

  23. Shiretoko Says:

    Very informative article. I really do hope Mozilla considers at least adding the ability to install decoders, but, as you said, it is unlikely.

  24. sonicoliver Says:

    Just like the Flash Vorbis player, a skilled Flash developer could get a Theora decoder running in the Flash VM using tools like Alchemy (with “look ma, no plug-in update!” to boot).

    This is why Flash will be around for a while yet, and, as you rightly point out, Flash is often underestimated, rarely duplicated…

    thanks for the great post!

  25. Michael Critz Says:

    The VP8 feature set doesn’t validate its existence.

  26. Drew Thaler Says:

    Excellent post! Just a side note about Bink and games:

    I know it was a convenient aside, but I doubt very much that royalties and licensing really enter very far into the reasons that games often use Bink. In my experience, by far the most important factor is that it has a small memory footprint — that trumps everything else.

    Bink’s poor compression isn’t much of a strike against it in games, either. Oddly enough, there is no particular economic pressure pushing games to be smaller.

  27. bash Says:

    The issue of Mozilla licensing H.264 is also one of distribution. Even if Mozilla were to include H.264 and pay the fees, it is unclear whether this would extend to the repackaged versions of Mozilla products as well, mainly in Linux distributions.
    This could very easily result in Linux users becoming second-class Mozilla users, which clearly cannot be in Mozilla’s best interest.

  28. Dark Shikari Says:

    @rm

    Quite true, Youtube often seems to not know what they’re doing, but they have smart guys working for them (e.g. Pascal) so I think they have their reasons. The duplicate videos aren’t too big a deal though; note most of the “HD” options only exist for a tiny tiny group of videos, so it probably doesn’t take much storage overall.

    B-frames are an expensive feature if only because they significantly increase decoder complexity–even if they don’t cost speed. This would be an issue for smaller embedded devices.

    @sonic

    Alchemy is about 10 times slower than native code. There’s no way that would work. It would work in Silverlight though, as .NET is only a few times slower than native code (after considering the cost of no SIMD).

  29. ChuckEye Says:

    Yeah, the intro is a bit revisionist… “Given all this, it’s clear why Flash won: it supported open multimedia formats like H.263 and MP3, used an ultra-simple container format that anyone could write (FLV), and worked far more easily and reliably than any alternative.”

    QuickTime was a container as well, one that supported more than a dozen commonly used formats over the years. It even supported Flash content from v5 to 7.3.

  30. John Dowdell Says:

    Hi, that’s ten screenfuls of text, then eleven screenfuls of comments. Can you summarize? Thanks.

    (There are many more Linux engineers on the Player team than just Mike M. The upcoming Safari shift to Core Animation helps avoid some of the browser interference dictated by the earlier Macintosh drawing models. The RTMP transfer protocol is documented, but attempting to crack encryption is a different thing entirely. I wouldn’t call the Gnash folks “incompetent”, but like the “HTML5” proponents, they didn’t seem to think through the codec issues first. I may be just picking off digressions, though… would appreciate refactoring to confirm what’s most important to you, thanks.)

    (btw, Asa’s been corrected many times on the Flash ubiquity occurring long before Windows XP included it (or YouTube occurred).)

    jd/adobe

  31. Dark Shikari Says:

    @Chuckeye

    Back in the 90s I don’t recall Quicktime supporting much of anything standardized. Back then, they were using SVQ1 (a proprietary Sorenson video format) and some bizarre proprietary audio format.

  32. kierank Says:

    @John

    I wouldn’t say RTMP has “encryption” by any stretch of the imagination. It merely has obfuscation.

  33. Drazick Says:

    Since you mentioned it, is there a chance for a post about Dirac?

    I would really like to read your insight about this format.

    Thanks.

  34. Dark Shikari Says:

    @Drazick

    Hmm, I should probably do a post about wavelet formats in general at some point. JPEG2K, Snow, and Dirac could be covered.

  35. Dave Says:

    You talk about VP7 & 8 but don’t mention VP6, which has the benefit of being supported by Flash. Is it just too old and crappy? Simple maths suggests it’s 3 better than VP3, which forms the basis of Theora.

    How good is it right now, and how good do you think it could get by tweaking or rewriting the code without breaking the spec (assuming it has one), or more importantly without breaking compatibility with the existing Flash install base?

    Your own work on x264, LAME for MP3, and the current Theora improvements seem to suggest that you can get substantial improvements from codecs long after a patent holder or proprietary software company would have just rested on their laurels or forced an upgrade.

  36. Jammec Says:

    “Since VP8 is marketed as less complex than VP7, it likely still does not contain arithmetic coding, B-frames, or other computationally intensive features.”

    Interesting post. I must comment on one thing, though: even VP6 contains arithmetic coding, so I don’t think they would have dropped it.

  37. Funtomas Says:

    Since Google is a highly ranked, innovative company, and since internet video is the next big thing, already consuming a great deal of web traffic, I dare to guess Google will come out with some more disruptive model of video distribution.
    When it comes to “there seems to be something weird going on with the 360p/480p madness”, I suggest not focusing only on VP8 in the Google-On2 acquisition. There’s more to it, like Flix. Imagine if the engineers came up with reversed functionality that could deliver videos in any format, encoded on the fly from a single file.
    Did you know that FTTH, WiMAX and perhaps LTE provide symmetrical bandwidth, pretty convenient for delivering live video broadcasting harnessing torrent streaming?
    I believe web video’s future is bright.

  38. Dark Shikari Says:

    @Dave

    VP6 is better than VP3, for sure, but at this point most people have discounted it, since everyone’s switching to H.264 for Flash purposes (cheaper, better compression, and you don’t have to use On2’s awful encoder).

    That’s a possibility as well, though: releasing VP6.

    @Jammec

    Whoa, you’re right. I didn’t realize that at all. That might explain why VP6 was so much slower than MPEG-4 SP.

  39. Diego Says:

    Well, the Gnash people are paranoid, but that doesn’t mean the project is dead.

  40. Kelly Clowers Says:

    On an unrelated topic, I was wondering if you could replace your Atom 0.3 feed with Atom 1.0? Sorry to be a bother.

  41. Ed Says:

    Can we have a totally new, marketable name for H.264? It is totally insane that the iPhone is stuck with MPEG-4 Baseline Profile, which is a big difference from High Profile.
    And Youtube or any video site has to work with these devices.

    I am wondering if one has to pay MPEG-LA if they host all their servers, and have their viewers, in a country where software patents are non-enforceable.

  42. Dark Shikari Says:

    @Kelly

    How would I do that? I’m just using a stock WordPress install.

    @Ed

    I know an extremely popular website whose philosophy is “we’ll use High Profile, and anyone who doesn’t support it can suck it” ;)

  43. OnOffPT Says:

    @Dark Shikari

    Could you (or anyone else) please post some links supporting the statement:
    “most mobile devices uses ASICs for video decoding…”

    If so, what profile are they supporting?

  44. Dark Shikari Says:

    @OnOffPT

    Depends heavily on the device. Do note I should probably have been a bit less specific; many devices use DSPs, like the C64x on the OMAP, which are technically not ASICs but rely very heavily on dedicated ASIC functional units for specific tasks (bitstream decoding, transform, dequantization, etc).

    A good example is the iPhone 3GS: it has a ~500MHz CPU, yet can decode up to 720p-1080p High Profile video.

  45. Louise Says:

    Doesn’t On2 have their own container?

    What container format are flv and mp4 videos from YouTube in?

  46. Kelly Clowers Says:

    @ Dark Shikari
    My mistake, it is labeled as “Atom 0.3”, but it actually is Atom 1.0.

    Your template/theme has a line that produces this html:

    <link rel=”alternate” type=”application/atom+xml” title=”Atom 0.3″ href=”http://x264dev.multimedia.cx/?feed=atom” />

    The title attribute should be: title=”Atom 1.0″

    Since it’s just the label, it isn’t such a big deal.

  47. Dark Shikari Says:

    @Louise

    Not as far as I know; VP6 generally just used FLV or AVI. Youtube uses FLV and MP4 containers, as you stated.

    @Kelly

    Fixed.

  48. Louise Says:

    @Dark

    Thanks for clearing that up. I had no idea that flv and mp4 were containers.

    So in case Google wants to use all open formats and containers, I guess that leaves them with Ogg and MKV.

    Can MKV be streamed like Ogg can?

  49. Dark Shikari Says:

    @Louise

    MKV supports indexed progressive download, like MP4 or hinted FLV. The DivX Web Player plugin can play MKV files in the same fashion as Youtube videos work.

  50. Ian Hickson Says:

    @Dark

    Yes, the goal is to basically keep the spec one step ahead (but only one step ahead) of the browser vendors in terms of features. The next step (captions, subtitles, timed events) is likely coming in the next few months.

  51. Louise Says:

    @Dark

    Wow, that’s impressive!

    So is MKV better in every way than OGG, or does OGG have things that it does better than MKV?

  52. Dark Shikari Says:

    @Louise

    MKV is not the best-designed container format out there (it’s arguable that MP4 is more sensibly structured), but it isn’t *bad*, and it is extraordinarily flexible.

    Ogg is notoriously terrible, possibly the worst container format in history. Xiph would have done well to adopt MKV instead. Ogg’s primary failures are twofold. First, it was originally designed for streaming only, and thus has no index; this makes seeking quite obnoxious. Second, and more importantly, its method of storing timestamps is completely insane: every single codec stores timestamps differently! This means a splitter application needs to have a dozen different decoders linked into it just to be able to parse the file, even if you don’t want to decode it. See http://hardwarebug.org/2008/11/17/ogg-timestamps-explored/ for more details.
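    To make the timestamp complaint concrete: Ogg stores one 64-bit “granulepos” per page, but what that number means is codec-specific. A toy sketch of two real schemes (the parameter values here are illustrative):

```python
# Ogg's granulepos must be interpreted per-codec, so a demuxer cannot
# compute timestamps without codec-specific knowledge.

def vorbis_time(granulepos, sample_rate):
    # Vorbis: granulepos is simply the count of PCM samples so far.
    return granulepos / sample_rate

def theora_time(granulepos, fps, kf_shift):
    # Theora: the upper bits hold the last keyframe's number, the lower
    # bits the number of frames since that keyframe.
    keyframe = granulepos >> kf_shift
    offset = granulepos & ((1 << kf_shift) - 1)
    return (keyframe + offset) / fps

print(vorbis_time(441000, 44100))            # 10.0 (seconds)
print(theora_time((240 << 6) | 10, 25, 6))   # 10.0 (frame 250 at 25fps)
```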

    There’s much more beyond this as well; this is just a few things.

  53. James R Grinter Says:

    What about the MXF (“Material Exchange Format”) container format? Where does that fit in?

  54. Louise Says:

    @Dark

    You really know your stuff! Very interesting to read and learn!

    So what do you predict or expect Google will use? MKV or perhaps develop their own?

    What about the audio codec? From this PDF
    http://www.on2.com/file.php?224

    I can see that On2 supports AAC. What will Google likely choose? Vorbis?

    In this video, Eric Schmidt explains that Google wants to develop point-to-point video conferencing for mobile:
    http://www.youtube.com/watch?v=YuqiE2lukDM

    Could this impact their choice of container and audio codec?

  55. Dark Shikari Says:

    @James

    I don’t know too much about MXF. To me, it’s “some weird format used by Apple and ‘professional video people’”. Which is why I don’t want to say anything about it, because I’ll probably be wrong ;)

    @Louise

    AAC’s licensing terms are exponentially more onerous than H.264’s. If Google didn’t want to use H.264, they would sure as hell not want to use AAC. Since Vorbis doesn’t work in AVI, they’d probably have to use MKV or adapt Ogg to support VP8.

  56. Louise Says:

    @Dark

    This years Google I/O will definitely be interesting!

    Some say that Google will choose a container that allows DRM, in case they want to sell movies on YouTube. The DRM part doesn’t sound good.

    Just out of curiosity, is it possible to say whether AAC or Vorbis gives the best quality per size? I remember reading that Vorbis is supposed to be very good at low bit rates, while AAC is best at high bit rates.

    Is that correct?

  57. Joseph Says:

    This is off-topic, but I need to ask:

    Do you know any company that sells x264-based appliances with SDI input?

    We are an IPTV company, and we find hardware H.264 encoders stupidly expensive for their quality. While we can probably hack something together with ffmpeg (audio) + x264 + libdvbpsi and a PCI SDI input card, we would still have to set it up on a “normal” Linux server, and our management will not like this. But if we buy an appliance (even if under the hood that appliance is just Linux+ffmpeg+x264) with a support contract and warranty, our management will be happy.

  58. Dark Shikari Says:

    @Joseph

    Avail Media (now Avail-TVN) sells them, but as far as I know, only as part of their whole service. But of course, as an IPTV company, you might find their whole service useful as well–and maybe you could negotiate a deal.

    @Louise

    They’re pretty close. AAC wins handily at low bitrates due to SBR and PS, but at AAC-LC bitrates (~70+kbps) it’s pretty much a dead heat.

  59. Louise Says:

    @Dark

    Thanks =)

    Just read now about the NUT container
    http://www.nut-container.org/

    It sounds like they have fixed the problems that you mentioned MKV has, no?

    Could NUT be a container candidate for Google?

  60. Dark Shikari Says:

    @Louise

    NUT didn’t really go anywhere, despite being well-designed. But if you wanted to start nearly from scratch with a good baseline, NUT would be the place to start.

  61. totoum Says:

    @Dark

    lol, you might have already said too much about MXF when you said it’s “some weird format used by Apple”.

    Because Apple doesn’t use it :) Final Cut can’t edit MXF files without plugins; they have to be remuxed into MOV.
    It’s used by Avid, though, so someone with a Mac running Avid could use it.

    Anyway, thanks for this post, it was a great read. I had been spending the last few days reading all kinds of blog posts about it and was amazed by how much ignorance there was out there.

  62. Michael Stanley Says:

    @Dark #52

    Lacking an index is not a fatal design flaw for the container. It can be added afterwards with no incompatible changes and minimal disruption. In fact, Ogg is growing an index for fast seeking over high latency connections right now: http://wiki.xiph.org/Ogg_Index

    Nut looks awfully similar to Ogg to me, so it’s unusual that you call Nut “well designed” and Ogg “the worst container format in history”.

  63. Dark Shikari Says:

    @Michael

    Something can seem well designed yet have many fatal flaws. An ice cream sundae with a horseradish sauce may be “awfully similar” to a normal ice cream sundae, but it’s not going to taste nearly as good.

    Mans Rullgard can tell you in more detail; I’m not the foremost expert on containers.

    > It can be added afterwards with no incompatible changes and minimal disruption. In fact, Ogg is growing an index for fast seeking over high latency connections right now: http://wiki.xiph.org/Ogg_Index

    This is about 10 years late. Not having an index in this day and age is completely ridiculous, never mind the fact that no legacy implementation will support such a thing. We’re not going to wait 5 years for every implementation to be updated to support a feature without which reasonable seeking is impossible. Ogg is a transport stream format, despite HTML5 primarily being designed for progressive download, not streaming. It’s simply not the right tool for the job.

    There is simply no significant advantage that Ogg provides over other free, more widely-supported, better-designed, and more flexible containers. Well, except one: it allows Xiph to dictate what video and audio formats everyone else uses! Of course, this is only an advantage to Xiph, not anyone who believes in a free web.

  64. Igor Says:

    >>Just out of curiosity. Is it possible to say either AAC or Vorbis gives the best quality per size? I remember reading that Vorbis should be very good a low bit rates, where AAC is best at high bit rates.

    HE-AAC has the best quality up to 64 kbps: http://listeningtests.t35.com/mf-64-1/results.htm
    LC-AAC (>=80 kbps) is on par with or even better than Vorbis: http://www.hydrogenaudio.org/forums/index.php?showtopic=74781

    Unfortunately Vorbis has failed badly on its promised hardware support. It’s being phased out by the far more popular audio formats MP3 and AAC.

  65. Louise Says:

    In this post
    http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=61465&view=findpost&p=548918

    the FLAC developer explains that he doubts FLAC can be optimized to compress 5% better.

    Is this also the case with Vorbis/AAC? Have we reached the best possible audio codec now?

  66. Dark Shikari Says:

    @Louise

    No. The primary reason FLAC can’t compress much better is that audio consists primarily of noise. At any particular point in a file, the low N bits (where N is some number, probably between 2 and 6 or so) will be nearly pure noise, impossible to compress losslessly. It’s worse with 24-bit samples, of course.

    Lossy audio is different, and has no theoretical entropy limit.
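    The low-bits point is easy to demonstrate with a general-purpose compressor standing in for FLAC’s prediction (a rough analogy, not FLAC’s actual algorithm): bytes modeling a smooth signal compress well, while bytes modeling the noisy low bits do not.

```python
import math
import random
import zlib

random.seed(0)
# Model 16-bit samples: a smooth signal in the upper byte, noise below.
hi = bytes(128 + int(100 * math.sin(i / 50)) for i in range(50000))
lo = bytes(random.getrandbits(8) for _ in range(50000))

def ratio(data):
    """Compressed size as a fraction of the original size."""
    return len(zlib.compress(data, 9)) / len(data)

print(ratio(hi))   # well under 1.0: the predictable part compresses
print(ratio(lo))   # about 1.0: the noise is incompressible
```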

  67. Louise Says:

    @Dark

    I see. Thanks =)

  68. astrange Says:

    @Igor

    > LC-AAC (>=80 kbps) is on par or even better than Vorbis http://www.hydrogenaudio.org/forums/index.php?showtopic=74781

    Could you add error bars to the post? I think introducing apparent strict ordering will only confuse people.

    @Louise

    > Is this also the case with Vorbis/AAC? Have we reached the best possible audio codec now?

    There are more things audio codecs can do. Two examples would be better entropy coding (they use VLCs like older video codecs, whereas H.264 uses an arithmetic coder) and longer-term prediction (AAC compresses sets of 1024 samples without referring to previous ones, which isn’t much).

    On the other hand AAC implements both of those in different profiles (BSAC and LTP) and neither are popular, so maybe not.

  69. beyondtheeyes Says:

    You may find this article interesting: http://arstechnica.com/open-source/news/2010/02/ogg-theora-vs-h264-head-to-head-comparisons.ars

  70. Dark Shikari Says:

    @beyondtheeyes

    We talked about that on IRC, both with x264 and Theora people: all considered it one of the worst articles they had ever seen.

    They screwed up aspect ratio, luma levels, BT.601 vs BT.709, and so many other things in their comparison that you might as well throw the entire thing out. Plus they used an incredibly bad H.264 encoder. It’s practically a checklist of how to be completely incompetent idiots.

  71. Shiretoko Says:

    After the FFMS integration, will x264 be able to encode ranges of frames of the input file? Will you ever consider this feature?

  72. beyondtheeyes Says:

    @DS#70
    Oops, I should have put a smiley at the end of my post. Initially I thought that the moderator of this blog (yourself, I believe) would reject this link (for good reasons). You did not, and in the end you were right: it’s good to have your take here rather than having people just read that “article”. Do not forget that most people do not follow IRC, but many people read this blog ;-) I second your comments: I personally stopped at “I used Sorenson Squeeze to produce the H.264 files” :P

  73. Dark Shikari Says:

    @Shiretoko

    Probably. There’s an upcoming filter API that will add resizing, deinterlacing, cropping, and other such support, so anyone will be able to write a filter to encode ranges of the source.

  74. Fredrik Says:

    Walkthrough of some of the features in VP8:

    http://www.dspdesignline.com/howto/214303691;jsessionid=FZILVS2ROBCY3QE1GHOSKH4ATMY32JVN

  75. hyc Says:

    @kierank – RTMP *does* have encryption, several flavors of industry standards in fact. But it lacks mutual authentication. This is the same mistake that many other companies have made when they claim to implement security. (Like McAfee’s SecureSockets product from back in the 90s, but that one lacked integrity checks too, so you could just inject random noise into a securesocket stream and totally hose them…)

    People seem to think that “making it secure” just means encrypting a session; they think this is the only benefit that SSL/TLS are giving them. They’re wrong, and it’s because they miss the authentication aspect of security that their systems fall apart. Everything that Adobe slaps on top of RTMP to tighten their control is an exercise in futility because it’s already fundamentally broken.

    The even greater irony is that RTMPE was supposedly implemented because it has lower CPU cost than SSL, but their crypto libraries are obfuscated, which bloats their cipher code by a factor of 1,000-10,000. If you profile it against a plain-jane OpenSSL you’ll see that it’s several times slower than SSL/TLS. (The other reason for RTMPE of course was to avoid the certificate management overhead that comes with SSL. But as I already pointed out, dropping credential management / authentication from the system is precisely what makes RTMPE fundamentally insecure and insecurable.)

  76. Dom Says:

    FYI, Microsoft has announced some of the new features of Internet Explorer 9; they include HTML5 support of the video tag, including H.264, MPEG-4 (one assumes ASP), MP3 and AAC.

    http://www.microsoft.com/Presspass/press/2010/mar10/03-16MIX10Day2PR.mspx?rss_fdn=Press%20Releases

  77. Nil Einne Says:

    > One note about Microsoft helping *anybody*, many of us still remember similar “helpfulness” to use Microsoft technologies in useful ways only to have them turn on you and demand massive licensing fees once general acceptance had been achieved. See terminal services and FAT, for simple examples. Terminal Services used to not require CALs, until enough Citrix connections came about. Same with FAT where Microsoft encouraged every flash memory manufacturer to use it and suddenly smacked everyone with an invoice.

    I’m not that familiar with the Terminal Services example, but I don’t think the FAT example is comparable. For starters, I don’t know how much of a role Microsoft had in encouraging flash memory manufacturers to use it; I expect they did it to themselves. Remember, this was in 2003 and before; there was little alternative unless you wanted to lock yourself out of the Windows market (which probably still included Windows ME, not that that was particularly relevant; NTFS wasn’t any better). I guess cameras and the like could have used something else, but I’m just not convinced they really had much choice but to go with FAT.

    More significantly, while Silverlight has come under criticism as a possible part of the embrace, extend, extinguish model, and others have highlighted problems with the Moonlight Covenant, there is some belief (e.g. by Debian) that they’ve made an irrevocable patent grant, due to the Silverlight XAML vocabulary specification being released under a Microsoft Open Specification Promise.

    In other words, Microsoft has two avenues. They could in the future develop a new version of Silverlight which isn’t open and try to force everyone to upgrade, leaving FLOSS users in the cold (i.e. the embrace, extend, extinguish model). The FLOSS developers could of course continue on a different path, and it’ll once again be a crapshoot as to who supports what. Or they could act in bad faith and argue that their current promises, which some have interpreted as a perpetual grant to their patents, are incorrectly interpreted, and try to shut down existing software.

    This is quite different from the FAT case, where they never (at least publicly) made any offer of granting their patents royalty-free in perpetuity. They may have never enforced them, but in reality, hoping someone is not going to enforce their patents in the future just because they aren’t now is a bit silly, and it’s one of the reasons for Xiph.Org etc.

    BTW, I have to agree about Dirac being another candidate; it’s an interesting one, and I think I even pointed out that personally I felt Mozilla should have pushed harder on Dirac than Theora when they made their announcement of support last year or whenever. Yes, Theora was more mature, but it seemed to me Theora was never going to achieve any reasonable quality. With the world moving to H.264, convincing people to go to such a crappy codec wasn’t going to be easy on patent-avoidance grounds alone. Furthermore, someone mentioned DivX, which is an interesting point. While perhaps too late by then, the power of the P2P market can’t be ignored. If Dirac could have gained a good foothold there, it would likely have improved its chances significantly, including for hardware support. Theora was never going to have a chance there.

    Now of course there’s also the possibility for VP8, although I too am sceptical that Google is going to make a move like the FSF wants. In any case it seems VP7 might be just as important as VP8, although I guess we’ll get both.

    One final point: in terms of Flash’s problems, I suspect the annoying use of Flash in ads, something which isn’t really Adobe’s fault even if it’s something they like, hasn’t helped its popularity among the more general internet population.

    P.S. I agree with the comment on Flash ubiquity predating Windows XP. Heck, I remember some of my friends working with Flash stuff when I was in secondary school in the 90s.

  78. Nil Einne Says:

    Hahaha and in that long comment I realised I completely forgot about the first point which made me want to reply.

    Peter: I’m not sure whether you’re trying to download the actual document or simply the Google display of the document, but if it’s the former: while I don’t use wget (usually FDM), if you’re having problems downloading from Google Docs, try making the file public. That seems to work for me. It’s an authentication issue, I presume; there may be some way to fix it, but I never tried, since it’s not something I run into much.

  79. Nil Einne Says:

    Final post I promise!

    Just realised I forgot to thank Dark Shikari for the informative post. I never knew about the VC-1 history, and in particular it’s good to hear some confirmation of a lot of what I’d been thinking re: Theora & VP8. (Also interesting to see your more recent post on Dirac.)

  80. Scott Holden Says:

    Great post. Buckle up, we’re all about to find out how this all pans out. I’m cautiously optimistic that this will be in the best interest of open standards. Let’s hope for the best!

  81. Paul Eccles Says:

    So Google are planning to open source VP8.

  82. Charles Hill Says:

    Well, it looks like Google is open-sourcing the VP8 codec, so will we get a new post after you have time to actually look at what On2 did in code? :-)

    http://newteevee.com/2010/04/12/google-to-open-source-vp8-for-html5-video/

  83. Raptus Says:

    And the chips fall:
    http://newteevee.com/2010/04/12/google-to-open-source-vp8-for-html5-video/

  84. anon Says:

    I second the request by #10: What’s wrong with Gnash’s developers? I notice that neither Gnash nor swfdec deals very well with Flash content; why is this the case, given that the specs are freely published?

  85. avada Says:

    As I understand it, the patent problem is only an issue in the US and some other stupid countries. Why let them ruin HTML5 video for the whole world? Why not just implement H.264 and, if they ask for money, boycott those countries?

  86. Tom Wright Says:

    @Dark Shikari
    Moonlight and Mono have SIMD support whilst standard .NET/Silverlight does not.

  87. HM Says:

    Well, just one argument in defense of Flash … uninstall it. It’s an opt-in/opt-out situation; it’s a plugin. The user has the choice. I ask you now: what will your choice be with HTML5?

    Just food for thought.

  88. JT Says:

    “Over the years, as far as I can tell On2 never updated VP7”

    On2 stopped updating the “Personal Use” version some time ago, but VP7 has been updated over the years for their customers’ needs, and the latest version is superior to the personal-use version that you’ve had access to.

  89. Jeffrey Says:

    The Gnash developers aren’t incompetent; it’s like post #10 said: implementing the specifications is easy, but making it bug-compatible with the existing Flash Player is the incredibly hard part.

    They already figured out the stuff in the specification (way before the specs were released), just not the enormous number of quirks in Adobe’s implementation.

    Also, Cody Brocious’ Flash rendering code doesn’t even run. At least not with any of the Ruby environments I’ve tested it with. (Ruby 1.8, 1.9, 1.9.1)

  90. seoweb Says:

    Nice post.
    Anyway, the video tag is the right step. Flash for websites has always seemed like a bad idea to me. And in most cases search bots think the same ;)

  91. Me Says:

    Well, I am pleased to be reminded how the HTML5 video implementations are still far from the decency mark. A few ignorant Apple fanbois, conditioned by the cult to hate and denigrate Flash all day long, could at least learn a few hard facts from people who have a clue.

  92. Arioch Says:

    HM> flash … uninstall it … it’s an opt in / out … what will be your choice with html5.

    For Opera/Win32 I’d just remove the GStreamer folder altogether :-)
    Opera has also always had a menu to quickly turn graphics/animation/sound on and off. I expect they’d add video to the list. And it would be much easier than quickly toggling Flash content on and off.

    > Google… 2. Blitzkrieg.
    Never. Content makers and viewers would just flee to Vimeo and so on.
    I think, if they did it, they would just serve low-quality video over Flash and display a high-quality still-image advertisement for the new plugin or for Chrome.
    After all, in some environments users simply cannot install anything, so completely pulling the plug on Flash is not that good. But making it obviously inferior (via a large high-quality still image: see the difference for yourself) might do the trick without alienating users.

  93. BigJeff Says:

    @Arioch:

    Blitzkrieg is more than possible if done correctly.

    If Google creates VP8 plugins for IE 7&8, Safari, Opera, and (obviously) Chrome before throwing the switch on all their content they could easily hit 70-80% of users almost overnight. When most people see “Oh no! You need to download a plugin!” they just download the plugin.

    It’s a lot of work – particularly getting a deliverable format to non-HTML5 browsers – but it also means an instant alternative to H.264 for other content distributors. If 80% of users have a (possibly) patent-free alternative to H.264, the effort to switch becomes much less risky.

    It sounds like a nasty tactic, but Google’s claim to fame is releasing superior tech and thus completely wiping out the competition.

  94. Reason A Bubble Says:

    I don’t understand how some can say that Flash provides protection from content theft. There are many apps that enable downloading anything that can be played or viewed in a browser.

  95. jbsurveyer Says:

    Let’s take the contrarian view and say that many of the problems Apple is having with Flash performance are self-inflicted. Tests on Apple machines running MacOS and then Windows 7 [guaranteeing an apples-to-apples identical-hardware comparison] show Windows [and Linux too] easily outperforming the Mac, whether the graphics software is one of several games, OpenGL, Cinebench, or Flash. Second, Apple did not release the APIs for GPU hardware acceleration on the Mac to Adobe until May 10, 2010, yet despite this delay Adobe has over the past 2 years delivered steady improvements in speed, battery life, and features across Flash 9, 10.0, and 10.1. As for bugs, Secunia shows that Apple’s QuickTime has had three times as many security warnings as the Flash Player, almost all of high to extreme severity; both appear to apply patches in roughly the same 1-4 week period. And consider that the Flash Player is a much more complex piece of software than QuickTime. See details here – http://www.theopensourcery.com/keepopen/?p=2366.
    Now I suspect Steve Jobs is using Adobe Flash:
    1) As a scapegoat and distraction for the poor graphics performance of MacOS vis-a-vis Windows and Linux. If discovered, this would be a marketing/brand hit against Apple, because a big part of the Apple mystique is based on superior graphics performance.
    2) As a lockout method, along with the bans on Java and program generators for all iDevice development and delivered apps. This allows Apple to more readily obtain monopoly control, a la Microsoft, in a hugely expanding market – i.e. Apple iDevice apps cannot be easily ported to other smartphones, tablets, and other competing devices.
    Finally, the argument that HTML5 is open for business and can be used by developers not wanting to use Objective C is also empty. Steve knows that the critical HTML5 video and audio standards are 1-2 years away [and Apple is a significant holdout here], while the multi-touch + gestures HTML5 spec is more like 2-5 years away. Sure, Steve can demo a Safari-only set of multi-touch + gestures examples, but it simply does not work in any other browser and is nowhere near an HTML5 standard – the working group only got started late last year. It is a passingly weird experience … Steve Jobs morphing into Bill Gates right before our eyes.
