Diary Of An x264 Developer

10/23/2011 (12:09 pm)

The neutering of Google Code-In 2011

Filed under: development,GCI,google,x264 ::

Posting this from the Google Summer of Code Mentor Summit, at a session about Google Code-In!

Google Code-In is the most innovative open-source program I’ve ever seen.  It provided a way for students who had never done open source — or never even done programming — to get involved in open source work.  It made it easy for people who weren’t sure of their ability, who didn’t know whether they could do open source, to get involved and realize that yes, they too could do amazing work — whether that work was code useful to millions of people, documentation to make the code useful, translations to make it accessible, or something else entirely.  Hundreds of students had a great experience and learned new things, and many stayed around in open source projects afterwards because they enjoyed it so much!

x264 benefited greatly from Google Code-In.  Most of the high bit depth assembly code was written through GCI — literally man-weeks of work by a professional developer, done by high-schoolers who had never written assembly before!  Furthermore, we got loads of bugs fixed in ffmpeg/libav, a regression test tool, and more.  And best of all, we gained a new developer: Daniel Kang, who is now a student at MIT, an x264 and libav developer, and has gotten paid work applying the skills he learned in Google Code-In!

Some students in GCI complained about the system being “unfair”.  Task difficulties were inconsistent and there were many ways to game the system to get lots of points.  Some people complained about Daniel — he was completing a staggering number of tasks, so they must be too easy.  Yet many of the other students considered these tasks too hard.  I mean, I’m asking high school students to write hundreds of lines of complicated assembly code in one of the world’s most complicated instruction sets, and optimize it to meet extremely strict code-review standards!  Of course, there may have been valid complaints about other projects: I did hear from many students talking about gaming the system and finding the easiest, most “profitable” tasks.  Though, with the payout capped at $500, the only prize for gaming the system is a high rank on the points list.

According to people at the session, in an effort to make GCI more “fair”, Google has decided to change the system.  There are two big changes they’re making.

Firstly, Google is requiring projects to submit tasks on only two dates: the start, and the halfway point.  But in Google Code-In, we certainly had no idea at the start what types of tasks would be the most popular — or what new ideas would come up over time.  Often students would come up with ideas for tasks, which we could then add!  A waterfall-style plan-everything-in-advance model does not work for real-world coding.  The halfway-point addition may mitigate this somewhat, but it will still dramatically reduce the number of ideas that can be proposed as tasks.

Secondly, Google is requiring projects to submit at least 5 tasks in each of eight categories just to apply: quality assurance, translation, documentation, coding, outreach, training, user interface, and research.  For large projects like Gnome, this is easy: they can certainly come up with 5 of each on such a large, general project.  But for a small, focused project, some of these categories are often completely irrelevant.  This rules out a huge number of smaller projects that just don’t have relevant work in all these categories.  x264 may be saved here: as we work under the Videolan umbrella, we’ll likely be able to fudge enough tasks from Videolan to cover the gaps.  But hundreds of other organizations are going to be out of luck.  It would make more sense to require tasks in, say, 5 out of the 8 categories, to allow some flexibility while still encouraging interesting non-coding tasks.

For example, what’s “user interface” for a software library with a stable API, say, a libc?  Can you make 5 tasks out of it that are actually useful?

If x264 applied on its own, could you come up with 5 real, meaningful tasks in each category for it?  It might be possible, but it’d require a lot of stretching.

How many smaller or more-focused projects do you think are going to give up and not apply because of this?

Is GCI supposed to be something for everyone, or just for Gnome, KDE, and other megaprojects?

09/30/2010 (7:48 pm)

H.264 and VP8 for still image coding: WebP?

Update: post now contains a Theora comparison as well; see below.

JPEG is a very old lossy image format.  By today’s standards, it’s awful compression-wise: practically every video format since the days of MPEG-2 has been able to tie or beat JPEG at its own game.  The reasons people haven’t switched to something more modern practically always boil down to a simple one — it’s just not worth the hassle.  Even if JPEG can be beaten by a factor of 2, convincing the entire world to change image formats after 20 years is nigh impossible.  Furthermore, JPEG is fast, simple, and practically guaranteed to be free of any intellectual property worries.  It’s been tried before: first JPEG-2000 and then Microsoft’s JPEG XR tried to unseat JPEG.  Neither got much of anywhere.

Now Google is trying to dump yet another image format on us, “WebP”.  But really, it’s just a VP8 intra frame.  There are some obvious practical problems with this new image format in comparison to JPEG; it doesn’t even support all of JPEG’s features, let alone many of the much-wanted features JPEG was missing (alpha channel support, lossless support).  It only supports 4:2:0 chroma subsampling, while JPEG can handle 4:2:2 and 4:4:4.  Google doesn’t seem interested in adding any of these features either.
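
To put the 4:2:0 restriction in concrete terms, here is a minimal sketch (plain C, with a hypothetical chroma_bytes() helper, not taken from any codec’s source) of how much chroma data each subsampling scheme stores for a 1080p frame:

#include <stdio.h>

/* Hypothetical helper: bytes of 8-bit chroma data per frame (both planes)
 * for a given subsampling scheme.  4:2:0 halves chroma in both dimensions,
 * 4:2:2 halves it horizontally only, 4:4:4 keeps full resolution. */
unsigned chroma_bytes(unsigned w, unsigned h, int hshift, int vshift)
{
    unsigned cw = (w + (1 << hshift) - 1) >> hshift; /* round up odd sizes */
    unsigned ch = (h + (1 << vshift) - 1) >> vshift;
    return 2 * cw * ch; /* two chroma planes: Cb + Cr */
}

int main(void)
{
    unsigned w = 1920, h = 1080;
    printf("4:2:0: %u bytes of chroma per frame\n", chroma_bytes(w, h, 1, 1));
    printf("4:2:2: %u bytes of chroma per frame\n", chroma_bytes(w, h, 1, 0));
    printf("4:4:4: %u bytes of chroma per frame\n", chroma_bytes(w, h, 0, 0));
    return 0;
}

4:2:0 keeps only a quarter of the chroma samples of 4:4:4 — usually fine for photographic content, but exactly the kind of limitation that makes a general-purpose JPEG replacement a hard sell.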

But let’s get to the meat and see how these encoders stack up on compressing still images.  As I explained in my original analysis, VP8 has the advantage of H.264's intra prediction, which is one of the primary reasons why H.264 has such an advantage in intra compression.  It only has i4x4 and i16x16 modes, not i8x8, so it’s not quite as fancy as H.264's, but it comes close.
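
For readers who haven’t seen spatial intra prediction before, the idea behind the i4x4 modes is simple: predict each block from its already-decoded neighbors, then code only the residual.  Below is a simplified sketch of three of the modes in plain C; this is an illustration of the concept, not code from x264 or libvpx:

#include <stdint.h>
#include <string.h>

/* "left" holds the 4 reconstructed pixels to the left of the block,
 * "top" the 4 reconstructed pixels above it. */

void predict_4x4_vertical(uint8_t dst[4][4], const uint8_t top[4])
{
    for (int y = 0; y < 4; y++)
        memcpy(dst[y], top, 4); /* every row repeats the pixels above */
}

void predict_4x4_horizontal(uint8_t dst[4][4], const uint8_t left[4])
{
    for (int y = 0; y < 4; y++)
        memset(dst[y], left[y], 4); /* every row repeats the pixel to its left */
}

void predict_4x4_dc(uint8_t dst[4][4], const uint8_t top[4], const uint8_t left[4])
{
    int sum = 0;
    for (int i = 0; i < 4; i++)
        sum += top[i] + left[i];
    uint8_t dc = (uint8_t)((sum + 4) >> 3); /* rounded average of the 8 neighbors */
    for (int y = 0; y < 4; y++)
        memset(dst[y], dc, 4);
}

The encoder tries each mode, picks whichever leaves the cheapest residual to code, and signals that choice; the directional modes (there are several more than shown here) are a large part of why intra-coded detail is so much cheaper than in JPEG, which has no spatial prediction at all.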

Read More…

07/23/2010 (4:01 pm)

Announcing the world’s fastest VP8 decoder: ffvp8

Filed under: ffmpeg,google,speed,VP8 ::

Back when I originally reviewed VP8, I noted that the official decoder, libvpx, was rather slow.  While there was no particular reason that it should be much faster than a good H.264 decoder, it shouldn’t have been that much slower either!  So, I set out with Ronald Bultje and David Conrad to make a better one in FFmpeg.  This one would be community-developed and free from the beginning, rather than the proprietary code-dump that was libvpx.  A few weeks ago the decoder was complete enough to be bit-exact with libvpx, making it the first independent free implementation of a VP8 decoder.  Now, with the first round of optimizations complete, it should be ready for primetime.  I’ll go into some detail about the development process, but first, let’s get to the real meat of this post: the benchmarks.
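
(For context, the benchmark methodology itself is nothing exotic: decode the same streams with each decoder and compare wall-clock time.  A minimal timing harness might look something like the sketch below; decode_frame() here is a placeholder, not the actual FFmpeg or libvpx API.)

#include <stdio.h>
#include <time.h>

/* Placeholder for the real decode call; an actual benchmark would feed
 * demuxed VP8 packets to the FFmpeg or libvpx decoder here. */
extern int decode_frame(void *decoder, const void *packet);

static double now_sec(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec * 1e-9;
}

double benchmark_decode(void *decoder, const void **packets, int n)
{
    double start = now_sec();
    for (int i = 0; i < n; i++)
        decode_frame(decoder, packets[i]); /* decode every frame once */
    double elapsed = now_sec() - start;
    printf("%d frames in %.3f s (%.1f fps)\n", n, elapsed, n / elapsed);
    return elapsed;
}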

Read More…

05/19/2010 (9:30 am)

The first in-depth technical analysis of VP8

Filed under: google,VP8 ::

Back in my original post about Internet video, I made some initial comments on the hope that VP8 would solve the problems of web video by providing a supposedly patent-free video format with significantly better compression than the current options of Theora and Dirac. Fortunately, I was able to acquire access to the VP8 spec, software, and source a good few days early, and so was able to perform a detailed technical analysis in time for the official release.

The questions I will try to answer here are:

1. How good is VP8? Is the file format actually better than H.264 in terms of compression, and could a good VP8 encoder beat x264? On2 claimed 50% better than H.264, but On2 has always made absurd claims that they were never able to back up with results, so such a number is almost surely wrong. VP7, for example, was claimed to be 15% better than H.264 while being much faster, but was in reality neither faster nor higher quality.

2. How good is On2's VP8 implementation? Irrespective of how good the spec is, is the implementation good, or is this going to be just like VP3, where On2 releases an unusably bad implementation with the hope that the community will fix it for them? Let’s hope not; it took 6 years to fix Theora!

3. How likely is VP8 to actually be free of patents? Even if VP8 is worse than H.264, being patent-free is still a useful attribute for obvious reasons. But as noted in my previous post, merely being published by Google doesn’t guarantee that it is. Microsoft did something similar a few years ago with the release of VC-1, which was claimed to be patent-free — but within mere months after release, a whole bunch of companies claimed patents on it and soon enough a patent pool was formed.

We’ll start by going through the core features of VP8. We’ll primarily analyze them by comparing them to existing video formats.  Keep in mind that an encoder and a spec are two different things: it’s possible for a good encoder to be written for a bad spec or vice versa! Hence why a really good MPEG-1 encoder can beat a horrific H.264 encoder.

But first, a comment on the spec itself.

Read More…

03/18/2010 (10:29 pm)

Announcing x264 Summer of Code 2010!

Filed under: development,google,GSOC,x264 ::

With the announcement of Google Summer of Code 2010 and the acceptance of our umbrella organization, Videolan, we are proud to announce the third x264 Summer of Code!  After two years of progressively increasing success, we expect this year to be better than ever.  Last year’s successes include ARM support and weighted P-frame prediction.  This year we have a wide variety of projects of varying difficulty, including some old ones and a host of new tasks.  The qualification tasks are tough, so if you want to get involved, the sooner the better!

Interested in getting started?  Check out the wiki page, hop on #x264 on Freenode IRC, and say hi to the gang!  No prior experience or knowledge in video compression necessary: just dedication and the willingness to ask questions and experiment until you figure things out.

02/22/2010 (3:05 pm)

Flash, Google, VP8, and the future of internet video

Filed under: google,H.264,HTML5,Theora,VP8,x264 ::

This is going to be a much longer post than usual, as it’s going to cover a lot of ground.

The internet has been filled for quite some time with an enormous number of blog posts complaining about how Flash sucks — so much that it’s sounding as if the entire internet is crying wolf.  But, of course, despite the incessant complaining, they’re right: Flash has terrible performance on anything other than Windows x86, and Adobe doesn’t seem to care at all.  But rather than repeat this ad nauseam, let’s be a bit more intellectual and try to figure out what happened.

Flash became popular because of its power and flexibility.  At the time it was the only option for animated vector graphics and interactive content (stuff like VRML hardly counts).  Furthermore, before Flash, the primary video options were Windows Media, Real, and Quicktime: all of which were proprietary, had no free software encoders or decoders, and (except for Windows Media) required the user to install a clunky external application, not merely a plugin.  Given all this, it’s clear why Flash won: it supported open multimedia formats like H.263 and MP3, used an ultra-simple container format that anyone could write (FLV), and worked far more easily and reliably than any alternative.

Thus, Adobe (actually, at the time, Macromedia) got their 98% install base.  And with that, they began to become complacent.  Any suggestion of a competitor was immediately shrugged off; how could anyone possibly compete with Adobe, given their install base?  It’d be insane; nobody would be able to do it.  They committed the cardinal sin of software development: believing that a competitor being better is excusable.  At x264, if we find a competitor that does something better, we immediately look into trying to put ourselves back on top.  This is why x264 is the best video encoder in the world.  But at Adobe, this attitude clearly faded after they became the monopoly.  This is the true danger of monopolies: they stymie development because the monopolist has no incentive to improve their product.

In short, they drank their own Kool-aid.  But they were wrong about a few critical points.

Read More…

05/02/2008 (9:04 am)

Google summer of code

Filed under: google,GSOC,x264 ::

This year, x264 is participating in Google Summer of Code. We have accepted four of the roughly 15 applications we received (out of Videolan’s roughly 80 applications and 14 slots). The projects are as follows:

Robert Deaton (masquerade): Improve B-frame decision, in terms of the number of B-frames, direct modes, frame ordering, reference frame usage, etc.

Joey Degges (keram): Improve inter mode search and decision, and experiment with more psychovisual optimizations.

Aki Jäntti (Kuukunen): Use a “macroblock tree” structure to measure the temporal importance of various parts of the image.

Holger Lubitz (holger): General assembly optimizations and speed improvements.

Good luck to all participants!