Diary Of An x264 Developer

05/25/2010 (11:01 pm)

Anatomy of an optimization: H.264 deblocking

Filed under: assembly,development,H.264,speed,x264 ::

As mentioned in the previous post, H.264 has an adaptive deblocking filter.  But what exactly does that mean — and more importantly, what does it mean for performance?  And how can we make it as fast as possible?  In this post I’ll try to answer these questions, particularly in relation to my recent deblocking optimizations in x264.

H.264’s deblocking filter has two steps: strength calculation and the actual filter.  The first step calculates the parameters for the second step.  The filter runs on all the edges in each macroblock.  That’s 4 vertical edges of length 16 pixels and 4 horizontal edges of length 16 pixels.  The vertical edges are filtered first, from left to right, then the horizontal edges, from top to bottom (order matters!).  The leftmost edge is the one between the current macroblock and the left macroblock, while the topmost edge is the one between the current macroblock and the top macroblock.
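To make the edge order concrete, here is a minimal sketch in C. It is not x264's actual code: the names `mb_edge_t` and `mb_edge_order` are made up for illustration, and it assumes luma edges spaced 4 pixels apart (the 4x4-transform case).

```c
/* Hypothetical types for illustration only; not taken from x264. */
typedef enum { EDGE_VERTICAL, EDGE_HORIZONTAL } edge_dir_t;

typedef struct {
    edge_dir_t dir; /* orientation of the edge */
    int offset;     /* distance in pixels from the macroblock's left (vertical)
                       or top (horizontal) border: 0, 4, 8 or 12 */
} mb_edge_t;

/* Fill out[] with the 8 luma edges of one macroblock in filtering order:
 * vertical edges left to right, then horizontal edges top to bottom.
 * Offset 0 is the edge shared with the left/top neighbour macroblock.
 * Assumes edges every 4 pixels. Returns the number of edges written. */
static int mb_edge_order(mb_edge_t out[8])
{
    int n = 0;
    for (int x = 0; x < 16; x += 4) {
        out[n].dir = EDGE_VERTICAL;
        out[n].offset = x;
        n++;
    }
    for (int y = 0; y < 16; y += 4) {
        out[n].dir = EDGE_HORIZONTAL;
        out[n].offset = y;
        n++;
    }
    return n;
}
```

Each edge in that list is 16 pixels long, so a strength is needed for four 4-pixel segments per edge.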

Here’s the formula for the strength calculation in progressive mode. The highest strength that applies is always selected.

If we’re on the edge between an intra macroblock and any other macroblock: Strength 4
If we’re on an internal edge of an intra macroblock: Strength 3
If either side of a 4-pixel-long edge has residual data: Strength 2
If the motion vectors on opposite sides of a 4-pixel-long edge are at least a pixel apart (in either x or y direction) or the reference frames aren’t the same: Strength 1
Otherwise: Strength 0 (no deblocking)

These values are then thrown into a lookup table depending on the quantizer: higher quantizers have stronger deblocking.  Then the actual filter is run with the appropriate parameters.  Note that Strength 4 is actually a special deblocking mode that performs a much stronger filter and affects more pixels.
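The strength rules listed above can be sketched as a single function for one 4-pixel edge segment. This is only an illustration under stated assumptions, not the x264 implementation: `edge_side_t` and `deblock_strength` are hypothetical names, each side is assumed to have a single motion vector and reference frame, and motion vectors are assumed to be stored in quarter-pixel units (so "a pixel apart" becomes a difference of 4). The quantizer lookup is left out.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical description of the block on one side of a 4-pixel edge
 * segment. Assumes one motion vector and one reference per side. */
typedef struct {
    bool intra;        /* the macroblock on this side is intra-coded */
    bool has_residual; /* the block on this side has residual data */
    int  ref;          /* reference frame used by this side */
    int  mv_x, mv_y;   /* motion vector, assumed quarter-pixel units */
} edge_side_t;

/* Return the deblocking strength (0..4) for one 4-pixel edge segment in
 * progressive mode. Rules are checked from strongest to weakest, so the
 * highest strength that applies wins. mb_edge is true when the segment
 * lies on the border between two macroblocks. */
static int deblock_strength(const edge_side_t *p, const edge_side_t *q,
                            bool mb_edge)
{
    if (p->intra || q->intra)
        return mb_edge ? 4 : 3;
    if (p->has_residual || q->has_residual)
        return 2;
    if (p->ref != q->ref ||
        abs(p->mv_x - q->mv_x) >= 4 ||  /* at least one full pixel apart */
        abs(p->mv_y - q->mv_y) >= 4)
        return 1;
    return 0; /* no deblocking */
}
```

Checking the rules in descending order is what makes "the highest strength that applies" fall out naturally, without computing every condition first.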

Read More…

05/19/2010 (9:30 am)

The first in-depth technical analysis of VP8

Filed under: google,VP8 ::

Back in my original post about Internet video, I made some initial comments on the hope that VP8 would solve the problems of web video by providing a supposed patent-free video format with significantly better compression than the current options of Theora and Dirac. Fortunately, it seems I was able to acquire access to the VP8 spec, software, and source a good few days before the official release and so was able to perform a detailed technical analysis in time for the official release.

The questions I will try to answer here are:

1. How good is VP8? Is the file format actually better than H.264 in terms of compression, and could a good VP8 encoder beat x264? On2 claimed 50% better than H.264, but On2 has always made absurd claims that they were never able to back up with results, so such a number is almost surely wrong. VP7, for example, was claimed to be 15% better than H.264 while being much faster, but was in reality neither faster nor higher quality.

2. How good is On2’s VP8 implementation? Irrespective of how good the spec is, is the implementation good, or is this going to be just like VP3, where On2 releases an unusably bad implementation with the hope that the community will fix it for them? Let’s hope not; it took 6 years to fix Theora!

3. How likely is VP8 to actually be free of patents? Even if VP8 is worse than H.264, being patent-free is still a useful attribute for obvious reasons. But as noted in my previous post, merely being published by Google doesn’t guarantee that it is. Microsoft did something similar a few years ago with the release of VC-1, which was claimed to be patent-free, but within mere months of release a whole bunch of companies claimed patents on it, and soon enough a patent pool was formed.

We’ll start by going through the core features of VP8. We’ll primarily analyze them by comparing to existing video formats.  Keep in mind that an encoder and a spec are two different things: it’s possible for a good encoder to be written for a bad spec or vice versa! This is why a really good MPEG-1 encoder can beat a horrific H.264 encoder.

But first, a comment on the spec itself.

Read More…

05/08/2010 (1:47 pm)

Taking submissions for encoder comparison

Filed under: Uncategorized ::

With VP8 supposedly going to come out in about 2 weeks, it’s time to get a rough idea as to the visual state of the art in terms of encoders.  Accordingly, I’m doing a small visual codec comparison in which we will take a few dozen encoders, encode a single test clip, and perform score-based visual tests on real humans using a blind test.  There will be no PSNR or SSIM results posted.

See the doom9 thread for more information and feel free to submit streams for your own encoders.  I’m particularly interested in some newer proprietary encoders for which I wouldn’t be able to get the software due to NDAs or similar (such as VP8, Sony Blu-code, etc), but for which I would be able to get a dump of the decoded output.

05/07/2010 (8:57 am)

Simply beyond ridiculous

Filed under: H.265,speed ::

For the past few years, various improvements on H.264 have been periodically proposed, ranging from larger transforms to better intra prediction.  These finally came together in the JCT-VC meeting this past April, where over two dozen proposals were made for a next-generation video coding standard.  Of course, all of these were in very rough-draft form; it will likely take years to filter them down into a usable standard.  In the process, they’ll pick the most useful features (hopefully) from each proposal and combine them into something a bit more sane.  But, of course, it all has to start somewhere.

A number of features were common: larger block sizes, larger transform sizes, fancier interpolation filters, improved intra prediction schemes, improved motion vector prediction, increased internal bit depth, new entropy coding schemes, and so forth.  A lot of these are potentially quite promising and resolve a lot of complaints I’ve had about H.264, so I decided to try out the proposal that appeared the most interesting: the Samsung+BBC proposal (A124), which claims compression improvements of around 40%.

The proposal combines a bouillabaisse of new features, ranging from a 12-tap interpolation filter to 12thpel motion compensation and transforms as large as 64×64.  Overall, I would say it’s a good proposal and I don’t doubt their results given the sheer volume of useful features they’ve dumped into it.  I was a bit worried about complexity, however, as 12-tap interpolation filters don’t exactly scream “fast”.

I prepared myself for the slowness of an unoptimized encoder implementation, compiled their tool, and started a test encode with their recommended settings.

Read More…