Diary Of An x264 Developer

09/30/2010 (7:48 pm)

H.264 and VP8 for still image coding: WebP?

Update: post now contains a Theora comparison as well; see below.

JPEG is a very old lossy image format.  By today’s standards, it’s awful compression-wise: practically every video format since the days of MPEG-2 has been able to tie or beat JPEG at its own game.  The reasons people haven’t switched to something more modern practically always boil down to a simple one — it’s just not worth the hassle.  Even if JPEG can be beaten by a factor of 2, convincing the entire world to change image formats after 20 years is nigh impossible.  Furthermore, JPEG is fast, simple, and practically guaranteed to be free of any intellectual property worries.  It’s been tried before: JPEG-2000 first, then Microsoft’s JPEG XR, both tried to unseat JPEG.  Neither got much of anywhere.

Now Google is trying to dump yet another image format on us, “WebP”.  But really, it’s just a VP8 intra frame.  There are some obvious practical problems with this new image format in comparison to JPEG; it doesn’t even support all of JPEG’s features, let alone many of the much-wanted features JPEG was missing (alpha channel support, lossless support).  It only supports 4:2:0 chroma subsampling, while JPEG can handle 4:2:2 and 4:4:4.  Google doesn’t seem interested in adding any of these features either.

But let’s get to the meat and see how these encoders stack up on compressing still images.  As I explained in my original analysis, VP8 has the advantage of H.264’s intra prediction, which is one of the primary reasons why H.264 has such an advantage in intra compression.  It only has i4x4 and i16x16 modes, not i8x8, so it’s not quite as fancy as H.264’s, but it comes close.

The test files are all around 155KB; download them for the exact filesizes.  For all three, I did a binary search of quality levels to get the file sizes close.  For x264, I encoded with --tune stillimage --preset placebo.  For libvpx, I encoded with --best.  For JPEG, I encoded with ffmpeg, then applied jpgcrush, a lossless jpeg compressor.  I suspect there are better JPEG encoders out there than ffmpeg; if you have one, feel free to test it and post the results.  The source image is the 200th frame of Parkjoy, from derf’s page (fun fact: this video was shot here!  More info on the video here.).
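
For anyone who wants to reproduce the size matching, here's a minimal sketch of that binary search over quality levels. The encoder invocation and filenames below are placeholders, not my exact commands; substitute whichever encoder you're testing.

import subprocess
from pathlib import Path

TARGET_BYTES = 155 * 1024          # aim for roughly the same ~155KB as the other files
SOURCE = "parkjoy_frame200.y4m"    # placeholder name for the extracted source frame

def encoded_size(crf: float) -> int:
    # Placeholder invocation: swap in the encoder under test
    # (x264 --tune stillimage --preset placebo, vpxenc --best, a JPEG encoder, ...).
    out = Path("out.264")
    subprocess.run(["x264", "--tune", "stillimage", "--preset", "placebo",
                    "--crf", f"{crf:.2f}", "-o", str(out), SOURCE], check=True)
    return out.stat().st_size

# Binary search on the quality level: higher CRF -> smaller file.
lo, hi = 10.0, 50.0
for _ in range(20):
    mid = (lo + hi) / 2
    size = encoded_size(mid)
    if abs(size - TARGET_BYTES) < 512:   # within half a kilobyte is close enough
        break
    if size > TARGET_BYTES:
        lo = mid                         # too big: raise CRF (lower quality)
    else:
        hi = mid                         # too small: lower CRF (higher quality)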

Files: (x264 [154KB], vp8 [155KB], jpg [156KB])

Results (decoded to PNG): (x264, vp8, jpg)

This seems rather embarrassing for libvpx.  Personally I think VP8 looks by far the worst of the bunch, despite JPEG’s blocking.  What’s going on here?  VP8 certainly has better entropy coding than JPEG does (by far!).  It has better intra prediction (JPEG has just DC prediction).  How could VP8 look worse?  Let’s investigate.

VP8 uses a 4×4 transform, which tends to blur and lose more detail than JPEG’s 8×8 transform.  But that alone certainly isn’t enough to create such a dramatic difference.  Let’s investigate a hypothesis — that the problem is that libvpx is optimizing for PSNR and ignoring psychovisual considerations when encoding the image… I’ll encode with --tune psnr --preset placebo in x264, turning off all psy optimizations.  

Files: (x264, optimized for PSNR [154KB]) [Note for the technical people: because adaptive quantization is off, to get the filesize on target I had to use a CQM here.]

Results (decoded to PNG): (x264, optimized for PSNR)

What a blur!  Only somewhat better than VP8, and still worse than JPEG.  And that’s using the same encoder and the same level of analysis — the only thing done differently is dropping the psy optimizations.  Thus we come back to the conclusion I’ve made over and over on this blog — the encoder matters more than the video format, and good psy optimizations are more important than anything else for compression.  libvpx, a much more powerful encoder than ffmpeg’s jpeg encoder, loses because it tries too hard to optimize for PSNR.

These results raise an obvious question — is Google nuts?  I could understand the push for “WebP” if it was better than JPEG.  And sure, technically as a file format it is, and an encoder could be made for it that’s better than JPEG.  But note the word “could”.  Why announce it now when libvpx is still such an awful encoder?  You’d have to be nuts to try to replace JPEG with this blurry mess as-is.  Now, I don’t expect libvpx to be able to compete with x264, the best encoder in the world — but surely it should be able to beat an image format released in 1992?

Earth to Google: make the encoder good first, then promote it as better than the alternatives.  The reverse doesn’t work quite as well.

Addendum (added Oct. 2, 03:51):

maikmerten gave me a Theora-encoded image to compare as well.  Here’s the PNG and the source (155KB).  And yes, that’s Theora 1.2 (Ptalarbvorm) beating VP8 handily.  Now that is embarrassing.  Guess what the main new feature of Ptalarbvorm is?  Psy optimizations…

Addendum (added Apr. 20, 23:33):

There’s a new webp encoder out, written from scratch by skal (available in libwebp).  It’s significantly better than libvpx — not like that says much — but it should probably beat JPEG much more readily now.  The encoder design is rather unique — it basically uses K-means for a large part of the encoding process.  It still loses to x264, but that was expected.

[155KB]

126 Responses to “H.264 and VP8 for still image coding: WebP?”

  1. Yanito Candra Says:

    This is only a technology preview, version 0.0.1 of WebP. We'll see future development and improvement.

  2. Dark Shikari Says:

    @Yanito

    Changes to “webp” don’t change the VP8 video format (or libvpx encoder) that it’s based on.

  3. LIu Liu Says:

    Personally, I think that JPEG XR is a far better alternative since it has a good lossless format and a more sophisticated prediction scheme (compared to JPEG). But Microsoft just didn’t promote it hard enough …

  4. av500 Says:

    @LIu Liu: it’s not about promoting, it’s about 20 years of JPEG, like DS said :)

  5. verb3k Says:

    Is it true that VP8 encoding might come to x264? Or was it just trolling/a joke? :D

  6. Scaevolus Says:

    Can you link to the source frame, and note the filesizes of the various compressed versions?

  7. Dark Shikari Says:

    @Scaevolus

    Done and done.

  8. Joshua Says:

    We already have a great replacement for jpg: .PNG

  9. Patcito Says:

    > it doesn’t even support all of JPEG’s features, let alone many of the much-wanted features JPEG was missing (alpha channel support, lossless support). It only supports 4:2:0 chroma subsampling, while JPEG can handle 4:2:2 and 4:4:4. Google doesn’t seem interested in adding any of these features either.

    Wrong:

    > We plan to add support for a transparency layer, also known as alpha channel in a future update.

    from:

    http://googlecode.blogspot.com/2010/09/webp-new-image-format...

  10. Dark Shikari Says:

    @Patcito

    Sounds better than expected then, but don’t count your chickens before they hatch. They said they’d make libvpx good too, and look where that’s gone — there hasn’t been much of any work on the encoder in the past month.

  11. Drazick Says:

    DS, what are your thoughts on JPEG XR?
    Could you add it to the comparison?
    Could your psy optimizations be implemented in JPEG XR / WebP?

    Anyhow, it is really about time to let go of JPEG.

  12. Patcito Says:

    @Dark many googlers are on vacation during August/September, don’t expect much work done.

  13. Dark Shikari Says:

    @Drazick

    I don’t know much about JPEG-XR, as was demonstrated today on #theora when I didn’t even realize that it used a lapped transform. In terms of psy, the only bitstream feature that really matters enormously is whether or not it supports adaptive quantization.

    If you can do a test with a JPEG-XR encoder, I’d be happy to post it, but keep in mind that I’ve heard the official encoder is not very good…

  14. Matt Says:

    Why didn’t they replace GIF if they have this fancy video codec turned image compressor? That would actually be useful.

  15. Dark Shikari Says:

    @Matt

    That’s something that’s been tried for a while… to begin with there was MPNG and APNG (to try to get an animated GIF with 24-bit color support). But none of these formats even have motion compensation… not even basic fullpel.

  16. Lazy Miha Says:

    Why not http://en.wikipedia.org/wiki/Progressive_Graphics_File? It is the most suitable format to date.

  17. KJ645 Says:

    Is there a way to compare the decoding complexity of JPEG vs. WebP? I.e., will WebP make my cell phone lava hot and kill my battery life for its compression “improvements”?

  18. Dark Shikari Says:

    @KJ645

    “WebP”, aka VP8, is definitely much more complex. Most importantly, it uses arithmetic coding and a complex deblocking filter, both of which are not present in JPEG. I would guess it’s at least 3 times as intensive to decode.

  19. Sape Says:

    Hello!

    1. I think it is a terrible idea to test still image compression with a still taken from a video shot with a video recorder, not an actual camera. Since even the source is an inferior image, none of the coders can really shine if they already have to use a crappy source. You can grab a whole lot of amazing pictures under a Creative Commons license.
    2. You keep repeating that it is not the format that is important for the perceived quality but the psychovisual optimizations. Does this mean that e.g. your psy optimization work for x264 could be “ported” to produce only those types of frames and features that are also present in VP8, and then we would have much better looking VP8 encodes? (I also see no point in actually doing it, since I suppose it would still be a bit inferior to x264.)

    Thanks,
    Sape

  20. Fruit Says:

    Imho an h.264-based still image format should have been standardised years ago already… not necessarily just for web use; as said, jpeg has been here for 20 years. I would even be fine with Apple pushing the format (where is that Jobs bloke when one needs him, heh).

  21. Dark Shikari Says:

    @Sape

    That video is taken on 65mm film by a camera that costs more than most houses — it is higher quality than almost any image taken by any “photo camera”. I highly doubt your average Creative Commons images even have a quarter the detail that an Arriflex 765 can take.

    Yes, some of x264’s psy optimizations can in theory be ported to work on VP8. Adaptive quantization is the iffy one, as VP8 doesn’t have delta quantizers. It only has “segments”, which cost roughly 2 bits per macroblock to signal, and you can only have 4 of them. Ptalarbvorm has demonstrated that you can get a pretty good portion of the benefit without the precision of H.264’s quantizers (i.e. with only a few quants to pick from), but I’m not so sure about the cost of the segments. 2 bits per macroblock probably isn’t too bad for images though; it’s likely much worse for actual videos.
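
    For a rough sense of scale, here's a back-of-the-envelope sketch of that segment overhead (1080p is picked purely as an example resolution, not the test image):

    # Rough estimate of VP8 segment-signaling overhead at ~2 bits per macroblock.
    width, height = 1920, 1080
    macroblocks = ((width + 15) // 16) * ((height + 15) // 16)   # 120 * 68 = 8160
    overhead_kib = macroblocks * 2 / 8 / 1024                    # ~2.0 KiB per frame
    print(f"{macroblocks} macroblocks, ~{overhead_kib:.1f} KiB of signaling")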

  22. Sape Says:

    @Dark
    1. I have just had another look at the source, and I still think it is not a big bang as a still image. But of course this is subjective.
    1/b. As for CCL images: I did not say all are brilliant, but you can find several of them that are.
    1/c. Go eg. here:
    http://www.dpreview.com/galleries/reviewsamples/albums/canon-eos-1d-mark-iv-review-samples#page=1
    or here:
    http://www.dpreview.com/galleries/reviewsamples/albums/nikon-d3s-review-samples#page=1
    for some professional images. These are not CCL-ed though; you will have to ask for permission if you want to publish. (Personal use is allowed.)

    2. Yeah, I also felt it was not really worth it (for you). Google OTOH should pay you to do what is possible if they care… :)

  23. schnaader Says:

    The JPEG output image can be further compressed using PackJPG (http://www.elektronik.htw-aalen.de/packjpg/) to 136711 bytes, losslessly (it’s not a JPG anymore afterwards, but the original JPG can be restored).

  24. przemo_li Says:

    So you say that:
    if sizes are the same, the quality of WebP is inferior to JPEG.

    And Google says that:
    if quality is the same, the size of WebP is superior to JPEG.

    It sounds quite strange.

    BTW, Google targeted a PSNR of 42 for all files. So in their comparison PSNR was not such an important factor.

  25. Chrysler Says:

    I did a quick test of how JPEG XR performs. The Microsoft Expression encoder seems to be more efficient than the reference implementation, so I used that one. The nearest I could get was 160 KB:
    http://www.speedyshare.com/files/24494880/source.wdp
    As PNG: http://img3.imagebanana.com/img/skxxylcl/source_jxr.png

  26. Brandon Says:

    If anything should become mainstream, it should be Jpeg XR. Much better quality for the size. File formats are lame. Hard to change, hard to move.

  27. Chris Carpenter Says:

    Well, another good overview of a new format! (I would assume, I don’t really know much about video/image formats/encoding myself)

    However, I believe that you are wrong in one point:

    “Earth to Google: make the encoder good first, then promote it as better than the alternatives. The reverse doesn’t work quite as well.”

    The Open Source/Free Software way is to “Release early, release often”. Linus Torvalds released the Linux kernel when it could barely do anything, and look how well it did! I’m not saying that this is going to work for Google (it doesn’t always work out), but it is the way it is supposed to be for Open Source software. Only proprietary companies worry about the software looking good when it is first released. Google (I would assume) is counting on the support of many programmers (not unlike yourself) to improve this software. It’s the Open Source/Free Software way.

    P.S. I just realized that I assumed (without any direct evidence that I can find) they released/are releasing the source code for their image converter and whatever other tools are required. If they aren’t, then the comment above is basically invalid. Releasing early and often works best for open source, not nearly as well for proprietary code. Either way, though, they do specifically say it is a developer release… so it isn’t supposed to be ready for production use yet.

  28. Nagilum Says:

    The official jpeg tools work just fine:
    pngtopnm source.png|cjpeg -quality 18 -progressive -dct float -outfile source.jpg
    The resulting jpeg is 152K and looks much like the x264 version imho.

  29. Stefano Says:

    What about jpeg2000? It is way better than jpeg, and it’s been available for years now, but no-one ever talks about it…

  30. Aslak Raanes Says:

    How does this compare to libjpeg’s (from v7 or 8) arithmetic encoder? Using “jpegtran -arithmetic” usually gives a 10-20% size reduction. I don’t know if the IJG jpeg code has implemented other parts of T.851.

    If Google wanted alpha channel support they could have implemented the JNG subset of MNG.

  31. Jeff Muizelaar Says:

    What did you use to decode the jpeg? output.jpg appears washed out compared to jpeg.png when I decode it.

  32. Orthochronous Says:

    An interesting question is who WebP is aimed at: is it purely for semi-pro photographers who know lots of technical stuff already, or is it aimed at anyone who’s putting their snaps online? If it’s aimed at everyone, then the other thing that would be interesting would be investigating what happens if you take an image that’s already been compressed on output from the camera (e.g., low-end consumer cameras that don’t offer RAW output, only jpeg), transcode it, and see what happens in terms of transcoding artifacts. With my image analysis hat on I’m shuddering at the thought, but the human eye is sensitive to different things. So they might be negligible, but they can sometimes be visible even when looking at the images at full magnification. In particular, one would expect that a representation that is closer to JPEG would do marginally better (since in general the re-encoder doesn’t know which parts of the reconstructed-from-jpeg image are artifacts and which are from the original image).

  33. Warren Says:

    Hmm, why would you encode still images that way? If I take the source image and save it with Photoshop as jpg @ 20 percent (312kB), and save it again as png (3.21MB), I get far better results than your x264 example.

    Can’t save it as WebP that way yet (obviously) but if your standard jpg example is already so far off, I doubt that your WebP example is accurate either.

  34. Manuel Says:

    You seem very touchy about WebM etc.

    Maybe you’re slowly understanding that you are on the wrong side, the MPEG LA side.

  35. Ed Says:

    Well, from simply looking at those pictures, it just proves why we don’t need to replace JPEG. At least when it is scaled down (zooming in shows it loses a lot of information).

    But 90% of the world doesn’t care, at least on the Web. If we need high quality we use high-bitrate JPEG or PNG to solve the problem. Casual web surfers don’t want the hassle of software incompatibilities to save only 20% of the image file size. If we could get the same image quality at only 20% of the file size I suspect the world would pick it up, but that is not even possible with the best encoder we have: x264.

    The only image format that I think could succeed JPEG is an H.264-based image format where the decoding complexity is offloaded to a widely available H.264 hardware decoder. Not to mention a small update to the Flash player would immediately see 90%+ of the internet being able to view H.264 images. But of course, in an imperfect world that is not going to happen due to stupid human politics…

  36. KeyJ Says:

    @Manuel: He’s rather on the “let’s use the technically best solution and don’t care about the legal bullshit” side.

  37. Witold Baryluk Says:

    webp looks basically like jpeg with a deblocking filter applied. It is useless, as a deblocking filter can be easily added to jpeg without any compatibility problems.

    x264 is much more detailed.

    I think adding mandatory deblocking to jpeg, growing blocks from 8×8 to 32×32 (yes, not 16×16, as 32×32 isn’t a performance problem for still images IMHO), and improving “psy” encoding (by, for example, using variable quality for each block – detecting foreground, background, etc.) is a better idea. Much more compatible, an easier decoder to implement (hey, it is just a tweaked jpeg), and probably better than both jpeg and webp.

    For me webp is a no-go, as it does not have HDR support or any alpha channel support. Size improvements do not matter to me so much (Google cares, I understand), but I do not want to store both webp and jpeg for compatibility and dynamically serve one or the other depending on which one the browser supports.

    jpeg2000 would actually be much better, as it does not have many jpeg artifacts, and has an alpha channel and a lossless mode.

  38. Travis Says:

    I notice that the sample images here are all compressed WAY more than anyone would ever compress them for actual use. Go look at the pictures on flickr, and you’re going to have a REALLY hard time finding images that have their JPEG quality settings set low enough to produce an image that looks that bad. Is it possible that VP8/WebP does better (particularly in comparison to JPEG) when using higher quality settings?

  39. Andy Baker Says:

    I hope Joshua is joking:
    “We already have a replacement for JPEG. It’s called PNG.”

    And who needs h264 when you can have folders full of numbered BMPs

    This is someone reading and commenting on the x264 blog. You would hope they know the difference between lossy and lossless compression.

  40. Dark Shikari Says:

    @Travis

    Highly doubtful; there’s nothing about VP8 that would make it better at high rates — in fact, by design, it should be better at low rates (as should H.264). The reason to compare at low rates is simply that it’s easier to see a difference — people love to compare at high rates and say “there’s not much difference” even when the difference is large enough that it represents a factor of 50% in bitrate.

  41. Dark Shikari Says:

    @Warren

    Well obviously at 312kb, your jpeg will look much better than at 156kb!

  42. Dark Shikari Says:

    @Jeff

    I used an ffmpeg set up explicitly for the purpose of avoiding colorspace range conversion, since that plagues so many comparisons. That is, I kept the colorspace in TV-range YV12 for both encoding and decoding, without conversion, because of the mess that causes.

  43. Mathias Picker Says:

    Why not use PGF? http://www.libpgf.org/

    Lossy&&lossless, fast, proven, existing library, progressive loading.

    Seems to be the smarter move.

    Cheers, Mathias

  44. Dark Shikari Says:

    @przemo_li

    Of course PSNR is important. If you optimize for PSNR, your image will look worse at the same PSNR than with an encoder that optimizes for psy. x264 will often look better at 40db (optimizing for psy) than at 43db (optimizing for psnr).
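
    (For reference, a minimal sketch of the PSNR numbers being thrown around here, assuming 8-bit images already loaded as NumPy arrays:)

    import numpy as np

    def psnr(a: np.ndarray, b: np.ndarray) -> float:
        # PSNR in dB between two 8-bit images; higher is "better" by this metric,
        # but as argued above it does not track perceptual quality.
        mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
        return float("inf") if mse == 0 else 10 * np.log10(255.0 ** 2 / mse)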

  45. Witold Baryluk Says:

    @Stefano: JPEG2000 is SLOW.

    PGF looks much better in this respect.

  46. mpz Says:

    While we’re on the subject of experimental image compression, I’d like to direct your attention to DLI: http://sites.google.com/site/dlimagecomp/

    It beats the pants off of JPEG2000 and the other well known (experimental) image coder called ADCTC.

  47. Dark Shikari Says:

    @mpz

    I’ve seen that one before; iirc, last I tested it, it was competitive with x264 and even beat it significantly at times. Given that it was adapted for insanely high compression times, of course, this is not very surprising. Still rather impressive.

  48. Kruz Says:

    My subjective opinion is that the VP8 (decoded to PNG) frame looks much better.

    Anyway, with more bandwidth, bigger hard drives (and hopefully, in some decades, quantum computers), we should rather use PNG, which is lossless, and FLAC for audio :)

  49. compare Says:

    See a thorough comparison of JPEG XR vs JPEG 2000 at http://www.compression.ru/video/codec_comparison/pdf/wmp_codec_comparison_en.pdf

  50. Digital Camera Fan Says:

    Microsoft should subsidize all of the digital camera manufacturers to support JPEG-XR in order to kick-start the use of that format. It is superior by far to WebP, is an open standard, and has a good chance of adoption (particularly in new cameras that will have high dynamic range sensors) if promoted properly.

  51. STaRMaN Says:

    Mmm… I think the only feature missing in jpg is an alpha channel. PNG is for me the best format, but if the image is big, png makes a large file vs jpg. I think the next good step is to improve the png format.

    Another problem with Google: as always, Google is a business, always thinking about business, so if they are to license vp8, and Google MAKES standards (I don’t like that), everyone is going to suffer license problems sooner or later…

  52. Alereon Says:

    Is quality for images produced with libvpx the same as those produced using webpconv? From the way the WebP site is written it sounds like they’re only using libvpx for decoding. While of course you’re still using VP8 I-frames, it may be possible that webpconv is tuned better for quality rather than PSNR.

  53. Blue_MiSfit Says:

    Glad to see another blog post from you, Dark Shikari!

    Derek

  54. Christian Says:

    I am not convinced that introducing a “new” image format makes sense: When I’m browsing the web on a standard broadband connection, images load instantly anyways, and Moore’s law holds not only for the bandwidth of Internet connections, but also for hard drive capacities.

  55. WP Says:

    WebPee? What kind of stupid name is that? Does that mean the images are going to be stained in yellow?

  56. Alan Says:

    Here is a 152KB jpeg compressed with cjpeg and a perceptual quantization table: http://dailyburrito.com/output_cjpeg_perceptual.jpg

  57. Dark Shikari Says:

    @Alan

    jpegrescan shrinks that another 3%, so it’s actually smaller than it needs to be! Either way, that’s really impressive.

  58. Alan Says:

    I’ve been arguing for a while that jpeg has legs if we could just agree on a better quantization table. And if someone added some de-blocking in the decoder …

    Nice tip with jpegrescan. Here’s my 147KB result with that (unfortunately with cjpeg, the next quality step up gives 164KB after the rescan, so I won’t cheat and post that):
    http://dailyburrito.com/output_cjpeg_spec_rescanned.jpg

  59. Alan Says:

    And, for posterity, here’s the quantization table I was using. It is set up to approximate the perceptual performance of the eye at a viewing distance of 6 image heights:

    # Quantization tables (6 image heights)

    # Table 0 (Y channel)
    5 3 4 7 11 16 24 34
    3 4 4 6 8 12 18 25
    4 4 8 9 11 15 20 28
    7 6 9 14 16 20 26 33
    11 8 11 16 26 28 34 42
    16 12 15 20 28 41 46 54
    24 18 20 26 34 46 63 71
    34 25 28 33 42 54 71 95

    # Table 1 (Cr channel)
    7 9 18 28 43 63 91 128
    9 8 17 23 33 48 68 94
    18 17 31 34 43 58 78 105
    28 23 34 55 63 77 98 126
    43 33 43 63 98 108 128 157
    63 48 58 77 108 154 174 204
    91 68 78 98 128 174 239 255
    128 94 105 126 157 204 255 255

    #Table 2 (Cb channel)
    14 18 46 71 109 161 232 255
    18 17 43 59 85 122 173 240
    46 43 80 87 111 148 200 255
    71 59 87 142 160 196 249 255
    109 85 111 160 251 255 255 255
    161 122 148 196 255 255 255 255
    232 173 200 249 255 255 255 255
    255 240 255 255 255 255 255 255

  60. Dark Shikari Says:

    @Alan

    You can tweak all the values in the quant table up/down by a small amount to tweak quality by small amounts. That’s what I did with x264 for the PSNR test, since I couldn’t get the size to quite match.
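
    Something like this, say (a minimal sketch; the rows shown are from the table in comment 59, and the 3% factor is just an example):

    def scale_quant_table(table, factor):
        # Scale every entry of an 8x8 quantization table and clamp to the
        # legal 8-bit range [1, 255]; larger values mean coarser quantization.
        return [[max(1, min(255, round(q * factor))) for q in row] for row in table]

    luma = [
        [5, 3, 4, 7, 11, 16, 24, 34],
        [3, 4, 4, 6, 8, 12, 18, 25],
        # ... remaining six rows from comment 59 ...
    ]
    slightly_coarser = scale_quant_table(luma, 1.03)   # nudges the file a bit smaller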

  61. Alan Says:

    @ Dark

    Thanks. Good idea tweaking the quant table to match more exactly.

  62. Alan Says:

    And I meant to post this: http://dailyburrito.com/output_cjpeg_perceptual_rescanned.jpg

  63. Drazick Says:

    We would all be better off if Google just supported JPEG XR.
    We don’t need another format; we just need support for something.

  64. Jonas B. Says:

    Google released a lot of strange things through the Chrome project that are more like early drafts. The spdy protocol is not really thought through either.

  65. ender Says:

    > And I meant to post this: http://dailyburrito.com/output_cjpeg_perceptual_rescanned.jpg

    Interestingly, Opera doesn’t like this image at all – it either doesn’t display anything, or only displays a B&W image at 1/8th resolution. Firefox, Chrome and IE have no such problems.

  66. Pieter Says:

    Have a look at this website. They are doing a good job compared with Jpeg and Jpeg2000 if you look specifically at file size.

    http://sites.google.com/site/dlimagecomp/

  67. Ed Says:

    What the heck is cjpeg?
    Are those jpegs standard compliant? (I seem to have no problem decoding them in Firefox.)

    Is it some kind of new and better jpeg encoder?

    Why isn’t this being widely used? It is much better than the original jpeg.

  68. Ira Dvir Says:

    Why deal with theories? HM has implemented its tiled and multi-dimensional interlacing algos, using high-profile H.264 for stills coding. It is a 4:2:2 scheme, utilizing a standard H.264 encoder under mp4, allowing import/export of EXIF, which can be integrated with HW cellular codecs (a pilot with a leading handset vendor is on the way). Go to http://www.hipixpro.com and you can download freely and experiment with great Windows tools and Android. Viewers will always be free for the PC, and developers are welcome to have their PC apps. The code has a huge advantage over WebP as it utilizes inter-intra tools and not just intra. It has to do with out-of-the-box technology, which leaves things within the standard. Try it yourself.

  69. Alan Says:

    @Ed

    Sorry I didn’t explain. cjpeg is a command line utility that comes with the IJG “reference” libjpeg library.

    Note that the usage of non-standard quantization tables (that take human perception into account) is the reason that these jpegs look so nice but it is also those quantization tables that likely cause some decoders to fail with these jpegs.

  70. Mangix Says:

    Are there any binaries made from jpgcrush and jpgrescan? Using Perl on Windows is pretty much a hassle :\

  71. Han Says:

    Here’s the resulting image from the Hipix software referenced in response 68.
    http://img121.imageshack.us/img121/9370/hipix.png
    I used 165 for “Set Target Size” for an output size of 153KB.
    IMO it’s nowhere near the quality of x264 on this particular image; there’s significant loss of detail, and noticeable artificial lines added into the image.

  72. Fedyon Says:

    For some reason I might prefer Theora over x264. Sure, x264 has more detail, but sometimes it looks too peaky compared to its blurred neighbours. Theora is blurry overall, but has fewer jumps in texture retention and feels more natural.

    Maybe there’s also a difference between psy for movies and psy for stills?

    And yes, for vp8 that is embarrassing anyway. It fails. Even blocky jpeg is better than blurred lumps…

  73. eeei Says:

    Lossless-wise, I never understood why JPEG-LS got no attention. Compression is about as good as JPEG2K lossless (possibly a bit better), but it’s so much faster both to encode and decode, also compared with PNG.

  74. BondEar Says:

    Does anyone know of a good command line SSIM tool for still color images on Linux? I can’t get Medhi Rabah’s ssim.cpp or qpsnr to compile, nor can I get ssim.m to run in GNU Octave.

  75. Ira Dvir Says:

    Dearest Han, I have no idea what you did, and I don’t have your source images. We tried all the WebP images and soon we’ll upload the gallery. Our gain at immaculate quality – compared to the source JPEGs – is huge. Where WebP saves 10%, we save 40% and more. Try it yourself if you like. Everybody is invited to do that – hipixpro.com

  76. Han Says:

    Here’s the resulting image from the DLI software referenced in response 66.
    http://img814.imageshack.us/img814/7184/dli.png
    I used the default settings with “-q 20” for an output size of 153 KB.
    The output quality was shocking.. I’ve tested numerous codecs that claim better compression than h264/x264 but this is the first time I’ve seen it. The image has similar detail as the x264 image but has significantly less artifacts (blocking, distortion around edges, pink chroma).

  77. Han Says:

    @Ira Dvir

    I used the source image referenced in the article/blog. You’re free to download it and test to see if I somehow compressed the image incorrectly. I’ll get around to trying your software with more images for a better evaluation.

  78. ladautt Says:

    What is the use case illustrated here?

    A 150 KB compressed image implies a bit rate of 0.6 bpp…

  79. Ira Dvir Says:

    Guys, the use case is clear, and the technology can be used with any H.264 codec (including x264). Go to the hipixpro.com site; there’s a lot of stuff to read and experiment with. The main advantages: no limitation of resolution, using existing HW/SW codecs (could even be WebP) for inter-intra stills coding. This is going into cellular with the existing HS… Try using the presets for simpler operation, and see the results for yourselves.

  80. Ira Dvir Says:

    http://hipixpro.com/webpcompare.html
    Here’s the comparison gallery of hipix vs. WebP. You can download the files.

  81. IgorC Says:

    Is better image compression still required?
    Today the biggest part of bandwidth is video content not images.

    4 temporal generations of compression:
    1. Generic Data (nothing changes in >15 years)
    2. Image (nothing changes in >15 years)
    3. Sound (nothing changes in >15 years)
    4. Video (everything is changing )

    So generic data, image and sound compressions are closed books.
    ….

    1. First there was data compression (zip, rar) and everyone compressed their 20-30 KB files with it.
    Now when we use email we barely care about compression if the size of the attachment is smaller than, let’s say, 1-2 MB. Just try to send something like .7z.

    2. The same is applicable to lossy compression of images. JPEG is just universal like ZIP.

    3. MP3 compression is as old as JPEG, and AAC is from the same decade (+3-4 years). HE-AAC doesn’t count, as it was a subset for low bitrates and not an entire replacement of AAC. And MP3 is vastly more popular than AAC today.

  82. David Says:

    Not sure how well this applies with relation to colorspace conversions and whatnot, but I just took the source image and simply saved it out with IrfanView as a JPG (no fancy reworking of quant tables), setting the quality down to 18 to reach a 152 KB file size, and it looks substantially better than the jpg DS put up. I’d rate it as slightly better than the theora pic, but a bit worse than the x264 pic (x264 is substantially better in the water reflections, but mostly similar elsewhere). So the bar that jpg sets seems to be a fair bit higher than implied in the article.

    On the other hand, the hipixpro image really does look quite impressive, mainly in not introducing a lot of the noise that you get in the x264 image while still retaining that high image quality. If they could add transparency to it (not sure if that’s easy to do, given that it’s based on h.264) I’d be quite happy to switch to it… aside from the fact that Firefox will never support it because it’s still tied to h.264…

  83. Jeff Muizelaar Says:

    Here’s my attempt:
    http://people.mozilla.org/~jmuizelaar/parkjoy.jpg (155K)

    It uses the same quantization table as Alan’s but also uses trellis quantization to get closer to the target rate.

    @Alan, how were your quantization tables derived?

  84. Peter De Pan Says:

    Oh my god!

    Your site, as good as your article seems to be, is unreadable. My eyes hurt after a few seconds. Do yourself a favour and switch your colors to an eye-friendly format, as described in a thousand books about web design for the bloody beginner! Have you ever tried to read your site yourself?

    Amateurish, nerdy and painful…

    Peter

  85. Dark Shikari Says:

    @Peter

    I use the same colorscheme for my site that I use for all my applications on my own computer (at least ones where I can change the color scheme easily).

    I personally do not believe that staring into the sun (i.e. dark text on a light background) is suitable for anything any sane person would want to read. Maybe you are of some other species which is adapted to staring into extremely bright lights for long periods of time. I am not, and neither are most of my readers. Well, at least I assume most of my readers are human — I can’t be sure of that!

    Above all, I refuse to use a colorscheme that I cannot read comfortably for more than a couple minutes. Black text on a white background falls squarely into that category, giving me headaches quite rapidly. I would rather not have to pop ibuprofen every few hours (and ruin my vision!) just because some random commenter on the internet tells me that I should stare into a lamp all day.

    Now, if you are reading this on a device without a backlight (e.g. a Kindle), then it might be useful to have an alternate CSS. But I don’t think anyone is reading this on a Kindle.

    If you do in fact happen to be of some alien species that likes staring into bright lights, try this script.

  86. KeyJ Says:

    @Alan:
    > Note that the usage of non-standard quantization
    > tables is the reason that these jpegs look so nice
    > but it is also those quantization tables that likely
    > cause some decoders to fail with these jpegs.

    I strongly doubt that. In JPEG, there’s no scalar quantizer (like the QP in H.264), so the only way to signal the amount of quantization to the decoder is specifying complete quantization tables in the header. A decoder that assumes fixed quantization tables would fail for 99.9% of all files.
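
    If anyone wants to check what tables a given file actually carries, here's a minimal sketch of a DQT dump (baseline 8-bit tables only; the filename is a placeholder):

    import struct

    def read_quant_tables(path):
        # Walk the JPEG marker stream and collect quantization tables from
        # DQT (0xFFDB) segments; 16-bit tables and error handling are skipped.
        data = open(path, "rb").read()
        assert data[:2] == b"\xff\xd8", "not a JPEG"
        tables, pos = {}, 2
        while pos + 4 <= len(data):
            marker, length = struct.unpack(">HH", data[pos:pos + 4])
            if marker == 0xFFDA:              # start of scan: entropy-coded data follows
                break
            if marker == 0xFFDB:              # DQT segment: one or more tables
                seg, i = data[pos + 4:pos + 2 + length], 0
                while i < len(seg):
                    precision, table_id = seg[i] >> 4, seg[i] & 0x0F
                    if precision == 0:        # 8-bit entries, 64 values in zigzag order
                        tables[table_id] = list(seg[i + 1:i + 65])
                        i += 65
                    else:                     # 16-bit table: skip it in this sketch
                        i += 129
            pos += 2 + length
        return tables

    print(read_quant_tables("source.jpg"))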

  87. Han Says:

    @David

    I think you’re mixing up the hipixpro and DLI images; the hipixpro one looks similar to PSNR-optimized x264, while DLI is like x264 with less distortion.

  88. Ira Dvir Says:

    It seems that Han and some of you guys don’t really understand that it is still pictures and not HDTV frames we are discussing. First – for low res (circa 1MP) x264 and hipix are more or less the same. However there is a huge gap between the efficiency of pure intra and the inter-intra coding we use. The reason? Hipix was designed for cellular handsets and cameras, supporting 5, 8, 12, 16, 21 and also 40 and 50MP. The higher the resolution – the more effective it is.

  89. Jeff Muizelaar Says:

    @BondEar

    I have a quick and dirty utility for measuring the SSIM between two pngs.
    http://github.com/jrmuizel/ssim

  90. Dark Shikari Says:

    @Ira

    “Inter-intra” sounds like SVC or wavelets. Aka useless. Can you be more specific? You can’t really use inter coding on one image. Unless you mean in-intra motion vectors.

  91. David Says:

    @Han – You’re right, I mixed them up. The DLI image is indeed what I was looking at when I thought I had the hipix image up. Will have to look into DLI some more, then.

  92. BondEar Says:

    @Jeff

    Where’s ssim.h? It’s referenced in ssim.c but isn’t in the package.

  93. Jeff Muizelaar Says:

    @BondEar

    Fixed.

  94. Ira Dvir Says:

    Dark Shikari – Inter-Intra means IP and IBBP GOPs for high-res images. Read more at hipixpro.com; we have a description of the technology. Try it with 8-12-16-21MP images and you’ll find how effective it is compared to simple intra coding.

  95. Dark Shikari Says:

    @Ira

    That’s just a horribly inefficient implementation of in-intra motion vectors. If you’re going to rip off a 10-year-old technology and claim that it’s yours, at least do it efficiently.

  96. CAFxX Says:

    Wouldn’t it be possible to create a JPEG encoder targeting SSIM instead of PSNR, possibly using multiple quant tables to allow some sort of psy optimizations (I’m not sure if JPEG allows switching quant tables in the middle of a scan… I’ll check)?
    Moreover, if the (optional) arithmetic coding feature were nowadays sufficiently widespread decoder-side, it would allow JPEG to improve quite a lot while retaining compatibility with existing decoders, wouldn’t it?

  97. KJ645 Says:

    What do you think of hipix, Dark Shikari?

    http://www.vizworld.com/2010/10/hipix-image-format-challenge-google-webp/

  98. r2d2 Says:

    KJ645: read the comments on this blog. The post just above yours is what Dark Shikari thinks of hipix. =)

  99. Han Says:

    @Ira

    Given the information that Hipix performs better on higher resolution images, I re-tested using some of the images from http://www.imagecompression.info/test_images/
    Unfortunately Hipix performed similarly to how it did with the smaller <1MP images, i.e. results were similar to x264 targeting PSNR. x264 with psychovisual/HVS optimizations is significantly superior on every image tested. To help Hipix's cause, I suggest providing a specific example where it performs better than x264 that the public can verify.

  100. Esurnir Says:

    Is it just me, or has the text of the site become enormous? It’s hard to read now ^^;

  101. ender Says:

    Regarding site colour scheme: while I prefer bright text on dark background (just check out my Windows colour scheme: http://eternallybored.org/imgs/thebat.png), I think that the text on this site is a bit too bright – however since most sites use black text with white background, I browse most of the time with custom CSS enabled anyway, so it doesn’t bother me as much.

  102. Ira Dvir Says:

    Dear Han, since Hipix was designed to enhance the efficiency of “existing h.264 encoders” to compress still images, and to utilize existing HW codecs within the HW constraints (like coding 12MP images using a 720p encoder), I’ll be happy to explore the possibility of further enhancing the quality by using the x264 code within the scheme for PC SW, which we want to be distributed freely (within the limits of the h.264 licensing terms). If it is of interest to you, contact me at ira@human-monitoring.com and we’ll be more than happy to cooperate.

  103. netol Says:

    Ender, doesn’t this break most web sites? Are you using firefox or chrome?
    Sorry, I know this is off topic here…

  104. PNG Says:

    Don’t support patent-encumbered image formats; everyone should have learned from the GIF debacle.
    Yes, that means HIPIX also.

    http://burnallgifs.org/archives/

    PNG is a superior image format without all the artifacts of inferior lossy compression garbage: patent-free, open format.

    http://en.wikipedia.org/wiki/Portable_Network_Graphics

  105. jiu Says:

    Just wanted to say: if I were google and trying to improve my image format, I don’t think I could have made a better move than to post it early. Just reading through this thread, they would have most of the brainstorming already done for them, and ready to organise :-)

  106. Fruit Says:

    PNG is inferior itself – it *only* does lossless compression (hello, tenfold increase in bandwidth; oh wait, wasn’t this what Google meant to address?) – and even at lossless compression it is easily beaten by jpeg XR or jpeg 2K. Screenshots of desktops dominated by GUI elements are an exception, but hey…

  107. Lee Says:

    Moving everything to PNG as many comments have suggested is a terrible idea.

    You want to try storing your 12MP camera images as .PNG? They would be massive and RAW already exists for PROs who need it.

    PNGs for web sites are *much* heavier than .jpg when the image is continuous-tone and larger than 250×250. If you only want flow-chart line drawings, .png is perfect.

    PNG also does not support HDR like JPG XR does, and HDR is expanding more to the general public as it becomes more automated.

    PNG also does not support high bit depth pixels like TIFF and JPG XR. This mostly affects pro uses, but it would be a plus to have the capability in a common format.

    Btw, can’t wait to see how well x264 can do on Sandy Bridge, hope DS will comment on this soon:

    http://lee.hdgreetings.com/2010/09/intel-cpu-vs-nvidia-gpu-video-transcoding.html

  108. Mangix Says:

    @Fruit

    Last I remember, JPEG-XR was worse than PNG compressed with pngout in terms of filesize.

  109. Fruit Says:

    On a photographic image?
    A quick test here (only with an old HD Photo plugin though – should be at least similar) shows otherwise. Also, it is with pngout, while no such optimizer was used for the hdp/jxr image.
    I don’t even know where to get a proper/good jpeg XR encoder, much less one that would be guaranteed to have better-than-average compression.

  110. AnonyMou.se Says:

    Fruit, you should try Microsoft Expression Design (I hope there’s a trial available somewhere).

  111. Ira Dvir Says:

    Guys, soon we’ll release Hipix plugins for a few popular picture editing tools like Photoshop and XnView. The main issue for us is hi-res photos coming from cellular handsets with 8, 12, and soon 16MP, where the CMOS output is compressed using Hipix far better than JPEG-XR, JP2K and WebP. We also support the following: 4:2:2 color format (as coming from the CMOS), EXIF import/export. PNG is worth nothing for this purpose (as well as for pictures as pictures). It is a great format for other purposes. We are operating in the space which is focused on EXISTING HARDWARE. JP2K never got there, and neither did JPEG-XR. Show me a codec which can encode 60MP per second on an existing cellular device with 30% higher file size than Hipix and I’ll call it competition.

  112. MaxSt Says:

    I’m not impressed how WebP and Hipix decoders interpolate chroma info:
    http://img822.imageshack.us/img822/125/santacrop.png

  113. LigH Says:

    Next to JPEG2000, LuRaWave (developed by the German Aerospace Company = Deutsche Luft- und Raumfahrt-Technik) could have been one of the best technologies to compete with JPEG. If only it did not rely on plugins…

  114. Kazuo Says:

    @Ira All your arguments are empty; people asked for a sample where Hipix is good, but none has come yet. The one posted only shows that it is not that good.

    Please, post a source image (in png) and a Hipix-coded image decoded to png. Choose the source to your benefit (i.e. the “hi-res coming from cellular handsets” ones).

  115. tamás Says:

    Fractal compression FTW :)

  116. Evan Says:

    I haven’t seen any image quality comparison based on only one image and one compression point like this. We can only say this is a special-case analysis for these image coders. In this special case, we can see that VP8 turns the blocky effect into a blurred effect by using the deblocking filter. If you don’t like this blurred image, just turn off the deblocking filter. (PS: since this is intra coding, the decoder can choose to turn the deblocking filter on or off. I see a similar blocky effect to JPEG for this case if I turn off the deblocking filter when decoding.) If you want less of a blocky effect, just decrease the DC QP value. BTW, the flat quantization matrix for the AC coefficients of VP8 is designed to optimize PSNR, not this kind of subjective view.

    Please never compare image coders using such a small data set and configuration. It is much more misleading than any other comparison.

    BTW, there is indeed one problem with VP8 intra coding for still images. It can’t provide lossless and fast transcoding for image rotation and flipping like JPEG. The intra prediction will make this kind of transcoding lossy and time-consuming.

  117. Shevach Riabtsev Says:

    @Ira
    I know you are a senior official at Hipix, therefore your reasoning and comparisons might be biased.
    Are there any comparisons of Hipix vs VP8 performed by independent experts?

  118. MaxSt Says:

    @Shevach Riabtsev: here you go, pick your poison:
    http://img822.imageshack.us/img822/125/santacrop.png

  119. KJ645 Says:

    Comparisons:

    http://www.mathias-schindler.de/2010/10/22/webp-file-format-remarks/

    http://englishhard.com/2010/10/01/real-world-analysis-of-googles-webp-versus-jpg/

  120. HJRodrigo Says:

    Here is another comparison of interest:
    http://jpegmini.com/?page_id=108

  121. Aki Says:

    @Ugg

    New libvpx version was released just yesterday:

    http://blog.webmproject.org/2010/10/vp8-codec-sdk-aylesbury-release.html

  122. StoneOakvalley Says:

    It has nothing to do with wrong promotion, better compression than JPEG, 20 years of JPEG etc. etc. etc.

    It has to do with what came first and got standardized into a wealth of software and hardware, just about the same time the internet really ignited, blew up into our faces and spread over to phones and other media devices. This WebP effort is way too late. There is the same argument about Ogg vs MP3. Which came first? MP3. MP3 is and will always be the most supported/common format around, even if it supposedly has flaws compared with Ogg. Most people don’t care; MP3 was available first and is supported everywhere. That is the easy answer to why we do not need JPEG improvements or MP3 improvements; in any case it will take several decades before something like this becomes reality and gets implemented. There are way too many newer video and audio formats to confuse us already; let’s pretend it doesn’t happen to pictures too.

  123. Relgoshan Says:

    Well, webp works immediately in Opera if you grab a compatibility userjs from extendopera.org; but I still wish the company added h264 to its native formats on desktop.

  124. Marcos FRM Says:

    libvpx has received many improvements since the 0.9.5 release. Any significant progress on the encoder?

  125. skal Says:

    https://review.webmproject.org/1780

  126. Igor Akulov Says:

    Jason, thank you very much, very interesting conclusions, but let me add a few dimensions of my own to your model:

    1st Dimension – patent cleanliness
    In this dimension, the interest is patent cleanliness. It is no secret to anybody that the scientific clusters (R&D bundles) supplying modern standards began to intertwine from the 1930s. Not only H.264, but all modern “technology bundles”, such as LTE, WiMAX etc., have at minimum 50 years of iterative research history. Simply picture the ITERATIVE graph of research history for all the H.264-related mathematical and engineering work. Almost 99% of this material you, and personally I, have absorbed as an empirical approach. What is the point of blaming one or another group of people for applying an algorithm similar to another algorithm (or even a set of algorithms for the problem area)?

    It seems that the 20th century has arrogated to itself the right to everything newly invented. We are just lucky that before this there were other centuries, when this was not done. Otherwise we could not have breakfast, lunch and dinner, or even attend to our natural needs.

    The main point of Google’s VP8 initiative, in my opinion, is not some qualitatively new innovation. It is simply the creation of a (nearly) innovative flow, free from the patent claims known to Google. Although, of course, because of the very troubled patent system of the USA it is not clear what Google has hidden up its sleeve. And after this core is formed, the most ambitious and initiative-taking people, such as you, must simply extend this core with empirical and theoretical points.

    2nd Dimension – energy
    It is absolutely clear that now is the time to freeze a complete standard for digital content and escape for good from the hell of historical (almost analog) layers in the media. Yes, most of these transformations concern transformations of meta-information. However, the major headache is converting heavy media content, such as photos, audio, and especially video. It is quite obvious that we simply do not realize the energy scale of this problem (no, a power-consumption example is not needed, because everyone who reads this blog mostly knows about it already).
    The video industry also has absolutely wild time-domain, PWM-like interpolation solutions, such as interlacing and the inverse telecine process, requiring a very power- and energy-intensive decoding approach with incoherent results, because the path back to the source material will not remain in 5-10 years!

    3rd Dimension – politics
    Some states, the so-called “third world” states, have patent policies that differ significantly from the patent policy of the US. And these states are home to 150+160+2500+1500+500 = 4.81 billion people. You can be sure that no one there will ever transfer a cent of money for a patent pool like H.264’s. Why, indeed, for any other list? Because it is too comprehensive and contains a large number of patents intersecting with the local patents of these countries.
    In order to avoid problems related to discrediting US patent law, most of the patents on H.264 will one way or another be released in foreign areas.

    The idea of targeting for the x264 project:
    1. Create an umbrella of empirical algorithms, especially motion compensation/motion estimation algorithms, the ones best protected from patents. Most likely, they will also be used later in VP8.
    2. Do not scold VP8; let the guys relax.
    3. R&D. Take a still more serious approach to the issue of hw-sw partitioning, introduce a third level of abstraction for the internal API/thread model, and start experiments at least with CPU-based OpenCL (blocking, attached to the current thread model).
    4. R&D. Redesign the heterogeneous, OpenCL-based thread model.
    5. Evolve to an OpenCL-based heterogeneous design.
    Without going into details, OpenCL is the only way for a programmer to go from the von Neumann path to the hardware pipeline without serious bugs in the brain.
