H.264 and VP8 for still image coding: WebP?
Update: post now contains a Theora comparison as well; see below.
JPEG is a very old lossy image format. By today’s standards, it’s awful compression-wise: practically every video format since the days of MPEG-2 has been able to tie or beat JPEG at its own game. The reasons people haven’t switched to something more modern practically always boil down to a simple one — it’s just not worth the hassle. Even if JPEG can be beaten by a factor of 2, convincing the entire world to change image formats after 20 years is nigh impossible. Furthermore, JPEG is fast, simple, and practically guaranteed to be free of any intellectual property worries. It’s been tried before: JPEG-2000 first, then Microsoft’s JPEG XR, both tried to unseat JPEG. Neither got much of anywhere.
Now Google is trying to dump yet another image format on us, “WebP”. But really, it’s just a VP8 intra frame. There are some obvious practical problems with this new image format in comparison to JPEG; it doesn’t even support all of JPEG’s features, let alone many of the much-wanted features JPEG was missing (alpha channel support, lossless support). It only supports 4:2:0 chroma subsampling, while JPEG can handle 4:2:2 and 4:4:4. Google doesn’t seem interested in adding any of these features either.
But let’s get to the meat and see how these encoders stack up on compressing still images. As I explained in my original analysis, VP8 has the advantage of H.264’s intra prediction, which is one of the primary reasons why H.264 has such an advantage in intra compression. It only has i4x4 and i16x16 modes, not i8x8, so it’s not quite as fancy as H.264’s, but it comes close.
The test files are all around 155KB; download them for the exact filesizes. For all three, I did a binary search of quality levels to get the file sizes close. For x264, I encoded with --tune stillimage --preset placebo. For libvpx, I encoded with --best. For JPEG, I encoded with ffmpeg, then applied jpgcrush, a lossless jpeg compressor. I suspect there are better JPEG encoders out there than ffmpeg; if you have one, feel free to test it and post the results. The source image is the 200th frame of Parkjoy, from derf’s page (fun fact: this video was shot here! More info on the video here.).
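The binary search over quality levels mentioned above can be sketched roughly as follows. This is only an illustration, not the script used for the test: `encode` is a hypothetical callable wrapping whichever encoder is being tested, and the sketch assumes a scale where a higher setting never produces a smaller file (for a quantizer-style setting, where lower means better quality, the direction would be flipped).

```python
from typing import Callable


def find_quality_for_size(encode: Callable[[int], bytes],
                          target_bytes: int,
                          lo: int = 0,
                          hi: int = 100) -> int:
    """Return the highest quality in [lo, hi] whose output fits in target_bytes.

    Assumes output size never decreases as quality increases. If even the
    lowest quality overshoots the target, the original lo is returned.
    """
    best = lo
    while lo <= hi:
        mid = (lo + hi) // 2
        size = len(encode(mid))
        if size <= target_bytes:
            best = mid      # fits: remember it and try a higher quality
            lo = mid + 1
        else:
            hi = mid - 1    # too big: try a lower quality
    return best


def fake_encode(quality: int) -> bytes:
    """Stand-in encoder for demonstration: size grows linearly with quality."""
    return bytes(1000 + 2000 * quality)


if __name__ == "__main__":
    print(find_quality_for_size(fake_encode, 155_000))
```

With a real encoder, each probe is a full encode, so the search costs about seven encodes for a 0..100 quality range.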
Files: (x264 [154KB], vp8 [155KB], jpg [156KB])
Results (decoded to PNG): (x264, vp8, jpg)
This seems rather embarrassing for libvpx. Personally I think VP8 looks by far the worst of the bunch, despite JPEG’s blocking. What’s going on here? VP8 certainly has better entropy coding than JPEG does (by far!). It has better intra prediction (JPEG has just DC prediction). How could VP8 look worse? Let’s investigate.
VP8 uses a 4×4 transform, which tends to blur and lose more detail than JPEG’s 8×8 transform. But that alone certainly isn’t enough to create such a dramatic difference. Let’s investigate a hypothesis — that the problem is that libvpx is optimizing for PSNR and ignoring psychovisual considerations when encoding the image… I’ll encode with --tune psnr --preset placebo in x264, turning off all psy optimizations.
Files: (x264, optimized for PSNR [154KB]) [Note for the technical people: because adaptive quantization is off, to get the filesize on target I had to use a CQM here.]
Results (decoded to PNG): (x264, optimized for PSNR)
What a blur! Only somewhat better than VP8, and still worse than JPEG. And that’s using the same encoder and the same level of analysis — the only thing done differently is dropping the psy optimizations. Thus we come back to the conclusion I’ve made over and over on this blog — the encoder matters more than the video format, and good psy optimizations are more important than anything else for compression. libvpx, a much more powerful encoder than ffmpeg’s jpeg encoder, loses because it tries too hard to optimize for PSNR.
These results raise an obvious question — is Google nuts? I could understand the push for “WebP” if it was better than JPEG. And sure, technically as a file format it is, and an encoder could be made for it that’s better than JPEG. But note the word “could”. Why announce it now when libvpx is still such an awful encoder? You’d have to be nuts to try to replace JPEG with this blurry mess as-is. Now, I don’t expect libvpx to be able to compete with x264, the best encoder in the world — but surely it should be able to beat an image format released in 1992?
Earth to Google: make the encoder good first, then promote it as better than the alternatives. The reverse doesn’t work quite as well.
Addendum (added Oct. 2, 03:51):
maikmerten gave me a Theora-encoded image to compare as well. Here’s the PNG and the source (155KB). And yes, that’s Theora 1.2 (Ptalarbvorm) beating VP8 handily. Now that is embarrassing. Guess what the main new feature of Ptalarbvorm is? Psy optimizations…
Addendum (added Apr. 20, 23:33):
There’s a new webp encoder out, written from scratch by skal (available in libwebp). It’s significantly better than libvpx — not like that says much — but it should probably beat JPEG much more readily now. The encoder design is rather unique — it basically uses K-means for a large part of the encoding process. It still loses to x264, but that was expected.
September 30th, 2010 at 9:00 pm
This is only a technology preview version (v-0.0.1) of WebP. We’ll see future development and improvement.
September 30th, 2010 at 9:17 pm
@Yanito
Changes to “webp” don’t change the VP8 video format (or libvpx encoder) that it’s based on.
September 30th, 2010 at 9:22 pm
Personally, I think that JPEG XR is a far better alternative since it has a good lossless format and more sophisticated prediction scheme (comparing to JPEG). But Microsoft just didn’t promote it hard enough …
September 30th, 2010 at 10:19 pm
@LIu Liu: it’s not about promoting, it’s about 20 years of JPEG, like DS said
September 30th, 2010 at 10:22 pm
Is it true that VP8 encoding might come to x264? or was it just trolling/joke?
September 30th, 2010 at 10:47 pm
Can you link to the source frame, and note the filesizes of the various compressed versions?
September 30th, 2010 at 10:57 pm
@Scaevolus
Done and done.
September 30th, 2010 at 11:05 pm
We already have a great replacement for jpg: .PNG
September 30th, 2010 at 11:49 pm
> it doesn’t even support all of JPEG’s features, let alone many of the much-wanted features JPEG was missing (alpha channel support, lossless support). It only supports 4:2:0 chroma subsampling, while JPEG can handle 4:2:2 and 4:4:4. Google doesn’t seem interested in adding any of these features either.
Wrong:
> We plan to add support for a transparency layer, also known as alpha channel in a future update.
from:
http://googlecode.blogspot.com/2010/09/webp-new-image-format...
September 30th, 2010 at 11:51 pm
@Paticito
Sounds better than expected then, but don’t count your chickens before they hatch. They said they’d make libvpx good too, and look where that’s gone — there hasn’t been much of any work on the encoder in the past month.
September 30th, 2010 at 11:55 pm
DS, What’s your thought about JPEG XR?
Could you add it to the comparison?
Could your psy optimizations be implemented in JPEG XR / WebP?
Anyhow, it is really about time to let go of the JPEG.
October 1st, 2010 at 12:00 am
@Dark many googlers are on vacation during August/September, don’t expect much work done.
October 1st, 2010 at 12:18 am
@Drazick
I don’t know much about JPEG-XR, as was demonstrated today on #theora when I didn’t even realize that it used a lapped transform. In terms of psy, the only bitstream feature that really matters enormously is whether or not it supports adaptive quantization.
If you can do a test with a JPEG-XR encoder, I’d be happy to post it, but keep in mind that I’ve heard the official encoder is not very good…
October 1st, 2010 at 1:39 am
Why didn’t they replace GIF if they have this fancy video codec turned image compressor? That would actually be useful.
October 1st, 2010 at 1:44 am
@Matt
That’s something that’s been tried for a while… to begin with there was MPNG and APNG (to try to get an animated GIF with 24-bit color support). But none of these formats even have motion compensation… not even basic fullpel.
October 1st, 2010 at 1:48 am
Why not http://en.wikipedia.org/wiki/Progressive_Graphics_File ? It is the most suitable format up to date.
October 1st, 2010 at 1:48 am
Is there a way to compare the decoding complexity of JPEG vs. WEBP? ie will WebP make my cell phone lava hot and kill my battery life for its compression “improvements.”
October 1st, 2010 at 1:52 am
@KJ645
“WebP”, aka VP8, is definitely much more complex. Most importantly, it uses arithmetic coding and a complex deblocking filter, both of which are not present in JPEG. I would guess it’s at least 3 times as intensive to decode.
October 1st, 2010 at 2:29 am
Hello!
1. I think it is a terrible idea to test still image compression with a still taken from a video shot with a video recorder, not an actual camera. Even the source has an inferior image, none of the coders can really shine if they already have to use a crappy source. You can grab a whole lot of amazing pictures under creative commons license.
2. You keep repeating it is not the format that is important for the perceived quality but the psyvis optimizations. This means, that eg. your work of psy optimizations for x264 could be “ported” to produce only those type of frames and features that are also present in VP8, and then we would have much better looking VP8 encodes? (I also see no point in actually doing it, since I suppose it would still be a bit inferior to x264).
Thanks,
Sape
October 1st, 2010 at 2:29 am
Imho an h.264 based still image format should have been standardised for years already… not necessarily just for web use; as said, jpeg has been here for 20 years. I would even be fine with Apple pushing the format (where is that Jobs bloke when one needs him, heh).
October 1st, 2010 at 2:34 am
@Sape
That video is taken on 65mm film by a camera that costs more than most houses — it is higher quality than almost any image taken by any “photo camera”. I highly doubt your average Creative Commons images even have a quarter the detail that an Arriflex 765 can take.
Yes, some of x264’s psy optimizations can in theory be ported to work on VP8. Adaptive quantization is the iffy one, as VP8 doesn’t have delta quantizers. It only has “segments”, which cost roughly 2 bits per macroblock to signal, and you can only have 4 of them. Ptalarbvorm has demonstrated that you can get a pretty good portion of the benefit without the precision of H.264’s quantizers (i.e. with only a few quants to pick from), but I’m not so sure about the cost of the segments. 2 bits per macroblock probably isn’t too bad for images though; it’s likely much worse for actual videos.
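To put the “2 bits per macroblock” figure in perspective, here is a rough back-of-the-envelope sketch. The 1920×1080 frame size is an assumption (the post does not state the source dimensions), and the function name is made up for illustration; the only inputs taken from the comment above are the roughly 2 bits per macroblock and the ~155KB file size.

```python
import math

MACROBLOCK_SIZE = 16  # VP8 macroblocks cover 16x16 luma pixels


def segment_overhead_bytes(width: int, height: int,
                           bits_per_macroblock: float = 2.0) -> float:
    """Estimate the bytes spent signalling a segment ID for every macroblock."""
    mb_cols = math.ceil(width / MACROBLOCK_SIZE)
    mb_rows = math.ceil(height / MACROBLOCK_SIZE)
    return mb_cols * mb_rows * bits_per_macroblock / 8


if __name__ == "__main__":
    # Assumed frame size; not stated in the post.
    overhead = segment_overhead_bytes(1920, 1080)
    share = overhead / (155 * 1024)
    print(f"{overhead:.0f} bytes, {share:.1%} of a 155KB file")
```

Under those assumptions the signalling costs about 2KB, a bit over 1% of the file, which fits the guess that it is tolerable for still images. A video frame at typical bitrates is far smaller than 155KB, so the same overhead would be a much larger share there.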
October 1st, 2010 at 3:12 am
@Dark
1. I have just had another look at the source, and I still think it is not a big bang as a still image. But of course this is subjective.
1/b. As for CCL images: I did not say all are brilliant, but you can find several of them that are.
1/c. Go eg. here:
http://www.dpreview.com/galleries/reviewsamples/albums/canon-eos-1d-mark-iv-review-samples#page=1
or here:
http://www.dpreview.com/galleries/reviewsamples/albums/nikon-d3s-review-samples#page=1
for some professional images. These are not CCL-ed though, you will have to ask for permission if you want to publish. (personal use is allowed)
2. Yeah, I also felt it was not really worth it (for you). Google OTOH should pay you to do what is possible if they care…
October 1st, 2010 at 3:56 am
The JPEG output image can be further compressed using PackJPG (http://www.elektronik.htw-aalen.de/packjpg/) to 136711 bytes, lossless (it’s no JPG afterwards anymore, but the original JPG can be restored).
October 1st, 2010 at 4:22 am
So you say that:
If sizes are the same, the quality of WebP is inferior to JPEG.
And Google says that:
If quality is the same, the size of WebP is superior to JPEG.
It sounds quite strange.
BTW Google targeted a PSNR of 42 for all files. So in their comparison PSNR was not such an important factor.
October 1st, 2010 at 5:27 am
I did a quick test of how JPEG XR performs. The Microsoft Expression encoder seems to be more efficient than the reference implementation, so I used that one. The nearest I could get was 160 KB:
http://www.speedyshare.com/files/24494880/source.wdp
As PNG: http://img3.imagebanana.com/img/skxxylcl/source_jxr.png
October 1st, 2010 at 6:25 am
If anything should become mainstream, it should be Jpeg XR. Much better quality for the size. File formats are lame. Hard to change, hard to move.
October 1st, 2010 at 6:40 am
Well, another good overview of a new format! (I would assume, I don’t really know much about video/image formats/encoding myself)
However, I believe that you are wrong in one point:
“Earth to Google: make the encoder good first, then promote it as better than the alternatives. The reverse doesn’t work quite as well.”
The Open Source/Free Software way is to “Release early, release often”. Linus Torvalds released the Linux kernel when it could barely do anything, and look how well it did! I’m not saying that this is going to work for google(it doesn’t always work out), but it is the way it is supposed to be for Open Source software. Only proprietary companies worry about the software looking good when it is first released. Google(I would assume) is counting on the support of many programmers (not unlike yourself) to improve this software. It’s the Open Source/Free Software way.
P.S. I just realized that I assumed(without any direct evidence that I can find) they released/are releasing the source code for their Image converter and whatever other tools are required. If they aren’t, then the comment above is basically invalid. Releasing early and often works best for open source, not near as well for proprietary code. Either way, though, they do specifically say it is a developer release… so it isn’t supposed to be ready for production use yet.
October 1st, 2010 at 6:50 am
The official jpeg tools work just fine:
pngtopnm source.png|cjpeg -quality 18 -progressive -dct float -outfile source.jpg
The resulting jpeg is 152K and looks much like the x264 version imho.
October 1st, 2010 at 7:00 am
What about jpeg2000? It is way better than jpeg, and it’s been a few years since it’s available, but no-one ever talks about it…
October 1st, 2010 at 7:06 am
How does this compare to libjpeg’s (from v 7 or
arithmetic encoder? Using “jpegtran -arithmetic” usually gives 10-20 % size reduction. I don’t know if the IJG jpeg code has implemented other parts of T.851.
If Google wanted alpha channel support they could have implemented JNG subset of the MNG.
October 1st, 2010 at 7:07 am
What did you use to decode the jpeg? output.jpg appears washed out compared to jpeg.png when I decode it.
October 1st, 2010 at 7:13 am
An interesting question is who WebP is aimed at: is it purely for semi-pro photographers who know lots of technical stuff already, or is it aimed at anyone who’s putting their snaps online? If it’s aimed at everyone, then the other interesting thing would be investigating what happens if you take an image that’s already been compressed on output from the camera (eg, low-end consumer cameras that don’t offer RAW output, only jpeg) and transcode it, to see what happens in terms of transcoding artifacts. With my image analysis hat on I’m shuddering at the thought, but the human eye is sensitive to different things. So they might be negligible, but they can sometimes be visible even when looking at the images at full magnification. In particular, one would expect that a representation that is closer to JPEG would do marginally better (since in general the re-encoder doesn’t know which parts of the reconstructed-from-jpeg image are artifacts and which are from the original image).
October 1st, 2010 at 7:30 am
Hmm, why would you encode still images that way? If I take the source image and save it with Photoshop as jpg @ 20 percent (312kB), and save it again as png (3.21MB), I get far better results than your x264 example.
Can’t save it as WebP that way yet (obviously) but if your standard jpg example is already so far off, I doubt that your WebP example is accurate either.
October 1st, 2010 at 8:03 am
You seem very touchy about WebM etc.
Maybe you’re slowly understanding that you are on the wrong side, the MPEG LA side.
October 1st, 2010 at 8:13 am
Well, from simply looking at those pictures, it just proves why we don’t need to replace JPEG. At least when it is scaled down (zooming in shows it loses a lot of information).
But 90% of the world doesn’t care. At least on the Web. If we need high quality we use high-bitrate JPEG or PNG to solve the problem. Casual web surfers don’t want the hassle of software incompatibilities just to save 20% of image file size. If we could get the same image quality at only 20% of the file size I suspect the world would pick it up, but that is not even possible with the best encoder we have, x264.
The only image format that I think could succeed JPEG is an H.264-based image format where the decoding complexity is offloaded to a widely available H.264 hardware decoder. Not to mention that a small update to Flash Player would immediately see 90%+ of the internet being able to view H.264 images. But of course, in an imperfect world that is not going to happen due to stupid human politics…
October 1st, 2010 at 8:43 am
@Manuel: He’s rather on the “let’s use the technically best solution and don’t care about the legal bullshit” side.
October 1st, 2010 at 9:03 am
webp looks basically like jpeg with a deblocking filter applied. It is useless, as a deblocking filter can easily be added to jpeg without any compatibility problems.
x264 is much more detailed.
I think adding mandatory deblocking to jpeg, growing blocks from 8×8 to 32×32 (yes, not 16×16, as 32×32 isn’t a performance problem for still images IMHO), and improving “psy” encoding (by, for example, using variable quality for each block, detecting foreground, background, etc.) is a better idea. Much more compatible, an easier decoder to implement (hey, it is just a tweaked jpeg), and probably better than both jpeg and webp.
As for me, webp is a no-go, as it does not have HDR support or any alpha channel support. Size improvements do not matter to me so much (Google cares, I understand), but I do not want to store both webp and jpeg for compatibility and dynamically serve one or the other, depending on which one the browser supports.
jpeg2000 would actually be much better, as it does not have many of jpeg’s artifacts, and it has an alpha channel and a lossless mode.
October 1st, 2010 at 9:07 am
I notice that the sample images here are all compressed WAY more than anyone would ever compress them for actual use. Go look at the pictures on flickr, and you’re going to have a REALLY hard time finding images that have their JPEG quality settings set low enough to produce an image that looks that bad. Is it possible that VP8/WebP does better (particularly in comparison to JPEG) when using higher quality settings?
October 1st, 2010 at 9:08 am
I hope Joshua is joking:
“We already have a replacement for JPEG. It’s called PNG.”
And who needs h264 when you can have folders full of numbered BMPs
This is someone reading and commenting on the x264 blog. You would hope they know the difference between lossy and lossless compression.
October 1st, 2010 at 9:13 am
@Travis
Highly doubtful; there’s nothing about VP8 that would make it better at high rates — in fact, by design, it should be better at low rates (as should H.264). The reason to compare at low rates is simply that it’s easier to see a difference — people love to compare at high rates and say “there’s not much difference” even when the difference is large enough that it represents a factor of 50% in bitrate.
October 1st, 2010 at 9:15 am
@Warren
Well obviously at 312kb, your jpeg will look much better than at 156kb!
October 1st, 2010 at 9:16 am
@Jeff
I used an ffmpeg set up explicitly for the purpose of avoiding colorspace range conversion, since that plagues so many comparisons. That is, I kept the colorspace in TV-range YV12 for both encoding and decoding, without conversion, because of the mess that causes.
October 1st, 2010 at 9:16 am
Why not use PGF? http://www.libpgf.org/
Lossy&&lossless, fast, proven, existing library, progressive loading.
Seems to be the smarter move.
Cheers, Mathias
October 1st, 2010 at 9:17 am
@przemo_li
Of course PSNR is important. If you optimize for PSNR, your image will look worse at the same PSNR than with an encoder that optimizes for psy. x264 will often look better at 40db (optimizing for psy) than at 43db (optimizing for psnr).
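For readers unfamiliar with the metric being argued about, here is a minimal sketch of how PSNR is computed for 8-bit samples. It is the textbook definition, not code from x264 or libvpx, and it works on flat sequences of sample values purely for illustration.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[int], decoded: Sequence[int], peak: int = 255) -> float:
    """Peak signal-to-noise ratio in dB between two equal-length sample sequences."""
    if len(reference) != len(decoded) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error over all samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(peak * peak / mse)


if __name__ == "__main__":
    original = [10, 20, 30, 40]
    blurry = [11, 19, 31, 39]   # every sample off by one
    print(f"{psnr(original, blurry):.2f} dB")
```

Because the metric only sums squared sample errors, it rewards an encoder for smoothing away detail that is expensive to code, which is the blur effect described in the post; it has no notion of whether the error looks like texture or like mush.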
October 1st, 2010 at 10:19 am
@Stefano Says: JPEG2000 is SLOW.
PGF looks much better in this respect.
October 1st, 2010 at 10:37 am
While we’re on the subject of experimental image compression, I’d like to direct your attention to DLI: http://sites.google.com/site/dlimagecomp/
It beats the pants off of JPEG2000 and the other well known (experimental) image coder called ADCTC.
October 1st, 2010 at 10:51 am
@mpz
I’ve seen that one before; iirc, last I tested it, it was competitive with x264 and even beat it significantly at times. Given that it was adapted for insanely high compression times, of course, this is not very surprising. Still rather impressive.
October 1st, 2010 at 11:23 am
My subjective opinion is that the VP8 (decoded to PNG) frame looks much better.
Anyway, with more bandwidth, bigger hard drives, (hopefully in some decades quantum computers), we had better use PNG, since it is lossless, and FLAC for audio.
October 1st, 2010 at 11:38 am
See a thorough comparison of JPEG XR vs JPEG 2000 at http://www.compression.ru/video/codec_comparison/pdf/wmp_codec_comparison_en.pdf
October 1st, 2010 at 11:55 am
Microsoft should subsidize all of the digital camera manufacturers to support JPEG-XR in order to kick-start the use of that format. It is superior by far to WebP, is an open standard, and has a good chance of adoption (particularly in new cameras that will have high dynamic range sensors) if promoted properly.
October 1st, 2010 at 1:05 pm
Mmm… I think the one feature missing in jpg is an alpha channel. PNG is for me the best format, but if the image is big, the png is a large file compared to the jpg. I think the next good step is to improve the png format.
Another problem with Google: as always, Google is a business, always thinking in business terms, so if they are to license vp8, and Google MAKES the standards (I don’t like that), everyone is going to suffer license problems sooner or later..
October 1st, 2010 at 2:09 pm
Is quality for images produced with libvpx the same as those produced using webpconv? From the way the WebP site is written it sounds like they’re only using libvpx for decoding. While of course you’re still using VP8 I-frames, it may be possible that webpconv is tuned better for quality rather than PSNR.
October 1st, 2010 at 2:14 pm
Glad to see another blog post from you, Dark Shikari!
Derek
October 1st, 2010 at 3:36 pm
I am not convinced that introducing a “new” image format makes sense: When I’m browsing the web on a standard broadband connection, images load instantly anyways, and Moore’s law holds not only for the bandwidth of Internet connections, but also for hard drive capacities.
October 1st, 2010 at 4:20 pm
WebPee? What kind of stupid name is that? Does that mean the images are going to be stained in yellow?
October 1st, 2010 at 5:51 pm
Here is a 152KB jpeg compressed with cjpeg and a perceptual quantization table: http://dailyburrito.com/output_cjpeg_perceptual.jpg
October 1st, 2010 at 5:58 pm
@Alan
jpegrescan shrinks that another 3%, so it’s actually smaller than it needs to be! Either way, that’s really impressive.
October 1st, 2010 at 6:19 pm
I’ve been arguing that jpeg has legs if we could just agree on a better quantization table for a while. And if someone added some de-blocking in the decoder …
Nice tip with jpegrescan. Here’s my 147KB result with that (unfortunately with cjpeg, the next quality step up gives 164KB after the rescan, so I won’t cheat and post that):
http://dailyburrito.com/output_cjpeg_spec_rescanned.jpg
October 1st, 2010 at 6:22 pm
And, for posterity, here’s the quantization table I was using. It is setup to approximate the perceptual performance of the eye at a viewing distance of 6 image heights:
# Quantization tables (6 image heights)
# Table 0 (Y channel)
5 3 4 7 11 16 24 34
3 4 4 6 8 12 18 25
4 4 8 9 11 15 20 28
7 6 9 14 16 20 26 33
11 8 11 16 26 28 34 42
16 12 15 20 28 41 46 54
24 18 20 26 34 46 63 71
34 25 28 33 42 54 71 95
# Table 1 (Cr channel)
7 9 18 28 43 63 91 128
9 8 17 23 33 48 68 94
18 17 31 34 43 58 78 105
28 23 34 55 63 77 98 126
43 33 43 63 98 108 128 157
63 48 58 77 108 154 174 204
91 68 78 98 128 174 239 255
128 94 105 126 157 204 255 255
#Table 2 (Cb channel)
14 18 46 71 109 161 232 255
18 17 43 59 85 122 173 240
46 43 80 87 111 148 200 255
71 59 87 142 160 196 249 255
109 85 111 160 251 255 255 255
161 122 148 196 255 255 255 255
232 173 200 249 255 255 255 255
255 240 255 255 255 255 255 255
October 1st, 2010 at 6:26 pm
@Alan
You can tweak all the values in the quant table up/down by a small amount to tweak quality by small amounts. That’s what I did with x264 for the PSNR test, since I couldn’t get the size to quite match.
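The table-nudging trick described above can be sketched like this. It is a minimal illustration, not the procedure actually used for the x264 PSNR test: the function name is made up, and the sample rows are just the first two rows of the Y table posted earlier in the thread. Values are clamped to 1..255, the range of 8-bit JPEG quantization tables.

```python
from typing import List


def scale_quant_table(table: List[List[int]], factor: float) -> List[List[int]]:
    """Scale every quantizer by factor, rounding half up and clamping to 1..255."""
    return [
        [max(1, min(255, int(value * factor + 0.5))) for value in row]
        for row in table
    ]


if __name__ == "__main__":
    # First two rows of the Y-channel table from the comment above.
    luma_rows = [
        [5, 3, 4, 7, 11, 16, 24, 34],
        [3, 4, 4, 6, 8, 12, 18, 25],
    ]
    # A factor above 1 coarsens quantization (smaller file); below 1 refines it.
    for row in scale_quant_table(luma_rows, 1.25):
        print(" ".join(str(v) for v in row))
```

Scaling by a factor close to 1 gives much finer size steps than moving to the next integer quality level, which is how a file can be landed on a size target that falls between two quality settings.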
October 1st, 2010 at 6:30 pm
@ Dark
Thanks. Good idea tweaking the quant table to match more exactly.
October 1st, 2010 at 6:44 pm
And I meant to post this: http://dailyburrito.com/output_cjpeg_perceptual_rescanned.jpg
October 2nd, 2010 at 1:23 am
We would all be better off if Google just supported JPEG XR.
We don’t need another format we just need support for something.
October 2nd, 2010 at 2:03 am
Google released a lot of strange things through the Chrome project that are more like early drafts. The spdy protocol is not really thought through either.
October 2nd, 2010 at 2:47 am
> And I meant to post this: http://dailyburrito.com/output_cjpeg_perceptual_rescanned.jpg
Interestingly, Opera doesn’t like this image at all – it either doesn’t display anything, or only displays a B&W image at 1/8th resolution. Firefox, Chrome and IE have no such problems.
October 2nd, 2010 at 3:24 am
Have a look at this website. They are doing a good job compared with Jpeg and Jpeg2000 if you look specifically at file size.
http://sites.google.com/site/dlimagecomp/
October 2nd, 2010 at 8:28 am
What the heck is cjpeg?
Are those jpegs standard compliant? (I seem to have no problem decoding them in Firefox)
Is it some kind of new and better jpeg encoder?
Why isn’t this being widely used? It is much better than the original jpeg.
October 2nd, 2010 at 8:40 am
Why deal with theories? HM has implemented its tiled and multi-dimensional interlacing algos, using high profile H.264 for stills coding. It is a 422 scheme, utilizing a standard H.264 encoder under mp4, allowing import/export of EXIF, which can be integrated with HW cellular codecs (a pilot with a leading handset vendor is on the way). Go to http://www.hipixpro.com and you can download freely and experiment with great Windows tools and Android. Viewers will always be free for the PC and developers are welcome to have their PC apps. The code has a huge advantage over WebP as it utilizes Inter-Intra tools and not just intra. It has to do with out-of-the-box technology, which leaves things within the standard. Try it yourself.
October 2nd, 2010 at 8:58 am
@Ed
Sorry I didn’t explain. cjpeg is a command line utility that comes with the IJG “reference” libjpeg library.
Note that the usage of non-standard quantization tables (that take human perception into account) is the reason that these jpegs look so nice but it is also those quantization tables that likely cause some decoders to fail with these jpegs.
October 2nd, 2010 at 12:01 pm
Are there any binaries made from jpgcrush and jpegrescan? Using perl on windows is pretty much a hassle :\
October 2nd, 2010 at 5:31 pm
Here’s the resulting image from the Hipix software referenced in response 68.
http://img121.imageshack.us/img121/9370/hipix.png
I used 165 for “Set Target Size” for an output size of 153KB.
IMO it’s nowhere near the quality of x264 on this particular image, there’s significant loss of detail, and noticeable artificial lines added into the image.
October 2nd, 2010 at 10:17 pm
For some reason I might prefer Theora over x264. Sure x264 has more details, but sometimes it looks too peaky compared to blurred neighbour. Theora is overall blurry, but has less jumps of texture retention and feels more natural.
Maybe there’s also a difference between psy for movies and psy for stills?
And yes, for vp8 that is embarrassing anyway. It fails. Even blocky jpeg is better than blurred lumps…
October 2nd, 2010 at 11:54 pm
Lossless-wise, I never understood why JPEG-LS got no attention. Compression is about as good as JPEG2K lossless (possibly a bit better), but it’s so much faster both to encode and decode, also compared with PNG.
October 3rd, 2010 at 12:37 am
Does anyone know of a good command line SSIM tool for still color images on Linux? I can’t get Medhi Rabah’s ssim.cpp or qpsnr to compile, nor can I get ssim.m to run in GNU Octave.
October 3rd, 2010 at 2:14 am
Dearest Han, I have no idea what you did, and I don’t have your source images. We tried all the WebP images and soon we’ll upload the gallery. Our gain for immaculate quality, compared to the source JPEGs, is huge. Where WebP saves 10% we save 40 and more. Try it yourself if you like. Everybody is invited to do that: hipixpro.com
October 3rd, 2010 at 2:27 am
Here’s the resulting image from the DLI software referenced in response 66.
http://img814.imageshack.us/img814/7184/dli.png
I used the default settings with “-q 20” for an output size of 153 KB.
The output quality was shocking.. I’ve tested numerous codecs that claim better compression than h264/x264 but this is the first time I’ve seen it. The image has similar detail to the x264 image but has significantly fewer artifacts (blocking, distortion around edges, pink chroma).
October 3rd, 2010 at 2:39 am
@Ira Dvir
I used the source image referenced in the article/blog. You’re free to download it and test to see if I somehow compressed the image incorrectly. I’ll get around to trying your software with more images for a better evaluation.
October 3rd, 2010 at 2:56 am
What is the use case illustrated here ?
A 150 KB compressed image implies a bit rate of about 0.6bpp…
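The bits-per-pixel figure quoted above is simple arithmetic, sketched below. The 1920×1080 frame size is an assumption: the post does not state the source dimensions, but that size is consistent with the roughly 0.6bpp quoted.

```python
def bits_per_pixel(size_bytes: int, width: int, height: int) -> float:
    """Average number of compressed bits spent per image pixel."""
    return size_bytes * 8 / (width * height)


if __name__ == "__main__":
    # 150 KB file, assumed 1920x1080 frame (dimensions not stated in the post).
    print(f"{bits_per_pixel(150 * 1024, 1920, 1080):.2f} bpp")
```

For comparison, an uncompressed 8-bit 4:2:0 image carries 12 bits per pixel, so a rate near 0.6bpp corresponds to roughly 20:1 compression relative to that.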
October 3rd, 2010 at 7:22 am
Guys, the use case is clear, and the technology can be used with any H.264 CoDec (including x264). Go to the hipixpro.com site, there’s a lot of stuff to read and experiment with. The main advantages: no limitation of resolution, using existing HW/SW CoDecs (could be even WebP) for Inter-intra stills coding. This is going into cellular with the existing HS… Try using the presets for simpler operation, and see the results for yourselves.
October 3rd, 2010 at 10:56 am
http://hipixpro.com/webpcompare.html
Here’s the comparison gallery of Hipix vs. WebP. You can download the files.
October 3rd, 2010 at 1:00 pm
Is better image compression still required?
Today the biggest part of bandwidth is video content not images.
4 temporal generations of compression:
1. Generic Data (nothing changes in >15 years)
2. Image (nothing changes in >15 years)
3. Sound (nothing changes in >15 years)
4. Video (everything is changing )
So generic data, image and sound compressions are closed books.
….
1. First there was a data compression (zip, rar) and all people compressed their 20-30 KB files with it.
Now when we use email we barely care about compression if the size of attachment is smaller lets say than 1-2 MB. Just try to send something like .7z
2. The same is applicable to lossy compression of images. JPEG is just universal like ZIP.
3. MP3 compression is as old as JPEG, and AAC is from the same decade (+3-4 years). HE-AAC doesn’t count, as it was a subset for low bitrates and not an entire replacement of AAC. And MP3 is vastly more popular than AAC today.
October 3rd, 2010 at 6:25 pm
Not sure how well this applies with relation to colorspace conversions and whatnot, but I just took the source image and simply saved it out with IrfanView as a JPG (no fancy reworking of quant tables), setting the quality down to 18 to reach a 152 KB file size, and it looks substantially better than the jpg DS put up. I’d rate it as slightly better than the theora pic, but a bit worse than the x264 pic (x264 is substantially better in the water reflections, but mostly similar elsewhere). So the bar that jpg sets seems to be a fair bit higher than implied in the article.
On the other hand, the hipixpro image really does look quite impressive, mainly in not introducing a lot of the noise that you get in the x264 image while still retaining that high image quality. If they could add transparency to it (not sure if that’s easy to do, given that it’s based on h.264) I’d be quite happy to switch to it… aside from the fact that Firefox will never support it because it’s still tied to h.264…
October 3rd, 2010 at 9:22 pm
Here’s my attempt:
http://people.mozilla.org/~jmuizelaar/parkjoy.jpg (155K)
It uses the same quantization table as Alan’s but also uses trellis quantization to get closer to the target rate.
@Alan, how were your quantization tables derived?
October 3rd, 2010 at 10:23 pm
Oh my god!
Your site, as good as your article seems to be, is unreadable. My eyes hurt after a few seconds. Do yourself a favour and switch your colors to an eye-friendly format as described in a thousand books about webdesign for the bloody beginner! Have you ever tried to read your site yourself?
Amateurish, nerdy and painful…
Peter
October 3rd, 2010 at 11:41 pm
@Peter
I use the same colorscheme for my site that I use for all my applications on my own computer (at least ones where I can change the color scheme easily).
I personally do not believe that staring into the sun (i.e. dark text on a light background) is suitable for anything any sane person would want to read. Maybe you are of some other species which is adapted to staring into extremely bright lights for long periods of time. I am not, and neither are most of my readers. Well, at least I assume most of my readers are human — I can’t be sure of that!
Above all, I refuse to use a colorscheme that I cannot read comfortably for more than a couple minutes. Black text on a white background falls squarely into that category, giving me headaches quite rapidly. I would rather not have to pop ibuprofen every few hours (and ruin my vision!) just because some random commenter on the internet tells me that I should stare into a lamp all day.
Now, if you are reading this on a device without a backlight (e.g. a Kindle), then it might be useful to have an alternate CSS. But I don’t think anyone is reading this on a Kindle.
If you do in fact happen to be of some alien species that likes staring into bright lights, try this script.
October 4th, 2010 at 12:13 am
@Alan:
> Note that the usage of non-standard quantization
> tables is the reason that these jpegs look so nice
> but it is also those quantization tables that likely
> cause some decoders to fail with these jpegs.
I strongly doubt that. In JPEG, there’s no scalar quantizer (like the QP in H.264), so the only way to signal the amount of quantization to the decoder is specifying complete quantization tables in the header. A decoder that assumes fixed quantization tables would fail for 99.9% of all files.
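To make this concrete: the tables travel in DQT segments (marker 0xFFDB) in the file header, and a decoder has to read them from there. Below is a minimal sketch of pulling them out of a JPEG byte stream. It is a simplified illustration, not a full parser: the function name `read_quant_tables` is hypothetical, it assumes every marker before the scan carries a length field, and it ignores fill bytes. The demo stream is synthetic and hand-built, not a real image.

```python
import struct


def read_quant_tables(jpeg: bytes) -> dict:
    """Return {table_id: 64 quantizer steps, in zigzag order} from DQT segments."""
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("missing SOI marker; not a JPEG stream")
    tables = {}
    pos = 2
    while pos + 4 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = jpeg[pos + 1]
        if marker in (0xDA, 0xD9):  # SOS or EOI: the header part is over
            break
        # The 16-bit length counts its own two bytes but not the marker.
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        if marker == 0xDB:  # DQT: one or more tables packed back to back
            seg = jpeg[pos + 4:pos + 2 + length]
            i = 0
            while i < len(seg):
                precision, table_id = seg[i] >> 4, seg[i] & 0x0F
                i += 1
                if precision == 0:  # 8-bit entries
                    values = list(seg[i:i + 64])
                    i += 64
                else:  # 16-bit big-endian entries
                    values = list(struct.unpack(">64H", seg[i:i + 128]))
                    i += 128
                tables[table_id] = values
        pos += 2 + length
    return tables


# Synthetic stream: SOI, one DQT segment holding an 8-bit table 0, then EOI.
demo = (b"\xff\xd8" + b"\xff\xdb" + struct.pack(">H", 2 + 1 + 64)
        + bytes([0x00]) + bytes(range(1, 65)) + b"\xff\xd9")
tables = read_quant_tables(demo)
print(sorted(tables), tables[0][:4])
```

Since every file carries its own tables this way, custom tables by themselves use the same mechanism as the default ones scaled by a quality setting.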
October 4th, 2010 at 2:41 am
@David
I think you’re mixing up the hipixpro and DLI images; the hipixpro one looks similar to x264 PSNR optimized, while DLI is like x264 with less distortion.
October 4th, 2010 at 8:48 am
It seems that Han and some of you guys don’t really understand that it is still pictures and not HDTV frames we are discussing. First – for low res (circa 1MP) x264 and hipix are more or less the same. However, there is a huge gap between the efficiency of pure intra and the inter-intra coding we use. The reason? Hipix was designed for cellular handsets and cameras, supporting 5, 8, 12, 16, 21 and also 40 and 50MP. The higher the resolution, the more effective it is.
October 4th, 2010 at 11:05 am
@BondEar
I have a quick and dirty utility for measuring the SSIM between two pngs.
http://github.com/jrmuizel/ssim
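For anyone unfamiliar with the metric, here is a minimal sketch of the SSIM formula itself. This is not the code from the linked utility: it is a simplified single-window version that treats the whole image as one block of luma samples, whereas the usual definition averages the same expression over small local windows. The function name `global_ssim` and the sample values are made up for illustration.

```python
def global_ssim(x, y, max_value=255.0):
    """Single-window SSIM over two equal-length sequences of luma samples."""
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and the same length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # Stabilizing constants from the standard SSIM definition.
    c1 = (0.01 * max_value) ** 2
    c2 = (0.03 * max_value) ** 2
    return ((2 * mean_x * mean_y + c1) * (2 * cov + c2)) / (
        (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    )


# Made-up sample values standing in for decoded luma planes.
reference = [10, 50, 90, 130, 170, 210]
distorted = [12, 40, 100, 120, 175, 200]
print(global_ssim(reference, reference))
print(global_ssim(reference, distorted))
```

Identical inputs score 1 (up to rounding), and any distortion pulls the score below that.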
October 4th, 2010 at 12:25 pm
@Ira
“Inter-intra” sounds like SVC or wavelets. Aka useless. Can you be more specific? You can’t really use inter coding on one image. Unless you mean in-intra motion vectors.
October 4th, 2010 at 2:15 pm
@Han – You’re right, I mixed them up. The DLI image is indeed what I was looking at when I thought I had the hipix image up. Will have to look into DLI some more, then.
October 4th, 2010 at 9:53 pm
@Jeff
Where’s ssim.h? It’s referenced in ssim.c but isn’t in the package.
October 5th, 2010 at 5:39 am
@BondEar
Fixed.
October 5th, 2010 at 6:27 am
Dark Shikari – Inter-Intra means IP and IBBP GOPs for high res images. Read more at hipixpro.com; we have a description of the technology. Try it with 8-12-16-21MP images and you’ll find how effective it is compared to simple Intra coding.
October 5th, 2010 at 7:12 am
@Ira
That’s just a horribly inefficient implementation of in-intra motion vectors. If you’re going to rip off a 10-year-old technology and claim that it’s yours, at least do it efficiently.
October 5th, 2010 at 8:32 am
Wouldn’t it be possible to create a JPEG encoder targeting SSIM over PSNR, possibly using multiple quant tables to allow some sort of psy optimizations (I’m not sure if JPEG allows switching quant tables in the middle of a scan… I’ll check)?
Moreover, if the (optional) arithmetic coding feature were sufficiently widespread decoder-side nowadays, it would allow JPEG to improve quite a lot while retaining compatibility with existing decoders, wouldn’t it?
October 5th, 2010 at 3:47 pm
What do you think of hipix, Dark Shikari?
http://www.vizworld.com/2010/10/hipix-image-format-challenge-google-webp/
October 6th, 2010 at 8:07 am
KJ645: read the comments on this blog. The post just above yours is what Dark Shikari thinks of hipix. =)
October 6th, 2010 at 2:12 pm
@Ira
Given the information that Hipix performs better on higher resolution images, I re-tested using some of the images from http://www.imagecompression.info/test_images/
Unfortunately, Hipix performed much as it did with the smaller < 1MP images, i.e. results were similar to x264 targeting PSNR. x264 with psychovisual/HVS optimizations is significantly superior on every image tested. To help Hipix's cause, I suggest providing a specific example where it performs better than x264 that the public can verify.
October 6th, 2010 at 9:32 pm
Is it just me, or has the text of the site become enormous? It’s hard to read now ^^;
October 7th, 2010 at 3:20 am
Regarding site colour scheme: while I prefer bright text on dark background (just check out my Windows colour scheme: http://eternallybored.org/imgs/thebat.png), I think that the text on this site is a bit too bright – however since most sites use black text with white background, I browse most of the time with custom CSS enabled anyway, so it doesn’t bother me as much.
October 7th, 2010 at 9:11 am
Dear Han, since Hipix was designed to enhance the efficiency of “existing h.264 encoders” to compress still images, and utilize existing HW codecs within the HW constraints (like coding 12MP images using a 720p encoder), I’ll be happy to explore the possibility of further enhancing the quality by using the x264 code within the scheme for PC SW, which we want to be distributed freely (within the limits of the h.264 licensing terms). If it is of interest to you, contact me at ira@human-monitoring.com and we’ll be more than happy to cooperate.
October 7th, 2010 at 1:11 pm
Ender, doesn’t this break most web sites? Are you using firefox or chrome?
Sorry, I know this is off topic here…
October 7th, 2010 at 2:07 pm
Don’t support patent encumbered image formats, everyone should have learned from the GIF debacle.
Yes, that means HIPIX also.
http://burnallgifs.org/archives/
PNG is superior image without all the artifacts of inferior lossy compression garbage, patent free, open format.
http://en.wikipedia.org/wiki/Portable_Network_Graphics
October 8th, 2010 at 3:06 am
Just wanted to say: if I were Google and trying to improve my image format, I don’t think I could have made a better move than to post it early. Just reading through this thread, they would have most of the brainstorming already done for them, and ready to organise.
October 8th, 2010 at 12:04 pm
PNG is inferior itself – it *only* does lossless compression (hello tenfold increase in bandwidth; oh wait, wasn’t this what Google meant to address?) – but at lossless compression it is easily beaten by JPEG XR or JPEG 2000. Screenshots of desktops dominated by GUI elements are the exception, but hey…
October 8th, 2010 at 12:10 pm
Moving everything to PNG as many comments have suggested is a terrible idea.
You want to try storing your 12MP camera images as .PNG? They would be massive and RAW already exists for PROs who need it.
PNGs for web sites are *much* heavier than .jpg when the image is continuous tone and larger than 250×250. If you want only flow chart line drawings, .png is perfect.
PNG also does not support HDR like JPG XR does, and HDR is expanding more to the general public as it becomes more automated.
PNG also does not support high bit depth pixels like TIFF and JPG XR. This mostly affects pro uses, but it would be a plus to have the capability in a common format.
Btw, can’t wait to see how well x264 can do on Sandy Bridge, hope DS will comment on this soon:
http://lee.hdgreetings.com/2010/09/intel-cpu-vs-nvidia-gpu-video-transcoding.html
October 8th, 2010 at 12:49 pm
@Fruit
Last I remember, jpeg-xr was worse than png compressed with pngout in terms of filesize.
October 9th, 2010 at 6:24 am
On photographic image?
Quick test here (only with an old HD Photo plugin though – should be at least similar) shows not. Also, it is with a pngout, while no such optimizer was used for the hdp/jxr image.
I don’t even know where to get a proper/good jpeg XR encoder, much less some that would be guaranteed to have better than average compression.
October 9th, 2010 at 8:38 am
Fruit, you should try Microsoft Expression Design (I hope there’s a trial available somewhere).
October 9th, 2010 at 3:23 pm
Guys, soon we’ll release Hipix plugins for a few popular picture editing tools like Photoshop and XnView. The main issue for us is hi-res photos coming from cellular handsets with 8, 12, and soon 16MP, where the CMOS output is compressed using Hipix far better than with JPEG-XR, JP2K and WebP. We also support the following: 422 color format (as coming from the CMOS) and EXIF import/export. PNG is worth nothing for this purpose (as well as for pictures as pictures). It is a great format for other purposes. We are operating in the space which is focused on EXISTING HARDWARE. JP2K never got there, and neither did JPEG-XR. Show me a codec which can encode 60MP per second on an existing cellular device with 30% higher file size than Hipix and I’ll call it competition.
October 10th, 2010 at 6:09 am
I’m not impressed how WebP and Hipix decoders interpolate chroma info:
http://img822.imageshack.us/img822/125/santacrop.png
October 11th, 2010 at 3:01 am
Next to JPEG2000, LuRaWave (developed by the German Aerospace Company = Deutsche Luft- und Raumfahrt-Technik) could have been one of the best technologies to compete with JPEG. If only it did not rely on plugins…
October 11th, 2010 at 11:08 pm
@Ira All your arguments are empty; people asked for a sample where Hipix is good, but none has come yet. The one posted only shows that it is not that good.
Please post a source image (in png) and a Hipix-coded image decoded to png. Choose the source to your benefit (i.e. the “hi-res coming from cellular handsets” ones).
October 12th, 2010 at 8:47 am
Fractal compression FTW
October 17th, 2010 at 12:29 am
I haven’t seen any image quality comparison based on only one image and one compression point like this. We can only say that this is a special-case analysis of these image coders. In this special case, we can see that VP8 turns the blocking effect into a blurring effect by using the deblocking filter. If you don’t like this blurred image, just turn off the deblocking filter. (P.S. Since this is intra coding, the decoder can choose to turn the deblocking filter on or off. I see a blocking effect similar to JPEG’s in this case if I turn off the deblocking filter when decoding.) If you want less blocking, just decrease the DC QP value. BTW, the flat quantization matrix for VP8’s AC coefficients is designed to optimize PSNR, not this kind of subjective view.
Please never compare image coders using such a small data set and configuration. It is much more misleading than any other comparison.
BTW, there is indeed one problem with VP8 intra coding for still images. It can’t provide lossless and fast transcoding for image rotation and flipping like JPEG can. The intra prediction makes this kind of transcoding lossy and time-consuming.
October 18th, 2010 at 3:40 am
@Ira
I know you are a senior official at Hipix, therefore your reasoning and comparisons might be biased.
Are there any comparisons of Hipix vs. VP8 performed by independent experts?
October 18th, 2010 at 7:18 am
@Shevach Riabtsev: here you go, pick your poison:
http://img822.imageshack.us/img822/125/santacrop.png
October 21st, 2010 at 10:53 pm
Comparisons:
http://www.mathias-schindler.de/2010/10/22/webp-file-format-remarks/
http://englishhard.com/2010/10/01/real-world-analysis-of-googles-webp-versus-jpg/
October 26th, 2010 at 6:30 pm
Here is another comparison of interest:
http://jpegmini.com/?page_id=108
October 29th, 2010 at 4:57 am
@Ugg
New libvpx version was released just yesterday:
http://blog.webmproject.org/2010/10/vp8-codec-sdk-aylesbury-release.html
November 7th, 2010 at 5:09 am
It has nothing to do with wrong promotion, better compression than JPEG, 20 years of JPEG, etc.
It has to do with who came up first and was standardized into a wealth of software and hardware at just about the same time as the internet really ignited, blew up in our faces and spread over to phones and other media devices. This WebP effort is way too late. It is the same argument as OGG vs. MP3. Which came first? MP3. MP3 is and always will be the most supported/common format around, even if it supposedly has flaws in comparison with OGG. Most people don’t care; MP3 was available first and is supported everywhere. That is the easy answer to why we do not need JPEG improvements or MP3 improvements; in any case, it will take several decades before something like this becomes reality and is implemented. There are already way too many newer video and audio formats to confuse us; let’s pretend it doesn’t happen to pictures too.
November 16th, 2010 at 2:06 pm
Well, WebP works immediately in Opera if you grab a compatibility userjs from extendopera.org, but I still wish the company added h264 to its native formats on desktop.
January 26th, 2011 at 11:50 am
libvpx has received many improvements since the 0.9.5 release. Any significant progress on the encoder?
February 19th, 2011 at 9:54 am
https://review.webmproject.org/1780
March 26th, 2011 at 5:12 pm
Jason, thank you very much; very interesting conclusions, but let me add a few dimensions of my own to your model:
1st Dimension – patent cleanliness
So, the first dimension added to this model is patent cleanliness. It is no secret that the scientific clusters (R&D bundles) feeding modern standards began to intertwine back in the 1930s. Not only H.264, but all modern “technology bundles”, such as LTE, WiMAX, etc., have at least 50 years of iterative research history. Simply picture the ITERATIVE graph of research history for all the H.264-related mathematics and engineering. Almost 99% of this material you, and I personally, have memorized as an empirical approach. What is the point of blaming one group of people or another for applying an algorithm similar to another algorithm (or even a set of algorithms for a problem area)?
It seems that the 20th century has arrogated to itself the right to everything newly invented. We are just lucky that there were earlier centuries in which this was not done. Otherwise we could not have breakfast, lunch and dinner, or even attend to our natural needs.
The point of Google’s VP8 initiative, in my opinion, is not some qualitatively new innovation. It is simply the creation of a (nearly) innovative flow that is free from the patent claims known to Google. Although, of course, given the very troubled patent system of the USA, it is not clear what Google has up its sleeve. And once this core is formed, most enterprising and ambitious people, such as you, need simply extend this core with empirical and theoretical contributions.
2nd Dimension – energy
It is absolutely clear that now is the time to freeze the full standard for digital content and to escape for good from the hell of historical (almost analog) layers in media. Yes, most of these transformations are connected with transformations of meta-information. However, the major headaches come from converting heavy media content, such as photos, audio, and especially video. It is quite obvious that we simply do not realize the energy scale of this problem. For example... (no, a power consumption example is not needed, because nearly everyone who reads this blog knows about it).
The video industry also has absolutely wild time-domain, PWM-like interpolation solutions, such as interlacing and the inverse telecine process, which require a very power- and energy-intensive decoding approach with incoherent results, because the path back to the source material does not survive 5-10 years!
3rd Dimension – politics
Some states, the ones called the states of the “third world”, have patent policies that differ significantly from the patent policy of the US. And these states are home to 150+160+2500+1500+500 = 4810 million, i.e. 4.81 billion, people. You can be sure that no one there will ever transfer a cent for a patent list such as H.264’s. Why, indeed, pay for someone else’s list? It is too comprehensive and contains a large number of patents that intersect with local patents of these countries.
In order to avoid problems that would discredit US patent law, most of the patents on H.264 will, one way or another, be released in foreign jurisdictions.
Ideas for targets for the x264 project
1. Create an umbrella of empirical algorithms, especially motion compensation/estimation (mocomp/moest) algorithms, that are as protected from patents as possible. Most likely, they will also be used later in VP8.
2. Do not scold VP8; let the guys relax.
3. R&D. Take a still more serious approach to the issue of HW/SW partitioning, introduce a third level of abstraction for the internal API/thread model, and start experiments with at least CPU-based OpenCL (blocking, attached to the current thread model).
4. R&D. Redesign the heterogeneous, OpenCL-based thread model.
5. Evolution to an OpenCL-based heterogeneous design.
Without going into details, OpenCL is the only way for a programmer to get from the von Neumann model to the hardware pipeline without serious bugs in the brain.