Introducing the ‘mozjpeg’ Project (blog.mozilla.org)
409 points by joshmoz on March 5, 2014 | 128 comments



Bravo. I love JPEG. Amazing that it's been 23 years since its release and it remains as useful as ever.

I remember what it was like to watch a 320*200 JPEG image slowly build up on a 386SX PC with a VGA card. Today, an HD frame compressed with JPEG can be decoded in milliseconds. This highlights the secret to JPEG's success: it was designed with enough foresight and a sufficiently well-bounded scope that it keeps hitting a sweet spot between computing power and bandwidth.

Did you know that most browsers support JPEG video streaming using a plain old <img> tag? It also works on iOS and Android, but unfortunately not in IE.

It's triggered by the "multipart/x-mixed-replace" content type header [0]. The HTTP server leaves the connection open after sending the first image, and then simply writes new images as they come in, as if it were a multipart file download. A compliant browser will update the image element's contents in place.

[0] http://en.wikipedia.org/wiki/MIME#Mixed-Replace
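
For anyone who wants to try it, here's a minimal sketch of such a server using Python's standard library (the port, boundary string and the frames/*.jpg source are placeholder assumptions, not anything from a real setup):

    # Minimal MJPEG-over-HTTP sketch: one multipart response, one JPEG per part.
    import glob
    import time
    from http.server import BaseHTTPRequestHandler, HTTPServer

    BOUNDARY = "frame"

    class MJPEGHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            # Announce a multipart stream and keep the connection open.
            self.send_response(200)
            self.send_header("Content-Type",
                             "multipart/x-mixed-replace; boundary=" + BOUNDARY)
            self.end_headers()
            # Write each new image as another part; a compliant browser
            # replaces the <img> contents in place.
            for path in sorted(glob.glob("frames/*.jpg")):
                with open(path, "rb") as f:
                    frame = f.read()
                self.wfile.write(("--%s\r\n" % BOUNDARY).encode())
                self.wfile.write(b"Content-Type: image/jpeg\r\n")
                self.wfile.write(("Content-Length: %d\r\n\r\n" % len(frame)).encode())
                self.wfile.write(frame)
                self.wfile.write(b"\r\n")
                time.sleep(0.1)  # crude pacing, roughly 10 fps

    if __name__ == "__main__":
        HTTPServer(("", 8080), MJPEGHandler).serve_forever()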


Unfortunately chrome recently removed support for "multipart/x-mixed-replace". It's too bad too. It was a simple way to implement a webcam.

https://code.google.com/p/chromium/issues/detail?id=249132 http://blog.chromium.org/2013/07/chrome-29-beta-web-audio-an...


According to those links, they still support it for images, but removed support for other types of resources.

It's too bad IE never supported multipart/x-mixed-replace or we might have seen more live updating websites earlier. Now that we have WebSockets and well understood long polling approaches it doesn't matter any more, since x-mixed-replace would keep the download spinner spinning forever and the newer approaches don't have that problem.


I used multipart replace in Firefox for streaming data for years with no download spinner. I couldn't use it in Chrome because Chrome always seemed to have some weird bug where 'frames' (of data in my case) were delayed.

I was disappointed when they were unceremoniously ripped out, but yes, websockets are better.


Ok, I believe you, it's been over 10 years since I tried it :-) Maybe I was thinking of the forever iframe, where a regular application/javascript document had content added to it incrementally over time.


It is actually pretty bizarre. If you point Chrome at a .jpeg that does multipart/replace it will stall, giving you roughly 0.2 fps. Point it at a .html that contains an <img> tag and it works fine. Found this out while hacking on Hawkeye: https://igorpartola.com/projects/hawkeye/


Thanks. I misunderstood that post.


Apparently breaking the web interface for older versions of CUPS in the process.


It is interesting that you bring up JPEG video streaming with the <img> tag in the context of a Mozilla project. It works great in Chrome and Safari, but in Firefox mjpeg streams only show the first frame. Here is the bug for it, https://bugzilla.mozilla.org/show_bug.cgi?id=479015, which has been unconfirmed since 2009, so clearly there is nobody over there who cares about this functionality.


I play with motion jpeg a lot actually and I've never had an issue with Firefox.


The part that is amazing is the fast Fourier transform (http://en.wikipedia.org/wiki/Fast_Fourier_transform); it enabled the way we do stills, audio and video nowadays.

But there is more: (https://www.google.nl/search?q=fast+fourier+transform+use+ca...)


Except that JPEG uses the Discrete Cosine Transform on 8x8 blocks of pixels, which, afaik, doesn't make use of the FFT.
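
For the curious, here's a tiny sketch of that per-block step (SciPy assumed; the thresholding below is only a stand-in for JPEG's actual quantization tables):

    # Forward 2D DCT-II on an 8x8 block, crude coefficient thresholding,
    # then the inverse transform to reconstruct an approximation.
    import numpy as np
    from scipy.fft import dctn, idctn

    block = np.random.randint(0, 256, (8, 8)).astype(float) - 128  # level shift

    coeffs = dctn(block, norm="ortho")               # per-block transform
    kept = np.where(np.abs(coeffs) > 10, coeffs, 0)  # drop small coefficients
    approx = idctn(kept, norm="ortho") + 128         # reconstruct the block

    print(np.abs(approx - (block + 128)).max())      # worst-case pixel error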


The DCT is analogous to the FFT.


I wrote a little toy in PHP to play with motion jpeg actually - here: https://github.com/donatj/mjpeg-php/blob/master/mjpeg.php


I used a picture viewer named LXPIC in DOS on a 486 to look at JPEGs - and it was incredibly fast and small. Apparently it works even on the original IBM PC with an 8088, but I've never tried that.


But each frame is a separate JPEG, right? So I'm guessing the compression here is less than fantastic.


It's different from something like mjpeg. More like the jpeg2000 format used by digital (4k) cinema[1].

For web cams, where you might not want true "live" video but prefer higher resolution still frames, it sounds like a reasonable choice.

[1] https://en.wikipedia.org/wiki/Digital_Cinema_Package


There's no fundamental difference between MJPEG and streaming a sequence of independent JPEG files over HTTP.

The MJPEG format doesn't really exist anymore: it was designed in the '90s to account for interlaced video content, but that's a rare breed nowadays. For progressive video, Photo-JPEG is equivalent.

Many popular intraframe video codecs are basically the JPEG algorithm with some modifications for specific pixel formats and some custom metadata. These include Apple ProRes, Avid DNxHD and the stalwart DV format (as in MiniDV tapes).


My bad, I thought mjpeg was a keyframe based format.


Somebody please post a link to a multipart IMG, I want to see what you guys are talking about.

It must be cool for sure.


http://141.89.114.58/cgi-bin/video320x240.mjpg?dummy=garb

Nothing in Chrome, but a (slowly) animating image in Firefox


not worse than gif tho :p


This is very promising. Images by far dominate a web page, both in number of requests and total number of bytes sent [1]. Optimizing image size by even 5-10% can have a real effect on bandwidth consumption and page load times.

JPEG optimization using open source tools is an area that really needs focus.

There are a number of lossless JPEG optimization tools, but most are focused on stripping non-graphical data out of the file, or on converting the image to a progressive JPEG (since progressive JPEGs rearrange the pixel data, you can sometimes get better compression because there may be more redundancy in the rearranged data). Short of exceptional cases where you can remove massive amounts of metadata (Adobe products regularly stick embedded thumbnails and the entire "undo" history into an image), lossless optimization usually only reduces file size by 5-15%.

Lossy JPEG optimization has much more potential. Unfortunately, beyond proprietary encoders, the most common lossy JPEG optimization is simply to reduce the JPEG quality setting. This always felt like killing flies with a tank, so advances in this area would be awesome.
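
As a rough illustration of the two knobs (not any particular tool's internals), here is a sketch with Pillow; "photo.jpg" is a placeholder, and note that Pillow decodes and re-encodes the pixels, so unlike jpegtran/jpegoptim this is not a bit-exact lossless rewrite:

    from PIL import Image

    im = Image.open("photo.jpg")

    # Optimized Huffman tables, progressive scan order, no metadata carried
    # over: the "lossless-style" savings described above.
    im.save("optimized.jpg", quality=95, optimize=True, progressive=True)

    # The blunt lossy approach: just turn the quality setting down.
    im.save("smaller.jpg", quality=70)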

I've written extensively about lossy optimization for JPEGs and PNGs, and spoke about it at the Velocity conference. A post and my slides are available[2].

[1] - http://httparchive.org/trends.php

[2] - http://zoompf.com/blog/2013/05/achieving-better-image-optimi...


JPEG has shown amazingly good staying power. I would have assumed "JPEG is woefully old and easy to beat" but Charles Bloom did a good series of blog posts looking at it, and my (non-expert and probably hopelessly naive) takeaway is that JPEG still holds its own for a 20+ year old format.

http://cbloomrants.blogspot.com/2012/04/04-09-12-old-image-c...


Do note that his "JPEG" is really JPEG with a PAQ entropy coder, which is actually a new format and not at all decodable by a JPEG decoder. He does no tests on baseline JPEG.

But it's useful to point out that a lot of new overly complex formats can't beat something as simple as that.


In my opinion, the biggest drawback of JPEG is that its window is non-overlapping. Ringing artifacts are generally not a big deal for natural images at medium quality or higher, but blocking artifacts can be noticeable even at relatively high quality settings.

There are a lot of post-processing techniques to try and mitigate this, but in my experience they tend to do about as much damage as they fix. The proper solution is to overlap the blocks using one of the myriad techniques DCT-based audio codecs use.

It is bizarre to me that for all of the attempts to beat JPEG, nobody seems to have tried simply overlapping the blocks by 2 pixels. You'd have an implementation only marginally more complex than JPEG (in fact, you can even implement it on top of an existing JPEG encoder/decoder) with a slowdown of only 25%.


JPEG-XR has (optional) lapping. But lapping has the problem that you can't do spatial intra prediction, which is significantly more valuable. And no one figured out how to make frequency-domain intra prediction as good until Daala. Plus deblocking filters have gotten pretty good now that they're tuned based on the quantizer used.

But a bigger problem is that no one is really interested in designing a new still image codec that's better than JPEG, since JPEG can't be unseated. So video codecs are where the practical development goes. And avoiding in-loop deblocking filters there means OBMC, which is extremely computationally intensive.


We'd likely have moved on to jpeg2k if it weren't for the patents.


(*)Charles Bloom.


Fixed.


For improving general-purpose gzip / zlib compression, there is the Zopfli project [1] [2]. It also has (alpha quality) code for PNG file format; since this functionality wasn't originally included, there are also third-party projects [3].

You might be able to shave a percent or so off the download size of compressed assets.

[1] https://news.ycombinator.com/item?id=5316595

[2] https://news.ycombinator.com/item?id=5301688

[3] https://github.com/subzey/zopfli-png


Now if only they'd do a mozpng.

(For context: libpng is a "purposefully-minimal reference implementation" that avoids features such as Animated PNG decoding. And yet libpng is the library used by Firefox, Chrome, etc., because it's the one implementation with a big standards body behind it. But if Mozilla just forked libpng, their version would instantly have way more developer-eyes on it than the source...)


There's already a "mozpng", it's called Zopfli. There's also AdvPNG which compresses PNGs with 7-zip's deflate implementation.

And if you want much much smaller PNGs, then try http://pngquant.org or http://pngmini.com/lossypng.html


Or try ImageOptim, which bundles these and other tools with a nice GUI: http://imageoptim.com/


ImageOptim also bundles PNGOUT, which I find to have exceptional compression when compared to the others (but it is a bit slower)


See my comment about zopfli [1] for improved png compression algorithm.

[1] https://news.ycombinator.com/item?id=7349635


Mozilla already uses patched libpng (with APNG support).


We've been using http://www.jpegmini.com/ to compress JPGs for our apps. It worked OK, although we didn't get the enormous reductions they advertise. However, 5-10% still makes a difference.

We've been using the desktop version. We would love to use something similar on a server, but jpegmini is overpriced for our scenario (I won't keep a dedicated AWS instance running just to compress images every second day or so). Will definitely check out this project :)


Have u tried https://kraken.io ?


I noticed that optimizing JPEG images using jpegoptim (http://www.kokkonen.net/tjko/projects.html) reduces the size by a similar factor, but at the expense of decoding speed.

In fact, on a JPEG-heavy site that I was testing with FF 26, there was such a degradation in terms of responsiveness that transitions would stutter whenever a new image was decoded in the background (while preloading).

It made the effort to save 2-4% in size a waste, traded for a worse user experience.


Did you file a bug for this? This doesn't sound normal at all.


Honestly, no. libjpeg would show a similar slowdown (interestingly, PNG decoding is slower than JPEG for the same size), and it makes sense anyway.

The problem is that even if the bug were fixed in recent FF versions, libjpeg is basically used in all other browsers as well.


I'm using jpegoptim extensively and haven't noticed such behavior.

All jpegoptim does is rewrite JPEG with optimized Huffman tables, so it shouldn't have any impact on decoding performance. In the process it also changes progressive to baseline, which is even slightly faster to decode.

If you can reproduce the problem with libjpeg-turbo (which is the library that browsers use) you should definitely file a bug.


If my goal were to compress say 10,000 images and I could include a dictionary or some sort of common database that the compressed data for each image would reference, could I not use a large dictionary shared by the entire catalog and therefore get much smaller file sizes?

Maybe images could be encoded with reference to a common database we share that has the most repetitive data. So perhaps 10MB, 50MB or 100MB of common bits that the compression algorithm could reference. You would build this dictionary by analyzing many, many images. The same type of approach could work for video.
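
The closest off-the-shelf analogue is probably zlib's preset-dictionary support; a rough sketch (it works on raw bytes rather than image coefficients, so it only illustrates the idea, and the dictionary contents here are hypothetical):

    import zlib

    # Pretend this was built offline by analyzing many, many images and is
    # shipped to every client once.
    SHARED_DICT = b"...common byte patterns mined from the corpus..."

    def compress(data: bytes) -> bytes:
        c = zlib.compressobj(level=9, zdict=SHARED_DICT)
        return c.compress(data) + c.flush()

    def decompress(blob: bytes) -> bytes:
        d = zlib.decompressobj(zdict=SHARED_DICT)
        return d.decompress(blob) + d.flush()

    assert decompress(compress(b"some image-ish payload")) == b"some image-ish payload"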


Well, if we're shipping a common database and you're maximizing transmission efficiency then yes of course.

In information theory, my understanding (which is quite limited) is that when we talk of bits transmitted we can think of it as "uniquely identifying from within the set of total possible messages". So, if you shipped the entire 10,000 image catalog "transmitting" an image from within that catalog would take you a mere, uh, let me count my fingers, 13 bits.

We could go one step further and find some way of hashing all the data together to remove any redundancies and so forth - but the problem alas is about defining arbitrary images :).

What you described tho is kind of what happens with all compression algorithms except on the "micro"/individual image level. You may already be aware but check out Huffman coding: http://en.wikipedia.org/wiki/Huffman_coding for a simple intro.


I am aware that compression algorithms kind of work on this general idea of referencing common bits. Of course I am aware of that.

How do you read my question and interpret it simply as "let's send all of the images in full and then give their index and call it compression"? What I suggest is that we take a standard encoding technique like Huffman, or some modification of it, but rather than creating a table based on data in an individual image, build the code table by analyzing many, many images.

I have read the Wikipedia article on Huffman coding before. However, the details are not really important in regards to my point.

What I am suggesting is that rather than looking at just the bits in individual images and using them to construct a Huffman table or some other kind of reference, look at the bits on many, many images and create a larger reference table. And then of course you may need a local table for things in the image that don't quite correspond to the larger table.

Earlier compression techniques were much more constrained in terms of processing power, RAM, network connectivity etc. and so distributing and using a large table for compression was not practical. I am suggesting that someone who has knowledge of compression engineer a system where 10MB, 50MB, or 100MB of RAM is used and a large common bits file is transmitted, rather than starting with the idea that almost all of the data or all of the data has to be contained in one file. I am not suggesting that an existing compression algorithm could be translated directly into this general concept. I am suggesting an engineering effort starting with different constraints and trade-offs.


*shrug

Lots of different people on this forum, no offense was intended, and like all nerds I get excited when I get to share knowledge.

>How do you read my question and interpret simply it as "lets send all of the images in full and then give their index and call it compression?"??

Because it struck me as analogous, and yeah I'd call that compression - the message length for one is immensely improved.

Okay, so here's my admittedly piss poor understanding of most compression: you either find more intelligent ways to strip bits from the source in ways the consumer won't mind, or you find more intelligent ways to build reference tables given your problem domain.

I'm sure you know this, but the first one is why jpeg/various mpegs are successful: they have complex quantization models that eliminate gradients we won't notice or frequencies we can't hear.

The way you achieve better results is through building better models for how information in your problem domain is related. If we're compressing text and we know the language we can start referencing letter frequencies and index along that and so on.

The way this works in video to my knowledge is, amongst many other complicated things, they take NxN blocks of images and store only the deltas between Y numbers of frames.

So - perhaps your "image reference blob" could build a reference table for all 16x16 px blocks and transmit only the indices for them, and we're back at my original comment. But to my (again, please correct me) understanding those are kind of the only alternatives? Encoding and decoding is an interesting topic.

I very infrequently have to think in binary and I had perhaps too much fun counting to 10,000 on my fingers; my undergraduate was a long time ago and my knowledge on the topic sparse, so I'm interested in hearing more about it.
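
For what it's worth, here is a bare-bones sketch of that "only the deltas" idea with NumPy; real codecs add motion search, a transform and entropy coding on top of this:

    import numpy as np

    def changed_blocks(prev, curr, n=16):
        # Return only the 16x16 blocks that differ from the previous frame.
        deltas = {}
        h, w = curr.shape
        for y in range(0, h, n):
            for x in range(0, w, n):
                d = curr[y:y+n, x:x+n].astype(int) - prev[y:y+n, x:x+n].astype(int)
                if np.any(d):                  # skip unchanged blocks entirely
                    deltas[(y, x)] = d
        return deltas

    prev = np.zeros((64, 64), dtype=np.uint8)
    curr = prev.copy()
    curr[16:32, 16:32] = 200                   # one block's worth of change
    print(len(changed_blocks(prev, curr)))     # 1 of 16 blocks transmitted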


I thought about doing something like that for audio, but I think it gets really inefficient with higher sample rates because you end up with so little overlap.

Like this: imagine I've got an array of shorts representing audio data. If I've got two files with similar segments, I've saved one short:

[1, 2, 3, 2, 1, 2, 5] [7, 2, 3, 6, 8, 9, 3]

So I can say that [2, 3] is represented by a new value (z), and can shave a short off of both streams. Then what happens if a new stream comes along with no similarities:

[8, 2, 7, 1, 7, 3, 7]

...you still have to send each value.

Maybe I've just demonstrated that I don't know anything about compression, but I would be interested in working with you on this.


I think I would want to find an existing sparse autoencoder implementation, ideally a project already setup for encoding audio, and start from there. http://www.stanford.edu/class/cs294a/sparseAutoencoder.pdf


I don't think this helps. Are you thinking that what, certain 8x8 blocks of pixels appear in photos much more than others? I don't think that's true except for general cases like monochrome blocks or smooth gradients, and you don't need a dictionary to help compress those.


The space of possible JPEGs is vast. You might find that there is less commonality between any given JPEGs than you expect, even considering a corpus of billions, and even when most JPEGs are of similar subjects.


This looks like "vector quantization" with a precalculated table. (I have absolutely no video/compression background, someone correct me please)



This is a Huffman table, right? I'm pretty sure this is how MP3s work.


I read it as just being a suggestion (which is not that uncommon) to use inter-file common characteristics to optimize for the common case, at least within a certain context.

JPEG is designed to compress any image. But imagine a new algorithm, gJPEG, which is only designed to compress photographs of grass. And furthermore, you get 100MB of raw buffer space in the executable to store some precomputed data that would be useful to gJPEG doing its work. It's quite possible you could significantly improve on the general performance by factoring out some data that's common to typical grass photographs, so that data could be stored once-and-for-all in the decoder and then omitted from each of your (presumably) billions of individual grass photographs. On the other hand, it's pretty tricky to make it work, so you might not be able to do such a thing effectively.


Why do you suggest that this would only work for photographs in one narrow domain?


I read the proposal as intending to take advantage of similarities among photographs in a particular domain. Lacking such similarity, you're back to the general photo-compression problem.


Why don't they just contribute the jpgcrush-like C code back to libjpeg-turbo?

Edit: A good reason given in the reply by joshmoz below.


This was discussed with the author of libjpeg-turbo. His priorities are different, and it was agreed that a fork is best.


Listen, you're doing it all wrong. You're supposed to fork it, tell no one, do a year's worth of work in secrecy, release it, palm it off to another FOSS community, then abandon it, then re-fork it, do another 2 years of work in secrecy and then release it again under a new name. Got it?

You'll never get to play alongside big boys like Apple and Facebook with this 'talk to upstream' attitude of yours. That's just not how the game is played.


It's telling that you left off Google.


If their only plans were to add that one feature I'm sure they would have done just that. Clearly they have grander plans than simply integrating existing functionality, and they felt they needed total control of the project to do so.

The beauty of open source is that if the maintainers of libjpeg-turbo want to incorporate it, they can, and I'm sure the Mozilla devs would be more than happy to help.


Data compression and image compression are a great way to improve the overall internet, both in bandwidth and speed. Maybe as important as new protocols like SPDY, js/css minification, and CDN hosting of common libraries.

As long as ISPs/telcos don't go back to the days of AOL-style network-wide compression that crushes everything down to low quality to save bandwidth, I am for this at the service level, like Facebook/Dropbox uploads. I hope this inspires more work in this area. Games also get better with better textures in less space.

Still to this day, I am amazed at the small file sizes Macromedia (now Adobe) was able to obtain with Flash/SWF/ASF; even high-quality PNGs would compress. So yes, we all have lots of bandwidth now, but crunching data down while still representing the same thing is a good thing. With cable company caps and other false bandwidth supply shortages, that focus might resurge a bit.


SWF has a very cleverly designed binary vector graphics format, which naturally lends itself to small filesizes. Much better than SVG or (E)PS, I think.


Can you share more on the "clever" part?


To quote from the format spec, "SWF uses techniques such as bit-packing and structures with optional fields to minimize file size." Many fields are variable-width numbers of bits, using only as many bits as necessary to encode the data. Coordinates are delta-encoded.
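
A toy version of that idea in Python (delta-encode a polyline, then write each value with only as many bits as the widest delta needs, preceded by a small fixed-width header giving the bit count; this mimics the approach, not the actual SWF record layout):

    def pack_deltas(points):
        # Delta-encode consecutive coordinates.
        deltas = [(x1 - x0, y1 - y0)
                  for (x0, y0), (x1, y1) in zip(points, points[1:])]
        flat = [v for d in deltas for v in d]
        nbits = max(v.bit_length() for v in flat) + 1    # +1 for the sign bit
        bits = format(nbits, "05b")                      # 5-bit "how wide" header
        for v in flat:
            bits += format(v & ((1 << nbits) - 1), "0%db" % nbits)  # two's complement
        bits += "0" * (-len(bits) % 8)                   # pad to a whole byte
        return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

    print(pack_deltas([(100, 100), (103, 98), (107, 101)]).hex())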


JPEG-2000 exists, but decoding is still too slow to be useful.

http://en.wikipedia.org/wiki/JPEG_2000


Seen a movie in the theater recently? JP2k lives on in the DCI [1].

[1] http://www.dcimovies.com


Yep. Theater projectors have roughly $1k in specialized decoding chips to make that work.

Software encode/decode of JP2k is the hard part. That's why there's little adoption of J2k outside of hardware solutions.


It's a really weird place to use JP2K -- they're functionally not space limited, or compute limited, so what's the advantage of wavelets? You could just gzip V210 or something.


As mark-r says, it's about bandwidth. The movies are not hand-carried to each theater.

JP2K is still the leader in visual quality per byte. All DCT-based compression systems (jpeg, mpeg, dv, etc) are prone to "mosquito noise" artifacts, which are extremely annoying in moving pictures since the noise moves around and looks like mosquitos flying about. The usual workaround for mosquitos is to blur the picture a bit to make it easier to compress, but that softens the edges of objects onscreen. Wavelet systems like jp2 suffer from different, less annoying, artifacts.

gzip'ing 4:2:2 is a terrible idea :-). Keep in mind that Jp2K does have a lossless mode which you can use if you really don't want to lose any quality.


The spec allows for 4K (4096x2160 at 24 FPS) or 2K (2048x1080 at 24 or 48 FPS) source material and projectors. The spec recognizes that 2K sources may be played on 4K projectors (where it leaves the task of upscaling to the implementer) and 4K sources on 2K projectors.

The advantage of the wavelet format is that they can implement progressive resolution decoding, so a decoder only needs to read half of the data from a 4K source to decode a full quality 2K image.
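
A toy illustration of why that works (one Haar level with NumPy, rather than JPEG 2000's actual wavelets and bitstream): the transform splits the image into a half-resolution approximation band plus detail bands, so a decoder that stops after the approximation band already has a valid lower-resolution picture.

    import numpy as np

    def haar_level(img):
        a = (img[0::2, :] + img[1::2, :]) / 2   # row averages
        d = (img[0::2, :] - img[1::2, :]) / 2   # row details
        ll = (a[:, 0::2] + a[:, 1::2]) / 2      # approximation (LL) band
        lh = (a[:, 0::2] - a[:, 1::2]) / 2
        hl = (d[:, 0::2] + d[:, 1::2]) / 2
        hh = (d[:, 0::2] - d[:, 1::2]) / 2
        return ll, lh, hl, hh

    frame_4k = np.random.rand(2160, 4096)       # stand-in for a 4K frame
    ll, lh, hl, hh = haar_level(frame_4k)
    print(ll.shape)                             # (1080, 2048): a usable "2K" image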


Interesting. I had known about the mosquito noise problem (as cjensen notes (and I was joking about gziping 10-bit 422)), but I didn't realize the progressive decoding was a part of the standard. I spent some time in the guts of the DCP in a previous life, and software decoding of JP2K was always a nightmare. So I'm a bit jaundiced.

But, as always, when you think an engineering decision is insane, you're probably missing the context in which the decision was made.


According to Wikipedia they squeeze 36 bits/pixel down to 4.71 bits/pixel for a 2K 24FPS movie. That's over 7x - gzip would be hard pressed to get up to 2x. I'm sure part of the reason they need compression is to send the data over the wire rather than shipping physical media all the time.

[1] http://en.wikipedia.org/wiki/Digital_Cinema_Initiatives#Imag...


I believe that typically DCP movies are delivered via hard drive or satellite.


One of the benefits is that ingestion is flexible -- In general movies can be distributed via broadband (well in advance of release too, as the files are unplayable without decryption keys). For stuff like film festivals there's usually crates and crates of HDs :-)


This is why bandwidth isn't really a gating concern -- it's a mastered, offline format, so you don't need to squeeze all the possible bits out of the final package.


I wondered what had happened to it, thanks for the link. Now it's over a decade old...


What, it takes 2 ms instead of 1?


More like 500-2000ms - there are a few commercial codecs (Kakadu - as licensed in OS X, Aware) which are well-polished but the open-source situation was wretched for many years, relying on a few libraries which were indifferently maintained and seriously unoptimized. This meant that there were many valid files which could not be opened, key features like tiled decoding weren't supported and everything is so much slower that users will comment on how much longer it takes simply to open or convert a file.

There is some good news in that OpenJPEG (http://www.openjpeg.org/) has been making significant progress in recent years and is now on track to become an official reference implementation:

https://groups.google.com/d/msg/openjpeg/OMc40gUsBIw/UM1ggXk...

Hopefully this will also translate into continued performance work and robustness testing, which would mean potential hope for a browser other than Safari to add native support:

https://bugzilla.mozilla.org/show_bug.cgi?id=36351#c120


Funny story actually - OpenJPEG got a whole bunch of optimisation a few years back because Second Life relies on JPEG2000, after having been generally neglected for some time.


Yes - that was when I started looking at it more seriously. I'm hoping that work continues after the feature support is considered mature enough.


What about WebP? Isn't that intended to be an eventual replacement for JPEG?


From the article: "...replacing JPEG with something better has been a frequent topic of discussion. The major downside to moving away from JPEG is that it would require going through a multi-year period of relatively poor compatibility with the world’s deployed software. We (at Mozilla) don’t doubt that algorithmic improvements will make this worthwhile at some point, possibly soon. Even after a transition begins in earnest though, JPEG will continue to be used widely."

This is Mozilla's roundabout way of saying that they want to put off starting on WebP or JPEG 2000 as long as possible.


It's been 4 years already, and there's little adoption. It might take a decade before (if) WebP is as widely deployed and easily usable as JPEG.

According to Google it's about 30% smaller than JPEG: https://developers.google.com/speed/webp/docs/webp_study

It's a massive effort to switch the entire industry to a new format for a 30% gain. With a drop-in replacement for libjpeg you can get a 10% gain right now.

Also WebP is based on 2006 VP8, which is obsolete now. Chrome ships with VP9 that's further 26% smaller on keyframes.


Not until WebP switches to VP9, which is not currently planned. VP8 only supports 4:2:0 chroma subsampling. For comparison, Photoshop only uses 4:2:0 at JPEG quality 50 and lower.


Only if other browser vendors adopt it, although IIRC WebP has hard-coded maximum file size limits that make it impractical for e.g. retina displays, let alone anything we might see twenty years from now.


Ah, that's interesting. I hadn't heard that before. It is a maximum size of 16383x16383[1], so it's more than practical for retina displays, but I can see the point about files in 20 years (for reference, jpegs can be 4x that in each dimension, 65535×65535).

I haven't heard that brought up as an objection to the format before, though. If it were really a fundamental stumbling block, there are likely ways to adapt the format around it.

[1] https://developers.google.com/speed/webp/faq#what_is_the_max...


In addition to other concerns people have listed, I found WebP to be extremely slow – at or worse than JPEG 2000, with significantly worse compressed size. If we're going to go to the work of deploying a new image format I'd like at least a clear win versus the status quo, if not better features (JP2's tiled decoding would be perfect for responsive images if you could wave a magic wand to get browser support).


.. WebP as a replacement to jpegs, pngs, apngs, gifs, and maybe one day svgs and psds, to the moon baby, with or without Mozilla.

Though I'm sure they have their reasons to stay away from an across-the-board superior format. I just wish I knew what they were. Sigh.


> I just wish I knew what they were. Sigh.

That's not hard: they put out a report months ago concluding that the claimed advantages of WebP were in reality slim to nonexistent, and that it could likely be matched and even outperformed by improved JPEG encoders...which is exactly what they're building now.



I'm actually disappointed. I hoped they would develop a still image format from Daala. Daala has significant improvements such as overlapping blocks, differently sized blocks and a predictor that works not only on luma or chroma, but on both.


One of our listed intern projects is exactly that. It's not a high priority project for the Daala team because we already have good royalty-free image codecs, but not video codecs. We have finite time, so we choose to attack what we see as the more important problem.

Developing a still image format from Daala is possible but not as trivial as it looks (the same can be said for the WebP work from WebM).


Is Daala anywhere close enough to being finished for this to even be feasible?


I like that Mozilla is improving the existing accepted standard, but using modern (mostly patented) codec techniques we could get lossy images to under 1/2 of the current size at the same quality and decode speed. Or at a much higher quality for the same size.

The slow pace of the modern web concerns me. The standards are not moving forward. We still use HTML, CSS, Javascript, JPEG, GIF, and PNG. GIF especially is a format where we could see similarly sized/quality moving images at 1/8th the file size if we supported algorithms similar to those found in modern video.

In all of these cases, they aren't "tried and true" so much as "we've had so many problems with each that we've got a huge suite of half-hacked solutions to pretty much everything you could want to do". We haven't moved forward because we can't. WebP is a good example of a superior format that never stood a chance because front-end web technology is not flexible.


I think the surprising thing is that JPEG is as good as it is. WebP certainly isn't "1/2 of the current size at the same quality and decode speed". It's a modest improvement if anything.

GIF, sure. But, IMO, nobody should really be using GIFs anymore. We've had <video> and APNG for a while if you want animation.


I have heard similar things about GIF (that there are optimisations that most encoding software does not properly take advantage of). But I haven't seen any efforts, or cutting-edge software, that actually follow through on that promise. The closest I've seen is gifsicle, which is a bit disappointing.

What would be great is if there were some way for an animated GIF's frame delays to opt in to being interpreted correctly by the browser. That is, a 0-delay really would display with no delay, so optimisation strategies involving splitting image data across multiple frames could be used; when read by a browser, all frames would be overlaid instantly, modulo loading time.

What other things can be done to further optimise animated gif encoding?


GIF should be killed. Animated PNG is where it's at.


Animated PNGs should be killed. Just embed a video with an alpha channel. It compresses way better.

http://simpl.info/videoalpha/


Wrong use-case. Animated GIFs and animated PNGs were never intended for video. The analogy to still image formats is video:JPEG = web-animation:PNG. You don't save screenshots as JPGs and you don't save photos as PNGs. Same with video and web-animation, two different use-cases require two different formats.


except that's a false dichotomy

one format can support both lossy and lossless images and use better compression between frames to achieve 10x better results for animations than animated pngs without visible loss of quality


Which format can support both lossy and lossless images?

I spent 2 seconds in GIMP to make the following GIF: http://i.cubeupload.com/kbrcfQ.gif Please show me the video format that can replace it.


webp supports animations, lossless and lossy images, etc. I already linked to a video with an alpha channel that can match your gif


When you can only post/upload content that ends up inside an img tag, it's not a false dichotomy.


meanwhile, back in the real world, gifs and apngs are staying exactly where they are.


It's not clear from the article: in their "comparison of 1500 JPEG images from Wikipedia", did they just run the entropy coding portion again or did they requantize? (I suspect they did just the entropy coding portion, but it's hard to tell.)

Getting better encoding by changing the quantization method can't be judged purely as a function of file size; traditionally, PSNR measurements as well as visual quality come into play.

Good to see some work in the area, I will need to check out what is new and novel.

That said, a company I worked for many moons ago came up with a method whereby, through reorganization of coefficients post-quantization, you could easily get about a 20% improvement in encoding efficiency, but the result was not JPEG compatible.

There is a lot that can be played with.


If only JPEG supported transparency.


You can do a JS hack with JPEG for RGB and PNG for alpha, but it's not ideal.


When I optimize JPG or PNG I usually use ScriptJPG and ScriptPNG from http://css-ig.net/tools/

They are shell scripts that run many different optimizers.


"support for progressive JPEGs is not universal" https://en.wikipedia.org/wiki/JPEG#JPEG_compression

e.g. the hardware decoder in the Raspberry Pi http://forum.stmlabs.com/showthread.php?tid=12102


Hey everyone, after some testing we have just deployed mozjpeg to our web interface at: https://kraken.io/web-interface

You can test it out by selecting the "lossless" option and uploading a jpeg. Enjoy!


So... version 1.0 is basically a shell script that calls libjpeg-turbo followed by jpgcrush?


No, the jpegcrush functionality is implemented in C as an extension to libjpeg-turbo. But yes, I suppose a shell script would achieve roughly the same result.


So why wasn't this upstreamed to libjpeg-turbo? Why fork at all? libjpeg-turbo is still an actively maintained project after all...


Maintainer has different priorities, see https://news.ycombinator.com/item?id=7349117


They want to do much more, and with a fork you don't have to justify and discuss every single commit with the original maintainer. If you want to heavily modify something, a fork is the right approach.


I believe this merge shows the C code that implements the crush functionality: https://github.com/mozilla/mozjpeg/commit/c31dea21188b48498d...


No shell script or second step involved. Functionality is built in and on by default.

This is just a start, we wanted to have something people could use on day one. Further developments to come.


Any chance of incorporating other psy improvements, instead of just targeting SSIM?


At first glance this seems wasteful. I do not think anyone has a problem using JPEG. However, in many cases, before the invention of a new thing, nobody had a problem using the old tools either!


Has somebody translated the jpeg library to JavaScript? Besides encoding and decoding jpeg, it has some useful modules that would be nice to have in the web browser.


A bit too soon to start announcing the project. But I like the initiative and hope the project manages to improve stuff.


Mozilla tends to announce projects when they begin so people can contribute to them right away. I think that's a good thing and that's one of the reasons I work for Mozilla.

By the way, I think this is an awesome effort by Josh and the others working on this. I work mainly in security and networking, and I'm excited about anything we can do to make end-to-end communication more efficient because these kinds of improvements help enable more use of HTTPS and help reduce (hopefully eliminate) the need for content transforming proxies.


We would like to develop in the open, and hopefully with community participation.


BTW, please fix your lossy test methodology for when you do your tests on a future version of this.

https://blog.mozilla.org/research/2013/10/17/studying-lossy-... was rather flawed and you never acknowledged this.

(see https://news.ycombinator.com/item?id=6581827 - only one of the four sets of test results might have been valid)


What license are they doing this under? Hopefully they're aiming to upstream this to libjpeg.


This is so dumb: there are a million JPEG crushers in existence, but instead of advocating the use of one of these, Mozilla writes their own? Why not support WebP rather than dismissing it due to compatibility and wasting time doing what has been done before?



