If QOI is interesting because of speed, you might take a look at fpng, a recent, actively developed PNG reader/writer that achieves comparable speed and compression to QOI while staying PNG-compliant.
I find it interesting that QOI avoids any kind of Huffman style coding.
Huffman encoding lets you store frequently used values in fewer bits than rarely occurring values, but the cost of a naïve implementation is a branch on every encoded bit. You can mitigate this by making a state machine keyed by the accumulated prefix bits plus as many bits as you want to process in a whack, but these tables will blow out your L1 data cache and trash a lot of your L2 cache as well.¹
The "opcode" strategy in QOI is going to give you branches, but they appear nearly perfectly predictable for common image types², so that helps. It has a table of recent colors, but that is only of a few cache lines.
In all, it seems a better fit for the deep pipelines and wildly varying access speeds across cache and memory layers which we find today.
␄
¹ I don't think it ever made it into a paper, but in the mid-80s, when the best our VAX Ethernet adapters could do was ~3 Mbps, I was getting about 10 Mbps of decompressed 12-bit monochrome imagery out of a ~1.3 MIPS computer using this technique.
² I also wouldn't be surprised if this statement is false. It just seems that for continuous tone images one of RGBA, DIFF, or LUMA is going to win for any given region of a scan line.
(I write comments with footnotes in the same style as you, but use “—⁂—” as the separator, via Compose+h+r (name from the HTML tag horizontal rule). Good fun being able to use Compose+E+O+T, Compose+E+T+X and Compose+F+F in this comment; I added the full set to my .XCompose years ago.)
One thing to note is that QOI composes really nicely with fast general-purpose compressors like LZ4 and ZSTD. LZ4 gives a roughly 5% size reduction with negligible speed impact, and ZSTD gives a 20% size reduction with moderate speed impact (https://github.com/nigeltao/qoi2-bikeshed/issues/25).
I would think that you could use a hybrid approach where you have a table that is perhaps 9 or 10 bits wide and covers the shorter codes, which by definition are the more common ones. It should be small enough to fit in the cache. Then do something slower for the very long codes. This way you avoid difficult branches most of the time.
Exactly. Normally you use second-level tables for the long codes. In the first table, if the code doesn't reach a leaf, tbl[code] holds a pointer to the next table to use.
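A minimal sketch of that two-level layout in C (the names, the 9-bit root size, and the bit-reader helpers are illustrative assumptions, not taken from any particular decoder; table construction, which replicates short codes across all matching root slots, is not shown):

    #include <stddef.h>
    #include <stdint.h>

    #define ROOT_BITS 9

    typedef struct {
        const uint8_t *data;   /* assumed to have a few padding bytes at the end */
        size_t pos_bits;
    } BitReader;

    /* LSB-first peek of up to 16 bits without consuming them */
    static unsigned peek_bits(const BitReader *br, unsigned n)
    {
        size_t byte = br->pos_bits >> 3, shift = br->pos_bits & 7;
        uint32_t w = br->data[byte] | ((uint32_t)br->data[byte + 1] << 8)
                   | ((uint32_t)br->data[byte + 2] << 16);
        return (w >> shift) & ((1u << n) - 1);
    }

    static void consume_bits(BitReader *br, unsigned n) { br->pos_bits += n; }

    typedef struct {
        uint16_t symbol_or_base;  /* decoded symbol, or base index into sub[] */
        uint8_t  length;          /* total code length in bits; 0 = long code */
        uint8_t  sub_bits;        /* extra index bits when length == 0 */
    } Entry;

    typedef struct {
        Entry root[1 << ROOT_BITS];  /* 2 KB: comfortably inside L1 */
        Entry *sub;                  /* concatenated second-level tables */
    } HuffTable;

    static unsigned decode_symbol(const HuffTable *t, BitReader *br)
    {
        Entry e = t->root[peek_bits(br, ROOT_BITS)];
        if (e.length) {                         /* common case: short code */
            consume_bits(br, e.length);
            return e.symbol_or_base;
        }
        consume_bits(br, ROOT_BITS);            /* rare case: long code */
        Entry e2 = t->sub[e.symbol_or_base + peek_bits(br, e.sub_bits)];
        consume_bits(br, e2.length - ROOT_BITS);
        return e2.symbol_or_base;
    }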
Funnily enough, it's just a few days since I wrote some similar code to support writing PNGs from a small embedded device. In this case the full deflate algorithm seemed like overkill in memory and CPU requirements, and most of the images were probably going to be served over a LAN anyway.
https://twitter.com/toitpkg/status/1471986776357097475
Since we are rehashing this for the 3rd (4th?) time, I'll repeat my (and apparently many others') key critique: there is no thought at all given to enabling parallel decoding, be it thread-parallel or SIMD (or both). That makes it very much a past-millennium-style format that will age very poorly.
At the very least, break it into chunks and add an offset directory header. I'm sure one could do something much better, but it's a start.
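Something as simple as the following header shape would already let a decoder fan chunks out to threads (entirely illustrative; nothing like this is in the actual QOI spec):

    #include <stdint.h>

    /* Hypothetical chunked wrapper around a QOI-like payload: each chunk
       covers a fixed number of scanlines and the decoder state (previous
       pixel, color index) resets at each chunk start, so chunks can be
       decoded independently. */
    typedef struct {
        uint32_t width, height;
        uint32_t rows_per_chunk;
        uint32_t chunk_count;
        /* followed by chunk_count 64-bit byte offsets, then the chunks */
    } ChunkDirectoryHeader;

Resetting the per-chunk state costs a little compression at each boundary, which is the usual trade-off for restart markers.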
This project is interesting because of how well it does compared to other systems of much higher complexity and without optimizing the implementation to high heaven. We can all learn something from that.
Good question. The answer is all the poor souls that N years later find themselves stuck with data in a legacy format that they have to struggle to decode faster.
Of all the artifacts in our industry, few things live longer than formats. E.g. we are still unpacking tar files (Tape ARchive), transmitted over IPv4, decoded by machines running x86 processors (and others, sure). None of these formats could possibly have anticipated the evolution that followed, nor predicted the explosive popularity they would have. And all of them (the latter two notably) have overheads with real material costs. IPv6 fixed all the misaligned fields, but IPv4 is still dominant. Ironically, RISC-V didn't learn from x86 and added variable-length instructions, making decoding harder to scale than necessary.
I'm not sure what positive lessons you think we should learn from QOI. It's not hard to come up with simple formats. It's much harder coming up with a format that learns from past failures and avoids future pitfalls.
QOI is designed with a very specific purpose in mind, which is fast decoding for games. This kind of image is very unlikely to be large enough to benefit from multithreading, and if you have a lot of them you can simply decode them in parallel. It's not meant to be the "best" image format.
Unrelated to the rest of your comment, but RISC-V does not have variable-length instructions. It has compressed instructions, but they're designed in such a way as to be easily and efficiently integrated into the decoder for normal instructions, which are all 32 bits.
My day job for 6+ years has been implementing high-perf RISC-V cores, and my name is in many of the RISC-V specs.
Variable length ISAs are characterized by not being able to tell the beginning of an instruction without knowing the entrypoint. This applies to RISC-V with compressed instructions. Finding the boundaries is akin to a prefix scan and has a cost roughly linear in the scan length, but IMO the biggest loss is that you can’t begin predecode at I$ fill time.
I fought against making the _current_ way to do compressed instructions a mandated part of the Unix profile, but RISC-V was (at least at the time) dominated by microcontroller people and there was a lack of appreciation of the damage it incurred. A lot of people far more senior than me couldn't believe what happened.
Interesting to contrast with Arm, which upon defining AArch64 did _away_ with variable-length instructions and thus also page-crossing ones. Maybe they knew something.
> IMO the biggest loss is that you can’t begin predecode at I$ fill time.
That helps enough to overcome the increased code size?
I really wouldn't say they learned nothing from x86, though. You only have to look at 2 bits to know an instruction's length, and if you can get your users to put in the slightest effort, compilers can be told not to use the C (compressed) extension.
That's a strawman. There are infinitely many ways to achieve the same or better density without the drawback. Allowing instructions to span cache lines, or even pages, is a mistake that we'll pay for forever.
The simplest possible mitigation would have been to disallow an instruction from spanning a 64-byte boundary. It would have almost no impact on instruction density, but it would have saved a lot of headaches for implementations.
Strawman? I wasn't even trying to characterize anyone else's point, I was just trying to list some significant improvements over x86.
> The simplest possible mitigation would have been to disallow an instruction from spanning a 64-byte boundary.
Sure, that sounds good. But before this you hadn't even mentioned any problems with split instructions that need to be mitigated.
(You did mention decoding without a known entry point, but a rule like that doesn't guarantee you can find the start of an instruction. And if it would help to know that a block of 64 bytes probably starts with an aligned instruction, that seems like something you could work out with compiler writers even without a spec.)
I did forget to mention the requirement that you can't branch into the middle of an instruction. If you have both of these constraints then you can unambiguously determine the location of all instructions in any aligned 64-byte block, including at I$ fill time.
Implementing this would require instruction fetch to take an exception on line-crossing instructions (which must be illegal) and a change to the assembler to insert a 16-bit padding nop or expand a compressed instruction to maintain the alignment. There is nothing needed from the compiler (or linker AFAICT). JITs will have to be aware though.
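As a toy illustration of what the assembler-side change amounts to (hypothetical; not any real toolchain's behaviour):

    #include <stdint.h>

    /* Before emitting a 4-byte instruction that would straddle a 64-byte
       boundary, insert a 2-byte c.nop (or widen an earlier compressed
       instruction) so no instruction crosses the line. */
    static uint64_t emit_insn(uint64_t pc, unsigned insn_bytes /* 2 or 4 */)
    {
        if ((pc % 64) + insn_bytes > 64)
            pc += 2;                  /* padding c.nop goes here */
        /* ... write the instruction bytes at pc ... */
        return pc + insn_bytes;
    }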
You'd also need to guarantee that there are no constants or other non-instruction data in the same cache line as instructions. If that's a reasonable constraint then sure, that sounds like it would be helpful.
Those poor souls N years later will either have to decode only a few images, which is still fast enough, or decode a lot of images, which can be parallelized and run concurrently at the per-image level. In the very worst case, a single extremely large image, you're a bit out of luck, but that case would be rare, and decoding is still pretty fast anyway.
Creating formats and specs that are "future proof" is a noble goal. Criticizing QOI for not being parallelizable inside the decode function seems more like a demand for premature optimization to me...
> Criticizing QOI for not being parallelizable inside the decode function seems more like a demand for premature optimization to me
What? Faster encoding and decoding is one of the primary reasons for the format. Yet, QOI decoders are currently an order of magnitude slower than SSDs available today and even worse compared to DRAM! Now seems like the perfect time to look at possible optimizations to close that gap.
QOI is not an interchange file format like PNG or JPG, it's more akin to DDS or KTX (e.g. a specialized game asset pipeline file format which doesn't require a complex dependency for decoding).
A surprisingly good image format is to use a per-line (or per-block) AR encoder, then compress the result with gzip on a low setting. It parallelizes very nicely, beats PNG on encode and decode speed, and is trivial to implement.
I'm really not a fan of using the generally unsupported aspects of file formats. Sure, PNG supports alternate compression schemes, which, in combination with the fully generic data-block system, I think technically (and accidentally) makes it Turing complete. But it's not like any reader ever supports it fully; hell, most readers don't support showing the other images in a PNG file. It's also quite common that palettes aren't properly supported, in particular for uncommon combinations. Palettes themselves are also not truly sufficient for what they are supposed to do, as that would require generic scalar-to-scalar functions. It's like this with every damned old image format. People try to make the format to end all formats and end up accidentally inventing really shitty programming languages with terrible separation of concerns, which are terribly supported but work because people only ever use them for single-image data.
Another, stranger mistake is the use of generic compression as part of the format; it's much better to leave that up to the filesystem or stream. I don't really understand why this was ever a good idea, but it certainly hasn't been one for decades.
The more recent blob formats people have developed aren't much better: they still confuse the specification of a single blob with the specification of the container format, and try to be convenient and fully generic at once. If people actually wanted a fully generic data format and accepted that this requires a programming language, just let that be Python, instead of inventing some shitty new one...
No, not animated.
I mean that the .png format specifies how to specify an arbitrary number of data blocks, without specifying how they should be used, aside from what amounts to a recommendation of which one should be shown.
It's used by some video games for assets, for instance storing what amounts to a thumbnail as the shown image, while leaving meshes and textures in the other internal data blocks.
It's autoregression. Think of a row of pixels as a signal y[n]. Assume y[n] = a*y[n-1] + b*y[n-2], and estimate a and b. Then store y[0], y[1], a, b, and the residual, and encode those four values plus the residual, which will mostly be zeros. You can vary the length and number of coefficients very cheaply, but a fixed two coefficients beat PNG in my tests. There are fast standard algorithms to find a and b which, along with some simple tricks to correct for precision, ensure lossless encoding. It's the mathematician's version of the ad hoc thing PNG tries to do.
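A rough sketch of the prediction/residual part, assuming 8-bit samples and coefficients already estimated (e.g. from the 2x2 normal equations) and quantized to 1.15 fixed point; the coefficient estimation and the gzip stage are left out. The only hard requirement for losslessness is that encoder and decoder compute the identical integer prediction:

    #include <stdint.h>

    /* Round-to-nearest fixed-point AR(2) prediction, clamped to the sample range.
       Encoder and decoder must both use exactly this function. */
    static int predict(int32_t a_q15, int32_t b_q15, int y1, int y2)
    {
        int32_t p = (a_q15 * y1 + b_q15 * y2 + (1 << 14)) >> 15;
        if (p < 0)   p = 0;
        if (p > 255) p = 255;
        return p;
    }

    /* residual[i] = y[i] - prediction; stored alongside y[0], y[1], a, b */
    void ar2_encode_line(const uint8_t *y, int n,
                         int32_t a_q15, int32_t b_q15, int16_t *residual)
    {
        for (int i = 2; i < n; i++)
            residual[i] = (int16_t)(y[i] - predict(a_q15, b_q15, y[i-1], y[i-2]));
    }

    void ar2_decode_line(uint8_t *y, int n,
                         int32_t a_q15, int32_t b_q15, const int16_t *residual)
    {
        for (int i = 2; i < n; i++)
            y[i] = (uint8_t)(predict(a_q15, b_q15, y[i-1], y[i-2]) + residual[i]);
    }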
Isn't that very similar to what PNG does? I'm not sure what you mean by "AR encoder", but PNG uses a per-line filter and then adds DEFLATE on top of that.
A thread can scan just the opcodes to find cut-off points and distribute the actual decoding to other cores. Surely you can do that with some SIMD magic, as well as in the decoding threads, without needing to bake the properties of today's SIMD into the encoding.
No it can't. The encoder doesn't insert any "cut-off points". In fact, nearly every chunk encodes the current pixel value in terms of one of the previous pixel values, so without knowing those it is impossible for a second core to start up and initialize its decoder state enough to produce correct output.
A top-level thread scans just the opcodes to solve this, with no decoding and no writing, and thus progresses faster through the stream than the child chunked-decoding threads it progressively spawns.
Not as quick as a format with a chunk table, but faster than naive single core.
I did read your comment. You don’t have any explanation of how the top level scan can maintain the color index array for QOI_OP_INDEX chunks without doing any decoding.
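For reference, this is roughly what the decoder state looks like (paraphrased from the QOI spec and trimmed to two ops for brevity): every emitted pixel refreshes the 64-entry index, and QOI_OP_INDEX reads from it, so a scan-only pass that skips decoding can't know what an index reference resolves to.

    #include <stddef.h>
    #include <stdint.h>

    typedef struct { uint8_t r, g, b, a; } Px;

    #define QOI_HASH(p) (((p).r * 3 + (p).g * 5 + (p).b * 7 + (p).a * 11) % 64)

    /* Only QOI_OP_RGB and QOI_OP_INDEX handled here; the point is the
       shared state (index[], px) that every chunk depends on. */
    static void decode_fragment(const uint8_t *in, size_t len, Px *out)
    {
        Px index[64] = {0};
        Px px = { 0, 0, 0, 255 };
        size_t p = 0, o = 0;
        while (p < len) {
            uint8_t b = in[p++];
            if (b == 0xfe) {                     /* QOI_OP_RGB */
                px.r = in[p++]; px.g = in[p++]; px.b = in[p++];
            } else if ((b & 0xc0) == 0x00) {     /* QOI_OP_INDEX */
                px = index[b & 0x3f];            /* needs the decoded history */
            }
            index[QOI_HASH(px)] = px;            /* every pixel refreshes it */
            out[o++] = px;
        }
    }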
I bet you can split big images into smaller QOI encoded chunks and decode those in parallel.
QOI is simple enough to remix the file format as much as you want in your own encoder/decoder (that's actually the USP), it's not meant as a standardized image exchange format, just something that lives in your own asset pipeline.
It's because this link (https://github.com/phoboslab/qoi) has been approved by mods for reposting: it appears on the "pool" list [1] [2]. Which is a bit odd because as you point out, a different link [3] for the same project already received lots of attention.
I wanted a simple format that allows you to load/save images quickly, without dealing with the complexity of JPEG or PNG. Even BMP, TIFF and other "legacy" formats are way more complicated to handle when you start looking into it. So that's what QOI aims to replace.
There's a lot of research for a successor format ongoing. Block-based encoding, conversion to YUV, more op types etc. have all shown improved numbers. Better support for metadata, different bit depths, and allowing restarts (for multithreading) are also high on the list of things to implement.
But QOI will stay as it is. It's the lowest of all hanging fruits that's not rotten on the ground.
> I wanted a simple format that allows you to load/save images quickly, without dealing with the complexity of JPEG or PNG. Even BMP, TIFF and other "legacy" formats are way more complicated to handle when you start looking into it. So that's what QOI aims to replace.
It's simply better suited for some types of images than others (e.g. the resulting size is sometimes bigger than expected). The main advantage is the very simple encoder and decoder with a specification that fits on a single page (and which still yields surprisingly good results for many image types):
I agree that it is quite easy to grasp the format in terms of implementation.
It seems basically like writing an image VM that accepts byte code. I think that could really be a way to specify many file formats more concisely. If, for example, you chose the correct automata/transducer class, one could easily specify some hedge-grammar-based XML file format and get a binary representation. Starting from grammars as a spec, it is typically more difficult to derive an implementation.
However, from reading the concrete spec, I wonder why you cannot differentially change the alpha channel, which leads me to the question of what happens when images have varying alpha levels.
"Everything" means two 32-bit integer values (width and height) in the header, that's hardly much of a downside ;)
Usually it's a good idea anyway to read file headers byte by byte instead of mapping a struct over it to avoid alignment, padding and endianness issues.
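For QOI that's particularly painless, since the header is just 14 bytes with big-endian fields. A minimal sketch (field layout per the spec: magic "qoif", width, height, channels, colorspace):

    #include <stdint.h>

    /* Read a big-endian u32 one byte at a time: no struct packing,
       no unaligned access, works on any host endianness. */
    static uint32_t read_u32_be(const uint8_t *p)
    {
        return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
             | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
    }

    typedef struct {
        uint32_t width, height;
        uint8_t  channels, colorspace;
    } QoiHeader;

    static int parse_qoi_header(const uint8_t *buf /* >= 14 bytes */, QoiHeader *h)
    {
        if (buf[0] != 'q' || buf[1] != 'o' || buf[2] != 'i' || buf[3] != 'f')
            return -1;
        h->width      = read_u32_be(buf + 4);
        h->height     = read_u32_be(buf + 8);
        h->channels   = buf[12];
        h->colorspace = buf[13];
        return 0;
    }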
I just implemented this format in my game engine and the performance is crazy: image loading is 3.2 times faster (compared to PNG) and generating a game screenshot is 40 times faster!
The generated screenshots are smaller (by about 5%). However, the resource images in QOI format that I load are on average a little bigger (about 5%, and sometimes up to 35%). I guess it is not the perfect solution for AAA games, which already use more than 30 GB nowadays.
"AAA games" (or rather any 3D games) typically use lossy image file formats which directly map to the hardware-compressed GPU texture formats (like DXTx/BCx in DDS or KTX containers), QOI is an interesting alternative where lossy formats don't work well though (e.g. pixel art).
Is there any open source audio compression format like that? Lossless and very fast. I haven't found any yet.
EDIT: I'm thinking about a format that would be suitable as a replacement for uncompressed WAV files in DAWs. Rendered tracks often have large sections of silence and uncompressed WAVs have always seemed wasteful to me.
I'd also like to know what's the best (or any) lossless audio compression process/tools.
My application is to send audio (podcast recordings) to a remote audio engineer friend who will do the post processing, then round trip it to me to complete the editing.
WAV is so big it makes a 1 hr podcast a difficult proposition.
MP3 is unsuitable because compression introduces so many artefacts that the quality suffers unacceptably.
FLAC is limited to 24-bit depth. I was thinking of an intermediate format suitable for use in DAWs and samplers that also supports floating point to avoid clipping.
24-bit integer and 32-bit float have the same dynamic range available, so you are not losing any fidelity.
However, frankly, if you're working professionally with audio like that, the best solution is simply to have sufficient disk space available to work with raw audio.
Use FLAC to compress the final product, when you are done.
They have the same precision but float has vastly larger dynamic range due to the 8-bit exponent. When normalized and quantized for output this does result in roughly the same effective dynamic range (depending on how much of the integer range was originally used).
The issue is audio is typically mixed close to maximum so any processing steps can easily lead to clipping. One solution is to use float or larger integers internally during each processing step and normalize/convert back to 24-bit integer to write to disk. Another (better imo) option would be to do all intermediate steps and disk saves in a floating point format and only normalize/quantize for output once.
I haven't worked with professional audio in over 25 years (before everything went fully digital) but I would be surprised if floating point formats were not an option for encoding and intermediate workflows. Many quantization steps seems like a bad idea.
> I would be surprised if floating point formats were not an option for encoding and intermediate workflows.
For bouncing tracks to disk, uncompressed 32-bit floating-point formats are available, but I am not aware of any fast, losslessly compressed 32-bit floating-point format.
All professional audio production software these days internally works with 32/64 bit floats. That's the native format, because it allows you to go above 0 dBFS (maximum level), as long as you go back below it at the end of the chain.
EDIT: Floating point is useful while you are working to avoid any accidental clipping. As an intermediate format, like a ProRes for video. FLAC is great as a final format.
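In other words, the clipping only has to happen once, at the final conversion down to integer samples. A tiny illustration (purely illustrative scaling, not any DAW's actual pipeline):

    #include <stdint.h>

    /* Float samples on the mix bus may exceed +/-1.0 without losing
       information; the hard clip only happens when quantizing to the
       fixed 24-bit output range at the very end of the chain. */
    static int32_t float_to_i24(float s)
    {
        if (s >  1.0f) s =  1.0f;   /* this is where information is lost */
        if (s < -1.0f) s = -1.0f;
        return (int32_t)(s * 8388607.0f);   /* 2^23 - 1 */
    }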
Projects with 100+ tracks are not uncommon. Sampler/rompler of a single virtual instrument can play 10+ sounds simultaneously. Playback of an orchestral score with virtual instruments can easily go over 250 simultaneous sounds, so just a real-time playback (without any additional processing) would already be a challenge.
No, actually only first few dozen KB of each sample are usually preloaded into RAM. The rest is streamed from a SSD. One library of an orchestral section can have 100+ GB of samples. You wouldn't fit all sections in 128GB of RAM.
ZStandard is a very good compressor, with an especially fast decompressor. Maybe someone should try using this instead of zlib in an audio format (FLAC, WavPack, ...)
You can only seek within a gzip file if you write it with some number of Z_FULL_FLUSH points which are resumable. The command line gzip program does not support this, but it's easy using zlib. For example you might do a Z_FULL_FLUSH roughly every 50 MB of compressed data. Then you can seek to any byte in the file, search forward or backward for the flush marker, and decompress forward from there as much as you want. If your data is sorted or is a time series, it's easy to implement binary search this way. And standard gunzip etc will still be able to read the whole file as normal.
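A rough sketch of the writing side with zlib (error handling stripped; this produces a raw zlib stream, so swap deflateInit for deflateInit2 with windowBits 15+16 if you want an actual gzip wrapper; the 50 MB slice here is counted over input bytes just to keep the example simple):

    #include <stdio.h>
    #include <string.h>
    #include <zlib.h>

    #define SLICE (50u * 1024 * 1024)

    int compress_with_restarts(const unsigned char *in, size_t len, FILE *out)
    {
        z_stream zs;
        unsigned char buf[1 << 16];
        memset(&zs, 0, sizeof zs);
        if (deflateInit(&zs, Z_DEFAULT_COMPRESSION) != Z_OK)
            return -1;
        for (size_t done = 0; done < len; ) {
            size_t n = len - done < SLICE ? len - done : SLICE;
            zs.next_in  = (unsigned char *)(in + done);
            zs.avail_in = (uInt)n;
            /* Z_FULL_FLUSH emits a byte-aligned restart point; Z_FINISH ends it */
            int flush = (done + n == len) ? Z_FINISH : Z_FULL_FLUSH;
            do {
                zs.next_out  = buf;
                zs.avail_out = sizeof buf;
                deflate(&zs, flush);
                fwrite(buf, 1, sizeof buf - zs.avail_out, out);
            } while (zs.avail_out == 0);
            done += n;
        }
        deflateEnd(&zs);
        return 0;
    }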
(but seriously, MODs can encode hours of audio into kilobytes, the downside is of course that they require a special authoring process which seems to be a bit of a lost art today)
It is nice, but a pity it does not have a "turn right" opcode: start going left, and on the turn opcode, continue decoding pixels after turning 90 degrees to the right, until you hit a previously decoded pixel or the wall of the bounding box defined after the first two turns, in which case you turn automatically. The file ends when there's nowhere to turn.
This would eliminate the need for a header (bloat!) as the end of file is clearly defined, the size is defined after decoding the top and right line (second turn), and it's not so sensitive to orientation (a pathological image can compress very differently in portrait vs landscape in line oriented formats). Color profile can be specified in the spec.
Also allows skipping altogether some image-wide bands or columns that are of the background color (defined by the first pixel) as you do not need to walk over all the pixels.
An encoder just walking a regular spiral (no uniform bands detection) is not hard. The band thing is an accidental artefact of the idea but plain run length encoding probably already captures most of the effect so no imperative to actually implement it.
Speed, yes, it is a fair objection, until hardware adopts spiral encoding :-)
Seems like they benchmarked it against libpng, which shows anywhere from 3-5x faster decompression and 30-50x faster compression. That's pretty impressive, and even though libpng isn't the most performant of the PNG libraries, it's by far the most common.
I think the rust png library is ~4x faster than libpng which could erase the decompression advantage but that 50x faster compression speed is extremely impressive.
Can anybody tell if there are any significant feature differences that might explain the gap (color space, pixel formats, etc.)?
I think fundamentally it’s faster just because it’s dead simple. It’s just a mash of RLE, dictionary encoding, and delta encoding, and it does it all in a single pass. PNG has to break things into chunks, apply a filter, deflate, etc.
Filters are a form of delta encoding, and are optional for PNG encoders. Deflate is a form of dictionary encoding with RLE. There's no "breaking into chunks" in PNG; it can encode the entire image as a single IDAT chunk (and chunks themselves are so trivial they have no impact on speed).
You can choose not to do filtering when encoding PNG. Fast deflate settings are literally RLE-only, and you can see elsewhere in this thread people have developed specialized encoders that ignore most deflate features.
The only misfeature PNG has that slows down encoding is CRC. Decoders don't have to check the CRC, but encoders need to put one in to be spec-compliant.
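To make the "filters are delta encoding" point concrete, this is essentially the whole of PNG's Sub filter (the other filter types differ only in which neighbouring bytes they predict from):

    #include <stddef.h>
    #include <stdint.h>

    /* PNG Sub filter: each byte stores the difference from the
       corresponding byte of the pixel to its left, modulo 256.
       bpp = bytes per pixel; the first pixel has no left neighbour. */
    static void png_filter_sub(const uint8_t *raw, uint8_t *out,
                               size_t len, size_t bpp)
    {
        for (size_t x = 0; x < len; x++) {
            uint8_t left = (x >= bpp) ? raw[x - bpp] : 0;
            out[x] = (uint8_t)(raw[x] - left);
        }
    }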
I would love to have something to compress the raw files from my camera. They're huge, I have to keep a ton of them, and I also need to transmit them over internet for my backup.
I tried a few standard compression formats, with very little luck.
Canon has devised a very smart (slightly lossy) compression format for newer cameras, but there's no converter that I know of for my old camera files.
So, unless I shell out large amounts of money for a new camera, I'm stuck sending twice the data over the internet. Talk about pollution...
There is the option of converting to DNG files, which allow for really good lossless compression.
This does come at the cost of changing the file format, and risks losing metadata. That's why I personally decided to just buy more storage instead.
Come to think of it, have you tried running a modern compression algorithm on the data? I don't think I did. Could be cool if combined with ZFS or similar to get the compression done transparently.
At a previous job I was looking at different binary parsing methods. This project looks quite interesting, having binary format descriptions in YAML that can then be generated into your language of choice.
Not sure what you're expecting given how new it is. Why not write a polyfill as an exercise for yourself? Convert it to png, then save as an image tag to a data url.
It's always gonna be chicken-and-egg for this, and browsers won't spend the time sandboxing and supporting a codec until it's already popular.
So this will probably see a JS / Webasm shim, and if that proves popular, Blink and Gecko will consider it.
The day might come soon when browsers just greenlight a webasm interface for codecs. "We'll put packets in through this function, and take frames out through this function, like ffmpeg. Other than that, you're running in a sandbox with X MB of RAM, Y seconds of CPU per frame, and no I/O. Anything you can accomplish within that, with user opt-in, is valid."
https://github.com/richgel999/fpng
Disclaimer: have not actively tried either.