I also really wanted to use libjxl, but abandoned it eventually. The encoder was horrible.
Also, instead of OpenEXR we store our multispectral images as TIFF with lossless lzw:2 compression. A lot of work with TIFF, but in the end much more flexible.
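For illustration (not necessarily their exact pipeline), here's roughly what writing a band stack as a multi-page TIFF with LZW plus the differencing predictor looks like using the Python tifffile package; the array shape, dtype, and filename are made up:

```python
# Minimal sketch, assuming the tifffile package: write a multispectral
# stack as one multi-page TIFF with lossless LZW plus the horizontal
# differencing predictor (the "lzw:2" combination). Data is invented.
import numpy as np
import tifffile

# Fake cube: 31 spectral bands of 512x512 uint16 samples.
bands = np.random.randint(0, 65535, (31, 512, 512), dtype=np.uint16)

tifffile.imwrite(
    "multispectral.tif",
    bands,                 # one page (sub-image) per spectral band
    compression="lzw",
    predictor=True,        # TIFF predictor tag: diff samples before LZW
)
```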
It seemed like most of their issues were due to the lossy compression (i.e., wanting different parameters per sub-image). If they had opted for lossless JPEG XL, wouldn't things have "just worked"?
Unrelated, but I'd also be curious to know how their initial data transformation differs from that of your TIFF scheme.
JXL compression could compute a base for all sub-images and create better diffs against it. The current encoder doesn't support that yet.
With my TIFF lzw:2 it only takes the diffs between consecutive lines within each sub-image, which is a shame. But that still gets 50% compression. With JXL and a sub-image base it could compress down to 10% of the original size or better.
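To make the "sub-image base" idea concrete, here's a toy numpy sketch (invented data and shapes) of decorrelating bands against a base band before compression, which is the kind of inter-band diffing the current encoder doesn't expose:

```python
# Toy sketch of inter-band decorrelation: store one base band plus the
# diffs of every other band against it. Data is invented; a real codec
# would do this (and better) internally.
import numpy as np

cube = np.random.randint(0, 65535, (31, 512, 512)).astype(np.int32)

base = cube[0]
diffs = cube[1:] - base          # residuals stay small for correlated bands

# Lossless reconstruction from base + diffs:
restored = np.concatenate([base[None], diffs + base])
assert np.array_equal(restored, cube)
```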
Does JPEG XL's perceptual basis pose any issue, given that the data being encoded here isn't what the original algorithm was intended for? I see that the approach iteratively checks RMSE, but I'm wondering about spatial artifacts in a given layer.
JPEG XL's XYB color space is perceptual and based on LMS, but you don't have to use it, and you can store 16-bit floats directly. The paper notes that the libjxl library interface lacks some necessary features:
"In principle, JPEG XL supports having one main image and up to 255 sub-images, which sounds like a good match for c0 and f1, . . . , fn−1. Unfortunately, the current implementation in libjxl does not allow us to tweak the compression ratio and subsampling on a per-sub-image basis. Due to these limitations, we currently use one JPEG XL file per channel so that we have full control over the compression parameters."
This follows a general trend in modern codecs where the format itself allows for many different tools, and the job of the encoder is to make good use of them. See "Encoder Coding Tool Selection Guideline" for a nice chart of the possibilities: https://ds.jpeg.org/whitepapers/jpeg-xl-whitepaper.pdf
I see. It turns out I had a misunderstanding about some of the details, including that libjxl is responsible for a lot of stuff that I thought was inherent to the format.
It does seem a bit weird to me that we're going to end up with image files being more like video container formats where you need the appropriate codec available in order to decode them. But I suppose when the use cases are so widely varied it was probably inevitable.
Maybe we should just cut to the chase and standardize codecs that fit inside of mp4 or mkv for all media, including still images, audio, everything. I'm only half joking - it feels like where this is headed.
Why not just store each spectral image as its own HEVC-compressed video clip?
Each frame in the video corresponds to a slice of the light spectrum. The longer the video, the finer the spectral slices, so you can vary the spectral precision per image as you see fit.
And since it's a video, you can exploit the video compression advances that have been achieved over the years without having to reinvent the wheel.
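A rough sketch of the idea, assuming ffmpeg with libx265 is available and the bands have already been extracted as grayscale PNGs (filenames invented):

```python
# Rough sketch: pack spectral bands as frames of a lossless HEVC clip.
# Assumes ffmpeg with libx265 on PATH; input filenames are made up.
import subprocess

subprocess.run(
    [
        "ffmpeg",
        "-framerate", "1",            # one frame per spectral band
        "-i", "band_%03d.png",        # band_000.png, band_001.png, ...
        "-c:v", "libx265",
        "-x265-params", "lossless=1", # true lossless mode in x265
        "spectrum.mkv",
    ],
    check=True,
)
```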
> And since it's a video, you can exploit the video compression advances that have been achieved over the years without having to reinvent the wheel.
If you peel away enough layers in any modern video codec, you'll find that the intra-frame compression path looks very similar to how JPEG operates: you have a block/partitioning scheme, a transform (DCT/DST), quantization, and then some form of entropy coding.
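As a toy illustration of that shared pipeline (random data and a flat quantization step, both invented):

```python
# Toy JPEG-style intra-frame pipeline: block DCT, quantization, and
# (standing in for entropy coding) a count of surviving coefficients.
import numpy as np
from scipy.fft import dctn, idctn

block = np.random.randint(0, 256, (8, 8)).astype(np.float32) - 128
coeffs = dctn(block, norm="ortho")       # 2D DCT-II transform
q = 16.0                                  # made-up flat quantization step
quantized = np.round(coeffs / q)          # quantization discards detail
recon = idctn(quantized * q, norm="ortho") + 128
print("nonzero coefficients:", np.count_nonzero(quantized))
```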