The Road to OCIv2 Images: What's Wrong with Tar? (2019) (cyphar.com)
80 points by goranmoomin on Feb 2, 2022 | 17 comments


I agree with many of the concerns the author raises, but I'm left with the question:

Given all this, what does layering give us?

It gives some deduplication, but only a crude form. It gives some reproducibility from building off a well-known base and tag, but not full reproducibility. It gives some security benefit from building off a well-known base, but not as large a benefit as standard package managers provide.

I would be excited to see an image distribution system based on something like casync, maybe with an initial rootfs built by image-focused distributions like Yocto[1]. The embedded-device ecosystem has been concerned with reproducibility, image signing, and incremental updates for a while, and I think their approaches are very applicable to container images.

[1]: https://www.yoctoproject.org/


Apparently, not a whole lot for image transfer and portability. But layering still gives you something at runtime if a single organization uses the same base image for all of its containers. And in practice, I think layer-level deduplication does still save on transfer costs. I'm not sure whether the author was accounting for where industry was heading, but with projects that are rebuilt on every commit, upper layers change far more often than distro base images do. A base image may be patched daily, so you re-download the whole thing once a day, but if you're building 40 times a day, that's still better than re-downloading it 40 times a day. It's just a lot worse than we could be doing if we downloaded only diffs instead of the entire layer when a single bit changes.
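To put rough numbers on it (figures invented purely for illustration, not from any measurement): say the base layer is 100 MB and patched daily, and the app layer is 10 MB and rebuilt 40 times a day.

    # Back-of-envelope, with made-up sizes:
    base_mb, app_mb, builds_per_day = 100, 10, 40

    without_layers = (base_mb + app_mb) * builds_per_day  # re-pull everything each build
    with_layers = base_mb + app_mb * builds_per_day       # base pulled once per day

    print(without_layers, with_layers)  # 4400 MB/day vs 500 MB/day

And diff-based transfer would cut into the remaining 400 MB of app-layer pulls too, since most rebuilds change only a fraction of that layer.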

It would be nice to know what, if anything, ever came of the ending tease. Something like git, but one that also handles binary files, is what's called for. Arguably, ClearCase offered this exact feature 27 years ago, but being proprietary and expensive limited its adoption among modern web tooling.


You can have a "layered" build system on top of a snapshot-style storage approach. The fact that Dockerfiles and build scripts are written in a layered manner doesn't mean our storage format needs to use layered tar archives that duplicate data needlessly.

As for the tease, sorry about that -- there were several discussions in the OCI community about my proposals and other issues we might want to fix, but sadly the work has stalled.


Hmm, thinking about it... a ZFS filesystem on a file could solve almost all of these problems, with the exception of well-performing dedup and shrinking the file/device.

But really good article -- quite an interesting tar history.


Machine-independent representation? One of the explicit goals is a format that works no matter what underlying filesystem it is copied onto. ZFS is supported as a storage driver by most container runtimes, but it isn't desirable for it to be the only supported storage driver.


Isn't this already possible with losetup?


Great post, but:

> How Do We Get It?

> I’m afraid to find that out, you’ll need to wait until the next instalment. I hope to get it complete in a few weeks (I was hoping to have a PoC available with the next instalment but that’s just a silly goal at this point).

> If you want a taste though, the general idea is that we can resolve most of the issues I’ve listed and gain most of the properties we want by creating our own format that is built on top of the OCI content-addressable store, and is a Merkle tree with content-defined chunking of file contents. The basic idea is very similar to backup tools like restic or borgbackup. Since OCI has smart pointers, we can define a few new media-types, and then our new format would be completely transparent to OCI image tools (as opposed to opaque tar archives).

> But you’ll learn all about that next time. Thanks for reading, and happy hacking!
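(For context on the "smart pointers" bit: OCI manifests reference blobs via descriptors -- a media type, digest, and size -- so a new format can slot in as new media types that existing tools treat as opaque blobs. A sketch of what such a descriptor could look like, with a media type invented purely for illustration, not an actual OCIv2 proposal:)

    # Shape of an OCI descriptor (mediaType/digest/size); the media type
    # and size below are hypothetical.
    descriptor = {
        "mediaType": "application/vnd.example.chunked-fs.index.v1+json",
        "digest": "sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
        "size": 1024,
    }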

Any follow-up on this three-year-old post? It appears to be the final entry on cyphar's blog. Did the follow-up ever happen? I've done some hunting but can't find any stated plans for OCIv2 from cyphar.

There was a hack session in March 2020 on this topic: https://hackmd.io/@cyphar/ociv2-brainstorm . Really fascinating document. It has a section for each problem and does a great job citing and discussing prior work in the area. It was a delight to stumble into, but it still seems very formative. It felt like cyphar was already onto a plan well before this date. I don't see any evidence of v2 forming in the official repo, https://github.com/opencontainers/image-spec/tree/main/specs... . cyphar does still seem active in the project.

It seems there was a come-together call on the mailing list last June, proposing to extend their way forward rather than do a big revamp/rewrite: https://groups.google.com/a/opencontainers.org/g/dev/c/6DGtH...


Hello. I was surprised to see someone post this so long after I wrote it.

We had several design discussions about this at the time, but we struggled to come to an agreement about how we might migrate, among various other issues. Tycho and I were the only people pushing for this, and it wasn't clear how much other interest there was -- stargz is being pushed as a (pretty clever) attempt to hack around the tar issue by continuing to use tar but structuring it so that you can do random access (among other things).
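The rough idea behind stargz, for anyone who hasn't seen it: compress each tar entry as its own gzip member and append a table of contents, so clients can fetch individual files with HTTP range requests instead of pulling the whole layer. A sketch of the access pattern (the TOC layout and field names here are invented; the real eStargz format differs in the details):

    import gzip, json

    # Sketch of stargz-style random access into a layer blob.
    # fetch_range(offset, length) stands in for an HTTP range request.
    def read_file_from_layer(fetch_range, toc_bytes, path):
        toc = json.loads(toc_bytes)        # maps paths to byte ranges
        entry = toc[path]                  # e.g. {"offset": 4096, "csize": 812}
        member = fetch_range(entry["offset"], entry["csize"])
        return gzip.decompress(member)     # each entry is its own gzip member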

I haven't had much time for blogging recently. I did have a draft part 2 written, but I decided it might make more sense to have some consensus on where we might go before publishing it -- at this point, though, neither has happened, so maybe I should've published the blog post regardless (though in fairness I was still writing prototypes, so publishing something might've been a bit premature).

The general thesis would've been that we can basically design everything in a similar way to restic: content-defined chunking and a snapshot-style system where the granularity of the filesystem data is much smaller than the layer level, allowing for massively increased deduplication and eliminating the vast majority of issues with the layer-based model we have now.
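If you haven't seen content-defined chunking before, the idea is to cut a byte stream wherever a rolling hash over a small window hits a magic value, so cut points depend on the content rather than on fixed offsets, and an edit only disturbs the chunks around it. A toy version (real tools like restic use proper rolling hashes such as Rabin fingerprints, plus min/max chunk-size bounds):

    import hashlib

    # Toy content-defined chunking: cut where a rolling sum over the last
    # `window` bytes has its low bits all set. Illustrative only.
    def chunks(data, window=48, mask=0x0FFF):
        start, rolling = 0, 0
        for i, byte in enumerate(data):
            rolling += byte
            if i >= window:
                rolling -= data[i - window]      # slide the window
            if i - start + 1 >= window and (rolling & mask) == mask:
                yield data[start : i + 1]
                start = i + 1
        if start < len(data):
            yield data[start:]

    # Chunks are stored by content hash, so identical data deduplicates
    # across files, layers, and images:
    store = {hashlib.sha256(c).hexdigest(): c for c in chunks(b"example data " * 1000)}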

However, after looking into it more, there are several other things we should rethink (device inodes, for instance, which would require rethinking how systems like Kubernetes express their requirements of an image). There's also a fair argument that we should move compression entirely to distribution (and use uncompressed data in the image format), but such a change would possibly require even more discussions with them to make sure it wouldn't be too burdensome.

Also feel free to call me Aleksa lol.


Since I learned that OCI images are pretty much just archives with metadata, I've been thinking about applying that idea to other stuff.

Like, it would be cool to have all my Python virtualenvs saved in bundles so they could be reused by others, without all the container bloat.


OCI is actually starting to be used for more non-container things like you're describing. One example I'm aware of is the Helm tool in Kubernetes, which supports using OCI as a method of storing/transporting Helm charts (which are effectively just tarballs of Go templates/YAML).

I think your challenge with virtualenvs would be that they generally aren't relocatable, since they hardcode absolute paths in the generated scripts (e.g. `venv/bin/activate`).
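To make that concrete: the activate script bakes in the absolute path from creation time, which you can see directly (the path in the output is just an example):

    # Print the hardcoded path inside an existing venv's activate script.
    from pathlib import Path

    text = Path("venv/bin/activate").read_text()
    print([l for l in text.splitlines() if l.startswith("VIRTUAL_ENV=")])
    # -> e.g. ['VIRTUAL_ENV="/home/alice/project/venv"']

The entry-point scripts in venv/bin/ have the same problem via their shebang lines.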


Sounds like a Java jar/war file...


And apk, deb, rpm, whl, msi. Virtually all packaging formats come down to "archive plus metadata."


Yes, though narrowed down to "packaging for a Python virtual environment", it draws closer to jar/war.


Discussed at the time:

Road to OCIv2 Images: What's Wrong with Tar? - https://news.ycombinator.com/item?id=18965881 - Jan 2019 (33 comments)


Squashfs instead! There was another post about this recently: replacing tar with squashfs solves a bunch of problems. It's arguably harder to create and has a smaller ecosystem than tar, but capability-wise it would be great.


The original title, "The Road to OCIv2 Images: What's Wrong with Tar?", is a lot better IMHO. Title aside, it's a great history lesson on the tar format and why it's not a great fit for OCI images.


Also a bit mean! The original author explicitly considered and rejected the "x considered harmful" title.



