
> The OCI Distribution Spec is not great, it does not read like a specification that was carefully designed.

That’s par for the course for everything around Docker and containers. As a user experience Docker is amazing, but as technology it is hot garbage. That’s not as much of a dig at it as it might sound: it really was revolutionary; it really did make using Linux namespaces radically easier than they had ever been; it really did change the world for the better. But it has always prioritised experience over technology. That’s not even really a bad thing! Just as there are tons of boring companies solving expensive problems with Perl or with CSVs being FTPed around, there is a lot of value in delivering boring or even bad tech in a good package.

It’s just sometimes it gets sad thinking how much better things could be.



> it really did change the world for the better.

I don’t know about that (hyperbole aside). I’ve been in IT for more than 25 years now. I can’t see that Docker containers actually delivered any tangible benefits in terms of end-product reliability or velocity of development, to be honest. This might not necessarily be Docker’s fault, though; maybe it’s just that all the potential benefits get eaten up by things like web development frameworks and Kubernetes.

But at the end of the day, today’s Docker-based web app development delivers less than fat-client desktop app development delivered 20 years ago, as sad as that is.


If you haven’t seen the benefits, you’re not in the business of deploying a variety of applications to servers.

The fact that I don’t have to install dependencies on a server, or set up third-party applications like PHP, Apache, Redis, and the myriad of other packages anymore, or manage config files in /etc, or handle upgrades of libc gracefully, or worry about rolling restarts and maintenance downtime… all of this was solvable before, but has become radically easier with containers.

Packaging an application and its dependencies into a single, distributable artifact that can be passed around and used on all kinds of machines was a glorious success.
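
In practice that whole workflow collapses into a couple of commands. A rough sketch, with the registry, image name, and port invented for illustration:

    # Build and publish the artifact once (names/tags are placeholders):
    docker build -t registry.example.com/myapp:1.4.2 .
    docker push registry.example.com/myapp:1.4.2

    # Then on any host with a container runtime:
    docker run -d --restart unless-stopped -p 8080:8080 \
      registry.example.com/myapp:1.4.2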


Circa 2005 I was working at places where I was responsible for 80 and 300 web sites respectively using a large range of technologies. On my own account I had about 30 domain names.

I had scripts that would automatically generate the Apache configuration to deploy a new site in less than 30 seconds.

At that time I found that most web sites have just a few things to configure: often a database connection, the path to where files are, and maybe a cryptographic secret. If you are systematic about where you put your files and how you do your configuration running servers with a lot of sites is about as easy as falling off a log, not to mention running development, test, staging, prod and any other sites you need.
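
To give an idea, the whole "new site in under 30 seconds" trick fits in a handful of lines. This is a rough modern-Debian-style reconstruction, not the original scripts; paths and helper commands are assumptions:

    #!/bin/sh
    # Hypothetical "add a site" script: docroot layout and log paths are invented.
    DOMAIN="$1"
    DOCROOT="/var/www/$DOMAIN/htdocs"
    mkdir -p "$DOCROOT"

    # Generate the vhost config from the naming convention.
    cat > "/etc/apache2/sites-available/$DOMAIN.conf" <<EOF
    <VirtualHost *:80>
        ServerName $DOMAIN
        DocumentRoot $DOCROOT
        ErrorLog /var/log/apache2/$DOMAIN-error.log
    </VirtualHost>
    EOF

    a2ensite "$DOMAIN" && systemctl reload apache2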

I have a Python system now with gunicorn servers and celery workers that exists in three instances on my PC. Because I am disciplined and everything is documented, I could bring it up on another machine manually pretty quickly, probably more quickly than I could download 3GB worth of Docker images over my ADSL connection. With a script it would be no contest.
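
For the curious, the manual bring-up is roughly this (project and module names are placeholders):

    # Create an isolated environment and install pinned dependencies.
    python3 -m venv .venv && . .venv/bin/activate
    pip install -r requirements.txt
    # Start the web workers and the background workers.
    gunicorn myproject.wsgi:application --bind 127.0.0.1:8000 &
    celery -A myproject worker --loglevel=info &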

There was also a time when I was building AMIs and even selling them on the AMZN marketplace. The formula was: write a Java program that writes a shell script that an EC2 instance runs on boot; when it is done, it sends a message through SQS to tell the Java program to shut down and image the new machine.
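
The tail of such a boot script might look roughly like this (the queue URL and account number are made up):

    # ... provisioning steps run here ...
    # Signal completion back to the build program, tagging the message with this instance's ID.
    aws sqs send-message \
      --queue-url "https://sqs.us-east-1.amazonaws.com/123456789012/image-build-done" \
      --message-body "$(curl -s http://169.254.169.254/latest/meta-data/instance-id)"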

If Docker is anything, it is a system that turns 1 MB worth of I/O into 1 GB of I/O. I found Docker was slowing me down when I was using a gigabit connection; I found it basically impossible to do anything with it (like boot up an image) on a 2 MB/sec ADSL connection, and with my current pair of 20 MB/s connections it is still horrifyingly slow.

I like how the OP is concerned about I/O speed and brings it up. I think it could be improved if there were a better cache system (e.g. Docker might even work on slow ADSL if it properly recovered from failed downloads).

However I think Docker has a conflict between “dev” (where I’d say your build is slow if you ever perceive yourself to be waiting) and “ops” (where a 20 minute build is “internet time”)

I think ops is often happy with Docker, some devs really seem to like it, but for some of us it is a way to make a 20 sec task a 20 minute task.


And I'm guessing that with this system you had a standard version of Python, Apache, and everything else. I imagine that if you wanted to update to the latest version of Python, it involved a long process of making sure those 80 or 300 websites didn't break because of some random undocumented breaking change.

As for Docker image size, it really just depends on dev discipline, for better or for worse. The nginx image, for example, adds about 1MB of data on top of whatever you did with your website.
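
For instance, a static site image is just a thin layer on top of the upstream image (paths and names here are illustrative):

    # Two-line Dockerfile layered on the official nginx image.
    cat > Dockerfile <<'EOF'
    FROM nginx:alpine
    COPY ./public /usr/share/nginx/html
    EOF
    docker build -t mysite:latest .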


You hit a few important notes that are worth keeping in mind, but I think you handwave some valuable impacts.

By virtue of shipping around an entire system's worth of libraries as a deployment artifact, you are indeed drastically increasing the payload size. It's easy to question whether payload efficiency is worthwhile with the advent of >100 and even >1000 Mbit internet connections available to the home, but that is certainly not the case everywhere. That said, assuming smart squashing of image deltas and basing off of a sane upstream image, much of that pain is felt only once.

You bring up that you built a system that helped you quickly and efficiently configure systems, and that discipline and good systems design can bring many of the same benefits that containerized workloads do. No argument! What the Docker ecosystem provided, however, was a standard implemented in practice that became ubiquitous. It became less important to build one's own system, because the container image vendor could define that, using a collection of environment variables or config files placed in a standardized location.
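
In other words, the de facto standard ends up looking like this no matter what stack is inside the image (image and variable names here are invented):

    # Configuration via environment variables plus a mounted config directory.
    docker run -d \
      -e DATABASE_URL="postgres://app:secret@db:5432/app" \
      -e SECRET_KEY="change-me" \
      -v "$PWD/config:/etc/myapp:ro" \
      example/myapp:latest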

You built up a great environment, and one that works well for you. The containerization convention replicates much of what you developed, with the benefit that it grabbed a majority mindshare, so now many more folks are building with things like standardization of config, storage, data, and environment in mind. It's certainly not the only way to do things, and, much as you described, it's not great in your case. But if something solves a significant number of cases well, then it's doing something right. For a not-inconsequential number of people, trading bandwidth and storage for operational knowledge and complexity is a more than equitable trade.


Agreed. I remember having to vendor runtimes into my services because we couldn't risk upgrading the system-installed versions with the number of things running on the box, which then led to horrible hacks with LD_PRELOAD to work around a mixture of OS / glibc versions in the fleet. Adding another replica of anything was a pain.

Now I don't have to care what OS the host is running, or what dependencies are installed, and adding replicas is either automatic or editing a number in a config file.

Containerization and orchestration tools like k8s have made life so much easier.
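
For example, adding replicas really is one field or one command with Kubernetes (the deployment name is made up):

    # Scale the hypothetical "web" deployment to five replicas.
    kubectl scale deployment/web --replicas=5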


As you note, it was all solvable before.

A lot of us were just forced to "switch" from VMs to Docker; Docker that still got deployed to a VM.

And then we got forced to switch to podman as they didn't want to pay for Docker.


> As you note, it was all solvable before.

Washing clothes was possible before people had a washing machine, too; I’m not sure they would want to go back to that, though.

I was there in the VM era, and I had to set up appliances shipped as VM images. It was awful. The complexity around updates and hypervisors, and all that OS adjustment work just to get a runtime environment going: all of that just disappeared with Docker (if done right, I’ll give you that).

Organisations manage to abuse technology all the time. Remember when Roy Fielding wrote about using HTTP sensibly to transfer state from one system to another? Suddenly everything had to be "RESTful", which for most people just meant that you tried to use as many HTTP verbs as possible and performed awkward URL gymnastics to get readable resource identifiers. Horrible. But all of this doesn’t mean REST is a bad idea in itself; it’s a wonderful one, in fact, and can make an API substantially easier to reason about.


I’m aware of all of that. I’m just saying that this has not translated into more reliable and better software in the end, interestingly enough. As I said, I’m not blaming Docker, at least not directly. It’s more that the whole “ecosystem” around it seems to have so many disadvantages that they end up outweighing the advantages of Docker.


It has translated to reliable legacy software. You can snapshot a piece of software, together with its runtime environment, at the point when it's still possible to build it; and then you can continue to run that built OCI image, with low overhead, on modern hardware — even when building the image from scratch has long become impossible due to e.g. all the package archives that the image fetched from going offline.

(And this enables some increasingly wondrous acts of software archaeology, due to people building OCI images not for preservation, but just for "use at the time" — and then just never purging them from whatever repository they've pushed them to. People are preserving historical software builds in a runnable state, completely by accident!)

Before Docker, the nearest thing you could do to this was to package software as a VM image — and there was no standard for what "a VM image" was, so this wasn't a particularly portable/long-term solution. Often VM-image formats became unsupported faster than the software held in them did!

But now, with OCI images, we're nearly to the point where we've e.g. convinced academic science to publish a paper's computational apparatus as an OCI image, so that it can be pulled 10 years later when attempting to replicate the paper.


> You can snapshot a piece of software, together with its runtime environment, at the point when it's still possible to build it

I think you’re onto part of the problem here. The thing is that you have to snapshot a lot of today’s software together with its runtime environment.

I mean, I can still run Windows software (for example) that is 10 years old or more without that requirement.


The price for that kind of backwards compatibility is a literal army of engineers working for a global megacorporation. Free software could not manage that, so having a pragmatic way to keep software running in isolated containers seems like a great solution to me.


There’s an army of developers working on Linux as well, employed by companies like IBM and Oracle. I don’t see a huge difference to Microsoft here to be honest.


You'd have a better time working with Windows 7 than with a 2.x Linux kernel. I love Linux, but Microsoft has longer support windows for its operating systems.


What are you even talking about? Being able to run 10-year-old software (on any OS) is orthogonal to being able to build a piece of software whose dependencies are completely missing. Don't pretend this doesn't happen on Windows.


My point was that a lot of older software, especially desktop apps, did not have such wild dependencies, so this was less of an issue. Today, with Python, and with JavaScript and its NPM hell, it is, of course.


> My point was that a lot of older software, especially desktop apps, did not have such wild dependencies. Therefore this was less of an issue.

Anyone who worked with Perl CGI and CPAN will tell you that managing dependencies across environments has always been an issue. Regarding desktop software: the phrase "DLL hell" predates NPM and pip by decades and is fundamentally the same dependency-management challenge that Docker mostly solves.


DLL hell was also essentially fixed decades ago. And rarely as complex as what you see nowadays.


Exactly!


I think the disconnect is in viewing your trees and not viewing the forest. Sure, you were a responsible, disciplined tree engineer for your acres, but what about the rest of the forest? Can we at least agree that Docker made plant husbandry easier for the masses worldwide?


I'm not sure I would agree here: from my personal experience, the increasing containerisation has definitely nudged lots of large software projects to behave better; they don’t spew so many artifacts all over the filesystem anymore, for example, and increasingly adopt environment variables for configuration.

Additionally, I think lots of projects became able to adopt better tooling faster, since the barrier to use container-based tools is lower. Just think of GitHub Actions, which suddenly enabled everyone and their mother to adopt CI pipelines. That simply wasn’t possible before, and has led to more software adopting static analysis and automated testing, I think.


This might all be true, but has this actually resulted in better software for end users? More stability, faster delivery of useful features? That is my concern.


For SaaS, I'd say it definitely improved and sped up delivery of the software from development machine to CI to production environment. How this translates to actual end users is totally up to the developers/DevOps/etc. of each product.

For self-hosted software, be it for business or personal use, it immensely simplified how a software package can be pulled, and run in isolated environment.

Dependency hell is avoided, and you can easily create/start/stop/delete a specific software, without affecting the rest of the host machine.


> But at the end of the day, todays Docker-based web app development delivers less than fat-client desktop app development delivered 20 years ago, as sad as that is.

You mean, aside from not having to handle installation of your software on your users' machines?

Also I'm not sure this is related to docker at all.


I actually did work in software packaging (amongst other things) around 20 years ago. This was never a huge issue, to be honest; neither was deployment.

I know, in theory this stuff all sounds very nice. With web apps, you can "deploy" within seconds ideally, compared to say at least a couple of minutes or maybe hours with desktop software distribution.

But all of that doesn't really matter if the endusers now actually have to wait weeks or months to get the features they want, because all that new stuff added so much complexity for the devs to handle.

And that was my point. In terms of enduser quality, I don't think we have gained much, if anything at all.


Being able to create a portable artifact containing only the userspace components, one that can be shipped and run anywhere with minimal fuss, is something that didn't really exist before containers.


Java?


There were multiple ways to do it as long as you stayed inside one very narrow ecosystem: JARs for the JVM, Python's virtualenv, PHP sort of, I think Ruby had something? But containers gave you a single way to do it for any of those ecosystems. Docker lets you run a particular JVM with its JARs, an exact version of the database behind that application, and the Ruby on Rails app in front of it, and all these parts use the same format and commands.
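
The point being that the commands are identical no matter what's inside. A sketch, with images, tags, and the mounted JAR purely illustrative:

    # One network, three very different stacks, the exact same tooling.
    docker network create appnet
    docker run -d --network appnet --name db  -e POSTGRES_PASSWORD=secret postgres:15
    docker run -d --network appnet --name api -v "$PWD/app.jar:/app/app.jar" \
      eclipse-temurin:17 java -jar /app/app.jar
    docker run -d --network appnet --name web my-rails-app:latest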


25 years ago I could tell you what version of every CPAN library was in use at my company (because I installed them). What version of what libraries are the devs I support using now? I couldn't begin to tell you. This makes devs happy but I think has harmed the industry in aggregate.


Because of containers, my company now can roll out deployments using well-defined CI/CD scripts, where we can control installations to force usage of pull-through caches (GCP Artifact Registry). So it actually has that data you're talking about, but instead of living in one person's head, it's stored in a database and accessible to everyone via an API.
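
Concretely, the build hosts only ever pull through the remote repository. Something like this, where the region, project, and repo names are placeholders:

    # Point docker at Artifact Registry, then pull upstream images through the cache.
    gcloud auth configure-docker us-central1-docker.pkg.dev
    docker pull us-central1-docker.pkg.dev/my-project/dockerhub-remote/library/python:3.12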


Tried that. The devs revolted and said the whole point of containers was to escape the tyranny of ops. Management sided with them, so it's the wild west there.


Huh. I can actually understand devs not wanting to need permission to install libraries/versions, but with a pull-through cache there are no restrictions save for security vulnerabilities.

I think it actually winds up speeding up CI/CD Docker builds, too.


> As a user experience Docker is amazing, but as technology it is hot garbage.

I mean, Podman exists, as do lots of custom build tools and other useful options. Personally, I mostly just stick with vanilla Docker (and Compose/Swarm), because it's pretty coherent and everything just fits together, even if it isn't always perfect.

Either way, agreed about the concepts behind the technology making things better for a lot of folks out there, myself included (haven't had prod issues with mismatched packages or inconsistent environments in years at this point, most of my personal stuff also runs on containers).


Yeah, but the Open Container Initiative is supposed to be the responsible adults in the room taking the "fail fast" corporate Docker Inc stuff, and taking time to apply good engineering principles to it.

It's somewhat surprising that the results of that process are looking to be nearly as fly-by-the-seat-of-your-pants as Docker itself is.


Was it really so amazing? Here is half a Docker implementation, in about 100 lines of Bash...

https://github.com/p8952/bocker
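
The kernel does the heavy lifting; the essential move is just namespaces plus a chroot. A sketch (the ./rootfs directory is hypothetical and needs a /bin/sh inside it):

    # New mount, UTS, IPC, network, and PID namespaces, then a shell in a chroot.
    sudo unshare --mount --uts --ipc --net --pid --fork \
      chroot ./rootfs /bin/sh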


Lines of code is irrelevant.

Docker is important because:

1) it made it convenient to build a “system” image of sorts, upload it, download it, and run it.

2) (the important bit!) Enough people adopted this process for it to become basically a standard

Before Docker, it wasn't uncommon to ship some complicated apps as VMs. Packaging those was downright awful, with all of the bespoke scripting needed for the various steps of distribution. And then you get a new job? Time to learn a brand-new process.


I guess Docker has been around long enough now that people have forgotten just how much of an absolute pain it used to be. Just how often I'd have to repeat the joke: Them: "Well, it works on my machine!" Me: "Great, back up your email, we're putting your laptop in production..."


The other half is the other 90%.

Looking at it now, it won't even run under the latest systemd, which now refuses to boot with cgroups v1. Good luck even accessing /dev/null under cgroups v2 with systemd.


And, as the famous Hacker News comment goes, Dropbox is trivial: just use FTP, curlftpfs, and SVN. Docker might have many faults, but anybody who dealt with the problems it aimed to solve knows it was revolutionary in simplifying things.

And for those who disagree: please write a library like TestContainers out of cobbled-together bash scripts, one that can download, cleanly execute, and then clean up almost any common backend dependency.
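
To give a sense of what that involves, even the happy path for a single Postgres dependency looks something like this (the test entrypoint is hypothetical):

    # Start a throwaway Postgres on a random host port.
    cid=$(docker run -d -e POSTGRES_PASSWORD=test -p 5432 postgres:15)
    port=$(docker port "$cid" 5432/tcp | head -n1 | cut -d: -f2)
    # Wait until the database accepts connections.
    until docker exec "$cid" pg_isready -U postgres >/dev/null 2>&1; do sleep 1; done
    # Run the tests against it, then tear it down.
    PGHOST=127.0.0.1 PGPORT="$port" ./run-tests.sh   # hypothetical test entrypoint
    docker rm -f "$cid"

And that still has no cleanup traps, no log capture, and no handling of pull failures or port clashes.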



