WTF is a container? (techcrunch.com)
366 points by kawera on Oct 17, 2016 | 253 comments



I remember going to AWS Reinvent last year and having some beers with a bunch of people who did devops. We started talking about tools, and they were utterly flabbergasted that we had not embraced docker. They went on and on about how simple docker made HA and handling failovers/maintenance. More or less made it seem like it was the greatest thing since sliced bread.

A few coworkers and I decided to try to dockerize some of our services, and move our staging ES cluster to docker.

For the most part, building our own containers for the various services was easy enough. The biggest issue we had was with Elasticsearch, since we have 4 types of nodes, so we ended up building a specialized container for each node type.

Then came the issues:

* Making containers work together. For example, we use a logging agent, and we decided to make that its own container. Then actually getting a way to share a log directory with the agent was very painful and unclear. (Honestly the single most frustrating thing I recall was asking for advice in IRC and more or less being told I was doing it wrong.)

* Containers would randomly crash under load, exiting with error 137 (out of memory). Apparently a few of our services would randomly leak memory, but only when running inside a docker container vs Ubuntu 14.04. (I never figured this out.)

* Containers would randomly become unreachable, forcing me to kill them and restart them.

* Random hanging issues: suddenly docker exec is hanging. Being told to revert to an older version, or install a newer version, is tiring and makes my environment very inconsistent.

* Trying to debug why our app in the container isn't working is not fun at all.

However, the single part that killed me was when I was chatting with one of the people I met at Reinvent and mentioned my issues. He acted like these kinds of issues were completely normal.

After a solid 2 weeks of random issues and the constant barrage of PagerDuty alerts, I just rolled everything back to static EC2 instances for staging, and have run into 0 issues. I want to try containers again because I want them to work, but I have just had too many issues.


We have similar stories. We even brought in a devops consulting firm, who insisted on a pure AWS / Docker (ECS & co) stack. After a couple months of shocks, crises and occasional all-nighters, I just started deploying backup instances on Heroku, so that the QA and design teams wouldn't get blocked by the weekly clusterforks. After a few weeks of smooth sailing on that front, we just activated logging and auto scaling addons, and blessed the Heroku stack with the production domain name and CDN.

Heroku gets expensive quickly at scale, but the engineering required to make "the future" work on AWS was unmeasurable (because it never really succeeded for us).

I don't even know what to blame for the whole episode. Docker's incomplete architecture (2015)? AWS's inability to abstract and manage complexity into something that just works as described on their product pages? The consultants? Myself, for putting faith in that triad of unfamiliar but crucial people, services and products? Whatever, it's no longer a current problem for me.


It's exactly these kinds of issues that add to my impression that containerisation, and Docker in particular, is more of a religion than a solution.

I don't hate it per se, but I'm just a bit fed up with people telling me I should be using it without being able to articulate what problem it's going to solve for me.

At the moment it feels like just one more thing to learn, and one more moving part, that isn't strictly necessary in order to achieve what I need, which is basically to deploy software without a heap of aggravation.


For me, with appropriate base containers, the real benefit is load time... booting a full VM has a lot of overhead, and takes quite a while. Loading a dockerized app is much faster. If you're extending this for your testing environment, that can be a huge win over a larger team.

Being able to load more containers than VMs onto a server is another piece...

Beyond that, it's not too much different from how running VMs was earlier on: it took orchestration and tooling to get working, and there were growing pains... much of that was abstracted away well before it became popular.

In the end, containers are just another step towards wherever we are going... I think it's pretty nice, but the tooling is just now starting to catch up.


I think that's a legitimate argument but deployments to VMs that are already up and running can also be fast, and for me they are, so it's not overwhelming. My feeling is that one day I'll reach a level of complexity/deployment time where the benefits of containerisation become clear, but I'm not there yet, nor anywhere near it.


That's a discipline issue though... shared nothing start from zero... The "docker" method doesn't upgrade an app already running in a container, it creates a new one.


I wouldn't use it for everything, but whenever I encounter a problem where applications crash or behave weirdly, the problem is with the app, not the container technology. Just because the problem only manifests itself in a container doesn't mean the problem is suddenly "docker".

And as with every technology, you have to understand its strengths and weaknesses. I use Docker internally and in production with very few issues other than it advancing way too fast at the moment, but am always amazed how many people dive head-first into a Docker adventure without understanding what it is, how it actually works, and its limitations.

This is from the perspective of a relatively small development company with applications where scaling is a non-issue. Our problem is that we have a ton of active projects. To give you an idea, our internal CI now still has over 500 active build jobs after a recent cleanup.

This CI is the first thing where Docker shines, it was an absolute god-send. I got rid of tons of frankenstein build slaves with unpredictable configurations, and replaced them with one huge VM running docker, with build images per project. This made this massive mess suddenly perfectly manageable, documented and version controlled. Need to build something in a build configuration from 1 year ago? Not something I want to do every day, but not completely impossible either, since I still have the exact same docker image in our internal registry.

Other than that, upgrading internal tools became a lot easier. Everything used internally (redmine, jenkins, ...) is containerized, which means it can easily be tested, migrated, cloned, ... It enforces data separation, which means it's clearer and easier to backup/restore and test these things. It means that now whenever a new version is released I can easily test this with our real data to see if there would be any problems in our configuration, and if not, quickly deploy a new version.


Check out https://github.com/jenkinsci/hyper-slaves-plugin. This plugin will launch your build jobs as on-demand containers in Hyper.sh; then you don't even need the long-running huge VM!


It would help these people to read http://docker-saigon.github.io/post/Docker-Caveats/


I feel that sometimes I'm pro containers, and sometimes I'm very much against them. Docker does make deployments and upgrades very easy once you have the initial infrastructure set up, and you can deploy applications directly from upstream as "units" so that each deployment is exactly what you want; deploying containers helps prevent hosts from diverging too much.

If you have the infrastructure in place to rebuild your virtual machines though, containers offer little benefit other than perhaps not having to package your software.

At work we're mostly deploying docker containers using puppet, and often when docker fails to work, I can just obliterate everything and run puppet once to re-download images and set up everything again. I would not trust docker to manage any data that I can not recover from elsewhere.

My biggest gripe is that configuration management feels like an unsolved problem. Most examples I see on the web seem to simply ignore it, or do the usual "mount a host directory inside a container" thing which has issues with file permissions and host/container UIDs clashing, and just feels inelegant.
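
For what it's worth, one common workaround for the UID clash is to run the container process as the UID/GID that owns the host directory. A minimal docker-compose sketch, assuming a hypothetical myapp image and a host user with UID/GID 1000:

    app:
        image: myapp:latest          # hypothetical image
        user: "1000:1000"            # run as the host user/group that owns ./logs
        volumes:
        - ./logs:/var/log/myapp      # host directory shared into the container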

I'm also developing a dislike of Docker on account of it simply failing to work at times for no discernible reason, and having stupid issues like IPv6 simply not working correctly.[1] It's also rather inelegant when used on a systemd-based host because it wants to reimplement most systemd functionality.

I'm still waiting for rkt to mature a bit, maybe it will be better.

[1] https://github.com/docker/docker/issues/2174


More like systemd has been reimplementing docker functionality...


FYI, systemd had nspawn for kernel debugging before docker became hyped


> Trying to debug why our app in the container isn't working is not fun at all.

My experience with AWS was similar, and I got to a point where I had to stop accepting clients who had built (or insisted on building) their infrastructure on AWS. It actually kind of reminds me of a pyramid scheme where they ensnare you with a seemingly good deal, but in order to get that you've gotta buy in just a little (time, cost, etc.) more and more -- just to be able to debug -- and more until you're calculating sunk costs[1] bigger than any company should have to spend on something as trivial as, say, a WordPress site. It's highly unlikely that 95 percent of the SMB websites out there need the overkill of AWS infrastructure / EC2. Do a cost-benefit analysis.


You entirely missed the point. AWS is not relevant. All the issues are coming from docker.


I think you missed his point. He was speaking generally about the costs of working on new tech stacks that other people just accept.


My reply is off-topic, but I cannot resist.

> I think you missed his point. He was speaking generally [..]

shawnee_ --> hackeress.com

> What is a hackeress?

> A hackeress is a female hacker.

Bad form to assume all people are males in this domain (or even the majority for that matter, regardless of the actual statistics).

Use the form "they" when referring to someone whose gender (or gender identity) is unknown to you. Or check their profiles ;)


I think the assumption arose not from the demographic of people on Hacker News, but from the username of that poster. "Shawn" is a fairly common name, where one in every 2000 people will be named it. [1]

Meanwhile, "Shawnee" is a really rare name[2], one you may not be aware of if you didn't grow up in the US (or in particular parts thereof). (Apparently only 4000 of them are alive today!)

Just as an aside, using "they" by default can be very confusing, especially when plurals are involved. I use it sometimes, but prefer to use the passive voice instead ("the parent poster", etc.) I definitely do not have time to check every person's profile when I comment on HN.

1: http://www.wolframalpha.com/input/?i=shawn

2: http://www.wolframalpha.com/input/?i=shawnee&rawformassumpti...


That is not the passive voice.


Thanks for pointing out where the name confusion may have come from.

> I definitely do not have time to check every person's profile when I comment on HN.

At least then we should not make such gender assumptions explicit, regardless.

I have never heard the name "Shawnee" and generally do not assume usernames on web forums are indicative of real-life anything (e.g. you are 'striking', but apparently that is not your name).

I agree 'they' is an awkward construction in English, but for many people the alternative using passive voice is a little more complex to use (cf. non-native English speakers).


As a Polish immigrant, I personally think "singular they" is worse, because it creates ambiguity where there doesn't need to be any... which makes things a bit more confusing. Meanwhile, I think my proposed alternative is native to most other languages, and should translate fairly easily.

I'm not sure I see the point in thinking this long and hard about pronouns, but to each his own.


Yes indeed, to each their own. The singular "they" can be confusing in certain cases, but that's not really a problem unique to "they". "He", "she", "it", "we", and "you" can all be confusing in relatively common circumstances.

But you're going to have to get used to the singular "they", because it seems to only be getting more common (historically, singular "they" was acceptable until the late 19th century, so the strictly plural "they" might be seen as a 20th century anomaly).


I, too, think Docker is the bee's knees. But I don't think it would be good fit for a datastore (even though zeitgeist is to Dockerize all the things).

Docker seems to work best in an SOA type environment where you have a set of stateless services that might use different stacks. Docker simply unifies their provisioning and deployment.


My impression is that it really shines when you need to scale and you can easily spin up and kill large amounts of nodes/compute without any impact to your service–when you're at the point where you're thinking about the health of the service rather than individual nodes. The whole pets vs. cattle ideal seems to be discussed most in configuration management contexts (because it's widely applicable to most architectures) but it translates into big benefits when operating infrastructure for something like a scalable SOA.

So to give them the benefit of the doubt, they may have made some assumptions about your scale/workload/architecture. Perhaps for them containers occasionally crashing is just a small blip that will be automatically corrected so while they need to be aware of it to monitor for trends they're not generally concerned with them.


The problem is you are trying to run a production environment with a development tool. Manually doing "docker run" is not how you deploy containers. Kubernetes is designed to run containers and addresses the most common issues.

> Making containers work together, for example we use a logging agent, we decided to make that its own container. Then actually getting a way to share a log directory with the agent, was very painful and unclear. (Honestly the single most frustrating thing i recall was asking for advice in irc and more or less being told i am doing it wrong)

Kubernetes handles shared directories easily, since all containers in a pod can mount the same volume. If you need to share across pods you can use PersistentVolumeClaims.
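
For the log-sharing case above, a minimal pod sketch (image names are placeholders) gives the app and the logging agent a shared emptyDir volume:

    apiVersion: v1
    kind: Pod
    metadata:
      name: myapp
    spec:
      containers:
      - name: app
        image: myapp:latest            # placeholder application image
        volumeMounts:
        - name: logs
          mountPath: /var/log/myapp    # the app writes its logs here
      - name: log-agent
        image: mylog-agent:latest      # placeholder logging agent image
        volumeMounts:
        - name: logs
          mountPath: /logs             # the agent tails the same directory
      volumes:
      - name: logs
        emptyDir: {}                   # shared, pod-scoped scratch volume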

> Containers would randomly crash under load exiting with error 137 out of memory. Apparently a few of our services would randomly leak memory, but only when running inside a docker container vs ubuntu 14.04. (I never figured this out)

Kubernetes provides replication controllers that will always re-launch or provision the desired number of pods for a service. It also provides health checks, just like an AWS ELB, to determine if a pod is healthy. You can also set resource limits (CPU and memory) per pod.
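
As a rough sketch of what those limits and health checks look like on a container spec (the image, endpoint and numbers are made up):

    containers:
    - name: app
      image: myapp:latest              # placeholder image
      resources:
        limits:
          memory: "512Mi"              # the container is OOM-killed above this
          cpu: "500m"                  # half a core
      livenessProbe:
        httpGet:
          path: /healthz               # hypothetical health endpoint
          port: 8080
        initialDelaySeconds: 15
        periodSeconds: 10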

> Random hanging issues, suddenly docker exec is hanging. Being told to revert to an older version, or install a newer version is tiring and makes my environment very inconsistent.

Docker exec should not really be used on a running service in production; all of your provisioning should happen in the Dockerfile used to build the image.

> Trying to debug why our app in the container isn't working is not fun at all.

If your application is 10-factor, you can easily tail the logs of any container at anytime


12-factor?


doh! you're right https://12factor.net/


Yes, Docker is still very young and has a ton of issues like these. I think it will take another 2-3 years before that whole ecosystem has emerged, matured, and is ready for a medium-stability deployment.

Unfortunately for us, such considerations don't stop tech fads. Because containers can allow many more applications to cohabit on the same "hardware", the approach has business momentum behind it too (lower infrastructure expenditures). "Docker" will be the buzzword until such time as it's actually practical and intelligent to deploy with it.


>> Docker" will be the buzzword until such time as it's actually practical and intelligent to deploy with it.

At that point docker will be considered boring old technology and we'll be flocking to a hip new fad. Repeat ad infinitum.


> Docker is still very young and has a ton of issues like these

Very young? It's over 3 1/2 years old

https://webcache.googleusercontent.com/search?q=cache:eYg3Fs...


Keep in mind that young is relative. In a field that still makes daily use of software originating from the '60s, 3 and a half years is very young.


I'm confused about the need to have a web app that 'runs anywhere' when you know exactly 'where' it's going to run. When you spin up an instance, don't you get to decide what OS and services you're going to use?

I could maybe see the advantage if you had no control over the environment that the app was going to be run in like you might have with a downloaded executable for example. However, if your product is just some kind of web application, I don't see the need for containers.


Being able to leave your current hosting provider quickly is important, but not over-engineering abstract hot-swapping interfaces is pretty important, too.


Same tune here.

What docker promises is truly amazing and is something that I think everyone wants. However, docker itself still has a lot of problems.

In particular, docker's new builtin swarm (with 1.12) has tons of issues. I've experienced __so many__ problems coordinating container startups on a swarm cluster.

From the documentation:

    > Swarm can build an image from a Dockerfile just like a 
      single-host Docker instance can, but the resulting image   
      will only live on a single node and won’t be distributed 
      to other nodes.
This is a serious weakness.

You have to use __manual scheduling__ if you want to ensure services are started on the same node. This is problematic if your inventory of nodes is dynamic and frequently changing. Allegedly, kubernetes solves this problem.

Swarm (and the whole ecosystem) is still not as mature as it could be, but I think the end result is going to be very awesome and useful.


Kubernetes has the concept of a pod, which is a group of containers that run on the same node. It does indeed solve that problem.

http://kubernetes.io/docs/user-guide/pods/


That's probably a bad pattern, using the local registry like that. The goal is to be as stateless as possible, so pushing and pulling from an external registry would fix that.


>Containers would randomly crash under load exiting with error 137 out of memory. Apparently a few of our services would randomly leak memory, but only when running inside a docker container vs ubuntu 14.04. (I never figured this out)

Did you identify that the software definitely did not leak on ubuntu? Or was it that it ran on ubuntu because it consumed swap?

Was it for sure a leak? I have an app that doesn't have back-pressure so when too many requests come in, it fills up memory waiting to push them all through to the slower database. Normally it uses 256m. Spikes make it hit 700mb. If I tell docker -m 300m, then the process gets 300m ram + 300m swap, so when it tries to use 700m it gets killed. I could tell it to use 350m and then it will run, but it would be swapping furiously.
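
If it's spikes rather than a leak, one option is to set the swap ceiling explicitly instead of relying on the default doubling. A docker-compose sketch with made-up numbers:

    myapp:
        image: myapp:latest      # hypothetical service
        mem_limit: 300m          # hard RAM limit
        memswap_limit: 800m      # total RAM + swap; a 700m spike swaps instead of being killed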


Reminds me of the early days of asterisk and open-vz virtualization. Everyone sane stayed physical but a few of us crazy enough were able to push through and reap the rewards early on.


Flabbergasted? Wow, this is a new word for me; I have never heard of it. When I first read it, in my mind I thought it was a Spanish football player that used to play for Arsenal, then Barcelona, and now Chelsea. LoL

Is this word even used anywhere else beyond the US? Never heard it used in the UK.


I'm from the UK and I hear "flabbergasted" reasonably often, more often from upper-middle class folks. Maybe a regional thing?


more of an age thing? I use it but I doubt my daughter ever does


I was born and raised in the midwest part of the U.S. and I've been using "flabbergasted" for all my adult life (20+ years). Of course, I've always had a propensity for peculiar or anachronous verbiage.


Flabbergasted is very widely used in the UK; I'm really puzzled how you could possibly not have come across it if you've spent more than a few months here.


Indeed. One might say it is almost flabbergasting!


I agree that containers (both for shipping and servers) are a great idea. And because I'm tired of always configuring servers, I decided to give it a try some time ago.

I wrapped my IRC client (weechat + glowing-bear) in a Docker container. Oh, not one container though, because I also needed https, which meant I needed either a mechanism to build and update letsencrypt certs in the weird format that weechat expects, or to run an nginx instance in front (and also somehow get the certs, but that's easier with nginx). So two containers.

And even though there was a ready-made nginx+letsencrypt https reverse proxy container (actually several), I had a huge amount of headaches to get it actually working. Even with the system set up, I occasionally have the container crash with exit status 137 (IIRC), which I've assumed might be because weechat leaks/consumes memory and eventually the host server kernel kills the process. Maybe.

So in my limited experience, comparing Docker containers to shipping containers is a gross simplification. Shipping containers are simple constructions requiring well-defined simple maintenance, while Docker containers seem to be complex thingamabobs that have multiple points of failure.


Docker is a poorly engineered and over-hyped technology.

The concept is great - and in fact, many companies have built great tooling around Linux cgroups. It lets you efficiently binpack applications on a single server - which is why 'containers' were created in the first place.

The side benefit of letting you define your OS libraries, and other things, is a nice bonus, and way overblown in my opinion.

Docker and its tooling is just plain bad. It's so bloody unstable, unless you pick some esoteric combination of versions and storage backends (old docker + old aufs + old ubuntu seems to do the trick) - and even then, you'll run into problems. Documentation won't help you here - it's trial and error.

The docker group seems focused on feature, feature, feature, while the basic stability and performance remains poor. It's quite amazing how they get it so wrong. Browse through the github issues for performance and hanging issues. The surrounding tooling is poor as well - registry v2 doesn't even support deletes (because they need to maintain compatibility with the 100000 different storage backends for it). There is no LTS release, bug fixes only happen in latest version, so you're in a constant state of brokenness. And so on.


We used Docker sort of early on, and got out of it around v1.6. Many problems that you had to work around yourself, but the one you just reminded me of was the repos: There was Dockerhub, a third-party place without 100%(-ish) uptime, and two serve-your-own, one of which was a black box you could never delete from, and the other had stamped "NOT FOR PRODUCTION USE" in big letters on its github page.

I remember one of the engineers giving a talk, saying that the problem they had was growing too big too quickly - they didn't have time to properly work out the base architecture in the early days.


I think the real problem with Docker is Docker Inc. The pressures and constraints the company is under promote the creation of new features that can be marketed and new software that competes with products from the competition.

They have effectively no incentive to get rid of bugs in the core product or to exhaustively test the features they do add.

Sooner or later someone actually using containers will produce a docker replacement that will take over unless Docker focusses on what actually matters.


Docker Inc. desperately wants to be VMware. I don't think it's a coincidence that the official documentation is so full of virtualization-like choice of words you'd be forgiven if you thought they already were.

They do however seem to be cutting so many corners technically, that the risk is they get undercut on their core offering.


Deletion from registry v2 is definitely supported now. It's just a total PITA, and took them a while to implement. But I got my disc space back, so I'm not complaining. :P


Any pointers?


This answer on SO is more or less the approach I used:

http://stackoverflow.com/a/37716286/308278


Ok, throw Docker away.

Did anybody have better experience with e.g. Rkt?


I really like: daemontools + static binary + setuidgid and maybe chroot.

Or Mesos, where everything needs to be in a tarball, which is extracted and the sole program run.

If stuff is statically linked and related files (config, assets, ...) are part of bundle that can be chrooted, what is the value add of a container?


It's not hard to use control groups in daemontools family style, either.

* http://jdebp.eu./Softwares/nosh/guide/move-to-control-group....


Containers are easier for folks with less operational experience to understand. Or maybe easier to get started with is a better way to put it. It's easy to underestimate easy to use tooling when you aren't the target demographic.


The concept is great but it's also not original. It's called "processes". Docker is little more than a mass of complication laid atop fork+exec.

That's why nobody can get it right - because we already did.


How is "your own network, your own view of the file system, your own view of the process table, your own view of the user IDs, ..." the same as "processes"?


On modern Linux distros every process is running in a cgroup and namespace by default. So these days the main difference between a "container" and a regular process is that regular processes are all jumbled together in the same root namespace, and containers are in separate namespaces.


Which Linux distributions do this? I'd like to read how "most" instances of fork and exec end up with unique networks and file system namespaces.


I didn't say unique, I said "jumbled together".

Now as far as which distros put processes in a namespace and cgroup by default, I know at least CentOS 7 and Ubuntu 15 do this. And those two distros on their own would qualify for "most".

To check if your distro does it, one way of checking is just doing a `cat /proc/1/cgroup`. This will show you what cgroups process 1 is in. By default you will be in the "root" cgroup.

To check your namespaces, `ls -l /proc/1/ns/`. You'll see the process is in some randomly generated namespace ID per item.

I'm sure you could recompile your kernel to disable this behavior, but the default reality of modern Linux is that everything is already running in a "container".

Now the question is whether or not people want to take advantage of that reality, and separate out processes in isolation, or keep running everything on a system in such a way that any single process can impact the whole system.


Plan 9 is knocking on the door and would like to have a word …


The context of this discussion would like to have a word too...


I think that Plan 9's filesystem-based namespacing (and lack of a superuser) actually have a lot to offer for container-like solutions. Any Plan 9 user can set up namespacing of the network and of resources and spawn a process within that restricted namespace.

The whole process is much simpler, I think, than that of creating a Linux container (that's the whole reason Docker exists: to simplify & abstract something which isn't really inherently complex, but is accidentally complex).

Plan 9 certainly wasn't perfect, but it had some really high-quality ideas we still haven't assimilated in mainstream platforms.


I'm not disputing any of this. But it really has nothing to do with the context of this thread...


I totally agree. The real issue is dynamic libraries and how hard it is to compile C/C++ code statically with GCC.

If you could just pass `-static` to gcc and it actually worked like you expect this would never have happened.

Fortunately that seems to be changing somewhat. Go is totally static, and Rust can easily be made totally static using musl. You can even do totally static C/C++ apps fairly easily with musl.


Dynamic libraries (in the C/C++ sense) only scratch the surface. Containers give you your own file system namespace (among other namespaces), which means all of the files that make up your complicated application unit can be put together and work together in isolation, separate from the machine's main file system.


What advantages do file system namespaces have over separating by directories and users?


A big one is managing software which you didn't write: if you have two things which expect to be able to write to /etc/mydaemon.conf etc., you either need to burn a VM for each one, fork the startup scripts or take the Debian-style approach of maintaining patches which make everything configurable, or manage something like maintaining chroot directory hierarchies.

(repeat for network namespacing: it's really nice not to need to play games to have your CI server start 3 running jobs which all think they're listening on port 80)

None of that is impossible – in the case of chroot there's many years of precedent - but if you do it regularly, there's a strong appeal to automating a common pattern.

This is especially true when your goal is supporting development teams: with something like Docker, normal users don't need root just to start a daemon on a privileged port or write to a couple of files. If you work in a large or security-conscious environment, that's a fairly big draw.

(Not saying that Docker is perfect or necessarily the long-term winner in this space, only that there's a usability gap which a lot of people fall into).
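
To make the port-80 point concrete, a compose-style sketch of three CI jobs that each think they own port 80 (service names and host ports are made up):

    job1:
        image: myapp:latest      # hypothetical app under test
        ports:
        - "8081:80"              # each container still listens on 80 internally
    job2:
        image: myapp:latest
        ports:
        - "8082:80"
    job3:
        image: myapp:latest
        ports:
        - "8083:80"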


> This is especially true when your goal is supporting development teams: with something like Docker, normal users don't need root just to start a daemon on a privileged port or write to a couple of files. If you work in a large or security-conscious environment, that's a fairly big draw.

On the contrary, any user that can run arbitrary containers (such as rootplease[0], for example) has root-level privileges on the host system.

[0] https://hub.docker.com/r/chrisfosterelli/rootplease/


What I was thinking about wasn't protecting against outright malice but rather mistakes and errors: if you give developers sudo access and you don't have an extremely diligent team with strong system administration experience, you're going to run into problems where people make incompletely documented changes or cause problems while working which aren't caught early enough – ever see someone break out sudo or chmod 777 as their first debugging step, or even write that into the install process because it was too much work to do it right? Docker is an enormous win here, both because it sharply reduces the number of times someone needs privilege escalation and because it ensures that the end result of their work can be reliably audited and repeatedly deployed.

It's true that Docker doesn't protect against compromised or malicious users with privileged access. That's a very hard problem in general which can only partially be addressed at this level — especially since many of the most damaging attacks don't need it (“The bad news: they exfiltrated our customer database. Good news: they didn't get root on the EC2 instance”). I think most of the answers for this problem are going to continue to rely on existing practices like code review, auditing, getting finely-grained SELinux / seccomp rules into the development mainstream, etc.


> If you give developers sudo access

Why would you, though? A developer would at the very most require the application's privileges, not the super user's. And that's only really necessary when doing live troubleshooting.


You have to if your developers are installing packages, working on deployment, running anything which runs on a privileged port, use tools like systemtap, etc. In some cases that can be avoided with configuration but then you're asking for bugs due to discrepancies between the environments.

You can reduce the number of things which hit that friction in a number of ways: having easily repeatable builds so a developer can test on their own VM; using a cloud service so test VMs are disposable and never shared; setting up a platform / microservices approach so a wider range of things either don't need to be touched or can be deployed without privileged or direct access, etc.

Containers (really namespaces) are one way to hit that last goal: as a classic example, if you have a web app running on port 80, everyone who deploys will need to periodically restart Apache/nginx/Varnish/etc., and that may include elevated access to debug processes running as a different user. This is by far the simplest problem in this class, and there are various ways (proxies, site users, firewall NAT rules, moving config into .htaccess or its equivalent, etc.) to reduce the number of times you have to care about it, but it still requires work to maintain and adapt code (especially with third-party apps).

Some people quite reasonably prefer to solve this entire class of problems by tossing everything into a container so it can run as if it's the only thing on the box.


Basically if you're a shit sysadmin, you don't have to bother learning anything or working hard.

This is absolutely fine if you intend to remain a shit sysadmin. Go nuts with Docker. The 21st century will be waiting for you when you're done.

So will 19 fucking 60 because nothing's fucking changed.


Isn't that chroot jails?



Wouldn't it be more akin to jails?


You might have better luck with a container-specific reverse proxy like Traefik[0] - it has built-in Let's Encrypt support with auto-renewal.

> I had a huge amount of headaches to get it actually working.

Moving from running one container to running multiple containers is probably one of the most confusing parts of getting started with Docker

There are a large array of orchestration options and tools - each with their own pros and cons: Swarm, Kubernetes, Mesos, Marathon, Mesosphere, Centurion, Rancher, etc.

Docker 1.12 now having built-in orchestration with Swarm should make this easier[1].

[0] https://github.com/containous/traefik

[1] https://blog.docker.com/2016/06/docker-1-12-built-in-orchest...


Traefik sounds great! I'm using https://github.com/SteveLTN/https-portal which is a Docker container, and that kind of setup is just complicated. Putting the reverse proxy outside sounds much cleaner.


Containers != Docker. I think if you had used LXD containers, your experience might have been different, as they work like virtual machines. I am using LXD in production and it has been a pleasure, with live migration, snapshots and good old configuration management using Ansible.


And, as probably everyone knows, Google runs everything in containers and has been using containers for a decade:

http://www.nextplatform.com/2016/03/22/decade-container-cont...

Docker may be flawed, but containers aren't. If you need some enterprise leader to tell you this instead, here are some Gartner posts showing this is the way:

VMs may be well established and "magic quadrant", but they are also on decline, and containers make better use of hardware/resources:

http://www.informationweek.com/cloud/infrastructure-as-a-ser...

It's important what you use to roll out containers and requires forethought:

https://www.gartner.com/doc/3267118/containers-change-data-c...


What really surprises me about Google is why they don't open source some of these great core technologies (MapReduce, containers etc.) instead of publishing the theory as academic papers. On the one hand, it may be a great way of promoting the creation of these tools from the ground up, inspired by the theory alone. On the other hand, Google's invaluable experience with using these technologies probably means their versions are more stable, faster, more resilient etc.




Kubernetes just got a pretty sweet update as well. The improvements to the dashboard alone were worth revisiting.


I think they've said in talks/presentations that the challenge they have with some stuff (like Borg) is that they can't really extract individual components. It's all too tightly coupled. It wouldn't be fair to ask them to open source their whole stack. The fact that they took the lessons learned and created Kubernetes or published papers on their technologies is more than enough.


Why enable their competitors to such an extent? What benefit is in it for them, when they seem to be at the absolutely forefront of this work? They have to let their researchers publish academic papers or they won't get the best researchers, but when they also have the most experience with ops and such a lead in implementation, the theory only gets competitors so far.


It's entirely possible that they consider those core technologies part of their "magic sauce". If their development and production environments are so far ahead of everyone, their development cycle will be much faster.


Still, I'm not sure why I should use containers vs. Vagrant + Ansible (or just Ansible).


There's also rkt!


I'll also add a plug for tredly.com here. "Containers done right" and pretty much all of Joyents container work.


Docker is just overhyped, deal with it. There are some nice ideas, but nothing that we couldn't do or haven't seen before. FreeBSD Jails and Solaris Zones have existed for more than ten years, and they addressed many things that Docker didn't.


Can you download a BSD Jail image from an application's website, and have it 'just work'?


Yeah, surface simplicity seems to be the heart of what drives tech fads, even if the actual cost in labor and effort is exponentially larger on the back end.

MongoDB? Yes, don't worry about data design or normalization! Just throw that stuff in there, call it anything you want! Finally programmers are free from the tyranny of decent databases. I can't think of any downsides to that arrangement.

Docker? Yeah man, you can just say "docker pull redis" and then you don't even have to bother with Ubuntu getting in your grill with all of its apt-get shenanigans. It's awesome! Now how do we make that port accessible...


I work for a Fortune 50. I can't just download anything. When some third party curates a store of containers and guarantees their safety then maybe. Until then docker, rkt and the rest are a distant dream.


If anyone would provide those, then sure, why not? I could just give you my jails with database, web server, IRC client and everything else as compressed archives; you could just unpack them, execute jail commands on them and have them running. This is pretty much all there is to it. No different than Docker, and yet, many more years used in production environments.


Don't know about BSD Jails, but you certainly could download OpenVZ images and run them.


it's called curlbashism.


No, but I do not see how it is related to my comment. Read more carefully. I didn't say that Jails and Zones did everything that Docker did, yeah DockerHub is cool and I gave them credits for that.


"nothing that we couldn't or haven't seen before"


Federation credits?


I ran into weird problems with FreeBSD Jails. Like cronjobs running twice, and `ezjail-admin console` not allocating a tty.


Docker gives you the building blocks, but that means you have more pieces to arrange and manage. Take a look at Docker Compose if you haven't already, since the Docker CLI only gets you so far when you're creating apps that consist of multiple containers.

I think the best approach for your cert issue is to abstract that into a separate service (nginx is an option, but I'd recommend the Rancher approach below). Yes, that means you have to add another container, but that's just another block in your docker-compose.yml file. Embrace the approach of separating your components into containers and organizing them as a stack. You can easily link containers together, share data volumes, and start/stop individual containers or the stack as a whole.

The problems that you're having are pretty easy to fix with some tooling. Rancher (http://rancher.com/) greatly simplifies the cert issue by allowing you to import certs and provide them to the Rancher loadbalancer service (which you can add to any stack). There's also a LetsEncrypt community catalog template that automatically retrieves and imports certificates to Rancher. There are other open source orchestrators like DCOS, but Rancher is probably the simplest to use, and it's the only one I'm very familiar with. There are SaaS options that you can look into, but I don't have experience with them.

As for the container crashes, it's trivial to automatically restart them. Just pass the --restart=always flag to the Docker run command. You can also add the flag to a docker-compose.yml file.
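
In compose terms, that's just a one-line policy on the service (placeholder image name):

    weechat:
        image: myweechat:latest    # placeholder for the IRC client image
        restart: always            # restart the container whenever it exits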


> but that means you have more pieces to arrange and manage

wait. aren't these things supposed to give us less pieces to arrange and manage?

> The problems that you're having are pretty easy to fix with some tooling

yes, of course the solution is more tools. what exactly was the problem again?


> wait. aren't these things supposed to give us less pieces to arrange and manage?

No, not fewer pieces. You'll have more pieces, but you can combine the pieces and control them individually or as a group. You can think of Legos, since you have many pieces but they all fit together in the same way.

Docker compose lets you group these containers together, and that is what ultimately makes it feel like you have fewer pieces to manage. With that, you can start/stop a stack of containers (e.g. django, nginx, postgres, redis) with a single command, but still inspect and manage each component separately. This is something you might normally do with bash scripts, but with Docker you can take that same app to an orchestration platform and run it on any host. Run it on your laptop, run it on a linux server, run it on a SaaS provider like Docker Cloud, run it on a private cloud with an orchestration platform.

> yes, of course the solution is more tools. what exactly was the problem again?

Docker is just the foundation. Nothing more. I'm fine with learning more tools because I feel that the foundation is solid. The problem is being able to ship and manage your apps. That is much, much easier for me now and I'm very glad I invested the time.


It's somewhat akin to microservices, where you split each functionality into its own service rather than having one monolith do everything.


Seriously? If something crashes continuously in production, then the solution is "just pass the --restart=always flag"? I really wonder if you guys are really using docker in prod. I would never use something like that to manage important transactions.


No, the correct solution is to debug the issue, not just reboot/restart/reinit the damn thing.

This mentality is WHY the Linux/docker/containers world's NIH syndrome is such a tire fire.


It's not one or the other. Restarting might save you some downtime if you're running a single IRC container. That doesn't mean you shouldn't find and resolve the root issue. Normally I just rollback to the previous version container if I have a recurring issue.

Windows containers are now available, if you can't stand linux: https://msdn.microsoft.com/en-us/virtualization/windowsconta...

Anyway, don't use containers if you don't want to. I'm glad I invested the time since I understand them very well and use them to my advantage. But I did have to learn a lot and experiment with a bunch of tools, and maybe that's not worth it to you.


As someone said, this is the normal behavior in the Erlang/Elixir world, and it seems to have worked extremely well for the telecom industry.

That said, my reply was to someone running an IRC server, presumably on a single server, so don't stretch my advice to a production app handling millions of transactions. Obviously the core issue is that the app crashes, and it's still up to him to fix that. This is almost certainly a problem with his app/config, not Docker itself (though it ain't perfect). If it's something that happens every 6 months, then auto restarting will probably save him a lot of problems. If your transactions are so precious, don't pass the flag- it's up to you.


Isn't this the much-lauded and respected Erlang approach to failures?


I never used Erlang, so I have no idea if it follows the same approach (although I find it quite strange). But for sure I can't afford to deploy anything like that in prod. You lose one transaction in the middle and several millions go lost. I'd rather lose a hand than try to explain to my clients that it is fine, docker just restarted by itself as expected...


Thanks, I'll have to look into Rancher. I'm already using Docker-compose.

Also good to know about automatic restarts. I don't know how I missed that, I had to read a lot of docs to get where I am.

Still, the amount of tooling that exists and the knowledge needed to pick the right ones for a given situation goes to show that this isn't as simple as packing a shipping container and letting someone ship it...


You're right, it's not quite that easy yet, but it probably will be eventually. Docker just provides the building blocks. SaaS providers like Docker Cloud will get better and continue to abstract complexity away until it really is that easy.

You don't need to use Rancher if you're just running one app. If that's all you need to do, then it could be as simple as running docker-compose on a linux server and mounting the certs into the nginx container as a host volume (https://github.com/jwilder/nginx-proxy). This is a fine approach until you want to split your containers across several hosts (redundancy or scaling) and you have several apps to worry about.
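
A rough compose sketch of that single-host setup, following jwilder/nginx-proxy's documented conventions (the proxy watches the Docker socket and routes by VIRTUAL_HOST; the hostname and app image here are made up):

    proxy:
        image: jwilder/nginx-proxy
        ports:
        - "80:80"
        - "443:443"
        volumes:
        - /var/run/docker.sock:/tmp/docker.sock:ro   # lets the proxy discover containers
        - ./certs:/etc/nginx/certs:ro                # host-mounted certificates
    webapp:
        image: myapp:latest                          # placeholder application image
        environment:
        - VIRTUAL_HOST=myapp.example.com             # hostname the proxy routes to this container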


Hey chewchew you seem to be knowledgeable on docker-compose.

What's the easiest way to get a .yml on a cloud server somewhere and let it assemble the containers for you?

Also, do you know where containers set environment variables? The official Postgres makes available something like PG_PORT_3542 and you can just refer to it from another container. Whereas I can't seem to do the same with the Redis one...


The absolute easiest is probably Docker Cloud. I've checked it out but haven't really used it much. It's still relatively immature but if you just want to deploy and forget something simple, this is probably the way to go.

If you want to use your own Linux host, then the simplest way would probably just be to SSH into the box, git pull, and run "docker-compose build && docker-compose up".

Setting environment variables is pretty easy, but I don't think you need them in this case. If you're trying to make a redis or postgres container available to another container (your app), then you can do so easily with links in docker-compose. Something like:

    myapp:
        image: myimage:0.0.1
        command: cmd to run
        links:
        - redis:redis                   # makes the hostname "redis" resolvable from myapp
        ports:
        - "80:8000"                     # host port 80 -> container port 8000
        environment:
        - hardcoded_var=my_env_value
        - var_from_host=${host_var}
    redis:
        image: redis:latest
You can then access redis from the myapp service using the hostname "redis" and the default port "6379". So, "telnet redis 6379" would work from the myapp container (assuming telnet is actually installed). The redis port isn't even publicly exposed- it's only available to myapp.

If you need to define environment variables, you can do so with an environment dict as shown above. There are a few other ways to define env vars:

https://docs.docker.com/compose/compose-file/#variable-subst...

https://docs.docker.com/compose/environment-variables/#/the-...


Thanks chewchew so much win in this reply. I use Heroku a lot and was wondering if there is a "believable" alternative using Docker.


Kubernetes allows you to upload a YAML snippet and launch it, which sounds like what you described.


I wanted to write about my experience with containers. It mimics yours. The idea of a container is beautiful and elegant. The execution, not so much. I'm more or less convinced that to reduce this complexity, we need a limited-scope ecosystem. Microsoft's .NET would probably be a good place to start, but MS, being enterprise, has a habit of turning things into a mess of configuration.


I think the container example is actually more apt than you think.

Running out of memory is a bit like running out of space in the container.

If I dropship a container on your yard, you're going to have a container but your options in what to do with it are limited.

Now if your container has a lifecycle and a scheduler because it needs to be in China on Wednesday, you suddenly have a lot more complexity.

Docker itself being a buggy piece of crap is neither here nor there in the grand scheme of things.


Someone told me that I should switch to using shipping containers instead of my current method.

Unfortunately, when my cattle arrived in the US, they were all dead. I was told containers "would just work".

After some experimentation, we managed to get our cattle in a container by building a system that correctly managed food and waste. Then we found a partner who had successfully packaged sheep and so we just used that. Unfortunately, they appeared to have a leak in their waste management system, and the container overflowed and all the sheep died. We did not have this problem when the sheep were able to fill the entire hold with their output.

Clearly the idea that shipping containers just work is a gross oversimplification. In fact containers seem to be complex thingamabobs with multiple points of failure.


Same here with docker. The ideas are nice as an application, but it's just a pain in the ass. LXC on the other hand is a breeze.


LXC is nice, as a building block, but it is far from "a breeze".

It is refreshing, however, being able to write, in a couple of hours, a setup script from scratch that debootstraps an install, chroots in there to configure it and launches the container via systemd-nspawn. Very few moving parts, excellent for development environments. Not yet something I trust in production, though.


Does LXC provide a way to automatically create new working instances (like a Dockerfile / docker-compose.yml)? I'm not quite grokking LXC in my first 10 minutes of reading...


I think you could create your own lxc template and then create new instances. The other way would be to just clone existing working instances with overlayfs.

Base lxc may be a bit of a pain. Look at lxd - I believe that makes things much much simpler.

Another thing about lxc is the unprivileged container - which I think is great for security (not sure if docker provides this).


man lxc.conf ... this is your dockerfile equivalent.


As others say Docker is probably over-hyped technology.

However, I do see it as positive, because its hype, regardless of whether it's good or not, has created traction for Go and OCaml in the data center, thus eventually leading to less C code for such use cases.

So hype or not, maybe we do get some security improvements on the overall stack.


I'm missing the initial assumption. What is the connection between Docker and traction for Go and OCaml? People are using the latter in order to simply avoid containers?


Parts of Docker are implemented in them.

So anyone that wants to improve Docker or adapt it to their distribution of choice needs to eventually use them.

For example, Microsoft did several contributions in Go for making Docker run on Windows.

The TCP/IP stack used by Docker on OS X is taken from MirageOS, written in OCaml.


Got it. Thanks.


Out of curiosity, why is less C code a good thing?


Security exploits caused by memory corruption, undefined behavior, ability to inject code, numeric overflows plus whatever is common to all memory safe languages.

https://www.cvedetails.com/vulnerabilities-by-types.php


Maintenance. A lot of the younger programmers have very little experience with running/working with C code and stack; they are a lot more comfortable with Java/Python/Go (for backend). So the less C code you have to deal with in your stack, the easier it is for deployment/debugging. Not to mention, the more modern languages also provide many features that allow pinpointing errors faster etc.


There are legitimate reasons to not want to use C, but I find around here it's mainly reflexive hate and language zealotry.

It's popular to hate on C (and C++) because the languages are so ubiquitous and long-used that a large body of terrible, insecure, and poorly written code exists using them. Other languages haven't had the same success as these two yet, so haven't had their warts exposed enough to be dumped in the "automatically hated" category. Java comes close, but it also is typically lumped in the "automatically hate it" bucket, and for similar reasons.


It was already clear in the late 70's and early 90's that C wasn't a reliable option to write safe systems.

Dennis M. Ritchie himself on the history of the language[0]

"To encourage people to pay more attention to the official language rules, to detect legal but suspicious constructions, and to help find interface mismatches undetectable with simple mechanisms for separate compilation, Steve Johnson adapted his pcc compiler to produce lint [Johnson 79b], which scanned a set of files and remarked on dubious constructions."

Lint which is still mostly ignored by the masses to this day. At CppCon 2015, about 1% of the audience acknowledge using static analyzers.

Per Brinch Hansen letter to C.A.R. Hoare in 1993a [1]

"The 1980s will probably be remembered as the decade in which programmers took a gigantic step backwards by switching from secure Pascal-like languages to insecure C-like languages. I have no rational explanation for this trend. But it seems to me that if computer programmers cannot even agree that security is an essential requirement of any programming language, then we have not yet established a discipline of computing based on commonly accepted principles."

There are many other sources of similar statements since C exists, so the hate isn't something new.

Regarding C++, yes unfortunately it inherits C flaws, but at least the community tends to embrace language features to improve the language safety and push for type based programming.

[0] https://www.bell-labs.com/usr/dmr/www/chist.html

[1] brinch-hansen.net/papers/1999b.pdf


Where is this mythical C++ community that promotes safe and auditable programs?

Whenever I'm forced to use a C++ program it's buggier than the C equivalent.


They are here:

https://isocpp.org/

http://cppcon.org/

https://github.com/isocpp/CppCoreGuidelines/blob/master/CppC...

http://erdani.com/index.php/books/modern-c-design/

https://msdn.microsoft.com/en-us/library/hh279654.aspx

http://stroustrup.com/Tour.html

http://elementsofprogramming.com/book.html

Most of the C++ bugs I found happened to be written by former C developers that disregard using C++ stronger type safety, RAII, the standard library containers and usually use naked pointers alongside malloc() and free().

And that is the biggest problem with C++, its copy-paste compatibility with C, which allowed its adoption by C compiler vendors, but made it as safe as C when developers disregard best practices.


Except for relatively uncomplicated, or relatively low level programs I have had the opposite experience.


Lint is built in to modern C compilers.


I'm doing the same with a couple of containers and my experience is pretty much the opposite. What containers are you using, if I may ask?

I'm using jwilder's nginx rproxy container, the let's encrypt helper container for that plus half a dozen Web apps on a VPS. Among those Weechat, two instances of vanilla nginx, rutorrent, dovecot, mattermost, nextcloud and a test environment for my Python tinkering. Works like a charm. Upon bringing up a new container, I supply the desired subdomain and my letsencrypt user data, the container comes up and uses SSL plus automatically renewed certificates.

My Docker experience so far - in a private, limited, very much not production environment - has been "virtualisation light" all the way.

I think of Docker as a somewhat extended chroot, to wrap my mind around it.


They're "complex thingamabobs" because our applications are. Which is exactly why we want to wrap them up and isolate them and expose as narrow interfaces as possible.

Containers are a symptom of our still immature ways of building applications.

The problems you bring up would still be there without containers, but you might not notice them until you want to bring your setup over to another machine, for example.

Or want to duplicate it somewhere else. Or want to upgrade something else on your machine that just happens to interact badly with your setup.


I never quite understood containers and this article makes them seem kind of similar to what OSs already do.

How is a container different from just installing all the dependencies along with an application? Coming from a Windows background, this is pretty common to avoid DLL hell. Nobody distributes a Windows application that requires the user to go and install some 3rd party library before it'll work.

Isolation from each other seemed like one advantage, but that's not even security strength isolation so you can't count on it to protect the host OS from malware in a container.

A claim I often see, and that's repeated here, is that containers can run anywhere. But can they really run in any more places than an ordinary application with dependencies included, or even statically compiled into it? You still need Linux and the same CPU architecture, right?


> How is a container different from just installing all the dependencies along with an application?

Once the image is built, you can get another installation that is guaranteed to be identical. You can do that with VM images too, but you cannot reasonably do that if you try to install multiple applications side by side in a single VM without further isolation - there are too many ways they can interact.

> Isolation from each other seemed like one advantage, but that's not even security strength isolation so you can't count on it to protect the host OS from malware in a container.

That's only true to the extent that they don't have a long enough track record. Many container technologies do have a decent track record when it comes to security.

But even so, there are plenty of reasons for isolation where security is not the primary motivation. E.g. making it impossible for an app to accidentally read or write files it shouldn't is in itself helpful.

> A claim I often see, and that's repeated here, is that containers can run anywhere. But can they really run in any more places than an ordinary application with dependencies included, or even statically compiled into it? You still need Linux and the same CPU architecture, right?

Try to get an application - statically compiled or not - to run across different Linux distributions, and you will see why this matters.

Needing Linux and the same CPU architecture isn't much of a limiting factor on servers. Not having to account for distribution peculiarities or version differences is a big deal.


> Try to get an application - statically compiled or not - to run across different Linux distributions, and you will see why this matters.

Done. A statically compiled application has no other dependencies.


This is the eventual future.

What most people don't realize is that a large part of Docker's value is in being a generic static compiler for languages that don't have that feature.

Pretty soon you can expect raw process support in many "container" management systems, where you just provide a Linux binary which is then run in isolated cgroups and namespaces.
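You can already get close to that today; here's a rough sketch (names are placeholders) of a statically linked Go binary dropped into an otherwise empty image:

  # Build a fully static binary (no libc dependency):
  CGO_ENABLED=0 go build -o myapp .

  # Dockerfile: nothing in the image but the binary
  FROM scratch
  COPY myapp /myapp
  ENTRYPOINT ["/myapp"]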


Does it never do DNS lookups? Open files? Do you not want to tie into process management? Logging? Do you really have no dependencies on a functioning locale? Network settings? Are you sure it won't try to exec anything?

An application with no other dependencies is exceedingly rare. Small tools, sure. Sometimes. But even then I see people making silly assumptions all the time, which makes using a container as a suitable straitjacket very useful.

E.g. I run all kinds of tools "with no other dependencies" all the time that turn out to have all kinds of dependencies when you actually try to put them in the smallest container possible.


> Does it never do DNS lookups?

Yes, it does.

> Open files?

Yes, it does.

> Do you not want to tie into process management?

I don't know what this means.

> Logging?

Yes.

> Do you really have no dependencies on a functioning locale?

What makes a locale "function"?

---

Remember that my standard C lib or whatever can be statically linked as well. At that point, I'm left with syscalls.

Docker containers depend on syscalls too; it's not like they ship with their own kernel. (If they did, they'd be VMs.)


I think this is too strong a statement. To have "no other dependencies" you would have to statically link in (1) the operating system (including device drivers) and (2) the hardware model, to be absolutely sure. Only virtual machines (possibly including the Java VM) can give you such a guarantee.


And do you believe this set of issues doesn't apply to Docker?


> guaranteed to be identical

[Citation Needed]

As far as I know, this isn't the case. That's why using Nix [0] for deployment is a much saner approach than Docker. But after installation and configuration has been done, containers are a viable technology for the rest.

[0] https://blog.wearewizards.io/why-docker-is-not-the-answer-to...


The comment you are replying to says "once the docker image is built", referring to the built layers. Those are guaranteed to be identical.

Building from scratch is not always guaranteed to be identical.
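For example (the digest here is a placeholder), pulling an image by digest always gives you exactly the same layers, whereas rebuilding from the Dockerfile may pick up newer packages:

  # Reproducible: the digest pins the exact layers
  docker pull nginx@sha256:<digest>
  # Not necessarily reproducible: apt-get/apk in the Dockerfile may fetch newer versions
  docker build -t mynginx .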


Ah okay, got this wrong :)

Yes, the pre-built images are always identical.

Nix starts solving the problem one step earlier, so building from scratch is also guaranteed to be identical.


You are correct. The issue they really fix is that it is extremely difficult to actually distribute all dependencies along with an app on Linux. If you link against glibc then you are screwed.

There are starting to be ways around glibc, like musl. And Go doesn't even use a C standard library at all - there's no way you can say that Docker is easier than just copying a single statically linked binary around. But I guess Docker came along before those solutions became really popular, and it is easier to apply to existing apps.

I think the security isolation is a side-benefit that is used to obscure the real reason for docker (distributing apps on Linux sucks).


Another solution to dependencies distribution is NixOS.


Containers also allow fine grained control over OS resources and sandboxing.

Since you appear to be a Windows dev, I advise you to have a look at how the Windows containers introduced in Windows Server 2016 work.

https://msdn.microsoft.com/en-us/virtualization/windowsconta...

https://channel9.msdn.com/Events/Build/2016/B875


Containers (and especially multi container orchestration software like docker compose or Kubernetes Pods) let you describe an application that might contain multiple processes (so a really simple example could be a web server with a database backend) in a single file and have that deployed to any system that runs the containerization software.

So to that extent it's more flexible than a single app which bundles its dependencies.

The other advantage is that you're not reliant on the software vendor or OSS project to create the package, so you as a user of the application, can create your own packages with your own customization.

As to where you can run them, yep, at the moment the image is tied to an OS and architecture, so either Linux or Windows depending on the Docker engine version, although there are moves afoot for multi-arch support on the registries (https://github.com/docker/docker/issues/15866)


See, rather than thinking of them as applications, I've always found it a lot easier to think of them as super lightweight virtual machine images.

Of course, technically they're definitely not VMs, but IMHO, given the isolation and reproducibility aspects, the VM analogy fits best.


> Nobody distributes a Windows application that requires the user to go and install some 3rd party library before it'll work.

A-hem, DirectX, VC++...?


Those things are usually included in the installer, or at least downloaded on the fly so you don't notice. So it's still self-contained from the user's point of view. Yes, sometimes they're installed globally like DirectX, so that can cause yucky interactions between applications.


Fewer than a statically linked binary, they rely on a bunch of brand new kernel APIs.


> Nobody distributes a Windows application that requires the user to go and install some 3rd party library before it'll work.

On Saturday I installed a Steam game on Windows 10, and it forced me to download and install .Net 3.5 before it would launch.

> How is a container different from just installing all the dependencies along with an application?

It's more flexible - for example, you can set up your container so you can ssh into it and do some commandline troubleshooting.


Weak virtualization, often lacking security and resource metering/prioritizing/quotas. Hurray, zombie Docker instances need a whole VM reboot yet again, still not fixed after several years. But look how fast I can deploy millions of containers without SELinux, monitoring, HIDS, SDN, billing, live migration, backup/restore/DR/data lifecycling and all the other things we just pretend not to need when throwing away sensible production VMs on Type 1 hypervisors in the name of devopsec.


The smart use of containers is using them to specify the deployment. You should use proper virtualization as the environment to deploy into. Using containers doesn't mean you throw all of that away.


Containers are often touted as this novel concept that's bound to revolutionise software development, and software delivery in particular.

The general idea isn't all that new, however. Java applications have been delivered as containers since 1995 (although the concept isn't explicitly named that way for Java applications).

Each JAR / WAR is a self-contained application that can run anywhere where there's a JVM (which is pretty much everywhere).

From a feature perspective the only real innovation of Docker-style containers probably is that those aren't limited to the JVM but are (largely) language- and runtime-agnostic.


Actually, JARs are even more compatible than that: they can run against a huge number of OSs and architectures, even in embedded systems.


> Each JAR / WAR is a self-contained application that can run anywhere where there's a JVM (which is pretty much everywhere).

A JAR can be self-contained, or it can depend on hundreds of other JARs, so it is not as straightforward as one might assume.

A WAR requires a Java application container to be deployed into, which sits outside the WAR, not inside it. Also, J2EE deployments can get pretty complicated, with their infamous use of XML for configuring everything.

The JVM is not everywhere by default, nor does it work without configuration for non-trivial applications. It is complex software that can lead to myriad classpath/version issues with JAR files if not installed very carefully.


Containers are like that as well. You can have a container for your application that doesn't really do anything unless it can connect to your database, which runs in another container.


As complicated as J2EE deployments could get, I don't see Containers being less complicated. Just a new language for managing the complexity.


The general idea behind JAR/WAR isn't that new either; we had CPAN before!


CPAN is more akin to Maven (or rather public Maven repositories) / RubyGems / NPM in that context. Perl modules are hardly self-contained. Compiling Perl module dependencies could be a real pain at times.


These hand-wavy explanations that constantly avoid explaining how things work at a low level are not adequate.

Here's a short explanation for developers with even a moderate understanding of how OSes work:

http://stackoverflow.com/questions/16047306/how-is-docker-di...
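For a hands-on feel of the namespace part (a rough sketch; needs root and util-linux's unshare):

  # Start a shell in new PID and mount namespaces:
  sudo unshare --fork --pid --mount-proc /bin/bash
  # Inside, ps only shows this shell and its children:
  ps aux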


This is a pretty good analogy for containers, but there's an unfortunate conflation of terms. With computing containers there's actually a distinction between the thing you ship code around as and the thing you run code in. The latter is the real container, while the former is called a "container image" or often simply an "image." This gets confusing quickly if you apply this analogy, since you'd assume that the thing you ship the code as would naturally be called the container.


It's amazing how slow the development community en masse has been in "discovering" containers. I remember working for a web hosting company that offered containers as a web hosting solution back in 1999. I then ran my own little host offering FreeBSD jails and later Linux containers (based on the excellent Linux-VServer project) in 2003, and I remember how, when I tried to explain to (pretty technical) people how this was way more efficient than stuff like Xen, they'd go "but it's a hypervisor..." (as if that meant "magic"). I eventually gave up on it and sold my little hosting operation because it was too much work and not enough money; it looks like it was about 10 years ahead of its time.


I think the same. I've been deploying containers, or things that look like today's containers, for the past 12 years. I even wrote a tool that manages containers with LXC and looks a lot like Docker, but three years earlier.


There were many solutions very similar to Docker, but you have to agree that Docker got some essential things right - union filesystems (e.g. OverlayFS), a standardized way of building containers (Dockerfiles) and the right timing.


I'm only a beginner, but the analogy that makes sense to me is that containers do for app deployment what npm does for Javascript development. That is, the magical part isn't that Docker simulates an operating system and so on - the magic is that it allows a chunk of logic to precisely declare its dependencies - including on other pieces of logic which declare their own dependencies - and then Docker knows how to (in theory anyway) run the logic in such a way that its dependencies are all satisfied.

And of course the meta-magic is then that there's a public registry of (in theory) solved problems, which one can build on top of by declaring dependencies against them.

I have a pet theory that this "declared dependencies + dependency wrangler + public registry" is a general formula which will keep cropping up as we find new places to apply it.


Not really - Linux package management systems do exactly that.

The magic - if there is any - is in combining it all together: separation, discovery, relatively easy packaging and dependencies.


Sure, at a different level. Package managers at the app level, docker at the deployment level, npm at the development level, or something vaguely along those lines.

I wasn't suggesting this was unique to Docker - precisely the opposite, that it's a generally useful pattern, being applied here to deployment.


To me it's more like lightweight virtualization: processes inside containers have no clue that there are other processes running in other containers alongside their own, all without eating too much memory (at least well under the amount that virtual machines would use).

Added benefits:

- the 'host' OS can be very light and tailored to run Docker and nothing more (cf. CoreOS),

- we can design orchestration software that handles container operations across a herd of lightweight hosts (cf. Kubernetes).

The dependencies/recipes thing is more like a tool that enables those higher goals, but again, that's just my point of view.


> processes inside containers have no clues that there are others processes running in others containers alongside its own

Except, of course, if you're dealing with privileged containers. Docker [1] gives you detailed control over what to share between container and host, and what to isolate (with the default being more rather than less isolation).

For example, I'm currently working on a container that mounts disks in the host's mount namespace. In that case (and many others), the selling point of Docker [1] is not the isolation, but the deployment story.

[1] Or any other container runtime. I'm saying "Docker" because that's the one I'm familiar with.


For the love of god - forget Docker, use LXC containers - it's simple, secure, comes with its own init and cron, and you don't need to do somersaults to achieve simple tasks. It builds on what's included in the Linux kernel. Your own isolated Linux system. We've used LXC in production for over three years, and we have over 3000 containers. No issues whatsoever.


You're right that LXC containers have a similar API compared to Docker, but I think developers often underestimate the benefit of the community around a certain technology.

Docker has significantly better documentation, extensions, package management tools, and third-party integrations. Overall, Docker has a far more robust community than LXC or closer competitors like Kubernetes, and those features are just as important to developers as the API.


The point of LXC is that you get a full-blown standalone Linux rather than a single process - this simplifies everything a lot, meaning you don't need that much documentation about it in the first place.


Can you clarify what "full-blown standalone Linux" means? It sounds like running a separate kernel, but since we're talking containers rather than VMs, this can't be it.


It is a shared kernel with separate userspaces.

It uses namespaces (network, PID, user, ...) and cgroups to separate those userspaces from each other.

I have a community server running Debian with 10+ LXC containers, in which people are given normal root access, one container per user.
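If I recall the CLI correctly (a sketch; distro, release and container names are just examples), creating one of those per-user containers looks roughly like:

  lxc-create -t download -n user1 -- -d debian -r jessie -a amd64
  lxc-start -n user1 -d
  lxc-attach -n user1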


So it's the same as with Docker.


Encouraging to hear. Who do you work for, who has these 3,000 LXC containers in production use? And I'm curious, what orchestration system do you use to manage them? Can you outline your toolset?


We use ansible and bash scripts for orchestration.


If you'd care to go into more detail, or could point me to your technical docs, I'd be very interested.

How, for example, do you handle roll-out, destruction, monitoring, backups, network configs, secrets, etc.? Is there any degree of automation? And why not take advantage of existing orchestration solutions - was nothing mature enough for your needs? How big is the team managing these 3,000 containers, and what sort of traffic are you handling?

Asking out of genuine curiosity. I'm keen on understanding how plain LXC can be used robustly in a production environment.


Do you have any examples of what LXC does better than docker? I'm very new to the whole containerization thing but I've already come across a couple of the issues you've mentioned.


Shameless copy-paste from a well-written piece by Flockport:

Docker restricts the container to a single process only. The default docker baseimage OS template is not designed to support multiple applications, processes or services like init, cron, syslog, ssh etc.

As we saw earlier this introduces a certain amount of complexity for day to day usage scenarios. Since current architectures, applications and services are designed to operate in normal multi process OS environments you would need to find a Docker way to do things or use tools that support Docker.

Take a simple application like WordPress. You would need to build 3 containers that consume services from each other. A PHP container, an Nginx container and a MySQL container plus 2 separate containers for persistent data for the Mysql DB and WordPress files. Then configure the WordPress files to be available to both the PHP-FPM and Nginx containers with the right permissions, and to make things more exciting figure out a way to make these talk to each other over the local network, without proper control of networking with randomly assigned IPs by the Docker daemon! And we have not yet figured cron and email that WordPress needs for account management. Phew!

This is a can of worms and a recipe for brittleness. This is a lot of work that you would just not have to even think about with OS containers. This adds an unbelievable amount of complexity and fragility to basic deployment and now with hacks, workarounds and entire layers being developed to manage this complexity. This cannot be the most efficient way to use containers.

Can you build all 3 in one container? You can, but then why not just simply use LXC which is designed for multi processes and is simpler to use. To run multiple processes in Docker you need a shell script or a separate process manager like runit or supervisor. But this is considered an 'anti-pattern' by the Docker ecosystem and the whole architecture of Docker is built around single process containers.

Docker separates container storage from the application, you mount persistent data with bind mounts to the host (data volumes) or bind mounts to containers (data volume containers)

This is one of the most baffling decisions, by bind mounting data to the host you are eliminating one of the biggest features of containers for end users; easy mobility of containers across hosts. Probably as a concession Docker gives you data volumes, which is a bind mount to a normal container and is portable but this is yet another additional layer of complexity, and reflects just how much Docker is driven by the PAAS provider use case of app instances.


> Docker restricts the container to a single process only.

This is definitely not true. I'm running syslogd inside a container (next to the actual process) without any trouble.

> ssh

I'll take `kubectl exec` over SSH any time because it's a much more plausible way to handle credentials. Also, it does not require an always-running daemon inside the container, which reduces the TCB and the memory footprint.

> Take a simple application like WordPress. You would need to build 3 containers that consume services from each other.

It's not required, but it's a good practice to take advantage of the capabilities of your container orchestration software of choice.

> a MySQL container plus [...] separate containers for persistent data for the Mysql DB

Why would you need a separate container for data? The thing you're looking for is a "volume" (in the simplest case just a bind-mount from the host into the container, as you even explain further down).
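In the simplest case (paths and image tag here are just examples) that's a single flag:

  # Persist MySQL data on the host via a bind mount:
  docker run -d --name db \
    -e MYSQL_ROOT_PASSWORD=secret \
    -v /srv/mysql-data:/var/lib/mysql \
    mysql:5.7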


Distributed storage is still a big issue for sure. There are some options, but none are ideal. One option is to map to host and use NFS to share across hosts. Another option is to use something like Convoy or Flocker, which come with their own complexities and limitations. Hopefully more progress is made on this front.

As for the wordpress app and other issues mentioned, it's actually very simple:

    nginx:
        build: ./nginx/
        ports:
            - "80:80"
        volumes_from: 
            - php-fpm
        links:
            - php-fpm
    php-fpm:
        build: ./php-fpm/
        volumes: 
            - ${WORDPRESS_DIR}:/var/www/wordpress
        links:
            - db
    db:
        image: mysql
        environment:
            MYSQL_DATABASE: wordpress
            MYSQL_ALLOW_EMPTY_PASSWORD: "yes"
        volumes:
            - /data/mydb:/var/lib/mysql
This isn't a "production" config, but that wouldn't look that much different. The real beauty is that I found this compose file with a simple search and very easily made minor tweaks (e.g. not publicly exposing the mysql ports).

You might run into permissions issues if you use host-mounted volumes, but I have not. Normally I prefer to use named volumes (docker-compose v2) and regularly back the volumes up to S3 using Convoy or a simple sidecar container with a mysqldump script.
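For reference, a named volume in a v2 compose file looks roughly like this (a minimal sketch; service and volume names are placeholders):

  version: '2'
  services:
    db:
      image: mysql
      environment:
        MYSQL_ALLOW_EMPTY_PASSWORD: "yes"
      volumes:
        - dbdata:/var/lib/mysql
  volumes:
    dbdata: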


This is interesting. I'd been considering mounting drives for persistence of stateful data from containers.

Let's say I want to run a Wordpress hosting service. In my ideal world, I deploy an "immutable" container for each customer, i.e. everyone gets an identical container with Wordpress, Nginx, MySQL etc. So what to do with state info, like configs and the MySQL data files? I'm thinking of mounting a drive at the same point inside each container e.g. /mnt/data/ and /mnt/config/ or similar.

This way the containers can all be identical at time of deployment, and I can manage the volumes that attach to those mount points using some dedicated tool/process.

This is all still on the drawing-board... but what you've said here seems to suggest this approach should work. Or have I optimistically misinterpreted what you've said? :)


Yes that's a pretty good approach. Just organize the configs in a directory structure on your host and mount them as volumes (along with any other necessary volumes for e.g. uploaded media). There are more advanced methods like using Consul/etcd, but only go that route if you're ready to invest a lot of time and need the benefits.


In your example -- assuming 20 different blogs/customers -- you'd be running 20 separate instances of MySQL (plus 20 nginx instances plus 20 php-fpm instances plus ...)?

Now, let me first say that I haven't come anywhere close to even touching containers and most of what I know about them came from this HN thread so please forgive me if I'm missing something...

I, personally, would rather only have a single MySQL instance -- or, in reality, say, a few of them (for redundancy) -- and just give each customer their own separate database on this single instance.

With regard to containerization, why is all of this duplication (and waste of resources?) seemingly preferred?


You're quite right, of course.

In my scenario, I want to provide a package for easy download and deployment. Each customer will indeed run their own mysql db, if they choose to self-host the containerised software.

I plan to offer a paid hosting service, where I'll rent bare metal in a data centre, onto which I'll install management and orchestration tools of my choosing.

An identical container for any environment is my ideal, since this will make maintenance, testing, development etc simpler. Consequently each customer hosted in my data centre will, in effect, get their own mysql instance.

This way the identical software in each container will be dumb, and expect an identical situation wherever it's installed.

Now, in reality, I may do something clever under the hood with all those mysql instances, I just haven't worked out what yet :)

Actually it will probably be Postgres, but I'll use whatever db is most suited.

So yes, some duplication and wasted disk space, but that's a trade off for simplified development, testing, support, debugging, backups, etc.


In this case, a single mysql instance with individual databases may indeed be the best approach. It'd be very easy to launch a mysql container and have each wordpress container talk to it. I use Rancher for orchestration, and it automatically assigns an addressable hostname to each container/service, so I'd just pass that to each wordpress container. Or you could expose a port and use a normal IP + port.

The duplication is preferred because you can take that stack and launch it on any machine with Docker with one command. Database and all. Usually that's great, but it'd be very inefficient in this case.


> Docker restricts the container to a single process only.

No, there is only a single process treated as init in the container, but you can spawn off multiple child processes.

> The default docker baseimage OS template is not designed to support multiple applications, processes or services like init, cron, syslog, ssh etc.

If you want init, cron, syslog, ssh, and your app(s) all rolled up into one, you want a VM, not a container.


> No, there is only a single process treated as init in the container, but you can spawn off multiple child processes.

It was extremely clear that the person who wrote the text you are replying to understands this as they specifically cover this fact with respect to using a service management daemon: you are just being pedantic with the wording to complain about this :/.

> If you want init, cron, syslog, ssh, and your app(s) all rolled up into one, you want a VM, not a container.

No: a virtual machine would burn a ton of performance as it would also come with its own kernel. The entire premise here is to be able to share the kernel but split the userspace in a sane way.


You mean the way the parent('s quote) needlessly broke down an application into single-process containers, and finally breezes by "actually, you can" in order to spruik LXC instead, because 'multi-process'? Or the way the parent complains about not having init, but then says you can use something like runit?

I don't particularly like Docker and led my company's exodus from it, but the parent is being very slanted in their wording.


Doesn't Docker use lxc underneath? [1]

[1] http://unix.stackexchange.com/a/254977/152994


Not for a few years now.


I guess I'll never get it. Don't most OSs already run processes isolated from each other, have advanced process scheduling mechanisms and manage access to hardware resources? Also with static linking nothing stops you from creating huge binaries that "will run anywhere".


From my perspective, where I'm planning a hosting service with multiple customers, containers promise to allow me to slice machines into dedicated chunks. VMs could do this too, but they'd consume far more disk-space, which would mean my hosting costs would be comparatively higher.

Containers promise to allow me to limit the resources each of my customers can consume without killing the entire box. For example, I can limit RAM, CPU and disk-space per customer. If one customer goes rogue, the shared box remains performant for my other customers. Also there's the data protection: in theory customers would not be able to access each others' data, even if they tried to.

There are other considerations as others here have pointed out. This is just my primary concern at the moment.


All of your requirements can be satisfied with properly configured VMs.


Including total disk space consumed on the host? I can't see how. Containers are by definition far lighter.


It depends on what you build into your containers. Your customers will need storage at some point, unless they're just using the containers as processing engines. You'd also be surprised at how lean you can build a VM with Linux (or one of the BSDs) as the OS.


In containers a few more things are virtualized. The file system is semi-virtualized. Network ports are too. So from the point of view of stuff inside the container, nothing else is running. That's not true of processes in general. From outside the containers you can then choose how to map parts of the virtual file systems to parts of the real file system, and which real network ports the virtual ports connect to, etc.

There's more to it. Containers aren't one process; they're as many processes as are launched in that container.
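A quick illustration of that mapping (paths here are just examples):

  # Map host port 8080 to the container's port 80, and a host directory into the container:
  docker run -d -p 8080:80 -v /srv/www:/usr/share/nginx/html:ro nginx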


So let me try to understand this from a different angle: what's something that a VM can and does do that container software like Docker can't? TFA makes it sound like legacy systems are the only place for VMs anymore, but I'm guessing that's probably approximation + exaggeration.


In my (limited) understanding: host OS and guest OSes can be completely unrelated. Whereas in containers, the host and guests share the kernel.


Imagine you run 3 different Ruby apps (or whatever other language) and they all run on different versions of Ruby. Containers let you easily isolate the whole stack, including the specific Ruby version, and install only the packages that app needs in its own container. Of course it's still possible to do this without containers; my company handles it by building our own custom RPMs for each Ruby version and sticking them in /opt.
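A rough sketch of what that per-app isolation looks like (image tag and commands are just examples):

  FROM ruby:2.3
  WORKDIR /app
  COPY Gemfile Gemfile.lock ./
  RUN bundle install
  COPY . .
  CMD ["bundle", "exec", "rackup", "-o", "0.0.0.0"]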


Not with a private root filesystem, a private loopback network, additional security isolation, no requirement for static linking, no.


Docker is the best thing since sliced bread! VMs are much more resource- and time-consuming.

# We switched our dev/stage env to containers 2 years ago.

# We have made our own standalone app Docker-style. Once again - easy as pie.

If you are a developer, you should add Docker + Docker Compose to your working tools.


We moved our development process over to Docker some time ago and it really sped up our work.


> The way virtual machines work, however, is by packaging the operating system and code together. The operating systems on the virtual machines believes that it has a server to its own, but in reality, it’s sharing the server with a bunch of other virtual machines — all of which run their own operating systems and don’t know of each other. Underneath it all is the host operating system that makes all of these guests believe they are the most important thing in the world. You can see why this is a problem. The guest virtual machines basically run on emulated servers, and that creates a lot of overhead and slows things down (but in return, you could run lots of different operating systems on the same server, too).

Wait a second. Isn't this the whole point of hardware virtualization support? So that hypervisors don't have all those VM overhead slowing things down?


On VM's: "Underneath it all is the host operating system that makes all of these guests believe they are the most important thing in the world. You can see why this is a problem."

Two paragraphs later (On containers):

"The only operating system on the server is that one host operating system and the containers talk directly to it."


This is not quite related to Docker/Kubernetes, but a phrase in the article triggered memories about a very good book that I think everyone should read - The Box by Marc Levinson. It really is a fascinating book.


Definitely "not quite", as it is about shipping containers, but an interesting read.


The reason people don't get the advantage of Docker is that there is a weird (and in my opinion stupid) taboo against putting all of your deps in one container. This is about 10-100x easier than trying to compose a bunch of containers.

Not everyone can do that, but plenty of people could. Except they don't, not because of some actual ops requirement (in many cases) but because they don't want someone to say they did it wrong.

I am assuming this situation has actually changed by now, I hope, and swarm/compose or whatever is built in is not too hard to use?


No, it's worse now. There's also all the PKI required, compounding the pain and suffering.

Phusion baseimage-docker seems to help.

Worth noting Atlassian stepped back and does a scale-out approach using regular cloud instances instead of container orchestration technology. https://www.atlassian.com/company/events/summit/2016/watch-s...

Also worth noting you can take away all the PKI and registry pain and have instant scaling with the registry by using a cloud store for the registry and having a Jenkins gatekeeper be the only writer; all dev and production nodes use the same store read-only.


> The promise behind software containers is essentially the same. Instead of shipping around a full operating system and your software (and maybe the software that your software depends on), you simply pack your code and its dependencies into a container that can then run anywhere — and because they are usually pretty small, you can pack lots of containers onto a single computer.

Already got it wrong. Current containers are exactly the OS and the kitchen sink for running 'printf("hello world")'.


It's up to you to make your containers bloated or keep them slim. You can use the alpine versions of the official Dockerhub images. Python on Alpine is 30 MB (vs 267 MB for the debian one). https://hub.docker.com/r/library/python/tags/

You can create containers that are just a few MB with compiled languages like Go (5 MB). https://www.iron.io/microcontainers-tiny-portable-containers...

From the article: "Rather than starting with everything but the kitchen sink, start with the bare minimum and add dependencies on an as needed basis."


If your app is a compiled go binary (so, it runs anywhere) why do you need a container?

The whole point of containerisation is to group installed dependencies (as opposed to installable dependencies like with a regular deb or rpm package) and configuration into a 'black box'.

If your binary is already a single-file distribution, why lump it in with the crapfest that is docker?


I don't think that's the whole point. If all you care about is packaging your code in a container, then you don't need Docker. That was solved long ago. Docker simply adds a nice API on top which allows you to package, ship, and manage your apps in the same exact way.

By using something like Docker Compose, you can explicitly define each container and the relationships between the containers. Then you can start/stop groups of containers (an application stack) while still retaining some component isolation and the ability to upgrade and scale containers independently. All of this is defined in a relatively simple YAML file, which can be committed to VCS and tweaked. I can't tell you how awesome it is to find repos on GitHub that use docker-compose. Even complex apps with many pieces can often be launched with a single command. It's easy to take these stacks and tweak them to your needs, adding/removing/changing components as necessary.

Since Docker provides a standard way for managing any container, orchestrators like Swarm, Rancher (my preference for small-medium clusters), and DC/OS can take that functionality and scale it across many hosts. You can get a birds-eye view of your Docker cluster (all apps) or drill down into individual apps and their components. Each container is a uniform piece and can be controlled, scaled, updated, audited in the same way. Throw a UI in front of it and now you can manage applications that you know nothing about. That's great for developers that manage just a few apps, and it's incredible for enterprises with thousands of them. If you don't want to think about infrastructure at all, then you can use a SaaS Docker provider. Obviously there are pros and cons to each approach, and there are some remaining challenges.

Docker isn't perfect, especially in regards to storage and networking. Distributed storage isn't simple, but a lot of progress has been made with volumes and volume drivers. It's not as easy as it needs to be, but the general direction seems to be good and with the proper tooling I think this will be less of an issue.


If "you" here is one person and "your app" is one app, then yes, why do you need a container. If "you" is 200 developers, and "your app" is 75 different applications written in Java, Java plus native libs, .NET, python, the other python, R, scala, NodeJS, and various C libraries, then docker containers, and more importantly images, are about as ideal as it gets. We're running Mesos so we don't have to use docker to get containers, but packaging up the products as images is a significant advantage.


Both the post I replied to and my post explicitly talk about a go app, nothing else.


Alpine containers are nice if you're just looking at size. But they break down once you `docker exec` into them to try to debug something:

  $ docker exec -ti mycontainer /bin/bash
  stat /bin/bash: no such file or directory
  $ docker exec -ti mycontainer /bin/sh
  / # curl https://localhost:5000/
  /bin/sh: curl: not found
  / # strace $command
  /bin/sh: strace: not found


Takes about 2 seconds to fix:

  $ docker run -ti alpine /bin/sh
  / # apk add --update curl
  fetch http://dl-cdn.alpinelinux.org/alpine/v3.4/main/x86_64/APKINDEX.tar.gz
  fetch http://dl-cdn.alpinelinux.org/alpine/v3.4/community/x86_64/APKINDEX.tar.gz
  (1/4) Installing ca-certificates (20160104-r4)
  (2/4) Installing libssh2 (1.7.0-r0)
  (3/4) Installing libcurl (7.50.3-r0)
  (4/4) Installing curl (7.50.3-r0)
  Executing busybox-1.24.2-r9.trigger
  Executing ca-certificates-20160104-r4.trigger
  OK: 6 MiB in 15 packages


Hitting dl-cdn.alpinelinux.org repeatedly for simple things is probably not nice. Is there an easy way to have a local alpine mirror?


I'm not suggesting this would be done repeatedly. Just when you need to debug something.

Nonetheless: https://wiki.alpinelinux.org/wiki/How_to_setup_a_Alpine_Linu...


That's not really broken. It is just a minimal image. If you want more packages, then install them via the Dockerfile before you fire up the image. I like starting with a bare container. For example, I might take an Alpine image, install OpenVPN and use it to serve an always-on VPN connection to other containers. The VPN container doesn't need a shell or anything else, really. The only thing I want to see when I attach to the container is what OpenVPN is spitting out.
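Something along these lines (a sketch; the config path is just an example):

  FROM alpine:3.4
  RUN apk add --no-cache openvpn
  COPY client.conf /etc/openvpn/client.conf
  ENTRYPOINT ["openvpn", "--config", "/etc/openvpn/client.conf"]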


Well yeah, "break down" was probably too much of a term. I did not mean to diss Alpine; I just meant to point out that the downsides of minimal images should be carefully considered.


Plain and simple: containers violate the KISS principle. While you are developing reams of domain knowledge working with this turd of a technology, others are making progress on non-self-created problems that matter in the real world. This is another case of the tech world developing pointless tech that can be ignored by those solving concrete real-world problems.


Putting a little bit more thought into naming things can help newcomers understand these concepts more easily. The word container is too generic; why not call it an 'application jail' or a 'virtual operating system'?

Another word that is over used in our field is 'context'.


They were called "jails" long before "containers" became a thing on HN. Somehow "jails" didn't stick, so I'd assume it was even more confusing for newcomers.


Jail had no marketing hook. It's a very negative word. I'd guess that's probably the major reason.


Well that and it wasn't done by a VC backed firm with strong press connections.


> Somehow "jails" didn't stick

I hear chroot jail tossed around a fair amount.


Jails is typically the FreeBSD name for a very similar concept. Linux "copied" (using the term very loosely) the idea, but not the name.


I was running a couple of jails on my FreeBSD server 15 years ago.. Worked like a charm.. But eventually I switched to linux.


It's based on the idea of container ships where the contents of every container can vary widely but the external interface for dealing with the containers is always identical. It's a pretty apt metaphor and not too generic, IMHO.


Guys, you really should take a look at hyper.sh if you are frustrated with running containers in production.


The analogy in the article is inferior.

Here comes the standard HN automobile analogy.

The car makers eternally release little light weight cars that are fattened up with cruft until they're as heavy and expensive and complicated as the dinosaurs they were originally meant to replace. At which time car makers birth a new little stripped down lightweight simple compact car line. That design pattern is self similar and fractal in that even car engines and transmissions and car radios undergo a similar pattern of revolution to make something new, then evolution to slowly and methodically make it the opposite of the original goal, repeat forever. This design pattern also applies to computer architecture.

To fight the problem of hardware evolution making life pretty hard on programmers, source code compatible OS and libraries were invented that could run the same software on vaguely similar hardware, on mainframes in the late 60s at IBM and on PCs in the 70s CPM era, later extended into the "home computer" series era in the 80s, then into the msdos era. This became unwieldy and too complicated for the end users so it was replaced.

The same people implemented the idea of OS packages, again more or less in the 60s on IBM mainframes or the 80s on early unix boxes. The idea is to compile emacs to be integrated very deeply into the OS and cooperate with every other piece of software. This is contemporary. However especially in NON-FOSS it doesn't work and doesn't scale and is very expensive if not impossible to implement, being the only closed thing on an open system is a nightmare for everyone and everything involved. The stereotype in the 90s was only one service installed on one MS windows server, even if that meant it took 20 MS servers to do the job of one unix server. Anyway, expensive, complicated, hard for end users.

Again the same people implemented the idea of OS virtualization. Again, IBM mainframes in the 70s with VM, and early PC hardware experiments with TSRs and multiprocessing in the 80s to give "two computers at once". This is also contemporary, enormously more advanced today, of course. It turns out that running 50 OS kernels on a single piece of hardware is kinda wasteful although possible and cooperation gets complicated and unwieldy so time to replace again.

Again the same people implemented process jails / chroot on the BSDs, and eventually, after some decades, Linux caught up, resulting in Docker. So now your packages don't cooperate or interoperate at all with each other or the OS, which solves all the problems where the previous technology didn't work, and creates massive new problems that never existed, mostly where the old technologies worked great. There are of course completely separate new problems. It turns out that a system designed to eliminate interoperability between packages makes interoperability a huge PITA, and there are other problems that make use unwieldy and complicated, hard for end users.

Again the same people implemented (this section to be written after 2016). Maybe IOT. Maybe collapse of hardware prices faster than business demand means 20 rasp-pi cluster is cheaper and easier to maintain than a single beefy desktop running 20 virtual images or 20 docker containers. Maybe FPGA on the desktop means people will just synth up whatever matches this hours workload. Maybe cloud will finally work and no one will maintain servers anymore, it'll all be magic, or at least push the magic to someone else who now has the same old problems. Maybe SaaS means we'll all be customers and most productive software will run on internet scale clusters not individual machines, individual machines will be dumb webbrowser terminals. Who knows!


IBM and Sun know.


lol. read two paragraphs => click author's name to verify he's got no actual tech background. life is boring.


The author has been writing for TechCrunch for a considerable amount of time.


I have been singing in the shower for a considerable amount of time. Still no singing expertise.

In both cases it is immediately audible/visible in the output.


Hi there, good day fellows

I was expecting this kind of thread long ago, thanks guys for sharing your concerns, I am learning a lot from them!

IMO we can't compare containers vs VMs like many are doing now - and I was too when first heard about Docker and containers.

I hold almost all VMware certs (VCP/VSP/VTSP for 3.1/4.x/5.x and VCDX), and I signed off on the 3 major VMware P2V migrations in BR/LA (287 P2Vs in 2005, 1600 in 2008 and 2500 in 2011). I was REALLY into VMware from 2000 to 2010, so I feel confident using it and recommending it for many environments. I still manage some of those environments today.

When we clone or migrate a physical machine to virtual, or whenever we deploy a VM from scratch (or even from templates, or by copying VMDKs, etc.) into production, we build the environment to last "forever". We want this to be flawless, because no matter which major virtualization platform you deploy on (Hyper-V, VMware, Xen, KVM, VirtualBox+Vagrant, etc.), nobody wants to troubleshoot a production environment; we cheer to see the VMs always up and running. I remember doing P2Vs on the projects I mentioned during the night and having to fall back to the physical servers because the cloned VM didn't behave correctly. Please VMware, clone it right, otherwise troubleshooting legacy shit will be a pain.

On the other hand, containers are made to be replaced. They are impermanent. You can tune your images to your app/env needs. You can have an image for each service you run. You can have many images running many containers, some of them providing the services you offer. You can describe an image in a text file called a Dockerfile and generate the image from that file.

So imagine we have this infrastructure to "Dockerize": a website with a DB. Does your webserver run Apache? Then you can write a Dockerfile that deploys an Apache instance for you. It could build FROM ubuntu:16.10 or 10.04, depending on what is better for your app. Or we could just pick up Apache's own image, as in FROM apache. You can save this image as yourid/apache. And you can do the same regardless of which DB you are using: just install it (e.g. MySQL via apt-get in a FROM ubuntu/debian based image) or use the mysql image directly. You can publish the website by cloning your site's dev repo directly in the Dockerfile itself, or you can keep the website in some directory on your host and use ADD or COPY to make it available in the right container dir (e.g. /var/www/). You can even use volumes to mount some host dir (or a blob, a storage account, a partition on your storage, anything available on the host itself). This is especially interesting for DBs in my opinion. Once the DB Dockerfile is ok, you can name that image yourid/db.

And if you have a main website, and a blog website, you could use the same apache Dockerfile changing only the git clone line, and save them as yourid/apache:website and yourid/apache:blog for example.

And when redeploying the container, you will have the same volume data available in the same container dir, even if you moved it from ubuntu:15.10 to ubuntu:16.10. You can pick up the latest improvements from the freshest image (or patch some vuln) and redeploy all the containers that use that same image at once.

The same goes for anything else: you can test Jenkins without knowing what OS the jenkins image is built from. You don't have to worry about it. It will just work. You pull the image, run the container and voila.
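For instance (a sketch using the official image of the time):

  docker run -d -p 8080:8080 --name jenkins jenkins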

NOW, my Docker instances look like this: I use docker-machine, and locally I have the default and local envs. I also have an Azure instance in docker-machine (that runs on Azure), and another instance configured in Docker Cloud using Azure as the cloud provider (I use Azure due to BizSpark credits). So, 4 of them. All those instances are VMs themselves; Ubuntu VMs, to be precise.

You just replace the container (probably published in a cluster, if you care enough about your prod env). Not the same as with VMs at all.

I see Docker as a hypervisor for containers the same way VMware and Hyper-V are hypervisors for VMs.

So my Docker host VMs still have the same HA, failover, load balancing, resource allocation and all the other facilities VMs have. I use Docker on those VMs for easy deploys and for building and tuning images; really guys, I was the VMware guy for so long, and I went just crazy over the capabilities Docker gives us.

Docker has many weak points indeed (NAT networking, having to run privileged containers sometimes, security concerns, etc.), but again, I don't see Docker erasing VMs from my life, as if from now on I could deploy everything in containers and it would run happily forever. We still need HA, failover, load balancing, resource allocation and so on. Docker needs to be used together with tools that allow it to run smoothly and allow us to maintain our environments more easily.

One of those tools is a container cluster. I work mostly with Google Kubernetes, but there are others, such as Docker Swarm, Apache Mesos, DC/OS... Azure has its Azure Container Service (ACS), IBM has its Bluemix containers, etc. Using a cluster and a deploy methodology, you can deploy your containers into different namespaces such as DEV / STAGING / PROD. You can use a BUILD environment to read your Dockerfile, build the image and deploy containers to the namespace you need. You can configure this build to trigger on a commit to the git repo, for example.

So let's say we have a developer, and he needs our yourid/apache:website deployed with the new website version. If the website is already updated in your git repo, you just clone it. The Dockerfile would look like this:

  FROM apache
  MAINTAINER Your Name <your@email.com>
  WORKDIR /var/www/
  RUN git clone https://github.com/yourid/website/website.git .
  EXPOSE 80
  CMD ["/run.sh"]
This would be named website.Dockerfile. If you change the git repo to any of your other sites that run on Apache, you can save it as other.site.Dockerfile and always deploy that service from that specific repo.

You can customize your Dockerfile, of course, and add support for specific stacks by installing PHP, Ruby, Python, etc. You could even use configuration managers (CMs) such as Ansible, Salt, Chef, Puppet, Juju, etc. to apply those changes.

Let's say we start the build now. We are developing this image together, so I just changed the git URL in the Dockerfile. When we commit, the autobuild triggers in our build system (in my case, Docker Cloud or Jenkins). This is what Continuous Integration (CI) and Continuous Deployment (CD) are about.

So when the build is triggered, it gets the Dockerfile from the repo, builds the image and deploys the container into the namespace you wanted (in our case, DEV). This service could be published as website.dev.mydomain.com, for example; the same goes for the staging namespace, and as www.mydomain.com when it's ready for production, in the PROD Kubernetes namespace. Kubernetes is a distributed thing, so you could have minions (nodes) split across different datacenters or geo locations. This pretty much reminds me of VMware VMs running on VMFS storage made available through a set of ESXi servers, all with access to the same LUNs/partitions/volumes.

This is just my point of view, so please feel free to comment and ask me anything.

Please, just don't blame Docker because you aren't aware of the mainstream tools available nowadays. If you are comparing Docker to VMs, or SSHing into the containers, and often mad because your data vanished when you redeployed your Docker containers, believe me, you are doing it wrong.

Being a pioneer is always the same: in the 90s we had to explain why Linux was good for the enterprise, in the 2000s we had to prove VMware really could cluster your legacy systems, and now we have to explain what's possible with Docker. The tech is new (I know there were Solaris Zones, Google Borg, etc. before), but I expect Docker to mature its features by relying on other tools (and even by copying features, e.g. from k8s into Swarm). Docker is just one skill needed to run your stuff.

Cheers!

M Matos https://keybase.io/mmatos


'WTF' really?


Take an operating system. Remove all the advantages of a shared environment like dynamic libraries, package management, clarity. Stick a chroot before every fork. Boom! Containers.


I've read before on HN that you shouldn't 'use docker in production'; can anyone elaborate on that for a newbie?


My understanding is that plenty of people do actually use Docker in production, but they use more than just Docker. A docker container isn't quite the same as a VM.


There is a lot of infrastructure around security and availability that isn't yet achievable with off-the-shelf software. So ideally you use containers to define dependencies and deployment, and deploy into a proper environment.



