WTF is a container? (techcrunch.com)
366 points by kawera on Oct 17, 2016 | 253 comments



I remember going to AWS Reinvent last year and having some beers with a bunch of people who did devops. We started talking about tools, and they were utterly flabbergasted that we had not embraced docker. They went on and on about how simple docker made HA and handling failovers/maintenance. More or less made it seem like it was the greatest thing since sliced bread.

A few coworkers and I decided to try to dockerize some of our services, and move our staging ES cluster to docker.

For the most part, building our own containers for the various services was easy enough. The biggest issue we had was with Elasticsearch, since we have 4 types of nodes, so we ended up building a specialized container for each node type.

Then came the issues:

* Making containers work together. For example, we use a logging agent, and we decided to make that its own container. Then actually getting a way to share a log directory with the agent was very painful and unclear. (Honestly the single most frustrating thing I recall was asking for advice in IRC and more or less being told I was doing it wrong.)

* Containers would randomly crash under load, exiting with error 137 (out of memory). Apparently a few of our services would randomly leak memory, but only when running inside a docker container vs Ubuntu 14.04. (I never figured this out.)

* Containers would randomly become unreachable, forcing me to kill them and restart them.

* Random hanging issues: suddenly docker exec is hanging. Being told to revert to an older version, or install a newer version, is tiring and makes my environment very inconsistent.

* Trying to debug why our app in the container isn't working is not fun at all.

However, the single part that killed me was when I was chatting with one of the people I met at Reinvent and mentioned my issues. He acted like these kinds of issues were completely normal.

After a solid 2 weeks of random issues and the constant barrage of PagerDuty alerts, I just rolled everything back to static EC2 instances for staging, and have run into 0 issues. I want to try containers again because I want them to work, but I have just had too many issues.


We have similar stories. We even brought in a devops consulting firm, who insisted on a pure AWS / Docker (ECS & co) stack. After a couple months of shocks, crises and occasional all-nighters, I just started deploying backup instances on Heroku, so that the QA and design teams wouldn't get blocked by the weekly clusterforks. After a few weeks of smooth sailing on that front, we just activated logging and auto scaling addons, and blessed the Heroku stack with the production domain name and CDN.

Heroku gets expensive quickly at scale, but the engineering required to make "the future" work on AWS was unmeasurable (because it never really succeeded for us).

I don't even know what to blame for the whole episode. Docker's incomplete architecture (2015)? AWS's inability to abstract and manage complexity into something that just works as described on their product pages? The consultants? Myself, for putting faith in that triad of unfamiliar but crucial people, services and products? Whatever, it's no longer a current problem for me.


It's exactly these kinds of issues that add to my impression that containerisation, and Docker in particular, is more of a religion than a solution.

I don't hate it per se, but I'm just a bit fed up with people telling me I should be using it without being able to articulate what problem it's going to solve for me.

At the moment it feels like just one more thing to learn, and one more moving part, that isn't strictly necessary in order to achieve what I need, which is basically to deploy software without a heap of aggravation.


For me, with appropriate base containers, the real benefit is load time... booting a full VM has a lot of overhead, and takes quite a while. Loading a dockerized app is much faster. If you're extending this for your testing environment, that can be a huge win over a larger team.

Being able to load more containers than VMs onto a server is another piece...

Beyond that, it's not too much different from how running VMs was earlier on: it took orchestration and tooling to get working, and there were growing pains... much of that was abstracted away well before it became popular.

In the end, containers are just another step towards wherever we are going... I think it's pretty nice, but the tooling is just now starting to catch up.


I think that's a legitimate argument but deployments to VMs that are already up and running can also be fast, and for me they are, so it's not overwhelming. My feeling is that one day I'll reach a level of complexity/deployment time where the benefits of containerisation become clear, but I'm not there yet, nor anywhere near it.


That's a discipline issue though... shared nothing start from zero... The "docker" method doesn't upgrade an app already running in a container, it creates a new one.


I wouldn't use it for everything, but whenever I encounter a problem where applications crash or behave weirdly, the problem is with the app, not the container technology. Just because the problem only manifests itself in a container doesn't mean the problem is suddenly "docker".

And as with every technology, you have to understand its strengths and weaknesses. I use Docker internally and in production with very few issues other than it advancing way too fast at the moment, but am always amazed how many people dive head-first into a Docker adventure without understanding what it is, how it actually works, and its limitations.

This is from the perspective of a relatively small development company with applications where scaling is a non-issue. Our problem is that we have a ton of active projects. To give you an idea, our internal CI now still has over 500 active build jobs after a recent cleanup.

This CI is the first thing where Docker shines, it was an absolute god-send. I got rid of tons of frankenstein build slaves with unpredictable configurations, and replaced them with one huge VM running docker, with build images per project. This made this massive mess suddenly perfectly manageable, documented and version controlled. Need to build something in a build configuration from 1 year ago? Not something I want to do every day, but not completely impossible either, since I still have the exact same docker image in our internal registry.

Other than that, upgrading internal tools became a lot easier. Everything used internally (redmine, jenkins, ...) is containerized, which means it can easily be tested, migrated, cloned, ... It enforces data separation, which means it's clearer and easier to backup/restore and test these things. It means that now whenever a new version is released I can easily test this with our real data to see if there would be any problems in our configuration, and if not, quickly deploy a new version.


Check out https://github.com/jenkinsci/hyper-slaves-plugin. This plugin will launch your build jobs as on-demand containers in Hyper.sh; then you don't even need the long-running huge VM!


It would help these people to read http://docker-saigon.github.io/post/Docker-Caveats/


I feel that sometimes I'm pro containers, and sometimes I'm very much against them. Docker does make deployments and upgrades very easy once you have the initial infrastructure set up, and you can deploy applications directly from upstream as "units" so that each deployment is exactly what you want; deploying containers helps prevent hosts from diverging too much.

If you have the infrastructure in place to rebuild your virtual machines though, containers offer little benefit other than perhaps not having to package your software.

At work we're mostly deploying docker containers using puppet, and often when docker fails to work, I can just obliterate everything and run puppet once to re-download images and set up everything again. I would not trust docker to manage any data that I can not recover from elsewhere.

My biggest gripe is that configuration management feels like an unsolved problem. Most examples I see on the web seem to simply ignore it, or do the usual "mount a host directory inside a container" thing which has issues with file permissions and host/container UIDs clashing, and just feels inelegant.
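
For what it's worth, one common workaround for the UID clash is to run the container process as the UID/GID that owns the host directory. A minimal docker-compose sketch, assuming a hypothetical myapp image and a host user with UID/GID 1000:

    app:
        image: myapp:latest          # hypothetical image
        user: "1000:1000"            # run as the host user/group that owns ./logs
        volumes:
        - ./logs:/var/log/myapp      # host directory shared into the container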

I'm also developing a dislike of Docker on account of it simply failing to work at times for no discernible reason, and having stupid issues like IPv6 simply not working correctly.[1] It's also rather inelegant when used on a systemd-based host because it wants to reimplement most systemd functionality.

I'm still waiting for rkt to mature a bit, maybe it will be better.

[1] https://github.com/docker/docker/issues/2174


More like systemd has been reimplementing docker functionality...


FYI, systemd had nspawn for kernel debugging before docker became hyped


> Trying to debug why our app in the container isn't working is not fun at all.

My experience with AWS was similar, and I got to a point where I had to stop accepting clients who had built (or insisted on building) their infrastructure on AWS. It actually kind of reminds me of a pyramid scheme where they ensnare you with a seemingly good deal, but in order to get that you've gotta buy in just a little (time, cost, etc.) more and more -- just to be able to debug -- and more until you're calculating sunk costs[1] bigger than any company should have to spend on something as trivial as, say, a WordPress site. It's highly unlikely that 95 percent of the SMB websites out there need the overkill of AWS infrastructure / EC2. Do a cost-benefit analysis.


You entirely missed the point. AWS is not relevant. All the issues are coming from docker.


I think you missed his point. He was speaking generally about the costs of working on new tech stacks that other people just accept.


My reply is off-topic, but I cannot resist.

> I think you missed his point. He was speaking generally [..]

shawnee_ --> hackeress.com

> What is a hackeress?

> A hackeress is a female hacker.

Bad form to assume all people are males in this domain (or even the majority for that matter, regardless of the actual statistics).

Use the form "they" when referring to someone whose gender (or gender identity) is unknown to you. Or check their profiles ;)


I think the assumption arose not from the demographic of people on Hacker News, but from the username of that poster. "Shawn" is a fairly common name, where one in every 2000 people will be named it. [1]

Meanwhile, "Shawnee" is a really rare name[2], one you may not be aware of if you didn't grow up in the US (or in particular parts thereof). (Apparently only 4000 of them are alive today!)

Just as an aside, using "they" by default can be very confusing, especially when plurals are involved. I use it sometimes, but prefer to use the passive voice instead ("the parent poster", etc.) I definitely do not have time to check every person's profile when I comment on HN.

1: http://www.wolframalpha.com/input/?i=shawn

2: http://www.wolframalpha.com/input/?i=shawnee&rawformassumpti...


That is not the passive voice.


Thanks for pointing out where the name confusion may have come from.

> I definitely do not have time to check every person's profile when I comment on HN.

At least then we should not make such gender assumptions explicit, regardless.

I have never heard the name "Shawnee" and generally do not assume usernames on web forums are indicative of real-life anything (e.g. you are 'striking', but apparently that is not your name).

I agree 'they' is an awkward construction in English, but for many people the alternative using passive voice is a little more complex to use (cf. non-native English speakers).


As a Polish immigrant, I personally think "singular they" is worse, because it creates ambiguity where there doesn't need to be any... which makes things a bit more confusing. Meanwhile, I think my proposed alternative is native to most other languages, and should translate fairly easily.

I'm not sure I see the point in thinking this long and hard about pronouns, but to each his own.


Yes indeed, to each their own. The singular "they" can be confusing in certain cases, but that's not really a problem unique to "they". "He", "she", "it", "we", and "you" can all be confusing in relatively common circumstances.

But you're going to have to get used to the singular "they", because it seems to only be getting more common (historically, singular "they" was acceptable until the late 19th century, so the strictly plural "they" might be seen as a 20th century anomaly).


I, too, think Docker is the bee's knees. But I don't think it would be good fit for a datastore (even though zeitgeist is to Dockerize all the things).

Docker seems to work best in an SOA type environment where you have a set of stateless services that might use different stacks. Docker simply unifies their provisioning and deployment.


My impression is that it really shines when you need to scale and you can easily spin up and kill large amounts of nodes/compute without any impact to your service–when you're at the point where you're thinking about the health of the service rather than individual nodes. The whole pets vs. cattle ideal seems to be discussed most in configuration management contexts (because it's widely applicable to most architectures) but it translates into big benefits when operating infrastructure for something like a scalable SOA.

So to give them the benefit of the doubt, they may have made some assumptions about your scale/workload/architecture. Perhaps for them containers occasionally crashing is just a small blip that will be automatically corrected so while they need to be aware of it to monitor for trends they're not generally concerned with them.


The problem is you are trying to run a production environment with a development tool. Manually doing "docker run" is not how you deploy containers. Kubernetes is designed to run containers and addresses the most common issues.

> Making containers work together, for example we use a logging agent, we decided to make that its own container. Then actually getting a way to share a log directory with the agent, was very painful and unclear. (Honestly the single most frustrating thing i recall was asking for advice in irc and more or less being told i am doing it wrong)

Kubernetes handles shared directories easily, since all containers in a pod can mount the same volume. If you need to share across pods you can use PersistentVolumeClaims.
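
For the log-sharing case above, a minimal pod sketch (image names are placeholders) gives the app and the logging agent a shared emptyDir volume:

    apiVersion: v1
    kind: Pod
    metadata:
      name: myapp
    spec:
      containers:
      - name: app
        image: myapp:latest            # placeholder application image
        volumeMounts:
        - name: logs
          mountPath: /var/log/myapp    # the app writes its logs here
      - name: log-agent
        image: mylog-agent:latest      # placeholder logging agent image
        volumeMounts:
        - name: logs
          mountPath: /logs             # the agent tails the same directory
      volumes:
      - name: logs
        emptyDir: {}                   # shared, pod-scoped scratch volume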

> Containers would randomly crash under load exiting with error 137 out of memory. Apparently a few of our services would randomly leak memory, but only when running inside a docker container vs ubuntu 14.04. (I never figured this out)

Kubernetes provides replication controllers that will always re-launch or provision the desired number of pods for a service. It also provides health checks, just like an AWS ELB, to determine if a pod is healthy. You can also set resource limits (CPU and memory) per pod.
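
As a rough sketch of what those limits and health checks look like on a container spec (the image, endpoint and numbers are made up):

    containers:
    - name: app
      image: myapp:latest              # placeholder image
      resources:
        limits:
          memory: "512Mi"              # the container is OOM-killed above this
          cpu: "500m"                  # half a core
      livenessProbe:
        httpGet:
          path: /healthz               # hypothetical health endpoint
          port: 8080
        initialDelaySeconds: 15
        periodSeconds: 10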

> Random hanging issues, suddenly docker exec is hanging. Being told to revert to an older version, or install a newer version is tiring and makes my environment very inconsistent.

Docker exec should not really be used on a running service in production; all of your provisioning should happen in the Dockerfile used to build the image.

> Trying to debug why our app in the container isn't working is not fun at all.

If your application is 10-factor, you can easily tail the logs of any container at anytime


12-factor?


doh! you're right https://12factor.net/


Yes, Docker is still very young and has a ton of issues like these. I think it will take another 2-3 years before that whole ecosystem has emerged, matured, and is ready for a medium-stability deployment.

Unfortunately for us, such considerations don't stop tech fads. Because containers can allow many more applications to cohabit on the same "hardware", the approach has business momentum behind it too (lower infrastructure expenditures). "Docker" will be the buzzword until such time as it's actually practical and intelligent to deploy with it.


>> Docker" will be the buzzword until such time as it's actually practical and intelligent to deploy with it.

At that point docker will be considered boring old technology and we'll be flocking to a hip new fad. Repeat ad infinitum.


> Docker is still very young and has a ton of issues like these

Very young? It's over 3 1/2 years old

https://webcache.googleusercontent.com/search?q=cache:eYg3Fs...


Keep in mind that young is relative. In a field that still makes daily use of software originating from the '60s, 3 and a half years is very young.


I'm confused about the need to have a web app that 'runs anywhere' when you know exactly 'where' it's going to run. When you spin up an instance, don't you get to decide what OS and services you're going to use?

I could maybe see the advantage if you had no control over the environment that the app was going to be run in like you might have with a downloaded executable for example. However, if your product is just some kind of web application, I don't see the need for containers.


Being able to leave your current hosting provider quickly is important, but not over-engineering abstract hot-swapping interfaces is pretty important, too.


Same tune here.

What docker promises is truly amazing and is something that I think everyone wants. However, docker itself still has a lot of problems.

In particular, docker's new builtin swarm (with 1.12) has tons of issues. I've experienced __so many__ problems coordinating container startups on a swarm cluster.

From the documentation:

    > Swarm can build an image from a Dockerfile just like a 
      single-host Docker instance can, but the resulting image   
      will only live on a single node and won’t be distributed 
      to other nodes.
This is a serious weakness.

You have to use __manual scheduling__ if you want to ensure services are started on the same node. This is problematic if your inventory of nodes is dynamic and frequently changing. Allegedly, kubernetes solves this problem.

Swarm (and the whole ecosystem) is still not as mature as it could be, but I think the end result is going to be very awesome and useful.


Kubernetes has the concept of a pod, which is a group of containers that run on the same node. It does indeed solve that problem.

http://kubernetes.io/docs/user-guide/pods/


That's probably a bad pattern, using the local registry like that. The goal is to be as stateless as possible, so pushing and pulling from an external registry would fix that.


>Containers would randomly crash under load exiting with error 137 out of memory. Apparently a few of our services would randomly leak memory, but only when running inside a docker container vs ubuntu 14.04. (I never figured this out)

Did you identify that the software definitely did not leak on ubuntu? Or was it that it ran on ubuntu because it consumed swap?

Was it for sure a leak? I have an app that doesn't have back-pressure so when too many requests come in, it fills up memory waiting to push them all through to the slower database. Normally it uses 256m. Spikes make it hit 700mb. If I tell docker -m 300m, then the process gets 300m ram + 300m swap, so when it tries to use 700m it gets killed. I could tell it to use 350m and then it will run, but it would be swapping furiously.
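
If it's spikes rather than a leak, one option is to set the swap ceiling explicitly instead of relying on the default doubling. A docker-compose sketch with made-up numbers:

    myapp:
        image: myapp:latest      # hypothetical service
        mem_limit: 300m          # hard RAM limit
        memswap_limit: 800m      # total RAM + swap; a 700m spike swaps instead of being killed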


Reminds me of the early days of asterisk and open-vz virtualization. Everyone sane stayed physical but a few of us crazy enough were able to push through and reap the rewards early on.


Flabbergasted? Wow, this is a new word for me; I have never heard of it. When I first read it, in my mind I thought it was a Spanish football player that used to play for Arsenal, then Barcelona, and now Chelsea. LoL

Is this word even used anywhere else beyond the US? Never heard it used in the UK.


I'm from the UK and I hear "flabbergasted" reasonably often, more often from upper-middle class folks. Maybe a regional thing?


more of an age thing? I use it but I doubt my daughter ever does


I was born and raised in the midwest part of the U.S. and I've been using "flabbergasted" for all my adult life (20+ years). Of course, I've always had a propensity for peculiar or anachronous verbiage.


Flabbergasted is very widely used in the UK; I'm really puzzled how you could possibly not have come across it if you've spent more than a few months here.


Indeed. One might say it is almost flabbergasting!


I agree that containers (both for shipping and servers) are a great idea. And because I'm tired of always configuring servers, I decided to give it a try some time ago.

I wrapped my IRC client (weechat + glowing-bear) in a Docker container. Oh, not one container though, because I also needed https, which meant I needed either a mechanism to build and update letsencrypt certs in the weird format that weechat expects, or to run an nginx instance in front (and also somehow get the certs, but that's easier with nginx). So two containers.

And even though there was a ready-made nginx+letsencrypt https reverse proxy container (actually several), I had a huge amount of headaches to get it actually working. Even with the system set up, I occasionally have the container crash with exit status 137 (IIRC), which I've assumed might be because weechat leaks/consumes memory and eventually the host server kernel kills the process. Maybe.

So in my limited experience, comparing Docker containers to shipping containers is a gross simplification. Shipping containers are simple constructions requiring well-defined simple maintenance, while Docker containers seem to be complex thingamabobs that have multiple points of failure.


Docker is a poorly engineered and over-hyped technology.

The concept is great - and in fact, many companies have built great tooling around Linux cgroups. It lets you efficiently binpack applications on a single server - which is why 'containers' were created in the first place.

The side benefit of letting you define your OS libraries, and other things, is a nice bonus, and way overblown in my opinion.

Docker and its tooling is just plain bad. It's so bloody unstable, unless you pick some esoteric combination of versions and storage backends (old docker + old aufs + old ubuntu seems to do the trick) - and even then, you'll run into problems. Documentation won't help you here - it's trial and error.

The docker group seems focused on feature, feature, feature, while the basic stability and performance remains poor. It's quite amazing how they get it so wrong. Browse through the github issues for performance and hanging issues. The surrounding tooling is poor as well - registry v2 doesn't even support deletes (because they need to maintain compatibility with the 100000 different storage backends for it). There is no LTS release, bug fixes only happen in latest version, so you're in a constant state of brokenness. And so on.


We used Docker sort of early on, and got out of it around v1.6. Many problems that you had to work around yourself, but the one you just reminded me of was the repos: There was Dockerhub, a third-party place without 100%(-ish) uptime, and two serve-your-own, one of which was a black box you could never delete from, and the other had stamped "NOT FOR PRODUCTION USE" in big letters on its github page.

I remember one of the engineers giving a talk, saying that the problem they had was growing too big too quickly - they didn't have time to properly work out the base architecture in the early days.


I think the real problem with Docker is Docker Inc. The pressures and constraints the company is under promote the creation of new features that can be marketed and new software that competes with products from the competition.

They have effectively no incentive to get rid of bugs in the core product or to exhaustively test the features they do add.

Sooner or later someone actually using containers will produce a docker replacement that will take over unless Docker focusses on what actually matters.


Docker Inc. desperately wants to be VMware. I don't think it's a coincidence that the official documentation is so full of virtualization-like choice of words you'd be forgiven if you thought they already were.

They do however seem to be cutting so many corners technically, that the risk is they get undercut on their core offering.


Deletion from registry v2 is definitely supported now. It's just a total PITA, and took them a while to implement. But I got my disc space back, so I'm not complaining. :P


Any pointers?


This answer on SO is more or less the approach I used:

http://stackoverflow.com/a/37716286/308278


Ok, throw Docker away.

Did anybody have better experience with e.g. Rkt?


I really like: daemontools + static binary + setuidgid and maybe chroot.

Or Mesos, where everything needs to be in a tarball, which is extracted and the sole program run.

If stuff is statically linked and related files (config, assets, ...) are part of bundle that can be chrooted, what is the value add of a container?


It's not hard to use control groups in daemontools family style, either.

* http://jdebp.eu./Softwares/nosh/guide/move-to-control-group....


Containers are easier for folks with less operational experience to understand. Or maybe easier to get started with is a better way to put it. It's easy to underestimate easy to use tooling when you aren't the target demographic.


The concept is great but it's also not original. It's called "processes". Docker is little more than a mass of complication laid atop fork+exec.

That's why nobody can get it right - because we already did.


How is "your own network, your own view of the file system, your own view of the process table, your own view of the user IDs, ..." the same as "processes"?


On modern Linux distros every process is running in a cgroup and namespace by default. So these days the main difference between a "container" and a regular process is that regular processes are all jumbled together in the same root namespace, and containers are in separate namespaces.


Which Linux distributions do this? I'd like to read how "most" instances of fork and exec end up with unique networks and file system namespaces.


I didn't say unique, I said "jumbled together".

Now as far as which distros put processes in a namespace and cgroup by default, I know at least CentOS 7 and Ubuntu 15 do this. And those two distros on their own would qualify for "most".

To check if your distro does it, one way of checking is just doing a `cat /proc/1/cgroup`. This will show you what cgroups process 1 is in. By default you will be in the "root" cgroup.

To check your namespaces, `ls -l /proc/1/ns/`. You'll see the process is in some randomly generated namespace ID per item.

I'm sure you could recompile your kernel to disable this behavior, but the default reality of modern Linux is that everything is already running in a "container".

Now the question is whether or not people want to take advantage of that reality, and separate out processes in isolation, or keep running everything on a system in such a way that any single process can impact the whole system.


Plan 9 is knocking on the door and would like to have a word …


The context of this discussion would like to have a word too...


I think that Plan 9's filesystem-based namespacing (and lack of a superuser) actually have a lot to offer for container-like solutions. Any Plan 9 user can set up namespacing of the network and of resources and spawn a process within that restricted namespace.

The whole process is much simpler, I think, than that of creating a Linux container (that's the whole reason Docker exists: to simplify & abstract something which isn't really inherently complex, but is accidentally complex).

Plan 9 certainly wasn't perfect, but it had some really high-quality ideas we still haven't assimilated in mainstream platforms.


I'm not disputing any of this. But it really has nothing to do with the context of this thread...


I totally agree. The real issue is dynamic libraries and how hard it is to compile C/C++ code statically with GCC.

If you could just pass `-static` to gcc and it actually worked like you expect this would never have happened.

Fortunately that seems to be changing somewhat. Go is totally static, and Rust can easily be made totally static using musl. You can even do totally static C/C++ apps fairly easily with musl.


Dynamic libraries (in the C/C++ sense) only scratch the surface. Containers give you your own file system namespace (among other namespaces), which means all of the files that make up your complicated application unit can be put together and work together in isolation, separate from the machine's main file system.


What advantages do file system namespaces have over separating by directories and users?


A big one is managing software which you didn't write: if you have two things which expect to be able to write to /etc/mydaemon.conf etc., you either need to burn a VM for each one, fork the startup scripts or take the Debian-style approach of maintaining patches which make everything configurable, or manage something like maintaining chroot directory hierarchies.

(repeat for network namespacing: it's really nice not to need to play games to have your CI server start 3 running jobs which all think they're listening on port 80)

None of that is impossible – in the case of chroot there's many years of precedent - but if you do it regularly, there's a strong appeal to automating a common pattern.

This is especially true when your goal is supporting development teams: with something like Docker, normal users don't need root just to start a daemon on a privileged port or write to a couple of files. If you work in a large or security-conscious environment, that's a fairly big draw.

(Not saying that Docker is perfect or necessarily the long-term winner in this space, only that there's a usability gap which a lot of people fall into).
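
To make the port-80 point concrete, a compose-style sketch of three CI jobs that each think they own port 80 (service names and host ports are made up):

    job1:
        image: myapp:latest      # hypothetical app under test
        ports:
        - "8081:80"              # each container still listens on 80 internally
    job2:
        image: myapp:latest
        ports:
        - "8082:80"
    job3:
        image: myapp:latest
        ports:
        - "8083:80"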


> This is especially true when your goal is supporting development teams: with something like Docker, normal users don't need root just to start a daemon on a privileged port or write to a couple of files. If you work in a large or security-conscious environment, that's a fairly big draw.

On the contrary, any user that can run arbitrary containers (such as rootplease[0], for example) has root-level privileges on the host system.

[0] https://hub.docker.com/r/chrisfosterelli/rootplease/


What I was thinking about wasn't protecting against outright malice but rather mistakes and errors: if you give developers sudo access and you don't have an extremely diligent team with strong system administration experience, you're going to run into problems where people make incompletely documented changes or cause problems while working which aren't caught early enough – ever see someone break out sudo or chmod 777 as their first debugging step, or even write that into the install process because it was too much work to do it right? Docker is an enormous win here, both because it sharply reduces the number of times someone needs privilege escalation and because it ensures that the end result of their work can be reliably audited and repeatedly deployed.

It's true that Docker doesn't protect against compromised or malicious users with privileged access. That's a very hard problem in general which can only partially be addressed at this level — especially since many of the most damaging attacks don't need it (“The bad news: they exfiltrated our customer database. Good news: they didn't get root on the EC2 instance”). I think most of the answers for this problem are going to continue to rely on existing practices like code review, auditing, getting finely-grained SELinux / seccomp rules into the development mainstream, etc.


> If you give developers sudo access

Why would you, though? A developer would at the very most require the application's privileges, not the super user's. And that's only really necessary when doing live troubleshooting.


You have to if your developers are installing packages, working on deployment, running anything which runs on a privileged port, use tools like systemtap, etc. In some cases that can be avoided with configuration but then you're asking for bugs due to discrepancies between the environments.

You can reduce the number of things which hit that friction in a number of ways: having easily repeatable builds so a developer can test on their own VM; using a cloud service so test VMs are disposable and never shared; setting up a platform / microservices approach so a wider range of things either don't need to be touched or can be deployed without privileged or direct access, etc.

Containers (really namespaces) are one way to hit that last goal: as a classic example, if you have a web app running on port 80, everyone who deploys will need to periodically restart Apache/nginx/Varnish/etc., and that may include elevated access to debug processes running as a different user. This is by far the simplest problem in this class, and there are various ways (proxies, site users, firewall NAT rules, moving config into .htaccess or its equivalent, etc.) to reduce the number of times you have to care about it, but it still requires work to maintain and adapt code (especially with third-party apps).

Some people quite reasonably prefer to solve this entire class of problems by tossing everything into a container so it can run as if it's the only thing on the box.


Basically if you're a shit sysadmin, you don't have to bother learning anything or working hard.

This is absolutely fine if you intend to remain a shit sysadmin. Go nuts with Docker. The 21st century will be waiting for you when you're done.

So will 19 fucking 60 because nothing's fucking changed.


Isn't that chroot jails?



Wouldn't it be more akin to jails?


You might have better luck with a container-specific reverse proxy like Traefik[0] - it has built-in Let's Encrypt support with auto-renewal.

> I had a huge amount of headaches to get it actually working.

Moving from running one container to running multiple containers is probably one of the most confusing parts of getting started with Docker

There are a large array of orchestration options and tools - each with their own pros and cons: Swarm, Kubernetes, Mesos, Marathon, Mesosphere, Centurion, Rancher, etc.

Docker 1.12 now having built-in orchestration with Swarm should make this easier[1].

[0] https://github.com/containous/traefik

[1] https://blog.docker.com/2016/06/docker-1-12-built-in-orchest...


Traefik sounds great! I'm using https://github.com/SteveLTN/https-portal which is a Docker container, and that kind of setup is just complicated. Putting the reverse proxy outside sounds much cleaner.


Containers != Docker. I think if you had used LXD containers, your experience might have been different, as they work like virtual machines. I am using LXD in production and it has been a pleasure, with live migration, snapshots and good old configuration management using Ansible.


And, as probably everyone knows, Google runs everything in containers and has been using containers for a decade:

http://www.nextplatform.com/2016/03/22/decade-container-cont...

Docker may be flawed, but containers aren't. If you need some enterprise leader to tell you this instead, here are some Gartner posts showing this is the way:

VMs may be well established and "magic quadrant", but they are also on decline, and containers make better use of hardware/resources:

http://www.informationweek.com/cloud/infrastructure-as-a-ser...

It's important what you use to roll out containers and requires forethought:

https://www.gartner.com/doc/3267118/containers-change-data-c...


What really surprises me about Google is why they don't open source some of these great core technologies (MapReduce, containers etc.) instead of publishing the theory as academic papers. On the one hand, it may be a great way of promoting the creation of these tools from the ground up, inspired by the theory alone. On the other hand, Google's invaluable experience with using these technologies probably means their versions are more stable, faster, more resilient etc.




Kubernetes just got a pretty sweet update as well. The improvements to the dashboard alone were worth revisiting.


I think they've said in talks/presentations that the challenge they have with some stuff (like Borg) is that they can't really extract individual components. It's all too tightly coupled. It wouldn't be fair to ask them to open source their whole stack. The fact that they took the lessons learned and created Kubernetes or published papers on their technologies is more than enough.


Why enable their competitors to such an extent? What benefit is in it for them, when they seem to be at the absolutely forefront of this work? They have to let their researchers publish academic papers or they won't get the best researchers, but when they also have the most experience with ops and such a lead in implementation, the theory only gets competitors so far.


It's entirely possible that they consider those core technologies part of their "magic sauce". If their development and production environments are so far ahead of everyone, their development cycle will be much faster.


Still, I'm not sure why I should use containers vs. Vagrant + Ansible (or just Ansible).


There's also rkt!


I'll also add a plug for tredly.com here. "Containers done right" and pretty much all of Joyents container work.


Docker is just overhyped, deal with it. There are some nice ideas, but nothing that we couldn't do or haven't seen before. FreeBSD Jails and Solaris Zones have existed for more than ten years, and they addressed many things that Docker didn't.


Can you download a BSD Jail image from an application's website, and have it 'just work'?


Yeah, surface simplicity seems to be the heart of what drives tech fads, even if the actual cost in labor and effort is exponentially larger on the back end.

MongoDB? Yes, don't worry about data design or normalization! Just throw that stuff in there, call it anything you want! Finally programmers are free from the tyranny of decent databases. I can't think of any downsides to that arrangement.

Docker? Yeah man, you can just say "docker pull redis" and then you don't even have to bother with Ubuntu getting in your grill with all of its apt-get shenanigans. It's awesome! Now how do we make that port accessible...


I work for a Fortune 50. I can't just download anything. When some third party curates a store of containers and guarantees their safety then maybe. Until then docker, rkt and the rest are a distant dream.


If anyone would provide those, then sure, why not? I could just give you my jails with database, web server, IRC client and everything else as compressed archives; you could just unpack them, execute jail commands on them and have them running. This is pretty much all there is to it. No different than Docker, and yet, many more years used in production environments.


Don't know about BSD Jails, but you certainly could download OpenVZ images and run them.


it's called curlbashism.


No, but I do not see how it is related to my comment. Read more carefully. I didn't say that Jails and Zones did everything that Docker did, yeah DockerHub is cool and I gave them credits for that.


"nothing that we couldn't or haven't seen before"


Federation credits?


I ran into weird problems with FreeBSD Jails. Like cronjobs running twice, and `ezjail-admin console` not allocating a tty.


Docker gives you the building blocks, but that means you have more pieces to arrange and manage. Take a look at Docker Compose if you haven't already, since the Docker CLI only gets you so far when you're creating apps that consist of multiple containers.

I think the best approach for your cert issue is to abstract that into a separate service (nginx is an option, but I'd recommend the Rancher approach below). Yes, that means you have to add another container, but that's just another block in your docker-compose.yml file. Embrace the approach of separating your components into containers and organizing them as a stack. You can easily link containers together, share data volumes, and start/stop individual containers or the stack as a whole.

The problems that you're having are pretty easy to fix with some tooling. Rancher (http://rancher.com/) greatly simplifies the cert issue by allowing you to import certs and provide them to the Rancher loadbalancer service (which you can add to any stack). There's also a LetsEncrypt community catalog template that automatically retrieves and imports certificates to Rancher. There are other open source orchestrators like DCOS, but Rancher is probably the simplest to use, and it's the only one I'm very familiar with. There are SaaS options that you can look into, but I don't have experience with them.

As for the container crashes, it's trivial to automatically restart them. Just pass the --restart=always flag to the Docker run command. You can also add the flag to a docker-compose.yml file.
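
In compose terms, that's just a one-line policy on the service (placeholder image name):

    weechat:
        image: myweechat:latest    # placeholder for the IRC client image
        restart: always            # restart the container whenever it exits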


> but that means you have more pieces to arrange and manage

wait. aren't these things supposed to give us less pieces to arrange and manage?

> The problems that you're having are pretty easy to fix with some tooling

yes, of course the solution is more tools. what exactly was the problem again?


> wait. aren't these things supposed to give us less pieces to arrange and manage?

No, not fewer pieces. You'll have more pieces, but you can combine the pieces and control them individually or as a group. You can think of Legos, since you have many pieces but they all fit together in the same way.

Docker compose lets you group these containers together, and that is what ultimately makes it feel like you have fewer pieces to manage. With that, you can start/stop a stack of containers (e.g. django, nginx, postgres, redis) with a single command, but still inspect and manage each component separately. This is something you might normally do with bash scripts, but with Docker you can take that same app to an orchestration platform and run it on any host. Run it on your laptop, run it on a linux server, run it on a SaaS provider like Docker Cloud, run it on a private cloud with an orchestration platform.

> yes, of course the solution is more tools. what exactly was the problem again?

Docker is just the foundation. Nothing more. I'm fine with learning more tools because I feel that the foundation is solid. The problem is being able to ship and manage your apps. That is much, much easier for me now and I'm very glad I invested the time.


It's somewhat akin to microservices, where you split each functionality into its own service rather than having one monolith do everything.


Seriously? If something crashes continuously in production, then the solution is "just pass the --restart=always flag"? I really wonder if you guys are really using docker in prod. I would never use something like that to manage important transactions.


No, the correct solution is to debug the issue, not just reboot/restart/reinit the damn thing.

This mentality is WHY the Linux/docker/containers world's NIH syndrome is such a tire fire.


It's not one or the other. Restarting might save you some downtime if you're running a single IRC container. That doesn't mean you shouldn't find and resolve the root issue. Normally I just rollback to the previous version container if I have a recurring issue.

Windows containers are now available, if you can't stand linux: https://msdn.microsoft.com/en-us/virtualization/windowsconta...

Anyway, don't use containers if you don't want to. I'm glad I invested the time since I understand them very well and use them to my advantage. But I did have to learn a lot and experiment with a bunch of tools, and maybe that's not worth it to you.


As someone said, this is the normal behavior in the Erlang/Elixir world, and it seems to have worked extremely well for the telecom industry.

That said, my reply was to someone running an IRC server, presumably on a single server, so don't stretch my advice to a production app handling millions of transactions. Obviously the core issue is that the app crashes, and it's still up to him to fix that. This is almost certainly a problem with his app/config, not Docker itself (though it ain't perfect). If it's something that happens every 6 months, then auto restarting will probably save him a lot of problems. If your transactions are so precious, don't pass the flag- it's up to you.


Isn't this the much-lauded and respected Erlang approach to failures?


I never used Erlang, so I have no idea if it follows the same approach (although I find it quite strange). But for sure I can't afford to deploy anything like that in prod. You lose one transaction in the middle and several millions go lost. I'd rather lose a hand than try to explain to my clients that it is fine, docker just restarted by itself as expected...


Thanks, I'll have to look into Rancher. I'm already using Docker-compose.

Also good to know about automatic restarts. I don't know how I missed that, I had to read a lot of docs to get where I am.

Still, the amount of tooling that exists and the knowledge needed to pick the right ones for a given situation goes to show that this isn't as simple as packing a shipping container and letting someone ship it...


You're right, it's not quite that easy yet, but it probably will be eventually. Docker just provides the building blocks. SaaS providers like Docker Cloud will get better and continue to abstract complexity away until it really is that easy.

You don't need to use Rancher if you're just running one app. If that's all you need to do, then it could be as simple as running docker-compose on a linux server and mounting the certs into the nginx container as a host volume (https://github.com/jwilder/nginx-proxy). This is a fine approach until you want to split your containers across several hosts (redundancy or scaling) and you have several apps to worry about.
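
A rough compose sketch of that single-host setup, following jwilder/nginx-proxy's documented conventions (the proxy watches the Docker socket and routes by VIRTUAL_HOST; the hostname and app image here are made up):

    proxy:
        image: jwilder/nginx-proxy
        ports:
        - "80:80"
        - "443:443"
        volumes:
        - /var/run/docker.sock:/tmp/docker.sock:ro   # lets the proxy discover containers
        - ./certs:/etc/nginx/certs:ro                # host-mounted certificates
    webapp:
        image: myapp:latest                          # placeholder application image
        environment:
        - VIRTUAL_HOST=myapp.example.com             # hostname the proxy routes to this container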


Hey chewchew you seem to be knowledgeable on docker-compose.

What's the easiest way to get a .yml on a cloud server somewhere and let it assemble the containers for you?

Also, do you know where containers set environment variables? The official Postgres makes available something like PG_PORT_3542 and you can just refer to it from another container. Whereas I can't seem to do the same with the Redis one...


The absolute easiest is probably Docker Cloud. I've checked it out but haven't really used it much. It's still relatively immature but if you just want to deploy and forget something simple, this is probably the way to go.

If you want to use your own Linux host, then the simplest way would probably just be to SSH into the box, git pull, and run "docker-compose build && docker-compose up".

Setting environment variables is pretty easy, but I don't think you need them in this case. If you're trying to make a redis or postgres container available to another container (your app), then you can do so easily with links in docker-compose. Something like:

    myapp:
        image: myimage:0.0.1
        command: cmd to run
        links:
        - redis:redis                   # makes the hostname "redis" resolvable from myapp
        ports:
        - "80:8000"                     # host port 80 -> container port 8000
        environment:
        - hardcoded_var=my_env_value
        - var_from_host=${host_var}
    redis:
        image: redis:latest
You can then access redis from the myapp service using the hostname "redis" and the default port "6379". So, "telnet redis 6379" would work from the myapp container (assuming telnet is actually installed). The redis port isn't even publicly exposed- it's only available to myapp.

If you need to define environment variables, you can do so with an environment dict as shown above. There are a few other ways to define env vars:

https://docs.docker.com/compose/compose-file/#variable-subst...

https://docs.docker.com/compose/environment-variables/#/the-...


Thanks chewchew so much win in this reply. I use Heroku a lot and was wondering if there is a "believable" alternative using Docker.


Kubernetes allows you to upload a YAML snippet and launch it, which sounds like what you described.


I wanted to write about my experience with containers. It mimics yours. The idea of a container is beautiful and elegant. The execution, not so much. I'm more or less convinced that to reduce this complexity, we need a limited-scope ecosystem. Microsoft's .NET would probably be a good place to start, but MS, being enterprise, has a habit of turning things into a mess of configuration.


I think the container example is actually more apt than you think.

Running out of memory is a bit like running out of space in the container.

If I dropship a container on your yard, you're going to have a container but your options in what to do with it are limited.

Now if your container has a lifecycle and a scheduler because it needs to be in China on Wednesday, you suddenly have a lot more complexity.

Docker itself being a buggy piece of crap is neither here nor there in the grand scheme of things.


Someone told me that I should switch to using shipping containers instead of my current method.

Unfortunately, when my cattle arrived in the US, they were all dead. I was told containers "would just work".

After some experimentation, we managed to get our cattle in a container by building a system that correctly managed food and waste. Then we found a partner who had successfully packaged sheep and so we just used that. Unfortunately, they appeared to have a leak in their waste management system, and the container overflowed and all the sheep died. We did not have this problem when the sheep were able to fill the entire hold with their output.

Clearly the idea that shipping containers just work is a gross oversimplification. In fact containers seem to be complex thingamabobs with multiple points of failure.


Same here with docker. The ideas are nice as an application, but it's just a pain in the ass. LXC on the other hand is a breeze.


LXC is nice, as a building block, but it is far from "a breeze".

It is refreshing, however, being able to write, in a couple of hours, a setup script from scratch that debootstraps an install, chroots in there to configure it and launches the container via systemd-nspawn. Very few moving parts, excellent for development environments. Not yet something I trust in production, though.


Does LXC provide a way to automatically create new working instances (like a Dockerfile / docker-compose.yml)? I'm not quite grokking LXC in my first 10 minutes of reading...


I think you could create your own lxc template and then create new instances. The other way would be to just clone existing working instances with overlayfs.

Base lxc may be a bit of a pain. Look at lxd - I believe that makes things much much simpler.

Another thing about lxc is the unprivileged container - which I think is great for security (not sure if docker provides this).


man lxc.conf ... this is your dockerfile equivalent.


As others say Docker is probably over-hyped technology.

However, I do see it as positive, because its hype, regardless of whether it's good or not, has created traction for Go and OCaml in the data center, thus eventually leading to less C code for such use cases.

So hype or not, maybe we do get some security improvements on the overall stack.


I'm missing the initial assumption. What is the connection between Docker and traction for Go and OCaml? People are using the latter in order to simply avoid containers?


Parts of Docker are implemented in them.

So anyone that wants to improve Docker or adapt it to their distribution of choice needs to eventually use them.

For example, Microsoft did several contributions in Go for making Docker run on Windows.

The TCP/IP stack used by Docker on OS X is taken from MirageOS, written in OCaml.


Got it. Thanks.


Out of curiosity, why is less C code a good thing?


Security exploits caused by memory corruption, undefined behavior, ability to inject code, numeric overflows plus whatever is common to all memory safe languages.

https://www.cvedetails.com/vulnerabilities-by-types.php


Maintenance. A lot of the younger programmers have very little experience with running/working with C code and stack; they are a lot more comfortable with Java/Python/Go (for backend). So the less C code you have to deal with in your stack, the easier it is for deployment/debugging. Not to mention, the more modern languages also provide many features that allow pinpointing errors faster etc.


There are legitimate reasons to not want to use C, but I find around here it's mainly reflexive hate and language zealotry.

It's popular to hate on C (and C++) because the languages are so ubiquitous and long-used that a large body of terrible, insecure, and poorly written code exists using them. Other languages haven't had the same success as these two yet, so haven't had their warts exposed enough to be dumped in the "automatically hated" category. Java comes close, but it also is typically lumped in the "automatically hate it" bucket, and for similar reasons.


It was already clear in the late 70's and early 90's that C wasn't a reliable option to write safe systems.

Dennis M. Ritchie himself on the history of the language[0]

"To encourage people to pay more attention to the official language rules, to detect legal but suspicious constructions, and to help find interface mismatches undetectable with simple mechanisms for separate compilation, Steve Johnson adapted his pcc compiler to produce lint [Johnson 79b], which scanned a set of files and remarked on dubious constructions."

Lint which is still mostly ignored by the masses to this day. At CppCon 2015, about 1% of the audience acknowledge using static analyzers.

Per Brinch Hansen letter to C.A.R. Hoare in 1993a [1]

"The 1980s will probably be remembered as the decade in which programmers took a gigantic step backwards by switching from secure Pascal-like languages to insecure C-like languages. I have no rational explanation for this trend. But it seems to me that if computer programmers cannot even agree that security is an essential requirement of any programming language, then we have not yet established a discipline of computing based on commonly accepted principles."

There are many other sources of similar statements since C exists, so the hate isn't something new.

Regarding C++, yes unfortunately it inherits C flaws, but at least the community tends to embrace language features to improve the language safety and push for type based programming.

[0] https://www.bell-labs.com/usr/dmr/www/chist.html

[1] brinch-hansen.net/papers/1999b.pdf


Where is this mythical C++ community that promotes safe and auditable programs?

Whenever I'm forced to use a C++ program it's buggier than the C equivalent.


They are here:

https://isocpp.org/

http://cppcon.org/

https://github.com/isocpp/CppCoreGuidelines/blob/master/CppC...

http://erdani.com/index.php/books/modern-c-design/

https://msdn.microsoft.com/en-us/library/hh279654.aspx

http://stroustrup.com/Tour.html

http://elementsofprogramming.com/book.html

Most of the C++ bugs I found happened to be written by former C developers that disregard using C++ stronger type safety, RAII, the standard library containers and usually use naked pointers alongside malloc() and free().

And that is the biggest problem with C++, its copy-paste compatibility with C, which allowed its adoption by C compiler vendors, but made it as safe as C when developers disregard best practices.


Except for relatively uncomplicated, or relatively low level programs I have had the opposite experience.


Lint is built in to modern C compilers.


I'm doing the same with a couple of containers and my experience is pretty much the opposite. What containers are you using, if I may ask?

I'm using jwilder's nginx rproxy container, the let's encrypt helper container for that plus half a dozen Web apps on a VPS. Among those Weechat, two instances of vanilla nginx, rutorrent, dovecot, mattermost, nextcloud and a test environment for my Python tinkering. Works like a charm. Upon bringing up a new container, I supply the desired subdomain and my letsencrypt user data, the container comes up and uses SSL plus automatically renewed certificates.

My Docker experience so far - in a private, limited, very much not production environment - has been "virtualisation light" all the way.

I think of Docker as a somewhat extended chroot, to wrap my mind around it.


They're "complex thingamabobs" because our applications are. Which is exactly why we want to wrap them up and isolate them and expose as narrow interfaces as possible.

Containers are a symptom of our still immature ways of building applications.

The problems you bring up would still be there without containers, but you might not notice them until you want to bring your setup over to another machine, for example.

Or want to duplicate it somewhere else. Or want to upgrade something else on your machine that just happens to interact badly with your setup.


I never quite understood containers and this article makes them seem kind of similar to what OSs already do.

How is a container different from just installing all the dependencies along with an application? Coming from a Windows background, this is pretty common to avoid DLL hell. Nobody distributes a Windows application that requires the user to go and install some 3rd party library before it'll work.

Isolation from each other seemed like one advantage, but that's not even security strength isolation so you can't count on it to protect the host OS from malware in a container.

A claim I often see, and that's repeated here, is that containers can run anywhere. But can they really run in any more places than an ordinary application with dependencies included, or even statically compiled into it? You still need Linux and the same CPU architecture, right?


> How is a container different from just installing all the dependencies along with an application?

Once the image is built, you can get another installation that is guaranteed to be identical. You can do that with VM images too, but you cannot reasonably do that if you try to install multiple applications side by side in a single VM without further isolation - there are too many ways they can interact.

> Isolation from each other seemed like one advantage, but that's not even security strength isolation so you can't count on it to protect the host OS from malware in a container.

That's only true to the extent that they don't have a long enough track record. Many container technologies do have a decent track record when it comes to security.

But even so, there are plenty of reasons for isolation where security is not the primary motivation. E.g. making it impossible for an app to accidentally read or write files it shouldn't is in itself helpful.

> A claim I often see, and that's repeated here, is that containers can run anywhere. But can they really run in any more places than an ordinary application with dependencies included, or even statically compiled into it? You still need Linux and the same CPU architecture, right?

Try to get an application - statically compiled or not - to run across different Linux distributions, and you will see why this matters.

Needing Linux and the same CPU architecture isn't much of a limiting factor on servers. Not having to account for distribution peculiarities or version differences is a big deal.


> Try to get an application - statically compiled or not - to run across different Linux distributions, and you will see why this matters.

Done. A statically compiled application has no other dependencies.


This is the eventual future.

What most people don't realize is that a large part of Docker's value is in being a generic static compiler for languages that don't have that feature.

Pretty soon you can expect raw process support in many "container" management systems, where you just provide a Linux binary which is then run in isolated cgroups and namespaces.
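You can already get close to that today; here's a rough sketch (names are placeholders) of a statically linked Go binary dropped into an otherwise empty image:

  # Build a fully static binary (no libc dependency):
  CGO_ENABLED=0 go build -o myapp .

  # Dockerfile: nothing in the image but the binary
  FROM scratch
  COPY myapp /myapp
  ENTRYPOINT ["/myapp"]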


Does it never do DNS lookups? Open files? Do you not want to tie into process management? Logging? Do you really have no dependencies on a functioning locale? Network settings? Are you sure it won't try to exec anything?

An application with no other dependencies is exceedingly rare. Small tools, sure. Sometimes. But even then I see people making silly assumptions all the time, which makes using a container as a suitable straitjacket very useful.

E.g. I run all kinds of tools "with no other dependencies" all the time that turn out to have all kinds of dependencies when you actually try to put them in the smallest container possible.


> Does it never do DNS lookups?

Yes, it does.

> Open files?

Yes, it does.

> Do you not want to tie into process management?

I don't know what this means.

> Logging?

Yes.

> Do you really have no dependencies on a functioning locale?

What makes a locale "function"?

---

Remember that my standard C lib or whatever can be statically linked as well. At that point, I'm left with syscalls.

Docker containers depend on syscalls too; it's not like they ship with their own kernel. (If they did, they'd be VMs.)


I think this is too strong a statement. To have "no other dependencies" you would have to statically link in (1) the operating system (including device drivers) and (2) the hardware model, to be absolutely sure. Only virtual machines (possibly including the Java VM) can give you such a guarantee.


And do you believe this set of issues doesn't apply to Docker?


> guaranteed to be identical

[Citation Needed]

As far as I know, this isn't the case. That's why using Nix [0] for deployment is a much saner approach than Docker. But after installation and configuration has been done, containers are a viable technology for the rest.

[0] https://blog.wearewizards.io/why-docker-is-not-the-answer-to...


The comment you are replying to says "once the docker image is built", referring to the built layers. Those are guaranteed to be identical.

Building from scratch is not always guaranteed to be identical.
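For example (the digest here is a placeholder), pulling an image by digest always gives you exactly the same layers, whereas rebuilding from the Dockerfile may pick up newer packages:

  # Reproducible: the digest pins the exact layers
  docker pull nginx@sha256:<digest>
  # Not necessarily reproducible: apt-get/apk in the Dockerfile may fetch newer versions
  docker build -t mynginx .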


Ah okay, got this wrong :)

Yes, the pre-built images are always identical.

Nix starts solving the problem one step earlier, so building from scratch is also guaranteed to be identical.


You are correct. The issue they really fix is that it is extremely difficult to actually distribute all dependencies along with an app on Linux. If you link against glibc then you are screwed.

There are starting to be ways around glibc, like musl. And Go doesn't even use a C standard library at all - there's no way you can say that Docker is easier than just copying a single statically linked binary around. But I guess Docker came along before those solutions became really popular, and it is easier to apply to existing apps.

I think the security isolation is a side-benefit that is used to obscure the real reason for docker (distributing apps on Linux sucks).


Another solution to dependencies distribution is NixOS.


Containers also allow fine grained control over OS resources and sandboxing.

Since you appear to be a Windows dev, I advise you to have a look at how the Windows containers introduced in Windows Server 2016 work.

https://msdn.microsoft.com/en-us/virtualization/windowsconta...

https://channel9.msdn.com/Events/Build/2016/B875


Containers (and especially multi container orchestration software like docker compose or Kubernetes Pods) let you describe an application that might contain multiple processes (so a really simple example could be a web server with a database backend) in a single file and have that deployed to any system that runs the containerization software.

So to that extent it's more flexible than a single app which bundles its dependencies.

The other advantage is that you're not reliant on the software vendor or OSS project to create the package, so you as a user of the application, can create your own packages with your own customization.

As to where you can run them, yep, at the moment the image is tied to an OS and architecture, so either Linux or Windows depending on the Docker engine version, although there are moves afoot for multi-arch support on the registries (https://github.com/docker/docker/issues/15866)


See, rather than thinking of them as applications, I've always found it a lot easier to think of them as super lightweight virtual machine images.

Of course, technically they're definitely not VMs, but IMHO, given the isolation and reproducibility aspects, the VM analogy fits best.


> Nobody distributes a Windows application that requires the user to go and install some 3rd party library before it'll work.

A-hem, DirectX, VC++...?


Those things are usually included in the installer, or at least downloaded on the fly so you don't notice. So it's still self-contained from the user's point of view. Yes, sometimes they're installed globally like DirectX, so that can cause yucky interactions between applications.


Fewer than a statically linked binary, they rely on a bunch of brand new kernel APIs.


> Nobody distributes a Windows application that requires the user to go and install some 3rd party library before it'll work.

On Saturday I installed a Steam game on Windows 10, and it forced me to download and install .Net 3.5 before it would launch.

> How is a container different from just installing all the dependencies along with an application?

It's more flexible - for example, you can set up your container so you can ssh into it and do some commandline troubleshooting.


Weak virtualization, often lacking security and resource metering/prioritizing/quotas. Hurray, zombie Docker instances need a whole VM reboot yet again, still not fixed after several years. But look how fast I can deploy millions of containers without SELinux, monitoring, HIDS, SDN, billing, live migration, backup/restore/DR/data lifecycling and all the other things we just pretend not to need when throwing away sensible production VMs on Type 1 hypervisors in the name of devopsec.


The smart use of containers is using them to specify the deployment. You should use proper virtualization as the environment to deploy into. Using containers doesn't mean you throw all of that away.


Containers are often touted as this novel concept that's bound to revolutionise software development, and software delivery in particular.

The general idea isn't all that new, however. Java applications have been delivered as containers since 1995 (although the concept isn't explicitly named that way for Java applications).

Each JAR / WAR is a self-contained application that can run anywhere where there's a JVM (which is pretty much everywhere).

From a feature perspective the only real innovation of Docker-style containers probably is that those aren't limited to the JVM but are (largely) language- and runtime-agnostic.


Actually, JARs are even more compatible than that: they can run against a huge number of OSs and architectures, even in embedded systems.


> Each JAR / WAR is a self-contained application that can run anywhere where there's a JVM (which is pretty much everywhere).

A JAR can be self-contained, or it can depend on hundreds of other JARs, so it is not as straightforward as one might assume.

A WAR requires a Java application container to be deployed into, which sits outside the WAR, not inside it. Also, J2EE deployments can get pretty complicated, with their infamous use of XML for configuring everything.

The JVM is not everywhere by default, nor does it work without configuration for non-trivial applications. It is complex software that can lead to myriad classpath/version issues with JAR files if not installed very carefully.


Containers are like that as well. You can have a container for your application that doesn't really do anything unless it can connect to your database, which runs in another container.


As complicated as J2EE deployments could get, I don't see Containers being less complicated. Just a new language for managing the complexity.


The general idea behind JAR/WAR isn't that new either; we had CPAN before!


CPAN is more akin to Maven (or rather public Maven repositories) / RubyGems / NPM in that context. Perl modules are hardly self-contained. Compiling Perl module dependencies could be a real pain at times.


These hand-wavy explanations that constantly avoid explaining how things work at a low level are not adequate.

Here's a short explanation for developers with even a moderate understanding of how OSes work:

http://stackoverflow.com/questions/16047306/how-is-docker-di...
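For a hands-on feel of the namespace part (a rough sketch; needs root and util-linux's unshare):

  # Start a shell in new PID and mount namespaces:
  sudo unshare --fork --pid --mount-proc /bin/bash
  # Inside, ps only shows this shell and its children:
  ps aux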


This is a pretty good analogy for containers, but there's an unfortunate conflation of terms. With computing containers there's actually a distinction between the thing you ship code around as and the thing you run code in. The latter is the real container, while the former is called a "container image" or often simply an "image." This gets confusing quickly if you apply this analogy, since you'd assume that the thing you ship the code as would naturally be called the container.


It's amazing how slow the development community en masse has been in "discovering" containers. I remember working for a web hosting company that offered containers as a web hosting solution back in 1999. I then ran my own little host offering FreeBSD jails and later Linux containers (based on the excellent Linux-VServer project) in 2003, and I remember how, when I tried to explain to (pretty technical) people how this was way more efficient than stuff like Xen, they'd go "but it's a hypervisor..." (as if that meant "magic"). I eventually gave up on it and sold my little hosting operation because it was too much work and not enough money; it looks like it was about 10 years ahead of its time.


I think the same. I've been deploying containers, or things that look like today's containers, for the past 12 years. I even wrote a tool that manages containers with LXC and looks a lot like Docker, but three years earlier.


There were many solutions very similar to Docker, but you have to agree that Docker got some essential things right - union filesystems (e.g. OverlayFS), a standardized way of building containers (Dockerfiles) and the right timing.


I'm only a beginner, but the analogy that makes sense to me is that containers do for app deployment what npm does for Javascript development. That is, the magical part isn't that Docker simulates an operating system and so on - the magic is that it allows a chunk of logic to precisely declare its dependencies - including on other pieces of logic which declare their own dependencies - and then Docker knows how to (in theory anyway) run the logic in such a way that its dependencies are all satisfied.

And of course the meta-magic is then that there's a public registry of (in theory) solved problems, which one can build on top of by declaring dependencies against them.

I have a pet theory that this "declared dependencies + dependency wrangler + public registry" is a general formula which will keep cropping up as we find new places to apply it.


Not really - Linux package management systems do exactly that.

The magic - if there is any - is in combining it all together: separation, discovery, relatively easy packaging and dependencies.


Sure, at a different level. Package managers at the app level, docker at the deployment level, npm at the development level, or something vaguely along those lines.

I wasn't suggesting this was unique to Docker - precisely the opposite, that it's a generally useful pattern, being applied here to deployment.


To me it's more like lightweight virtualization: processes inside containers have no clue that there are other processes running in other containers alongside their own, all without eating too much memory (at least well under the amount that virtual machines would use).

Added benefits:

- the 'host' OS can be very light and tailored to run Docker and nothing more (cf. CoreOS),

- we can design orchestration software that handles container operations across a herd of lightweight hosts (cf. Kubernetes).

The dependencies/recipes thing is more like a tool that enables those higher goals, but again, that's just my point of view.


> processes inside containers have no clues that there are others processes running in others containers alongside its own

Except, of course, if you're dealing with privileged containers. Docker [1] gives you detailed control over what to share between container and host, and what to isolate (with the default being more rather than less isolation).

For example, I'm currently working on a container that mounts disks in the host's mount namespace. In that case (and many others), the selling point of Docker [1] is not the isolation, but the deployment story.

[1] Or any other container runtime. I'm saying "Docker" because that's the one I'm familiar with.


For the love of god - forget Docker, use LXC containers - it's simple, secure, comes with its own init and cron, and you don't need to do somersaults to achieve simple tasks. It builds on what's included in the Linux kernel. Your own isolated Linux system. We've used LXC in production for over three years, and we have over 3000 containers. No issues whatsoever.


You're right that LXC containers have a similar API compared to Docker, but I think developers often underestimate the benefit of the community around a certain technology.

Docker has significantly better documentation, extensions, package management tools, and third-party integrations. Overall, Docker has a far more robust community than LXC or closer competitors like Kubernetes, and those features are just as important to developers as the API.


The point of LXC is that you get a full-blown standalone Linux rather than a single process - this simplifies everything a lot, meaning you don't need that much documentation about it in the first place.


Can you clarify what "full-blown standalone Linux" means? It sounds like running a separate kernel, but since we're talking containers rather than VMs, this can't be it.


It is a shared kernel with separate userspaces.

It uses namespaces (network, PID, user, ...) and cgroups to separate those userspaces from each other.

I have a community server running Debian with 10+ LXC containers, in which people are given normal root access, one container per user.
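If I recall the CLI correctly (a sketch; distro, release and container names are just examples), creating one of those per-user containers looks roughly like:

  lxc-create -t download -n user1 -- -d debian -r jessie -a amd64
  lxc-start -n user1 -d
  lxc-attach -n user1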


So it's the same as with Docker.


Encouraging to hear. Who do you work for, who has these 3,000 LXC containers in production use? And I'm curious, what orchestration system do you use to manage them? Can you outline your toolset?


We use ansible and bash scripts for orchestration.


If you'd care to go into more detail, or could point me to your technical docs, I'd be very interested.

How, for example, do you handle roll-out, destruction, monitoring, backups, network configs, secrets, etc.? Is there any degree of automation? And why not take advantage of existing orchestration solutions - was nothing mature enough for your needs? How big is the team managing these 3,000 containers, and what sort of traffic are you handling?

Asking out of genuine curiosity. I'm keen on understanding how plain LXC can be used robustly in a production environment.


Do you have any examples of what LXC does better than docker? I'm very new to the whole containerization thing but I've already come across a couple of the issues you've mentioned.


Shameless copy-paste from a well-written piece by Flockport:

Docker restricts the container to a single process only. The default docker baseimage OS template is not designed to support multiple applications, processes or services like init, cron, syslog, ssh etc.

As we saw earlier this introduces a certain amount of complexity for day to day usage scenarios. Since current architectures, applications and services are designed to operate in normal multi process OS environments you would need to find a Docker way to do things or use tools that support Docker.

Take a simple application like WordPress. You would need to build 3 containers that consume services from each other. A PHP container, an Nginx container and a MySQL container plus 2 separate containers for persistent data for the Mysql DB and WordPress files. Then configure the WordPress files to be available to both the PHP-FPM and Nginx containers with the right permissions, and to make things more exciting figure out a way to make these talk to each other over the local network, without proper control of networking with randomly assigned IPs by the Docker daemon! And we have not yet figured cron and email that WordPress needs for account management. Phew!

This is a can of worms and a recipe for brittleness. This is a lot of work that you would just not have to even think about with OS containers. This adds an unbelievable amount of complexity and fragility to basic deployment and now with hacks, workarounds and entire layers being developed to manage this complexity. This cannot be the most efficient way to use containers.

Can you build all 3 in one container? You can, but then why not just simply use LXC which is designed for multi processes and is simpler to use. To run multiple processes in Docker you need a shell script or a separate process manager like runit or supervisor. But this is considered an 'anti-pattern' by the Docker ecosystem and the whole architecture of Docker is built around single process containers.

Docker separates container storage from the application, you mount persistent data with bind mounts to the host (data volumes) or bind mounts to containers (data volume containers)

This is one of the most baffling decisions, by bind mounting data to the host you are eliminating one of the biggest features of containers for end users; easy mobility of containers across hosts. Probably as a concession Docker gives you data volumes, which is a bind mount to a normal container and is portable but this is yet another additional layer of complexity, and reflects just how much Docker is driven by the PAAS provider use case of app instances.


> Docker restricts the container to a single process only.

This is definitely not true. I'm running syslogd inside a container (next to the actual process) without any trouble.

> ssh

I'll take `kubectl exec` over SSH any time because it's a much more plausible way to handle credentials. Also, it does not require an always-running daemon inside the container, which reduces the TCB and the memory footprint.

> Take a simple application like WordPress. You would need to build 3 containers that consume services from each other.

It's not required, but it's a good practice to take advantage of the capabilities of your container orchestration software of choice.

> a MySQL container plus [...] separate containers for persistent data for the Mysql DB

Why would you need a separate container for data? The thing you're looking for is a "volume" (in the simplest case just a bind-mount from the host into the container, as you even explain further down).
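In the simplest case (paths and image tag here are just examples) that's a single flag:

  # Persist MySQL data on the host via a bind mount:
  docker run -d --name db \
    -e MYSQL_ROOT_PASSWORD=secret \
    -v /srv/mysql-data:/var/lib/mysql \
    mysql:5.7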


Distributed storage is still a big issue for sure. There are some options, but none are ideal. One option is to map to host and use NFS to share across hosts. Another option is to use something like Convoy or Flocker, which come with their own complexities and limitations. Hopefully more progress is made on this front.

As for the wordpress app and other issues mentioned, it's actually very simple:

    nginx:
        build: ./nginx/
        ports:
            - "80:80"
        volumes_from: 
            - php-fpm
        links:
            - php-fpm
    php-fpm:
        build: ./php-fpm/
        volumes: 
            - ${WORDPRESS_DIR}:/var/www/wordpress
        links:
            - db
    db:
        image: mysql
        environment:
            MYSQL_DATABASE: wordpress
            MYSQL_ALLOW_EMPTY_PASSWORD: "yes"
        volumes:
            - /data/mydb:/var/lib/mysql
This isn't a "production" config, but that wouldn't look that much different. The real beauty is that I found this compose file with a simple search and very easily made minor tweaks (e.g. not publicly exposing the mysql ports).

You might run into permissions issues if you use host-mounted volumes, but I have not. Normally I prefer to use named volumes (docker-compose v2) and regularly back the volumes up to S3 using Convoy or a simple sidecar container with a mysqldump script.
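For reference, a named volume in a v2 compose file looks roughly like this (a minimal sketch; service and volume names are placeholders):

  version: '2'
  services:
    db:
      image: mysql
      environment:
        MYSQL_ALLOW_EMPTY_PASSWORD: "yes"
      volumes:
        - dbdata:/var/lib/mysql
  volumes:
    dbdata: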


This is interesting. I'd been considering mounting drives for persistence of stateful data from containers.

Let's say I want to run a Wordpress hosting service. In my ideal world, I deploy an "immutable" container for each customer, i.e. everyone gets an identical container with Wordpress, Nginx, MySQL etc. So what to do with state info, like configs and the MySQL data files? I'm thinking of mounting a drive at the same point inside each container e.g. /mnt/data/ and /mnt/config/ or similar.

This way the containers can all be identical at time of deployment, and I can manage the volumes that attach to those mount points using some dedicated tool/process.

This is all still on the drawing-board... but what you've said here seems to suggest this approach should work. Or have I optimistically misinterpreted what you've said? :)


Yes that's a pretty good approach. Just organize the configs in a directory structure on your host and mount them as volumes (along with any other necessary volumes for e.g. uploaded media). There are more advanced methods like using Consul/etcd, but only go that route if you're ready to invest a lot of time and need the benefits.


In your example -- assuming 20 different blogs/customers -- you'd be running 20 separate instances of MySQL (plus 20 nginx instances plus 20 php-fpm instances plus ...)?

Now, let me first say that I haven't come anywhere close to even touching containers and most of what I know about them came from this HN thread so please forgive me if I'm missing something...

I, personally, would rather only have a single MySQL instance -- or, in reality, say, a few of them (for redundancy) -- and just give each customer their own separate database on this single instance.

With regard to containerization, why is all of this duplication (and waste of resources?) seemingly preferred?


You're quite right, of course.

In my scenario, I want to provide a package for easy download and deployment. Each customer will indeed run their own mysql db, if they choose to self-host the containerised software.

I plan to offer a paid hosting service, where I'll rent bare metal in a data centre, onto which I'll install management and orchestration tools of my choosing.

An identical container for any environment is my ideal, since this will make maintenance, testing, development etc simpler. Consequently each customer hosted in my data centre will, in effect, get their own mysql instance.

This way the identical software in each container will be dumb, and expect an identical situation wherever it's installed.

Now, in reality, I may do something clever under the hood with all those mysql instances, I just haven't worked out what yet :)

Actually it will probably be Postgres, but I'll use whatever db is most suited.

So yes, some duplication and wasted disk space, but that's a trade off for simplified development, testing, support, debugging, backups, etc.


In this case, a single mysql instance with individual databases may indeed be the best approach. It'd be very easy to launch a mysql container and have each wordpress container talk to it. I use Rancher for orchestration, and it automatically assigns an addressable hostname to each container/service, so I'd just pass that to each wordpress container. Or you could expose a port and use a normal IP + port.

The duplication is preferred because you can take that stack and launch it on any machine with Docker with one command. Database and all. Usually that's great, but it'd be very inefficient in this case.


> Docker restricts the container to a single process only.

No, there is only a single process treated as init in the container, but you can spawn off multiple child processes.

> The default docker baseimage OS template is not designed to support multiple applications, processes or services like init, cron, syslog, ssh etc.

If you want init, cron, syslog, ssh, and your app(s) all rolled up into one, you want a VM, not a container.


> No, there is only a single process treated as init in the container, but you can spawn off multiple child processes.

It was extremely clear that the person who wrote the text you are replying to understands this as they specifically cover this fact with respect to using a service management daemon: you are just being pedantic with the wording to complain about this :/.

> If you want init, cron, syslog, ssh, and your app(s) all rolled up into one, you want a VM, not a container.

No: a virtual machine would burn a ton of performance as it would also come with its own kernel. The entire premise here is to be able to share the kernel but split the userspace in a sane way.


You mean the way the parent('s quote) needlessly broke down an application into single-process containers, and finally breezes by "actually, you can" in order to spruik LXC instead, because 'multi-process'? Or the way the parent complains about not having init, but then says you can use something like runit?

I don't particularly like Docker and led my company's exodus from it, but the parent is being very slanted in their wording.


Doesn't Docker use lxc underneath? [1]

[1] http://unix.stackexchange.com/a/254977/152994


Not for a few years now.


I guess I'll never get it. Don't most OSs already run processes isolated from each other, have advanced process scheduling mechanisms and manage access to hardware resources? Also with static linking nothing stops you from creating huge binaries that "will run anywhere".


From my perspective, where I'm planning a hosting service with multiple customers, containers promise to allow me to slice machines into dedicated chunks. VMs could do this too, but they'd consume far more disk-space, which would mean my hosting costs would be comparatively higher.

Containers promise to allow me to limit the resources each of my customers can consume without killing the entire box. For example, I can limit RAM, CPU and disk-space per customer. If one customer goes rogue, the shared box remains performant for my other customers. Also there's the data protection: in theory customers would not be able to access each others' data, even if they tried to.

There are other considerations as others here have pointed out. This is just my primary concern at the moment.


All of your requirements can be satisfied with properly configured VMs.


Including total disk space consumed on the host? I can't see how. Containers are by definition far lighter.


It depends on what you build into your containers. Your customers will need storage at some point, unless they're just using the containers as processing engines. You'd also be surprised at how lean you can build a VM with Linux (or one of the BSDs) as the OS.


In containers a few more things are virtualized. The file system is semi-virtualized. Network ports are too. So from the point of view of stuff inside the container, nothing else is running. That's not true of processes in general. From outside the containers you can then choose how to map parts of the virtual file systems to parts of the real file system, and which real network ports the virtual ports connect to, etc.

There's more to it. Containers aren't one process; they're as many processes as are launched in that container.
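A quick illustration of that mapping (paths here are just examples):

  # Map host port 8080 to the container's port 80, and a host directory into the container:
  docker run -d -p 8080:80 -v /srv/www:/usr/share/nginx/html:ro nginx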


So let me try to understand this from a different angle: what's something that a VM can and does do that container software like Docker can't? TFA makes it sound like legacy systems are the only place for VMs anymore, but I'm guessing that's probably approximation + exaggeration.


In my (limited) understanding: host OS and guest OSes can be completely unrelated. Whereas in containers, the host and guests share the kernel.


Imagine you run 3 different Ruby apps (or whatever other language) and they all run on different versions of Ruby. Containers let you easily isolate the whole stack, including the specific Ruby version, and install only the packages that app needs in its own container. Of course it's still possible to do this without containers; my company handles it by building our own custom RPMs for each Ruby version and sticking them in /opt.
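A rough sketch of what that per-app isolation looks like (image tag and commands are just examples):

  FROM ruby:2.3
  WORKDIR /app
  COPY Gemfile Gemfile.lock ./
  RUN bundle install
  COPY . .
  CMD ["bundle", "exec", "rackup", "-o", "0.0.0.0"]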


Not with a private root filesystem, a private loopback network, additional security isolation, no requirement for static linking, no.


Docker is the best thing since sliced bread! VMs are much more resource- and time-consuming.

# We switched our dev/stage env to containers 2 years ago.

# We have made our own standalone app Docker-style. Once again - easy as pie.

If you are a developer, you should add Docker + Docker Compose to your working tools.


We moved our development process over to Docker some time ago and it really sped up our work.


> The way virtual machines work, however, is by packaging the operating system and code together. The operating systems on the virtual machines believes that it has a server to its own, but in reality, it’s sharing the server with a bunch of other virtual machines — all of which run their own operating systems and don’t know of each other. Underneath it all is the host operating system that makes all of these guests believe they are the most important thing in the world. You can see why this is a problem. The guest virtual machines basically run on emulated servers, and that creates a lot of overhead and slows things down (but in return, you could run lots of different operating systems on the same server, too).

Wait a second. Isn't this the whole point of hardware virtualization support? So that hypervisors don't have all those VM overhead slowing things down?


On VM's: "Underneath it all is the host operating system that makes all of these guests believe they are the most important thing in the world. You can see why this is a problem."

Two paragraphs later (On containers):

"The only operating system on the server is that one host operating system and the containers talk directly to it."


This is not quite related to Docker/Kubernetes, but a phrase in the article triggered memories about a very good book that I think everyone should read - The Box by Marc Levinson. It really is a fascinating book.


Definitely "not quite", as it is about shipping containers, but an interesting read.


The reason people don't get the advantage of Docker is that there is a weird (and in my opinion stupid) taboo against putting all of your deps in one container. This is about 10-100x easier than trying to compose a bunch of containers.

Not everyone can do that, but plenty of people could. Except they don't, not because of some actual ops requirement (in many cases) but because they don't want someone to say they did it wrong.

I am assuming this situation has actually changed by now, I hope, and swarm/compose or whatever is built in is not too hard to use?


No, it's worse now. There's also all the PKI required, compounding the pain and suffering.

Phusion baseimage-docker seems to help.

Worth noting Atlassian stepped back and does a scale-out approach using regular cloud instances instead of container orchestration technology. https://www.atlassian.com/company/events/summit/2016/watch-s...

Also worth noting you can take away all the PKI and registry pain and have instant scaling with the registry by using a cloud store for the registry and having a Jenkins gatekeeper be the only writer; all dev and production nodes use the same store read-only.


> The promise behind software containers is essentially the same. Instead of shipping around a full operating system and your software (and maybe the software that your software depends on), you simply pack your code and its dependencies into a container that can then run anywhere — and because they are usually pretty small, you can pack lots of containers onto a single computer.

Already got it wrong. Current containers are exactly the OS and the kitchen sink for running 'printf("hello world")'.


It's up to you to make your containers bloated or keep them slim. You can use the alpine versions of the official Dockerhub images. Python on Alpine is 30 MB (vs 267 MB for the debian one). https://hub.docker.com/r/library/python/tags/

You can create containers that are just a few MB with compiled languages like Go (5 MB). https://www.iron.io/microcontainers-tiny-portable-containers...

From the article: "Rather than starting with everything but the kitchen sink, start with the bare minimum and add dependencies on an as needed basis."


If your app is a compiled go binary (so, it runs anywhere) why do you need a container?

The whole point of containerisation is to group installed dependencies (as opposed to installable dependencies like with a regular deb or rpm package) and configuration into a 'black box'.

If your binary is already a single-file distribution, why lump it in with the crapfest that is docker?


I don't think that's the whole point. If all you care about is packaging your code in a container, then you don't need Docker. That was solved long ago. Docker simply adds a nice API on top which allows you to package, ship, and manage your apps in the same exact way.

By using something like Docker Compose, you can explicitly define each container and the relationships between the containers. Then you can start/stop groups of containers (an application stack) while still retaining some component isolation and the ability to upgrade and scale containers independently. All of this is defined in a relatively simple YAML file, which can be committed to VCS and tweaked. I can't tell you how awesome it is to find repos on GitHub that use docker-compose. Even complex apps with many pieces can often be launched with a single command. It's easy to take these stacks and tweak them to your needs, adding/removing/changing components as necessary.

Since Docker provides a standard way for managing any container, orchestrators like Swarm, Rancher (my preference for small-medium clusters), and DC/OS can take that functionality and scale it across many hosts. You can get a birds-eye view of your Docker cluster (all apps) or drill down into individual apps and their components. Each container is a uniform piece and can be controlled, scaled, updated, audited in the same way. Throw a UI in front of it and now you can manage applications that you know nothing about. That's great for developers that manage just a few apps, and it's incredible for enterprises with thousands of them. If you don't want to think about infrastructure at all, then you can use a SaaS Docker provider. Obviously there are pros and cons to each approach, and there are some remaining challenges.

Docker isn't perfect, especially in regards to storage and networking. Distributed storage isn't simple, but a lot of progress has been made with volumes and volume drivers. It's not as easy as it needs to be, but the general direction seems to be good and with the proper tooling I think this will be less of an issue.


If "you" here is one person and "your app" is one app, then yes, why do you need a container. If "you" is 200 developers, and "your app" is 75 different applications written in Java, Java plus native libs, .NET, python, the other python, R, scala, NodeJS, and various C libraries, then docker containers, and more importantly images, are about as ideal as it gets. We're running Mesos so we don't have to use docker to get containers, but packaging up the products as images is a significant advantage.


Both the post I replied to and my post explicitly talk about a go app, nothing else.


Alpine containers are nice if you're just looking at size. But they break down once you `docker exec` into them to try to debug something:

  $ docker exec -ti mycontainer /bin/bash
  stat /bin/bash: no such file or directory
  $ docker exec -ti mycontainer /bin/sh
  / # curl https://localhost:5000/
  /bin/sh: curl: not found
  / # strace $command
  /bin/sh: strace: not found


Takes about 2 seconds to fix:

  $ docker run -ti alpine /bin/sh
  / # apk add --update curl
  fetch http://dl-cdn.alpinelinux.org/alpine/v3.4/main/x86_64/APKINDEX.tar.gz
  fetch http://dl-cdn.alpinelinux.org/alpine/v3.4/community/x86_64/APKINDEX.tar.gz
  (1/4) Installing ca-certificates (20160104-r4)
  (2/4) Installing libssh2 (1.7.0-r0)
  (3/4) Installing libcurl (7.50.3-r0)
  (4/4) Installing curl (7.50.3-r0)
  Executing busybox-1.24.2-r9.trigger
  Executing ca-certificates-20160104-r4.trigger
  OK: 6 MiB in 15 packages


Hitting dl-cdn.alpinelinux.org repeatedly for simple things is probably not nice. Is there an easy way to have a local alpine mirror?


I'm not suggesting this would be done repeatedly. Just when you need to debug something.

Nonetheless: https://wiki.alpinelinux.org/wiki/How_to_setup_a_Alpine_Linu...


That's not really broken. It is just a minimal image. If you want more packages, then install them via the Dockerfile before you fire up the image. I like starting with a bare container. For example, I might take an Alpine image, install OpenVPN and use it to serve an always-on VPN connection to other containers. The VPN container doesn't need a shell or anything else, really. The only thing I want to see when I attach to the container is what OpenVPN is spitting out.
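Something along these lines (a sketch; the config path is just an example):

  FROM alpine:3.4
  RUN apk add --no-cache openvpn
  COPY client.conf /etc/openvpn/client.conf
  ENTRYPOINT ["openvpn", "--config", "/etc/openvpn/client.conf"]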


Well yeah, "break down" was probably too much of a term. I did not mean to diss Alpine; I just meant to point out that the downsides of minimal images should be carefully considered.


Plain and simple: containers violate the KISS principle. While you are developing reams of domain knowledge working with this turd of a technology, others are making progress on non-self-created problems that matter in the real world. This is another case of the tech world developing pointless tech that can be ignored by those solving concrete real-world problems.


Putting a little bit more thought into naming things can help newcomers understand these concepts more easily. The word container is too generic; why not call it an 'application jail' or a 'virtual operating system'?

Another word that is over used in our field is 'context'.


They were called "jails" long before "containers" became a thing on HN. Somehow "jails" didn't stick, so I'd assume it was even more confusing for newcomers.


Jail had no marketing hook. It's a very negative word. I'd guess that's probably the major reason.


Well that and it wasn't done by a VC backed firm with strong press connections.


> Somehow "jails" didn't stick

I hear chroot jail tossed around a fair amount.


Jails is typically the FreeBSD name for a very similar concept. Linux "copied" (using the term very loosely) the idea, but not the name.


I was running a couple of jails on my FreeBSD server 15 years ago.. Worked like a charm.. But eventually I switched to linux.


It's based on the idea of container ships where the contents of every container can vary widely but the external interface for dealing with the containers is always identical. It's a pretty apt metaphor and not too generic, IMHO.


Guys, you really should take a look at hyper.sh if you are frustrated with running containers in production.


The analogy in the article is inferior.

Here comes the standard HN automobile analogy.

The car makers eternally release little light weight cars that are fattened up with cruft until they're as heavy and expensive and complicated as the dinosaurs they were originally meant to replace. At which time car makers birth a new little stripped down lightweight simple compact car line. That design pattern is self similar and fractal in that even car engines and transmissions and car radios undergo a similar pattern of revolution to make something new, then evolution to slowly and methodically make it the opposite of the original goal, repeat forever. This design pattern also applies to computer architecture.

To fight the problem of hardware evolution making life pretty hard on programmers, source code compatible OS and libraries were invented that could run the same software on vaguely similar hardware, on mainframes in the late 60s at IBM and on PCs in the 70s CPM era, later extended into the "home computer" series era in the 80s, then into the msdos era. This became unwieldy and too complicated for the end users so it was replaced.

The same people implemented the idea of OS packages, again more or less in the 60s on IBM mainframes or the 80s on early unix boxes. The idea is to compile emacs to be integrated very deeply into the OS and cooperate with every other piece of software. This is contemporary. However especially in NON-FOSS it doesn't work and doesn't scale and is very expensive if not impossible to implement, being the only closed thing on an open system is a nightmare for everyone and everything involved. The stereotype in the 90s was only one service installed on one MS windows server, even if that meant it took 20 MS servers to do the job of one unix server. Anyway, expensive, complicated, hard for end users.

Again the same people implemented the idea of OS virtualization. Again, IBM mainframes in the 70s with VM, and early PC hardware experiments with TSRs and multiprocessing in the 80s to give "two computers at once". This is also contemporary, enormously more advanced today, of course. It turns out that running 50 OS kernels on a single piece of hardware is kinda wasteful although possible and cooperation gets complicated and unwieldy so time to replace again.

Again the same people implemented process jails / chroot on the BSDs, and eventually, after some decades, Linux caught up, resulting in Docker. So now your packages don't cooperate or interoperate at all with each other or the OS, which solves all the problems where the previous technology didn't work, and creates massive new problems that never existed, mostly where the old technologies worked great. There are of course completely separate new problems. It turns out that a system designed to eliminate interoperability between packages makes interoperability a huge PITA, and there are other problems that make use unwieldy and complicated, hard for end users.

Again the same people implemented (this section to be written after 2016). Maybe IOT. Maybe collapse of hardware prices faster than business demand means 20 rasp-pi cluster is cheaper and easier to maintain than a single beefy desktop running 20 virtual images or 20 docker containers. Maybe FPGA on the desktop means people will just synth up whatever matches this hours workload. Maybe cloud will finally work and no one will maintain servers anymore, it'll all be magic, or at least push the magic to someone else who now has the same old problems. Maybe SaaS means we'll all be customers and most productive software will run on internet scale clusters not individual machines, individual machines will be dumb webbrowser terminals. Who knows!


IBM and Sun know.


lol. read two paragraphs => click author's name to verify he's got no actual tech background. life is boring.


The author has been writing for TechCrunch for a considerable amount of time.


I have been singing in the shower for a considerable amount of time. Still no singing expertise.

In both cases it is immediately audible/visible in the output.


Hi there, good day fellows

I was expecting this kind of thread long ago, thanks guys for sharing your concerns, I am learning a lot from them!

IMO we can't compare containers vs VMs like many are doing now - and I was too when first heard about Docker and containers.

I hold almost all VMware certs (VCP/VSP/VTSP for 3.1/4.x/5.x and VCDX), and I signed off on the 3 major VMware P2V migrations in BR/LA (287 P2Vs in 2005, 1600 in 2008 and 2500 in 2011). I was REALLY into VMware from 2000 to 2010, so I feel confident using it and recommending it for many environments. I still manage some of those environments today.

When we clone or migrate a physical machine to virtual, or whenever we deploy a VM from scratch (or even from templates, or by copying VMDKs, etc.) into production, we build the environment to last "forever". We want this to be flawless, because no matter which major virtualization platform you deploy on (Hyper-V, VMware, Xen, KVM, VirtualBox+Vagrant, etc.), nobody wants to troubleshoot a production environment; we cheer to see the VMs always up and running. I remember doing P2Vs on the projects I mentioned during the night and having to fall back to the physical servers because the cloned VM didn't behave correctly. Please VMware, clone it right, otherwise troubleshooting legacy shit will be a pain.

On the other hand, containers are made to be replaced. They are impermanent. You can tune your images to your app/env needs. You can have an image for each service you run. You can have many images running many containers, some of them providing the services you offer. You can describe an image in a text file called a Dockerfile and generate the image from that file.

So imagine we have this infrastructure to "Dockerize": a website with a DB. Does your webserver run Apache? Then you can write a Dockerfile that deploys an Apache instance for you. It could build FROM ubuntu:16.10 or 10.04, depending on what is better for your app. Or we could just pick up Apache's own image, as in FROM apache. You can save this image as yourid/apache. And you can do the same regardless of which DB you are using: just install it (e.g. MySQL via apt-get in a FROM ubuntu/debian based image) or use the mysql image directly. You can publish the website by cloning your site's dev repo directly in the Dockerfile itself, or you can keep the website in some directory on your host and use ADD or COPY to make it available in the right container dir (e.g. /var/www/). You can even use volumes to mount some host dir (or a blob, a storage account, a partition on your storage, anything available on the host itself). This is especially interesting for DBs in my opinion. Once the DB Dockerfile is ok, you can name that image yourid/db.

And if you have a main website, and a blog website, you could use the same apache Dockerfile changing only the git clone line, and save them as yourid/apache:website and yourid/apache:blog for example.

And when redeploying the container, you will have the same volume data available in the same container dir, even if you moved it from ubuntu:15.10 to ubuntu:16.10. You can pick up the latest improvements from the freshest image (or patch some vuln) and redeploy all the containers that use that same image at once.

The same goes for anything else: you can test Jenkins without knowing what OS the jenkins image is built from. You don't have to worry about it. It will just work. You pull the image, run the container and voila.
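For instance (a sketch using the official image of the time):

  docker run -d -p 8080:8080 --name jenkins jenkins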

NOW, my Docker instances look like this: I use docker-machine, and locally I have the default and local envs. I also have an Azure instance in docker-machine (that runs on Azure), and another instance configured in Docker Cloud using Azure as the cloud provider (I use Azure due to BizSpark credits). So, 4 of them. All those instances are VMs themselves; Ubuntu VMs, to be precise.

You just replace the container (probably published in a cluster, if you care enough about your prod env). Not the same as with VMs at all.

I see Docker as a hypervisor for containers the same way VMware and Hyper-V are hypervisors for VMs.

So my Docker host VMs still have the same HA, failover, load balancing, resource allocation and all the other facilities VMs have. I use Docker on those VMs for easy deploys and for building and tuning images; really guys, I was the VMware guy for so long, and I went just crazy over the capabilities Docker gives us.

Docker has many weak points indeed (NAT networking, having to run privileged containers sometimes, security concerns, etc.), but again, I don't see Docker erasing VMs from my life, as if from now on I could deploy everything in containers and it would run happily forever. We still need HA, failover, load balancing, resource allocation and so on. Docker needs to be used together with tools that allow it to run smoothly and allow us to maintain our environments more easily.

One of those tools is a container cluster. I work mostly with Google Kubernetes, but there are others, such as Docker Swarm, Apache Mesos, DC/OS... Azure has its Azure Container Service (ACS), IBM has its Bluemix containers, etc. Using a cluster and a deploy methodology, you can deploy your containers into different namespaces such as DEV / STAGING / PROD. You can use a BUILD environment to read your Dockerfile, build the image and deploy containers to the namespace you need. You can configure this build to trigger on a commit to the git repo, for example.

So let's say we have a developer, and he needs our yourid/apache:website deployed with the new website version. If the website is already updated in your git repo, you just clone it. The Dockerfile would look like this:

  FROM apache
  MAINTAINER Your Name <your@email.com>
  WORKDIR /var/www/
  RUN git clone https://github.com/yourid/website/website.git .
  EXPOSE 80
  CMD ["/run.sh"]
This would be named website.Dockerfile. If you change the git repo to any of your other sites that run on Apache, you can save it as other.site.Dockerfile and always deploy that service from that specific repo.

You can customize your Dockerfile, of course, and add support for specific stacks by installing PHP, Ruby, Python, etc. You could even use configuration managers (CMs) such as Ansible, Salt, Chef, Puppet, Juju, etc. to apply those changes.

Let's say we start the build now. We are developing this image together, so I just changed the git URL in the Dockerfile. When we commit, the autobuild triggers in our build system (in my case, Docker Cloud or Jenkins). This is what Continuous Integration (CI) and Continuous Deployment (CD) are about.

So when the build is triggered, it gets the Dockerfile from the repo, builds the image and deploys the container into the namespace you wanted (in our case, DEV). This service could be published as website.dev.mydomain.com, for example; the same goes for the staging namespace, and as www.mydomain.com when it's ready for production, in the PROD Kubernetes namespace. Kubernetes is a distributed thing, so you could have minions (nodes) split across different datacenters or geo locations. This pretty much reminds me of VMware VMs running on VMFS storage made available through a set of ESXi servers, all with access to the same LUNs/partitions/volumes.

This is just my point of view, so please feel free to comment and ask me anything.

Please, just don't blame Docker because you aren't aware of the mainstream tools available nowadays. If you are comparing Docker to VMs, or SSHing into the containers, and often mad because your data vanished when you redeployed your Docker containers, believe me, you are doing it wrong.

Being a pioneer is always the same: in the 90s we had to explain why Linux was good for the enterprise, in the 2000s we had to prove VMware really could cluster your legacy systems, and now we have to explain what's possible with Docker. The tech is new (I know there were Solaris Zones, Google Borg, etc. before), but I expect Docker to mature its features by relying on other tools (and even by copying features, e.g. from k8s into Swarm). Docker is just one skill needed to run your stuff.

Cheers!

M Matos https://keybase.io/mmatos


'WTF' really?


Take an operating system. Remove all the advantages of a shared environment like dynamic libraries, package management, clarity. Stick a chroot before every fork. Boom! Containers.


I've read before on HN that you shouldn't 'use docker in production'; can anyone elaborate on that for a newbie?


My understanding is that plenty of people do actually use Docker in production, but they use more than just Docker. A docker container isn't quite the same as a VM.


There is a lot of infrastructure around security and availability that isn't yet achievable with off-the-shelf software. So ideally you use containers to define dependencies and deployment, and deploy into a proper environment.



