
I do a lot of Python and a lot of Docker. Mostly Python in Docker. I've used both Alpine and Ubuntu. There is a fair amount right about this article, and a lot wrong.

First "the Dockerfiles in this article are not examples of best practices"

Well, that's a big mistake. Of course if you don't follow best practices you won't get the best results. In these examples the author doesn't even follow the basic recommendations from the Docker Alpine image page. Ex, use "apk add --no-cache PACKAGE". When you leave the apt and apk caches in the image, of course it's going to be a ton larger. On the flip side, he does do basically exactly that cleanup for Ubuntu's apt cache.
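For reference, a minimal sketch of the two idioms being compared (package names are just examples):

    # Alpine: skip writing the apk index cache into the layer
    RUN apk add --no-cache gcc musl-dev

    # Debian/Ubuntu: install and clean up apt's cache in the same layer
    RUN apt-get update \
        && apt-get install -y --no-install-recommends gcc \
        && rm -rf /var/lib/apt/lists/*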

The real article should have been "should you use alpine for every python/docker project?" and the answer is "No". If you're doing something complicated that requires a lot of system libs, like say machine learning or image manipulation - don't use Alpine. It's a pain. On the flip side, if all you need is a small Flask app, Alpine is a great solution.

Also, build times and sizes don't matter too much in the grand scheme of things. Unless you're changing the Dockerfile regularly, it won't matter. Why? Because Docker caches each layer of the build. So if all you do is add your app code (which changes, and is added at the end of the Dockerfile) - sure, the initial build might be 10 minutes, but after that it'll be a few seconds. Docker pull caches just the same, so the initial pull might be large, but after that it's just the new layers.
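In other words, order the Dockerfile so the expensive, rarely-changing steps come first and the frequently-changing app code comes last. A minimal sketch (base image and paths are illustrative):

    FROM python:3.8-alpine
    WORKDIR /app
    # Changes rarely, so this layer stays cached across rebuilds
    COPY requirements.txt .
    RUN pip install --no-cache-dir -r requirements.txt
    # Changes constantly; only this layer and later ones get rebuilt
    COPY . .
    CMD ["python", "main.py"]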




> Also, build times and sizes don't matter too much in the grand scheme of things. Unless you're changing the Dockerfile regularly, it won't matter. Why? Because Docker caches each layer of the build.

It does if you practice continuous deployment, or even if you use Docker in your local dev setup and you want to use a sane workflow (like `docker-compose build && docker-compose up` or something). Unfortunately, the standard docker tools are really poorly thought out, beginning with the Dockerfile build system (assumes a linear dependency tree, no abstraction whatsoever, standard tools have no idea how to build the base images they depend on, etc). It's absolute madness. Never mind that Docker for Mac or whatever it's called these days will grind your $1500 MacBook Pro to a halt if you have a container idling in the background (with a volume mount?). Hopefully you don't also need to run Slack or any other Electron app at the same time.

As for the build cache, it often fails in surprising ways. This is probably something on our end (and for our CI issues, on CircleCI's end [as far as anyone can tell, their build cache is completely broken for us and their support engineers couldn't figure it out and eventually gave up]), but when this happens it's a big effort to figure out what the specific problem is.

This stuff is hard, but a few obvious things could be improved: Dockerfiles need to be able to express the full dependency graph (like Bazel or similar) and not assume linearity. Dockerfiles should also allow you to depend on or include another Dockerfile (note the difference between including another Dockerfile and including a base image). Coupled with build args, this would probably allow enough abstraction to be useful in the general case (though a real, expression-based configuration language is almost certainly the ideal state). Beyond that, the standard tooling should understand how to build base images (maybe this is a byproduct of the include-other-Dockerfiles work above) so you can use a sane development workflow. And lastly, local dev performance issues should be addressed, or at least made easier to debug.


Docker's build system has been given a huge overhaul and does create a dependency tree. It is highly efficient and even does things like support mounts for caching package downloads, build artifacts, etc.

See https://github.com/moby/buildkit. You can enable it today with `DOCKER_BUILDKIT=1 docker build ...`
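For example, with BuildKit enabled the newer Dockerfile frontend supports cache mounts, so package downloads persist between builds without ending up in a layer. A rough sketch (assuming pip's default cache path; depending on your version you may need `# syntax=docker/dockerfile:experimental` instead):

    # syntax=docker/dockerfile:1
    FROM python:3.8-alpine
    COPY requirements.txt .
    # pip's download/wheel cache lives in the mount, not in the image
    RUN --mount=type=cache,target=/root/.cache/pip \
        pip install -r requirements.txt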

There is also buildx which is an experimental tool to replace `docker build` with a new CLI: https://github.com/docker/buildx


I don't see how buildkit could possibly build the correct dependency tree because the Dockerfile language doesn't let you express nonlinear dependencies.

If you have a command `do_foo` that depends on do_bar and do_baz (but do_bar and do_baz are independent) and you do something like:

    RUN do_bar # line 1
    RUN do_baz # line 2
    RUN do_foo # line 3
I'm guessing the buildkit dep graph will look like `line_3 -> line_2 -> line_1` (linear). Unless there is some new way of expressing to Docker that do_foo depends on do_bar and do_baz but that the latter two are independent.

EDIT: clarified example.


Dependencies can also be expressed across multiple stages, e.g. with `COPY --from=<some stage>` and `FROM <other stage>`.
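So the earlier example could be written as independent stages, which BuildKit can then build in parallel (stage names and the /out paths are just placeholders):

    FROM alpine AS bar
    RUN do_bar

    FROM alpine AS baz
    RUN do_baz

    FROM alpine
    # do_foo needs both, but bar and baz don't depend on each other,
    # so BuildKit can run those two stages in parallel
    COPY --from=bar /out/bar /out/bar
    COPY --from=baz /out/baz /out/baz
    RUN do_foo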

Also, a Dockerfile is just a frontend for buildkit. The heart of buildkit is "LLB" (sort of like LLVM IR in usage) which is what the Dockerfile compiles into. Buildkit just executes LLB, doesn't have to be from a Dockerfile.

For that matter, you can have a "Dockerfile" (in name only) that isn't even really a Dockerfile, since the format lets you specify a frontend to use (a container image reference) to process it.

There's even a buildkit frontend to build buildpacks: https://github.com/tonistiigi/buildkit-pack Works with any buildkit enabled builder, even `docker build`.


Yeah, there are a lot of workarounds when you leave the standard tooling. I like the idea of using something like Bazel to build Docker images (since modeling dependencies and caching build steps is its raison d'etre) and eschewing Dockerfiles altogether; however, I haven't tried it (and Bazel has its own costs). I'm not familiar with buildkit in particular, but it's cool that it has an internal representation. I'll have to dig around.


Nix can also build efficient Docker images, and it computes layers in a way that makes them reusable across multiple projects.


Yes! Check out https://nixery.dev/


Yeah, it is cool, but it feels more like a demo for nix.

If you want to create a Docker image of your own app, you would probably use:

https://nixos.org/nixpkgs/manual/#sec-pkgs-dockerTools

This produces an exported Docker image as a tar file, which you can then import using either docker or a tool like skopeo[1] (which is also included in nixpkgs).

The nix-shell functionality is also quite nice, because it lets you create a common development environment with all the tooling one might need to work.

[1] https://github.com/containers/skopeo


> If you want to create a Docker image of your own app, you would probably use [...]

Nixery can be pointed at your own package set; in fact, I do this for deployments of my personal services[0].

This doesn't interfere with any of the local Nix functionality. I find it makes for a pleasing CI loop, where CI builds populate my Nix cache[1] and deployment manifests just need to be updated with the most recent git commit hash[2].

(I'm the author of Nixery)

[0]: https://git.tazj.in/tree/ops/infra/kubernetes/nixery [1]: https://git.tazj.in/tree/ops/sync-gcsr/manifest.yaml#n17 [2]: https://git.tazj.in/tree/ops/infra/kubernetes/tazblog/config...


I'm not sure I understand what you mean by workarounds and "leaving standard tooling".


I'm guessing he meant roughly "docker build".


Correct (as far as I know), but:

A) Changing a Dockerfile is rare.

B) Typically the lines that change (adding your code) are near the end of the Dockerfile, and the long part with installing libraries is at the beginning.


(A) is true, but it's not the only way to bust the cache. Changes to the filesystem are much more common. You can try to account for this by selectively copying in files in some sort of dependency order, and this can work okay so long as the dependency order resembles your filesystem hierarchy; but if you want to (for example) install all of your third-party dependencies before adding your source code, you'll need to add each requirements.txt or package.json file individually so as to avoid copying in your source code (which changes more frequently). Doing this also tightly couples your Dockerfile to your in-tree dependency structure, and keeping the Dockerfile in sync with your dependency tree is an exercise in futility. Further, because you're forcing a tree structure (the dependency tree) into a linear structure, you are going to be rebuilding a bunch of stuff that doesn't need to be rebuilt (this gets worse the wider your dependency tree is). Maybe you can hack around this by making an independent build stage per in-tree package, which might allow you to model your dependency tree in your Dockerfile, but you're still left keeping this in sync manually. No good options.
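For a concrete (made-up) two-package repo, the manifest-first pattern ends up looking roughly like this, and every new in-tree package means another COPY line to maintain:

    # Copy only the dependency manifests so these layers survive source edits
    COPY packages/api/package.json packages/api/
    COPY packages/web/package.json packages/web/
    RUN npm install --prefix packages/api \
        && npm install --prefix packages/web
    # Any source change invalidates the cache only from here down
    COPY . .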


> Never mind that Docker for Mac or whatever it's called these days will grind your $1500 MacBook Pro to a halt if you have a container idling in the background (with a volume mount?). Hopefully you don't also need to run Slack or any other Electron app at the same time.

Just in case anyone is wondering, this is a great exaggeration. Idling containers are close to idle (~2% CPU currently), and Slack got pretty small last year. These work just fine; I wish this trope would die already.


I very much experienced this with our Django application and traced the problem back to Django's dev server polling for fs changes at 1 Hz. Apparently that is enough to spin up the fans with Docker for Mac.

I solved the problem by installing inotify support into the container, which Django will use if present; that reduced CPU from 140% to 10%. This was a couple of months ago.


Thank you very much! I have this same issue but hadn't discovered the culprit.


This is because there is no native support on the Mac for Docker. Everything has to run in a virtualized environment that's basically a slightly more efficient version of VirtualBox. When people say Docker is lightweight and they're running it on a Mac, they don't quite understand what they're saying. Docker is lightweight on bare-metal Linux; it's not lightweight on other platforms, because the necessary kernel features don't exist anywhere except Linux.


Yeah, the slow-downs are killing me on my MacBook. Everything starts off fine, but after a few hours (usually right when I'm in the zone) the whole system just grinds.

I've started experimenting with coding on a remote Docker host using VS Code's remote connection feature.

I'd be interested to know if anyone else has gone down this path.


We looked into this. The biggest tradeoff is that your whole team has to change their workflow because the repo has to stay inside of a container on the Linux VM (it can't be shared with the host or you'll trigger the performance issue) which means anyone who wants to cat/sed/edit/grep/etc a file will have to go through a Docker container to do it. It's also a bit more complex if you're running multiple services via Docker Compose, as we were. We couldn't see this workflow change working out well in the long term, and someone had already done the work to use native processes orchestrated with PM2 and that seemed to work reliably once it was set up.


https://blogs.vmware.com/teamfusion/2020/01/fusion-tp20h1-in...

Project Nautilus is a pretty interesting approach to running containers on the Mac. In theory it should be more efficient than Docker for Mac.

Disclaimer: I work for VMware, but on a different team.


I have it set up like this:

VirtualBox on the Mac runs Linux with a host-only network, vbox0.

Docker runs inside the VirtualBox Linux VM. (Now it gets ugly.)

The VM brctl's docker0 (set to match vbox0's IP space) into vbox0.

The Docker containers are reachable by IP from the Mac host. All is fast and good.


> I'd be interested to know if anyone else has gone down this path.

This is at least one use case for 'docker-machine'.


It's true that the overhead is larger on macOS, and you're right that it doesn't run natively there. But it's not like an idle process in a hardware-virtualised VM is less idle than a native one. Sure, there may be extra events waking up the kernel itself, but let's not exaggerate the impact. There's no grinding to a halt from an idle container.


To be clear, the CPU on the container itself was negligible according to `docker stats`; however, the VM process was still using its max allotment. My money is on the volume manager, but we didn't see it being worthwhile to figure out how to debug deep into the bowels of the VM technology (we don't have VM or filesystem experts on staff) to sort it out. Note that we also tried docker-sync and a variety of other solutions, but the issue persisted. Eventually we gave up and just moved back to native processes managed by PM2. Our local dev environment sucks a fair bit more to set up, but it's cheaper than debugging Docker for Mac or dealing with the persistent performance problems.


Docker for Mac has always included some settings to tune the VM, too. If your whole computer grinds to a halt because of Docker, it's probably because you allocated too many resources to the VM. I have half of my laptop's CPU/RAM dedicated to the Docker VM, and while sometimes the fans go a little crazy, I've never had the desktop lock up or anything like that.


This is true, but it doesn't solve the problem. If you give Docker half of your overall resources, it's just going to take Docker twice as long (most likely longer) to finish running the test suite or whatever you're doing in Docker. The crux of the problem is that Docker for Mac has pathological cases, probably involving host-mounted volumes that are big or overlaid or something else that we were doing; the containers can be near idle and Docker for Mac consumes 70-100% of its CPU budget (presumably doing volume/filesystem things).

Note that a little Googling reveals that this is a pretty common problem.


If you give any VM all your cores and then your desktop locks up, you played yourself. That wasn't a good idea before docker and it's not a good idea now. I've personally had issues with using file-change-watchers on mounted volumes in some cases but because I limited my VM to half my resources, the underlying OSX was fine and I could still do whatever I needed to do (including killing those containers).


You’re being pedantic. Docker for Mac shouldn’t use the full VM allotment at idle, full stop. Nitpicking the parent for speaking in terms of the host cores instead of the VM cores is off topic and boring.


There's a lot of space between those extremes, nobody is claiming that idle containers are consuming entire CPU cores. But idle virtualised machines are interrupting your host OS a lot more than you might realise.


That's what the comment I was responding to claimed - "grind to a halt" was a quote.


Was it edited?

> This is because there is no native support on the Mac for Docker. Everything has to run in a virtualized environment that's basically a slightly more efficient version of VirtualBox. When people say Docker is lightweight and they're running it on a Mac, they don't quite understand what they're saying. Docker is lightweight on bare-metal Linux; it's not lightweight on other platforms, because the necessary kernel features don't exist anywhere except Linux.

"grind to a halt" and "is not lightweight" are not even close to being synonymous.



That's several layers up, and is true. There are many bugs in Docker for Mac; one of them is that vpnkit(?) leaks memory like a motherfucker. Another is that volume mounts crunch I/O like madness, so your load factor spikes and your laptop perceptibly becomes slow due to I/O latency.

So "grinds" is somewhat accurate, if you have long running containers doing very little, or you are constantly rebuilding, even if the machine does not look like it's consuming CPU.


On Windows it plugs into Windows containers and Hyper-V.


When people say Docker is lightweight and they’re running it on a Mac, they’re probably talking about their production environment and it’s rude to say that they don’t quite understand.


It's certainly not an exaggeration:

Docker with a few idle containers will burn 100% of CPU. https://stackoverflow.com/questions/58277794/diagnosing-high...

Here's the main bug on Docker for Mac consuming excessive CPU. https://github.com/docker/for-mac/issues/3499


My quote described our observed performance across our fleet of developer machines until ~December 2019. Maybe our project was hitting some pathological edge case (more likely this is just the performance you can expect for non-toy projects), but there's no documented way to debug this as far as I (or anyone else in our organization) could tell. Note that this was the performance even if nothing was changing on the mounted filesystems and with all watchers disabled. Bystanders can feel free to roll the dice, I guess.


npm watch on a host mounted volume is a pretty good way to kill performance though.


I'll add the same: idling does take a whole CPU in my situation too, and only on the Mac, not on Linux.


I've built some Docker build infrastructure which attempts to optimize build times and reduce the cost of incremental builds. I was able to take a monolithic binary artifact which cost around 350MB per build down to less than 40MB through more intelligent layering. If you haven't found it already, `--cache-from` with the previously built stable image makes this relatively painless.
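For anyone who hasn't used it, the usual CI incantation is roughly this (image names are placeholders; under BuildKit the cached image also needs to have been built with `--build-arg BUILDKIT_INLINE_CACHE=1` for this to work):

    # Pull the last known-good image so its layers are available as a cache
    docker pull registry.example.com/myapp:latest || true
    docker build \
        --cache-from registry.example.com/myapp:latest \
        -t registry.example.com/myapp:latest .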

I've been considering writing a merge tool to support fork/merge semantics, focussing more on development and debugging than build-time optimization.


Speaking of best practices for Dockerfiles and CI/CD, a lot of these issues can be highlighted at build time with a Docker linter like https://github.com/hadolint/hadolint.


I didn't know about hadolint, but I don't see how it (or any other linter) can address any of these issues (unless "these issues" is not referring to the issues I was mentioning in the post you responded to).


Hadolint will tell you about things like adding --no-cache to apk add. My point being that comments were made about not following best practices, and hadolint will help with that.


Yeah, you replied to the wrong post. You should have replied to the parent of the one you replied to.


Yep


Hadolint is nice. Hadolint addresses none of the major issues you raise.


> As for the build cache, it often fails in surprising ways. This is probably something on our end (and for our CI issues, on CircleCI's end

We have been using Google Cloud Build in production for over a year and Docker caching [1] works great. And Cloud Build is way cheaper than CircleCI.

I recommend it, and I'm not getting paid anything for it.

[1] https://cloud.google.com/cloud-build/docs/speeding-up-builds...


Or, and I am biased here, use Cloud Native Buildpacks and never think about this stuff again.


I'm not familiar, can you elaborate on how those solve these problems? I'm always looking for a better way of doing things.


Broadly, CNBs are designed to intelligently turn source code into images. The idea of buildpacks isn't new; Heroku pioneered it, and it was picked up in other places too. What's new is taking full advantage of the OCI image and registry standards. For example, a buildpack can determine that a layer doesn't need to be rebuilt and just skip it. Or it can replace a layer. It can do this without triggering rebuilds of other layers. Finally, when updating an image in a registry, buildpacks can perform "layer rebasing", creating new images by stamping out a definition with a changed layer. If all your images are built with buildpacks, you reduce hours of docker builds to a few seconds of API calls.

This is a bit of a word soup, so I'll point you to https://buildpacks.io/ for more.
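If you want to kick the tires, the `pack` CLI is the usual entry point; something roughly like this (the builder image here is just one example):

    # Build an OCI image straight from source, no Dockerfile
    pack build myapp --builder paketobuildpacks/builder:base
    docker run --rm myapp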


Most Dockerfiles for Python projects will have a couple of lines to install their Python dependencies, though:

  COPY requirements.txt ./
  RUN pip install -r requirements.txt
If you're building the image on a CI server, Docker can't cache that step because the files won't match the cache due to timestamps/permissions/etc. The same is true for other developers' machines.

This is a problem if your requirements include anything that uses C extensions, like MySQL/PostgreSQL libs or PIL.


You can achieve a similar caching improvement by either:

1. Using Poetry, which keeps a version lock file so all changes are reflected/cached, or

2. Doing a similar thing yourself by committing the output of `pip freeze` and building images from that instead of requirements.txt.
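A rough sketch of option 2: generate the lock file with `pip freeze > requirements.lock`, commit it, and have the Dockerfile install from that instead, so the layer's cache key only changes when a pinned version actually changes (the filename is arbitrary):

    COPY requirements.lock ./
    RUN pip install -r requirements.lock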


To be clear, the only file in question is requirements.txt; Docker has no idea what files `pip install ...` is pulling and doesn't factor them into any kind of cache check. Beyond that, I didn't realize that timestamps were factored into the hash, or at least if they were, I would expect git or similar to set them "correctly" such that Docker does the right thing (I still think Docker's build tooling is insane, but I'm surprised that it breaks in this case)?


I just tested if timestamps are factored in, and I was wrong. According to the documentation:

https://docs.docker.com/develop/develop-images/dockerfile_be...

> For the ADD and COPY instructions, the contents of the file(s) in the image are examined and a checksum is calculated for each file. The last-modified and last-accessed times of the file(s) are not considered in these checksums. During the cache lookup, the checksum is compared against the checksum in the existing images. If anything has changed in the file(s), such as the contents and metadata, then the cache is invalidated.


It's not been an issue for me using GitLab CI runners, at least..? Which may be because GitLab CI keeps working copies of your repos.

If the CI system keeps around the source tree the Dockerfile is being built from, rather than removing it all after every build, it caches stuff as normal.


> Also, build times and sizes don't matter too much in the grand scheme of things.

In which case... why do you bother with alpine in the first place?


> Too much

There are cases in which it does matter. Just like anything else, it strongly depends on your use case. If you're into the microservices thing and rebuild your containers or change requirements frequently, maybe container size matters (as images will be pulled a lot, by a lot of different hosts). Maybe you're making something for public consumption and want to make sure it doesn't take up a huge amount of space. Maybe you're making an image for IoT/RPi-type devices. You get the idea.

Personally, I like using Alpine where possible because it's got less stuff. Less software means less things that could potentially have a security issue needing fixing/patching/updating later.

However, my default container for anything else is a "miniubuntu" build, as it's got all the basics, it's 85MB in size, and I can install all the things I need for the more complicated projects.


I don't. Never got on board with it and just stick to Ubuntu so I don't have to think about the differences.

Never had a business driver come up for going with Alpine though.


You're also supposed to delete any dependencies you no longer need after compiling. This image might be unnecessarily large.

https://stackoverflow.com/questions/46221063/what-is-build-d...


Assuming it's even possible (there are cases where it isn't), unless you're doing the entire operation in a single RUN instruction (install dependencies, compile, remove dependencies), deleting dependencies isn't enough because they'll exist in an ancestor layer before you delete them. That leads to image bloat.

This is why multi-stage builds are a thing, which the author advocates against doing.
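When the single-RUN route is viable, the usual Alpine idiom looks something like this (package names are just an example for a typical C extension; the runtime library stays, the build toolchain is dropped in the same layer):

    RUN apk add --no-cache libpq \
        && apk add --no-cache --virtual .build-deps gcc musl-dev postgresql-dev \
        && pip install psycopg2 \
        && apk del .build-deps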


I've built OpenResty with tons of custom plugins in a single RUN call. It's possible. The image is tiny compared to a Debian-based image for the same thing, or to not deleting build-time dependencies.

The author just doesn't know better. That's what happens when you never build things from source yourself.


> Ex, use "apk add --no-cache PACKAGE"

I have an Alpine docker image which was 185MB and after I added the above, it was 186MB. I was definitely hoping for more, given your strongly worded advice.



