FreeBSD Jails Containers (vermaden.wordpress.com)
226 points by vermaden on June 27, 2023 | 60 comments



Shout out to https://github.com/samuelkarp/runj, which aims to provide an OCI-compatible runtime for FreeBSD jails. We are working on a Jails-based sandbox implementation for running OCaml builds on FreeBSD, https://github.com/ocurrent/obuilder. If runj gets more support, it would be a good alternative to the various iocage/ezjail/etc. solutions or the raw Jails implementation we have right now.


Does this OCI compatibility also translate to being able to run Kubernetes on FreeBSD?


Not necessarily, Kubernetes also needs a ton of other things which are Linux kernel specific, like cgroups.

If you're just looking for an orchestrator capable of running FreeBSD jails, take a look at Nomad (Disclaimer: I work at HashiCorp, opinions my own, etc.)


cgroups are mapped into Jobs on Windows.

There could be a similar kind of mapping on FreeBSD.


Thank you for the suggestion, I'm interested in checking it out.

How does Nomad handle scaling? I'm thinking of a new project, and I'd like to have the equivalent of HPA that easily scales to zero on custom metrics and a cluster autoscaler that can easily add heterogeneous nodes as necessary. Even better if it can work with macOS VMs.


Nomad supports all sorts of architectures and OSes, including M-series CPUs and macOS. It can also scale a long way: it has been tested with 2 million containers. [0]

In terms of autoscaling there's the Nomad Autoscaler[1][2] that is super flexible, can get metrics from a bunch of sources, do custom queries, and perform actions (everything is via plugins, so even if your current stack isn't already supported, it's easy to add) like add extra nodes or add more instances of jobs.

0 - https://www.hashicorp.com/c2m

1 - https://developer.hashicorp.com/nomad/tools/autoscaling

2 - https://github.com/hashicorp/nomad-autoscaler


That sounds a lot like what I am looking for. Thank you for the links to reading material!


The issue I have with Jails, compared with something like Docker, is that the kernel ABI changes between major releases, so jails need to migrate with base system upgrades. It also means a container registry would need to store container versions per release.

I like the isolation for my own bespoke home network, but does it work well in production setups?


There are compat libraries that are available, so if you are running 13.x (the most current major release) but want to run binaries originally compiled under 12.x, you install:

* https://www.freshports.org/misc/compat12x/

See further:

* https://www.freshports.org/misc/compat11x/

* https://www.freshports.org/misc/compat10x/

[…]

* https://www.freshports.org/misc/compat4x/


And for anyone that isn't intimately familiar with FreeBSD's release history or anything, I feel like it's worth pointing out that the 4.x major release dates from over 23 years ago. You can run binaries from before people were inhabiting the ISS and/or right around the time AOL was big enough to purchase Time Warner.


Thank you! I was going to point this out...


Kernels are generally backwards compatible and major versions are explicitly supported for five (5) years. Over in Linux land you'll see builds for the different libcs and it seems to work just fine.


The difference is that the container libc version and kernel version don't have to correspond to each other in any way on Linux (unless your kernel is like 20 years older, in which case it wouldn't support Docker anyway).


I thought that too, until the syscall semantics changed between RHEL 6 and RHEL 7. Spent way more time debugging that than I wanted to.


The end result is that you're matching the executable to the libc variant instead of to the kernel. In practice it's not much of a difference, if at all. If you need to use older binaries, install the compatibility shims (or use a base jail based on the older version of FreeBSD). In terms of a support matrix there are two major versions of FreeBSD that are given active support, which is less than over in Linux land.

All in all, of all the concerns I'd have about using jails in a production environment, dealing with different major versions is not on that list.


But half the point of the container is per-application libcs/userlands, so the kernel ABI is the only required constant.


In FreeBSD the userland and kernel move in unison, and with the backwards compatibility there's nothing stopping you from running a jail based on a previous version of FreeBSD. The compatX packages are only there to provide the older libraries if you want to run an older application on a newer jail/install.

As was pointed out, you can run binaries from a FreeBSD version (4) that's probably older than most posters here. Still not seeing what the big fuss is about.


The hangup is that half the point is to be able to take a random userland and run it on your current system without changing it. As soon as your userland bitrots because the kernel ABI changed, you've lost a lot of the benefit.

And to be clear, you cannot take a complete userland from 4.x. You have to reinstall your application on a newer userland with compat shims when you change the kernel.

Ironically enough, the Linux syscall compat layer in FreeBSD may give you more stability here.


> You have to reinstall your application on a newer userland with compat shims when you change the kernel.

No, you don't. You can absolutely take a jail based on a previous version of FreeBSD and run it on a newer version. I've not tried going as far back as 4.x, but I've done jail images of 9 and 10 on 11 and 12 hosts as part of a CI system.

The shims you're talking about are not shims at all, just packages of the older libraries that a program built for an older version of FreeBSD would expect. You can go ahead and look at the package manifests if you don't believe me. They're literally just a handful of older shared libraries.

Honestly, I can't remember a time when the syscall table changed in a backwards incompatible manner. The compat packages are only needed if you're trying to run an older program with a newer userland. If you're running an older jail on a newer FreeBSD host, you can run your program as-is because those libraries have already been installed.


> The kernel works, but ps(1) does not

> If the kernel version differs from the one that the system utilities have been built with, for example, a kernel built from -CURRENT sources is installed on a -RELEASE system, many system status commands like ps(1) and vmstat(8) will not work. To fix this, recompile and install a world built with the same version of the source tree as the kernel. It is never a good idea to use a different version of the kernel than the rest of the operating system.

https://web.archive.org/web/20180602150408/https://www.freeb...

It certainly seems like the freebsd userland and kernel are fairly tightly coupled where even basic utilities won't work anymore on a version change.


Just stop already, you don't know what you're talking about. ps works just fine. The jail itself picks up the version of the host because it's calling the newer kernel, but otherwise works as one would expect. In fact it's probably worth noting that between 10 and 12 the libc major version wasn't even bumped. For all the bellyaching you're missing that each major version is supported for five years, and that unlike RHEL you'll get up-to-date ports with that (instead of the half-assed cobbled together ancient versions that linger with the RHEL model) so there's less incentive to upgrade until you're ready. That's a lot of hand waving over a non fucking issue.

If you were making a more earnest argument you'd probably have looked at the current version of the documentation and noted that it doesn't have the warning you quoted.

https://docs.freebsd.org/en/books/handbook/kernelconfig/#ker...

Note that FreeBSD 10.0 was released in January 2014, over nine years ago. 10.4, which is the oldest base jail image I have kicking around, was released in 2017. So the userland-kernel interfaces you're wringing your hands over have remained stable for nearly a decade across three or four different major releases.

Of all the concerns I'd have about running FreeBSD and/or jails in a prod environment, dealing with binary compatibility across different major versions is not even on my radar and not even remotely an advantage for docker.

  # uname -rv
  12.4-RELEASE-p1 FreeBSD 12.4-RELEASE-p1 GENERIC
  # ldd /bin/ps
  /bin/ps:
          libm.so.5 => /lib/libm.so.5 (0x800254000)
          libkvm.so.7 => /lib/libkvm.so.7 (0x80028b000)
          libjail.so.1 => /lib/libjail.so.1 (0x80029e000)
          libxo.so.0 => /lib/libxo.so.0 (0x8002a6000)
          libc.so.7 => /lib/libc.so.7 (0x8002c5000)
          libelf.so.2 => /lib/libelf.so.2 (0x8006b9000)
          libutil.so.9 => /lib/libutil.so.9 (0x8006d4000)
  # md5sum /bin/ps
  ad8b5c8966c71e31cdc9603967860fa7  /bin/ps
  # objdump -p /lib/libc.so.7 | tail -12
  
  Version definitions:
  1 0x01 0x0865f4e7 libc.so.7
  2 0x00 0x077a28b0 FBSD_1.0
  3 0x00 0x077a28b1 FBSD_1.1
  4 0x00 0x077a28b2 FBSD_1.2
  5 0x00 0x077a28b3 FBSD_1.3
  6 0x00 0x077a28b4 FBSD_1.4
  7 0x00 0x077a28b5 FBSD_1.5
  8 0x00 0x077a28b6 FBSD_1.6
  9 0x00 0x0f1efaa0 FBSDprivate_1.0
  
  # iocage create -r 10.4-RELEASE
  e950694e-6182-410d-bc70-1a8f6e340516 successfully created!
  # iocage start e950694e-6182-410d-bc70-1a8f6e340516
  * Starting e950694e-6182-410d-bc70-1a8f6e340516
    + Started OK
    + Using devfs_ruleset: 1007 (iocage generated default)
    + Using IP options: ip4.saddrsel=1 ip4=new ip6.saddrsel=1 ip6=new
    + Starting services OK
    + Executing poststart OK
  # iocage console e950694e-6182-410d-bc70-1a8f6e340516
  FreeBSD 12.4-RELEASE-p1 GENERIC
  
  Welcome to FreeBSD!
  
  Release Notes, Errata: https://www.FreeBSD.org/releases/
  Security Advisories:   https://www.FreeBSD.org/security/
  FreeBSD Handbook:      https://www.FreeBSD.org/handbook/
  FreeBSD FAQ:           https://www.FreeBSD.org/faq/
  Questions List: https://lists.FreeBSD.org/mailman/listinfo/freebsd-questions/
  FreeBSD Forums:        https://forums.FreeBSD.org/

  Documents installed with the system are in the /usr/local/share/doc/freebsd/
  directory, or can be installed later with:  pkg install en-freebsd-doc
  For other languages, replace "en" with a language code like de or fr.

  Show the version of FreeBSD installed:  freebsd-version ; uname -a
  Please include that output and any error messages when posting questions.
  Introduction to manual pages:  man man
  FreeBSD directory layout:      man hier
  
  Edit /etc/motd to change this login announcement.
  # ldd /bin/ps
  /bin/ps:
          libm.so.5 => /lib/libm.so.5 (0x800827000)
          libkvm.so.6 => /lib/libkvm.so.6 (0x800a50000)
          libjail.so.1 => /lib/libjail.so.1 (0x800c59000)
          libc.so.7 => /lib/libc.so.7 (0x800e5e000)
  # objdump -p /lib/libc.so.7 | tail -16
  Version definitions:
  1 0x01 0x0865f4e7 libc.so.7
  2 0x00 0x077a28b0 FBSD_1.0
  3 0x00 0x077a28b1 FBSD_1.1
          FBSD_1.0
  4 0x00 0x077a28b2 FBSD_1.2
          FBSD_1.1
  5 0x00 0x077a28b3 FBSD_1.3
          FBSD_1.2
  6 0x00 0x077a28b4 FBSD_1.4
          FBSD_1.3
  7 0x00 0x077a28b5 FBSD_1.5
          FBSD_1.4
  8 0x00 0x0f1efaa0 FBSDprivate_1.0
          FBSD_1.5
  
  # md5 /bin/ps
  MD5 (/bin/ps) = 8a9c364705d29beb98503415428067cc
  # ps
    PID TT  STAT    TIME COMMAND
  80454 39  IJ   0:00.01 login [pam] (login)
  80455 39  SJ   0:00.03 -csh (csh)
  80593 39  R+J  0:00.00 ps


You are 100% factually right and I shared the same level of frustration as it seems you had with GP. I just wanted to 1) say thanks for sharing knowledge!, but also 2) suggest taking a deep breath or a walk before replying to folks like GP. I’m often guilty of it myself, but trying to get better.


The point at which it becomes clear (e.g. digging up outdated documentation) that the argument is not being made in earnest is the point at which I stop taking the poster seriously. My comment was meant more for anyone else who might stumble on this thread than as a reply to OP.

That said, I wouldn't look to run FreeBSD in a production environment without a good reason and 13 looks to be where I'll jump ship to Debian for homelab stuff. The big thing holding me back before was ZFS, but that's essentially a non-issue now.


They don’t need to migrate, but it’s usually a good idea to keep up to date. It’s indeed somewhat of a thing if you have a lot of systems and you have uptime requirements. Plan accordingly and it’ll be fine.


How is docker any different? Don't docker containers share the kernel with the host?


Linux has a stable kernel ABI, while BSD changes its kernel ABI between major versions.


I highly recommend the book “FreeBSD Mastery: Jails” by Michael W Lucas, in printed form https://bookshop.org/p/books/freebsd-mastery-specialty-files...

It had everything I needed to start manually setting up jails to my liking.

I eventually had to get some additional help online but you come a long, long, long way with that book in your hand.


Vnet jails are awesome. Totally changed the way I do networking and application hosting. Makes it pretty trivial to do things like set up a jail that can only talk to the internet through a VPN.


Related: A two-server three-jail network setup (firewall, vlan, tunnels)

https://blog.uidrafter.com/freebsd-jails-network-setup


I have a VNET jail for remote access to my home lab. This takes an extra NIC I wasn't using anyway. It has been solid running on FreeBSD 13.2.

This jail allows me to lock things down to minimal access to other services I need remotely, such as an SSH tunnel to Home Assistant.

I run my FreeBSD server mainly for ZFS. Adding a jail for remote access adds to the usefulness of that hardware.


An extra NIC is fine if you've got slots, but you can also do an if_bridge in the main host, add the real nic and half of an if_epair; then move the other half to the vnet jail.

Potentially a non-vnet jail with the ipv4/ipv6 options set may also be an option. I used to use jails with those settings to squeeze multiple environments into a single host at Yahoo.
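
For anyone who wants to see the if_bridge plus if_epair arrangement described above spelled out, here is a minimal sketch. It only prints the ifconfig commands rather than running them, so it is safe to try anywhere. The names bridge0 and epair0, the NIC em0, and the jail name ssh are assumptions for illustration, and the jail is assumed to be a running VNET jail already.

```shell
#!/bin/sh
# Print (not run) the ifconfig steps for a bridge + epair VNET setup.
# $1 = physical NIC on the host, $2 = name of a running VNET jail.
# bridge0 and epair0 are assumed interface names, not taken from the thread.
vnet_bridge_plan() {
  nic=$1; jail=$2
  cat <<EOF
ifconfig bridge0 create
ifconfig epair0 create
ifconfig bridge0 addm $nic addm epair0a up
ifconfig epair0a up
ifconfig epair0b vnet $jail
EOF
}

vnet_bridge_plan em0 ssh
```

The host keeps the real NIC and the "a" half of the epair on the bridge; the "b" half is the one handed to the jail.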


I use only if_epair and then pf to NAT it to the external world. I’ve only had issues with if_bridge, where I would either be able to make connections between jail and the external world, but not between host and jail or vice versa (between host and jail fine, but not between jail and external world). And it would act really flaky, like getting one ping response, followed by silence. pf is a lot simpler and more reliable.
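
A rough sketch of what the pf side of that epair-plus-NAT approach could look like. The helper just prints a single pf.conf NAT rule; the interface em0 and the subnet 10.0.0.0/24 are made-up values, and a real setup would presumably also need addresses on both epair halves and packet forwarding enabled on the host.

```shell
#!/bin/sh
# Print a one-line pf.conf NAT rule for traffic leaving a jail subnet.
# $1 = external interface, $2 = jail-side subnet (both hypothetical values).
pf_nat_rule() {
  printf 'nat on %s inet from %s to any -> (%s)\n' "$1" "$2" "$1"
}

pf_nat_rule em0 10.0.0.0/24
```

The parentheses around the interface in the translation target tell pf to follow the interface's current address, which helps when that address is assigned dynamically.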


> I used to use jails with those settings to squeeze multiple environments into a single host at Yahoo.

Greetings! Miss the ole days at the ‘Hoo when we were all FBSD based!


Can you elaborate on what this looks like?


jail.conf:

  ssh {
    exec.start = "/bin/sh /etc/rc";
    exec.stop = "/bin/sh /etc/rc.shutdown";
    exec.clean;
    mount.devfs;
    path = "/var/jail/ssh";
    host.hostname = "ssh";
    vnet;
    vnet.interface += "em0";
  }

It's IPv6 only, so this is key in rc.conf: rtsold_enable="YES"

OpenBSD PF based router does the rest. IPv6 simplifies things here.

PF on FreeBSD isn't the most ideal, sure, but I can limit local access on inet6.

For SSH tunnel, it's straight forward.


FWIW VNET is not inherently IPv6 only. I'm using it in an IPv4 only context just fine.


Props for using the base-system jail management tools instead of iocage/ezjail/etc.


IME, the base-system jail stuff is super easy to understand, use, and customize. When you start using "easy" tools on top of that, it rapidly becomes more complicated and opaque for non-trivial stuff.

It's awesome that you can do everything you would want with a jail with only 3 config files: rc.conf (for networking and start-up), pf.conf (for firewalling/routing), and jail.conf (for specifying jail params like mount points, network interface setup, etc.).

Super clean, super easy.


Thanks.


VNET jails are great, especially now that they're stable, but I'd strongly recommend using a higher level tool. I've settled on iocage, but there are a few other tools out there that seem to have gained traction. Nobody sets up cgroups by hand, right?


A couple of tools which are both working on jail management & packaging

- bastille https://bastillebsd.org

- pot https://potluck.honeyguide.net

bastille is great for personal and small setups; pot is aimed at larger scale and is meant to be used along with HashiCorp Nomad+Consul.


What is the state of pot in summer 2023? Is it ready for production?


Depends on your workloads and needs. It's not as mature as k8s and there are rough edges, but if you're into the HashiCorp ecosystem and can run your workloads in FreeBSD jails, it's pretty slick: think Go and Rust apps, web, gaming servers, etc.


I have a soft spot for FreeBSD. I hope someday we’ll have a lightweight container solution that works across the BSDs (macOS included), Linux and maybe even Windows. POSIX is all you need.


Also check out systemd-nspawn on linux. Another nice container solution, built in, that seems to be relatively little known.


I made a very serious attempt at using this awhile back on Arch and it high-key sucks. Like, it's not really meant as a general purpose container runtime solution, it's meant for developing systemd and related software. The ergonomics, particularly around networking are quite bad and it literally requires that you're all in on systemd-networkd and resolved to get any networking at all.


I’ve used it as a general purpose container runtime and been very pleased. Ymmv.


Here is a talk from SCaLE 2023 on Cloud Native FreeBSD: https://www.youtube.com/live/ReYon0HLj2k?feature=share&t=180...


Why isn't Jails supported on macOS? Seems like this would be a VERY powerful option.


The OS X kernel is a mashup of things; it's been too long since I looked, but IIRC the tcp code comes from just before FreeBSD added syncache (released in FreeBSD 4.5), so that should be after the release of jails in 4.0. But, Apple would have needed to integrate jails throughout the NeXT/Mach part of the kernel too.

And, you'd be missing out on features like VNET that came later (FreeBSD 8), unless Apple reintegrated with upstream, which is pretty rare for them. (They did update the FreeBSD userland, once)


I might be entirely off-base with this, but I think it probably has something to do with the roots of macOS' (or really, NeXTSTEP's) BSD component (BSD 4.3) pre-dating FreeBSD 4.0 (the first version that introduced jails) by 14 years.

It does seem like a feature worth porting, though.


Very nice and comprehensive guide, thanks a lot!


Thank You.


"The FreeBSD host can run any other FreeBSD as long as its not newer then the host system version."

Docker does not have this limitation. You can run a container that is based on a newer version of Linux than the host.


They have that limitation, but it doesn't manifest in quite a simple way. If the container's glibc and all other binaries require Linux 6.3, you're not running it on a 5.18 host.

FreeBSD is developed as a whole unit. It makes it easier to reason that a 13.2 userland doesn't run on a 12.4 kernel.
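
The "jail must not be newer than the host" rule can be sketched as a small check. This is an illustration only: the function names are hypothetical, it compares just the major and minor numbers of release strings such as 12.4-RELEASE-p1, and treating a newer minor release as "too new" is a conservative reading of the quoted rule rather than something stated in the thread.

```shell
#!/bin/sh
# Split "12.4-RELEASE-p1" into major (12) and minor (4); patch level is ignored.
rel_major() { echo "${1%%.*}"; }
rel_minor() { rest=${1#*.}; echo "${rest%%[!0-9]*}"; }

# Succeed (exit 0) when the jail userland is not newer than the host.
# $1 = host release string, $2 = jail release string.
jail_not_newer() {
  hM=$(rel_major "$1"); hm=$(rel_minor "$1")
  jM=$(rel_major "$2"); jm=$(rel_minor "$2")
  [ "$jM" -lt "$hM" ] || { [ "$jM" -eq "$hM" ] && [ "$jm" -le "$hm" ]; }
}

jail_not_newer 12.4-RELEASE-p1 10.4-RELEASE && echo "10.4 jail on 12.4 host: ok"
jail_not_newer 12.4-RELEASE-p1 13.2-RELEASE || echo "13.2 jail on 12.4 host: too new"
```

Both messages are printed: the first pair matches the 10.4-on-12.4 setup shown earlier in the thread, the second is the 13.2-on-12.4 case mentioned above.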


> If the container's glibc and all other binaries require Linux 6.3, you're not running it on a 5.18 host.

FWIW, it would take quite a long time for glibc itself to start requiring Linux 6.3 on its current targets. Since glibc 2.24 (Aug. 2016), the minimum supported kernel has been at most Linux 3.2 (Jan. 2012), on every target that has existed that long. And that's the smallest that the gap between the glibc and Linux releases has ever gotten (except for new targets). Extrapolating, we'd have to wait for glibc 2.47 (Feb. 2028) to require Linux 6.3 (Apr. 2023), if that gap were to be repeated, something which I don't find particularly likely.

That is to say, with the exception of new targets, you'll rarely see a userland that absolutely requires the latest Linux kernel, even possibly if it suggests that kernel for the best support.
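
The extrapolation above can be reproduced with a bit of month arithmetic. A sketch, assuming glibc keeps shipping one minor release every February and August (the cadence implied by the 2.24 and 2.47 dates quoted); the helper names are made up.

```shell
#!/bin/sh
# Months since year 0, so two (year, month) dates can be subtracted.
to_months() { echo $(( $1 * 12 + $2 - 1 )); }

# Gap between glibc 2.24 (Aug. 2016) and its minimum kernel, Linux 3.2 (Jan. 2012).
gap=$(( $(to_months 2016 8) - $(to_months 2012 1) ))

# Earliest month at which Linux 6.3 (Apr. 2023) would be that old.
target=$(( $(to_months 2023 4) + gap ))

# Assumption: one glibc minor release every February and August since 2.24.
# Prints the first release at or after the given month count.
next_glibc() {
  y=$(( $1 / 12 )); m=$(( $1 % 12 + 1 ))
  if   [ "$m" -le 2 ]; then ry=$y; rmon=2
  elif [ "$m" -le 8 ]; then ry=$y; rmon=8
  else ry=$(( y + 1 )); rmon=2
  fi
  rel=$(to_months "$ry" "$rmon"); base=$(to_months 2016 8)
  minor=$(( 24 + (rel - base) / 6 ))
  printf '2.%d %04d-%02d\n' "$minor" "$ry" "$rmon"
}

echo "gap in months: $gap"
printf 'earliest month: %04d-%02d\n' $(( target / 12 )) $(( target % 12 + 1 ))
next_glibc "$target"
```

The gap comes out at 55 months, which puts the earliest date at 2027-11, and the first release after that is 2.47 in 2028-02, matching the figure in the comment.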


It's the "all other binaries" part that will catch you. You've completely overlooked that part.

For example: Per its own README, systemd requires Linux 3.15 at absolute minimum, recommends Linux 4.15 for baseline functionality to at least work, and specifies Linux 5.7 for full BPF functionality.
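
To make those thresholds concrete, here is a small sketch that classifies a kernel version against the three systemd numbers quoted above (3.15, 4.15, 5.7). The function names and tier labels are made up for illustration, and only major.minor is compared.

```shell
#!/bin/sh
# True when kernel version $1 (major.minor[...]) is at least $2 (major.minor).
kernel_ge() {
  kM=${1%%.*}; kr=${1#*.}; km=${kr%%[!0-9]*}
  rM=${2%%.*}; rr=${2#*.}; rmin=${rr%%[!0-9]*}
  [ "$kM" -gt "$rM" ] || { [ "$kM" -eq "$rM" ] && [ "$km" -ge "$rmin" ]; }
}

# Map a host kernel version onto the thresholds quoted from the systemd README.
systemd_tier() {
  if   kernel_ge "$1" 5.7;  then echo "full BPF functionality"
  elif kernel_ge "$1" 4.15; then echo "baseline functionality"
  elif kernel_ge "$1" 3.15; then echo "absolute minimum"
  else echo "below minimum"
  fi
}

systemd_tier 5.18
systemd_tier 4.19
systemd_tier 3.2
```

So a container image whose init expects the newest tier can still land on a host kernel that only satisfies an older one, which is the "all other binaries" problem in a nutshell.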


Depends what you mean by "Linux". You can possibly use a newer userland, but containers share the kernel with the host. That's kind of the whole point of containers: they're just a set of processes that are grouped together and configured with certain restrictions.


That's incorrect.

(Well. Technically it's kind of true. You can run a container based on a newer kernel, it'll just crash when you use any new features)


Unless that container uses kernel semantics that changed between kernel versions, or a feature that was added in the new version. Which invariably happens. See the RHEL 6/7 upgrade and the changes to syscall.



