Nsjail – A light-weight process isolation tool for Linux

jeblair · on Oct 15, 2017

This seems very similar to Bubblewrap: https://github.com/projectatomic/bubblewrap

catern · on Oct 16, 2017

I don't see anything to suggest that nsjail has the main feature of bubblewrap: It is safe to make bubblewrap setuid-root, and therefore bubblewrap is a safe way for unprivileged users to use containers. (arguably the only safe way at the moment)

Without nsjail making that guarantee, nsjail is just yet another command line interface to namespaces.

woahhvicky · on Oct 15, 2017

How does this compare to firejail?

moosingin3space · on Oct 15, 2017

This tool is lighter-weight than firejail. nsjail seems to be a thin abstraction over Linux namespaces, while firejail contains profiles for common desktop applications and some X hackery to enable jailing of GUI programs.

jagger11 · on Oct 15, 2017

author here:

Yup, nsjail doesn't have X hacks (I should work on that), though it offers some profiles for Apache-like applications:

https://github.com/google/nsjail/tree/master/configs

I believe nsjail uses one of the most advanced (if not the most advanced) seccomp-bpf config language - kafel: https://github.com/google/kafel

audidude · on Oct 16, 2017

bwrap allows passing a FD containing the seccomp rules (--seccomp FD w/ seccomp_export_bpf). If kafel can export the compiled BPF, it should be trivial to use kafel profiles w/ bubblewrap/atomic/flatpak/etc.
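
A minimal sketch of that hand-off, assuming a file that already holds a compiled filter (how it was produced is out of scope here). The helper names `bwrap_argv` and `run_with_filter` are hypothetical, and the `--ro-bind / /` mount is only an illustrative choice; `--seccomp FD` is the bwrap option mentioned above.

```python
import os
import subprocess

def bwrap_argv(seccomp_fd, command):
    """Build a bwrap argv that loads a compiled seccomp filter from an open fd."""
    # --ro-bind / / is an illustrative mount setup, not a recommendation.
    return ["bwrap", "--ro-bind", "/", "/", "--seccomp", str(seccomp_fd), *command]

def run_with_filter(bpf_path, command):
    """Open the filter file and keep the descriptor open across exec for bwrap."""
    fd = os.open(bpf_path, os.O_RDONLY)
    try:
        # pass_fds keeps the same descriptor number open in the child process.
        return subprocess.run(bwrap_argv(fd, command), pass_fds=(fd,))
    finally:
        os.close(fd)
```

Something like `run_with_filter("policy.bpf", ["/bin/true"])` would then start the command under the filter, where `policy.bpf` is a placeholder file name.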

Bromskloss · on Oct 15, 2017

Is this what I should use if I want to intercept filesystem calls (and rewrite them, or generate on the fly the file that is about to be accessed)? Something else I should look into for this purpose?

jagger11 · on Oct 15, 2017

author here:

Not exactly. You can technically replace a file with a bind mount, e.g. use

nsjail --chroot / -R /dev/null:/etc/passwd -- /bin/sh -i

This will make /etc/passwd empty, but nsjail doesn't rewrite syscalls. In order to do that, you'd have to use SECCOMP_RET_TRACE (TRACE(number) in kafel config lang), and then add some C code to nsjail which will use ptrace() to intercept and rewrite your syscall. It's possible, just not implemented, because it didn't seem like something that's required by users.
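
For scripting this, a small sketch that assembles the same kind of command line from a list of read-only bind mounts. It uses only the flags shown in the command above (`--chroot`, `-R src:dst`); the function name `build_nsjail_argv` is made up for illustration.

```python
import shlex

def build_nsjail_argv(readonly_binds, command, chroot="/"):
    """Build an nsjail argv that bind-mounts each (src, dst) pair read-only."""
    argv = ["nsjail", "--chroot", chroot]
    for src, dst in readonly_binds:
        # -R src:dst is the read-only bind mount form used in the command above.
        argv += ["-R", f"{src}:{dst}"]
    argv.append("--")
    argv += list(command)
    return argv

argv = build_nsjail_argv([("/dev/null", "/etc/passwd")], ["/bin/sh", "-i"])
print(shlex.join(argv))
# nsjail --chroot / -R /dev/null:/etc/passwd -- /bin/sh -i
```

The resulting list can be handed to `subprocess.run(argv)` as-is. Note that this only swaps what the jailed process sees at a path; it does not intercept or rewrite syscalls.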

wmf · on Oct 15, 2017

It doesn't sound like NsJail does that; maybe try FUSE or SECCOMP_RET_TRACE?

jagger11 · on Oct 15, 2017

Yes, SECCOMP_RET_TRACE works, but nsjail doesn't have code to support that - it didn't seem that useful when mount namespaces can police access to files.

Otherwise, it's possible to make it support that. Though, a word of caution: ptrace() is a complex and sometimes buggy interface with a lot of corner cases. In other words, it's easy to make a mistake with consequences for the security of the whole setup.

PS: It's possible to use SECCOMP_RET_TRAP (TRAP(number) in the nomenclature of kafel, nsjail's seccomp-bpf config language) and rewrite syscalls in-process with the help of a SIGSYS signal handler.

therein · on Oct 15, 2017

Is there a minimum required kernel version? How does it compare to proot?

We use proot in our build pipeline and it would be interesting to look into alternatives.

jagger11 · on Oct 15, 2017

Re kernel versions: It depends on when CLONE_NEWUSER and seccomp-bpf were added to the kernel for different CPU architectures. For x86-64 it was probably around 3.16, for some others it might be even 4.3 (e.g. ppc64). It might even work with earlier versions if you use --disable_clone_newuser and avoid using seccomp-bpf filters.
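
A rough pre-flight check along those lines, as a sketch only. The 3.16 threshold is the approximate x86-64 figure from this comment, not a documented requirement, and the helper names are hypothetical.

```python
import platform
import re

# Approximate x86-64 figure from the comment above; an assumption, not a spec.
ASSUMED_MIN_KERNEL = (3, 16)

def parse_kernel_release(release):
    """Extract (major, minor) from a release string like '2.6.32-696.el6.x86_64'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))

def probably_supported(release, minimum=ASSUMED_MIN_KERNEL):
    """Compare the parsed version against the assumed minimum."""
    return parse_kernel_release(release) >= minimum

def running_kernel_supported():
    """Same check against the kernel this script is running on."""
    return probably_supported(platform.release())

print(probably_supported("2.6.32"))  # False
```

A version comparison is only a hint: distributions backport features, and other architectures may need a later kernel, as noted above.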

Re 'proot'. I've never used it (it seems to be a configurator for the mount namespace), but nsjail seems much more advanced: cgroups support, seccomp-bpf via configuration language support, and a few more features (configs, net).

therein · on Oct 15, 2017

Thanks. I appreciate the response. I guess my only option until moving to a more recent kernel is `proot`, as our build boxes are still on 2.6.32, but I am happy to have found out about `nsjail` for the future.

audidude · on Oct 16, 2017

What about older LTS systems that have CLONE_NEWUSER but only allow access to it from uid 0?

jagger11 · on Oct 16, 2017

You can run it as root, and specify users/groups to switch to before executing an app. Though, CLONE_NEWUSER was meant for exactly that: using namespaces without euid==0. Some systems like Debian have a sysctl flag:

kernel.unprivileged_userns_clone

which controls this behavior. Ultimately, it's up to you whether to set it to "1", as CLONE_NEWUSER in the past opened many new attack vectors on the Linux kernel. However, I believe that currently the situation is much better, esp. after syzkaller and individual researchers reported and fixed many bugs in this area.
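
To see how a given box is configured, a small sketch that reads that sysctl through procfs. It assumes the usual /proc/sys mapping of sysctl names; the function name is hypothetical. On kernels without this distribution-specific knob the file does not exist, so the function returns None rather than guessing.

```python
from pathlib import Path

# procfs path for the Debian-style sysctl named in the comment above.
USERNS_SYSCTL = Path("/proc/sys/kernel/unprivileged_userns_clone")

def unprivileged_userns_enabled(path=USERNS_SYSCTL):
    """Return True/False for the sysctl value, or None if the knob is absent."""
    try:
        value = Path(path).read_text().strip()
    except FileNotFoundError:
        return None  # no such sysctl on this kernel; it tells us nothing here
    return value == "1"

print(unprivileged_userns_enabled())
```

A None result does not mean unprivileged user namespaces are unavailable, only that this particular switch is not present.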

sitkack · on Oct 15, 2017

For reference, https://web.archive.org/web/20160305001149/http://proot.me/

TheDong · on Oct 16, 2017

This seems to be almost exactly like systemd-nspawn other than the ability to write seccomp policies in kafel.

Are there any other notable differences?

jagger11 · on Oct 16, 2017

I haven't been looking at systemd-nspawn for some time, but judging from its man page:

- ability to use config files (in nsjail in protobuf format)

- 3 operational modes: one of them listens on a TCP port and runs processes on demand (inetd-style)

- support for cgroups (pid and mem limiting), here rlimits are not enough

- more expressive seccomp-bpf rules

TheDong · on Oct 16, 2017

> ability to use config files

systemd-nspawn supports ".nspawn" files (see --settings=true mode)

> socket activation

systemd can start up an nspawn thing in reaction to a systemd socket-activation request I think?

> cgroups

I guess for that you'd use 'systemd-run --scope -p MemoryLimit=10M -p CPUShares=100 -- systemd-nspawn ...'

> more expressive seccomp-bpf rules

Absolutely!

_Marak_ · on Oct 15, 2017

I've been using nsjail in production with good success lately. It's a solid tool.

Thank you authors! Really appreciate your work on this project.

andystanton · on Oct 16, 2017

I have become conditioned by seeing so many Javascript frameworks reach the front page over the years that I parsed this as 'JsNail' on first glance.