An extensive ZFS setup on MacOS (justinscholz.de)
134 points by tobiasrenger on July 7, 2018 | 65 comments


ZFS was supposed to be the default file system on Mac OS X 10.5 (Leopard). I know this because I talked to some engineers at Apple, and the external evidence agrees: Time Machine was clearly built as a ZFS snapshot explorer and manager.

Sadly, they had to drop the whole thing because of the murky licensing on ZFS.

I sometimes dream of a world with a ZFS root partition on my MacBook Pro...


Seeing as it was Sun who announced that ZFS was going to be OS X's default file system, I think it is more likely that Apple/Steve himself changed plans in a fit of rage over the premature announcement.


https://openzfsonosx.org/wiki/ZFS_on_Boot

Not quite "it just works," but it's an option...


Adam Leventhal doesn't seem to put much stock in the licensing theory, considering OS X's inclusion of DTrace, which was released under the same license. His discussion of the saga of ZFS on OS X is here: http://dtrace.org/blogs/ahl/2016/06/15/apple_and_zfs/ . Very interesting stuff.


Right from that very article:

> When Apple included DTrace in Mac OS X a point in favor was that it could be yanked out should any sort of legal issue arise. Once user data hit ZFS it would take years to fully reverse the decision.


Right, Leventhal mentions that such an eventuality was one in a group of several considerations, but clearly does not think it was the primary factor.

From the paragraph following your quote:

> Finally and perhaps most significantly, personal egos and NIH (not invented here) syndrome certainly played a part. [...] [C]ertain leads and managers preferred to build their own rather adopting external technology—even technology that was best of breed. They pitched their own project, an Apple project, that would bring modern filesystem technologies to Mac OS X.

and

> Licensing FUD was thrown into the mix; even today folks at Apple see the ZFS license as nefarious and toxic in some way whereas the DTrace license works just fine for them. Note that both use the same license with the same grants and same restrictions.

Leventhal's chronology continues to suggest that ZFS on OS X re-emerged even after this licensing argument had been advanced, and that the project was finally killed by Larry Ellison himself, in the interest of keeping his personal friendship with Steve Jobs unaffected by business considerations.

While it is of course possible that Apple is willing to accept any "murkiness" around the license as it pertains to DTrace but not willing to do so as it pertains to ZFS, it just doesn't seem like your original statement that "they had to drop the whole thing because of the murky licensing on ZFS" represents the situation clearly (at least not if we accept the version of events as told by Adam Leventhal; personally, I have no direct knowledge).


This was one of Sun's last middle fingers to the FOSS community. They knew that by licensing ZFS under the CDDL there was no way Linux users, specifically RHEL customers, could benefit from it. Hopefully Apple won't follow suit with APFS.


This is a common misconception; in fact, the incompatibility is purely accidental. In addition, I wouldn't call GNU/GPL the most pragmatic license family either.


If it was an accident, then why does the CDDL even exist? What does it offer that MIT/BSD don't have, other than being incompatible with GPL?


Like Mozilla's MPL: per-file copyleft and clearer patent licensing, IIRC.

Some prominent former Sun folks have publicly claimed the GPL incompatibility was intentional, other equally prominent folks have disagreed.


This argument could be made for any license that's similar to another; I don't think it's fair reasoning. Why does 2-clause BSD exist? Why does the Apache license exist?


That's a good point, though honestly I never could understand why you would use anything but Apache (for permissive) or (A)GPL (for copyleft). Other options seem like oversights (BSD/MIT don't cover patents) or cutting off your nose to spite your face (WTFPL). I'm sure there's nuance, but I can't seem to see it.


The MIT and BSD licenses long predate the Apache license.

GPL and AGPL differ vastly in a few important aspects that make GPL pretty OK for commercial companies, and AGPL an absolute no-no.


Is there any hope that Time Machine might get reworked with APFS?


Portions of Time Machine already have been (local backup uses snapshots, and is ridiculously simpler as a result).

Proper Time Machine "v2" is a bit more complex (basically an rsync clone based on local and remote snapshots), but still way simpler than the current "v1" method, which builds its own change table from fsevents and has to cope with the lack of snapshots in HFS+.

They can't use APFS as a Time Machine "v1" store because the unique feature of HFS+ to support directory hard-links did not carry over to APFS. Migration of a v1 to v2 volume would be a fun exercise in translations (folder hierarchy collapsed to volume snapshots, and differing crypto approaches for encrypted backups).

I suspect they are trying to get APFS in an evolutionary "stable" place so they can publicly document it with the iOS 12/macOS Mojave/etc releases - the changes they are making to support HDDs and Fusion drives this year are likely a big internal evolution.

I also suspect Apple is just not in a rush to move time machine over - it already works, has a surprising amount of integration across the OS for backup/recovery, and people tend to value their backups working vs using the newest technology.

Eventually, my belief is they will roll out Time Machine v2 and leave v1 (on HFS+) only supported for restoration. Just like with v1 today, non-APFS volumes will get a sparse bundle disk image to hold the APFS data. But they will wait until they have a strategy to migrate the external volumes and disk images.
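For the curious, the snapshot plumbing behind those local backups is already visible from the command line on High Sierra; a rough sketch (the snapshot name is illustrative, and mount_apfs syntax may vary by release):

    # Create an APFS local snapshot (the mechanism local Time Machine backups use)
    sudo tmutil localsnapshot
    # List the snapshots on the root volume
    tmutil listlocalsnapshots /
    # Mount one read-only to browse it (name taken from the listing above)
    mkdir /tmp/snap
    sudo mount_apfs -s com.apple.TimeMachine.2018-07-07-120000 / /tmp/snap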


There is definitely hope (it could turn out to be empty hope, but still). APFS has a lot to offer Time Machine, as the hackery needed to get it to work on JHFS+/HFSX is tomfoolery at this point. At the least APFS has metadata checksums, and adding an optional csum tree for data would not be difficult for Apple to do. They already do COW with APFS, so all that's needed is a separate tree for data csums. And then they have snapshots instead of hardlinks, or reflinks instead of hardlinks. I think it'll happen eventually, as the dead weight of supporting HFS+ catches up with them.
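(As an aside: APFS already exposes reflink-style clones to userland via clonefile(), which `cp -c` uses; a quick illustration, assuming an APFS volume:)

    # Clone a file: instant, and blocks are shared copy-on-write
    cp -c bigfile.mov bigfile-copy.mov
    # Both names reference the same extents until one is modified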


Nice writeup. I have been tempted to try ZFS, but to be honest it seems like too much of a hassle compared to rotating three portable USB drives with Time Machine, periodically creating ZIP/GPG-encrypted backups with descriptive names and dates, and copying them to OneDrive, Google Drive, and Dropbox. A few times a year these backups also include email, contacts, snapshots of GitHub repos, etc.

This may sound time consuming, but it is not. For day-to-day work, the 256 GB drive on my MacBook, GitHub, and Dropbox are what I use.
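The archive-and-encrypt step is a one-liner; a minimal sketch of mine (paths are illustrative; gpg prompts for a passphrase):

    # Dated, symmetrically encrypted archive, ready for cloud storage
    tar czf - ~/Documents ~/Projects \
      | gpg --symmetric --cipher-algo AES256 \
      > "backup-docs-$(date +%Y-%m-%d).tar.gz.gpg"
    # Restore later with: gpg -d backup-docs-2018-07-07.tar.gz.gpg | tar xzf -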


Another solution for data integrity I've been tempted to try, instead of switching filesystems, is to go RAID 1/10 on a couple of SSDs. A bit finicky to set up (and not viable on MacBooks, obviously), but it's practically zero-maintenance after that until one of the hardware parts fails.

I also do most of what the parent comment does with regard to backing up to external drives at regular intervals. It /sounds/ like a hassle, but most of my process is automated, so I plug in the drives, run a bash script, and wait until it's done. Works quite well.
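The core of the script is just a wrapper around rsync; a sketch with a made-up volume name and excludes:

    #!/bin/bash
    # Mirror the home directory to whichever rotation drive is plugged in
    set -euo pipefail
    DEST="/Volumes/BackupDrive/$(hostname)"
    rsync -aHx --delete \
      --exclude '.Trash' --exclude 'Library/Caches' \
      "$HOME/" "$DEST/home/"
    echo "backup finished: $(date)"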


The problem with RAID1 is, when a bit flips, you don't know (at the block device or filesystem level) which copy is the good one.


RAID1/10 doesn't really help with data integrity (at least if it's just basic "standard" RAID). It helps against catastrophic media failure and not much else.


Glad to see ZFS adoption increasing. It is such a great filesystem and is one of the main reasons my home servers run FreeBSD.


You can also use ZFS on Debian. I use it for my home server and I haven't had any issues.

Although I'll note that I don't use ZFS on my boot device, just on storage HDDs. Booting from ZFS seemed like a hassle, without clear benefits.


> Booting from ZFS seemed like a hassle, without clear benefits.

On FreeBSD, ZFS is a first-class citizen. Root-on-ZFS is something that the installer can do and the bootloader understands. I've used root-on-ZFS through five FreeBSD versions (plus numerous minor versions), all upgraded using freebsd-update, with no issues.

On Linux, ZFS seems to be an addon, so hassles are to be expected.

As for benefits, Boot Environments[0][1] are a great reason for having root-on-ZFS. While I don't personally do this, I can definitely see the appeal of being able to roll back a major OS upgrade.

[0] https://www.freebsd.org/cgi/man.cgi?beadm

[1] https://forums.freebsd.org/threads/howto-freebsd-zfs-madness...
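For reference, the beadm workflow is roughly (a sketch; the BE name is arbitrary):

    # Snapshot the current root as a new boot environment before upgrading
    beadm create pre-upgrade
    # ... freebsd-update, risky change, etc. ...
    # If it goes badly, activate the old environment and reboot into it
    beadm list
    beadm activate pre-upgrade
    shutdown -r now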


> Booting from ZFS seemed like a hassle, without clear benefits.

I'm running ZFS on a single boot/user partition on an XPS 13 (9370). I was unable to get the built-in WiFi working with Debian Stretch, but it seems to work well enough with testing (Buster). I have apt-listbugs and apt-listchanges (or similar) installed to warn me of possible problems, but I feel more comfortable knowing that I can roll back a system that becomes borked by a buggy upgrade. (Haven't had to do it yet.) Ordinarily I'd have a separate $HOME partition, but I'm settling on a $HOME filesystem (which is covered by `rsync` backups).

It was a bit of a hassle, but IMO not without benefits for my use case.

I have other servers running Debian Stretch and Ubuntu 16.04 that are using ZFS on the storage drives/partitions and EXT4 on the root partition. (These are for personal use.)
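For reference, the rollback safety net I'm counting on needs no special tooling; a minimal sketch (pool/dataset names are illustrative, and rolling back a mounted root is best done from a rescue environment):

    # Snapshot the root dataset before a risky upgrade
    zfs snapshot rpool/ROOT/debian@pre-upgrade
    apt full-upgrade
    # If the system ends up borked, roll back (e.g. from a rescue shell)
    zfs rollback -r rpool/ROOT/debian@pre-upgrade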


> Booting from ZFS seemed like a hassle, without clear benefits.

It used to be a major hassle around 2 years ago, but since then things have evolved significantly, to the point where, as long as you follow the install instructions correctly the first time, you can trust it to just keep on working.

One benefit (for me) is the ability to have full system snapshots taken periodically with minimal storage cost. I've used a setup on my personal workstation for years where a zfSnap job takes snapshots every hour. This has proven itself to be super convenient when accidents occur. Of course the important data is backed up externally or committed to Git, but having the ability to quickly revert mistakes is a big nice-to-have.

Such a system could also easily be extended to ship these snapshots to an offsite location as incremental backups. Of course the same thing can be done in many other ways, but it is another nice-to-have.
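The shipping part is a small amount of shell around zfs send/receive; a sketch with made-up host and dataset names:

    # One-time full replication of a snapshot to a remote pool
    zfs send tank/home@2018-07-01 | ssh backuphost zfs receive backup/home
    # From then on, send only the delta between two snapshots
    zfs send -i tank/home@2018-07-01 tank/home@2018-07-07 \
      | ssh backuphost zfs receive backup/home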

Another benefit I've had with this is portability between machines. My current Debian Stable system with ZFS on root began its life running off an external HDD plugged into an old MacBook Air. After I got an external SSD and a Thunderbolt adapter, all I needed to do was add the new device to the existing pool as a mirror and wait for resilvering, then remove the old device after all data was mirrored, and the migration was done. When I finally got a new laptop, all I needed to do was boot from the SSD and add the new NVMe device to the pool as a mirror, then remove the old SSD afterwards.
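That migration pattern boils down to two pool operations; illustrative device names:

    # Attach the new device as a mirror of the existing one
    zpool attach rpool disk0 disk1
    # Watch the resilver progress until it completes
    zpool status rpool
    # Then drop the old device out of the mirror
    zpool detach rpool disk0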

There is also the transparent compression feature that can potentially save a lot of HDD space depending on the usage pattern of the particular system.

And of course, since ZFS is present in your system (as it's the rootfs anyway!) you can use all the other features you want, such as slicing off volumes to use as root volumes for virtual machines, or using it as the backend for a local Docker host, where Docker can use a specific ZFS storage driver and leverage filesystem capabilities when storing images.

Of course you don't need to run it as the rootfs for many of these things, but if you do like ZFS then the benefits, the possibilities, and the fact that you can do it outweigh the (nowadays reduced) hassle. :)


I've been using it on Debian for about 3 years (and on FreeBSD almost 10 years); I wouldn't say it has been without its problems on Debian, but it seems mostly stable for home use.

Had some problems in Debian 8/Jessie, especially when upgrading to 9. I used it professionally on some hardware testing rigs a few years back, but it was a bit more cumbersome than I had hoped (had to resort to a FreeBSD live boot at one point to access our data).

Haven’t had any problems in about a year now but only use it on one of my home servers running Debian. On FreeBSD it’s always been fantastic, same for illumos.


> Booting from ZFS seemed like a hassle, without clear benefits.

On Solaris it is integrated with the package manager: on updates a snapshot is taken, and if anything goes wrong one can revert to the previous state.


It still lacks a handful of features for home use that btrfs offers. But btrfs raid5 is not stable enough. It's a pity.


What sorts of features does btrfs offer home users that ZFS lacks?

My experiences with btrfs have been somewhat mixed. On OpenSuse, I've found snapper to be an amazing lifesaver, but the regular system-crippling maintenance process[0] to be very frustrating.

[0] https://forums.opensuse.org/showthread.php/523354-High-CPU-l... - the thread implies that it is quota-related, but I've seen this happen on machines without quotas enabled.


The big missing feature, IMO, is that you can't grow a vdev just by adding a disk -- you have to go through this rigmarole of swapping in a larger disk and resilvering, one at a time.
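Concretely, the rigmarole looks like this (illustrative device names; autoexpand makes the extra capacity appear once all members are replaced):

    # With autoexpand on, the vdev grows once ALL members are larger
    zpool set autoexpand=on tank
    zpool replace tank da1 da5
    zpool status tank   # wait for the resilver, then repeat per disk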


Yes, this is a huge pain point with ZFS. RAIDZ expansion is coming[0] though, so things are getting better.

[0] https://www.freebsdfoundation.org/blog/openzfs-raid-z-online...


It's been a while since I played with btrfs but it was really nice to be able to dynamically add hard drives of any size to a volume and have the filesystem seamlessly balance the data between the drives with the requested replication factor.


Very flexible online resizing, re-striping and conversion between RAID levels of pools. On-demand deduplication instead of always-active. Reflink copy. On-demand compression and defragmentation. No ARC gobbling up memory (competes with page cache). NOCOW files.
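Several of those map to one-liners (paths illustrative; zstd support depends on kernel version):

    cp --reflink=always big.img big-clone.img    # reflink copy
    btrfs filesystem defragment -r -czstd /data  # on-demand compress + defrag
    chattr +C /var/lib/vm-images                 # NOCOW for new files here
    btrfs filesystem resize +100g /data          # online grow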


- metadata raid1 with data raid5. If you currently have metadata raid5, 'btrfs balance start -mconvert=raid1 <mp>' will online-convert metadata from raid5 to raid1.

- raid10 with a device failure: you might think you have to wait a day or two for a replacement device, but in fact you can do `btrfs device remove missing <mp>`, and the data on the missing device is replicated into the free space of the remaining devices along with an online shrink, and the whole operation is COW and atomic per block group.


On stable hardware, Btrfs raid56 should be stable. As the code everywhere is constantly changing, I guess it's somewhat subjective whether raid56-specific changes mean the code is not stable.

Some seemingly small changes 4.17 to 4.18 thus far. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...

And bigger changes from 4.14 to 4.17. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...

I don't have a complete or detailed breakdown, but I think a big chunk of Btrfs raid56 problems would have affected any raid56 implementation, including the crazy Linux 30s command-timeout default. Many I've seen are single-drive failures with an additional bad sector on another drive. If Btrfs itself detects an error (csum mismatch for metadata or data) on a remaining drive, this is no different from a two-"device" failure for that stripe.

More reliable than raid5 for both metadata and data is to use raid1 metadata and raid5 data (this can be chosen at mkfs time, or you can do a balance with -mconvert=raid1 to convert raid5 metadata to raid1).
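At mkfs time that looks like (devices illustrative):

    # raid1 metadata, raid5 data across four devices
    mkfs.btrfs -m raid1 -d raid5 /dev/sdb /dev/sdc /dev/sdd /dev/sde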

And for all raid, it's questionable whether raid56 is a good idea with 5+ TB drives taking days to rebuild.


Remember when OS X had native ZFS support? It's unfortunate that Oracle chose the licensing path they did.


Ugh, for the life of me I never understood why Apple didn’t make Time Machine heavily integrated with ZFS right from the get go.

In those days it wasn’t Oracle, it was Sun. And I cannot imagine Apple not being able to negotiate licensing terms with Sun. Hell, Jonathan Schwartz had already worked with Steve Jobs before when Lighthouse was writing heaps of software for NeXT. I can’t imagine Schwartz and Sun not being willing to play.

In fact, back in those days Apple still was in the enterprise game with XServe hardware. Fellow NeXT / Apple aficionados and I remember one WWDC where we were suspecting a big Sun-Apple partnership announcement, possibly paving the way for a future Apple acquisition of Sun.

But then Oracle happened and burned everything that had been good about Sun to the ground. I have mostly given up my 1990s-born hatred of Microsoft and Bill Gates. It’ll be some time before I’d ever trust Oracle or Ellison.


I believe this had more to do with NetApp's patent claims and technical decisions than with Oracle (i.e., Apple announced it had stopped work before the Sun acquisition closed, before Oracle had any control).


I remember when APFS was introduced, it was framed as being for everything from the Apple Watch to the Mac Pro. Would ZFS have scaled down to whatever puny hardware is in the first-gen watch?


probably not


I can't even think of a feature (other than compression/encryption, and there are better ways to get those) that a watch would use from ZFS.


I imagine the main reason to use it would be that you have ONE filesystem that runs everything, everywhere. Though come to think of it, I feel like snapshots and relative ease of creating new filesystems in the pool could be handy for upgrades.


ZFS still has a huge overhead. It needs more RAM than a watch or even a phone is going to have, at least until they start packing enough that losing 4 GB of RAM to overhead is fine.
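(The ARC ceiling is tunable, for what it's worth; a sketch with illustrative values:)

    # ZFS on Linux: cap the ARC at 512 MiB at runtime
    echo 536870912 > /sys/module/zfs/parameters/zfs_arc_max
    # FreeBSD equivalent, set in /boot/loader.conf:
    #   vfs.zfs.arc_max="512M"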


It was Sun, wasn't it?

I mean, ZFS has always been CDDL licensed, even before it was acquired by Oracle. If it once was native in OSX, then it was under the same licensing terms it is today, right?


Correct.

I can't even imagine that Apple minded the CDDL, given that they still shipped DTrace.

The best theory I've ever heard about this was that it's much easier to rip out a debugging tool than in-place convert a filesystem if it turns out to be legally troublesome in some fashion.


I would use ZFS as the root file system on macOS if it weren't so convoluted [1]; it gives the impression of being liable to break on major updates.

[1] https://openzfsonosx.org/wiki/ZFS_on_Boot


the author of it here, hi ;-)

Yes, ZFS on Boot is right now more like a technology preview. Fonts break badly, so ZFS boot is not advisable (and also doesn't give you that much benefit really).

I want my data to be guaranteed to have integrity, and I really enjoy snapshots (I've been bitten by Final Cut Pro X data-deletion errors etc., and rolling back a snapshot is bliss).


The main reason to use ZFS, for me, is file content checksums. Neither hfs+ nor apfs support this, allowing silent data corruption.


> Mail (super nice due to compression and big space savings (in my case 20% reduction of space needed)

Things will get even better once zstd is added[1]. Leaving transparent compression out of APFS was a real big mistake by Apple, imho, given how fast and cheap lz4 and zstd are. It's one of the reasons why I have a ZFS volume set aside myself.

[1] https://www.phoronix.com/scan.php?page=news_item&px=OpenZFS-...
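Checking what compression buys on an existing dataset is easy enough (dataset name illustrative):

    # Enable lz4 (affects newly written data only)
    zfs set compression=lz4 tank/Mail
    # Inspect the achieved ratio
    zfs get compressratio tank/Mail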


Flash storage might be the highest-margin component sold by Apple, so there's a conflict of interest with filesystem features that reduce the need for storage.


That seems a little tinfoil-hat, considering the largest consumer use of storage is photos & video, neither of which compresses any further anyway.


And that Apple is heavily pushing the HEIF/HEVC formats in ways that provide similar image quality with lower storage uses.


This is a macOS thread, but it would be helpful if Apple allowed microSD cards to be connected to the Lightning port of iOS devices.


Like this[0]?

I have no idea how well it works or what its capabilities are, but it seems to be exactly what you are looking for.

[0] https://www.apple.com/ca/shop/product/MJYT2AM/A/lightning-to...


That is mostly limited to copying photos and videos to/from the "Photo Roll".

There are many thousands of applications which process other file formats. Some of them have natively implemented support for an expensive third-party storage device called iXpand, which itself comes with a poorly implemented app. These third-party applications are poorly and pointlessly (re)implementing basic OS file-system I/O functions.

Is it possible that these pointless I/O hoops are hampering uptake of $1000 iPad Pros for "laptop" use cases?


MacBook Pros have use cases which extend well beyond photos & video.


lz4 and zstd are cheap in what, CPU cycles? Apple cares more about energy use these days (correlated, but not equivalent); maybe it's not a good tradeoff?


I like it! As someone who uses xhyve and FreeBSD on macOS: have you considered doing any of these ZFS ops against a VM? I also use NFS with my VM, and it's extremely fast and reliable. This could work well for moving things from the VM into a fully working backup system that supports ZFS datasets and mounting via nfsd, which is still native on macOS.
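In case it helps: macOS's built-in NFS server only needs an /etc/exports entry; a sketch with an illustrative path and subnet:

    # /etc/exports
    /Volumes/tank/shared -network 192.168.64.0 -mask 255.255.255.0

    # Then:
    sudo nfsd enable        # start (and keep enabled) the NFS server
    sudo nfsd checkexports  # validate the exports file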


good timing. i've been using zfs on macos for years (and different flavors as well), but in a very basic way, manual snapshotting on occasion, that kind of thing. i'm on a pretty old macos now, for fear of updating.

i've recently wanted to have a globally available, privately hosted filesystem. i.e., not icloud, not dropbox, not google drive. AFS (auristorfs) would fit the bill quite nicely, except getting it backed by ZFS would be too much of a chore. The days when I had time to deploy and time to maintain such things are past.

but in perfect timing with this article, i've also just learned that BTMM (Back to My Mac) can make filesystems globally available. Apparently I won't see the files via the same pathname everywhere, but I can live with that in exchange for the ease of use.


When I set up ZFS on macOS and tried to share the ZFS pool as a network share through macOS's Sharing panel, any individual user permissions I set would not stay set after I closed out of the screen.

Is there another way of doing this that's recommended?


How compatible are the various ZFS forks?

Is it possible to create a ZFS volume on macOS and mount it on Linux, FreeBSD, and Illumos all the same?

I'm asking because we are still missing a good cross-platform filesystem with modern journaling and snapshotting.


If you take care when doing so, it's certainly possible. http://www.open-zfs.org/wiki/Feature_Flags is slightly behind ATM on documenting all of them, but is a good starting point.

(The various OpenZFS platforms all default to enabling all the feature flags they know about on pool creation if you don't specify otherwise, which can mean either read-only or no access to said pool on other platforms that don't know what those flags are.)
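In practice that means creating the pool with feature flags off and enabling only a subset every target platform understands; a sketch (feature set and device are illustrative):

    # -d disables all feature flags at creation; enable chosen ones explicitly
    zpool create -d \
      -o feature@async_destroy=enabled \
      -o feature@lz4_compress=enabled \
      portable /dev/disk2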


O.O ZFS without FUSE?


There's actually not an up-to-date ZFS FUSE port on any platform, at this point - all the OpenZFS ports are native to their respective kernels, and while people have kicked around updating/making a new FUSE port for a while, it's not been a priority for anyone.


\o/



