I was going to do RAID for disk-failure tolerance, either with a consumer NAS box, or with a hardware-RAID disk enclosure attached to something very low-power like a Raspberry Pi Zero 2 W.
Isn't RAID parity slightly more space efficient than versioned backups? Or is there a better way to do redundancy that doesn't involve just replicating entire files to multiple disks? Or some kind of automated manager that puts each individual file on N different disks out of M?
I mostly do embedded, so reliable data storage isn't generally something I deal with; we usually leave that to the cloud or to the user, and I'm not quite familiar with what's out there.
>Isn't RAID parity slightly more space efficient than versioned backups?
It depends on your storage array: the more drives, the more space-efficient RAID parity becomes, since the parity overhead is amortized across the whole array. But RAID is still only a single copy of your data.
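For a rough sense of scale (a back-of-the-envelope sketch, not tied to any particular controller):

    # Usable fraction of raw capacity: single-parity RAID (RAID 5 style)
    # vs. keeping a full second copy of everything.
    def raid5_usable(n_drives: int) -> float:
        return (n_drives - 1) / n_drives   # one drive's worth of parity

    for n in (3, 4, 8, 12):
        print(f"{n} drives: RAID5 {raid5_usable(n):.0%} usable, mirror 50%")
    # 3 drives:  RAID5 67% usable, mirror 50%
    # 12 drives: RAID5 92% usable, mirror 50%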
>Or is there a better way to do redundancy that doesn't involve just replicating entire files to multiple disks
Most of the industry is using erasure coding these days (https://blog.min.io/erasure-coding/), which lets you spread your data and parity across multiple disks or even multiple sites. Erasure coding usually runs a layer above the filesystem, whereas RAID typically runs below the filesystem (SnapRAID, mergerfs, and others excepted).
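To make the idea concrete, here is a toy sketch of the simplest erasure code: k data shards plus one XOR parity shard, so any single lost shard can be rebuilt from the rest. Real systems like MinIO use Reed-Solomon so they can survive multiple simultaneous losses, but the principle is the same:

    # Toy erasure coding: k data shards + 1 XOR parity shard.
    # Any single missing shard (data or parity) can be reconstructed.
    from functools import reduce

    def xor(a: bytes, b: bytes) -> bytes:
        return bytes(x ^ y for x, y in zip(a, b))

    def encode(data: bytes, k: int) -> list[bytes]:
        shard_len = -(-len(data) // k)                 # ceiling division
        padded = data.ljust(k * shard_len, b"\0")
        shards = [padded[i*shard_len:(i+1)*shard_len] for i in range(k)]
        return shards + [reduce(xor, shards)]          # append parity shard

    def recover(shards: list) -> list:
        missing = shards.index(None)                   # assume exactly one loss
        shards[missing] = reduce(xor, (s for s in shards if s is not None))
        return shards

    shards = encode(b"hello erasure coding", k=4)
    shards[2] = None                                   # simulate a dead disk
    print(b"".join(recover(shards)[:4]))               # original data is back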
My personal "backup vault" is a Raspberry Pi 4 with a single 4TB external drive attached. The RPi runs MinIO, and all backups go through the S3 interface or SFTP/SMB. It is not the fastest box in the world, but an incremental backup of ~2TB finishes in about 30 minutes, which is "fast enough".
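Since it speaks S3, any S3 client can push backups to it. A minimal sketch with boto3, where the hostname, bucket, and credentials are placeholders for whatever your box uses:

    # Upload one archive to a self-hosted MinIO box over its S3 API.
    import boto3

    s3 = boto3.client(
        "s3",
        endpoint_url="http://backup-pi.local:9000",  # hypothetical RPi hostname
        aws_access_key_id="minio-user",              # placeholder credentials
        aws_secret_access_key="minio-secret",
    )
    s3.upload_file("photos.tar.zst", "backups", "photos/photos.tar.zst")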
It consumes on average 4W, which means even with worst case electricity prices of €1/kWh (which we saw last winter), it costs less than €3/month.
For comparison, my NAS consumed around 50W, and at €1/kWh, that would cost €37/month in electricity alone, and then you need to add the cost of the actual hardware itself.
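The arithmetic, for anyone wanting to plug in their own wattage and tariff:

    # Monthly electricity cost: watts -> kWh per month -> euros.
    def monthly_cost_eur(watts: float, eur_per_kwh: float, hours: float = 730) -> float:
        return watts / 1000 * hours * eur_per_kwh

    print(monthly_cost_eur(4, 1.0))    # RPi:  ~2.9 EUR/month
    print(monthly_cost_eur(50, 1.0))   # NAS: ~36.5 EUR/month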
I switched off the NAS, and purchased ~10TB of cloud storage (main storage and backup storage at two different locations) for €20/month, and keep sensitive stuff encrypted with Cryptomator.
ZFS has RAIDZ1, RAIDZ2, RAIDZ3, and ditto blocks, which achieve much the same thing, albeit a bit differently.
My point was that even if you have 4 copies of your data, you still only have a single machine where your data is stored, and you're essentially just one flood/lightning strike/house fire/burglary away from all of it being gone. Or one bad power supply away from 4 dead drives.
With versioned backups, you have higher latency on restoring data in case a disk dies, but your data is also safer.
As I initially stated, RAID is for availability. It is great for making sure that data is available 24/7, but that is rarely what the average home user needs. Most home users access their data infrequently and would be perfectly fine waiting a couple of hours while restoring from a backup.