Doesn't this increase the chance of data loss? If a blob gets corrupted, then all the backups referencing that blob will have the same corrupted file(s). This is similar to having a corrupted index in an incremental backup chain (or maybe in that case you would lose everything?), but with incremental backups the risk is mitigated by periodically performing full backups. Also, my gut feeling is that you only save space with content-addressed backups if you're backing up multiple machines that share files; in the typical average-user scenario where one is backing up a single PC, you get similar space usage. Keep in mind that you typically delete backups older than a certain threshold. Could you maybe comment on my points?
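To make the trade-off concrete, here's a minimal sketch of a generic content-addressed blob store (hypothetical, not modeled on any particular backup tool): each file's content is hashed, identical content is stored once, and every snapshot just records hashes.

    import hashlib

    blobs = {}      # hash -> file content (stored exactly once)
    snapshots = []  # each snapshot: {path: hash}

    def backup(files):
        manifest = {}
        for path, data in files.items():
            digest = hashlib.sha256(data).hexdigest()
            blobs.setdefault(digest, data)   # dedup: skip if already stored
            manifest[path] = digest
        snapshots.append(manifest)

    # Two daily snapshots of the same single PC: the unchanged file is
    # stored once, so repeated snapshots of mostly-static data cost
    # little extra space even on one machine...
    backup({"a.txt": b"unchanged", "b.txt": b"version 1"})
    backup({"a.txt": b"unchanged", "b.txt": b"version 2"})
    print(len(blobs))  # 3 blobs, not 4

    # ...but a single corrupted blob breaks that file in EVERY snapshot
    # that references its hash -- which is exactly the concern above.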
Sure, you could keep multiple level-0 backups to increase the odds that a corrupted blob can still be found in another copy, but that's inefficient.
It's much more efficient to deduplicate first, then add redundancy: say, storing the blobs on a RAIDZ3 pool, or using Backblaze's approach of splitting each blob into 17 pieces, adding 3 pieces of redundancy, and distributing the 20 shards across 20 racks.
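A toy sketch of the "split, then add redundancy" idea: real systems like Backblaze's 17+3 scheme use Reed-Solomon coding and tolerate the loss of any 3 shards, while this simplified version uses a single XOR parity shard (like RAID 5) and so only survives the loss of any one piece. Shard padding and metadata are glossed over; function names are made up for illustration.

    from functools import reduce

    def split_with_parity(blob: bytes, n_data: int):
        size = -(-len(blob) // n_data)  # ceil division
        shards = [blob[i*size:(i+1)*size].ljust(size, b"\0")
                  for i in range(n_data)]
        # Parity shard: byte-wise XOR of all data shards.
        parity = bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*shards))
        return shards + [parity]        # n_data + 1 pieces to spread around

    def recover_missing(shards, missing_index):
        # XOR of all remaining shards reconstructs the missing one.
        present = [s for i, s in enumerate(shards) if i != missing_index]
        return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*present))

    pieces = split_with_parity(b"deduplicated blob contents", n_data=4)
    assert recover_missing(pieces, missing_index=2) == pieces[2]

The point is that redundancy is added once, deliberately and cheaply, on top of the deduplicated store, instead of implicitly paying for it by keeping many full copies.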
If you're serious, of course, you'd have an onsite backup, deduplicated and with added redundancy, AND the same offsite.