What is the actual difference between a backup and replication? If the 1’s and 0’s are replicated to a different host, is that any different than “backing up” (replicating them) to a piece of external media?
> What is the actual difference between a backup and replication?
The simplest way to think about it is that a backup must be an immutable snapshot in time. Any changes and deletions that happen after that point in time will never be reflected in the backup.
That way, any files you accidentally delete or corrupt (or other unwanted changes, like ransomware encrypting them for you) can be recovered by going back to the backup.
Replication is very different: you intentionally want all ongoing changes to replicate to the multiple copies for availability. But that means unwanted changes or data corruption happily replicate to all the copies, so now all of them are corrupt. That's when you reach for the most recent backup.
That's why you always need to back up, and you'll usually want to replicate as well.
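To make the distinction concrete, here is a toy sketch in Python. It assumes "storage" is just an in-memory dict, and the helper names (`replicated_write`, `replicated_delete`, `take_backup`) are made up for illustration, not taken from any real tool.

```python
from types import MappingProxyType


def replicated_write(copies, name, data):
    """Apply a write to every copy, the way a replication setup would."""
    for files in copies:
        files[name] = data


def replicated_delete(copies, name):
    """Apply a delete to every copy; replication does not ask whether it was intended."""
    for files in copies:
        files.pop(name, None)


def take_backup(files):
    """Copy the current state and expose it through a read-only view."""
    return MappingProxyType(dict(files))


def demo():
    primary, replica = {}, {}
    copies = [primary, replica]

    replicated_write(copies, "report.txt", "Q3 numbers")
    replicated_write(copies, "notes.txt", "call Bob")

    # Point-in-time snapshot: later changes never reach it.
    backup = take_backup(primary)

    # Unwanted changes arrive and are faithfully replicated everywhere.
    replicated_write(copies, "report.txt", "<encrypted>")
    replicated_delete(copies, "notes.txt")

    return primary, replica, backup
```

After `demo()`, the primary and the replica agree with each other and are both damaged, while the backup still holds the two original files and rejects writes.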
When those 1s and 0s are deleted and that delete is replicated (or other catastrophic change, such as ransomware) you presumably don't have the ability to restore if all you're doing is replication. A strategy that layers replication + backup/versioning is the goal.
I'll add that _usually_ a backup strategy includes generational backups of some kind. That is, daily, weekly, monthly, etc., to hedge against individually impacted files as mentioned.
Ideally this strategy also has a component that is offsite and inaccessible from the source. Usually this level of robustness isn't present in a "replication" setup.
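A rough sketch of what a generational retention policy could look like, in Python. The function name, the default counts, and the "newest backup per ISO week / per calendar month" rule are all assumptions for illustration; real tools define their buckets in their own ways.

```python
from datetime import date, timedelta


def select_backups_to_keep(backup_dates, daily=7, weekly=4, monthly=12):
    """Return the set of backup dates to retain under a simple generational policy."""
    newest_first = sorted(set(backup_dates), reverse=True)

    def newest_per_bucket(bucket_key, limit):
        # dicts preserve insertion order, so the first date seen for a
        # bucket is the newest one, and newer buckets come first.
        chosen = {}
        for d in newest_first:
            chosen.setdefault(bucket_key(d), d)
        return list(chosen.values())[:limit]

    keep = set(newest_first[:daily])                      # most recent dailies
    keep.update(newest_per_bucket(lambda d: tuple(d.isocalendar())[:2], weekly))
    keep.update(newest_per_bucket(lambda d: (d.year, d.month), monthly))
    return keep


def example():
    # 90 consecutive daily backups ending on 2024-03-31.
    dates = [date(2024, 3, 31) - timedelta(days=i) for i in range(90)]
    return select_backups_to_keep(dates, daily=7, weekly=4, monthly=3)
```

Anything not in the returned set is a candidate for pruning. The older generations are what let you go back past a problem that went unnoticed for days or weeks.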
Put more simply, backups account for and mitigate the common risks to data during storage while minimizing costs; ransomware is one of those common risks. It's organization-dependent, based on costs and available budget, so it varies.
Long-term storage usually has some form of Forward Error Correction (FEC) protection scheme (for bit rot). Backups are often segmented into a mix of full and incremental (or delta) backups to mitigate cost, with corresponding offline components for ransomware resiliency. That too depends heavily on the environment, as well as on the strategy being used for data minimization.
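For the full-plus-delta part, a minimal Python sketch, assuming each backup is a dict of file name to contents and that a delta records only what changed or was deleted since the previous state. `make_incremental` and `restore` are hypothetical names, not any product's API.

```python
def make_incremental(previous, current):
    """Record only the differences between two states (a toy delta backup)."""
    changed = {name: data for name, data in current.items()
               if name not in previous or previous[name] != data}
    deleted = sorted(name for name in previous if name not in current)
    return {"changed": changed, "deleted": deleted}


def restore(full, incrementals):
    """Rebuild a state by replaying deltas, oldest first, on top of a full backup."""
    state = dict(full)
    for inc in incrementals:
        state.update(inc["changed"])
        for name in inc["deleted"]:
            state.pop(name, None)
    return state


def example():
    day0 = {"a.txt": "1", "b.txt": "2"}                 # full backup
    day1 = {"a.txt": "1", "b.txt": "3", "c.txt": "4"}   # b changed, c added
    day2 = {"b.txt": "3", "c.txt": "4"}                 # a deleted
    inc1 = make_incremental(day0, day1)
    inc2 = make_incremental(day1, day2)
    return day0, day1, day2, inc1, inc2
```

The cost saving is that each delta stores only the files that changed. The trade-off is that restoring a given day needs the full backup plus every delta up to that day, all intact.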
> Usually this level of robustness isn't present in a "replication" setup.
Exactly, and thinking of replication as a backup often gives those using it a false sense of security in BC/DR situations.