Hacker News new | past | comments | ask | show | jobs | submit login

Velero can be configured to run on a schedule, but the scheduled command was apparently not the exact same command they were using to perform the manual tests - the scheduled job was missing part of the command, basically.



Sp they were manually doing a backup and then testing that backup, rather than testing their automated backups? If thats what they were doing that just makes me wonder... why?


It's possible that the backup restore/test process was still 'automated' to some degree, like a manually-run pipeline or playbook etc.

Obviously there were mistakes made in process design and tools deployment but I don't think this is 'baffling incompetence' more than it is a couple of small mistakes that compounded each other to have a pretty spectacular impact.


I am not sure about the pricing structure of their provider, but maybe it is a Hotel California situation? Data egress price of the provider was such that they did not want to pay to retrieve the full backup "just for a test"? Instead, repeat the backup command, but swap it out for a local location.




Consider applying for YC's Summer 2025 batch! Applications are open till May 13

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: