
Two things that may be relevant that I've done before:

1. The simplest. Set up scripts for doing full backups and restores to/from S3. This is helpful for many reasons, but should also help your case. Diffs are an optimisation that only pays off once your data is large; I think many cases are easily small enough to just push and pull as a whole. Until you're at a significant number of gigs of data, I'd recommend trying this first.

2. Record your data in an append-only format (this doesn't require a different database, just a strict way of doing things: insert new rows, never update or delete old ones). This means you can always get a full history of any bit of your data. I do this for some scraping work (recording lots of noisy/unreliable values, so not losing old versions is important). If your data is stored like this, pushing around diffs should be easy (grab everything with a created_at time after a certain point).
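
To make option 2 concrete, here's a rough sketch of the append-only idea using Python's built-in sqlite3. Everything except created_at is made up for illustration: the observations table, its columns, and the helper names are my assumptions, not a description of any particular setup. It also assumes created_at is stored as an ISO 8601 UTC string, so plain string comparison orders rows by time.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a store with a single append-only table.

    The table and column names are hypothetical; only created_at
    comes from the description above.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS observations (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               key TEXT NOT NULL,
               value TEXT NOT NULL,
               created_at TEXT NOT NULL
           )"""
    )
    return conn


def record(conn, key, value, created_at):
    """Append a new version. The code never issues UPDATE or DELETE."""
    conn.execute(
        "INSERT INTO observations (key, value, created_at) VALUES (?, ?, ?)",
        (key, value, created_at),
    )
    conn.commit()


def current(conn, key):
    """The latest recorded value for a key, or None if there is none."""
    row = conn.execute(
        "SELECT value FROM observations WHERE key = ? "
        "ORDER BY created_at DESC, id DESC LIMIT 1",
        (key,),
    ).fetchone()
    return row[0] if row else None


def history(conn, key):
    """Every version of a key, oldest first."""
    return conn.execute(
        "SELECT value, created_at FROM observations WHERE key = ? "
        "ORDER BY created_at, id",
        (key,),
    ).fetchall()


def diff_since(conn, since):
    """The 'diff': all rows created strictly after the given time."""
    return conn.execute(
        "SELECT key, value, created_at FROM observations "
        "WHERE created_at > ? ORDER BY created_at, id",
        (since,),
    ).fetchall()


def apply_diff(conn, rows):
    """Replay a diff into another copy of the store."""
    conn.executemany(
        "INSERT INTO observations (key, value, created_at) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
```

Syncing a copy is then something like apply_diff(replica, diff_since(source, last_sync_time)). One thing to watch: with a strict "after" comparison, a row written with exactly the same timestamp as your cutoff gets skipped, so you may prefer to track the last synced id instead, or overlap slightly and de-duplicate.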


