
Does Route53 depend on services in us-east-1, though? Or maybe it's something else, but I recall us-east-1 downtime causing outages for global services.


As far as I remember, Route53 is semi-regional. The master copy is kept in us-east-1, but individual regions have replicated data. So if us-east-1 goes down, the individual regions will keep working with the last known state.

Amazon calls this "static stability".
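
To make the "last known state" idea concrete, here is a minimal sketch, not how Route53 is actually implemented. The class name `LastKnownGoodCache` and the `fetch` callable are hypothetical; `fetch` stands in for whatever pulls records from the master copy.

```python
from typing import Callable, Dict, Optional


class LastKnownGoodCache:
    """Hypothetical regional replica that keeps serving its last good snapshot."""

    def __init__(self, fetch: Callable[[], Dict[str, str]]) -> None:
        # `fetch` is an assumed stand-in for replication from the master copy.
        self._fetch = fetch
        self._snapshot: Optional[Dict[str, str]] = None

    def refresh(self) -> bool:
        """Try to pull a fresh copy; on failure keep the old snapshot."""
        try:
            self._snapshot = dict(self._fetch())
            return True
        except Exception:
            # Master unreachable: do nothing, the existing snapshot stays in use.
            return False

    def lookup(self, name: str) -> Optional[str]:
        """Answer from local state only; never calls the master on the read path."""
        if self._snapshot is None:
            raise RuntimeError("no snapshot has ever been loaded")
        return self._snapshot.get(name)
```

The point of the sketch is that reads never touch the master, so a master outage only means the data stops getting updated.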


Static stability is a good start, but isn't enough.

In this outage, my service (on GCP) had static stability, which was great. However, some similar services failed and we picked up their load. Because of the outage we couldn't start additional instances to handle it, so we ended up with overloaded servers and poor service quality.

Perhaps we could have rebalanced traffic across regions to manage instance load, but that's not something we normally do.


One of the core pieces of static stability (at least in one definition; it's an overloaded term) is being able to handle failure scenarios from a steady state.

The classic example is overprovisioning so that you can handle the extra zonal load in the event of a zonal outage without needing to scale up.
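
A rough sketch of that arithmetic, assuming load is spread evenly across zones and redistributes evenly onto the survivors (the function name and parameters are made up for illustration):

```python
import math


def instances_per_zone(peak_instances: int, zones: int, zones_lost: int = 1) -> int:
    """Instances to run in EACH zone so the survivors can carry peak load.

    peak_instances: instances needed in total to serve peak load (assumed known).
    zones:          number of zones the service runs in.
    zones_lost:     how many simultaneous zone failures to tolerate.
    """
    surviving = zones - zones_lost
    if surviving <= 0:
        raise ValueError("cannot tolerate losing every zone")
    # The surviving zones alone must be able to hold the full peak load.
    return math.ceil(peak_instances / surviving)


# Peak load needs 12 instances, spread over 3 zones, tolerate losing 1 zone.
per_zone = instances_per_zone(12, 3)
total = per_zone * 3
print(per_zone, total)  # 6 per zone, 18 total
```

So with three zones you run 18 instances to serve a load that needs 12, i.e. 50% extra capacity sitting idle in steady state. That is the price of not needing to scale up during the outage.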



