Hacker News new | past | comments | ask | show | jobs | submit login

Back in 2010 when we were building Amazon Route 53, we had a really big problem to solve. DDOS attacks. DNS is critical, and it uses UDP, which is a protocol that allows attackers to spoof their source IP address. We knew that DNS services are a common target for attacks from botnets; and our research at the time showed that our established competitors used large and expensive "packet scrubbers" to handle this.

We budgeted out what we think it would cost to handle our scale and the price tag came to tens of millions of dollars. You might think that would be no problem for a big company like Amazon, but our total infrastructure budget for Route 53 was something like tens of thousands of dollars. At the edge, we were re-using CloudFront servers that had failed hard drives for our name servers; since we wouldn't need much storage, and our API servers were pretty modest. We had a team of about ~6 people. That's what "scrappy" looks like at AWS; spend nothing, minimize downside risk, get things done quickly. There was no way I was going to ask for tens of millions of dollars for packet scrubbers. Besides, they would take too long to arrive, and would make us too reliant on a vendor.

Early on we had decided to run Route 53 name servers on its own dedicated IP range to give some measure of isolation. We could use dedicated network links to make sure that Amazon's other infrastructure wouldn't be impacted. But that wouldn't help Route 53's customers from sharing fate with each other. We didn't have a real plan beyond "When it happens, get really good filtering using our existing network and system tools".

Early that summer, I was reading one of Knuth's recent fascicles for 4A and was swimming in combinatorial algorithms. One night it just "clicked" that by creating many virtual name servers, we could easily assign every customer to a unique combination of four of those virtual name servers. We could even control the amount of overlap; some quick math showed that we about two thousand name servers, we could guarantee that no two customer would share more than two name servers. That number is important because our experiments showed that domains resolve just fine even when two name servers are unreachable, but beyond that it starts to be a problem.

The recursive search algorithm to assign the IPs was inspired directly by the algorithms in 4A; it gives customer domains two more independent dimensions of isolation. They also get 4 name servers from 4 independent "stripes", which correspond to the different TLDs we use for the name server names (co.uk, com, net, org). This guarantees that if one of those TLDs has an issue (like a DNSSEC mistake), only one of the name servers is impacted. They also come from 4 independent "braids", which can be used to ensure that no two name servers share certain network paths or physical hardware. I just wouldn't have known how to do any of this without reading 4A. And I even have a background in combinatorials; from statistics and cryptography.

I've never been more excited by a solution; this approach gave us provable network IP level isolation between customer domains while costing basically nothing in real infrastructure. It's math. It wasn't completely free; we had to use 2,000 anycast IP addresses, and it turns out that we also had to register 512 domains for them because of how many TLDs require name servers to be registered and to have glue records; so that was a fun process working with our registrar. But we got it done.

I named the approach "Shuffle Sharding", and it's more discovery than invention. Many multi-tenant systems that use some kind of random placement get a kind of shuffle sharding, and network filtering techniques like Stochastic Fair Blue use time-seeded hashing to similar effect. But I've never seen anything quite the same, or with the level of control that we could apply; I could even extend it to a kind of recursive nested shuffle shading that isolates at even more levels. For example if you want to isolate not just a caller, but a caller's callers when they are in some kind of "on behalf of" call pattern.

Years later, I made a personal pilgrimage of gratitude to see a Knuth Christmas lecture in person, and sat in the front row. I still read every scrap of material that Knuth puts out (including the Organ pieces!) because I never know what it might inspire. All of this to say ... I do think his volumes are surprisingly practical for programmers; they broaden your mind as well as deepen your understanding. What more could you want.




These are the kinds of stories I love from HN! Stories of theory meeting practice are my absolute favorite.

My dad got me a set of TAoCP a few years back as a birthday gift (before 4b) and I realized that I do not have the background to even begin on 1 (at least not without some daily devotion of time). Maybe I'll have to take some more time and try again.


I suggest picking up Knuth's Concrete Mathematics. It helps you get used to the math in TAoCP as well as his writing style, which manages to be simultaneously rigorous, serious, and humorous.


Utterly unrelated to Knuth, and probably not useful for picking DNS servers because of their more static nature, but you might enjoy Ceph's CRUSH load distribution algorithm. Very rough gist: you can use it to e.g. set policy to avoid two replicas in the same rack, and ensure third replica goes on a different floor. And you can express server overload to migrate data off of one gradually, etc. Ceph uses that, with servers gossiping about failures & overload, to have clients directly contact the right servers.

https://ceph.io/assets/pdfs/weil-crush-sc06.pdf

(Disclaimer: Ex-employee. It's all open source.)


> Early that summer, I was reading one of Knuth's recent fascicles for 4A and was swimming in combinatorial algorithms. One night it just "clicked" that...

It's frankly unbelievable how frequently this happens. "Swimming around" in these concepts inspires solutions that otherwise seem to come from nowhere.


Do thank your colleagues in the retail arm over on .co.uk for me.

On launch, they very briefly mislisted the updated TAOCP 1-4B box set at £33 (~ $40), and to my surprise and delight, honoured it!


Hehe, I also got the box set at a huge discount when 4B came out. Amazon took their sweet time (and I had to confirm I didn't want to cancel the order), but they came through and I got my box set. It will be my son's graduation present.


you saved millions and all you can get are these free bananas




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: