Parts, yes. As for the specifics mentioned here, though: those services run on Infra Spanner, not Cloud Spanner, but they're the same stack. The main reason things like Gmail, Ads, etc. haven't moved to GCP is the internal tooling that's built up around Infra Spanner for those services. It's specific to Google and doesn't make sense in Cloud Spanner.
It's way WAY more than just Infra Spanner vs Cloud Spanner. Cloud Spanner doesn't support protobuf, which is annoying, but that's not a dealbreaker; it's still just a DB. The issue is really all the various internal frameworks (such as Apps Framework for Java), deployment systems (Server Platform, AKA Boq/Pod/Urfin), and so forth.
It's not just migrations that are hard, either; Google Cloud has put (almost?) zero effort into making it easy to use Cloud from systems running on Borg.
My old team was building a system that was half-GCP and half-Borg, and we had to write our own (extremely bad) Cloud Spanner fake for use in tests. In contrast, Infra Spanner is extremely well supported for tests. Same with BigQuery vs Dremel and many other systems.
Tell me more (bonus points if you find me on LinkedIn or other social because tracking comment responses on HN is really rough). I'd love feedback you have so I can bring it back to the product team!
Really glad to hear! Please find me on social media or LinkedIn and let me know how it goes for you using the PG layer. I'd love to hear more feedback.
You can! Spanner has a free trial: https://cloud.google.com/spanner/docs/free-trial-instance. Keep in mind that per-request pricing isn't free unless you stay under the free tier. So just take a look at what those limits are, because going above them means it's not free anymore.
The DeWitt clause is only there to prevent badly written performance data from getting attention it shouldn't. If anyone writes a good benchmark (good process, not good results, necessarily) to bring to us, our product teams absolutely will consider it.
More specifically, infra and cloud Spanner are the same stack. So they've progressed together hugely since 2013. :) The real differences between the two are more about the internal tooling we (Google) have built up around infra over the years, together with our other services that consume it. That tooling isn't relevant to anyone other than Google.
But that's kind of a moot point. I mean, if you're even looking at the likes of DynamoDB or Spanner, it's because you need the scale of those engines. PostgreSQL is fantastic, and even working for Google, I 100% agree with you. Just use PG...until you can't. Once you're in the realm of Spanner and DynamoDB, that's where this discussion becomes more of a thing.
Not necessarily true. DynamoDB on-demand pricing is actually way cheaper than RDS or EC2-based anything for small workloads, especially when you want it replicated.
100% this, and even though I work for Google I absolutely agree. BUT, for the folks that need it, PostgreSQL just DOESN'T cut it, which is why we have databases like DynamoDB, Spanner, etc. Arguing that we should "Just use PG" is kinda a moot point.
I think I said this in another comment, but I'm not shitting on Spanner or DDB's right to exist here. Obviously, there are _some_ problems for which a globally distributed, ACID-compliant, SQL-compatible database is useful. However, those problems are few and far between, and many/most of them exist at companies like Google. The fact is your average small to medium size enterprise doesn't need and doesn't benefit from DDB/Spanner, but "enterprise architects" love to push them for some ungodly reason.
True, but one would hope that both sides in this case would be putting their best foot forward. Getting peak performance out of right sizing your DB is part of that discussion. I can't imagine AWS would put down "126 million QPS" if they COULD have provided a larger instance that could deliver "200 million QPS", right? We have to assume at some point that both sides are putting their best foot forward given the service.
The 126M QPS number was certainly just the parts of Amazon.com retail that power Prime Day, not all of DDB traffic. If we were to add up all of DDB's volume, it would be way higher. At least an order of magnitude, if not more.
Large parts of AWS itself use DDB, both control plane and data plane. For instance, every message sent to AWS IoT will internally translate to multiple calls to DDB (reads and writes) as the message flows through the different parts of the system. IoT itself is millions of RPS and that is just one small-ish AWS service.
Infra and Cloud Spanner are the same stack. Having those services run on infra is more about the legacy tooling that would have to be shifted than anything around performance or ability to handle the load.