Nice story: around 2016 the OP project was not yet open source, just a web app. Cockroach Labs wanted to automate its use in their docs build. I asked and was approved to donate a fairly good amount to the author. My argument was that the amount of money, though high, was less than the cost of my time to add a few other features we wanted (BNF pre-processing things).
Having programmed in both Rust and Go: I mostly didn't need what this does in Go, because my problem was usually solved by panicking, which prints a stack trace for each goroutine and got me most of the way toward figuring out my current problem (usually something was stuck waiting or sending). In tokio there's no such print-the-stacks behavior, so it's much harder to get a snapshot of what's going on. (I'm relatively new to tokio and Rust, so perhaps there's a handy "print all the tokio tasks and their stack traces" method, but I haven't come across it yet.)
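For concreteness, here's a minimal sketch of the Go behavior I mean; the channel and the panic message are made up for illustration. One caveat: the default traceback level only prints the panicking goroutine, so run with GOTRACEBACK=all to get every goroutine in the dump.

    package main

    import "time"

    func main() {
        ch := make(chan int) // unbuffered channel that nothing ever reads from

        go func() {
            ch <- 1 // this goroutine blocks forever on the send
        }()

        time.Sleep(100 * time.Millisecond)

        // Panicking aborts the program and prints stack traces. With
        // GOTRACEBACK=all, the dump includes the goroutine stuck on the
        // send above, which is usually enough to spot what's wedged.
        panic("what is everyone waiting on?")
    }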
If folks use this console thing for perf reasons and not debug reasons, then yeah, maybe cool to have in Go.
The setup you describe is very much not simple. I worked at a place with very good DBAs and our replication setup caused us more downtime than anything else. Cockroach and Spanner exist because many programmers observed that what you describe is hard.
As a counter-anecdote: multiple startup projects I've worked on with separate MySQL setups where each had just a single master + two slaves (one warm for fast failover in case of hardware failure or upgrades, one cold for slow analytics-style queries) did just fine with millions (to tens of millions) of users. No downtime at all for years on end.
MySQL and Postgres are massively more widely used than Cockroach and Spanner, broadly very successfully. It's entirely feasible to run them with high uptime.
I think that is meant to be parsed as: Just like you can't check if the fridge light is on without opening the door (which of course turns it on), it's hard to know if a system is resilient to failure without having one. It just might be that there hasn't been a situation that would cause a failure.
I'll guess: money. Postgres is decades old and was designed when the internet was smaller. Making a large, fundamental change like this requires an already experienced person (or maybe more than one) devoting a lot of time to designing and implementing a solution. That time costs money, so some company must be willing to employ or pay people to work full-time on this for months. Anyone qualified to work on this will be expensive, so the full cost of paying experts for months of their time would be in the ~$100-200k range. That's far outside the donate-a-cup-of-coffee-each-month range, and outside any small startup's budget, too.
This suggests that the various companies employing people to work on Postgres-related stuff (like Microsoft, perhaps due to their purchase of Citus) have more lucrative work they'd rather do instead of improve this at the design level.
This problem is now perhaps larger than open source is designed to handle because of how expensive it is to fix. Very few people (zero in this case) are willing to freely donate months of their life to improve a situation that will enrich other companies.
Regarding the difficulty of doing this: the blog post here describes how the concurrency and transaction safety model is related to connections, so any connection-related work must also be aware of transaction safety (very scary).
> This suggests that the various companies employing people to work on Postgres-related stuff (like Microsoft, perhaps due to their purchase of Citus) have more lucrative work they'd rather do instead of improve this at the design level.
Well, I (the author of this post, employed by MS) did just spend quite a while working on improving connection scalability. And, as I outlined in the precursor blog post, improving snapshot scalability is practically a prerequisite for changing the connection model to handle significantly larger numbers of connections...
Trying to fix things like this in one fell swoop instead of working incrementally tends to not work in my experience. It's much more likely to succeed if a large project can be chopped up into individually beneficial steps.
I wrote one of the listed tools (github.com/mjibson/esc) and am thrilled about this proposal. I think it's great and solves all of the problems well.
I wrote an unlisted tool too, and I am also a fan of this proposal.
The fact that there are so many of these "embed file in binary" tools suggests that it really is a problem that could be usefully solved once, in a consistent and reliable fashion.
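For anyone landing here later: the proposal ended up shipping as the //go:embed directive plus the embed package in Go 1.16. A minimal sketch of that shipped form, where static/index.html is just a placeholder path:

    package main

    import (
        "embed"
        "fmt"
    )

    // The //go:embed directive compiles the matched files into the binary;
    // embed.FS then exposes them as a read-only file system at runtime.
    //
    //go:embed static/*
    var staticFiles embed.FS

    func main() {
        data, err := staticFiles.ReadFile("static/index.html")
        if err != nil {
            panic(err)
        }
        fmt.Println(string(data))
    }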
I've been using acme for ~6 years now and it's still my daily editor. I wrote an LSP client for it (https://github.com/mjibson/acre). acme is so weird because when you start out it's like "wait, so I have to write little shell scripts to do everything?". But then it slowly dawns on you that larger programs (like acre) are possible that are much more interactive, like modern IDEs.
This looks pretty cool. But I'm having a hard time grasping the concept of double-clicking for things like goto definition. Which brings up this question: is acme usable 99% from the keyboard?
Which version of CockroachDB was that? A few years ago it had a rudimentary heuristic optimizer that made plenty of poor choices. Today it has a quickly improving cost-based optimizer that makes much better choices. Each release now has significant jumps in which kinds of queries it can produce fast plans for.
Money prevents it. It takes money to host things and pay people to work on infrastructure. While people often volunteer to contribute to OSS products because they like or use them, not many are willing to write infrastructure that can handle this kind of traffic in their spare time. Even if you can find someone to donate the time, you'd still need to fund that infra in some way. Having an infra company (say, Google donates a bunch of GCP credits) to cover the hosting costs still puts the project at risk if the host company decides to stop funding.
Whatever happened to people hosting things in an old computer in their basement? That used to be a more popular thing back in the days before the cloud came about and before we had these stable broadband connections. Obviously an infra like npm couldn't deliver with such a setup but at scale, who knows
Was that ever really a thing though? When I dig through my memory (and READMEs on old hard drives) I see a lot of .edu addresses. Seems the good-old-days of the internet wasn't about hosting things on an old computer in your basement but rather hosting things on an old computer in your school's basement.
And the point when home connectivity became good enough for people to consider home hosting was also around the time Slashdot was created.
Maybe you don't remember BBS systems. I know a bunch of people who hosted BBS systems "in their basement", hooking up multiple phone lines (4 or more) and multiple modems connected to a C64, or an Amiga, or a PC. And yes, we were calling all over the country (and the world) to exchange software, programming tools, games, etc. The good old days.
Super mega ultra hard. It would take so much time for us to learn rust, port everything including all tooling, and fix new bugs we introduce that we wouldn't add any new features (but lots of new bugs!) for like 2-4 years and the company would die.
Thanks for your input. I have no experience at all with DB internals, but could you expand on your thoughts about the possibility of moving the critical parts (the storage layer?) to Rust and leaving the networking stuff and the rest in Go? Similar to what TiDB has done?
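Not speaking for how CockroachDB would actually do it, but as a hypothetical sketch of one shape such a split could take: the storage engine exposes a C ABI (which Rust can do with #[no_mangle] pub extern "C" functions) and the Go side calls it through cgo. The storage_get function below is made up, and the tiny C stub is only a stand-in so the sketch builds on its own; note that TiDB itself runs its Rust storage layer (TiKV) as a separate process behind gRPC rather than in-process FFI.

    package main

    /*
    // In a real split, this function would be implemented in Rust, exported
    // over the C ABI, and linked into the Go binary. The C stub below is
    // only a placeholder so this sketch compiles by itself.
    #include <stdint.h>
    static int64_t storage_get(int64_t key) { return key * 2; }
    */
    import "C"

    import "fmt"

    func main() {
        // The Go side keeps networking, SQL, etc.; storage calls cross the
        // FFI boundary into the (hypothetical) Rust engine.
        v := C.storage_get(C.int64_t(42))
        fmt.Println("stored value:", int64(v))
    }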
The Control of Nature (https://en.wikipedia.org/wiki/The_Control_of_Nature) is a book that contains this essay and two others (one about Iceland attempting to divert lava flows using pumped ocean water, one about southern CA attempting to build housing in an area dominated by complementary mudslides and firestorms). The whole book is a really great (in John McPhee's unique style) description of what happens when humans attempt to restrict or alter the earth's natural changes.
The above Atchafalaya essay is eye-opening about the Mississippi River and how its natural course has swung back and forth hundreds of miles over the centuries. We have now decided these two rivers should stop moving, but the earth doesn't see it that way. When they hit the Gulf, their flow slows and drops the sediment they carry, which nudges the mouth toward an area with less deposited sediment. Humans have built walls attempting to constrain that movement, but it may be a long-term losing battle.
Recommended reading, and a nice entry point to McPhee if you haven't read him yet.
The book was a fascinating read, though I sometimes had to consult a map; it could have used some visual aids.
Basically, that river would take over the bulk of the flow of the Mississippi but for human intervention, in this case by the Army Corps of Engineers:
> If the Mississippi were allowed to flow freely, the shorter and steeper Atchafalaya would capture the main flow of the Mississippi, permitting the river to bypass its current path through the important ports of Baton Rouge and New Orleans.