The area where I've seen the most homegrown implementations of things like these is HFT, with the caveat that they're also designed to be distributed, integrated with isolation systems, start/stop dependency graphs...
I once worked for a company that chose to use Kubernetes instead; they regretted it.
A lot of people here don't understand that Discord was born as an alternative to TeamSpeak, Mumble, and Ventrilo, whose main purpose was voice-chat rooms for use while playing video games. They were difficult to maintain since you had to run your own servers. Discord swept them away with its ease of setup and generous free tier.
If you don't use it for that purpose, there are tons of alternatives.
I think you're right, but Discord also replaced IRC for a lot of people/communities, and I don't think they all make use of the voice-chat feature. It may be that there's no perfect alternative for everyone, but we could still "save" a large group.
The big reason is that H3 is data-independent. You put your data into predefined bins and then join on them, whereas kd-trees and r-trees depend on the data, and building the trees can become prohibitive or very hard (especially in distributed systems).
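To make the "predefined bins" point concrete, here's a minimal sketch (assuming the h3-py library's v4 API, where `geo_to_h3` was renamed to `latlng_to_cell`; the coordinates and resolution are made up):

```python
import h3  # pip install h3; v4 API assumed

# Mapping a point to a bin is pure arithmetic on the coordinates:
# no index has to be built from the data first.
points = [(37.7749, -122.4194), (51.5074, -0.1278)]  # hypothetical points
resolution = 7  # hexagons of roughly 5 km^2; the "zoom level" both sides agree on

cells = [h3.latlng_to_cell(lat, lng, resolution) for lat, lng in points]
print(cells)  # stable cell IDs, identical no matter what other data exists
```

Any two datasets binned this way at the same resolution share the same cell IDs, which is what makes the downstream join trivial.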
I think the key is the distributed nature: H3 is effectively a grid, so it can easily be distributed over nodes. A recursive structure is much harder to handle that way. R-trees are great if you are OK with indexing all data on one node, which I think is a no-go for a global system.
This is all speculation, but intuitively your criticism makes sense.
Also, mapping 147k cities to countries should not take 16 workers and 1TB of memory, I think the example in the article is not a realistic workload.
To add to the sibling comment: if you have streaming data you have to maintain the index on every insert with r/kd-trees, whereas with H3 you just compute the bin, O(1) instead of O(log n) (sketched below).
Not rocket science but different tradeoffs, that’s what engineering is all about.
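A toy illustration of that cost difference (plain Python; `bisect` on a sorted list stands in for tree-index maintenance, coordinates are hypothetical, h3-py v4 API assumed):

```python
import bisect
import h3

# Tree-style index: each streamed point pays O(log n) to locate its slot
# (and a real r-tree additionally pays for node splits / rebalancing).
tree_index = []
def tree_insert(key):
    bisect.insort(tree_index, key)  # O(log n) search, plus an O(n) list shift here

# H3-style index: each streamed point is an O(1) arithmetic binning,
# independent of how many points have already been indexed.
bins = {}
def h3_insert(lat, lng, payload, res=7):
    bins.setdefault(h3.latlng_to_cell(lat, lng, res), []).append(payload)

h3_insert(37.7749, -122.4194, "event_1")  # constant time, no rebalancing
```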
How do you join two datasets using r-trees? In a business setting, having a static and constant projection is critical. As long as you agree on zoom level, joining two datasets with S2 and H3 is really easy.
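As a hedged sketch of how easy that join is once both sides agree on a resolution (plain Python, h3-py v4 API assumed, datasets invented for illustration), the cell ID becomes an ordinary equi-join key:

```python
import h3
from collections import defaultdict

RES = 7  # the agreed "zoom level"; both datasets must use the same value

def bucket(records, res=RES):
    """Group (lat, lng, payload) records by their H3 cell."""
    buckets = defaultdict(list)
    for lat, lng, payload in records:
        buckets[h3.latlng_to_cell(lat, lng, res)].append(payload)
    return buckets

stores = [(37.7749, -122.4194, "store_a")]   # hypothetical dataset A
orders = [(37.7751, -122.4190, "order_42")]  # hypothetical dataset B

a, b = bucket(stores), bucket(orders)
# The spatial join reduces to a hash join on cell IDs:
matches = {cell: (a[cell], b[cell]) for cell in a.keys() & b.keys()}
print(matches)
```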
This data is indeed not irregularly distributed, in fact the fun thing about geospatial data is that you always know the maximum extent of it.
About your binary-tree comment: yes, this is absolutely valid, but consider that binary trees are also a bad fit for distributed computing, where data is often partitioned at the top level (making it no longer a binary tree but a set of binary trees) and cross-node joins are expensive.
On that specific count, not really. There's a skate park at the north end of the Mission, and Stevenson St is a two-way road that borders it, but it's narrow enough that you need to drive up on the curb to get two vehicles side by side on the street. Waymos can't handle that on a regular basis. Being San Francisco and not London, you can just skip that road, but if you find yourself in a Waymo on that street and are unlucky enough to have other traffic on it, the Waymo will just have to back up the entire street. Hope there's no one behind you as well as in front of you!
Anyway, we'll see how the London rollout goes, but I get the impression London's got a lot more of those kinds of roads.
I live in London. Most residential streets are two-way but there is only space for one car, and driving on the curb is not really an option.
The trick to UK streets is that parking happens on the street itself, and when driving you have to duck into gaps where no one is parked to make way for traffic coming the other way.
That is extremely narrow, I wonder why the city has not designated it as a one-way street? They've done that for other similarly narrow sections of the same street farther north.
Another comment mentioned the Philippines as the manifest frontier. SF is not on the same plane of reality as the Philippines in terms of density or narrow streets; I would argue that, in comparison, it does not have both.
This is an alley in Coimbra, Portugal. A couple of years ago I stayed at a hotel on this very street and took a cab from the train station. The driver could have stopped in the square below and told me to walk 15 m up. Instead the guy went all the way up, then curved through 5-10 alleys like that to drop me off right in front of my place, at significant speed as well. It was one of the craziest car rides I've ever experienced.
I live in such an area. The route to my house involves steep topography via small winding streets that are very narrow and effectively one-way due to parked cars.
Human drivers routinely do worse than Waymo, which I take 2 or 3 times a week. Is it perfect? No. Does it handle the situation better than most Lyft or Uber drivers? Yes.
As a bonus: unlike some of those drivers the Waymo doesn't get palpably angry at me for driving the route.
5 - Datacenter (DC) - Like 4, except also take control of the space/power/HVAC/transit/security side of the equation. Makes sense either at scale, or if you have specific needs. Specific needs could be: specific location, reliability (higher or lower than a DC), resilience (conflict planning).
There are actually some really interesting use cases here. For example, reliability: if your company is in a physical office, how strong is the need to run your internal systems in a data centre? If you run your servers in your office, then there are no connectivity reliability concerns. If the power goes out, then the power is out to your staff's computers anyway (still get a UPS, though).
Or perhaps you don't need as high reliability if you're doing only batch workloads? Do you need to pay the premium for redundant network connections and power supplies?
If you want your company to still function in the event of some kind of military conflict, do you really want to rely on fibre optic lines between your office and the data center? Do you want to keep all your infrastructure in such a high-value target?
I think this is one of the more interesting areas to think about, at least for me!
When I worked IT for a school district at the beginning of my career (2006-2007), I was blown away that every school had a MASSIVE server room (my office at each school was the MDF). There were 3-5 racks filled, depending on school size and connection speed to the central DC / data closet: 50-75% was networking equipment (5 hardwired PCs per class), 10% was the Novell NetWare server(s) and storage, and the other 15% was application storage for app distribution on login.
Personally I haven't seen a scenario where it makes sense beyond a small experimental lab where you value the ability to tinker physically with the hardware regularly.
Offices are usually very expensive real estate in city centers, with very limited cooling capabilities.
Then again the US is a different place, they don't have cities like in Europe (bar NYC).
If you are a bank or a bookmaker or similar you may well want to have total control of physical access to the machines. I know one bookmaker I worked with had their own mini-datacenter, mainly because of physical security.
I am pretty forward-thinking but even when I started writing my first web server 30+ years ago I didn’t foresee the day when the phrase “my bookie’s datacenter” might cross my lips.
If you have less than a rack of hardware, if you have physical security requirements, and/or your hardware is used in the office more than from the internet, it can make sense.
5 was a great option for ML work last year, since rented colo didn't come with a 10 kW cable. With RAM, SSD, and GPU prices the way they are now, I have no idea what you'd need to do.
Thank goodness we did all the capex before the OpenAI RAM deal, when expensive Nvidia GPUs were the worst we had to deal with.
I stopped reading at "soon to become the practice of writing software".
That belief has no basis at this point, and it's been demonstrated not only that AI doesn't improve coding but also that the associated costs are not sustainable.
Because typing in text and syntax is now becoming irrelevant and mostly taken care of by language models. Computational thinking and semantics, on the other hand, will remain essential to the craft, as they always have been.
Care to link your sources? At least one of the studies that got attention here was basically done with a bunch of programmers who had no prior experience with the tools.
I stopped reading at the abstract; garbage rant full of contradictions.