H3 doesn't guarantee a child hexagon at level N+1 strictly belongs to 1 parent at level N. S2 is built on this exactly this primitive, but then struggles with cell-size variability across latitude.
This lack of strict hierarchy seeming negates alot of practical benefits (e.g tree data-structure that maps well to sharding and aggregation). Whilst I haven't dug into H3 that much from a practical sense - but I have build several Geospatial systems with S2 that exploit this strict hierarchy - I can't imagine this isn't a huge pain-point with H3.
Would be interested to hear of how these approximate cases are handled at Uber or in any practical setting.
I typically only rely on the logical parent/child relationship between cells, and containment there is strict even if geometric containment is only approximate. The logical relationship is useful, for example, in providing a compact representation when you have a large collection of cells: https://h3geo.org/docs/highlights/indexing
I'll use the approximate geometric containment mostly just to get a rough idea of where cells are. For example, in the plots of cells covering California in the link above, plotting the "compacted" cells is still visually useful, even if you aren't seeing the exact boundaries of the uncompacted set it represents.
How do you typically leverage exact geometric containment with S2 in your applications? I'm curious because I work on H3 and h3-py (https://uber.github.io/h3-py), and maybe there's something we can build (or it already exists) that would fit your use case.
One example I am trying to wrap my head around is if you have two adjacent polygons (say California and Oregon) and perform an interior cover of both with variable hex sizes.
It seems possible that a child hex might actually slip outside the boundary - since the 7 children don't fit squarely inside the parent (no pun intended).
In S2 it guaranteed that any child cell of the S2CellUnion representing that cover is strictly inside the polygon bounds.
This doesn't seem to be guaranteed in H3. I could have a location that is in Oregon, that depending on the child resolution could slip into to Oregon instead of California - or vice versa?
Now imagine an business application where a user must be mapped to one of 2 physically exclusive regions, (for say pricing, legal, compliance reasons) it seems like exact containment is preferred.
Perhaps there is another way to employ H3 that would mitigate this?
I tried to use this a few years ago but found the tooling inadequate for my use case. It wasn’t supported well enough across the various libraries involved in ingestion, storage and web client rendering. I’d love to hear from anyone using Hierarchical Triangular Mesh in the wild to learn more about how they made it work.
Fractals are super efficient (a binary tree is a fractal used for search) so I feel like I prefer S2 in this case. How are fractals being used in H3, and could other cell geometry be more efficient than the space-filling curve fractal used in S2?
Uber's open source geospatial suite is simply amazing. A while back I used H3 to visualize snowfall in Colorado (powdamap.com) and the result came out way better than a grid based approach with the added of benefit of doing a better job representing a continuous variable.
How was it better? I'm having trouble understanding why this is better than a simple grid. Grids on the surface of a sphere certainly have problems at high latitudes if naively applied. But they are incredibly simple to use, and don't have H3's problems of edges that don't quite overlap.
Depends on your use case. Hex's have a lot of advantages when you are a transportation oriented company. All hexes are the same size and the distance between the center of any two adjacent hexes is always the same distance. https://h3geo.org/docs/highlights/aggregation
As an example of where this is useful, it makes it very easy to get a list of all hexes with distance less than X from a current location.
Solve the problem "find me the all restaurants within 1 mile of of my location" efficiently in a database with restaurants and their lat-long coordinates.
Brute force solution: iterate over all possible restaurants, compute their distance to your location, then return the list that meet the criteria.
Better solution: cut the world into 1-mile square grids, and assign each restaurant a grid square index. Search for all restaurants in your grid square, plus all adjacent grid squares to that (because you might be on the edge of your square), and filter out the ones that are more than 1 mile away. This is a pretty good solution, but you're searching a square area for a circle of restaurants. Any restaurants in the corners of the square are wasting your time -- you're never within 1 mile of the corner.
So, if you could search a circular area, you'd have no corners. Using tessellated hexagons means your search space is more circular, so it's more efficient.
Fun fact: Uber surge pricing is calculated per-hexagon (h9, I think). If you're near a hexagon boundary experiencing surge, hop across to another one and check again.
Unfortunately this isn't strictly true anymore (and hasn't been for a few years). All areas of surge have a quadratic fall-off zone around the high point to solve just this. You'd have to walk some distance for this to be helpful.
It may work near a major avenue or political boundary or an event like a parade where there are "dispatch walls" set up or where there are two city areas that meet - think Reno / Tahoe.
Yes, while you don't get such extreme drops, surge can drop from 1.6x to 1.4x by crossing the street if you're in the right place, which can save quite a bit on a longer journey.
I wrote a blog post exploring the tradeoffs different geo grid systems make, including H3. The post describes a proof of concept exploring a different part of design-tradeoff space.
In S2, the binary representations of a cell’s parents (larger cells that contain the cell) are always a prefix of the cell itself’a binary representation. This lets you perform constant time containment checks.
Dumb question: I've put points on a sphere using a Fibonacci series, then relaxed them and triangulated them, and there are some 5's and 7's, not all hexagons. I thought an all-hexagon tiling wasn't possible. How do they do it?
There’s a 2018 blog post from Uber Engineering[0] which has more technical details about the system, and explains:
> Since it is not possible to tile the icosahedron with only hexagons, we chose to introduce twelve pentagons, one at each of the icosahedron vertices.
The warping you’re observing is a result of the projection used to display the map on a 2D screen. They actually do use pentagons to solve the tessellation issue.
First thought: "this looks like what we used at Uber."
Sure enough, it is. It works well IIRC and there's a lot of interesting math surrounding it. Some of our internal tools we had access to at the time (now, presumably, under lock and key) had maps laid out in hexagons.
E: Some of the visualizers linked here seem a bit weird. I could be mis-remembering things, but the hexagons were WAY smaller than some of the visualizers here. I remember looking at the SF map and there was a lot of granularity even in the city center. Like I said though, could be mis-remembering.
It wasn't that granular. Maybe a few blocks of the city in size, give or take. The visualizers I'm seeing have a single hexagon span the entirety of Austin, TX for reference. At least in Uber's usecase, that was less than useful.
It’s a hierarchy… so you can take your pick of granularity? Only thing that matters is min, max and step-size.
Unless we’re talking about different things, I’m not clear why we’re not assuming the tool you remember chose h10 or whatever level was actually useful to it, where the visualizations chose h5
also check out the "gnomonic icosahedral" at H0 from the dropdowns, that's the projection the base hexagonal grid is from, it's a perfectly planar hexgrid on an unfolded icosahedron net.
There’s gifs of hexagons on a planet you can search but you can’t cover a sphere with just hexagons. Even so, I imagine the poles are irrelevant to Uber
The image shows a number of pentagons, so it's not just hexagons unless you consider a pentagon some kind of degenerate hexagon. That said, you can indeed cover a sphere with only hexagons, if you relax the requirement that they all be regular hexagons.
If you have only hexagons, you end up with 6 vertices on the sphere where only 2 hexagons meet (whether you still consider these to be "hexagons" when they have two adjacent sides is a matter of definitions).
But what many spacial indices do instead is include 12 pentagons among the hexagons.
When I was working in space science we experimented with using Hierarchical Triangular Meshes but we ended up not really needing the faster indexing it offered since most of our processing was done in small portions of the sky at a time anyways.
Somewhat related - how would you setup geospatial indexing using a traditional index? For example IndexedDB?
I’ve been wanting to implement something similar to this (albeit much lighter) for the use with indexeddb - it’s challenging since many of the capabilities here just aren’t available to JavaScript on the browser.
Does anyone happen to know of any geospatial indexing solutions that can also index simultaneously by other dimensions. Temporal indexing for example (not only where, but when)?
Yes, with caveats and reasons you don't often see it.
As a practical matter, you want to fit the indexing structure to the properties of your data model as closely as possible. Increasing the generality of high-dimensionality spatial indexes comes at a high cognitive cost, so most complex high-dimensionality indexes are bespoke designs to limit generality. Things become pretty messy when you mix dimension types that are interval-like (e.g. polygons) and monotonic-like (e.g. temporal) so almost no one does it.
You can build, say, a general 8-dimensional index that can handle (some) distributions of interval and monotonic types simultaneously, in addition to the usual boring data types, that has excellent performance and scalability characteristics. It would probably only require something like 1000 lines of C++, so not too onerous. However, the code logic would be nearly impenetrable to read, never mind write, which matters for practical engineering.
H3 cell indexes are just integers so you can easily make a compound index of (cell_index,timestamp). This would be easy in SQL and I'd have to imagine just about anything else that supports a compound index.
H3 doesn't guarantee a child hexagon at level N+1 strictly belongs to 1 parent at level N. S2 is built on this exactly this primitive, but then struggles with cell-size variability across latitude.
This lack of strict hierarchy seeming negates alot of practical benefits (e.g tree data-structure that maps well to sharding and aggregation). Whilst I haven't dug into H3 that much from a practical sense - but I have build several Geospatial systems with S2 that exploit this strict hierarchy - I can't imagine this isn't a huge pain-point with H3.
Would be interested to hear of how these approximate cases are handled at Uber or in any practical setting.