H3: Hexagonal hierarchical geospatial indexing system

samirahmed · on Sept 15, 2021

S2 is used pretty heavily across the industry. Comparison - (https://h3geo.org/docs/comparisons/s2/)

H3 doesn't guarantee a child hexagon at level N+1 strictly belongs to 1 parent at level N. S2 is built on exactly this primitive, but then struggles with cell-size variability across latitude.

This lack of strict hierarchy seemingly negates a lot of practical benefits (e.g. a tree data structure that maps well to sharding and aggregation). I haven't dug into H3 much in practice, but I have built several geospatial systems with S2 that exploit this strict hierarchy, and I can't imagine this isn't a huge pain point with H3.

Would be interested to hear of how these approximate cases are handled at Uber or in any practical setting.

ajfriend · on Sept 15, 2021

I typically only rely on the logical parent/child relationship between cells, and containment there is strict even if geometric containment is only approximate. The logical relationship is useful, for example, in providing a compact representation when you have a large collection of cells: https://h3geo.org/docs/highlights/indexing

I'll use the approximate geometric containment mostly just to get a rough idea of where cells are. For example, in the plots of cells covering California in the link above, plotting the "compacted" cells is still visually useful, even if you aren't seeing the exact boundaries of the uncompacted set it represents.

How do you typically leverage exact geometric containment with S2 in your applications? I'm curious because I work on H3 and h3-py (https://uber.github.io/h3-py), and maybe there's something we can build (or it already exists) that would fit your use case.

samirahmed · on Sept 16, 2021

One example I am trying to wrap my head around is if you have two adjacent polygons (say California and Oregon) and perform an interior cover of both with variable hex sizes.

It seems possible that a child hex might actually slip outside the boundary - since the 7 children don't fit squarely inside the parent (no pun intended).

In S2 it is guaranteed that any child cell of the S2CellUnion representing that cover is strictly inside the polygon bounds.

This doesn't seem to be guaranteed in H3. I could have a location that is in California that, depending on the child resolution, could slip into Oregon instead - or vice versa?

Now imagine a business application where a user must be mapped to one of 2 physically exclusive regions (for, say, pricing, legal, or compliance reasons). It seems like exact containment is preferred.

Perhaps there is another way to employ H3 that would mitigate this?

danbruc · on Sept 15, 2021

There is also Hierarchical Triangular Mesh [1] which also forms a proper quad tree but I am not sure how the cell variability compares to S2.

[1] https://www.microsoft.com/en-us/research/wp-content/uploads/...

techdragon · on Sept 16, 2021

I tried to use this a few years ago but found the tooling inadequate for my use case. It wasn’t supported well enough across the various libraries involved in ingestion, storage and web client rendering. I’d love to hear from anyone using Hierarchical Triangular Mesh in the wild to learn more about how they made it work.

peterburkimsher · on Sept 16, 2021

Thank you for linking the comparison to S2!

The big difference to me seems to be the shape of the cells: hexagonal for H3, square for S2.

H3 hexagonal is better for seeing more neighbouring cells.

S2 is better for being referenced with a numeric coordinate system, and uses a fractal space-filling curve to generate subdivisions.

https://s2geometry.io/devguide/s2cell_hierarchy

Fractals are super efficient (a binary tree is a fractal used for search) so I feel like I prefer S2 in this case. How are fractals being used in H3, and could other cell geometry be more efficient than the space-filling curve fractal used in S2?

kovek · on Sept 16, 2021

I remember searching for S2 (+”geofencing”, “map”) and not being able to find it. What search terms would have helped?

serjester · on Sept 15, 2021

Uber's open source geospatial suite is simply amazing. A while back I used H3 to visualize snowfall in Colorado (powdamap.com) and the result came out way better than a grid-based approach, with the added benefit of doing a better job representing a continuous variable.

oofbey · on Sept 15, 2021

How was it better? I'm having trouble understanding why this is better than a simple grid. Grids on the surface of a sphere certainly have problems at high latitudes if naively applied. But they are incredibly simple to use, and don't have H3's problems of edges that don't quite overlap.

rodonn · on Sept 15, 2021

Depends on your use case. Hexes have a lot of advantages when you are a transportation-oriented company. All hexes are the same size, and the distance between the centers of any two adjacent hexes is always the same. https://h3geo.org/docs/highlights/aggregation

As an example of where this is useful, it makes it very easy to get a list of all hexes with distance less than X from a current location.
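To make the "all hexes within distance X" idea concrete, here is a minimal sketch on an idealized flat hex grid using axial (q, r) coordinates. This is not H3's own code: `hex_disk` and `hex_distance` are hypothetical helper names, and the flat grid ignores the sphere and the pentagons. H3's bindings expose a comparable neighborhood query (`k_ring` in the 3.x API, `grid_disk` in 4.x).

```python
def hex_distance(a, b):
    """Grid distance (in steps) between two axial (q, r) cells."""
    dq = a[0] - b[0]
    dr = a[1] - b[1]
    return max(abs(dq), abs(dr), abs(dq + dr))


def hex_disk(center, k):
    """Return every axial (q, r) cell within k steps of center."""
    cq, cr = center
    cells = []
    for dq in range(-k, k + 1):
        # Keep only offsets where |dq|, |dr| and |dq + dr| are all <= k.
        for dr in range(max(-k, -dq - k), min(k, -dq + k) + 1):
            cells.append((cq + dq, cr + dr))
    return cells
```

A disk of radius k holds 1 + 3k(k + 1) cells, so `hex_disk((0, 0), 1)` has 7 entries and radius 2 has 19. Because every neighbor is one step away, there is no diagonal special case as on a square grid.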

Here's a comparison they give to some of the other common geographical partitions https://h3geo.org/docs/comparisons/s2

Pamar · on Sept 15, 2021

Yes. This is why, btw, wargames have more or less standardized in hex grids: no sudden 20% increase of speed for units moving diagonally.

Sanguinaire · on Sept 15, 2021

Hexagons are the bestagons; better than all the restagons.

peterb · on Sept 15, 2021

You forgot the reference video!! :-) https://www.youtube.com/watch?v=thOifuHs6eY&t=5s

Blammar · on Sept 15, 2021

Don't you mean sqrt(2) increase in speed? (1.41+)

Pamar · on Sept 16, 2021

Yes, sorry, 1.4 (also consider that most wargames have rules where travelling on roads units can double speed...)

thehappypm · on Sept 15, 2021

Solve the problem "find me all the restaurants within 1 mile of my location" efficiently in a database with restaurants and their lat-long coordinates.

Brute force solution: iterate over all possible restaurants, compute their distance to your location, then return the list that meet the criteria.

Better solution: cut the world into 1-mile square grids, and assign each restaurant a grid square index. Search for all restaurants in your grid square, plus all adjacent grid squares to that (because you might be on the edge of your square), and filter out the ones that are more than 1 mile away. This is a pretty good solution, but you're searching a square area for a circle of restaurants. Any restaurants in the corners of the square are wasting your time -- you're never within 1 mile of the corner.

So, if you could search a circular area, you'd have no corners. Using tessellated hexagons means your search space is more circular, so it's more efficient.
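The square-grid version described above can be sketched roughly as follows. To keep the example short it assumes locations are already projected to planar (x, y) coordinates in miles, which sidesteps the way longitude cells shrink with latitude. The function names and sample data are hypothetical.

```python
import math
from collections import defaultdict

CELL = 1.0  # cell side in miles, chosen equal to the search radius


def cell_of(x, y):
    """Grid square index for a planar point."""
    return (math.floor(x / CELL), math.floor(y / CELL))


def build_index(places):
    """places: iterable of (name, x, y), with x and y in miles."""
    index = defaultdict(list)
    for name, x, y in places:
        index[cell_of(x, y)].append((name, x, y))
    return index


def within_one_mile(index, x, y):
    """Scan the query's square plus its 8 neighbors, then filter by distance."""
    cx, cy = cell_of(x, y)
    hits = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for name, px, py in index.get((cx + dx, cy + dy), []):
                if math.hypot(px - x, py - y) <= CELL:
                    hits.append(name)
    return sorted(hits)
```

The 3x3 block covers 9 square miles while the target circle covers about 3.14, so roughly two thirds of the scanned area is thrown away by the distance filter. That is the waste a rounder search region reduces.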

fredley · on Sept 15, 2021

Fun fact: Uber surge pricing is calculated per-hexagon (h9, I think). If you're near a hexagon boundary experiencing surge, hop across to another one and check again.

krebby · on Sept 15, 2021

Unfortunately this isn't strictly true anymore (and hasn't been for a few years). All areas of surge have a quadratic fall-off zone around the high point to solve just this. You'd have to walk some distance for this to be helpful.

It may work near a major avenue or political boundary or an event like a parade where there are "dispatch walls" set up or where there are two city areas that meet - think Reno / Tahoe.

fredley · on Sept 15, 2021

Yes, while you don't get such extreme drops, surge can drop from 1.6x to 1.4x by crossing the street if you're in the right place, which can save quite a bit on a longer journey.

RicoElectrico · on Sept 15, 2021

This sounds similar to the S2 cell shenanigans people do in Pokemon GO and Ingress. ;)

Tarrosion · on Sept 15, 2021

I wrote a blog post exploring the tradeoffs different geo grid systems make, including H3. The post describes a proof of concept exploring a different part of design-tradeoff space.

https://evanfields.github.io/No-Perfect-Geo-Grid/

jowday · on Sept 15, 2021

Sounds similar to Google’s S2 library.

s2geometry.io

In S2, the binary representation of each of a cell’s parents (larger cells that contain the cell) is always a prefix of the cell’s own binary representation. This lets you perform constant-time containment checks.
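The prefix idea can be illustrated with a toy quadtree ID. This is a simplified sketch and not the real S2 encoding, which also carries face bits and a trailing marker bit. The `(level, bits)` tuple and the helper names are assumptions made for the example.

```python
def make_cell(path):
    """Build a toy cell from child indices (0..3) listed from the root down."""
    bits = 0
    for child in path:
        if not 0 <= child <= 3:
            raise ValueError("child index must be in 0..3")
        bits = (bits << 2) | child  # two bits per level
    return (len(path), bits)


def parent(cell, level):
    """Ancestor of cell at a coarser level, found by dropping trailing bits."""
    lvl, bits = cell
    if not 0 <= level <= lvl:
        raise ValueError("level must be between 0 and the cell's own level")
    return (level, bits >> (2 * (lvl - level)))


def contains(ancestor, cell):
    """True if ancestor's bits are a prefix of cell's bits."""
    a_lvl, a_bits = ancestor
    c_lvl, c_bits = cell
    if a_lvl > c_lvl:
        return False
    return (c_bits >> (2 * (c_lvl - a_lvl))) == a_bits
```

The containment check is one shift and one comparison, regardless of how deep the cells are.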

rodonn · on Sept 15, 2021

They give a nice comparison of S2 vs H3 here https://h3geo.org/docs/comparisons/s2

Which is better definitely will depend on your use case.

adolgert · on Sept 15, 2021

Dumb question: I've put points on a sphere using a Fibonacci series, then relaxed them and triangulated them, and there are some 5's and 7's, not all hexagons. I thought an all-hexagon tiling wasn't possible. How do they do it?

ravar · on Sept 15, 2021

There are 12 pentagons conveniently placed over water. The docs are a pretty interesting read. https://eng.uber.com/h3/

progbits · on Sept 15, 2021

That is a clever solution! For a transportation company at least.

I wonder what happens if an artificial island and a major city pops up in one of those pentagons - presumably a fun tech debt to tackle :)

robinhouston · on Sept 15, 2021

There’s a 2018 blog post from Uber Engineering[0] which has more technical details about the system, and explains:

> Since it is not possible to tile the icosahedron with only hexagons, we chose to introduce twelve pentagons, one at each of the icosahedron vertices.

0. https://eng.uber.com/h3/

bloopernova · on Sept 15, 2021

They distort the hexagons as you go further north.

https://observablehq.com/@four43/h3-index-visualizer

https://imgur.com/a/SgDfJkG is an example

ritwikgupta · on Sept 15, 2021

The warping you’re observing is a result of the projection used to display the map on a 2D screen. They actually do use pentagons to solve the tessellation issue.

junon · on Sept 15, 2021

First thought: "this looks like what we used at Uber."

Sure enough, it is. It works well IIRC and there's a lot of interesting math surrounding it. Some of our internal tools we had access to at the time (now, presumably, under lock and key) had maps laid out in hexagons.

E: Some of the visualizers linked here seem a bit weird. I could be mis-remembering things, but the hexagons were WAY smaller than some of the visualizers here. I remember looking at the SF map and there was a lot of granularity even in the city center. Like I said though, could be mis-remembering.

setr · on Sept 15, 2021

https://h3geo.org/docs/core-library/restable

The smallest granularity is 0.0000009 km^2, so I guess 1/3 of an inch. Which seems to me ridiculously and unnecessarily precise, but who knows

junon · on Sept 15, 2021

It wasn't that granular. Maybe a few blocks of the city in size, give or take. The visualizers I'm seeing have a single hexagon span the entirety of Austin, TX for reference. At least in Uber's usecase, that was less than useful.

setr · on Sept 15, 2021

It’s a hierarchy… so you can take your pick of granularity? Only thing that matters is min, max and step-size.

Unless we’re talking about different things, I’m not clear why we’re not assuming the tool you remember chose h10 or whatever level was actually useful to it, where the visualizations chose h5

junon · on Sept 15, 2021

I don't remember ours being a hierarchy but perhaps it was just the visualization.

ambrood · on Sept 16, 2021

Uber's internal stuff used resolution 9, I think...

tln · on Sept 15, 2021

That table shows H15 cells are on average 0.9 m^2, or a hexagon with edges 0.59m long.

(The average edge length is 0.5m, but that's not the same as the edge length of the average area hexagon)
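The 0.59 m figure follows from the area formula for a regular hexagon, A = (3 * sqrt(3) / 2) * a^2. A small sketch of the arithmetic, treating the cell as a perfectly regular planar hexagon (an idealization; the helper names are made up for this example):

```python
import math


def regular_hexagon_area(edge):
    """Area of a regular hexagon with the given edge length."""
    return 3 * math.sqrt(3) / 2 * edge ** 2


def regular_hexagon_edge(area):
    """Edge length of the regular hexagon that has the given area."""
    return math.sqrt(2 * area / (3 * math.sqrt(3)))
```

`regular_hexagon_edge(0.9)` comes out at about 0.59, while `regular_hexagon_area(0.5)` is only about 0.65. Area grows with the square of the edge, so the average-area hexagon has a longer edge than the average edge length.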

throwoutway · on Sept 15, 2021

Is there a UI or image of what the hexagons look like on the planet?

fredley · on Sept 15, 2021

This has a nice visualiser: https://observablehq.com/@four43/h3-index-visualizer

mxfh · on Sept 15, 2021

https://observablehq.com/@fil/h3-oddities

also check out the "gnomonic icosahedral" at H0 from the dropdowns, that's the projection the base hexagonal grid is from, it's a perfectly planar hexgrid on an unfolded icosahedron net.

saalweachter · on Sept 15, 2021

The blog post has a picture showing several layers of hexagons overlapping.

david_draco · on Sept 15, 2021

Yes, in the sections under Intro > Comparisons

prpl · on Sept 15, 2021

There’s gifs of hexagons on a planet you can search but you can’t cover a sphere with just hexagons. Even so, I imagine the poles are irrelevant to Uber

xvedejas · on Sept 15, 2021

The image shows a number of pentagons, so it's not just hexagons unless you consider a pentagon some kind of degenerate hexagon. That said, you can indeed cover a sphere with only hexagons, if you relax the requirement that they all be regular hexagons.

jacobolus · on Sept 15, 2021

> you can indeed cover a sphere with only hexagons, if you relax the requirement that they all be regular hexagons

More precisely, what you need to relax is the requirement that 3 hexagons always meet at every vertex. See https://en.wikipedia.org/wiki/Euler_characteristic#Polyhedra

If you have only hexagons, you end up with 6 vertices on the sphere where only 2 hexagons meet (whether you still consider these to be "hexagons" when they have two adjacent sides is a matter of definitions).

But what many spatial indices do instead is include 12 pentagons among the hexagons.
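Both counts (12 pentagons, or 6 two-hexagon vertices) drop out of Euler's formula. A short derivation, using P for the number of pentagons, H for hexagons, and k for the number of vertices where only two faces meet (symbols introduced here for illustration):

```latex
% Case 1: pentagons and hexagons, three faces at every vertex.
\begin{align*}
F &= P + H, \qquad E = \frac{5P + 6H}{2}, \qquad V = \frac{5P + 6H}{3} \\
V - E + F &= \frac{5P + 6H}{3} - \frac{5P + 6H}{2} + P + H = \frac{P}{6} = 2
\quad\Longrightarrow\quad P = 12
\end{align*}

% Case 2: hexagons only, with k vertices of degree 2 and the rest of degree 3.
\begin{align*}
F &= H, \qquad E = 3H, \qquad V = 2 + E - F = 2 + 2H \\
3(V - k) + 2k &= 2E = 6H
\quad\Longrightarrow\quad 6 + 6H - k = 6H
\quad\Longrightarrow\quad k = 6
\end{align*}
```

In both cases the number of hexagons cancels out, which is why the count of exceptions stays fixed at every resolution.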

jayd16 · on Sept 15, 2021

Looks like H3 uses regular hexagons and instead drops the requirement that they can't overlap.

jacobolus · on Sept 15, 2021

At each tiling level, they have a proper tiling of hexagons with 12 pentagons. The system is based on an icosahedron.

But the way the hierarchical division system works, the tile boundaries from one scale don’t precisely match the tile boundaries from another scale.

ClumsyPilot · on Sept 15, 2021

That can be a deal breaker for some uses

X6S1x6Okd1st · on Sept 15, 2021

Blog post here: https://eng.uber.com/h3/

kevmoo1 · on Sept 15, 2021

Worth reminding folks: https://www.youtube.com/watch?v=thOifuHs6eY

codezero · on Sept 15, 2021

All the shapes!

When I was working in space science we experimented with using Hierarchical Triangular Meshes [0], but we ended up not really needing the faster indexing it offered since most of our processing was done in small portions of the sky at a time anyway.

[0] https://arxiv.org/pdf/cs/0701164.pdf

endisneigh · on Sept 15, 2021

Somewhat related - how would you set up geospatial indexing using a traditional index? For example IndexedDB?

I’ve been wanting to implement something similar to this (albeit much lighter) for use with IndexedDB - it’s challenging since many of the capabilities here just aren’t available to JavaScript in the browser.

nrabinowitz · on Sept 15, 2021

Everything in H3 is available in the browser: https://github.com/uber/h3-js

endisneigh · on Sept 15, 2021

Thanks for the link - I’ll have to check that out. I wonder if the limitations in IndexedDB will make certain queries impossible

captare · on Sept 15, 2021

I’ve used the excellent DGGRID for building geo-hex grids: https://github.com/sahrk/DGGRID

JakeStone · on Sept 15, 2021

DGGRID was very useful for me when I first started experimenting with hexagon partitioning.

One of my projects is using the Dymaxion projection, which H3 uses, and I've found H3 to be a lot faster for my uses. YMMV.

eerikkivistik · on Sept 15, 2021

Does anyone happen to know of any geospatial indexing solutions that can also index simultaneously by other dimensions. Temporal indexing for example (not only where, but when)?

jandrewrogers · on Sept 15, 2021

Yes, with caveats and reasons you don't often see it.

As a practical matter, you want to fit the indexing structure to the properties of your data model as closely as possible. Increasing the generality of high-dimensionality spatial indexes comes at a high cognitive cost, so most complex high-dimensionality indexes are bespoke designs to limit generality. Things become pretty messy when you mix dimension types that are interval-like (e.g. polygons) and monotonic-like (e.g. temporal) so almost no one does it.

You can build, say, a general 8-dimensional index that can handle (some) distributions of interval and monotonic types simultaneously, in addition to the usual boring data types, that has excellent performance and scalability characteristics. It would probably only require something like 1000 lines of C++, so not too onerous. However, the code logic would be nearly impenetrable to read, never mind write, which matters for practical engineering.

eerikkivistik · on Sept 15, 2021

Thank you for the thorough answer.

rurabe · on Sept 15, 2021

H3 cell indexes are just integers so you can easily make a compound index of (cell_index,timestamp). This would be easy in SQL and I'd have to imagine just about anything else that supports a compound index.
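A minimal sketch of that compound index using SQLite from Python. The table name, column names, and the cell values in the usage note are hypothetical placeholders, not real H3 indexes. H3 indexes are 64-bit integers, which fit SQLite's INTEGER type.

```python
import sqlite3


def make_db():
    """Create an in-memory table with a compound (cell_index, ts) index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events ("
        "cell_index INTEGER NOT NULL, ts INTEGER NOT NULL, payload TEXT)"
    )
    conn.execute("CREATE INDEX idx_events_cell_ts ON events (cell_index, ts)")
    return conn


def events_in_cells(conn, cells, start_ts, end_ts):
    """Rows whose cell is in `cells` and whose timestamp is in [start_ts, end_ts]."""
    cells = list(cells)
    if not cells:
        return []
    placeholders = ",".join("?" for _ in cells)
    sql = (
        "SELECT cell_index, ts, payload FROM events "
        f"WHERE cell_index IN ({placeholders}) AND ts BETWEEN ? AND ? "
        "ORDER BY cell_index, ts"
    )
    return conn.execute(sql, [*cells, start_ts, end_ts]).fetchall()
```

A "where and when" query then becomes: compute the set of cells covering the area of interest, and pass that set plus a time window to `events_in_cells`. With the index ordered as (cell_index, ts), each cell's rows are stored in time order, so the time range is a contiguous scan per cell.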

geophile · on Sept 15, 2021

Z-order does this easily, just index on (x, y, t). I wouldn't recommend it for more than maybe 4-6 dimensions though.
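For reference, a Z-order (Morton) key over (x, y, t) is built by interleaving the bits of the three values. The sketch below assumes each coordinate has already been quantized to a non-negative integer of at most `bits` bits. The function names and the 21-bit default (which keeps the key within 63 bits) are choices made for this example.

```python
def interleave3(x, y, t, bits=21):
    """Interleave the low `bits` bits of x, y and t into one Z-order key."""
    for value in (x, y, t):
        if not 0 <= value < (1 << bits):
            raise ValueError("each coordinate must fit in `bits` unsigned bits")
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (3 * i)
        key |= ((y >> i) & 1) << (3 * i + 1)
        key |= ((t >> i) & 1) << (3 * i + 2)
    return key


def deinterleave3(key, bits=21):
    """Recover (x, y, t) from a key produced by interleave3."""
    x = y = t = 0
    for i in range(bits):
        x |= ((key >> (3 * i)) & 1) << i
        y |= ((key >> (3 * i + 1)) & 1) << i
        t |= ((key >> (3 * i + 2)) & 1) << i
    return x, y, t
```

The resulting integer can be stored in an ordinary one-dimensional index such as a B-tree. Points that are close in all three dimensions tend to share high-order key bits, though a range query still has to split its box into several key intervals.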

jti107 · on Sept 15, 2021

So it's true... hexagons are the bestagons

https://youtu.be/thOifuHs6eY

peterburkimsher · on Sept 16, 2021

The video is great, I wish I saw that back in maths class in school!

Hexagons tile the plane very nicely, and are used for choosing where to place phone transmitters!

But they can't be easily divided into sub-hexagons.

Could we instead model them as the combination of Sierpinski triangles?

https://larryriddle.agnesscott.org/ifs/siertri/symmetricZ3he...