Howdy, I work on S2 [1] so I have questions! How do you deal with polygons that cross the antimeridian?
The indexing structure you've come up with seems very interesting. In spherical coordinates, line-sweep algorithms like that are a little less intuitive because there's not really a min and max y value to work with. Does your index support multiple polygons indexed together?
The lack of exact predicates worries me a little bit. It's tricky because it will work right until it doesn't, for mysterious reasons, and it's very hard to test for if you haven't built it on a foundation of exact predicates. You'll periodically fall into the fractal foam around edges when testing whether you cross them or not ([2] has some good pictures). We handle this in S2 by relying on a set of predicates [3] that fall all the way back to symbolic perturbation if they have to. We simply don't have to think about collinear points in S2 for that reason.
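To make the failure mode concrete, here's a hedged C++ sketch of a filtered orientation test in the spirit of Shewchuk-style robust predicates (this is not S2's actual code; the filter constant is the standard orient2d error bound, and the exact fallback is just an assumed hook):

    #include <cmath>

    // Sign of the cross product (b - a) x (c - a): > 0 is a left turn,
    // < 0 a right turn, 0 collinear. The plain double version is
    // unreliable when the true value is near zero, which is exactly the
    // "fractal foam" regime around edges.
    //
    // Assumed hook for an exact fallback (e.g. Shewchuk's orient2d or
    // robust-predicates); not implemented here.
    double orient2d_exact(double ax, double ay, double bx, double by,
                          double cx, double cy);

    int orient2d_filtered(double ax, double ay, double bx, double by,
                          double cx, double cy) {
      const double detleft  = (ax - cx) * (by - cy);
      const double detright = (ay - cy) * (bx - cx);
      const double det = detleft - detright;

      // Error filter: if |det| clearly exceeds the worst-case rounding
      // error of the computation above, the floating-point sign is
      // already correct and we can return it immediately.
      const double errbound = 3.3306690738754716e-16 *
                              (std::fabs(detleft) + std::fabs(detright));
      if (det > errbound || -det > errbound) return (det > 0) - (det < 0);

      // Otherwise fall back to exact arithmetic (and, in S2's case, all
      // the way to symbolic perturbation so no input is truly degenerate).
      const double exact = orient2d_exact(ax, ay, bx, by, cx, cy);
      return (exact > 0) - (exact < 0);
    }

The point is that the cheap filter handles almost every input, and the expensive exact path only runs in the narrow band where doubles can't be trusted.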
TG isn't spherical and wasn't designed to be. It's 2D and projection-agnostic. Crossing the antimeridian is a user problem to solve. I recommend following the GeoJSON rule of antimeridian cutting [1].
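For anyone wiring up that GeoJSON (RFC 7946) cutting rule by hand, here's a minimal sketch for a single segment, assuming plain lon/lat coordinates and linear interpolation in unwrapped longitude (the names are illustrative, not part of TG's API):

    #include <cmath>
    #include <cstdio>

    struct Point { double lon, lat; };

    // A segment crosses the antimeridian when the raw longitude jump
    // exceeds 180 degrees.
    bool crosses_antimeridian(const Point& a, const Point& b) {
      return std::fabs(b.lon - a.lon) > 180.0;
    }

    // Latitude where segment a->b crosses the antimeridian. The caller
    // then emits two segments: a -> (bound, lat) and (-bound, lat) -> b,
    // where bound is +180 or -180 depending on the crossing direction.
    double antimeridian_crossing_lat(Point a, Point b) {
      // Unwrap so the segment becomes continuous in longitude.
      if (b.lon - a.lon > 180.0) b.lon -= 360.0;
      if (a.lon - b.lon > 180.0) b.lon += 360.0;
      const double bound = (a.lon > 0 || b.lon > 0) ? 180.0 : -180.0;
      const double t = (bound - a.lon) / (b.lon - a.lon);
      return a.lat + t * (b.lat - a.lat);
    }

    int main() {
      Point a{179.0, 10.0}, b{-179.0, 12.0};
      if (crosses_antimeridian(a, b)) {
        std::printf("cut at lat %.3f\n", antimeridian_crossing_lat(a, b));
      }
    }

This treats edges as straight lines in lon/lat, which matches a planar, projection-agnostic library; on a sphere you'd interpolate along the geodesic instead.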
As I said in the README, my goals are fast point-in-polygon, fast geometry intersections, and a low memory footprint. Those are my bread and butter.
From what I know about S2, it has different goals, such as being spherical and using discrete cells for distributed spatial indexes. Those are great things, of course, but not really what I need.
This is the second comment recommending Shewchuk's methods. While I agree that may be a superior way to go, and it's always on the table for the future, I'm still able to achieve my goals without those methods, at least for now.
Yeah, we're spherical mostly because working in spherical coordinates lets you avoid worrying about the projection. S2Cells are really just nodes in a quadtree; the difference is that our quadtree is implicit, so you can (and we do) store the cells in a linear array in memory.
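For readers who haven't seen an implicit quadtree before, here's a small sketch of the general idea: the tree shape is never stored, only a flat array of payloads, and parent/child relationships are pure index arithmetic. This is the generic heap-style layout, not S2's actual encoding (S2 packs face, Hilbert-curve position, and level into a 64-bit cell id):

    #include <cstdint>
    #include <vector>

    struct ImplicitQuadtree {
      std::vector<float> nodes;  // one payload per node, laid out level by level

      // Heap-style indexing for a complete 4-ary tree: no pointers needed.
      static std::size_t parent(std::size_t i) { return (i - 1) / 4; }
      static std::size_t child(std::size_t i, int k) { return 4 * i + 1 + k; }  // k in [0, 4)

      // Nodes in a complete quadtree of the given depth:
      // 1 + 4 + 16 + ... = (4^(depth + 1) - 1) / 3.
      static std::size_t node_count(int depth) {
        return ((std::size_t{1} << (2 * (depth + 1))) - 1) / 3;
      }
    };
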
There was an article here maybe a month ago about testing Maya vs. CAD vs. Blender or something. It had to do with a litmus test where you'd create a screw and inspect how the threads terminated at the tip.
The first thing these systems do is use low-order (low-teens) polynomials and accept some loss of precision in the model representation, based on real-world needs. This means some systems have scale preferences, others may limit overall model size, and still others use a tolerance based on model size: dynamic tolerancing.
Basically, these systems need to know whether points are coincident, lines are collinear, curves are "cocurvular" (there is a better way to say that, but I coined that word with a friend in the 90's and it makes sense, so...), and so on.
Typically there is a tolerance because there is always error due to computation limits, and there used to be data-size limits as well.
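A hedged sketch of that tolerance idea, in C++: two points are treated as coincident if they are closer than the system tolerance, and with "dynamic tolerancing" that tolerance scales with model size instead of being a fixed absolute value. The numbers and names here are illustrative placeholders, not taken from any particular kernel:

    #include <cmath>

    struct Vec3 { double x, y, z; };

    double distance(const Vec3& a, const Vec3& b) {
      const double dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
      return std::sqrt(dx * dx + dy * dy + dz * dz);
    }

    // Fixed absolute tolerance in modeling units (placeholder constant).
    bool coincident_fixed(const Vec3& a, const Vec3& b, double tol = 1e-8) {
      return distance(a, b) <= tol;
    }

    // Dynamic tolerance: scale a relative tolerance by the overall model
    // size, so huge models don't drown in noise and tiny ones keep detail.
    bool coincident_dynamic(const Vec3& a, const Vec3& b,
                            double model_size, double rel_tol = 1e-9) {
      return distance(a, b) <= model_size * rel_tol;
    }
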
In the '80s and '90s, solid models were done in megabytes of RAM, including the OS. SolidWorks ran on Windows 98, for example, and a typical setup would have some 32 to 64 MB of RAM. Larger models were not possible, nor were extreme levels of detail, basically high surface counts. Surface count is a good metric for how "heavy" a model is in terms of system resources, and its scale is a good metric for what tolerance is needed.
There were times when it made sense to draw a model at 10X or even 100X scale because the actual model size was close to the lower limit of the system tolerance. This can still be true today due to legacy geometry kernel compatibility.
Early on, geometry kernels had a lot of limitations. Over time, these have been addressed and it's all down to a bazillion edge and corner cases. Millions and millions of man hours are in those code bodies. It's kind of like semiconductors and process nodes. Catching up is a seriously huge and expensive effort. It's generally not done and the kernels are licensed instead.
Parasolid, ACIS, and OpenCascade are examples of kernels that can be licensed; others are in-house, the PTC and CATIA kernels being examples of those.
How these kernels handle geometry basically governs the CAD system and its limits.
But I digress.
Getting back to CAD.
Imprecise representations were necessary, again due to the CPU and data sizes possible early on. People were making manufacturable models on Pentium 90 machines with megabytes of RAM, and today we are doing it on multi-core machines that approach 5 GHz and have gigabytes of RAM.
The geometry kernels are not multi-threaded, though a CAD system can employ them in a multi-threaded fashion depending on how it manages model history. It can also process parts concurrently, again all depending on what is linked to and dependent on what else.
Geometry kernels that are not capable of concurrent or parallel execution are one reason why single-threaded CPU performance remains important, and they are a fundamental CAD-system limit on model complexity. This is why, for example, very large systems, say airplanes, rockets, or cars, are designed in-place style rather than as fully constraint-driven assemblies.
Flat models are the easiest to process concurrently. Hierarchical models are the worst, and it's all down to the dependency trees and model history being largely sequential, created forward from one or a few roots.
There is another difference too, and this is one way systems like Maya and Blender differ from, say, NX, CATIA, or SolidWorks: it's all in the precision of the data representation. Making pictures requires a lot less than running a mill along a surface does, for example. Entertainment-centric software typically uses a lighter data representation and/or considerably looser tolerances. These models, imported into a higher-precision system, will basically end up non-manifold and will often require a lot of rework and/or decision making to become manufacturable.
On that note, a manifold model is one where each edge is shared by two and only two surfaces. For the most part, this rule guarantees a representation that can exist in the real world.
A surface with a free edge means the model has one or more surfaces that do not contribute to a closed volume. That surface is an infinitely thin representation that has two sides. Not possible in the physical world. Yes, we have 2D materials, but they do have a thickness at the atomic level, and of course have sides because of that.
An edge that defines more than two surface boundaries contains an infinitely sharp edge or point of contact between volumes and/or other surfaces. Again, at the atomic level there is always a radius, and there are no sharp edges smaller than the atoms matter is composed of.
Neither of these cases is manufacturable, due to the properties of matter. There must be radii present, or more surfaces to define a manifold model.
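The two-surfaces-per-edge rule is easy to check mechanically. Here's a small illustrative sketch for a triangle mesh (not any kernel's API): count how many faces use each undirected edge; exactly two means manifold, one means a free edge (open shell), three or more means a non-manifold junction.

    #include <algorithm>
    #include <cstdio>
    #include <map>
    #include <utility>
    #include <vector>

    using Edge = std::pair<int, int>;  // vertex indices, stored low-high

    struct Tri { int v[3]; };

    void classify_edges(const std::vector<Tri>& faces) {
      std::map<Edge, int> uses;
      for (const Tri& t : faces) {
        for (int i = 0; i < 3; ++i) {
          const int a = t.v[i], b = t.v[(i + 1) % 3];
          uses[{std::min(a, b), std::max(a, b)}]++;  // undirected edge key
        }
      }
      for (const auto& [edge, n] : uses) {
        if (n == 1) std::printf("free edge %d-%d\n", edge.first, edge.second);
        if (n > 2)  std::printf("non-manifold edge %d-%d used by %d faces\n",
                                edge.first, edge.second, n);
      }
    }

Real B-rep kernels do the equivalent bookkeeping on trimmed surfaces rather than triangles, but the classification is the same.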
In addition, there are some simple differences to account for, and these are often ignored in the modeling phase of design because they will end up present in the manufacturing stage anyway and, if not critical, are not important to represent properly in the model.
One is sharp edges. Think of a cube. The truth is a manufactured cube would have some radii present on all edges, and vertices are theoretical constructions that do not exist in the real model at all. Curved edges blend into one another, leaving the vertex in space near the model at the theoretical intersection point of the theoretical edges.
Often this is ignored because it's simply not all that important. In some cases, say injection molding, it can't be ignored, and all the radii need to be present on the model for the machining to happen properly, unless somehow the mold tool can be machined with a ball end mill or something else that will yield the radii as an artifact of manufacturing.
That's the big one.
Another thing we often ignore is reference constructs, like partitions that subdivide a volume. These may be used to delineate material boundaries, or regions of a model that get special consideration or are needed for inspections, or similar kinds of tasks. A partition itself is manifold, but it entirely or partially shares volume with another manifold construct.
We often ignore these too, though sometimes an operator has to sort through model data in various ways to find and isolate the primary model or apply boolean operations to the partitions to yield a useful manifold model.
Numerical precision comes into play in a big way here. All those surfaces are defined. Their edges are defined again with curves that must lie "on" the surface, and mathematically they just don't, so some deviation is allowable. That's the tolerance compensation strategy I wrote about earlier in this comment.
Say we are machining something small, something that would fit into a cubic centimeter or cubic inch. The system tolerance would need to be on the order of 0.00001" for an inch representation. The machining itself would operate to 0.0001".
On some systems, locating a model like this far away from the origin would incur a loss in precision not present near the origin. Many CAD systems employ a coordinate-system hierarchy so that all models can be created with local part origins near the model, ideally positioned so features can be mirrored and/or patterned about the primary datums present at the origin: the axes, planes, and the origin point itself.
This allows for very large but also very detailed models, such as an airplane. It also allows for concurrent design, as each subsystem can be assigned a coordinate system, and those teams can build relative to it, confident that as long as they remain within their design volume, their work will fit nicely into the parent models, and all of that into the full model.
At the other extreme, say we are modeling the solar system. On systems with dynamic tolerancing, the precision might be on the order of feet or meters. On others, it's not possible to do at scale, and the model would need a scale factor to fit into the modeling space allowed by the overall system precision.
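You can see the far-from-origin effect directly in 64-bit floats: the spacing between adjacent representable values (one ulp) grows with magnitude, so a part modeled far from the global origin has coarser resolution than the same part at a local origin. A quick demonstration (units are arbitrary):

    #include <cmath>
    #include <cstdio>
    #include <initializer_list>

    int main() {
      for (double x : {1.0, 1e3, 1e6, 1e9}) {
        const double ulp = std::nextafter(x, INFINITY) - x;
        std::printf("at %.0e the representable step is %.3e\n", x, ulp);
      }
      // With doubles the relative step is ~2.2e-16, so at 1e9 units the
      // absolute step is ~1e-7 -- already close to a tight machining
      // tolerance. Local part origins keep coordinates small.
    }
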
Finally, going back to a difference between say, MAYA and NX, there is the amount of data given to define a surface. An entertainment model might be light, and some data may not even be present as it can be inferred or is implied in some way.
Take a two-dimensional surface that has some curvature, where one dimension is given 10X the data of the other. The surface will look fine on the screen and when rendered, but when a surface like this is imported into a system with manufacturing-grade precision, some operations will not be possible, or must be made with a stated, much looser tolerance than otherwise used. And doing that may well make other operations, such as milling toolpaths, imprecise and/or erroneous, as in digging into material stock rather than making a smooth or appropriately sized cut.
The solution for that, BTW, is to section the surface in various ways. The section curves created along the dimension that does not have much data will be imprecise. Guesses, essentially. That's OK, because the data really isn't there to begin with.
But, doing that does lock in the guess. Basically, one commits to that data to continue on and manufacture something to a spec that is manufacturable, despite the original authority data being incomplete.
Now a new surface can be created with more data in both dimensions. Replace the old one with the new one, and now all operations can happen, intersections, trims, section curves, and the like will function properly and at standard tolerance.
The same thing can be done with erroneous surfaces that contain too much data, or data with enough noise to drive the surface complexity into folds and other problem states. The basic solution is the same: section it and obtain curves.
Select the curves that make the most sense, toss the outliers, create new surfaces, replace, trim, and stitch, and the resulting manifold model will generally be more robust for ongoing operations, be they modeling, manufacturing, simulation, whatever.
Early on, these techniques were manual. And they still are, and can be, depending on how severe the problem cases are. That said, many CAD systems have automated functions to deal with problem surfaces and they are effective a high percentage of the time.
Hey! I desperately want to report a series of sites that are doing pure black-hat spam of the search rankings, but weirdly nobody (meaning those famous Googlers on Twitter, lol) has ever responded to me about this. So if possible, if you'd be willing to look at these evil ones and pass them along to the web spam team, I'd be really grateful.
S2 wouldn't be a bad choice, it lets you compute "coverings" of arbitrary regions as S2 cells, with variable resolution. Fixed size is trickier but is probably doable, especially if you're allowed to null out unused cells. Check out https://s2.sidewalklabs.com/regioncoverer/
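A rough sketch of what that looks like with the C++ S2 library, based on its documented API (check the headers for the exact signatures in your version): cover a spherical cap around a point with at most 8 cells of varying resolution.

    #include <vector>
    #include "s2/s1angle.h"
    #include "s2/s2cap.h"
    #include "s2/s2cell_id.h"
    #include "s2/s2latlng.h"
    #include "s2/s2region_coverer.h"

    int main() {
      // A cap of radius 0.5 degrees around a point (lat/lng are examples).
      S2Cap cap(S2LatLng::FromDegrees(47.6, -122.3).ToPoint(),
                S1Angle::Degrees(0.5));

      S2RegionCoverer::Options options;
      options.set_max_cells(8);  // variable-resolution covering
      S2RegionCoverer coverer(options);

      std::vector<S2CellId> covering;
      coverer.GetCovering(cap, &covering);
      // Each S2CellId is a 64-bit id; cell sizes vary unless you also pin
      // the level (e.g. options.set_fixed_level(...)) for a fixed grid.
    }
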
[1] https://s2geometry.io/
[2] https://github.com/mourner/robust-predicates
[3] https://github.com/google/s2geometry/blob/master/src/s2/s2pr...