That's somewhat true for VFX in general, but not really for Pixar: they mostly use procedural textures (a lot more than most other VFX studios can), so their texture IO is much lower - which is one of the reasons they can get away with starting to use a GPU preview renderer (i.e. the textures fit in VRAM).
Regarding model complexity - that doesn't really matter these days. If you want to raytrace a scene, unless you want to do full-on deferred ray batching and geo paging (like Disney's Hyperion does), you need to fit the whole scene into memory, so other than at startup/ingestion time, model complexity doesn't really matter in terms of IO. Also, Pixar's base-level subd meshes are really rough (compared to what you see in VFX in general - for VFX, base levels are probably subdivided down 2/3 times more), so the geo in Pixar's case isn't actually that bad.
There's very little info in the Disney paper on geomapping, do you have any ideas on how they do it?
I guess that you don't want to page in full models (as you either need a very large amount of memory per thread, which means smaller ray batches, or you need to have a shared cache, which means waiting on other threads). So you'll probably need something similar to Toro (as described by Pharr et al.), which just doesn't seem very efficient to me (as grids have improved little compared to BVHs).
Easiest way is to have a two-level BVH: the top level just holds the overall world-space AABB of every object in the scene as a "geometryInstance", and each of those then contains another object-space BVH over its triangles (or other primitives).
This is trivial to do, but you take a pretty big quality penalty on the instance-level BVH because it can't look down to the primitive (microtriangle, microcurve) level when building. So for stuff like hair on skin, you've pretty much got two overlapping objects, which sucks for traversal performance. There is a way around this, but it involves storing a pointer to each primitive in the BVH, which is expensive, and mixed transforms / motion blur get complicated with this.
The more complex way of doing it would be to use a single-level BVH and only partially build it - given the lengths Disney are going to in order to sort and batch the rays, it sounds like they can constrain things such that all rays they send in a batch are going in a similar direction anyway, so you can cull a lot of stuff.
These days in VFX we're pretty much just fitting stuff into memory anyway - there's no other option. PRMan does support geometry paging in theory, although Arnold doesn't and VRay only partially does (with VRay meshes), but performance absolutely sucks doing it, so buying more RAM is just the easiest/cheapest solution all round.
Disney are only really doing deferring / reordering to such an extent because of their love of PTex which sucks with random access IO, so...
Yup, I meant paging geometry. The problem with just doing a two-level BVH is that some models can be very large while others are much smaller, which makes it hard to reserve memory for the geometry cache (even if it only has to fit one model per thread) - and that in turn means you can store fewer rays in memory, and I'm not really a fan of constantly writing rays to disk. I was thinking of just cutting off subtrees of the BVH and storing those on disk, which makes it possible to assign just a few megabytes of memory to each thread for caching geometry. Not sure how efficient that would be, but probably a bit faster than rebuilding part of the BVH every time (especially when using high-quality BVHs with spatial splits). Shouldn't be too bad with large ray batches, sorting and ray streams.
They do mention some tests with "out-of-core rendering of scenes with massive amounts of geometry" in the paper, but there isn't much info on it. AFAIK their patents mention streaming in geometry too.
I know that buying more RAM is probably easier, but it would be interesting to render very large scenes on commodity hardware. I guess paging in geometry is just too much of a hassle - you constantly have to stream geometry into memory: to compute surface derivatives, to do subsurface scattering, to sample arbitrary geometry lights, etc.
Why would you need per-thread geometry caches? That doesn't really make sense (memory duplication)...
What you'd do is just map the geometry + bvh into a serialisable format (like vraymesh) such that they can be fread() in one go pretty quickly. You could put heuristics in to not page out very small meshes...
But unless you batch and re-order rays (and I'm not convinced doing this is worth it if you're doing more than 3/4 bounces, as the incoherence just makes things a nightmare), there's no point really doing this (unless you're happy with very slow performance - even mmapping stuff is slow).
The per-thread caches are not strictly necessary, but you don't want threads to block because they can't stream in geometry to memory. You want some upper bound on the amount of memory each thread needs for geometry (each thread could try to ray intersect a different mesh). Dividing geometry into fixed sized chunks gives you this upper bound (max N chunks per thread).
Ray re-ordering is indeed a nightmare, but sorting very large ray batches (millions of rays) into coherent ray streams is less of a hassle, and it should enable coherent geometry access (which amortizes the cost of those reads, not sure to what extent).
But that doesn't really scale - unless the per-thread caches are small (and then you'd have to page on a sub-mesh level, i.e. split the mesh up spatially and page it as you traverse through it for internal scattering or something), if you double the number of threads, the memory used by those per-thread caches obviously doubles as well.
For stuff like hair where you need loads of scattering events or for SSS doing that (paging subsections of shapes) is just going to thrash any geometry cache you have unless they're quite big.
And I'm still not convinced by ray sorting: it means your light integration is extremely segmented into artificial progressions as you expand down the ray tree, and makes anything regarding manifold exploration (MLT or MNEE) close to impossible to do well.
We have been very happy with sources - it doesn't cover EVERY service we use (yet) but taking even just one or two out of in-house ETL is a huge benefit.
In the US there's about 100 child kidnappings by strangers every year [1]. So you're claiming that each year, there are fewer than 10000*100 = 1 million times in the US that a stranger talks to a child. There are 50 million children in the US under age 12 [2], so only one in fifty children would be approached by a stranger in each year. How many times did a stranger talk to you as a child? More than none? Were you kidnapped?
> How many times did a stranger talk to you as a child? More than none? Were you kidnapped?
Just to share anecdotes, I had a few memorable experiences as a kid (and probably many more that I don't remember).
A very nice person helped fix my bike chain when it slipped off the gears and got wedged against the frame. Without them, I'd have a long walk home with a stuck back wheel.
On another occasion, while I was playing outside my house, a man came by and invited me to attend his church. My parents weren't keen on that, and I was rather uninterested, so we never took him up on the offer. I presume he was a Jehovah's Witness or a Mormon.
And one of the interesting things about that is the paper was written in 1978, but the technology was not really used until nearly 20 years later (Geri's Game, 1997), when hardware was finally fast enough to evaluate the algorithm in a reasonable amount of time.
I would LOVE to see a Founders at Work 2.0 - so many incredible companies have been built since that book was published. The iPhone had just been released, for instance.
It's not exactly Founders at Work 2.0, but I'm working on a book that's been heavily influenced by Founders at Work, as well as pg's "Do Things That Don't Scale" essay.
And yet, people buy SSDs even though they could get more capacity with an HDD. Because they value speed over capacity.
"The Innovator's Dilemma" traces the history of the disk drive industry, and in each generation, people chose smaller capacity for the sake of a smaller package (there used to be 8-inch drives, for example). The theoretical idea is that once users' need for performance is satisfied (it's "good enough"), they turn their attention to other issues - such as price, convenience/ease-of-use, customization, etc.
It has happened with desktops: that's what caused the brief "netbook" popularity, and what made smartphones successful. Desktops had overshot what was needed for many tasks (browsing, email); but the smaller devices were just becoming powerful enough to manage. So although desktops were more powerful, that extra power didn't matter to many users.
The underlying idea is twofold: (1) all technologies improve over time, as engineers find better ways to do things (Moore's law is just one example); (2) what users demand also increases over time, but at a slower rate.
Therefore, if you start with a new approach that really struggles with many tasks, it will eventually become powerful enough for what users need; over the same period, the old technology starts off powerful enough and becomes even more powerful - but users don't care, because it's more than they need (or, at least, they don't want it as much as they want other qualities, like convenience).
Steve was pretty explicit to Tim Cook and others that they should never ask "what would Steve do". He knew what that kind of attitude did to Disney after Walt died. It destroyed their creativity for years.