> The one place where today's GPUs aren't as good as Toy Story is filtering & anti-aliasing
This only makes sense if you are locked in to some texture filtering algorithm already, which isn't true. CPU renderers aren't doing anything with their texture filtering that can't be replicated on GPUs. Where to draw the line between using the GPU's native texture filtering and doing more thorough filtering in software would be something to explore, but there is no reason why a single texture sample in a software renderer has to map to a single texture sample on the GPU.
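To make that concrete, here's a rough CUDA sketch of what I mean by building one logical texture lookup out of several hardware taps: spread a handful of bilinear samples along the major axis of the pixel's footprint and blend them with Gaussian-ish weights. The names (`ewaLikeSample`, `numTaps`) are made up for illustration, and this is not PRman's actual filter, just the general shape of the idea.

```cuda
#include <cuda_runtime.h>

// One "software" texture sample built from several hardware bilinear taps.
// The hardware still does per-tap bilinear filtering and mip selection; the
// software layer only decides where the taps go and how they are weighted.
__device__ float4 ewaLikeSample(cudaTextureObject_t tex,
                                float2 uv,          // sample center in [0,1]^2
                                float2 majorAxis,   // pixel footprint's major axis in uv space
                                int numTaps)
{
    float4 acc; acc.x = acc.y = acc.z = acc.w = 0.f;
    float wsum = 0.f;
    for (int i = 0; i < numTaps; ++i) {
        // Place taps from -0.5 to +0.5 along the anisotropy axis.
        float t = (i + 0.5f) / numTaps - 0.5f;
        float w = expf(-4.f * t * t);                 // Gaussian-ish weight
        float4 s = tex2D<float4>(tex,
                                 uv.x + t * majorAxis.x,
                                 uv.y + t * majorAxis.y);  // one hardware bilinear tap
        acc.x += w * s.x;  acc.y += w * s.y;
        acc.z += w * s.z;  acc.w += w * s.w;
        wsum += w;
    }
    acc.x /= wsum;  acc.y /= wsum;  acc.z /= wsum;  acc.w /= wsum;
    return acc;
}
```

Whether it's worth leaning on the hardware taps at all, versus fetching texels and filtering entirely in the kernel, is exactly the line-drawing question above.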
> BTW, textures & texture sampling are a huge portion of the cost of the runtime on render farms. They comprise the majority of the data needed to render a frame.
I'm acutely aware of how much does or does not go into textures. Modern shaders can account for as much as half of rendering time, with tracing of rays accounting for the other half. This is the entire shader, not just textures, and is an extreme example.
> The entire architecture of a render farm is built around texture caching.
This is not true at all. Render farm nodes are typically built with memory to CPU core ratios that match as the main priority.
> Just getting textures into a GPU would also pose a significant speed problem.
This is also not true. In 1995 an Onyx with a maximum of 32 _Sockets_ had a maximum of 2GB of memory. The bandwidth over PCIe 3.0 x16 is about 16GB/s, and plenty of cards already have 16GB of memory. The textures would also stay in memory for multiple frames, since most textures are not animated.
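To put a number on that: a one-time 2GB upload at roughly 12-16GB/s is on the order of 150ms, and it only has to happen when the resident texture set changes, not every frame. A small CUDA host program along these lines (just a sketch, no error checking) measures the actual figure on a given machine:

```cuda
#include <cuda_runtime.h>
#include <cstdio>

int main() {
    const size_t bytes = 2ull << 30;          // 2 GB, roughly a maxed-out 1995 Onyx
    void *host = nullptr, *dev = nullptr;
    cudaMallocHost(&host, bytes);             // pinned memory, needed for full PCIe speed
    cudaMalloc(&dev, bytes);

    cudaEvent_t t0, t1;
    cudaEventCreate(&t0); cudaEventCreate(&t1);
    cudaEventRecord(t0);
    cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice);  // the one-off upload
    cudaEventRecord(t1);
    cudaEventSynchronize(t1);

    float ms = 0.f;
    cudaEventElapsedTime(&ms, t0, t1);
    printf("2GB upload: %.1f ms (%.1f GB/s)\n", ms, (bytes / 1e9) / (ms / 1e3));

    cudaFree(dev); cudaFreeHost(host);
    return 0;
}
```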
> I'm acutely aware of how much does or does not go into textures. Modern shaders can account for as much as half of rendering time, with tracing of rays accounting for the other half. This is the entire shader, not just textures, and is an extreme example.
At least at VFX level (Pixar's slightly different, as they use a lot of procedural textures), texture I/O can account for a significant amount of render time.
> This is not true at all. Render farm nodes are typically built with memory to CPU core ratios that match as the main priority.
I don't know what you mean by this (I assume that memory scales with cores?), but most render farms at the high end have extremely expensive, fast I/O caches very close to the render server nodes (usually Avere solutions), mainly just for the textures.
The raw source textures are normally of the order of hundreds of gigabytes and thus have to be out-of-core. Pulling them off disk, uncompressing them and filtering them (even tiled and pre-mipped) is extremely expensive.
> This is also not true. In 1995 an Onyx with a maximum of 32 _Sockets_ had a maximum of 2GB of memory. The bandwidth over PCIe 3.0 x16 is about 16GB/s, and plenty of cards already have 16GB of memory. The textures would also stay in memory for multiple frames, since most textures are not animated.
This is true. One of the reasons why GPU renderers still aren't being used at high-level VFX in general is precisely because of both memory limits (once you go out-of-core on a GPU, you might as well have stayed on the CPU) and due to PCI transfer costs of getting the stuff onto the GPU.
On top of that, almost all final rendering is still done on a per-frame basis, so for each frame, you start the renderer, give it the source scene/geo, it then loads the textures again and again for each different frame - precisely why fast Texture caches are needed.
> At least at VFX level (Pixar's slightly different, as they use a lot of procedural textures), texture I/O can account for a significant amount of render time.
I was referring to visual effects.
> I don't know what you mean by this (I assume that memory scales with cores?), but most render farms at the high end have extremely expensive, fast I/O caches very close to the render server nodes (usually Avere solutions), mainly just for the textures.
I wouldn't say the SSDs on NetApp appliances or putting hard drives in render nodes are 'architecting for texture caching'. These are important for disk I/O all around. Still, it's not relevant to rendering Toy Story in real time, since it is clear that GPUs have substantially more memory than a packed SGI Onyx workstation had in 1995.
> The raw source textures are normally of the order of hundreds of gigabytes and thus have to be out-of-core. Pulling them off disk, uncompressing them and filtering them (even tiled and pre-mipped) is extremely expensive.
I don't know if I would say 'normally', but in any event I don't think that was the case for Toy Story in 1995. Even so, the same out-of-core texture caching that PRman and other renderers use could be done from main memory to GPU memory, instead of from hard disk to main memory.
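Something like the following is all I have in mind, as a sketch under my own assumptions rather than how any particular renderer actually does it: a fixed pool of tile slots in GPU memory, filled from host RAM on demand, with least-recently-used eviction. `TileCache` and `TILE_BYTES` are hypothetical names.

```cuda
#include <cuda_runtime.h>
#include <cstdint>
#include <list>
#include <unordered_map>
#include <vector>

struct TileCache {
    static constexpr size_t TILE_BYTES = 64 * 64 * 4 * sizeof(float); // 64x64 RGBA float tiles
    uint8_t*            pool = nullptr;   // device memory: numSlots * TILE_BYTES
    size_t              numSlots;
    std::list<uint64_t> lru;              // most recently used tile at the front
    std::unordered_map<uint64_t, std::pair<size_t, std::list<uint64_t>::iterator>> resident;
    std::vector<size_t> freeSlots;

    explicit TileCache(size_t slots) : numSlots(slots) {
        cudaMalloc(&pool, numSlots * TILE_BYTES);
        for (size_t i = 0; i < numSlots; ++i) freeSlots.push_back(i);
    }

    // Return the device pointer for a tile, uploading it from host RAM on a
    // miss and evicting the least-recently-used tile when the pool is full.
    uint8_t* fetch(uint64_t tileId, const uint8_t* hostTile) {
        auto it = resident.find(tileId);
        if (it != resident.end()) {                       // hit: just refresh LRU order
            lru.erase(it->second.second);
            lru.push_front(tileId);
            it->second.second = lru.begin();
            return pool + it->second.first * TILE_BYTES;
        }
        size_t slot;
        if (!freeSlots.empty()) {
            slot = freeSlots.back(); freeSlots.pop_back();
        } else {                                          // miss with a full pool: evict
            uint64_t victim = lru.back(); lru.pop_back();
            slot = resident[victim].first;
            resident.erase(victim);
        }
        cudaMemcpy(pool + slot * TILE_BYTES, hostTile, TILE_BYTES,
                   cudaMemcpyHostToDevice);               // RAM -> GPU instead of disk -> RAM
        lru.push_front(tileId);
        resident[tileId] = { slot, lru.begin() };
        return pool + slot * TILE_BYTES;
    }
};
```

A real implementation would batch uploads asynchronously and keep a device-side table so shaders can locate resident tiles, but the caching scheme itself ports straight across; only the slow and fast tiers change.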
> This is true. One of the reasons why GPU renderers still aren't being used at high-level VFX in general is precisely because of both memory limits (once you go out-of-core on a GPU, you might as well have stayed on the CPU) and due to PCI transfer costs of getting the stuff onto the GPU.
This was about the possibility of rendering the first Toy Story in real time on modern GPUs.
> On top of that, almost all final rendering is still done on a per-frame basis, so for each frame, you start the renderer, give it the source scene/geo, it then loads the textures again and again for each different frame - precisely why fast Texture caches are needed.
This is a matter of workflow, which makes perfect sense when renders take multiple hours per frame, but if trying to render in real time, the same pipeline wouldn't be reasonable or necessary.
> This only makes sense if you are locked in to some texture filtering algorithm already, which isn't true.
I said you could roll your own, didn't I? If you don't roll your own, you most definitely are locked in. All GPU libraries (OpenGL, CUDA, OpenCL, DirectX, Vulkan) only come with a limited set of filters, none of which match the filtering quality that RenderMan & other film renderers have.
If you do roll your own, your performance is going to suffer, and not by a little.
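For a concrete sense of how limited the built-in set is, this is essentially the whole filtering menu CUDA exposes when creating a texture object: point or linear taps, linear blending between mip levels, and an anisotropy clamp. (`makeFilmTexture` is just an illustrative wrapper, and the resource descriptor setup is omitted.)

```cuda
#include <cuda_runtime.h>

// Everything fancier than these options has to be written by hand in the
// kernel, as in the earlier multi-tap filtering sketch.
cudaTextureObject_t makeFilmTexture(const cudaResourceDesc& resDesc) {
    cudaTextureDesc td = {};
    td.filterMode       = cudaFilterModeLinear;   // only other option: cudaFilterModePoint
    td.mipmapFilterMode = cudaFilterModeLinear;   // trilinear blending across mip levels
    td.maxAnisotropy    = 16;                     // hardware anisotropic sampling clamp
    td.normalizedCoords = 1;
    td.addressMode[0]   = cudaAddressModeWrap;
    td.addressMode[1]   = cudaAddressModeWrap;

    cudaTextureObject_t texObj = 0;
    cudaCreateTextureObject(&texObj, &resDesc, &td, nullptr);
    return texObj;
}
```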
> I'm acutely aware of how much does or does not go into textures. Modern shaders can account for as much as half of rendering time, with tracing of rays accounting for the other half.
That ratio depends entirely on what you're doing. It's meaningless. That said, Pixar people have said 10:1 in the past (Toy Story time frame) for shading:rasterizing. You mention ray tracing, are you assuming ray tracing? Why? Toy Story wasn't ray traced.
> Render farm nodes are typically built with memory to CPU core ratios that match as the main priority.
That doesn't contradict what I said, at all.
> The textures would also stay in memory for multiple frames,
I doubt that was true for Toy Story, and it was not true for the films I worked on circa Toy Story. Texture usage was "out of core" at the time.
Texture & MIP tiles were loaded & purged on-demand into a RAM cache during a frame of render. Each renderfarm node also had a local disk cache for textures, to minimize network traffic. The amount of texture used during the render of the frame often (and I believe usually) exceeded the amount of RAM in our renderfarm nodes. You certainly can load textures on demand on a GPU, you just don't get any performance gain over a CPU when you do that.
Aside from Toy Story era assets, in-core GPU textures are not possible with today's film texture & geometry sizes. Film frames with good filtering can easily access terabytes of texture.
GPU core memory size & bandwidth is the single main determinant of whether studios can use GPUs for rendering today. Getting large (film sized, not game sized) amounts of geometry & textures onto a GPU is the main problem.
> You mention ray tracing, are you assuming ray tracing? Why? Toy Story wasn't ray traced.
I actually said that in my first post.
> I said you could roll your own, didn't I? If you don't roll your own, you most definitely are locked in
The point is that this is not something that is so computationally intensive that it would prevent it from running in real time, even when matching the filtering quality of PRman.
> > Render farm nodes are typically built with memory to CPU core ratios that match as the main priority.
>That doesn't contradict what I said, at all.
You said that render farms focus on texture caching, and I'm telling you that is not true. I've built large render farms and the topic never comes up. Textures aren't nearly the focal point you are making them out to be.
> I doubt that was true for Toy Story, and it was not true for the films I worked on circa Toy Story. Texture usage was "out of core" at the time.
If you were rendering in real time, there would be no reason to unload textures from GPU memory and then put the textures back into GPU memory, so this isn't relevant. Film is rendered one frame at a time; however, if the Reyes architecture were running in real time, the same workflow used for 2-hour frames would not be practical, nor does it bear on whether real time would be possible here.
> You certainly can load textures on demand on a GPU, you just don't get any performance gain over a CPU when you do that.
That assumes that GPUs' only advantage is due to memory bandwidth, which isn't true, though that is irrelevant here, since there are plenty of GPUs with much more memory than the computers used to render Toy Story.
> Aside from Toy Story era assets, in-core GPU textures are not possible with today's film texture & geometry sizes. Film frames with good filtering can easily access terabytes of texture.
> GPU core memory size & bandwidth is the single main determinant of whether studios can use GPUs for rendering today. Getting large (film sized, not game sized) amounts of geometry & textures onto a GPU is the main problem.
I'm not sure why you suddenly focused on modern GPU rendering for film. This thread was about whether Toy Story could be rendered in real time on modern GPUs, not modern film assets.
Man, I don't know why you're in hyperbolic attack mode, I'm sorry if I said something that irritated you. I'd love to have a peaceful technical conversation about how to do it, rather than a try-to-prove-me-wrong-on-every-point fight. You're probably right, it's probably possible to render Toy Story in real time. I sincerely hope you have happy holidays.
It is a fact that the texture filtering Toy Story used is more computationally intensive than the "high quality" anisotropic mip mapping you get by default on a GPU. I did speculate that getting PRman-level texturing, filtering and anti-aliasing, combined with memory constraints, could compromise the ability to render Toy Story in real time. You disagree. That's fine, I could be wrong. But, if you don't mind, I'll wait to change my mind until after someone actually demonstrates it.
I've built render farms too, and renderers as well. My personal experience was that texture caching was a large factor in deciding the network layout, the hardware purchases, the software system supporting the render farm, and the software architecture of the renderer itself. You can tell me whatever you want, and keep saying I'm wrong, but it won't change my experience, all that means is that you might not have seen everything yet.
> If you were rendering in real time, there would be no reason to unload textures from GPU memory and then put the textures back into GPU memory, so this isn't relevant.
It is completely relevant, if you can't fit the textures into memory in the first place, which is precisely what was happening in my studio around the same time Toy Story was produced, and what I would speculate was also happening during production of Toy Story.
But, I don't know about Toy Story specifically, since I didn't work on it. I believe the production tree was smaller than 16GB, so perhaps it's entirely possible the entire thing could fit on a modern GPU. Still, this would mean that a good chunk of the software, the antialiased frame buffer, all animation and geometry data, all texture data -- all assets for the film -- would have to fit on the GPU in an accessible (potentially uncompressed) state. I'm somewhat skeptical, but since you're suggesting a theoretical rewrite of the entire pipeline & renderer, then yes, it definitely might be possible.
> I'm not sure why you suddenly focused on modern GPU rendering for film.
I was just making a side note that memory is still (and always has been) the GPU rendering bottleneck for film assets. Threads can't evolve? I'm not allowed to discuss anything else besides the first point ever?
I think my side note is relevant because an implicit meta-question in this conversation is: what year's film assets are renderable in real time on the GPU, regardless of whether Toy Story is?
The GPU memory limits are, IMO, becoming less of a bottleneck over time. Rendering today's film assets is becoming more possible on a GPU, not less, so I think if Toy Story can't be real-timed today, it will happen pretty soon.
> Man, I don't know why you're in hyperbolic attack mode,
There isn't anything like that in my posts, just corrections along with pointing out irrelevancies, no need to be defensive.
> It is completely relevant, if you can't fit the textures into memory in the first place, which is precisely what was happening in my studio around the same time Toy Story was produced, and what I would speculate was also happening during production of Toy Story.
Yes, PRman has always had great texture caching, and like I mentioned earlier, a 32-socket SGI Onyx would max out at 2GB of memory. I think a fraction of that was much more common.
> Still, this would mean that a good chunk of the software, the antialiased frame buffer, all animation and geometry data, all texture data -- all assets for the film -- would have to fit on the GPU in an accessible (potentially uncompressed) state.
I think you mean all assets for a shot, not the whole film.