> This guide will be relevant until OS compositors fundamentally change from just dealing with bitmaps. Which is unlikely to happen anytime soon.
There is absolutely no reason why browsers have to use the native compositor for CSS. It's a bad fit, and browsers should stop doing it.
> And doing UI layout on background threads breaks the basic design of pretty much every UI framework, web or native, which are usually single-threaded.
That's why you have a render tree on a separate thread from the DOM. Look at how Servo does it (disclaimer: I work on Servo). We've proven that it works.
> And doing UI layout on background threads breaks the basic design of pretty much every UI framework, web or native, which are usually single-threaded.
By "single-threaded" I meant business logic and non-fixed layout being done on the same thread. "Rendering" (drawing to a bitmap or translating to the OS compositor's object model or direct OpenGL/DirectX/etc.) is done on a separate thread in most frameworks I'm aware of.
> There is absolutely no reason why browsers have to use the native compositor for CSS. It's a bad fit, and browsers should stop doing it.
Here are a few:
- On some platforms, animations on native layers are applied every frame, even if the application that owns the layer is busy. This means fewer opportunities to drop frames.
- Sometimes web rendering engines are embedded in apps. Your app may want to draw web content that filters some other content that's behind it. This content may not be available to the browser engine (it may be in a separate process, for example). A native layer can apply this filter in the compositor process, where the content is available.
- Using the native compositor makes it easier to embed components (like video) that are provided by the system as native layers.
> - On some platforms, animations on native layers are applied every frame, even if the application that owns the layer is busy. This means fewer opportunities to drop frames.
The browser can and should do the same with its CSS compositor. There's no reason why the CSS compositor should run on the main thread (and, in fact, no modern browser works this way).
> - Sometimes web rendering engines are embedded in apps. Your app may want to draw web content that filters some other content that's behind it. This content may not be available to the browser engine (it may be in a separate process, for example). A native layer can apply this filter in the compositor process, where the content is available.
For transparent regions, a browser can do this by simply exporting its entire composited tree as a single transparent layer, where it can be composited over other content. In the case of a single-layer page, this is what the browser is doing anyway.
If you're talking about CSS filters, there's no way that I know of in CSS to say "filter the stuff behind me" in the first place. You can only filter elements' contents.
> - Using the native compositor makes it easier to embed components (like video) that are provided by the system as native layers.
I grant that you have to use the native compositor to get accelerated video on some platforms. But that doesn't mean that a browser should do everything this way. In fact, no browser even tries to export all of CSS to the compositor: this is why you have the various "layerization" hacks which give rise to the sadness in this article. Reducing what needs to be layerized to just video would actually decrease complexity a lot over the status quo. (If you don't believe me, try to read FrameLayerBuilder.cpp in Gecko. It would be way simpler if video were the only thing that generated layers.)
> The browser can and should do the same with its CSS compositor. There's no reason why the CSS compositor should run on the main thread (and, in fact, no modern browser works this way).
You still have to swap buffers from your background thread and then composite the buffer instead of compositing the animating layer directly. It's a small advantage, but it is an advantage.
> If you're talking about CSS filters, there's no way that I know of in CSS to say "filter the stuff behind me" in the first place. You can only filter elements' contents.
There's backdrop-filter: yes, it's an experimental property, but it's the one I was thinking about.
> You still have to swap buffers from your background thread and then composite the buffer instead of compositing the animating layer directly. It's a small advantage, but it is an advantage.
By this I assume you mean that when you have two compositors, you have an extra blit. This is mostly true (though it's not necessarily true if the OS compositor is using the GPU's scanout compositing), but it's by no means worth the enormous downsides of current layerization hacks. Right now, when you as a Web developer fall off the narrow path of stuff that the OS compositor can do, your performance craters. The current status quo is not working: only about 50% of CSS animations in the wild are performed off the main thread.
There's another enormous downside to using the OS compositor: losing all Z-buffer optimizations. Right now, browsers usually throw away 2x or more of their painting performance by painting pixels that are occluded. When using the OS compositor, the browser's painting engine doesn't know which pixels are occluded, because only the OS compositor knows that, so it has to paint the contents of every buffer just in case. But with a smart CSS compositor, the browser can early-Z-reject pixels that are covered up by other elements.
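Here's a rough sketch of what I mean, in toy TypeScript (names made up, and simplified to whole layers rather than pixels): if the compositor owns the layer tree, it can skip painting anything that an opaque layer in front of it completely covers. A real engine would track a covered region, or let the GPU do this per pixel via early-Z rejection, but the principle is the same.

```typescript
// Toy model: a "layer" is an axis-aligned rect that is either opaque or not.
interface LayerRect {
  name: string;
  x: number;
  y: number;
  width: number;
  height: number;
  opaque: boolean;
}

// True if `inner` lies entirely inside `outer`.
function containedIn(inner: LayerRect, outer: LayerRect): boolean {
  return (
    inner.x >= outer.x &&
    inner.y >= outer.y &&
    inner.x + inner.width <= outer.x + outer.width &&
    inner.y + inner.height <= outer.y + outer.height
  );
}

// Walk layers front-to-back and return only the ones that actually need
// painting. Deliberately conservative: a layer is skipped only when a single
// opaque layer in front of it covers it completely.
function layersToPaint(frontToBack: LayerRect[]): LayerRect[] {
  const opaqueInFront: LayerRect[] = [];
  const needPaint: LayerRect[] = [];
  for (const layer of frontToBack) {
    const fullyOccluded = opaqueInFront.some((front) => containedIn(layer, front));
    if (!fullyOccluded) {
      needPaint.push(layer);
    }
    if (layer.opaque) {
      opaqueInFront.push(layer);
    }
  }
  return needPaint;
}

// Example: an opaque full-screen modal means the page background underneath
// never has to be painted at all this frame.
const layers: LayerRect[] = [
  { name: "modal", x: 0, y: 0, width: 1280, height: 800, opaque: true },
  { name: "page background", x: 0, y: 0, width: 1280, height: 800, opaque: true },
];
console.log(layersToPaint(layers).map((l) => l.name)); // ["modal"]
```

When every layer is handed to the OS compositor as its own buffer, the browser has to fill in "page background" anyway, because it can't know the modal will cover it.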
> yes, it's an experimental property, but it's the one I was thinking about
Ah, OK, I wasn't aware of that because it's only implemented in Safari right now. Well, using the OS compositor would make it easier to apply backdrop filters, as long as the OS compositor supports everything in the SVG filter spec (a big assumption—I suspect this is only the case on macOS and iOS!) But even with that, I think it results in less complexity to just use the OS compositor for this specific case and fall back on the browser compositor for most everything else, just as with video. CSS really does not map onto OS compositors very well.
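For anyone following along, the distinction in question: `filter` only affects an element's own contents, while `backdrop-filter` affects whatever is painted behind the element. A minimal sketch in TypeScript/DOM terms; the `#panel` element is hypothetical and assumed to have a translucent background so the backdrop shows through.

```typescript
// Hypothetical element; assumed to exist and to have a semi-transparent
// background (e.g. rgba(255, 255, 255, 0.5)).
const panel = document.getElementById("panel") as HTMLElement;

// `filter` blurs the panel's *own* rendering.
panel.style.filter = "blur(4px)";

// `backdrop-filter` blurs whatever is rendered *behind* the panel. At the
// time of this thread it was experimental and only shipped in Safari behind
// a -webkit- prefix; setProperty keeps this compiling regardless of the DOM
// typings in use, and unsupported properties are simply ignored.
panel.style.removeProperty("filter");
panel.style.setProperty("-webkit-backdrop-filter", "blur(4px)");
panel.style.setProperty("backdrop-filter", "blur(4px)");
```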
There's a good reason to, in this case: the OS compositor is not designed to render the full generality of CSS.
The status quo is not working: only about half of CSS animations in the wild run on the compositor. As browser vendors, we need to admit that the attempt to carve out a limited, fast subset of CSS has failed, and we should do the hard work to make all of CSS run fast.
That's not true at all. You don't have most of these problems when dealing with native apps, for example, and the reason is that browser rendering is just extremely slow. Thus the gap between the common path and the fast path is unusually massive on browsers.
Although browsers needing to be highly defensive doesn't help, either. If a native app renders slowly, you generally blame the app, whereas if a website is slow, you blame the browser (regardless of who is actually at fault). This leads to browsers being conservative and defensive in their graphics stacks so that scrolling can be smooth in the face of graphically intense rendering.
OS composition is entirely unrelated here, and really there's no need for it to change away from just dealing with pixmaps.
I'm not as familiar with native Windows apps anymore, but the difference typically is that while there is a faster path on native, it's not as critical that it's taken. As in, the slow path is still generally fast enough.
That certainly used to be true on Win32 (in that an animation of left/top would easily hit 60fps on a desktop computer), but maybe Microsoft's UI toolkits have regressed significantly? I rather doubt it, though, and suspect that it would still work just fine in a native Windows app even though the web equivalent grinds to a halt.
Animating left/top on a fixed layout is indeed still fast. But modern app layouts using responsive design are still going to need to hit the UI thread to modify controls as the size changes.
The same issues also happen on Android, which is why every Android app has separate threads for layout and interaction, rendering, and business logic, and why all complicated rendering ends up on separate GPU layers.
> and why all complicated rendering ends up on separate GPU layers.
No, it doesn't. The app renders to a single surface in a single GPU render pass unless the app uses a SurfaceView, which is generally only for media uses (camera, video, games).
Multiple layers are only used when asked for explicitly (View.setLayerType) or when required for proper blending. They are otherwise avoided, since it's generally slower to use multiple layers.
You can absolutely do the "bad" things in the linked article in a native Android app and still hit 60fps pretty trivially. The accelerated properties, like View.setTranslationX/Y, only bypass what's typically a small amount of work (and don't use a caching layer). It's an incremental improvement, not something absolutely required. Scrolling in a RecyclerView or ListView, for example, doesn't even do that. It just moves the left/top of every view and re-renders, and that's plenty fast to hit 60fps.
This used to be true, but since Android M and N, where a lot more animations were added, a lot of animation now happens on separate GPU layers (and is rendered, if necessary, by separate threads).
This was especially necessary because of the many ripple animations that were introduced.
I think you're confusing the RenderThread with GPU layers. There's only one rendering thread per app, and it handles all rendering work done by that app. It's really no different from pre-M rendering, other than that a chunk of what used to be on the UI thread is now on a different thread. The general flow is the same.
The new part is that some animations (basically just the Ripple Animation) can happen on their own on that thread, but it doesn't use a GPU layer for it nor a different OS composition layer.
> but it doesn't use a GPU layer for it nor a different OS composition layer.
Really? As is so often the case, there was a lot of talk about doing that beforehand, and it wasn't discussed at all later on, so I had assumed it had been done. Interesting that it didn't happen.
What'd be the reason for that? Animating objects on a static background seems like a prime case for GPU layers. Or was it the issue with the framebuffer sizes being too huge again?
Think about what the static background actually is. It's probably either an image (which is already just a static GL texture, no need to cache your bitmap in another bitmap), or it's something like a round rect which can actually be rendered faster than sampling from a texture (since it's a simple quad + a simple pixel shader - no texture fetches slowing things down). In such a scenario a GPU layer just ends up making things slower and uses more RAM.
> This guide will be relevant until OS compositors fundamentally change from just dealing with bitmaps. Which is unlikely to happen anytime soon.
A browser is essentially a small OS with server-side rendering, where clients/web apps send HTML/CSS/JS for their GUI and the compositor/browser engine renders into bitmaps. We probably need something simpler than HTML/CSS/JS, though, if we want it to be reasonably easy to implement a fast rendering engine.
Hmm... Wouldn't it be possible to address the problem at another level? For instance, when it's discovered that layout properties are being animated, see if it would be possible to create an equivalent effect in the compositing stage, and if so, convert the layout position changes to transforms?
I realize that would probably not be easy, but maybe still possible without major architectural changes?
> For instance, when it's discovered that layout properties are being animated, see if it would be possible to create an equivalent effect in the compositing stage, and if so, convert the layout position changes to transforms?
Yes, and we as browser vendors should absolutely do that.
Though a better solution is to just make the compositor able to accelerate the full generality of CSS in the first place.
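To illustrate the conversion from the authoring side (a rough sketch; the `.box` selector is made up, and this is the page-level equivalence rather than how an engine would do it internally): animating `left` dirties layout every frame and stays tied to the main thread, while the visually equivalent `transform` animation is something browsers can run entirely on the compositor.

```typescript
// Hypothetical positioned element (animating `left` requires position:
// relative/absolute/fixed).
const box = document.querySelector(".box") as HTMLElement;

// Layout-property version: every frame changes `left`, which forces layout
// work on the main thread.
box.animate([{ left: "0px" }, { left: "200px" }], {
  duration: 500,
  fill: "forwards",
});

// Visually equivalent compositor-friendly version: layout is untouched and
// only the transform changes. (In practice you'd pick one or the other, not
// run both at once.)
box.animate([{ transform: "translateX(0px)" }, { transform: "translateX(200px)" }], {
  duration: 500,
  fill: "forwards",
});
```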
That's the kind of thing that would cause a rejection: elements are going to be created, so this can't be converted to compositing-stage operations instead.