A little bit of everything, but the WebGPU stuff requires a full interop layer to translate WebGPU calls on the Rust side into calls on the JS side (a poor man's wasm-bindgen basically).
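To give a rough idea of what such an interop layer looks like, here is a minimal sketch, not the actual code from the project. It assumes a handle-table scheme where JS objects live in a table on the JS side and Rust only holds integer handles. The import module name `gpu_glue`, the function `gpu_device_create_buffer`, and the `JsHandle`, `Device`, and `Buffer` types are all hypothetical names made up for illustration. A native stub stands in for the JS import so the sketch also compiles off the web.

```rust
/// Opaque index into an object table kept on the JS side (hypothetical scheme).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct JsHandle(pub u32);

// On wasm32, the function is imported from the JS glue code.
// The module and function names are assumptions for this sketch.
#[cfg(target_arch = "wasm32")]
mod sys {
    #[link(wasm_import_module = "gpu_glue")]
    extern "C" {
        pub fn gpu_device_create_buffer(device: u32, size: u32, usage: u32) -> u32;
    }
}

// Off the web, a stand-in with the same signature that returns a fake handle.
#[cfg(not(target_arch = "wasm32"))]
mod sys {
    pub unsafe fn gpu_device_create_buffer(_device: u32, _size: u32, _usage: u32) -> u32 {
        1
    }
}

/// Rust-side wrapper around a JS GPUDevice handle.
pub struct Device {
    handle: JsHandle,
}

/// Rust-side wrapper around a JS GPUBuffer handle.
pub struct Buffer {
    handle: JsHandle,
    size: u32,
}

impl Device {
    pub fn from_handle(handle: JsHandle) -> Self {
        Device { handle }
    }

    /// Forwards the call across the boundary and wraps the returned handle.
    pub fn create_buffer(&self, size: u32, usage: u32) -> Buffer {
        let raw = unsafe { sys::gpu_device_create_buffer(self.handle.0, size, usage) };
        Buffer { handle: JsHandle(raw), size }
    }
}

impl Buffer {
    pub fn handle(&self) -> JsHandle {
        self.handle
    }

    pub fn size(&self) -> u32 {
        self.size
    }
}
```

On the JS side, the matching glue function would look up the device in its table, call `device.createBuffer(...)`, store the result, and return its index. Every WebGPU class and method you want to use needs a pair like this, which is where the size adds up.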
Additionally, there's just a lot of code being generated for common containers. It would probably be simpler and smaller to write unsafe containers that just move pointers to boxed structs around, or something along those lines, but I haven't gotten around to that yet.
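A minimal sketch of that idea, assuming the goal is to keep the real container logic non-generic so it is compiled only once. `RawPtrVec` and `BoxVec` are hypothetical names, not existing types, and whether this actually ends up smaller would need to be measured.

```rust
use std::marker::PhantomData;

/// Non-generic core: it only ever stores untyped pointers,
/// so its code exists once no matter how many element types are used.
struct RawPtrVec {
    ptrs: Vec<*mut ()>,
}

impl RawPtrVec {
    fn new() -> Self {
        RawPtrVec { ptrs: Vec::new() }
    }
    fn push(&mut self, p: *mut ()) {
        self.ptrs.push(p);
    }
    fn pop(&mut self) -> Option<*mut ()> {
        self.ptrs.pop()
    }
    fn get(&self, i: usize) -> Option<*mut ()> {
        self.ptrs.get(i).copied()
    }
    fn len(&self) -> usize {
        self.ptrs.len()
    }
}

/// Thin typed wrapper: the per-type code is just boxing and pointer casts.
pub struct BoxVec<T> {
    raw: RawPtrVec,
    _owns: PhantomData<Box<T>>,
}

impl<T> BoxVec<T> {
    pub fn new() -> Self {
        BoxVec { raw: RawPtrVec::new(), _owns: PhantomData }
    }

    /// Boxes the value and stores the erased pointer.
    pub fn push(&mut self, value: T) {
        self.raw.push(Box::into_raw(Box::new(value)) as *mut ());
    }

    /// Takes ownership back from the last stored pointer.
    pub fn pop(&mut self) -> Option<T> {
        // Safety: every pointer in `raw` came from Box::<T>::into_raw in `push`.
        self.raw.pop().map(|p| unsafe { *Box::from_raw(p as *mut T) })
    }

    pub fn get(&self, i: usize) -> Option<&T> {
        // Safety: the pointer is valid for as long as it stays in the container.
        self.raw.get(i).map(|p| unsafe { &*(p as *const T) })
    }

    pub fn len(&self) -> usize {
        self.raw.len()
    }
}

impl<T> Drop for BoxVec<T> {
    fn drop(&mut self) {
        // Re-box and drop every remaining element so nothing leaks.
        while self.pop().is_some() {}
    }
}
```

The trade-off is one heap allocation per element and an extra pointer hop on access, in exchange for less generated code per element type.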
You'd still need the same kind of interop layer. The question then becomes whether there are more or fewer classes and methods you'd need to wrap on the Rust side.
Do you think it’s the WebGPU, the scene graph, the sound, or the Rust infrastructure that consumes the most space?