The main benefit of using the Web Assembly GC is not improved performance within the language, but rather having a single GC that handles both native DOM objects and language objects. Having two GCs interoperate is a big pain, and it's very difficult to pull off without leaks or performance problems. This is especially true for DOM objects, which have no user-visible finalizers. (The closest thing is WeakMaps, but I don't believe these are enough to implement proper cross-language GC, because you can't query whether an potentially-dead object is actually dead.)
Web Assembly GC is not a performance optimization; it's necessary for correctness.
In Chrome today, as an example, the C++ based DOM and V8 (Javascript ) objects are using different GCs. Obviously it is a pain point, but the situation of having different GCs for DOM and language objects is not new with Web Assembly.
I mean, as of Oilpan the V8 garbage collector marks C++ objects by calling trace hooks, as I understand things. That isn't an option for Web Assembly, which doesn't have access to the V8 garbage collector in that way (except via the wasm GC proposal; in fact, that's fundamentally what the wasm GC proposal is).
On languages where the algorithm is an implementation detail, one can make use of reference counting with a cycle collector, which are just a few hundred lines.
Implementing one is pretty simple, making it perform well is another matter.
Unlike your real processor though, you can't directly access memory or ensure its absolute location. You can make an advanced GC that is slow or a basic one that is not.
Which is good. You can always implement GC inside it, but personally, for performance-intensive computing, I do not want GC inside WASM unless eplicitly enabled.
What kind of WASM file sizes can we expect? I know some work has gone into shrinking Go executables, especially in 1.7, but will we be able to produce something like the 15KB Rust Hello World WASMs? Go has a fantastic stdlib but it hasn’t prioritized web optimized file sizes (yet).
A "hello world" in Rust is ~100 bytes; the more stuff you use, the bigger it gets, as less of the stdlib can be removed. The biggie is an allocator, which adds some size, but you can use things like https://github.com/fitzgen/wee_alloc and it's less than a kilobyte...
In Google's Doubleclick for Publishers UI they use Dart apps. The page loads several "apps" compiled to JS, each of which weights more than 2 Mb (even the "changelog" app). Page loads very slowly. So I would expect something similar for Go.
I'm trying to learn more about this topic and I'm curious if anyone could clarify for me --
The reason Go would need a specific architecture for WebAssembly is because Go supports features, like garbage collection, that WebAssembly does not.
Is that right? Close? An oversimplification. Way off?
Garbage collection isn't really a problem because no CPU architecture supports garbage collection- the compiler is generating code to implement the GC regardless. The issue is that WASM doesn't really resemble real CPUs at all. Basically all modern CPUs are register machines, but WASM is a stack machine.
Say you have the code C=A+B in your program.
The generated assembly for a real machine looks something like this:
(R* denote registers and A-C are memory addresses)
load A->R1
load B->R2
add R1+R2->R3
store R3->C
WASM looks more like this:
push A
push B
add
pop C
This alone makes the approach to code generation and optimization quite different.
WASM does lack some features that you would find in real CPUs, most notably the ability for execution to jump to arbitrary memory locations. They had to work around this to maintain the behavior of goroutines. Functions also live in their own address spaces which necessitates a change to how the program counter works.
It's more that Go _requires_ a runtime to support certain features which require Garbage Collection. The requirement of a runtime in turn dictates a different architecture to run on WebAssembly.
My understanding is that they're treating Web Assembly as just another architecture to compile for - rather than spitting out x86_64 assembly, or ARM assembly, they're emitting Web Assembly.
As others have said, Web assembly is an architecture/target for C, C++, rust, etc. that you compile for, and it will be the same for go.
The link does talk about GC support. It looks like wasm will have one, but go’s will most likely perform better since it is tailored to go. (edit for clarification)
I assume a lot of the planning for this was in the GitHub and mailing list discussions dedicated to go’s wasm support. I’m on mobile but I can link those later.
Wasm doesn't have one yet, but it's a future extension to to spec.
You can, of course, compile your GC in today, and it will work. Wasm's integrated support will mean re-using the existing GC, which means smaller code size.
Side note: most people thought this would be a precursor to DOM support, but the new "host bindings" proposal opens up DOM support without needing GC support, and many people believe that now the former will land before the latter.
"the WASM GC" only exists as a proposal, and proposals can change before they land. I don't know enough about it to say, and even if I did, it might be invalid by the time things are actually accepted.
The regular Go compiler doesn't use LLVM. It is derived from Plan9 compiler toolchain. There has been work on a LLVM Go compiler. There is also Gccgo, Go frontend for GCC, that could use GCC WebAssembly backend.
This is kind of what I was getting at... other languages seem to have a webasm pipeline that includes LLVM. I wanted to know what what the difference was between go and these languages that require one to use something other than LLVM.
> Go’s garbage collection is fully supported. WebAssembly is planning to add its own garbage collection, but it is hard to imagine that it would yield a better performance than Go’s own GC which is specifically tailored to Go’s needs
I don't think it's hard to imagine reading the GC proposal. The JS collector that might be reused could be off thread, something WASM can't (yet) do.
> Most file system operations are mapped to Node.js’ “fs” module. In the browser, file system operations are currently not available.
Please please abstract this. As a maintainer of a non-JS WASM backend, I'd love to use Go too.
> Especially a “goto” operation in WebAssembly would be very helpful.
I didn't look into the Go use case enough, but curious how much better this would be than the current labeled block and labeled break approach in WASM. WASM has fairly strict stack/frame rules/types, so arbitrary gotos wouldn't work.
Jump-to-arbitrary-address / labels-as-values isn't made available to Go devs by Go-the-language, but seems required by Go-the-runtime+compiler, such as for compiling Goroutines.
It sort of does via channels, though. Sending a value to a bufferless channel causes the next goroutine to run (eventually). This looks a bit like a goto if you squint.
Go needs multiple stacks and a way to switch to executing a different stack. Web assembly could provide this in a high-level way.
WebAssembly can be used anywhere. It is not tied to JavaScript in nature. So think of it kind of like the JVM in a way (but it's not an implementation, just a spec).
I'm ignorant of the details, but WASM is an assembly language that's designed from the ground up for running untrusted code. It will also have multiple independent open source implementations, which has some nice properties.
WASM can be good for other things too. It is portable and fairly easy to target. It's like asking "why emscripten of you have plenty of other languages to choose from, LLVM is good for executables"
Wasm exists because it's small and incremental; there's lots of stuff that's needed, but not in the initial spec. As long as it can be added in the future, that's what matters.
It's easier to add instructions than to remove our restrict them. Hardware or software exploits to break out of the sandbox are huge concerns. I'm all for
introducing new features slowly.
So when they say, it generates a big switch statement, do they generate this big switch statement for whole program or for individual functions which takes enough context to continue at at point where it yielded?
Speculating, but it sounds to me like it's within individual functions. Go's program counter is "[split] into 2 parts: PC_F and PC_B. PC_F is the index of the function to be executed. PC_B is the index of the basic block to be executed."
Afaik in theory stack machines are simpler to implement, but slower than register machines. In practise, the JVM is still pretty quick. Why was WASM designed as a stack machine?
What you’re saying applies to interpreters, which execute the program opcode by opcode, and (in the case of stack machines) keep an actual stack at runtime. WebAssembly was designed to be the input for (at least mildly) optimizing JIT compilers. Since the stack is required to be at a fixed depth at each instruction, when the compiler reads the instruction, it can map each of the inputs it pops to a single earlier instruction that pushed it[1], and record a reference directly to that in its internal IR. After that it completely forgets about the stack. Later on it’ll do register allocation for the native architecture, so the generated native code makes efficient use of registers.
Or in other words, the “stack machine” is basically just a compact encoding of an AST.
(Note that the native “stack” is an entirely separate thing which does exist at runtime.)
[1] Actually, there can be multiple possible instructions from, e.g., inside the ‘if’ and ‘else’ sides of a prior if-block (which can push values that stay on the stack after the end of the block). But since both sides have to push the same number of values, this is still tractable; it turns into a phi node in SSA representation.
Interestingly the issues sounds a lot like what a high performance Haskell implementation would run into. One additional note though on GC: highly tuned PL implementation requires completely control over object layout, pointer representation, and GC. (!)
No off-the-shelf GC will ever be perfect for all languages, thus I think WASM would be far better off doing whatever it could to facilitate runtimes implementing their own.
(!) Example from the lazy world: lazy closures when evaluated and updated often results in an indirection. We rely on GC to shortcut these. Sometimes we even have GC perform trivial constant-time evaluations when we know the result is smaller than the suspended evaluation.
Example from the Lisp world: cdr-coded cons cells can cut size in half, but no generic GC would be able to do this.
One big advantage of stack machines is that they tend to be better on code size, which very much matters when you're transferring code over the network. I expect other parts of the design make it easier to JIT; the lack of arbitrary gotos probably makes analyzing control flow easier, for example.
If you tried to look at the document and found that the entire internet has been screwing with the formatting, change "Suggesting" to "Viewing" in the top right. Seems like someone forgot to check permissions on the document.
Web Assembly GC is not a performance optimization; it's necessary for correctness.