Wasm is a software-defined abstraction boundary, possibly the most bullet-proof software-defined abstraction boundary to date, given the formal specification and machine-checked proofs of correctness. It has memory sandboxing properties that are easily and efficiently implementable on hardware, and it is low-level enough to run low-level languages at speeds very close to native compilation.
Others in the thread draw analogies to the JVM, CLR, Parrot, OCaml bytecode, and so on. None of those have the rigor that Wasm has, nor are they low-level enough to be an abstraction over hardware. And all of them come with ambient capabilities or language concepts that implicitly couple applications to the runtime system and/or other applications. Wasm, by its API-less core spec, does not. This was 100% intentional.
Andy mentioned WALI briefly in the article. We designed WALI to be a similar kind of "just one step up from the bottom" abstraction: if you're running on Linux, then the Wasm engine exposes a memory-sandboxed Linux syscall set with the exact same semantics (effectively pass-through), which allows building abstractions and security one layer up, outside of the engine.
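To make the pass-through idea concrete, here is a minimal sketch of what a host-side shim for one such syscall could look like. The names (wasm_instance_t, wali_read) are hypothetical, not the actual WALI API; the point is just that the only thing the engine adds is translating and bounds-checking the guest "pointer" against the instance's linear memory before handing the call to the kernel.

    /* Hypothetical sketch of a WALI-style pass-through syscall shim on the
     * host side. The type and function names here are illustrative only. */
    #include <errno.h>
    #include <stdint.h>
    #include <sys/types.h>
    #include <unistd.h>

    typedef struct {
        uint8_t *memory;     /* base of the instance's linear memory */
        uint64_t mem_size;   /* current size of linear memory in bytes */
    } wasm_instance_t;

    /* read(2) as exposed to Wasm: same semantics as the Linux syscall, except
     * the buffer "pointer" is an offset into the sandboxed linear memory. */
    int64_t wali_read(wasm_instance_t *inst, int32_t fd,
                      uint32_t buf_offset, uint32_t count) {
        /* Bounds-check the (offset, length) pair against linear memory. */
        if ((uint64_t)buf_offset + count > inst->mem_size)
            return -EFAULT;                  /* reject out-of-bounds access */

        /* Translate the sandboxed offset to a host pointer and pass through. */
        ssize_t n = read(fd, inst->memory + buf_offset, count);
        return n < 0 ? -errno : n;
    }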
Given WASM's goodness, has anyone built a WASM CPU? That is, WASM as the assembly language (instead of x86, ARM, RISC-V)? Heck, does that even make sense?
If you can write an interpreter, then you can implement it in hardware. Whether it will be _efficient_ compared to using another ISA is a different matter.
Off the top of my head: the unbounded block stack would need to be managed and spilled to memory as it grows (not unlike SPARC's register windows), which is manageable but complex. The operand stack either needs to be bounded or likewise spilled to memory. For a superscalar, the WASM bytecodes aren't the nicest to decode, but neither are x86 nor RISC-V (the C extension implies variable-length instructions).
Unsurprisingly, it would make a lot more sense to design a µArch that is well matched to WASM (e.g. has a rotation instruction) and for which translation is fairly straightforward. I'm excited about WASM exactly because it provides an interesting path for getting software onto new and exciting µArchs. It's _MUCH_ less work to make a WASM translator than to do the full software stack.
With a pre-pass on the bytecode (e.g. during verification), one can compute a side data structure that makes it possible to interpret Wasm without tracking the control stack dynamically. It works surprisingly well.
The OP asked about a WASM CPU. I assume he meant that literally, thus without preprocessing. Once you allow preprocessing there's no limit to what you can do, and in fact you can do something really efficient (the margin is too narrow to contain the full proof).
I know, but the safety guarantees of Wasm are established by validating the code first. The sidetable that Wizard's interpreter uses is small enough (about 1/3 the size of the original code) that it could fit in a hardware cache alongside the original code, kind of like a µ-op cache. Wizard computes this sidetable during code validation, so there is no separate step.
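For a rough flavor of the idea (a sketch only, not Wizard's actual entry layout; the field names are made up), each taken branch consumes one sidetable entry that says where to jump and how to fix up the operand stack, so the interpreter never reconstructs the nesting of blocks at run time:

    /* Sketch of a branch sidetable for an in-place Wasm interpreter.
     * A validation-time pre-pass fills in one entry per branch; execution
     * then never tracks a control stack. Field names are illustrative. */
    #include <stdint.h>
    #include <string.h>

    typedef struct {
        int32_t  pc_delta;   /* how far to move the bytecode pointer */
        int32_t  stp_delta;  /* how far to move the sidetable pointer */
        uint32_t keep;       /* branch result values carried to the target */
        uint32_t pop;        /* extra operand-stack slots to discard */
    } side_entry;

    typedef struct {
        const uint8_t    *pc;   /* current bytecode position */
        uint64_t         *sp;   /* operand-stack pointer (one past the top) */
        const side_entry *stp;  /* current sidetable position */
    } interp_state;

    /* Taken branch: one table lookup instead of unwinding nested blocks. */
    static void take_branch(interp_state *s) {
        side_entry e = *s->stp;
        /* Slide the `keep` result values down over the `pop` discarded slots. */
        memmove(s->sp - e.keep - e.pop, s->sp - e.keep, e.keep * sizeof *s->sp);
        s->sp  -= e.pop;
        s->pc  += e.pc_delta;
        s->stp += e.stp_delta;
    }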
I think the technique could be adapted to make a Wasm CPU, but the issues mentioned (e.g. variable-length instructions) would complicate the CPU frontend. I think its being stack-based isn't as much of an issue, as register renaming will virtualize the Wasm operand stack. But given how complex modern CPUs are, it'd be hard to build a competitive super-scalar chip (really for any ISA) without some serious investment.
> it'd be hard to build a competitive super-scalar chip (really for any ISA) without some serious investment
"Hundreds of millions of dollars" cf. Mark Horowitz's Micro 35 keynote (https://www.youtube.com/watch?v=q8WK63joI_Y&t=1140s) but if you look at the stack in his graph, it's obvious that the architecture portion of this is tiny.
Making a superscalar core simulation (= "model"), or even an FPGA softcore, is within the reach of any sufficiently motivated individual, and indeed there are several already. Of course the performance will be dramatically lower than a state-of-the-art silicon implementation.
It could be done, but it likely wouldn't perform well or be very useful (because the software surrounding the application also needs to exist). The IR notably has a lot of indirection in the way e.g. function calls are represented (an indirect call goes through a table, which then refers to the module it needs to call, as sketched below); it's stack-based, stuff like that.
Realistically speaking, you are just better off retargeting a compiler to whatever instruction set you have, because writing a (basic) compiler isn't too difficult. You can even just use wasm2c and then run a C compiler on it, assuming one exists for your target...
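To make the indirect-call indirection above concrete, here is roughly the shape of the dispatch (names and layout are made up for illustration; real engines and wasm2c-generated code differ in the details): the callee is an index into a function table whose entry is signature-checked before the actual function pointer is invoked.

    /* Illustrative sketch of Wasm call_indirect dispatch (names are made up).
     * The callee is not a direct address: it is an index into a function
     * table, checked against the expected signature before the call. */
    #include <stdint.h>
    #include <stdlib.h>

    typedef int32_t (*func_i32_i32)(int32_t);   /* signature [i32] -> [i32] */

    typedef struct {
        uint32_t type_id;    /* canonical id of the function's signature */
        void    *funcptr;    /* the actual code address */
    } table_elem;

    typedef struct {
        table_elem *elems;
        uint32_t    size;
    } func_table;

    static void trap(const char *msg) { (void)msg; abort(); }

    /* call_indirect with table index `idx` and a single i32 argument */
    int32_t call_indirect_i32_i32(func_table *t, uint32_t expected_type,
                                  uint32_t idx, int32_t arg) {
        if (idx >= t->size)              trap("table index out of bounds");
        table_elem e = t->elems[idx];
        if (e.funcptr == NULL)           trap("uninitialized table element");
        if (e.type_id != expected_type)  trap("indirect call type mismatch");
        return ((func_i32_i32)e.funcptr)(arg);
    }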
HashLink seems to have the ambient concept of e.g. objects directly in its design and bytecode format, whereas Wasm instead thinks of that in terms of "reference types", which represent the existence of some opaque object. So there's that.
But most importantly, Neko and HashLink do not have the same kind of explicit, formally defined semantics WebAssembly has. This is a big topic, but TL;DR there is a logical specification of every WebAssembly operation and behavior that, taken together, gives rise to a validation algorithm. This algorithm is basically a type checker, and like any type checker, it rules out bad behaviors. This validation algorithm can be run over an arbitrary WebAssembly file and tells you "yes this is valid and safe" or "no it isn't safe." By safe, I mean certain behaviors are never possible, like stack underflow, or unstructured control flow, or an operator being called with an invalid operand (e.g. adding a string and a number). The absence of these behaviors means that when executing this program, certain isolation properties can be shown to hold -- even if the program is produced by a hostile and untrustworthy source.
That is very, very important for many use cases. Consider a browser that downloads a wasm file from an arbitrary server. How does it know that file was produced by a trustworthy compiler that produces correct wasm files? It can't know that. So instead it must validate the file first (according to the precise specification given by the wasm standard) before executing it.
The validation algorithm for WebAssembly is stated in precise mathematical language, the language of type theory. And there have been several independent efforts to verify that this algorithm is correct, in machine-checked theorem provers, showing that it always produces a correct yes/no answer over all possible WebAssembly programs. Type checking and mathematical theorem proving are very closely related, in the same field, and type checking algorithms have been formalized in this manner for a very long time. So it's a well understood topic, and there is very good reason to believe the WebAssembly validation algorithm is correct and accurate (contra the oft-repeated "b-b-b-but the specification might be wrong!").
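As a toy illustration of what "a type checker over the operand stack" means (this is nowhere near the real spec algorithm, which also handles blocks, branches, locals, and unreachable code; all names here are made up), validation abstractly interprets each opcode over types instead of values and rejects the module on any mismatch or underflow:

    /* Toy flavor of Wasm validation: abstract interpretation over *types*,
     * not values. Covers two numeric opcodes only; illustrative, not the
     * actual spec algorithm. */
    #include <stdbool.h>
    #include <stdio.h>

    typedef enum { I32, I64, F32, F64 } valtype;

    typedef struct {
        valtype stack[64];
        int     sp;
        bool    ok;
    } checker;

    static void push(checker *c, valtype t) {
        if (c->sp < 64) c->stack[c->sp++] = t; else c->ok = false;
    }

    static void pop_expect(checker *c, valtype t) {
        /* Reject on stack underflow or on an operand of the wrong type. */
        if (c->sp == 0 || c->stack[--c->sp] != t) c->ok = false;
    }

    /* i32.add: pops two i32s, pushes one i32. "Add a string and a number"
     * simply cannot type-check, so it can never happen at run time. */
    static void check_i32_add(checker *c)   { pop_expect(c, I32);
                                              pop_expect(c, I32);
                                              push(c, I32); }
    static void check_i32_const(checker *c) { push(c, I32); }

    int main(void) {
        checker c = { .sp = 0, .ok = true };
        check_i32_const(&c);    /* i32.const 1 */
        check_i32_const(&c);    /* i32.const 2 */
        check_i32_add(&c);      /* i32.add -> fine */
        check_i32_add(&c);      /* i32.add again -> underflow, rejected */
        printf(c.ok ? "valid\n" : "invalid\n");   /* prints "invalid" */
    }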
If you are writing a program where the trustworthiness of the generated code isn't a problem, then you don't need that feature. For example, in the case of Haxe, you are probably producing those programs so they can be downloaded by users who already trust you, or you are compiling them directly to native executables and shipping those. They already trust the developer not to hack their machine. In practice, some people still use WebAssembly for those use cases because it has mindshare and a bunch of implementations, implementing it (at a basic level) is not too difficult, and the extra security and isolation guarantees are nice to have.