"YJIT code ported from C99 to Rust"
Beyond passing the test suite, are there other numbers comparing the two versions (e.g., compilation time, lines of code, binary size, performance)?
> The new Rust version of YJIT has reached parity with the C version, in that it passes all the CRuby tests, is able to run all of the YJIT benchmarks, and performs similarly to the C version (because it works the same way and largely generates the same machine code). We've even incorporated some design improvements, such as a more fine-grained constant invalidation mechanism which we expect will make a big difference in Ruby on Rails applications.
I think the goal of this right now is just to match the C version.
The C implementation of YJIT supported x86-64 on Unix/Linux platforms, and it sounds like adding Windows and arm64 support, plus other improvements, was a daunting task with the tools C provides.
Now that it’s in Rust, we’ll hopefully see further improvements arrive more quickly.
Rust will eclipse C++. C is a harder nut to crack, particularly for the embedded space, where ease of implementing and maintaining a compiler back-end/code-emitter for your new weird 8-bit architecture matters. C is pretty close to a portable macro assembler and barely changes, which is great for that use case. But for use cases like interpreters, Rust is perfectly suitable.
This position is like saying C or C++ won't eat ASM's lunch. While technically true since there's a lot of ASM code still being written, especially for extremely low-level or high performance code, the vast majority of C and C++ developers don't actually touch ASM (i.e. C/C++ dominate ASM in terms of number of developer hours spent).
I think you may also be overlooking the GCC backend for rustc and gccrs, a ground-up standalone reimplementation of the Rust language frontend for GCC. Both of those should drastically improve the coverage and availability of Rust to all the same platforms you would be using GCC to compile C code for.
Depending on the compiler support, you might get that architecture for free unless the vendor is providing their own C compiler. The harder part is that your new weird 8-bit architecture probably won't benefit as much from the strong no_std ecosystem of libraries, so the overhead of writing Rust won't be counterbalanced. Still, like I said at the outset, this is an extremely niche use case. Rust doesn't have to wipe C or C++ from the map for it to crack that nut.
The harder nut for Rust to crack, I think, is actually C++. There are extremely large C++ codebases. Industry would love for there to be a significantly easier/cheaper story to tell in terms of integrating Rust with those codebases. That way you could set metrics around converting the codebase, require that new code be written in Rust, etc. However, the challenge is that Rust can only replace components with very well-defined boundaries. Those boundaries are less clearly defined in C++ codebases than they are in C codebases (linkage and templates in particular are challenging). To truly crack the C++ nut probably requires solving this problem, unless Rust codebases just start eating C++ codebases commercially through development velocity (which is a much longer and harder path).
It's extremely important for meaningful commercial Rust adoption that legacy codebases be able to adopt it incrementally (i.e. all new code is Rust). I think you're underestimating how much C/C++ code there is out there (the Linux kernel, Chrome, all of Google's internal infrastructure, all of Amazon's internal infrastructure, etc.). We're talking about many billions of dollars' worth of code that is never going to get rewritten, with lines of code that keep accruing. Competitors starting today may make other choices, but there's enormous value to be had by cracking the nut of seamless, progressive migration (i.e. so that you can say "no more new C++ code"). The failure to learn this lesson is seen in banks that continue to run on Fortran at best and, at worst, in other businesses that continue to run on old, unsupported languages and technologies. Thankfully, I think the big tech companies are engineering-led and understand this, so I suspect they're paying people to figure out this problem.
I've been around for a long time and I haven't seen a PL with this much momentum since Java was launched. Inertia is real, but the benefits over C++ are undeniable.
Bootstrapping seems a silly thing to be obsessed about, as C++ will be around forever anyway, but Rust obviously could be bootstrapped if that ever became important.
Ada/SPARK already provided such benefits, and NVidia has chosen it instead of Rust for automotive firmware.
Rust momentum is meaningless for GPUs unless NVIDIA decides it gets to play in CUDA, and they are now one of the companies with the most ISO C++ people on their payroll.
It is also meaningless for PlayStation, Nintendo and Xbox, unless the respective SDKs integrate Rust.
Bootstrapping isn't silly, because LLVM and GCC are written in C++, so there isn't any "Rust will eclipse C++" when Rust depends on C++ for its existence.
SPARK doesn't provide the same feature set as Rust. If you want safe heap allocation in SPARK, then you get a garbage collector (unless you're talking really recent experimental extensions IIRC). If you want to forego the GC and remain memory-safe, then you also forego heap allocation. This might work for avionics code, but not for most apps.
Besides, the post you're replying to is talking about "momentum", and it's obvious in 2022 that Ada doesn't have the momentum that Rust does (however you define "momentum"). NVIDIA is not the entire industry.
Much of the rest of your post concerns video games, which are only a small portion of the total C++ code in existence. (And in any case it's not accurate to say that languages are "meaningless" unless the platform vendor officially supports them—console vendors don't maintain C# VMs either and yet Unity titles work just fine.)
What garbage collector? Ada never had one, apart from the optional one in early standards, which was never implemented in any commercial compiler and was therefore removed in Ada 2012.
I wasn't the one asserting momentum, and can relate to plenty of other industries where Rust isn't even on the radar.
Going back to Ada example, Rust certainly doesn't have any momentum over Ada in high integrity computing.
Console vendors do happen to collaborate with Unity and make it first-party on their SDKs, so that's yet another point you're misinformed about.
WebRender is certainly "GPU related" and is shipping to millions of happy Firefox users.
And yes, LLVM is written in C++. So what? C++ compilers depend on C code in libc. Portions of libc are written in assembler. Some assembly instructions are decomposed into microcode. Yet nobody doubts that C++ has eclipsed assembly language in terms of importance to the industry nowadays. We'll always need a way for humans to read the actual instructions that the silicon interprets, but relatively few people need to be able to do that nowadays. That dynamic is what the parent post means by one language "eclipsing" another.
I started in C and generally when people say 'rewrite it in Rust' I just roll my eyes, because I know how hard that is. But seeing it happen on a sophisticated project has made me take another look.
Obviously for the embedded world everything is pitched at C currently and I don't think that will change, but for larger projects this is proof that my intuition was wrong.
I suppose that's a long winded way of saying that it might be time for me to learn Rust.
Yes, but YJIT in Rust shows the same ~33.4% speedup over vanilla CRuby as YJIT in C. The rewrite into Rust is expected to make YJIT easier to maintain, and that may in turn enable further improvements to code generation, but for now it generates the same machine code (and therefore the same speedup) as before.
AFAIK, the Rust YJIT doesn't change anything (they explicitly say that the generated code is approximately the same), so no significant difference in performance should be expected.
What about pointer size? Is it still only 32 bits? It is impossible to write some C programs when you do not know the size of your data and pointers.
I'm asking because, despite all the hype, wasm binaries are still 32-bit (at least in browsers), while 32-bit OSes are being deprecated. 64-bit pointers are really good for some applications, with no performance degradation at all.
Does Wasm 2.0 support the branch instructions needed for irreducible control flow (without the relooper hack)?
> 64-bit pointers are really good for some applications, with no performance degradation at all.
That probably won't be the case with wasm64 unfortunately. A hidden advantage of wasm using a 32 bit address space while primarily targeting 64 bit platforms is that runtimes can get free implicit bounds checking by carving out a 4GB block of virtual memory for each wasm instance, so out-of-bounds accesses are trapped by the hardware.
That trick won't work for wasm64, so runtimes will have to go back to inserting explicit bounds checks at every memory access, and it'll be a tradeoff where in exchange for being able to access more memory everything will run a bit slower.
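To make that tradeoff concrete, here is a minimal Rust sketch (the `Memory32`/`Memory64` types are made up for illustration, not taken from any real engine) contrasting the two load paths: with 32-bit indices the 4GB guard-region trick lets the fast path skip the check entirely, while 64-bit indices force an explicit per-access bounds check.

```rust
// Toy stand-ins for a Wasm runtime's linear memory. In a real engine the
// wasm32 case reserves a full 4 GiB (plus guard pages) of virtual memory,
// so the MMU traps out-of-bounds accesses with no per-access check.
struct Memory32 {
    bytes: Vec<u8>, // pretend this is the 4 GiB + guard reservation
}

impl Memory32 {
    fn load_u8(&self, index: u32) -> u8 {
        // base + u32 index can never step past base + 4 GiB, so JITed code
        // emits no bounds check; an out-of-bounds index would fault on an
        // unmapped guard page. (Here the Vec indexing panics instead.)
        self.bytes[index as usize]
    }
}

struct Memory64 {
    bytes: Vec<u8>,
}

impl Memory64 {
    fn load_u8(&self, index: u64) -> Option<u8> {
        // A u64 index can exceed any practical reservation, so every access
        // needs an explicit check; this is the wasm64 overhead mentioned above.
        if index < self.bytes.len() as u64 {
            Some(self.bytes[index as usize])
        } else {
            None // trap
        }
    }
}

fn main() {
    let m32 = Memory32 { bytes: vec![1, 2, 3] };
    let m64 = Memory64 { bytes: vec![1, 2, 3] };
    println!("{} {:?}", m32.load_u8(2), m64.load_u8(5)); // prints "3 None"
}
```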
Having 32-bit pointers in WASM does not mean that the JIT emits 32-bit code. 64-bit pointers are always a performance cost, because they take up more space; it's just that the cost can be negligible. 64-bit x86 being faster most of the time has nothing to do with pointer width; it has more to do with having more registers, a better calling convention, and a guaranteed minimum of SSE2.
This is also the reason why the x32 ABI is faster than regular x86-64.
The only advantage of 64-bit pointers is that you can have an address space bigger than 4GB.
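As a toy illustration of the space argument (the node types below are invented, not from any particular codebase): linking a structure through 32-bit arena indices instead of native pointers halves the per-link footprint on a 64-bit target, which is essentially the saving the x32 ABI and wasm32 exploit.

```rust
use std::mem::size_of;

// A binary-tree node linked with real pointers...
struct PtrNode {
    value: u32,
    left: Option<Box<PtrNode>>,  // 8 bytes on a 64-bit target
    right: Option<Box<PtrNode>>, // 8 bytes
}

// ...versus one linked with 32-bit indices into a Vec-based arena
// (u32::MAX standing in for "no child").
struct IdxNode {
    value: u32,
    left: u32,
    right: u32,
}

fn main() {
    // On a typical 64-bit target this prints 24 vs 12 bytes per node,
    // so twice as many index-linked nodes fit in each cache line.
    println!("PtrNode: {} bytes", size_of::<PtrNode>());
    println!("IdxNode: {} bytes", size_of::<IdxNode>());
}
```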
The requirement for structured programs dates back to the asm.js hack. Both the relooper and the stackifier are still workarounds; CPUs do not require structured programs at all. Unless there is a really good reason, this still looks like a hack driven by limitations of the underlying code generation. I understand that some compilation passes are easier/cheaper with structured control flow. But then... why aren't LLVM and GCC enforcing it as well (at least on their roadmaps)?
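For what it's worth, here is a rough Rust sketch of the kind of structured code a relooper/stackifier-style pass ends up producing when the original control-flow graph is irreducible (the blocks and transitions are invented for illustration); the extra state variable and dispatch loop are exactly the overhead being complained about.

```rust
// Two entry points (A or B) into the same loop make the original CFG
// irreducible; rewriting it as a dispatch loop over explicit block labels
// makes it structured, at the cost of an extra state variable and branch.
#[derive(Clone, Copy)]
enum Block { A, B, C, Exit }

fn run(start_in_b: bool) -> u32 {
    let mut counter = 0u32;
    let mut next = if start_in_b { Block::B } else { Block::A };
    loop {
        match next {
            Block::A => { counter += 1; next = Block::B; }
            Block::B => { counter += 2; next = if counter < 10 { Block::A } else { Block::C }; }
            Block::C => { next = Block::Exit; }
            Block::Exit => return counter,
        }
    }
}

fn main() {
    println!("{} {}", run(false), run(true)); // prints "12 11"
}
```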
Not yet; a note on the Memory instructions [1] still says: “Future version of WebAssembly might provide memory instructions with 64 bit address ranges.”
Perhaps... but 5 years without progress in some areas is an eternity. Rust in Firefox was super interesting, and they ran out of funding. It is really hard to explain to a client: "your application would run on WASM, in a browser, with no installation, but it will not scale in time and memory, and there is no clear roadmap for when things will improve (even though the OS and compiler people know how to fix those things)".
Erlang has symbols because its strings are ridiculously expensive (and kinda shit), so while it does have immutable strings, identifying objects based on them would be ridiculously costly.
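For anyone unfamiliar with why symbols/atoms help, here is a generic string-interning sketch in Rust (the general technique, not Erlang's actual atom table): once every distinct name maps to a small integer, identity checks become a single integer comparison instead of walking two character sequences.

```rust
use std::collections::HashMap;

// Minimal interner: each distinct string gets a small integer "symbol".
#[derive(Default)]
struct Interner {
    ids: HashMap<String, u32>,
    names: Vec<String>, // symbol -> original text, for printing/debugging
}

impl Interner {
    fn intern(&mut self, name: &str) -> u32 {
        if let Some(&id) = self.ids.get(name) {
            return id;
        }
        let id = self.names.len() as u32;
        self.names.push(name.to_string());
        self.ids.insert(name.to_string(), id);
        id
    }
}

fn main() {
    let mut atoms = Interner::default();
    let a = atoms.intern("ok");
    let b = atoms.intern("ok");
    assert_eq!(a, b); // O(1) comparison, no string traversal
}
```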
I do not understand why people can be happy with the artificial software barriers. I'm really happy with my Apple M1 but I'd never use it if I could not run Emacs and free software on it.
Cars, dumb TVs, dishwashers, etc. aren't general purpose computer platforms. There's no market for running arbitrary software on them, nor do they contain a web-browser.
Smart TVs, game consoles, phones and tablets on the other hand already are general purpose computers. It is possible for someone to run whatever they want on them, be it by jumping through all the hoops or just by hosting a website. These artificial software barriers exist purely as anti-consumer rent extraction; they don't limit the functionality of the device beyond putting a price on participation.
The UI is really slow. I wish someone would take Qt or GTK backends for DOM+wasm seriously. More work should be delegated to the UI components. X-Windows was ugly... but there was a clean separation there that was useful.
Isn't Firebase its own suite of services within the greater Google cloud offerings these days? I remember it started off as just a kind of data storage but then added a ton more things for e.g. deploying applications.
> Right now, we are routinely creating neural networks that have a number of parameters more than the number of training samples. This says that the books have to be rewritten.
I also believe that this statement is weird. I have a very shallow knowledge of ML, but I can imagine that in a convolutional neural network a training sample interacts with lots of parameters. This 'one training sample <-> one parameter' correspondence seems wrong to me.
One CNN parameter does interact with (potentially) all of the training samples, but that doesn't tell you anything about the ratio of total training samples to total CNN parameters.
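To put rough, commonly cited numbers on it: ResNet-50 has about 25.6 million parameters, while the ImageNet-1k training set has about 1.28 million images, so parameters outnumber training samples by roughly 20x, even though every parameter ends up being updated using essentially every sample.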