While I think this is great, Ruby implementations are notorious for being tricky to implement 100%. Rewriting Ruby in Rust is great and all but even TruffleRuby isn't at the point where the authors recommend running a Rails app on it.
I wonder if there's a way to add some Rust into MRI. Perhaps someone could write a YARV VM or a version of the JIT in Rust. It'd complicate the build pipeline, but it'd be iterative and improve the main implementation of Ruby.
Ruby has a great black box testing suite called ruby/spec [0] that is shared among multiple Ruby implementations. Artichoke has a custom runner [1] to track progress on implementation completeness.
If Artichoke passes ruby/spec and is not compatible with MRI, that's a bug in the specs.
re: adding Rust to MRI, I'm working on extracting the mruby backend from the core VM infra + core/stdlib impls [3], which would let us use MRI as the backing VM via rutie. When the Artichoke core and stdlib are complete, this would mean that the MRI runtime could be implemented entirely in Rust [2].
OTOH, I'm sure there are such bugs where the specs don't cover every corner-case behavior. Specs are added when finding bugs in alternative implementations and following MRI's NEWS for new features, but there is no guarantee specs are complete.
Hey, I didn't mean to dismiss your work. MRI is frankly a lot of complicated, at times archaic C and I'd certainly take a nice pile of Rust instead :D. I'm mostly just concerned because as far as I know there's maybe 3-4 implementations of Ruby that are actively developed and a whole lot more abandoned ones ^[1]. Have you considered taking some of the pieces of Artichoke and finding a way to incorporate them into MRI?
So what is the current pass rate for language and core specs? I didn’t see a number in any of those links or did I miss it? Do I have to build and run it myself to find out?
If you can run mspec at all then that’s pretty good to be honest.
I can run MSpec to completion for language, core, and library specs. Thank you for the encouragement :D.
I've implemented a runner that can skip known things that cause the spec to hang (e.g. Mutex specs since Artichoke has a single-threaded implementation of Mutex that deadlocks during the specs).
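A minimal sketch of how a skip list like that might work, in Rust. The names `SKIPPED_SPECS`, `Tally`, and `run_specs` are hypothetical and not taken from Artichoke's spec runner, and the `exec` closure stands in for whatever actually executes a spec file.

```rust
use std::collections::HashSet;

/// Tally of outcomes across one run.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct Tally {
    pub passed: usize,
    pub failed: usize,
    pub skipped: usize,
}

/// Hypothetical list of spec paths known to hang (for illustration only).
pub const SKIPPED_SPECS: &[&str] = &["core/mutex/lock_spec.rb"];

/// Run every spec that is not on the skip list.
/// `exec` is an assumed callback that returns `true` when a spec passes.
pub fn run_specs<F>(specs: &[&str], mut exec: F) -> Tally
where
    F: FnMut(&str) -> bool,
{
    let skip: HashSet<&str> = SKIPPED_SPECS.iter().copied().collect();
    let mut tally = Tally::default();
    for &spec in specs {
        if skip.contains(spec) {
            // Never execute a spec that is known to hang; just count it.
            tally.skipped += 1;
            continue;
        }
        if exec(spec) {
            tally.passed += 1;
        } else {
            tally.failed += 1;
        }
    }
    tally
}
```

The point of the design is that a hanging spec is counted but never started, so the run can finish and still report an honest skipped total.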
$ pushd spec-runner/vendor/ruby/language
$ cargo run --bin spec-runner **/*.rb
Passed 1078, skipped 35, not implemented 2, failed 476 specs.
$ popd; pushd spec-runner/vendor/ruby/core
$ cargo run --bin spec-runner **/*.rb
Passed 5412, skipped 998, not implemented 201, failed 8629 specs.
$ popd; pushd spec-runner/vendor/ruby/library
$ cargo run --bin spec-runner **/*.rb
Passed 0, skipped 22, not implemented 16, failed 3276 specs.
The library specs pass number is a bug in my spec runner. All packages added to Artichoke pass ruby/spec. Those packages are: delegate, forwardable, json, monitor, ostruct, set, strscan, and uri.
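For anyone who wants a single pass-rate number out of the language and core counts above, here is a small sketch. It assumes "pass rate" means passed divided by all specs, with skipped and not-implemented specs counted in the denominator; other definitions are just as defensible. `SpecCounts` is a hypothetical type, and the library run is left out because its pass count is reported as a runner bug.

```rust
/// Counts reported by one spec-runner invocation.
pub struct SpecCounts {
    pub passed: u32,
    pub skipped: u32,
    pub not_implemented: u32,
    pub failed: u32,
}

impl SpecCounts {
    /// Every spec the runner reported on, whatever the outcome.
    pub fn total(&self) -> u32 {
        self.passed + self.skipped + self.not_implemented + self.failed
    }

    /// Passed specs as a percentage of the total (0.0 for an empty run).
    pub fn pass_rate(&self) -> f64 {
        let total = self.total();
        if total == 0 {
            return 0.0;
        }
        100.0 * f64::from(self.passed) / f64::from(total)
    }
}

/// Counts copied from the language run quoted above.
pub const LANGUAGE: SpecCounts = SpecCounts {
    passed: 1078,
    skipped: 35,
    not_implemented: 2,
    failed: 476,
};

/// Counts copied from the core run quoted above.
pub const CORE: SpecCounts = SpecCounts {
    passed: 5412,
    skipped: 998,
    not_implemented: 201,
    failed: 8629,
};
```

Under that definition the language specs come out at roughly 67.8% (1078 of 1591) and the core specs at roughly 35.5% (5412 of 15240).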
I'm building a tracing JIT for CRuby in Rust at the moment. I started out trying to build a method JIT but mixing type analysis and iterative inlining to deal with real idiomatic Ruby code gets complex very fast. With tracing, both follow naturally.
Is that something different from what vmakarov is doing [1]? The current MJIT work from k0kubun [2] seems to be based on this earlier work. And I believe vmakarov is working on RTL from YARV instructions, and a lightweight JIT [3].
If I remember correctly there was a tracing JIT for CRuby in 2016, but it was discontinued due to memory consumption.
Yup. Quite different. Similar to RuJIT, but I’m trying quite a different approach to trace recording and using Cranelift, an optimized code generation backend, rather than tinycc.
Light weight JIT only helps to remove the overhead of GCC etc. Not to be able to inline, constant fold etc. Something like the JRuby IR is required for that.
What is your plan for C extensions? They are heavily used in Ruby and tend to create a tricky barrier to optimisation. The fact the standard library is mostly written in C doesn’t help either.
I re-implemented a single method on Array a few years back, it is absolutely feasible in a technical sense. These kinds of things are always more complicated than “is it technically possible,” of course.
I love seeing more Ruby implementations! Looking forward to seeing how it will progress. Are there any details on how it’s implemented? How does it execute Ruby? Does it have its own bytecode? Does it compile to machine code on the fly? What will make it faster than MRI?
Growing pains of blowing up on HN lol, I do not have this written down yet, but I have a ticket to start documenting this [0].
The idea is to have a core set of traits [1] that when implemented allow an implementation to load an interpreter agnostic Ruby core and Ruby Standard Library.
There is currently only one interpreter that implements these traits, and it is backed by mruby. My ultimate goal is to move off of mruby to either an MRI-backed interpreter via rutie or a native Rust-implemented backend + VM + parser.
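To make the trait idea concrete, here is a toy sketch in Rust. None of these names come from Artichoke: `Interpreter`, `string_reverse`, and `ToyBackend` are all hypothetical, and the real trait set is certainly larger. It only shows "core" code written once against a trait, with a backend plugged in underneath.

```rust
/// Hypothetical trait: the minimum a backend must provide for this sketch.
pub trait Interpreter {
    /// Opaque handle to a value living inside the backend.
    type Value: Clone;

    /// Allocate a new string value in the backend.
    fn new_string(&mut self, s: &str) -> Self::Value;

    /// Read a string value back out, or `None` if the handle is not a string.
    fn string_contents(&self, v: &Self::Value) -> Option<String>;
}

/// "Core" code written once against the trait, loosely like String#reverse.
/// It knows nothing about which VM is underneath.
pub fn string_reverse<I: Interpreter>(interp: &mut I, v: &I::Value) -> Option<I::Value> {
    let s = interp.string_contents(v)?;
    let reversed: String = s.chars().rev().collect();
    Some(interp.new_string(&reversed))
}

/// Toy backend that stores strings in a Vec and uses indices as handles.
/// It stands in for an mruby-backed (or MRI-backed) implementation.
#[derive(Default)]
pub struct ToyBackend {
    heap: Vec<String>,
}

impl Interpreter for ToyBackend {
    type Value = usize;

    fn new_string(&mut self, s: &str) -> usize {
        self.heap.push(s.to_string());
        self.heap.len() - 1
    }

    fn string_contents(&self, v: &usize) -> Option<String> {
        self.heap.get(*v).cloned()
    }
}
```

Swapping the backend then means writing another `impl Interpreter for ...`, while `string_reverse` and everything else built on the trait stays unchanged.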
For an example of how Artichoke can be faster than MRI, I'm currently working on extending the oniguruma-based `Regexp` implementation to have a fast path backed by the Rust regex crate in some circumstances. In testing, this can speed up `String#scan` by a factor of 10 for some `Regexp`s.
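A rough sketch of what choosing between the two engines could look like. This is not Artichoke's actual logic: `Engine` and `choose_engine` are hypothetical names, and the check only covers look-around and backreferences, two features the Rust regex crate does not support. It is deliberately conservative, so some patterns that could take the fast path are sent to the fallback anyway.

```rust
/// Which engine a pattern should be compiled with.
#[derive(Debug, PartialEq, Eq)]
pub enum Engine {
    /// Fast path: the Rust regex crate.
    RustRegex,
    /// Fallback: the oniguruma-based implementation.
    Oniguruma,
}

/// Conservatively pick an engine from the pattern source.
/// The list of unsupported constructs here is an assumption and incomplete.
pub fn choose_engine(pattern: &str) -> Engine {
    // Look-ahead and look-behind assertions.
    const LOOKAROUND: [&str; 4] = ["(?=", "(?!", "(?<=", "(?<!"];
    if LOOKAROUND.iter().any(|tok| pattern.contains(*tok)) {
        return Engine::Oniguruma;
    }
    // Numbered (\1..\9) and named (\k<name>) backreferences. An escaped
    // backslash followed by a digit also matches this check, which only
    // costs a missed fast path, never a wrong result.
    for pair in pattern.as_bytes().windows(2) {
        let is_backref = matches!(pair[1], b'1'..=b'9') || pair[1] == b'k';
        if pair[0] == b'\\' && is_backref {
            return Engine::Oniguruma;
        }
    }
    Engine::RustRegex
}
```

Another option would be to skip the text scan and simply try to compile with the fast engine first, falling back when compilation fails, though that alone would not catch constructs both engines accept but interpret differently.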
Without a GIL it would be incredibly difficult, would it not? AFAIK C extensions to MRI are not thread safe.
This could, of course, expose a different C API that allows for thread safe extensions, but I think the original comment is complaining about lack of compatibility with all of the important gems that depend on the existing MRI C API.
This has always been an issue that held back JRuby.
Hi, I'm the author of Artichoke. I don't have a good answer for whether or not implementing the MRI API would be difficult. I _suspected_ it would be, which is why I listed it as a non-goal. There is a lot of other work to do before we get to the point of C API compatibility.
You might want to clarify in the readme that the red X means non-goal. Until I read this comment I thought it was a goal that had not yet been implemented.
In the 14 hours since I wrote the comment you replied to, I moved the MRI C API from non-goal to goal [0]. :) The red X means the goal is not yet achieved.
Could one add, effectively, a RWLock GIL - one where any number of Ruby interpreter threads can run at once (since they know how to not step on each other's toes), but any C code which tried to "grab the GIL" would run exclusively w.r.t. interpreter threads (and other C code)?
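As a sketch of that idea using Rust's standard `RwLock`: interpreter threads take the lock in shared (read) mode, and anything that "grabs the GIL" takes it in exclusive (write) mode. `RwGil` and `max_overlapping_c_calls` are hypothetical names, and this ignores the hard parts (re-entrancy, a Ruby thread calling into C while already holding the shared lock, writer starvation).

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, RwLock};
use std::thread;

/// A GIL where Ruby code shares the lock and C extension code owns it.
pub struct RwGil {
    lock: RwLock<()>,
}

impl RwGil {
    pub fn new() -> Self {
        Self { lock: RwLock::new(()) }
    }

    /// Interpreter work: any number of threads may be in here at once.
    pub fn run_ruby<T>(&self, f: impl FnOnce() -> T) -> T {
        let _shared = self.lock.read().unwrap();
        f()
    }

    /// C extension work: excludes interpreter threads and other C code.
    pub fn run_c_extension<T>(&self, f: impl FnOnce() -> T) -> T {
        let _exclusive = self.lock.write().unwrap();
        f()
    }
}

/// Demo: several threads alternate Ruby and C work; returns the largest
/// number of C calls ever observed running at the same time.
pub fn max_overlapping_c_calls(threads: usize) -> usize {
    let gil = Arc::new(RwGil::new());
    let in_flight = Arc::new(AtomicUsize::new(0));
    let max_seen = Arc::new(AtomicUsize::new(0));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let gil = Arc::clone(&gil);
            let in_flight = Arc::clone(&in_flight);
            let max_seen = Arc::clone(&max_seen);
            thread::spawn(move || {
                for _ in 0..100 {
                    gil.run_ruby(|| { /* interpreter work would go here */ });
                    gil.run_c_extension(|| {
                        let now = in_flight.fetch_add(1, Ordering::SeqCst) + 1;
                        max_seen.fetch_max(now, Ordering::SeqCst);
                        in_flight.fetch_sub(1, Ordering::SeqCst);
                    });
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    max_seen.load(Ordering::SeqCst)
}
```

In this sketch the Ruby and C sections are separate calls, so a thread never holds the shared lock while asking for the exclusive one; a real VM would have to release and reacquire around every native call.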
Something similar to Racket places or V8 isolates can be implemented and keep compatibility with C extensions expecting GIL. It just needs somebody to actually do the work of swapping out hundreds of global variable accesses in MRI using something like CIL.
Do you mean every thread would have its own place/isolate for executing C extensions?
I think one difficulty there is some global state in C extensions might expect to be truly process-global.
Also, how would you isolate global variables in the C extension? If the C extension is a dynamic library, it's typically only loaded once per process.
Maybe something like dlmopen() to load multiple copies of a native library would help, but it's not portable.
Yeah, this is a real problem. The PG gem, for example, uses some global variables.
IMHO the only way to do it is to add incremental parallelism which leaves the GIL in place. Racket has already shown a solid path here.
Guilds would have a major performance problem: can't allocate objects without GIL. It's also a tricky mental model and requires invasive changes to existing Ruby code to handle frozen objects.
Places don't share a heap so they don't need the GIL to allocate objects and have independent GC rather than global GC. It's a model which fits better with existing Ruby code. The GVL can be relaxed while executing Ruby and grabbed by native methods.
> Guilds would have a major performance problem: can't allocate objects without GIL.
Why not? It'd be possible to have TLABs, wouldn't it? But yes, GC would still be for all Guilds at once.
Racket places don't allow sharing objects, only arrays of primitive types, which seems very restrictive. And Racket futures are even more restricted.
I'm not sure about Guilds since they aren't there yet, but they already sound closer to Ruby multithreading and more flexible to me from a usage point of view.
TLABs are definitely possible, but they increase the implementation complexity.
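For readers who haven't met TLABs (thread-local allocation buffers), here is a toy sketch. Each thread bump-allocates from a private chunk and only takes the global lock when the chunk runs out. `SharedHeap`, `Tlab`, and `CHUNK` are hypothetical names, addresses are plain offsets rather than real memory, and the GC side (the part that adds the complexity) is not modelled at all.

```rust
use std::sync::{Arc, Mutex};

/// Size of the chunk handed to a thread on each refill (arbitrary here).
pub const CHUNK: usize = 1024;

/// Heap shared by all threads. The Mutex plays the role of the global lock.
pub struct SharedHeap {
    next_free: Mutex<usize>,
    capacity: usize,
}

impl SharedHeap {
    pub fn new(capacity: usize) -> Self {
        Self { next_free: Mutex::new(0), capacity }
    }

    /// Reserve one chunk. This is the only operation that takes the lock.
    fn reserve_chunk(&self) -> Option<(usize, usize)> {
        let mut next = self.next_free.lock().unwrap();
        if *next + CHUNK > self.capacity {
            return None; // heap exhausted; a real VM would trigger GC here
        }
        let start = *next;
        *next += CHUNK;
        Some((start, start + CHUNK))
    }
}

/// Thread-local allocation buffer: owned by exactly one thread.
pub struct Tlab {
    heap: Arc<SharedHeap>,
    cursor: usize,
    end: usize,
    /// How many times this buffer had to take the global lock.
    pub refills: usize,
}

impl Tlab {
    pub fn new(heap: Arc<SharedHeap>) -> Self {
        Self { heap, cursor: 0, end: 0, refills: 0 }
    }

    /// Bump-allocate `size` bytes and return the offset of the new object.
    pub fn alloc(&mut self, size: usize) -> Option<usize> {
        if size > CHUNK {
            return None; // large objects would need a separate path
        }
        if self.cursor + size > self.end {
            // Slow path: the current chunk is full, so take the lock once.
            let (start, end) = self.heap.reserve_chunk()?;
            self.cursor = start;
            self.end = end;
            self.refills += 1;
        }
        // Fast path: no locking, just move the cursor.
        let addr = self.cursor;
        self.cursor += size;
        Some(addr)
    }
}
```

With 16-byte objects and a 1024-byte chunk, the lock is taken once per 64 allocations instead of once per allocation.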
Places are deliberately shared-nothing to reduce the implementation complexity and provide a simple model to the user. Process-oriented code can be easily ported.
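The shared-nothing model can be sketched with plain threads and channels. This is an analogy in Rust, not how Racket implements places: `spawn_place` and `demo` are hypothetical names, and the "heap" is just a `Vec` owned by one thread. Data only enters a place by being sent over the channel, so nothing is reachable from two places at once.

```rust
use std::sync::mpsc;
use std::thread;

/// Spawn a "place": a thread with its own private heap that only receives
/// data through its inbox. Returns the total bytes it ended up holding.
pub fn spawn_place(inbox: mpsc::Receiver<String>) -> thread::JoinHandle<usize> {
    thread::spawn(move || {
        // Private to this place; no other thread can reach it, so it
        // could be allocated into and collected without a global lock.
        let mut heap: Vec<String> = Vec::new();
        for msg in inbox {
            heap.push(msg);
        }
        heap.iter().map(|s| s.len()).sum()
    })
}

/// Send two messages to a place and wait for it to finish.
pub fn demo() -> usize {
    let (tx, rx) = mpsc::channel();
    let place = spawn_place(rx);

    let job = String::from("job-1");
    // The place gets its own copy; the sender keeps the original.
    tx.send(job.clone()).unwrap();
    tx.send(String::from("job-22")).unwrap();

    // Closing the channel lets the place's receive loop end.
    drop(tx);
    place.join().unwrap()
}
```

This is also why process-oriented code ports easily: a worker that already talks to its parent over a pipe or socket maps directly onto a place with an inbox.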
Porting something like a Sidekiq based worker using ActiveRecord to work inside a Guild where all access to shared objects needs to be frozen would be a nightmare.
With Racket Places you can share more complex types but it needs to be done via FFI. It doesn't prevent you from implementing a sharable connection pool or anything.
The proposed API, MRB_API, is used in mruby. With enough popular support for that C API, it could force mainline Ruby to modernize its C API to that one.
Assuming MRB_API can be made thread-safe. mruby is not thread-safe at all, being a single-threaded script runner.
My expectation is that MRB_API can be implemented in a thread-safe way. The API manipulates the interpreter state (in the case of mruby, an `mrb_state`). The interpreter state itself and the implementation of the API can be thread safe or they can not be.
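A small sketch of that distinction, in Rust rather than C. The API surface stays the same (functions that take an interpreter handle), and thread safety is purely a property of how the state behind the handle is guarded. `State`, `Interp`, and the `gv_*` methods are hypothetical stand-ins, not mruby's actual API.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// Stand-in for interpreter state such as mruby's `mrb_state`.
#[derive(Default)]
pub struct State {
    globals: HashMap<String, i64>,
}

/// Handle passed to every API function. Cloning it shares the same state.
#[derive(Clone, Default)]
pub struct Interp {
    state: Arc<Mutex<State>>,
}

impl Interp {
    /// Set a global variable.
    pub fn gv_set(&self, name: &str, value: i64) {
        let mut state = self.state.lock().unwrap();
        state.globals.insert(name.to_string(), value);
    }

    /// Read a global variable.
    pub fn gv_get(&self, name: &str) -> Option<i64> {
        let state = self.state.lock().unwrap();
        state.globals.get(name).copied()
    }

    /// Add to a global variable, creating it at 0 if missing.
    /// The read-modify-write happens under one lock acquisition.
    pub fn gv_add(&self, name: &str, delta: i64) {
        let mut state = self.state.lock().unwrap();
        *state.globals.entry(name.to_string()).or_insert(0) += delta;
    }
}

/// Demo: many threads bump the same global through the same API.
pub fn concurrent_increments(threads: usize, per_thread: usize) -> i64 {
    let interp = Interp::default();
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let interp = interp.clone();
            thread::spawn(move || {
                for _ in 0..per_thread {
                    interp.gv_add("$counter", 1);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    interp.gv_get("$counter").unwrap_or(0)
}
```

A single-threaded implementation could keep the same method signatures and drop the `Mutex`, which is the sense in which the API itself is neutral about thread safety.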
I don't think mruby even assumes the presence of threads on target systems, so it is completely reasonable that the implementation is not thread safe.
It also lacks compatibility with a lot of important C gems (not only or even predominantly because of the lack of a GIL, of course), which I’m pointing out has always impacted its adoption.
That’s Truffle, not JRuby. There was a plan several years ago to move Sulong support into JRuby based on trialing it in Truffle; it appears to have made no progress since 2016.
My personal experience with Ruby is extremely shallow, but a casual glance at the search results wrt the GIL in Ruby seems to say it actually does have one. I could be wrong, as I'm not an experienced Ruby developer, however.
Once this language is more mature, comparing it with Crystal would be great. Artichoke will have the added benefit of having Ruby's entire ecosystem at its disposal.
I wonder how much this would change the performance of Ruby... that's all kind of cool except no C extensions, which basically makes all the gems I use unusable.
They're going to have to spend an immense amount of time implementing all the performance optimizations present in the MRI/YARV bytecode engine before they achieve performance parity.
The amount of engineering effort needed to remove the GIL and simultaneously make C extensions work is probably not worth it compared to simply choosing a multi-process solution with shared memory for true parallelism.
I wouldn't say all gems are unusable. I implemented a Rack-compatible application server in an early version of Artichoke that is capable of serving a Sinatra Rack app.
It's a parallel world in which developers either use pure Ruby gems or Java jars. For example, the database drivers are JDBC adapters straight from the Java world not the usual mysql2 or pg gems. Obviously the programs that use them are not portable unless there is a compatibility layer. In that example, ActiveRecord abstracts the drivers.
I don't know anything about Rust but maybe it could end up in the same way.
I actually like the idea, though being Rust-illiterate I have no way of reading it. I like the tone of it: "No GIL". I hope it does come to fruition. But I would love Ruby programs to eventually be able to produce machine code, like tons of Lisp and Scheme implementations do. Even Smalltalk did. I hope with "No GIL" that should be somewhere in the pipeline.
It doesn't get closer to a Ruby made with Rust than that...