Artichoke – A Ruby made with Rust (github.com/artichoke)
234 points by ashton314 on Aug 4, 2019 | 53 comments



This is a great idea but why would it be called anything other than Jasper: https://en.wikipedia.org/wiki/Jasper

It doesn't get closer to a Ruby made with Rust than that...


That would literally make my year, as a Jasper.


While I think this is great, Ruby implementations are notorious for being tricky to implement 100%. Rewriting Ruby in Rust is great and all but even TruffleRuby isn't at the point where the authors recommend running a Rails app on it.

I wonder if there's a way to add some Rust into MRI. Perhaps someone could write a YARV VM or a version of the JIT in Rust. It'd complicate the build pipeline, but it'd be iterative and improve the main implementation of Ruby.


Hi, I'm the author of Artichoke.

Ruby has a great black box testing suite called ruby/spec [0] that is shared among multiple Ruby implementations. Artichoke has a custom runner [1] to track progress on implementation completeness.

If Artichoke passes ruby/spec and is not compatible with MRI, that's a bug in the specs.

re: adding Rust to MRI, I'm working on extracting the mruby backend from the core VM infra + core/stdlib impls [3] which would let us use MRI as the backing VM via rutie. When the Artichoke core and stdlib are complete, this would mean that the MRI runtime could be implemented entirely in Rust [2].

[0] https://github.com/ruby/spec
[1] https://github.com/artichoke/artichoke/tree/master/spec-runn...
[2] https://github.com/artichoke/artichoke/issues/92
[3] https://artichoke.github.io/artichoke/artichoke_core/


Thanks for the praise on ruby/spec :)

OTOH, I'm sure there are such bugs where the specs don't cover every corner-case behavior. Specs are added when finding bugs in alternative implementations and following MRI's NEWS for new features, but there is no guarantee specs are complete.


Hi eregon, I've submitted a fix to ruby/spec already! \o/ (I'm @lopopolo on GitHub.)

One spec I know is missing is how capture groups should not be expanded into globals on `String#=~`. I'll file an issue [0].

[0] https://github.com/ruby/spec/issues/676


Hey, I didn't mean to dismiss your work. MRI is frankly a lot of complicated, at times archaic C and I'd certainly take a nice pile of Rust instead :D. I'm mostly just concerned because as far as I know there's maybe 3-4 implementations of Ruby that are actively developed and a whole lot more abandoned ones ^[1]. Have you considered taking some of the pieces of Artichoke and finding a way to incorporate them into MRI?

^[1] https://github.com/codicoscepticos/ruby-implementations


So what is the current pass rate for language and core specs? I didn’t see a number in any of those links or did I miss it? Do I have to build and run it myself to find out?

If you can run mspec at all then that’s pretty good to be honest.


I can run MSpec to completion for language, core, and library specs. Thank you for the encouragement :D.

I've implemented a runner that can skip known things that cause the spec to hang (e.g. Mutex specs since Artichoke has a single-threaded implementation of Mutex that deadlocks during the specs).

    $ pushd spec-runner/vendor/ruby/language
    $ cargo run --bin spec-runner **/*.rb
    Passed 1078, skipped 35, not implemented 2, failed 476 specs.
    $ popd; pushd spec-runner/vendor/ruby/core
    $ cargo run --bin spec-runner **/*.rb
    Passed 5412, skipped 998, not implemented 201, failed 8629 specs.
    $ popd; pushd spec-runner/vendor/ruby/library
    $ cargo run --bin spec-runner **/*.rb
    Passed 0, skipped 22, not implemented 16, failed 3276 specs.

The library specs pass number is a bug in my spec runner. All packages added to Artichoke pass ruby/spec. Those packages are: delegate, forwardable, json, monitor, ostruct, set, strscan, and uri.
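
As a rough illustration of the pass rate asked about upthread, the language and core counts above can be turned into percentages. This is only a sketch: the `SpecRun` type is hypothetical (it is not part of Artichoke's spec runner), and counting skipped and not-implemented specs in the denominator is an assumption. Under that assumption the numbers work out to roughly 68% for language and 36% for core.

```rust
/// Counts reported by one spec-runner invocation (hypothetical type).
struct SpecRun {
    passed: u32,
    skipped: u32,
    not_implemented: u32,
    failed: u32,
}

impl SpecRun {
    /// Total number of specs the runner reported on.
    fn total(&self) -> u32 {
        self.passed + self.skipped + self.not_implemented + self.failed
    }

    /// Share of all reported specs that passed, as a percentage.
    /// Assumption: skipped and not-implemented specs count against the rate.
    fn pass_rate(&self) -> f64 {
        if self.total() == 0 {
            return 0.0;
        }
        100.0 * f64::from(self.passed) / f64::from(self.total())
    }
}

fn main() {
    // Counts copied from the spec-runner output quoted in this comment.
    let runs = [
        ("language", SpecRun { passed: 1078, skipped: 35, not_implemented: 2, failed: 476 }),
        ("core", SpecRun { passed: 5412, skipped: 998, not_implemented: 201, failed: 8629 }),
    ];
    for (name, run) in &runs {
        println!("{}: {:.1}% of {} specs", name, run.pass_rate(), run.total());
    }
}
```

The library run is left out because its pass count is described above as a runner bug.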


I can generate a set of pass rates during build and output it to the gh-pages branch. That's a good idea. Thank you.

https://github.com/artichoke/artichoke/issues/101


> version of the JIT in Rust

I'm building a tracing JIT for CRuby in Rust at the moment. I started out trying to build a method JIT but mixing type analysis and iterative inlining to deal with real idiomatic Ruby code gets complex very fast. With tracing, both follow naturally.


Is that something different from what vmakarov is doing [1]? The current MJIT work from k0kubun [2] seems to be based on that earlier work. And I believe vmakarov is working on RTL from YARV instructions, and a lightweight JIT [3].

If I remember correctly there was a tracing JIT for CRuby in 2016, but it was discontinued due to memory consumption.

[1] https://bugs.ruby-lang.org/issues/12589
[2] https://medium.com/@k0kubun/the-method-jit-compiler-for-ruby...
[3] https://www.youtube.com/watch?v=emhYoI_RiOA


Yup. Quite different. Similar to RuJIT, but I’m trying quite a different approach to trace recording and using Cranelift, an optimized code generation backend, rather than tinycc.

A lightweight JIT only helps to remove the overhead of GCC etc. It does not make it possible to inline, constant fold, etc. Something like the JRuby IR is required for that.


What is your plan for C extensions? They are heavily used in Ruby and tend to create a tricky barrier to optimisation. The fact the standard library is mostly written in C doesn’t help either.


Same thing LuaJIT does. Implement the really performance critical functions in IR and emit direct calls to C implementations of the complex ones.

In LuaJIT the FFI specifies enough type information to be able to call native functions from a trace.

I’m targeting optimizing through ActiveSupport's `String#blank?` now that I’ve got some micro benchmarks running.


I re-implemented a single method on Array a few years back; it is absolutely feasible in a technical sense. These kinds of things are always more complicated than “is it technically possible,” of course.


> I wonder if there's a way to add some Rust into MRI

Aaron Patterson spoke to this in a recent interview. In fact I think he specifically mentioned using Rust as a possibility for a JIT.


A little oxidation and this would make the perfect logo:

https://www.amazon.com/dp/B00IK9F9F4


I love seeing more Ruby implementations! Looking forward to seeing how it will progress. Are there any details on how it’s implemented? How does it execute Ruby? Does it have its own bytecode? Does it compile to machine code on the fly? What will make it faster than MRI?


Thank you for your kind words.

Growing pains of blowing up on HN lol, I do not have this written down yet, but I have a ticket to start documenting this [0].

The idea is to have a core set of traits [1] that, when implemented, allow an implementation to load an interpreter-agnostic Ruby core and Ruby Standard Library.

There is currently only one interpreter that implements these traits, and it is backed by mruby. My ultimate goal is to move off of mruby to either an MRI-backed interpreter via rutie or a native Rust-implemented backend + VM + parser.
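
To make the trait idea a bit more concrete, here is a minimal sketch. Everything in it is hypothetical: the `Interpreter` trait, `load_core`, and `ToyBackend` are illustration names and are not the actual `artichoke_core` API. The point is only that the core library talks to the trait, so the backend behind it can be swapped.

```rust
use std::collections::HashMap;

/// Hypothetical trait: the minimum a backend must offer so that a shared
/// core library can be loaded into it. Not the real artichoke_core API.
trait Interpreter {
    /// Register a core method under `Class#method` notation.
    fn define_method(&mut self, name: &str, body: fn(&str) -> String);
    /// Call a previously registered method, if it exists.
    fn call(&self, name: &str, arg: &str) -> Option<String>;
}

fn upcase(s: &str) -> String {
    s.to_uppercase()
}

fn reverse(s: &str) -> String {
    s.chars().rev().collect()
}

/// Backend-agnostic "core library": it only uses the trait, never a
/// concrete interpreter type.
fn load_core<I: Interpreter>(interp: &mut I) {
    interp.define_method("String#upcase", upcase);
    interp.define_method("String#reverse", reverse);
}

/// Toy backend standing in for an mruby-backed (or any other) interpreter.
#[derive(Default)]
struct ToyBackend {
    methods: HashMap<String, fn(&str) -> String>,
}

impl Interpreter for ToyBackend {
    fn define_method(&mut self, name: &str, body: fn(&str) -> String) {
        self.methods.insert(name.to_string(), body);
    }

    fn call(&self, name: &str, arg: &str) -> Option<String> {
        self.methods.get(name).map(|f| f(arg))
    }
}

fn main() {
    let mut backend = ToyBackend::default();
    load_core(&mut backend);
    println!("{:?}", backend.call("String#upcase", "artichoke"));
}
```

A second backend would only need its own `impl Interpreter` block; `load_core` would stay unchanged.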

[0] https://github.com/artichoke/artichoke/issues/102
[1] https://artichoke.github.io/artichoke/artichoke_core/


For an example of how Artichoke can be faster than MRI, I'm currently working on extending the oniguruma-based `Regexp` implementation to have a fast-path backed by the Rust regex crate in some circumstances. In testing this can speed up `String#scan` by a factor of 10 for some `Regexp`s.


No C extensions? So most popular gems won’t work? Does that make this essentially an intellectual pursuit without practical application?


Hi I'm the author of Artichoke.

Thanks for your interest in the project. You can track progress on this issue on GitHub [0]. Would love your help on this if you could lend it. :)

[0] https://github.com/artichoke/artichoke/issues/99


No reason it can't be added in the future.


Without a GIL it would be incredibly difficult, would it not? AFAIK C extensions to MRI are not thread safe.

This could, of course, expose a different C API that allows for thread safe extensions, but I think the original comment is complaining about lack of compatibility with all of the important gems that depend on the existing MRI C API.

This has always been an issue that held back JRuby.


Hi, I'm the author of Artichoke. I don't have a good answer for whether or not implementing the MRI API would be difficult. I _suspected_ it would be, which is why I listed it as a non-goal. There is a lot of other work to do before we get to the point of C API compatibility.

However, it may be possible to use `rubysys` in rutie as a base for an MRI compatible API: https://github.com/artichoke/artichoke/issues/88


You might want to clarify in the readme that the red X means non-goal. Until I read this comment I thought it was a goal that had not yet been implemented.


In the 14 hours since I wrote the comment you replied to, I moved an MRI C API from non-goal to goal [0]. :) The red X means the goal is not yet achieved.

[0] https://github.com/artichoke/artichoke/pull/108


Could one add, effectively, a RWLock GIL - one where any number of Ruby interpreter threads can run at once (since they know how to not step on each other's toes), but any C code which tried to "grab the GIL" would run exclusively w.r.t. interpreter threads (and other C code)?
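
A minimal sketch of that idea using Rust's standard `RwLock`, under loud assumptions: the names `Gil`, `run_ruby_code`, and `call_c_extension` are hypothetical, and the guarded counter merely stands in for state that C extensions would mutate without locking of their own. Interpreter threads take the lock in shared mode; anything that "grabs the GIL" takes it exclusively.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical "RWLock GIL". The u64 stands in for process-global state
/// that legacy C extensions assume nobody else touches concurrently.
type Gil = Arc<RwLock<u64>>;

/// Interpreter threads hold the lock in shared (read) mode, so any number
/// of them can run at the same time.
fn run_ruby_code(gil: &Gil) -> u64 {
    let guard = gil.read().unwrap();
    *guard
}

/// C-extension calls hold the lock in exclusive (write) mode, so they run
/// alone with respect to interpreter threads and other C code.
fn call_c_extension(gil: &Gil) {
    let mut guard = gil.write().unwrap();
    *guard += 1;
}

/// Spawn `ext_calls` extension calls interleaved with interpreter threads
/// and return the final value of the guarded state.
fn run_demo(ext_calls: u64) -> u64 {
    let gil: Gil = Arc::new(RwLock::new(0));
    let mut handles = Vec::new();
    for _ in 0..ext_calls {
        let g = Arc::clone(&gil);
        handles.push(thread::spawn(move || call_c_extension(&g)));
        let g = Arc::clone(&gil);
        handles.push(thread::spawn(move || {
            run_ruby_code(&g);
        }));
    }
    for h in handles {
        h.join().unwrap();
    }
    let total = *gil.read().unwrap();
    total
}

fn main() {
    println!("extension calls observed: {}", run_demo(8));
}
```

One caveat with this sketch: the standard library leaves the reader/writer priority policy to the underlying OS, so a steady stream of interpreter threads could delay extension calls.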


I believe the "guild" proposal makes it such that the global lock becomes guild-specific, which seems similar to what you propose.


Something similar to Racket places or V8 isolates can be implemented and keep compatibility with C extensions expecting GIL. It just needs somebody to actually do the work of swapping out hundreds of global variable accesses in MRI using something like CIL.


Do you mean every thread would have its own place/isolate for executing C extensions?

I think one difficulty there is some global state in C extensions might expect to be truly process-global.

Also, how would you isolate global variables in the C extension? If the C extension is a dynamic library, it's typically only loaded once per process. Maybe something like dlmopen() to load multiple copies of a native library would help, but it's not portable.


Yeah, this is a real problem. The PG gem, for example, uses some global variables.

IMHO the only way to do it is to add incremental parallelism which leaves the GIL in place. Racket has already shown a solid path here.

Guilds would have a major performance problem: can't allocate objects without GIL. It's also a tricky mental model and requires invasive changes to existing Ruby code to handle frozen objects.

Places don't share a heap so they don't need the GIL to allocate objects and have independent GC rather than global GC. It's a model which fits better with existing Ruby code. The GVL can be relaxed while executing Ruby and grabbed by native methods.

WDYT?


> Guilds would have a major performance problem: can't allocate objects without GIL.

Why not? It'd be possible to have TLABs, wouldn't it? But yes, GC would still be for all Guilds at once.

Racket places don't allow sharing objects, only arrays of primitive types, which seems very restrictive. And Racket futures are even more restricted.

I'm not sure about Guilds since it's not there yet, but it already sounds closer to Ruby multithreading and more flexible to me from a usage point of view.


TLABs are definitely possible but it's increasing the implementation complexity.

Places are deliberately shared-nothing to reduce the implementation complexity and provide a simple model to the user. Process oriented code can be easily ported.

Porting something like a Sidekiq based worker using ActiveRecord to work inside a Guild where all access to shared objects needs to be frozen would be a nightmare.

With Racket Places you can share more complex types but it needs to be done via FFI. It doesn't prevent you from implementing a sharable connection pool or anything.
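
The shared-nothing model described above can be sketched with a thread and channels. This is an analogy rather than an implementation of Racket places: `run_place` is a hypothetical name, and a Rust thread only approximates a place, since it does not have its own heap or GC. What it does show is that the worker owns all of its state and only ever receives messages that were moved to it.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical "place": a worker that owns its state outright and talks
/// to the outside world only through messages.
/// Returns the running byte total the place reported after each message.
fn run_place(inputs: Vec<String>) -> Vec<usize> {
    let (to_place, place_rx) = mpsc::channel::<String>();
    let (to_main, main_rx) = mpsc::channel::<usize>();

    let place = thread::spawn(move || {
        // State private to this place; no other thread can reach it.
        let mut total_bytes = 0usize;
        for msg in place_rx {
            total_bytes += msg.len();
            to_main.send(total_bytes).unwrap();
        }
        // `to_main` is dropped here, which closes the reply channel.
    });

    for s in inputs {
        // Each message is moved into the place; the sender keeps no alias.
        to_place.send(s).unwrap();
    }
    drop(to_place); // Closing the channel lets the place's loop finish.
    place.join().unwrap();

    main_rx.iter().collect()
}

fn main() {
    let replies = run_place(vec!["ab".to_string(), "cde".to_string()]);
    println!("{:?}", replies);
}
```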


The proposed API, MRB_API, is used in mruby. With enough popular support for that C API, it could force mainline Ruby to modernize its C API to that one.

That assumes MRB_API can be made thread-safe. mruby is not thread safe at all, being a single-threaded script runner.


Hi, I'm the author of Artichoke.

My expectation is that MRB_API can be implemented in a thread-safe way. The API manipulates the interpreter state (in the case of mruby, an `mrb_state`). The interpreter state itself and the implementation of the API can be thread safe or they can not be.

I don't think mruby even assumes the presence of threads on target systems, so it is completely reasonable that the implementation is not thread safe.


JRuby does not have a GIL.


I’m aware.

It also lacks compatibility with a lot of important C gems (not only or even predominantly because of the lack of a GIL, of course), which I’m pointing out has always impacted its adoption.


Then you should also be aware that it runs most of them via Sulong.


That’s Truffle, not JRuby. There was a plan several years ago to move Sulong support into JRuby based on trialing it in Truffle; it appears to have made no progress since 2016.


JRuby has no support for C extensions.

TruffleRuby supports C extensions, and has no GIL for Ruby code.


My personal experience with Ruby is extremely shallow, but a casual glance at the search results wrt the GIL in Ruby seems to say it actually does have one. I could be wrong, as I'm not an experienced Ruby developer, however.

I.e. https://thoughtbot.com/blog/untangling-ruby-threads


From your link: "Note that this is not the case for JRuby or Rubinius which do not have a GIL and offer true multi-threading."


Ruby has multiple implementations, JRuby uses Java Threads without any sort of GIL.

Plenty of talks done on this subject by Charles Nutter.


Once this language is more mature, comparing it with Crystal would be great. Artichoke will have the added benefit of having Ruby's entire ecosystem at its disposal.


I wonder how much this would change the performance of Ruby... that's all kind of cool except no C extensions, that basically makes all the gems I use unusable.


They're going to have to spend an immense amount of time implementing all the performance optimizations present in the MRI/YARV bytecode engine before they achieve performance parity.

The amount of engineering effort to remove the GIL and simultaneously make C extensions work is probably not worth it compared with simply choosing a multi-process w/ shared memory solution for true parallelism.


Hi, I'm the author of Artichoke.

I wouldn't say all gems are unusable. I implemented a Rack-compatible application server in an early version of Artichoke that is capable of serving a Sinatra Rack app.

https://github.com/artichoke/ferrocarril


JRuby doesn't support C extensions either: https://github.com/jruby/jruby/wiki/C-Extension-Alternatives

It's a parallel world in which developers either use pure Ruby gems or Java jars. For example, the database drivers are JDBC adapters straight from the Java world not the usual mysql2 or pg gems. Obviously the programs that use them are not portable unless there is a compatibility layer. In that example, ActiveRecord abstracts the drivers.

I don't know anything about Rust but maybe it could end up in the same way.


I actually like the idea, though being Rust illiterate I have no way of reading it. I like the tone of it: "No GIL". I hope it does come to fruition. But I would love Ruby programs to eventually be able to produce machine code, like tons of Lisp and Scheme implementations do. Even Smalltalk did. I hope with "No GIL" that should be somewhere in the pipeline.


Is it possible to install Ruby gems in the browser playground? That is, does `gem install` work via the WebAssembly runtime?


Not yet, but you can track progress here: https://github.com/artichoke/artichoke/issues/105



