DockYard R&D: FireFly Optimizes Your Elixir Compilation (dockyard.com)
115 points by freedomben on Sept 3, 2022 | hide | past | favorite | 45 comments


DockYard seems to be making strides with their projects, which in turn makes working with Elixir more lucrative (assuming the projects do take off).

Yesterday, they posted LiveView Native and now— this.


ElixirConf was this past weekend. I wasn't there (sadly), but these were probably all reveals from that.


Both libraries were announced by Brian along with a CMS they’ve built.


I think this project used to be called Lumen until pretty recently - https://github.com/GetFirefly/firefly


yeah we renamed because a ton of things are called "Lumen". Arguably "firefly" is more common so you can never win :D


I would appreciate it if this post included just a few shell script snippets showing how to replace the current Erlang compilation toolchain so Erlang/Elixir/other-BEAM-langs can try it, i.e. a drop-in replacement for your day-to-day dev workflow.

I don't mind compiling `firefly` from source (as the GitHub page instructs). That's all fine and an accepted reality when an experimental tool is starting to take off.

But not providing much context or an actionable list of steps for trying it in your day-to-day work makes this, to me, a PR piece doomed to fall on deaf ears.

This is not an investor pitch to business people. You're advertising to programmers. Please craft your articles accordingly.

Don't get me wrong, I am glad this exists and will follow it. I just wish we had something to copy-paste into our `~/.bashrc` file and be able to try the tool today.


It's not quite ready for that level of experimentation yet, we only recently got the compiler implemented, and there is much remaining in the runtime to finish up. There will be more announcements in the future related to the project, and I'll make a note to ensure we provide the kind of instructions you're looking for when the time comes.


Cool. When you guys feel that some projects can benefit from it, please post some instructions. Ideally just a few shell incantations to replace Erlang's compiler and runtime so one can easily run Erlang or Elixir apps right away.

Appreciate your work a lot and would contribute but too much on my plate for that. Hence -- double appreciation.


Thanks! I’ll put some together in the next couple of days just to make sure it gets done; even though running those apps may not work (depending on what’s used), at least it’ll be easy to play with.


As someone who's the exact target audience of this post (deploying Elixir apps in production), I have no idea what this PR piece is trying to say.

> Unlike BEAM it compiles applications ahead of time, allowing Lumen to perform optimizations BEAM can’t.

What does this mean? Erlang/Elixir are AOT-compiled languages. They're compiled into bytecode. (Also, "BEAM" is not a compiler, but a bytecode VM which executes the bytecode emitted by the compiler.)

Do they mean that they further compile the BEAM bytecode to native host-ISA object code? Or did they write alternative Erlang and Elixir compilers, to compile these HLLs directly to host-ISA object code, skipping both the language runtime and BEAM bytecode entirely? Do they mean "compiles applications" literally, in the sense of Whole-Program Optimization?

In short — is this something like GraalVM's Native Images; or something like an AOT version of BEAM's HiPE extension; or something else?

> FireFly is able to compile Elixir applications without having to run through the Erlang Virtual Machine.

So is the key benefit here that the compilation is faster because it's not running on the BEAM (good for e.g. CI); or is the key benefit that the resulting executable is faster because it's not running on the BEAM?

Also, there are some questions I have that this post didn't even try to answer. E.g.:

It is my understanding that the bytecode-ness of the Erlang VM is crucial to the lightweight, bounded-runtime, cooperative-scheduling mechanism that allows Erlang/Elixir code to be high-concurrency + soft-realtime. (Effectively, every ISA op implicitly decrements a per-actor reduction counter as part of its implementation; and the yield-point checks are also inside the impls for certain BEAM ISA ops, rather than being their own explicit BEAM ISA instructions.) How does a non-bytecode version of Erlang/Elixir abstract-machine semantics achieve these same guarantees? Is there an explicit reduction counter being carried around in the emitted native code?

And also, there is no mention of disadvantages/constraints of using this system. It's pretty clear that you wouldn't be able to do hot reloading or dynamic trace-point insertion without the BEAM there to intermediate it. That's fine for some use-cases, but they should explicitly mention the trade-offs and target audience.


Indeed all the DockYard pieces announced are nice, but the posts are puff pieces at best.

From what I understood, Elixir/OTP are there, but instead of compiling to BEAM bytecode, they compile for WASI (targeting whatever can run WebAssembly). It is not HiPE, nor a JIT; there will not be any bytecode. Only AOT compiled native code, except targeting WASI.

Generally, the BEAM is understood to be the slowest of the big runtimes (compared against Java, the CLR, Go, and often V8/JavaScript). FireFly claims to be faster, and smaller in compiled form, than the current iteration compiled against the BEAM.

How that plays with actor model and preemptively switched lightweight processes is a mystery to me too.

I'd really appreciate something with more technical sauce but a human explanation. Currently all we have is PR puff like this and the source code (at least for LiveView Native; I haven't tried looking for this one).

All that said, this does indicate there is good activity and a desire to get Elixir working on stuff other than its current strong suit. That usually means the language and ecosystem are growing. It could be the Baader-Meinhof phenomenon, but I started learning Elixir a couple of weeks ago (starting with LiveView tomorrow), and there is a nice stream of posts coming for it. Exciting times :)


> Only AOT compiled native code, except targeting WASI.

More precisely, a number of targets are supported, WASI would be just one.

> Generally, the BEAM is understood to be the slowest of the big runtimes (compared against Java, the CLR, Go, and often V8/JavaScript). FireFly claims to be faster, and smaller in compiled form, than the current iteration compiled against the BEAM.

The point is that by placing some restrictions on what is possible at runtime (specifically by removing the possibility of hot code loading), we can do whole program analysis and thereby do much more aggressive forms of optimization and dead code elimination across compiled applications _and_ the runtime they link to. It isn't guaranteed that such programs would be faster (though I suspect in some cases they would be), but they almost certainly should be smaller, which is important for Wasm, and other constrained targets.
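The whole-program dead code elimination being described can be sketched as a reachability pass over a call graph. This toy Rust version (hypothetical function names, nothing from Firefly's actual pipeline) illustrates why ruling out hot code loading is what makes the elimination safe: once no new callers can appear at runtime, anything unreachable from the entry point can be dropped.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

// Toy call graph: each function maps to the functions it calls.
fn reachable(graph: &HashMap<&str, Vec<&str>>, root: &str) -> HashSet<String> {
    let mut seen = HashSet::new();
    let mut queue = VecDeque::from([root.to_string()]);
    while let Some(f) = queue.pop_front() {
        if seen.insert(f.clone()) {
            for callee in graph.get(f.as_str()).into_iter().flatten() {
                queue.push_back(callee.to_string());
            }
        }
    }
    seen
}

fn main() {
    let graph = HashMap::from([
        ("main", vec!["encode", "log"]),
        ("encode", vec!["log"]),
        // Never called: eliminable only once hot loading is ruled out,
        // since otherwise newly loaded code might still call it.
        ("decode", vec!["log"]),
        ("log", vec![]),
    ]);
    let live = reachable(&graph, "main");
    assert!(live.contains("encode") && !live.contains("decode"));
    println!("live functions: {}", live.len()); // prints 3
}
```

On the BEAM, every exported function must be treated as potentially reachable, so this pass would eliminate almost nothing; fixing the call graph at compile time is what unlocks it.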

> How that plays with actor model and preemptively switched lightweight processes is a mystery to me too.

It makes no difference: you can implement all of that with identical semantics from the perspective of the developer. The strategy used for compilation is orthogonal to those features, though naturally the implementation details are tightly integrated.


> It isn't guaranteed that such programs would be faster (though I suspect in some cases they would be), but they almost certainly should be smaller, which is important for Wasm, and other constrained targets.

Smaller in code size, maybe (though that possibly doesn't matter given that code is pretty compressible), but possibly less cache-coherent! One thing that's interesting about bytecode interpreters like BEAM, the JVM, etc., is that since all the actual native-ISA code is just the same small set of instruction impls being jumped to over and over, the interpreter can stay entirely hot in L2 or even L1 cache at all times; with the program bytecode being executed through, while less hot, being more concise, and therefore also being "hotter per byte" since code that does more per op will take longer to fully run through before it needs to be evicted in favor of something else.
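The cache argument above can be illustrated with a toy stack-machine interpreter (this is not BEAM's actual instruction set, just a minimal sketch): a handful of opcode handlers service any program, so the handler code stays hot in the instruction cache while only the compact bytecode streams through memory.

```rust
// Minimal stack machine: three opcode impls handle every program, so the
// dispatch loop is a small, constantly re-executed piece of native code,
// while the bytecode itself is just data being read sequentially.
#[derive(Clone, Copy)]
enum Op {
    Push(i64),
    Add,
    Mul,
}

fn run(prog: &[Op]) -> i64 {
    let mut stack: Vec<i64> = Vec::new();
    for op in prog {
        match op {
            Op::Push(n) => stack.push(*n),
            Op::Add => {
                let (b, a) = (stack.pop().unwrap(), stack.pop().unwrap());
                stack.push(a + b);
            }
            Op::Mul => {
                let (b, a) = (stack.pop().unwrap(), stack.pop().unwrap());
                stack.push(a * b);
            }
        }
    }
    stack.pop().unwrap()
}

fn main() {
    // (2 + 3) * 4 encoded as five bytecode ops
    let prog = [Op::Push(2), Op::Push(3), Op::Add, Op::Push(4), Op::Mul];
    assert_eq!(run(&prog), 20);
}
```

A real interpreter like the BEAM has hundreds of opcodes rather than three, which is part of the counterargument made in the reply below this comment.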

This has been a consideration for decades—it's why programs compiled to Pascal p-code tended to be faster than programs compiled natively for the low-level (mostly ALUless) host instruction sets of the time.


Smaller code size was the goal; and BEAM bytecode adds up quick, even compressed. For example, the BEAM files for _just_ Ecto come out to about 1.2M in an uncompressed tarball, but in a gzipped tarball at maximum compression is still 799K. In my experience, about the smallest release for the average application in terms of BEAM bytecode is about 30M (uncompressed), across the standard library, dependencies, and your own code. Virtually none of that can be dead code eliminated, because the BEAM has to treat it all as potentially reachable. Shipping that much to a browser (even compressed) is just not viable.

Your point about instructions remaining hot in the cache might very well be true more often than not, but is highly sensitive to the application in question. The core interpreter opcode impls might all fit in cache at the same time (though I doubt even that with the BEAM due to how many there are), but any call to BIFs/NIFs is likely to cause evictions. It still might do better than natively-compiled code overall in that specific sense, but I would be hesitant to state any generalities about it when considered as a whole with all of the many other factors that play in to overall performance.

In any case, Firefly wasn't about building a faster BEAM, but about bringing BEAM languages to the browser (or really any Wasm host). While targeting standard server/desktop architectures was something we also wanted to support, particularly for writing CLI tools and such, we expect the BEAM will always be the first choice for people deploying to those systems. If we can build something that is faster than the BEAM in some cases due to the tradeoffs we make, that's great, but it isn't an explicit goal.


> For example, the BEAM files for _just_ Ecto come out to about 1.2M in an uncompressed tarball

Did you try stripping out debugging chunks? IIRC every Elixir module embeds its own source code by default.


Thank you! That makes quite a few things clear.

As a naïve question, how does SBCL (a Common Lisp compiler) generate such wicked fast code? Other than the actor model, it is also dynamically typed, and hot code loading is one of the headline features there.


When I say hot code loading, I’m really specifically referring to how that works in the BEAM, which is a more sophisticated mechanism than simply compiling/generating code on the fly. The biggest problem though is that it prevents most forms of dead code elimination. It also means that you can never assume anything about how a function will be called, because at any point new code could be loaded that calls it differently. You can still optimize such code with a tracing JIT, but in Wasm (at least currently) that’s not even an option.

Without knowing too much detail about SBCL specifically, I suspect that they use a combination of clever static analysis and specialization to unlock a lot of that speed. That way functions can be specialized/inlined where beneficial, while new code loaded at runtime can safely call the unoptimized versions, and even hot-loaded code could be specialized with a JIT on hand. The big reason we made the trade-off with hot code loading though is due to the restrictions that Wasm imposes; there's no particular reason we couldn't support it otherwise. In general it is rarely used in production BEAM apps in my experience, so from my perspective it seemed like an opportunity to stop paying for an unused feature and gain something in return.


Thank you for detailed reply here and elsewhere in the thread. That probably took as much time as writing a new blog post, so it is really appreciated.

As mentioned before, as a newbie student of Elixir this is all very exciting. Please keep up the good work!


If you are looking for more substantial technical information, it can be found in the ElixirConf ‘19 keynote speech and demo where Lumen was announced.

https://youtu.be/uMgTIlgYB-U


Agreed, these posts from Dockyard have been really low quality. Dockyard makes some awesome stuff generally, so I'm surprised at how shallow and marketing-speak these have been.

That said I'm really excited for this and the other projects. Looking forward to some more technical dives.


Speaking as the lead on the project, this is partially due to this weekend being ElixirConf, so things are hectic, but you can also blame me, as I probably should have written this post, but didn’t make the time as I was pretty heads down on the lead up to the conference.

I’ll make sure to do a follow-up post in the near future that is more in depth.


Thank you for all the follow up posts on this thread... It strikes me they're probably taking up as much, if not more, time just in the clarifications!


They make up for it in their ElixirConf keynotes and presentations.


I'd love to watch it, but the recordings are not available yet, and $250 for virtual attendance was way too much for a personal-interest language.


They are published free after the conference. The 2019 keynote with substantial technical information on Lumen is available for free on youtube.


> What does this mean? Erlang/Elixir are AOT-compiled languages...

With the BEAM compiler today (by "BEAM compiler" I mean the compiler provided as part of the standard library shipped with the BEAM, just to clarify which Erlang compiler is being referred to), Erlang sources are partially AOT-compiled to BEAM bytecode, then the code loader does additional compilation steps at runtime. Firefly AOT-compiles Erlang to native code directly, and does not use a virtual machine at runtime.

> So is the key benefit here that the compilation is faster because it's not running on the BEAM (good for e.g. CI); or is the key benefit that the resulting executable is faster because it's not running on the BEAM?

There are a few benefits we hope to provide using this approach:

1. Compilation can be faster because the compiler is implemented in Rust, rather than implemented in Erlang and running on the BEAM.

2. We impose a restriction that hot code loading is not permitted, so we are able to do forms of optimization that the BEAM cannot do, as a result of having to pessimize in the presence of hot code loading.

3. Related to the above, we can do whole program analysis, including optimizations that take advantage of unboxing terms and working with more machine-friendly types.

4. We can produce a single statically-linked executable that is as small as possible, in part due to being able to do more aggressive and precise dead-code elimination across both the application being compiled and the runtime it links to, because we know what must be available at runtime. On the other hand, if one were to ship the BEAM itself to run in browsers via Wasm (if it were modified in such a way as to be possible), you'd also need to ship large amounts of bytecode for most applications, regardless of whether much of that bytecode is actually needed at runtime. Deployment has always been a pain point of the BEAM (and I'm speaking as someone who built the primary release tools for the Elixir ecosystem); Firefly aims to make deployment as simple as Go/Rust/etc.

5. I'd like to take advantage of the fact that we use MLIR behind the scenes to explore using Firefly as a natural pairing with Elixir applications using Nx, by being able to more seamlessly integrate regular Elixir code with Nx-managed functions.

> How does a non-bytecode version of Erlang/Elixir abstract-machine semantics, achieve these same guarantees? Is there an explicit reduction-counter being carried around in the emitted native code?

In short, yes, we implement preemptive-scheduling the same way the BEAM does, using compiler-injected yield points based on a few criteria (a reduction counter is just one, some others are garbage collection and blocking I/O).
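A minimal sketch of what a compiler-injected reduction counter might look like in native code (the numbers and structure here are illustrative, not Firefly's actual implementation; the BEAM's budget is conventionally around 4000 reductions per time slice):

```rust
// Each "process" carries an explicit reduction budget. The compiler would
// emit a decrement-and-check like `bump` at every function call site, so
// compiled code yields back to the scheduler at bounded intervals even
// without a bytecode VM interposing on every instruction.
const REDUCTIONS_PER_SLICE: u32 = 4000; // illustrative; roughly the BEAM's budget

struct Process {
    reductions: u32,
    yields: u32,
}

impl Process {
    fn new() -> Self {
        Process { reductions: REDUCTIONS_PER_SLICE, yields: 0 }
    }

    // Injected at call sites by the compiler in this model.
    fn bump(&mut self) {
        self.reductions -= 1;
        if self.reductions == 0 {
            // In a real runtime: save continuation state and return
            // control to the scheduler. Here we just count the yield.
            self.yields += 1;
            self.reductions = REDUCTIONS_PER_SLICE;
        }
    }
}

fn main() {
    let mut p = Process::new();
    for _ in 0..10_000 {
        p.bump(); // simulate 10k function calls ("reductions")
    }
    assert_eq!(p.yields, 2); // yields at calls 4000 and 8000
}
```

As noted above, a real implementation also forces yields at garbage collection and blocking I/O, not just on counter exhaustion.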

> And also, there is no mention of disadvantages/constraints of using this system. It's pretty clear that you wouldn't be able to do hot reloading or dynamic trace-point insertion without the BEAM there to intermediate it. That's fine for some use-cases, but they should explicitly mention the trade-offs and target audience.

The readme of the project certainly does this, but I agree it would have been good to include in the blog post. In any case, Firefly is still in early stages, so this isn't something anyone is using today. When we reach the point where we feel it is production ready, I can assure you I will be writing up a very detailed analysis of what it is ideally suited for, and what it is not - like you, I feel it is critically important to be clear about that.


> Compilation can be faster because the compiler is implemented in Rust, rather than implemented in Erlang and running on the BEAM.

If this is a full-blown Erlang/Elixir (or only Elixir?) compiler, rather than a "BEAM IR to native" compiler, that sure sounds like a lot of independent work to maintain, especially without there being any sort of spec/standard defining what the behavior of such a compiler should be separate from what the reference compiler does, and what the resulting abstract machine should do separate from what the reference VM does. How do you plan on ensuring your compiler tracks updates introduced by major Erlang releases? Will there be features introduced in the Erlang runtime (I'm thinking things like Erlang 21's `atomics`) that won't be immediately available for FireFly-compiled programs, where FireFly will just choke on these?

> We impose a restriction that hot code loading is not permitted, so we are able to do forms of optimization that the BEAM cannot do, as a result of having to pessimize in the presence of hot code loading.

Hot code loading happens in places other than just relups, though, no? There are some Erlang/Elixir libraries which do crazy things at runtime — for example, protocol serialization libraries which, at runtime, discover remote schemas; dynamically generate code to ser/des values for those schema; compile that code to a module in memory; and then load the resulting module. And this is not discouraged/disincentivized in the Erlang ecosystem; the compiler infrastructure is usually considered to be "part of the standard library," for any application to use freely at runtime. So even if I don't do any of this in my own project, I would be worried that some transitive dependency of my project would be secretly doing this.

Even if you have no plans to support hot code loading in the "remote calls can jump into the new version of a module" sense, do you think it would make sense for FireFly to ever support "module compilation+loading at runtime"? Maybe with the resulting modules being native DLLs?

---

Tangent to that — and I know that this was probably nowhere on your mind when you were working on the project, but something interesting to consider: how hard would it be to convince the FireFly compiler to emit a C-ABI library that could be loaded into the BEAM as a NIF?

I ask because this would be a really interesting way of achieving the same sorts of speedups that HiPE does/did, but more explicitly, on a higher unit level, and (probably) better.


I have a similar/related question as well: Will Firefly support most/all elixir libraries, or is it expected/known that apps will have to avoid some common APIs in order to ensure they avoid illegal calls?


The goal is to support all libraries, but because there are some out there that dynamically compile/load code at runtime, despite it being a bad idea, there will necessarily be some libraries that are not supported, at least in the near term. Likewise, you might also want to compile a library for say, Wasm, that uses a NIF which lacks support for that target, which wouldn't work; but this is already a restriction with the BEAM today with NIFs (i.e. if they don't support a particular target, then the library won't work).

But as long as a library only uses code loading APIs at compile-time, and not runtime, it should be supported just fine. To be clear, we only plan to raise runtime errors in those cases, as if the call failed normally, and not prevent compilation just because a call to one of those APIs exists in the code, though we may choose to emit diagnostic warnings for it, that remains to be seen.


> How do you plan on ensuring your compiler tracks updates introduced by major Erlang releases? Will there be features introduced in the Erlang runtime (I'm thinking things like Erlang 21's `atomics`) that won't be immediately available for FireFly-compiled programs, where FireFly will just choke on these?

We would plan to track Firefly releases against specific mainline Erlang/OTP releases, so for example, we would explicitly state that Firefly v1 tracks Erlang/OTP 25, v2 tracks 26, and so on.

Most changes implemented in each release are implemented in the Erlang-based standard library, so we would get most of that for free (we aren't planning to maintain a completely separate fork of OTP, only the parts that we need to implement ourselves, primarily stuff that is already separate and provided in the preloaded modules shipped with ERTS). What remains is typically small enough that we should be able to maintain parity pretty closely, though naturally there may be delays depending on the scope/effort involved. Obviously we hope to grow enough community around the project that this is a non-issue, but the project has enough financial backing at this point to handle this in the near term. I also hope to make alternative runtimes/implementations of Erlang/OTP something that the core team takes into account via the EEF, ideally resulting in some kind of technical spec around semantics of the language; in the near term we have to rely on the OTP test suite, various papers that have been produced over the years, and a whole lot of digging through the ERTS code itself.

> Hot code loading happens in places other than just relups, though, no?

Yes, of course, and obviously that means some libraries won't be compilable with Firefly. In my experience though, the majority of production apps are not and should not be doing dynamic code generation/loading at runtime; that's certainly something I would raise in a code review if I saw it. Naturally there are going to be teams out there that _are_ doing so, or use tools/libraries that do so, but I'm fine with this being a reason why you would choose the BEAM instead.

> Even if you have no plans to support hot code loading in the "remote calls can jump into the new version of a module" sense, do you think it would make sense for FireFly to ever support "module compilation+loading at runtime"? Maybe with the resulting modules being native DLLs?

We do support dynamic loading of libraries containing new modules/functions, and I think it is certainly possible for us to provide a JIT on supported platforms in the future to do the kind of unbounded code generation/loading that the BEAM supports, but it is not a priority by any means. As I've stated previously, it is rare that I've seen that kind of thing abused in production applications, and I don't see it as a major selling point of the platform (or a blocker for Firefly). It is a tradeoff though, and that does mean there will be things you just can't do with Firefly - but I think that's an important property of alternative implementations; ideally you want them to be tailored towards different use cases, otherwise there is little reason to have multiple implementations in the first place.

> Tangent to that — and I know that this was probably nowhere on your mind when you were working on the project, but something interesting to consider: how hard would it be to convince the FireFly compiler to emit a C-ABI library that could be loaded into the BEAM as a NIF?

You'd be surprised what things I've considered while working on this project ;). I think it is certainly possible, and I'd like to support it, especially since I think it might be a great way to use Firefly in a traditional BEAM deployment for things that, as you mentioned, one would have previously considered using HiPE for. I have yet to investigate just how tight that integration could be though.


Thank you for elaborating!

I think that Erlang/Elixir could make amazing TUI apps if they were easier to distribute. The ability to push work to the edges of the system while keeping the UI responsive would be awesome. It sounds like static binaries provided by Firefly would make this a viable option.


Likewise. I still mainly use Ruby or Go for writing CLIs, but I would much, much rather use Elixir. escript is nice but still requires the target to have Erlang installed. Being able to produce a statically-linked binary would be enough to use Elixir exclusively, which is my dream.


This is one of the things I'm personally excited about using Firefly for. Since I first started working professionally with Erlang/Elixir, I've wanted the ability to use it for CLIs. You can be sure it will be a well-supported use case :)


While I'd be fascinated to hear the answers to all your questions, I would guess maybe 2% of people would even understand them. I wouldn't expect a post like this to get anywhere near that technical.


It makes no sense to me to use Elixir without running on the BEAM. Absolutely zero. You lose OTP, for one thing, which means that the entire programming paradigm goes out the window.


You don't lose OTP, because OTP is a library, written almost entirely in Erlang (not counting the set of NIFs/BIFs which provide intrinsic functionality), which we absolutely aim to compile with Firefly just like any other Erlang (or other BEAM language) sources.

The BEAM also provides a runtime, but that runtime can be implemented using other strategies. It essentially provides an M:N green-threading abstraction (processes), with a specific set of semantics around how those communicate (messages) and how failure is handled (links/monitors/etc.). Firefly provides a runtime that aims to be equivalent to that of the BEAM from the perspective of the developer; the only difference is in how that is done behind the scenes, what is produced by the compiler, and what restrictions we impose that the BEAM doesn't (namely no hot code loading, at least for the foreseeable future).
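The developer-visible contract described here (isolated processes that share nothing and communicate only via mailboxes) can be modeled in a few lines of Rust. This sketch uses 1:1 OS threads and channels rather than a real M:N scheduler, purely to illustrate the message-passing semantics, not how Firefly or the BEAM actually implement them:

```rust
use std::sync::mpsc;
use std::thread;

// A message carries the sender's reply channel, like a pid in a message
// tuple. No memory is shared between the two "processes".
enum Msg {
    Ping(mpsc::Sender<Msg>),
    Pong,
}

fn main() {
    let (to_server, server_mailbox) = mpsc::channel::<Msg>();

    // "Server process": receives Pings, replies with Pongs.
    let server = thread::spawn(move || {
        while let Ok(Msg::Ping(reply_to)) = server_mailbox.recv() {
            reply_to.send(Msg::Pong).unwrap();
        }
        // Loop ends when every sender to the mailbox has been dropped.
    });

    // "Client process": sends a Ping carrying its own mailbox, awaits Pong.
    let (to_client, client_mailbox) = mpsc::channel::<Msg>();
    to_server.send(Msg::Ping(to_client)).unwrap();
    assert!(matches!(client_mailbox.recv().unwrap(), Msg::Pong));

    drop(to_server); // hang up so the server loop can terminate
    server.join().unwrap();
}
```

The point of the comment stands: nothing in this contract requires a bytecode VM, only that the runtime provide isolated mailboxes and failure signaling with the same observable behavior.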

I'm not sure where you got the idea that Firefly throws away OTP, or tries to implement Erlang with different semantics, because that is explicitly _not_ the goal.


So you're re-implementing the entire BEAM? I think you should make that clearer in the blog post.


The people writing Firefly are trying to support OTP on Wasm. Because the Wasm runtime has different characteristics and guarantees from the BEAM, and they don’t want to run the BEAM inside Wasm, they created an alternate runtime. There was at least one instance where DockYard proposed and implemented changes to the Wasm spec itself in order to support Elixir/Erlang/OTP in Wasm.


I don’t think Firefly gets rid of the OTP.


OTP is a programming library/paradigm that doesn't work without the ability to create extremely lightweight threads, which is what the BEAM provides.


I don't understand, isn't the BEAM just a software runtime that provides a lightweight thread implementation that OTP can use? Why couldn't they implement an alternative that runs on WASI?

I agree that Elixir without OTP becomes much less useful. There could be changes to the language to enable in-process state changes, so that you wouldn't be limited to a single process plus immutability restrictions that would make it very difficult to do anything useful. I'm sure they thought of this problem and have a solution in some form.


Correct, the BEAM is just one implementation of the runtime; there can be (and have been) others. Firefly aims to stay as close to BEAM semantics as possible, but there will naturally be some differences, since we're taking a different approach to compilation, with some benefits as a result but also some tradeoffs.


BEAM isn't the only runtime possible with green threads. OTP is just the libraries.

Lots of people have written alternatives to BEAM. The only problem they run into is that BEAM is very good, and would be tough to beat. I was an admirer of Erlang on Xen: https://github.com/cloudozer/ling


Apparently it has feature parity with the BEAM, but lacks support for NIFs.

https://github.com/GetFirefly/firefly#runtime


It doesn't currently, but you are correct that the goal is to maintain feature parity with the BEAM (with explicit caveats to that, namely hot code loading). There actually is support for NIFs, just not via the erl_nif interface that NIFs use today, support for that will arrive eventually.



