People say this isn't what BEAM is intended for and that it excels elsewhere, which yes, I'm sure it does.
But why can't it be both? Why can't you do everything that BEAM does... and then also have an optimising JIT for the straight line maths code? Couldn't you leave all the other parts of the system the same and keep all the existing benefits? Improving one doesn't damage the other, does it?
The problem with number crunching is that it is very difficult to cut the whole computation into smaller units and pre-emptively schedule them. Where that is possible for a specific use case, it is moderately easy to replace that part with NIFs. For efficient maths you also need to convert the internal tagged number representation into machine-native values, which is itself expensive. Solving both of these in the generic case, while preserving all the good parts, is very difficult.
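For concreteness, here's a minimal sketch of what "replace that part with a NIF" can look like (module and function names are made up, and error handling is minimal): the tagged list elements are converted to native doubles once, the maths runs on machine floats, and only the result is converted back into a term. As discussed further down the thread, it still has to return quickly.

    #include <erl_nif.h>

    /* Hypothetical NIF: untag the floats, do the maths natively, retag the result. */
    static ERL_NIF_TERM sum_squares(ErlNifEnv* env, int argc, const ERL_NIF_TERM argv[])
    {
        ERL_NIF_TERM list = argv[0], head, tail;
        double x, acc = 0.0;

        if (!enif_is_list(env, list))
            return enif_make_badarg(env);
        while (enif_get_list_cell(env, list, &head, &tail)) {
            if (!enif_get_double(env, head, &x))   /* untag one float */
                return enif_make_badarg(env);
            acc += x * x;                          /* native machine maths */
            list = tail;
        }
        return enif_make_double(env, acc);         /* retag the result */
    }

    static ErlNifFunc nif_funcs[] = { {"sum_squares", 1, sum_squares, 0} };
    ERL_NIF_INIT(my_math, nif_funcs, NULL, NULL, NULL, NULL)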
Am I correct in saying that functions written in C do not get pre-empted like Erlang functions? If that is true, you could write computationally intensive code in C within a BEAM app. But I think this misses the point. Pre-emption is really cool for concurrency abstractions, and the trade-off is being less good at single-threaded computation. Trying to turn Erlang into something like a Bitcoin miner is kind of like combining a bunch of Roombas to make a Shop-Vac.
> Am I correct in saying that functions written in C do not get pre-empted like Erlang functions? If that is true, you could write computationally intensive code in C within a BEAM app.
They cannot be pre-empted, but they must also return quickly, or risk causing lots of problems (see https://erlang.org/doc/man/erl_nif.html for slightly more detail on what this means). As such you can't just write some big function in C to do number crunching.
The NIF documentation mentions some ways around the problem, but all of them take some effort, or have tradeoffs of some sort. I was really excited when “dirty” NIFs were introduced, which can tell the BEAM that they'll run for a while, thus appearing to allow for long-running NIFs with no extra work other than setting a flag. However, it turns out that the BEAM just spins up N threads for scheduling dirty NIFs, and if you have too many such NIFs, too bad, some won't get scheduled till the others have completed. In retrospect it should have been obvious that there couldn't be a silver bullet for this problem, because it really isn't easy.
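To make the "setting a flag" part concrete, this is roughly all a dirty NIF declaration amounts to (names made up): the flag moves the NIF onto the dirty CPU scheduler pool, and, as noted above, that pool is finite, so too many long-running dirty jobs still end up queueing behind each other.

    #include <erl_nif.h>

    static ERL_NIF_TERM crunch(ErlNifEnv* env, int argc, const ERL_NIF_TERM argv[])
    {
        double x;
        if (!enif_get_double(env, argv[0], &x))
            return enif_make_badarg(env);
        for (long i = 0; i < 100000000; i++)   /* stand-in for a long computation */
            x = x * 1.0000001;
        return enif_make_double(env, x);
    }

    static ErlNifFunc nif_funcs[] = {
        /* The fourth field is the only change versus a normal NIF. */
        {"crunch", 1, crunch, ERL_NIF_DIRTY_JOB_CPU_BOUND}
    };
    ERL_NIF_INIT(dirty_demo, nif_funcs, NULL, NULL, NULL, NULL)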
Erlang may well be my favorite language, but as you imply, it's just not going to be the right approach for everything: in my experience, it's absolutely fantastic in its niche but that niche is quite small. I think that's fine, though. For me, where Erlang does make sense, its concurrency approach makes it unbeatable, and I'll live with the performance tradeoffs. It turns out that basically all the NIFs I've had to write were just to gain access to functionality that Erlang doesn't expose (e.g. network namespaces on Linux, which are supported now, but weren't when I needed them).
It's actually worse than that; as I recall, the internal representations of numbers do not necessarily map to the CPU's (for instance, there is no byte sizing; you have integers and floats, and they can be arbitrarily large). The work to perform that conversion, do the math, and convert back would almost assuredly make a single calculation take more time than just doing it within the BEAM. The only way to save time would be to convert once, do a bunch of math, and convert back. Which would, yes, prevent pre-emption, AND require an indication of intent (so brand new language constructs, minimally).
That's a lot to expect of the user, and a lot to implement in the language...all to avoid just writing a NIF.
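To illustrate the mismatch being described: Erlang integers are arbitrary precision, so even "give me this integer as a machine word" can fail inside a NIF, and the conversion only pays off if a whole batch of native maths happens between the untag and the retag. A rough sketch (names made up):

    #include <erl_nif.h>

    static ERL_NIF_TERM poly(ErlNifEnv* env, int argc, const ERL_NIF_TERM argv[])
    {
        ErlNifSInt64 x;
        if (!enif_get_int64(env, argv[0], &x))  /* fails if the term is a bignum > 64 bits */
            return enif_make_badarg(env);       /* ...so a real NIF needs a slow path here */

        ErlNifSInt64 acc = 0;                   /* convert once...           */
        for (int i = 0; i < 1000; i++)
            acc += x * x + 3 * x + 7;           /* ...do a bunch of maths... */

        return enif_make_int64(env, acc);       /* ...convert back (beware: native maths
                                                   wraps where Erlang would promote) */
    }

    static ErlNifFunc nif_funcs[] = { {"poly", 1, poly, 0} };
    ERL_NIF_INIT(convert_once, nif_funcs, NULL, NULL, NULL, NULL)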
Well, pre-emption only happens at a function call or return, or when a NIF calls the special function to allow for it; if all the math were in a single function, with no interleaved function calls (possibly after inlining?), then unbox, math away, and rebox could work.
There wouldn't need to be an indication of intent, other than writing the math separately from any function calls. I don't know how much code fits this pattern, but it's an idea that could be explored. I think that's part of what HiPE is supposed to do, but I haven't looked into HiPE in a long time.
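For reference, I believe the "special function" is enif_consume_timeslice; the cooperative pattern looks roughly like this (a sketch, with made-up names): do a bounded chunk of work, report how much of the timeslice it cost, and if the slice is used up, yield by rescheduling the NIF with the remaining input.

    #include <erl_nif.h>

    static ERL_NIF_TERM sum_to(ErlNifEnv* env, int argc, const ERL_NIF_TERM argv[])
    {
        ErlNifSInt64 n, acc;
        if (!enif_get_int64(env, argv[0], &n) || !enif_get_int64(env, argv[1], &acc))
            return enif_make_badarg(env);

        while (n > 0) {
            for (int i = 0; i < 1000 && n > 0; i++)   /* a small chunk of work */
                acc += n--;
            /* Report roughly 1% of a timeslice per chunk; if the slice is
               exhausted, yield by rescheduling ourselves with the remainder. */
            if (n > 0 && enif_consume_timeslice(env, 1)) {
                ERL_NIF_TERM newargv[2] =
                    { enif_make_int64(env, n), enif_make_int64(env, acc) };
                return enif_schedule_nif(env, "sum_to", 0, sum_to, 2, newargv);
            }
        }
        return enif_make_int64(env, acc);
    }

    static ErlNifFunc nif_funcs[] = { {"sum_to", 2, sum_to, 0} };
    ERL_NIF_INIT(yield_demo, nif_funcs, NULL, NULL, NULL, NULL)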
> Which would, yes, prevent pre-emption, AND require indication of intent
I don't understand why. If you have a maths-intensive operation like matrix-multiplication using untagged maths, why does that prevent pre-emption? Why does it require indication of intent?
And there's already a basically zero-overhead way to implement pre-emption: safepoints. That's what the JVM does when it wants to pre-empt in user-space.
Uh...no. A safepoint is when all threads in the JVM have blocked (which is purely cooperative, and happens during thread transitions), and, importantly, when OS threads running native code are still running but can't return/respond to the JVM. The JVM doesn't pause those threads.
Which is the point. You can't pre-empt crunching those numbers if it's not within the BEAM. Which might be fine. Or it might not. Making it invisible to the user is not really a good idea when going for soft real-time properties; at least with a NIF and a dirty scheduler you're being explicit about it.
> A safepoint is when all threads in the JVM have blocked
And having blocked them, you can then pre-empt them.
> You can't preempt crunching those numbers if it's not within the BEAM.
I still don't see why, sorry. If you had a JIT and you compiled maths-intensive code to native code, it could run efficiently in BEAM and still be pre-emptible by having a safepoint in the generated code.
How do you think Java is doing optimised numerical code that is pre-emptible from user-space? Safepoints! BEAM could do the same thing.
Yes, I think that's a reasonable definition of pre-emption, because safepoint polls don't interrupt the numerical pipeline: they introduce zero data or control dependencies. But even if you don't think it fits the definition, what do you think the practical difference is when you argue about this terminology?
What did we want to achieve? We wanted to be able to run a tight loop of highly optimised, untagged numerical code but still be able to interrupt it to switch threads on demand from user-space if needed.
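For anyone following along, the mechanism being described is just a poll in the generated loop; conceptually something like this (a sketch in C, not JVM or BEAM source; real JITs usually compile the poll down to a single load from a page they can memory-protect):

    #include <stdatomic.h>
    #include <stddef.h>

    static _Atomic int safepoint_requested = 0;   /* set by the scheduler */

    static void reach_safepoint(void) {
        /* Hand control back to the runtime: save state, let it switch to
           something else, etc.  Left as a stub in this sketch. */
    }

    double dot(const double *a, const double *b, size_t n) {
        double acc = 0.0;
        for (size_t i = 0; i < n; i++) {
            acc += a[i] * b[i];                   /* the untagged, tight-loop maths */
            if (atomic_load_explicit(&safepoint_requested, memory_order_relaxed))
                reach_safepoint();                /* the safepoint poll */
        }
        return acc;
    }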
OTP and BEAM are maintained by a very small team compared to other language projects. Working on math performance would likely reduce the time available for other things.
Perhaps there are some easy wins, but a JIT is not an easy thing. Depending on your needs, pushing the math out to a port or a NIF is probably a quicker win than trying to make it fast in Erlang. However, I wonder if the static single assignment optimizer would offer a path towards recognizing 'straight line math code' and potentially running it much faster. But there's still the issue of a potential mismatch between the very general number format, with automatic bignum promotion, and whatever the underlying machine provides.
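To make that mismatch concrete, here's a conceptual sketch (not BEAM's actual representation, which uses tag bits and its own small-integer layout) of why automatic bignum promotion gets in the way of emitting plain machine arithmetic: every operation needs an overflow check with a slow path that promotes to a bignum, unless the compiler can prove the values stay small.

    #include <stdint.h>
    #include <stdbool.h>

    /* Toy representation: either a machine-word integer or a bignum (elided). */
    typedef struct { bool is_big; int64_t small; /* bignum digits omitted */ } term_int;

    term_int add_terms(term_int a, term_int b) {
        term_int r = { .is_big = false };
        if (!a.is_big && !b.is_big &&
            !__builtin_add_overflow(a.small, b.small, &r.small))
            return r;        /* fast path: result fits in a machine word */
        r.is_big = true;     /* slow path: promote to a bignum (omitted here) */
        return r;
    }

(__builtin_add_overflow is a GCC/Clang builtin; the point is just that the check, and the possible allocation behind it, sit in the middle of what would otherwise be straight-line machine maths.)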
> But why can't it be both? Why can't you do everything that BEAM does... and then also have an optimising JIT for the straight line maths code?
I love this attitude. BEAM is something novel and special, and I think it's important to think about how to incrementally address its current shortcomings instead of throwing our hands up. I find GHC is another place where incrementalism on top of novelty is resulting in a lot of people's wishlists being fulfilled.