I'm not hugely knowledgeable about LLVM, but it was my understanding that a major benefit of it was separating out high-level from low-level concerns.

There are lots of practical issues, obviously, but the dream was that the LLVM intermediate representation would take the role of C in the author's description, with the major difference that nobody has to actually hand-write anything in it.

So from that point of view, isn't the robust, long-term solution to this and related issues to build LLVM back-ends for all the "weird" architectures?

I'm surprised it's not mentioned as a possible way forward.



Yes, and the reality is that all those weird architectures stopped being developed and actively supported 20 years ago. If someone wants to get some old Itaniums and Alphas and such together to hack in support, go wild and have fun. But the Venn diagram intersection of people knowledgeable enough to create an LLVM backend, people motivated to support vintage architectures, and people who actually have the hardware available and ready to test is basically empty. If people in the community are passionate about making it happen, then organize and make it a reality.


It's also that, per the article, the problem goes beyond "willing and able to implement an LLVM backend": a backend alone doesn't let the Rust devs or the Python devs test on those vintage platforms.


LLVM IR is not platform independent: https://releases.llvm.org/8.0.0/docs/FAQ.html#can-i-compile-...

There is no "compile to LLVM once, run anywhere with an LLVM backend". If your C frontend doesn't know that you want to compile for System/390, it will not be able to generate LLVM code that you can expect to turn into a working System/390 binary.
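
To make that concrete, here's an illustrative C sketch (mine, not from the article) of why the frontend has to resolve target details before any IR is emitted:

    /* Illustrative only: both of these must be resolved by the C
       frontend, so the emitted LLVM IR already encodes the target. */
    #include <stdio.h>
    #include <stdint.h>

    int main(void) {
        /* sizeof(long) is folded to a constant in the IR: typically
           4 on 32-bit targets and 8 on 64-bit Unix targets. */
        printf("%zu\n", sizeof(long));

        /* Byte order gets baked in too: this prints 1 on a
           little-endian target and 0 on a big-endian one, and any
           constant folding fixes the answer at compile time. */
        union { uint32_t u; uint8_t b[4]; } x = { .u = 1 };
        printf("%u\n", (unsigned)x.b[0]);
        return 0;
    }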


There are attempts to make portable IRs, like PNaCl and WebAssembly, but it's difficult to make something both portable and highly performant. You'd probably need some kind of optional SIMD, plus variant-function support so that hand-optimized SIMD and scalar versions of a function can coexist. Then you'd have to paper over other issues like unaligned memory access and how shift counts are masked.
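
For the variant-function part, GCC and Clang already have a mechanism in this spirit; here's a sketch using the target_clones attribute (supported on x86 with ifunc-capable toolchains), not anything a portable IR actually offers:

    /* Sketch of per-ISA function variants: the compiler emits an
       AVX2 clone and a baseline clone of saxpy, and an ifunc
       resolver picks one at program load time. */
    #include <stddef.h>

    __attribute__((target_clones("avx2", "default")))
    void saxpy(float *restrict y, const float *restrict x,
               float a, size_t n) {
        for (size_t i = 0; i < n; i++)
            y[i] += a * x[i];
    }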

That would be enough to get good results on x86/ARM/RISC targets, I think, but to go further and support Alpha or VLIW machines well you'd need to include a lot of aliasing metadata in the IR, a problem I don't think anyone has taken seriously.

(Everyone always says "this language is good enough as long as you're not writing video codecs or something!". Well, I am writing video codecs or something.)


I'm not sure SIMD needs to be in the IR. It's possible to recover (more likely, create by loop unrolling) parallelism that can be turned into SIMD code at the LLVM level. That's what LLVM already does.
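
For example, LLVM's loop vectorizer will usually handle a plain reduction like this at -O2, and clang's -Rpass=loop-vectorize flag reports when it succeeds (a minimal sketch, not from the thread):

    /* The scalar loop is typically unrolled and widened into SIMD
       lanes at -O2 with no source changes.
       Check with: clang -O2 -Rpass=loop-vectorize sum.c */
    int sum(const int *a, int n) {
        int s = 0;
        for (int i = 0; i < n; i++)
            s += a[i];
        return s;
    }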


Autovectorization for SIMD is, well, not great. The worst problems show up when you run it against already-SIMDed code (it tends to mess that up), which doesn't apply here, but it also fails a lot and relies heavily on memory aliasing information. That's why it works so well in Fortran, whose aliasing rules are much stricter than C's: dummy arguments are assumed not to alias, so the compiler gets that information for free.
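
A small C sketch of the aliasing point (assuming nothing beyond standard C99 restrict):

    /* Without restrict the compiler must assume dst and src can
       overlap, so it either stays scalar or guards the vector loop
       with a runtime overlap check; with restrict it can vectorize
       unconditionally, which is roughly Fortran's default. */
    void add_mayalias(float *dst, const float *src, int n) {
        for (int i = 0; i < n; i++)
            dst[i] += src[i];
    }

    void add_noalias(float *restrict dst, const float *restrict src,
                     int n) {
        for (int i = 0; i < n; i++)
            dst[i] += src[i];
    }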

I think a reasonable portable bytecode would have stricter (more fully defined) memory semantics than C, and so would be harder to optimize this way.

So that's why I proposed having variants, but you could also invent some abstract vector operations and scalarize them on targets where they're not available. That's how shader languages do it.
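
A sketch of that idea using GCC/Clang vector extensions, as a stand-in for what an abstract vector type in a portable IR might look like (not an actual proposal from the thread):

    /* vec4 is an abstract 4-float vector: on SSE/NEON targets the
       multiply-add below lowers to a couple of SIMD instructions;
       on a target with no vector unit the compiler scalarizes it
       into four independent float operations. */
    typedef float vec4 __attribute__((vector_size(16)));

    vec4 madd(vec4 a, vec4 b, vec4 c) {
        return a * b + c;
    }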


The problem is that LLVM IR is not stable across releases, so keeping an out-of-tree backend up to date would be expensive.



