
I wonder where we’d be if the idea of CPU-independent bytecode had ever really taken off - for example, the TIMI bytecode of IBM System/38 and AS/400, TenDRA TDF (aka OSF ANDF), or WebAssembly. You could have an AOT compiler in the system firmware which the OS invokes: when a program is installed, the OS uses it to convert the program to the actual machine code, about which the OS might know nothing and which could vary incompatibly from CPU to CPU, even among different CPU models in the same family. (Maybe a JIT mode too.)

I guess JVM/CIL are somewhat similar, but at a much higher level - I’m not talking about garbage collection or type safety.

In some ways that is true of TIMI too - it is designed to support a capability-based operating system, and hence has some rather high-level instructions, although still not as high-level as JVM/CIL. It was generally used as a compilation target for non-garbage-collected languages such as RPG, COBOL, C/C++, PL/I, Fortran, BASIC, Pascal, etc., and hence lacks a garbage collector.



> I wonder where we’d be if the idea of CPU-independent bytecode had ever really taken off [...]. You could have an AOT compiler in the system firmware which the OS invokes [...].

You could say modern CPUs are kind of like tracing JITs. On one hand, a normal tracing JIT has much more memory to save its work than a CPU’s trace cache, but on the other, the superscalar reordering and renaming stuff is even more aggressive than a trace recorder about looking at how the code actually executes and deriving assumptions from that instead of attempting to prove them statically.

Why not AOT instead? In part because they can’t, of course—a tracing JIT requires about the least amount of heavyweight compiler tech out of all the possibilities, which is an advantage if you’re trying to fit the compiler into silicon. (That’s not to say a tracing JIT is easy—the cost of a simple compiler is that you need to make it hella fast for the result to be any good.)

But in part I suspect it’s because a standard assembly-level bytecode kind of sucks to compile ahead of time. About the most useful assumptions such a compiler can make are which things don’t interfere with each other, usually memory operations, or perhaps which writes can be forwarded to reads. A tracing JIT can see some of this, a superscalar even more so; an AOT or function-at-a-time JIT, in the absence of any aliasing information, or even of knowing when one object ends and another begins (boo WebAssembly), can’t.
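
(To make the aliasing point concrete, here's a toy C fragment - entirely made up, not taken from any real bytecode - where an ahead-of-time compiler that only sees flat pointer operations has to assume the worst:)

    /* Without aliasing information the compiler must assume p and q may
       point at the same word, so the store to *q kills what it knew about
       *p and the final load cannot be forwarded from the first store. */
    int no_forwarding(int *p, int *q) {
        *p = 1;
        *q = 2;
        return *p;          /* must reload: could be 1 or 2 */
    }

    /* With real aliasing information (here faked via C's restrict) the
       load folds to the constant 1. A flat assembly-level bytecode carries
       no such hint; a trace recorder or the CPU itself only discovers it
       dynamically. */
    int with_forwarding(int *restrict p, int *restrict q) {
        *p = 1;
        *q = 2;
        return *p;          /* provably 1 */
    }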

Ironically, memory segmentation as in the Intel 432 or 286 (or the IBM dinosaurs) feels like it could help with that (or are we calling this idea “capability-based” once again?). Does anyone who isn’t just a speculating dilettante (unlike me) think that’s a reasonable thought?

(Wait, is a selector table just a Smalltalk-style object table with a fake moustache?)

Of course, even then we’d still have the problem that VLIW microcode wide enough to require no decoding and engage the entirety of a modern CPU’s physical register file and execution units would be cripplingly slow to fetch from DRAM, and the “legacy” ISAs partly serve as a compression format.


Transmeta Crusoe? They chose x86 machine code as their CPU-independent bytecode.


In demos the Transmeta processors were shown to support multiple instruction sets - per https://en.wikipedia.org/wiki/Transmeta#Code_Morphing_Softwa... , they demoed pico-Java, and there were also rumors of PowerPC compatibility.

Although you're probably right - none of those options made it into a shipping product, only x86.


I guess today's equivalent VLIW chip would be Tachyum Prodigy, though I'm not super confident about that: https://www.tachyum.com/products/#products-prodigy


Nvidia's Denver 2 cores work this way. They shipped on an Android tablet about 10 years ago. Not sure what happened to them after that.


Mill uses something similar as well. That being said, I have <1% confidence in Mill ever moving past the slideware stage, so...


"slideware": Hat tip. I never saw that term before. I usually see vaporware, e.g., Duke Nukem Forever.


> I wonder where we’d be if the idea of CPU-independent bytecode had ever really taken off [...] convert it to the actual machine code, about which the OS might know nothing, and could vary incompatibly from CPU to CPU, even among different CPU models in the same family

There are various examples I can think of.

Nvidia's CUDA platform compiles C++-like source code to PTX code, which is GPU-independent. At run time, PTX is compiled and specialized for the specific GPU model you are running on. I can imagine that PTX is compiled differently depending on the number of registers in the GPU as well as its instruction set capabilities. https://en.wikipedia.org/wiki/Parallel_Thread_Execution
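
(A rough sketch of that run-time step using the CUDA driver API - error handling omitted and names mine, so treat it as illustrative only:)

    #include <cuda.h>

    /* The driver JIT-compiles the architecture-neutral PTX text into
       machine code for whatever GPU is actually installed. */
    CUfunction load_kernel(const char *ptx_text, const char *name) {
        CUdevice dev; CUcontext ctx; CUmodule mod; CUfunction fn;
        cuInit(0);
        cuDeviceGet(&dev, 0);
        cuCtxCreate(&ctx, 0, dev);
        cuModuleLoadData(&mod, ptx_text);    /* PTX in, GPU-specific code out */
        cuModuleGetFunction(&fn, mod, name);
        return fn;
    }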

Mainstream virtual machine languages like Java, .NET, and JavaScript are obvious examples.


Given that everything is just microcode anyway, it would be really interesting if someone (e.g. Intel) took their design and only switched out the instruction decode to decode ARM (or whatever) instead.

Sure it wouldn’t be perfect since the chip is optimized based on x86-64 workloads, and they’d never publish it anyway. Plus it may only be simulated instead of spending the money on manufacturing the one-offs.

But boy would it be interesting to see how it performed in various dimensions, just as an exercise.


> Given that everything is just microcode anyway, it would be really interesting if someone (e.g. Intel) took their design and only switched out the instruction decode to decode ARM (or whatever) instead.

You probably mean uops, but that thought has also crossed my mind in the past --- a multi-ISA CPU. They could add the decoders for other ISAs, along with extra GDT descriptor types for "ARM mode", "RISC-V mode", etc. segments like they did with V86. It's not a new idea either: the NEC V30 (https://en.wikipedia.org/wiki/NEC_V30#ISA_extensions) could execute both x86 and 8080 code, and of course ARM has cores with the triple-mode ARM32/Thumb/AArch64 ISAs.


Yes I did, thanks. It’s also kind of reminiscent of the Transmeta Crusoe.

The problem I think multi-ISA would run into is the “master of none” issue. Intel can tune for how x86-64 works, Apple and Samsung for ARM.

But if one chip runs it all, it can’t tune for anything too specific.

It must not be worth it. I wonder if Apple would have done something like that for the M series to let it keep running Intel software. They must have tried to figure out if it was worth it, right? I know they added a few instructions or an addressing mode or something to help. But they must have determined it wasn’t worth it and that it could be done well enough in software.


Exactly - they added a flag to enable total store ordering to help x86 instructions map cleanly to ARM instructions.

https://twitter.com/ErrataRob/status/1331735383193903104

Considering how fast the Apple M series can emulate x86, it's clearly not worth adding much more hardware than what they have now.


Back when we were digging into microcode we found a mention of this as a PoC/toy example [1]. Sadly we never found more than an overview; we would have liked to know more about it, especially how the update was accepted.

[1] https://troopers.de/events/troopers16/655_the_chimaera_proce... by https://twitter.com/cynicalsecurity


Seems irrelevant though - the internet exists. I'll never have a problem getting the code I need, provided it exists - i.e. if all I have to do to support ARM is use the ARM compiler, then I'll support ARM.

Docker with ARM-specific Linux distributions solves this, as do things like Golang with its "just set an environment variable and don't even worry about needing a cross-compiler" toolchain.


> Seems irrelevant though - the internet exists. I'll never have a problem getting the code I need, provided it exists

That assumes all code is open source, or else proprietary code shipped with source. That's not the world we live in. Most businesses run at least some closed source on-premise software. Open source is great at providing solutions to problems most people have. But when you start looking at specialised software which is highly industry-specific, suddenly open source starts to look a lot more patchy, and a closed source solution is often the only realistic option.

For example, at many engineering firms (whatever type of engineering they may be doing), you will find heaps of closed source software being used every day. For much of it, there simply is no open source solution available – or if there is, it is missing major features, or is clunky/buggy/poorly-designed, and the amount of extra cost in adopting it will be a lot more than just continuing to pay for the closed source alternative.


I agree with this - but my point is that I think the difficulty of compiling to alternative architectures is more of an impediment. If it's easy, then companies will just do it, give or take "we don't want to support that platform".


> Seems irrelevant though - the internet exists.

... but it's not always useful. MSI motherboards doing "secure boot" can't check for key revocation until after they've booted :( Sometimes you just have to rely on what you've got.

https://arstechnica.com/information-technology/2023/05/leak-...


We kind of have it, except it does not seem to have taken off and has been relegated to a second-class citizen at best.

I am talking about LLVM Bitcode[0] – when the binary product (an executable or .o/.a files) is shipped in the LLVM IR representation and then is «AOT»'d into the final product (a final executable) that can, for instance, take advantage of the latest ISA features (armv9, Zen23, POWER18 or new RISC-V extensions) with zero effort on the end user's part. For a while, Apple even encouraged iOS devs to upload their apps to the App Store in the Bitcode format. That all but ceased to exist, for non-obvious reasons, about a year or two ago. Technically, if Apple chose to transition onto an alternative ISA again (say, RISC-V), at least iOS apps would not require recompilation and would get statically converted to the new ISA at download time.

Imagine a world where there would be a single Linux distribution for a given architecture shipped in the Bitcode format (sans the small arch specific boot area and the AOT engine), for instance.

[0] https://lowlevelbits.org/bitcode-demystified/

[1] https://www.highcaffeinecontent.com/blog/20190518-Translatin...


LLVM has multiple issues that make LLVM-Bitcode unsuitable as a modern ANDF.

It is a moving target. SPIR (OpenCL/Vulkan) used to be based on LLVM-IR, but each version had to be locked to one specific version of LLVM and that wasn't viable in the long run. So SPIR-V got its own IR, and hasn't looked back.

There are many subtle differences between architectures and their ABIs. In some ways LLVM-IR is too low-level, so the compiler has to lower to a specific ABI even before emitting LLVM-IR code.
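
(A tiny example of that, hedged since the exact IR depends on clang version and target: even passing a small struct by value already bakes ABI decisions into the bitcode.)

    /* The same C function gets a different LLVM-IR signature per target,
       so the emitted .bc is already target-specific:
         x86-64 SysV: the struct is typically coerced into a single
                      SSE-class argument (e.g. <2 x float> / double)
         AArch64:     it is a homogeneous float aggregate, passed as
                      [2 x float] in two FP registers */
    struct point { float x, y; };

    float mag2(struct point p) {
        return p.x * p.x + p.y * p.y;
    }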

LLVM-IR was made for a C/C++ compiler, and still retains many C-isms. What is undefined behaviour in C is often undefined behaviour in LLVM-IR, and therefore bugs have different effects on different hardware. A large software vendor would therefore still need to keep a farm of different machines to test its code on, and that is the total opposite of what one would want to accomplish.
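
(A classic concrete case - my example, not from the spec: an oversized shift is UB in C and poison in LLVM-IR, and the hardware fills in the blank differently.)

    /* x << n with n >= 32 is undefined for a 32-bit int. On x86 the
       hardware masks the count to 5 bits, so a shift by 33 acts like a
       shift by 1; classic 32-bit ARM uses the bottom byte of the count
       as-is, so the result is 0. Same bitcode, different behaviour -
       and the optimizer is free to do something else entirely. */
    unsigned shift_ub(unsigned x, unsigned n) {
        return x << n;
    }
    /* shift_ub(1, 33): commonly 2 on x86, 0 on ARM32. */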

A truly hardware-agnostic platform would need to both have its own virtual CPU with defined semantics, and its own ABI, so as to provide an abstraction around the specifics of each hardware platform. But if it has its own ABI, then it wouldn't be 100% interoperable with existing Linux libraries on each hardware platform either.


I don't know about the JVM, but CIL doesn't have a runtime notion of type safety - it is checked by the compiler/verifier and isn't enforced at run time.


Hotspot verifies the bytecode when it's loaded, but after that, it's completely unsafe too. This verification can currently be disabled, but the flag is deprecated for removal at some point.


(Turns to phone booth, ripping off tie) "Sounds like a job for super Forth!"


Seriously - something similar to the threaded interpreted code of Forth might be a nice way to implement a low-level bytecode on a modern machine.
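
(For the curious, a minimal sketch of the idea - direct-threaded code using GCC/Clang's computed-goto extension, names made up:)

    #include <stdio.h>
    #include <stdint.h>

    int main(void) {
        int stack[16], *sp = stack;
        /* the "compiled" program: push 3, push 4, add, print, halt */
        static void *prog[] = { &&lit, (void *)3, &&lit, (void *)4,
                                &&add, &&print, &&halt };
        void **ip = prog;
        goto **ip++;                         /* start the thread */

    lit:   *sp++ = (int)(intptr_t)*ip++; goto **ip++;
    add:   sp--; sp[-1] += sp[0];        goto **ip++;
    print: printf("%d\n", sp[-1]);       goto **ip++;
    halt:  return 0;
    }

(Each primitive ends by jumping straight to the next one - no central dispatch loop - which is essentially what a classic Forth inner interpreter does.)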


x86 is already CPU-independent bytecode.



