
From the site:

"There's a good comparison to be made with java, as they had similar design goals:

    good performance (within an order of magnitude of C)
    cross platform / virtual machine based (WriteOnceRunAnywhere, except actually delivered this time)
    safe execution of untrusted code
    powerful network / browser focus 
However, they diverged in a few areas (probably to their advantage):

    register based byte-code (as opposed to stack based), maps well to existing processors, JIT compilation works much better
    supported as a native operating system (i.e., it'll easily run on bare hardware, for many values of hardware)"



The choice of register vs. stack for bytecode doesn't make any difference for a modern JIT like HotSpot. The bytecode gets transformed into an intermediate representation (IR) and optimized before being translated to machine code. A register-based bytecode like Dalvik's, which can address up to 65,536 registers, certainly does not map directly to hardware.


Edit: Michael Franz's research would seem to disagree, and I believe Mike Pall (of LuaJIT) would also disagree.

As I remember, the Dis bytecode used 64-bit instructions, with a 16-bit opcode, a 16-bit address and a 32-bit address. I've heard it described as a memory machine, and also as a machine with 65,536 registers.

In any case, it's easier and more efficient to perform register allocation from an intermediate language with a huge number of registers down to a small number of physical registers than it is to assign stack slot locations to registers. See Michael Franz's SafeTSA [1] project for an example of modifying a JVM to be able to mix and match both Java bytecode and an SSA-based bytecode in the same process. Granted, SSA has some additional properties besides an infinite register set that make code generation nice. However, they found that it took less time to generate native code from SafeTSA and that the generated native code ran faster, even though the JIT backend was identical for the two bytecode formats. The disadvantage of SSA and memory-machine bytecodes is that they are harder to interpret efficiently if you're not generating native code from them.

[1] http://en.wikipedia.org/wiki/SafeTSA
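
To make "a huge number of registers down to a small number of physical registers" concrete, here is a minimal sketch of linear-scan allocation in Python. This is not SafeTSA's or any real JIT's allocator; the function name, the live-interval input format, and the "spill" marker are all assumptions made for illustration.

```python
def linear_scan(intervals, num_regs):
    """Map virtual registers to physical ones (hypothetical helper).

    intervals: dict of vreg name -> (start, end), the inclusive range of
               instruction indices over which the vreg is live.
    Returns a dict of vreg name -> physical register index, or "spill".
    """
    free = list(range(num_regs))   # physical registers not in use
    active = []                    # (end, vreg) pairs currently holding a register
    assignment = {}

    # Visit virtual registers in order of where their live range starts.
    for vreg, (start, end) in sorted(intervals.items(), key=lambda kv: kv[1][0]):
        # Release registers whose live range ended before this one starts.
        for old_end, old in list(active):
            if old_end < start:
                active.remove((old_end, old))
                free.append(assignment[old])

        if free:
            assignment[vreg] = free.pop(0)
            active.append((end, vreg))
            continue

        # No register free: spill whichever live range ends last.
        far_end, far = max(active)
        if far_end > end:
            assignment[vreg] = assignment[far]
            assignment[far] = "spill"
            active.remove((far_end, far))
            active.append((end, vreg))
        else:
            assignment[vreg] = "spill"

    return assignment


if __name__ == "__main__":
    # Four virtual registers, two physical ones, no overlap pressure.
    print(linear_scan({"v0": (0, 3), "v1": (1, 2), "v2": (4, 6), "v3": (5, 7)}, 2))
```

The allocator never needs to know how the virtual registers were numbered, only when each one is live, which is why a large virtual register file is a convenient input.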


Easier? There have only been about 8000 PhDs done on register allocation; it's very well understood and the differences are pretty small. If you are reinventing it, one might be incrementally easier.


It would be cool to have a LuaJIT/Inferno/dis hybrid.


In any case, Java's stack-based bytecode is really a register machine in disguise. It isn't a general stack machine -- for instance, at a given "PC" the shape of the expression stack (i.e. the number and "basic" types of elements in it) needs to be constant.

The right way to look at Java's bytecode stream is opcodes for a register machine with _implicit_ input and output registers. Mapping it to a compiler IR would not be easy in the general case if that weren't true.
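
A rough sketch of that view, in Python: because the stack depth is statically known at every instruction, each stack slot can be named as a register, and stack code rewrites directly into three-address form. The toy opcodes (push, load, store, add, mul) and the s0/local0 naming are made up for illustration; this is not real JVM bytecode and it handles straight-line code only.

```python
def stack_to_registers(code):
    """Rewrite straight-line stack code as three-address instructions.

    code: list of tuples such as ("load", 0), ("push", 2), ("add",).
    Stack slot i becomes the register "s<i>", so the operands that were
    implicit on the stack become explicit.
    """
    out = []
    depth = 0  # stack depth before the current instruction, known statically

    for op, *args in code:
        if op == "push":                      # constant onto the stack
            out.append((f"s{depth}", "const", args[0]))
            depth += 1
        elif op == "load":                    # local variable onto the stack
            out.append((f"s{depth}", "copy", f"local{args[0]}"))
            depth += 1
        elif op in ("add", "mul"):            # pop two, push one
            if depth < 2:
                raise ValueError(f"stack underflow at {op}")
            depth -= 2
            out.append((f"s{depth}", op, f"s{depth}", f"s{depth + 1}"))
            depth += 1
        elif op == "store":                   # pop into a local variable
            if depth < 1:
                raise ValueError("stack underflow at store")
            depth -= 1
            out.append((f"local{args[0]}", "copy", f"s{depth}"))
        else:
            raise ValueError(f"unknown opcode {op!r}")

    return out


if __name__ == "__main__":
    # local2 = (local0 + local1) * 2
    program = [("load", 0), ("load", 1), ("add",), ("push", 2), ("mul",), ("store", 2)]
    for instr in stack_to_registers(program):
        print(instr)
```

With branches, the same trick works only if every path reaching a given instruction arrives with the same stack depth, which is exactly the constant-shape rule described above.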


That's disingenuous; of course it makes a difference. Trying to infer an efficient allocation of registers from a chain of stack instructions involves a ton of tricks that bloat the compiler and make loading the bytecode slow and processor-intensive. That's the problem Dalvik was designed to solve, and why Dalvik's successor ART is a regular ahead-of-time compiler and not a JIT.



