One of my favorite little things about the calculus of JIT tuning is that a faster interpreter can give you a better JIT. When the interpreter is 'good enough' more often, you compile fewer functions. And if you compile fewer functions, you can budget more resources for the ones you do compile, giving you better performance.
Some have taken a different tack: use no interpreter at all, and instead dump the simplest, fastest compile they possibly can on first execution.
> Interpreters are quick to get up and running. You don’t have to go through that whole compilation step before you can start running your code. You just start translating that first line and running it.
> Because of this, an interpreter seems like a natural fit for something like JavaScript. It’s important for a web developer to be able to get going and run their code quickly.
> And that’s why browsers used JavaScript interpreters in the beginning.
I'm curious, is that true? How did the original JavaScript interpreter handle code like this:
```javascript
var r = test();         // not defined yet
alert( r );             // 42
alert( typeof result ); // "undefined"

function test() {
    result = 42; // local because of 'var' below
    return result;
    var result;
}
```
Was there a first pass to find "hoisted" function and var statements? Or did it compile to bytecode? Or...?
Quite a long time ago, 'interpreter' used to imply that you made no passes over the code at all before executing it, and yes, that means hoisted functions would be a problem. I think some shell interpreters still work this way today, and you can see this if you modify a shell script file on disk while it's running: they will pick up the changes as they keep reading lines while executing.
But for many decades, that hasn't been what most people mean by 'interpreter'. What people consider to be a 'pure interpreter' today may still make multiple passes over the code before starting to execute. This is what early JavaScript interpreters did and how they handled hoisting and things.
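To make that concrete, here's a minimal sketch of two-pass execution, assuming a pre-parsed list of statement objects standing in for an AST (my own illustration, not any real engine's code): the first pass hoists declarations into the scope, and only then does the second pass execute, so a call can resolve a function defined further down in the file.

```javascript
function run(statements, scope) {
  // Pass 1: hoist declarations. Function declarations are fully
  // initialized; 'var' names are created but left undefined.
  for (const stmt of statements) {
    if (stmt.type === "func") scope[stmt.name] = stmt.fn;
    if (stmt.type === "var") scope[stmt.name] = undefined;
  }
  // Pass 2: execute statements in source order. A call can now
  // resolve a function declared later in the source.
  for (const stmt of statements) {
    if (stmt.type === "call") scope[stmt.target]();
  }
}

// 'test' is called before its declaration appears, as in the snippet above.
run(
  [
    { type: "call", target: "test" },
    { type: "func", name: "test", fn: () => console.log("test ran") },
  ],
  {}
);
```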
These passes may include creating an alternate representation like a bytecode. That's where the line between compiling and interpreting gets blurred, as going to bytecode is to some extent compilation.
Major industrial 'interpreters' like Python and Ruby have compilation steps and then interpret a bytecode.
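As a toy illustration of that blur (my own sketch, not CPython's or CRuby's actual design), here's what "compile to bytecode, then interpret it" looks like in miniature; the array below stands in for a compiler's output for the expression (2 + 3) * 4:

```javascript
// "Compiler" output: a flat list of instructions for a stack machine.
const bytecode = [
  ["PUSH", 2],
  ["PUSH", 3],
  ["ADD"],
  ["PUSH", 4],
  ["MUL"],
];

// The "interpreter": a dispatch loop over the bytecode.
function interpret(code) {
  const stack = [];
  for (const [op, arg] of code) {
    switch (op) {
      case "PUSH": stack.push(arg); break;
      case "ADD":  stack.push(stack.pop() + stack.pop()); break;
      case "MUL":  stack.push(stack.pop() * stack.pop()); break;
    }
  }
  return stack.pop();
}

console.log(interpret(bytecode)); // 20
```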
All these terms need to be understood in context, and there's quite a continuum between pure interpreter and static native-code compilation.
> and you can see this if you modify a shell script file on disk while it's running: they will pick up the changes as they keep reading lines while executing.
Better yet: you can stream a script into bash, and each line will be executed as it's received over the pipe.
In theory, one of those `curl ... | bash` stanzas could actually be talking to a little web-app backend that writes lines of script to the HTTP socket one at a time (maybe using chunked transfer encoding, but not necessarily) and waits for those individual lines to be executed. Some of those lines would be expected to make other curl requests back to the same server; those requests in turn change backend state that the original script-streaming session can see; and that state further determines what lines of script get written to the HTTP socket.
In other words, just by piping to bash, you're actually giving whatever's on the other end of the pipe an interactive shell on your computer, even without the script installing any separate command-and-control agent!
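Here's a hypothetical Node sketch of the backend described above (the endpoint names and the exact lines of script are invented for illustration): it sends one line, keeps the response open, and only decides the next line after the mid-script check-in arrives.

```javascript
const http = require("http");

let pending = null; // the open script-streaming response, if any

http.createServer((req, res) => {
  if (req.url === "/install.sh") {
    res.writeHead(200, { "Content-Type": "text/plain" });
    pending = res;
    // bash executes this line as soon as it arrives over the pipe;
    // the response stays open while we wait for the check-in below.
    res.write('curl -s http://localhost:8000/checkin > /dev/null\n');
  } else if (req.url === "/checkin") {
    res.end("ok\n");
    // State changed by the mid-script request determines the next line.
    if (pending) {
      pending.end('echo "sent only after the check-in arrived"\n');
      pending = null;
    }
  }
}).listen(8000);

// Try it with:  curl -s http://localhost:8000/install.sh | bash
```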
I'm not an expert, but as far as I know, the first step is always to build an abstract syntax tree, which is needed to check for syntax errors and to have the program in a usable representation. This means that your first pass always reads all the code and allows handling this case, possibly without getting into the hard work of optimizing, creating the variables, etc.
That, or the early JS interpreters simply couldn't handle that case - I've never heard of this, though.
I don't think creation of a syntax tree is always explicit; I have read the source code of C compilers that make a single pass with back-patching, where the usable representation is bytecode or machine code. One could say the AST is implicit in the code generation, as functions call each other to generate the code. A valid JS interpreter could just do what the statements say, using dictionaries for each scope, linked together to support nested scopes with the right semantics.
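A minimal sketch of that last idea, using nothing but plain objects, with prototype links standing in for the scope chain:

```javascript
// Each scope is a dictionary whose prototype is the enclosing scope,
// so name lookup walks outward through the chain automatically.
const globalScope = Object.create(null);
globalScope.x = "global x";

const enter = (parent) => Object.create(parent); // child links to parent

const fnScope = enter(globalScope);
fnScope.y = "local y";

console.log(fnScope.y);          // "local y"  (found in the inner scope)
console.log(fnScope.x);          // "global x" (found via the scope chain)
console.log("y" in globalScope); // false: inner names don't leak out
```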
It's worth noting that the scoping of names in C is designed to be resolvable in one pass: you can't use a function before it's declared. Most newer languages instead require a full pass to discover all the available names, so that functions don't need a separate declaration before all uses.
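You can see that discovery pass at work in JavaScript itself (a small illustration, not a claim about how C handles it): function declarations are usable before their line is reached, while `const` bindings are not.

```javascript
console.log(declared()); // "hi": the declaration was found in an earlier pass

try {
  expr(); // not yet initialized at this point in execution
} catch (e) {
  console.log(e.name); // "ReferenceError"
}

function declared() { return "hi"; }
const expr = () => "hi";
```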