That's not Node - that's V8. And it's possible to do the same thing for Python - there's nothing magic about JavaScript compared to Python - it's just a lot of engineering work, which is beyond this project's scope. PyPy does it, but not inside standard Python.
I'm well aware of V8 and pypy. I also really like Python as a language, especially with mypy.
It just makes me sad that in a world with multiple high-performance JIT engines (including pypy, for Python itself), the standard Python version that most people use is an interpreter. I know it's largely due to compatibility reasons (C extensions being deeply intertwined with CPython's API).
There is a really important (if not "magic") difference between JavaScript and Python. JS has always (well, since IE added support) been a language with multiple widely-used implementations in the wild, which prevented the emergence of a third-party package ecosystem heavily tied to any one implementation. Python, on the other hand, is for a large proportion of the userbase synonymous with CPython, with alternate implementations being second-class citizens, despite some truly impressive efforts on the latter.
The fact that packages written in JS are not tied to (and don't even work best with) a single implementation is also what made it possible for developers of JS engines to experiment with different implementation approaches, including JIT. While I'm not intimately familiar with writing native extension modules for Node (having dabbled only a little), my understanding is that the API surface is much narrower than Python's, allowing for changes in the engine without breaking the API. But there is also less need for native modules in JS, because all the major engines have a JIT.
> It just makes me sad that in a world with multiple high-performance JIT engines (including pypy, for Python itself), the standard Python version that most people use is an interpreter. I know it's largely due to compatibility reasons (C extensions being deeply intertwined with CPython's API).
This is misleading, if one takes the word "interpreter" to mean that code is represented as syntax-derived trees or other data structures which are then traversed at runtime to produce results. Someone correct me if I'm wrong, but that would apply to well-known interpreted languages like Perl 5. CPython is a bytecode interpreter, not conceptually unlike the Java VM before JITs were added. It just happens to compile scripts to bytecode on the fly.
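You can see that compile-to-bytecode step with the standard `dis` module (a quick sketch, nothing version-specific here):

```python
import dis

def add(a, b):
    return a + b

# CPython compiled the function body to bytecode when the `def`
# statement ran; dis prints the instructions the eval loop will
# step through when the function is called.
dis.dis(add)
```

The output is a listing of opcodes (loads, a binary add, a return), which is exactly what the interpreter loop dispatches on.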
Bytecode is just another data structure that you traverse at runtime to produce results. It's a postfix transformation of the AST. It's still an interpreter.
We don't normally call hardware or firmware implementations an 'interpreter'.
Almost all execution techniques include some combination of compilation and interpretation. Even constructing an AST involves transforming the source code, which we could call compilation. Native compilers sometimes have to interpret metadata to do things like roll forward for deoptimisation.
But most people in the field would describe CPython firmly as an 'interpreter'.
Java is interpreted in many ways, and compiled in many ways; as I said, it's complicated. It's compiled to bytecode, which is interpreted until it's time to be compiled... at which point it's abstract-interpreted into a graph, which is compiled to machine code, until it needs to deoptimise, at which point the metadata from the graph is interpreted again, allowing it to jump back into the original interpreter.
But if it didn't have the JIT it'd always be an interpreter running.
I am not too concerned about the word "interpreter", and more about CPython being called an "interpreted language", which implies it works like Perl 5, or that CPython being an "interpreter" is somehow a problem. Its normal mode of operation works more like pre-JIT Java, with interpreted bytecode from .pyc files.
Most people don’t make this distinction, and would just say ‘interpreter’. Interpreting bytecode vs an AST is a pretty minor difference: it’s exactly the same data in a slightly different format. The ‘compilation’ is just a post-order linearisation, and whether or not it’s stored in files matters even less.
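To make the "same data, different format" point concrete, here's a quick sketch comparing the AST and the bytecode for one expression (standard `ast` and `dis` modules only):

```python
import ast
import dis

src = "a + b * c"

# The compiler's view: a tree, with the multiply nested under the add.
print(ast.dump(ast.parse(src, mode="eval")))

# The interpreter's view: the same tree flattened post-order --
# load a, load b, load c, multiply, then add.
dis.dis(compile(src, "<demo>", "eval"))
```

The bytecode listing is a post-order walk of the tree printed above, which is the "linearisation" being discussed.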
As I'm sure you're aware, bytecode interpretation typically implies a better-performing model than AST interpretation, and compiling to bytecode creates a lot of optimisation opportunities that are not typically feasible when working with an AST directly. Of course it's all bits and anything is possible, but it's assumed to be the better approach in a generally non-subtle way.
To clarify my comment, I did mean bytecode interpreter.
This is a common implementation approach: parse the source to generate an AST, transform the AST to bytecode, then interpret the bytecode. It's still interpretation, and it's slow. Contrast with JIT engines, which transform the intermediate code (whether that's AST or bytecode) to machine code, which is fast.
I think it's more that CPython is so slow that a lot of things people use are implemented via the C API, and many optimisations would break a bunch of them. If everything were pure Python the situation would be different.
Nearly everything (or is it everything?) in memory can be modified at runtime. There are no real constants for example. The whole stack top to bottom can be monkeypatched on a whim.
This means nothing is guaranteed and so every instruction must do multiple checks to make sure data structures are what is expected at the current moment.
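A quick sketch of what that looks like in practice - even a method lookup can change meaning between two calls, so nothing can be cached without a re-check (class and method names here are made up for illustration):

```python
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def norm(self):
        return (self.x ** 2 + self.y ** 2) ** 0.5

p = Point(3, 4)
print(p.norm())          # 5.0

# Any code, anywhere, can swap the method out at runtime...
Point.norm = lambda self: 0.0
print(p.norm())          # 0.0

# ...so the interpreter can't assume "norm is that function"
# without a guard that re-validates the assumption on each call.
```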
This is true of JS as well, but to a lesser extent.
Aren't all the things you mentioned already fixed by deoptimisation?
You assume constants cannot be modified, and then make the code that wants to modify a constant do the work of stopping everyone who is assuming the old value and notifying them that they need to pick up the new one?
> To deoptimize means to jump from more optimised code to less optimized code. In practice that usually means to jump from just-in-time compiled machine code back into an interpreter. If we can do this at any point, and if we can perfectly restore the entire state of the interpreter, then we can start to throw away those checks in our optimized code, and instead we can deoptimize when the check would fail.
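Here's a toy sketch of that guard-plus-deoptimisation idea in plain Python (all names are hypothetical; real engines do this in machine code, not like this):

```python
def generic_add(a, b):
    # Slow, fully general path: works for any objects supporting +.
    return a + b

def specialised_add(a, b):
    # "Compiled" fast path that assumed both operands are ints.
    if type(a) is int and type(b) is int:   # the guard
        return a + b                        # assumptions hold: fast case
    # Guard failed: "deoptimise" back to the general path.
    return generic_add(a, b)

print(specialised_add(1, 2))        # 3   (fast path)
print(specialised_add("x", "y"))    # xy  (deoptimised to generic path)
```

The point of the quoted comment is that once you can always fall back safely, the optimised path can drop most of its per-instruction checks and keep only the guards.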
No such thing as a constant in Python. You can optionally name a variable in uppercase to signal to others that it should be, but that's about it.
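For example, even `typing.Final` is only a hint to checkers like mypy; the runtime doesn't care (a minimal sketch):

```python
from typing import Final

MAX_RETRIES: Final = 3   # mypy will flag reassigning this name...
MAX_RETRIES = 10         # ...but at runtime nothing stops it
print(MAX_RETRIES)       # 10
```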
You can write a new compiler if you'd like, as detailed on this page. But CPython doesn't work that way and 99% of the ecosystem is targeted there.
There is some work on making more assumptions as it runs, now that the project has funding. This is about where my off-the-top-of-my-head knowledge ends, however, so someone else will want to chime in here. The HN search probably has a few blog posts and discussions as well.
No, we know how to optimise all these issues. They're solved, through a combination of online profiling, inline caching, splitting, deoptimisation, scalar replacement, etc. (I wrote a PhD on it.) I don't think you could name a single Python language feature that we don't know how to optimise efficiently. (I'd be interested if you could.) But implementing them all is a difficult engineering challenge, even for Google, mainly because it involves storing a lot of state in a system that isn't designed to have state attached everywhere.
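As an illustration of one of those techniques, here's a toy monomorphic inline cache for attribute lookup, sketched in plain Python (all names invented; real JITs implement this in machine code):

```python
class InlineCache:
    """Caches the result of one class-level attribute lookup,
    keyed on the type seen at the call site last time."""

    def __init__(self, attr):
        self.attr = attr
        self.cached_type = None
        self.cached_value = None

    def lookup(self, obj):
        # Fast path: same type as last time, so reuse the cached
        # result of the (comparatively expensive) class lookup.
        if type(obj) is self.cached_type:
            return self.cached_value
        # Cache miss: do the full lookup and repopulate the cache.
        self.cached_type = type(obj)
        self.cached_value = getattr(type(obj), self.attr)
        return self.cached_value

class Dog:
    def speak(self):
        return "woof"

cache = InlineCache("speak")
d = Dog()
print(cache.lookup(d)(d))   # woof  (miss, then cached)
print(cache.lookup(d)(d))   # woof  (hit: no class lookup needed)
```

The "type is the same as last time" test is exactly the kind of guard that deoptimisation lets an engine remove from the hot path.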
Yes, that’s what my reply means, your “no…” is poor communication style. If you think you can do better than the folks working on it for a decade plus, by all means step up.
I'll give you one you could have used: the GIL. However, I'm not sure the GIL's semantics are really specified for Python; they're an implementation detail that people have accidentally relied on.