That's not Node - that's V8. And it's possible to do the same thing for Python - there's nothing magic about JavaScript compared to Python - it's just a lot of engineering work, which is beyond this project's scope. PyPy does it, but not inside standard Python.
I'm well aware of V8 and pypy. I also really like Python as a language, especially with mypy.
It just makes me sad that in a world with multiple high-performance JIT engines (including pypy, for Python itself), the standard Python version that most people use is an interpreter. I know it's largely due to compatibility reasons (C extensions being deeply intertwined with CPython's API).
There is a really important (if not "magic") difference between JavaScript and Python. JS has always (well, since IE added support) been a language with multiple widely-used implementations in the wild, which prevented the emergence of a third-party package ecosystem heavily tied to any one implementation. Python, on the other hand, is for a large proportion of the userbase synonymous with CPython, with alternate implementations being second-class citizens, despite some truly impressive efforts on the latter.
The fact that packages written in JS are not tied to (and don't even work best with) a single implementation is also what made it possible for developers of JS engines to experiment with different implementation approaches, including JIT. While I'm not intimately familiar with writing native extension modules for Node (having dabbled only a little), my understanding is that the API surface is much narrower than Python's, allowing for changes in the engine without breaking the API. But there is also less need for native modules in JS, because all the major engines have a JIT.
> It just makes me sad that in a world with multiple high-performance JIT engines (including pypy, for Python itself), the standard Python version that most people use is an interpreter. I know it's largely due to compatibility reasons (C extensions being deeply intertwined with CPython's API).
This is misleading, if one takes the word "interpreter" to mean that code is represented as syntax-derived trees or other data structures which are then traversed at runtime to produce results. Someone correct me if I'm wrong, but that would apply to well-known interpreted languages like Perl 5. CPython is a bytecode interpreter, not conceptually unlike the Java VM before JITs were added. It just happens to compile scripts to bytecode on the fly.
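You can see that compile-to-bytecode step with the standard `dis` module (a quick sketch, nothing version-specific here):

```python
import dis

def add(a, b):
    return a + b

# CPython compiled the function body to bytecode when the `def`
# statement ran; dis prints the instructions the eval loop will
# step through when the function is called.
dis.dis(add)
```

The output is a listing of opcodes (loads, a binary add, a return), which is exactly what the interpreter loop dispatches on.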
Bytecode is just another data structure that you traverse at runtime to produce results. It's a postfix transformation of the AST. It's still an interpreter.
We don't normally call hardware or firmware implementations an 'interpreter'.
Almost all execution techniques include some combination of compilation and interpretation. Even constructing an AST involves transforming the source code, which we could call compilation. Native compilers sometimes have to interpret metadata to do things like roll forward for deoptimisation.
But most people in the field would describe CPython firmly as an 'interpreter'.
Java is interpreted in many ways, and compiled in many ways; as I said, it's complicated. It's compiled to bytecode, which is interpreted until it's time to be compiled... at which point it's abstract-interpreted into a graph, which is compiled to machine code, until it needs to deoptimise, at which point the metadata from the graph is interpreted again, allowing it to jump back into the original interpreter.
But if it didn't have the JIT it'd always be an interpreter running.
I am not too concerned about the word "interpreter", and more about CPython being called an "interpreted language", which implies it works like Perl 5, or that CPython being an "interpreter" is somehow a problem. Its normal mode of operation works more like pre-JIT Java, with interpreted bytecode from .pyc files.
Most people don’t make this distinction, and would just say ‘interpreter’. Interpreting bytecode vs an AST is a pretty minor difference: it’s exactly the same data in a slightly different format. The ‘compilation’ is just a post-order linearisation, and whether or not it’s stored in files matters even less.
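To make the "same data, different format" point concrete, here's a quick sketch comparing the AST and the bytecode for one expression (standard `ast` and `dis` modules only):

```python
import ast
import dis

src = "a + b * c"

# The compiler's view: a tree, with the multiply nested under the add.
print(ast.dump(ast.parse(src, mode="eval")))

# The interpreter's view: the same tree flattened post-order --
# load a, load b, load c, multiply, then add.
dis.dis(compile(src, "<demo>", "eval"))
```

The bytecode listing is a post-order walk of the tree printed above, which is the "linearisation" being discussed.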
As I'm sure you're aware, bytecode interpretation typically implies a better-performing model than AST interpretation, and compiling to bytecode creates a lot of optimisation opportunities that are not typically feasible when working with an AST directly. Of course it's all bits and anything is possible, but it's assumed to be the better approach in a generally non-subtle way.
To clarify my comment, I did mean bytecode interpreter.
This is a common implementation approach: parse the source to generate an AST, transform the AST to bytecode, then interpret the bytecode. It's still interpretation, and it's slow. Contrast with JIT engines, which transform the intermediate code (whether that's AST or bytecode) to machine code, which is fast.
I think it's more that CPython is so slow that a lot of things people use are implemented via the C API, and many optimisations would break a bunch of them. If everything were pure Python the situation would be different.
Nearly everything (or is it everything?) in memory can be modified at runtime. There are no real constants for example. The whole stack top to bottom can be monkeypatched on a whim.
This means nothing is guaranteed and so every instruction must do multiple checks to make sure data structures are what is expected at the current moment.
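A quick sketch of what that looks like in practice - even a method lookup can change meaning between two calls, so nothing can be cached without a re-check (class and method names here are made up for illustration):

```python
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def norm(self):
        return (self.x ** 2 + self.y ** 2) ** 0.5

p = Point(3, 4)
print(p.norm())          # 5.0

# Any code, anywhere, can swap the method out at runtime...
Point.norm = lambda self: 0.0
print(p.norm())          # 0.0

# ...so the interpreter can't assume "norm is that function"
# without a guard that re-validates the assumption on each call.
```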
This is true of JS as well, but to a lesser extent.
Aren't all the things you mentioned already fixed by deoptimisation?
You assume constants cannot be modified, and then make the code that wants to modify a constant do the work of stopping everyone who is assuming the old value and notifying them that they need to pick up the new one?
> To deoptimize means to jump from more optimised code to less optimized code. In practice that usually means to jump from just-in-time compiled machine code back into an interpreter. If we can do this at any point, and if we can perfectly restore the entire state of the interpreter, then we can start to throw away those checks in our optimized code, and instead we can deoptimize when the check would fail.
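Here's a toy sketch of that guard-plus-deoptimisation idea in plain Python (all names are hypothetical; real engines do this in machine code, not like this):

```python
def generic_add(a, b):
    # Slow, fully general path: works for any objects supporting +.
    return a + b

def specialised_add(a, b):
    # "Compiled" fast path that assumed both operands are ints.
    if type(a) is int and type(b) is int:   # the guard
        return a + b                        # assumptions hold: fast case
    # Guard failed: "deoptimise" back to the general path.
    return generic_add(a, b)

print(specialised_add(1, 2))        # 3   (fast path)
print(specialised_add("x", "y"))    # xy  (deoptimised to generic path)
```

The point of the quoted comment is that once you can always fall back safely, the optimised path can drop most of its per-instruction checks and keep only the guards.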
No such thing as a constant in Python. You can optionally name a variable in uppercase to signal to others that it should be, but that's about it.
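For example, even `typing.Final` is only a hint to checkers like mypy; the runtime doesn't care (a minimal sketch):

```python
from typing import Final

MAX_RETRIES: Final = 3   # mypy will flag reassigning this name...
MAX_RETRIES = 10         # ...but at runtime nothing stops it
print(MAX_RETRIES)       # 10
```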
You can write a new compiler if you'd like, as detailed on this page. But CPython doesn't work that way and 99% of the ecosystem is targeted there.
There is some work on making more assumptions as it runs, now that the project has funding. This is about where my off-the-top-of-my-head knowledge ends, however, so someone else will want to chime in here. The HN search probably has a few blog posts and discussions as well.
No, we know how to optimise all these issues. They're solved, through a combination of online profiling, inline caching, splitting, deoptimisation, scalar replacement, etc. (I wrote a PhD on it.) I don't think you could name a single Python language feature that we don't know how to optimise efficiently. (I'd be interested if you could.) But implementing them all is a difficult engineering challenge, even for Google, mainly because it involves storing a lot of state in a system that isn't designed to have state attached everywhere.
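As an illustration of one of those techniques, here's a toy monomorphic inline cache for attribute lookup, sketched in plain Python (all names invented; real JITs implement this in machine code):

```python
class InlineCache:
    """Caches the result of one class-level attribute lookup,
    keyed on the type seen at the call site last time."""

    def __init__(self, attr):
        self.attr = attr
        self.cached_type = None
        self.cached_value = None

    def lookup(self, obj):
        # Fast path: same type as last time, so reuse the cached
        # result of the (comparatively expensive) class lookup.
        if type(obj) is self.cached_type:
            return self.cached_value
        # Cache miss: do the full lookup and repopulate the cache.
        self.cached_type = type(obj)
        self.cached_value = getattr(type(obj), self.attr)
        return self.cached_value

class Dog:
    def speak(self):
        return "woof"

cache = InlineCache("speak")
d = Dog()
print(cache.lookup(d)(d))   # woof  (miss, then cached)
print(cache.lookup(d)(d))   # woof  (hit: no class lookup needed)
```

The "type is the same as last time" test is exactly the kind of guard that deoptimisation lets an engine remove from the hot path.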
Yes, that’s what my reply means, your “no…” is poor communication style. If you think you can do better than the folks working on it for a decade plus, by all means step up.
I'll give you one you could have used: the GIL. However, I'm not sure the GIL's semantics are really specified for Python; they're an implementation detail that people have accidentally relied on.