Nuitka Progress in 2015 – Python Compiler (nuitka.net)
166 points by rachbelaid on Jan 29, 2016 | 52 comments



IMHO, this is a very important project, as it makes up for one of the biggest shortcomings in the Python ecosystem - distribution of software. Distribution is clearly one of the reasons that Go is so popular. It would be great if we could take advantage of Python as a language and as an ecosystem while still being able to deploy as if it were a walk in the park.


I find the way software is packaged in Go to be a terrible regression, but I'm more interested in why you think pip and venvs don't solve the problem for Python already?

Edit: I see from your reply that we're talking at cross purposes. I thought you meant source distribution, but you mean binary distribution to end users or deployment to production systems, and in that case I agree with you.


Being able to ship a self contained binary of your application is a very powerful concept, on which many seem to agree.

The way I see it, pip and virtualenv are not practical for deployment or distribution. You shouldn't have to download and install things during a production deployment. I even created a tool (https://github.com/objectified/vdist) to mitigate this problem, but it will always be a hack when doing it this way.


Remember that critical Go security update?

Usual procedure - update the shared library, restart affected services. Go - recompile everything.


Yeah, but... that's not really the "usual procedure" though. Nobody who knows what they are doing literally downloads openssl manually, compiles the new shared library, manually installs it, and manually restarts the affected services, on the grounds that if you do that you have just proved you don't know what you are doing. (Most charitably, you're doing a "Linux From Scratch" for educational purposes, but that's just about the only valid reason.) Once you have introduced package management and/or system management tools, it doesn't seem like a significantly different problem anymore, and likely to be utterly swamped by the other bigger problems that appear at scale.


No, you update the shared library using your distribution's package manager and then restart the affected services using one of the small scripts for that purpose.

What's the issue with that?


Yeah that's the usual refrain, but the only situation where you have shared libraries is on Linux with a package manager. In that case it is trivial to recompile all packages that depend on the insecure library anyway.

On Windows you have to bundle most shared libraries with your app anyway, so you have to ship a new version of the app regardless.


Sure but that's one of the reasons I use Linux.


> update the shared library, restart affected services

... look for programs to break at runtime because of some unrelated API change in the shared library.


That's why Debian Stable and RHEL exist. Security patches don't break the API.


I am willing to believe that they come closer to the ideal than others, but nobody is perfect, and I'd rather discover incompatibilities myself, when I'm getting a new version of my app ready, before shipping, instead of trying to understand why random users are incoherently reporting impossible app failures.


> Security patches don't break the API.

shouldn't

When the patched library is not part of Debian Stable or RHEL's repositories (for example, if you require features from a release less than a year old), all bets on API stability are off.

OpenSSL and libc are not the only libraries which are patched for security that people use.


And heaven help you if RedHat decides not to backport a critical bugfix. OpenSSL on CentOS 6 has 99 patch files, a script named "hobble-openssl" and non-trivial changes to the build system that affect linkage, making DIY backports less than trivial.


Which only needs to happen once.


Yes but still. If you have a procedure in place to recompile and redeploy everything, you could just deploy the libraries as well.


Plus, the system admin doesn't just have to wait for the new security-patched library to be ready; they have to wait for everyone who used Go to recompile and distribute their programs.


So you're redeploying one binary instead of another binary. You still need to deploy something.

In fact, what about just switching connections to freshly launched VMs?


> The way I see it, pip and virtualenv are not practical for deployment or distribution. You shouldn't have to download and install things during a production deployment. I even created a tool (https://github.com/objectified/vdist) to mitigate this problem, but it will always be a hack when doing it this way.

Please excuse my newbness but doesn't python wheel do most of this (besides compiling to a single package)?


Not under Linux, but it's coming. Although they have no way to include big dependencies like Qt and the like.


Why not just follow the Erlang/OTP practice of creating a "release" which contains everything you need, including the runtime system, deps, and your code into a self-contained tarball?


Pip and virtualenv solve the problem of creating and populating isolated runtimes for things you control. They don't really address distribution to end users, i.e., those who use OS packaging or app stores.

Distribution of python "binaries" is a real pain though. Twitter's PEX (https://engineering.twitter.com/university/videos/wtf-is-pex) is the best, in my opinion. (It's basically a packaged venv, so in that regard, I guess I agree.) It's still problematic and requires a writable filesystem just to run.

I build OS packages for python programs using PEX, and it's okay, but it's probably my least favorite distribution mechanism.


I found it quite hard to produce a Windows Installer that my users can just run that installs my program plus all the libraries it depends on. If I recall correctly, it was hard to bundle OpenCV and other stuff that relied on native code.


Why is Go packaging terrible? Because we are going back to static binaries?

I write and ship Go and Python code every day, and I've found that distribution is a great overall benefit - one that some developers, for some reason, seem to ignore.

It's not just convenient for me, but for various other parts of a project as well: fewer moving parts are welcomed by ops, and fast iterations help to meet the requirements. Overall a very positive impact for a reasonable price tag: larger binaries, and recompilation overhead when a library in use needs a critical fix.


It's great when you control all the Go code and can deploy a fixed version anytime. It sucks when there is a critical vulnerability in libc, all your Go-based binaries were compiled against it, and you are waiting for the vendors to issue patches instead of just being able to upgrade libc.so.


The default Go compiler, gc, doesn't link to libc at all on Linux, but directly uses system calls, whose interface is stable. It does link (dynamically, I think) against libc on OS X and Windows, since the system call interface is not stable there.


Last time I needed single-executable distribution with Python (admittedly a while ago), there were easy-to-use "freeze" tools for this that worked really well. A cursory web search suggests that cx_Freeze is the current popular tool for it. Has the situation technically deteriorated, or are people unaware of these Python tools now?


PyInstaller[0] is working really well. I am using it to deliver Python applications in big-company setups as a single ".exe" without the need to install anything on the target machine.

The only drawback is that you have the decompression time when starting the software, but in my case, the customers have not noticed it yet.

Edit: The advantage of Nuitka is that, since you compile your code to C++, you make it faster and also protect it against easy reverse engineering. It could be interesting to compile the critical parts of your application with Nuitka and pack the rest with PyInstaller.

[0]: http://www.pyinstaller.org/
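
For reference, PyInstaller can also be driven from Python code rather than the CLI; a minimal sketch (the script name main.py and the app name are just illustrative):

    import PyInstaller.__main__

    # Equivalent to running: pyinstaller --onefile --name myapp main.py
    PyInstaller.__main__.run([
        "--onefile",        # bundle everything into one self-extracting executable
        "--name", "myapp",  # name of the resulting binary
        "main.py",          # entry-point script
    ])

The --onefile mode is also what causes the decompression delay at startup mentioned above: the bundle unpacks itself to a temporary directory before running.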


cx_Freeze is still very effective for distribution - I use it all the time to produce a standalone version of my software and then I use Inno Setup to create an installer package for my customers.
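
For anyone who hasn't used it, a minimal cx_Freeze setup.py looks roughly like this (all names are illustrative):

    from cx_Freeze import setup, Executable

    setup(
        name="myapp",
        version="0.1",
        description="Standalone build of my app",
        # entry script; pass base="Win32GUI" to Executable for windowed apps on Windows
        executables=[Executable("main.py")],
    )

Running python setup.py build then drops the frozen application under build/.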

As Loic states, Nuitka works differently from cx_Freeze in that Nuitka takes your Python source, compiles it to C++, then compiles the C++, whereas cx_Freeze creates Python bytecode which is subsequently run by the included Python interpreter.

The result should be that the same Python code runs much faster in the form created by Nuitka than in the form created by cx_Freeze.


> The result should be that the same Python code runs much faster in the form created by Nuitka

How do you figure? Python's slowness is not due to its lack of compilation; it's due to its dynamic nature and all of the runtime lookups.

Past efforts to compile Python down to native code have not resulted in speedups. Unladen Swallow is one such failed example. PyPy gets around this by analyzing the actual running code, and is only able to speed up a subset of all of Python.
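
To make the dynamism point concrete, here's a toy illustration of why an ahead-of-time compiler can't simply bind these lookups statically:

    import math

    def area(r):
        # "math.pi" is a fresh attribute lookup on every call; a compiler
        # can't assume it still means what it meant at compile time
        return math.pi * r * r

    math.pi = 3      # perfectly legal in Python, at any point
    print(area(1))   # prints 3, so the lookup can't be constant-folded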


I think you misunderstand what Nuitka is doing...

It is not compiling down to Python bytecode.

It is compiling to operationally equivalent C++ source code.

Then it is compiling that C++ code to executable object code.


Nuitka is already faster than CPython, and it seems the majority of the speed work has yet to be done.


+1. I'll donate to the guy, and if you are using Python professionally, you should consider doing so too, and invite your company to follow.


> Distribution is clearly one of the reasons that Go is so popular.

By those that never used an AOT compiler to native code.

Other than that, given that Lisp and Dylan have always had AOT compilers, it would be nice if Python eventually had something similar.


> The stable release has full support for Python 3.5, including the new async and await functions. So recent releases can pronounce it as fully supported which was quite a feat.

Wow. That is amazing! Pyston only supports Python 2 and PyPy is working on it.

Nuitka looks very promising.


This looks very similar to Cython, which can also compile .py files without modification and is more mature at the moment. It seems that Nuitka wants to do a few things differently, though. The biggest difference is probably that Nuitka wants to use type inference + hints instead of explicit declarations, which make Cython code incompatible with CPython but give you more control and C interoperability.

Edit: Thanks, no more interference!


I've never heard about type inTERference, but I'm sure that if something like that existed, it would have little to do with type inference which you probably meant :)

interference - obstruction, collision

inference - deduction, derivation


I imagine that type interference might be what an advocate of dynamically typed languages sees in statically typed languages.


No. Actually Nuitka can take your code "as-is". It doesn't need any work on your part at all. And it embeds dependencies as well, without any change needed for them either. That's the game changer: take your Python program, call Nuitka, get a standalone exe.
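
Roughly (from memory, so flag names may vary by version):

    python -m nuitka --standalone main.py

where --standalone pulls in the imported modules along with the runtime, so the result runs on machines without Python installed.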


Cython supports type inference, in the form of `infer_types` which can be used either as a compiler directive or a decorator (which means you can pick and choose, and control, when and where type inference is allowed to occur).
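
Roughly like this, in Cython's pure-Python mode (a sketch based on the documented decorator form):

    import cython

    @cython.infer_types(True)  # enable type inference inside this function
    def total(n):
        s = 0                  # inferred as a C integer when compiled
        for i in range(n):     # the loop variable gets a C type too
            s += i
        return s

Unlike cdef declarations, this stays valid CPython: the cython module ships pure-Python shims, so the decorator is a no-op when the file isn't compiled.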


It's been great watching Nuitka progress over the past couple of years.

I suggested the tl;dr section to Kay in the Overview ;)

I've also donated to the guy. I'd help out in the development, but to be honest a lot of what he's doing goes way over my head and I don't have the time to sit down and try to understand it all.

I encourage others better than me to help in Nuitka's development :)


> SSA (Single State Assignment Form)

Is this something different from static single assignment? I'm assuming not, but I've never heard SSA expanded in this form.


You are right, this should be static single assignment.
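
For anyone unfamiliar: in SSA form every variable is assigned exactly once, which makes data flow explicit. Conceptually (an illustration, not Nuitka's actual intermediate representation):

    # original            # SSA form
    x = 1                 # x1 = 1
    x = x + 2             # x2 = x1 + 2
    y = x * 3             # y1 = x2 * 3

Since x2 can only ever be 3, facts like that become trivial for the optimizer to derive.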


This looks pretty cool - know of any production-grade software using it? I remember reading about something like this being used for Docker Compose; I can't recall exactly what it was.


It's fantastic to see someone taking on the hard problems, things that take years to solve, especially if you're on your own. This is the essence of Open Source - take your time to do something right. Great stuff, good luck with it, we need more grand undertakings like this!

(This comment is mainly based on this talk: https://www.youtube.com/watch?v=a8RRbT4BTEw)


I believe this project is filling the single largest gap that Python has for development of widely-distributed desktop applications. This is pure awesomeness.


Looking at their future plans, can we say they are aiming to build something like RPython, but supporting the full Python language instead of a subset?


Trying to understand this project. Is this more than a python compiler?

Edit: Ok, now I see that the "Overview" button is hidden on mobile.


So, from my reading of the "Overview" page, it compiles Python to C++ and then uses native types as much as possible to make things run really fast.

http://nuitka.net/pages/overview.html


The Requirements section of the user manual says that in addition to a C++ compiler, you need Python2 at compile time even if you're using Python3, due to a requirement of the Scons tool.


I wonder how the performance of this compares to ZiPy.


Does anyone know if this works with numpy?


It works with numpy. (I tried it and it seems to be 100% compatible; I haven't noticed any differences.)



