Do people here use pypy in production? What are the benefits?

myusernameisok · on Aug 14, 2017

I tried. My company has a python API that we run on our machines, we sell the machines to businesses and don't manage them ourselves. We wanted to see if we could get some easy performance increases without too much investment.

At the time (a year ago) there wasn't a way to precompile using pypy, which meant shipping pypy along with gcc and a bunch of development headers for JIT-ing. Additionally a one of the extensions we used for request validation wasn't supported so we'd be forced to rewrite it. I also found that the warmup time was too much for my liking, it was several times longer than CPython's and it became a nuisance for development. I guess I could've pre-warmed it up automatically, but at that point I had better things to worry about and abandoned trying to switch.

I'm sure, given enough resources, it would be a lot better. But it's not quite as simple as switching over and realizing the performance increases without some initial investment.

sillysaurus3 · on Aug 14, 2017

Sure. Free 2-5x speedup. pypy + pypy's pip generally works as a transparent drop-in replacement to python + python's pip, so it's free speed.

It doesn't (or didn't) work when you need to rely on an extension that uses Python's C API. I haven't followed the scene in awhile so maybe that's changed. pypy's pip has so many libraries that I hardly notice, so maybe they solved that.

Unfortunately python is fundamentally slower than lua or JS, possibly due to the object model. Python traps all method calls, but even integer addition, comparisons, and so on are treated as metamethods. That's the case for Lua too, but e.g. it's absurdly easy to make a Python object have a custom length, whereas Lua didn't have a __len__ metamethod until after 5.1. I'm not sure it even works on LuaJIT either. Probably in the newer versions.

JulianWasTaken · on Aug 14, 2017

I can't tell what you mean by the last paragraph there, but oftentimes PyPy's speedups come exactly from inlining stuff like what you refer to there -- Python's not fundamentally slower, it's those kinds of stuff that you can speed up.

(And yeah the CPython API is still a pain point if you've got a library that uses it, although some stuff will still work using PyPy's emulation layer. It'd be great if people stopped using it though.)

sillysaurus3 · on Aug 14, 2017

For example, Python makes it fairly easy to trap a call to a missing method, both via __getattr__ and __missing__. In JS the only way you can do that is via Proxy objects, and even those have limits.

You can't always inline the arithmetic ops effectively. You can recompile the method each time it's called with different types, but that's why the warmup time is an issue. This wouldn't be a problem if Python didn't make it so trivial to overload arithmetic. JS doesn't.

JulianWasTaken · on Aug 14, 2017

Ah! Yes, agreed, Python does certainly make it too easy to do things that cannot reasonably be sped up.

sillysaurus3 · on Aug 14, 2017

Twist: Lua makes it trivial to overload arithmetic using metatables, but LuaJIT seems to have solved that. If there is any warmup time, it's hard to tell. Mike Pall is a JIT god, and I wish we had more insight into everything that went into producing one of the best JIT's of all time.

I'd love a comment/post that highlights the differences between JS and Lua as the reason why LuaJIT was able be so effective. There must be differences that make Lua possible to speed up so much. There are easy ones to think of, but the details matter a lot.

EDIT: I found some discussion at https://news.ycombinator.com/item?id=1188246 but it left me wanting more.

http://article.gmane.org/gmane.comp.lang.lua.general/58908

http://lua-users.org/lists/lua-l/2010-03/msg00305.html

sillysaurus3 · on Aug 14, 2017

More: https://www.reddit.com/r/programming/comments/1r2s82/lua_fun...

dr_zoidberg · on Aug 14, 2017

I tried in digital forensics. Depends on the project. May get up to 5x speedup in the software that runs, after a lot (a loooooooooot) of complaining by it. Many proejcts didn't manage to run though. In the end, not truly significant speedup (the bottleneck tends to lie somewhere else) for the effort that is required to get everything to work.

PS: I do realize "digital forensics" is probably not the kind of "production environment" you were thinking. Just a small datapoint about a particular branch of software that, while getting good speedups, may not benefit as much as the "X times faster" line would suggest.

DonbunEf7 · on Aug 14, 2017

Switched from CPython+Numpy to PyPy years and years ago, got a 60x speedup on a core numerical kernel and 20x speedup on real-world benchmarks. The codebase was a multiplayer game server. Less memory usage overall, leading to a big improvement in the number of players that could be connected.

You have to not have problematic libraries in your system, but honestly they're all either shitty on CPython too (literally every GUI toolkit that is not Tkinter!) or they're stuff like lxml, where the author/maintainer just has an anti-PyPy bias that they won't drop.

JulianWasTaken · on Aug 14, 2017

We've been running a very large production PyPy deployment across pretty much all our Python apps for about... 4 years now. Saves us a ton of money for essentially no real downside.

makmanalp · on Aug 14, 2017

Just out of curiousity, would you be willing to answer a few more questions? What has the memory tradeoff been like? What is the workload you're using it for?

JulianWasTaken · on Aug 14, 2017

Certainly! It's a bit hard to answer some of those questions because it's been so long since we've run CPython, and also because we've now got ~10 apps or so that run on PyPy.

Initially memory tradeoff was definitely significant, somewhere around 40% or so -- it's going to vary across applications though certainly, and in a lot of cases I'm a bit happy our memory usage went up because it forces us more towards "nicer" architectures where data and logic are cleanly separated.

Not that I mean to apologize too much for it, it's something certainly to watch, but for us on our most widely deployed low-latency, high-throughput app, we traded about 40% speedup for 40% RAM on an app that does very little true CPU-bound tasks (it's an s2s webapp where per-request we essentially are doing some JSON parsing, pulling some fields out, building some data structures, maybe calling a database or two, and assembling a response to serialize ~500 times/sec/CPU core).

On more CPU-bound workflows, like one we have that essentially just computes set memberships at 100% resource usage all day long, we saw multiplicative increases, and I can't even mention how much exactly, because the speedup was so great that we couldn't run it in our data center because it started using up all our bandwidth, so I only have numbers for once it was moved into AWS and onto different machines :).

Happy to elaborate more, as you can tell, I think companies with performance-sensitive workloads need to be looking at PyPy, so always happy to talk about our experiences.

sandGorgon · on Aug 14, 2017

are you using any math packages like Numpy/Pandas or Opencv ?

JulianWasTaken · on Aug 14, 2017

Not in any production workloads.

They do work these days in PyPy though, so I'd feel comfortable doing so if we did, although I'd probably feel just as comfortable writing whatever numerics in pure-Python too unless it was stuff that already existed easily elsewhere.

On a personal note I've played with OpenCV as well (and done so with PyPy to do some real-time facial analysis on a video stream), but yeah also not for $PRODUCTION_WORK.

sanxiyn · on Aug 14, 2017

We do. 2x speedup.