An Inside Look at the (Python) GIL Removal Patch of Lore (dabeaz.blogspot.com)
90 points by mace on Aug 12, 2011 | hide | past | favorite | 17 comments



I think this patch could certainly do with revisiting:

1) Reference counting used atomic operations on Windows but a full pthread mutex on Linux; if that was necessary back when the patch was written, it's definitely not necessary today.

2) Garbage collection could eliminate the reference counting problem entirely.

3) There's been a lot of work on immutable data structures that could address the dictionary locking issue.

I do wonder if this is really a deeper issue - programming isn't nearly so simple once you have multiple threads concurrently modifying your data structures and you have to start locking etc. A lot of the appeal of Python is that it's so simple, and I think threads and locks would greatly affect that.


Yeah, a mutex to protect refcounts? That's ridiculous. (If you want low-overhead locks, David Bacon's thin locks (http://www.research.ibm.com/people/d/dfb/thinlocks-publicati...) are a good place to start.)

I also question the premise offered here that data structures must be thread-safe. It's not something guaranteed by most languages (e.g. Java), so why should Python guarantee it?


"all datastructures would have to be thread-safe is pretty common argument used against usage of native threads (as opposed to green threads) or removal of global locks in various VMs. While it's true that they don't have to be thread-safe from user's point of view, they should be thread-safe in the sense that they cannot get corrupted by race condition in user code in a way that could crash (or deadlock) the VM.


Every Python object has a backing dictionary, so if any object has to be shared between threads in a thread-safe fashion, then dictionaries need to be thread-safe.
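To make the point concrete, here's a minimal demonstration that ordinary attribute access on a Python object really is just a dict operation underneath, which is why concurrent attribute writes become concurrent dict writes:

```python
# An object's attributes live in its __dict__; attribute access
# and dict access are two views of the same storage.
class Point:
    pass

p = Point()
p.x = 1
assert p.__dict__ == {"x": 1}   # attribute writes are dict updates

p.__dict__["y"] = 2
assert p.y == 2                 # and dict updates are attribute writes
```

(This is the simple case; objects using __slots__ or C-level storage are exceptions, but the common case is dict-backed.)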


This is tricky because dictionaries need to call __hash__ which can run arbitrary Python code, so it's non-trivial to make the dictionary threadsafe.


It's perfectly possible and even reasonable to call __hash__ before acquiring a lock on the dictionary (and to cache keys' hashes for data already in the dictionary).

On the other hand, when almost every object is backed by a dictionary, the locking has to be very fast, which seems to me like an almost unsolvable problem.
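A minimal sketch of the hash-before-lock idea, assuming a simple bucket layout (the class and field names here are hypothetical, not from CPython):

```python
import threading

class LockedDict:
    """Compute the key's hash (which may run arbitrary __hash__ code)
    *before* taking the lock, and bucket entries by that cached hash.
    Note that __eq__ still runs under the lock here, which is exactly
    the remaining hole pointed out above."""

    def __init__(self):
        self._lock = threading.Lock()
        self._buckets = {}  # cached hash -> list of (key, value) pairs

    def __setitem__(self, key, value):
        h = hash(key)                  # user __hash__ runs outside the lock
        with self._lock:
            pairs = self._buckets.setdefault(h, [])
            for i, (k, _) in enumerate(pairs):
                if k is key or k == key:   # __eq__ under the lock
                    pairs[i] = (key, value)
                    return
            pairs.append((key, value))

    def __getitem__(self, key):
        h = hash(key)
        with self._lock:
            for k, v in self._buckets.get(h, ()):
                if k is key or k == key:
                    return v
        raise KeyError(key)
```

This keeps arbitrary __hash__ code out of the critical section, but as the sibling comment notes, __eq__ is still called while the lock is held, so a hostile or slow __eq__ can still stall (or, if it re-enters the dict, deadlock) other threads.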


__eq__ as well


Just make __hash__() wrap the underlying dict in a thread-safe one that delegates the calls.


Right. I think that backing dictionary could probably be replaced with a more thread-friendly implementation, even if we have to offer reduced guarantees.


Or every Python object's backing dictionary needs to use the special threadsafe implementation.


That's an argument for revamping the backing dictionary, really. If __hash__() is a problem, make it return a thread-safe proxy that delegates to the object's dictionary.


Atomic operations (atomic inc/dec) can still cause major performance issues: depending on the architecture, they can generate a lot of bus traffic and cache invalidation, which can greatly increase memory pressure on previously cached data. A better solution would be a more cache-friendly counter that spreads the count across multiple counters on different cache lines and lazily reconciles them, so you only need to synchronize one out of every n operations. The problem is that this can cause a big increase in memory usage, so it's only suitable for highly contended counters. The ideal situation would be for the interpreter to detect contention and switch to this approach only when necessary. The technique is called a sloppy counter and has been used in Linux SMP kernels.
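The reconciliation idea can be sketched in a few lines of Python (per-thread state stands in for per-core cache lines here; the class name and batch parameter are illustrative, not from any real VM):

```python
import threading

class SloppyCounter:
    """Sketch of a 'sloppy counter': each thread accumulates increments
    privately and only folds them into the shared global every `batch`
    operations, so the contended lock is taken once per batch instead
    of once per increment."""

    def __init__(self, batch=64):
        self._batch = batch
        self._global = 0
        self._global_lock = threading.Lock()
        self._local = threading.local()   # per-thread pending count

    def incr(self):
        pending = getattr(self._local, "pending", 0) + 1
        if pending >= self._batch:
            with self._global_lock:       # synchronize 1 out of n ops
                self._global += pending
            pending = 0
        self._local.pending = pending

    def flush(self):
        # Fold this thread's remainder into the global count.
        pending = getattr(self._local, "pending", 0)
        if pending:
            with self._global_lock:
                self._global += pending
            self._local.pending = 0

    def value(self):
        # Exact only after every thread has flushed; 'sloppy' otherwise.
        with self._global_lock:
            return self._global
```

The trade-off is visible in value(): reads are only approximate until threads flush, which is why the technique suits counters where staleness is tolerable (like refcounts that only matter when they hit zero) rather than ones that must be read exactly at any instant.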


I think garbage collection is an even better answer, because you don't pay a penalty when copying pointers.

I hope someone does a good port of Python/Jython to the Java 7 JVM - it seems the easiest way to get a good garbage collector (and JIT compiler).


Yes, threads are a pain in the ass, but many people appear to enjoy them nonetheless. That's pretty much the entire reason languages have threading support.


I definitely agree. But the Python crowd seems very opposed to threads on many levels, and I think there's a place for a language which deliberately excludes the complexity of threading.


Our opposition mainly comes from the idea that threading is not worth it; the gains in parallelism don't make up for the maintenance and debugging costs.


I like the idea of exploring other models. Shared data and threads is not the only approach to parallelism.

And, usually, when you are stuck on a hard problem, it pays to take a step back and make sure the problem you are failing to solve is one you should be solving at all. The problem we want to solve is not how to get rid of the GIL, nor how to improve Python's performance with threads, but how to use Python more effectively on multi-core/multi-thread architectures and gain performance from that.

This is not a problem only Python has. The machines I work on most of the time (a Core i5 laptop and an Atom netbook) rarely experience loads larger than 2. There are simply not enough threads to keep them busy.

That's not to say they never get slow - they do - but I'd like to emphasize that the limiting factor here is that we are not extracting parallelism from the software already written. We stand to gain a lot from that.



