Hacker News

Python's multithreading is insanely inefficient, because of the Guido van Rossum Memorial Boat Anchor. Anything in Python can mess with the innards of anything else at any time, including stuff in other threads. (See "setattr()".) There's no such thing as thread-local data in Python. This implies locking on everything. CPython has one big lock, the infamous Global Interpreter Lock. Some other implementations have more fine-grained locks, but still spend too much time locking and unlocking things. One Python program can thus use at most one CPU for pure-Python code, no matter how many threads it has.

This basic problem has led to a pile of workarounds. First was "multiprocessing", which is a way to call subprocesses in a reasonably convenient fashion. A subprocess has far more overhead than a thread; it has its own Python interpreter (some code may be shared, but the data isn't) and a copy of all the compiled Python code. Launching a subprocess is expensive. So it's not a good way to handle, say, 10,000 remote connections.
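To make the trade-off concrete, here's a minimal multiprocessing sketch (the worker function `square` and the pool size are just illustrative): each worker is a full OS process with its own interpreter, and arguments and results are pickled across the process boundary, which is the overhead being described.

```python
from multiprocessing import Pool

def square(n):
    # Runs in a separate process with its own interpreter and its own GIL,
    # so CPU-bound work can actually use multiple cores.
    return n * n

if __name__ == "__main__":
    with Pool(processes=4) as pool:           # each worker is a full OS process
        results = pool.map(square, range(8))  # data is pickled to/from workers
    print(results)  # [0, 1, 4, 9, 16, 25, 36, 49]
```

Fine for a handful of CPU-bound workers; the per-process cost is exactly why it doesn't scale to 10,000 connections.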

Now there's "asyncio", which is the descendant of "Twisted Python". That was mostly used as a way for one Python instance to service many low-traffic network connections. The new "asyncio" is apparently more general, but hammering it into the language seems to have created a mess.
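For reference, this is roughly the shape of the modern asyncio model (the handler name and delay are made up for illustration): one thread, one event loop, many concurrent waits.

```python
import asyncio

async def handle(conn_id):
    # Stands in for a low-traffic connection: while this handler awaits I/O,
    # the event loop runs all the others on the same single thread.
    await asyncio.sleep(0.01)
    return conn_id

async def main():
    # One Python instance servicing many connections concurrently.
    return await asyncio.gather(*(handle(i) for i in range(100)))

results = asyncio.run(main())
```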

After the Python 3.x debacle, which essentially forked the language, we don't need this.




> There's no such thing as thread-local data in Python.

There is. threading.local is, in all respects, thread-local data.
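A quick sketch of what that means in practice: attributes set on a threading.local instance are visible only to the thread that set them.

```python
import threading

local = threading.local()
seen = {}

def worker(value):
    local.x = value        # attribute set on this thread's view only
    seen[value] = local.x  # each thread reads back its own value

local.x = "main"
threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# The main thread's value is untouched by the workers:
print(local.x)  # "main"
```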

> Now there's "asyncio", which is the descendant of "Twisted Python". That was mostly used as a way for one Python instance to service many low-traffic network connections. The new "asyncio" is apparently more general, but hammering it into the language seems to have created a mess.

I think the mess was created before 3.5. Had the whole thing started out with the async keywords, we might have been spared `yield from`, which is a beast in itself, and a lot of the hacky machinery for legacy coroutines. I do think, however, that we can still undo that damage.
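For anyone who hasn't run into it: `yield from` is generator delegation (PEP 380), the construct the pre-3.5 generator-based coroutines were built on. A toy sketch (the function names are made up):

```python
def inner():
    yield 1
    yield 2
    return "done"  # the return value travels up to the delegating generator

def outer():
    # `yield from` transparently forwards next()/send()/throw() to inner(),
    # and captures inner()'s return value -- the trick legacy coroutines used.
    result = yield from inner()
    yield result

print(list(outer()))  # [1, 2, 'done']
```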


> threading.local is, in all respects, thread-local data.

You can still pass data attached to threading.local to another thread: hand another thread a reference to an object stored there, and it can read and mutate it freely. There's no real isolation, so all the locking is still needed.
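A minimal sketch of that first point (the attribute name `cache` is just for illustration): the attribute namespace is per-thread, but the objects stored in it are ordinary shared objects once you pass a reference out.

```python
import threading

local = threading.local()
local.cache = {}  # per-thread attribute, but the dict itself is a plain object

def worker(shared_dict):
    # Passing the underlying object across threads bypasses the
    # "thread-local" label: both threads now mutate the same dict.
    shared_dict["who"] = "worker"

t = threading.Thread(target=worker, args=(local.cache,))
t.start()
t.join()
print(local.cache)  # {'who': 'worker'} -- the "thread-local" data was shared
```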

This is a hard problem. There's real thread-local data in C and C++, but it's not safe. If you pass a pointer to something on a thread's stack to another thread, the address can become invalid, and the other thread will probably crash trying to access it. C++ tries to prevent you from leaking references to the stack, but the protection isn't airtight. In Rust, the compiler knows what's thread-local, as a consequence of the ownership system. Go kind of punts; data can be shared between goroutines, but the memory allocation system is mostly thread-safe. Mostly. Go's maps are not thread-safe, and there's an exploit involving slice descriptor race conditions.


> You can still pass data attached to threading.local to another thread.

You can in most languages. Rust is the only one I know of with enough compile-time information to prevent that.


On what are you basing your statement that multithreading in Python is insanely inefficient? Despite the fact that the GIL prevents multiple threads from running in parallel, using multiple threads can give you a huge boost if IO is your bottleneck.
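To illustrate the IO-bound case (using `time.sleep` as a stand-in for a blocking network call): blocking calls release the GIL, so the waits overlap.

```python
import threading
import time

def fetch(results, i):
    # Stand-in for blocking I/O: sleep releases the GIL, so the other
    # threads run while this one waits.
    time.sleep(0.1)
    results[i] = i

results = {}
start = time.perf_counter()
threads = [threading.Thread(target=fetch, args=(results, i)) for i in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start
# Ten 0.1 s "requests" overlap: total wall time is ~0.1 s, not ~1 s.
```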

I think that it's irresponsible to make a blanket statement like this, because there are many use-cases for multiple threads in Python. Sure, one of the obvious ones (parallel processing) doesn't work, but besides that threads can be extremely useful.

I'm also unclear on what you mean by "no such thing as thread-local data" when there is `threading.local()` that does exactly that.

Lastly, I don't think multiprocessing was created as a workaround for threading per se. Rather, it was a workaround for the global interpreter lock.


And in either case, my view is that concurrency is best done in a language with proper support for it, whether the model is thread-based or process-based.


Not that I disagree with (m)any of the points you raise, but with due respect I think it's a bit off-topic from my comment.

Were it my choice, any time the need for concurrency comes up at my job I'd prefer to use a statically typed, compiled language like C++ or Java (or, once I've familiarized myself with its concurrency model, Rust), and this kind of discussion wouldn't even come up. I like Python as a rapid-prototyping language. For the kinds of numerical computations and data-laden, I/O-bound work I do, I find it sorely lacking, and consider it an unfortunate choice for production work.



