As far as I'm aware, they do. The GIL means that even when using threads, they still need 8 processes to utilize 8 cores.
There are more languages than those 3, though, and if those 3 are the languages that you're considering, evented vs blocking i/o is way beside the point when it comes to performance. A super-simple thread-per-connection program with blocking i/o calls from C or Java will demolish the most sophisticated evented system you could ever design in a scripting language. That was my "barely even moves" vs "spinning roundhouse kick" comparison.
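To make the thread-per-connection shape concrete - sketched here in Python rather than the C or Java the comment names, and with all names my own - the whole trick is one blocking handler per accepted socket:

```python
import socket
import threading

def handle(conn):
    # One blocking read/write loop per connection; the OS scheduler
    # interleaves the threads, so the handler code stays linear.
    with conn:
        while True:
            data = conn.recv(4096)
            if not data:
                break
            conn.sendall(data)  # echo the bytes back

def serve(srv):
    # srv is a bound, listening socket; spawn one thread per connection.
    while True:
        conn, _addr = srv.accept()
        threading.Thread(target=handle, args=(conn,), daemon=True).start()
```

In C or Java the same structure maps onto real OS threads with no interpreter lock, which is where the speed claim comes from.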
I did some benchmarks a few years ago playing around with different techniques in C#.
If you spin up a thread for each connection, and the connections are short-lived, the time spent spinning up threads will dominate the time spent communicating on the socket. Event loops perform much better than thread-per-connection in simple web services because they avoid that extra thread overhead. It's no accident the hello-world app on nodejs.org is this sort of service.
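By contrast, a single-threaded event loop pays no thread start-up cost per connection - one thread multiplexes them all. A minimal sketch using Python's selectors module (the names are mine, chosen for illustration):

```python
import selectors

def echo(conn, sel):
    # Handler runs inline on the loop thread - cheap, as long as it
    # never blocks for long.
    data = conn.recv(4096)
    if data:
        conn.sendall(data)
    else:
        sel.unregister(conn)
        conn.close()

def serve_ready(sel, timeout=None):
    # One pass of the loop: every ready socket is handled in turn on
    # this single thread; no per-connection thread is ever spun up.
    for key, _mask in sel.select(timeout):
        key.data(key.fileobj)  # key.data holds the handler callback
```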
But by far the fastest technique I found was to use a thread pool and an event loop together. When there was any work to do (a new connection or new data), I scheduled the next available thread from the pool to do the work. This technique requires that you deal with both callback hell and thread synchronization. But it's blazing fast - way faster than you can get with a single nodejs process.
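A rough sketch of that hybrid in Python (my benchmarks of this shape were in C#, so take this as an illustration of the scheduling idea, not the measured code; all names are mine): the event loop only detects readiness, and the actual socket work goes to the next free pool thread.

```python
import selectors
from concurrent.futures import ThreadPoolExecutor

def make_hybrid(pool_size=8):
    sel = selectors.DefaultSelector()
    pool = ThreadPoolExecutor(max_workers=pool_size)

    def register(conn, handler):
        conn.setblocking(False)
        sel.register(conn, selectors.EVENT_READ, handler)

    def run_once(timeout=None):
        # One pass of the event loop: find the ready sockets, then hand
        # each one to the next free pool thread instead of handling it
        # inline. Unregistering first stops the loop from re-firing
        # while a worker still owns the socket.
        futures = []
        for key, _mask in sel.select(timeout):
            sel.unregister(key.fileobj)
            futures.append(pool.submit(key.data, key.fileobj))
        return futures

    return register, run_once
```

The synchronization pain shows up in real code around re-registering sockets and sharing state between workers - exactly the part this sketch glosses over.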
The code is also many times more complicated, so it really depends on what you're doing. Despite knowing how to make a super high-performance server, I do most of my work in nodejs - it performs fine in most applications.
* Since this is dealing with async operations, an IIS worker thread takes an incoming request and hands it off to the CLR thread pool for processing. At that point, the IIS worker thread immediately becomes available to process more requests
* Thread-pool threads are not tied up by open connections or by waiting on IO; they are only in use while executing actual code/work
* To explain why thread-pool threads are not tied up: async operations use async IO. As soon as an I/O operation begins, the calling thread is returned to the thread pool to continue other work. When the I/O completes, a signal is sent and another thread is picked up to finish the work.
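The same "only busy while actually working" behavior shows up in any async runtime. A minimal Python asyncio sketch (asyncio standing in for the CLR thread-pool mechanics above; the names are mine):

```python
import asyncio

async def handle_request(i):
    # The await is where the worker is released: while the I/O is
    # pending, the event loop services other requests.
    await asyncio.sleep(0.1)  # stand-in for an async I/O call
    return f"response {i}"

async def main():
    # 100 overlapping requests finish in roughly 0.1s total, because
    # no task holds the loop while it is merely waiting.
    return await asyncio.gather(*(handle_request(i) for i in range(100)))

results = asyncio.run(main())
```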
> A super-simple thread-per-connection program with blocking i/o calls from C or Java will demolish the most sophisticated evented system you could ever design in a scripting language
Are you really sure about that? Perhaps 5 years ago, but have you tried recent versions of node (V8) or Python? I have, and even as an old-timey programmer, I'm impressed.
Ted: "Node is cancer, because it's not the one true tool that can do everything! It may be good at IO bound code but it's not so hot at CPU bound stuff".
Node hackers: "Yes it is the one true tool!".
Me: face-palm.
OK, threads can be used to do anything, but they are hard, while async is pretty easy. Async does suck for CPU-bound stuff, but that's not the problem it's trying to solve (and anyone who makes a big fuss over it from either side is a fool). But none of this really matters much until you actually need to scale.
If you are doing something simple, C or Java will be best, simply because they are faster. But not everyone uses C or Java, and it's not like speed matters that much when your app server can scale trivially, your DB is the real bottleneck, and you only have 3 users (one of whom is your cat).
If you need a lot of connections, some of which block (due to a call to a web service like BrowserID or Facebook - and there are a lot more web sites that need to be optimized for web API calls than for calculating Fourier transforms), you need lots of processes (too heavy in Python), lots of threads (and then you need thread-safety, which is a pain and very easy to screw up), or something async like Twisted or Tornado. Given that Tornado is already really easy to use, and basic async stuff is fairly easy to get right, the choice is easy for me. (I don't know enough about JS, Node, and V8 to really comment on Node, but I'll just assume it's roughly equivalent.)
The thing is, I just don't trust threads (at least, not if I'm writing the code). There are far too many ways you can get weird bugs that won't show up without a massive testing system, or until you have a non-trivial number of users. And you can't integrate existing libraries without jumping through a lot of hoops.
Using callbacks looks like "barely even moves" to me, while multi-threaded code looks like a "spinning roundhouse kick", but then, maybe I'm just not good at multi-threaded code.
I guess the most important thing is that threaded code needs to be 100% thread-safe. Async code only has to be "async-safe" in the bits that need to be async. It looks like this:
def threaded_code():
    do_something_100_percent_thread_safe()
    do_more_thread_safe_stuff()
    # fail here when you have a few users
    do_something_that_uses_an_unsafe_library()
    more_thread_safe_stuff()
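For contrast, here's a runnable sketch of the async version of the same shape (the function names are made up to mirror the example above): the un-thread-safe library can be called directly, because tasks only switch at await points.

```python
import asyncio

calls = []

async def do_something_async():
    await asyncio.sleep(0)  # stand-in for real async I/O

def do_something_that_uses_an_unsafe_library():
    # Safe to call directly from a coroutine: nothing else runs
    # between awaits, so this call cannot be preempted mid-flight.
    calls.append("unsafe-lib")

async def async_code():
    await do_something_async()  # the only bit that must be "async-safe"
    do_something_that_uses_an_unsafe_library()

asyncio.run(async_code())
```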
If you really need to do CPU bound stuff, async sucks. You can create a second server, which handles the CPU bound stuff (and call it using a web interface which is async safe). Or there are pretty easy ways to call a subprocess asynchronously. But arguing over the merits of async programming using Fibonacci sequences as a talking point is not even wrong. Anyone who brings it up is just showing themselves to be a complete tool. That might be what Ted was trying to do (as many of the Node responses have been unbelievably lame), but it doesn't prove anything except that the internet has trolls, and plenty of idiots who still take the bait.
https://github.com/glenjamin/node-fib is close enough. Trying to explain how Node can do concurrent Fibonacci without calling another process is falling for the troll's trap - Node isn't "the one true tool", and you don't need to try to prove that it is. The trolls will just say that Fibonacci is too trivial, and you wouldn't use the same approach (co-operative multi-threading) for less trivial CPU-bound tasks.
You can use a hammer to bang in screws, but it's not usually the best way. If you are using co-operative multi-threading as a way to get Node to handle concurrent CPU-bound requests, you are Doing It Wrong (TM). In general, it's better to create a second (possibly not async) server to handle CPU intensive stuff, or fork stuff off to a subprocess (depending on the actual task). There might also be other ways - I'm not an expert.
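One way to sketch the fork-to-a-subprocess route in Python (my example, nobody's production setup): asyncio hands the CPU-bound call to a process pool, so the event loop never blocks on it. The "fork" start method keeps the snippet self-contained on Unix; Windows would need the usual __main__ guard with the default "spawn" method.

```python
import asyncio
import multiprocessing
from concurrent.futures import ProcessPoolExecutor

def fib(n):
    # Deliberately naive CPU-bound work: run inline, this would starve
    # the event loop for the whole computation.
    return n if n < 2 else fib(n - 1) + fib(n - 2)

async def fib_service(ns):
    loop = asyncio.get_running_loop()
    ctx = multiprocessing.get_context("fork")  # Unix-only shortcut
    with ProcessPoolExecutor(mp_context=ctx) as pool:
        # Each call grinds away in a worker process while the loop
        # stays free to service other connections.
        return await asyncio.gather(
            *(loop.run_in_executor(pool, fib, n) for n in ns)
        )
```

Something like `asyncio.run(fib_service([25, 26, 27]))` keeps the loop responsive while the workers compute - which is the whole point the Fibonacci trolling misses.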
I'm sure that if glenjamin had a less trivial CPU-bound task, he/she would handle it a better way (depending on what the task was). But to the Node haters, node-fib is just troll food. It's also an interesting example of how node.js works, but the trolls don't care.
No, Heroku doesn't work that way. Heroku lets you spin up processes, and those processes can be threaded. I use JRuby and get one process that can use all cores.
I think it's safe to say that typical Ruby deployments (as I qualified) are not using JRuby, especially not those on Heroku. Thus, even if the process spawns threads, they will be limited to one core at a time.
Whether JRuby's threads are actually running on multiple cores simultaneously depends on what mode Heroku is running the JVM in.
Don't typical Python and Ruby deployments also need 8 processes to utilize 8 cores? If I'm not mistaken, this is in fact how Heroku works.