Hacker News

Why is that shocking? Can you elaborate?



Sure, now that HN has decided that enough time has elapsed to allow me to reply. :-)

Actually, antonovka did an excellent job explaining it above. The most important aspect is that threads are "lighter weight" than processes. They use less memory and are quicker to context switch (usually). The result of this lighter weight is that you can spawn more threads than you could processes, on the same hardware. And when you're using one thread/process per connection, that means more concurrent connections on the same hardware. So, if Unicorn used its exact same architecture, but replaced the worker processes with worker threads, you could scale much better.

Secondly, haproxy is written in C, which generally means it's going to perform much better and use a lot less memory than a Ruby webserver. This translates, once again, to more output from the same amount of hardware.

That's why I was pretty surprised to see that github would choose both Ruby and pre-fork over C and threads (or in the case of haproxy, async, which scales even better).


I think you missed the point. The point was not that github stopped using haproxy, but that it no longer needs to use it.

Assuming haproxy can work with unix sockets, github could configure nginx/unicorn to use it, but with unicorn its current workload does not require load-balancing acrobatics beyond what the unix socket already provides.
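For the curious, the typical nginx-to-unicorn unix-socket wiring looks roughly like this (paths and values are illustrative, not github's actual config):

```ruby
# config/unicorn.rb -- illustrative; binds workers to a unix socket
listen "/tmp/app.sock", backlog: 64
worker_processes 4
```

```nginx
# nginx.conf -- proxies requests to that socket; the kernel queues
# pending connections in the socket backlog, which is all the
# "balancing" most setups need
upstream app_server {
    server unix:/tmp/app.sock fail_timeout=0;
}
server {
    listen 80;
    location / { proxy_pass http://app_server; }
}
```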

Similarly, many rails sites use thin/nginx or mongrel/nginx and do not even have a workload that necessitates haproxy.

Plus, haproxy adds another layer of complexity, configuration, and management, which is nice to avoid if you can.


Switching to threads from processes is probably "premature optimization" for Ruby - it's just not going to buy you all that much.


The limiting factor of concurrency in a Rails application is the slowness of Ruby, not processes vs threads.

If the bottleneck of Ruby were fixed, then the next bottleneck would be the database.


> Actually, antonovka did an excellent job explaining it above. The most important aspect is that threads are "lighter weight" than processes. They use less memory and are quicker to context switch (usually).

Processes and threads have very similar execution models under most Unixes from what I understand. Threads don't use all that much less memory, either, given a copy-on-write friendly environment (e.g., not Ruby MRI). Perhaps you're confusing threads vs. processes under Windows or Java with threads vs. processes under Unix?

> The result of this lighter weight is that you can spawn more threads than you could processes, on the same hardware. And when you're using one thread/process per connection, that means more concurrent connections on the same hardware. So, if Unicorn used its exact same architecture, but replaced the worker processes with worker threads, you could scale much better.

You're crazy. If Unicorn were 1.) somehow able to take advantage of native threads (it's not), and 2.) moved to a thread-per-connection model instead of a process-per-connection model, it would have basically zero practical impact on the efficiency with which it processes requests. It's already sharing a great deal of its base memory footprint thanks to preloading in the master and fork, so processes don't have significant memory overhead. It's using the kernel to balance requests between processes, so it's not like there's a bunch of IPC between master and worker processes that could be removed with threads.
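A minimal sketch of the pre-fork model being described (plain Ruby, not Unicorn's actual code): every worker blocks in accept() on the same inherited listen socket, and the kernel hands each connection to exactly one of them, with no userland balancing and no master/worker IPC.

```ruby
require "socket"

# Master opens one listen socket, then forks workers that all block in
# accept() on it. The kernel wakes exactly one sleeping worker per
# incoming connection -- this is the "kernel balances requests" point.
server = TCPServer.new("127.0.0.1", 0)  # port 0: let the OS pick a port
port = server.addr[1]

workers = 2.times.map do
  fork do
    loop do
      client = server.accept            # kernel picks one blocked worker
      client.write("handled by pid #{Process.pid}\n")
      client.close
    end
  end
end

# Drive one request through, then clean up the workers.
sock = TCPSocket.new("127.0.0.1", port)
response = sock.read                    # reads until the worker closes
sock.close
workers.each { |pid| Process.kill(:TERM, pid); Process.wait(pid) }
```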

Say you're running eight process-per-connection backends on eight cores and each process is 100% CPU bound processing requests. You have a load of 8.0; the machine is 100% utilized. If you then change this to a single process with eight native-thread-per-connection workers, absolutely nothing will change. The load will still be 8.0. You can start more threads to do work, but it will have nearly the same effect as starting the same number of processes.

Process-per-connection doesn't fall down under these levels of concurrency. It does eventually - you can't use process-per-connection to solve C10K problems, for instance. But we're talking about Ruby backends, which are always managed as multiple processes due purely to the way Ruby web apps are written (not efficient, not async). Requests execute within a giant request-sized Mutex lock.
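The "giant request-sized Mutex" can be sketched as a Rack-style middleware. Rails historically shipped this idea as Rack::Lock; this standalone version is illustrative, not the actual Rails code:

```ruby
# Even under a threaded server, a lock like this serializes the app:
# one request executes at a time, the rest queue on the Mutex.
class RequestLock
  def initialize(app)
    @app  = app
    @lock = Mutex.new
  end

  def call(env)
    @lock.synchronize { @app.call(env) }  # one request in the app at once
  end
end

# A trivial Rack-style app: a lambda returning [status, headers, body].
inner = ->(env) { [200, {}, ["hello from #{env["PATH_INFO"]}"]] }
app   = RequestLock.new(inner)

status, _headers, body = app.call("PATH_INFO" => "/demo")
```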

And even if threads were considerably more efficient than processes, you still don't want to run a lot of them because each consumes network resources, like database/memcached connections. Using native threads would not let Rails apps spawn thousands (or even hundreds) of thread-per-connection workers. A high concurrency threading model works for Apache (and Varnish and Squid) because the work performed in each thread is fairly simple and doesn't require the kind of network interaction and resource use that an app backend does.
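A toy illustration of the resource argument (the pool and "connections" here are made up, not a real DB client): 20 thread-per-connection workers contending for 5 pooled connections means at most 5 requests make progress at once, so thread count alone stops being the scaling knob.

```ruby
# Pretend the queue holds scarce per-request resources (DB handles).
POOL = Queue.new
5.times { |i| POOL << "conn-#{i}" }

def with_connection
  conn = POOL.pop                       # blocks when the pool is empty
  yield conn
ensure
  POOL << conn if conn                  # always return the handle
end

served = Queue.new
threads = 20.times.map do
  Thread.new do
    with_connection { |conn| served << conn }  # "handle one request"
  end
end
threads.each(&:join)
```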

Basically, this notion that threads (even native threads) would be a considerable improvement to Unicorn's design is just all wrong.

> Secondly, haproxy is written in C, which generally means it's going to perform much better and use a lot less memory than a Ruby webserver. This translates, once again, to more output from the same amount of hardware.

Your world seems considerably more simplistic than mine. I'm not even sure how haproxy and unicorn can be compared in any useful way. Unicorn is not a proxy. The master process does not do userland TCP balancing or anything like that. Unicorn is designed to run and manage single-threaded Ruby backends efficiently; HAproxy is a goddam high availability TCP proxy.

> That's why I was pretty surprised to see that github would choose both Ruby and pre-fork over C and threads (or in the case of haproxy, async, which scales even better).

I think you're confusing the types of constraints you have when you're building high concurrency, special purpose web servers and intermediaries (like nginx and haproxy) with the types of constraints you have when you're building a container for executing logic-heavy requests in an interpreted language. Nginx and HAproxy have excellent designs for what they do, and so does Unicorn. They're different because they do different things.





