I don't understand why people insist on architectures where otherwise-independent processes share a single socket.
You're already running a reverse proxy in front of them! There's no reason each Unicorn couldn't be listening on a different port. Does that third layer of local load-balancing between the HTTP proxy and the event-driven app server actually get you anything?
It's a general result in queueing theory that you want one global queue as early as possible in the system. This is because requests don't take exactly the average service time to clear backends: service time is a distribution. The more workers that can pull from a queue, the less the worst-case service times affect the average wait.
One queue per worker with no global queue is the worst configuration and should be avoided if at all possible. Anyone who's run large reverse proxy installs knows this pain well.
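To make the queueing claim concrete, here is a rough simulation sketch, not a model of any particular proxy. It assumes Poisson arrivals, exponentially distributed service times, four workers at 80% load, and round-robin assignment for the per-worker case; the function name `mean_wait` and all parameters are made up for illustration.

```python
import heapq
import random


def mean_wait(num_workers, num_requests, load, shared_queue, seed=1):
    """Average time a request waits before a worker starts on it.

    Assumptions (illustrative only): Poisson arrivals, exponential
    service times with mean 1.0, first-come-first-served queues.
    """
    rng = random.Random(seed)
    arrival_rate = load * num_workers  # requests per unit time overall
    heap = [0.0] * num_workers         # shared mode: times workers go idle
    free_at = [0.0] * num_workers      # per-worker mode: same, by index
    now = 0.0
    total_wait = 0.0
    for i in range(num_requests):
        now += rng.expovariate(arrival_rate)
        service = rng.expovariate(1.0)
        if shared_queue:
            # One global queue: the next request goes to whichever
            # worker becomes idle first.
            start = max(now, heapq.heappop(heap))
            heapq.heappush(heap, start + service)
        else:
            # One queue per worker: the balancer picks a worker up
            # front (round-robin) without knowing how busy it is.
            w = i % num_workers
            start = max(now, free_at[w])
            free_at[w] = start + service
        total_wait += start - now
    return total_wait / num_requests


if __name__ == "__main__":
    print("shared queue:    ", mean_wait(4, 20000, 0.8, True))
    print("per-worker queue:", mean_wait(4, 20000, 0.8, False))
```

Both runs see the same arrivals and service times because they share a seed. Under these assumptions the shared queue should show a clearly lower average wait, since a request never sits behind a slow one while another worker is idle.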
The ideal system would be for the balancer machine(s) to hold the requests, and for backends to pull them in a sort of ping/pong fashion. I gather fuzed runs in a pattern like this, though I've not used it.
Telecom folks have analyzed this stuff in detail for the better part of a century. There's a lot of theory out there, and it's surprisingly practical and applicable to real-world web applications.
You read me correctly: I don't see why load balancing should be taking place down at the app server level.
Aren't you going to have multiple machines each with their own blessing of unicorns? You're still going to have to use some kind of load balancer in front of independent sockets.
Replacing N ports with one simplifies configuration.
I never understood the complex HAProxy-in-front-of-Apache-in-front-of-Nginx-in-front-of-Mongrel type setups that seem to be popular in the Rails world. Why not just use Unicorn? What value is GitHub getting from having Nginx in front?
Because Ruby 1.8 threading sucks, you pay a large memory price (i.e. a process) for each concurrent request in flight. A fronting proxy allows your backends to write out the response as fast as possible and move on to another request while the proxy spoon-feeds the response to slow clients.
Also, nginx is going to be more efficient for serving static files, though most larger apps will have broken such requests out to a separate set of domains, likely serviced by a CDN.
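A back-of-the-envelope sketch of why that matters, with made-up numbers (50 ms to render, a 100 KB response, a client pulling 10 KB/s). The function name and parameters are hypothetical, and it assumes the hop from worker to proxy is fast enough to ignore.

```python
def worker_seconds_per_request(render_s, response_bytes,
                               client_bytes_per_s, buffered):
    """How long one worker process is tied up by one request.

    With a buffering proxy in front, the worker hands the response
    off almost immediately, so only the render time counts. Without
    one, the worker also waits while the slow client drains the
    response.
    """
    if buffered:
        return render_s
    return render_s + response_bytes / client_bytes_per_s


if __name__ == "__main__":
    for buffered in (False, True):
        busy = worker_seconds_per_request(0.05, 100_000, 10_000, buffered)
        print("buffered" if buffered else "direct",
              "-> worker busy", busy, "s,",
              1.0 / busy, "requests/s per worker")
```

With these assumed numbers the unbuffered worker spends roughly ten seconds per request, nearly all of it waiting on the client, which is the memory-per-concurrent-request cost described above.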