and whether or not the other webservers mentioned support such options and if they are enabled in the tests. There is no way to know if this is alarming or just FUD. Also missing from this comparison is a vanilla Apache SSL benchmark (which will suck, but would serve as a reference point).
Given nginx's track record, I'm prepared to give nginx the benefit of the doubt and assume that it's an invalid test.
That said, I'm going to run some benchmarks of my own.
Usually, if I see numbers that bad, I try to investigate before writing a post about it, and share the results of that investigation in the post.
I'll try to reproduce these results tomorrow, but if I had to guess, I'd say ssl_session_cache was left to its default (off) which means that every connection has to do the expensive SSL handshake.
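For reference, enabling the cache is a couple of directives in the server block; a minimal sketch (sizes are illustrative, not tuned):

    ssl_session_cache   shared:SSL:10m;   # shared across worker processes
    ssl_session_timeout 10m;              # how long sessions stay resumable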
Sigh. Sometimes posts like these make me want to go back to academia (it's not perfect, but they generally believe in this little thing called rigor).
From TFA:
I tested nginx as a proxy, serving static files, and serving nginx-generated redirects. I tried changing all the relevant ssl parameters I could find. All setups resulted in the same SSL performance from nginx. I even tried the setup on more than one server (the other server was quad-core; nginx got up to 75 requests per second).
So "all the relevant ssl parameters I could find", no details about what those involve, and the surprising result that it made no difference.
In the same situation, I might think I was doing something wrong...
And then this overarching statement:
Never let nginx listen for SSL connections itself.
Rigor is an interesting point. Do we prefer to have a flawed but slightly useful post now, or do we prefer to wait a month or two for a squeaky-clean post with all the issues worked out?
In two years, people will still be saying "nginx + ssl = bad" because of this post even though the problem may well be fully addressed. Google will continue to surface this article even though it may be totally wrong at some future date. That sucks.
If it were really that easy to spread this questionable message for years, it would be just as easy to spread other articles as well.
So it wouldn't take more than a few articles like "nginx + ssl = works like a charm" or "nginx has better SSL support than Apache". It wouldn't matter whether those were actually correct; a single article of questionable quality would be sufficient.
Why not do both? First a small article about the surprising phenomenon, which announces a more thorough analysis next week.
That way, it is possible to get some initial feedback and maybe even some good hints that help speed up the analysis. In the best case, the announced analysis could become a collaboration among multiple authors.
That reminds me of the old trick of asking a reasonable question and then getting a friend to give a wrong answer to it. The real answer is likely to be somewhere in the flood of corrections that follows.
I agree. What if we all waited until 2008 for an academic to publish a 30 page paper about Ruby and Python and how they can be useful for building web apps?
Academics share and discuss findings in casual terms before formal publishing all the time. Two academics meeting over coffee aren't going to demand the rigor you get with a published article.
That's true, but the "damage" tends to be limited, because it's shared in person with a handful of people who understand the preliminary nature of the results--- not potentially tens of thousands of people clicking on the front page of Hacker News, who see a very definitive-sounding statement ("Nginx sucks at SSL").
I do think academics overcorrect on this, and should share more early results, possibly via things like blog posts (this is slowly starting to happen). But erring in the opposite direction is also quite common among tech bloggers. In particular, if you're going to publish anything that looks vaguely like a benchmark, it might be worth taking at least a few days to check out possible problems before sending it out into the world (not months or anything, but a few days).
Try adding the -k parameter to ab to use keep-alive requests and see if you notice an uptick. If you are generating a new session on each request, it doesn't matter whether SSL session caching is enabled in nginx or not.
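Something like this, assuming an ab build with SSL support (host and path are made up):

    ab -k -c 50 -n 5000 https://example.com/small.gif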
Thanks! I didn't know nginx sucked at SSL. You may have increased our revenue. Many businesses like ours have their conversion pages on SSL. Our front-end server is doing 2000 to 4000 HTTP requests per second, and we get over 3 million uniques on the main site where we sell stuff via SSL. If SSL is this slow, it probably impacts performance on our secure pages, which affects revenue. Where do I send the beer?
On a 4 core Xeon E5410 using ab -c 50 -n 5000 with 64 bit ubuntu 10.10 and kernel 2.6.35 I get:
For a 43 byte transparent gif image on regular HTTP:
Requests per second: 11703.19 [#/sec] (mean)
Same file via HTTPS with various ssl_session_cache params set:
ssl_session_cache shared:SSL:10m;
Requests per second: 180.13 [#/sec] (mean)
ssl_session_cache builtin:1000 shared:SSL:10m;
Requests per second: 183.53 [#/sec] (mean)
ssl_session_cache builtin:1000;
Requests per second: 182.63 [#/sec] (mean)
No ssl_session_cache:
Requests per second: 184.67 [#/sec] (mean)
The cache probably has no effect because each 'ab' request is a new visitor. But I'd guess the first https pageview for any visitor is the most critical pageview of most funnels.
I hear the same performance happens with or without the session cache enabled (for benchmarks). The http(s) benchmarking tools don't resume sessions. It's simulating a horde of new clients who never come back or request other resources.
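One way to see what session reuse would buy returning clients is openssl's s_time, which can compare full handshakes against resumed sessions (host is hypothetical):

    openssl s_time -connect example.com:443 -new -time 10    # full handshake on every connection
    openssl s_time -connect example.com:443 -reuse -time 10  # resume the first session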
It would be interesting to see stud with a session cache too.
I'm reading over Matt's work carefully, but my initial inclination is not to merge the bulk of this into stud mainline. I'd rather keep stud simple and protocol-naive and have HAProxy do the HTTP work.
Which is to say indirectly that I think the right answer is for nginx (and daemons generally) to support the PROXY protocol, or some other agreed-upon standard for a naive upstream proxy to indicate host/port information.
initial inclination is not to merge the bulk of this into stud mainline
I agree. The HTTP stuff is still too integrated. ifdefs are ugly.
The solution is to do what showed up when I was 99% done working on XFF -- the nice PROXY protocol addition. We just need to get PROXY support into nginx now to obviate my XFF machinations.
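For the curious, the v1 PROXY header is just a single text line prepended before the proxied data; a minimal C sketch of building it (function name and IPv4-only handling are mine, not from stud):

    /* Build a PROXY protocol v1 header line, e.g.
     * "PROXY TCP4 192.0.2.10 198.51.100.1 56324 443\r\n"
     * Returns the header length, or -1 if it doesn't fit. */
    #include <stdio.h>

    static int proxy_v1_header(char *buf, size_t len,
                               const char *src_ip, const char *dst_ip,
                               unsigned src_port, unsigned dst_port)
    {
        int n = snprintf(buf, len, "PROXY TCP4 %s %s %u %u\r\n",
                         src_ip, dst_ip, src_port, dst_port);
        return (n > 0 && (size_t)n < len) ? n : -1;
    }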
I don't know what the limits are in the nginx HTTP parser Matt's using, so this is probably moot, but code that does things like "realloc(ptr, size + newsz)" or "malloc(size + 1)" expecting things to be fine gives me the howling fantods.
I don't know what the limits are in the nginx HTTP parser
You're correct in assuming the library enforces its own size limitations. It operates on length of received SSL data which is capped by the static receive buffer at 32k. Nice and tiny.
(Also, you are, of course, painfully correct about lack of bounds checking and lack of return value checking on the malloc/realloc calls. If I ever graduate the branch to production status, the six malloc calls and three realloc calls will be wrapped in proper checks.)
Why? Can't you just bail on whatever you are currently doing? How is the entire process compromised by a failed malloc? Resources are limited, sort of by definition; shouldn't good code be able to handle this possibility?
If malloc is rigged to explode when it fails, you can't accidentally forget to check; sometimes, malloc failures can end up being exploitable. It's not like most code does anything particularly smart when memory runs out.
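A minimal sketch of that "rigged to explode" approach (the xmalloc/xrealloc names are just the usual convention, not anything from stud):

    #include <stdio.h>
    #include <stdlib.h>

    /* Abort on allocation failure so a forgotten NULL check can't
     * silently turn into memory corruption later. */
    static void *xmalloc(size_t sz)
    {
        void *p = malloc(sz);
        if (p == NULL) {
            fprintf(stderr, "out of memory (%zu bytes)\n", sz);
            abort();
        }
        return p;
    }

    static void *xrealloc(void *ptr, size_t sz)
    {
        void *p = realloc(ptr, sz);
        if (p == NULL) {
            fprintf(stderr, "out of memory (%zu bytes)\n", sz);
            abort();
        }
        return p;
    }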
Handling an out-of-memory error in any way other than terminating the entire process is very, very hard, because the effects of memory exhaustion are felt by all your threads at the same time, usually preceded by a massive slowdown due to swapping (which will cause other symptoms if there are any real-time constraints on the process).
I'm not saying there are no cases where recovering from allocation errors would be possible, but it's not the general case. It's usually easier to treat any allocation error as a fatal error and ensure your programs don't run out of memory through other means.
"malloc(size+1)" is a sign you may have one off errors in your code. If you need to store a string of size s you need s+1 bytes allocated. The plus one is for null termination. If you want an array of t you can either pass the size around with the array or null terminate the array like strings, but then you can't store any null values in the array.
Also, there's no bounds checking on size so in certain conditions such as a 2GB/4GB allocation you may allocate zero bytes or -2GB bytes.
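A sketch of the kind of guard that avoids the wraparound (function name is mine):

    #include <stdint.h>
    #include <stdlib.h>

    /* Allocate size + 1 bytes (room for a terminating NUL) without
     * letting the addition wrap around to zero. */
    static void *alloc_plus_one(size_t size)
    {
        if (size == SIZE_MAX)      /* size + 1 would overflow to 0 */
            return NULL;
        return malloc(size + 1);
    }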
It would be nice if Matt provided full details on the testbed, including the client. In a test scenario it is very important to understand what actually gets tested in the end. I liked that "Russia" tag too :)
I suspect that they are re-using connections in that benchmark. SSL connection setup is CPU intensive. Once a session is set up, an SSL connection uses only slightly more CPU time than an unencrypted connection.
He's concerned about the raw speed of the SSL calculations, not requests per second, but if you're actually concerned about SSL speed and have enough requests per second to justify optimizing it, it could be pretty useful.
Interesting, thanks -- I'll watch for that. Until now I haven't paid much attention to it since I'm not responsible for that part of the configuration.
I'm not sure why he would be getting numbers that low. The only setup I have at the moment that would give useful numbers for SSL req/sec is a small single-core VM running one nginx worker process, and that pumps out 135 new req/sec. Add a few cores and workers, put it on real hardware, and I don't see why this couldn't push well over 400 req/sec.
This is using nginx strictly as an ssl termination, where I need to do some header manipulation that I couldn't do in stunnel/stud.
I remembered I had an older 8 core server sitting unused at the moment. I configured nginx with 8 workers, and ran `ab` against it. From a single (VM) host, I can get 680 connections per second (maxed the cpu on the host running the test). From 4 hosts, each host got > 290 connections per sec, so I got nginx up to over 1190 new connections per second, and can likely push it further.
[EDIT] got it to peak at 1535 requests per second with 4 hosts testing.
If anyone does benchmarking, please include LiteSpeed, as I am curious; I suspect it's much faster than nginx at SSL. Even with the connection limit on the free version, I suspect it will still be feasible for testing.
(on an 8 core server...)
haproxy direct: 6,000 requests per second
stunnel -> haproxy: 430 requests per second
nginx (ssl) -> haproxy: 90 requests per second
Yet Matt Cutts tells us that SSL is not computationally expensive anymore. Based on these results it's still an order of magnitude slower.