I was under the impression that even though TLS has the idea of built-in compression, in practice it is never actually used. Otherwise, why would browsers send Accept-Encoding headers, and servers send Content-Encoding headers, when they could just negotiate compression via the TLS handshake?
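To illustrate the header-based negotiation, something like this (using example.com as a stand-in host):

```python
import gzip
import urllib.request

# The client advertises what it accepts; the server answers with
# Content-Encoding naming the encoding it actually applied.
req = urllib.request.Request(
    "https://example.com/",
    headers={"Accept-Encoding": "gzip"},
)
with urllib.request.urlopen(req) as resp:
    body = resp.read()
    if resp.headers.get("Content-Encoding") == "gzip":
        body = gzip.decompress(body)
print(len(body), "bytes after decoding")
```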
Also, you claim that Google is basing this protocol on assumptions and you imply that they do not understand the value of metrics. I strongly disagree with this sentiment, mostly because Google's propensity for data-driven experimentation once wasted an afternoon of mine.
I once spent an afternoon debugging a SPDY server that wasn't negotiating spdy/2 over NPN properly. It turns out that for 5% of browser startups, Chrome will disable SPDY, fall back to plain HTTPS, and collect anonymized performance metrics. You will, of course, argue that this is not a fair comparison with a pipelined HTTP stack. I have posted before about the issues with pipelining and won't repeat myself here. Suffice it to say that pipelining has many problems; problems of a large enough magnitude that it might be easier for a browser to implement a new protocol than it would be to (correctly) apply the many heuristics necessary to enable pipelining in the wild. SPDY would also make requests and responses conform to an asynchronous model, which (to me) is vastly preferable to the synchronous one prescribed by HTTP pipelining (and HTTP in general).
(edit)
And to answer your question about why \0\0\0\4 and other lengths appear so many times in the newest version of the prefix dictionary (the 2nd one, AFAIK): it's because SPDY headers are length-prefixed for ease of parsing. It compresses better when the prefix is included in the dictionary; such that {0x00,0x00,0x00,0x07,o,p,t,i,o,n,s,} would compress to one byte, not 5 (assuming that entry was still in the dictionary, of course).
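Concretely, parsing such a length-prefixed string looks something like this (a 4-byte big-endian prefix to match the example above; the actual prefix width varies between SPDY versions):

```python
import struct

def parse_prefixed(buf, off=0):
    # Read a big-endian 4-byte length, then that many bytes of value;
    # return the decoded string and the offset just past it.
    (length,) = struct.unpack_from(">I", buf, off)
    off += 4
    return buf[off:off + length].decode("ascii"), off + length

name, end = parse_prefixed(b"\x00\x00\x00\x07options")
print(name, end)  # options 11
```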
> I was under the impression that even though TLS has the idea of built-in compression, in practice, that is never actually used.
This is determined mostly by the browser. In the TLS ClientHello, most browsers advertise only the null compression method, since they use the encoding headers instead (which let servers cache the compressed data). A browser using Spdy could indicate it supports deflate to enable compression.
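You can see what a given connection actually negotiated with something like this (the hostname is just an example):

```python
import socket
import ssl

# SSLSocket.compression() reports the TLS-level compression method
# in use, or None when only the null method was negotiated.
ctx = ssl.create_default_context()
with socket.create_connection(("www.google.com", 443)) as raw:
    with ctx.wrap_socket(raw, server_hostname="www.google.com") as tls:
        print(tls.compression())  # almost always None today
```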
> Also, you claim that Google is basing this protocol on assumptions and you imply that they do not understand the value of metrics. I strongly disagree with this sentiment
Watch the tech talk video the other guy posted and take a drink every time they say 'we assume' or something to that effect. These guys do testing to confirm their assumptions. Take a look at the part where a questioner asks about high packet loss, and at how they hand-wave away Spdy being crippled by it.
> because SPDY headers are length-prefixed for ease of parsing. It compresses better when the prefix is included in the dictionary; such that {0x00,0x00,0x00,0x07,o,p,t,i,o,n,s,} would compress to one byte, not 5
No, wrong, that's not how compression works. Deflate uses LZ77 to compress runs and repeats like \0\0\0, and it uses Huffman coding to pack the result into fewer bits.
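To illustrate: a run of repeated length prefixes costs almost nothing, because one literal copy plus a back-reference covers the rest.

```python
import zlib

# 200 bytes of repeated \0\0\0\x04 prefixes compress to roughly a
# dozen bytes: one literal occurrence plus an LZ77 back-reference.
run = b"\x00\x00\x00\x04" * 50
print(len(run), "->", len(zlib.compress(run, 9)))
```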
Here's a clue for you: compress "\0\0\0\7options" using deflate against three dictionaries. The first is the RFC draft dictionary, the second is the same with only the first occurrence of each count, and the third has one \0\0\0 at the start and no other counts.
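If anyone wants to reproduce this, a minimal sketch using zlib's preset-dictionary support; the dictionary contents below are illustrative stand-ins, not the actual draft dictionaries:

```python
import zlib

PAYLOAD = b"\x00\x00\x00\x07options"

def deflated_size(data, zdict):
    # Compress against a preset dictionary, as SPDY header
    # compression does, and return the compressed byte count.
    c = zlib.compressobj(9, zlib.DEFLATED, zlib.MAX_WBITS, zdict=zdict)
    return len(c.compress(data) + c.flush())

# Illustrative stand-ins for the three variants described above.
draft = b"\x00\x00\x00\x07options\x00\x00\x00\x03get\x00\x00\x00\x04head"
first_occurrence_only = b"\x00\x00\x00\x07optionsgethead"
single_prefix = b"\x00\x00\x00optionsgethead"

for name, d in (("draft", draft),
                ("first-occurrence", first_occurrence_only),
                ("single-prefix", single_prefix)):
    print(name, deflated_size(PAYLOAD, d))
```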
Not to mention that there is limited space in the prefix dictionary due to the sliding window, so not only are the counts basically useless, they take up space that could be used for other data.
I take it back... these people don't just lack rigor, they are clueless bordering on incompetent. From a mere implementer I can accept that you don't completely understand the issues and are excited to work on the implementation, but as protocol designers from Google the work done with Spdy is just unacceptably bad.
> A browser using Spdy could indicate it supports deflate to enable compression.
This would cause everything on the connection to be compressed, including things like images and videos, which are already compressed and, as likely as not, will get larger when you recompress them.
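A quick illustration: already-compressed data looks statistically random, so deflate typically makes it slightly larger, not smaller.

```python
import os
import zlib

# Random bytes stand in for an already-compressed JPEG or video
# payload; recompressing adds framing overhead instead of shrinking.
payload = os.urandom(100_000)
print(len(payload), "->", len(zlib.compress(payload, 9)))
```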
> These guys do testing to confirm their assumptions.
This is how any science gets done... hypothesis, then experimentation
As to your comments on how compression works, I admit my lack of knowledge of how LZ77 works. But I trust the results of the experiments done by the guys at the University of Delaware. If you have a better dictionary in mind, take their sample data, run the same experiment with your dictionary, and post your results to spdy-dev. That's how they got their change into the spec in the first place.
> ... but as protocol designers from Google the work done with Spdy is just unacceptably bad.
Any actual examples? If you had any real basis for your arguments, I'd be more than willing to back you up on spdy-dev.
Take a look at headers served by https://www.facebook.com/ or any large site that uses TLS.