I applaud the effort to hate on "smart" middleware proxies!
That said, the author gets no points for name-dropping random distributed-systems algorithms and using TCP keepalives (2 hours minimum!) as an argument against TLS-terminating proxies.
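For what it's worth, that two-hour figure is just the Linux default (tcp_keepalive_time); a long-lived socket can ask for far more aggressive keepalives itself. A rough sketch, using Linux-specific socket options and a placeholder host:

    import socket

    s = socket.create_connection(("example.com", 443))  # placeholder endpoint
    s.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Linux-specific knobs; "2 hours" is only the system-wide tcp_keepalive_time default.
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 60)   # idle seconds before the first probe
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 10)  # seconds between probes
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 3)     # failed probes before the socket errors out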
Is there a reason to (as he says) "fully implement the protocol" in the proxy? I battled with WebSockets through Pound last week, and it simply doesn't work because the author took a non-Postel stand on protocol specifics.
Having a protocol-agnostic proxy like hitch (previously stud) fixed that without losing functionality, and I expect it to age better as well.
Sadly the internet isn't as nice. I've been to various hotels where the router would silently drop keep-alive packets (but god forbid it tell the packet layer it's doing this!) and mangle DNS responses ("Looking for mail.example.com? Here is the answer for example.com." "Looking for doesnotexist.com? Here is the result for internalsearchengine.com, which redirects you to a sponsored search page full of ads.")
Even encrypted DNS will suffer, because middleboxes with captive portals will attempt to tamper with it.
Unless you pipe it over TLS or HTTP, in which case you run into problems with not knowing why there is no connection when you're behind a captive portal (we obviously need to fix captive portals; they're the source of 90% of these problems).
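The mangled-DNS part is at least easy to spot-check: a well-behaved resolver should refuse to resolve a name that doesn't exist rather than hand back some search-page IP. A rough sketch (the domain below is a deliberately unresolvable placeholder):

    import socket

    try:
        addr = socket.gethostbyname("surely-does-not-exist.example")
        print("resolver is lying to you:", addr)   # NXDOMAIN hijacking
    except socket.gaierror:
        print("resolver behaves: NXDOMAIN")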
I learned recently about RFC 7710, which specifies a DHCP (v4/v6) and RA option for "you're on a captive portal, here's the website you should visit before you get access": https://tools.ietf.org/html/rfc7710
Do any of the major implementations of captive portals support it?
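If I'm reading the RFC right, the DHCPv4 side is just a string-valued option (code 160) carrying the portal URI, so client-side support amounts to a lookup. A hypothetical sketch of what a DHCP client could do once it has the options parsed:

    # Hypothetical helper: 'options' maps DHCP option codes to raw bytes,
    # as produced by whatever DHCP client library is already in use.
    def captive_portal_uri(options):
        raw = options.get(160)  # RFC 7710 assigns DHCPv4 option 160 (Captive-Portal)
        return raw.decode("ascii", "replace") if raw else None

    print(captive_portal_uri({160: b"https://portal.example.net"}))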
RFCs are never anything other than proposed standards, or informational documents. There is no point at which an RFC becomes a “recommended standard” or some such thing.
All internet standards, from IP to HTTP, are proposed standards; that doesn’t actually mean anything about whether they’re generally implemented or not.
Not that I know of. My pfSense firewall doesn't have it (IIRC), so my guess would be that the poorly maintained router boxes in a hotel basement definitely don't have it.
I'm not sure the various DHCP clients even communicate this properly to the OS or browser (I wouldn't know how to query for it on Linux).
"TCP implementations should follow a general principle of robustness: be conservative in what you do, be liberal in what you accept from others." -- Jon Postel
I don't grok this: if TCP's model has fundamental problems, how come the Internet works? :)
The fact that a protocol is technically imperfect and causes grief for ISPs does not mean the application layer has to get involved.
I've been writing TCP-based apps for years and the stream abstraction has never failed me. After reading this I don't see why I should change that assumption. I have to rebuild connections occasionally, but it's never cost my application so much that an alternative, more complicated abstraction layer made sense.
I usually write req/response over TCP, an even more inaccurate abstraction, and occasionally non-blocking code. Never have I wanted more complexity than NIO in my application layer.
Devs do know that "TCP is not a stream of bytes", but they deliberately do not want to get app code involved.
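To be concrete, "req/response over TCP" mostly means a little framing in app code so that recv() chunk boundaries never matter. A minimal length-prefixed sketch of what I mean:

    import socket
    import struct

    def send_msg(sock: socket.socket, payload: bytes) -> None:
        sock.sendall(struct.pack("!I", len(payload)) + payload)  # 4-byte length prefix

    def recv_exact(sock: socket.socket, n: int) -> bytes:
        buf = b""
        while len(buf) < n:
            chunk = sock.recv(n - len(buf))
            if not chunk:
                raise ConnectionError("peer closed mid-message")
            buf += chunk
        return buf

    def recv_msg(sock: socket.socket) -> bytes:
        (length,) = struct.unpack("!I", recv_exact(sock, 4))
        return recv_exact(sock, length)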
> I don't grok this: if TCP's model has fundamental problems, how come the Internet works? :)
Depends on your definition of "works"! Just ask any gamer what happens when someone starts using Netflix while they're playing a game... For a lot of people the internet doesn't work very well.
TFA is describing, I think, the issue of stale / dead connections - which can be notoriously difficult to detect. Imagine you are building black boxes that sit in people's homes, with a long-running TCP connection used to push notifications to them. With the state of the internet, it's very likely some percentage of those connections will die without either side realizing it. That's where the complexity comes in.
Off-topic on my part here but I think it is so weird that people say “TFA” when referring to the article in the way you are doing. I see this a lot on HN.
IMO it would be natural to use “TFA” in the same way you would use “RTFM” — when you are scolding someone for not having read what they should — but using “TFA” as just a synonym for “the article”... that always throws me off.
You’re both right and wrong. That is, the... anger? you’re picking up isn’t incorrect, but instead lost in translation.
Back in the /. days, which, as far as I know, is where this comes from, people would use it in that way: “Read TFA.” “Have you read TFA?” “Seems like you haven’t read TFA.” Eventually that morphed into the usage you see on HN and everywhere else, as kind of a different version of “OP”.
> Just ask any gamer what happens when someone starts using Netflix while they're playing a game.
Every cable modem I have had suffered from some form of bufferbloat. In short, it's not TCP's fault that your headshot packet sat in the cable modem's buffer for five seconds before being sent to the server.
Edited to add:
> TFA is describing, I think, the issue of stale / dead connections - which can be notoriously difficult to detect.
Normally all long-lived connections have some kind of TCP keep-alive on them. When the keep-alive fails, the connection dies, and it's up to the application to restart the TCP connection. The proxy can send these out as well.
I remember that back in the 90s it was very hard for me on MacOS to debug network applications when the connection was interrupted or lost due to a crash or a test. It took minutes until the same port could be used again, even when a different local port was chosen. To detect a lost connection, people used ping-pong signals in their application protocol.
Since then network stacks seem to have improved tremendously, but I thought it was still customary to ping the other side from time to time. Is that not enough to solve the problem?
Sorry, I wasn't specific enough: I didn't mean ICMP ping, but an application-protocol-specific periodic request/acknowledgment exchange within a TCP connection, in order to check that the connection is still alive regardless of what the OS socket layer says. Many application protocols have this, at least the older ones, and I always thought the reason for it was that OS TCP stacks are sometimes unreliable or have unreasonable timeouts.
It's been a long time since I've done network programming, though...
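That's the pattern I mean, roughly: the application sends its own periodic ping and gives up if no pong arrives within some deadline, regardless of what the socket layer thinks. A rough sketch (the PING/PONG lines are a made-up wire format):

    import socket

    def connection_alive(sock: socket.socket, timeout: float = 5.0) -> bool:
        """App-level liveness check: send a ping, wait briefly for the pong."""
        try:
            sock.settimeout(timeout)
            sock.sendall(b"PING\n")                   # hypothetical protocol message
            return sock.recv(64).startswith(b"PONG")  # no/empty reply counts as dead
        except (socket.timeout, OSError):
            return False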
On IPv4, blocking ICMP has sadly become the norm since some people abused it for ICMP storms.
On IPv6 it is strongly recommended not to block ICMPv6, since it carries highly important information back to the sender or forward to the receiver (like when the path MTU is exceeded).
> On IPv6 it is strongly recommended not to block ICMPv6, since it carries highly important information back to the sender or forward to the receiver (like when the path MTU is exceeded).
The same is true for IPv4. Path MTU discovery is not new for IPv6. Most systems set the DF (don't fragment) bit by default.
On IPv4 it's a bit of a problem because of the number of legacy middleboxes that block things they don't recognize, and some older routers have problems with DF (always fun to debug).
IPv6 makes ICMP path MTU discovery part of the protocol, and disabling ICMP can (and will) wreak havoc on reliable connections (though TCP can compensate).
> On IPv4 it's a bit of a problem because of the number of legacy middleboxes that block things they don't recognize, and some older routers have problems with DF (always fun to debug).
The DF bit has been part of IPv4 since day one. There is no excuse for anything in production to not support it. That doesn't mean broken routers don't exist, but "obviously broken router is obviously broken" is properly resolved by removing it from service and possibly finding a suitable museum to put it in.
That isn't the issue. IPv6 doesn't have a DF bit (it's always implied), but sending an IPv4 packet with DF set has the same effect, and most modern systems do that. Most routers will then correctly send back the "Packet Too Big" ICMP message as appropriate. But if the firewall drops that ICMP message, it's just as problematic as dropping it for IPv6. The result is the same.
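For what it's worth, on Linux you can force DF on a socket so path MTU discovery is actually in play; if a firewall then eats the "Fragmentation Needed" / "Packet Too Big" replies, larger packets simply black-hole. A rough sketch (the constants fall back to the <linux/in.h> values in case this Python build doesn't export them):

    import socket

    # Fallbacks are the <linux/in.h> values, in case the socket module doesn't export them.
    IP_MTU_DISCOVER = getattr(socket, "IP_MTU_DISCOVER", 10)
    IP_PMTUDISC_DO = getattr(socket, "IP_PMTUDISC_DO", 2)

    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Set DF on everything this socket sends and rely on ICMP for path MTU discovery.
    s.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, IP_PMTUDISC_DO)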
OK, now it makes sense. If you presume everything must go over HTTPS, and you sometimes have little to no activity and want to push events back on a stream started by the client, TCP seems horrible! :)
It's hard trying to push in the wrong direction, not having UDP when it's needed, and not having stable un-NATed IPs because addresses have run out. Not sure that is TCP's fault, though; arguably that is a WWW + IP problem. More and more, HTTPS is the only Internet allowed. This is bad IMHO and leads to stupid hacks like WebSockets. Gamers used to open a port in their routers in my day, and would take a performance penalty if they did not.
Only mail servers should be talking on port 25. Clients should be using 587 to talk to servers.
I've never had an issue with port 25 at any ISP or VPS provider. You can't expect hotels and in-flight internet providers to let you run a mail server inside their perimeter, so they do block 25, but I have never had a problem with 587.
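For what it's worth, outbound submission on 587 with STARTTLS is all a mail client needs, so blocking 25 doesn't get in the way. A rough sketch with placeholder host and credentials:

    import smtplib

    # Submission on port 587 with STARTTLS; host and credentials are placeholders.
    with smtplib.SMTP("mail.example.com", 587) as smtp:
        smtp.starttls()
        smtp.login("user@example.com", "app-password")
        smtp.sendmail("user@example.com", ["friend@example.org"],
                      "Subject: test\r\n\r\nSent over 587, not 25.\r\n")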
More to the point, I've talked to second line tech support agents who have told me that they block all incoming connections on port 25 as a spam prevention measure.
> I don't grok this: if TCP's model has fundamental problems, how come the Internet works? :)
So he's arguing for using a proxy to "shield" the server from having to know about all the ins and outs of the myriad TCP clients out there.
And then he laments that an HTTP proxy might very well be the root of all evil on the internet, because the true client-to-server end-to-end connection is lost when a proxy is put in the middle, which is pretty standard practice these days.
So one thing he's incorrect about is that a proxy can't know the state of connections: it can, because that state is carried in the TCP headers, not in the data section [1]. The SYN/SYN-ACK/ACK handshake and retransmissions happen outside of the data section of the TCP packet.
I suppose you could call it a two-node consensus algorithm, the same way plugging a flash drive into your laptop is. Even after reading, I don't see the benefit of viewing TCP this way.
It's a layer under the "it's a bidirectional stream" abstraction. Having had to debug gnarly TCP problems before (generally with semi-"intelligent" load balancers), I find the two-node consensus concept good to keep in mind.
Right now the "two-node consensus algorithm" idea has got me thinking more about some of the nasty problems I've debugged and how to look at them through that lens, especially considering partial partition tolerance. "Under what circumstances can the TCP state machine on either end lose consensus, and how will/won't it recover?"
The problem is in thinking of an HTTPS request-response through proxies as a single TCP connection. It isn't.
A TLS proxy is not a normal part of a layered TCP/IP connection. It's literally in the name: "terminating" proxy. It stops the connection right there. Anything after the TLS proxy is outside the scope of the initial connection. Applications have to be engineered to pass on data from one connection to another.
An example is stateful firewalls. Almost all stateful firewalls are NAT gateways with rules. NAT gateways are designed to pass certain things from one connection to another, but they are not simply unwrapping a layer from a connection and passing it on: they maintain separate connections. Edit: Apparently I'm wrong, as Netfilter apparently only defragments and then rewrites addresses and ports, but firewall vendors basically keep independent connections (for security reasons).
TCP is specified just fine for consensus on a single TCP connection. It isn't specified for an HTTPS connection through middleware. Hence, such middleware is complicated.
You need it to get even the basic stuff right (see https://blog.netherlabs.nl/articles/2009/01/18/the-ultimate-...), and you need it even more to implement "modern" application-layer protocols like HTTP/2 (if you don't, you get data-loss bugs like this: https://trac.nginx.org/nginx/ticket/1250#comment:4).
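The "basic stuff" in that first link boils down, roughly, to: close() alone doesn't tell you the peer got your data; either the application protocol acknowledges it, or you half-close and drain. A minimal sketch of the latter, with a placeholder endpoint:

    import socket

    s = socket.create_connection(("example.com", 8080))  # placeholder endpoint
    s.sendall(b"final request\n")
    s.shutdown(socket.SHUT_WR)   # half-close: tell the peer we're done sending
    while s.recv(4096):          # drain until the peer closes its side
        pass
    s.close()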