This is an interesting illustration of why abstraction layers can be bad without proper message-passing mechanisms between them.
This could be fixed if there were a way for the application at L7 to tell the TCP stack at L4: "hey, I'm an interactive shell so I expect to have a lot of tiny packets; leave TCP_NODELAY on for this connection." That way it could be off by default but on for that application, to reduce overhead.
Of course nowadays it's probably an unnecessary optimization anyway, but back in '84 it would have been super handy.
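For concreteness, the closest mechanism that actually exists is the per-socket option itself: the application sets TCP_NODELAY on its own connection via setsockopt() and every other connection keeps the default. A minimal POSIX-sockets sketch (assuming a Linux/BSD-style stack, error handling kept terse):

    #include <netinet/in.h>    /* IPPROTO_TCP */
    #include <netinet/tcp.h>   /* TCP_NODELAY */
    #include <sys/socket.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        if (fd < 0) { perror("socket"); return 1; }

        /* Disable Nagle's algorithm on this socket only: tiny writes
           (keystrokes) go out immediately instead of waiting for the
           previous segment to be ACKed. Every other socket on the
           system keeps the default behaviour. */
        int one = 1;
        if (setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof one) < 0) {
            perror("setsockopt(TCP_NODELAY)");
            close(fd);
            return 1;
        }

        /* ... connect() and exchange data as usual ... */
        close(fd);
        return 0;
    }

Note the defaults are the reverse of the proposal above: the delay is on for every socket, and interactive programs typically opt out of it per connection.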
"I'm an interactive shell so I expect to have a lot of tiny packets" is what the delay is for. If you want to turn it off for those, you should turn it off for everything.
(If you're worried about programs that buffer badly, then you could compensate with a 1 ms delay. But not this round-trip stuff.)