
Normally a POSIX thread gets a very large stack (1MB is a typical default), and obviously these "virtual stacks" will only be as large as they grow (splayed on the heap?). But the memory for those POSIX thread stacks isn't necessarily allocated up front! The OS grows the stack when the thread traps trying to access the guard page, so it's not really a 1MB stack.
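For concreteness, a minimal sketch (not from the article) that queries the default per-thread reservation; whatever it prints is reserved address space, committed lazily on first touch, not memory handed out up front:

    #include <pthread.h>
    #include <stdio.h>

    /* Minimal sketch: print the default pthread stack reservation.
     * On glibc/Linux this usually follows the RLIMIT_STACK soft limit. */
    int main(void) {
        pthread_attr_t attr;
        size_t stacksize = 0;

        pthread_attr_init(&attr);                     /* filled with defaults */
        pthread_attr_getstacksize(&attr, &stacksize);
        printf("default stack reservation: %zu bytes\n", stacksize);
        pthread_attr_destroy(&attr);
        return 0;
    }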

Now, if you need to serve 1e6 clients with threads, and those are 1MB stack threads, then you'll be using 1TB of your VM space, which... is almost certainly going to have some performance issues (MMU table size issues at the very least). If you splay your stacks on the heap as linked lists of stack chunks then you might get away with having a very large (and fragmented) heap with large page table entries, which might be a win.
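If you do stick with threads, one mitigation is to shrink the per-thread reservation at creation time. A hedged sketch, with 64KB as an illustrative number rather than a recommendation:

    #include <limits.h>
    #include <pthread.h>

    static void *handle_client(void *arg) {
        (void)arg;                        /* per-connection work goes here */
        return NULL;
    }

    int main(void) {
        pthread_attr_t attr;
        pthread_t tid;
        size_t small_stack = 64 * 1024;   /* illustrative; tune for real workloads */

        if (small_stack < PTHREAD_STACK_MIN)
            small_stack = PTHREAD_STACK_MIN;

        pthread_attr_init(&attr);
        pthread_attr_setstacksize(&attr, small_stack);  /* cap the VM reservation */
        pthread_create(&tid, &attr, handle_client, NULL);
        pthread_join(tid, NULL);
        pthread_attr_destroy(&attr);
        return 0;
    }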

I think approaches on the CPS side of the spectrum will generally be better than this. No, I don't study this and I don't have numbers. Yes, CPS in general means allocating closures on the heap, so some state does live splayed all over rather than compressed, but it doesn't have to be that way: often you'll have only a handful of such closures, and the language could understand that they are one-time-use closures (hello Rust) so that no GC is needed.
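A hand-rolled CPS sketch of what that looks like in C (names made up for illustration): the closure is one small heap object carrying the continuation plus the state it needs; it is used exactly once and freed by the callee, so nothing needs a GC:

    #include <stdio.h>
    #include <stdlib.h>

    typedef void (*cont_fn)(int result);

    struct closure {
        cont_fn k;       /* the continuation to invoke */
        int     base;    /* captured state */
    };

    /* "Async" add: instead of returning, it invokes the continuation. */
    static void add_async(int x, struct closure *c) {
        int result = c->base + x;
        cont_fn k = c->k;
        free(c);         /* one-shot closure: released before the call */
        k(result);
    }

    static void print_result(int result) {
        printf("result = %d\n", result);
    }

    int main(void) {
        struct closure *c = malloc(sizeof *c);
        c->k = print_result;
        c->base = 40;
        add_async(2, c); /* prints "result = 42" */
        return 0;
    }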

I've written a small (proprietary) HTTP server that is hand-coded CPS -- specifically, it supports hanging GETs with "Range: bytes=0-" on regular files as a form of tail -f over HTTP, which is great for log files. That implementation has a single object per GET that holds all the state needed, and the only other place state lives is in the epoll event registrations (which essentially are the closures; they are very small, and there's only one per connection). Granted, this is a very simple application, and it would be a lot more complicated if, for example, it had to do async I/O directly on a block device to implement a filesystem in the same process -- that would require more care to keep the state compressed.
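A very rough sketch of that pattern (hypothetical names, not the actual server): one small heap object per GET holds all the state, and its pointer rides along in the epoll registration, which is what makes the registration act like a closure:

    #include <stdlib.h>
    #include <sys/epoll.h>
    #include <sys/types.h>
    #include <unistd.h>

    struct get_state {           /* everything one hanging GET needs */
        int   client_fd;         /* the HTTP connection */
        int   file_fd;           /* the log file being tailed */
        off_t offset;            /* how much has been sent so far */
    };

    static int register_get(int epfd, struct get_state *st) {
        struct epoll_event ev;
        ev.events = EPOLLOUT | EPOLLET;   /* wake up when writable again */
        ev.data.ptr = st;                 /* the "closure": captured state */
        return epoll_ctl(epfd, EPOLL_CTL_ADD, st->client_fd, &ev);
    }

    /* Event-loop body: epoll hands the state object back and the handler
     * continues the transfer from st->offset (e.g. with sendfile). */
    static void on_event(struct epoll_event *ev) {
        struct get_state *st = ev->data.ptr;
        (void)st;
    }

    int main(void) {
        int epfd = epoll_create1(0);
        int fds[2];
        pipe(fds);                          /* stand-in for a client socket */

        struct get_state *st = calloc(1, sizeof *st);
        st->client_fd = fds[1];             /* write end: immediately writable */
        register_get(epfd, st);

        struct epoll_event ev;
        if (epoll_wait(epfd, &ev, 1, 0) == 1)
            on_event(&ev);                  /* the same state object comes back */

        free(st);
        close(fds[0]); close(fds[1]); close(epfd);
        return 0;
    }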

So in general I'm for CPS. But it's generally true that CPS solutions cost more dev time, and that can be prohibitive. The memory footprint difference will be a linear factor, which doesn't trivially justify the additional dev cost. Then again, if you'll be running lots and lots of instances with lots and lots of clients, the run-time savings can easily be gargantuan compared to the dev costs -- but no one measures this, and by the time you wish you'd used CPS it will be too late and the cost of reimplementation prohibitive. That said, async/await might fit the bill well enough most of the time.



Could you explain the difference between CPS and using async/await? I have a C# background but always assumed they were the same thing!


CPS == callback hell

async/await == the syntax and compiler help you manage the callback hell


so different syntax but they compile down to the same thing?


Yes, roughly. That, or they compile to coroutines (i.e., green threads).
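Roughly what that lowering looks like, sketched by hand in C (made-up names; C# lowers an async method to a MoveNext() state machine, Rust to a poll-able one): a local that lives across an await becomes a struct field, and each await point becomes a state to resume from:

    #include <stdio.h>

    enum step { AWAITING_READ, AWAITING_WRITE, DONE };

    struct get_coroutine {
        enum step state;   /* where to resume after the next completion */
        int       n;       /* a "local variable" that survives across awaits */
    };

    /* Plays the role of the compiler-generated MoveNext()/poll(). */
    static void resume(struct get_coroutine *co, int completed_value) {
        switch (co->state) {
        case AWAITING_READ:
            co->n = completed_value;      /* result of the awaited read */
            printf("read done: %d, starting write\n", co->n);
            co->state = AWAITING_WRITE;   /* suspend until the write completes */
            break;
        case AWAITING_WRITE:
            printf("write done\n");
            co->state = DONE;
            break;
        case DONE:
            break;
        }
    }

    int main(void) {
        struct get_coroutine co = { AWAITING_READ, 0 };
        resume(&co, 42);   /* event loop says: the read finished with 42 */
        resume(&co, 0);    /* event loop says: the write finished */
        return 0;
    }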



