This is the part I don’t understand. You know you can specify the stack size explicitly in POSIX? I can set the stack size to the same 2k as in Go and have less overhead. The only difference is that it won’t auto-expand like in Go.
No, the stack only grows for the initial thread in a process. Subsequent threads don't have growable stacks. And this is true for virtually every operating system that matters, not just Linux.
I don't think this is correct, do you have a source for this?
The pthread_attr_setstacksize(3) man page says:
> The stack size attribute determines the minimum size (in bytes) that will be allocated for threads created using the thread attributes object attr.
and:
> A thread's stack size is fixed at the time of thread creation. Only the main thread can dynamically grow its stack.
My understanding is that it is referring to virtual memory. The kernel would reserve a giant blob of address space, [stacksize + heapsize + some other stuff] large. My reading of the manpage above is that the main thread can change this allocation, while other threads are stuck with what they started with.
But why would the kernel actually realize the stack portion of the allocation? Surely if I create a 1G stack child thread, it will not realize those stack pages until I actually use them?
The initial thread in a process, on every operating system that is still relevant today, uses guard pages to grow its virtual memory allocation for the stack segment. Physical memory for the stack segment's pages may or may not be mapped into the process address space (it will usually be mapped). This stack can't shrink, and once the thread touches a page, it becomes committed memory and will always stay committed.
Other threads in a process do not use guard pages and use a fixed virtual memory allocation for their stacks. Their stacks are fixed in size and can't grow. Just like for the initial thread, once a page is touched, it becomes committed memory and will always stay that way.
In Go, stacks start small and use a variable amount of virtual memory. Go stacks can shrink, freeing both address space and physical memory.
I think Linux can defer committing memory for the pages of the stack until they're used, but you need to reserve the entire virtual memory range in the first place, don't you? Otherwise, how can you dynamically allocate more stack virtual address space to an arbitrary thread without relocating it? Or, if you do relocate, how do you update pointers into the stack?
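The reserve-now, commit-on-touch behaviour described above can be sketched with a plain anonymous mapping standing in for a thread stack. This is Linux-specific and rests on assumptions: `reserve_and_touch` and `resident_pages` are hypothetical helpers, `MAP_NORESERVE` and `mincore()` are used only to make the effect observable, and a real threading library may set up its stacks differently.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Count resident pages in [addr, addr + len); returns -1 on error. */
static long resident_pages(void *addr, size_t len, size_t page)
{
    size_t n = (len + page - 1) / page;
    unsigned char *vec = malloc(n);
    long count = -1;

    if (vec == NULL)
        return -1;
    if (mincore(addr, len, vec) == 0) {
        count = 0;
        for (size_t i = 0; i < n; i++)
            count += vec[i] & 1;   /* low bit set: page is resident */
    }
    free(vec);
    return count;
}

/*
 * Reserve `len` bytes of address space, write to the first `touch`
 * pages, and report how many pages were resident before and after.
 * Returns 0 on success, -1 on failure.
 */
int reserve_and_touch(size_t len, size_t touch, long *before, long *after)
{
    long ps = sysconf(_SC_PAGESIZE);
    size_t page;
    unsigned char *p;

    assert(before != NULL && after != NULL);
    if (ps <= 0)
        return -1;
    page = (size_t)ps;
    if (touch > len / page)
        return -1;

    /* Address space is reserved here; no physical pages are needed yet. */
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    *before = resident_pages(p, len, page);
    for (size_t i = 0; i < touch; i++)
        p[i * page] = 1;           /* first write faults the page in */
    *after = resident_pages(p, len, page);

    munmap(p, len);
    return (*before < 0 || *after < 0) ? -1 : 0;
}
```

On a typical Linux system, reserving 64 MiB and touching four pages should leave `before` at or near zero and `after` at four or more (transparent huge pages can make it larger). The address range is fixed at `mmap` time, which is why the region cannot simply be extended in place later.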