Hacker News

for the record, i really don't know much about threads, so the following questions are probably kinda stupid.

first question: so, as the article states, the ONLY performance upside of virtual threads (versus os threads) is the much larger number of inactive threads you can keep around, thanks to their lower per-thread memory overhead.
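for concreteness, that overhead difference is easy to see in code. this is just a minimal sketch (assuming Java 21+, where virtual threads are standard) that starts 100,000 virtual threads, something that would exhaust memory or take far longer with an equal number of platform (os) threads:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class ManyVirtualThreads {
    public static void main(String[] args) throws InterruptedException {
        AtomicInteger counter = new AtomicInteger();
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < 100_000; i++) {
            // each virtual thread gets a small, growable heap-allocated stack
            // instead of a multi-megabyte os stack reservation
            threads.add(Thread.ofVirtual().start(counter::incrementAndGet));
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println(counter.get());
    }
}
```

with `Thread.ofPlatform()` in place of `Thread.ofVirtual()`, the same loop would ask the os for 100,000 real threads and stacks.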

for some reason i was expecting to read something about context switching cost too.

as far as i understand, virtual thread context switches most likely range from a lot cheaper to roughly as expensive as their carrier thread context switches, depending on how much memory has to be copied around and how the next thread to execute is found.

the problem here is that virtual context switches may be cheaper individually, but they have to be executed in addition to the os thread context switches, so the overall efficiency is actually lower because more work is spent on scheduling (os vs. os+virtual).

to minimize this, it might be possible for privileged applications to disable os thread context switching for the carrier threads as long as there are active virtual threads. that way, the context switching and scheduling overhead is reduced from "os vs. os+virt" to "os vs. virt". i.e. as soon as there are active virtual threads, the carrier thread is excluded from the os scheduler until there aren't any active virtual threads anymore (or, alternatively, until the virtual thread pool is empty).

is this a thing? does this make sense? would it be worth it? do operating systems even support "manual" (i.e. by the app) thread scheduling hints? or are the carrier threads only rarely descheduled anyway, because they're not really put to sleep as long as there are active virtual threads, making this a non-issue?

second question: as far as i understand blocking os threads, the scheduler stores which thread is waiting on which io resource, and the appropriate thread gets woken up once a waited-on io resource becomes available. this is not much of a problem with a few hundred or thousand os threads, but with virtual threads, the io resource must now be linked by the os to the os thread running the virtual thread scheduler, and then by the virtual thread scheduler to the virtual thread waiting on the resource.

so, for example, if there are 100,000 inactive virtual threads waiting for a network response and one arrives, the os scheduler has to match it to an os thread first (the one the vt scheduler runs on), and then the vt scheduler has to match it to one of the virtual threads. i.e. two lookups in hashtables with 100,000 entries each (one mapping io to os threads, the other mapping io to virtual threads). is this how it works, or do i misunderstand? since async models have the same issue but work fine, i guess this isn't really a problem in practice.

also, as far as i understand, the os thread woken up is given a kind of resource id it's been woken up for, instead of the blocking-io situation of "well, you went to sleep on a certain resource id, so it's obvious which one you've been woken up for".
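the second lookup described above can be sketched in plain Java: a map from a resource key to the (virtual) thread parked on it, with `LockSupport` standing in for the real scheduler machinery. the names here (`awaitResource`, `resourceReady`) are illustrative, not JDK internals, and the sketch assumes Java 21+:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.LockSupport;

public class WakeupTable {
    // resource id -> thread parked waiting for that resource (the hash lookup
    // the comment describes, done once at the virtual-thread layer)
    static final ConcurrentHashMap<Integer, Thread> waiters = new ConcurrentHashMap<>();

    static void awaitResource(int resourceId) {
        waiters.put(resourceId, Thread.currentThread());
        while (waiters.containsKey(resourceId)) {
            LockSupport.park(); // a virtual thread unmounts its carrier here
        }
    }

    static void resourceReady(int resourceId) {
        Thread t = waiters.remove(resourceId); // O(1) even with 100,000 waiters
        if (t != null) LockSupport.unpark(t);
    }

    public static void main(String[] args) throws InterruptedException {
        Thread vt = Thread.ofVirtual().start(() -> awaitResource(42));
        // wait until the virtual thread has registered itself as a waiter
        while (!waiters.containsKey(42)) Thread.onSpinWait();
        resourceReady(42); // the "network response" arrives
        vt.join();
        System.out.println("woken");
    }
}
```

a hashtable lookup is O(1), which is one reason the double dispatch isn't really a problem in practice.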



You do not have to context switch the kernel thread when context switching the multiplexed user thread, so the context switch should in principle be much faster. There are second-order effects of course; for example, the new user thread might touch cold cache lines, so the context switch speed-up might not make much of a difference.
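a minimal way to exercise those user-space switches (again assuming Java 21+): two virtual threads handing values across a `SynchronousQueue`, where every handoff parks one continuation and mounts the other on a carrier, without requiring a kernel thread context switch in the common case:

```java
import java.util.concurrent.SynchronousQueue;

public class PingPong {
    public static void main(String[] args) throws Exception {
        SynchronousQueue<Integer> q = new SynchronousQueue<>();
        final int N = 10_000;
        Thread producer = Thread.ofVirtual().start(() -> {
            try {
                // each put() blocks until the consumer takes: a user-space
                // park/unpark pair per handoff
                for (int i = 0; i < N; i++) q.put(i);
            } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });
        int[] received = {0};
        Thread consumer = Thread.ofVirtual().start(() -> {
            try {
                for (int i = 0; i < N; i++) { q.take(); received[0]++; }
            } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });
        producer.join();
        consumer.join();
        System.out.println(received[0]);
    }
}
```

the same ping-pong between two platform threads would pay a kernel context switch (or at least a futex wake) per handoff.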

Normally in an M:N setup the kernel threads are pinned one per physical hardware thread (i.e. core or SMT thread), so as long as they are the only program running on that core, they are never preempted.


> to minimize this, it might be possible for privileged applications to disable os thread context switching for the carrier threads as long as there are active virtual threads. that way, the context switching and scheduling overhead is reduced from "os vs. os+virt" to "os vs. virt". i.e. as soon as there are active virtual threads, the carrier thread is excluded from the os scheduler until there aren't any active virtual threads anymore (or, alternatively, until the virtual thread pool is empty).

> is this a thing? does this make sense? would it be worth it? do operating systems even support "manual" (i.e. by the app) thread scheduling hints? or are the carrier threads only rarely descheduled anyway, because they're not really put to sleep as long as there are active virtual threads, making this a non-issue?

It's normally recommended to run with as many "physical" threads as you have CPU cores, and then the OS scheduler can generally just do the right thing (assuming nothing else is running on the machine) - you don't need to do any context switches if you have as many processors as there are OS threads wanting to run. Most OSes do offer a way to "pin" a thread to a processor (at varying levels of hint/requirement) but I've only seen them used when doing fairly extreme performance tuning.
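the JDK follows this recommendation by default: the carrier pool is sized to `availableProcessors()`, overridable via the `jdk.virtualThreadScheduler.parallelism` system property (per JEP 444), though tuning it is rarely needed. A small sketch of the sizing logic (the property name is real; reading it yourself like this is just for illustration):

```java
public class CarrierCount {
    public static void main(String[] args) {
        // default number of carrier (os) threads = number of cpu cores
        int defaultCarriers = Runtime.getRuntime().availableProcessors();
        // can be overridden with -Djdk.virtualThreadScheduler.parallelism=N
        String override = System.getProperty("jdk.virtualThreadScheduler.parallelism");
        int carriers = (override != null) ? Integer.parseInt(override) : defaultCarriers;
        System.out.println(carriers >= 1);
    }
}
```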



