Hacker News

for the record, i really don't know much about threads, so the following questions are probably kinda stupid.

first question: so, as the article states, the ONLY performance upside of virtual threads (versus os threads) is the much larger number of inactive threads you can keep around, thanks to their lower per-thread memory overhead.
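for concreteness, that overhead difference is easy to see in code. this is just a minimal sketch (assuming Java 21+, where virtual threads are standard) that starts 100,000 virtual threads, something that would exhaust memory or take far longer with an equal number of platform (os) threads:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class ManyVirtualThreads {
    public static void main(String[] args) throws InterruptedException {
        AtomicInteger counter = new AtomicInteger();
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < 100_000; i++) {
            // each virtual thread gets a small, growable heap-allocated stack
            // instead of a multi-megabyte os stack reservation
            threads.add(Thread.ofVirtual().start(counter::incrementAndGet));
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println(counter.get());
    }
}
```

with `Thread.ofPlatform()` in place of `Thread.ofVirtual()`, the same loop would ask the os for 100,000 real threads and stacks.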

for some reason i was expecting to read something about context switching cost too.

as far as i understand, virtual thread context switches most likely range from a lot cheaper to roughly as expensive as their carrier thread context switches, depending on how much memory has to be copied around and how the next thread to execute is found.

the problem here is that virtual context switches may be cheaper individually, but they have to be executed in addition to the os thread context switches, so the overall efficiency is actually lower because more work is spent on scheduling (os vs. os+virtual).

to minimize this, it might be possible for privileged applications to disable os thread context switching for the carrier threads as long as there are active virtual threads. that way, the context switching and scheduling overhead is reduced from "os vs. os+virt" to "os vs. virt". i.e. as soon as there are active virtual threads, the carrier thread is excluded from the os scheduler until there aren't any active virtual threads anymore (or, alternatively, until the virtual thread pool is empty).

is this a thing? does this make sense? would it be worth it? do operating systems even support "manual" (i.e. by the app) thread scheduling hints? or are the carrier threads only rarely descheduled anyway, because they're not really put to sleep as long as there are active virtual threads, making this a non-issue?

second question: as far as i understand blocking os threads, the scheduler stores which thread is waiting on which io resource, and the appropriate thread gets woken up once a waited-on io resource becomes available. this is not much of a problem with a few hundred or thousand os threads, but with virtual threads, the io resource must now be linked by the os to the os thread running the virtual thread scheduler, and then by the virtual thread scheduler to the virtual thread waiting on the resource.

so, for example, if there are 100,000 inactive virtual threads waiting for a network response and one arrives, the os scheduler has to match it to an os thread first (the one the vt scheduler runs on), and then the vt scheduler has to match it to one of the virtual threads. i.e. two lookups in hashtables with 100,000 entries each (one mapping io to os threads, the other mapping io to virtual threads). is this how it works, or do i misunderstand? since async models have the same issue but work fine, i guess this isn't really a problem in practice.

also, as far as i understand, the os thread woken up is given a kind of resource id it's been woken up for, instead of the blocking-io situation of "well, you went to sleep on a certain resource id, so it's obvious which one you've been woken up for".
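the second lookup described above can be sketched in plain Java: a map from a resource key to the (virtual) thread parked on it, with `LockSupport` standing in for the real scheduler machinery. the names here (`awaitResource`, `resourceReady`) are illustrative, not JDK internals, and the sketch assumes Java 21+:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.LockSupport;

public class WakeupTable {
    // resource id -> thread parked waiting for that resource (the hash lookup
    // the comment describes, done once at the virtual-thread layer)
    static final ConcurrentHashMap<Integer, Thread> waiters = new ConcurrentHashMap<>();

    static void awaitResource(int resourceId) {
        waiters.put(resourceId, Thread.currentThread());
        while (waiters.containsKey(resourceId)) {
            LockSupport.park(); // a virtual thread unmounts its carrier here
        }
    }

    static void resourceReady(int resourceId) {
        Thread t = waiters.remove(resourceId); // O(1) even with 100,000 waiters
        if (t != null) LockSupport.unpark(t);
    }

    public static void main(String[] args) throws InterruptedException {
        Thread vt = Thread.ofVirtual().start(() -> awaitResource(42));
        // wait until the virtual thread has registered itself as a waiter
        while (!waiters.containsKey(42)) Thread.onSpinWait();
        resourceReady(42); // the "network response" arrives
        vt.join();
        System.out.println("woken");
    }
}
```

a hashtable lookup is O(1), which is one reason the double dispatch isn't really a problem in practice.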



You do not have to context switch the kernel thread when context switching the multiplexed user thread, so the context switch should in principle be much faster. There are second-order effects of course; for example, the new user thread might touch cold cache lines, so the context switch speed-up might not make much of a difference.
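a minimal way to exercise those user-space switches (again assuming Java 21+): two virtual threads handing values across a `SynchronousQueue`, where every handoff parks one continuation and mounts the other on a carrier, without requiring a kernel thread context switch in the common case:

```java
import java.util.concurrent.SynchronousQueue;

public class PingPong {
    public static void main(String[] args) throws Exception {
        SynchronousQueue<Integer> q = new SynchronousQueue<>();
        final int N = 10_000;
        Thread producer = Thread.ofVirtual().start(() -> {
            try {
                // each put() blocks until the consumer takes: a user-space
                // park/unpark pair per handoff
                for (int i = 0; i < N; i++) q.put(i);
            } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });
        int[] received = {0};
        Thread consumer = Thread.ofVirtual().start(() -> {
            try {
                for (int i = 0; i < N; i++) { q.take(); received[0]++; }
            } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });
        producer.join();
        consumer.join();
        System.out.println(received[0]);
    }
}
```

the same ping-pong between two platform threads would pay a kernel context switch (or at least a futex wake) per handoff.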

Normally in an M:N setup the kernel threads are pinned one per physical hardware thread (i.e. core or SMT thread), so as long as they are the only program running on that core, they are never preempted.


> to minimize this, it might be possible for privileged applications to disable os thread context switching for the carrier threads as long as there are active virtual threads. that way, the context switching and scheduling overhead is reduced from "os vs. os+virt" to "os vs. virt". i.e. as soon as there are active virtual threads, the carrier thread is excluded from the os scheduler until there aren't any active virtual threads anymore (or, alternatively, until the virtual thread pool is empty).

> is this a thing? does this make sense? would it be worth it? do operating systems even support "manual" (i.e. by the app) thread scheduling hints? or are the carrier threads only rarely descheduled anyway, because they're not really put to sleep as long as there are active virtual threads, making this a non-issue?

It's normally recommended to run with as many "physical" threads as you have CPU cores, and then the OS scheduler can generally just do the right thing (assuming nothing else is running on the machine) - you don't need to do any context switches if you have as many processors as there are OS threads wanting to run. Most OSes do offer a way to "pin" a thread to a processor (at varying levels of hint/requirement) but I've only seen them used when doing fairly extreme performance tuning.
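the JDK follows this recommendation by default: the carrier pool is sized to `availableProcessors()`, overridable via the `jdk.virtualThreadScheduler.parallelism` system property (per JEP 444), though tuning it is rarely needed. A small sketch of the sizing logic (the property name is real; reading it yourself like this is just for illustration):

```java
public class CarrierCount {
    public static void main(String[] args) {
        // default number of carrier (os) threads = number of cpu cores
        int defaultCarriers = Runtime.getRuntime().availableProcessors();
        // can be overridden with -Djdk.virtualThreadScheduler.parallelism=N
        String override = System.getProperty("jdk.virtualThreadScheduler.parallelism");
        int carriers = (override != null) ? Integer.parseInt(override) : defaultCarriers;
        System.out.println(carriers >= 1);
    }
}
```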



