
Yes, it spawns a kernel thread if all the threads are currently blocked. I ran into this issue running Go on a FUSE mounted file system. If the network hung, Go would spawn thousands of threads and OOM. Go can't _not_ spawn them, as that would risk deadlock.
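For anyone hitting the same thing, here is a minimal sketch of how the thread growth can at least be made visible. It assumes the standard `runtime/debug.SetMaxThreads` and the `threadcreate` profile from `runtime/pprof`; the helper names `capThreads` and `threadsCreated` and the value 2000 are made up for illustration. Note that the cap does not queue work: the runtime aborts the program when the limit is exceeded, so this only trades a slow OOM for a fast crash.

```go
package main

import (
	"fmt"
	"runtime/debug"
	"runtime/pprof"
)

// capThreads (hypothetical helper) lowers the runtime's OS-thread limit and
// returns the previous limit. Exceeding the limit crashes the program; it
// does not make goroutines wait.
func capThreads(limit int) int {
	return debug.SetMaxThreads(limit)
}

// threadsCreated (hypothetical helper) reports how many OS threads the
// runtime has created so far, using the built-in "threadcreate" profile.
func threadsCreated() int {
	return pprof.Lookup("threadcreate").Count()
}

func main() {
	prev := capThreads(2000) // 2000 is an arbitrary illustrative value
	fmt.Println("previous thread limit:", prev)
	fmt.Println("threads created so far:", threadsCreated())
}
```

Polling `threadsCreated` from a metrics loop would show the count climbing while the FUSE mount hangs.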


(disclaimer: I know very little about Go)

An expected way to manage this would be to limit the number of threads that run blocking syscalls. How you do it is up to you (a thread pool for syscalls, letting any thread run a syscall but having it check a mutex first, etc.).

I don't think there's a danger of deadlocks here -- your blocking syscalls shouldn't be dependent on other goroutines. Eventually the calls will succeed or time out (one hopes) and the next call will commence.

In my experience, you can usually find a safe number of parallel syscalls to run -- and it's often not very many.
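A rough sketch of that kind of limit in Go, using a buffered channel as a counting semaphore around `os.Stat`. The names `limitedStat` and `statAll` and the limit of 4 are assumptions for illustration, not anything the Go runtime does on its own.

```go
package main

import (
	"fmt"
	"os"
	"sync"
)

// maxBlocking is an arbitrary illustrative limit on in-flight blocking calls.
const maxBlocking = 4

// sem is a counting semaphore: one buffered slot per allowed blocking call.
var sem = make(chan struct{}, maxBlocking)

// limitedStat runs os.Stat while holding one of the maxBlocking slots.
// A goroutine that cannot get a slot parks on the channel send, which
// does not tie up an OS thread.
func limitedStat(path string) (os.FileInfo, error) {
	sem <- struct{}{}        // acquire a slot
	defer func() { <-sem }() // release it when the call returns
	return os.Stat(path)
}

// statAll stats every path concurrently, with at most maxBlocking calls
// in flight, and returns how many succeeded.
func statAll(paths []string) int {
	var (
		wg sync.WaitGroup
		mu sync.Mutex
		ok int
	)
	for _, p := range paths {
		wg.Add(1)
		go func(p string) {
			defer wg.Done()
			if _, err := limitedStat(p); err == nil {
				mu.Lock()
				ok++
				mu.Unlock()
			}
		}(p)
	}
	wg.Wait()
	return ok
}

func main() {
	fmt.Println("successful stats:", statAll([]string{".", "..", os.TempDir()}))
}
```

If one of the limited calls hangs forever, its slot is never released, so this bounds thread growth but does not by itself protect against a dead mount.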


Imagine 1000 pipes, each of which has a goroutine calling write() and another calling read(). If you schedule all of the writers and none of the readers (or vice versa) you'll get a deadlock.


AFAIK, when a goroutine "calls write" (on a channel, socket, stream, mutex, etc.) and the write would block, it yields execution and the scheduler can run something else, which can be a reader goroutine (for example, after all the writers are blocked). So there's no deadlock as long as there is at least one reader for every writer.
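A quick sketch of that behaviour for the channel case only (sockets go through the runtime's network poller, which this does not exercise). It pins the scheduler to a single P with `runtime.GOMAXPROCS(1)` and starts all writers before any reader; the function name `pingAll` is made up for illustration.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// pingAll starts n writers on unbuffered channels, then n readers, with
// only one goroutine allowed to run at a time. It returns the sum of the
// values received. Each blocked writer is parked by the scheduler, which
// frees the single P to run other goroutines, including the readers.
func pingAll(n int) int {
	prev := runtime.GOMAXPROCS(1)
	defer runtime.GOMAXPROCS(prev)

	chans := make([]chan int, n)
	for i := range chans {
		chans[i] = make(chan int)
	}

	var wg sync.WaitGroup
	for _, c := range chans {
		wg.Add(1)
		go func(c chan int) {
			defer wg.Done()
			c <- 1 // blocks until a reader arrives; the goroutine yields
		}(c)
	}

	var (
		mu  sync.Mutex
		sum int
	)
	for _, c := range chans {
		wg.Add(1)
		go func(c chan int) {
			defer wg.Done()
			v := <-c
			mu.Lock()
			sum += v
			mu.Unlock()
		}(c)
	}

	wg.Wait()
	return sum
}

func main() {
	fmt.Println("values received:", pingAll(1000))
}
```

No extra OS threads are needed here because a blocked channel operation never enters the kernel.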


That requires the underlying syscall to support an async or non-blocking mode, though. Disk I/O and stat() don't on Linux, for example. The usual alternative is some sort of background pool for blocking tasks (which adds non-trivial overhead) or, where supported by the OS, scheduler activations.


You are right, see Synchronous System Calls in

https://www.ardanlabs.com/blog/2018/08/scheduling-in-go-part...


Sorry, I also expected those things that can be translated into non-blocking calls with select-ish apis for readiness to be translated as well -- because it's the right thing to do in this type of environment.

To be more specific, most socket calls should be async + selectish, but file system calls would likely be done synchronously, because async doesn't generally work for that. In any case, limiting the number of outstanding file I/O requests tends to help throughput.


stat() cannot (on platforms I am familiar with) be performed in a selectish/nonblocking way. That call can block forever. Local filesystem access in general is still not async-capable in very important ways.


Yes, filesystem calls are going to block -- but putting them into a limited thread pool shouldn't deadlock your system -- you can't do a blocking read on a file while waiting for it to be filled by another thread (unless/until you start playing with FUSE, I guess).



