An interesting point raised there is that if they instead used a limited thread pool, shared by all goroutines making OS calls, you could produce deadlocks.
As I understand it, the runtime has to dedicate an OS thread to each blocking call into the OS, cloning a new one if no idle thread is available.
That is the (only) mechanism by which a goroutine can perform such a synchronous operation without risking blocking the whole process.
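It's entirely possible under some workloads that those blocking OS calls need other goroutines to make progress before they can return (think pipes: a blocked read only completes once another goroutine writes), so capping the number of threads could result in deadlock (although for some workloads it wouldn't, I guess).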
I'm groping blindly here, but from the references in that message to cgo I think it might have to do with calling into foreign code. Presumably Go's scheduler can't yield when you're inside C code (I'm not aware of any lightweight thread system that can achieve this) so you have to block there, which means you can no longer schedule other goroutines on that thread, and thus you have to spawn a new thread if you want to get any work done. But I'm just guessing here.
https://code.google.com/p/go/issues/detail?id=4056