Asynchronous programming is a great fit for IO-driven programs, because modern IO is inherently asynchronous. This is clearly true for networking, but even for disk IO, commands are generally sent to the disk and results come back later. User input is another inherently asynchronous event source, and that's why JS is built around it.
As for threading vs. explicit yielding (e.g. coroutines), I’d say it’s a matter of taste. I generally prefer to see where code is going to yield. Something like gevent can make control flow confusing, since it’s unclear what will yield, and you need to implement explicit yielding for CPU-bound tasks anyway. Its green threads are based on greenlet, which provides cooperative coroutines.
Cooperative multitasking was a big problem in operating systems, where you can’t tell whether other processes are looking for CPU time or not. But within your own code, you can control it however you want!
> Asynchronous programming is a great fit for IO-driven programs
Yeah, but this could already be solved without "async/await compiler magic" in native code, just with OS primitives. For instance, with Windows-style event objects it might look like this in pseudo-code:
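A minimal runnable sketch of the pattern, using Python's threading.Event as a stand-in for Windows event objects (start_io and wait_all are hypothetical names, not any real API):

```python
import threading

def start_io(result_slot, value):
    """Simulate kicking off a non-blocking IO operation: do the work on
    another thread and return an event that is signaled on completion."""
    done = threading.Event()
    def worker():
        result_slot.append(value)  # pretend this is the IO result arriving
        done.set()                 # signal completion (like SetEvent)
    threading.Thread(target=worker).start()
    return done

def wait_all(events):
    # Rough equivalent of WaitForMultipleObjects(..., bWaitAll=TRUE):
    # block the calling thread until every event is signaled.
    for e in events:
        e.wait()

results_a, results_b, results_c = [], [], []
ev1 = start_io(results_a, "a")   # start three IO operations...
ev2 = start_io(results_b, "b")
ev3 = start_io(results_c, "c")
wait_all([ev1, ev2, ev3])        # ...and block until all three are done
print(results_a + results_b + results_c)  # → ['a', 'b', 'c']
```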
This would run three IO operations "in parallel", and execution blocks at the wait_all() call until all three have finished.
Looks just as convenient as async/await style, but doesn't need special language features or a tricky code-transformation pass in the compiler which turns sequential code into a switch-case state machine (and most importantly, it doesn't have the 'function-color problem').
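To make the "switch-case state machine" concrete, here is a hand-written Python sketch (hypothetical, not actual compiler output) of what such a transformation produces for sequential code with two await points:

```python
class Task:
    """Hand-rolled equivalent of what an async/await compiler pass
    might emit for:  x = await read_a(); y = await read_b(); return x + y
    Each resume() call advances the switch-case to the next await point."""

    def __init__(self):
        self.state = 0
        self.x = None

    def resume(self, value=None):
        if self.state == 0:
            self.state = 1
            return ("await", "read_a")       # suspend at the first await
        elif self.state == 1:
            self.x = value                   # result of read_a arrives
            self.state = 2
            return ("await", "read_b")       # suspend at the second await
        elif self.state == 2:
            return ("done", self.x + value)  # result of read_b; finish

t = Task()
print(t.resume())    # → ('await', 'read_a')
print(t.resume(1))   # → ('await', 'read_b')
print(t.resume(2))   # → ('done', 3)
```

The sequential-looking source is gone; what remains is a resumable object whose local variables have been hoisted into fields, which is the transformation the comment is objecting to.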
(this really makes me wonder why Rust has gone down the Javascript-style async/await route with function coloring - the only reason why it remotely makes sense is that it also works in WASM).
> this really makes me wonder why Rust has gone down the Javascript-style async/await route with function coloring - the only reason why it remotely makes sense is that it also works in WASM
As someone who’s done asynchronous programming in Rust before Futures (I’ll call it C style), then with Futures, then with async/await, it’s because it is far simpler. On top of that, it allows an ecosystem of libraries to be built up around common implementations for common problems. Without it, what you end up with is a lot of people solving common state machine problems in ways that rely on global context or other things that make the library unportable and hard to reuse in other contexts. With async/await, we actually have multiple runtimes in the wild, and common implementations that work across those different runtimes without any changes needed. So while I’m disappointed that we ended up with function coloring, I have to say that it’s far simpler than anything else I’ve worked with, while maintaining zero overhead, allowing it to be used in no_std contexts like operating systems and embedded software.
But the difference is that wait_all() is blocking the thread, right? Or does it keep running the event loop while it's waiting, so callbacks for other events can be processed on the same thread?
If it does the latter, the stack will keep growing with each nested wait call:
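A runnable Python sketch of that failure mode (run_until and make_callback are hypothetical names): each nested wait re-enters the dispatch loop before the current frame can return, so the measured stack depth keeps growing:

```python
import traceback

depths = []
queue = []

def run_until(flag):
    # A wait() that keeps dispatching other callbacks on the SAME thread
    # until `flag` is set, instead of blocking the thread.
    while not flag:
        queue.pop(0)()

def make_callback(level, done):
    def cb():
        depths.append(len(traceback.extract_stack()))  # record stack depth
        if level < 3:
            child_done = []
            queue.append(make_callback(level + 1, child_done))
            run_until(child_done)  # nested wait: re-enter the loop here
        done.append(True)          # signal this callback's completion
    return cb

top_done = []
queue.append(make_callback(1, top_done))
run_until(top_done)
print(depths)  # stack depth is strictly increasing with nesting level
```

Each callback runs from inside the previous callback's run_until frame, so none of the outer frames can unwind until the innermost wait completes.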
Yeah, it blocks the thread; any other "user work" needs to happen on a different thread. But if you just need multiple non-blocking IO operations to run in parallel, it's as simple as it gets.
(the operating system's thread scheduler is basically the equivalent of the JS "event loop").
Desktop operating systems all have application event loops that run within a single thread, because the OS thread scheduler is not the same thing. If you just want an event loop, trying to use threads for everything will often end in tears due to concurrent data access issues.
An async/await runtime doesn't necessarily need to run everything on the same thread though (that's really just a Javascript runtime restriction); it could use a thread pool instead. In that case, the same concurrent data access issues would apply.
The wait_all() "yields" to the operating system's thread scheduler, which switches to another thread that's ready to run (or if fibers are used instead of threads, a 'fiber scheduler' would switch to a different fiber within the same thread).
Taking into account that an async/await runtime also needs to switch to a different "context", the performance difference really shouldn't be all that big, especially if the task scheduler uses fibers (YMMV of course).
That's not how an operating system models disk access though. You synchronously write to the kernel cache, and the kernel eventually flushes those writes to disk.
Wanting to do asynchronous I/O to disk is only useful if you're aiming to bypass the cache. In practice it is very hard to reach higher performance when doing that though.
I was referring to the fact that interaction with the disk itself is asynchronous. Indeed, the interface provided by a kernel for files is synchronous, and for most cases, that's what programmers probably want.
But I also think the interest in things like io_uring on Linux reflects that people are open to asynchronous file IO, since the kernel is doing asynchronous work internally anyway. To be honest, I don't know much about io_uring though; I haven't used it for anything serious.
There's no perfect choice (as always). After all, for extremely high-performance scenarios, people avoid the async nature of IO entirely and dedicate a thread to busy-looping and polling for readiness. That's what DPDK does for networking. And I think io_uring and other Linux disk interfaces have options to use polling internally.