
A lot of discussion about async Rust assumes that the reason one would want to use async/Futures is for performance and scalability reasons.

Personally, though, I would strongly prefer to use async rather than explicit threading even for cases where performance wasn’t the highest priority. The conceptual model is just better. Futures allow you to cleanly express composition of sub-tasks in a way that explicit threading doesn’t:

https://monkey.org/~marius/futures-arent-ersatz-threads.html




Futures are nicely equivalent to one-shot channels combined with threads.
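A minimal sketch of that equivalence using only the standard library: the receiving end of a one-shot channel plays the role of the future, and recv() plays the role of await. The `spawn_future` name is just illustrative.

```rust
use std::sync::mpsc;
use std::thread;

// Emulate a future with a thread plus a one-shot channel:
// the Receiver is the "future", recv() is the "await".
fn spawn_future<T: Send + 'static>(
    f: impl FnOnce() -> T + Send + 'static,
) -> mpsc::Receiver<T> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        let _ = tx.send(f()); // deliver the result exactly once
    });
    rx
}

fn main() {
    let fut = spawn_future(|| 6 * 7); // work starts immediately on its own thread
    // ... do other work here ...
    let answer = fut.recv().unwrap(); // block until the result arrives
    assert_eq!(answer, 42);
}
```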


Not quite: Rust futures also have immediate cancellation and easy timeouts that can be imposed externally on any future.

In threads that perform blocking I/O you don't get that, and need to support timeouts and cancellation explicitly in every blocking call.


With some nice syntax sugar on top, but yeah, pretty much.


Futures as a concept are orthogonal to async; they work just as well in an explicit threading model.


They can be used that way, but you end up with exactly the same problems that async programming aims to avoid (performance, deadlocks, your business logic being cluttered with low level implementation details).


> that async programming aims to avoid (performance, deadlocks, your business logic being cluttered with low level implementation details).

I disagree: my code looks safe and simple with explicit blocking threads, and it is much easier to reason about and tune, in contrast to async frameworks, which hide most of the details under the hood.

You can argue about performance, that async/epoll/etc. avoids spawning thousands of threads and removes some overhead, but per my research there aren't many benchmarks on the internet showing that this overhead is large.


If you are using explicit blocking and sharing data between threads, and have not run into deadlocks, then your application is trivial (which is great, if it solves your problem).


Could you explain how sharing data between threads differs between async and blocking programming?


You can minimize sharing data between threads because it's easier to have data affinity with threads (i.e., only thread A will ever read or write a given piece of data). You can still access that data from multiple modules because the whole thread is never blocked waiting for I/O (thanks to async). An extreme example is Node.js, where you have only one thread, can concurrently do thousands of things, and never have to coordinate data access (e.g., via mutexes).


That may be true if you are OK with having only one thread and not utilizing parallelism.


It's not either or, you can combine the two. I've worked on a system that did real time audio mixing for 10000s of concurrent connections, utilizing >50 cores, mostly with one thread each. Each thread had thread-local data, was receiving/sending audio packets to hundreds/thousands of different IP addresses just fine without worrying about mutexes at all. Try that with tens of thousands of actual OS threads and the associated scheduling overhead.

Having data affinity to cores is also great for cache hit rates.

Here is part of the C++ runtime this is based on: https://github.com/goto-opensource/asyncly. I was the principal author of it when it was created (before it was open sourced).


> Each thread had thread-local data, was receiving/sending audio packets to hundreds/thousands of different IP addresses just fine without worrying about mutexes at all.

It doesn't sound like they're really sharing data with each other; it looks like your logic is easily linearizable and your data localized, and you can't implement access to some global hashmap that way, for example.

> Try that with tens of thousands of actual OS threads and the associated scheduling overhead.

I run this (10k threads blocked on DB access) in prod and it works fine for my needs. There are lots of statements on the internet about the overhead, but not many benchmarks showing how large it actually is.

> Here is part of the C++ runtime this is based on

yeah, so I need one runtime on top of another runtime, with unknown quality, support, longevity, and number of gotchas.


> It doesn't sound like they're really sharing data with each other; it looks like your logic is easily linearizable and your data localized, and you can't implement access to some global hashmap that way, for example.

Yes, because data can have thread affinity. Data doesn't need to be shared by _all_ connections, just by a few hundred/thousand. This enables connections to be scheduled to run on the same thread so that they can share data without synchronization.

> I run this (10k threads blocked on DB access) in prod and it works fine for my needs. There are lots of statements on the internet about the overhead, but not many benchmarks showing how large it actually is.

The underlying problem is old and well researched: https://en.wikipedia.org/wiki/C10k_problem


> Data doesn't need to be shared by _all_ connections,

Data doesn't need to be shared in your specific case; that doesn't hold in general.

> The underlying problem is old and well researched: https://en.wikipedia.org/wiki/C10k_problem

A wiki page doesn't mean it is well researched. Where can I see measurements of the overhead on modern hardware?


> A wiki page doesn't mean it is well researched. Where can I see measurements of the overhead on modern hardware?

Here is how this works: at the bottom of the wiki page, there are referenced papers. They contain measurements on modern hardware. You read those, then perhaps go to Google and see if there is any newer research that cites those papers.

If you don't feel like reading papers, HN has a search bar at the bottom that yields a wealth of results: https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...


I spent a short time looking and found that most of the papers on that page are very outdated or don't have relevant info (no measurements of the overhead). Give a specific paper and citation, or we can finish this discussion here.


https://blog.erratasec.com/2013/02/multi-core-scaling-its-no...

Maybe you should just take a college computer architecture course along the lines of Hennessy/Patterson. This is nothing new, I learned much of this in college 15 years ago. The problem has only gotten worse since then, computers have not become more single threaded.


My reading is that the graphs in that post were simply invented by the author to demonstrate his idea, and aren't backed by any benchmarks or measurements. At least, I don't see any links to code in the article, nor any mention of what logic he actually ran or how many threads/connections he spawned.

> The problem has only gotten worse since then, computers have not become more single threaded.

Computers can now handle 10k blocking connections with ease.


> yeah, so I need one runtime on top of another runtime, with unknown quality, support, longevity, and number of gotchas.

It's a library. It solved our problems at the time, years ago. It's still used in production and piping billions of audio minutes per month through it. You don't have to use it, I merely referred to it as an example. A similar library is proposed to be included in C++23: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p23...


> It's still used in production and piping billions of audio minutes per month through it.

There are tons of overengineered, unmaintainable codebases in prod; that doesn't mean I should follow them as examples without much justification.

> A similar library is proposed to be included in C++23

Hm, I went through the code example, and I would prefer my current approach as much simpler and more readable.


At the lowest level, Rust's async is syntax for generators (yield).

I've (ab)used them that way, without any async runtime, just to easily write stateful iterators.


Do you know of an example of such a case in the Rust ecosystem?


No, I don't know of one; I am talking about the general concept.

In Java, Future.get() blocks the current thread, so it integrates trivially into explicit threaded programming. In Rust, Future::poll() is non-blocking, so one has to rely on some async framework, or build one's own event loop, which can potentially block the thread.
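For what it's worth, such an event loop can be tiny. A sketch of a hand-rolled blocking executor using only the standard library (essentially the pattern shown in the std::task::Wake docs): park the current thread until the future's waker fires, giving you Java-style blocking-get semantics on top of poll().

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread;

// A waker that unparks the thread waiting on the future.
struct ThreadWaker(thread::Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

// Minimal "event loop": poll, and block the thread until woken.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(out) => return out,
            Poll::Pending => thread::park(), // sleep until wake() unparks us
        }
    }
}

fn main() {
    assert_eq!(block_on(async { 40 + 2 }), 42);
}
```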


It's worth noting that a thread's JoinHandle provides a similar interface.

You can spawn your tasks, store the JoinHandle "futures", and wait for completion whenever you need the result.

A difference being that Futures do nothing until polled, while threads start on their own, but that's arguably a helpful simplification for this purpose.


This I found really disconcerting: the model I often want is to start some work, then join at some later point, or even chain directly into the next task.

But instead I start a future, and then to run it at all I need to wait for the result. I understand that there are tools to effect this, but it really leaves you wondering: what did I just do? Start an async task and then... block on it in order to get it to execute?


I'm not an expert, but my understanding is that Rust makes a distinction between Futures and Tasks, where Tasks are the top-level Futures run by the executor.

In JavaScript terms, Futures are more like sugar around callbacks; they don't do anything until you call/poll them. Tasks are independent entities, like Promises, which are run by the executor, though they may currently be blocked on other tasks.

> the model I often want is to start some work, then join at some later point, or even chain directly into the next task

Rust wants you to do this the other way around: first chain together your futures, so that when you start the top-level one as a task, there is a single state machine for it to run.



