Dec 25, 2024 20 min #rust · #async · #traits

Catching up with async Rust


This is a dual feature! It's available as a video too. Watch on YouTube

In December 2023, a minor miracle happened: async fn in traits shipped.

As of Rust 1.39, we already had free-standing async functions:

pub async fn read_hosts() -> eyre::Result<Vec<u8>> {
    // etc.
}

…and async functions in impl blocks:

impl HostReader {
    pub async fn read_hosts(&self) -> eyre::Result<Vec<u8>> {
        // etc.
    }
}

But we did not have async functions in traits:

use std::io;

trait AsyncRead {
    async fn read(&mut self, buf: &mut [u8]) -> io::Result<usize>;
}



sansioex on  main via 🦀 v1.82.0
❯ cargo +1.74.0 check --quiet
error[E0706]: functions in traits cannot be declared `async`
 --> src/main.rs:9:5
  |
9 |     async fn read(&mut self, buf: &mut [u8]) -> io::Result<usize>;
  |     -----^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |     |
  |     `async` because of this
  |
  = note: `async` trait functions are not currently supported
  = note: consider using the `async-trait` crate: https://crates.io/crates/async-trait
  = note: see issue #91611 <https://github.com/rust-lang/rust/issues/91611> for more information

For more information about this error, try `rustc --explain E0706`.
error: could not compile `sansioex` (bin "sansioex") due to previous error

Cool Bear's hot tip

The cargo +channel syntax is valid because cargo here is a shim provided by rustup.

Valid channel names look like x.y.z, stable, beta, nightly, etc. — the same thing you’d encounter in rust-toolchain.toml files or any other toolchain override.

For the longest time, the recommended way to get async fn in traits was the async-trait crate:

use std::io;

#[async_trait::async_trait]
trait AsyncRead {
    async fn read(&mut self, buf: &mut [u8]) -> io::Result<usize>;
}

It worked, but it changed the trait definition (and any implementations) to return pinned, boxed futures.

Boxed futures?

Yeah! Futures that have been allocated on the heap.

Because?

Ah well, because, you see, futures (the values that async functions return) can be of different sizes!

The size of locals

The future returned by this function:

async fn foo() {
    tokio::time::sleep(std::time::Duration::from_secs(1)).await;
    println!("done");
}

…is smaller than the future returned by that function:

async fn bar() {
    let mut a = [0u8; 72];
    tokio::time::sleep(std::time::Duration::from_secs(1)).await;
    for _ in 0..10 {
        a[0] += 1;
    }
    println!("done");
}

Because bar simply has more things going on — more state to keep track of:



sansioex on  main [!] via 🦀 v1.82.0
❯ cargo run --quiet
foo: 128
bar: 200

That array there is not deallocated while we sleep asynchronously — it’s all stored in, well, in the future.
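The code that printed those two numbers isn’t shown above. A minimal sketch of the measurement could look like this; it swaps tokio’s sleep for std::future::ready (an assumption, so that it builds without dependencies), which means the exact byte counts will differ from the 128 and 200 above:

```rust
use std::future::ready;
use std::mem::size_of_val;

// Stand-in for `foo`: nothing has to survive the await point.
async fn foo() {
    ready(()).await;
    println!("done");
}

// Stand-in for `bar`: `a` is used after the await point, so the array
// has to be stored inside the future itself.
async fn bar() {
    let mut a = [0u8; 72];
    ready(()).await;
    for _ in 0..10 {
        a[0] += 1;
    }
    println!("done");
}

fn main() {
    // Calling an async fn only builds the future; nothing runs until it
    // is polled, so this never prints "done".
    println!("foo: {}", size_of_val(&foo()));
    println!("bar: {}", size_of_val(&bar()));
}
```

Whatever the exact numbers are on a given toolchain, bar’s future has to be at least 72 bytes larger than the bare minimum, because the array lives in it.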

And that’s a problem because typically, when we call a function, we want to know how much space we should reserve for the return value: we say that the return value is “sized”.

And by “we”, I really mean the compiler — reserving stack space for locals is one of the first things a function does when it’s called.

Here, the compiler reserves space on the stack for step1, _foo and step2:

fn main() {
    let step1: u64 = 0;
    let _foo = foo();
    let step2: u64 = 0;

    // etc.
}

As seen in the disassembly:



sansioex on  main [!] via 🦀 v1.83.0
❯ cargo asm sansioex::main --quiet --simplify --color | head -5

sansioex::main:
Lfunc_begin45:
        sub sp, sp, #256
        stp x20, x19, [sp, #224]
        stp x29, x30, [sp, #240]

That “sub” here reserves 256 bytes total.

Cool Bear's hot tip

The cargo asm subcommand shown here is from cargo-show-asm, installed via cargo install --locked --all-features cargo-show-asm

The original cargo-asm crate still works but has less functionality and hasn’t been updated since 2018.

Although there’s nothing forcing the compiler to do so, all the locals in our code…

fn main() {
    let step1: u64 = 0;
    let _foo = foo();
    let step2: u64 = 0;

    println!(
        "distance in bytes between before and after: {:?}",
        (&step2 as *const _ as u64) - (&step1 as *const _ as u64)
    );
}

…are laid out next to each other — and so we can predict what it’s going to print: the distance between step1 and step2 on the stack.

Uhhh… 128? The size of the future returned by foo?

Almost!



sansioex on  main [!] via 🦀 v1.82.0
❯ cargo run --quiet
distance in bytes between before and after: 136

The distance between step1 and step2 includes the size of step1: 8 bytes for the u64, plus 128 bytes for the future, which gives 136.

Crap, yes, right.

Don’t worry bear, everyone makes fencepost errors.

Just boxing it

Back to this AsyncRead trait:

use std::io;

trait AsyncRead {
    async fn read(&mut self, buf: &mut [u8]) -> io::Result<usize>;
}

If we somehow got ahold of a &dyn AsyncRead, and called read on it:

async fn use_read(r: &mut dyn AsyncRead) {
    let mut buf = [0; 1024];
    // how large is this local?
    let fut = r.read(&mut buf);
    fut.await;
}

…how would we know how much space to reserve for fut?

Well… I imagine we could make “the size of the future” part of the AsyncRead trait.

That way we could first query that size, then allocate it, then call read, passing it a pointer to the space we just reserved…

Right! I’m pretty sure that’s how unsized locals work — and roughly what the longer-term plan is.

But for the time being, the only way to hold an arbitrary future is to box it!

Even though these are two different futures of different sizes, the locals are the same size, because each one is just a pointer to the actual future, which is… 8 bytes.

fn main() {
    let _foo: Pin<Box<dyn Future<Output = ()>>> = Box::pin(foo());
    let _bar: Pin<Box<dyn Future<Output = ()>>> = Box::pin(bar());

    println!("Size of foo: {} bytes", std::mem::size_of_val(&_foo));
    println!("Size of bar: {} bytes", std::mem::size_of_val(&_bar));
}



sansioex on  main [!] via 🦀 v1.82.0
❯ cargo run --quiet
Size of foo: 16 bytes
Size of bar: 16 bytes

But… this prints 16 bytes.

Yup! I did a little lie just now.

Let’s look at those values closer.

Dynamic dispatch

We’ll use LLDB to run through our program, setting a breakpoint right before the end of main:



sansioex on  main [!] via 🦀 v1.83.0
❯ cargo b --quiet && rust-lldb ./target/debug/sansioex -Q
Current executable set to '/Users/amos/bearcove/sansioex/target/debug/sansioex' (arm64).
(lldb) b main.rs:78
Breakpoint 1: where = sansioex`sansioex::main::h616d18632113fc9e + 496 at main.rs:78:39, address = 0x000000010000478c
(lldb) r
Process 41462 launched: '/Users/amos/bearcove/sansioex/target/debug/sansioex' (arm64)
Size of foo: 16 bytes
Process 41462 stopped
* thread #1, name = 'main', queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
    frame #0: 0x000000010000478c sansioex`sansioex::main::h616d18632113fc9e at main.rs:78:39
   75       let _bar: Pin<Box<dyn Future<Output = ()>>> = Box::pin(bar());
   76
   77       println!("Size of foo: {} bytes", std::mem::size_of_val(&_foo));
-> 78       println!("Size of bar: {} bytes", std::mem::size_of_val(&_bar));
   79   }
Target 0: (sansioex) stopped.

LLDB has limited support for Rust, but we can still print the local _foo:



(lldb) p _foo
(core::pin::Pin<alloc::boxed::Box<dyn core::future::future::Future<Output=()>, alloc::alloc::Global> >) {
  __pointer = {
    pointer = 0x0000600001a24100
    vtable = 0x0000000100084068
  }
}

And there we have our two pointers! 16 bytes total.

What does vtable point to?

Well, pointer points to data, and vtable points to code: if we poke at it, we find 64-bit values that look a lot like addresses…



(lldb) x/8gx _foo.__pointer.vtable
0x100084068: 0x0000000100004ae4 0x0000000000000080
0x100084078: 0x0000000000000008 0x0000000100004c58
0x100084088: 0x0000000100004a60 0x00000000000000c8
0x100084098: 0x0000000000000008 0x0000000100004e50

And if we look up one of them, we find that it is, in fact, the address of our async function:



(lldb) image lookup -a 0x0000000100004c58
      Address: sansioex[0x0000000100004c58] (sansioex.__TEXT.__text + 3488)
      Summary: sansioex`sansioex::foo::_$u7b$$u7b$closure$u7d$$u7d$::h3b04e16f55b57af9 at main.rs:59

Or rather, a closure in our async function. But that’s an implementation detail.

Other fields in the vtable point to other functions: typically, one of those will be the Drop implementation for a given type — it’s important to know how to free a value we hold!
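The vtable also carries the size and alignment of the concrete type. As far as I can tell, that is what the 0x80 (128) and 0x08 entries in the dump above are, with 0xc8 (200) belonging to the next vtable, the one for bar. Here is a small sketch, using Display instead of Future to keep it dependency-free, showing that size_of_val on a trait object reports the hidden type’s size at runtime (dyn_layout is a made-up helper name):

```rust
use std::fmt::Display;
use std::mem::{align_of_val, size_of_val};

// Returns (size, alignment) of whatever concrete value hides behind the
// trait object. Both numbers are looked up at runtime, not compile time.
fn dyn_layout(value: &dyn Display) -> (usize, usize) {
    (size_of_val(value), align_of_val(value))
}

fn main() {
    let small: Box<dyn Display> = Box::new(7u8);
    let large: Box<dyn Display> = Box::new(7u64);

    // Same static type, same two-pointer box, different vtables.
    println!("u8 behind dyn:  {:?}", dyn_layout(&*small));
    println!("u64 behind dyn: {:?}", dyn_layout(&*large));
}
```

This is also how Box knows how much memory to give back when a boxed trait object is dropped.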

Note that not all “boxes” are two pointers. Only “boxed trait objects” are.

fn main() {
    let s = String::from("I am on the heap AMA");
    let b = Box::new(s);
    print_type_name_and_size(&b);

    let b: Box<dyn std::fmt::Display> = b;
    print_type_name_and_size(&b);
}

fn print_type_name_and_size<T>(_: &T) {
    println!(
        "\x1b[1m{:45}\x1b[0m \x1b[32m{} bytes\x1b[0m",
        std::any::type_name::<T>(),
        std::mem::size_of::<T>(),
    );
}



sansioex on  main [!] via 🦀 v1.83.0
❯ cargo run --quiet
alloc::boxed::Box<alloc::string::String>      8 bytes
alloc::boxed::Box<dyn core::fmt::Display>     16 bytes

That’s the dyn in Box<dyn Trait> — dynamic dispatch, via the vtable.

And that’s what the async-trait crate does! It transforms this:

#[async_trait::async_trait]
trait AsyncRead {
    async fn read(&mut self, buf: &mut [u8]) -> io::Result<usize>;
}

Into this:

trait AsyncRead {
    #[must_use]
    #[allow(
        elided_named_lifetimes,
        clippy::type_complexity,
        clippy::type_repetition_in_bounds
    )]
    fn read<'life0, 'life1, 'async_trait>(
        &'life0 mut self,
        buf: &'life1 mut [u8],
    ) -> ::core::pin::Pin<
        Box<
            dyn ::core::future::Future<Output = io::Result<usize>>
                + ::core::marker::Send
                + 'async_trait,
        >,
    >
    where
        'life0: 'async_trait,
        'life1: 'async_trait,
        Self: 'async_trait;
}

Well that’s a mouthful.

Yes, the output of procedural macros is seldom readable — the important bit is the return type, Pin<Box<dyn Future>> — this will be 16 bytes, no matter the actual implementation of AsyncRead::read.
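Stripped of the lint attributes, the same shape can be written by hand. This is a sketch, not the macro’s literal output: BoxFuture is our own alias, Zeroes is a made-up reader, and block_on is a tiny std-only executor standing in for tokio:

```rust
use std::future::Future;
use std::io;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Our own shorthand for a pinned, boxed, Send future.
type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

trait AsyncRead {
    fn read<'a>(&'a mut self, buf: &'a mut [u8]) -> BoxFuture<'a, io::Result<usize>>;
}

// Hypothetical reader that fills the buffer with zeroes.
struct Zeroes;

impl AsyncRead for Zeroes {
    fn read<'a>(&'a mut self, buf: &'a mut [u8]) -> BoxFuture<'a, io::Result<usize>> {
        // The async block moves to the heap; callers only see two pointers.
        Box::pin(async move {
            buf.fill(0);
            Ok(buf.len())
        })
    }
}

// Minimal single-future executor, std only.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

// This works through a trait object, because the return type has a fixed size.
fn read_dyn(reader: &mut dyn AsyncRead, buf: &mut [u8]) -> io::Result<usize> {
    block_on(reader.read(buf))
}

fn main() {
    let mut reader = Zeroes;
    let mut buf = [1u8; 4];
    let n = read_dyn(&mut reader, &mut buf).unwrap();
    println!("read {n} bytes: {buf:?}");
}
```

The price is one heap allocation per call to read, whatever the implementation does.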

`dyn`-compatibility

But let’s forget the async-trait crate for a moment, because, as of Rust 1.75, there is native support for async functions in traits!

use std::io;

trait AsyncRead {
    async fn read(&mut self, buf: &mut [u8]) -> io::Result<usize>;
}

We can implement that trait on any type, since we’ve defined the trait ourselves (see orphan rules):

impl AsyncRead for () {
    async fn read(&mut self, _buf: &mut [u8]) -> io::Result<usize> {
        let a = [0u8; 72];
        tokio::time::sleep(std::time::Duration::from_secs(1)).await;
        Ok(a[3] as _)
    }
}

And this time, the futures aren’t boxed:

fn main() {
    let mut s = ();
    let mut buf = [0u8; 72];
    let fut = s.read(&mut buf);
    print_type_name_and_size(&fut);
}



sansioex on  main [!] via 🦀 v1.83.0
❯ cargo run --quiet
<() as sansioex::AsyncRead>::read::{{closure}} 224 bytes

If they were, this would print 16 bytes.

Well that’s great. So everything is solved, yes?

Not everything, no. What do you think this prints?

fn use_async_read(r: Box<dyn AsyncRead>) {
    let mut buf = [0u8; 72];
    let fut = r.read(&mut buf);
    print_type_name_and_size(&fut);
}

Well we’ve just seen that the future is 224 bytes…

…only for the empty tuple type, (), though!

Our parameter could be any type — any type that implements AsyncRead.

Right, so we can’t know. We have the “unsized locals” problem again.

Exactly:



error[E0038]: the trait `AsyncRead` cannot be made into an object
  --> src/main.rs:51:17
   |
51 |     let fut = r.read(&mut buf);
   |                 ^^^^ `AsyncRead` cannot be made into an object
   |
note: for a trait to be "dyn-compatible" it needs to allow building a vtable to allow the call to be resolvable dynamically; for more information visit <https://doc.rust-lang.org/reference/items/traits.html#object-safety>
  --> src/main.rs:36:14
   |
35 | trait AsyncRead {
   |       --------- this trait cannot be made into an object...
36 |     async fn read(&mut self, buf: &mut [u8]) -> io::Result<usize>;
   |              ^^^^ ...because method `read` is `async`

Maybe, in the future, the size of the futures returned by async methods will be stored in the vtable of trait objects, and async fn in traits will become dyn-compatible.

But for now, they’re not.

There are many other things that are dyn-incompatible: taking self by value, for example:

trait EatSelf {
    fn nomnomnom(self) {}
}



sansioex on  main [!] via 🦀 v1.83.0
❯ cargo c
    Checking sansioex v0.1.0 (/Users/amos/bearcove/sansioex)
error[E0277]: the size for values of type `Self` cannot be known at compilation time
  --> src/main.rs:34:18
   |
34 |     fn nomnomnom(self) {}
   |                  ^^^^ doesn't have a size known at compile-time
   |
help: consider further restricting `Self`
   |
34 |     fn nomnomnom(self) where Self: Sized {}
   |                        +++++++++++++++++
help: function arguments must have a statically known size, borrowed types always have a known size
   |
34 |     fn nomnomnom(&self) {}
   |                  +

For more information about this error, try `rustc --explain E0277`.
error: could not compile `sansioex` (bin "sansioex") due to 1 previous error

Could Box be an answer here too?

Yes! Although it’s not suggested by the compiler here, taking Box<Self> is fine, because it’s just a pointer, which has a predictable size. In fact, references and the standard smart pointers (Box, Rc, Arc, Pin) are all okay:

// Examples of dyn-compatible methods.
trait TraitMethods {
    fn by_ref(self: &Self) {}
    fn by_ref_mut(self: &mut Self) {}
    fn by_box(self: Box<Self>) {}
    fn by_rc(self: Rc<Self>) {}
    fn by_arc(self: Arc<Self>) {}
    fn by_pin(self: Pin<&Self>) {}
    fn with_lifetime<'a>(self: &'a Self) {}
    fn nested_pin(self: Pin<Arc<Self>>) {}
}
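As a small sketch of the Box<Self> case, here is a variant of EatSelf that can be called through a trait object. Cookie, Salmon and feed are made-up names for illustration:

```rust
trait EatSelf {
    // `Box<Self>` has a known size even when `Self` does not, so this
    // method can be called on a `Box<dyn EatSelf>`.
    fn nomnomnom(self: Box<Self>) -> String;
}

struct Cookie;

struct Salmon {
    grams: u32,
}

impl EatSelf for Cookie {
    fn nomnomnom(self: Box<Self>) -> String {
        String::from("ate a cookie")
    }
}

impl EatSelf for Salmon {
    fn nomnomnom(self: Box<Self>) -> String {
        format!("ate {} g of salmon", self.grams)
    }
}

// Takes ownership of any meal, without knowing its concrete type.
fn feed(meal: Box<dyn EatSelf>) -> String {
    meal.nomnomnom()
}

fn main() {
    println!("{}", feed(Box::new(Cookie)));
    println!("{}", feed(Box::new(Salmon { grams: 200 })));
}
```

The value is still consumed, as with a plain self receiver. It just travels behind a pointer.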

This is to say: dyn-compatibility is an issue with traits in general, not just with async Rust.

Also, just because our trait isn’t dyn-compatible, doesn’t mean it’s impossible to work with. In my HTTP implementation loona, the whole API is designed around those limitations.

We can take an impl AsyncRead for example — that’s fine.

fn use_reader(_reader: impl AsyncRead) {}

Because it’s really just shorthand for this:

fn use_reader<R: AsyncRead>(_reader: R) {}

We can also take an &impl AsyncRead, or a &mut impl AsyncRead.

But we cannot take a &dyn AsyncRead — something in our trait violates the current restrictions of dyn compatibility, and that thing is that it has an implicit associated type.
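Before digging into that, here is what the static-dispatch route can look like in practice. It’s a sketch under a few assumptions: SliceReader and read_to_end are made-up names, and block_on is a tiny std-only executor standing in for tokio:

```rust
use std::future::Future;
use std::io;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

trait AsyncRead {
    async fn read(&mut self, buf: &mut [u8]) -> io::Result<usize>;
}

// Hypothetical reader that hands out bytes from a slice.
struct SliceReader<'a>(&'a [u8]);

impl AsyncRead for SliceReader<'_> {
    async fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let n = self.0.len().min(buf.len());
        let (head, tail) = self.0.split_at(n);
        buf[..n].copy_from_slice(head);
        self.0 = tail;
        Ok(n)
    }
}

// Static dispatch: this is compiled once per reader type, so the size of
// the future returned by `read` is always known.
async fn read_to_end(reader: &mut impl AsyncRead) -> io::Result<Vec<u8>> {
    let mut out = Vec::new();
    let mut buf = [0u8; 4];
    loop {
        let n = reader.read(&mut buf).await?;
        if n == 0 {
            return Ok(out);
        }
        out.extend_from_slice(&buf[..n]);
    }
}

// Minimal single-future executor, std only.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn main() {
    let mut reader = SliceReader(b"hello bear");
    let bytes = block_on(read_to_end(&mut reader)).unwrap();
    println!("{}", String::from_utf8_lossy(&bytes));
}
```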

Associated types

Because, when we have an async fn in trait:

trait AsyncRead {
    async fn read(&mut self, buf: &mut [u8]) -> io::Result<usize>;
}

We’re really saying that we have an fn that returns “something that implements Future”:

trait AsyncRead {
    fn read(&mut self, buf: &mut [u8]) -> impl Future<Output = io::Result<usize>>;
}

And we’ve seen before that “impl Trait” in argument position translates to a generic type parameter:

fn use_reader(_reader: impl AsyncRead) {}

// is equivalent to
fn use_reader<R: AsyncRead>(_reader: R) {}

Well, when it’s in return position, it translates to an associated type:

trait AsyncRead {
    fn read(&mut self, buf: &mut [u8]) -> impl Future<Output = io::Result<usize>>;
}

// is equivalent to
trait AsyncRead {
    type ReadFuture: Future<Output = io::Result<usize>>;

    fn read(&mut self, buf: &mut [u8]) -> Self::ReadFuture;
}

In fact, on Rust nightly, we can implement that trait pretty easily — this full program compiles and runs:

#![feature(impl_trait_in_assoc_type)]

use std::{future::Future, io};

trait AsyncRead {
    type ReadFuture: Future<Output = io::Result<usize>>;

    fn read(&mut self, buf: &mut [u8]) -> Self::ReadFuture;
}

impl AsyncRead for () {
    // this is the unstable bit: `Self::ReadFuture` is inferred from the
    // body of `<() as AsyncRead>::read`.
    type ReadFuture = impl Future<Output = io::Result<usize>>;

    fn read(&mut self, _buf: &mut [u8]) -> Self::ReadFuture {
        async move {
            let a = [0u8; 72];
            tokio::time::sleep(std::time::Duration::from_secs(1)).await;
            Ok(a[3] as _)
        }
    }
}

fn main() {
    let mut s: () = ();
    let mut buf = [0u8; 72];
    let fut = s.read(&mut buf[..]);
    print_type_name_and_size(&fut);
}

fn print_type_name_and_size<T>(_: &T) {
    println!(
        "\x1b[1m{:45}\x1b[0m \x1b[32m{} bytes\x1b[0m",
        std::any::type_name::<T>(),
        std::mem::size_of::<T>(),
    );
}



sansioex on  main [!] via 🦀 v1.83.0
❯ cargo +nightly run --quiet
<() as sansioex::AsyncRead>::read::{{closure}} 200 bytes

This pattern should be familiar to those of you who have spent quality time with the tower crate, via hyper for example: their Service trait has a Future associated type:

pub trait Service<Request> {
    type Response;
    type Error;
    type Future: Future<Output = Result<Self::Response, Self::Error>>;

    // Required methods
    fn poll_ready(
        &mut self,
        cx: &mut Context<'_>,
    ) -> Poll<Result<(), Self::Error>>;
    fn call(&mut self, req: Request) -> Self::Future;
}

And when implementing Service, you pretty much have three choices:

Implement Future by hand to avoid heap allocation
Set type Future to a boxed future (Pin<Box<dyn Future<...>>>)
Require Rust nightly and opt into #![feature(impl_trait_in_assoc_type)]
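To make the first two choices concrete, here is a sketch of each. The Service trait below is a local stand-in with the same shape as tower’s, not the real crate, and Double, DoubleFuture and BoxedDouble are made-up names:

```rust
use std::convert::Infallible;
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Local stand-in with the same shape as tower's `Service`.
trait Service<Request> {
    type Response;
    type Error;
    type Future: Future<Output = Result<Self::Response, Self::Error>>;

    fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>>;
    fn call(&mut self, req: Request) -> Self::Future;
}

// Choice 1: a hand-written future, no heap allocation.
struct DoubleFuture(Option<i32>);

impl Future for DoubleFuture {
    type Output = Result<i32, Infallible>;

    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Self::Output> {
        let n = self.0.take().expect("polled after completion");
        Poll::Ready(Ok(n * 2))
    }
}

struct Double;

impl Service<i32> for Double {
    type Response = i32;
    type Error = Infallible;
    type Future = DoubleFuture;

    fn poll_ready(&mut self, _cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {
        Poll::Ready(Ok(()))
    }

    fn call(&mut self, req: i32) -> Self::Future {
        DoubleFuture(Some(req))
    }
}

// Choice 2: async syntax, at the cost of one allocation per call.
struct BoxedDouble;

impl Service<i32> for BoxedDouble {
    type Response = i32;
    type Error = Infallible;
    type Future = Pin<Box<dyn Future<Output = Result<i32, Infallible>>>>;

    fn poll_ready(&mut self, _cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {
        Poll::Ready(Ok(()))
    }

    fn call(&mut self, req: i32) -> Self::Future {
        Box::pin(async move { Ok(req * 2) })
    }
}

// Minimal single-future executor, std only (a stand-in for tokio).
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn main() {
    println!("{:?}", block_on(Double.call(21)));
    println!("{:?}", block_on(BoxedDouble.call(21)));
}
```

The hand-written version is fine for a one-step future like this one, and gets tedious quickly once there is real state to track across polls.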

A refreshed `Service` trait

Well, as of Rust 1.75, one could imagine a simpler version of this trait:

trait Service<Request> {
    type Response;
    type Error;

    fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>>;
    async fn call(&mut self, request: Request) -> Result<Self::Response, Self::Error>;
}

Implementing it would be comparatively trivial — here’s a no-op service:

impl<Request> Service<Request> for () {
    type Response = ();
    type Error = ();

    fn poll_ready(&mut self, _cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {
        Poll::Ready(Ok(()))
    }

    async fn call(&mut self, _request: Request) -> Result<Self::Response, Self::Error> {
        Ok(())
    }
}

And here’s a service that logs any incoming request:

impl<S, Request> Service<Request> for LogRequest<S>
where
    S: Service<Request>,
    Request: std::fmt::Debug,
{
    type Response = S::Response;
    type Error = S::Error;

    fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {
        self.inner.poll_ready(cx)
    }

    async fn call(&mut self, request: Request) -> Result<Self::Response, Self::Error> {
        println!("{:?}", request);
        self.inner.call(request).await
    }
}

None of those futures are boxed, and this all works on Rust 1.75 stable:

#[tokio::main]
async fn main() {
    let mut service = LogRequest { inner: () };

    // (note: we assume the service is ready)
    let fut = service.call(());
    print_type_name_and_size(&fut);
    fut.await.unwrap();
}



sansioex on  main [!] via 🦀 v1.83.0
❯ cargo +1.75 run --quiet
<sansioex::LogRequest<()> as sansioex::Service<()>>::call::{{closure}} 32 bytes
()

But this currently comes with several limitations.

Unnameable types

To start with, we can no longer “name” the return type of Service::call.

Some tower services rely on this, for example the built-in Either service:

enum Either<A, B> {
    Left(A),
    Right(B),
}

// Traditional tower Service trait
impl<A, B, Request> Service<Request> for Either<A, B>
where
    A: Service<Request>,
    B: Service<Request, Response = A::Response, Error = A::Error>,
{
    type Response = A::Response;
    type Error = A::Error;
    type Future = EitherResponseFuture<A::Future, B::Future>;

    fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {
        match self {
            Either::Left(service) => service.poll_ready(cx),
            Either::Right(service) => service.poll_ready(cx),
        }
    }

    fn call(&mut self, request: Request) -> Self::Future {
        match self {
            Either::Left(service) => EitherResponseFuture {
                kind: Kind::Left {
                    inner: service.call(request),
                },
            },
            Either::Right(service) => EitherResponseFuture {
                kind: Kind::Right {
                    inner: service.call(request),
                },
            },
        }
    }
}

However, this specific scenario is not an issue — the implementation of our simplified Service trait for Either is simpler and more natural:

// Our simplified Service trait
impl<A, B, Request> Service<Request> for Either<A, B>
where
    A: Service<Request>,
    B: Service<Request, Response = A::Response, Error = A::Error>,
{
    type Response = A::Response;
    type Error = A::Error;

    fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {
        match self {
            Either::Left(service) => service.poll_ready(cx),
            Either::Right(service) => service.poll_ready(cx),
        }
    }

    async fn call(&mut self, request: Request) -> Result<Self::Response, Self::Error> {
        match self {
            Either::Left(service) => service.call(request).await,
            Either::Right(service) => service.call(request).await,
        }
    }
}

This is a win for “native” async fn in trait.

Things get complicated when it comes to specifying additional bounds, like lifetimes or Send-ness.

Lifetimes: a refresher

Rust famously differs from other languages in that it has you annotate lifetimes.

A function can return something that borrows from the input, but you must say so:

fn substring<'s>(input: &'s str, start: usize, end: usize) -> &'s str {
    &input[start..end]
}

Cool Bear's hot tip

Note that in this case, the lifetime annotations could be elided, because there’s only one lifetime to worry about, and thus, it’s clear that the returned value’s lifetime depends on the input.
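For reference, the elided form of that same function would look like this:

```rust
// Same meaning as the annotated version: with a single reference
// parameter, the compiler ties the returned `&str` to `input` on its own.
fn substring(input: &str, start: usize, end: usize) -> &str {
    &input[start..end]
}

fn main() {
    let s = String::from("Hello, world!");
    println!("{}", substring(&s, 0, 5));
}
```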

In this example, t borrows from s: you can tell by looking at the range of addresses they both refer to.

fn main() {
    let s = String::from("Hello, world!");
    let t = substring(&s, 0, 5);
    println!(
        "{:?}\n{:?}",
        s.as_bytes().as_ptr_range(),
        t.as_bytes().as_ptr_range()
    );
}



sansioex on  main [!] via 🦀 v1.83.0
❯ cargo run --quiet
0x600003bcc040..0x600003bcc04d
0x600003bcc040..0x600003bcc045

Because we gave the compiler that information, it can prevent us from doing something dangerous, like using t after we’ve freed s:

fn main() {
    let s = String::from("Hello, world!");
    let t = substring(&s, 0, 5);
    drop(s);
    println!("{t}");
}



sansioex on  main [!] via 🦀 v1.83.0
❯ cargo check
    Checking sansioex v0.1.0 (/Users/amos/bearcove/sansioex)
error[E0505]: cannot move out of `s` because it is borrowed
  --> src/main.rs:36:10
   |
34 |     let s = String::from("Hello, world!");
   |         - binding `s` declared here
35 |     let t = substring(&s, 0, 5);
   |                       -- borrow of `s` occurs here
36 |     drop(s);
   |          ^ move out of `s` occurs here
37 |     println!("{t}");
   |               --- borrow later used here
   |
help: consider cloning the value if the performance cost is acceptable
   |
35 |     let t = substring(&s.clone(), 0, 5);
   |                         ++++++++

For more information about this error, try `rustc --explain E0505`.
error: could not compile `sansioex` (bin "sansioex") due to 1 previous error

We can also return something that doesn’t borrow from the input; we just have to say so:

fn substring(input: &str, start: usize, end: usize) -> String {
    input[start..end].to_string()
}

Here, the returned value is “owned”, and our use-after-free is no longer a use-after-free:

fn main() {
    let s = String::from("Hello, world!");
    let t = substring(&s, 0, 5);
    println!(
        "{:?}\n{:?}",
        s.as_bytes().as_ptr_range(),
        t.as_bytes().as_ptr_range()
    );
    // this is fine now, `t` points to its own memory!
    drop(s);
    println!("{t}");
}



sansioex on  main [!] via 🦀 v1.83.0
❯ cargo run
   Compiling sansioex v0.1.0 (/Users/amos/bearcove/sansioex)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.16s
     Running `target/debug/sansioex`
0x600000a54000..0x600000a5400d
0x600000a54010..0x600000a54015
Hello

Hidden captures

Async functions add another layer of complication because they can be in a “partially executed” state, sort of.

Earlier, we equated this:

// Classic tower Service trait
trait Service<Request> {
    type Response;
    type Error;
    type Future: Future<Output = Result<Self::Response, Self::Error>>;

    fn call(&mut self, request: Request) -> Self::Future;
}

With this:

// Our simplified Service trait
trait Service<Request> {
    type Response;
    type Error;

    async fn call(&mut self, request: Request) -> Result<Self::Response, Self::Error>;
}

But they’re not equivalent!

See, with the classic tower Service trait, the returned Future cannot borrow from self!

This implementation of it for the type i32, which accepts requests of type i32 and adds itself to them, doesn’t compile:

#![feature(impl_trait_in_assoc_type)]

impl Service<i32> for i32 {
    type Response = i32;
    type Error = ();
    type Future = impl Future<Output = Result<Self::Response, Self::Error>>;

    fn call(&mut self, request: i32) -> Self::Future {
        async move { Ok(*self + request) }
    }
}



sansioex on  main [✘!+] via 🦀 v1.85.0-nightly
❯ cargo +nightly check --quiet
error[E0700]: hidden type for `<i32 as Service<i32>>::Future` captures lifetime that does not appear in bounds
  --> src/main.rs:19:9
   |
16 |     type Future = impl Future<Output = Result<Self::Response, Self::Error>>;
   |                   --------------------------------------------------------- opaque type defined here
17 |
18 |     fn call(&mut self, request: i32) -> Self::Future {
   |             --------- hidden type `{async block@src/main.rs:19:9: 19:19}` captures the anonymous lifetime defined here
19 |         async move { Ok(*self + request) }
   |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For more information about this error, try `rustc --explain E0700`.
error: could not compile `sansioex` (bin "sansioex") due to 1 previous error

There is a way to make this work, and it’s to make the associated type generic (over a lifetime):

pub trait Service<Request> {
    type Response;
    type Error;
    type Future<'a>: Future<Output = Result<Self::Response, Self::Error>> + 'a
    where
        Self: 'a;

    fn call(&mut self, request: Request) -> Self::Future<'_>;
}

This feature is called GATs, for “generic associated types”. It was only stabilized in Rust 1.65 (November 2022), long after tower’s Service trait was designed, which explains why the trait doesn’t use it.

If the Service trait did use GATs, as shown, then we could borrow from self, like so:

impl Service<i32> for i32 {
    type Response = i32;
    type Error = ();
    type Future<'a> = impl Future<Output = Result<Self::Response, Self::Error>> + 'a;

    fn call(&mut self, request: i32) -> Self::Future<'_> {
        async move { Ok(*self + request) }
    }
}
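The impl Future in that associated type still needs nightly. On stable, a GAT-based trait like this one can be implemented by boxing the future instead. Here is a sketch, assuming a std-only block_on helper in place of tokio:

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

trait Service<Request> {
    type Response;
    type Error;
    type Future<'a>: Future<Output = Result<Self::Response, Self::Error>> + 'a
    where
        Self: 'a;

    fn call(&mut self, request: Request) -> Self::Future<'_>;
}

impl Service<i32> for i32 {
    type Response = i32;
    type Error = ();
    // Boxing stands in for the unstable `impl Future` associated type.
    type Future<'a> = Pin<Box<dyn Future<Output = Result<i32, ()>> + 'a>>;

    fn call(&mut self, request: i32) -> Self::Future<'_> {
        // The async block captures `self` (a `&mut i32`), which the
        // lifetime parameter on `Future<'a>` allows.
        Box::pin(async move { Ok(*self + request) })
    }
}

// Minimal single-future executor, std only.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn main() {
    let mut service: i32 = 1990;
    println!("{:?}", block_on(service.call(34)));
}
```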

But we can’t, because that’s not how the tower Service trait is defined. It wants a Future that does not borrow from self, and so we have to do… whatever we need to do with self before returning a future.

In this weird contrived case, this means we need to dereference self before the async block:



sansioex on  main [!] via 🦀 v1.85.0-nightly
❯ gwd
diff --git a/src/main.rs b/src/main.rs
index 9fb8ffb..bcdbed2 100644
--- a/src/main.rs
+++ b/src/main.rs
@@ -16,7 +16,8 @@ impl Service<i32> for i32 {
     type Future = impl Future<Output = Result<Self::Response, Self::Error>>;

     fn call(&mut self, request: i32) -> Self::Future {
-        async move { Ok(*self + request) }
+        let this = *self;
+        async move { Ok(this + request) }
     }
 }

…and again, this isn’t stable — we’re using #![feature(impl_trait_in_assoc_type)].

By comparison, our simplified Service trait works on Rust 1.75 stable, and allows borrowing from self:

trait Service<Request> {
    type Response;
    type Error;

    async fn call(&mut self, request: Request) -> Result<Self::Response, Self::Error>;
}

impl Service<i32> for i32 {
    type Response = i32;
    type Error = ();

    async fn call(&mut self, request: i32) -> Result<Self::Response, Self::Error> {
        Ok(*self + request)
    }
}

#[tokio::main]
async fn main() {
    let mut service: i32 = 1990;
    let res = service.call(34).await.unwrap();
    println!("Result: \x1b[1;32m{res}\x1b[0m");
}



sansioex on  main [!+] via 🦀 v1.83.0
❯ cargo run --quiet
Result: 2024

But this is a breaking change. Being able to do service.call() several times in a row, obtaining several, separate, owned futures, and spawning them on an executor is a feature of the tower Service trait.

With our simplified Service trait, the future returned by call holds on to the &mut self borrow, so we cannot have multiple concurrent requests:

#[tokio::main]
async fn main() {
    let mut service: i32 = 2024;

    let fut1 = service.call(-34);
    let fut2 = service.call(-25);

    let (response1, response2) = tokio::try_join!(fut1, fut2).unwrap();
    println!("Got responses: {response1:?}, {response2:?}");
}



sansioex on  main [!] via 🦀 v1.83.0
❯ cargo check --quiet
error[E0499]: cannot borrow `service` as mutable more than once at a time
  --> src/main.rs:22:16
   |
21 |     let fut1 = service.call(-34);
   |                ------- first mutable borrow occurs here
22 |     let fut2 = service.call(-25);
   |                ^^^^^^^ second mutable borrow occurs here
23 |
24 |     let (response1, response2) = tokio::try_join!(fut1, fut2).unwrap();
   |                                                   ---- first borrow later used here

For more information about this error, try `rustc --explain E0499`.
error: could not compile `sansioex` (bin "sansioex") due to 1 previous error

Relaxing lifetime bounds

And that’s exactly why, currently, rustc warns you if you have an async fn in a public trait: because with that syntax, you’re not able to specify additional bounds.

Say we wanted to make our simplified Service trait closer to the original tower trait: we could say that the returned future ought to be 'static.



sansioex on  main [!] via 🦀 v1.83.0
❯ gwd
diff --git a/src/main.rs b/src/main.rs
index 6d5223f..7947a70 100644
--- a/src/main.rs
+++ b/src/main.rs
@@ -1,8 +1,13 @@
+use std::future::Future;
+
 pub trait Service<Request> {
     type Response;
     type Error;

-    async fn call(&mut self, request: Request) -> Result<Self::Response, Self::Error>;
+    fn call(
+        &mut self,
+        request: Request,
+    ) -> impl Future<Output = Result<Self::Response, Self::Error>> + 'static;
 }

 impl Service<i32> for i32 {

But then, our implementation breaks:



sansioex on  main [+] via 🦀 v1.83.0
❯ cargo c
    Checking sansioex v0.1.0 (/Users/amos/bearcove/sansioex)
error[E0477]: the type `impl Future<Output = Result<<i32 as Service<i32>>::Response, <i32 as Service<i32>>::Error>>` does not fulfill the required lifetime
  --> src/main.rs:17:5
   |
17 |     async fn call(&mut self, request: i32) -> Result<Self::Response, Self::Error> {
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   |
note: type must satisfy the static lifetime as required by this binding
  --> src/main.rs:10:70
   |
10 |     ) -> impl Future<Output = Result<Self::Response, Self::Error>> + 'static;
   |                                                                      ^^^^^^^

For more information about this error, try `rustc --explain E0477`.
error: could not compile `sansioex` (bin "sansioex") due to 1 previous error

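…and it’s not exactly clear why from the message alone. The future returned by an async fn captures all of its arguments, including the &mut self borrow, so it can only live as long as that borrow does: it cannot be 'static.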

I don’t believe it’s possible to solve that issue while sticking with the async fn syntax: we have to switch to returning an impl Future, built from an async block, in the implementation as well:



sansioex on  main [!+] via 🦀 v1.83.0
❯ gwd
diff --git a/src/main.rs b/src/main.rs
index 7947a70..b57b520 100644
--- a/src/main.rs
+++ b/src/main.rs
@@ -14,8 +14,12 @@ impl Service<i32> for i32 {
     type Response = i32;
     type Error = ();

-    async fn call(&mut self, request: i32) -> Result<Self::Response, Self::Error> {
-        Ok(*self + request)
+    fn call(
+        &mut self,
+        request: i32,
+    ) -> impl Future<Output = Result<Self::Response, Self::Error>> + 'static {
+        let this = *self;
+        async move { Ok(this + request) }
     }
 }
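
To make the difference between the two versions concrete, here is a minimal std-only sketch. It assumes that an async fn desugars roughly to a function returning a future tied to the lifetime of its arguments. The names call_borrowing, call_static, NoopWake and block_on are made up for illustration (they are not part of sansioex or tokio), and block_on is a toy executor that just polls in a loop:

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

type Outcome = Result<i32, ()>;

// Roughly what `async fn call(&mut self, request: i32)` desugars to:
// the future holds on to the `&'a mut` borrow, so it cannot be `'static`.
fn call_borrowing<'a>(this: &'a mut i32, request: i32) -> impl Future<Output = Outcome> + 'a {
    async move { Ok(*this + request) }
}

// Copy what we need *before* building the future: nothing borrowed is
// captured by the async block, so the `'static` bound holds.
fn call_static(this: &mut i32, request: i32) -> impl Future<Output = Outcome> + 'static {
    let copied = *this;
    async move { Ok(copied + request) }
}

// Hypothetical toy executor (assumption: our futures never really wait).
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(output) => return output,
            Poll::Pending => std::thread::yield_now(),
        }
    }
}

fn main() {
    let mut service: i32 = 2024;

    // Two `'static` futures can coexist: neither one borrows `service`.
    let fut1 = call_static(&mut service, -34);
    let fut2 = call_static(&mut service, -25);
    println!("{:?}, {:?}", block_on(fut1), block_on(fut2));

    // The borrowing version is fine one at a time...
    println!("{:?}", block_on(call_borrowing(&mut service, -34)));
    // ...but keeping two of those futures alive at once is rejected (E0499).
}
```

Swapping call_static for call_borrowing in the first two calls should bring back the "second mutable borrow" error from earlier.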

With this change though, we’re able to have several requests in-flight concurrently:

#[tokio::main]
async fn main() {
    let mut service: i32 = 2024;

    let fut1 = service.call(-34);
    let fut2 = service.call(-25);

    let (response1, response2) = tokio::try_join!(fut1, fut2).unwrap();
    println!("Got responses: {response1:?}, {response2:?}");
}



sansioex on  main [!+] via 🦀 v1.83.0
❯ cargo run --quiet
Got responses: 1990, 1999

Sendness

It even looks like… we’re able to spawn them on the tokio runtime?



sansioex on  main [!] via 🦀 v1.83.0
❯ gwd
diff --git a/src/main.rs b/src/main.rs
index 16e6549..ae009d2 100644
--- a/src/main.rs
+++ b/src/main.rs
@@ -27,8 +27,8 @@ impl Service<i32> for i32 {
 async fn main() {
     let mut service: i32 = 2024;

-    let fut1 = service.call(-34);
-    let fut2 = service.call(-25);
+    let fut1 = tokio::spawn(service.call(-34));
+    let fut2 = tokio::spawn(service.call(-25));

     let (response1, response2) = tokio::try_join!(fut1, fut2).unwrap();
     println!("Got responses: {response1:?}, {response2:?}");



sansioex on  main [!] via 🦀 v1.83.0
❯ cargo run --quiet
Got responses: Ok(1990), Ok(1999)

Huh. But… but tokio::spawn requires Send!

That’s right, it does!

// from the tokio sources

#[track_caller]
pub fn spawn<F>(future: F) -> JoinHandle<F::Output>
where
    F: Future + Send + 'static,
    F::Output: Send + 'static,
{
    // ✂️
}

…and there’s nothing in the definition of Service that requires the returned future to be Send…

That’s right too! You know what’s happening? The compiler can see the concrete type of <i32 as Service<i32>>::call.

Oh god. Oh no. So it knows it just happens to be Send?

That’s right! If it weren’t, we’d get an error:



sansioex on  main [!] via 🦀 v1.83.0
❯ gwd
diff --git a/src/main.rs b/src/main.rs
index ae009d2..df060ba 100644
--- a/src/main.rs
+++ b/src/main.rs
@@ -19,7 +19,11 @@ impl Service<i32> for i32 {
         request: i32,
     ) -> impl Future<Output = Result<Self::Response, Self::Error>> + 'static {
         let this = *self;
-        async move { Ok(this + request) }
+        let something_not_send = std::rc::Rc::new(());
+        async move {
+            let _woops = something_not_send;
+            Ok(this + request)
+        }
     }
 }



sansioex on  main [!] via 🦀 v1.83.0
❯ cargo check --quiet
error: future cannot be sent between threads safely
   --> src/main.rs:34:29
    |
34  |     let fut1 = tokio::spawn(service.call(-34));
    |                             ^^^^^^^^^^^^^^^^^ future created by async block is not `Send`
    |
    = help: within `impl Future<Output = Result<<i32 as Service<i32>>::Response, <i32 as Service<i32>>::Error>> + 'static`, the trait `Send` is not implemented for `Rc<()>`, which is required by `impl Future<Output = Result<<i32 as Service<i32>>::Response, <i32 as Service<i32>>::Error>> + 'static: Send`
note: captured value is not `Send`
   --> src/main.rs:24:26
    |
24  |             let _woops = something_not_send;
    |                          ^^^^^^^^^^^^^^^^^^ has type `Rc<()>` which is not `Send`
note: required by a bound in `tokio::spawn`
   --> /Users/amos/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.42.0/src/task/spawn.rs:168:21
    |
166 |     pub fn spawn<F>(future: F) -> JoinHandle<F::Output>
    |            ----- required by a bound in this function
167 |     where
168 |         F: Future + Send + 'static,
    |                     ^^^^ required by this bound in `spawn`

✂️
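
That leakage isn’t specific to tokio. Here is a std-only sketch of the same effect, assuming a made-up helper, spawn_on_thread, that copies the bounds of tokio::spawn but runs the future on a plain thread with a toy block_on executor (both names are hypothetical, as is NoopWake):

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, JoinHandle};

pub trait Service<Request> {
    type Response;
    type Error;

    // Note: nothing here says the returned future is `Send`.
    fn call(
        &mut self,
        request: Request,
    ) -> impl Future<Output = Result<Self::Response, Self::Error>> + 'static;
}

impl Service<i32> for i32 {
    type Response = i32;
    type Error = ();

    fn call(
        &mut self,
        request: i32,
    ) -> impl Future<Output = Result<Self::Response, Self::Error>> + 'static {
        let this = *self;
        async move { Ok(this + request) }
    }
}

// Hypothetical toy executor (assumption: our futures never really wait).
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(output) => return output,
            Poll::Pending => thread::yield_now(),
        }
    }
}

// Hypothetical stand-in for `tokio::spawn`, with the same bounds.
fn spawn_on_thread<F>(future: F) -> JoinHandle<F::Output>
where
    F: Future + Send + 'static,
    F::Output: Send + 'static,
{
    thread::spawn(move || block_on(future))
}

fn main() {
    let mut service: i32 = 2024;
    // Compiles: the compiler sees the concrete future behind
    // `<i32 as Service<i32>>::call`, and that one happens to be `Send`.
    let handle = spawn_on_thread(service.call(-34));
    println!("{:?}", handle.join().unwrap());
}

// Expected NOT to compile if uncommented: for a generic `S`, only the
// bounds written in the trait are known, and `Send` isn't one of them.
//
// fn generic<S: Service<i32>>(service: &mut S) {
//     spawn_on_thread(service.call(-34));
// }
```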

We also could’ve seen this problem with the previous implementation of Service for i32, whose future happens to be Send, if we did the spawning from a generic function that accepts any Service:

#[tokio::main]
async fn main() {
    let mut service: i32 = 2024;
    do_the_spawning(&mut service).await;
}

async fn do_the_spawning<S>(service: &mut S)
where
    S: Service<i32>,
{
    let fut1 = tokio::spawn(service.call(-34));
    let fut2 = tokio::spawn(service.call(-25));

    let (response1, response2) = tokio::try_join!(fut1, fut2).unwrap();
    println!("Got responses: {response1:?}, {response2:?}");
}

We can add bounds on Response and Error:



@@ -32,6 +32,8 @@ async fn main() {
 async fn do_the_spawning<S>(service: &mut S)
 where
     S: Service<i32>,
+    S::Response: Send + std::fmt::Debug + 'static,
+    S::Error: Send + std::fmt::Debug + 'static,
 {
     let fut1 = tokio::spawn(service.call(-34));
     let fut2 = tokio::spawn(service.call(-25));

Which will reduce the number of errors a bit… but the main issue remains: that future simply isn’t declared to be Send:



error[E0277]: `impl Future<Output = Result<<S as Service<i32>>::Response, <S as Service<i32>>::Error>> + 'static` cannot be sent between threads safely
   --> src/main.rs:38:29
    |
38  |     let fut1 = tokio::spawn(service.call(-34));
    |                ------------ ^^^^^^^^^^^^^^^^^ `impl Future<Output = Result<<S as Service<i32>>::Response, <S as Service<i32>>::Error>> + 'static` cannot be sent between threads safely
    |                |
    |                required by a bound introduced by this call
    |
    = help: the trait `Send` is not implemented for `impl Future<Output = Result<<S as Service<i32>>::Response, <S as Service<i32>>::Error>> + 'static`
note: required by a bound in `tokio::spawn`
   --> /Users/amos/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.42.0/src/task/spawn.rs:168:21
    |
166 |     pub fn spawn<F>(future: F) -> JoinHandle<F::Output>
    |            ----- required by a bound in this function
167 |     where
168 |         F: Future + Send + 'static,
    |                     ^^^^ required by this bound in `spawn`

And there’s nothing we can do about it at the callsite (in the signature of do_the_spawning).

That decision is made in the trait declaration. Luckily, with the impl Trait in return position, we can at least fix it:



sansioex on  main [!⇡] via 🦀 v1.83.0
❯ gwd
diff --git a/src/main.rs b/src/main.rs
index b32c487..3ff8f6f 100644
--- a/src/main.rs
+++ b/src/main.rs
@@ -7,7 +7,7 @@ pub trait Service<Request> {
     fn call(
         &mut self,
         request: Request,
-    ) -> impl Future<Output = Result<Self::Response, Self::Error>> + 'static;
+    ) -> impl Future<Output = Result<Self::Response, Self::Error>> + Send + 'static;
 }

 impl Service<i32> for i32 {
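
With Send in the trait declaration, a generic caller should type-check. The sketch below is a std-only approximation of do_the_spawning: it assumes a made-up spawn_on_thread helper carrying the same bounds as tokio::spawn, plus a toy block_on executor, and it returns the results instead of printing them. None of those helpers come from tokio or the article:

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, JoinHandle};

pub trait Service<Request> {
    type Response;
    type Error;

    // `Send` is now part of the contract, for every implementation.
    fn call(
        &mut self,
        request: Request,
    ) -> impl Future<Output = Result<Self::Response, Self::Error>> + Send + 'static;
}

impl Service<i32> for i32 {
    type Response = i32;
    type Error = ();

    fn call(
        &mut self,
        request: i32,
    ) -> impl Future<Output = Result<Self::Response, Self::Error>> + Send + 'static {
        let this = *self;
        async move { Ok(this + request) }
    }
}

// Hypothetical toy executor (assumption: our futures never really wait).
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(output) => return output,
            Poll::Pending => thread::yield_now(),
        }
    }
}

// Hypothetical stand-in for `tokio::spawn`, with the same bounds.
fn spawn_on_thread<F>(future: F) -> JoinHandle<F::Output>
where
    F: Future + Send + 'static,
    F::Output: Send + 'static,
{
    thread::spawn(move || block_on(future))
}

// Generic over any `Service<i32>`: only the trait's declared bounds are
// visible here, and they now include `Send`.
fn do_the_spawning<S>(service: &mut S) -> Vec<Result<S::Response, S::Error>>
where
    S: Service<i32>,
    S::Response: Send + 'static,
    S::Error: Send + 'static,
{
    let handle1 = spawn_on_thread(service.call(-34));
    let handle2 = spawn_on_thread(service.call(-25));
    vec![handle1.join().unwrap(), handle2.join().unwrap()]
}

fn main() {
    let mut service: i32 = 2024;
    println!("Got responses: {:?}", do_the_spawning(&mut service));
}
```

The flip side: an implementation whose future captures an Rc, like the something_not_send experiment above, would now be rejected at the impl itself.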

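But our trait is now strictly less versatile than the original tower Service trait: every implementation has to return a future that is both Send and 'static, whether its callers need that or not!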

Afterword

I’m personally excited about the future of async Rust.

The async WG has been putting out crates that help fill the current gaps:

trait-variant lets you declare both a Send and non-Send version of a trait in one go
dynosaur lets you use dynamic dispatch on traits with async fn in trait

Ultimately, the big thing I’m waiting for is dyn async traits — you will know the second it lands because I’ll be the first one to excitedly rave about it on Bluesky or Mastodon!

This is (was? you're done reading I guess) a dual feature! It's available as a video too. Watch on YouTube

