The curse of strong typing

👋 This page was last updated ~3 years ago. Just so you know.

It happened when I least expected it.

Someone, somewhere (above me, presumably) made a decision. "From now on", they declared, "all our new stuff must be written in Rust".

I'm not sure where they got that idea from. Maybe they've been reading propaganda. Maybe they fell prey to some confident asshole, and convinced themselves that Rust was the answer to their problems.

I don't know what they see in it, to be honest. It's like I always say: it's not a data race, it's a data marathon.

At any rate, I now find myself in a beautiful house, with a beautiful wife, and a lot of compile errors.

Jesus that's a lot of compile errors.

Different kinds of numbers

And it's not like I'm resisting progress! When someone made the case for using tau instead of pi, I was the first to hop on the bandwagon.

But Rust won't even let me do that:

fn main() {
    // only nerds need more digits
    println!("tau = {}", 2 * 3.14159265);
}
$ cargo run --quiet
error[E0277]: cannot multiply `{integer}` by `{float}`
 --> src/main.rs:3:28
  |
3 |     println!("tau = {}", 2 * 3.14159265);
  |                            ^ no implementation for `{integer} * {float}`
  |
  = help: the trait `Mul<{float}>` is not implemented for `{integer}`

For more information about this error, try `rustc --explain E0277`.
error: could not compile `grr` due to previous error

When it clearly works in ECMAScript for example:

// in `main.js`

// TODO: ask for budget increase so we can afford more digits
console.log(`tau = ${2 * 3.14159265}`);
$ node main.js
tau = 6.2831853

Luckily, a colleague rushes in to help me.

Cool bear

Well those... those are different types.

Types? Never heard of them.

Cool bear

You've seen the title of this post right? Strong typing?

Fine, I'll look it up. It says here that:

"Strong typing" generally refers to use of programming language types in order to both capture invariants of the code, and ensure its correctness, and definitely exclude certain classes of programming errors. Thus there are many "strong typing" disciplines used to achieve these goals.

Okay. What's incorrect about my code?

Cool bear

Oh, nothing! Nothing at all. These are just different types.

So it's just getting in the way right now yes, correct?

Cool bear

Well... sort of? But it's not like your program is running on an imaginary machine. There's a real difference between an "integer" and a "floating point number".

A floa-

Cool bear

Look at this for example:

package main

import "fmt"

func main() {
	a := 200000000
	for i := 0; i < 10; i++ {
		a *= 10
		fmt.Printf("a = %v\n", a)
	}
}

$ go run main.go
a = 2000000000
a = 20000000000
a = 200000000000
a = 2000000000000
a = 20000000000000
a = 200000000000000
a = 2000000000000000
a = 20000000000000000
a = 200000000000000000
a = 2000000000000000000

Yeah, that makes perfect sense! What's your point?

Cool bear

Well, if we keep going a little more...

package main

import "fmt"

func main() {
	a := 200000000
	//              👇
	for i := 0; i < 15; i++ {
		a *= 10
		fmt.Printf("a = %v\n", a)
	}
}
$ go run main.go
a = 2000000000
a = 20000000000
a = 200000000000
a = 2000000000000
a = 20000000000000
a = 200000000000000
a = 2000000000000000
a = 20000000000000000
a = 200000000000000000
a = 2000000000000000000
a = 1553255926290448384
a = -2914184810805067776
a = 7751640039368425472
a = 3729424098846048256
a = 400752841041379328

Oh. Oh no.

Cool bear

That's an overflow. We used a 64-bit integer variable, and to represent 20000000000000000000, we'd need 64.12 bits, which... that's more than we have.
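
For comparison, here's a minimal Rust sketch of that same multiplication. The helper names times_ten_checked and times_ten_wrapping are made up for illustration; checked_mul and wrapping_mul are standard methods on i64. The idea is only to show that you can ask for either behavior explicitly instead of finding out by accident.

```rust
/// Multiplies by ten, returning `None` if the result doesn't fit in an i64.
fn times_ten_checked(a: i64) -> Option<i64> {
    a.checked_mul(10)
}

/// Multiplies by ten, keeping only the low 64 bits on overflow
/// (the same wrap-around the Go program ran into).
fn times_ten_wrapping(a: i64) -> i64 {
    a.wrapping_mul(10)
}

fn main() {
    // The last value that still fits: 2 followed by 18 zeros.
    let a: i64 = 2_000_000_000_000_000_000;

    println!("checked:  {:?}", times_ten_checked(a));
    println!("wrapping: {}", times_ten_wrapping(a));
}
```

The checked version reports the overflow as None, and the wrapping version lands on 1553255926290448384, the first "wrong" value in the Go output.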

Okay, but again this works in ECMAScript for example:

let a = 200000000;
for (let i = 0; i < 15; i++) {
  a *= 10;
  console.log(`a = ${a}`);
}
$ node main.js
a = 2000000000
a = 20000000000
a = 200000000000
a = 2000000000000
a = 20000000000000
a = 200000000000000
a = 2000000000000000
a = 20000000000000000
a = 200000000000000000
a = 2000000000000000000
a = 20000000000000000000
a = 200000000000000000000
a = 2e+21
a = 2e+22
a = 2e+23

Sure, it's using nerd notation, but if we just go back, we can see it's working:

let a = 200000000;

for (let i = 0; i < 15; i++) {
  a *= 10;
  console.log(`a = ${a}`);
}

console.log("turn back!");

for (let i = 0; i < 15; i++) {
  a /= 10;
  console.log(`a = ${a}`);
}
$ node main.js
a = 2000000000
a = 20000000000
a = 200000000000
a = 2000000000000
a = 20000000000000
a = 200000000000000
a = 2000000000000000
a = 20000000000000000
a = 200000000000000000
a = 2000000000000000000
a = 20000000000000000000
a = 200000000000000000000
a = 2e+21
a = 2e+22
a = 2e+23
turn back!
a = 2e+22
a = 2e+21
a = 200000000000000000000
a = 20000000000000000000
a = 2000000000000000000
a = 200000000000000000
a = 20000000000000000
a = 2000000000000000
a = 200000000000000
a = 20000000000000
a = 2000000000000
a = 200000000000
a = 20000000000
a = 2000000000
a = 200000000

Mhh, looks like döner kebab.

Cool bear

Okay, but those are floating point numbers.

They don't look very floating to me.

Cool bear

Consider this:

let a = 0.1;
let b = 0.2;
let sum = a + b;

console.log(sum);
$ node main.js
0.30000000000000004

Ah, that... that does float.

Cool bear

Yeah, and that's the trade-off. You get to represent numbers that aren't whole numbers, and also /very large/ numbers, at the expense of some precision.
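
The same trade-off shows up in Rust's f64. A small sketch, assuming a hypothetical approx_eq helper and an arbitrary tolerance of 1e-9 (pick one that suits your data):

```rust
/// Returns true if `a` and `b` differ by at most `tolerance`.
fn approx_eq(a: f64, b: f64, tolerance: f64) -> bool {
    (a - b).abs() <= tolerance
}

fn main() {
    let sum = 0.1_f64 + 0.2_f64;

    // Exact comparison fails: the sum is not the f64 closest to 0.3.
    println!("sum == 0.3?   {}", sum == 0.3);

    // Comparing within a tolerance is the usual workaround.
    println!("close to 0.3? {}", approx_eq(sum, 0.3, 1e-9));
}
```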

I see.

Cool bear

For example, with floats, you can compute two thirds:

fn main() {
    println!("two thirds = {}", 2.0 / 3.0);
}
$ cargo run --quiet
two thirds = 0.6666666666666666
Cool bear

But with integers, you can't:

fn main() {
    println!("two thirds = {}", 2 / 3);
}
$ cargo run --quiet
two thirds = 0

Wait, but I don't see any actual types here. Just values.

Cool bear

Yeah, it's all inferred!

I uh. Okay I'm still confused. See, in ECMAScript, a number's a number:

console.log(typeof 36);
console.log(typeof 42.28);
$ node main.js
number
number
Cool bear

Unless it's a big number!

console.log(typeof 36);
console.log(typeof 42.28);
console.log(typeof 248672936507863405786027355423684n);
$ node main.js
number
number
bigint

Ahhh. So ECMAScript does have integers.

Cool bear

Only big ones. Well, they can be smol if you want them to. Operations just... are more expensive on them.

What about Python? Does Python have integers?

$ python3 -q
>>> type(38)
<class 'int'>
>>> type(38.139582735)
<class 'float'>
>>>

Mh, yeah, it does!

Cool bear

Try computing two thirds with it!

$ python3 -q
>>> 2/3
0.6666666666666666
>>> type(2)
<class 'int'>
>>> type(2/3)
<class 'float'>
>>>

Hey that works! So the / operator in python takes two int values and gives a float.

Cool bear

Not two int values. Two numbers. Could be anything.

$ python3 -q
>>> 2.8 / 1.4
2.0
>>>

What if I want to do integer division?

Cool bear

There's an operator for that!

$ python3 -q
>>> 10 // 3
3
>>>
Cool bear

Similarly, for addition you have ++...

$ python3 -q
>>> 2 + 3
5
>>> 2 ++ 3
5
>>>
Cool bear

And so on...

>>> 8 - 3
5
>>> 8 -- 3
11

Wait, no, I th-

>>> 8 * 3
24
>>> 8 ** 3
512
Cool bear

Woops, my bad — I guess it's just //. a ++ b really is a + (+b), a -- b is a - (-b), and a ** b is a to the bth power.

Okay so Python values have types, you just can't see them unless you ask.

Can I see the types of Rust values too?

Cool bear

Kinda! You can do this:

fn main() {
    dbg!(type_name_of(2));
    dbg!(type_name_of(268.2111));
}

fn type_name_of<T>(_: T) -> &'static str {
    std::any::type_name::<T>()
}
$ cargo run --quiet
[src/main.rs:2] type_name_of(2) = "i32"
[src/main.rs:3] type_name_of(268.2111) = "f64"

Okay. And so in Rust, a value like 42 defaults to i32 (signed 32-bit integer), and a value like 3.14 defaults to f64.

How do I make other number types? Surely there are others.

Cool bear

For literals, you can use suffixes:

$ cargo run --quiet
[src/main.rs:2] type_name_of(1_u8) = "u8"
[src/main.rs:3] type_name_of(1_u16) = "u16"
[src/main.rs:4] type_name_of(1_u32) = "u32"
[src/main.rs:5] type_name_of(1_u64) = "u64"
[src/main.rs:6] type_name_of(1_u128) = "u128"
[src/main.rs:8] type_name_of(1_i8) = "i8"
[src/main.rs:9] type_name_of(1_i16) = "i16"
[src/main.rs:10] type_name_of(1_i32) = "i32"
[src/main.rs:11] type_name_of(1_i64) = "i64"
[src/main.rs:12] type_name_of(1_i128) = "i128"
[src/main.rs:14] type_name_of(1_f32) = "f32"
[src/main.rs:15] type_name_of(1_f64) = "f64"
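
The listing behind that output isn't shown, so here's a reconstruction, assuming it reuses the type_name_of helper from earlier and that the blank lines fall where the line numbers in the output skip. Note that the standard library doesn't promise the exact strings type_name returns, so treat the names as informational.

```rust
fn main() {
    dbg!(type_name_of(1_u8));
    dbg!(type_name_of(1_u16));
    dbg!(type_name_of(1_u32));
    dbg!(type_name_of(1_u64));
    dbg!(type_name_of(1_u128));

    dbg!(type_name_of(1_i8));
    dbg!(type_name_of(1_i16));
    dbg!(type_name_of(1_i32));
    dbg!(type_name_of(1_i64));
    dbg!(type_name_of(1_i128));

    dbg!(type_name_of(1_f32));
    dbg!(type_name_of(1_f64));
}

// Same helper as before: ignores the value, reports the inferred type.
fn type_name_of<T>(_: T) -> &'static str {
    std::any::type_name::<T>()
}
```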

No f128?

Cool bear

Not builtin, no. For now.

Okay, so my original code here didn't work:

fn main() {
    // only nerds need more digits
    println!("tau = {}", 2 * 3.14159265);
}

That's because the 2 on the left is an integer, and the 3.14159265 is a floating point number, and so I have to do this:

    println!("tau = {}", 2.0 * 3.14159265);

Or this:

    println!("tau = {}", 2f64 * 3.14159265);

Or this, to be more readable, since apparently you can stuff _ anywhere in number literals:

    println!("tau = {}", 2_f64 * 3.14159265);
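
Suffixes only help with literals, though. If the 2 lives in an integer variable, one option is an explicit conversion. A sketch, assuming a made-up times_pi helper; f64::from and the PI and TAU constants are from the standard library (so there were more digits available all along):

```rust
use std::f64::consts::{PI, TAU};

/// Multiplies pi by an integer factor, converting the integer explicitly.
fn times_pi(factor: i32) -> f64 {
    // f64::from is lossless: every i32 has an exact f64 representation.
    f64::from(factor) * PI
}

fn main() {
    let two = 2; // inferred as i32 from the call below
    println!("tau = {}", times_pi(two));
    println!("tau = {}", TAU);
}
```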
Cool bear

What did we learn?

In ECMAScript, you have 64-bit floats (number), and bigints. Operations on bigints are significantly more expensive than operations on floats.

In Python, you have floats, and integers. Python 3 handles bigints seamlessly: doing arithmetic on small integer values is still "cheap".

In languages like Rust, you have integers and floats, but you need to pick a bit width. Number literals will default to i32 and f64, unless you add a suffix or... some other conditions described in the next section.

Conversions and type inference

Okay, I think I get it.

So whereas Python has an "integer" and a "float" type, Rust has different widths of integer types, like C and other systems languages.

So this doesn't work:

fn main() {
    let val = 280_u32;
    takes_u32(val);
    takes_u64(val);
}

fn takes_u32(val: u32) {
    dbg!(val);
}

fn takes_u64(val: u64) {
    dbg!(val);
}
$ cargo run --quiet
error[E0308]: mismatched types
 --> src/main.rs:4:15
  |
4 |     takes_u64(val);
  |               ^^^ expected `u64`, found `u32`
  |
help: you can convert a `u32` to a `u64`
  |
4 |     takes_u64(val.into());
  |                  +++++++

For more information about this error, try `rustc --explain E0308`.
error: could not compile `grr` due to previous error

And the compiler gives me a suggestion, but according to the heading of the section, the as keyword should work, too:

    takes_u64(val as u64);
$ cargo run --quiet
[src/main.rs:8] val = 280
[src/main.rs:12] val = 280
Cool bear

Yeah! And you see the definition of takes_u64? It has val: u64.

Yeah I see, I wrote it!

Cool bear

So that means the compiler knows that the argument to takes_u64 must be a u64, right?

Yeah?

Cool bear

So it should be able to infer it!

Yeah, this does work:

    takes_u64(230984423857928735);
Cool bear

Exactly! Whereas before, it defaulted the type of the literal to i32, this time it knows it should be a u64 in the end, so it turns the kind of squishy {integer} type into the very concrete u64 type.

Neat.

Cool bear

But it doesn't stop there — in a bunch of places in Rust, when you want to ask the compiler to "just figure it out", you can substitute _.

No... so you mean?

fn main() {
    let val = 280_u32;
    takes_u32(val);
    //              👇
    takes_u64(val as _);
}

// etc.
$ cargo run --quiet
[src/main.rs:8] val = 280
[src/main.rs:12] val = 280

Neat!

Let's try .into() too, since that's what the compiler suggested:

fn main() {
    let val = 280_u32;
    takes_u32(val);
    takes_u64(val.into());
}

// etc.

That works too!

Cool bear

Oooh, ooh, try it the other way around!

Like this?

fn main() {
    //             👇
    let val = 280_u64;
    //    👇
    takes_u64(val);
    //    👇
    takes_u32(val.into());
}
$ cargo run --quiet
error[E0277]: the trait bound `u32: From<u64>` is not satisfied
 --> src/main.rs:4:19
  |
4 |     takes_u32(val.into());
  |                   ^^^^ the trait `From<u64>` is not implemented for `u32`
  |
  = help: the following implementations were found:
            <u32 as From<Ipv4Addr>>
            <u32 as From<NonZeroU32>>
            <u32 as From<bool>>
            <u32 as From<char>>
          and 71 others
  = note: required because of the requirements on the impl of `Into<u32>` for `u64`

For more information about this error, try `rustc --explain E0277`.
error: could not compile `grr` due to previous error

Oh, it's not happy at all. It does helpfully suggest we could use an IPv4 address instead, which...

Cool bear

I know someone who'll think this diagnostic could use a little tune-up...

No no, we can try it, we got time:

use std::{net::Ipv4Addr, str::FromStr};

fn main() {
    takes_u32(Ipv4Addr::from_str("127.0.0.1").unwrap().into());
}

fn takes_u32(val: u32) {
    dbg!(val);
}
$ cargo run --quiet
[src/main.rs:8] val = 2130706433

...yes, okay.

Just like an IPv6 address can be a u128, if it believes:

use std::{net::Ipv6Addr, str::FromStr};

fn main() {
    takes_u128(Ipv6Addr::from_str("ff::d1:e3").unwrap().into());
}

fn takes_u128(val: u128) {
    dbg!(val);
}
$ cargo run --quiet
[src/main.rs:8] val = 1324035698926381045275276563964821731
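
In case 2130706433 looks like magic: it's just the four octets of 127.0.0.1 packed into 32 bits, first octet in the most significant byte. A quick sketch to check that, where ipv4_to_u32 is a hypothetical helper and octets and from_be_bytes are standard library methods:

```rust
use std::net::Ipv4Addr;

/// Packs the four octets of an IPv4 address into a u32,
/// first octet in the most significant byte.
fn ipv4_to_u32(addr: Ipv4Addr) -> u32 {
    u32::from_be_bytes(addr.octets())
}

fn main() {
    let localhost = Ipv4Addr::new(127, 0, 0, 1);

    // 127 * 2^24 + 0 * 2^16 + 0 * 2^8 + 1
    println!("{}", ipv4_to_u32(localhost));

    // The standard library's own conversion gives the same number.
    println!("{}", u32::from(localhost));
}
```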

But apparently a u64 can't be a u32?

Cool bear

Well... that's because not all values of type u64 fit into a u32.

Oh!

Cool bear

...that's why there's no impl From<u64> for u32...

Ah.

Cool bear

...but there is an impl TryFrom<u64> for u32.

Ah?

Cool bear

Because some u64 fit in a u32.

So err... we used .into() earlier... which we could do because... From?

And so because now we have TryFrom... .try_into()?

Cool bear

Yes! Because of this blanket impl and that blanket impl, respectively.
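
To make "blanket impl" a bit less abstract, here's a sketch with two made-up types, Meters and Even. We only write From and TryFrom by hand; the into() and try_into() calls come for free from the standard library's blanket implementations.

```rust
use std::convert::{TryFrom, TryInto};

/// Hypothetical newtype: a length in meters.
#[derive(Debug, PartialEq)]
struct Meters(u32);

// We only implement From...
impl From<u32> for Meters {
    fn from(n: u32) -> Self {
        Meters(n)
    }
}

/// Hypothetical newtype: a u32 that is known to be even.
#[derive(Debug, PartialEq)]
struct Even(u32);

// ...and only TryFrom, since not every u32 is even.
impl TryFrom<u32> for Even {
    type Error = String;

    fn try_from(n: u32) -> Result<Self, Self::Error> {
        if n % 2 == 0 {
            Ok(Even(n))
        } else {
            Err(format!("{n} is odd"))
        }
    }
}

fn to_meters(n: u32) -> Meters {
    // Provided by the blanket `impl<T, U: From<T>> Into<U> for T`.
    n.into()
}

fn to_even(n: u32) -> Result<Even, String> {
    // Provided by the blanket `impl<T, U: TryFrom<T>> TryInto<U> for T`.
    n.try_into()
}

fn main() {
    println!("{:?}", to_meters(5));
    println!("{:?}", to_even(4));
    println!("{:?}", to_even(3));
}
```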

I have a feeling we'll come back to these later... but for now, let's give it a shot:

fn main() {
    let val: u64 = 48_000;
    takes_u32(val.try_into().unwrap());
}

fn takes_u32(val: u32) {
    dbg!(val);
}

This compiles, and runs.

As for this:

fn main() {
    let val: u64 = 25038759283948;
    takes_u32(val.try_into().unwrap());
}

It compiles, but does not run!

$ cargo run --quiet
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: TryFromIntError(())', src/main.rs:3:30
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Makes sense so far.
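
Panicking isn't the only option, of course: the conversion returns a Result, so we get to decide what "doesn't fit" means. A sketch with two hypothetical helpers, narrow and narrow_saturating (the clamping policy is just one possible choice):

```rust
use std::convert::TryFrom;

/// Narrows a u64 to a u32, or returns None if it doesn't fit.
fn narrow(val: u64) -> Option<u32> {
    u32::try_from(val).ok()
}

/// Narrows a u64 to a u32, clamping to u32::MAX when it doesn't fit.
fn narrow_saturating(val: u64) -> u32 {
    u32::try_from(val).unwrap_or(u32::MAX)
}

fn main() {
    for val in [48_000_u64, 25038759283948] {
        match narrow(val) {
            Some(small) => println!("{val} fits: {small}"),
            None => println!("{val} doesn't fit, clamped to {}", narrow_saturating(val)),
        }
    }
}
```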

And that's... that's all of it right?

Cool bear

Not quite! You can parse stuff.

Ah, like we just did with Ipv4Addr::from_str right?

Cool bear

Yes! But just like T::from(val) has val.into(), T::from_str(val) has val.parse().

Fantastic! Let's give it a go:

fn main() {
    let val = "1234".parse();
    dbg!(val);
}
$ cargo run --quiet
error[E0284]: type annotations needed for `Result<F, _>`
 --> src/main.rs:2:22
  |
2 |     let val = "1234".parse();
  |         ---          ^^^^^ cannot infer type for type parameter `F` declared on the associated function `parse`
  |         |
  |         consider giving `val` the explicit type `Result<F, _>`, where the type parameter `F` is specified
  |
  = note: cannot satisfy `<_ as FromStr>::Err == _`

For more information about this error, try `rustc --explain E0284`.
error: could not compile `grr` due to previous error

Oh it's... unhappy? Again?

Cool bear

Consider this: what do you want to parse to?

A number, clearly! The string is 1234.

See, ECMAScript gets it right:

let a = "1234";
console.log({ a });
let b = parseInt(a, 10);
console.log({ b });
$ node main.js
{ a: '1234' }
{ b: 1234 }
Cool bear

Nnnnonono, you said parseInt, not just parse.

Okay fine, let's not say parse at all then:

let a = "1234";
console.log({ a });
let b = +a;
console.log({ b });
$ node main.js
{ a: '1234' }
{ b: 1234 }
Cool bear

Okay but the unary plus operator here coerces a string to a number, and in that case the only sensible thing to do is...

Nah nah nah, that's too easy. I think you're just looking for excuses. The truth is, ECMAScript is production-ready in a way that Rust isn't, and never will be.

Those fools at work have it coming. Soon they'll realize! They've been had. They've been swindled. They've developed a taste for snake o-

Cool bear

JUST ADD : u64 AFTER let val WILL YOU

fn main() {
    let val: u64 = "2930482035982309".parse().unwrap();
    dbg!(val);
}
$ cargo run --quiet
[src/main.rs:3] val = 2930482035982309

Oh.

Yeah that tracks. And I suppose, since we have to care about bit widths here, that if I change it to u32...

fn main() {
    let val: u32 = "2930482035982309".parse().unwrap();
    dbg!(val);
}
$ cargo run --quiet
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: ParseIntError { kind: PosOverflow }', src/main.rs:2:47
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

It errors out, because that doesn't fit in a u32. I see.
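
Two things worth sketching here: the target type can also be named on the call itself (the "turbofish", parse::<u32>()), and the error can be handled instead of unwrapped. The parse_u32 helper below is made up for the occasion.

```rust
use std::num::ParseIntError;

/// Parses a decimal string as a u32, naming the target type with a turbofish.
fn parse_u32(s: &str) -> Result<u32, ParseIntError> {
    s.parse::<u32>()
}

fn main() {
    for input in ["1234", "2930482035982309", "twelve"] {
        match parse_u32(input) {
            Ok(n) => println!("{input:?} parsed as {n}"),
            Err(e) => println!("{input:?} failed: {e}"),
        }
    }
}
```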

Cool bear

YES. NOW TRY CASTING THAT VALUE AS AN u64 TO A u32.

Cool down, bear! I'll try, I'll try:

fn main() {
    let a = 2930482035982309_u64;
    println!("a = {a} (u64)");

    let b = a as u32;
    println!("b = {b} (u32)");
}
$ cargo run --quiet
a = 2930482035982309 (u64)
b = 80117733 (u32)

Oh. It's... it's not crashing, just... doing the wrong thing?

Cool bear

YES THAT WAS MY POINT THANK YOU

Yeesh okay how about you take a minute there, bear. So I agree that number shouldn't fit in a u32, so it's doing... something with it.

Maybe if we print it as hex:

fn main() {
    let a = 2930482035982309_u64;
    println!("a = {a:016x} (u64)");

    let b = a as u32;
    println!("b = {b:016x} (u32)");
}
$ cargo run --quiet
a = 000a694204c67fe5 (u64)
b = 0000000004c67fe5 (u32)
            👆

Oh yeah okay! It's truncating it!

It's even clearer in binary:

fn main() {
    let a = 2930482035982309_u64;
    println!("a = {a:064b} (u64)");

    let b = a as u32;
    println!("b = {b:064b} (u32)");
}
$ cargo run --quiet
a = 0000000000001010011010010100001000000100110001100111111111100101 (u64)
b = 0000000000000000000000000000000000000100110001100111111111100101 (u32)
                                   👆
Cool bear

YES THAT'S THE PROBLEM WITH as. YOU CAN TRUNCATE VALUES WHEN YOU DIDN'T INTEND TO.

Ah. But it's shorter and super convenient still, right?

Cool bear

I GUESS!

Gotcha.
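
For the record, here's a sketch of what as actually computes in a few narrowing and sign-changing casts. The truncate helper is hypothetical; the numbers are the ones from the example above.

```rust
/// What `as u32` does to a u64: keep the low 32 bits, drop the rest.
fn truncate(val: u64) -> u32 {
    val as u32
}

fn main() {
    let a = 2930482035982309_u64;

    // Truncation gives the same result as taking the value modulo 2^32.
    println!("{}", truncate(a));
    println!("{}", a % (1_u64 << 32));

    // Signed/unsigned casts reinterpret the bits instead of failing.
    println!("{}", -1_i32 as u32);
    println!("{}", 300_i32 as u8);
}
```

None of these casts can fail, which is exactly why they're convenient, and exactly why they're risky.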

Generics and enums

Cool bear

Wait wait wait, we haven't even talked about strings yet. Are you sure about that heading?

Hell yeah! Generics are baby stuff: you just slap a couple angle brackets, or "chevrons" if you want to be fancy, and boom, Bob's your uncle!

Cool bear

Ew.

Amos

Not that Bob.

See, this for example:

fn show<T>(a: T) {
    todo!()
}

Now we can call it with a value a of type T, for any T!

fn main() {
    show(42);
    show("blah");
}
Cool bear

Okay yeah but you haven't implemented it yet!

True true, it panics right now:

$ cargo run --quiet
thread 'main' panicked at 'not yet implemented', src/main.rs:7:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

But we could... I don't know, we could display it!

fn main() {
    show(42);
    show("blah");
}

fn show<T>(a: T) {
    println!("a = {}", a);
}
$ cargo run --quiet
error[E0277]: `T` doesn't implement `std::fmt::Display`
 --> src/main.rs:7:24
  |
7 |     println!("a = {}", a);
  |                        ^ `T` cannot be formatted with the default formatter
  |
  = note: in format strings you may be able to use `{:?}` (or {:#?} for pretty-print) instead
  = note: this error originates in the macro `$crate::format_args_nl` (in Nightly builds, run with -Z macro-backtrace for more info)
help: consider restricting type parameter `T`
  |
6 | fn show<T: std::fmt::Display>(a: T) {
  |          +++++++++++++++++++

For more information about this error, try `rustc --explain E0277`.
error: could not compile `grr` due to previous error

Mhhhhhh. Does not implement Display.

Okay maybe {:?} instead of {} then?

fn show<T>(a: T) {
    println!("a = {:?}", a);
}
$ cargo run --quiet
error[E0277]: `T` doesn't implement `Debug`
 --> src/main.rs:7:26
  |
7 |     println!("a = {:?}", a);
  |                          ^ `T` cannot be formatted using `{:?}` because it doesn't implement `Debug`
  |
  = note: this error originates in the macro `$crate::format_args_nl` (in Nightly builds, run with -Z macro-backtrace for more info)
help: consider restricting type parameter `T`
  |
6 | fn show<T: std::fmt::Debug>(a: T) {
  |          +++++++++++++++++

For more information about this error, try `rustc --explain E0277`.
error: could not compile `grr` due to previous error

Oh now it doesn't implement Debug.

Well. Okay! Maybe show can't do anything useful with its argument, but at least you can pass any type to it.

And, because T is a type like any other...

Cool bear

A "type parameter", technically, but who's keeping track.

...you can use it several times, probably!

fn main() {
    show(5, 7);
    show("blah", "bleh");
}

fn show<T>(a: T, b: T) {
    todo!()
}

Yeah, see, that works!

And if we do this:

fn main() {
    show(42, "aha")
}

fn show<T>(a: T, b: T) {
    todo!()
}

It... oh.

$ cargo run --quiet
error[E0308]: mismatched types
 --> src/main.rs:2:14
  |
2 |     show(42, "aha")
  |              ^^^^^ expected integer, found `&str`

For more information about this error, try `rustc --explain E0308`.
error: could not compile `grr` due to previous error

Well that's interesting. I guess they have to match? So like it's using the first argument, 42, to infer T, and then the second one has to match, alright.

Cool bear

Yeah, and you'll notice it says "expected integer", not "expected i32".

So that means this would work:

    show(42, 256_u64)

And it does!
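
We can even ask the compiler what it picked for T. A sketch, assuming a made-up pair_type helper in the spirit of type_name_of from earlier (same caveat: the exact strings type_name returns aren't guaranteed):

```rust
/// Takes two arguments of the same type and reports what `T` ended up being.
fn pair_type<T>(_a: T, _b: T) -> &'static str {
    std::any::type_name::<T>()
}

fn main() {
    // No other constraint: both literals fall back to i32.
    println!("{}", pair_type(5, 7));

    // The suffix pins T to u64, and the unsuffixed 42 follows along.
    println!("{}", pair_type(42, 256_u64));
}
```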

And if we want two genuinely different types, I guess we have to... use two dif-

Cool bear

Use two different type parameters, yes.

fn main() {
    show(4, "hi")
}

fn show<A, B>(a: A, b: B) {
    todo!()
}

That works! Alright.

Well we don't know how to do anything useful with these values yet, but-

Cool bear

Yes, that's what you get for trying to skip ahead.

How about a nice enum instead?

Something like this?

fn main() {
    show(Answer::Maybe)
}

enum Answer {
    Yes,
    No,
    Maybe,
}

fn show(answer: Answer) {
    let s = match answer {
        Answer::Yes => "yes",
        Answer::No => "no",
        Answer::Maybe => "maybe",
    };
    println!("the answer is {s}");
}
$ cargo run --quiet
the answer is maybe
Cool bear

I mean, yeah sure. That's a good starting point.

And maybe you want me to learn about this, too?

fn is_yes(answer: Answer) -> bool {
    if let Answer::Yes = answer {
        true
    } else {
        false
    }
}
Cool bear

Sure, but mostly I w-

Or better still, this?

fn is_yes(answer: Answer) -> bool {
    matches!(answer, Answer::Yes)
}
Cool bear

No, more like this:

fn main() {
    show(Either::Character('C'));
    show(Either::Number(64));
}

enum Either {
    Number(i64),
    Character(char),
}

fn show(either: Either) {
    match either {
        Either::Number(n) => println!("{n}"),
        Either::Character(c) => println!("{c}"),
    }
}
$ cargo run --quiet
C
64

Oh, yeah, that's pretty good. So like enum variants that... hold some data?

Cool bear

Yes!

And you can do pattern matching to know which variant it is, and to access what's inside.

And I suppose it's safe too, as in it won't let you accidentally access the wrong variant?

Cool bear

Yes, yes of course. These are no C unions. They're tagged unions. Or choice types. Or sum types. Or coproducts.

Let's just stick with "enums".

But that's great news: I can finally write functions that can handle multiple types, even without understanding generics!

And I suppose... conversions could help there too? Like what if I could do this?

fn main() {
    show('C'.into());
    show(64.into());
}
Cool bear

Sure, you can do that. Just implement a couple traits!

Traits? But we're in the enums sect-

Implementing traits

Ah, here we are. Couple traits, okay, show me!

fn main() {
    show('C'.into());
    show(64.into());
}

enum Either {
    Number(i64),
    Character(char),
}

//        👇
impl From<i64> for Either {
    fn from(n: i64) -> Self {
        Either::Number(n)
    }
}

//        👇
impl From<char> for Either {
    fn from(c: char) -> Self {
        Either::Character(c)
    }
}

fn show(either: Either) {
    match either {
        Either::Number(n) => println!("{n}"),
        Either::Character(c) => println!("{c}"),
    }
}
$ cargo run --quiet
C
64

Hey, that's pretty good! But we haven't declared that From trait anywhere, let's see... ah, here's what it looks like, from the Rust standard library:

pub trait From<T> {
    fn from(value: T) -> Self;
}

Ah, that's refreshingly short. And Self is?

Cool bear

The type you're implementing From<T> for.

And then I suppose Into is also in there somewhere?

pub trait Into<T> {
    fn into(self) -> T;
}

Right! And self is...

Cool bear

...short for self: Self, in that position.
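
A small sketch of that shorthand, using a made-up Label type. Each pair of spellings means the same thing to the compiler:

```rust
struct Label(String);

impl Label {
    // `self` is sugar for `self: Self`: the method takes ownership.
    fn into_text(self) -> String {
        self.0
    }

    // The same receiver, spelled out in full.
    fn into_text_explicit(self: Self) -> String {
        self.0
    }

    // `&self` is sugar for `self: &Self`: the method only borrows.
    fn len(&self) -> usize {
        self.0.len()
    }
}

fn main() {
    let label = Label(String::from("bear"));
    println!("{}", label.len()); // borrows, so `label` is still usable
    println!("{}", label.into_text()); // consumes `label`
    println!("{}", Label(String::from("cool")).into_text_explicit());
}
```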

And I suppose there are other traits?

Wait, are Display and Debug traits?

Cool bear

They are! Here, let me show you something:

use std::fmt::Display;

fn main() {
    show(&'C');
    show(&64);
}

fn show(v: &dyn Display) {
    println!("{v}");
}
$ cargo run --quiet
C
64

Whoa. WHOA. Game changer. No .into() needed, it just works? Very cool.

Cool bear

Now let me show you something else:

use std::fmt::Display;

fn main() {
    show(&'C');
    show(&64);
}

fn show(v: impl Display) {
    println!("{v}");
}

That works too? No way! v can be whichever type implements Display! So nice!

Cool bear

Yes! It's the shorter way of spelling this:

fn show<D: Display>(v: D) {
    println!("{v}");
}

Ah!!! So that's how you add a... how you tell the compiler that the type must implement something.

Cool bear

A trait bound, yes. There's an even longer way to spell this:

fn show<D>(v: D)
where
    D: Display,
{
    println!("{v}");
}

Okay, that... I mean if you ignore all the punctuation going on, this almost reads like English. If English were maths. Well, the kind of maths compilers think about. Possibly type theory?
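
Where the long spelling starts to pay off, as far as I can tell, is when there's more than one bound. A sketch with a hypothetical describe function that needs both Display and Debug:

```rust
use std::fmt::{Debug, Display};

/// Formats a value both ways; `T` must implement both traits.
fn describe<T>(v: T) -> String
where
    T: Display + Debug,
{
    format!("{v} / {v:?}")
}

fn main() {
    println!("{}", describe('C'));
    println!("{}", describe("hi"));
    println!("{}", describe(64));
}
```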

Return position

Wait, I didn't type that heading. Cool bear??

Cool bear

Shh, look at this.

use std::fmt::Display;

fn main() {
    show(get_char());
    show(get_int());
}

fn get_char() -> impl Display {
    'C'
}

fn get_int() -> impl Display {
    64
}

fn show(v: impl Display) {
    println!("{v}");
}

Okay. So we can use impl Display "in return position", if we don't feel like typing it all out. That's good.

And I suppose, since impl Trait is much like generics, we can probably do something like:

use std::fmt::Display;

fn main() {
    show(get_char_or_int(true));
    show(get_char_or_int(false));
}

fn get_char_or_int(give_char: bool) -> impl Display {
    if give_char {
        'C'
    } else {
        64
    }
}

fn show(v: impl Display) {
    println!("{v}");
}
$ cargo run --quiet
error[E0308]: `if` and `else` have incompatible types
  --> src/main.rs:12:9
   |
9  | /     if give_char {
10 | |         'C'
   | |         --- expected because of this
11 | |     } else {
12 | |         64
   | |         ^^ expected `char`, found integer
13 | |     }
   | |_____- `if` and `else` have incompatible types

For more information about this error, try `rustc --explain E0308`.
error: could not compile `grr` due to previous error

Ah. No I cannot.

So our return type is impl Display... ah, and it infers it to be char, because that's the first thing we return! And so the other thing must also be char.

But it's not.

Well I'm lost. Bear, how do we get out of this?

Bear?

...okay maybe... generics? 🤷

fn get_char_or_int<D: Display>(give_char: bool) -> D {
    if give_char {
        'C'
    } else {
        64
    }
}
$ cargo run --quiet
error[E0282]: type annotations needed
 --> src/main.rs:4:5
  |
4 |     show(get_char_or_int(true));
  |     ^^^^ cannot infer type for type parameter `impl Display` declared on the function `show`

error[E0308]: mismatched types
  --> src/main.rs:10:9
   |
8  | fn get_char_or_int<D: Display>(give_char: bool) -> D {
   |                    -                               -
   |                    |                               |
   |                    |                               expected `D` because of return type
   |                    this type parameter             help: consider using an impl return type: `impl Display`
9  |     if give_char {
10 |         'C'
   |         ^^^ expected type parameter `D`, found `char`
   |
   = note: expected type parameter `D`
                        found type `char`

error[E0308]: mismatched types
  --> src/main.rs:12:9
   |
8  | fn get_char_or_int<D: Display>(give_char: bool) -> D {
   |                    -                               -
   |                    |                               |
   |                    |                               expected `D` because of return type
   |                    this type parameter             help: consider using an impl return type: `impl Display`
...
12 |         64
   |         ^^ expected type parameter `D`, found integer
   |
   = note: expected type parameter `D`
                        found type `{integer}`

Some errors have detailed explanations: E0282, E0308.
For more information about an error, try `rustc --explain E0282`.
error: could not compile `grr` due to 3 previous errors

Err, ew, no, go back, that's even worse.

Cool bear

Yeah that'll never work.

Bear where were you!

Cool bear

Bear business. You wouldn't get it.

I...

Cool bear

It'll never work, but the compiler's got your back: it tells you you should be using impl Display.

But that's what I tried first!

Cool bear

Okay well, the impl Display in question can only be a single type.
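
To see the "single type" rule from the other side, here's a sketch where both branches happen to produce the same concrete type, a String. This isn't the fix we're heading towards, just an illustration of what the compiler will accept behind impl Display:

```rust
use std::fmt::Display;

// Both branches return a String, so `impl Display` stands for exactly
// one concrete type and the compiler is satisfied.
fn get_char_or_int(give_char: bool) -> impl Display {
    if give_char {
        'C'.to_string()
    } else {
        64.to_string()
    }
}

fn show(v: impl Display) {
    println!("{v}");
}

fn main() {
    show(get_char_or_int(true));
    show(get_char_or_int(false));
}
```

The caller still can't tell it's a String; all it knows is that the value implements Display.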

But then what good is it?

Cool bear

Okay let's back up. You remember how you made an enum to handle arguments of two different types?

Vaguely? Oh I can do that here too, can't I.

Let's see 🎶

use std::fmt::Display;

fn main() {
    show(get_char_or_int(true));
    show(get_char_or_int(false));
}

enum Either {
    Char(char),
    Int(i64),
}

fn get_char_or_int(give_char: bool) -> Either {
    if give_char {
        Either::Char('C')
    } else {
        Either::Int(64)
    }
}

fn show(v: impl Display) {
    println!("{v}");
}
$ cargo run --quiet
error[E0277]: `Either` doesn't implement `std::fmt::Display`
  --> src/main.rs:4:10
   |
4  |     show(get_char_or_int(true));
   |     ---- ^^^^^^^^^^^^^^^^^^^^^ `Either` cannot be formatted with the default formatter
   |     |
   |     required by a bound introduced by this call
   |
   = help: the trait `std::fmt::Display` is not implemented for `Either`
   = note: in format strings you may be able to use `{:?}` (or {:#?} for pretty-print) instead
note: required by a bound in `show`
  --> src/main.rs:21:17
   |
21 | fn show(v: impl Display) {
   |                 ^^^^^^^ required by this bound in `show`

error[E0277]: `Either` doesn't implement `std::fmt::Display`
  --> src/main.rs:5:10
   |
5  |     show(get_char_or_int(false));
   |     ---- ^^^^^^^^^^^^^^^^^^^^^^ `Either` cannot be formatted with the default formatter
   |     |
   |     required by a bound introduced by this call
   |
   = help: the trait `std::fmt::Display` is not implemented for `Either`
   = note: in format strings you may be able to use `{:?}` (or {:#?} for pretty-print) instead
note: required by a bound in `show`
  --> src/main.rs:21:17
   |
21 | fn show(v: impl Display) {
   |                 ^^^^^^^ required by this bound in `show`

For more information about this error, try `rustc --explain E0277`.
error: could not compile `grr` due to 2 previous errors

Oh, wait, wait, I know this! I can just implement Display for Either:

impl Display for Either {
    // ...
}

Wait, what do I put in there?

Cool bear

Use the rust-analyzer code generation assist.

You do have it installed, right?

Yes haha, of course, yes. Okay so Ctrl+. (Cmd+. on macOS), pick "Implement missing members", and... it gives me this:

impl Display for Either {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        todo!()
    }
}

...and then I guess I just match on self? To call either the Display implementation for char or for i64?

impl Display for Either {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            //
        }
    }
}

Wait, what do I write there?

Cool bear

Use the rust-analyzer code generation assist.

Sounding like a broken record, you doing ok bear?

Cool bear

I am. There's a different code generation assist for this. Alternatively, GitHub Copilot might write the whole block for you.

It's getting better. It's learning.

Okay, using the "Fill match arms" assist...

impl Display for Either {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            Either::Char(_) => todo!(),
            Either::Int(_) => todo!(),
        }
    }
}

Okay I can do the rest!

impl Display for Either {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            Either::Char(c) => c.fmt(f),
            Either::Int(i) => i.fmt(f),
        }
    }
}

And this now runs!

$ cargo run --quiet
C
64
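
(A small aside, as a sketch rather than anything the exchange above depends on: because the match arms forward the same formatter to `c.fmt(f)` and `i.fmt(f)`, formatting flags such as width and alignment keep working, and `to_string()` comes along through the standard library's blanket `ToString` impl. The `main` below is only illustrative.)

```rust
use std::fmt::{self, Display};

enum Either {
    Char(char),
    Int(i64),
}

impl Display for Either {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // Forward the formatter untouched, so width/alignment flags survive.
            Either::Char(c) => c.fmt(f),
            Either::Int(i) => i.fmt(f),
        }
    }
}

fn main() {
    // `to_string` is available for anything that implements `Display`.
    println!("{}", Either::Char('C').to_string());
    // Right-align the value in a field that is 4 characters wide.
    println!("[{:>4}]", Either::Int(64));
}
```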

Nice. But that was, like, super verbose. Can we make it less verbose?

Cool bear

Sure! You can use the delegate crate, for instance.

Okay okay I remember that bit, so you just:

$ cargo add delegate
    Updating 'https://github.com/rust-lang/crates.io-index' index
      Adding delegate v0.6.2 to dependencies.

And then... wait, what do we delegate to?

Cool bear

Oh I'll give you this one for free:

impl Either {
    fn display(&self) -> &dyn Display {
        match self {
            Either::Char(c) => c,
            Either::Int(i) => i,
        }
    }
}

Wait wait wait but that's interesting as heck. You don't need traits to add methods to types like that? You can return a &dyn Trait object? That borrows from &self? Which is short for self: &Self? And it extends the lifetime of the receiver, also called a borrow-through???

Cool bear

Heyyyyyyyyy now where did you learn all that, we covered nothing of this.

Hehehe okay forget about it.
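
(For the curious, here is a minimal sketch of what that outburst was about. It is the same `display` method, only with the lifetime that Rust normally elides written out by hand; the name `'a` is arbitrary.)

```rust
use std::fmt::Display;

enum Either {
    Char(char),
    Int(i64),
}

impl Either {
    // `&self` is short for `self: &Self`. The returned trait object
    // borrows from `self`, so it cannot outlive the `Either` it came from.
    fn display<'a>(&'a self) -> &'a dyn Display {
        match self {
            Either::Char(c) => c,
            Either::Int(i) => i,
        }
    }
}

fn main() {
    let e = Either::Int(64);
    // `d` borrows from `e`: moving or dropping `e` while `d` is still
    // in use would be a compile error.
    let d = e.display();
    println!("{d}");
    println!("{}", Either::Char('C').display());
}
```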

Okay so now that we've got a display method we can do this:

impl Display for Either {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        self.display().fmt(f)
    }
}

And that's where the delegate crate comes in to make things simpler (or at least shorter), mhh, looking at the README, we can probably do...

impl Display for Either {
    delegate::delegate! {
        to self.display() {
            fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result;
        }
    }
}
Cool bear

Yeah! Or, you know, use delegate::delegate; first, and then you can just call the macro with delegate! instead of qualifying it with delegate::delegate!.

There's even a rust-analyzer assist for it — "replace qualified path with use".

Macros? Qualified paths? Wow, we're glossing over a lot of things.

Cool bear

Not that many, but yes.

Anyway, that all works! Here's the complete listing:

use delegate::delegate;
use std::fmt::Display;

fn main() {
    show(get_char_or_int(true));
    show(get_char_or_int(false));
}

impl Either {
    fn display(&self) -> &dyn Display {
        match self {
            Either::Char(c) => c,
            Either::Int(i) => i,
        }
    }
}

impl Display for Either {
    delegate! {
        to self.display() {
            fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result;
        }
    }
}

enum Either {
    Char(char),
    Int(i64),
}

fn get_char_or_int(give_char: bool) -> Either {
    if give_char {
        Either::Char('C')
    } else {
        Either::Int(64)
    }
}

fn show(v: impl Display) {
    println!("{v}");
}
$ cargo run --quiet
C
64

But... it feels a little wrong to have to write all that code just to do that.

Cool bear

Ah, that's because you don't!

Dynamically-sized types

Uhhh. What does any of that mean?

Cool bear

Okay, so it's more implementation details: just like bit widths (u32 vs u64), etc. But details are where the devil vacations.

Try printing the size of a few things with std::mem::size_of.

Okay then!

fn main() {
    dbg!(std::mem::size_of::<u32>());
    dbg!(std::mem::size_of::<u64>());
    dbg!(std::mem::size_of::<u128>());
}
$ cargo run --quiet
[src/main.rs:2] std::mem::size_of::<u32>() = 4
[src/main.rs:3] std::mem::size_of::<u64>() = 8
[src/main.rs:4] std::mem::size_of::<u128>() = 16

Okay, 32 bits is 4 bytes, that checks out on x86_64.

Cool bear

Wait, where did you learn that syntax?

Ehh you showed it to me with type_name and, I looked it up: turns out it's named turbofish syntax! The name was cute, so I remembered.

Cool bear

Okay, now try references.

Sure!

fn main() {
    dbg!(std::mem::size_of::<&u32>());
    dbg!(std::mem::size_of::<&u64>());
    dbg!(std::mem::size_of::<&u128>());
}
$ cargo run --quiet
[src/main.rs:2] std::mem::size_of::<&u32>() = 8
[src/main.rs:3] std::mem::size_of::<&u64>() = 8
[src/main.rs:4] std::mem::size_of::<&u128>() = 8

Yeah, they're all 64-bit! Again, I'm on an x86_64 CPU right now, so that's not super surprising.
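
(A quick way to check this without hardcoding 8, sketched under the assumption that we only care about references to sized types like the integers above. The helper name `is_pointer_sized` is made up for this example.)

```rust
use std::mem::size_of;

// Hypothetical helper: is `T` exactly as wide as a `usize` on this target?
fn is_pointer_sized<T>() -> bool {
    size_of::<T>() == size_of::<usize>()
}

fn main() {
    // `usize::BITS` is the pointer width of the machine we compile for.
    println!("pointer width: {} bits", usize::BITS);
    // References to sized types are a single address each.
    dbg!(is_pointer_sized::<&u32>());
    dbg!(is_pointer_sized::<&u128>());
    // The integers themselves keep their own sizes, whatever the target.
    dbg!(size_of::<u32>(), size_of::<u128>());
}
```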

Cool bear

Now try trait objects.

Oh, the dyn Trait stuff?

use std::fmt::Debug;

fn main() {
    dbg!(std::mem::size_of::<dyn Debug>());
}
$ cargo run --quiet
error[E0277]: the size for values of type `dyn std::fmt::Debug` cannot be known at compilation time
   --> src/main.rs:4:10
    |
4   |     dbg!(std::mem::size_of::<dyn Debug>());
    |          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ doesn't have a size known at compile-time
    |
    = help: the trait `Sized` is not implemented for `dyn std::fmt::Debug`
note: required by a bound in `std::mem::size_of`
   --> /home/amos/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/mem/mod.rs:304:22
    |
304 | pub const fn size_of<T>() -> usize {
    |                      ^ required by this bound in `std::mem::size_of`

For more information about this error, try `rustc --explain E0277`.
error: could not compile `grr` due to previous error

Oh. But that's... mhh.

Cool bear

What type is dyn Debug? What size would you expect it to have?

I don't know, I suppose... I suppose a lot of types implement Debug? Like, u32 does, u64 does, u128 does too, and String, and...

Cool bear

Exactly. It could be any of these, and then some. So it's impossible to know what size it is, because it could have any size.

Heck, even the empty tuple type, (), implements Debug!

fn main() {
    dbg!(std::mem::size_of::<()>());
    println!("{:?}", ());
}
$ cargo run --quiet
[src/main.rs:2] std::mem::size_of::<()>() = 0
()
Cool bear

...and it's a zero-sized type! (a ZST). So dyn Debug, or any other "trait object", is a DST: a dynamically-sized type.
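
("Dynamically-sized" does not mean "unknowable", only "not known at compile time". As a minimal sketch: `std::mem::size_of_val` accepts unsized types, so it can report the size of whatever happens to sit behind a trait object once the program runs. The `dyn_size` helper is hypothetical.)

```rust
use std::fmt::Debug;
use std::mem::{size_of, size_of_val};

// Hypothetical helper: the compiler cannot know this size ahead of time,
// but the value behind the reference has one at runtime.
fn dyn_size(value: &dyn Debug) -> usize {
    size_of_val(value)
}

fn main() {
    dbg!(dyn_size(&()));
    dbg!(dyn_size(&42_u32));
    dbg!(dyn_size(&42_u128));
    // For a `String`, compare against the statically known size instead
    // of hardcoding a number.
    dbg!(dyn_size(&String::from("hi")) == size_of::<String>());
}
```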

Wait, but we did return a &dyn Display at some point, right?

Cool bear

Ah, yes, but references al-

...all have the same size! Right!!! Because you're not holding the actual value, you're just holding the address of it!

Cool bear

Exactly!

use std::mem::size_of_val;

fn main() {
    let a = 101_u128;
    println!("{:16}, of size {}", a, size_of_val(&a));
    println!("{:16p}, of size {}", &a, size_of_val(&&a));
}
$ cargo run --quiet
             101, of size 16
  0x7ffdc4fb8af8, of size 8

And so uh... what was that about us not needing the enum at all?

Cool bear

We're getting to it!

Storing stuff in structs

Oh structs, those are easy, just like other languages right?

Like that:

#[derive(Debug)]
struct Vec2 {
    x: f64,
    y: f64,
}

fn main() {
    let v = Vec2 { x: 1.0, y: 2.0 };
    println!("v = {v:#?}");
}
Cool bear

Wait, #[derive(Debug)]? I don't think we've quite reached that part of the curriculum yet... in fact I don't see it in there at all.

Oh it's just a macro that can implement a trait for you, in this case it expands to something like this:

use std::fmt;

impl fmt::Debug for Vec2 {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("Vec2")
            .field("x", &self.x)
            .field("y", &self.y)
            .finish()
    }
}
Cool bear

Well well well look who's teaching who now?

No it's types I'm struggling with, the rest is easy peasy limey squeezy.

But not structs, structs are easy, see, my program runs:

$ cargo run --quiet
v = Vec2 {
    x: 1.0,
    y: 2.0,
}

Cool bear

Okay, now make a function that adds two Vec2!

Alright!

#[derive(Debug)]
struct Vec2 {
    x: f64,
    y: f64,
}

impl Vec2 {
    fn add(self, other: Vec2) -> Vec2 {
        Vec2 {
            x: self.x + other.x,
            y: self.y + other.y,
        }
    }
}

fn main() {
    let v = Vec2 { x: 1.0, y: 2.0 };
    let w = Vec2 { x: 9.0, y: 18.0 };
    dbg!(v.add(w));
}
$ cargo run --quiet
[src/main.rs:21] v.add(w) = Vec2 {
    x: 10.0,
    y: 20.0,
}
Cool bear

Now call add twice!

fn main() {
    let v = Vec2 { x: 1.0, y: 2.0 };
    let w = Vec2 { x: 9.0, y: 18.0 };
    dbg!(v.add(w));
    dbg!(v.add(w));
}
$ cargo run --quiet
error[E0382]: use of moved value: `v`
  --> src/main.rs:22:10
   |
19 |     let v = Vec2 { x: 1.0, y: 2.0 };
   |         - move occurs because `v` has type `Vec2`, which does not implement the `Copy` trait
20 |     let w = Vec2 { x: 9.0, y: 18.0 };
21 |     dbg!(v.add(w));
   |            ------ `v` moved due to this method call
22 |     dbg!(v.add(w));
   |          ^ value used here after move
   |
note: this function takes ownership of the receiver `self`, which moves `v`
  --> src/main.rs:10:12
   |
10 |     fn add(self, other: Vec2) -> Vec2 {
   |            ^^^^

error[E0382]: use of moved value: `w`
  --> src/main.rs:22:16
   |
20 |     let w = Vec2 { x: 9.0, y: 18.0 };
   |         - move occurs because `w` has type `Vec2`, which does not implement the `Copy` trait
21 |     dbg!(v.add(w));
   |                - value moved here
22 |     dbg!(v.add(w));
   |                ^ value used here after move

For more information about this error, try `rustc --explain E0382`.
error: could not compile `grr` due to 2 previous errors

Erm, doesn't work.

Cool bear

Do you know why?

I mean it says stuff? Something something Vec2 does not implement Copy, yet more traits, okay, so it gets "moved".

Wait we can probably work around this with Clone!

//               👇
#[derive(Debug, Clone)]
struct Vec2 {
    x: f64,
    y: f64,
}

fn main() {
    let v = Vec2 { x: 1.0, y: 2.0 };
    let w = Vec2 { x: 9.0, y: 18.0 };
    dbg!(v.clone().add(w.clone()));
    dbg!(v.add(w));
}

Okay it works again!

Cool bear

What if you don't want to call .clone()?

Then I guess... Copy?

#[derive(Debug, Clone, Copy)]
struct Vec2 {
    x: f64,
    y: f64,
}

fn main() {
    let v = Vec2 { x: 1.0, y: 2.0 };
    let w = Vec2 { x: 9.0, y: 18.0 };
    dbg!(v.add(w));
    dbg!(v.add(w));
}
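
(A third option, not taken in this exchange, is to borrow the operands instead of moving them. This is only a sketch: the `add` signature is changed to take references, and `PartialEq` is derived purely so the result can be compared.)

```rust
#[derive(Debug, PartialEq)]
struct Vec2 {
    x: f64,
    y: f64,
}

impl Vec2 {
    // Borrow both operands instead of taking ownership of them.
    fn add(&self, other: &Vec2) -> Vec2 {
        Vec2 {
            x: self.x + other.x,
            y: self.y + other.y,
        }
    }
}

fn main() {
    let v = Vec2 { x: 1.0, y: 2.0 };
    let w = Vec2 { x: 9.0, y: 18.0 };
    // Neither `v` nor `w` is moved, so calling `add` twice is fine,
    // with no `Clone` or `Copy` involved.
    dbg!(v.add(&w));
    dbg!(v.add(&w));
}
```
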
Cool bear

Very good! Now forget about all that code, and tell me what's the type of "hello world"?

Ah, I'll just re-use the type_name_of function you gave me... one sec...

fn main() {
    dbg!(type_name_of("hello world"));
}

fn type_name_of<T>(_: T) -> &'static str {
    std::any::type_name::<T>()
}
$ cargo run --quiet
[src/main.rs:2] type_name_of("hello world") = "&str"

There it is! It's &str!

Cool bear

Alright! Now store it in a struct!

Sure, easy enough:

#[derive(Debug)]
struct Message {
    text: &str,
}

fn main() {
    let msg = Message {
        text: "hello world",
    };
    dbg!(msg);
}
$ cargo run --quiet
error[E0106]: missing lifetime specifier
 --> src/main.rs:3:11
  |
3 |     text: &str,
  |           ^ expected named lifetime parameter
  |
help: consider introducing a named lifetime parameter
  |
2 ~ struct Message<'a> {
3 ~     text: &'a str,
  |

For more information about this error, try `rustc --explain E0106`.
error: could not compile `grr` due to previous error

Oh. Not easy enough.

Cool bear

The compiler is showing you the way — heed its advice!

Okay, sure:

#[derive(Debug)]
//             👇
struct Message<'a> {
//        👇
    text: &'a str,
}
$ cargo run --quiet
[src/main.rs:12] msg = Message {
    text: "hello world",
}
Cool bear

Okay, now read the file src/main.rs as a string, and store a reference to it in a Message.

Fine, fine, so, reading files... std::fs perhaps?

fn main() {
    let code = std::fs::read_to_string("src/main.rs").unwrap();
    let msg = Message { text: &code };
    dbg!(msg);
}
$ cargo run --quiet
[src/main.rs:9] msg = Message {
    text: "#[derive(Debug)]\nstruct Message<'a> {\n    text: &'a str,\n}\n\nfn main() {\n    let code = std::fs::read_to_string(\"src/main.rs\").unwrap();\n    let msg = Message { text: &code };\n    dbg!(msg);\n}\n",
}

Okay, I did it! What now?

Cool bear

Now move all the code to construct the Message into a separate function!

Like this?

#[derive(Debug)]
struct Message<'a> {
    text: &'a str,
}

fn main() {
    let msg = get_msg();
    dbg!(msg);
}

fn get_msg() -> Message {
    let code = std::fs::read_to_string("src/main.rs").unwrap();
    Message { text: &code }
}
$ cargo run --quiet
error[E0106]: missing lifetime specifier
  --> src/main.rs:11:17
   |
11 | fn get_msg() -> Message {
   |                 ^^^^^^^ expected named lifetime parameter
   |
   = help: this function's return type contains a borrowed value, but there is no value for it to be borrowed from
help: consider using the `'static` lifetime
   |
11 | fn get_msg() -> Message<'static> {
   |                 ~~~~~~~~~~~~~~~~

For more information about this error, try `rustc --explain E0106`.
error: could not compile `grr` due to previous error

Erm, not happy.

Cool bear

Okay, that's lifetime stuff. We're not there yet. What's the only thing you use the Message for?

Passing it to the dbg! macro?

Cool bear

And what does that use?

Probably the Debug trait?

Cool bear

So what can we change the return type to?

Ohhhh impl Debug! To let the compiler figure it out!

fn get_msg() -> impl std::fmt::Debug {
    let code = std::fs::read_to_string("src/main.rs").unwrap();
    Message { text: &code }
}
$ cargo run --quiet
error[E0597]: `code` does not live long enough
  --> src/main.rs:13:21
   |
11 | fn get_msg() -> impl std::fmt::Debug {
   |                 -------------------- opaque type requires that `code` is borrowed for `'static`
12 |     let code = std::fs::read_to_string("src/main.rs").unwrap();
13 |     Message { text: &code }
   |                     ^^^^^ borrowed value does not live long enough
14 | }
   | - `code` dropped here while still borrowed
   |
help: you can add a bound to the opaque type to make it last less than `'static` and match `'static`
   |
11 | fn get_msg() -> impl std::fmt::Debug + 'static {
   |                                      +++++++++

For more information about this error, try `rustc --explain E0597`.
error: could not compile `grr` due to previous error

Huh. That seems like... a lifetime problem? I thought we weren't at lifetimes yet.

Cool bear

We are now 😎

Lifetimes and ownership

Look this is all moving a little fast for me, I'd just like to-

Cool bear

You can go back and read the transcript later! For now, what's the type returned by std::fs::read_to_string?

Uhhh it's-

Cool bear

Don't go look at the definition. No time. Just do this:

fn get_msg() -> impl std::fmt::Debug {
    //        👇
    let code: () = std::fs::read_to_string("src/main.rs").unwrap();
    Message { text: &code }
}
$ cargo run --quiet
error[E0308]: mismatched types
  --> src/main.rs:12:20
   |
12 |     let code: () = std::fs::read_to_string("src/main.rs").unwrap();
   |               --   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected `()`, found struct `String`
   |               |
   |               expected due to this

rust-analyzer was showing me the type as an inlay, you know...

Cool bear

Oh, you installed it! Good. Anyway, it's String. Try storing that inside the struct.

Okay. I guess we won't need that 'a anymore...

#[derive(Debug)]
struct Message {
    //      👇
    text: String,
}

fn main() {
    let msg = get_msg();
    dbg!(msg);
}

fn get_msg() -> impl std::fmt::Debug {
    let code = std::fs::read_to_string("src/main.rs").unwrap();
    //               👇 (the `&` is gone)
    Message { text: code }
}
Cool bear

Okay, why does this work when the other one didn't?

Because uhhhh, the &str was a... reference?

Cool bear

Yes, and?

And that means it borrowed from something? In this case the result of std::fs::read_to_string?

Cool bear

Yes, and??

And that meant we could not return that reference, because code dropped (which means it got freed) at the end of the function, and so the reference would be dangling?

Cool bear

Veeeery goooood! And it works as a String because?

Well, I guess it doesn't borrow? Like, the result of read_to_string is moved into Message, and so we take ownership of it, and we can move it anywhere we please?

Cool bear

Exactly! Suspiciously exact, even. Are you sure this is your first time?

👼
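
(For completeness, the borrowed version can be made to work too, as long as whatever owns the string outlives the call. A minimal sketch, assuming the caller holds the `String`; the `get_msg(code: &str)` signature is a variation for illustration, not the one used above, and the string literal stands in for the file contents.)

```rust
#[derive(Debug)]
struct Message<'a> {
    text: &'a str,
}

// The returned `Message` borrows from the `code` argument, which the
// caller owns, so the reference stays valid after this function returns.
fn get_msg(code: &str) -> Message<'_> {
    Message { text: code }
}

fn main() {
    // Stand-in for the result of `std::fs::read_to_string`.
    let code = String::from("fn main() {}");
    let msg = get_msg(&code);
    dbg!(&msg);
    // `code` is dropped at the end of `main`, after `msg` is done with it.
}
```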

Cool bear

Very well, boss baby, do you know of other types that let you own a string?

Ah, there's a couple! Box<str> will work, for example:

#[derive(Debug)]
struct Message {
    //     👇
    text: Box<str>,
}

fn main() {
    let msg = get_msg();
    dbg!(msg);
}

fn get_msg() -> impl std::fmt::Debug {
    let code = std::fs::read_to_string("src/main.rs").unwrap();
    //               👇
    Message { text: code.into() }
}

And that one has exclusive ownership. Whereas something like Arc<str> will, well, it'll also work:

use std::sync::Arc;

#[derive(Debug)]
struct Message {
    text: Arc<str>,
}

But that one's shared ownership. You can hand out clones of it and so multiple structs can point to the same thing:

use std::sync::Arc;

#[derive(Debug)]
struct Message {
    text: Arc<str>,
}

fn main() {
    let a = get_msg();
    let b = Message {
        text: a.text.clone(),
    };
    let c = Message {
        text: b.text.clone(),
    };
    dbg!(a.text.as_ptr(), b.text.as_ptr(), c.text.as_ptr());
}

fn get_msg() -> Message {
    let code = std::fs::read_to_string("src/main.rs").unwrap();
    Message { text: code.into() }
}
$ cargo run --quiet
[src/main.rs:16] a.text.as_ptr() = 0x0000555f4e9d8d80
[src/main.rs:16] b.text.as_ptr() = 0x0000555f4e9d8d80
[src/main.rs:16] c.text.as_ptr() = 0x0000555f4e9d8d80

But you can't modify it.

Cool bear

Well, it's pretty awkward to mutate a &mut str to begin with!

Yeah. It's easier to show that with a &mut [u8].
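
Something like this, as a rough sketch: `Arc::get_mut` only hands out a mutable reference while the `Arc` is the sole owner. The `try_set_first` helper is made up for the occasion.

```rust
use std::sync::Arc;

// Hypothetical helper: try to overwrite the first byte. This only
// succeeds while nobody else shares the data (and the slice is non-empty).
fn try_set_first(bytes: &mut Arc<[u8]>, value: u8) -> bool {
    match Arc::get_mut(bytes).and_then(|slice| slice.first_mut()) {
        Some(first) => {
            *first = value;
            true
        }
        None => false,
    }
}

fn main() {
    let mut a: Arc<[u8]> = Arc::from(vec![1_u8, 2, 3]);
    // Only one owner so far: mutation goes through.
    dbg!(try_set_first(&mut a, 42));

    let b = Arc::clone(&a);
    // Now `a` and `b` share the bytes: mutation is refused.
    dbg!(try_set_first(&mut a, 7));
    dbg!(Arc::strong_count(&a), &b[..]);
}
```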

Cool bear

Oh you're the professor now huh?

Sure! Watch me make a table:

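|                             | Text (UTF-8) | Bytes     |
| --------------------------- | ------------ | --------- |
| Immutable reference / slice | &str         | &[u8]     |
| Owned, can grow             | String       | Vec<u8>   |
| Owned, fixed len            | Box<str>     | Box<[u8]> |
| Shared ownership (atomic)   | Arc<str>     | Arc<[u8]> |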
Cool bear

Now where... where did you find that? You're not even telling people about Rc!

Eh, by the time they're worried about the cost of atomic reference counting, they can do their own research. And then they'll have a nice surprise: free performance!

There is one thing that's a bit odd, though. In the table above, we have an equivalence between str and [u8]. What are those types?

Cool bear

Ah! Those. Well...

Slices and arrays

Cool bear

Try printing the size of the str and [u8] types!

Okay sure!

use std::mem::size_of;

fn main() {
    dbg!(size_of::<str>());
    dbg!(size_of::<[u8]>());
}

Wait, no, we can't:

$ cargo run --quiet
error[E0277]: the size for values of type `str` cannot be known at compilation time
   --> src/main.rs:4:20
    |
4   |     dbg!(size_of::<str>());
    |                    ^^^ doesn't have a size known at compile-time
    |
    = help: the trait `Sized` is not implemented for `str`
note: required by a bound in `std::mem::size_of`
   --> /home/amos/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/mem/mod.rs:304:22
    |
304 | pub const fn size_of<T>() -> usize {
    |                      ^ required by this bound in `std::mem::size_of`

error[E0277]: the size for values of type `[u8]` cannot be known at compilation time
   --> src/main.rs:5:20
    |
5   |     dbg!(size_of::<[u8]>());
    |                    ^^^^ doesn't have a size known at compile-time
    |
    = help: the trait `Sized` is not implemented for `[u8]`
note: required by a bound in `std::mem::size_of`
   --> /home/amos/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/mem/mod.rs:304:22
    |
304 | pub const fn size_of<T>() -> usize {
    |                      ^ required by this bound in `std::mem::size_of`

For more information about this error, try `rustc --explain E0277`.
error: could not compile `grr` due to 2 previous errors
Cool bear

Correct! What about the size of &str and &[u8]?

use std::mem::size_of;

fn main() {
    dbg!(size_of::<&str>());
    dbg!(size_of::<&[u8]>());
}
$ cargo run --quiet
[src/main.rs:4] size_of::<&str>() = 16
[src/main.rs:5] size_of::<&[u8]>() = 16

Ah, those we can! 16 bytes, that's... 2x8 bytes... two pointers!

Cool bear

Yes! Start and length.

Okay, so those are always references because... nothing else makes sense? Like, we don't know the size of the thing we're borrowing a slice of?

Cool bear

Yes! And the thing we're borrowing from can be... a lot of different things. Let's take &[u8] — what types can you borrow a &[u8] out of?

Well... the heading says "arrays" so I'm gonna assume it works for arrays:

use std::mem::size_of_val;

fn main() {
    let arr = [1, 2, 3, 4, 5];
    let slice = &arr[1..4];
    dbg!(size_of_val(&arr));
    dbg!(size_of_val(&slice));
    print_byte_slice(slice);
}

fn print_byte_slice(slice: &[u8]) {
    println!("{slice:?}");
}
$ cargo run --quiet
[src/main.rs:6] size_of_val(&arr) = 5
[src/main.rs:7] size_of_val(&slice) = 16
[2, 3, 4]

Okay, yes.
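
(The "start and length" pair can be poked at directly. A minimal sketch, with a made-up `start_and_len` helper, that pulls both halves out of a `&[u8]`.)

```rust
// Hypothetical helper: split a byte slice into the two things the
// reference carries, where it starts and how many elements it covers.
fn start_and_len(slice: &[u8]) -> (*const u8, usize) {
    (slice.as_ptr(), slice.len())
}

fn main() {
    let arr: [u8; 5] = [1, 2, 3, 4, 5];
    let (arr_start, arr_len) = start_and_len(&arr);
    let (sub_start, sub_len) = start_and_len(&arr[1..4]);

    println!("array:     start = {arr_start:p}, len = {arr_len}");
    println!("sub-slice: start = {sub_start:p}, len = {sub_len}");
    // The sub-slice begins one element (one byte, for `u8`) into the array.
    println!("offset = {}", sub_start as usize - arr_start as usize);
}
```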

Cool bear

What else?

I guess, anything we had in that table under "bytes"?

It should definitely work for Vec<u8>:

use std::mem::size_of_val;

fn main() {
    let vec = vec![1, 2, 3, 4, 5];
    let slice = &vec[1..4];
    dbg!(size_of_val(&vec));
    dbg!(size_of_val(&slice));
    print_byte_slice(slice);
}

fn print_byte_slice(slice: &[u8]) {
    println!("{slice:?}");
}
$ cargo run --quiet
[src/main.rs:6] size_of_val(&vec) = 24
[src/main.rs:7] size_of_val(&slice) = 16
[2, 3, 4]

Wait, 24 bytes?

Cool bear

Yeah! Start, length, capacity. Not necessarily in that order. Rust doesn't guarantee a particular type layout anyway, so you shouldn't rely on it.
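
(Those three pieces are all reachable through the public API, so there is no need to peek at the layout. A small sketch; the `len_and_capacity` helper exists only for this example.)

```rust
// Hypothetical helper: the two numbers a `Vec` stores next to its
// start pointer.
fn len_and_capacity(vec: &Vec<u8>) -> (usize, usize) {
    (vec.len(), vec.capacity())
}

fn main() {
    // Ask for room for at least 8 bytes, then only use 5 of them.
    let mut vec: Vec<u8> = Vec::with_capacity(8);
    vec.extend_from_slice(&[1, 2, 3, 4, 5]);

    let (len, capacity) = len_and_capacity(&vec);
    println!("start    = {:p}", vec.as_ptr());
    println!("length   = {len}");
    println!("capacity = {capacity}");

    // A slice borrowed from the vector only needs a start and a length.
    let slice: &[u8] = &vec[1..4];
    println!("slice: start = {:p}, length = {}", slice.as_ptr(), slice.len());
}
```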

Next up is Box<[u8]>:

use std::mem::size_of_val;

fn main() {
    let bbox: Box<[u8]> = Box::new([1, 2, 3, 4, 5]);
    let slice = &bbox[1..4];
    dbg!(size_of_val(&bbox));
    dbg!(size_of_val(&slice));
    print_byte_slice(slice);
}

fn print_byte_slice(slice: &[u8]) {
    println!("{slice:?}");
}
$ cargo run --quiet
[src/main.rs:6] size_of_val(&bbox) = 16
[src/main.rs:7] size_of_val(&slice) = 16
[2, 3, 4]

Ha, 2x8 bytes each. I suppose... a Box<[u8]> is exactly like a &[u8] except... it has ownership of the data it points to? So we can move it and stuff? And dropping it frees the data?

Cool bear

Yup! And you forgot one: slices of slices.

use std::mem::size_of_val;

fn main() {
    let arr = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
    let slice = &arr[2..7];
    let slice_of_slice = &slice[2..];
    dbg!(size_of_val(&slice_of_slice));
    print_byte_slice(slice_of_slice);
}

fn print_byte_slice(slice: &[u8]) {
    println!("{slice:?}");
}
$ cargo run --quiet
[src/main.rs:7] size_of_val(&slice_of_slice) = 16
[5, 6, 7]

Very cool.

So wait, just to back up — arrays are [T; n], and slices are &[T]. We know the size of arrays because we know how many elements they have, and we know the size of &[T] because it's just start + length.

But we don't know the size of [T] because...

Cool bear

Because the slice could borrow from anything! As we've seen: [u8; n], Vec<u8>, Box<[u8]>, Arc<[u8]>, another slice...

Ah. So we don't know its size.

Wait wait wait.

That makes [T] a dynamically-sized type? Just like trait objects?

Cool bear

Yes, it is a DST.

And we can just do Box<[T]>?

Cool bear

Sure! That's just an owning pointer.

Ooooh that gives me an idea.

Boxed trait objects

So! Deep breaths. If I followed correctly, that means that, although we don't know the size of dyn Display, we know the size of Box<dyn Display> — it should be the same size as &dyn Display, it just has ownership of its... of the thing it points to.

Cool bear

Its pointee, yeah. Also, same with Arc<dyn Display>, or any other smart pointer.

Okay let me check it real quick:

use std::{fmt::Display, mem::size_of, rc::Rc, sync::Arc};

fn main() {
    dbg!(size_of::<&dyn Display>());
    dbg!(size_of::<Box<dyn Display>>());
    dbg!(size_of::<Arc<dyn Display>>());
    dbg!(size_of::<Rc<dyn Display>>());
}
$ cargo run --quiet
[src/main.rs:4] size_of::<&dyn Display>() = 16
[src/main.rs:5] size_of::<Box<dyn Display>>() = 16
[src/main.rs:6] size_of::<Arc<dyn Display>>() = 16
[src/main.rs:7] size_of::<Rc<dyn Display>>() = 16

Okay, okay! They're all the same size, the size of a p-.. of two pointers? What?

Cool bear

Yeah! Data and vtable. You remember how you couldn't do anything with the values in your first generic function?

That one?

fn show<T>(a: T) {
    todo!()
}
Cool bear

The very same. Well there's two ways to solve this. Either you add a trait bound, like so:

fn show<T: std::fmt::Display>(a: T) {
    // blah
}
Cool bear

And then a different version of show gets generated for every type you call it with.

Oooh, right! That's uhh... it's called... discombobulation?

Cool bear

Monomorphization. show is "polymorphic" because it can take multiple forms, and it gets replaced with many "monomorphic" versions of itself, that each handle a certain combination of types.

Okay, so that's one way. And the other way?

Cool bear

You take a reference to a trait object: &dyn Trait.

And that helps how?

Cool bear

Well, it points to the value itself, and a list of all functions required by the trait. And only those.
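
(Side by side, the two approaches look like this. It is a sketch only: the function names are invented, and both return a `String` instead of printing so the results are easy to compare.)

```rust
use std::fmt::Display;

// First way: a trait bound. A separate copy of this function is
// generated for every concrete type it is called with.
fn show_generic<T: Display>(value: T) -> String {
    format!("{value}")
}

// Second way: a reference to a trait object. There is a single copy,
// which reaches the right `fmt` through the trait object at runtime.
fn show_dyn(value: &dyn Display) -> String {
    format!("{value}")
}

fn main() {
    println!("{}", show_generic('C'));
    println!("{}", show_generic(64_i64));
    println!("{}", show_dyn(&'C'));
    println!("{}", show_dyn(&64_i64));
}
```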

Oh. Oh! And that's the vtable? It's just "the concrete type's implementation of every function listed in the trait definition"?

Cool bear

Yes. But can you define "concrete type" for me?

Well... let's take this:

use std::fmt::Display;

fn main() {
    let x: u64 = 42;
    show(x);
}

fn show<D: Display>(d: D) {
    println!("{}", d);
}

In that case, I'd call D the type parameter (or generic type?), and u64 the concrete type.

Cool bear

Okay, I was just making sure. You were about to have an epiphany?

I was? Oh, right!

$ cargo run --quiet
[src/main.rs:4] size_of::<&dyn Display>() = 16
[src/main.rs:5] size_of::<Box<dyn Display>>() = 16
[src/main.rs:6] size_of::<Arc<dyn Display>>() = 16
[src/main.rs:7] size_of::<Rc<dyn Display>>() = 16

So these all have the same size.

And the last time we tried returning a dyn Display we ran into trouble because, well, it's dynamically-sized:

use std::fmt::Display;

fn main() {
    let x = get_display();
    show(x);
}

fn get_display() -> dyn Display {
    let x: u64 = 42;
    x
}

fn show<D: Display>(d: D) {
    println!("{}", d);
}
$ cargo run --quiet
error[E0746]: return type cannot have an unboxed trait object
 --> src/main.rs:3:21
  |
3 | fn get_display() -> dyn Display {
  |                     ^^^^^^^^^^^ doesn't have a size known at compile-time
  |
  = note: for information on `impl Trait`, see <https://doc.rust-lang.org/book/ch10-02-traits.html#returning-types-that-implement-traits>
help: use `impl Display` as the return type, as all return paths are of type `u64`, which implements `Display`
  |
3 | fn get_display() -> impl Display {
  |                     ~~~~~~~~~~~~

(other errors omitted)

But -> impl Display worked, as the compiler suggests:

fn get_display() -> impl Display {
    let x: u64 = 42;
    x
}

Because it's sorta like this:

fn get_display<D: Display>() -> D {
    let x: u64 = 42;
    x
}
Cool bear

Nooooooo no no no. Verboten. Can't do that!

Yeah, you told me! You didn't explain why, though.

Cool bear

Because, and read this very carefully:

When a generic function is called, it must be possible to infer all its type parameters from its inputs alone.

Ah, erm. Wait so it would work if D was also somewhere in the type of a parameter?

Cool bear

Yeah! Consider this:

fn main() {
    dbg!(add_10(5));
}

fn add_10<N>(n: N) -> N {
    n + 10
}

Wait, that doesn't compile!

$ cargo run --quiet
error[E0369]: cannot add `{integer}` to `N`
 --> src/main.rs:6:7
  |
6 |     n + 10
  |     - ^ -- {integer}
  |     |
  |     N
Cool bear

No. But you also truncated the compiler's output.

Here's the rest of it.

help: consider restricting type parameter `N`
  |
5 | fn add_10<N: std::ops::Add<Output = {integer}>>(n: N) -> N {
  |            +++++++++++++++++++++++++++++++++++
Cool bear

It's not the same issue. The problem here is that N could be anything. Including types that we cannot add 10 to.

Here's a working version:

fn main() {
    dbg!(add_10(1_u8));
    dbg!(add_10(2_u16));
    dbg!(add_10(3_u32));
    dbg!(add_10(4_u64));
}

fn add_10<N>(n: N) -> N
where
    N: From<u8> + std::ops::Add<Output = N>,
{
    n + 10.into()
}

Yeesh that's... gnarly.

Cool bear

Yeah. It's also a super contrived example.
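
(If the `10.into()` part looks like magic, here is the same function as a sketch with the conversion spelled out: the literal is a `u8`, and `N: From<u8>` is what allows turning it into an `N`. The extra calls with `f64` and `i32` are additions for illustration.)

```rust
fn add_10<N>(n: N) -> N
where
    N: From<u8> + std::ops::Add<Output = N>,
{
    // Same as `n + 10.into()`, with the source type made explicit.
    n + N::from(10_u8)
}

fn main() {
    dbg!(add_10(1_u8));
    // Any type that can be built from a `u8` and added to itself works,
    // including floats and signed integers wider than 8 bits.
    dbg!(add_10(1.5_f64));
    dbg!(add_10(-20_i32));
}
```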

But okay, I get it: impl Trait in return position is the only way to have something about the function signature that's inferred from... its body.

Cool bear

Yes! Which is why both these get_ functions work:

use std::fmt::Display;

fn main() {
    show(get_char());
    show(get_int());
}

fn get_char() -> impl Display {
    'C'
}

fn get_int() -> impl Display {
    64
}

fn show(v: impl Display) {
    println!("{v}");
}

Right, it infers the return type of get_char to be char, and the ret-

Cool bear

Not quite. Well, yes. But it returns an opaque type. The caller doesn't know it's actually a char. All it knows is that it implements Display.

I see.

Cool bear

Still, by itself, it can't unify char and i32, for example. Those are two distinct types.

I wonder what type_name thinks of these...

use std::fmt::Display;

fn main() {
    let c = get_char();
    dbg!(type_name_of(&c));
    let i = get_int();
    dbg!(type_name_of(&i));
}

fn get_char() -> impl Display {
    'C'
}

fn get_int() -> impl Display {
    64
}

fn type_name_of<T>(_: T) -> &'static str {
    std::any::type_name::<T>()
}
$ cargo run --quiet
[src/main.rs:5] type_name_of(&c) = "&char"
[src/main.rs:7] type_name_of(&i) = "&i32"

Hahahaha. Not so opaque after all.

Cool bear

That's uhh.. didn't expect type_name to do that, to be honest.

But they are opaque, I promise. You can call char methods on a real char, but not on the return type of get_char:

use std::fmt::Display;

fn main() {
    let real_c = 'a';
    dbg!(real_c.to_ascii_uppercase());

    let opaque_c = get_char();
    dbg!(opaque_c.to_ascii_uppercase());
}

fn get_char() -> impl Display {
    'C'
}
$ cargo run --quiet
error[E0599]: no method named `to_ascii_uppercase` found for opaque type `impl std::fmt::Display` in the current scope
 --> src/main.rs:8:19
  |
8 |     dbg!(opaque_c.to_ascii_uppercase());
  |                   ^^^^^^^^^^^^^^^^^^ method not found in `impl std::fmt::Display`

For more information about this error, try `rustc --explain E0599`.
error: could not compile `grr` due to previous error
Cool bear

Also, I'm fairly sure type_id will give us different values...

use std::{any::TypeId, fmt::Display};

fn main() {
    let opaque_c = get_char();
    dbg!(type_id_of(opaque_c));

    let real_c = 'a';
    dbg!(type_id_of(real_c));
}

fn get_char() -> impl Display {
    'C'
}

fn type_id_of<T: 'static>(_: T) -> TypeId {
    TypeId::of::<T>()
}
$ cargo run --quiet
[src/main.rs:5] type_id_of(opaque_c) = TypeId {
    t: 15782864888164328018,
}
[src/main.rs:8] type_id_of(real_c) = TypeId {
    t: 15782864888164328018,
}
Cool bear

Ah, huh. I guess not.

Yeah it seems like opaque types are a type-checker trick and it is the concrete type at runtime. The checker will just have prevented us from calling anything that wasn't in the trait.
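
(One more way to see it, as a sketch: the opaque values take up exactly as much room as the concrete types behind them, and anything reachable through `Display`, like `to_string()`, still works.)

```rust
use std::fmt::Display;
use std::mem::{size_of, size_of_val};

fn get_char() -> impl Display {
    'C'
}

fn get_int() -> impl Display {
    64
}

fn main() {
    let c = get_char();
    let i = get_int();
    // At runtime these are laid out like the concrete types behind them.
    assert_eq!(size_of_val(&c), size_of::<char>());
    assert_eq!(size_of_val(&i), size_of::<i32>());
    // Only what `Display` provides is callable here. `to_string` comes
    // with it, and the resulting `String` has all its usual methods.
    println!("{c} {i}");
    println!("{}", c.to_string().to_ascii_lowercase());
}
```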

Actually, now I understand better why this cannot work:

use std::fmt::Display;

fn main() {
    show(get_char_or_int(true));
    show(get_char_or_int(false));
}

fn get_char_or_int(give_char: bool) -> impl Display {
    if give_char {
        'C'
    } else {
        64
    }
}

fn show(v: impl Display) {
    println!("{v}");
}
$ cargo run --quiet
error[E0308]: `if` and `else` have incompatible types
  --> src/main.rs:12:9
   |
9  | /     if give_char {
10 | |         'C'
   | |         --- expected because of this
11 | |     } else {
12 | |         64
   | |         ^^ expected `char`, found integer
13 | |     }
   | |_____- `if` and `else` have incompatible types

For more information about this error, try `rustc --explain E0308`.
error: could not compile `grr` due to previous error

It's because the return type cannot be simultaneously char and, say, i32.

Cool bear

Yes, and also: it's because there's no vtable involved. Remember the enum version you did?

Yeah! That one:

use delegate::delegate;
use std::fmt::Display;

fn main() {
    show(get_char_or_int(true));
    show(get_char_or_int(false));
}

impl Either {
    fn display(&self) -> &dyn Display {
        match self {
            Either::Char(c) => c,
            Either::Int(i) => i,
        }
    }
}

impl Display for Either {
    delegate! {
        to self.display() {
            fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result;
        }
    }
}

enum Either {
    Char(char),
    Int(i64),
}

fn get_char_or_int(give_char: bool) -> Either {
    if give_char {
        Either::Char('C')
    } else {
        Either::Int(64)
    }
}

fn show(v: impl Display) {
    println!("{v}");
}
Cool bear

Right! In that one, you're manually dispatching Display::fmt to either the implementation for char or the one for i64.

Well no, delegate is doing it for me.

Cool bear

Well, you did it here:

impl Display for Either {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            Either::Char(c) => c.fmt(f),
            Either::Int(i) => i.fmt(f),
        }
    }
}

Right, yes, I see the idea. So a vtable does the same thing?

Cool bear

Eh, not quite. It's more like function pointers.

Can you show me?

Cool bear

Okay, but real quick then.

use std::{
    fmt::{self, Display},
    mem::transmute,
};

// This is our type that can contain any value that implements `Display`
struct BoxedDisplay {
    // This is a pointer to the actual value, which is on the heap.
    data: *mut (),
    // And this is a reference to the vtable for the `Display` implementation
    // of our value's type.
    vtable: &'static DisplayVtable<()>,
}

// 👆 Note that there are no type parameters at all in the above type. The
// type is _erased_.

// Then we need to declare our vtable type.
// This is a type-safe take on it (thanks @eddyb for the idea), but you may
// have noticed `BoxedDisplay` pretends they're all `DisplayVtable<()>`, which
// is fine because we're only dealing with pointers to `T` / `()`, which all
// have the same size.
#[repr(C)]
struct DisplayVtable<T> {
    // This is the implementation of `Display::fmt` for `T`
    fmt: unsafe fn(*mut T, &mut fmt::Formatter<'_>) -> fmt::Result,

    // We also need to be able to drop a `T`. For that we need to know how large
    // `T` is, and there may be side effects (freeing OS resources, flushing a
    // buffer, etc.) so it needs to go into the vtable too.
    drop: unsafe fn(*mut T),
}

impl<T: Display> DisplayVtable<T> {
    // This lets us build a `DisplayVtable` for any `T` that implements `Display`
    fn new() -> &'static Self {
        // Why yes you can declare functions in that scope. This one just
        // forwards to `T`'s `Display` implementation.
        unsafe fn fmt<T: Display>(this: *mut T, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            (*this).fmt(f)
        }

        // Here we turn a raw pointer (`*mut T`) back into a `Box<T>`, which
        // has ownership of it and thus, knows how to drop (free) it.
        unsafe fn drop<T>(this: *mut T) {
            std::mem::drop(Box::from_raw(this));
        }

        // 👆 These are both regular functions, not closures. They end up in
        // the executable, thus they live for 'static, thus we can return a
        // `&'static Self` as requested.

        &Self { fmt, drop }
    }
}

// Okay, now we can make a constructor for `BoxedDisplay` itself!
impl BoxedDisplay {
    // The `'static` bound makes sure `T` is _owned_ (it can't be a reference
    // shorter than 'static).
    fn new<T: Display + 'static>(t: T) -> Self {
        // Let's do some type erasure!
        Self {
            // Box<T> => *mut T => *mut ()
            data: Box::into_raw(Box::new(t)) as _,

            // &'static DisplayVtable<T> => &'static DisplayVtable<()>
            vtable: unsafe { transmute(DisplayVtable::<T>::new()) },
        }
    }
}

// That one's easy — we dispatch to the right `fmt` function using the vtable.
impl Display for BoxedDisplay {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        unsafe { (self.vtable.fmt)(self.data, f) }
    }
}

// Same here.
impl Drop for BoxedDisplay {
    fn drop(&mut self) {
        unsafe {
            (self.vtable.drop)(self.data);
        }
    }
}

// And finally, we can use it!
fn get_char_or_int(give_char: bool) -> BoxedDisplay {
    if give_char {
        BoxedDisplay::new('C')
    } else {
        BoxedDisplay::new(64)
    }
}

fn show(v: impl Display) {
    println!("{v}");
}

fn main() {
    show(get_char_or_int(true));
    show(get_char_or_int(false));
}
$ cargo run --quiet
C
64

Whoa. Whoa whoa whoa, that could be its own article!

Cool bear

Yes. And yet here we are.

And there's unsafe code in there, how do you know it's okay?

Cool bear

Well, miri is happy about it, so that's a good start:

$ cargo +nightly miri run --quiet
C
64

And do I really need to write code like that?

Cool bear

No you don't! But you can, and the standard library does have code like that, which is awesome, because you don't need to learn a whole other language to drop down and work on it.

Wait, unsafe Rust is not a whole other language?

Cool bear

Touché, smartass.

Cool bear

Anyway you don't need to write all of that yourself because that's exactly what Box<dyn Display> already is.

Oh, word?

use std::fmt::Display;

fn get_char_or_int(give_char: bool) -> Box<dyn Display> {
    if give_char {
        Box::new('C')
    } else {
        Box::new(64)
    }
}

fn show(v: impl Display) {
    println!("{v}");
}

fn main() {
    show(get_char_or_int(true));
    show(get_char_or_int(false));
}
$ cargo run --quiet
C
64

Neat! Super neat.

Cool bear

Really the "magic" happens in the trait object itself. Here it's boxed, but it may as well be arc'd:

fn get_char_or_int(give_char: bool) -> Arc<dyn Display> {
    if give_char {
        Arc::new('C')
    } else {
        Arc::new(64)
    }
}
Cool bear

And that would work just as well. Or, again, just a reference:

fn get_char_or_int(give_char: bool) -> &'static dyn Display {
    if give_char {
        &'C'
    } else {
        &64
    }
}
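A quick way to convince ourselves that all three flavors carry a vtable pointer along is to look at pointer sizes. This is only a sketch: `pointer_sizes` is a made-up helper, and the "exactly two words" result is what current compilers produce rather than a hard language guarantee.

```rust
use std::fmt::Display;
use std::mem::size_of;
use std::sync::Arc;

// Hypothetical helper: returns (thin pointer size, trait object pointer size).
fn pointer_sizes() -> (usize, usize) {
    (size_of::<&char>(), size_of::<&dyn Display>())
}

fn main() {
    let (thin, fat) = pointer_sizes();
    println!("&char        = {thin} bytes");
    println!("&dyn Display = {fat} bytes");

    // Box and Arc follow the same pattern: data pointer + vtable pointer.
    println!("Box<char>         = {} bytes", size_of::<Box<char>>());
    println!("Box<dyn Display>  = {} bytes", size_of::<Box<dyn Display>>());
    println!("Arc<dyn Display>  = {} bytes", size_of::<Arc<dyn Display>>());
}
```

That second word is the same idea as the `vtable` field in the hand-rolled `BoxedDisplay` above.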

Well, that's a comfort. For a second there I really thought I would have to write my own custom vtable implementation every time I want to do something useful.

Cool bear

No, this isn't the 1970s. We have re-usable code now.

Reading type signatures

Ok so... there's a lot of different names for essentially the same thing, like &str and String, and &[u8] and Vec<u8>, etc.

Seems like a bunch of extra work. What's the upside?

Cool bear

Well, sometimes it catches bugs.

Ah!

Cool bear

The big thing there is lifetimes, in the context of concurrent code, but...

Whoa there, I don't think we've-

Cool bear

BUT, immutability is another big one.

Consider this:

function double(arr) {
  for (var i = 0; i < arr.length; i++) {
    arr[i] *= 2;
  }
  return arr;
}

let a = [1, 2, 3];
console.log({ a });
let b = double(a);
console.log({ b });

Ah, easy! This'll print 1, 2, 3 and then 2, 4, 6.

$ node main.js
{ a: [ 1, 2, 3 ] }
{ b: [ 2, 4, 6 ] }

Called it!

Cool bear

Now what if we call it like this?

let a = [1, 2, 3];
console.log({ a });
let b = double(a);
console.log({ a, b });

Ah, then, mh... 1, 2, 3 and then... 1, 2, 3 and 2, 4, 6?

Cool bear

Wrong!

$ node main.js
{ a: [ 1, 2, 3 ] }
{ a: [ 2, 4, 6 ], b: [ 2, 4, 6 ] }

Ohhh! Right I suppose double took the array by reference, and so it mutated it in-place.

Mhhh. I guess we have to think about these things in ECMAScript-land, too.

Cool bear

We very much do! We can "fix" it like this for example:

function double(arr) {
  let result = new Array(arr.length);
  for (var i = 0; i < arr.length; i++) {
    result[i] = arr[i] * 2;
  }
  return result;
}
$ node main.js
{ a: [ 1, 2, 3 ] }
{ a: [ 1, 2, 3 ], b: [ 2, 4, 6 ] }

Wait, wouldn't we rather use a functional style, like so?

function double(arr) {
  return arr.map((x) => x * 2);
}
Cool bear

That works too! It's just 86% slower according to this awful microbenchmark I just made.

Aw, nuts. We have to worry about performance too in ECMAScript-land?

Cool bear

You can if you want to! But let's stay on "correctness".

Let's try porting those functions to Rust.

fn main() {
    let a = vec![1, 2, 3];
    println!("a = {a:?}");
    let b = double(a);
    println!("b = {b:?}");
}

fn double(a: Vec<i32>) -> Vec<i32> {
    a.into_iter().map(|x| x * 2).collect()
}

Let's give it a run...

$ cargo run -q
a = [1, 2, 3]
b = [2, 4, 6]

Yeah that checks out.

Cool bear

So, same question as before: do you think double is messing with a?

I don't think so?

Cool bear

Try printing it!

fn main() {
    let a = vec![1, 2, 3];
    println!("a = {a:?}");
    let b = double(a);
    println!("a = {a:?}");
    println!("b = {b:?}");
}
$ cargo run -q
error[E0382]: borrow of moved value: `a`
 --> src/main.rs:5:20
  |
2 |     let a = vec![1, 2, 3];
  |         - move occurs because `a` has type `Vec<i32>`, which does not implement the `Copy` trait
3 |     println!("a = {a:?}");
4 |     let b = double(a);
  |                    - value moved here
5 |     println!("a = {a:?}");
  |                    ^ value borrowed here after move
  |
  = note: this error originates in the macro `$crate::format_args_nl` (in Nightly builds, run with -Z macro-backtrace for more info)

For more information about this error, try `rustc --explain E0382`.
error: could not compile `grr` due to previous error

Wait, we can't. double takes ownership of a, so there's no a left for us to print.
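As an aside, if we insisted on keeping the by-value signature, one possible way out (sketched here as an assumption, it's not where this conversation is headed) would be to hand `double` a clone and keep the original. It costs an extra allocation and a copy of every element.

```rust
fn double(a: Vec<i32>) -> Vec<i32> {
    a.into_iter().map(|x| x * 2).collect()
}

fn main() {
    let a = vec![1, 2, 3];
    // `double` takes ownership of the clone, not of `a` itself.
    let b = double(a.clone());
    println!("a = {a:?}");
    println!("b = {b:?}");
}
```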

Cool bear

Correct! What about this version?

fn main() {
    let a = vec![1, 2, 3];
    println!("a = {a:?}");
    let b = double(&a);
    println!("a = {a:?}");
    println!("b = {b:?}");
}

fn double(a: &Vec<i32>) -> Vec<i32> {
    a.iter().map(|x| x * 2).collect()
}

That one... mhh that one should work?

Cool bear

It does!

$ cargo run -q
a = [1, 2, 3]
a = [1, 2, 3]
b = [2, 4, 6]
Cool bear

But tell me, do we really need to take a &Vec?

What do you mean?

Cool bear

Well, a Vec<T> is neat because it can grow, and shrink. This is useful when collecting results, for example, and we don't know how many results we'll end up having. We need to be able to push elements onto it, without worrying about running out of space.

I suppose so yeah? Well in our case... I suppose all we do is read from a, so no, we don't really need a &Vec. But what else would we take?

Cool bear

Let's ask clippy!

$ cargo clippy -q
warning: writing `&Vec` instead of `&[_]` involves a new object where a slice will do
 --> src/main.rs:9:14
  |
9 | fn double(a: &Vec<i32>) -> Vec<i32> {
  |              ^^^^^^^^^ help: change this to: `&[i32]`
  |
  = note: `#[warn(clippy::ptr_arg)]` on by default
  = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#ptr_arg

Ohhhh a slice, of course!

fn double(a: &[i32]) -> Vec<i32> {
    a.iter().map(|x| x * 2).collect()
}
Cool bear

And now does this version mess with a?

Oh definitely not. Our a in the main function is a growable Vec, and we pass a read-only slice of it to the function, so all it can do is read.

Cool bear

Correct!

$ cargo run -q
a = [1, 2, 3]
a = [1, 2, 3]
b = [2, 4, 6]
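A side benefit of asking for `&[i32]`, sketched below with made-up inputs: the same function now accepts a whole `Vec`, a fixed-size array, or just part of either, since all of those can be viewed as a slice.

```rust
fn double(a: &[i32]) -> Vec<i32> {
    a.iter().map(|x| x * 2).collect()
}

fn main() {
    let v = vec![1, 2, 3];
    let arr = [10, 20, 30, 40];

    // a whole `Vec`
    println!("{:?}", double(&v));
    // a fixed-size array
    println!("{:?}", double(&arr));
    // only the middle two elements of that array
    println!("{:?}", double(&arr[1..3]));
}
```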
Cool bear

How about this one:

fn double(a: &mut [i32]) -> Vec<i32> {
    a.iter().map(|x| x * 2).collect()
}

Well, seems unnecessary? And.. it doesn't compile:

$ cargo run -q
error[E0308]: mismatched types
 --> src/main.rs:4:20
  |
4 |     let b = double(&a);
  |                    ^^ types differ in mutability
  |
  = note: expected mutable reference `&mut [i32]`
                     found reference `&Vec<{integer}>`

For more information about this error, try `rustc --explain E0308`.
error: could not compile `grr` due to previous error
Cool bear

So? Make it compile!

Alright then:

fn main() {
    //   👇
    let mut a = vec![1, 2, 3];
    println!("a = {a:?}");
    //               👇
    let b = double(&mut a);
    println!("a = {a:?}");
    println!("b = {b:?}");
}

fn double(a: &mut [i32]) -> Vec<i32> {
    a.iter().map(|x| x * 2).collect()
}

There. It prints exactly the same thing.

Cool bear

So this works. But is it good?

Not really no. We're asking for more than what we need.

Cool bear

Indeed! We never mutate the input, so we don't need a mutable slice of it.

But can you show a case where it would get in the way?

Yes I suppose... I suppose if we wanted to double the input in parallel a bunch of times? I mean it's pretty contrived, but.. gimme a second.

$ cargo add crossbeam
(cut)
fn main() {
    let mut a = vec![1, 2, 3];
    println!("a = {a:?}");

    crossbeam::scope(|s| {
        for _ in 0..5 {
            s.spawn(|_| {
                let b = double(&mut a);
                println!("b = {b:?}");
            });
        }
    })
    .unwrap();
}

fn double(a: &mut [i32]) -> Vec<i32> {
    a.iter().map(|x| x * 2).collect()
}

There. That fails because we can't borrow a mutably more than once at a time:

$ cargo run -q
error[E0499]: cannot borrow `a` as mutable more than once at a time
  --> src/main.rs:7:21
   |
5  |       crossbeam::scope(|s| {
   |                         - has type `&crossbeam::thread::Scope<'1>`
6  |           for _ in 0..5 {
7  |               s.spawn(|_| {
   |               -       ^^^ `a` was mutably borrowed here in the previous iteration of the loop
   |  _____________|
   | |
8  | |                 let b = double(&mut a);
   | |                                     - borrows occur due to use of `a` in closure
9  | |                 println!("b = {b:?}");
10 | |             });
   | |______________- argument requires that `a` is borrowed for `'1`

For more information about this error, try `rustc --explain E0499`.
error: could not compile `grr` due to previous error

But it works if we just take an immutable reference:

fn main() {
    let a = vec![1, 2, 3];
    println!("a = {a:?}");

    crossbeam::scope(|s| {
        for _ in 0..5 {
            s.spawn(|_| {
                let b = double(&a);
                println!("b = {b:?}");
            });
        }
    })
    .unwrap();
}

fn double(a: &[i32]) -> Vec<i32> {
    a.iter().map(|x| x * 2).collect()
}
$ cargo run -q
a = [1, 2, 3]
b = [2, 4, 6]
b = [2, 4, 6]
b = [2, 4, 6]
b = [2, 4, 6]
b = [2, 4, 6]
Cool bear

Very good! Look at you! And you used crossbeam because?

Because... something something scoped threads. Forget about that part. You got what you wanted, right?
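For what it's worth, the standard library has had scoped threads of its own since Rust 1.63, so the same experiment can be sketched without an extra crate. `double_in_parallel` is a made-up helper for illustration, not something from the code above.

```rust
use std::thread;

fn double(a: &[i32]) -> Vec<i32> {
    a.iter().map(|x| x * 2).collect()
}

// Hypothetical helper: runs `double` on `n` threads that all share `a`.
fn double_in_parallel(a: &[i32], n: usize) -> Vec<Vec<i32>> {
    thread::scope(|s| {
        let mut handles = Vec::new();
        for _ in 0..n {
            // Every thread only gets a shared (immutable) view of `a`,
            // which is why borrowing it many times at once is fine.
            handles.push(s.spawn(|| double(a)));
        }
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}

fn main() {
    let a = vec![1, 2, 3];
    println!("a = {a:?}");
    for b in double_in_parallel(&a, 5) {
        println!("b = {b:?}");
    }
}
```

The scope guarantees every spawned thread is done before it returns, which is what lets the threads borrow `a` instead of owning it.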

Cool bear

I did! Next question: doesn't this code have the exact same performance issues as our ECMAScript .map()-based function?

Yes and no — we are allocating a new Vec, but it probably has the exact right size to begin with, because Rust iterators have size hints.
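Here's a small sketch of what those size hints look like. `doubled_size_hint` is a made-up helper, and exactly how `collect` uses the hint to pre-allocate is an implementation detail of the standard library, so the capacity is printed rather than promised.

```rust
// Hypothetical helper: asks the iterator chain how many items it expects.
fn doubled_size_hint(a: &[i32]) -> (usize, Option<usize>) {
    a.iter().map(|x| x * 2).size_hint()
}

fn main() {
    let a = [1, 2, 3];

    // (lower bound, optional upper bound)
    println!("size hint = {:?}", doubled_size_hint(&a));

    let b: Vec<i32> = a.iter().map(|x| x * 2).collect();
    println!("len = {}, capacity = {}", b.len(), b.capacity());
}
```

A slice iterator knows its exact length, and `map` doesn't change the number of items, so both bounds agree.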

Cool bear

Ah, mh, okay, but what if we did want to mutate the vec in-place?

Ah, then I suppose we could do this:

fn main() {
    let a = vec![1, 2, 3];
    println!("a = {a:?}");

    let b = double(a);
    println!("b = {b:?}");
}

fn double(a: Vec<i32>) -> Vec<i32> {
    for i in 0..a.len() {
        a[i] *= 2;
    }
    a
}

Wait, no:

$ cargo run -q
error[E0596]: cannot borrow `a` as mutable, as it is not declared as mutable
  --> src/main.rs:11:9
   |
9  | fn double(a: Vec<i32>) -> Vec<i32> {
   |           - help: consider changing this to be mutable: `mut a`
10 |     for i in 0..a.len() {
11 |         a[i] *= 2;
   |         ^ cannot borrow as mutable

For more information about this error, try `rustc --explain E0596`.
error: could not compile `grr` due to previous error

I mean this:

fn double(mut a: Vec<i32>) -> Vec<i32> {
    for i in 0..a.len() {
        a[i] *= 2;
    }
    a
}

Wait, no:

$ cargo clippy -q
warning: the loop variable `i` is only used to index `a`
  --> src/main.rs:10:14
   |
10 |     for i in 0..a.len() {
   |              ^^^^^^^^^^
   |
   = note: `#[warn(clippy::needless_range_loop)]` on by default
   = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_range_loop
help: consider using an iterator
   |
10 |     for <item> in &mut a {
   |         ~~~~~~    ~~~~~~

I mean this:

fn double(mut a: Vec<i32>) -> Vec<i32> {
    for x in a.iter_mut() {
        *x *= 2;
    }
    a
}
Cool bear

Okay, no need to run it, I know what it does. But is it good?

Idk. Seems okay? What's wrong with it?

Cool bear

Well, do you really need to take ownership of the Vec? Do you need a Vec in the first place?

What if you want to do this?

fn main() {
    let mut a = [1, 2, 3];
    println!("a = {a:?}");

    let b = double(a);
    println!("b = {b:?}");
}

Ah yeah, that won't work. Well no I suppose we don't need a Vec... after all, we're doing everything in-place, the array.. vector.. whatever, container, doesn't need to grow or shrink.

So we can take... OH! A mutable slice:

fn main() {
    let mut a = [1, 2, 3];
    println!("a = {a:?}");

    double(&mut a);
    println!("a = {a:?}");
}

fn double(a: &mut [i32]) {
    for x in a.iter_mut() {
        *x *= 2
    }
}
$ cargo run -q
a = [1, 2, 3]
a = [2, 4, 6]

And let's make sure it works with a Vec, too:

fn main() {
    let mut a = vec![1, 2, 3];
    println!("a = {a:?}");

    double(&mut a);
    println!("a = {a:?}");
}
$ cargo run -q
a = [1, 2, 3]
a = [2, 4, 6]

Yes it does!

Cool bear

Okay! It's time... for a quiz.

Here's a method defined on slices:

impl<T> [T] {
    pub const fn first(&self) -> Option<&T> {
        // ...
    }
}
Cool bear

Does it mutate the slice?

No! It takes an immutable reference (&self), so all it does is read.

Cool bear

Correct!

fn main() {
    let a = vec![1, 2, 3];
    dbg!(a.first());
}
$ cargo run -q
[src/main.rs:3] a.first() = Some(
    1,
)
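For contrast, slices also have a `first_mut` method, which takes `&mut self` and hands back an `Option<&mut T>`. A minimal sketch of the difference, with `bump_first` being a made-up helper:

```rust
// Hypothetical helper: adds 10 to the first element, if there is one.
fn bump_first(a: &mut [i32]) {
    // `first_mut` needs `&mut self`, so the signature alone tells us
    // this function is allowed to change the slice.
    if let Some(x) = a.first_mut() {
        *x += 10;
    }
}

fn main() {
    let mut a = vec![1, 2, 3];
    bump_first(&mut a);
    dbg!(a.first());
}
```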
Cool bear

What about this one?

impl<T> [T] {
    pub fn fill(&mut self, value: T)
    where
        T: Clone,
    {
        // ...
    }
}

Oh that one mutates! Given the name, I'd say it fills the whole slice with value, and... it needs to be able to make clones of the value because it might need to repeat it several times.

Cool bear

Right again!

fn main() {
    let mut a = [0u8; 5];
    a.fill(3);
    dbg!(a);
}
$ cargo run -q
[src/main.rs:4] a = [
    3,
    3,
    3,
    3,
    3,
]
Cool bear

What about this one?

impl<T> [T] {
    pub fn iter(&self) -> Iter<'_, T> {
        // ...
    }
}

Ooh that one's a toughie. So no mutation, and it uhhh borrows... through? I mean we've only briefly seen lifetimes, but I'm assuming we can't mutate a thing while we're iterating through it, so like, this:

fn main() {
    let mut a = [1, 2, 3, 4, 5];
    let mut iter = a.iter();
    dbg!(iter.next());
    dbg!(iter.next());
    a[2] = 42;
    dbg!(iter.next());
    dbg!(iter.next());
}

...can't possibly work:

$ cargo run -q
error[E0506]: cannot assign to `a[_]` because it is borrowed
 --> src/main.rs:6:5
  |
3 |     let mut iter = a.iter();
  |                    -------- borrow of `a[_]` occurs here
...
6 |     a[2] = 42;
  |     ^^^^^^^^^ assignment to borrowed `a[_]` occurs here
7 |     dbg!(iter.next());
  |          ----------- borrow later used here

For more information about this error, try `rustc --explain E0506`.
error: could not compile `grr` due to previous error

Yeah! Right again 😎
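The borrow only lasts as long as the iterator is still used, though. Here's a sketch of a variant that should compile, assuming we're fine with finishing the first round of reads before writing (`read_then_write` is a made-up name):

```rust
fn read_then_write() -> (Vec<i32>, Vec<i32>) {
    let mut a = [1, 2, 3, 4, 5];

    let mut iter = a.iter();
    let before = vec![*iter.next().unwrap(), *iter.next().unwrap()];
    // `iter` is never used again, so its borrow of `a` ends here...

    // ...which makes this assignment legal.
    a[2] = 42;

    // A fresh iterator sees the updated value.
    let after: Vec<i32> = a.iter().skip(2).copied().collect();
    (before, after)
}

fn main() {
    let (before, after) = read_then_write();
    println!("before = {before:?}");
    println!("after = {after:?}");
}
```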

Cool bear

Alrighty! Moving on.

Closures

Cool bear

So, remember this code?

fn double(a: &[i32]) -> Vec<i32> {
    a.iter().map(|x| x * 2).collect()
}
Cool bear

That's a closure.

That's a... which part, the pipe-looking thing? |x| x * 2?

Cool bear

Yes. It's like a function.

Wait, no, a function is like this:

fn main() {
    let a = [1, 2, 3];
    let b = double(&a);
    dbg!(b);
}

// 👇 this
fn times_two(x: &i32) -> i32 {
    x * 2
}

fn double(a: &[i32]) -> Vec<i32> {
    // which we then 👇 use here
    a.iter().map(times_two).collect()
}
$ cargo run -q
[src/main.rs:4] b = [
    2,
    4,
    6,
]
Cool bear

Yeah. It does the same thing.

Oh, now that you mention it yes, yes it does do the same thing.

Cool bear

Except a closure can close over its environment.

I see. No, wait. I don't. I don't see at all. Its environment? As in the birds and the trees and th-

Cool bear

Kinda, except it's more like... bindings. Look:

fn double(a: &[i32]) -> Vec<i32> {
    let factor = 2;
    a.iter().map(|x| x * factor).collect()
}

Ohhh. Well that's a constant, it doesn't really count.

Cool bear

Fineeee, here:

fn main() {
    let a = [1, 2, 3];
    let b = mul(&a, 10);
    dbg!(b);
}

fn mul(a: &[i32], factor: i32) -> Vec<i32> {
    a.iter().map(|x| x * factor).collect()
}

Okay, okay, I see. So factor is definitely not a constant there (if we don't count constant folding), and it's... captured?

Cool bear

Closed over, yes.

...closed over by the closure. I'm gonna say "captured". Seems less obscure.

Cool bear

Sure, fine.
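Capturing gets more visible when the closure outlives the function that created it. In this sketch, `make_multiplier` is a made-up helper, and the `move` keyword (not used so far in this section) tells the closure to take its own copy of `factor` instead of borrowing it.

```rust
// Hypothetical helper: returns a closure that remembers `factor`.
fn make_multiplier(factor: i32) -> impl Fn(i32) -> i32 {
    // Without `move`, the closure would borrow `factor`, which is gone
    // as soon as this function returns.
    move |x| x * factor
}

fn main() {
    let times_ten = make_multiplier(10);
    dbg!(times_ten(3));

    let a = [1, 2, 3];
    let b: Vec<i32> = a.iter().map(|x| times_ten(*x)).collect();
    dbg!(b);
}
```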

Wait wait wait this is boxed trait objects all over again, right? Sort of? Because closures are actually fat pointers? One pointer to the function itself, and one for the, uh, "environment". I mean, for everything captured by the closure.

Cool bear

Kinda, yes! But aren't we getting ahead of ourselv-

No no no, not at all, it doesn't matter that there might be a lot of new words, or that the underlying concepts aren't crystal clear to everyone reading this yet.

What matters is that we can proceed by analogy, because we've seen similar fuckery just before, and so we can show an example of a manual implementation of closures, just like we did boxed trait objects, and that'll clear it up for everyone.

Cool bear

Are you sure that'll work?

Eh, it's worth a shot right?

So here's what I mean. Say we want to provide a function that does something three times:

fn main() {
    do_three_times(todo!());
}

fn do_three_times<T>(t: T) {
    todo!()
}

It's generic, because it can do any thing three times. Caller's choice. Only how do I... how does the thing... do... something.

Oh! Traits! I can make a trait, hang on.

trait Thing {
    fn do_it(&self);
}

There. And then do_three_times will take anything that implements Thing... oh we can use impl Trait syntax, no need for explicit generic type parameters here:

fn do_three_times(t: impl Thing) {
    for _ in 0..3 {
        t.do_it();
    }
}

And then to call it, well... we need some type, on which we implement Thing, and make it do a thing. What's a good way to make up a new type that's empty?

Cool bear

Empty struct?

Right!

struct Greet;

impl Thing for Greet {
    fn do_it(&self) {
        println!("hello!");
    }
}

fn main() {
    do_three_times(Greet);
}

And, if my calculations are correct...

$ cargo run -q
hello!
hello!
hello!

Yes!!! See bear? Easy peasy! That wasn't even very long at all.

Cool bear

I must admit, I'm impressed.

And look, we can even box these!

trait Thing {
    fn do_it(&self);
}

fn do_three_times(things: &[Box<dyn Thing>]) {
    for _ in 0..3 {
        for t in things {
            t.do_it()
        }
    }
}

struct Greet;

impl Thing for Greet {
    fn do_it(&self) {
        println!("hello!");
    }
}

struct Part;

impl Thing for Part {
    fn do_it(&self) {
        println!("goodbye!");
    }
}

fn main() {
    do_three_times(&[Box::new(Greet), Box::new(Part)]);
}
$ cargo run -q
hello!
goodbye!
hello!
goodbye!
hello!
goodbye!
Cool bear

Very nice. You even figured out how to make slices of heterogeneous types.

Now let's see Paul Allen's trai-

Let me stop you right there, bear. I know what you're about to ask: "Oooh, but what if you need to mutate stuff from inside the closure? That won't work will it? Because Wust is such a special widdle wanguage uwu, it can't just wet you do the things you want, it has to be a whiny baby about it" well HAVE NO FEAR because yes, yes, I have realized that this right here:

trait Thing {
    //        👇
    fn do_it(&self);
}

...means the closure can never mutate its environment.

Cool bear

Ah!

And so what you'd need to do if you wanted to be able to do that, is have a ThingMut trait, like so:

trait ThingMut {
    fn do_it(&mut self);
}

fn do_three_times(mut t: impl ThingMut) {
    for _ in 0..3 {
        t.do_it()
    }
}

struct Greet(usize);

impl ThingMut for Greet {
    fn do_it(&mut self) {
        self.0 += 1;
        println!("hello {}!", self.0);
    }
}

fn main() {
    do_three_times(Greet(0));
}
$ cargo run -q
hello 1!
hello 2!
hello 3!
Cool bear

Yes, but you don't really ne-

BUT YOU DON'T NEED TO TAKE OWNERSHIP OF THE THINGMUT I know I know, watch this:

fn do_three_times(t: &mut dyn ThingMut) {
    for _ in 0..3 {
        t.do_it()
    }
}

Boom!

fn main() {
    do_three_times(&mut Greet(0));
}

Bang.

Cool bear

And I suppose you don't need me to do the link with the actual traits in the Rust standard library either?

Eh, who needs you. I'm sure I can find them... there!

There's three of them:

pub trait FnOnce<Args> {
    type Output;
    extern "rust-call" fn call_once(self, args: Args) -> Self::Output;
}

pub trait FnMut<Args>: FnOnce<Args> {
    extern "rust-call" fn call_mut(
        &mut self,
        args: Args
    ) -> Self::Output;
}

pub trait Fn<Args>: FnMut<Args> {
    extern "rust-call" fn call(&self, args: Args) -> Self::Output;
}

So all Fn (immutable reference) are also FnMut (mutable reference), which are also FnOnce (takes ownership). Beautiful symmetry.
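Here's a sketch of that hierarchy in action. The three `call_*` helpers are made up for illustration; each one asks for a different trait, and the closures below show which ones they qualify for.

```rust
fn call_fn(f: impl Fn() -> i32) -> i32 {
    f()
}

fn call_fn_mut(mut f: impl FnMut() -> i32) -> i32 {
    f()
}

fn call_fn_once(f: impl FnOnce() -> i32) -> i32 {
    f()
}

fn main() {
    // Only reads its environment: implements Fn, hence FnMut and FnOnce too.
    let n = 42;
    let read_only = || n;
    dbg!(call_fn(read_only));
    dbg!(call_fn_mut(read_only));
    dbg!(call_fn_once(read_only));

    // Mutates its environment: FnMut and FnOnce, but not Fn.
    let mut count = 0;
    let mut bump = || {
        count += 1;
        count
    };
    // `call_fn(bump)` would not compile here: `bump` is only `FnMut`.
    dbg!(call_fn_mut(&mut bump));
    dbg!(call_fn_once(bump));

    // Consumes something it captured: FnOnce only.
    let s = String::from("hello");
    let consume = move || {
        let owned = s; // moves `s` out of the closure's environment
        owned.len() as i32
    };
    dbg!(call_fn_once(consume));
}
```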

And then... I'm assuming the extern "rust-call" fuckery is because... lack of variadics right now?

Cool bear

Right, yes. And that's also why you can't really implement the Fn / FnMut / FnOnce traits yourself on arbitrary types right now.

Yeah, see! Easy. So our example becomes this:

fn do_three_times(t: &mut dyn FnMut()) {
    for _ in 0..3 {
        t()
    }
}

fn main() {
    let mut counter = 0;
    do_three_times(&mut || {
        counter += 1;
        println!("hello {counter}!")
    });
}

Bam, weird syntax but that's a lot less typing, I like it, arguments are between pipes, sure why not.
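And since the closure only borrows `counter` mutably for as long as it's alive, we get the variable back afterwards. A small sketch, with `count_calls` being a made-up wrapper:

```rust
fn do_three_times(t: &mut dyn FnMut()) {
    for _ in 0..3 {
        t()
    }
}

// Hypothetical wrapper: returns how many times the closure ran.
fn count_calls() -> i32 {
    let mut counter = 0;
    do_three_times(&mut || {
        counter += 1;
    });
    // The closure, and its mutable borrow of `counter`, is gone by now,
    // so reading the variable is allowed again.
    counter
}

fn main() {
    dbg!(count_calls());
}
```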

Cool bear

Arguments are between pipes, what do you mean?

Oh, well closures can take arguments too, they're just like functions right? You told me that. So we can... do this!

//                            👇
fn do_three_times(t: impl Fn(i32)) {
    for i in 0..3 {
        t(i)
    }
}

fn main() {
    //             👇
    do_three_times(|i| println!("hello {i}!"));
}
Cool bear

I see. And I suppose you've figured out boxing as well?

The sport, no. But the type erasure, sure, in that regard they're just regular traits, so, here we go:

fn do_all_the_things(things: &[Box<dyn Fn()>]) {
    for t in things {
        t()
    }
}

fn main() {
    do_all_the_things(&[
        Box::new(|| println!("hello")),
        Box::new(|| println!("how are you")),
        Box::new(|| println!("I wasn't really asking")),
        Box::new(|| println!("goodbye")),
    ]);
}
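Boxed closures can take arguments, return values and capture things too. In this sketch, `build_steps` and `run_all` are made-up helpers; the interesting part is that two closures with different captures end up behind the same `Box<dyn Fn(i32) -> i32>` type.

```rust
// Hypothetical helper: builds a list of steps, one of which captures `factor`.
fn build_steps(factor: i32) -> Vec<Box<dyn Fn(i32) -> i32>> {
    let mut steps: Vec<Box<dyn Fn(i32) -> i32>> = Vec::new();
    // Captures nothing.
    steps.push(Box::new(|x: i32| x + 1));
    // `move` so the boxed closure owns its copy of `factor`.
    steps.push(Box::new(move |x: i32| x * factor));
    steps
}

// Hypothetical helper: feeds `input` through every step, in order.
fn run_all(steps: &[Box<dyn Fn(i32) -> i32>], input: i32) -> i32 {
    steps.iter().fold(input, |acc, step| step(acc))
}

fn main() {
    let steps = build_steps(10);
    dbg!(run_all(&steps, 3));
}
```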
Cool bear

Well. It looks like you're all set.

Nothing left to learn.

The world no longer holds any secrets for you.

Through science, you have rid the universe of its last mystery, and you are now cursed to roam, surrounded by the mundane, devoid of the last shred of poet-

Wait, what about async stuff?

Cool bear

Ahhhhhhhhhhhhhhhhhhh fuck.

Async stuff

Cool bear

Okay, async stuff, is.... ugh. Wait, you've written about this before.

Multiple times yes, but humor me. Why do I want it?

Cool bear

You don't! God, why would you. I mean, okay you want it if you're writing network services and stuff.

Oh yes, I do want to do that! So I do want async!

Cool bear

Yes. Yes you very much want async.

And I've heard it makes everything worse!

Cool bear

Well...... so, you know how if you write a file, it writes to the file?

Yes? Like that:

fn main() {
    // error handling omitted for story-telling purposes
    let _ = std::fs::write("/tmp/hi", "hi!\n");
}
$ cargo run -q && cat /tmp/hi
hi!
Cool bear

Well async is the same, except it doesn't work.

$ cargo add tokio --features full
(cut)
fn main() {
    // error handling omitted for story-telling purposes
    //        👇 (was `std`)
    let _ = tokio::fs::write("/tmp/bye", "bye!\n");
}
$ cargo run -q && cat /tmp/bye
cat: /tmp/bye: No such file or directory

Ah. Indeed it doesn't work.

Cool bear

Exactly, it does nothing, zilch:

$ cargo b -q && strace -ff ./target/debug/grr 2>&1 | grep bye
Cool bear

When the other clearly did something:

$ cargo b -q && strace -ff ./target/debug/grr 2>&1 | grep hi
openat(AT_FDCWD, "/tmp/hi", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = 3
write(3, "hi!\n", 4)                    = 4

But wait, that's cuckoo. The cinECMAtic javascript universe also has async and it certainly does do things:

async function main() {
  await require("fs").promises.writeFile("/tmp/see", "see");
}

main();
$ strace -ff node main.js 2>&1 | grep see
[pid 1825359] openat(AT_FDCWD, "/tmp/see", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666 <unfinished ...>
[pid 1825360] write(17, "see", 3 <unfinished ...>
Cool bear

It does do them things, yes. That's because Node.js® is very async at its core. See, the idea... well that's unfair but let's pretend the idea was "threads are hard okay".

Sure, I can buy that. Threads seem hard — especially when there's a bunch of them stepping on each other's knees and toes, knees and toes.

Cool bear

So fuck threads right? Instead of doing blocking calls...

Wait what are bl-

Cool bear

calls that, like, block! Block everything. You're waiting for... some file to be read, and in the meantime, nothing else can happen.

Right. So instead of that we... do callbacks? Those used to be huge right.

Cool bear

Exactly! You say "I'd like to read from that file" and say "and when it's done, call me back on this number" except it's not a number, it's a closure.

Right! Like so:

const { readFile } = require("fs");

readFile("/usr/bin/gcc", () => {
  console.log("just read /usr/bin/gcc");
});

readFile("/usr/bin/clang", () => {
  console.log("just read /usr/bin/clang");
});
Cool bear

Exactly! Even though there's only ever one ECMAScript thing happening at once, multiple I/O (input/output) operations can be in-flight, and they can complete whenever, which is why if we run this, we can get:

$ node main.js
just read /usr/bin/clang
just read /usr/bin/gcc

Right! Even though we asked for /usr/bin/gcc to be read first.

Cool bear

Exactly. So async Rust is the same, right? Except async stuff doesn't run just by itself. There's no built-in runtime that's started implicitly, so we gotta create one and use it:

fn main() {
    tokio::runtime::Runtime::new().unwrap().block_on(async {
        tokio::fs::write("/tmp/bye", "bye!\n").await.unwrap();
    })
}
Cool bear

And now it does do something:

$ cargo run -q && cat /tmp/bye
bye!

$ cargo b -q && strace -ff ./target/debug/grr 2>&1 | grep bye
[pid 1857097] openat(AT_FDCWD, "/tmp/bye", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = 9
[pid 1857097] write(9, "bye!\n", 5)     = 5
Cool bear

And so the Node.js® program you showed earlier was doing something more like this:

use std::time::Duration;

fn main() {
    // create a new async runtime
    let rt = tokio::runtime::Runtime::new().unwrap();

    // spawn a future on that runtime
    rt.spawn(async {
        tokio::fs::write("/tmp/bye", "bye!\n").await.unwrap();
    });

    // wait for all spawned futures for... some amount of time
    rt.shutdown_timeout(Duration::from_secs(10_000))
}
Cool bear

Except it probably waited for longer than that. But yeah that's the idea.

Okay, so, wait, there's async blocks? Like async { stuff }?

Cool bear

Yes.

And async closures? Like async |a, b, c| { stuff }?

Cool bear

Unfortunately, not yet.

There's async functions, though:

use std::time::Duration;

fn main() {
    // create a new async runtime
    let rt = tokio::runtime::Runtime::new().unwrap();

    // spawn a future on that runtime
    //          👇
    rt.spawn(write_bye());

    // wait for all spawned futures for... some amount of time
    rt.shutdown_timeout(Duration::from_secs(10_000))
}

// 👇
async fn write_bye() {
    tokio::fs::write("/tmp/bye", "bye!\n").await.unwrap();
}

Well it's something.

But wait, so when you call write_bye() it doesn't actually start doing the work?

Cool bear

No, it returns a future, and then you need to either spawn it somewhere, or you need to poll it.

How do, uh... how does one go about polling it?

Cool bear

You don't, the runtime does.

Ah, right. Because of the... no I'm sorry, that's nonsense. The runtime polls it?

Cool bear

Well, you can poll it if you want to, sometimes it'll even work:

use std::{
    future::Future,
    task::{Context, RawWaker, RawWakerVTable, Waker},
};

fn main() {
    let fut = tokio::fs::read("/etc/hosts");
    let mut fut = Box::pin(fut);

    let rw = RawWaker::new(
        std::ptr::null_mut(),
        &RawWakerVTable::new(clone, wake, wake_by_ref, drop),
    );
    let w = unsafe { Waker::from_raw(rw) };
    let mut cx = Context::from_waker(&w);

    let res = fut.as_mut().poll(&mut cx);
    dbg!(&res);
}

unsafe fn clone(_ptr: *const ()) -> RawWaker {
    todo!()
}

unsafe fn wake(_ptr: *const ()) {
    todo!()
}

unsafe fn wake_by_ref(_ptr: *const ()) {
    todo!()
}

unsafe fn drop(_ptr: *const ()) {
    // do nothing
}

Heyyyyyyyyyyyy that's a vtable, we saw this!

Cool bear

Yes, that's how Rust async runtimes work under the hood. And as you can see:

$ RUST_BACKTRACE=1 cargo run -q
thread 'main' panicked at 'there is no reactor running, must be called from the context of a Tokio 1.x runtime', /home/amos/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.18.2/src/runtime/context.rs:21:19
stack backtrace:
   0: rust_begin_unwind
             at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:584:5
   1: core::panicking::panic_fmt
             at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/panicking.rs:143:14
   2: core::panicking::panic_display
             at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/panicking.rs:72:5
   3: tokio::runtime::context::current
             at /home/amos/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.18.2/src/runtime/context.rs:21:19
   4: tokio::runtime::blocking::pool::spawn_blocking
             at /home/amos/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.18.2/src/runtime/blocking/pool.rs:113:14
   5: tokio::fs::asyncify::{{closure}}
             at /home/amos/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.18.2/src/fs/mod.rs:119:11
   6: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
             at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/future/mod.rs:91:19
   7: tokio::fs::read::read::{{closure}}
             at /home/amos/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.18.2/src/fs/read.rs:50:42
   8: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
             at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/future/mod.rs:91:19
   9: grr::main
             at ./src/main.rs:17:15
  10: core::ops::function::FnOnce::call_once
             at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/ops/function.rs:227:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
Cool bear

...okay so this one doesn't work because there's more moving pieces than this.

But you get the idea, futures get polled.
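For a hand-polled future that does make it to the finish line, here's a minimal sketch using only the standard library. `Countdown`, `CountingWaker` and `poll_to_completion` are invented for this example, and the waker is built through the `std::task::Wake` trait rather than a hand-rolled vtable. A real runtime would go to sleep until it's woken instead of spinning in a loop.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical future: reports Pending `remaining` times, then Ready.
struct Countdown {
    remaining: u32,
}

impl Future for Countdown {
    type Output = &'static str;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.remaining == 0 {
            return Poll::Ready("done");
        }
        self.remaining -= 1;
        // Ask to be polled again. A real future would hand the waker to
        // whatever it is waiting on (a socket, a timer...).
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

// Counts wake-ups instead of scheduling anything.
struct CountingWaker(AtomicUsize);

impl Wake for CountingWaker {
    fn wake(self: Arc<Self>) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}

// Polls `fut` in a loop, the way a (very naive) runtime would.
// Returns the output, the number of polls, and the number of wake-ups.
fn poll_to_completion(fut: Countdown) -> (&'static str, u32, usize) {
    let counter = Arc::new(CountingWaker(AtomicUsize::new(0)));
    let waker = Waker::from(counter.clone());
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return (out, polls, counter.0.load(Ordering::SeqCst));
        }
    }
}

fn main() {
    let (out, polls, wakes) = poll_to_completion(Countdown { remaining: 2 });
    println!("{out} after {polls} polls and {wakes} wake-ups");
}
```

Nothing happens when `Countdown` is created. All the progress is made inside `poll`, one call at a time.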

I'm not sure I do. I mean okay so they get polled once, via this weird trait:

pub trait Future {
    type Output;
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output>;
}
Cool bear

Yes, which has a weird Pin<&mut Self> receiver instead of say, &mut self, to make self-referential types work.

Self-referential types? Ok now I'm completely lost. WE TRIED, EVERYONE, time to pack up and get outta here.

Cool bear

No no no bear with me

😐

Cool bear

...so think back to closures: they're code + data. A function and its environment. And the code in there can create new references to the data, right?

I.. I guess?

Cool bear

Like this for example:

fn main() {
    do_stuff(|| {
        let v = vec![1, 2, 3, 4];
        let back_half = &v[2..];
        println!("{back_half:?}");
    });
}

fn do_stuff(f: impl Fn()) {
    f()
}

Ah right, yes. The closure allocates some memory as a Vec, and then it takes an immutable slice of it. I don't see where the issue is, though.

Cool bear

Well think of futures like closures but... that you can call into several times?

I call into several times?

Cool bear

No, the runtime does.

... the confusion, it remains.

Cool bear

No but like, if we look at this:

use std::future::Future;

fn main() {
    do_stuff(async {
        let arr = [1, 2, 3, 4];
        let back_half = &arr[2..];
        let hosts = tokio::fs::read("/etc/hosts").await;
        println!("{back_half:?}, {hosts:?}");
    });
}

fn do_stuff(f: impl Future<Output = ()>) {
    // blah
}

Yes, same idea but with some async sprinkled in there.

Cool bear

Exactly. So that read("/etc/hosts").await line there, that's an await point.

I can't help but feel like we're getting away from the spirit of the article, but okay, sure?

Cool bear

Focus! So read() returns a Future, and then we call .await, which makes the current/ambient async runtime poll it once.

Sure, I can buy that. And then?

Cool bear

Well and then either it returns Poll::Ready and it synchronously continues execution into the second part of that async block.

Or?

Cool bear

Or it returns Poll::Pending, at which point it'll have already registered itself with all the Waker business I teased earlier on.

Right. And then what happens?

Cool bear

And then it returns.

But... but it can't! If it returns we'll lose the data! The array will go out of scope and be freed!

Cool bear

Exactly.

So surely it's not actually returning?

Cool bear

It is actually returning. But it's also storing the array somewhere else. So that the next time it's polled/called, there it is. And in that "somewhere else", it also remembers which await point caused it to return Poll::Pending.

So this is all just a gigantic state machine?

Cool bear

Yes! And some parts of its state (in this case, back_half) may reference some other parts of its state (in this case, arr), so the state struct itself is... self-referential.
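To make "gigantic state machine" a bit more concrete, here's a hand-written approximation of that async block. Assumptions, loudly: `FakeRead` stands in for `tokio::fs::read`, the enum is a guess at the general shape and not what the compiler actually emits, and it cheats by storing an index (`back_half_from`) instead of a real reference, which is exactly how it avoids being self-referential.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical stand-in for `tokio::fs::read`: pending once, then ready.
// A real future would stash `cx.waker()` before returning Pending.
struct FakeRead {
    polled_before: bool,
}

impl Future for FakeRead {
    type Output = String;

    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<String> {
        if self.polled_before {
            return Poll::Ready("127.0.0.1 localhost".to_string());
        }
        self.polled_before = true;
        Poll::Pending
    }
}

// One variant per place the block can be parked.
enum Block {
    Start,
    // Parked at the await point, with the locals that must survive it.
    AwaitingRead { arr: [i32; 4], back_half_from: usize, read: FakeRead },
    Done,
}

impl Future for Block {
    type Output = String;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<String> {
        // Allowed only because this hand-written version is `Unpin`.
        let this = self.get_mut();
        loop {
            match &mut *this {
                Block::Start => {
                    *this = Block::AwaitingRead {
                        arr: [1, 2, 3, 4],
                        back_half_from: 2,
                        read: FakeRead { polled_before: false },
                    };
                }
                Block::AwaitingRead { arr, back_half_from, read } => {
                    return match Pin::new(read).poll(cx) {
                        // Actually return; the locals stay stored in `this`.
                        Poll::Pending => Poll::Pending,
                        Poll::Ready(hosts) => {
                            let out = format!("{:?}, {hosts:?}", &arr[*back_half_from..]);
                            *this = Block::Done;
                            Poll::Ready(out)
                        }
                    };
                }
                Block::Done => panic!("polled after completion"),
            }
        }
    }
}

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// Polls until ready; returns how many polls it took and the output.
fn drive(block: Block) -> (u32, String) {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut block = Box::pin(block);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(out) = block.as_mut().poll(&mut cx) {
            return (polls, out);
        }
    }
}

fn main() {
    let (polls, out) = drive(Block::Start);
    println!("{out} (after {polls} polls)");
}
```

The first poll returns `Poll::Pending` with `arr` tucked away in the `AwaitingRead` variant, and the second poll picks up from there.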

Here's the async block code again because that's a lot of scrolling:

    do_stuff(async {
        let arr = [1, 2, 3, 4];
        let back_half = &arr[2..];
        let hosts = tokio::fs::read("/etc/hosts").await;
        println!("{back_half:?}, {hosts:?}");
    });

Self-referential as in it refers to itself, gotcha.

And what's the problem with that?

Cool bear

The problem is, what if you poll that future once, and then it returns Poll::Pending, and then you move it somewhere else in memory?

Then I guess... arr will be moved along with it?

Cool bear

EXACTLY. And back_half will still point at the wrong place.
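Here's a minimal sketch of that failure mode, using a raw pointer so the borrow checker lets us make the mistake. `State` is a made-up stand-in for the future's internals, and the pointer is only ever compared, never dereferenced:

```rust
// Hypothetical stand-in for the future's state: an array, plus a raw
// pointer playing the role of `back_half`.
struct State {
    arr: [i32; 4],
    back_half: *const i32,
}

// Returns true if `back_half` still points into `arr` after a move.
fn pointer_survives_move() -> bool {
    let mut state = State {
        arr: [1, 2, 3, 4],
        back_half: std::ptr::null(),
    };
    state.back_half = &state.arr[2]; // points into `state` itself

    // Move the whole struct to the heap. The fields are copied as-is,
    // so `back_half` keeps the old stack address.
    let moved = Box::new(state);

    let actual: *const i32 = &moved.arr[2];
    moved.back_half == actual
}

fn main() {
    println!("pointer still valid after move: {}", pointer_survives_move());
}
```

The array moved, the pointer value didn't change, and now the two disagree.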

Ohhhhhhh so it must be pinned.

Cool bear

Yes. It must be pinned in order to be polled. That's why the receiver of poll is Pin<&mut Self>.

And so we can move the future before it's polled, but after the first time it's been polled, it's game over? Stay pinned?

Cool bear

Unless it implements Unpin, yes.

Which... it would implement only if... it was safe to move elsewhere?

Cool bear

Yes, for example if it only contained references to memory that's on the heap!

But GenFuture, the opaque type of async blocks, never implements Unpin (at least, I couldn't get it to), so this fails to build:

use std::{future::Future, time::Duration};

fn main() {
    let fut = async move {
        println!("hang on a sec...");
        tokio::time::sleep(Duration::from_secs(1)).await;
        println!("I'm here!");
    };
    ensure_unpin(&fut);
}

fn ensure_unpin<F: Future + Unpin>(f: &F) {
    // muffin
}
$ cargo check -q
error[E0277]: `from_generator::GenFuture<[static generator@src/main.rs:4:26: 8:6]>` cannot be unpinned
  --> src/main.rs:9:18
   |
9  |     ensure_unpin(&fut);
   |     ------------ ^^^^ within `impl Future<Output = ()>`, the trait `Unpin` is not implemented for `from_generator::GenFuture<[static generator@src/main.rs:4:26: 8:6]>`
   |     |
   |     required by a bound introduced by this call
Cool bear

...but we can always "box-pin" it, moving the whole future to the heap, so that we can move a reference to it wherever we please:

use std::{future::Future, time::Duration};

fn main() {
    //           👇
    let fut = Box::pin(async move {
        println!("hang on a sec...");
        tokio::time::sleep(Duration::from_secs(1)).await;
        println!("I'm here!");
    });
    ensure_unpin(&fut);
}

fn ensure_unpin<F: Future + Unpin>(f: &F) {
    // muffin
}

Okay that... that was a lot to take in.

So async stuff is awful because I need to understand all that, right?

Cool bear

Oh no, not at all.

Huh?

Cool bear

For starters, you don't really want to build a tokio Runtime yourself. There's macros for that.

#[tokio::main]
async fn main() {
    tokio::fs::write("/tmp/bye", "bye!\n").await.unwrap();
}

Ah, that seems more convenient, yes.

Cool bear

And you never really want to care about the Context / Waker / RawWaker stuff either. Those are implementation details.

Right right, yes.

Cool bear

But thus is the terrible deal we've made with the devil compiler. It guards us from numerous perils, but in exchange, we sometimes run head-first into unholy type errors.

I see. So you're saying... I don't need to understand pinning for example?

Cool bear

No! You just need to know that you can Box::pin() your way out of "this thing is not Unpin" diagnostics. Just like you can .clone() your way out of many "this thing doesn't live long enough" diagnostics.

Then WHY in the world did we learn all that.

Cool bear

Well, if you have a vague understanding of the underlying design constraints, it makes it a teensy bit less frustrating when you run into seemingly arbitrary limitations.

Such as?

Cool bear

Ah, friend.

I'm so glad you asked.

Async trait methods

Cool bear

So traits! You know traits. Here's a trait.

pub trait Read {
    fn read(&mut self, buf: &mut [u8]) -> Result<usize>;

    // (other methods omitted)
}

Yeah I know traits. That seems like a reasonable trait. The receiver is &mut self, because... it advances a read head? Also takes a buffer to write its output to, and returns how many bytes were read. Pretty simple stuff.

Cool bear

Wonderful! Now do the same, but make read async.

What, like that?

pub trait AsyncRead {
    async fn read(&mut self, buf: &mut [u8]) -> Result<usize, Box<dyn std::error::Error>>;
}
$ cargo check -q
error[E0706]: functions in traits cannot be declared `async`
 --> src/main.rs:2:5
  |
2 |     async fn read(&mut self, buf: &mut [u8]) -> Result<usize, Box<dyn std::error::Error>>;
  |     -----^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |     |
  |     `async` because of this
  |
  = note: `async` trait functions are not currently supported
  = note: consider using the `async-trait` crate: https://crates.io/crates/async-trait

Well the diagnostic is exemplary but, long story short: compiler says no.

Cool bear

Exactly. Do you know why?

Not really no?

Cool bear

Well, it's complicated. But we can sorta get an intuition for it.

Turns out there already is an AsyncRead trait in tokio (and a couple other places). Let's make an async function that just calls it:

async fn read(r: &mut (dyn AsyncRead + Unpin), buf: &mut [u8]) -> std::io::Result<usize> {
    r.read(buf).await
}
Cool bear

And now let's use it a couple times:

use tokio::{
    fs::File,
    io::{AsyncRead, AsyncReadExt, AsyncWriteExt},
    net::TcpStream,
};

#[tokio::main]
async fn main() {
    let mut f = File::open("/etc/hosts").await.unwrap();
    let mut buf1 = vec![0u8; 128];
    read(&mut f, &mut buf1).await.unwrap();
    println!("buf1 = {:?}", std::str::from_utf8(&buf1));

    let mut s = TcpStream::connect("example.org:80").await.unwrap();
    s.write_all("GET http://example.org HTTP/1.1\r\n\r\n".as_bytes())
        .await
        .unwrap();
    let mut buf2 = vec![0u8; 128];
    read(&mut s, &mut buf2).await.unwrap();
    println!("buf2 = {:?}", std::str::from_utf8(&buf2));
}

async fn read(r: &mut (dyn AsyncRead + Unpin), buf: &mut [u8]) -> std::io::Result<usize> {
    r.read(buf).await
}

Whoa. WHOA, we're writing real code now?

Cool bear

If you call that real code, sure. Anyway we're doing two asynchronous things: reading from a file, and reading from a TCP socket, cosplaying as the world's worst HTTP client.

$ cargo run -q
buf1 = Ok("127.0.0.1\tlocalhost\n127.0.1.1\tamos\n\n# The following lines are desirable for IPv6 capable hosts\n::1     ip6-localhost ip6-loopbac")
buf2 = Ok("HTTP/1.1 200 OK\r\nAge: 586436\r\nCache-Control: max-age=604800\r\nContent-Type: text/html; charset=UTF-8\r\nDate: Wed, 01 Jun 2022 17:1")
Cool bear

Now here's my question: what type does read return?

Our read function? I have no idea, why?

Cool bear

Because, it's important.

Well... I suppose we could try assigning one to the other?

Cool bear

Sure, let's do that.

use tokio::{
    fs::File,
    io::{AsyncRead, AsyncReadExt, AsyncWriteExt},
    net::TcpStream,
};

#[tokio::main]
async fn main() {
    let mut f = File::open("/etc/hosts").await.unwrap();
    let mut buf1 = vec![0u8; 128];

    let mut s = TcpStream::connect("example.org:80").await.unwrap();
    s.write_all("GET http://example.org HTTP/1.1\r\n\r\n".as_bytes())
        .await
        .unwrap();
    let mut buf2 = vec![0u8; 128];

    #[allow(unused_assignments)]
    let mut fut1 = read(&mut f, &mut buf1);
    let fut2 = read(&mut s, &mut buf2);
    fut1 = fut2;
    fut1.await.unwrap();
    println!("buf2 = {:?}", std::str::from_utf8(&buf2));
}

async fn read(r: &mut (dyn AsyncRead + Unpin), buf: &mut [u8]) -> std::io::Result<usize> {
    r.read(buf).await
}
$ cargo run -q
buf2 = Ok("HTTP/1.1 200 OK\r\nAge: 387619\r\nCache-Control: max-age=604800\r\nContent-Type: text/html; charset=UTF-8\r\nDate: Wed, 01 Jun 2022 17:2")

Hey, that worked!

Cool bear

Yes indeed. What else can you tell me about those types?

Mhhh their names, sort of?

{
    // in main:

    let fut1 = read(&mut f, &mut buf1);
    let fut2 = read(&mut s, &mut buf2);
    println!("fut1's type is {}", type_name_of_val(&fut1));
    println!("fut2's type is {}", type_name_of_val(&fut2));
}

fn type_name_of_val<T>(_t: &T) -> &'static str {
    std::any::type_name::<T>()
}
$ cargo run -q
fut1's type is core::future::from_generator::GenFuture<grr::read::{{closure}}>
fut2's type is core::future::from_generator::GenFuture<grr::read::{{closure}}>

Hah! It's closures all the way down. And then I guess their size?

    println!("fut1's size is {}", std::mem::size_of_val(&fut1));
    println!("fut2's size is {}", std::mem::size_of_val(&fut2));
$ cargo run -q
fut1's size is 72
fut2's size is 72
Cool bear

Okay, very well! Now same question with this read function:

async fn read(mut r: (impl AsyncRead + Unpin), buf: &mut [u8]) -> std::io::Result<usize> {
    r.read(buf).await
}

Okay, let's try assigning one to the other...

    let mut fut1 = read(&mut f, &mut buf1);
    let fut2 = read(&mut s, &mut buf2);
    fut1 = fut2;
$ cargo run -q
error[E0308]: mismatched types
  --> src/main.rs:20:12
   |
18 |     let mut fut1 = read(&mut f, &mut buf1);
   |                    ----------------------- expected due to this value
19 |     let fut2 = read(&mut s, &mut buf2);
20 |     fut1 = fut2;
   |            ^^^^ expected struct `tokio::fs::File`, found struct `tokio::net::TcpStream`
   |
note: while checking the return type of the `async fn`
  --> src/main.rs:25:67
   |
25 | async fn read(mut r: (impl AsyncRead + Unpin), buf: &mut [u8]) -> std::io::Result<usize> {
   |                                                                   ^^^^^^^^^^^^^^^^^^^^^^ checked the `Output` of this `async fn`, expected opaque type
note: while checking the return type of the `async fn`
  --> src/main.rs:25:67
   |
25 | async fn read(mut r: (impl AsyncRead + Unpin), buf: &mut [u8]) -> std::io::Result<usize> {
   |                                                                   ^^^^^^^^^^^^^^^^^^^^^^ checked the `Output` of this `async fn`, found opaque type
   = note: expected opaque type `impl Future<Output = Result<usize, std::io::Error>>` (struct `tokio::fs::File`)
              found opaque type `impl Future<Output = Result<usize, std::io::Error>>` (struct `tokio::net::TcpStream`)
   = help: consider `await`ing on both `Future`s

For more information about this error, try `rustc --explain E0308`.
error: could not compile `grr` due to previous error

Huh. HUH. The compiler is not happy AT ALL. It's trying very hard to be helpful, but it's clear it didn't expect anyone to fuck around in that particular manner, much less find out.

Let's try answering the other questions though... the type "name":

    let fut1 = read(&mut f, &mut buf1);
    let fut2 = read(&mut s, &mut buf2);
    println!("fut1's name is {}", type_name_of_val(&fut1));
    println!("fut2's name is {}", type_name_of_val(&fut2));
$ cargo run -q
fut1's name is core::future::from_generator::GenFuture<grr::read<&mut tokio::fs::file::File>::{{closure}}>
fut2's name is core::future::from_generator::GenFuture<grr::read<&mut tokio::net::tcp::stream::TcpStream>::{{closure}}>

Ooooh interesting. And then their sizes:

    let fut1 = read(&mut f, &mut buf1);
    let fut2 = read(&mut s, &mut buf2);
    println!("fut1's size is {}", std::mem::size_of_val(&fut1));
    println!("fut2's size is {}", std::mem::size_of_val(&fut2));
$ cargo run -q
fut1's size is 64
fut2's size is 64

Awwwwwww I was hoping for them to be different, b- wait, WAIT, we're passing &mut f and &mut s each time, that's 8 bytes each, if we pass ownership of the File / TcpStream respectively, then maybe...

    let fut1 = read(f, &mut buf1);
    let fut2 = read(s, &mut buf2);
    println!("fut1's size is {}", std::mem::size_of_val(&fut1));
    println!("fut2's size is {}", std::mem::size_of_val(&fut2));
$ cargo run -q
fut1's size is 256
fut2's size is 112

YES! The File is bigger.

Cool bear

Yes it is, for some reason. I can see... a tokio::sync::Mutex in there? Fun!

Okay so, is read returning the same type in both cases?

No!
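The same effect shows up without tokio. In this sketch, `hold` is a hypothetical generic async fn: every `T` gets its own future type, and the future has to be at least big enough to carry its `T` across the await point. The exact sizes depend on the compiler, so the code only compares them:

```rust
use std::future::ready;
use std::mem::size_of_val;

// Hypothetical generic async fn: it keeps `payload` alive across an await
// point, so the payload has to be stored inside the returned future.
async fn hold<T>(payload: T) -> T {
    ready(()).await;
    payload
}

// Returns the sizes of two futures produced by the same async fn.
fn future_sizes() -> (usize, usize) {
    // Two distinct types: assigning `large` to `small` would not compile.
    let small = hold(0u8);
    let large = hold([0u8; 1024]);
    (size_of_val(&small), size_of_val(&large))
}

fn main() {
    let (small, large) = future_sizes();
    println!("holding a u8: {small} bytes");
    println!("holding a [u8; 1024]: {large} bytes");
}
```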

Cool bear

And how would that work in a trait?

Well... we have impl Trait in return position, right? So just like these:

async fn sleepy_times() {
    tokio::time::sleep(Duration::from_secs(1)).await
}

...are actually sugar for these:

fn sleepy_times() -> impl Future<Output = ()> {
    async { tokio::time::sleep(Duration::from_secs(1)).await }
}

Then I guess instead of this:

trait AsyncRead {
    async fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize>;
}

We can have this:

trait AsyncRead {
    fn read(&mut self, buf: &mut [u8]) -> impl Future<Output = std::io::Result<usize>>;
}
Cool bear

You would think so! Except we cannot.

$ cargo run -q
error[E0562]: `impl Trait` only allowed in function and inherent method return types, not in trait method return
 --> src/main.rs:9:43
  |
9 |     fn read(&mut self, buf: &mut [u8]) -> impl Future<Output = std::io::Result<usize>>;
  |                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For more information about this error, try `rustc --explain E0562`.
error: could not compile `grr` due to previous error

Well THAT'S IT. I'm learning Haskell.

Cool bear

Whoa whoa now's not the time for self-harm. It's just a limitation!

On the other hand, we can have that:

trait AsyncRead {
    type Future: Future<Output = std::io::Result<usize>>;

    fn read(&mut self, buf: &mut [u8]) -> Self::Future;
}
Cool bear

And AsyncRead::Future is an associated type. It's chosen by the implementor of the trait.

I swear to glob, bear, if this is another one of your tricks, I'm..

$ cargo check -q
(nothing)

Oh. No, this checks. (Literally)

What's the catch?

Cool bear

Try implementing it!

Alright, well there's... tokio has its own AsyncRead trait... and then an AsyncReadExt extension trait, which actually gives us read, so we can just.. and then we... okay, there it is:

impl AsyncRead for File {
    type Future = ();

    fn read(&mut self, buf: &mut [u8]) -> Self::Future {
        tokio::io::AsyncReadExt::read(self, buf)
    }
}

But umm. What do I put as the Future type...

Cool bear

Hahahahahahahha.

Oh shut up will you. I'm sure the compiler will be able to help:

$ cargo check -q
error[E0277]: `()` is not a future
  --> src/main.rs:17:19
   |
17 |     type Future = ();
   |                   ^^ `()` is not a future
   |
   = help: the trait `Future` is not implemented for `()`
   = note: () must be a future or must implement `IntoFuture` to be awaited
note: required by a bound in `AsyncRead::Future`
  --> src/main.rs:11:18
   |
11 |     type Future: Future<Output = std::io::Result<usize>>;
   |                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ required by this bound in `AsyncRead::Future`

error[E0308]: mismatched types
  --> src/main.rs:20:9
   |
19 |     fn read(&mut self, buf: &mut [u8]) -> Self::Future {
   |                                           ------------ expected `()` because of return type
20 |         tokio::io::AsyncReadExt::read(self, buf)
   |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^- help: consider using a semicolon here: `;`
   |         |
   |         expected `()`, found struct `tokio::io::util::read::Read`
   |
   = note: expected unit type `()`
                 found struct `tokio::io::util::read::Read<'_, tokio::fs::File>`

Some errors have detailed explanations: E0277, E0308.
For more information about an error, try `rustc --explain E0277`.
error: could not compile `grr` due to 2 previous errors

See! I just have to...

impl AsyncRead for File {
    type Future = tokio::io::util::read::Read<'_, tokio::fs::File>;

    fn read(&mut self, buf: &mut [u8]) -> Self::Future {
        tokio::io::AsyncReadExt::read(self, buf)
    }
}
$ cargo check -q
error[E0603]: module `util` is private
   --> src/main.rs:17:30
    |
17  |     type Future = tokio::io::util::read::Read<'_, tokio::fs::File>;
    |                              ^^^^ private module
    |
note: the module `util` is defined here
   --> /home/amos/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.18.2/src/io/mod.rs:256:5
    |
256 |     pub(crate) mod util;
    |     ^^^^^^^^^^^^^^^^^^^^

😭

Cool bear

Hahahahahha. You simultaneously had the best and the worst luck.

...explain?

Cool bear

Well, because it turns out that AsyncReadExt::read is not an async fn, it's a regular fn that returns a named type that implements Future, so you could technically implement your AsyncRead trait... but it's unexported, so you can't name it, only the tokio crate can.
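For contrast, here's a sketch of the case where the future type can be named. `AsyncReadLen` and `Fixed` are made up (a simplified cousin of the trait above, with no buffer and so no lifetimes), and `std::future::Ready` plays the role of the public, nameable future:

```rust
use std::future::{ready, Future, Ready};
use std::io;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical, simplified cousin of the trait above: no buffer argument,
// so there are no lifetimes to worry about.
trait AsyncReadLen {
    type Future: Future<Output = io::Result<usize>>;

    fn read_len(&mut self) -> Self::Future;
}

struct Fixed(usize);

impl AsyncReadLen for Fixed {
    // `std::future::Ready` is a public type, so it can be spelled out.
    type Future = Ready<io::Result<usize>>;

    fn read_len(&mut self) -> Self::Future {
        ready(Ok(self.0))
    }
}

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// Polls the returned future once; enough for futures that never wait.
fn read_len_of(mut source: impl AsyncReadLen) -> usize {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(source.read_len());
    let polled = fut.as_mut().poll(&mut cx);
    match polled {
        Poll::Ready(res) => res.expect("this sketch never fails"),
        Poll::Pending => panic!("this sketch only handles futures that are ready at once"),
    }
}

fn main() {
    println!("read {} bytes", read_len_of(Fixed(42)));
}
```

The implementor only gets away with this because `Ready` is exported. Swap in a type from a private module and we're back to the E0603 above.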

Ahhhhhhhhhh. So... how do I get out of this?

Cool bear

Remember the survival rules: you could always Box::pin the future. That way you can name it.

Okay... then the whole thing becomes this:

trait AsyncRead {
    type Future: Future<Output = std::io::Result<usize>>;

    fn read(&mut self, buf: &mut [u8]) -> Self::Future;
}

impl AsyncRead for File {
    type Future = Pin<Box<dyn Future<Output = std::io::Result<usize>>>>;

    fn read(&mut self, buf: &mut [u8]) -> Self::Future {
        Box::pin(tokio::io::AsyncReadExt::read(self, buf))
    }
}

...which seems like it just.. might..

$ cargo check -q
error[E0759]: `buf` has an anonymous lifetime `'_` but it needs to satisfy a `'static` lifetime requirement
  --> src/main.rs:20:18
   |
19 |     fn read(&mut self, buf: &mut [u8]) -> Self::Future {
   |                             --------- this data with an anonymous lifetime `'_`...
20 |         Box::pin(tokio::io::AsyncReadExt::read(self, buf))
   |                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^---^
   |                                                      |
   |                                                      ...is used and required to live as long as `'static` here
   |
note: `'static` lifetime requirement introduced by the return type
  --> src/main.rs:19:43
   |
19 |     fn read(&mut self, buf: &mut [u8]) -> Self::Future {
   |                                           ^^^^^^^^^^^^ requirement introduced by this return type
20 |         Box::pin(tokio::io::AsyncReadExt::read(self, buf))
   |         -------------------------------------------------- because of this returned expression

For more information about this error, try `rustc --explain E0759`.
error: could not compile `grr` due to previous error

Oh COME ON.

Cool bear

Hahahahahahahahahhahah yes. The Self::Future type has to be generic over the lifetime of self...

??? how did we get here. We were learning some basic Rust. It was nice.

Cool bear

Well, Box<dyn Trait> actually has an implicit static bound: it's really Box<dyn Trait + 'static>.

It... okay yes, it must be owned.

Cool bear

And the future you're trying to box isn't owned is it? It's borrowing from self.
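Outside of a trait, the fix is one lifetime away. In this sketch, `fill` is a hypothetical stand-in for a read, and the `+ 'a` on the trait object is what allows the boxed future to borrow `buf`; with the default `'static` bound, the same function body would be rejected:

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// A boxed future that is allowed to borrow for `'a`. Without the `+ 'a`,
// the trait object would default to `'static`.
type BorrowingFuture<'a> = Pin<Box<dyn Future<Output = usize> + 'a>>;

// Hypothetical stand-in for a read: fills `buf` with a fixed byte.
fn fill<'a>(buf: &'a mut [u8]) -> BorrowingFuture<'a> {
    Box::pin(async move {
        buf.fill(b'x');
        buf.len()
    })
}

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// Polls the boxed future once and returns the count plus the buffer.
fn fill_four() -> (usize, [u8; 4]) {
    let mut buf = [0u8; 4];
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);

    let mut fut = fill(&mut buf);
    let polled = fut.as_mut().poll(&mut cx);
    drop(fut); // ends the borrow of `buf`

    match polled {
        Poll::Ready(n) => (n, buf),
        Poll::Pending => unreachable!("this future never waits"),
    }
}

fn main() {
    let (n, buf) = fill_four();
    println!("filled {n} bytes: {buf:?}");
}
```

In a trait, there is nowhere to hang that `'a` on a plain associated type, which is where things get hairy.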

Ahhhh fuckity fuck fuck.

Cool bear

Hey hey hey, no cursing, it's nothing a few nightly features can't fix!

# in rust-toolchain.toml

[toolchain]
channel = "nightly-2022-06-01"

//                   👇
#![feature(generic_associated_types)]

use std::{future::Future, pin::Pin};
use tokio::fs::File;

#[tokio::main]
async fn main() {
    let mut f = File::open("/etc/hosts").await.unwrap();
    let mut buf = vec![0u8; 32];
    AsyncRead::read(&mut f, &mut buf).await.unwrap();
    println!("buf = {:?}", std::str::from_utf8(&buf));
}

trait AsyncRead {
    type Future<'a>: Future<Output = std::io::Result<usize>>
    where
        Self: 'a;

    fn read<'a>(&'a mut self, buf: &'a mut [u8]) -> Self::Future<'a>;
}

impl AsyncRead for File {
    type Future<'a> = Pin<Box<dyn Future<Output = std::io::Result<usize>> + 'a>>;

    fn read<'a>(&'a mut self, buf: &'a mut [u8]) -> Self::Future<'a> {
        Box::pin(tokio::io::AsyncReadExt::read(self, buf))
    }
}

Whoa whoa whoa when did we graduate to that level of type fuckery.

Cool bear

Just squint! Or come back to it every couple weeks, whichever works.

$ cargo run -q
buf = Ok("127.0.0.1\tlocalhost\n127.0.1.1\tam")

Well it does run, I'll grant you that.

But wait, isn't boxing bad? What if we don't want to move that future to the heap?

Cool bear

Ah, then we need another trick unstable feature. And look, we can even use an async block!

//                   👇
#![feature(generic_associated_types)]
//                   👇👇
#![feature(type_alias_impl_trait)]

use std::future::Future;
use tokio::fs::File;

#[tokio::main]
async fn main() {
    let mut f = File::open("/etc/hosts").await.unwrap();
    let mut buf = vec![0u8; 32];
    AsyncRead::read(&mut f, &mut buf).await.unwrap();
    println!("buf = {:?}", std::str::from_utf8(&buf));
}

trait AsyncRead {
    type Future<'a>: Future<Output = std::io::Result<usize>>
    where
        Self: 'a;

    fn read<'a>(&'a mut self, buf: &'a mut [u8]) -> Self::Future<'a>;
}

impl AsyncRead for File {
    //                 👇
    type Future<'a> = impl Future<Output = std::io::Result<usize>> + 'a;

    fn read<'a>(&'a mut self, buf: &'a mut [u8]) -> Self::Future<'a> {
        // 👇
        async move { tokio::io::AsyncReadExt::read(self, buf).await }
    }
}

Whoaaaaa. It even runs!

Cool bear

It does! And you know the best part?

No?

Cool bear

These are actually slated for stabilization Soon™️.

Wait, so we're learning all that for naught? All that effort???

Cool bear

Eh, look at it this way: if and when those get stabilized, we'll be able to look back at it all and laugh.

Just like today we laugh at the fact that before Rust 1.35 (May 2019), the Fn traits weren't implemented for Box<dyn Fn>.

Or any number of significant milestones. It's been a long road.

I see. And in the meantime?

Cool bear

In the meantime my dear, we live in the now. And in the now, we have to deal with things such as...

The Connect trait from hyper

Ah, hyper! I've heard of it before.

It's an... http library? Does client, server, even http/2, maybe some day http/3.

Cool bear

Yeah I uh... that one needs help still. Call me? I just want to help.

But yes, http stuff.

And it has a Connect trait which is...

pub trait Connect: Sealed + Sized { }

...not very helpful.

Cool bear

No. But if you bothered to read the docs, you'd realize you're not supposed to implement it directly: instead you should implement tower::Service<Uri>.

Oh boy, here we go. How about I don't implement it at all? Huh? How's that.

Cool bear

Sure, you don't need to!

# let's just switch back to stable...
$ rm rust-toolchain.toml

$ cargo add hyper --features full
(cut)
use hyper::Client;

#[tokio::main]
async fn main() {
    let client = Client::new();
    let uri = "http://example.org".parse().unwrap();
    let res = client.get(uri).await.unwrap();
    let body = hyper::body::to_bytes(res.into_body()).await.unwrap();
    let body = std::str::from_utf8(&body).unwrap();
    println!("{}...", &body[..128]);
}
$ cargo run -q
<!doctype html>
<html>
<head>
    <title>Example Domain</title>

    <meta charset="utf-8" />
    <meta http-equiv="Content-type...

Ah, well, that's good. Cause I'm done with gnarly traits. Only simple code from now on.

Cool bear

And you're absolutely entitled to that. So that's for a simple plaintext HTTP request over TCP, but did you know you can do HTTP over other types of sockets?

Unix sockets for instance!

Unix sock... oh like the Docker daemon?

Cool bear

Exactly like the Docker daemon!

$ cargo add hyperlocal
(cut)

$ cargo add serde_json
(cut)
use hyper::{Body, Client};
use hyperlocal::UnixConnector;

#[tokio::main]
async fn main() {
    let client = Client::builder().build::<_, Body>(UnixConnector);
    let uri = hyperlocal::Uri::new("/var/run/docker.sock", "/v1.41/info").into();
    let res = client.get(uri).await.unwrap();
    let body = hyper::body::to_bytes(res.into_body()).await.unwrap();
    let value: serde_json::Value = serde_json::from_slice(&body).unwrap();
    println!("operating system: {}", value["OperatingSystem"]);
}
$ cargo run -q
operating system: "Ubuntu 22.04 LTS"

Whoa wait, serde_json? Are we doing useful stuff again?

Cool bear

Just for a bit.

So, making a request like that involves a bunch of operations, right?

Yeah it does! Let's take a look with strace, since apparently that's fair game in this monstrous article:

$ cargo build -q && strace -ff ./target/debug/grr 2>&1 | grep -vE 'futex|mmap|munmap|madvise|mprotect|sigalt|sigproc|prctl' | grep connect -A 20
[pid 1943976] connect(9, {sa_family=AF_UNIX, sun_path="/var/run/docker.sock"}, 23 <unfinished ...>
[pid 1943976] <... connect resumed>)    = 0
[pid 1943976] epoll_ctl(5, EPOLL_CTL_ADD, 9, {events=EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, data={u32=1, u64=1}} <unfinished ...>
[pid 1943976] <... epoll_ctl resumed>)  = 0
[pid 1944006] sched_getaffinity(1944006, 32,  <unfinished ...>
[pid 1943977] <... epoll_wait resumed>[{events=EPOLLOUT, data={u32=1, u64=1}}], 1024, -1) = 1
[pid 1944006] <... sched_getaffinity resumed>[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29]) = 8
[pid 1943976] getsockopt(9, SOL_SOCKET, SO_ERROR,  <unfinished ...>
[pid 1943976] <... getsockopt resumed>[0], [4]) = 0
[pid 1943977] epoll_wait(3,  <unfinished ...>
[pid 1944006] write(9, "GET /v1.41/info HTTP/1.1\r\nhost: "..., 78 <unfinished ...>
[pid 1944006] <... write resumed>)      = 78
[pid 1943977] <... epoll_wait resumed>[{events=EPOLLOUT, data={u32=1, u64=1}}], 1024, -1) = 1
[pid 1943977] epoll_wait(3, [{events=EPOLLIN|EPOLLOUT, data={u32=1, u64=1}}], 1024, -1) = 1
[pid 1943977] recvfrom(9, "HTTP/1.1 200 OK\r\nApi-Version: 1."..., 8192, 0, NULL, NULL) = 2536
[pid 1943977] epoll_wait(3,  <unfinished ...>
[pid 1944005] write(4, "\1\0\0\0\0\0\0\0", 8) = 8
[pid 1943977] <... epoll_wait resumed>[{events=EPOLLIN, data={u32=2147483648, u64=2147483648}}], 1024, -1) = 1
[pid 1944005] recvfrom(9,  <unfinished ...>
[pid 1943977] epoll_wait(3,  <unfinished ...>
[pid 1944005] <... recvfrom resumed>0x7f6f84000d00, 8192, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
[pid 1943976] write(4, "\1\0\0\0\0\0\0\0", 8 <unfinished ...>
Cool bear

And those operations are different for TCP sockets and Unix sockets?

I would imagine so, yes.

Cool bear

Well, that work is done respectively by the HttpConnector and UnixConnector structs.

I see. And, wait... waitwaitwait. Connecting to a socket is an asynchronous operation too, right?

I know for TCP it involves sending a SYN, getting back a SYN-ACK, then sending an ACK, that all happens over the network, you probably don't want to block on that, right?

Cool bear

Right!

But Connect is a trait though. I thought you couldn't have async trait methods?

Cool bear

Ah, well, it's time to gaze upon... the tower Service trait.

pub trait Service<Request> {
    type Response;
    type Error;
    type Future: Future<Output = Result<Self::Response, Self::Error>>;

    fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>>;
    fn call(&mut self, req: Request) -> Self::Future;
}

I see. Three associated types: Response, Error, and Future. And I see... Future is not generic over any lifetime, which means... call can't borrow from self. Ah and it takes ownership of Request!
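To make the "can't borrow from self" part concrete, here's a std-only sketch. MiniService is a made-up, cut-down stand-in for tower's Service (no poll_ready), and Greeter, NoopWake and block_on are hypothetical helpers that only exist for this example:

```rust
use std::convert::Infallible;
use std::future::{ready, Future, Ready};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Cut-down stand-in for tower's `Service` (assumption: no `poll_ready`).
trait MiniService<Request> {
    type Response;
    type Error;
    type Future: Future<Output = Result<Self::Response, Self::Error>>;

    fn call(&mut self, req: Request) -> Self::Future;
}

/// Hypothetical service that greets whoever it is given.
struct Greeter {
    greeting: String,
}

impl MiniService<String> for Greeter {
    type Response = String;
    type Error = Infallible;
    // No lifetime parameter anywhere: the future has to own its data.
    type Future = Ready<Result<String, Infallible>>;

    fn call(&mut self, req: String) -> Self::Future {
        // `self` is only borrowed while the `String` is being built;
        // the returned future owns the finished value.
        ready(Ok(format!("{}, {}!", self.greeting, req)))
    }
}

/// Waker that does nothing, just enough to poll a future by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Tiny busy-polling executor, fine for futures that are ready right away.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```

Because Self::Future has nowhere to put a lifetime, a future holding on to `&self.greeting` wouldn't fit: whatever the future needs has to be built or cloned before call returns.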

And then there's poll_ready, which uhh...

Cool bear

That's just for backpressure. It's pretty clever, but not super relevant here.

In fact, if we look at the implementation for hyperlocal::UnixConnector:

// somewhere in hyperlocal's source code

impl Service<Uri> for UnixConnector {
    type Response = UnixStream;
    type Error = std::io::Error;
    type Future = BoxFuture<'static, Result<Self::Response, Self::Error>>;
    fn call(&mut self, req: Uri) -> Self::Future {
        let fut = async move {
            let path = parse_socket_path(req)?;
            UnixStream::connect(path).await
        };

        Box::pin(fut)
    }
    fn poll_ready(&mut self, _cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {
        Poll::Ready(Ok(()))
    }
}

Ah, it's not using that capacity at all, just returning Ready immediately.

Cool bear

Okay, here comes the exercise. Ready?

Hit me.

Cool bear

How do we make a hyper connector that can connect over both TCP sockets and Unix sockets?

Ah, well. I suppose we better make our own connector type then.

Something like... this?

use std::{future::Future, pin::Pin};

use hyper::{client::HttpConnector, service::Service, Body, Client, Uri};
use hyperlocal::UnixConnector;

struct SuperConnector {
    tcp: HttpConnector,
    unix: UnixConnector,
}

impl Default for SuperConnector {
    fn default() -> Self {
        Self {
            tcp: HttpConnector::new(),
            unix: Default::default(),
        }
    }
}

impl Service<Uri> for SuperConnector {
    type Response = ();
    type Error = ();
    type Future = Pin<Box<dyn Future<Output = Result<Self::Response, Self::Error>>>>;

    fn poll_ready(
        &mut self,
        cx: &mut std::task::Context<'_>,
    ) -> std::task::Poll<Result<(), Self::Error>> {
        todo!()
    }

    fn call(&mut self, req: Uri) -> Self::Future {
        todo!()
    }
}

#[tokio::main]
async fn main() {
    let client = Client::builder().build::<_, Body>(SuperConnector::default());
    let uri = hyperlocal::Uri::new("/var/run/docker.sock", "/v1.41/info").into();
    let res = client.get(uri).await.unwrap();
    let body = hyper::body::to_bytes(res.into_body()).await.unwrap();
    let value: serde_json::Value = serde_json::from_slice(&body).unwrap();
    println!("operating system: {}", value["OperatingSystem"]);
}
Cool bear

I see, I see. So you haven't decided on a Response / Error type yet, that's fine. And you're boxing the future?

Yeah, it's the easy way out, but that's what the async-trait crate does, so it seems like a safe bet.

Besides, I suppose HttpConnector and UnixConnector return incompatible futures, right? So we'd have the same problem we did before, wayyyy back, with code like that:

fn get_char_or_int(give_char: bool) -> impl Display {
    if give_char {
        'C'
    } else {
        64
    }
}
Cool bear

...yes actually yes, that was the whole motivation for the article, now that I think of it.

Now that you think? Nuh-huh. You don't think. I write you.

Cool bear

Well... maybe it started out this way, but look at us now. Who will the people remember?

...let's get back to the code shall we.

So anyway my temporary code doesn't even compile:

$ cargo check -q
error[E0277]: the trait bound `SuperConnector: Clone` is not satisfied
    --> src/main.rs:39:53
     |
39   |     let client = Client::builder().build::<_, Body>(SuperConnector::default());
     |                                    -----            ^^^^^^^^^^^^^^^^^^^^^^^^^ the trait `Clone` is not implemented for `SuperConnector`
     |                                    |
     |                                    required by a bound introduced by this call
     |
note: required by a bound in `hyper::client::Builder::build`
    --> /home/amos/.cargo/registry/src/github.com-1ecc6299db9ec823/hyper-0.14.19/src/client/client.rs:1336:22
     |
1336 |         C: Connect + Clone,
     |                      ^^^^^ required by this bound in `hyper::client::Builder::build`
Cool bear

Oh yeah you need it to be Clone. Both connectors you're wrapping are bound to be Clone already, so you can just derive it, probably.

Alrighty then:

#[derive(Clone)]
struct SuperConnector {
    tcp: HttpConnector,
    unix: UnixConnector,
}

Okay... now it complains that () doesn't implement AsyncRead, AsyncWrite, or hyper::client::connect::Connection. Also, our Future type isn't Send + 'static, and it has to be.

That one's an easy fix:

    type Future =
        Pin<Box<dyn Future<Output = Result<Self::Response, Self::Error>> + Send + 'static>>;

There. As for the AsyncRead / AsyncWrite / Connection stuff, well...

Cool bear

Right. That's where it gets awkward.

Oh? Can't we just use boxed trait objects here too?

Cool bear

Well no, because you've got three traits.

So? We've clearly done, for example, Box<dyn T + Send + 'static> before.

Cool bear

Yes, but Send is a marker trait (it doesn't actually have any methods), and 'static is just a lifetime bound, not a trait.

So you mean to tell me that if I did this:

    type Response = Pin<Box<dyn AsyncRead + AsyncWrite + Connection>>;

It wouldn't w-

$ cargo check -q
error[E0225]: only auto traits can be used as additional traits in a trait object
  --> src/main.rs:27:45
   |
27 |     type Response = Pin<Box<dyn AsyncRead + AsyncWrite + Connection>>;
   |                                 ---------   ^^^^^^^^^^ additional non-auto trait
   |                                 |
   |                                 first non-auto trait
   |
   = help: consider creating a new trait with all of these as supertraits and using that trait here instead: `trait NewTrait: AsyncRead + AsyncWrite + hyper::client::connect::Connection {}`
   = note: auto-traits like `Send` and `Sync` are traits that have special properties; for more information on them, visit <https://doc.rust-lang.org/reference/special-types-and-traits.html#auto-traits>

For more information about this error, try `rustc --explain E0225`.
error: could not compile `grr` due to previous error

Oh.

Cool bear

Can you see why?

Well the diagnostic is pretty fantastic here, game recognize game. But also uhh... oh is it a vtable thing?

Cool bear

Yes it is! Trait objects are two pointers: data + vtable. One vtable. Not three.

Ahhh hence the advice to make a new trait instead? Which would create a new super-vtable that contains the vtables for those three traits?

You know what, don't say a thing, I'm trying it.

Cool bear

That's the spir-

NOT A THING.

trait SuperConnection: AsyncRead + AsyncWrite + Connection {}

impl Service<Uri> for SuperConnector {
    type Response = Pin<Box<dyn SuperConnection>>;

    // etc.
}
$ cargo check -q
error[E0277]: the trait bound `Pin<Box<(dyn SuperConnection + 'static)>>: hyper::client::connect::Connection` is not satisfied
    --> src/main.rs:48:53
     |
48   |     let client = Client::builder().build::<_, Body>(SuperConnector::default());
     |                                    -----            ^^^^^^^^^^^^^^^^^^^^^^^^^ the trait `hyper::client::connect::Connection` is not implemented for `Pin<Box<(dyn SuperConnection + 'static)>>`
     |                                    |
     |                                    required by a bound introduced by this call
     |
     = note: required because of the requirements on the impl of `hyper::client::connect::Connect` for `SuperConnector`
note: required by a bound in `hyper::client::Builder::build`
    --> /home/amos/.cargo/registry/src/github.com-1ecc6299db9ec823/hyper-0.14.19/src/client/client.rs:1336:12
     |
1336 |         C: Connect + Clone,
     |            ^^^^^^^ required by this bound in `hyper::client::Builder::build`

Wait, what, why.

Cool bear

Well, you're boxing it! T where T: SuperConnection implements Connection, but Box<dyn SuperConnection> might not!

And why do we not have that error with AsyncRead and AsyncWrite?

Cool bear

Because there's blanket impls, see:

// somewhere in tokio's source code

macro_rules! deref_async_read {
    () => {
        fn poll_read(
            mut self: Pin<&mut Self>,
            cx: &mut Context<'_>,
            buf: &mut ReadBuf<'_>,
        ) -> Poll<io::Result<()>> {
            Pin::new(&mut **self).poll_read(cx, buf)
        }
    };
}

impl<T: ?Sized + AsyncRead + Unpin> AsyncRead for Box<T> {
    deref_async_read!();
}

impl<T: ?Sized + AsyncRead + Unpin> AsyncRead for &mut T {
    deref_async_read!();
}

Ah, and there's no blanket impl<T> Connection for Box<T> where T: Connection?

Cool bear

Apparently not!

Okay, let's hope orphan rules don't get in the way...

impl Connection for Pin<Box<dyn SuperConnection>> {
    fn connected(&self) -> hyper::client::connect::Connected {
        (**self).connected()
    }
}

...it's not complaining yet, let's keep going.
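The same "the box is not the trait" situation can be reproduced without hyper at all. This is only a sketch: Describe, Tcp, Unix, label and pick are made-up names, with Describe playing the role of Connection:

```rust
trait Describe {
    fn describe(&self) -> String;
}

struct Tcp;
struct Unix;

impl Describe for Tcp {
    fn describe(&self) -> String {
        "tcp".to_string()
    }
}

impl Describe for Unix {
    fn describe(&self) -> String {
        "unix".to_string()
    }
}

// Generic code that demands `T: Describe`, like hyper demands `T: Connection`.
fn label<T: Describe>(conn: T) -> String {
    format!("connected over {}", conn.describe())
}

// Without this impl, `label(pick(true))` is rejected with E0277.
// `?Sized` is what lets `T` be `dyn Describe` itself.
impl<T: Describe + ?Sized> Describe for Box<T> {
    fn describe(&self) -> String {
        // `&Box<T>` -> `Box<T>` -> `T`, then forward the call.
        // Plain `self.describe()` would call this very impl again, forever.
        (**self).describe()
    }
}

// Two different concrete types behind one boxed trait object.
fn pick(use_unix: bool) -> Box<dyn Describe> {
    if use_unix {
        Box::new(Unix)
    } else {
        Box::new(Tcp)
    }
}
```

Here we own the trait, so a blanket impl over Box<T> is allowed. For a trait from someone else's crate, the impl has to mention a local type somewhere, which is why the version above is written for our own dyn SuperConnection.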

We need to pick an error type, and fill out our poll_ready and call methods.

Let's fucking goooooooooo:

impl Service<Uri> for SuperConnector {
    type Response = Pin<Box<dyn SuperConnection>>;
    type Error = Box<dyn std::error::Error + Send>;
    type Future =
        Pin<Box<dyn Future<Output = Result<Self::Response, Self::Error>> + Send + 'static>>;

    fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {
        match (self.tcp.poll_ready(cx), self.unix.poll_ready(cx)) {
            (Poll::Pending, _) | (_, Poll::Pending) => Poll::Pending,
            _ => Ok(()).into(),
        }
    }

    fn call(&mut self, req: Uri) -> Self::Future {
        match req.scheme_str().unwrap_or_default() {
            "unix" => {
                let fut = self.unix.call(req);
                Box::pin(async move {
                    match fut.await {
                        Ok(conn) => Ok(Box::pin(conn)),
                        Err(e) => Err(Box::new(e)),
                    }
                })
            }
            _ => {
                let fut = self.tcp.call(req);
                Box::pin(async move {
                    match fut.await {
                        Ok(conn) => Ok(Box::pin(conn)),
                        Err(e) => Err(Box::new(e)),
                    }
                })
            }
        }
    }
}

So I'm looking at this in vscode, and it's very red.

I think we may have forgotten something...

Cool bear

Ah yes! The composition trait here:

trait SuperConnection: AsyncRead + AsyncWrite + Connection {}
Cool bear

You're missing half of it. Nothing implements this supertrait right now.

Ohhh because there's types that implement AsyncRead, AsyncWrite and Connection, but they also have to implement SuperConnection itself. The other three are just prerequisites?

Cool bear

They're just supertraits, yeah. Anyway this is the part you're missing:

impl<T> SuperConnection for T where T: AsyncRead + AsyncWrite + Connection {}

Ah, a beautiful blanket impl.
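The supertrait-plus-blanket-impl combo works the same way with std traits only. A minimal sketch, where Stream, in_memory_stream and echo are hypothetical names and Read / Write / Seek stand in for the three hyper and tokio traits:

```rust
use std::io::{Cursor, Read, Seek, SeekFrom, Write};

// One trait, one vtable, three prerequisites.
trait Stream: Read + Write + Seek {}

// Anything that already is `Read + Write + Seek` gets `Stream` for free.
impl<T: Read + Write + Seek> Stream for T {}

// `Box<dyn Read + Write + Seek>` would be rejected with E0225;
// `Box<dyn Stream>` is fine.
fn in_memory_stream() -> Box<dyn Stream> {
    Box::new(Cursor::new(Vec::new()))
}

// Supertrait methods are all callable through the one trait object.
fn echo(stream: &mut dyn Stream, msg: &str) -> std::io::Result<String> {
    stream.write_all(msg.as_bytes())?;
    stream.seek(SeekFrom::Start(0))?;
    let mut out = String::new();
    stream.read_to_string(&mut out)?;
    Ok(out)
}
```

Drop the blanket impl and Box::new(Cursor::new(...)) stops compiling: Cursor would satisfy every prerequisite, but nothing would say it's a Stream.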

Okay, I'm working here, adding bounds left and right, here a Send, here a 'static, but I'm seeing some errors... some pretty bad errors here...

$ cargo check -q
error[E0271]: type mismatch resolving `<impl Future<Output = Result<Pin<Box<hyperlocal::client::UnixStream>>, Box<std::io::Error>>> as Future>::Output == Result<Pin<Box<(dyn SuperConnection + Send + 'static)>>, Box<(dyn std::error::Error + Send + 'static)>>`
  --> src/main.rs:56:17
   |
56 | /                 Box::pin(async move {
57 | |                     match fut.await {
58 | |                         Ok(conn) => Ok(Box::pin(conn)),
59 | |                         Err(e) => Err(Box::new(e)),
60 | |                     }
61 | |                 })
   | |__________________^ expected trait object `dyn SuperConnection`, found struct `hyperlocal::client::UnixStream`
   |
   = note: expected enum `Result<Pin<Box<(dyn SuperConnection + Send + 'static)>>, Box<(dyn std::error::Error + Send + 'static)>>`
              found enum `Result<Pin<Box<hyperlocal::client::UnixStream>>, Box<std::io::Error>>`
   = note: required for the cast to the object type `dyn Future<Output = Result<Pin<Box<(dyn SuperConnection + Send + 'static)>>, Box<(dyn std::error::Error + Send + 'static)>>> + Send`
Cool bear

Hahahahahahah yes. YES. Now you're doing it! One of us, one of us, one of u-

Bear, please. I'm crying. How do I get out of this one?

Cool bear

Ah, well, since we can't have type ascription, I guess just annotate harder:

    fn call(&mut self, req: Uri) -> Self::Future {
        match req.scheme_str().unwrap_or_default() {
            "unix" => {
                let fut = self.unix.call(req);
                Box::pin(async move {
                    match fut.await {
                        Ok(conn) => Ok::<Self::Response, _>(Box::pin(conn)),
                        Err(e) => Err::<_, Self::Error>(Box::new(e)),
                    }
                })
            }
            _ => {
                let fut = self.tcp.call(req);
                Box::pin(async move {
                    match fut.await {
                        Ok(conn) => Ok::<Self::Response, _>(Box::pin(conn)),
                        Err(e) => Err::<_, Self::Error>(Box::new(e)),
                    }
                })
            }
        }
    }

Oh. Well that's. I've never seen the turbofish in that position. But sure, fine...
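For the record, here's a std-only sketch of what that annotation buys us. Shown is a made-up alias, and closures stand in for async blocks, since both get their result type inferred from the body:

```rust
use std::fmt::Display;

type Shown = Box<dyn Display>;

// Annotating the closure's return type creates a coercion site:
// `Box<i32>` becomes `Box<dyn Display>` inside the closure body.
// Without `-> Shown`, the closure would return `Box<i32>` and the
// function's return type would not match.
fn via_closure_annotation(n: Option<i32>) -> Option<Shown> {
    n.map(|n| -> Shown { Box::new(n) })
}

// Same idea with a turbofish on `Ok`: its argument is now expected to be
// a `Shown`, so the coercion happens at the call.
fn via_turbofish(n: i32) -> Result<Shown, String> {
    let make = |n: i32| Ok::<Shown, String>(Box::new(n));
    make(n)
}
```

Either way, the trick is to tell the compiler about the trait object type before it settles on the concrete one.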

It still doesn't work, though:

$ cargo check -q
error[E0277]: the size for values of type `(dyn std::error::Error + Send + 'static)` cannot be known at compilation time
    --> src/main.rs:78:53
     |
78   |     let client = Client::builder().build::<_, Body>(SuperConnector::default());
     |                                    -----            ^^^^^^^^^^^^^^^^^^^^^^^^^ doesn't have a size known at compile-time
     |                                    |
     |                                    required by a bound introduced by this call
     |
     = help: the trait `Sized` is not implemented for `(dyn std::error::Error + Send + 'static)`
     = note: required because of the requirements on the impl of `std::error::Error` for `Box<(dyn std::error::Error + Send + 'static)>`
     = note: required because of the requirements on the impl of `From<Box<(dyn std::error::Error + Send + 'static)>>` for `Box<(dyn std::error::Error + Send + Sync + 'static)>`
     = note: required because of the requirements on the impl of `Into<Box<(dyn std::error::Error + Send + Sync + 'static)>>` for `Box<(dyn std::error::Error + Send + 'static)>`
     = note: required because of the requirements on the impl of `hyper::client::connect::Connect` for `SuperConnector`
note: required by a bound in `hyper::client::Builder::build`
    --> /home/amos/.cargo/registry/src/github.com-1ecc6299db9ec823/hyper-0.14.19/src/client/client.rs:1336:12
     |
1336 |         C: Connect + Clone,
     |            ^^^^^^^ required by this bound in `hyper::client::Builder::build`

How do you suggest we get out of this one, professor?

Cool bear

Oh that one is a red herring.

Remember: you don't have to understand why some type bounds are there, you merely have to make it fit.

In this case, the bound is here:

// deep in the bowels of hyper's source code, in a submodule because that's a
// sealed trait:

    impl<S, T> Connect for S
    where
        S: tower_service::Service<Uri, Response = T> + Send + 'static,
        //         👇
        S::Error: Into<Box<dyn StdError + Send + Sync>>,
        S::Future: Unpin + Send,
        T: AsyncRead + AsyncWrite + Connection + Unpin + Send + 'static,
    {
        type _Svc = S;

        fn connect(self, _: Internal, dst: Uri) -> crate::service::Oneshot<S, Uri> {
            crate::service::oneshot(self, dst)
        }
    }

Ohhhhhhhhhhh.

Cool bear

See that? Into<Box<dyn Error + Send + Sync>>. You know what implements Into<T>? T!

Ohhh...? I don't get it.

Cool bear

It's okay. What we have right now is Box<dyn Error + Send>. We're just missing the Sync bound.

Ahhhhhhhhhhhhhhh.

    type Error = Box<dyn std::error::Error + Send + Sync + 'static>;

IT TYPECHECKS. THIS IS NOT A DRILL.
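That "T implements Into<T>" business can be poked at with std alone. A small sketch, with BoxError, describe and already_boxed as hypothetical names, and the bound shaped like the one hyper asks for:

```rust
use std::error::Error;
use std::io;

type BoxError = Box<dyn Error + Send + Sync>;

// Same shape as the bound on `S::Error`: anything convertible into `BoxError`.
fn describe<E: Into<BoxError>>(err: E) -> String {
    let boxed: BoxError = err.into();
    boxed.to_string()
}

// Already the target type: `Into<T> for T` makes this a no-op conversion.
// A `Box<dyn Error + Send>` (no `Sync`) would be rejected by `describe`,
// which is the error we were staring at.
fn already_boxed() -> BoxError {
    Box::new(io::Error::new(io::ErrorKind::Other, "socket closed"))
}
```

Concrete error types and even plain string slices go through as well, thanks to the From impls std provides for Box<dyn Error + Send + Sync>.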

Cool bear

I love it when you go apeshit at the end of our articles.

Our artic-

Cool bear

But don't you want to golf down that impl a bit more? The implementations for poll_ready and call are pretty gnarly still...

Well sure, but how?

Cool bear

Let's bring in just one... more... crate.

$ cargo add futures
(cut)
Cool bear

And a well-placed macro...

use std::{
    pin::Pin,
    task::{Context, Poll},
};

use futures::{future::BoxFuture, FutureExt, TryFutureExt};
use hyper::{
    client::{connect::Connection, HttpConnector},
    service::Service,
    Body, Client, Uri,
};
use hyperlocal::UnixConnector;
use tokio::io::{AsyncRead, AsyncWrite};

#[derive(Clone)]
struct SuperConnector {
    tcp: HttpConnector,
    unix: UnixConnector,
}

impl Default for SuperConnector {
    fn default() -> Self {
        Self {
            tcp: HttpConnector::new(),
            unix: Default::default(),
        }
    }
}

trait SuperConnection: AsyncRead + AsyncWrite + Connection {}
impl<T> SuperConnection for T where T: AsyncRead + AsyncWrite + Connection {}

impl Connection for Pin<Box<dyn SuperConnection + Send + 'static>> {
    fn connected(&self) -> hyper::client::connect::Connected {
        (**self).connected()
    }
}

impl Service<Uri> for SuperConnector {
    type Response = Pin<Box<dyn SuperConnection + Send + 'static>>;
    type Error = Box<dyn std::error::Error + Send + Sync + 'static>;
    // `futures` provides a handy `BoxFuture<'a, T>` alias
    type Future = BoxFuture<'static, Result<Self::Response, Self::Error>>;

    fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {
        // that macro propagates `Poll::Pending`, like `?` propagates `Result::Err`
        futures::ready!(self.tcp.poll_ready(cx))?;
        futures::ready!(self.unix.poll_ready(cx))?;
        Ok(()).into()
    }

    fn call(&mut self, req: Uri) -> Self::Future {
        // keep it DRY (don't repeat yourself) with a macro...
        macro_rules! forward {
            ($target:expr) => {
                $target
                    .call(req)
                    // these are from Future extension traits provided by `futures`
                    // they map `Future->Future`, not `Result->Result`
                    .map_ok(|c| -> Self::Response { Box::pin(c) })
                    // oh yeah by the way, closure syntax accepts `-> T` to annotate
                    // the return type, that's load-bearing here.
                    .map_err(|e| -> Self::Error { Box::new(e) })
                    // also an extension trait: `fut.boxed()` => `Box::pin(fut) as BoxFuture<_>`
                    .boxed()
            };
        }

        // much cleaner:
        match req.scheme_str().unwrap_or_default() {
            "unix" => forward!(self.unix),
            _ => forward!(self.tcp),
        }
    }
}
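If the ready!-then-? dance looks like magic, here's a std-only sketch of it. It assumes std::task::ready! (which, as far as I can tell, does the same job as the futures one), and Backend, Both, NoopWake and poll_once are made up for the demo:

```rust
use std::sync::Arc;
use std::task::{ready, Context, Poll, Wake, Waker};

/// Hypothetical backend whose state we can flip by hand.
struct Backend {
    is_ready: bool,
    broken: bool,
}

impl Backend {
    fn poll_ready(&mut self, _cx: &mut Context<'_>) -> Poll<Result<(), String>> {
        if self.broken {
            Poll::Ready(Err("backend is down".to_string()))
        } else if self.is_ready {
            Poll::Ready(Ok(()))
        } else {
            Poll::Pending
        }
    }
}

struct Both {
    tcp: Backend,
    unix: Backend,
}

impl Both {
    fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), String>> {
        // `ready!` returns early on `Pending`, `?` returns early on `Err`.
        ready!(self.tcp.poll_ready(cx))?;
        ready!(self.unix.poll_ready(cx))?;
        Poll::Ready(Ok(()))
    }
}

/// Waker that does nothing, just enough to build a `Context`.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn poll_once(both: &mut Both) -> Poll<Result<(), String>> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    both.poll_ready(&mut cx)
}
```

A nice side effect compared to matching on Pending alone: a Ready(Err(..)) from either side actually gets reported instead of being turned into Ready(Ok(())).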

Well, I guess there's just one thing left to do: actually use it.

#[tokio::main]
async fn main() {
    let client = Client::builder().build::<_, Body>(SuperConnector::default());

    {
        let uri = hyperlocal::Uri::new("/var/run/docker.sock", "/v1.41/info").into();
        let res = client.get(uri).await.unwrap();
        let body = hyper::body::to_bytes(res.into_body()).await.unwrap();
        let value: serde_json::Value = serde_json::from_slice(&body).unwrap();
        println!("operating system: {}", value["OperatingSystem"]);
    }

    {
        let uri = "http://example.org".parse().unwrap();
        let res = client.get(uri).await.unwrap();
        let body = hyper::body::to_bytes(res.into_body()).await.unwrap();
        let body = std::str::from_utf8(&body).unwrap();
        println!("start of dom: {}", &body[..128]);
    }
}
$ cargo run -q
operating system: "Ubuntu 22.04 LTS"
start of dom: <!doctype html>
<html>
<head>
    <title>Example Domain</title>

    <meta charset="utf-8" />
    <meta http-equiv="Content-type

Wonderful.

Say, bear, did we just accidentally write a book's worth of material about the Rust type system?

Cool bear

It would appear so, yes. But there's one thing we haven't covered yet.

Oh no. No no no I was just asking out of curios-

Higher-ranked trait bounds

FUCK. Someone stop that bear.

Cool bear

Consider the following trait:

trait Transform<'a> {
    fn apply(&self, slice: &'a mut [u8]);
}

I WILL NOT. I WILL NOT CONSIDER THE PRECEDING TRAIT.

Cool bear

Consider how you'd use it:

fn apply_transform<T>(slice: &mut [u8], transform: T)
where
    T: Transform,
{
    transform.apply(slice);
}

I NO LONGER CARE, I HAVE MENTALLY CHECKED OUT FROM THIS ARTICLE. YOU CANNOT MAKE ME CARE.

$ cargo check -q
error[E0106]: missing lifetime specifier
 --> src/main.rs:9:8
  |
9 |     T: Transform,
  |        ^^^^^^^^^ expected named lifetime parameter
  |
help: consider introducing a named lifetime parameter
  |
7 ~ fn apply_transform<'a, T>(slice: &mut [u8], transform: T)
8 | where
9 ~     T: Transform<'a>,
  |

For more information about this error, try `rustc --explain E0106`.
error: could not compile `grr` due to previous error
Cool bear

As you can see,

I CANNOT SEE

Cool bear

...this doesn't compile. The Rust compiler wants us to specify a lifetime. But which should it be?

deep sigh

It should be... generic.

Cool bear

AhAH! Can you show me?

Sssure, here:

fn apply_transform<'a, T>(slice: &mut [u8], transform: T)
where
    //           👇
    T: Transform<'a>,
{
    transform.apply(slice);
}
$ cargo check -q
error[E0621]: explicit lifetime required in the type of `slice`
  --> src/main.rs:11:21
   |
7  | fn apply_transform<'a, T>(slice: &mut [u8], transform: T)
   |                                  --------- help: add explicit lifetime `'a` to the type of `slice`: `&'a mut [u8]`
...
11 |     transform.apply(slice);
   |                     ^^^^^ lifetime `'a` required

For more information about this error, try `rustc --explain E0621`.
error: could not compile `grr` due to previous error

Fuck. Hold on.

//                                👇
fn apply_transform<'a, T>(slice: &'a mut [u8], transform: T)
where
    T: Transform<'a>,
{
    transform.apply(slice);
}
$ cargo check -q
error[E0309]: the parameter type `T` may not live long enough
  --> src/main.rs:11:5
   |
7  | fn apply_transform<'a, T>(slice: &'a mut [u8], transform: T)
   |                        - help: consider adding an explicit lifetime bound...: `T: 'a`
...
11 |     transform.apply(slice);
   |     ^^^^^^^^^ ...so that the type `T` is not borrowed for too long

error[E0309]: the parameter type `T` may not live long enough

AhhhhhhhhhhhhHHHHH

fn apply_transform<'a, T>(slice: &'a mut [u8], transform: T)
where
    //                👇
    T: Transform<'a> + 'a,
{
    transform.apply(slice);
}
$ cargo check -q
error[E0597]: `transform` does not live long enough
  --> src/main.rs:11:5
   |
7  | fn apply_transform<'a, T>(slice: &'a mut [u8], transform: T)
   |                    -- lifetime `'a` defined here
...
11 |     transform.apply(slice);
   |     ^^^^^^^^^^^^^^^^^^^^^^
   |     |
   |     borrowed value does not live long enough
   |     argument requires that `transform` is borrowed for `'a`
12 | }
   | - `transform` dropped here while still borrowed

For more information about this error, try `rustc --explain E0597`.
error: could not compile `grr` due to previous error

AHHHHH NOTHING WORKS.

Cool bear

Yes, yes haha, nothing works indeed. Well that's what you get for glossing over lifetimes earlier.

Okay well, what do you suggest?

Cool bear

Well, the problem is that we're conflating the lifetimes of many different things.

Because we have a single lifetime name, 'a, we need all of these to outlive 'a:

  • the &mut [u8] slice
  • transform itself
  • the borrow of transform we need to call apply

It's clearer if we do the auto-ref ourselves:

fn apply_transform<'a, T>(slice: &'a mut [u8], transform: T)
where
    T: Transform<'a> + 'a,
{
    let borrowed_transform = &transform;
    borrowed_transform.apply(slice);
    drop(transform);
}
Cool bear

The signature of Transform::apply requires self to be borrowed for as long as the slice. And that can't be true, since we need to drop transform before we drop the slice itself.

What do you suggest then? Borrowing transform too?

Cool bear

Sure, that'd work:

fn apply_transform<'a, T>(slice: &'a mut [u8], transform: &'a dyn Transform<'a>) {
    transform.apply(slice);
}
Cool bear

But that's not the problem statement. We can fix the original code, with HRTB: higher-ranked trait bounds.

We don't want Transform<'a> to be implemented by T for a specific lifetime 'a. We want it to be implemented for any lifetime.

And here's the syntax that makes the magic happen:

fn apply_transform<T>(slice: &mut [u8], transform: T)
where
    T: for<'a> Transform<'a>,
{
    transform.apply(slice);
}

Oh, that. That wasn't nearly as scary as I had anticipated. That's it?
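For completeness, here's a sketch of the whole thing with something that actually implements the trait. Fill is a made-up transform; the rest is the code from above:

```rust
trait Transform<'a> {
    fn apply(&self, slice: &'a mut [u8]);
}

/// Hypothetical transform that overwrites every byte with a fixed value.
struct Fill(u8);

// Implemented for *every* lifetime `'a`, which is exactly what
// `for<'a> Transform<'a>` asks for.
impl<'a> Transform<'a> for Fill {
    fn apply(&self, slice: &'a mut [u8]) {
        for byte in slice.iter_mut() {
            *byte = self.0;
        }
    }
}

fn apply_transform<T>(slice: &mut [u8], transform: T)
where
    T: for<'a> Transform<'a>,
{
    // The compiler picks a short lifetime for this one call.
    transform.apply(slice);
}
```

The caller never names a lifetime: apply_transform(&mut buf, Fill(9)) on a three-byte buffer leaves it full of nines.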

Cool bear

Well, also, it's one of those features that you probably don't need as much as you think you do.

Meaning?

Cool bear

Meaning our trait is kinda odd to begin with. There's no reason self and slice should be borrowed for the same lifetime.

If we just get rid of all our lifetime annotations, things work just as well:

trait Transform {
    fn apply(&self, slice: &mut [u8]);
}

fn apply_transform_thrice<T>(slice: &mut [u8], transform: T)
where
    T: Transform,
{
    transform.apply(slice);
    transform.apply(slice);
    transform.apply(slice);
}

Oh.

But surely it's useful in some instances, right?

Cool bear

Why yes! Consider the following:

Oh not agai-

trait Transform<T> {
    fn apply(&self, target: T);
}
Cool bear

Now, Transform is generic over the type T. How do we use it?

Well... just like before, except with one more bound I guess:

fn apply_transform<T>(slice: &mut [u8], transform: T)
where
    T: Transform<&mut [u8]>,
{
    transform.apply(slice);
}
Cool bear

Ah yes! Except, no.

$ cargo check -q
error[E0637]: `&` without an explicit lifetime name cannot be used here
 --> src/main.rs:9:18
  |
9 |     T: Transform<&mut [u8]>,
  |                  ^ explicit lifetime name needed here

error[E0312]: lifetime of reference outlives lifetime of borrowed content...
  --> src/main.rs:11:21
   |
11 |     transform.apply(slice);
   |                     ^^^^^
   |
   = note: ...the reference is valid for the static lifetime...
note: ...but the borrowed content is only valid for the anonymous lifetime defined here
  --> src/main.rs:7:30
   |
7  | fn apply_transform<T>(slice: &mut [u8], transform: T)
   |                              ^^^^^^^^^

Some errors have detailed explanations: E0312, E0637.
For more information about an error, try `rustc --explain E0312`.
error: could not compile `grr` due to 2 previous errors

Ah, more generics then?

fn apply_transform<'a, T>(slice: &'a mut [u8], transform: T)
where
    T: Transform<&'a mut [u8]>,
{
    transform.apply(slice);
}
Cool bear

That does work! Now turn it into apply_transform_thrice again...

fn apply_transform_thrice<'a, T>(slice: &'a mut [u8], transform: T)
where
    T: Transform<&'a mut [u8]>,
{
    transform.apply(slice);
    transform.apply(slice);
    transform.apply(slice);
}
$ cargo check -q
error[E0499]: cannot borrow `*slice` as mutable more than once at a time
  --> src/main.rs:12:21
   |
7  | fn apply_transform_thrice<'a, T>(slice: &'a mut [u8], transform: T)
   |                           -- lifetime `'a` defined here
...
11 |     transform.apply(slice);
   |     ----------------------
   |     |               |
   |     |               first mutable borrow occurs here
   |     argument requires that `*slice` is borrowed for `'a`
12 |     transform.apply(slice);
   |                     ^^^^^ second mutable borrow occurs here

error[E0499]: cannot borrow `*slice` as mutable more than once at a time
  --> src/main.rs:13:21
   |
7  | fn apply_transform_thrice<'a, T>(slice: &'a mut [u8], transform: T)
   |                           -- lifetime `'a` defined here
...
11 |     transform.apply(slice);
   |     ----------------------
   |     |               |
   |     |               first mutable borrow occurs here
   |     argument requires that `*slice` is borrowed for `'a`
12 |     transform.apply(slice);
13 |     transform.apply(slice);
   |                     ^^^^^ second mutable borrow occurs here

For more information about this error, try `rustc --explain E0499`.
error: could not compile `grr` due to 2 previous errors

Oh hell. You sly bear. That was your plan all along, wasn't it?

Cool bear

Hahahahahahahahha yes. Do you know how to get out of that one?

...yes I do. I suppose it worked when we called it once because... the slice parameter to apply_transform could have the same lifetime as the parameter to Transform::apply. But now we call it three times, so the lifetime of the parameter to Transform::apply has to be smaller.

Three times smaller in fact.

Cool bear

Well that's not how... lifetimes don't really have sizes you can measure, but sure, yeah, that's the gist.

And that's where HRTB (higher-ranked trait bounds) come in, don't they.

fn apply_transform_thrice<T>(slice: &mut [u8], transform: T)
where
    T: for<'a> Transform<&'a mut [u8]>,
{
    transform.apply(slice);
    transform.apply(slice);
    transform.apply(slice);
}

Ah heck. This typechecks.
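And to prove it does more than typecheck, a sketch with a made-up Increment transform plugged in:

```rust
trait Transform<T> {
    fn apply(&self, target: T);
}

/// Hypothetical transform that bumps every byte by one, wrapping at 255.
struct Increment;

// One impl per lifetime `'a`, i.e. an impl for all of them.
impl<'a> Transform<&'a mut [u8]> for Increment {
    fn apply(&self, target: &'a mut [u8]) {
        for byte in target.iter_mut() {
            *byte = byte.wrapping_add(1);
        }
    }
}

fn apply_transform_thrice<T>(slice: &mut [u8], transform: T)
where
    T: for<'a> Transform<&'a mut [u8]>,
{
    // Each call reborrows `slice` for its own, shorter lifetime.
    transform.apply(slice);
    transform.apply(slice);
    transform.apply(slice);
}
```

Three calls, three separate reborrows, none of them pinned to the lifetime of the slice parameter itself.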

I was all out of learning juice and you still managed to sneak one in.

Cool bear

😎😎😎

Afterword

It's me, regular Amos. I know Rust again. I feel like we need some aftercare debriefing after going through all this. Are you okay? We have juice and cookies if you want.

Congratulations on reaching the end by the way! I'm guessing you're not using Mobile Safari, or else it would've already crashed.

I don't want any of this to scare you.

Like Bear and I said, it's really just about making the pieces fit. Sometimes the shapes of the pieces (the types) prevent you from making GRAVE mistakes (like data races, or accessing the Ok variant of a Result that's actually an Err), sometimes they're there because... that's the best we got.

Most of the time, you're playing with someone else's toy pieces: they've already determined what shapes make sense, and you can let yourself be guided by compiler diagnostics, which are fantastic most of the time, and then rapidly degrade as you delve deeper into async land or try to generally, uh, "get smart".

But you don't have to get smart. Keep in mind the escape hatches. Struggling with lifetimes? Clone it! Can't clone it? Arc<T> it! You can even Arc<Mutex<T>> it, if you really need to mutate it.
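
Here's roughly what those three escape hatches look like side by side. This is only a sketch using the standard library; the function names (shout, total_len_across_threads, count_to) are made up for the occasion:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Escape hatch 1: clone it. We get our own copy and stop caring
/// about how long the original borrow lives.
fn shout(words: &[String]) -> Vec<String> {
    let mut owned = words.to_vec(); // clones every String
    for word in &mut owned {
        word.make_ascii_uppercase();
    }
    owned
}

/// Escape hatch 2: Arc<T> it. Several threads share read-only
/// access to one value, no lifetimes involved.
fn total_len_across_threads(words: Vec<String>) -> usize {
    let shared = Arc::new(words);
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let shared = Arc::clone(&shared); // bumps a refcount, no deep copy
            thread::spawn(move || shared.iter().map(|w| w.len()).sum::<usize>())
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}

/// Escape hatch 3: Arc<Mutex<T>> it, when the shared value
/// also needs to be mutated.
fn count_to(threads: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                *counter.lock().unwrap() += 1;
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}
```

None of these are free (clones copy, Arc counts references, Mutex locks), but they're usually cheap enough to get you unstuck, and you can always come back and get smart later.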

Need to put a bunch of heterogeneous types together in the same collection? Or return them from a single function? Just box them!
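
A minimal sketch of both cases, with a hypothetical Shape trait and two made-up types implementing it:

```rust
/// Hypothetical trait, just so we have something to box.
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Rect(f64, f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Rect {
    fn area(&self) -> f64 {
        self.0 * self.1
    }
}

/// Returning different concrete types from a single function:
/// both branches are erased to `Box<dyn Shape>`.
fn make_shape(square: bool) -> Box<dyn Shape> {
    if square {
        Box::new(Square(2.0))
    } else {
        Box::new(Rect(2.0, 3.0))
    }
}

/// A collection of different concrete types behind one element type.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|shape| shape.area()).sum()
}
```

You'd build the collection with something like `let shapes: Vec<Box<dyn Shape>> = vec![make_shape(true), make_shape(false)];`. The price is a heap allocation and dynamic dispatch per item, which is almost always fine.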

It gets harder with complex traits and associated types, but in this article, we've covered literally the worst case I've ever seen. The other cases are just variations on a theme, with additional bounds, which you can solve one by one.

There's a lot to Rust we haven't covered here — this is by no means a comprehensive book on the language. But my hope is that it serves as sort of a survival guide for anyone who finds themselves stuck with Rust before they appreciate it. I hope you read this in anger, and it gets you out of the hole.

And beyond that, I really hope large parts of this article become completely irrelevant. Laughably so. That we get GATs, type alias impl trait, maybe dyn*, maybe modifier generics?

There's a ton of good stuff in the pipes, some of it has been in the works "seemingly forever", and I'm looking forward to all of it, because that means I'll have to write fewer articles like these.

In the meantime, I'm still having a relatively good time in the Rust async ecosystem. I can live with the extra boilerplate while we find good solutions for all these. Sometimes it's a bit frustrating, but then I spend a couple hours playing with a language that doesn't have a borrow checker, or doesn't have sum types, and I'm all better.

I hope I was able to show, too, that I don't consider Rust the perfect, be-all-end-all programming language. There's still a bunch of situations where, without the requisite years of messing around, you'll be stuck. Because I'm so often the person of reference to help solve these, at work and otherwise, I just thought I'd put a little something together.

Hopefully this article helps a little. And in the meantime, take excellent care of yourselves.
