The curse of strong typing
It happened when I least expected it.
Someone, somewhere (above me, presumably) made a decision. "From now on", they declared, "all our new stuff must be written in Rust".
I'm not sure where they got that idea from. Maybe they've been reading propaganda. Maybe they fell prey to some confident asshole, and convinced themselves that Rust was the answer to their problems.
I don't know what they see in it, to be honest. It's like I always say: it's not a data race, it's a data marathon.
At any rate, I now find myself in a beautiful house, with a beautiful wife, and a lot of compile errors.
Jesus that's a lot of compile errors.
Different kinds of numbers
And it's not like I'm resisting progress! When someone made the case for using tau instead of pi, I was the first to hop on the bandwagon.
But Rust won't even let me do that:
fn main() {
    // only nerds need more digits
    println!("tau = {}", 2 * 3.14159265);
}
$ cargo run --quiet
error[E0277]: cannot multiply `{integer}` by `{float}`
 --> src/main.rs:3:28
  |
3 |     println!("tau = {}", 2 * 3.14159265);
  |                            ^ no implementation for `{integer} * {float}`
  |
  = help: the trait `Mul<{float}>` is not implemented for `{integer}`

For more information about this error, try `rustc --explain E0277`.
error: could not compile `grr` due to previous error
When it clearly works in ECMAScript for example:
// in `main.js`

// TODO: ask for budget increase so we can afford more digits
console.log(`tau = ${2 * 3.14159265}`);
$ node main.js
tau = 6.2831853
Luckily, a colleague rushes in to help me.
Well those... those are different types.
Types? Never heard of them.
You've seen the title of this post right? Strong typing?
Fine, I'll look it up. It says here that:
"Strong typing" generally refers to use of programming language types in order to both capture invariants of the code, and ensure its correctness, and definitely exclude certain classes of programming errors. Thus there are many "strong typing" disciplines used to achieve these goals.
Okay. What's incorrect about my code?
Oh, nothing! Nothing at all. These are just different types.
So it's just getting in the way right now, correct?
Well... sort of? But it's not like your program is running on an imaginary machine. There's a real difference between an "integer" and a "floating point number".
A floa-
Look at this for example:
package main

import "fmt"

func main() {
	a := 200000000
	for i := 0; i < 10; i++ {
		a *= 10
		fmt.Printf("a = %v\n", a)
	}
}
$ go run main.go a = 2000000000 a = 20000000000 a = 200000000000 a = 2000000000000 a = 20000000000000 a = 200000000000000 a = 2000000000000000 a = 20000000000000000 a = 200000000000000000 a = 2000000000000000000
Yeah, that makes perfect sense! What's your point?
Well, if we keep going a little more...
package main

import "fmt"

func main() {
	a := 200000000
	//              👇
	for i := 0; i < 15; i++ {
		a *= 10
		fmt.Printf("a = %v\n", a)
	}
}
$ go run main.go a = 2000000000 a = 20000000000 a = 200000000000 a = 2000000000000 a = 20000000000000 a = 200000000000000 a = 2000000000000000 a = 20000000000000000 a = 200000000000000000 a = 2000000000000000000 a = 1553255926290448384 a = -2914184810805067776 a = 7751640039368425472 a = 3729424098846048256 a = 400752841041379328
Oh. Oh no.
That's an overflow. We used a 64-bit integer variable, and to represent 20000000000000000000, we'd need 64.12 bits, which... that's more than we have.
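For comparison, here is a minimal Rust sketch of that same multiplication. The two helper names are made up for illustration; `checked_mul` and `wrapping_mul` are standard methods on `i64`. The wrapping version keeps only the low 64 bits, which is the behavior the Go output above shows.

```rust
/// Multiplies by ten, returning `None` if the result does not fit in an `i64`.
fn checked_times_ten(a: i64) -> Option<i64> {
    a.checked_mul(10)
}

/// Multiplies by ten and silently keeps only the low 64 bits of the result.
fn wrapping_times_ten(a: i64) -> i64 {
    a.wrapping_mul(10)
}

fn main() {
    // The last value that still fits in the Go program above.
    let a: i64 = 2_000_000_000_000_000_000;
    println!("checked:  {:?}", checked_times_ten(a));
    println!("wrapping: {}", wrapping_times_ten(a));
}
```

With the checked version, the overflow becomes a value you have to deal with instead of a surprising number.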
Okay, but again this works in ECMAScript for example:
let a = 200000000; for (let i = 0; i < 15; i++) { a *= 10; console.log(`a = ${a}`); }
$ node main.js a = 2000000000 a = 20000000000 a = 200000000000 a = 2000000000000 a = 20000000000000 a = 200000000000000 a = 2000000000000000 a = 20000000000000000 a = 200000000000000000 a = 2000000000000000000 a = 20000000000000000000 a = 200000000000000000000 a = 2e+21 a = 2e+22 a = 2e+23
Sure, it's using nerd notation, but if we just go back, we can see it's working:
let a = 200000000; for (let i = 0; i < 15; i++) { a *= 10; console.log(`a = ${a}`); } console.log("turn back!"); for (let i = 0; i < 15; i++) { a /= 10; console.log(`a = ${a}`); }
$ node main.js a = 2000000000 a = 20000000000 a = 200000000000 a = 2000000000000 a = 20000000000000 a = 200000000000000 a = 2000000000000000 a = 20000000000000000 a = 200000000000000000 a = 2000000000000000000 a = 20000000000000000000 a = 200000000000000000000 a = 2e+21 a = 2e+22 a = 2e+23 turn back! a = 2e+22 a = 2e+21 a = 200000000000000000000 a = 20000000000000000000 a = 2000000000000000000 a = 200000000000000000 a = 20000000000000000 a = 2000000000000000 a = 200000000000000 a = 20000000000000 a = 2000000000000 a = 200000000000 a = 20000000000 a = 2000000000 a = 200000000
Mhh, looks like döner kebab.
Okay, but those are floating point numbers.
They don't look very floating to me.
Consider this:
let a = 0.1; let b = 0.2; let sum = a + b; console.log(sum);
$ node main.js 0.30000000000000004
Ah, that... that does float.
Yeah, and that's the trade-off. You get to represent numbers that aren't whole numbers, and also /very large/ numbers, at the expense of some precision.
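A small Rust sketch of that trade-off, assuming `f64` (the same 64-bit float ECMAScript uses for `number`). The helper `plus_one_is_lost` is hypothetical, written just for this example.

```rust
/// Returns true when adding `1.0` to `x` is lost to rounding.
fn plus_one_is_lost(x: f64) -> bool {
    x + 1.0 == x
}

fn main() {
    // Neither 0.1 nor 0.2 has an exact binary representation,
    // so the sum is very slightly off.
    let sum = 0.1_f64 + 0.2_f64;
    println!("0.1 + 0.2 = {sum}");
    println!("equal to 0.3? {}", sum == 0.3);

    // An f64 carries 53 bits of precision: from 2^53 onwards,
    // not every whole number can be represented.
    let big = 2_f64.powi(53);
    println!("is 2^53 + 1 lost? {}", plus_one_is_lost(big));
}
```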
I see.
For example, with floats, you can compute two thirds:
fn main() { println!("two thirds = {}", 2.0 / 3.0); }
$ cargo run --quiet two thirds = 0.6666666666666666
But with integers, you can't:
fn main() { println!("two thirds = {}", 2 / 3); }
$ cargo run --quiet two thirds = 0
Wait, but I don't see any actual types here. Just values.
Yeah, it's all inferred!
I uh. Okay I'm still confused. See, in ECMAScript, a number's a number:
console.log(typeof 36); console.log(typeof 42.28);
$ node main.js number number
Unless it's a big number!
console.log(typeof 36); console.log(typeof 42.28); console.log(typeof 248672936507863405786027355423684n);
$ node main.js number number bigint
Ahhh. So ECMAScript does have integers.
Only big ones. Well, they can be smol if you want them to. Operations just... are more expensive on them.
What about Python? Does Python have integers?
$ python3 -q >>> type(38) <class 'int'> >>> type(38.139582735) <class 'float'> >>>
Mh, yeah, it does!
Try computing two thirds with it!
$ python3 -q >>> 2/3 0.6666666666666666 >>> type(2) <class 'int'> >>> type(2/3) <class 'float'> >>>
Hey that works! So the / operator in Python takes two int values and gives a float.
Not two int values. Two numbers. Could be anything.
$ python3 -q >>> 2.8 / 1.4 2.0 >>>
What if I want to do integer division?
There's an operator for that!
$ python3 -q >>> 10 // 3 3 >>>
Similarly, for addition you have ++...
$ python3 -q >>> 2 + 3 5 >>> 2 ++ 3 5 >>>
And so on...
>>> 8 - 3 5 >>> 8 -- 3 11
Wait, no, I th-
>>> 8 * 3 24 >>> 8 ** 3 512
Woops, my bad: I guess it's just //. a ++ b really is a + (+b), a -- b is a - (-b), and a ** b is a to the bth power.
Okay so Python values have types, you just can't see them unless you ask.
Can I see the types of Rust values too?
Kinda! You can do this:
fn main() { dbg!(type_name_of(2)); dbg!(type_name_of(268.2111)); } fn type_name_of<T>(_: T) -> &'static str { std::any::type_name::<T>() }
$ cargo run --quiet [src/main.rs:2] type_name_of(2) = "i32" [src/main.rs:3] type_name_of(268.2111) = "f64"
Okay. And so in Rust, a value like 42 defaults to i32 (signed 32-bit integer), and a value like 3.14 defaults to f64.
How do I make other number types? Surely there are others.
For literals, you can use suffixes:
$ cargo run --quiet [src/main.rs:2] type_name_of(1_u8) = "u8" [src/main.rs:3] type_name_of(1_u16) = "u16" [src/main.rs:4] type_name_of(1_u32) = "u32" [src/main.rs:5] type_name_of(1_u64) = "u64" [src/main.rs:6] type_name_of(1_u128) = "u128" [src/main.rs:8] type_name_of(1_i8) = "i8" [src/main.rs:9] type_name_of(1_i16) = "i16" [src/main.rs:10] type_name_of(1_i32) = "i32" [src/main.rs:11] type_name_of(1_i64) = "i64" [src/main.rs:12] type_name_of(1_i128) = "i128" [src/main.rs:14] type_name_of(1_f32) = "f32" [src/main.rs:15] type_name_of(1_f64) = "f64"
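The listing behind this output isn't shown here. A sketch that should print output of this shape, reusing the type_name_of helper from the earlier listing (the layout is a guess based on the line numbers in the output):

```rust
fn main() {
    dbg!(type_name_of(1_u8));
    dbg!(type_name_of(1_u16));
    dbg!(type_name_of(1_u32));
    dbg!(type_name_of(1_u64));
    dbg!(type_name_of(1_u128));

    dbg!(type_name_of(1_i8));
    dbg!(type_name_of(1_i16));
    dbg!(type_name_of(1_i32));
    dbg!(type_name_of(1_i64));
    dbg!(type_name_of(1_i128));

    dbg!(type_name_of(1_f32));
    dbg!(type_name_of(1_f64));
}

// Same helper as before: returns the name of the argument's type.
fn type_name_of<T>(_: T) -> &'static str {
    std::any::type_name::<T>()
}
```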
No f128?
Not builtin, no. For now.
Okay, so my original code here didn't work:
fn main() {
    // only nerds need more digits
    println!("tau = {}", 2 * 3.14159265);
}
That was because the 2 on the left is an integer, and the 3.14159265 is a floating point number, and so I have to do this:
println!("tau = {}", 2.0 * 3.14159265);
Or this:
println!("tau = {}", 2f64 * 3.14159265);
Or this, to be more readable, since apparently you can stuff _ anywhere in number literals:
println!("tau = {}", 2_f64 * 3.14159265);
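As an aside, nerds who do want more digits don't have to type them: Rust's standard library ships `std::f64::consts::TAU` next to `PI`. A short sketch; the helper `tau_from_pi` is made up for the comparison.

```rust
use std::f64::consts::{PI, TAU};

/// Computes tau the long way, with float operands on both sides.
fn tau_from_pi() -> f64 {
    2.0 * PI
}

fn main() {
    println!("tau        = {}", TAU);
    println!("2.0 * pi   = {}", tau_from_pi());
}
```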
In ECMAScript, you have 64-bit floats (number), and bigints. Operations on bigints are significantly more expensive than operations on floats.
In Python, you have floats, and integers. Python 3 handles bigints seamlessly: doing arithmetic on small integer values is still "cheap".
In languages like Rust, you have integers and floats, but you need to pick a bit width. Number literals will default to i32 and f64, unless you add a suffix or... some other conditions described in the next section.
Conversions and type inference
Okay, I think I get it.
So whereas Python has an "integer" and "float" type, Rust has different widths of integer types, like C and other systems languages.
So this doesn't work:
fn main() { let val = 280_u32; takes_u32(val); takes_u64(val); } fn takes_u32(val: u32) { dbg!(val); } fn takes_u64(val: u64) { dbg!(val); }
$ cargo run --quiet error[E0308]: mismatched types --> src/main.rs:4:15 | 4 | takes_u64(val); | ^^^ expected `u64`, found `u32` | help: you can convert a `u32` to a `u64` | 4 | takes_u64(val.into()); | +++++++ For more information about this error, try `rustc --explain E0308`. error: could not compile `grr` due to previous error
And the compiler gives me a suggestion, but according to the heading of the section, as should work, too:
takes_u64(val as u64);
$ cargo run --quiet [src/main.rs:8] val = 280 [src/main.rs:12] val = 280
Yeah! And you see the definition of takes_u64? It has val: u64.
Yeah I see, I wrote it!
So that means the compiler knows that the argument to takes_u64 must be a u64, right?
Yeah?
So it should be able to infer it!
Yeah, this does work:
takes_u64(230984423857928735);
Exactly! Whereas before, it defaulted the type of the literal to i32, this time it knows it should be a u64 in the end, so it turns the kind of squishy {integer} type into the very concrete u64 type.
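To watch that happen, here is a small sketch. It uses a variant of the earlier type_name_of helper that takes a reference (so the value can still be used afterwards), and a `takes_u64` that simply returns its argument; both are assumptions made for this example.

```rust
/// Returns the name of the type the compiler settled on for `T`.
fn type_name_of<T>(_: &T) -> &'static str {
    std::any::type_name::<T>()
}

fn takes_u64(val: u64) -> u64 {
    val
}

/// Nothing constrains the literal, so it falls back to the default.
fn unconstrained() -> &'static str {
    let a = 42;
    type_name_of(&a)
}

/// Passing `b` to `takes_u64` pins the literal's type to `u64`.
fn constrained() -> &'static str {
    let b = 42;
    takes_u64(b);
    type_name_of(&b)
}

fn main() {
    println!("unconstrained: {}", unconstrained());
    println!("constrained:   {}", constrained());
}
```

Same literal, 42, in both functions. Only the way it gets used differs.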
Neat.
But it doesn't stop there: in a bunch of places in Rust, when you want to ask the compiler to "just figure it out", you can substitute _.
No... so you mean?
fn main() {
    let val = 280_u32;
    takes_u32(val);
    //               👇
    takes_u64(val as _);
}

// etc.
$ cargo run --quiet [src/main.rs:8] val = 280 [src/main.rs:12] val = 280
Neat!
Let's try .into() too, since that's what the compiler suggested:
fn main() { let val = 280_u32; takes_u32(val); takes_u64(val.into()); } // etc.
That works too!
Oooh, ooh, try it the other way around!
Like this?
fn main() {
    //             👇
    let val = 280_u64;
    //      👇
    takes_u64(val);
    //      👇
    takes_u32(val.into());
}
$ cargo run --quiet error[E0277]: the trait bound `u32: From<u64>` is not satisfied --> src/main.rs:4:19 | 4 | takes_u32(val.into()); | ^^^^ the trait `From<u64>` is not implemented for `u32` | = help: the following implementations were found: <u32 as From<Ipv4Addr>> <u32 as From<NonZeroU32>> <u32 as From<bool>> <u32 as From<char>> and 71 others = note: required because of the requirements on the impl of `Into<u32>` for `u64` For more information about this error, try `rustc --explain E0277`. error: could not compile `grr` due to previous error
Oh, it's not happy at all. It does helpfully suggest we could use an IPv4 address instead, which...
I know someone who'll think this diagnostic could use a little tune-up...
No no, we can try it, we got time:
use std::{net::Ipv4Addr, str::FromStr}; fn main() { takes_u32(Ipv4Addr::from_str("127.0.0.1").unwrap().into()); } fn takes_u32(val: u32) { dbg!(val); }
$ cargo run --quiet [src/main.rs:8] val = 2130706433
...yes, okay.
Just like an IPv6 address can be a u128, if it believes:
use std::{net::Ipv6Addr, str::FromStr}; fn main() { takes_u128(Ipv6Addr::from_str("ff::d1:e3").unwrap().into()); } fn takes_u128(val: u128) { dbg!(val); }
$ cargo run --quiet [src/main.rs:8] val = 1324035698926381045275276563964821731
But apparently a u64 can't be a u32?
Well... that's because not all values of type u64 fit into a u32.
Oh!
...that's why there's no impl From<u64> for u32...
Ah.
...but there is an impl TryFrom<u64> for u32.
Ah?
Because some u64 fit in a u32.
So err... we used .into() earlier... which we could do because... From?
And so because now we have TryFrom... .try_into()?
Yes! Because of this blanket impl and that blanket impl, respectively.
I have a feeling we'll come back to these later... but for now, let's give it a shot:
fn main() { let val: u64 = 48_000; takes_u32(val.try_into().unwrap()); } fn takes_u32(val: u32) { dbg!(val); }
This compiles, and runs.
As for this:
fn main() { let val: u64 = 25038759283948; takes_u32(val.try_into().unwrap()); }
It compiles, but does not run!
$ cargo run --quiet thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: TryFromIntError(())', src/main.rs:3:30 note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Makes sense so far.
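If panicking isn't what you want, the `Result` can be handled instead of unwrapped. A minimal sketch, where `narrow` is a made-up helper name:

```rust
use std::convert::TryFrom; // already in the prelude on the 2021 edition

/// Converts a `u64` to a `u32`, or returns `None` when the value does not fit.
fn narrow(val: u64) -> Option<u32> {
    u32::try_from(val).ok()
}

fn main() {
    for val in [48_000_u64, 25_038_759_283_948] {
        match narrow(val) {
            Some(n) => println!("{val} fits in a u32: {n}"),
            None => println!("{val} does not fit in a u32"),
        }
    }
}
```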
And that's... that's all of it right?
Not quite! You can parse stuff.
Ah, like we just did with Ipv4Addr::from_str right?
Yes! But just like T::from(val) has val.into(), T::from_str(val) has val.parse().
Fantastic! Let's give it a go:
fn main() { let val = "1234".parse(); dbg!(val); }
$ cargo run --quiet error[E0284]: type annotations needed for `Result<F, _>` --> src/main.rs:2:22 | 2 | let val = "1234".parse(); | --- ^^^^^ cannot infer type for type parameter `F` declared on the associated function `parse` | | | consider giving `val` the explicit type `Result<F, _>`, where the type parameter `F` is specified | = note: cannot satisfy `<_ as FromStr>::Err == _` For more information about this error, try `rustc --explain E0284`. error: could not compile `grr` due to previous error
Oh it's... unhappy? Again?
Consider this: what do you want to parse to?
A number, clearly! The string is 1234.
See, ECMAScript gets it right:
let a = "1234"; console.log({ a }); let b = parseInt(a, 10); console.log({ b });
$ node main.js { a: '1234' } { b: 1234 }
Nnnnonono, you said parseInt, not just parse.
Okay fine, let's not say parse at all then:
let a = "1234"; console.log({ a }); let b = +a; console.log({ b });
$ node main.js { a: '1234' } { b: 1234 }
Okay but the unary plus operator here coerces a string to a number, and in that case the only sensible thing to do is...
Nah nah nah, that's too easy. I think you're just looking for excuses. The truth is, ECMAScript is production-ready in a way that Rust isn't, and never will be.
Those fools at work have it coming. Soon they'll realize! They've been had. They've been swindled. They've developed a taste for snake o-
JUST ADD : u64 AFTER let val WILL YOU
fn main() { let val: u64 = "2930482035982309".parse().unwrap(); dbg!(val); }
$ cargo run --quiet [src/main.rs:3] val = 2930482035982309
Oh.
Yeah that tracks. And I suppose, if we have to care about bit widths here, that if I change it to u32...
fn main() { let val: u32 = "2930482035982309".parse().unwrap(); dbg!(val); }
$ cargo run --quiet thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: ParseIntError { kind: PosOverflow }', src/main.rs:2:47 note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
It errors out, because that doesn't fit in a u32. I see.
YES. NOW TRY CASTING THAT VALUE AS AN u64 TO A u32.
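Same idea as before: the error can be inspected rather than unwrapped. In this sketch, `parse_u32` is a hypothetical helper, and the target type is named with `::<u32>` on `parse` instead of an annotation on `let`. `ParseIntError::kind` and `IntErrorKind` come from the standard library.

```rust
use std::num::IntErrorKind;

/// Parses a `u32`, reporting *why* parsing failed when it does.
fn parse_u32(s: &str) -> Result<u32, IntErrorKind> {
    s.parse::<u32>().map_err(|e| e.kind().clone())
}

fn main() {
    for input in ["1234", "2930482035982309", "nope"] {
        match parse_u32(input) {
            Ok(n) => println!("{input:?} parsed as {n}"),
            Err(kind) => println!("{input:?} failed: {kind:?}"),
        }
    }
}
```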
Cool down, bear! I'll try, I'll try:
fn main() { let a = 2930482035982309_u64; println!("a = {a} (u64)"); let b = a as u32; println!("b = {b} (u32)"); }
$ cargo run --quiet a = 2930482035982309 (u64) b = 80117733 (u32)
Oh. It's... it's not crashing, just... doing the wrong thing?
YES THAT WAS MY POINT THANK YOU
Yeesh okay how about you take a minute there, bear. So I agree that number shouldn't fit in a u32, so it's doing... something with it.
Maybe if we print it as hex:
fn main() { let a = 2930482035982309_u64; println!("a = {a:016x} (u64)"); let b = a as u32; println!("b = {b:016x} (u32)"); }
$ cargo run --quiet
a = 000a694204c67fe5 (u64)
b = 0000000004c67fe5 (u32)
    👆
Oh yeah okay! It's truncating it!
It's even clearer in binary:
fn main() { let a = 2930482035982309_u64; println!("a = {a:064b} (u64)"); let b = a as u32; println!("b = {b:064b} (u32)"); }
$ cargo run --quiet
a = 0000000000001010011010010100001000000100110001100111111111100101 (u64)
b = 0000000000000000000000000000000000000100110001100111111111100101 (u32)
    👆
YES THAT'S THE PROBLEM WITH as. YOU CAN TRUNCATE VALUES WHEN YOU DIDN'T INTEND TO.
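To spell out what the cast does, and what the alternatives look like, here is a hedged sketch. The three function names are made up. The point is that `as` keeps the low 32 bits, while `u32::try_from` lets you decide what happens when the value doesn't fit.

```rust
use std::convert::TryFrom; // already in the prelude on the 2021 edition

/// What `as` does: keep the low 32 bits, silently drop the rest.
fn truncate(a: u64) -> u32 {
    a as u32
}

/// Refuse to convert when the value does not fit.
fn checked(a: u64) -> Option<u32> {
    u32::try_from(a).ok()
}

/// Clamp to the largest `u32` when the value does not fit.
fn saturate(a: u64) -> u32 {
    u32::try_from(a).unwrap_or(u32::MAX)
}

fn main() {
    let a = 2930482035982309_u64;
    println!("truncate: {}", truncate(a));
    println!("checked:  {:?}", checked(a));
    println!("saturate: {}", saturate(a));
}
```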
Ah. But it's shorter and super convenient still, right?
I GUESS!
Gotcha.
Generics and enums
Wait wait wait, we haven't even talked about strings yet. Are you sure about that heading?
Hell yeah! Generics are baby stuff: you just slap a couple angle brackets, or "chevrons" if you want to be fancy, and boom, Bob's your uncle!
Ew.
Not that Bob.
See, this for example:
fn show<T>(a: T) { todo!() }
Now we can call it with a value a of type T, for any T!
fn main() { show(42); show("blah"); }
Okay yeah but you haven't implemented it yet!
True true, it panics right now:
$ cargo run --quiet thread 'main' panicked at 'not yet implemented', src/main.rs:7:5 note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
But we could... I don't know, we could display it!
fn main() { show(42); show("blah"); } fn show<T>(a: T) { println!("a = {}", a); }
$ cargo run --quiet error[E0277]: `T` doesn't implement `std::fmt::Display` --> src/main.rs:7:24 | 7 | println!("a = {}", a); | ^ `T` cannot be formatted with the default formatter | = note: in format strings you may be able to use `{:?}` (or {:#?} for pretty-print) instead = note: this error originates in the macro `$crate::format_args_nl` (in Nightly builds, run with -Z macro-backtrace for more info) help: consider restricting type parameter `T` | 6 | fn show<T: std::fmt::Display>(a: T) { | +++++++++++++++++++ For more information about this error, try `rustc --explain E0277`. error: could not compile `grr` due to previous error
Mhhhhhh. Does not implement Display.
Okay maybe {:?} instead of {} then?
fn show<T>(a: T) { println!("a = {:?}", a); }
$ cargo run --quiet error[E0277]: `T` doesn't implement `Debug` --> src/main.rs:7:26 | 7 | println!("a = {:?}", a); | ^ `T` cannot be formatted using `{:?}` because it doesn't implement `Debug` | = note: this error originates in the macro `$crate::format_args_nl` (in Nightly builds, run with -Z macro-backtrace for more info) help: consider restricting type parameter `T` | 6 | fn show<T: std::fmt::Debug>(a: T) { | +++++++++++++++++ For more information about this error, try `rustc --explain E0277`. error: could not compile `grr` due to previous error
Oh now it doesn't implement Debug.
Well. Okay! Maybe show can't do anything useful with its argument, but at least you can pass any type to it.
And, because T is a type like any other...
A "type parameter", technically, but who's keeping track.
...you can use it several times, probably!
fn main() { show(5, 7); show("blah", "bleh"); } fn show<T>(a: T, b: T) { todo!() }
Yeah, see, that works!
And if we do this:
fn main() { show(42, "aha") } fn show<T>(a: T, b: T) { todo!() }
It... oh.
$ cargo run --quiet error[E0308]: mismatched types --> src/main.rs:2:14 | 2 | show(42, "aha") | ^^^^^ expected integer, found `&str` For more information about this error, try `rustc --explain E0308`. error: could not compile `grr` due to previous error
Well that's interesting. I guess they have to match? So like it's using the first argument, 42, to infer T, and then the second one has to match, alright.
Yeah, and you'll notice it says "expected integer", not "expected i32".
So that means this would work:
show(42, 256_u64)
And it does!
And if we want two genuinely different types, I guess we have to... use two dif-
Use two different type parameters, yes.
fn main() { show(4, "hi") } fn show<A, B>(a: A, b: B) { todo!() }
That works! Alright.
Well we don't know how to do anything useful with these values yet, but-
Yes, that's what you get for trying to skip ahead.
How about a nice enum instead?
Something like this?
fn main() { show(Answer::Maybe) } enum Answer { Yes, No, Maybe, } fn show(answer: Answer) { let s = match answer { Answer::Yes => "yes", Answer::No => "no", Answer::Maybe => "maybe", }; println!("the answer is {s}"); }
$ cargo run --quiet the answer is maybe
I mean, yeah sure. That's a good starting point.
And maybe you want me to learn about this, too?
fn is_yes(answer: Answer) -> bool { if let Answer::Yes = answer { true } else { false } }
Sure, but mostly I w-
Or better still, this?
fn is_yes(answer: Answer) -> bool { matches!(answer, Answer::Yes) }
No, more like this:
fn main() { show(Either::Character('C')); show(Either::Number(64)); } enum Either { Number(i64), Character(char), } fn show(either: Either) { match either { Either::Number(n) => println!("{n}"), Either::Character(c) => println!("{c}"), } }
$ cargo run --quiet C 64
Oh, yeah, that's pretty good. So like enum variants that... hold some data?
Yes!
And you can do pattern matching to know which variant it is, and to access what's inside.
And I suppose it's safe too, as in it won't let you accidentally access the wrong variant?
Yes, yes of course. These are no C unions. They're tagged unions. Or choice types. Or sum types. Or coproducts.
Let's just stick with "enums".
But that's great news: I can finally take functions that can handle multiple types, even without understanding generics!
And I suppose... conversions could help there too? Like what if I could do this?
fn main() { show('C'.into()); show(64.into()); }
Sure, you can do that. Just implement a couple traits!
Traits? But we're in the enums sect-
Implementing traits
Ah, here we are. Couple traits, okay, show me!
fn main() {
    show('C'.into());
    show(64.into());
}

enum Either {
    Number(i64),
    Character(char),
}

// 👇
impl From<i64> for Either {
    fn from(n: i64) -> Self {
        Either::Number(n)
    }
}

// 👇
impl From<char> for Either {
    fn from(c: char) -> Self {
        Either::Character(c)
    }
}

fn show(either: Either) {
    match either {
        Either::Number(n) => println!("{n}"),
        Either::Character(c) => println!("{c}"),
    }
}
$ cargo run --quiet C 64
Hey, that's pretty good! But we haven't declared that From trait anywhere, let's see... ah, here's what it looks like, from the Rust standard library:
pub trait From<T> { fn from(T) -> Self; }
Ah, that's refreshingly short. And Self is?
The type you're implementing From<T> for.
And then I suppose Into is also in there somewhere?
pub trait Into<T> { fn into(self) -> T; }
Right! And self is...
...short for self: Self, in that position.
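Note that the listing above only implements From, yet .into() worked. That's because the standard library has a blanket impl of Into<U> for any T where U: From<T>. A small sketch showing both spellings on the same Either type; the derive is added here only so the values can be printed and compared:

```rust
#[derive(Debug, PartialEq)]
enum Either {
    Number(i64),
    Character(char),
}

impl From<i64> for Either {
    fn from(n: i64) -> Self {
        Either::Number(n)
    }
}

impl From<char> for Either {
    fn from(c: char) -> Self {
        Either::Character(c)
    }
}

fn main() {
    // Calling the trait function we implemented ourselves...
    let a = Either::from(64_i64);
    // ...and the `into` method we got for free.
    let b: Either = 64_i64.into();
    let c: Either = 'C'.into();
    println!("{a:?} {b:?} {c:?}");
}
```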
And I suppose there are other traits?
Wait, are Display and Debug traits?
They are! Here, let me show you something:
use std::fmt::Display; fn main() { show(&'C'); show(&64); } fn show(v: &dyn Display) { println!("{v}"); }
$ cargo run --quiet C 64
Whoa. WHOA. Game changer. No .into() needed, it just works? Very cool.
Now let me show you something else:
use std::fmt::Display; fn main() { show(&'C'); show(&64); } fn show(v: impl Display) { println!("{v}"); }
That works too? No way! v can be whichever type implements Display! So nice!
Yes! It's the shorter way of spelling this:
fn show<D: Display>(v: D) { println!("{v}"); }
Ah!!! So that's how you add a... how you tell the compiler that the type must implement something.
A trait bound, yes. There's an even longer way to spell this:
fn show<D>(v: D) where D: Display, { println!("{v}"); }
Okay, that... I mean if you ignore all the punctuation going on, this almost reads like English. If English were maths. Well, the kind of maths compilers think about. Possibly type theory?
Return position
Wait, I didn't type that heading. Cool bear??
Shh, look at this.
use std::fmt::Display; fn main() { show(get_char()); show(get_int()); } fn get_char() -> impl Display { 'C' } fn get_int() -> impl Display { 64 } fn show(v: impl Display) { println!("{v}"); }
Okay. So we can use impl Display "in return position", if we don't feel like typing it all out. That's good.
And I suppose, since impl T is much like generics, we can probably do something like:
use std::fmt::Display; fn main() { show(get_char_or_int(true)); show(get_char_or_int(false)); } fn get_char_or_int(give_char: bool) -> impl Display { if give_char { 'C' } else { 64 } } fn show(v: impl Display) { println!("{v}"); }
$ cargo run --quiet error[E0308]: `if` and `else` have incompatible types --> src/main.rs:12:9 | 9 | / if give_char { 10 | | 'C' | | --- expected because of this 11 | | } else { 12 | | 64 | | ^^ expected `char`, found integer 13 | | } | |_____- `if` and `else` have incompatible types For more information about this error, try `rustc --explain E0308`. error: could not compile `grr` due to previous error
Ah. No I cannot.
So our return type is impl Display... ah, and it infers it to be char, because that's the first thing we return! And so the other thing must also be char.
But it's not.
Well I'm lost. Bear, how do we get out of this?
Bear?
...okay maybe... generics? 🤷
fn get_char_or_int<D: Display>(give_char: bool) -> D { if give_char { 'C' } else { 64 } }
$ cargo run --quiet error[E0282]: type annotations needed --> src/main.rs:4:5 | 4 | show(get_char_or_int(true)); | ^^^^ cannot infer type for type parameter `impl Display` declared on the function `show` error[E0308]: mismatched types --> src/main.rs:10:9 | 8 | fn get_char_or_int<D: Display>(give_char: bool) -> D { | - - | | | | | expected `D` because of return type | this type parameter help: consider using an impl return type: `impl Display` 9 | if give_char { 10 | 'C' | ^^^ expected type parameter `D`, found `char` | = note: expected type parameter `D` found type `char` error[E0308]: mismatched types --> src/main.rs:12:9 | 8 | fn get_char_or_int<D: Display>(give_char: bool) -> D { | - - | | | | | expected `D` because of return type | this type parameter help: consider using an impl return type: `impl Display` ... 12 | 64 | ^^ expected type parameter `D`, found integer | = note: expected type parameter `D` found type `{integer}` Some errors have detailed explanations: E0282, E0308. For more information about an error, try `rustc --explain E0282`. error: could not compile `grr` due to 3 previous errors
Err, ew, no, go back, that's even worse.
Yeah that'll never work.
Bear where were you!
Bear business. You wouldn't get it.
I...
It'll never work, but the compiler's got your back: it tells you you should be using impl Display.
But that's what I tried first!
Okay well, the impl Display in question can only be a single type.
But then what good is it?
Okay let's back up. You remember how you made an enum to handle arguments of two different types?
Vaguely? Oh I can do that here too, can't I.
Let's see 🎶
use std::fmt::Display; fn main() { show(get_char_or_int(true)); show(get_char_or_int(false)); } enum Either { Char(char), Int(i64), } fn get_char_or_int(give_char: bool) -> Either { if give_char { Either::Char('C') } else { Either::Int(64) } } fn show(v: impl Display) { println!("{v}"); }
$ cargo run --quiet error[E0277]: `Either` doesn't implement `std::fmt::Display` --> src/main.rs:4:10 | 4 | show(get_char_or_int(true)); | ---- ^^^^^^^^^^^^^^^^^^^^^ `Either` cannot be formatted with the default formatter | | | required by a bound introduced by this call | = help: the trait `std::fmt::Display` is not implemented for `Either` = note: in format strings you may be able to use `{:?}` (or {:#?} for pretty-print) instead note: required by a bound in `show` --> src/main.rs:21:17 | 21 | fn show(v: impl Display) { | ^^^^^^^ required by this bound in `show` error[E0277]: `Either` doesn't implement `std::fmt::Display` --> src/main.rs:5:10 | 5 | show(get_char_or_int(false)); | ---- ^^^^^^^^^^^^^^^^^^^^^^ `Either` cannot be formatted with the default formatter | | | required by a bound introduced by this call | = help: the trait `std::fmt::Display` is not implemented for `Either` = note: in format strings you may be able to use `{:?}` (or {:#?} for pretty-print) instead note: required by a bound in `show` --> src/main.rs:21:17 | 21 | fn show(v: impl Display) { | ^^^^^^^ required by this bound in `show` For more information about this error, try `rustc --explain E0277`. error: could not compile `grr` due to 2 previous errors
Oh, wait, wait, I know this! I can just implement Display for Either:
impl Display for Either {
    // ...
}
Wait, what do I put in there?
Use the rust-analyzer code generation assist.
You do have it installed, right?
Yes haha, of course, yes. Okay so Ctrl+. (Cmd+. on macOS), pick "Implement missing members", and... it gives me this:
impl Display for Either { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { todo!() } }
...and then I guess I just match on self? To call either the Display implementation for char or for i64?
impl Display for Either {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            //
        }
    }
}
Wait, what do I write there?
Use the rust-analyzer code generation assist.
Sounding like a broken record, you doing ok bear?
I am. There's a different code generation assist for this. Alternatively, GitHub Copilot might write the whole block for you.
It's getting better. It's learning.
Okay, using the "Fill match arms" assist...
impl Display for Either { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { match self { Either::Char(_) => todo!(), Either::Int(_) => todo!(), } } }
Okay I can do the rest!
impl Display for Either { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { match self { Either::Char(c) => c.fmt(f), Either::Int(i) => i.fmt(f), } } }
And this now runs!
$ cargo run --quiet C 64
Nice. But that was, like, super verbose. Can we make it less verbose?
Sure! You can use the delegate crate, for instance.
Okay okay I remember that bit, so you just:
$ cargo add delegate Updating 'https://github.com/rust-lang/crates.io-index' index Adding delegate v0.6.2 to dependencies.
And then... wait, what do we delegate to?
Oh I'll give you this one for free:
impl Either { fn display(&self) -> &dyn Display { match self { Either::Char(c) => c, Either::Int(i) => i, } } }
Wait wait wait but that's interesting as heck. You don't need traits to add methods to types like that? You can return a &dyn Trait object? That borrows from &self? Which is short for self: &Self? And it extends the lifetime of the receiver, also called a borrow-through???
Heyyyyyyyyy now where did you learn all that, we covered nothing of this.
Hehehe okay forget about it.
Okay so now that we've got a display method we can do this:
impl Display for Either { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { self.display().fmt(f) } }
And that's where the delegate crate comes in to make things simpler (or at least shorter), mhh, looking at the README, we can probably do...
impl Display for Either { delegate::delegate! { to self.display() { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result; } } }
Yeah! Or, you know, use delegate::delegate; first, and then you can just call the macro with delegate! instead of qualifying it with delegate::delegate!.
There's even a rust-analyzer assist for it — "replace qualified path with use".
Macros? Qualified paths? Wow, we're glossing over a lot of things.
Not that many, but yes.
Anyway, that all works! Here's the complete listing:
use delegate::delegate; use std::fmt::Display; fn main() { show(get_char_or_int(true)); show(get_char_or_int(false)); } impl Either { fn display(&self) -> &dyn Display { match self { Either::Char(c) => c, Either::Int(i) => i, } } } impl Display for Either { delegate! { to self.display() { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result; } } } enum Either { Char(char), Int(i64), } fn get_char_or_int(give_char: bool) -> Either { if give_char { Either::Char('C') } else { Either::Int(64) } } fn show(v: impl Display) { println!("{v}"); }
$ cargo run --quiet C 64
But... it feels a little wrong to have to write all that code just to do that.
Ah, that's because you don't!
Dynamically-sized types
Uhhh. What does any of that mean?
Okay, so it's more implementation details: just like bit widths (u32 vs u64), etc. But details are where the devil vacations.
Try printing the size of a few things with std::mem::size_of.
Okay then!
fn main() { dbg!(std::mem::size_of::<u32>()); dbg!(std::mem::size_of::<u64>()); dbg!(std::mem::size_of::<u128>()); }
$ cargo run --quiet [src/main.rs:2] std::mem::size_of::<u32>() = 4 [src/main.rs:3] std::mem::size_of::<u64>() = 8 [src/main.rs:4] std::mem::size_of::<u128>() = 16
Okay, 32 bits is 4 bytes, that checks out on x86_64.
Wait, where did you learn that syntax?
Ehh you showed it to me with typeof and, I looked it up: turns out it's named turbofish syntax! The name was cute, so I remembered.
Okay, now try references.
Sure!
fn main() { dbg!(std::mem::size_of::<&u32>()); dbg!(std::mem::size_of::<&u64>()); dbg!(std::mem::size_of::<&u128>()); }
$ cargo run --quiet [src/main.rs:2] std::mem::size_of::<&u32>() = 8 [src/main.rs:3] std::mem::size_of::<&u64>() = 8 [src/main.rs:4] std::mem::size_of::<&u128>() = 8
Yeah, they're all 64-bit! Again, I'm on an x86_64 CPU right now, so that's not super surprising.
Now try trait objects.
Oh, the dyn Trait stuff?
use std::fmt::Debug; fn main() { dbg!(std::mem::size_of::<dyn Debug>()); }
$ cargo run --quiet error[E0277]: the size for values of type `dyn std::fmt::Debug` cannot be known at compilation time --> src/main.rs:4:10 | 4 | dbg!(std::mem::size_of::<dyn Debug>()); | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ doesn't have a size known at compile-time | = help: the trait `Sized` is not implemented for `dyn std::fmt::Debug` note: required by a bound in `std::mem::size_of` --> /home/amos/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/mem/mod.rs:304:22 | 304 | pub const fn size_of<T>() -> usize { | ^ required by this bound in `std::mem::size_of` For more information about this error, try `rustc --explain E0277`. error: could not compile `grr` due to previous error
Oh. But that's... mhh.
What type is dyn Debug? What size would you expect it to have?
I don't know, I suppose... I suppose a lot of types implement Debug? Like, u32 does, u64 does, u128 does too, and String, and...
Exactly. It could be any of these, and then some. So it's impossible to know what size it is, because it could have any size.
Heck, even the empty tuple type, (), implements Debug!
fn main() { dbg!(std::mem::size_of::<()>()); println!("{:?}", ()); }
$ cargo run --quiet [src/main.rs:2] std::mem::size_of::<()>() = 0 ()
...and it's a zero-sized type! (a ZST). So dyn Debug, or any other "trait object", is a DST: a dynamically-sized type.
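To see the "it could have any size" part in action, here's a small sketch. The helper name size_behind is made up for illustration; it leans on std::mem::size_of_val, which accepts unsized values behind a reference and looks up the real size at runtime.

```rust
use std::fmt::Debug;
use std::mem::size_of_val;

// Hypothetical helper: the parameter type is always `&dyn Debug`,
// but the value hiding behind it can have any size.
fn size_behind(value: &dyn Debug) -> usize {
    // For a trait object, the size is only known at runtime.
    size_of_val(value)
}

fn main() {
    println!("u32  behind dyn Debug: {}", size_behind(&1_u32));
    println!("u64  behind dyn Debug: {}", size_behind(&1_u64));
    println!("u128 behind dyn Debug: {}", size_behind(&1_u128));
    println!("()   behind dyn Debug: {}", size_behind(&()));
}
```

Same parameter type every time, four different answers: that's what "dynamically-sized" is getting at.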
Wait, but we did return a &dyn Display at some point, right?
Ah, yes, but references al-
...all have the same size! Right!!! Because you're not holding the actual value, you're just holding the address of it!
Exactly!
use std::mem::size_of_val; fn main() { let a = 101_u128; println!("{:16}, of size {}", a, size_of_val(&a)); println!("{:16p}, of size {}", &a, size_of_val(&&a)); }
$ cargo run --quiet 101, of size 16 0x7ffdc4fb8af8, of size 8
And so uh... what was that about us not needing the enum at all?
We're getting to it!
Storing stuff in structs
Oh structs, those are easy, just like other languages right?
Like that:
#[derive(Debug)] struct Vec2 { x: f64, y: f64, } fn main() { let v = Vec2 { x: 1.0, y: 2.0 }; println!("v = {v:#?}"); }
Wait, #[derive(Debug)]? I don't think we've quite reached that part of the curriculum yet... in fact I don't see it in there at all.
Oh it's just a macro that can implement a trait for you, in this case it expands to something like this:
use std::fmt; impl fmt::Debug for Vec2 { fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { f.debug_struct("Vec2") .field("x", &self.x) .field("y", &self.y) .finish() } }
Well well well look who's teaching who now?
No it's types I'm struggling with, the rest is easy peasy limey squeezy.
But not structs, structs are easy, this, my program runs:
$ cargo run --quiet v = Vec2 { x: 1.0, y: 2.0, }
Okay, now make a function that adds two Vec2!
Alright!
#[derive(Debug)] struct Vec2 { x: f64, y: f64, } impl Vec2 { fn add(self, other: Vec2) -> Vec2 { Vec2 { x: self.x + other.x, y: self.y + other.y, } } } fn main() { let v = Vec2 { x: 1.0, y: 2.0 }; let w = Vec2 { x: 9.0, y: 18.0 }; dbg!(v.add(w)); }
$ cargo run --quiet [src/main.rs:21] v.add(w) = Vec2 { x: 10.0, y: 20.0, }
Now call add twice!
fn main() { let v = Vec2 { x: 1.0, y: 2.0 }; let w = Vec2 { x: 9.0, y: 18.0 }; dbg!(v.add(w)); dbg!(v.add(w)); }
$ cargo run --quiet error[E0382]: use of moved value: `v` --> src/main.rs:22:10 | 19 | let v = Vec2 { x: 1.0, y: 2.0 }; | - move occurs because `v` has type `Vec2`, which does not implement the `Copy` trait 20 | let w = Vec2 { x: 9.0, y: 18.0 }; 21 | dbg!(v.add(w)); | ------ `v` moved due to this method call 22 | dbg!(v.add(w)); | ^ value used here after move | note: this function takes ownership of the receiver `self`, which moves `v` --> src/main.rs:10:12 | 10 | fn add(self, other: Vec2) -> Vec2 { | ^^^^ error[E0382]: use of moved value: `w` --> src/main.rs:22:16 | 20 | let w = Vec2 { x: 9.0, y: 18.0 }; | - move occurs because `w` has type `Vec2`, which does not implement the `Copy` trait 21 | dbg!(v.add(w)); | - value moved here 22 | dbg!(v.add(w)); | ^ value used here after move For more information about this error, try `rustc --explain E0382`. error: could not compile `grr` due to 2 previous errors
Erm, doesn't work.
Do you know why?
I mean it says stuff? Something something Vec2 does not implement Copy, yet more traits, okay, so it gets "moved".
Wait we can probably work around this with Clone!
// 👇 #[derive(Debug, Clone)] struct Vec2 { x: f64, y: f64, } fn main() { let v = Vec2 { x: 1.0, y: 2.0 }; let w = Vec2 { x: 9.0, y: 18.0 }; dbg!(v.clone().add(w.clone())); dbg!(v.add(w)); }
Okay it works again!
What if you don't want to call .clone()?
Then I guess... Copy?
#[derive(Debug, Clone, Copy)] struct Vec2 { x: f64, y: f64, } fn main() { let v = Vec2 { x: 1.0, y: 2.0 }; let w = Vec2 { x: 9.0, y: 18.0 }; dbg!(v.add(w)); dbg!(v.add(w)); }
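For what it's worth, there's a third route the conversation doesn't take: have add borrow its operands instead of consuming them. This is only a sketch, and it changes the signature compared to the version above (hence the & at the call sites), but it needs neither Clone nor Copy.

```rust
#[derive(Debug, PartialEq)]
struct Vec2 {
    x: f64,
    y: f64,
}

impl Vec2 {
    // Takes both operands by shared reference, so neither one is moved.
    fn add(&self, other: &Vec2) -> Vec2 {
        Vec2 {
            x: self.x + other.x,
            y: self.y + other.y,
        }
    }
}

fn main() {
    let v = Vec2 { x: 1.0, y: 2.0 };
    let w = Vec2 { x: 9.0, y: 18.0 };
    // `v` and `w` are only borrowed, so calling twice is fine.
    dbg!(v.add(&w));
    dbg!(v.add(&w));
}
```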
Very good! Now forget about all that code, and tell me what's the type of "hello world"?
Ah, I'll just re-use the type_name_of function you gave me... one sec...
fn main() { dbg!(type_name_of("hello world")); } fn type_name_of<T>(_: T) -> &'static str { std::any::type_name::<T>() }
$ cargo run --quiet [src/main.rs:2] type_name_of("hello world") = "&str"
There it is! It's &str!
Alright! Now store it in a struct!
Sure, easy enough:
#[derive(Debug)] struct Message { text: &str, } fn main() { let msg = Message { text: "hello world", }; dbg!(msg); }
$ cargo run --quiet error[E0106]: missing lifetime specifier --> src/main.rs:3:11 | 3 | text: &str, | ^ expected named lifetime parameter | help: consider introducing a named lifetime parameter | 2 ~ struct Message<'a> { 3 ~ text: &'a str, | For more information about this error, try `rustc --explain E0106`. error: could not compile `grr` due to previous error
Oh. Not easy enough.
The compiler is showing you the way — heed its advice!
Okay, sure:
#[derive(Debug)] // 👇 struct Message<'a> { // 👇 text: &'a str, }
$ cargo run --quiet [src/main.rs:12] msg = Message { text: "hello world", }
Okay, now read the file src/main.rs as a string, and store a reference to it in a Message.
Fine, fine, so, reading files... std::fs perhaps?
fn main() { let code = std::fs::read_to_string("src/main.rs").unwrap(); let msg = Message { text: &code }; dbg!(msg); }
$ cargo run --quiet [src/main.rs:9] msg = Message { text: "#[derive(Debug)]\nstruct Message<'a> {\n text: &'a str,\n}\n\nfn main() {\n let code = std::fs::read_to_string(\"src/main.rs\").unwrap();\n let msg = Message { text: &code };\n dbg!(msg);\n}\n", }
Okay, I did it! What now?
Now move all the code to construct the Message into a separate function!
Like this?
#[derive(Debug)] struct Message<'a> { text: &'a str, } fn main() { let msg = get_msg(); dbg!(msg); } fn get_msg() -> Message { let code = std::fs::read_to_string("src/main.rs").unwrap(); Message { text: &code } }
$ cargo run --quiet error[E0106]: missing lifetime specifier --> src/main.rs:11:17 | 11 | fn get_msg() -> Message { | ^^^^^^^ expected named lifetime parameter | = help: this function's return type contains a borrowed value, but there is no value for it to be borrowed from help: consider using the `'static` lifetime | 11 | fn get_msg() -> Message<'static> { | ~~~~~~~~~~~~~~~~ For more information about this error, try `rustc --explain E0106`. error: could not compile `grr` due to previous error
Erm, not happy.
Okay, that's lifetime stuff. We're not there yet. What's the only thing you use the Message for?
Passing it to the dbg! macro?
And what does that use?
Probably the Debug trait?
So what can we change the return type to?
Ohhhh impl Debug! To let the compiler figure it out!
fn get_msg() -> impl std::fmt::Debug { let code = std::fs::read_to_string("src/main.rs").unwrap(); Message { text: &code } }
$ cargo run --quiet error[E0597]: `code` does not live long enough --> src/main.rs:13:21 | 11 | fn get_msg() -> impl std::fmt::Debug { | -------------------- opaque type requires that `code` is borrowed for `'static` 12 | let code = std::fs::read_to_string("src/main.rs").unwrap(); 13 | Message { text: &code } | ^^^^^ borrowed value does not live long enough 14 | } | - `code` dropped here while still borrowed | help: you can add a bound to the opaque type to make it last less than `'static` and match `'static` | 11 | fn get_msg() -> impl std::fmt::Debug + 'static { | +++++++++ For more information about this error, try `rustc --explain E0597`. error: could not compile `grr` due to previous error
Huh. That seems like... a lifetime problem? I thought we weren't at lifetimes yet.
We are now 😎
Lifetimes and ownership
Look this is all moving a little fast for me, I'd just like to-
You can go back and read the transcript later! For now, what's the type returned by std::fs::read_to_string?
Uhhh it's-
Don't go look at the definition. No time. Just do this:
fn get_msg() -> impl std::fmt::Debug { // 👇 let code: () = std::fs::read_to_string("src/main.rs").unwrap(); Message { text: &code } }
$ cargo run --quiet error[E0308]: mismatched types --> src/main.rs:12:20 | 12 | let code: () = std::fs::read_to_string("src/main.rs").unwrap(); | -- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected `()`, found struct `String` | | | expected due to this
rust-analyzer was showing me the type as an inlay, you know...
Oh, you installed it! Good. Anyway, it's String. Try storing that inside the struct.
Okay. I guess we won't need that 'a anymore...
#[derive(Debug)] struct Message { // 👇 text: String, } fn main() { let msg = get_msg(); dbg!(msg); } fn get_msg() -> impl std::fmt::Debug { let code = std::fs::read_to_string("src/main.rs").unwrap(); // 👇 (the `&` is gone) Message { text: code } }
Okay, why does this work when the other one didn't?
Because uhhhh, the &str was a... reference?
Yes, and?
And that means it borrowed from something? In this case the result of std::fs::read_to_string?
Yes, and??
And that meant we could not return that reference, because code dropped (which means it got freed) at the end of the function, and so the reference would be dangling?
Veeeery goooood! And it works as a String because?
Well, I guess it doesn't borrow? Like, the result of read_to_string is moved into Message, and so we take ownership of it, and we can move it anywhere we please?
Exactly! Suspiciously exact, even. Are you sure this is your first time?
👼
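As a side note, the borrowing Message<'a> isn't doomed: it works fine as long as the owner of the string outlives it. A minimal sketch, where first_line is a made-up function that borrows from a string the caller owns:

```rust
#[derive(Debug)]
struct Message<'a> {
    text: &'a str,
}

// Hypothetical helper: the returned `Message` borrows from `source`,
// which belongs to the caller, so nothing dangles when we return.
fn first_line(source: &str) -> Message<'_> {
    Message {
        text: source.lines().next().unwrap_or(""),
    }
}

fn main() {
    // `code` lives until the end of `main`, longer than `msg` needs it.
    let code = String::from("fn main() {}\n// more code");
    let msg = first_line(&code);
    dbg!(msg);
}
```

The difference with the failing get_msg is only in who owns the String: here it's the caller, there it was a local that got dropped on return.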
Very well, boss baby, do you know of other types that let you own a string?
Ah, there's a couple! Box<str> will work, for example:
#[derive(Debug)] struct Message { // 👇 text: Box<str>, } fn main() { let msg = get_msg(); dbg!(msg); } fn get_msg() -> impl std::fmt::Debug { let code = std::fs::read_to_string("src/main.rs").unwrap(); // 👇 Message { text: code.into() } }
And that one has exclusive ownership. Whereas something like Arc<str> will, well, it'll also work:
use std::sync::Arc; #[derive(Debug)] struct Message { text: Arc<str>, }
But that one's shared ownership. You can hand out clones of it and so multiple structs can point to the same thing:
use std::sync::Arc; #[derive(Debug)] struct Message { text: Arc<str>, } fn main() { let a = get_msg(); let b = Message { text: a.text.clone(), }; let c = Message { text: b.text.clone(), }; dbg!(a.text.as_ptr(), b.text.as_ptr(), c.text.as_ptr()); } fn get_msg() -> Message { let code = std::fs::read_to_string("src/main.rs").unwrap(); Message { text: code.into() } }
$ cargo run --quiet [src/main.rs:16] a.text.as_ptr() = 0x0000555f4e9d8d80 [src/main.rs:16] b.text.as_ptr() = 0x0000555f4e9d8d80 [src/main.rs:16] c.text.as_ptr() = 0x0000555f4e9d8d80
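Printing pointers is one way to see the sharing. Another, sketched here with a made-up share helper, is to ask the Arc itself via Arc::ptr_eq and Arc::strong_count:

```rust
use std::sync::Arc;

// Hypothetical helper: builds one `Arc<str>` and hands back two handles to it.
fn share(text: &str) -> (Arc<str>, Arc<str>) {
    let a: Arc<str> = Arc::from(text);
    // Cloning an Arc bumps a counter; the string data is not copied.
    let b = Arc::clone(&a);
    (a, b)
}

fn main() {
    let (a, b) = share("hello world");
    println!("same allocation: {}", Arc::ptr_eq(&a, &b));
    println!("owners: {}", Arc::strong_count(&a));
    drop(b);
    println!("owners after dropping b: {}", Arc::strong_count(&a));
}
```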
But you can't modify it.
Well, it's pretty awkward to mutate a &mut str to begin with!
Yeah. It's easier to show that with a &mut [u8].
Oh you're the professor now huh?
Sure! Watch me make a table:
| | Text (UTF-8) | Bytes |
| --- | --- | --- |
| Immutable reference / slice | &str | &[u8] |
| Owned, can grow | String | Vec<u8> |
| Owned, fixed len | Box<str> | Box<[u8]> |
| Shared ownership (atomic) | Arc<str> | Arc<[u8]> |
Now where... where did you find that? You're not even telling people about Rc!
Eh, by the time they're worried about the cost of atomic reference counting, they can do their own research. And then they'll have a nice surprise: free performance!
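To make the table a bit more concrete, here's a rough sketch that builds one value per owned row and lends each of them out as the borrowed type from the first row. The helpers byte_len and sum are invented for this example; the conversions (into_boxed_str, into_boxed_slice, Arc::from) are from the standard library.

```rust
use std::sync::Arc;

// Hypothetical helper: takes the "immutable reference" row of the text column.
fn byte_len(text: &str) -> usize {
    text.len()
}

// Hypothetical helper: takes the "immutable reference" row of the bytes column.
fn sum(bytes: &[u8]) -> u32 {
    bytes.iter().map(|&b| u32::from(b)).sum()
}

fn main() {
    // Text column: each owned flavor can lend out a `&str`.
    let growable: String = String::from("hello");
    let fixed: Box<str> = growable.clone().into_boxed_str();
    let shared: Arc<str> = Arc::from("hello");
    println!("{} {} {}", byte_len(&growable), byte_len(&fixed), byte_len(&shared));

    // Bytes column: each owned flavor can lend out a `&[u8]`.
    let growable: Vec<u8> = vec![1, 2, 3];
    let fixed: Box<[u8]> = growable.clone().into_boxed_slice();
    let shared: Arc<[u8]> = Arc::from(growable.clone());
    println!("{} {} {}", sum(&growable), sum(&fixed), sum(&shared));
}
```

The functions only ever ask for the borrowed form, which is why one signature covers every row.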
There is one thing that's a bit odd, though. In the table above, we have an equivalence between str and [u8]. What are those types?
Ah! Those. Well...
Slices and arrays
Try printing the size of the str and [u8] types!
Okay sure!
use std::mem::size_of; fn main() { dbg!(size_of::<str>()); dbg!(size_of::<[u8]>()); }
Wait, no, we can't:
$ cargo run --quiet error[E0277]: the size for values of type `str` cannot be known at compilation time --> src/main.rs:4:20 | 4 | dbg!(size_of::<str>()); | ^^^ doesn't have a size known at compile-time | = help: the trait `Sized` is not implemented for `str` note: required by a bound in `std::mem::size_of` --> /home/amos/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/mem/mod.rs:304:22 | 304 | pub const fn size_of<T>() -> usize { | ^ required by this bound in `std::mem::size_of` error[E0277]: the size for values of type `[u8]` cannot be known at compilation time --> src/main.rs:5:20 | 5 | dbg!(size_of::<[u8]>()); | ^^^^ doesn't have a size known at compile-time | = help: the trait `Sized` is not implemented for `[u8]` note: required by a bound in `std::mem::size_of` --> /home/amos/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/mem/mod.rs:304:22 | 304 | pub const fn size_of<T>() -> usize { | ^ required by this bound in `std::mem::size_of` For more information about this error, try `rustc --explain E0277`. error: could not compile `grr` due to 2 previous errors
Correct! What about the size of &str and &[u8]?
use std::mem::size_of; fn main() { dbg!(size_of::<&str>()); dbg!(size_of::<&[u8]>()); }
$ cargo run --quiet [src/main.rs:4] size_of::<&str>() = 16 [src/main.rs:5] size_of::<&[u8]>() = 16
Ah, those we can! 16 bytes, that's... 2x8 bytes... two pointers!
Yes! Well, a pointer and a pointer-sized length: start and length.
Okay, so those are always references because... nothing else makes sense? Like, we don't know the size of the thing we're borrowing a slice of?
Yes! And the thing we're borrowing from can be... a lot of different things.
Let's take &[u8]: what types can you borrow a &[u8] out of?
Well... the heading says "arrays" so I'm gonna assume it works for arrays:
use std::mem::size_of_val; fn main() { let arr = [1, 2, 3, 4, 5]; let slice = &arr[1..4]; dbg!(size_of_val(&arr)); dbg!(size_of_val(&slice)); print_byte_slice(slice); } fn print_byte_slice(slice: &[u8]) { println!("{slice:?}"); }
$ cargo run --quiet [src/main.rs:6] size_of_val(&arr) = 5 [src/main.rs:7] size_of_val(&slice) = 16 [2, 3, 4]
Okay, yes.
What else?
I guess, anything we had in that table under "bytes"?
It should definitely work for Vec<u8>:
use std::mem::size_of_val; fn main() { let vec = vec![1, 2, 3, 4, 5]; let slice = &vec[1..4]; dbg!(size_of_val(&vec)); dbg!(size_of_val(&slice)); print_byte_slice(slice); } fn print_byte_slice(slice: &[u8]) { println!("{slice:?}"); }
$ cargo run --quiet [src/main.rs:6] size_of_val(&vec) = 24 [src/main.rs:7] size_of_val(&slice) = 16 [2, 3, 4]
Wait, 24 bytes?
Yeah! Start, length, capacity. Not necessarily in that order. Rust doesn't guarantee a particular type layout anyway, so you shouldn't rely on it.
Next up is Box<[u8]>:
use std::mem::size_of_val; fn main() { let bbox: Box<[u8]> = Box::new([1, 2, 3, 4, 5]); let slice = &bbox[1..4]; dbg!(size_of_val(&bbox)); dbg!(size_of_val(&slice)); print_byte_slice(slice); } fn print_byte_slice(slice: &[u8]) { println!("{slice:?}"); }
$ cargo run --quiet [src/main.rs:6] size_of_val(&bbox) = 16 [src/main.rs:7] size_of_val(&slice) = 16 [2, 3, 4]
Ha, 2x8 bytes each. I suppose... a Box<[u8]> is exactly like a &[u8] except... it has ownership of the data it points to? So we can move it and stuff? And dropping it frees the data?
Yup! And you forgot one: slices of slices.
use std::mem::size_of_val; fn main() { let arr = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]; let slice = &arr[2..7]; let slice_of_slice = &slice[2..]; dbg!(size_of_val(&slice_of_slice)); print_byte_slice(slice_of_slice); } fn print_byte_slice(slice: &[u8]) { println!("{slice:?}"); }
$ cargo run --quiet [src/main.rs:7] size_of_val(&slice_of_slice) = 16 [5, 6, 7]
Very cool.
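If you want to convince yourself that a slice of a slice is still just "start and length" into the original array, here's a small sketch. The offset_and_len helper is hypothetical, and it assumes inner really does point inside outer.

```rust
// Hypothetical helper: how far into `outer` does `inner` start, and how long is it?
// Assumes `inner` was borrowed from somewhere inside `outer`.
fn offset_and_len(outer: &[u8], inner: &[u8]) -> (usize, usize) {
    // Elements are one byte each, so the address difference is also an index.
    let offset = inner.as_ptr() as usize - outer.as_ptr() as usize;
    (offset, inner.len())
}

fn main() {
    let arr: [u8; 10] = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
    let slice = &arr[2..7];
    let slice_of_slice = &slice[2..];

    let (offset, len) = offset_and_len(&arr, slice_of_slice);
    println!("starts {offset} bytes into arr, {len} elements long");
}
```

No data was copied along the way: the inner slice starts at index 4 of arr and is 3 elements long.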
So wait, just to back up: arrays are [T; n], and slices are &[T]. We know the size of arrays because we know how many elements they have, and we know the size of &[T] because it's just start + length.
But we don't know the size of [T] because...
Because the slice could borrow from anything! As we've seen: [u8; n], Vec<u8>, Box<[u8]>, Arc<[u8]>, another slice...
Ah. So we don't know its size.
Wait wait wait.
That makes [T] a dynamically-sized type? Just like trait objects?
Yes, it is a DST.
And we can just do Box<[T]>?
Sure! That's just an owning pointer.
Ooooh that gives me an idea.
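A quick recap of that back-and-forth, as a sketch: the array and the pointers have sizes the compiler knows, while the size of the [u8] itself only shows up at runtime. The runtime_size name is made up; the expected relationships in the comments assume nothing beyond what was measured above.

```rust
use std::mem::{size_of, size_of_val};

// Hypothetical helper: the size of the `[u8]` behind the reference,
// which can only be computed at runtime from the slice's length.
fn runtime_size(bytes: &[u8]) -> usize {
    size_of_val(bytes)
}

fn main() {
    // Known at compile time:
    println!("[u8; 5]   = {}", size_of::<[u8; 5]>());
    println!("&[u8]     = {}", size_of::<&[u8]>());
    println!("Box<[u8]> = {}", size_of::<Box<[u8]>>());

    // Only known at runtime, and different for each slice:
    println!("3 elements   = {}", runtime_size(&[1, 2, 3]));
    println!("100 elements = {}", runtime_size(&vec![0_u8; 100]));
}
```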
Boxed trait objects
So! Deep breaths. If I followed correctly, that means that, although we don't know the size of dyn Display, we know the size of Box<dyn Display>: it should be the same size as &dyn Display, it just has ownership of its... of the thing it points to.
Its pointee, yeah. Also, same with Arc<dyn Display>, or any other smart pointer.
Okay let me check it real quick:
use std::{fmt::Display, mem::size_of, rc::Rc, sync::Arc}; fn main() { dbg!(size_of::<&dyn Display>()); dbg!(size_of::<Box<dyn Display>>()); dbg!(size_of::<Arc<dyn Display>>()); dbg!(size_of::<Rc<dyn Display>>()); }
$ cargo run --quiet [src/main.rs:4] size_of::<&dyn Display>() = 16 [src/main.rs:5] size_of::<Box<dyn Display>>() = 16 [src/main.rs:6] size_of::<Arc<dyn Display>>() = 16 [src/main.rs:7] size_of::<Rc<dyn Display>>() = 16
Okay, okay! They're all the same size, the size of a p-.. of two pointers? What?
Yeah! Data and vtable. You remember how you couldn't do anything with the values in your first generic function?
That one?
fn show<T>(a: T) { todo!() }
The very same. Well there's two ways to solve this. Either you add a trait bound, like so:
fn show<T: std::fmt::Display>(a: T) { // blah }
And then a different version of show gets generated for every type you call it with.
Oooh, right! That's uhh... it's called... discombobulation?
Monomorphization. show is "polymorphic" because it can take multiple forms, and it gets replaced with many "monomorphic" versions of itself, that each handle a certain combination of types.
Okay, so that's one way. And the other way?
You take a reference to a trait object: &dyn Trait.
And that helps how?
Well, it points to the value itself, and a list of all functions required by the trait. And only those.
Oh. Oh! And that's the vtable? It's just "the concrete type's implementation of every function listed in the trait definition"?
Yes. But can you define "concrete type" for me?
Well... let's take this:
use std::fmt::Display; fn main() { let x: u64 = 42; show(x); } fn show<D: Display>(d: D) { println!("{}", d); }
In that case, I'd call D the type parameter (or generic type?), and u64 the concrete type.
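Side by side, the two approaches look something like this. It's only a sketch, and the names render_static and render_dyn are invented for the comparison: the first gets monomorphized per concrete type, the second is a single function that goes through the vtable.

```rust
use std::fmt::Display;

// Static dispatch: the compiler generates one copy of this per concrete `T`.
fn render_static<T: Display>(value: T) -> String {
    format!("{value}")
}

// Dynamic dispatch: a single copy; `Display::fmt` is looked up through
// the vtable carried by the `&dyn Display` reference.
fn render_dyn(value: &dyn Display) -> String {
    format!("{value}")
}

fn main() {
    println!("{}", render_static(42_u64));
    println!("{}", render_static('C'));
    println!("{}", render_dyn(&42_u64));
    println!("{}", render_dyn(&'C'));
}
```

Both print the same thing; what differs is how many copies of the function end up in the binary and how the call to fmt is resolved.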
Okay, I was just making sure. You were about to have an epiphany?
I was? Oh, right!
$ cargo run --quiet [src/main.rs:4] size_of::<&dyn Display>() = 16 [src/main.rs:5] size_of::<Box<dyn Display>>() = 16 [src/main.rs:6] size_of::<Arc<dyn Display>>() = 16 [src/main.rs:7] size_of::<Rc<dyn Display>>() = 16
So these all have the same size.
And the last time we tried returning a dyn Display we ran into trouble because, well, it's dynamically-sized:
use std::fmt::Display; fn main() { let x = get_display(); show(x); } fn get_display() -> dyn Display { let x: u64 = 42; x } fn show<D: Display>(d: D) { println!("{}", d); }
$ cargo run --quiet error[E0746]: return type cannot have an unboxed trait object --> src/main.rs:3:21 | 3 | fn get_display() -> dyn Display { | ^^^^^^^^^^^ doesn't have a size known at compile-time | = note: for information on `impl Trait`, see <https://doc.rust-lang.org/book/ch10-02-traits.html#returning-types-that-implement-traits> help: use `impl Display` as the return type, as all return paths are of type `u64`, which implements `Display` | 3 | fn get_display() -> impl Display { | ~~~~~~~~~~~~ (other errors omitted)
But -> impl Display worked, as the compiler suggests:
fn get_display() -> impl Display { let x: u64 = 42; x }
Because it's sorta like this:
fn get_display<D: Display>() -> D { let x: u64 = 42; x }
Nooooooo no no no. Verboten. Can't do that!
Yeah, you told me! You didn't explain why, though.
Because, and read this very carefully:
When a generic function is called, it must be possible to infer all its type parameters from its inputs alone.
Ah, erm. Wait so it would work if D was also somewhere in the type of a parameter?
Yeah! Consider this:
fn main() { dbg!(add_10(5)); } fn add_10<N>(n: N) -> N { n + 10 }
Wait, that doesn't compile!
$ cargo run --quiet error[E0369]: cannot add `{integer}` to `N` --> src/main.rs:6:7 | 6 | n + 10 | - ^ -- {integer} | | | N
No. But you also truncated the compiler's output.
Here's the rest of it.
help: consider restricting type parameter `N` | 5 | fn add_10<N: std::ops::Add<Output = {integer}>>(n: N) -> N { | +++++++++++++++++++++++++++++++++++
It's not the same issue. The problem here is that N could be anything. Including types that we cannot add 10 to.
Here's a working version:
fn main() { dbg!(add_10(1_u8)); dbg!(add_10(2_u16)); dbg!(add_10(3_u32)); dbg!(add_10(4_u64)); } fn add_10<N>(n: N) -> N where N: From<u8> + std::ops::Add<Output = N>, { n + 10.into() }
Yeesh that's... gnarly.
Yeah. It's also a super contrived example.
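Another way to look at why get_display<D: Display>() -> D is off-limits: with a type parameter, it's the caller who decides what D is, so the body doesn't get to quietly settle on u64. Here's a sketch with a made-up make_default function. It has no arguments to infer T from, so the caller has to spell the type out, with an annotation or a turbofish.

```rust
// Hypothetical helper: returns the default value of whatever type
// the *caller* asks for. The body never picks a concrete type.
fn make_default<T: Default>() -> T {
    T::default()
}

fn main() {
    // No arguments to infer `T` from, so the caller states it explicitly.
    let n: u64 = make_default();
    let s = make_default::<String>();
    println!("n = {n}, s = {s:?}");
}
```

With -> impl Trait it's the other way around: the function body picks the type, and the caller only learns which trait it implements.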
But okay, I get it: impl Trait in return position is the only way to have something about the function signature that's inferred from... its body.
Yes! Which is why both these get_ functions work:
use std::fmt::Display; fn main() { show(get_char()); show(get_int()); } fn get_char() -> impl Display { 'C' } fn get_int() -> impl Display { 64 } fn show(v: impl Display) { println!("{v}"); }
Right, it infers the return type of get_char to be char, and the ret-
Not quite. Well, yes. But it returns an opaque type. The caller doesn't know it's actually a char. All it knows is that it implements Display.
I see.
Still, by itself, it can't unify char and i32, for example. Those are two distinct types.
I wonder what type_name thinks of these...
use std::fmt::Display; fn main() { let c = get_char(); dbg!(type_name_of(&c)); let i = get_int(); dbg!(type_name_of(&i)); } fn get_char() -> impl Display { 'C' } fn get_int() -> impl Display { 64 } fn type_name_of<T>(_: T) -> &'static str { std::any::type_name::<T>() }
$ cargo run --quiet [src/main.rs:5] type_name_of(&c) = "&char" [src/main.rs:7] type_name_of(&i) = "&i32"
Hahahaha. Not so opaque after all.
That's uhh.. didn't expect type_name to do that, to be honest.
But they are opaque, I promise. You can call char methods on a real char, but not on the return type of get_char:
use std::fmt::Display; fn main() { let real_c = 'a'; dbg!(real_c.to_ascii_uppercase()); let opaque_c = get_char(); dbg!(opaque_c.to_ascii_uppercase()); } fn get_char() -> impl Display { 'C' }
$ cargo run --quiet error[E0599]: no method named `to_ascii_uppercase` found for opaque type `impl std::fmt::Display` in the current scope --> src/main.rs:8:19 | 8 | dbg!(opaque_c.to_ascii_uppercase()); | ^^^^^^^^^^^^^^^^^^ method not found in `impl std::fmt::Display` For more information about this error, try `rustc --explain E0599`. error: could not compile `grr` due to previous error
Also, I'm fairly sure type_id will give us different values...
use std::{any::TypeId, fmt::Display}; fn main() { let opaque_c = get_char(); dbg!(type_id_of(opaque_c)); let real_c = 'a'; dbg!(type_id_of(real_c)); } fn get_char() -> impl Display { 'C' } fn type_id_of<T: 'static>(_: T) -> TypeId { TypeId::of::<T>() }
$ cargo run --quiet [src/main.rs:5] type_id_of(opaque_c) = TypeId { t: 15782864888164328018, } [src/main.rs:8] type_id_of(real_c) = TypeId { t: 15782864888164328018, }
Ah, huh. I guess not.
Yeah, it seems like opaque types are a type-checker trick, and at runtime the value is just the concrete type. The checker will just have prevented us from calling anything that wasn't in the trait.
Actually, now I understand better why this cannot work:
use std::fmt::Display; fn main() { show(get_char_or_int(true)); show(get_char_or_int(false)); } fn get_char_or_int(give_char: bool) -> impl Display { if give_char { 'C' } else { 64 } } fn show(v: impl Display) { println!("{v}"); }
$ cargo run --quiet error[E0308]: `if` and `else` have incompatible types --> src/main.rs:12:9 | 9 | / if give_char { 10 | | 'C' | | --- expected because of this 11 | | } else { 12 | | 64 | | ^^ expected `char`, found integer 13 | | } | |_____- `if` and `else` have incompatible types For more information about this error, try `rustc --explain E0308`. error: could not compile `grr` due to previous error
It's because the return type cannot be simultaneously char and, say, i32.
Yes, and also: it's because there's no vtable involved. Remember the enum version you did?
Yeah! That one:
use delegate::delegate; use std::fmt::Display; fn main() { show(get_char_or_int(true)); show(get_char_or_int(false)); } impl Either { fn display(&self) -> &dyn Display { match self { Either::Char(c) => c, Either::Int(i) => i, } } } impl Display for Either { delegate! { to self.display() { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result; } } } enum Either { Char(char), Int(i64), } fn get_char_or_int(give_char: bool) -> Either { if give_char { Either::Char('C') } else { Either::Int(64) } } fn show(v: impl Display) { println!("{v}"); }
Right! In that one, you're manually dispatching Display::fmt to either the implementation for char or the one for i64.
Well no, delegate is doing it for me.
Well, you did it here:
impl Display for Either { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { match self { Either::Char(c) => c.fmt(f), Either::Int(i) => i.fmt(f), } } }
Right, yes, I see the idea. So a vtable does the same thing?
Eh, not quite. It's more like function pointers.
Can you show me?
Okay, but real quick then.
use std::{ fmt::{self, Display}, mem::transmute, }; // This is our type that can contain any value that implements `Display` struct BoxedDisplay { // This is a pointer to the actual value, which is on the heap. data: *mut (), // And this is a reference to the vtable for Display's implementation of the // type of our value. vtable: &'static DisplayVtable<()>, } // 👆 Note that there are no type parameters at all in the above type. The // type is _erased_. // Then we need to declare our vtable type. // This is a type-safe take on it (thanks @eddyb for the idea), but you may // have noticed `BoxedDisplay` pretends they're all `DisplayVtable<()>`, which // is fine because we're only dealing with pointers to `T` / `()`, which all // have the same size. #[repr(C)] struct DisplayVtable<T> { // This is the implementation of `Display::fmt` for `T` fmt: unsafe fn(*mut T, &mut fmt::Formatter<'_>) -> fmt::Result, // We also need to be able to drop a `T`. For that we need to know how large // `T` is, and there may be side effects (freeing OS resources, flushing a // buffer, etc.) so it needs to go into the vtable too. drop: unsafe fn(*mut T), } impl<T: Display> DisplayVtable<T> { // This lets us build a `DisplayVtable` for any `T` that implements `Display` fn new() -> &'static Self { // Why yes you can declare functions in that scope. This one just // forwards to `T`'s `Display` implementation. unsafe fn fmt<T: Display>(this: *mut T, f: &mut fmt::Formatter<'_>) -> fmt::Result { (*this).fmt(f) } // Here we turn a raw pointer (`*mut T`) back into a `Box<T>`, which // has ownership of it and thus, knows how to drop (free) it. unsafe fn drop<T>(this: *mut T) { Box::from_raw(this); } // 👆 These are both regular functions, not closures. They end up in // the executable, thus they live for 'static, thus we can return a // `&'static Self` as requested. &Self { fmt, drop } } } // Okay, now we can make a constructor for `BoxedDisplay` itself! 
impl BoxedDisplay { // The `'static` bound makes sure `T` is _owned_ (it can't be a reference // shorter than 'static). fn new<T: Display + 'static>(t: T) -> Self { // Let's do some type erasure! Self { // Box<T> => *mut T => *mut () data: Box::into_raw(Box::new(t)) as _, // &'static DisplayVtable<T> => &'static DisplayVtable<()> vtable: unsafe { transmute(DisplayVtable::<T>::new()) }, } } } // That one's easy — we dispatch to the right `fmt` function using the vtable. impl Display for BoxedDisplay { fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { unsafe { (self.vtable.fmt)(self.data, f) } } } // Same here. impl Drop for BoxedDisplay { fn drop(&mut self) { unsafe { (self.vtable.drop)(self.data); } } } // And finally, we can use it! fn get_char_or_int(give_char: bool) -> BoxedDisplay { if give_char { BoxedDisplay::new('C') } else { BoxedDisplay::new(64) } } fn show(v: impl Display) { println!("{v}"); } fn main() { show(get_char_or_int(true)); show(get_char_or_int(false)); }
$ cargo run --quiet C 64
Whoa. Whoa whoa whoa, that could be its own article!
Yes. And yet here we are.
And there's unsafe code in there, how do you know it's okay?
Well, miri is happy about it, so that's a good start:
$ cargo +nightly miri run --quiet C 64
And do I really need to write code like that?
No you don't! But you can, and the standard library does have code like that, which is awesome, because you don't need to learn a whole other language to drop down and work on it.
Wait, unsafe Rust is not a whole other language?
Touché, smartass.
Anyway you don't need to write all of that yourself because that's exactly what Box<dyn Display> already is.
Oh, word?
use std::fmt::Display; fn get_char_or_int(give_char: bool) -> Box<dyn Display> { if give_char { Box::new('C') } else { Box::new(64) } } fn show(v: impl Display) { println!("{v}"); } fn main() { show(get_char_or_int(true)); show(get_char_or_int(false)); }
$ cargo run --quiet C 64
Neat! Super neat.
Really the "magic" happens in the trait object itself. Here it's boxed, but it may as well be arc'd:
fn get_char_or_int(give_char: bool) -> Arc<dyn Display> { if give_char { Arc::new('C') } else { Arc::new(64) } }
And that would work just as well. Or, again, just a reference:
fn get_char_or_int(give_char: bool) -> &'static dyn Display { if give_char { &'C' } else { &64 } }
Well, that's a comfort. For a second there I really thought I would have to write my own custom vtable implementation every time I want to do something useful.
No, this isn't the 1970s. We have re-usable code now.
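To tie this back to the hand-rolled version: a pointer to a trait object carries two things, the data pointer and the vtable pointer, whichever smart pointer is doing the carrying. Here's a small std-only sketch that checks the sizes. The helper name pointer_sizes is made up for this example, and the "two words" layout is what rustc does on the targets I know of rather than something I'd promise for every platform.

```rust
use std::fmt::Display;
use std::mem::size_of;
use std::sync::Arc;

/// Hypothetical helper: sizes in bytes of
/// (a thin Box, Box<dyn Display>, Arc<dyn Display>, &dyn Display).
fn pointer_sizes() -> (usize, usize, usize, usize) {
    (
        size_of::<Box<i32>>(),
        size_of::<Box<dyn Display>>(),
        size_of::<Arc<dyn Display>>(),
        size_of::<&dyn Display>(),
    )
}

fn main() {
    let (thin, boxed, arcd, borrowed) = pointer_sizes();
    // The three trait-object pointers should all be twice as wide as the thin one.
    println!("thin = {thin}, Box<dyn> = {boxed}, Arc<dyn> = {arcd}, &dyn = {borrowed}");
}
```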
Reading type signatures
Ok so... there's a lot of different names for essentially the same thing, like &str and String, and &[u8] and Vec<u8>, etc.
Seems like a bunch of extra work. What's the upside?
Well, sometimes it catches bugs.
Ah!
The big thing there is lifetimes, in the context of concurrent code, but...
Whoa there, I don't think we've-
BUT, immutability is another big one.
Consider this:
function double(arr) { for (var i = 0; i < arr.length; i++) { arr[i] *= 2; } return arr; } let a = [1, 2, 3]; console.log({ a }); let b = double(a); console.log({ b });
Ah, easy! This'll print 1, 2, 3 and then 2, 4, 6.
$ node main.js { a: [ 1, 2, 3 ] } { b: [ 2, 4, 6 ] }
Called it!
Now what if we call it like this?
let a = [1, 2, 3]; console.log({ a }); let b = double(a); console.log({ a, b });
Ah, then, mh... 1, 2, 3 and then... 1, 2, 3 and 2, 4, 6?
Wrong!
$ node main.js { a: [ 1, 2, 3 ] } { a: [ 2, 4, 6 ], b: [ 2, 4, 6 ] }
Ohhh! Right I suppose double took the array by reference, and so it mutated it in-place.
Mhhh. I guess we have to think about these things in ECMAScript-land, too.
We very much do! We can "fix" it like this for example:
function double(arr) { let result = new Array(arr.length); for (var i = 0; i < arr.length; i++) { result[i] = arr[i] * 2; } return result; }
$ node main.js { a: [ 1, 2, 3 ] } { a: [ 1, 2, 3 ], b: [ 2, 4, 6 ] }
Wait, wouldn't we rather use a functional style, like so?
function double(arr) { return arr.map((x) => x * 2); }
That works too! It's just 86% slower according to this awful microbenchmark I just made.
Aw, nuts. We have to worry about performance too in ECMAScript-land?
You can if you want to! But let's stay on "correctness".
Let's try porting those functions to Rust.
fn main() { let a = vec![1, 2, 3]; println!("a = {a:?}"); let b = double(a); println!("b = {b:?}"); } fn double(a: Vec<i32>) -> Vec<i32> { a.into_iter().map(|x| x * 2).collect() }
Let's give it a run...
$ cargo run -q a = [1, 2, 3] b = [2, 4, 6]
Yeah that checks out.
So, same question as before: do you think double is messing with a?
I don't think so?
Try printing it!
fn main() { let a = vec![1, 2, 3]; println!("a = {a:?}"); let b = double(a); println!("a = {a:?}"); println!("b = {b:?}"); }
$ cargo run -q error[E0382]: borrow of moved value: `a` --> src/main.rs:5:20 | 2 | let a = vec![1, 2, 3]; | - move occurs because `a` has type `Vec<i32>`, which does not implement the `Copy` trait 3 | println!("a = {a:?}"); 4 | let b = double(a); | - value moved here 5 | println!("a = {a:?}"); | ^ value borrowed here after move | = note: this error originates in the macro `$crate::format_args_nl` (in Nightly builds, run with -Z macro-backtrace for more info) For more information about this error, try `rustc --explain E0382`. error: could not compile `grr` due to previous error
Wait, we can't. double takes ownership of a, so there's no a left for us to print.
Correct! What about this version?
fn main() { let a = vec![1, 2, 3]; println!("a = {a:?}"); let b = double(&a); println!("a = {a:?}"); println!("b = {b:?}"); } fn double(a: &Vec<i32>) -> Vec<i32> { a.iter().map(|x| x * 2).collect() }
That one... mhh that one should work?
It does!
$ cargo run -q a = [1, 2, 3] a = [1, 2, 3] b = [2, 4, 6]
But tell me, do we really need to take a &Vec?
What do you mean?
Well, a Vec<T> is neat because it can grow, and shrink. This is useful when collecting results, for example, and we don't know how many results we'll end up having. We need to be able to push elements onto it, without worrying about running out of space.
I suppose so yeah? Well in our case... I suppose all we do is read from a, so no, we don't really need a &Vec. But what else would we take?
Let's ask clippy!
$ cargo clippy -q warning: writing `&Vec` instead of `&[_]` involves a new object where a slice will do --> src/main.rs:9:14 | 9 | fn double(a: &Vec<i32>) -> Vec<i32> { | ^^^^^^^^^ help: change this to: `&[i32]` | = note: `#[warn(clippy::ptr_arg)]` on by default = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#ptr_arg
Ohhhh a slice, of course!
fn double(a: &[i32]) -> Vec<i32> { a.iter().map(|x| x * 2).collect() }
And now does this version mess with a?
Oh definitely not. Our a in the main function is a growable Vec, and we pass a read-only slice of it to the function, so all it can do is read.
Correct!
$ cargo run -q a = [1, 2, 3] a = [1, 2, 3] b = [2, 4, 6]
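A nice side effect of taking a slice: the same function now accepts more than just a Vec. A quick sketch (same double as above, the inputs are made up) showing a Vec, a plain array, and a sub-range all going through the one signature:

```rust
fn double(a: &[i32]) -> Vec<i32> {
    a.iter().map(|x| x * 2).collect()
}

fn main() {
    let v = vec![1, 2, 3];
    let arr = [4, 5, 6];

    println!("{:?}", double(&v)); // a &Vec<i32> coerces to &[i32]
    println!("{:?}", double(&arr)); // so does a reference to an array
    println!("{:?}", double(&v[1..])); // and a sub-range is already a slice
    println!("v is untouched: {v:?}");
}
```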
How about this one:
fn double(a: &mut [i32]) -> Vec<i32> { a.iter().map(|x| x * 2).collect() }
Well, seems unnecessary? And.. it doesn't compile:
$ cargo run -q error[E0308]: mismatched types --> src/main.rs:4:20 | 4 | let b = double(&a); | ^^ types differ in mutability | = note: expected mutable reference `&mut [i32]` found reference `&Vec<{integer}>` For more information about this error, try `rustc --explain E0308`. error: could not compile `grr` due to previous error
So? Make it compile!
Alright then:
fn main() { // 👇 let mut a = vec![1, 2, 3]; println!("a = {a:?}"); // 👇 let b = double(&mut a); println!("a = {a:?}"); println!("b = {b:?}"); } fn double(a: &mut [i32]) -> Vec<i32> { a.iter().map(|x| x * 2).collect() }
There. It prints exactly the same thing.
So this works. But is it good?
Not really no. We're asking for more than what we need.
Indeed! We never mutate the input, so we don't need a mutable slice of it.
But can you show a case where it would get in the way?
Yes I suppose... I suppose if we wanted to double the input in parallel a bunch of times? I mean it's pretty contrived, but.. gimme a second.
$ cargo add crossbeam (cut)
fn main() { let mut a = vec![1, 2, 3]; println!("a = {a:?}"); crossbeam::scope(|s| { for _ in 0..5 { s.spawn(|_| { let b = double(&mut a); println!("b = {b:?}"); }); } }) .unwrap(); } fn double(a: &mut [i32]) -> Vec<i32> { a.iter().map(|x| x * 2).collect() }
There. That fails because we can't borrow a mutably more than once at a time:
$ cargo run -q error[E0499]: cannot borrow `a` as mutable more than once at a time --> src/main.rs:7:21 | 5 | crossbeam::scope(|s| { | - has type `&crossbeam::thread::Scope<'1>` 6 | for _ in 0..5 { 7 | s.spawn(|_| { | - ^^^ `a` was mutably borrowed here in the previous iteration of the loop | _____________| | | 8 | | let b = double(&mut a); | | - borrows occur due to use of `a` in closure 9 | | println!("b = {b:?}"); 10 | | }); | |______________- argument requires that `a` is borrowed for `'1` For more information about this error, try `rustc --explain E0499`. error: could not compile `grr` due to previous error
But it works if we just take an immutable reference:
fn main() { let a = vec![1, 2, 3]; println!("a = {a:?}"); crossbeam::scope(|s| { for _ in 0..5 { s.spawn(|_| { let b = double(&a); println!("b = {b:?}"); }); } }) .unwrap(); } fn double(a: &[i32]) -> Vec<i32> { a.iter().map(|x| x * 2).collect() }
$ cargo run -q a = [1, 2, 3] b = [2, 4, 6] b = [2, 4, 6] b = [2, 4, 6] b = [2, 4, 6] b = [2, 4, 6]
Very good! Look at you! And you used crossbeam because?
Because... something something scoped threads. Forget about that part. You got what you wanted, right?
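For what it's worth, the standard library has had its own scoped threads (std::thread::scope) since Rust 1.63, so the same experiment can be sketched without an extra crate. The helper name double_on_threads is invented for this example, and it collects the results instead of printing them from each thread:

```rust
use std::thread;

fn double(a: &[i32]) -> Vec<i32> {
    a.iter().map(|x| x * 2).collect()
}

/// Hypothetical helper: doubles `a` on `n` threads that all share one `&[i32]`.
fn double_on_threads(a: &[i32], n: usize) -> Vec<Vec<i32>> {
    thread::scope(|s| {
        let mut handles = Vec::new();
        for _ in 0..n {
            // Shared (immutable) borrows can be handed to many threads at once.
            handles.push(s.spawn(|| double(a)));
        }
        // Every scoped thread is joined before `scope` returns.
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}

fn main() {
    let a = vec![1, 2, 3];
    for b in double_on_threads(&a, 5) {
        println!("b = {b:?}");
    }
}
```

Swap the parameter for &mut [i32] and the borrow checker should complain in much the same way as in the crossbeam version.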
I did! Next question: doesn't this code have the exact same performance issues as our ECMAScript .map()-based function?
Yes and no: we are allocating a new Vec, but it probably has the exact right size to begin with, because Rust iterators have size hints.
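Size hints are easy to poke at directly. In this sketch (std only, inputs made up), the mapped slice iterator reports an exact count, while a filtered one can only promise "somewhere between zero and all of them". The capacity that collect ends up with is an implementation detail, so the example prints it rather than asserting anything about it.

```rust
fn double(a: &[i32]) -> Vec<i32> {
    a.iter().map(|x| x * 2).collect()
}

fn main() {
    let a = [1, 2, 3];

    // A slice iterator knows how many items are left, and map() passes
    // that along as (lower bound, optional upper bound).
    let exact = a.iter().map(|x| x * 2).size_hint();
    println!("map:    {exact:?}");

    // filter() can't know how many items will pass, so its lower bound is 0.
    let vague = a.iter().filter(|x| **x > 1).size_hint();
    println!("filter: {vague:?}");

    let b = double(&a);
    println!("len = {}, capacity = {}", b.len(), b.capacity());
}
```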
Ah, mh, okay, but what if we did want to mutate the vec in-place?
Ah, then I suppose we could do this:
fn main() { let a = vec![1, 2, 3]; println!("a = {a:?}"); let b = double(a); println!("b = {b:?}"); } fn double(a: Vec<i32>) -> Vec<i32> { for i in 0..a.len() { a[i] *= 2; } a }
Wait, no:
$ cargo run -q error[E0596]: cannot borrow `a` as mutable, as it is not declared as mutable --> src/main.rs:11:9 | 9 | fn double(a: Vec<i32>) -> Vec<i32> { | - help: consider changing this to be mutable: `mut a` 10 | for i in 0..a.len() { 11 | a[i] *= 2; | ^ cannot borrow as mutable For more information about this error, try `rustc --explain E0596`. error: could not compile `grr` due to previous error
I mean this:
fn double(mut a: Vec<i32>) -> Vec<i32> { for i in 0..a.len() { a[i] *= 2; } a }
Wait, no:
$ cargo clippy -q warning: the loop variable `i` is only used to index `a` --> src/main.rs:10:14 | 10 | for i in 0..a.len() { | ^^^^^^^^^^ | = note: `#[warn(clippy::needless_range_loop)]` on by default = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_range_loop help: consider using an iterator | 10 | for <item> in &mut a { | ~~~~~~ ~~~~~~
I mean this:
fn double(mut a: Vec<i32>) -> Vec<i32> { for x in a.iter_mut() { *x *= 2; } a }
Okay, no need to run it, I know what it does. But is it good?
Idk. Seems okay? What's wrong with it?
Well, do you really need to take ownership of the Vec? Do you need a Vec in the first place?
What if you want to do this?
fn main() { let mut a = [1, 2, 3]; println!("a = {a:?}"); let b = double(a); println!("b = {b:?}"); }
Ah yeah, that won't work. Well no I suppose we don't need a Vec... after all, we're doing everything in-place, the array.. vector.. whatever, container, doesn't need to grow or shrink.
So we can take... OH! A mutable slice:
fn main() { let mut a = [1, 2, 3]; println!("a = {a:?}"); double(&mut a); println!("a = {a:?}"); } fn double(a: &mut [i32]) { for x in a.iter_mut() { *x *= 2 } }
$ cargo run -q a = [1, 2, 3] a = [2, 4, 6]
And let's make sure it works with a Vec, too:
fn main() { let mut a = vec![1, 2, 3]; println!("a = {a:?}"); double(&mut a); println!("a = {a:?}"); }
$ cargo run -q a = [1, 2, 3] a = [2, 4, 6]
Yes it does!
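And since a mutable slice is just "a view into some i32s we're allowed to change", nothing says it has to cover the whole container. A small sketch, reusing the double from above on only the back half of a Vec (the inputs are made up):

```rust
fn double(a: &mut [i32]) {
    for x in a.iter_mut() {
        *x *= 2
    }
}

fn main() {
    let mut arr = [1, 2, 3];
    double(&mut arr); // the whole array

    let mut v = vec![1, 2, 3, 4];
    double(&mut v[2..]); // only the last two elements

    println!("{arr:?} {v:?}");
}
```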
Okay! It's time... for a quiz.
Here's a method defined on slices:
impl<T> [T] { pub const fn first(&self) -> Option<&T> { // ... } }
Does it mutate the slice?
No! It takes an immutable reference (&self), so all it does is read.
Correct!
fn main() { let a = vec![1, 2, 3]; dbg!(a.first()); }
$ cargo run -q [src/main.rs:3] a.first() = Some( 1, )
What about this one?
impl<T> [T] { pub fn fill(&mut self, value: T) where T: Clone, { // ... } }
Oh that one mutates! Given the name, I'd say it fills the whole slice with value, and... it needs to be able to make clones of the value because it might need to repeat it several times.
Right again!
fn main() { let mut a = [0u8; 5]; a.fill(3); dbg!(a); }
$ cargo run -q [src/main.rs:4] a = [ 3, 3, 3, 3, 3, ]
What about this one?
impl<T> [T] { pub fn iter(&self) -> Iter<'_, T> { // ... } }
Ooh that one's a toughie. So no mutation, and it uhhh borrows... through? I mean we've only briefly seen lifetimes, but I'm assuming we can't mutate a thing while we're iterating through it, so like, this:
fn main() { let mut a = [1, 2, 3, 4, 5]; let mut iter = a.iter(); dbg!(iter.next()); dbg!(iter.next()); a[2] = 42; dbg!(iter.next()); dbg!(iter.next()); }
...can't possibly work:
$ cargo run -q error[E0506]: cannot assign to `a[_]` because it is borrowed --> src/main.rs:6:5 | 3 | let mut iter = a.iter(); | -------- borrow of `a[_]` occurs here ... 6 | a[2] = 42; | ^^^^^^^^^ assignment to borrowed `a[_]` occurs here 7 | dbg!(iter.next()); | ----------- borrow later used here For more information about this error, try `rustc --explain E0506`. error: could not compile `grr` due to previous error
Yeah! Right again 😎
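The flip side is worth seeing too: the borrow only lasts as long as the iterator is still used. In this sketch (peek_then_patch is a made-up helper), the assignment moves to after the last use of iter, and the compiler is fine with it:

```rust
/// Hypothetical helper: reads the first two items, then patches index 2.
fn peek_then_patch(mut a: [i32; 5]) -> (Vec<i32>, [i32; 5]) {
    let mut iter = a.iter();
    // Copy the values out, so `seen` doesn't keep borrowing from `a`.
    let seen = vec![*iter.next().unwrap(), *iter.next().unwrap()];

    // `iter` is never used past this point, so its borrow of `a` is over...
    a[2] = 42; // ...and mutating is allowed again.
    (seen, a)
}

fn main() {
    let (seen, a) = peek_then_patch([1, 2, 3, 4, 5]);
    println!("seen = {seen:?}, a = {a:?}");
}
```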
Alrighty! Moving on.
Closures
So, remember this code?
fn double(a: &[i32]) -> Vec<i32> { a.iter().map(|x| x * 2).collect() }
That's a closure.
That's a... which part, the pipe-looking thing? |x| x * 2?
Yes. It's like a function.
Wait, no, a function is like this:
fn main() { let a = [1, 2, 3]; let b = double(&a); dbg!(b); } // 👇 this fn times_two(x: &i32) -> i32 { x * 2 } fn double(a: &[i32]) -> Vec<i32> { // which we then 👇 use here a.iter().map(times_two).collect() }
$ cargo run -q [src/main.rs:4] b = [ 2, 4, 6, ]
Yeah. It does the same thing.
Oh, now that you mention it yes, yes it does do the same thing.
Except a closure can close over its environment.
I see. No, wait. I don't. I don't see at all. Its environment? As in the birds and the trees and th-
Kinda, except it's more like... bindings. Look:
fn double(a: &[i32]) -> Vec<i32> { let factor = 2; a.iter().map(|x| x * factor).collect() }
Ohhh. Well that's a constant, it doesn't really count.
Fineeee, here:
fn main() { let a = [1, 2, 3]; let b = mul(&a, 10); dbg!(b); } fn mul(a: &[i32], factor: i32) -> Vec<i32> { a.iter().map(|x| x * factor).collect() }
Okay, okay, I see. So factor is definitely not a constant there (if we don't count constant folding), and it's... captured?
Closed over, yes.
...closed over by the closure. I'm gonna say "captured". Seems less obscure.
Sure, fine.
Wait wait wait this is boxed trait objects all over again, right? Sort of? Because closures are actually fat pointers? One pointer to the function itself, and one for the, uh, "environment". I mean, for everything captured by the closure.
Kinda, yes! But aren't we getting ahead of ourselv-
No no no, not at all, it doesn't matter that there might be a lot of new words, or that the underlying concepts aren't crystal clear to everyone reading this yet.
What matters is that we can proceed by analogy, because we've seen similar fuckery just before, and so we can show an example of a manual implementation of closures, just like we did boxed trait objects, and that'll clear it up for everyone.
Are you sure that'll work?
Eh, it's worth a shot right?
So here's what I mean. Say we want to provide a function that does something three times:
fn main() { do_three_times(todo!()); } fn do_three_times<T>(t: T) { todo!() }
It's generic, because it can do any thing three times. Caller's choice. Only how do I... how does the thing... do... something.
Oh! Traits! I can make a trait, hang on.
trait Thing { fn do_it(&self); }
There. And then do_three_times will take anything that implements Thing... oh we can use impl Trait syntax, no need for explicit generic type parameters here:
fn do_three_times(t: impl Thing) { for _ in 0..3 { t.do_it(); } }
And then to call it, well... we need some type, on which we implement Thing, and make it do a thing. What's a good way to make up a new type that's empty?
Empty struct?
Right!
struct Greet; impl Thing for Greet { fn do_it(&self) { println!("hello!"); } } fn main() { do_three_times(Greet); }
And, if my calculations are correct...
$ cargo run -q hello! hello! hello!
Yes!!! See bear? Easy peasy! That wasn't even very long at all.
I must admit, I'm impressed.
And look, we can even box these!
trait Thing { fn do_it(&self); } fn do_three_times(things: &[Box<dyn Thing>]) { for _ in 0..3 { for t in things { t.do_it() } } } struct Greet; impl Thing for Greet { fn do_it(&self) { println!("hello!"); } } struct Part; impl Thing for Part { fn do_it(&self) { println!("goodbye!"); } } fn main() { do_three_times(&[Box::new(Greet), Box::new(Part)]); }
$ cargo run -q hello! goodbye! hello! goodbye! hello! goodbye!
Very nice. You even figured out how to make slices of heterogeneous types.
Now let's see Paul Allen's trai-
Let me stop you right there, bear. I know what you're about to ask: "Oooh, but what if you need to mutate stuff from inside the closure? That won't work will it? Because Wust is such a special widdle wanguage uwu, it can't just wet you do the things you want, it has to be a whiny baby about it" well HAVE NO FEAR because yes, yes, I have realized that this right here:
trait Thing { // 👇 fn do_it(&self); }
...means the closure can never mutate its environment.
Ah!
And so what you'd need to do if you wanted to be able to do that, is have a ThingMut trait, like so:
trait ThingMut { fn do_it(&mut self); } fn do_three_times(mut t: impl ThingMut) { for _ in 0..3 { t.do_it() } } struct Greet(usize); impl ThingMut for Greet { fn do_it(&mut self) { self.0 += 1; println!("hello {}!", self.0); } } fn main() { do_three_times(Greet(0)); }
$ cargo run -q hello 1! hello 2! hello 3!
Yes, but you don't really ne-
BUT YOU DON'T NEED TO TAKE OWNERSHIP OF THE THINGMUT I know I know, watch this:
fn do_three_times(t: &mut dyn ThingMut) { for _ in 0..3 { t.do_it() } }
Boom!
fn main() { do_three_times(&mut Greet(0)); }
Bang.
And I suppose you don't need me to do the link with the actual traits in the Rust standard library either?
Eh, who needs you. I'm sure I can find them... there!
There's three of them:
pub trait FnOnce<Args> { type Output; extern "rust-call" fn call_once(self, args: Args) -> Self::Output; } pub trait FnMut<Args>: FnOnce<Args> { extern "rust-call" fn call_mut( &mut self, args: Args ) -> Self::Output; } pub trait Fn<Args>: FnMut<Args> { extern "rust-call" fn call(&self, args: Args) -> Self::Output; }
So all Fn (immutable reference) are also FnMut (mutable reference), which are also FnOnce (takes ownership). Beautiful symmetry.
And then... I'm assuming the extern "rust-call" fuckery is because... lack of variadics right now?
Right, yes. And that's also why you can't really implement the Fn / FnMut / FnOnce traits yourself on arbitrary types right now.
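That hierarchy can be checked from the caller's side. A sketch with three made-up helpers (call_fn, call_fn_mut, call_fn_once), one per trait: a closure that only reads its environment is accepted by all three, one that mutates is only accepted by the last two, and one that consumes something it captured is only accepted by the last.

```rust
fn call_fn(f: impl Fn() -> i32) -> i32 {
    f() + f()
}

fn call_fn_mut(mut f: impl FnMut() -> i32) -> i32 {
    f() + f()
}

fn call_fn_once(f: impl FnOnce() -> i32) -> i32 {
    f()
}

fn main() {
    let base = 10;
    // Only reads `base`, so it implements Fn (and therefore FnMut and FnOnce).
    let read_only = || base + 1;
    println!("{}", call_fn(read_only));
    println!("{}", call_fn_mut(read_only));
    println!("{}", call_fn_once(read_only));

    let mut counter = 0;
    // Mutates `counter`: FnMut and FnOnce, but not Fn.
    let counting = || {
        counter += 1;
        counter
    };
    println!("{}", call_fn_mut(counting)); // call_fn(counting) would not compile

    let name = String::from("bear");
    // Consumes `name` when called, so it can only be called once: FnOnce only.
    let consuming = move || {
        drop(name);
        0
    };
    println!("{}", call_fn_once(consuming));
}
```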
Yeah, see! Easy. So our example becomes this:
fn do_three_times(t: &mut dyn FnMut()) { for _ in 0..3 { t() } } fn main() { let mut counter = 0; do_three_times(&mut || { counter += 1; println!("hello {counter}!") }); }
Bam, weird syntax but that's a lot less typing, I like it, arguments are between pipes, sure why not.
Arguments are between pipes, what do you mean?
Oh, well closures can take arguments too, they're just like functions right? You told me that. So we can... do this!
// 👇 fn do_three_times(t: impl Fn(i32)) { for i in 0..3 { t(i) } } fn main() { // 👇 do_three_times(|i| println!("hello {i}!")); }
I see. And I suppose you've figured out boxing as well?
The sport, no. But the type erasure, sure, in that regard they're just regular traits, so, here we go:
fn do_all_the_things(things: &[Box<dyn Fn()>]) { for t in things { t() } } fn main() { do_all_the_things(&[ Box::new(|| println!("hello")), Box::new(|| println!("how are you")), Box::new(|| println!("I wasn't really asking")), Box::new(|| println!("goodbye")), ]); }
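The closures in that slice don't capture anything, but boxed closures can carry an environment too. Here's a sketch under one assumption worth flagging: it uses the move keyword, which hasn't come up yet, to make each closure own its copy of factor so the box can outlive the function that built it. The helper name make_multipliers is invented for the example.

```rust
/// Hypothetical helper: builds one boxed closure per factor.
fn make_multipliers(factors: &[i32]) -> Vec<Box<dyn Fn(i32) -> i32>> {
    factors
        .iter()
        .map(|&factor| {
            // `move` copies `factor` into the closure, so the box owns its
            // whole environment and can be returned from this function.
            Box::new(move |x: i32| x * factor) as Box<dyn Fn(i32) -> i32>
        })
        .collect()
}

fn main() {
    let multipliers = make_multipliers(&[2, 10]);
    for f in &multipliers {
        println!("{}", f(3));
    }
}
```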
Well. It looks like you're all set.
Nothing left to learn.
The world no longer holds any secrets for you.
Through science, you have rid the universe of its last mystery, and you are now cursed to roam, surrounded by the mundane, devoid of the last shred of poet-
Wait, what about async stuff?
Ahhhhhhhhhhhhhhhhhhh fuck.
Async stuff
Okay, async stuff, is.... ugh. Wait, you've written about this before.
Multiple times yes, but humor me. Why do I want it?
You don't! God, why would you. I mean, okay you want it if you're writing network services and stuff.
Oh yes, I do want to do that! So I do want async!
Yes. Yes you very much want async.
And I've heard it makes everything worse!
Well...... so, you know how if you write a file, it writes to the file?
Yes? Like that:
fn main() { // error handling omitted for story-telling purposes let _ = std::fs::write("/tmp/hi", "hi!\n"); }
$ cargo run -q && cat /tmp/hi hi!
Well async is the same, except it doesn't work.
$ cargo add tokio --features full (cut)
fn main() { // error handling omitted for story-telling purposes // 👇 (was `std`) let _ = tokio::fs::write("/tmp/bye", "bye!\n"); }
$ cargo run -q && cat /tmp/bye cat: /tmp/bye: No such file or directory
Ah. Indeed it doesn't work.
Exactly, it does nothing, zilch:
$ cargo b -q && strace -ff ./target/debug/grr 2>&1 | grep bye
When the other clearly did something:
$ cargo b -q && strace -ff ./target/debug/grr 2>&1 | grep hi openat(AT_FDCWD, "/tmp/hi", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = 3 write(3, "hi!\n", 4) = 4
But wait, that's cuckoo. The cinECMAtic javascript universe also has async and it certainly does do things:
async function main() { await require("fs").promises.writeFile("/tmp/see", "see"); } main();
$ strace -ff node main.js 2>&1 | grep see [pid 1825359] openat(AT_FDCWD, "/tmp/see", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666 <unfinished ...> [pid 1825360] write(17, "see", 3 <unfinished ...>
It does do them things, yes. That's because Node.js® is very async at its core. See, the idea... well that's unfair but let's pretend the idea was "threads are hard okay".
Sure, I can buy that. Threads seem hard — especially when there's a bunch of them stepping on each other's knees and toes, knees and toes.
So fuck threads right? Instead of doing blocking calls...
Wait what are bl-
calls that, like, block! Block everything. You're waiting for... some file to be read, and in the meantime, nothing else can happen.
Right. So instead of that we... do callbacks? Those used to be huge right.
Exactly! You say "I'd like to read from that file" and say "and when it's done, call me back on this number" except it's not a number, it's a closure.
Right! Like so:
const { readFile } = require("fs"); readFile("/usr/bin/gcc", () => { console.log("just read /usr/bin/gcc"); }); readFile("/usr/bin/clang", () => { console.log("just read /usr/bin/clang"); });
Exactly! Even though there's only ever one ECMAScript thing happening at once, multiple I/O (input/output) operations can be in-flight, and they can complete whenever, which is why if we read this, we can get:
$ node main.js just read /usr/bin/clang just read /usr/bin/gcc
Right! Even though we asked for /usr/bin/gcc to be read first.
Exactly. So async Rust is the same, right? Except async stuff doesn't run just by itself. There's no built-in runtime that's started implicitly, so we gotta create one and use it:
fn main() { tokio::runtime::Runtime::new().unwrap().block_on(async { tokio::fs::write("/tmp/bye", "bye!\n").await.unwrap(); }) }
And now it does do something:
$ cargo run -q && cat /tmp/bye bye! $ cargo b -q && strace -ff ./target/debug/grr 2>&1 | grep bye [pid 1857097] openat(AT_FDCWD, "/tmp/bye", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = 9 [pid 1857097] write(9, "bye!\n", 5) = 5
And so the Node.js® program you showed earlier was doing something more like this:
use std::time::Duration; fn main() { // create a new async runtime let rt = tokio::runtime::Runtime::new().unwrap(); // spawn a future on that runtime rt.spawn(async { tokio::fs::write("/tmp/bye", "bye!\n").await.unwrap(); }); // wait for all spawned futures for... some amount of time rt.shutdown_timeout(Duration::from_secs(10_000)) }
Except it probably waited for longer than that. But yeah that's the idea.
Okay, so, wait, there's async blocks? Like async { stuff }?
Yes.
And async closures? Like async |a, b, c| { stuff }?
Unfortunately, not yet.
There's async functions, though:
use std::time::Duration; fn main() { // create a new async runtime let rt = tokio::runtime::Runtime::new().unwrap(); // spawn a future on that runtime // 👇 rt.spawn(write_bye()); // wait for all spawned futures for... some amount of time rt.shutdown_timeout(Duration::from_secs(10_000)) } // 👇 async fn write_bye() { tokio::fs::write("/tmp/bye", "bye!\n").await.unwrap(); }
Well it's something.
But wait, so when you call write_bye() it doesn't actually start doing the work?
No, it returns a future, and then you need to either spawn it somewhere, or you need to poll it.
How do, uh... how does one go about polling it?
You don't, the runtime does.
Ah, right. Because of the... no I'm sorry, that's nonsense. The runtime polls it?
Well, you can poll it if you want to, sometimes it'll even work:
use std::{ future::Future, task::{Context, RawWaker, RawWakerVTable, Waker}, }; fn main() { let fut = tokio::fs::read("/etc/hosts"); let mut fut = Box::pin(fut); let rw = RawWaker::new( std::ptr::null_mut(), &RawWakerVTable::new(clone, wake, wake_by_ref, drop), ); let w = unsafe { Waker::from_raw(rw) }; let mut cx = Context::from_waker(&w); let res = fut.as_mut().poll(&mut cx); dbg!(&res); } unsafe fn clone(_ptr: *const ()) -> RawWaker { todo!() } unsafe fn wake(_ptr: *const ()) { todo!() } unsafe fn wake_by_ref(_ptr: *const ()) { todo!() } unsafe fn drop(_ptr: *const ()) { // do nothing }
Heyyyyyyyyyyyy that's a vtable, we saw this!
Yes, that's how Rust async runtimes work under the hood. And as you can see:
$ RUST_BACKTRACE=1 cargo run -q thread 'main' panicked at 'there is no reactor running, must be called from the context of a Tokio 1.x runtime', /home/amos/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.18.2/src/runtime/context.rs:21:19 stack backtrace: 0: rust_begin_unwind at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:584:5 1: core::panicking::panic_fmt at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/panicking.rs:143:14 2: core::panicking::panic_display at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/panicking.rs:72:5 3: tokio::runtime::context::current at /home/amos/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.18.2/src/runtime/context.rs:21:19 4: tokio::runtime::blocking::pool::spawn_blocking at /home/amos/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.18.2/src/runtime/blocking/pool.rs:113:14 5: tokio::fs::asyncify::{{closure}} at /home/amos/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.18.2/src/fs/mod.rs:119:11 6: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/future/mod.rs:91:19 7: tokio::fs::read::read::{{closure}} at /home/amos/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.18.2/src/fs/read.rs:50:42 8: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/future/mod.rs:91:19 9: grr::main at ./src/main.rs:17:15 10: core::ops::function::FnOnce::call_once at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/ops/function.rs:227:5 note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
...okay so this one doesn't work because there's more moving pieces than this.
But you get the idea, futures get polled.
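Polling by hand does work when the future doesn't depend on a runtime. Here's a std-only sketch of both points made so far: calling an async fn does nothing by itself, and the body runs once something polls it. The names set_flag, poll_once and NoopWaker are made up for this example, and the waker is built with the std::task::Wake trait instead of a raw vtable, purely to keep things short.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A waker that does nothing when woken (fine here: we poll by hand).
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical stand-in for real work: flips a flag and returns a number.
async fn set_flag(flag: &AtomicBool) -> i32 {
    flag.store(true, Ordering::SeqCst);
    42
}

/// Polls `fut` exactly once with the no-op waker.
fn poll_once<F: Future>(fut: Pin<&mut F>) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    fut.poll(&mut cx)
}

fn main() {
    let flag = AtomicBool::new(false);

    // Calling the async fn only builds the future...
    let mut fut = Box::pin(set_flag(&flag));
    println!("after call: flag = {}", flag.load(Ordering::SeqCst));

    // ...the body runs when something polls it.
    let res = poll_once(fut.as_mut());
    println!("after poll: flag = {}, res = {res:?}", flag.load(Ordering::SeqCst));
}
```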
I'm not sure I do. I mean okay so they get polled once, via this weird trait:
pub trait Future { type Output; fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output>; }
Yes, which has a weird Pin<&mut Self> receiver instead of say, &mut self, to make self-referential types work.
Self-referential types? Ok now I'm completely lost. WE TRIED, EVERYONE, time to pack up and get outta here.
No no no bear with me
😐
...so think back to closures: they're code + data. A function and its environment. And the code in there can create new references to the data, right?
I.. I guess?
Like this for example:
fn main() { do_stuff(|| { let v = vec![1, 2, 3, 4]; let back_half = &v[2..]; println!("{back_half:?}"); }); } fn do_stuff(f: impl Fn()) { f() }
Ah right, yes. The closure allocates some memory as a Vec, and then it takes an immutable slice of it. I don't see where the issue is, though.
Well think of futures like closures but... that you can call into several times?
I call into several times?
No, the runtime does.
... the confusion, it remains.
No but like, if we look at this:
use std::future::Future; fn main() { do_stuff(async { let arr = [1, 2, 3, 4]; let back_half = &arr[2..]; let hosts = tokio::fs::read("/etc/hosts").await; println!("{back_half:?}, {hosts:?}"); }); } fn do_stuff(f: impl Future<Output = ()>) { // blah }
Yes, same idea but with some async sprinkled in there.
Exactly. So that read("/etc/hosts").await line there, that's an await point.
I can't help but feel like we're getting away from the spirit of the article, but okay, sure?
Focus! So read() returns a Future, and then we call .await, which makes the current/ambient async runtime poll it once.
Sure, I can buy that. And then?
Well and then either it returns Poll::Ready and it synchronously continues execution into the second part of that async block.
Or?
Or it returns Poll::Pending, at which point it'll have already registered itself with all the Waker business I teased earlier on.
Right. And then what happens?
And then it returns.
But... but it can't! If it returns we'll lose the data! The array will go out of scope and be freed!
Exactly.
So surely it's not actually returning?
It is actually returning. But it's also storing the array somewhere else. So that the next time it's polled/called, there it is. And in that "somewhere else", it also remembers which await point caused it to return Poll::Pending.
So this is all just a gigantic state machine?
Yes! And some parts of its state (in this case, back_half) may reference some other parts of its state (in this case, arr), so the state struct itself is... self-referential.
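To make "gigantic state machine" a bit more tangible, here's a hand-written approximation, with big caveats. This is not what the compiler actually generates: the names ReadJob, NoopWaker and run are invented, the file read is faked by a counter, and to stay in safe Rust the state stores the index where back_half starts rather than a real reference into itself (which is exactly the part that makes the real thing need pinning).

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hand-written stand-in for the generated state machine.
enum ReadJob {
    Start,
    // Parked at the one await point, with the locals that must survive it.
    AwaitingRead { arr: [i32; 4], back_half_from: usize, polls_left: u32 },
    Done,
}

impl Future for ReadJob {
    type Output = Vec<i32>;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Vec<i32>> {
        // Fine here: this enum holds no self-references, so it is Unpin.
        let this = self.get_mut();
        loop {
            match this {
                ReadJob::Start => {
                    // Code before the await point: set up the locals.
                    *this = ReadJob::AwaitingRead {
                        arr: [1, 2, 3, 4],
                        back_half_from: 2,
                        polls_left: 1, // pretend the read isn't ready at first
                    };
                }
                ReadJob::AwaitingRead { arr, back_half_from, polls_left } => {
                    if *polls_left > 0 {
                        *polls_left -= 1;
                        cx.waker().wake_by_ref();
                        return Poll::Pending; // we return, the state stays behind
                    }
                    // Code after the await point: the locals are still there.
                    let back_half = arr[*back_half_from..].to_vec();
                    *this = ReadJob::Done;
                    return Poll::Ready(back_half);
                }
                ReadJob::Done => panic!("polled after completion"),
            }
        }
    }
}

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical mini "runtime": polls until ready and counts the polls.
fn run(mut job: ReadJob) -> (Vec<i32>, u32) {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(out) = Pin::new(&mut job).poll(&mut cx) {
            return (out, polls);
        }
    }
}

fn main() {
    let (back_half, polls) = run(ReadJob::Start);
    println!("{back_half:?} after {polls} polls");
}
```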
Here's the async block code again because that's a lot of scrolling:
do_stuff(async { let arr = [1, 2, 3, 4]; let back_half = &arr[2..]; let hosts = tokio::fs::read("/etc/hosts").await; println!("{back_half:?}, {hosts:?}"); });
Self-referential as in it refers to itself, gotcha.
And what's the problem with that?
The problem is, what if you poll that future once, and then it returns Poll::Pending, and then you move it somewhere else in memory?
Then I guess... arr will be moved along with it?
EXACTLY. And back_half will still point at the old location, which is now the wrong place.
Ohhhhhhh so it must be pinned.
Yes. It must be pinned in order to be polled. That's why the receiver of poll is Pin<&mut Self>.
And so we can move the future before it's polled, but after the first time it's been polled, it's game over? Stay pinned?
Unless it implements Unpin, yes.
Which... it would implement only if... it was safe to move elsewhere?
Yes, for example if it only contained references to memory that's on the heap!
But GenFuture, the opaque type of async blocks, never implements Unpin (at least, I couldn't get it to), so this fails to build:
use std::{future::Future, time::Duration}; fn main() { let fut = async move { println!("hang on a sec..."); tokio::time::sleep(Duration::from_secs(1)).await; println!("I'm here!"); }; ensure_unpin(&fut); } fn ensure_unpin<F: Future + Unpin>(f: &F) { // muffin }
$ cargo check -q error[E0277]: `from_generator::GenFuture<[static generator@src/main.rs:4:26: 8:6]>` cannot be unpinned --> src/main.rs:9:18 | 9 | ensure_unpin(&fut); | ------------ ^^^^ within `impl Future<Output = ()>`, the trait `Unpin` is not implemented for `from_generator::GenFuture<[static generator@src/main.rs:4:26: 8:6]>` | | | required by a bound introduced by this call
...but we can always "box-pin" it, moving the whole future to the heap, so that we can move a reference to it wherever we please:
use std::{future::Future, time::Duration}; fn main() { // 👇 let fut = Box::pin(async move { println!("hang on a sec..."); tokio::time::sleep(Duration::from_secs(1)).await; println!("I'm here!"); }); ensure_unpin(&fut); } fn ensure_unpin<F: Future + Unpin>(f: &F) { // muffin }
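To see why box-pinning helps, here's a std-only sketch that polls a future once, moves the box somewhere else, and polls it again. The async block keeps a borrow alive across its await point, so it's the self-referential kind. YieldOnce, NoopWaker and poll_move_poll are made-up names, and YieldOnce is just a toy future that says "not yet" the first time it's polled.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical future: Pending on its first poll, Ready on the next.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            return Poll::Ready(());
        }
        self.0 = true;
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

/// Polls once, moves the box, polls again; returns the second result.
fn poll_move_poll() -> Poll<i32> {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);

    let mut fut = Box::pin(async {
        let arr = [1, 2, 3, 4];
        let back_half = &arr[2..]; // borrow held across the await point
        YieldOnce(false).await;
        back_half.iter().sum::<i32>()
    });

    assert!(fut.as_mut().poll(&mut cx).is_pending());

    // Moving the Pin<Box<_>> only moves a pointer; the future itself
    // (and `arr` inside it) stays at the same heap address.
    let mut parked = vec![fut];
    let res = parked[0].as_mut().poll(&mut cx);
    res
}

fn main() {
    println!("{:?}", poll_move_poll());
}
```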
Okay that... that was a lot to take in.
So async stuff is awful because I need to understand all that, right?
Oh no, not at all.
Huh?
For starters, you don't really want to build a tokio Runtime yourself. There's macros for that.
#[tokio::main] async fn main() { tokio::fs::write("/tmp/bye", "bye!\n").await.unwrap(); }
Ah, that seems more convenient, yes.
And you never really want to care about the Context / Waker / RawWaker stuff either. Those are implementation details.
Right right, yes.
But thus is the terrible deal we've made with the devil compiler. It guards us from numerous perils, but in exchange, we sometimes run head-first into unholy type errors.
I see. So you're saying... I don't need to understand pinning for example?
No! You just need to know that you can Box::pin() your way out of "this thing is not Unpin" diagnostics. Just like you can .clone() your way out of many "this thing doesn't live long enough" diagnostics.
Then WHY in the world did we learn all that.
Well, if you have a vague understanding of the underlying design constraints, it makes it a teensy bit less frustrating when you run into seemingly arbitrary limitations.
Such as?
Ah, friend.
I'm so glad you asked.
Async trait methods
So traits! You know traits. Here's a trait.
pub trait Read {
    fn read(&mut self, buf: &mut [u8]) -> Result<usize>;

    // (other methods omitted)
}
Yeah I know traits. That seems like a reasonable trait. The receiver is &mut self, because... it advances a read head? Also takes a buffer to write its output to, and returns how many bytes were read. Pretty simple stuff.
Wonderful! Now do the same, but make read async.
What, like that?
pub trait AsyncRead { async fn read(&mut self, buf: &mut [u8]) -> Result<usize, Box<dyn std::error::Error>>; }
$ cargo check -q error[E0706]: functions in traits cannot be declared `async` --> src/main.rs:2:5 | 2 | async fn read(&mut self, buf: &mut [u8]) -> Result<usize, Box<dyn std::error::Error>>; | -----^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | | | `async` because of this | = note: `async` trait functions are not currently supported = note: consider using the `async-trait` crate: https://crates.io/crates/async-trait
Well the diagnostic is exemplary but, long story short: compiler says no.
Exactly. Do you know why?
Not really no?
Well, it's complicated. But we can sorta get an intuition for it.
Turns out there already is an AsyncRead trait in tokio (and a couple other places). Let's make an async function that just calls it:
async fn read(r: &mut (dyn AsyncRead + Unpin), buf: &mut [u8]) -> std::io::Result<usize> { r.read(buf).await }
And now let's use it a couple times:
use tokio::{ fs::File, io::{AsyncRead, AsyncReadExt, AsyncWriteExt}, net::TcpStream, }; #[tokio::main] async fn main() { let mut f = File::open("/etc/hosts").await.unwrap(); let mut buf1 = vec![0u8; 128]; read(&mut f, &mut buf1).await.unwrap(); println!("buf1 = {:?}", std::str::from_utf8(&buf1)); let mut s = TcpStream::connect("example.org:80").await.unwrap(); s.write_all("GET http://example.org HTTP/1.1\r\n\r\n".as_bytes()) .await .unwrap(); let mut buf2 = vec![0u8; 128]; read(&mut s, &mut buf2).await.unwrap(); println!("buf2 = {:?}", std::str::from_utf8(&buf2)); } async fn read(r: &mut (dyn AsyncRead + Unpin), buf: &mut [u8]) -> std::io::Result<usize> { r.read(buf).await }
Whoa. WHOA, we're writing real code now?
If you call that real code, sure. Anyway we're doing two asynchronous things: reading from a file, and reading from a TCP socket, cosplaying as the world's worst HTTP client.
$ cargo run -q buf1 = Ok("127.0.0.1\tlocalhost\n127.0.1.1\tamos\n\n# The following lines are desirable for IPv6 capable hosts\n::1 ip6-localhost ip6-loopbac") buf2 = Ok("HTTP/1.1 200 OK\r\nAge: 586436\r\nCache-Control: max-age=604800\r\nContent-Type: text/html; charset=UTF-8\r\nDate: Wed, 01 Jun 2022 17:1")
Now here's my question: what type does read return?
Our read function? I have no idea, why?
Because, it's important.
Well... I suppose we could try assigning one to the other?
Sure, let's do that.
use tokio::{ fs::File, io::{AsyncRead, AsyncReadExt, AsyncWriteExt}, net::TcpStream, }; #[tokio::main] async fn main() { let mut f = File::open("/etc/hosts").await.unwrap(); let mut buf1 = vec![0u8; 128]; let mut s = TcpStream::connect("example.org:80").await.unwrap(); s.write_all("GET http://example.org HTTP/1.1\r\n\r\n".as_bytes()) .await .unwrap(); let mut buf2 = vec![0u8; 128]; #[allow(unused_assignments)] let mut fut1 = read(&mut f, &mut buf1); let fut2 = read(&mut s, &mut buf2); fut1 = fut2; fut1.await.unwrap(); println!("buf2 = {:?}", std::str::from_utf8(&buf2)); } async fn read(r: &mut (dyn AsyncRead + Unpin), buf: &mut [u8]) -> std::io::Result<usize> { r.read(buf).await }
$ cargo run -q buf2 = Ok("HTTP/1.1 200 OK\r\nAge: 387619\r\nCache-Control: max-age=604800\r\nContent-Type: text/html; charset=UTF-8\r\nDate: Wed, 01 Jun 2022 17:2")
Hey, that worked!
Yes indeed. What else can you tell me about those types?
Mhhh their names, sort of?
{
    // in main:
    let fut1 = read(&mut f, &mut buf1);
    let fut2 = read(&mut s, &mut buf2);
    println!("fut1's type is {}", type_name_of_val(&fut1));
    println!("fut2's type is {}", type_name_of_val(&fut2));
}

fn type_name_of_val<T>(_t: &T) -> &'static str {
    std::any::type_name::<T>()
}
$ cargo run -q fut1's type is core::future::from_generator::GenFuture<grr::read::{{closure}}> fut2's type is core::future::from_generator::GenFuture<grr::read::{{closure}}>
Hah! It's closures all the way down. And then I guess their size?
println!("fut1's size is {}", std::mem::size_of_val(&fut1)); println!("fut2's size is {}", std::mem::size_of_val(&fut2));
$ cargo run -q fut1's size is 72 fut2's size is 72
Okay, very well! Now same question with this read function:
async fn read(mut r: (impl AsyncRead + Unpin), buf: &mut [u8]) -> std::io::Result<usize> { r.read(buf).await }
Okay, let's try assigning one to the other...
let mut fut1 = read(&mut f, &mut buf1); let fut2 = read(&mut s, &mut buf2); fut1 = fut2;
$ cargo run -q error[E0308]: mismatched types --> src/main.rs:20:12 | 18 | let mut fut1 = read(&mut f, &mut buf1); | ----------------------- expected due to this value 19 | let fut2 = read(&mut s, &mut buf2); 20 | fut1 = fut2; | ^^^^ expected struct `tokio::fs::File`, found struct `tokio::net::TcpStream` | note: while checking the return type of the `async fn` --> src/main.rs:25:67 | 25 | async fn read(mut r: (impl AsyncRead + Unpin), buf: &mut [u8]) -> std::io::Result<usize> { | ^^^^^^^^^^^^^^^^^^^^^^ checked the `Output` of this `async fn`, expected opaque type note: while checking the return type of the `async fn` --> src/main.rs:25:67 | 25 | async fn read(mut r: (impl AsyncRead + Unpin), buf: &mut [u8]) -> std::io::Result<usize> { | ^^^^^^^^^^^^^^^^^^^^^^ checked the `Output` of this `async fn`, found opaque type = note: expected opaque type `impl Future<Output = Result<usize, std::io::Error>>` (struct `tokio::fs::File`) found opaque type `impl Future<Output = Result<usize, std::io::Error>>` (struct `tokio::net::TcpStream`) = help: consider `await`ing on both `Future`s For more information about this error, try `rustc --explain E0308`. error: could not compile `grr` due to previous error
Huh. HUH. The compiler is not happy AT ALL. It's trying very hard to be helpful, but it's clear it didn't expect anyone to fuck around in that particular manner, much less find out.
Let's try answering the other questions though... the type "name":
let fut1 = read(&mut f, &mut buf1); let fut2 = read(&mut s, &mut buf2); println!("fut1's name is {}", type_name_of_val(&fut1)); println!("fut2's name is {}", type_name_of_val(&fut2));
$ cargo run -q fut1's name is core::future::from_generator::GenFuture<grr::read<&mut tokio::fs::file::File>::{{closure}}> fut2's name is core::future::from_generator::GenFuture<grr::read<&mut tokio::net::tcp::stream::TcpStream>::{{closure}}>
Ooooh interesting. And then their sizes:
let fut1 = read(&mut f, &mut buf1); let fut2 = read(&mut s, &mut buf2); println!("fut1's size is {}", std::mem::size_of_val(&fut1)); println!("fut2's size is {}", std::mem::size_of_val(&fut2));
$ cargo run -q fut1's size is 64 fut2's size is 64
Awwwwwww I was hoping for them to be different, b- wait, WAIT, we're passing &mut f and &mut s each time, that's 8 bytes each, if we pass ownership of the File / TcpStream respectively, then maybe...
let fut1 = read(f, &mut buf1); let fut2 = read(s, &mut buf2); println!("fut1's size is {}", std::mem::size_of_val(&fut1)); println!("fut2's size is {}", std::mem::size_of_val(&fut2));
$ cargo run -q fut1's size is 256 fut2's size is 112
YES! The File is bigger.
Yes it is, for some reason. I can see... a tokio::sync::Mutex in there? Fun!
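The same effect can be reproduced without tokio. Here is a minimal std-only sketch, where `hold` is a made-up stand-in for the generic `read` above: each instantiation of a generic `async fn` gets its own future type, and that future has to be big enough to store the argument it was given.

```rust
use std::mem::size_of_val;

// Hypothetical stand-in for `read`: generic over whatever it is handed.
// The argument is moved into the returned future.
async fn hold<T>(value: T) -> usize {
    size_of_val(&value)
}

fn main() {
    // Nothing is awaited here: we only look at the futures themselves.
    let small = hold(0u8);
    let big = hold([0u8; 1024]);

    println!("small future is {} bytes", size_of_val(&small));
    println!("big future is {} bytes", size_of_val(&big));

    // `small` and `big` have different types, so assigning one to the
    // other would be rejected with E0308, just like in the tokio example.
}
```

The exact sizes are up to the compiler, but the future holding the 1024-byte array cannot be smaller than the array.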
Okay so, is read returning the same type in both cases?
No!
And how would that work in a trait?
Well... we have impl Trait in return position, right? So just like these:
async fn sleepy_times() { tokio::time::sleep(Duration::from_secs(1)).await }
...are actually sugar for these:
fn sleepy_times() -> impl Future<Output = ()> { async { tokio::time::sleep(Duration::from_secs(1)).await } }
Then I guess instead of this:
trait AsyncRead { async fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize>; }
We can have this:
trait AsyncRead { fn read(&mut self, buf: &mut [u8]) -> impl Future<Output = std::io::Result<usize>>; }
You would think so! Except we cannot.
$ cargo run -q error[E0562]: `impl Trait` only allowed in function and inherent method return types, not in trait method return --> src/main.rs:9:43 | 9 | fn read(&mut self, buf: &mut [u8]) -> impl Future<Output = std::io::Result<usize>>; | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ For more information about this error, try `rustc --explain E0562`. error: could not compile `grr` due to previous error
Well THAT'S IT. I'm learning Haskell.
Whoa whoa now's not the time for self-harm. It's just a limitation!
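For free functions, on the other hand, the desugaring holds up fine. A minimal std-only sketch follows; the function names and the tiny `block_on` helper are made up for illustration, and `block_on` only copes with futures that never actually wait.

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// The sugared form...
async fn answer_sugared() -> u32 {
    42
}

// ...and what it roughly stands for: a regular fn returning `impl Future`.
fn answer_desugared() -> impl Future<Output = u32> {
    async { 42 }
}

// Waker that does nothing: enough for futures that are ready on first poll.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// Toy executor (illustration only): polls in a loop until the future is done.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn main() {
    println!("sugared:   {}", block_on(answer_sugared()));
    println!("desugared: {}", block_on(answer_desugared()));
}
```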
On the other hand, we can have that:
trait AsyncRead { type Future: Future<Output = std::io::Result<usize>>; fn read(&mut self, buf: &mut [u8]) -> Self::Future; }
And AsyncRead::Future is an associated type. It's chosen by the implementor of the trait.
I swear to glob, bear, if this is another one of your tricks, I'm..
$ cargo check -q (nothing)
Oh. No, this checks. (Literally)
What's the catch?
Try implementing it!
Alright, well there's... tokio has its own AsyncRead trait... and then an AsyncReadExt extension trait, which actually gives us read, so we can just.. and then we... okay, there it is:
impl AsyncRead for File { type Future = (); fn read(&mut self, buf: &mut [u8]) -> Self::Future { tokio::io::AsyncReadExt::read(self, buf) } }
But umm. What do I put as the Future type...
Hahahahahahahha.
Oh shut up will you. I'm sure the compiler will be able to help:
$ cargo check -q error[E0277]: `()` is not a future --> src/main.rs:17:19 | 17 | type Future = (); | ^^ `()` is not a future | = help: the trait `Future` is not implemented for `()` = note: () must be a future or must implement `IntoFuture` to be awaited note: required by a bound in `AsyncRead::Future` --> src/main.rs:11:18 | 11 | type Future: Future<Output = std::io::Result<usize>>; | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ required by this bound in `AsyncRead::Future` error[E0308]: mismatched types --> src/main.rs:20:9 | 19 | fn read(&mut self, buf: &mut [u8]) -> Self::Future { | ------------ expected `()` because of return type 20 | tokio::io::AsyncReadExt::read(self, buf) | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^- help: consider using a semicolon here: `;` | | | expected `()`, found struct `tokio::io::util::read::Read` | = note: expected unit type `()` found struct `tokio::io::util::read::Read<'_, tokio::fs::File>` Some errors have detailed explanations: E0277, E0308. For more information about an error, try `rustc --explain E0277`. error: could not compile `grr` due to 2 previous errors
See! I just have to...
impl AsyncRead for File { type Future = tokio::io::util::read::Read<'_, tokio::fs::File>; fn read(&mut self, buf: &mut [u8]) -> Self::Future { tokio::io::AsyncReadExt::read(self, buf) } }
$ cargo check -q error[E0603]: module `util` is private --> src/main.rs:17:30 | 17 | type Future = tokio::io::util::read::Read<'_, tokio::fs::File>; | ^^^^ private module | note: the module `util` is defined here --> /home/amos/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.18.2/src/io/mod.rs:256:5 | 256 | pub(crate) mod util; | ^^^^^^^^^^^^^^^^^^^^
😭
Hahahahahha. You simultaneously had the best and the worst luck.
...explain?
Well, because it turns out that AsyncReadExt::read is not an async fn, it's a regular fn that returns a named type that implements Future, so you could technically implement your AsyncRead trait... but that type is unexported, so you can't name it, only the tokio crate can.
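To see that the associated-type approach is fine as long as the future type can be named, here is a std-only sketch. It assumes a made-up `Zeroes` reader whose bytes are always available, so it can return the public `std::future::Ready` type; `block_on` is again a toy helper, not something from tokio.

```rust
use std::future::{ready, Future, Ready};
use std::io;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

trait AsyncRead {
    type Future: Future<Output = io::Result<usize>>;

    fn read(&mut self, buf: &mut [u8]) -> Self::Future;
}

// Hypothetical reader: fills the buffer with zeroes, immediately.
struct Zeroes;

impl AsyncRead for Zeroes {
    // `Ready` is exported by std, so we are allowed to name it here.
    type Future = Ready<io::Result<usize>>;

    fn read(&mut self, buf: &mut [u8]) -> Self::Future {
        buf.fill(0);
        ready(Ok(buf.len()))
    }
}

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// Toy executor (illustration only): fine for futures that never wait.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn main() {
    let mut reader = Zeroes;
    let mut buf = [1u8; 8];
    let n = block_on(reader.read(&mut buf)).unwrap();
    println!("read {} bytes: {:?}", n, buf);
}
```

Note that this only works because the returned future does not borrow from the reader or the buffer.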
Ahhhhhhhhhh. So... how do I get out of this?
Remember the survival rules: you could always Box::pin the future. That way you can name it.
Okay... then the whole thing becomes this:
trait AsyncRead { type Future: Future<Output = std::io::Result<usize>>; fn read(&mut self, buf: &mut [u8]) -> Self::Future; } impl AsyncRead for File { type Future = Pin<Box<dyn Future<Output = std::io::Result<usize>>>>; fn read(&mut self, buf: &mut [u8]) -> Self::Future { Box::pin(tokio::io::AsyncReadExt::read(self, buf)) } }
...which seems like it just.. might..
$ cargo check -q error[E0759]: `buf` has an anonymous lifetime `'_` but it needs to satisfy a `'static` lifetime requirement --> src/main.rs:20:18 | 19 | fn read(&mut self, buf: &mut [u8]) -> Self::Future { | --------- this data with an anonymous lifetime `'_`... 20 | Box::pin(tokio::io::AsyncReadExt::read(self, buf)) | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^---^ | | | ...is used and required to live as long as `'static` here | note: `'static` lifetime requirement introduced by the return type --> src/main.rs:19:43 | 19 | fn read(&mut self, buf: &mut [u8]) -> Self::Future { | ^^^^^^^^^^^^ requirement introduced by this return type 20 | Box::pin(tokio::io::AsyncReadExt::read(self, buf)) | -------------------------------------------------- because of this returned expression For more information about this error, try `rustc --explain E0759`. error: could not compile `grr` due to previous error
Oh COME ON.
Hahahahahahahahahhahah yes. The Self::Future type has to be generic over the lifetime of self...
??? how did we get here. We were learning some basic Rust. It was nice.
Well, Box<dyn Trait> actually has an implicit static bound: it's really Box<dyn Trait + 'static>.
It... okay yes, it must be owned.
And the future you're trying to box isn't owned is it? It's borrowing from self.
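Outside of a trait, that implicit bound can be replaced by spelling out a lifetime. A minimal std-only sketch, with a made-up `count_zeroes` function and a toy `block_on` helper (both assumptions, not anything from tokio):

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// The explicit `+ 'a` replaces the implicit `+ 'static`, which lets the
// boxed future borrow from `buf`.
fn count_zeroes<'a>(buf: &'a [u8]) -> Pin<Box<dyn Future<Output = usize> + 'a>> {
    Box::pin(async move { buf.iter().filter(|b| **b == 0).count() })
}

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// Toy executor (illustration only): fine for futures that never wait.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn main() {
    let data = vec![0u8, 1, 0, 2];
    println!("{} zeroes", block_on(count_zeroes(&data)));
}
```

In a free function the lifetime can go straight into the return type. In the trait, the return type is an associated type with no lifetime parameter, which is where it gets stuck.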
Ahhhh fuckity fuck fuck.
Hey hey hey, no cursing, it's nothing a few nightly features can't fix!
# in rust-toolchain.toml
[toolchain]
channel = "nightly-2022-06-01"
// 👇
#![feature(generic_associated_types)]

use std::{future::Future, pin::Pin};
use tokio::fs::File;

#[tokio::main]
async fn main() {
    let mut f = File::open("/etc/hosts").await.unwrap();
    let mut buf = vec![0u8; 32];
    AsyncRead::read(&mut f, &mut buf).await.unwrap();
    println!("buf = {:?}", std::str::from_utf8(&buf));
}

trait AsyncRead {
    type Future<'a>: Future<Output = std::io::Result<usize>>
    where
        Self: 'a;

    fn read<'a>(&'a mut self, buf: &'a mut [u8]) -> Self::Future<'a>;
}

impl AsyncRead for File {
    type Future<'a> = Pin<Box<dyn Future<Output = std::io::Result<usize>> + 'a>>;

    fn read<'a>(&'a mut self, buf: &'a mut [u8]) -> Self::Future<'a> {
        Box::pin(tokio::io::AsyncReadExt::read(self, buf))
    }
}
Whoa whoa whoa when did we graduate to that level of type fuckery.
Just squint! Or come back to it every couple weeks, whichever works.
$ cargo run -q buf = Ok("127.0.0.1\tlocalhost\n127.0.1.1\tam")
Well it does run, I'll grant you that.
But wait, isn't boxing bad? What if we don't want to move that future to the heap?
Ah, then we need another trick unstable feature. And look, we can even use an async block!
// 👇
#![feature(generic_associated_types)]
// 👇👇
#![feature(type_alias_impl_trait)]

use std::future::Future;
use tokio::fs::File;

#[tokio::main]
async fn main() {
    let mut f = File::open("/etc/hosts").await.unwrap();
    let mut buf = vec![0u8; 32];
    AsyncRead::read(&mut f, &mut buf).await.unwrap();
    println!("buf = {:?}", std::str::from_utf8(&buf));
}

trait AsyncRead {
    type Future<'a>: Future<Output = std::io::Result<usize>>
    where
        Self: 'a;

    fn read<'a>(&'a mut self, buf: &'a mut [u8]) -> Self::Future<'a>;
}

impl AsyncRead for File {
    // 👇
    type Future<'a> = impl Future<Output = std::io::Result<usize>> + 'a;

    fn read<'a>(&'a mut self, buf: &'a mut [u8]) -> Self::Future<'a> {
        // 👇
        async move { tokio::io::AsyncReadExt::read(self, buf).await }
    }
}
Whoaaaaa. It even runs!
It does! And you know the best part?
No?
These are actually slated for stabilization Soon™️.
Wait, so we're learning all that for naught? All that effort???
Eh, look at it this way: if and when those get stabilized, we'll be able to look back at it all and laugh.
Just like today we laugh at the fact that before Rust 1.35 (May 2019), the Fn traits weren't implemented for Box<dyn Fn>.
Or any number of significant milestones. It's been a long road.
I see. And in the meantime?
In the meantime my dear, we live in the now. And in the now, we have to deal with things such as...
The Connect trait from hyper
Ah, hyper! I've heard of it before.
It's an... http library? Does client, server, even http/2, maybe some day http/3.
Yeah I uh... that one needs help still. Call me? I just want to help.
But yes, http stuff.
And it has a Connect trait which is...
pub trait Connect: Sealed + Sized { }
...not very helpful.
No. But if you bothered to read the docs, you'd realize you're not supposed to implement it directly: instead you should implement tower::Service<Uri>.
Oh boy, here we go. How about I don't implement it at all? Huh? How's that.
Sure, you don't need to!
# let's just switch back to stable...
$ rm rust-toolchain.toml
$ cargo add hyper --features full
(cut)
use hyper::Client; #[tokio::main] async fn main() { let client = Client::new(); let uri = "http://example.org".parse().unwrap(); let res = client.get(uri).await.unwrap(); let body = hyper::body::to_bytes(res.into_body()).await.unwrap(); let body = std::str::from_utf8(&body).unwrap(); println!("{}...", &body[..128]); }
$ cargo run -q <!doctype html> <html> <head> <title>Example Domain</title> <meta charset="utf-8" /> <meta http-equiv="Content-type...
Ah, well, that's good. Cause I'm done with gnarly traits. Only simple code from now on.
And you're absolutely entitled to that. So that's for a simple plaintext HTTP request over TCP, but did you know you can do HTTP over other types of sockets?
Unix sockets for instance!
Unix sock... oh like the Docker daemon?
Exactly like the Docker daemon!
$ cargo add hyperlocal
(cut)
$ cargo add serde_json
(cut)
use hyper::{Body, Client}; use hyperlocal::UnixConnector; #[tokio::main] async fn main() { let client = Client::builder().build::<_, Body>(UnixConnector); let uri = hyperlocal::Uri::new("/var/run/docker.sock", "/v1.41/info").into(); let res = client.get(uri).await.unwrap(); let body = hyper::body::to_bytes(res.into_body()).await.unwrap(); let value: serde_json::Value = serde_json::from_slice(&body).unwrap(); println!("operating system: {}", value["OperatingSystem"]); }
$ cargo run -q operating system: "Ubuntu 22.04 LTS"
Whoa wait, serde_json? Are we doing useful stuff again?
Just for a bit.
So, making a request like that involves a bunch of operations, right?
Yeah it does! Let's take a look with strace, since apparently that's fair game in this monstrous article:
$ cargo build -q && strace -ff ./target/debug/grr 2>&1 | grep -vE 'futex|mmap|munmap|madvise|mprotect|sigalt|sigproc|prctl' | grep connect -A 20 [pid 1943976] connect(9, {sa_family=AF_UNIX, sun_path="/var/run/docker.sock"}, 23 <unfinished ...> [pid 1943976] <... connect resumed>) = 0 [pid 1943976] epoll_ctl(5, EPOLL_CTL_ADD, 9, {events=EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, data={u32=1, u64=1}} <unfinished ...> [pid 1943976] <... epoll_ctl resumed>) = 0 [pid 1944006] sched_getaffinity(1944006, 32, <unfinished ...> [pid 1943977] <... epoll_wait resumed>[{events=EPOLLOUT, data={u32=1, u64=1}}], 1024, -1) = 1 [pid 1944006] <... sched_getaffinity resumed>[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29]) = 8 [pid 1943976] getsockopt(9, SOL_SOCKET, SO_ERROR, <unfinished ...> [pid 1943976] <... getsockopt resumed>[0], [4]) = 0 [pid 1943977] epoll_wait(3, <unfinished ...> [pid 1944006] write(9, "GET /v1.41/info HTTP/1.1\r\nhost: "..., 78 <unfinished ...> [pid 1944006] <... write resumed>) = 78 [pid 1943977] <... epoll_wait resumed>[{events=EPOLLOUT, data={u32=1, u64=1}}], 1024, -1) = 1 [pid 1943977] epoll_wait(3, [{events=EPOLLIN|EPOLLOUT, data={u32=1, u64=1}}], 1024, -1) = 1 [pid 1943977] recvfrom(9, "HTTP/1.1 200 OK\r\nApi-Version: 1."..., 8192, 0, NULL, NULL) = 2536 [pid 1943977] epoll_wait(3, <unfinished ...> [pid 1944005] write(4, "\1\0\0\0\0\0\0\0", 8) = 8 [pid 1943977] <... epoll_wait resumed>[{events=EPOLLIN, data={u32=2147483648, u64=2147483648}}], 1024, -1) = 1 [pid 1944005] recvfrom(9, <unfinished ...> [pid 1943977] epoll_wait(3, <unfinished ...> [pid 1944005] <... recvfrom resumed>0x7f6f84000d00, 8192, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable) [pid 1943976] write(4, "\1\0\0\0\0\0\0\0", 8 <unfinished ...>
And those operations are different for TCP sockets and Unix sockets?
I would imagine so, yes.
Well, that work is done respectively by the HttpConnector and UnixConnector structs.
I see. And, wait... waitwaitwait. Connecting to a socket is an asynchronous operation too, right?
I know for TCP it involves sending a SYN, getting back a SYN-ACK, then sending an ACK, that all happens over the network, you probably don't want to block on that, right?
Right!
But Connect is a trait though. I thought you couldn't have async trait methods?
Ah, well, it's time to gaze upon... the tower Service trait.
pub trait Service<Request> { type Response; type Error; type Future: Future<Output = Result<Self::Response, Self::Error>>; fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>>; fn call(&mut self, req: Request) -> Self::Future; }
I see. Three associated types: Response, Error, and Future. And I see... Future is not generic over any lifetime, which means... call can't borrow from self. Ah and it takes ownership of Request!
And then there's poll_ready, which uhh...
That's just for backpressure. It's pretty clever, but not super relevant here.
In fact, if we look at the implementation for hyperlocal::UnixConnector:
// somewhere in hyperlocal's source code
impl Service<Uri> for UnixConnector {
    type Response = UnixStream;
    type Error = std::io::Error;
    type Future = BoxFuture<'static, Result<Self::Response, Self::Error>>;

    fn call(&mut self, req: Uri) -> Self::Future {
        let fut = async move {
            let path = parse_socket_path(req)?;
            UnixStream::connect(path).await
        };

        Box::pin(fut)
    }

    fn poll_ready(&mut self, _cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {
        Poll::Ready(Ok(()))
    }
}
Ah, it's not using that capacity at all, just returning Ready immediately.
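The "can't borrow from self" part is easy to poke at without hyper or tower. The sketch below copies the shape of the Service trait into a local trait and implements it for a made-up `Greeter`; the `block_on` helper is a toy executor. Everything except the trait's shape is an assumption for illustration.

```rust
use std::convert::Infallible;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Local copy of the trait's shape, so the sketch compiles without tower.
trait Service<Request> {
    type Response;
    type Error;
    type Future: Future<Output = Result<Self::Response, Self::Error>>;

    fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>>;
    fn call(&mut self, req: Request) -> Self::Future;
}

// Hypothetical service: greets whoever is named in the request.
struct Greeter {
    greeting: String,
}

impl Service<String> for Greeter {
    type Response = String;
    type Error = Infallible;
    type Future = Pin<Box<dyn Future<Output = Result<String, Infallible>> + Send + 'static>>;

    fn poll_ready(&mut self, _cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {
        // No backpressure here either: always ready.
        Poll::Ready(Ok(()))
    }

    fn call(&mut self, req: String) -> Self::Future {
        // Clone what we need *before* the async block: the returned future
        // is 'static, so it cannot hold on to `&mut self`.
        let greeting = self.greeting.clone();
        Box::pin(async move { Ok(format!("{}, {}!", greeting, req)) })
    }
}

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// Toy executor (illustration only): fine for futures that never wait.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn main() {
    let mut svc = Greeter { greeting: "hello".to_string() };

    // Check readiness first, then call.
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    assert!(svc.poll_ready(&mut cx).is_ready());

    let fut = svc.call("bear".to_string());
    drop(svc); // fine: the future owns everything it needs
    println!("{}", block_on(fut).unwrap());
}
```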
Okay, here comes the exercise. Ready?
Hit me.
How do we make a hyper connector that can connect over both TCP sockets and Unix sockets?
Ah, well. I suppose we better make our own connector type then.
Something like... this?
use std::{future::Future, pin::Pin}; use hyper::{client::HttpConnector, service::Service, Body, Client, Uri}; use hyperlocal::UnixConnector; struct SuperConnector { tcp: HttpConnector, unix: UnixConnector, } impl Default for SuperConnector { fn default() -> Self { Self { tcp: HttpConnector::new(), unix: Default::default(), } } } impl Service<Uri> for SuperConnector { type Response = (); type Error = (); type Future = Pin<Box<dyn Future<Output = Result<Self::Response, Self::Error>>>>; fn poll_ready( &mut self, cx: &mut std::task::Context<'_>, ) -> std::task::Poll<Result<(), Self::Error>> { todo!() } fn call(&mut self, req: Uri) -> Self::Future { todo!() } } #[tokio::main] async fn main() { let client = Client::builder().build::<_, Body>(SuperConnector::default()); let uri = hyperlocal::Uri::new("/var/run/docker.sock", "/v1.41/info").into(); let res = client.get(uri).await.unwrap(); let body = hyper::body::to_bytes(res.into_body()).await.unwrap(); let value: serde_json::Value = serde_json::from_slice(&body).unwrap(); println!("operating system: {}", value["OperatingSystem"]); }
I see, I see. So you haven't decided on a Response / Error type yet, that's fine. And you're boxing the future?
Yeah, it's the easy way out, but that's what the async-trait crate does, so it seems like a safe bet.
Besides, I suppose HttpConnector and UnixConnector return incompatible futures, right? So we'd have the same problem we did before, wayyyy back, with code like that:
fn get_char_or_int(give_char: bool) -> impl Display { if give_char { 'C' } else { 64 } }
...yes actually yes, that was the whole motivation for the article, now that I think of it.
Now that you think? Nuh-huh. You don't think. I write you.
Well... maybe it started out this way, but look at us now. Who will the people remember?
...let's get back to the code shall we.
So anyway my temporary code doesn't even compile:
$ cargo check -q error[E0277]: the trait bound `SuperConnector: Clone` is not satisfied --> src/main.rs:39:53 | 39 | let client = Client::builder().build::<_, Body>(SuperConnector::default()); | ----- ^^^^^^^^^^^^^^^^^^^^^^^^^ the trait `Clone` is not implemented for `SuperConnector` | | | required by a bound introduced by this call | note: required by a bound in `hyper::client::Builder::build` --> /home/amos/.cargo/registry/src/github.com-1ecc6299db9ec823/hyper-0.14.19/src/client/client.rs:1336:22 | 1336 | C: Connect + Clone, | ^^^^^ required by this bound in `hyper::client::Builder::build`
Oh yeah you need it to be Clone. Both connectors you're wrapping are bound to be Clone already, so you can just derive it, probably.
Alrighty then:
#[derive(Clone)] struct SuperConnector { tcp: HttpConnector, unix: UnixConnector, }
Okay... now it complains that () doesn't implement AsyncRead, AsyncWrite, or hyper::client::connect::Connection. Also, our Future type isn't Send + 'static, and it has to be.
That one's an easy fix:
type Future = Pin<Box<dyn Future<Output = Result<Self::Response, Self::Error>> + Send + 'static>>;
There. As for the AsyncRead / AsyncWrite / Connection stuff, well...
Right. That's where it gets awkward.
Oh? Can't we just use boxed trait objects here too?
Well no, because you've got three traits.
So? We've clearly done, for example, Box<dyn T + Send + 'static> before.
Yes, but Send is a marker trait (it doesn't actually have any methods), and 'static is just a lifetime bound, not a trait.
So you mean to tell me that if I did this:
type Response = Pin<Box<dyn AsyncRead + AsyncWrite + Connection>>;
It wouldn't w-
$ cargo check -q error[E0225]: only auto traits can be used as additional traits in a trait object --> src/main.rs:27:45 | 27 | type Response = Pin<Box<dyn AsyncRead + AsyncWrite + Connection>>; | --------- ^^^^^^^^^^ additional non-auto trait | | | first non-auto trait | = help: consider creating a new trait with all of these as supertraits and using that trait here instead: `trait NewTrait: AsyncRead + AsyncWrite + hyper::client::connect::Connection {}` = note: auto-traits like `Send` and `Sync` are traits that have special properties; for more information on them, visit <https://doc.rust-lang.org/reference/special-types-and-traits.html#auto-traits> For more information about this error, try `rustc --explain E0225`. error: could not compile `grr` due to previous error
Oh.
Can you see why?
Well the diagnostic is pretty fantastic here, game recognize game. But also uhh... oh is it a vtable thing?
Yes it is! Trait objects are two pointers: data + vtable. One vtable. Not three.
Ahhh hence the advice to make a new trait instead? Which would create a new super-vtable that contains the vtables for those three traits?
You know what, don't say a thing, I'm trying it.
That's the spir-
NOT A THING.
trait SuperConnection: AsyncRead + AsyncWrite + Connection {}

impl Service<Uri> for SuperConnector {
    type Response = Pin<Box<dyn SuperConnection>>;
    // etc.
}
$ cargo check -q error[E0277]: the trait bound `Pin<Box<(dyn SuperConnection + 'static)>>: hyper::client::connect::Connection` is not satisfied --> src/main.rs:48:53 | 48 | let client = Client::builder().build::<_, Body>(SuperConnector::default()); | ----- ^^^^^^^^^^^^^^^^^^^^^^^^^ the trait `hyper::client::connect::Connection` is not implemented for `Pin<Box<(dyn SuperConnection + 'static)>>` | | | required by a bound introduced by this call | = note: required because of the requirements on the impl of `hyper::client::connect::Connect` for `SuperConnector` note: required by a bound in `hyper::client::Builder::build` --> /home/amos/.cargo/registry/src/github.com-1ecc6299db9ec823/hyper-0.14.19/src/client/client.rs:1336:12 | 1336 | C: Connect + Clone, | ^^^^^^^ required by this bound in `hyper::client::Builder::build`
Wait, what, why.
Well, you're boxing it! T where T: SuperConnection implements Connection, but Box<dyn SuperConnection> might not!
And why do we not have that error with AsyncRead and AsyncWrite?
Because there's blanket impls, see:
// somewhere in tokio's source code
macro_rules! deref_async_read {
    () => {
        fn poll_read(
            mut self: Pin<&mut Self>,
            cx: &mut Context<'_>,
            buf: &mut ReadBuf<'_>,
        ) -> Poll<io::Result<()>> {
            Pin::new(&mut **self).poll_read(cx, buf)
        }
    };
}

impl<T: ?Sized + AsyncRead + Unpin> AsyncRead for Box<T> {
    deref_async_read!();
}

impl<T: ?Sized + AsyncRead + Unpin> AsyncRead for &mut T {
    deref_async_read!();
}
Ah, and there's no blanket impl<T> Connection for Box<T> where T: Connection?
Apparently not!
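The same situation can be reproduced in a few lines of std-only Rust. In this sketch, `Describe`, `Tcp` and `report` are made-up names: a boxed trait object only satisfies a `T: Describe` bound because of the explicit forwarding impl for `Box<T>`.

```rust
// Hypothetical trait standing in for something like `Connection`.
trait Describe {
    fn describe(&self) -> String;
}

struct Tcp;

impl Describe for Tcp {
    fn describe(&self) -> String {
        "tcp".to_string()
    }
}

// Forwarding impl: without it, `Box<dyn Describe>` would not implement
// `Describe`, and the call to `report` in `main` would not compile.
impl<T: Describe + ?Sized> Describe for Box<T> {
    fn describe(&self) -> String {
        // Deref through the box to reach the inner value's method.
        (**self).describe()
    }
}

fn report<T: Describe>(conn: T) -> String {
    format!("connected over {}", conn.describe())
}

fn main() {
    let boxed: Box<dyn Describe> = Box::new(Tcp);
    println!("{}", report(boxed));
}
```

Deleting the forwarding impl makes the compiler complain that the trait bound on the boxed type is not satisfied, which is the same family of error as the one above.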
Okay, let's hope orphan rules don't get in the way...
impl Connection for Pin<Box<dyn SuperConnection>> { fn connected(&self) -> hyper::client::connect::Connected { (**self).connected() } }
...it's not complaining yet, let's keep going.
We need to pick an error type, and fill out our poll_ready and call methods.
Let's fucking goooooooooo:
impl Service<Uri> for SuperConnector { type Response = Pin<Box<dyn SuperConnection>>; type Error = Box<dyn std::error::Error + Send>; type Future = Pin<Box<dyn Future<Output = Result<Self::Response, Self::Error>> + Send + 'static>>; fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> { match (self.tcp.poll_ready(cx), self.unix.poll_ready(cx)) { (Poll::Pending, _) | (_, Poll::Pending) => Poll::Pending, _ => Ok(()).into(), } } fn call(&mut self, req: Uri) -> Self::Future { match req.scheme_str().unwrap_or_default() { "unix" => { let fut = self.unix.call(req); Box::pin(async move { match fut.await { Ok(conn) => Ok(Box::pin(conn)), Err(e) => Err(Box::new(e)), } }) } _ => { let fut = self.tcp.call(req); Box::pin(async move { match fut.await { Ok(conn) => Ok(Box::pin(conn)), Err(e) => Err(Box::new(e)), } }) } } } }
So I'm looking at this in vscode, and it's very red.
I think we may have forgotten something...
Ah yes! The composition trait here:
trait SuperConnection: AsyncRead + AsyncWrite + Connection {}
You're missing half of it. Nothing implements this supertrait right now.
Ohhh because there's types that implement AsyncRead, AsyncWrite and Connection, but they also have to implement SuperConnection itself. The other three are just prerequisites?
They're just supertraits, yeah. Anyway this is the part you're missing:
impl<T> SuperConnection for T where T: AsyncRead + AsyncWrite + Connection {}
Ah, a beautiful blanket impl.
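The supertrait-plus-blanket-impl pattern works the same way with plain std traits. A small sketch, assuming a made-up `ReadWrite` trait and `echo_upper` function, using std::io::Read and std::io::Write in place of the async traits:

```rust
use std::io::{Cursor, Read, Write};

// One trait, one vtable, with both supertraits reachable through it.
trait ReadWrite: Read + Write {}

// Blanket impl: anything that is both Read and Write is also ReadWrite.
impl<T> ReadWrite for T where T: Read + Write {}

// Takes a single trait object and uses both halves of it.
fn echo_upper(stream: &mut dyn ReadWrite) -> std::io::Result<()> {
    let mut input = String::new();
    stream.read_to_string(&mut input)?; // the Read half
    stream.write_all(input.to_uppercase().as_bytes()) // the Write half
}

fn main() {
    // Cursor<Vec<u8>> implements both Read and Write, so the blanket impl
    // applies and `&mut cursor` coerces to `&mut dyn ReadWrite`.
    let mut cursor = Cursor::new(b"hi".to_vec());
    echo_upper(&mut cursor).unwrap();
    println!("{}", String::from_utf8_lossy(&cursor.into_inner()));
}
```

The cursor reads its two bytes, then writes the uppercased copy right after them, so the buffer ends up holding "hiHI".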
Okay, I'm working here, adding bounds left and right, here a Send, here a 'static, but I'm seeing some errors... some pretty bad errors here...
$ cargo check -q error[E0271]: type mismatch resolving `<impl Future<Output = Result<Pin<Box<hyperlocal::client::UnixStream>>, Box<std::io::Error>>> as Future>::Output == Result<Pin<Box<(dyn SuperConnection + Send + 'static)>>, Box<(dyn std::error::Error + Send + 'static)>>` --> src/main.rs:56:17 | 56 | / Box::pin(async move { 57 | | match fut.await { 58 | | Ok(conn) => Ok(Box::pin(conn)), 59 | | Err(e) => Err(Box::new(e)), 60 | | } 61 | | }) | |__________________^ expected trait object `dyn SuperConnection`, found struct `hyperlocal::client::UnixStream` | = note: expected enum `Result<Pin<Box<(dyn SuperConnection + Send + 'static)>>, Box<(dyn std::error::Error + Send + 'static)>>` found enum `Result<Pin<Box<hyperlocal::client::UnixStream>>, Box<std::io::Error>>` = note: required for the cast to the object type `dyn Future<Output = Result<Pin<Box<(dyn SuperConnection + Send + 'static)>>, Box<(dyn std::error::Error + Send + 'static)>>> + Send`
Hahahahahahah yes. YES. Now you're doing it! One of us, one of us, one of u-
Bear, please. I'm crying. How do I get out of this one?
Ah, well, since we can't have type ascription, I guess just annotate harder:
fn call(&mut self, req: Uri) -> Self::Future { match req.scheme_str().unwrap_or_default() { "unix" => { let fut = self.unix.call(req); Box::pin(async move { match fut.await { Ok(conn) => Ok::<Self::Response, _>(Box::pin(conn)), Err(e) => Err::<_, Self::Error>(Box::new(e)), } }) } _ => { let fut = self.tcp.call(req); Box::pin(async move { match fut.await { Ok(conn) => Ok::<Self::Response, _>(Box::pin(conn)), Err(e) => Err::<_, Self::Error>(Box::new(e)), } }) } } }
Oh. Well that's. I've never seen the turbofish in that position. But sure, fine...
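For the record, here's a std-only sketch of that turbofish-on-`Ok`/`Err` trick, assuming nothing from hyper. The `BoxDisplay` and `BoxError` aliases and the `parse_boxed` function are hypothetical; a closure plays the role of the `async move` block, since neither has a return type written out.

```rust
use std::error::Error;
use std::fmt::Display;

// Hypothetical aliases, standing in for `Self::Response` and `Self::Error`.
type BoxDisplay = Box<dyn Display>;
type BoxError = Box<dyn Error + Send + Sync + 'static>;

fn parse_boxed(input: &str) -> Result<BoxDisplay, BoxError> {
    // Like an `async` block, this closure has no declared return type, so
    // each arm spells out the half of the `Result` it knows about.
    let run = || match input.parse::<i32>() {
        // `Ok::<T, _>` pins the success type; `Box<i32>` then coerces to it.
        Ok(n) => Ok::<BoxDisplay, _>(Box::new(n)),
        // `Err::<_, E>` pins the error type; `Box<ParseIntError>` coerces too.
        Err(e) => Err::<_, BoxError>(Box::new(e)),
    };
    run()
}

fn main() {
    match parse_boxed("42") {
        Ok(v) => println!("ok: {v}"),
        Err(e) => println!("error: {e}"),
    }
}
```

The annotations give each `Box::new` call a target type, which is what lets the unsizing coercion to a trait object kick in.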
It still doesn't work, though:
$ cargo check -q error[E0277]: the size for values of type `(dyn std::error::Error + Send + 'static)` cannot be known at compilation time --> src/main.rs:78:53 | 78 | let client = Client::builder().build::<_, Body>(SuperConnector::default()); | ----- ^^^^^^^^^^^^^^^^^^^^^^^^^ doesn't have a size known at compile-time | | | required by a bound introduced by this call | = help: the trait `Sized` is not implemented for `(dyn std::error::Error + Send + 'static)` = note: required because of the requirements on the impl of `std::error::Error` for `Box<(dyn std::error::Error + Send + 'static)>` = note: required because of the requirements on the impl of `From<Box<(dyn std::error::Error + Send + 'static)>>` for `Box<(dyn std::error::Error + Send + Sync + 'static)>` = note: required because of the requirements on the impl of `Into<Box<(dyn std::error::Error + Send + Sync + 'static)>>` for `Box<(dyn std::error::Error + Send + 'static)>` = note: required because of the requirements on the impl of `hyper::client::connect::Connect` for `SuperConnector` note: required by a bound in `hyper::client::Builder::build` --> /home/amos/.cargo/registry/src/github.com-1ecc6299db9ec823/hyper-0.14.19/src/client/client.rs:1336:12 | 1336 | C: Connect + Clone, | ^^^^^^^ required by this bound in `hyper::client::Builder::build`
How do you suggest we get out of this one, professor?
Oh that one is a red herring.
Remember: you don't have to understand why some type bounds are there, you merely have to make it fit.
In this case, the bound is here:
// deep in the bowels of hyper's source code, in a submodule because that's a
// sealed trait:
impl<S, T> Connect for S
where
    S: tower_service::Service<Uri, Response = T> + Send + 'static,
    //       👇
    S::Error: Into<Box<dyn StdError + Send + Sync>>,
    S::Future: Unpin + Send,
    T: AsyncRead + AsyncWrite + Connection + Unpin + Send + 'static,
{
    type _Svc = S;

    fn connect(self, _: Internal, dst: Uri) -> crate::service::Oneshot<S, Uri> {
        crate::service::oneshot(self, dst)
    }
}
Ohhhhhhhhhhh.
See that? Into<Box<dyn Error + Send + Sync>>. You know what implements Into<T>? T!
Ohhh...? I don't get it.
It's okay. What we have right now is Box<dyn Error + Send>. We're just missing the Sync bound.
Ahhhhhhhhhhhhhhh.
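Here's a rough std-only sketch of why that bound is so picky. The `accept` function is a hypothetical stand-in for the `S::Error: Into<...>` bound, and `Oops` is a made-up error type; none of this is hyper's code.

```rust
use std::error::Error;
use std::fmt;

type SyncError = Box<dyn Error + Send + Sync + 'static>;

// A made-up error type, just so we have something to box.
#[derive(Debug)]
struct Oops;

impl fmt::Display for Oops {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "oops")
    }
}

impl Error for Oops {}

// Hypothetical stand-in for the bound on `S::Error`.
fn accept<E>(e: E) -> SyncError
where
    E: Into<SyncError>,
{
    e.into()
}

fn main() {
    // A concrete `Error + Send + Sync` type converts via std's `From` impl.
    println!("{}", accept(Oops));

    // The target type itself converts via the reflexive `impl From<T> for T`.
    let already_boxed: SyncError = Box::new(Oops);
    println!("{}", accept(already_boxed));

    // A `Box<dyn Error + Send>` (no `Sync`) is a different type: it is neither
    // the target type nor an `Error` itself, so this would not compile:
    // let not_sync: Box<dyn Error + Send> = Box::new(Oops);
    // accept(not_sync);
}
```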
type Error = Box<dyn std::error::Error + Send + Sync + 'static>;
IT TYPECHECKS. THIS IS NOT A DRILL.
I love it when you go apeshit at the end of our articles.
Our artic-
But don't you want to golf down that impl a bit more? The implementations for poll_ready and call are pretty gnarly still...
Well sure, but how?
Let's bring in just one... more... crate.
$ cargo add futures (cut)
And a well-placed macro...
use std::{
    pin::Pin,
    task::{Context, Poll},
};

use futures::{future::BoxFuture, FutureExt, TryFutureExt};
use hyper::{
    client::{connect::Connection, HttpConnector},
    service::Service,
    Body, Client, Uri,
};
use hyperlocal::UnixConnector;
use tokio::io::{AsyncRead, AsyncWrite};

#[derive(Clone)]
struct SuperConnector {
    tcp: HttpConnector,
    unix: UnixConnector,
}

impl Default for SuperConnector {
    fn default() -> Self {
        Self {
            tcp: HttpConnector::new(),
            unix: Default::default(),
        }
    }
}

trait SuperConnection: AsyncRead + AsyncWrite + Connection {}
impl<T> SuperConnection for T where T: AsyncRead + AsyncWrite + Connection {}

impl Connection for Pin<Box<dyn SuperConnection + Send + 'static>> {
    fn connected(&self) -> hyper::client::connect::Connected {
        (**self).connected()
    }
}

impl Service<Uri> for SuperConnector {
    type Response = Pin<Box<dyn SuperConnection + Send + 'static>>;
    type Error = Box<dyn std::error::Error + Send + Sync + 'static>;
    // `futures` provides a handy `BoxFuture<'a, T>` alias
    type Future = BoxFuture<'static, Result<Self::Response, Self::Error>>;

    fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {
        // that macro propagates `Poll::Pending`, like `?` propagates `Result::Err`
        futures::ready!(self.tcp.poll_ready(cx))?;
        futures::ready!(self.unix.poll_ready(cx))?;
        Ok(()).into()
    }

    fn call(&mut self, req: Uri) -> Self::Future {
        // keep it DRY (don't repeat yourself) with a macro...
        macro_rules! forward {
            ($target:expr) => {
                $target
                    .call(req)
                    // these are from Future extension traits provided by `futures`
                    // they map `Future->Future`, not `Result->Result`
                    .map_ok(|c| -> Self::Response { Box::pin(c) })
                    // oh yeah by the way, closure syntax accepts `-> T` to annotate
                    // the return type, that's load-bearing here.
                    .map_err(|e| -> Self::Error { Box::new(e) })
                    // also an extension trait: `fut.boxed()` => `Box::pin(fut) as BoxFuture<_>`
                    .boxed()
            };
        }

        // much cleaner:
        match req.scheme_str().unwrap_or_default() {
            "unix" => forward!(self.unix),
            _ => forward!(self.tcp),
        }
    }
}
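If you want to poke at the two tricks in isolation, here's a small sketch that needs no extra crates. It uses `std::task::ready!` (the standard library's counterpart to `futures::ready!`), and the `poll_both` and `box_err` functions are hypothetical, taking already-computed values instead of real connectors.

```rust
use std::io;
use std::task::{ready, Poll};

type BoxError = Box<dyn std::error::Error + Send + Sync + 'static>;

// Hypothetical: combine two readiness results, the way `poll_ready` does above.
fn poll_both(
    tcp: Poll<io::Result<()>>,
    unix: Poll<io::Result<()>>,
) -> Poll<Result<(), BoxError>> {
    // `ready!` returns early on `Poll::Pending`, then `?` returns early on
    // `Err`, converting `io::Error` into `BoxError` along the way.
    ready!(tcp)?;
    ready!(unix)?;
    Poll::Ready(Ok(()))
}

// The closure's `-> BoxError` annotation gives `Box::new(e)` a target type,
// so the `Box<io::Error>` coerces to the boxed trait object.
fn box_err(res: io::Result<u8>) -> Result<u8, BoxError> {
    res.map_err(|e| -> BoxError { Box::new(e) })
}

fn main() {
    println!("{:?}", poll_both(Poll::Ready(Ok(())), Poll::Ready(Ok(()))));
    println!("{:?}", poll_both(Poll::Pending, Poll::Ready(Ok(()))));
    println!("{:?}", box_err(Ok(7)));
}
```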
Well, I guess there's just one thing left to do: actually use it.
#[tokio::main] async fn main() { let client = Client::builder().build::<_, Body>(SuperConnector::default()); { let uri = hyperlocal::Uri::new("/var/run/docker.sock", "/v1.41/info").into(); let res = client.get(uri).await.unwrap(); let body = hyper::body::to_bytes(res.into_body()).await.unwrap(); let value: serde_json::Value = serde_json::from_slice(&body).unwrap(); println!("operating system: {}", value["OperatingSystem"]); } { let uri = "http://example.org".parse().unwrap(); let res = client.get(uri).await.unwrap(); let body = hyper::body::to_bytes(res.into_body()).await.unwrap(); let body = std::str::from_utf8(&body).unwrap(); println!("start of dom: {}", &body[..128]); } }
$ cargo run -q operating system: "Ubuntu 22.04 LTS" start of dom: <!doctype html> <html> <head> <title>Example Domain</title> <meta charset="utf-8" /> <meta http-equiv="Content-type
Wonderful.
Say, bear, did we just accidentally write a book's worth of material about the Rust type system?
It would appear so, yes. But there's one thing we haven't covered yet.
Oh no. No no no I was just asking out of curios-
Higher-ranked trait bounds
FUCK. Someone stop that bear.
Consider the following trait:
trait Transform<'a> { fn apply(&self, slice: &'a mut [u8]); }
I WILL NOT. I WILL NOT CONSIDER THE PRECEDING TRAIT.
Consider how you'd use it:
fn apply_transform<T>(slice: &mut [u8], transform: T) where T: Transform, { transform.apply(slice); }
I NO LONGER CARE, I HAVE MENTALLY CHECKED OUT FROM THIS ARTICLE. YOU CANNOT MAKE ME CARE.
$ cargo check -q error[E0106]: missing lifetime specifier --> src/main.rs:9:8 | 9 | T: Transform, | ^^^^^^^^^ expected named lifetime parameter | help: consider introducing a named lifetime parameter | 7 ~ fn apply_transform<'a, T>(slice: &mut [u8], transform: T) 8 | where 9 ~ T: Transform<'a>, | For more information about this error, try `rustc --explain E0106`. error: could not compile `grr` due to previous error
As you can see,
I CANNOT SEE
...this doesn't compile. The Rust compiler wants us to specify a lifetime. But which should it be?
deep sigh
It should be... generic.
AhAH! Can you show me?
Sssure, here:
fn apply_transform<'a, T>(slice: &mut [u8], transform: T)
where
    //           👇
    T: Transform<'a>,
{
    transform.apply(slice);
}
$ cargo check -q error[E0621]: explicit lifetime required in the type of `slice` --> src/main.rs:11:21 | 7 | fn apply_transform<'a, T>(slice: &mut [u8], transform: T) | --------- help: add explicit lifetime `'a` to the type of `slice`: `&'a mut [u8]` ... 11 | transform.apply(slice); | ^^^^^ lifetime `'a` required For more information about this error, try `rustc --explain E0621`. error: could not compile `grr` due to previous error
Fuck. Hold on.
//                                👇
fn apply_transform<'a, T>(slice: &'a mut [u8], transform: T)
where
    T: Transform<'a>,
{
    transform.apply(slice);
}
$ cargo check -q error[E0309]: the parameter type `T` may not live long enough --> src/main.rs:11:5 | 7 | fn apply_transform<'a, T>(slice: &'a mut [u8], transform: T) | - help: consider adding an explicit lifetime bound...: `T: 'a` ... 11 | transform.apply(slice); | ^^^^^^^^^ ...so that the type `T` is not borrowed for too long error[E0309]: the parameter type `T` may not live long enough
AhhhhhhhhhhhhHHHHH
fn apply_transform<'a, T>(slice: &'a mut [u8], transform: T)
where
    //                 👇
    T: Transform<'a> + 'a,
{
    transform.apply(slice);
}
$ cargo check -q error[E0597]: `transform` does not live long enough --> src/main.rs:11:5 | 7 | fn apply_transform<'a, T>(slice: &'a mut [u8], transform: T) | -- lifetime `'a` defined here ... 11 | transform.apply(slice); | ^^^^^^^^^^^^^^^^^^^^^^ | | | borrowed value does not live long enough | argument requires that `transform` is borrowed for `'a` 12 | } | - `transform` dropped here while still borrowed For more information about this error, try `rustc --explain E0597`. error: could not compile `grr` due to previous error
AHHHHH NOTHING WORKS.
Yes, yes haha, nothing works indeed. Well that's what you get for glossing over lifetimes earlier.
Okay well, what do you suggest?
Well, the problem is that we're conflating the lifetimes of many different things.
Because we have a single lifetime name, 'a, we need all of these to outlive 'a:
- the &mut [u8] slice
- transform itself
- the borrow of transform we need to call apply
It's clearer if we do the auto-ref ourselves:
fn apply_transform<'a, T>(slice: &'a mut [u8], transform: T) where T: Transform<'a> + 'a, { let borrowed_transform = &transform; borrowed_transform.apply(slice); drop(transform); }
The signature of Transform::apply requires self to be borrowed for as long as the slice. And that can't be true, since we need to drop transform before we drop the slice itself.
What do you suggest then? Borrowing transform too?
Sure, that'd work:
fn apply_transform<'a, T>(slice: &'a mut [u8], transform: &'a dyn Transform<'a>) { transform.apply(slice); }
But that's not the problem statement. We can fix the original code, with HRTB: higher-ranked trait bounds.
We don't want Transform<'a> to be implemented by T for a specific lifetime 'a. We want it to be implemented for any lifetime.
And here's the syntax that makes the magic happen:
fn apply_transform<T>(slice: &mut [u8], transform: T) where T: for<'a> Transform<'a>, { transform.apply(slice); }
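To see the bound actually being satisfied, here's a small runnable sketch. The `AddOne` type and the `bumped` helper are invented for the example; the trait and `apply_transform` are the ones from above.

```rust
trait Transform<'a> {
    fn apply(&self, slice: &'a mut [u8]);
}

// Hypothetical implementor: it works for every lifetime `'a`, which is
// exactly what the `for<'a>` bound below asks for.
struct AddOne;

impl<'a> Transform<'a> for AddOne {
    fn apply(&self, slice: &'a mut [u8]) {
        for byte in slice.iter_mut() {
            *byte = byte.wrapping_add(1);
        }
    }
}

fn apply_transform<T>(slice: &mut [u8], transform: T)
where
    // "for any lifetime 'a, T implements Transform<'a>"
    T: for<'a> Transform<'a>,
{
    transform.apply(slice);
}

// Hypothetical helper so the result is easy to inspect.
fn bumped(mut data: Vec<u8>) -> Vec<u8> {
    apply_transform(&mut data, AddOne);
    data
}

fn main() {
    println!("{:?}", bumped(vec![1, 2, 255]));
}
```

Since the caller no longer picks `'a`, the compiler is free to choose a lifetime short enough that `transform` can be borrowed and dropped inside the function.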
Oh, that. That wasn't nearly as scary as I had anticipated. That's it?
Well, also, it's one of those features that you probably don't need as much as you think you do.
Meaning?
Meaning our trait is kinda odd to begin with. There's no reason self and slice should be borrowed for the same lifetime.
If we just get rid of all our lifetime annotations, things work just as well:
trait Transform { fn apply(&self, slice: &mut [u8]); } fn apply_transform_thrice<T>(slice: &mut [u8], transform: T) where T: Transform, { transform.apply(slice); transform.apply(slice); transform.apply(slice); }
Oh.
But surely it's useful in some instances, right?
Why yes! Consider the following:
Oh not agai-
trait Transform<T> { fn apply(&self, target: T); }
Now, Transform is generic over the type T. How do we use it?
Well... just like before, except with one more bound I guess:
fn apply_transform<T>(slice: &mut [u8], transform: T) where T: Transform<&mut [u8]>, { transform.apply(slice); }
Ah yes! Except, no.
$ cargo check -q error[E0637]: `&` without an explicit lifetime name cannot be used here --> src/main.rs:9:18 | 9 | T: Transform<&mut [u8]>, | ^ explicit lifetime name needed here error[E0312]: lifetime of reference outlives lifetime of borrowed content... --> src/main.rs:11:21 | 11 | transform.apply(slice); | ^^^^^ | = note: ...the reference is valid for the static lifetime... note: ...but the borrowed content is only valid for the anonymous lifetime defined here --> src/main.rs:7:30 | 7 | fn apply_transform<T>(slice: &mut [u8], transform: T) | ^^^^^^^^^ Some errors have detailed explanations: E0312, E0637. For more information about an error, try `rustc --explain E0312`. error: could not compile `grr` due to 2 previous errors
Ah, more generics then?
fn apply_transform<'a, T>(slice: &'a mut [u8], transform: T) where T: Transform<&'a mut [u8]>, { transform.apply(slice); }
That does work! Now turn it into apply_transform_thrice again...
fn apply_transform_thrice<'a, T>(slice: &'a mut [u8], transform: T) where T: Transform<&'a mut [u8]>, { transform.apply(slice); transform.apply(slice); transform.apply(slice); }
$ cargo check -q error[E0499]: cannot borrow `*slice` as mutable more than once at a time --> src/main.rs:12:21 | 7 | fn apply_transform_thrice<'a, T>(slice: &'a mut [u8], transform: T) | -- lifetime `'a` defined here ... 11 | transform.apply(slice); | ---------------------- | | | | | first mutable borrow occurs here | argument requires that `*slice` is borrowed for `'a` 12 | transform.apply(slice); | ^^^^^ second mutable borrow occurs here error[E0499]: cannot borrow `*slice` as mutable more than once at a time --> src/main.rs:13:21 | 7 | fn apply_transform_thrice<'a, T>(slice: &'a mut [u8], transform: T) | -- lifetime `'a` defined here ... 11 | transform.apply(slice); | ---------------------- | | | | | first mutable borrow occurs here | argument requires that `*slice` is borrowed for `'a` 12 | transform.apply(slice); 13 | transform.apply(slice); | ^^^^^ second mutable borrow occurs here For more information about this error, try `rustc --explain E0499`. error: could not compile `grr` due to 2 previous errors
Oh hell. You sly bear. That was your plan all along, wasn't it?
Hahahahahahahahha yes. Do you know how to get out of that one?
...yes I do. I suppose it worked when we called it once because... the slice parameter to apply_transform could have the same lifetime as the parameter to Transform::apply. But now we call it three times, so the lifetime of the parameter to Transform::apply has to be smaller.
Three times smaller in fact.
Well that's not how... lifetimes don't really have sizes you can measure, but sure, yeah, that's the gist.
And that's where HRTB (higher-ranked trait bounds) come in, don't they.
fn apply_transform_thrice<T>(slice: &mut [u8], transform: T) where T: for<'a> Transform<&'a mut [u8]>, { transform.apply(slice); transform.apply(slice); transform.apply(slice); }
Ah heck. This typechecks.
I was all out of learning juice and you still managed to sneak one in.
😎😎😎
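For anyone who wants to run that last version, here's a self-contained sketch. The `Double` type and the `doubled_thrice` helper are made up for the occasion; the trait and `apply_transform_thrice` are the ones from the conversation.

```rust
trait Transform<T> {
    fn apply(&self, target: T);
}

// Hypothetical implementor: one impl per lifetime `'a`, all at once.
struct Double;

impl<'a> Transform<&'a mut [u8]> for Double {
    fn apply(&self, target: &'a mut [u8]) {
        for byte in target.iter_mut() {
            *byte = byte.wrapping_mul(2);
        }
    }
}

fn apply_transform_thrice<T>(slice: &mut [u8], transform: T)
where
    // Each call below may pick its own, shorter `'a`, so the three
    // mutable reborrows of `slice` never overlap.
    T: for<'a> Transform<&'a mut [u8]>,
{
    transform.apply(slice);
    transform.apply(slice);
    transform.apply(slice);
}

// Hypothetical helper so the result is easy to inspect.
fn doubled_thrice(mut data: Vec<u8>) -> Vec<u8> {
    apply_transform_thrice(&mut data, Double);
    data
}

fn main() {
    println!("{:?}", doubled_thrice(vec![1, 2, 3]));
}
```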
Afterword
It's me, regular Amos. I know Rust again. I feel like we need some aftercare
debriefing after going through all this. Are you okay? We have juice and cookies
if you want.
Congratulations on reaching the end by the way! I'm guessing you're not using Mobile Safari, or else it would've already crashed.
I don't want any of this to scare you.
Like Bear and I said, it's really just about making the pieces fit. Sometimes the shape of the pieces (the types) prevents you from making GRAVE mistakes (like data races, or accessing the Ok variant of a Result that's actually an Err), sometimes they're there because... that's the best we got.
Most of the time, you're playing with someone else's toy pieces: they've already determined what shapes make sense, and you can let yourself be guided by compiler diagnostics, which are fantastic most of the time, and then rapidly degrade as you delve deeper into async land or try to generally, uh, "get smart".
But you don't have to get smart. Keep in mind the escape hatches. Struggling with lifetimes? Clone it! Can't clone it? Arc<T> it! You can even Arc<Mutex<T>> it, if you really need to mutate it.
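As a reminder of what the last escape hatch looks like in practice, here's a minimal sketch using only the standard library. The `count_in_threads` function is hypothetical; it just has a few threads bump a shared counter.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical example: every worker thread increments one shared counter.
fn count_in_threads(workers: usize) -> usize {
    // `Arc` gives shared ownership, `Mutex` gives exclusive access to mutate.
    let counter = Arc::new(Mutex::new(0usize));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            // Each thread gets its own handle to the same counter.
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                *counter.lock().unwrap() += 1;
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    // Copy the value out so the lock guard is released before returning.
    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("total = {}", count_in_threads(4));
}
```

No lifetimes anywhere in sight: each thread owns its own `Arc`, so the borrow checker has nothing to complain about.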
Need to put a bunch of heterogeneous types together in the same collection? Or return them from a single function? Just box them!
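Both cases look roughly like this. It's a sketch with `Display` as an arbitrary trait, and `describe` and `render_all` are invented names.

```rust
use std::fmt::Display;

// Two different concrete types, one return type: box them.
fn describe(as_number: bool) -> Box<dyn Display> {
    if as_number {
        Box::new(42)
    } else {
        Box::new("forty-two")
    }
}

// Three different concrete types, one collection: box them.
fn render_all() -> Vec<String> {
    let items: Vec<Box<dyn Display>> = vec![
        Box::new(1u8),
        Box::new("two"),
        Box::new(3.5f64),
    ];
    items.iter().map(|item| item.to_string()).collect()
}

fn main() {
    println!("{} / {}", describe(true), describe(false));
    println!("{:?}", render_all());
}
```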
It gets harder with complex traits and associated types, but in this article, we've covered literally the worst case I've ever seen. The other cases are just variations on a theme, with additional bounds, which you can solve one by one.
There's a lot to Rust we haven't covered here — this is by no means a comprehensive book on the language. But my hope is that it serves as sort of a survival guide for anyone who finds themselves stuck with Rust before they appreciate it. I hope you read this in anger, and it gets you out of the hole.
And beyond that, I really hope large parts of this article become completely irrelevant. Laughably so. That we get GATs, type alias impl trait, maybe dyn*, maybe modifier generics?
There's a ton of good stuff in the pipes, some of it has been in the works "seemingly forever", and I'm looking forward to all of it, because that means I'll have to write fewer articles like these.
In the meantime, I'm still having a relatively good time in the Rust async ecosystem. I can live with the extra boilerplate while we find good solutions for all these. Sometimes it's a bit frustrating, but then I spend a couple hours playing with a language that doesn't have a borrow checker, or doesn't have sum types, and I'm all better.
I hope I was able to show, too, that I don't consider Rust the perfect, be-all-end-all programming language. There's still a bunch of situations where, without the requisite years of messing around, you'll be stuck. Because I'm so often the person of reference to help solve these, at work and otherwise, I just thought I'd put a little something together.
Hopefully this article helps a little. And in the meantime, take excellent care of yourselves.