The curse of strong typing
Contents
- Different kinds of numbers
- Conversions and type inference
- Generics and enums
- Implementing traits
- Return position
- Dynamically-sized types
- Storing stuff in structs
- Lifetimes and ownership
- Slices and arrays
- Boxed trait objects
- Reading type signatures
- Closures
- Async stuff
- Async trait methods
- The Connect trait from hyper
- Higher-ranked trait bounds
- Afterword
It happened when I least expected it.
Someone, somewhere (above me, presumably) made a decision. "From now on", they declared, "all our new stuff must be written in Rust".
I'm not sure where they got that idea from. Maybe they've been reading propaganda. Maybe they fell prey to some confident asshole, and convinced themselves that Rust was the answer to their problems.
I don't know what they see in it, to be honest. It's like I always say: it's not a data race, it's a data marathon.
At any rate, I now find myself in a beautiful house, with a beautiful wife, and a lot of compile errors.
Jesus that's a lot of compile errors.
Different kinds of numbers
And it's not like I'm resisting progress! When someone made the case for using tau instead of pi, I was the first to hop on the bandwagon.
But Rust won't even let me do that:

    fn main() {
        // only nerds need more digits
        println!("tau = {}", 2 * 3.14159265);
    }

$ cargo run --quiet error[E0277]: cannot multiply `{integer}` by `{float}` --> src/main.rs:3:28 | 3 | println!("tau = {}", 2 * 3.14159265); | ^ no implementation for `{integer} * {float}` | = help: the trait `Mul<{float}>` is not implemented for `{integer}` For more information about this error, try `rustc --explain E0277`. error: could not compile `grr` due to previous error
When it clearly works in ECMAScript for example:

    // in `main.js`
    // TODO: ask for budget increase so we can afford more digits
    console.log(`tau = ${2 * 3.14159265}`);

$ node main.js tau = 6.2831853
Luckily, a colleague rushes in to help me.
Well those... those are different types.
Types? Never heard of them.
You've seen the title of this post right? Strong typing?
Fine, I'll look it up. It says here that:
"Strong typing" generally refers to use of programming language types in order to both capture invariants of the code, and ensure its correctness, and definitely exclude certain classes of programming errors. Thus there are many "strong typing" disciplines used to achieve these goals.
Okay. What's incorrect about my code?
Oh, nothing! Nothing at all. These are just different types.
So it's just getting in the way right now yes, correct?
Well... sort of? But it's not like your program is running on an imaginary machine. There's a real difference between an "integer" and a "floating point number".
A floa-
Look at this for example:
package main import "fmt" func main() { a := 200000000 for i := 0; i < 10; i++ { a *= 10 fmt.Printf("a = %v\n", a) } }
$ go run main.go a = 2000000000 a = 20000000000 a = 200000000000 a = 2000000000000 a = 20000000000000 a = 200000000000000 a = 2000000000000000 a = 20000000000000000 a = 200000000000000000 a = 2000000000000000000
Yeah, that makes perfect sense! What's your point?
Well, if we keep going a little more...

    package main

    import "fmt"

    func main() {
        a := 200000000
        for i := 0; i < 15; i++ { // 15 iterations this time, up from 10
            a *= 10
            fmt.Printf("a = %v\n", a)
        }
    }

$ go run main.go a = 2000000000 a = 20000000000 a = 200000000000 a = 2000000000000 a = 20000000000000 a = 200000000000000 a = 2000000000000000 a = 20000000000000000 a = 200000000000000000 a = 2000000000000000000 a = 1553255926290448384 a = -2914184810805067776 a = 7751640039368425472 a = 3729424098846048256 a = 400752841041379328
Oh. Oh no.
That's an overflow. We used a 64-bit integer variable, and to represent 20000000000000000000, we'd need 64.12 bits, which... that's more than we have.
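The same loop can be sketched in Rust. This is only an illustration, assuming a signed 64-bit integer (what Go's int is on a 64-bit machine); the function name multiply_until_overflow is made up for this example. The checked_mul method returns None instead of silently wrapping around, and wrapping_mul reproduces the wrapped value:

```rust
/// Multiplies `start` by 10 up to `steps` times, stopping at the first
/// multiplication whose result would not fit in an i64.
/// Returns the last value that fit and the number of successful steps.
fn multiply_until_overflow(start: i64, steps: u32) -> (i64, u32) {
    let mut a = start;
    for done in 0..steps {
        match a.checked_mul(10) {
            Some(next) => a = next,
            // The product is larger than i64::MAX: stop here.
            None => return (a, done),
        }
    }
    (a, steps)
}

fn main() {
    let (last, done) = multiply_until_overflow(200_000_000, 15);
    println!("stopped after {done} steps, last value that fit: {last}");

    // wrapping_mul keeps only the low 64 bits of the product.
    println!("wrapped: {}", last.wrapping_mul(10));
}
```

The wrapped value is the same one the Go program printed right after 2000000000000000000.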
Okay, but again this works in ECMAScript for example:
let a = 200000000; for (let i = 0; i < 15; i++) { a *= 10; console.log(`a = ${a}`); }
$ node main.js a = 2000000000 a = 20000000000 a = 200000000000 a = 2000000000000 a = 20000000000000 a = 200000000000000 a = 2000000000000000 a = 20000000000000000 a = 200000000000000000 a = 2000000000000000000 a = 20000000000000000000 a = 200000000000000000000 a = 2e+21 a = 2e+22 a = 2e+23
Sure, it's using nerd notation, but if we just go back, we can see it's working:
let a = 200000000; for (let i = 0; i < 15; i++) { a *= 10; console.log(`a = ${a}`); } console.log("turn back!"); for (let i = 0; i < 15; i++) { a /= 10; console.log(`a = ${a}`); }
$ node main.js a = 2000000000 a = 20000000000 a = 200000000000 a = 2000000000000 a = 20000000000000 a = 200000000000000 a = 2000000000000000 a = 20000000000000000 a = 200000000000000000 a = 2000000000000000000 a = 20000000000000000000 a = 200000000000000000000 a = 2e+21 a = 2e+22 a = 2e+23 turn back! a = 2e+22 a = 2e+21 a = 200000000000000000000 a = 20000000000000000000 a = 2000000000000000000 a = 200000000000000000 a = 20000000000000000 a = 2000000000000000 a = 200000000000000 a = 20000000000000 a = 2000000000000 a = 200000000000 a = 20000000000 a = 2000000000 a = 200000000
Mhh, looks like döner kebab.
Okay, but those are floating point numbers.
They don't look very floating to me.
Consider this:
let a = 0.1; let b = 0.2; let sum = a + b; console.log(sum);
$ node main.js 0.30000000000000004
Ah, that... that does float.
Yeah, and that's the trade-off. You get to represent numbers that aren't whole numbers, and also /very large/ numbers, at the expense of some precision.
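A small Rust sketch of that trade-off, assuming f64 (the same 64-bit IEEE 754 format ECMAScript numbers use). Past 2^53, an f64 can no longer represent every whole number, so adding 1 can leave the value unchanged:

```rust
fn main() {
    // 0.1 and 0.2 have no exact binary representation, so the sum is
    // slightly off from the literal 0.3.
    let sum = 0.1_f64 + 0.2_f64;
    println!("0.1 + 0.2 == 0.3? {}", sum == 0.3);

    // Comparing with a tolerance is a common workaround.
    println!("close enough? {}", (sum - 0.3).abs() < 1e-9);

    // An f64 has 53 bits of precision: above 2^53, not every whole
    // number is representable anymore.
    let big = 2_f64.powi(53);
    println!("2^53 + 1 == 2^53? {}", big + 1.0 == big);
}
```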
I see.
For example, with floats, you can compute two thirds:
fn main() { println!("two thirds = {}", 2.0 / 3.0); }
$ cargo run --quiet two thirds = 0.6666666666666666
But with integers, you can't:
fn main() { println!("two thirds = {}", 2 / 3); }
$ cargo run --quiet two thirds = 0
Wait, but I don't see any actual types here. Just values.
Yeah, it's all inferred!
I uh. Okay I'm still confused. See, in ECMAScript, a number's a number:
console.log(typeof 36); console.log(typeof 42.28);
$ node main.js number number
Unless it's a big number!
console.log(typeof 36); console.log(typeof 42.28); console.log(typeof 248672936507863405786027355423684n);
$ node main.js number number bigint
Ahhh. So ECMAScript does have integers.
Only big ones. Well, they can be smol if you want them to. Operations just... are more expensive on them.
What about Python? Does Python have integers?
$ python3 -q >>> type(38) <class 'int'> >>> type(38.139582735) <class 'float'> >>>
Mh, yeah, it does!
Try computing two thirds with it!
$ python3 -q >>> 2/3 0.6666666666666666 >>> type(2) <class 'int'> >>> type(2/3) <class 'float'> >>>
Hey that works! So the / operator in Python takes two int values and gives a float.
Not two int values. Two numbers. Could be anything.
$ python3 -q >>> 2.8 / 1.4 2.0 >>>
What if I want to do integer division?
There's an operator for that!
$ python3 -q >>> 10 // 3 3 >>>
Similarly, for addition you have ++...

    $ python3 -q
    >>> 2 + 3
    5
    >>> 2 ++ 3
    5
    >>>

And so on...

    >>> 8 - 3
    5
    >>> 8 -- 3
    11

Wait, no, I th-

    >>> 8 * 3
    24
    >>> 8 ** 3
    512

Woops, my bad: I guess it's just //. a ++ b really is a + (+b), a -- b is a - (-b), and a ** b is a to the bth power.
Okay so Python values have types, you just can't see them unless you ask.
Can I see the types of Rust values too?
Kinda! You can do this:
fn main() { dbg!(type_name_of(2)); dbg!(type_name_of(268.2111)); } fn type_name_of<T>(_: T) -> &'static str { std::any::type_name::<T>() }
$ cargo run --quiet [src/main.rs:2] type_name_of(2) = "i32" [src/main.rs:3] type_name_of(268.2111) = "f64"
Okay. And so in Rust, a value like 42 defaults to i32 (signed 32-bit integer), and a value like 3.14 defaults to f64.
How do I make other number types? Surely there are others.
For literals, you can use suffixes:
$ cargo run --quiet [src/main.rs:2] type_name_of(1_u8) = "u8" [src/main.rs:3] type_name_of(1_u16) = "u16" [src/main.rs:4] type_name_of(1_u32) = "u32" [src/main.rs:5] type_name_of(1_u64) = "u64" [src/main.rs:6] type_name_of(1_u128) = "u128" [src/main.rs:8] type_name_of(1_i8) = "i8" [src/main.rs:9] type_name_of(1_i16) = "i16" [src/main.rs:10] type_name_of(1_i32) = "i32" [src/main.rs:11] type_name_of(1_i64) = "i64" [src/main.rs:12] type_name_of(1_i128) = "i128" [src/main.rs:14] type_name_of(1_f32) = "f32" [src/main.rs:15] type_name_of(1_f64) = "f64"
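The listing behind that output isn't shown here. A shortened sketch along the same lines, reusing the type_name_of helper from earlier, could look like this (the line numbers dbg! prints depend on how the file is laid out):

```rust
fn type_name_of<T>(_: T) -> &'static str {
    std::any::type_name::<T>()
}

fn main() {
    // A suffix pins the literal to one concrete type.
    dbg!(type_name_of(1_u8));
    dbg!(type_name_of(1_i64));
    dbg!(type_name_of(1_f32));

    // Without a suffix (and without other hints), the defaults apply.
    dbg!(type_name_of(1));
    dbg!(type_name_of(1.0));
}
```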
No f128?
Not builtin, no. For now.
Okay, so the reason my original code here didn't work:

    fn main() {
        // only nerds need more digits
        println!("tau = {}", 2 * 3.14159265);
    }

...is that the 2 on the left is an integer, and the 3.14159265 is a floating point number, and so I have to do this:

    println!("tau = {}", 2.0 * 3.14159265);

Or this:

    println!("tau = {}", 2f64 * 3.14159265);

Or this, to be more readable, since apparently you can stuff _ anywhere in number literals:

    println!("tau = {}", 2_f64 * 3.14159265);

In ECMAScript, you have 64-bit floats (number), and bigints. Operations on bigints are significantly more expensive than operations on floats.
In Python, you have floats, and integers. Python 3 handles bigints seamlessly: doing arithmetic on small integer values is still "cheap".
In languages like Rust, you have integers and floats, but you need to pick a bit width. Number literals will default to i32 and f64, unless you add a suffix or... some other conditions described in the next section.
Conversions and type inference
Okay, I think I get it.
So whereas Python has an "integer" and a "float" type, Rust has different widths of integer types, like C and other systems languages.
So this doesn't work:
fn main() { let val = 280_u32; takes_u32(val); takes_u64(val); } fn takes_u32(val: u32) { dbg!(val); } fn takes_u64(val: u64) { dbg!(val); }
$ cargo run --quiet error[E0308]: mismatched types --> src/main.rs:4:15 | 4 | takes_u64(val); | ^^^ expected `u64`, found `u32` | help: you can convert a `u32` to a `u64` | 4 | takes_u64(val.into()); | +++++++ For more information about this error, try `rustc --explain E0308`. error: could not compile `grr` due to previous error
And the compiler gives me a suggestion, but according to the heading of the section, as should work, too:
takes_u64(val as u64);
$ cargo run --quiet [src/main.rs:8] val = 280 [src/main.rs:12] val = 280
Yeah! And you see the definition of takes_u64? It has val: u64.
Yeah I see, I wrote it!
So that means the compiler knows that the argument to takes_u64 must be a u64, right?
Yeah?
So it should be able to infer it!
Yeah, this does work:
takes_u64(230984423857928735);
Exactly! Whereas before, it defaulted the type of the literal to i32, this time it knows it should be a u64 in the end, so it turns the kind of squishy {integer} type into the very concrete u64 type.
Neat.
But it doesn't stop there: in a bunch of places in Rust, when you want to ask the compiler to "just figure it out", you can substitute _.
No... so you mean?

    fn main() {
        let val = 280_u32;
        takes_u32(val);
        takes_u64(val as _); // let the compiler infer the target type
    }

    // etc.

$ cargo run --quiet [src/main.rs:8] val = 280 [src/main.rs:12] val = 280
Neat!
Let's try .into() too, since that's what the compiler suggested:

    fn main() {
        let val = 280_u32;
        takes_u32(val);
        takes_u64(val.into());
    }

    // etc.

That works too!
Oooh, ooh, try it the other way around!
Like this?

    fn main() {
        let val = 280_u64; // now a u64
        takes_u64(val); // passed as-is
        takes_u32(val.into()); // conversion goes the other way
    }

$ cargo run --quiet error[E0277]: the trait bound `u32: From<u64>` is not satisfied --> src/main.rs:4:19 | 4 | takes_u32(val.into()); | ^^^^ the trait `From<u64>` is not implemented for `u32` | = help: the following implementations were found: <u32 as From<Ipv4Addr>> <u32 as From<NonZeroU32>> <u32 as From<bool>> <u32 as From<char>> and 71 others = note: required because of the requirements on the impl of `Into<u32>` for `u64` For more information about this error, try `rustc --explain E0277`. error: could not compile `grr` due to previous error
Oh, it's not happy at all. It does helpfully suggest we could use an IPv4 address instead, which...
I know someone who'll think this diagnostic could use a little tune-up...
No no, we can try it, we got time:
use std::{net::Ipv4Addr, str::FromStr}; fn main() { takes_u32(Ipv4Addr::from_str("127.0.0.1").unwrap().into()); } fn takes_u32(val: u32) { dbg!(val); }
$ cargo run --quiet [src/main.rs:8] val = 2130706433
...yes, okay.
Just like an IPv6 address can be a u128, if it believes:
use std::{net::Ipv6Addr, str::FromStr}; fn main() { takes_u128(Ipv6Addr::from_str("ff::d1:e3").unwrap().into()); } fn takes_u128(val: u128) { dbg!(val); }
$ cargo run --quiet [src/main.rs:8] val = 1324035698926381045275276563964821731
But apparently a u64 can't be a u32?
Well... that's because not all values of type u64 fit into a u32.
Oh!
...that's why there's no impl From<u64> for u32...
Ah.
...but there is an impl TryFrom<u64> for u32.
Ah?
Because some u64 fit in a u32.
So err... we used .into() earlier... which we could do because... From?
And so because now we have TryFrom... .try_into()?
Yes! Because of this blanket impl and that blanket impl, respectively.
I have a feeling we'll come back to these later... but for now, let's give it a shot:
fn main() { let val: u64 = 48_000; takes_u32(val.try_into().unwrap()); } fn takes_u32(val: u32) { dbg!(val); }
This compiles, and runs.
As for this:
fn main() { let val: u64 = 25038759283948; takes_u32(val.try_into().unwrap()); }
It compiles, but does not run!
$ cargo run --quiet thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: TryFromIntError(())', src/main.rs:3:30 note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Makes sense so far.
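Unwrapping is only one way to deal with the Result that a fallible conversion returns. A minimal sketch of two others; the helper names to_u32_or_max and describe are made up for this example:

```rust
use std::convert::TryFrom;

/// Converts to u32, clamping to u32::MAX when the value does not fit.
fn to_u32_or_max(val: u64) -> u32 {
    u32::try_from(val).unwrap_or(u32::MAX)
}

/// Reports whether the conversion worked, without panicking.
fn describe(val: u64) -> String {
    match u32::try_from(val) {
        Ok(small) => format!("{val} fits in a u32: {small}"),
        Err(err) => format!("{val} does not fit in a u32: {err}"),
    }
}

fn main() {
    println!("{}", describe(48_000));
    println!("{}", describe(25038759283948));
    println!("{}", to_u32_or_max(25038759283948));
}
```

Whether clamping is acceptable depends entirely on what the number means; the point is only that the caller gets to decide.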
And that's... that's all of it right?
Not quite! You can parse stuff.
Ah, like we just did with Ipv4Addr::from_str right?
Yes! But just like T::from(val) has val.into(), T::from_str(val) has val.parse().
Fantastic! Let's give it a go:
fn main() { let val = "1234".parse(); dbg!(val); }
$ cargo run --quiet error[E0284]: type annotations needed for `Result<F, _>` --> src/main.rs:2:22 | 2 | let val = "1234".parse(); | --- ^^^^^ cannot infer type for type parameter `F` declared on the associated function `parse` | | | consider giving `val` the explicit type `Result<F, _>`, where the type parameter `F` is specified | = note: cannot satisfy `<_ as FromStr>::Err == _` For more information about this error, try `rustc --explain E0284`. error: could not compile `grr` due to previous error
Oh it's... unhappy? Again?
Consider this: what do you want to parse to?
A number, clearly! The string is 1234.
See, ECMAScript gets it right:
let a = "1234"; console.log({ a }); let b = parseInt(a, 10); console.log({ b });
$ node main.js { a: '1234' } { b: 1234 }
Nnnnonono, you said parseInt, not just parse.
Okay fine, let's not say parse at all then:
let a = "1234"; console.log({ a }); let b = +a; console.log({ b });
$ node main.js { a: '1234' } { b: 1234 }
Okay but the unary plus operator here coerces a string to a number, and in that case the only sensible thing to do is...
Nah nah nah, that's too easy. I think you're just looking for excuses. The truth is, ECMAScript is production-ready in a way that Rust isn't, and never will be.
Those fools at work have it coming. Soon they'll realize! They've been had. They've been swindled. They've developed a taste for snake o-
JUST ADD : u64 AFTER let val WILL YOU
fn main() { let val: u64 = "2930482035982309".parse().unwrap(); dbg!(val); }
$ cargo run --quiet [src/main.rs:3] val = 2930482035982309
Oh.
Yeah that tracks. And I suppose, since we have to care about bit widths here, that if I change it to u32...
fn main() { let val: u32 = "2930482035982309".parse().unwrap(); dbg!(val); }
$ cargo run --quiet thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: ParseIntError { kind: PosOverflow }', src/main.rs:2:47 note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
It errors out, because that doesn't fit in a u32. I see.
YES. NOW TRY CASTING THAT VALUE AS A u64 TO A u32.
Cool down, bear! I'll try, I'll try:
fn main() { let a = 2930482035982309_u64; println!("a = {a} (u64)"); let b = a as u32; println!("b = {b} (u32)"); }
$ cargo run --quiet a = 2930482035982309 (u64) b = 80117733 (u32)
Oh. It's... it's not crashing, just... doing the wrong thing?
YES THAT WAS MY POINT THANK YOU
Yeesh okay how about you take a minute there, bear. So I agree that number shouldn't fit in a u32, so it's doing... something with it.
Maybe if we print it as hex:
fn main() { let a = 2930482035982309_u64; println!("a = {a:016x} (u64)"); let b = a as u32; println!("b = {b:016x} (u32)"); }

    $ cargo run --quiet
    a = 000a694204c67fe5 (u64)
    b = 0000000004c67fe5 (u32)

Oh yeah okay! It's truncating it!
It's even clearer in binary:
fn main() { let a = 2930482035982309_u64; println!("a = {a:064b} (u64)"); let b = a as u32; println!("b = {b:064b} (u32)"); }

    $ cargo run --quiet
    a = 0000000000001010011010010100001000000100110001100111111111100101 (u64)
    b = 0000000000000000000000000000000000000100110001100111111111100101 (u32)

YES THAT'S THE PROBLEM WITH as. YOU CAN TRUNCATE VALUES WHEN YOU DIDN'T INTEND TO.
Ah. But it's shorter and super convenient still, right?
I GUESS!
Gotcha.
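When truncation is what you want, it can help to say so explicitly; when it isn't, a checked conversion turns the surprise into an error. A minimal sketch, with low_32_bits as a made-up helper name:

```rust
use std::convert::TryFrom;

/// Keeps only the low 32 bits, on purpose. The mask documents the intent;
/// the `as` cast after it can no longer lose anything.
fn low_32_bits(val: u64) -> u32 {
    (val & 0xFFFF_FFFF) as u32
}

fn main() {
    let a = 2930482035982309_u64;

    // Same result as `a as u32`, but the truncation is spelled out.
    println!("low bits: {:#010x}", low_32_bits(a));

    // A checked conversion refuses instead of truncating.
    println!("fits in a u32? {}", u32::try_from(a).is_ok());

    // Widening never loses information, so From/Into is available.
    println!("widened: {}", u64::from(low_32_bits(a)));
}
```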
Generics and enums
Wait wait wait, we haven't even talked about strings yet. Are you sure about that heading?
Hell yeah! Generics are baby stuff: you just slap a couple angle brackets, or "chevrons" if you want to be fancy, and boom, Bob's your uncle!
Ew.
Not that Bob.
See, this for example:
fn show<T>(a: T) { todo!() }
Now we can call it with a value a of type T, for any T!
fn main() { show(42); show("blah"); }
Okay yeah but you haven't implemented it yet!
True true, it panics right now:
$ cargo run --quiet thread 'main' panicked at 'not yet implemented', src/main.rs:7:5 note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
But we could... I don't know, we could display it!
fn main() { show(42); show("blah"); } fn show<T>(a: T) { println!("a = {}", a); }
$ cargo run --quiet error[E0277]: `T` doesn't implement `std::fmt::Display` --> src/main.rs:7:24 | 7 | println!("a = {}", a); | ^ `T` cannot be formatted with the default formatter | = note: in format strings you may be able to use `{:?}` (or {:#?} for pretty-print) instead = note: this error originates in the macro `$crate::format_args_nl` (in Nightly builds, run with -Z macro-backtrace for more info) help: consider restricting type parameter `T` | 6 | fn show<T: std::fmt::Display>(a: T) { | +++++++++++++++++++ For more information about this error, try `rustc --explain E0277`. error: could not compile `grr` due to previous error
Mhhhhhh. Does not implement Display.
Okay maybe {:?} instead of {} then?
fn show<T>(a: T) { println!("a = {:?}", a); }
$ cargo run --quiet error[E0277]: `T` doesn't implement `Debug` --> src/main.rs:7:26 | 7 | println!("a = {:?}", a); | ^ `T` cannot be formatted using `{:?}` because it doesn't implement `Debug` | = note: this error originates in the macro `$crate::format_args_nl` (in Nightly builds, run with -Z macro-backtrace for more info) help: consider restricting type parameter `T` | 6 | fn show<T: std::fmt::Debug>(a: T) { | +++++++++++++++++ For more information about this error, try `rustc --explain E0277`. error: could not compile `grr` due to previous error
Oh now it doesn't implement Debug.
Well. Okay! Maybe show can't do anything useful with its argument, but at least you can pass any type to it.
And, because T is a type like any other...
A "type parameter", technically, but who's keeping track.
...you can use it several times, probably!
fn main() { show(5, 7); show("blah", "bleh"); } fn show<T>(a: T, b: T) { todo!() }
Yeah, see, that works!
And if we do this:
fn main() { show(42, "aha") } fn show<T>(a: T, b: T) { todo!() }
It... oh.
$ cargo run --quiet error[E0308]: mismatched types --> src/main.rs:2:14 | 2 | show(42, "aha") | ^^^^^ expected integer, found `&str` For more information about this error, try `rustc --explain E0308`. error: could not compile `grr` due to previous error
Well that's interesting. I guess they have to match? So like it's using the first argument, 42, to infer T, and then the second one has to match, alright.
Yeah, and you'll notice it says "expected integer", not "expected i32".
So that means this would work:
show(42, 256_u64)
And it does!
And if we want two genuinely different types, I guess we have to... use two dif-
Use two different type parameters, yes.
fn main() { show(4, "hi") } fn show<A, B>(a: A, b: B) { todo!() }
That works! Alright.
Well we don't know how to do anything useful with these values yet, but-
Yes, that's what you get for trying to skip ahead.
How about a nice enum instead?
Something like this?
fn main() { show(Answer::Maybe) } enum Answer { Yes, No, Maybe, } fn show(answer: Answer) { let s = match answer { Answer::Yes => "yes", Answer::No => "no", Answer::Maybe => "maybe", }; println!("the answer is {s}"); }
$ cargo run --quiet the answer is maybe
I mean, yeah sure. That's a good starting point.
And maybe you want me to learn about this, too?
fn is_yes(answer: Answer) -> bool { if let Answer::Yes = answer { true } else { false } }
Sure, but mostly I w-
Or better still, this?
fn is_yes(answer: Answer) -> bool { matches!(answer, Answer::Yes) }
No, more like this:
fn main() { show(Either::Character('C')); show(Either::Number(64)); } enum Either { Number(i64), Character(char), } fn show(either: Either) { match either { Either::Number(n) => println!("{n}"), Either::Character(c) => println!("{c}"), } }
$ cargo run --quiet C 64
Oh, yeah, that's pretty good. So like enum variants that... hold some data?
Yes!
And you can do pattern matching to know which variant it is, and to access what's inside.
And I suppose it's safe too, as in it won't let you accidentally access the wrong variant?
Yes, yes of course. These are no C unions. They're tagged unions. Or choice types. Or sum types. Or coproducts.
Let's just stick with "enums".
But that's great news: I can finally take functions that can handle multiple types, even without understanding generics!
And I suppose... conversions could help there too? Like what if I could do this?
fn main() { show('C'.into()); show(64.into()); }
Sure, you can do that. Just implement a couple traits!
Traits? But we're in the enums sect-
Implementing traits
Ah, here we are. Couple traits, okay, show me!

    fn main() {
        show('C'.into());
        show(64.into());
    }

    enum Either {
        Number(i64),
        Character(char),
    }

    // new: build an Either from an i64
    impl From<i64> for Either {
        fn from(n: i64) -> Self {
            Either::Number(n)
        }
    }

    // new: build an Either from a char
    impl From<char> for Either {
        fn from(c: char) -> Self {
            Either::Character(c)
        }
    }

    fn show(either: Either) {
        match either {
            Either::Number(n) => println!("{n}"),
            Either::Character(c) => println!("{c}"),
        }
    }

$ cargo run --quiet C 64
Hey, that's pretty good! But we haven't declared that From trait anywhere, let's see... ah, here's what it looks like, from the Rust standard library:
pub trait From<T> { fn from(value: T) -> Self; }
Ah, that's refreshingly short. And Self is?
The type you're implementing From<T> for.
And then I suppose Into is also in there somewhere?
pub trait Into<T> { fn into(self) -> T; }
Right! And self is...
...short for self: Self, in that position.
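The reason implementing From was enough to make .into() work is a blanket impl in the standard library. Here is a simplified sketch of the idea, using made-up traits MyFrom and MyInto and a made-up Meters type so nothing clashes with the real ones; the real definitions differ in their details:

```rust
// Hypothetical stand-ins for the standard From and Into traits.
trait MyFrom<T> {
    fn my_from(value: T) -> Self;
}

trait MyInto<T> {
    fn my_into(self) -> T;
}

// The blanket impl: any T that U can be built from gets `my_into` for free.
impl<T, U: MyFrom<T>> MyInto<U> for T {
    fn my_into(self) -> U {
        U::my_from(self)
    }
}

struct Meters(f64);

// Only MyFrom is implemented by hand...
impl MyFrom<f64> for Meters {
    fn my_from(value: f64) -> Self {
        Meters(value)
    }
}

fn main() {
    // ...and `my_into` works thanks to the blanket impl above.
    let m: Meters = 2.5_f64.my_into();
    println!("{} meters", m.0);
}
```

This is also why the usual advice is to implement From rather than Into: the other direction comes along automatically.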
And I suppose there's other traits?
Wait, are Display and Debug traits?
They are! Here, let me show you something:
use std::fmt::Display; fn main() { show(&'C'); show(&64); } fn show(v: &dyn Display) { println!("{v}"); }
$ cargo run --quiet C 64
Whoa. WHOA. Game changer. No .into() needed, it just works? Very cool.
Now let me show you something else:
use std::fmt::Display; fn main() { show(&'C'); show(&64); } fn show(v: impl Display) { println!("{v}"); }
That works too? No way! v can be whichever type implements Display! So nice!
Yes! It's the shorter way of spelling this:
fn show<D: Display>(v: D) { println!("{v}"); }
Ah!!! So that's how you add a... how you tell the compiler that the type must implement something.
A trait bound, yes. There's an even longer way to spell this:
fn show<D>(v: D) where D: Display, { println!("{v}"); }
Okay, that... I mean if you ignore all the punctuation going on, this almost reads like English. If English were maths. Well, the kind of maths compilers think about. Possibly type theory?
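Bounds can also be combined with +. A minimal sketch, with describe as a made-up function that needs both Display and Debug from the same type:

```rust
use std::fmt::{Debug, Display};

/// Formats a value with both its Display and its Debug implementation.
fn describe<T>(v: T) -> String
where
    T: Display + Debug,
{
    format!("display: {v}, debug: {v:?}")
}

fn main() {
    // For a &str, Debug adds quotes; Display does not.
    println!("{}", describe("blah"));
    // For an integer, both look the same.
    println!("{}", describe(42));
}
```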
Return position
Wait, I didn't type that heading. Cool bear??
Shh, look at this.
use std::fmt::Display; fn main() { show(get_char()); show(get_int()); } fn get_char() -> impl Display { 'C' } fn get_int() -> impl Display { 64 } fn show(v: impl Display) { println!("{v}"); }
Okay. So we can use impl Display "in return position", if we don't feel like typing it all out. That's good.
And I suppose, since impl T is much like generics, we can probably do something like:
use std::fmt::Display; fn main() { show(get_char_or_int(true)); show(get_char_or_int(false)); } fn get_char_or_int(give_char: bool) -> impl Display { if give_char { 'C' } else { 64 } } fn show(v: impl Display) { println!("{v}"); }
$ cargo run --quiet error[E0308]: `if` and `else` have incompatible types --> src/main.rs:12:9 | 9 | / if give_char { 10 | | 'C' | | --- expected because of this 11 | | } else { 12 | | 64 | | ^^ expected `char`, found integer 13 | | } | |_____- `if` and `else` have incompatible types For more information about this error, try `rustc --explain E0308`. error: could not compile `grr` due to previous error
Ah. No I cannot.
So our return type is impl Display... ah, and it infers it to be char, because that's the first thing we return! And so the other thing must also be char.
But it's not.
Well I'm lost. Bear, how do we get out of this?
Bear?
...okay maybe... generics? 🤷
fn get_char_or_int<D: Display>(give_char: bool) -> D { if give_char { 'C' } else { 64 } }
$ cargo run --quiet error[E0282]: type annotations needed --> src/main.rs:4:5 | 4 | show(get_char_or_int(true)); | ^^^^ cannot infer type for type parameter `impl Display` declared on the function `show` error[E0308]: mismatched types --> src/main.rs:10:9 | 8 | fn get_char_or_int<D: Display>(give_char: bool) -> D { | - - | | | | | expected `D` because of return type | this type parameter help: consider using an impl return type: `impl Display` 9 | if give_char { 10 | 'C' | ^^^ expected type parameter `D`, found `char` | = note: expected type parameter `D` found type `char` error[E0308]: mismatched types --> src/main.rs:12:9 | 8 | fn get_char_or_int<D: Display>(give_char: bool) -> D { | - - | | | | | expected `D` because of return type | this type parameter help: consider using an impl return type: `impl Display` ... 12 | 64 | ^^ expected type parameter `D`, found integer | = note: expected type parameter `D` found type `{integer}` Some errors have detailed explanations: E0282, E0308. For more information about an error, try `rustc --explain E0282`. error: could not compile `grr` due to 3 previous errors
Err, ew, no, go back, that's even worse.
Yeah that'll never work.
Bear where were you!
Bear business. You wouldn't get it.
I...
It'll never work, but the compiler's got your back: it tells you you should be using impl Display.
But that's what I tried first!
Okay well, the impl Display in question can only be a single type.
But then what good is it?
Okay let's back up. You remember how you made an enum to handle arguments of two different types?
Vaguely? Oh I can do that here too, can't I.
Let's see 🎶
use std::fmt::Display; fn main() { show(get_char_or_int(true)); show(get_char_or_int(false)); } enum Either { Char(char), Int(i64), } fn get_char_or_int(give_char: bool) -> Either { if give_char { Either::Char('C') } else { Either::Int(64) } } fn show(v: impl Display) { println!("{v}"); }
$ cargo run --quiet error[E0277]: `Either` doesn't implement `std::fmt::Display` --> src/main.rs:4:10 | 4 | show(get_char_or_int(true)); | ---- ^^^^^^^^^^^^^^^^^^^^^ `Either` cannot be formatted with the default formatter | | | required by a bound introduced by this call | = help: the trait `std::fmt::Display` is not implemented for `Either` = note: in format strings you may be able to use `{:?}` (or {:#?} for pretty-print) instead note: required by a bound in `show` --> src/main.rs:21:17 | 21 | fn show(v: impl Display) { | ^^^^^^^ required by this bound in `show` error[E0277]: `Either` doesn't implement `std::fmt::Display` --> src/main.rs:5:10 | 5 | show(get_char_or_int(false)); | ---- ^^^^^^^^^^^^^^^^^^^^^^ `Either` cannot be formatted with the default formatter | | | required by a bound introduced by this call | = help: the trait `std::fmt::Display` is not implemented for `Either` = note: in format strings you may be able to use `{:?}` (or {:#?} for pretty-print) instead note: required by a bound in `show` --> src/main.rs:21:17 | 21 | fn show(v: impl Display) { | ^^^^^^^ required by this bound in `show` For more information about this error, try `rustc --explain E0277`. error: could not compile `grr` due to 2 previous errors
Oh, wait, wait, I know this! I can just implement Display for Either:

    impl Display for Either {
        // ...
    }

Wait, what do I put in there?
Use the rust-analyzer code generation assist.
You do have it installed, right?
Yes haha, of course, yes. Okay so Ctrl+. (Cmd+. on macOS), pick "Implement missing members", and... it gives me this:
impl Display for Either { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { todo!() } }
...and then I guess I just match on self? To call either the Display implementation for char or for i64?

    impl Display for Either {
        fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
            match self {
                //
            }
        }
    }

Wait, what do I write there?
Use the rust-analyzer code generation assist.
Sounding like a broken record, you doing ok bear?
I am. There's a different code generation assist for this. Alternatively, GitHub Copilot might write the whole block for you.
It's getting better. It's learning.
Okay, using the "Fill match arms" assist...
impl Display for Either { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { match self { Either::Char(_) => todo!(), Either::Int(_) => todo!(), } } }
Okay I can do the rest!
impl Display for Either { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { match self { Either::Char(c) => c.fmt(f), Either::Int(i) => i.fmt(f), } } }
And this now runs!
$ cargo run --quiet C 64
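Implementing Display pays off beyond println!: any type that implements Display also gets a to_string method and works with format!. A self-contained sketch that repeats the Either type from above:

```rust
use std::fmt::{self, Display};

enum Either {
    Char(char),
    Int(i64),
}

impl Display for Either {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // Forward to the Display implementations of char and i64.
            Either::Char(c) => c.fmt(f),
            Either::Int(i) => i.fmt(f),
        }
    }
}

fn main() {
    // to_string comes from the standard library's ToString trait, which
    // is implemented for every type that implements Display.
    let s = Either::Char('C').to_string();
    println!("{s}");

    // Formatting options such as width travel along with the Formatter,
    // so forwarding `f` keeps them working.
    println!("[{:>5}]", Either::Int(64));
}
```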
Nice. But that was, like, super verbose. Can we make it less verbose?
Sure! You can use the delegate crate, for instance.
Okay okay I remember that bit, so you just:
$ cargo add delegate Updating 'https://github.com/rust-lang/crates.io-index' index Adding delegate v0.6.2 to dependencies.
And then... wait, what do we delegate to?
Oh I'll give you this one for free:
impl Either { fn display(&self) -> &dyn Display { match self { Either::Char(c) => c, Either::Int(i) => i, } } }
Wait wait wait but that's interesting as heck. You don't need traits to add methods to types like that? You can return a &dyn Trait object? That borrows from &self? Which is short for self: &Self? And it extends the lifetime of the receiver, also called a borrow-through???
Heyyyyyyyyy now where did you learn all that, we covered nothing of this.
Hehehe okay forget about it.
Okay so now that we've got a display method we can do this:
impl Display for Either { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { self.display().fmt(f) } }
And that's where the delegate crate comes in to make things simpler (or at least shorter), mhh, looking at the README, we can probably do...
impl Display for Either { delegate::delegate! { to self.display() { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result; } } }
Yeah! Or, you know, use delegate::delegate; first, and then you can just call the macro with delegate! instead of qualifying it with delegate::delegate!.
There's even a rust-analyzer assist for it: "replace qualified path with use".
Macros? Qualified paths? Wow, we're glossing over a lot of things.
Not that many, but yes.
Anyway, that all works! Here's the complete listing:
use delegate::delegate; use std::fmt::Display; fn main() { show(get_char_or_int(true)); show(get_char_or_int(false)); } impl Either { fn display(&self) -> &dyn Display { match self { Either::Char(c) => c, Either::Int(i) => i, } } } impl Display for Either { delegate! { to self.display() { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result; } } } enum Either { Char(char), Int(i64), } fn get_char_or_int(give_char: bool) -> Either { if give_char { Either::Char('C') } else { Either::Int(64) } } fn show(v: impl Display) { println!("{v}"); }
$ cargo run --quiet C 64
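For what it's worth, the "borrows from &self" remark can be spelled out. This is only a sketch: the lifetime name 'a is made up for illustration, and the point is that the elided signature of display in the listing is shorthand for tying the returned trait object to the borrow of self.

```rust
use std::fmt::Display;

enum Either {
    Char(char),
    Int(i64),
}

impl Either {
    // Same method as in the listing, with the elided lifetime written out:
    // the returned trait object may live no longer than the borrow of `self`.
    fn display<'a>(&'a self) -> &'a dyn Display {
        match self {
            Either::Char(c) => c,
            Either::Int(i) => i,
        }
    }
}

fn main() {
    let e = Either::Int(64);
    let d = e.display(); // `d` borrows from `e`
    println!("{d}");
    println!("{}", Either::Char('C').display());
}
```

Nothing changes at runtime; the explicit version just shows where the borrow comes from.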
But... it feels a little wrong to have to write all that code just to do that.
Ah, that's because you don't!
Dynamically-sized types
Uhhh. What does any of that mean?
Okay, so it's more implementation details: just like bit widths (u32 vs u64), etc. But details are where the devil vacations.
Try printing the size of a few things with std::mem::size_of.
Okay then!
fn main() { dbg!(std::mem::size_of::<u32>()); dbg!(std::mem::size_of::<u64>()); dbg!(std::mem::size_of::<u128>()); }
$ cargo run --quiet [src/main.rs:2] std::mem::size_of::<u32>() = 4 [src/main.rs:3] std::mem::size_of::<u64>() = 8 [src/main.rs:4] std::mem::size_of::<u128>() = 16
Okay, 32 bits is 4 bytes, that checks out on x86_64.
Wait, where did you learn that syntax?
Ehh you showed it to me with typeof and, I looked it up: turns out it's named turbofish syntax! The name was cute, so I remembered.
Okay, now try references.
Sure!
fn main() { dbg!(std::mem::size_of::<&u32>()); dbg!(std::mem::size_of::<&u64>()); dbg!(std::mem::size_of::<&u128>()); }
$ cargo run --quiet [src/main.rs:2] std::mem::size_of::<&u32>() = 8 [src/main.rs:3] std::mem::size_of::<&u64>() = 8 [src/main.rs:4] std::mem::size_of::<&u128>() = 8
Yeah, they're all 64-bit! Again, I'm on an x86_64 CPU right now, so that's not super surprising.
Now try trait objects.
Oh, the dyn Trait stuff?
use std::fmt::Debug; fn main() { dbg!(std::mem::size_of::<dyn Debug>()); }
$ cargo run --quiet error[E0277]: the size for values of type `dyn std::fmt::Debug` cannot be known at compilation time --> src/main.rs:4:10 | 4 | dbg!(std::mem::size_of::<dyn Debug>()); | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ doesn't have a size known at compile-time | = help: the trait `Sized` is not implemented for `dyn std::fmt::Debug` note: required by a bound in `std::mem::size_of` --> /home/amos/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/mem/mod.rs:304:22 | 304 | pub const fn size_of<T>() -> usize { | ^ required by this bound in `std::mem::size_of` For more information about this error, try `rustc --explain E0277`. error: could not compile `grr` due to previous error
Oh. But that's... mhh.
What type is dyn Debug? What size would you expect it to have?
I don't know, I suppose... I suppose a lot of types implement Debug? Like, u32 does, u64 does, u128 does too, and String, and...
Exactly. It could be any of these, and then some. So it's impossible to know what size it is, because it could have any size.
Heck, even the empty tuple type, (), implements Debug!
fn main() { dbg!(std::mem::size_of::<()>()); println!("{:?}", ()); }
$ cargo run --quiet [src/main.rs:2] std::mem::size_of::<()>() = 0 ()
...and it's a zero-sized type! (a ZST). So dyn Debug, or any other "trait object", is a DST: a dynamically-sized type.
Wait, but we did return a &dyn Display at some point, right?
Ah, yes, but references al-
...all have the same size! Right!!! Because you're not holding the actual value, you're just holding the address of it!
Exactly!
use std::mem::size_of_val; fn main() { let a = 101_u128; println!("{:16}, of size {}", a, size_of_val(&a)); println!("{:16p}, of size {}", &a, size_of_val(&&a)); }
$ cargo run --quiet 101, of size 16 0x7ffdc4fb8af8, of size 8
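To see the "dynamically-sized" part in action, here's a small sketch. The helper name size_behind is made up for illustration; it relies on std::mem::size_of_val accepting unsized types (unlike size_of), so the size gets looked up at runtime through the reference.

```rust
use std::fmt::Debug;
use std::mem::size_of_val;

// Hypothetical helper: reports the size of whatever value hides behind the
// trait object. The answer is only known at runtime.
fn size_behind(v: &dyn Debug) -> usize {
    size_of_val(v)
}

fn main() {
    println!("u32:     {}", size_behind(&1_u32));
    println!("u128:    {}", size_behind(&1_u128));
    println!("():      {}", size_behind(&()));
    println!("[u8; 3]: {}", size_behind(&[0_u8; 3]));
}
```

Same function, same parameter type, four different sizes: that's why the compiler refuses to give dyn Debug a single size at compile time.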
And so uh... what was that about us not needing the enum at all?
We're getting to it!
Storing stuff in structs
Oh structs, those are easy, just like other languages right?
Like that:
#[derive(Debug)] struct Vec2 { x: f64, y: f64, } fn main() { let v = Vec2 { x: 1.0, y: 2.0 }; println!("v = {v:#?}"); }
Wait, #[derive(Debug)]? I don't think we've quite reached that part of the curriculum yet... in fact I don't see it in there at all.
Oh it's just a macro that can implement a trait for you, in this case it expands to something like this:
use std::fmt; impl fmt::Debug for Vec2 { fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { f.debug_struct("Vec2") .field("x", &self.x) .field("y", &self.y) .finish() } }
Well well well look who's teaching who now?
No it's types I'm struggling with, the rest is easy peasy limey squeezy.
But not structs, structs are easy, this, my program runs:
$ cargo run --quiet v = Vec2 { x: 1.0, y: 2.0, }
Okay, now make a function that adds two Vec2!
Alright!
#[derive(Debug)] struct Vec2 { x: f64, y: f64, } impl Vec2 { fn add(self, other: Vec2) -> Vec2 { Vec2 { x: self.x + other.x, y: self.y + other.y, } } } fn main() { let v = Vec2 { x: 1.0, y: 2.0 }; let w = Vec2 { x: 9.0, y: 18.0 }; dbg!(v.add(w)); }
$ cargo run --quiet [src/main.rs:21] v.add(w) = Vec2 { x: 10.0, y: 20.0, }
Now call add twice!
fn main() { let v = Vec2 { x: 1.0, y: 2.0 }; let w = Vec2 { x: 9.0, y: 18.0 }; dbg!(v.add(w)); dbg!(v.add(w)); }
$ cargo run --quiet error[E0382]: use of moved value: `v` --> src/main.rs:22:10 | 19 | let v = Vec2 { x: 1.0, y: 2.0 }; | - move occurs because `v` has type `Vec2`, which does not implement the `Copy` trait 20 | let w = Vec2 { x: 9.0, y: 18.0 }; 21 | dbg!(v.add(w)); | ------ `v` moved due to this method call 22 | dbg!(v.add(w)); | ^ value used here after move | note: this function takes ownership of the receiver `self`, which moves `v` --> src/main.rs:10:12 | 10 | fn add(self, other: Vec2) -> Vec2 { | ^^^^ error[E0382]: use of moved value: `w` --> src/main.rs:22:16 | 20 | let w = Vec2 { x: 9.0, y: 18.0 }; | - move occurs because `w` has type `Vec2`, which does not implement the `Copy` trait 21 | dbg!(v.add(w)); | - value moved here 22 | dbg!(v.add(w)); | ^ value used here after move For more information about this error, try `rustc --explain E0382`. error: could not compile `grr` due to 2 previous errors
Erm, doesn't work.
Do you know why?
I mean it says stuff? Something something Vec2 does not implement Copy, yet more traits, okay, so it gets "moved".
Wait we can probably work around this with Clone!
//              👇
#[derive(Debug, Clone)]
struct Vec2 {
    x: f64,
    y: f64,
}

fn main() {
    let v = Vec2 { x: 1.0, y: 2.0 };
    let w = Vec2 { x: 9.0, y: 18.0 };
    dbg!(v.clone().add(w.clone()));
    dbg!(v.add(w));
}
Okay it works again!
What if you don't want to call .clone()?
Then I guess... Copy?
#[derive(Debug, Clone, Copy)] struct Vec2 { x: f64, y: f64, } fn main() { let v = Vec2 { x: 1.0, y: 2.0 }; let w = Vec2 { x: 9.0, y: 18.0 }; dbg!(v.add(w)); dbg!(v.add(w)); }
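There's a third option worth sketching alongside Clone and Copy: have add borrow its operands instead of taking ownership. This is an illustrative variant, not the version used in the rest of the article (the PartialEq derive is only there so results can be compared).

```rust
#[derive(Debug, PartialEq)]
struct Vec2 {
    x: f64,
    y: f64,
}

impl Vec2 {
    // Takes both operands by shared reference, so nothing is moved.
    fn add(&self, other: &Vec2) -> Vec2 {
        Vec2 {
            x: self.x + other.x,
            y: self.y + other.y,
        }
    }
}

fn main() {
    let v = Vec2 { x: 1.0, y: 2.0 };
    let w = Vec2 { x: 9.0, y: 18.0 };
    dbg!(v.add(&w));
    dbg!(v.add(&w)); // still fine: `v` and `w` were only borrowed
}
```

For a tiny struct like this one, Copy is probably the more convenient choice; borrowing starts to matter once the struct owns something expensive to duplicate.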
Very good! Now forget about all that code, and tell me what's the type of "hello world"?
Ah, I'll just re-use the type_name_of function you gave me... one sec...
fn main() { dbg!(type_name_of("hello world")); } fn type_name_of<T>(_: T) -> &'static str { std::any::type_name::<T>() }
$ cargo run --quiet [src/main.rs:2] type_name_of("hello world") = "&str"
There it is! It's &str!
Alright! Now store it in a struct!
Sure, easy enough:
#[derive(Debug)] struct Message { text: &str, } fn main() { let msg = Message { text: "hello world", }; dbg!(msg); }
$ cargo run --quiet error[E0106]: missing lifetime specifier --> src/main.rs:3:11 | 3 | text: &str, | ^ expected named lifetime parameter | help: consider introducing a named lifetime parameter | 2 ~ struct Message<'a> { 3 ~ text: &'a str, | For more information about this error, try `rustc --explain E0106`. error: could not compile `grr` due to previous error
Oh. Not easy enough.
The compiler is showing you the way: heed its advice!
Okay, sure:
#[derive(Debug)]
//            👇
struct Message<'a> {
    //     👇
    text: &'a str,
}
$ cargo run --quiet [src/main.rs:12] msg = Message { text: "hello world", }
Okay, now read the file src/main.rs as a string, and store a reference to it in a Message.
Fine, fine, so, reading files... std::fs perhaps?
fn main() { let code = std::fs::read_to_string("src/main.rs").unwrap(); let msg = Message { text: &code }; dbg!(msg); }
$ cargo run --quiet [src/main.rs:9] msg = Message { text: "#[derive(Debug)]\nstruct Message<'a> {\n text: &'a str,\n}\n\nfn main() {\n let code = std::fs::read_to_string(\"src/main.rs\").unwrap();\n let msg = Message { text: &code };\n dbg!(msg);\n}\n", }
Okay, I did it! What now?
Now move all the code to construct the Message into a separate function!
Like this?
#[derive(Debug)] struct Message<'a> { text: &'a str, } fn main() { let msg = get_msg(); dbg!(msg); } fn get_msg() -> Message { let code = std::fs::read_to_string("src/main.rs").unwrap(); Message { text: &code } }
$ cargo run --quiet error[E0106]: missing lifetime specifier --> src/main.rs:11:17 | 11 | fn get_msg() -> Message { | ^^^^^^^ expected named lifetime parameter | = help: this function's return type contains a borrowed value, but there is no value for it to be borrowed from help: consider using the `'static` lifetime | 11 | fn get_msg() -> Message<'static> { | ~~~~~~~~~~~~~~~~ For more information about this error, try `rustc --explain E0106`. error: could not compile `grr` due to previous error
Erm, not happy.
Okay, that's lifetime stuff. We're not there yet. What's the only thing you use the Message for?
Passing it to the dbg! macro?
And what does that use?
Probably the Debug trait?
So what can we change the return type to?
Ohhhh impl Debug! To let the compiler figure it out!
fn get_msg() -> impl std::fmt::Debug { let code = std::fs::read_to_string("src/main.rs").unwrap(); Message { text: &code } }
$ cargo run --quiet error[E0597]: `code` does not live long enough --> src/main.rs:13:21 | 11 | fn get_msg() -> impl std::fmt::Debug { | -------------------- opaque type requires that `code` is borrowed for `'static` 12 | let code = std::fs::read_to_string("src/main.rs").unwrap(); 13 | Message { text: &code } | ^^^^^ borrowed value does not live long enough 14 | } | - `code` dropped here while still borrowed | help: you can add a bound to the opaque type to make it last less than `'static` and match `'static` | 11 | fn get_msg() -> impl std::fmt::Debug + 'static { | +++++++++ For more information about this error, try `rustc --explain E0597`. error: could not compile `grr` due to previous error
Huh. That seems like... a lifetime problem? I thought we weren't at lifetimes yet.
We are now.
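Before diving in, here's a rough sketch of one arrangement the borrow checker does accept: keep Message<'a> borrowing, but let the caller own the String. The make_msg name is hypothetical, and a string literal's contents stand in for the file so the snippet runs anywhere.

```rust
#[derive(Debug)]
struct Message<'a> {
    text: &'a str,
}

// Hypothetical variant of `get_msg`: the caller owns the text, and the
// returned `Message` borrows from it for as long as `'a` lasts.
fn make_msg<'a>(code: &'a str) -> Message<'a> {
    Message { text: code }
}

fn main() {
    // Stand-in for `std::fs::read_to_string(...)`: an owned `String`.
    let code = String::from("fn main() {}");
    let msg = make_msg(&code);
    dbg!(&msg);
    // `code` is dropped at the end of `main`, after `msg` is done with it.
}
```

The difference with the failing version is only about who owns the String: here it outlives the Message that points into it.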
Lifetimes and ownership
Look this is all moving a little fast for me, I'd just like to-
You can go back and read the transcript later! For now, what's the type returned by std::fs::read_to_string?
Uhhh it's-
Don't go look at the definition. No time. Just do this:
fn get_msg() -> impl std::fmt::Debug {
    //        👇
    let code: () = std::fs::read_to_string("src/main.rs").unwrap();
    Message { text: &code }
}
$ cargo run --quiet error[E0308]: mismatched types --> src/main.rs:12:20 | 12 | let code: () = std::fs::read_to_string("src/main.rs").unwrap(); | -- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected `()`, found struct `String` | | | expected due to this
rust-analyzer was showing me the type as an inlay, you know...
Oh, you installed it! Good. Anyway, it's String. Try storing that inside the struct.
Okay. I guess we won't need that 'a anymore...
#[derive(Debug)]
struct Message {
    // 👇
    text: String,
}

fn main() {
    let msg = get_msg();
    dbg!(msg);
}

fn get_msg() -> impl std::fmt::Debug {
    let code = std::fs::read_to_string("src/main.rs").unwrap();
    // 👇 (the `&` is gone)
    Message { text: code }
}
Okay, why does this work when the other one didn't?
Because uhhhh, the &str was a... reference?
Yes, and?
And that means it borrowed from something? In this case the result of std::fs::read_to_string?
Yes, and??
And that meant we could not return that reference, because code dropped (which means it got freed) at the end of the function, and so the reference would be dangling?
Veeeery goooood! And it works as a String because?
Well, I guess it doesn't borrow? Like, the result of read_to_string is moved into Message, and so we take ownership of it, and we can move it anywhere we please?
Exactly! Suspiciously exact, even. Are you sure this is your first time?
👼
Very well, boss baby, do you know of other types that let you own a string?
Ah, there's a couple! Box<str> will work, for example:
#[derive(Debug)]
struct Message {
    // 👇
    text: Box<str>,
}

fn main() {
    let msg = get_msg();
    dbg!(msg);
}

fn get_msg() -> impl std::fmt::Debug {
    let code = std::fs::read_to_string("src/main.rs").unwrap();
    // 👇
    Message { text: code.into() }
}
And that one has exclusive ownership. Whereas something like Arc<str> will, well, it'll also work:
use std::sync::Arc; #[derive(Debug)] struct Message { text: Arc<str>, }
But that one's shared ownership. You can hand out clones of it and so multiple structs can point to the same thing:
use std::sync::Arc; #[derive(Debug)] struct Message { text: Arc<str>, } fn main() { let a = get_msg(); let b = Message { text: a.text.clone(), }; let c = Message { text: b.text.clone(), }; dbg!(a.text.as_ptr(), b.text.as_ptr(), c.text.as_ptr()); } fn get_msg() -> Message { let code = std::fs::read_to_string("src/main.rs").unwrap(); Message { text: code.into() } }
$ cargo run --quiet [src/main.rs:16] a.text.as_ptr() = 0x0000555f4e9d8d80 [src/main.rs:16] b.text.as_ptr() = 0x0000555f4e9d8d80 [src/main.rs:16] c.text.as_ptr() = 0x0000555f4e9d8d80
But you can't modify it.
Well, it's pretty awkward to mutate a &mut str to begin with!
Yeah. It's easier to show that with a &mut [u8].
Oh you're the professor now huh?
Sure! Watch me make a table:
| | Text (UTF-8) | Bytes |
|---|---|---|
| Immutable reference / slice | &str | &[u8] |
| Owned, can grow | String | Vec<u8> |
| Owned, fixed len | Box<str> | Box<[u8]> |
| Shared ownership (atomic) | Arc<str> | Arc<[u8]> |
Now where... where did you find that? You're not even telling people about Rc!
Eh, by the time they're worried about the cost of atomic reference counting, they can do their own research. And then they'll have a nice surprise: free performance!
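For the curious, here's a small sketch of how the owning rows of that table relate to each other, with Rc thrown in. The helper name owned_flavors is made up; the conversions themselves are the standard From<String> impls for Box<str>, Rc<str> and Arc<str>.

```rust
use std::rc::Rc;
use std::sync::Arc;

// Hypothetical helper: turn one `String` into each owning flavor.
fn owned_flavors(s: String) -> (Box<str>, Rc<str>, Arc<str>) {
    (s.clone().into(), s.clone().into(), s.into())
}

fn main() {
    let (boxed, rc, arc) = owned_flavors(String::from("hello"));

    // Cloning an `Arc` (or `Rc`) bumps a counter; the text itself isn't copied.
    let arc2 = Arc::clone(&arc);
    let rc2 = Rc::clone(&rc);
    println!("same allocation (Arc): {}", Arc::ptr_eq(&arc, &arc2));
    println!("same allocation (Rc):  {}", Rc::ptr_eq(&rc, &rc2));
    println!(
        "strong counts: {} / {}",
        Arc::strong_count(&arc),
        Rc::strong_count(&rc)
    );

    // All of them deref to `str`, so they can be read the same way.
    println!("{} {} {}", &*boxed, &*rc, &*arc);
}
```

Rc is the non-atomic sibling of Arc: same idea, cheaper counter, but it can't be sent across threads.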
There is one thing that's a bit odd, though. In the table above, we have an equivalence between str and [u8]. What are those types?
Ah! Those. Well...
Slices and arrays
Try printing the size of the str and [u8] types!
Okay sure!
use std::mem::size_of; fn main() { dbg!(size_of::<str>()); dbg!(size_of::<[u8]>()); }
Wait, no, we can't:
$ cargo run --quiet error[E0277]: the size for values of type `str` cannot be known at compilation time --> src/main.rs:4:20 | 4 | dbg!(size_of::<str>()); | ^^^ doesn't have a size known at compile-time | = help: the trait `Sized` is not implemented for `str` note: required by a bound in `std::mem::size_of` --> /home/amos/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/mem/mod.rs:304:22 | 304 | pub const fn size_of<T>() -> usize { | ^ required by this bound in `std::mem::size_of` error[E0277]: the size for values of type `[u8]` cannot be known at compilation time --> src/main.rs:5:20 | 5 | dbg!(size_of::<[u8]>()); | ^^^^ doesn't have a size known at compile-time | = help: the trait `Sized` is not implemented for `[u8]` note: required by a bound in `std::mem::size_of` --> /home/amos/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/mem/mod.rs:304:22 | 304 | pub const fn size_of<T>() -> usize { | ^ required by this bound in `std::mem::size_of` For more information about this error, try `rustc --explain E0277`. error: could not compile `grr` due to 2 previous errors
Correct! What about the size of &str and &[u8]?
use std::mem::size_of; fn main() { dbg!(size_of::<&str>()); dbg!(size_of::<&[u8]>()); }
$ cargo run --quiet [src/main.rs:4] size_of::<&str>() = 16 [src/main.rs:5] size_of::<&[u8]>() = 16
Ah, those we can! 16 bytes, that's... 2x8 bytes... two pointers!
Yes! Start and length.
Okay, so those are always references because... nothing else makes sense? Like, we don't know the size of the thing we're borrowing a slice of?
Yes! And the thing we're borrowing from can be... a lot of different things.
Let's take &[u8]: what types can you borrow a &[u8] out of?
Well... the heading says "arrays" so I'm gonna assume it works for arrays:
use std::mem::size_of_val; fn main() { let arr = [1, 2, 3, 4, 5]; let slice = &arr[1..4]; dbg!(size_of_val(&arr)); dbg!(size_of_val(&slice)); print_byte_slice(slice); } fn print_byte_slice(slice: &[u8]) { println!("{slice:?}"); }
$ cargo run --quiet [src/main.rs:6] size_of_val(&arr) = 5 [src/main.rs:7] size_of_val(&slice) = 16 [2, 3, 4]
Okay, yes.
What else?
I guess, anything we had in that table under "bytes"?
It should definitely work for Vec<u8>
use std::mem::size_of_val; fn main() { let vec = vec![1, 2, 3, 4, 5]; let slice = &vec[1..4]; dbg!(size_of_val(&vec)); dbg!(size_of_val(&slice)); print_byte_slice(slice); } fn print_byte_slice(slice: &[u8]) { println!("{slice:?}"); }
$ cargo run --quiet [src/main.rs:6] size_of_val(&vec) = 24 [src/main.rs:7] size_of_val(&slice) = 16 [2, 3, 4]
Wait, 24 bytes?
Yeah! Start, length, capacity. Not necessarily in that order. Rust doesn't guarantee a particular type layout anyway, so you shouldn't rely on it.
Next up is Box<[u8]>:
use std::mem::size_of_val; fn main() { let bbox: Box<[u8]> = Box::new([1, 2, 3, 4, 5]); let slice = &bbox[1..4]; dbg!(size_of_val(&bbox)); dbg!(size_of_val(&slice)); print_byte_slice(slice); } fn print_byte_slice(slice: &[u8]) { println!("{slice:?}"); }
$ cargo run --quiet [src/main.rs:6] size_of_val(&bbox) = 16 [src/main.rs:7] size_of_val(&slice) = 16 [2, 3, 4]
Ha, 2x8 bytes each. I suppose... a Box<[u8]> is exactly like a &[u8] except... it has ownership of the data it points to? So we can move it and stuff? And dropping it frees the data?
Yup! And you forgot one: slices of slices.
use std::mem::size_of_val; fn main() { let arr = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]; let slice = &arr[2..7]; let slice_of_slice = &slice[2..]; dbg!(size_of_val(&slice_of_slice)); print_byte_slice(slice_of_slice); } fn print_byte_slice(slice: &[u8]) { println!("{slice:?}"); }
$ cargo run --quiet [src/main.rs:7] size_of_val(&slice_of_slice) = 16 [5, 6, 7]
Very cool.
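To make "start and length" a bit more tangible, here's a sketch that checks where a sub-slice begins relative to the array it borrows from. The offset_in helper is hypothetical and only meaningful when sub really does point inside base.

```rust
// Hypothetical helper: how many bytes into `base` does `sub` start?
// Only meaningful when `sub` was borrowed from `base`.
fn offset_in(base: &[u8], sub: &[u8]) -> usize {
    sub.as_ptr() as usize - base.as_ptr() as usize
}

fn main() {
    let arr: [u8; 10] = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
    let slice = &arr[2..7];
    let slice_of_slice = &slice[2..];

    // The "start" half of each slice is just an address inside `arr`...
    println!("slice starts at offset {}", offset_in(&arr, slice));
    println!("slice_of_slice starts at offset {}", offset_in(&arr, slice_of_slice));
    // ...and the "length" half travels alongside it.
    println!("lengths: {} and {}", slice.len(), slice_of_slice.len());
}
```

No copying happens anywhere: both slices are views into the same ten bytes.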
So wait, just to back up: arrays are [T; n], and slices are &[T]. We know the size of arrays because we know how many elements they have, and we know the size of &[T] because it's just start + length.
But we don't know the size of [T] because...
Because the slice could borrow from anything! As we've seen: [u8; n], Vec<u8>, Box<[u8]>, Arc<[u8]>, another slice...
Ah. So we don't know its size.
Wait wait wait.
That makes [T] a dynamically-sized type? Just like trait objects?
Yes, it is a DST.
And we can just do Box<[T]>?
Sure! That's just an owning pointer.
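As a quick sketch of that: a Vec<u8> can be turned into a Box<[u8]> with Vec::into_boxed_slice, which gives up the spare capacity and keeps just pointer + length. The freeze name is made up for illustration.

```rust
use std::mem::size_of_val;

// Hypothetical helper: drop the spare capacity, keep the bytes.
fn freeze(v: Vec<u8>) -> Box<[u8]> {
    v.into_boxed_slice()
}

fn main() {
    let mut vec = Vec::with_capacity(64);
    vec.extend_from_slice(&[1_u8, 2, 3]);

    let boxed = freeze(vec);
    // Size of the box itself (the owning pointer)...
    println!("box:     {} bytes", size_of_val(&boxed));
    // ...versus the size of the `[u8]` it points to.
    println!("pointee: {} bytes", size_of_val(&*boxed));
    println!("{boxed:?}");
}
```

The second size_of_val call is the interesting one: it's measuring a [u8], the dynamically-sized type itself, which only works because the length is stored next to the pointer.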
Ooooh that gives me an idea.
Boxed trait objects
So! Deep breaths. If I followed correctly, that means that, although we don't know the size of dyn Display, we know the size of Box<dyn Display>: it should be the same size as &dyn Display, it just has ownership of its... of the thing it points to.
Its pointee, yeah. Also, same with Arc<dyn Display>, or any other smart pointer.
Okay let me check it real quick:
use std::{fmt::Display, mem::size_of, rc::Rc, sync::Arc}; fn main() { dbg!(size_of::<&dyn Display>()); dbg!(size_of::<Box<dyn Display>>()); dbg!(size_of::<Arc<dyn Display>>()); dbg!(size_of::<Rc<dyn Display>>()); }
$ cargo run --quiet [src/main.rs:4] size_of::<&dyn Display>() = 16 [src/main.rs:5] size_of::<Box<dyn Display>>() = 16 [src/main.rs:6] size_of::<Arc<dyn Display>>() = 16 [src/main.rs:7] size_of::<Rc<dyn Display>>() = 16
Okay, okay! They're all the same size, the size of a p-.. of two pointers? What?
Yeah! Data and vtable. You remember how you couldn't do anything with the values in your first generic function?
That one?
fn show<T>(a: T) { todo!() }
The very same. Well there's two ways to solve this. Either you add a trait bound, like so:
fn show<T: std::fmt::Display>(a: T) { // blah }
And then a different version of show gets generated for every type you call it with.
Oooh, right! That's uhh... it's called... discombobulation?
Monomorphization. show is "polymorphic" because it can take multiple forms, and it gets replaced with many "monomorphic" versions of itself, that each handle a certain combination of types.
Okay, so that's one way. And the other way?
You take a reference to a trait object: &dyn Trait.
And that helps how?
Well, it points to the value itself, and a list of all functions required by the trait. And only those.
Oh. Oh! And that's the vtable? It's just "the concrete type's implementation of every function listed in the trait definition"?
Yes. But can you define "concrete type" for me?
Well... let's take this:
use std::fmt::Display; fn main() { let x: u64 = 42; show(x); } fn show<D: Display>(d: D) { println!("{}", d); }
In that case, I'd call D the type parameter (or generic type?), and u64 the concrete type.
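Side by side, the two approaches look something like this. It's a sketch with made-up function names (show_generic, show_dyn); both return a String instead of printing so the results are easy to compare.

```rust
use std::fmt::Display;
use std::mem::size_of;

// Static dispatch: the compiler generates one copy per concrete `T`.
fn show_generic<T: Display>(value: T) -> String {
    format!("{value}")
}

// Dynamic dispatch: a single copy; `fmt` is found through the vtable.
fn show_dyn(value: &dyn Display) -> String {
    format!("{value}")
}

fn main() {
    println!("{}", show_generic(42_u64));
    println!("{}", show_generic('C'));
    println!("{}", show_dyn(&42_u64));
    println!("{}", show_dyn(&'C'));

    // A plain reference is one pointer wide; a trait-object reference
    // carries a second pointer for the vtable.
    println!("&u64:         {}", size_of::<&u64>());
    println!("&dyn Display: {}", size_of::<&dyn Display>());
}
```

Same output either way; the trade-off is code size and inlining on one side, and a pointer indirection on the other.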
Okay, I was just making sure. You were about to have an epiphany?
I was? Oh, right!
$ cargo run --quiet [src/main.rs:4] size_of::<&dyn Display>() = 16 [src/main.rs:5] size_of::<Box<dyn Display>>() = 16 [src/main.rs:6] size_of::<Arc<dyn Display>>() = 16 [src/main.rs:7] size_of::<Rc<dyn Display>>() = 16
So these all have the same size.
And the last time we tried returning a dyn Display we ran into trouble because, well, it's dynamically-sized:
use std::fmt::Display; fn main() { let x = get_display(); show(x); } fn get_display() -> dyn Display { let x: u64 = 42; x } fn show<D: Display>(d: D) { println!("{}", d); }
$ cargo run --quiet error[E0746]: return type cannot have an unboxed trait object --> src/main.rs:3:21 | 3 | fn get_display() -> dyn Display { | ^^^^^^^^^^^ doesn't have a size known at compile-time | = note: for information on `impl Trait`, see <https://doc.rust-lang.org/book/ch10-02-traits.html#returning-types-that-implement-traits> help: use `impl Display` as the return type, as all return paths are of type `u64`, which implements `Display` | 3 | fn get_display() -> impl Display { | ~~~~~~~~~~~~ (other errors omitted)
But -> impl Display worked, as the compiler suggests:
fn get_display() -> impl Display { let x: u64 = 42; x }
Because it's sorta like this:
fn get_display<D: Display>() -> D { let x: u64 = 42; x }
Nooooooo no no no. Verboten. Can't do that!
Yeah, you told me! You didn't explain why, though.
Because, and read this very carefully:
When a generic function is called, its type parameters are picked by the caller, usually inferred from its inputs; the function's body never gets to choose them.
Ah, erm. Wait so it would work if D was also somewhere in the type of a parameter?
Yeah! Consider this:
fn main() { dbg!(add_10(5)); } fn add_10<N>(n: N) -> N { n + 10 }
Wait, that doesn't compile!
$ cargo run --quiet error[E0369]: cannot add `{integer}` to `N` --> src/main.rs:6:7 | 6 | n + 10 | - ^ -- {integer} | | | N
No. But you also truncated the compiler's output.
Here's the rest of it.
help: consider restricting type parameter `N` | 5 | fn add_10<N: std::ops::Add<Output = {integer}>>(n: N) -> N { | +++++++++++++++++++++++++++++++++++
It's not the same issue. The problem here is that N could be anything.
Including types that we cannot add 10 to.
Here's a working version:
fn main() { dbg!(add_10(1_u8)); dbg!(add_10(2_u16)); dbg!(add_10(3_u32)); dbg!(add_10(4_u64)); } fn add_10<N>(n: N) -> N where N: From<u8> + std::ops::Add<Output = N>, { n + 10.into() }
Yeesh that's... gnarly.
Yeah. It's also a super contrived example.
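One nuance, sketched here with a hypothetical ten helper: a type parameter can appear only in the return type, as long as it's the caller who pins it down (with an annotation or a turbofish). What the body can't do is pick u64 on the caller's behalf, which is what get_display<D> was attempting.

```rust
// Hypothetical helper: `N` appears only in the return type, so the caller
// has to say which `N` it wants.
fn ten<N: From<u8>>() -> N {
    N::from(10_u8)
}

fn main() {
    let a: u16 = ten(); // picked by the type annotation
    let b = ten::<u64>(); // picked by turbofish
    let c = 1.5_f64 + ten::<f64>();
    println!("{a} {b} {c}");
}
```

The body works for every N that satisfies the bound, which is the opposite of "the body decides what N is".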
But okay, I get it: impl Trait in return position is the only way to have something about the function signature that's inferred from... its body.
Yes! Which is why both these get_ functions work:
use std::fmt::Display; fn main() { show(get_char()); show(get_int()); } fn get_char() -> impl Display { 'C' } fn get_int() -> impl Display { 64 } fn show(v: impl Display) { println!("{v}"); }
Right, it infers the return type of get_char to be char, and the ret-
Not quite. Well, yes. But it returns an opaque type. The caller doesn't know it's actually a char. All it knows is that it implements Display.
I see.
Still, by itself, it can't unify char and i32, for example. Those are two distinct types.
I wonder what type_name thinks of these...
use std::fmt::Display; fn main() { let c = get_char(); dbg!(type_name_of(&c)); let i = get_int(); dbg!(type_name_of(&i)); } fn get_char() -> impl Display { 'C' } fn get_int() -> impl Display { 64 } fn type_name_of<T>(_: T) -> &'static str { std::any::type_name::<T>() }
$ cargo run --quiet [src/main.rs:5] type_name_of(&c) = "&char" [src/main.rs:7] type_name_of(&i) = "&i32"
Hahahaha. Not so opaque after all.
That's uhh.. didn't expect type_name to do that, to be honest.
But they are opaque, I promise. You can call char methods on a real char, but not on the return type of get_char:
use std::fmt::Display; fn main() { let real_c = 'a'; dbg!(real_c.to_ascii_uppercase()); let opaque_c = get_char(); dbg!(opaque_c.to_ascii_uppercase()); } fn get_char() -> impl Display { 'C' }
$ cargo run --quiet error[E0599]: no method named `to_ascii_uppercase` found for opaque type `impl std::fmt::Display` in the current scope --> src/main.rs:8:19 | 8 | dbg!(opaque_c.to_ascii_uppercase()); | ^^^^^^^^^^^^^^^^^^ method not found in `impl std::fmt::Display` For more information about this error, try `rustc --explain E0599`. error: could not compile `grr` due to previous error
Also, I'm fairly sure type_id will give us different values...
use std::{any::TypeId, fmt::Display}; fn main() { let opaque_c = get_char(); dbg!(type_id_of(opaque_c)); let real_c = 'a'; dbg!(type_id_of(real_c)); } fn get_char() -> impl Display { 'C' } fn type_id_of<T: 'static>(_: T) -> TypeId { TypeId::of::<T>() }
$ cargo run --quiet [src/main.rs:5] type_id_of(opaque_c) = TypeId { t: 15782864888164328018, } [src/main.rs:8] type_id_of(real_c) = TypeId { t: 15782864888164328018, }
Ah, huh. I guess not.
Yeah it seems like opaque types are a type-checker trick and it is the concrete type at runtime. The checker will just have prevented us from calling anything that wasn't in the trait.
Actually, now I understand better why this cannot work:
use std::fmt::Display; fn main() { show(get_char_or_int(true)); show(get_char_or_int(false)); } fn get_char_or_int(give_char: bool) -> impl Display { if give_char { 'C' } else { 64 } } fn show(v: impl Display) { println!("{v}"); }
$ cargo run --quiet error[E0308]: `if` and `else` have incompatible types --> src/main.rs:12:9 | 9 | / if give_char { 10 | | 'C' | | --- expected because of this 11 | | } else { 12 | | 64 | | ^^ expected `char`, found integer 13 | | } | |_____- `if` and `else` have incompatible types For more information about this error, try `rustc --explain E0308`. error: could not compile `grr` due to previous error
It's because the return type cannot be simultaneously char and, say, i32.
Yes, and also: it's because there's no vtable involved. Remember the enum version you did?
Yeah! That one:
use delegate::delegate; use std::fmt::Display; fn main() { show(get_char_or_int(true)); show(get_char_or_int(false)); } impl Either { fn display(&self) -> &dyn Display { match self { Either::Char(c) => c, Either::Int(i) => i, } } } impl Display for Either { delegate! { to self.display() { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result; } } } enum Either { Char(char), Int(i64), } fn get_char_or_int(give_char: bool) -> Either { if give_char { Either::Char('C') } else { Either::Int(64) } } fn show(v: impl Display) { println!("{v}"); }
Right! In that one, you're manually dispatching Display::fmt to either the implementation for char or the one for i64.
Well no, delegate is doing it for me.
Well, you did it here:
impl Display for Either { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { match self { Either::Char(c) => c.fmt(f), Either::Int(i) => i.fmt(f), } } }
Right, yes, I see the idea. So a vtable does the same thing?
Eh, not quite. It's more like function pointers.
Can you show me?
Okay, but real quick then.
use std::{
    fmt::{self, Display},
    mem::transmute,
};

// This is our type that can contain any value that implements `Display`
struct BoxedDisplay {
    // This is a pointer to the actual value, which is on the heap.
    data: *mut (),
    // And this is a reference to the vtable for the `Display` implementation
    // of our value's type.
    vtable: &'static DisplayVtable<()>,
}
// 👆 Note that there are no type parameters at all in the above type. The
// type is _erased_.

// Then we need to declare our vtable type.
// This is a type-safe take on it (thanks @eddyb for the idea), but you may
// have noticed `BoxedDisplay` pretends they're all `DisplayVtable<()>`, which
// is fine because we're only dealing with pointers to `T` / `()`, which all
// have the same size.
#[repr(C)]
struct DisplayVtable<T> {
    // This is the implementation of `Display::fmt` for `T`
    fmt: unsafe fn(*mut T, &mut fmt::Formatter<'_>) -> fmt::Result,
    // We also need to be able to drop a `T`. For that we need to know how large
    // `T` is, and there may be side effects (freeing OS resources, flushing a
    // buffer, etc.) so it needs to go into the vtable too.
    drop: unsafe fn(*mut T),
}

impl<T: Display> DisplayVtable<T> {
    // This lets us build a `DisplayVtable` for any `T` that implements `Display`
    fn new() -> &'static Self {
        // Why yes you can declare functions in that scope. This one just
        // forwards to `T`'s `Display` implementation.
        unsafe fn fmt<T: Display>(this: *mut T, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            (*this).fmt(f)
        }

        // Here we turn a raw pointer (`*mut T`) back into a `Box<T>`, which
        // has ownership of it and thus knows how to drop (free) it.
        unsafe fn drop<T>(this: *mut T) {
            Box::from_raw(this);
        }

        // 👆 These are both regular functions, not closures. They end up in
        // the executable, thus they live for 'static, thus we can return a
        // `&'static Self` as requested.
        &Self { fmt, drop }
    }
}

// Okay, now we can make a constructor for `BoxedDisplay` itself!
impl BoxedDisplay {
    // The `'static` bound makes sure `T` is _owned_ (it can't be a reference
    // shorter than 'static).
    fn new<T: Display + 'static>(t: T) -> Self {
        // Let's do some type erasure!
        Self {
            // Box<T> => *mut T => *mut ()
            data: Box::into_raw(Box::new(t)) as _,
            // &'static DisplayVtable<T> => &'static DisplayVtable<()>
            vtable: unsafe { transmute(DisplayVtable::<T>::new()) },
        }
    }
}

// That one's easy: we dispatch to the right `fmt` function using the vtable.
impl Display for BoxedDisplay {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        unsafe { (self.vtable.fmt)(self.data, f) }
    }
}

// Same here.
impl Drop for BoxedDisplay {
    fn drop(&mut self) {
        unsafe {
            (self.vtable.drop)(self.data);
        }
    }
}

// And finally, we can use it!
fn get_char_or_int(give_char: bool) -> BoxedDisplay {
    if give_char {
        BoxedDisplay::new('C')
    } else {
        BoxedDisplay::new(64)
    }
}

fn show(v: impl Display) {
    println!("{v}");
}

fn main() {
    show(get_char_or_int(true));
    show(get_char_or_int(false));
}
$ cargo run --quiet C 64
Whoa. Whoa whoa whoa, that could be its own article!
Yes. And yet here we are.
And there's unsafe code in there, how do you know it's okay?
Well, miri is happy about it, so that's a good start:
$ cargo +nightly miri run --quiet C 64
And do I really need to write code like that?
No you don't! But you can, and the standard library does have code like that, which is awesome, because you don't need to learn a whole other language to drop down and work on it.
Wait, unsafe Rust is not a whole other language?
Touché, smartass.
Anyway you don't need to write all of that yourself because that's exactly what Box<dyn Display> already is.
Oh, word?
use std::fmt::Display; fn get_char_or_int(give_char: bool) -> Box<dyn Display> { if give_char { Box::new('C') } else { Box::new(64) } } fn show(v: impl Display) { println!("{v}"); } fn main() { show(get_char_or_int(true)); show(get_char_or_int(false)); }
$ cargo run --quiet C 64
Neat! Super neat.
Really the "magic" happens in the trait object itself. Here it's boxed, but it may as well be arc'd:
fn get_char_or_int(give_char: bool) -> Arc<dyn Display> { if give_char { Arc::new('C') } else { Arc::new(64) } }
And that would work just as well. Or, again, just a reference:
fn get_char_or_int(give_char: bool) -> &'static dyn Display { if give_char { &'C' } else { &64 } }
Well, that's a comfort. For a second there I really thought I would have to write my own custom vtable implementation every time I want to do something useful.
No, this isn't the 1970s. We have re-usable code now.
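A quick aside to back up the "magic happens in the trait object" claim: the sketch below measures the three pointer flavors used above. The helper name `fat_pointer_sizes` is made up for illustration, and the "two machine words" result is what current rustc does on common targets, so treat it as an observation rather than a promise.

```rust
use std::fmt::Display;
use std::mem::size_of;
use std::sync::Arc;

// Hypothetical helper: sizes of the three "pointer to dyn Display" flavors.
fn fat_pointer_sizes() -> [usize; 3] {
    [
        size_of::<Box<dyn Display>>(),
        size_of::<Arc<dyn Display>>(),
        size_of::<&'static dyn Display>(),
    ]
}

fn main() {
    let thin = size_of::<usize>();
    // A pointer to a sized type is a single machine word...
    assert_eq!(size_of::<Box<i32>>(), thin);
    // ...while each trait object pointer carries a data pointer plus a vtable pointer.
    assert_eq!(fat_pointer_sizes(), [2 * thin; 3]);
    println!("thin = {thin} bytes, fat = {:?} bytes", fat_pointer_sizes());
}
```

Box, Arc or plain reference, the second word is the same kind of vtable pointer that the hand-rolled version stored explicitly.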
Reading type signatures
Ok so... there's a lot of different names for essentially the same thing, like `&str` and `String`, and `&[u8]` and `Vec<u8>`, etc.
Seems like a bunch of extra work. What's the upside?
Well, sometimes it catches bugs.
Ah!
The big thing there is lifetimes, in the context of concurrent code, but...
Whoa there, I don't think we've-
BUT, immutability is another big one.
Consider this:
function double(arr) { for (var i = 0; i < arr.length; i++) { arr[i] *= 2; } return arr; } let a = [1, 2, 3]; console.log({ a }); let b = double(a); console.log({ b });
Ah, easy! This'll print `1, 2, 3` and then `2, 4, 6`.
$ node main.js { a: [ 1, 2, 3 ] } { b: [ 2, 4, 6 ] }
Called it!
Now what if we call it like this?
let a = [1, 2, 3]; console.log({ a }); let b = double(a); console.log({ a, b });
Ah, then, mh... `1, 2, 3` and then... `1, 2, 3` and `2, 4, 6`?
Wrong!
$ node main.js { a: [ 1, 2, 3 ] } { a: [ 2, 4, 6 ], b: [ 2, 4, 6 ] }
Ohhh! Right I suppose `double` took the array by reference, and so it mutated it in-place.
Mhhh. I guess we have to think about these things in ECMAScript-land, too.
We very much do! We can "fix" it like this for example:
function double(arr) { let result = new Array(arr.length); for (var i = 0; i < arr.length; i++) { result[i] = arr[i] * 2; } return result; }
$ node main.js { a: [ 1, 2, 3 ] } { a: [ 1, 2, 3 ], b: [ 2, 4, 6 ] }
Wait, wouldn't we rather use a functional style, like so?
function double(arr) { return arr.map((x) => x * 2); }
That works too! It's just 86% slower according to this awful microbenchmark I just made.
Aw, nuts. We have to worry about performance too in ECMAScript-land?
You can if you want to! But let's stay on "correctness".
Let's try porting those functions to Rust.
fn main() { let a = vec![1, 2, 3]; println!("a = {a:?}"); let b = double(a); println!("b = {b:?}"); } fn double(a: Vec<i32>) -> Vec<i32> { a.into_iter().map(|x| x * 2).collect() }
Let's give it a run...
$ cargo run -q a = [1, 2, 3] b = [2, 4, 6]
Yeah that checks out.
So, same question as before: do you think `double` is messing with `a`?
I don't think so?
Try printing it!
fn main() { let a = vec![1, 2, 3]; println!("a = {a:?}"); let b = double(a); println!("a = {a:?}"); println!("b = {b:?}"); }
$ cargo run -q error[E0382]: borrow of moved value: `a` --> src/main.rs:5:20 | 2 | let a = vec![1, 2, 3]; | - move occurs because `a` has type `Vec<i32>`, which does not implement the `Copy` trait 3 | println!("a = {a:?}"); 4 | let b = double(a); | - value moved here 5 | println!("a = {a:?}"); | ^ value borrowed here after move | = note: this error originates in the macro `$crate::format_args_nl` (in Nightly builds, run with -Z macro-backtrace for more info) For more information about this error, try `rustc --explain E0382`. error: could not compile `grr` due to previous error
Wait, we can't. `double` takes ownership of `a`, so there's no `a` left for us to print.
Correct! What about this version?
fn main() { let a = vec![1, 2, 3]; println!("a = {a:?}"); let b = double(&a); println!("a = {a:?}"); println!("b = {b:?}"); } fn double(a: &Vec<i32>) -> Vec<i32> { a.iter().map(|x| x * 2).collect() }
That one... mhh that one should work?
It does!
$ cargo run -q a = [1, 2, 3] a = [1, 2, 3] b = [2, 4, 6]
But tell me, do we really need to take a `&Vec`?
What do you mean?
Well, a `Vec<T>` is neat because it can grow, and shrink. This is useful when collecting results, for example, and we don't know how many results we'll end up having. We need to be able to push elements onto it, without worrying about running out of space.
I suppose so yeah? Well in our case... I suppose all we do is read from `a`, so no, we don't really need a `&Vec`. But what else would we take?
Let's ask clippy!
$ cargo clippy -q warning: writing `&Vec` instead of `&[_]` involves a new object where a slice will do --> src/main.rs:9:14 | 9 | fn double(a: &Vec<i32>) -> Vec<i32> { | ^^^^^^^^^ help: change this to: `&[i32]` | = note: `#[warn(clippy::ptr_arg)]` on by default = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#ptr_arg
Ohhhh a slice, of course!
fn double(a: &[i32]) -> Vec<i32> { a.iter().map(|x| x * 2).collect() }
And now does this version mess with `a`?
Oh definitely not. Our `a` in the `main` function is a growable `Vec`, and we pass a read-only slice of it to the function, so all it can do is read.
Correct!
$ cargo run -q a = [1, 2, 3] a = [1, 2, 3] b = [2, 4, 6]
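The practical payoff of following clippy's advice is that the slice version accepts more kinds of input. A small sketch, reusing the same `double` as above (the extra inputs in `main` are made up for illustration):

```rust
fn double(a: &[i32]) -> Vec<i32> {
    a.iter().map(|x| x * 2).collect()
}

fn main() {
    let v = vec![1, 2, 3];
    let arr = [4, 5, 6];

    // A `&Vec<i32>` coerces to `&[i32]`...
    println!("{:?}", double(&v));
    // ...so does a reference to a fixed-size array...
    println!("{:?}", double(&arr));
    // ...and we can pass just a part of either one.
    println!("{:?}", double(&v[1..]));

    // Nothing was moved or mutated, so `v` is still ours.
    println!("{v:?}");
}
```

With `&Vec<i32>` as the parameter type, the array call and the sub-slice call would both be rejected.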
How about this one:
fn double(a: &mut [i32]) -> Vec<i32> { a.iter().map(|x| x * 2).collect() }
Well, seems unnecessary? And.. it doesn't compile:
$ cargo run -q error[E0308]: mismatched types --> src/main.rs:4:20 | 4 | let b = double(&a); | ^^ types differ in mutability | = note: expected mutable reference `&mut [i32]` found reference `&Vec<{integer}>` For more information about this error, try `rustc --explain E0308`. error: could not compile `grr` due to previous error
So? Make it compile!
Alright then:
fn main() { // 👇 let mut a = vec![1, 2, 3]; println!("a = {a:?}"); // 👇 let b = double(&mut a); println!("a = {a:?}"); println!("b = {b:?}"); } fn double(a: &mut [i32]) -> Vec<i32> { a.iter().map(|x| x * 2).collect() }
There. It prints exactly the same thing.
So this works. But is it good?
Not really no. We're asking for more than what we need.
Indeed! We never mutate the input, so we don't need a mutable slice of it.
But can you show a case where it would get in the way?
Yes I suppose... I suppose if we wanted to double the input in parallel a bunch of times? I mean it's pretty contrived, but.. gimme a second.
$ cargo add crossbeam (cut)
fn main() { let mut a = vec![1, 2, 3]; println!("a = {a:?}"); crossbeam::scope(|s| { for _ in 0..5 { s.spawn(|_| { let b = double(&mut a); println!("b = {b:?}"); }); } }) .unwrap(); } fn double(a: &mut [i32]) -> Vec<i32> { a.iter().map(|x| x * 2).collect() }
There. That fails because we can't borrow `a` mutably more than once at a time:
$ cargo run -q error[E0499]: cannot borrow `a` as mutable more than once at a time --> src/main.rs:7:21 | 5 | crossbeam::scope(|s| { | - has type `&crossbeam::thread::Scope<'1>` 6 | for _ in 0..5 { 7 | s.spawn(|_| { | - ^^^ `a` was mutably borrowed here in the previous iteration of the loop | _____________| | | 8 | | let b = double(&mut a); | | - borrows occur due to use of `a` in closure 9 | | println!("b = {b:?}"); 10 | | }); | |______________- argument requires that `a` is borrowed for `'1` For more information about this error, try `rustc --explain E0499`. error: could not compile `grr` due to previous error
But it works if we just take an immutable reference:
fn main() { let a = vec![1, 2, 3]; println!("a = {a:?}"); crossbeam::scope(|s| { for _ in 0..5 { s.spawn(|_| { let b = double(&a); println!("b = {b:?}"); }); } }) .unwrap(); } fn double(a: &[i32]) -> Vec<i32> { a.iter().map(|x| x * 2).collect() }
$ cargo run -q a = [1, 2, 3] b = [2, 4, 6] b = [2, 4, 6] b = [2, 4, 6] b = [2, 4, 6] b = [2, 4, 6]
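As an aside, the same experiment can be sketched without an extra crate: since Rust 1.63 the standard library has its own scoped threads, `std::thread::scope`. The helper name `double_in_parallel` is made up for this example, and it collects the results instead of printing from each thread.

```rust
use std::thread;

fn double(a: &[i32]) -> Vec<i32> {
    a.iter().map(|x| x * 2).collect()
}

// Hypothetical helper: run `double` on `times` scoped threads and gather the results.
fn double_in_parallel(a: &[i32], times: usize) -> Vec<Vec<i32>> {
    thread::scope(|s| {
        let mut handles = Vec::new();
        for _ in 0..times {
            // `a` is a shared reference, which is Copy: every thread gets its own copy.
            handles.push(s.spawn(move || double(a)));
        }
        // join() only returns Err if that thread panicked.
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}

fn main() {
    let a = vec![1, 2, 3];
    for b in double_in_parallel(&a, 5) {
        println!("b = {b:?}");
    }
}
```

The point is unchanged: five threads can all hold `&[i32]` at once, which would not be possible with `&mut [i32]`.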
Very good! Look at you! And you used crossbeam because?
Because... something something scoped threads. Forget about that part. You got what you wanted, right?
I did! Next question: doesn't this code have the exact same performance issues as our ECMAScript `.map()`-based function?
Yes and no: we are allocating a new `Vec`, but it probably has the exact right size to begin with, because Rust iterators have size hints.
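To see what a size hint looks like, here is a minimal sketch. `hint_of_doubling` is a made-up helper that builds the same iterator chain as `double` and asks it for its hint instead of collecting it.

```rust
fn double(a: &[i32]) -> Vec<i32> {
    a.iter().map(|x| x * 2).collect()
}

// Hypothetical helper: same iterator chain as `double`, but we only ask for the hint.
fn hint_of_doubling(a: &[i32]) -> (usize, Option<usize>) {
    a.iter().map(|x| x * 2).size_hint()
}

fn main() {
    let a = [1, 2, 3];
    // (lower bound, optional upper bound): a slice iterator knows its exact length,
    // and `map` passes the hint along unchanged.
    println!("size hint = {:?}", hint_of_doubling(&a));
    let b = double(&a);
    // The capacity is at least the length; how much room gets reserved up front
    // is up to the standard library.
    println!("len = {}, capacity = {}", b.len(), b.capacity());
}
```

The hint is only advice for `collect`, which is why the dialogue says "probably".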
Ah, mh, okay, but what if we did want to mutate the vec in-place?
Ah, then I suppose we could do this:
fn main() { let a = vec![1, 2, 3]; println!("a = {a:?}"); let b = double(a); println!("b = {b:?}"); } fn double(a: Vec<i32>) -> Vec<i32> { for i in 0..a.len() { a[i] *= 2; } a }
Wait, no:
$ cargo run -q error[E0596]: cannot borrow `a` as mutable, as it is not declared as mutable --> src/main.rs:11:9 | 9 | fn double(a: Vec<i32>) -> Vec<i32> { | - help: consider changing this to be mutable: `mut a` 10 | for i in 0..a.len() { 11 | a[i] *= 2; | ^ cannot borrow as mutable For more information about this error, try `rustc --explain E0596`. error: could not compile `grr` due to previous error
I mean this:
fn double(mut a: Vec<i32>) -> Vec<i32> { for i in 0..a.len() { a[i] *= 2; } a }
Wait, no:
$ cargo clippy -q warning: the loop variable `i` is only used to index `a` --> src/main.rs:10:14 | 10 | for i in 0..a.len() { | ^^^^^^^^^^ | = note: `#[warn(clippy::needless_range_loop)]` on by default = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_range_loop help: consider using an iterator | 10 | for <item> in &mut a { | ~~~~~~ ~~~~~~
I mean this:
fn double(mut a: Vec<i32>) -> Vec<i32> { for x in a.iter_mut() { *x *= 2; } a }
Okay, no need to run it, I know what it does. But is it good?
Idk. Seems okay? What's wrong with it?
Well, do you really need to take ownership of the `Vec`? Do you need a `Vec` in the first place?
What if you want to do this?
fn main() { let mut a = [1, 2, 3]; println!("a = {a:?}"); let b = double(a); println!("b = {b:?}"); }
Ah yeah, that won't work. Well no I suppose we don't need a `Vec`... after all, we're doing everything in-place, the array.. vector.. whatever, container, doesn't need to grow or shrink.
So we can take... OH! A mutable slice:
fn main() { let mut a = [1, 2, 3]; println!("a = {a:?}"); double(&mut a); println!("a = {a:?}"); } fn double(a: &mut [i32]) { for x in a.iter_mut() { *x *= 2 } }
$ cargo run -q a = [1, 2, 3] a = [2, 4, 6]
And let's make sure it works with a `Vec`, too:
fn main() { let mut a = vec![1, 2, 3]; println!("a = {a:?}"); double(&mut a); println!("a = {a:?}"); }
$ cargo run -q a = [1, 2, 3] a = [2, 4, 6]
Yes it does!
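One more thing a mutable slice buys us, sketched here under the same `double` signature: the caller decides which part of the container gets touched. The four-element input is made up for illustration.

```rust
fn double(a: &mut [i32]) {
    for x in a.iter_mut() {
        *x *= 2;
    }
}

fn main() {
    let mut a = vec![1, 2, 3, 4];
    // Only hand out the last two elements; the first two are out of reach for `double`.
    double(&mut a[2..]);
    println!("a = {a:?}");
}
```

A function taking `Vec<i32>` by value has no way to express "just this part, please".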
Okay! It's time... for a quiz.
Here's a method defined on slices:
impl<T> [T] { pub const fn first(&self) -> Option<&T> { // ... } }
Does it mutate the slice?
No! It takes an immutable reference (`&self`), so all it does is read.
Correct!
fn main() { let a = vec![1, 2, 3]; dbg!(a.first()); }
$ cargo run -q [src/main.rs:3] a.first() = Some( 1, )
What about this one?
impl<T> [T] { pub fn fill(&mut self, value: T) where T: Clone, { // ... } }
Oh that one mutates! Given the name, I'd say it fills the whole slice with `value`, and... it needs to be able to make clones of the value because it might need to repeat it several times.
Right again!
fn main() { let mut a = [0u8; 5]; a.fill(3); dbg!(a); }
$ cargo run -q [src/main.rs:4] a = [ 3, 3, 3, 3, 3, ]
What about this one?
impl<T> [T] { pub fn iter(&self) -> Iter<'_, T> { // ... } }
Ooh that one's a toughie. So no mutation, and it uhhh borrows... through? I mean we've only briefly seen lifetimes, but I'm assuming we can't mutate a thing while we're iterating through it, so like, this:
fn main() { let mut a = [1, 2, 3, 4, 5]; let mut iter = a.iter(); dbg!(iter.next()); dbg!(iter.next()); a[2] = 42; dbg!(iter.next()); dbg!(iter.next()); }
...can't possibly work:
$ cargo run -q error[E0506]: cannot assign to `a[_]` because it is borrowed --> src/main.rs:6:5 | 3 | let mut iter = a.iter(); | -------- borrow of `a[_]` occurs here ... 6 | a[2] = 42; | ^^^^^^^^^ assignment to borrowed `a[_]` occurs here 7 | dbg!(iter.next()); | ----------- borrow later used here For more information about this error, try `rustc --explain E0506`. error: could not compile `grr` due to previous error
Yeah! Right again!
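For contrast, here is a sketch of the version the compiler does accept: the same reads and the same write, except the iterator is never touched again after the write. The helper name `peek_then_patch` is made up for this example.

```rust
// Hypothetical helper: read two items, then mutate, all in one function.
fn peek_then_patch(mut a: [i32; 5]) -> (Option<i32>, Option<i32>, [i32; 5]) {
    let mut iter = a.iter();
    // `.copied()` turns Option<&i32> into Option<i32>, so these values
    // no longer borrow from `a`.
    let first = iter.next().copied();
    let second = iter.next().copied();
    // `iter` is never used past this point, so its borrow of `a` is over...
    a[2] = 42; // ...and this assignment is accepted.
    (first, second, a)
}

fn main() {
    let (first, second, a) = peek_then_patch([1, 2, 3, 4, 5]);
    println!("{first:?} {second:?} {a:?}");
}
```

The borrow lasts as long as the `Iter<'_, T>` is still in use, not until the end of the enclosing block.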
Alrighty! Moving on.
Closures
So, remember this code?
fn double(a: &[i32]) -> Vec<i32> { a.iter().map(|x| x * 2).collect() }
That's a closure.
That's a... which part, the pipe-looking thing? `|x| x * 2`?
Yes. It's like a function.
Wait, no, a function is like this:
fn main() { let a = [1, 2, 3]; let b = double(&a); dbg!(b); } // 👇 this fn times_two(x: &i32) -> i32 { x * 2 } fn double(a: &[i32]) -> Vec<i32> { // which we then 👇 use here a.iter().map(times_two).collect() }
$ cargo run -q [src/main.rs:4] b = [ 2, 4, 6, ]
Yeah. It does the same thing.
Oh, now that you mention it yes, yes it does do the same thing.
Except a closure can close over its environment.
I see. No, wait. I don't. I don't see at all. Its environment? As in the birds and the trees and th-
Kinda, except it's more like... bindings. Look:
fn double(a: &[i32]) -> Vec<i32> { let factor = 2; a.iter().map(|x| x * factor).collect() }
Ohhh. Well that's a constant, it doesn't really count.
Fineeee, here:
fn main() { let a = [1, 2, 3]; let b = mul(&a, 10); dbg!(b); } fn mul(a: &[i32], factor: i32) -> Vec<i32> { a.iter().map(|x| x * factor).collect() }
Okay, okay, I see. So `factor` is definitely not a constant there (if we don't count constant folding), and it's... captured?
Closed over, yes.
...closed over by the closure. I'm gonna say "captured". Seems less obscure.
Sure, fine.
Wait wait wait this is boxed trait objects all over again, right? Sort of? Because closures are actually fat pointers? One pointer to the function itself, and one for the, uh, "environment". I mean, for everything captured by the closure.
Kinda, yes! But aren't we getting ahead of ourselv-
No no no, not at all, it doesn't matter that there might be a lot of new words, or that the underlying concepts aren't crystal clear to everyone reading this yet.
What matters is that we can proceed by analogy, because we've seen similar fuckery just before, and so we can show an example of a manual implementation of closures, just like we did boxed trait objects, and that'll clear it up for everyone.
Are you sure that'll work?
Eh, it's worth a shot right?
So here's what I mean. Say we want to provide a function that does something three times:
fn main() { do_three_times(todo!()); } fn do_three_times<T>(t: T) { todo!() }
It's generic, because it can do any thing three times. Caller's choice. Only how do I... how does the thing... do... something.
Oh! Traits! I can make a trait, hang on.
trait Thing { fn do_it(&self); }
There. And then `do_three_times` will take anything that implements `Thing`... oh we can use `impl Trait` syntax, no need for explicit generic type parameters here:
fn do_three_times(t: impl Thing) { for _ in 0..3 { t.do_it(); } }
And then to call it, well... we need some type, on which we implement `Thing`, and make it do a thing. What's a good way to make up a new type that's empty?
Empty struct?
Right!
struct Greet; impl Thing for Greet { fn do_it(&self) { println!("hello!"); } } fn main() { do_three_times(Greet); }
And, if my calculations are correct...
$ cargo run -q hello! hello! hello!
Yes!!! See bear? Easy peasy! That wasn't even very long at all.
I must admit, I'm impressed.
And look, we can even box these!
trait Thing { fn do_it(&self); } fn do_three_times(things: &[Box<dyn Thing>]) { for _ in 0..3 { for t in things { t.do_it() } } } struct Greet; impl Thing for Greet { fn do_it(&self) { println!("hello!"); } } struct Part; impl Thing for Part { fn do_it(&self) { println!("goodbye!"); } } fn main() { do_three_times(&[Box::new(Greet), Box::new(Part)]); }
$ cargo run -q hello! goodbye! hello! goodbye! hello! goodbye!
Very nice. You even figured out how to make slices of heterogeneous types.
Now let's see Paul Allen's trai-
Let me stop you right there, bear. I know what you're about to ask: "Oooh, but what if you need to mutate stuff from inside the closure? That won't work will it? Because Wust is such a special widdle wanguage uwu, it can't just wet you do the things you want, it has to be a whiny baby about it" well HAVE NO FEAR because yes, yes, I have realized that this right here:
trait Thing { // 👇 fn do_it(&self); }
...means the closure can never mutate its environment.
Ah!
And so what you'd need to do if you wanted to be able to do that, is have a `ThingMut` trait, like so:
trait ThingMut { fn do_it(&mut self); } fn do_three_times(mut t: impl ThingMut) { for _ in 0..3 { t.do_it() } } struct Greet(usize); impl ThingMut for Greet { fn do_it(&mut self) { self.0 += 1; println!("hello {}!", self.0); } } fn main() { do_three_times(Greet(0)); }
$ cargo run -q hello 1! hello 2! hello 3!
Yes, but you don't really ne-
BUT YOU DON'T NEED TO TAKE OWNERSHIP OF THE THINGMUT I know I know, watch this:
fn do_three_times(t: &mut dyn ThingMut) { for _ in 0..3 { t.do_it() } }
Boom!
fn main() { do_three_times(&mut Greet(0)); }
Bang.
And I suppose you don't need me to do the link with the actual traits in the Rust standard library either?
Eh, who needs you. I'm sure I can find them... there!
There's three of them:
pub trait FnOnce<Args> { type Output; extern "rust-call" fn call_once(self, args: Args) -> Self::Output; } pub trait FnMut<Args>: FnOnce<Args> { extern "rust-call" fn call_mut( &mut self, args: Args ) -> Self::Output; } pub trait Fn<Args>: FnMut<Args> { extern "rust-call" fn call(&self, args: Args) -> Self::Output; }
So all `Fn` (immutable reference) are also `FnMut` (mutable reference), which are also `FnOnce` (takes ownership). Beautiful symmetry.
And then... I'm assuming the `extern "rust-call"` fuckery is because... lack of variadics right now?
Right, yes. And that's also why you can't really implement the `Fn` / `FnMut` / `FnOnce` traits yourself on arbitrary types right now.
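That chain of supertraits can be poked at from stable Rust without implementing anything by hand. In the sketch below, all function names (`run_fn`, `run_fn_mut`, `run_fn_once`, `demo`) are made up for illustration; each closure is passed to the least demanding helper that will take it.

```rust
fn run_fn(f: impl Fn() -> usize) -> usize {
    f() + f()
}

fn run_fn_mut(mut f: impl FnMut() -> usize) -> usize {
    f();
    f()
}

fn run_fn_once(f: impl FnOnce() -> String) -> String {
    f()
}

fn demo() -> (usize, usize, usize, String) {
    let name = String::from("bear");

    // Only reads `name`: this closure is Fn, and therefore FnMut and FnOnce too.
    let read_only = || name.len();
    let a = run_fn(&read_only);
    let b = run_fn_mut(&read_only); // accepted where FnMut is expected

    // Mutates `counter`: FnMut, but not Fn.
    let mut counter = 0;
    let c = run_fn_mut(|| {
        counter += 1;
        counter
    });

    // Moves `name` out when called: FnOnce only.
    let d = run_fn_once(move || name);

    (a, b, c, d)
}

fn main() {
    println!("{:?}", demo());
}
```

Trying `run_fn` with the counter closure, or calling the last closure twice, gets rejected at compile time.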
Yeah, see! Easy. So our example becomes this:
fn do_three_times(t: &mut dyn FnMut()) { for _ in 0..3 { t() } } fn main() { let mut counter = 0; do_three_times(&mut || { counter += 1; println!("hello {counter}!") }); }
Bam, weird syntax but that's a lot less typing, I like it, arguments are between pipes, sure why not.
Arguments are between pipes, what do you mean?
Oh, well closures can take arguments too, they're just like functions right? You told me that. So we can... do this!
// 👇 fn do_three_times(t: impl Fn(i32)) { for i in 0..3 { t(i) } } fn main() { // 👇 do_three_times(|i| println!("hello {i}!")); }
I see. And I suppose you've figured out boxing as well?
The sport, no. But the type erasure, sure, in that regard they're just regular traits, so, here we go:
fn do_all_the_things(things: &[Box<dyn Fn()>]) { for t in things { t() } } fn main() { do_all_the_things(&[ Box::new(|| println!("hello")), Box::new(|| println!("how are you")), Box::new(|| println!("I wasn't really asking")), Box::new(|| println!("goodbye")), ]); }
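The boxed closures above capture nothing, so here is a hedged sketch of the same idea with captured state and arguments. `make_multipliers` and `apply_all` are made-up names; the `move` keyword is what lets each closure own its `factor` and outlive the function that created it.

```rust
// Hypothetical helper: one boxed closure per factor, each owning its own copy.
fn make_multipliers(factors: &[i32]) -> Vec<Box<dyn Fn(i32) -> i32>> {
    factors
        .iter()
        .map(|&factor| {
            // `move` copies `factor` into the closure; the cast erases the closure's
            // anonymous type so that they all fit in the same Vec.
            Box::new(move |x: i32| x * factor) as Box<dyn Fn(i32) -> i32>
        })
        .collect()
}

// Hypothetical helper: call every closure with the same input.
fn apply_all(fs: &[Box<dyn Fn(i32) -> i32>], input: i32) -> Vec<i32> {
    fs.iter().map(|f| f(input)).collect()
}

fn main() {
    let fs = make_multipliers(&[2, 10]);
    println!("{:?}", apply_all(&fs, 3));
}
```

Every closure has its own unnameable type, which is why the `Box<dyn Fn(...)>` erasure is needed to put them side by side.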
Well. It looks like you're all set.
Nothing left to learn.
The world no longer holds any secrets for you.
Through science, you have rid the universe of its last mystery, and you are now cursed to roam, surrounded by the mundane, devoid of the last shred of poet-
Wait, what about async stuff?
Ahhhhhhhhhhhhhhhhhhh fuck.
Async stuff
Okay, async stuff, is.... ugh. Wait, you've written about this before.
Multiple times yes, but humor me. Why do I want it?
You don't! God, why would you. I mean, okay you want it if you're writing network services and stuff.
Oh yes, I do want to do that! So I do want async!
Yes. Yes you very much want async.
And I've heard it makes everything worse!
Well...... so, you know how if you write a file, it writes to the file?
Yes? Like that:
fn main() { // error handling omitted for story-telling purposes let _ = std::fs::write("/tmp/hi", "hi!\n"); }
$ cargo run -q && cat /tmp/hi hi!
Well async is the same, except it doesn't work.
$ cargo add tokio +full (cut)
fn main() { // error handling omitted for story-telling purposes // 👇 (was `std`) let _ = tokio::fs::write("/tmp/bye", "bye!\n"); }
$ cargo run -q && cat /tmp/bye cat: /tmp/bye: No such file or directory
Ah. Indeed it doesn't work.
Exactly, it does nothing, zilch:
$ cargo b -q && strace -ff ./target/debug/grr 2>&1 | grep bye