Here's a sentence I find myself saying several times a week:

...or we could just box it.

There are two remarkable things about this sentence.

The first is that the advice is very rarely heeded. Instead, whoever I just said it to disappears for two days, emerging victorious, basking in the knowledge that, YES, the compiler could inline that, if it wanted to.

And the second is that, without a lot of context, this sentence is utter nonsense if you don't have a working knowledge of Rust. As a Java developer, you may be wondering if we're trying to turn numbers into objects (we are not). In fact, even as a Rust developer, you may have just accepted that boxing is just a fact of life.

It's just a thing we have to do sometimes, so the compiler stops being mad at us, and things suddenly start working. That's not necessarily a bad thing. That's just how good the compiler diagnostics are: they can tell you "hold on there friend, I really think you want to box it", and you can copy and paste the solution, and the puzzle is cracked.

But! Just because we can get by for a very long time without knowing what it means, doesn't mean I can resist the sweet sweet temptation of explaining in excruciating detail what it actually means, and so, that's exactly what we're going to do in this article.

Before we do that, though, let's look at a simple example where we might be enjoined by a well-intentioned colleague to, as it were, "just box it".

A practical and very innocent example

Whenever cargo new is invoked, it generates a simple "hello world" application, that looks like this:

Rust code
fn main() {
    println!("Hello, world!");
}

It is pure, and innocent, and devoid of things that can fail, which is great.

Shell session
$ cargo run
   Compiling whatbox v0.1.0 (/home/amos/ftl/whatbox)
    Finished dev [unoptimized + debuginfo] target(s) in 0.47s
     Running `target/debug/whatbox`
Hello, world!

But sometimes we want to do things that can fail!

Like reading a file, for example:

Rust code
fn main() {
    println!("{}", std::fs::read_to_string("/etc/issue").unwrap())
}
Shell session
$ cargo run --quiet
Arch Linux \r (\l)

read_to_string can fail! And that's why it returns a Result<String, E> and not just a String.

And that's also why we need to call .unwrap() on it, to go from Result<String, E> to either a String (when the read succeeds, as above) or a panic (when it fails):

Shell session
$ cargo run --quiet
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 2, kind: NotFound, message: "No such file or directory" }', src/main.rs:2:59
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Okay, good!

But let's say we want to read a string inside a function. Our own function.

Something like that:

Rust code
fn main() {
    println!("{}", read_issue())
}

fn read_issue() -> String {
    std::fs::read_to_string("/etc/issue").unwrap()
}

Well, here, everything works:

Shell session
$ cargo run --quiet
Arch Linux \r (\l)

But that's not really the code we want. See, the read_issue function feels like "library code". Right now, it's in our application, but I could see myself splitting that function into its own crate, maybe a crate named linux-info or something, because it could be useful to other applications.

And so, even though it's in the same crate as the main function, I don't feel comfortable causing a panic in read_issue, the way I felt comfortable causing a panic at the disco in main.

Instead, I think I want read_issue to return a Result, too. Because Result<T, E> is an enum, that can represent two things: the operation has succeeded (and we get a T), or it failed (and we get an E).

Rust code
enum Result<T, E> {
    Ok(T),
    Err(E),
}
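
As a minimal sketch of what those two variants mean in practice, here's a hypothetical `describe` helper (it's not part of the standard library, and not part of our little project either) that handles both outcomes with a `match` instead of panicking. It's generic over `T` and `E`, so it deliberately says nothing about which error type to pick:

```rust
use std::fmt::Debug;

// Hypothetical helper, for illustration only: turn either outcome into text.
fn describe<T: Debug, E: Debug>(result: Result<T, E>) -> String {
    match result {
        // the operation succeeded, and we get a `T`
        Ok(value) => format!("succeeded with {:?}", value),
        // the operation failed, and we get an `E`
        Err(error) => format!("failed with {:?}", error),
    }
}

fn main() {
    // `str::parse` is just a convenient standard-library function
    // that returns a `Result`.
    println!("{}", describe("42".parse::<u32>()));
    println!("{}", describe("forty-two".parse::<u32>()));
}
```

The compiler insists that the `match` covers both arms: leave one out and the code doesn't build.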

And we know that when the operation succeeds, we get a String, so we know what to pick for T. But the question is: what do we pick for E?

Rust code
fn main() {
    println!("{}", read_issue().unwrap())
}

// what is `E` supposed to be?    👇
fn read_issue() -> Result<String, E> {
    std::fs::read_to_string("/etc/issue")
}

And that problem, that specific problem, is not something we really have to worry about in some other languages, like for instance... ECMAScript! I mean, JavaScript!

Because in JavaScript, if something goes wrong, we just throw!

JavaScript code
import { readFileSync } from "fs";

function main() {
  let issue = readIssue();
  console.log(`${issue}`);
}

function readIssue() {
  return readFileSync("/etc/i-do-not-exist");
}

main();
Shell session
$ node js/index.mjs
node:fs:505
  handleErrorFromBinding(ctx);
  ^

Error: ENOENT: no such file or directory, open '/etc/i-do-not-exist'
    at Object.openSync (node:fs:505:3)
    at readFileSync (node:fs:401:35)
    at readIssue (file:///home/amos/ftl/whatbox/js/index.mjs:9:5)
    at main (file:///home/amos/ftl/whatbox/js/index.mjs:4:17)
    at file:///home/amos/ftl/whatbox/js/index.mjs:12:1
    at ModuleJob.run (node:internal/modules/esm/module_job:154:23)
    at async Loader.import (node:internal/modules/esm/loader:177:24)
    at async Object.loadESM (node:internal/process/esm_loader:68:5) {
  errno: -2,
  syscall: 'open',
  code: 'ENOENT',
  path: '/etc/i-do-not-exist'
}

And we don't have to worry whether readIssue can or cannot throw when we call it:

JavaScript code
function main() {
  //          👇
  let issue = readIssue();
  console.log(`${issue}`);
}

Well, maybe we should! Maybe we should wrap it in a try-catch, just so we can recover from any exceptions thrown. But we don't have to. Our program follows the happy path happily.

In Go, there's no exceptions, but there is usually an indication that a function can fail in its signature.

Go code
package main

import (
    "log"
    "os"
)

func main() {
    issue := readIssue()
    log.Printf("issue = %v", issue)
}

func readIssue() string {
    bs, _ := os.ReadFile("")
    return string(bs)
}

Here, readIssue cannot fail! It only returns a string.

But here, it can fail:

Go code
package main

import (
    "log"
    "os"
)

func main() {
    // we get two values out of readIssue, including `err`
    issue, err := readIssue()
    // ...which we should check for nil-ness
    if err != nil {
        // ...and handle
        log.Fatalf("fatal error: %+v", err)
    }

    log.Printf("issue = %v", issue)
}

func readIssue() (string, error) {
    bs, err := os.ReadFile("")
    // same here, `ReadFile` does a multi-valued return, so we need
    // to check `err` first:
    if err != nil {
        return "", err
    }

    // and only here do we know reading the file actually succeeded:
    return string(bs), nil
}

And here, since we do our error handling properly, the output we get is:

Shell session
$ go run go/main.go
2021/04/17 20:47:37 fatal error: open : no such file or directory
exit status 1

However, note that it does not tell us where in the code the error occurred, whereas the JavaScript/Node.js version did.

There's a solution to that, but by default, out of the box, Go errors do not capture stack traces.

And then there's Rust, which is the most strict of the three, that forces us to declare that a function can fail, forces us to handle any error that may have occurred in a function, but also forces us to describe "what possible error values are there".

And that's where it can get confusing.

You see, in JavaScript, you can throw anything.

JavaScript code
function main() {
  let issue = readIssue();
  console.log(`${issue}`);
}

function readIssue() {
  throw "woops";
}

main();
Shell session
$ node js/index.mjs

node:internal/process/esm_loader:74
    internalBinding('errors').triggerUncaughtException(
                              ^
woops
(Use `node --trace-uncaught ...` to show where the exception was thrown)

This is not a good idea. Mostly, because then we don't get a stack trace.

No, not even with --trace-uncaught:

Shell session
$ node --trace-uncaught js/index.mjs

node:internal/process/esm_loader:74
    internalBinding('errors').triggerUncaughtException(
                              ^
woops
Thrown at:
    at loadESM (node:internal/process/esm_loader:74:31)

So please, never ever do that.

Instead, throw an Error object, like so:

JavaScript code
function main() {
  let issue = readIssue();
  console.log(`${issue}`);
}

function readIssue() {
  throw new Error("woops");
}

main();
Shell session
$ node js/index.mjs
file:///home/amos/ftl/whatbox/js/index.mjs:7
    throw new Error("woops");
          ^

Error: woops
    at readIssue (file:///home/amos/ftl/whatbox/js/index.mjs:7:11)
    at main (file:///home/amos/ftl/whatbox/js/index.mjs:2:17)
    at file:///home/amos/ftl/whatbox/js/index.mjs:10:1
    at ModuleJob.run (node:internal/modules/esm/module_job:154:23)
    at async Loader.import (node:internal/modules/esm/loader:177:24)
    at async Object.loadESM (node:internal/process/esm_loader:68:5)

As for Go, well. You can't just say you're going to return an error, and just return a string. That's good.

Go code
func readIssue() (string, error) {
    return "", "woops"
}
Shell session
$ go run go/main.go
# command-line-arguments
go/main.go:17:13: cannot use "woops" (type string) as type error in return argument:
        string does not implement error (missing Error method)

Whatever you return has to be of type error, and there is a shorthand for that:

Go code
func readIssue() (string, error) {
    return "", errors.New("woops")
}

Which is just this:

Go code
// New returns an error that formats as the given text.
// Each call to New returns a distinct error value even if the text is identical.
func New(text string) error {
    return &errorString{text}
}

Where errorString is simply a struct:

Go code
// errorString is a trivial implementation of error.
type errorString struct {
    s string
}

That implements the error interface. All the interface asks for is that there is an Error() method that returns a string:

Go code
func (e *errorString) Error() string {
    return e.s
}

And so our sample program now shows this:

Shell session
$ go run go/main.go
2021/04/17 20:59:37 fatal error: woops
exit status 1

Which is not to say that error handling in Go is a walk in the park.

This first bit has been pointed out in almost every article that has even the slightest amount of feelings about Go: it's just way too easy to ignore, or "forget to handle" Go errors:

Go code
func readIssue() (string, error) {
    bs, err := os.ReadFile("/etc/issue")
    err = os.WriteFile("/tmp/issue-copy", bs, 0o644)
    if err != nil {
        return "", err
    }
    return string(bs), nil
}

Woops! No warnings, no nothing. If we fail to read the file, that error is gone forever. The issue here is of course that Go returns "multiple things": both the "success value" and the "error value", and it's on you to pinky swear not to touch the success value, if you haven't checked the error value first.

And that problem doesn't exist in a language with sum types — a Rust Result is either Result::Ok(T), or Result::Err(E), never both.
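
Here's a rough sketch of that guarantee, using a hypothetical `add` helper that parses two numbers given as text. The only way to get at a parsed number is to go through the `Ok` arm, so there's no success value lying around to touch by accident, and no earlier error to quietly overwrite:

```rust
use std::num::ParseIntError;

// Hypothetical helper, for illustration only: add two numbers given as text.
fn add(a: &str, b: &str) -> Result<i64, ParseIntError> {
    // `parse` returns a `Result`. There is no number to use until we've
    // looked at which variant we got.
    let x = match a.parse::<i64>() {
        Ok(n) => n,
        Err(e) => return Err(e),
    };
    let y = match b.parse::<i64>() {
        Ok(n) => n,
        Err(e) => return Err(e),
    };
    Ok(x + y)
}

fn main() {
    println!("{:?}", add("40", "2"));
    println!("{:?}", add("forty", "2"));
}
```

If the first parse fails, we return right there, and the second parse never runs.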

But everyone knows about that one. The other one is a lot more fun.

If we make our own error type:

Go code
type naughtyError struct{}

func (ne *naughtyError) Error() string {
    return "oh no"
}

Then we can return it as an error. Because error is an interface, and *naughtyError has an Error method that returns a string, everything fits together, boom, composition, alright!

Go code
func readIssue() (string, error) {
    return "", &naughtyError{}
}
Shell session
$ go run go/main.go
2021/04/17 21:06:28 fatal error: oh no
exit status 1

But if we accidentally return a value of type *naughtyError, that just happens to be nil, well...

Go code
package main

import (
    "log"
)

func readIssue() (string, error) {
    var err *naughtyError
    log.Printf("(in readIssue) is err nil? %v", err == nil)
    return "", err
}

func main() {
    issue, err := readIssue()
    log.Printf("(in main) is err nil? %v", err == nil)

    if err != nil {
        log.Fatalf("fatal error: %+v", err)
    }

    log.Printf("issue = %v", issue)
}

//

type naughtyError struct{}

func (ne *naughtyError) Error() string {
    return "oh no"
}
Shell session
$ go run go/main.go
2021/04/17 21:08:08 (in readIssue) is err nil? true
2021/04/17 21:08:08 (in main) is err nil? false
2021/04/17 21:08:08 fatal error: oh no
exit status 1

...then bad things happen.

And this is really fun to me, but it is really bad for Go.

The first issue, "forgetting to check for nil", is easy to understand. We told you where the error was. Just don't forget to check it. It's easy to fit into one's mental model of Go, which is advertised as really really simple.

The second one is a lot worse, because it betrays a leaky abstraction.

You see... there's some magic afoot.

The great appearing act

We have two err values in our last, naughty sample program. One of them compares equal to nil, and the other does not.

But the differences don't stop there:

Go code
package main

import (
    "log"
    "unsafe"
)

func readIssue() (string, error) {
    var err *naughtyError
    log.Printf("(in readIssue) nil? %v, size = %v", err == nil, unsafe.Sizeof(err))
    return "", err
}

func main() {
    issue, err := readIssue()
    log.Printf("(in main) nil? %v, size = %v", err == nil, unsafe.Sizeof(err))

    if err != nil {
        log.Fatalf("fatal error: %+v", err)
    }

    log.Printf("issue = %v", issue)
}

//

type naughtyError struct{}

func (ne *naughtyError) Error() string {
    return "oh no"
}
Cool bear's hot tip

Why is Sizeof part of the unsafe package? Well, that's a very good question.

The package docs say:

Package unsafe contains operations that step around the type safety of Go programs.

Packages that import unsafe may be non-portable and are not protected by the Go 1 compatibility guidelines.

...but what we're doing here is completely harmless. The important bit, as I understand it, is that as a Go developer, you're not supposed to care.

You're not supposed to look at these things. Go is simple! Byte slices are strings! Go has no pointer arithmetic! Who cares how large a type is!

Until you do care, and then, well, you're on your own. And "using unsafe" is exactly what "being on your own" is. But it's okay. We're all on our own together.

The program above prints the following:

Shell session
$ go run go/main.go
2021/04/17 21:19:12 (in readIssue) nil? true, size = 8
2021/04/17 21:19:12 (in main) nil? false, size = 16
2021/04/17 21:19:12 fatal error: oh no
exit status 1

Which is iiiiinteresting.

This is the kind of example whose solution, given enough time, one could figure out all on their own. But when coming face to face with it, and when it has been a while, it is... puzzling.

The first line makes a ton of sense.

We declared a pointer, like this:

Go code
    var err *naughtyError

The zero value of a pointer is nil, so it's equal to nil. And we're (well, I'm) on 64-bit Linux, so the size of a pointer is 64 bits, or 8 bytes.

Cool bear's hot tip

Is a byte always 8 bits?

According to ISO/IEC 80000, yes.

If you're reading this from a machine whose byte isn't 8 bits, please, please send a picture.

The second line is a lot more surprising — not only does it not equal nil, but, it's also twice as large.

We can shed some light on the whole thing by introducing yet another error type:

Go code
package main

import (
    "log"
)

func main() {
    var err error

    err = (*naughtyError)(nil)
    log.Printf("%v", err)

    err = (*niceError)(nil)
    log.Printf("%v", err)
}

type naughtyError struct{}

func (ne *naughtyError) Error() string {
    return "oh no"
}

type niceError struct{}

func (ne *niceError) Error() string {
    return "ho ho ho!"
}

What a nice holiday-themed error. We have two nil values, and they both print different things!

Shell session
$ go run go/main.go
2021/04/17 21:26:42 oh no
2021/04/17 21:26:42 ho ho ho!

Ah. AH! This is why it's bigger! This is why error is wider than *naughtyError!

Yes bear?

Because these values are both nil! But uhhh when acting as an interface value (for the error interface), they behave differently!

Yes!

And so the size of an error interface value is 16 bytes because... there's two pointers!

Precisely!

And the second pointer is... to the type!

Well, in Go, yes!

And it allows us to "downcast" it.

To what?

To "downcast" it, i.e. to go from the interface type, back to the concrete type:

Go code
package main

import (
    "errors"
    "log"
)

func showType(err error) {
    // 👇 downcasting action happens here
    if _, ok := err.(*naughtyError); ok {
        log.Printf("got a *naughtyError")
    } else if _, ok := err.(*niceError); ok {
        log.Printf("got a *niceError")
    } else {
        log.Printf("got another kind of error")
    }
}

func main() {
    showType((*naughtyError)(nil))
    showType((*niceError)(nil))
    showType(errors.New(""))
}

type naughtyError struct{}

func (ne *naughtyError) Error() string {
    return "oh no"
}

type niceError struct{}

func (ne *niceError) Error() string {
    return "ho ho ho!"
}
Shell session
$ go run go/main.go
2021/04/17 21:33:48 got a *naughtyError
2021/04/17 21:33:48 got a *niceError
2021/04/17 21:33:48 got another kind of error

Ah, so mystery solved! One pointer for the value, one pointer for the type: 8 bytes each, together, 16 bytes.

Case closed.

Right! Close enough.

And now, let's turn our attention back to Rust.

What were we doing again?

Ah, right! We were here:

Rust code
fn main() {
    println!("{}", read_issue().unwrap())
}

fn read_issue() -> Result<String, E> {
    std::fs::read_to_string("/etc/issue")
}

And we had to pick an E.

Because, as we mentioned, Rust forces you to pick an "error type".

But... there is also a standard error type. Except in Rust, capitalization does not mean "private or public" (there's a keyword for that). Instead, all types are capitalized, by convention, so it's not error, it's Error.

More specifically, it's std::error::Error.

So, we can try to pick that:

Rust code
// 👇 we import it here
use std::error::Error;

fn main() {
    println!("{}", read_issue().unwrap())
}

// and use it there                 👇
fn read_issue() -> Result<String, Error> {
    std::fs::read_to_string("/etc/issue")
}

And, well...

Shell session
$ cargo run --quiet
warning: trait objects without an explicit `dyn` are deprecated
 --> src/main.rs:7:35
  |
7 | fn read_issue() -> Result<String, Error> {
  |                                   ^^^^^ help: use `dyn`: `dyn Error`
  |
  = note: `#[warn(bare_trait_objects)]` on by default

(rest omitted)

Oh, no, a warning! It says to use the dyn keyword. Alright, who am I to object, let's use the dyn keyword.

Rust code
//                                 👇
fn read_issue() -> Result<String, dyn Error> {
    std::fs::read_to_string("/etc/issue")
}

Let's try this again:

Shell session
$ cargo run --quiet
error[E0277]: the size for values of type `(dyn std::error::Error + 'static)` cannot be known at compilation time
   --> src/main.rs:7:20
    |
7   | fn read_issue() -> Result<String, dyn Error> {
    |                    ^^^^^^^^^^^^^^^^^^^^^^^^^ doesn't have a size known at compile-time
    |
   ::: /home/amos/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/result.rs:241:20
    |
241 | pub enum Result<T, E> {
    |                    - required by this bound in `std::result::Result`
    |
    = help: the trait `Sized` is not implemented for `(dyn std::error::Error + 'static)`

error: aborting due to previous error

And, especially coming from Go, this error is really puzzling.

Because this code feels more or less like a direct translation of that code:

Go code
func readIssue() (string, error) {
    bs, err := os.ReadFile("/etc/issue")
    return string(bs), err
}

And that code "just works".

Well, the explanation is rather simple: it is not a direct translation.

A direct translation would look more like this:

Rust code
use std::error::Error;

fn main() {
    println!("{}", read_issue().unwrap())
}

//                                 👇
fn read_issue() -> Result<String, Box<dyn Error>> {
    std::fs::read_to_string("/etc/issue")
}

Which, as you can see, works just f-

Shell session
$ cargo run --quiet
error[E0308]: mismatched types
 --> src/main.rs:8:5
  |
7 | fn read_issue() -> Result<String, Box<dyn Error>> {
  |                    ------------------------------ expected `std::result::Result<String, Box<(dyn std::error::Error + 'static)>>` because of return type
8 |     std::fs::read_to_string("/etc/issue")
  |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected struct `Box`, found struct `std::io::Error`
  |
  = note: expected enum `std::result::Result<_, Box<(dyn std::error::Error + 'static)>>`
             found enum `std::result::Result<_, std::io::Error>`

error: aborting due to previous error

...okay, so it doesn't work. But we can make it work fairly easily:

Rust code
use std::error::Error;

fn main() {
    println!("{}", read_issue().unwrap())
}

fn read_issue() -> Result<String, Box<dyn Error>> {
    Ok(std::fs::read_to_string("/etc/issue")?)
}
Shell session
$ cargo run --quiet
Arch Linux \r (\l)


Theeeeeeere we go. Now we're even. This is the closest we'll get to the aforementioned Go code.
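
If the `Ok(...?)` dance looks like magic, here's a rough sketch of what it amounts to, spelled out by hand. The `read_file` helper and its `path` parameter are hypothetical, and this is a simplification rather than the compiler's exact expansion: on failure, `?` returns early, converting the error into the function's declared error type along the way.

```rust
use std::error::Error;
use std::fs;

// Hypothetical helper, for illustration only.
fn read_file(path: &str) -> Result<String, Box<dyn Error>> {
    match fs::read_to_string(path) {
        // on success, pass the `String` along
        Ok(contents) => Ok(contents),
        // on failure, convert the `std::io::Error` into a `Box<dyn Error>`
        // (the standard library provides that conversion) and return it
        Err(e) => Err(e.into()),
    }
}

fn main() {
    match read_file("/etc/i-do-not-exist") {
        Ok(contents) => println!("{}", contents),
        Err(e) => println!("could not read file: {}", e),
    }
}
```

That conversion step is why the earlier version, which returned the `Result<String, std::io::Error>` as-is, was rejected with "mismatched types".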

But at that point, you may very well have several questions.

What the heck is a Box?

Well, for the time being, you can sort of think about it as a pointer.

But it's not, not really.

This is a pointer:

Rust code
struct MyError {
    value: u32,
}

fn main() {
    let e = MyError { value: 32 };
    let e_ptr: *const MyError = &e;
    print_error(e_ptr);
}

fn print_error(e: *const MyError) {
    if e != std::ptr::null() {
        println!("MyError (value = {})", unsafe { (*e).value });
    }
}
Shell session
$ cargo run --quiet
MyError (value = 32)

But as you may have noticed, dereferencing a pointer is unsafe:

Rust code
fn print_error(e: *const MyError) {
    if e != std::ptr::null() {
        //                                 👇
        println!("MyError (value = {})", unsafe { (*e).value });
    }
}

Why is dereferencing a pointer unsafe? Well, because it might be null! Or it might point to an address that does not fall within an area that's meaningful for the currently running program, and that would cause a segmentation fault.

So, whenever we dereference a pointer, we're on our own.
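
As a small sketch of what "on our own" means, here's a variant of `print_error` (renamed to a hypothetical `describe`, so that it returns a `String`) that uses the standard `is_null` method on raw pointers. Note that the null check only rules out one failure mode: nothing here can verify that a non-null pointer actually points to a live `MyError`, which is why the `unsafe` block is still needed.

```rust
struct MyError {
    value: u32,
}

// Hypothetical variant of `print_error`, for illustration only.
fn describe(e: *const MyError) -> String {
    if e.is_null() {
        return String::from("no error");
    }
    // SAFETY: we checked for null above, but the caller still has to
    // guarantee that `e` points to a valid, live `MyError`.
    let value = unsafe { (*e).value };
    format!("MyError (value = {})", value)
}

fn main() {
    let e = MyError { value: 32 };
    // a `&MyError` coerces to a `*const MyError`
    println!("{}", describe(&e));
    println!("{}", describe(std::ptr::null()));
}
```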

Getting the size of something, though, is perfectly safe:

Rust code
struct MyError {
    value: u32,
}

fn main() {
    let e = MyError { value: 32 };
    let e_ptr: *const MyError = &e;
    // 👇 no unsafe!
    dbg!(std::mem::size_of_val(&e_ptr));
    print_error(e_ptr);
}

fn print_error(e: *const MyError) {
    if e != std::ptr::null() {
        println!("MyError (value = {})", unsafe { (*e).value });
    }
}
Shell session
$ cargo run --quiet
[src/main.rs:8] std::mem::size_of_val(&e_ptr) = 8
MyError (value = 32)

And, as expected, the size of a pointer is 8 bytes, because I'm still writing this from Linux 64-bit.

But: if constructing a pointer value is safe, dereferencing it (reading from the memory it points to, or writing to it) is not.

So we often don't use raw pointers at all in Rust.

Instead, we use references!

Rust code
struct MyError {
    value: u32,
}

fn main() {
    let e = MyError { value: 32 };
    let e_ref: &MyError = &e;
    dbg!(std::mem::size_of_val(&e_ref));
    print_error(e_ref);
}

fn print_error(e: &MyError) {
    println!("MyError (value = {})", (*e).value);
}

Which are still 8 bytes:

Shell session
$ cargo run --quiet
[src/main.rs:8] std::mem::size_of_val(&e_ref) = 8
MyError (value = 32)

...but they're also perfectly safe to dereference, because it is guaranteed that they point to valid memory: in safe code, it is impossible to construct an invalid reference, or to keep a reference to some value after that value has been freed.

In fact, it's so safe that we don't even need to use the * operator to dereference: we can just rely on "autoderef":

Rust code
fn print_error(e: &MyError) {
    // star be gone!                 👇
    println!("MyError (value = {})", e.value);
}

And that works just as well.
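
To get a feel for how far autoderef goes, here's a quick sketch. The `describe` method is a hypothetical addition to `MyError`; field access and method calls both look through as many layers of reference as needed:

```rust
struct MyError {
    value: u32,
}

impl MyError {
    // Hypothetical method, for illustration only.
    fn describe(&self) -> String {
        format!("MyError (value = {})", self.value)
    }
}

fn main() {
    let e = MyError { value: 32 };
    let e_ref: &MyError = &e;
    let e_ref_ref: &&MyError = &e_ref;

    // explicit dereferencing: one star per layer of reference
    println!("{}", (**e_ref_ref).value);
    // autoderef: same field, no stars
    println!("{}", e_ref_ref.value);
    // method calls get the same treatment
    println!("{}", e_ref_ref.describe());
}
```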

And now, a quick note about safety: you'll notice that I just said "in safe code, it is impossible to construct an invalid reference".

In unsafe code, it is very possible:

Rust code
struct MyError {
    value: u32,
}

fn main() {
    let e: *const MyError = std::ptr::null();
    // ooooh no no no. crimes! 👇
    let e_ref: &MyError = unsafe { &*e };
    dbg!(std::mem::size_of_val(&e_ref));
    print_error(e_ref);
}

fn print_error(e: &MyError) {
    println!("MyError (value = {})", e.value);
}

And then BOOM:

Shell session
$ cargo run --quiet
[src/main.rs:8] std::mem::size_of_val(&e_ref) = 8
[1]    17569 segmentation fault  cargo run --quiet

Segmentation fault.

But that's not news. That's not a big flaw in Rust's safety model.

That is Rust's safety model.

The idea is that, if all the unsafe code is sound, then all the safe code is safe, too.

And you have a lot less "unsafe" code than you have "safe" code, which makes it a lot easier to audit. It's also very visible, with explicit unsafe blocks, unsafe traits and unsafe functions, and so it's easy to statically determine where unsafe code is — it's not just "woops you imported the forbidden package".

Finally, there's tools like the Miri interpreter, that help with unsafe code, just like there's sanitizers for C/C++, which do not have that safe/unsafe split.
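
Here's a small sketch of that "sound unsafe code, safe callers" idea, with a hypothetical `first` function: a safe signature, a tiny `unsafe` block inside, and a check that makes the block sound. Safe code can't misuse it, and anyone auditing it only has a few lines to look at. (In real code you'd just write `items.first().copied()`; this is only here to show the shape.)

```rust
// Hypothetical function, for illustration only.
fn first(items: &[u32]) -> Option<u32> {
    if items.is_empty() {
        return None;
    }
    // SAFETY: we just checked that `items` has at least one element,
    // so index 0 is in bounds.
    Some(unsafe { *items.get_unchecked(0) })
}

fn main() {
    println!("{:?}", first(&[3, 2, 1]));
    println!("{:?}", first(&[]));
}
```

Remove the `is_empty` check and the function still compiles, but it's no longer sound: that's the part the `unsafe` keyword asks a human to verify.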

But let's get back to boxes

So, we've seen two kinds of "pointers" in Rust so far: raw pointers, aka *const T (and its sibling, *mut T), and references (&T and &mut T).

We said we were going to ignore raw pointers, so let's focus on references.

In Go, when you get a pointer to an object, you can do anything with it. You can hold onto it as long as you want, you can shove it into a map — even if that object was originally going to be freed, you, as a function that receives a pointer to that object, can extend the lifetime of that object for however long you need.

This works because Go is garbage-collected, so, as long as there's at least one reference to an object, it's "live", and it's not going to be collected (or "freed").

As soon as there are zero references left to an object, then it qualifies for garbage collection. The garbage collector does not guarantee how soon an object will actually be freed, or if it will ever be freed. It just qualifies.

And it's not immediately obvious if we try to showcase this with code like that:

Go code
package main

import (
    "log"
)

func main() {
    var slice []string
    addString(&slice)

    log.Printf("==== from main ====")
    for _, str := range slice {
        log.Printf("%v, %v", &str, str)
    }
}

func addString(slice *[]string) {
    var str = "hello"

    log.Printf("%v, %v", &str, str)
    *slice = append(*slice, str)
}

This should show the address of the string in both the addString function and in main, right? And I just said they were the same string, main just ends up keeping a reference to it.

But we get two different addresses:

Shell session
$ go run ./go/main.go
2021/04/18 11:34:42 0xc00011e220, hello
2021/04/18 11:34:42 ==== from main ====
2021/04/18 11:34:42 0xc00011e250, hello

To really see what's going on, we need to peel away one more layer of Go magic, and cast our string to a reflect.StringHeader:

Go code
package main

import (
    "log"
    "reflect"
    "unsafe"
)

func main() {
    var slice []string
    addString(&slice)

    log.Printf("==== from main ====")
    for _, str := range slice {
        log.Printf("%v, %v", &str, str)
        sh := (*reflect.StringHeader)(unsafe.Pointer(&str))
        log.Printf("%#v", sh)
    }
}

func addString(slice *[]string) {
    var str = "hello"

    log.Printf("%v, %v", &str, str)
    sh := (*reflect.StringHeader)(unsafe.Pointer(&str))
    log.Printf("%#v", sh)
    *slice = append(*slice, str)
}
Shell session
$ go run ./go/main.go
2021/04/18 11:35:24 0xc000010240, hello
2021/04/18 11:35:24 &reflect.StringHeader{Data:0x4c63e1, Len:5}
2021/04/18 11:35:24 ==== from main ====
2021/04/18 11:35:24 0xc000010270, hello
2021/04/18 11:35:24 &reflect.StringHeader{Data:0x4c63e1, Len:5}

There. Now we know it's the same string.

We have reflect.StringHeader, which is a Go struct (it's what a string actually is under the hood) and has copy semantics, just like other Go structs. And then we have "the string data", which lives at 0x4c63e1.

Which... is a peculiar memory address. It's very low. Much lower than the two StringHeader values we have, which were at 0xc000010240 and 0xc000010270, respectively.

So again, to understand what's really going on, we need to get our hands dirty.

Shell session
$ go build ./go/main.go
$ gdb --quiet ./main
Reading symbols from ./main...
Loading Go Runtime support.
(gdb) catch syscall exit exit_group
Catchpoint 1 (syscalls 'exit' [60] 'exit_group' [231])
(gdb) r
Starting program: /home/amos/ftl/whatbox/main
[New LWP 24224]
[New LWP 24225]
[New LWP 24226]
[New LWP 24227]
[New LWP 24228]
2021/04/18 11:41:24 0xc00011e220, hello
2021/04/18 11:41:24 &reflect.StringHeader{Data:0x4c63e1, Len:5}
2021/04/18 11:41:24 ==== from main ====
2021/04/18 11:41:24 0xc00011e250, hello
2021/04/18 11:41:24 &reflect.StringHeader{Data:0x4c63e1, Len:5}

Thread 1 "main" hit Catchpoint 1 (call to syscall exit_group), runtime.exit ()
    at /usr/lib/go/src/runtime/sys_linux_amd64.s:57
57              RET

Okay, we've now successfully executed our main Go binary from GDB, and we've managed to pause execution right before it exits.

And at that point, we can inspect memory mappings:

Shell session
(gdb) info proc map
process 24220
Mapped address spaces:

          Start Addr           End Addr       Size     Offset objfile
            0x400000           0x4a2000    0xa2000        0x0 /home/amos/ftl/whatbox/main
            0x4a2000           0x545000    0xa3000    0xa2000 /home/amos/ftl/whatbox/main
            0x545000           0x55b000    0x16000   0x145000 /home/amos/ftl/whatbox/main
            0x55b000           0x58e000    0x33000        0x0 [heap]
        0xc000000000       0xc004000000  0x4000000        0x0
      0x7fffd1329000     0x7fffd369a000  0x2371000        0x0
      0x7fffd369a000     0x7fffe381a000 0x10180000        0x0
      0x7fffe381a000     0x7fffe381b000     0x1000        0x0
      0x7fffe381b000     0x7ffff56ca000 0x11eaf000        0x0
      0x7ffff56ca000     0x7ffff56cb000     0x1000        0x0
      0x7ffff56cb000     0x7ffff7aa0000  0x23d5000        0x0
      0x7ffff7aa0000     0x7ffff7aa1000     0x1000        0x0
      0x7ffff7aa1000     0x7ffff7f1a000   0x479000        0x0
      0x7ffff7f1a000     0x7ffff7f1b000     0x1000        0x0
      0x7ffff7f1b000     0x7ffff7f9a000    0x7f000        0x0
      0x7ffff7f9a000     0x7ffff7ffa000    0x60000        0x0
      0x7ffff7ffa000     0x7ffff7ffd000     0x3000        0x0 [vvar]
      0x7ffff7ffd000     0x7ffff7fff000     0x2000        0x0 [vdso]
      0x7ffffffdd000     0x7ffffffff000    0x22000        0x0 [stack]

And what we see here is very interesting.

First off, we notice that 0x4c63e1, where our string data actually was, is in a region directly mapped from our main file:

          Start Addr           End Addr       Size     Offset objfile
            0x4a2000           0x545000    0xa3000    0xa2000 /home/amos/ftl/whatbox/main

And indeed, if we read 5 bytes at str_addr - region_start_addr + region_file_offset...

Shell session
$ dd status=none if=./main skip=$((0x4c63e1-0x4a2000+0xa2000)) bs=1 count=5
hello%

...there it is!

Cool bear's hot tip

The % character is just what Z shell prints when a command's output is not terminated with a new line.

That way the prompt is not messed up, but you still know that there was no newline.
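
Going back to that `dd` invocation for a second: the arithmetic passed to `skip` takes the string's address, subtracts the start of the mapped region, and adds the offset of that region within the file. As a sketch (the `file_offset` helper name is made up), using the addresses from the GDB session above:

```rust
// Hypothetical helper, for illustration only: translate an address inside
// a file-backed mapping into an offset within that file.
fn file_offset(addr: u64, region_start: u64, region_file_offset: u64) -> u64 {
    addr - region_start + region_file_offset
}

fn main() {
    // string data address, region start, region offset in the file
    let offset = file_offset(0x4c63e1, 0x4a2000, 0xa2000);
    println!("{:#x}", offset); // prints "0xc63e1"
}
```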

And the other very interesting thing is that the StringHeader values, in the 0xc00011e000 neighborhood, are not in the region GDB tells us is the [stack]:

          Start Addr           End Addr       Size     Offset objfile
      0x7ffffffdd000     0x7ffffffff000    0x22000        0x0 [stack]

And they're not in the region GDB tells us is the [heap]:

          Start Addr           End Addr       Size     Offset objfile
            0x55b000           0x58e000    0x33000        0x0 [heap]

Why is that?

Well, because Go has its own stack. And its own heap. And everything is garbage-collected. And also, that makes goroutines cheap, and they can adjust their stack size dynamically, and it also complicates FFI a bunch.

But, point is: our example is a little moot, because "hello" is never going to be garbage collected — it's read directly from the executable, which never disappears as long as our program runs.

In fact, here's a fun way to show this:

Go code
package main

import (
    "log"
    "reflect"
    "runtime"
    "unsafe"
)

func main() {
    var str string
    sh := (*reflect.StringHeader)(unsafe.Pointer(&str))

    log.Printf("(main) %v, %#v", &str, str)
    log.Printf("(main) %#v", sh)

    data, len := lol()

    // Now that there's no pointers left to `"hello"`, let's try to get it
    // garbage-collected. There's no guarantees, still, but we're doing our
    // best.
    runtime.GC()

    sh.Data = uintptr(data)
    sh.Len = len

    log.Printf("(main) %v, %#v", &str, str)
    log.Printf("(main) %#v", sh)
}

func lol() (uint64, int) {
    var str = "hello"

    sh := (*reflect.StringHeader)(unsafe.Pointer(&str))

    log.Printf("(lol) %v, %#v", &str, str)
    log.Printf("(lol) %#v", sh)

    // we return `sh.Data` as a `uint64`, which _does not count as a pointer_
    // because Go has a precise GC, not a conservative GC.
    return uint64(sh.Data), sh.Len
}
Shell session
$ go run ./go/main.go
2021/04/18 12:08:22 (main) 0xc00009e220, ""
2021/04/18 12:08:22 (main) &reflect.StringHeader{Data:0x0, Len:0}
2021/04/18 12:08:22 (lol) 0xc00009e230, "hello"
2021/04/18 12:08:22 (lol) &reflect.StringHeader{Data:0x4c63dd, Len:5}
2021/04/18 12:08:22 (main) 0xc00009e220, "hello"
2021/04/18 12:08:22 (main) &reflect.StringHeader{Data:0x4c63dd, Len:5}

Neat! Even though we explicitly invoke the garbage collector, the data at 0x4c63dd doesn't "disappear". It's still there.

Whereas if we compare with this code, which puts "hello" in the "Go heap":

Go code
// omitted: rest of the code

func lol() (uint64, int) {
    var str = string([]byte{'h', 'e', 'l', 'l', 'o'})

    sh := (*reflect.StringHeader)(unsafe.Pointer(&str))

    log.Printf("(lol) %v, %#v", &str, str)
    log.Printf("(lol) %#v", sh)

    return uint64(sh.Data), sh.Len
}
Shell session
$ go run ./go/main.go
2021/04/18 12:12:06 (main) 0xc00009e220, ""
2021/04/18 12:12:06 (main) &reflect.StringHeader{Data:0x0, Len:0}
2021/04/18 12:12:06 (lol) 0xc00009e230, "hello"
2021/04/18 12:12:06 (lol) &reflect.StringHeader{Data:0xc0000b80b8, Len:5}
2021/04/18 12:12:06 (main) 0xc00009e220, "hello"
2021/04/18 12:12:06 (main) &reflect.StringHeader{Data:0xc0000b80b8, Len:5}

...then we see that "hello" is indeed in the Go heap (it's in the 0xc000000000 neighborhood).

But uh... it doesn't disappear either.

Just curious, what did you expect?

To see the empty string? I don't know, that's a good question...

Well... the garbage collector doesn't really zero out memory blocks when it frees them, right?

It just "marks them as free", and doesn't change anything about the contents of the memory.

Right, yes, I suppose.

And so unless something else gets allocated at the exact same location, re-using the previously-freed block, then we should still see the same "hello" string, even if it's been garbage-collected.

Right.

If only there was a way to get the Go GC to fill a memory block with nonsense after it's been freed oh wait, hang on, there it is, we can just use GODEBUG=clobberfree=1:

clobberfree: setting clobberfree=1 causes the garbage collector to clobber the memory content of an object with bad content when it frees the object.

Let's try it:

Shell session
$ GODEBUG=clobberfree=1 go run ./go/main.go
2021/04/18 12:16:00 (main) 0xc000012240, ""
2021/04/18 12:16:00 (main) &reflect.StringHeader{Data:0x0, Len:0}
2021/04/18 12:16:00 (lol) 0xc000012250, "hello"
2021/04/18 12:16:00 (lol) &reflect.StringHeader{Data:0xc0000161a8, Len:5}
2021/04/18 12:16:00 (main) 0xc000012240, "ﾭ\xde\xef"
2021/04/18 12:16:00 (main) &reflect.StringHeader{Data:0xc0000161a8, Len:5}

There! We have successfully written unsafe Go code, with the help of the unsafe and reflect packages.

To really make our example work, though, we have to run the version where "hello" was mapped directly from the executable file, also with clobberfree:

Go code
func lol() (uint64, int) {
    var str = "hello"

    // etc.
}
Shell session
$ GODEBUG=clobberfree=1 go run ./go/main.go
2021/04/18 12:17:11 (main) 0xc000012240, ""
2021/04/18 12:17:11 (main) &reflect.StringHeader{Data:0x0, Len:0}
2021/04/18 12:17:11 (lol) 0xc000012250, "hello"
2021/04/18 12:17:11 (lol) &reflect.StringHeader{Data:0x4c63dd, Len:5}
2021/04/18 12:17:11 (main) 0xc000012240, "hello"
2021/04/18 12:17:11 (main) &reflect.StringHeader{Data:0x4c63dd, Len:5}
What did we learn?

Go string values are actually structs, with a Data field, that points somewhere in memory. The structs themselves, of type reflect.StringHeader, have copy semantics, so s2 := s1 creates a new StringHeader, pointing to the same area in memory.

The area in memory a StringHeader points to can be in two different regions: "static data" mapped directly from the executable file, for string constants, or "that big block Go allocates", where the GC heap lives.

Now for some more Rust

Some of the same concepts apply to Rust code as well.

For instance, if we have a string literal, it will be neither on the heap nor on the stack. It will be "static data", mapped directly from the executable:

Rust code
fn main() {
    let s = "hello";
    dbg!(s as *const _);
}
Shell session
$ cargo build --quiet
$ gdb --quiet --args ./target/debug/whatbox
Reading symbols from ./target/debug/whatbox...
(gdb) catch syscall exit exit_group
Catchpoint 1 (syscalls 'exit' [60] 'exit_group' [231])
(gdb) r
Starting program: /home/amos/ftl/whatbox/target/debug/whatbox
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
[src/main.rs:3] s as *const _ = 0x000055555558c000

Catchpoint 1 (call to syscall exit_group), 0x00007ffff7e71621 in _exit () from /usr/lib/libc.so.6
(gdb) info proc map
process 30848
Mapped address spaces:

          Start Addr           End Addr       Size     Offset objfile
      0x555555554000     0x555555559000     0x5000        0x0 /home/amos/ftl/whatbox/target/debug/whatbox
      0x555555559000     0x55555558c000    0x33000     0x5000 /home/amos/ftl/whatbox/target/debug/whatbox
      0x55555558c000     0x555555599000     0xd000    0x38000 /home/amos/ftl/whatbox/target/debug/whatbox
      0x555555599000     0x55555559c000     0x3000    0x44000 /home/amos/ftl/whatbox/target/debug/whatbox
      0x55555559c000     0x55555559d000     0x1000    0x47000 /home/amos/ftl/whatbox/target/debug/whatbox
      0x55555559d000     0x5555555be000    0x21000        0x0 [heap]
      0x7ffff7da3000     0x7ffff7da5000     0x2000        0x0
      0x7ffff7da5000     0x7ffff7dcb000    0x26000        0x0 /usr/lib/libc-2.33.so
      0x7ffff7dcb000     0x7ffff7f17000   0x14c000    0x26000 /usr/lib/libc-2.33.so
      0x7ffff7f17000     0x7ffff7f63000    0x4c000   0x172000 /usr/lib/libc-2.33.so
      0x7ffff7f63000     0x7ffff7f66000     0x3000   0x1bd000 /usr/lib/libc-2.33.so
      0x7ffff7f66000     0x7ffff7f69000     0x3000   0x1c0000 /usr/lib/libc-2.33.so
(cut)

Here "hello" was at address 0x55555558c000, which is in this range:

          Start Addr           End Addr       Size     Offset objfile
      0x55555558c000     0x555555599000     0xd000    0x38000 /home/amos/ftl/whatbox/target/debug/whatbox

...in fact, it's at the very start of this range, and we can pull the same trick, to read it directly from the executable file ourselves:

Shell session
$ dd status=none if=./target/debug/whatbox skip=$((0x38000)) bs=1 count=5
hello%

We can also have things that are on the stack. For example, if we turn our literal into a String, the String itself will be on the stack:

Rust code
fn main() {
    //       👇
    let s: String = "hello".into();
    //   👇
    dbg!(&s as *const _);
}
Shell session
$ cargo build --quiet
$ gdb --quiet --args ./target/debug/whatbox
Reading symbols from ./target/debug/whatbox...
(gdb) catch syscall exit exit_group
Catchpoint 1 (syscalls 'exit' [60] 'exit_group' [231])
(gdb) r
Starting program: /home/amos/ftl/whatbox/target/debug/whatbox
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
                                        👇
[src/main.rs:3] &s as *const _ = 0x00007fffffffd760

Catchpoint 1 (call to syscall exit_group), 0x00007ffff7e71621 in _exit () from /usr/lib/libc.so.6
(gdb) info proc map
process 31339
Mapped address spaces:

          Start Addr           End Addr       Size     Offset objfile
      0x555555554000     0x555555559000     0x5000        0x0 /home/amos/ftl/whatbox/target/debug/whatbox
      0x555555559000     0x55555558e000    0x35000     0x5000 /home/amos/ftl/whatbox/target/debug/whatbox
      0x55555558e000     0x55555559c000     0xe000    0x3a000 /home/amos/ftl/whatbox/target/debug/whatbox
      0x55555559c000     0x55555559f000     0x3000    0x47000 /home/amos/ftl/whatbox/target/debug/whatbox
      0x55555559f000     0x5555555a0000     0x1000    0x4a000 /home/amos/ftl/whatbox/target/debug/whatbox
      0x5555555a0000     0x5555555c1000    0x21000        0x0 [heap]
      0x7ffff7da3000     0x7ffff7da5000     0x2000        0x0
(cut)
      0x7ffff7fb4000     0x7ffff7fb6000     0x2000        0x0
      0x7ffff7fc7000     0x7ffff7fca000     0x3000        0x0 [vvar]
      0x7ffff7fca000     0x7ffff7fcc000     0x2000        0x0 [vdso]
      0x7ffff7fcc000     0x7ffff7fcd000     0x1000        0x0 /usr/lib/ld-2.33.so
      0x7ffff7fcd000     0x7ffff7ff1000    0x24000     0x1000 /usr/lib/ld-2.33.so
      0x7ffff7ff1000     0x7ffff7ffa000     0x9000    0x25000 /usr/lib/ld-2.33.so
      0x7ffff7ffb000     0x7ffff7ffd000     0x2000    0x2e000 /usr/lib/ld-2.33.so
      0x7ffff7ffd000     0x7ffff7fff000     0x2000    0x30000 /usr/lib/ld-2.33.so
            👇                                                  👇
      0x7ffffffdd000     0x7ffffffff000    0x22000        0x0 [stack]

But the String's data is on the heap!

Rust code
fn main() {
    let s: String = "hello".into();
    //        👇
    dbg!(s.as_bytes() as *const _);
}
Shell session
$ cargo build --quiet
$ gdb --quiet --args ./target/debug/whatbox
Reading symbols from ./target/debug/whatbox...
(gdb) catch syscall exit exit_group
Catchpoint 1 (syscalls 'exit' [60] 'exit_group' [231])
(gdb) r
Starting program: /home/amos/ftl/whatbox/target/debug/whatbox
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
                                                   👇
[src/main.rs:3] s.as_bytes() as *const _ = 0x00005555555a0aa0

Catchpoint 1 (call to syscall exit_group), 0x00007ffff7e71621 in _exit () from /usr/lib/libc.so.6
(gdb) info proc map
process 31715
Mapped address spaces:

          Start Addr           End Addr       Size     Offset objfile
      0x555555554000     0x555555559000     0x5000        0x0 /home/amos/ftl/whatbox/target/debug/whatbox
      0x555555559000     0x55555558e000    0x35000     0x5000 /home/amos/ftl/whatbox/target/debug/whatbox
      0x55555558e000     0x55555559c000     0xe000    0x3a000 /home/amos/ftl/whatbox/target/debug/whatbox
      0x55555559c000     0x55555559f000     0x3000    0x47000 /home/amos/ftl/whatbox/target/debug/whatbox
      0x55555559f000     0x5555555a0000     0x1000    0x4a000 /home/amos/ftl/whatbox/target/debug/whatbox
            👇                                                  👇
      0x5555555a0000     0x5555555c1000    0x21000        0x0 [heap]
(cut)

So a Rust String is like a Go string? I mean, a Go StringHeader?

Well, not exactly.

Because as we mentioned before, a Go string / StringHeader has copy semantics, which means we can simply assign a string to another variable, and it'll create a new StringHeader, pointing to the same memory area:

Go code
package main

import (
    "log"
    "reflect"
    "unsafe"
)

func main() {
    var s1 = string([]byte{'h', 'e', 'l', 'l', 'o'})
    var s2 = s1
    var s3 = s1

    log.Printf("&s1 = %#v", &s1)
    log.Printf("&s2 = %#v", &s2)
    log.Printf("&s3 = %#v", &s3)

    var sh *reflect.StringHeader
    sh = (*reflect.StringHeader)(unsafe.Pointer(&s1))
    log.Printf("s1 points to %#v", sh.Data)
    sh = (*reflect.StringHeader)(unsafe.Pointer(&s2))
    log.Printf("s2 points to %#v", sh.Data)
    sh = (*reflect.StringHeader)(unsafe.Pointer(&s3))
    log.Printf("s3 points to %#v", sh.Data)
}
Shell session
$ go run ./go/main.go
2021/04/18 12:36:04 &s1 = (*string)(0xc00009e220) // these are all different
2021/04/18 12:36:04 &s2 = (*string)(0xc00009e230)
2021/04/18 12:36:04 &s3 = (*string)(0xc00009e240)
2021/04/18 12:36:04 s1 points to 0xc0000b8010 // these are the same
2021/04/18 12:36:04 s2 points to 0xc0000b8010
2021/04/18 12:36:04 s3 points to 0xc0000b8010

But String in Rust does not implement the Copy trait, so it has "move semantics".

Rust code
fn main() {
    let s1: String = "hello".into();
    let s2 = s1;
    let s3 = s1;

    dbg!(&s1 as *const _);
    dbg!(&s2 as *const _);
    dbg!(&s3 as *const _);

    dbg!(s1.as_bytes() as *const _);
    dbg!(s2.as_bytes() as *const _);
    dbg!(s3.as_bytes() as *const _);
}
Shell session
$ cargo run --quiet
error[E0382]: use of moved value: `s1`
 --> src/main.rs:4:14
  |
2 |     let s1: String = "hello".into();
  |         -- move occurs because `s1` has type `String`, which does not implement the `Copy` trait
3 |     let s2 = s1;
  |              -- value moved here
4 |     let s3 = s1;
  |              ^^ value used here after move

(cut)

When we first do let s2 = s1, we move the String into s2, and so, s1 can no longer be used. Which means let s3 = s1 is illegal.
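
It's worth noting what a move does not do: it does not copy the string data. Here's a minimal sketch (the heap_addr_across_move function is a made-up name for illustration) that records where the bytes live before and after the move:

```rust
/// Moves a `String` into a new binding and reports the heap address of its
/// bytes before and after the move.
fn heap_addr_across_move() -> (usize, usize) {
    let s1: String = "hello".into();
    let before = s1.as_ptr() as usize;

    // This moves the String (pointer, capacity, length).
    // `s1` cannot be used after this line.
    let s2 = s1;
    let after = s2.as_ptr() as usize;

    (before, after)
}

fn main() {
    let (before, after) = heap_addr_across_move();
    println!("before move: {:#x}", before);
    println!("after move:  {:#x}", after);
    // Same heap buffer, new owner.
    assert_eq!(before, after);
}
```

So a move is cheap, like Go's assignment, except that only one binding gets to keep using the data afterwards.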

What we can do is clone s1, but then the contents are also cloned, so they point to different copies of the data as well:

Rust code
fn main() {
    let s1: String = "hello".into();
    let s2 = s1.clone();
    let s3 = s1.clone();

    dbg!(&s1 as *const _);
    dbg!(&s2 as *const _);
    dbg!(&s3 as *const _);

    dbg!(s1.as_bytes() as *const _);
    dbg!(s2.as_bytes() as *const _);
    dbg!(s3.as_bytes() as *const _);
}
Shell session
$ cargo run --quiet
[src/main.rs:6] &s1 as *const _ = 0x00007fff40426188 // all different
[src/main.rs:7] &s2 as *const _ = 0x00007fff404261a0
[src/main.rs:8] &s3 as *const _ = 0x00007fff404261b8
[src/main.rs:10] s1.as_bytes() as *const _ = 0x000055ac20174aa0 // all different
[src/main.rs:11] s2.as_bytes() as *const _ = 0x000055ac20174ac0
[src/main.rs:12] s3.as_bytes() as *const _ = 0x000055ac20174ae0

No, if we want to get something closer to the Go version, we can use references:

Rust code
fn main() {
    let data: String = "hello".into();

    let s1: &str = &data;
    let s2: &str = &data;
    let s3: &str = &data;

    dbg!(&s1 as *const _);
    dbg!(&s2 as *const _);
    dbg!(&s3 as *const _);

    dbg!(s1.as_bytes() as *const _);
    dbg!(s2.as_bytes() as *const _);
    dbg!(s3.as_bytes() as *const _);
}
Shell session
$ cargo run --quiet
[src/main.rs:8] &s1 as *const _ = 0x00007ffeb7e82510
[src/main.rs:9] &s2 as *const _ = 0x00007ffeb7e82520
[src/main.rs:10] &s3 as *const _ = 0x00007ffeb7e82530
[src/main.rs:12] s1.as_bytes() as *const _ = 0x0000563249bbcaa0
[src/main.rs:13] s2.as_bytes() as *const _ = 0x0000563249bbcaa0
[src/main.rs:14] s3.as_bytes() as *const _ = 0x0000563249bbcaa0

Now, s1, s2, and s3 are all distinct references to the same underlying data.

But that's still not really what Go does. Because we cannot return a reference to a local variable, for example:

Rust code
fn main() {
    let s = lol();
    dbg!(s as *const _);
    dbg!(s.as_bytes() as *const _);
}

fn lol() -> &str {
    let data: String = "hello".into();
    let s: &str = &data;
    s
}
Shell session
$ cargo run --quiet
error[E0106]: missing lifetime specifier
 --> src/main.rs:7:13
  |
7 | fn lol() -> &str {
  |             ^ expected named lifetime parameter
  |
  = help: this function's return type contains a borrowed value, but there is no value for it to be borrowed from
help: consider using the `'static` lifetime
  |
7 | fn lol() -> &'static str {
  |             ^^^^^^^^

The Rust compiler is trying to help us. "You can't just return a reference to something", it pleads. "You need to tell me how long the thing that's referenced will live".

And so, we can add 'static, to say that it will not be freed until the program exits.

We can say that...

Rust code
fn lol() -> &'static str {
    let data: String = "hello".into();
    let s: &str = &data;
    s
}

...but it's not true!

Shell session
$ cargo run --quiet
error[E0515]: cannot return value referencing local variable `data`
  --> src/main.rs:10:5
   |
9  |     let s: &str = &data;
   |                   ----- `data` is borrowed here
10 |     s
   |     ^ returns a value referencing data owned by the current function

Because data is owned by the current function! Sure, the "string data" actually lives on the heap, but it is owned by the String, which means it's allocated when let data is declared, and it's freed whenever data "goes out of scope", in this case, at the end of the lol function.

So, if we were able to return a reference to it, that reference would point to an object that is no longer live. We would have a good old dangling pointer.

If the string data lived elsewhere, say, if it were static data, in the executable itself, then we would be able to return a reference to it!

Rust code
fn main() {
    let s = lol();
    dbg!(s as *const _);
    dbg!(s.as_bytes() as *const _);
}


fn lol() -> &'static str {
    let s: &'static str = "hello";
    s
}
Shell session
$ cargo run --quiet
[src/main.rs:3] s as *const _ = 0x000055dab3e0e128
[src/main.rs:4] s.as_bytes() as *const _ = 0x000055dab3e0e128

Mhh. That address looks suspiciously close to the heap though.

Correct!

And now is as good a time as any to show some diagrams.

Let's get some addresses directly from GDB so our diagram can be close to reality.

Shell session
$ cargo build --quiet
$ gdb --quiet --args ./target/debug/whatbox
(gdb) catch syscall exit exit_group
(gdb) run
(gdb) info proc map
process 3406
Mapped address spaces:
          Start Addr           End Addr       Size     Offset objfile
      0x55555558c000     0x555555599000     0xd000    0x38000 /home/amos/ftl/whatbox/target/debug/whatbox
      0x55555559d000     0x5555555be000    0x21000        0x0 [heap]
      0x7ffffffdd000     0x7ffffffff000    0x22000        0x0 [stack]

In this case, our three main regions of interest are laid out roughly like this (not to scale):

Cool bear's hot tip

Why "in this case"? Well, GDB disables Address Space Layout Randomization (ASLR) by default, so we consistently get 0x555... and 0x7ff..., but if we ran this outside of GDB, we would get different addresses every time.

Also, a lot of this depends on what the executable itself asks for, in its headers:

Shell session
$ readelf -Wl ./target/debug/whatbox

Elf file type is DYN (Shared object file)
Entry point 0x5050
There are 14 program headers, starting at offset 64

Program Headers:
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  PHDR           0x000040 0x0000000000000040 0x0000000000000040 0x000310 0x000310 R   0x8
  INTERP         0x000350 0x0000000000000350 0x0000000000000350 0x00001c 0x00001c R   0x1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x000000 0x0000000000000000 0x0000000000000000 0x004eb0 0x004eb0 R   0x1000
  LOAD           0x005000 0x0000000000005000 0x0000000000005000 0x032815 0x032815 R E 0x1000
  LOAD           0x038000 0x0000000000038000 0x0000000000038000 0x00c4dc 0x00c4dc R   0x1000
  LOAD           0x044580 0x0000000000045580 0x0000000000045580 0x002ad0 0x002cb0 RW  0x1000
(cut)
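
If you'd rather see the three regions from inside a program than from GDB, here's a rough sketch. The sample_addresses function is a hypothetical helper, and the expectation that the numbers come out ordered "static data, then heap, then stack" is an assumption about a typical Linux x86-64 setup, not something Rust promises. Outside of GDB, the exact values will change from run to run.

```rust
/// Returns the addresses of (string literal bytes, heap-allocated bytes,
/// a stack local), in that order.
fn sample_addresses() -> (usize, usize, usize) {
    let literal: &'static str = "hello"; // bytes live in the executable's read-only data
    let owned: String = literal.into();  // bytes are copied to the heap
    let local: u64 = 0;                  // lives in this function's stack frame

    (
        literal.as_ptr() as usize,
        owned.as_ptr() as usize,
        &local as *const u64 as usize,
    )
}

fn main() {
    let (static_addr, heap_addr, stack_addr) = sample_addresses();
    println!("static data: {:#018x}", static_addr);
    println!("heap:        {:#018x}", heap_addr);
    println!("stack:       {:#018x}", stack_addr);
}
```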

The heap is managed by the program's memory allocator. In this case, it's glibc malloc, but it could just as well be jemalloc, or mimalloc, or snmalloc, or... you get the gist.

Things can be allocated and freed on the heap at any time, as long as we have enough memory. You can think of the memory allocator as just a registry of "memory blocks", some of which are used, and some of which are free.

The heap can get really large — at the time of this writing, realistically, in the hundreds of GiB (gibibytes).

The stack, on the other hand, is both a lot simpler, and a lot more restrictive.

"Allocating on the stack" just means "decrementing the stack pointer". Also, calling a function "allocates on the stack". That's why the list we look at when we try to find where an error occurred is called a "stack trace" (or a "call stack").

Calling a function is "just" pushing a return address and some arguments onto the stack, and jumping to some other code. The specifics depend on the exact ABI (application binary interface), for example, who's responsible for allocating and freeing the locals, where and in which order the arguments are passed, but it basically looks like this:

Which is why we cannot return a reference to a local variable: it would point to memory that has been "freed" (by changing the stack pointer).

However, if the function allocates memory on the heap, then we can return a reference to it no problem! It'll stay valid until it's freed.
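
As a small sketch of that idea, here's a function that heap-allocates a number with Box and hands the owning pointer back to its caller. The make_boxed name and the choice of a u64 are just for illustration.

```rust
/// Allocates a `u64` on the heap and returns the owning pointer, along with
/// the address the value had while we were still inside this function.
fn make_boxed() -> (Box<u64>, usize) {
    let b = Box::new(42u64);
    let addr_inside = &*b as *const u64 as usize;
    (b, addr_inside)
}

fn main() {
    let (b, addr_inside) = make_boxed();
    let addr_outside = &*b as *const u64 as usize;

    // The Box was moved out of `make_boxed`, but the heap allocation did not move.
    assert_eq!(addr_inside, addr_outside);
    println!("value = {}, still at {:#x}", *b, addr_outside);
}
```

The stack frame of make_boxed is long gone by the time main prints the value, but the allocation it made is still there, and it stays there until b is dropped.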

Okay... but we never had to worry about any of that in Go!

Does Go not have a stack?

Of course Go has a stack! You can call functions, and they can return, therefore, there's a stack. And some locals are stack-allocated, even in Go.

In fact, the Go compiler tries very hard to stack-allocate as much as possible, and only uses the heap when it has no other choice.

So, for example, in the following code, s1 remains on the stack:

Go code
package main

import "log"

func main() {
    var s1 = []byte{'h', 'e', 'l', 'l', 'o'}

    log.Printf("s1 len = %#v", len(s1))
}

How do I know? Because I asked the Go compiler to tell me, with -gcflags=-m:

Shell session
$ go run -gcflags=-m ./go/main.go
# command-line-arguments
go/main.go:5:6: can inline main
go/main.go:6:17: []byte{...} does not escape
go/main.go:8:12: ... argument does not escape
go/main.go:8:32: len(s1) escapes to heap
2021/04/18 14:20:38 s1 len = 5

However, in this code, s1 escapes to the heap:

Go code
package main

import "log"

func main() {
    var s1 = []byte{'h', 'e', 'l', 'l', 'o'}

    log.Printf("s1 = %#v", s1)
}
Shell session
$ go run -gcflags=-m ./go/main.go
# command-line-arguments
go/main.go:5:6: can inline main
go/main.go:6:17: []byte{...} escapes to heap
go/main.go:8:12: ... argument does not escape
go/main.go:8:13: s1 escapes to heap
2021/04/18 14:22:15 s1 = []byte{0x68, 0x65, 0x6c, 0x6c, 0x6f}

Why does it escape to the heap? My best guess is, because log.Printf takes variable arguments, and so every argument is implicitly cast to interface{}.

If we bring our own print function, s1 no longer escapes to the heap:

Go code
package main

import "fmt"

func main() {
    var s1 = []byte{'h', 'e', 'l', 'l', 'o'}
    printBytes(s1)
}

func printBytes(s []byte) {
    for _, b := range s {
        fmt.Printf("%x ", b)
    }
    fmt.Printf("\n")
}
Shell session
$ go run -gcflags=-m ./go/main.go
# command-line-arguments
go/main.go:12:13: inlining call to fmt.Printf
go/main.go:14:12: inlining call to fmt.Printf
go/main.go:5:6: can inline main
go/main.go:10:17: s does not escape
go/main.go:12:14: b escapes to heap
go/main.go:12:13: []interface {}{...} does not escape
                 👇
go/main.go:6:17: []byte{...} does not escape
<autogenerated>:1: .this does not escape
68 65 6c 6c 6f

Which suggests that anything that is printed (or even formatted to a string) ends up escaping to the heap in Go. So, uh, pro-tip, don't do any logging!

Uhhh..

Anyway, we were saying: the problem with making references like these:

Rust code
fn main() {
    let data: String = "hello".into();

    let s1: &str = &data;
    let s2: &str = &data;
    let s3: &str = &data;

    dbg!(&s1 as *const _);
    dbg!(&s2 as *const _);
    dbg!(&s3 as *const _);

    dbg!(s1.as_bytes() as *const _);
    dbg!(s2.as_bytes() as *const _);
    dbg!(s3.as_bytes() as *const _);
}

Is that they're tied to the lifetime of the source String. So if the String is a local, we cannot return a reference to it.

If we clone the source String, then we end up with three different copies, and that's not what the Go program does:

The Go program effectively creates three references to the same thing, and that's okay, because the garbage collector keeps that thing alive for as long as something still points to it.

So to really do the same thing the Go program does, we need some sort of reference:

And since we have no garbage collector, we can reach for the next-best thing: reference counting!

Rust code
use std::sync::Arc;

fn main() {
    let data: String = "hello".into();

    let s1 = Arc::new(data);
    let s2 = s1.clone();
    let s3 = s1.clone();

    dbg!(&s1 as *const _);
    dbg!(&s2 as *const _);
    dbg!(&s3 as *const _);

    dbg!(s1.as_bytes() as *const _);
    dbg!(s2.as_bytes() as *const _);
    dbg!(s3.as_bytes() as *const _);
}
Cool bear's hot tip

Arc's reference count is atomic, so an Arc can be shared between threads. If you don't need thread safety, you can use the lighter Rc instead.

Shell session
$ cargo run --quiet
[src/main.rs:10] &s1 as *const _ = 0x00007ffcea21d000
[src/main.rs:11] &s2 as *const _ = 0x00007ffcea21d020
[src/main.rs:12] &s3 as *const _ = 0x00007ffcea21d028
[src/main.rs:14] s1.as_bytes() as *const _ = 0x000056044ce89aa0
[src/main.rs:15] s2.as_bytes() as *const _ = 0x000056044ce89aa0
[src/main.rs:16] s3.as_bytes() as *const _ = 0x000056044ce89aa0

And now, we have something that's as close as we can get to the Go version.

Because, for example, we can definitely return an Arc<String>:

Rust code
use std::sync::Arc;

fn main() {
    let s = lol();
    dbg!(s);
}

fn lol() -> Arc<String> {
    let data: String = "hello".into();
    Arc::new(data)
}
Shell session
$ cargo run --quiet
[src/main.rs:5] s = "hello"
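
The tip above mentioned Rc as the lighter alternative. Here's a minimal sketch of the same program with Rc, assuming everything stays on a single thread (an Rc cannot be sent to another thread, the compiler will refuse). It reuses the lol name from the article, only the return type changes.

```rust
use std::rc::Rc;

/// Same shape as the `Arc<String>` version, with a non-atomic counter.
fn lol() -> Rc<String> {
    let data: String = "hello".into();
    Rc::new(data)
}

fn main() {
    let s1 = lol();
    let s2 = Rc::clone(&s1);
    let s3 = Rc::clone(&s1);

    // Three handles, one allocation.
    assert!(Rc::ptr_eq(&s1, &s2));
    assert!(Rc::ptr_eq(&s1, &s3));
    println!("{} (strong count = {})", s3, Rc::strong_count(&s1));
}
```

Swapping Rc back for Arc changes nothing else in this sketch.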

It's still not 100% what the Go version does. In the Go version, making another pointer that points to the same thing is completely free. There's no work at all involved there.

The work happens during garbage collection, when the GC tries to determine if some block of memory is live or dead, by literally looking at all the pointers in the program. The actual implementation is more complicated, so that it's faster, but that is the basic idea.

With reference-counting, whenever we clone an Arc, a counter is incremented. And whenever an Arc falls out of scope (is "dropped"), that counter is decremented. And that already is "some work" — especially with an Arc, because the counter is atomic (that's what the A stands for).

And when the counter reaches zero, well, the associated memory is freed. Immediately. Not "at some point in the future". And that is also "work".
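
We can watch that counter move, and check the "immediately" part, with a small sketch. The Payload type and the count_walkthrough function are hypothetical helpers written for this illustration. Arc::strong_count is the standard library's way of peeking at the counter.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Sets a shared flag when it is dropped, so we can observe the exact moment.
struct Payload {
    freed: Arc<AtomicBool>,
}

impl Drop for Payload {
    fn drop(&mut self) {
        self.freed.store(true, Ordering::SeqCst);
    }
}

/// Returns the strong counts observed along the way, and whether the payload
/// had already been dropped right after the last `Arc` went away.
fn count_walkthrough() -> (Vec<usize>, bool) {
    let freed = Arc::new(AtomicBool::new(false));
    let mut counts = Vec::new();

    let a = Arc::new(Payload { freed: Arc::clone(&freed) });
    counts.push(Arc::strong_count(&a)); // one handle
    let b = Arc::clone(&a);
    counts.push(Arc::strong_count(&a)); // two handles
    drop(b);
    counts.push(Arc::strong_count(&a)); // back to one handle
    drop(a); // counter hits zero: Payload::drop runs right here

    (counts, freed.load(Ordering::SeqCst))
}

fn main() {
    let (counts, freed) = count_walkthrough();
    println!("strong counts: {:?}, freed immediately: {}", counts, freed);
}
```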

So the big difference here really is "when the work happens", and also, how much work happens. "Doing garbage collection" is a ton of work, but between GC rounds, a lot of "program work" can happen without the GC interfering at all.

With reference counting, the amounts of work performed are much smaller, but also much more frequent. Whichever is best depends on what your program does!

But enough about strings and Arcs! We came here to talk about errors and Boxes.

One Sized fits all

Rust is rather explicit about a lot of things. And where things go in memory, and when they're allocated, and deallocated, is one of them.

So when we have a struct, like so:

Rust code
struct S {
    a: u32,
    b: u64,
}

fn main() {
    let s = S { a: 12, b: 24 };
    dbg!(s.a, s.b);
}
Shell session
$ cargo run --quiet
[src/main.rs:8] s.a = 12
[src/main.rs:8] s.b = 24

...then we know a few things:

  1. s is on the stack.
  2. s is allocated at the start of the main function.
  3. s is freed at the end of the main function.

How do we know? Well, let's print a backtrace when it's allocated, and when it's freed.

TOML markup
# in Cargo.toml

[dependencies]
backtrace = "0.3"
Rust code
use backtrace::Backtrace;

struct S {
    a: u32,
    b: u64,
}

impl S {
    fn new() -> Self {
        println!("(!) allocating at:\n{:?}", Backtrace::new());
        Self { a: 12, b: 24 }
    }
}

impl Drop for S {
    fn drop(&mut self) {
        println!("(!) freeing at:\n{:?}", Backtrace::new());
    }
}

fn main() {
    let s = S::new();
    dbg!(s.a, s.b);
}
Shell session
$ cargo run --quiet
(!) allocating at:
   0: whatbox::S::new
             at src/main.rs:10:46
   1: whatbox::main
             at src/main.rs:22:13
   2: core::ops::function::FnOnce::call_once
             at /home/amos/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:227:5
   3: std::sys_common::backtrace::__rust_begin_short_backtrace
             at /home/amos/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sys_common/backtrace.rs:125:18
   4: std::rt::lang_start::{{closure}}
             at /home/amos/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/rt.rs:66:18
   5: core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once
             at /rustc/2fd73fabe469357a12c2c974c140f67e7cdd76d0/library/core/src/ops/function.rs:259:13
      std::panicking::try::do_call
             at /rustc/2fd73fabe469357a12c2c974c140f67e7cdd76d0/library/std/src/panicking.rs:379:40
      std::panicking::try
             at /rustc/2fd73fabe469357a12c2c974c140f67e7cdd76d0/library/std/src/panicking.rs:343:19
      std::panic::catch_unwind
             at /rustc/2fd73fabe469357a12c2c974c140f67e7cdd76d0/library/std/src/panic.rs:431:14
      std::rt::lang_start_internal
             at /rustc/2fd73fabe469357a12c2c974c140f67e7cdd76d0/library/std/src/rt.rs:51:25
   6: std::rt::lang_start
             at /home/amos/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/rt.rs:65:5
   7: main
   8: __libc_start_main
   9: _start

[src/main.rs:23] s.a = 12
[src/main.rs:23] s.b = 24
(!) freeing at:
   0: <whatbox::S as core::ops::drop::Drop>::drop
             at src/main.rs:17:43
   1: core::ptr::drop_in_place<whatbox::S>
             at /home/amos/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:179:1
   2: whatbox::main
             at src/main.rs:24:1
   3: core::ops::function::FnOnce::call_once
             at /home/amos/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:227:5
   4: std::sys_common::backtrace::__rust_begin_short_backtrace
             at /home/amos/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sys_common/backtrace.rs:125:18
   5: std::rt::lang_start::{{closure}}
             at /home/amos/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/rt.rs:66:18
   6: core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once
             at /rustc/2fd73fabe469357a12c2c974c140f67e7cdd76d0/library/core/src/ops/function.rs:259:13
      std::panicking::try::do_call
             at /rustc/2fd73fabe469357a12c2c974c140f67e7cdd76d0/library/std/src/panicking.rs:379:40
      std::panicking::try
             at /rustc/2fd73fabe469357a12c2c974c140f67e7cdd76d0/library/std/src/panicking.rs:343:19
      std::panic::catch_unwind
             at /rustc/2fd73fabe469357a12c2c974c140f67e7cdd76d0/library/std/src/panic.rs:431:14
      std::rt::lang_start_internal
             at /rustc/2fd73fabe469357a12c2c974c140f67e7cdd76d0/library/std/src/rt.rs:51:25
   7: std::rt::lang_start
             at /home/amos/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/rt.rs:65:5
   8: main
   9: __libc_start_main
  10: _start

And there we have it. It's allocated at the start of main, and deallocated at the end of main.

Now to think about sizedness. When we have a reference to something, we don't need to know its size. All we have is a pointer, which is literally just a number describing "where the thing-we-point-to starts in memory".

Which is not the case in this code here:

Rust code
struct S {
    a: u32,
    b: u64,
}

fn main() {
    // here 👇
    let s = S { a: 12, b: 24 };
    dbg!(s.a, s.b);
}

Here, we're holding the entire S on the stack.

And stack allocations must be predictable: we must know the size of whatever we push on the stack, so that we can pop it again.

And as we've seen earlier, calling functions pushes something on the stack, and function locals are also placed on the stack (we're ignoring CPU registers on purpose for this whole article), so we can get a rough measure of "how much the top of the stack moved" by printing the address of a function local:

Rust code
fn main() {
    println!("one call:");
    f();
    f();
    f();
    println!("two nested calls:");
    g();
    g();
    g();
}

#[inline(never)]
fn f() {
    let x = 0;
    dbg!(&x as *const _);
}

#[inline(never)]
fn g() {
    f()
}

Here, when printing the address of x when calling f() directly, the stack is smaller than it is when calling f() through g().

And since the stack grows downwards on x86, smaller stack = the "top" of the stack is a bigger number:

Shell session
$ cargo run --quiet
one call:
[src/main.rs:15] &x as *const _ = 0x00007fffcc125be4
[src/main.rs:15] &x as *const _ = 0x00007fffcc125be4
[src/main.rs:15] &x as *const _ = 0x00007fffcc125be4
two nested calls:
[src/main.rs:15] &x as *const _ = 0x00007fffcc125bd4
[src/main.rs:15] &x as *const _ = 0x00007fffcc125bd4
[src/main.rs:15] &x as *const _ = 0x00007fffcc125bd4

So here, we can see that the "cost" of calling f() through g() is 0x10, ie. 16 bytes, ie. two pointers.

If we declare a large local in g(), then that cost increases significantly:

Rust code
struct S {
    data: [u8; 0x1000],
}

fn main() {
    println!("one call:");
    f();
    f();
    f();
    println!("two nested calls:");
    g();
    g();
    g();
}

#[inline(never)]
fn f() {
    let x = 0;
    dbg!(&x as *const _);
}

#[inline(never)]
fn g() {
    let _s: S;
    f()
}
Shell session
$ cargo run --quiet
one call:
[src/main.rs:19] &x as *const _ = 0x00007ffc8577fa94
[src/main.rs:19] &x as *const _ = 0x00007ffc8577fa94
[src/main.rs:19] &x as *const _ = 0x00007ffc8577fa94
two nested calls:
[src/main.rs:19] &x as *const _ = 0x00007ffc8577ea84
[src/main.rs:19] &x as *const _ = 0x00007ffc8577ea84
[src/main.rs:19] &x as *const _ = 0x00007ffc8577ea84

The cost is now 0x1010: 0x1000 more, which is the size of S:

Rust code
    println!("0x{:x}", std::mem::size_of::<S>());
Shell session
0x1000

..which we already knew because, well, it's made of an array of 0x1000 bytes.

It follows that whenever we hold a value, we must know its size. And in Rust, that property is indicated by the marker trait Sized.

So, for example, if we take a value of type T, then T must be sized:

Rust code
fn f<T>(t: T) {}

Implicitly, we have:

Rust code
fn f<T: Sized>(t: T) {}

Or:

Rust code
fn f<T>(t: T)
where
    T: Sized,
{
}

I prefer the where form because the names of the type parameters and their constraints are clearly separated.

Because the Sized constraint is implicit, there exists a way to relax it, and it's spelled ?Sized:

Rust code
fn f<T>(t: T)
where
    T: ?Sized,
{
}

But then, it doesn't work. Because we're taking a T as an argument, and holding it for the duration of the function body (which, here, does nothing), so we must know its size, so that it can be put on the stack.

Shell session
$ cargo run --quiet
error[E0277]: the size for values of type `T` cannot be known at compilation time
 --> src/main.rs:3:9
  |
3 | fn f<T>(t: T)
  |      -  ^ doesn't have a size known at compile-time
  |      |
  |      this type parameter needs to be `Sized`
  |
help: function arguments must have a statically known size, borrowed types always have a known size
  |
3 | fn f<T>(&t: T)
  |         ^

Cool bear's hot tip

This is a compiler bug, and there's already an open PR for it!

The compiler's advice is a little strange here, but it is trying to make my next point.

Even if we don't know the size of T, we can still take a reference to T.

This compiles just fine:

Rust code
//         👇
fn f<T>(t: &T)
where
    T: ?Sized,
{
}

And, interestingly, that's almost exactly what the signature of std::mem::size_of_val is:

Rust code
pub const fn size_of_val<T: ?Sized>(val: &T) -> usize {
    // SAFETY: `val` is a reference, so it's a valid raw pointer
    unsafe { intrinsics::size_of_val(val) }
}
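To make that a little more concrete, here's a minimal sketch of a function with the same kind of signature. The `size_behind` helper is a made-up name, not something from the standard library: it takes a reference to any `T`, sized or not, and asks `size_of_val` for the size at runtime.

```rust
use std::mem::size_of_val;

// Hypothetical helper: accepts a reference to any type, sized or not.
fn size_behind<T: ?Sized>(val: &T) -> usize {
    size_of_val(val)
}

fn main() {
    let n: u64 = 42;
    let s: &str = "hello";
    let bytes: &[u8] = &[1, 2, 3];

    // `u64` is sized: its size is known at compile time.
    println!("u64:  {}", size_behind(&n));
    // `str` and `[u8]` are unsized: the length travels with the reference.
    println!("str:  {}", size_behind(s));
    println!("[u8]: {}", size_behind(bytes));
}
```

This prints 8, 5 and 3: the reference is what carries enough information to answer the question, even when the type alone cannot.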

And the same goes for returning something from a function!

This is fine:

Rust code
fn f<T>() -> T {
    todo!()
}

Because we have an implicit Sized constraint, so we're effectively saying this:

Rust code
fn f<T>() -> T
where
    T: Sized,
{
    todo!()
}

But if we relax the Sized restriction...

Rust code
fn f<T>() -> T
where
    T: ?Sized,
{
    todo!()
}

Then we run into trouble again:

Shell session
$ cargo run --quiet
error[E0277]: the size for values of type `T` cannot be known at compilation time
 --> src/main.rs:3:14
  |
3 | fn f<T>() -> T
  |      -       ^ doesn't have a size known at compile-time
  |      |
  |      this type parameter needs to be `Sized`
  |
  = note: the return type of a function must have a statically known size

And that's exactly what we ran into with this program, an eternity ago:

Rust code
use std::error::Error;

fn main() {
    println!("{}", read_issue().unwrap())
}

fn read_issue() -> Result<String, dyn Error> {
    std::fs::read_to_string("/etc/issue")
}
Shell session
$ cargo run --quiet
error[E0277]: the size for values of type `(dyn std::error::Error + 'static)` cannot be known at compilation time
   --> src/main.rs:7:20
    |
7   | fn read_issue() -> Result<String, dyn Error> {
    |                    ^^^^^^^^^^^^^^^^^^^^^^^^^ doesn't have a size known at compile-time
    |
   ::: /home/amos/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/result.rs:241:20
    |
241 | pub enum Result<T, E> {
    |                    - required by this bound in `std::result::Result`
    |
    = help: the trait `Sized` is not implemented for `(dyn std::error::Error + 'static)`

error: aborting due to previous error

Which, with all that additional context, hopefully makes a lot more sense.

And now, let's discuss how we can get out of this pickle.

The many ways we can return a Result

The issue with returning Result<T, dyn Error> is that dyn Error could be any type. And thus, it could have any size, and thus, we don't know what size it is, and so we can't hold a value of type dyn Error.
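As a quick illustration of "it could have any size", here's a small sketch that prints the size of three standard library types that all implement Error. The exact numbers depend on your target and Rust version, so none are promised here, except that `std::fmt::Error` is a unit struct and therefore takes no space at all.

```rust
use std::mem::size_of;

fn main() {
    // Three types that all implement `std::error::Error`,
    // and that do not all have the same size:
    println!("fmt::Error:    {}", size_of::<std::fmt::Error>());
    println!("io::Error:     {}", size_of::<std::io::Error>());
    println!("FromUtf8Error: {}", size_of::<std::string::FromUtf8Error>());
}
```

A `dyn Error` could be any of those, or any other type implementing the trait, which is why the compiler cannot reserve stack space for it.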

We can however, have references to dyn Error values:

Rust code
fn print_error(e: &dyn Error) {
    println!("error has source? {}", e.source().is_some());
}

We can take an argument of a concrete type that happens to implement Error:

Rust code
fn print_error(e: std::io::Error) {
    println!("error has source? {}", e.source().is_some());
}

(This is the error type that std::fs::read_to_string returns)

And we can take values of "any type that implements Error":

Rust code
fn print_error<E>(e: E)
where
    E: Error,
{
    println!("error has source? {}", e.source().is_some());
}

And there's even a more concise way to write this:

Rust code
fn print_error(e: impl Error) {
    println!("error has source? {}", e.source().is_some());
}

And finally, we can take a Box<dyn Error>, which is "an owned pointer to something on the heap, that implements Error".

Rust code
fn print_error(e: Box<dyn Error>) {
    println!("error has source? {}", e.source().is_some());
}

And! And, as a bonus, you can actually take any sort of smart pointer to something that implements Error:

Rust code
use std::sync::Arc;

fn print_error(e: Arc<dyn Error>) {
    println!("error has source? {}", e.source().is_some());
}

Let's compare those:

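| Solution | Takes ownership? | Works with any type? | Heap? |
|---|---|---|---|
| &dyn Error | No | Yes | Depends |
| std::io::Error | Yes | No | No |
| <E: Error> | Yes | Yes | No |
| impl Error | Yes | Yes | No |
| Box<dyn Error> | Yes | Yes | Yes |
| Arc<dyn Error> | Sort of | Yes | Yes |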
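The "Depends" for &dyn Error deserves a quick sketch. Since we only borrow the error, the function doesn't care where the value lives. In the example below, `MyError` and `describe` are names made up for illustration:

```rust
use std::error::Error;
use std::fmt;

#[derive(Debug)]
struct MyError {
    code: u32,
}

impl fmt::Display for MyError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "my error (code {})", self.code)
    }
}

impl Error for MyError {}

// Borrows the error: works no matter where the value actually lives.
fn describe(e: &dyn Error) -> String {
    format!("{} (has source? {})", e, e.source().is_some())
}

fn main() {
    let on_stack = MyError { code: 1 };
    let on_heap: Box<dyn Error> = Box::new(MyError { code: 2 });

    // `&MyError` coerces to `&dyn Error`
    println!("{}", describe(&on_stack));
    // reborrow whatever the box points to
    println!("{}", describe(&*on_heap));
}
```

Same function, same signature, one value on the stack and one on the heap.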

That's all in argument position.

In return position, some of these don't work.

For example, &dyn Error doesn't work.

Well... it works in the abstract, like this:

Rust code
fn read_issue() -> Result<String, &'static dyn Error> {
    todo!()
}

This compiles. But it's only useful if somehow all the error values we want to return are also static.

Rust code
use std::{error::Error, fmt};

#[derive(Debug)]
struct MyError {}

impl fmt::Display for MyError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        fmt::Debug::fmt(self, f)
    }
}

impl Error for MyError {}

const MY_ERROR: MyError = MyError {};

fn read_issue() -> Result<String, &'static dyn Error> {
    Err(&MY_ERROR)
}

...and that's not uhh that's not how we usually do things.

Usually we construct error values, because they hold some context:

Rust code
use std::{error::Error, fmt};

fn main() {
    println!("{}", read_issue().unwrap())
}

#[derive(Debug)]
struct MyError {
    some_value: u32,
}

impl fmt::Display for MyError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        fmt::Debug::fmt(self, f)
    }
}

impl Error for MyError {}

fn read_issue() -> Result<String, &'static dyn Error> {
    let e = MyError { some_value: 128 };
    Err(&e)
}

...and then we run into the same problem we did before: we cannot return a reference to data owned by the current function:

Shell session
$ cargo run --quiet
error[E0515]: cannot return value referencing local variable `e`
  --> src/main.rs:22:5
   |
22 |     Err(&e)
   |     ^^^^--^
   |     |   |
   |     |   `e` is borrowed here
   |     returns a value referencing data owned by the current function

So, that one is out.

Next up is just returning the concrete type: we can do that!

Rust code
fn read_issue() -> Result<String, MyError> {
    let e = MyError { some_value: 128 };
    Err(e)
}

Although now it only works with errors of type MyError.

The generic type parameter version doesn't work in return position:

Rust code
fn read_issue<E>() -> Result<String, E>
where
    E: Error,
{
    let e = MyError { some_value: 128 };
    Err(e)
}

...because we must be able to infer the concrete type of any type parameter by looking at its call site.

And here:

Rust code
    read_issue().unwrap()

...there is nothing that tells us what E should be. Even if we did something like that:

Rust code
    let r: Result<_, MyError> = read_issue();

it still wouldn't work. Because it would be possible to invoke it like that instead:

Rust code
    let r: Result<_, std::io::Error> = read_issue();

...and then read_issue would return the wrong type.

No, type parameters really are type parameters, in that the function should be "parametric", we should be able to "parameterize" it by any type E that fits the constraints.
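For contrast, here's a sketch of a function where a type parameter in return position does make sense, because the body genuinely works for every type the caller might pick. The `parse_or_default` helper is hypothetical, written just for this example:

```rust
use std::str::FromStr;

// Hypothetical helper: the *caller* picks `T`, and the body works for
// every `T` that satisfies the bounds. That's what "parametric" means.
fn parse_or_default<T>(input: &str) -> T
where
    T: FromStr + Default,
{
    input.trim().parse().unwrap_or_default()
}

fn main() {
    let a: u32 = parse_or_default("42");
    let b: f64 = parse_or_default("2.5");
    let c: bool = parse_or_default("not a bool");
    println!("{} {} {}", a, b, c);
}
```

This prints `42 2.5 false`. The function never decides what `T` is, which is exactly what our read_issue cannot promise, since it always builds a MyError.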

What we want to say here, is not really that E can be anything. We never return more than one concrete type from read_issue, it's always the same type.

What we want to say, is that we can't be bothered to spell out what the return type really is, and also that the concrete type doesn't matter, because the only visible/accessible part of it should be the Error interface.

And that's precisely what impl Trait does.

Rust code
fn read_issue() -> Result<String, impl Error> {
    let e = MyError { some_value: 128 };
    Err(e)
}
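Here's a sketch of what that looks like from the caller's side. It assumes a Display impl for MyError that's slightly different from the one shown earlier, purely so the output is readable:

```rust
use std::error::Error;
use std::fmt;

#[derive(Debug)]
struct MyError {
    some_value: u32,
}

impl fmt::Display for MyError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "my error with value {}", self.some_value)
    }
}

impl Error for MyError {}

fn read_issue() -> Result<String, impl Error> {
    Err(MyError { some_value: 128 })
}

fn main() {
    let e = read_issue().unwrap_err();

    // Anything from the `Error` trait (and its supertraits) is available:
    println!("{}", e);
    println!("has source? {}", e.source().is_some());

    // ...but the concrete type is hidden, so this would not compile:
    // println!("{}", e.some_value);
}
```

The value is still a plain MyError, with no boxing involved. The caller just isn't allowed to know that.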

And then, finally, we can also return an owned pointer to "some type that implements Error", either unique:

Rust code
fn read_issue() -> Result<String, Box<dyn Error>> {
    let e = MyError { some_value: 128 };
    Err(Box::new(e))
}

Or reference-counted:

Rust code
use std::sync::Arc;

fn read_issue() -> Result<String, Arc<dyn Error>> {
    let e = MyError { some_value: 128 };
    Err(Arc::new(e))
}
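To show what "reference-counted" buys us, here's a small sketch (the `shared_error` function is a made-up name) where the same error ends up with two owners:

```rust
use std::error::Error;
use std::sync::Arc;

// Hypothetical helper returning a reference-counted error.
fn shared_error() -> Arc<dyn Error> {
    let e: std::io::Error = std::io::ErrorKind::Other.into();
    Arc::new(e)
}

fn main() {
    let first = shared_error();
    // Bumps the reference count: the error itself is not copied.
    let second = Arc::clone(&first);

    println!("owners: {}", Arc::strong_count(&first));
    drop(second);
    println!("owners: {}", Arc::strong_count(&first));
}
```

This prints `owners: 2`, then `owners: 1`. The error is only freed when the last owner goes away.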

So, in return position, we really only have these options:

| Solution | Gives ownership? | Works with any type? | Heap? |
|---|---|---|---|
| std::io::Error | Yes | No | No |
| impl Error | Yes | Yes | No |
| Box<dyn Error> | Yes | Yes | Yes |
| Arc<dyn Error> | Sort of | Yes | Yes |

Error propagation and the ? sigil

But that's not the end of the story.

Sure, using the concrete type works fine in this case:

Rust code
fn main() {
    println!("{}", read_issue().unwrap())
}

fn read_issue() -> Result<String, std::io::Error> {
    std::fs::read_to_string("/etc/issue")
}

And so does impl Error:

Rust code
fn main() {
    println!("{}", read_issue().unwrap())
}

fn read_issue() -> Result<String, impl std::error::Error> {
    std::fs::read_to_string("/etc/issue")
}

And Box<dyn Error>:

Rust code
fn main() {
    println!("{}", read_issue().unwrap())
}

fn read_issue() -> Result<String, Box<dyn std::error::Error>> {
    std::fs::read_to_string("/etc/issue")
}

Mhh actually, that one doesn't work as-is:

Shell session
$ cargo run --quiet
error[E0308]: mismatched types
 --> src/main.rs:6:5
  |
5 | fn read_issue() -> Result<String, Box<dyn std::error::Error>> {
  |                    ------------------------------------------ expected `std::result::Result<String, Box<(dyn std::error::Error + 'static)>>` because of return type
6 |     std::fs::read_to_string("/etc/issue")
  |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected struct `Box`, found struct `std::io::Error`
  |
  = note: expected enum `std::result::Result<_, Box<(dyn std::error::Error + 'static)>>`
             found enum `std::result::Result<_, std::io::Error>`

error: aborting due to previous error

Because Box<dyn Error> and std::io::Error aren't the same type.

Here's a very long way to convert it:

Rust code
fn read_issue() -> Result<String, Box<dyn std::error::Error>> {
    match std::fs::read_to_string("/etc/issue") {
        Ok(s) => Ok(s),
        Err(e) => Err(Box::new(e)),
    }
}

Although, there is an impl From<E> for Box<dyn Error> (for any E that implements Error) that fits here, so we can just use .into():

Rust code
fn read_issue() -> Result<String, Box<dyn std::error::Error>> {
    match std::fs::read_to_string("/etc/issue") {
        Ok(s) => Ok(s),
        //              👇
        Err(e) => Err(e.into()),
    }
}
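Here's a tiny sketch of that conversion in isolation. The `boxed` helper is hypothetical. The standard library also has a From impl for string slices, which is handy for ad-hoc errors:

```rust
use std::error::Error;

// Hypothetical helper: turns any error type into a boxed trait object,
// relying on the standard library's `From<E> for Box<dyn Error>` impl.
fn boxed<E: Error + 'static>(e: E) -> Box<dyn Error> {
    e.into()
}

fn main() {
    let io_err: std::io::Error = std::io::ErrorKind::NotFound.into();
    let a = boxed(io_err);

    // String slices can be converted too:
    let b: Box<dyn Error> = "something went wrong".into();

    println!("{}", a);
    println!("{}", b);
}
```

In both cases, .into() does the Box::new for us, and the unsizing to dyn Error on top.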

And if we want, we can deal with the error first (if any), and retrieve the result, which we can return later, like so:

Rust code
fn read_issue() -> Result<String, Box<dyn std::error::Error>> {
    let value = match std::fs::read_to_string("/etc/issue") {
        Ok(s) => s,
        Err(e) => return Err(e.into()),
    };
    // if we reach this point, `read_to_string` succeeded

    Ok(value)
}

And that's exactly what the ? sigil does:

Rust code
fn read_issue() -> Result<String, Box<dyn std::error::Error>> {
    let value = std::fs::read_to_string("/etc/issue")?;
    // if we reach this point, `read_to_string` succeeded

    Ok(value)
}

Seriously. Go back and read both those samples if you need to — they are equivalent.

But let's say we want to do two things in that function, and they can both fail. For example, we might read an entire file as bytes, and try to interpret those bytes as an UTF-8 string.

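Those both can fail: reading the file may fail with an std::io::Error (it might not exist, or we might not be allowed to read it), and not every sequence of bytes is valid UTF-8, in which case String::from_utf8 returns an std::string::FromUtf8Error.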

In that scenario, Box<dyn Error> works:

Rust code
fn read_issue() -> Result<String, Box<dyn std::error::Error>> {
    let buf = std::fs::read("/etc/issue")?;
    let s = String::from_utf8(buf)?;
    Ok(s)
}

In fact, the whole program runs fine:

Shell session
$ cargo run --quiet
Arch Linux \r (\l)
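Since /etc/issue is usually well-behaved, the failure paths are hard to see with that program. Here's a sketch that works on in-memory bytes instead. The `parse_number` helper is made up for this example, and it swaps the I/O step for integer parsing (so the second error type is `ParseIntError` rather than `std::io::Error`), but the idea is the same: two unrelated error types, one return type.

```rust
use std::error::Error;

// Hypothetical helper: two different failure modes, one return type.
fn parse_number(bytes: Vec<u8>) -> Result<u32, Box<dyn Error>> {
    // may fail with `FromUtf8Error`
    let s = String::from_utf8(bytes)?;
    // may fail with `ParseIntError`
    let n = s.trim().parse::<u32>()?;
    Ok(n)
}

fn main() {
    println!("{:?}", parse_number(b"42\n".to_vec()));
    println!("{:?}", parse_number(vec![0xff, 0xfe]));
    println!("{:?}", parse_number(b"forty-two".to_vec()));
}
```

The first call succeeds, the second fails while decoding, the third fails while parsing, and the caller gets a Box<dyn Error> either way.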

But two of our other solutions no longer work. We cannot use impl Error here:

Rust code
fn read_issue() -> Result<String, impl std::error::Error> {
    let buf = std::fs::read("/etc/issue")?;
    let s = String::from_utf8(buf)?;
    Ok(s)
}
Shell session
$ cargo run --quiet
error[E0277]: `?` couldn't convert the error to `impl std::error::Error`
 --> src/main.rs:6:42
  |
5 | fn read_issue() -> Result<String, impl std::error::Error> {
  |                    -------------------------------------- expected `impl std::error::Error` because of this
6 |     let buf = std::fs::read("/etc/issue")?;
  |                                          ^ the trait `From<std::io::Error>` is not implemented for `impl std::error::Error`
  |
  = note: the question mark operation (`?`) implicitly performs a conversion on the error value using the `From` trait
  = note: required by `from`

error[E0277]: `?` couldn't convert the error to `impl std::error::Error`
 --> src/main.rs:7:35
  |
5 | fn read_issue() -> Result<String, impl std::error::Error> {
  |                    -------------------------------------- expected `impl std::error::Error` because of this
6 |     let buf = std::fs::read("/etc/issue")?;
7 |     let s = String::from_utf8(buf)?;
  |                                   ^ the trait `From<FromUtf8Error>` is not implemented for `impl std::error::Error`
  |
  = note: the question mark operation (`?`) implicitly performs a conversion on the error value using the `From` trait
  = note: required by `from`

error[E0720]: cannot resolve opaque type
 --> src/main.rs:5:35
  |
5 | fn read_issue() -> Result<String, impl std::error::Error> {
  |                                   ^^^^^^^^^^^^^^^^^^^^^^ recursive opaque type
6 |     let buf = std::fs::read("/etc/issue")?;
  |               ---------------------------- returning here with type `std::result::Result<String, impl std::error::Error>`
7 |     let s = String::from_utf8(buf)?;
  |             ----------------------- returning here with type `std::result::Result<String, impl std::error::Error>`
8 |     Ok(s)
  |     ----- returning here with type `std::result::Result<String, impl std::error::Error>`

...because there's two possible error types we can return! And impl Error needs to be a single one!

Similarly, we can't "just return the concrete type" because there's two different concrete types!

If we pick std::io::Error, this line errors out:

Rust code
fn read_issue() -> Result<String, std::io::Error> {
    let buf = std::fs::read("/etc/issue")?;
    // 👇 can't convert to `std::io::Error`!
    let s = String::from_utf8(buf)?;
    Ok(s)
}

And if we pick std::string::FromUtf8Error, then that line errors out:

Rust code
fn read_issue() -> Result<String, std::string::FromUtf8Error> {
    // 👇 can't convert to `std::string::FromUtf8Error`
    let buf = std::fs::read("/etc/issue")?;
    let s = String::from_utf8(buf)?;
    Ok(s)
}

So, we need to update our table:

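| Solution | Gives ownership? | Generic? | Unifies types? | Heap? |
|---|---|---|---|---|
| std::io::Error | Yes | No | No | No |
| impl Error | Yes | Yes | No | No |
| Box<dyn Error> | Yes | Yes | Yes | Yes |
| Arc<dyn Error> | Sort of | Yes | Yes | Yes |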

And looking at this, it seems like we have two questions left to answer: how does Box unify types, and how do we unify types without forcing a heap allocation?

How does Box unify types?

Well, the trick is not in the Box, it's really in the dyn Error.

Consider this program:

Rust code
fn main() {
    let e = get_error();
    dbg!(std::mem::size_of_val(&e));
}

fn get_error() -> Box<dyn std::error::Error> {
    let e: std::io::Error = std::io::ErrorKind::Other.into();
    let e = Box::new(e);
    dbg!(std::mem::size_of_val(&e));
    e
}

What should this print?

Well, Box is just a "smart pointer", so... 8 bytes?

Wrong! Well. Half-right.

Shell session
$ cargo run --quiet
[src/main.rs:9] std::mem::size_of_val(&e) = 8
[src/main.rs:3] std::mem::size_of_val(&e) = 16

In get_error, we hold a Box<std::io::Error>, and that is 8 bytes, ie. one pointer.

But in main, we hold a Box<dyn std::error::Error>, and that's 16 bytes.

Two pointers.

Heyyyyyy we've seen that before! In the Go stuff!

Go interface types are 16 bytes!

They are! One pointer for the value, and one for the type.

It's roughly the same here, except the second pointer in a Box<dyn T>, whose real name is a "boxed trait object", is not a pointer to "the concrete type". It's a pointer to the "virtual table that corresponds to the implementation of the interface for the concrete type".

All that means is that, we have just enough information to treat the value inside the box as something that implements Error, and nothing else.
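The "one pointer vs two pointers" rule isn't specific to Box, by the way. Here's a short sketch that checks it for plain references as well, using `usize` as a stand-in for "one pointer-sized word":

```rust
use std::error::Error;
use std::mem::size_of;

fn main() {
    let word = size_of::<usize>();

    // Pointers to sized types are one word wide...
    assert_eq!(size_of::<&std::io::Error>(), word);
    assert_eq!(size_of::<Box<std::io::Error>>(), word);

    // ...while pointers to trait objects also carry a vtable pointer.
    assert_eq!(size_of::<&dyn Error>(), 2 * word);
    assert_eq!(size_of::<Box<dyn Error>>(), 2 * word);

    println!("all pointer sizes are as expected");
}
```

So the extra pointer comes with the dyn, whichever kind of pointer we happen to be using.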

There is no safe way to downcast from a Box<dyn T> to a concrete type U, without using Any, which is made explicitly for that purpose.

There's an unsafe way, and it's unsafe because nothing in Box lets us know what the concrete type is, so we have to use prior knowledge, or pass information "out of band" — and if we get it wrong, really bad things will happen.

Rust code
fn main() {
    let e = get_error();
    dbg!(std::mem::size_of_val(&e));

    let e = unsafe { Box::from_raw(Box::into_raw(e) as *mut std::io::Error) };
    dbg!(std::mem::size_of_val(&e));
}

fn get_error() -> Box<dyn std::error::Error> {
    let e: std::io::Error = std::io::ErrorKind::Other.into();
    let e = Box::new(e);
    dbg!(std::mem::size_of_val(&e));
    e
}
Shell session
$ cargo run --quiet
[src/main.rs:12] std::mem::size_of_val(&e) = 8
[src/main.rs:3] std::mem::size_of_val(&e) = 16
[src/main.rs:6] std::mem::size_of_val(&e) = 8
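For dyn Error specifically, we don't have to reach for unsafe: the standard library provides checked downcasting methods on it, which compare type IDs at runtime much like Any does. Here's a sketch using `downcast_ref` and `downcast`, reusing the get_error function from above (minus the dbg! calls):

```rust
use std::error::Error;

fn get_error() -> Box<dyn Error> {
    let e: std::io::Error = std::io::ErrorKind::Other.into();
    Box::new(e)
}

fn main() {
    let e = get_error();

    // Borrowing downcast: `None` if the concrete type doesn't match.
    assert!(e.downcast_ref::<std::fmt::Error>().is_none());
    assert!(e.downcast_ref::<std::io::Error>().is_some());

    // Owning downcast: hands the box back unchanged on mismatch.
    match e.downcast::<std::io::Error>() {
        Ok(io_err) => println!("got an io::Error of kind {:?}", io_err.kind()),
        Err(other) => println!("some other error: {}", other),
    }
}
```

If we guess the wrong type here, we get a None or an Err back, instead of really bad things.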

And now, there's only one question left.

How do we unify types without forcing a heap allocation?

Why, with an enum of course!

"Of course?"

An enum is perfect for what we want. It's like a struct with two fields. One of them is the "discriminant", that records which variant is active. And the second one is "large enough to hold any of the variants", similar to a union in C.
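We can poke at that with size_of. In this sketch, `Payload` is a made-up enum whose largest variant holds a u64. The exact size of the enum depends on the target (and the compiler is free to get clever when variants have unused bit patterns), so the only thing to count on here is that it's bigger than a u64 alone:

```rust
use std::mem::size_of;

// Hypothetical enum, for illustration only.
#[allow(dead_code)]
enum Payload {
    Small(u8),
    Large(u64),
}

fn main() {
    // Room for the largest variant, plus the discriminant, plus padding.
    println!("u64:     {}", size_of::<u64>());
    println!("Payload: {}", size_of::<Payload>());
}
```

Every Payload takes the same amount of space, whichever variant is active, and that's what makes it a sized type.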

So, we can make an enum that can contain either an std::io::Error, or an std::string::FromUtf8Error:

Rust code
enum MyError {
    Io(std::io::Error),
    Utf8(std::string::FromUtf8Error),
}

Then implement std::error::Error for it, along with std::fmt::Debug and std::fmt::Display, which are required for std::error::Error:

Rust code
use std::fmt;

#[derive(Debug)]
enum MyError {
    Io(std::io::Error),
    Utf8(std::string::FromUtf8Error),
}

impl fmt::Display for MyError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            MyError::Io(e) => {
                write!(f, "i/o error: {}", e)
            }
            MyError::Utf8(e) => {
                write!(f, "utf-8 error: {}", e)
            }
        }
    }
}

impl std::error::Error for MyError {}

Then implement From for both variants, so that they work well with the ? sigil:

Rust code
impl From<std::io::Error> for MyError {
    fn from(e: std::io::Error) -> Self {
        Self::Io(e)
    }
}

impl From<std::string::FromUtf8Error> for MyError {
    fn from(e: std::string::FromUtf8Error) -> Self {
        Self::Utf8(e)
    }
}

...and finally, change the return type of read_issue:

Rust code
fn main() {
    println!("{}", read_issue().unwrap())
}

//                                   👇
fn read_issue() -> Result<String, MyError> {
    let buf = std::fs::read("/etc/issue")?;
    let s = String::from_utf8(buf)?;
    Ok(s)
}

And it just works!
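A nice bonus of the enum approach is that callers can match on the variant, which a Box<dyn Error> doesn't let us do directly. Here's a self-contained sketch. It assumes a `read_text` function that has the same shape as read_issue but takes the path as a parameter, so that a failure is easy to trigger:

```rust
use std::fmt;

#[derive(Debug)]
enum MyError {
    Io(std::io::Error),
    Utf8(std::string::FromUtf8Error),
}

impl fmt::Display for MyError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            MyError::Io(e) => write!(f, "i/o error: {}", e),
            MyError::Utf8(e) => write!(f, "utf-8 error: {}", e),
        }
    }
}

impl std::error::Error for MyError {}

impl From<std::io::Error> for MyError {
    fn from(e: std::io::Error) -> Self {
        Self::Io(e)
    }
}

impl From<std::string::FromUtf8Error> for MyError {
    fn from(e: std::string::FromUtf8Error) -> Self {
        Self::Utf8(e)
    }
}

// Hypothetical variant of `read_issue` that takes the path as an argument.
fn read_text(path: &str) -> Result<String, MyError> {
    let buf = std::fs::read(path)?;
    let s = String::from_utf8(buf)?;
    Ok(s)
}

fn main() {
    match read_text("/definitely/not/a/real/file") {
        Ok(s) => println!("{}", s),
        Err(MyError::Io(e)) => println!("could not read: {:?}", e.kind()),
        Err(MyError::Utf8(e)) => {
            println!("bad utf-8 after {} valid bytes", e.utf8_error().valid_up_to())
        }
    }
}
```

Each arm gets the original, concrete error back, with all of its methods.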

Of course, that's a lot of code, and in real life we rustaceans often reach for a crate like thiserror:

TOML markup
# in `Cargo.toml`

[dependencies]
thiserror = "1.0"

...which reduces this whole example to just this:

Rust code
#[derive(Debug, thiserror::Error)]
enum MyError {
    #[error("i/o error: {0}")]
    Io(#[from] std::io::Error),
    #[error("utf-8 error: {0}")]
    Utf8(#[from] std::string::FromUtf8Error),
}

fn main() {
    println!("{}", read_issue().unwrap())
}

fn read_issue() -> Result<String, MyError> {
    let buf = std::fs::read("/etc/issue")?;
    let s = String::from_utf8(buf)?;
    Ok(s)
}

Let's try to summarize as best we can:

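| Solution | Gives ownership? | Generic? | Unifies types? | Heap? |
|---|---|---|---|---|
| std::io::Error | Yes | No | No | No |
| impl Error | Yes | Yes | No | No |
| Box<dyn Error> | Yes | Yes | Yes | Yes |
| Custom enum | Yes | No | Yes | No |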

Now we're done with all the important questions, so I guess it's time to close out this article and wish y'-

Ooh, ooh! raises paw

Yes bear?

What is even the point of impl Trait? Sure, it lets us not worry about spelling out the concrete type, and it "hides it" but... really? A whole feature just for that?

Ah, excellent question.

impl Trait is actually rarely used in the context of error handling. Mentioning it here was probably silly even.

Where it is especially useful, is when the concrete type cannot be named.

It who cannot be named?

Yes. Like closures. What's the type of f here?

Rust code
fn main() {
    let f = || {
        println!("hello from the closure side");
    };
    f();
}

Well, it doesn't even "close over" anything (no variables are captured) but uhh I guess it's an Fn()?

Not quite! Fn is a trait that it does implement... but it's not a type.

Rust code
fn main() {
    let f: dyn Fn() = || {
        println!("hello from the closure side");
    };
    f();
}
Shell session
$ cargo run --quiet
error[E0308]: mismatched types
 --> src/main.rs:2:23
  |
2 |       let f: dyn Fn() = || {
  |  ____________--------___^
  | |            |
  | |            expected due to this
3 | |         println!("hello from the closure side");
4 | |     };
  | |_____^ expected trait object `dyn Fn`, found closure
  |
  = note: expected trait object `dyn Fn()`
                  found closure `[closure@src/main.rs:2:23: 4:6]`

error[E0277]: the size for values of type `dyn Fn()` cannot be known at compilation time
 --> src/main.rs:2:9
  |
2 |     let f: dyn Fn() = || {
  |         ^ doesn't have a size known at compile-time
  |
  = help: the trait `Sized` is not implemented for `dyn Fn()`
  = note: all local variables must have a statically known size
  = help: unsized locals are gated as an unstable feature

Two errors here: the second is the one we've been fighting all along: a dyn Fn() isn't sized, because Fn is a trait. We need to "just box it" if we want to hold it, or refer to a concrete type.

And the first error tells us the concrete type, except uhh.. this:

closure `[closure@src/main.rs:2:23: 4:6]`

...is not the name of a type. We cannot name it.

So, if we want to return such a closure, we can either box it:

Rust code
fn main() {
    let f = get_closure();
}

fn get_closure() -> Box<dyn Fn()> {
    Box::new(|| {
        println!("hello from the closure side");
    })
}

...which forces a heap allocation, or we can use impl Trait syntax:

Rust code
fn main() {
    let f = get_closure();
}

fn get_closure() -> impl Fn() {
    || {
        println!("hello from the closure side");
    }
}

And you know what's fun? Using std::mem::size_of_val, we can print the size of that closure:

Rust code
fn main() {
    let f = get_closure();
    dbg!(std::mem::size_of_val(&f));
}

fn get_closure() -> impl Fn() {
    || {
        println!("hello from the closure side");
    }
}
Shell session
$ cargo run --quiet
[src/main.rs:3] std::mem::size_of_val(&f) = 0

Hah! It's zero! Told you it didn't capture anything.

That's right! And if we make it capture something...

Rust code
fn main() {
    let f = get_closure();
    dbg!(std::mem::size_of_val(&f));
}

fn get_closure() -> impl Fn() {
    let val = 27_u64;
    move || {
        println!("hello from the closure side, val is {}", val);
    }
}
Shell session
$ cargo run --quiet
[src/main.rs:3] std::mem::size_of_val(&f) = 8

...it's no longer zero-sized!
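And just like with errors, impl Trait stops working as soon as a function may return more than one concrete type, since every closure has its own type. Here's a sketch with two made-up functions, `make_adder` and `make_op`, showing when each approach fits:

```rust
// Hypothetical helpers, for illustration only.

// One closure type: `impl Fn` works, and nothing is boxed.
fn make_adder(n: i32) -> impl Fn(i32) -> i32 {
    move |x| x + n
}

// Two different closure types: they must be unified behind `dyn Fn`.
fn make_op(double: bool) -> Box<dyn Fn(i32) -> i32> {
    if double {
        Box::new(|x| x * 2)
    } else {
        Box::new(|x| x + 1)
    }
}

fn main() {
    println!("{}", make_adder(10)(5));
    println!("{}", make_op(true)(5));
    println!("{}", make_op(false)(5));
}
```

This prints 15, 10 and 6. Changing the return type of make_op to impl Fn(i32) -> i32 (and dropping the boxes) would be rejected, because the two branches don't have the same type.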

What did we learn?

As of Rust 1.51.0, only sized values can be "held" (as a local variable), "passed", or "returned". Box is an owned pointer, so it can contain unsized values.

Trait objects (dyn Trait) are unsized, because they might be different types, of different sizes. We can manipulate them through references (&dyn Trait), and smart pointers (Box<dyn Trait>, Rc<dyn Trait>, Arc<dyn Trait>). Smart pointers, which carry ownership (either exclusive or shared), force the concrete value to live on the heap.

Garbage-collected languages don't have to deal with any of this. Go opportunistically allocates values on the stack, only moving to the heap when they escape. Creating or destroying another pointer to the same object is cost-free — the cost is offset to the GC cycles.

Trait objects are not the only mechanism to "unify" disparate types: an enum works just as well. Implementing a trait for each variant of an enum is a lot of boilerplate, but thankfully, there's crates for that: thiserror, enum_dispatch, etc.

impl Trait allows taking and returning types that cannot be named, such as closures, or generator-based futures (created by async blocks). It can also be used to hide a concrete type.