Let's tackle the day 16 puzzle!

Parsing

The input looks like this:

Valve AA has flow rate=0; tunnels lead to valves DD, II, BB
Valve BB has flow rate=13; tunnels lead to valves CC, AA
Valve CC has flow rate=2; tunnels lead to valves DD, BB
Valve DD has flow rate=20; tunnels lead to valves CC, AA, EE
Valve EE has flow rate=3; tunnels lead to valves FF, DD
Valve FF has flow rate=0; tunnels lead to valves EE, GG
Valve GG has flow rate=0; tunnels lead to valves FF, HH
Valve HH has flow rate=22; tunnel leads to valve GG
Valve II has flow rate=0; tunnels lead to valves AA, JJ
Valve JJ has flow rate=21; tunnel leads to valve II

And we can stuff it in src/input-sample.txt.

Let's grab nom for parsing:

Shell session
$ cargo add nom@7
Rust code
// in `src/parse.rs`

use std::fmt;

use nom::{
    branch::alt,
    bytes::complete::{tag, take},
    combinator::map,
    multi::separated_list1,
    sequence::{preceded, tuple},
    IResult,
};

// Valve names are always two characters: this is more compact than a `String`,
// doesn't involve heap allocations, and gives us a `Copy` type, which is
// really convenient.
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
pub struct Name(pub [u8; 2]);

impl fmt::Debug for Name {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let [a, b] = self.0;
        write!(f, "{}{}", a as char, b as char)
    }
}

impl fmt::Display for Name {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        fmt::Debug::fmt(self, f)
    }
}

impl Name {
    fn parse(i: &str) -> IResult<&str, Self> {
        map(take(2usize), |slice: &str| {
            Self(slice.as_bytes().try_into().unwrap())
        })(i)
    }
}

#[derive(Debug)]
pub struct Valve {
    pub name: Name,
    pub flow: u64,
    pub links: Vec<Name>,
}

impl Valve {
    pub fn parse(i: &str) -> IResult<&str, Self> {
        // example input:
        // Valve AA has flow rate=0; tunnels lead to valves DD, II, BB
        map(
            tuple((
                preceded(tag("Valve "), Name::parse),
                preceded(tag(" has flow rate="), nom::character::complete::u64),
                preceded(
                    alt((
                        tag("; tunnels lead to valves "),
                        tag("; tunnel leads to valve "),
                    )),
                    separated_list1(tag(", "), Name::parse),
                ),
            )),
            |(name, flow, links)| Self { name, flow, links },
        )(i)
    }
}

And try that code out:

Rust code
use nom::{combinator::all_consuming, Finish};

mod parse;

fn main() {
    let valves: Vec<_> = include_str!("input-sample.txt")
        .lines()
        .map(|l| all_consuming(parse::Valve::parse)(l).finish().unwrap().1)
        .collect();
    for valve in &valves {
        println!("{valve:?}");
    }
}
Shell session
$ cargo run
   Compiling day16 v0.1.0 (/home/amos/bearcove/aoc2022/day16)
    Finished dev [unoptimized + debuginfo] target(s) in 0.37s
     Running `target/debug/day16`
Valve { name: AA, flow: 0, links: [DD, II, BB] }
Valve { name: BB, flow: 13, links: [CC, AA] }
Valve { name: CC, flow: 2, links: [DD, BB] }
Valve { name: DD, flow: 20, links: [CC, AA, EE] }
Valve { name: EE, flow: 3, links: [FF, DD] }
Valve { name: FF, flow: 0, links: [EE, GG] }
Valve { name: GG, flow: 0, links: [FF, HH] }
Valve { name: HH, flow: 22, links: [GG] }
Valve { name: II, flow: 0, links: [AA, JJ] }
Valve { name: JJ, flow: 21, links: [II] }

Looking good!

Breadth-first search

We start at valve AA, and we have 30 "turns" (the statement calls them minutes).

Traveling to another valve (connected by a tunnel) takes a turn, opening a valve takes a turn. Every turn, opened valves contribute flow to the pressure released.

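To make that concrete with the sample input: if we walk from AA to DD during minute 1 and open DD (flow rate 20) during minute 2, it stays open for the remaining 28 minutes and releases 20 × 28 = 560 pressure in total.
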
It seems like we'll want to do a little graph traversal here. My idea is: knowing how many turns we have left, find the valve we should open, or in other words, the valve that, once we travel to it and open it, will release the most pressure total.

For convenience, we'll want to have our Valve values indexed by their Name rather than in a Vec. We can also make a Network type that will be immutable, and contain all valves.

Rust code
use std::collections::{hash_map::Entry, HashMap};

use nom::{combinator::all_consuming, Finish};
use parse::{Name, Valve};

mod parse;

pub struct Network {
    valves: HashMap<Name, Valve>,
}

pub type Path = Vec<(Name, Name)>;

impl Network {
    #![allow(clippy::new_without_default)]
    pub fn new() -> Self {
        Self {
            valves: include_str!("input-sample.txt")
                .lines()
                .map(|l| all_consuming(parse::Valve::parse)(l).finish().unwrap().1)
                .map(|valve| (valve.name, valve))
                .collect(),
        }
    }

    /// Given a valve name, return a list of valves we can travel to, along
    /// with the path to get there.
    ///
    /// Only the shortest paths are considered, so the search ends.
    pub fn connections(&self, start: Name) -> HashMap<Name, Path> {
        let mut current: HashMap<Name, Path> = Default::default();
        current.insert(start, vec![]);

        let mut connections = current.clone();

        while !current.is_empty() {
            let mut next: HashMap<Name, Path> = Default::default();
            for (name, path) in current {
                for link in self.valves[&name].links.iter().copied() {
                    if let Entry::Vacant(e) = connections.entry(link) {
                        let conn_path: Path = path
                            .iter()
                            .copied()
                            .chain(std::iter::once((name, link)))
                            .collect();
                        e.insert(conn_path.clone());
                        next.insert(link, conn_path);
                    }
                }
            }
            current = next;
        }

        connections
    }
}

fn main() {
    let net = Network::new();

    println!("From AA:");
    for (name, path) in net.connections(Name(*b"AA")) {
        println!("We can get to {name} using path {path:?}");
    }
}
Shell session
$ cargo run
    Finished dev [unoptimized + debuginfo] target(s) in 0.00s
     Running `target/debug/day16`
From AA:
We can get to HH using path [(AA, DD), (DD, EE), (EE, FF), (FF, GG), (GG, HH)]
We can get to AA using path []
We can get to BB using path [(AA, BB)]
We can get to CC using path [(AA, BB), (BB, CC)]
We can get to GG using path [(AA, DD), (DD, EE), (EE, FF), (FF, GG)]
We can get to JJ using path [(AA, II), (II, JJ)]
We can get to EE using path [(AA, DD), (DD, EE)]
We can get to FF using path [(AA, DD), (DD, EE), (EE, FF)]
We can get to DD using path [(AA, DD)]
We can get to II using path [(AA, II)]

I'm pretty happy with that!

Highest expected reward

Next up, we should exclude valves that are already open, and calculate the expected reward from "traveling to then opening" each valve.

Rust code
// in `src/main.rs`

// unchanged: `struct / impl Network`

#[derive(Debug, Clone)]
struct Move {
    reward: u64,
    target: Name,
    path: Path,
}

impl Move {
    fn cost(&self) -> u64 {
        let travel_turns = self.path.len() as u64;
        let open_turns = 1_u64;
        travel_turns + open_turns
    }
}

#[derive(Clone)]
struct State<'a> {
    net: &'a Network,
    position: Name,
    max_turns: u64,
    turn: u64,
    pressure: u64,
    open_valves: HashSet<Name>,
}

impl State<'_> {
    fn turns_left(&self) -> u64 {
        self.max_turns - self.turn
    }

    /// Compute all moves and expected reward (pressure contributed till time
    /// runs out if we travel to it and open it now)
    fn moves(&self) -> Vec<Move> {
        self.net
            .connections(self.position)
            .into_iter()
            .filter_map(|(name, path)| {
                if self.open_valves.contains(&name) {
                    return None;
                }

                let flow = self.net.valves[&name].flow;
                if flow == 0 {
                    return None;
                }

                let travel_turns = path.len() as u64;
                let open_turns = 1_u64;
                let turns_spent_open = self.turns_left().checked_sub(travel_turns + open_turns)?;
                let reward = flow * turns_spent_open;
                Some(Move {
                    reward,
                    target: name,
                    path,
                })
            })
            .collect()
    }
}

fn main() {
    let net = Network::new();
    let state = State {
        net: &net,
        position: Name(*b"AA"),
        max_turns: 30,
        turn: 0,
        pressure: 0,
        open_valves: Default::default(),
    };

    println!("From AA:");
    for mv in state.moves() {
        println!("Can do {mv:?}");
    }
}
Shell session
$ cargo run
   Compiling day16 v0.1.0 (/home/amos/bearcove/aoc2022/day16)
    Finished dev [unoptimized + debuginfo] target(s) in 0.47s
     Running `target/debug/day16`
From AA:
Can do Move { reward: 364, target: BB, path: [(AA, BB)] }
Can do Move { reward: 54, target: CC, path: [(AA, BB), (BB, CC)] }
Can do Move { reward: 528, target: HH, path: [(AA, DD), (DD, EE), (EE, FF), (FF, GG), (GG, HH)] }
Can do Move { reward: 560, target: DD, path: [(AA, DD)] }
Can do Move { reward: 81, target: EE, path: [(AA, DD), (DD, EE)] }
Can do Move { reward: 567, target: JJ, path: [(AA, II), (II, JJ)] }

According to this, the first step we should take is traveling to JJ. The sample in the problem statement, however, disagrees:

== Minute 1 ==
No valves are open.
You move to valve DD.

== Minute 2 ==
No valves are open.
You open valve DD.

(cut)

So, we're probably going into a dead-end (this is what I get for doing these legit), but let's follow that path (heh) anyway and see what our solution looks like: we can always brute-force later.

Greedily simulating moves

With all we've written, it's relatively simple to simulate taking a move:

Rust code
// in `src/main.rs`

impl State<'_> {
    /// Apply a given move
    fn apply(&self, mv: &Move) -> Self {
        let mut next = self.clone();
        next.position = mv.target;
        next.turn += mv.cost();
        next.pressure += mv.reward;
        next.open_valves.insert(mv.target);
        next
    }
}

This returns a copy of State: I have a feeling this'll be handy later.

And then we can just be greedy and perform moves with the maximum expected reward:

Rust code
fn main() {
    let net = Network::new();
    let mut state = State {
        net: &net,
        position: Name(*b"AA"),
        max_turns: 30,
        turn: 0,
        open_valves: Default::default(),
        pressure: 0,
    };

    loop {
        let moves = state.moves();
        if moves.is_empty() {
            break;
        }
        let mv = moves
            .iter()
            .max_by_key(|mv| mv.reward)
            .expect("moves is not empty");
        println!(
            "Turn {}, at {:?}, doing {mv:?}",
            state.turn + 1,
            state.position
        );
        state = state.apply(mv);
    }

    println!("Final pressure: {}", state.pressure);
}
Shell session
$ cargo run --quiet
Turn 1, at AA, doing Move { reward: 567, target: JJ, path: [(AA, II), (II, JJ)] }
Turn 4, at JJ, doing Move { reward: 460, target: DD, path: [(JJ, II), (II, AA), (AA, DD)] }
Turn 8, at DD, doing Move { reward: 396, target: HH, path: [(DD, EE), (EE, FF), (FF, GG), (GG, HH)] }
Turn 13, at HH, doing Move { reward: 143, target: BB, path: [(HH, GG), (GG, FF), (FF, EE), (EE, DD), (DD, AA), (AA, BB)] }
Turn 20, at BB, doing Move { reward: 21, target: EE, path: [(BB, AA), (AA, DD), (DD, EE)] }
Turn 24, at EE, doing Move { reward: 8, target: CC, path: [(EE, DD), (DD, CC)] }
Final pressure: 1595

Well, you always hope you'll be the one to find a mistake in the (heavily-reviewed) problem statement (12 days after it went live), but it's never the case.

Our greedy solution of 1595 is worse than the sample solution of 1651.

Simulating all possible moves

Very well then: it doesn't matter that we can't tell which move is truly the best. We already have "all possible moves", so we can "simply" simulate all of them.

And to keep memory usage under control, this time we can do so depth-first!

Rust code
impl State<'_> {
    fn find_best_moves(&self) -> (Self, Vec<Move>) {
        let mut best_moves = vec![];
        let mut best_state = self.clone();
        let mut best_pressure = 0;

        let mut moves = self.moves();
        moves.sort_by_key(|mv| mv.reward);
        moves.reverse();

        for mv in moves {
            let next = self.apply(&mv);
            let (next, mut next_moves) = next.find_best_moves();
            next_moves.push(mv);
            if next.pressure > best_pressure {
                best_pressure = next.pressure;
                best_moves = next_moves;
                best_state = next;
            }
        }

        (best_state, best_moves)
    }
}

// later, in `main`:

fn main() {
    let net = Network::new();
    let state = State {
        net: &net,
        position: Name(*b"AA"),
        max_turns: 30,
        turn: 0,
        pressure: 0,
        open_valves: Default::default(),
    };

    let (state, moves) = state.find_best_moves();
    println!("moves = {:?}, final pressure = {}", moves, state.pressure);
}

This seems to work for the sample input:

Shell session
# equivalent of `--quiet --release`
$ cargo run -qr
moves = [Move { reward: 12, target: CC, path: [(EE, DD), (DD, CC)] }, Move { reward: 27, target: EE, path: [(HH, GG), (GG, FF), (FF, EE)] }, Move { reward: 286, target: HH, path: [(JJ, II), (II, AA), (AA, DD), (DD, EE), (EE, FF), (FF, GG), (GG, HH)] }, Move { reward: 441, target: JJ, path: [(BB, AA), (AA, II), (II, JJ)] }, Move { reward: 325, target: BB, path: [(DD, AA), (AA, BB)] }, Move { reward: 560, target: DD, path: [(AA, DD)] }], final pressure = 1651

And it works for the real input, too!

Shell session
$ cargo build --release && time ./target/release/day16 
    Finished release [optimized] target(s) in 0.00s
moves = [Move { reward: 18, target: BF, path: [(SZ, VB), (VB, BF)] }, Move { reward: 96, target: SZ, path: [(QH, SO), (SO, SZ)] }, Move { reward: 140, target: QH, path: [(JG, YZ), (YZ, WW), (WW, QH)] }, Move { reward: 110, target: JG, path: [(JF, QF), (QF, PL), (PL, JG)] }, Move { reward: 285, target: JF, path: [(IU, RV), (RV, JF)] }, Move { reward: 450, target: IU, path: [(XF, TE), (TE, HE), (HE, IU)] }, Move { reward: 484, target: XF, path: [(IY, LW), (LW, KD), (KD, XF)] }, Move { reward: 364, target: IY, path: [(AA, UK), (UK, UY), (UY, IY)] }], final pressure = 1947
./target/release/day16  4,48s user 0,00s system 99% cpu 4,476 total

Code review

But you want to know a secret? This implementation was generated entirely by GitHub Copilot:

Rust code
impl State<'_> {
    fn find_best_moves(&self) -> (Self, Vec<Move>) {
        let mut best_moves = vec![];
        let mut best_state = self.clone();
        let mut best_pressure = 0;

        let mut moves = self.moves();
        moves.sort_by_key(|mv| mv.reward);
        moves.reverse();

        for mv in moves {
            let next = self.apply(&mv);
            let (next, mut next_moves) = next.find_best_moves();
            next_moves.push(mv);
            if next.pressure > best_pressure {
                best_pressure = next.pressure;
                best_moves = next_moves;
                best_state = next;
            }
        }

        (best_state, best_moves)
    }
}

That's right. "All I did" was provide the data structures and base methods, with consistent naming conventions, and an ML model was able to fill in the gaps.

Of course I'm not exactly happy with that code: but it's fairly easy to refactor.

First off, we don't really need to sort moves: since we explore every branch and keep the best outcome anyway, the order in which we try them doesn't matter:

Rust code
impl State<'_> {
    fn find_best_moves(&self) -> (Self, Vec<Move>) {
        let mut best_moves = vec![];
        let mut best_state = self.clone();
        let mut best_pressure = 0;

        for mv in self.moves() {
            let next = self.apply(&mv);
            let (next, mut next_moves) = next.find_best_moves();
            next_moves.push(mv);
            if next.pressure > best_pressure {
                best_pressure = next.pressure;
                best_moves = next_moves;
                best_state = next;
            }
        }

        (best_state, best_moves)
    }
}

And at that point, why not have moves return an Iterator instead of a Vec?

Rust code
impl State<'_> {
    // 👋 the `+ '_` is the tricky bit here, since the Iterator
    // borrows `&self.net`
    fn moves(&self) -> impl Iterator<Item = Move> + '_ {
        self.net
            .connections(self.position)
            .into_iter()
            .filter_map(|(name, path)| {
                if self.open_valves.contains(&name) {
                    return None;
                }

                let flow = self.net.valves[&name].flow;
                if flow == 0 {
                    return None;
                }

                let travel_turns = path.len() as u64;
                let open_turns = 1_u64;
                let turns_spent_open = self.turns_left().checked_sub(travel_turns + open_turns)?;
                let reward = flow * turns_spent_open;
                Some(Move {
                    reward,
                    target: name,
                    path,
                })
            })
            // there's just no `.collect()` at the end!
    }
}

(No code changes are required in find_best_moves)

Next up, we don't need to store best_pressure separately, since it's already part of state:

Rust code
impl State<'_> {
    fn find_best_moves(&self) -> (Self, Vec<Move>) {
        let mut best_moves = vec![];
        let mut best_state = self.clone();

        for mv in self.moves() {
            let next = self.apply(&mv);
            let (next, mut next_moves) = next.find_best_moves();
            next_moves.push(mv);
            if next.pressure > best_state.pressure {
                best_moves = next_moves;
                best_state = next;
            }
        }

        (best_state, best_moves)
    }
}

And here's another problem with it: the moves are backwards!

Rust code
    // with sample input

    let (state, moves) = state.find_best_moves();
    println!("final pressure = {}", state.pressure);
    for m in &moves {
        println!(" - {m:?}")
    }
Shell session
$ cargo run
   Compiling day16 v0.1.0 (/home/amos/bearcove/aoc2022/day16)
    Finished dev [unoptimized + debuginfo] target(s) in 0.43s
     Running `target/debug/day16`
final pressure = 1651
 - Move { reward: 12, target: CC, path: [(EE, DD), (DD, CC)] }
 - Move { reward: 27, target: EE, path: [(HH, GG), (GG, FF), (FF, EE)] }
 - Move { reward: 286, target: HH, path: [(JJ, II), (II, AA), (AA, DD), (DD, EE), (EE, FF), (FF, GG), (GG, HH)] }
 - Move { reward: 441, target: JJ, path: [(BB, AA), (AA, II), (II, JJ)] }
 - Move { reward: 325, target: BB, path: [(DD, AA), (AA, BB)] }
 - Move { reward: 560, target: DD, path: [(AA, DD)] }

Now that's not a big issue, but Copilot did generate incorrect code: had I written this myself, I might have used a VecDeque or some other data structure to keep the moves in order.

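If we cared about the order, one small fix (just a sketch, not what I ended up doing) would be to reverse the list once at the call site, since the recursion pushes moves leaf-first:

Rust code
    // hypothetical tweak in `main`: flip the list so it prints chronologically
    let (state, mut moves) = state.find_best_moves();
    moves.reverse();
    println!("final pressure = {}", state.pressure);
    for m in &moves {
        println!(" - {m:?}")
    }
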
Come to think of it, we don't really need to store all the moves to get the solution! Maybe that'll make it run faster?

Rust code
impl State<'_> {
    fn apply_best_moves(&self) -> Self {
        let mut best_state = self.clone();

        for mv in self.moves() {
            let next = self.apply(&mv).apply_best_moves();
            if next.pressure > best_state.pressure {
                best_state = next;
            }
        }
        best_state
    }
}
Rust code
fn main() {
    let net = Network::new();
    let state = State {
        net: &net,
        position: Name(*b"AA"),
        max_turns: 30,
        turn: 0,
        pressure: 0,
        open_valves: Default::default(),
    };

    let state = state.apply_best_moves();
    println!("final pressure = {}", state.pressure);
}
Shell session
$ cargo build --release && time ./target/release/day16
   Compiling day16 v0.1.0 (/home/amos/bearcove/aoc2022/day16)
    Finished release [optimized] target(s) in 0.59s
final pressure = 1947
./target/release/day16  4,53s user 0,00s system 99% cpu 4,531 total

Not really.

Now I'm curious: how many times are we even calling State::moves?

Rust code
use std::sync::atomic::AtomicU64;

static MOVES_CALLED: AtomicU64 = AtomicU64::new(0);

impl State<'_> {
    fn moves(&self) -> impl ... {
        MOVES_CALLED.fetch_add(1, std::sync::atomic::Ordering::SeqCst);

        // etc.
    }
}

fn main() {
    // cut: everything

    println!(
        "State::moves called {} times",
        MOVES_CALLED.load(std::sync::atomic::Ordering::SeqCst)
    );
}
Shell session
# with the real puzzle input
$ cargo build --release && time ./target/release/day16
   Compiling day16 v0.1.0 (/home/amos/bearcove/aoc2022/day16)
    Finished release [optimized] target(s) in 0.59s
final pressure = 1947
State::moves called 257798
./target/release/day16  4,55s user 0,01s system 100% cpu 4,557 total

Maybe we can optimize that a little bit.

Profiling our code with the Valgrind suite

Callgrind is a "call-graph generating cache and branch prediction profiler", and it's fairly heavyweight.

I'll be doing this on Linux, the easiest platform to run Valgrind on.

We'll need symbols for that:

TOML markup
# in `Cargo.toml`

[profile.release]
debug = 1
Shell session
$ cargo build --release
(everything gets rebuilt)

$ valgrind --tool=callgrind ./target/release/day16
==10655== Callgrind, a call-graph generating cache profiler
==10655== Copyright (C) 2002-2017, and GNU GPL'd, by Josef Weidendorfer et al.
==10655== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==10655== Command: ./target/release/day16
==10655== 
==10655== For interactive control, run 'callgrind_control -h'.

This takes a long time. The docs mention that callgrind/cachegrind, since they emulate the CPU, run the program around 50 times slower. So we can expect a result in... a few minutes.

There. It finished. It generated a file:

Shell session
$ ll callgrind*
-rw------- 1 amos amos 142K déc.  28 19:13 callgrind.out.10655

We can use a frontend like kcachegrind to visualize the profile:

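(Something like this, pointing it at the output file from earlier:)

Shell session
$ kcachegrind callgrind.out.10655
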
We can even visualize a call graph!

So there's a bunch of memory operations happening, hash tables being resized (which forces re-hashing all their elements), and Vec data structures being built.

Fair enough. This doesn't show State::moves or Network::connections calls, probably because they're inlined?

I don't think we learned much from this particular run, but it's always nice to show off some tooling.

Caching moves

One way to make our code faster (still for part 1) is to stop re-doing the same work over and over. The generic name for this technique is memoization, and there are a few crates for it, but we don't really need a generic solution here.

Let's store all connections as part of the Network itself:

Rust code
// in `src/main.rs`

// Making a newtype to represent flow because it's used in a tuple later, and a
// lone `u64` isn't really semantic, now is it.
#[derive(Clone, Copy, PartialEq, Eq)]
struct Flow(u64);

type Connections = HashMap<Name, (Path, Flow)>;

pub struct Network {
    valves: HashMap<Name, (Valve, Connections)>,
}

Network's impl becomes this:

Rust code
impl Network {
    #![allow(clippy::new_without_default)]
    pub fn new() -> Self {
        let mut net = Self {
            valves: include_str!("input.txt")
                .lines()
                .map(|l| all_consuming(parse::Valve::parse)(l).finish().unwrap().1)
                // start off with zero connections (since we're still parsing)
                .map(|valve| (valve.name, (valve, Connections::default())))
                .collect(),
        };
        let names = net.valves.keys().copied().collect::<Vec<_>>();
        for name in names {
            // fill in the connections as needed
            let conns = net.connections(name);
            net.valves.get_mut(&name).unwrap().1 = conns;
        }
        net
    }

    /// Given a valve name, return a list of valves we can travel to, along
    /// with the path to get there, and their flow.
    ///
    /// Only the shortest paths are considered, so the search ends.
    fn connections(&self, start: Name) -> Connections {
        // this used to be just `HashMap<Name, Path>`, it's a bit hairier now
        let mut current: HashMap<Name, (Path, Flow)> = Default::default();
        {
            let valve = &self.valves[&start].0;
            current.insert(start, (vec![], Flow(valve.flow)));
        }

        let mut connections = current.clone();

        while !current.is_empty() {
            let mut next: HashMap<Name, (Path, Flow)> = Default::default();
            for (name, (path, _flow)) in current {
                for link in self.valves[&name].0.links.iter().copied() {
                    let valve = &self.valves[&link].0;
                    if let Entry::Vacant(e) = connections.entry(link) {
                        let conn_path: Path = path
                            .iter()
                            .copied()
                            .chain(std::iter::once((name, link)))
                            .collect();
                        let item = (conn_path.clone(), Flow(valve.flow));
                        e.insert(item.clone());
                        next.insert(link, item);
                    }
                }
            }
            current = next;
        }

        connections
    }
}

And we can change State::moves to use it. Because all the possible paths are now stored in the Network (which State borrows), we can have Move borrow from it:

Rust code
// 👋 now with `<'a>`
#[derive(Debug, Clone)]
struct Move<'a> {
    reward: u64,
    target: Name,
    //    👇 now borrowed
    path: &'a Path,
}

//  needs 👇
impl Move<'_> {
    fn cost(&self) -> u64 {
        let travel_turns = self.path.len() as u64;
        let open_turns = 1_u64;
        travel_turns + open_turns
    }
}

And then:

Rust code
impl State<'_> {
    /// Compute all moves and expected reward (pressure contributed till time
    /// runs out if we travel to it and open it now)
    fn moves(&self) -> impl Iterator<Item = Move> + '_ {
        // note: I removed the `MOVES_CALLED` business

        let (_valve, connections) = &self.net.valves[&self.position];
        connections.iter().filter_map(|(name, (path, flow))| {
            if self.open_valves.contains(name) {
                return None;
            }

            if flow.0 == 0 {
                return None;
            }

            let travel_turns = path.len() as u64;
            let open_turns = 1_u64;
            let turns_spent_open = self.turns_left().checked_sub(travel_turns + open_turns)?;
            let reward = flow.0 * turns_spent_open;
            Some(Move {
                reward,
                target: *name,
                path,
            })
        })
    }
}

And boom, it's fast:

Shell session
$ cargo build --release && time ./target/release/day16
   Compiling day16 v0.1.0 (/home/amos/bearcove/aoc2022/day16)
    Finished release [optimized + debuginfo] target(s) in 0.65s
final pressure = 1947
./target/release/day16  0,31s user 0,00s system 99% cpu 0,306 total

Running callgrind on it again indicates that we're now spending a lot of cycles hashing stuff:

Shell session
$ callgrind_annotate callgrind.out.11770 | grep 'file:function' -A 10
Ir                      file:function
--------------------------------------------------------------------------------
1,162,174,008 (28.18%)  /rustc/b70baa4f922a1809d79caeaeb902800c3be283b9/library/core/src/hash/sip.rs:_ZN81_$LT$std..collections..hash..map..DefaultHasher$u20$as$u20$core..hash..Hasher$GT$5write17hd317e39715b716a1E.llvm.8639191835063113673
  703,851,864 (17.07%)  /rustc/b70baa4f922a1809d79caeaeb902800c3be283b9/library/core/src/num/uint_macros.rs:core::hash::BuildHasher::hash_one
  523,796,736 (12.70%)  /rustc/b70baa4f922a1809d79caeaeb902800c3be283b9/library/core/src/hash/sip.rs:core::hash::BuildHasher::hash_one
  361,111,980 ( 8.76%)  /cargo/registry/src/github.com-1ecc6299db9ec823/hashbrown-0.12.3/src/raw/mod.rs:day16::State::apply_best_moves'2
  261,898,368 ( 6.35%)  /rustc/b70baa4f922a1809d79caeaeb902800c3be283b9/library/core/src/hash/mod.rs:core::hash::BuildHasher::hash_one [/home/amos/bearcove/aoc2022/day16/target/release/day16]
  179,168,705 ( 4.34%)  /cargo/registry/src/github.com-1ecc6299db9ec823/hashbrown-0.12.3/src/map.rs:day16::State::apply_best_moves'2
  163,686,480 ( 3.97%)  /rustc/b70baa4f922a1809d79caeaeb902800c3be283b9/library/core/src/num/uint_macros.rs:_ZN81_$LT$std..collections..hash..map..DefaultHasher$u20$as$u20$core..hash..Hasher$GT$5write17hd317e39715b716a1E.llvm.8639191835063113673
  117,252,284 ( 2.84%)  /cargo/registry/src/github.com-1ecc6299db9ec823/hashbrown-0.12.3/src/raw/bitmask.rs:day16::State::apply_best_moves'2
   98,211,888 ( 2.38%)  /rustc/b70baa4f922a1809d79caeaeb902800c3be283b9/library/std/src/collections/hash/map.rs:_ZN81_$LT$std..collections..hash..map..DefaultHasher$u20$as$u20$core..hash..Hasher$GT$5write17hd317e39715b716a1E.llvm.8639191835063113673 [/home/amos/bearcove/aoc2022/day16/target/release/day16]

And: yes we are! I bet it's that one line:

Rust code
        let (_valve, connections) = &self.net.valves[&self.position];

i.e. looking up the valve by its name.

Luckily, we can do better than that!

A custom data structure

We store valve names as [u8; 2], which is better than String, but it's still worse than it could be! We could in fact be using arrays rather than hash maps, because both bytes are always upper-case ASCII letters, so the number of possible valve names is 26² = 676.

Using that property, we can do something clever.

Rust code
// in `src/parse.rs`

// as before:
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
pub struct Name(pub [u8; 2]);

impl Name {
    /// Returns this name as a usize between 0 and 26**2
    pub fn as_usize(self) -> usize {
        let [a, b] = self.0;
        debug_assert!((b'A'..=b'Z').contains(&a));
        debug_assert!((b'A'..=b'Z').contains(&b));

        (a - b'A') as usize * 26 + (b - b'A') as usize
    }

    /// Returns a name from a usize between 0 and 26**2
    pub fn from_usize(index: usize) -> Self {
        debug_assert!(index < MAX_NAME);
        let a = (index / 26) as u8 + b'A';
        let b = (index % 26) as u8 + b'A';
        Self([a, b])
    }

    // Note: these could both be expressed with `From` / `Into` implementations,
    // but I can't be bothered right now.
}

pub const MAX_NAME: usize = 26_usize.pow(2);

#[derive(Clone)]
pub struct NameMap<T> {
    values: [Option<T>; MAX_NAME],
}

impl<T> NameMap<T> {
    pub fn new() -> Self {
        Self {
            values: std::array::from_fn(|_| None),
        }
    }

    pub fn get(&self, name: Name) -> Option<&T> {
        self.values[name.as_usize()].as_ref()
    }

    pub fn get_mut(&mut self, name: Name) -> Option<&mut T> {
        self.values[name.as_usize()].as_mut()
    }

    pub fn insert(&mut self, name: Name, value: T) {
        self.values[name.as_usize()] = Some(value);
    }

    pub fn contains(&self, name: Name) -> bool {
        self.values[name.as_usize()].is_some()
    }

    pub fn is_empty(&self) -> bool {
        self.values.iter().all(|v| v.is_none())
    }

    pub fn iter(&self) -> impl Iterator<Item = (Name, &T)> {
        self.values
            .iter()
            .enumerate()
            .filter_map(|(i, v)| v.as_ref().map(|v| (Name::from_usize(i), v)))
    }

    pub fn keys(&self) -> impl Iterator<Item = Name> + '_ {
        self.values
            .iter()
            .enumerate()
            .filter_map(|(i, v)| v.as_ref().map(|_| Name::from_usize(i)))
    }
}

impl<T> Default for NameMap<T> {
    fn default() -> Self {
        Self::new()
    }
}

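If you want to double-check the name/index mapping, a quick test does the trick (this one's just for illustration, it's not part of the final code):

Rust code
// in `src/parse.rs`

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn name_round_trips() {
        // "AA" is slot 0, "BA" is slot 26, "ZZ" is the very last slot
        assert_eq!(Name(*b"AA").as_usize(), 0);
        assert_eq!(Name(*b"BA").as_usize(), 26);
        assert_eq!(Name(*b"ZZ").as_usize(), MAX_NAME - 1);

        // and every index round-trips through `from_usize` / `as_usize`
        for i in 0..MAX_NAME {
            assert_eq!(Name::from_usize(i).as_usize(), i);
        }
    }
}
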
Now, we can adjust the type of Connections:

Rust code
// in `src/main.rs`

use parse::NameMap;

type Connections = NameMap<(Path, Flow)>;

And Network::connections:

Rust code
impl Network {
    /// Given a valve name, return a list of valves we can travel to, along
    /// with the path to get there, and their flow.
    ///
    /// Only the shortest paths are considered, so the search ends.
    fn connections(&self, start: Name) -> Connections {
        // some light changes here
        let mut current = Connections::default();
        {
            let valve = &self.valves[&start].0;
            current.insert(start, (vec![], Flow(valve.flow)));
        }

        let mut connections = current.clone();

        while !current.is_empty() {
            // and here
            let mut next = Connections::default();
            for (name, (path, _flow)) in current.iter() {
                // can't use the nice `Entry` API anymore:
                for link in self.valves[&name].0.links.iter().copied() {
                    let valve = &self.valves[&link].0;
                    if !connections.contains(link) {
                        let conn_path: Path = path
                            .iter()
                            .copied()
                            .chain(std::iter::once((name, link)))
                            .collect();
                        let item = (conn_path.clone(), Flow(valve.flow));
                        connections.insert(link, item.clone());
                        next.insert(link, item);
                    }
                }
            }
            current = next;
        }

        connections
    }
}

Similarly, open_valves can be a NameMap instead of a HashSet:

Rust code
#[derive(Clone)]
struct State<'a> {
    net: &'a Network,
    position: Name,
    max_turns: u64,
    turn: u64,
    pressure: u64,
    // 👇 was: HashSet<Name>
    open_valves: NameMap<()>,
}

The code changes required are relatively minimal. Notably, NameMap::contains takes a Name rather than a &Name (since we know it's Copy anyway), and NameMap::insert takes two arguments, so in State::apply, it becomes:

Rust code
impl State<'_> {
    /// Apply a given move
    fn apply(&self, mv: &Move) -> Self {
        let mut next = self.clone();
        next.position = mv.target;
        next.turn += mv.cost();
        next.pressure += mv.reward;
        // 👇 inserting an empty tuple here
        next.open_valves.insert(mv.target, ());
        next
    }
}

And of course I forgot about Network::valves, so let's change it too:

Rust code
pub struct Network {
    // was: `HashMap<Name, (Valve, Connections)>`
    valves: NameMap<(Valve, Connections)>,
}

The constructor becomes:

Rust code
impl Network {
    #![allow(clippy::new_without_default)]
    pub fn new() -> Self {
        let mut net = Self {
            valves: Default::default(),
        };

        // we'd have to implement `FromIterator` if we wanted to use `collect`,
        // let's just do it manually for now
        for valve in include_str!("input.txt")
            .lines()
            .map(|l| all_consuming(parse::Valve::parse)(l).finish().unwrap().1)
        {
            net.valves.insert(valve.name, (valve, Default::default()));
        }

        let names = net.valves.keys().collect::<Vec<_>>();
        for name in names {
            let conns = net.connections(name);
            net.valves.get_mut(name).unwrap().1 = conns;
        }
        net
    }
}

And then a few lines need changing, like:

Rust code
    // before
    let valve = &self.valves[&start].0;

    // after
    let valve = &self.valves.get(start).unwrap().0;

etc.

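Same idea in State::moves, for instance (your exact diff may vary):

Rust code
    // before
    let (_valve, connections) = &self.net.valves[&self.position];

    // after
    let (_valve, connections) = self.net.valves.get(self.position).unwrap();
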
Unfortunately, this version of the program overflows its stack:

Shell session
$ cargo run --release
    Finished release [optimized + debuginfo] target(s) in 0.00s
     Running `target/release/day16`

thread 'main' has overflowed its stack
fatal runtime error: stack overflow
[1]    19955 IOT instruction (core dumped)  cargo run --release

My instinct is that it tries to allocate too much in one go, rather than there actually being infinite recursion or something.

Let's find out! Following Nicholas Nethercote's excellent Rust Performance Book:

Shell session
$ RUSTFLAGS=-Zprint-type-sizes cargo +nightly build --release
print-type-size type: `Network`: 14650272 bytes, alignment: 8 bytes
print-type-size     field `.valves`: 14650272 bytes

Mkay.
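
That number checks out, roughly: Connections is a 676-slot array of Option<(Path, Flow)>, where each slot is 32 bytes (24 for the Vec, whose null-pointer niche absorbs the Option discriminant, plus 8 for Flow), so about 21.6 KiB per valve. Add ~40 bytes for the Valve itself, multiply by the 676 slots of the outer NameMap, and you land on 21,672 × 676 = 14,650,272 bytes, all stored inline.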

Well! I have 128GB of RAM on this machine, maybe 128MB of stack will be enough?

Shell session
$ ulimit -s unlimited; RUSTFLAGS="-C link-args=-Wl,-zstack-size=134217728" cargo build --release && time ./target/release/day16
    Finished release [optimized + debuginfo] target(s) in 0.00s
final pressure = 1947
./target/release/day16  0,13s user 0,00s system 99% cpu 0,130 total

Oh the crimes!

Yes, but look at how fast it is!

Truth is, we can probably optimize that further, but I didn't have to think too hard about this, and in my book, that's a win.

Because we're going to keep that custom data structure around, let's make a Justfile:

Justfile
# just manual: https://github.com/casey/just#readme

run:
	#!/bin/bash -eux
	ulimit -s unlimited
	export RUSTFLAGS="-C link-args=-Wl,-zstack-size=134217728"
	cargo build --release
	time ./target/release/day16

Part 2

In part 2, we have an elephant buddy. We spend 4 turns teaching them to walk through tunnels and unlock valves, so we have 26 turns left, the two of us, to do the optimal thing.

Well.

You know the annoying bit about that part? It's that we don't really simulate each turn. When we've chosen to "walk that path and unlock that valve", we do all these steps at once.

That doesn't really work if we have two actors, because it might go:

 --------------- time --------------->
|-I do thing-|-I do other thing|
|-elephant does thing-|
                      ^
                    uh oh

And then we're hecked.

After I wrote these words, I wasted an entire day trying to simulate step-by-step. It worked for the sample input, but was way too expensive for the real input.

I tried memoizing everything in sight, to no avail. Eventually, my code failed to allocate 42GiB of RAM (there's only 32GiB allocated to this VM), and it was my cue to stop being a tryhard, for once.

That's growth.

So, for the first time in this Advent of Code, I decided to take a look at others' solutions in search of hints. The reddit thread for day 16 is full of interesting ones; I ended up getting inspiration from Nicky Meuleman's solution.

The basic idea is to keep track of the maximum pressure achieved with any given combination of valves, while running the code for part 1.

Rust code
// maps "valves open" to "best pressure achieved"
type Best = HashMap<NameMap<()>, u64>;

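For NameMap<()> to work as a HashMap key, it needs PartialEq, Eq and Hash; I'm assuming here that we just derive them (which hashes all 676 slots, occupied or not; keep that in mind for later):

Rust code
// in `src/parse.rs`

// assumption: extend the existing derive so `NameMap` can be a `HashMap` key
#[derive(Clone, PartialEq, Eq, Hash)]
pub struct NameMap<T> {
    values: [Option<T>; MAX_NAME],
}
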
In apply_best_moves, we record those:

Rust code
impl State<'_> {
    fn apply_best_moves(&self, best: &mut Best) -> Self {
        let mut best_state = self.clone();

        best.entry(self.open_valves.clone())
            .and_modify(|v| {
                if self.pressure > *v {
                    *v = self.pressure
                }
            })
            .or_insert(self.pressure);

        for mv in self.moves() {
            let next = self.apply(&mv).apply_best_moves(best);
            if next.pressure > best_state.pressure {
                best_state = next;
            }
        }
        best_state
    }
}

And then our main function becomes:

Rust code
fn main() {
    let net = Network::new();
    let p1_state = State {
        net: &net,
        position: Name(*b"AA"),
        max_turns: 30,
        turn: 0,
        pressure: 0,
        open_valves: Default::default(),
    };

    {
        let mut best = Best::default();
        let state = p1_state.apply_best_moves(&mut best);
        println!("part 1 pressure = {}", state.pressure);
    }

    {
        let mut best = Best::default();
        let mut p2_state = p1_state.clone();
        p2_state.max_turns = 26;
        p2_state.apply_best_moves(&mut best);

        let best_pressure = best
            .iter()
            .tuple_combinations()
            .filter(|(human, elephant)| human.0.is_disjoint(elephant.0))
            .map(|(human, elephant)| human.1 + elephant.1)
            .max()
            .unwrap();
        println!("part 2 pressure: {best_pressure}");
    }
}

tuple_combinations comes from our beloved itertools, and I had to implement is_disjoint myself, since we're using a custom data structure:

Rust code
// in `src/parse.rs`

impl<T> NameMap<T> {
    pub fn overlaps(&self, other: &Self) -> bool {
        self.keys().any(|k| other.contains(k))
    }

    pub fn is_disjoint(&self, other: &Self) -> bool {
        !self.overlaps(other) && !other.overlaps(self)
    }
}

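(If you're following along, tuple_combinations needs itertools added as a dependency and its Itertools trait in scope, something like:)

Shell session
$ cargo add itertools
Rust code
// in `src/main.rs`

use itertools::Itertools;
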
The idea is that, once we have all the best combinations, we find two disjoint sets of valves for ourselves and the elephant to open. As far as I can tell, this only works because there's time to travel to and open all of the 15 valves that have a non-zero flow.

Shell session
$ just run
+ ulimit -s unlimited
+ export 'RUSTFLAGS=-C link-args=-Wl,-zstack-size=134217728'
+ RUSTFLAGS='-C link-args=-Wl,-zstack-size=134217728'
+ cargo build --release
   Compiling day16 v0.1.0 (/home/amos/bearcove/aoc2022/day16b)
    Finished release [optimized + debuginfo] target(s) in 0.66s
+ ./target/release/day16
part 1 pressure = 1947
part 2 pressure: 2556

real    0m2,865s
user    0m2,831s
sys     0m0,031s

Here's something interesting though! For that part of the code, our custom structure is slower than a BTreeSet. (We can't use a HashSet as the key, since HashSet doesn't itself implement Hash; BTreeSet does, and I bet that's because it keeps its keys sorted, making it easy to hash.)

If we simply add this:

Rust code
        let best: HashMap<BTreeSet<_>, _> = best
            .into_iter()
            .map(|(valves, total)| (valves.iter().map(|(k, _v)| k).collect(), total))
            .collect();

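(For Name to live in a BTreeSet it also needs an ordering; I'm assuming we just extend its derive:)

Rust code
// in `src/parse.rs`

// assumption: add `PartialOrd` / `Ord` so `Name` can be a `BTreeSet` element
#[derive(Clone, Copy, PartialEq, Eq, Hash, PartialOrd, Ord)]
pub struct Name(pub [u8; 2]);
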
The runtime is halved!

Shell session
$ just run
(cut)
+ ./target/release/day16
part 1 pressure = 1947
part 2 pressure: 2556

real    0m1,429s
user    0m1,415s
sys     0m0,014s

And that's it! This isn't the fastest solution by any stretch of the imagination, but I'm happy with it for now.

If you want to get all galaxy brain, check out Orson Peters' solution, which uses the Floyd–Warshall algorithm.