Thanks to my sponsors: Nicolas Riebesel, Andrew Neth, Guy Waldman, Sarah Berrettini, Ivo Murrell, Dom, Matt Jackson, notryanb, old.woman.josiah, Torben Clasen, Cass, Luiz Irber, Sean Bryant, Mikkel Rasmussen, std__mpa, belzael, Mike Cripps, Max Bruckner, Moritz Lammerich, Jake Demarest-Mays and 230 more
Day 16 (Advent of Code 2022)
Let's tackle the day 16 puzzle!
Parsing
The input looks like this:
Valve AA has flow rate=0; tunnels lead to valves DD, II, BB
Valve BB has flow rate=13; tunnels lead to valves CC, AA
Valve CC has flow rate=2; tunnels lead to valves DD, BB
Valve DD has flow rate=20; tunnels lead to valves CC, AA, EE
Valve EE has flow rate=3; tunnels lead to valves FF, DD
Valve FF has flow rate=0; tunnels lead to valves EE, GG
Valve GG has flow rate=0; tunnels lead to valves FF, HH
Valve HH has flow rate=22; tunnel leads to valve GG
Valve II has flow rate=0; tunnels lead to valves AA, JJ
Valve JJ has flow rate=21; tunnel leads to valve II
And we can stuff it in src/input-sample.txt
.
Let's grab nom for parsing:
$ cargo add nom@7
// in `src/parse.rs`
use std::fmt;
use nom::{
branch::alt,
bytes::complete::{tag, take},
combinator::map,
multi::separated_list1,
sequence::{preceded, tuple},
IResult,
};
// Valve names are always two characters: this is more compact than a `String`,
// doesn't involves heap allocations, and gives us a `Copy` type, which is
// really convenient.
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
pub struct Name([u8; 2]);
impl fmt::Debug for Name {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
let [a, b] = self.0;
write!(f, "{}{}", a as char, b as char)
}
}
impl fmt::Display for Name {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
fmt::Debug::fmt(self, f)
}
}
impl Name {
fn parse(i: &str) -> IResult<&str, Self> {
map(take(2usize), |slice: &str| {
Self(slice.as_bytes().try_into().unwrap())
})(i)
}
}
#[derive(Debug)]
pub struct Valve {
pub name: Name,
pub flow: u64,
pub links: Vec<Name>,
}
impl Valve {
pub fn parse(i: &str) -> IResult<&str, Self> {
// example input:
// Valve AA has flow rate=0; tunnels lead to valves DD, II, BB
map(
tuple((
preceded(tag("Valve "), Name::parse),
preceded(tag(" has flow rate="), nom::character::complete::u64),
preceded(
alt((
tag("; tunnels lead to valves "),
tag("; tunnel leads to valve "),
)),
separated_list1(tag(", "), Name::parse),
),
)),
|(name, flow, links)| Self { name, flow, links },
)(i)
}
}
And try that code out:
use nom::{combinator::all_consuming, Finish};
mod parse;
fn main() {
let valves: Vec<_> = include_str!("input-sample.txt")
.lines()
.map(|l| all_consuming(parse::Valve::parse)(l).finish().unwrap().1)
.collect();
for valve in &valves {
println!("{valve:?}");
}
}
$ cargo run
Compiling day16 v0.1.0 (/home/amos/bearcove/aoc2022/day16)
Finished dev [unoptimized + debuginfo] target(s) in 0.37s
Running `target/debug/day16`
Valve { name: AA, flow: 0, links: [DD, II, BB] }
Valve { name: BB, flow: 13, links: [CC, AA] }
Valve { name: CC, flow: 2, links: [DD, BB] }
Valve { name: DD, flow: 20, links: [CC, AA, EE] }
Valve { name: EE, flow: 3, links: [FF, DD] }
Valve { name: FF, flow: 0, links: [EE, GG] }
Valve { name: GG, flow: 0, links: [FF, HH] }
Valve { name: HH, flow: 22, links: [GG] }
Valve { name: II, flow: 0, links: [AA, JJ] }
Valve { name: JJ, flow: 21, links: [II] }
Looking good!
Breadth-first search
We start at valve AA, and we have 30 "turns" (the statement calls them minutes).
Traveling to another valve (connected by a tunnel) takes a turn, opening a valve
takes a turn. Every turn, opened valves contribute flow
to the pressure
released.
It seems like we'll want to do a little graph traversal here. My idea is: knowing how many turns we have left, find the valve we should open, or in other words, the valve that, once we travel to it and open it, will release the most pressure total.
For convenience, we'll want to have our Valve
values indexed by their Name
rather than in a Vec
. We can also make a Network
type that will be immutable,
and contain all valves.
use std::collections::{hash_map::Entry, HashMap};
use nom::{combinator::all_consuming, Finish};
use parse::{Name, Valve};
mod parse;
pub struct Network {
valves: HashMap<Name, Valve>,
}
pub type Path = Vec<(Name, Name)>;
impl Network {
#![allow(clippy::new_without_default)]
pub fn new() -> Self {
Self {
valves: include_str!("input-sample.txt")
.lines()
.map(|l| all_consuming(parse::Valve::parse)(l).finish().unwrap().1)
.map(|valve| (valve.name, valve))
.collect(),
}
}
/// Given a valve name, return a list of valves we can travel to, along
/// with the path to get there.
///
/// Only the shortest paths are considered, so the search ends.
pub fn connections(&self, start: Name) -> HashMap<Name, Path> {
let mut current: HashMap<Name, Path> = Default::default();
current.insert(start, vec![]);
let mut connections = current.clone();
while !current.is_empty() {
let mut next: HashMap<Name, Path> = Default::default();
for (name, path) in current {
for link in self.valves[&name].links.iter().copied() {
if let Entry::Vacant(e) = connections.entry(link) {
let conn_path: Path = path
.iter()
.copied()
.chain(std::iter::once((name, link)))
.collect();
e.insert(conn_path.clone());
next.insert(link, conn_path);
}
}
}
current = next;
}
connections
}
}
fn main() {
let net = Network::new();
println!("From AA:");
for (name, path) in net.connections(Name(*b"AA")) {
println!("We can get to {name} using path {path:?}");
}
}
$ cargo run
Finished dev [unoptimized + debuginfo] target(s) in 0.00s
Running `target/debug/day16`
From AA:
We can get to HH using path [(AA, DD), (DD, EE), (EE, FF), (FF, GG), (GG, HH)]
We can get to AA using path []
We can get to BB using path [(AA, BB)]
We can get to CC using path [(AA, BB), (BB, CC)]
We can get to GG using path [(AA, DD), (DD, EE), (EE, FF), (FF, GG)]
We can get to JJ using path [(AA, II), (II, JJ)]
We can get to EE using path [(AA, DD), (DD, EE)]
We can get to FF using path [(AA, DD), (DD, EE), (EE, FF)]
We can get to DD using path [(AA, DD)]
We can get to II using path [(AA, II)]
I'm pretty happy with that!
Highest expected reward
Next up, we should exclude valves that are already open, and calculate the expected reward from "traveling to then opening" each valve.
// in `src/main.rs`
// unchanged: `struct / impl Network`
#[derive(Debug, Clone)]
struct Move {
reward: u64,
target: Name,
path: Path,
}
impl Move {
fn cost(&self) -> u64 {
let travel_turns = self.path.len() as u64;
let open_turns = 1_u64;
travel_turns + open_turns
}
}
#[derive(Clone)]
struct State<'a> {
net: &'a Network,
position: Name,
max_turns: u64,
turn: u64,
pressure: u64,
open_valves: HashSet<Name>,
}
impl State<'_> {
fn turns_left(&self) -> u64 {
self.max_turns - self.turn
}
/// Compute all moves and expected reward (pressure contributed till time
/// runs out if we travel to it and open it now)
fn moves(&self) -> Vec<Move> {
self.net
.connections(self.position)
.into_iter()
.filter_map(|(name, path)| {
if self.open_valves.contains(&name) {
return None;
}
let flow = self.net.valves[&name].flow;
if flow == 0 {
return None;
}
let travel_turns = path.len() as u64;
let open_turns = 1_u64;
let turns_spent_open = self.turns_left().checked_sub(travel_turns + open_turns)?;
let reward = flow * turns_spent_open;
Some(Move {
reward,
target: name,
path,
})
})
.collect()
}
}
fn main() {
let net = Network::new();
let state = State {
net: &net,
position: Name(*b"AA"),
max_turns: 30,
turn: 0,
pressure: 0,
open_valves: Default::default(),
};
println!("From AA:");
for mv in state.moves() {
println!("Can do {mv:?}");
}
}
$ cargo run
Compiling day16 v0.1.0 (/home/amos/bearcove/aoc2022/day16)
Finished dev [unoptimized + debuginfo] target(s) in 0.47s
Running `target/debug/day16`
From AA:
Can do Move { reward: 364, target: BB, path: [(AA, BB)] }
Can do Move { reward: 54, target: CC, path: [(AA, BB), (BB, CC)] }
Can do Move { reward: 528, target: HH, path: [(AA, DD), (DD, EE), (EE, FF), (FF, GG), (GG, HH)] }
Can do Move { reward: 560, target: DD, path: [(AA, DD)] }
Can do Move { reward: 81, target: EE, path: [(AA, DD), (DD, EE)] }
Can do Move { reward: 567, target: JJ, path: [(AA, II), (II, JJ)] }
According to this, the first step we should take is traveling to JJ
. The
sample in the problem statement, however, disagrees:
== Minute 1 ==
No valves are open.
You move to valve DD.
== Minute 2 ==
No valves are open.
You open valve DD.
(cut)
So, we're probably going into a dead-end (this is what I get for doing these legit), but let's follow that path (heh) anyway and see what our solution looks like: we can always brute-force later.
Greedily simulating moves
With all we've written, it's relatively simple to simulate taking a move:
// in `src/main.rs`
impl State<'_> {
/// Apply a given move
fn apply(&self, mv: &Move) -> Self {
let mut next = self.clone();
next.position = mv.target;
next.turn += mv.cost();
next.pressure += mv.reward;
next.open_valves.insert(mv.target);
next
}
}
This returns a copy of State
: I have a feeling this'll be handy later.
And then we can just be greedy and perform moves with the maximum expected reward:
fn main() {
let net = Network::new();
let mut state = State {
net: &net,
position: Name(*b"AA"),
max_turns: 30,
turn: 0,
open_valves: Default::default(),
pressure: 0,
};
loop {
let moves = state.moves();
if moves.is_empty() {
break;
}
let mv = moves
.iter()
.max_by_key(|mv| mv.reward)
.expect("moves is not empty");
println!(
"Turn {}, at {:?}, doing {mv:?}",
state.turn + 1,
state.position
);
state = state.apply(mv);
}
println!("Final pressure: {}", state.pressure);
}
$ cargo run --quiet
Turn 1, at AA, doing Move { reward: 567, target: JJ, path: [(AA, II), (II, JJ)] }
Turn 4, at JJ, doing Move { reward: 460, target: DD, path: [(JJ, II), (II, AA), (AA, DD)] }
Turn 8, at DD, doing Move { reward: 396, target: HH, path: [(DD, EE), (EE, FF), (FF, GG), (GG, HH)] }
Turn 13, at HH, doing Move { reward: 143, target: BB, path: [(HH, GG), (GG, FF), (FF, EE), (EE, DD), (DD, AA), (AA, BB)] }
Turn 20, at BB, doing Move { reward: 21, target: EE, path: [(BB, AA), (AA, DD), (DD, EE)] }
Turn 24, at EE, doing Move { reward: 8, target: CC, path: [(EE, DD), (DD, CC)] }
Final pressure: 1595
Well, you always hope you'll be the one to find a mistake in the (heavily-reviewed) problem statement (12 days after it went live), but it's never the case.
Our greedy solution of 1595
is worse than the sample solution of 1651
.
Simulating all possible moves
Very well then, doesn't matter if we can't tell what move is truly the best, we already have "all possible moves", so we can "simply" simulate all of them.
And to keep memory usage under control, this time we can do so depth-first!
impl State<'_> {
fn find_best_moves(&self) -> (Self, Vec<Move>) {
let mut best_moves = vec![];
let mut best_state = self.clone();
let mut best_pressure = 0;
let mut moves = self.moves();
moves.sort_by_key(|mv| mv.reward);
moves.reverse();
for mv in moves {
let next = self.apply(&mv);
let (next, mut next_moves) = next.find_best_moves();
next_moves.push(mv);
if next.pressure > best_pressure {
best_pressure = next.pressure;
best_moves = next_moves;
best_state = next;
}
}
(best_state, best_moves)
}
}
// later, in `main`:
fn main() {
let net = Network::new();
let state = State {
net: &net,
position: Name(*b"AA"),
max_turns: 30,
turn: 0,
pressure: 0,
open_valves: Default::default(),
};
let (state, moves) = state.find_best_moves();
println!("moves = {:?}, final pressure = {}", moves, state.pressure);
}
This seems to work for the sample input:
# equivalent of `--quiet --release`
$ cargo run -qr
moves = [Move { reward: 12, target: CC, path: [(EE, DD), (DD, CC)] }, Move { reward: 27, target: EE, path: [(HH, GG), (GG, FF), (FF, EE)] }, Move { reward: 286, target: HH, path: [(JJ, II), (II, AA), (AA, DD), (DD, EE), (EE, FF), (FF, GG), (GG, HH)] }, Move { reward: 441, target: JJ, path: [(BB, AA), (AA, II), (II, JJ)] }, Move { reward: 325, target: BB, path: [(DD, AA), (AA, BB)] }, Move { reward: 560, target: DD, path: [(AA, DD)] }], final pressure = 1651
And it works for the real input, too!
$ cargo build --release && time ./target/release/day16
Finished release [optimized] target(s) in 0.00s
moves = [Move { reward: 18, target: BF, path: [(SZ, VB), (VB, BF)] }, Move { reward: 96, target: SZ, path: [(QH, SO), (SO, SZ)] }, Move { reward: 140, target: QH, path: [(JG, YZ), (YZ, WW), (WW, QH)] }, Move { reward: 110, target: JG, path: [(JF, QF), (QF, PL), (PL, JG)] }, Move { reward: 285, target: JF, path: [(IU, RV), (RV, JF)] }, Move { reward: 450, target: IU, path: [(XF, TE), (TE, HE), (HE, IU)] }, Move { reward: 484, target: XF, path: [(IY, LW), (LW, KD), (KD, XF)] }, Move { reward: 364, target: IY, path: [(AA, UK), (UK, UY), (UY, IY)] }], final pressure = 1947
./target/release/day16 4,48s user 0,00s system 99% cpu 4,476 total
Code review
But you want to know a secret? This implementation was generated entirely by GitHub Copilot:
impl State<'_> {
fn find_best_moves(&self) -> (Self, Vec<Move>) {
let mut best_moves = vec![];
let mut best_state = self.clone();
let mut best_pressure = 0;
let mut moves = self.moves();
moves.sort_by_key(|mv| mv.reward);
moves.reverse();
for mv in moves {
let next = self.apply(&mv);
let (next, mut next_moves) = next.find_best_moves();
next_moves.push(mv);
if next.pressure > best_pressure {
best_pressure = next.pressure;
best_moves = next_moves;
best_state = next;
}
}
(best_state, best_moves)
}
}
That's right. "All I did" was provide the data structure, base methods, with consistent naming conventions, and an ML model was able to fill in the gaps.
Of course I'm not exactly happy with that code: but it's fairly easy to refactor.
First off, we don't really need to sort moves:
impl State<'_> {
fn find_best_moves(&self) -> (Self, Vec<Move>) {
let mut best_moves = vec![];
let mut best_state = self.clone();
let mut best_pressure = 0;
for mv in self.moves() {
let next = self.apply(&mv);
let (next, mut next_moves) = next.find_best_moves();
next_moves.push(mv);
if next.pressure > best_pressure {
best_pressure = next.pressure;
best_moves = next_moves;
best_state = next;
}
}
(best_state, best_moves)
}
And at that point, why not have moves
return an Iterator
instead of a Vec
?
impl State<'_> {
// 👋 the `+ '_` is the tricky bit here, since the Iterator
// borrows `&self.net`
fn moves(&self) -> impl Iterator<Item = Move> + '_ {
self.net
.connections(self.position)
.into_iter()
.filter_map(|(name, path)| {
if self.open_valves.contains(&name) {
return None;
}
let flow = self.net.valves[&name].flow;
if flow == 0 {
return None;
}
let travel_turns = path.len() as u64;
let open_turns = 1_u64;
let turns_spent_open = self.turns_left().checked_sub(travel_turns + open_turns)?;
let reward = flow * turns_spent_open;
Some(Move {
reward,
target: name,
path,
})
})
// there's just no `.collect()` on the end!
}
}
(No code changes are required in find_best_moves
)
Next up, we don't need to store best_pressure
separately, since it's already
part of state
:
impl State<'_> {
fn find_best_moves(&self) -> (Self, Vec<Move>) {
let mut best_moves = vec![];
let mut best_state = self.clone();
for mv in self.moves() {
let next = self.apply(&mv);
let (next, mut next_moves) = next.find_best_moves();
next_moves.push(mv);
if next.pressure > best_state.pressure {
best_moves = next_moves;
best_state = next;
}
}
(best_state, best_moves)
}
}
And here's another problem with it: the moves are backwards!
// with sample input
let (state, moves) = state.find_best_moves();
println!("final pressure = {}", state.pressure);
for m in &moves {
println!(" - {m:?}")
}
$ cargo run
Compiling day16 v0.1.0 (/home/amos/bearcove/aoc2022/day16)
Finished dev [unoptimized + debuginfo] target(s) in 0.43s
Running `target/debug/day16`
final pressure = 1651
- Move { reward: 12, target: CC, path: [(EE, DD), (DD, CC)] }
- Move { reward: 27, target: EE, path: [(HH, GG), (GG, FF), (FF, EE)] }
- Move { reward: 286, target: HH, path: [(JJ, II), (II, AA), (AA, DD), (DD, EE), (EE, FF), (FF, GG), (GG, HH)] }
- Move { reward: 441, target: JJ, path: [(BB, AA), (AA, II), (II, JJ)] }
- Move { reward: 325, target: BB, path: [(DD, AA), (AA, BB)] }
- Move { reward: 560, target: DD, path: [(AA, DD)]
Now that's not a big issue, but Copilot did generate incorrect code: I may have used a VecDeque or some other data structure.
Come to think of it, we don't really need to store all the moves to get the solution! Maybe that'll make it run faster?
impl State<'_> {
fn apply_best_moves(&self) -> Self {
let mut best_state = self.clone();
for mv in self.moves() {
let next = self.apply(&mv).apply_best_moves();
if next.pressure > best_state.pressure {
best_state = next;
}
}
best_state
}
}
fn main() {
let net = Network::new();
let state = State {
net: &net,
position: Name(*b"AA"),
max_turns: 30,
turn: 0,
pressure: 0,
open_valves: Default::default(),
};
let state = state.apply_best_moves();
println!("final pressure = {}", state.pressure);
}
$ cargo build --release && time ./target/release/day16
Compiling day16 v0.1.0 (/home/amos/bearcove/aoc2022/day16)
Finished release [optimized] target(s) in 0.59s
final pressure = 1947
./target/release/day16 4,53s user 0,00s system 99% cpu 4,531 total
Not really.
Now I'm curious: how many times are we even calling State::moves
?
use std::sync::atomic::AtomicU64;
static MOVES_CALLED: AtomicU64 = AtomicU64::new(0);
impl State<'_> {
fn moves(&self) -> impl ... {
MOVES_CALLED.fetch_add(1, std::sync::atomic::Ordering::SeqCst);
// etc.
}
}
fn main() {
// cut: everything
println!(
"State::moves called {} times",
MOVES_CALLED.load(std::sync::atomic::Ordering::SeqCst)
);
}
# with the real puzzle input
$ cargo build --release && time ./target/release/day16
Compiling day16 v0.1.0 (/home/amos/bearcove/aoc2022/day16)
Finished release [optimized] target(s) in 0.59s
final pressure = 1947
State::moves called 257798
./target/release/day16 4,55s user 0,01s system 100% cpu 4,557 total
Maybe we can optimize that a little bit.
Profiling our code with the Valgrind suite
Callgrind is a "call-graph generating cache and branch prediction profiler", that's fairly heavyweight.
I'll be doing this on Linux, the easiest platform to run Valgrind on.
We'll need symbols for that:
# in `Cargo.toml`
[profile.release]
debug = 1
$ cargo build --release
(everything gets rebuilt)
$ valgrind --tool=callgrind ./target/release/day16
==10655== Callgrind, a call-graph generating cache profiler
==10655== Copyright (C) 2002-2017, and GNU GPL'd, by Josef Weidendorfer et al.
==10655== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==10655== Command: ./target/release/day16
==10655==
==10655== For interactive control, run 'callgrind_control -h'.
This takes a long time. The docs mention that callgrind/cachegrind, since they emulate the CPU, runs the program around 50 times slower. So we can expect a result in.. a few minutes.
There. It finished. It generated a file:
$ ll callgrind*
-rw------- 1 amos amos 142K déc. 28 19:13 callgrind.out.10655
We can use a frontend like kcachegrind to visualize the profile:
We can even visualize a call graph!
(Those might appear small in the article, but you can right-click / long-press to open the image in a new tab)
So there's a bunch of memory operations happening, hash tables being resized
(which forces re-hashing all its elements), and building vec
data structures.
Fair enough. This doesn't show State::moves
or Network::connections
calls,
probably because they're inlined?
I don't think we learned much from this particular run, but it's always nice to show off some tooling.
Caching moves
One way to make our code faster (still for part 1) is to stop re-doing the same work over and over. The generic name for this technique is memoization, and there's a few crates for it, but we don't really need a generic solution here.
Let's store all connections as part of the Network
itself:
// in `src/main.rs`
// Making a newtype to represent flow because it's used in a tuple later, and a
// lone `u64` isn't really semantic, now is it.
#[derive(Clone, Copy, PartialEq, Eq)]
struct Flow(u64);
type Connections = HashMap<Name, (Path, Flow)>;
pub struct Network {
valves: HashMap<Name, (Valve, Connections)>,
}
Network
's impl
becomes this:
impl Network {
#![allow(clippy::new_without_default)]
pub fn new() -> Self {
let mut net = Self {
valves: include_str!("input.txt")
.lines()
.map(|l| all_consuming(parse::Valve::parse)(l).finish().unwrap().1)
// start off with zero connections (since we're still parsing)
.map(|valve| (valve.name, (valve, Connections::default())))
.collect(),
};
let names = net.valves.keys().copied().collect::<Vec<_>>();
for name in names {
// fill in the connections as needed
let conns = net.connections(name);
net.valves.get_mut(&name).unwrap().1 = conns;
}
net
}
/// Given a valve name, return a list of valves we can travel to, along
/// with the path to get there, and their flow.
///
/// Only the shortest paths are considered, so the search ends.
fn connections(&self, start: Name) -> Connections {
// this used to be just `HashMap<Name, Path>`, it's a bit hairier now
let mut current: HashMap<Name, (Path, Flow)> = Default::default();
{
let valve = &self.valves[&start].0;
current.insert(start, (vec![], Flow(valve.flow)));
}
let mut connections = current.clone();
while !current.is_empty() {
let mut next: HashMap<Name, (Path, Flow)> = Default::default();
for (name, (path, _flow)) in current {
for link in self.valves[&name].0.links.iter().copied() {
let valve = &self.valves[&link].0;
if let Entry::Vacant(e) = connections.entry(link) {
let conn_path: Path = path
.iter()
.copied()
.chain(std::iter::once((name, link)))
.collect();
let item = (conn_path.clone(), Flow(valve.flow));
e.insert(item.clone());
next.insert(link, item);
}
}
}
current = next;
}
connections
}
}
And we can change State::moves
to use it. Because now all possible paths are
stored in State
, we can have Move
borrow from it:
// 👋 now with `<'a>`
#[derive(Debug, Clone)]
struct Move<'a> {
reward: u64,
target: Name,
// 👇 now borrowed
path: &'a Path,
}
// needs 👇
impl Move<'_> {
fn cost(&self) -> u64 {
let travel_turns = self.path.len() as u64;
let open_turns = 1_u64;
travel_turns + open_turns
}
}
And then:
impl State<'_> {
/// Compute all moves and expected reward (pressure contributed till time
/// runs out if we travel to it and open it now)
fn moves(&self) -> impl Iterator<Item = Move> + '_ {
// note: I removed the `MOVES_CALLED` business
let (_valve, connections) = &self.net.valves[&self.position];
connections.iter().filter_map(|(name, (path, flow))| {
if self.open_valves.contains(name) {
return None;
}
if flow.0 == 0 {
return None;
}
let travel_turns = path.len() as u64;
let open_turns = 1_u64;
let turns_spent_open = self.turns_left().checked_sub(travel_turns + open_turns)?;
let reward = flow.0 * turns_spent_open;
Some(Move {
reward,
target: *name,
path,
})
})
}
}
And boom, it's fast:
$ cargo build --release && time ./target/release/day16
Compiling day16 v0.1.0 (/home/amos/bearcove/aoc2022/day16)
Finished release [optimized + debuginfo] target(s) in 0.65s
final pressure = 1947
./target/release/day16 0,31s user 0,00s system 99% cpu 0,306 total
Running callgrind on it again indicates that we're now spending a lot of cycles hashing stuff:
$ callgrind_annotate callgrind.out.11770 | grep 'file:function' -A 10
Ir file:function
--------------------------------------------------------------------------------
1,162,174,008 (28.18%) /rustc/b70baa4f922a1809d79caeaeb902800c3be283b9/library/core/src/hash/sip.rs:_ZN81_$LT$std..collections..hash..map..DefaultHasher$u20$as$u20$core..hash..Hasher$GT$5write17hd317e39715b716a1E.llvm.8639191835063113673
703,851,864 (17.07%) /rustc/b70baa4f922a1809d79caeaeb902800c3be283b9/library/core/src/num/uint_macros.rs:core::hash::BuildHasher::hash_one
523,796,736 (12.70%) /rustc/b70baa4f922a1809d79caeaeb902800c3be283b9/library/core/src/hash/sip.rs:core::hash::BuildHasher::hash_one
361,111,980 ( 8.76%) /cargo/registry/src/github.com-1ecc6299db9ec823/hashbrown-0.12.3/src/raw/mod.rs:day16::State::apply_best_moves'2
261,898,368 ( 6.35%) /rustc/b70baa4f922a1809d79caeaeb902800c3be283b9/library/core/src/hash/mod.rs:core::hash::BuildHasher::hash_one [/home/amos/bearcove/aoc2022/day16/target/release/day16]
179,168,705 ( 4.34%) /cargo/registry/src/github.com-1ecc6299db9ec823/hashbrown-0.12.3/src/map.rs:day16::State::apply_best_moves'2
163,686,480 ( 3.97%) /rustc/b70baa4f922a1809d79caeaeb902800c3be283b9/library/core/src/num/uint_macros.rs:_ZN81_$LT$std..collections..hash..map..DefaultHasher$u20$as$u20$core..hash..Hasher$GT$5write17hd317e39715b716a1E.llvm.8639191835063113673
117,252,284 ( 2.84%) /cargo/registry/src/github.com-1ecc6299db9ec823/hashbrown-0.12.3/src/raw/bitmask.rs:day16::State::apply_best_moves'2
98,211,888 ( 2.38%) /rustc/b70baa4f922a1809d79caeaeb902800c3be283b9/library/std/src/collections/hash/map.rs:_ZN81_$LT$std..collections..hash..map..DefaultHasher$u20$as$u20$core..hash..Hasher$GT$5write17hd317e39715b716a1E.llvm.8639191835063113673 [/home/amos/bearcove/aoc2022/day16/target/release/day16]
And: yes we are! I bet it's that one line:
let (_valve, connections) = &self.net.valves[&self.position];
ie. looking up the valve by its name.
Luckily, we can do better than that!
A custom data structure
We store valve names as [u8; 2]
, which is better than String
, but it's still
worse than it could be! We could in fact be using arrays rather than hash
maps, because both u8
are always upper-case letters, and so the number of
unique valve names are .
Using that property, we can do something clever.
// in `src/parse.rs`
// as before:
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
pub struct Name(pub [u8; 2]);
impl Name {
/// Returns this name as a usize between 0 and 26**2
pub fn as_usize(self) -> usize {
let [a, b] = self.0;
debug_assert!((b'A'..=b'Z').contains(&a));
debug_assert!((b'A'..=b'Z').contains(&b));
(a - b'A') as usize * 26 + (b - b'A') as usize
}
/// Returns a name from a usize between 0 and 26**2
pub fn from_usize(index: usize) -> Self {
debug_assert!(index < MAX_NAME);
let a = (index / 26) as u8 + b'A';
let b = (index % 26) as u8 + b'A';
Self([a, b])
}
// Note: these could both expressed with `From` / `Into` implementations,
// but I can't be bothered right now.
}
pub const MAX_NAME: usize = 26_usize.pow(2);
#[derive(Clone)]
pub struct NameMap<T> {
values: [Option<T>; MAX_NAME],
}
impl<T> NameMap<T> {
pub fn new() -> Self {
Self {
values: std::array::from_fn(|_| None),
}
}
pub fn get(&self, name: Name) -> Option<&T> {
self.values[name.as_usize()].as_ref()
}
pub fn get_mut(&mut self, name: Name) -> Option<&mut T> {
self.values[name.as_usize()].as_mut()
}
pub fn insert(&mut self, name: Name, value: T) {
self.values[name.as_usize()] = Some(value);
}
pub fn contains(&self, name: Name) -> bool {
self.values[name.as_usize()].is_some()
}
pub fn is_empty(&self) -> bool {
self.values.iter().all(|v| v.is_none())
}
pub fn iter(&self) -> impl Iterator<Item = (Name, &T)> {
self.values
.iter()
.enumerate()
.filter_map(|(i, v)| v.as_ref().map(|v| (Name::from_usize(i), v)))
}
pub fn keys(&self) -> impl Iterator<Item = Name> + '_ {
self.values
.iter()
.enumerate()
.filter_map(|(i, v)| v.as_ref().map(|_| Name::from_usize(i)))
}
}
impl<T> Default for NameMap<T> {
fn default() -> Self {
Self::new()
}
}
Now, we can adjust the type of Connections
:
// in `src/main.rs`
use parse::NameMap;
type Connections = NameMap<(Path, Flow)>;
And Network::connections
:
impl Network {
/// Given a valve name, return a list of valves we can travel to, along
/// with the path to get there, and their flow.
///
/// Only the shortest paths are considered, so the search ends.
fn connections(&self, start: Name) -> Connections {
// some light changes here
let mut current = Connections::default();
{
let valve = &self.valves[&start].0;
current.insert(start, (vec![], Flow(valve.flow)));
}
let mut connections = current.clone();
while !current.is_empty() {
// and here
let mut next = Connections::default();
for (name, (path, _flow)) in current {
// can't use the nice `Entry` API anymore:
for link in self.valves[&name].0.links.iter().copied() {
let valve = &self.valves[&link].0;
if !connections.contains(link) {
let conn_path: Path = path
.iter()
.copied()
.chain(std::iter::once((name, link)))
.collect();
let item = (conn_path.clone(), Flow(valve.flow));
connections.insert(link, item.clone());
next.insert(link, item);
}
}
}
current = next;
}
connections
}
}
Similarly, open_valves
can be a NameMap
instead of a HashSet
:
#[derive(Clone)]
struct State<'a> {
net: &'a Network,
position: Name,
max_turns: u64,
turn: u64,
pressure: u64,
// 👇 was: HashSet<Name>
open_valves: NameMap<()>,
}
The code changes required are relatively minimal. Notably, NameMap::contains
takes a Name
rather than a &Name
(since we know it's Copy
anyway), and
NameMap::insert
takes two arguments, so in State::apply
, it becomes:
impl State<'_> {
/// Apply a given move
fn apply(&self, mv: &Move) -> Self {
let mut next = self.clone();
next.position = mv.target;
next.turn += mv.cost();
next.pressure += mv.reward;
// 👇 inserting an empty tuple here
next.open_valves.insert(mv.target, ());
next
}
}
And of course I forgot about Network::valves
, so let's change it too:
pub struct Network {
// was: `HashMap<Name, (Valve, Connections)>`
valves: NameMap<(Valve, Connections)>,
}
The constructor becomes:
impl Network {
#![allow(clippy::new_without_default)]
pub fn new() -> Self {
let mut net = Self {
valves: Default::default(),
};
// we'd have to implement `FromIterator` if we wanted to use `collect`,
// let's just it do manually for now
for valve in include_str!("input.txt")
.lines()
.map(|l| all_consuming(parse::Valve::parse)(l).finish().unwrap().1)
{
net.valves.insert(valve.name, (valve, Default::default()));
}
let names = net.valves.keys().collect::<Vec<_>>();
for name in names {
let conns = net.connections(name);
net.valves.get_mut(name).unwrap().1 = conns;
}
net
}
}
And then a few lines need changing, like:
// before
let valve = &self.valves[&start].0;
// after
let valve = &self.valves.get(start).unwrap().0;
etc.
Unfortunately, this version of the program overflows its stack:
$ cargo run --release
Finished release [optimized + debuginfo] target(s) in 0.00s
Running `target/release/day16`
thread 'main' has overflowed its stack
fatal runtime error: stack overflow
[1] 19955 IOT instruction (core dumped) cargo run --release
My instinct is that it tries to allocate too much in one go, rather than there being actually infinite recursion or something.
Let's find out! Following Nicholas Nethercote's excellent Rust Performance Book:
$ RUSTFLAGS=-Zprint-type-sizes cargo +nightly build --release
print-type-size type: `Network`: 14650272 bytes, alignment: 8 bytes
print-type-size field `.valves`: 14650272 bytes
Mkay.
Well! I have 128GB of RAM on this machine, maybe 128MB of stack will be enough?
$ ulimit -s unlimited; RUSTFLAGS="-C link-args=-Wl,-zstack-size=134217728" cargo build --release && time ./target/release/day16
Finished release [optimized + debuginfo] target(s) in 0.00s
final pressure = 1947
./target/release/day16 0,13s user 0,00s system 99% cpu 0,130 total
Oh the crimes!
Yes, but look at how fast it is!
Truth is, we can probably optimize that further, but I didn't have to think too hard about this, and in my book, that's a win.
Because we're going to keep that custom data structure around, let's make a
Justfile
:
# just manual: https://github.com/casey/just#readme
run:
#!/bin/bash -eux
ulimit -s unlimited
export RUSTFLAGS="-C link-args=-Wl,-zstack-size=134217728"
cargo build --release
time ./target/release/day16
Part 2
In part 2, we have an elephant buddy. We spend 4 turns teaching them to walk through tunnels and unlock valves, so we have 26 turns left, the two of us, to do the optimal thing.
Well.
You know the annoying bit about that part? It's that we don't really simulate each turn. When we've chosen to "walk that path and unlock that valve", we do all these steps at once.
That doesn't really work if we have two actors, because it might go:
--------------- time --------------->
|-I do thing-|-I do other thing|
|-elephant does thing-|
^
uh oh
And then we're hecked.
After I wrote these words, I wasted an entire day trying to simulate step-by-step. It worked for the sample input, but was way too expensive for the real input.
I tried memoizing everything in sight, to no avail. Eventually, my code failed to allocate 42GiB of RAM (there's only 32GiB allocated to this VM), and it was my cue to stop being a tryhard, for once.
That's growth.
So, for the first time in this Advent of Code, I decided to take a look at others' solutions in search of hints. The reddit thread for day 16 is full of interesting ones, I ended up getting inspiration from Nicky Meuleman's solution.
The basic idea is to keep track of the maximum pressure achieved with any given combination of valves, while running the code for part 1.
// maps "valves open" to "best pressure achieved"
type Best = HashMap<NameMap<()>, u64>;
In apply_best_moves
, we record those:
impl State<'_> {
fn apply_best_moves(&self, best: &mut Best) -> Self {
let mut best_state = self.clone();
best.entry(self.open_valves.clone())
.and_modify(|v| {
if self.pressure > *v {
*v = self.pressure
}
})
.or_insert(self.pressure);
for mv in self.moves() {
let next = self.apply(&mv).apply_best_moves(best);
if next.pressure > best_state.pressure {
best_state = next;
}
}
best_state
}
}
And then our main
function becomes:
fn main() {
let net = Network::new();
let p1_state = State {
net: &net,
position: Name(*b"AA"),
max_turns: 30,
turn: 0,
pressure: 0,
open_valves: Default::default(),
};
{
let mut best = Best::default();
let state = p1_state.apply_best_moves(&mut best);
println!("part 1 pressure = {}", state.pressure);
}
{
let mut best = Best::default();
let mut p2_state = p1_state.clone();
p2_state.max_turns = 26;
p2_state.apply_best_moves(&mut best);
let best_pressure = best
.iter()
.tuple_combinations()
.filter(|(human, elephant)| human.0.is_disjoint(elephant.0))
.map(|(human, elephant)| human.1 + elephant.1)
.max()
.unwrap();
println!("part 2 pressure: {best_pressure}");
}
}
tuple_combinations
comes from our beloved
itertools, and I had to implement is_disjoint
myself, since we're using a custom data structure:
// in `src/parse.rs`
impl<T> NameMap<T> {
pub fn overlaps(&self, other: &Self) -> bool {
self.keys().any(|k| other.contains(k))
}
pub fn is_disjoint(&self, other: &Self) -> bool {
!self.overlaps(other) && !other.overlaps(self)
}
}
The idea is that, once we have all the best combinations, we find two disjoint sets for ourselves and the elephant to open. As far as I can tell, this only works because there's time to travel to and open all of 15 valves that have a non-zero flow.
$ just run
+ ulimit -s unlimited
+ export 'RUSTFLAGS=-C link-args=-Wl,-zstack-size=134217728'
+ RUSTFLAGS='-C link-args=-Wl,-zstack-size=134217728'
+ cargo build --release
Compiling day16 v0.1.0 (/home/amos/bearcove/aoc2022/day16b)
Finished release [optimized + debuginfo] target(s) in 0.66s
+ ./target/release/day16
part 1 pressure = 1947
part 2 pressure: 2556
real 0m2,865s
user 0m2,831s
sys 0m0,031s
Here's something interesting though! For that part of the code, our custom
structure is slower than a BTreeSet
. (We can't use a HashSet
, since it
doesn't itself implement Hash
— I bet it's because the BTreeSet
keeps keys
sorted, making it easy to hash).
If we simply add this:
let best: HashMap<BTreeSet<_>, _> = best
.into_iter()
.map(|(valves, total)| (valves.iter().map(|(k, _v)| k).collect(), total))
.collect();
The runtime is halved!
$ just run
(cut)
+ ./target/release/day16
part 1 pressure = 1947
part 2 pressure: 2556
real 0m1,429s
user 0m1,415s
sys 0m0,014s
And that's it! This isn't the fastest solution by any stretch of the imagination, but I'm happy with it for now.
If you want to get all galaxy brain, check out Orson Peters' solution, which uses the Floyd–Warshall algorithm.
Here's another article just for you:
So you want to live-reload Rust
Good morning! It is still 2020, and the world is literally on fire, so I guess we could all use a distraction.
This article continues the tradition of me getting shamelessly nerd-sniped - once by Pascal about small strings, then again by a twitch viewer about Rust enum sizes.
This time, Ana was handing out free nerdsnipes, so I got in line, and mine was: