Day 18 (Advent of Code 2022)
👋 This page was last updated ~2 years ago. Just so you know.
This time around, we're porting a solution from C++ to Rust and seeing how it feels, how it performs, and what we can learn about both languages by doing that.
See Day 17 for the rationale re: porting solutions rather than writing my own from scratch. TL;DR is: it's better than nothing, and we can still focus about learning Rust rather than spending entire days fighting off-by-one errors.
The research for this article was done live on stream, I'm not going to cover everything in the write-up, so check it out if you want the full experience:
Part 1
After reviewing a couple solutions, we went for osalbahr's solution, and here it is:
// in `ref/day18.cpp` #include <iostream> #include <vector> #include <set> #include <map> #include <sstream> #include <cstdio> #include <cmath> using namespace std; typedef struct { int x; int y; int z; } Cube; vector<Cube> cubes; static void printCubes( FILE *fp ) { for ( Cube cube : cubes ) fprintf( fp, "%d,%d,%d\n", cube.x, cube.y, cube.z ); } static int adjacent( Cube a, Cube b ) { return ( abs( a.x - b.x ) + abs( a.y - b.y ) + abs( a.z - b.z ) ) == 1; } static int getSurfaceArea() { int total = 6 * cubes.size(); for ( int i = 0; i < cubes.size(); i++ ) for ( int j = i + 1; j < cubes.size(); j++ ) if ( adjacent( cubes[ i ], cubes[ j ] ) ) total -= 2; return total; } int main() { string line; while( getline( cin, line ) ) { Cube cube; char delim; stringstream ss( line ); ss >> cube.x >> delim >> cube.y >> delim >> cube.z; cubes.push_back( cube ); } fclose( fopen( "PARSE", "w" ) ); FILE *out = fopen( "PARSE", "w" ); printCubes( out ); fclose( out ); cout << getSurfaceArea() << endl; }
I made sure the solution worked for both parts of the problem, and got to porting!
Someone on-stream suggested we use the !
(never type) so I ended up opting into nightly right away:
# in rust-toolchain.toml [toolchain] channel = "nightly-2023-01-11"
And then the port was pretty straightforward:
// in `src/main.rs` #![feature(never_type)] use std::{ fmt, fs::File, io::{BufWriter, Write}, str::FromStr, }; #[derive(Clone, Copy)] struct Cube { x: i32, y: i32, z: i32, } // we could've derived this, but I wanted a more compact representation impl fmt::Debug for Cube { fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { // did you know you could destructure structs in Rust? // now you do! let Cube { x, y, z } = self; write!(f, "{x},{y},{z}") } } impl Cube { fn adjacent(self, other: Self) -> bool { // I love that `abs_diff` is a thing. I'm gonna use it all the time. self.x.abs_diff(other.x) + self.y.abs_diff(other.y) + self.z.abs_diff(other.z) == 1 } } impl FromStr for Cube { // technically a lie but we took a shortcut by // unwrapping in `from_str` type Err = !; fn from_str(s: &str) -> Result<Self, Self::Err> { let mut tokens = s.split(',').map(|s| s.parse::<i32>().unwrap()); Ok(Self { x: tokens.next().unwrap(), y: tokens.next().unwrap(), z: tokens.next().unwrap(), }) } } #[derive(Debug)] struct State { cubes: Vec<Cube>, } impl State { fn get_surface_area(&self) -> i32 { let mut total = 6 * self.cubes.len() as i32; for (i, cube) in self.cubes.iter().enumerate() { for other in self.cubes.iter().skip(i + 1) { if cube.adjacent(*other) { total -= 2; } } } total } } fn main() { // in the original, this is a global variable. `static mut` // in Rust is unsafe, so we just have a local here. let cubes = include_str!("input.txt") .lines() .map(|l| Cube::from_str(l).unwrap()) .collect::<Vec<_>>(); let state = State { cubes }; // in the initial measurements, the Rust version was slower. // turns out it was spending all its time issuing `write` // syscalls! hence the `BufWriter` here, which reduced the // number of write syscalls to just 3. let mut f = BufWriter::new(File::create("PARSE").unwrap()); for c in &state.cubes { writeln!(f, "{c:?}").unwrap(); } f.flush().unwrap(); dbg!(state.get_surface_area()); }
I hardcoded the real input in the C++ program for benchmarking, and that was a hassle. I learned a couple things along the way.
First, there is no equivalent to include_bytes! / include_str! in C++ (that's why people have resource compilers and stuff)
Second, MSVC will choke on string literals that are "too long" - ours certainly was.
Third, the syntax for a raw string in C++ is even weirder than it is in Rust.
In rust we have:
let regular_string = "hello"; let raw_string = r#"hello"#; // note: you can have any number of `#` characters, as long as they match
But in C++, it's:
// I don't even know what type it's supposed to be, leave me alone "regular string" R"(raw string)"
The parentheses are syntax, they're not part of the string, the string is equal to "raw string", not "(raw string)". We reached dangerous levels of sighing when I discovered that, but that's also what this journey is about.
I was curious how it would perform, and after forgetting to turn on optimizations for a bit, I finally got some actual numbers:
Compiler | Flags | Runtime | Speedup | |
C++ | clang++ 14 | -O3 | 8.9ms | 1.00x |
Rust | rustc 2023-01-11 | --release | 6.1ms | 1.46x |
These are both on Linux, and we'll have a lot more numbers to look at once we move on to part 2, which... let's!
Part 2
Here's the C++ solution for Part 2: it has a depth-first search, what's not to like?
The answer is floats. I don't like floats in this solution, but ah well.
// in `ref/day18_2.cpp` #include <iostream> #include <vector> #include <set> #include <map> #include <sstream> #include <cstdio> #include <cmath> using namespace std; #define REPORT( X ) cout << #X << " = " << ( X ) << endl // Like echo -n #define REPORTN( X ) cout << #X << " = " << ( X ) << ", " #define REPORTC( C ) printf( "%s = (%d,%d,%d)\n", #C, (C).x, (C).y, (C).z ); typedef struct { int x; int y; int z; } Cube; // Adapted from // https://stackoverflow.com/questions/3882467/defining-operator-for-a-struct bool operator<(const Cube& a, const Cube& b) { return tie(a.x, a.y, a.z) < tie(b.x, b.y, b.z); } Cube operator+( const Cube& l, const Cube& r ) { int x = l.x + r.x; int y = l.y + r.y; int z = l.z + r.z; return { x, y, z }; } // A side is a midpoint of two cubes typedef struct { float x; float y; float z; } Side; // Adapted from // https://stackoverflow.com/questions/3882467/defining-operator-for-a-struct bool operator<(const Side& a, const Side& b) { return tie(a.x, a.y, a.z) < tie(b.x, b.y, b.z); } set<Cube> cubes; set<Side> sides; int minX, maxX; int minY, maxY; int minZ, maxZ; set<Cube> lavaCubes; static void printCubes( FILE *fp ) { for ( Cube cube : cubes ) fprintf( fp, "%d,%d,%d\n", cube.x - 1, cube.y - 1, cube.z - 1 ); } // static void printSides() // { // for ( Side s : sides ) { // printf( "(%.1f,%.1f,%.1f)\n", s.x - 1, s.y - 1, s.z - 1 ); // } // } static Cube getOppositeCube( Side side ) { float diffs[] = { -0.5, 0.5 }; bool isX = side.x != (int)side.x; bool isY = side.y != (int)side.y; bool isZ = side.z != (int)side.z; for ( float diff : diffs ) { Cube cube = { int(side.x + isX * diff), int(side.y + isY * diff), int(side.z + isZ * diff) }; if ( cubes.count( cube ) > 0 ) { return { int(side.x + isX * -diff), int(side.y + isY * -diff), int(side.z + isZ * -diff) }; } } cerr << "Not found" << endl; exit( 1 ); } static bool isLavaSide( Side side ) { // Look around based on orientation Cube oppositeCube = getOppositeCube( side ); // REPORT( lavaCubes.count( oppositeCube ) > 0 ); return lavaCubes.count( oppositeCube ) > 0; } static void dfsLava( Cube lava ) { // Off limits if ( lava.x > maxX + 1 || lava.x < minX - 1 || lava.y > maxY + 1 || lava.y < minY - 1 || lava.z > maxZ + 1 || lava.z < minZ - 1 ) return; // Can't be both lava and non-lava if ( cubes.count( lava ) > 0 ) return; // Already inserted if ( ! lavaCubes.insert( lava ).second ) return; // Start dfs Cube differs[] = { { -1, 0, 0 }, { 1, 0, 0 }, { 0, -1, 0 }, { 0, 1, 0 }, { 0, 0, -1 }, { 0, 0, 1 } }; for ( Cube differ : differs ) { dfsLava( lava + differ ); } } // Mark the corner as a lava cube // DFS from there static void populateLavaCubes() { // Mark the corner as a lava cube Cube cube = *cubes.begin(); minX = maxX = cube.x; minY = maxY = cube.y; minZ = maxZ = cube.z; for ( Cube cube : cubes ) { minX = min( minX, cube.x ); maxX = max( maxX, cube.x ); minY = min( minY, cube.y ); maxY = max( maxY, cube.y ); minZ = min( minZ, cube.z ); maxZ = max( maxZ, cube.z ); } Cube lava = { minX - 1, minY - 1, minZ - 1 }; dfsLava( lava ); } // Keep only valid sides static void filterSides() { // cout << "populateLavaCubes ... " << flush; populateLavaCubes(); // cout << "Done" << endl; assert( lavaCubes.size() ); // Filter out sides not exposed to lava for ( auto it = sides.begin(); it != sides.end(); ) { Side side = *it; assert( side.x > 0 ); assert( side.y > 0 ); assert( side.z > 0 ); Cube a = { (int)side.x, (int)side.y, (int)side.z }; Cube b = a; if ( side.x != (int)side.x ) { b.x++; } else if ( side.y != (int)side.y ) { b.y++; } else if ( side.z != (int)side.z ) { b.z++; } else { cerr << "Invalid side" << endl; REPORTN( side.x ), REPORTN( side.y ), REPORT( side.z ); exit( 1 ); } int countA = cubes.count( a ); int countB = cubes.count( b ); // Can't both be zero if ( countA == 0 && countB == 0 ) { cerr << "Invalid side" << endl; REPORTN( side.x ), REPORTN( side.y ), REPORT( side.z ); REPORTN( countA ), REPORT( countB ); exit( 1 ); } // Not exposed to air if ( countA > 0 && countB > 0 ) { sides.erase( it++ ); continue; } // Not exposed to lava if ( !isLavaSide( side ) ) { sides.erase( it++ ); continue; } // Keep it it++; } // // See the rest // printSides(); } static int getSurfaceArea() { for ( Cube cube : cubes ) { float diffs[] = { -0.5, 0.5 }; for ( float diff : diffs ) { sides.insert( { (float)cube.x + diff, (float)cube.y, (float)cube.z } ); sides.insert( { (float)cube.x, (float)cube.y + diff, (float)cube.z } ); sides.insert( { (float)cube.x, (float)cube.y, (float)cube.z + diff } ); } } filterSides(); return sides.size(); } int main() { string line; while( getline( cin, line ) ) { Cube cube; char delim; stringstream ss( line ); ss >> cube.x >> delim >> cube.y >> delim >> cube.z; // Avoid 0 cube.x++, cube.y++, cube.z++; cubes.insert( cube ); } fclose( fopen( "PARSE", "w" ) ); FILE *out = fopen( "PARSE", "w" ); printCubes( out ); fclose( out ); cout << getSurfaceArea() << endl; }
I got some tips from chat while porting this. For example,
std::set is actually a
BTreeSet,
not a
HashSet
(hence the operator<
, B-tree sets are ordered).
I also had to look up what std::tie does: it's tuple-related. My best guess as a C++ newbie is that tuples are implemented in the standard library rather than in the compiler, which is possible because the language is expressive enough?
Also I learned insert
returns an
std::pair, not a tuple, and
that the second field of a pair is accessed with .second
😬
Anyway! Here's my initial 1-1 port:
#![feature(never_type)] use std::{ collections::BTreeSet, fmt, fs::File, io::{BufWriter, Write}, str::FromStr, }; use ordered_float::NotNan; #[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, derive_more::Add, Hash)] struct Cube { x: i32, y: i32, z: i32, } impl fmt::Debug for Cube { fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { let Cube { x, y, z } = self; write!(f, "{x},{y},{z}") } } impl FromStr for Cube { type Err = !; fn from_str(s: &str) -> Result<Self, Self::Err> { let mut tokens = s.split(',').map(|s| s.parse::<i32>().unwrap()); Ok(Self { x: tokens.next().unwrap(), y: tokens.next().unwrap(), z: tokens.next().unwrap(), }) } } #[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, derive_more::Add, Hash)] struct Side { // This is so that we can derive `Ord`, `Eq` and `Hash`. It means a bunch of // `.try_into().unwrap()` and `.into_inner()` later, but it's the price to // pay to be able to use it as a BTreeSet/HashSet key. x: NotNan<f32>, y: NotNan<f32>, z: NotNan<f32>, } #[derive(Default)] struct State { cubes: BTreeSet<Cube>, sides: BTreeSet<Side>, min_x: i32, max_x: i32, min_y: i32, max_y: i32, min_z: i32, max_z: i32, lava_cubes: BTreeSet<Cube>, } impl State { fn get_opposite_cube(&self, side: Side) -> Cube { let diffs = [-0.5_f32, 0.5]; // instead of casting float->int->float, we just do truncation let is_x = side.x != side.x.trunc(); let is_y = side.y != side.y.trunc(); let is_z = side.z != side.z.trunc(); for diff in diffs { let cube = Cube { // we might be able to cast a `bool` to a floating point like // they do in the C/C++ version, but I refuse. x: (side.x + if is_x { diff } else { 0.0 }).into_inner() as _, y: (side.y + if is_y { diff } else { 0.0 }).into_inner() as _, z: (side.z + if is_z { diff } else { 0.0 }).into_inner() as _, }; if self.cubes.contains(&cube) { return Cube { x: (side.x + if is_x { -diff } else { 0.0 }).into_inner() as _, y: (side.y + if is_y { -diff } else { 0.0 }).into_inner() as _, z: (side.z + if is_z { -diff } else { 0.0 }).into_inner() as _, }; } } panic!("Not found") // best error msg eva <333 } fn is_lava_side(&self, side: Side) -> bool { self.lava_cubes.contains(&self.get_opposite_cube(side)) } fn dfs_lava(&mut self, lava: Cube) { // Off limits if (lava.x > self.max_x + 1 || lava.x < self.min_x - 1) || (lava.y > self.max_y + 1 || lava.y < self.min_y - 1) || (lava.z > self.max_z + 1 || lava.z < self.min_z - 1) { return; } // Can't be both lava and non-lava if self.cubes.contains(&lava) { return; } // Already inserted if !self.lava_cubes.insert(lava) { return; } // Start dfs let differs = [ // that part's shorter in C++, because struct initializers don't // need to be named: you put the whole type on the left-hand side // and bracket soup on the right-hand side and weeeee Cube { x: -1, y: 0, z: 0 }, Cube { x: 1, y: 0, z: 0 }, Cube { x: 0, y: -1, z: 0 }, Cube { x: 0, y: 1, z: 0 }, Cube { x: 0, y: 0, z: -1 }, Cube { x: 0, y: 0, z: 1 }, ]; for differ in differs { // this is what you'd put in `stacker` if you needed it (see below) self.dfs_lava(lava + differ); } } // Mark the corner as a lava cube // DFS from there fn populate_lava_cubes(&mut self) { let cube = self.cubes.first().unwrap(); // I tried using `RangeInclusive` for this but had off-by-one errors, // so I made the port more direct and used 6 separate values instead. self.min_x = cube.x; self.max_x = cube.x; self.min_y = cube.y; self.max_y = cube.y; self.min_z = cube.z; self.max_z = cube.z; for cube in &self.cubes { // you might, like me, be tempted to reach for `minmax` from // itertools, and then later realize you don't need the "minmax 3d // coordinate", you need the minmax x, minmax y and minmax z, // separately, and that's a different operation. self.min_x = self.min_x.min(cube.x); self.max_x = self.max_x.max(cube.x); self.min_y = self.min_y.min(cube.y); self.max_y = self.max_y.max(cube.y); self.min_z = self.min_z.min(cube.z); self.max_z = self.max_z.max(cube.z); } self.dfs_lava(Cube { x: self.min_x - 1, y: self.min_y - 1, z: self.min_z - 1, }); } fn filter_sides(&mut self) { self.populate_lava_cubes(); assert!(!self.lava_cubes.is_empty()); let mut sides = std::mem::take(&mut self.sides); sides.retain(|side| { // in C/C++ these are just `assert!(something_coerced_to_int_or_bool)` assert!(side.x.into_inner() > 0.0); assert!(side.y.into_inner() > 0.0); assert!(side.z.into_inner() > 0.0); let a = Cube { // escaping our `NotNan` prison here. Maybe we should be // using `.try_into()` instead, so it panics on `NaN` / // too-large numbers? x: side.x.into_inner() as i32, y: side.y.into_inner() as i32, z: side.z.into_inner() as i32, }; let mut b = a; if side.x != side.x.trunc() { b.x += 1; } else if side.y != side.y.trunc() { b.y += 1; } else if side.z != side.z.trunc() { b.z += 1; } else { panic!("Invalid side"); } let has_a = self.cubes.contains(&a); let has_b = self.cubes.contains(&b); // Can't both be false assert!( has_a || has_b, "invalid side: {side:?}, has_a={has_a}, has_b={has_b}" ); // Not exposed to air if has_a && has_b { return false; } // Not exposed to lava if !self.is_lava_side(*side) { return false; } true }); self.sides = sides; } fn get_surface_area(&mut self) -> usize { for cube in &self.cubes { let diffs = [-0.5_f32, 0.5]; for diff in diffs { // this is ugly and I don't like it. there's better ways to // write this that don't even include macros, but too late, I've // already moved on. self.sides.insert(Side { x: (cube.x as f32 + diff).try_into().unwrap(), y: (cube.y as f32).try_into().unwrap(), z: (cube.z as f32).try_into().unwrap(), }); self.sides.insert(Side { x: (cube.x as f32).try_into().unwrap(), y: (cube.y as f32 + diff).try_into().unwrap(), z: (cube.z as f32).try_into().unwrap(), }); self.sides.insert(Side { x: (cube.x as f32).try_into().unwrap(), y: (cube.y as f32).try_into().unwrap(), z: (cube.z as f32 + diff).try_into().unwrap(), }); } } self.filter_sides(); self.sides.len() } } fn main() { let unit_cube = Cube { x: 1, y: 1, z: 1 }; let cubes = include_str!("input.txt") .lines() .map(|l| Cube::from_str(l).unwrap() + unit_cube) .collect::<_>(); let mut state = State { cubes, ..Default::default() }; let mut f = BufWriter::new(File::create("PARSE").unwrap()); for c in &state.cubes { writeln!(f, "{c:?}").unwrap(); } f.flush().unwrap(); dbg!(state.get_surface_area()); }
I noticed a bunch of strange things while porting this code: I don't think this is a particularly good example of C++ code: I think this is the dreaded C/C++.
Macros are a giveaway, but then also things like this:
// Can't be both lava and non-lava if ( cubes.count( lava ) > 0 ) return;
This is odd. There is a count
method on std::set
, but it only really makes
sense for multisets, right? This strikes me as something a C person would do
(just compare numbers, which is how they do error handling).
But as I'm writing this, I'm discovering that the contains method only exists since C++20, so, maybe they're limited to C++14 or something 🤷
Another very C-ish thing is this:
Cube cube = { int(side.x + isX * diff), int(side.y + isY * diff), int(side.z + isZ * diff) };
...where isX
, ixY
and isZ
are booleans. I suppose it promotes them to
floats, since that's the type of diff
?
Having to remember what type int
and float
actually were made me thankful
Rust has you explicitly say i32
and f32
.
Oh and it's worth nothing that C++ doesn't concern itself with problems like "you can't compare NaN floats":
// A side is a midpoint of two cubes typedef struct { float x; float y; float z; } Side; // Adapted from // https://stackoverflow.com/questions/3882467/defining-operator-for-a-struct bool operator<(const Side& a, const Side& b) { return tie(a.x, a.y, a.z) < tie(b.x, b.y, b.z); }
In Rust, we had to use the NotNan
wrapper from
ordered-float.
Finally, I thought that pattern would be troublesome:
for (auto it = sides.begin(); it != sides.end()) { if (cond) { // remove element from set sides.erase(it++); continue; } // keep element in set it++; }
...because there's no borrow checking going on, so these can get hairy when translating to Rust, but it turns out that's just retain.
Before I show you any performance numbers, I'll go through the code changes to make it faster (faster than the direct port).
- Switching from a
BTreeSet
to aHashSet
made it faster - Switching from the default hash to rustc-hash made it faster.
Another thing we ran into was that, on some platforms with some compilers, some of the executables (both C++ and Rust) did a lil' stack overflow.
On Windows, fixing it was a simple editbin /STACK:4194304
away, but we also
tried using the stacker crate to grow it
dynamically: it has to go around the self.dfs_lava
recursive call.
Apart from that, here's some of the compiler flags we tried:
-O2
vs-O3
: sometimes one performs better, and it's not always the highest-march=native
: this lets the compiler uses the actual CPU instruction sets you have on the build machine, not a popular-enough common denominator-ffast-math
: makes floats go brrr at the expense of some correctness
So, here are the Linux numbers:
Language | Variant | Compiler | Extra flags | Runtime | Speedup |
C++ | set | g++ 11.3.0 | (only -O3) | 12.0ms | 1.00x |
clang++ 14 | 11.7ms | 1.03x | |||
-funroll-loops | 11.6ms | 1.03x | |||
-march=native | 11.5ms | 1.04x | |||
-ffast-math | 11.4ms | 1.05x | |||
(all of the above) | 11.0ms | 1.09x | |||
Rust | BTreeSet | rustc 2023-01-11 | (only --release) | 8.5ms | 1.41x |
HashSet | 6.9ms | 1.74x | |||
-Z build-std (cargo) | 6.6ms | 1.82x | |||
-C target-cpu=native | 6.4ms | 1.88x | |||
FxHashSet | 3.3ms | 3.64x |
(Values omitted are the same as the line above).
And here are the Windows numbers:
Language | Variant | Compiler | Extra flags | Runtime | Speedup |
C++ | MSVC 19.34.31937 | /O2 /EHsc /F4194304 | 17.1ms | 0.92x | |
clang++ 14 | (only -O3) | 15.7ms | 1.00x | ||
Rust | stacker | rustc 2023-01-11 | -C target-cpu=native | 8.4ms | 1.87x |
editbin | 7.7ms | 2.04x |
So first off, to those asking "why are you using a Linux VM on Windows 11 rather than building for Windows 11" directly: apart from "it's nice and I like it", there you have it. A bunch of workloads are faster in a Linux VM on Windows than directly on Windows. (It usually has to do with I/O, but I guess it didn't in this case).
It's worth noting that this isn't a real benchmark, and that even real benchmarks lie. As I pointed out for the Python version, there's definitely tricks you can use to bring C++ to par here, like:
- Some compiler flags I don't know about
- Use more recent C++ stuff?
- Same tricks we used in Rust: change to a HashSet, change the hashing algorithm
I'd actually be excited about C++ folks coming in and showing me how it's done: ultimately, I still feel like in certain situations, you can squeeze extra performance from C++ in a way you can't quite do in Rust. I just don't know how.
But again, I'd like to outline how nice and readable the Rust port is (how
hilarious is it that the operator<
has a stackoverflow link? I don't blame
them, I wouldn't remember the right syntax either! I'll take verbose impl blocks
any day of the week).
This was fun, and I might choose to port from C++ again for Day 19, let's see how it turns out!
Thanks to my sponsors: xales, Daniel Strittmatter, budrick, Marcin Kołodziej, Joonas Koivunen, Marcus Griep, Marc-Andre Giroux, Matt Jadczak, Jean-David Gadina, Beth Rennie, Rufus Cable, Paul Schuberth, Tom Forbes, James Brown, James Rhodes, joseph, Aalekh Patel, zaurask, Michal Hošna, Corey Alexander and 227 more
If you liked what you saw, please support my work!
Here's another article just for you:
So! Rust futures! Easy peasy lemon squeezy. Until it's not. So let's do the easy thing, and then instead of waiting for the hard thing to sneak up on us, we'll go for it intentionally.
That's all-around solid life advice.
Choo choo here comes the easy part 🚂💨
We make a new project:
$ cargo new waytoodeep Created binary (application) `waytoodeep` package