Day 2, Day 2! Woo!

The Advent of Code 2020, Day 2 problem talks about passwords. Sounds familiar.

Basically, our input looks like this:

1-3 a: abcde
1-3 b: cdefg
2-9 c: ccccccccc

Each line contains a "password policy" and a "password". For the first line, the policy is that the password must contain between 1 and 3 (inclusive) times the letter "a".

So, more parsing?

More parsing.

Alright, same as Day 1, let's make a new project:

Shell session
$ cargo new day2
     Created binary (application) `day2` package
$ cd day2
$ cargo add anyhow
      Adding anyhow v1.0.35 to dependencies

Add our input to day2/src/input.txt, and:

Rust code
// in `day2/src/main.rs`

fn main() -> anyhow::Result<()> {
    let input = include_str!("input.txt");

    Ok(())
}

So, each line contains a password policy and a password. Let's start thinking about how we would represent that in terms of types.

The policy has a range and a character.

Is it a character? Isn't our whole input ASCII?

It is!

So we could get away with just using bytes, correct right?

I guess so!

Rust code
use std::ops::RangeInclusive;

struct PasswordPolicy {
    byte: u8,
    range: RangeInclusive<usize>,
}

Next up, we'll probably want a function that parses a line and returns both a PasswordPolicy and, I guess, a &str?

Yeah, if we're dealing with bytes, that's fine.

Rust code
fn parse_line(s: &str) -> anyhow::Result<(PasswordPolicy, &str)> {
    todo!()
}

We're also going to need some method to make sure a password matches a given PasswordPolicy:

Rust code
impl PasswordPolicy {
    fn is_valid(&self, password: &str) -> bool {
        todo!()
    }
}

That's a lot of todos! Can we at least sketch out what the main function would look like with all of these?

Part 1 asks how many passwords are valid according to their policies, so, sure, now that we have all the pieces, we can do that:

Rust code
fn main() -> anyhow::Result<()> {
    let count = include_str!("input.txt")
        .lines()
        .map(parse_line)
        .map(Result::unwrap)
        .filter(|(policy, password)| policy.is_valid(password))
        .count();
    println!("{} passwords are valid", count);

    Ok(())
}

Ah neat! What's lines do? Is it like split('\n')?

Sort of! It also works on \r\n though (CRLF), ie. "Windows-style line endings".

And this time we're not collect-ing into a Result?

Nah, I'd rather emphasize that we can use filter and count to answer the question, without collecting at all. It's all streaming!

What do we implement next?

Let's give is_valid a shot.

We just said we only care about bytes, because the input is ASCII-only, so we don't have to care about UTF-8 encoding.

The str::as_bytes method gives us a an &[u8], and from there we can use the iterator methods we know and love:

Rust code
impl PasswordPolicy {
    fn is_valid(&self, password: &str) -> bool {
        self.range.contains(
            &password
                .as_bytes()
                .iter()
                .copied()
                .filter(|&b| b == self.byte)
                .count(),
        )
    }
}

Uhh back off a minute there. What's .copied()?

Well, password.as_bytes().iter() gives us an Iterator<Item = &u8>.

And it just so happens that u8 is a Copy type - we don't have to worry about its ownership, and it's really cheap to, well, copy.

Ah, so .copied() turns it into an Iterator<Item = u8>? And then we can operate directly on u8 values instead of references like &u8?

Almost, yes! The other things is that iter.filter() when iter is an Iterator<Item = T>, passes &T.

Ah, because we're filtering, so we can't "consume" the items, we just want to read them and decide whether to keep them or not.

Exactly!

What I don't get is why there's a &b - shouldn't it be *b? Because we're dereferencing a reference?

Well, our filter invocation:

Rust code
                .filter(|&b| b == self.byte)

Could also be written like this:

Rust code
                .filter(|b| *b == self.byte)

Ohhh there's... a symmetry! So either we dereference it with * on the right-hand side... or we add a & to the left-hand-side?

Yep!

It's all just pattern matching.

If we want to match an &i32, for example, we can do this:

Rust code
// (this code is not part of the solution)

    let i = &42;
    if let 42 = *i {
        println!("yes!");
    }

But we can also do this:

Rust code
    let i = &42;
    if let &42 = i {
        println!("yes!");
    }

I see. So as long as the balance is maintained, we're fine?

Yup!

Now, only the parse_line method needs to be implemented. But it would be nice to make sure that our PasswordPolicy::is_valid method behaves as expected.

Let's write a unit test!

Rust code
#[cfg(test)]
mod tests {
    use super::PasswordPolicy;

    #[test]
    fn test_is_valid() {
        let pp = PasswordPolicy {
            range: 1..=3,
            byte: b'a',
        };
        assert_eq!(pp.is_valid("zeus"), false, "no 'a's");
        assert_eq!(pp.is_valid("hades"), true, "single 'a'");
        assert_eq!(pp.is_valid("banana"), true, "three 'a's");
        assert_eq!(pp.is_valid("aaaah"), false, "too many 'a's");
    }
}

Quick question - could this module be in a separate file?

It could! You can read Rust modules vs files for more info.

Our test passes:

Shell session
$ cargo t
   Compiling day2 v0.1.0 (/home/amos/ftl/aoc2020/day2)
    Finished test [unoptimized + debuginfo] target(s) in 0.35s
     Running target/debug/deps/day2-b64ad06ec1d18663

running 1 test
test tests::test_is_valid ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

Hurray!

Which means we can move on to implementing our last function.

Could we test this one as well?

Sure! Let's write the test first.

Rust code
// in `mod tests`

    use super::parse_line;

    #[test]
    fn test_parse() {
        assert_eq!(
            parse_line("1-3 a: banana").unwrap(),
            (
                PasswordPolicy {
                    range: 1..=3,
                    byte: b'a',
                },
                "banana"
            )
        );
    }

That doesn't compile... yet! For assert_eq! to work, two traits need to be implemented: PartialEq (for partial equality testing) and Debug (to format the left-hand side and right-hand side, when they're not equal).

We could implement these ourselves:

Rust code
use std::fmt::Debug;

impl PartialEq for PasswordPolicy {
    fn eq(&self, other: &Self) -> bool {
        self.byte == other.byte && self.range == other.range
    }
}

impl Debug for PasswordPolicy {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        f.debug_struct("PasswordPolicy")
            .field("byte", &self.byte)
            .field("range", &self.range)
            .finish()
    }
}

...or we could just derive them:

Rust code
#[derive(PartialEq, Debug)]
struct PasswordPolicy {
    byte: u8,
    range: RangeInclusive<usize>,
}

Whoa!

Does that work because both u8 and RangeInclusive<usize> implement PartialEq and Debug?

Yup! And they're not implemented by default for all new structs because, well, if you don't need them that's just extra code in your binary you're not using - ie. bloat.

With those changes, the tests compile, but fail:

Shell session
$ cargo t
    Finished test [unoptimized + debuginfo] target(s) in 0.01s
     Running target/debug/deps/day2-b64ad06ec1d18663

running 2 tests
test tests::test_is_valid ... ok
test tests::test_parse ... FAILED

failures:

---- tests::test_parse stdout ----
thread 'tests::test_parse' panicked at 'not yet implemented', src/main.rs:23:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


failures:
    tests::test_parse

test result: FAILED. 1 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out

error: test failed, to rerun pass '--bin day2'
Cool bear's hot tip

cargo t is short for cargo test, just like cargo b is short for cargo build, cargo c is short for cargo check, and cargo r is short for cargo run.

So all we have to do is to actually implement parse_line.

Well that's easy! split, parse, and boom, we're done.

Sure, why not:

Shell session
$ cargo add thiserror
      Adding thiserror v1.0.22 to dependencies
Rust code
#[derive(thiserror::Error, Debug)]
enum ParseError {
    #[error("expected {0}")]
    Expected(&'static str),
}

fn parse_line(s: &str) -> anyhow::Result<(PasswordPolicy, &str)> {
    let (policy, password) = {
        let mut tokens = s.split(':');
        (
            tokens
                .next()
                .ok_or(ParseError::Expected("password policy"))?,
            tokens
                .next()
                .ok_or(ParseError::Expected("password"))?
                .trim(),
        )
    };

    let (range, byte) = {
        let mut tokens = policy.split(' ');
        (
            tokens.next().ok_or(ParseError::Expected("policy range"))?,
            tokens.next().ok_or(ParseError::Expected("policy byte"))?,
        )
    };

    let byte = if byte.as_bytes().len() == 1 {
        byte.as_bytes()[0]
    } else {
        return Err(ParseError::Expected("password policy byte to be exactly 1 byte").into());
    };

    let (min, max) = {
        let mut tokens = range.split('-');
        (
            tokens
                .next()
                .ok_or(ParseError::Expected("policy range (lower bound)"))?,
            tokens
                .next()
                .ok_or(ParseError::Expected("policy range (upper bound)"))?,
        )
    };

    let range = (min.parse()?)..=(max.parse()?);

    Ok((PasswordPolicy { range, byte }, password))
}

Whew that's... more code than I expected.

Haha. It works though!

Shell session
$ cargo t
   Compiling day2 v0.1.0 (/home/amos/ftl/aoc2020/day2)
    Finished test [unoptimized + debuginfo] target(s) in 0.40s
     Running target/debug/deps/day2-67204c5ae8320973

running 2 tests
test tests::test_is_valid ... ok
test tests::test_parse ... ok

test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

Right.. but do we really need all that error handling?

I don't know, I kinda like it! We can even write tests for it.

Rust code
// in `mod tests`

    #[test]
    fn test_parse() {
        assert_eq!(
            parse_line("1-3 a: banana").unwrap(),
            (
                PasswordPolicy {
                    range: 1..=3,
                    byte: b'a',
                },
                "banana"
            )
        );

        assert_eq!(
            parse_line("1-3 a").unwrap_err().to_string(),
            "expected password"
        );
        assert_eq!(
            parse_line("1-3 : banana").unwrap_err().to_string(),
            "expected password policy byte to be exactly 1 byte"
        );

        // feel free to add more tests!
    }

I guess so!

Bear's hesitation is understandable. I wouldn't want to write parsers like that all year round.

And that's why, usually, I reach for the nom crate.

But today we're going to look at something different!

Let's give peg a try.

Let's give what a try?

Oh. OH!

What?

Nothing, nothing.

Shell session
$ cargo add peg
      Adding peg v0.6.3 to dependencies
Rust code
// note: we don't need the `ParseError` type anymore

fn parse_line(s: &str) -> anyhow::Result<(PasswordPolicy, &str)> {
    peg::parser! {
      grammar parser() for str {
        rule number() -> usize
          = n:$(['0'..='9']+) { n.parse().unwrap() }

        rule range() -> RangeInclusive<usize>
          = min:number() "-" max:number() { min..=max }

        rule byte() -> u8
          = letter:$(['a'..='z']) { letter.as_bytes()[0] }

        rule password() -> &'input str
          = letters:$([_]*) { letters }

        pub(crate) rule line() -> (PasswordPolicy, &'input str)
          = range:range() " " byte:byte() ": " password:password() {
              (PasswordPolicy { range, byte }, password)
          }
      }
    }

    Ok(parser::line(s)?)
}

Now we're talking! Does this actually work?

Well, the tests don't pass, but that's because we're just getting different error messages:

Shell session
$ cargo t -q

running 2 tests
.F
failures:

---- tests::test_parse stdout ----
thread 'tests::test_parse' panicked at 'assertion failed: `(left == right)`
  left: `"error at 1:6: expected \": \""`,
 right: `"expected password"`', src/main.rs:90:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Mhh, right. I guess we could get rid of those tests now since the grammar is declarative (and fairly short)? It's nice that we do get some sort of error reporting out of the box though.

Sure we can! Boom, they're gone.

Let's actually run our solution:

Shell session
$ cargo run --quiet
546 passwords are valid

Seems to be the right answer! Moving on.

Part 2

For part 2, Advent of Code authors throw us a curve ball.

Turns out those 1-3 weren't ranges, they were positions! 1-indexed positions, to add insult to injury.

And so if we have:

1-3 a: abcde

The password is only valid if there's an a at position 1, or position 3, but not both.

Well, we shouldn't be using InclusiveRange anymore!

Right. What should we be using?

We're going to be using both positions... maybe a fixed-size array, so we can iterate over it?

Sure!

Rust code
#[derive(PartialEq, Debug)]
struct PasswordPolicy {
    byte: u8,
    positions: [usize; 2],
}

Let's replace is_valid with a todo!(), since the semantics changed:

Rust code
impl PasswordPolicy {
    fn is_valid(&self, password: &str) -> bool {
        todo!()
    }
}

And adjust the parser:

Rust code
fn parse_line(s: &str) -> anyhow::Result<(PasswordPolicy, &str)> {
    peg::parser! {
      grammar parser() for str {
        rule number() -> usize
          = n:$(['0'..='9']+) { n.parse().unwrap() }

        // was: range
        rule positions() -> [usize; 2]
          = first:number() "-" second:number() { [first, second] }

        rule byte() -> u8
          = letter:$(['a'..='z']) { letter.as_bytes()[0] }

        rule password() -> &'input str
          = letters:$([_]*) { letters }

        pub(crate) rule line() -> (PasswordPolicy, &'input str)
          // this now uses `positions`, rather than `range`
          = positions:positions() " " byte:byte() ": " password:password() {
              (PasswordPolicy { positions, byte }, password)
          }
      }
    }

    Ok(parser::line(s)?)
}

We can adjust our parse test like that:

Rust code
    #[test]
    fn test_parse() {
        assert_eq!(
            parse_line("1-3 a: banana").unwrap(),
            (
                PasswordPolicy {
                    positions: [1, 3],
                    byte: b'a',
                },
                "banana"
            )
        );
    }

And our validity test like that:

Rust code
    #[test]
    fn test_is_valid() {
        let pp = PasswordPolicy {
            positions: [1, 3],
            byte: b'a',
        };
        assert_eq!(pp.is_valid("abcde"), true, "'a' in position 1");
        assert_eq!(pp.is_valid("bcade"), true, "'a' in position 3");
        assert_eq!(pp.is_valid("food"), false, "no 'a' whatsoever");
        assert_eq!(pp.is_valid("abacus"), false, "'a' in both positions");
    }

Our parse test already passes, let's think about our new PasswordPolicy::is_valid.

Can we still use an iterator-based approach?

I think we can! All we have to do is... iterate over the positions, and count how many of those actually "match", ie., the byte in the input is the same as the byte in the policy. And then count has to be exactly one.

Aren't we in danger of off-by-one errors here though? Since the input is giving us 1-based indices, and Rust uses 0-based indexing.

Right. We could normalize them directly in the parser. Let's do that.

Rust code
// in `grammar parser()`

        rule number() -> usize
          = n:$(['0'..='9']+) { n.parse().unwrap() }

        /// Positions are 1-based indices in the input
        rule position() -> usize
          = n:number() { n - 1 }

        rule positions() -> [usize; 2]
          // now using `position()` rather than `number()`
          = first:position() "-" second:position() { [first, second] }

Cool! Now the indices in PasswordPolicy::positions are 0-based, so we can use them directly.

And use them directly we shall:

Rust code
impl PasswordPolicy {
    fn is_valid(&self, password: &str) -> bool {
        self.positions
            .iter()
            .copied()
            .filter(|&index| password.as_bytes()[index] == self.byte)
            .count()
            == 1
    }
}

Does this work? Let's run our tests first:

Rust code
$ cargo test --quiet

running 2 tests
FF
failures:

---- tests::test_is_valid stdout ----
thread 'tests::test_is_valid' panicked at 'assertion failed: `(left == right)`
  left: `false`,
 right: `true`: 'a' in position 1', src/main.rs:72:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

---- tests::test_parse stdout ----
thread 'tests::test_parse' panicked at 'assertion failed: `(left == right)`
  left: `(PasswordPolicy { byte: 97, positions: [0, 2] }, "banana")`,
 right: `(PasswordPolicy { byte: 97, positions: [1, 3] }, "banana")`', src/main.rs:82:9

Uh oh - forgot to adjust our tests. Classic.

Rust code
#[cfg(test)]
mod tests {
    use super::PasswordPolicy;

    #[test]
    fn test_is_valid() {
        let pp = PasswordPolicy {
            positions: [0, 2], // now 0-based
            byte: b'a',
        };
        assert_eq!(pp.is_valid("abcde"), true, "'a' in position 1");
        assert_eq!(pp.is_valid("bcade"), true, "'a' in position 3");
        assert_eq!(pp.is_valid("food"), false, "no 'a' whatsoever");
        assert_eq!(pp.is_valid("abacus"), false, "'a' in both positions");
    }

    use super::parse_line;

    #[test]
    fn test_parse() {
        assert_eq!(
            parse_line("1-3 a: banana").unwrap(),
            (
                PasswordPolicy {
                    positions: [0, 2], // now 0-based
                    byte: b'a',
                },
                "banana"
            )
        );
    }
}

And finally let's get the answer:

Shell session
$ cargo run --quiet
275 passwords are valid

Hey, that's the right answer for our input!

Until next time, take care!