It's time for Day 4 of the Advent of Code 2020!

Now, I've already had a look at the problem statement, at least for part 1, and I'm not particularly excited.

But it will allow me to underline some of the points I've recently been *trying to make about types and correctness.

Ah, yes, the novel.

The problem is to parse passports, with fields like these:

Newlines are ignored, unless there's two of them, and then it separates one passport "record" from the next:

ecl:gry pid:860033327 eyr:2020 hcl:#fffffd
byr:1937 iyr:2017 cid:147 hgt:183cm

iyr:2013 ecl:amb cid:350 eyr:2023 pid:028048884
hcl:#cfa07d byr:1929

hcl:#ae17e1 iyr:2013
eyr:2024
ecl:brn pid:760753108 byr:1931
hgt:179cm

hcl:#cfa07d eyr:2025 pid:166559648
iyr:2011 ecl:brn hgt:59in

Oh, also, some passport are missing fields. Only cid can be missing - the others cannot.

Let's try to think with types again!

byr, iyr, and eyr are both years - an u64 will do.

Yes. In case they're born very, very far in the future.

Rust code
#[derive(Clone, Copy, PartialEq, Debug)]
struct Year(u64);

hgt can be in cm or in, apparently, so it looks like a job for an enum:

Rust code
#[derive(Clone, Copy, PartialEq, Debug)]
enum Length {
    /// Centimeters (the correct unit)
    Cm(u64),
    /// Inches (the incorrect unit)
    In(u64),
}

Then we have "hair color" - it seems to always be in the format #rrggbb, so we'll recognize that, but keep it as a string. I doubt the next part involves color manipulation.

Rust code
#[derive(Clone, Copy, PartialEq, Debug)]
struct Color<'a>(&'a str);

Finally we have pid and cid, which I guess can both start with a zero, and we might also want to keep them as opaque strings.

Rust code
/// An identifier
#[derive(Clone, Copy, PartialEq, Debug)]
struct ID<'a>(&'a str);

Of all those fields, only cid is optional:

Rust code
#[derive(PartialEq, Debug)]
struct Passport<'a> {
    birth_year: Year,
    issue_year: Year,
    expiration_year: Year,
    height: Length,
    hair_color: Color<'a>,
    eye_color: Color<'a>,
    passport_id: ID<'a>,
    country_id: Option<ID<'a>>,
}

Just like for Day 2, we'll use peg for this.

Shell session
$ cargo add peg
      Adding peg v0.6.3 to dependencies

But before we do... to make our grammar simpler, we'll make another type, PassportBuilder, with all fields optional.

Rust code
#[derive(PartialEq, Debug, Default)]
struct PassportBuilder<'a> {
    birth_year: Option<Year>,
    issue_year: Option<Year>,
    expiration_year: Option<Year>,
    height: Option<Length>,
    hair_color: Option<Color<'a>>,
    eye_color: Option<Color<'a>>,
    passport_id: Option<ID<'a>>,
    country_id: Option<ID<'a>>,
}
Shell session
$ cargo add thiserror
      Adding thiserror v1.0.22 to dependencies

And add a build method that returns either a Passport, or an error:

Rust code
#[derive(thiserror::Error, Debug)]
enum Error {
    #[error("missing field: {0}")]
    MissingField(&'static str),
}

impl<'a> PassportBuilder<'a> {
    fn build(self) -> Result<Passport<'a>, Error> {
        Ok(Passport {
            birth_year: self.birth_year.ok_or(Error::MissingField("birth_year"))?,
            issue_year: self.issue_year.ok_or(Error::MissingField("issue year"))?,
            expiration_year: self
                .expiration_year
                .ok_or(Error::MissingField("expiration_year"))?,
            height: self.height.ok_or(Error::MissingField("height"))?,
            hair_color: self.hair_color.ok_or(Error::MissingField("hair color"))?,
            eye_color: self.eye_color.ok_or(Error::MissingField("eye_color"))?,
            passport_id: self.passport_id.ok_or(Error::MissingField("passport id"))?,
            country_id: self.country_id,
        })
    }
}

And now, the grammar:

Waiiiiiiiiit

What?

Isn't that a lot of repetition?

It's not that bad in Vim mode?

Couldn't we use macros?

Bear, it's late, I'm tired.

Pleaaaaaaaaaase? We're repeating ourselves!

Fine.

Rust code
impl<'a> PassportBuilder<'a> {
    fn build(self) -> Result<Passport<'a>, Error> {
        macro_rules! build {
            (
                required => {
                    $($req: ident),* $(,)*
                }$(,)*
                optional => {
                    $($opt: ident),* $(,)*
                }$(,)*
            ) => {
                Ok(Passport {
                    $($req: self.$req.ok_or(Error::MissingField(stringify!($req)))?),*,
                    $($opt: self.$opt),*
                })
            }
        }

        build! {
            required => {
                birth_year,
                issue_year,
                expiration_year,
                height,
                hair_color,
                eye_color,
                passport_id,
            },
            optional => {
                country_id,
            },
        }
    }
}

How cute! Our own little domain-specific language.

Where are the tests though?

Rust code
#[test]
fn test_builder() {
    assert!(PassportBuilder {
        ..Default::default()
    }
    .build()
    .is_err());
    assert!(PassportBuilder {
        birth_year: Some(Year(2014)),
        issue_year: Some(Year(2017)),
        expiration_year: Some(Year(2023)),
        height: Some(Length::Cm(195)),
        hair_color: Some(Color("#ffffff")),
        eye_color: Some(Color("#ee7812")),
        passport_id: Some(ID("00023437")),
        country_id: None,
    }
    .build()
    .is_ok());
}
Shell session
$ cargo test --quiet

running 1 test
.
test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

Happy?

Very! 😊

And now... the grammar.

It'll be slightly different, because, well, fields can happen in any order.

We'll write a parser that only parses one record, and most parsers will take a mutable reference to a PassportBuilder:

Rust code
impl<'a> PassportBuilder<'a> {
    fn parse(input: &'a str) -> Self {
        let mut b: Self = Default::default();

        peg::parser! {
            grammar parser() for str {

                pub(crate) rule root(b: &mut PassportBuilder<'input>)
                    = (field(b) separator()*)* ![_]

                rule separator()
                    = ['\n' | ' ']

                rule field(b: &mut PassportBuilder<'input>)
                    // years
                    = byr(b) / iyr(b) / eyr(b)
                    // height
                    / hgt(b)
                    // colors
                    / hcl(b) / ecl(b)
                    // IDs
                    / pid(b) / cid(b)

                rule byr(b: &mut PassportBuilder<'input>)
                    = "byr:" year:year() { b.birth_year = Some(year) }

                rule iyr(b: &mut PassportBuilder<'input>)
                    = "iyr:" year:year() { b.issue_year = Some(year) }

                rule eyr(b: &mut PassportBuilder<'input>)
                    = "eyr:" year:year() { b.expiration_year = Some(year) }

                rule hgt(b: &mut PassportBuilder<'input>)
                    = "hgt:" height:length() { b.height = Some(height) }

                rule pid(b: &mut PassportBuilder<'input>)
                    = "pid:" id:id() { b.passport_id = Some(id) }

                rule cid(b: &mut PassportBuilder<'input>)
                    = "cid:" id:id() { b.country_id = Some(id) }

                rule hcl(b: &mut PassportBuilder<'input>)
                    = "hcl:" color:color() { b.hair_color = Some(color) }

                rule ecl(b: &mut PassportBuilder<'input>)
                    = "ecl:" color:color() { b.eye_color = Some(color) }

                rule year() -> Year
                    = num:num() { Year(num) }

                rule color() -> Color<'input>
                    = s:$((!separator()[_])*) { Color(s) }

                rule length() -> Length
                    = num:num() "cm" { Length::Cm(num) }
                    / num:num() "in" { Length::In(num) }

                rule num() -> u64
                    = s:$(['0'..='9']+) { s.parse().unwrap() }

                rule id() -> ID<'input>
                    = s:$(['0'..='9']+) { ID(s) }
            }
        }

        parser::root(input, &mut b).unwrap_or_else(|e| panic!("Could not parse {}: {}", input, e));
        b
    }
}

Hopefully it's self-explanatory!

Uh let me look... what's [_]?

Matches anything. Says so in the rustdocs.

Okay but what's ![_]?

Only matches at EOF (the end of the file)!

Ooh that way we're sure we parse everything from the input?

Yup!

Time to answer part 1.

Rust code
fn main() {
    let results = include_str!("input.txt")
        .split("\n\n")
        .map(PassportBuilder::parse)
        .map(PassportBuilder::build);

    let num_valid = results.filter(Result::is_ok).count();
    println!("{} passport records were valid", num_valid);
}

After actually trying out my input, I had to make two adjustments.

First, the passport ID occasionally contained a pound/sharp/hash/octothorpe character and some letters:

Rust code
// in the grammar:

                rule id() -> ID<'input>
                    = s:$(['0'..='9' | 'a'..='z' | '#']+) { ID(s) }

Secondly, sometimes the height had no specified unit (gasp!). Luckily, easy adjustment to make here too:

Rust code
#[derive(Clone, Copy, PartialEq, Debug)]
enum Length {
    /// Centimeters (the correct unit)
    Cm(u64),
    /// Inches (the incorrect unit)
    In(u64),
    /// No unit
    Unspecified(u64), // new!
}

// in the grammar:

                rule length() -> Length
                    = num:num() "cm" { Length::Cm(num) }
                    / num:num() "in" { Length::In(num) }
                    / num:num() { Length::Unspecified(num) }
Shell session
$ cargo run --quiet
235 passport records were valid

Bingo!

Part 2

The second part of the problem gets serious about validation, we now have:

Time to actually make our parsers fallible!

For that, we'll use {? } blocks of Rust code (rather than {}). That way we can return Err (with a &'static str) or Ok.

We can fix all the years right up by just adding a parameter to our year parser:

Rust code
                rule byr(b: &mut PassportBuilder<'input>) -> ()
                    = "byr:" year:year((1920..=2002)) { b.birth_year = Some(year); }

                rule iyr(b: &mut PassportBuilder<'input>) -> ()
                    = "iyr:" year:year((2010..=2020)) { b.issue_year = Some(year); }

                rule eyr(b: &mut PassportBuilder<'input>) -> ()
                    = "eyr:" year:year((2020..=2030)) { b.expiration_year = Some(year); }

                rule year(range: RangeInclusive<u64>) -> Year
                    = num:num() {?
                        if range.contains(&num) {
                            Ok(Year(num))
                        } else {
                            Err("year out of range")
                        }
                    }

For hgt, we need to first remove the Unspecified variant, and check the ranges as well:

Rust code
                rule hgt(b: &mut PassportBuilder<'input>)
                    = "hgt:" height:length() {?
                        match &height {
                            Length::Cm(v) if !(150..=193).contains(v) => {
                                Err("bad height (cm)")
                            },
                            Length::In(v) if !(59..=76).contains(v) => {
                                Err("bad height (in)")
                            },
                            _ => {
                                b.height = Some(height);
                                Ok(())
                            },
                        }
                    }

I knew that was fishy! Unitless lengths. Nonsense!

And the rest we can fix pretty easily too:

Rust code
                rule pid(b: &mut PassportBuilder<'input>)
                    = "pid:" id:$(['0'..='9']*<9,9>) { b.passport_id = Some(ID(id)) }

                rule cid(b: &mut PassportBuilder<'input>)
                    = "cid:" id:$((!separator()[_])+) { b.country_id = Some(ID(id)) }

                rule hcl(b: &mut PassportBuilder<'input>)
                    = "hcl:" color:hcl0() { b.hair_color = Some(color) }

                rule hcl0() -> Color<'input>
                    = s:$("#" ['0'..='9' | 'a'..='f']*<6,6>) { Color(s) }

                rule ecl(b: &mut PassportBuilder<'input>)
                    = "ecl:" color:ecl0() { b.eye_color = Some(color) }

                rule ecl0() -> Color<'input>
                    = s:$("amb" / "blu" / "brn" / "gry" / "grn" / "hzl" / "oth") { Color(s) }

Now we only need to forward parsing errors rather than panic! on it.

AhAH! That's what you get for being lazy about error handling.

It was a different time... it made sense back then!

Rust code
#[derive(thiserror::Error, Debug)]
enum Error {
    #[error("missing field: {0}")]
    MissingField(&'static str),

    // new!
    #[error("could not parse {0}: {1}")]
    ParseError(String, String),
}
Rust code
    fn parse(input: &'a str) -> Result<Self, Error> {
        let mut b: Self = Default::default();

        // omitted: grammar

        parser::root(input, &mut b).map_err(|e| Error::ParseError(input.into(), e.to_string()))?;
        Ok(b)
    }

And finally, let's get our answer for part 2:

Rust code
fn main() {
    let results = include_str!("input.txt")
        .split("\n\n")
        .map(|input| PassportBuilder::parse(input).and_then(|b| b.build()));

    let num_valid = results.filter(Result::is_ok).count();
    println!("{} passport records were valid", num_valid);
}
Shell session
$ cargo run --quiet
194 passport records were valid

Whoa, we actually got it right the first time!

We always do 😎

Until next time, stay warm!