Another day, another Advent of Code 2020 problem.

That one seems fun! For some nerdy values of fun.

Our input is a set of rules:

light red bags contain 1 bright white bag, 2 muted yellow bags.
dark orange bags contain 3 bright white bags, 4 muted yellow bags.
bright white bags contain 1 shiny gold bag.
muted yellow bags contain 2 shiny gold bags, 9 faded blue bags.
shiny gold bags contain 1 dark olive bag, 2 vibrant plum bags.
dark olive bags contain 3 faded blue bags, 4 dotted black bags.
vibrant plum bags contain 5 faded blue bags, 6 dotted black bags.
faded blue bags contain no other bags.
dotted black bags contain no other bags.

And the question we have to answer is: "How many bag colors can eventually contain at least one shiny gold bag?"

From looking at the example input above, for example, "bright white bags" can directly contain a shiny gold bag, whereas light red bags can contain a bright white bag which can contain a shiny gold bag.

The question for part 1 doesn't use "how many bags of colors Y can be contained in bags of color X", so I suspect it's going to show up in part 2 (which, as a reminder, I can't see until I've solved the first part) - I'd like to try and parse and represent that input as types that make it easy to answer that first question.

Bags are clearly defined by two things: an adjective (light, dark, bright, muted, shiny, vibrant, faded, dotted) and a color (red, orange, white, yellow, gold, olive, plum, etc).

We don't know yet how many different adjectives or colors there'll be, so I'm tempted to just represent those as borrowed strings:

Rust code
/// (adjective, color), i.e. ("dark", "orange")
type BagSpec<'a> = (&'a str, &'a str);
Cool bear's hot tip

"Spec" is used here as a shorthand for "specification".

From there, all our rules are simply a mapping from BagSpec to a quantity and another BagSpec:

Rust code
use std::collections::HashMap;

/// K can contain V.0 of V.1
type Rules<'a> = HashMap<BagSpec<'a>, (usize, BagSpec<'a>)>;

But here's the thing - some bags can contain several types of other bags. From the example above:

light red bags contain 1 bright white bag, 2 muted yellow bags.

We can't represent this rule with our current set of types:

Rust code
fn main() {
    let mut rules: Rules = Default::default();
    rules.insert(("light", "red"), (1, ("bright", "white")));
    rules.insert(("light", "red"), (2, ("muted", "yellow")));
    dbg!(&rules);
}
Shell session
$ cargo run --quiet
[src/main.rs:13] &rules = {
    (
        "light",
        "red",
    ): (
        2,
        (
            "muted",
            "yellow",
        ),
   

The second rule has overwritten the first one!

The good news is: there's a crate (and a type) for that!

Shell session
$ cargo add multimap
      Adding multimap v0.8.2 to dependencies
Rust code
use multimap::MultiMap;

/// K can contain V.0 of V.1
type Rules<'a> = MultiMap<BagSpec<'a>, (usize, BagSpec<'a>)>;
Shell session
$ cargo run --quiet
[src/main.rs:13] &rules = {
    (
        "light",
        "red",
    ): [
        (
            1,
            (
                "bright",
                "white",
            ),
        ),
        (
            2,
            (
                "muted",
                "yellow",
            ),
        ),
    ],
}

Ooh, just like HeaderMap in hyper?

Let's now try parsing our sample rules into this data structure.

peg?

peg.

Shell session
$ cargo add peg
      Adding peg v0.6.3 to dependencies
Rust code
fn parse_rules(input: &str) -> Rules<'_> {
    let mut rules: Rules = Default::default();

    peg::parser! {
        pub(crate) grammar parser() for str {
            pub(crate) rule root(r: &mut Rules<'input>)
                = (line(r) "." whitespace()*)* ![_]

            rule line(r: &mut Rules<'input>)
                = spec:bag_spec() " contain " rules:rules() {
                    if let Some(rules) = rules {
                        for rule in rules {
                            r.insert(spec, rule)
                        }
                    }
                }

            rule bag_spec() -> BagSpec<'input>
                = adjective:name() " " color:name() " bag" "s"? { (adjective, color) }

            rule rules() -> Option<Vec<(usize, BagSpec<'input>)>>
                = rules:rule1()+ { Some(rules) }
                / "no other bags" { None }

            /// Rule followed by an optional comma and space
            rule rule1() -> (usize, BagSpec<'input>)
                = r:rule0() ", "? { r }

            /// A single rule
            rule rule0() -> (usize, BagSpec<'input>)
                = quantity:number() " " spec:bag_spec() { (quantity, spec) }

            rule number() -> usize
                = e:$(['0'..='9']+) { e.parse().unwrap() }

            /// A sequence of non-whitespace characters
            rule name() -> &'input str
                = $((!whitespace()[_])*)

            /// Spaces, tabs, CR and LF
            rule whitespace()
                = [' ' | '\t' | '\r' | '\n']
        }
    }

    parser::root(input, &mut rules).unwrap();
    rules
}

Now, dbg!() makes for pretty verbose output - maybe we can replicate the formatting of the input, for inspection?

Rust code
use std::fmt;

struct FormattedRules<'a>(Rules<'a>);

impl fmt::Display for FormattedRules<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        for (k, vv) in &self.0 {
            write!(f, "{} {} bags can contain ", k.0, k.1)?;
            if vv.is_empty() {
                write!(f, "no other bags")?;
            } else {
                for (i, v) in vv.iter().enumerate() {
                    if i > 0 {
                        write!(f, ", ")?;
                    }
                    write!(
                        f,
                        "{} {} {} {}",
                        v.0,
                        v.1 .0,
                        v.1 .1,
                        if v.0 == 1 { "bag" } else { "bags" }
                    )?;
                }
            }
            writeln!(f, ".")?;
        }
        Ok(())
    }
}
Rust code
fn main() {
    let rules = parse_rules(include_str!("input.txt"));
    print!("{}", FormattedRules(rules));
}

Now, to compare the output of our program to the sample input, we have to sort it - good old sort command will work out just fine:

Shell session
$ cargo run --quiet | sort -n
bright white bags can contain 1 shiny gold bag.
dark olive bags can contain 3 faded blue bags, 4 dotted black bags.
dark orange bags can contain 3 bright white bags, 4 muted yellow bags.
light red bags can contain 1 bright white bag, 2 muted yellow bags.
muted yellow bags can contain 2 shiny gold bags, 9 faded blue bags.
shiny gold bags can contain 1 dark olive bag, 2 vibrant plum bags.
vibrant plum bags can contain 5 faded blue bags, 6 dotted black bags.
Shell session
$ cat src/input.txt | sort -n
bright white bags contain 1 shiny gold bag.
dark olive bags contain 3 faded blue bags, 4 dotted black bags.
dark orange bags contain 3 bright white bags, 4 muted yellow bags.
dotted black bags contain no other bags.
faded blue bags contain no other bags.
light red bags contain 1 bright white bag, 2 muted yellow bags.
muted yellow bags contain 2 shiny gold bags, 9 faded blue bags.
shiny gold bags contain 1 dark olive bag, 2 vibrant plum bags.
vibrant plum bags contain 5 faded blue bags, 6 dotted black bags.

Seems good! The only difference is we don't remember bags that can contain no other bags. I'm going to say that doesn't matter right now - but we can always change our mind later.

Back to the question. Even though this data structure is flat, we have a graph on our hands:

It's not just any graph, either. It's a Directed Acyclic Graph.

Oh, DAGs! Sure! I like DAGs.

So if we want to know "what bag colors can eventually contain a shiny gold bag?", we can just walk the graph. For every node in the graph, we just follow the arrows until we either:

Let's take "light red" for example. If we follow the whole graph, these are all the edges and nodes we encounter:

And this subset does contain "shiny gold", so "light red" bags can eventually contain "shiny gold" bags.

If we start with dark olive however, we only encounter these:

And none of these are "shiny gold".

Now all we have left is the easy part - make our dreams into reality:

Rust code
fn subgraph_contains(graph: &Rules<'_>, root: &(&str, &str), needle: &(&str, &str)) -> bool {
    if let Some(neighbors) = graph.get_vec(root) {
        for (_, neighbor) in neighbors {
            if neighbor == needle || subgraph_contains(graph, neighbor, needle) {
                return true;
            }
        }
    }
    false
}

Do we need those if let and that for loop? Couldn't we just use iterator methods?

Fine, something like this then:

Rust code
fn subgraph_contains(graph: &Rules<'_>, root: &(&str, &str), needle: &(&str, &str)) -> bool {
    graph
        .get_vec(root)
        .map(|v| {
            v.iter().any(|(_, neighbor)| {
                neighbor == needle || subgraph_contains(graph, neighbor, needle)
            })
        })
        .unwrap_or_default()
}

Ehh couldn't we flatten this a bit?

Sure, sure, something like that:

Rust code
fn subgraph_contains(graph: &Rules<'_>, root: &(&str, &str), needle: &(&str, &str)) -> bool {
    graph
        .get_vec(root)
        .unwrap_or(&Default::default())
        .iter()
        .any(|(_, neighbor)| neighbor == needle || subgraph_contains(graph, neighbor, needle))
}

Ah, that's sorta silly - won't it allocate in case there there's no values associated to the root key?

Welp, too late now.

Fine, but do you promise we'll learn a trick later that could make this better?

Yes. I promise. There!

ANYWAY, we can use such a function like that:

Rust code
fn main() {
    let rules = parse_rules(include_str!("input.txt"));

    let needle = &("shiny", "gold");
    let colors_that_contain_shiny_gold: Vec<_> = rules
        .keys()
        // shiny gold bags are already shiny god, we're not interested
        // in what they can contain (as per the example)
        .filter(|&k| k != needle)
        .filter(|&k| subgraph_contains(&rules, k, needle))
        .collect();
    println!("{:?}", colors_that_contain_shiny_gold);
}

And we would get that answer:

Shell session
$ cargo run --quiet
[("dark", "orange"), ("light", "red"), ("bright", "white"), ("muted", "yellow")]

Which matches the example in the problem statement:

In the above rules, the following options would be available to you:

  • A bright white bag, which can hold your shiny gold bag directly.
  • A muted yellow bag, which can hold your shiny gold bag directly, plus some other bags.
  • A dark orange bag, which can hold bright white and muted yellow bags, either of which could then hold your shiny gold bag.
  • A light red bag, which can hold bright white and muted yellow bags, either of which could then hold your shiny gold bag.

So that's it? We're done?

Well...

There's something else I want to try. When we walk the graph starting from all the nodes, we walk the same subgraphs multiple times.

For example, walking the graph from "light red" and from "dark orange" means visiting "bright white", "muted yellow", "faded blue", and "shiny gold" twice:

We can get rid of all this duplciate work if we just switch it up a little.

Just like gittup, we're going to make things faster by making the arrows go up.

Imagine our graph looked like this instead:

Now our arrows, or edges, mean "N of this bag can be stored in that bag".

And if we want to ask the question "which bags can stored shiny gold bags?", we just need to start with "shiny gold" and walk the entire subgraph.

But how does one go about making the arrows go up?

It's quite simple really:

Rust code
fn reverse_graph<'a>(graph: &Rules<'a>) -> Rules<'a> {
    let mut reverse: Rules = Default::default();
    for (&node, neighbors) in graph.iter_all() {
        for &(quantity, neighbor) in neighbors {
            reverse.insert(neighbor, (quantity, node));
        }
    }
    reverse
}

Wait.. can you collect into a MultiMap?

Oh right!

Rust code
fn reverse_graph<'a>(graph: &Rules<'a>) -> Rules<'a> {
    graph
        .iter_all()
        .map(|(&node, neighbors)| {
            neighbors
                .iter()
                .map(move |&(quantity, neighbor)| (neighbor, (quantity, node)))
        })
        .flatten()
        .collect()
}

Ooh, flatten! Is that new?

Old as the world itself. We need it because .iter_all() returns an Iterator<Item = (tuple, Vec<tuple>)> - because it's a multimap, remember?

Neat!

And then we make a function to walk a subgraph, right?

We do! But what should it return?

Mhhh an iterator?

Let's start with a Vec:

Rust code
fn walk_subgraph<'a>(graph: &Rules<'a>, root: &(&str, &str)) -> Vec<(&'a str, &'a str)> {
    let mut res: Vec<_> = Default::default();
    if let Some(neighbors) = graph.get_vec(root) {
        for &(_quantity, neighbor) in neighbors {
            res.push(neighbor);
            res.extend(walk_subgraph(graph, &neighbor));
        }
    }
    res
}

And see what we get:

Rust code
fn main() {
    let rules = parse_rules(include_str!("input.txt"));
    let rev_rules = reverse_graph(&rules);

    let colors_that_contain_shiny_gold = walk_subgraph(&rev_rules, &("shiny", "gold"));
    println!("{:?}", colors_that_contain_shiny_gold);
}
Shell session
$ cargo run --quiet
[("bright", "white"), ("light", "red"), ("dark", "orange"), ("muted", "yellow"), ("light", "red"), ("dark", "orange")]

Hey! There's duplicates!

Indeed, if we naively walk the graph starting from "shiny gold", we're going to visit some nodes more than once.

We could collect into a HashSet, or do something else - for now let's worry about not allocating a Vec every time we walk a subgraph.

One option is to just take a &mut Vec. That's completely legitimate:

Rust code
fn walk_subgraph1<'a>(graph: &Rules<'a>, root: &(&str, &str), res: &mut Vec<(&'a str, &'a str)>) {
    if let Some(neighbors) = graph.get_vec(root) {
        for &(_quantity, neighbor) in neighbors {
            res.push(neighbor);
            walk_subgraph1(graph, &neighbor, res);
        }
    }
}

fn main() {
    let rules = parse_rules(include_str!("input.txt"));
    let rev_rules = reverse_graph(&rules);

    let mut colors_that_contain_shiny_gold = Default::default();
    walk_subgraph1(
        &rev_rules,
        &("shiny", "gold"),
        &mut colors_that_contain_shiny_gold,
    );
    println!("{:?}", colors_that_contain_shiny_gold);
}

This gives us the exact same result, with fewer allocations. Again, it's a legitimate technique that allows us to not worry about the borrow checker as much.

Of course now we cannot collect into a HashSet to filter out duplicates. And we didn't even want to do that in the first place, we just needed to count them!

So let's try making a version that returns an iterator. The problem we're going to encounter is that our graph can be infinitely large, and so our iterator type, if done naively, will also be infinitely large - I've talked about this in May of 2019.

Ah yes, back when your articles still fit inside of a coffee break.

The tl;dr of this article was: Box is your friend, so, without further ado:

Rust code
fn walk_subgraph2<'iter, 'elems: 'iter>(
    graph: &'iter Rules<'elems>,
    root: &(&'iter str, &'iter str),
) -> Box<dyn Iterator<Item = (&'elems str, &'elems str)> + 'iter> {
    Box::new(
        graph
            .get_vec(root)
            .into_iter()
            .flatten()
            .map(move |&(_, neighbor)| {
                std::iter::once(neighbor).chain(walk_subgraph2(graph, &neighbor))
            })
            .flatten(),
    )
}

Quick question! So the into_iter() I understand - we have a &Vec<V>, which we turn into an Iterator<Item = V>. But what's with the flatten()?

Well actually... we don't have a &Vec<T>.

We don't?

Rust code
    pub fn get_vec<Q: ?Sized>(&self, k: &Q) -> Option<&Vec<V>>
        where K: Borrow<Q>,
              Q: Eq + Hash

Ohh we have an Option<&Vec<V>>! That's iterable?

Of course it is! It's an iterator that yields one element if it's Some, or zero elements if it's None.

Wicked. And flatten() gives us an iterator that either yield all the elements of the Vec, or nothing at all?

OHHhhhh is that the trick you mentioned that could've replaced .unwrap_or(&Default::default()) with something better?

Yup! 😎

Now that we have an iterator, we can use another goodie from itertools, and this time, I triple-checked that it's not in the standard library:

Shell session
$ cargo add itertools
      Adding itertools v0.9.0 to dependencies
Rust code
use itertools::Itertools;

fn main() {
    let rules = parse_rules(include_str!("input.txt"));
    let rev_rules = reverse_graph(&rules);

    let needle = ("shiny", "gold");
    let answer = walk_subgraph2(&rev_rules, &needle).unique().count();
    println!("{} colors can contain {:?} bags", answer, needle);
}
Shell session
$ cargo run --quiet
4 colors can contain ("shiny", "gold") bags

Ooh, I saw what you did there - &str implements Debug, so (&str, &str) does too. Neat!

But how does .unique() work?

It... it collects into a HashSet. But that's hidden! So our code is nice and functional.

Let's run it on my actual puzzle input:

Shell session
$ cargo run --quiet
103 colors can contain ("shiny", "gold") bags

Correct!

Part 2

We're now asked to count how many bags you must buy to fit into a "shiny gold" bag. We're finally putting those numbers to use!

Time for another walk_subgraph method - one that actually includes quantities.

Only a few changes are needed:

Rust code
fn walk_subgraph3<'iter, 'elems: 'iter>(
    graph: &'iter Rules<'elems>,
    root: &(&'iter str, &'iter str),
                          // 👇 we're now returning the quantity as well
) -> Box<dyn Iterator<Item = (usize, (&'elems str, &'elems str))> + 'iter> {
    Box::new(
        graph
            .get_vec(root)
            .into_iter()
            .flatten()
            // 👇 this is even simpler, since we're not destructing the tuple anymore
            .map(move |&n| std::iter::once(n).chain(walk_subgraph3(graph, &n.1)))
            .flatten(),
    )
}

We're going to use the graph with the arrows pointing down, so we don't need to reverse the rules anymore. Also, we're not interested in .unique() anymore.

If "shiny gold" bags contain both "dark red" and "light red" bags, and both of these contain "dark teal" bags - we must count "dark teal" bags twice.

So, our main function is even simpler:

Rust code
fn main() {
    let rules = parse_rules(include_str!("input.txt"));
    let root = ("shiny", "gold");
    let answer = walk_subgraph3(&rules, &root).count();
    println!("you must buy {} bags to fill a {:?} bag", answer, root);
}
Shell session
$ cargo run --quiet
you must buy 63 bags to fill a ("shiny", "gold") bag

Uh oh... that's not the right answer!

Don't worry - it happens to the best of us. Too bad for the streak though.

Oh right - in my excitement, I forgot we need to multiply stuff together!

If every "shiny gold" bag has two "dark red" bags, and those have three "light magenta" bags, then we have 2*3 = 6 "light magenta" bags.

I guess we can't have a method as generic as walk_subgraph3 - let's make a custom one.

Rust code
fn bag_quantities<'iter, 'elems: 'iter>(
    graph: &'iter Rules<'elems>,
    root: &(&'iter str, &'iter str),
) -> Box<dyn Iterator<Item = usize> + 'iter> {
    Box::new(
        graph
            .get_vec(root)
            .into_iter()
            .flatten()
            .map(move |&(qt, n)| {
                std::iter::once(qt).chain(bag_quantities(graph, &n).map(move |x| x * qt))
            })
            .flatten(),
    )
}
Rust code
fn main() {
    let rules = parse_rules(include_str!("input.txt"));
    let root = ("shiny", "gold");
    let answer: usize = bag_quantities(&rules, &root).sum();
    println!("you must buy {} bags to fill a {:?} bag", answer, root);
}
Shell session
$ cargo run --quiet
you must buy 1469 bags to fill a ("shiny", "gold") bag

That's more like it!

I knew 63 was suspiciously low...

Until next time, take care!