In Veronica Mars and password hashes, from my new Tech As Seen On TV series, we've explored "cracking passwords" using brute-force methods, and then using rainbow tables, which was much, much faster.

But how do rainbow tables actually work? Let's start at the beginning.

What's a password hash?

A very simple design for an authentication system is to store passwords in clear text, say, in a file named password.txt:

hunter2
Cool bear's hot tip

Don't do that. Obviously. Not in the 80s, and not in 2020.

No, not even temporarily. Not even in logs. Just don't.

Then, the authentication code is really simple: just compare the password the user entered, against the one you have in your password file.

Shell session
$ cargo new pass
     Created binary (application) `pass` package
Rust code
// in `src/main.rs`

use std::{error::Error, fs::read_to_string, io::stdin};

fn main() -> Result<(), Box<dyn Error>> {
    let expected = read_to_string("password.txt")?;
    println!("Please enter your password:");

    let mut entered = Default::default();
    stdin().read_line(&mut entered)?;

    if entered.trim() == expected.trim() {
        println!("You're authenticated!");
    } else {
        println!("*angry beep* You do not have the required credentials.");
    }

    Ok(())
}

Either it matches, or it doesn't:

Shell session
$ cargo run -q
Please enter your password:
hunter2
You're authenticated!

$ cargo run -q
Please enter your password:
rabbit7
*angry beep* You do not have the required credentials.

But that's, uh, not great. If someone breaks into your server and gets ahold of password.txt, then they get aaaaall the passwords, and they can break into aaaaaaall the accounts.

Normally the password file would contain usernames as well, and it would be a database.

A slightly less awful way of doing it, would be store not the passwords, but cryptographic hashes of the passwords instead.

What's a hash? It's a function!

And what's a function?

A function is something that maps inputs to outputs.

For example, an oven is a function. If you give dough as an input to the oven, you get bread.

We can also write that in text form, with any input we want:

oven(dough) = bread
oven(apples) = baked apples
oven(phone) = fire hazard

It's very convenient to think of a function as a mapping from an input space (containing dough, apples, etc.) to an output space.

Every input maps to an output. That's not true for all functions... but for the ones we're interested in today, it is!

Let's look at other functions! How about f(x) = x + 1? If we say that x can be any positive integer (whole number), then both our input space and our output space are infinitely large — so I'm only going to show part of them.

You see here that, for this input space, 0 is never produced by our function.

It can get even worse, see f(x) = x * 2.

Here, half the output space does not map back to the input space.

In the case of hash functions though, things are slightly different. Although our input space is infinite (any byte string of any length), our output space is finite.

For example, the output space for the MD5 function is: "all possible 128-bit values", which means there's only 2128 possible outputs – just about 340 undecillion possibilities (that's 3.4e38 for our engineer friends).

This means several things.

First, with an infinite input space and a finite output space, a good hash function should leave no part of the output space unmapped. You can imagine applying the function to more than 340 undecillion inputs - much more, even.

After a while, if it's a good hash function, then the entire output space has been used.

But there's a second thing - our output space is not large enough for every element of the input space to have a unique mapping.

When two inputs map to the same output, we say we have a collision.

MD5 is super broken by now. In December 2010, researchers Tao Xie and Dengguo Feng announced they had found a "single block" (the inputs are only 512 bits) collision for MD5, where the inputs only differ by two bits.

But, we won't let the brokenness of MD5 get in the way of us discussing authentication systems. Let's use it.

First we'll do by hand the "set password" phase, where we need to calculate the MD5 hash of the password:

Shell session
$ printf "hunter2" | md5sum
2ab96390c7dbe3439de74d0c9b0b1767  -
Cool bear's hot tip

We're using printf instead of echo because echo adds a newline afterwards, and that would change the hash:

Shell session
echo "hunter2" | md5sum
6a0f0731d84afa4082031e3a72354991  -

And that's what we'll store in a hash.txt file.

2ab96390c7dbe3439de74d0c9b0b1767

Then, when we want to verify the user's password, well, we compute the hash of whatever the user entered and check it against the hash we have in our "database".

Shell session
$ cargo add md5
      Adding md5 v0.7.0 to dependencies
Rust code
use std::{error::Error, fs::read_to_string, io::stdin};

fn main() -> Result<(), Box<dyn Error>> {
    let expected = read_to_string("hash.txt")?;
    println!("Please enter your password:");

    let mut entered = Default::default();
    stdin().read_line(&mut entered)?;

    let entered = md5::compute(entered.trim().as_bytes());
    // we stored our md5 hash in text (hexadecimal) format,
    // so we must format our computed hash the same way
    // (or parse `expected` into an array of bytes - left
    // as an exercise to the reader)
    let entered = format!("{:016x}", entered);

    if entered.trim() == expected.trim() {
        println!("You're authenticated!");
    } else {
        println!("*angry beep* You do not have the required credentials.");
    }

    Ok(())
}

And then our authentication system is complete!

Shell session
$ cargo run -q
Please enter your password:
hunter2
You're authenticated!

$ cargo run -q
Please enter your password:
gatherer17
*angry beep* You do not have the required credentials.

A real users database

Let's make our program a little more interesting. We'll try to have a real password database.

For simplicity, we'll keep all the functionality in the same program: adding a user, authenticating, and various attack functions.

Let's define what we want our CLI to look like. I want to be able to do something like:

Shell session
$ pass add-user renny hunter2
User renny added to database
$ pass list-users
Users:
  - renny
$ pass auth renny hunter2
Authentication successful!
$ pass auth renny foobar
Bad password.
$ pass auth remy foobar
No such user.

With argh, we can achieve this in a nice and declarative way:

Shell session
$ cargo add argh
      Adding argh v0.1.3 to dependencies 
Rust code
use argh::FromArgs;

/// Experiment with passwords.
#[derive(FromArgs)]
struct Args {
    #[argh(subcommand)]
    command: Command,
}

#[derive(FromArgs)]
#[argh(subcommand)]
enum Command {
    AddUser(AddUser),
    ListUsers(ListUsers),
    Auth(Auth),
}

#[derive(FromArgs)]
/// Add a user to the database
#[argh(subcommand, name = "add-user")]
struct AddUser {
    #[argh(positional)]
    username: String,

    #[argh(positional)]
    password: String,
}

#[derive(FromArgs)]
/// List users
#[argh(subcommand, name = "list-users")]
struct ListUsers {}


#[derive(FromArgs)]
/// Authenticate as a user
#[argh(subcommand, name = "auth")]
struct Auth {
    #[argh(positional)]
    username: String,

    #[argh(positional)]
    password: String,
}

Next up, we need some way to represent our database. I think here it can just be a HashMap<String, Vec<u8>>, where the String is the username, and the Vec<u8> is the MD5 hash of the password.

Rust code
use std::collections::HashMap;

/// Maps username to passwords
struct Database {
    records: HashMap<String, Vec<u8>>,
}

We want to save our database to disk and load it back up, so let's draft a few helpful functions:

Rust code
use std::error::Error;

impl Database {
    fn load_or_create() -> Result<Self, Box<dyn Error>> {
        todo!("load database")
    }

    fn save(&self) -> Result<(), Box<dyn Error>> {
        todo!("save database")
    }

    /// Note: this function performs no locking whatsoever.
    /// Last closer wins.
    fn with<F, T>(f: F) -> Result<T, Box<dyn Error>>
    where
        F: FnOnce(&mut Self) -> Result<T, Box<dyn Error>>,
    {
        let mut db = Self::load_or_create()?;
        let res = f(&mut db);
        db.save()?;
        res
    }
}

And given those, it's fairly simple to implement what we had in mind:

Rust code
fn main() -> Result<(), Box<dyn Error>> {
    let args: Args = argh::from_env();
    match args.command {
        Command::AddUser(args) => Database::with(|db| {
            db.records
                .insert(args.username.clone(), md5::compute(args.password).to_vec());
            println!("User {} added to database", args.username);
            Ok(())
        }),
        Command::ListUsers(_) => Database::with(|db| {
            println!("Users:");
            for k in db.records.keys() {
                println!("  - {}", k);
            }
            Ok(())
        }),
        Command::Auth(args) => Database::with(|db| {
            let entered = md5::compute(args.password);
            match db.records.get(&args.username) {
                Some(stored) if stored == entered.as_ref() => {
                    println!("Authentication successful!");
                }
                Some(_) => {
                    println!("Bad password.");
                }
                None => {
                    println!("No such user.");
                }
            }
            Ok(())
        }),
    }
}

Let's try it out!

Shell session
$ cargo run -q
One of the following subcommands must be present:
    help
    add-user
    list-users
    auth

$ cargo run -q -- add-user renny hunter2
thread 'main' panicked at 'not yet implemented: load database', src/main.rs:53:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Oh, uh... right!

There you go again, cutting corners!

I resent that! todo!() is a perfectly fine macro to use – it's just doing its job.

Mhh, if you say so. Chop chop!

Okay, we need to serialize and deserialize something, y'all already know where this is going.

TOML markup
# in Cargo.toml

[dependencies]
serde = { version = "1", features = ["derive"] }
serde_json = "1"

# (already added by cargo-edit: md5, argh)

And then, well... from here on, as they say, it's all gravy.

First we:

Rust code
use serde::{Deserialize, Serialize};

/// Maps username to passwords
#[derive(Clone, Default, Serialize, Deserialize)]
struct Database {
    records: HashMap<String, Vec<u8>>,
}

And then we:

Rust code
use std::fs::File;

impl Database {
    const PATH: &'static str = "users.db";

    fn load_or_create() -> Result<Self, Box<dyn Error>> {
        Ok(match File::open(Self::PATH) {
            Ok(f) => serde_json::from_reader(f)?,
            Err(_) => Default::default(),
        })
    }

    fn save(&self) -> Result<(), Box<dyn Error>> {
        let f = File::create(Self::PATH)?;
        Ok(serde_json::to_writer_pretty(f, self)?)
    }

    // omitted: `with`
}

And then we're golden!

sh
$ cargo run -q -- add-user renny hunter2
User renny added to database

$ cargo run -q -- list-users
Users:
  - renny

$ cargo run -q -- auth renny hunter2
Authentication successful!

$ cargo run -q -- auth renny foobar
Bad password.

$ cargo run -q -- auth remy foobar
No such user.

Let's look at our users.db file:

JSON
{
  "records": {
    "renny": [
      42,
      185,
      99,
      144,
      199,
      219,
      227,
      67,
      157,
      231,
      77,
      12,
      155,
      11,
      23,
      103
    ]
  }
}

Oh that's...

Oh whoa.

Yeah, JSON doesn't really have a "byte slice" or "raw data" type

Yeah I suppose you could plug base64 in there, but...

Yeah. Better switch to something else.

We don't really care what format our "database" uses. Using serde gives us many different choices.

Let's go with something simple, like bincode.

Shell session
$ cargo rm serde_json
    Removing serde_json from dependencies
$ cargo add bincode
      Adding bincode v1.3.1 to dependencies

Let's change just a couple lines:

Rust code
impl Database {
    fn load_or_create() -> Result<Self, Box<dyn Error>> {
        Ok(match File::open(Self::PATH) {
            // was using serde_json
            Ok(f) => bincode::deserialize_from(f)?,
            Err(_) => Default::default(),
        })
    }

    fn save(&self) -> Result<(), Box<dyn Error>> {
        let f = File::create(Self::PATH)?;
        // was using serde_json
        Ok(bincode::serialize_into(f, self)?)
    }

    // omitted: everything else
}

And try adding Renny to the database again:

Shell session
$ cargo run -q -- add-user renny hunter2
pass(24262,0x10ccd8dc0) malloc: can't allocate region
:*** mach_vm_map(size=8872155185559142400, flags: 100) failed (error code=3)
pass(24262,0x10ccd8dc0) malloc: *** set a breakpoint in malloc_error_break to debug
memory allocation of 8872155185559138927 bytes failed[1]    24262 abort      cargo run -q -- add-user renny hunter2

Oh.

Huh. Oh wait - it's trying to load the JSON as bincode!

Yeah. And it's not happy about it.

Let's wipe users.db and try this again:

Shell session
$ rm users.db

$ cargo run -q -- add-user renny hunter2
User renny added to database

$ xxd users.db
00000000: 0100 0000 0000 0000 0500 0000 0000 0000  ................
00000010: 7265 6e6e 7910 0000 0000 0000 002a b963  renny........*.c
00000020: 90c7 dbe3 439d e74d 0c9b 0b17 67         ....C..M....

Eyyy, see 2ab963...? That's our hash! There's still a bunch of zeroes, but what are you gonna do - that's alignment for you.

We could plug in compression, if we want.

Can we please?

Oh alright.

But we won't use DEFLATE. We'll use something "modern". Something webscale. We're going to try ourselves a little.

What we want is... something that compresses away all those pesky zeroes, but doesn't necessarily try too hard. We probably want to prefer speed over compression ratio.

What we want is... something like Snappy.

And what do you know, there's a crate for that.

Shell session
$ cargo add snap
      Adding snap v1.0.1 to dependencies
Rust code
impl Database {
    fn load_or_create() -> Result<Self, Box<dyn Error>> {
        Ok(match File::open(Self::PATH) {
            // new: snap usage
            Ok(f) => bincode::deserialize_from(snap::read::FrameDecoder::new(f))?,
            Err(_) => Default::default(),
        })
    }

    fn save(&self) -> Result<(), Box<dyn Error>> {
        let f = File::create(Self::PATH)?;
        Ok(bincode::serialize_into(
            // new: snap usage
            snap::write::FrameEncoder::new(f),
            self,
        )?)
    }

    // omitted: everything else
}

But that's it. No more distractions.

Awww.

Let's see how snap behaves when loading something that's not snap-encoded...

Shell session
$ cargo run -q -- add-user renny hunter2
Error: Io(Custom { kind: Other, error: StreamHeader { byte: 1 } })

Yeah, that figures.

Shell session
$ rm users.db

$ cargo run -q -- add-user renny hunter2
User renny added to database

$ xxd users.db
00000000: ff06 0000 734e 6150 7059 0029 0000 5a2d  ....sNaPpY.)..Z-
00000010: 6c0b 2d04 0100 0901 0005 0907 1800 7265  l.-...........re
00000020: 6e6e 7910 0d0d 3c2a b963 90c7 dbe3 439d  nny...<*.c....C.
00000030: e74d 0c9b 0b17 67                        .M....g

Huh, it's actually longer. We're paying for the header here, and... I guess we don't have that many repeated zeros. It would probably pay off more with a bigger database - let's try adding 100 users, and compare sizes.

Shell session
$ ls -l users.db
-rw-r--r--  1 amos  staff  55 Oct 24 21:58 users.db

$ for i in $(seq 1 100); do ./target/debug/pass add-user user${i} foobar; done
User user1 added to database
User user2 added to database
User user3 added to database
(cut)
User user98 added to database
User user99 added to database
User user100 added to database

$ ls -l users.db
-rw-r--r--  1 amos  staff  610 Oct 24 22:00 users.db

Okay, that's pretty neat - the number of records increased 100x, and the database size only increased 11x.

Yeah, nice benchmark, genius - they all have the same password.

Too late, I'm being published as we speak.

But enough messing around. I recreated users.db with just the renny user, and our auth subcommand works fine:

Rust code
$ cargo run -q -- auth renny hunter2
Authentication successful!

Attacking the users database

How do we attack that database?

How do we, as Veronica puts it "recover the plaintext"?

Well, we can go old-school.

Just try all the passwords! One by one!

Rust code
#[derive(FromArgs)]
#[argh(subcommand)]
enum Command {
    AddUser(AddUser),
    ListUsers(ListUsers),
    Auth(Auth),
    // new!
    Bruteforce(Bruteforce),
}

#[derive(FromArgs)]
/// Try to brute-force user accounts
#[argh(subcommand, name = "bruteforce")]
struct Bruteforce {}

fn main() -> Result<(), Box<dyn Error>> {
    let args: Args = argh::from_env();
    match args.command {
        // omitted: other variants
        Command::Bruteforce(_) => bruteforce(),
    }
}

We're going to over-engineer the heck out of this. We'll want a Charset type, to represent the character set used to crack passwords, ie. the set of all characters we think might be in the password:

Rust code
struct Charset(Vec<u8>);

We'll want to be able to construct it from a string, for convenience:

Rust code
impl<T> From<T> for Charset
where
    T: AsRef<str>,
{
    fn from(t: T) -> Self {
        Self(t.as_ref().as_bytes().to_vec())
    }
}

...and treat it as a byte slice:

Rust code
impl AsRef<[u8]> for Charset {
    fn as_ref(&self) -> &[u8] {
        &self.0
    }
}

...and have a custom Debug implementation, in case we have non-graphic characters in there:

Rust code
use std::fmt;

impl fmt::Debug for Charset {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        for c in &self.0 {
            let c = *c as char;
            if c.is_ascii_graphic() {
                write!(f, "{}", c)?;
            } else {
                write!(f, "\\x{:02x}", c as u64)?;
            }
        }
        Ok(())
    }
}

And then well.. we'd love our Charset type to be able to generate all possible strings for a given character set and minimum/maximum length.

So... we'll need an iterator. But we don't want the iterator's items to be owned types, like Vec<u8> - because then we'd have to allocate for every string we generate, and that would probably slow us down very quickly.

Have you measured that?

No? But no code is faster than some code?

...okay fine.

So we can't use the standard Iterator trait. But that's okay, we can just roll with our own type!

Or use the streaming_iterator crate!

Yes, we could! But we spent all our budget on snap, sorry.

Rust code
struct CharsetIterator<'a> {
    /// Characters that can appear in the output
    charset: &'a Charset,
    /// Minimum and maximum length of the output
    len_range: RangeInclusive<usize>,
    /// Current length of the output
    current_len: usize,
    /// Current positions used in `charset` for the output
    indices: Vec<usize>,
    /// Current output
    buf: Vec<u8>,
}

And then, the big one:

Rust code
impl<'a> CharsetIterator<'a> {
    fn next(&mut self) -> Option<&[u8]> {
        if !self.len_range.contains(&self.current_len) {
            return None;
        }

        // Generate current item slice
        let charset = self.charset.as_ref();
        // ...this better get inlined
        for (index, slot) in self.indices.iter().copied().zip(self.buf.iter_mut()) {
            *slot = charset[index];
        }
        let slice = &self.buf[..self.current_len];

        // Generate next item indices
        match (&self.indices[..self.current_len])
            .iter()
            .position(|&i| i < charset.len() - 1)
        {
            Some(i) => {
                // one of the characters in the string is not the
                // *last* character of the charset yet, so reset
                // everything before it to the first character...
                for j in 0..i {
                    self.indices[j] = 0;
                }
                // ...and increment it
                self.indices[i] += 1;
            }
            None => {
                // all characters are maxed out, our only option is
                // to make the input string longer. We'll return `None`
                // the next time `next` gets called if we reached our
                // maximum length.
                self.current_len += 1;
                for slot in self.indices.iter_mut() {
                    *slot = 0;
                }
            }
        }

        Some(slice)
    }
}

impl Charset {
    fn iter<'a>(&'a self, len_range: RangeInclusive<usize>) -> CharsetIterator<'a> {
        CharsetIterator {
            charset: self,
            len_range,
            current_len: 0,
            indices: vec![],
            buf: Vec::new(),
        }
    }
}

Now, this is not the fastest way do to this. It's not even the smartest way to do this. But it works.

The rest is really simple: for each password candidate generated via CharsetIterator, compute the MD5 hash, and check against the database.

Rust code
#[derive(Debug)]
struct BruteforceParams {
    len_range: RangeInclusive<usize>,
    charset: Charset,
}

fn bruteforce() -> Result<(), Box<dyn Error>> {
    let params = BruteforceParams {
        len_range: 4..=8,
        charset: "abcdefghijklmnopqrstuvwxyz0123456789".into(),
    };
    println!("{:?}", params);

    let records = Database::with(|db| Ok(db.records.clone()))?;
    let start_time = Instant::now();

    let mut candidates = params.charset.iter(params.len_range);
    while let Some(pass) = candidates.next() {
        let hash = md5::compute(pass);

        for (db_user, db_hash) in &records {
            if hash.as_ref() == db_hash {
                println!(
                    "[CRACKED in {:?}] user ({}) has password ({})",
                    start_time.elapsed(),
                    db_user,
                    std::str::from_utf8(pass).unwrap_or("<not utf-8>")
                );
            }
        }
    }
    Ok(())
}

AS IT TURNS OUT, this code is not the fastest. Not even a little.

So I've added some weaker passwords to the database just so we can make sure it works.

Let's try it out!

Shell session
$ cargo build --release --quiet

$ ./target/release/pass bruteforce
BruteforceParams { len_range: 4..=8, charset: abcdefghijklmnopqrstuvwxyz0123456789 }
[CRACKED in 149.740418ms] user (keith) has password (mars)
[CRACKED in 3.994172005s] user (joe) has password (biden)
[CRACKED in 9.72343814s] user (bernie) has password (bird7)
[CRACKED in 10.487802794s] user (twentytwenty) has password (aaaaaa)
[CRACKED in 171.819211016s] user (dev) has password (foobar)
^C

aaaaand that's how much patience I have right now. Just under three minutes.

So there's several things we could address with our brute force implementation.

We're doing a lot of unnecessary in our CharsetIterator - for every output we generate, we look up and set len_range.max() characters. Often, we'd only need to change just a single character.

The fact that we're returning a borrowed slice is cool, but it makes it awkward to parallelize. What if we wanted to use N threads instead of just the one? We couldn't just hand out &'a [u8] to our worker threads, because the CharsetIterator is re-using its internal buffer to write the next one.

Having each thread copy the &[u8] to their own buffer would still leave us with synchronization problems. Not to mention, that's a copy we should be able to avoid, since the source of truth here is CharsetIterator::indices.

Speaking of indices - why is it a Vec<usize>? The charset is not really a character set... it's a byte set! All entries in Charset::0 should be unique, and there should never be more than 256 entries, because there's only 256 distinct 8-bit values.

But you know - if your password was "biden", we'd still "pwn" you in four seconds. So there's that.

Generating hash tables

If we're cracking a lot of password database that use MD5 as their "cryptographic hash" (ie., if we're stuck between 1992 and 1996), then we probably want to pre-compute some hashes.

And we can do that!

But before we do - let's change up our CharsetIterator a little.

Here's an idea: instead of maintaining an indices vector, how about we use a single number to refer to an input of a given length and charset?

For length 3 and charset "abc", we'd have:

...and so on. We could make it an u64, which has 1.9e19 possible values.

For a full charset (all 255 bytes are fair game), there's only 1.8e19 passwords of length 8. For just [A-Za-z0-9], there's only 8.4e10 passwords of length 10.

And I can guarantee you I don't have the patience to come up with an implementation that is fast enough for passwords longer than 7 characters today.

And if I did, well, we could always switch to an u128, which has 3.4e38 possible values.

So... stick to an u64?

Stick to an u64.

Our strategy is going to be much simpler here.

Rust code
impl Charset {
    fn range(&self, output_len: u32) -> Range<u64> {
        0..(self.0.len() as u64).pow(output_len)
    }

    fn get_into(&self, i: u64, buf: &mut [u8]) {
        let n = self.0.len() as u64;

        let mut remain = i;
        for slot in buf.iter_mut() {
            let modulo = remain % n;
            *slot = self.0[modulo as usize];
            remain = (remain - modulo) / n;
        }
    }
}

..and then we can forget about CharsetIterator completely.

Here's how we can use it:

Rust code
fn bruteforce() -> Result<(), Box<dyn Error>> {
    let params = BruteforceParams {
        len_range: 4..=8,
        charset: "abcdefghijklmnopqrstuvwxyz0123456789".into(),
    };
    println!("{:?}", params);

    let records = Database::with(|db| Ok(db.records.clone()))?;
    let start_time = Instant::now();

    for len in params.len_range {
        let mut buf = vec![0u8; len];
        for i in params.charset.range(buf.len() as _) {
            params.charset.get_into(i, &mut buf);
            let hash = md5::compute(&buf);

            for (db_user, db_hash) in &records {
                if hash.as_ref() == db_hash {
                    println!(
                        "[CRACKED in {:?}] user ({}) has password ({})",
                        start_time.elapsed(),
                        db_user,
                        std::str::from_utf8(&buf).unwrap_or("<not utf-8>")
                    );
                }
            }
        }
    }
    Ok(())
}

Let's give it a whirl:

Shell session
$ ./target/release/pass bruteforce
BruteforceParams { len_range: 4..=8, charset: abcdefghijklmnopqrstuvwxyz0123456789 }
[CRACKED in 172.104729ms] user (keith) has password (mars)
[CRACKED in 4.863978521s] user (joe) has password (biden)
[CRACKED in 11.755943184s] user (bernie) has password (bird7)
[CRACKED in 12.761011938s] user (twentytwenty) has password (aaaaaa)
^C

It still works, hurray!

It's... slightly slower though. Integer division doesn't come for free!

But you know what we can do here that's sorta neat? We can parallelize that shit. Hell yeah. Someone get rayon on the phone.

Ring ring motherf*cker. We've got some cores for you.

Shell session
$ cargo add rayon
      Adding rayon v1.5.0 to dependencies
Rust code
use rayon::prelude::*;

fn bruteforce() -> Result<(), Box<dyn Error>> {
    let params = BruteforceParams {
        len_range: 4..=8,
        charset: "abcdefghijklmnopqrstuvwxyz0123456789".into(),
    };
    println!("{:?}", params);

    let records = Database::with(|db| Ok(db.records.clone()))?;
    let start_time = Instant::now();

    for len in params.len_range.clone() {
        params
            .charset
            .range(len as _)
            .into_par_iter()
            .for_each_with(vec![0u8; len], |mut buf, i| {
                params.charset.get_into(i, &mut buf);
                let hash = md5::compute(&buf);

                for (db_user, db_hash) in &records {
                    if hash.as_ref() == db_hash {
                        println!(
                            "[CRACKED in {:?}] user ({}) has password ({})",
                            start_time.elapsed(),
                            db_user,
                            std::str::from_utf8(&buf).unwrap_or("<not utf-8>")
                        );
                    }
                }
            })
    }
    Ok(())
}

Let it all burn, baby. Make those fans roar.

Shell session
$ cargo build --release && ./target/release/pass bruteforce
   Compiling pass v0.1.0 (/Users/amos/ftl/pass)
    Finished release [optimized] target(s) in 1.17s
BruteforceParams { len_range: 4..=8, charset: abcdefghijklmnopqrstuvwxyz0123456789 }
[CRACKED in 8.082686ms] user (keith) has password (mars)
[CRACKED in 751.047863ms] user (joe) has password (biden)
[CRACKED in 1.231836105s] user (bernie) has password (bird7)
[CRACKED in 1.302122549s] user (twentytwenty) has password (aaaaaa)
[CRACKED in 54.672641246s] user (dev) has password (foobar)
^C

There. That's more like it. Why do we care? Well, because we're going to be computing tables! Like... the MD5 hashes of all passwords of length 6.

And well.. that's a lot of hashes.

So we might as well do it in parallel.

And now we have to think: what's the best way to compute and store 2.1 billion hashes?

bincode is fast, sure, but do we really need it? Do we really want to pay the cost of serialization for every single hash?

If you think about it... 2.1 billion hashes is just 2.1 billion [u8; 16]. It's just one big &[[u8; 16]].

In my early experiments, I've learned one thing: it's that just mapping a 32GB file and letting 16 threads write to it is... not a good time.

What we want here is to split our workload into chunks, and sequentially process each of these chunks... with rayon, so it can split chunks even further and let different CPU cores handle them.

Let's go:

Shell session
$ cargo add memmap indicatif
      Adding memmap v0.7.0 to dependencies
      Adding indicatif v0.15.0 to dependencies
Rust code
#[derive(FromArgs)]
#[argh(subcommand)]
enum Command {
    // omitted: other variants
    GenHtable(GenHtable),
}

#[derive(FromArgs)]
/// Generate a hash table
#[argh(subcommand, name = "gen-htable")]
struct GenHtable {}

fn main() -> Result<(), Box<dyn Error>> {
    let args: Args = argh::from_env();
    match args.command {
        // omitted: other variants
        Command::GenHtable(_) => gen_htable(),
    }
}

Just before we store 2.1 billion hashes, we're probably going to write a "table header" - that specifies what length and character set the table is for, which will help when we actually want to, y'know, use it.

For that we can use bincode - we only have the one header:

Rust code
#[derive(Serialize, Deserialize)]
struct TableHeader {
    len: u32,
    charset: Vec<u8>,
}

And then... it's time to burn our CPU cores a little more:

Rust code
const HASH_LENGTH: usize = 16; 

fn progress_style() -> ProgressStyle {
    ProgressStyle::default_bar()
        .template("[{elapsed_precise}] [{bar:40.blue}] ({eta_precise} left)")
        .progress_chars("#>-")
}

fn gen_htable() -> Result<(), Box<dyn Error>> {
    let item_len = 6;
    let charset: Charset = "abcdefghijklmnopqrstuvwxyz0123456789".into();
    let total_hashes = charset.range(item_len).end;
    println!(
        "Generating {} hashes — for all items of length {}, with characters {:?}",
        total_hashes, item_len, charset
    );

    let progress = ProgressBar::new(total_hashes).with_style(progress_style());
    progress.enable_steady_tick(250);

    // Write the header and pre-size the file
    let hashes_offset_in_file = {
        let mut file = File::create("table.db")?;
        bincode::serialize_into(
            &mut file,
            &TableHeader {
                len: item_len,
                charset: charset.0.to_vec(),
            },
        )?;

        let hashes_offset_in_file = file.seek(SeekFrom::Current(0))?;
        let hashes_len = total_hashes * HASH_LENGTH as u64;

        let file_len = hashes_offset_in_file + hashes_len;
        file.set_len(file_len)?;

        hashes_offset_in_file
    };

    let max_bytes_per_chunk = {
        let gb: u64 = 1024 * 1024 * 1024;
        // Picked to keep memory usage low-enough and flush to disk often-enough
        2 * gb
    };
    let hashes_per_chunk = max_bytes_per_chunk / HASH_LENGTH as u64;
    let bytes_per_chunk = hashes_per_chunk * HASH_LENGTH as u64;
    let num_chunks = total_hashes / hashes_per_chunk;

    // For each chunk, one by one...
    for chunk_index in 0..num_chunks {
        // Show progress
        let hashes_done = chunk_index * hashes_per_chunk;
        progress.set_position(hashes_done);

        let file = OpenOptions::new().read(true).write(true).open("table.db")?;
        let chunk_offset_in_file = hashes_offset_in_file + chunk_index * bytes_per_chunk;
        let mut file = unsafe {
            MmapOptions::new()
                .offset(chunk_offset_in_file)
                .len(bytes_per_chunk as _)
                .map_mut(&file)
        }?;

        // Map `hashes_per_chunk` hashes into memory, so we can write to the file
        let hashes = unsafe {
            std::slice::from_raw_parts_mut(
                file.as_mut_ptr() as *mut [u8; HASH_LENGTH],
                hashes_per_chunk as _,
            )
        };

        // In the collection of "all outputs of this charset", this is
        // where our chunk starts.
        let first_item_index = chunk_index * hashes_per_chunk;

        // Enumerate gives us the position within the chunk.
        hashes.par_iter_mut().enumerate().for_each_with(
            vec![0u8; item_len as usize],
            |buf, (index_in_chunk, out)| {
                let item_index = first_item_index + index_in_chunk as u64;
                // Generate the candidate password
                charset.get_into(item_index, buf);
                // Hash it and store it to the file.
                *out = md5::compute(buf).0;
            },
        );
    }

    progress.finish();
    Ok(())
}

Now look at it go:

Shell session
$ cargo build --release && ./target/release/pass gen-htable
    Finished release [optimized + debuginfo] target(s) in 0.03s
Generating 2176782336 hashes — for all items of length 6, with characters abcdefghijklmnopqrstuvwxyz0123456789
[00:00:14] [#######>--------------------------------] (00:00:53 left)

On my machine, it takes just over a minute to complete - that's about 30 million hashes per second.

CPU utilization is not optimal: there's dips between each chunk, where all the threads are stopped and started again, and presumably some I/O flushing is happening:

There's definitely some extra performance to be squeezed out of this, but it's going to work for our use case.

The resulting file is 32G, as expected:

Shell session
$ ls -lhA table.db
-rw-r--r--  1 amos  staff    32G Oct 27 13:26 table.db

Now it's time.. to use it!

Rust code
#[derive(FromArgs)]
#[argh(subcommand)]
enum Command {
    // omitted: other variantsj0
    UseHtable(UseHtable),
}

#[derive(FromArgs)]
/// Use a hash table
#[argh(subcommand, name = "use-htable")]
struct UseHtable {}

fn main() -> Result<(), Box<dyn Error>> {
    let args: Args = argh::from_env();
    match args.command {
        // omitted: other variants
        Command::UseHtable(_) => use_htable(),
    }
}

Now for the actual use_htable function:

Rust code
fn use_htable() -> Result<(), Box<dyn Error>> {
    let (header, hashes_offset_in_file) = {
        let mut file = File::open("table.db")?;
        let header: TableHeader = bincode::deserialize_from(&mut file)?;
        let offset = file.seek(SeekFrom::Current(0))?;
        (header, offset)
    };

    let charset = Charset(header.charset);
    let num_hashes = charset.range(header.len).end;

    let file = File::open("table.db")?;
    let file = unsafe { MmapOptions::new().offset(hashes_offset_in_file).map(&file) }?;
    let hashes = unsafe {
        std::slice::from_raw_parts(
            file.as_ptr() as *const [u8; HASH_LENGTH],
            num_hashes as usize,
        )
    };

    let records = Database::with(|f| Ok(f.records.clone()))?;
    let start_time = Instant::now();

    hashes.par_iter().enumerate().for_each_with(
        vec![0u8; header.len as usize],
        |buf, (item_index, hash)| {
            for (db_user, db_hash) in &records {
                if db_hash == hash {
                    charset.get_into(item_index as _, buf);
                    println!(
                        "[CRACKED in {:?}] user {} has password {}",
                        start_time.elapsed(),
                        db_user,
                        std::str::from_utf8(buf).unwrap_or("<not utf-8>")
                    );
                }
            }
        },
    );
    println!("Spent {:?} going through whole table", start_time.elapsed());

    Ok(())
}
Shell session
$ cargo build --release && ./target/release/pass use-htable
   Compiling pass v0.1.0 (/Users/amos/ftl/pass)
    Finished release [optimized + debuginfo] target(s) in 3.72s
[CRACKED in 510.335µs] user twentytwenty has password aaaaaa
[CRACKED in 8.918158884s] user alpha has password abcdef
[CRACKED in 23.690443816s] user dev has password foobar
Spent 33.318824641s going through whole table

We were able to crack all three six-character passwords in just 23 seconds!

Compare this with the brute-force method, with which it took us 171 seconds to find dev's password.

So, it's better. But we stopped at 6-character passwords with only [a-z0-9], and our hash table is already 32 gigabytes. What if we had 7-character passwords? We can make a back-of-the-envelope calculation.

For 7-character passwords with [a-z0-9] (36 unique characters), we have 36**7 combinations - just shy of 79 billion.

79 billion times 16 bytes is... 1.25 terabytes.

Wooo laddy. That's not gonna fit on your SSD.

No it's not.

Which is what, 46 minutes into this article, brings us to rainbow tables.

What's a rainbow table?

What we've just generated is an exhaustive hash table. First off, all our inputs have an "index". From that index, we can generate the plaintext, and from that plaintext we can generate the hash.

When we use our table, since we know the position at which the hash is stored, we also know which plaintext candidate it corresponds to. That's why we didn't need to store the plaintext - or the index for that matter. The order of the hashes in the database is all we need.

But, as we've seen, exhaustive hash tables get large quickly, because we're storing all possible hashes.

The whole idea behind hash tables is to only store a fraction of all hashes, in exchange for doing a bit more computation.

To start with, we generate a random plaintext. Really, any plaintext at all. And we compute its hash. And those are the first two nodes of our chain:

Then — and here's the tricky bit — we need a family of functions that, given a ciphertext (an MD5 hash, here), generate another plaintext.

We'll name those functions R1, R2, R3, etc. The R stands for "reduction": we reduce from the "hash space" to the "plaintext space", which is smaller.

What do those functions look like?

We'll come back to it later, but they basically are "using the bits of the ciphertext, pick values from the charset to generate another plaintext".

Now that our chain ends with a plaintext again, we use the hash function again (MD5) to generate the next node:

And then we use the R2 function to generate the next plain text, and so on and so forth, until we reach the desired length for a chain. And then we generate a lot more chains:

And it's called rainbow tables because, conceptually, each chain has its own unique color:

It's important to note that the functions R1, R2, R3 need to all be distinct from one another, ie. we should have R1(x) != R2(x).

Now, let's say we stored all the nodes of all our chains in a database.

It would be pretty easy to use, right? For each chain, consider nodes two by two: we get tuples of (plaintext, hash). If the hash is the one we're trying to crack, boom, we're done.

But then it would be the same as an exhaustive hash table.

Just... more random.

And bigger, since we'd store the plaintexts as well.

So clearly that's not the trick.

No it's not!

Here's the trick: if our rainbow table is "complete enough", the hash we're trying to reverse is somewhere in one of the chains - we just need to figure out in which chain (which row), and at which position in the chain (which column).

And we can do that by only storing the first column (starting plaintexts) and the last column (final hashes).

How do we do that? We make assumptions!

The easy case

First, we assume that the hash we're looking for is in the last column.

So we compare our hash against the last node of all the chains.

If we're cracking 8ae101... for example, well, it's not in the first chain (which ends in ddc510...), it's not in the second chain (which ends in cd7edc...), but it is in the third chain.

Now, there's just one problem... we can't navigate chains from right to left. There's only arrows going from left to right, because we only have functions going from left to right: from plaintext to ciphertext (the MD5) function, and from ciphertext to another plaintext, our R family of functions.

So we can't go from 8ae101... from career by moving "one to the left".

But.. since we have the last and the first column... we can start from the first column, and use our H and R functions to move to the right, until we reach the column we want:

Tada! The plaintext we were looking for was career.

The less-easy case

But that was the easy case - what about the case where the hash we're looking for is not in the last column?

Say we're looking for d487dd...: it's not in the last column of any of the chains:

In this case, we move on to another assumption: we just assume it's in the previous ciphertext column.

...but here's the thing: we don't have that column. We're only storing the first (plaintext) column, and the last (ciphertext) column:

So what do we do here? We could start from the leftmost column and work our way towards the column we're interested in — but that would be equivalent to re-computing the whole rainbow table!

What we do instead is move the node we're searching for to the right.

So, instead of looking for d487dd... in that column, we apply R2 to find the next node:

And then again, until we've reached the last column - the one we have.

And then we compare with the last nodes from all the chains and.. oh look! It's the red chain!

Now that we know in which chain it is, we start from the beginning (leftmost node) of the chain, and move right until we reach the column we want:

And there we have it. The plaintext was yellow.

That is cool.

Yeah! It's pretty neat!

So the worst case scenario is if our plaintext is in the first column, right?

Right - because we try every column starting from the right. So the first column is actually the last one we check.

One thing that is really neat with Rainbow tables, is that it works for any hashing algorithm. We've been going with MD5 for this example, but it does not matter what the algorithm is, as rainbow tables don't exploit any particular weaknesses — they're just a space-saving trick.

Implementing rainbow tables

Shell session
$ cargo add rand
      Adding rand v0.7.3 to dependencies

I'm going to skip over adding subcommands for the executable, because, well, you know the drill.

Just as before, we're going to have a "table header". We need a couple more fields: chain length is how many (plaintext, hash) pairs we have in a chain, and total_chains is the number of chains.

Rust code
#[derive(Serialize, Deserialize)]
struct RainbowTableHeader {
    item_len: u32,
    charset: Vec<u8>,
    chain_length: usize,
    total_chains: u64,
}

We're going to use the same "map a file as one big slice" trick as we did when we generated an exhaustive hash table, only elements are going to be of type RainbowTuple instead of [u8; 16].

Rust code
const MAX_PLAINTEXT_LEN: usize = 16;

/// When stored in a rainbow table file: the first plaintext (start of a chain)
/// and the last hash (end of a chain). Also used as temporary values to
/// represent a single node of a chain.
#[repr(C)]
#[derive(Default, Clone)]
struct RainbowTuple {
    plaintext: [u8; MAX_PLAINTEXT_LEN],
    hash: [u8; HASH_LENGTH],
}

Why is it repr(C)?

This tells the compiler not to re-order fields. It probably doesn't matter much here since both fields are 16 bytes, but since we're mapping memory directly, I'd rather be sure.

Since we're going to deal with big numbers, let's throw in a utility routine to format them for humans:

Rust code
fn format_num<N: TryInto<u64, Error = E> + fmt::Display + fmt::Debug, E: fmt::Debug>(
    num: N,
) -> String {
    let num = num.try_into().unwrap();
    let mut num = num as f64;
    let mut suffix = "";
    if num > 10000.0 {
        num /= 1000.0;
        suffix = " thousand";

        if num > 1000.0 {
            num /= 1000.0;
            suffix = " million";
            if num > 1000.0 {
                num /= 1000.0;
                suffix = " billion";
            }
        }
    }

    match suffix {
        "" => format!("{}", num),
        _ => format!("{:.2}{}", num, suffix),
    }
}

Why not use something like human_format?

Didn't feel like it!

Let's look at the actual rainbow table generation code. The general idea is fairly simple. Just as described in the paper, we generate a set of initial plaintexts, and for each of them, generate our full rainbow chains and only save the first plaintext and the last hash.

Rust code
fn gen_rtable() -> Result<(), Box<dyn Error>> {
    // Parameters, found through experimentation
    let chain_length = 2400;
    let total_chains = 2_000_000;
    let item_len: usize = 6;

    // Same character set as before
    let charset: Charset = "abcdefghijklmnopqrstuvwxyz0123456789".into();

    let hashes_in_table = chain_length as u64 * total_chains;
    println!("Chain length: {}", format_num(chain_length));
    println!("Total chains: {}", format_num(total_chains));
    println!("Total hashes in table: {}", format_num(hashes_in_table));

    let header = RainbowTableHeader {
        item_len: item_len as _,
        chain_length,
        total_chains,
        charset: charset.0.clone(),
    };

    create_rainbow_file(&header, |tuples| {
        let n = AtomicU64::new(0);

        // For debugging purposes: while generating, we're going to see how many
        // times each of our user password hashes appear in our rainbow table.
        let records = Database::with(|db| Ok(db.records.clone()))?;

        generate_initial_plaintexts(tuples, &charset, item_len);

        let progress = ProgressBar::new(total_chains).with_style(progress_style());
        progress.enable_steady_tick(250);

        tuples.par_iter_mut().for_each(|tuple| {
            generate_rainbow_chain(tuple, &header, &charset, &records, &progress, &n);
        });

        progress.finish();

        Ok(())
    })
}

A create_rainbow_file function? Someone's been refactoring!

Oh yeah that part's neat, let me show you.

Rust code
const RAINBOW_TABLE_FILE_NAME: &str = "rtable.db";

fn create_rainbow_file<F, T>(header: &RainbowTableHeader, f: F) -> Result<T, Box<dyn Error>>
where
    F: FnOnce(&mut [RainbowTuple]) -> Result<T, Box<dyn Error>>,
{
    let tuples_len = header.total_chains * std::mem::size_of::<RainbowTuple>() as u64;

    // Write the header and remember the offset where actual tuples start
    let tuples_offset_in_file = {
        let mut file = File::create(RAINBOW_TABLE_FILE_NAME)?;
        bincode::serialize_into(&mut file, &header)?;

        let tuples_offset_in_file = file.seek(SeekFrom::Current(0))?;

        let file_len = tuples_offset_in_file + tuples_len;
        file.set_len(file_len)?;

        tuples_offset_in_file
    };

    let file = OpenOptions::new()
        .read(true)
        .write(true)
        .open(RAINBOW_TABLE_FILE_NAME)?;
    let mut mapping = unsafe {
        MmapOptions::new()
            .offset(tuples_offset_in_file)
            .len(tuples_len as usize)
            .map_mut(&file)?
    };
    let tuples = unsafe {
        std::slice::from_raw_parts_mut(
            mapping.as_mut_ptr() as *mut RainbowTuple,
            header.total_chains as _,
        )
    };

    f(tuples)
}

Ohhh it takes care of all the dirty I/O (input/output) details, I see.

But why take a closure / a callback / an FnOnce, instead of returning the slice?

Well, we have to reason about types.

Sure.

What type would we return exactly?

Mhhh we're writing to it, so... &mut [RainbowTuple]?

Correct! A mutable borrow. What's the lifetime of that borrow?

Uhhh...

Said differently: what does it borrow from?

Ah! From the MmapMut I guess?

Correct! But the compiler doesn't know that... so what would happen if we returned only the slice, and dropped the MmapMut?

Oh, we'd have a dangling pointer!

Right!

Couldn't we just return both of them?

We could! But then we could still accidentally drop the mapping:

Rust code
let (slice, mapping) = create_rainbow_file();
drop(mapping);

// woops!
slice[0] = foobar;

Mhhhh. Couldn't we... return both of them? Like.. a struct that has both the borrowee and the borrower, and that implements DerefMut<Target = [RainbowTuple]>, and and and

Yeah we could! Or we could just take a function. So that it's only possible to access the slice while the memory mapping is valid.

Okay yeah, that's simpler.

Next up, how do we generate the initial plaintexts? Well, Charset::get_into allows us to map an u64 to a plaintext of size item_len with our given character set. And the rand crate allows us to generate an arbitrary u64. So...

Rust code
fn generate_initial_plaintexts(tuples: &mut [RainbowTuple], charset: &Charset, item_len: usize) {
    println!("Generating initial plaintexts sequentially, from single RNG");
    {
        let mut set = HashSet::new();

        let start_time = Instant::now();
        let mut rng = rand::thread_rng();
        let max_plaintext_pos = charset.range(item_len as _).end;
        for tuple in tuples.iter_mut() {
            // try a few times to get a unique hash
            for _ in 0..5 {
                charset.get_into(
                    rng.gen_range(0, max_plaintext_pos),
                    &mut tuple.plaintext[..item_len],
                );
                if set.insert((&tuple.plaintext[..item_len]).to_vec()) {
                    // was unique, we can stop trying
                    break;
                }
            }
        }
        let duration = start_time.elapsed();
        println!(
            "Generated {} unique plaintexts in {:?} ({} per second)",
            tuples.len(),
            duration,
            format_num((tuples.len() as f64 / duration.as_secs_f64()) as u64),
        );
    }
}

We spend some extra computation time here to make sure we generate unique starting plaintexts. If we didn't, we'd have duplicate chains in our rainbow table, which would be a complete waste of space and time.

Given those initial plaintexts, we generate every chain, in parallel. Here's what generating a single chain looks like:

Rust code
#[inline(always)]
fn generate_rainbow_chain(
    tuple: &mut RainbowTuple,
    header: &RainbowTableHeader,
    charset: &Charset,
    records: &HashMap<String, Vec<u8>>,
    progress: &ProgressBar,
    n: &AtomicU64,
) {
    let item_len = header.item_len as usize;

    // Move to the end of the chain, hop by hop
    let mut tmp: RainbowTuple = tuple.clone();
    // Apply H
    tmp.hash = md5::compute(&tmp.plaintext[..item_len]).0;

    for column_index in 0..(header.chain_length - 1) {
        // For debugging:
        for db_hash in records.values() {
            if db_hash == tmp.hash.as_ref() {
                progress.println(format!(
                    "{:?} is in rainbow table (at position {}/{} in a chain)",
                    std::str::from_utf8(&tmp.plaintext[..item_len]).unwrap_or("<non utf-8>"),
                    column_index,
                    header.chain_length
                ));
            }
        }

        // Apply Ri
        next_plaintext(column_index, &charset, item_len, &mut tmp);
        // Apply H
        tmp.hash = md5::compute(&tmp.plaintext[..item_len]).0;
    }

    // The stored tuple contains the first plaintext and the last hash
    tuple.hash = tmp.hash;

    // Progress update
    let m = n.fetch_add(1, Ordering::Relaxed);
    if m % 100_000 == 0 {
        progress.set_position(m);
    }
}

We only update progress for every 100K chains we generate, to avoid spending a lot of time on synchronizing the progress bar (it's full of locks).

Which brings us to next_plaintext, our family of reduction functions. What should they look like? Well, I didn't look at other implementations of rainbow tables: instead I just went with something that looked reasonable.

Our requirements are:

And you know... we can fill both those requirements pretty easily, by just making up an u64 from the first half of the hash, adding it with column_index, constraining it to the 0..(charset.0.len().pow(item_len)) range, and using Charset::get_into.

But then we run into modulo bias.

Say you have a random number generator function, r(x) that generates numbers between 0 and 100.

But what you really want is a number between 0 and 40. You could do r(x) % 40. Problem solved, right?

Wrong!

And you know, I apologize, I don't have any hard data for you — but when I tried really-naive versions of next_plaintext, plaintext "aaaaaa" showed up a lot more than anything else.

So, this is my attempt at escaping modulo bias: by keeping track of the bits we don't use it, and feeding them into the next stage... it seems to be better.

I might be wrong. Please write an hour-long article on how I'm wrong, if I'm wrong.

Rust code
[inline(always)]
fn next_plaintext(
    column_index: usize,
    charset: &Charset,
    item_len: usize,
    column: &mut RainbowTuple,
) {
    assert!(item_len < HASH_LENGTH);
    let chars = &charset.0[..];
    let max_char = chars.len();
    let mut hash_rest: usize = 0;
    let mut charset_rest: usize = 0;

    for plaintext_pos in 0..item_len {
        let hash_index = (column_index + plaintext_pos) + hash_rest;
        hash_rest = hash_index / HASH_LENGTH;
        let hash_index = hash_index % HASH_LENGTH;

        let charset_index = column.hash[hash_index] as usize + charset_rest;
        charset_rest = charset_index / max_char;
        let charset_index = charset_index % max_char;

        column.plaintext[plaintext_pos] = chars[charset_index];
    }
}

Okay, that's it for rainbow table generation!

Let's run it:

Shell session
$ cargo build --release && ./target/release/pass gen-rtable
   Compiling pass v0.1.0 (/Users/amos/ftl/pass)
    Finished release [optimized + debuginfo] target(s) in 3.74s
Chain length: 2400
Total chains: 2.00 million
Total hashes in table: 4.80 billion
Generating initial plaintexts sequentially, from single RNG
Generated 2000000 unique plaintexts in 883.633174ms (2.26 million per second)
"abcdef" is in rainbow table (at position 1188/2400 in a chain)
"aaaaaa" is in rainbow table (at position 56/2400 in a chain)
"aaaaaa" is in rainbow table (at position 1192/2400 in a chain)
"abcdef" is in rainbow table (at position 185/2400 in a chain)
"abcdef" is in rainbow table (at position 1485/2400 in a chain)
"abcdef" is in rainbow table (at position 1485/2400 in a chain)
"foobar" is in rainbow table (at position 2241/2400 in a chain)
"abcdef" is in rainbow table (at position 185/2400 in a chain)
"abcdef" is in rainbow table (at position 1485/2400 in a chain)
"abcdef" is in rainbow table (at position 1485/2400 in a chain)
[00:01:44] [########################################] (00:00:00 left)

Woo, all our six-letter passwords are in there. That's good. But hey, we generated 4.8 billion hashes... how much space does it take on disk?

Shell session
$ ls -lhA rtable.db
-rw-r--r--  1 amos  staff    61M Oct 28 11:31 rtable.db

61 megabytes. A far cry from our initial 32 gigabytes. It's worth noting that we got lucky, though.

You mean... you re-ran the command until it included all three plaintexts we knew we were looking for?

...maybe?

Point is — rainbow tables are probabilistic. There's a chance that the hash we're looking for is somewhere in the table, but it's not 100%.

So now, how do we use that rainbow tables? Just like we said in the pretty diagrams, except with code instead. Easy. Easy! I definitely didn't struggle with this. Nope. Not me.

First, for convenience, we're going to want to group some parameters together in a struct:

Rust code
struct FindChainContext<'a> {
    start_time: Instant,
    progress: &'a ProgressBar,
    header: &'a RainbowTableHeader,
    charset: &'a Charset,
}

Similar to write_rainbow_file, we're going to make a read_rainbow_file function that gives us both the header, and an immutable slice of rainbow tuples:

Rust code
fn read_rainbow_file<F, T>(f: F) -> Result<T, Box<dyn Error>>
where
    F: FnOnce(RainbowTableHeader, &[RainbowTuple]) -> Result<T, Box<dyn Error>>,
{
    let (header, tuples_offset_in_file) = {
        let mut file = File::open("rtable.db")?;
        let header: RainbowTableHeader = bincode::deserialize_from(&mut file)?;
        let offset = file.seek(SeekFrom::Current(0))?;
        (header, offset)
    };

    let file = File::open("rtable.db")?;
    let file = unsafe {
        MmapOptions::new()
            .offset(tuples_offset_in_file)
            .map(&file)?
    };
    let tuples = unsafe {
        std::slice::from_raw_parts(
            file.as_ptr() as *const RainbowTuple,
            header.total_chains as _,
        )
    };

    f(header, tuples)
}

From there... we have to think about what we want to parallelize. We have millions of chains, and only a handful of hashes to crack, so I think (and I measured that) it makes sense to try to crack all the hashes concurrently, and for each "assumed column", to check against all the chains in parallel.

At first I tried to parallelize over "assumed columns" rather than "chains", and it was, actually, much slower.

So, we're going to have to jump through some hoops to "crack hashes concurrently". We're going to start by assuming they're all in the last column, and look that up in the rainbow table. Then we're going to assume they're all in the second-to-last column. And so on.

So, unlike in our nice diagrammy explanation, we're not dealing with a single hash, but with a collection of hashes - we call them "queries".

Here's our use_rtable:

Rust code
fn use_rtable() -> Result<(), Box<dyn Error>> {
    let start_time = Instant::now();

    read_rainbow_file(|header, tuples| {
        let item_len = header.item_len as usize;
        let charset = Charset(header.charset.clone());

        println!(
            "Loading {} chains of len {} for {}-len plaintexts with charset {:?}",
            format_num(header.total_chains),
            format_num(header.chain_length),
            header.item_len,
            charset
        );

        let records = Database::with(|db| Ok(db.records.clone()))?;
        let chain_length = header.chain_length;

        // Process all "passwords we want to crack" in parallel. We store the
        // username, the hash as stored in the users db (we'll assume it's in
        // different columns every time), and a `RainbowTuple` we're going to
        // "move to the right" by the distance between the "assumed column" and
        // the "last column".
        let mut queries_initial: Vec<_> = records
            .iter()
            .map(|(db_user, db_hash)| {
                let hash: [u8; HASH_LENGTH] = (&db_hash[..]).try_into().unwrap();
                (
                    db_user,
                    hash,
                    RainbowTuple {
                        plaintext: Default::default(),
                        hash,
                    },
                )
            })
            .collect();

        // Whenever we crack a password, we stop trying to crack it. It's `Arc`
        // because each worker thread needs a reference, and it's `Mutex`
        // because, when we do crack a pasword, we end up mutating it. Lock
        // contention should be very low though - we do millions of iterations
        // before cracking a single password.
        let found: Arc<Mutex<HashSet<&str>>> = Arc::new(Mutex::new(HashSet::new()));

        let progress = ProgressBar::new(chain_length as u64).with_style(progress_style());
        progress.enable_steady_tick(250);

        let ctx = FindChainContext {
            start_time,
            progress: &progress,
            header: &header,
            charset: &charset,
        };

        for assumed_column in (0..chain_length).into_iter().rev() {
            progress.set_position((chain_length - assumed_column) as u64);

            // We're assuming the hashes we're looking for are in `assumed_column`. To
            // find a chain match, we need to apply `Ri` and `H` repeatedly, to move
            // them to the right, until they're in the last column - at the end
            // of the chain, for which we have a hash in the rainbow file.
            let mut queries = queries_initial.clone();
            for column_index in assumed_column..(chain_length - 1) {
                for (_, _, query_column) in queries.iter_mut() {
                    // Apply Ri
                    next_plaintext(column_index, &charset, item_len, query_column);
                    // Apply H
                    query_column.hash = md5::compute(&query_column.plaintext[..item_len]).0;
                }
            }

            tuples.par_iter().enumerate().for_each_with(
                found.clone(),
                |found, (chain_index, chain_tuple)| {
                    ctx.find_chain_match(found, &queries, assumed_column, chain_index, chain_tuple)
                },
            );

            // Exclude the passwords we've already cracked from `queries_initial`
            {
                let found = found.lock().unwrap();
                queries_initial = queries_initial
                    .into_iter()
                    .filter(|(k, _, _)| !found.contains::<str>(k.as_ref()))
                    .collect();
            }
        }
        progress.finish();

        Ok(())
    })
}

In there is the code to "move" from the assumed column to the last column, and then we still have to compare that to all the chains in our rainbow file, which is what FindChainContext::find_chain_match does.

And, look! It comes with ASCII diagrams!

Rust code
impl<'a> FindChainContext<'a> {
    #[inline(always)]
    fn find_chain_match(
        &self,
        found: &mut Arc<Mutex<HashSet<&'a str>>>,
        queries: &[(&'a String, [u8; HASH_LENGTH], RainbowTuple)],
        assumed_column: usize,
        chain_index: usize,
        chain_tuple: &RainbowTuple,
    ) {
        let item_len = self.header.item_len as usize;

        /*
         * If everything goes fine, we have:
         *
         * /-----------------------|-------------------------------|----------------------\
         * | first column          | assumed column tuple          | last column          |
         * |-----------------------|-------------------------------|----------------------|
         * |                       |   query_hash (from users.db)  |                      |
         * |                       |     ||   \------------------- |-> query_moved_right  |
         * |                       |     ||                        |                      |
         * | table_tuple.plaintext |     ||                        |   table_tuple.hash   |
         * |            \----------|-> query_from_left             |                      |
         * \-----------------------|-------------------------------|----------------------/
         */
        for (username, query_hash, query_moved_right) in queries.iter() {
            // Is `query_hash` in a chain that ends with the same hash as `chain_tuple`
            // of length `chain_length - assumed_column`?
            if query_moved_right.hash != chain_tuple.hash {
                // No.
                continue;
            }

            // Rebuild queried hash starting from the left column (first plaintext in chain)
            // to recover the plaintext.
            let mut query_from_left = chain_tuple.clone();

            // Apply H - this "completes" the 0th column.
            query_from_left.hash = md5::compute(&query_from_left.plaintext[..item_len]).0;
            for column_index in 0..assumed_column {
                // Apply Ri
                next_plaintext(column_index, self.charset, item_len, &mut query_from_left);
                // Apply H
                query_from_left.hash = md5::compute(&query_from_left.plaintext[..item_len]).0;
            }

            if &query_from_left.hash != query_hash {
                // If the rebuilt hash is not equal to the initial hash, the
                // query_hash was in a different chain that merged with the one
                // stored in the rainbow table.
                //
                // This happens all the time. `query_from_left.plaintext` is not
                // what we're looking for.
                continue;
            }

            {
                // If this is the first time we crack this password (it may be in
                // multiple chains stored in the rainbow table)...
                let mut found = found.lock().unwrap();
                if found.insert(username) {
                    // ...then print it out.
                    self.progress.println(format!(
                                "[CRACKED in {:?}] user '{}' has password '{}' (found in chain {} at {}/{})",
                                self.start_time.elapsed(),
                                username,
                                std::str::from_utf8(&query_from_left.plaintext[..item_len]).unwrap_or("<non utf-8>"),
                                chain_index,
                                assumed_column,
                                self.header.chain_length,
                            ));
                }
            }
        }
    }
}

Well, there it is. It's showtime.

We've optimized some things since we last ran bruteforce, so let's run it again:

Shell session
$ cargo build --release && ./target/release/pass bruteforce
   Compiling pass v0.1.0 (/Users/amos/ftl/pass)
    Finished release [optimized + debuginfo] target(s) in 3.46s
BruteforceParams { len_range: 4..=8, charset: abcdefghijklmnopqrstuvwxyz0123456789 }
[CRACKED in 1.181187301s] user (twentytwenty) has password (aaaaaa)
[CRACKED in 14.501821615s] user (alpha) has password (abcdef)
[CRACKED in 29.108262954s] user (dev) has password (foobar)

And let's run use-htable again:

Shell session
$ cargo build --release && ./target/release/pass use-htable
    Finished release [optimized + debuginfo] target(s) in 0.04s
[CRACKED in 410.505µs] user twentytwenty has password aaaaaa
[CRACKED in 10.731340801s] user alpha has password abcdef
[CRACKED in 17.523243879s] user dev has password foobar

And fianlly, use-rtable:

Shell session
$ cargo build --release && ./target/release/pass use-rtable
    Finished release [optimized + debuginfo] target(s) in 0.16s
Loading 2.00 million chains of len 2400 for 6-len plaintexts with charset abcdefghijklmnopqrstuvwxyz0123456789
[CRACKED in 426.636558ms] user 'dev' has password 'foobar' (found in chain 1826222 at 2241/2400)
[CRACKED in 2.492776857s] user 'alpha' has password 'abcdef' (found in chain 1818787 at 1485/2400)
[CRACKED in 3.227486797s] user 'twentytwenty' has password 'aaaaaa' (found in chain 837839 at 1192/2400)
[00:00:04] [########################################] (00:00:00 left)

The results are in!

MethodSpaceaaaaaaabcdeffoobar
Bruteforce01.2s14.5s29.1s
Exhaustive32GB0s10.7s17.5s
Rainbow61MB426ms2.5s3.2s

Rainbow tables are the clear winner here 🎉

Just to convince you that there's not just those hashes in the rainbow table, let's generate ten other dummy accounts with random 6-character passwords:

Shell session
$ for i in $(seq 1 10); do PASSWORD=$(openssl rand -base64 16 | tr -dc 'a-z0-9' | head -c 6); ./target/release/pass add-user random$i "$PASSWORD"; echo "(with password $PASSWORD)"; done
User random1 added to database
(with password ivypj5)
User random2 added to database
(with password iipjrc)
User random3 added to database
(with password zq1djg)
User random4 added to database
(with password e3kyj9)
User random5 added to database
(with password cz7pws)
User random6 added to database
(with password 1wcauj)
User random7 added to database
(with password acd8y6)
User random8 added to database
(with password uul2nk)
User random9 added to database
(with password ov7nx6)
User random10 added to database
(with password rrv6pi)

And let's run use-rtable again:

Shell session
$ cargo build --release && ./target/release/pass use-rtable
    Finished release [optimized + debuginfo] target(s) in 0.14s
Loading 2.00 million chains of len 2400 for 6-len plaintexts with charset abcdefghijklmnopqrstuvwxyz0123456789
[CRACKED in 678.589077ms] user 'random2' has password 'iipjrc' (found in chain 402534 at 2274/2400)
[CRACKED in 855.57664ms] user 'dev' has password 'foobar' (found in chain 1826222 at 2241/2400)
[CRACKED in 1.694286796s] user 'random6' has password '1wcauj' (found in chain 184202 at 2075/2400)
[CRACKED in 1.909697589s] user 'random7' has password 'acd8y6' (found in chain 1911512 at 2033/2400)
[CRACKED in 3.05897522s] user 'random4' has password 'e3kyj9' (found in chain 135326 at 1798/2400)
[CRACKED in 4.605489184s] user 'alpha' has password 'abcdef' (found in chain 1801797 at 1485/2400)
[CRACKED in 6.079666699s] user 'twentytwenty' has password 'aaaaaa' (found in chain 837839 at 1192/2400)
[CRACKED in 7.417559702s] user 'random1' has password 'ivypj5' (found in chain 1088265 at 911/2400)
[CRACKED in 10.187383951s] user 'random8' has password 'uul2nk' (found in chain 1432884 at 315/2400)
[CRACKED in 10.315894942s] user 'random9' has password 'ov7nx6' (found in chain 663976 at 285/2400)
[CRACKED in 10.443286352s] user 'random10' has password 'rrv6pi' (found in chain 1969534 at 251/2400)
[00:00:11] [########################################] (00:00:00 left)

Now here's a good demonstration of the "probabilistic" nature of rainbow tables: we got eight ouf of ten. random3 and random5 are missing from our table. If we were to run gen-rtable again, we might catch them! But we might also lose out on others.

Improving our rainbow tables implementation

...is not something we're going to do. But there are things we could do.

We could — and I think that's what the original paper intends — not have a single item_len, but instead, have next_plaintext (our Ri family of functions) decide the length of the next plaintext, within a specified range.

The resulting rainbow table would be larger, but it would work for passwords that aren't exactly of length 6.

We also could treat hashes as an u128 - or a pair of u64. Then equality comparison would be faster than comparing u8 slices. Or maybe it's already optimizing that far? You do the sleuthing. I'm all sleuthed out.

Most of what we're doing is embarassingly parallel, so we could use SIMD (Single Instruction, Multiple Data) to process several items at a time on a single thread. That's left as an exercise to the reader.

Also, I'm sure we could do even fewer allocations. The memory-mapping tricks are also costing us — we probably should not trust the OS to "just do the right thing".

We could also sort our rainbow tables after generation, so that, in find_chain_match, we don't do 2 million comparisons, only log2 of that (which is... 20).

I'm not sure how our rainbow tables implementation compares to existing, state-of-the-art implementations, but it allowed us to show the trade-off, which is cool.

But how useful are rainbow tables... in the real world?

Defeating rainbow tables

The first defense against rainbow tables was to "add salt".

Is this a metaphor?

No, really. Salt is something you add to a password before you hash it.

Say, instead of doing:

Rust code
let password = "foobar";
let hash = md5(password);

You do:

Rust code
let password = "foobar";
let salt = "a79n48gm"

let input = format!("{}{}", password, salt);
let hash = md5(input);

And then you store both the hash and the salt in the database.

Wait... but then attackers could just generate a rainbow table using that salt?

Bear is right! But the idea is to have a unique salt per user. So you can generate a rainbow table for a specific user, if you really want their password.

But if the salt is long enough, generating rainbow tables for all possible salts is.. impractical. And boom, rainbow tables don't work anymore.

However — rainbow tables are no longer the best option to crack passwords. These days, your GPU is much more effective at brute-forcing passwords than your CPU is at generating and using rainbow tables.

Oh well. It was fun living in 2004 for a while!