In the last part, we finally parsed some IPv4 packets. We even found a way to filter only IPv4 packets that contain ICMP packets.

There's one thing we haven't done though, and that's verify their checksum. Folks could be sending us invalid IPv4 packets and we'd be parsing them like a fool!

This series is getting quite long, so let's jump right into it.

Let's read a bit of RFC 791:

A checksum on the header only. Since some header fields change (e.g., time to live), this is recomputed and verified at each point that the internet header is processed.

The checksum algorithm is:

The checksum field is the 16 bit one's complement of the one's complement sum of all 16 bit words in the header. For purposes of computing the checksum, the value of the checksum field is zero.

This is a simple to compute checksum and experimental evidence indicates it is adequate, but it is provisional and may be replaced by a CRC procedure, depending on further experience.

Cool bear's hot tip

Yeah, it was never replaced with a Cyclic Redundancy Check. Note that this RFC was published in 1981 - CRCs had been invented 20 years prior!

That said, CRC-32 is used in Ethernet - it's the 4 bytes right after the payload, the field is named "Frame check sequence".

Note also that different polynomials can be used to compute CRC-32s - the IEEE one (used in Ethernet) is "not the best". That's all I'll say about this now, one could spend an entire article talking about those.

The idea behind checksumming is to include a hash alongside the actual data, to detect transmission errors. We're interested in detection only - if we needed to be able to correct those errors, we'd need an Error correction code.

A hash function needs to have several properties in order to be suitable for error detection. The most important one is that when the input changes even slightly, the resulting hash should be completely different. It's likely for a transmission error to be a single bit flip - and even that should give a completely different sum:
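
Here's a rough sketch of that, using a bit-by-bit CRC-32 with the reflected IEEE polynomial 0xEDB88320. It isn't code from this series: the `crc32` function and the sample input are made up for the example, and real code would typically use a lookup table or an existing crate.

```rust
/// Bit-by-bit CRC-32 (IEEE polynomial, reflected form).
/// Slow but short, which is all we need for a demonstration.
fn crc32(data: &[u8]) -> u32 {
    let mut crc: u32 = 0xFFFF_FFFF;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            // all ones if the lowest bit is set, all zeroes otherwise
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

fn main() {
    let original = *b"123456789";
    let mut corrupted = original;
    corrupted[0] ^= 0b0000_0001; // flip a single bit

    println!("original  {:08X}", crc32(&original));
    println!("corrupted {:08X}", crc32(&corrupted));
}
```

The two printed values differ, even though the inputs are only one bit apart.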

One thing that isn't so important is for the hash function to be collision-resistant. It is not particularly hard to find two inputs whose CRC-32s are the same.

What did we learn?

Checksums are meant to detect accidental data corruption. They are useless against intentional data corruption.

To protect against the latter, you'll need a cryptographic hash function.

We're going to be using the "internet checksum" twice: once for IPv4, and another time for ICMP. We're also going to need it when we eventually generate packets of our own.

So we want to make it re-usable. How about we make it take a slice of bytes?

Rust code
// in `src/ipv4.rs`

pub fn checksum(slice: &[u8]) -> u16 {
    unimplemented!()
}

Ah, a good start. The algorithm says to take the "one's complement sum of all 16 bit words in the header". But we don't have a slice of u16, we have a slice of u8.

Well, I'm sure we can work that out. We're all reasonable people here - what's a little reinterpretation between friends?

Let's take a look at std::slice::align_to:

Rust code
// in rust standard library

pub unsafe fn align_to<U>(&self) -> (&[T], &[U], &[T])

The docs say the following:

Transmute the slice to a slice of another type, ensuring alignment of the types is maintained.

This method splits the slice into three distinct slices: prefix, correctly aligned middle slice of a new type, and the suffix slice. The method may make the middle slice the greatest length possible for a given type and input slice, but only your algorithm's performance should depend on that, not its correctness. It is permissible for all of the input data to be returned as the prefix or suffix slice.

This method has no purpose when either input element T or output element U are zero-sized and will return the original slice without splitting anything.

Well, tail is pretty self-explanatory right? If we have five u8, we can only make two u16:
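
A safe way to see this, without any `unsafe`: `chunks_exact(2)` from the standard library pairs bytes up and hands back whatever is left over. This is only an illustration. `split_words` is a made-up helper, and reading the words as big-endian is a choice made for this sketch.

```rust
/// Pairs bytes into big-endian u16 words, returning the words and the
/// leftover bytes that didn't fit (the "tail").
fn split_words(bytes: &[u8]) -> (Vec<u16>, &[u8]) {
    let chunks = bytes.chunks_exact(2);
    let tail = chunks.remainder();
    let words = chunks.map(|c| u16::from_be_bytes([c[0], c[1]])).collect();
    (words, tail)
}

fn main() {
    let five = [0x11u8, 0x22, 0x33, 0x44, 0x55];
    let (words, tail) = split_words(&five);
    println!("words {:04X?}", words);
    println!("tail  {:02X?}", tail);
}
```

Five bytes give two full words, and the last byte ends up in the tail.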

What about the head though?

Well, what if the start of our u8 slice is not on a 16-bit (2-byte) boundary?

In our case:

However, stranger things have happened, so we'll enforce our assumptions by panicking if they're broken. (And if we ever hit that panic, we'll need a plan B).

But first, let's validate our assumptions.

Rust code
// in `src/main.rs`

use std::fmt::UpperHex;
fn dump<T: UpperHex + Sized, S: AsRef<[T]>>(name: &str, slice: S) {
    print!("{:>12} ", name);
    for s in slice.as_ref() {
        print!("{:02X} ", s);
    }
    println!()
}

fn main() {
    // make an array of four `u16` integers
    let input_16: [u16; 4] = [0x1122, 0x3344, 0x5566, 0x7788];
    dump("input_16", input_16);

    // violently transmute that to an array of `u8`
    // note: on little-endian, those will seem out of order
    let input_8: [u8; 8] = unsafe { std::mem::transmute(input_16) };
    dump("input_8", input_8);

    // get a sub-slice that isn't aligned on the 16-bit boundary
    let sub_8 = &input_8[1..7];
    dump("sub_8", sub_8);

    // try to align to 16-bit
    let (head, slice, tail) = unsafe { sub_8.align_to::<u16>() };
    dump("head", head);
    dump("slice", slice);
    dump("tail", tail);
}

This prints:

Shell session
$ cargo run
    input_16 1122 3344 5566 7788
     input_8 22 11 44 33 66 55 88 77
       sub_8 11 44 33 66 55 88
        head 11
       slice 3344 5566
        tail 88

Alright, so it seems align_to is doing what we want. If we were to pass an aligned slice, then head and tail would be empty:

Rust code
fn main() {
    let input_16: [u16; 4] = [0x1122, 0x3344, 0x5566, 0x7788];
    dump("input_16", input_16);

    let input_8: [u8; 8] = unsafe { std::mem::transmute(input_16) };
    dump("input_8", input_8);

    let (head, slice, tail) = unsafe { input_8.align_to::<u16>() };
    dump("head", head);
    dump("slice", slice);
    dump("tail", tail);
}

Shell session
$ cargo run
    input_16 1122 3344 5566 7788
     input_8 22 11 44 33 66 55 88 77
        head
       slice 1122 3344 5566 7788
        tail

Since our assumption is that the input to checksum is 16-bit aligned and has a size that's a multiple of 2, we can just panic if head or tail aren't empty:

Rust code
// in `src/ipv4.rs`

pub fn checksum(slice: &[u8]) -> u16 {
    let (head, slice, tail) = unsafe { slice.align_to::<u16>() };
    if !head.is_empty() {
        panic!("checksum() input should be 16-bit aligned");
    }
    if !tail.is_empty() {
        panic!("checksum() input size should be a multiple of 2 bytes");
    }

    unimplemented!()
}

Okay, good! Now how did that checksum work again?

The checksum field is the 16 bit one's complement of the one's complement sum of all 16 bit words in the header. For purposes of computing the checksum, the value of the checksum field is zero.

Alright. So, the one's complement of a number is obtained by just flipping all the bits, aka doing a "bitwise NOT", which in Rust is just !, same as the "logical NOT".

And doing a one's complement sum just means that, if there's been an overflow, we must add the "overflow bit" back to the rightmost bit of the result.

Let's have a bit of an example. If we add two bytes and it doesn't overflow, like so:

Adding 96 and 64 in binary representation

  0110 0000  // 96
+ 0100 0000  // 64
-----------
  0100 0000
+ 0110 0000
-----------
  1000 0000
+ 0010 0000
-----------
  1000 0000
+ 0010 0000
-----------
= 1010 0000

= 2**7 + 2**5 = 128 + 32 = 160

We didn't have any overflow here - all good. But what if we add 128 and 128?

In two's complement, we just ignore the overflow bit:

Two's complement sum of 128 and 128

  1000 0000  // 128
+ 1000 0000  // 128
-----------
 10000 0000
+ 0000 0000
-----------
= 0000 0000

= 0

But in one's complement, we add it back to the rightmost side:

One's complement sum of 128 and 128

  1000 0000  // 128
+ 1000 0000  // 128
-----------
 10000 0000
+ 0000 0000
-----------
  0000 0001
+ 0000 0000
-----------
= 0000 0001

= 1
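
The two sums above can be reproduced with `overflowing_add` from the standard library, which returns the wrapped result along with a flag saying whether a carry fell off the end. The helper name `ones_complement_add_u8` is made up for this sketch, and it works on bytes only to match the diagrams - the real checksum works on 16-bit words.

```rust
/// One's complement sum of two bytes: add, then feed the carry back in.
fn ones_complement_add_u8(a: u8, b: u8) -> u8 {
    let (sum, carried) = a.overflowing_add(b);
    // `sum` is at most 0xFE when a carry happened, so adding 1 can't overflow
    sum + carried as u8
}

fn main() {
    // plain wrapping addition: the carry is dropped
    println!("wrapping:         {}", 128u8.wrapping_add(128));
    // one's complement: the carry comes back as the lowest bit
    println!("one's complement: {}", ones_complement_add_u8(128, 128));
    // no carry: nothing to add back
    println!("no carry:         {}", ones_complement_add_u8(96, 64));
}
```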

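To implement one's complement sum, we can simply do the following: add both values as u32 so the carry has somewhere to go, check whether bit 16 ended up set, and if it did, add 1 before truncating back to u16.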

We'll make a convenience function just for that.

Finally, our result needs to be the "one's complement" of the sum, i.e. we'll need to do a "bitwise NOT":

Rust code
pub fn checksum(slice: &[u8]) -> u16 {
    let (head, slice, tail) = unsafe { slice.align_to::<u16>() };
    if !head.is_empty() {
        panic!("checksum() input should be 16-bit aligned");
    }
    if !tail.is_empty() {
        panic!("checksum() input size should be a multiple of 2 bytes");
    }

    fn add(a: u16, b: u16) -> u16 {
        let s: u32 = (a as u32) + (b as u32);
        if s & 0x1_00_00 > 0 {
            // overflow, add carry bit
            (s + 1) as u16
        } else {
            s as u16
        }
    }

    !slice.iter().fold(0, |x, y| add(x, *y))
}

Cool bear's hot tip

x & bit_pattern > 0 is a fairly standard way of checking whether certain bits are set.

0x1_00_00 is the first bit that doesn't fit in a u16 - each 00 represents a byte. In Rust, you can use underscores (_) as visual separators in literals, which we use to our advantage here.

We could just as well have written 0x10000, 65536, 0b1_00000000_00000000, but 0x1_00_00 just seemed like a good balance.

Let's give our code a go, shall we?

Since we're capturing live network traffic, the checksums we receive and the ones we compute should match up. So in theory, we'd just have to zero out the checksum field, call checksum() on the whole header, and compare them. Makes sense!

But! There's an even nicer way to check. If we checksum the header without zeroing the checksum field, the result should be zero. Let's give it a shot.

Rust code
// in `src/ipv4.rs`

impl Packet {
    pub fn parse(i: parse::Input) -> parse::Result<Self> {
        let original_i = i;
        let (i, (version, ihl)) = bits(tuple((u4::parse, u4::parse)))(i)?;

        let header_slice = &original_i[..(ihl.into() * 4)];
        let computed_checksum = checksum(header_slice);
        println!("--------------------");
        println!("computed {:04X}", computed_checksum);

        // rest of parser

        // etc.

The output looks something like this:

Shell session
$ cargo run
--------------------
computed B066
5.8341465s | Packet {
    ihl: 5,
    dscp: 0,
    ecn: 0,
    length: 60,
    identification: 0249,
    flags: 0,
    fragment_offset: 0,
    ttl: 128,
    checksum: 0000,
    src: 192.168.1.16,
    dst: 8.8.8.8,
    payload: Unknown,
}
--------------------
computed 0000
5.8342008s | Packet {
    ihl: 5,
    dscp: 0,
    ecn: 0,
    length: 60,
    identification: 0000,
    flags: 0,
    fragment_offset: 0,
    ttl: 54,
    checksum: b2f9,
    src: 8.8.8.8,
    dst: 192.168.1.16,
    payload: Unknown,
}

Cool bear's hot tip

As always, ping -t 8.8.8.8 is running in the background.

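So, it's just as we thought previously: outgoing packets are captured with their checksum field still at zero (so our computed value is non-zero), while incoming packets check out and give us a computed value of zero.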

I was curious what could cause that, so I went hunting for an explanation, and here's what I found:

There are lots of NICs that can compute the checksum on chip.

Thus, if libpcap is loaded on a machine that is sending/receiving packets itself, the checksum will validate correctly going in one direction, but not the other (inbound good, outbound bad).

That's because it can sniff the packet contents BEFORE it makes it to the wire, and before the hardware can compute the checksum. The only way to guarantee a proper checksum is to sniff packets that have already made it to the wire (e.g. a mirror port on a switch).

Source: winpcap-users mailing list, June 2005
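
To try the "result should be zero" trick away from live traffic, here's a sketch that rebuilds the two 20-byte headers from the fields printed above and checksums them. It assumes the protocol byte is 1 (ICMP) and that there are no options. Note that this is not the `checksum` from `src/ipv4.rs`: `checksum_be` is a made-up variant that reads big-endian words with `chunks_exact`, so it doesn't depend on alignment or on the host's byte order.

```rust
/// Internet checksum over big-endian 16-bit words (hypothetical helper).
/// Panics on odd-length input, like the version in `src/ipv4.rs`.
fn checksum_be(bytes: &[u8]) -> u16 {
    assert!(bytes.len() % 2 == 0, "input size should be a multiple of 2 bytes");
    let mut sum: u32 = 0;
    for pair in bytes.chunks_exact(2) {
        sum += u16::from_be_bytes([pair[0], pair[1]]) as u32;
    }
    // fold the carries back into the low 16 bits
    while sum > 0xFFFF {
        sum = (sum & 0xFFFF) + (sum >> 16);
    }
    !(sum as u16)
}

fn main() {
    // incoming reply: 8.8.8.8 => 192.168.1.16, ttl 54, checksum b2f9
    let incoming: [u8; 20] = [
        0x45, 0x00, 0x00, 0x3c, 0x00, 0x00, 0x00, 0x00, 0x36, 0x01,
        0xb2, 0xf9, 0x08, 0x08, 0x08, 0x08, 0xc0, 0xa8, 0x01, 0x10,
    ];
    // outgoing request: identification 0249, ttl 128, checksum left at 0000
    let outgoing: [u8; 20] = [
        0x45, 0x00, 0x00, 0x3c, 0x02, 0x49, 0x00, 0x00, 0x80, 0x01,
        0x00, 0x00, 0xc0, 0xa8, 0x01, 0x10, 0x08, 0x08, 0x08, 0x08,
    ];
    println!("incoming {:04X}", checksum_be(&incoming));
    println!("outgoing {:04X}", checksum_be(&outgoing));
}
```

This gives 0000 for the incoming header and 66B0 for the outgoing one. 66B0 is B066 with its bytes swapped, which is what you'd expect if the `align_to` version ran on a little-endian machine: the one's complement sum itself doesn't care about byte order, but the u16 it hands back follows the host's.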

What's inside an ICMP packet

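I think we've danced around it long enough. It's time to look at the structure of an ICMP packet: a 1-byte type, a 1-byte code, a 2-byte checksum, and a 4-byte "rest of the header" field, followed by the payload.

Alright, seems straightforward. At least everything is byte-aligned!

Here are some of the possible values for the type field: 0 is "Echo Reply", 3 is "Destination Unreachable", 8 is "Echo Request", and 11 is "Time Exceeded".

There are many others, but we'll only bother with those for the time being.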

So it looks like we'll need an enum of sorts.

Let's make a new module.

Rust code
// in `src/main.rs`

mod icmp;

Instead of using derive_try_from_primitive as we did for Ethernet, we'll simply have an Other variant, that holds an arbitrary (and unsupported) u8:

Rust code
// in `src/icmp.rs`

#[derive(Debug)]
pub enum Type {
    EchoReply,
    DestinationUnreachable,
    EchoRequest,
    TimeExceeded,
    Other(u8),
}

Okay, good! Now for the code field. Looks like it's unused for "Echo Reply" and "Echo Request", but it has a meaning for "Destination Unreachable" and "Time Exceeded".

For example, for "Destination Unreachable", a code of 1 means "Destination host unreachable", whereas a code of 4 means: "Fragmentation required, and DF flag set". That's interesting! The DF flag means "don't fragment", which means we want our IP datagrams to arrive unfragmented. If we send a datagram large enough, it might make a few hops and eventually reach a host that doesn't support datagrams that large, and we might get back that type+code.

Cool bear's hot tip

Turns out you can use that to find the MTU (maximum transmission unit - i.e. the maximum size of an IP packet) of a path between two hosts.

If you're curious (and who could blame you?), check out Path MTU Discovery.

Which of these do we actually care about though? How many can we reproduce empirically?

What if we try to ping a host we know doesn't exist on the local network?

$ ping 192.168.1.99

Pinging 192.168.1.99 with 32 bytes of data:
Reply from 192.168.1.16: Destination host unreachable.

Okay, that's "Destination Unreachable" with code 1.

What if we set a TTL that we know is too low?

$ ping -i 3 8.8.8.8

Pinging 8.8.8.8 with 32 bytes of data:
Reply from 78.255.77.126: TTL expired in transit.

Okay, that's "Time Exceeded" with code 0.

Let's model those, and leave the others as numbers:

Rust code
// in `src/icmp.rs`

#[derive(Debug)]
pub enum Type {
    EchoReply,
    DestinationUnreachable(DestinationUnreachable),
    EchoRequest,
    TimeExceeded(TimeExceeded),
    // new: we also store the code here
    Other(u8, u8),
}

#[derive(Debug)]
pub enum DestinationUnreachable {
    HostUnreachable,
    Other(u8),
}

#[derive(Debug)]
pub enum TimeExceeded {
    TTLExpired,
    Other(u8),
}

We'll need a way to get a Type from two u8 values, so let's implement that now:

Rust code
impl From<(u8, u8)> for Type {
    fn from(x: (u8, u8)) -> Self {
        let (typ, code) = x;

        match typ {
            0 => Self::EchoReply,
            3 => Self::DestinationUnreachable(code.into()),
            8 => Self::EchoRequest,
            11 => Self::TimeExceeded(code.into()),
            _ => Self::Other(typ, code),
        }
    }
}

impl From<u8> for DestinationUnreachable {
    fn from(x: u8) -> Self {
        match x {
            1 => Self::HostUnreachable,
            x => Self::Other(x),
        }
    }
}

impl From<u8> for TimeExceeded {
    fn from(x: u8) -> Self {
        match x {
            0 => Self::TTLExpired,
            x => Self::Other(x),
        }
    }
}

There. As you can imagine, we could go through every known type/code combination like that, but let's not.

Now I guess it's time to get parsing!

Rust code
use crate::parse;
use custom_debug_derive::*;

#[derive(CustomDebug)]
pub struct Packet {
    pub typ: Type,

    #[debug(format = "{:04x}")]
    pub checksum: u16,
}

use nom::{
    number::complete::{be_u16, be_u8},
    sequence::tuple,
};

impl Packet {
    pub fn parse(i: parse::Input) -> parse::Result<Self> {
        let (i, typ) = {
            let (i, (typ, code)) = tuple((be_u8, be_u8))(i)?;
            (i, Type::from((typ, code)))
        };
        let (i, checksum) = be_u16(i)?;

        let res = Self { typ, checksum };
        Ok((i, res))
    }
}

Alright well the theory is sound, but does it work?

We're going to have to find a way to fit that into our ipv4::Packet.

Rust code
// in `src/ipv4.rs`

// before:
#[derive(Debug)]
pub enum Payload {
    Unknown,
}

// after:
use crate::icmp;

#[derive(Debug)]
pub enum Payload {
    ICMP(icmp::Packet),
    Unknown,
}

Later in the same file:

Rust code
// in `src/ipv4.rs`

impl Packet {
    pub fn parse(i: parse::Input) -> parse::Result<Self> {
        // parse all fields

        let (i, payload) = match protocol {
            Some(Protocol::ICMP) => map(icmp::Packet::parse, Payload::ICMP)(i)?,
            _ => (i, Payload::Unknown),
        };

        let res = Self {
            version,
            ihl,
            dscp,
            ecn,
            length,
            identification,
            flags,
            fragment_offset,
            ttl,
            protocol,
            checksum,
            src,
            dst,
            payload,
        };
        Ok((i, res))
    }
}

At this point, our ipv4::Packet contains redundant information - the constraints of an IPv4 packet are not fully encoded via the type system. When serializing (generating bytes to send on the wire), we'll probably ignore fields like protocol, and use only payload. Likewise, the checksum field is only meaningful when we read an IPv4 packet, not when we generate one.

Nevertheless, let's take it for a spin!

Shell session
$ cargo run --quiet
Listening for packets...
1.0001522s | Packet {
    ihl: 5,
    dscp: 0,
    ecn: 0,
    length: 60,
    identification: 0b92,
    flags: 0,
    fragment_offset: 0,
    ttl: 128,
    checksum: 0000,
    src: 192.168.1.16,
    dst: 8.8.8.8,
    payload: ICMP(
        Packet {
            typ: EchoRequest,
            checksum: 3d34,
        },
    ),
}
1.000264s | Packet {
    ihl: 5,
    dscp: 0,
    ecn: 0,
    length: 60,
    identification: 0000,
    flags: 0,
    fragment_offset: 0,
    ttl: 54,
    checksum: b2f9,
    src: 8.8.8.8,
    dst: 192.168.1.16,
    payload: ICMP(
        Packet {
            typ: EchoReply,
            checksum: 4534,
        },
    ),
}

Yay! Those are definitely ICMP requests and replies.

Let's filter a little more and only show ICMP packets:

Rust code
// in `src/main.rs`

fn process_packet(now: Duration, packet: &BorrowedPacket) {
    match ethernet::Frame::parse(packet) {
        Ok((_remaining, frame)) => {
            if let ethernet::Payload::IPv4(ref ip_packet) = frame.payload {
                if let ipv4::Payload::ICMP(ref icmp_packet) = ip_packet.payload {
                    println!(
                        "{:?} | ({:?}) => ({:?}) | {:#?}",
                        now, ip_packet.src, ip_packet.dst, icmp_packet
                    );
                }
            }
        }
        Err(nom::Err::Error(e)) => {
            println!("{:?} | {:?}", now, e);
        }
        _ => unreachable!(),
    }
}

Shell session
$ cargo run
Listening for packets...
942.155ms | (192.168.1.16) => (8.8.8.8) | Packet {
    typ: EchoRequest,
    checksum: 3bc6,
}
942.2091ms | (8.8.8.8) => (192.168.1.16) | Packet {
    typ: EchoReply,
    checksum: 43c6,
}
1.9728682s | (192.168.1.16) => (8.8.8.8) | Packet {
    typ: EchoRequest,
    checksum: 3bc5,
}
1.9728992s | (8.8.8.8) => (192.168.1.16) | Packet {
    typ: EchoReply,
    checksum: 43c5,
}
2.973634s | (192.168.1.16) => (8.8.8.8) | Packet {
    typ: EchoRequest,
    checksum: 3bc4,
}
2.9736753s | (8.8.8.8) => (192.168.1.16) | Packet {
    typ: EchoReply,
    checksum: 43c4,
}

Good, good. We're not quite done parsing ICMP though. We haven't touched the 4-byte "rest of the header" field yet. Turns out this one also depends on the "type" field.

We're only interested in its content when it's an "Echo Request" or an "Echo Reply", and it has the same contents both times:

Rust code
// in `src/icmp.rs`

#[derive(CustomDebug)]
pub struct Echo {
    #[debug(format = "{:04x}")]
    pub identifier: u16,
    #[debug(format = "{:04x}")]
    pub sequence_number: u16,
}

use nom::combinator::map;

impl Echo {
    fn parse(i: parse::Input) -> parse::Result<Self> {
        map(tuple((be_u16, be_u16)), |(identifier, sequence_number)| {
            Echo {
                identifier,
                sequence_number,
            }
        })(i)
    }
}

#[derive(Debug)]
pub enum Header {
    EchoRequest(Echo),
    EchoReply(Echo),
    Other(u32),
}

And finally, we also want to collect the rest of the ICMP packet. This is going to be binary data, so let's make a type for it.

Rust code
// in `src/main.rs`
mod blob;

Rust code
// in `src/blob.rs`

use std::{cmp::min, fmt};

pub struct Blob(pub Vec<u8>);

impl fmt::Debug for Blob {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        let slice_len = self.0.len();
        let shown_len = 20;
        let slice = &self.0[..min(shown_len, slice_len)];
        write!(f, "[")?;
        for (i, x) in slice.iter().enumerate() {
            let prefix = if i > 0 { " " } else { "" };
            write!(f, "{}{:02x}", prefix, x)?;
        }
        if slice_len > shown_len {
            write!(f, " + {} bytes", slice_len - shown_len)?;
        }
        write!(f, "]")
    }
}

impl Blob {
    pub fn new(slice: &[u8]) -> Self {
        Self(slice.into())
    }
}

There. Now it'll show us the first 20 bytes, and our output won't be too crowded.
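
To get a feel for the output format without capturing anything, here's a simplified stand-in for the `Debug` impl above, written as a free function that returns a `String`. The name `preview` is made up for this sketch, and the sample inputs are arbitrary.

```rust
/// Simplified stand-in for `Blob`'s Debug output (hypothetical helper):
/// formats the first `shown_len` bytes as hex and counts the rest.
fn preview(bytes: &[u8], shown_len: usize) -> String {
    let shown = &bytes[..bytes.len().min(shown_len)];
    let hex: Vec<String> = shown.iter().map(|b| format!("{:02x}", b)).collect();
    let mut out = format!("[{}", hex.join(" "));
    if bytes.len() > shown_len {
        out.push_str(&format!(" + {} bytes", bytes.len() - shown_len));
    }
    out.push(']');
    out
}

fn main() {
    // 32 bytes: the first 20 are shown, the remaining 12 are only counted
    println!("{}", preview(b"abcdefghijklmnopqrstuvwabcdefghi", 20));
    // short inputs are shown in full
    println!("{}", preview(&[0xde, 0xad, 0xbe, 0xef], 20));
}
```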

Let's actually parse the header and grab the payload:

Rust code
// in `src/icmp.rs`

use crate::blob::Blob;

#[derive(CustomDebug)]
pub struct Packet {
    pub typ: Type,

    // new! let's skip that one.
    #[debug(skip)]
    pub checksum: u16,
    // new! those two fields:
    #[debug(format = "{:?}")]
    pub header: Header,
    pub payload: Blob,
}

use nom::{
    number::complete::{be_u16, be_u32, be_u8},
    sequence::tuple,
};

impl Packet {
    pub fn parse(i: parse::Input) -> parse::Result<Self> {
        let (i, typ) = {
            let (i, (typ, code)) = tuple((be_u8, be_u8))(i)?;
            (i, Type::from((typ, code)))
        };
        let (i, checksum) = be_u16(i)?;
        let (i, header) = match typ {
            Type::EchoRequest => map(Echo::parse, Header::EchoRequest)(i)?,
            Type::EchoReply => map(Echo::parse, Header::EchoReply)(i)?,
            _ => map(be_u32, Header::Other)(i)?,
        };
        let payload = Blob::new(i);

        let res = Self {
            typ,
            checksum,
            header,
            payload,
        };
        Ok((i, res))
    }
}

Now we can get a look at what those identifier / sequence number fields do:

Shell session
$ cargo run --quiet
Listening for packets...
1.2998279s | (192.168.1.16) => (8.8.8.8) | Packet {
    typ: EchoRequest,
    header: EchoRequest(Echo { identifier: 0001, sequence_number: 02c8 }),
    payload: [61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 + 12 bytes],
}
1.2999122s | (8.8.8.8) => (192.168.1.16) | Packet {
    typ: EchoReply,
    header: EchoReply(Echo { identifier: 0001, sequence_number: 02c8 }),
    payload: [61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 + 12 bytes],
}
2.3000035s | (192.168.1.16) => (8.8.8.8) | Packet {
    typ: EchoRequest,
    header: EchoRequest(Echo { identifier: 0001, sequence_number: 02c9 }),
    payload: [61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 + 12 bytes],
}
2.300062s | (8.8.8.8) => (192.168.1.16) | Packet {
    typ: EchoReply,
    header: EchoReply(Echo { identifier: 0001, sequence_number: 02c9 }),
    payload: [61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 + 12 bytes],
}
3.3003952s | (192.168.1.16) => (8.8.8.8) | Packet {
    typ: EchoRequest,
    header: EchoRequest(Echo { identifier: 0001, sequence_number: 02ca }),
    payload: [61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 + 12 bytes],
}
3.3004533s | (8.8.8.8) => (192.168.1.16) | Packet {
    typ: EchoReply,
    header: EchoReply(Echo { identifier: 0001, sequence_number: 02ca }),
    payload: [61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 + 12 bytes],
}
(etc.)

Interesting! It seems the sequence number increases, and the identifier is fixed. I tried several runs of ping and, on my version of Windows, it always uses 0001.

The Wikipedia article on ping gives us the deets:

The Identifier and Sequence Number can be used by the client to match the reply with the request that caused the reply.

In practice, most Linux systems use a unique identifier for every ping process, and sequence number is an increasing number within that process.

Windows uses a fixed identifier, which varies between Windows versions, and a sequence number that is only reset at boot time.

It looks like we can do whatever we want, as long as the identifier / sequence_number combo is unique. We also probably want sequence_number to be increasing.
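
Here's one way a client could use that pair to match replies with requests. This is a sketch only: the `Pending` type and its methods are invented for illustration, and nothing like it exists in our code yet.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Tracks echo requests that haven't been answered yet, keyed by
/// (identifier, sequence_number). Hypothetical type, for illustration.
struct Pending {
    identifier: u16,
    next_sequence: u16,
    sent_at: HashMap<(u16, u16), Instant>,
}

impl Pending {
    fn new(identifier: u16) -> Self {
        Self { identifier, next_sequence: 0, sent_at: HashMap::new() }
    }

    /// Records a new request and returns the pair to put in its header.
    fn send(&mut self) -> (u16, u16) {
        let key = (self.identifier, self.next_sequence);
        // the sequence number keeps increasing, wrapping around after 0xFFFF
        self.next_sequence = self.next_sequence.wrapping_add(1);
        self.sent_at.insert(key, Instant::now());
        key
    }

    /// Returns the round-trip time if the reply matches a pending request.
    fn receive(&mut self, identifier: u16, sequence_number: u16) -> Option<Duration> {
        self.sent_at
            .remove(&(identifier, sequence_number))
            .map(|sent| sent.elapsed())
    }
}

fn main() {
    let mut pending = Pending::new(0x0001);
    let (id, seq) = pending.send();
    println!("reply matched: {}", pending.receive(id, seq).is_some());
    println!("stray reply matched: {}", pending.receive(id, 0x1234).is_some());
}
```

Removing the entry on the first match means a duplicated reply won't be counted twice.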

I'm interested in that payload though - let's try to print it as a string.

Rust code
// in `src/main.rs`

fn process_packet(now: Duration, packet: &BorrowedPacket) {
    match ethernet::Frame::parse(packet) {
        Ok((_remaining, frame)) => {
            if let ethernet::Payload::IPv4(ref ip_packet) = frame.payload {
                if let ipv4::Payload::ICMP(ref icmp_packet) = ip_packet.payload {
                    println!(
                        "{:?} | ({:?}) => ({:?}) | {:#?}",
                        now, ip_packet.src, ip_packet.dst, icmp_packet
                    );
                    // new!
                    let payload = String::from_utf8_lossy(&icmp_packet.payload.0);
                    println!("payload = {}", payload);
                }
            }
        }
        Err(nom::Err::Error(e)) => {
            println!("{:?} | {:?}", now, e);
        }
        _ => unreachable!(),
    }
}

Shell session
$ cargo run --quiet
Listening for packets...
1.000148s | (192.168.1.16) => (8.8.8.8) | Packet {
    typ: EchoRequest,
    header: EchoRequest(Echo { identifier: 0001, sequence_number: 02eb }),
    payload: [61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 + 12 bytes],
}
payload = abcdefghijklmnopqrstuvwabcdefghi
1.0002762s | (8.8.8.8) => (192.168.1.16) | Packet {
    typ: EchoReply,
    header: EchoReply(Echo { identifier: 0001, sequence_number: 02eb }),
    payload: [61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 + 12 bytes],
}
payload = abcdefghijklmnopqrstuvwabcdefghi
1.8147006s | (192.168.1.16) => (8.8.8.8) | Packet {
    typ: EchoRequest,
    header: EchoRequest(Echo { identifier: 0001, sequence_number: 02ec }),
    payload: [61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 + 12 bytes],
}
payload = abcdefghijklmnopqrstuvwabcdefghi
1.8147644s | (8.8.8.8) => (192.168.1.16) | Packet {
    typ: EchoReply,
    header: EchoReply(Echo { identifier: 0001, sequence_number: 02ec }),
    payload: [61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 + 12 bytes],
}
payload = abcdefghijklmnopqrstuvwabcdefghi

Huzzah! This is exactly what we figured out when we were messing with the built-in Windows ICMP APIs.

What does the payload look like if we intentionally provoke "TTL expired in transit"?

Shell session
$ cargo run --quiet
Listening for packets...
2.7051148s | (192.168.1.16) => (8.8.8.8) | Packet {
    typ: EchoRequest,
    header: EchoRequest(Echo { identifier: 0001, sequence_number: 036f }),
    payload: [61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 + 12 bytes],
}
payload = abcdefghijklmnopqrstuvwabcdefghi
2.7051798s | (78.255.77.126) => (192.168.1.16) | Packet {
    typ: TimeExceeded(
        TTLExpired,
    ),
    header: Other(0),
    payload: [45 00 00 3c 3e 56 00 00 01 01 a9 a3 c0 a8 01 10 08 08 08 08 + 8 bytes],
}
payload = E<>I�o��
3.7051146s | (192.168.1.16) => (8.8.8.8) | Packet {
    typ: EchoRequest,
    header: EchoRequest(Echo { identifier: 0001, sequence_number: 0370 }),
    payload: [61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 + 12 bytes],
}
payload = abcdefghijklmnopqrstuvwabcdefghi
3.7051765s | (78.255.77.126) => (192.168.1.16) | Packet {
    typ: TimeExceeded(
        TTLExpired,
    ),
    header: Other(0),
    payload: [45 00 00 3c 3e 57 00 00 01 01 a9 a2 c0 a8 01 10 08 08 08 08 + 8 bytes],
}
payload = E<>I�p��

This was obtained by running ping -i 3 8.8.8.8. Note that the payload we send is still a bunch of letters from the alphabet. The payload we get back, though, does not look like a string at all.

Cool bear's hot tip

The Unicode replacement character '�' that you see is not because your device is missing some fonts. It's been inserted by from_utf8_lossy because parts of the input were not valid UTF-8.

In fact, 45 00 00 looks really familiar. It looks like an IPv4 packet, where 4 is the version, 5 is the IHL (5 32-bit words = 20 bytes), then dscp and ecn are both zero, then total length is 00 3c, the 16-bit big-endian integer 0x3c, which is 60.
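
That reading can be double-checked with a few shifts and masks. `decode_start` is a made-up helper for this sketch, not part of our parser, and it only looks at the first four bytes.

```rust
/// Decodes the first four bytes of an IPv4 header into
/// (version, ihl, dscp, ecn, total_length). Hypothetical helper.
fn decode_start(bytes: [u8; 4]) -> (u8, u8, u8, u8, u16) {
    let version = bytes[0] >> 4; // high nibble
    let ihl = bytes[0] & 0x0F; // low nibble, counted in 32-bit words
    let dscp = bytes[1] >> 2; // top six bits
    let ecn = bytes[1] & 0b11; // bottom two bits
    let total_length = u16::from_be_bytes([bytes[2], bytes[3]]);
    (version, ihl, dscp, ecn, total_length)
}

fn main() {
    let (version, ihl, dscp, ecn, total_length) = decode_start([0x45, 0x00, 0x00, 0x3c]);
    println!("version {}, ihl {} ({} bytes)", version, ihl, ihl as usize * 4);
    println!("dscp {}, ecn {}, total length {}", dscp, ecn, total_length);
}
```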

Let's dump outgoing echo requests as a Blob:

Rust code
// in `src/ipv4.rs`

// in Packet::parse
let (i, (src, dst)) = tuple((Addr::parse, Addr::parse))(i)?;

if src.0[0] == 192 {
    let blob = crate::blob::Blob::new(original_i);
    println!("outgoing ipv4 packet: {:?}", blob);
}

Shell session
$ cargo run --quiet
Listening for packets...
outgoing ipv4 packet: [45 00 00 3c 3e cb 00 00 03 01 00 00 c0 a8 01 10 08 08 08 08 + 40 bytes]
999.4404ms | (192.168.1.16) => (8.8.8.8) | Packet {
    typ: EchoRequest,
    header: EchoRequest(Echo { identifier: 0001, sequence_number: 0374 }),
    payload: [61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 + 12 bytes],
}
payload = abcdefghijklmnopqrstuvwabcdefghi
999.5319ms | (78.255.77.126) => (192.168.1.16) | Packet {
    typ: TimeExceeded(
        TTLExpired,
    ),
    header: Other(0),
    payload: [45 00 00 3c 3e cb 00 00 01 01 a9 2e c0 a8 01 10 08 08 08 08 + 8 bytes],
}
payload = E<>I�t��

AhAH! Those look really similar:

[45 00 00 3c 3e cb 00 00 03 01 00 00 c0 a8 01 10 08 08 08 08 + 40 bytes]
[45 00 00 3c 3e cb 00 00 01 01 a9 2e c0 a8 01 10 08 08 08 08 + 8 bytes]
                         ~~    ~~ ~~

(top: outgoing ipv4 packet, bottom: ICMP payload for "TTL expired in transit")

They're not exactly the same though - what changed? We know the IPv4 header structure, let's review:

Everything is the same up until the TTL - that makes sense! For every hop, the TTL is decreased by one. It makes complete sense that it expired when the TTL was about to drop to zero.

As for the checksum, we've already established that when using npcap, outgoing packets had a zero checksum. If that wasn't the case, it would still be different though, as changing the TTL changes the packet's checksum. Everything checks out.
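
We can sanity-check that by recomputing both checksums from the bytes shown above. The sketch below uses a made-up `checksum_be` helper that reads big-endian words (so it isn't the `align_to` version from `src/ipv4.rs`), and assumes the header is exactly those 20 bytes.

```rust
/// Internet checksum over big-endian 16-bit words (hypothetical helper).
/// A trailing odd byte, if any, is ignored by `chunks_exact`.
fn checksum_be(bytes: &[u8]) -> u16 {
    let sum: u32 = bytes
        .chunks_exact(2)
        .map(|pair| u16::from_be_bytes([pair[0], pair[1]]) as u32)
        .sum();
    // fold carries back in; for a header-sized input, twice is enough
    let sum = (sum & 0xFFFF) + (sum >> 16);
    let sum = (sum & 0xFFFF) + (sum >> 16);
    !(sum as u16)
}

fn main() {
    // the header as we captured it on the way out: TTL 3, checksum 0000
    let mut header: [u8; 20] = [
        0x45, 0x00, 0x00, 0x3c, 0x3e, 0xcb, 0x00, 0x00, 0x03, 0x01,
        0x00, 0x00, 0xc0, 0xa8, 0x01, 0x10, 0x08, 0x08, 0x08, 0x08,
    ];
    println!("ttl 3 => {:04x}", checksum_be(&header));

    header[8] = 1; // the TTL as it appears in the "Time Exceeded" payload
    println!("ttl 1 => {:04x}", checksum_be(&header));
}
```

This gives a72e for a TTL of 3 and a92e for a TTL of 1, and the latter is the a9 2e we see in the ICMP payload.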

Another interesting thing to notice is that we're trying to ping 8.8.8.8, but it's 78.255.77.126 that's replying to us. That also makes sense - we never made it all the way to 8.8.8.8, so some node in the middle replied to us.

In this case, the node that replied to us was a host operated by Proxad (aka the French ISP 'Free'), according to ipinfo.io.

What did we learn?

To ping a host, we send an "Echo request" to it. If everything goes well, that host sends an "Echo reply" back to us.

If something goes wrong, another host might reply to us with an ICMP message like "Time exceeded", and part of the original packet.

Crafting our own ICMP packets

I think we know everything we need to know to start making our own ICMP packets.

We used nom for parsing packets. Is there a similar crate for serializing arbitrary binary data? There is! By the same author! Hi Geoffroy, love your stuff.

Let's grab cookie-factory.

Shell session
$ cargo add cookie-factory
      Adding cookie-factory v0.3.0 to dependencies

Now, we need to take a few steps back and think about serialization though.

Our parsers looked something like this:

Rust code
fn parse<'a>(input: &'a [u8]) -> IResult<&'a [u8], Self, Error<&'a [u8]>> {
    // parser goes here.
}

In other words, we had the following constraints:

So the input is borrowed until we either:

When serializing though, we're generating bytes. Do we still need to worry about borrowing / ownership? That all depends.

We could take an std::io::Write and do something like this:

Rust code
// in `src/icmp.rs`

impl Packet {
    pub fn serialize(&self, w: &mut dyn io::Write) -> Result<(), io::Error> {
        match self.typ {
            Type::EchoRequest => {
                // write "type" and "code" fields
                w.write_all(&[8, 0])?;
                // TODO: write the rest
            }
            Type::EchoReply => {
                // write "type" and "code" fields
                w.write_all(&[0, 0])?;
                // TODO: write the rest
            }
            _ => unimplemented!(),
        }

        Ok(())
    }
}

However, we're not going to do that. One of the many reasons: why the heck are we dealing with I/O errors here? We're just generating bytes, we're not supposed to care whether they're successfully written out or not.

But the point is: if we were doing that, then we wouldn't have to worry about ownership. By the time serialize returns, we can throw away the icmp::Packet completely, because it's all written out.
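Here's a small, self-contained sketch of that idea. The `EchoRequest` struct is a hypothetical stand-in (not our real `icmp::Packet`), and the checksum is left zeroed. It only shows that once `serialize` returns, the value can go away and the bytes remain:

```rust
use std::io::{self, Write};

/// Hypothetical stand-in for an ICMP echo request (not the real `icmp::Packet`).
struct EchoRequest {
    identifier: u16,
    sequence_number: u16,
}

impl EchoRequest {
    /// Writes type, code, a zeroed checksum, then both fields in network order.
    fn serialize(&self, w: &mut dyn Write) -> io::Result<()> {
        w.write_all(&[8, 0])?; // type = 8 (echo request), code = 0
        w.write_all(&[0, 0])?; // checksum placeholder
        w.write_all(&self.identifier.to_be_bytes())?;
        w.write_all(&self.sequence_number.to_be_bytes())?;
        Ok(())
    }
}

/// Serializes into a `Vec<u8>`, then lets the packet go out of scope.
fn serialize_then_drop() -> io::Result<Vec<u8>> {
    let mut out = Vec::new();
    {
        let echo = EchoRequest {
            identifier: 0x0001,
            sequence_number: 0x03b7,
        };
        echo.serialize(&mut out)?;
        // `echo` is dropped here, the bytes live on in `out`
    }
    Ok(out)
}
```

Note how we still have to thread an `io::Result` through everything, even though the destination here is a plain `Vec<u8>`.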

Instead, we're going to do something similar to what we did for parsing with nom, except, the other way around.

Let's start by serializing something simple, like Echo:

Rust code
// in `src/icmp.rs`

use cookie_factory as cf;
use std::io;

impl Echo {
    pub fn serialize<'a, W: io::Write + 'a>(&'a self) -> impl cf::SerializeFn<W> + 'a {
        use cf::{bytes::be_u16, sequence::tuple};
        tuple((be_u16(self.identifier), be_u16(self.sequence_number)))
    }
}

Wonderful! It's just like nom!

We return a function that is valid for as long as self is borrowed ('a), and that writes to any W implementing io::Write.

Also, this serializer is purely declarative. It's just a tuple of 16-bit big-endian numbers!

What did we learn?

The same way nom lets us combine parsers, cookie-factory lets us combine serializers.

Serializers are just functions. They borrow (or copy) their input, and can serialize it any number of times to a compatible output type. They're entirely uninterested in I/O errors, which are handled on another level.
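If "serializers are just functions" sounds too abstract, here's a toy version using only the standard library. To be clear: this is not how cookie-factory is implemented (its `SerializeFn` works on a write context and returns a result). The `be_u16` and `pair` below are simplified, hypothetical look-alikes that append to a `Vec<u8>`:

```rust
/// Toy serializer: returns a function that appends `value` as big-endian bytes.
fn be_u16(value: u16) -> impl Fn(&mut Vec<u8>) {
    move |out: &mut Vec<u8>| out.extend_from_slice(&value.to_be_bytes())
}

/// Toy combinator: runs two serializers one after the other.
fn pair(
    first: impl Fn(&mut Vec<u8>),
    second: impl Fn(&mut Vec<u8>),
) -> impl Fn(&mut Vec<u8>) {
    move |out: &mut Vec<u8>| {
        first(out);
        second(out);
    }
}

/// Hypothetical stand-in for our `Echo` struct.
struct Echo {
    identifier: u16,
    sequence_number: u16,
}

impl Echo {
    /// Purely declarative: a pair of 16-bit big-endian numbers.
    fn serialize(&self) -> impl Fn(&mut Vec<u8>) {
        pair(be_u16(self.identifier), be_u16(self.sequence_number))
    }
}

/// Builds the serializer once, then runs it against two separate outputs.
fn serialize_twice() -> (Vec<u8>, Vec<u8>) {
    let echo = Echo {
        identifier: 0x0001,
        sequence_number: 0x03b7,
    };
    let serializer = echo.serialize();
    let (mut a, mut b) = (Vec::new(), Vec::new());
    serializer(&mut a);
    serializer(&mut b);
    (a, b)
}
```

Nothing gets written until the returned function is called, and calling it twice gives the same bytes twice.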

Let's move on up:

Rust code
// in `src/icmp.rs`

#[derive(Debug)]
pub enum Header {
    EchoRequest(Echo),
    EchoReply(Echo),
    Other(u32),
}

impl Header {
    fn serialize<'a, W: io::Write + 'a>(&'a self) -> impl cf::SerializeFn<W> + 'a {
        // mhhh.
    }
}

Okay so. There's no map equivalent in cookie-factory. And we definitely want to write different things based on which variant of Header we have here.

No worries! We're just returning a function, right? And that can be a closure, correct? Right. Let's go:

Rust code
// in `src/icmp.rs`

impl Header {
    fn serialize<'a, W: io::Write + 'a>(&'a self) -> impl cf::SerializeFn<W> + 'a {
        use cf::bytes::be_u32;

        move |out| match self {
            Self::EchoRequest(echo) | Self::EchoReply(echo) => echo.serialize()(out),
            Self::Other(x) => be_u32(*x)(out),
        }
    }
}

Alright, that works! Having a match here will ensure that we have exhaustive coverage of all the variants, so if we handle other kinds of ICMP request headers, we'll get a nice compile-time error here to remind us to serialize that, too.

Now, we said earlier that, when serializing, we'd disregard the typ field because it's redundant with Header.

Also, we're going to need our ICMP packet to have a valid checksum, and some of the information we can serialize from Header will be before the checksum, while the rest will be after the checksum.

So let's make another function to serialize what goes before the checksum.

Rust code
// in `src/icmp.rs`

impl Header {
    // omitted: serialize()

    fn serialize_type_and_code<'a, W: io::Write + 'a>(&'a self) -> impl cf::SerializeFn<W> + 'a {
        use cf::{bytes::be_u8, sequence::tuple};

        move |out| match self {
            Self::EchoRequest(_) => tuple((be_u8(8), be_u8(0)))(out),
            Self::EchoReply(_) => tuple((be_u8(0), be_u8(0)))(out),
            // we're not planning on sending any "TTL Expired" or
            // "Host unreachable" ICMP packets.
            _ => unimplemented!(),
        }
    }
}

Let's also make Blob serializable:

Rust code
// in `src/blob.rs`

use cookie_factory as cf;
use std::io;

impl Blob {
    pub fn serialize<'a, W: io::Write + 'a>(&'a self) -> impl cf::SerializeFn<W> + 'a {
        use cf::combinator::slice;
        slice(&self.0)
    }
}

Alright. It's just a slice of bytes! Again, we borrow the Blob for 'a and the returned serializer function is valid for 'a. It all works out.

We're ready to move on to icmp::Packet. Let's first make a serializer that ignores the checksum field:

Rust code
// in `src/icmp.rs`

impl Packet {
    pub fn serialize_no_checksum<'a, W: io::Write + 'a>(&'a self) -> impl cf::SerializeFn<W> + 'a {
        use cf::{bytes::be_u16, sequence::tuple};

        tuple((
            self.header.serialize_type_and_code(),
            be_u16(0), // checksum
            self.header.serialize(),
            self.payload.serialize(),
        ))
    }
}

Alright, I think we're far enough along that we can try it out.

We'll try it out from icmp::Packet::parse, because it's a place where we have access to a u8 slice that should look roughly the same.

Rust code
// in `src/icmp.rs`

impl Packet {
    pub fn parse(i: parse::Input) -> parse::Result<Self> {
        let original_i = i;

        // rest of parser
        let res = Self {
            typ,
            checksum,
            header,
            payload,
        };

        // here we go:
        let serialized = cf::gen_simple(res.serialize_no_checksum(), Vec::new()).unwrap();

        println!("  original = {:?}", Blob::new(original_i));
        println!("serialized = {:?}", Blob::new(&serialized));

        Ok((i, res))
    }
}

Shell session
$ cargo run --quiet
Listening for packets...
  original = [08 00 49 a4 00 01 03 b7 61 62 63 64 65 66 67 68 69 6a 6b 6c + 20 bytes]
serialized = [08 00 00 00 00 01 03 b7 61 62 63 64 65 66 67 68 69 6a 6b 6c + 20 bytes]
1.0002556s | (192.168.1.16) => (8.8.8.8) | Packet {
    typ: EchoRequest,
    header: EchoRequest(Echo { identifier: 0001, sequence_number: 03b7 }),
    payload: [61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 + 12 bytes],
}
  original = [00 00 51 a4 00 01 03 b7 61 62 63 64 65 66 67 68 69 6a 6b 6c + 20 bytes]
serialized = [00 00 00 00 00 01 03 b7 61 62 63 64 65 66 67 68 69 6a 6b 6c + 20 bytes]
1.0003878s | (8.8.8.8) => (192.168.1.16) | Packet {
    typ: EchoReply,
    header: EchoReply(Echo { identifier: 0001, sequence_number: 03b7 }),
    payload: [61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 + 12 bytes],
}
  original = [08 00 49 a3 00 01 03 b8 61 62 63 64 65 66 67 68 69 6a 6b 6c + 20 bytes]
serialized = [08 00 00 00 00 01 03 b8 61 62 63 64 65 66 67 68 69 6a 6b 6c + 20 bytes]
2.0002365s | (192.168.1.16) => (8.8.8.8) | Packet {
    typ: EchoRequest,
    header: EchoRequest(Echo { identifier: 0001, sequence_number: 03b8 }),
    payload: [61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 + 12 bytes],
}
  original = [00 00 51 a3 00 01 03 b8 61 62 63 64 65 66 67 68 69 6a 6b 6c + 20 bytes]
serialized = [00 00 00 00 00 01 03 b8 61 62 63 64 65 66 67 68 69 6a 6b 6c + 20 bytes]
2.0003223s | (8.8.8.8) => (192.168.1.16) | Packet {
    typ: EchoReply,
    header: EchoReply(Echo { identifier: 0001, sequence_number: 03b8 }),
    payload: [61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 + 12 bytes],
}

Heyyyyy. They do look similar. The only differences are at byte offsets 2 and 3, which.. that's the checksum!

Marty! You've gotta come back with me!

Where? Back to the buffer!

So we've got a little pickle. We need to compute a checksum for our freshly-generated ICMP packet, but in order to do that we need to generate some stuff before and some stuff after the checksum, so by the time we need it, we don't yet have all the information we need.

cookie-factory has something for that, a back_to_the_buffer combinator.

Let's say you're writing a binary format that has messages of variable length, and when serialized, they're prefixed by their length, as a u32.

back_to_the_buffer would let you reserve 4 bytes, then write out the message itself.

Then it would go back, and invoke a closure with a write context at the position we reserved, with the result of the message payload's serialization. Now that it's done, we know it's of length 12, so we can write a u32 with value 12.

And voilà!
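To illustrate the idea (and only the idea), here's a hand-rolled sketch on a plain `Vec<u8>`. It doesn't use cookie-factory at all: `write_length_prefixed` is a hypothetical helper, and writing the length as big-endian is an assumption made for this example.

```rust
/// Hypothetical helper: writes a message prefixed by its length as a big-endian u32.
/// It reserves 4 bytes, lets `write_message` do its thing, then goes back
/// and fills in the length.
fn write_length_prefixed(out: &mut Vec<u8>, write_message: impl FnOnce(&mut Vec<u8>)) {
    // 1. reserve 4 bytes for the length
    let len_pos = out.len();
    out.extend_from_slice(&[0u8; 4]);

    // 2. write out the message itself
    let start = out.len();
    write_message(out);
    let len = (out.len() - start) as u32;

    // 3. go back to the reserved spot, now that we know the length
    out[len_pos..len_pos + 4].copy_from_slice(&len.to_be_bytes());
}

/// Writes a 12-byte message, prefixed by its length.
fn length_prefixed_example() -> Vec<u8> {
    let mut out = Vec::new();
    write_length_prefixed(&mut out, |buf| buf.extend_from_slice(b"hello, world"));
    out
}
```

The message is 12 bytes long, so the output starts with `00 00 00 0c`, followed by the message itself.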

Unfortunately, we can't use that. We'd need to reserve data in the middle.

We could technically come up with our own riff on BackToTheBuffer (it's all just a bunch of seeks anyway - I know, I looked), but let's not worry too much about it for now. It's nice to know we have the option, if we ever need to squeeze some extra performance out of our custom network stack.

Cool bear's hot tip

We won't.

We won't need to.

So for now, let's do it the old-fashioned way: first generate the entire ICMP packet in a buffer, then compute the checksum, then overwrite the checksum part of the buffer with our u16 result.

The code is actually not that bad:

Rust code
// in `src/icmp.rs`

impl Packet {
    // omitted: `serialize_no_checksum()`

    pub fn serialize<'a, W: io::Write + 'a>(&'a self) -> impl cf::SerializeFn<W> + 'a {
        use cf::{bytes::le_u16, combinator::slice};

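        move |out| {
            // serialize everything into a temporary buffer, with a zeroed checksum
            let mut buf = cf::gen_simple(self.serialize_no_checksum(), Vec::new())?;
            // compute the checksum over the whole ICMP packet
            let checksum = crate::ipv4::checksum(&buf);
            // overwrite bytes 2..4 (the checksum field) in place
            cf::gen_simple(le_u16(checksum), &mut buf[2..])?;
            // finally, write the patched buffer to the actual output
            slice(buf)(out)
        }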
    }
}

Good, good. WAIT A MINUTE, le_u16? As in, little-endian?

Well here's a neat thing we haven't really covered about the "internet checksum" we implemented earlier: endianness "doesn't really matter".

By which I mean, we could convert all u16 in the input from big-endian to little-endian, do the one's complement sum, take the one's complement of the result, and then write it as big-endian.

Or we could do no conversion at all, do everything as little-endian, and write it out as little-endian, and get the exact same result.

But don't take it from me, take it from RFC 1071:

The sum of 16-bit integers can be computed in either byte order. Thus, if we calculate the swapped sum:

[B,A] +' [D,C] +' ... +' [Z,Y]

the result is the same as [before], except the bytes are swapped in the sum! To see why this is so, observe that in both orders the carries are the same: from bit 15 to bit 0 and from bit 7 to bit 8. In other words, consistently swapping bytes simply rotates the bits within the sum, but does not affect their internal ordering.

Therefore, the sum may be calculated in exactly the same way regardless of the byte order ("big-endian" or "little-endian") of the underlaying hardware. For example, assume a "little-endian" machine summing data that is stored in memory in network ("big-endian") order. Fetching each 16-bit word will swap bytes, resulting in the sum [above]; however, storing the result back into memory will swap the sum back into network byte order.

Note that our code will now only work on little-endian machines, because we're explicitly using le_u16 when serializing, which would do an extra swap on big-endian machines. So, you know, don't go run this on OpenRISC with no modifications.
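If you'd rather see it than believe it, here's a small standalone sketch. All three functions are hypothetical helpers written for this example (they're not the `ipv4::checksum` from earlier), and they assume an even-length input. Each one reads the words in a given byte order and writes the checksum back in that same byte order:

```rust
/// One's complement sum of 16-bit words, with `to_word` deciding the byte order.
/// Assumes `data` has an even length.
fn ones_complement_sum(data: &[u8], to_word: fn([u8; 2]) -> u16) -> u16 {
    let mut sum: u32 = 0;
    for pair in data.chunks_exact(2) {
        sum += u32::from(to_word([pair[0], pair[1]]));
    }
    // fold the carries back in
    while sum > 0xffff {
        sum = (sum & 0xffff) + (sum >> 16);
    }
    sum as u16
}

/// Read words as big-endian, write the checksum as big-endian.
fn checksum_bytes_be(data: &[u8]) -> [u8; 2] {
    (!ones_complement_sum(data, u16::from_be_bytes)).to_be_bytes()
}

/// Read words as little-endian, write the checksum as little-endian.
fn checksum_bytes_le(data: &[u8]) -> [u8; 2] {
    (!ones_complement_sum(data, u16::from_le_bytes)).to_le_bytes()
}

/// Read and write in whatever byte order the machine uses.
fn checksum_bytes_ne(data: &[u8]) -> [u8; 2] {
    (!ones_complement_sum(data, u16::from_ne_bytes)).to_ne_bytes()
}
```

For the made-up input `ff fe 00 03 12 34`, all three produce the bytes `ed c9`. The last variant hints at one possible way around the little-endian-only restriction: as long as the checksum is read and written with the same byte order, the bytes on the wire come out the same.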

Anyway, does it even work?

Rust code
// in `src/icmp.rs`
// in `Packet::parse`

// was: `res.serialize_no_checksum()`
let serialized = cf::gen_simple(res.serialize(), Vec::new()).unwrap();

println!("  original = {:?}", Blob::new(original_i));
println!("serialized = {:?}", Blob::new(&serialized));
assert_eq!(original_i, &serialized[..]);

Ok((i, res))

Shell session
$ cargo run --quiet
Listening for packets...
  original = [08 00 42 c1 00 01 0a 9a 61 62 63 64 65 66 67 68 69 6a 6b 6c + 20 bytes]
serialized = [08 00 42 c1 00 01 0a 9a 61 62 63 64 65 66 67 68 69 6a 6b 6c + 20 bytes]
1.0000433s | (192.168.1.16) => (8.8.8.8) | Packet {
    typ: EchoRequest,
    header: EchoRequest(Echo { identifier: 0001, sequence_number: 0a9a }),
    payload: [61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 + 12 bytes],
}
  original = [00 00 4a c1 00 01 0a 9a 61 62 63 64 65 66 67 68 69 6a 6b 6c + 20 bytes]
serialized = [00 00 4a c1 00 01 0a 9a 61 62 63 64 65 66 67 68 69 6a 6b 6c + 20 bytes]
1.0001887s | (8.8.8.8) => (192.168.1.16) | Packet {
    typ: EchoReply,
    header: EchoReply(Echo { identifier: 0001, sequence_number: 0a9a }),
    payload: [61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 + 12 bytes],
}
  original = [08 00 42 c0 00 01 0a 9b 61 62 63 64 65 66 67 68 69 6a 6b 6c + 20 bytes]
serialized = [08 00 42 c0 00 01 0a 9b 61 62 63 64 65 66 67 68 69 6a 6b 6c + 20 bytes]
2.7004159s | (192.168.1.16) => (8.8.8.8) | Packet {
    typ: EchoRequest,
    header: EchoRequest(Echo { identifier: 0001, sequence_number: 0a9b }),
    payload: [61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 + 12 bytes],
}
  original = [00 00 4a c0 00 01 0a 9b 61 62 63 64 65 66 67 68 69 6a 6b 6c + 20 bytes]
serialized = [00 00 4a c0 00 01 0a 9b 61 62 63 64 65 66 67 68 69 6a 6b 6c + 20 bytes]
2.7004974s | (8.8.8.8) => (192.168.1.16) | Packet {
    typ: EchoReply,
    header: EchoReply(Echo { identifier: 0001, sequence_number: 0a9b }),
    payload: [61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 + 12 bytes],
}

It does! And we don't even have to worry about those extra 20 bytes that Blob doesn't print, because we sneaked an assert_eq! in there, so our code would crash if it didn't work.

And just like that, we're seemingly real close to sending our own, hand-crafted network traffic... We just have to also serialize IPv4 packets and, oh, Ethernet frames. Well that shouldn't be too hard, right?

Right?

😐