Parsing and serializing ICMP packets with cookie-factory.
From the series
Making our own ping
In the last part, we've finally parsed some IPv4 packets. We even found a way to filter only IPv4 packets that contain ICMP packets.
There's one thing we haven't done though, and that's verify their checksum. Folks could be sending us invalid IPv4 packets and we'd be parsing them like a fool!
This series is getting quite long, so let's jump right into it.
Let's read a bit of RFC 791:
A checksum on the header only. Since some header fields change (e.g., time to live), this is recomputed and verified at each point that the internet header is processed.
The checksum algorithm is:
The checksum field is the 16 bit one's complement of the one's complement sum of all 16 bit words in the header. For purposes of computing the checksum, the value of the checksum field is zero.
This is a simple to compute checksum and experimental evidence indicates it is adequate, but it is provisional and may be replaced by a CRC procedure, depending on further experience.
Yeah, it was never replaced with a Cyclic Redundancy Check. Note that this RFC was published in 1981 - the CRC-32 had been invented 20 years prior!
That said, CRC-32 is used in Ethernet - it's the 4 bytes right after the payload, the field is named "Frame check sequence".
Note also that different polynomials can be used to compute CRC-32s - the IEEE one (used in Ethernet) is "not the best". That's all I'll say about this now, one could spend an entire article talking about those.
The idea behind checksumming is to include a hash alongside the actual data, to detect transmission errors. We're interested in detection only - if we needed to be able to correct those errors, we'd need an Error correction code.
A hash function needs to have several properties in order to be suitable for error detection. The most important one is that when the input changes even slightly, the resulting hash should be completely different. It's likely for a transmission error to be a single bit flip - and even that should give a completely different sum:
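To make that concrete, here's a small sketch (not part of the project code): a bit-by-bit CRC-32 using the IEEE polynomial in its reflected form, run on a message and on the same message with a single bit flipped. The function name crc32 and the test message are made up for illustration.

```rust
/// Bitwise CRC-32 (IEEE polynomial, reflected form 0xEDB88320).
/// Slow but short; real implementations usually use lookup tables.
fn crc32(data: &[u8]) -> u32 {
    let mut crc: u32 = 0xFFFF_FFFF;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            // If the low bit is set, shift and XOR in the polynomial.
            if crc & 1 == 1 {
                crc = (crc >> 1) ^ 0xEDB8_8320;
            } else {
                crc >>= 1;
            }
        }
    }
    !crc
}

fn main() {
    let original = *b"hello world";
    let mut corrupted = original;
    corrupted[0] ^= 0b0000_0001; // flip a single bit: 'h' becomes 'i'

    let a = crc32(&original);
    let b = crc32(&corrupted);
    println!("original  = {:08x}", a);
    println!("corrupted = {:08x}", b);
    println!("bits that differ: {}", (a ^ b).count_ones());
}
```

A CRC is guaranteed to catch any single-bit error, so the two values can never be equal; in practice they typically differ in many bit positions.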
One thing that isn't so important is for the hash function to be collision-resistant. It is not particularly hard to find two inputs whose CRC-32 are the same:
Checksums are meant to detect accidental data corruption. They are useless against intentional data corruption.
To protect against the latter, you'll need a cryptographic hash function.
We're going to be using the "internet checksum" twice: once for IPv4, and another time for ICMP. We're also going to need it when we eventually generate packets of our own.
So we want to make it re-usable. How about we make it take a slice of bytes?
// in `src/ipv4.rs`

pub fn checksum(slice: &[u8]) -> u16 {
    unimplemented!()
}
Ah, a good start. The algorithm says to take the "one's complement sum of all 16 bit words in the header". But we don't have a slice of u16, we have a slice of u8.
Well, I'm sure we can work that out. We're all reasonable people here - what's a little reinterpretation between friends?
Let's take a look at std::slice::align_to:
// in rust standard library

pub unsafe fn align_to<U>(&self) -> (&[T], &[U], &[T])
The docs say the following:
Transmute the slice to a slice of another type, ensuring alignment of the types is maintained.
This method splits the slice into three distinct slices: prefix, correctly aligned middle slice of a new type, and the suffix slice. The method may make the middle slice the greatest length possible for a given type and input slice, but only your algorithm's performance should depend on that, not its correctness. It is permissible for all of the input data to be returned as the prefix or suffix slice.
This method has no purpose when either input element T or output element U are zero-sized and will return the original slice without splitting anything.
Well, the suffix (let's call it tail) is pretty self-explanatory, right? If we have five u8, we can only make two u16, and one byte is left over at the end.

What about the prefix (let's call it head) though?

Well, what if the start of our u8 slice is not on a 16-bit (2-byte) boundary?
In our case:
- We expect the input to be 16-bit aligned. Since it'll often be a freshly-allocated slice, it'll often be 64-bit aligned in fact, which is enough for us.
- We also expect the input's size to be a multiple of 16 bits. This is at least true for IPv4 headers, whose size is expressed in "32-bit words" (that's the unit of the ihl field).
However, stranger things have happened, so we'll enforce our assumptions by panicking if they're broken. (And if we ever hit that panic, we'll need a plan B).
But first, let's validate our assumptions.
// in `src/main.rs`

use std::fmt::UpperHex;

fn dump<T: UpperHex + Sized, S: AsRef<[T]>>(name: &str, slice: S) {
    print!("{:>12} ", name);
    for s in slice.as_ref() {
        print!("{:02X} ", s);
    }
    println!()
}

fn main() {
    // make an array of four `u16` integers
    let input_16: [u16; 4] = [0x1122, 0x3344, 0x5566, 0x7788];
    dump("input_16", input_16);

    // violently transmute that to an array of `u8`
    // note: on little-endian, those will seem out of order
    let input_8: [u8; 8] = unsafe { std::mem::transmute(input_16) };
    dump("input_8", input_8);

    // get a sub-slice that isn't aligned on the 16-bit boundary
    let sub_8 = &input_8[1..7];
    dump("sub_8", sub_8);

    // try to align to 16-bit
    let (head, slice, tail) = unsafe { sub_8.align_to::<u16>() };
    dump("head", head);
    dump("slice", slice);
    dump("tail", tail);
}
This prints:
$ cargo run
    input_16 1122 3344 5566 7788
     input_8 22 11 44 33 66 55 88 77
       sub_8 11 44 33 66 55 88
        head 11
       slice 3344 5566
        tail 88
Alright, so it seems align_to is doing what we want. If we were to pass an aligned slice, then head and tail would be empty:
fn main() {
    let input_16: [u16; 4] = [0x1122, 0x3344, 0x5566, 0x7788];
    dump("input_16", input_16);

    let input_8: [u8; 8] = unsafe { std::mem::transmute(input_16) };
    dump("input_8", input_8);

    let (head, slice, tail) = unsafe { input_8.align_to::<u16>() };
    dump("head", head);
    dump("slice", slice);
    dump("tail", tail);
}
$ cargo run
    input_16 1122 3344 5566 7788
     input_8 22 11 44 33 66 55 88 77
        head
       slice 1122 3344 5566 7788
        tail
Since our assumption is that the input to checksum is 16-bit aligned and has a size that's a multiple of 2, we can just panic if head or tail aren't empty:
// in `src/ipv4.rs`

pub fn checksum(slice: &[u8]) -> u16 {
    let (head, slice, tail) = unsafe { slice.align_to::<u16>() };
    if !head.is_empty() {
        panic!("checksum() input should be 16-bit aligned");
    }
    if !tail.is_empty() {
        panic!("checksum() input size should be a multiple of 2 bytes");
    }

    unimplemented!()
}
Okay, good! Now how did that checksum work again?
The checksum field is the 16 bit one's complement of the one's complement sum of all 16 bit words in the header. For purposes of computing the checksum, the value of the checksum field is zero.
Alright. So, the one's complement of a number is obtained by just flipping all the bits, aka doing a "bitwise NOT", which in Rust is just !, same as the "logical NOT".
And doing a one's complement sum just means that, if there's been an overflow, we must add the "overflow bit" back to the rightmost bit of the result.
Let's have a bit of an example. If we add two bytes and it doesn't overflow, like so:
Adding 96 and 64 in binary representation

    0110 0000  // 96
  + 0100 0000  // 64
  -----------
    0100 0000
  + 0110 0000
  -----------
    1000 0000
  + 0010 0000
  -----------
    1000 0000
  + 0010 0000
  -----------
  = 1010 0000
  = 2**7 + 2**5
  = 128 + 32
  = 160
We didn't have any overflow here - all good. But what if we add 128 and 128?
In two's complement, we just ignore the overflow bit:
Two's complement sum of 128 and 128

     1000 0000  // 128
  +  1000 0000  // 128
  ------------
    10000 0000
  +  0000 0000
  ------------
  =  0000 0000
  = 0
But in one's complement, we add it back to the rightmost side:
One's complement sum of 128 and 128

     1000 0000  // 128
  +  1000 0000  // 128
  ------------
    10000 0000
  +  0000 0000
  ------------
     0000 0001
  +  0000 0000
  ------------
  =  0000 0001
  = 1
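The byte-sized examples above can be checked with a few lines of Rust. This is only a sketch for 8-bit values (the helper name ones_complement_add_u8 is made up); the real checksum works on 16-bit words.

```rust
/// One's complement sum of two bytes: add in a wider type, then
/// wrap any carry out of bit 7 back around into bit 0.
fn ones_complement_add_u8(a: u8, b: u8) -> u8 {
    let sum = a as u16 + b as u16;
    // `sum >> 8` is the overflow bit (0 or 1), `sum & 0xFF` the low byte.
    ((sum & 0xFF) + (sum >> 8)) as u8
}

fn main() {
    // No overflow: same result as a regular addition.
    println!("96 + 64   = {}", ones_complement_add_u8(96, 64));
    // Regular fixed-width addition drops the overflow bit...
    println!("128 + 128 = {} (wrapping)", 128u8.wrapping_add(128));
    // ...whereas the one's complement sum adds it back on the right.
    println!("128 + 128 = {} (one's complement)", ones_complement_add_u8(128, 128));
}
```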
To implement one's complement sum, we can simply do the following:
- Promote our operands to the next-biggest size (u16 to u32)
- Add them as u32
- If there's been an overflow, add 1 and demote to u16
- Otherwise, just demote to u16
We'll make a convenience function just for that.
Finally, our result needs to be the "one's complement" of the sum, ie. we'll need to do a "bitwise NOT":
pub fn checksum(slice: &[u8]) -> u16 {
    let (head, slice, tail) = unsafe { slice.align_to::<u16>() };
    if !head.is_empty() {
        panic!("checksum() input should be 16-bit aligned");
    }
    if !tail.is_empty() {
        panic!("checksum() input size should be a multiple of 2 bytes");
    }

    fn add(a: u16, b: u16) -> u16 {
        let s: u32 = (a as u32) + (b as u32);
        if s & 0x1_00_00 > 0 {
            // overflow, add carry bit
            (s + 1) as u16
        } else {
            s as u16
        }
    }

    !slice.iter().fold(0, |x, y| add(x, *y))
}
x & bit_pattern > 0 is a fairly standard way of checking whether certain bits are set. (In Rust, & binds tighter than >, so no parentheses are needed here, unlike in C.)

0x1_00_00 is the first bit that doesn't fit in a u16 - each 00 represents a byte. In Rust, you can use underscores (_) as visual separators in literals, which we use to our advantage here.

We could just as well have written 0x10000, 65536, or 0b1_00000000_00000000, but 0x1_00_00 just seemed like a good balance.
Let's give our code a go shall we?
Since we're capturing live network traffic, the checksums we receive and the ones we compute should match up. So in theory, we'd just have to zero out the checksum field, call checksum() on the whole header, and compare them. Makes sense!
But! There's an even nicer way to check. If we checksum the header without zeroing the checksum field, the result should be zero. Let's give it a shot.
// in `src/ipv4.rs`

impl Packet {
    pub fn parse(i: parse::Input) -> parse::Result<Self> {
        let original_i = i;
        let (i, (version, ihl)) = bits(tuple((u4::parse, u4::parse)))(i)?;

        let header_slice = &original_i[..(ihl.into() * 4)];
        let computed_checksum = checksum(header_slice);
        println!("--------------------");
        println!("computed {:04X}", computed_checksum);

        // rest of parser
        // etc.
The output looks something like this:
$ cargo run -------------------- computed B066 5.8341465s | Packet { ihl: 5, dscp: 0, ecn: 0, length: 60, identification: 0249, flags: 0, fragment_offset: 0, ttl: 128, checksum: 0000, src: 192.168.1.16, dst: 8.8.8.8, payload: Unknown, } -------------------- computed 0000 5.8342008s | Packet { ihl: 5, dscp: 0, ecn: 0, length: 60, identification: 0000, flags: 0, fragment_offset: 0, ttl: 54, checksum: b2f9, src: 8.8.8.8, dst: 192.168.1.16, payload: Unknown, }
As always, ping -t 8.8.8.8 is running in the background.
So, it's just as we thought previously:
- Incoming packets have a valid checksum
- Outgoing packets have a zero checksum
I was curious what could cause that, so I went hunting for an explanation, and here's what I found:
There are lots of NICs that can compute the checksum on chip.
Thus, if libpcap is loaded on a machine that is sending/receiving packets itself, the checksum will validate correctly going in one direction, but not the other (inbound good, outbound bad).
That's because it can sniff the packet contents BEFORE it makes it to the wire, and before the hardware can compute the checksum. The only way to guarantee a proper checksum is to sniff packets that have already made it to the wire (e.g. a mirror port on a switch).
What's inside an ICMP packet
I think we've danced around it long enough. It's time to look at the structure of an ICMP packet:
Alright, seems straight-forward. At least everything is byte-aligned!
Here are some of the possible values for the type field:
- 0: Echo Reply
- 3: Destination unreachable
- 8: Echo Request
- 11: Time Exceeded
There's many others, but we'll only bother with those for the time being.
So it looks like we'll need an enum of sorts.
Let's make a new module.
// in `src/main.rs`

mod icmp;
Instead of using derive_try_from_primitive as we did for Ethernet, we'll simply have an Other variant, that holds an arbitrary (and unsupported) u8:
// in `src/icmp.rs`

#[derive(Debug)]
pub enum Type {
    EchoReply,
    DestinationUnreachable,
    EchoRequest,
    TimeExceeded,
    Other(u8),
}
Okay, good! Now for the code field. Looks like it's unused for "Echo Reply" and "Echo Request", but it has a meaning for "Destination Unreachable" and "Time Exceeded".

For example, for "Destination Unreachable", a code of 1 means "Destination host unreachable", whereas a code of 4 means: "Fragmentation required, and DF flag set". That's interesting! The DF flag means "don't fragment", which means we want our IP datagrams to arrive unfragmented.

If we send a datagram large enough, it might make a few hops and eventually reach a host that doesn't support datagrams that large, and we might get back that type+code.
Turns out you can use that to find the MTU (maximum transmission unit - ie. the maximum size of IP packet) of a path between two hosts.
If you're curious (and who could blame you?), check out Path MTU Discovery.
Which of these do we actually care about though? How many can we reproduce empirically?
What if we try to ping a host we know doesn't exist on the local network?
$ ping 192.168.1.99

Pinging 192.168.1.99 with 32 bytes of data:
Reply from 192.168.1.16: Destination host unreachable.
Okay, that's "Destination Unreachable" with code 1.
What if we set a TTL that we know is too low?
$ ping -i 3 8.8.8.8

Pinging 8.8.8.8 with 32 bytes of data:
Reply from 78.255.77.126: TTL expired in transit.
Okay, that's "Time Exceeded" with code 0.
Let's model those, and leave the others as numbers:
// in `src/icmp.rs`

#[derive(Debug)]
pub enum Type {
    EchoReply,
    DestinationUnreachable(DestinationUnreachable),
    EchoRequest,
    TimeExceeded(TimeExceeded),
    // new: we also store the code here
    Other(u8, u8),
}

#[derive(Debug)]
pub enum DestinationUnreachable {
    HostUnreachable,
    Other(u8),
}

#[derive(Debug)]
pub enum TimeExceeded {
    TTLExpired,
    Other(u8),
}
We'll need a way to get a Type from two u8 values, so let's implement that now:
impl From<(u8, u8)> for Type {
    fn from(x: (u8, u8)) -> Self {
        let (typ, code) = x;
        match typ {
            0 => Self::EchoReply,
            3 => Self::DestinationUnreachable(code.into()),
            8 => Self::EchoRequest,
            11 => Self::TimeExceeded(code.into()),
            _ => Self::Other(typ, code),
        }
    }
}

impl From<u8> for DestinationUnreachable {
    fn from(x: u8) -> Self {
        match x {
            1 => Self::HostUnreachable,
            x => Self::Other(x),
        }
    }
}

impl From<u8> for TimeExceeded {
    fn from(x: u8) -> Self {
        match x {
            0 => Self::TTLExpired,
            x => Self::Other(x),
        }
    }
}
There. As you can imagine, we could go through every known type/code combination like that, but let's not.
Now I guess it's time to get parsing!
use crate::parse;
use custom_debug_derive::*;

#[derive(CustomDebug)]
pub struct Packet {
    pub typ: Type,
    #[debug(format = "{:04x}")]
    pub checksum: u16,
}

use nom::{
    number::complete::{be_u16, be_u8},
    sequence::tuple,
};

impl Packet {
    pub fn parse(i: parse::Input) -> parse::Result<Self> {
        let (i, typ) = {
            let (i, (typ, code)) = tuple((be_u8, be_u8))(i)?;
            (i, Type::from((typ, code)))
        };
        let (i, checksum) = be_u16(i)?;

        let res = Self { typ, checksum };
        Ok((i, res))
    }
}
Alright well the theory is sound, but does it work?
We're going to have to find a way to fit that into our ipv4::Packet.
// in `src/ipv4.rs`

// before:
#[derive(Debug)]
pub enum Payload {
    Unknown,
}

// after:
use crate::icmp;

#[derive(Debug)]
pub enum Payload {
    ICMP(icmp::Packet),
    Unknown,
}
Later in the same file:
// in `src/ipv4.rs`

impl Packet {
    pub fn parse(i: parse::Input) -> parse::Result<Self> {
        // parse all fields

        let (i, payload) = match protocol {
            Some(Protocol::ICMP) => map(icmp::Packet::parse, Payload::ICMP)(i)?,
            _ => (i, Payload::Unknown),
        };

        let res = Self {
            version,
            ihl,
            dscp,
            ecn,
            length,
            identification,
            flags,
            fragment_offset,
            ttl,
            protocol,
            checksum,
            src,
            dst,
            payload,
        };
        Ok((i, res))
    }
}
At this point, our ipv4::Packet contains redundant information - the constraints of an IPv4 packet are not fully encoded via the type system. When serializing (generating bytes to send on the wire), we'll probably ignore fields like protocol, and use only payload. Likewise, the checksum field is only meaningful when we read an IPv4 packet, not when we generate one.
Nevertheless, let's take it for a spin!
$ cargo run --quiet Listening for packets... 1.0001522s | Packet { ihl: 5, dscp: 0, ecn: 0, length: 60, identification: 0b92, flags: 0, fragment_offset: 0, ttl: 128, checksum: 0000, src: 192.168.1.16, dst: 8.8.8.8, payload: ICMP( Packet { typ: EchoRequest, checksum: 3d34, }, ), } 1.000264s | Packet { ihl: 5, dscp: 0, ecn: 0, length: 60, identification: 0000, flags: 0, fragment_offset: 0, ttl: 54, checksum: b2f9, src: 8.8.8.8, dst: 192.168.1.16, payload: ICMP( Packet { typ: EchoReply, checksum: 4534, }, ), }
Yay! Those are definitely ICMP requests and replies.
Let's filter a little more and only show ICMP packets:
// in `src/main.rs`

fn process_packet(now: Duration, packet: &BorrowedPacket) {
    match ethernet::Frame::parse(packet) {
        Ok((_remaining, frame)) => {
            if let ethernet::Payload::IPv4(ref ip_packet) = frame.payload {
                if let ipv4::Payload::ICMP(ref icmp_packet) = ip_packet.payload {
                    println!(
                        "{:?} | ({:?}) => ({:?}) | {:#?}",
                        now, ip_packet.src, ip_packet.dst, icmp_packet
                    );
                }
            }
        }
        Err(nom::Err::Error(e)) => {
            println!("{:?} | {:?}", now, e);
        }
        _ => unreachable!(),
    }
}
$ cargo run Listening for packets... 942.155ms | (192.168.1.16) => (8.8.8.8) | Packet { typ: EchoRequest, checksum: 3bc6, } 942.2091ms | (8.8.8.8) => (192.168.1.16) | Packet { typ: EchoReply, checksum: 43c6, } 1.9728682s | (192.168.1.16) => (8.8.8.8) | Packet { typ: EchoRequest, checksum: 3bc5, } 1.9728992s | (8.8.8.8) => (192.168.1.16) | Packet { typ: EchoReply, checksum: 43c5, } 2.973634s | (192.168.1.16) => (8.8.8.8) | Packet { typ: EchoRequest, checksum: 3bc4, } 2.9736753s | (8.8.8.8) => (192.168.1.16) | Packet { typ: EchoReply, checksum: 43c4, }
Good, good. We're not quite done parsing ICMP though. We haven't touched the 4-byte "rest of the header" field yet. Turns out this one also depends on the "type" field.
We're only interested in its content when it's an "Echo Request" or an "Echo Reply", and it has the same contents both times:
// in `src/icmp.rs`

#[derive(CustomDebug)]
pub struct Echo {
    #[debug(format = "{:04x}")]
    pub identifier: u16,
    #[debug(format = "{:04x}")]
    pub sequence_number: u16,
}

use nom::combinator::map;

impl Echo {
    fn parse(i: parse::Input) -> parse::Result<Self> {
        map(tuple((be_u16, be_u16)), |(identifier, sequence_number)| {
            Echo {
                identifier,
                sequence_number,
            }
        })(i)
    }
}

#[derive(Debug)]
pub enum Header {
    EchoRequest(Echo),
    EchoReply(Echo),
    Other(u32),
}
And finally, we also want to collect the rest of the ICMP packet. This is going to be binary data, so let's make a type for it.
// in `src/main.rs`

mod blob;

// in `src/blob.rs`

use std::{cmp::min, fmt};

pub struct Blob(pub Vec<u8>);

impl fmt::Debug for Blob {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        let slice_len = self.0.len();
        let shown_len = 20;
        let slice = &self.0[..min(shown_len, slice_len)];
        write!(f, "[")?;
        for (i, x) in slice.iter().enumerate() {
            let prefix = if i > 0 { " " } else { "" };
            write!(f, "{}{:02x}", prefix, x)?;
        }
        if slice_len > shown_len {
            write!(f, " + {} bytes", slice_len - shown_len)?;
        }
        write!(f, "]")
    }
}

impl Blob {
    pub fn new(slice: &[u8]) -> Self {
        Self(slice.into())
    }
}
There. Now it'll show us the first 20 bytes, and our output won't be too crowded.
Let's actually parse the header and grab the payload:
// in `src/icmp.rs`

use crate::blob::Blob;

#[derive(CustomDebug)]
pub struct Packet {
    pub typ: Type,
    // new! let's skip that one.
    #[debug(skip)]
    pub checksum: u16,
    // new! those two fields:
    #[debug(format = "{:?}")]
    pub header: Header,
    pub payload: Blob,
}

use nom::{
    number::complete::{be_u16, be_u32, be_u8},
    sequence::tuple,
};

impl Packet {
    pub fn parse(i: parse::Input) -> parse::Result<Self> {
        let (i, typ) = {
            let (i, (typ, code)) = tuple((be_u8, be_u8))(i)?;
            (i, Type::from((typ, code)))
        };
        let (i, checksum) = be_u16(i)?;
        let (i, header) = match typ {
            Type::EchoRequest => map(Echo::parse, Header::EchoRequest)(i)?,
            Type::EchoReply => map(Echo::parse, Header::EchoReply)(i)?,
            _ => map(be_u32, Header::Other)(i)?,
        };
        let payload = Blob::new(i);

        let res = Self {
            typ,
            checksum,
            header,
            payload,
        };
        Ok((i, res))
    }
}
Now we can get a look at what those sequence number / identifier fields do:
$ cargo run --quiet Listening for packets... 1.2998279s | (192.168.1.16) => (8.8.8.8) | Packet { typ: EchoRequest, header: EchoRequest(Echo { identifier: 0001, sequence_number: 02c8 }), payload: [61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 + 12 bytes], } 1.2999122s | (8.8.8.8) => (192.168.1.16) | Packet { typ: EchoReply, header: EchoReply(Echo { identifier: 0001, sequence_number: 02c8 }), payload: [61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 + 12 bytes], } 2.3000035s | (192.168.1.16) => (8.8.8.8) | Packet { typ: EchoRequest, header: EchoRequest(Echo { identifier: 0001, sequence_number: 02c9 }), payload: [61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 + 12 bytes], } 2.300062s | (8.8.8.8) => (192.168.1.16) | Packet { typ: EchoReply, header: EchoReply(Echo { identifier: 0001, sequence_number: 02c9 }), payload: [61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 + 12 bytes], } 3.3003952s | (192.168.1.16) => (8.8.8.8) | Packet { typ: EchoRequest, header: EchoRequest(Echo { identifier: 0001, sequence_number: 02ca }), payload: [61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 + 12 bytes], } 3.3004533s | (8.8.8.8) => (192.168.1.16) | Packet { typ: EchoReply, header: EchoReply(Echo { identifier: 0001, sequence_number: 02ca }), payload: [61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 + 12 bytes], } (etc.)
Interesting! It seems the sequence number increases, and the identifier is fixed.

I tried several runs of ping and, on my version of Windows, it always uses 0001.
The Wikipedia article on ping gives us the deets:
The Identifier and Sequence Number can be used by the client to match the reply with the request that caused the reply.
In practice, most Linux systems use a unique identifier for every ping process, and sequence number is an increasing number within that process.
Windows uses a fixed identifier, which varies between Windows versions, and a sequence number that is only reset at boot time.
It looks like we can do whatever we want, as long as the identifier / sequence_number combo is unique. We also probably want sequence_number to be increasing.
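Here's one way that bookkeeping could look. It's a sketch under assumptions: the EchoIds type and its next_pair method are hypothetical names, and using the low 16 bits of the process ID as the identifier is just one way to get a per-process value.

```rust
/// Hypothetical helper that hands out (identifier, sequence_number) pairs.
struct EchoIds {
    identifier: u16,
    next_sequence: u16,
}

impl EchoIds {
    fn new(identifier: u16) -> Self {
        Self {
            identifier,
            next_sequence: 0,
        }
    }

    /// Returns the pair for the next request. The sequence number wraps
    /// around to 0 after 0xFFFF instead of panicking on overflow.
    fn next_pair(&mut self) -> (u16, u16) {
        let sequence = self.next_sequence;
        self.next_sequence = sequence.wrapping_add(1);
        (self.identifier, sequence)
    }
}

fn main() {
    // Assumption: the low 16 bits of the process ID serve as the identifier.
    let mut ids = EchoIds::new(std::process::id() as u16);
    for _ in 0..3 {
        let (identifier, sequence) = ids.next_pair();
        println!("identifier: {:04x}, sequence_number: {:04x}", identifier, sequence);
    }
}
```

A reply can then be matched to its request by comparing both values of the pair.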
I'm interested in that payload though - let's try to print it as a string.
// in `src/main.rs`

fn process_packet(now: Duration, packet: &BorrowedPacket) {
    match ethernet::Frame::parse(packet) {
        Ok((_remaining, frame)) => {
            if let ethernet::Payload::IPv4(ref ip_packet) = frame.payload {
                if let ipv4::Payload::ICMP(ref icmp_packet) = ip_packet.payload {
                    println!(
                        "{:?} | ({:?}) => ({:?}) | {:#?}",
                        now, ip_packet.src, ip_packet.dst, icmp_packet
                    );
                    // new!
                    let payload = String::from_utf8_lossy(&icmp_packet.payload.0);
                    println!("payload = {}", payload);
                }
            }
        }
        Err(nom::Err::Error(e)) => {
            println!("{:?} | {:?}", now, e);
        }
        _ => unreachable!(),
    }
}
$ cargo run --quiet Listening for packets... 1.000148s | (192.168.1.16) => (8.8.8.8) | Packet { typ: EchoRequest, header: EchoRequest(Echo { identifier: 0001, sequence_number: 02eb }), payload: [61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 + 12 bytes], } payload = abcdefghijklmnopqrstuvwabcdefghi 1.0002762s | (8.8.8.8) => (192.168.1.16) | Packet { typ: EchoReply, header: EchoReply(Echo { identifier: 0001, sequence_number: 02eb }), payload: [61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 + 12 bytes], } payload = abcdefghijklmnopqrstuvwabcdefghi 1.8147006s | (192.168.1.16) => (8.8.8.8) | Packet { typ: EchoRequest, header: EchoRequest(Echo { identifier: 0001, sequence_number: 02ec }), payload: [61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 + 12 bytes], } payload = abcdefghijklmnopqrstuvwabcdefghi 1.8147644s | (8.8.8.8) => (192.168.1.16) | Packet { typ: EchoReply, header: EchoReply(Echo { identifier: 0001, sequence_number: 02ec }), payload: [61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 + 12 bytes], } payload = abcdefghijklmnopqrstuvwabcdefghi
Huzzah! This is exactly what we figured out when we were messing with the built-in Windows ICMP APIs.
What does the payload look like if we intentionally provoke "TTL expired in transit"?
$ cargo run --quiet Listening for packets... 2.7051148s | (192.168.1.16) => (8.8.8.8) | Packet { typ: EchoRequest, header: EchoRequest(Echo { identifier: 0001, sequence_number: 036f }), payload: [61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 + 12 bytes], } payload = abcdefghijklmnopqrstuvwabcdefghi 2.7051798s | (78.255.77.126) => (192.168.1.16) | Packet { typ: TimeExceeded( TTLExpired, ), header: Other(0), payload: [45 00 00 3c 3e 56 00 00 01 01 a9 a3 c0 a8 01 10 08 08 08 08 + 8 bytes], } payload = E<>I�o�� 3.7051146s | (192.168.1.16) => (8.8.8.8) | Packet { typ: EchoRequest, header: EchoRequest(Echo { identifier: 0001, sequence_number: 0370 }), payload: [61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 + 12 bytes], } payload = abcdefghijklmnopqrstuvwabcdefghi 3.7051765s | (78.255.77.126) => (192.168.1.16) | Packet { typ: TimeExceeded( TTLExpired, ), header: Other(0), payload: [45 00 00 3c 3e 57 00 00 01 01 a9 a2 c0 a8 01 10 08 08 08 08 + 8 bytes], } payload = E<>I�p��
This was obtained by running ping -i 3 8.8.8.8. Note that the payload we send is still a bunch of letters from the alphabet. The payload we get back though, does not seem like a string at all.
The Unicode replacement character '�' that you see is not because your device is missing some fonts. It's been inserted by from_utf8_lossy because parts of the input were not valid UTF-8.
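A quick sketch of that behavior, using the standard library only. The helper replacement_count is a made-up name, and the byte values are the start of the payload printed above plus its two checksum bytes.

```rust
use std::borrow::Cow;

/// Counts how many U+FFFD replacement characters a lossy decode inserts.
fn replacement_count(bytes: &[u8]) -> usize {
    String::from_utf8_lossy(bytes)
        .chars()
        .filter(|&c| c == char::REPLACEMENT_CHARACTER)
        .count()
}

fn main() {
    let letters = b"abcdefghijklmnopqrstuvwabcdefghi";
    // Valid UTF-8 comes back borrowed and untouched.
    let decoded = String::from_utf8_lossy(letters);
    println!("borrowed: {}", matches!(decoded, Cow::Borrowed(_)));
    println!("replacements in letters: {}", replacement_count(letters));

    // 0xa9 and 0xa3 are UTF-8 continuation bytes with nothing to continue,
    // so they cannot be decoded and get replaced.
    let binary = [0x45, 0x00, 0x00, 0x3c, 0xa9, 0xa3];
    println!("replacements in binary: {}", replacement_count(&binary));
}
```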
In fact, 45 00 00 looks really familiar. It looks like an IPv4 packet, where 4 is the version, 5 is the IHL (5 32-bit words = 20 bytes), then dscp and ecn are both zero, then total length is 00 3c, the 16-bit big-endian integer 3c, which is 60.
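We can double-check that reading by decoding a few fields by hand. This is a std-only sketch, separate from our nom parser; HeaderSummary and summarize are hypothetical names, and the bytes are the first 20 bytes of the "Time Exceeded" payload printed above.

```rust
use std::net::Ipv4Addr;

/// A few fields of an IPv4 header (illustration only).
#[derive(Debug)]
struct HeaderSummary {
    version: u8,
    ihl: u8,
    total_length: u16,
    ttl: u8,
    protocol: u8,
    src: Ipv4Addr,
    dst: Ipv4Addr,
}

fn summarize(h: &[u8; 20]) -> HeaderSummary {
    HeaderSummary {
        version: h[0] >> 4, // high nibble of the first byte
        ihl: h[0] & 0x0F,   // low nibble, in 32-bit words
        total_length: u16::from_be_bytes([h[2], h[3]]),
        ttl: h[8],
        protocol: h[9], // 1 means ICMP
        src: Ipv4Addr::new(h[12], h[13], h[14], h[15]),
        dst: Ipv4Addr::new(h[16], h[17], h[18], h[19]),
    }
}

const EMBEDDED_HEADER: [u8; 20] = [
    0x45, 0x00, 0x00, 0x3c, 0x3e, 0x56, 0x00, 0x00, 0x01, 0x01,
    0xa9, 0xa3, 0xc0, 0xa8, 0x01, 0x10, 0x08, 0x08, 0x08, 0x08,
];

fn main() {
    println!("{:#?}", summarize(&EMBEDDED_HEADER));
}
```

The source and destination addresses come out as 192.168.1.16 and 8.8.8.8, so this really does look like the header of our own echo request.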
Let's dump outgoing echo requests as a Blob:
// in `src/ipv4.rs`
// in Packet::parse

let (i, (src, dst)) = tuple((Addr::parse, Addr::parse))(i)?;

if src.0[0] == 192 {
    let blob = crate::blob::Blob::new(original_i);
    println!("outgoing ipv4 packet: {:?}", blob);
}
$ cargo run --quiet Listening for packets... outgoing ipv4 packet: [45 00 00 3c 3e cb 00 00 03 01 00 00 c0 a8 01 10 08 08 08 08 + 40 bytes] 999.4404ms | (192.168.1.16) => (8.8.8.8) | Packet { typ: EchoRequest, header: EchoRequest(Echo { identifier: 0001, sequence_number: 0374 }), payload: [61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 + 12 bytes], } payload = abcdefghijklmnopqrstuvwabcdefghi 999.5319ms | (78.255.77.126) => (192.168.1.16) | Packet { typ: TimeExceeded( TTLExpired, ), header: Other(0), payload: [45 00 00 3c 3e cb 00 00 01 01 a9 2e c0 a8 01 10 08 08 08 08 + 8 bytes], } payload = E<>I�t��
AhAH! Those look really similar:
[45 00 00 3c 3e cb 00 00 03 01 00 00 c0 a8 01 10 08 08 08 08 + 40 bytes]
[45 00 00 3c 3e cb 00 00 01 01 a9 2e c0 a8 01 10 08 08 08 08 + 8 bytes]
                         ~~    ~~ ~~

(top: outgoing ipv4 packet, bottom: ICMP payload for "TTL expired in transit")
They're not exactly the same though - what changed? We know the IPv4 header structure, let's review:
Everything is the same up until the TTL - that makes sense! For every hop, the TTL is decreased by one. It makes complete sense that it expired when the TTL was about to drop to zero.
As for the checksum, we've already established that when using npcap, outgoing packets had a zero checksum. If that wasn't the case, it would still be different though, as changing the TTL changes the packet's checksum. Everything checks out.
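We can check both claims on the bytes we just captured. The sketch below is an alternative to our align_to-based checksum(), not a replacement for it: it reads 16-bit words in big-endian order with u16::from_be_bytes, so it needs no alignment and tolerates an odd length. The names internet_checksum and with_ttl are made up for this example.

```rust
/// Internet checksum over big-endian 16-bit words.
/// An odd trailing byte is padded with a zero byte on the right.
/// Assumes the input is small enough that the u32 sum cannot overflow.
fn internet_checksum(bytes: &[u8]) -> u16 {
    let mut sum: u32 = 0;
    for chunk in bytes.chunks(2) {
        let hi = chunk[0];
        let lo = if chunk.len() == 2 { chunk[1] } else { 0 };
        sum += u16::from_be_bytes([hi, lo]) as u32;
    }
    // Fold the carries back into the low 16 bits (end-around carry).
    while sum >> 16 != 0 {
        sum = (sum & 0xFFFF) + (sum >> 16);
    }
    !(sum as u16)
}

/// Returns a copy of the header with a new TTL and a recomputed checksum.
fn with_ttl(header: &[u8; 20], ttl: u8) -> [u8; 20] {
    let mut h = *header;
    h[8] = ttl;
    h[10..12].copy_from_slice(&[0, 0]); // zero the checksum field first
    let checksum = internet_checksum(&h);
    h[10..12].copy_from_slice(&checksum.to_be_bytes());
    h
}

/// The IPv4 header embedded in the "TTL expired in transit" reply above.
const EMBEDDED: [u8; 20] = [
    0x45, 0x00, 0x00, 0x3c, 0x3e, 0xcb, 0x00, 0x00, 0x01, 0x01,
    0xa9, 0x2e, 0xc0, 0xa8, 0x01, 0x10, 0x08, 0x08, 0x08, 0x08,
];

fn main() {
    // A header that carries a valid checksum sums to zero.
    println!("embedded header: {:04x}", internet_checksum(&EMBEDDED));
    // With the TTL we originally sent (3), the checksum field differs.
    let sent = with_ttl(&EMBEDDED, 3);
    println!("checksum with ttl=3: {:02x}{:02x}", sent[10], sent[11]);
}
```

So the router that decremented the TTL also recomputed the checksum, and the a9 2e we got back is valid for a TTL of 1.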
Another interesting thing to notice is that we're trying to ping 8.8.8.8, but it's 78.255.77.126 that's replying to us. That also makes sense - we never made it all the way to 8.8.8.8, so some node in the middle replied to us.
In this case, the node that replied to us was...
To ping a host, we send an "Echo request" to it. If everything goes well, that host sends an "Echo reply" back to us.
If something goes wrong, another host might reply to us with an ICMP message like "Time exceeded", and part of the original packet.
Crafting our own ICMP packets
I think we know everything we need to know to start making our own ICMP packets.
We used nom for parsing packets. Is there a similar crate for serializing arbitrary binary data? There is! By the same author! Hi Geoffroy, love your stuff.

Let's grab cookie-factory.
$ cargo add cookie-factory
      Adding cookie-factory v0.3.0 to dependencies
Now, we need to take a few steps back and think about serialization though.
Our parsers looked something like this:
fn parse(input: &'a [u8]) -> IResult<&'a [u8], Self, Error<&'a [u8]>> {
    // parser goes here.
}
In other words, we had the following constraints:
- The input is borrowed. It does not need to be copied. It's just a slice.
- The output is owned. It cannot refer to the input in any way. We're making u16 and u12 and u3 out of the input, but everything needs to be owned. If we want some part of the input as-is, we need to make a Vec out of it, which is part of the reason we made Blob.
- The error can reference the input
So the input is borrowed until we either:
- discard the error, or
- succeed in parsing it.
When serializing though, we're generating bytes. Do we still need to worry about borrowing / ownership? That all depends.
We could take a std::io::Write and do something like this:
// in `src/icmp.rs`

impl Packet {
    pub fn serialize(&self, w: &mut dyn io::Write) -> Result<(), io::Error> {
        match self.typ {
            Type::EchoRequest => {
                // write "type" and "code" fields
                w.write_all(&[8, 0])?;
                // TODO: write the rest
            }
            Type::EchoReply => {
                // write "type" and "code" fields
                w.write_all(&[0, 0])?;
                // TODO: write the rest
            }
            _ => unimplemented!(),
        }

        Ok(())
    }
}
However, we're not going to do that. One of the many reasons: why the heck are we dealing with I/O errors here? We're just generating bytes, we're not supposed to care whether they're successfully written out or not.
But the point is: if we were doing that, then we wouldn't have to worry about ownership. By the time serialize returns, we can throw away the icmp::Packet completely, because it's all written out.

Instead, we're going to do something similar to what we did for parsing with nom, except, the other way around.

Let's start by serializing something simple, like Echo:
// in `src/icmp.rs`

use cookie_factory as cf;
use std::io;

impl Echo {
    pub fn serialize<'a, W: io::Write + 'a>(&'a self) -> impl cf::SerializeFn<W> + 'a {
        use cf::{bytes::be_u16, sequence::tuple};
        tuple((be_u16(self.identifier), be_u16(self.sequence_number)))
    }
}
Wonderful! It's just like nom!
We return a function that:
- is valid for the lifetime 'a (see impl Fn + 'a)
- borrows self for the lifetime 'a (&'a self)
- knows how to serialize &self for a given W, which should be valid for the lifetime 'a (see W: Write + 'a)
Also, this serializer is purely declarative. It's just a tuple of 16-bit big-endian numbers!
The same way nom lets us combine parsers, cookie-factory lets us combine serializers.
Serializers are just functions. They borrow (or copy) their input, and can serialize it any number of times to a compatible output type. They're entirely uninterested in I/O errors, which are handled on another level.
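To get a feel for the "serializers are just functions" idea without pulling in any crate, here's a toy model. To be clear, this is NOT cookie-factory's API: the be_u16 and pair functions below are simplified stand-ins that append to a Vec<u8>, and they only share their names and general shape with the real thing.

```rust
/// Toy serializer: a function that appends one big-endian u16 to a buffer.
fn be_u16(x: u16) -> impl Fn(&mut Vec<u8>) {
    move |out: &mut Vec<u8>| out.extend_from_slice(&x.to_be_bytes())
}

/// Toy combinator: runs two serializers one after the other.
fn pair(a: impl Fn(&mut Vec<u8>), b: impl Fn(&mut Vec<u8>)) -> impl Fn(&mut Vec<u8>) {
    move |out: &mut Vec<u8>| {
        a(out);
        b(out);
    }
}

struct Echo {
    identifier: u16,
    sequence_number: u16,
}

impl Echo {
    /// Purely declarative: nothing is written until the returned
    /// function is called with an output buffer.
    fn serialize(&self) -> impl Fn(&mut Vec<u8>) + '_ {
        pair(be_u16(self.identifier), be_u16(self.sequence_number))
    }
}

fn main() {
    let echo = Echo {
        identifier: 0x0001,
        sequence_number: 0x02c8,
    };
    let serializer = echo.serialize();

    let mut out = Vec::new();
    serializer(&mut out);
    println!("{:02x?}", out);

    // The same serializer can be run again on any compatible output.
    serializer(&mut out);
    println!("{} bytes after a second run", out.len());
}
```

The real crate threads a write context through each function and reports errors as it goes, which this toy version leaves out entirely.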
Let's move on up:
// in `src/icmp.rs`

#[derive(Debug)]
pub enum Header {
    EchoRequest(Echo),
    EchoReply(Echo),
    Other(u32),
}

impl Header {
    fn serialize<'a, W: io::Write + 'a>(&'a self) -> impl cf::SerializeFn<W> + 'a {
        // mhhh.
    }
}
Okay so. There's no map equivalent in cookie-factory. And we definitely want to write different things based on which variant of Header we have here.
No worries! We're just returning a function, right? And that can be a closure, correct? Right. Let's go:
// in `src/icmp.rs`

impl Header {
    fn serialize<'a, W: io::Write + 'a>(&'a self) -> impl cf::SerializeFn<W> + 'a {
        use cf::bytes::be_u32;

        move |out| match self {
            Self::EchoRequest(echo) | Self::EchoReply(echo) => echo.serialize()(out),
            Self::Other(x) => be_u32(*x)(out),
        }
    }
}
Alright, that works! Having a `match` here will ensure that we have exhaustive coverage of all the variants, so if we handle other kinds of ICMP request headers, we'll get a nice compile-time error here to remind us to serialize that, too.
Now, we said earlier that, when serializing, we'd disregard the `typ` field because it's redundant with `Header`.
Also, we're going to need our ICMP packet to have a valid checksum, and some of the information we can serialize from `Header` will be before the checksum, while the rest will be after the checksum.
So let's make another function to serialize what goes before the checksum:

    // in `src/icmp.rs`

    impl Header {
        // omitted: serialize()

        fn serialize_type_and_code<'a, W: io::Write + 'a>(
            &'a self,
        ) -> impl cf::SerializeFn<W> + 'a {
            use cf::{bytes::be_u8, sequence::tuple};

            move |out| match self {
                Self::EchoRequest(_) => tuple((be_u8(8), be_u8(0)))(out),
                Self::EchoReply(_) => tuple((be_u8(0), be_u8(0)))(out),
                // we're not planning on sending any "TTL Expired" or
                // "Host unreachable" ICMP packets.
                _ => unimplemented!(),
            }
        }
    }

Let's also make `Blob` serializable:

    // in `src/blob.rs`

    use cookie_factory as cf;
    use std::io;

    impl Blob {
        pub fn serialize<'a, W: io::Write + 'a>(&'a self) -> impl cf::SerializeFn<W> + 'a {
            use cf::combinator::slice;
            slice(&self.0)
        }
    }

Alright. It's just a slice of bytes! Again, we borrow the `Blob` for `'a` and the returned serializer function is valid for `'a`. It all works out.
We're ready to move on to `icmp::Packet`. Let's first make a serializer that ignores the `checksum` field:

    // in `src/icmp.rs`

    impl Packet {
        pub fn serialize_no_checksum<'a, W: io::Write + 'a>(
            &'a self,
        ) -> impl cf::SerializeFn<W> + 'a {
            use cf::{bytes::be_u16, sequence::tuple};

            tuple((
                self.header.serialize_type_and_code(),
                be_u16(0), // checksum
                self.header.serialize(),
                self.payload.serialize(),
            ))
        }
    }

Alright, I think we're far enough along that we can try it out.
We'll try it out from `icmp::Packet::parse`, because that's a place where we still have access to the original `u8` slice, and whatever we serialize should look roughly the same as that.

    // in `src/icmp.rs`

    impl Packet {
        pub fn parse(i: parse::Input) -> parse::Result<Self> {
            let original_i = i;

            // rest of parser

            let res = Self {
                typ,
                checksum,
                header,
                payload,
            };

            // here we go:
            let serialized = cf::gen_simple(res.serialize_no_checksum(), Vec::new()).unwrap();
            println!(" original = {:?}", Blob::new(original_i));
            println!("serialized = {:?}", Blob::new(&serialized));

            Ok((i, res))
        }
    }


    $ cargo run --quiet
    Listening for packets...
     original = [08 00 49 a4 00 01 03 b7 61 62 63 64 65 66 67 68 69 6a 6b 6c + 20 bytes]
    serialized = [08 00 00 00 00 01 03 b7 61 62 63 64 65 66 67 68 69 6a 6b 6c + 20 bytes]
    1.0002556s | (192.168.1.16) => (8.8.8.8) | Packet {
        typ: EchoRequest,
        header: EchoRequest(Echo { identifier: 0001, sequence_number: 03b7 }),
        payload: [61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 + 12 bytes],
    }
     original = [00 00 51 a4 00 01 03 b7 61 62 63 64 65 66 67 68 69 6a 6b 6c + 20 bytes]
    serialized = [00 00 00 00 00 01 03 b7 61 62 63 64 65 66 67 68 69 6a 6b 6c + 20 bytes]
    1.0003878s | (8.8.8.8) => (192.168.1.16) | Packet {
        typ: EchoReply,
        header: EchoReply(Echo { identifier: 0001, sequence_number: 03b7 }),
        payload: [61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 + 12 bytes],
    }
     original = [08 00 49 a3 00 01 03 b8 61 62 63 64 65 66 67 68 69 6a 6b 6c + 20 bytes]
    serialized = [08 00 00 00 00 01 03 b8 61 62 63 64 65 66 67 68 69 6a 6b 6c + 20 bytes]
    2.0002365s | (192.168.1.16) => (8.8.8.8) | Packet {
        typ: EchoRequest,
        header: EchoRequest(Echo { identifier: 0001, sequence_number: 03b8 }),
        payload: [61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 + 12 bytes],
    }
     original = [00 00 51 a3 00 01 03 b8 61 62 63 64 65 66 67 68 69 6a 6b 6c + 20 bytes]
    serialized = [00 00 00 00 00 01 03 b8 61 62 63 64 65 66 67 68 69 6a 6b 6c + 20 bytes]
    2.0003223s | (8.8.8.8) => (192.168.1.16) | Packet {
        typ: EchoReply,
        header: EchoReply(Echo { identifier: 0001, sequence_number: 03b8 }),
        payload: [61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 + 12 bytes],
    }

Heyyyyy. They do look similar. The only differences are at byte offsets 2 and 3, which.. that's the checksum!
Marty! You've gotta come back with me!
Where? Back to the buffer!
So we've got a little pickle. We need to compute a checksum for our freshly-generated ICMP packet, but in order to do that we need to generate some stuff before and some stuff after the checksum, so by the time we need it, we don't yet have all the information we need.
`cookie-factory` has something for that, a `back_to_the_buffer` combinator.
Let's say you're writing a binary format that has messages of variable length, and when serialized, they're prefixed by their length, as a `u32`. `back_to_the_buffer` would let you reserve 4 bytes, then write out the message itself.
Then it would go back, and invoke a closure with a write context at the position we reserved, with the result of the message payload's serialization. Now that it's done, we know it's of length 12, so we can write a `u32` with value 12.
And voilà!
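The reserve-then-patch idea can be sketched without `cookie-factory` at all. This is only an illustration of the technique on a plain `Vec<u8>`, using a hypothetical helper named `write_length_prefixed`; it is not how `back_to_the_buffer` is implemented, nor how you'd call it.

```rust
/// Serializes `message` prefixed by its length as a big-endian `u32`.
/// The prefix is reserved first and only filled in once the message
/// has been written. (Hypothetical helper, for illustration only.)
fn write_length_prefixed(out: &mut Vec<u8>, message: &[u8]) {
    // 1. Reserve 4 bytes for a length we don't know yet.
    let length_at = out.len();
    out.extend_from_slice(&[0u8; 4]);

    // 2. Write the message itself.
    out.extend_from_slice(message);

    // 3. Go back and overwrite the reserved bytes with the real length.
    let length = (out.len() - length_at - 4) as u32;
    out[length_at..length_at + 4].copy_from_slice(&length.to_be_bytes());
}
```

With a 12-byte message like `b"Hello world!"`, the output starts with `00 00 00 0c`, followed by the message bytes.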
Unfortunately, we can't use that. We'd need to reserve data in the middle.
We could technically come up with our own riff on `BackToTheBuffer` (it's all just a bunch of seeks anyway - I know, I looked), but let's not worry too much about it for now. It's nice to know we have the option, if we ever need to squeeze some extra performance out of our custom network stack.
We won't.
We won't need to.
So for now, let's do it the old-fashioned way. By first generating the entire ICMP packet in a buffer, then computing the checksum, then overwriting the checksum part of the buffer with our `u16` result.
The code is actually not that bad:

    // in `src/icmp.rs`

    impl Packet {
        // omitted: `serialize_no_checksum()`

        pub fn serialize<'a, W: io::Write + 'a>(&'a self) -> impl cf::SerializeFn<W> + 'a {
            use cf::{bytes::le_u16, combinator::slice};

            move |out| {
                let mut buf = cf::gen_simple(self.serialize_no_checksum(), Vec::new())?;
                let checksum = crate::ipv4::checksum(&buf);
                cf::gen_simple(le_u16(checksum), &mut buf[2..])?;
                slice(buf)(out)
            }
        }
    }

Good, good. WAIT A MINUTE, `le_u16`? As in, little-endian?
Well here's a neat thing we haven't really covered about the "internet checksum" we implemented earlier: endianness "doesn't really matter".
By which I mean, we could convert all `u16` in the input from big-endian to little-endian, and do the one's complement sum, take the one's complement of the result, and then write it as big-endian.
Or we could do no conversion at all, do everything as little-endian, and write it out as little-endian, and get the exact same result.
But don't take it from me, take it from RFC 1071:
The sum of 16-bit integers can be computed in either byte order. Thus, if we calculate the swapped sum:
[B,A] +' [D,C] +' ... +' [Z,Y]
the result is the same as [before], except the bytes are swapped in the sum! To see why this is so, observe that in both orders the carries are the same: from bit 15 to bit 0 and from bit 7 to bit 8. In other words, consistently swapping bytes simply rotates the bits within the sum, but does not affect their internal ordering.
Therefore, the sum may be calculated in exactly the same way regardless of the byte order ("big-endian" or "little-endian") of the underlaying hardware. For example, assume a "little-endian" machine summing data that is stored in memory in network ("big-endian") order. Fetching each 16-bit word will swap bytes, resulting in the sum [above]; however, storing the result back into memory will swap the sum back into network byte order.
Note that our code will now only work on little-endian machines, because we're explicitly using `le_u16` when serializing, which would do an extra swap on big-endian machines. So, you know, don't go run this on OpenRISC with no modifications.
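If the RFC's argument feels a bit abstract, here's a small sketch that tries both byte orders and compares what ends up on the wire. It's a standalone re-implementation for illustration, not the `ipv4::checksum` from earlier in the series: `checksum_with` and `both_orders` are hypothetical names, and the sketch assumes an even-length input to stay short.

```rust
/// One's complement sum of 16-bit words, then complemented.
/// `read` decides how each pair of bytes is turned into a `u16`.
/// Assumes `data` has an even length (any trailing odd byte is ignored).
fn checksum_with(data: &[u8], read: fn([u8; 2]) -> u16) -> u16 {
    let mut sum: u32 = 0;
    for pair in data.chunks_exact(2) {
        sum += u32::from(read([pair[0], pair[1]]));
    }
    // Fold the carries back into the low 16 bits (end-around carry).
    while sum > 0xffff {
        sum = (sum & 0xffff) + (sum >> 16);
    }
    !(sum as u16)
}

/// Returns the two checksum bytes as they'd be written out:
/// once doing everything big-endian, once doing everything little-endian.
fn both_orders(data: &[u8]) -> ([u8; 2], [u8; 2]) {
    let big = checksum_with(data, u16::from_be_bytes).to_be_bytes();
    let little = checksum_with(data, u16::from_le_bytes).to_le_bytes();
    (big, little)
}
```

For the bytes `08 00 00 00 00 01 03 b7`, both tuple members come out as `f4 47`. The intermediate sums differ (they're byte-swapped versions of each other), but as long as we read and write with the same byte order, the bytes on the wire are identical.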
Anyway, does it even work?

    // in `src/icmp.rs`
    // in `Packet::parse`

    // was: `res.serialize_no_checksum()`
    let serialized = cf::gen_simple(res.serialize(), Vec::new()).unwrap();
    println!(" original = {:?}", Blob::new(original_i));
    println!("serialized = {:?}", Blob::new(&serialized));
    assert_eq!(original_i, &serialized[..]);

    Ok((i, res))


    $ cargo run --quiet
    Listening for packets...
     original = [08 00 42 c1 00 01 0a 9a 61 62 63 64 65 66 67 68 69 6a 6b 6c + 20 bytes]
    serialized = [08 00 42 c1 00 01 0a 9a 61 62 63 64 65 66 67 68 69 6a 6b 6c + 20 bytes]
    1.0000433s | (192.168.1.16) => (8.8.8.8) | Packet {
        typ: EchoRequest,
        header: EchoRequest(Echo { identifier: 0001, sequence_number: 0a9a }),
        payload: [61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 + 12 bytes],
    }
     original = [00 00 4a c1 00 01 0a 9a 61 62 63 64 65 66 67 68 69 6a 6b 6c + 20 bytes]
    serialized = [00 00 4a c1 00 01 0a 9a 61 62 63 64 65 66 67 68 69 6a 6b 6c + 20 bytes]
    1.0001887s | (8.8.8.8) => (192.168.1.16) | Packet {
        typ: EchoReply,
        header: EchoReply(Echo { identifier: 0001, sequence_number: 0a9a }),
        payload: [61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 + 12 bytes],
    }
     original = [08 00 42 c0 00 01 0a 9b 61 62 63 64 65 66 67 68 69 6a 6b 6c + 20 bytes]
    serialized = [08 00 42 c0 00 01 0a 9b 61 62 63 64 65 66 67 68 69 6a 6b 6c + 20 bytes]
    2.7004159s | (192.168.1.16) => (8.8.8.8) | Packet {
        typ: EchoRequest,
        header: EchoRequest(Echo { identifier: 0001, sequence_number: 0a9b }),
        payload: [61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 + 12 bytes],
    }
     original = [00 00 4a c0 00 01 0a 9b 61 62 63 64 65 66 67 68 69 6a 6b 6c + 20 bytes]
    serialized = [00 00 4a c0 00 01 0a 9b 61 62 63 64 65 66 67 68 69 6a 6b 6c + 20 bytes]
    2.7004974s | (8.8.8.8) => (192.168.1.16) | Packet {
        typ: EchoReply,
        header: EchoReply(Echo { identifier: 0001, sequence_number: 0a9b }),
        payload: [61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 + 12 bytes],
    }

It does! And we don't even have to worry about those extra 20 bytes that `Blob` doesn't print, because we sneaked an `assert_eq!` in there, so our code would crash if it didn't work.
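For the curious, the first echo request in that capture can also be rebuilt by hand with nothing but the standard library. This is a sketch under one big assumption: the listing only shows the first 20 payload bytes, so I'm guessing the full 32-byte payload is the alphabet pattern `abcdefghijklmnopqrstuvwabcdefghi`. The function names `internet_checksum` and `echo_request` are made up for this example, and unlike our real code, everything here is done big-endian.

```rust
/// Internet checksum over big-endian 16-bit words.
/// Assumes an even-length input, to keep the sketch short.
fn internet_checksum(data: &[u8]) -> u16 {
    let mut sum: u32 = 0;
    for pair in data.chunks_exact(2) {
        sum += u32::from(u16::from_be_bytes([pair[0], pair[1]]));
    }
    // Fold the carries back into the low 16 bits (end-around carry).
    while sum > 0xffff {
        sum = (sum & 0xffff) + (sum >> 16);
    }
    !(sum as u16)
}

/// Builds an echo request the old-fashioned way: serialize everything
/// with a zeroed checksum, compute the checksum, then overwrite
/// bytes 2 and 3 of the buffer.
fn echo_request(identifier: u16, sequence_number: u16, payload: &[u8]) -> Vec<u8> {
    let mut buf = vec![8u8, 0, 0, 0]; // type 8, code 0, checksum placeholder
    buf.extend_from_slice(&identifier.to_be_bytes());
    buf.extend_from_slice(&sequence_number.to_be_bytes());
    buf.extend_from_slice(payload);

    let checksum = internet_checksum(&buf);
    buf[2..4].copy_from_slice(&checksum.to_be_bytes());
    buf
}
```

Under that payload assumption, `echo_request(0x0001, 0x0a9a, b"abcdefghijklmnopqrstuvwabcdefghi")` starts with `08 00 42 c1`, same as the capture. Running `internet_checksum` over the finished packet gives 0, which is the usual way for a receiver to verify it.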
And just like that, we're seemingly real close to sending our own, hand-crafted network traffic... We just have to also serialize IPv4 packets and, oh, Ethernet frames. Well that shouldn't be too hard, right?
Right?
😐
This article is part 12 of the Making our own ping series.