A simple ping library, parsing strings into IPv4 address

👋 This page was last updated ~5 years ago. Just so you know.

We've just spent a lot of time abstracting over LoadLibrary, but we still have all the gory details of the Win32 ICMP API straight in our main.rs file! That won't do.

This time will be much quicker, since we already learned about carefully designing an API, hiding the low-level bits and so on.

Let's add an icmp module to our program. Actually, we've been dealing with an IPAddr all this time, and it sounds like it could use its own module too:

In src/main.rs:

pub mod icmp;
pub mod ipv4;

Our ipv4 module will be short and sweet. In src/ipv4.rs:

use std::fmt;

pub struct Addr(pub [u8; 4]);

impl fmt::Debug for Addr {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        let [a, b, c, d] = self.0;
        write!(f, "{}.{}.{}.{}", a, b, c, d)
    }
}

Now we can just change mentions of IPAddr to ipv4::Addr.
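
As a quick sanity check, here's a small self-contained sketch of that Debug impl in action. The render helper is a name made up for this example, it isn't part of our crate:

```rust
use std::fmt;

pub struct Addr(pub [u8; 4]);

impl fmt::Debug for Addr {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        // Destructure the inner array into its four octets.
        let [a, b, c, d] = self.0;
        write!(f, "{}.{}.{}.{}", a, b, c, d)
    }
}

// Hypothetical helper: formats an address through its Debug impl.
pub fn render(addr: &Addr) -> String {
    format!("{:?}", addr)
}

fn main() {
    println!("{}", render(&Addr([8, 8, 8, 8])));
}
```

So anywhere we use {:?} on an ipv4::Addr, we get the familiar dotted notation rather than the default Addr([8, 8, 8, 8]) rendering.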

On to the icmp module: we'll actually want two source files. One will expose our public interface and declare its own private module to further hide the Win32 bits.

src/icmp/mod.rs will look like:

// note that we're not declaring this one "pub"
// it's just our own!
mod icmp_sys;

pub struct Request {
    // TODO:
}

pub struct Reply {
    // TODO:
}

Finally, src/icmp/icmp_sys.rs will contain the Win32-specific bits:

use crate::ipv4;

#[repr(C)]
#[derive(Debug)]
pub struct IpOptionInformation {
    pub ttl: u8,
    pub tos: u8,
    pub flags: u8,
    pub options_size: u8,
    pub options_data: u32,
}

#[repr(C)]
#[derive(Debug)]
pub struct IcmpEchoReply {
    pub address: ipv4::Addr,
    pub status: u32,
    pub rtt: u32,
    pub data_size: u16,
    pub reserved: u16,
    pub data: *const u8,
    pub options: IpOptionInformation,
}

Note that we had to make the structs and their members pub, because we're defining them in crate::icmp::icmp_sys but we're going to be using them from another module, crate::icmp. Without pub they'd be private to the current module and that's it.
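
To make those visibility rules concrete, here's a single-file sketch of the same layout. The names sys, RawReply, fake_reply and round_trip_ms are invented for illustration, they're not part of our real icmp module:

```rust
mod icmp {
    // Private child module: only `icmp` (and its descendants) can name it.
    mod sys {
        // `pub` here means "visible to whoever can see `sys`",
        // which in practice is just the parent `icmp` module.
        pub struct RawReply {
            pub rtt: u32,
        }

        pub fn fake_reply() -> RawReply {
            RawReply { rtt: 42 }
        }
    }

    // The only item the rest of the crate can reach.
    pub fn round_trip_ms() -> u32 {
        sys::fake_reply().rtt
    }
}

fn main() {
    println!("rtt = {}", icmp::round_trip_ms());
    // Uncommenting the next line fails to compile, because `sys` is private:
    // let _ = icmp::sys::fake_reply();
}
```

Dropping either pub inside sys would break round_trip_ms, which is the same situation we're in with icmp_sys.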

You may remember from the previous parts that we need an "ICMP handle" to use the Win32 ICMP API, so let's start by exposing that:

// still in `src/icmp/icmp_sys.rs`

use std::ffi::c_void;

pub type Handle = *const c_void;

pub fn IcmpCreateFile() -> Handle {
    unimplemented!()
}

Good! Now we could do something like that in src/icmp/mod.rs:

use icmp_sys::IcmpCreateFile;

pub fn something() {
    let handle = IcmpCreateFile();
}

However, there are several problems with that approach.

First problem: we left IcmpCreateFile unimplemented. We know we're not going to be implementing it ourselves, it's actually provided by IPHLPAPI.dll.

But we also know we can't just do:

// in `src/icmp/icmp_sys.rs`

extern "stdcall" {
    pub fn IcmpCreateFile() -> Handle;
}

..because then our program won't link. We've been using LoadLibrary to dynamically load IPHLPAPI.dll (as the Microsoft docs recommend!), we haven't been linking against it.

We could do something like this:

// in `src/icmp/icmp_sys.rs`

use crate::loadlibrary::Library;

type IcmpCreateFile = extern "stdcall" fn() -> Handle;

pub fn IcmpCreateFile() -> Handle {
    let iphlp = Library::new("IPHLPAPI.dll").unwrap();
    let IcmpCreateFile: IcmpCreateFile = unsafe { iphlp.get_proc("IcmpCreateFile").unwrap() };
    IcmpCreateFile()
}

..and it would work. But it would leak handles (we never close IPHLPAPI.dll).

Also, it would open the DLL and look up the procedure every time we call IcmpCreateFile. So that's not going to be a viable long-term strategy.
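
For the curious, one possible direction (not necessarily the one this series ends up taking) is to do the lookup once and cache the function pointer. Here's a portable sketch using std::sync::OnceLock, which assumes Rust 1.70 or later. Everything in it is a stand-in: expensive_lookup plays the role of Library::new plus get_proc, and fake_icmp_create_file plays the role of the real DLL function:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

// Counts how often the pretend "load DLL + look up symbol" step runs.
static LOOKUPS: AtomicUsize = AtomicUsize::new(0);

// Hypothetical stand-in for the real function pointer type.
type CreateFn = fn() -> usize;

fn fake_icmp_create_file() -> usize {
    0xC0FFEE
}

// Hypothetical stand-in for `Library::new(..)` followed by `get_proc(..)`.
fn expensive_lookup() -> CreateFn {
    LOOKUPS.fetch_add(1, Ordering::SeqCst);
    fake_icmp_create_file
}

fn cached_create() -> usize {
    // Initialized on first use, then reused by every later call.
    static PROC: OnceLock<CreateFn> = OnceLock::new();
    let create: CreateFn = *PROC.get_or_init(expensive_lookup);
    create()
}

fn main() {
    cached_create();
    cached_create();
    println!("lookups = {}", LOOKUPS.load(Ordering::SeqCst));
}
```

However many times cached_create is called, the lookup counter only goes up once.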

But, for now, let's roll with it - just so we can get our crate::icmp module up and running. We'll add IcmpSendEcho too:

// in `src/icmp/icmp_sys.rs`

type IcmpSendEcho = extern "stdcall" fn(
    handle: Handle,
    dest: ipv4::Addr,
    request_data: *const u8,
    request_size: u16,
    request_options: Option<&IpOptionInformation>,
    reply_buffer: *mut u8,
    reply_size: u32,
    timeout: u32,
) -> u32;

pub fn IcmpSendEcho(
    handle: Handle,
    dest: ipv4::Addr,
    request_data: *const u8,
    request_size: u16,
    request_options: Option<&IpOptionInformation>,
    reply_buffer: *mut u8,
    reply_size: u32,
    timeout: u32,
) -> u32 {
    let iphlp = Library::new("IPHLPAPI.dll").unwrap();
    let IcmpSendEcho: IcmpSendEcho = unsafe { iphlp.get_proc("IcmpSendEcho").unwrap() };
    IcmpSendEcho(
        handle,
        dest,
        request_data,
        request_size,
        request_options,
        reply_buffer,
        reply_size,
        timeout,
    )
}

Whoa. Okay, yeah, we're definitely going to need to come back to that.

But for now, we've got everything we want, I think! Let's move on to the design of the crate::icmp module.

We want our icmp API to be simple to use, something like:

// src/main.rs

fn main() {
    icmp::ping(ipv4::Addr([8, 8, 8, 8])).unwrap();
}

So let's build that!

// src/icmp/mod.rs

use crate::ipv4;
use std::mem::size_of;

pub fn ping(dest: ipv4::Addr) -> Result<(), String> {
    let handle = icmp_sys::IcmpCreateFile();

    let data = "O Romeo. Please respond.";

    let reply_size = size_of::<icmp_sys::IcmpEchoReply>();
    let reply_buf_size = reply_size + 8 + data.len();
    let mut reply_buf = vec![0u8; reply_buf_size];

    let timeout = 4000_u32;

    match icmp_sys::IcmpSendEcho(
        handle,
        dest,
        data.as_ptr(),
        data.len() as u16,
        None,
        reply_buf.as_mut_ptr(),
        reply_buf_size as u32,
        timeout,
    ) {
        0 => Err("IcmpSendEcho failed :(".to_string()),
        _ => Ok(()),
    }
}

This is pretty much the simplest thing we can do that still works.

A few things to note:

  • We hardcoded a timeout of 4 seconds
  • We didn't pass any IP options (like the TTL)
  • We completely ignored the reply
  • We're leaking the ICMP handle after ping returns
    • Not the memory allocated for it, but the OS resources associated with it
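
About that last point: a common Rust idiom for this kind of leak is a small guard type that releases the resource in Drop. The sketch below is portable and purely illustrative: fake_create and fake_close are hypothetical stand-ins for whatever create/close pair the real API offers, and HandleGuard isn't a type we've written anywhere yet:

```rust
use std::sync::atomic::{AtomicIsize, Ordering};

// Tracks how many pretend handles are currently open.
static OPEN_HANDLES: AtomicIsize = AtomicIsize::new(0);

// Hypothetical stand-ins for the real create/close calls.
fn fake_create() -> usize {
    OPEN_HANDLES.fetch_add(1, Ordering::SeqCst);
    1
}

fn fake_close(_handle: usize) {
    OPEN_HANDLES.fetch_sub(1, Ordering::SeqCst);
}

// Owns a handle and closes it when it goes out of scope.
struct HandleGuard(usize);

impl Drop for HandleGuard {
    fn drop(&mut self) {
        fake_close(self.0);
    }
}

fn ping_once(should_fail: bool) -> Result<(), String> {
    let _guard = HandleGuard(fake_create());
    if should_fail {
        // Early return: `_guard` is still dropped on the way out.
        return Err("send failed".to_string());
    }
    Ok(())
}

fn open_handles() -> isize {
    OPEN_HANDLES.load(Ordering::SeqCst)
}

fn main() {
    let _ = ping_once(false);
    let _ = ping_once(true);
    println!("open handles = {}", open_handles());
}
```

The nice part is that both the success path and the early-return path close the handle, without ping_once having to remember to do it.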

But... it seems to work!

Well, with no output it's always hard to convince oneself that it works.

Let's try with another address just to make sure:

// src/main.rs

fn main() {
    icmp::ping(ipv4::Addr([0, 0, 0, 0])).unwrap();
}

Okay, that's reassuring!

Cool bear's hot tip

Do you like the fancy terminal in those screenshots?

That's the new Windows Terminal, which now has beta builds available on the Windows Store.

It turns out Amos was too lazy to build it from source so he's only gotten around to it now.

Also, the theme is "One Half Dark", and the font is "Consolas".

Building a CLI

Although we have a lot to clean up underneath the surface, let's spend some time making sup (our version of ping) more user-friendly.

So far we've been hard-coding IP addresses - it's time for that to change.

We can use the std::env module to retrieve command-line arguments:

// src/main.rs

use std::env;

fn main() {
    let args: Vec<String> = env::args().collect();
    println!("args = {:?}", args);
}

For starters, we'll only take one argument - the IP address of the host to ping.

// src/main.rs

use std::process::exit;

fn main() {
    let arg = env::args().nth(1).unwrap_or_else(|| {
        println!("Usage: sup DEST");
        exit(1);
    });
    println!("dest = {}", arg);
}

What's happening here? Well, std::env::args() returns an iterator, so with nth(1) we can:

  • Skip over sup.exe, the first (well, 0th) argument
  • Ask for the next item
    • If there was a next item, use it
    • If there wasn't (i.e. nth(1) returned None), then print usage and exit with a non-zero code
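
Since nth works on any iterator, we can try the same logic without touching the real command line. In this sketch, dest_arg is a hypothetical helper that mirrors env::args().nth(1) over a plain Vec:

```rust
// Hypothetical helper: same logic as `env::args().nth(1)`,
// but over any iterator of Strings so it's easy to experiment with.
fn dest_arg<I>(mut args: I) -> Option<String>
where
    I: Iterator<Item = String>,
{
    // Consumes (and discards) item 0, then returns item 1 if there is one.
    args.nth(1)
}

fn main() {
    let with_dest = vec!["sup.exe".to_string(), "8.8.8.8".to_string()];
    let without_dest = vec!["sup.exe".to_string()];

    println!("{:?}", dest_arg(with_dest.into_iter()));
    println!("{:?}", dest_arg(without_dest.into_iter()));
}
```

The second call is the case where our unwrap_or_else closure would kick in.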

Progress!

But right now, we have "dest" as a string, and we want it as an ipv4::Addr.

Let's delegate that job to the ipv4 module.

How about an ipv4::Addr::parse() method?

// src/ipv4.rs

impl Addr {
    pub fn parse(s: String) -> Self {
        unimplemented!()
    }
}

Okay, well, it's certainly easy to use:

// src/main.rs

fn main() {
    let arg = env::args().nth(1).unwrap_or_else(|| {
        println!("Usage: sup DEST");
        exit(1);
    });
    let dest = ipv4::Addr::parse(arg);
    icmp::ping(dest).expect("ping failed");
}

But.. about that function signature: do we really need to take a String?

Remember, String is an owned type, so by taking a parameter of type String, we take ownership of it. Which means we can't re-use arg afterwards:

    let dest = ipv4::Addr::parse(arg);
    println!("Just parsed {}", arg);

We never actually change the argument to ipv4::Addr::parse. We don't hang on to it either. We just need to.. borrow it (immutably) for a hot minute.

So we could take an &str instead!

// src/ipv4.rs

impl Addr {
    pub fn parse(s: &str) -> Self {
        unimplemented!()
    }
}

We don't have to worry about the lifetime of s (see Declarative memory management for an intro to lifetimes).

But we do get a new compile error, because main still passes a String where a &str is now expected:

Well, fair - we can fix that by passing &arg instead:

let dest = ipv4::Addr::parse(&arg);

...but wouldn't it be nice to have our parse function accept both &str and String at the same time?

The AsRef trait lets us do that:

impl Addr {
    pub fn parse<S>(s: S) -> Self
    where
        S: AsRef<str>,
    {
        unimplemented!()
    }
}

Now, we can call ipv4::Addr::parse with either a &str or a String - or with any type that implements AsRef<str>, for that matter.

However, if we pass a String, we still pass ownership along with the string. If we want to re-use arg after, we must pass &arg. In this case, we don't really care, so we can just pass arg - for convenience!

let dest = ipv4::Addr::parse(arg);
icmp::ping(dest).expect("ping failed");
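
Here's a tiny standalone sketch of the AsRef<str> pattern, to show the three ways of calling such a function. dot_count is a made-up helper, chosen only because it's easy to check by eye:

```rust
// Hypothetical helper: accepts anything that can be viewed as a &str.
fn dot_count<S>(s: S) -> usize
where
    S: AsRef<str>,
{
    s.as_ref().matches('.').count()
}

fn main() {
    // A string literal is a &str.
    println!("{}", dot_count("8.8.8.8"));

    // An owned String works too, but it is moved into the call.
    println!("{}", dot_count(String::from("1.1.1.1")));

    // Passing a reference keeps the String usable afterwards.
    let owned = String::from("10.0.0.1");
    println!("{}", dot_count(&owned));
    println!("still have {}", owned);
}
```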

So! Parsing a string into an IPv4 address.

Well, we know we're going to have 4 parts, separated by dots. So let's split that string up:

impl Addr {
    pub fn parse<S>(s: S) -> Self
    where
        S: AsRef<str>,
    {
        let tokens = s.as_ref().split(".");

        unimplemented!()
    }
}

So far, so good! We had to call .as_ref() to get an actual &str - that's what taking an AsRef<str> guarantees we can do.

Next, we need to take 4 parts, convert them to u8 values, and return them as an ipv4::Addr.

Well, we've just used iterators before, so we know we can use next()!

{
    let tokens = s.as_ref().split(".");

    let a = tokens.next();
}

...but next() returns an Option! We don't know how many items the iterator will yield. It might be 4, but it might be 3, or 8, or 0.

So we have to face the hard reality that parse can fail - not all strings are valid IPv4 addresses.

Let's make an error type:

// all error types must implement Debug if we want
// to use `unwrap()`, `expect()` on the corresponding
// Result<T, E> types.
#[derive(Debug)]
pub enum ParseAddrError {
    NotEnoughParts,
}

And use it:

impl Addr {
    pub fn parse<S>(s: S) -> Result<Self, ParseAddrError>
    where
        S: AsRef<str>,
    {
        let mut tokens = s.as_ref().split(".");

        let a = tokens.next().ok_or(ParseAddrError::NotEnoughParts)?;
        let b = tokens.next().ok_or(ParseAddrError::NotEnoughParts)?;
        let c = tokens.next().ok_or(ParseAddrError::NotEnoughParts)?;
        let d = tokens.next().ok_or(ParseAddrError::NotEnoughParts)?;

        dbg!(a, b, c, d);

        unimplemented!()
    }
}

By doing that, the borrow checker woke up and told us we needed tokens to be mutable - it is an iterator, and next() changes its internal state after all.

dbg! is simply a macro that prints expressions, first literally, and also their value. It also shows the source file location of where we asked for a debug print:
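
If you want to play with dbg! in isolation, here's a minimal sketch. doubled_len is a hypothetical helper that only exists to give us an expression to print. Note that dbg! writes to stderr and hands the value back, so it can be wrapped around an expression without changing the program:

```rust
// Hypothetical helper, only here to have something to debug-print.
fn doubled_len(s: &str) -> usize {
    // Prints the source location, the expression text and its value
    // to stderr, then returns the value unchanged.
    dbg!(s.len() * 2)
}

fn main() {
    let (a, b) = ("8", "231");
    // With several arguments, dbg! returns a tuple of the values.
    let (a, b) = dbg!(a, b);
    println!("{}", doubled_len(a) + doubled_len(b));
}
```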

So far, so good - but we need u8 items, not &str items. Luckily, u8 implements the FromStr trait, so we can just do this:

{
    let mut tokens = s.as_ref().split(".");

    let a = tokens
        .next()
        .ok_or(ParseAddrError::NotEnoughParts)?
        .parse::<u8>();
    let b = tokens
        .next()
        .ok_or(ParseAddrError::NotEnoughParts)?
        .parse::<u8>();
    let c = tokens
        .next()
        .ok_or(ParseAddrError::NotEnoughParts)?
        .parse::<u8>();
    let d = tokens
        .next()
        .ok_or(ParseAddrError::NotEnoughParts)?
        .parse::<u8>();

    dbg!(a, b, c, d);

    unimplemented!()
}

Uh oh this is starting to get unwieldy. But it does work:

Oh.. those are Ok(u8). That's right, not every string is a valid u8 either, only the strings "0", "1", "2", etc.

Cool bear's hot tip

Note that u8's implementation of FromStr might not exactly be what we want.

It might accept, for example, hexadecimal notation like 0x13. 0x13.0.0.1 isn't really a format in which we expect IP addresses.

It might also accept engineering notation, like 1e2 (meaning 1 * 10.pow(2), ie. 100).

In this case, it seems like impl FromStr for u8 doesn't accept hexadecimal notation or engineering notation. It seems to only accept decimal notation.

But there are many ways to "parse a string into an int", and, if we were writing production software, we should make sure the standard FromStr implementation meets our needs.

If we didn't, we might silently accept bad input, causing hard-to-diagnose problems.
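
Checking is cheap, so here's a small probe of what the standard u8 parser accepts. octet is a hypothetical helper that simply wraps str::parse::<u8>():

```rust
// Hypothetical helper: parse one component with the standard
// FromStr impl for u8, discarding the error details.
fn octet(s: &str) -> Option<u8> {
    s.parse::<u8>().ok()
}

fn main() {
    for input in ["8", "008", "+8", "0x13", "1e2", "256", " 8", ""] {
        println!("{:?} -> {:?}", input, octet(input));
    }
}
```

Hexadecimal, engineering notation, out-of-range values, surrounding whitespace and the empty string are all rejected. Leading zeros and a leading + sign are accepted though, so whether something like 8.8.8.+8 should be considered valid is exactly the kind of question production code would have to answer.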

Ok, so, parse can fail too, let's add ?:

let a = tokens
    .next()
    .ok_or(ParseAddrError::NotEnoughParts)?
    .parse::<u8>()?;

// etc.

Uh oh, that doesn't work:

Right! str::parse::<u8>() returns a Result<u8, std::num::ParseIntError>, but our error type is ParseAddrError.

Well, it's complaining about From, maybe we can implement From?

use std::num::ParseIntError;

#[derive(Debug)]
pub enum ParseAddrError {
    NotEnoughParts,
    ParseIntError(ParseIntError),
}

impl From<ParseIntError> for ParseAddrError {
    fn from(e: ParseIntError) -> Self {
        ParseAddrError::ParseIntError(e)
    }
}

Boom! Now the ? operator can implicitly convert a std::num::ParseIntError into a crate::ipv4::ParseAddrError for us.
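
To see that conversion happen on its own, here's a self-contained sketch. The enum and the From impl are the ones from above, while parse_part is a hypothetical helper that handles a single token:

```rust
use std::num::ParseIntError;

#[derive(Debug)]
pub enum ParseAddrError {
    NotEnoughParts,
    ParseIntError(ParseIntError),
}

impl From<ParseIntError> for ParseAddrError {
    fn from(e: ParseIntError) -> Self {
        ParseAddrError::ParseIntError(e)
    }
}

// Hypothetical helper: turn one optional token into a u8.
fn parse_part(token: Option<&str>) -> Result<u8, ParseAddrError> {
    // No conversion needed here, the error is already a ParseAddrError.
    let token = token.ok_or(ParseAddrError::NotEnoughParts)?;
    // Here `?` calls `ParseAddrError::from` on the ParseIntError.
    Ok(token.parse::<u8>()?)
}

fn main() {
    println!("{:?}", parse_part(Some("8")));
    println!("{:?}", parse_part(Some("300")));
    println!("{:?}", parse_part(None));
}
```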

But our parse function is still quite long and repetitive. Here are some ideas to fix it.

We could use a closure and call it repeatedly:

impl Addr {
    pub fn parse<S>(s: S) -> Result<Self, ParseAddrError>
    where
        S: AsRef<str>,
    {
        let mut tokens = s.as_ref().split(".");

        // It's a "mut" closure (an FnMut), because it captures a
        // mutable reference as part of its environment: in this
        // case, &mut tokens.
        let mut f = || -> Result<u8, ParseAddrError> {
            // Also, we had to annotate the return type of the closure
            // otherwise rustc couldn't infer the Error type
            Ok(tokens
                .next()
                .ok_or(ParseAddrError::NotEnoughParts)?
                .parse()?)
            // we no longer need a turbofish for parse.
            // this used to be `parse::<u8>()`, but due
            // to the way `f()` is used below, the compiler
            // knows we want an u8.
        };

        Ok(Self([f()?, f()?, f()?, f()?]))
    }
}

That version works quite well:

Although we're still calling f four times, which I'm not a fan of.

Also, it doesn't complain about the extra part if we pass, say, "8.8.8.8.231".

Here's another approach:

#[derive(Debug)]
pub enum ParseAddrError {
    NotEnoughParts,
    TooManyParts, // new!
    ParseIntError(ParseIntError),
}

impl Addr {
    pub fn parse<S>(s: S) -> Result<Self, ParseAddrError>
    where
        S: AsRef<str>,
    {
        let mut tokens = s.as_ref().split(".");

        let mut res = Self([0, 0, 0, 0]);
        for part in res.0.iter_mut() {
            // `part` is now a mutable reference to one of the
            // parts of `res.0`.
            // and remember, `Addr` is a newtype, it behaves like
            // a tuple that only has one element - that's why we
            // use `res.0` to operate on the `[u8; 4]` inside.
            *part = tokens
                .next()
                .ok_or(ParseAddrError::NotEnoughParts)?
                .parse()?
        }

        // we *should* be getting `None` here, because there
        // should only be four parts. If we get `Some`, there's
        // too many.
        if let Some(_) = tokens.next() {
            return Err(ParseAddrError::TooManyParts);
        }

        Ok(res)
    }
}

I like this one better, so we'll keep it.

While we implemented ipv4::Addr::parse, we discovered something: there is a FromStr trait in the Rust standard library.

Why not implement that trait for Addr instead?

impl std::str::FromStr for Addr {
    type Err = ParseAddrError;

    fn from_str(s: &str) -> Result<Self, ParseAddrError> {
        let mut tokens = s.split(".");

        // cut: same body as before

        Ok(res)
    }
}
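
For reference, here's a self-contained sketch of the whole thing that you can paste into a scratch project. It assumes the elided body is the loop-based version from the previous snippet, with two cosmetic tweaks of mine: a char pattern in split and is_some() instead of if let. The main function just pokes at a few inputs:

```rust
use std::num::ParseIntError;
use std::str::FromStr;

pub struct Addr(pub [u8; 4]);

#[derive(Debug)]
pub enum ParseAddrError {
    NotEnoughParts,
    TooManyParts,
    ParseIntError(ParseIntError),
}

impl From<ParseIntError> for ParseAddrError {
    fn from(e: ParseIntError) -> Self {
        ParseAddrError::ParseIntError(e)
    }
}

impl FromStr for Addr {
    type Err = ParseAddrError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let mut tokens = s.split('.');

        let mut res = Self([0, 0, 0, 0]);
        for part in res.0.iter_mut() {
            // Each slot takes the next token, parsed as a u8.
            *part = tokens
                .next()
                .ok_or(ParseAddrError::NotEnoughParts)?
                .parse()?;
        }

        // Anything left over means the input had more than four parts.
        if tokens.next().is_some() {
            return Err(ParseAddrError::TooManyParts);
        }

        Ok(res)
    }
}

fn main() {
    for input in ["8.8.8.8", "8.8.8", "8.8.8.8.231", "8.8.8.x", ""] {
        println!("{:?} -> {:?}", input, input.parse::<Addr>().map(|a| a.0));
    }
}
```

One detail worth knowing: splitting an empty string still yields one (empty) token, so "" fails at the integer-parsing step rather than with NotEnoughParts.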

Now, we can change our main function from this:

let dest = ipv4::Addr::parse(arg).unwrap();
icmp::ping(dest).expect("ping failed");

To this:

icmp::ping(arg.parse().unwrap()).expect("ping failed");

And if we let our main function return a Result, we can reduce it further to this:

use std::{env, error::Error, process::exit};

fn main() -> Result<(), Box<dyn Error>> {
    let arg = env::args().nth(1).unwrap_or_else(|| {
        println!("Usage: sup DEST");
        exit(1);
    });
    icmp::ping(arg.parse()?)?;

    Ok(())
}

And here we have those multiple ? sigils that confuse Rust newcomers - but they each have a purpose in icmp::ping(arg.parse()?)?. There are two operations, both of which can fail: parsing the argument into an ipv4::Addr, and then performing a ping.

Of course, for that to work, we need to make ParseAddrError implement std::error::Error, which in turn requires it to implement std::fmt::Display, so let's do that right now:

// src/ipv4.rs

use std::fmt;

impl fmt::Display for ParseAddrError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "{:?}", self)
    }
}

impl std::error::Error for ParseAddrError {}

And there you have it!
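
If the double ? still feels like magic, here's a portable sketch of the same shape. fake_ping and run are hypothetical stand-ins: fake_ping returns a Result<(), String> just like our icmp::ping, and the argument is parsed into a u8 rather than an ipv4::Addr to keep the example short. Each ? converts a different error type into Box<dyn Error>: the inner one a ParseIntError (which implements std::error::Error), the outer one a plain String (which the standard library also knows how to box).

```rust
use std::error::Error;

// Hypothetical stand-in for `icmp::ping`, with the same error type.
fn fake_ping(ttl: u8) -> Result<(), String> {
    if ttl == 0 {
        Err("ttl must be non-zero".to_string())
    } else {
        Ok(())
    }
}

// Hypothetical stand-in for our `main`, taking the argument directly.
fn run(arg: &str) -> Result<(), Box<dyn Error>> {
    // Inner `?`: ParseIntError -> Box<dyn Error>
    // Outer `?`: String        -> Box<dyn Error>
    fake_ping(arg.parse()?)?;
    Ok(())
}

fn main() {
    for arg in ["64", "0", "abc"] {
        match run(arg) {
            Ok(()) => println!("{}: ok", arg),
            Err(e) => println!("{}: error: {}", arg, e),
        }
    }
}
```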

What did we learn?

Rust's visibility system is fine-grained enough for us to hide implementation details however we choose to.

std::env::args() returns the arguments passed to a program as an iterator. One may skip an iterator's items, ask for the next item, or collect the whole iterator into a type like Vec<T>.

The FromStr trait is implemented by a variety of primitive types. Implementing it for custom types is easy, and works great with command-line argument parsing.

The AsRef trait allows one to take both a &str and String. Or, both a Path and a PathBuf. This applies to many other types. See also the Borrow trait.

The dbg!() macro is very useful for quick and dirty "printf debugging".

When the Rust compiler needs a little help inferring types, closures can be annotated with argument types and a return type.

Implementing std::error::Error only requires implementing std::fmt::Display and std::fmt::Debug.
