A simple ping library, parsing strings into IPv4 address
From the series
Making our own ping
We've just spent a lot of time abstracting over LoadLibrary, but we still have all the gory details of the Win32 ICMP API straight in our main.rs file! That won't do.
This time will be much quicker, since we already learned about carefully designing an API, hiding the low-level bits and so on.
Let's add an `icmp` module to our program. Actually, we've been dealing with an `IPAddr` all this time; it also sounds like it could use its own module.
In `src/main.rs`:
pub mod icmp; pub mod ipv4;
Our `ipv4` module will be short and sweet. In `src/ipv4.rs`:
use std::fmt;

pub struct Addr(pub [u8; 4]);

impl fmt::Debug for Addr {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        let [a, b, c, d] = self.0;
        write!(f, "{}.{}.{}.{}", a, b, c, d)
    }
}
Now we can just change mentions of `IPAddr` to `ipv4::Addr`.
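As a quick sanity check, here is a minimal sketch of what that `Debug` implementation produces. It repeats the `Addr` type so it can run on its own; the `main` function is only there for illustration.

```rust
use std::fmt;

pub struct Addr(pub [u8; 4]);

impl fmt::Debug for Addr {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        let [a, b, c, d] = self.0;
        write!(f, "{}.{}.{}.{}", a, b, c, d)
    }
}

fn main() {
    let addr = Addr([8, 8, 8, 8]);
    // `{:?}` goes through our hand-written Debug impl,
    // so this prints the dotted form: 8.8.8.8
    println!("{:?}", addr);
}
```

Because we wrote `Debug` by hand instead of deriving it, the address shows up in dotted notation rather than as a tuple struct wrapping an array.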
Onto the `icmp` module: we'll actually want two source files.
One will expose our public interface, and declare its own private module to further hide the Win32 bits.
`src/icmp/mod.rs` will look like:
// note that we're not declaring this one "pub"
// it's just our own!
mod icmp_sys;

pub struct Request {
    // TODO:
}

pub struct Reply {
    // TODO:
}
Finally, `src/icmp/icmp_sys.rs` will contain the Win32-specific bits:
use crate::ipv4;

#[repr(C)]
#[derive(Debug)]
pub struct IpOptionInformation {
    pub ttl: u8,
    pub tos: u8,
    pub flags: u8,
    pub options_size: u8,
    pub options_data: u32,
}

#[repr(C)]
#[derive(Debug)]
pub struct IcmpEchoReply {
    pub address: ipv4::Addr,
    pub status: u32,
    pub rtt: u32,
    pub data_size: u16,
    pub reserved: u16,
    pub data: *const u8,
    pub options: IpOptionInformation,
}
Note that we had to make the structs and their members `pub`, because we're defining them in `crate::icmp::icmp_sys` but we're going to be using them from another module, `crate::icmp`. Without `pub` they'd be private to the current module and that's it.
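To make the visibility rules concrete, here is a small self-contained sketch using inline modules. The `Reply` struct, `fake_reply` and `round_trip_time` are hypothetical names made up for this example; they are not part of our actual `icmp` module.

```rust
mod icmp {
    // Private: only code inside `icmp` can name `icmp_sys`.
    mod icmp_sys {
        // `pub` on the struct and its field makes them visible
        // to the parent module.
        pub struct Reply {
            pub rtt: u32,
        }

        // Hypothetical helper, standing in for the real Win32 call.
        pub fn fake_reply() -> Reply {
            Reply { rtt: 42 }
        }
    }

    // Public: this is all the rest of the crate gets to see.
    pub fn round_trip_time() -> u32 {
        icmp_sys::fake_reply().rtt
    }
}

fn main() {
    println!("rtt = {}", icmp::round_trip_time());

    // This would not compile, because `icmp_sys` is private to `icmp`:
    // let reply = icmp::icmp_sys::fake_reply();
}
```

The items inside `icmp_sys` are `pub`, but since the module itself is private, nothing outside of `icmp` can reach them.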
You may remember from the previous parts that we need an "ICMP handle" to use the Win32 ICMP API, so let's start by exposing that:
// still in `src/icmp/icmp_sys.rs`
use std::ffi::c_void;

pub type Handle = *const c_void;

pub fn IcmpCreateFile() -> Handle {
    unimplemented!()
}
Good! Now we could do something like this in `src/icmp/mod.rs`:
use icmp_sys::IcmpCreateFile;

pub fn something() {
    let handle = IcmpCreateFile();
}
However, there are several problems with that approach.
First problem: we left `IcmpCreateFile` unimplemented. We know we're not going to be implementing it ourselves; it's actually provided by `IPHLPAPI.dll`.
But we also know we can't just do:
// in `src/icmp/icmp_sys.rs`
extern "stdcall" {
    pub fn IcmpCreateFile() -> Handle;
}
..because then our program won't link. We've been using `LoadLibrary` to dynamically load `IPHLPAPI.dll` (as the Microsoft docs recommend!); we haven't been linking against it.
We could do something like this:
// in `src/icmp/icmp_sys.rs`
use crate::loadlibrary::Library;

type IcmpCreateFile = extern "stdcall" fn() -> Handle;

pub fn IcmpCreateFile() -> Handle {
    let iphlp = Library::new("IPHLPAPI.dll").unwrap();
    let IcmpCreateFile: IcmpCreateFile = unsafe { iphlp.get_proc("IcmpCreateFile").unwrap() };
    IcmpCreateFile()
}
..and it would work. But it would leak handles (we never close `IPHLPAPI.dll`). Also, it would open the DLL and look up the procedure every time we call `IcmpCreateFile`. So that's not going to be a viable long-term strategy.
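To illustrate the "look it up every time" problem, here is one possible direction, sketched with `std::sync::OnceLock` from the standard library. This is not the approach the series settles on, just an assumption-laden example: `expensive_lookup` and `fake_icmp_create_file` are hypothetical stand-ins for opening the DLL and resolving the procedure, so the sketch runs on any platform.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

// Counts how many times the "expensive" lookup actually runs.
static LOOKUPS: AtomicUsize = AtomicUsize::new(0);

// Hypothetical stand-in for the procedure we'd get out of the DLL.
fn fake_icmp_create_file() -> u32 {
    7
}

// Hypothetical stand-in for "open the DLL and look up the procedure".
fn expensive_lookup() -> fn() -> u32 {
    LOOKUPS.fetch_add(1, Ordering::SeqCst);
    fake_icmp_create_file
}

// Returns the procedure, performing the lookup only on the first call.
fn cached_proc() -> fn() -> u32 {
    static PROC: OnceLock<fn() -> u32> = OnceLock::new();
    *PROC.get_or_init(expensive_lookup)
}

fn main() {
    for _ in 0..3 {
        cached_proc()();
    }
    println!("lookups performed: {}", LOOKUPS.load(Ordering::SeqCst));
}
```

With a cache like this, the lookup cost is paid once no matter how many times the wrapper is called. It says nothing about closing the library, which is a separate problem.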
But, for now, let's roll with it - just so we can get our `crate::icmp` module up and running. We'll add `IcmpSendEcho` too:
// in `src/icmp/icmp_sys.rs`
type IcmpSendEcho = extern "stdcall" fn(
    handle: Handle,
    dest: ipv4::Addr,
    request_data: *const u8,
    request_size: u16,
    request_options: Option<&IpOptionInformation>,
    reply_buffer: *mut u8,
    reply_size: u32,
    timeout: u32,
) -> u32;

pub fn IcmpSendEcho(
    handle: Handle,
    dest: ipv4::Addr,
    request_data: *const u8,
    request_size: u16,
    request_options: Option<&IpOptionInformation>,
    reply_buffer: *mut u8,
    reply_size: u32,
    timeout: u32,
) -> u32 {
    let iphlp = Library::new("IPHLPAPI.dll").unwrap();
    let IcmpSendEcho: IcmpSendEcho = unsafe { iphlp.get_proc("IcmpSendEcho").unwrap() };
    IcmpSendEcho(
        handle,
        dest,
        request_data,
        request_size,
        request_options,
        reply_buffer,
        reply_size,
        timeout,
    )
}
Whoa. Okay, yeah, we're definitely going to need to come back to that.
But for now, we've got everything we want, I think! Let's move on to the design of the `crate::icmp` module.
We want our icmp API to be simple to use, something like:
// src/main.rs
fn main() {
    icmp::ping(ipv4::Addr([8, 8, 8, 8])).unwrap();
}
So let's build that!
// src/icmp/mod.rs
use crate::ipv4;
use std::mem::size_of;

pub fn ping(dest: ipv4::Addr) -> Result<(), String> {
    let handle = icmp_sys::IcmpCreateFile();

    let data = "O Romeo. Please respond.";

    let reply_size = size_of::<icmp_sys::IcmpEchoReply>();
    let reply_buf_size = reply_size + 8 + data.len();
    let mut reply_buf = vec![0u8; reply_buf_size];

    let timeout = 4000_u32;

    match icmp_sys::IcmpSendEcho(
        handle,
        dest,
        data.as_ptr(),
        data.len() as u16,
        None,
        reply_buf.as_mut_ptr(),
        reply_buf_size as u32,
        timeout,
    ) {
        0 => Err("IcmpSendEcho failed :(".to_string()),
        _ => Ok(()),
    }
}
This is pretty much the simplest thing we can do that still works.
A few things to note:
- We hardcoded a timeout of 4 seconds
- We didn't pass any IP options (like the TTL)
- We completely ignored the reply
- We're leaking the ICMP handle after `ping` returns
  - Not the memory allocated for it, but the OS resources associated with it
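One thing the list doesn't spell out is how big the reply buffer ends up being. The sketch below redoes the arithmetic from `ping` on its own. It copies the struct layouts from `icmp_sys` but swaps `ipv4::Addr` for a plain `[u8; 4]` so it is self-contained, which is an assumption made for this example only. The extra 8 bytes are, as far as I can tell from Microsoft's `IcmpSendEcho` documentation, room for an ICMP error message.

```rust
use std::mem::size_of;

#[repr(C)]
pub struct IpOptionInformation {
    pub ttl: u8,
    pub tos: u8,
    pub flags: u8,
    pub options_size: u8,
    pub options_data: u32,
}

#[repr(C)]
pub struct IcmpEchoReply {
    // stand-in for `ipv4::Addr`, which wraps the same four bytes
    pub address: [u8; 4],
    pub status: u32,
    pub rtt: u32,
    pub data_size: u16,
    pub reserved: u16,
    pub data: *const u8,
    pub options: IpOptionInformation,
}

// Same formula as in `ping`: one reply struct, 8 spare bytes,
// and enough room for the payload to be echoed back.
pub fn reply_buf_size(payload: &str) -> usize {
    size_of::<IcmpEchoReply>() + 8 + payload.len()
}

fn main() {
    let data = "O Romeo. Please respond.";
    println!("payload:      {} bytes", data.len());
    println!("reply struct: {} bytes", size_of::<IcmpEchoReply>());
    println!("reply buffer: {} bytes", reply_buf_size(data));
}
```

The size of the struct depends on the pointer width of the target, which is why the code asks `size_of` instead of hardcoding a number.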
But... it seems to work!
Well, with no output it's always hard to convince oneself that it works.
Let's try with another address just to make sure:
// src/main.rs
fn main() {
    icmp::ping(ipv4::Addr([0, 0, 0, 0])).unwrap();
}
Okay, that's reassuring!
Do you like the fancy terminal in those screenshots?
That's the new Windows Terminal, which now has beta builds available on the Windows Store.
It turns out Amos was too lazy to build it from source so he's only gotten around to it now.
Also, the theme is "One Half Dark", and the font is "Consolas".
Building a CLI
Although we have a lot to clean up underneath the surface, let's spend some time making `sup` (our version of ping) more user-friendly.
So far we've been hard-coding IP addresses - it's time for that to change.
We can use the `std::env` module to retrieve command-line arguments:
// src/main.rs
use std::env;

fn main() {
    let args: Vec<String> = env::args().collect();
    println!("args = {:?}", args);
}
For starters, we'll only take one argument - the IP address of the host to ping.
// src/main.rs
use std::env;
use std::process::exit;

fn main() {
    let arg = env::args().nth(1).unwrap_or_else(|| {
        println!("Usage: sup DEST");
        exit(1);
    });

    println!("dest = {}", arg);
}
What's happening here? Well, `std::env::args()` returns an iterator, so we can just:
- Skip over `sup.exe`, the first (well, 0th) argument
- Ask for the next item
- If there was a next item, use it
- If there wasn't (i.e. `next()` returned `None`), then print usage and exit with a non-zero code
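The steps above can be tried out without touching the real command line. In this sketch, `first_arg` is a hypothetical helper (not something our program defines) that works on any iterator of `String`s, so we can feed it a plain `Vec` as well as `std::env::args()`.

```rust
// Returns the first real argument, skipping the program name.
fn first_arg<I>(mut args: I) -> Option<String>
where
    I: Iterator<Item = String>,
{
    // `nth(1)` consumes item 0 (the program name) and
    // returns item 1, if there is one.
    args.nth(1)
}

fn main() {
    match first_arg(std::env::args()) {
        Some(dest) => println!("dest = {}", dest),
        None => println!("Usage: sup DEST"),
    }

    // The same helper works on any iterator of Strings:
    let fake = vec!["sup.exe".to_string(), "8.8.8.8".to_string()];
    assert_eq!(first_arg(fake.into_iter()), Some("8.8.8.8".to_string()));
}
```

`nth(1)` does the skipping and the asking in one call, which is why our `main` never has to call `next()` directly.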
Progress!
But right now, we have "dest" as a string, and we want it as an `ipv4::Addr`.
Let's delegate that job to the `ipv4` module.
How about an `ipv4::Addr::parse()` method?
// src/ipv4.rs
impl Addr {
    pub fn parse(s: String) -> Self {
        unimplemented!()
    }
}
Okay, well, it's certainly easy to use:
// src/main.rs
fn main() {
    let arg = env::args().nth(1).unwrap_or_else(|| {
        println!("Usage: sup DEST");
        exit(1);
    });

    let dest = ipv4::Addr::parse(arg);
    icmp::ping(dest).expect("ping failed");
}
But.. about that function signature: do we really need to take a `String`?
Remember, `String` is an owned type, so by taking a parameter of type `String`, we take ownership of it. Which means we can't re-use `arg` after:
let dest = ipv4::Addr::parse(arg);
println!("Just parsed {}", arg);
We never actually change the argument to `ipv4::Addr::parse`. We don't hang on to it either. We just need to.. borrow it (immutably) for a hot minute.
So we could take an `&str` instead!
// src/ipv4.rs
impl Addr {
    pub fn parse(s: &str) -> Self {
        unimplemented!()
    }
}
We don't have to worry about the lifetime of `s` (see Declarative memory management for an intro to lifetimes).
But, we do get a new compile error:
Well, fair - we can fix that by passing `&arg` instead:
let dest = ipv4::Addr::parse(&arg);
...but wouldn't it be nice to have our `parse` function accept both `&str` and `String` at the same time?
The `AsRef` trait lets us do that:
impl Addr {
    pub fn parse<S>(s: S) -> Self
    where
        S: AsRef<str>,
    {
        unimplemented!()
    }
}
Now, we can call `ipv4::Addr::parse` with either a `&str` or a `String` - or with any type that implements `AsRef<str>`, for that matter.
However, if we pass a `String`, we still pass ownership along with the string. If we want to re-use `arg` after, we must pass `&arg`. In this case, we don't really care, so we can just pass `arg` - for convenience!
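Here's a small sketch of that flexibility, and of the ownership caveat. `count_parts` is a hypothetical function written for this example; it has the same `AsRef<str>` bound as our `parse`, but does something trivial so it can run by itself.

```rust
// Accepts `&str`, `String`, `&String`, or anything else
// that implements `AsRef<str>`.
fn count_parts<S>(s: S) -> usize
where
    S: AsRef<str>,
{
    s.as_ref().split('.').count()
}

fn main() {
    let owned = String::from("8.8.8.8");

    // Borrowing: `owned` is still usable afterwards.
    assert_eq!(count_parts(&owned), 4);

    // A string literal is a `&'static str`.
    assert_eq!(count_parts("127.0.0.1"), 4);

    // Passing by value moves `owned` into the function.
    assert_eq!(count_parts(owned), 4);

    // This would not compile, the value was moved above:
    // println!("{}", owned);
}
```

So the `AsRef<str>` bound buys convenience for the caller, but it doesn't change who owns what.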
let dest = ipv4::Addr::parse(arg);
icmp::ping(dest).expect("ping failed");
So! Parsing a string into an IPv4 address.
Well, we know we're going to have 4 parts, separated by dots. So let's split that string up:
impl Addr {
    pub fn parse<S>(s: S) -> Self
    where
        S: AsRef<str>,
    {
        let tokens = s.as_ref().split(".");
        unimplemented!()
    }
}
So far, so good! We had to call `.as_ref()` to get an actual `&str` - that's what taking an `AsRef<str>` guarantees we can do.
Next, we need to take 4 parts, convert them to `u8` values, and return them as an `ipv4::Addr`.
Well, we've just used iterators before, so we know we can use `next()`!
{
    let tokens = s.as_ref().split(".");
    let a = tokens.next();
}
...but `next()` returns an `Option`! We don't know how many items the iterator will yield. It might be 4, but it might be 3, or 8, or 0.
So we have to face the hard reality that `parse` can fail - not all strings are valid IPv4 addresses.
Let's make an error type:
// all error types must implement Debug if we want
// to use `unwrap()`, `expect()` on the corresponding
// Result<T, E> types.
#[derive(Debug)]
pub enum ParseAddrError {
    NotEnoughParts,
}
And use it:
impl Addr {
    pub fn parse<S>(s: S) -> Result<Self, ParseAddrError>
    where
        S: AsRef<str>,
    {
        let mut tokens = s.as_ref().split(".");

        let a = tokens.next().ok_or(ParseAddrError::NotEnoughParts)?;
        let b = tokens.next().ok_or(ParseAddrError::NotEnoughParts)?;
        let c = tokens.next().ok_or(ParseAddrError::NotEnoughParts)?;
        let d = tokens.next().ok_or(ParseAddrError::NotEnoughParts)?;
        dbg!(a, b, c, d);

        unimplemented!()
    }
}
By doing that, the borrow checker woke up and told us we needed `tokens` to be mutable - it is an iterator, and `next()` changes its internal state after all.
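The `ok_or(...)?` combination is doing a fair bit of work here, so here it is in isolation. This is a cut-down sketch: `first_two` is a hypothetical function that only pulls two tokens, and the error enum is redeclared (with `PartialEq` added so results can be compared) to keep the example self-contained.

```rust
#[derive(Debug, PartialEq)]
enum ParseAddrError {
    NotEnoughParts,
}

fn first_two(s: &str) -> Result<(&str, &str), ParseAddrError> {
    // `mut` is required: every call to `next()` advances the iterator.
    let mut tokens = s.split('.');

    // `ok_or` turns `Option<&str>` into `Result<&str, ParseAddrError>`,
    // and `?` returns early from the function on `Err`.
    let a = tokens.next().ok_or(ParseAddrError::NotEnoughParts)?;
    let b = tokens.next().ok_or(ParseAddrError::NotEnoughParts)?;
    Ok((a, b))
}

fn main() {
    assert_eq!(first_two("8.4"), Ok(("8", "4")));
    assert_eq!(first_two("8"), Err(ParseAddrError::NotEnoughParts));
    println!("ok");
}
```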
`dbg!` is simply a macro that prints expressions, first literally, and then their value. It also shows the source file location of where we asked for a debug print:
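Something else that's handy about `dbg!`: it hands back the value it was given, so it can be wrapped around part of an expression without changing what the code computes. A minimal sketch, where `scaled` is a hypothetical function invented for the demonstration:

```rust
fn scaled(x: u32) -> u32 {
    // `dbg!` prints the expression and its value to stderr,
    // then returns the value so we can keep using it.
    let bumped = dbg!(x + 1);
    bumped * 10
}

fn main() {
    let result = scaled(4);
    println!("result = {}", result);
}
```

The debug output goes to stderr, so it doesn't get mixed into whatever the program prints on stdout.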
So far, so good - but we need `u8` items, not `&str` items. Luckily, `u8` implements the `FromStr` trait, so we can just do this:
{
    let mut tokens = s.as_ref().split(".");

    let a = tokens
        .next()
        .ok_or(ParseAddrError::NotEnoughParts)?
        .parse::<u8>();
    let b = tokens
        .next()
        .ok_or(ParseAddrError::NotEnoughParts)?
        .parse::<u8>();
    let c = tokens
        .next()
        .ok_or(ParseAddrError::NotEnoughParts)?
        .parse::<u8>();
    let d = tokens
        .next()
        .ok_or(ParseAddrError::NotEnoughParts)?
        .parse::<u8>();
    dbg!(a, b, c, d);

    unimplemented!()
}
Uh oh, this is starting to get unwieldy. But it does work:
Oh.. those are `Ok(u8)`. That's right, not every string is a valid `u8` either, only the strings "0", "1", "2", etc.
Note that `u8`'s implementation of `FromStr` might not exactly be what we want. It might accept, for example, hexadecimal notation like `0x13`. `0x13.0.0.1` isn't really a format in which we expect IP addresses.
It might also accept engineering notation, like `1e2` (meaning `1 * 10.pow(2)`, i.e. `100`).
In this case, it seems like `impl FromStr for u8` doesn't accept hexadecimal notation or engineering notation. It seems to only accept decimal notation.
But there are many ways to "parse a string into an int", and, if we were writing production software, we should make sure the standard `FromStr` implementation meets our needs.
If we didn't, we might silently accept bad input, causing hard-to-diagnose problems.
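Checking is cheap, so here's a quick probe of what `"...".parse::<u8>()` accepts. `parses_as_u8` is a hypothetical helper for this experiment. The two cases at the bottom may be surprising: as far as I can tell from the standard library docs, a leading `+` and leading zeros are both accepted.

```rust
// True if the standard library is willing to turn `s` into a u8.
fn parses_as_u8(s: &str) -> bool {
    s.parse::<u8>().is_ok()
}

fn main() {
    // Plain decimal digits are fine.
    assert!(parses_as_u8("13"));

    // No hexadecimal, no engineering notation.
    assert!(!parses_as_u8("0x13"));
    assert!(!parses_as_u8("1e2"));

    // Out of range, empty, padded or negative input is rejected.
    assert!(!parses_as_u8("256"));
    assert!(!parses_as_u8(""));
    assert!(!parses_as_u8(" 7"));
    assert!(!parses_as_u8("-1"));

    // But these two are accepted:
    assert_eq!("+7".parse::<u8>(), Ok(7));
    assert_eq!("007".parse::<u8>(), Ok(7));

    println!("all checks passed");
}
```

So a parser built on top of it would take something like `+8.008.8.8` without complaining. Whether that matters depends on where the input comes from, which is exactly the kind of question to ask before shipping.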
Ok, so, `parse` can fail too, let's add `?`:
let a = tokens
    .next()
    .ok_or(ParseAddrError::NotEnoughParts)?
    .parse::<u8>()?;
// etc.
Uh oh, that doesn't work:
Right! `str::parse::<u8>()` returns a `Result<u8, std::num::ParseIntError>`, but our error type is `ParseAddrError`.
Well, it's complaining about `From`, maybe we can implement `From`?
use std::num::ParseIntError;

#[derive(Debug)]
pub enum ParseAddrError {
    NotEnoughParts,
    ParseIntError(ParseIntError),
}

impl From<ParseIntError> for ParseAddrError {
    fn from(e: ParseIntError) -> Self {
        ParseAddrError::ParseIntError(e)
    }
}
Boom! Now the `?` operator can implicitly convert a `std::num::ParseIntError` into a `crate::ipv4::ParseAddrError`.
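To see that conversion happen, here's a self-contained sketch. `parse_part` is a hypothetical helper that handles a single token; the error type and its `From` implementation are repeated so the example stands alone.

```rust
use std::num::ParseIntError;

#[derive(Debug)]
enum ParseAddrError {
    NotEnoughParts,
    ParseIntError(ParseIntError),
}

impl From<ParseIntError> for ParseAddrError {
    fn from(e: ParseIntError) -> Self {
        ParseAddrError::ParseIntError(e)
    }
}

fn parse_part(part: Option<&str>) -> Result<u8, ParseAddrError> {
    // First `?`: the error is already a ParseAddrError.
    let s = part.ok_or(ParseAddrError::NotEnoughParts)?;
    // Second `?`: the ParseIntError goes through `From::from`
    // to become a ParseAddrError.
    let n = s.parse::<u8>()?;
    Ok(n)
}

fn main() {
    println!("{:?}", parse_part(Some("8")));
    println!("{:?}", parse_part(None));
    println!("{:?}", parse_part(Some("nope")));
}
```

Each failure mode ends up as a distinct variant of our own error type, even though one of them originated in the standard library.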
But our `parse` function is still quite long and repetitive. Here are some ideas to fix it.
We could use a closure and call it repeatedly:
impl Addr {
    pub fn parse<S>(s: S) -> Result<Self, ParseAddrError>
    where
        S: AsRef<str>,
    {
        let mut tokens = s.as_ref().split(".");

        // It's a "mut" closure (an FnMut), because it captures some
        // mutable references as part of its environment: in this
        // case, &mut tokens.
        let mut f = || -> Result<u8, ParseAddrError> {
            // Also, we had to annotate the return type of the closure
            // otherwise rustc couldn't infer the Error type
            Ok(tokens
                .next()
                .ok_or(ParseAddrError::NotEnoughParts)?
                .parse()?)
            // we no longer need a turbofish for parse.
            // this used to be `parse::<u8>()`, but due
            // to the way `f()` is used below, the compiler
            // knows we want an u8.
        };

        Ok(Self([f()?, f()?, f()?, f()?]))
    }
}
That version works quite well:
Although we're still calling `f` four times, which I'm not a fan of. Also, it doesn't complain if we pass, say, `"8.8.8.8.231"`.
Here's another approach:
#[derive(Debug)]
pub enum ParseAddrError {
    NotEnoughParts,
    TooManyParts, // new!
    ParseIntError(ParseIntError),
}

impl Addr {
    pub fn parse<S>(s: S) -> Result<Self, ParseAddrError>
    where
        S: AsRef<str>,
    {
        let mut tokens = s.as_ref().split(".");

        let mut res = Self([0, 0, 0, 0]);
        for part in res.0.iter_mut() {
            // `part` is now a mutable reference to one of the
            // parts of `res.0`.
            // and remember, `Addr` is a newtype, it behaves like
            // a tuple that only has one element - that's why we
            // use `res.0` to operate on the `[u8; 4]` inside.
            *part = tokens
                .next()
                .ok_or(ParseAddrError::NotEnoughParts)?
                .parse()?
        }

        // we *should* be getting `None` here, because there
        // should only be four parts. If we get `Some`, there's
        // too many.
        if let Some(_) = tokens.next() {
            return Err(ParseAddrError::TooManyParts);
        }

        Ok(res)
    }
}
I like this one better, so we'll keep it.
While we implemented `ipv4::Addr::parse`, we discovered something: there is a `FromStr` trait in the Rust standard library.
Why not implement that trait for `Addr` instead?
impl std::str::FromStr for Addr {
    type Err = ParseAddrError;

    fn from_str(s: &str) -> Result<Self, ParseAddrError> {
        let mut tokens = s.split(".");
        // cut: same body as before
        Ok(res)
    }
}
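Since the body was cut above, here is one way the whole thing might fit together as a standalone program. It's a sketch rather than the exact contents of `src/ipv4.rs`: for brevity `Addr` derives `Debug` and `PartialEq` here (instead of using our hand-written `Debug`), and the "too many parts" check uses `is_some()`.

```rust
use std::num::ParseIntError;
use std::str::FromStr;

#[derive(Debug, PartialEq)]
pub struct Addr(pub [u8; 4]);

#[derive(Debug)]
pub enum ParseAddrError {
    NotEnoughParts,
    TooManyParts,
    ParseIntError(ParseIntError),
}

impl From<ParseIntError> for ParseAddrError {
    fn from(e: ParseIntError) -> Self {
        ParseAddrError::ParseIntError(e)
    }
}

impl FromStr for Addr {
    type Err = ParseAddrError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let mut tokens = s.split('.');

        let mut res = Self([0, 0, 0, 0]);
        for part in res.0.iter_mut() {
            *part = tokens
                .next()
                .ok_or(ParseAddrError::NotEnoughParts)?
                .parse()?;
        }

        // Anything left over means there were more than four parts.
        if tokens.next().is_some() {
            return Err(ParseAddrError::TooManyParts);
        }

        Ok(res)
    }
}

fn main() {
    // Implementing FromStr is what makes `str::parse` work for Addr.
    let addr: Addr = "8.8.8.8".parse().unwrap();
    assert_eq!(addr, Addr([8, 8, 8, 8]));

    println!("{:?}", "8.8.8".parse::<Addr>());
    println!("{:?}", "8.8.8.8.231".parse::<Addr>());
    println!("{:?}", "8.8.8.256".parse::<Addr>());
}
```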
Now, we can change our main function from this:
let dest = ipv4::Addr::parse(arg).unwrap();
icmp::ping(dest).expect("ping failed");
To this:
icmp::ping(arg.parse().unwrap()).expect("ping failed");
And if we let our `main` function return a `Result`, we can reduce it further to this:
use std::{env, error::Error, process::exit};

fn main() -> Result<(), Box<dyn Error>> {
    let arg = env::args().nth(1).unwrap_or_else(|| {
        println!("Usage: sup DEST");
        exit(1);
    });

    icmp::ping(arg.parse()?)?;
    Ok(())
}
And here we have those multiple `?` sigils that confuse Rust newcomers - but they each have a purpose on the second-to-last line. There are two operations, both of which can fail: parsing the argument into an address, and then performing a ping.
Of course, for that to work, we need to make `ParseAddrError` implement `std::error::Error`, which in turn requires it to implement `std::fmt::Display`, so let's do that right now:
// src/ipv4.rs
use std::fmt;

impl fmt::Display for ParseAddrError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "{:?}", self)
    }
}

impl std::error::Error for ParseAddrError {}
And there you have it!
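If the `Box<dyn Error>` part feels like magic, this sketch shows two unrelated error types flowing into it. `EmptyInput` and `parse_nonempty` are hypothetical names made up for the example; the point is only that anything implementing `std::error::Error` can be boxed, whether by `?` or by `.into()`.

```rust
use std::error::Error;
use std::fmt;

// A custom error type, set up the same way as ParseAddrError:
// Debug + Display + an empty Error impl.
#[derive(Debug)]
struct EmptyInput;

impl fmt::Display for EmptyInput {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "input was empty")
    }
}

impl Error for EmptyInput {}

fn parse_nonempty(s: &str) -> Result<u8, Box<dyn Error>> {
    if s.is_empty() {
        // Our own error type gets boxed by `.into()`.
        return Err(EmptyInput.into());
    }
    // ParseIntError implements Error too, so `?` boxes it as well.
    Ok(s.parse::<u8>()?)
}

fn main() {
    println!("{:?}", parse_nonempty("8"));
    if let Err(e) = parse_nonempty("") {
        println!("error: {}", e);
    }
    if let Err(e) = parse_nonempty("abc") {
        println!("error: {}", e);
    }
}
```

The trade-off is that callers only see "some error" unless they downcast, which is fine for a `main` function but less so for a library API.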
Rust's visibility system is fine-grained enough for us to hide implementation details however we choose to.
`std::env::args()` returns the arguments passed to a program as an iterator. One may skip an iterator's items, ask for the next item, or collect the whole iterator into a type like `Vec<T>`.
The `FromStr` trait is implemented by a variety of primitive types. Implementing it for custom types is easy, and works great with command-line argument parsing.
The `AsRef` trait allows one to take both a `&str` and a `String`. Or, both a `Path` and a `PathBuf`. This applies to many other types. See also the `Borrow` trait.
The `dbg!()` macro is very useful for quick and dirty "printf debugging".
When the Rust compiler needs a little help inferring types, closures can be annotated with argument types and a return type.
Implementing `std::error::Error` only requires implementing `std::fmt::Display` and `std::fmt::Debug`.
This article is part 5 of the Making our own ping series.