The builder pattern, and a macro that keeps FFI code DRY
👋 This page was last updated ~5 years ago. Just so you know.
Our ping API is simple, but it's also very limited:
pub fn ping(dest: ipv4::Addr) -> Result<(), String>

// called as:
ping(ipv4::Addr([8, 8, 8, 8])).unwrap();
It doesn't allow specifying the TTL (time to live) of packets, it doesn't allow specifying the timeout, it doesn't let one specify the data to send along, and it doesn't give us any kind of information on the reply.
Let's change that now.
We could take all of these as arguments:
pub fn ping(dest: ipv4::Addr, ttl: u8, timeout: u32, data: Vec<u8>) -> Result<(), String>
But then the callsite becomes larger and less readable:
ping(ipv4::Addr([8, 8, 8, 8]), 128, 4000, "Some data".into())
Here we can sort of guess that 128 must be the TTL since 4000 wouldn't fit into a u8 - but that's because we have domain-specific knowledge. Someone casually reading the call to ping() might not know that.
Even if they hovered on 128 in their IDE, all they'd see is a pop-up that says u8. Positional parameters make for hard-to-read APIs, especially when there are more than 2 or 3 of them, and their types resemble each other.
Another problem with this proposal is that we've lost the ability to set defaults.
Now everyone has to pick a ttl and a timeout, even if they were fine with the default we had in our earlier version.
Of course, we can fix that problem by making (almost) everything optional:
pub fn ping(
    dest: ipv4::Addr,
    ttl: Option<u8>,
    timeout: Option<u32>,
    data: Option<Vec<u8>>,
) -> Result<(), String>
...and now the call site is even worse:
ping(ipv4::Addr([8, 8, 8, 8]), Some(128), Some(4000), Some("Some data".into()))
...especially if we want to use the defaults:
ping(ipv4::Addr([8, 8, 8, 8]), None, None, None)
Who knows what these parameters mean?
If we want defaults and readable call sites, we can use the Builder pattern.
First we'll need a struct that represents an ICMP request:
// src/icmp/mod.rs
pub struct Request {}

We'll provide Request::new to create a new instance of it:

// src/icmp/mod.rs
pub struct Request {}

impl Request {
    pub fn new() -> Self {
        Self {}
    }
}

new() will take a destination address, because it's the only argument that doesn't have a default.
We'll store it in the Request struct, but we'll keep the field private, so that it can't be changed afterwards:

// src/icmp/mod.rs
pub struct Request {
    dest: ipv4::Addr,
}

impl Request {
    pub fn new(dest: ipv4::Addr) -> Self {
        Self { dest }
    }
}
And finally, we'll move our ping function to Request::send:

// src/icmp/mod.rs
impl Request {
    // cut: pub fn new()

    pub fn send(&self) -> Result<(), String> {
        // same as our old `ping()`
        // except using `self.dest` instead of `dest`
    }
}
For this to work, we need to derive new traits for ipv4::Addr.
We don't want to move it out of self, we just want to pass a copy:

// src/ipv4.rs
#[derive(Clone, Copy)]
pub struct Addr(pub [u8; 4]);

With that change, our main function now looks like:

// src/main.rs
fn main() -> Result<(), Box<dyn Error>> {
    let arg = env::args().nth(1).unwrap_or_else(|| {
        println!("Usage: sup DEST");
        exit(1);
    });

    icmp::Request::new(arg.parse()?).send()?;

    Ok(())
}
Not too bad!
Now, let's add other fields to Request:

// src/icmp/mod.rs
pub struct Request {
    dest: ipv4::Addr,
    ttl: u8,
    timeout: u32,
    data: Option<Vec<u8>>,
}

We'll need to initialize them with defaults in Request::new():

// src/icmp/mod.rs
impl Request {
    pub fn new(dest: ipv4::Addr) -> Self {
        Self {
            dest,
            ttl: 128,
            timeout: 4000,
            data: None,
        }
    }
}

I don't know about you but I like the looks of this.
We'll also have to use them in send(), which now becomes:

impl Request {
    pub fn send(self) -> Result<(), String> {
        let handle = icmp_sys::IcmpCreateFile();

        let data = self.data.unwrap_or_default();

        let reply_size = size_of::<icmp_sys::IcmpEchoReply>();
        let reply_buf_size = reply_size + 8 + data.len();
        let mut reply_buf = vec![0u8; reply_buf_size];

        let ip_options = icmp_sys::IpOptionInformation {
            ttl: self.ttl,
            tos: 0,
            flags: 0,
            options_data: 0,
            options_size: 0,
        };

        match icmp_sys::IcmpSendEcho(
            handle,
            self.dest,
            data.as_ptr(),
            data.len() as u16,
            Some(&ip_options),
            reply_buf.as_mut_ptr(),
            reply_buf_size as u32,
            self.timeout,
        ) {
            0 => Err("IcmpSendEcho failed :(".to_string()),
            _ => Ok(()),
        }
    }
}
A few notes: we changed the signature of send() to take self instead of &self.
Now, Request::send takes ownership of self. The request moves into send(). In other words, it consumes the Request. That request won't be usable for anything else afterwards.
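To make the "consumes the Request" part concrete, here is a minimal sketch. It assumes a stand-in Request that only formats a string instead of calling into Win32, and send_once is a hypothetical helper added for illustration:

```rust
/// Hypothetical stand-in for the article's `Request`; the real one calls Win32.
pub struct Request {
    dest: [u8; 4],
}

impl Request {
    pub fn new(dest: [u8; 4]) -> Self {
        Self { dest }
    }

    /// Takes `self` by value: the request is moved into `send()`.
    pub fn send(self) -> Result<String, String> {
        Ok(format!("sent to {:?}", self.dest))
    }
}

/// Hypothetical helper: builds one request, sends it, returns the outcome.
pub fn send_once() -> Result<String, String> {
    let req = Request::new([8, 8, 8, 8]);
    let outcome = req.send();
    // `req` has been moved into `send()`. Uncommenting the next line fails
    // to compile with error[E0382]: use of moved value: `req`
    // req.send();
    outcome
}
```

The interesting part is the commented-out line: the compiler, not a runtime check, is what stops us from sending the same request twice.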
We also used icmp_sys::IpOptionInformation to specify the TTL (time to live).
And we used Option::unwrap_or_default to provide empty data if none is specified.
This all works beautifully...
..but we still have zero output.
Our request struct now contains fields like ttl and timeout, but, for users of the crate, those are invisible. They can't be read or written.
We can provide a setter:
impl Request {
    pub fn ttl(&mut self, ttl: u8) {
        self.ttl = ttl;
    }
}

And then this works:

let mut req = icmp::Request::new(arg.parse()?);
req.ttl(60);
req.send()?;

But it's not great, especially when you want to specify multiple parameters. It gets long pretty quickly:

let mut req = icmp::Request::new(arg.parse()?);
req.ttl(60);
req.timeout(2000);
req.data("Oh hey".into());
req.send()?;
An alternative is to provide a function that consumes the request.. and then gives it right back!
impl Request {
    pub fn ttl(mut self, ttl: u8) -> Self {
        self.ttl = ttl;
        self
    }
}

// in main
use icmp::Request;

let dest = arg.parse()?;
Request::new(dest).ttl(60).send()?;
We can do the same for timeout:

impl Request {
    pub fn timeout(mut self, timeout: u32) -> Self {
        self.timeout = timeout;
        self
    }
}

As for data, I don't quite like having to call into() explicitly at the callsite. How about we take any type that can be converted into a Vec<u8>?

impl Request {
    pub fn data<D>(mut self, data: D) -> Self
    where
        D: Into<Vec<u8>>,
    {
        self.data = Some(data.into());
        self
    }
}
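To see what that Into<Vec<u8>> bound buys us, here is a small sketch in isolation. The payload function is a hypothetical helper, not part of the crate; it only relies on From implementations that the standard library provides for Vec<u8>:

```rust
/// Hypothetical helper: accepts anything convertible into a byte vector.
pub fn payload<D>(data: D) -> Vec<u8>
where
    D: Into<Vec<u8>>,
{
    data.into()
}

/// Calls `payload` with several different argument types.
pub fn payload_examples() -> Vec<Vec<u8>> {
    vec![
        payload("hi"),               // &str
        payload(String::from("hi")), // String
        payload(vec![104u8, 105]),   // Vec<u8>, no conversion needed
        payload(&b"hi"[..]),         // &[u8]
        payload([104u8, 105]),       // [u8; 2]
    ]
}
```

All five calls produce the same two bytes, and none of them needs an explicit into() at the call site.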
Here's how we can use our new request:
// in main
use icmp::Request;

let dest = arg.parse()?;
Request::new(dest)
    .ttl(60)
    .timeout(1000)
    .data("Pretty cool!")
    .send()?;
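Since the real send() needs Windows, here is a self-contained sketch of the same builder that runs anywhere. It assumes a plain [u8; 4] in place of ipv4::Addr, and swaps send() for a hypothetical describe() method that just reports what would be sent:

```rust
pub struct Request {
    dest: [u8; 4],
    ttl: u8,
    timeout: u32,
    data: Option<Vec<u8>>,
}

impl Request {
    /// Only the destination is mandatory; everything else has a default.
    pub fn new(dest: [u8; 4]) -> Self {
        Self { dest, ttl: 128, timeout: 4000, data: None }
    }

    pub fn ttl(mut self, ttl: u8) -> Self {
        self.ttl = ttl;
        self
    }

    pub fn timeout(mut self, timeout: u32) -> Self {
        self.timeout = timeout;
        self
    }

    pub fn data<D: Into<Vec<u8>>>(mut self, data: D) -> Self {
        self.data = Some(data.into());
        self
    }

    /// Hypothetical stand-in for `send()`: consumes the request and
    /// describes it instead of putting anything on the wire.
    pub fn describe(self) -> String {
        let data = self.data.unwrap_or_default();
        format!(
            "dest={:?} ttl={} timeout={}ms bytes={}",
            self.dest, self.ttl, self.timeout, data.len()
        )
    }
}
```

Calling Request::new([8, 8, 8, 8]).describe() uses every default, while chaining .ttl(60).timeout(1000).data("Pretty cool!") overrides only what we name.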
That is the builder pattern.
There are variations on it that leverage generic types to disallow calling setters more than once, for example, but they're out of scope for now.
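For the curious, here is roughly what one such variation could look like. This is only a sketch of the general "typestate" idea, not something this series uses: the TtlUnset and TtlSet marker types and the effective_ttl and dest getters are invented for illustration.

```rust
use std::marker::PhantomData;

/// Marker types (hypothetical): they exist only at the type level.
pub struct TtlUnset;
pub struct TtlSet;

pub struct Request<State> {
    dest: [u8; 4],
    ttl: u8,
    _state: PhantomData<State>,
}

impl Request<TtlUnset> {
    pub fn new(dest: [u8; 4]) -> Self {
        Request { dest, ttl: 128, _state: PhantomData }
    }

    /// Only exists while the TTL is unset, and returns a different type,
    /// so a second `.ttl(..)` call has no method to resolve to.
    pub fn ttl(self, ttl: u8) -> Request<TtlSet> {
        Request { dest: self.dest, ttl, _state: PhantomData }
    }
}

impl<State> Request<State> {
    /// Available in every state.
    pub fn effective_ttl(&self) -> u8 {
        self.ttl
    }

    pub fn dest(&self) -> [u8; 4] {
        self.dest
    }
}

// Request::new([1, 1, 1, 1]).ttl(60).ttl(61) does not compile:
// `Request<TtlSet>` has no method named `ttl`.
```

The cost is more types and more impl blocks per setter, which is why it tends to be reserved for APIs where calling something twice would be a real bug.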
We have a pretty API now, and I'm pretty happy about it.
Oh, and everything still works.
Mass dynamic loading
In part 5, we said we'd have to come back to src/icmp/icmp_sys.rs at some point in the future. That point is now.
As a reminder, here's a short section of it:

pub type Handle = *const c_void;

use crate::loadlibrary::Library;

type IcmpCreateFile = extern "stdcall" fn() -> Handle;
pub fn IcmpCreateFile() -> Handle {
    let iphlp = Library::new("IPHLPAPI.dll").unwrap();
    let IcmpCreateFile: IcmpCreateFile = unsafe { iphlp.get_proc("IcmpCreateFile").unwrap() };
    IcmpCreateFile()
}

Oof. Not only do we have to replicate the IcmpCreateFile signature, we open IPHLPAPI.dll every time we call it.
That's not great.
We don't even expose IcmpCloseHandle yet. Let's do so now:

type IcmpCloseHandle = extern "stdcall" fn(handle: Handle);
pub fn IcmpCloseHandle(handle: Handle) {
    let iphlp = Library::new("IPHLPAPI.dll").unwrap();
    let IcmpCloseHandle: IcmpCloseHandle = unsafe { iphlp.get_proc("IcmpCloseHandle").unwrap() };
    IcmpCloseHandle(handle)
}
Now, we can at least close the handle when we're done with it:
// src/icmp/mod.rs
impl Request {
    pub fn send(self) -> Result<(), String> {
        // cut: setup (handle, data, reply_buf, ip_options), unchanged

        let ret = icmp_sys::IcmpSendEcho(
            handle,
            self.dest,
            data.as_ptr(),
            data.len() as u16,
            Some(&ip_options),
            reply_buf.as_mut_ptr(),
            reply_buf_size as u32,
            self.timeout,
        );

        // new:
        icmp_sys::IcmpCloseHandle(handle);

        match ret {
            0 => Err("IcmpSendEcho failed :(".to_string()),
            _ => Ok(()),
        }
    }
}
Good! Everything still works.
Now then. There are three functions we're interested in from IPHLPAPI.dll:
- IcmpCreateFile
- IcmpSendEcho
- IcmpCloseHandle
Ideally, we'd want to open the DLL only once, then find all three functions.
We'd want to do that at most once per run, and we'd want to delay doing that until a call to any of those three functions is made.
Since we think of those three functions as belonging together, we can define a struct:
// src/icmp/icmp_sys.rs
pub struct Functions {
    pub IcmpCreateFile: extern "stdcall" fn() -> Handle,
    pub IcmpSendEcho: extern "stdcall" fn(
        handle: Handle,
        dest: ipv4::Addr,
        request_data: *const u8,
        request_size: u16,
        request_options: Option<&IpOptionInformation>,
        reply_buffer: *mut u8,
        reply_size: u32,
        timeout: u32,
    ) -> u32,
    pub IcmpCloseHandle: extern "stdcall" fn(handle: Handle),
}

Then, we could define Functions::get, which looks up all three of them in one fell swoop:

// src/icmp/icmp_sys.rs
impl Functions {
    pub fn get() -> Self {
        let iphlp = Library::new("IPHLPAPI.dll").unwrap();
        Self {
            IcmpCreateFile: unsafe { iphlp.get_proc("IcmpCreateFile").unwrap() },
            IcmpSendEcho: unsafe { iphlp.get_proc("IcmpSendEcho").unwrap() },
            IcmpCloseHandle: unsafe { iphlp.get_proc("IcmpCloseHandle").unwrap() },
        }
    }
}

And now, we can change src/icmp/mod.rs:

// src/icmp/mod.rs
impl Request {
    pub fn send(self) -> Result<(), String> {
        let fns = icmp_sys::Functions::get();
        let handle = (fns.IcmpCreateFile)();

        // cut: a bunch of code

        let ret = (fns.IcmpSendEcho)(
            // cut: arguments
        );

        (fns.IcmpCloseHandle)(handle);

        // cut: rest of function
    }
}
Sure enough, this works fine! And it's better - we only call LoadLibrary once per send(), and everything is nicely grouped.
But we still call LoadLibrary every time we call icmp_sys::Functions::get(), and that's not great.
It would be better if Functions was a singleton.
Can we maybe.. declare it as a static?

// src/icmp/mod.rs
pub static FUNCTIONS: Functions = Functions::get();

No. No we cannot: a static's initializer has to be a constant expression, and Functions::get() is not a const fn.
Now, ordinarily I'd use lazy_static to get out of this jam, but I'm just now seeing that it's "passively maintained", so I went hunting and found once_cell, which promises to do the same thing without macros!
Let's add it:
We can now write this:
// src/icmp/icmp_sys.rs
use once_cell::sync::Lazy;

pub static FUNCTIONS: Lazy<Functions> = Lazy::new(|| Functions::get());
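If you'd rather not pull in a crate, the standard library's std::sync::OnceLock (stable since Rust 1.70) can express the same "initialize on first use, at most once" idea. The sketch below swaps it in for once_cell and replaces the DLL lookup with a hypothetical answer_impl function plus a counter, so we can check that the initializer really runs only once:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Counts how many times the (pretend) expensive lookup ran.
static LOOKUPS: AtomicUsize = AtomicUsize::new(0);

pub struct Functions {
    pub answer: fn() -> u32,
}

// Hypothetical stand-in for a function found in a DLL.
fn answer_impl() -> u32 {
    42
}

impl Functions {
    // Private, like in the article: only `functions()` may call it.
    fn get() -> Self {
        LOOKUPS.fetch_add(1, Ordering::SeqCst);
        Self { answer: answer_impl }
    }
}

/// Returns the shared instance, building it on the first call only.
pub fn functions() -> &'static Functions {
    static FUNCTIONS: OnceLock<Functions> = OnceLock::new();
    FUNCTIONS.get_or_init(Functions::get)
}

pub fn lookups() -> usize {
    LOOKUPS.load(Ordering::SeqCst)
}
```

However many times functions() is called, and from however many threads, Functions::get() runs once and everyone shares the result.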
Let's also make Functions::get() private, so it can't be called from outside this module.

impl Functions {
    // this used to be 'pub'
    fn get() -> Self { /* ... */ }
}

We can now change:

// src/icmp/mod.rs
impl Request {
    pub fn send(self) -> Result<(), String> {
        let fns = icmp_sys::Functions::get();
        // etc.
    }
}

to:

// src/icmp/mod.rs
impl Request {
    pub fn send(self) -> Result<(), String> {
        let fns = &icmp_sys::FUNCTIONS;
        // etc.
    }
}
And everything still works!
It's kind of annoying to have to wrap every function in parentheses though:
let handle = (icmp_sys::FUNCTIONS.IcmpCreateFile)();
This is required because IcmpCreateFile is a field, not a method. Without the parentheses, the compiler looks for a method with that name.
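Here's a tiny sketch of the difference. The Table type is made up for illustration: it has a field and a method that share the name op, which Rust allows precisely because the two call syntaxes are distinct.

```rust
pub struct Table {
    /// A field holding a function pointer.
    pub op: fn(u32) -> u32,
}

impl Table {
    /// A method with the same name as the field. It forwards to the field
    /// and adds 1, so the two are easy to tell apart.
    pub fn op(&self, x: u32) -> u32 {
        (self.op)(x) + 1
    }
}

fn double(x: u32) -> u32 {
    x * 2
}

/// Hypothetical constructor used by the example.
pub fn table() -> Table {
    Table { op: double }
}
```

With let t = table();, the expression (t.op)(2) calls the function pointer stored in the field, while t.op(2) calls the method.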
Also, it's kind of artificial to have to use icmp_sys::FUNCTIONS - why are they not directly exported by icmp_sys? Like a regular module?
Let's see if we can cook something up with macros.
Ideally, we'd like to call a macro like this:
bind! {
    fn IcmpCreateFile() -> Handle;
    fn IcmpCloseHandle(handle: Handle) -> ();
    fn IcmpSendEcho(
        handle: Handle,
        dest: ipv4::Addr,
        request_data: *const u8,
        request_size: u16,
        request_options: Option<&IpOptionInformation>,
        reply_buffer: *mut u8,
        reply_size: u32,
        timeout: u32
    ) -> u32;
}

And have it:
- Declare a struct with extern "stdcall" fn members
- Declare a lazy static with once_cell
- Declare and export wrapper functions
So, let's give it a go, starting simple:

macro_rules! bind {
    ($(fn $name:ident($($arg:ident: $type:ty),*) -> $ret:ty;)*) => {
        struct Functions {
            $(pub $name: extern "stdcall" fn ($($arg: $type),*) -> $ret),*
        }

        $(
            pub fn $name($($arg: $type),*) -> $ret {
                unimplemented!()
            }
        )*
    };
}
To see this macro in action, we can use the wonderful cargo-expand:
> cargo install cargo-expand
> rustup toolchain add nightly
> cargo expand icmp::icmp_sys
So far so good! Let's fill in the rest:
macro_rules! bind {
    ($(fn $name:ident($($arg:ident: $type:ty),*) -> $ret:ty;)*) => {
        struct Functions {
            $(pub $name: extern "stdcall" fn ($($arg: $type),*) -> $ret),*
        }

        static FUNCTIONS: once_cell::sync::Lazy<Functions> =
            once_cell::sync::Lazy::new(|| {
                let lib = crate::loadlibrary::Library::new("IPHLPAPI.dll").unwrap();
                Functions {
                    $($name: unsafe { lib.get_proc(stringify!($name)).unwrap() }),*
                }
            });

        $(
            #[inline(always)]
            pub fn $name($($arg: $type),*) -> $ret {
                (FUNCTIONS.$name)($($arg),*)
            }
        )*
    };
}

Note: we hardcoded "IPHLPAPI.dll" here, but it could just as well have been a macro argument.
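If you want to play with the macro without a Windows machine, here is a portable sketch of the same shape. It is not the macro above: it assumes plain fn pointers instead of extern "stdcall", std's OnceLock instead of once_cell, and an extra "= implementation" part in each declaration (add_impl and double_impl are made up) in place of the DLL lookup. NAMES is also an addition, there to show what stringify! produces.

```rust
use std::sync::OnceLock;

// Hypothetical local implementations standing in for functions found in a DLL.
fn add_impl(a: u32, b: u32) -> u32 {
    a + b
}

fn double_impl(x: u32) -> u32 {
    x * 2
}

macro_rules! bind {
    ($(fn $name:ident($($arg:ident: $type:ty),*) -> $ret:ty = $imp:expr;)*) => {
        // One function-pointer field per declared function.
        #[allow(non_snake_case)]
        struct Functions {
            $($name: fn($($arg: $type),*) -> $ret),*
        }

        // The declared names as strings: what a real lookup would search for.
        pub const NAMES: &[&str] = &[$(stringify!($name)),*];

        // Built on first use, then shared by every wrapper.
        fn functions() -> &'static Functions {
            static FUNCTIONS: OnceLock<Functions> = OnceLock::new();
            FUNCTIONS.get_or_init(|| Functions { $($name: $imp),* })
        }

        // One exported wrapper per declared function.
        $(
            #[allow(non_snake_case)]
            pub fn $name($($arg: $type),*) -> $ret {
                (functions().$name)($($arg),*)
            }
        )*
    };
}

bind! {
    fn Add(a: u32, b: u32) -> u32 = add_impl;
    fn Double(x: u32) -> u32 = double_impl;
}
```

After the bind! invocation, Add(2, 3) and Double(21) are ordinary functions as far as callers are concerned; the struct and the lazy initialization stay hidden behind them.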
Let's check the macro expansion again:
Wonderful.
Replies
Right now our ping API returns a Result<(), E> - in other words, it only signals success or failure, and tells us nothing about the reply.
But when sending an ICMP Echo, we want to display some statistics, like the RTT, i.e. "round-trip time" - the time it took for a packet to travel all the way to the host and all the way back to us.
Windows's ping.exe shows this:
We can sorta simulate this output if we turn main into this:
fn main() -> Result<(), Box<dyn Error>> {
    let arg = env::args().nth(1).unwrap_or_else(|| {
        println!("Usage: sup DEST");
        exit(1);
    });

    use icmp::Request;
    let dest = arg.parse()?;
    let data = "O Romeo.";

    println!();
    println!("Pinging {:?} with {} bytes of data:", dest, data.len());

    use std::{thread::sleep, time::Duration};

    for _ in 0..4 {
        match Request::new(dest).ttl(128).timeout(4000).data(data).send() {
            Ok(_) => println!("Reply from {:?}: bytes={} time=? TTL=?", dest, data.len()),
            Err(_) => println!("Something went wrong"),
        }

        sleep(Duration::from_secs(1));
    }
    println!();

    Ok(())
}
Not too bad! Let's fill the missing bits.
Let's take a look at IcmpEchoReply, the Win32 type:

#[repr(C)]
#[derive(Debug)]
pub struct IcmpEchoReply {
    pub address: ipv4::Addr,
    pub status: u32,
    pub rtt: u32,
    pub data_size: u16,
    pub reserved: u16,
    pub data: *const u8,
    pub options: IpOptionInformation,
}

We can't just make Request::send return that, for two reasons:
- It's private to icmp::icmp_sys
- It contains a raw pointer (data).
We can, however, pick the parts we're interested in.
I'm thinking this will do nicely:

// src/icmp/mod.rs
use std::time::Duration;

pub struct Reply {
    pub addr: ipv4::Addr,
    pub data: Vec<u8>,
    pub rtt: Duration,
    pub ttl: u8,
}

Let's adjust Request::send to return one of these:

// src/icmp/mod.rs
use std::{
    mem::{size_of, transmute},
    slice,
};

pub fn send(self) -> Result<Reply, String> {
    // cut: everything up to and including the IcmpSendEcho call, which sets `ret`

    match ret {
        0 => Err("IcmpSendEcho failed :(".to_string()),
        _ => {
            let reply: &icmp_sys::IcmpEchoReply = unsafe { transmute(&reply_buf[0]) };

            // this bit of code was explained in part 3
            let data: Vec<u8> = unsafe {
                let data_ptr: *const u8 = transmute(&reply_buf[reply_size + 8]);
                slice::from_raw_parts(data_ptr, reply.data_size as usize)
            }
            .into();

            Ok(Reply {
                addr: reply.address,
                data,
                rtt: Duration::from_millis(reply.rtt as u64),
                ttl: reply.options.ttl,
            })
        }
    }
}
Great!
We even used std::time::Duration, so that we get a Debug implementation for free that formats a duration as seconds, milliseconds, microseconds.. whatever is most appropriate.
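The "raw struct in a byte buffer, friendly struct out" step can be tried without Windows. The sketch below is an assumption-heavy stand-in: RawReply is a simplified, made-up header (not the real Win32 layout), the payload is assumed to follow it directly, and it uses ptr::read_unaligned with bounds checks rather than the transmute from the article. sample_buffer is a hypothetical helper that fakes what the OS would write.

```rust
use std::mem::size_of;
use std::ptr;
use std::time::Duration;

/// Hypothetical, simplified header. NOT the real Win32 reply layout.
#[repr(C)]
#[derive(Clone, Copy)]
#[allow(dead_code)]
struct RawReply {
    status: u32,
    rtt: u32,
    data_size: u16,
    reserved: u16,
}

#[derive(Debug, PartialEq)]
pub struct Reply {
    pub data: Vec<u8>,
    pub rtt: Duration,
}

/// Copies the interesting parts out of `buf`, or returns None if it is too short.
pub fn parse_reply(buf: &[u8]) -> Option<Reply> {
    let header_len = size_of::<RawReply>();
    if buf.len() < header_len {
        return None;
    }
    // SAFETY: `buf` holds at least `header_len` bytes, every bit pattern is a
    // valid `RawReply`, and `read_unaligned` has no alignment requirement.
    let raw: RawReply = unsafe { ptr::read_unaligned(buf.as_ptr() as *const RawReply) };
    let end = header_len.checked_add(raw.data_size as usize)?;
    let data = buf.get(header_len..end)?.to_vec();
    Some(Reply { data, rtt: Duration::from_millis(raw.rtt as u64) })
}

/// Hypothetical helper: builds a buffer the way our pretend OS would fill it.
pub fn sample_buffer(rtt_ms: u32, payload: &[u8]) -> Vec<u8> {
    let mut buf = Vec::new();
    buf.extend_from_slice(&0u32.to_ne_bytes()); // status
    buf.extend_from_slice(&rtt_ms.to_ne_bytes()); // rtt
    buf.extend_from_slice(&(payload.len() as u16).to_ne_bytes()); // data_size
    buf.extend_from_slice(&0u16.to_ne_bytes()); // reserved
    buf.extend_from_slice(payload);
    buf
}
```

Parsing sample_buffer(12, b"O Romeo.") yields a Reply whose rtt prints as 12ms with {:?}, which is the Duration formatting we get for free.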
Let's use it from main:
// src/main.rs
// in main:
for _ in 0..4 {
    match Request::new(dest).ttl(128).timeout(4000).data(data).send() {
        Ok(res) => println!(
            "Reply from {:?}: bytes={} time={:?} TTL={}",
            res.addr,
            res.data.len(),
            res.rtt,
            res.ttl,
        ),
        Err(_) => println!("Something went wrong"),
    }

    sleep(Duration::from_secs(1));
}
Aaand let's check out the results:
The builder pattern is a powerful tool in your API design toolbelt. It allows providing defaults while keeping call sites readable.
Taking self is a good way to consume a value, making sure it cannot be re-used or mutated afterwards. In combination with the builder pattern, it prevents configure-after-use bugs.
The Into trait allows taking an argument of any type that can be converted into the type we really want.
Crates like lazy_static and once_cell allow having lazily-initialized statics, which are useful for complicated singletons, like.. a set of external functions all from the same DLL, like we have here.
Rust macros can avoid a lot of repetitive typing or copy/pasting, which in turn avoids bugs.
What's next?
So that's it, right? We're done? We've made a tool that looks and feels like ping.exe, although with fewer options. It also doesn't display statistics, like minimum/average/maximum round-trip time. That part is left as an exercise to the reader.
But we're far from done.
We've been relying on Windows to "speak" ICMP for us. It's been doing all the heavy lifting!
Surely we can go lower-level than just calling the Windows APIs...