FFI-safe types in Rust, newtypes and MaybeUninit
👋 This page was last updated ~5 years ago. Just so you know.
It's time to make sup, our own take on ping, use the Win32 APIs to send an ICMP echo. Earlier we discovered that Windows's ping.exe used IcmpSendEcho2Ex. But for our purposes, the simpler IcmpSendEcho will do just fine.
As we mentioned earlier, it's provided by IPHLPAPI.dll, and its C declaration is:
IPHLPAPI_DLL_LINKAGE DWORD IcmpSendEcho(
HANDLE IcmpHandle,
IPAddr DestinationAddress,
LPVOID RequestData,
WORD RequestSize,
PIP_OPTION_INFORMATION RequestOptions,
LPVOID ReplyBuffer,
DWORD ReplySize,
DWORD Timeout
);
Compared to MessageBoxA, there are a lot more types going on!
WORD is typically a u16, whereas DWORD is a u32.
LPVOID is a Long Pointer to Void, so *const c_void will do.
Same goes for HANDLE, according to Windows Data Types.
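To keep that mapping in one place, here's a minimal sketch of what the aliases could look like. The alias names (Word, Dword, Lpvoid, Handle) are made up for illustration, they don't come from any crate, and nothing here touches Windows, so it runs anywhere:

```rust
use std::ffi::c_void;
use std::mem::size_of;

// Hypothetical alias names, chosen to mirror the Win32 typedefs.
type Word = u16; // WORD: 16-bit unsigned integer
type Dword = u32; // DWORD: 32-bit unsigned integer
type Lpvoid = *const c_void; // LPVOID: untyped pointer
type Handle = *const c_void; // HANDLE: opaque, pointer-sized value

fn main() {
    println!("Word   = {} bytes", size_of::<Word>());
    println!("Dword  = {} bytes", size_of::<Dword>());
    println!("Lpvoid = {} bytes", size_of::<Lpvoid>());
    println!("Handle = {} bytes", size_of::<Handle>());
}
```

The two pointer-sized aliases follow the target: 8 bytes on 64-bit, 4 bytes on 32-bit.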
And then there's IPAddr, which is an IPv4 address (there's a separate family of functions for IPv6). We know that IP addresses are written by humans as x.y.z.w, where each letter is a number between 0 and 255.
Let's make a proper type for that:
// this is a **newtype**
// it has the same memory layout as `[u8; 4]`, but we can
// define our own implementations of traits..
struct IPAddr([u8; 4]);
// ..like this trait for example!
impl fmt::Debug for IPAddr {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
// here, self is effectively a tuple with a single
// element of type `[u8; 4]`. `self.0` accesses the
// first element of a tuple, and having `[a, b, c, d]`
// on the left does a destructuring assignment, letting us
// bind elements of the array to different names.
// this works because u8 is Copy!
let [a, b, c, d] = self.0;
write!(f, "{}.{}.{}.{}", a, b, c, d)
}
}
Now let's take it out for a spin:
fn main() {
let addr = IPAddr([8, 8, 8, 8]);
println!("addr = {:?}", addr);
let addr_as_integer: u32 = unsafe { transmute(addr) };
println!("addr_as_integer = {}", addr_as_integer);
}
Hey, that looks familiar!
By using transmute, we were able to reinterpret our IPAddr type as a 32-bit integer, and we stumbled upon 134744072, the same value rohitab's API monitor showed us in part 2.
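If transmute feels a bit heavy-handed for this, the same number can be obtained without any unsafe code. This is only a sketch: the as_u32 helper is a name I'm making up here, and the #[repr(transparent)] attribute is an addition that makes the "same layout as [u8; 4]" claim a guarantee rather than an observation.

```rust
use std::fmt;

// `repr(transparent)` guarantees the newtype is laid out exactly like its
// single field. (An addition for this sketch.)
#[repr(transparent)]
#[derive(Clone, Copy)]
struct IPAddr([u8; 4]);

impl fmt::Debug for IPAddr {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        let [a, b, c, d] = self.0;
        write!(f, "{}.{}.{}.{}", a, b, c, d)
    }
}

impl IPAddr {
    // Hypothetical helper: reads the four bytes as a native-endian u32,
    // which is the value a transmute to u32 would produce.
    fn as_u32(self) -> u32 {
        u32::from_ne_bytes(self.0)
    }
}

fn main() {
    let addr = IPAddr([8, 8, 8, 8]);
    println!("addr = {:?}", addr);
    println!("addr_as_integer = {}", addr.as_u32()); // 134744072
}
```

Since all four bytes are identical in 8.8.8.8, the result doesn't depend on endianness. For other addresses, it would.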
Next let's take a look at RequestOptions. PIP_OPTION_INFORMATION is a pointer to an IP_OPTION_INFORMATION structure, for which MSDN gives us the following C declaration:
typedef struct ip_option_information32 {
UCHAR Ttl;
UCHAR Tos;
UCHAR Flags;
UCHAR OptionsSize;
UCHAR POINTER_32 *OptionsData;
} IP_OPTION_INFORMATION32, *PIP_OPTION_INFORMATION32;
We recognize ttl (time to live). But what's that POINTER_32 thing?
We're currently working on a 64-bit Windows, and the docs for IcmpSendEcho say the following:
RequestOptions
A pointer to the IP header options for the request, in the form of an IP_OPTION_INFORMATION structure. On a 64-bit platform, this parameter is in the form for an IP_OPTION_INFORMATION32 structure.
That's why I pulled up the C declaration for IP_OPTION_INFORMATION32. If we look at the other one, IP_OPTION_INFORMATION, which is used on 32-bit Windows:
typedef struct ip_option_information {
UCHAR Ttl;
UCHAR Tos;
UCHAR Flags;
UCHAR OptionsSize;
PUCHAR OptionsData;
} IP_OPTION_INFORMATION, *PIP_OPTION_INFORMATION;
...it's pretty much the same thing, except OptionsData is a regular pointer.
But! A regular pointer on 64-bit would be, well, 64 bits. Whereas on 32-bit it's 32-bit. So by having two different structs, Windows ensures that, whether it's called from a 64-bit process or a 32-bit process, the structure has the exact same size and layout.
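To see that difference in numbers, here's a rough sketch comparing the two shapes. The struct names OptionsNative and Options32 are invented for this example, they're stand-ins for the two C declarations and not real bindings:

```rust
use std::mem::size_of;

// Stand-in for the declaration with a regular pointer: its size follows
// the pointer width of the target.
#[repr(C)]
#[allow(dead_code)]
struct OptionsNative {
    ttl: u8,
    tos: u8,
    flags: u8,
    options_size: u8,
    options_data: *mut u8,
}

// Stand-in for the "32" declaration: the last field is always 32 bits wide.
#[repr(C)]
#[allow(dead_code)]
struct Options32 {
    ttl: u8,
    tos: u8,
    flags: u8,
    options_size: u8,
    options_data: u32,
}

fn main() {
    println!("pointer width = {} bytes", size_of::<usize>());
    println!("OptionsNative = {} bytes", size_of::<OptionsNative>());
    println!("Options32     = {} bytes", size_of::<Options32>());
}
```

On a 64-bit target the pointer version comes out at 16 bytes (the pointer needs 8-byte alignment, so 4 bytes of padding sneak in before it), whereas the fixed-width version is 8 bytes everywhere.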
But why?
My guess is that this struct eventually gets passed to the kernel, and although, on 64-bit Windows, processes can be either 32-bit or 64-bit, via WoW64, by the time we hit the network stack of the kernel, that distinction is gone, so there has to be a single struct declaration with a single layout. The 32-bit version was there first, so it's used for both architectures, as the common denominator.
Cool bear's hot tip
Amos is making guesses here, but if you know better (for example, you've worked at Microsoft at the time these decisions were made), feel free to send him an erratum on Twitter.
In this case, we're not really planning on passing "IP options", so we can just use any old 32-bit-wide type. And so our IpOptionInformation type reads:
#[repr(C)]
struct IpOptionInformation {
ttl: u8,
tos: u8,
flags: u8,
options_size: u8,
// actually a 32-bit pointer, but, that's a Windows
// oddity and I couldn't find a built-in Rust type for it.
options_data: u32,
}
Notice that we used #[repr(C)]. What does that mean?
Well, there are several ways to lay out a struct in memory. Let's take this one:
struct Foo {
a: u8,
b: u32,
}
What's the size of that struct? 5 bytes, right? One for a, four for b:
Right??
Wrong.
The layout actually chosen by rustc here takes 8 bytes: it includes 3 bytes of padding, so that b stays 4-byte aligned.
If we wanted to have the compact representation we were thinking of, we could use repr(packed):
fn main() {
struct Foo {
a: u8,
b: u32,
}
#[repr(packed)]
struct FooPacked {
a: u8,
b: u32,
}
use std::mem::size_of;
println!("Foo = {}", size_of::<Foo>());
println!("FooPacked = {}", size_of::<FooPacked>());
}
The reason for that is performance. Conventional wisdom says: it's faster to access values that are aligned. So, for a 4-byte value, you'd store it at an address that's a multiple of 4.
Cool bear's hot tip
That's really hard to benchmark correctly.
Also, it doesn't seem all that true for modern x86, and when dealing with larger data sets, padding actually hurts performance, because of caching.
Finally, you need to be aware that "misaligned accesses are okay" is not true for a large number of non-x86 processors, see this answer.
Anyway, different compilers have different ways of laying out structs in memory, and since we're interacting with about 65 million lines of C/C++ code, we use #[repr(C)].
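With #[repr(C)], the layout of our little Foo becomes predictable: fields stay in declaration order, and padding is inserted so that each field is aligned. Here's a small sketch that measures it on a live value. The offset_of_b helper is something I'm making up for the occasion, and the numbers assume a target where u32 is 4-byte aligned (which is the case on x86 and x86_64):

```rust
use std::mem::{align_of, size_of};

#[repr(C)]
#[derive(Default)]
#[allow(dead_code)]
struct Foo {
    a: u8,
    b: u32,
}

// Hypothetical helper: byte offset of `b` from the start of a `Foo`,
// computed by comparing addresses on an actual value.
fn offset_of_b() -> usize {
    let foo = Foo::default();
    let base = &foo as *const Foo as usize;
    let field = &foo.b as *const u32 as usize;
    field - base
}

fn main() {
    println!("size_of  Foo = {}", size_of::<Foo>());
    println!("align_of Foo = {}", align_of::<Foo>());
    println!("offset of b  = {}", offset_of_b());
}
```

So a sits at offset 0, then come three bytes of padding, then b at offset 4, for a total of 8 bytes.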
Does this actually make a difference for this struct?
We can check with the memoffset crate.
fn main() {
// first, let's declare both structs: with Rust repr
struct IOI_Rust {
ttl: u8,
tos: u8,
flags: u8,
options_size: u8,
options_data: u32,
}
// and C repr
#[repr(C)]
struct IOI_C {
ttl: u8,
tos: u8,
flags: u8,
options_size: u8,
options_data: u32,
}
use memoffset::span_of;
use std::mem::size_of;
// let's make a quick macro, this will make this a lot easier
macro_rules! print_offset {
// the macro takes one identifier (the struct's name), then a tuple
// of identifiers (the field names)
($type: ident, ($($field: ident),*)) => {
// `$type` is an identifier, but we're going to
// print it out, so we need it as a string instead.
let t = stringify!($type);
// this will repeat for each $field
$(
let f = stringify!($field);
let span = span_of!($type, $field);
println!("{:10} {:15} {:?}", t, f, span);
)*
// finally, print the total struct size
let ts = size_of::<$type>();
println!("{:10} {:15} {}", t, "(total)", ts);
println!();
};
}
print_offset!(IOI_Rust, (ttl, tos, flags, options_size, options_data));
print_offset!(IOI_C, (ttl, tos, flags, options_size, options_data));
}
Here's the output from this program:
So it does make a difference in our case. As a diagram now:
So if we hadn't used #[repr(C)], we would have been passing garbage for all the fields. And that's the scary thing with extern functions.
Since we provide our own declarations, and the compiler takes us at our word, we'd better get it right.
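One cheap way to catch at least some of those mistakes is to have the compiler check the sizes we expect. This is a sketch, not something the rest of this series relies on: the expected values (8 bytes, 4-byte alignment) are what I'd expect from the C declaration of IP_OPTION_INFORMATION32, so treat them as assumptions.

```rust
use std::mem::{align_of, size_of};

#[repr(C)]
#[allow(dead_code)]
struct IpOptionInformation {
    ttl: u8,
    tos: u8,
    flags: u8,
    options_size: u8,
    options_data: u32,
}

// These are evaluated at compile time: if the layout ever drifts away
// from what we assume, the build fails instead of the call misbehaving.
const _: () = assert!(size_of::<IpOptionInformation>() == 8);
const _: () = assert!(align_of::<IpOptionInformation>() == 4);

fn main() {
    println!(
        "IpOptionInformation: {} bytes, aligned to {}",
        size_of::<IpOptionInformation>(),
        align_of::<IpOptionInformation>()
    );
}
```

It won't catch two swapped fields of the same size, but it does catch a forgotten field or the wrong integer width.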
Putting all of this knowledge together, we can tentatively write out the type for IcmpSendEcho as:
type Handle = *const c_void;
type IcmpSendEcho = extern "stdcall" fn(
handle: Handle,
dest: IPAddr,
request_data: *const u8,
request_size: u16,
request_options: Option<&IpOptionInformation>,
reply_buffer: *mut u8,
reply_size: u32,
timeout: u32,
) -> u32;
Let's unpack this (ha!). The request_data parameter can contain anything we want when sending ICMP echo messages. Remember, the Windows ping.exe just sends a bunch of letters from the alphabet.
Request options is a pointer to an IpOptionInformation - but it could also be NULL. We could have written that parameter as:
type IcmpSendEcho = extern "stdcall" fn(
// ...
request_options: *const IpOptionInformation,
But instead we wrote it as:
type IcmpSendEcho = extern "stdcall" fn(
// ...
request_options: Option<&IpOptionInformation>,
Because:
- Both of those are FFI-safe - they are both just one regular pointer
- The former (*const X) is a lot more annoying to use from Rust code.
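The first point can be checked without calling into Windows at all. In this sketch, ttl_or_default plays the role of the C side (it only ever sees a raw, possibly-null pointer) and call is the Rust side. Both names, and the default value of 64, are made up for the example:

```rust
use std::mem::{size_of, transmute};

#[repr(C)]
#[allow(dead_code)]
struct IpOptionInformation {
    ttl: u8,
    tos: u8,
    flags: u8,
    options_size: u8,
    options_data: u32,
}

// Stand-in for a C function: it receives what C would see, which is a
// pointer that may or may not be null.
extern "C" fn ttl_or_default(opts: *const IpOptionInformation) -> u8 {
    if opts.is_null() {
        64 // arbitrary default, for illustration only
    } else {
        unsafe { (*opts).ttl }
    }
}

// Rust-side caller: `Option<&T>` has the same representation as a
// pointer, with `None` being the null pointer.
fn call(opts: Option<&IpOptionInformation>) -> u8 {
    let raw: *const IpOptionInformation = unsafe { transmute(opts) };
    ttl_or_default(raw)
}

fn main() {
    let opts = IpOptionInformation { ttl: 128, tos: 0, flags: 0, options_size: 0, options_data: 0 };
    println!("Option<&T> = {} bytes", size_of::<Option<&IpOptionInformation>>());
    println!("*const T   = {} bytes", size_of::<*const IpOptionInformation>());
    println!("with Some: ttl = {}", call(Some(&opts)));
    println!("with None: ttl = {}", call(None));
}
```

The transmute is only there to make the equivalence visible. In the real signature, we declare the parameter as Option<&IpOptionInformation> directly and no conversion is needed.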
Finally, reply_buffer is an output parameter (IcmpSendEcho will write to it), so it needs to be a *mut pointer, not a *const one. We'll get to what's in the reply buffer later, so for now we'll just leave it as raw bytes (hence, *mut u8).
Back to our regularly-scheduled Win32 API calling
It looks like IcmpSendEcho takes a Handle, and after a quick search on MSDN, we find that IcmpCreateFile is the right function for us. Its declaration is a lot simpler:
IPHLPAPI_DLL_LINKAGE HANDLE IcmpCreateFile();
No parameters, great:
type IcmpCreateFile = extern "stdcall" fn() -> Handle;
Alright, it's time to call some functions!
First let's retrieve both their addresses and create an "ICMP file", whatever that means:
fn main() {
unsafe {
let h = LoadLibraryA("IPHLPAPI.dll\0".as_ptr());
let IcmpCreateFile: IcmpCreateFile =
transmute(GetProcAddress(h, "IcmpCreateFile\0".as_ptr()));
let IcmpSendEcho: IcmpSendEcho = transmute(GetProcAddress(h, "IcmpSendEcho\0".as_ptr()));
let handle = IcmpCreateFile();
println!("handle = {:?}", handle);
}
}
> cargo run
handle = 0x246c5e0ee30
Looks good! There's a troubling lack of error handling so far, but since we got this far we can be pretty sure that we loaded the right library, and spelled IcmpCreateFile correctly.
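About that lack of error handling: both LoadLibraryA and GetProcAddress signal failure by returning a null pointer, and transmuting a null pointer into a Rust fn type is undefined behavior, so the check has to happen before the transmute. Here's a minimal, platform-independent sketch of what that check could look like. The check_ptr helper is hypothetical, it's not part of any Win32 binding:

```rust
use std::ffi::c_void;
use std::ptr;

// Hypothetical helper: turns a possibly-null pointer returned by a
// foreign function into a Result, naming the call that produced it.
fn check_ptr(ptr: *const c_void, what: &str) -> Result<*const c_void, String> {
    if ptr.is_null() {
        Err(format!("{} returned a null pointer", what))
    } else {
        Ok(ptr)
    }
}

fn main() {
    // Stand-ins for what LoadLibraryA / GetProcAddress would return.
    let value = 42u8;
    let good = &value as *const u8 as *const c_void;
    let bad: *const c_void = ptr::null();

    println!("{:?}", check_ptr(good, "LoadLibraryA").is_ok());
    println!("{:?}", check_ptr(bad, "GetProcAddress"));
}
```

In our program, that would mean running each returned pointer through something like this, and only then handing it to transmute.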
We're all set to call IcmpSendEcho:
// in main, in unsafe block:
let handle = IcmpCreateFile();
println!("handle = {:?}", handle);
// let's send some culture down the internet pipes
let data = "O Romeo, Romeo. Reachable art thou Romeo?";
// this will be written to, so it needs to be `mut`.
// I'm picking 128 bytes here because I expect the reply
// to be small
let mut reply = vec![0u8; 128];
let ret = IcmpSendEcho(
handle,
IPAddr([8, 8, 8, 8]), // destination
data.as_ptr(), // request data
data.len() as u16,
Some(&IpOptionInformation {
ttl: 128, // time to live
tos: 0,
flags: 0,
options_data: 0,
options_size: 0,
}),
reply.as_mut_ptr(), // reply buffer
reply.len() as u32,
4000, // timeout (4 seconds)
);
println!("ret = {}", ret);
Did it work?
> cargo run
handle = 0x1bec649ec50
ret = 1
Time for a moment of doubt. I'm pretty sure most Win32 functions return 0 on success, but.. maybe this one is different?
Return Value
The IcmpSendEcho function returns the number of ICMP_ECHO_REPLY structures stored in the ReplyBuffer. The status of each reply is contained in the structure. If the return value is zero, call GetLastError for additional error information.
It is! It is different. Our return value of 1 is actually good news, everyone.
So there you have it, we've made our own ping, and as you can see, it works great. Thanks for following the series, next time we'll cover bwahahah sorry I can't type this with a straight face - of course we're not done.
First of all, we haven't examined the reply buffer at all - let's do so, using the pretty-hex crate.
> cargo add pretty-hex
Adding pretty-hex v0.1.1 to dependencies
Cool bear's hot tip
Note: at the time of writing, cargo add and cargo rm were not builtins, they were provided by the cargo-edit crate, which I, cool bear, fully endorse. (cargo add has since become a built-in, as of Rust 1.62.)
Since I don't remember how to use pretty-hex, I'm going to generate and open the docs locally:
> cargo doc --open
Finished dev [unoptimized + debuginfo] target(s) in 0.01s
Opening C:\Users\amos\sup\target\doc\sup\index.html
Cool bear's hot tip
This is a cargo built-in, and it's very good.
It works offline, so if you're a TV writer looking for a plot device, you can have a character use this from a fully-isolated basement to get out of a precarious situation.
Oh! That's easy.
// in main, in unsafe block, after `IcmpSendEcho` call:
use pretty_hex::*;
println!("{:?}", reply.hex_dump());
Now that's interesting. Along with some non-text data, we got our request data back!
MSDN docs told us the reply buffer actually contained a series of ICMP_ECHO_REPLY structs, so let's take a look at that declaration:
typedef struct icmp_echo_reply {
IPAddr Address;
ULONG Status;
ULONG RoundTripTime;
USHORT DataSize;
USHORT Reserved;
PVOID Data;
struct ip_option_information Options;
} ICMP_ECHO_REPLY, *PICMP_ECHO_REPLY;
Heyy, we know almost all of these! We already have IPAddr, and we already have IpOptionInformation. As for ULONG and USHORT, they're just u32 and u16.
Time to get binding:
#[repr(C)]
#[derive(Debug)]
struct IcmpEchoReply {
address: IPAddr,
status: u32,
rtt: u32,
data_size: u16,
reserved: u16,
data: *const u8,
options: IpOptionInformation,
}
For inspection purposes, we've derived the Debug trait for this struct. Since it contains an IpOptionInformation, we'll need to add #[derive(Debug)] to it as well.
Now, here's one thing we could do. First define IP options separately, to make the IcmpSendEcho call more readable:
let ip_opts = IpOptionInformation {
    ttl: 128,
    tos: 0,
    flags: 0,
    options_data: 0,
    options_size: 0,
};
And then declare a single IcmpEchoReply - but don't initialize it.
// First off, we need to adjust the signature of `IcmpSendEcho` so that it accepts
// a pointer to an IcmpEchoReply, not a u8 slice:
type IcmpSendEcho = extern "stdcall" fn(
// omitted: other params
reply_buffer: *mut IcmpEchoReply,
) -> u32;
// Now onto MaybeUninit
use std::mem;
let mut reply: mem::MaybeUninit<IcmpEchoReply> = mem::MaybeUninit::uninit();
let ret = IcmpSendEcho(
handle,
IPAddr([8, 8, 8, 8]),
data.as_ptr(),
data.len() as u16,
Some(&ip_opts),
reply.as_mut_ptr(),
mem::size_of::<IcmpEchoReply>() as u32,
4000,
);
if ret == 0 {
panic!("IcmpSendEcho failed! ret = {}", ret);
}
let reply = reply.assume_init();
println!("{:#?}", reply);
MaybeUninit was recently stabilized (see the 1.36 changelog). It allows us to tell Rust to allocate a value, but until we call assume_init, to treat it as uninitialized. It basically leverages the type system to prevent undefined behavior.
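Here's the same pattern in isolation, so it can be tried on any platform. Everything in this sketch is a stand-in: fake_send imitates a foreign function with an output parameter, and Reply is a made-up struct, not the real ICMP_ECHO_REPLY.

```rust
use std::mem::MaybeUninit;

#[repr(C)]
#[derive(Debug, PartialEq)]
struct Reply {
    status: u32,
    rtt: u32,
}

// Stand-in for a foreign function with an out-parameter: on success it
// writes to `out` and returns 1, on failure it leaves `out` untouched
// and returns 0.
extern "C" fn fake_send(succeed: bool, out: *mut Reply) -> u32 {
    if succeed {
        unsafe { out.write(Reply { status: 0, rtt: 12 }) };
        1
    } else {
        0
    }
}

fn send(succeed: bool) -> Option<Reply> {
    // Reserve space for a Reply, without initializing it.
    let mut reply = MaybeUninit::<Reply>::uninit();
    let ret = fake_send(succeed, reply.as_mut_ptr());
    if ret == 0 {
        // Nothing was written, so we must not call `assume_init`.
        return None;
    }
    // The callee promised it wrote a full Reply.
    Some(unsafe { reply.assume_init() })
}

fn main() {
    println!("{:?}", send(true));
    println!("{:?}", send(false));
}
```

The type system can't verify the callee's promise, which is why assume_init is unsafe. What it does give us is that we can't accidentally read the value before making that call.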
Here, we only assume it's initialized if IcmpSendEcho succeeds, which I believe is correct. However, we're out of luck:
...because the "reply" for IcmpSendEcho is weirder. The docs say:
ReplySize
The allocated size, in bytes, of the reply buffer. The buffer should be large enough to hold at least one ICMP_ECHO_REPLY structure plus RequestSize bytes of data. This buffer should also be large enough to also hold 8 more bytes of data (the size of an ICMP error message).
To recap, this is how IcmpSendEcho stores things in the reply buffer:
Not only did we not reserve the 8 bytes for the ICMP error message, we're also sending some data, so our reply buffer isn't large enough - and that's why it now fails. Note that, in ICMP, the reply data is exactly the data we sent.
To unpack the reply properly, we're going to have to do something slightly more involved. We'll just allocate a vector with enough room, and only later on interpret its contents as either an IcmpEchoReply, an ICMP error, or the reply data.
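The size computation itself can be isolated in a small function. This is a sketch under a few assumptions: reply_buffer_size and ICMP_ERROR_SIZE are names I'm introducing here, the structs are simplified copies of ours (the address is a plain [u8; 4]), and instead of casting with `as`, which silently truncates, it uses checked conversions to the u16 and u32 that IcmpSendEcho expects.

```rust
use std::convert::TryFrom;
use std::mem::size_of;

#[repr(C)]
#[allow(dead_code)]
struct IpOptionInformation {
    ttl: u8,
    tos: u8,
    flags: u8,
    options_size: u8,
    options_data: u32,
}

#[repr(C)]
#[allow(dead_code)]
struct IcmpEchoReply {
    address: [u8; 4],
    status: u32,
    rtt: u32,
    data_size: u16,
    reserved: u16,
    data: *const u8,
    options: IpOptionInformation,
}

// Room for an ICMP error message, as per the ReplySize docs quoted above.
const ICMP_ERROR_SIZE: usize = 8;

// Hypothetical helper: returns (request_size, reply_size), or None if
// either value does not fit the integer type the API wants.
fn reply_buffer_size(request: &[u8]) -> Option<(u16, u32)> {
    let request_size = u16::try_from(request.len()).ok()?;
    let total = size_of::<IcmpEchoReply>() + ICMP_ERROR_SIZE + request.len();
    let reply_size = u32::try_from(total).ok()?;
    Some((request_size, reply_size))
}

fn main() {
    let data = "O Romeo, Romeo. Reachable art thou Romeo?";
    println!("{:?}", reply_buffer_size(data.as_bytes()));
    println!("{:?}", reply_buffer_size(&vec![0u8; 70_000]));
}
```

A 70,000-byte payload doesn't fit in a u16, so the second call returns None rather than quietly sending a wrong size.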
// Since we changed our mind again, we need to adjust the signature of `IcmpSendEcho`
// *again* so that it accepts once again a pointer to a u8 slice:
type IcmpSendEcho = extern "stdcall" fn(
// omitted: other params
reply_buffer: *mut u8,
) -> u32;
// in main, in unsafe block
use std::mem;
let reply_size = mem::size_of::<IcmpEchoReply>();
let reply_buf_size = reply_size + 8 + data.len();
let mut reply_buf = vec![0u8; reply_buf_size];
// note: there's probably a way to use MaybeUninit here / avoid using vec, but
// let's go for something simple.
let ret = IcmpSendEcho(
handle,
IPAddr([8, 8, 8, 8]),
data.as_ptr(),
data.len() as u16,
Some(&ip_opts),
reply_buf.as_mut_ptr(),
reply_buf_size as u32,
4000,
);
if ret == 0 {
panic!("IcmpSendEcho failed! ret = {}", ret);
}
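// reinterpreting the start of the buffer as a struct reference: transmute works,
// although a raw pointer cast (`as *const IcmpEchoReply`) plus a deref would too.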
let reply: &IcmpEchoReply = mem::transmute(&reply_buf[0]);
println!("{:#?}", *reply);
// as it turns out, the "8 bytes for ICMP errors" occur *before* the
// reply data.
let reply_data: *const u8 = mem::transmute(&reply_buf[reply_size + 8]);
// in the previous line, `reply_data` is just a pointer - this turns it
// into a slice.
let reply_data = std::slice::from_raw_parts(reply_data, reply.data_size as usize);
use pretty_hex::*;
println!("{:?}", reply_data.hex_dump());
And now, everything works beautifully:
Here's our complete program so far:
use pretty_hex::*;
use std::{
ffi::c_void,
fmt,
mem::{size_of, transmute},
slice,
};
type HModule = *const c_void;
type FarProc = *const c_void;
extern "stdcall" {
fn LoadLibraryA(name: *const u8) -> HModule;
fn GetProcAddress(module: HModule, name: *const u8) -> FarProc;
}
struct IPAddr([u8; 4]);
impl fmt::Debug for IPAddr {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
let [a, b, c, d] = self.0;
write!(f, "{}.{}.{}.{}", a, b, c, d)
}
}
#[repr(C)]
#[derive(Debug)]
struct IpOptionInformation {
ttl: u8,
tos: u8,
flags: u8,
options_size: u8,
options_data: u32,
}
type Handle = *const c_void;
#[repr(C)]
#[derive(Debug)]
struct IcmpEchoReply {
address: IPAddr,
status: u32,
rtt: u32,
data_size: u16,
reserved: u16,
data: *const u8,
options: IpOptionInformation,
}
type IcmpSendEcho = extern "stdcall" fn(
handle: Handle,
dest: IPAddr,
request_data: *const u8,
request_size: u16,
request_options: Option<&IpOptionInformation>,
reply_buffer: *mut u8,
reply_size: u32,
timeout: u32,
) -> u32;
type IcmpCreateFile = extern "stdcall" fn() -> Handle;
fn main() {
#[allow(non_snake_case)]
unsafe {
let h = LoadLibraryA("IPHLPAPI.dll\0".as_ptr());
let IcmpCreateFile: IcmpCreateFile =
transmute(GetProcAddress(h, "IcmpCreateFile\0".as_ptr()));
let IcmpSendEcho: IcmpSendEcho = transmute(GetProcAddress(h, "IcmpSendEcho\0".as_ptr()));
let handle = IcmpCreateFile();
let data = "O Romeo, Romeo. Reachable art thou Romeo?";
let ip_opts = IpOptionInformation {
ttl: 128,
tos: 0,
flags: 0,
options_data: 0,
options_size: 0,
};
let reply_size = size_of::<IcmpEchoReply>();
let reply_buf_size = reply_size + 8 + data.len();
let mut reply_buf = vec![0u8; reply_buf_size];
let ret = IcmpSendEcho(
handle,
IPAddr([8, 8, 8, 8]),
data.as_ptr(),
data.len() as u16,
Some(&ip_opts),
reply_buf.as_mut_ptr(),
reply_buf_size as u32,
4000,
);
if ret == 0 {
panic!("IcmpSendEcho failed! ret = {}", ret);
}
let reply: &IcmpEchoReply = transmute(&reply_buf[0]);
println!("{:#?}", *reply);
let reply_data: *const u8 = transmute(&reply_buf[reply_size + 8]);
let reply_data = slice::from_raw_parts(reply_data, reply.data_size as usize);
println!("{:?}", reply_data.hex_dump());
}
}
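One thing the program above glosses over: a Vec<u8> only promises 1-byte alignment, whereas a reference to a struct must be properly aligned. It happens to work here, but a more cautious way to pull a struct out of a byte buffer is to copy it out with ptr::read_unaligned. The sketch below shows the idea on a deliberately simplified, made-up layout (a small Header followed directly by its payload), not on the real reply buffer:

```rust
use std::mem::size_of;
use std::ptr;

// Hypothetical layout: a header immediately followed by `data_size`
// bytes of payload.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
struct Header {
    status: u32,
    data_size: u16,
    reserved: u16,
}

fn parse(buf: &[u8]) -> Option<(Header, &[u8])> {
    if buf.len() < size_of::<Header>() {
        return None;
    }
    // SAFETY: the length was checked above, `read_unaligned` has no
    // alignment requirement, and `Header` only contains integers, so
    // any bit pattern is a valid value.
    let header = unsafe { ptr::read_unaligned(buf.as_ptr() as *const Header) };
    let start = size_of::<Header>();
    let end = start.checked_add(header.data_size as usize)?;
    // `get` returns None if the header claims more data than we have.
    let data = buf.get(start..end)?;
    Some((header, data))
}

// Builds a sample buffer: status = 0, data_size = 5, then "Romeo".
fn sample_buffer() -> Vec<u8> {
    let mut buf = Vec::new();
    buf.extend_from_slice(&0u32.to_ne_bytes());
    buf.extend_from_slice(&5u16.to_ne_bytes());
    buf.extend_from_slice(&0u16.to_ne_bytes());
    buf.extend_from_slice(b"Romeo");
    buf
}

fn main() {
    let buf = sample_buffer();
    println!("{:?}", parse(&buf));
    println!("{:?}", parse(&buf[..3]));
}
```

The header is copied rather than borrowed, which costs a few bytes, and the payload is a regular bounds-checked slice, so a bogus data_size results in None instead of a read past the end of the buffer.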
In the next part, we'll refactor our codebase and add some more features!
What did we learn?
Newtypes allow us to provide our own implementation of traits - in this article, we provided a custom Debug implementation for [u8; 4] - an IPv4 address as represented in the Win32 API.
When it comes to FFI (foreign function interface), struct layout matters. Rust's default representation is different from C's, and we can opt into packing. It's controlled by the repr attribute, used directly above a struct declaration.
cargo doc allows generating and reading the documentation of third-party crates, even offline. It generates the documentation for all dependencies of the current project.
MaybeUninit allows us to safely deal with uninitialized data, without causing undefined behavior. This is enforced by the type system.
Option<&T> can be used instead of *const T when passing parameters from Rust to a C function, for ease of use.
Rust slices can be made from a raw pointer + a length, using std::slice::from_raw_parts.