Windows dynamic libraries, calling conventions, and transmute

👋 This page was last updated ~5 years ago. Just so you know.

So, how does ping.exe actually send a ping? It seems unrealistic that ping.exe itself implements all the protocols involved in sending a ping. So it must be calling some sort of library. Also, since it ends up talking to the outside world via a NIC (network interface controller), the kernel is probably involved at some point.

In reading files the hard way - part 2, we learned about dynamic libraries (like libc), and the Linux kernel, and how syscalls allowed us to ask the Linux kernel to do our bidding. For this series, we're going to have to look at the Windows equivalents.

The Manjaro system we used to read files came with the ldd command (it ships with glibc), which listed the dynamic libraries an executable was linked against. Unfortunately, it doesn't seem to have the same name on Windows:

On Windows, building applications (the way Microsoft intended) involves setting aside multiple gigabytes of free disk space and running a graphical installer. In our case, we'll use Visual Studio 2019 Community, which is free for students, open-source contributors, and individuals.

Cool bear

Cool bear's hot tip

You could also install only the build tools. Which would be lighter. And which you could probably install in the command line.

But one day you may want to use Visual Studio's profiler, and then you will have to install the full thing, and then you'll end up with two copies of the build tools and then node-gyp will be confused about which version it should use and you'll wonder why updating one doesn't do anything and it'll be a whole mess.

In the wise words of a popular TV show couple, "this way we have it".

Visual Studio 2019 comes, like its predecessors, with a shortcut for a "Tools Command Prompt":

...which is pretty much just your regular cmd.exe, except with some environment variables moved around so that we have access to the various, well, Visual Studio tools.

For example, we have dumpbin, which, like its name only mildly suggests, prints the content of PE/COFF object files (.exe executables, .dll dynamic libraries, etc.):

dumpbin /dependents (Microsoft CLI tools tend to take flags as /flag rather than --flag) gives us a list of libraries an executable depends on:

> dumpbin /dependents C:\Windows\System32\PING.EXE
Microsoft (R) COFF/PE Dumper Version 14.23.28105.4
Copyright (C) Microsoft Corporation.  All rights reserved.


Dump of file C:\Windows\System32\PING.EXE

File Type: EXECUTABLE IMAGE

  Image has the following dependencies:

    msvcrt.dll
    api-ms-win-core-console-l1-1-0.dll
    api-ms-win-core-registry-l1-1-0.dll
    IPHLPAPI.DLL
    api-ms-win-core-localization-l1-2-0.dll
    WS2_32.dll
    api-ms-win-core-heap-l2-1-0.dll
    api-ms-win-core-synch-l1-2-0.dll
    api-ms-win-core-errorhandling-l1-1-0.dll
    ntdll.dll
    api-ms-win-core-heap-l1-1-0.dll
    api-ms-win-core-rtlsupport-l1-1-0.dll
    api-ms-win-core-processthreads-l1-1-0.dll
    api-ms-win-core-libraryloader-l1-2-0.dll
    api-ms-win-core-profile-l1-1-0.dll
    api-ms-win-core-sysinfo-l1-1-0.dll
    api-ms-win-core-processenvironment-l1-1-0.dll
    api-ms-win-core-string-l1-1-0.dll
    api-ms-win-core-file-l1-1-0.dll
Cool bear

Cool bear's hot tip

If we ran the same command on an older version of Windows, we'd see only a handful of DLLs.

But Windows 7 introduced kernel architecture changes and now we have a whole bundle of cute little kebab-case DLLs running around.

This is great! As we suspected, ping.exe uses external libraries to do its job. But which one is responsible for pinging?

dumpbin /imports lets us know exactly which functions are imported by ping.exe. If we pipe its output into findstr (the Windows equivalent to grep), we can search for interesting words:

>dumpbin /imports C:\Windows\System32\PING.EXE | findstr /I ping
Dump of file C:\Windows\System32\PING.EXE

Nothing for ping... maybe echo?

>dumpbin /imports C:\Windows\System32\PING.EXE | findstr /I echo
                          9B IcmpSendEcho2Ex
                          95 Icmp6SendEcho2

Bingo.

Cool bear

Cool bear's hot tip

The /I option for findstr means "case-insensitive".

In this case, it would find echo, ECHO, Echo, or even EcHo.

However, we still don't know what library provides IcmpSendEcho2Ex, for example.

We could just go ahead and look for it on MSDN. But that's no fun. Instead, we'll just need to do a better text search, because the output of dumpbin /imports has everything we need.

But cmd.exe is a little under-powered for the task. I'm sure there's a way to do it, but, it's 2019, and PowerShell is all the rage, so let's use it:

How modern! Syntax highlighting and everything.

So the function we want is in IPHLPAPI.dll. If I had to take a guess, I would say "IP" stands for "Internet Protocol", "API" stands for "Application Programming Interface", and "HLP" stands for "Helper".

Cool bear

Cool bear's hot tip

Amos lied, he looked it up beforehand.

But, y'know, after years wrangling with the Win32 API, you get a sort of intuition about these shorthands.

But since we're still too stubborn to just look up the docs for now, we have no idea how to use this Echo function.

Luckily, there's a tool to trace how applications use the Windows APIs. Well, there's several. For a casual trace I usually go for ProcMon, but I'm not sure it traces calls to IPHLPAPI, so for today we'll skip straight to the really good one, rohitab's API monitor.

For this article, I'm using the v2 Alpha-r13 x86 64-bit.

API monitor is comprehensive. At startup, it loads API definitions from a gaggle of XML files, which it then lets us browse via a handy tree:

Since we know more or less what we're looking for, we can use the search function to set up hooks:

And just like that, we find what we're looking for. Look at that, it even has inline docs! (sort of)

To be on the safe side, we'll just enable the whole "IP Helper" group and, using the "Monitor new process" link, monitor a.. new.. process.

A few pings later... voilà!

This is quite fascinating. I had no idea Windows's PING command sent lowercase letters of the alphabet (see the "Hex Buffer" pane on the right).

Looking at this, we can figure out that:

  • We first need to create an IcmpHandle
  • IcmpSendEcho2Ex takes a whole lot of parameters that can be left blank
  • PING chooses a TTL (time to live) of 128 by default. That's the number of times our IP packet can hop through the internet before it's deemed lost.
  • We're pinging address 134744072

Wait, that's not an IP address. Or is it?

Phew. It's just 8.8.8.8, with each 8 stored as its own byte, interpreted by API Monitor as if it were a 32-bit integer. In its defense, it's an alpha version!
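To double-check that reading, here is a minimal Rust sketch that splits the integer back into four address bytes. The helper name `decode_ipv4` is made up for illustration, and it assumes the value was read as a little-endian integer (which is what you'd expect on x86).

```rust
use std::net::Ipv4Addr;

/// Hypothetical helper: reinterpret a 32-bit integer (as shown by a tool that
/// read four address bytes as one little-endian number) as an IPv4 address.
fn decode_ipv4(raw: u32) -> Ipv4Addr {
    // The first octet of the address is the lowest byte of the integer.
    Ipv4Addr::from(raw.to_le_bytes())
}

fn main() {
    // Every byte is 0x08, so byte order happens not to matter for this one.
    println!("{:#010x}", 134744072u32);
    println!("{}", decode_ipv4(134744072));
}
```

With an address like 127.0.0.1 the byte order would matter, which is why the helper spells it out.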

And with that, we have everything we need.

Cool bear

What did we learn?

ping.exe uses Win32 APIs (Windows libraries) to send ICMP Echo messages.

To list the libraries used by an executable compiled with Microsoft's toolchain on Windows, we can use dumpbin.exe, which comes with Visual Studio 2019.

The library that contains ICMP-related functions is called IPHLPAPI.dll.

Reading the documentation on MSDN will only get you so far, and monitoring an existing application through rohitab's API Monitor is a great way to see how an API call is used in the real world.

Is it me or does it smell like Rust

Now, if we were truly drinking the Microsoft Windows kool-aid, we'd write our application in C, or C++, or maybe C# if we felt the need to be managed.

But we're going to skip on right over to Rust land, because this article is not about copying code samples straight from MSDN.

So, after a few minutes of quality time with rustup, we should have a working install of the latest Rust compiler:

The toolchain's full name is "stable-x86_64-pc-windows-msvc", and the last bit, "-msvc", tells us that it will be ABI (application binary interface) compatible with the tools from Visual Studio 2019 we installed earlier.

Now let's create a new binary crate using cargo, build it, and see what it depends on:

Huh! No IPHLPAPI.dll.

Now, we are at a crossroads. We could find a way to link dynamically against IPHLPAPI.dll and just use the IcmpSendEcho* family.

But where's the fun in that?

No, instead, we'll use what KERNEL32.dll gave us, and open the IP Helper library dynamically at runtime. It'll be a lot more entertaining, and we'll learn a few tricks in the process. (No pun intended)

First, let's get a code editor.

Now, if this were Linux, I would have to spend at least 4 or 5 articles teaching you the basics of Vim. But this is a Windows series, and so we shall concentrate on Microsoft-given tools.

It just so happens that one of my favorite code editors is made (in part) by Microsoft: Visual Studio Code.

Cool bear

Cool bear's hot tip

There's a pretty good Vim emulation plugin for Visual Studio Code, so you can get the best of both worlds: a modern asynchronous plugin architecture in one neatly bundled package, and never having to use the mouse again.

Well, not never but, you know. Not as often.

So, yadda yadda, one install later we have a code editor:

If you want your Code to look like mine, you'll want to grab the Cascadia Code font and install the Kary Pro Colors theme/extension. You might also want to check out the crates extension, for editing your Cargo.toml.

But most importantly, you'll want to install rust-analyzer. You'll need Git for Windows and node.js, and then you'd clone its repository and simply run cargo install-ra, which will install both the language server itself, and its Code extension.

Cool bear

Cool bear's hot tip

Note: if you pick rust-analyzer, you won't need the other Code plugin for Rust. They'll just conflict.

Also, to get rust-analyzer's Code plugin to play nicely with the Vim Code plugin, you'll need to disable "Enhanced Typing" in user settings.

rust-analyzer is an unfinished, but already very good language server for Rust, which means it provides stuff like automatic formatting on save, autocompletion, and inline docs:

Cool bear

What did we learn?

Amos wants you to know that you can have a very comfy Rust coding experience on Windows, but can't really be bothered to hand-hold readers through all of it right now, so he glossed over some of the details.

If you're unhappy, clap your hands at him on Twitter and maybe he'll do a separate article with a step-by-step anyone can follow.

Let's call some APIs

As we discovered earlier, a barebones Rust program won't link against IPHLPAPI.dll, but it will link against KERNEL32.dll.

If only there was.. a way... to do something with a library. Ugh, MSDN is so big, let's be lazy:

Heyyy, look, LoadLibrary* sounds like what we want. But there seem to be two variants: LoadLibraryA and LoadLibraryW. Cool bear, do you have anything for us here?

Cool bear

Cool bear's hot tip

Win32 API functions that deal with strings usually come in two flavors: ANSI (A), and Wide (W).

Historically, the first was almost ASCII, while the latter was UTF-16.

However, starting with version 1903 of Windows 10, a process can opt in to a UTF-8 code page (through its application manifest), which means if you don't care about backwards compatibility at all, you can just pass UTF-8 to A functions.

Thanks cool bear. Now, we're still too lazy to check out MSDN, but if we were designing a simple LoadLibrary function, it would probably look something like this:

fn LoadLibrary(name: &str) -> Handle

Except, we're in C land, so we don't have Rust strings, so instead we have to take the C equivalent: a pointer to a series of bytes:

fn LoadLibraryA(name: *const u8) -> Handle

Why const? Because it's an input parameter, so it won't be modified. Raw pointers in Rust always need to be either const, or mut.

Why u8? That's just a byte (unsigned 8-bit integer).

Of course, we're not actually implementing LoadLibraryA. We're using it, and it's defined in a Win32 library. In other words, it's extern:

extern {
    fn LoadLibraryA(name: *const u8) -> Handle;
}

// note that we didn't specify a function body.
// it's external! we're not supposed to implement it.
// the body lives in KERNEL32.dll.

Oh, also, Win32 API functions have a specific calling convention. Remember when we wrote assembly for a bit?

Well a calling convention describes how parameters are passed to a function, how values are returned from it, and who's in charge of cleaning things up.

Some calling conventions use registers, others use the stack - if you can think of some way to pass parameters, there's a calling convention for it. The convention for Win32 API calls is named stdcall.

Cool bear

Cool bear's hot tip

You may intuit that "std" means "standard", but in practice, there is no real standard calling convention.

Calling conventions change across operating systems, processor architectures, and even compilers!

The point is: if we use the wrong calling convention, things may work a little and then crash later, or they may immediately set everything on fire - which is strongly preferable, trust me.

I once used the wrong calling convention for weeks on a project before something strange came up. Even the debugger was confused. It was not a fun time.

Anyway, Rust lets you specify the calling convention for extern functions, so let's do that:

extern "stdcall" {
    fn LoadLibraryA(name: *const u8) -> Handle;
}
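A hedge worth knowing about: stdcall is specifically a 32-bit x86 convention. Rust also accepts extern "system", which the Rust reference describes as stdcall on 32-bit Windows and the same as the C convention everywhere else. Here is a minimal, platform-independent sketch of how the convention becomes part of a function's type. The `add` function is made up and merely stands in for a real Win32 import.

```rust
// `add` is a made-up stand-in for a Win32 function. It is defined here in
// Rust (with a body) so that the sketch runs on any platform.
extern "system" fn add(a: i32, b: i32) -> i32 {
    a + b
}

fn main() {
    // The calling convention is part of the function pointer's type:
    // a plain `fn(i32, i32) -> i32` would not accept `add`.
    let f: extern "system" fn(i32, i32) -> i32 = add;
    println!("{}", f(2, 3));
}
```

The rest of this article sticks with "stdcall", as in the declarations above.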

Now we still have to take care of that Handle type. Why "handle"? Because everything is a handle on Windows. Open a file? You don't get a file descriptor, you get a handle. Make a network connection? You better believe you get a handle. Open a process to query information? Handle. Immediately. You can't even talk to a Microsoft employee unless you get a handle on them first.

But we don't really know what it.. is. Is it a number, and if so, how wide is it? 32 bits? 64 bits? Is it an address? I guess we'll have to check MSDN after all.

The page for LoadLibraryA gives us the following C declaration:

HMODULE LoadLibraryA(
  LPCSTR lpLibFileName
);

What's an HMODULE? Let's check Windows Data Types:

So, it's an address, gotcha. We don't know exactly what it points to, so we probably want it to be a void pointer, i.e. void* in C. But this is not C, and we don't have void, so..

Cool bear

Cool bear's hot tip

Actually, the Rust standard library gives us c_void, which is exactly what you want here.

...thanks bear.

use std::ffi::c_void;

type HModule = *const c_void;

extern "stdcall" {
    fn LoadLibraryA(name: *const u8) -> HModule;
}

Does this build?

Can we maybe, finally, call it?

Well...

No, we can't, because although we just learned about the Rust syntax for raw pointers, we don't know how to actually, uh, make one.

Do we have to read some more docs? Or can our fancy code editor and its rust-analyzer plugin save us?

Praise be! This looks like just the right tool for the job. Now to compile the program and-

Oh no.

Oh no. This is why we don't code. This is what The Others warned us about.

This is unsafe Rust.

But it makes a lot of sense, you know? Microsoft Windows isn't coded in Rust, at least, not quite yet.

So, for the time being, whenever we call an external function, we have zero memory safety guarantees, and the ever-vigilant rustc wants us to acknowledge that fact.

Fair enough rustc, fair enough.
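The same split shows up with raw pointers themselves: making one is safe, reading through one is not. A minimal sketch, where `first_byte` is a made-up helper for illustration:

```rust
/// Hypothetical helper: read the first byte of a string through a raw pointer.
fn first_byte(s: &str) -> Option<u8> {
    if s.is_empty() {
        // Nothing to read; dereferencing here would be out of bounds.
        return None;
    }
    // Safe: this only produces an address, it doesn't touch memory.
    let p: *const u8 = s.as_ptr();
    // Unsafe: rustc can't check that `p` points to readable memory, so we
    // have to vouch for it ourselves (we just checked that `s` is non-empty).
    Some(unsafe { *p })
}

fn main() {
    println!("{:?}", first_byte("IPHLPAPI.dll"));
}
```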

Let's review:

use std::ffi::c_void;

type HModule = *const c_void;

extern "stdcall" {
    fn LoadLibraryA(name: *const u8) -> HModule;
}

fn main() {
    let h = unsafe { LoadLibraryA("IPHLPAPI.dll".as_ptr()) };

    // another thing rustc was complaining about was that we didn't use `h`.
    //
    // well, let's print it. it's a raw pointer, and those implement the `Debug`
    // trait, so we can use the `{:?}` formatting directive.
    println!("{:?}", h);
}

Looks good to me. It's time to cargooooooooo:

Okay. Okay. Gather round team, there's good news and bad news. The good news is, we didn't crash. We survived the dangerous unsafety of the Win32 API.

The bad news is, I'm pretty sure if LoadLibraryA returns NULL (0x0), it failed. Like, 90% sure. And things were going so well!
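Rather than eyeballing the printed pointer, we could check for NULL in code. Here is a minimal sketch, assuming we only care about "null or not". The helper name `check_module` is made up, while `io::Error::last_os_error()` is the standard library's wrapper around the OS error code (GetLastError on Windows).

```rust
use std::ffi::c_void;
use std::io;

/// Hypothetical helper: turn a possibly-NULL module handle into a Result.
fn check_module(h: *const c_void) -> io::Result<*const c_void> {
    if h.is_null() {
        // Call this right after the failing API call, before anything else
        // gets a chance to overwrite the thread's last error code.
        Err(io::Error::last_os_error())
    } else {
        Ok(h)
    }
}

fn main() {
    // A null pointer stands in for a failed LoadLibraryA call here, so the
    // error printed is whatever the OS last reported, not a real load error.
    match check_module(std::ptr::null()) {
        Ok(h) => println!("loaded at {:?}", h),
        Err(e) => println!("load failed: {}", e),
    }
}
```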

We'll fix it, of course, but in case you haven't figured it out already, you may want to take some time to think about what the problem could be.

And just so you don't spoil yourself, I'll let cool bear go meta for a while.

Cool bear

Cool bear's hot tip

This is a good example of the forest hiding the trees.

We were careful when writing our Rust program that calls LoadLibraryA.

We did a lot of preparatory work, we looked up the relevant library, we even paid special attention to calling conventions, and so on - if there was justice in the world, it should work, right?

But here we are, and it doesn't.

And by this point in the adventure, we're so focused on the fact that we got the calling convention right, and read the MSDN page, and even used a type alias for HModule that we've lost track of one very important fact about C strings.

And if we had a colleague, now would be a good time to call them (by their preferred calling convention).

What did we pass to LoadLibraryA? A string? No!

We just passed a memory address:

We hoped it would only use the IPHLPAPI.dll part, but since we didn't pass a length, and this is C world (not the controversial theme park, the other one), it'll use all the characters until it encounters a 0 byte.

That's what "null-terminated strings" means.

In fact, we're lucky it stopped! If there hadn't been a byte set to zero after our string, it would've kept reading until it encountered a segmentation fault.

Cool bear

Cool bear's hot tip

We've briefly touched on segmentation faults before, and I'm happy to report that they are also a thing on Windows. And macOS.

All desktop operating systems that cool bear knows about, in fact.

So, does that mean.. if we add a null byte of our own, then maybe...?

fn main() {
    // notice the '\0' that wasn't there before
    let h = unsafe { LoadLibraryA("IPHLPAPI.dll\0".as_ptr()) };
    println!("{:?}", h);
}

That looks a lot better!
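If hand-writing `\0` feels fragile, the standard library's CString can add the terminator for us. This is a sketch of an alternative, not what the rest of the article uses, and `c_name` is a made-up helper name.

```rust
use std::ffi::CString;

/// Hypothetical helper: build a null-terminated copy of a Rust string.
fn c_name(s: &str) -> CString {
    // CString::new appends the trailing 0 byte, and returns an error if the
    // input already contains a 0 byte somewhere in the middle.
    CString::new(s).expect("name must not contain interior 0 bytes")
}

fn main() {
    let name = c_name("IPHLPAPI.dll");
    // The terminator is there, even though we never typed it.
    println!("{:?}", name.as_bytes_with_nul());

    // as_ptr() yields a *const c_char, so we cast it to match a declaration
    // that takes *const u8. The pointer is only valid while `name` is alive.
    let _ptr: *const u8 = name.as_ptr().cast::<u8>();
}
```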

But let's not stop there. We got an address back, but maybe there's something else we could do to really convince ourselves that we are, actually, speaking Win32 API properly.

We could, for example try to show a message box.

First off, a library can export many functions, so we'd need to use GetProcAddress to get the address of a specific function.

MSDN's C declaration is as follows:

FARPROC GetProcAddress(
  HMODULE hModule,
  LPCSTR  lpProcName
);

Hey, we know HMODULE! And if we focus on our Microsoft intuition real hard, we can figure out that LPCSTR probably means Long.. Pointer... C language... Strings? We've used strings before, so we know what to do.

use std::ffi::c_void;

type HModule = *const c_void;
type FarProc = *const c_void;

extern "stdcall" {
    fn LoadLibraryA(name: *const u8) -> HModule;
    fn GetProcAddress(module: HModule, name: *const u8) -> FarProc;
}

fn main() {
    // we're going to do a whole lot of unsafe things,
    // so let's just make a block of it.
    unsafe {
        // This used to be `IPHLPAPI.dll`, but that's not where MessageBox
        // lives, now is it
        let h = LoadLibraryA("USER32.dll\0".as_ptr());

        let f = GetProcAddress(h, "MessageBoxA\0".as_ptr());

        // this line is not unsafe, but, we have `f` in scope
        // here, and the article is almost over, so let's not
        // start asking questions we might regret
        println!("f = {:?}", f);
    }
}

Does it work?

I mean, it's not NULL, so that's something.

Can we call it?

No. No we can't. rustc points out that, although they like us, and we've shown some real improvement lately, and if we keep it up our grades will for sure go up, that's not a function and you can only call functions.

So let's declare it as a function then. The MSDN C declaration for MessageBoxA is... spins wheel

int MessageBoxA(
  HWND    hWnd,
  LPCSTR  lpText,
  LPCSTR  lpCaption,
  UINT    uType
);

Nice! HWND smells like a handle, so it's probably a pointer. And wnd probably refers to a window, since we're talking about GUI (graphical user interface) stuff. Then strings for text and caption (yes please), and then some number we can probably leave to zero.

fn main() {
    unsafe {
        let h = LoadLibraryA("USER32.dll\0".as_ptr());

        let MessageBoxA: extern "stdcall" fn(*const c_void, *const u8, *const u8, u32) -> i32 =
            GetProcAddress(h, "MessageBoxA\0".as_ptr());

        // `0` wouldn't work for raw pointers, but the standard library provides
        // us with a way to make null raw pointers.
        use std::ptr::null;
        // let's not forget to null-terminate our string
        MessageBoxA(null(), "Hello from Rust\0".as_ptr(), null(), 0);
    }
}

..but this doesn't compile.

And, you know, I think that makes a lot of sense. We just straight up asked rustc to believe that that raw pointer over there, well, that was a function. And it takes a little more than that for rustc to enter suspension of disbelief.

We didn't say the magic word. And as far as "reinterpreting a value as another type without any conversion whatsoever", the magic word is... transmute.

Let's review:

// hey, look! compact imports!
use std::{ffi::c_void, mem::transmute, ptr::null};

type HModule = *const c_void;
type FarProc = *const c_void;

extern "stdcall" {
    fn LoadLibraryA(name: *const u8) -> HModule;
    fn GetProcAddress(module: HModule, name: *const u8) -> FarProc;
}

// This one is just for readability
type MessageBoxA = extern "stdcall" fn(*const c_void, *const u8, *const u8, u32) -> i32;

fn main() {
    unsafe {
        let h = LoadLibraryA("USER32.dll\0".as_ptr());

        let MessageBoxA: MessageBoxA = transmute(GetProcAddress(h, "MessageBoxA\0".as_ptr()));
        // no, you're not having a stroke - yes, you can have a local variable and a
        // type with the same name in scope at the same time!

        MessageBoxA(null(), "Hello from Rust\0".as_ptr(), null(), 0);
    }
}

And with no further ado:

Success! Time for mojitos.
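One caveat before the mojitos: if GetProcAddress ever returned NULL, transmuting that into a plain function pointer would be undefined behavior. A common way around it is to transmute into an Option of a function pointer, where NULL becomes None. Here is a minimal, platform-independent sketch. `fake_message_box` and `as_message_box` are made-up names, and the fake function stands in for the real MessageBoxA so the code runs anywhere.

```rust
use std::{ffi::c_void, mem::transmute};

type MessageBoxFn = extern "system" fn(*const c_void, *const u8, *const u8, u32) -> i32;

// Made-up stand-in for MessageBoxA: same shape, no GUI, always returns 1.
extern "system" fn fake_message_box(
    _hwnd: *const c_void,
    _text: *const u8,
    _caption: *const u8,
    _kind: u32,
) -> i32 {
    1
}

/// Hypothetical helper: reinterpret a FARPROC-like pointer as a callable.
/// Safety: `p` must be null or point to a function with this exact signature.
unsafe fn as_message_box(p: *const c_void) -> Option<MessageBoxFn> {
    // Option<fn> has the same size as a pointer, and a null pointer maps to
    // None, so no non-null function pointer is ever forged from NULL.
    unsafe { transmute::<*const c_void, Option<MessageBoxFn>>(p) }
}

fn main() {
    // Pretend this address came back from GetProcAddress.
    let p = fake_message_box as MessageBoxFn as *const c_void;
    match unsafe { as_message_box(p) } {
        Some(f) => println!("returned {}", f(std::ptr::null(), std::ptr::null(), std::ptr::null(), 0)),
        None => println!("symbol not found"),
    }
}
```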

Cool bear

What did we learn?

Using the correct calling convention is a prerequisite for fun times.

The calling convention for Win32 APIs is stdcall. Win32 API functions that deal with strings come in ANSI and UTF-16 flavors. No matter which one we pick, we must remember to null-terminate strings.

On Windows, LoadLibrary and GetProcAddress let us open a library at runtime and call its functions. Those are provided by KERNEL32.dll, which executables usually link to.

Rust is memory-safe and strongly-typed, and if we want both of those to go away for a fleeting moment, we can use the unsafe keyword and the transmute function.
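One loose end: we only ever called the A flavor. The W flavor expects null-terminated UTF-16 code units rather than bytes. A minimal sketch of that conversion, where the helper name `to_wide` is made up for illustration:

```rust
use std::iter::once;

/// Hypothetical helper: encode a Rust string as null-terminated UTF-16,
/// the shape a W function such as LoadLibraryW expects.
fn to_wide(s: &str) -> Vec<u16> {
    // encode_utf16 yields the code units; we append the terminating 0.
    s.encode_utf16().chain(once(0)).collect()
}

fn main() {
    let wide = to_wide("IPHLPAPI.dll");
    // wide.as_ptr() would be the *const u16 to hand to a W function,
    // valid for as long as `wide` is alive.
    println!("{} code units, last one is {}", wide.len(), wide[wide.len() - 1]);
}
```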
