Running a self-relocatable ELF from memory
👋 This page was last updated ~4 years ago. Just so you know.
Welcome back!
In the last article, we did foundational work on minipak, our ELF packer.
It is now able to receive command-line arguments, environment variables, and
auxiliary vectors. It can parse those command-line arguments into a set of
options. It can make an ELF file smaller using the LZ4 compression algorithm,
and pack it together with stage1, our launcher.
And finally, the resulting file contains an EndMarker and a Manifest that
let us locate different parts of the .pak, so that we can load the
compressed guest executable.
But, we've been cheating a little! In stage1, we've been simply
decompressing the guest and writing it to disk, so that we can use execve
on it. Effectively, in the last article we've done all the parts we haven't
been doing so far.
All that's missing is the actual loader part, so in theory, we "simply" have
to put everything we've learned into minipak, and we should be good to go!
Yes, "simply". What could possibly go wrong.
You know what bear, I don't think much will actually go wrong. We've been doing this for a while. It is part seventeen. That's a lot of parts.
Sure, sure, if you say so.
And I think you may have started to get a bit of an attitude problem lately. One minute you're hounding me to continue writing, and the next you're skeptical that we'll achieve anything at all. Are you okay?
Yes, yes, it's just... it's been so long, I'm starting to lose faith.
But we've made such great progress! And we're so close!
I've heard that before..
Here, let me show you.
Parsing ELF (again)
So, since we don't actually want to rely on the execve syscall, and we want
to load the guest executable ourselves, we'll need to parse its ELF headers so
we know where to map each segment.
If this is unfamiliar to you, well, (points at entire series) feel free to go back and read from the start, but, basically, segments contain what really matters about an ELF object when we run it.
And in ELF, segments are defined in "program headers", i.e. the "loader view" of the file (whereas sections are defined in section headers, i.e. the "linker view" of the file).
The readelf tool is as handy as ever, to list both segments and sections:
$ readelf -Wl ./target/release/minipak
Elf file type is EXEC (Executable file)
Entry point 0x40e150
There are 8 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x000224 0x000224 R 0x1000
LOAD 0x001000 0x0000000000401000 0x0000000000401000 0x01026e 0x01026e R E 0x1000
LOAD 0x012000 0x0000000000412000 0x0000000000412000 0x0183ec 0x0183ec R 0x1000
LOAD 0x02adc8 0x000000000042bdc8 0x000000000042bdc8 0x001240 0x001270 RW 0x1000
NOTE 0x000200 0x0000000000400200 0x0000000000400200 0x000024 0x000024 R 0x4
GNU_EH_FRAME 0x028acc 0x0000000000428acc 0x0000000000428acc 0x0003a4 0x0003a4 R 0x4
GNU_STACK 0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW 0x10
GNU_RELRO 0x02adc8 0x000000000042bdc8 0x000000000042bdc8 0x001238 0x001238 R 0x1
Section to Segment mapping:
Segment Sections...
00 .note.gnu.build-id
01 .text
02 .rodata .eh_frame_hdr .eh_frame .gcc_except_table
03 .data.rel.ro .got .data .bss
04 .note.gnu.build-id
05 .eh_frame_hdr
06
07 .data.rel.ro .got
If you look at the "Flg" (flags) column, you'll see that only one of these is
"E" (executable), so the code is probably in the second segment, at offset
0x1000 within the file.
If you look at the VirtAddr column, you'll see that it all starts at
0x400000. That's where the executable expects to be mapped in memory.
And indeed, if we start it:
$ gdb --quiet --args ./target/release/minipak
Reading symbols from ./target/release/minipak...
(gdb) starti
Starting program: /home/amos/ftl/minipak/target/release/minipak
Program stopped.
minipak::_start () at /home/amos/ftl/minipak/crates/minipak/src/main.rs:23
23 asm!("mov rdi, rsp", "call pre_main", options(noreturn))
(gdb) p/x $rip
$1 = 0x40e150
(gdb) info proc mappings
process 1589
Mapped address spaces:
Start Addr End Addr Size Offset objfile
0x400000 0x401000 0x1000 0x0 /home/amos/ftl/minipak/target/release/minipak
0x401000 0x412000 0x11000 0x1000 /home/amos/ftl/minipak/target/release/minipak
0x412000 0x42b000 0x19000 0x12000 /home/amos/ftl/minipak/target/release/minipak
0x42b000 0x42e000 0x3000 0x2a000 /home/amos/ftl/minipak/target/release/minipak
0x7ffff7ffa000 0x7ffff7ffd000 0x3000 0x0 [vvar]
0x7ffff7ffd000 0x7ffff7fff000 0x2000 0x0 [vdso]
0x7ffffffdd000 0x7ffffffff000 0x22000 0x0 [stack]
(gdb)
...we can see that $rip (the instruction pointer) is somewhere between
0x401000 and 0x412000, which is where it ought to be.
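To make that check less of an eyeball exercise, here's a small sketch that looks up which LOAD segment contains a given address. The `Load` struct and the helper names are made up for illustration, and the ranges are copied by hand from the readelf output above:

```rust
use std::ops::Range;

/// One LOAD segment: its memory range (vaddr..vaddr+memsz) and the
/// "Flg" column, both copied by hand from the readelf output above.
/// (Hypothetical type, not part of minipak.)
pub struct Load {
    pub mem: Range<u64>,
    pub flags: &'static str,
}

/// The four LOAD segments of `minipak`.
pub fn minipak_loads() -> Vec<Load> {
    vec![
        Load { mem: 0x400000..0x400000 + 0x224, flags: "R" },
        Load { mem: 0x401000..0x401000 + 0x1026e, flags: "R E" },
        Load { mem: 0x412000..0x412000 + 0x183ec, flags: "R" },
        Load { mem: 0x42bdc8..0x42bdc8 + 0x1270, flags: "RW" },
    ]
}

/// Returns the LOAD segment whose memory range contains `addr`, if any.
pub fn segment_containing(loads: &[Load], addr: u64) -> Option<&Load> {
    loads.iter().find(|l| l.mem.contains(&addr))
}
```

Calling `segment_containing(&minipak_loads(), 0x40e150)` with the entry point lands in the second segment, the only one flagged executable.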
Not all ELF objects expect to be mapped at a fixed address, though. If we look
at the program headers for /lib/ld-linux-x86-64.so.2 for example, we'll see that
VirtAddr starts at 0x0.
$ readelf -Wl /lib/ld-linux-x86-64.so.2
Elf file type is DYN (Shared object file)
Entry point 0x1090
There are 11 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
LOAD 0x000000 0x0000000000000000 0x0000000000000000 0x000cf8 0x000cf8 R 0x1000
LOAD 0x001000 0x0000000000001000 0x0000000000001000 0x023206 0x023206 R E 0x1000
LOAD 0x025000 0x0000000000025000 0x0000000000025000 0x008c24 0x008c24 R 0x1000
LOAD 0x02ec20 0x000000000002fc20 0x000000000002fc20 0x002418 0x0025b8 RW 0x1000
DYNAMIC 0x02fe30 0x0000000000030e30 0x0000000000030e30 0x000190 0x000190 RW 0x8
NOTE 0x0002a8 0x00000000000002a8 0x00000000000002a8 0x000040 0x000040 R 0x8
NOTE 0x0002e8 0x00000000000002e8 0x00000000000002e8 0x000024 0x000024 R 0x4
GNU_PROPERTY 0x0002a8 0x00000000000002a8 0x00000000000002a8 0x000040 0x000040 R 0x8
GNU_EH_FRAME 0x02a59c 0x000000000002a59c 0x000000000002a59c 0x00082c 0x00082c R 0x4
GNU_STACK 0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW 0x10
GNU_RELRO 0x02ec20 0x000000000002fc20 0x000000000002fc20 0x0013e0 0x0013e0 R 0x1
(etc.)
As we've seen before, it doesn't mean that it's going to be mapped at 0x0.
Although the Linux kernel technically allows us to do that (assuming we have
the appropriate capabilities), this is not what "rtld" (the short name for
/lib/ld-linux-x86-64.so.2) expects.
Instead, it expects to be mapped... anywhere at all:
$ gdb --quiet --args /lib/ld-linux-x86-64.so.2
Reading symbols from /lib/ld-linux-x86-64.so.2...
(No debugging symbols found in /lib/ld-linux-x86-64.so.2)
(gdb) starti
Starting program: /usr/lib/ld-linux-x86-64.so.2
Program stopped.
0x00007ffff7fcd090 in _start ()
(gdb) p/x $rip
$1 = 0x7ffff7fcd090
(gdb) info proc mappings
process 1987
Mapped address spaces:
Start Addr End Addr Size Offset objfile
0x7ffff7fc7000 0x7ffff7fca000 0x3000 0x0 [vvar]
0x7ffff7fca000 0x7ffff7fcc000 0x2000 0x0 [vdso]
0x7ffff7fcc000 0x7ffff7fcd000 0x1000 0x0 /usr/lib/ld-2.33.so
0x7ffff7fcd000 0x7ffff7ff1000 0x24000 0x1000 /usr/lib/ld-2.33.so
0x7ffff7ff1000 0x7ffff7ffa000 0x9000 0x25000 /usr/lib/ld-2.33.so
0x7ffff7ffb000 0x7ffff7fff000 0x4000 0x2e000 /usr/lib/ld-2.33.so
0x7ffffffdd000 0x7ffffffff000 0x22000 0x0 [stack]
And if we run that gdb invocation again and again we'll notice that
"anywhere" happens to always be at 0x7ffff7fcc000. But that's just GDB
trying to be helpful by disabling Address Space Layout Randomization (ASLR).
We can always tell GDB to not be helpful though:
$ gdb --quiet -ex "set disable-randomization off" -ex "set confirm off" -ex "starti" -ex "p/x \$rip" -ex "quit" --args /lib/ld-linux-x86-64.so.2 | grep -F "\$1"
$1 = 0x7ff89f1af090
$ gdb --quiet -ex "set disable-randomization off" -ex "set confirm off" -ex "starti" -ex "p/x \$rip" -ex "quit" --args /lib/ld-linux-x86-64.so.2 | grep -F "\$1"
$1 = 0x7fa1ee491090
$ gdb --quiet -ex "set disable-randomization off" -ex "set confirm off" -ex "starti" -ex "p/x \$rip" -ex "quit" --args /lib/ld-linux-x86-64.so.2 | grep -F "\$1"
$1 = 0x7f2f2484e090
$ gdb --quiet -ex "set disable-randomization off" -ex "set confirm off" -ex "starti" -ex "p/x \$rip" -ex "quit" --args /lib/ld-linux-x86-64.so.2 | grep -F "\$1"
$1 = 0x7f39c9cfe090
$ gdb --quiet -ex "set disable-randomization off" -ex "set confirm off" -ex "starti" -ex "p/x \$rip" -ex "quit" --args /lib/ld-linux-x86-64.so.2 | grep -F "\$1"
$1 = 0x7f4903290090
Cool bear's hot tip
What's going on here? Well, --quiet tells GDB to not display a wall of text
when it starts up. All the -ex commands effectively execute GDB commands
directly, without needing to type them in.
set disable-randomization off re-enables ASLR. set confirm off disables
confirmation prompts so that quit later works. As for p/x $rip, it prints
the contents of the %rip register as hexadecimal.
We need to escape the dollar sign ($) though, because it's in a
double-quoted string, and if we don't, our shell will try to replace it with
the value of the rip environment variable, which almost certainly doesn't
exist, so we'd end up with the empty string!
Here we can see that the code is mapped at a different address every time.
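One thing stays constant across those runs: every address ends in 090. Here's a quick sketch of why, assuming the object's entry point is 0x1090 (as readelf reported above) and that it always gets mapped at a page-aligned base. The function and constant names are ours, not GDB's or the kernel's:

```rust
/// Entry point of ld.so as reported by readelf above. Since the first
/// LOAD segment has a vaddr of 0x0, this is an offset from the load base.
pub const ENTRY_POINT: u64 = 0x1090;
pub const PAGE_SIZE: u64 = 0x1000;

/// The `$rip` values printed by the five GDB runs above.
pub const OBSERVED_RIPS: [u64; 5] = [
    0x7ff89f1af090,
    0x7fa1ee491090,
    0x7f2f2484e090,
    0x7f39c9cfe090,
    0x7f4903290090,
];

/// Given `$rip` at `_start`, recover the base address the object was
/// mapped at. Returns None if `rip` is below the entry point offset.
pub fn load_base(rip_at_start: u64) -> Option<u64> {
    rip_at_start.checked_sub(ENTRY_POINT)
}

/// True if `addr` sits exactly on a 4K page boundary.
pub fn is_page_aligned(addr: u64) -> bool {
    addr % PAGE_SIZE == 0
}
```

ASLR only randomizes the base, and the base is page-aligned, so the low 12 bits of every address in the object are the same from run to run.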
Long story short, if we're going to be mapping segments ourselves, we're going to need to read them, starting with the ELF header.
Since deku has served us so well so far, we'll use it to parse ELF headers as well, why not?
And since we're going to be reading so many different things from ELF files,
we'll introduce a new module named format in pixie's codebase.
// in `crates/pixie/src/lib.rs`
mod format;
pub use format::*;
We'll even make an internal prelude for it, because we're going to end up importing a lot of the same symbols in a lot of different modules.
// in `crates/pixie/src/format/prelude.rs`
pub(crate) use alloc::{format, vec::Vec};
pub(crate) use deku::prelude::*;
pub(crate) use deku::{DekuContainerRead, DekuRead};
All the different bits and pieces of the ELF format will end up in their own
Rust module, which will be re-exported by pixie::format, starting with the
header:
// in `crates/pixie/src/format/mod.rs`
mod prelude;
mod header;
pub use header::*;
// in `crates/pixie/src/format/header.rs`
use super::prelude::*;
/// An ELF object header
#[derive(Debug, Clone, PartialEq, DekuRead, DekuWrite)]
#[deku(magic = b"\x7FELF")]
pub struct ObjectHeader {
pub class: ElfClass,
pub endianness: Endianness,
/// Always 1
pub version: u8,
#[deku(pad_bytes_after = "8")]
pub os_abi: OsAbi,
pub typ: ElfType,
pub machine: ElfMachine,
/// Always 1
pub version_bis: u32,
pub entry_point: u64,
pub ph_offset: u64,
pub sh_offset: u64,
pub flags: u32,
pub hdr_size: u16,
pub ph_entsize: u16,
pub ph_count: u16,
pub sh_entsize: u16,
pub sh_count: u16,
pub sh_nidx: u16,
}
There, that looks about right. Here's the diagram we made aaaall the way back in Part 1 for reference:
There are some very nice things happening here with deku. First off, the
magic is just an attribute on the whole struct:
#[deku(magic = b"\x7FELF")]
pub struct ObjectHeader {}
Again, deku makes sure the magic is present and correct when reading, and
it writes it when, well, writing. This means if we ever need to generate an
ELF file, well, we'll just have to serialize an ObjectHeader and that'll be
that.
"just", yes.
Then there's padding. After os_abi, there are 8 bytes of padding, so we say so:
#[deku(pad_bytes_after = "8")]
pub os_abi: OsAbi,
Which brings us to some of the types that we haven't defined yet: ElfClass,
Endianness, OsAbi, ElfType, and ElfMachine.
For all intents and purposes, those fields are enums. According to our
diagram, ElfClass can be 1 or 2. But on disk, in the file itself, those
can be anything. It's just a byte, there are 256 possible values!
So, unless we want the parsing to fail if we encounter an unknown value, we must account for the fact that the value we find may be neither 1 nor 2.
And we can model that in Rust, because enum variants can have associated data:
pub enum ElfClass {
Elf32,
Elf64,
Other(u8),
}
With such an enum, we should be able to map 1 to ElfClass::Elf32, 2 to
ElfClass::Elf64, and everything else to ElfClass::Other(_).
But how does that work with deku? Well, we need to specify two things:
- How large is ElfClass when serialized? Is it one byte? Two? Four?
- How do we identify each variant?
And deku lets us do all of that quite nicely, using patterns:
// in `crates/pixie/src/format/header.rs`
#[derive(Clone, Copy, DekuRead, DekuWrite, Debug, PartialEq)]
#[deku(type = "u8")]
pub enum ElfClass {
#[deku(id = "1")]
Elf32,
#[deku(id = "2")]
Elf64,
#[deku(id_pat = "_")]
Other(u8),
}
This is all explained in detail in the deku docs.
But it's very neat! This means that parsing will not fail, we'll just capture unexpected values, and then we can deal with them later if we want.
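If you'd like to see the catch-all idea without any derive magic, here's a hand-rolled sketch of the same mapping using plain `From` impls. This is not what deku generates, just an illustration of the behavior we're after:

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum ElfClass {
    Elf32,
    Elf64,
    Other(u8),
}

impl From<u8> for ElfClass {
    /// 1 and 2 get a named variant; every other byte is captured as-is.
    fn from(byte: u8) -> Self {
        match byte {
            1 => ElfClass::Elf32,
            2 => ElfClass::Elf64,
            other => ElfClass::Other(other),
        }
    }
}

impl From<ElfClass> for u8 {
    /// The reverse mapping, so that reading then writing is lossless.
    fn from(class: ElfClass) -> u8 {
        match class {
            ElfClass::Elf32 => 1,
            ElfClass::Elf64 => 2,
            ElfClass::Other(byte) => byte,
        }
    }
}
```

Because the unknown byte is stored in `Other`, converting any of the 256 possible values to `ElfClass` and back gives the original byte.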
Let's fill in the rest of the enums:
// in `crates/pixie/src/format/header.rs`
#[derive(Clone, Copy, DekuRead, DekuWrite, Debug, PartialEq)]
#[deku(type = "u16")]
pub enum ElfType {
#[deku(id = "0x2")]
Exec,
#[deku(id = "0x3")]
Dyn,
#[deku(id_pat = "_")]
Other(u16),
}
#[derive(Clone, Copy, DekuRead, DekuWrite, Debug, PartialEq)]
#[deku(type = "u8")]
pub enum Endianness {
#[deku(id = "0x1")]
Little,
#[deku(id = "0x2")]
Big,
#[deku(id_pat = "_")]
Other(u8),
}
#[derive(Clone, Copy, DekuRead, DekuWrite, Debug, PartialEq)]
#[deku(type = "u16")]
pub enum ElfMachine {
#[deku(id = "0x03")]
X86,
#[deku(id = "0x3e")]
X86_64,
#[deku(id_pat = "_")]
Other(u16),
}
#[derive(Clone, Copy, DekuRead, DekuWrite, Debug, PartialEq)]
#[deku(type = "u8")]
pub enum OsAbi {
#[deku(id = "0x0")]
SysV,
#[deku(id_pat = "_")]
Other(u8),
}
And for convenience, let's add a constant to ObjectHeader that corresponds to
its complete, serialized size:
// in `crates/pixie/src/format/header.rs`
impl ObjectHeader {
pub const SIZE: u16 = 64;
}
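Where does 64 come from? Here's a sketch that adds up the fields in the same order as the struct above, counting the magic as 4 bytes and the padding after os_abi as 8. The function name is ours:

```rust
use std::mem::size_of;

/// Serialized size of the ELF64 header, summed field by field.
pub const fn elf64_header_size() -> usize {
    4                       // magic: b"\x7FELF"
        + size_of::<u8>()   // class
        + size_of::<u8>()   // endianness
        + size_of::<u8>()   // version
        + size_of::<u8>()   // os_abi
        + 8                 // padding after os_abi
        + size_of::<u16>()  // typ
        + size_of::<u16>()  // machine
        + size_of::<u32>()  // version_bis
        + size_of::<u64>()  // entry_point
        + size_of::<u64>()  // ph_offset
        + size_of::<u64>()  // sh_offset
        + size_of::<u32>()  // flags
        + 6 * size_of::<u16>() // hdr_size, ph_entsize, ph_count,
                               // sh_entsize, sh_count, sh_nidx
}
```

That's 16 bytes of identification, then 48 bytes of actual fields.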
Now then! All this code compiles, but we're not really using it yet.
But before we do, let's think of how we want to use it. Ideally, we'd like
pixie to expose some sort of higher-level interface, so that we don't have
to deal with the intricacies of serialization and deserialization too much in
minipak or stage1.
Something like this:
// in `crates/pixie/src/lib.rs`
pub struct Object<'a> {
header: ObjectHeader,
slice: &'a [u8],
}
impl<'a> Object<'a> {
/// Read an ELF object from a given slice
pub fn new(slice: &'a [u8]) -> Result<Self, PixieError> {
let input = (slice, 0);
let (_, header) = ObjectHeader::from_bytes(input)?;
Ok(Self { slice, header })
}
/// Returns the ELF object header
pub fn header(&self) -> &ObjectHeader {
&self.header
}
/// Returns the full slice
pub fn slice(&self) -> &[u8] {
&self.slice
}
}
And now, we can read the ELF object from stage1!
// in `crates/stage1/src/main.rs`
#[allow(clippy::unnecessary_wraps)]
fn main(_env: Env) -> Result<(), PixieError> {
println!("Hello from stage1!");
let host = File::open("/proc/self/exe")?;
let host = host.map()?;
let host = host.as_ref();
let manifest = Manifest::read_from_full_slice(host)?;
let guest_range = manifest.guest.as_range();
println!("The guest is at {:x?}", guest_range);
let guest_slice = &host[guest_range];
let uncompressed_guest =
lz4_flex::decompress_size_prepended(guest_slice).expect("invalid lz4 payload");
let guest_obj = Object::new(&uncompressed_guest[..])?;
println!("Parsed {:#?}", guest_obj.header());
Ok(())
}
$ cargo run --release --bin minipak -- /usr/bin/gcc -o /tmp/gcc.pak
Compiling minipak v0.1.0 (/home/amos/ftl/minipak/crates/minipak)
Compiling pixie v0.1.0 (/home/amos/ftl/minipak/crates/pixie)
Finished release [optimized + debuginfo] target(s) in 3.58s
Running `target/release/minipak /usr/bin/gcc -o /tmp/gcc.pak`
Wrote /tmp/gcc.pak (59.86% of input)
$ /tmp/gcc.pak
Hello from stage1!
The guest is at 16380..b0359
Parsed ObjectHeader {
class: Elf64,
endianness: Little,
version: 1,
os_abi: SysV,
typ: Exec,
machine: X86_64,
version_bis: 1,
entry_point: 4221408,
ph_offset: 64,
sh_offset: 1209088,
flags: 0,
hdr_size: 64,
ph_entsize: 56,
ph_count: 14,
sh_entsize: 64,
sh_count: 34,
sh_nidx: 33,
}
Neat!
It would be even neater if we could print some of those fields as hexadecimal,
but even though I think custom_debug is meant to support no_std, its current
version still pulls in libstd.
No worries though, we can use something else! derivative will do the trick.
# in `crates/pixie/Cargo.toml`
derivative = { version = "2.2.0", features = ["use_core"] }
// in `crates/pixie/src/format/prelude.rs`
pub(crate) use derivative::*;
/// Format a field as lowercase hexadecimal, with the `0x` prefix.
pub fn hex_fmt<T>(t: &T, f: &mut core::fmt::Formatter) -> core::fmt::Result
where
T: core::fmt::LowerHex,
{
write!(f, "0x{:x}", t)
}
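To see what hex_fmt produces without involving derivative at all, here's a sketch with a tiny wrapper type whose Debug impl delegates to it. The `Hex` wrapper is hypothetical, it only exists for this demonstration:

```rust
use core::fmt;

/// Same function as above: lowercase hexadecimal with a `0x` prefix.
pub fn hex_fmt<T>(t: &T, f: &mut fmt::Formatter<'_>) -> fmt::Result
where
    T: fmt::LowerHex,
{
    write!(f, "0x{:x}", t)
}

/// Hypothetical wrapper: anything inside it gets debug-printed as hex.
pub struct Hex<T>(pub T);

impl<T: fmt::LowerHex> fmt::Debug for Hex<T> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        hex_fmt(&self.0, f)
    }
}
```

With this, `format!("{:?}", Hex(4221408u64))` gives "0x4069e0", which is the entry point we printed in decimal earlier.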
We'll pick out some fields from ObjectHeader to format as hex: mostly
offsets and sizes, with a few exceptions. It's really a matter of taste at
this point, they're all just numbers:
/// An ELF object header
#[derive(Derivative, Clone, PartialEq, DekuRead, DekuWrite)]
#[derivative(Debug)]
#[deku(magic = b"\x7FELF")]
pub struct ObjectHeader {
#[derivative(Debug = "ignore")]
pub class: ElfClass,
pub endianness: Endianness,
/// Always 1
pub version: u8,
#[deku(pad_bytes_after = "8")]
pub os_abi: OsAbi,
pub typ: ElfType,
pub machine: ElfMachine,
/// Always 1
pub version_bis: u32,
#[derivative(Debug(format_with = "hex_fmt"))]
pub entry_point: u64,
#[derivative(Debug(format_with = "hex_fmt"))]
pub ph_offset: u64,
#[derivative(Debug(format_with = "hex_fmt"))]
pub sh_offset: u64,
#[derivative(Debug(format_with = "hex_fmt"))]
pub flags: u32,
pub hdr_size: u16,
pub ph_entsize: u16,
pub ph_count: u16,
pub sh_entsize: u16,
pub sh_count: u16,
pub sh_nidx: u16,
}
Now, we got something wrong in the last article, when we made our build script.
fn cargo_build(path: &Path) {
println!("cargo:rerun-if-changed={}", path.display());
// etc.
}
Since we call cargo_build() with "../stage1", this will rebuild if
anything inside of stage1 changes. But here, we've changed pixie without
changing stage1, and thus, the build script won't get re-run, and stage1
won't get recompiled.
Is that what you were just now swearing about?
Me?? I swear I have no idea what you're talking about my good bear.
Let's fix it up real quick, by rerunning if anything in the crates/ folder
changed.
fn cargo_build(path: &Path) {
println!("cargo:rerun-if-changed=..");
// etc.
}
Won't that re-run it much too often? What if we change minipak, which is
not a dependency of stage1?
cargo has its own dependency tracking, so running cargo build on stage1 if
there aren't any changes should be rather cheap.
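If watching the whole folder ever did become a problem, a narrower alternative would be to emit one directive per crate that stage1 actually depends on. Here's a sketch of that idea. The list of paths is an assumption (we only know for sure that stage1 uses pixie), and the helper names are made up:

```rust
/// Hypothetical list of folders whose changes should trigger a rebuild
/// of stage1. Assumption: stage1 depends on pixie and nothing else that
/// lives in this workspace; adjust to match the real dependency graph.
pub const STAGE1_INPUTS: &[&str] = &["../stage1", "../pixie"];

/// Build one `cargo:rerun-if-changed` line per path.
pub fn rerun_directives(paths: &[&str]) -> Vec<String> {
    paths
        .iter()
        .map(|p| format!("cargo:rerun-if-changed={}", p))
        .collect()
}

/// What a build script would call: cargo reads these lines from stdout.
pub fn emit_rerun_directives() {
    for line in rerun_directives(STAGE1_INPUTS) {
        println!("{}", line);
    }
}
```

The trade-off is that this list has to be kept in sync by hand, which is exactly the kind of thing we forgot about in the first place, so watching `..` is the lazier and safer choice here.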
Let's try again:
$ cargo run --release --bin minipak -- /usr/bin/gcc -o /tmp/gcc.pak && /tmp/gcc.pak
Compiling minipak v0.1.0 (/home/amos/ftl/minipak/crates/minipak)
Compiling pixie v0.1.0 (/home/amos/ftl/minipak/crates/pixie)
Finished release [optimized + debuginfo] target(s) in 3.50s
Running `target/release/minipak /usr/bin/gcc -o /tmp/gcc.pak`
Wrote /tmp/gcc.pak (59.86% of input)
Hello from stage1!
The guest is at 16380..b0359
Parsed ObjectHeader {
endianness: Little,
version: 1,
os_abi: SysV,
typ: Exec,
machine: X86_64,
version_bis: 1,
entry_point: 0x4069e0,
ph_offset: 0x40,
sh_offset: 0x127300,
flags: 0x0,
hdr_size: 64,
ph_entsize: 56,
ph_count: 14,
sh_entsize: 64,
sh_count: 34,
sh_nidx: 33,
}
And compare with readelf's output:
$ readelf -Wh /usr/bin/gcc
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x4069e0
Start of program headers: 64 (bytes into file)
Start of section headers: 1209088 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 14
Size of section headers: 64 (bytes)
Number of section headers: 34
Section header string table index: 33
Well, the readelf authors made different choices, but all the values seem
to match up!
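The "Magic:" line is worth a second look, since it's the raw bytes our first few fields were parsed from. Here's a std-only sketch that decodes them by hand. The `Ident` struct and `parse_ident` are illustrative names, not pixie's API:

```rust
/// The 16 identification bytes from readelf's "Magic:" line above.
pub const GCC_IDENT: [u8; 16] = [
    0x7f, 0x45, 0x4c, 0x46, 0x02, 0x01, 0x01, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
];

/// The four single-byte fields that follow the magic, kept as raw bytes.
#[derive(Debug, PartialEq)]
pub struct Ident {
    pub class: u8,
    pub endianness: u8,
    pub version: u8,
    pub os_abi: u8,
}

/// Checks the magic, then picks out the next four bytes.
/// Returns None if the magic is wrong or the input is too short.
pub fn parse_ident(bytes: &[u8]) -> Option<Ident> {
    let rest = bytes.strip_prefix(b"\x7FELF")?;
    match rest {
        [class, endianness, version, os_abi, ..] => Some(Ident {
            class: *class,
            endianness: *endianness,
            version: *version,
            os_abi: *os_abi,
        }),
        _ => None,
    }
}
```

For gcc, that gives class 2 (ELF64), endianness 1 (little), version 1, and OS ABI 0 (System V), matching both our output and readelf's.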
Next up, we'll need to parse the program headers. Again, we've got a diagram for that:
// in `crates/pixie/src/format/mod.rs`
mod program_header;
pub use program_header::*;
And deku makes it relatively easy:
// in `crates/pixie/src/format/program_header.rs`
use super::prelude::*;
/// A program header (loader view, segment mapped into memory)
#[derive(Derivative, DekuRead, DekuWrite, Clone)]
#[derivative(Debug)]
pub struct ProgramHeader {
pub typ: SegmentType,
#[derivative(Debug(format_with = "hex_fmt"))]
pub flags: u32,
#[derivative(Debug(format_with = "hex_fmt"))]
pub offset: u64,
#[derivative(Debug(format_with = "hex_fmt"))]
pub vaddr: u64,
#[derivative(Debug(format_with = "hex_fmt"))]
pub paddr: u64,
#[derivative(Debug(format_with = "hex_fmt"))]
pub filesz: u64,
#[derivative(Debug(format_with = "hex_fmt"))]
pub memsz: u64,
#[derivative(Debug(format_with = "hex_fmt"))]
pub align: u64,
}
As before, we can use an enum with a "catch-all" variant, for the segment type:
// in `crates/pixie/src/format/program_header.rs`
#[derive(Debug, DekuRead, DekuWrite, Clone, Copy, PartialEq)]
#[deku(type = "u32")]
pub enum SegmentType {
#[deku(id = "0x0")]
Null,
#[deku(id = "0x1")]
Load,
#[deku(id = "0x2")]
Dynamic,
#[deku(id = "0x3")]
Interp,
#[deku(id = "0x7")]
Tls,
#[deku(id = "0x6474e551")]
GnuStack,
#[deku(id_pat = "_")]
Other(u32),
}
And we can also add a few convenience methods, because well, vaddr/memsz
and offset/filesz go together, so if we put them in a Range, it's harder
to mess up!
// in `crates/pixie/src/format/program_header.rs`
impl ProgramHeader {
pub const SIZE: u16 = 56;
pub const EXECUTE: u32 = 1;
pub const WRITE: u32 = 2;
pub const READ: u32 = 4;
/// Returns a range that spans from offset to offset+filesz
pub fn file_range(&self) -> core::ops::Range<usize> {
let start = self.offset as usize;
let len = self.filesz as usize;
let end = start + len;
start..end
}
/// Returns a range that spans from vaddr to vaddr+memsz
pub fn mem_range(&self) -> core::ops::Range<u64> {
let start = self.vaddr;
let len = self.memsz;
let end = start + len;
start..end
}
}
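Here's a standalone sketch of what those constants and ranges mean in practice, using the RW segment of minipak from the readelf output near the top of this article (offset 0x2adc8, vaddr 0x42bdc8, filesz 0x1240, memsz 0x1270). The helper names are ours, and unlike the methods above, this version uses checked addition so that a hostile header can't overflow:

```rust
use std::ops::Range;

pub const EXECUTE: u32 = 1;
pub const WRITE: u32 = 2;
pub const READ: u32 = 4;

/// Render flags roughly the way readelf's "Flg" column does,
/// e.g. "R E" or "RW " (a space stands for a missing permission).
pub fn flags_to_string(flags: u32) -> String {
    let bit = |mask: u32, c: char| if flags & mask != 0 { c } else { ' ' };
    [bit(READ, 'R'), bit(WRITE, 'W'), bit(EXECUTE, 'E')]
        .iter()
        .collect()
}

/// start..start+len, or None if the addition would overflow.
pub fn checked_range(start: u64, len: u64) -> Option<Range<u64>> {
    Some(start..start.checked_add(len)?)
}
```

For that segment, the file range is 0x2adc8..0x2c008 and the memory range is 0x42bdc8..0x42d038. The memory range is 0x30 bytes longer: that's the part that isn't in the file at all (hello, .bss) and needs to be zero-filled.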
Which brings us to the next question: how (and when?) do we parse all the program headers?
Well, we already have an Object struct in pixie, that has access to the
whole contents of whichever ELF file we happen to be parsing, and program
headers are something really useful, so let's parse them directly in
Object::new, shall we?
But before we do... I'm sure we can think of a slightly higher-level
interface to program headers. See, program headers are just that: headers.
They're a bunch of numbers, pretty much. What if we had a struct that
represents segments? Just like we had ObjectHeader and Object, where
Object is the higher-level one, that also keeps track of the corresponding
data slices?
Something like this:
// in `crates/pixie/src/lib.rs`
/// A segment as read from an ELF file
pub struct Segment<'a> {
/// The program header for this segment
header: ProgramHeader,
/// The slice for this segment (not the full ELF file)
slice: &'a [u8],
}
We could have a convenience method to build it from a ProgramHeader, and then
some getters!
// in `crates/pixie/src/lib.rs`
impl<'a> Segment<'a> {
/// Instantiate a segment
fn new(header: ProgramHeader, full_slice: &'a [u8]) -> Self {
let start = header.offset as usize;
let len = header.filesz as usize;
Segment {
header,
slice: &full_slice[start..][..len],
}
}
/// Returns the segment's type
pub fn typ(&self) -> SegmentType {
self.header.typ
}
/// Returns the segment's slice
pub fn slice(&self) -> &[u8] {
&self.slice
}
/// Returns the [`ProgramHeader`] for this segment
pub fn header(&self) -> &ProgramHeader {
&self.header
}
}
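One caveat: `&full_slice[start..][..len]` panics if the header points outside the file. That's fine for our purposes, but if you wanted to be defensive about malformed input, a checked variant could look like this sketch (the function name is made up, and it returns an Option rather than a PixieError to stay self-contained):

```rust
use std::convert::TryFrom;

/// Returns the bytes at `offset..offset+filesz` within `full`, or None
/// if the range doesn't fit in a usize or runs past the end of the file.
pub fn segment_slice(full: &[u8], offset: u64, filesz: u64) -> Option<&[u8]> {
    let start = usize::try_from(offset).ok()?;
    let len = usize::try_from(filesz).ok()?;
    let end = start.checked_add(len)?;
    full.get(start..end)
}
```

`slice::get` with a range does the bounds check for us, and `checked_add` takes care of the case where offset plus size wraps around.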
But let's think bigger! Typically when dealing with segments, we'll want to operate on one specific segment type. Or on "all the segments of a particular type".
Another thing we find ourselves doing a bunch is to build the convex hull of all the "Load" segments, effectively the smallest range that contains all the memory ranges of all the "Load" segments.
Let's do all of these upfront:
// in `crates/pixie/src/lib.rs`
use core::ops::Range;
use core::cmp::{min, max};
#[derive(displaydoc::Display, Debug)]
/// A pixie error
pub enum PixieError {
/// `{0}`
Deku(DekuError),
/// `{0}`
Encore(EncoreError),
// 👇 new
/// no segments found
NoSegmentsFound,
/// could not find segment of type `{0:?}`
SegmentNotFound(SegmentType),
}
/// A collection of segments, easy to filter.
#[derive(Default)]
pub struct Segments<'a> {
items: Vec<Segment<'a>>,
}
impl<'a> Segments<'a> {
/// Returns all segments
pub fn all(&self) -> &[Segment] {
&self.items
}
/// Returns all segments of a certain type
pub fn of_type(&self, typ: SegmentType) -> impl Iterator<Item = &Segment<'a>> + '_ {
self.items.iter().filter(move |s| s.typ() == typ)
}
/// Returns the first segment of a given type, or a `SegmentNotFound` error if none matched
pub fn find(&self, typ: SegmentType) -> Result<&Segment, PixieError> {
self.of_type(typ)
.next()
.ok_or(PixieError::SegmentNotFound(typ))
}
/// Returns the convex hull of all the load segments (no page alignment is applied)
pub fn load_convex_hull(&self) -> Result<Range<u64>, PixieError> {
let hull = self
.of_type(SegmentType::Load)
.map(|s| s.header().mem_range())
.reduce(|a, b| min(a.start, b.start)..max(a.end, b.end))
.ok_or(PixieError::NoSegmentsFound)?;
Ok(hull)
}
}
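Here's the hull computation on its own, as a sketch you can play with, fed with the four LOAD segments of minipak from the top of this article. Rounding the result out to 4K page boundaries is a separate step, shown here as a hypothetical `page_align` helper:

```rust
use std::cmp::{max, min};
use std::ops::Range;

pub const PAGE_SIZE: u64 = 0x1000;

/// Smallest range that contains every input range, or None if there
/// are no input ranges at all.
pub fn convex_hull(ranges: impl IntoIterator<Item = Range<u64>>) -> Option<Range<u64>> {
    ranges
        .into_iter()
        .reduce(|a, b| min(a.start, b.start)..max(a.end, b.end))
}

/// Rounds the start down and the end up to a page boundary.
/// (Hypothetical helper; assumes `r.end` is far from `u64::MAX`.)
pub fn page_align(r: Range<u64>) -> Range<u64> {
    let mask = !(PAGE_SIZE - 1);
    (r.start & mask)..((r.end + PAGE_SIZE - 1) & mask)
}

/// Memory ranges (vaddr..vaddr+memsz) of minipak's LOAD segments.
pub fn minipak_load_ranges() -> Vec<Range<u64>> {
    vec![
        0x400000..0x400224,
        0x401000..0x41126e,
        0x412000..0x42a3ec,
        0x42bdc8..0x42d038,
    ]
}
```

The raw hull is 0x400000..0x42d038, and once page-aligned it becomes 0x400000..0x42e000, which is exactly the span GDB showed us in `info proc mappings`.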
And now that we have all the data structures we could possibly dream of, let's
make sure they're available directly from the top-level Object struct:
// in `crates/pixie/src/lib.rs`
pub struct Object<'a> {
header: ObjectHeader,
slice: &'a [u8],
// 👇 new
segments: Segments<'a>,
}
impl<'a> Object<'a> {
// 👇 our `new` function now parses segments
/// Read an ELF object from a given slice
pub fn new(slice: &'a [u8]) -> Result<Self, PixieError> {
let input = (slice, 0);
let (_, header) = ObjectHeader::from_bytes(input)?;
// Read segments
let segments = {
let mut segments = Segments::default();
let mut input = (&slice[header.ph_offset as usize..], 0);
for _ in 0..header.ph_count {
let (rest, ph) = ProgramHeader::from_bytes(input)?;
segments.items.push(Segment::new(ph, slice));
input = rest;
}
segments
};
Ok(Self {
slice,
segments,
header,
})
}
// 👇 there's now a getter for segments
/// Returns all the program's segments
pub fn segments(&self) -> &Segments {
&self.segments
}
}
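In case the deku version feels a bit magical, here's roughly what that loop amounts to, as a std-only sketch: walk ph_count entries of 56 bytes each, starting at ph_offset, and decode each field by hand. It assumes a little-endian ELF64 file, and `RawProgramHeader` is an illustrative type, not pixie's:

```rust
use std::convert::TryInto;

/// Size of one ELF64 program header: two u32s and six u64s.
pub const PH_SIZE: usize = 56;

#[derive(Debug, PartialEq)]
pub struct RawProgramHeader {
    pub typ: u32,
    pub flags: u32,
    pub offset: u64,
    pub vaddr: u64,
    pub paddr: u64,
    pub filesz: u64,
    pub memsz: u64,
    pub align: u64,
}

/// Reads a little-endian u32 at byte offset `at`.
fn u32_at(b: &[u8], at: usize) -> Option<u32> {
    Some(u32::from_le_bytes(b.get(at..at + 4)?.try_into().ok()?))
}

/// Reads a little-endian u64 at byte offset `at`.
fn u64_at(b: &[u8], at: usize) -> Option<u64> {
    Some(u64::from_le_bytes(b.get(at..at + 8)?.try_into().ok()?))
}

/// Decodes `ph_count` program headers starting at `ph_offset`.
/// Returns None if the table doesn't fit in the file.
pub fn parse_program_headers(
    file: &[u8],
    ph_offset: usize,
    ph_count: usize,
) -> Option<Vec<RawProgramHeader>> {
    let table = file.get(ph_offset..)?;
    if table.len() < ph_count.checked_mul(PH_SIZE)? {
        return None;
    }
    table
        .chunks_exact(PH_SIZE)
        .take(ph_count)
        .map(|e| {
            Some(RawProgramHeader {
                typ: u32_at(e, 0)?,
                flags: u32_at(e, 4)?,
                offset: u64_at(e, 8)?,
                vaddr: u64_at(e, 16)?,
                paddr: u64_at(e, 24)?,
                filesz: u64_at(e, 32)?,
                memsz: u64_at(e, 40)?,
                align: u64_at(e, 48)?,
            })
        })
        .collect()
}
```

That's a lot of offsets to get right by hand, which is a decent argument for letting deku derive them from the struct definition instead.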
And with that, we are able, in stage1, to print each header, and the load
convex hull for our guest executable:
// in `crates/stage1/src/main.rs`
#[allow(clippy::unnecessary_wraps)]
fn main(_env: Env) -> Result<(), PixieError> {
println!("Hello from stage1!");
let host = File::open("/proc/self/exe")?;
let host = host.map()?;
let host = host.as_ref();
let manifest = Manifest::read_from_full_slice(host)?;
let guest_range = manifest.guest.as_range();
println!("The guest is at {:x?}", guest_range);
let guest_slice = &host[guest_range];
let uncompressed_guest =
lz4_flex::decompress_size_prepended(guest_slice).expect("invalid lz4 payload");
let guest_obj = Object::new(&uncompressed_guest[..])?;
println!("Parsed {:#?}", guest_obj.header());
// 👇 new!
for seg in guest_obj.segments().all() {
println!("{:?}", seg.header());
}
println!(
"Load convex hull: {:0x?}",
guest_obj.segments().load_convex_hull()
);
Ok(())
}
And we get:
$ cargo run --release --bin minipak -- /usr/bin/gcc -o /tmp/gcc.pak && /tmp/gcc.pak
Finished release [optimized + debuginfo] target(s) in 0.01s
Running `target/release/minipak /usr/bin/gcc -o /tmp/gcc.pak`
Wrote /tmp/gcc.pak (60.87% of input)
Hello from stage1!
The guest is at 19380..b3359
Parsed ObjectHeader {
// (cut)
}
ProgramHeader { typ: Other(6), flags: 0x4, offset: 0x40, vaddr: 0x400040, paddr: 0x400040, filesz: 0x310, memsz: 0x310, align: 0x8 }
ProgramHeader { typ: Interp, flags: 0x4, offset: 0x350, vaddr: 0x400350, paddr: 0x400350, filesz: 0x1c, memsz: 0x1c, align: 0x1 }
ProgramHeader { typ: Load, flags: 0x4, offset: 0x0, vaddr: 0x400000, paddr: 0x400000, filesz: 0x2ab8, memsz: 0x2ab8, align: 0x1000 }
ProgramHeader { typ: Load, flags: 0x5, offset: 0x3000, vaddr: 0x403000, paddr: 0x403000, filesz: 0x90fe1, memsz: 0x90fe1, align: 0x1000 }
ProgramHeader { typ: Load, flags: 0x4, offset: 0x94000, vaddr: 0x494000, paddr: 0x494000, filesz: 0x8ef64, memsz: 0x8ef64, align: 0x1000 }
ProgramHeader { typ: Load, flags: 0x6, offset: 0x123468, vaddr: 0x524468, paddr: 0x524468, filesz: 0x3c08, memsz: 0x8198, align: 0x1000 }
ProgramHeader { typ: Dynamic, flags: 0x6, offset: 0x125d38, vaddr: 0x526d38, paddr: 0x526d38, filesz: 0x1f0, memsz: 0x1f0, align: 0x8 }
ProgramHeader { typ: Other(4), flags: 0x4, offset: 0x370, vaddr: 0x400370, paddr: 0x400370, filesz: 0x40, memsz: 0x40, align: 0x8 }
ProgramHeader { typ: Other(4), flags: 0x4, offset: 0x3b0, vaddr: 0x4003b0, paddr: 0x4003b0, filesz: 0x44, memsz: 0x44, align: 0x4 }
ProgramHeader { typ: Tls, flags: 0x4, offset: 0x123468, vaddr: 0x524468, paddr: 0x524468, filesz: 0x0, memsz: 0x10, align: 0x8 }
ProgramHeader { typ: Other(1685382483), flags: 0x4, offset: 0x370, vaddr: 0x400370, paddr: 0x400370, filesz: 0x40, memsz: 0x40, align: 0x8 }
ProgramHeader { typ: Other(1685382480), flags: 0x4, offset: 0x10b644, vaddr: 0x50b644, paddr: 0x50b644, filesz: 0x316c, memsz: 0x316c, align: 0x4 }
ProgramHeader { typ: GnuStack, flags: 0x6, offset: 0x0, vaddr: 0x0, paddr: 0x0, filesz: 0x0, memsz: 0x0, align: 0x10 }
ProgramHeader { typ: Other(1685382482), flags: 0x4, offset: 0x123468, vaddr: 0x524468, paddr: 0x524468, filesz: 0x2b98, memsz: 0x2b98, align: 0x1 }
Load convex hull: Ok(400000..52c600)
How fun! But uh, I see one problem.
A problem?
Yeah! I mean, it's cool that we can parse the program headers from
/usr/bin/gcc, but I don't think we're going to be able to run it from
stage1.
Oh?
Well... what's the convex hull for stage1?
I don't know, let me see...
$ readelf -Wl /tmp/gcc.pak
Elf file type is EXEC (Executable file)
Entry point 0x410b40
There are 8 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x000224 0x000224 R 0x1000
LOAD 0x001000 0x0000000000401000 0x0000000000401000 0x01195e 0x01195e R E 0x1000
LOAD 0x013000 0x0000000000413000 0x0000000000413000 0x004280 0x004280 R 0x1000
LOAD 0x017b30 0x0000000000418b30 0x0000000000418b30 0x0014d8 0x001508 RW 0x1000
NOTE 0x000200 0x0000000000400200 0x0000000000400200 0x000024 0x000024 R 0x4
GNU_EH_FRAME 0x014f90 0x0000000000414f90 0x0000000000414f90 0x000564 0x000564 R 0x4
GNU_STACK 0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW 0x10
GNU_RELRO 0x017b30 0x0000000000418b30 0x0000000000418b30 0x0014d0 0x0014d0 R 0x1
$ gdb -quiet -ex "p/x 0x0000000000418b30+0x001508" -ex "q"
$1 = 0x41a038
It's uhh... 0x400000..0x41a038.
And what's the load convex hull for gcc?
(scrolls up) it's 0x400000..0x52c600
ohhhhhh.
Yeah. Can't really load something at the exact place we already are, right?
Right! That would be "chopping the branch we're sitting on"!
...I don't think that aphorism exists in English.
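For the record, here's the check we just did in our heads, as a small sketch. The function name is ours, and the ranges are half-open, like Rust's `Range`:

```rust
use std::ops::Range;

/// True if the two half-open ranges share at least one address.
pub fn overlaps(a: &Range<u64>, b: &Range<u64>) -> bool {
    a.start < b.end && b.start < a.end
}

/// Load convex hull of stage1, from the readelf output above.
pub fn stage1_hull() -> Range<u64> {
    0x400000..0x41a038
}

/// Load convex hull of gcc, as printed by stage1.
pub fn gcc_hull() -> Range<u64> {
    0x400000..0x52c600
}
```

`overlaps(&stage1_hull(), &gcc_hull())` is true, so mapping gcc at its preferred address would clobber the very code doing the mapping. A guest living at some other, non-overlapping range (say, a hypothetical 0x800000..0x900000) wouldn't have that problem.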
So, we can't really load GCC right now. But maybe we can load something else?
What about a nice relocatable executable?
Sure.
Let's make one:
// in `samples/hello-pie.c`
#include <stdio.h>
int main() {
printf("Hello! I am a C program.\n");
return 0;
}
# in `samples/Justfile`
hello-pie:
gcc -static-pie hello-pie.c -o hello-pie
file hello-pie
# in `samples/.gitignore`
*
!.gitignore
!*.c
!Justfile
Cool bear's hot tip
just is just a command runner. It doesn't have a lot of the implicit rules and complications that GNU make has, it doesn't do automatic dependency tracking like tup does.
It really is just a command runner. We'll be using it to remember how our sample executables should be built.
$ # from the top-level minipak/ folder
$ just samples/hello-pie
gcc -static-pie hello-pie.c -o hello-pie
file hello-pie
hello-pie: ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), dynamically linked, BuildID[sha1]=29be2c132bdb5d266cbfbd0519e890cae86d5b19, for GNU/Linux 4.4.0, not stripped
Cool bear's hot tip
Here, just picks up samples/Justfile and runs the hello-pie target.
So, let's compress this executable and see what happens:
$ cargo run --release --bin minipak -- samples/hello-pie -o /tmp/hello-pie.pak && /tmp/hello-pie.pak
Finished release [optimized + debuginfo] target(s) in 0.01s
Running `target/release/minipak samples/hello-pie -o /tmp/hello-pie.pak`
Wrote /tmp/hello-pie.pak (67.42% of input)
Hello from stage1!
The guest is at 19380..89afb
Parsed ObjectHeader {
endianness: Little,
version: 1,
os_abi: Other(
3,
),
typ: Dyn,
machine: X86_64,
version_bis: 1,
entry_point: 0x8840,
ph_offset: 0x40,
sh_offset: 0xcc198,
flags: 0x0,
hdr_size: 64,
ph_entsize: 56,
ph_count: 12,
sh_entsize: 64,
sh_count: 39,
sh_nidx: 38,
}
ProgramHeader { typ: Load, flags: 0x4, offset: 0x0, vaddr: 0x0, paddr: 0x0, filesz: 0x7f20, memsz: 0x7f20, align: 0x1000 }
ProgramHeader { typ: Load, flags: 0x5, offset: 0x8000, vaddr: 0x8000, paddr: 0x8000, filesz: 0x81f7d, memsz: 0x81f7d, align: 0x1000 }
ProgramHeader { typ: Load, flags: 0x4, offset: 0x8a000, vaddr: 0x8a000, paddr: 0x8a000, filesz: 0x28bc8, memsz: 0x28bc8, align: 0x1000 }
ProgramHeader { typ: Load, flags: 0x6, offset: 0xb3768, vaddr: 0xb4768, paddr: 0xb4768, filesz: 0x5ba8, memsz: 0x7438, align: 0x1000 }
ProgramHeader { typ: Dynamic, flags: 0x6, offset: 0xb6d58, vaddr: 0xb7d58, paddr: 0xb7d58, filesz: 0x1a0, memsz: 0x1a0, align: 0x8 }
ProgramHeader { typ: Other(4), flags: 0x4, offset: 0x2e0, vaddr: 0x2e0, paddr: 0x2e0, filesz: 0x40, memsz: 0x40, align: 0x8 }
ProgramHeader { typ: Other(4), flags: 0x4, offset: 0x320, vaddr: 0x320, paddr: 0x320, filesz: 0x44, memsz: 0x44, align: 0x4 }
ProgramHeader { typ: Tls, flags: 0x4, offset: 0xb3768, vaddr: 0xb4768, paddr: 0xb4768, filesz: 0x20, memsz: 0x60, align: 0x8 }
ProgramHeader { typ: Other(1685382483), flags: 0x4, offset: 0x2e0, vaddr: 0x2e0, paddr: 0x2e0, filesz: 0x40, memsz: 0x40, align: 0x8 }
ProgramHeader { typ: Other(1685382480), flags: 0x4, offset: 0xa6390, vaddr: 0xa6390, paddr: 0xa6390, filesz: 0x1db4, memsz: 0x1db4, align: 0x4 }
ProgramHeader { typ: GnuStack, flags: 0x6, offset: 0x0, vaddr: 0x0, paddr: 0x0, filesz: 0x0, memsz: 0x0, align: 0x10 }
ProgramHeader { typ: Other(1685382482), flags: 0x4, offset: 0xb3768, vaddr: 0xb4768, paddr: 0xb4768, filesz: 0x3898, memsz: 0x3898, align: 0x1 }
Load convex hull: Ok(0..bbba0)
Great!
The load convex hull starts at 0x0, which in this case really means we can map it anywhere. And as we've seen in Part 14, executables like that are actually self-relocating.
They statically link a part of rtld within themselves, and when they start up, they go through their own relocations and apply them.
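As a sanity check on that `0..bbba0` figure, here is a small sketch that recomputes the hull from the four `Load` headers printed above. The free function `load_convex_hull` and the `looks_relocatable` helper are stand-ins written for this example; they only mimic what the real method on our segment list is assumed to do.

```rust
use std::ops::Range;

/// (vaddr, memsz) of each `Load` segment, copied from the dump above.
const HELLO_PIE_LOAD_SEGMENTS: [(u64, u64); 4] = [
    (0x0, 0x7f20),
    (0x8000, 0x81f7d),
    (0x8a000, 0x28bc8),
    (0xb4768, 0x7438),
];

/// Smallest range covering every segment; `None` if there are no segments.
fn load_convex_hull(segments: &[(u64, u64)]) -> Option<Range<u64>> {
    let start = segments.iter().map(|&(vaddr, _)| vaddr).min()?;
    let end = segments.iter().map(|&(vaddr, memsz)| vaddr + memsz).max()?;
    Some(start..end)
}

/// Heuristic: a hull starting at zero is taken to mean "map me anywhere".
fn looks_relocatable(hull: &Range<u64>) -> bool {
    hull.start == 0
}
```

The end of the hull comes from the last segment: 0xb4768 + 0x7438 = 0xbbba0.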
So, we should just be able to map this object anywhere and jump to its entry point, and everything should work out!
But we're not going to just do that.
Oh no.
That would be too simple.
No, we know ahead of time that we're going to need to do that a bunch of times in a bunch of different scenarios, so we're going to throw YAGNI to the wind, and come up with an abstraction for that:
// in `crates/pixie/src/lib.rs`
/// An ELF object mapped into memory
pub struct MappedObject<'a> {
/// The object we mapped
object: &'a Object<'a>,
/// Load convex hull
hull: Range<u64>,
/// Difference between the start of the load convex hull
/// and where it's actually mapped. For relocatable objects,
/// it's the base we picked. For non-relocatable objects,
/// it's zero.
base_offset: u64,
/// Memory allocated for the object in question
mem: &'a mut [u8],
}
There! Just like we had an Object struct that kept track of the parsed data (the various headers) and the mapped memory, we now have a MappedObject struct that keeps track of the "input" Object, and the anonymous memory mappings we're going to copy segments into and run off of.
We'll then add a constructor to it, which takes a single argument: an address to map the object at. This only applies to relocatable objects, so, in case we're asked to map a non-relocatable object to a fixed address, we just error out, because there is no happiness down that path.
// in `crates/pixie/src/lib.rs`
#[derive(displaydoc::Display, Debug)]
/// A pixie error
pub enum PixieError {
// 👇 new!
/// cannot map non-relocatable object at fixed position
CannotMapNonRelocatableObjectAtFixedPosition,
}
impl<'a> MappedObject<'a> {
/// If `at` is Some, map at a specific address. This only works
/// with relocatable objects.
pub fn new(object: &'a Object, mut at: Option<u64>) -> Result<Self, PixieError> {
let hull = object.segments().load_convex_hull()?;
let is_relocatable = hull.start == 0;
if !is_relocatable {
// non-relocatable object, we need to map it at its fixed position
if at.is_some() {
return Err(PixieError::CannotMapNonRelocatableObjectAtFixedPosition);
}
at = Some(hull.start)
}
let mem_len = hull.end - hull.start;
let mut map_opts = MmapOptions::new(mem_len);
map_opts.prot(MmapProt::READ | MmapProt::WRITE | MmapProt::EXEC);
if let Some(at) = at {
map_opts.at(at);
}
let res = map_opts.map()?;
let base_offset = if is_relocatable { res } else { 0 };
let mem = unsafe { core::slice::from_raw_parts_mut(res as _, mem_len as _) };
let mut mapped = Self {
hull,
object,
mem,
base_offset,
};
mapped.copy_load_segments();
Ok(mapped)
}
}
Wait, everything is read+write+exec?
Well.... that's one shortcut we can take.
Isn't that just lazy?
No, in the industry we call that "an exercise left to the reader".
We got it right in elk/delf, here we just want results. You're the one who's been impatient these last couple articles!
Fair, fair. So, results!
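As an aside, for readers who do want to attempt that exercise: the `flags` values in the program headers above (0x4, 0x5, 0x6) are ELF permission bits, and they don't line up with the `mmap`/`mprotect` protection bits. Here is a minimal sketch of the translation. The constant values are the standard ELF and Linux ones; the function name `prot_for_flags` is made up for this example.

```rust
// ELF segment flag bits (p_flags).
const PF_X: u32 = 0x1;
const PF_W: u32 = 0x2;
const PF_R: u32 = 0x4;

// Linux mmap/mprotect protection bits.
const PROT_READ: u32 = 0x1;
const PROT_WRITE: u32 = 0x2;
const PROT_EXEC: u32 = 0x4;

/// Translate a segment's `p_flags` into the protection bits one would
/// pass to `mprotect` for that segment's pages.
fn prot_for_flags(p_flags: u32) -> u32 {
    let mut prot = 0;
    if p_flags & PF_R != 0 {
        prot |= PROT_READ;
    }
    if p_flags & PF_W != 0 {
        prot |= PROT_WRITE;
    }
    if p_flags & PF_X != 0 {
        prot |= PROT_EXEC;
    }
    prot
}
```

The read and execute bits trade places between the two encodings, so passing `p_flags` straight through would quietly produce the wrong permissions.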
Well, to see results we'll need to actually implement copy_load_segments.
And here's the nice thing: because we "cheated" by making everything RWX (read/write/execute), and by only mapping one big memory region (the "load convex hull"), we're effectively just doing operations on Rust slices.
It is quite lengthy though, so prepare yourselves:
// in `crates/pixie/src/lib.rs`
impl<'a> MappedObject<'a> {
/// Copies load segments from the file into the memory we mapped
fn copy_load_segments(&mut self) {
for seg in self.object.segments().of_type(SegmentType::Load) {
let mem_start = self.vaddr_to_mem_offset(seg.header().vaddr);
let dst = &mut self.mem[mem_start..][..seg.slice().len()];
dst.copy_from_slice(seg.slice());
}
}
}
There!
...but that wasn't lengthy at all!
Yes! I lied! But we only got to write such a small amount of code because we prepared everything so nicely.
Yeah well it's easy to do that when you get to first golf down the final code and then write about it.
Shhh that's behind the scenes material.
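To see the slice arithmetic in isolation, here is a toy model that swaps the real mapping for a zero-filled `Vec<u8>`. `FakeSegment` and `copy_into_hull` are hypothetical names used only in this sketch; the real code works on the `Object` and `MappedObject` types.

```rust
/// A pretend `Load` segment: where it wants to live, how much memory it
/// needs, and the bytes that actually come from the file.
struct FakeSegment<'a> {
    vaddr: u64,
    memsz: u64,
    file_bytes: &'a [u8],
}

/// Copy every segment into a zero-filled buffer that stands in for the
/// anonymous mapping covering the hull.
fn copy_into_hull(hull_start: u64, hull_end: u64, segments: &[FakeSegment]) -> Vec<u8> {
    let mut mem = vec![0u8; (hull_end - hull_start) as usize];
    for seg in segments {
        // A segment never has more file bytes than memory bytes.
        debug_assert!(seg.file_bytes.len() as u64 <= seg.memsz);
        // Same idea as `vaddr_to_mem_offset`: addresses are relative to the hull start.
        let mem_start = (seg.vaddr - hull_start) as usize;
        mem[mem_start..][..seg.file_bytes.len()].copy_from_slice(seg.file_bytes);
    }
    mem
}
```

Since the buffer starts out zeroed, just like a fresh anonymous mapping, the part of a segment past its file bytes (the gap between `filesz` and `memsz`) needs no extra work.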
I think we're missing some more utility methods though, starting with MappedObject::vaddr_to_mem_offset, which we use in MappedObject::copy_load_segments. And then a couple more:
// in `crates/pixie/src/lib.rs`
impl<'a> MappedObject<'a> {
/// Convert a vaddr to a memory offset
pub fn vaddr_to_mem_offset(&self, vaddr: u64) -> usize {
(vaddr - self.hull.start) as _
}
/// Returns a view of (potentially relocated) `mem` for a given range
pub fn vaddr_slice(&self, range: Range<u64>) -> &[u8] {
&self.mem[self.vaddr_to_mem_offset(range.start)..self.vaddr_to_mem_offset(range.end)]
}
/// Returns true if the load convex hull starts at zero, which we assume
/// means the object can be mapped anywhere.
pub fn is_relocatable(&self) -> bool {
self.hull.start == 0
}
/// Returns the offset between the object's base and where we loaded it
pub fn base_offset(&self) -> u64 {
self.base_offset
}
/// Returns the base address for this executable
pub fn base(&self) -> u64 {
self.mem.as_ptr() as _
}
}
Good! Glad we could get these out of the way early.
Now that we have all that, we should be able to just map "hello-pie" and jump to its entry point!
In order to help us debug what's going on, let's define an info! macro that just forwards to println! with a prefix:
// in `crates/stage1/src/main.rs`
extern crate alloc;
macro_rules! info {
($($tokens: tt)*) => {
println!("[stage1] {}", alloc::format!($($tokens)*));
}
}
And then we can try the simplest thing that could possibly work:
// in `crates/stage1/src/main.rs`
#[allow(clippy::unnecessary_wraps)]
fn main(_env: Env) -> Result<(), PixieError> {
// 👇 we've seen this before...
let host = File::open("/proc/self/exe")?;
let host = host.map()?;
let host = host.as_ref();
let manifest = Manifest::read_from_full_slice(host)?;
let guest_range = manifest.guest.as_range();
println!("The guest is at {:x?}", guest_range);
let guest_slice = &host[guest_range];
let uncompressed_guest =
lz4_flex::decompress_size_prepended(guest_slice).expect("invalid lz4 payload");
// 👇 and this is new!
let guest_obj = Object::new(&uncompressed_guest[..])?;
let guest_mapped = MappedObject::new(&guest_obj, None)?;
info!("Mapped guest at 0x{:x}", guest_mapped.base());
let entry_point = guest_mapped.base() + guest_obj.header().entry_point;
info!("Jumping to guest's entry point 0x{:x}", entry_point);
unsafe {
pixie::launch(entry_point);
}
}
Our launch function is going to have all the assembly we need to actually jump to our guest executable.
// in `crates/pixie/src/lib.rs`
// Let us use inline assembly!
#![feature(asm)]
mod launch;
pub use launch::*;
// in `crates/pixie/src/launch.rs`
use crate::syscall;
/// # Safety
/// Nothing about this function is safe.
#[inline(never)]
pub unsafe fn launch(entry_point: u64) -> ! {
// handy for breakpoints
syscall::dup(0);
asm!(
/////////////////////////////////
// Jump to the entry point
/////////////////////////////////
"jmp r13",
in("r13") entry_point,
options(noreturn)
)
}
Since we expect a lot of things to go wrong, it may be useful to break just before our assembly "launch pad". But it's not that easy to break on a symbol, because by the time it's actually run, it's part of the "compressed executable", which right now looks pretty standard, but that won't last long.
So, for easy debugging, we simply try to duplicate file descriptor 0. We never perform that syscall anywhere else in minipak, so it should be fairly easy to catch it from GDB.
Since we didn't add a definition for syscall::dup before, let's do it now:
// in `crates/encore/src/syscall.rs`
/// # Safety
/// Calls into the kernel.
#[inline(always)]
pub unsafe fn dup(fd: u64) {
let syscall_number: u64 = 32;
asm!(
"syscall",
// rax carries the syscall number in, and the return value out
inlateout("rax") syscall_number => _,
in("rdi") fd,
lateout("rcx") _, lateout("r11") _,
options(nostack),
);
}
And with that... we should have everything we need!
Let's go!
$ cargo run --release --bin minipak -- samples/hello-pie -o /tmp/hello-pie.pak && /tmp/hello-pie.pak
Compiling minipak v0.1.0 (/home/amos/ftl/minipak/crates/minipak)
Compiling encore v0.1.0 (/home/amos/ftl/minipak/crates/encore)
Compiling pixie v0.1.0 (/home/amos/ftl/minipak/crates/pixie)
Finished release [optimized + debuginfo] target(s) in 4.00s
Running `target/release/minipak samples/hello-pie -o /tmp/hello-pie.pak`
Wrote /tmp/hello-pie.pak (66.93% of input)
The guest is at 18380..88afb
[stage2] Mapped guest at 0x7fbdc662f000
[stage2] Jumping to guest's entry point 0x7fbdc6637840
[1] 10706 segmentation fault /tmp/hello-pie.pak
Awwwww. No first time success.
Well... let's try to rebuild hello-pie with debug information:
# in `samples/Justfile`
hello-pie:
# 👇 now asking for debug info
gcc -g -static-pie hello-pie.c -o hello-pie
file hello-pie
$ just samples/hello-pie
gcc -g -static-pie hello-pie.c -o hello-pie
file hello-pie
hello-pie: ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), dynamically linked, BuildID[sha1]=0887df3e3be755d11f82cfcd306b32ebd16962ea, for GNU/Linux 4.4.0, with debug_info, not stripped
And now, we can use that debug info. Even though we don't map the "debug info" part of the hello-pie executable into memory, we can tell GDB to use it, if we only tell it where we loaded hello-pie, just like we did in Part 9.
We just need to do some maths!
(gdb) help add-symbol-file
Load symbols from FILE, assuming FILE has been dynamically loaded.
Usage: add-symbol-file FILE [-readnow | -readnever] [-o OFF] [ADDR] [-s SECT-NAME SECT-ADDR]...
ADDR is the starting address of the file's text.
So, where does the .text section start in hello-pie?
$ readelf -WS ./samples/hello-pie | grep -E "[.]text|Address"
[Nr] Name Type Address Off Size ES Flg Lk Inf Al
[12] .text PROGBITS 0000000000008250 008250 081250 00 AX 0 0 16
Alright! So, if we pack it once again:
$ cargo run --release --bin minipak -- samples/hello-pie -o /tmp/hello-pie.pak
Finished release [optimized + debuginfo] target(s) in 0.01s
Running `target/release/minipak samples/hello-pie -o /tmp/hello-pie.pak`
Wrote /tmp/hello-pie.pak (66.86% of input)
And debug it, catching the dup syscall:
$ gdb --quiet --args /tmp/hello-pie.pak
Reading symbols from /tmp/hello-pie.pak...
(No debugging symbols found in /tmp/hello-pie.pak)
(gdb) catch syscall dup
Catchpoint 1 (syscall 'dup' [32])
(gdb) r
Starting program: /tmp/hello-pie.pak
The guest is at 18380..88cf6
[stage2] Mapped guest at 0x7fffefeb4000
[stage2] Jumping to guest's entry point 0x7fffefebc840
Catchpoint 1 (call to syscall dup), 0x000000000040d54e in ?? ()
(gdb)
So, if the guest was mapped at 0x7fffefeb4000, and its text section is supposed to be at 0x8250 (with a zero base), then the actual address of the text section is...
(gdb) p/x 0x7fffefeb4000 + 0x8250
$1 = 0x7fffefebc250
And so we should be able to get GDB to load the debug information if we simply do this:
(gdb) add-symbol-file ./samples/hello-pie 0x7fffefebc250
add symbol table from file "./samples/hello-pie" at
.text_addr = 0x7fffefebc250
(y or n) y
Reading symbols from ./samples/hello-pie...
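Since we'll be redoing that addition every time the guest lands somewhere new, here is the arithmetic as a tiny sketch. Both helper names are invented for this example, and the addresses are just the ones from the session above.

```rust
/// Address at which something linked at `vaddr` ends up once the object
/// is mapped at `base` (valid for objects whose hull starts at zero).
fn runtime_addr(base: u64, vaddr: u64) -> u64 {
    base + vaddr
}

/// Build the GDB command used above from the mapping base and the
/// `.text` address reported by `readelf -WS`.
fn add_symbol_file_cmd(path: &str, base: u64, text_vaddr: u64) -> String {
    format!("add-symbol-file {} {:#x}", path, runtime_addr(base, text_vaddr))
}
```

The same `runtime_addr` arithmetic explains the entry point printed by stage1: the header's 0x8840 plus the mapping base.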
Well? Did it work?
It's often hard to say — if you input the wrong address, then it might still show a partial stack trace and you might end up chasing the wrong thing altogether!
Ohhh is that why you were cursing so much a few weeks back?
What? Haha bear, I never curse, there must have been a mix-up.
So anyway - asking for a backtrace right now isn't very illuminating:
(gdb) backtrace
#0 0x000000000040d54e in ?? ()
#1 0x0000000000410f14 in ?? ()
#2 0x000000000040ffd1 in ?? ()
#3 0x000000000040ff98 in ?? ()
#4 0x0000000000000001 in ?? ()
#5 0x00007fffffffdf92 in ?? ()
#6 0x0000000000000000 in ?? ()
...but that's only because we haven't actually jumped to the entry point yet.
And if we do (by using stepi repeatedly), and we enable TUI mode (with Ctrl-x 2), we can see the familiar prologue:
And if we keep going, we can eventually see the segfault in action:
In this instance, it looks like it's trying to access memory that isn't mapped!
And indeed, if we look closely, we can see that $rdi points nowhere near mapped memory:
(gdb) p/x $rdi
$16 = 0x7fff7f5e1c38
(gdb) info proc mappings
process 13380
Mapped address spaces:
Start Addr End Addr Size Offset objfile
0x400000 0x401000 0x1000 0x0 /tmp/hello-pie.pak
0x401000 0x412000 0x11000 0x1000 /tmp/hello-pie.pak
0x412000 0x416000 0x4000 0x12000 /tmp/hello-pie.pak
0x417000 0x41a000 0x3000 0x16000 /tmp/hello-pie.pak
0x7fffefeb4000 0x7fffeff70000 0xbc000 0x0
0x7fffeff70000 0x7fffefffa000 0x8a000 0x0 /tmp/hello-pie.pak
0x7fffefffa000 0x7ffff7ffa000 0x8000000 0x0
0x7ffff7ffa000 0x7ffff7ffd000 0x3000 0x0 [vvar]
0x7ffff7ffd000 0x7ffff7fff000 0x2000 0x0 [vdso]
0x7ffffffdd000 0x7ffffffff000 0x22000 0x0 [stack]
Mhhhh. Maybe we've taken one too many shortcuts.
Aww. Can we at least get something working?
I don't know bear, can we? Who knows what we forgot! We could be debugging this for another day or two and not get anywhere!
Well, let's start with the fundamentals... what's the first thing hello-pie does?
I don't know... probably just the same thing we do: read command-line arguments?
Right! And where would it read those from?
Uhhh the stack?
And what's the stack pointer pointing to by the time we jump to the entry point?
Ohhh. Oh!
Yeah we definitely forgot one part. We do need to set the %rsp register before handing off control to the entry point.
Well, that's rather easy to fix!
// in `crates/stage1/src/main.rs`
#[no_mangle]
unsafe fn pre_main(stack_top: *mut u8) {
init_allocator();
// 👇 we now pass `stack_top` as well as `Env`
main(stack_top, Env::read(stack_top)).unwrap();
syscall::exit(0);
}
#[allow(clippy::unnecessary_wraps)]
// 👇
fn main(stack_top: *mut u8, _env: Env) -> Result<(), PixieError> {
// (bunch of code omitted)
let entry_point = guest_mapped.base() + guest_obj.header().entry_point;
info!("Jumping to guest's entry point 0x{:x}", entry_point);
unsafe {
// 👇
pixie::launch(stack_top, entry_point);
}
}
And then we change pixie::launch to set %rsp before jumping to the entry point:
// in `crates/pixie/src/launch.rs`
/// # Safety
/// Nothing about this function is safe.
#[inline(never)]
pub unsafe fn launch(stack_top: *mut u8, entry_point: u64) -> ! {
// handy for breakpoints
syscall::dup(0);
asm!(
/////////////////////////////////
// Set up stack pointer
/////////////////////////////////
"mov rsp, r12",
/////////////////////////////////
// Jump to the entry point
/////////////////////////////////
"jmp r13",
in("r12") stack_top,
in("r13") entry_point,
options(noreturn)
)
}
Alright! I feel better about this already.
Let's pack it again:
$ cargo run --release --bin minipak -- samples/hello-pie -o /tmp/hello-pie.pak
Compiling minipak v0.1.0 (/home/amos/ftl/minipak/crates/minipak)
Compiling encore v0.1.0 (/home/amos/ftl/minipak/crates/encore)
Compiling pixie v0.1.0 (/home/amos/ftl/minipak/crates/pixie)
Finished release [optimized + debuginfo] target(s) in 3.83s
Running `target/release/minipak samples/hello-pie -o /tmp/hello-pie.pak`
panicked at 'called `Result::unwrap()` on an `Err` value: Encore(Open("/tmp/hello-pie.pak"))', crates/minipak/src/main.rs:34:32
[1] 15155 illegal hardware instruction cargo run --release --bin minipak -- samples/hello-pie -o /tmp/hello-pie.pak
Oh, uh, what?
Don't we have a GDB session running with /tmp/hello-pie.pak?
Oh right, that'll lock the file. Let's exit the GDB session and try again:
$ cargo run --release --bin minipak -- samples/hello-pie -o /tmp/hello-pie.pak
Finished release [optimized + debuginfo] target(s) in 0.01s
Running `target/release/minipak samples/hello-pie -o /tmp/hello-pie.pak`
Wrote /tmp/hello-pie.pak (66.86% of input)
Alright. Now will it run?
$ /tmp/hello-pie.pak
The guest is at 18380..88cf6
[stage2] Mapped guest at 0x7f85dd924000
[stage2] Jumping to guest's entry point 0x7f85dd92c840
[1] 15763 segmentation fault /tmp/hello-pie.pak
Nope!
Well, let's see where it crashes this time...
$ gdb --quiet --args /tmp/hello-pie.pak
Reading symbols from /tmp/hello-pie.pak...
(No debugging symbols found in /tmp/hello-pie.pak)
(gdb) catch syscall dup
Catchpoint 1 (syscall 'dup' [32])
(gdb) r
Starting program: /tmp/hello-pie.pak
The guest is at 18380..88cf6
[stage2] Mapped guest at 0x7fffefeb4000
[stage2] Jumping to guest's entry point 0x7fffefebc840
Catchpoint 1 (call to syscall dup), 0x000000000040d554 in ?? ()
(gdb) p/x 0x7fffefeb4000 + 0x8250
$1 = 0x7fffefebc250
(gdb) add-symbol-file ./samples/hello-pie 0x7fffefebc250
add symbol table from file "./samples/hello-pie" at
.text_addr = 0x7fffefebc250
(y or n) y
Reading symbols from ./samples/hello-pie...
Huh. Right in the middle of messing with... some thread-local data.
Fun.
Let's see, what else could we have forgotten?
Well... we've thought about command-line arguments, but there's something else below the stack isn't there?
Auxiliary vectors?
Yeah.
What about them?
Well, when we're running hello-pie.pak, we're not really running hello-pie, are we? We're running stage1. Does it have the same auxiliary vectors?
Uhh...
$ gdb --quiet -ex "set confirm off" -ex "starti" -ex "info auxv" -ex "quit" --args /tmp/hello-pie.pak
Reading symbols from /tmp/hello-pie.pak...
(No debugging symbols found in /tmp/hello-pie.pak)
Starting program: /tmp/hello-pie.pak
Program stopped.
0x00000000004100a0 in ?? ()
33 AT_SYSINFO_EHDR System-supplied DSO's ELF header 0x7ffff7ffd000
16 AT_HWCAP Machine-dependent CPU capability hints 0x1f8bfbff
6 AT_PAGESZ System page size 4096
17 AT_CLKTCK Frequency of times() 100
3 AT_PHDR Program headers for program 0x400040
4 AT_PHENT Size of program header entry 56
5 AT_PHNUM Number of program headers 8
7 AT_BASE Base address of interpreter 0x0
8 AT_FLAGS Flags 0x0
9 AT_ENTRY Entry point of program 0x4100a0
11 AT_UID Real user ID 1000
12 AT_EUID Effective user ID 1000
13 AT_GID Real group ID 1000
14 AT_EGID Effective group ID 1000
23 AT_SECURE Boolean, was exec setuid-like? 0
25 AT_RANDOM Address of 16 random bytes 0x7fffffffdf79
26 AT_HWCAP2 Extension of AT_HWCAP 0x0
31 AT_EXECFN File name of executable 0x7fffffffefe5 "/tmp/hello-pie.pak"
15 AT_PLATFORM String identifying platform 0x7fffffffdf89 "x86_64"
0 AT_NULL End of vector 0x0
$ gdb --quiet -ex "set confirm off" -ex "starti" -ex "info auxv" -ex "quit" --args ./samples/hello-pie
Reading symbols from ./samples/hello-pie...
Starting program: /home/amos/ftl/minipak/samples/hello-pie
Program stopped.
0x00007ffff7f4b840 in _start ()
33 AT_SYSINFO_EHDR System-supplied DSO's ELF header 0x7ffff7f41000
16 AT_HWCAP Machine-dependent CPU capability hints 0x1f8bfbff
6 AT_PAGESZ System page size 4096
17 AT_CLKTCK Frequency of times() 100
3 AT_PHDR Program headers for program 0x7ffff7f43040
4 AT_PHENT Size of program header entry 56
5 AT_PHNUM Number of program headers 12
7 AT_BASE Base address of interpreter 0x0
8 AT_FLAGS Flags 0x0
9 AT_ENTRY Entry point of program 0x7ffff7f4b840
11 AT_UID Real user ID 1000
12 AT_EUID Effective user ID 1000
13 AT_GID Real group ID 1000
14 AT_EGID Effective group ID 1000
23 AT_SECURE Boolean, was exec setuid-like? 0
25 AT_RANDOM Address of 16 random bytes 0x7fffffffdf39
26 AT_HWCAP2 Extension of AT_HWCAP 0x0
31 AT_EXECFN File name of executable 0x7fffffffefcf "/home/amos/ftl/minipak/samples/hello-pie"
15 AT_PLATFORM String identifying platform 0x7fffffffdf49 "x86_64"
0 AT_NULL End of vector 0x0
...no.
I think Cool Bear is onto something. Not only is the number of program headers different (8 for packed, 12 for the original), the address of those program headers also must be different, because even if they were at the same file offset, we're mapping the guest somewhere completely different: not around 0x400000, but around 0x7ffff7000000.
And the program headers are definitely something a self-relocating executable would be looking at.
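For a picture of where those values come from, here is a sketch that walks a mock initial stack the way startup code would: skip `argc` and the argument pointers, skip the environment pointers, then read (type, value) pairs until `AT_NULL`. The stack is modelled as a plain slice of 64-bit words, and `read_auxv` and `find` are names invented for this example. The numeric type values are the ones GDB printed above.

```rust
/// Auxiliary vector types, with the numeric values GDB printed above.
const AT_NULL: u64 = 0;
const AT_PHDR: u64 = 3;
const AT_PHNUM: u64 = 5;
const AT_ENTRY: u64 = 9;

/// Walk a mock initial stack (64-bit words, starting at the stack top) and
/// return the auxiliary vectors as (type, value) pairs.
/// Assumed layout: argc, argv pointers, 0, envp pointers, 0, auxv pairs, AT_NULL.
fn read_auxv(stack: &[u64]) -> Vec<(u64, u64)> {
    let argc = stack[0] as usize;
    // Skip argc itself, the argv pointers, and argv's null terminator.
    let mut i = 1 + argc + 1;
    // Skip environment pointers up to and including their null terminator.
    while stack[i] != 0 {
        i += 1;
    }
    i += 1;
    // Collect (type, value) pairs until AT_NULL.
    let mut auxv = Vec::new();
    while stack[i] != AT_NULL {
        auxv.push((stack[i], stack[i + 1]));
        i += 2;
    }
    auxv
}

/// Look up a single vector by type.
fn find(auxv: &[(u64, u64)], typ: u64) -> Option<u64> {
    auxv.iter().find(|&&(t, _)| t == typ).map(|&(_, v)| v)
}
```

Whatever the kernel put in those slots for stage1 is what the guest will read, unless we overwrite it first.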
Luckily, the Env struct we made earlier will come in handy here.
There are three auxiliary vectors we need to worry about:
- PHDR, the address of the program headers
- PHNUM, the number of program headers
- ENTRY, the program's entry point
That last one may not matter as much in this particular scenario, since we're jumping directly to it, but it might come in handy in the future...
Ah there he goes, doing time travel again.
#[allow(clippy::unnecessary_wraps)]
// no longer unused, and mut: 👇
fn main(stack_top: *mut u8, mut env: Env) -> Result<(), PixieError> {
// (code omitted up until this point)
info!("Mapped guest at 0x{:x}", guest_mapped.base());
// Set phdr auxiliary vector
let at_phdr = env.find_vector(AuxvType::PHDR);
at_phdr.value = guest_mapped.base() + guest_obj.header().ph_offset;
// Set phnum auxiliary vector
let at_phnum = env.find_vector(AuxvType::PHNUM);
at_phnum.value = guest_obj.header().ph_count as _;
// Set entry auxiliary vector
let at_entry = env.find_vector(AuxvType::ENTRY);
at_entry.value = guest_mapped.base_offset() + guest_obj.header().entry_point;
let entry_point = guest_mapped.base() + guest_obj.header().entry_point;
info!("Jumping to guest's entry point 0x{:x}", entry_point);
unsafe {
pixie::launch(stack_top, entry_point);
}
}
Aaand... voilà!
$ cargo run --release --bin minipak -- samples/hello-pie -o /tmp/hello-pie.pak && /tmp/hello-pie.pak
Finished release [optimized + debuginfo] target(s) in 0.01s
Running `target/release/minipak samples/hello-pie -o /tmp/hello-pie.pak`
Wrote /tmp/hello-pie.pak (66.86% of input)
The guest is at 18380..88cf6
[stage2] Mapped guest at 0x7f6c35075000
[stage2] Jumping to guest's entry point 0x7f6c3507d840
Hello! I am a C program.
[1] 18827 segmentation fault /tmp/hello-pie.pak
Yes! No! It runs! But it segfaults at exit!
Well, nothing we haven't seen before... when we were working on delf/elk, we had to patch exit so that it didn't crash.
Yeah, but back then we were also pretending to be glibc! And we were patching dladdr as well! We should not have to do that here!
So the investigation there was actually quite a fun one, and I have to credit my friend @GranPC for finding the relevant Linux kernel and glibc code.
I couldn't find a standard that says so in written form, but, well, on Linux, by convention, most of the registers (except %rsp) are generally zeroed when program execution starts.
And in our case, they definitely aren't. We're running a bunch of code before jumping to the entry point, that uses registers left and right.
Because one specific register (%rdx) is not zeroed, glibc thinks we're registering some dummy address as a destructor, and so it jumps to that address on exit.
That address?
$ gdb --quiet --args /tmp/hello-pie.pak
Reading symbols from /tmp/hello-pie.pak...
(No debugging symbols found in /tmp/hello-pie.pak)
(gdb) r
Starting program: /tmp/hello-pie.pak
The guest is at 18380..88cf6
[stage2] Mapped guest at 0x7fffefeb4000
[stage2] Jumping to guest's entry point 0x7fffefebc840
Hello! I am a C program.
Program received signal SIGSEGV, Segmentation fault.
0x0000000000000001 in ?? ()
0x1.
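Here is a toy model of that failure, assuming (as described above) that startup code treats a non-zero value in one particular register as the address of a function to run at exit. `ToyProcess`, `start` and `exit` are made up for illustration; this is not glibc's actual code.

```rust
/// Toy stand-in for a process and its list of exit handlers.
struct ToyProcess {
    exit_handlers: Vec<u64>,
}

impl ToyProcess {
    /// Model of program startup: a non-zero register value is taken to be
    /// the address of a destructor and gets registered.
    fn start(leftover_register: u64) -> Self {
        let mut exit_handlers = Vec::new();
        if leftover_register != 0 {
            exit_handlers.push(leftover_register);
        }
        ToyProcess { exit_handlers }
    }

    /// Model of `exit`: returns the addresses that would be jumped to.
    fn exit(self) -> Vec<u64> {
        self.exit_handlers
    }
}
```

With a leftover value of `0x1`, the model "jumps" to `0x1` at exit, which is exactly where GDB saw the crash. With a zeroed register, there is nothing to call.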
So, yeah. We're going to clear registers. Except for r13, which contains our actual entry point.
And we're even going to go above and beyond. When a process starts, it gets a fresh stack, right? Below it (at higher addresses) are command-line arguments, environment variables, and auxiliary vectors. But above %rsp (at lower addresses, where the stack grows)? Should be all zeros.
Well, let's do both these things:
// in `crates/pixie/src/launch.rs`
/// # Safety
/// Nothing about this function is safe.
#[inline(never)]
pub unsafe fn launch(stack_top: *mut u8, entry_point: u64) -> ! {
// handy for breakpoints
syscall::dup(0);
asm!(
/////////////////////////////////
// Clear some of the stack
/////////////////////////////////
// Use rsi as a cursor, starting 0x1000 bytes below the stack top
"mov rsi, r12",
"sub rsi, 0x1000",
// Loop label
"$clear_stack:",
"cmp rsi, r12",
// If we reach r12 (the stack top), we're done
"je $clear_stack_done",
// Otherwise, clear 8 bytes at once
"mov qword ptr [rsi], 0",
// Then advance the cursor by 8 bytes
"add rsi, 0x8",
// And loop
"jmp $clear_stack",
"$clear_stack_done:",
/////////////////////////////////
// Set up stack pointer
/////////////////////////////////
"mov rsp, r12",
/////////////////////////////////
// Jump to the entry point
/////////////////////////////////
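// Clear everything that isn't r13 (or rsp), like the kernel does
// https://elixir.bootlin.com/linux/latest/source/arch/x86/include/asm/elf.h#L170
// Full 64-bit register names, so the whole register is cleared
"xor rax, rax",
"xor rbx, rbx",
"xor rcx, rcx",
"xor rdx, rdx",
"xor rsi, rsi",
"xor rdi, rdi",
"xor rbp, rbp",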
"xor r8, r8",
"xor r9, r9",
"xor r10, r10",
"xor r11, r11",
"xor r12, r12",
// skip r13, we have the entry point in there
"xor r14, r14",
"xor r15, r15",
// Now we can actually jump to the entry point
"jmp r13",
in("r12") stack_top,
in("r13") entry_point,
options(noreturn)
)
}
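If the stack-clearing loop is hard to follow in assembly, here is the same loop modelled in plain Rust on a mock stack of 64-bit words. `clear_below` is a name invented for this sketch, and `top` is a word index, not a real address.

```rust
/// Zero `len_bytes` bytes just below `top`, 8 bytes at a time, mirroring
/// the `clear_stack` loop. Returns the number of iterations performed.
fn clear_below(stack: &mut [u64], top: usize, len_bytes: usize) -> usize {
    let words = len_bytes / 8;
    let mut cursor = top - words; // "mov rsi, r12" then "sub rsi, 0x1000"
    let mut iterations = 0;
    while cursor != top {
        // "cmp rsi, r12" / "je $clear_stack_done"
        stack[cursor] = 0; // "mov qword ptr [rsi], 0"
        cursor += 1; // "add rsi, 0x8" (one word is 8 bytes)
        iterations += 1;
    }
    iterations
}
```

Clearing 0x1000 bytes takes 512 iterations, and nothing at or above `top` is touched, so the arguments, environment and auxiliary vectors survive.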
And just like that:
$ cargo run --release --bin minipak -- samples/hello-pie -o /tmp/hello-pie.pak && /tmp/hello-pie.pak
Compiling minipak v0.1.0 (/home/amos/ftl/minipak/crates/minipak)
Compiling pixie v0.1.0 (/home/amos/ftl/minipak/crates/pixie)
Finished release [optimized + debuginfo] target(s) in 3.60s
Running `target/release/minipak samples/hello-pie -o /tmp/hello-pie.pak`
Wrote /tmp/hello-pie.pak (66.86% of input)
The guest is at 18380..88cf6
[stage2] Mapped guest at 0x7f80bfde8000
[stage2] Jumping to guest's entry point 0x7f80bfdf0840
Hello! I am a C program.
We're golden 😎
We really, truly have made an executable packer from start to finish.
Woo!
Albeit, with a severe limitation. It can only pack and run self-relocating executables, aka "static PIE" executables.
If we try a static executable that's not relocatable, well...
$ cargo run --release --bin minipak -- ~/go/bin/hugo -o /tmp/hugo.pak && /tmp/hugo.pak
Finished release [optimized + debuginfo] target(s) in 0.01s
Running `target/release/minipak /home/amos/go/bin/hugo -o /tmp/hugo.pak`
Wrote /tmp/hugo.pak (51.05% of input)
The guest is at 18380..1edd205
[1] 20716 segmentation fault /tmp/hugo.pak
...stage1 ends up overwriting itself, and everything comes crashing down.
So we're not done yet?
Not quite. But almost!