Thanks to my sponsors: Sarah Berrettini, Niels Abildgaard, Nicholas, WeblWabl, Makoto Nakashima, jer, Geoffrey Thomas, Kristoffer Winther Balling, Berkus Decker, Mark Old, Paul Horn, Luke Yue, Jörn Huxhorn, Dirkjan Ochtman, René Ribaud, Alejandro Angulo, Jean-David Gadina, Cass, Helge Eichhorn, Michal Hošna and 255 more
ELF relocations
👋 This page was last updated ~5 years ago. Just so you know.
The last article, Position-independent code, was a mess. But who could blame us? We looked at the world, and found it to be a chaotic and seemingly nonsensical place. So, in order to blend in, we had to let go of a little bit of sanity.
The time has come to reclaim it.
Short of faulty memory sticks, memory locations don't magically turn from
0x0
into valid addresses. Someone is doing the turning, and we're going to
find out who, if it takes the rest of the series.
Let's back up a bit.
Where did it all go wrong
The executable we ended up loading and running successfully had roughly this assembly code:
; in `elk/samples/hello-pie.asm`
global _start
section .text
_start: mov rdi, 1 ; stdout fd
lea rsi, [rel msg]
mov rdx, 9 ; 8 chars + newline
mov rax, 1 ; write syscall
syscall
xor rdi, rdi ; return code 0
mov rax, 60 ; exit syscall
syscall
section .data
msg: db "hi there", 10
We compiled and linked it with:
$ # in `elk/samples/`
$ nasm -f elf64 hello-pie.asm
$ ld -pie --dynamic-linker /lib/ld-linux-x86-64.so.2 hello-pie.o -o hello-pie
We ended up with a perfectly reasonable ELF file:
$ file hello-pie
hello-pie: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-x86-64.so.2, not stripped
That we were able to load and run with elk
:
$ # in `elk/`
$ cargo b -q
$ ./target/debug/elk ./samples/hello-pie
Analyzing "./samples/hello-pie"...
(cut)
Press Enter to jmp...
hi there
But we still have a lot to elucidate. First off: why does this work?
lea rsi, [rel msg]
But this doesn't?
mov rsi, msg
To be clear - they both work. As in: they both assemble into object files, which we can link into a position-independent executable, which we can run just fine.
We just can't run the mov
version with elk.
We've done the exercise already, but let's disassemble the lea
version
again. For convenience, we'll use objdump
once again.
$ objdump -d ./hello-pie
hello-pie: file format elf64-x86-64
Disassembly of section .text:
0000000000001000 <_start>:
1000: bf 01 00 00 00 mov edi,0x1
1005: 48 8d 35 f4 1f 00 00 lea rsi,[rip+0x1ff4] # 3000 <msg>
100c: ba 09 00 00 00 mov edx,0x9
1011: b8 01 00 00 00 mov eax,0x1
1016: 0f 05 syscall
1018: 48 31 ff xor rdi,rdi
101b: b8 3c 00 00 00 mov eax,0x3c
1020: 0f 05 syscall
So, the lea
remained a lea
, although its second operand isn't [rel msg]
anymore. That makes sense - the CPU doesn't know about variable names, it's
all about addresses.
We've seen in the previous
article what RIP-relative
addressing was, and we can see that, much like in our entry_point
sample
before, this version of hello-pie
uses it.
Now let's disassemble the mov
version again:
$ nasm -f elf64 hello.asm
$ ld -pie --dynamic-linker /lib/ld-linux-x86-64.so.2 hello.o -o hello-mov-pie
$ objdump -d ./hello-mov-pie
hello-mov-pie: file format elf64-x86-64
Disassembly of section .text:
0000000000001000 <_start>:
1000: bf 01 00 00 00 mov edi,0x1
1005: 48 be 00 30 00 00 00 movabs rsi,0x3000
100c: 00 00 00
100f: ba 09 00 00 00 mov edx,0x9
1014: b8 01 00 00 00 mov eax,0x1
1019: 0f 05 syscall
101b: 48 31 ff xor rdi,rdi
101e: b8 3c 00 00 00 mov eax,0x3c
1023: 0f 05 syscall
As we've observed before, the mov
turned into a movabs
, with an absolute
source address, 0x3000
. And if we run it through elk, it segfaults, because
we're not mapping the data segment to 0x3000
- we're mapping it somewhere else.
But that version of hello-pie
.. is also position-independent. In fact, when
it's loaded via gdb, it's not mapped at 0x3000
either:
$ # in `samples/`
gdb --quiet ./hello-mov-pie
Reading symbols from ./hello-mov-pie...
(No debugging symbols found in ./hello-mov-pie)
(gdb) break _start
Breakpoint 1 at 0x1000
(gdb) run
Starting program: /home/amos/ftl/elf-series/samples/hello-mov-pie
Breakpoint 1, 0x0000555555555000 in _start ()
(gdb) disassemble
Dump of assembler code for function _start:
=> 0x0000555555555000 <+0>: mov $0x1,%edi
0x0000555555555005 <+5>: movabs $0x555555557000,%rsi
0x000055555555500f <+15>: mov $0x9,%edx
0x0000555555555014 <+20>: mov $0x1,%eax
0x0000555555555019 <+25>: syscall
0x000055555555501b <+27>: xor %rdi,%rdi
0x000055555555501e <+30>: mov $0x3c,%eax
0x0000555555555023 <+35>: syscall
End of assembler dump.
(gdb)
Hey! That address changed. Enhance!
0x0000555555555005 <+5>: movabs rsi,0x555555557000
The movabs
does not have source address 0x3000
, but 0x555555557000
.
How did it know? I guess it's time to look at some of the non-Load segments closer.
Let's look at... the Dynamic
segment for example.
This document about ELF-64 from the uClibc website says the following:
Dynamically-bound object files will have a PT_DYNAMIC program header entry. This program header entry refers to a segment containing the .dynamic section, whose contents are an array of Elf64_Dyn structures.
The Elf64_Dyn
structure in question looks like:
typedef struct {
Elf64_Sxword d_tag;
union {
Elf64_Xword d_val;
Elf64_Addr d_ptr;
} d_un;
} Elf64_Dyn;
Very well, let's read that. Since it corresponds to a ProgramHeader
entry, we'll store it
directly in the relevant struct:
// in `delf/src/lib.rs`
pub struct ProgramHeader {
// omitted: other fields
pub contents: SegmentContents,
}
We haven't really parsed other segments, and they all contain
different things, so we'll make SegmentContents
an enum - the only
non-Unknown
variant will be Dynamic
for now:
#[derive(Debug)]
// in `delf/src/lib.rs`
pub enum SegmentContents {
Dynamic(Vec<DynamicEntry>),
Unknown,
}
DynamicEntry
should be a struct, like the document said. It mentioned a
union
, but our Addr
type is a newtype over u64
, we can always access
its .0
field if we need a number, so, let's go with something simple:
// in `delf/src/lib.rs`
#[derive(Debug)]
pub struct DynamicEntry {
pub tag: DynamicTag,
pub addr: Addr,
}
Finally, DynamicTag
can have 30-odd different values - we list all of them
here for completeness. We'll find out soon enough which one our hello-pie
sample
actually contains!
// in `delf/src/lib.rs`
#[derive(Debug, TryFromPrimitive, PartialEq, Eq)]
#[repr(u64)]
pub enum DynamicTag {
Null = 0,
Needed = 1,
PltRelSz = 2,
PltGot = 3,
Hash = 4,
StrTab = 5,
SymTab = 6,
Rela = 7,
RelaSz = 8,
RelaEnt = 9,
StrSz = 10,
SymEnt = 11,
Init = 12,
Fini = 13,
SoName = 14,
RPath = 15,
Symbolic = 16,
Rel = 17,
RelSz = 18,
RelEnt = 19,
PltRel = 20,
Debug = 21,
TextRel = 22,
JmpRel = 23,
BindNow = 24,
InitArray = 25,
FiniArray = 26,
InitArraySz = 27,
FiniArraySz = 28,
RunPath = 0x1d,
LoOs = 0x60000000,
HiOs = 0x6fffffff,
LoProc = 0x70000000,
HiProc = 0x7fffffff,
}
impl_parse_for_enum!(DynamicTag, le_u64);
Cool bear's hot tip
Spoiler alert: that's not all of them.
Now that we have our data types all set up, let's adjust our parser to build them:
// in `delf/src/lib.rs`
impl ProgramHeader {
fn parse<'a>(full_input: parse::Input<'a>, i: parse::Input<'a>) -> parse::Result<'a, Self> {
use nom::sequence::tuple;
let (i, (r#type, flags)) = tuple((SegmentType::parse, SegmentFlag::parse))(i)?;
let ap = Addr::parse;
let (i, (offset, vaddr, paddr, filesz, memsz, align)) = tuple((ap, ap, ap, ap, ap, ap))(i)?;
// the new code is here:
use nom::{combinator::map, multi::many_m_n};
// this used to be directly in the `Self` struct literal, but
// we're going to use it in the next block to parse dynamic entries from it.
let slice = &full_input[offset.into()..][..filesz.into()];
let (_, contents) = match r#type {
SegmentType::Dynamic => {
// *if* this is a Dynamic segment, we parse its contents. we haven't
// implemented `DynamicEntry::parse` yet, but it's coming!
let entry_size = 16_usize;
let n = slice.len() / entry_size;
map(
many_m_n(n, n, DynamicEntry::parse),
SegmentContents::Dynamic,
)(slice)?
}
_ => (slice, SegmentContents::Unknown),
};
let res = Self {
r#type,
flags,
offset,
vaddr,
paddr,
filesz,
memsz,
align,
data: slice.to_vec(),
contents,
};
Ok((i, res))
}
}
All that's left is to implement DynamicEntry::parse
:
// in `delf/src/lib.rs`
impl DynamicEntry {
fn parse(i: parse::Input) -> parse::Result<Self> {
use nom::sequence::tuple;
let (i, (tag, addr)) = tuple((DynamicTag::parse, Addr::parse))(i)?;
Ok((i, Self { tag, addr }))
}
}
We'll also change elk
to show dynamic sections:
// in `elk/src/main.rs`
// in fn `main`
println!("Dynamic entries:");
if let Some(ds) = file
.program_headers
.iter()
.find(|ph| ph.r#type == delf::SegmentType::Dynamic)
{
if let delf::SegmentContents::Dynamic(ref table) = ds.contents {
for entry in table {
println!(" - {:?}", entry);
}
}
}
let base = 0x400000_usize;
println!("Mapping {:?} in memory...", input_path);
// (etc.)
Alright! Time to see what's hidden in this executable:
$ # in elk/
$ cargo b -q
$ ./target/debug/elk ./samples/hello-mov-pie
Analyzing "./samples/hello-mov-pie"...
Parsing failed:
Nom(MapRes) at position 12016:
00002ef0: f5 fe ff 6f 00 00 00 00 30 02 00 00 00 00 00 00 05 00 00 00
Context("DynamicTag") at position 12016:
00002ef0: f5 fe ff 6f 00 00 00 00 30 02 00 00 00 00 00 00 05 00 00 00
Nom(ManyMN) at position 12016:
00002ef0: f5 fe ff 6f 00 00 00 00 30 02 00 00 00 00 00 00 05 00 00 00
Ah. It appears, once again, that our reference documents were not exhaustive. We can find other dynamic tags in the wild, so, let's adjust.
Cool bear's hot tip
I used readelf -d
to figure it out, but that doesn't
mean you should go and do it now!
Again, it's kinda cheating. We'll get to the bottom of those entries very soon.
Let's just add a few missing variants:
// in `delf/src/lib.rs
#[derive(Debug, TryFromPrimitive, PartialEq, Eq)]
#[repr(u64)]
pub enum DynamicTag {
// omitted: existing fields
GnuHash = 0x6ffffef5,
Flags1 = 0x6ffffffb,
RelACount = 0x6ffffff9,
}
And give it another go:
$ # in elk/
$ cargo b -q
$ ./target/debug/elk ./samples/hello-mov-pie
Analyzing "./samples/hello-mov-pie"...
File {
(cut)
}
Dynamic entries:
- DynamicEntry { tag: Hash, addr: 00000220 }
- DynamicEntry { tag: GnuHash, addr: 00000230 }
- DynamicEntry { tag: StrTab, addr: 00000268 }
- DynamicEntry { tag: SymTab, addr: 00000250 }
- DynamicEntry { tag: StrSz, addr: 00000001 }
- DynamicEntry { tag: SymEnt, addr: 00000018 }
- DynamicEntry { tag: Debug, addr: 00000000 }
- DynamicEntry { tag: Rela, addr: 00000270 }
- DynamicEntry { tag: RelaSz, addr: 00000018 }
- DynamicEntry { tag: RelaEnt, addr: 00000018 }
- DynamicEntry { tag: TextRel, addr: 00000000 }
- DynamicEntry { tag: Flags1, addr: 08000000 }
- DynamicEntry { tag: RelACount, addr: 00000001 }
- DynamicEntry { tag: Null, addr: 00000000 }
- DynamicEntry { tag: Null, addr: 00000000 }
- DynamicEntry { tag: Null, addr: 00000000 }
- DynamicEntry { tag: Null, addr: 00000000 }
- DynamicEntry { tag: Null, addr: 00000000 }
(cut)
Very well then - we can see the list ends with a bunch of null entries, that's all well and good. It's probably for alignment reasons!
In fact, our ELF reference says we should stop as soon as we encounter
a Null
tag, so until of using nom::combinator::many_m_n
with a precise
count, we could use nom::combinator::many_till
.
many_till
takes parsers F
and G
, and returns a Vec
of F
's results,
along with the first result G
returned. Instead of just looking for a
8-byte sequence of 0
, or read an le_u64
and check that it equals zero,
we can use DynamicEntry::parse
for G
as well, with verify
to only
return a result if the tag
field is Null
.
// in `delf/src/lib.rs`
// in `impl ProgramHeader`
// in `fn parse`
use nom::{
combinator::{map, verify},
multi::many_till,
};
let slice = &full_input[offset.into()..][..filesz.into()];
let (_, contents) = match r#type {
SegmentType::Dynamic => map(
many_till(
DynamicEntry::parse,
verify(DynamicEntry::parse, |e| e.tag == DynamicTag::Null),
),
|(entries, _last)| SegmentContents::Dynamic(entries),
)(slice)?,
_ => (slice, SegmentContents::Unknown),
};
Note that many_till
actually returns a tuple of a Vec
of F
's results,
along with the first (and only) result G
returned. We bound it to _last
just now, and then ignored it.
Now we're left with just the non-Null
dynamic entries:
$ cargo b -q
$ ./target/debug/elk ./samples/hello-mov-pie
(cut)
Dynamic entries:
- DynamicEntry { tag: Hash, addr: 00000220 }
- DynamicEntry { tag: GnuHash, addr: 00000230 }
- DynamicEntry { tag: StrTab, addr: 00000268 }
- DynamicEntry { tag: SymTab, addr: 00000250 }
- DynamicEntry { tag: StrSz, addr: 00000001 }
- DynamicEntry { tag: SymEnt, addr: 00000018 }
- DynamicEntry { tag: Debug, addr: 00000000 }
- DynamicEntry { tag: Rela, addr: 00000270 }
- DynamicEntry { tag: RelaSz, addr: 00000018 }
- DynamicEntry { tag: RelaEnt, addr: 00000018 }
- DynamicEntry { tag: TextRel, addr: 00000000 }
- DynamicEntry { tag: Flags1, addr: 08000000 }
- DynamicEntry { tag: RelACount, addr: 00000001 }
Now, we're not terribly interested in entries like Debug
.
StrTab
sounds like it contains strings - but StrSz
is 1, so maybe there's not a whole lot of entries in there.
I want to know more about Rela
though - we've got an address,
0x270
, a total size, RelaSz
, of 0x18
, size of an entry
(RelaEnt
, also 0x18
), and a number of entries, RelaCount
which, well, that checks out, is 1
.
Alright, I suppose we should at least check to see what's in there.
Our ELF-64 reference says Rela
contains the "address of a relocation table with Elf64_Rela
entries".
Cool bear's hot tip
Hey!
That's the title of the article!
It sure is, cool bear. Looks like we've got some more parsing to do.
The Elf64_Rela
struct in question has this C definition:
typedef struct {
Elf64_Addr r_offset; /* Address of reference */
Elf64_Xword r_info; /* Symbol index and type of relocation */
Elf64_Sxword r_addend; /* Constant part of expression */
} Elf64_Rela;
Okay - now if you keep reading the reference like I did, you'll notice some bit shifting going on - the 64-bit value in the middle gets split into two 32-bit values. So we might as well parse it into two separate 32-bit values to begin with.
// in `delf/src/lib.rs
#[derive(Debug)]
pub struct Rela {
pub offset: Addr,
pub r#type: u32,
pub sym: u32,
pub addend: Addr,
}
Now, we have a tiny problem to solve. The entry with DynamicTag::Rela
gave us an address - but what kind of address? Is it an offset into the file,
or an offset into memory?
Since we're looking at program headers, which are used when executing a file, I'm ready to bet it's a memory address, because mapping segments into memory sounds like something that's done very early when launching an executable.
Very well. The thing is, I want to be able to read those Rela
entries
before we ever map any of the executable in memory. As far as parsing is
concerned, we haven't been doing memory mapping, just good old "allocating
a buffer and reading the file to it" - and then interpreting that.
So we're going to need some way to find which part of the buffer we're talking about, given a memory address. We can make a helper for that, quite easily:
// in `delf/src/lib.rs`
impl File {
pub fn segment_at(&self, addr: Addr) -> Option<&ProgramHeader> {
self.program_headers
.iter()
.filter(|ph| ph.r#type == SegmentType::Load)
.find(|ph| ph.mem_range().contains(&addr))
}
}
In fact, reading Rela
entries seems like a very useful thing to have in
the delf
library directly, instead of doing it from elk
.
It being library code, we'll need an error type:
$ # in delf/
$ cargo add thiserror@1
Adding thiserror v1.0.22 to dependencies
// in `delf/src/lib.rs`
#[derive(thiserror::Error, Debug)]
pub enum ReadRelaError {
#[error("Rela dynamic entry not found")]
RelaNotFound,
#[error("RelaSz dynamic entry not found")]
RelaSzNotFound,
#[error("Rela segment not found")]
RelaSegmentNotFound,
#[error("Parsing error")]
ParsingError(nom::error::VerboseErrorKind),
}
And some more helpers:
// in `delf/src/lib.rs`
impl File {
pub fn segment_of_type(&self, r#type: SegmentType) -> Option<&ProgramHeader> {
self.program_headers.iter().find(|ph| ph.r#type == r#type)
}
pub fn dynamic_entry(&self, tag: DynamicTag) -> Option<Addr> {
match self.segment_of_type(SegmentType::Dynamic) {
Some(ProgramHeader {
contents: SegmentContents::Dynamic(entries),
..
}) => entries.iter().find(|e| e.tag == tag).map(|e| e.addr),
_ => None,
}
}
}
And finally a read_rela_entries
method:
// in `delf/src/lib.rs`
impl Rela {
pub fn parse(i: parse::Input) -> parse::Result<Self> {
use nom::{combinator::map, number::complete::le_u32, sequence::tuple};
map(
tuple((Addr::parse, le_u32, le_u32, Addr::parse)),
|(offset, r#type, sym, addend)| Rela {
offset,
r#type,
sym,
addend,
},
)(i)
}
}
impl File {
pub fn read_rela_entries(&self) -> Result<Vec<Rela>, ReadRelaError> {
// see https://blog.rust-lang.org/2019/08/15/Rust-1.37.0.html#referring-to-enum-variants-through-type-aliases
use DynamicTag as DT;
use ReadRelaError as E;
// converting `Option<T>` to `Result<T, E>` so we can use `?` to
// bubble up errors.
let addr = self.dynamic_entry(DT::Rela).ok_or(E::RelaNotFound)?;
let len = self.dynamic_entry(DT::RelaSz).ok_or(E::RelaSzNotFound)?;
let seg = self.segment_at(addr).ok_or(E::RelaSegmentNotFound)?;
// double-slicing trick
let i = &seg.data[(addr - seg.mem_range().start).into()..][..len.into()];
use nom::multi::many0;
match many0(Rela::parse)(i) {
Ok((_, rela_entries)) => Ok(rela_entries),
Err(nom::Err::Failure(err)) | Err(nom::Err::Error(err)) => {
let e = &err.errors[0];
let (_input, error_kind) = e;
Err(E::ParsingError(error_kind.clone()))
}
// we don't use any "streaming" parsers, so `nom::Err::Incomplete` seems unlikely
_ => unreachable!(),
}
}
}
Let's try using that from elk, see how it goes!
// in `elk/src/main.rs`
// in `main`
println!("Dynamic entries:");
// (cut)
println!("Rela entries:");
let rela_entries = file.read_rela_entries()?;
for e in &rela_entries {
println!("{:#?}", e);
}
$ cargo b -q
$ ./target/debug/elk ./samples/hello-mov-pie
(cut)
Rela entries:
Rela {
offset: 00001007,
type: 8,
sym: 0,
addend: 00003000,
}
Interesting.
Very, very interesting.
What is at 0x1007
?
Let's find the corresponding segment:
// in `elk/src/main.rs`
// in `main`
println!("Rela entries:");
let rela_entries = file.read_rela_entries()?;
for e in &rela_entries {
println!("{:#?}", e);
if let Some(seg) = file.segment_at(e.offset) {
println!("... for {:#?}", seg);
}
}
$ cargo b -q
$ ./target/debug/elk ./samples/hello-mov-pie
(cut)
Rela entries:
Rela {
offset: 00001007,
type: 8,
sym: 0,
addend: 00003000,
}
... for file 00001000..00001025 | mem 00001000..00001025 | align 00001000 | R.X Load
Huh! The executable segment. Interesting.
What's in there again?
$ # `status=none` quiets it down, `if` stands for "input file" probably,
$ # `skip` skips bytes, `bs` is the block size, `count` is the number of
$ # blocks, we don't specify `of` ("output file") so it defaults to stdout,
$ # and we pipe it to `ndisasm`, in 64-bit mode, with an "origin address"
$ # (`-o`) equal to our skip. Phew.
$ dd status=none if=./samples/hello-mov-pie skip=$((0x1000)) bs=1 count=$((0x25)) | ndisasm -b 64 -o $((0x1000)) -
00001000 BF01000000 mov edi,0x1
00001005 48BE003000000000 mov rsi,0x3000
-0000
0000100F BA09000000 mov edx,0x9
00001014 B801000000 mov eax,0x1
00001019 0F05 syscall
0000101B 4831FF xor rdi,rdi
0000101E B83C000000 mov eax,0x3c
00001023 0F05 syscall
So. The mov rsi, 0x3000
instructions starts at 0x1005
, which means
that if we look at the hex dump, we should skip two bytes, and then
we're talking about these bytes:
48 BE 00 30 00 00 00 00 00 00
^^^^^^^^^^^^^^^^^^^^^^^
☝
0x1007
Now, I'm no x86 instruction expert, but I'm ready to bet that the
48 BE
parts determine that it's a mov
, to the rsi
register,
and that the 00 30 00 00 00 00 00 00
parts are.. a 64-bit
little-endian number.
Let's check in with our friend xxd
:
$ # `-g 8` shows groups of 8 bytes, `-l 8` only reads 8 bytes,
$ # `-s` is for skip, and `-e` is for little-endian
$ xxd -g 8 -l 8 -s $((0x1007)) -e ./samples/hello-mov-pie
00001007: 0000000000003000 .0......
Very well! That is indeed 0x3000
.
So those Rela
entries, they tell us exactly what part of the executable
segment to modify. That's why ./samples/hello-mov-pie
doesn't run via elk
,
but it runs normally. Because something somewhere in the system does do
those replacements.
How do we know what to replace it with, though?
The System V ABI for x86-64
To figure that out... we need to take a look a the System V ABI (Application Binary Interface) for the AMD64 Architecture. That's right! There's different relocation types for different processors, because they each have different instructions, and thus, different needs.
The most up-to-date source I've been able to find is hjl-tools/x86-psABI, which contains links to PDFs. Those won't display in a browser, they'll download. If you need a slightly older one, in a pinch, the uClibc website has you covered.
That document just so happens to have a table called "Relocation types", reproduced below:
Name | Value | Field | Calculation |
None | 0 | none | none |
64 | 1 | word64 | S + A |
PC32 | 2 | word32 | S + A - P |
GOT32 | 3 | word32 | G + A |
PLT32 | 4 | word32 | L + A - P |
COPY | 5 | none | none |
GLOB_DAT | 6 | wordclass | S |
JUMP_SLOT | 7 | wordclass | S |
RELATIVE | 8 | wordclass | B + A |
The table keeps going for a while, but for the purpose of this series, we should be all good with the those relocation types. In fact, I'm going to decide we only need three, and stick to that.
Now that we know what the possible values for the type
field of the Rela
struct are, let's turn it from an u32
into a proper enum
:
// in `delf/src/lib.rs`
#[derive(Debug, TryFromPrimitive, Clone, Copy, PartialEq, Eq)]
#[repr(u32)]
pub enum RelType {
GlobDat = 6,
JumpSlot = 7,
Relative = 8,
}
impl_parse_for_enum!(RelType, le_u32);
#[derive(Debug)]
pub struct Rela {
pub offset: Addr,
// was: `r#type: u32`
pub r#type: RelType,
pub sym: u32,
pub addend: Addr,
}
impl Rela {
pub fn parse(i: parse::Input) -> parse::Result<Self> {
use nom::{combinator::map, number::complete::le_u32, sequence::tuple};
map(
// was: `(Addr::parse, le_u32`
tuple((Addr::parse, RelType::parse, le_u32, Addr::parse)),
|(offset, r#type, sym, addend)| Rela {
offset,
r#type,
sym,
addend,
},
)(i)
}
}
$ # in `elk/`
$ cargo b -q
$ ./target/debug/elk ./samples/hello-mov-pie
(cut)
Rela {
offset: 00001007,
type: Relative,
sym: 0,
addend: 00003000,
}
... for file 00001000..00001025 | mem 00001000..00001025 | align 00001000 | R.X Load
So, our Rela entry is of type Relative
(which I'm sure you'd already
figured out). And the table above gives us a calculation of B + A
,
where A
is the addend (already a field of the Rela
struct), and B
is... the base address?
Which base address? This base address!
// in `elk/src/main.rs`
// in `main`
let base = 0x400000_usize;
println!("Mapping {:?} in memory...", input_path);
// (cut)
Let's back up. In order to execute samples/hello-pie
correctly, we have to:
- Open it (done)
- Read it to memory (done)
- Parse it to make sense of it, find and parse program headers (done)
- Find and parse all relocation entries (done)
- Map memory regions as specified by
Load
segments, although, with an offset, because mapping0x0
doesn't work too well (done) - Apply relocations (to do)
- Adjust memory regions' protection levels so that some are no longer writable, and some become executable (done)
- Jump to the entry point (adjusted for base address - done)
...and then we should be good to go!
Applying relocation, the slightly dangerous way
Let's go:
// in `elk/src/main.rs`
// cut: use directives
fn main() -> Result<(), Box<dyn Error>> {
// omitted: CLI argument handling, parse_or_print call
let rela_entries = file.read_rela_entries()?;
let base = 0x400000_usize;
println!("Mapping {:?} in memory...", input_path);
let mut mappings = Vec::new();
for ph in file
.program_headers
.iter()
.filter(|ph| ph.r#type == delf::SegmentType::Load)
.filter(|ph| ph.mem_range().end > ph.mem_range().start)
{
// this is all the same as before, until...
println!("Mapping segment @ {:?} with {:?}", ph.mem_range(), ph.flags);
let mem_range = ph.mem_range();
let len: usize = (mem_range.end - mem_range.start).into();
let start: usize = mem_range.start.0 as usize + base;
let aligned_start: usize = align_lo(start);
let padding = start - aligned_start;
let len = len + padding;
let addr: *mut u8 = unsafe { std::mem::transmute(aligned_start) };
println!("Addr: {:p}, Padding: {:08x}", addr, padding);
let map = MemoryMap::new(len, &[MapOption::MapWritable, MapOption::MapAddr(addr)])?;
println!("Copying segment data...");
unsafe {
std::ptr::copy_nonoverlapping(ph.data.as_ptr(), addr.add(padding), len);
}
// right there! this is new.
println!("Applying relocations (if any)...");
for reloc in &rela_entries {
if mem_range.contains(&reloc.offset) {
unsafe {
use std::mem::transmute as trans;
let real_segment_start = addr.add(padding);
let specified_reloc_offset = reloc.offset;
let specified_segment_start = mem_range.start;
let offset_into_segment = specified_reloc_offset - specified_segment_start;
println!(
"Applying {:?} relocation @ {:?} from segment start",
reloc.r#type, offset_into_segment
);
let reloc_addr: *mut u64 =
trans(real_segment_start.add(offset_into_segment.into()));
match reloc.r#type {
delf::RelType::Relative => {
let reloc_value = reloc.addend + delf::Addr(base as u64);
println!("Replacing with value {:?}", reloc_value);
std::ptr::write_unaligned(reloc_addr, reloc_value.0);
}
r#type => {
panic!("Unsupported relocation type {:?}", r#type);
}
}
}
}
}
// ...and, this is old again.
println!("Adjusting permissions...");
let mut protection = Protection::None;
for flag in ph.flags.iter() {
protection |= match flag {
delf::SegmentFlag::Read => Protection::Read,
delf::SegmentFlag::Write => Protection::Write,
delf::SegmentFlag::Execute => Protection::Execute,
}
}
unsafe {
protect(addr, len, protection)?;
}
mappings.push(map);
}
println!("Jumping to entry point @ {:?}...", file.entry_point);
unsafe {
jmp(std::mem::transmute(file.entry_point.0 as usize + base));
}
Ok(())
}
Well.
Life comes down to a few moments, right?
Let's try it in ugdb:
$ # in elk/
$ cargo b -q
$ ugdb ./target/debug/elk ./samples/hello-mov-pie
(gdb) break jmp
Breakpoint 1 at 0xc919: file src/main.rs, line 150.
(gdb) start
Ah! Looking at the stdout/stderr pane on the right, we can see that, when mapping
the executable segment (the 0x1000..0x1025
one), we've had one relocation to
apply (so far, so good), and we replace it with value 0x00403000
.
Our base address was 0x00400000
, and the addend was 0x3000
, so I dare say the
maths check out. But we just did a whole lot of unsafe things, it's entirely possible
we wrote to the wrong place, or wrote the right thing, but with the wrong endianness,
etc.
So did we get right? Does it actually run?
(gdb) stepi
(repeat as necessary)
Oh this looks good.
This looks real good. You can see at 0x401005
that the instruction has become
movabs rsi,x0403000
. It hasn't turned from a movabs
into something else, it
still has the register rsi
as a destination, and the endianness seems right.
Let's keep going:
(gdb) stepi
(gdb) stepi
(gdb) info reg rsi
(gdb) stepi
(gdb) stepi
(gdb) stepi
It worked!
It actually worked!
Let's cleanup elk
's output a little, now that the fundamentals
have solidified a bit.
// in `elk/src/main.rs`
use std::{env, error::Error, fs, mem::transmute, ptr::copy_nonoverlapping};
use mmap::{MapOption, MemoryMap};
use region::{protect, Protection};
fn main() -> Result<(), Box<dyn Error>> {
let input_path = env::args().nth(1).expect("usage: elk FILE");
let input = fs::read(&input_path)?;
println!("Analyzing {:?}...", input_path);
let file = match delf::File::parse_or_print_error(&input[..]) {
Some(f) => f,
None => std::process::exit(1),
};
println!("{:#?}", file);
let rela_entries = file.read_rela_entries()?;
let base = 0x400000_usize;
println!("Loading with base address @ 0x{:x}", base);
let non_empty_load_segments = file
.program_headers
.iter()
.filter(|ph| ph.r#type == delf::SegmentType::Load)
// ignore zero-length segments
.filter(|ph| ph.mem_range().end > ph.mem_range().start);
let mut mappings = Vec::new();
for ph in non_empty_load_segments {
println!("Mapping {:?} - {:?}", ph.mem_range(), ph.flags);
let mem_range = ph.mem_range();
let len: usize = (mem_range.end - mem_range.start).into();
let start: usize = mem_range.start.0 as usize + base;
let aligned_start: usize = align_lo(start);
let padding = start - aligned_start;
let len = len + padding;
let addr: *mut u8 = unsafe { transmute(aligned_start) };
if padding > 0 {
println!("(With 0x{:x} bytes of padding at the start)", padding);
}
let map = MemoryMap::new(len, &[MapOption::MapWritable, MapOption::MapAddr(addr)])?;
unsafe {
copy_nonoverlapping(ph.data.as_ptr(), addr.add(padding), len);
}
let mut num_relocs = 0;
for reloc in &rela_entries {
if mem_range.contains(&reloc.offset) {
num_relocs += 1;
unsafe {
let real_segment_start = addr.add(padding);
let offset_into_segment = reloc.offset - mem_range.start;
let reloc_addr = real_segment_start.add(offset_into_segment.into());
match reloc.r#type {
delf::RelType::Relative => {
// this assumes `reloc_addr` is 8-byte aligned. if this isn't
// the case, we would crash, and so would the target executable.
let reloc_addr: *mut u64 = transmute(reloc_addr);
let reloc_value = reloc.addend + delf::Addr(base as u64);
std::ptr::write_unaligned(reloc_addr, reloc_value.0);
}
r#type => {
panic!("Unsupported relocation type {:?}", r#type);
}
}
}
}
}
if num_relocs > 0 {
println!("(Applied {} relocations)", num_relocs);
}
let mut protection = Protection::NONE;
for flag in ph.flags.iter() {
protection |= match flag {
delf::SegmentFlag::Read => Protection::READ,
delf::SegmentFlag::Write => Protection::WRITE,
delf::SegmentFlag::Execute => Protection::EXECUTE,
}
}
unsafe {
protect(addr, len, protection)?;
}
mappings.push(map);
}
println!("Jumping to entry point @ {:?}...", file.entry_point);
unsafe {
jmp(transmute(file.entry_point.0 as usize + base));
}
Ok(())
}
And here's the complete output, without ugdb
:
$ cargo b -q
$ ./target/debug/elk ./samples/hello-mov-pie
Analyzing "./samples/hello-mov-pie"...
File {
type: Dyn,
machine: X86_64,
entry_point: 00001000,
program_headers: [
file 00000040..00000200 | mem 00000040..00000200 | align 00000008 | R.. PHdr,
file 00000200..0000021a | mem 00000200..0000021a | align 00000001 | R.. Interp,
file 00000000..00000288 | mem 00000000..00000288 | align 00001000 | R.. Load,
file 00001000..00001025 | mem 00001000..00001025 | align 00001000 | R.X Load,
file 00002000..00002000 | mem 00002000..00002000 | align 00001000 | R.. Load,
file 00002ee0..00003009 | mem 00002ee0..00003009 | align 00001000 | RW. Load,
file 00002ee0..00003000 | mem 00002ee0..00003000 | align 00000008 | RW. Dynamic,
file 00002ee0..00003000 | mem 00002ee0..00003000 | align 00000001 | R.. GnuRelRo,
],
}
Loading with base address @ 0x400000
Mapping 00000000..00000288 - BitFlags<SegmentFlag>(0b100, Read)
Mapping 00001000..00001025 - BitFlags<SegmentFlag>(0b101, Execute | Read)
(Applied 1 relocations)
Mapping 00002ee0..00003009 - BitFlags<SegmentFlag>(0b110, Write | Read)
(With 0xee0 bytes of padding at the start)
Jumping to entry point @ 00001000...
hi there
Drinks and cheers all around
And there you have it. We are now able to execute any position-independent
executable. For example, our nodata
sample runs just fine:
$ ./target/debug/elk ./samples/nodata
Analyzing "./samples/nodata"...
File {
type: Exec,
machine: X86_64,
entry_point: 00401000,
program_headers: [
file 00000000..000000b0 | mem 00400000..004000b0 | align 00001000 | R.. Load,
file 00001000..00001057 | mem 00401000..00401057 | align 00001000 | R.X Load,
],
}
Error: RelaNotFound
Uh...
// in `elk/src/main.rs`
let rela_entries = file.read_rela_entries().unwrap_or_else(|e| {
println!("Could not read relocations: {:?}", e);
Default::default()
});
let base = 0x400000_usize;
As I was saying, our nodata
sample runs just fine:
$ cargo b -q
$ ./target/debug/elk ./samples/nodata
Analyzing "./samples/nodata"...
File {
type: Exec,
machine: X86_64,
entry_point: 00401000,
program_headers: [
file 00000000..000000b0 | mem 00400000..004000b0 | align 00001000 | R.. Load,
file 00001000..00001057 | mem 00401000..00401057 | align 00001000 | R.X Load,
],
}
Could not read relocations: RelaNotFound
Loading with base address @ 0x400000
Mapping 00400000..004000b0 - BitFlags<SegmentFlag>(0b100, Read)
Mapping 00401000..00401057 - BitFlags<SegmentFlag>(0b101, Execute | Read)
Jumping to entry point @ 00401000...
okay then
And even a C program like entry_point
surely ought to run fine, now
that we've gotten to the bottom of this bleak relocation business:
./target/debug/elk ./samples/entry_point
Analyzing "./samples/entry_point"...
Parsing failed:
Nom(MapRes) at position 12104:
00002f48: fe ff ff 6f 00 00 00 00 d0 04 00 00 00 00 00 00 ff ff ff 6f
Context("DynamicTag") at position 12104:
00002f48: fe ff ff 6f 00 00 00 00 d0 04 00 00 00 00 00 00 ff ff ff 6f
Nom(ManyTill) at position 12104:
00002f48: fe ff ff 6f 00 00 00 00 d0 04 00 00 00 00 00 00 ff ff ff 6f
Ugh. More dynamic tags. Alright GNU, have it your way:
// in `delf/src/lib.rs`
#[derive(Debug, TryFromPrimitive, PartialEq, Eq)]
#[repr(u64)]
pub enum DynamicTag {
// omitted: tags 0 through 28
Flags = 30,
// As it turns out, `LoOs` and `HiOs` were the minimum and maximum values
// for operating system specific tags. And there's a bunch of those..
GnuHash = 0x6ffffef5,
VerSym = 0x6ffffff0,
RelaCount = 0x6ffffff9,
Flags1 = 0x6ffffffb,
VerDef = 0x6ffffffc,
VerDefNum = 0x6ffffffd,
VerNeed = 0x6ffffffe,
VerNeedNum = 0x6fffffff,
}
As I was saying, even a C executable like samples/entry_point
ought to
run completely f...
$ ./target/debug/elk ./samples/entry_point
Analyzing "./samples/entry_point"...
File {
type: Dyn,
machine: X86_64,
entry_point: 00001070,
program_headers: [
file 00000040..000002a8 | mem 00000040..000002a8 | align 00000008 | R.. PHdr,
file 000002a8..000002c4 | mem 000002a8..000002c4 | align 00000001 | R.. Interp,
file 00000000..00000628 | mem 00000000..00000628 | align 00001000 | R.. Load,
file 00001000..000012d5 | mem 00001000..000012d5 | align 00001000 | R.X Load,
file 00002000..000021a0 | mem 00002000..000021a0 | align 00001000 | R.. Load,
file 00002de8..00003050 | mem 00003de8..00004058 | align 00001000 | RW. Load,
file 00002df8..00002fd8 | mem 00003df8..00003fd8 | align 00000008 | RW. Dynamic,
file 000002c4..00000308 | mem 000002c4..00000308 | align 00000004 | R.. Note,
file 00002094..000020c8 | mem 00002094..000020c8 | align 00000004 | R.. GnuEhFrame,
file 00000000..00000000 | mem 00000000..00000000 | align 00000010 | RW. GnuStack,
file 00002de8..00003000 | mem 00003de8..00004000 | align 00000001 | R.. GnuRelRo,
],
}
Loading with base address @ 0x400000
Mapping 00000000..00000628 - BitFlags<SegmentFlag>(0b100, Read)
Mapping 00001000..000012d5 - BitFlags<SegmentFlag>(0b101, Execute | Read)
Mapping 00002000..000021a0 - BitFlags<SegmentFlag>(0b100, Read)
Mapping 00003de8..00004058 - BitFlags<SegmentFlag>(0b110, Write | Read)
(With 0xde8 bytes of padding at the start)
thread 'main' panicked at 'Unsupported relocation type GlobDat', src/main.rs:72:29
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace.
Oh bother.
We're not done yet, are we.
What did we learn?
Along with some machine code (in the R+X segment) and some data (in R+W and R segments), ELF-64 files contain a dynamic segment - which is a subslice of one of the Load segments.
Dynamic entries are either addresses or 64-bit numbers. One of them indicates the address of a relocation table. That table contains entries that describe exactly how to patch the machine code so that it runs with a given base address.
"Relative" relocations are rather simple to apply. But apparently, there's other kinds. Oh no. Is it too late to change subjects? How about a series on plants? Plants are nice. They relocate a lot more easily. Let's talk about plants next time. No? Alright then.
Here's another article just for you:
Peeking inside a Rust enum
During a recent Rust Q&A Session on my twitch
channel, someone asked a question that
seemed simple: why are small string types, like SmartString
or SmolStr
,
the same size as String
, but small vec types, like SmallVec
, are larger
than Vec
?
Now I know I just used the adjective simple, but the truth of the matter is: to understand the question, we're going to need a little bit of background.