ELF relocations

👋 This page was last updated ~5 years ago. Just so you know.

The last article, Position-independent code, was a mess. But who could blame us? We looked at the world, and found it to be a chaotic and seemingly nonsensical place. So, in order to blend in, we had to let go of a little bit of sanity.

The time has come to reclaim it.

Short of faulty memory sticks, memory locations don't magically turn from 0x0 into valid addresses. Someone is doing the turning, and we're going to find out who, if it takes the rest of the series.

Let's back up a bit.

Where did it all go wrong

The executable we ended up loading and running successfully had roughly this assembly code:

; in `elk/samples/hello-pie.asm`

        global _start

        section .text

_start: mov rdi, 1      ; stdout fd
        lea rsi, [rel msg]
        mov rdx, 9      ; 8 chars + newline
        mov rax, 1      ; write syscall
        syscall

        xor rdi, rdi    ; return code 0
        mov rax, 60     ; exit syscall
        syscall

        section .data

msg:    db "hi there", 10

We compiled and linked it with:

$ # in `elk/samples/`
$ nasm -f elf64 hello-pie.asm
$ ld -pie --dynamic-linker /lib/ld-linux-x86-64.so.2 hello-pie.o -o hello-pie

We ended up with a perfectly reasonable ELF file:

$ file hello-pie
hello-pie: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-x86-64.so.2, not stripped

That we were able to load and run with elk:

$ # in `elk/`
$ cargo b -q
$ ./target/debug/elk ./samples/hello-pie
Analyzing "./samples/hello-pie"...
(cut)
Press Enter to jmp...

hi there

But we still have a lot to elucidate. First off: why does this work?

        lea     rsi, [rel msg]

But this doesn't?

        mov     rsi, msg

To be clear - they both work. As in: they both assemble into object files, which we can link into a position-independent executable, which we can run just fine.

We just can't run the mov version with elk.

We've done the exercise already, but let's disassemble the lea version again. For convenience, we'll use objdump once again.

$ objdump -d ./hello-pie

hello-pie:     file format elf64-x86-64


Disassembly of section .text:

0000000000001000 <_start>:
    1000:       bf 01 00 00 00          mov    edi,0x1
    1005:       48 8d 35 f4 1f 00 00    lea    rsi,[rip+0x1ff4]        # 3000 <msg>
    100c:       ba 09 00 00 00          mov    edx,0x9
    1011:       b8 01 00 00 00          mov    eax,0x1
    1016:       0f 05                   syscall
    1018:       48 31 ff                xor    rdi,rdi
    101b:       b8 3c 00 00 00          mov    eax,0x3c
    1020:       0f 05                   syscall

So, the lea remained a lea, although its second operand isn't [rel msg] anymore. That makes sense - the CPU doesn't know about variable names, it's all about addresses.

We've seen in the previous article what RIP-relative addressing was, and we can see that, much like in our entry_point sample before, this version of hello-pie uses it.

Now let's disassemble the mov version again:

$ nasm -f elf64 hello.asm
$ ld -pie --dynamic-linker /lib/ld-linux-x86-64.so.2 hello.o -o hello-mov-pie
$ objdump -d ./hello-mov-pie

hello-mov-pie:     file format elf64-x86-64


Disassembly of section .text:

0000000000001000 <_start>:
    1000:       bf 01 00 00 00          mov    edi,0x1
    1005:       48 be 00 30 00 00 00    movabs rsi,0x3000
    100c:       00 00 00
    100f:       ba 09 00 00 00          mov    edx,0x9
    1014:       b8 01 00 00 00          mov    eax,0x1
    1019:       0f 05                   syscall
    101b:       48 31 ff                xor    rdi,rdi
    101e:       b8 3c 00 00 00          mov    eax,0x3c
    1023:       0f 05                   syscall

As we've observed before, the mov turned into a movabs, with an absolute source address, 0x3000. And if we run it through elk, it segfaults, because we're not mapping the data segment to 0x3000 - we're mapping it somewhere else.

But that version of hello-pie.. is also position-independent. In fact, when it's loaded via gdb, it's not mapped at 0x3000 either:

$ # in `samples/`
gdb --quiet ./hello-mov-pie
Reading symbols from ./hello-mov-pie...
(No debugging symbols found in ./hello-mov-pie)
(gdb) break _start
Breakpoint 1 at 0x1000
(gdb) run
Starting program: /home/amos/ftl/elf-series/samples/hello-mov-pie

Breakpoint 1, 0x0000555555555000 in _start ()
(gdb) disassemble
Dump of assembler code for function _start:
=> 0x0000555555555000 <+0>:     mov    $0x1,%edi
   0x0000555555555005 <+5>:     movabs $0x555555557000,%rsi
   0x000055555555500f <+15>:    mov    $0x9,%edx
   0x0000555555555014 <+20>:    mov    $0x1,%eax
   0x0000555555555019 <+25>:    syscall
   0x000055555555501b <+27>:    xor    %rdi,%rdi
   0x000055555555501e <+30>:    mov    $0x3c,%eax
   0x0000555555555023 <+35>:    syscall
End of assembler dump.
(gdb)

Hey! That address changed. Enhance!

   0x0000555555555005 <+5>:     movabs rsi,0x555555557000

The movabs does not have source address 0x3000, but 0x555555557000.

How did it know? I guess it's time to look at some of the non-Load segments closer.

Let's look at... the Dynamic segment for example.

This document about ELF-64 from the uClibc website says the following:

Dynamically-bound object files will have a PT_DYNAMIC program header entry. This program header entry refers to a segment containing the .dynamic section, whose contents are an array of Elf64_Dyn structures.

The Elf64_Dyn structure in question looks like:

typedef struct {
    Elf64_Sxword d_tag;
    union {
        Elf64_Xword d_val;
        Elf64_Addr d_ptr;
    } d_un;
} Elf64_Dyn;

Very well, let's read that. Since it corresponds to a ProgramHeader entry, we'll store it directly in the relevant struct:

// in `delf/src/lib.rs`

pub struct ProgramHeader {
    // omitted: other fields
    pub contents: SegmentContents,
}

We haven't really parsed other segments, and they all contain different things, so we'll make SegmentContents an enum - the only non-Unknown variant will be Dynamic for now:

#[derive(Debug)]
// in `delf/src/lib.rs`

pub enum SegmentContents {
    Dynamic(Vec<DynamicEntry>),
    Unknown,
}

DynamicEntry should be a struct, like the document said. It mentioned a union, but our Addr type is a newtype over u64, we can always access its .0 field if we need a number, so, let's go with something simple:

// in `delf/src/lib.rs`

#[derive(Debug)]
pub struct DynamicEntry {
    pub tag: DynamicTag,
    pub addr: Addr,
}

Finally, DynamicTag can have 30-odd different values - we list all of them here for completeness. We'll find out soon enough which one our hello-pie sample actually contains!

// in `delf/src/lib.rs`

#[derive(Debug, TryFromPrimitive, PartialEq, Eq)]
#[repr(u64)]
pub enum DynamicTag {
    Null = 0,
    Needed = 1,
    PltRelSz = 2,
    PltGot = 3,
    Hash = 4,
    StrTab = 5,
    SymTab = 6,
    Rela = 7,
    RelaSz = 8,
    RelaEnt = 9,
    StrSz = 10,
    SymEnt = 11,
    Init = 12,
    Fini = 13,
    SoName = 14,
    RPath = 15,
    Symbolic = 16,
    Rel = 17,
    RelSz = 18,
    RelEnt = 19,
    PltRel = 20,
    Debug = 21,
    TextRel = 22,
    JmpRel = 23,
    BindNow = 24,
    InitArray = 25,
    FiniArray = 26,
    InitArraySz = 27,
    FiniArraySz = 28,
    RunPath = 0x1d,
    LoOs = 0x60000000,
    HiOs = 0x6fffffff,
    LoProc = 0x70000000,
    HiProc = 0x7fffffff,
}

impl_parse_for_enum!(DynamicTag, le_u64);
Cool bear

Cool bear's hot tip

Spoiler alert: that's not all of them.

Now that we have our data types all set up, let's adjust our parser to build them:

// in `delf/src/lib.rs`

impl ProgramHeader {
    fn parse<'a>(full_input: parse::Input<'a>, i: parse::Input<'a>) -> parse::Result<'a, Self> {
        use nom::sequence::tuple;
        let (i, (r#type, flags)) = tuple((SegmentType::parse, SegmentFlag::parse))(i)?;

        let ap = Addr::parse;
        let (i, (offset, vaddr, paddr, filesz, memsz, align)) = tuple((ap, ap, ap, ap, ap, ap))(i)?;

        // the new code is here:
        use nom::{combinator::map, multi::many_m_n};
        // this used to be directly in the `Self` struct literal, but
        // we're going to use it in the next block to parse dynamic entries from it.
        let slice = &full_input[offset.into()..][..filesz.into()];
        let (_, contents) = match r#type {
            SegmentType::Dynamic => {
                // *if* this is a Dynamic segment, we parse its contents. we haven't
                // implemented `DynamicEntry::parse` yet, but it's coming!
                let entry_size = 16_usize;
                let n = slice.len() / entry_size;
                map(
                    many_m_n(n, n, DynamicEntry::parse),
                    SegmentContents::Dynamic,
                )(slice)?
            }
            _ => (slice, SegmentContents::Unknown),
        };

        let res = Self {
            r#type,
            flags,
            offset,
            vaddr,
            paddr,
            filesz,
            memsz,
            align,
            data: slice.to_vec(),
            contents,
        };
        Ok((i, res))
    }
}

All that's left is to implement DynamicEntry::parse:

// in `delf/src/lib.rs`

impl DynamicEntry {
    fn parse(i: parse::Input) -> parse::Result<Self> {
        use nom::sequence::tuple;
        let (i, (tag, addr)) = tuple((DynamicTag::parse, Addr::parse))(i)?;
        Ok((i, Self { tag, addr }))
    }
}

We'll also change elk to show dynamic sections:

// in `elk/src/main.rs`
// in fn `main`

    println!("Dynamic entries:");
    if let Some(ds) = file
        .program_headers
        .iter()
        .find(|ph| ph.r#type == delf::SegmentType::Dynamic)
    {
        if let delf::SegmentContents::Dynamic(ref table) = ds.contents {
            for entry in table {
                println!(" - {:?}", entry);
            }
        }
    }

    let base = 0x400000_usize;

    println!("Mapping {:?} in memory...", input_path);
    // (etc.)

Alright! Time to see what's hidden in this executable:

$ # in elk/
$ cargo b -q
$ ./target/debug/elk ./samples/hello-mov-pie
Analyzing "./samples/hello-mov-pie"...
Parsing failed:
Nom(MapRes) at position 12016:
00002ef0: f5 fe ff 6f 00 00 00 00 30 02 00 00 00 00 00 00 05 00 00 00
Context("DynamicTag") at position 12016:
00002ef0: f5 fe ff 6f 00 00 00 00 30 02 00 00 00 00 00 00 05 00 00 00
Nom(ManyMN) at position 12016:
00002ef0: f5 fe ff 6f 00 00 00 00 30 02 00 00 00 00 00 00 05 00 00 00

Ah. It appears, once again, that our reference documents were not exhaustive. We can find other dynamic tags in the wild, so, let's adjust.

Cool bear

Cool bear's hot tip

I used readelf -d to figure it out, but that doesn't mean you should go and do it now!

Again, it's kinda cheating. We'll get to the bottom of those entries very soon.

Let's just add a few missing variants:

// in `delf/src/lib.rs

#[derive(Debug, TryFromPrimitive, PartialEq, Eq)]
#[repr(u64)]
pub enum DynamicTag {
    // omitted: existing fields
    GnuHash = 0x6ffffef5,
    Flags1 = 0x6ffffffb,
    RelACount = 0x6ffffff9,
}

And give it another go:

$ # in elk/
$ cargo b -q
$ ./target/debug/elk ./samples/hello-mov-pie
Analyzing "./samples/hello-mov-pie"...
File {
    (cut)
}
Dynamic entries:
 - DynamicEntry { tag: Hash, addr: 00000220 }
 - DynamicEntry { tag: GnuHash, addr: 00000230 }
 - DynamicEntry { tag: StrTab, addr: 00000268 }
 - DynamicEntry { tag: SymTab, addr: 00000250 }
 - DynamicEntry { tag: StrSz, addr: 00000001 }
 - DynamicEntry { tag: SymEnt, addr: 00000018 }
 - DynamicEntry { tag: Debug, addr: 00000000 }
 - DynamicEntry { tag: Rela, addr: 00000270 }
 - DynamicEntry { tag: RelaSz, addr: 00000018 }
 - DynamicEntry { tag: RelaEnt, addr: 00000018 }
 - DynamicEntry { tag: TextRel, addr: 00000000 }
 - DynamicEntry { tag: Flags1, addr: 08000000 }
 - DynamicEntry { tag: RelACount, addr: 00000001 }
 - DynamicEntry { tag: Null, addr: 00000000 }
 - DynamicEntry { tag: Null, addr: 00000000 }
 - DynamicEntry { tag: Null, addr: 00000000 }
 - DynamicEntry { tag: Null, addr: 00000000 }
 - DynamicEntry { tag: Null, addr: 00000000 }
(cut)

Very well then - we can see the list ends with a bunch of null entries, that's all well and good. It's probably for alignment reasons!

In fact, our ELF reference says we should stop as soon as we encounter a Null tag, so until of using nom::combinator::many_m_n with a precise count, we could use nom::combinator::many_till.

many_till takes parsers F and G, and returns a Vec of F's results, along with the first result G returned. Instead of just looking for a 8-byte sequence of 0, or read an le_u64 and check that it equals zero, we can use DynamicEntry::parse for G as well, with verify to only return a result if the tag field is Null.

// in `delf/src/lib.rs`
// in `impl ProgramHeader`
// in `fn parse`

        use nom::{
            combinator::{map, verify},
            multi::many_till,
        };
        let slice = &full_input[offset.into()..][..filesz.into()];
        let (_, contents) = match r#type {
            SegmentType::Dynamic => map(
                many_till(
                    DynamicEntry::parse,
                    verify(DynamicEntry::parse, |e| e.tag == DynamicTag::Null),
                ),
                |(entries, _last)| SegmentContents::Dynamic(entries),
            )(slice)?,
            _ => (slice, SegmentContents::Unknown),
        };

Note that many_till actually returns a tuple of a Vec of F's results, along with the first (and only) result G returned. We bound it to _last just now, and then ignored it.

Now we're left with just the non-Null dynamic entries:

$ cargo b -q
$ ./target/debug/elk ./samples/hello-mov-pie
(cut)
Dynamic entries:
 - DynamicEntry { tag: Hash, addr: 00000220 }
 - DynamicEntry { tag: GnuHash, addr: 00000230 }
 - DynamicEntry { tag: StrTab, addr: 00000268 }
 - DynamicEntry { tag: SymTab, addr: 00000250 }
 - DynamicEntry { tag: StrSz, addr: 00000001 }
 - DynamicEntry { tag: SymEnt, addr: 00000018 }
 - DynamicEntry { tag: Debug, addr: 00000000 }
 - DynamicEntry { tag: Rela, addr: 00000270 }
 - DynamicEntry { tag: RelaSz, addr: 00000018 }
 - DynamicEntry { tag: RelaEnt, addr: 00000018 }
 - DynamicEntry { tag: TextRel, addr: 00000000 }
 - DynamicEntry { tag: Flags1, addr: 08000000 }
 - DynamicEntry { tag: RelACount, addr: 00000001 }

Now, we're not terribly interested in entries like Debug. StrTab sounds like it contains strings - but StrSz is 1, so maybe there's not a whole lot of entries in there.

I want to know more about Rela though - we've got an address, 0x270, a total size, RelaSz, of 0x18, size of an entry (RelaEnt, also 0x18), and a number of entries, RelaCount which, well, that checks out, is 1.

Alright, I suppose we should at least check to see what's in there. Our ELF-64 reference says Rela contains the "address of a relocation table with Elf64_Rela entries".

Cool bear

Cool bear's hot tip

Hey!

That's the title of the article!

It sure is, cool bear. Looks like we've got some more parsing to do.

The Elf64_Rela struct in question has this C definition:

typedef struct {
    Elf64_Addr r_offset; /* Address of reference */
    Elf64_Xword r_info; /* Symbol index and type of relocation */
    Elf64_Sxword r_addend; /* Constant part of expression */
} Elf64_Rela;

Okay - now if you keep reading the reference like I did, you'll notice some bit shifting going on - the 64-bit value in the middle gets split into two 32-bit values. So we might as well parse it into two separate 32-bit values to begin with.

// in `delf/src/lib.rs

#[derive(Debug)]
pub struct Rela {
    pub offset: Addr,
    pub r#type: u32,
    pub sym: u32,
    pub addend: Addr,
}

Now, we have a tiny problem to solve. The entry with DynamicTag::Rela gave us an address - but what kind of address? Is it an offset into the file, or an offset into memory?

Since we're looking at program headers, which are used when executing a file, I'm ready to bet it's a memory address, because mapping segments into memory sounds like something that's done very early when launching an executable.

Very well. The thing is, I want to be able to read those Rela entries before we ever map any of the executable in memory. As far as parsing is concerned, we haven't been doing memory mapping, just good old "allocating a buffer and reading the file to it" - and then interpreting that.

So we're going to need some way to find which part of the buffer we're talking about, given a memory address. We can make a helper for that, quite easily:

// in `delf/src/lib.rs`

impl File {
    pub fn segment_at(&self, addr: Addr) -> Option<&ProgramHeader> {
        self.program_headers
            .iter()
            .filter(|ph| ph.r#type == SegmentType::Load)
            .find(|ph| ph.mem_range().contains(&addr))
    }
}

In fact, reading Rela entries seems like a very useful thing to have in the delf library directly, instead of doing it from elk.

It being library code, we'll need an error type:

$ # in delf/
$ cargo add thiserror@1
      Adding thiserror v1.0.22 to dependencies
// in `delf/src/lib.rs`

#[derive(thiserror::Error, Debug)]
pub enum ReadRelaError {
    #[error("Rela dynamic entry not found")]
    RelaNotFound,
    #[error("RelaSz dynamic entry not found")]
    RelaSzNotFound,
    #[error("Rela segment not found")]
    RelaSegmentNotFound,
    #[error("Parsing error")]
    ParsingError(nom::error::VerboseErrorKind),
}

And some more helpers:

// in `delf/src/lib.rs`

impl File {
    pub fn segment_of_type(&self, r#type: SegmentType) -> Option<&ProgramHeader> {
        self.program_headers.iter().find(|ph| ph.r#type == r#type)
    }

    pub fn dynamic_entry(&self, tag: DynamicTag) -> Option<Addr> {
        match self.segment_of_type(SegmentType::Dynamic) {
            Some(ProgramHeader {
                contents: SegmentContents::Dynamic(entries),
                ..
            }) => entries.iter().find(|e| e.tag == tag).map(|e| e.addr),
            _ => None,
        }
    }
}

And finally a read_rela_entries method:

// in `delf/src/lib.rs`

impl Rela {
    pub fn parse(i: parse::Input) -> parse::Result<Self> {
        use nom::{combinator::map, number::complete::le_u32, sequence::tuple};
        map(
            tuple((Addr::parse, le_u32, le_u32, Addr::parse)),
            |(offset, r#type, sym, addend)| Rela {
                offset,
                r#type,
                sym,
                addend,
            },
        )(i)
    }
}

impl File {
    pub fn read_rela_entries(&self) -> Result<Vec<Rela>, ReadRelaError> {
        // see https://blog.rust-lang.org/2019/08/15/Rust-1.37.0.html#referring-to-enum-variants-through-type-aliases
        use DynamicTag as DT;
        use ReadRelaError as E;

        // converting `Option<T>` to `Result<T, E>` so we can use `?` to
        // bubble up errors.
        let addr = self.dynamic_entry(DT::Rela).ok_or(E::RelaNotFound)?;
        let len = self.dynamic_entry(DT::RelaSz).ok_or(E::RelaSzNotFound)?;
        let seg = self.segment_at(addr).ok_or(E::RelaSegmentNotFound)?;

        // double-slicing trick
        let i = &seg.data[(addr - seg.mem_range().start).into()..][..len.into()];

        use nom::multi::many0;
        match many0(Rela::parse)(i) {
            Ok((_, rela_entries)) => Ok(rela_entries),
            Err(nom::Err::Failure(err)) | Err(nom::Err::Error(err)) => {
                let e = &err.errors[0];
                let (_input, error_kind) = e;
                Err(E::ParsingError(error_kind.clone()))
            }
            // we don't use any "streaming" parsers, so `nom::Err::Incomplete` seems unlikely
            _ => unreachable!(),
        }
    }
}

Let's try using that from elk, see how it goes!

// in `elk/src/main.rs`
// in `main`

    println!("Dynamic entries:");
    // (cut)

    println!("Rela entries:");
    let rela_entries = file.read_rela_entries()?;
    for e in &rela_entries {
        println!("{:#?}", e);
    }
$ cargo b -q
$ ./target/debug/elk ./samples/hello-mov-pie
(cut)
Rela entries:
Rela {
    offset: 00001007,
    type: 8,
    sym: 0,
    addend: 00003000,
}

Interesting.

Very, very interesting.

What is at 0x1007?

Let's find the corresponding segment:

// in `elk/src/main.rs`
// in `main`

    println!("Rela entries:");
    let rela_entries = file.read_rela_entries()?;
    for e in &rela_entries {
        println!("{:#?}", e);
        if let Some(seg) = file.segment_at(e.offset) {
            println!("... for {:#?}", seg);
        }
    }
$ cargo b -q
$ ./target/debug/elk ./samples/hello-mov-pie
(cut)
Rela entries:
Rela {
    offset: 00001007,
    type: 8,
    sym: 0,
    addend: 00003000,
}
... for file 00001000..00001025 | mem 00001000..00001025 | align 00001000 | R.X Load

Huh! The executable segment. Interesting.

What's in there again?

$ # `status=none` quiets it down, `if` stands for "input file" probably,
$ # `skip` skips bytes, `bs` is the block size, `count` is the number of
$ # blocks, we don't specify `of` ("output file") so it defaults to stdout,
$ # and we pipe it to `ndisasm`, in 64-bit mode, with an "origin address"
$ # (`-o`) equal to our skip. Phew.
$ dd status=none if=./samples/hello-mov-pie skip=$((0x1000)) bs=1 count=$((0x25)) | ndisasm -b 64 -o $((0x1000)) -
00001000  BF01000000        mov edi,0x1
00001005  48BE003000000000  mov rsi,0x3000
         -0000
0000100F  BA09000000        mov edx,0x9
00001014  B801000000        mov eax,0x1
00001019  0F05              syscall
0000101B  4831FF            xor rdi,rdi
0000101E  B83C000000        mov eax,0x3c
00001023  0F05              syscall

So. The mov rsi, 0x3000 instructions starts at 0x1005, which means that if we look at the hex dump, we should skip two bytes, and then we're talking about these bytes:

48 BE 00 30 00 00 00 00 00 00
      ^^^^^^^^^^^^^^^^^^^^^^^
      ☝
      0x1007

Now, I'm no x86 instruction expert, but I'm ready to bet that the 48 BE parts determine that it's a mov, to the rsi register, and that the 00 30 00 00 00 00 00 00 parts are.. a 64-bit little-endian number.

Let's check in with our friend xxd:

$ # `-g 8` shows groups of 8 bytes, `-l 8` only reads 8 bytes,
$ # `-s` is for skip, and `-e` is for little-endian
$ xxd -g 8 -l 8 -s $((0x1007)) -e ./samples/hello-mov-pie
00001007: 0000000000003000                   .0......

Very well! That is indeed 0x3000.

So those Rela entries, they tell us exactly what part of the executable segment to modify. That's why ./samples/hello-mov-pie doesn't run via elk, but it runs normally. Because something somewhere in the system does do those replacements.

How do we know what to replace it with, though?

The System V ABI for x86-64

To figure that out... we need to take a look a the System V ABI (Application Binary Interface) for the AMD64 Architecture. That's right! There's different relocation types for different processors, because they each have different instructions, and thus, different needs.

The most up-to-date source I've been able to find is hjl-tools/x86-psABI, which contains links to PDFs. Those won't display in a browser, they'll download. If you need a slightly older one, in a pinch, the uClibc website has you covered.

That document just so happens to have a table called "Relocation types", reproduced below:

NameValueFieldCalculation
None0nonenone
641word64S + A
PC322word32S + A - P
GOT323word32G + A
PLT324word32L + A - P
COPY5nonenone
GLOB_DAT6wordclassS
JUMP_SLOT7wordclassS
RELATIVE8wordclassB + A

The table keeps going for a while, but for the purpose of this series, we should be all good with the those relocation types. In fact, I'm going to decide we only need three, and stick to that.

Now that we know what the possible values for the type field of the Rela struct are, let's turn it from an u32 into a proper enum:

// in `delf/src/lib.rs`

#[derive(Debug, TryFromPrimitive, Clone, Copy, PartialEq, Eq)]
#[repr(u32)]
pub enum RelType {
    GlobDat = 6,
    JumpSlot = 7,
    Relative = 8,
}

impl_parse_for_enum!(RelType, le_u32);

#[derive(Debug)]
pub struct Rela {
    pub offset: Addr,
    // was: `r#type: u32`
    pub r#type: RelType,
    pub sym: u32,
    pub addend: Addr,
}

impl Rela {
    pub fn parse(i: parse::Input) -> parse::Result<Self> {
        use nom::{combinator::map, number::complete::le_u32, sequence::tuple};
        map(
            // was: `(Addr::parse, le_u32`
            tuple((Addr::parse, RelType::parse, le_u32, Addr::parse)),
            |(offset, r#type, sym, addend)| Rela {
                offset,
                r#type,
                sym,
                addend,
            },
        )(i)
    }
}
$ # in `elk/`
$ cargo b -q
$ ./target/debug/elk ./samples/hello-mov-pie
(cut)
Rela {
    offset: 00001007,
    type: Relative,
    sym: 0,
    addend: 00003000,
}
... for file 00001000..00001025 | mem 00001000..00001025 | align 00001000 | R.X Load

So, our Rela entry is of type Relative (which I'm sure you'd already figured out). And the table above gives us a calculation of B + A, where A is the addend (already a field of the Rela struct), and B is... the base address?

Which base address? This base address!

// in `elk/src/main.rs`
// in `main`

    let base = 0x400000_usize;

    println!("Mapping {:?} in memory...", input_path);
    // (cut)

Let's back up. In order to execute samples/hello-pie correctly, we have to:

  • Open it (done)
  • Read it to memory (done)
  • Parse it to make sense of it, find and parse program headers (done)
  • Find and parse all relocation entries (done)
  • Map memory regions as specified by Load segments, although, with an offset, because mapping 0x0 doesn't work too well (done)
  • Apply relocations (to do)
  • Adjust memory regions' protection levels so that some are no longer writable, and some become executable (done)
  • Jump to the entry point (adjusted for base address - done)

...and then we should be good to go!

Applying relocation, the slightly dangerous way

Let's go:

// in `elk/src/main.rs`

// cut: use directives

fn main() -> Result<(), Box<dyn Error>> {
    // omitted: CLI argument handling, parse_or_print call

    let rela_entries = file.read_rela_entries()?;
    let base = 0x400000_usize;

    println!("Mapping {:?} in memory...", input_path);
    let mut mappings = Vec::new();
    for ph in file
        .program_headers
        .iter()
        .filter(|ph| ph.r#type == delf::SegmentType::Load)
        .filter(|ph| ph.mem_range().end > ph.mem_range().start)
    {
        // this is all the same as before, until...
        println!("Mapping segment @ {:?} with {:?}", ph.mem_range(), ph.flags);
        let mem_range = ph.mem_range();
        let len: usize = (mem_range.end - mem_range.start).into();

        let start: usize = mem_range.start.0 as usize + base;
        let aligned_start: usize = align_lo(start);
        let padding = start - aligned_start;
        let len = len + padding;

        let addr: *mut u8 = unsafe { std::mem::transmute(aligned_start) };
        println!("Addr: {:p}, Padding: {:08x}", addr, padding);

        let map = MemoryMap::new(len, &[MapOption::MapWritable, MapOption::MapAddr(addr)])?;

        println!("Copying segment data...");
        unsafe {
            std::ptr::copy_nonoverlapping(ph.data.as_ptr(), addr.add(padding), len);
        }

        // right there! this is new.
        println!("Applying relocations (if any)...");
        for reloc in &rela_entries {
            if mem_range.contains(&reloc.offset) {
                unsafe {
                    use std::mem::transmute as trans;
                    let real_segment_start = addr.add(padding);

                    let specified_reloc_offset = reloc.offset;
                    let specified_segment_start = mem_range.start;
                    let offset_into_segment = specified_reloc_offset - specified_segment_start;

                    println!(
                        "Applying {:?} relocation @ {:?} from segment start",
                        reloc.r#type, offset_into_segment
                    );

                    let reloc_addr: *mut u64 =
                        trans(real_segment_start.add(offset_into_segment.into()));
                    match reloc.r#type {
                        delf::RelType::Relative => {
                            let reloc_value = reloc.addend + delf::Addr(base as u64);
                            println!("Replacing with value {:?}", reloc_value);
                            std::ptr::write_unaligned(reloc_addr, reloc_value.0);
                        }
                        r#type => {
                            panic!("Unsupported relocation type {:?}", r#type);
                        }
                    }
                }
            }
        }

        // ...and, this is old again.
        println!("Adjusting permissions...");
        let mut protection = Protection::None;
        for flag in ph.flags.iter() {
            protection |= match flag {
                delf::SegmentFlag::Read => Protection::Read,
                delf::SegmentFlag::Write => Protection::Write,
                delf::SegmentFlag::Execute => Protection::Execute,
            }
        }
        unsafe {
            protect(addr, len, protection)?;
        }
        mappings.push(map);
    }

    println!("Jumping to entry point @ {:?}...", file.entry_point);
    unsafe {
        jmp(std::mem::transmute(file.entry_point.0 as usize + base));
    }

    Ok(())
}

Well.

Life comes down to a few moments, right?

Let's try it in ugdb:

$ # in elk/
$ cargo b -q
$ ugdb ./target/debug/elk ./samples/hello-mov-pie
(gdb) break jmp
Breakpoint 1 at 0xc919: file src/main.rs, line 150.
(gdb) start

Ah! Looking at the stdout/stderr pane on the right, we can see that, when mapping the executable segment (the 0x1000..0x1025 one), we've had one relocation to apply (so far, so good), and we replace it with value 0x00403000.

Our base address was 0x00400000, and the addend was 0x3000, so I dare say the maths check out. But we just did a whole lot of unsafe things, it's entirely possible we wrote to the wrong place, or wrote the right thing, but with the wrong endianness, etc.

So did we get right? Does it actually run?

(gdb) stepi
(repeat as necessary)

Oh this looks good.

This looks real good. You can see at 0x401005 that the instruction has become movabs rsi,x0403000. It hasn't turned from a movabs into something else, it still has the register rsi as a destination, and the endianness seems right.

Let's keep going:

(gdb) stepi
(gdb) stepi
(gdb) info reg rsi
(gdb) stepi
(gdb) stepi
(gdb) stepi

It worked!

It actually worked!

Let's cleanup elk's output a little, now that the fundamentals have solidified a bit.

// in `elk/src/main.rs`

use std::{env, error::Error, fs, mem::transmute, ptr::copy_nonoverlapping};

use mmap::{MapOption, MemoryMap};
use region::{protect, Protection};

fn main() -> Result<(), Box<dyn Error>> {
    let input_path = env::args().nth(1).expect("usage: elk FILE");
    let input = fs::read(&input_path)?;

    println!("Analyzing {:?}...", input_path);

    let file = match delf::File::parse_or_print_error(&input[..]) {
        Some(f) => f,
        None => std::process::exit(1),
    };
    println!("{:#?}", file);

    let rela_entries = file.read_rela_entries()?;
    let base = 0x400000_usize;

    println!("Loading with base address @ 0x{:x}", base);
    let non_empty_load_segments = file
        .program_headers
        .iter()
        .filter(|ph| ph.r#type == delf::SegmentType::Load)
        // ignore zero-length segments
        .filter(|ph| ph.mem_range().end > ph.mem_range().start);

    let mut mappings = Vec::new();
    for ph in non_empty_load_segments {
        println!("Mapping {:?} - {:?}", ph.mem_range(), ph.flags);
        let mem_range = ph.mem_range();
        let len: usize = (mem_range.end - mem_range.start).into();

        let start: usize = mem_range.start.0 as usize + base;
        let aligned_start: usize = align_lo(start);
        let padding = start - aligned_start;
        let len = len + padding;

        let addr: *mut u8 = unsafe { transmute(aligned_start) };
        if padding > 0 {
            println!("(With 0x{:x} bytes of padding at the start)", padding);
        }

        let map = MemoryMap::new(len, &[MapOption::MapWritable, MapOption::MapAddr(addr)])?;

        unsafe {
            copy_nonoverlapping(ph.data.as_ptr(), addr.add(padding), len);
        }

        let mut num_relocs = 0;
        for reloc in &rela_entries {
            if mem_range.contains(&reloc.offset) {
                num_relocs += 1;
                unsafe {
                    let real_segment_start = addr.add(padding);
                    let offset_into_segment = reloc.offset - mem_range.start;
                    let reloc_addr = real_segment_start.add(offset_into_segment.into());

                    match reloc.r#type {
                        delf::RelType::Relative => {
                            // this assumes `reloc_addr` is 8-byte aligned. if this isn't
                            // the case, we would crash, and so would the target executable.
                            let reloc_addr: *mut u64 = transmute(reloc_addr);
                            let reloc_value = reloc.addend + delf::Addr(base as u64);
                            std::ptr::write_unaligned(reloc_addr, reloc_value.0);
                        }
                        r#type => {
                            panic!("Unsupported relocation type {:?}", r#type);
                        }
                    }
                }
            }
        }
        if num_relocs > 0 {
            println!("(Applied {} relocations)", num_relocs);
        }

        let mut protection = Protection::NONE;
        for flag in ph.flags.iter() {
            protection |= match flag {
                delf::SegmentFlag::Read => Protection::READ,
                delf::SegmentFlag::Write => Protection::WRITE,
                delf::SegmentFlag::Execute => Protection::EXECUTE,
            }
        }
        unsafe {
            protect(addr, len, protection)?;
        }
        mappings.push(map);
    }

    println!("Jumping to entry point @ {:?}...", file.entry_point);
    unsafe {
        jmp(transmute(file.entry_point.0 as usize + base));
    }

    Ok(())
}

And here's the complete output, without ugdb:

$ cargo b -q
$ ./target/debug/elk ./samples/hello-mov-pie
Analyzing "./samples/hello-mov-pie"...
File {
    type: Dyn,
    machine: X86_64,
    entry_point: 00001000,
    program_headers: [
        file 00000040..00000200 | mem 00000040..00000200 | align 00000008 | R.. PHdr,
        file 00000200..0000021a | mem 00000200..0000021a | align 00000001 | R.. Interp,
        file 00000000..00000288 | mem 00000000..00000288 | align 00001000 | R.. Load,
        file 00001000..00001025 | mem 00001000..00001025 | align 00001000 | R.X Load,
        file 00002000..00002000 | mem 00002000..00002000 | align 00001000 | R.. Load,
        file 00002ee0..00003009 | mem 00002ee0..00003009 | align 00001000 | RW. Load,
        file 00002ee0..00003000 | mem 00002ee0..00003000 | align 00000008 | RW. Dynamic,
        file 00002ee0..00003000 | mem 00002ee0..00003000 | align 00000001 | R.. GnuRelRo,
    ],
}
Loading with base address @ 0x400000
Mapping 00000000..00000288 - BitFlags<SegmentFlag>(0b100, Read)
Mapping 00001000..00001025 - BitFlags<SegmentFlag>(0b101, Execute | Read)
(Applied 1 relocations)
Mapping 00002ee0..00003009 - BitFlags<SegmentFlag>(0b110, Write | Read)
(With 0xee0 bytes of padding at the start)
Jumping to entry point @ 00001000...
hi there

Drinks and cheers all around

And there you have it. We are now able to execute any position-independent executable. For example, our nodata sample runs just fine:

$ ./target/debug/elk ./samples/nodata
Analyzing "./samples/nodata"...
File {
    type: Exec,
    machine: X86_64,
    entry_point: 00401000,
    program_headers: [
        file 00000000..000000b0 | mem 00400000..004000b0 | align 00001000 | R.. Load,
        file 00001000..00001057 | mem 00401000..00401057 | align 00001000 | R.X Load,
    ],
}
Error: RelaNotFound

Uh...

// in `elk/src/main.rs`

    let rela_entries = file.read_rela_entries().unwrap_or_else(|e| {
        println!("Could not read relocations: {:?}", e);
        Default::default()
    });
    let base = 0x400000_usize;

As I was saying, our nodata sample runs just fine:

$ cargo b -q
$ ./target/debug/elk ./samples/nodata
Analyzing "./samples/nodata"...
File {
    type: Exec,
    machine: X86_64,
    entry_point: 00401000,
    program_headers: [
        file 00000000..000000b0 | mem 00400000..004000b0 | align 00001000 | R.. Load,
        file 00001000..00001057 | mem 00401000..00401057 | align 00001000 | R.X Load,
    ],
}
Could not read relocations: RelaNotFound
Loading with base address @ 0x400000
Mapping 00400000..004000b0 - BitFlags<SegmentFlag>(0b100, Read)
Mapping 00401000..00401057 - BitFlags<SegmentFlag>(0b101, Execute | Read)
Jumping to entry point @ 00401000...
okay then

And even a C program like entry_point surely ought to run fine, now that we've gotten to the bottom of this bleak relocation business:

./target/debug/elk ./samples/entry_point
Analyzing "./samples/entry_point"...
Parsing failed:
Nom(MapRes) at position 12104:
00002f48: fe ff ff 6f 00 00 00 00 d0 04 00 00 00 00 00 00 ff ff ff 6f
Context("DynamicTag") at position 12104:
00002f48: fe ff ff 6f 00 00 00 00 d0 04 00 00 00 00 00 00 ff ff ff 6f
Nom(ManyTill) at position 12104:
00002f48: fe ff ff 6f 00 00 00 00 d0 04 00 00 00 00 00 00 ff ff ff 6f

Ugh. More dynamic tags. Alright GNU, have it your way:

// in `delf/src/lib.rs`

#[derive(Debug, TryFromPrimitive, PartialEq, Eq)]
#[repr(u64)]
pub enum DynamicTag {
    // omitted: tags 0 through 28

    Flags = 30,

    // As it turns out, `LoOs` and `HiOs` were the minimum and maximum values
    // for operating system specific tags. And there's a bunch of those..
    GnuHash = 0x6ffffef5,
    VerSym = 0x6ffffff0,
    RelaCount = 0x6ffffff9,
    Flags1 = 0x6ffffffb,
    VerDef = 0x6ffffffc,
    VerDefNum = 0x6ffffffd,
    VerNeed = 0x6ffffffe,
    VerNeedNum = 0x6fffffff,
}

As I was saying, even a C executable like samples/entry_point ought to run completely f...

$ ./target/debug/elk ./samples/entry_point
Analyzing "./samples/entry_point"...
File {
    type: Dyn,
    machine: X86_64,
    entry_point: 00001070,
    program_headers: [
        file 00000040..000002a8 | mem 00000040..000002a8 | align 00000008 | R.. PHdr,
        file 000002a8..000002c4 | mem 000002a8..000002c4 | align 00000001 | R.. Interp,
        file 00000000..00000628 | mem 00000000..00000628 | align 00001000 | R.. Load,
        file 00001000..000012d5 | mem 00001000..000012d5 | align 00001000 | R.X Load,
        file 00002000..000021a0 | mem 00002000..000021a0 | align 00001000 | R.. Load,
        file 00002de8..00003050 | mem 00003de8..00004058 | align 00001000 | RW. Load,
        file 00002df8..00002fd8 | mem 00003df8..00003fd8 | align 00000008 | RW. Dynamic,
        file 000002c4..00000308 | mem 000002c4..00000308 | align 00000004 | R.. Note,
        file 00002094..000020c8 | mem 00002094..000020c8 | align 00000004 | R.. GnuEhFrame,
        file 00000000..00000000 | mem 00000000..00000000 | align 00000010 | RW. GnuStack,
        file 00002de8..00003000 | mem 00003de8..00004000 | align 00000001 | R.. GnuRelRo,
    ],
}
Loading with base address @ 0x400000
Mapping 00000000..00000628 - BitFlags<SegmentFlag>(0b100, Read)
Mapping 00001000..000012d5 - BitFlags<SegmentFlag>(0b101, Execute | Read)
Mapping 00002000..000021a0 - BitFlags<SegmentFlag>(0b100, Read)
Mapping 00003de8..00004058 - BitFlags<SegmentFlag>(0b110, Write | Read)
(With 0xde8 bytes of padding at the start)
thread 'main' panicked at 'Unsupported relocation type GlobDat', src/main.rs:72:29
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace.

Oh bother.

We're not done yet, are we.

Cool bear

What did we learn?

Along with some machine code (in the R+X segment) and some data (in R+W and R segments), ELF-64 files contain a dynamic segment - which is a subslice of one of the Load segments.

Dynamic entries are either addresses or 64-bit numbers. One of them indicates the address of a relocation table. That table contains entries that describe exactly how to patch the machine code so that it runs with a given base address.

"Relative" relocations are rather simple to apply. But apparently, there's other kinds. Oh no. Is it too late to change subjects? How about a series on plants? Plants are nice. They relocate a lot more easily. Let's talk about plants next time. No? Alright then.

Comment on /r/fasterthanlime

(JavaScript is required to see this. Or maybe my stuff broke)

Here's another article just for you:

Surviving Rust async interfaces

I used to be afraid of async Rust. It's easy to get into trouble!

But thanks to the work done by the whole community, async Rust is getting easier to use every week. One project I think is doing particularly great work in this area is async-std.

Let's say we want to compute the SHA3-256 hash of a file. It's very easy to do with synchronous I/O: