In the last article, we managed to load a program (hello-dl) that uses a single dynamic library (libmsg.so) containing a single exported symbol, msg.

Our program, hello-dl.asm, looked like this:

X86 Assembly
global _start extern msg section .text _start: mov rdi, 1 ; stdout fd mov rsi, msg mov rdx, 38 ; 37 chars + newline mov rax, 1 ; write syscall syscall xor rdi, rdi ; return code 0 mov rax, 60 ; exit syscall syscall

And our library, msg.asm, looked like this:

X86 Assembly
global msg:data msg.end-msg section .data msg: db "this is way longer than sixteen bytes", 10 .end:

To make it all work, we had to implement two types of relocation:

And it did run fine:

Shell session
$ ./target/debug/elk ./samples/hello-dl Loading "/home/amos/ftl/elk/samples/hello-dl" Found RPATH entry "/home/amos/ftl/elk/samples" Loading "/home/amos/ftl/elk/samples/libmsg.so" Applying relocations for "/home/amos/ftl/elk/samples/libmsg.so" Nevermind: RelaNotFound Applying relocations for "/home/amos/ftl/elk/samples/hello-dl" Found Rela { offset: 00001007, type: Known(_64), sym: 1, addend: 00000000 } Found Rela { offset: 00003000, type: Known(Copy), sym: 1, addend: 00000000 } this is way longer than sixteen bytes

So... we got one application to load. Does it work on other applications?

Say, ones that we haven't compiled ourselves?

A GNU core utility for example?

Shell session
$ ./target/debug/elk /bin/ls Loading "/usr/bin/ls" Loading "/usr/lib/libcap.so.2.30" Loading "/usr/lib/libc-2.30.so" Loading "/usr/lib/ld-2.30.so" Applying relocations for "/usr/lib/ld-2.30.so" Found Rela { offset: 0002a7e8, type: Known(Relative), sym: 0, addend: 00011440 } Error: UnimplementedRelocation(Relative)

No, it does not. Very well then, we have some more work to do.

But first, I want to go over some code, criticize it, and then improve it.

Let's look at our segment loading code, for example. It's in elk/src/process.rs, in the load_object function:

Rust code
use std::os::unix::io::AsRawFd; let segments = load_segments() .filter_map(|ph| { if ph.memsz.0 > 0 { let vaddr = delf::Addr(ph.vaddr.0 & !0xFFF); let padding = ph.vaddr - vaddr; let offset = ph.offset - padding; let memsz = ph.memsz + padding; let map_res = MemoryMap::new( memsz.into(), &[ MapOption::MapReadable, MapOption::MapWritable, MapOption::MapFd(fs_file.as_raw_fd()), MapOption::MapOffset(offset.into()), MapOption::MapAddr(unsafe { (base + vaddr).as_ptr() }), ], ); Some(map_res.map(|map| Segment { map, padding, flags: ph.flags, })) } else { None } }) .collect::<Result<Vec<_>, _>>()?;

We've seen earlier that each segment of the executable that's mapped in memory has a .filesz and a .memsz. If we use readelf -a on our hello-dl executable, we get the following output:

Program Headers: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flags Align LOAD 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x00000000000002d8 0x00000000000002d8 R 0x1000 ^^^ ^^^ LOAD 0x0000000000001000 0x0000000000001000 0x0000000000001000 0x0000000000000025 0x0000000000000025 R E 0x1000 ^^ ^^ LOAD 0x0000000000002000 0x0000000000002000 0x0000000000002000 0x0000000000000000 0x0000000000000000 R 0x1000 ^ ^ LOAD 0x0000000000002ec0 0x0000000000002ec0 0x0000000000002ec0 0x0000000000000140 0x0000000000000168 RW 0x1000 ^^^ ^^^ (output was slightly cut & reformatted)

Every segment has equal file size and mem size, except for the R+W segment, where the mem size is larger (by 0x28 bytes).

And yet, the mapping we're creating is for the entire mem size. So what's in the extra bytes??

Let's make another program to check it out.

We used nasm's dX before, for initialized data - we used it to define a string constant, msg. nasm has db for bytes, dw for 2-byte words, dd for "double words" (32 bits), and dq for "quad words" (64 bits).

Similarly, for uninitialized data, nasm has resX: resb, resw, resd, resq, with the same sizes. I'm going to go ahead and guess that d stands for "data" and res stands for "reserve".

So, let's make a program that reserves sixteen 64-bit values.

X86 Assembly
; in `elk/samples/bss.asm` global _start section .text _start: ; load address of `zero`, for debugging purposes lea rax, [rel zero] ; then just exit. xor rdi, rdi mov rax, 60 syscall section .bss ; here it is! zero: resq 16

Let's compile and link it (as a position-independent executable):

Shell session
$ cd elk/samples $ nasm -f elf64 bss.asm $ ld -pie --dynamic-linker /lib/ld-linux-x86-64.so.2 bss.o -o bss $ ./bss $

Okay, so it runs well.

Let's run it in GDB:

Shell session
$ gdb ./bss (gdb) break _start Breakpoint 1 at 0x1000 (gdb) r Starting program: /home/amos/ftl/elk/samples/bss Breakpoint 1, 0x0000555555555000 in _start ()

Instead of using ugdb in this article, we'll use GDB's built-in TUI (text user interface).

One way to enter it is to press Ctrl+X and then 2.

As requested, GDB paused program execution right at the beginning of _start, before the lea instruction.

We could use stepi to step to the next instruction and then inspect rax, but there's really no need, as GDB shows us the address it's going to have: whenever rip-relative addressing is used, it shows the address as a "comment":

B+>│0x555555555000 <_start> lea rax,[rip+0x1ff9] # 0x555555557000 ^ | rax is going to have that value!

We know how to inspect memory in gdb now - just use the examine command, or, in short, x. We can control the output with /nfu, where n is the number of things to print (here, 16), f is the format (here we'll pick x for hexadecimal), and u is the size of the things to print (here, g for giant words).

Shell session
(gdb) x/16xg 0x555555557000 0x555555557000: 0x0000000000000000 0x0000000000000000 0x555555557010: 0x0000000000000000 0x0000000000000000 0x555555557020: 0x0000000000000000 0x0000000000000000 0x555555557030: 0x0000000000000000 0x0000000000000000 0x555555557040: 0x0000000000000000 0x0000000000000000 0x555555557050: 0x0000000000000000 0x0000000000000000 0x555555557060: 0x0000000000000000 0x0000000000000000 0x555555557070: 0x0000000000000000 0x0000000000000000

Cool! Sixteen 64-bit values, all zero. So uninitialized data is zeroed.

Good.

I uh.. I don't remember doing that in elk, though.

What do the segments of our bss executable look like?

Shell session
$ readelf -a ./bss Program Headers: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flags Align LOAD 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000269 0x0000000000000269 R 0x1000 LOAD 0x0000000000001000 0x0000000000001000 0x0000000000001000 0x0000000000000011 0x0000000000000011 R E 0x1000 LOAD 0x0000000000002f20 0x0000000000002f20 0x0000000000002f20 0x00000000000000e0 0x0000000000000160 RW 0x1000 ^^ ^^^

Mhh, yeah. Now there's a big difference between file size and mem size.

So what does actually happen when we load bss through elk?

Shell session
$ cd elk/samples $ gdb --args ../target/debug/elk ./bss

Here, we can't break on _start. Well, we could, but it'd be elk's _start, not bss's _start. But we can break on jmp, our very-unsafe let's-jump-to-the-program-entry-point-and-hope-for-the-best function.

Shell session
(gdb) break jmp Breakpoint 1 at 0x33d49: file src/main.rs, line 42. (gdb) r Starting program: /home/amos/ftl/elk/target/debug/elk ./bss [Thread debugging using libthread_db enabled] Using host libthread_db library "/usr/lib/libthread_db.so.1". Loading "/home/amos/ftl/elk/samples/bss" Applying relocations for "/home/amos/ftl/elk/samples/bss" Nevermind: RelaNotFound Breakpoint 1, elk::jmp (addr=0x7ffff7fc9000 "H\215\005\371\037\000") at src/main.rs:42 42 let fn_ptr: fn() = std::mem::transmute(addr);

Step through a few instructions and...

Shell session
(gdb) stepi 0x0000555555587d4e 42 let fn_ptr: fn() = std::mem::transmute(addr); (gdb) 43 fn_ptr(); (gdb) 0x00007ffff7fc9000 in ?? ()

Ah. ?? must be code for "some executable you loaded by hand and that GDB does not know about". Very well! TUI powers activate:

Note that GDB is kind enough to disassemble what we're currently running, even though, to the best of its knowledge, there is no symbol associated with the current instruction pointer (the rip register).

The disas command refuses to work without a symbol though - probably because it wouldn't know when to stop disassembling. However, x/4i would show four instructions for example:

Shell session
(gdb) x/4i $rip => 0x7ffff7fc9000: lea rax,[rip+0x1ff9] # 0x7ffff7fcb000 0x7ffff7fc9007: xor rdi,rdi 0x7ffff7fc900a: mov eax,0x3c 0x7ffff7fc900f: syscall

So, what's in our zero variable?

Shell session
(gdb) x/16xg 0x7ffff7fcb000 0x7ffff7fcb000: 0x0000000000000000 0x0000000000000000 0x7ffff7fcb010: 0x0000000000000000 0x0001000300000000 0x7ffff7fcb020: 0x0000000000000200 0x0000000000000000 0x7ffff7fcb030: 0x0002000300000000 0x0000000000000220 0x7ffff7fcb040: 0x0000000000000000 0x0003000300000000 0x7ffff7fcb050: 0x0000000000000230 0x0000000000000000 0x7ffff7fcb060: 0x0004000300000000 0x0000000000000250 0x7ffff7fcb070: 0x0000000000000000 0x0005000300000000

Well... not zeroes, that's for sure.

That's not even our biggest problem, though. In this case, we lucked out, and the file was large enough so that our memory mapping didn't run past the end of the file.

But what if we reserve a larger amount of memory? Something disgustingly large, like 64KiB?

Enter elk/samples/bss2.asm:

X86 Assembly
global _start section .text _start: lea rax, [rel zero] mov rax, [rax] xor rdi, rdi ; return code 0 mov rax, 60 ; exit syscall syscall section .bss pad: resq 65536 zero: resq 16
Shell session
$ cd elk/samples $ nasm -f elf64 bss2.asm $ ld -pie --dynamic-linker /lib/ld-linux-x86-64.so.2 bss2.o -o bss2 $ ./bss2 $

Sure, it runs well with the system's dynamic linker. But ours?

Shell session
$ ../target/debug/elk ./bss2 Loading "/home/amos/ftl/elk/samples/bss2" Applying relocations for "/home/amos/ftl/elk/samples/bss2" Nevermind: RelaNotFound [1] 5480 bus error (core dumped) ../target/debug/elk ./bss2

Woops.

Running this under GDB and trying to examine the address of zero just shows this:

Shell session
(gdb) x/xg 0x7ffff7d84000 0x7ffff7d84000: Cannot access memory at address 0x7ffff7d84000

I'm not sure if GDB catches SIGBUS, or if they check the process map ahead of time to prevent it - at any rate, it knows what's up.

So, let's fix this. The plan is, for each segment:

* To only map the `0..filesz` range
* To zero out the `filesz..memsz` range

To make our lives easier, I want to change up Addr a little.

First off, I'm not happy with its Debug implementation - we used :08x as its format string, but the number we specify here is the number of nibbles (half-bytes), not the number of bytes, and we're dealing with 64-bit addresses, so it really should be :016x instead.

Rust code
// in `delf/src/lib.rs` impl fmt::Debug for Addr { fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { write!(f, "{:016x}", self.0) } }

Second, right now we have methods to interpret a delf::Addr as a pointer, or a mut pointer, but I think we could use a few more:

Rust code
// in `delf/src/lib.rs` impl Addr { // pre-existing pub unsafe fn as_ptr<T>(&self) -> *const T { std::mem::transmute(self.0 as usize) } // pre-existing pub unsafe fn as_mut_ptr<T>(&self) -> *mut T { std::mem::transmute(self.0 as usize) } // new! pub unsafe fn as_slice<T>(&self, len: usize) -> &[T] { std::slice::from_raw_parts(self.as_ptr(), len) } // new! pub unsafe fn as_mut_slice<T>(&self, len: usize) -> &mut [T] { std::slice::from_raw_parts_mut(self.as_mut_ptr(), len) } // new! pub unsafe fn write(&self, src: &[u8]) { std::ptr::copy_nonoverlapping(src.as_ptr(), self.as_mut_ptr(), src.len()); } // new! pub unsafe fn set<T>(&self, src: T) { *self.as_mut_ptr() = src; } }

Now we're ready to zero out our uninitialized data.

And while we're at it, we'll change the code structure a bit. We're returning something funky, an Option<Result<T, E>>, so we had to use .map on the result of MemoryMap::new.

This makes the flow of the code kinda hard to read, so let's go with something different:

Rust code
// in `elk/src/process.rs` use std::os::unix::io::AsRawFd; let segments = load_segments() // first filter-out zero-sized segments .filter(|ph| ph.memsz.0 > 0) // then map the remaining ones! .map(|ph| -> Result<_, LoadError> { let vaddr = delf::Addr(ph.vaddr.0 & !0xFFF); let padding = ph.vaddr - vaddr; let offset = ph.offset - padding; let filesz = ph.memsz + padding; let map = MemoryMap::new( // the mapping only goes up to filesz... filesz.into(), &[ MapOption::MapReadable, MapOption::MapWritable, MapOption::MapFd(fs_file.as_raw_fd()), MapOption::MapOffset(offset.into()), MapOption::MapAddr(unsafe { (base + vaddr).as_ptr() }), ], )?; // but if there's some bytes left over... if ph.memsz > ph.filesz { // ...then we zero them! // note: this works because we already reserved the *convex hull* // of all segments in memory in our initial `MemoryMap::new` call, // so that memory is there. let zero_start = base + ph.mem_range().start + ph.filesz; let zero_len = ph.memsz - ph.filesz; unsafe { // this will probably get optimized to something good. for i in zero_start.as_mut_slice(zero_len.into()) { *i = 0; } } } Ok(Segment { map, padding, flags: ph.flags, }) }) .collect::<Result<Vec<_>, _>>()?;

And now, everything works fine:

Shell session
$ ../target/debug/elk ./bss2 Loading "/home/amos/ftl/elk/samples/bss2" [1] 10331 segmentation fault (core dumped) ../target/debug/elk ./bss2

Ah. It does not.

Okay so this memory should definitely be mapped because, as I said in the comments above, we reserve the convex hull of all segments in memory in our initial MemoryMap::new call, here:

Rust code
// in `elk/src/process.rs` let mem_size: usize = (mem_range.end - mem_range.start).into(); let mem_map = std::mem::ManuallyDrop::new(MemoryMap::new(mem_size, &[])?); let base = delf::Addr(unsafe { let u: usize = transmute(mem_map.data()); u as u64 }) - mem_range.start;

Mh. We never bothered to set any permissions for this mapping though, let's try making it readable and writable:

Rust code
let mem_size: usize = (mem_range.end - mem_range.start).into(); let mem_map = std::mem::ManuallyDrop::new(MemoryMap::new( mem_size, &[MapOption::MapReadable, MapOption::MapWritable], )?);
Shell session
$ ../target/debug/elk ./bss2 Loading "/home/amos/ftl/elk/samples/bss2" [1] 10566 segmentation fault (core dumped) ../target/debug/elk ./bss2

Still no dice.

What does GDB say?

Shell session
$ gdb --args ../target/debug/elk ./bss2 (cut: GDB startup banner) (gdb) r Starting program: /home/amos/ftl/elk/target/debug/elk ./bss2 [Thread debugging using libthread_db enabled] Using host libthread_db library "/usr/lib/libthread_db.so.1". Loading "/home/amos/ftl/elk/samples/bss2" Program received signal SIGSEGV, Segmentation fault. 0x0000555555565706 in elk::process::Process::load_object::{{closure}} (ph=0x555555638608) at src/process.rs:203 203 *i = 0; (gdb)

Mhokay, so it is crashing when we zero memory, that's what we expected.

And i is:

Shell session
(gdb) p i $4 = (*mut i32) 0x7ffff7d87000

Wait a minute. i is a *mut i32??? Why i32? We're zeroing bytes here, not i32s!! That means we're zeroing four times as much data as we should be, no wonder we end up segfaulting.

It's because of that sneaky bit of code:

Rust code
unsafe { for i in zero_start.as_mut_slice(zero_len.into()) { *i = 0; } }

When we wrote this, we were thinking about bytes. But the compiler was thinking that this:

Rust code
pub unsafe fn as_mut_slice<T>(&self, len: usize) -> &mut [T] { std::slice::from_raw_parts_mut(self.as_mut_ptr(), len) }

...can have any T, and it must infer what T is. And there's only one place i is used, when we do this:

Rust code
*i = 0;

And that's not a type, it's a value. It's an {integer}. And what does an {integer} become, when there are no other constraints on it?

Let's do a test program:

Rust code
// throwaway file fn main() { println!("{:?}", type_name_of(0)); } fn type_name_of<T>(_t: T) -> &'static str { std::any::type_name::<T>() }

That program prints.. i32. So everything makes sense.

We could solve it with turbofish syntax:

Rust code
unsafe { for i in zero_start.as_mut_slice::<u8>(zero_len.into()) { *i = 0; } }

Or we could solve it by explicitly saying our 0 is an u8 literal, not just any integer:

Rust code
unsafe { for i in zero_start.as_mut_slice(zero_len.into()) { *i = 0u8; } }

With those changes, bss2 runs fine. But phew, .as_mut_slice sure is dangerous. Good thing we marked it as unsafe!

Let's just check in GDB to make sure:

Shell session
Breakpoint 1, elk::jmp (addr=0x7ffff7d02000 "H\215\005\371\037\b\000") at src/main.rs:42 42 let fn_ptr: fn() = std::mem::transmute(addr); (gdb) stepi 0x000055555558877e 42 let fn_ptr: fn() = std::mem::transmute(addr); (gdb) 43 fn_ptr(); (gdb) 0x00007ffff7d02000 in ?? () (gdb) x/4i $rip => 0x7ffff7d02000: lea rax,[rip+0x81ff9] # 0x7ffff7d84000 0x7ffff7d02007: mov rax,QWORD PTR [rax] 0x7ffff7d0200a: xor rdi,rdi 0x7ffff7d0200d: mov eax,0x3c (gdb) x/16xg 0x7ffff7d84000 0x7ffff7d84000: 0x0000000000000000 0x0000000000000000 0x7ffff7d84010: 0x0000000000000000 0x0000000000000000 0x7ffff7d84020: 0x0000000000000000 0x0000000000000000 0x7ffff7d84030: 0x0000000000000000 0x0000000000000000 0x7ffff7d84040: 0x0000000000000000 0x0000000000000000 0x7ffff7d84050: 0x0000000000000000 0x0000000000000000 0x7ffff7d84060: 0x0000000000000000 0x0000000000000000 0x7ffff7d84070: 0x0000000000000000 0x0000000000000000

Seems good!

So, we just fixed an issue we didn't even know we had, that's good!

What's next? Relocations!

If we make a third test program, bss3, and adjust it a bit:

X86 Assembly
global _start section .text _start: ; this used to be a `lea rax, [rel zero]` plus a `lea rax, [rax]` mov rax, zero xor rdi, rdi ; return code 0 mov rax, 60 ; exit syscall syscall section .bss pad: resq 65536 zero: resq 16

...then we encounter a relocation type we haven't seen yet!

Shell session
$ cd elk/samples $ nasm -f elf64 bss3.asm $ ld -pie --dynamic-linker /lib/ld-linux-x86-64.so.2 bss3.o -o bss3 $ ../target/debug/elk bss3 Loading "/home/amos/ftl/elk/samples/bss3" Applying relocations for "/home/amos/ftl/elk/samples/bss3" Found Rela { offset: 0000000000001002, type: Known(Relative), sym: 0, addend: 0000000000083000 } Error: UnimplementedRelocation(Relative)

I'm ready to bet this isn't the last relocation we'll have to implement.

Right now, our relocation code is kinda convoluted.

We do the following:

We even have this quirky debug output when there are no relocations:

Shell session
$ ../target/debug/elk ./bss Loading "/home/amos/ftl/elk/samples/bss" Applying relocations for "/home/amos/ftl/elk/samples/bss" Nevermind: RelaNotFound

Isn't this adorable?

Kind of. But we're going to need a better code structure if we want to get much further.

First, I want to take a look at read_rela_entries, in delf:

Rust code
// in `delf/src/lib.rs` impl File { pub fn read_rela_entries(&self) -> Result<Vec<Rela>, ReadRelaError> { use DynamicTag as DT; use ReadRelaError as E; let addr = self.dynamic_entry(DT::Rela).ok_or(E::RelaNotFound)?; let len = self.dynamic_entry(DT::RelaSz).ok_or(E::RelaSzNotFound)?; let ent = self.dynamic_entry(DT::RelaEnt).ok_or(E::RelaEntNotFound)?; let i = self.slice_at(addr).ok_or(E::RelaSegmentNotFound)?; let i = &i[..len.into()]; let n = (len.0 / ent.0) as usize; use nom::multi::many_m_n; match many_m_n(n, n, Rela::parse)(i) { Ok((_, rela_entries)) => Ok(rela_entries), Err(nom::Err::Failure(err)) | Err(nom::Err::Error(err)) => { let e = &err.errors[0]; let (_input, error_kind) = e; return Err(E::ParsingError(error_kind.clone())); } // we don't use any "streaming" parsers, so. _ => unreachable!(), } } }

First off, "not being able to find a dynamic entry" is a pretty common error. It seems silly to have 3 variants in ReadRelaError just for this:

Rust code
#[derive(Error, Debug)] pub enum ReadRelaError { #[error("Rela dynamic entry not found")] RelaNotFound, #[error("RelaSz dynamic entry not found")] RelaSzNotFound, #[error("RelaEnt dynamic entry not found")] RelaEntNotFound, #[error("Rela segment not found")] RelaSegmentNotFound, #[error("Parsing error")] ParsingError(parse::ErrorKind), }

Tell you what, why don't we make up a new method, get_dynamic_entry, that returns a Result itself, mh?

Rust code
// in `delf/src/lib.rs` #[derive(Error, Debug)] pub enum GetDynamicEntryError { #[error("Dynamic entry {0} not found")] NotFound(DynamicTag), } impl File { pub fn get_dynamic_entry(&self, tag: DynamicTag) -> Result<Addr, GetDynamicEntryError> { self.dynamic_entry(tag) .ok_or_else(|| GetDynamicEntryError::NotFound(tag)) } }

Woops, this doesn't compile:

Shell session
error[E0599]: no method named `as_display` found for type `&DynamicTag` in the current scope --> /home/amos/ftl/delf/src/lib.rs:584:13 | 584 | #[error("Dynamic entry {0} not found")] | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ method not found in `&DynamicTag` | = note: the method `as_display` exists but the following trait bounds were not satisfied: `&DynamicTag : thiserror::display::DisplayAsDisplay` = help: items from traits can only be used if the trait is implemented and in scope = note: the following traits define an item `as_display`, perhaps you need to implement one of them: candidate #1: `thiserror::display::DisplayAsDisplay` candidate #2: `thiserror::display::PathAsDisplay`

Ah right! We need to tell thiserror to use the Debug trait to format the inner DynamicTag:

Rust code
#[derive(Error, Debug)] pub enum GetDynamicEntryError { #[error("Dynamic entry {0:?} not found")] NotFound(DynamicTag), }

All better.

Now we can simplify ReadRelaError:

Rust code
#[derive(Error, Debug)] pub enum ReadRelaError { #[error("{0}")] DynamicEntryNotFound(#[from] GetDynamicEntryError), #[error("Rela segment not found")] RelaSegmentNotFound, #[error("Parsing error")] ParsingError(parse::ErrorKind), }

And read_rela_entries itself. But let's also think about what the lack of a RELA dynamic entry means. It's not really an error - it just means there's no relocations at all! So we could just return an empty vec instead.

And also: do we really need to look up RELA_ENT ? Rela entries are always 24 bytes. If that changed, I'm pretty sure a lot of software would freak out.

Let's address all of this.

First, a constant for rela size:

Rust code
impl Rela { const SIZE: usize = 24; }

Then the reading itself:

Rust code
impl File { pub fn read_rela_entries(&self) -> Result<Vec<Rela>, ReadRelaError> { use DynamicTag as DT; use ReadRelaError as E; match self.dynamic_entry(DT::Rela) { None => Ok(vec![]), Some(addr) => { let len = self.get_dynamic_entry(DT::RelaSz)?; let i = self.slice_at(addr).ok_or(E::RelaSegmentNotFound)?; let i = &i[..len.into()]; let n: usize = len.0 as usize / Rela::SIZE; match nom::multi::many_m_n(n, n, Rela::parse)(i) { Ok((_, rela_entries)) => Ok(rela_entries), Err(nom::Err::Failure(err)) | Err(nom::Err::Error(err)) => { let e = &err.errors[0]; let (_input, error_kind) = e; return Err(E::ParsingError(error_kind.clone())); } _ => unreachable!(), } } } } }

Speaking of complexity, the whole RelType / KnownRelType duality is kind of a bummer. Chances are, if we encounter an unknown relocation type, we won't be able to run the executable at all.

We want to handle all relocations or fail early on, so let's go back to a single RelType type.

Rust code
// in `delf/src/lib.rs` #[derive(Debug, TryFromPrimitive, Clone, Copy, PartialEq, Eq)] #[repr(u32)] pub enum RelType { _64 = 1, Copy = 5, GlobDat = 6, JumpSlot = 7, Relative = 8, } impl_parse_for_enum!(RelType, le_u32);

Of course then I won't be happy with impl_parse_for_enum, because, as-is, it only returns nom's ErrorKind::Alt:

Rust code
// in `delf/src/parse.rs` #[macro_export] macro_rules! impl_parse_for_enum { ($type: ident, $number_parser: ident) => { impl $type { pub fn parse(i: parse::Input) -> parse::Result<Self> { use nom::{ combinator::map_res, error::{context, ErrorKind}, number::complete::$number_parser, }; // that's not very informative... let parser = map_res($number_parser, |x| Self::try_from(x).ok_or(ErrorKind::Alt)); context(stringify!($type), parser)(i) } } }; }

So let's try to make this friendlier.

I'm okay with our own ErrorKind having a String variant - it might be wasteful, but it only happens if something goes wrong, so we should be fine.

Rust code
// in `delf/src/parse.rs` #[derive(Debug, Clone)] pub enum ErrorKind { Nom(nom::error::ErrorKind), Context(&'static str), String(String), }

While we're at it, let's also add a Debug implemention for our parse::Error type, and a from_string helper:

Rust code
// in `delf/src/parse.rs` use std::fmt; impl fmt::Debug for Error<&[u8]> { fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { for (input, err) in &self.errors { write!(f, "{:?}:\n", err)?; write!(f, "input: {:?}\n", crate::HexDump(input))?; } Ok(()) } } impl<I> Error<I> { pub fn from_string<S: Into<String>>(input: I, s: S) -> Self { let errors = vec![(input, ErrorKind::String(s.into()))]; Self { errors } } }

And finally we can make impl_parse_for_enum have a slightly friendlier error:

Rust code
#[macro_export] macro_rules! impl_parse_for_enum { ($type: ident, $number_parser: ident) => { impl $type { pub fn parse(full_input: parse::Input) -> parse::Result<Self> { use nom::number::complete::$number_parser; let (i, val) = $number_parser(full_input)?; match Self::try_from(val) { Some(val) => Ok((i, val)), None => Err(nom::Err::Failure(parse::Error::from_string( full_input, format!("Unknown {} {} (0x{:x})", stringify!($type), val, val), ))), } } } }; }

Now that this is done, we can adjust read_rela_entries to return a formatted error from delf, instead of just a nom ErrorKind:

Rust code
// in `elk/src/lib.rs` #[derive(Error, Debug)] pub enum ReadRelaError { #[error("{0}")] DynamicEntryNotFound(#[from] GetDynamicEntryError), #[error("Rela segment not found")] RelaSegmentNotFound, #[error("Parsing error: {0}")] ParsingError(String), // this used to be `ErrorKind` }
Rust code
// in `read_rela_entries` match nom::multi::many_m_n(n, n, Rela::parse)(i) { Ok((_, rela_entries)) => Ok(rela_entries), Err(nom::Err::Failure(err)) | Err(nom::Err::Error(err)) => { // much simpler! and nicer. return Err(E::ParsingError(format!("{:?}", err))); } _ => unreachable!(), }

Similarly, we can make up a macro for SymBind and SymType, which are both 4-bit values in the file:

Rust code
// in `elk/src/parse.rs` #[macro_export] macro_rules! impl_parse_for_bitenum { ($type: ident, $bits: expr) => { impl $type { pub fn parse(full_input: parse::BitInput) -> parse::BitResult<Self> { use nom::bits::complete::take; let (i, val) = take($bits)(full_input)?; match Self::try_from(val) { Some(val) => Ok((i, val)), None => Err(nom::Err::Failure(parse::Error::from_string( full_input, format!("Unknown {} {} (0x{:x})", stringify!($type), val, val), ))), } } } }; }
Rust code
// in `elk/src/lib.rs` #[derive(Debug, TryFromPrimitive, Clone, Copy)] #[repr(u8)] pub enum SymBind { Local = 0, Global = 1, Weak = 2, } impl_parse_for_bitenum!(SymBind, 4_usize); #[derive(Debug, TryFromPrimitive, Clone, Copy)] #[repr(u8)] pub enum SymType { None = 0, Object = 1, Func = 2, Section = 3, } impl_parse_for_bitenum!(SymType, 4_usize); // note: the previous `parse` implementations for `SymBind` and `SymType` // were removed

And now we don't have to deal with Option<SymBind> and Option<SymType> in the Sym type! Again - we probably always want to know what a symbol's type and bind is.

Rust code
// in `elk/src/lib.rs` #[derive(Debug)] pub struct Sym { // was Option pub bind: SymBind, // was Option pub r#type: SymType, pub name: Addr, pub shndx: SectionIndex, pub value: Addr, pub size: u64, }

Now that we've done all that, we have a few things to adjust in elk, for the new interface:

Rust code
#[derive(Error, Debug)] pub enum RelocationError { #[error("unimplemented relocation: {0:?}")] UnimplementedRelocation(delf::RelType), #[error("unknown symbol number: {0}")] UnknownSymbolNumber(u32), #[error("undefined symbol: {0}")] UndefinedSymbol(String), }
Rust code
// in `Process::apply_relocations` // for rel in rels println!("Found {:?}", rel); let reltype = rel.r#type; use delf::RelType as RT; match reltype { RT::_64 => { // handle _64 relocation } RT::Copy => { // handle Copy relocation } _ => return Err(RelocationError::UnimplementedRelocation(reltype)), }

Does everything still work?

Shell session
$ ../target/debug/elk hello-dl Loading "/home/amos/ftl/elk/samples/hello-dl" Found RPATH entry "/home/amos/ftl/elk/samples" Loading "/home/amos/ftl/elk/samples/libmsg.so" Applying relocations for "/home/amos/ftl/elk/samples/libmsg.so" Applying relocations for "/home/amos/ftl/elk/samples/hello-dl" Found Rela { offset: 0000000000001007, type: _64, sym: 1, addend: 0000000000000000 } Found Rela { offset: 0000000000003000, type: Copy, sym: 1, addend: 0000000000000000 } this is way longer than sixteen bytes $ ../target/debug/elk bss Loading "/home/amos/ftl/elk/samples/bss" Applying relocations for "/home/amos/ftl/elk/samples/bss" $ ../target/debug/elk bss2 Loading "/home/amos/ftl/elk/samples/bss2" Applying relocations for "/home/amos/ftl/elk/samples/bss2" $ ../target/debug/elk bss3 Loading "/home/amos/ftl/elk/samples/bss3" Applying relocations for "/home/amos/ftl/elk/samples/bss3" Found Rela { offset: 0000000000001002, type: Relative, sym: 0, addend: 0000000000083000 } Error: UnimplementedRelocation(Relative)

Looks like it!

How does it fare with unimplemented stuff though?

Shell session
$ ../target/debug/elk /bin/ls Loading "/usr/bin/ls" Loading "/usr/lib/libcap.so.2.30" Loading "/usr/lib/libc-2.30.so" Error: ReadSymsError(ParsingError(String("Unknown SymType 10 (0xa)")))

That's not a lot of context... but that's also not the error type we adjusted earlier.

Rust code
// in `delf/src/lib.rs` #[derive(Error, Debug)] pub enum ReadSymsError { // new: use get_dynamic_entry #[error("{0:?}")] DynamicEntryNotFound(#[from] GetDynamicEntryError), #[error("SymTab section not found")] SymTabSectionNotFound, #[error("SymTab segment not found")] SymTabSegmentNotFound, // new: store formatted error instead of just ErrorKind #[error("Parsing error: {0}")] ParsingError(String), } impl File { pub fn read_syms(&self) -> Result<Vec<Sym>, ReadSymsError> { use DynamicTag as DT; use ReadSymsError as E; // new: use get_dynamic_entry let addr = self.get_dynamic_entry(DT::SymTab)?; let section = self .section_starting_at(addr) .ok_or(E::SymTabSectionNotFound)?; let i = self.slice_at(addr).ok_or(E::SymTabSegmentNotFound)?; let n = (section.size.0 / section.entsize.0) as usize; use nom::multi::many_m_n; match many_m_n(n, n, Sym::parse)(i) { Ok((_, syms)) => Ok(syms), Err(nom::Err::Failure(err)) | Err(nom::Err::Error(err)) => { // new: format error here (because we stop borrowing input later) return Err(E::ParsingError(format!("{:?}", err))); } _ => unreachable!(), } } }

Better?

Shell session
$ ../target/debug/elk /bin/ls Loading "/usr/bin/ls" Loading "/usr/lib/libcap.so.2.30" Loading "/usr/lib/libc-2.30.so" Error: ReadSymsError(ParsingError("String(\"Unknown SymType 10 (0xa)\"):\ninput: 1a 00 10 00 a0 bf 0b 00 00 00 00 00 c1 00 00 00 00 00 00 00 \n"))

Mh, yes and no.

It looks like the error is bubbling all the way up to elk's main function, which has this signature:

Rust code
// in `elk/src/main.rs` fn main() -> Result<(), Box<dyn Error>> { // (cut) }

And it looks like it's formatting whatever error is returned with the Debug trait, so we see the full composition of the error, instead of the pretty Display strings we've been specifying.

Let's fix that.

Rust code
// in `elk/src/main.rs` fn main() { if let Err(e) = do_main() { eprintln!("Fatal error: {}", e); } } fn do_main() -> Result<(), Box<dyn Error>> { // (cut) }
Shell session
$ ../target/debug/elk /bin/ls Loading "/usr/bin/ls" Loading "/usr/lib/libcap.so.2.30" Loading "/usr/lib/libc-2.30.so" Fatal error: Could not read symbols from ELF object: Parsing error: String("Unknown SymType 10 (0xa)"): input: 1a 00 10 00 a0 bf 0b 00 00 00 00 00 c1 00 00 00 00 00 00 00

Okay! Now we actually get the error formatting we've worked on. We'll worry about symbol type 10 later.

I'm liking those cleanups! I don't want to stop!

Here's another thing that bothers me: Process::lookup_symbol(0

Rust code
impl Process { pub fn lookup_symbol( &self, name: &str, ignore: Option<&Object>, ) -> Result<Option<(&Object, &delf::Sym)>, RelocationError> { for obj in &self.objects { if let Some(ignored) = ignore { if std::ptr::eq(ignored, obj) { continue; } } for (i, sym) in obj.syms.iter().enumerate() { if obj.sym_name(i as u32)? == name { return Ok(Some((obj, sym))); } } } Ok(None) } }

For every symbol look up, we go through all loaded objects (that part we can't really help) and check each individual symbol of those objects.

Sure, in our small assembly programs we've got a handful of symbols. But even the most basic dynamically-linked C application has dependencies that have thousands of symbols.

It gets worse - to compare names, we call delf::File::get_string

Rust code
impl Object { pub fn sym_name(&self, index: u32) -> Result<String, RelocationError> { self.file .get_string(self.syms[index as usize].name) .map_err(|_| RelocationError::UnknownSymbolNumber(index)) } }

...which does a lot of work:

Rust code
pub fn get_string(&self, offset: Addr) -> Result<String, GetStringError> { use DynamicTag as DT; use GetStringError as E; let addr = self.dynamic_entry(DT::StrTab).ok_or(E::StrTabNotFound)?; let slice = self .slice_at(addr + offset) .ok_or(E::StrTabSegmentNotFound)?; let string_slice = slice.split(|&c| c == 0).next().ok_or(E::StringNotFound)?; Ok(String::from_utf8_lossy(string_slice).into()) }

It does a linear scan in dynamic entries to find the string table, and then decodes the symbol name as utf-8 (lossily, so we don't crash if it's invalid utf-8).

We've got to be able to do better than that.

Firstly, we don't need to convert symbol names to &str just to compare them. We could use Vec<u8> instead, those implement PartialEq and Eq!

We could only convert them to strings when we need to print them out, for debugging purposes.

Secondly, we don't even need to own the data for names. There's no need to copy it out, because we've mapped the whole string table to memory before!

Thirdly, it's tempting to think of a HashMap to look up symbols more quickly. But the "symbol table" is a table for a reason: the same names can be listed multiple times.

In fact, just in /usr/lib/libc.so.6, there are multiple duplicates, and even quadruplicates:

nm -D /usr/lib/libc.so.6 | cut -d ' ' -f 3 | uniq -c | grep -v 1 2 2 fmemopen 2 glob 2 glob64 2 2 memcpy 2 nftw 2 nftw64 2 posix_spawn 2 posix_spawnp 2 pthread_cond_broadcast 2 pthread_cond_destroy 2 pthread_cond_init 2 pthread_cond_signal 2 pthread_cond_timedwait 2 pthread_cond_wait 2 quick_exit 2 realpath 2 regexec 2 2 sched_getaffinity 2 sched_setaffinity 4 _sys_errlist 4 sys_errlist 4 _sys_nerr 4 sys_nerr 2 sys_sigabbrev 2 _sys_siglist 2 sys_siglist

So HashMap is out of the question. We could have a HashMap<Name, Vec<Sym>>... but, and I insist we say this all together now: there is a crate for that.

So, let's first make a new Name type. I'm in an expansive mood, so let's make a module just for it:

Rust code
// in `elk/src/main.rs` mod name;
Rust code
// in `elk/src/process.rs` use crate::name::Name;
Rust code
// in `elk/src/name.rs` #[derive(Clone)] pub enum Name { FromAddr { addr: delf::Addr, len: usize }, Owned(Vec<u8>), }

Okay, good start! We have two variants: one for names that come straight from an ELF file we just mapped into memory, and one for names we own, like, maybe we want to look up a specific symbol in the future, from a Rust string literal, and then it'll come in handy.

Each entry in an ELF file's string table is null-terminated - that's why File::get_string has this nugget:

Rust code
let string_slice = slice.split(|&c| c == 0).next().ok_or(E::StringNotFound)?;

We'll give Name a constructor that does pretty much that, given a random Addr. This is pretty unsafe, it'll read any memory you tell it to:

Rust code
// in `elk/src/name.rs` impl Name { pub unsafe fn from_addr(addr: delf::Addr) -> Self { let len = addr .as_slice::<u8>(2048) .iter() .position(|&c| c == 0) .expect("scanned 2048 bytes without finding null-terminator for name"); Self::FromAddr { addr, len } } }

For "safety" (to avoid segfaults), we limit name size to 2048. Hopefully the actual limit for ELF symbol size is way lower, so we won't ever run into that.

By comparison,t he constructor for Name::Owned is much simpler (and safe):

Rust code
impl Name { pub fn owned<T: Into<Vec<u8>>>(value: T) -> Self { Self::Owned(value.into()) } }

From there, we can implement as_slice - this is almost all a Name is:

Rust code
impl Name { pub fn as_slice(&self) -> &[u8] { match self { // I knew this `as_slice` helper would come in handy! // no need to worry about type inference going wrong here, // we've explicitly declared the return type to be `&[u8]` Self::FromAddr { addr, len } => unsafe { addr.as_slice(*len) }, Self::Owned(vec) => &vec[..], } } }

As promised, we can make a Debug implementation that shows the symbol as a string if possible, and as a slice if not:

Rust code
use std::fmt; impl fmt::Debug for Name { fn fmt(&self, f: &mut fmt::Formatter<'_>) -> Result<(), fmt::Error> { if let Ok(s) = std::str::from_utf8(self.as_slice()) { // this only succeds if the name is valid utf-8 fmt::Display::fmt(s, f) } else { fmt::Debug::fmt(self.as_slice(), f) } } }

And finally, since we'll want to use Name as a key in a hash map (well, a multimap), we'll need it to implement PartialEq, Eq, and Hash.

We cannot just derive those, because we want to test the contents. A Name::FromAddr could be equal to a Name::Owned.

So we simply defer to the std::slice implementations:

Rust code
impl PartialEq for Name { fn eq(&self, other: &Self) -> bool { PartialEq::eq(self.as_slice(), other.as_slice()) } } impl Eq for Name {} use std::hash::{Hash, Hasher}; impl Hash for Name { fn hash<H: Hasher>(&self, state: &mut H) { Hash::hash(self.as_slice(), state) } }

Sounds good! As long as we only ever pass valid addresses to Name::from_addr, and that those addresses remain valid for the lifetime of the Name, we're in the clear.

But really, we just unsafe'd our way out of any of the guarantees the borrow checker provided us. Which... I mean, just recently we accidentally unmapped part of a running executable, I think we're well into "never safe again" territory - and it's only getting worse from here.

Okay, so, let's see where we read symbols right now - it's in Process::load_object

Rust code
// in `src/` let syms = file.read_syms()?;

Okay, this gives us a Vec<delf::Sym>.

You know what would be even cooler?

If we had a struct that somehow combined the symbol's information (like its type, bind, value, size) and its name.

Maybe something like that:

Rust code
// in `src/elk/process.rs` #[derive(Debug, Clone)] struct NamedSym { sym: delf::Sym, name: Name, }

(Note: for this to work, we also need to derive Clone for delf::Sym. It is left as an exercise to the reader)

And then an Object would have two copies of all the symbols.

One Vec for index lookups, and one MultiMap for name lookups.

Cool bear's hot tip

Remember that relocations refer to local symbols by their number, and the name of that local symbol is used to look up a defined symbol in other objects.

So we need both kinds of lookup.

Well, let's get going, shall we:

Shell session
$ cargo add multimap Adding multimap v0.8.0 to dependencies
Rust code
// in `src/elk/process.rs` use multimap::MultiMap; #[derive(CustomDebug)] pub struct Object { // (other fields omitted) #[debug(skip)] syms: Vec<NamedSym>, #[debug(skip)] sym_map: MultiMap<Name, NamedSym>, }

Now, we can map our Vec<Sym> to a `Vec``:

Rust code
let syms = file.read_syms()?; let strtab = file .get_dynamic_entry(delf::DynamicTag::StrTab) .unwrap_or_else(|_| panic!("String table not found in {:?}", path)); let syms: Vec<_> = syms .into_iter() .map(|sym| unsafe { let name = Name::from_addr(base + strtab + sym.name); NamedSym { sym, name } }) .collect();

If you'll excuse the liberties taken with error handling (but really, do you want to live in a world where an ELF object has symbols but no string table??? No thank you), we can move on to sym_map:

Rust code
let mut sym_map = MultiMap::new(); for sym in &syms { sym_map.insert(sym.name.clone(), sym.clone()) } let object = Object { // (other fields omitted) syms, sym_map, };

That was easy! Lots of cloning, but Sym is relatively small, and the expected improvement in lookup time means it probably pays for itself. Or not.

Cool bear's hot tip

One really ought to make measurements before making claims like that.

And then not make the claims anyway, because measurements lie all the time.

Ok, what's next.. oh, Object::sym_name can go. Delete!

I'm thinking about the lookup_symbol method, whose signature is currently:

Rust code
pub fn lookup_symbol( &self, name: &str, ignore: Option<&Object>, ) -> Result<Option<(&Object, &delf::Sym)>, RelocationError> { }

I really dislike that it returns a (&Object, &delf::Sym) - seems inconvenient.

What if we had a struct that tied together a Sym and the Object it's bound to?

Something like this:

Rust code
#[derive(Debug, Clone)] struct ObjectSym<'a> { obj: &'a Object, sym: &'a NamedSym, }

It could even have a nice convenience method, value, that returns the rebased address:

Rust code
impl ObjectSym<'_> { fn value(&self) -> delf::Addr { self.obj.base + self.sym.sym.value } }

Then our lookup_symbol signature could just be this:

Rust code
pub fn lookup_symbol( &self, wanted: &ObjectSym, ignore_self: bool, ) -> Option<ObjectSym> { }

In fact... let's think bigger. Sometimes, a symbol is unresolved, but that's okay - because it was a weak symbol, and so, if it's undefined, we simply use 0x0 as its address instead of whatever it would've been if it was designed by one of the other ELF objects.

libc has several of those:

Shell session
$ nm -D /usr/lib/libc.so.6 | grep " w " w _dl_starting_up w __libdl_freeres w __libpthread_freeres

If only we could have a type that lets us know the result of a lookup, and that is usable even if it was undefined, just in case the local symbol we're resolving was weak... something like that:

Rust code
#[derive(Debug)] enum ResolvedSym<'a> { Defined(ObjectSym<'a>), Undefined, }

And this one could also have convenience methods:

Rust code
impl ResolvedSym<'_> { fn value(&self) -> delf::Addr { match self { Self::Defined(sym) => sym.value(), Self::Undefined => delf::Addr(0x0), } } fn size(&self) -> usize { match self { // weeeeeee Self::Defined(sym) => sym.sym.sym.size as usize, Self::Undefined => 0, } } }

And then lookup_symbol could simply be:

Rust code
fn lookup_symbol(&self, wanted: &ObjectSym, ignore_self: bool) -> ResolvedSym { for obj in &self.objects { if ignore_self && std::ptr::eq(wanted.obj, obj) { continue; } if let Some(syms) = obj.sym_map.get_vec(&wanted.sym.name) { if let Some(sym) = syms.iter().find(|sym| !sym.sym.shndx.is_undef()) { return ResolvedSym::Defined(ObjectSym { obj, sym }); } } } ResolvedSym::Undefined }

Pretty nice, right? get_vec is from multimap, it returns all the values corresponding to a key, or None if.. there aren't any.

Speaking of relocations. Why not read them ahead of time?

Rust code
#[derive(CustomDebug)] pub struct Object { // (other fields omitted) #[debug(skip)] pub rels: Vec<delf::Rela>, }
Rust code
// in Process::load_object let rels = file.read_rela_entries()?; let object = Object { // other fields omitted rels, };

Of course, for that we need a new variant for LoadError:

Rust code
#[derive(Error, Debug)] pub enum LoadError { // (other variants omitted) #[error("Could not read relocations from ELF object: {0}")] ReadRelaError(#[from] delf::ReadRelaError), }

Okay! Lots of compile errors right now, but that's what refactorings are all about (that and the end result, but, we'll see).

I like this idea of grouping "a symbol and its name", "a named symbol and its object". How about grouping relocations with their objects?

Rust code
#[derive(Debug)] struct ObjectRel<'a> { obj: &'a Object, rel: &'a delf::Rela, }

This way we could get the address of a relocation, already adjusted for the base address of the object!

Rust code
impl ObjectRel<'_> { fn addr(&self) -> delf::Addr { self.obj.base + self.rel.offset } }

Note that during the entire relocation process, all the objects, symbols, and relocations, are purely read-only, so we can afford to pass a lot of references around without the borrow checker going berserk.

So, let's approach apply_relocations with a clean perspective:

Rust code
impl Process { pub fn apply_relocations(&self) -> Result<(), RelocationError> { let rels: Vec<_> = self .objects .iter() .rev() .map(|obj| obj.rels.iter().map(move |rel| ObjectRel { obj, rel })) .flatten() .collect(); for rel in rels { self.apply_relocation(rel)?; } Ok(()) } }

Seems okay! We'll still be applying them in reverse load order, we're just... mapping them to a single Vec<ObjectRel> in advance.

Now for apply_relocation - the singular one.

Rust code
impl Process { fn apply_relocation(&self, objrel: ObjectRel) -> Result<(), RelocationError> { use delf::RelType as RT; // destructure a bit, for convenience let ObjectRel { obj, rel } = objrel; let reltype = rel.r#type; let addend = rel.addend; // this is the symbol we're looking for. // note that it may be symbol 0, which has an empty name - that's fine. let wanted = ObjectSym { obj, sym: &obj.syms[rel.sym as usize], }; // when doing a lookup, only ignore the relocation's object if // we're performing a Copy relocation. let ignore_self = match reltype { RT::Copy => true, _ => false, }; // perform symbol lookup early let found = match rel.sym { // the relocation isn't bound to any symbol, go with undef 0 => ResolvedSym::Undefined, _ => match self.lookup_symbol(&wanted, ignore_self) { undef @ ResolvedSym::Undefined => match wanted.sym.sym.bind { // undefined symbols are fine if our local symbol is weak delf::SymBind::Weak => undef, // otherwise, error out now _ => return Err(RelocationError::UndefinedSymbol(format!("{:?}", wanted))), }, // defined symbols are always fine x => x, }; }; match reltype { RT::_64 => unsafe { // beautiful // we're using `set<T>()` and passing a `delf::Addr` - which is // just a newtype over `u64`, so everything works out! objrel.addr().set(found.value() + addend); }, RT::Copy => unsafe { // beaaaaaaaaaaautiful! // write() takes a &[u8], so `as_slice`'s type is inferred correctly. objrel.addr().write(found.value().as_slice(found.size())); }, _ => return Err(RelocationError::UnimplementedRelocation(reltype)), } Ok(()) } }

Well, the actual relocation code is a lot more readable now! And since that's the one we're more likely to mess up, this seems really important right now.

But.. does it still work?

Shell session
$ ../target/debug/elk ./hello-dl Loading "/home/amos/ftl/elk/samples/hello-dl" Found RPATH entry "/home/amos/ftl/elk/samples" Loading "/home/amos/ftl/elk/samples/libmsg.so" this is way longer than sixteen bytes

Appears so. bss2 is still fine, too - I checked!

What about bss3?

Shell session
$ ../target/debug/elk ./bss3 Loading "/home/amos/ftl/elk/samples/bss3" Fatal error: unimplemented relocation: Relative

Ah, that's unfortunate.

Well, you know, this article is already pretty long and... oh alright, how are Relative relocations calculated again?

Mhh, B + A. A is for addend, B is for... base?

Rust code
match reltype { RT::_64 => unsafe { objrel.addr().set(found.value() + addend); }, // new! RT::Relative => unsafe { objrel.addr().set(obj.base + addend); }, RT::Copy => unsafe { objrel.addr().write(found.value().as_slice(found.size())); }, _ => return Err(RelocationError::UnimplementedRelocation(reltype)), }
Shell session
$ ../target/debug/elk ./bss3 Loading "/home/amos/ftl/elk/samples/bss3" $

Huh. Did that work? Kinda hard to tell - even if we go into GDB and examine memory, we could be looking at another bunch of 16 64-bit zero values.

Hey, didn't hello.asm have Relative relocations?

X86 Assembly
global _start section .text _start: mov rdi, 1 ; stdout fd ; lea rsi, [rel msg] mov rsi, msg mov rdx, 9 ; 8 chars + newline mov rax, 1 ; write syscall syscall xor rdi, rdi ; return code 0 mov rax, 60 ; exit syscall syscall section .data msg: db "hi there", 10
Shell session
$ objdump -drR ./hello ./hello: file format elf64-x86-64 Disassembly of section .text: 0000000000001000 <_start>: 1000: bf 01 00 00 00 mov edi,0x1 1005: 48 be 00 30 00 00 00 movabs rsi,0x3000 100c: 00 00 00 1007: R_X86_64_RELATIVE *ABS*+0x3000 100f: ba 09 00 00 00 mov edx,0x9 1014: b8 01 00 00 00 mov eax,0x1 1019: 0f 05 syscall 101b: 48 31 ff xor rdi,rdi 101e: b8 3c 00 00 00 mov eax,0x3c 1023: 0f 05 syscall

It did! Let's give it a shot:

Shell session
$ ../target/debug/elk ./hello Loading "/home/amos/ftl/elk/samples/hello" hi there

Success! 🎉

I feel a lot better about this relocation code already. I feel more confident that we can tackle whatever surprises ELF still has in store for us. We'll see about that!