In the last article, we found where code was hiding in our samples/hello executable, by disassembling the whole file and then looking for syscalls.

Later on, we learned how to inspect which memory ranges are mapped for a given PID (process identifier). We saw that memory areas weren't all equal: they can be readable, writable, and/or executable.

Finally, we learned about program headers and how they specified which parts of the executable file should be mapped to which memory areas.

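And we put all that knowledge together to:

  • load the code segment of samples/hello into memory,
  • make that memory executable,
  • and jump to its entry point.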

And we didn't get a segfault (like we did back when that memory wasn't executable), but we didn't get "hi there" printed to the standard output either.

And that was very frustrating! Because at that point, it felt like we had done a significant amount of research, and that after so much effort, it really ought to, you know, just work.

But computers rarely do that.

Has anyone brought a map?

Let's look again at the output we get when we run elk on samples/hello:

Shell session
$ cargo b -q && ./target/debug/elk samples/hello
Analyzing "samples/hello"...
File {
    type: Exec,
    machine: X86_64,
    entry_point: 00401000,
    program_headers: [
        file 00000000..000000e8 | mem 00400000..004000e8 | align 00001000 | R.. Load,
        file 00001000..00001025 | mem 00401000..00401025 | align 00001000 | R.X Load,
        file 00002000..00002009 | mem 00402000..00402009 | align 00001000 | RW. Load,
    ],
}
Disassembling "samples/hello"...
00401000  BF01000000        mov edi,0x1
00401005  48BE002040000000  mov rsi,0x402000
         -0000
0040100F  BA09000000        mov edx,0x9
00401014  B801000000        mov eax,0x1
00401019  0F05              syscall
0040101B  4831FF            xor rdi,rdi
0040101E  B83C000000        mov eax,0x3c
00401023  0F05              syscall
Executing "samples/hello" in memory...
 code @ 0x55739ab50000
entry offset @ 00000000
entry point @ 0x55739ab50000

Yep, sure enough. No "hi there" in the output.

Here's an idea: how about right before we jump to the target's entry point, we pause, so that we can inspect elk's memory map?

Rust code
// in `elk/src/main.rs`
// in `main()`

let entry_offset = file.entry_point - code_ph.vaddr;
let entry_point = unsafe { code.as_ptr().add(entry_offset.into()) };
println!(" code @ {:?}", code.as_ptr());
println!("entry offset @ {:?}", entry_offset);
println!("entry point @ {:?}", entry_point);

println!("Press enter to jmp...");
{
    let mut s = String::new();
    std::io::stdin().read_line(&mut s)?;
}

unsafe {
    jmp(entry_point);
}
Shell session
$ cargo b -q && ./target/debug/elk samples/hello
(cut)
Executing "samples/hello" in memory...
 code @ 0x55646e6a5000
entry offset @ 00000000
entry point @ 0x55646e6a5000
Press enter to jmp...

Ah, crap, we forgot to print the PID. Well, not to worry! We can use ps and grep to the rescue.

ps is short for "process status" and lists running processes, with various filtering and formatting options (its man page describes the result as a snapshot of the current processes). grep stands for "global regular expression print" (obvious, right?) and prints lines that match patterns.

In our case the pattern is just "elk":

Shell session
$ ps aux | grep elk
amos  2191  0.0  0.0  3116  2196 pts/3  S+  12:25  0:00 ./target/debug/elk samples/hello
amos  2596  0.0  0.0  6156  2376 pts/4  S+  12:29  0:00 grep --color=auto elk

Huhh, which column is the PID? Let's include the header in the output by changing our pattern to: must contain either "PID" or "elk". Since the "or" operator in regular expressions is |, the same character as the shell's pipe, we'll wrap our pattern in double quotes so the shell leaves it alone:

Shell session
$ ps aux | grep -E "(PID|elk)"
USER  PID   %CPU %MEM VSZ   RSS  TTY    STAT START  TIME COMMAND
amos  2191  0.0  0.0  3116  2196 pts/3  S+   12:25  0:00 ./target/debug/elk samples/hello
amos  2738  0.0  0.0  6288  2244 pts/4  S+   12:31  0:00 grep --color=auto -E (PID|elk)

One last bikeshed: I don't really want grep to show up in the output - but it does, because its command includes our search terms.

To avoid that, we can simply pipe to grep once more and exclude all lines that contain "grep". We'll use the -v flag, which is the short version of --invert-match:

Shell session
$ ps aux | grep -E "(PID|elk)" | grep -v grep
USER  PID   %CPU %MEM VSZ   RSS  TTY    STAT START  TIME COMMAND
amos  2191  0.0  0.0  3116  2196 pts/3  S+   12:25  0:00 ./target/debug/elk samples/hello
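Of course, we could also spare ourselves the ps-and-grep dance next time by having elk print its own PID. A minimal sketch, using only `std::process::id` from the standard library; the `pid_banner` helper is a made-up name for illustration, not something elk has:

```rust
/// Hypothetical helper: builds a prompt that includes our own PID,
/// so we don't have to go hunting for it with `ps`.
fn pid_banner(action: &str) -> String {
    format!("[pid {}] Press Enter to {}...", std::process::id(), action)
}

fn main() {
    // Prints something like: [pid 2191] Press Enter to jmp...
    println!("{}", pid_banner("jmp"));
}
```

The PID changes on every run, so the exact number will differ.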

Okay, cool! So, PID 2191. Let's print its memory map.

Shell session
$ cat /proc/2191/maps
55646d47b000-55646d484000 r--p 00000000 08:01 3813337 /home/amos/ftl/elk/target/debug/elk
55646d484000-55646d4e6000 r-xp 00009000 08:01 3813337 /home/amos/ftl/elk/target/debug/elk
55646d4e6000-55646d500000 r--p 0006b000 08:01 3813337 /home/amos/ftl/elk/target/debug/elk
55646d500000-55646d505000 r--p 00085000 08:01 3813337 /home/amos/ftl/elk/target/debug/elk
55646d505000-55646d506000 rw-p 0008a000 08:01 3813337 /home/amos/ftl/elk/target/debug/elk
55646e6a2000-55646e6a5000 rw-p 00000000 00:00 0 [heap]
55646e6a5000-55646e6a6000 rwxp 00000000 00:00 0 [heap]
55646e6a6000-55646e6c3000 rw-p 00000000 00:00 0 [heap]
7f295c859000-7f295c85b000 rw-p 00000000 00:00 0
7f295c85b000-7f295c880000 r--p 00000000 08:01 800108 /usr/lib/libc-2.30.so
7f295c880000-7f295c9cd000 r-xp 00025000 08:01 800108 /usr/lib/libc-2.30.so
7f295c9cd000-7f295ca17000 r--p 00172000 08:01 800108 /usr/lib/libc-2.30.so
7f295ca17000-7f295ca18000 ---p 001bc000 08:01 800108 /usr/lib/libc-2.30.so
7f295ca18000-7f295ca1b000 r--p 001bc000 08:01 800108 /usr/lib/libc-2.30.so
7f295ca1b000-7f295ca1e000 rw-p 001bf000 08:01 800108 /usr/lib/libc-2.30.so
7f295ca1e000-7f295ca22000 rw-p 00000000 00:00 0
7f295ca22000-7f295ca25000 r--p 00000000 08:01 790795 /usr/lib/libgcc_s.so.1
7f295ca25000-7f295ca36000 r-xp 00003000 08:01 790795 /usr/lib/libgcc_s.so.1
7f295ca36000-7f295ca3a000 r--p 00014000 08:01 790795 /usr/lib/libgcc_s.so.1
7f295ca3a000-7f295ca3b000 r--p 00017000 08:01 790795 /usr/lib/libgcc_s.so.1
7f295ca3b000-7f295ca3c000 rw-p 00018000 08:01 790795 /usr/lib/libgcc_s.so.1
7f295ca3c000-7f295ca43000 r--p 00000000 08:01 801270 /usr/lib/libpthread-2.30.so
7f295ca43000-7f295ca53000 r-xp 00007000 08:01 801270 /usr/lib/libpthread-2.30.so
7f295ca53000-7f295ca58000 r--p 00017000 08:01 801270 /usr/lib/libpthread-2.30.so
7f295ca58000-7f295ca59000 r--p 0001b000 08:01 801270 /usr/lib/libpthread-2.30.so
7f295ca59000-7f295ca5a000 rw-p 0001c000 08:01 801270 /usr/lib/libpthread-2.30.so
7f295ca5a000-7f295ca5e000 rw-p 00000000 00:00 0
7f295ca5e000-7f295ca5f000 r--p 00000000 08:01 800628 /usr/lib/libdl-2.30.so
7f295ca5f000-7f295ca60000 r-xp 00001000 08:01 800628 /usr/lib/libdl-2.30.so
7f295ca60000-7f295ca61000 r--p 00002000 08:01 800628 /usr/lib/libdl-2.30.so
7f295ca61000-7f295ca62000 r--p 00002000 08:01 800628 /usr/lib/libdl-2.30.so
7f295ca62000-7f295ca63000 rw-p 00003000 08:01 800628 /usr/lib/libdl-2.30.so
7f295ca63000-7f295ca65000 rw-p 00000000 00:00 0
7f295ca9d000-7f295ca9f000 rw-p 00000000 00:00 0
7f295ca9f000-7f295caa1000 r--p 00000000 08:01 795305 /usr/lib/ld-2.30.so
7f295caa1000-7f295cac0000 r-xp 00002000 08:01 795305 /usr/lib/ld-2.30.so
7f295cac0000-7f295cac8000 r--p 00021000 08:01 795305 /usr/lib/ld-2.30.so
7f295cac9000-7f295caca000 r--p 00029000 08:01 795305 /usr/lib/ld-2.30.so
7f295caca000-7f295cacb000 rw-p 0002a000 08:01 795305 /usr/lib/ld-2.30.so
7f295cacb000-7f295cacc000 rw-p 00000000 00:00 0
7ffc43a91000-7ffc43ab2000 rw-p 00000000 00:00 0 [stack]
7ffc43ac4000-7ffc43ac7000 r--p 00000000 00:00 0 [vvar]
7ffc43ac7000-7ffc43ac8000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0 [vsyscall]

That's uh, a lot of lines. Too many lines. Let's see - the address we printed for code was 0x55646e6a5000, which corresponds to this area:

   55646e6a2000-55646e6a5000 rw-p 00000000 00:00 0 [heap]
-> 55646e6a5000-55646e6a6000 rwxp 00000000 00:00 0 [heap]
   55646e6a6000-55646e6c3000 rw-p 00000000 00:00 0 [heap]
                             ^^^^
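Squinting at hex ranges gets old fast. Here's a rough sketch of how one could find the area containing a given address programmatically. It only assumes the line shape seen in the output above (address range, then permissions); `Region`, `parse_maps_line` and `find_region` are names invented for this example:

```rust
/// One line of `/proc/<pid>/maps`, reduced to the fields we care about.
#[derive(Debug, PartialEq)]
struct Region {
    start: u64,
    end: u64, // exclusive
    perms: String,
}

/// Parses lines such as
/// `55646e6a5000-55646e6a6000 rwxp 00000000 00:00 0 [heap]`.
/// Returns None for lines that don't follow that shape.
fn parse_maps_line(line: &str) -> Option<Region> {
    let mut fields = line.split_whitespace();
    let range = fields.next()?;
    let perms = fields.next()?.to_string();
    let (start, end) = range.split_once('-')?;
    Some(Region {
        start: u64::from_str_radix(start, 16).ok()?,
        end: u64::from_str_radix(end, 16).ok()?,
        perms,
    })
}

/// Finds the region that contains `addr`, if any.
fn find_region(maps: &str, addr: u64) -> Option<Region> {
    maps.lines()
        .filter_map(parse_maps_line)
        .find(|r| r.start <= addr && addr < r.end)
}

fn main() {
    // The three heap lines from the memory map above.
    let maps = "\
55646e6a2000-55646e6a5000 rw-p 00000000 00:00 0 [heap]
55646e6a5000-55646e6a6000 rwxp 00000000 00:00 0 [heap]
55646e6a6000-55646e6c3000 rw-p 00000000 00:00 0 [heap]";

    // The address we printed for `code`
    match find_region(maps, 0x55646e6a5000) {
        Some(r) => println!("{:x}-{:x} {}", r.start, r.end, r.perms),
        None => println!("not mapped"),
    }
}
```

Note that the end of each range is exclusive, which is why 0x55646e6a5000 belongs to the second line and not the first.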

Helpfully, these three areas are annotated with "heap" - and that makes sense! We did heap-allocate the memory for the data field of ProgramHeader. Our code is stored contiguously into a Vec<u8>, since we used to_vec on a &[u8].

There's something funny about these areas, too - only a5000..a6000 is executable, whereas a2000..a5000 and a6000..c3000 are just read+write.

This seems too close to be a coincidence. To find out what's going on, let's also pause before we make code executable:

Rust code
// in `elk/src/main.rs`

fn main() -> Result<(), Box<dyn Error>> {
    // (cut)

    println!("Executing {:?} in memory...", input_path);

    use region::{protect, Protection};
    let code = &code_ph.data;
    pause("protect")?; // this is new
    unsafe {
        protect(code.as_ptr(), code.len(), Protection::ReadWriteExecute)?;
    }

    let entry_offset = file.entry_point - code_ph.vaddr;
    let entry_point = unsafe { code.as_ptr().add(entry_offset.into()) };
    println!(" code @ {:?}", code.as_ptr());
    println!("entry offset @ {:?}", entry_offset);
    println!("entry point @ {:?}", entry_point);

    pause("jmp")?; // that is also new
    unsafe {
        jmp(entry_point);
    }

    Ok(())
}

// And this little helper function is new as well!
fn pause(reason: &str) -> Result<(), Box<dyn Error>> {
    println!("Press Enter to {}...", reason);
    {
        let mut s = String::new();
        std::io::stdin().read_line(&mut s)?;
    }
    Ok(())
}

Here's the heap memory before we call region::protect:

Shell session
$ cat /proc/3243/maps | grep heap
55e34dce7000-55e34dd08000 rw-p 00000000 00:00 0 [heap]

And after:

Shell session
$ cat /proc/3243/maps | grep heap
55e34dce7000-55e34dcea000 rw-p 00000000 00:00 0 [heap]
55e34dcea000-55e34dceb000 rwxp 00000000 00:00 0 [heap]
55e34dceb000-55e34dd08000 rw-p 00000000 00:00 0 [heap]

AhAH! So there was one unified heap, with read+write protection, and when we made code executable, it split into three regions - two read+write, as before, and a 4KiB one, in the middle, that's now read+write+execute.
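Why a whole 4KiB, when our code is only 0x25 bytes long? Memory protection applies to entire pages, so a request that touches even one byte of a page ends up covering all of it. Here's a rough sketch of that rounding, assuming 4KiB pages (the usual size on x86_64 Linux); `page_span` is a made-up helper, not something the region crate exposes:

```rust
const PAGE_SIZE: u64 = 0x1000; // 4KiB, assumed

/// Returns the page-aligned range [start, end) that fully covers
/// the `len` bytes starting at `addr`.
fn page_span(addr: u64, len: u64) -> (u64, u64) {
    let start = addr & !(PAGE_SIZE - 1);
    let end = (addr + len + PAGE_SIZE - 1) & !(PAGE_SIZE - 1);
    (start, end)
}

fn main() {
    // 0x25 bytes of code (file range 1000..1025), at the address
    // elk printed for `code` in the earlier run
    let (start, end) = page_span(0x55646e6a5000, 0x25);
    println!("{:x}-{:x}", start, end);
}
```

For the earlier run, this gives 55646e6a5000-55646e6a6000, which is exactly the rwxp line we saw in the memory map.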

This all makes sense.

But we're still no closer to finding out why it doesn't print "hi there".

Let's step through it again in ugdb:

Shell session
$ cargo b -q && ugdb ./target/debug/elk samples/hello
(gdb) break jmp
Breakpoint 1 at 0xb3c9: file src/main.rs, line 79.
(gdb) start

At this point, since the program expects input, we need to press "Esc", "Right arrow", and "Enter" to switch to the program output pane, then press "Enter" twice. Then, "Esc", "Left arrow", and "Enter" again to switch back to the GDB command prompt.

We are now in the jmp function. We can inspect addr:

Shell session
(gdb) print addr
$1 = (*mut u8) 0x5555555e4000 "\277\001\000"

Very good. Let's stepi our way forward.

We're about to move 0x1 into the edi register - that's good. Note that in our original assembly, we had mov rdi, 1. But edi is just the lower 32 bits of rdi, and on x86-64, writing to a 32-bit register zeroes the upper 32 bits of the matching 64-bit register, so both instructions leave rdi set to 1.

I'm guessing nasm picked edi because our constant did fit in a 32-bit integer and it was more compact.

Cool bear's hot tip

That is correct.

  • mov rdi, 1 assembles to 48 c7 c7 01 00 00 00
  • mov edi, 1 assembles to bf 01 00 00 00

Onward!

So, edi should now be set to 1, which we can check with info registers edi, or info reg edi for short:

Shell session
(gdb) info reg edi
edi            0x1                 1

And it is! And we're just about to, uh, movabs something to rsi, and the thing we're moving is a constant equal to... 0x402000.

Mhhh.

Mmmmmmmmmhhhhhh.

I don't think that memory address is valid for this program.

I don't think that memory address is valid for this program at all.

Let's inspect:

Shell session
(gdb) info proc
process 3447
Shell session
$ # -3 = only the first three lines.
$ # since memory ranges are sorted, we don't need more.
$ cat /proc/3447/maps | head -3
555555554000-55555555d000 r--p 00000000 08:01 3813344 /home/amos/ftl/elk/target/debug/elk
55555555d000-5555555c0000 r-xp 00009000 08:01 3813344 /home/amos/ftl/elk/target/debug/elk
5555555c0000-5555555db000 r--p 0006c000 08:01 3813344 /home/amos/ftl/elk/target/debug/elk

Yeah there's, uh, nothing at "402000".

Maybe that's why it doesn't, you know, print anything. Huh.

That address does ring a bell though.

Can we pull up the output of elk on samples/hello real quick again?

Shell session
$ cargo b -q && ./target/debug/elk samples/hello
Analyzing "samples/hello"...
File {
    type: Exec,
    machine: X86_64,
    entry_point: 00401000,
    program_headers: [
        file 00000000..000000e8 | mem 00400000..004000e8 | align 00001000 | R.. Load,
        file 00001000..00001025 | mem 00401000..00401025 | align 00001000 | R.X Load,
        file 00002000..00002009 | mem 00402000..00402009 | align 00001000 | RW. Load,

Stop right there! Here, that third line! Enhance!

Shell session
file 00002000..00002009 | mem 00402000..00402009 | align 00001000 | RW. Load,
                              ^^^^^^^^

There's our perp! I mean, our address! It's supposed to be mapped from the ELF file into memory! And the file range is 2000..2009, which is a suspiciously small range, I wonder what could be hiding in th...

Shell session
$ dd status=none if=samples/hello bs=1 count=9 skip=$((0x2000))
hi there

Oh. Hi.

It's pretty clear what's going on now. Our ELF file, samples/hello, contains not only x86 instructions (at 1000..1025), but also some data (at 2000..2009), and when we run it via elk, the data is definitely not at the address it expects.
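To make the relationship between "where it is in the file" and "where it expects to be in memory" concrete, here's a small sketch that translates a virtual address back into a file offset, using the three Load segments elk printed for samples/hello. The `Segment` struct and both functions are stand-ins made up for this example, not delf's actual types:

```rust
/// A pared-down stand-in for a `Load` program header
/// (hypothetical, not delf's `ProgramHeader`).
struct Segment {
    file_start: u64,
    mem_start: u64,
    size: u64,
}

/// Translates a virtual address into an offset in the ELF file,
/// or None if no segment covers that address.
fn file_offset_of(segments: &[Segment], vaddr: u64) -> Option<u64> {
    segments
        .iter()
        .find(|s| s.mem_start <= vaddr && vaddr < s.mem_start + s.size)
        .map(|s| vaddr - s.mem_start + s.file_start)
}

/// The three `Load` segments of samples/hello, as printed by elk.
fn hello_segments() -> Vec<Segment> {
    vec![
        Segment { file_start: 0x0000, mem_start: 0x400000, size: 0xe8 },
        Segment { file_start: 0x1000, mem_start: 0x401000, size: 0x25 },
        Segment { file_start: 0x2000, mem_start: 0x402000, size: 0x09 },
    ]
}

fn main() {
    let segments = hello_segments();
    // The address that gets loaded into rsi
    println!("{:x?}", file_offset_of(&segments, 0x402000));
    // The entry point
    println!("{:x?}", file_offset_of(&segments, 0x401000));
}
```

The address in rsi maps back to file offset 0x2000, which is precisely where dd found "hi there".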

Maybe we can work around this. Maybe we cheat a little.

More than one segment? In this economy?

Maybe... we write a program that doesn't have any data.

X86 Assembly
; in `elk/samples/nodata.asm`

        global _start

        section .text

_start: mov rdi, 1      ; stdout fd
        sub rsp, 10     ; allocate 10 bytes on stack
        mov byte [rsp+0], 111
        mov byte [rsp+1], 107
        mov byte [rsp+2], 97
        mov byte [rsp+3], 121
        mov byte [rsp+4], 32
        mov byte [rsp+5], 116
        mov byte [rsp+6], 104
        mov byte [rsp+7], 101
        mov byte [rsp+8], 110
        mov byte [rsp+9], 10
        mov rsi, rsp
        mov rdx, 10     ; 9 chars + newline
        mov rax, 1      ; write syscall
        syscall
        add rsp, 10     ; free memory

        xor rdi, rdi    ; return code 0
        mov rax, 60     ; exit syscall
        syscall
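If you don't read decimal ASCII fluently, here's a tiny sketch that decodes the ten bytes the assembly above writes to the stack. `STACK_BYTES` and `decode` are names made up for this example:

```rust
/// The ten bytes nodata.asm writes to [rsp+0] through [rsp+9], in order.
const STACK_BYTES: [u8; 10] = [111, 107, 97, 121, 32, 116, 104, 101, 110, 10];

/// Decodes the bytes as text (they're all plain ASCII here).
fn decode(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}

fn main() {
    let message = decode(&STACK_BYTES);
    // {:?} escapes the trailing newline so it's visible
    println!("{:?} ({} bytes)", message, message.len());
}
```

That's nine characters plus a newline, which is why 10 gets passed in rdx as the length for the write syscall.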

And then compile it:

Shell session
$ # in `elk/samples/`
$ nasm -f elf64 nodata.asm
$ ld nodata.o -o nodata

And then run it through elk:

Shell session
$ cargo b -q && ./target/debug/elk samples/nodata
Analyzing "samples/nodata"...
File {
    type: Exec,
    machine: X86_64,
    entry_point: 00401000,
    program_headers: [
        file 00000000..000000b0 | mem 00400000..004000b0 | align 00001000 | R.. Load,
        file 00001000..00001057 | mem 00401000..00401057 | align 00001000 | R.X Load,
    ],
}
Disassembling "samples/nodata"...
00401000  BF01000000        mov edi,0x1
00401005  4883EC0A          sub rsp,byte +0xa
00401009  C604246F          mov byte [rsp],0x6f
0040100D  C64424016B        mov byte [rsp+0x1],0x6b
00401012  C644240261        mov byte [rsp+0x2],0x61
00401017  C644240379        mov byte [rsp+0x3],0x79
0040101C  C644240420        mov byte [rsp+0x4],0x20
00401021  C644240574        mov byte [rsp+0x5],0x74
00401026  C644240668        mov byte [rsp+0x6],0x68
0040102B  C644240765        mov byte [rsp+0x7],0x65
00401030  C64424086E        mov byte [rsp+0x8],0x6e
00401035  C64424090A        mov byte [rsp+0x9],0xa
0040103A  4889E6            mov rsi,rsp
0040103D  BA0A000000        mov edx,0xa
00401042  B801000000        mov eax,0x1
00401047  0F05              syscall
00401049  4883C40A          add rsp,byte +0xa
0040104D  4831FF            xor rdi,rdi
00401050  B83C000000        mov eax,0x3c
00401055  0F05              syscall
Executing "samples/nodata" in memory...
Press Enter to protect...
 code @ 0x563afdafcfa0
entry offset @ 00000000
entry point @ 0x563afdafcfa0
Press Enter to jmp...
okay then

Hey, that worked!

When malloc is not enough

Good job everyone, I can't believe this series was so short, but we did it, we wrote a program that packs.. well, that executes, any ELF progr... well, ELF programs that do not have any data and uhh... yeah okay we're not quite done.

I do think we should take some time to celebrate our achievement properly, though. Working our way through the innards of executable files is no small feat, and we're doing great so far. I think we should reward ourselves with a cookie, maybe a warm drink.

Ahh. Success.

Okay, back to work. We've executed samples/nodata without exec, sure. But can we execute samples/hello?

The main problem is that the code in hello assumes there's going to be some data at 0x402000. I don't suppose there's a way to... allocate memory at a given address, right?

Shell session
$ man malloc
NAME
       malloc, free, calloc, realloc - allocate and free dynamic memory

SYNOPSIS
       #include <stdlib.h>

       void *malloc(size_t size);

Yeah, okay, no, malloc won't help us there.

But maybe... mmap?

Shell session
$ man mmap
NAME
       mmap — map pages of memory

SYNOPSIS
       #include <sys/mman.h>

       void *mmap(void *addr, size_t len, int prot, int flags,
           int fildes, off_t off);

Hey! mmap takes an addr! That's an address!

Is there a crate for that?

Cool bear's hot tip

There is a crate for that.

Jazzy.

Shell session
$ cargo add mmap
      Adding mmap v0.1.1 to dependencies

So, as it turns out, mmap is crazy powerful. It's mostly brought up when we want to map part of a file into memory (which, as we've seen before, is what the operating system definitely does when running an executable). But it can also create "anonymous" mappings, that aren't backed by any particular file.

It also usually picks its own address (so, 0x0 is passed to addr, and then whatever it returns is what we get), but it can also create "fixed" mappings, at a precise address - and that sounds like exactly what we want right now.

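The battle plan is:

  • for each "Load" segment, map a writable memory area at the address its program header asks for,
  • copy the segment's data into it,
  • set the permissions the segment asks for,
  • and finally, jump to the entry point.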

Let's get to it:

Rust code
// in `elk/src/main.rs`

use std::{env, error::Error, fs};

use mmap::{MapOption, MemoryMap};
use region::{protect, Protection};

fn main() -> Result<(), Box<dyn Error>> {
    // omitted: getting command-line arguments, parsing file, disassembling code section

    println!("Mapping {:?} in memory...", input_path);

    // we'll need to hold onto our "mmap::MemoryMap", because dropping them
    // unmaps them!
    let mut mappings = Vec::new();

    // we're only interested in "Load" segments
    for ph in file
        .program_headers
        .iter()
        .filter(|ph| ph.r#type == delf::SegmentType::Load)
    {
        println!("Mapping segment @ {:?} with {:?}", ph.mem_range(), ph.flags);
        // note: mmap-ing would fail if the segments weren't aligned on pages,
        // but luckily, that is the case in the file already. That is not a coincidence.
        let mem_range = ph.mem_range();
        let len: usize = (mem_range.end - mem_range.start).into();
        // we'll be doing a lot of unsafe things...
        let addr: *mut u8 = unsafe { std::mem::transmute(mem_range.start.0 as usize) };
        // at first, we want the memory area to be writable, so we can copy to it.
        // we'll set the right permissions later
        let map = MemoryMap::new(len, &[MapOption::MapWritable, MapOption::MapAddr(addr)])?;

        println!("Copying segment data...");
        unsafe {
            // this is basically "memcpy".
            // it is also very unsafe.
            // we copy `ph.data.len()` bytes (what the file actually holds),
            // so we never read past the end of the source buffer.
            std::ptr::copy_nonoverlapping(ph.data.as_ptr(), addr, ph.data.len());
        }

        println!("Adjusting permissions...");
        // the `region` crate and our `delf` crate have two different
        // enums (and bit flags) for protection, so we need to map from
        // delf's to region's.
        let mut protection = Protection::None;
        for flag in ph.flags.iter() {
            protection |= match flag {
                delf::SegmentFlag::Read => Protection::Read,
                delf::SegmentFlag::Write => Protection::Write,
                delf::SegmentFlag::Execute => Protection::Execute,
            }
        }
        unsafe {
            protect(addr, len, protection)?;
        }
        mappings.push(map);
    }

    println!("Jumping to entry point @ {:?}...", file.entry_point);
    pause("jmp")?;
    unsafe {
        // note that we don't have to do pointer arithmetic here,
        // as the entry point is indeed mapped in memory at the right place.
        jmp(std::mem::transmute(file.entry_point.0 as usize));
    }

    Ok(())
}

I'm too afraid to run it.

There's a little too much unsafe code in here for my taste. This would definitely not pass code review. We'd be sure to get fired or worse, promoted.

Regardless, let's be brave folks and run it anyway - on samples/hello, since we've already gotten samples/nodata to work.

Shell session
$ cargo b -q && ./target/debug/elk samples/hello
Analyzing "samples/hello"...
File {
    type: Exec,
    machine: X86_64,
    entry_point: 00401000,
    program_headers: [
        file 00000000..000000e8 | mem 00400000..004000e8 | align 00001000 | R.. Load,
        file 00001000..00001025 | mem 00401000..00401025 | align 00001000 | R.X Load,
        file 00002000..00002009 | mem 00402000..00402009 | align 00001000 | RW. Load,
    ],
}
Disassembling "samples/hello"...
00401000  BF01000000        mov edi,0x1
00401005  48BE002040000000  mov rsi,0x402000
         -0000
0040100F  BA09000000        mov edx,0x9
00401014  B801000000        mov eax,0x1
00401019  0F05              syscall
0040101B  4831FF            xor rdi,rdi
0040101E  B83C000000        mov eax,0x3c
00401023  0F05              syscall
Mapping "samples/hello" in memory...
Mapping segment @ 00400000..004000e8 with BitFlags<SegmentFlag>(0b100, Read)
Copying segment data...
Adjusting permissions...
Mapping segment @ 00401000..00401025 with BitFlags<SegmentFlag>(0b101, Execute | Read)
Copying segment data...
Adjusting permissions...
Mapping segment @ 00402000..00402009 with BitFlags<SegmentFlag>(0b110, Write | Read)
Copying segment data...
Adjusting permissions...
Jumping to entry point @ 00401000...
Press Enter to jmp...
hi there
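Those BitFlags values line up with the r/w/x columns we've been reading in /proc maps all along. As a quick sketch (the constant and function names are made up for this example, but the bit values are the ones shown in the output above), here's how the flag bits translate to the familiar permission strings:

```rust
// Segment flag bits, matching the binary values in the output above
const PF_X: u32 = 0b001;
const PF_W: u32 = 0b010;
const PF_R: u32 = 0b100;

/// Renders flag bits the way the first three permission characters
/// of a /proc/<pid>/maps line look, e.g. 0b101 becomes "r-x".
fn perm_string(flags: u32) -> String {
    let mut s = String::with_capacity(3);
    s.push(if flags & PF_R != 0 { 'r' } else { '-' });
    s.push(if flags & PF_W != 0 { 'w' } else { '-' });
    s.push(if flags & PF_X != 0 { 'x' } else { '-' });
    s
}

fn main() {
    // The flags of the three segments of samples/hello
    for flags in [0b100, 0b101, 0b110] {
        println!("{:#05b} => {}", flags, perm_string(flags));
    }
}
```

So the three segments of samples/hello should show up as r--, r-x and rw- respectively once they're mapped.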

Huh. I can't believe that worked.

I guess there's some hope after all.

Dealing with conflict in 17 easy steps

Can we execute other things with it? Like our C program, entry_point?

Shell session
$ cargo b -q && ./target/debug/elk ./samples/entry_point
Analyzing "./samples/entry_point"...
Parsing failed:
Nom(MapRes) at position 64:
00000040: 06 00 00 00 04 00 00 00 40 00 00 00 00 00 00 00 40 00 00 00
Context("SegmentType") at position 64:
00000040: 06 00 00 00 04 00 00 00 40 00 00 00 00 00 00 00 40 00 00 00

Ah. An unknown segment type. Alright, let's add the missing ones in delf:

Rust code
// in `delf/src/lib.rs`

#[derive(Debug, Clone, Copy, PartialEq, Eq, TryFromPrimitive)]
#[repr(u32)]
pub enum SegmentType {
    Null = 0x0,
    Load = 0x1,
    Dynamic = 0x2,
    Interp = 0x3,
    Note = 0x4,
    ShLib = 0x5,
    PHdr = 0x6,
    TLS = 0x7,
    LoOS = 0x6000_0000,
    HiOS = 0x6FFF_FFFF,
    LoProc = 0x7000_0000,
    HiProc = 0x7FFF_FFFF,
    GnuEhFrame = 0x6474_E550,
    GnuStack = 0x6474_E551,
    GnuRelRo = 0x6474_E552,
    GnuProperty = 0x6474_E553,
}

"But Amos" - I can already hear you interject - "where do these come from? Is there a reference online with all these segment types?" And the answer is no. I've had to go hunt for the last three hours all over the internet for you. This is your reference.

You're welcome.
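To see how the failed parse relates to that list, here's a standalone sketch that decodes the four bytes at position 64 as a little-endian u32 and names the result. It's plain std Rust rather than delf's nom-based parser, and `read_u32_le` and `segment_type_name` are names invented for this example. One thing worth noting: LoOS/HiOS and LoProc/HiProc are the bounds of reserved ranges rather than standalone types, so the sketch treats them as ranges:

```rust
/// Reads a little-endian u32 from the first four bytes of `bytes`.
fn read_u32_le(bytes: &[u8]) -> Option<u32> {
    match bytes {
        [a, b, c, d, ..] => Some(u32::from_le_bytes([*a, *b, *c, *d])),
        _ => None,
    }
}

/// Names a segment type value, using the values from the enum above.
fn segment_type_name(value: u32) -> &'static str {
    match value {
        0x0 => "Null",
        0x1 => "Load",
        0x2 => "Dynamic",
        0x3 => "Interp",
        0x4 => "Note",
        0x5 => "ShLib",
        0x6 => "PHdr",
        0x7 => "TLS",
        0x6474_E550 => "GnuEhFrame",
        0x6474_E551 => "GnuStack",
        0x6474_E552 => "GnuRelRo",
        0x6474_E553 => "GnuProperty",
        // LoOS..=HiOS and LoProc..=HiProc are ranges, not single values
        0x6000_0000..=0x6FFF_FFFF => "OS-specific (unnamed)",
        0x7000_0000..=0x7FFF_FFFF => "processor-specific (unnamed)",
        _ => "unknown",
    }
}

fn main() {
    // The four bytes at position 64 in the failed parse above
    let bytes = [0x06, 0x00, 0x00, 0x00];
    let value = read_u32_le(&bytes).unwrap();
    println!("{:#x} => {}", value, segment_type_name(value));
}
```

So the segment type our parser choked on was 0x6, PHdr.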

Back to trying to run entry_point with elk:

Shell session
$ cargo b -q && ./target/debug/elk ./samples/entry_point
Analyzing "./samples/entry_point"...
File {
    type: Dyn,
    machine: X86_64,
    entry_point: 00001070,
    program_headers: [
        file 00000040..000002a8 | mem 00000040..000002a8 | align 00000008 | R.. PHdr,
        file 000002a8..000002c4 | mem 000002a8..000002c4 | align 00000001 | R.. Interp,
        file 00000000..00000628 | mem 00000000..00000628 | align 00001000 | R.. Load,
        file 00001000..000012d5 | mem 00001000..000012d5 | align 00001000 | R.X Load,
        file 00002000..000021a0 | mem 00002000..000021a0 | align 00001000 | R.. Load,
        file 00002de8..00003050 | mem 00003de8..00004058 | align 00001000 | RW. Load,
        file 00002df8..00002fd8 | mem 00003df8..00003fd8 | align 00000008 | RW. Dynamic,
        file 000002c4..00000308 | mem 000002c4..00000308 | align 00000004 | R.. Note,
        file 00002094..000020c8 | mem 00002094..000020c8 | align 00000004 | R.. GnuEhFrame,
        file 00000000..00000000 | mem 00000000..00000000 | align 00000010 | RW. GnuStack,
        file 00002de8..00003000 | mem 00003de8..00004000 | align 00000001 | R.. GnuRelRo,
    ],
}
Disassembling "./samples/entry_point"...
00001070  F30F1EFA          rep hint_nop55 edx
00001074  4883EC08          sub rsp,byte +0x8
00001078  488B05D92F0000    mov rax,[rel 0x4058]
0000107F  4885C0            test rax,rax
(cut)
00001330  F30F1EFA          rep hint_nop55 edx
00001334  C3                ret
00001335  0000              add [rax],al
00001337  00F3              add bl,dh
00001339  0F1EFA            hint_nop55 edx
0000133C  4883EC08          sub rsp,byte +0x8
00001340  4883C408          add rsp,byte +0x8
00001344  C3                ret
Mapping "./samples/entry_point" in memory...
Mapping segment @ 00000000..00000628 with BitFlags<SegmentFlag>(0b100, Read)
Error: ErrUnknown(1)

That's better. It still errors out, granted, but at least we get to look at a whole lot of assembly.

Cool bear's hot tip

If you're wondering what that "rep hint_nop55 edx" is, it's ndisasm showing its limits.

The actual instruction is "endbr64", which stands for "End Branch 64 bit". Compilers emit it to mark valid targets for indirect jumps and calls, as part of Intel's Control-flow Enforcement Technology (CET).

There is more information on StackOverflow.

ErrUnknown isn't exactly the most expressive error but hey, no abstraction is perfect. It's true that we're trying to map a memory area whose size isn't a multiple of 4KiB, but the first parameter to MemoryMap::new is min_len, so I'm assuming it can round up as needed.
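For reference, "rounding up as needed" would look something like the sketch below. Whether `MemoryMap::new` does this internally is, as said, an assumption; `round_up_to_page` is just a made-up helper showing the arithmetic, with 4KiB pages assumed:

```rust
const PAGE_SIZE: usize = 0x1000; // 4KiB, assumed

/// Rounds `len` up to the next multiple of the page size.
fn round_up_to_page(len: usize) -> usize {
    (len + PAGE_SIZE - 1) & !(PAGE_SIZE - 1)
}

fn main() {
    // The first Load segment of entry_point is 0x628 bytes long
    println!("{:#x}", round_up_to_page(0x628));
}
```

The mask trick only works because the page size is a power of two.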

However, the address itself... it's 0x0. I'm pretty sure that's reserved.

Cool bear's hot tip

You may know 0x0 by the name NULL, or nullptr. That's for reasonable platforms.

As this StackOverflow answer puts it, on some platforms NULL is not 0x0.

Mhh.

What if... we mapped all the segments in the right order, with the right alignment... but somewhere higher in memory? Somewhere that isn't, you know, 0x0.

When we were running samples/hello, the first region started at 0x400000, and it worked just fine! So maybe we can use that as a base address?

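Let's try it out. We'll need to:

  • pick a base address,
  • map each segment at its address plus that base,
  • and jump to the entry point plus that base.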

Rust code
// in `elk/src/main.rs`

fn main() -> Result<(), Box<dyn Error>> {
    // (cut)

    // picked by fair 4KiB-aligned dice roll
    let base = 0x400000_usize;

    println!("Mapping {:?} in memory...", input_path);

    let mut mappings = Vec::new();

    for ph in file
        .program_headers
        .iter()
        .filter(|ph| ph.r#type == delf::SegmentType::Load)
    {
        println!("Mapping segment @ {:?} with {:?}", ph.mem_range(), ph.flags);
        let mem_range = ph.mem_range();
        let len: usize = (mem_range.end - mem_range.start).into();

        // map each segment "base" higher than the program header says
        let start: usize = mem_range.start.0 as usize + base;
        let addr: *mut u8 = unsafe { std::mem::transmute(start) };
        println!("Addr: {:p}", addr);
        let map = MemoryMap::new(len, &[MapOption::MapWritable, MapOption::MapAddr(addr)])?;

        println!("Copying segment data...");
        // omitted

        println!("Adjusting permissions...");
        // omitted
    }

    println!("Jumping to entry point @ {:?}...", file.entry_point);
    pause("jmp")?;
    unsafe {
        // jump to the base-adjusted entry point
        jmp(std::mem::transmute(file.entry_point.0 as usize + base));
    }

    Ok(())
}

(For brevity, I've also commented out the part that disassembles the code - turns out, the code section is pretty long in entry_point. But that's to be expected - C is not very low-level, so it generates a lot of assembly.)

Cool bear's hot tip

That's a joke.

Do not send letters.

With this change, we get further!

Shell session
$ cargo b -q && ./target/debug/elk ./samples/entry_point
Analyzing "./samples/entry_point"...
File {
    type: Dyn,
    machine: X86_64,
    entry_point: 00001070,
    program_headers: [
        file 00000040..000002a8 | mem 00000040..000002a8 | align 00000008 | R.. PHdr,
        file 000002a8..000002c4 | mem 000002a8..000002c4 | align 00000001 | R.. Interp,
        file 00000000..00000628 | mem 00000000..00000628 | align 00001000 | R.. Load,
        file 00001000..000012d5 | mem 00001000..000012d5 | align 00001000 | R.X Load,
        file 00002000..000021a0 | mem 00002000..000021a0 | align 00001000 | R.. Load,
        file 00002de8..00003050 | mem 00003de8..00004058 | align 00001000 | RW. Load,
        file 00002df8..00002fd8 | mem 00003df8..00003fd8 | align 00000008 | RW. Dynamic,
        file 000002c4..00000308 | mem 000002c4..00000308 | align 00000004 | R.. Note,
        file 00002094..000020c8 | mem 00002094..000020c8 | align 00000004 | R.. GnuEhFrame,
        file 00000000..00000000 | mem 00000000..00000000 | align 00000010 | RW. GnuStack,
        file 00002de8..00003000 | mem 00003de8..00004000 | align 00000001 | R.. GnuRelRo,
    ],
}
Mapping "./samples/entry_point" in memory...
Mapping segment @ 00000000..00000628 with BitFlags<SegmentFlag>(0b100, Read)
Addr: 0x400000
Copying segment data...
Adjusting permissions...
Mapping segment @ 00001000..000012d5 with BitFlags<SegmentFlag>(0b101, Execute | Read)
Addr: 0x401000
Copying segment data...
Adjusting permissions...
Mapping segment @ 00002000..000021a0 with BitFlags<SegmentFlag>(0b100, Read)
Addr: 0x402000
Copying segment data...
Adjusting permissions...
Mapping segment @ 00003de8..00004058 with BitFlags<SegmentFlag>(0b110, Write | Read)
Addr: 0x403de8
Error: ErrUnaligned

Ah! So the mmap crate does have a specific error code for unaligned memory regions.

And who could blame it. 0x3de8 is definitely not 4KiB-aligned.
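We could have seen this one coming, too. Here's a quick sketch that checks the base-adjusted start address of each Load segment of entry_point for alignment, assuming 4KiB pages; `is_page_aligned` and `unaligned_starts` are made-up helpers, not part of the mmap crate:

```rust
const PAGE_SIZE: usize = 0x1000; // 4KiB, assumed

fn is_page_aligned(addr: usize) -> bool {
    addr % PAGE_SIZE == 0
}

/// Returns the base-adjusted start addresses that are NOT page-aligned.
fn unaligned_starts(base: usize, starts: &[usize]) -> Vec<usize> {
    starts
        .iter()
        .map(|s| base + s)
        .filter(|a| !is_page_aligned(*a))
        .collect()
}

fn main() {
    // Start addresses of the four Load segments of entry_point
    let starts = [0x0, 0x1000, 0x2000, 0x3de8];
    for addr in unaligned_starts(0x400000, &starts) {
        println!("{:#x} is not 4KiB-aligned", addr);
    }
}
```

Only the last segment is flagged. Since our base is itself 4KiB-aligned, adding it doesn't change which segments are aligned.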

We've done realignment before in C (in Running an executable without exec), and we can do it in Rust easily.

We'll just make the segment slightly bigger than requested (i.e. we'll make it start earlier, on the left-adjacent 4KiB boundary), and if we're careful to copy the segment data in the right place, everything should work out.

First a little helper:

Rust code
// in `elk/src/main.rs`

/**
 * Truncates a usize value to the left-adjacent (low) 4KiB boundary.
 */
fn align_lo(x: usize) -> usize {
    x & !0xFFF
}
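Before wiring it in, here's a standalone sketch of the arithmetic for the segment that gave us trouble (mem 3de8..4058, with our base of 0x400000). `align_lo` is the same helper as above; `aligned_mapping` is a made-up function for this example:

```rust
/// Same helper as above: truncates to the left-adjacent 4KiB boundary.
fn align_lo(x: usize) -> usize {
    x & !0xFFF
}

/// Hypothetical helper: given where a segment should start and how long
/// it is, returns (aligned start, padding, total length to map).
fn aligned_mapping(start: usize, len: usize) -> (usize, usize, usize) {
    let aligned_start = align_lo(start);
    let padding = start - aligned_start;
    (aligned_start, padding, len + padding)
}

fn main() {
    // Last Load segment of entry_point: mem 3de8..4058, base 0x400000
    let start = 0x400000 + 0x3de8;
    let len = 0x4058 - 0x3de8;
    let (aligned_start, padding, total) = aligned_mapping(start, len);
    println!(
        "addr {:#x}, padding {:#x}, total length {:#x}",
        aligned_start, padding, total
    );
}
```

The mapping starts at 0x403000, and the segment data has to land 0xde8 bytes into it.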

And then the pièce de résistance:

Rust code
// in `elk/src/main.rs`

fn main() -> Result<(), Box<dyn Error>> {
    // (cut)

    let base = 0x400000_usize;

    println!("Mapping {:?} in memory...", input_path);

    let mut mappings = Vec::new();

    for ph in file
        .program_headers
        .iter()
        .filter(|ph| ph.r#type == delf::SegmentType::Load)
    {
        println!("Mapping segment @ {:?} with {:?}", ph.mem_range(), ph.flags);
        let mem_range = ph.mem_range();
        let len: usize = (mem_range.end - mem_range.start).into();

        let start: usize = mem_range.start.0 as usize + base;
        let aligned_start: usize = align_lo(start);
        let padding = start - aligned_start;
        let len = len + padding;

        let addr: *mut u8 = unsafe { std::mem::transmute(aligned_start) };
        println!("Addr: {:p}, Padding: {:08x}", addr, padding);

        let map = MemoryMap::new(len, &[MapOption::MapWritable, MapOption::MapAddr(addr)])?;

        println!("Copying segment data...");
        unsafe {
            // `len` now includes the padding (and the in-memory size can exceed
            // what's in the file), so we copy exactly `ph.data.len()` bytes to
            // avoid reading past the end of the source buffer.
            std::ptr::copy_nonoverlapping(ph.data.as_ptr(), addr.add(padding), ph.data.len());
        }

        println!("Adjusting permissions...");
        let mut protection = Protection::None;
        for flag in ph.flags.iter() {
            protection |= match flag {
                delf::SegmentFlag::Read => Protection::Read,
                delf::SegmentFlag::Write => Protection::Write,
                delf::SegmentFlag::Execute => Protection::Execute,
            }
        }
        unsafe {
            protect(addr, len, protection)?;
        }
        mappings.push(map);
    }

    println!("Jumping to entry point @ {:?}...", file.entry_point);
    pause("jmp")?;
    unsafe {
        jmp(std::mem::transmute(file.entry_point.0 as usize + base));
    }

    Ok(())
}

Let's go:

Shell session
$ cargo b -q && ./target/debug/elk ./samples/entry_point
Analyzing "./samples/entry_point"...
File {
    type: Dyn,
    machine: X86_64,
    entry_point: 00001070,
    program_headers: [
        file 00000040..000002a8 | mem 00000040..000002a8 | align 00000008 | R.. PHdr,
        file 000002a8..000002c4 | mem 000002a8..000002c4 | align 00000001 | R.. Interp,
        file 00000000..00000628 | mem 00000000..00000628 | align 00001000 | R.. Load,
        file 00001000..000012d5 | mem 00001000..000012d5 | align 00001000 | R.X Load,
        file 00002000..000021a0 | mem 00002000..000021a0 | align 00001000 | R.. Load,
        file 00002de8..00003050 | mem 00003de8..00004058 | align 00001000 | RW. Load,
        file 00002df8..00002fd8 | mem 00003df8..00003fd8 | align 00000008 | RW. Dynamic,
        file 000002c4..00000308 | mem 000002c4..00000308 | align 00000004 | R.. Note,
        file 00002094..000020c8 | mem 00002094..000020c8 | align 00000004 | R.. GnuEhFrame,
        file 00000000..00000000 | mem 00000000..00000000 | align 00000010 | RW. GnuStack,
        file 00002de8..00003000 | mem 00003de8..00004000 | align 00000001 | R.. GnuRelRo,
    ],
}
Mapping "./samples/entry_point" in memory...
Mapping segment @ 00000000..00000628 with BitFlags<SegmentFlag>(0b100, Read)
Addr: 0x400000, Padding: 00000000
Copying segment data...
Adjusting permissions...
Mapping segment @ 00001000..000012d5 with BitFlags<SegmentFlag>(0b101, Execute | Read)
Addr: 0x401000, Padding: 00000000
Copying segment data...
Adjusting permissions...
Mapping segment @ 00002000..000021a0 with BitFlags<SegmentFlag>(0b100, Read)
Addr: 0x402000, Padding: 00000000
Copying segment data...
Adjusting permissions...
Mapping segment @ 00003de8..00004058 with BitFlags<SegmentFlag>(0b110, Write | Read)
Addr: 0x403000, Padding: 00000de8
Copying segment data...
Adjusting permissions...
Jumping to entry point @ 00001070...
Press Enter to jmp...

[1]    12138 segmentation fault (core dumped)  ./target/debug/elk ./samples/entry_point

deep breaths

So there's good news and bad news.

The good news is: we've mapped every segment successfully!

The bad news is: it segfaults.

You know the drill, let's fire up ugdb.

Shell session
$ cargo b -q && ugdb ./target/debug/elk ./samples/entry_point (gdb) break jmp Breakpoint 1 at 0xf6a9: file src/main.rs, line 124. (gdb) start (switch to right pane, press enter, switch back to gdb pane) (gdb) stepi (gdb) stepi (gdb) stepi

Heyyy there's our endbr64!

Everything seems to be okay so far.

I don't see any movabs, which is great. In fact, most memory operations seem to be computed: [rip+0x236], [rip+0x1bf], etc.

GDB helpfully annotates those, and we can see that 0x4012c0, 0x401250 etc. are all in mapped memory regions. That seems good!

And here we are. Baby's first call.

Is the annotation accurate? Let's see what's in the rip register:

Shell session
(gdb) info reg rip
rip            0x401098            0x401098

Very well! If we add 0x2f42 to it, we get...

Shell session
(gdb) print 0x401098+0x2f42
$1 = 4210650

...a decimal value, apparently. Luckily, we can get GDB to print hexadecimal values by using formatting directives, like so: print/:format. In our case, we want hexadecimal, the directive is just x:

Shell session
(gdb) print/x 0x401098+0x2f42
$2 = 0x403fda

That's uhhh better, but still not what GDB is showing us.

What is rip anyway?

The instruction pointer is called ip in 16-bit mode, eip in 32-bit mode, and rip in 64-bit mode. The instruction pointer register points to the memory address which the processor will next attempt to execute.

Source: X86 assembly language, Wikipedia

Ah. So maybe by the time this instruction executes, rip won't be set to 0x401098 (where the call is), but to 0x40109e (where the hlt is).

Shell session
(gdb) print/x 0x40109e+0x2f42
$3 = 0x403fe0

Jolly good! GDB agrees with us, and everything still makes sense for the time being.
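The rule we just stumbled onto fits in one function. A minimal sketch, where `rip_relative_target` is a made-up name and the 6-byte instruction length is inferred from the call sitting at 0x401098 and the hlt at 0x40109e:

```rust
/// Address a rip-relative operand refers to: the displacement is added to
/// the address of the *next* instruction, not the current one.
fn rip_relative_target(insn_addr: u64, insn_len: u64, disp: i32) -> u64 {
    // Sign-extend the 32-bit displacement before adding it.
    (insn_addr + insn_len).wrapping_add(disp as i64 as u64)
}

fn main() {
    // `call QWORD PTR [rip+0x2f42]` at 0x401098, 6 bytes long.
    let target = rip_relative_target(0x401098, 6, 0x2f42);
    println!("{:#x}", target);
}
```

The displacement is signed, which is why it goes through `i64` first: a negative displacement reaches backwards from the next instruction.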

Let's jump:

Shell session
(gdb) stepi

Oh.

Well yeah. Jumping to 0x0 is a fairly good way to segfault.

Which just raises further questions. What was it even trying to jump to?

In which we relax the rules of the series a tiny bit

To answer this, we'll need to cheat - just a little. I didn't want us to use objdump throughout the whole series, but today's cheat day, so let's indulge together.

objdump is pretty much what we would've been using from the beginning if we didn't have strong NIH (not invented here) syndrome.

I mean, if we weren't trying to learn ELF the hard way.

I mean the fun way!

We're going to use only a small part of its superpowers, just to disassemble entry_point a little.

Shell session
$ # in `elk/` $ objdump -d ./samples/entry_point | less ./samples/entry_point: file format elf64-x86-64 Disassembly of section .init: 0000000000001000 <_init>: 1000: f3 0f 1e fa endbr64 1004: 48 83 ec 08 sub rsp,0x8 1008: 48 8b 05 d9 2f 00 00 mov rax,QWORD PTR [rip+0x2fd9] # 3fe8 <__gmon_start__> 100f: 48 85 c0 test rax,rax 1012: 74 02 je 1016 <_init+0x16> 1014: ff d0 call rax 1016: 48 83 c4 08 add rsp,0x8 101a: c3 ret (cut)

Oh look, assembly!

Cool bear's hot tip

objdump outputs AT&T syntax by default.

If you want Intel syntax, you can use -M intel. Amos has it aliased, because he's ~~~lazy~~~ wary of extra keystrokes.

From less, we can search for 2f42 by typing /2f42 and pressing Enter.

Shell session
(cut) Disassembly of section .text: 0000000000001070 <_start>: 1070: f3 0f 1e fa endbr64 1074: 31 ed xor ebp,ebp 1076: 49 89 d1 mov r9,rdx 1079: 5e pop rsi 107a: 48 89 e2 mov rdx,rsp 107d: 48 83 e4 f0 and rsp,0xfffffffffffffff0 1081: 50 push rax 1082: 54 push rsp 1083: 4c 8d 05 36 02 00 00 lea r8,[rip+0x236] # 12c0 <__libc_csu_fini> 108a: 48 8d 0d bf 01 00 00 lea rcx,[rip+0x1bf] # 1250 <__libc_csu_init> 1091: 48 8d 3d d1 00 00 00 lea rdi,[rip+0xd1] # 1169 <main> 1098: ff 15 42 2f 00 00 call QWORD PTR [rip+0x2f42] # 3fe0 <__libc_start_main@GLIBC_2.2.5> 109e: f4 hlt 109f: 90 nop (cut)

Theeeeeeeeere it is. We've got the call and the hlt and everything!

Not only is the target of the call annotated with its address (3fe0, not adjusted for the base address), but it's also annotated with.. its symbol!

__libc_start_main@GLIBC_2.2.5.

Which, uh. I don't remember defining that in entry_point.c. Would a C compiler really pull in a large set of dependencies just for a basic program?

Cool bear's hot tip

Again, joke, no letters.

Let's check what symbols are defined in ./samples/entry_point using nm (which stands for "name list", according to the V7 Unix Manual).

(Note: you can quit less by pressing "q")

Shell session
$ nm ./samples/entry_point 0000000000004050 B __bss_start 0000000000004050 b completed.7392 w __cxa_finalize@@GLIBC_2.2.5 0000000000004038 D __data_start 0000000000004038 W data_start 00000000000010a0 t deregister_tm_clones 0000000000001110 t __do_global_dtors_aux 0000000000003df0 d __do_global_dtors_aux_fini_array_entry 0000000000004040 D __dso_handle 0000000000003df8 d _DYNAMIC 0000000000004050 D _edata 0000000000004058 B _end U __errno_location@@GLIBC_2.2.5 00000000000012c8 T _fini 0000000000001160 t frame_dummy 0000000000003de8 d __frame_dummy_init_array_entry 000000000000219c r __FRAME_END__ 0000000000004000 d _GLOBAL_OFFSET_TABLE_ w __gmon_start__ 0000000000002094 r __GNU_EH_FRAME_HDR 0000000000001000 t _init 0000000000003df0 d __init_array_end 0000000000003de8 d __init_array_start 0000000000004048 D instructions 0000000000002000 R _IO_stdin_used w _ITM_deregisterTMCloneTable w _ITM_registerTMCloneTable 00000000000012c0 T __libc_csu_fini 0000000000001250 T __libc_csu_init U __libc_start_main@@GLIBC_2.2.5 0000000000001169 T main U mprotect@@GLIBC_2.2.5 U printf@@GLIBC_2.2.5 U puts@@GLIBC_2.2.5 00000000000010d0 t register_tm_clones 0000000000001070 T _start 0000000000004050 D __TMC_END__

That's a whole lot of things. Sure enough, there's __libc_start_main@@GLIBC_2.2.5 in there (with an extra @, even).

But it doesn't have an address. In stark contrast with _start, at 0x1070, and main, at 0x1169, which both seem to be there.

And what do all those letters mean?

Shell session
$ man nm DESCRIPTION GNU nm lists the symbols from object files objfile.... If no object files are listed as arguments, nm assumes the file a.out. For each symbol, nm shows: (cut)

Ah, here we go:

Shell session
"T" "t" The symbol is in the text (code) section. "U" The symbol is undefined.

Cool! So __libc_start_main@@GLIBC_2.2.5 is undefined. Well, no wonder our program crashes.

But why does it not crash when we simply run it as ./samples/entry_point?

Let's run it in ugdb:

Shell session
$ ugdb ./samples/entry_point (gdb) break _start Breakpoint 1 at 0x1070

Hey, that's the address nm reported! Good omen.

Shell session
(gdb) start (gdb) stepi (cut: more stepi)

Here we go again. About to call into the void. Note that the operating system seems to have mapped the code section at 0x555555555000 rather than 0x400000 like we did.

That's all good. Everyone has their favorite base address, we're not here to judge.

Before we call though, can we examine what exactly is at the address we're about to jump to? Turns out that's what GDB's x command is for:

Shell session
(gdb) x 0x555555557fe0
0x555555557fe0: 0xf7df2060

Huh. This reminds me of address ranges we saw earlier when inspecting /proc/:pid/maps, but it seems a bit.. low? Maybe it's printing only 4 bytes, when, if it's an address, we really want 8.

Let's read the GDB manual:

10.6 Examining Memory

You can use the command x (for “examine”) to examine memory in any of several formats, independently of your program’s data types.

x/nfu addr
x addr
x

Use the x command to examine memory.

n, f, and u are all optional parameters that specify how much memory to display and how to format it; addr is an expression giving the address where you want to start displaying memory. If you use defaults for nfu, you need not type the slash ‘/’. Several commands set convenient defaults for addr.

n, the repeat count
The repeat count is a decimal integer; the default is 1. It specifies how much memory (counting by units u) to display. If a negative number is specified, memory is examined backward from addr.

f, the display format
The display format is one of the formats used by print (‘x’, ‘d’, ‘u’, ‘o’, ‘t’, ‘a’, ‘c’, ‘f’, ‘s’), and in addition ‘i’ (for machine instructions). The default is ‘x’ (hexadecimal) initially. The default changes each time you use either x or print.

u, the unit size
The unit size is any of
b: Bytes.
h: Halfwords (two bytes).
w: Words (four bytes). This is the initial default.
g: Giant words (eight bytes).

ahAH! So we want to display one (n=1) hexadecimal (f=x) giant word (u=g).

Let's go:

Shell session
(gdb) x/1xg 0x555555557fe0
0x555555557fe0: 0x00007ffff7df2060

That's better! First off, that's not 0x0, so the program is not about to segfault. Well, that is, if it's mapped to anything.
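Why did the first attempt look like a truncated address? Because it was one. Here's a small sketch of what the two unit sizes read. The byte array is an assumption, reconstructed from the value GDB printed on a little-endian machine, and the helper names are made up:

```rust
/// Reads the first four bytes as a little-endian "word" (GDB's `w` unit).
fn as_word(bytes: &[u8; 8]) -> u32 {
    u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]])
}

/// Reads all eight bytes as a little-endian "giant word" (GDB's `g` unit).
fn as_giant(bytes: &[u8; 8]) -> u64 {
    u64::from_le_bytes(*bytes)
}

fn main() {
    // Assumed contents of the eight bytes at 0x555555557fe0, in memory order.
    let bytes: [u8; 8] = [0x60, 0x20, 0xdf, 0xf7, 0xff, 0x7f, 0x00, 0x00];
    println!("{:#010x}", as_word(&bytes));
    println!("{:#018x}", as_giant(&bytes));
}
```

Since x86-64 is little-endian, the low half of the pointer comes first in memory, and that half is all a four-byte read gets.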

Shell session
(gdb) info proc process 13892 (cut)
Shell session
$ cat /proc/13892/maps | grep 7ffff7df 7ffff7dcb000-7ffff7df0000 r--p 00000000 08:01 800108 /usr/lib/libc-2.30.so 7ffff7df0000-7ffff7f3d000 r-xp 00025000 08:01 800108 /usr/lib/libc-2.30.so

Hey, it is mapped! It's from /usr/lib/libc-2.30.so!
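Squinting at grep output works, but the lookup itself is simple enough to sketch. The `Mapping` struct and both function names below are made up for illustration, and only the fields we care about are kept:

```rust
/// One line of /proc/<pid>/maps, reduced to the fields used here.
#[derive(Debug, PartialEq)]
struct Mapping {
    start: u64,
    end: u64,
    perms: String,
    path: String,
}

/// Parses a line such as
/// `7ffff7df0000-7ffff7f3d000 r-xp 00025000 08:01 800108 /usr/lib/libc-2.30.so`.
fn parse_maps_line(line: &str) -> Option<Mapping> {
    let mut fields = line.split_whitespace();
    let range = fields.next()?;
    let perms = fields.next()?.to_string();
    // Skip offset, device and inode; whatever is left is the path (may be empty).
    let path = fields.skip(3).collect::<Vec<_>>().join(" ");
    let (start, end) = range.split_once('-')?;
    Some(Mapping {
        start: u64::from_str_radix(start, 16).ok()?,
        end: u64::from_str_radix(end, 16).ok()?,
        perms,
        path,
    })
}

/// Finds the mapping that contains `addr`, if any (the end address is exclusive).
fn find_mapping(maps: &str, addr: u64) -> Option<Mapping> {
    maps.lines()
        .filter_map(parse_maps_line)
        .find(|m| m.start <= addr && addr < m.end)
}

fn main() {
    let maps = "\
7ffff7dcb000-7ffff7df0000 r--p 00000000 08:01 800108 /usr/lib/libc-2.30.so
7ffff7df0000-7ffff7f3d000 r-xp 00025000 08:01 800108 /usr/lib/libc-2.30.so";
    match find_mapping(maps, 0x7ffff7df2060) {
        Some(m) => println!("{:#x} is in {} ({})", 0x7ffff7df2060_u64, m.path, m.perms),
        None => println!("not mapped"),
    }
}
```

The address lands in the second of the two ranges, the one with the executable bit set, which is reassuring for something we're about to call.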

Does that file contain __libc_start_main@@GLIBC_2.2.5 ? Let's ask nm:

Shell session
$ nm /usr/lib/libc-2.30.so | grep __libc_start_main
0000000000027060 T __libc_start_main

It does! (Without the part after the @). And you can see how the target of our jump and the address of __libc_start_main align:

0x00007ffff7df2060
0x0000000000027060
               ^^^

In other words, they're equal modulo 0x1000.
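We can push that observation one step further. If the library was mapped starting from a single base address (an assumption, but a reasonable one), then subtracting the two numbers should give us that base. A quick sketch, with `load_base` being a made-up name:

```rust
const PAGE_SIZE: u64 = 0x1000;

/// Load base implied by a symbol's runtime address and its address in the file.
fn load_base(runtime_addr: u64, file_addr: u64) -> u64 {
    runtime_addr - file_addr
}

fn main() {
    let runtime = 0x7ffff7df2060_u64; // the value we read with x/1xg
    let in_file = 0x27060_u64; // what nm reported for __libc_start_main

    // Same offset within a page on both sides.
    assert_eq!(runtime % PAGE_SIZE, in_file % PAGE_SIZE);

    println!("{:#x}", load_base(runtime, in_file));
}
```

The result, 0x7ffff7dcb000, is exactly where the first libc-2.30.so range starts in the maps output.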

I know, I know, that's a lot to take in. What I'm gathering from this is that:

The valley of the shadow of ELF

Well, I take it back.

This series is never going to end.

In 2060, when I'm 70, and everybody will have switched to using Fuchsia on the desktop, my friends will still poke fun at me: "Hey amos, remember your ELF series? When's it gonna end?", and I'll feign a smile, but inside I will be acutely, painfully aware that I have angered the binary gods and that I should have left well enough alone.

Serves me right.

But hey, at least ./samples/hello executes fine under elk right? That's a win! Let's run it again just to get our spirits up again:

Shell session
$ cargo b -q && ./target/debug/elk ./samples/hello Analyzing "./samples/hello"... File { type: Exec, machine: X86_64, entry_point: 00401000, program_headers: [ file 00000000..000000e8 | mem 00400000..004000e8 | align 00001000 | R.. Load, file 00001000..00001025 | mem 00401000..00401025 | align 00001000 | R.X Load, file 00002000..00002009 | mem 00402000..00402009 | align 00001000 | RW. Load, ], } Mapping "./samples/hello" in memory... Mapping segment @ 00400000..004000e8 with BitFlags<SegmentFlag>(0b100, Read) Addr: 0x800000, Padding: 00000000 Copying segment data... Adjusting permissions... Mapping segment @ 00401000..00401025 with BitFlags<SegmentFlag>(0b101, Execute | Read) Addr: 0x801000, Padding: 00000000 Copying segment data... Adjusting permissions... Mapping segment @ 00402000..00402009 with BitFlags<SegmentFlag>(0b110, Write | Read) Addr: 0x802000, Padding: 00000000 Copying segment data... Adjusting permissions... Jumping to entry point @ 00401000... Press Enter to jmp...

Oh.

Oh no.

Where has my "hi there" gone.

WHERE HAS MY-ohhh that's right we've offset everything by 0x400000.

And samples/hello has a movabs. And the data it expects to be at 0x402000 is now actually at 0x802000. That won't work.

It's kind of annoying when executables do that. When their code depends on the position in memory at which they're mapped.

It doesn't even really make sense - to me. An executable is composed of many object files, right? If you have a foo.c source file and a bar.c source file, they get compiled to separate foo.o and bar.o files, right?

And these are kinda cobbled together by the linker, right? That's what a linker does. And then you get a single executable.

Now if the code that's in .o files is position-dependent... how does the compiler know which address to pick so that multiple .o files don't clash?

And if it doesn't know, how does the linker arrange .o files so that they movabs from the right address? Does it look for movabs instructions and modify them?

This makes no sense.

No sense at all.

See, if I were to disassemble an .o file right now, we'd see tha-

Shell session
$ objdump -d ./samples/hello.o ./samples/hello.o: file format elf64-x86-64 Disassembly of section .text: 0000000000000000 <_start>: 0: bf 01 00 00 00 mov edi,0x1 5: 48 be 00 00 00 00 00 movabs rsi,0x0 c: 00 00 00 f: ba 09 00 00 00 mov edx,0x9 14: b8 01 00 00 00 mov eax,0x1 19: 0f 05 syscall 1b: 48 31 ff xor rdi,rdi 1e: b8 3c 00 00 00 mov eax,0x3c 23: 0f 05 syscall

Hum. I'm sorry, what? movabs rsi,0x0, really?

...really?

Just to be clear, that's what disassembling the hello executable shows us:

Shell session
$ objdump -d ./samples/hello ./samples/hello: file format elf64-x86-64 Disassembly of section .text: 0000000000401000 <_start>: 401000: bf 01 00 00 00 mov edi,0x1 401005: 48 be 00 20 40 00 00 movabs rsi,0x402000 40100c: 00 00 00 40100f: ba 09 00 00 00 mov edx,0x9 401014: b8 01 00 00 00 mov eax,0x1 401019: 0f 05 syscall 40101b: 48 31 ff xor rdi,rdi 40101e: b8 3c 00 00 00 mov eax,0x3c 401023: 0f 05 syscall

At this point in the article, nothing really makes sense anymore. Machine code magically turns from 0x0 into non-0x0. Symbols are undefined, then defined again. Files are mapped left and right. In the distance, sirens.

It's a wonderful day for PIE

So we better search for the title of the article on DuckDuckGo or something.

In computing, position-independent code (PIC) or position-independent executable (PIE) is a body of machine code that, being placed somewhere in the primary memory, executes properly regardless of its absolute address. PIC is commonly used for shared libraries, so that the same library code can be loaded in a location in each program address space where it will not overlap any other uses of memory (for example, other shared libraries). PIC was also used on older computer systems lacking an MMU, so that the operating system could keep applications away from each other even within the single address space of an MMU-less system.

Source: Position-independent code, Wikipedia

Alright, Jimmy, you win. Take my money.

So there is such a thing as position-independent code.

Let's make a position-independent executable I guess?

Shell session
$ # in `elk/samples`
$ ld -pie hello.o -o hello-pie

Well that was easy. Does it run?

Shell session
$ ./hello-pie
zsh: no such file or directory: ./hello-pie

I.. what?

Shell session
$ ls hello-pie
hello-pie
$ file hello-pie
hello-pie: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib/ld64.so.1, not stripped
$ ./hello-pie
zsh: no such file or directory: ./hello-pie

WHAT?

That's it. I give up. I quit. Thanks for the support everyone, but I can't take this anymore. Nothing makes sense, up is down, files exist and are executables but they won't execute, fuck all this, I'm out.

Cool bear's hot tip

Hey what's that "interpreter" part of file's output?

I don't know cool bear. I don't know what interpreter is because I am tired, and I'm starting to get a headache, and computers were almost certainly a mistake, and, and, and I guess it's a library or something, because it ends in .so.1, and if we check it we can see that:

Shell session
$ file /lib/ld64.so.1
/lib/ld64.so.1: cannot open `/lib/ld64.so.1' (No such file or directory)

Ah.

Okay then.

So hello-pie does exist. I'm not going completely crazy. It's just its interpreter that does not.
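Where does file get that path from? It's stored in the segment our own tool lists as Interp. Here's a rough sketch of digging it out by hand. It assumes a little-endian ELF64 file, takes the field offsets from the ELF64 header layout, does no validation beyond bounds checks, and `read_le` and `interpreter` are made-up helper names:

```rust
/// Program header type for the interpreter path (PT_INTERP in the ELF spec).
const PT_INTERP: u64 = 3;

/// Reads `len` bytes (at most 8) at offset `at` as a little-endian integer.
fn read_le(buf: &[u8], at: usize, len: usize) -> Option<u64> {
    let bytes = buf.get(at..at.checked_add(len)?)?;
    Some(bytes.iter().rev().fold(0u64, |acc, &b| (acc << 8) | b as u64))
}

/// Returns the interpreter path requested by an ELF64 file, if there is one.
fn interpreter(elf: &[u8]) -> Option<String> {
    let phoff = read_le(elf, 0x20, 8)? as usize; // e_phoff
    let phentsize = read_le(elf, 0x36, 2)? as usize; // e_phentsize
    let phnum = read_le(elf, 0x38, 2)? as usize; // e_phnum
    for i in 0..phnum {
        let ph = phoff.checked_add(i * phentsize)?;
        if read_le(elf, ph, 4)? == PT_INTERP {
            let offset = read_le(elf, ph.checked_add(0x08)?, 8)? as usize; // p_offset
            let filesz = read_le(elf, ph.checked_add(0x20)?, 8)? as usize; // p_filesz
            let raw = elf.get(offset..offset.checked_add(filesz)?)?;
            // The path is NUL-terminated inside the segment.
            let path = raw.split(|&b| b == 0).next()?;
            return Some(String::from_utf8_lossy(path).into_owned());
        }
    }
    None
}

fn main() -> std::io::Result<()> {
    let path = match std::env::args().nth(1) {
        Some(p) => p,
        None => {
            eprintln!("usage: pass the path of an ELF file");
            return Ok(());
        }
    };
    let elf = std::fs::read(&path)?;
    match interpreter(&elf) {
        Some(interp) => {
            let exists = std::path::Path::new(&interp).exists();
            println!("interpreter {} (exists: {})", interp, exists);
        }
        None => println!("no interpreter requested"),
    }
    Ok(())
}
```

Checking whether the path actually exists turns the baffling "no such file or directory" into something we can diagnose ourselves.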

How does that even work? What's the interpreter for entry_point? If it has one?

Shell session
$ file entry_point entry_point: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=484a9f42457140ee49daba19c379ee0e5a3ba5d4, for GNU/Linux 3.2.0, not stripped

It does! And since it runs, that one must exist, right?

Shell session
$ file /lib64/ld-linux-x86-64.so.2 /lib64/ld-linux-x86-64.so.2: symbolic link to ld-2.30.so

Yeah alright it does. It's a symlink (symbolic link), but still.

Shell session
$ file /lib64/ld-2.30.so /lib64/ld-2.30.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=f6ca5853dae87d9f0503a9ef230f6d1fa15a832d, not stripped

Okay, so it's an ELF file too. We can't escape ELF, it seems. But it's a .so file! That's for libraries! How is a library an interpreter? Do you mean to tell me that if I do this:

Shell session
$ /lib64/ld-2.30.so ./entry_point main @ 0x7f212bf57169 instructions @ 0x7f212bf58004 page @ 0x7f212bf58000 making page executable... jumping...

...I don't know what I expected. Of course it's also an executable. Never mind the file extensions, don't even question it.

Alright, cool, so, PIEs need an interpreter, sure, okay, and ld gives a non-existent one by default, sure why not, maybe we can override it?

Shell session
$ ld --dynamic-linker=/lib64/ld-linux-x86-64.so.2 -pie hello.o -o hello-pie
$ ./hello-pie
hi there

Ah. Thank you. Jeeze, the things we do for love.

Okay, so we got ourselves a PIE, let's rip it apart:

Shell session
objdump -d ./samples/hello-pie ./samples/hello-pie: file format elf64-x86-64 Disassembly of section .text: 0000000000001000 <_start>: 1000: bf 01 00 00 00 mov edi,0x1 1005: 48 be 00 30 00 00 00 movabs rsi,0x3000 100c: 00 00 00 100f: ba 09 00 00 00 mov edx,0x9 1014: b8 01 00 00 00 mov eax,0x1 1019: 0f 05 syscall 101b: 48 31 ff xor rdi,rdi 101e: b8 3c 00 00 00 mov eax,0x3c 1023: 0f 05 syscall

O..kay..., now it's moving 0x3000.

You know, we've looked at a PIE before, entry_point, and it had a whole bunch of [rip+XXX] in there. Why can't nasm do that for us, huh? Why can't it be nice?

I liked having memory loads that were relative to the current instruction. At least all we needed then was to have the sections follow each other as they were laid out in the program header.

Maybe there's a way to do that in nasm. Maybe all hope is not lost yet.

Let's (sigh) read the NASM manual:

In 64-bit mode, NASM will by default generate absolute addresses. The REL keyword makes it produce RIP–relative addresses. Since this is frequently the normally desired behaviour, see the DEFAULT directive (section 6.2). The keyword ABS overrides REL.

Ah. Well then.

X86 Assembly
; in elk/samples/hello.asm

; 👇 that's new
        default rel
        global _start

        section .text

_start: mov rdi, 1      ; stdout fd
        mov rsi, msg
        mov rdx, 9      ; 8 chars + newline
        mov rax, 1      ; write syscall
        syscall

        xor rdi, rdi    ; return code 0
        mov rax, 60     ; exit syscall
        syscall

        section .data

msg:    db "hi there", 10

Let's assemble it:

Shell session
$ # in elk/samples $ nasm -f elf64 hello.asm $ objdump -d hello.o hello.o: file format elf64-x86-64 Disassembly of section .text: 0000000000000000 <_start>: 0: bf 01 00 00 00 mov edi,0x1 5: 48 be 00 00 00 00 00 movabs rsi,0x0 c: 00 00 00 f: ba 09 00 00 00 mov edx,0x9 14: b8 01 00 00 00 mov eax,0x1 19: 0f 05 syscall 1b: 48 31 ff xor rdi,rdi 1e: b8 3c 00 00 00 mov eax,0x3c 23: 0f 05 syscall

No! NO. Bad nasm. No movabs for you. Only rel.

Alright then, forget defaults. Let's use rel directly. We'll need to use lea (load effective address) instead of mov, but that's a sacrifice I'm willing to make:

X86 Assembly
        global _start

        section .text

_start: mov rdi, 1      ; stdout fd
        lea rsi, [rel msg]
        mov rdx, 9      ; 8 chars + newline
        ; yada yada, you know the steps
Shell session
$ # in elk/samples $ nasm -f elf64 hello.asm $ objdump -d hello.o hello.o: file format elf64-x86-64 Disassembly of section .text: 0000000000000000 <_start>: 0: bf 01 00 00 00 mov edi,0x1 5: 48 8d 35 00 00 00 00 lea rsi,[rip+0x0] # c <_start+0xc> c: ba 09 00 00 00 mov edx,0x9 11: b8 01 00 00 00 mov eax,0x1 16: 0f 05 syscall 18: 48 31 ff xor rdi,rdi 1b: b8 3c 00 00 00 mov eax,0x3c 20: 0f 05 syscall

Go on...

Shell session
$ ld --dynamic-linker=/lib64/ld-linux-x86-64.so.2 -pie hello.o -o hello-pie $ objdump -d hello-pie hello-pie: file format elf64-x86-64 Disassembly of section .text: 0000000000001000 <_start>: 1000: bf 01 00 00 00 mov edi,0x1 1005: 48 8d 35 f4 1f 00 00 lea rsi,[rip+0x1ff4] # 3000 <msg> 100c: ba 09 00 00 00 mov edx,0x9 1011: b8 01 00 00 00 mov eax,0x1 1016: 0f 05 syscall 1018: 48 31 ff xor rdi,rdi 101b: b8 3c 00 00 00 mov eax,0x3c 1020: 0f 05 syscall

Yes??

Shell session
cargo b -q && ./target/debug/elk ./samples/hello-pie Analyzing "./samples/hello-pie"... File { type: Dyn, machine: X86_64, entry_point: 00001000, program_headers: [ file 00000040..00000200 | mem 00000040..00000200 | align 00000008 | R.. PHdr, file 00000200..0000021c | mem 00000200..0000021c | align 00000001 | R.. Interp, file 00000000..00000269 | mem 00000000..00000269 | align 00001000 | R.. Load, file 00001000..00001022 | mem 00001000..00001022 | align 00001000 | R.X Load, file 00002000..00002000 | mem 00002000..00002000 | align 00001000 | R.. Load, file 00002f20..00003009 | mem 00002f20..00003009 | align 00001000 | RW. Load, file 00002f20..00003000 | mem 00002f20..00003000 | align 00000008 | RW. Dynamic, file 00002f20..00003000 | mem 00002f20..00003000 | align 00000001 | R.. GnuRelRo, ], } (cut)

...wait, there's a zero-length segment in there (0x2000..0x2000), let's filter those out:

Rust code
// in `elk/src/main.rs`

    println!("Mapping {:?} in memory...", input_path);

    let mut mappings = Vec::new();

    for ph in file
        .program_headers
        .iter()
        .filter(|ph| ph.r#type == delf::SegmentType::Load)
        // ignore zero-length segments
        .filter(|ph| ph.mem_range().end > ph.mem_range().start)
    {
        // (cut)
    }

...resuming where we left off:

Shell session
Mapping "./samples/hello-pie" in memory... Mapping segment @ 00000000..00000269 with BitFlags<SegmentFlag>(0b100, Read) Addr: 0x400000, Padding: 00000000 Copying segment data... Adjusting permissions... Mapping segment @ 00001000..00001022 with BitFlags<SegmentFlag>(0b101, Execute | Read) Addr: 0x401000, Padding: 00000000 Copying segment data... Adjusting permissions... Mapping segment @ 00002f20..00003009 with BitFlags<SegmentFlag>(0b110, Write | Read) Addr: 0x402000, Padding: 00000f20 Copying segment data... Adjusting permissions... Jumping to entry point @ 00001000... Press Enter to jmp... hi there

YES!

What did we learn?

Where to even begin.

When we loaded our program, it was put in memory somewhere other than what the program headers said it should be. And that didn't turn out so well.

Sure, the machine code we'd seen at first seemed like it would work. But then we tried to use some data, and that didn't work. So we stopped using data, and that worked.

But a program that uses data still does work, when we don't try to run it via elk (our command-line tool). In fact, when we debug it, the disassembly shown by GDB does not match the disassembly shown by ndisasm and objdump.

If we change our assembly a little though, we can get it to work. Because we're accessing data by addressing memory relative to where the code is. And they are in the right place, relative to each other.
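To make that last point concrete, here's a small sketch comparing the two addressing styles across a few load bases. The function names are made up, the instruction address, length and displacement come from the hello-pie disassembly, and the third base is just an arbitrary page-aligned example:

```rust
/// Address a rip-relative operand resolves to once the image is loaded at `base`.
fn rip_relative(base: u64, insn_vaddr: u64, insn_len: u64, disp: u64) -> u64 {
    base + insn_vaddr + insn_len + disp
}

/// An absolute operand ignores the load base entirely.
fn absolute(_base: u64, operand: u64) -> u64 {
    operand
}

fn main() {
    let msg_vaddr = 0x3000_u64; // where `msg` lives in hello-pie
    for &base in &[0x0_u64, 0x400000, 0x555555554000] {
        let actual = base + msg_vaddr;
        // `lea rsi,[rip+0x1ff4]` at 0x1005, 7 bytes long
        let rel = rip_relative(base, 0x1005, 7, 0x1ff4);
        // `movabs rsi,0x3000`
        let abs = absolute(base, 0x3000);
        println!(
            "base {:#x}: msg at {:#x}, rel -> {:#x}, abs -> {:#x}",
            base, actual, rel, abs
        );
    }
}
```

The relative version follows the data wherever the whole image goes; the absolute one is only right when the base happens to be zero.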

But, if we're being honest, there are still a lot of questions left unanswered.
