assembly

More ELF relocations

Apr 06, 2020

28 min #os · #assembly · #linux · #rust

In our last installment of “Making our own executable packer”, we did some code cleanups. We got rid of a bunch of unsafe code, and found a way to represent memory-mapped data structures safely.

But that article was merely a break in our otherwise colorful saga of “trying to get as many executables to run with our own dynamic loader”. The last thing we got running was the ifunc-nolibc program.

Dynamic linker speed and correctness

Feb 14, 2020

28 min #os · #assembly · #linux

In the last article, we managed to load a program (hello-dl) that uses a single dynamic library (libmsg.so) containing a single exported symbol, msg.

Our program, hello-dl.asm, looked like this:

        global _start
        extern msg

        section .text

_start:
        mov rdi, 1      ; stdout fd
        mov rsi, msg
        mov rdx, 38     ; 37 chars + newline
        mov rax, 1      ; write syscall
        syscall

        xor rdi, rdi    ; return code 0
        mov rax, 60     ; exit syscall
        syscall

Dynamic symbol resolution

Feb 02, 2020

26 min #os · #assembly · #linux · #rust · #linkers

Let’s pick up where we left off: we had just taught elk to load not only an executable, but also its dependencies, and then their dependencies as well.

We discovered that ld-linux walked the dependency graph breadth-first, and so we did that too. Of course, it’s a little bit overkill since we only have one dependency, but, nevertheless, elk happily loads our executable and its one dependency:

Loading multiple ELF objects

Jan 26, 2020

32 min #os · #assembly · #linux · #rust · #linkers

Up until now, we’ve been loading a single ELF file, and there wasn’t much structure to how we did it: everyhing just kinda happened in main, in no particular order.

But now that shared libraries are in the picture, we have to load multiple ELF files, with search paths, and keep them around so we can resolve symbols, and apply relocations across different objects.

The simplest shared library

Jan 22, 2020

29 min #os · #assembly · #linux · #rust · #linkers

In our last article, we managed to load and execute a PIE (position-independent executable) compiled from the following code:

; in `samples/hello-pie.asm`

        global _start

        section .text

_start: mov rdi, 1      ; stdout fd
        lea rsi, [rel msg]
        mov rdx, 9      ; 8 chars + newline
        mov rax, 1      ; write syscall
        syscall

        xor rdi, rdi    ; return code 0
        mov rax, 60     ; exit syscall
        syscall

        section .data

msg:    db "hi there", 10

ELF relocations

Jan 19, 2020

17 min #os · #assembly · #linux · #rust · #linkers

The last article, Position-independent code, was a mess. But who could blame us? We looked at the world, and found it to be a chaotic and seemingly nonsensical place. So, in order to blend in, we had to let go of a little bit of sanity.

The time has come to reclaim it.

Short of faulty memory sticks, memory locations don’t magically turn from 0x0 into valid addresses. Someone is doing the turning, and we’re going to find out who, if it takes the rest of the series.

Position-independent code

Jan 13, 2020

28 min #os · #assembly · #linux · #rust

In the last article, we found where code was hiding in our samples/hello executable, by disassembling the whole file and then looking for syscalls.

Later on, we learned how to inspect which memory ranges are mapped for a given PID (process identifier). We saw that memory areas weren’t all equal: they can be readable, writable, and/or executable.

Running an executable without exec

Jan 12, 2020

22 min #os · #assembly · #linux · #rust

In part 1, we’ve looked at three executables:

sample, an assembly program that prints “hi there” using the write system call.
entry_point, a C program that prints the address of main using printf
The /bin/true executable, probably also a C program (because it’s part of GNU coreutils), and which just exits with code 0.

We noticed that when running entry_point through GDB, it always printed the same address. But when we ran it directly, it printed a different address on every run.