Everything about os
Welcome back and thanks for joining us for the reads notes... the thirteenth installment of our series on ELF files, what they are, what they can do, what does the dynamic linker do to them, and how can we do it ourselves.
In our last installment of
"Making our own executable packer", we did some code cleanups. We got rid of
a bunch of unsafe
code, and found a way to represent memory-mapped data
structures safely.
In the last article, we managed to load a program (hello-dl
) that uses a single
dynamic library (libmsg.so
) containing a single exported symbol, msg
.
Let's pick up where we left off: we had just taught elk
to load
not only an executable, but also its dependencies, and then their
dependencies as well.
Up until now, we've been loading a single ELF file, and there wasn't much
structure to how we did it: everyhing just kinda happened in main
, in no
particular order.
In our last article, we managed to load and execute a PIE (position-independent executable) compiled from the following code:
X86 Assembly; in `samples/hello-pie.asm` global _start section .text _start: mov rdi, 1 ; stdout fd lea rsi, [rel msg] mov rdx, 9 ; 8 chars + newline mov rax, 1 ; write syscall syscall xor rdi, rdi ; return code 0 mov rax, 60 ; exit syscall syscall section .data msg: db "hi there", 10
The last article, Position-independent code, was a mess. But who could blame us? We looked at the world, and found it to be a chaotic and seemingly nonsensical place. So, in order to blend in, we had to let go of a little bit of sanity.
In the last article, we found where code was hiding in our samples/hello
executable, by disassembling the whole file and then looking for syscalls.
In part 1, we've looked at three executables:
sample
, an assembly program that prints "hi there" using thewrite
system call.entry_point
, a C program that prints the address ofmain
usingprintf
- The
/bin/true
executable, probably also a C program (because it's part of GNU coreutils), and which just exits with code 0.
Executables have been fascinating to me ever since I discovered, as a kid,
that they were just files. If you renamed a .exe
to something else, you
could open it in notepad! And if you renamed something else to a .exe
,
you'd get a neat error dialog.
So far, we've seen many ways to read a file from different programming languages, we've learned about syscalls, how to make those from assembly, then we've learned about memory mapping, virtual address spaces, and generally some of the mechanisms in which userland and the kernel interact.
Looking at that latest mental model, it's.. a bit suspicious that every program ends up calling the same set of functions. It's almost like something different happens when calling those.
Everybody knows how to use files. You just open up File Explorer, the Finder, or a File Manager, and bam - it's chock-full of files. There's folders and files as far as the eye can see. It's a genuine filapalooza. I have never once heard someone complain there were not enough files on their computer.