Reading files the hard way

Everybody knows how to use files. You just open up File Explorer, the Finder, or a File Manager, and bam - it's chock-full of files. There's folders and files as far as the eye can see. It's a genuine filapalooza. I have never once heard someone complain there were not enough files on their computer.

But what is a file, really? And what does reading a file entail, exactly?

Read part 1

Series overview

1. Reading files the hard way - Part 1 (node.js, C, rust, strace)

Everybody knows how to use files. You just open up File Explorer, the Finder, or a File Manager, and bam - it's chock-full of files. There's folders and files as far as the eye can see. It's a genuine filapalooza. I have never once heard someone complain there were not enough files on their computer.

But what is a file, really? And what does reading a file entail, exactly?

2. Reading files the hard way - Part 2 (x86 asm, linux kernel)

Looking at that latest mental model, it's.. a bit suspicious that every program ends up calling the same set of functions. It's almost like something different happens when calling those.

Are those even regular functions? Can we step through them with a debugger?

If we run our stdio-powered C program in gdb, and break on read, we can confirm that we indeed end up calling a function (which is called here, but oh well):

3. Reading files the hard way - Part 3 (ftrace, disk layouts, ext4)

So far, we've seen many ways to read a file from different programming languages, we've learned about syscalls, how to make those from assembly, then we've learned about memory mapping, virtual address spaces, and generally some of the mechanisms in which userland and the kernel interact.

But in our exploration, we've always considered the kernel more or less like a "black box". It's time to change that.

This series is complete.