More devops than I bargained for

Background

I recently had a bit of impromptu disaster recovery, and it gave me a hunger for more! More downtime! More Kubernetes manifests! More DNS! Ahhhh!

The plan was really simple. I love dedicated Hetzner servers with all my heart but they are not very fungible.

You have to wait entire minutes for a new dedicated server to be provisioned. Sometimes you pay a setup fee, et cetera. And at some point, to serve static websites and act as a k3s server, it’s simply too big, and approximately twice the price I should be paying.

Amos

I have gotten nervous about the world economy — Amos wrote on April 7th, as the American and Japanese stock markets just crashed — but it’s also a fun optimization problem. How much money do I actually need to spend on my infrastructure to get it to perform the way I want it to?

So I decided to move from an x86_64 dedicated server with 32 gigs of RAM and 16 cores, which cost me about 41 euros per month, to an aarch64 instance with 8 Ampere cores, 16 gigs of RAM, which costs 12 euros a month!

See, it’s not a huge saving in absolute terms, but it’s the first server in my fleet that is arm64 — and I figured, well, I recently set up continuous integration and continuous delivery for my CMS software so that it builds and ships x86_64-unknown-linux-gnu and aarch64-apple-darwin binaries as Forgejo generic packages and to a private Homebrew tap, so… what’s one more target?

Right?

Cool bear

Ha ha ha.

Has anyone ever built it for arm64 linux before?

For most things, the answer is yes.

On the “main” / “control” / “k3s server” node, I run services like:

  • well, k3s itself, obvs
  • cert-manager
  • traefik v3 (and I get HTTP/3)
  • a full prometheus stack, including grafana
  • a couple postgres clusters
  • umami for analytics

All of those are either ubiquitous or written in Go, which has excellent tooling for cross compilation, which means they’ve had ARM64 images forever.
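
If you’re ever unsure whether an image already publishes an arm64 variant, a manifest inspect will tell you (the traefik tag is just for illustration; regclient’s regctl can answer the same question):

# Not from my actual setup; just a quick way to list which platforms an
# image's manifest list covers:
docker manifest inspect traefik:latest | jq -r '.manifests[].platform | "\(.os)/\(.architecture)"'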

A few of my Dockerfile(s) downloaded binaries for stuff like regclient, an ffmpeg static build, etc. — a simple “make this work for arm64 too” prompt to Claude 3.5 Sonnet was enough to add the requisite bashisms:

# Download the archive
echo -e "\033[1;34m📥 Downloading home-drawio \033[1;33m${HOME_DRAWIO_VERSION}\033[0m for \033[1;36m${ARCH_NAME}\033[0m..."

# Map platform architecture to package architecture string
if [ "${ARCH_NAME}" == "amd64" ]; then
    PKG_ARCH="x86_64-unknown-linux-gnu"
elif [ "${ARCH_NAME}" == "arm64" ]; then
    PKG_ARCH="aarch64-unknown-linux-gnu"
else
    echo -e "\033[1;31m❌ Error: Unsupported architecture: ${ARCH_NAME}\033[0m" >&2
    exit 1
fi

curl --fail --location --retry 3 --retry-delay 5 -H "Authorization: token ${FORGEJO_READWRITE_TOKEN}" \
    "https://code.bearcove.cloud/api/packages/bearcove/generic/home-drawio/${HOME_DRAWIO_VERSION}/${PKG_ARCH}.tar.xz" \
    -o "${TEMP_DIR}/home-drawio.tar.xz"
Amos

I like to request colors and emoji, which make the log output more readable to me. I ask LLMs to generate tools that always show a plan for what they’re going to do first, ask the user for consent, report progress while doing it, and print a summary of actions taken and errors encountered at the end.

Amos

I have used them with great success for “devops”: there are a few pieces that need to be really solid, but the rest is all glue. I typically prototype in bash or TypeScript and then port it to Rust if I need it to run fast or be more correct.

Cool bear

Didn’t LLMs lead you astray last time?

Amos

Babe, I mean bear, they lead me astray every time. But I’m the one driving.

Cool bear

Fair enough — cool bear said, uttering words Amos had written.

I had forgotten how many moving parts were involved in my own software?

Most native dependencies are just an APT install away, since I use Debian 12 as a base image, and the Debian project has done the hard work of packaging just about everything.

Amos

I think the only thing I built from source is dav1d, so that it’s recent enough?

home-drawio is one of my custom components: it’s a binary that’s able to convert draw.io diagrams to SVG. I used to shell out to node.js instead, but decided I didn’t like it, so now it’s bundled with bun as bytecode:

build:
    #!/bin/bash -eu
    echo -e "\033[1;34m🚀 Starting build process...\033[0m"

    echo -e "\033[1;33m📦 Installing dependencies...\033[0m"
    pnpm install

    echo -e "\033[1;35m🗑️ Cleaning up dist directory...\033[0m"
    rm -rf dist
    mkdir -p dist

    echo -e "\033[1;36m📜 Logging build information...\033[0m"
    echo "Build started at: $(date)"
    echo "Bun version: $(bun --version)"

    echo -e "\033[1;36m📊 Listing contents of dist directory...\033[0m"
    ls -lhA dist

    echo -e "\033[1;32m🏗️ Building project...\033[0m"
    DEBUG='*' bun build --compile src/index.js --bytecode --outfile dist/home-drawio

    echo -e "\033[1;33m🚀 Running the native executable...\033[0m"
    DRAWIO_DEBUG=1 dist/home-drawio convert ./sample.drawio

    echo -e "\033[1;32m✅ Build process completed successfully!\033[0m"

    echo -e "\033[1;36m📊 Checking binary size...\033[0m"
    ls -lh dist/home-drawio
    file dist/home-drawio

    echo -e "\033[1;36m📊 Checking xz compressed binary size...\033[0m"
    xz -2 -T0 -c dist/home-drawio > dist/home-drawio.xz
    ls -lh dist/home-drawio.xz
    rm dist/home-drawio.xz

    echo -e "\033[1;36m🔍 Checking binary dependencies...\033[0m"
    if [[ "$OSTYPE" == "darwin"* ]]; then
        otool -L dist/home-drawio
    elif [[ "$OSTYPE" == "linux-gnu"* ]]; then
        ldd dist/home-drawio
    else
        echo "Unsupported operating system for dependency check."
    fi

    rm -f .*.bun-build

This is a Justfile, for the just task runner, which replaces make for me, since I already have one (or three) build systems.

Its output is pleasing:

amos in 🌐 souffle in home-drawio on  main
just build
🚀 Starting build process...
📦 Installing dependencies...
Lockfile is up to date, resolution step is skipped
Already up to date
Done in 213ms using pnpm v10.7.1
🗑️ Cleaning up dist directory...
📜 Logging build information...
Build started at: Mon Apr 7 13:18:32 CEST 2025
Bun version: 1.2.8
📊 Listing contents of dist directory...
total 0
🏗️ Building project...
  [571ms]  bundle  700 modules
  [143ms] compile  dist/home-drawio
🚀 Running the native executable...
[2025-04-07T11:18:34.317Z] Parse XML took 0.31ms
[2025-04-07T11:18:34.318Z] Select diagram took 1.11ms
[2025-04-07T11:18:34.318Z] Base64 decode took 0.08ms
[2025-04-07T11:18:34.319Z] Inflate took 1.16ms
[2025-04-07T11:18:34.320Z] URI decode took 0.03ms
[2025-04-07T11:18:34.320Z] Decompressed diagram, set DRAWIO_VERBOSE=1 to see it
[2025-04-07T11:18:34.321Z] XML parse took 1.30ms
[2025-04-07T11:18:34.359Z] Get SVG took 11.74ms
[2025-04-07T11:18:34.362Z] Get XML took 2.90ms
<svg xmlns="http:// ✂️
✅ Build process completed successfully!
📊 Checking binary size...
-rwxrwxrwx@ 1 amos staff 95M Apr 7 13:18 dist/home-drawio
dist/home-drawio: Mach-O 64-bit executable arm64
📊 Checking xz compressed binary size...
-rw-r--r--@ 1 amos staff 20M Apr 7 13:18 dist/home-drawio.xz
🔍 Checking binary dependencies...
dist/home-drawio:
    /usr/lib/libicucore.A.dylib (compatibility version 1.0.0, current version 74.1.0)
    /usr/lib/libresolv.9.dylib (compatibility version 1.0.0, current version 1.0.0)
    /usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 1700.255.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1345.100.2)

Multi-arch container images

One problem I ran into pretty early is that I had no idea how to make and push a container image that works for multiple architectures.

Up until now, I’d always been building images, and pushing them immediately with tags like:

  • code.bearcove.cloud/bearcove/beardist:latest
  • code.bearcove.cloud/bearcove/home:33.0.0

As far as I can tell, the way to go is to pick a convention for arch-specific tags, like:

  • code.bearcove.cloud/bearcove/beardist:latest-arm64
  • code.bearcove.cloud/bearcove/beardist:latest-amd64
Cool bear

The fact arm64 and amd64 look so close from afar is a disgrace, btw.

And then create a multi-arch manifest that is the thing pushed under :latest.

If you’re using docker to build images, then you can do something like this:

echo -e "\033[1;31m🗑️ Removing existing manifest: \033[0;32m{{BASE}}/$target:latest\033[0m" && \
docker manifest rm "{{BASE}}/$target:latest" || true && \
echo -e "\033[1;36m📝 Creating manifest: \033[0;32m{{BASE}}/$target:latest\033[0m" && \
docker manifest create "{{BASE}}/$target:latest" \
    $(for platform in $PLATFORMS; do \
        arch=$(echo $platform | cut -d/ -f2); \
        echo "{{BASE}}/$target:latest-$arch"; \
    done) && \
echo -e "\033[1;32m📤 Pushing manifest: \033[0;32m{{BASE}}/$target:latest\033[0m" && \
docker manifest push "{{BASE}}/$target:latest"; \
echo -e "\033[1;32m✅ Completed \033[1;33m$target\033[1;32m successfully!\033[0m"; \

If you’re using docker buildx, then it can do multi-arch builds for you! But that is not supported by OrbStack, or at least, I couldn’t get it working.
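
For reference, the buildx version would be a single command that builds both architectures and pushes the manifest list in one go. I’m including it as a sketch, since I never got it working under OrbStack:

# Hypothetical buildx invocation; needs a builder with QEMU (or native
# nodes) for both platforms:
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  --tag code.bearcove.cloud/bearcove/beardist:latest \
  --push \
  .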

However, that’s irrelevant to me, because most of my Dockerfiles are just there to declare dependencies — I don’t actually build inside of them.

Base images + regctl

See, it’s annoying to need access to a docker daemon in CI. Really, I’m a grown-up: I can take on the risk of making the build environment and runtime environment match — I just want to copy my binary into a base image I know and control.

So… I have this repack.sh script:

#!/usr/bin/env bash
set -euo pipefail

# This script creates a container image for beardist
# The process involves:
# 1. Creating an OCI layout directory for beardist
# 2. Adding beardist to the image
# 3. Pushing the final image with proper tags

if [ "${IMAGE_PLATFORM}" != "linux/arm64" ] && [ "${IMAGE_PLATFORM}" != "linux/amd64" ]; then
    echo -e "\033[1;31m❌ Error: IMAGE_PLATFORM must be set to linux/arm64 or linux/amd64\033[0m" >&2
    exit 1
fi
ARCH_NAME=$(echo "${IMAGE_PLATFORM}" | cut -d'/' -f2)

# Check if we're on a tag and get the version
TAG_VERSION=""
if [[ "${GITHUB_REF:-}" == refs/tags/* ]]; then
    TAG_VERSION="${GITHUB_REF#refs/tags/}"
    # Remove 'v' prefix if present
    TAG_VERSION="${TAG_VERSION#v}"
    echo -e "\033[1;33m📦 Detected tag: ${TAG_VERSION}\033[0m"
fi

# Declare variables
OCI_LAYOUT_DIR="/tmp/beardist-oci-layout"
OUTPUT_DIR="/tmp/beardist-output"
IMAGE_NAME="code.bearcove.cloud/bearcove/beardist:${TAG_VERSION:+${TAG_VERSION}-}${ARCH_NAME}"
BASE_IMAGE="code.bearcove.cloud/bearcove/build:${ARCH_NAME}"

# Clean up and create layout directory
rm -rf "$OCI_LAYOUT_DIR"
mkdir -p "$OCI_LAYOUT_DIR/usr/bin"

# Copy beardist to the layout directory
echo -e "\033[1;34m📦 Copying beardist binary to layout directory\033[0m"
cp -v "$OUTPUT_DIR/beardist" "$OCI_LAYOUT_DIR/usr/bin/"

# Reset all timestamps to epoch for reproducible builds
touch -t 197001010000.00 "$OCI_LAYOUT_DIR/usr/bin/beardist"

# Create the image
echo -e "\033[1;36m🔄 Creating image from base\033[0m"
regctl image mod "$BASE_IMAGE" --create "$IMAGE_NAME" \
    --layer-add "dir=$OCI_LAYOUT_DIR"

# Push the image
echo -e "\033[1;32m🚀 Pushing image: \033[1;35m$IMAGE_NAME\033[0m"
regctl image copy "$IMAGE_NAME"{,}

# Push tagged image if we're in CI and there's a tag
if [ -n "${CI:-}" ] && [ -n "${GITHUB_REF:-}" ]; then
    if [[ "$GITHUB_REF" == refs/tags/* ]]; then
        TAG=${GITHUB_REF#refs/tags/}
        if [[ "$TAG" == v* ]]; then
            TAG=${TAG#v}
        fi
        TAGGED_IMAGE_NAME="code.bearcove.cloud/bearcove/beardist:$TAG"
        echo -e "\033[1;32m🏷️ Tagging and pushing: \033[1;35m$TAGGED_IMAGE_NAME\033[0m"
        regctl image copy "$IMAGE_NAME" "$TAGGED_IMAGE_NAME"
    fi
fi

# Test the image if not in CI
if [ -z "${CI:-}" ]; then
    echo -e "\033[1;34m🧪 Testing image locally\033[0m"
    docker pull "$IMAGE_NAME"
    docker run --rm "$IMAGE_NAME" beardist --help

    # Display image info
    echo -e "\033[1;35m📋 Image layer information:\033[0m"
    docker image inspect "$IMAGE_NAME" --format '{{.RootFS.Layers | len}} layers'
fi
Cool bear

The timestamp stuff is particularly load-bearing.

This is, like… not nix, but it provides a lot of the value that nix gave me — assembling docker images without docker, allowing us the nice property “if a layer didn’t change, then it can just be reused”.

The rest of the value I got from nix, and from earthly after that (Cthulhu rest its eternal soul), is “don’t rebuild if you don’t need to rebuild”, which I achieved through timelord, a simple utility that saves and restores file timestamps, unless their contents have changed.
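
The gist of it, as a rough bash sketch (not timelord’s actual implementation): record mtime plus content hash, and on restore, only put the old mtime back for files whose hash hasn’t moved.

#!/usr/bin/env bash
set -euo pipefail
# Toy version of the timelord idea: "save" records mtime + content hash,
# "restore" puts the old mtime back only if the contents haven't changed.
STATE=".timestamps.tsv"

case "${1:-}" in
  save)
    : > "$STATE"
    find . -type f ! -path "./$STATE" -print0 | while IFS= read -r -d '' f; do
      printf '%s\t%s\t%s\n' "$(stat -c %Y "$f")" "$(sha256sum "$f" | cut -d' ' -f1)" "$f" >> "$STATE"
    done
    ;;
  restore)
    while IFS=$'\t' read -r mtime hash f; do
      [ -f "$f" ] || continue
      if [ "$(sha256sum "$f" | cut -d' ' -f1)" = "$hash" ]; then
        touch -d "@$mtime" "$f"   # unchanged contents: restore the old mtime
      fi
    done < "$STATE"
    ;;
  *)
    echo "usage: $0 save|restore" >&2
    exit 1
    ;;
esac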

Amos

I look forward to timelord being completely deprecated by cargo’s checksum-freshness feature, just like I look forward to replacing cargo-sweep with gc, and cargo-hakari with feature unification.

So anyway, this is the important part of the script:

regctl image mod "$BASE_IMAGE" --create "$IMAGE_NAME" \
    --layer-add "dir=$OCI_LAYOUT_DIR"
Cool bear

regctl comes from regclient and does not need a docker daemon present.

This adds a layer from a directory (which means it has to tar it and sha256 it — that’s basically all an OCI layer is).
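
If you’re curious what that means concretely, here’s a hand-rolled sketch of what a layer boils down to (not literally what regctl does internally):

# A layer is "just" a deterministic tarball plus its digests:
tar --sort=name --owner=0 --group=0 --numeric-owner \
    --mtime='1970-01-01 00:00:00Z' \
    -C "$OCI_LAYOUT_DIR" -cf layer.tar .
sha256sum layer.tar       # the "diff_id" (digest of the uncompressed tar)
gzip -n -9 layer.tar      # layers are typically stored compressed
sha256sum layer.tar.gz    # the digest the image manifest actually references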

Then we push it to the registry:

regctl image copy "$IMAGE_NAME"{,}

And then… then what? Then we can’t actually create a manifest because, contrary to base images, we need to build images like home (the name my CMS has this week, for those who follow along at… well, at home) from a machine with a matching architecture, because the process is:

  • In a Debian 12 arm64/amd64 container
  • Build with beardist (which invokes cargo build, copies around dynamic libraries, does verifications, compression, uploads)
  • Add built binary on top of base layer of the correct arch, and push OCI image with regctl

So the architecture outside the image and inside the image must match.

beardist itself is distributed as a (multi-arch) image, and in fact, is built using itself, which means it has to be bootstrapped somehow.

And the way it’s bootstrapped is:

  • From a build environment matching the target env…
  • Run cargo install --path . in beardist/
  • Run BEARDIST_CACHE=/tmp/beardist beardist build
  • Run ./repack.sh

And voila! Now, beardist can build itself in CI, using its own docker image, which will be overwritten on every tag release.

Cool bear

If needed, the bootstrap can be redone, or an earlier “working” tag can simply be used. The chain hasn’t broken yet, a couple weeks in.

Once both architectures are built in CI, as two different Forgejo Actions jobs, a third job is triggered:

name: check

on:
  push:
    branches: [main]
    tags:
      - "*"
  pull_request:
    branches: [main]

jobs:
  mac-build:
    runs-on: mac-arm
    env:
      BEARDIST_CACHE_DIR: /Users/filler/beardist-cache
      BEARDIST_ARTIFACT_NAME: aarch64-apple-darwin
      FORGEJO_READWRITE_TOKEN: ${{ secrets.FORGEJO_READWRITE_TOKEN }}
      CLICOLOR: 1
      CLICOLOR_FORCE: 1
    steps:
      - name: Check out repository code
        uses: actions/checkout@v4
      - name: Build
        shell: bash
        run: |
          beardist build
          if [[ "${{ github.ref }}" == refs/tags/* ]]; then
            echo "we're building a tag! "
            echo "installing latest beardist locally (mac runners are non-containerized)"
            # beardist is statically-linked (not a dylo binary) so we can just
            # copy it in-place
            cp /tmp/beardist-output/beardist $(which beardist)
          fi

  linux-build:
    strategy:
      matrix:
        include:
          - runs-on: linux-arm64
            artifact: aarch64-unknown-linux-gnu
            platform: linux/arm64
          - runs-on: linux-amd64
            artifact: x86_64-unknown-linux-gnu
            platform: linux/amd64
    runs-on: ${{ matrix.runs-on }}
    container:
      image: code.bearcove.cloud/bearcove/beardist:latest
      volumes:
        - /var/persistent-build-storage:/var/persistent-build-storage
      credentials:
        username: "token"
        password: "${{ secrets.BEARCOVE_PULL_PASSWORD }}"
    env:
      BEARDIST_CACHE_DIR: /var/persistent-build-storage/beardist-cache
      BEARDIST_ARTIFACT_NAME: ${{ matrix.artifact }}
      FORGEJO_READWRITE_TOKEN: ${{ secrets.FORGEJO_READWRITE_TOKEN }}
      CLICOLOR: 1
      CLICOLOR_FORCE: 1
      IMAGE_PLATFORM: ${{ matrix.platform }}
    steps:
      - name: Check out repository code
        uses: actions/checkout@v4
      - name: Build
        shell: bash
        run: |
          beardist build
          if [[ "${{ github.ref }}" == refs/tags/* ]]; then
            regctl registry login code.bearcove.cloud -u token -p "${{ secrets.FORGEJO_READWRITE_TOKEN }}"
            ./repack.sh
          fi

  trigger-formula-update:
    needs: [mac-build, linux-build]
    if: startsWith(github.ref, 'refs/tags/')
    runs-on: linux-arm64
    container:
      image: code.bearcove.cloud/bearcove/beardist:latest
      credentials:
        username: "token"
        password: "${{ secrets.BEARCOVE_PULL_PASSWORD }}"
    env:
      FORGEJO_READWRITE_TOKEN: ${{ secrets.FORGEJO_READWRITE_TOKEN }}
    steps:
      - name: Check out repository code
        uses: actions/checkout@v4
      - name: Login to registry
        run: |
          regctl registry login code.bearcove.cloud -u token -p "${{ secrets.FORGEJO_READWRITE_TOKEN }}"
      - name: Create and push multi-platform manifest
        run: |
          ./multify.sh
      - name: Trigger formula update
        run: |
          curl -f -X POST \
            -H "Authorization: token $FORGEJO_READWRITE_TOKEN" \
            -H "Accept: application/json" \
            -H "Content-Type: application/json" \
            -d '{"ref": "main", "inputs": {"repository": "'$GITHUB_REPOSITORY'"}}' \
            https://code.bearcove.cloud/api/v1/repos/bearcove/tap/actions/workflows/bump.yml/dispatches

That multify.sh script is shown below, and… it’s a bit more manual than the previous strategy since we don’t have docker! Only regctl.

Which doesn’t come with “manifest-building” utilities.

Luckily, it’s “just JSON”, right?

#!/usr/bin/env -S bash -euo pipefail

# Define colors
GREEN='\033[0;32m'
CYAN='\033[0;36m'
RED='\033[0;31m'
YELLOW='\033[0;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color

echo -e "${CYAN}🔍 Starting multi-architecture container manifest creation...${NC}"

# Check if we're on a tag and get the version
TAG_VERSION=""
if [[ "${GITHUB_REF:-}" == refs/tags/* ]]; then
    TAG_VERSION="${GITHUB_REF#refs/tags/}"
    # Remove 'v' prefix if present
    TAG_VERSION="${TAG_VERSION#v}"
    echo -e "${YELLOW}📦 Detected tag: ${TAG_VERSION}${NC}"
fi

# Define the image tags to use
if [[ -n "$TAG_VERSION" ]]; then
    AMD64_TAG="${TAG_VERSION}-amd64"
    ARM64_TAG="${TAG_VERSION}-arm64"
    MANIFEST_TAGS=("${TAG_VERSION}" "latest")
else
    AMD64_TAG="latest-amd64"
    ARM64_TAG="latest-arm64"
    MANIFEST_TAGS=("latest")
fi

echo -e "${YELLOW}📦 Getting digests and sizes...${NC}"

echo -e "${BLUE}⬇️ Fetching AMD64 digest...${NC}"
AMD64_DIGEST=$(regctl manifest head code.bearcove.cloud/bearcove/beardist:${AMD64_TAG} --platform linux/amd64)

echo -e "${BLUE}⬇️ Fetching ARM64 digest...${NC}"
ARM64_DIGEST=$(regctl manifest head code.bearcove.cloud/bearcove/beardist:${ARM64_TAG} --platform linux/arm64)

# Check if ARM64_DIGEST is empty or not properly set
if [ -z "$ARM64_DIGEST" ]; then
    echo -e "${RED}❌ Error: Unable to get ARM64 digest. Exiting.${NC}"
    exit 1
else
    echo -e "${GREEN}✅ ARM64 digest retrieved successfully!${NC}"
fi

# Check if AMD64_DIGEST is empty or not properly set
if [ -z "$AMD64_DIGEST" ]; then
    echo -e "${RED}❌ Error: Unable to get AMD64 digest. Exiting.${NC}"
    exit 1
else
    echo -e "${GREEN}✅ AMD64 digest retrieved successfully!${NC}"
fi

echo -e "${BLUE}📏 Calculating AMD64 manifest size...${NC}"
AMD64_SIZE=$(regctl manifest get code.bearcove.cloud/bearcove/beardist:${AMD64_TAG} --platform linux/amd64 --format raw-body | wc -c)
echo -e "${GREEN}✅ AMD64 size: ${AMD64_SIZE} bytes${NC}"

echo -e "${BLUE}📏 Calculating ARM64 manifest size...${NC}"
ARM64_SIZE=$(regctl manifest get code.bearcove.cloud/bearcove/beardist:${ARM64_TAG} --platform linux/arm64 --format raw-body | wc -c)
echo -e "${GREEN}✅ ARM64 size: ${ARM64_SIZE} bytes${NC}"

echo -e "${YELLOW}📝 Creating manifest.json...${NC}"
cat <<EOF > manifest.json
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
  "manifests": [
    {
      "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
      "size": $AMD64_SIZE,
      "digest": "$AMD64_DIGEST",
      "platform": {
        "architecture": "amd64",
        "os": "linux"
      }
    },
    {
      "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
      "size": $ARM64_SIZE,
      "digest": "$ARM64_DIGEST",
      "platform": {
        "architecture": "arm64",
        "os": "linux"
      }
    }
  ]
}
EOF
echo -e "${GREEN}✅ manifest.json created successfully!${NC}"

echo -e "${YELLOW}🚀 Pushing manifest.json to registry...${NC}"
for TAG in "${MANIFEST_TAGS[@]}"; do
    echo -e "${BLUE}📤 Pushing manifest for tag: ${TAG}${NC}"
    regctl manifest put \
        --content-type application/vnd.docker.distribution.manifest.list.v2+json \
        code.bearcove.cloud/bearcove/beardist:${TAG} < manifest.json
    echo -e "${GREEN}✅ Successfully pushed manifest for tag: ${TAG}${NC}"
done

echo -e "${GREEN}🎉 Multi-architecture manifest(s) successfully pushed to registry!${NC}"

This script, too, is pleasing:

amos in 🌐 souffle in beardist on  main [!] via 🦀 v1.86.0
time GITHUB_REF=refs/tags/v3.8.9 ./multify.sh
🔍 Starting multi-architecture container manifest creation...
📦 Detected tag: 3.8.9
📦 Getting digests and sizes...
⬇️ Fetching AMD64 digest...
⬇️ Fetching ARM64 digest...
✅ ARM64 digest retrieved successfully!
✅ AMD64 digest retrieved successfully!
📏 Calculating AMD64 manifest size...
✅ AMD64 size: 2243 bytes
📏 Calculating ARM64 manifest size...
✅ ARM64 size: 2242 bytes
📝 Creating manifest.json...
✅ manifest.json created successfully!
🚀 Pushing manifest.json to registry...
📤 Pushing manifest for tag: 3.8.9
✅ Successfully pushed manifest for tag: 3.8.9
📤 Pushing manifest for tag: latest
✅ Successfully pushed manifest for tag: latest
🎉 Multi-architecture manifest(s) successfully pushed to registry!

________________________________________________________
Executed in    1.45 secs      fish           external
   usr time  110.06 millis    0.29 millis  109.77 millis
   sys time  110.70 millis    2.23 millis  108.47 millis

…and will probably be rewritten in Rust eventually, or collapsed into beardist, which isn’t linked because it’s not open-source. It’s custom-made for my needs — make your own!

Here’s the generated manifest:

{ "schemaVersion": 2, "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json", "manifests": [ { "mediaType": "application/vnd.docker.distribution.manifest.v2+json", "size": 2243, "digest": "sha256:b2dc52ed0fc06d10b4681405289004da8dab86776223466beb4a84a86fbc8ade", "platform": { "architecture": "amd64", "os": "linux" } }, { "mediaType": "application/vnd.docker.distribution.manifest.v2+json", "size": 2242, "digest": "sha256:f79fbe3ae00713394d69970bb5f74af0d043dbf703d8b9ccb2a3f3c110cbd88d", "platform": { "architecture": "arm64", "os": "linux" } } ] }

And uhh yeah, it works!

Well. It worked for beardist — and then I could have other builds operate from the beardist:latest image, no matter whether they were running on arm64 workers or amd64 workers…

…but by this point, I didn’t really have any good amd64 workers left.

I had:

  • Some VM I run with UTM on macOS
  • That 8-core arm64 machine (good enough for Rust CI builds)
  • 5 2-core amd64 machines

And uhhhh… I tried. But after 30 minutes, the Forgejo Actions job timeout kicked in and… yeah. It couldn’t build my entire website software.

Which, to be fair:

home on  main via 🦀 v1.85.1
cat Cargo.lock | grep -F '[[package' | wc -l
838

…is not surprising.

Because there is a persistent build storage, I could’ve just retried until it finally built, but… my site was still down at this point! I had preemptively migrated everything else, including postgres clusters, forgejo volumes etc. — but had left my CMS for last because, well, it’s my CMS! I know this!

At this point I realized my Mac Studio has to be on all the time, since it’s running a VM which does the macOS builds. And it has 32GB RAM… it can probably fit another x86_64 Linux VM, right?

Well, it can, but:

  • -6GB RAM is kinda brutal when I’m editing 4K videos
  • x86_64 emulation via qemu is slow, multicore emulation even more so
  • USB SATA SSDs are slow (I don’t have enough internal storage for all my VMs)

I only realized that after hours of fiddling around to get IPv6 to work inside a container inside the VM inside my Mac Studio, becauuseeeee….

More like IPv5

I don’t know, okay? At some point I’ll do a deep dive, but… it was past 1AM, I don’t know, I just needed things to work.

Here’s what I think I understood. Maybe.

In Kubernetes, workloads are performed in “containers”, which are run in “pods”, which are scheduled on “nodes”.

In my setup, “nodes” are just, the Hetzner Cloud VMs:

A screenshot of the Hetzner cloud dashboard that shows 28 resources, including 6 servers.

I like the nice little map visualization. I think more cloud providers should do that.

Their API is also very fast.

And my x86_64 VM that I ran on my MacBookPro.

To k3s, they’re the same, they’re all just… nodes:

~
k get nodes
NAME     STATUS   ROLES                       AGE   VERSION
domino   Ready    <none>                      15h   v1.31.6+k3s1
flam     Ready    <none>                      18h   v1.31.6+k3s1
hawk     Ready    <none>                      18h   v1.31.6+k3s1
heim     Ready    <none>                      18h   v1.31.6+k3s1
kaya     Ready    <none>                      18h   v1.31.6+k3s1
marl     Ready    <none>                      18h   v1.31.6+k3s1
styx     Ready    control-plane,etcd,master   18h   v1.31.6+k3s1

Those nodes need not have a publicly routable IPv4 or IPv6 address: they can be behind NAT (Network Address Translation), and they’ll still be able to:

  • reach out to the k3s server
  • register themselves as nodes (given the proper auth token)
  • and join the overlay network
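
For the record, joining an agent node boils down to something like this (server URL and token are placeholders):

# On the new node: K3S_URL points at the existing k3s server, and
# K3S_TOKEN comes from /var/lib/rancher/k3s/server/node-token on it.
curl -sfL https://get.k3s.io | \
  K3S_URL="https://k3s.example.com:6443" \
  K3S_TOKEN="<node-token>" \
  sh -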

Why an overlay network? Because pods have their own IP address.

And in a simple setup like this, the pod IP addresses are not publicly routable either.

In my current setup…

infra on  main [$] via 🦀 v1.85.0
rg 'cidr' roles/
roles/k3s/leader/templates/config.yaml.j2
1:cluster-cidr: 10.42.0.0/16,fd00:42::/48
2:service-cidr: 10.43.0.0/16,fd00:43::/112

…neither the IPv4 nor the IPv6 addresses are “publicly routable” — if you send a packet with any of these as the destination to an internet router, it will chuckle and drop the packet.

Amos

The IPv4 address block is called “private address space” and the IPv6 address block is called “Unique Local Address” or ULA.

However, these are perfectly fine to use for a private overlay network like the one set up by k3s so that pods can talk to each other.

Cool bear

What’s the level of granularity of a pod? Like.. how many pods to an app?

To give you an example: traefik is the “ingress”, aka the HTTP reverse proxy, so it needs one pod per edge node:

fasterthanli.me on  main [$!]
k get pods -n 'traefik' -o json | jq -c '.items[] | {nodeName: .spec.nodeName, podIP: .status.podIP}'
{"nodeName":"domino","podIP":"192.168.210.3"}
{"nodeName":"styx","podIP":"49.13.119.8"}
{"nodeName":"domino","podIP":"192.168.1.100"}
{"nodeName":"heim","podIP":"157.180.27.172"}
{"nodeName":"hawk","podIP":"116.202.24.111"}
{"nodeName":"kaya","podIP":"5.223.56.87"}
{"nodeName":"marl","podIP":"5.78.90.129"}
{"nodeName":"flam","podIP":"5.161.220.244"}

But those pods are a little special — they’re using host networking.

When I point a DNS record for fasterthanli.me at one of my nodes, I need it to listen on port 80 and 443, and I need those connections to go straight to traefik — hence, the pod IP is actually the publicly routable IP of that node.
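
You can confirm that with the same kubectl-and-jq dance: host-networked pods have spec.hostNetwork set, and their podIP is just the node’s own address.

# hostNetwork: true is what makes these pods share the node's network namespace
k get pods -n traefik -o json \
  | jq -c '.items[] | {node: .spec.nodeName, hostNetwork: .spec.hostNetwork, podIP: .status.podIP}'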

fasterthanli.me on  main [$!]
ssh root@49.13.119.8 -- "ip addr show eth0 | grep --color=always -E '(inet|inet6) ([0-9a-f:.]+)'"
    inet 49.13.119.8/32 brd 49.13.119.8 scope global dynamic eth0
    inet6 2a01:4f8:c17:34b1::1/64 scope global
    inet6 fe80::9400:4ff:fe32:8ea/64 scope link
Amos

Redundant, I know!

But most pods are not special. They have an IP address that comes from the CIDR we defined earlier: that’s the case of pods in the home namespace:

fasterthanli.me on  main [$!]
k get pods -n 'home' -o json | jq -c '.items[] | {nodeName: .spec.nodeName, podIP: .status.podIP}'
{"nodeName":"heim","podIP":"10.42.40.130"}
{"nodeName":"hawk","podIP":"10.42.123.3"}
{"nodeName":"marl","podIP":"10.42.71.66"}
{"nodeName":"kaya","podIP":"10.42.29.194"}
{"nodeName":"hawk","podIP":"10.42.123.2"}
{"nodeName":"heim","podIP":"10.42.40.131"}
{"nodeName":"styx","podIP":"10.42.29.2"}

These are all in 10.42.0.0/16!

From one pod, we can reach another:

fasterthanli.me on  main [$!]
k exec -n home cub-dc9f5b494-bhnjr -it -- curl -H 'Host: fasterthanli.me' -I http://10.42.40.130:1111
HTTP/1.1 200 OK
content-type: text/html; charset=utf-8
cache-control: no-cache
x-source: eu-north-1.heim.cub-dc9f5b494-bhnjr
content-length: 105153
date: Mon, 07 Apr 2025 17:04:57 GMT

And that is what the overlay network is about.

But it’s not the same thing as having actual connectivity to the internet, or “egress”.

I’ll save you all the different troubleshooting steps I went through, but basically, here’s where things landed: I installed Calico to replace Flannel.
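
On k3s, that swap amounts to running the server with its built-in CNI and network policy controller disabled, then installing Calico on top. Roughly (a sketch; the Calico install itself follows whatever the upstream docs currently say):

# Start the k3s server without Flannel, keeping the same dual-stack CIDRs
# as in the config.yaml above:
k3s server \
  --flannel-backend=none \
  --disable-network-policy \
  --cluster-cidr=10.42.0.0/16,fd00:42::/48 \
  --service-cidr=10.43.0.0/16,fd00:43::/112

# ...then install Calico (tigera-operator manifest or helm chart),
# configured for dual-stack with matching IP pools.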

The first big difference is: instead of sending overlay packets as VXLAN over UDP, it actually establishes a wireguard network — traffic between nodes is now encrypted properly.

Amos

Apparently Flannel supports that too, it’s just not enabled by default.
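
On the Calico side, the toggle lives in the FelixConfiguration. This is the incantation from the Calico encryption docs, quoted from memory, so double-check the field names:

# Enable WireGuard for both the IPv4 and IPv6 data planes:
calicoctl patch felixconfiguration default --type='merge' \
  -p '{"spec": {"wireguardEnabled": true, "wireguardEnabledV6": true}}'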

And the second big difference is that it’s actually able to do something called NAT66.

Cool bear

Wait wait wait. What?

The NAT king calls

Okay, so let’s look at the simple case, right? We have a pod on a Hetzner cloud VM.

It makes an outbound request to a public IPv4 address — how is it routed?

Cool bear

I don’t know, let’s check traceroute?

Good instinct! Let’s do that. So we’ll create a pod with the nicolaka/netshoot image:

---
apiVersion: v1
kind: Pod
metadata:
  name: net-shooter
  labels:
    app: net-shooter
spec:
  containers:
    - name: net-shooter
      image: nicolaka/netshoot
      command:
        - sleep
        - infinity
  nodeSelector:
    provider: hcloud

Oh yeah, by the way, I changed my deploy script to not use yq or rsync and… just use kubectl with a bunch of flags:

infra on  main [$?] via 🦀 v1.85.0
./deploy manifests/tests/
🔍 Performing dry run of kubectl apply...
pod/net-shooter created (server dry run)
❓ Do you want to apply these changes? (y/n)
y
✅ Applying changes...
pod/net-shooter created
📤 Preparing to commit and push changes...
❓ Enter a commit message:
create test pod
[main 1ae5b54] create test pod
 1 file changed, 14 insertions(+)
 create mode 100644 manifests/tests/000-ip-routing-test.yaml
Enumerating objects: 7, done.
Counting objects: 100% (7/7), done.
Delta compression using up to 12 threads
Compressing objects: 100% (5/5), done.
Writing objects: 100% (5/5), 531 bytes | 531.00 KiB/s, done.
Total 5 (delta 2), reused 0 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (2/2), completed with 2 local objects.
To https://github.com/bearcove/infra.git
   f60c806..1ae5b54  main -> main
✅ Changes have been committed and pushed.

Here’s the complete deploy script if you’re interested:

#!/bin/zsh -euo pipefail

# Define color function
colorize() {
    sed -E $'s/(unchanged)/\033[1;34m\\1\033[0m/g; s/(created)/\033[1;32m\\1\033[0m/g; s/(configured)/\033[1;33m\\1\033[0m/g; s/(deleted)/\033[1;31m\\1\033[0m/g'
}

# Prepare kubectl apply arguments
kubectl_args=("-R")
if [ $# -eq 0 ]; then
    kubectl_args+=("-f" "manifests/")
else
    for manifest in "$@"; do
        kubectl_args+=("-f" "$manifest")
    done
fi

# Dry run
echo "\033[1;33m🔍 Performing dry run of kubectl apply...\033[0m"
kubectl apply "${kubectl_args[@]}" --dry-run=server | colorize

# Ask for consent
echo "\033[1;35m❓ Do you want to apply these changes? (y/n)\033[0m"
read -r response

# Apply if consent given
if [[ "$response" =~ ^[Yy]$ ]]; then
    echo "\033[1;32m✅ Applying changes...\033[0m"
    kubectl apply "${kubectl_args[@]}" | colorize
else
    echo "\033[1;31m❌ Operation cancelled.\033[0m"
    exit 1
fi

# Commit and push changes
echo "\033[1;34m📤 Preparing to commit and push changes...\033[0m"
echo "\033[1;35m❓ Enter a commit message:\033[0m"
read -r commit_message
git add .
git commit -m "$commit_message"
git push
echo "\033[1;32m✅ Changes have been committed and pushed.\033[0m"

Is our pod running?

infra on  main [$] via 🦀 v1.85.0
k get pods -l app=net-shooter
NAME          READY   STATUS    RESTARTS   AGE
net-shooter   1/1     Running   0          10s

Yes, good! Let’s run some tests, shall we?

infra on  main [$] via 🦀 v1.85.0
k exec net-shooter -- ip addr show dev eth0 | grep -E 'inet |inet6 '
    inet 10.42.29.15/32 scope global eth0
    inet6 fd00:42:0:1d1b:89d4:e2d6:158f:6f0f/128 scope global
    inet6 fe80::ccf8:a1ff:fe55:ac8d/64 scope link proto kernel_ll

Okay, it definitely has an IPv4 address and an IPv6 address taken from our respective CIDR ranges, and also a link-local address starting with fe80.

Let’s try to do a traceroute to one of the other pods:

infra on  main [$] via 🦀 v1.85.0
k exec net-shooter -- traceroute 10.42.123.3
traceroute to 10.42.123.3 (10.42.123.3), 30 hops max, 46 byte packets
 1  styx (49.13.119.8)  0.007 ms  0.015 ms  0.007 ms
 2  10.42.123.1 (10.42.123.1)  2.354 ms  1.786 ms  0.844 ms
 3  10.42.123.3 (10.42.123.3)  0.688 ms  1.061 ms  0.592 ms

Pretty straightforward.

Pods also have IPv6 addresses, since we’re dual-stack!

fasterthanli.me on  main [$]
k get pods -n 'home' -o json | jq -c '.items[] | {nodeName: .spec.nodeName, name: .metadata.name, podIPs: .status.podIPs}'
{"nodeName":"flam","name":"cub-695b7f6fdd-42z5m","podIPs":[{"ip":"10.42.52.80"},{"ip":"fd00:42:0:42b4:6c65:9873:2890:3453"}]}
{"nodeName":"flam","name":"cub-695b7f6fdd-65dj5","podIPs":[{"ip":"10.42.52.78"},{"ip":"fd00:42:0:42b4:6c65:9873:2890:3451"}]}
{"nodeName":"marl","name":"cub-695b7f6fdd-89ztk","podIPs":[{"ip":"10.42.71.75"},{"ip":"fd00:42:0:4746:36a6:b9d9:c23:ef8e"}]}
{"nodeName":"hawk","name":"cub-695b7f6fdd-c5s5s","podIPs":[{"ip":"10.42.123.13"},{"ip":"fd00:42:0:f6da:458d:a644:59c2:d4e"}]}
{"nodeName":"marl","name":"cub-695b7f6fdd-knh47","podIPs":[{"ip":"10.42.71.77"},{"ip":"fd00:42:0:4746:36a6:b9d9:c23:ef90"}]}
{"nodeName":"heim","name":"cub-695b7f6fdd-rhll9","podIPs":[{"ip":"10.42.40.136"},{"ip":"fd00:42:0:28ae:bd57:7c2d:4a15:a89"}]}
{"nodeName":"styx","name":"mom-85d6745745-vkjw2","podIPs":[{"ip":"10.42.29.20"},{"ip":"fd00:42:0:1d1b:89d4:e2d6:158f:6f16"}]}

And similarly we can trace that route:

infra on  main [$] via 🦀 v1.85.0
k exec net-shooter -- traceroute fd00:42:0:4746:36a6:b9d9:c23:ef8e
traceroute to fd00:42:0:4746:36a6:b9d9:c23:ef8e (fd00:42:0:4746:36a6:b9d9:c23:ef8e), 30 hops max, 72 byte packets
 1  fd00:42:0:1d1b:89d4:e2d6:158f:6f15 (fd00:42:0:1d1b:89d4:e2d6:158f:6f15)  0.013 ms  0.010 ms  0.011 ms
 2  fd00:42:0:4746:36a6:b9d9:c23:ef89 (fd00:42:0:4746:36a6:b9d9:c23:ef89)  170.372 ms  169.522 ms  169.491 ms
 3  fd00:42:0:4746:36a6:b9d9:c23:ef8e (fd00:42:0:4746:36a6:b9d9:c23:ef8e)  169.560 ms  169.474 ms  169.501 ms

But traceroute is… borderline useless.

What we want to know here is answered better by the ip route command:

infra on  main [$] via 🦀 v1.85.0
k exec net-shooter -it -- ip -4 route show
default via 169.254.1.1 dev eth0
169.254.1.1 dev eth0 scope link

infra on  main [$] via 🦀 v1.85.0
k exec net-shooter -it -- ip -6 route show
fd00:42:0:1d1b:89d4:e2d6:158f:6f0f dev eth0 proto kernel metric 256 pref medium
fe80::/64 dev eth0 proto kernel metric 256 pref medium
default via fe80::ecee:eeff:feee:eeee dev eth0 metric 1024 pref medium

To me, this is interesting, because, well… both 169.254.1.1 and fe80::/64 are “link-local” addresses: the only other place I’ve seen them is when DHCP fails and your computer decides to pick an address that, I guess, would allow you to communicate with something at the other end even without DHCP?
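
My current understanding (pieced together, not audited) is that Calico itself answers for that fake gateway: the host side of each pod’s veth has proxy ARP enabled, and its MAC is the ee:ee:ee:ee:ee:ee you can spot inside fe80::ecee:eeff:feee:eeee, so whatever the pod sends to its “gateway” just lands on the node, which routes it for real. You can peek at the knob on a node:

# On the node: Calico's host-side veth interfaces (cali*) have proxy ARP
# turned on, which is what makes 169.254.1.1 answer at all.
ssh root@styx -- "cat /proc/sys/net/ipv4/conf/cali*/proxy_arp"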

So, actually, the trick is coming from outside the pod… because if we ask a random server from the internet what’s our IP address, it will have a radically different answer than what we’ve seen so far:

infra on  main [$] via 🦀 v1.85.0
k exec net-shooter -it -- curl -4 https://icanhazip.com
49.13.119.8

infra on  main [$] via 🦀 v1.85.0
k exec net-shooter -it -- curl -6 https://icanhazip.com
2a01:4f8:c17:34b1::1

This is the IP address of the node, not the pod — NAT is happening, both for IPv4 (NAT44):

root@styx ~# iptables -4 -t nat -L cali-POSTROUTING -v -n
Chain cali-POSTROUTING (1 references)
 pkts bytes target             prot opt in   out              source      destination
12104  740K cali-fip-snat      0    --  *    *                0.0.0.0/0   0.0.0.0/0    /* cali:Z-c7XtVd2Bq7s_hA */
12104  740K cali-nat-outgoing  0    --  *    *                0.0.0.0/0   0.0.0.0/0    /* cali:nYKhEzDlr11Jccal */
    0     0 MASQUERADE         0    --  *    vxlan.calico     0.0.0.0/0   0.0.0.0/0    /* cali:e9dnSgSVNmIcpVhP */ ADDRTYPE match src-type !LOCAL limit-out ADDRTYPE match src-type LOCAL random-fully
    0     0 MASQUERADE         0    --  *    wireguard.cali   0.0.0.0/0   0.0.0.0/0    /* cali:kgfCOPW4UKtzMAmO */ ADDRTYPE match src-type !LOCAL limit-out ADDRTYPE match src-type LOCAL random-fully

And for IPv6 (NAT66):

root@styx ~# ip6tables -t nat -L cali-POSTROUTING -v -n
Chain cali-POSTROUTING (1 references)
 pkts bytes target             prot opt in   out               source  destination
 2015  173K cali-fip-snat      0    --  *    *                 ::/0    ::/0         /* cali:Z-c7XtVd2Bq7s_hA */
 2015  173K cali-nat-outgoing  0    --  *    *                 ::/0    ::/0         /* cali:nYKhEzDlr11Jccal */
    0     0 MASQUERADE         0    --  *    vxlan-v6.calico   ::/0    ::/0         /* cali:MtS-9OgAQy-fAM-w */ ADDRTYPE match src-type !LOCAL limit-out ADDRTYPE match src-type LOCAL random-fully

And this is fine and great for virtual machines hosted on Hetzner which have a public IPv4 address and have a public IPv6 prefix.

But what happens when, say, you tell your home computer to join your Kubernetes cluster? That’s exactly what I ended up doing: let’s see what happens with it!

infra on  main [$] via 🦀 v1.85.0
ssh root@domino ip addr show dev enp3s0 | grep -E 'inet6?'
    inet 192.168.1.100/24 brd 192.168.1.255 scope global dynamic noprefixroute enp3s0
    inet6 2a01:e0a:de8:a760:bf51:36ff:d905:1432/64 scope global temporary dynamic
    inet6 2a01:e0a:de8:a760:7656:3cff:fe28:5746/64 scope global dynamic mngtmpaddr noprefixroute
    inet6 fe80::7656:3cff:fe28:5746/64 scope link noprefixroute

It does have a publicly routable IPv6 because NAT is not required there. NAT is only being done for IPv4.

infra on  main [$] via 🦀 v1.85.0
ssh root@domino -- ip -6 route show default
default via fe80::3a07:16ff:fec2:bc19 dev enp3s0 proto ra metric 100 pref medium

infra on  main [$] via 🦀 v1.85.0
ssh root@domino -- ip -4 route show default
default via 192.168.1.254 dev enp3s0 proto dhcp src 192.168.1.100 metric 100

The default routes for IPv4 and IPv6 go directly to the router, and icanhazip reveals our actual public addresses:

infra on  main [$] via 🦀 v1.85.0
ssh root@domino -- curl -s -6 https://icanhazip.com
2a01:e0a:de8:a760:bf51:36ff:d905:1432

infra on  main [$] via 🦀 v1.85.0
ssh root@domino -- curl -s -4 https://icanhazip.com
87.182.152.211

And you know what this means?

Cool bear

No, but you’ll tell us, right?

Why the fuck is everything broken

Again and again and again, ever since I had this home node join my Kubernetes cluster, I have had nothing but issues.

And every time it’s been the exact same issue and it’s taken me so long to realize what was happening.

This node, called Domino, has IPv4 egress, but doesn’t have IPv4 ingress!

domino is able to establish a connection with google.com over IPv4 and exchange packets no problem, but it cannot host an IPv4 service! If it does, then it’s going to be giving out an IP that is not routable!

Back to our node IPs, let’s take kaya for example:

infra on  main [$] via 🦀 v1.85.0
k get nodes -o json | jq -c '.items[] | select(.metadata.name == "kaya") | .status.addresses'
[{"address":"5.223.56.87","type":"InternalIP"},{"address":"2a01:4ff:2f0:10be::1","type":"InternalIP"},{"address":"kaya","type":"Hostname"}]

It has an internal IP of 5.223.56.87 — very well. What’s the IP of the traefik pod on that node?

infra on  main [$] via 🦀 v1.85.0
k get pods -n traefik -o json | jq -c -r '.items[] | select(.spec.nodeName == "kaya") | .status.podIPs'
[{"ip":"5.223.56.87"},{"ip":"2a01:4ff:2f0:10be::1"}]

The very same! It uses host networking.

But on domino?

infra on  main [$] via 🦀 v1.85.0
k get pods -n traefik -o json | jq -c -r '.items[] | select(.spec.nodeName == "domino") | .status.podIPs'
[{"ip":"192.168.1.100"},{"ip":"2a01:e0a:de8:a760:17c3:ece0:634:8ec7"}]

It’s that LAN IP, 192.168.1.100.

And that caused me some problems when, after restarting pods, the Kubernetes server decided to schedule cert manager challenge pods on Domino.

What’s cert-manager? It’s a neat thing that provisions TLS certificates automatically: you create a Certificate, like so:

---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: fasterthanli-me-cert
  namespace: home
spec:
  secretName: fasterthanli-me-cert-secret
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  dnsNames: [fasterthanli.me, cdn.fasterthanli.me]

And then internally it creates CertificateRequest objects and Orders, it talks with the Let’s Encrypt system, and it’s able to do challenges of different kinds. The one that I was using was HTTP-01.

Which works by making a request on some well-known path. Literally a path that starts with /.well-known/acme-challenge/: cert-manager creates a temporary HTTP endpoint that the Let’s Encrypt servers can access to verify that you control the domain you’re requesting a certificate for.

This works by creating an ingress resource which in turn is handled by traefik, to serve just that path, over HTTP (and the usual site over HTTPS). As soon as the TLS certificate is created, it’s swapped in and traefik starts using it.
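
When that dance goes sideways, the Challenge objects are where the actual error messages end up, so that’s the first place to look:

# cert-manager creates Order and Challenge resources under the hood; their
# status explains why HTTP-01 validation is failing. (The challenge name
# below is whatever the first command prints.)
k get challenges -A
k describe challenge -n home <challenge-name>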

And that’s all well and good. But this is where we find out that there are actually at least two types of Services that can be used in a Kubernetes setup like mine: ClusterIP and NodePort.

A list of services as seen from the K9S text TUI. We see Calico, we see Forgejo, we see cert manager, Kube DNS, Traefik, Umami.

And really, in my situation, there’s no good reason for any service to be NodePort except for traefik, which must use host networking since there’s no load balancer in front of it, I’m not doing managed Kubernetes, and I can’t roll my own layer 3 load balancer either.
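
Hunting those down is the same jq pattern as everything else:

# List every Service that isn't plain ClusterIP, to see what's actually
# exposing node ports:
k get svc -A -o json \
  | jq -r '.items[] | select(.spec.type != "ClusterIP") | "\(.metadata.namespace)/\(.metadata.name)\t\(.spec.type)"'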

And yet, the cert-manager challenge services defaulted to NodePort for some reason — which always used to work on the nodes that were actually Hetzner Cloud VMs, but didn’t work on domino!

Because domino is doing double NAT for IPv4, and, for IPv6, is still doing single NAT, because even though the node would be able to route a whole /64’s worth of IPv6 addresses, Calico is picking pod IP addresses from the pools we gave it:

---
apiVersion: crd.projectcalico.org/v1
kind: IPPool
metadata:
  name: ipv4-pool
spec:
  cidr: 10.42.0.0/16
  ipipMode: Never
  vxlanMode: Always
  natOutgoing: true
  disabled: false
  nodeSelector: all()
---
apiVersion: crd.projectcalico.org/v1
kind: IPPool
metadata:
  name: ipv6-pool
spec:
  cidr: fd00:42::/48
  ipipMode: Never
  vxlanMode: Always
  natOutgoing: true
  disabled: false
  nodeSelector: all()

And that’s… just not great… to learn about, at 4 in the morning, when everything’s been down for hours.

Like… I’ve never asked to learn all this, man. I was just trying to throw a little arm64 in the mix. I miss solving problems with Dockerfile. Let me out. LET ME OUT.

Cool bear

Wait, wait, wait, so is there a way to disable NAT66 just for that node?

I think there is, but I’ve just been too scared to touch it so far.

Cool bear

But… where’s the fun in that?

Ah, damn it, you’re right.

Bye NATalie

If I understood everything correctly, we need to create another IP pool, just for our node:

# ✂️: ipv4 pool
---
apiVersion: crd.projectcalico.org/v1
kind: IPPool
metadata:
  name: ipv6-pool
spec:
  cidr: fd00:42::/48
  ipipMode: Never
  vxlanMode: Always
  natOutgoing: true
  disabled: false
  nodeSelector: "kubernetes.io/hostname != 'domino'"
---
apiVersion: crd.projectcalico.org/v1
kind: IPPool
metadata:
  name: public-ipv6-pool
spec:
  cidr: 2a01:e0a:de8:a760::/64
  ipipMode: Never
  vxlanMode: Always
  natOutgoing: false
  disabled: false
  nodeSelector: "kubernetes.io/hostname == 'domino'"

And apply it… and restart the pods, and… I don’t know, let’s test it:

---
apiVersion: v1
kind: Pod
metadata:
  name: ipv6-server
spec:
  nodeSelector:
    kubernetes.io/hostname: domino
  containers:
    - name: server
      image: python:3
      command:
        - python3
        - -m
        - http.server
        - "8080"
        - "--bind"
        - "::" # <-- Listen on all IPv6 interfaces
      ports:
        - containerPort: 8080
          protocol: TCP
infra on  main [$?] via 🦀 v1.85.0
kubectl apply -f manifests/tests/100-python.yaml
pod/ipv6-server created

Let’s see what IPs were assigned…

infra on  main [$?] via 🦀 v1.85.0
kubectl get pod ipv6-server -o jsonpath='{.status.podIPs}' | jq -c .
[{"ip":"10.42.210.16"},{"ip":"2a01:e0a:de8:a760:8ccd:f32f:73e5:da03"}]

…oooh, promising!

Now let’s see if we can access that pod?

infra on  main [$] via 🦀 v1.85.0
curl --connect-timeout 2 -I 'http://[2a01:e0a:de8:a760:8ccd:f32f:73e5:da03]:8080'
curl: (28) Failed to connect to 2a01:e0a:de8:a760:8ccd:f32f:73e5:da03 port 8080 after 2006 ms: Timeout was reached

Oh. We can’t.

Cool bear

Wait, but that’s from inside the LAN.

And? You think it’s going to work better outside the LAN?

Cool bear

Try it

Fine, fine if y-

amos in 🌐 styx in ~
curl --connect-timeout 2 -I 'http://[2a01:e0a:de8:a760:8ccd:f32f:73e5:da03]:8080'
HTTP/1.0 200 OK
Server: SimpleHTTP/0.6 Python/3.13.2
Date: Mon, 07 Apr 2025 19:50:04 GMT
Content-type: text/html; charset=utf-8
Content-Length: 832

You… what? The fuck?

Cool bear

Claude tells me this can be caused by “LAN Hairpinning” or “NDP Scope Problems”.

Well, I guess that’s why we have happy eyeballs, so that the IPv4 path will work on LAN and the IPv6 path will work on the public internet.

Good night, everyone — and thanks for following along!

