A static poppler build: the easy way

👋 This page was last updated ~3 years ago. Just so you know.

So! Now our asset processing pipeline is almost complete. But we’ve just traded dependencies on CLI tools for dependencies on dynamic libraries:

    $ ldd ./target/debug/pdftocairo
        linux-vdso.so.1 (0x00007ffd615be000)
        libpoppler-glib.so.8 => /lib64/libpoppler-glib.so.8 (0x00007f2ba1bb4000)
        libgobject-2.0.so.0 => /lib64/libgobject-2.0.so.0 (0x00007f2ba1b59000)
        libglib-2.0.so.0 => /lib64/libglib-2.0.so.0 (0x00007f2ba1a1e000)
        libcairo.so.2 => /lib64/libcairo.so.2 (0x00007f2ba1902000)
        libcairo-gobject.so.2 => /lib64/libcairo-gobject.so.2 (0x00007f2ba18f6000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f2ba18dc000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f2ba17fe000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f2ba15f4000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f2ba216c000)
        libpoppler.so.112 => /lib64/libpoppler.so.112 (0x00007f2ba1288000)
        libfreetype.so.6 => /lib64/libfreetype.so.6 (0x00007f2ba11bd000)
        libgio-2.0.so.0 => /lib64/libgio-2.0.so.0 (0x00007f2ba0fe4000)
        libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007f2ba0dc5000)
        libffi.so.6 => /lib64/libffi.so.6 (0x00007f2ba0db8000)
        libpcre.so.1 => /lib64/libpcre.so.1 (0x00007f2ba0d40000)
        libpixman-1.so.0 => /lib64/libpixman-1.so.0 (0x00007f2ba0c94000)
        libfontconfig.so.1 => /lib64/libfontconfig.so.1 (0x00007f2ba0c45000)
        libpng16.so.16 => /lib64/libpng16.so.16 (0x00007f2ba0c0c000)
        libxcb-shm.so.0 => /lib64/libxcb-shm.so.0 (0x00007f2ba0c07000)
        libxcb.so.1 => /lib64/libxcb.so.1 (0x00007f2ba0bda000)
        libxcb-render.so.0 => /lib64/libxcb-render.so.0 (0x00007f2ba0bca000)
        libXrender.so.1 => /lib64/libXrender.so.1 (0x00007f2ba0bbd000)
        libX11.so.6 => /lib64/libX11.so.6 (0x00007f2ba0a75000)
        libXext.so.6 => /lib64/libXext.so.6 (0x00007f2ba0a60000)
        libz.so.1 => /lib64/libz.so.1 (0x00007f2ba0a46000)
        libjpeg.so.62 => /lib64/libjpeg.so.62 (0x00007f2ba09c2000)
        libopenjp2.so.7 => /lib64/libopenjp2.so.7 (0x00007f2ba0968000)
        liblcms2.so.2 => /lib64/liblcms2.so.2 (0x00007f2ba0903000)
        libtiff.so.5 => /lib64/libtiff.so.5 (0x00007f2ba087c000)
        libsmime3.so => /lib64/libsmime3.so (0x00007f2ba0850000)
        libnss3.so => /lib64/libnss3.so (0x00007f2ba0712000)
        libplc4.so => /lib64/libplc4.so (0x00007f2ba0709000)
        libnspr4.so => /lib64/libnspr4.so (0x00007f2ba06c6000)
        libbz2.so.1 => /lib64/libbz2.so.1 (0x00007f2ba06b3000)
        libharfbuzz.so.0 => /lib64/libharfbuzz.so.0 (0x00007f2ba05dd000)
        libbrotlidec.so.1 => /lib64/libbrotlidec.so.1 (0x00007f2ba05cf000)
        libgmodule-2.0.so.0 => /lib64/libgmodule-2.0.so.0 (0x00007f2ba05c8000)
        libmount.so.1 => /lib64/libmount.so.1 (0x00007f2ba0581000)
        libselinux.so.1 => /lib64/libselinux.so.1 (0x00007f2ba0556000)
        libxml2.so.2 => /lib64/libxml2.so.2 (0x00007f2ba03cd000)
        libXau.so.6 => /lib64/libXau.so.6 (0x00007f2ba03c7000)
        libwebp.so.7 => /lib64/libwebp.so.7 (0x00007f2ba0358000)
        libzstd.so.1 => /lib64/libzstd.so.1 (0x00007f2ba0260000)
        libjbig.so.2.1 => /lib64/libjbig.so.2.1 (0x00007f2ba0252000)
        libnssutil3.so => /lib64/libnssutil3.so (0x00007f2ba021f000)
        libplds4.so => /lib64/libplds4.so (0x00007f2ba021a000)
        libgraphite2.so.3 => /lib64/libgraphite2.so.3 (0x00007f2ba01f9000)
        libbrotlicommon.so.1 => /lib64/libbrotlicommon.so.1 (0x00007f2ba01d4000)
        libblkid.so.1 => /lib64/libblkid.so.1 (0x00007f2ba019c000)
        libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007f2ba0105000)
        liblzma.so.5 => /lib64/liblzma.so.5 (0x00007f2ba00d9000)

Cool bear

Whew, that’s a LOT of dependencies.

It is. There’s a lot of stuff in there we don’t strictly need: they’re optional dependencies in the poppler dependency tree. We never need cairo to render to X11/xcb, for example, only to an SVG surface.

On Linux, this mostly means installing a lot of different packages just to get our binary running. And those packages are named differently depending on which Linux distribution you’re using: I use Fedora, Ubuntu and ArchLinux on a regular basis. Their contents can differ too: some package maintainers make Executive Decisions, and that makes it really hard to ship cross-distro packages.

Amos

This is where folks who work on AppImage, snapcraft or flatpak would usually come in and pitch their solution to that problem.

These are all fine for various use cases, they each make their own compromises, and I’m familiar with them. I’m making an informed decision not to use them.

Similarly, if you’re a Nix aficionado, I’m super happy for you, but go write your own article! I’ll retweet them, too! Today we’re just learning how to build / link / distribute software for internal use.

On Windows, I’d have to be careful to distribute a bunch of .dll files alongside the .exe, which can get messy real fast if you have a ~/bin folder where you just throw all your CLI tools.

Anyway, we can address both problems by just making our own build of poppler as a static library.

Before we do, let’s record the size of the pdftocairo binary, before and after stripping, when it’s linked dynamically against poppler & friends:

    $ ls -lhA ./target/debug/pdftocairo
    -rwxr-xr-x. 2 amos amos 48M Nov 25 12:12 ./target/debug/pdftocairo
    $ objcopy --compress-debug-sections ./target/debug/pdftocairo /tmp/pdftocairo.compressed-debug
    $ ls -lhA /tmp/pdftocairo.compressed-debug
    -rwxr-xr-x. 1 amos amos 16M Nov 25 13:03 /tmp/pdftocairo.compressed-debug
    $ objcopy --strip-all ./target/debug/pdftocairo /tmp/pdftocairo.stripped
    $ ls -lhA /tmp/pdftocairo.stripped
    -rwxr-xr-x. 1 amos amos 5.3M Nov 25 13:04 /tmp/pdftocairo.stripped

During my tenure as “chief fucking around and finding out officer”, I’ve built software from sources “manually” a ton. If you somehow need to get good at this and have a month of free time, Linux From Scratch might be a nice thing to hyperfocus on.

For poppler, I determined that these were the dependencies I needed to build (in order):

  • pcre
  • libffi
  • zlib
  • libpng
  • glib
  • fontconfig
  • freetype
  • pixman
  • cairo
  • poppler
Amos

One could argue harfbuzz is missing from that list, but things went fine without it. Because in this case, the input is the result of Chrome’s “Print to PDF” feature, the text is already laid out, and so I don’t think harfbuzz is actually needed here.

I made a bunch of bash scripts for this, which is one tiny step up from “typing everything into a terminal once”.

They all have the same structure, so let’s walk through the script for pcre as an example.

    #!/bin/bash -eux
    # ^ set some useful flags:
    #   -e  Exits immediately if a command fails. For a pipeline such as
    #       `curl | tar`, only the last command's status counts, unless
    #       `pipefail` is also set (see below). Super useful.
    #   -u  Treats unset variables as errors rather than just an empty string.
    #       Also a lifesaver.
    #   -x  Prints commands before running them. Useful to follow what's going on,
    #       especially for commands that don't print anything.

    # Make a pipeline fail if any command in it fails, so that a dead `curl`
    # in `curl | tar` actually stops the script.
    set -o pipefail

    # Absolute path to this script. /home/user/bin/foo.sh
    SCRIPT=$(readlink -f "$0")
    # Absolute path this script is in. /home/user/bin
    SCRIPTPATH=$(dirname "$SCRIPT")

    # This is where headers, libraries, pkg-config/Gir/docs files will be installed
    export PREFIX=${SCRIPTPATH}/prefix
    mkdir -p "${PREFIX}"

    # Some of these servers are _really slow_, it helps to be able to skip the
    # download when working on those scripts. Downloading can be skipped by
    # exporting `SKIP_DL=1`
    #
    # The `${FOOBAR:-otherwise}` expression will evaluate to "otherwise" if
    # the `FOOBAR` variable is unset or empty, which keeps `-u` from erroring
    # out when `SKIP_DL` isn't exported.
    if [[ "${SKIP_DL:-unset}" == "1" ]]; then
        echo "Skipping download..."
    else
        rm -rf sources/pcre
        mkdir -p sources/pcre

        # Pipe curl into tar to extract on the fly (possible with tarballs, not so easy
        # with zips)
        #
        # curl's -f (--fail) flag will fail if it gets a non-2xx status code.
        # Unless it's a redirect (3xx), in which case -L (--location) will follow
        # the redirect.
        #
        # tar's `x` extracts, `j` is for `.bz2`, and `--strip-components` extracts
        # `foobar-1.2.3/blah` as `blah` instead. `-C` "changes directory", essentially
        # specifying the destination.
        curl -f -L https://sourceforge.net/projects/pcre/files/pcre/8.45/pcre-8.45.tar.bz2/download \
            | tar xj -C sources/pcre --strip-components 1
    fi

    # Always re-build from scratch. Incremental rebuilds are always messy,
    # especially when reconfiguring, I'd rather not risk it.
    rm -rf builds/pcre
    mkdir -p builds/pcre

    # Like `cd`, but we can `popd` later to get out of it
    pushd builds/pcre

    # See later discussion
    export CFLAGS="-fPIC"

    # Look mom, it's autotools!
    #   --prefix specifies "where to install stuff"
    #   I'm doing this from Fedora, which uses `${PREFIX}/lib64` rather than
    #   Debian's `${PREFIX}/lib/${TRIPLET}`, for some reason --libdir is needed.
    #   --disable-shared says not to build shared/dynamic libraries (.so), and
    #   --enable-static says to build static libraries (.a).
    ../../sources/pcre/configure \
        --prefix="${PREFIX}" \
        --libdir="${PREFIX}/lib64" \
        --disable-shared \
        --enable-static

    # Build all that code, using all available processors/cores/hyperthreads
    make -j "$(nproc)"

    # Install all that
    make install

    # Kinda unnecessary since all of this is happening in a sub-shell that
    # exits immediately after, but the symmetry is nice.
    popd

So that’s with autotools! Not too bad, all things considered.

Cool bear

That reminds me of a tweet by Tim Martin:

I saw a book entitled “Die GNU Autotools” and I thought “My feelings exactly”. Turns out the book was in German.

The one surprising thing is this line:

export CFLAGS="-fPIC"

And that line exists because… the default Rust toolchain tries to make position-independent executables. We discussed this in Position-independent code, but back then we were hyperfocused on learning about ELF. Here we’re just hitting it in the real world!

Position-independent code is generally a good thing, but with most C build systems / compilers, you have to explicitly opt into it, unless you’re building shared objects (.so). Since we’re building static library archives here (.a), we have to tell the C compiler that we intend for that code to eventually be linked into a position-independent executable (a PIE).

As a result, it’ll use a different set of relocations that are compatible with position-independent executables. If we forget that part, we’ll just get an error at link-time, where the linker will be like “uhh yeah I can’t make a PIE with those relocations”. It’s not as bad as silently building an executable that panics at runtime.

And autotools is relatively hands-off here: it’s mostly concerned with making sure it doesn’t have to enable a hundred workarounds, that are only needed if you’re building for HP-UX or Irix or something.

I mean, technically libtool has a thing where it’ll generate both PIC and non-PIC objects, and there’s the --with-pic option but eh… I don’t super trust it.

Other build systems have other ways to specify that you want position-independent code!

Here’s a cmake example:

    # (most of the file is skipped, it looks a lot like the previous example)

    # Configure the build.
    #   -S sets the source folder, -B sets the build folder,
    #   -D specifies an option. "OFF" and "ON" are for features, "True" and "False"
    #   are for booleans. We disable most features.
    cmake \
        -S sources/poppler \
        -B builds/poppler \
        -DCMAKE_INSTALL_PREFIX=${PREFIX} \
        -DCMAKE_BUILD_TYPE=Release \
        -DCMAKE_POSITION_INDEPENDENT_CODE=True \
        -DBUILD_GTK_TESTS=OFF \
        -DBUILD_QT5_TESTS=OFF \
        -DBUILD_QT6_TESTS=OFF \
        -DBUILD_CPP_TESTS=OFF \
        -DBUILD_MANUAL_TESTS=OFF \
        -DENABLE_BOOST=OFF \
        -DENABLE_UTILS=OFF \
        -DENABLE_CPP=OFF \
        -DENABLE_GLIB=ON \
        -DENABLE_GOBJECT_INTROSPECTION=OFF \
        -DENABLE_GTK_DOC=OFF \
        -DENABLE_QT5=OFF \
        -DENABLE_QT6=OFF \
        -DENABLE_LIBOPENJPEG=none \
        -DENABLE_CMS=none \
        -DENABLE_DCTDECODER=none \
        -DENABLE_LIBCURL=OFF \
        -DENABLE_ZLIB=OFF \
        -DBUILD_SHARED_LIBS=OFF \
        -DRUN_GPERF_IF_PRESENT=OFF \
        ;

    # Build using aaaaall available processors/cores/hyperthreads
    cmake --build builds/poppler --parallel $(nproc)

    # Install
    cmake --install builds/poppler

The position-independent flag here is -DCMAKE_POSITION_INDEPENDENT_CODE=True.

Here’s an interesting distinction between autotools and CMake: although they both have this “configure” stage, where they generate some more files that will be required to build, CMake actually provides a --build (and an --install) option!

Whereas autotools had us running make by hand. Which is fine on Linux, macOS, or even MinGW on Windows. But on Windows, I want to use MSVC. Some projects ship a .sln (Visual Studio Solution) file directly, but these are not entirely portable across Visual Studio versions, even though they do their best to “upgrade” it when you open it.

So, if you were wondering why in the world people bothered with systems like CMake, that’s why: not everything is Linux, not every C compiler is GCC, not every build system is GNU make. It’s super useful to have a tool that drives the whole process, and lets you use MSVC as a C compiler, or ninja as a build system, for example.

And Meson is another one of these!

Usage is really similar to CMake, here’s my build script for freetype:

    # (again, most of the file is not shown because it just downloads and extracts
    # the sources)

    meson setup \
        builds/freetype \
        sources/freetype \
        --prefix ${PREFIX} \
        --default-library static \
        -D b_staticpic=true \
        -D b_pie=true \
        -D brotli=disabled \
        -D bzip2=disabled \
        -D harfbuzz=disabled \
        -D png=disabled \
        -D tests=disabled \
        -D zlib=enabled \
        ;

    meson compile -C builds/freetype
    meson install -C builds/freetype

As you can see, there’s also a standard way to enable/disable features (like brotli, bzip2, etc.), pick the installation prefix (--prefix), the kind of library to build (--default-library), and some built-in options to enable position-independent code (b_staticpic and b_pie: pretty sure the latter is not needed here, but ah well).

And so, having written all my build scripts, and an all.sh script to call them all in order, I’m ready to make my static build. Here’s an asciinema of it: it completes in a little over a minute on a Ryzen 5950X with 128GB of RAM and a decent NVMe SSD (on Fedora 35 inside VMware Workstation Player with a Windows 11 host):

I find it really fun to watch it go! You can see some colors around the Meson and CMake parts. It’s so nice!

In the end, we have 150MB’s worth of libraries in our prefix:

    $ du -hd0 prefix
    151M    prefix
    $ ls prefix/lib64
    glib-2.0             libcairo.la  libfontconfig.a  libgmodule-2.0.a  libpcrecpp.a    libpcreposix.la  libpng16.la  libpoppler-glib.a
    libcairo.a           libexpat.a   libfreetype.a    libgobject-2.0.a  libpcrecpp.la   libpixman-1.a    libpng.a     libz.a
    libcairo-gobject.a   libffi.a     libgio-2.0.a     libgthread-2.0.a  libpcre.la      libpixman-1.la   libpng.la    pkgconfig
    libcairo-gobject.la  libffi.la    libglib-2.0.a    libpcre.a         libpcreposix.a  libpng16.a       libpoppler.a

How do we link against all this? Thanks to pkg-config!

The Fedora package for poppler-glib installed /usr/lib64/pkgconfig/poppler-glib.pc, which contains all the libraries and flags required to use it:

    prefix=/usr
    libdir=/usr/lib64
    includedir=/usr/include

    Name: poppler-glib
    Description: GLib wrapper for poppler
    Version: 21.08.0
    Requires: glib-2.0 >= 2.56 gobject-2.0 >= 2.56 cairo >= 1.10.0
    Requires.private: poppler = 21.08.0

    Libs: -L${libdir} -lpoppler-glib
    Cflags: -I${includedir}/poppler/glib

And now that we’ve built and installed poppler-glib into our own prefix, we have a /home/amos/bearcove/poppler-build/prefix/lib64/pkgconfig/poppler-glib.pc file:

    prefix=/home/amos/bearcove/poppler-build/prefix
    libdir=/home/amos/bearcove/poppler-build/prefix/lib64
    includedir=/home/amos/bearcove/poppler-build/prefix/include

    Name: poppler-glib
    Description: GLib wrapper for poppler
    Version: 21.11.0
    Requires: glib-2.0 >= 2.56 gobject-2.0 >= 2.56 cairo >= 1.10.0
    Requires.private: poppler = 21.11.0

    Libs: -L${libdir} -lpoppler-glib
    Cflags: -I${includedir}/poppler/glib

(Note that the version we built is more recent than the version Fedora 35 ships!)
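To make the format a bit more concrete, here’s a minimal sketch of how such a file could be read: `name=value` lines are variables, `Key: value` lines are fields, and `${name}` references get expanded. The `PcFile`, `expand` and `parse_pc` names are made up for this illustration, and the real pkg-config handles far more (quoting, escapes, version comparisons), so treat it as a toy.

```rust
use std::collections::HashMap;

/// A parsed `.pc` file: `name=value` variables and `Key: value` fields.
#[derive(Debug, Default)]
struct PcFile {
    vars: HashMap<String, String>,
    fields: HashMap<String, String>,
}

/// Replace every `${name}` in `s` with its value from `vars`.
/// Unknown variables expand to an empty string in this sketch.
fn expand(s: &str, vars: &HashMap<String, String>) -> String {
    let mut out = String::new();
    let mut rest = s;
    while let Some(start) = rest.find("${") {
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        match after.find('}') {
            Some(end) => {
                if let Some(v) = vars.get(&after[..end]) {
                    out.push_str(v);
                }
                rest = &after[end + 1..];
            }
            None => {
                // Unterminated `${`: keep it literally and stop.
                out.push_str(&rest[start..]);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    out
}

fn parse_pc(text: &str) -> PcFile {
    let mut pc = PcFile::default();
    for line in text.lines() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        // Whichever of `=` or `:` comes first decides the kind of line,
        // so `Requires: glib-2.0 >= 2.56` is still treated as a field.
        match (line.find('='), line.find(':')) {
            (Some(e), c) if c.map_or(true, |c| e < c) => {
                let value = expand(line[e + 1..].trim(), &pc.vars);
                pc.vars.insert(line[..e].trim().to_string(), value);
            }
            (_, Some(c)) => {
                let value = expand(line[c + 1..].trim(), &pc.vars);
                pc.fields.insert(line[..c].trim().to_string(), value);
            }
            _ => {}
        }
    }
    pc
}

fn main() {
    let text = "\
prefix=/usr
libdir=/usr/lib64
includedir=/usr/include

Name: poppler-glib
Requires: glib-2.0 >= 2.56 gobject-2.0 >= 2.56 cairo >= 1.10.0
Requires.private: poppler = 21.08.0
Libs: -L${libdir} -lpoppler-glib
Cflags: -I${includedir}/poppler/glib
";
    let pc = parse_pc(text);
    println!("Libs:   {}", pc.fields["Libs"]);
    println!("Cflags: {}", pc.fields["Cflags"]);
}
```

The expanded `Libs` and `Cflags` values are essentially what `pkg-config --libs` and `pkg-config --cflags` print for a single package, before any dependencies are followed.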

pkg-config is relatively straightforward to use “manually”. To show it off, I made a sample C program:

    #include <glib/poppler.h>
    #include <glib/gerror.h>
    #include <stdio.h>
    #include <stdlib.h> /* for exit() */

    int main() {
        GError *error = NULL;
        PopplerDocument *doc = poppler_document_new_from_file("file:///tmp/export.pdf", NULL, &error);
        if (error) {
            printf("got GError: %s\n", error->message);
            g_clear_error(&error);
            exit(1);
        }

        printf("doc = %p\n", doc);
        return 0;
    }

Trying to compile it with gcc the naive way doesn’t work, because include paths are not set:

    $ gcc sample.c -o sample
    sample.c:1:10: fatal error: glib/poppler.h: No such file or directory
        1 | #include <glib/poppler.h>
          |          ^~~~~~~~~~~~~~~~
    compilation terminated.

But even setting those include paths isn’t enough:

    $ gcc -I ~/bearcove/poppler-build/prefix/include/glib-2.0 \
          -I ~/bearcove/poppler-build/prefix/include/poppler \
          -I ~/bearcove/poppler-build/prefix/lib64/glib-2.0/include \
          -I ~/bearcove/poppler-build/prefix/include/cairo \
          sample.c -o sample
    /usr/bin/ld: /tmp/cc09zSS1.o: in function `main':
    sample.c:(.text+0x22): undefined reference to `poppler_document_new_from_file'
    /usr/bin/ld: sample.c:(.text+0x55): undefined reference to `g_clear_error'
    collect2: error: ld returned 1 exit status

…because now we’re missing the libraries. And this is what pkg-config solves!

    $ pkg-config --cflags glib-2.0
    -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include -I/usr/include/sysprof-4 -pthread

But this picks up the system / distro-installed .pc files. If we want to use the prefix we just made, we need to set the PKG_CONFIG_PATH environment variable.

    $ PKG_CONFIG_PATH=/home/amos/bearcove/poppler-build/prefix/lib64/pkgconfig \
        pkg-config --cflags glib-2.0
    -I/home/amos/bearcove/poppler-build/prefix/include/glib-2.0 -pthread -I/home/amos/bearcove/poppler-build/prefix/lib64/glib-2.0/include -I/home/amos/bearcove/poppler-build/prefix/include -DPCRE_STATIC
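The same thing can be done from a program. Here’s a small sketch, using only Rust’s standard library, that sets `PKG_CONFIG_PATH` for a single child process rather than for the whole shell. The `pkg_config` helper is a hypothetical name for this example, and the prefix path is simply the one used above.

```rust
use std::process::Command;

/// Build (but do not run) a `pkg-config` invocation that resolves
/// `.pc` files from `pc_dir`, a custom prefix's pkgconfig directory.
fn pkg_config(pc_dir: &str, args: &[&str]) -> Command {
    let mut cmd = Command::new("pkg-config");
    // Only the child sees this variable; our own environment is untouched.
    cmd.env("PKG_CONFIG_PATH", pc_dir).args(args);
    cmd
}

fn main() {
    let mut cmd = pkg_config(
        "/home/amos/bearcove/poppler-build/prefix/lib64/pkgconfig",
        &["--cflags", "glib-2.0"],
    );
    match cmd.output() {
        Ok(out) if out.status.success() => {
            print!("{}", String::from_utf8_lossy(&out.stdout));
        }
        Ok(out) => eprintln!("pkg-config failed: {}", String::from_utf8_lossy(&out.stderr)),
        Err(e) => eprintln!("could not run pkg-config: {e}"),
    }
}
```

Everything else in the environment is inherited as usual, which is the same mechanism that lets a variable exported in the shell reach pkg-config through several layers of build tooling.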

I’ve just shown the --cflags flag, which is necessary for compiling C files (it sets defines and include paths, mostly), and then there’s --libs, which is necessary to link everything together.

Here’s what linking dynamically (with the system packages) would look like:

    $ pkg-config --libs glib-2.0
    -lglib-2.0

And now statically (with our own prefix):

    $ PKG_CONFIG_PATH=/home/amos/bearcove/poppler-build/prefix/lib64/pkgconfig \
        pkg-config --libs glib-2.0
    -L/home/amos/bearcove/poppler-build/prefix/lib64 -lglib-2.0 -pthread -lm -lpcre
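If those strings of flags all blur together, this sketch sorts pkg-config output into buckets by prefix: `-I` and `-D` matter at compile time, `-L` and `-l` at link time. The `Flags` struct and `classify` function are illustrative names, the `/prefix` paths are shortened stand-ins for the real prefix, and splitting on whitespace is a simplification that breaks on paths containing spaces.

```rust
/// Compiler/linker flags sorted by what they mean.
#[derive(Debug, Default, PartialEq)]
struct Flags {
    include_dirs: Vec<String>, // -I<dir>
    defines: Vec<String>,      // -D<name>
    lib_dirs: Vec<String>,     // -L<dir>
    libs: Vec<String>,         // -l<name>
    other: Vec<String>,        // anything else, e.g. -pthread
}

/// Naive classifier: splits on whitespace and looks at each token's prefix.
/// Order is preserved within each bucket, since it matters for static linking.
fn classify(output: &str) -> Flags {
    let mut flags = Flags::default();
    for tok in output.split_whitespace() {
        if let Some(dir) = tok.strip_prefix("-I") {
            flags.include_dirs.push(dir.to_string());
        } else if let Some(name) = tok.strip_prefix("-D") {
            flags.defines.push(name.to_string());
        } else if let Some(dir) = tok.strip_prefix("-L") {
            flags.lib_dirs.push(dir.to_string());
        } else if let Some(name) = tok.strip_prefix("-l") {
            flags.libs.push(name.to_string());
        } else {
            flags.other.push(tok.to_string());
        }
    }
    flags
}

fn main() {
    let cflags = classify(
        "-I/prefix/include/glib-2.0 -pthread -I/prefix/lib64/glib-2.0/include \
         -I/prefix/include -DPCRE_STATIC",
    );
    let libs = classify("-L/prefix/lib64 -lglib-2.0 -pthread -lm -lpcre");
    println!("{cflags:#?}");
    println!("{libs:#?}");
}
```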

Because in our case we’re doing static linking, we also probably want to use the --static flag, which tells pkg-config to “be more aggressive when computing dependency graph (for static linking)”. And because we don’t want to accidentally end up dynamically linking against a system library, we can specify --env-only, which tells pkg-config to “look only for package entries in PKG_CONFIG_PATH”.

This doesn’t actually make a difference for glib-2.0:

    $ PKG_CONFIG_PATH=/home/amos/bearcove/poppler-build/prefix/lib64/pkgconfig \
        pkg-config --env-only --static --libs glib-2.0
    -L/home/amos/bearcove/poppler-build/prefix/lib64 -lglib-2.0 -pthread -lm -lpcre

But it does for cairo, for example!

    $ PKG_CONFIG_PATH=/home/amos/bearcove/poppler-build/prefix/lib64/pkgconfig \
        pkg-config --libs cairo
    -L/home/amos/bearcove/poppler-build/prefix/lib64 -lcairo

Without --static, it only bothers specifying -lcairo, I’m guessing because it assumes by default we’re linking dynamically, and libcairo.so itself depends on other libraries:

    $ ldd /usr/lib64/libcairo.so | grep -E 'glib|pixman|pcre|png|freetype'
        libpixman-1.so.0 => /lib64/libpixman-1.so.0 (0x00007f551e8cb000)
        libfreetype.so.6 => /lib64/libfreetype.so.6 (0x00007f551e7b1000)
        libpng16.so.16 => /lib64/libpng16.so.16 (0x00007f551e778000)
        libglib-2.0.so.0 => /lib64/libglib-2.0.so.0 (0x00007f551dedb000)
        libpcre.so.1 => /lib64/libpcre.so.1 (0x00007f551de1f000)

There’s no such concept with .a archives (static libraries). They’re just a bunch of .o files:

    $ ar tv /home/amos/bearcove/poppler-build/prefix/lib64/libpoppler-glib.a
    rw-r--r-- 1000/1000  16912 Nov 25 14:35 2021 poppler-action.cc.o
    rw-r--r-- 1000/1000   1992 Nov 25 14:35 2021 poppler-date.cc.o
    rw-r--r-- 1000/1000 116744 Nov 25 14:35 2021 poppler-document.cc.o
    rw-r--r-- 1000/1000  64688 Nov 25 14:35 2021 poppler-page.cc.o
    rw-r--r-- 1000/1000  10496 Nov 25 14:35 2021 poppler-attachment.cc.o
    rw-r--r-- 1000/1000  25032 Nov 25 14:35 2021 poppler-form-field.cc.o
    rw-r--r-- 1000/1000  65024 Nov 25 14:35 2021 poppler-annot.cc.o
    rw-r--r-- 1000/1000   7664 Nov 25 14:35 2021 poppler-layer.cc.o
    rw-r--r-- 1000/1000  10024 Nov 25 14:35 2021 poppler-movie.cc.o
    rw-r--r-- 1000/1000  11624 Nov 25 14:35 2021 poppler-media.cc.o
    rw-r--r-- 1000/1000   3032 Nov 25 14:35 2021 poppler.cc.o
    rw-r--r-- 1000/1000   6168 Nov 25 14:35 2021 poppler-cached-file-loader.cc.o
    rw-r--r-- 1000/1000  14136 Nov 25 14:35 2021 poppler-input-stream.cc.o
    rw-r--r-- 1000/1000  81248 Nov 25 14:35 2021 poppler-structure-element.cc.o
    rw-r--r-- 1000/1000  72280 Nov 25 14:35 2021 poppler-enums.c.o
    rw-r--r-- 1000/1000  29064 Nov 25 14:35 2021 CairoFontEngine.cc.o
    rw-r--r-- 1000/1000 142272 Nov 25 14:35 2021 CairoOutputDev.cc.o
    rw-r--r-- 1000/1000   7768 Nov 25 14:35 2021 CairoRescaleBox.cc.o

Cool Bear's hot tip

“GNU ar” is most probably named for “archive”, and it’s part of binutils. The t flag prints a “lisT” of entries, and v is “Verbose”.
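To make “just a bunch of .o files” concrete, here’s a rough sketch of a reader for the `ar` container format (GNU flavor): an 8-byte magic string, then for each member a 60-byte text header followed by the member’s bytes, padded to an even offset. The `ar_members` function and the `member` helper (which builds a tiny fake archive in memory) are written for this example only. It ignores BSD-style long names, and the "object files" in the demo are dummy bytes.

```rust
/// Read one fixed-width text field out of a 60-byte member header.
fn field(header: &[u8], range: std::ops::Range<usize>) -> Result<&str, String> {
    std::str::from_utf8(&header[range])
        .map(|s| s.trim())
        .map_err(|e| e.to_string())
}

/// List member names of an `ar` archive held in memory (GNU conventions).
fn ar_members(data: &[u8]) -> Result<Vec<String>, String> {
    const MAGIC: &[u8] = b"!<arch>\n";
    if !data.starts_with(MAGIC) {
        return Err("not an ar archive".into());
    }
    let mut names = Vec::new();
    let mut long_names: &[u8] = &[];
    let mut pos = MAGIC.len();
    while pos + 60 <= data.len() {
        let header = &data[pos..pos + 60];
        if &header[58..60] != b"`\n" {
            return Err(format!("bad member header at offset {pos}"));
        }
        // Layout: name(16) mtime(12) uid(6) gid(6) mode(8) size(10) magic(2)
        let raw_name = field(header, 0..16)?;
        let size: usize = field(header, 48..58)?
            .parse()
            .map_err(|_| format!("bad size at offset {pos}"))?;
        let start = pos + 60;
        let end = start
            .checked_add(size)
            .filter(|&e| e <= data.len())
            .ok_or("truncated member")?;
        let body = &data[start..end];

        if raw_name == "/" || raw_name == "/SYM64/" {
            // Symbol index used by the linker: not a real member.
        } else if raw_name == "//" {
            // Table holding names too long for the 16-byte field.
            long_names = body;
        } else if let Some(off) = raw_name.strip_prefix('/').and_then(|s| s.parse::<usize>().ok()) {
            // "/123" means: the name starts at offset 123 in the long-name table.
            let tail = long_names.get(off..).ok_or("bad long-name offset")?;
            let len = tail.iter().position(|&b| b == b'\n').unwrap_or(tail.len());
            let name = String::from_utf8_lossy(&tail[..len]);
            names.push(name.trim_end_matches('/').to_string());
        } else {
            names.push(raw_name.trim_end_matches('/').to_string());
        }
        pos = end + (size % 2); // members start on even offsets
    }
    Ok(names)
}

/// Build one archive member (header + body + padding) for the demo.
fn member(name: &str, body: &[u8]) -> Vec<u8> {
    let mut m = format!("{:<16}{:<12}{:<6}{:<6}{:<8}{:<10}`\n", name, 0, 0, 0, 644, body.len())
        .into_bytes();
    m.extend_from_slice(body);
    if body.len() % 2 == 1 {
        m.push(b'\n');
    }
    m
}

fn main() {
    let mut archive = b"!<arch>\n".to_vec();
    archive.extend(member("//", b"poppler-cached-file-loader.cc.o/\n"));
    archive.extend(member("poppler.cc.o/", b"abc"));
    archive.extend(member("/0", b""));
    println!("{:?}", ar_members(&archive));
}
```

Notice what’s absent from the format: there’s no field anywhere saying “this archive needs libcairo”. The members are concatenated object files and nothing more.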

So, when we specify --static (and --env-only for good measure), pkg-config lists dependencies explicitly:

    $ PKG_CONFIG_PATH=/home/amos/bearcove/poppler-build/prefix/lib64/pkgconfig \
        pkg-config --libs --env-only --static cairo
    -L/home/amos/bearcove/poppler-build/prefix/lib64 -lcairo -L/home/amos/bearcove/poppler-build/prefix/lib64 -lgobject-2.0 -L/home/amos/bearcove/poppler-build/prefix/lib64 -L/home/amos/bearcove/poppler-build/prefix/lib64 -L/home/amos/bearcove/poppler-build/prefix/lib64/../lib64 -lffi -lglib-2.0 -pthread -lm -lpcre -lpixman-1 -lfreetype -lz -lpng16 -lm -lm -lz
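Here’s my mental model of what `--static` changes, as a toy sketch: without it, only `Libs` and public `Requires` are followed; with it, `Libs.private` and `Requires.private` are pulled in too, recursively. This is not pkg-config’s actual algorithm (the real tool also orders and merges flags), and the little package database is trimmed down and partly made up, loosely modeled on a cairo.pc with private dependencies.

```rust
use std::collections::{HashMap, HashSet};

/// The four `.pc` fields that matter for linking.
#[derive(Default)]
struct Pkg {
    requires: Vec<&'static str>,
    requires_private: Vec<&'static str>,
    libs: Vec<&'static str>,
    libs_private: Vec<&'static str>,
}

type Db = HashMap<&'static str, Pkg>;

fn visit(db: &Db, name: &str, is_static: bool, seen: &mut HashSet<String>, out: &mut Vec<String>) {
    // Visit each package once, so dependency cycles can't loop forever.
    if !seen.insert(name.to_string()) {
        return;
    }
    // Unknown packages are silently skipped in this sketch.
    let Some(pkg) = db.get(name) else { return };

    out.extend(pkg.libs.iter().map(|s| s.to_string()));
    if is_static {
        out.extend(pkg.libs_private.iter().map(|s| s.to_string()));
    }
    for dep in &pkg.requires {
        visit(db, dep, is_static, seen, out);
    }
    if is_static {
        for dep in &pkg.requires_private {
            visit(db, dep, is_static, seen, out);
        }
    }
}

/// Link flags for `root`, roughly like `pkg-config [--static] --libs root`.
fn collect_libs(db: &Db, root: &str, is_static: bool) -> Vec<String> {
    let mut out = Vec::new();
    visit(db, root, is_static, &mut HashSet::new(), &mut out);
    out
}

/// Hypothetical, simplified package database (assumption, not real .pc data).
fn sample_db() -> Db {
    let mut db = Db::new();
    db.insert("cairo", Pkg {
        requires_private: vec!["pixman-1", "libpng"],
        libs: vec!["-lcairo"],
        libs_private: vec!["-lz"],
        ..Default::default()
    });
    db.insert("pixman-1", Pkg {
        libs: vec!["-lpixman-1"],
        libs_private: vec!["-lm"],
        ..Default::default()
    });
    db.insert("libpng", Pkg {
        libs: vec!["-lpng16"],
        libs_private: vec!["-lz", "-lm"],
        ..Default::default()
    });
    db
}

fn main() {
    let db = sample_db();
    println!("dynamic: {}", collect_libs(&db, "cairo", false).join(" "));
    println!("static:  {}", collect_libs(&db, "cairo", true).join(" "));
}
```

The duplicates in the static output (`-lz`, `-lm`) are left in on purpose: the real output above repeats flags too, and for static archives the order of `-l` flags matters, so blindly deduplicating can break a link.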

So, to link our sample C program statically against our libraries, all we need to do is grab the output of pkg-config --cflags and pkg-config --libs and use it for the appropriate build stages:

    set -eux

    export PKG_CONFIG_PATH=/home/amos/bearcove/poppler-build/prefix/lib64/pkgconfig

    PKGS="poppler-glib gio-2.0 fontconfig libpng"
    CFLAGS=$(pkg-config --static --cflags ${PKGS})
    LIBS=$(pkg-config --static --libs ${PKGS})

    gcc ${CFLAGS} -c sample.c
    g++ sample.o -o sample ${LIBS} -static-libstdc++

The compile step ends up looking like this:

gcc -I${PREFIX}/include/poppler/glib -I${PREFIX}/include/glib-2.0 -I${PREFIX}/lib64/glib-2.0/include -I${PREFIX}/include -I${PREFIX}/include/cairo -I${PREFIX}/include/pixman-1 -I${PREFIX}/include/freetype2 -I${PREFIX}/include/libpng16 -I${PREFIX}/include/poppler -DPCRE_STATIC -pthread -c sample.c

And the link step, like this:

g++ sample.o -o sample -L${PREFIX}/lib64 -lpoppler-glib -lgobject-2.0 -lglib-2.0 -pthread -lm -lpcre -L${PREFIX}/lib64/../lib64 -lcairo -L${PREFIX}/lib64 -L${PREFIX}/lib64 -L${PREFIX}/lib64 -lm -lpixman-1 -lz -lm -L${PREFIX}/lib64 -lpoppler -lgio-2.0 -lgobject-2.0 -lffi -lgmodule-2.0 -lglib-2.0 -lm -lpcre -lfontconfig -pthread -lexpat -lfreetype -lz -lpng16 -lm -lm -lz -static-libstdc++

(With the actual prefix replaced with ${PREFIX} for readability).

Specifying poppler-glib wasn’t enough in this case: adding gio-2.0, fontconfig and libpng was still needed. This feels like a bug in .pc files, but then again, static linking is starting to feel like a lost art: mainstream distributions like Ubuntu and Fedora are definitely focused on dynamic linking above all else.

Here’s another cool tip: pkg-config can generate a graphviz (.dot) file to graph dependencies:

    $ PKG_CONFIG_DEBUG_SPEW=1 \
        PKG_CONFIG_PATH=/home/amos/bearcove/poppler-build/prefix/lib64/pkgconfig \
        pkg-config --static --env-only --digraph --exists --errors-to-stdout \
        poppler-glib gio-2.0 fontconfig libpng > /tmp/deps.dot
    $ dot -Tsvg /tmp/deps.dot > /tmp/deps.svg

It seems cairo has basically no dependencies in that graph, whereas in reality, it definitely has a bunch:

    $ ldd /usr/lib64/libcairo.so | grep -E 'fontconfig|freetype|pixman|png'
        libpixman-1.so.0 => /lib64/libpixman-1.so.0 (0x00007f4777ecb000)
        libfontconfig.so.1 => /lib64/libfontconfig.so.1 (0x00007f4777e7c000)
        libfreetype.so.6 => /lib64/libfreetype.so.6 (0x00007f4777db1000)
        libpng16.so.16 => /lib64/libpng16.so.16 (0x00007f4777d78000)

Looking closer, it appears those are listed, but as “private” dependencies:

    $ cat /home/amos/bearcove/poppler-build/prefix/lib64/pkgconfig/cairo.pc
    prefix=/home/amos/bearcove/poppler-build/prefix
    exec_prefix=${prefix}
    libdir=/home/amos/bearcove/poppler-build/prefix/lib64
    includedir=${prefix}/include

    Name: cairo
    Description: Multi-platform 2D graphics library
    Version: 1.16.0

    Requires.private: gobject-2.0 glib-2.0 >= 2.14 pixman-1 >= 0.30.0 freetype2 >= 9.7.3 libpng
    Libs: -L${libdir} -lcairo
    Libs.private: -lz
    Cflags: -I${includedir}/cairo

So maybe it’s just the graph generation that’s broken. Note that fontconfig isn’t in that Requires.private list at all, while freetype2 is, which explains why we need to specify fontconfig by hand, but not freetype2.

Anyway! Let’s get back on track.

With all that effort, our sample C program has almost no dynamic dependencies:

    $ ldd ./sample
        linux-vdso.so.1 (0x00007ffc8e4c7000)
        libm.so.6 => /lib64/libm.so.6 (0x00007fcf437a8000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fcf4378e000)
        libc.so.6 => /lib64/libc.so.6 (0x00007fcf43584000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fcf43897000)

Similar to another randomly-chosen Rust CLI program:

    $ ldd $(which sfz)
        linux-vdso.so.1 (0x00007ffceaba7000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fb569f8b000)
        libm.so.6 => /lib64/libm.so.6 (0x00007fb569eaf000)
        libc.so.6 => /lib64/libc.so.6 (0x00007fb569ca5000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fb56a72c000)

The dependencies there are:

  • linux-vdso.so: a “virtual” dynamic shared object to make some syscalls faster (like gettimeofday)
  • libgcc_s.so: some GCC builtins, like _Unwind_Resume, __popcountdi2, etc.
  • libm.so: math functions, like tan, sqrtf32, etc.
  • libc.so: the standard C library, in this case, glibc
  • ld-linux-x86-64.so.2: the dynamic linker/loader, see this whole series
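If you wanted to check this automatically (say, in CI), a small sketch like the following could flag anything beyond that baseline. It works on `ldd` output passed in as text; the `ldd_libs` and `unexpected` functions and the `BASELINE` list are assumptions made for this example, with the list simply mirroring the five entries above.

```rust
/// Dynamic dependencies we're fine with (the five discussed above).
const BASELINE: &[&str] = &[
    "linux-vdso.so",
    "libgcc_s.so",
    "libm.so",
    "libc.so",
    "ld-linux-x86-64.so",
];

/// First column of each `ldd` output line: the library name (or path).
fn ldd_libs(output: &str) -> Vec<&str> {
    output
        .lines()
        .filter_map(|line| line.split_whitespace().next())
        .collect()
}

/// Libraries in `ldd` output that are not part of the baseline.
fn unexpected(output: &str) -> Vec<&str> {
    ldd_libs(output)
        .into_iter()
        .filter(|lib| {
            // `/lib64/ld-linux-x86-64.so.2` is printed as a path: keep the file name.
            let file = lib.rsplit('/').next().unwrap_or(*lib);
            // Prefix match so that version suffixes like `.so.6` don't matter.
            !BASELINE.iter().any(|base| file.starts_with(*base))
        })
        .collect()
}

fn main() {
    let static_build = "\
        linux-vdso.so.1 (0x00007ffc8e4c7000)
        libm.so.6 => /lib64/libm.so.6 (0x00007fcf437a8000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fcf4378e000)
        libc.so.6 => /lib64/libc.so.6 (0x00007fcf43584000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fcf43897000)";
    let dynamic_build = "\
        libcairo.so.2 => /lib64/libcairo.so.2 (0x00007f2ba1902000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f2ba15f4000)";
    println!("static build:  {:?}", unexpected(static_build));
    println!("dynamic build: {:?}", unexpected(dynamic_build));
}
```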

And the good news is: the -sys crates for various gnome/glib-adjacent libraries use pkg-config to know what to link against!

Let’s look at what gtk-rs/gir generated for poppler-rs/sys/build.rs for example:

    // Generated by gir (https://github.com/gtk-rs/gir @ 8891a2f2c34b)
    // from ../../gir-files (@ c6afb5857607)
    // from ../gir-files (@ ec3e62ee546b)
    // DO NOT EDIT

    #[cfg(not(feature = "dox"))]
    use std::process;

    #[cfg(feature = "dox")]
    fn main() {} // prevent linking libraries to avoid documentation failure

    #[cfg(not(feature = "dox"))]
    fn main() {
        if let Err(s) = system_deps::Config::new().probe() {
            println!("cargo:warning={}", s);
            process::exit(1);
        }
    }

This uses the system-deps crate, whose README says it supports pkg-config dependencies declared in Cargo.toml.

Do we have anything there?

    # in `poppler-rs/sys/Cargo.toml`

    [package.metadata.system-deps.poppler_glib]
    name = "poppler-glib"
    version = "0.70"

    [package.metadata.system-deps.poppler_glib.v0_72]
    version = "0.72"

    [package.metadata.system-deps.poppler_glib.v0_73]
    version = "0.73"

    # (more versions omitted...)

We do!

And so if we build with no particular options, we can spy on cargo to see what it executes:

    $ strace -ff -o /tmp/cargo-build -e 'execve' cargo build
    (output omitted)

Cool Bear's hot tip

strace tracks syscalls: in this case we want to track execve calls (executing another program), we want to “follow forks” (-ff), ie. spy on all processes created by the top-level cargo, such as compiled build scripts, and we want to output the logs, per-PID (process identifier), to /tmp/cargo-build.PID files.

    $ rg 'pkg-config' /tmp/cargo-build.*
    /tmp/cargo-build.215066
    1:execve("/home/amos/.cargo/bin/pkg-config", ["pkg-config", "--libs", "--cflags", "cairo", "cairo >= 1.14"], 0x55cfe93b5010 /* 97 vars */) = -1 ENOENT (No such file or directory)
    2:execve("/home/amos/.local/bin/pkg-config", ["pkg-config", "--libs", "--cflags", "cairo", "cairo >= 1.14"], 0x55cfe93b5010 /* 97 vars */) = -1 ENOENT (No such file or directory)
    3:execve("/home/amos/.fly/bin/pkg-config", ["pkg-config", "--libs", "--cflags", "cairo", "cairo >= 1.14"], 0x55cfe93b5010 /* 97 vars */) = -1 ENOENT (No such file or directory)
    4:execve("/home/amos/.vscode-server/bin/ccbaa2d27e38e5afa3e5c21c1c7bef4657064247/bin/pkg-config", ["pkg-config", "--libs", "--cflags", "cairo", "cairo >= 1.14"], 0x55cfe93b5010 /* 97 vars */) = -1 ENOENT (No such file or directory)
    5:execve("/home/amos/.local/bin/pkg-config", ["pkg-config", "--libs", "--cflags", "cairo", "cairo >= 1.14"], 0x55cfe93b5010 /* 97 vars */) = -1 ENOENT (No such file or directory)
    6:execve("/home/amos/.fly/bin/pkg-config", ["pkg-config", "--libs", "--cflags", "cairo", "cairo >= 1.14"], 0x55cfe93b5010 /* 97 vars */) = -1 ENOENT (No such file or directory)
    7:execve("/home/amos/.vscode-server/bin/ccbaa2d27e38e5afa3e5c21c1c7bef4657064247/bin/pkg-config", ["pkg-config", "--libs", "--cflags", "cairo", "cairo >= 1.14"], 0x55cfe93b5010 /* 97 vars */) = -1 ENOENT (No such file or directory)
    8:execve("/home/amos/.cargo/bin/pkg-config", ["pkg-config", "--libs", "--cflags", "cairo", "cairo >= 1.14"], 0x55cfe93b5010 /* 97 vars */) = -1 ENOENT (No such file or directory)
    9:execve("/usr/local/bin/pkg-config", ["pkg-config", "--libs", "--cflags", "cairo", "cairo >= 1.14"], 0x55cfe93b5010 /* 97 vars */) = -1 ENOENT (No such file or directory)
    10:execve("/usr/bin/pkg-config", ["pkg-config", "--libs", "--cflags", "cairo", "cairo >= 1.14"], 0x55cfe93b5010 /* 97 vars */) = 0
    12:execve("/usr/bin/x86_64-redhat-linux-gnu-pkg-config", ["/usr/bin/x86_64-redhat-linux-gnu"..., "--libs", "--cflags", "cairo", "cairo >= 1.14"], 0x55cc12d26980 /* 96 vars */) = 0
    (many others omitted)

Cool Bear's hot tip

rg is ripgrep — it’s excellent.

Hilariously, we can see that system-deps ends up trying every possible path for pkg-config, starting with ~/.cargo/bin, then ~/.local/bin, then ~/.fly/bin, etc. It does so for every invocation.

There could probably be benefits in looking it up once and just using that path, if someone feels like doing an easy, low-impact-but-nice PR!
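For the curious, “looking it up once” could look something like this sketch: walk `PATH` a single time, remember the answer, and reuse it for every later invocation. This is not system-deps’ code. The `find_in_path` and `pkg_config_path` names are hypothetical, and the check is simplified (it only tests that the file exists, not that it’s executable).

```rust
use std::env;
use std::ffi::OsStr;
use std::path::{Path, PathBuf};
use std::sync::OnceLock;

/// Return the first `dir/program` that exists, for each `dir` in `path_var`
/// (a PATH-style list). Simplification: doesn't check the executable bit.
fn find_in_path(program: &str, path_var: &OsStr) -> Option<PathBuf> {
    env::split_paths(path_var)
        .map(|dir| dir.join(program))
        .find(|candidate| candidate.is_file())
}

/// Resolve `pkg-config` once per process; later calls reuse the cached result.
fn pkg_config_path() -> Option<&'static Path> {
    static CACHED: OnceLock<Option<PathBuf>> = OnceLock::new();
    CACHED
        .get_or_init(|| find_in_path("pkg-config", &env::var_os("PATH")?))
        .as_deref()
}

fn main() {
    // Both calls return the same answer; only the first one walks PATH.
    println!("{:?}", pkg_config_path());
    println!("{:?}", pkg_config_path());
}
```

Running the resolved absolute path with `Command::new` would then skip the per-invocation search that shows up as a wall of ENOENT in the strace output.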

So, because we know cargo ends up invoking pkg-config (and environment variables are inherited by child processes, unless that behavior is specifically disabled), we can just set PKG_CONFIG_PATH as before, and everything sh-

    $ PKG_CONFIG_PATH=~/bearcove/poppler-build/prefix/lib64/pkgconfig cargo build
       Compiling proc-macro2 v1.0.32
       Compiling unicode-xid v0.2.2
       Compiling syn v1.0.81
       Compiling serde v1.0.130
    (cut)
       Compiling poppler-rs v0.18.2 (/home/amos/bearcove/poppler-rs/poppler)
    error: linking with `cc` failed: exit status: 1
      |
      = note: "cc" "-m64" (cut.)
      = note: /usr/bin/ld: /home/amos/bearcove/poppler-build/prefix/lib64/libpoppler-glib.a(CairoOutputDev.cc.o): in function `CairoOutputDev::~CairoOutputDev()':
              CairoOutputDev.cc:(.text+0x19a): undefined reference to `operator delete(void*, unsigned long)'
              /usr/bin/ld: CairoOutputDev.cc:(.text+0x223): undefined reference to `TextPage::decRefCnt()'
              /usr/bin/ld: CairoOutputDev.cc:(.text+0x237): undefined reference to `ActualText::~ActualText()'
              /usr/bin/ld: CairoOutputDev.cc:(.text+0x244): undefined reference to `operator delete(void*, unsigned long)'

…everything is NOT working, because, remember, a bunch of packages are actually missing!

So we’ve got a couple of options here. One is to make our own build.rs script for pdftocairo. This is relatively simple to set up:

    // in `pdftocairo/build.rs`

    fn main() {
        // poppler-glib requires poppler
        println!("cargo:rustc-link-lib=static=poppler");
        // poppler is written in C++
        println!("cargo:rustc-link-lib=static=stdc++");

        // nobody bothers including this in their pkg-config files apparently
        println!("cargo:rustc-link-lib=static=png");

        // cairo needs this
        println!("cargo:rustc-link-lib=static=freetype");
        // cairo/freetype need this?
        // the freetype ChangeLog says the dependency graph looks like:
        // cairo => fontconfig => freetype2 => harfbuzz => cairo
        println!("cargo:rustc-link-lib=static=fontconfig");
        // cairo also needs this
        println!("cargo:rustc-link-lib=static=pixman-1");

        // fontconfig needs this (it's an XML parser)
        println!("cargo:rustc-link-lib=expat");
    }

And then the build succeeds!

    $ PKG_CONFIG_PATH=~/bearcove/poppler-build/prefix/lib64/pkgconfig cargo build
       Compiling pdftocairo v0.1.0 (/home/amos/bearcove/pdftocairo)
       Compiling glib-sys v0.14.0
       Compiling gobject-sys v0.14.0
       Compiling cairo-sys-rs v0.14.9
       Compiling gio-sys v0.14.0
       Compiling poppler-sys-rs v0.18.0 (/home/amos/bearcove/poppler-rs/sys)
       Compiling glib v0.14.8
       Compiling cairo-rs v0.14.9
       Compiling poppler-rs v0.18.2 (/home/amos/bearcove/poppler-rs/poppler)
        Finished dev [unoptimized + debuginfo] target(s) in 7.70s

As promised, it has “virtually no external dependencies”:

    $ ldd ./target/debug/pdftocairo
        linux-vdso.so.1 (0x00007ffe81bfe000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f6350a45000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f6350a2b000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f6350821000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f6351d44000)

And also, deliciously chonky:

    $ ls -lhA ./target/debug/pdftocairo
    -rwxr-xr-x. 2 amos amos 73M Nov 26 14:31 ./target/debug/pdftocairo

(It was 48M when using dynamic linking).

Although, we have ways to make it smaller. By compressing debug sections:

    $ objcopy --compress-debug-sections ./target/debug/pdftocairo /tmp/pdftocairo.compressed-debug
    $ ls -lhA /tmp/pdftocairo.compressed-debug
    -rwxr-xr-x. 1 amos amos 35M Nov 26 14:33 /tmp/pdftocairo.compressed-debug

(That variant was 16M with dynamic linking).

Or stripping debug info altogether:

    $ objcopy --strip-all ./target/debug/pdftocairo /tmp/pdftocairo.stripped
    $ ls -lhA /tmp/pdftocairo.stripped
    -rwxr-xr-x. 1 amos amos 19M Nov 26 14:33 /tmp/pdftocairo.stripped

And that was just 5.3M with dynamic linking. So yeah! All that code isn’t free, but we have a freaking PDF to SVG converter in a single, relatively self-contained binary now!
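To put those numbers side by side, here’s a tiny worked comparison. The sizes are the ones reported by `ls -lhA` above (so they’re already rounded); the `growth` helper is just a name picked for this example.

```rust
/// How many times bigger the static build is than the dynamic one.
fn growth(dynamic_mb: f64, static_mb: f64) -> f64 {
    static_mb / dynamic_mb
}

fn main() {
    // (variant, dynamically linked size, statically linked size), in "M" as shown by ls
    let sizes = [
        ("unstripped", 48.0, 73.0),
        ("compressed debug sections", 16.0, 35.0),
        ("stripped", 5.3, 19.0),
    ];
    for (label, dynamic, stat) in sizes {
        println!(
            "{label:<26} {dynamic:>5.1}M -> {stat:>5.1}M  (x{:.1}, +{:.1}M)",
            growth(dynamic, stat),
            stat - dynamic,
        );
    }
}
```

That’s roughly 1.5x, 2.2x and 3.6x respectively: the less debug info there is, the more of the binary is library code, so the relative cost of static linking shows up most in the stripped build.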

Running ./target/debug/pdftocairo yields the same result as before. Yay!

Cool Bear's hot tip

We talked earlier about having a couple options: another one is to patch the .pc files ourselves. But this isn’t great either, because system-deps defaults to dynamic linking and patching it the “proper” way would involve upstreaming patches to a bunch of crates.

Once again, the world as a whole has kinda given up on static linking, so we’re on our own here - unless we look into AppImage/snapcraft/flatpak.

What did we learn?

In the world of C/C++ libraries, as far as build systems go, it’s kind of a free-for-all. There are the old-school autoconf users, the great CMake / Meson divide, and then Bazel, Buck, and many others.

We made everything more complicated by really, really wanting a static build, which very few people care about, so we had to jump through more hoops than one would normally have to.

pkg-config is a tool that outputs the required compiler flags for compiling and linking against C/C++ libraries. It ships with Linux distributions and normally looks in a system prefix, like /usr/lib64/pkgconfig. But it also works with “custom prefixes” like the one we built.

Here too, static linking complicates things a little.
