A static poppler build: the easy way

This article is part of the Don't shell out! series.

So! Now our asset processing pipeline is almost complete. But we've just traded dependencies on CLI tools for dependencies on dynamic libraries:

Shell session
$ ldd ./target/debug/pdftocairo
        linux-vdso.so.1 (0x00007ffd615be000)
        libpoppler-glib.so.8 => /lib64/libpoppler-glib.so.8 (0x00007f2ba1bb4000)
        libgobject-2.0.so.0 => /lib64/libgobject-2.0.so.0 (0x00007f2ba1b59000)
        libglib-2.0.so.0 => /lib64/libglib-2.0.so.0 (0x00007f2ba1a1e000)
        libcairo.so.2 => /lib64/libcairo.so.2 (0x00007f2ba1902000)
        libcairo-gobject.so.2 => /lib64/libcairo-gobject.so.2 (0x00007f2ba18f6000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f2ba18dc000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f2ba17fe000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f2ba15f4000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f2ba216c000)
        libpoppler.so.112 => /lib64/libpoppler.so.112 (0x00007f2ba1288000)
        libfreetype.so.6 => /lib64/libfreetype.so.6 (0x00007f2ba11bd000)
        libgio-2.0.so.0 => /lib64/libgio-2.0.so.0 (0x00007f2ba0fe4000)
        libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007f2ba0dc5000)
        libffi.so.6 => /lib64/libffi.so.6 (0x00007f2ba0db8000)
        libpcre.so.1 => /lib64/libpcre.so.1 (0x00007f2ba0d40000)
        libpixman-1.so.0 => /lib64/libpixman-1.so.0 (0x00007f2ba0c94000)
        libfontconfig.so.1 => /lib64/libfontconfig.so.1 (0x00007f2ba0c45000)
        libpng16.so.16 => /lib64/libpng16.so.16 (0x00007f2ba0c0c000)
        libxcb-shm.so.0 => /lib64/libxcb-shm.so.0 (0x00007f2ba0c07000)
        libxcb.so.1 => /lib64/libxcb.so.1 (0x00007f2ba0bda000)
        libxcb-render.so.0 => /lib64/libxcb-render.so.0 (0x00007f2ba0bca000)
        libXrender.so.1 => /lib64/libXrender.so.1 (0x00007f2ba0bbd000)
        libX11.so.6 => /lib64/libX11.so.6 (0x00007f2ba0a75000)
        libXext.so.6 => /lib64/libXext.so.6 (0x00007f2ba0a60000)
        libz.so.1 => /lib64/libz.so.1 (0x00007f2ba0a46000)
        libjpeg.so.62 => /lib64/libjpeg.so.62 (0x00007f2ba09c2000)
        libopenjp2.so.7 => /lib64/libopenjp2.so.7 (0x00007f2ba0968000)
        liblcms2.so.2 => /lib64/liblcms2.so.2 (0x00007f2ba0903000)
        libtiff.so.5 => /lib64/libtiff.so.5 (0x00007f2ba087c000)
        libsmime3.so => /lib64/libsmime3.so (0x00007f2ba0850000)
        libnss3.so => /lib64/libnss3.so (0x00007f2ba0712000)
        libplc4.so => /lib64/libplc4.so (0x00007f2ba0709000)
        libnspr4.so => /lib64/libnspr4.so (0x00007f2ba06c6000)
        libbz2.so.1 => /lib64/libbz2.so.1 (0x00007f2ba06b3000)
        libharfbuzz.so.0 => /lib64/libharfbuzz.so.0 (0x00007f2ba05dd000)
        libbrotlidec.so.1 => /lib64/libbrotlidec.so.1 (0x00007f2ba05cf000)
        libgmodule-2.0.so.0 => /lib64/libgmodule-2.0.so.0 (0x00007f2ba05c8000)
        libmount.so.1 => /lib64/libmount.so.1 (0x00007f2ba0581000)
        libselinux.so.1 => /lib64/libselinux.so.1 (0x00007f2ba0556000)
        libxml2.so.2 => /lib64/libxml2.so.2 (0x00007f2ba03cd000)
        libXau.so.6 => /lib64/libXau.so.6 (0x00007f2ba03c7000)
        libwebp.so.7 => /lib64/libwebp.so.7 (0x00007f2ba0358000)
        libzstd.so.1 => /lib64/libzstd.so.1 (0x00007f2ba0260000)
        libjbig.so.2.1 => /lib64/libjbig.so.2.1 (0x00007f2ba0252000)
        libnssutil3.so => /lib64/libnssutil3.so (0x00007f2ba021f000)
        libplds4.so => /lib64/libplds4.so (0x00007f2ba021a000)
        libgraphite2.so.3 => /lib64/libgraphite2.so.3 (0x00007f2ba01f9000)
        libbrotlicommon.so.1 => /lib64/libbrotlicommon.so.1 (0x00007f2ba01d4000)
        libblkid.so.1 => /lib64/libblkid.so.1 (0x00007f2ba019c000)
        libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007f2ba0105000)
        liblzma.so.5 => /lib64/liblzma.so.5 (0x00007f2ba00d9000)

Whew, that's a LOT of dependencies.

It is. A lot of what's in there we don't strictly need: these are optional dependencies in the poppler dependency tree. We never need cairo to render to X11/xcb, for example, only to an SVG surface.

On Linux, this mostly means installing a lot of different packages just to get our binary running. And those packages are named differently depending on which Linux distribution you're using: I use Fedora, Ubuntu and ArchLinux on a regular basis. Their contents can differ too: some package maintainers make Executive Decisions, and that makes it really hard to ship cross-distro packages.

This is where folks who work on AppImage, snapcraft or flatpak would usually come in and pitch their solution to that problem.

These are all fine for various use cases, they each make their own compromises, and I'm familiar with them - I'm making an informed decision not to use them.

Similarly, if you're a Nix aficionado, I'm super happy for you, but go write your own articles! I'll retweet them, too! Today we're just learning how to build / link / distribute software for internal use.

On Windows, I'd have to be careful to distribute a bunch of .dll files alongside the .exe, which can get messy real fast if you have a ~/bin folder where you just throw all your CLI tools.

Anyway, we can address both problems by just making our own build of poppler as a static library.

Before we do, let's record the size of the pdftocairo binary, before and after stripping, when it's linked dynamically against poppler & friends:

Shell session
$ ls -lhA ./target/debug/pdftocairo
-rwxr-xr-x. 2 amos amos 48M Nov 25 12:12 ./target/debug/pdftocairo

$ objcopy --compress-debug-sections ./target/debug/pdftocairo /tmp/pdftocairo.compressed-debug
$ ls -lhA /tmp/pdftocairo.compressed-debug 
-rwxr-xr-x. 1 amos amos 16M Nov 25 13:03 /tmp/pdftocairo.compressed-debug

$ objcopy --strip-all ./target/debug/pdftocairo /tmp/pdftocairo.stripped
$ ls -lhA /tmp/pdftocairo.stripped                                      
-rwxr-xr-x. 1 amos amos 5.3M Nov 25 13:04 /tmp/pdftocairo.stripped

During my tenure as "chief fucking around and finding out officer", I've built software from sources "manually" a ton. If you somehow need to get good at this and have a month of free time, Linux From Scratch might be a nice thing to hyperfocus on.

For poppler, I determined that these were the dependencies I needed to build (in order):

One could argue harfbuzz is missing from that list, but things went fine without it. Because in this case, the input is the result of Chrome's "Print to PDF" feature, the text is already laid out, and so I don't think harfbuzz is actually needed here.

I made a bunch of bash scripts for this, which is one tiny step up from "typing everything into a terminal once".

They all have the same structure: we'll follow the script for pcre for example.

Bash
#!/bin/bash -eux

# ^ set some useful flags:
#   -e Exits immediately if a command fails. In a pipeline like `curl | tar`,
#      only the last command's status counts unless `pipefail` is also set.
#   -u Treats unset variables as errors rather than just an empty string.
#      Also a lifesaver.
#   -x Prints commands before running them. Useful to follow what's going on,
#      especially for commands that don't print anything.

# Absolute path to this script. /home/user/bin/foo.sh
SCRIPT=$(readlink -f "$0")
# Absolute path this script is in. /home/user/bin
SCRIPTPATH=$(dirname "$SCRIPT")

# This is where headers, libraries, pkg-config/Gir/docs files will be installed
export PREFIX=${SCRIPTPATH}/prefix
mkdir -p ${PREFIX}

# Some of these servers are _really slow_, it helps to be able to skip the
# download when working on those scripts. Downloading can be skipped by
# exporting `SKIP_DL=1`
#
# The `${FOOBAR:-otherwise}` expression evaluates to "otherwise" if the
# `FOOBAR` variable is unset (or empty), which keeps `-u` from aborting here.
if [[ "${SKIP_DL:-unset}" == "1" ]]; then
  echo "Skipping download..."
else
  rm -rf sources/pcre
  mkdir -p sources/pcre

  # Pipe curl into tar to extract on the fly (possible with tarballs, not so easy
  # with zips)
  #
  # curl's -f (--fail) flag will fail if it gets a non-2xx status code.
  # Unless it's a redirect (3xx), in which case -L (--location) will follow
  # the redirect.
  #
  # tar's `x` extracts, `j` is for `.bz2`, and `--strip-components` extracts
  # `foobar-1.2.3/blah` as `blah` instead. `-C` "changes directory", essentially
  # specifying the destination.
  curl -f -L https://sourceforge.net/projects/pcre/files/pcre/8.45/pcre-8.45.tar.bz2/download | tar xj -C sources/pcre --strip-components 1
fi

# Always re-build from scratch. Incremental rebuilds are always messy,
# especially when reconfiguring, I'd rather not risk it.
rm -rf builds/pcre
mkdir -p builds/pcre

# Like `cd`, but we can `popd` later to get out of it
pushd builds/pcre
# See later discussion
export CFLAGS="-fPIC"
# Look mom, it's autotools!
# --prefix specifies "where to install stuff"
# I'm doing this from Fedora, which uses `${PREFIX}/lib64` rather than
# Debian's `${PREFIX}/lib/${TRIPLET}`; for some reason --libdir is needed.
# --disable-shared says not to build shared/dynamic libraries (.so), and
# --enable-static says to build static libraries (.a).
../../sources/pcre/configure \
  --prefix=${PREFIX} \
  --libdir=${PREFIX}/lib64 \
  --disable-shared \
  --enable-static \

# Build all that code, using all available processors/cores/hyperthreads
make -j $(nproc)
# Install all that
make install

# Kinda unnecessary since all of this is happening in a sub-shell that
# exits immediately after, but the symmetry is nice.
popd

So that's with autotools! Not too bad, all things considered.

That reminds me of a tweet by Tim Martin:

I saw a book entitled "Die GNU Autotools" and I thought "My feelings exactly". Turns out the book was in German.

The one surprising thing is this line:

Bash
export CFLAGS="-fPIC"

And that line exists because... the default Rust toolchain tries to make position-independent executables. We discussed this in Position-independent code, but back then we were hyperfocused on learning about ELF. Here we're just hitting it in the real world!

Position-independent code is generally a good thing, but with most C build systems / compilers, you have to explicitly opt into it, unless you're building shared objects (.so). Since we're building static library archives here (.a), we have to tell the C compiler that we intend for that code to eventually be linked into a position-independent executable (a PIE).

As a result, it'll use a different set of relocations that are compatible with position-independent executables. If we forget that part, we'll just get an error at link-time, where the linker will be like "uhh yeah I can't make a PIE with those relocations". It's not as bad as silently building an executable that panics at runtime.

And autotools is relatively hands-off here: it's mostly concerned with making sure it doesn't have to enable a hundred workarounds, that are only needed if you're building for HP-UX or Irix or something.

I mean, technically libtool has a thing where it'll generate both PIC and non-PIC objects, and there's the --with-pic option but eh... I don't super trust it.

Other build systems have other ways to specify that you want position-independent code!

Here's a cmake example:

Bash
# (most of the file is skipped, it looks a lot like the previous example)

# Configure the build.
# -S sets the source folder, -B sets the build folder,
# -D specifies an option. "OFF" and "ON" are for features, "True" and "False"
# are for booleans. We disable most features.
cmake \
  -S sources/poppler \
  -B builds/poppler \
  -DCMAKE_INSTALL_PREFIX=${PREFIX} \
  -DCMAKE_BUILD_TYPE=Release \
  -DCMAKE_POSITION_INDEPENDENT_CODE=True \
  -DBUILD_GTK_TESTS=OFF \
  -DBUILD_QT5_TESTS=OFF \
  -DBUILD_QT6_TESTS=OFF \
  -DBUILD_CPP_TESTS=OFF \
  -DBUILD_MANUAL_TESTS=OFF \
  -DENABLE_BOOST=OFF \
  -DENABLE_UTILS=OFF \
  -DENABLE_CPP=OFF \
  -DENABLE_GLIB=ON \
  -DENABLE_GOBJECT_INTROSPECTION=OFF \
  -DENABLE_GTK_DOC=OFF \
  -DENABLE_QT5=OFF \
  -DENABLE_QT6=OFF \
  -DENABLE_LIBOPENJPEG=none \
  -DENABLE_CMS=none \
  -DENABLE_DCTDECODER=none \
  -DENABLE_LIBCURL=OFF \
  -DENABLE_ZLIB=OFF \
  -DBUILD_SHARED_LIBS=OFF \
  -DRUN_GPERF_IF_PRESENT=OFF \
  ;

# Build using aaaaall available processors/cores/hyperthreads
cmake --build builds/poppler --parallel $(nproc)
# Install
cmake --install builds/poppler

The position-independent flag here is -DCMAKE_POSITION_INDEPENDENT_CODE=True.

Here's an interesting distinction between autotools and CMake: although they both have this "configure" stage, where they generate some more files that will be required to build, CMake actually provides a --build (and an --install) option!

Whereas autotools had us running make by hand. Which is fine on Linux, macOS, or even MinGW on Windows. But on Windows, I want to use MSVC. Some projects ship a .sln (Visual Studio Solution) file directly, but these are not entirely portable across Visual Studio versions, even though they do their best to "upgrade" it when you open it.

So, if you were wondering why in the world people bothered with systems like CMake, that's why: not everything is Linux, not every C compiler is GCC, not every build system is GNU make. It's super useful to have a tool that drives the whole process, and lets you use MSVC as a C compiler, or ninja as a build system, for example.

And Meson is another one of these!

Usage is really similar to CMake, here's my build script for freetype:

Bash
# (again, most of the file is not shown because it just downloads and extracts
# the sources)

meson setup \
  builds/freetype \
  sources/freetype \
  --prefix ${PREFIX} \
  --default-library static \
  -D b_staticpic=true \
  -D b_pie=true \
  -D brotli=disabled \
  -D bzip2=disabled \
  -D harfbuzz=disabled \
  -D png=disabled \
  -D tests=disabled \
  -D zlib=enabled \
  ;
meson compile -C builds/freetype
meson install -C builds/freetype

As you can see, there's also a standard way to enable/disable features (like brotli, bzip2, etc.), pick the installation prefix (--prefix), pick the kind of library to build (--default-library), and some built-in options to enable position-independent code (b_staticpic and b_pie: pretty sure the latter is not needed here, but ah well).

And so, having written all my build scripts, and an all.sh script to call them all in order, I'm ready to make my static build. Here's an asciinema of it: it completes in a little over a minute on a Ryzen 5950X with 128GB of RAM and a decent NVMe SSD (on Fedora 35 inside VMware Workstation Player with a Windows 11 host):

I find it really fun to watch it go! You can see some colors around the Meson and CMake parts. It's so nice!

In the end, we have 150MB's worth of libraries in our prefix:

Shell session
$ du -hd0 prefix
151M    prefix

$ ls prefix/lib64 
glib-2.0             libcairo.la  libfontconfig.a  libgmodule-2.0.a  libpcrecpp.a    libpcreposix.la  libpng16.la   libpoppler-glib.a
libcairo.a           libexpat.a   libfreetype.a    libgobject-2.0.a  libpcrecpp.la   libpixman-1.a    libpng.a      libz.a
libcairo-gobject.a   libffi.a     libgio-2.0.a     libgthread-2.0.a  libpcre.la      libpixman-1.la   libpng.la     pkgconfig
libcairo-gobject.la  libffi.la    libglib-2.0.a    libpcre.a         libpcreposix.a  libpng16.a       libpoppler.a

How do we link against this? Thanks to pkg-config!

The Fedora package for poppler-glib installed /usr/lib64/pkgconfig/poppler-glib.pc, which contains all the libraries and flags required to use it:

pkg-config
prefix=/usr
libdir=/usr/lib64
includedir=/usr/include

Name: poppler-glib
Description: GLib wrapper for poppler
Version: 21.08.0
Requires: glib-2.0 >= 2.56 gobject-2.0 >= 2.56 cairo >= 1.10.0 
Requires.private: poppler = 21.08.0

Libs: -L${libdir} -lpoppler-glib
Cflags: -I${includedir}/poppler/glib

And now that we've built and installed poppler-glib into our own prefix, we have our own .pc file for it in /home/amos/bearcove/poppler-build/prefix/lib64/pkgconfig:

pkg-config
prefix=/home/amos/bearcove/poppler-build/prefix
libdir=/home/amos/bearcove/poppler-build/prefix/lib64
includedir=/home/amos/bearcove/poppler-build/prefix/include

Name: poppler-glib
Description: GLib wrapper for poppler
Version: 21.11.0
Requires: glib-2.0 >= 2.56 gobject-2.0 >= 2.56 cairo >= 1.10.0 
Requires.private: poppler = 21.11.0

Libs: -L${libdir} -lpoppler-glib
Cflags: -I${includedir}/poppler/glib

(Note that the version we built is more recent than the version Fedora 35 ships!)

pkg-config is relatively straightforward to use "manually". To show it off, I made a sample C program:

C code
#include <glib/poppler.h>
#include <glib/gerror.h>
#include <stdio.h>
#include <stdlib.h>

int main() {
  GError *error = NULL;
  PopplerDocument *doc = poppler_document_new_from_file("file:///tmp/export.pdf", NULL, &error);
  if (error) {
    printf("got GError: %s\n", error->message);
    g_clear_error(&error);
    exit(1);
  }
  printf("doc = %p\n", doc);
}

Trying to compile it with gcc the naive way doesn't work, because include paths are not set:

Shell session
$ gcc sample.c -o sample
sample.c:1:10: fatal error: glib/poppler.h: No such file or directory
    1 | #include <glib/poppler.h>
      |          ^~~~~~~~~~~~~~~~
compilation terminated.

But even setting those include paths isn't enough:

Shell session
$ gcc -I ~/bearcove/poppler-build/prefix/include/glib-2.0 -I ~/bearcove/poppler-build/prefix/include/poppler -I ~/bearcove/poppler-build/prefix/lib64/glib-2.0/include -I ~/bearcove/poppler-build/prefix/include/cairo sample.c -o sample 
/usr/bin/ld: /tmp/cc09zSS1.o: in function `main':
sample.c:(.text+0x22): undefined reference to `poppler_document_new_from_file'
/usr/bin/ld: sample.c:(.text+0x55): undefined reference to `g_clear_error'
collect2: error: ld returned 1 exit status

...because now we're missing the libraries. And this is what pkg-config solves!

Shell session
$ pkg-config --cflags glib-2.0

-I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include -I/usr/include/sysprof-4 -pthread

But this picks up the system / distro-installed .pc files. If we want to use the prefix we just made, we need to set the PKG_CONFIG_PATH environment variable.

Shell session
$ PKG_CONFIG_PATH=/home/amos/bearcove/poppler-build/prefix/lib64/pkgconfig \
pkg-config --cflags glib-2.0 

-I/home/amos/bearcove/poppler-build/prefix/include/glib-2.0 -pthread -I/home/amos/bearcove/poppler-build/prefix/lib64/glib-2.0/include -I/home/amos/bearcove/poppler-build/prefix/include -DPCRE_STATIC

I've just shown the --cflags flag, which is necessary for compiling C files (it sets defines and include paths, mostly), and then there's --libs, which is necessary to link everything together.

Here's what linking dynamically (with the system packages) would look like:

Shell session
$ pkg-config --libs glib-2.0 

-lglib-2.0

And now statically (with our own prefix):

Shell session
$ PKG_CONFIG_PATH=/home/amos/bearcove/poppler-build/prefix/lib64/pkgconfig \
pkg-config --libs glib-2.0 

-L/home/amos/bearcove/poppler-build/prefix/lib64 -lglib-2.0 -pthread -lm -lpcre

Because we're doing static linking here, we also probably want to use the --static flag, which tells pkg-config to "be more aggressive when computing dependency graph (for static linking)". And because we don't want to accidentally end up dynamically linking against a system library, we can specify --env-only, which tells pkg-config to "look only for package entries in PKG_CONFIG_PATH".

This doesn't actually make a difference for glib-2.0:

Shell session
$ PKG_CONFIG_PATH=/home/amos/bearcove/poppler-build/prefix/lib64/pkgconfig \
pkg-config --env-only --static --libs glib-2.0

-L/home/amos/bearcove/poppler-build/prefix/lib64 -lglib-2.0 -pthread -lm -lpcre

But it does for cairo, for example!

Shell session
$ PKG_CONFIG_PATH=/home/amos/bearcove/poppler-build/prefix/lib64/pkgconfig \
pkg-config --libs cairo

-L/home/amos/bearcove/poppler-build/prefix/lib64 -lcairo

Without --static, it only bothers specifying -lcairo, I'm guessing because it assumes by default we're linking dynamically, and libcairo.so itself depends on other libraries:

Shell session
$ ldd /usr/lib64/libcairo.so | grep -E 'glib|pixman|pcre|png|freetype'                                                
        libpixman-1.so.0 => /lib64/libpixman-1.so.0 (0x00007f551e8cb000)
        libfreetype.so.6 => /lib64/libfreetype.so.6 (0x00007f551e7b1000)
        libpng16.so.16 => /lib64/libpng16.so.16 (0x00007f551e778000)
        libglib-2.0.so.0 => /lib64/libglib-2.0.so.0 (0x00007f551dedb000)
        libpcre.so.1 => /lib64/libpcre.so.1 (0x00007f551de1f000)

There's no such concept with .a archives (static libraries). They're just a bunch of .o files:

Shell session
$ ar tv /home/amos/bearcove/poppler-build/prefix/lib64/libpoppler-glib.a
rw-r--r-- 1000/1000  16912 Nov 25 14:35 2021 poppler-action.cc.o
rw-r--r-- 1000/1000   1992 Nov 25 14:35 2021 poppler-date.cc.o
rw-r--r-- 1000/1000 116744 Nov 25 14:35 2021 poppler-document.cc.o
rw-r--r-- 1000/1000  64688 Nov 25 14:35 2021 poppler-page.cc.o
rw-r--r-- 1000/1000  10496 Nov 25 14:35 2021 poppler-attachment.cc.o
rw-r--r-- 1000/1000  25032 Nov 25 14:35 2021 poppler-form-field.cc.o
rw-r--r-- 1000/1000  65024 Nov 25 14:35 2021 poppler-annot.cc.o
rw-r--r-- 1000/1000   7664 Nov 25 14:35 2021 poppler-layer.cc.o
rw-r--r-- 1000/1000  10024 Nov 25 14:35 2021 poppler-movie.cc.o
rw-r--r-- 1000/1000  11624 Nov 25 14:35 2021 poppler-media.cc.o
rw-r--r-- 1000/1000   3032 Nov 25 14:35 2021 poppler.cc.o
rw-r--r-- 1000/1000   6168 Nov 25 14:35 2021 poppler-cached-file-loader.cc.o
rw-r--r-- 1000/1000  14136 Nov 25 14:35 2021 poppler-input-stream.cc.o
rw-r--r-- 1000/1000  81248 Nov 25 14:35 2021 poppler-structure-element.cc.o
rw-r--r-- 1000/1000  72280 Nov 25 14:35 2021 poppler-enums.c.o
rw-r--r-- 1000/1000  29064 Nov 25 14:35 2021 CairoFontEngine.cc.o
rw-r--r-- 1000/1000 142272 Nov 25 14:35 2021 CairoOutputDev.cc.o
rw-r--r-- 1000/1000   7768 Nov 25 14:35 2021 CairoRescaleBox.cc.o
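
To make "just a bunch of .o files" a bit more concrete, here's a rough sketch that lists the members of an archive by hand. It assumes the GNU flavor of the format (what binutils writes on Linux): an 8-byte magic, then 60-byte text headers, with long names stored in a `//` table. BSD-style `#1/` names and thin archives are not handled, and the names `list_members` and `member_name` are made up for this example:

```rust
const MAGIC: &[u8] = b"!<arch>\n";
const HEADER_LEN: usize = 60;

/// Reads one fixed-width, space-padded text field out of a member header.
fn field(header: &[u8], range: std::ops::Range<usize>) -> Option<&str> {
    std::str::from_utf8(&header[range]).ok().map(|s| s.trim_end())
}

/// Resolves a header name. Returns None for special members (such as the
/// "/" symbol table), which the caller skips.
fn member_name(name: &str, long_names: &[u8]) -> Option<String> {
    match name.strip_prefix('/') {
        // "/123" means: the name starts at byte 123 of the "//" table
        // and runs up to the next "/\n".
        Some(offset) => {
            let start: usize = offset.parse().ok()?;
            let tail = long_names.get(start..)?;
            let end = tail.iter().position(|&b| b == b'\n')?;
            let raw = std::str::from_utf8(&tail[..end]).ok()?;
            Some(raw.trim_end_matches('/').to_string())
        }
        // Short names are stored inline, terminated by "/".
        None => Some(name.trim_end_matches('/').to_string()),
    }
}

/// Lists (name, size) for each regular member, or None if the data
/// does not look like an ar archive.
fn list_members(data: &[u8]) -> Option<Vec<(String, usize)>> {
    let mut rest = data.strip_prefix(MAGIC)?;
    let mut long_names: &[u8] = &[];
    let mut members = Vec::new();
    while rest.len() >= HEADER_LEN {
        let (header, after) = rest.split_at(HEADER_LEN);
        if !header.ends_with(b"`\n") {
            return None;
        }
        let name = field(header, 0..16)?;
        let size: usize = field(header, 48..58)?.parse().ok()?;
        let body = after.get(..size)?;
        if name == "//" {
            long_names = body;
        } else if let Some(resolved) = member_name(name, long_names) {
            members.push((resolved, size));
        }
        // Member bodies are padded to an even number of bytes.
        let padded = size + (size % 2);
        rest = after.get(padded..).unwrap_or(&[]);
    }
    Some(members)
}

fn main() {
    match std::env::args().nth(1) {
        None => eprintln!("usage: pass the path to a .a file"),
        Some(path) => match std::fs::read(&path) {
            Ok(data) => match list_members(&data) {
                Some(members) => {
                    for (name, size) in members {
                        println!("{:>8} {}", size, name);
                    }
                }
                None => eprintln!("{} does not look like an ar archive", path),
            },
            Err(e) => eprintln!("cannot read {}: {}", path, e),
        },
    }
}
```

There's no dependency information anywhere in there: just names, sizes and bytes. That's why something else (here, pkg-config) has to remember what a static library needs.
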
Cool bear's hot tip

"GNU ar" is most probably named for "archive", and it's part of binutils. The t flag prints a "lisT" of entries, and v is "Verbose".

So, when we specify --static (and --env-only for good measure), pkg-config lists dependencies explicitly:

Shell session
$ PKG_CONFIG_PATH=/home/amos/bearcove/poppler-build/prefix/lib64/pkgconfig \
pkg-config --libs --env-only --static cairo

-L/home/amos/bearcove/poppler-build/prefix/lib64 -lcairo -L/home/amos/bearcove/poppler-build/prefix/lib64 -lgobject-2.0 -L/home/amos/bearcove/poppler-build/prefix/lib64 -L/home/amos/bearcove/poppler-build/prefix/lib64 -L/home/amos/bearcove/poppler-build/prefix/lib64/../lib64 -lffi -lglib-2.0 -pthread -lm -lpcre -lpixman-1 -lfreetype -lz -lpng16 -lm -lm -lz
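
Since this series is about Rust, here's a small sketch of issuing that same query with `std::process::Command` instead of typing it into a shell. The helper name `static_pkg_config` is made up, and the relative `prefix/lib64/pkgconfig` path in `main` is an assumption about where you run it from:

```rust
use std::path::Path;
use std::process::Command;

/// Builds (but does not run) a `pkg-config` invocation that only looks at
/// `.pc` files in `pkgconfig_dir` and asks for static-linking flags.
fn static_pkg_config(pkgconfig_dir: &Path, packages: &[&str]) -> Command {
    let mut cmd = Command::new("pkg-config");
    // Same effect as prefixing the shell command with PKG_CONFIG_PATH=...
    cmd.env("PKG_CONFIG_PATH", pkgconfig_dir);
    // --env-only: ignore the system's .pc files
    // --static:   also list private dependencies
    cmd.args(["--env-only", "--static", "--libs"]);
    cmd.args(packages);
    cmd
}

fn main() {
    // Assumed location: run this from the directory that contains `prefix`.
    let dir = Path::new("prefix/lib64/pkgconfig");
    let mut cmd = static_pkg_config(dir, &["cairo"]);
    match cmd.output() {
        Ok(out) if out.status.success() => {
            print!("{}", String::from_utf8_lossy(&out.stdout));
        }
        Ok(out) => {
            eprintln!("pkg-config failed: {}", String::from_utf8_lossy(&out.stderr));
        }
        Err(e) => eprintln!("could not run pkg-config: {}", e),
    }
}
```

Setting the variable on the `Command` rather than on the whole process keeps it from leaking into anything else the program spawns.
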

So, to link our sample C program statically against our libraries, all we need to do is grab the output of pkg-config --cflags and pkg-config --libs and use it for the appropriate build stages:

Bash
set -eux
export PKG_CONFIG_PATH=/home/amos/bearcove/poppler-build/prefix/lib64/pkgconfig
PKGS="poppler-glib gio-2.0 fontconfig libpng"
CFLAGS=$(pkg-config --static --cflags ${PKGS})
LIBS=$(pkg-config --static --libs ${PKGS})
gcc ${CFLAGS} -c sample.c
g++ sample.o -o sample ${LIBS} -static-libstdc++

The compile step ends up looking like this:

Bash
gcc -I${PREFIX}/include/poppler/glib -I${PREFIX}/include/glib-2.0 -I${PREFIX}/lib64/glib-2.0/include -I${PREFIX}/include -I${PREFIX}/include/cairo -I${PREFIX}/include/pixman-1 -I${PREFIX}/include/freetype2 -I${PREFIX}/include/libpng16 -I${PREFIX}/include/poppler -DPCRE_STATIC -pthread -c sample.c

And the link step, like this:

Bash
g++ sample.o -o sample -L${PREFIX}/lib64 -lpoppler-glib -lgobject-2.0 -lglib-2.0 -pthread -lm -lpcre -L${PREFIX}/lib64/../lib64 -lcairo -L${PREFIX}/lib64 -L${PREFIX}/lib64 -L${PREFIX}/lib64 -lm -lpixman-1 -lz -lm -L${PREFIX}/lib64 -lpoppler -lgio-2.0 -lgobject-2.0 -lffi -lgmodule-2.0 -lglib-2.0 -lm -lpcre -lfontconfig -pthread -lexpat -lfreetype -lz -lpng16 -lm -lm -lz -static-libstdc++

(With the actual prefix replaced with ${PREFIX} for readability).

Specifying poppler-glib wasn't enough in this case: adding gio-2.0, fontconfig and libpng was still needed. This feels like a bug in .pc files, but then again, static linking is starting to feel like a lost art: mainstream distributions like Ubuntu and Fedora are definitely focused on dynamic linking above all else.

Here's another cool tip: pkg-config can generate a graphviz (.dot) file to graph dependencies:

Shell session
$ PKG_CONFIG_DEBUG_SPEW=1 \
PKG_CONFIG_PATH=/home/amos/bearcove/poppler-build/prefix/lib64/pkgconfig \
pkg-config --static --env-only --digraph --exists --errors-to-stdout poppler-glib gio-2.0 fontconfig libpng > /tmp/deps.dot

$ dot -Tsvg /tmp/deps.dot > /tmp/deps.svg

It seems cairo has basically no dependencies in that graph, whereas in reality, it definitely has a bunch:

Shell session
$ ldd /usr/lib64/libcairo.so | grep -E 'fontconfig|freetype|pixman|png'
     libpixman-1.so.0 => /lib64/libpixman-1.so.0 (0x00007f4777ecb000)
     libfontconfig.so.1 => /lib64/libfontconfig.so.1 (0x00007f4777e7c000)
     libfreetype.so.6 => /lib64/libfreetype.so.6 (0x00007f4777db1000)
     libpng16.so.16 => /lib64/libpng16.so.16 (0x00007f4777d78000)

Looking closer, it appears those are listed, but as "private" dependencies:

Shell session
$ cat /home/amos/bearcove/poppler-build/prefix/lib64/pkgconfig/cairo.pc
prefix=/home/amos/bearcove/poppler-build/prefix
exec_prefix=${prefix}
libdir=/home/amos/bearcove/poppler-build/prefix/lib64
includedir=${prefix}/include

Name: cairo
Description: Multi-platform 2D graphics library
Version: 1.16.0

Requires.private: gobject-2.0 glib-2.0 >= 2.14      pixman-1 >= 0.30.0  freetype2 >= 9.7.3   libpng 
Libs: -L${libdir} -lcairo
Libs.private:          -lz  
Cflags: -I${includedir}/cairo

So maybe it's just the graph generation that's broken, which explains why we need to specify fontconfig by hand, but not freetype2.
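
For what it's worth, the public/private split is easy to poke at by hand. Here's a minimal sketch that pulls both lists out of a `.pc` file's text: `Requires:` entries are always followed, while `Requires.private:` entries only matter to `--libs` when `--static` is passed. The function names are made up, and variable expansion (`${prefix}` and friends) is deliberately ignored:

```rust
/// Extracts package names from the value of a `Requires:`-style field,
/// dropping version constraints such as `>= 2.14`.
fn package_names(value: &str) -> Vec<String> {
    let mut names = Vec::new();
    let mut tokens = value
        .split(|c: char| c.is_whitespace() || c == ',')
        .filter(|t| !t.is_empty());
    while let Some(tok) = tokens.next() {
        if matches!(tok, "=" | "<" | ">" | "<=" | ">=" | "!=") {
            // Skip the version number that follows a comparison operator.
            tokens.next();
        } else {
            names.push(tok.to_string());
        }
    }
    names
}

/// Returns the (public, private) requirements found in a .pc file's contents.
fn requires(pc: &str) -> (Vec<String>, Vec<String>) {
    let mut public = Vec::new();
    let mut private = Vec::new();
    for line in pc.lines() {
        if let Some(value) = line.strip_prefix("Requires.private:") {
            private.extend(package_names(value));
        } else if let Some(value) = line.strip_prefix("Requires:") {
            public.extend(package_names(value));
        }
    }
    (public, private)
}

fn main() {
    // The relevant lines of the cairo.pc shown above.
    let cairo_pc = "Name: cairo\n\
        Requires.private: gobject-2.0 glib-2.0 >= 2.14      pixman-1 >= 0.30.0  freetype2 >= 9.7.3   libpng \n\
        Libs: -L${libdir} -lcairo\n";
    let (public, private) = requires(cairo_pc);
    println!("public:  {:?}", public);
    println!("private: {:?}", private);
}
```
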

Anyway! Let's get back on track.

With all that effort, our sample C program has almost no dynamic dependencies:

Shell session
$ ldd ./sample
     linux-vdso.so.1 (0x00007ffc8e4c7000)
     libm.so.6 => /lib64/libm.so.6 (0x00007fcf437a8000)
     libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fcf4378e000)
     libc.so.6 => /lib64/libc.so.6 (0x00007fcf43584000)
     /lib64/ld-linux-x86-64.so.2 (0x00007fcf43897000)

Similar to another randomly-chosen Rust CLI program:

Shell session
$ ldd $(which sfz)
     linux-vdso.so.1 (0x00007ffceaba7000)
     libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fb569f8b000)
     libm.so.6 => /lib64/libm.so.6 (0x00007fb569eaf000)
     libc.so.6 => /lib64/libc.so.6 (0x00007fb569ca5000)
     /lib64/ld-linux-x86-64.so.2 (0x00007fb56a72c000)

The dependencies there are:

And the good news is: the -sys crates for various gnome/glib-adjacent libraries use pkg-config to know what to link against!

Let's look at what gtk-rs/gir generated for poppler-rs/sys/build.rs for example:

Rust code
// Generated by gir (https://github.com/gtk-rs/gir @ 8891a2f2c34b)
// from ../../gir-files (@ c6afb5857607)
// from ../gir-files (@ ec3e62ee546b)
// DO NOT EDIT

#[cfg(not(feature = "dox"))]
use std::process;

#[cfg(feature = "dox")]
fn main() {} // prevent linking libraries to avoid documentation failure

#[cfg(not(feature = "dox"))]
fn main() {
    if let Err(s) = system_deps::Config::new().probe() {
        println!("cargo:warning={}", s);
        process::exit(1);
    }
}

This uses the system-deps crate, whose README says it supports pkg-config dependencies declared in Cargo.toml.

Do we have anything there?

TOML markup
# in `poppler-rs/sys/Cargo.toml`

[package.metadata.system-deps.poppler_glib]
name = "poppler-glib"
version = "0.70"

[package.metadata.system-deps.poppler_glib.v0_72]
version = "0.72"

[package.metadata.system-deps.poppler_glib.v0_73]
version = "0.73"

# (more versions omitted...)

We do!

And so if we build with no particular options, we can spy on cargo to see what it executes:

Shell session
$ strace -ff -o /tmp/cargo-build -e 'execve' cargo build
(output omitted)
Cool bear's hot tip

strace tracks syscalls: in this case we want to track execve calls (executing another program), we want to "follow forks" (-ff), ie. spy on all processes created by the top-level cargo, such as compiled build scripts, and we want to output the logs, per-PID (process identifier), to /tmp/cargo-build.PID files.

Shell session
$ rg 'pkg-config' /tmp/cargo-build.*                    
/tmp/cargo-build.215066
1:execve("/home/amos/.cargo/bin/pkg-config", ["pkg-config", "--libs", "--cflags", "cairo", "cairo >= 1.14"], 0x55cfe93b5010 /* 97 vars */) = -1 ENOENT (No such file or directory)
2:execve("/home/amos/.local/bin/pkg-config", ["pkg-config", "--libs", "--cflags", "cairo", "cairo >= 1.14"], 0x55cfe93b5010 /* 97 vars */) = -1 ENOENT (No such file or directory)
3:execve("/home/amos/.fly/bin/pkg-config", ["pkg-config", "--libs", "--cflags", "cairo", "cairo >= 1.14"], 0x55cfe93b5010 /* 97 vars */) = -1 ENOENT (No such file or directory)
4:execve("/home/amos/.vscode-server/bin/ccbaa2d27e38e5afa3e5c21c1c7bef4657064247/bin/pkg-config", ["pkg-config", "--libs", "--cflags", "cairo", "cairo >= 1.14"], 0x55cfe93b5010 /* 97 vars */) = -1 ENOENT (No such file or directory)
5:execve("/home/amos/.local/bin/pkg-config", ["pkg-config", "--libs", "--cflags", "cairo", "cairo >= 1.14"], 0x55cfe93b5010 /* 97 vars */) = -1 ENOENT (No such file or directory)
6:execve("/home/amos/.fly/bin/pkg-config", ["pkg-config", "--libs", "--cflags", "cairo", "cairo >= 1.14"], 0x55cfe93b5010 /* 97 vars */) = -1 ENOENT (No such file or directory)
7:execve("/home/amos/.vscode-server/bin/ccbaa2d27e38e5afa3e5c21c1c7bef4657064247/bin/pkg-config", ["pkg-config", "--libs", "--cflags", "cairo", "cairo >= 1.14"], 0x55cfe93b5010 /* 97 vars */) = -1 ENOENT (No such file or directory)
8:execve("/home/amos/.cargo/bin/pkg-config", ["pkg-config", "--libs", "--cflags", "cairo", "cairo >= 1.14"], 0x55cfe93b5010 /* 97 vars */) = -1 ENOENT (No such file or directory)
9:execve("/usr/local/bin/pkg-config", ["pkg-config", "--libs", "--cflags", "cairo", "cairo >= 1.14"], 0x55cfe93b5010 /* 97 vars */) = -1 ENOENT (No such file or directory)
10:execve("/usr/bin/pkg-config", ["pkg-config", "--libs", "--cflags", "cairo", "cairo >= 1.14"], 0x55cfe93b5010 /* 97 vars */) = 0
12:execve("/usr/bin/x86_64-redhat-linux-gnu-pkg-config", ["/usr/bin/x86_64-redhat-linux-gnu"..., "--libs", "--cflags", "cairo", "cairo >= 1.14"], 0x55cc12d26980 /* 96 vars */) = 0

(many others omitted)
Cool bear's hot tip

rg is ripgrep — it's excellent.

Hilariously, we can see that system-deps ends up trying every possible path for pkg-config, starting with ~/.cargo/bin, then ~/.local/bin, then ~/.fly/bin, etc. It does so for every invocation.

There could probably be benefits in looking it up once and just using that path, if someone feels like doing an easy, low-impact-but-nice PR!
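
To give an idea of what "looking it up once" could look like, here's a sketch that walks `PATH` a single time and caches the answer in a `OnceLock`. This is not code from system-deps or any other crate: `find_in_dirs` and `pkg_config_path` are hypothetical names, and the check is simplified to "is there a file with that name" (it doesn't look at the executable bit):

```rust
use std::env;
use std::path::{Path, PathBuf};
use std::sync::OnceLock;

/// Returns the first `dir/name` that exists as a file, scanning `dirs` in order.
fn find_in_dirs<I>(name: &str, dirs: I) -> Option<PathBuf>
where
    I: IntoIterator<Item = PathBuf>,
{
    dirs.into_iter().map(|dir| dir.join(name)).find(|p| p.is_file())
}

/// Resolves `pkg-config` on PATH the first time it's called;
/// later calls reuse the cached answer instead of probing again.
fn pkg_config_path() -> Option<&'static Path> {
    static CACHED: OnceLock<Option<PathBuf>> = OnceLock::new();
    CACHED
        .get_or_init(|| {
            let path = env::var_os("PATH")?;
            find_in_dirs("pkg-config", env::split_paths(&path))
        })
        .as_deref()
}

fn main() {
    match pkg_config_path() {
        Some(path) => println!("pkg-config found at {}", path.display()),
        None => println!("pkg-config not found on PATH"),
    }
}
```

A caller would then pass that absolute path to `Command::new`, which skips the repeated probing we saw in the strace output.
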

So, because we know cargo ends up invoking pkg-config (and environment variables are inherited by child processes, unless that behavior is specifically disabled), we can just set PKG_CONFIG_PATH as before, and everything sh-

Shell session
$ PKG_CONFIG_PATH=~/bearcove/poppler-build/prefix/lib64/pkgconfig cargo build
   Compiling proc-macro2 v1.0.32
   Compiling unicode-xid v0.2.2
   Compiling syn v1.0.81
   Compiling serde v1.0.130
(cut)
   Compiling poppler-rs v0.18.2 (/home/amos/bearcove/poppler-rs/poppler)
error: linking with `cc` failed: exit status: 1
  |
  = note: "cc" "-m64" (cut.)
  = note: /usr/bin/ld: /home/amos/bearcove/poppler-build/prefix/lib64/libpoppler-glib.a(CairoOutputDev.cc.o): in function `CairoOutputDev::~CairoOutputDev()':
          CairoOutputDev.cc:(.text+0x19a): undefined reference to `operator delete(void*, unsigned long)'
          /usr/bin/ld: CairoOutputDev.cc:(.text+0x223): undefined reference to `TextPage::decRefCnt()'
          /usr/bin/ld: CairoOutputDev.cc:(.text+0x237): undefined reference to `ActualText::~ActualText()'
          /usr/bin/ld: CairoOutputDev.cc:(.text+0x244): undefined reference to `operator delete(void*, unsigned long)'

...everything is NOT working, because, remember, a bunch of packages are actually missing!

So we've got a couple of options here. One is to write our own build.rs script for pdftocairo, which is relatively simple to set up:

Rust code
// in `pdftocairo/build.rs`

fn main() {
    // poppler-glib requires poppler
    println!("cargo:rustc-link-lib=static=poppler");
    // poppler is written in C++
    println!("cargo:rustc-link-lib=static=stdc++");
    // nobody bothers including this in their pkg-config files apparently
    println!("cargo:rustc-link-lib=static=png");
    // cairo needs this
    println!("cargo:rustc-link-lib=static=freetype");
    // cairo/freetype need this?
    // the freetype ChangeLog says the dependency graph looks like:
    // cairo => fontconfig => freetype2 => harfbuzz => cairo
    println!("cargo:rustc-link-lib=static=fontconfig");
    // cairo also needs this
    println!("cargo:rustc-link-lib=static=pixman-1");
    // fontconfig needs this (it's an XML parser)
    println!("cargo:rustc-link-lib=expat");
}

And then the build succeeds!

Shell session
$ PKG_CONFIG_PATH=~/bearcove/poppler-build/prefix/lib64/pkgconfig cargo build
   Compiling pdftocairo v0.1.0 (/home/amos/bearcove/pdftocairo)
   Compiling glib-sys v0.14.0
   Compiling gobject-sys v0.14.0
   Compiling cairo-sys-rs v0.14.9
   Compiling gio-sys v0.14.0
   Compiling poppler-sys-rs v0.18.0 (/home/amos/bearcove/poppler-rs/sys)
   Compiling glib v0.14.8
   Compiling cairo-rs v0.14.9
   Compiling poppler-rs v0.18.2 (/home/amos/bearcove/poppler-rs/poppler)
    Finished dev [unoptimized + debuginfo] target(s) in 7.70s

As promised, it has "virtually no external dependencies":

Shell session
$ ldd ./target/debug/pdftocairo
        linux-vdso.so.1 (0x00007ffe81bfe000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f6350a45000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f6350a2b000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f6350821000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f6351d44000)

And also, deliciously chonky:

Shell session
$ ls -lhA ./target/debug/pdftocairo
-rwxr-xr-x. 2 amos amos 73M Nov 26 14:31 ./target/debug/pdftocairo

(It was 48M when using dynamic linking).

We do have ways to make it smaller, though. One is compressing debug sections:

Shell session
$ objcopy --compress-debug-sections ./target/debug/pdftocairo /tmp/pdftocairo.compressed-debug

$ ls -lhA /tmp/pdftocairo.compressed-debug
-rwxr-xr-x. 1 amos amos 35M Nov 26 14:33 /tmp/pdftocairo.compressed-debug

(That variant was 16M with dynamic linking).

Or stripping debug info altogether:

Shell session
$ objcopy --strip-all ./target/debug/pdftocairo /tmp/pdftocairo.stripped

$ ls -lhA /tmp/pdftocairo.stripped
-rwxr-xr-x. 1 amos amos 19M Nov 26 14:33 /tmp/pdftocairo.stripped

And that was just 5.3M with dynamic linking. So yeah! All that code isn't free, but we have a freaking PDF to SVG converter in a single, relatively self-contained binary now!

Running ./target/debug/pdftocairo yields the same result as before. Yay!

Cool bear's hot tip

We talked earlier about having a couple of options: the other one is to patch the .pc files ourselves. But this isn't great either, because system-deps defaults to dynamic linking, and fixing that the "proper" way would involve upstreaming patches to a bunch of crates.

Once again, the world as a whole has kinda given up on static linking, so we're on our own here - unless we look into AppImage/snapcraft/flatpak.

What did we learn?

In the world of C/C++ libraries, as far as build systems go, it's kind of a free-for-all. There are the old-school autoconf users, the great CMake / Meson divide, and then Bazel, Buck, and many others.

We made everything more complicated by really, really wanting a static build, which very few people care about, so we had to jump through more hoops than one would normally have to.

pkg-config is a tool that outputs the compiler and linker flags required to build against C/C++ libraries. It ships with Linux distributions and normally looks in a system prefix, like /usr/lib64/pkgconfig. But it also works with "custom prefixes", like the one we pointed PKG_CONFIG_PATH at.

Here too, static linking complicates things a little.
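To make the pkg-config part a bit more concrete, here's a minimal sketch of how a build script could turn `pkg-config --libs` output into cargo directives. It is an illustration under assumptions, not what system-deps actually does: `cargo_directives` is a made-up function, it only understands `-L<dir>` and `-l<name>` written as single tokens, and it applies the same static-or-dynamic choice to every library.

```rust
/// Convert linker flags, as printed by `pkg-config --libs`, into the lines a
/// build script would print for cargo. (Hypothetical helper for illustration.)
fn cargo_directives(libs_output: &str, link_statically: bool) -> Vec<String> {
    // cargo accepts `rustc-link-lib=static=NAME` to request static linking;
    // without a kind, the linker picks whatever it finds.
    let kind = if link_statically { "static=" } else { "" };

    libs_output
        .split_whitespace()
        .filter_map(|flag| {
            if let Some(dir) = flag.strip_prefix("-L") {
                // Library search directory
                Some(format!("cargo:rustc-link-search=native={}", dir))
            } else if let Some(lib) = flag.strip_prefix("-l") {
                // Library to link against
                Some(format!("cargo:rustc-link-lib={}{}", kind, lib))
            } else {
                // Anything else (e.g. `-pthread`) is ignored in this sketch
                None
            }
        })
        .collect()
}
```

Feeding it something like `-L/some/prefix/lib64 -lpoppler-glib` (a made-up path) with `link_statically` set to `true` yields a `rustc-link-search` line followed by a `rustc-link-lib=static=poppler-glib` line. What it can't do is invent the libraries the .pc files forgot to list, which is exactly why we ended up adding those by hand in build.rs.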

This article is part 3 of the Don't shell out! series.
