Productionizing our poppler build

👋 This page was last updated ~3 years ago. Just so you know.

I was a bit anxious about running our poppler meson build in CI, because it's the real test, you know? "Works on my machine" only goes so far, things have a tendency to break once you try to make them reproducible.

And I was right to worry... but not for the reasons I thought. As I tried to get everything to build in CI, a PyPI maintenance window prevented me from installing meson, and then SourceForge was acting up.

Apart from that, it was relatively smooth sailing? Let's run through the .circleci/config.yml file section by section.

YAML
version: 2.1

This opts into the newest (at the time of this writing) version of CircleCI configs, which allows specifying workflows and such.

YAML
orbs:
  win: circleci/windows@2.4.1
  aws-s3: circleci/aws-s3@3.0

Because we want to build on Windows, we'll need the circleci/windows orb, which gives us access to Windows executors. We'll also want to store artifacts on S3, and there's an orb for that too.

We've got three jobs: the Linux and Windows builds (running in parallel), and finally an upload job:

YAML
workflows:
  version: 2
  build:
    jobs:
      - x86_64-unknown-linux-gnu:
          context: [aws]
      - x86_64-pc-windows-msvc
      - upload:
          context: [aws]
          requires:
            - x86_64-unknown-linux-gnu
            - x86_64-pc-windows-msvc

Let's start with the Linux job, the most straightforward:

YAML
jobs:
  x86_64-unknown-linux-gnu:
    docker:
      - image: 391789101930.dkr.ecr.us-east-1.amazonaws.com/bearcove-meson:latest
    steps:
      - checkout
      - run: |
          meson setup build --buildtype release --default-library static --prefix /tmp/poppler-prefix
          meson compile -C build
          meson install -C build
          tar -czf poppler-prefix-x86_64-unknown-linux-gnu.tar.gz -C /tmp poppler-prefix
      - persist_to_workspace:
          root: .
          paths: ["poppler-prefix-x86_64-unknown-linux-gnu.tar.gz"]

Passing --buildtype is important (a debug build is almost 3x as large). The prefix is installed in /tmp and we generate a .tar.gz file where everything is prefixed by poppler-prefix/.

That build runs in a Docker container I built specifically for this purpose: mostly it has a recent Python 3, a C/C++ toolchain, Ninja, and meson. It uses the same base I described in My ideal Rust workflow; it's just a separate target:

Dockerfile
##############################################
FROM base AS meson
##############################################

# Install python, C & C++ compiler, and ninja
RUN set -eux; \
    apt update; \
    apt install --yes --no-install-recommends \
    #
    # Python package manager (for latest meson)
    python3-pip \
    #
    # C & C++ compiler
    gcc g++ \
    #
    # Ninja build tool
    ninja-build \
    ;

# Install meson
RUN set -eux; \
    pip install meson \
    ;

Now onto the Windows pipeline! That one's fun, because it involves MSVC (aka Visual Studio C++). CircleCI's Windows executors have some version of MSVC installed already, but meson apparently can't find it without a little help.

The best language to use there is probably PowerShell, which we've already dabbled in a little. The additional challenge is that... usually, to "set up" an MSVC command-line environment, you'd call a batch file (or "source" it?). But this is PowerShell.

Luckily, someone with a much better handle on Batch, PowerShell, and MSVC in general has solved that problem before, and so I was able to just steal, er, slightly adjust their work to get it working. It involves this wonderful bit of PowerShell:

PowerShell session
pushd 'C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Auxiliary\Build'
cmd /c "vcvarsall.bat x64&set" |
foreach {
  if ($_ -match "=") {
    $v = $_.split("="); set-item -force -path "ENV:\$($v[0])"  -value "$($v[1])"
  }
}
popd
write-host "`nVisual Studio Command Prompt variables set." -ForegroundColor Yellow

And here's the job definition itself:

YAML
x86_64-pc-windows-msvc:
  executor:
    name: win/default
    size: medium
  steps:
    - checkout
    - run: |
        pip install "meson==0.60.2"
        .\.circleci\call-vcvarsall.ps1
        meson setup build --vsenv --buildtype release --default-library static --prefix C:/poppler-prefix
        meson compile -C build
        .\msvc-static-install.ps1 build C:/poppler-prefix
        tar -czf poppler-prefix-x86_64-pc-windows-msvc.tar.gz -C C:/ --exclude etc/fonts ./poppler-prefix
    - persist_to_workspace:
        root: "."
        paths: ["poppler-prefix-x86_64-pc-windows-msvc.tar.gz"]

A couple of things here: --vsenv forces using MSVC (otherwise, if clang is detected, meson will default to it). --buildtype and --default-library, we've already seen. Windows ships with an honest-to-cthulhu tar now, so we don't need to mess with .zip, and that leaves...

Bear

...what's with the --exclude?

Yes, that. Well... apparently fontconfig installs symlinks:

Shell session
$ tree -ah etc | sed 's/\/home\/amos\/bearcove\/poppler-prefix/PREFIX/'
etc
└── [   32]  fonts
    ├── [  652]  conf.d
    │   ├── [   85]  10-hinting-slight.conf -> PREFIX/share/fontconfig/conf.avail/10-hinting-slight.conf
    │   ├── [   89]  10-scale-bitmap-fonts.conf -> PREFIX/share/fontconfig/conf.avail/10-scale-bitmap-fonts.conf
    │   ├── [   88]  11-lcdfilter-default.conf -> PREFIX/share/fontconfig/conf.avail/11-lcdfilter-default.conf
    │   ├── [   88]  20-unhint-small-vera.conf -> PREFIX/share/fontconfig/conf.avail/20-unhint-small-vera.conf
    │   ├── [   85]  30-metric-aliases.conf -> PREFIX/share/fontconfig/conf.avail/30-metric-aliases.conf
    │   ├── [   79]  40-nonlatin.conf -> PREFIX/share/fontconfig/conf.avail/40-nonlatin.conf
    │   ├── [   78]  45-generic.conf -> PREFIX/share/fontconfig/conf.avail/45-generic.conf
    │   ├── [   76]  45-latin.conf -> PREFIX/share/fontconfig/conf.avail/45-latin.conf
    │   ├── [   80]  49-sansserif.conf -> PREFIX/share/fontconfig/conf.avail/49-sansserif.conf
    │   ├── [   75]  50-user.conf -> PREFIX/share/fontconfig/conf.avail/50-user.conf
    │   ├── [   76]  51-local.conf -> PREFIX/share/fontconfig/conf.avail/51-local.conf
    │   ├── [   78]  60-generic.conf -> PREFIX/share/fontconfig/conf.avail/60-generic.conf
    │   ├── [   76]  60-latin.conf -> PREFIX/share/fontconfig/conf.avail/60-latin.conf
    │   ├── [   84]  65-fonts-persian.conf -> PREFIX/share/fontconfig/conf.avail/65-fonts-persian.conf
    │   ├── [   79]  65-nonlatin.conf -> PREFIX/share/fontconfig/conf.avail/65-nonlatin.conf
    │   ├── [   78]  69-unifont.conf -> PREFIX/share/fontconfig/conf.avail/69-unifont.conf
    │   ├── [   80]  80-delicious.conf -> PREFIX/share/fontconfig/conf.avail/80-delicious.conf
    │   ├── [   80]  90-synthetic.conf -> PREFIX/share/fontconfig/conf.avail/90-synthetic.conf
    │   └── [ 1009]  README
    └── [ 2.7K]  fonts.conf

2 directories, 20 files

(They're absolute, too! I guess they don't expect people to copy prefixes across different computers). And it caused problems further down the line for me, which I won't get into right now to avoid spoilers.

So we're just not packing these files! I honestly doubt we're even using those fontconfig files for anything - by the time we deal with PDF files, they don't have any text left in them, just shapes.

Then we have the third job, which is boring. And boring is nice, in this case:

YAML
upload:
  docker:
    - image: "cimg/python:3.10"
  steps:
    - attach_workspace:
        at: /tmp/workspace
    - aws-s3/copy:
        from: /tmp/workspace/poppler-prefix-x86_64-unknown-linux-gnu.tar.gz
        to: s3://bearcove-binaries/poppler-prefix/$CIRCLE_SHA1/
    - aws-s3/copy:
        from: /tmp/workspace/poppler-prefix-x86_64-pc-windows-msvc.tar.gz
        to: s3://bearcove-binaries/poppler-prefix/$CIRCLE_SHA1/

And... tada!

The resulting archives are a little chonky, but I'm not about to argue over a fistful of megabytes.

Bear

So, are we all done? All good?

Amos

Oh no we're not...

Bear

You mean to say... this wasn't a theoretical exercise? We're actually going to use these?

Amos

Yes we are...

Actually using this static poppler build

Bear

Well, easy, right? Just make sure that, before building any crates that depend on poppler, we download and extract those prefix archives, and export PKG_CONFIG_PATH so that they're found and used. Right?

Well...

Bear

Oh no

The thing is...

Bear

Oh that's never a good sign.

Okay let's cut to the chase: yes, we can cobble CI pipelines together so that "it builds in CI". It's not that hard: we can add whatever we want as a shell script, so we can definitely run a curl/tar/export some vars and boom here we go.

But it's not enough.

Bear

SURE IT IS

No, it's not.

Bear

FREE ME

It's not! I want to be able to open those projects in VS Code and have them just build. The whole point of freeing ourselves of these dependencies is to not have to run a bunch of manual steps before we can be productive.

Bear

JUST DEVELOP ON LINUX, THEY HAVE PACKAGES

But even then! See, even today, I had to set up salvage again, which involved installing libavif-tools to provide the avifenc CLI tool, and... it was broken! It said something something ABI version mismatch I'm sad you can't have an avif file.

See, when you take on a dependency, it eventually becomes your prob-

Bear

BUT THEN YOU SWITCHED TO CAVIF AND IT WORKS NOW

Yes... and that was always the plan. But we're talking about SVG today.

Bear

IT'S BEEN A MONTH. POSSIBLY LONGER. WHAT DO YOU MEAN "TODAY"

And we're so close to the goal! All we need is a small build script and we'll be on our w-

Bear

NO. No. We don't "just" need "a small build script". Because there's the problem: our dependency tree looks like this right now, correct?

- salvage
  - poppler-rs
    - poppler-sys-rs
      - pkg-config-rs

(omitted: high-level and `-sys` crates for cairo, glib, etc.)

Well you're omitting a bunch of crates but su-

Bear

STAY ON TOPIC. It looks like this. And you know the build order there?

The, uh... the arrows go up?

Bear

PRECISELY. It'll build poppler-sys-rs first, which relies on pkg-config-rs to invoke pkg-config to find libpoppler-glib.a and friends.

And if you add a build script to salvage you know when it'll run?

Well it'll... ah. Uh.

Bear

EXACTLY.

It'll run after all the dependencies are already built. So exporting an environment variable from it would do sweet nothing. The build will have already failed, or picked up some dynamic libraries (if we're on Linux and we have development packages installed).

Well we could simply fork, uh, all the-

Bear

Oh you want to fork cairo-rs, cairo-sys-rs, glib-sys, glib, glib-macros, gobject-sys, and gio-sys?

When you put it like that... no, I don't really want to.

Bear

Right. So stop being a tryhard, cut your losses and QUIT IT WITH THE STATIC BUILDS.

Mh.

Mhhhhhhhhhhh. Unless...

Bear

long sigh

...unless we somehow patch pkg-config-rs.

Imagine we had a pkg-config-hack/Cargo.toml like this...

TOML markup
[package]
name = "pkg-config"
version = "0.3.24"
edition = "2021"

[dependencies]
async-compression = { version = "0.3.8", features = ["gzip", "tokio"] }
aws-config = "0.3.0"
aws-sdk-s3 = "0.3.0"
aws-types = "0.3.0"
camino = { version = "1.0.5", features = ["serde1"] }
futures = "0.3.17"
tokio = { version = "1.15.0", features = ["full"] }
tokio-tar = "0.3.0"
tokio-util = { version = "0.6.9", features = ["io"] }
walkdir = "2.3.2"
color-eyre = "0.5.11"
named-lock = "0.1.1"
serde = { version = "1.0.132", features = ["derive"] }
serde_json = "1.0.73"

# this allows re-exporting most of the upstream pkg-config. note that we cannot
# use a simple crates.io dep, we have to use a git dep because of how we wrap
# it.
# also, if glib-sys etc. end up bumping their version of pkg-config-rs, that one
# will have to be bumped too.
[dependencies.upstream]
package = "pkg-config"
git = "https://github.com/rust-lang/pkg-config-rs"
rev = "49a4ac189aafa365167c72e8e503565a7c2697c2"

Then, in our top-level Cargo workspace, or package (if we're not using a workspace), we could do something like this...

TOML markup
# in `salvage/Cargo.toml`

[dependencies]
poppler-rs = "0.18.2"
cairo-rs = { version = "0.14.9", features = ["svg"] }

[patch.crates-io]
pkg-config = { git = "https://github.com/bearcove/poppler-meson-crates", rev = "4f06bd6" }

And then it would replace the pkg-config crate with our own crate.

Bear

Fine, sure, okay. Super crimey but okay. Then what?

Well, then we have full control of what's happening. For starters, we need to re-export most of the pkg-config API as-is, except for the Config struct, because it's the one doing the actual lookup:

Rust code
// in `poppler-meson-crates/pkg-config-hack/src/lib.rs`

pub mod config;
pub use config::cargo_config;

mod prepare;

pub use upstream::{Error, Library};

pub struct Config {
    inner: upstream::Config,
}

impl Config {
    pub fn new() -> Self {
        Self {
            inner: upstream::Config::new(),
        }
    }

    pub fn atleast_version(&mut self, vers: &str) -> &mut Config {
        self.inner.atleast_version(vers);
        self
    }

    pub fn print_system_libs(&mut self, print: bool) -> &mut Config {
        self.inner.print_system_libs(print);
        self
    }

    pub fn cargo_metadata(&mut self, cargo_metadata: bool) -> &mut Config {
        self.inner.cargo_metadata(cargo_metadata);
        self
    }

    pub fn env_metadata(&mut self, env_metadata: bool) -> &mut Config {
        self.inner.env_metadata(env_metadata);
        self
    }

    pub fn statik(&mut self, statik: bool) -> &mut Config {
        self.inner.statik(statik);
        self
    }

    pub fn probe(&self, name: &str) -> Result<Library, Error> {
        println!("cargo:warning=Probing library {}", name);
        prepare::prepare_pkgconfig_prefix();

        let res = self.inner.probe(name)?;
        Ok(res)
    }
}
Bear

And?

And well, you can see in Config::probe, before we let the actual pkg-config crate call the pkg-config CLI utility, we call prepare.

And that one's a little complicated, but I'm sure y'all can make sense of it.

Rust code
// in `poppler-meson-crates/pkg-config-hack/src/prepare.rs`

use async_compression::tokio::bufread::GzipDecoder;
use aws_sdk_s3::Client;
use camino::{Utf8Path, Utf8PathBuf};
use color_eyre::{eyre::eyre, Report};
use futures::TryStreamExt;
use named_lock::NamedLock;
use serde::{Deserialize, Serialize};
use std::{env::temp_dir, io, sync::Once, time::Instant};
use tokio::io::BufReader;
use tokio_util::io::StreamReader;
use walkdir::WalkDir;

static COLOR_EYRE_ONCE: Once = Once::new();

pub fn prepare_pkgconfig_prefix() {
    // Make sure multiple build scripts don't try to prepare the prefix at the same time
    let lock = NamedLock::create("pkg-config-hack-prepare").unwrap();
    let before_lock = Instant::now();
    let _guard = lock.lock().unwrap();
    println!(
        "cargo:warning=Acquired lock after {:?}",
        before_lock.elapsed()
    );

    // Make color-eyre spit out useful errors
    COLOR_EYRE_ONCE.call_once(|| {
        std::env::set_var("RUST_LIB_BACKTRACE", "1");
        color_eyre::install().unwrap();
    });

    let pkg_config_path = download_and_extract_prefix().unwrap();
    println!("cargo:warning=Setting PKG_CONFIG_PATH={}", pkg_config_path);
    std::env::set_var("PKG_CONFIG_PATH", pkg_config_path);
}

const FORMAT_VERSION: u64 = 1;
const POPPLER_MESON_VERSION: &str = "8c4735aba88cfa81bf43a6648f6864862dfa495c";

/// Downloads a prefix containing a static build of poppler and its dependencies
/// (cairo, glib, etc.), and returns the absolute path to a pkg-config path.
#[tokio::main]
async fn download_and_extract_prefix() -> Result<Utf8PathBuf, Report> {
    // (Note: we're using #[tokio::main] to start an async executor just for
    // this function, since the AWS S3 SDK requires one.)
    let temp_dir = Utf8PathBuf::try_from(temp_dir()).unwrap();
    let work_dir = temp_dir.join("pkg-config-hack");
    let prefix_dir = work_dir.join("poppler-prefix");

    println!("Will download to prefix path {}", work_dir);

    // Have we already prepared the right version?
    let ticket_path = work_dir.join("ticket.json");
    match Ticket::read(&ticket_path).await {
        Ok(ticket) => {
            if ticket.up_to_date() {
                return Ok(ticket.pkg_config_path.clone());
            }
        }
        Err(e) => {
            println!("cargo:warning=Could not read ticket: {}", e);
        }
    }

    if is_dir(&work_dir).await {
        // This should only fail if multiple processes are messing with
        // the prefix at the same time, which would mean our named lock
        // doesn't work.
        tokio::fs::remove_dir_all(&work_dir).await?;
    }
    tokio::fs::create_dir_all(&work_dir).await?;

    let target = std::env::var("TARGET").unwrap();
    println!("Building for target {}", target);

    let shared_config = aws_config::from_env().load().await;
    let client = Client::new(&shared_config);

    if shared_config.region().is_none() {
        panic!("AWS_REGION (and friends) must be set for pkg-config-hack to work");
    }
    println!("AWS region: {}", shared_config.region().unwrap());

    let key = format!(
        "poppler-prefix/{}/poppler-prefix-{}.tar.gz",
        POPPLER_MESON_VERSION, target
    );
    println!("Fetching ({})", key);

    let resp = client
        .get_object()
        .bucket("bearcove-binaries")
        .key(&key)
        .send()
        .await?;
    println!("Resp = {:?}", resp);

    // AWS error => std::io::Error
    let body = resp
        .body
        .map_err(|e| io::Error::new(io::ErrorKind::Other, e));
    // Stream<Bytes> => AsyncRead
    let body = StreamReader::new(body);
    // AsyncRead => AsyncBufRead
    let body = BufReader::new(body);
    // decompress gzip
    let body = GzipDecoder::new(body);
    // open tar archive
    let mut archive = tokio_tar::Archive::new(body);

    println!("Unpacking all entries...");
    archive.unpack(&work_dir).await?;

    println!("Patching .pc files...");

    for entry in WalkDir::new(&work_dir) {
        let entry = entry?;
        let path: Utf8PathBuf = entry.path().to_path_buf().try_into()?;

        if let Some("pc") = path.extension() {
            println!("Should patch {}", path);
            let old_contents = tokio::fs::read_to_string(&path).await?;
            let new_contents = old_contents
                // for Linux
                .replace("/tmp/poppler-prefix", prefix_dir.as_str())
                // for Windows
                .replace("C:/poppler-prefix", prefix_dir.as_str());
            tokio::fs::write(&path, &new_contents).await?;
        }
    }

    // Okay, prefix should exist now...
    assert!(is_dir(&prefix_dir).await);

    // The Linux build uses lib64 for some reason, even though it's built on
    // Ubuntu, which is a Debian derivative, which is never supposed to have
    // lib64: https://wiki.ubuntu.com/MultiarchSpec - if I had to pick some tool
    // to blame, it'd be meson, but I don't have to, thank Cthulhu.
    for libdir in ["lib", "lib64"] {
        let pkg_config_path = prefix_dir.join(libdir).join("pkgconfig");
        if is_dir(&pkg_config_path).await {
            let ticket = Ticket {
                format_version: FORMAT_VERSION,
                poppler_meson_version: POPPLER_MESON_VERSION.to_string(),
                pkg_config_path: pkg_config_path.clone(),
            };
            ticket.write(ticket_path).await?;

            println!("cargo:warning=Writing ticket: {:?}", ticket);

            return Ok(pkg_config_path);
        }
    }

    Err(eyre!(
        "pkgconfig dir not found, is the prefix even valid? (see file listing above)"
    ))
}

/// Returns true if the path is a directory that we can read.  Errors out if
/// it's anything other than a directory, or we couldn't get its metadata (b/c
/// permissions, I/O error, anything else)
async fn is_dir<P: AsRef<Utf8Path>>(path: P) -> bool {
    let path = path.as_ref();
    let res = matches!(tokio::fs::metadata(path).await, Ok(meta) if meta.is_dir());
    println!("is {} a dir? {}", path, res);
    res
}

#[derive(Debug, Serialize, Deserialize)]
struct Ticket {
    format_version: u64,
    poppler_meson_version: String,
    pkg_config_path: Utf8PathBuf,
}

impl Ticket {
    async fn read<P: AsRef<Utf8Path>>(ticket_path: P) -> Result<Self, Report> {
        let serialized = tokio::fs::read(ticket_path.as_ref()).await?;
        Ok(serde_json::from_slice(&serialized[..])?)
    }

    async fn write<P: AsRef<Utf8Path>>(&self, ticket_path: P) -> Result<(), Report> {
        let serialized = serde_json::to_vec_pretty(self)?;
        tokio::fs::write(ticket_path.as_ref(), &serialized[..]).await?;
        Ok(())
    }

    fn up_to_date(&self) -> bool {
        if self.format_version != FORMAT_VERSION {
            return false;
        }
        if self.poppler_meson_version != POPPLER_MESON_VERSION {
            return false;
        }
        true
    }
}
Bear

That... that could be a series of its own, couldn't it?

Yes it could! But it's not that complicated, basically we:

  1. Make sure we're the only process trying to set up the prefix at any given time, using the named-lock crate.
  2. Determine a "stable" temporary directory, something like /tmp/pkg-config-hack on Linux
  3. Check for any pre-existing "install ticket" that matches what we're trying to install
  4. If there is such a ticket, we're done!
  5. If there isn't, or we can't read it, we use the official AWS Rust SDK to download the .tar.gz, buffer it, decompress it, and unpack it as a tar archive.
  6. While we're at it, we "fix" a bunch of absolute paths in the installed .pc files
  7. And finally we write an install ticket
  8. ...and set PKG_CONFIG_PATH to something like /tmp/pkg-config-hack/poppler-prefix/lib/pkgconfig
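For reference, the install ticket from steps 3 and 7 is just a small JSON file. Given the constants in the code above, it would look something like this (the exact pkg_config_path depends on the platform and libdir):

```json
{
  "format_version": 1,
  "poppler_meson_version": "8c4735aba88cfa81bf43a6648f6864862dfa495c",
  "pkg_config_path": "/tmp/pkg-config-hack/poppler-prefix/lib64/pkgconfig"
}
```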

Easy right?

Bear

That is terrifying.

I mean... we have a terrifying number of build dependencies now (maybe bringing in all of tokio, hyper, a Rust gzip and tar implementation, the S3 SDK, etc. is a tiny bit of overkill), but I actually think it's pretty readable!

So now all we have to do is actually use it in salvage, say, maybe we do this:

Rust code
// in `salvage/src/poppler.rs`

use cairo::{Context, SvgSurface};
use color_eyre::eyre::{self, eyre};
use poppler::{Document, Rectangle};
use std::{fs::File, path::Path};

#[derive(Clone)]
pub struct Poppler {}

impl Poppler {
    pub fn new() -> Self {
        Self {}
    }

    pub fn pdf_to_svg(&self, input: &Path, output: &Path) -> Result<(), eyre::Error> {
        let pdf_bytes = std::fs::read(input)?;
        let doc = Document::from_data(&pdf_bytes[..], None)?;

        let page = doc.page(0).unwrap();
        let mut bb: Rectangle = Default::default();
        page.get_bounding_box(&mut bb);

        let out = File::create(&output)?;
        let surface = SvgSurface::for_stream(bb.x2 - bb.x1, bb.y2 - bb.y1, out)?;
        let cx = Context::new(&surface)?;
        page.render(&cx);

        surface
            .finish_output_stream()
            .map_err(|e| eyre!("cairo error: {}", e.to_string()))?;

        Ok(())
    }
}

And now we... get a bunch of linking errors.

Bear

Ah, right, remember how the installed .pc files are not quite flexible enough to allow static linking? Well, that.

Well, no worries! We can just expose a config module from pkg-config-hack:

Rust code
// in `poppler-meson-crates/pkg-config-hack/src/config.rs`

/// Prints required flags to build against a static build of poppler.
pub fn cargo_config() {
    let target = std::env::var("TARGET").unwrap();
    let is_msvc = target.contains("msvc");

    // poppler-glib requires poppler
    println!("cargo:rustc-link-lib=static=poppler");

    if is_msvc {
        // on windows-msvc everything is fine, we just need a couple libraries

        // for CommandLineToArgvW, SHGetKnownFolderPath
        println!("cargo:rustc-link-lib=shell32");
        // for CoTaskMemFree
        println!("cargo:rustc-link-lib=ole32");

        // for C++ stuff
        println!("cargo:rustc-link-lib=vcruntime");
    } else {
        // that's where it goes on Fedora I guess ¯\_(ツ)_/¯
        // (doesn't hurt other distros)
        println!("cargo:rustc-link-search=native=/usr/lib/gcc/x86_64-redhat-linux/11/");

        // on linux, we want to link statically with the standard C++ library.
        println!("cargo:rustc-link-lib=static=stdc++");
    }

    // nobody bothers including this in their pkg-config files apparently
    println!("cargo:rustc-link-lib=static=png16");

    // cairo needs this
    println!("cargo:rustc-link-lib=static=freetype");

    // cairo/freetype need this?
    // the freetype ChangeLog says the dependency graph looks like:
    // cairo => fontconfig => freetype2 => harfbuzz => cairo
    println!("cargo:rustc-link-lib=static=fontconfig");

    // cairo also needs this
    println!("cargo:rustc-link-lib=static=pixman-1");

    // fontconfig needs this (it's an XML parser)
    println!("cargo:rustc-link-lib=static=expat");
}
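For context on why these flags have to be spelled out by hand: a .pc file splits its link flags between Libs (always emitted) and Libs.private (only emitted when pkg-config is invoked with --static, which is what the pkg-config crate does in statik mode). Here's a hypothetical sketch of such a file; the names, paths, and flags are made up for illustration:

```ini
# hypothetical example.pc, showing the Libs / Libs.private split
prefix=/tmp/poppler-prefix
libdir=${prefix}/lib
includedir=${prefix}/include

Name: example
Description: hypothetical library
Version: 1.0.0
Cflags: -I${includedir}
# flags every consumer gets
Libs: -L${libdir} -lexample
# extra flags emitted only for `pkg-config --static`
Libs.private: -lfreetype -lexpat
```

When a library's Libs.private is missing a dependency (like the libpng case mentioned in the comments above), static consumers have to supply the flag themselves, which is exactly what cargo_config does.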

And then we just add a build script for salvage that calls that:

Rust code
// in `salvage/build.rs`

fn main() {
    pkg_config::config::cargo_config();
}

Let's not forget to add pkg-config as a build dependency...

Shell session
$ cargo add -B pkg-config
    Updating 'https://github.com/rust-lang/crates.io-index' index
      Adding pkg-config v0.3.24 to build-dependencies

And BOOM, it builds!

Shell session
$ cargo build
   Compiling salvage v1.3.0 (/home/amos/bearcove/salvage)
error: linking with `cc` failed: exit status: 1
  |
  = note: "cc" "-m64" (cut) "-Wl,-Bdynamic" "-lpoppler" "-lpoppler-glib" "-lgobject-2.0" "-lffi" "-lglib-2.0" "-lcairo" "-ldl" "-lpng16" "-lz" "-lfontconfig" "-lfreetype" "-lexpat" "-lpixman-1" "-lm" "-lgio-2.0" "-lresolv" "-lz" "-lgobject-2.0" "-lffi" "-lgmodule-2.0" "-ldl" "-lglib-2.0" "-lm" "-lgobject-2.0" "-lffi" "-lglib-2.0" "-lm" "-lcairo-gobject" "-lgobject-2.0" "-lffi" "-lglib-2.0" "-lcairo" "-ldl" "-lpng16" "-lz" "-lfontconfig" "-lfreetype" "-lexpat" "-lpixman-1" "-lm" "-lgobject-2.0" "-lffi" "-lglib-2.0" "-lm" "-lgcc_s" "-lutil" "-lrt" "-lpthread" "-lm" "-ldl" "-lc" "-Wl,--eh-frame-hdr" "-Wl,-znoexecstack" "-L" "/home/amos/.rustup/toolchains/1.57.0-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib" "-o" "/home/amos/bearcove/salvage/target/debug/deps/salvage-622b377b8581b717" "-Wl,--gc-sections" "-pie" "-Wl,-zrelro" "-Wl,-znow" "-nodefaultlibs" "-fuse-ld=lld"
  = note: ld.lld: error: undefined symbol: __res_nquery
          >>> referenced by gthreadedresolver.c
          >>>               gthreadedresolver.c.o:(do_lookup_records) in archive /tmp/pkg-config-hack/poppler-prefix/lib64/libgio-2.0.a
          >>> did you mean: __res_nquery@GLIBC_2.2.5
          >>> defined in: /lib64/libc.so.6

          ld.lld: error: undefined symbol: __dn_expand
          >>> referenced by gthreadedresolver.c
          >>>               gthreadedresolver.c.o:(do_lookup_records) in archive /tmp/pkg-config-hack/poppler-prefix/lib64/libgio-2.0.a
          >>> referenced by gthreadedresolver.c
          >>>               gthreadedresolver.c.o:(do_lookup_records) in archive /tmp/pkg-config-hack/poppler-prefix/lib64/libgio-2.0.a
          >>> referenced by gthreadedresolver.c
          >>>               gthreadedresolver.c.o:(do_lookup_records) in archive /tmp/pkg-config-hack/poppler-prefix/lib64/libgio-2.0.a
          >>> referenced 4 more times
          >>> did you mean: __dn_expand@GLIBC_2.2.5
          >>> defined in: /lib64/libc.so.6
          collect2: error: ld returned 1 exit status


error: could not compile `salvage` due to previous error
Bear

Bwahahahah no it doesn't.

It doesn't.

And that one's a little nasty...

See, that libgio-2.0.a was built on Ubuntu 20.04. And right now I'm trying to build something against it, from Fedora 35.

And on Ubuntu, that symbol is there:

Shell session
$ docker run --rm -it ubuntu:20.04 /bin/bash
root@98e93b395dfa:/# apt update && apt install -y --no-install-recommends binutils
(cut)
root@98e93b395dfa:/# nm -D /usr/lib/x86_64-linux-gnu/libresolv.so.2 | grep dn_expand
00000000000047f0 T __dn_expand
root@98e93b395dfa:/#

But on Fedora... it's named something else:

Shell session
$ docker run --rm -it fedora:35 /bin/bash
[root@91a05a86529e /]# nm -D /usr/lib64/libresolv.so.2 | grep dn_expand
bash: nm: command not found
[root@91a05a86529e /]# dnf provides nm
Fedora 35 - x86_64                                                                                          30 MB/s |  79 MB     00:02
Fedora 35 openh264 (From Cisco) - x86_64                                                                   2.5 kB/s | 2.5 kB     00:01
Fedora Modular 35 - x86_64                                                                                 4.8 MB/s | 3.3 MB     00:00
Fedora 35 - x86_64 - Updates                                                                                22 MB/s |  17 MB     00:00
Fedora Modular 35 - x86_64 - Updates                                                                       4.0 MB/s | 2.8 MB     00:00
binutils-2.37-10.fc35.i686 : A GNU collection of binary utilities
Repo        : fedora
Matched from:
Filename    : /usr/bin/nm

binutils-2.37-10.fc35.x86_64 : A GNU collection of binary utilities
Repo        : fedora
Matched from:
Filename    : /usr/bin/nm

[root@91a05a86529e /]# dnf install binutils
(cut)
[root@91a05a86529e /]# nm -D /usr/lib64/libresolv.so.2 | grep dn_expand
                 U __libc_dn_expand@GLIBC_PRIVATE

In fact, it's not even provided by libresolv.so.2:

Shell session
(continued)

[root@91a05a86529e /]# ldd /usr/lib64/libresolv.so.2
        linux-vdso.so.1 (0x00007ffd3b872000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f8d14459000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f8d1467b000)
[root@91a05a86529e /]# nm -D /lib64/libc.so.6 | grep __libc_dn_expand
000000000012eae0 T __libc_dn_expand@@GLIBC_PRIVATE

...but by libc.so.6.

So, what is a bear to do?

Bear

Oh, me? I've mentally checked out several pages ago.

Well, we could statically link against libresolv, or patch... gio, I guess? To not call those functions. But neither of these sounds really appealing right now, when we can just commit more code crimes.

Bear

Crimes, crimes, crimes!

See, we don't actually need dn_expand - it's DNS-related, and we sure never expect poppler, by way of gio, to access the network. So we don't need DNS.

So we could just... stub it.

C code
// in `salvage/src/screw-libresolv.c`

#include <stdio.h>
#include <stdlib.h>

static void bail(void)
{
  printf(
    "The program's about to abort. I'm sure you're wondering why?\n"
    "\n"
    "Well, it's DNS. it's *always* DNS.\n"
    "See, this program links against poppler-glib, which links against glib,\n"
    "which includes gio, which includes a DNS resolver. so it links against\n"
    "libresolv.\n"
    "\n"
    "Who provides libresolv, you ask? Depends who you ask!\n"
    "\n"
    "On some platforms it's ISC, as part of BIND. On others, it's part of libc.\n"
    "It exposes symbols like dn_expand. On Fedora, the _actual_ symbol name\n"
    "(provided by the static or dynamic library) is __libc_dn_expand. On Ubuntu\n"
    "it's __dn_expand. Note that the C function name is just 'dn_expand'.\n"
    "\n"
    "That means if you build gio on Ubuntu, it'll expect the __dn_expand symbol\n"
    "to exist. Then if you try to link against gio on Fedora, it'll fail because\n"
    "the actual symbol name will be __libc_dn_expand (in the dynamic library it's\n"
    "even private, via @GLIBC_PRIVATE).\n"
    "\n"
    "How do we resolve this? Well, WE DON'T EVEN NEED DNS. At least not from gio.\n"
    "So, if we expose our own dummy symbols... that just abort... it should build,\n"
    "and nobody should ever have to read this!\n"
  );
  abort();
}

void __attribute__((weak)) __dn_expand(void)
{
  bail();
}

void __attribute__((weak)) __res_nquery(void)
{
  bail();
}
Bear

Ohhhh and because it's a weak symbol it won't mess with an existing one?

Exactly!

So we just build it and link it...

Shell session
$ cargo add -B cc
    Updating 'https://github.com/rust-lang/crates.io-index' index
      Adding cc v1.0.72 to build-dependencies
Rust code
// in `salvage/build.rs`

fn main() {
    // 👇 new!
    if std::env::var("TARGET").unwrap().contains("linux-gnu") {
        cc::Build::new()
            .file("src/screw-libresolv.c")
            .warnings(false)
            .compile("screw-libresolv");
    }

    pkg_config::config::cargo_config();
}

And just like that, it builds. It clocks in at 20MB, but it builds. (18MB with compressed debug sections, 14MB stripped).

More Windows sadness

It doesn't build on Windows though... well, it builds! It just doesn't link. And that's because our various -sys crates, which are generated by gtk-rs/gir, have snippets like these!

Rust code
// in `poppler-sys-rs/src/lib.rs`

#[link(name = "poppler-glib")]
#[link(name = "poppler")]
extern "C" {
    // a whole bunch of functions
}

And on Linux, this is fine! The toolchains there will pick up a .a like they would an .so, and they'll do the right thing.

But on Windows... it's not the case. I'm not sure exactly which tool is responsible, but at some point someone decides the symbol we want isn't poppler_document_new_from_data, it's actually __imp_poppler_document_new_from_data: as if poppler-glib.lib was an import library for poppler-glib.dll, and not a static library.

If we want static linking to succeed on Windows, we need to change it to something like this:

Rust code
// in `poppler-sys-rs/src/lib.rs`

#[link(name = "poppler-glib", kind = "static")]
#[link(name = "poppler", kind = "static")]
extern "C" {
    // a whole bunch of functions
}
Bear

Won't that break dynamic linking?

Of course it will! And that's why what we really want is to be able to conditionally use either static or dynamic linking. One way to do it (which I don't love, but we make do) is with cargo features!

If poppler-sys-rs had a static feature, we could have:

Rust code
// in `poppler-sys-rs/src/lib.rs`

#[cfg_attr(feature = "static", link(name = "poppler", kind = "static"))]
#[cfg_attr(feature = "static", link(name = "poppler-glib", kind = "static"))]
#[cfg_attr(not(feature = "static"), link(name = "poppler"))]
#[cfg_attr(not(feature = "static"), link(name = "poppler-glib"))]
extern "C" {
    // a whole bunch of functions
}

And we could even get gir to generate those!

If only someone... contributed that feature upstream...

Bear

Oh no no no not ag-

Looking into gtk-rs/gir

Bear

...fuck's sake.

So the first wrinkle is that, at the time of this writing, there's gir 0.14, and gir's default branch, and they generate incompatible code. So we'll need to regenerate all crates.

Bear

Weren't we planning on regenerating all of them anyway? (So they build correctly?)

Yeah. So it's not a big deal.

It wasn't too hard to find where #[link] attributes are generated:

Rust code
// in `gir/src/codegen/lib_.rs` (yes, with the trailing underscore)

fn write_link_attr(w: &mut dyn Write, shared_libs: &[String]) -> Result<()> {
    for it in shared_libs {
        writeln!(
            w,
            "#[link(name = \"{}\")]",
            shared_lib_name_to_link_name(it)
        )?;
    }

    Ok(())
}

So, we just need to change that!

Rust code
fn write_link_attr(w: &mut dyn Write, shared_libs: &[String]) -> Result<()> {
    for it in shared_libs {
        let link_name = shared_lib_name_to_link_name(it);
        writeln!(
            w,
            r#"#[cfg_attr(feature = "static", link(name = "{}", kind = "static"))]"#,
            link_name
        )?;
        writeln!(
            w,
            r#"#[cfg_attr(not(feature = "static"), link(name = "{}"))]"#,
            link_name
        )?;
    }

    Ok(())
}

Let's look at the diff before and after that change on poppler-sys-rs, since I own that crate:

Shell session
$ git diff
(cut)

-#[link(name = "poppler-glib")]
-#[link(name = "poppler")]
+#[cfg_attr(feature = "static", link(name = "poppler-glib", kind = "static"))]
+#[cfg_attr(not(feature = "static"), link(name = "poppler-glib"))]
+#[cfg_attr(feature = "static", link(name = "poppler", kind = "static"))]
+#[cfg_attr(not(feature = "static"), link(name = "poppler"))]

Perfect!

Next, we need to actually define the feature in Cargo.toml, and have it enable the feature for other crates.

Here's part of the code in gir that generates Cargo.toml files:

Rust code
// in `gir/src/codegen/sys/cargo_toml.rs`

fn fill_in(root: &mut Table, env: &Env) {
    // (cut)

    {
        let features = upsert_table(root, "features");
        let versions = collect_versions(env);
        versions.keys().fold(None::<Version>, |prev, &version| {
            let prev_array: Vec<Value> =
                get_feature_dependencies(version, prev, &env.config.feature_dependencies)
                    .iter()
                    .map(|s| Value::String(s.clone()))
                    .collect();
            features.insert(version.to_feature(), Value::Array(prev_array));
            Some(version)
        });
        features.insert(
            "dox".to_string(),
            Value::Array(
                env.config
                    .dox_feature_dependencies
                    .iter()
                    .map(|s| Value::String(s.clone()))
                    .collect(),
            ),
        );
    }

    // (cut)
}

You can see it's using an upsert_table helper: it modifies the manifest in-place, because those manifests are also meant to be edited by humans.

If we just add this:

Rust code
        features.insert(
            "static".to_string(),
            Value::Array(
                env.config
                    .external_libraries
                    .iter()
                    .map(|l| Value::String(format!("{}/static", l.crate_name)))
                    .collect(),
            ),
        );

We're good to go! Here's the PR I opened, which as far as I can tell won't land this year.

Bear

Oh, that's cute.

But yeah, I've regenerated everything with it, in bearcove/gtk-rs-core and bearcove/poppler-rs, making sure poppler references the newer crates and has a static feature of its own:

TOML markup
# in `poppler-rs/poppler-rs/Cargo.toml`

[package]
name = "poppler"
version = "0.1.0"
edition = "2021"

[dependencies]
libc = "0.2.107"
bitflags = "1.3.2"

[dependencies.glib]
package = "glib"
version = "0.15.0"
git = "https://github.com/bearcove/gtk-rs-core"
branch = "amos/static-build"

[dependencies.cairo]
package = "cairo-rs"
version = "0.15.0"
git = "https://github.com/bearcove/gtk-rs-core"
branch = "amos/static-build"

[dependencies.ffi]
package = "poppler-sys"
git = "https://github.com/bearcove/poppler-rs"
branch = "amos/static-build"

[features]
static = ["ffi/static"]

And with all those changes, and our custom poppler prefix, it's actually almost trivial to make a static build of a simple app, like that one:

TOML markup
# in `poppler-rs/pdftocairo/Cargo.toml`

[package]
name = "pdftocairo"
version = "0.1.0"
edition = "2021"

[dependencies]
camino = "1.0.5"
color-eyre = "0.5.11"
poppler = { path = "../poppler-rs" }
tracing = "0.1.29"
tracing-error = "0.2.0"
tracing-subscriber = { version = "0.3.1", features = ["env-filter"] }

[features]
static = ["poppler/static"]
Rust code
// in `poppler-rs/pdftocairo/src/main.rs`

use camino::Utf8PathBuf;
use color_eyre::Report;
use tracing::info;

#[cfg_attr(feature = "static", link(name = "stdc++", kind = "static"))]
extern "C" {}

fn main() -> Result<(), Report> {
    if std::env::var("RUST_LOG").is_err() {
        std::env::set_var("RUST_LOG", "info");
    }
    color_eyre::install()?;
    install_tracing();

    let path = Utf8PathBuf::from("/tmp/export.pdf");
    info!(%path, "Reading file...");
    let data = std::fs::read(&path)?;
    info!(%path, "Reading file... done!");
    let doc = poppler::Document::from_data(&data[..], None)?;
    info!("Got the document! {:#?}", doc);

    info!("Producer = {:#?}", doc.producer());
    info!("Num pages = {:#?}", doc.n_pages());

    Ok(())
}

fn install_tracing() {
    use tracing_error::ErrorLayer;
    use tracing_subscriber::prelude::*;
    use tracing_subscriber::{fmt, EnvFilter};

    let fmt_layer = fmt::layer();
    let filter_layer = EnvFilter::try_from_default_env()
        .or_else(|_| EnvFilter::try_new("info"))
        .unwrap();

    tracing_subscriber::registry()
        .with(filter_layer)
        .with(fmt_layer)
        .with(ErrorLayer::default())
        .init();
}
Bear

Hey... that doesn't actually use cairo.

Amos

Uhh true. PRs welcome?

The #[link] attribute above is the only thing not covered by the .pc files themselves.

After that, we can do:

Shell session
$ PKG_CONFIG_ALL_STATIC=1 PKG_CONFIG_PATH=/home/amos/bearcove/prefix/lib64/pkgconfig cargo build --verbose --features static

And get an all-static, and gigantic, executable:

Shell session
$ ldd ./target/debug/pdftocairo
        linux-vdso.so.1 (0x00007ffc77d63000)
        libz.so.1 => /lib64/libz.so.1 (0x00007fac18f55000)
        libfreetype.so.6 => /lib64/libfreetype.so.6 (0x00007fac18e8a000)
        libm.so.6 => /lib64/libm.so.6 (0x00007fac18dae000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fac18d93000)
        libc.so.6 => /lib64/libc.so.6 (0x00007fac18b89000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fac19de3000)
        libbz2.so.1 => /lib64/libbz2.so.1 (0x00007fac18b76000)
        libpng16.so.16 => /lib64/libpng16.so.16 (0x00007fac18b3b000)
        libharfbuzz.so.0 => /lib64/libharfbuzz.so.0 (0x00007fac18a65000)
        libbrotlidec.so.1 => /lib64/libbrotlidec.so.1 (0x00007fac18a57000)
        libglib-2.0.so.0 => /lib64/libglib-2.0.so.0 (0x00007fac1891c000)
        libgraphite2.so.3 => /lib64/libgraphite2.so.3 (0x00007fac188fb000)
        libbrotlicommon.so.1 => /lib64/libbrotlicommon.so.1 (0x00007fac188d8000)
        libpcre.so.1 => /lib64/libpcre.so.1 (0x00007fac1885e000)

Ah, well, not really. I guess a bunch of libraries escaped there. I guess we'll stick with our awful tricks for the time being.

What about Windows?

But what about Windows? This is what we did all this for, after all...

Well, with this little PowerShell script:

PowerShell script
# in `poppler-rs/pdftocairo/build.ps1`

$env:PKG_CONFIG_ALL_STATIC = "1"
$env:PKG_CONFIG_PATH = "C:/Users/amos/AppData/Local/temp/pkg-config-hack/poppler-prefix/lib/pkgconfig"

cargo build --features static

And changing our little extern "C" block to this:

Rust code
// in `poppler-rs/pdftocairo/src/main.rs`

#[cfg_attr(
    all(feature = "static", target_os = "linux"),
    link(name = "stdc++", kind = "static")
)]
#[cfg_attr(all(feature = "static", target_os = "windows"), link(name = "shell32"))]
#[cfg_attr(all(feature = "static", target_os = "windows"), link(name = "ole32"))]
extern "C" {}

Then it builds!

PowerShell session
$ .\build.ps1
warning: unused import: `glib::StaticType`
  --> C:\Users\amos\bearcove\poppler-rs\poppler-rs\src\auto\document.rs:11:5
   |
11 | use glib::StaticType;

poppler-rs/pdftocairo on  amos/static-build [!+] is 📦 v0.1.0 via 🦀 v1.57.0
❯ .\build.ps1
   Compiling serde v1.0.132
(cut)
   Compiling poppler v0.1.0 (C:\Users\amos\bearcove\poppler-rs\poppler-rs)
   Compiling pdftocairo v0.1.0 (C:\Users\amos\bearcove\poppler-rs\pdftocairo)
    Finished dev [unoptimized + debuginfo] target(s) in 17.27s

Interestingly, it's only 7.8MB!

Using Dependencies (a spiritual successor to Dependency Walker, but that one freezes when I give it my executable 🙃), we can see it only depends on system Windows DLLs:

The productionized build

Now for poppler-meson-crates: this repo contains all the build script hackery that extracts the prefix from S3, so it should work out of the box. We just need to switch to the static-friendly poppler-rs crate:

TOML markup
# in `poppler-meson-crates/poppler-sample/Cargo.toml`

[dependencies.poppler]
git = "https://github.com/bearcove/poppler-rs"
branch = "amos/static-build"
features = ["static"]

And that's all the changes we need.

PowerShell session
$ cargo build
   Compiling proc-macro2 v1.0.34
   Compiling unicode-xid v0.2.2
(cut)
   Compiling cairo-rs v0.15.0 (https://github.com/bearcove/gtk-rs-core?branch=amos/static-build#dcff5004)
   Compiling poppler v0.1.0 (https://github.com/bearcove/poppler-rs?branch=amos/static-build#e35aeef9)
    Finished dev [unoptimized + debuginfo] target(s) in 1m 36s

That's right. It finally all works out of the box:

PowerShell session
$ .\target\debug\poppler-sample.exe
Producer = Some(
    "Skia/PDF m94",
)

And now for the real thing

This series took so long to write (it's been two and a half months!) that I had time to reinstall my setup from scratch again. This time, it was because WSL2 had gotten on my nerves one too many times, and I decided to switch back to "just giving 32GiB of RAM to a Fedora VM in VMWare".

And with that setup, I couldn't run the draw.io desktop app easily. I mean... I could, by just exporting DISPLAY=:0, so it would connect to the display server that was running inside the VM (tucked away in another Windows virtual desktop).

But it was infuriating to have to do that when I had gotten truly headless draw.io exports to work. So I just went ahead and folded my experimental "use headless chrome for .drawio -> .pdf" and "use poppler for .pdf -> .svg" code into the main branch for salvage (my command-line asset processor).

And it stopped building on Windows! Because of that static business. So, if we just change the dependencies a little... does it build now?

TOML markup
# in `futile/Cargo.toml`

[dependencies.poppler]
git = "https://github.com/bearcove/poppler-rs"
branch = "amos/static-build"
features = ["static"]

[dependencies.cairo-rs]
git = "https://github.com/bearcove/gtk-rs-core"
branch = "amos/static-build"
features = ["static", "svg"]
PowerShell session
$ cargo build
   Compiling glib v0.15.0 (https://github.com/bearcove/gtk-rs-core?branch=amos/static-build#34fcdc82)
   Compiling cairo-rs v0.15.0 (https://github.com/bearcove/gtk-rs-core?branch=amos/static-build#34fcdc82)
   Compiling poppler v0.1.0 (https://github.com/bearcove/poppler-rs?branch=amos/static-build#e35aeef9)
error[E0599]: no method named `get_bounding_box` found for struct `poppler::Page` in the current scope
  --> src\commands\poppler.rs:20:14
   |
20 |         page.get_bounding_box(&mut bb);
   |              ^^^^^^^^^^^^^^^^ method not found in `poppler::Page`

error[E0599]: no method named `render` found for struct `poppler::Page` in the current scope
  --> src\commands\poppler.rs:25:14
   |
25 |         page.render(&cx);
   |              ^^^^^^ method not found in `poppler::Page`

Some errors have detailed explanations: E0432, E0599.
For more information about an error, try `rustc --explain E0432`.
error: could not compile `salvage` due to 3 previous errors

Nope! I could swear those methods were in there somewhere...

Rust code
// in `poppler-rs/src/auto/page.rs`

impl Page {
    //#[doc(alias = "poppler_page_get_bounding_box")]
    //#[doc(alias = "get_bounding_box")]
    //pub fn is_bounding_box(&self, rect: /*Ignored*/&mut Rectangle) -> bool {
    //    unsafe { TODO: call ffi:poppler_page_get_bounding_box() }
    //}
}

Ah. Turns out my Gir.toml was slightly wrong: a little gir -m not_bound in poppler-rs/poppler-rs let me know that get_bounding_box wasn't generated because poppler.Rectangle wasn't generated.

As for render, I had it set to ignore, which I guess didn't have any effect in gir 0.14?

Anyway, with this Gir.toml

TOML markup
[options]
library = "Poppler"
version = "0.18"
target_path = "."
min_cfg_version = "0.70"
girs_directories = ["/home/amos/bearcove/prefix/share/gir-1.0", "/usr/share/gir-1.0"]
work_mode = "normal"
# generate_safety_asserts = true
deprecate_by_min_version = true
single_version_file = true

external_libraries = [
  "Gio",
  "GLib",
  "GObject",
  "Cairo",
]

manual = [
  "GLib.Bytes",
  "GLib.Error",
  "GLib.DateTime",
  "cairo.Context",
  "cairo.Surface",
  "cairo.Region",
]

generate = [
  "Poppler.Backend",
  "Poppler.Document",
  "Poppler.Rectangle",
]

[[object]]
name = "Poppler.Page"
status = "generate"

  [[object.function]]
  name = "get_bounding_box"
  rename = "get_bounding_box"

  [[object.function]]
  name = "get_text_layout"
  ignore = true

  [[object.function]]
  name = "get_text_layout_for_area"
  ignore = true

  [[object.function]]
  name = "get_crop_box"
  ignore = true

  [[object.function]]
  name = "render"
    [[object.function.parameter]]
    name = "cairo"
    const = true

  [[object.function]]
  name = "render_for_printing"
    [[object.function.parameter]]
    name = "cairo"
    const = true

The methods appeared again! The const = true thingy is a workaround for the lack of proper metadata in .gir files. All the ignore = true were generating incorrect code. gir is still a work-in-progress!

After fixing all this, I realized I was working off of the wrong poppler-rs repository: I had moved it from GitHub to the GNOME GitLab.

I was also missing this wonderful workaround for the lackluster code generation around Rectangle:

Rust code
// in `poppler-rs/poppler-rs/src/lib.rs`

use glib::translate::{ToGlibPtr, ToGlibPtrMut};
use std::ops::{Deref, DerefMut};

impl Deref for Rectangle {
    type Target = ffi::PopplerRectangle;

    fn deref(&self) -> &Self::Target {
        unsafe { &*self.to_glib_none().0 }
    }
}

impl DerefMut for Rectangle {
    fn deref_mut(&mut self) -> &mut Self::Target {
        unsafe { &mut *self.to_glib_none_mut().0 }
    }
}

Wonderfully unsafe.

Anyway... does it build now?

Bear

DOES IT???

Shell session
$ cargo b
   Compiling poppler-sys v0.0.1 (https://github.com/bearcove/poppler-rs?branch=amos/static-build#3b05dc26)
   Compiling poppler v0.1.0 (https://github.com/bearcove/poppler-rs?branch=amos/static-build#3b05dc26)
   Compiling salvage v1.4.0 (C:\Users\amos\bearcove\salvage)
    Finished dev [unoptimized + debuginfo] target(s) in 5.17s

It does!!! 🎉
