From Inkscape to poppler

This article is part of the Don't shell out! series.

What's next? Well... poppler is the library Inkscape uses to import PDFs.

Cool bear's hot tip

Yes, the name comes from Futurama.

Turns out, poppler comes with a bunch of CLI tools, including pdftocairo!

Halfway through this article, I realized the "regular weight" on my system was in fact Iosevka SS01 (Andale Mono Style) (see Releases), but the "bold weight" was the default Iosevka.

So, I removed both and reinstalled them from the official distribution, which explains visual and size changes after that point.

So, with a few more CLI incantations...

Shell session
$ pdftocairo /tmp/export.pdf -svg /tmp/export.svg
$ ls -lhA /tmp/export*
-rw-r--r--. 1 amos amos 159K Nov 19 10:14 /tmp/export.pdf
-rw-r--r--. 1 amos amos 739K Nov 19 10:14 /tmp/export.svg

We've got an SVG file! And it's a bit large, I wonder if it embeds part of a font, like the PDF does?

Well... it's a bit more complicated.

As it turns out, individual non-bold ("regular weight") letters actually refer to other paths:

But words made up of bold letters are a single, very lengthy path:

I wonder if that's because I've only installed the "Regular" weight for the Iosevka font... let's find out.

After installing the "Bold" weight, renaming /tmp/export.EXT to /tmp/export.regular.EXT, and running both steps again, the PDF export is smaller - and so is the SVG!

Shell session
$ ls -lhAt /tmp/export.*
-rw-r--r--. 1 amos amos 436K Nov 19 10:40 /tmp/export.svg
-rw-r--r--. 1 amos amos  68K Nov 19 10:40 /tmp/export.pdf
-rw-r--r--. 1 amos amos 739K Nov 19 10:39 /tmp/export.regular.svg
-rw-r--r--. 1 amos amos 159K Nov 19 10:39 /tmp/export.regular.pdf

The PDF file now contains two partial embedded fonts:

%% Original object ID: 4 0
9 0 obj
<<
  /BaseFont /Iosevka-Bold
  /DescendantFonts [
    11 0 R
  ]
  /Encoding /Identity-H
  /Subtype /Type0
  /ToUnicode 12 0 R
  /Type /Font
>>
endobj

%% Original object ID: 5 0
10 0 obj
<<
  /BaseFont /Iosevka
  /DescendantFonts [
    14 0 R
  ]
  /Encoding /Identity-H
  /Subtype /Type0
  /ToUnicode 15 0 R
  /Type /Font
>>
endobj

And we can see in the SVG file that bold characters now also take advantage of the SVG use tag.
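
One quick way to check this kind of thing without scrolling through the file is to count elements. Here's a minimal sketch, not part of the tooling used in this article: `count_tag` is a hypothetical helper that naively counts opening tags in an SVG string, and the inline SVG is a made-up sample, not the real export.

```rust
/// Naively counts opening tags named `tag` in `svg`.
/// This is plain substring matching, not a real XML parser.
fn count_tag(svg: &str, tag: &str) -> usize {
    let needle = format!("<{tag}");
    svg.match_indices(&needle)
        .filter(|(idx, _)| {
            // The character right after the tag name must end the name, so
            // that searching for "use" does not also match "<used".
            let next = svg[idx + needle.len()..].chars().next();
            matches!(next, Some(c) if c.is_whitespace() || c == '>' || c == '/')
        })
        .count()
}

fn main() {
    // Made-up sample in the style of a glyph-reusing SVG.
    let svg = r##"<svg>
<defs><symbol id="glyph0-1"><path d="M 0 0 L 1 1 Z"/></symbol></defs>
<use xlink:href="#glyph0-1" x="10" y="20"/>
<use xlink:href="#glyph0-1" x="16" y="20"/>
</svg>"##;
    println!(
        "paths: {}, uses: {}",
        count_tag(svg, "path"),
        count_tag(svg, "use")
    );
}
```

A file where glyphs are shared should show many more `use` elements than `path` elements; a file where each word is one long path would show the opposite.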

So what happened with bold in the first export then? How did we even get bold letters, if we didn't have the corresponding font? Let's look at them both:

The bottom version is what Iosevka is supposed to look like. The top version is Chrome's font renderer (freetype?) doing its best to turn a regular font into a bold font, by just... embiggening stuff.

So anyway, now we have a reasonable SVG. It:

But, well, we used a CLI tool to do it. Ideally we'd be able to just do it from code, since we don't want any external dependencies (Chrome being the notable, and infuriating, exception).

GNOME has a pretty good story when it comes to Rust libraries. But the folks working on them are focusing mainly on cairo, gio, glib, pango, and gtk3/gtk4. There is an existing poppler crate, but it is hopelessly out-of-date.

But the good news is: there's existing tooling for glib-based C libraries, and poppler is one of them. Can we use it to generate bindings before this article becomes so large it crashes your browser? Let's find out!

gobject introspection

In the year of our lord 2021, we could all use a little introspection. And APIs are absolutely no exception.

APIs are typically defined as a bunch of C headers, and that isn't machine-friendly for a bunch of reasons. I know that because I once tried writing a C preprocessor that basically converted #ifdef blocks into cargo features. It was awful.

So what a bunch of folks have been doing instead, is to have some canonical representation of the API as a structured language (that specifically isn't C), and then from there you can generate bindings with it.

That's what folks at Microsoft are doing with windows-rs, for example.

They actually have machinery involving clang and .NET (you can take a look at the win32metadata repository for more information), and the reference definitions look like this (as seen through ILSpy):

Apple is doing something similar with BridgeSupport, although I have found very little documentation about it, and at least one person claimed it was no longer supported.

And, well, the GNOME project has been doing the same thing! If the gobject-introspection Git history is to be trusted, they started their effort in 2004! The Rust side of it, gtk-rs/gir, was "only" started in 2015.

And like I said earlier, even though poppler is actually an offshoot from xpdf (and so it looks different from a lot of other GNOME-adjacent libraries), it does have a "glib interface" (alongside a Qt interface), and that glib interface has a .gir file, and so we can use it with gtk-rs/gir!

A .gir file is just plain XML; here's an excerpt from /usr/share/gir-1.0/Poppler-0.18.gir on a Fedora 35 install:

<?xml version="1.0"?>
<!-- This file was automatically generated from C sources - DO NOT EDIT!
To affect the contents of this file, edit the original C definitions,
and/or use gtk-doc annotations.  -->
<repository version="1.2"
            xmlns="http://www.gtk.org/introspection/core/1.0"
            xmlns:c="http://www.gtk.org/introspection/c/1.0"
            xmlns:glib="http://www.gtk.org/introspection/glib/1.0">
  <include name="GObject" version="2.0"/>
  <include name="Gio" version="2.0"/>
  <include name="cairo" version="1.0"/>
  <package name="poppler-glib"/>
  <c:include name="poppler.h"/>
  <namespace name="Poppler">

  <!-- (skipping a few things to find the interesting bits...) -->

    <class name="Page">
      <method name="render" c:identifier="poppler_page_render">
        <doc xml:space="preserve"
             line="336">Render the page to the given cairo context. This function
is for rendering a page that will be displayed. If you want
to render a page that will be printed use
poppler_page_render_for_printing() instead.  Please see the documentation
for that function for the differences between rendering to the screen and
rendering to a printer.</doc>
        <source-position filename="glib/poppler-page.h" line="38"/>
        <return-value transfer-ownership="none">
          <type name="none" c:type="void"/>
        </return-value>
        <parameters>
          <instance-parameter name="page" transfer-ownership="none">
            <doc xml:space="preserve"
                 line="338">the page to render from</doc>
            <type name="Page" c:type="PopplerPage*"/>
          </instance-parameter>
          <parameter name="cairo" transfer-ownership="none">
            <doc xml:space="preserve"
                 line="339">cairo context to render to</doc>
            <type name="cairo.Context" c:type="cairo_t*"/>
          </parameter>
        </parameters>
      </method>
    </class>
  </namespace>
</repository>

And, you know, it doesn't have all the information one could dream of, but it's a perfectly fine start to generate Rust bindings.
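
To make "generate bindings from a structured description" a bit more concrete, here's a toy sketch. It is not how gtk-rs/gir works internally: `Param`, `Method`, `rust_type` and `render_decl` are hypothetical names invented for this illustration, and the type mapping only handles single pointers and functions that return nothing.

```rust
/// A hand-written stand-in for what a generator might extract from a
/// `<method>` element. Real tools track far more (ownership, nullability...).
struct Param {
    name: &'static str,
    c_type: &'static str,
}

struct Method {
    c_identifier: &'static str,
    params: Vec<Param>,
}

/// Maps a C pointer type such as `PopplerPage*` to a Rust raw pointer type.
/// Anything that is not a single pointer is passed through unchanged.
fn rust_type(c_type: &str) -> String {
    match c_type.strip_suffix('*') {
        Some(base) => format!("*mut {}", base.trim()),
        None => c_type.to_string(),
    }
}

/// Renders the line that would go inside an `extern "C"` block.
fn render_decl(m: &Method) -> String {
    let params: Vec<String> = m
        .params
        .iter()
        .map(|p| format!("{}: {}", p.name, rust_type(p.c_type)))
        .collect();
    format!("pub fn {}({});", m.c_identifier, params.join(", "))
}

fn main() {
    // Values copied by hand from the XML excerpt.
    let render = Method {
        c_identifier: "poppler_page_render",
        params: vec![
            Param { name: "page", c_type: "PopplerPage*" },
            Param { name: "cairo", c_type: "cairo_t*" },
        ],
    };
    println!("{}", render_decl(&render));
}
```

The point is only that, once the API lives in a machine-readable format, emitting declarations becomes string formatting rather than C parsing.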

So after chatting with the wonderful folks in the GNOME/Rust Matrix room, I got to work and started making my own poppler-rs.

A lot of "Rust bindings to C libraries" are actually two crates: a foobar-sys crate that is full of unsafe functions, and a foobar crate that wraps foobar-sys's functionality with safe abstractions.
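
Here's a minimal sketch of that split, squeezed into one file. Everything in it is hypothetical: `foobar_sys` stands in for a `-sys` crate, and its functions are written in Rust (rather than declared in an `extern "C"` block) purely so the example runs without a C library.

```rust
use std::os::raw::c_int;

/// Stand-in for a `foobar-sys` crate: raw pointers, unsafe functions,
/// and no guarantees whatsoever.
mod foobar_sys {
    use std::os::raw::c_int;

    pub struct FoobarCounter {
        pub value: c_int,
    }

    pub unsafe fn foobar_counter_new() -> *mut FoobarCounter {
        Box::into_raw(Box::new(FoobarCounter { value: 0 }))
    }

    pub unsafe fn foobar_counter_add(c: *mut FoobarCounter, n: c_int) -> c_int {
        (*c).value += n;
        (*c).value
    }

    pub unsafe fn foobar_counter_free(c: *mut FoobarCounter) {
        drop(Box::from_raw(c));
    }
}

/// Stand-in for the safe `foobar` crate: owns the raw pointer and frees
/// it exactly once, on drop.
pub struct Counter {
    raw: *mut foobar_sys::FoobarCounter,
}

impl Counter {
    pub fn new() -> Self {
        // Safety: `foobar_counter_new` returns a valid, owned pointer.
        Counter { raw: unsafe { foobar_sys::foobar_counter_new() } }
    }

    pub fn add(&mut self, n: i32) -> i32 {
        // Safety: `self.raw` stays valid for as long as `self` exists.
        unsafe { foobar_sys::foobar_counter_add(self.raw, n as c_int) }
    }
}

impl Drop for Counter {
    fn drop(&mut self) {
        // Safety: the pointer came from `foobar_counter_new`.
        unsafe { foobar_sys::foobar_counter_free(self.raw) }
    }
}

fn main() {
    let mut counter = Counter::new();
    counter.add(2);
    println!("{}", counter.add(40));
}
```

Callers of `Counter` never write `unsafe` themselves: the wrapper is where the invariants (valid pointer, freed once) get enforced.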

And that's the model gtk-rs/gir enforces as well, so I made a little workspace...

TOML markup
# in poppler-rs/Cargo.toml

[workspace]
members = [
    "sys",
    "poppler",
]

And for the sys crate, I added a little config:

TOML markup
# in poppler-rs/sys/Gir.toml

[options]
library = "Poppler"
version = "0.18"
target_path = "."
min_cfg_version = "0.70"
girs_directories = ["../../gir-files", "../gir-files"]
work_mode = "sys"

external_libraries = [
    # ...
]

ignore = [
    "Poppler.MAJOR_VERSION",
    # ...
]

Cool bear's hot tip

MAJOR_VERSION etc. are defines in C. Because we link dynamically against poppler in most scenarios, and the binding is generated once and then used against many different versions of poppler, having them exposed to Rust is a) unnecessary, and b) makes gir-generated unit tests fail (because the numbers don't match up, even if the libraries would be compatible).

And then after running gir in the sys/ directory, BOOM, we have a sys crate. It has a single src/ file, that has a preamble...

Rust code
// in `poppler-rs/sys/src/`

// Generated by gir (https://github.com/gtk-rs/gir @ 8891a2f2c34b)
// from ../../gir-files (@ c6afb5857607)
// from ../gir-files (@ ec3e62ee546b)

#![allow(non_camel_case_types, non_upper_case_globals, non_snake_case)]
#![allow(clippy::approx_constant, clippy::type_complexity, clippy::unreadable_literal, clippy::upper_case_acronyms)]
#![cfg_attr(feature = "dox", feature(doc_cfg))]

use gio_sys as gio;
use glib_sys as glib;
use gobject_sys as gobject;
use cairo_sys as cairo;

use libc::{c_int, c_char, c_uchar, c_float, c_uint, c_double,
    c_short, c_ushort, c_long, c_ulong,
    c_void, size_t, ssize_t, intptr_t, uintptr_t, time_t, FILE};

use glib::{gboolean, gconstpointer, gpointer, GType};

// etc.

And then some enums...

Rust code
// Enums
pub type PopplerActionLayerAction = c_int;
pub const POPPLER_ACTION_LAYER_ON: PopplerActionLayerAction = 0;
pub const POPPLER_ACTION_LAYER_OFF: PopplerActionLayerAction = 1;
pub const POPPLER_ACTION_LAYER_TOGGLE: PopplerActionLayerAction = 2;

And then some unions...

Rust code
// Unions
#[derive(Copy, Clone)]
#[repr(C)]
pub union PopplerAction {
    pub type_: PopplerActionType,
    pub any: PopplerActionAny,
    pub goto_dest: PopplerActionGotoDest,
    pub goto_remote: PopplerActionGotoRemote,
    pub launch: PopplerActionLaunch,
    pub uri: PopplerActionUri,
    pub named: PopplerActionNamed,
    pub movie: PopplerActionMovie,
    pub rendition: PopplerActionRendition,
    pub ocg_state: PopplerActionOCGState,
    pub javascript: PopplerActionJavascript,
    pub reset_form: PopplerActionResetForm,
}

Some callbacks (not shown here), some "records", which I guess is what structs are called in gobject-introspection:

Rust code
#[derive(Copy, Clone)]
#[repr(C)]
pub struct PopplerRectangle {
    pub x1: c_double,
    pub y1: c_double,
    pub x2: c_double,
    pub y2: c_double,
}

impl ::std::fmt::Debug for PopplerRectangle {
    fn fmt(&self, f: &mut ::std::fmt::Formatter) -> ::std::fmt::Result {
        f.debug_struct(&format!("PopplerRectangle @ {:p}", self))
         .field("x1", &self.x1)
         .field("y1", &self.y1)
         .field("x2", &self.x2)
         .field("y2", &self.y2)
         .finish()
    }
}

And then some classes!

Rust code
#[repr(C)]
pub struct PopplerPage(c_void);

impl ::std::fmt::Debug for PopplerPage {
    fn fmt(&self, f: &mut ::std::fmt::Formatter) -> ::std::fmt::Result {
        f.debug_struct(&format!("PopplerPage @ {:p}", self))
         .finish()
    }
}

And then, well, then there's every function in poppler-glib:

Rust code
#[link(name = "poppler-glib")]
#[link(name = "poppler")]
extern "C" {
    // (MANY functions skipped)

    // Oh look this one is gated behind a cargo feature automatically!

    #[cfg(any(feature = "v0_80", feature = "dox"))]
    #[cfg_attr(feature = "dox", doc(cfg(feature = "v0_80")))]
    pub fn poppler_print_duplex_get_type() -> GType;

    // skipping more...

    pub fn poppler_page_render(page: *mut PopplerPage, cairo: *mut cairo::cairo_t);

    // skipping everything else.
}

And that's how you get a -sys crate.

You'll note that it has only poppler functions. It doesn't have, for example, cairo functions, even though cairo is a dependency of poppler. Those are in other crates, which have already been generated and published:

TOML markup
# in `poppler-rs/sys/Cargo.toml`

[dependencies]
cairo-sys-rs = "0.14.9"
gio-sys = "0.14.0"
glib-sys = "0.14.0"
gobject-sys = "0.14.0"
libc = "0.2"

Now that we have the low-level, unsafe crate, we can generate the high-level crate!

That one's a bit more complicated, because, again, the .gir files are missing some information that matters for languages like Rust.

TOML markup
# in `poppler-rs/poppler/Gir.toml`

[options]
library = "Poppler"
version = "0.18"
target_path = "."
min_cfg_version = "0.70"
girs_directories = ["../../gir-files", "../gir-files"]
work_mode = "normal" # 👈 this was "sys" for the previous crate
deprecate_by_min_version = true
single_version_file = true

external_libraries = [
    # ...
]

# This tells gir "these types exist in _other crates_, you don't need to
# generate them yourself BUT you shouldn't skip functions that use these"
# (Normally gir skips anything that uses types that aren't explicitly
# allowlisted).
manual = [
    # ...
]

# This is the short way of telling gir what to generate
generate = [
    # ...
]

# This is the long way of telling gir what to generate, where we can ignore
# specific "object functions" (methods, really..), change the constness of some
# parameters, etc.
[[object]]
name = "Poppler.Page"
status = "generate"

  [[object.function]]
  name = "render"
    [[object.function.parameter]]
    name = "cairo"
    const = true

  [[object.function]]
  name = "render_for_printing"
    [[object.function.parameter]]
    name = "cairo"
    const = true

  [[object.function]]
  name = "get_text_layout"
  ignore = true

  [[object.function]]
  name = "get_text_layout_for_area"
  ignore = true

  [[object.function]]
  name = "get_crop_box"
  ignore = true

  [[object.function]]
  name = "get_bounding_box"
  rename = "get_bounding_box"

[[object]]
name = "Poppler.Rectangle"
status = "generate"
boxed_inline = true

There's a couple interesting workarounds I've got baked in there, for some value of "interesting".

For example, the poppler_page_get_bounding_box function prototype looks like this:

C code
gboolean
poppler_page_get_bounding_box (PopplerPage *page,
                               PopplerRectangle *rect);

And so by default, gtk-rs/gir generated something like this:

Rust code
impl Page {
  fn is_bounding_box(&mut self, rect: &mut Rectangle) -> bool { /* ... */ }
}

Ohhh because it returns a bool, right.

...hence the odd "rename get_bounding_box to get_bounding_box" configuration. get_crop_box generated code that straight up refused to compile, so I had to ignore it - and I ran into a couple other issues, but I have to say I've been using the 0.14 branch of gtk-rs/gir, and the development branch contains a lot of improvements already.
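
As an aside, that "fill an out-parameter, return a bool" shape is very C. If I were writing the wrapper by hand, I'd probably fold both into an `Option`. Here's a sketch of the idea, with made-up names and a made-up "empty page" failure condition; it is not what gir generates, and not poppler's actual behavior.

```rust
#[derive(Debug, Default, Clone, Copy, PartialEq)]
struct Rect {
    x1: f64,
    y1: f64,
    x2: f64,
    y2: f64,
}

/// Mimics the C calling convention: fill `rect`, return whether it worked.
/// The "empty page" rule is invented for this sketch.
fn get_bounding_box_raw(page_is_empty: bool, rect: &mut Rect) -> bool {
    if page_is_empty {
        return false;
    }
    *rect = Rect { x1: 0.0, y1: 0.0, x2: 100.0, y2: 50.0 };
    true
}

/// A more idiomatic shape: the out-parameter and the flag collapse into
/// a single `Option`.
fn bounding_box(page_is_empty: bool) -> Option<Rect> {
    let mut rect = Rect::default();
    get_bounding_box_raw(page_is_empty, &mut rect).then_some(rect)
}

fn main() {
    println!("{:?}", bounding_box(false));
    println!("{:?}", bounding_box(true));
}
```

With that shape, callers can't forget to check the flag before reading the rectangle.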

Wait, why did you use 0.14 then?

That's what the existing glib and cairo-rs crates were generated with.

So.. the versions have to match to be able to interoperate?


And again, just running gir generates a whole crate, a high-level, safe one this time:

Rust code
// in `poppler-rs/poppler/src/auto/`

// This file was generated by gir (https://github.com/gtk-rs/gir)
// from ../../gir-files
// from ../gir-files

use crate::Rectangle;
use glib::object::ObjectType as ObjectType_;
use glib::signal::connect_raw;
use glib::signal::SignalHandlerId;
use glib::translate::*;
use std::boxed::Box as Box_;
use std::fmt;
use std::mem;
use std::mem::transmute; // that's how you know it's gonna get good

glib::wrapper! {
    #[doc(alias = "PopplerPage")]
    pub struct Page(Object<ffi::PopplerPage>);

    match fn {
        type_ => || ffi::poppler_page_get_type(),
    }
}

impl Page {
    // (still not sure why this returns a bool / when this would ever return
    // false, the docs are non-existent)

    #[doc(alias = "poppler_page_get_bounding_box")]
    pub fn get_bounding_box(&self, rect: &mut Rectangle) -> bool {
        unsafe {
            from_glib(ffi::poppler_page_get_bounding_box(self.to_glib_none().0, rect.to_glib_none_mut().0))
        }
    }

    // (skipped a bunch of methods)

    #[doc(alias = "poppler_page_render")]
    pub fn render(&self, cairo: &cairo::Context) {
        unsafe {
            ffi::poppler_page_render(self.to_glib_none().0, mut_override(cairo.to_glib_none().0));
        }
    }

    #[doc(alias = "poppler_page_render_for_printing")]
    pub fn render_for_printing(&self, cairo: &cairo::Context) {
        unsafe {
            ffi::poppler_page_render_for_printing(self.to_glib_none().0, mut_override(cairo.to_glib_none().0));
        }
    }

    // (skipped all the other methods)
}

Just like before, the high-level poppler crate depends on high-level glib/cairo crates. And bitflags, for reasons™️

TOML markup
# in `poppler-rs/poppler/Cargo.toml`

[dependencies]
glib = "0.14.8"
libc = "0.2.107"
cairo-rs = "0.14.9"
bitflags = "1.3.2"

And now, FINALLY, we can use these bindings.

Using our fresh poppler-rs bindings

I made a tiny version of pdftocairo that exclusively renders to a cairo SVG surface, just to try things out. Here it is in its entirety:

TOML markup
# in `pdftocairo/Cargo.toml`

[package]
name = "pdftocairo"
version = "0.1.0"
edition = "2021"

[dependencies]
# for utf-8 paths
camino = "1.0.5"
# for error handling
color-eyre = "0.5.11"
# *chants* poppler, poppler, poppler!
poppler-rs = { path = "../poppler-rs/poppler" }
# for rendering
cairo-rs = { version = "0.14.9", features = ["svg"] }
# for application-level tracing
tracing = "0.1.29"
tracing-error = "0.2.0"
tracing-subscriber = { version = "0.3.1", features = ["env-filter"] }

Rust code
// in `pdftocairo/src/`

use std::fs::File;

use cairo::{Context, SvgSurface};
use camino::Utf8PathBuf;
use color_eyre::{eyre::eyre, Report};
use poppler::Rectangle;
use tracing::info;

fn main() -> Result<(), Report> {
    if std::env::var("RUST_LOG").is_err() {
        std::env::set_var("RUST_LOG", "info");
    }
    install_tracing();

    let path = Utf8PathBuf::from("/tmp/export.pdf");
    info!(%path, "Reading file...");
    let data = std::fs::read(&path)?;
    info!(%path, "Reading file... done!");
    let doc = poppler::Document::from_data(&data[..], None)?;
    info!("Got the document! {:#?}", doc);

    info!("Producer = {:#?}", doc.producer());
    info!("Num pages = {:#?}", doc.n_pages());

    // assumption: grab the first (and only) page by index
    let page = doc
        .page(0)
        .ok_or_else(|| eyre!("document has no pages"))?;
    info!("page = {:#?}", page);

    let mut bb: Rectangle = Default::default();
    page.get_bounding_box(&mut bb);
    info!("bb = {:#?}", *bb);

    info!("Creating file!");
    let export_path = Utf8PathBuf::from("/tmp/export.svg");
    let f = File::create(&export_path)?;

    info!("Creating surface...");
    let surface = SvgSurface::for_stream(bb.x2 - bb.x1, bb.y2 - bb.y1, f)?;

    info!("Creating context...");
    let cx = Context::new(&surface)?;

    info!("Rendering...");
    page.render(&cx);

    info!("Finishing output stream...");
    surface
        .finish_output_stream()
        .map_err(|e| eyre!("cairo error: {}", e.to_string()))?;

    info!(%export_path, "We're.. done?");

    Ok(())
}

fn install_tracing() {
    use tracing_error::ErrorLayer;
    use tracing_subscriber::prelude::*;
    use tracing_subscriber::{fmt, EnvFilter};

    let fmt_layer = fmt::layer();
    let filter_layer = EnvFilter::try_from_default_env()
        .or_else(|_| EnvFilter::try_new("info"))
        .unwrap();

    tracing_subscriber::registry()
        .with(filter_layer)
        .with(fmt_layer)
        .with(ErrorLayer::default())
        .init();
}

And here's proof it works!

Shell session
$ cargo build
    Finished dev [unoptimized + debuginfo] target(s) in 0.02s

$ ./target/debug/pdftocairo 
2021-11-24T18:14:36.936369Z  INFO pdftocairo: Reading file... path=/tmp/export.pdf
2021-11-24T18:14:36.936467Z  INFO pdftocairo: Reading file... done! path=/tmp/export.pdf
2021-11-24T18:14:36.939146Z  INFO pdftocairo: Got the document! Document(
    ObjectRef {
        inner: 0x000055bfa9458400,
        type: PopplerDocument,
    },
)
2021-11-24T18:14:36.939199Z  INFO pdftocairo: Producer = Some(
    "Skia/PDF m74",
)
2021-11-24T18:14:36.939239Z  INFO pdftocairo: Num pages = 1
2021-11-24T18:14:36.939284Z  INFO pdftocairo: page = Page(
    ObjectRef {
        inner: 0x000055bfa9458440,
        type: PopplerPage,
    },
)
2021-11-24T18:14:36.941495Z  INFO pdftocairo: bb = PopplerRectangle @ 0x55bfa9467810 {
    x1: 0.0,
    y1: 0.0,
    x2: 744.9599599999999,
    y2: 481.91998,
}
2021-11-24T18:14:36.941563Z  INFO pdftocairo: Creating file!
2021-11-24T18:14:36.941622Z  INFO pdftocairo: Creating surface...
2021-11-24T18:14:36.941667Z  INFO pdftocairo: Creating context...
2021-11-24T18:14:36.941691Z  INFO pdftocairo: Rendering...
2021-11-24T18:14:36.947172Z  INFO pdftocairo: Finishing output stream...
2021-11-24T18:14:36.955067Z  INFO pdftocairo: We're.. done? export_path=/tmp/export.svg

Oooh, Skia!

And here's the result:

Shell session
$ head /tmp/export.svg 
<?xml version="1.0" encoding="UTF-8"?>
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="744.95996pt" height="481.91998pt" viewBox="0 0 744.95996 481.91998" version="1.1">
<defs>
<g>
<symbol overflow="visible" id="glyph0-0">
<path style="stroke:none;" d="M 0.703125 0 L 0.703125 -8.8125 L 5.28125 -8.8125 L 5.28125 0 Z M 1.109375 -3.96875 L 2.84375 -6.140625 L 4.65625 -8.40625 L 3.53125 -8.40625 L 1.109375 -5.375 Z M 1.109375 -5.84375 L 2.28125 -7.296875 L 3.15625 -8.40625 L 2.03125 -8.40625 L 1.109375 -7.234375 Z M 1.109375 -7.703125 L 1.671875 -8.40625 L 1.109375 -8.40625 Z M 1.109375 -2.078125 L 3.890625 -5.578125 L 4.890625 -6.8125 L 4.890625 -8.21875 L 1.109375 -3.5 Z M 1.109375 -0.390625 L 1.25 -0.390625 L 4.890625 -4.9375 L 4.890625 -6.359375 L 1.109375 -1.625 Z M 1.625 -0.390625 L 2.75 -0.390625 L 4.890625 -3.0625 L 4.890625 -4.46875 Z M 3.125 -0.390625 L 4.25 -0.390625 L 4.890625 -1.203125 L 4.890625 -2.59375 Z M 4.890625 -0.390625 L 4.890625 -0.734375 L 4.625 -0.390625 Z M 4.890625 -0.390625 "/>
</symbol>
<symbol overflow="visible" id="glyph0-1">
<path style="stroke:none;" d="M 0.796875 0 L 0.796875 -8.8125 L 5.28125 -8.8125 L 5.28125 -7.65625 L 2.140625 -7.65625 L 2.140625 -5.15625 L 4.609375 -5.15625 L 4.609375 -4 L 2.140625 -4 L 2.140625 -1.15625 L 5.28125 -1.15625 L 5.28125 0 Z M 0.796875 0 "/>
</symbol>


What's the matter? Can't you render SVG in your head?

Mhh if I told you you'd probably have me do it!


What did we learn?

We were using a tiny subset of what Inkscape can do: rendering a PDF file to an SVG surface, as paths. And it turns out, we only need the poppler and cairo libraries to do that.

Because both have a "glib" interface, we can use all the GTK-cinematic-universe tooling to generate Rust bindings for them. cairo already has an official binding, but the poppler one was out-of-date: we just regenerated it with gtk-rs/gir and we were on our way.

This article is part 2 of the Don't shell out! series.
