Streaming PostgreSQL backups with pg_dump and borgbackup


I run a bitmagnet instance, an awesome BitTorrent DHT crawler and content classifier. While the database is not really critical and could be rebuilt by crawling again in case of data loss, it would take a long time to reach its current completeness. As you can imagine, there are quite a lot of active torrents (29 million indexed as of 2026!).

So I needed to back up the database. I identified two usual approaches: back up the PostgreSQL data files after taking a Btrfs snapshot, or make a pg_dump of the DB. I eliminated the first option, as it also backs up the indices, which can be rebuilt from the data and only inflate the backup.

pg_dump, on the other hand, requires some reads and CPU time and, if used conventionally, a lot of disk storage (around 170 GB) for a dump file that serves no other purpose on the host. The storage requirement, however, can be worked around.

The solution

To avoid dumping the DB to disk before it is picked up by borgbackup, we create a FIFO file, have pg_dump write into it, and have borgbackup read it as a regular file thanks to the --read-special flag. Opening the FIFO for writing blocks until borgbackup starts reading, so pg_dump adds no overhead until the backup actually runs.
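The trick can be sketched in plain shell. Here printf stands in for pg_dump and cat for borg; the path comes from mktemp and is not part of the original setup:

```shell
fifo=$(mktemp -u)                 # pick a path for the FIFO (not created yet)
mkfifo --mode=660 "$fifo"

# Writer side: stands in for pg_dump. Opening the FIFO for writing
# blocks until a reader attaches, so nothing is buffered to disk.
printf 'dump data\n' > "$fifo" &

# Reader side: stands in for borg create --read-special.
out=$(cat "$fifo")

wait                              # reap the background writer
rm "$fifo"
echo "$out"
```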

I'm using the borgbackup NixOS module, but as the result is mostly simple bash, the same approach can easily be used on other systems.

services.borgbackup.jobs.example = {
  extraCreateArgs = "--read-special";
  preHook = lib.concatMapStringsSep "\n" (
    db:
    let
      dumpfile = "/tmp/pg-dump-${db}.sql";
    in
    ''
      # Remove any FIFO left over from a previous run, then recreate it.
      ${pkgs.coreutils}/bin/rm -f "${dumpfile}"
      ${pkgs.coreutils}/bin/mkfifo --mode=660 "${dumpfile}"
      ${pkgs.coreutils}/bin/chgrp postgres "${dumpfile}"
      ${pkgs.su}/bin/su postgres -c '${config.services.postgresql.package}/bin/pg_dump -f ${dumpfile} ${db}' &
    ''
  ) [ "database1" "database2" ];
};
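Outside NixOS, the preHook boils down to plain shell. A sketch only: database names, paths, and the borg invocation below are placeholders, not the original configuration:

```shell
# For each database, create a FIFO and start pg_dump writing into it
# in the background (it blocks until borg starts reading).
for db in database1 database2; do
  dumpfile="/tmp/pg-dump-${db}.sql"
  rm -f "$dumpfile"
  mkfifo --mode=660 "$dumpfile"
  chgrp postgres "$dumpfile"
  su postgres -c "pg_dump -f '$dumpfile' '$db'" &
done

# --read-special makes borg read the FIFOs as regular files.
borg create --read-special ::pg-dumps /tmp/pg-dump-*.sql
```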

The & is important here: pg_dump blocks until borgbackup has read the whole stream, so without backgrounding it the preHook would never return.
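The blocking behaviour is easy to observe: with no reader attached, even opening the FIFO for writing hangs. A small demonstration, using timeout to abort the attempt:

```shell
fifo=$(mktemp -u)
mkfifo "$fifo"

# No reader is attached, so opening the FIFO for writing blocks;
# timeout aborts the attempt after one second (exit status 124).
timeout 1 sh -c "echo data > '$fifo'"
rc=$?
echo "exit status without a reader: $rc"

rm "$fifo"
```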