Toulouse, October 17th, 2023

Today, we are releasing pglift 1.0, a new framework for deploying and operating PostgreSQL at scale. The project ships with both a command-line interface to manage the life-cycle of your databases and a collection of Ansible modules to drive your infrastructure as code in a production context.

By this 1.0 milestone, the user interface of pglift is now stable and the product considered ready for general use.

What is pglift?

The project originates from the need of Dalibo’s customers to be able to deploy and operate many PostgreSQL clusters as part of their production environments, in a uniform manner across their infrastructure:

  • The production aspect means that PostgreSQL, the database engine, usually needs to be complemented by other services for backup, monitoring or high-availability.
  • The uniform aspect typically refers to automation, meaning that end-users want to streamline the deployment of their database service so as to operate it in a safe manner.

pglift tries to answer these challenges by providing:

  1. a single entry-point and user-friendly command-line interface to operate a database cluster on site, and,
  2. a collection of Ansible modules, to automate deployment and operations across an infrastructure.

Either way, the behaviour of pglift is highly configurable, from how the PostgreSQL cluster is deployed (authentication, storage layout, logging, backup) to which components are set up alongside the instance to make it production-ready (physical backup, monitoring, high-availability).

Anyone wanting to deploy and operate PostgreSQL at a higher level than bare initdb or pg_ctl commands or with the need for some level of automation might find pglift attractive. This goes from Enterprise IT services to devops wanting their infrastructure managed through GitOps.

On the other hand, pglift is not meant to be the single solution to every use case but rather a piece of software that composes well with other tools. As an illustration, while pglift is local first and agnostic of any particular infrastructure framework, it can seamlessly integrate with Ansible already and could be made working with Terraform (or Kubernetes) similarly.

While it remains as configurable as possible, pglift is opinionated in some respect, as when some business choices need to be made, they usually result from Dalibo’s expertise.

Enough theory, let’s see what pglift can do!

The command-line interface

The logical unit of work of pglift is the instance, which at least consists of a PostgreSQL cluster but usually also includes satellite components. Creating an instance from the command-line is done as follows:

$ pglift instance create main --pgbackrest-stanza=main
INFO     initializing PostgreSQL
INFO     configuring PostgreSQL authentication
INFO     configuring PostgreSQL
INFO     starting PostgreSQL 16-main
INFO     creating role 'powa'
INFO     creating role 'prometheus'
INFO     creating role 'backup'
INFO     altering role 'backup'
INFO     creating 'powa' database in 16/main
INFO     creating extension 'btree_gist' in database powa
INFO     creating extension 'pg_qualstats' in database powa
INFO     creating extension 'pg_stat_statements' in database powa
INFO     creating extension 'pg_stat_kcache' in database powa
INFO     creating extension 'powa' in database powa
INFO     configuring Prometheus postgres_exporter 16-main
INFO     configuring pgBackRest stanza 'main' for
         pg1-path=/srv/pgsql/16/main/data
INFO     creating pgBackRest stanza main
INFO     starting Prometheus postgres_exporter 16-main

As we can see, aside from PostgreSQL, the instance includes a set of extensions required to make PoWA available, a Prometheus postgres_exporter service for monitoring and a pgBackRest setup for physical backup. These integrations are not defined during operations, but rather configured beforehand and locally through so-called site settings, in the form of plain YAML file such as:

# file: /etc/pglift/settings.yaml
prefix: /srv
postgresql:
  auth:
    host: scram-sha-256
prometheus:
  execpath: /usr/bin/prometheus-postgres-exporter
pgbackrest:
  repository:
    mode: path
    path: /srv/pgsql-backups
powa: {}
systemd: {}
rsyslog: {}

In addition to services running alongside PostgreSQL (monitoring, backup), pglift can also handle system integrations like systemd and rsyslog as in our example. All this works without root privileges for better separation of concerns and security.

One fundamental feature of pglift is that it is stateless but still aware of the state of objects it manages at runtime; typically, this allows to retrieve the state of an instance (at some point in time):

$ pglift instance get main -o json
{
  "name": "main",
  "version": "16",
  "port": 5432,
  "settings": {
    "unix_socket_directories": "/run/user/1000/pglift/postgresql",
    "shared_buffers": "1 GB",
    "wal_level": "replica",
    "archive_mode": true,
    "archive_command": "/usr/bin/pgbackrest --config-path=/etc/pgbackrest --stanza=main --pg1-path=/srv/pgsql/16/main/data archive-push %p",
    "effective_cache_size": "4 GB",
    "log_destination": "syslog",
    "logging_collector": true,
    "log_directory": "/var/log/postgresql",
    "log_filename": "16-main-%Y-%m-%d_%H%M%S.log",
    "syslog_ident": "postgresql-16-main",
    "cluster_name": "main",
    "lc_messages": "C",
    "lc_monetary": "C",
    "lc_numeric": "C",
    "lc_time": "C",
    "shared_preload_libraries": "pg_qualstats, pg_stat_statements, pg_stat_kcache"
  },
  "data_checksums": false,
  "locale": "C",
  "encoding": "UTF8",
  "standby": null,
  "state": "started",
  "pending_restart": false,
  "wal_directory": "/srv/pgsql/16/main/wal",
  "prometheus": {
    "port": 9187
  },
  "data_directory": "/srv/pgsql/16/main/data",
  "powa": {},
  "pgbackrest": {
    "stanza": "main"
  }
}

or to modify the instance:

$ pglift pgconf -i main set log_connections=on
INFO     configuring PostgreSQL
INFO     instance 16/main needs reload due to parameter changes: log_connections
INFO     reloading PostgreSQL configuration for 16-main
log_connections: None -> True
$ pglift instance alter main --prometheus-port 8188
INFO     configuring PostgreSQL
INFO     reconfiguring Prometheus postgres_exporter 16-main
INFO     instance 16/main needs reload due to parameter changes: log_connections
INFO     reloading PostgreSQL configuration for 16-main
INFO     starting Prometheus postgres_exporter 16-main
$ pglift instance get main
 name  version  port  data_checksums  locale  encoding  pending_restart  prometheus  pgbackrest
 main  16       5432  False           C       UTF8      False            port: 9187  stanza: main

pglift can also operate on PostgreSQL objects: databases, roles, schemas, privileges, etc.

Moreover, instances and other PostgreSQL objects can be manipulated through native PostgreSQL utilities from the pglift command-line, which is composable as it forwards the instance environment to invoked command:

$ pglift instance exec main -- pgbench -i bench
creating tables...
generating data (client-side)...
100000 of 100000 tuples (100%) done (elapsed 0.06 s, remaining 0.00 s)
vacuuming...
creating primary keys...
done in 0.18 s (drop tables 0.00 s, create tables 0.01 s, client-side generate 0.08 s, vacuum 0.04 s, primary keys 0.05 s).
$ pglift instance exec main -- pgbench bench
pgbench (16.0 (Debian 16.0-1.pgdg120+1))
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
maximum number of tries: 1
number of transactions per client: 10
number of transactions actually processed: 10/10
number of failed transactions: 0 (0.000%)
latency average = 1.669 ms
initial connection time = 4.544 ms
tps = 599.125277 (without initial connection time)

This also applies to third party tools, e.g.:

$ pglift instance exec main -- pgbackrest info
stanza: main
    status: ok
    cipher: none

    db (current)
        wal archive min/max (16): 000000010000000000000001/000000010000000000000007

        full backup: 20231016-092726F
            timestamp start/stop: 2023-10-16 09:27:26+02 / 2023-10-16 09:27:31+02
            wal start/stop: 000000010000000000000004 / 000000010000000000000004
            database size: 32.0MB, database backup size: 32.0MB
            repo1: backup set size: 4.2MB, backup size: 4.2MB

        diff backup: 20231016-092726F_20231016-092821D
            timestamp start/stop: 2023-10-16 09:28:21+02 / 2023-10-16 09:28:24+02
            wal start/stop: 000000010000000000000007 / 000000010000000000000007
            database size: 54.5MB, database backup size: 22.6MB
            repo1: backup set size: 6MB, backup size: 1.8MB
            backup reference list: 20231016-092726F

See the CLI tutorial in the documentation for more details.

Ansible modules

pglift ships with a collection of Ansible modules, under the dalibo.pglift namespace. Here’s an example playbook illustrating its capabilities:

- name: Set up database instances
  hosts: dbserver
  tasks:
    - name: main instance
      dalibo.pglift.instance:
        name: main
        state: started
        port: 5444
        settings:
          max_connections: 100
          shared_buffers: 1GB
          shared_preload_libraries: 'pg_stat_statements, passwordcheck'
        surole_password: '{{ postgresql_surole_password }}'
        pgbackrest:
          stanza: main
          password: '{{ backup_role_password }}'
        prometheus:
          password: '{{ prometheus_role_password }}'
          port: 9186
        roles:
          - name: admin
            login: true
            password: '{{ admin_password }}'
            connection_limit: 10
            validity: '2025-01-01T00:00'
            in_roles:
              - pg_read_all_stats
              - pg_signal_backend
        databases:
          - name: main
            owner: admin
            settings:
              work_mem: 3MB
            extensions:
              - name: unaccent
                schema: public

As we can see, the dalibo.pglift.instance module makes it possible to manage not only an instance, but also related objects such as roles and databases. Sensitive data (such as passwords) can be managed by a secret management solution, such as Ansible vault. Ansible modules allow for much more control than the command-line, in that deeply nested or complex fields, such as PostgreSQL instance (or database) settings, database extensions, etc.

This interface is completely declarative and idempotent, but still stateless (as are Ansible modules in general). As for the command-line, head out to Ansible tutorial in the documentation for more details. Also note that these modules can work together with other Ansible modules, such as community.postgresql ones.

About the project

Under the hood, all the business logic is implemented as a Python API. The core idea of pglift is to make PostgreSQL infrastructure management declarative and stateless so that it can be used through infrastructure-as-code but does not need to store anything. This design made it possible to expose both an imperative command-line interface and a declarative interface, in the form of Ansible modules.

The project is open source native, released under the GPLv3 licence. It was made possible by a team of enthusiastic developers and administrators following the idea of encoding Dalibo’s expertise into a powerful, flexible and user-friendly toolkit for day-to-day PostgreSQL administration. Its development happens in public at:

Finally, great care have been taken to provide an extensive but accessible documentation including tutorials, user guides and how-tos.


DALIBO

DALIBO est le spécialiste français de PostgreSQL®. Nous proposons du support, de la formation et du conseil depuis 2005.