Creating User-Defined C Functions in PostgreSQL

Bear GilesMarch 14th, 2016Last Updated: March 5th, 2016

0 114 2 minutes read

Many experienced database developers are familiar with user-defined functions implemented in SQL or a procedural language. PostgreSQL itself supports PL/pgSQL, PL/Tcl, PL/Perl, PL/Python, and has third-party support for PL/sh, PL/Java, PL/Ruby, PL/PHP and even PL/R.

Recent versions of PostgreSQL also support SQL/MED (foreign data) with foreign data wrappers (FDW). This allows the database user to access foreign data sources – other RDMSes such as Oracle or MySQL, noSQL databases, even services such as twitter, as though they were PostgreSQL tables. Obviously there can be performance issues but if an architect only needs to perform a simple task, e.g., sending a tweet when certain conditions are met, it may be easier to use a FDW and database query than to incorporate native twitter support.

Sometimes this is not enough. We may wish to add support for a new procedural language or foreign data wrapper. We may wish to perform work involving external libraries on the database server instead of the application server for efficiency. We may wish to hide implementation details from the database user.

A subset of the last item is management of cryptographic material. Most webapps manage the cryptographic material themselves but proper key management is difficult. For instance encryption keys should be rotated on a regular basis but many webapps have never changed their database keys because there is no provision for it. Their only option is to shut down the application, run an application that loads every record, decrypts it, reencrypts it, and then updates each record. Key management on the database side is still a difficult problem but the solutions are far more likely to be reusable.

PostgreSQL Extensions

It is possible to create user-defined functions in a sql script:

CREATE OR REPLACE FUNCTION dgst_sha1(text)
RETURNS text
AS 'pgopenssltypes', 'dgst_sha1'
LANGUAGE C IMMUTABLE STRICT

and provide the implementation by dropping the appropriate shared library in the $libdir directory. See C-Language Functions for details, or my own Introduction to PL/Java for examples of the SQL definitions for functions, triggers, operators, and indexes.

This is unmaintainable for anything other than the smallest tasks. We want to bundle everything we need into one object that we can load and unload it in a single operation. This ensures there are no oversights or inconsistencies. PostgreSQL already has a solution to this problem: PostgreSQL Extensions.

PostgreSQL Extension Network (PGXN)

Writing a PostgreSQL extension from scratch can be tricky. A better approach is to create a skeleton using the PGXN utils package and then install it with the pgxn-client utility from the PostgreSQL Extension Network.

Source Code

I am not including any source code since there are numerous projects listed at the PGXN site. For instance semver (semantic version data type, git or pgaudit. As always all public code should be viewed critically – you don’t want to learn the wrong lessons by reading bad code. (Do not take my reference to these projects as an endorsement.)

Note: if you want to use a shared library your Makefile should include the libraries in a SHLIB_LINK declaration. The system will automatically pull in any dependencies when the module is loaded.