Compiling plpython(3)u

Time marches on and Python 2 is now past EOL, with the last release series, 2.7, unsupported as of 2020-01-01. In the Postgres world, though, Python 2 lives on as the default version referred to by the procedural language plpythonu (see the PL/Python 2/3 documentation). To use a Python 3 version of the procedural language, the plpython3u variant needs to be present. For users getting their Postgres through a packaging system (DEB, RPM, etc.) this is taken care of, with packages available for both versions of Python. If you are compiling Postgres from source, whether from necessity or habit, the process is a bit more involved.

At this point plpythonu will only be built for Python 2 (>= 2.6). When you run
configure it probes for Python versions in the order python, python3, python2.
If python -V resolves to a Python 2 version you will get plpythonu; if it
resolves to Python 3 you will get plpython3u. The question then is how to build for the other version of Python. The answer is that you can put your thumb on the scale by setting the environment variable PYTHON to the Python version you want built against:

export PYTHON=python3

What follows is the procedure I use where the system python is Python 2.

Initial configure. For this example I am using a minimal setup to show just the Python building. Normally there would be additional arguments, say --with-openssl, --with-libxml, etc.
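
For instance, a fuller invocation might look something like the following. The extra flags here are illustrative only; the walkthrough below sticks to --with-python:

./configure --prefix=/usr/local/pgsql14 \
    --with-python \
    --with-openssl \
    --with-libxml \
    --with-libxslt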

Verify what python is pointing to.

python -V
Python 2.7.x

The configure finds the Python 2 libraries.

./configure --with-python --prefix=/usr/local/pgsql14
checking for python... /usr/bin/python
configure: using python 2.7.18 (default, Apr 23 2020, 09:27:04) [GCC]
checking for Python distutils module... yes
checking Python configuration directory... /usr/lib64/python2.7/config
checking Python include directories... -I/usr/include/python2.7
checking how to link an embedded Python application... -L/usr/lib64 -lpython2.7
-lpthread -ldl -lutil -lm

Build the Python 2 version of the procedural language and install it. At this point the make is done at the top level of the source tree so that the entire tree gets built.

make
sudo make install
/usr/bin/mkdir -p '/usr/local/pgsql14/lib64'
/usr/bin/install -c -m 755 plpython2.so '/usr/local/pgsql14/lib64/plpython2.so'
/usr/bin/mkdir -p '/usr/local/pgsql14/share/extension' '/usr/local/pgsql14/include/server' '/usr/local/pgsql14/lib64/pgxs/src/pl/plpython'
/usr/bin/install -c -m 644 ./plpython2u.control ./plpython2u--1.0.sql ./plpythonu.control ./plpythonu--1.0.sql '/usr/local/pgsql14/share/extension/'
/usr/bin/install -c -m 644 ./plpython.h ./plpy_cursorobject.h ./plpy_elog.h ./plpy_exec.h ./plpy_main.h ./plpy_planobject.h ./plpy_plpymodule.h ./plpy_procedure.h ./plpy_resultobject.h ./plpy_spi.h ./plpy_subxactobject.h ./plpy_typeio.h ./plpy_util.h '/usr/local/pgsql14/include/server'
/usr/bin/install -c -m 644 ./regress-python3-mangle.mk '/usr/local/pgsql14/lib64/pgxs/src/pl/plpython'

The procedural language plpython2u is essentially symlinked to plpythonu in the above; both extensions load the same library.
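
You can see this in what was installed (paths assume the prefix used above):

ls /usr/local/pgsql14/share/extension/plpython*
/usr/local/pgsql14/share/extension/plpython2u--1.0.sql
/usr/local/pgsql14/share/extension/plpython2u.control
/usr/local/pgsql14/share/extension/plpythonu--1.0.sql
/usr/local/pgsql14/share/extension/plpythonu.control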

Use the PYTHON environment variable to point at python3, then verify that the variable was set.

export PYTHON=python3
env | grep -i PYTHON
PYTHON=python3

Run configure again to pick up the new version (3) of Python, which it does.

./configure --with-python --prefix=/usr/local/pgsql14
checking for PYTHON... python3
configure: using python 3.6.12 (default, Dec 02 2020, 09:44:23) [GCC]
checking for Python distutils module... yes
checking Python configuration directory... /usr/lib64/python3.6/config-3.6m-x86_64-linux-gnu
checking Python include directories... -I/usr/include/python3.6m
checking how to link an embedded Python application... -L/usr/lib64 -lpython3.6m -lpthread -ldl -lutil -lm

Change directories to the plpython subdirectory. Run make clean to get rid of the previous Python 2 build, then do the make/make install to build and install plpython3u.

cd src/pl/plpython/
make clean
make
sudo make install

/usr/bin/mkdir -p '/usr/local/pgsql14/lib64'
/usr/bin/install -c -m 755 plpython3.so '/usr/local/pgsql14/lib64/plpython3.so'
/usr/bin/mkdir -p '/usr/local/pgsql14/share/extension' '/usr/local/pgsql14/include/server' '/usr/local/pgsql14/lib64/pgxs/src/pl/plpython'
/usr/bin/install -c -m 644 ./plpython3u.control ./plpython3u--1.0.sql '/usr/local/pgsql14/share/extension/'
/usr/bin/install -c -m 644 ./plpython.h ./plpy_cursorobject.h ./plpy_elog.h ./plpy_exec.h ./plpy_main.h ./plpy_planobject.h ./plpy_plpymodule.h ./plpy_procedure.h ./plpy_resultobject.h ./plpy_spi.h ./plpy_subxactobject.h ./plpy_typeio.h ./plpy_util.h '/usr/local/pgsql14/include/server'
/usr/bin/install -c -m 644 ./regress-python3-mangle.mk '/usr/local/pgsql14/lib64/pgxs/src/pl/plpython'

Now that they have been built, it is just a matter of installing them, as extensions, into the database or databases of your choice.

create extension plpythonu;
create extension plpython3u;
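
To confirm, you can list them from psql:

\dx plpython*

which should show both plpythonu and plpython3u among the installed extensions.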

One heads-up about having both versions installed in the same database.

DO $$
    plpy.notice('Python 2')
$$ LANGUAGE plpythonu;
NOTICE:  Python 2
DO
DO $$
    plpy.notice('Python 3')
$$ LANGUAGE plpython3u;
FATAL:  multiple Python libraries are present in session
DETAIL:  Only one Python major version can be used in one session.
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.

As the message says, you can only use one Python major version at a time in a given session.
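
The restriction is per session, not per database. In a fresh session either language works on its own; for example, after reconnecting:

DO $$
    plpy.notice('Python 3')
$$ LANGUAGE plpython3u;
NOTICE:  Python 3
DO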

Postgres and JSON

Time for a renovation of an old post that dealt with Postgres and hstore. This post moves from the hstore data type to json(b). At the time the original post was written, the introduction of the json type was about three months away and jsonb two and a half years out. It also took several years for the operator and function choices to be fleshed out. Now that the support has been mature for some time, it seems a good time to make the change. To recap, the original post dealt with using hstore to capture deleted records into an audit table without having to modify the audit table when the source table structure changed. On to doing this with JSON instead.

Setup section.

This is being done using Postgres version 14.1. In the previous post a table from an actual application was used; unfortunately, that relied on information not available to the reader. This post uses a test setup that allows you to run the code locally should you desire to.

The setup DDL:

DROP TABLE IF EXISTS archive_test;

CREATE TABLE archive_test (
    line_id smallint PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
    varchar_fld varchar(10) NOT NULL,
    integer_fld integer,
    boolean_fld boolean NOT NULL DEFAULT 'f',
    ts_fld  timestamptz NOT NULL DEFAULT now()
);

DROP TABLE IF EXISTS archive_test_delete;

CREATE TABLE archive_test_delete (
    del_id integer PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
    record_fld jsonb,
    del_ts timestamp with time zone DEFAULT now(),
    del_user character varying
);

DROP FUNCTION IF EXISTS archive_record();

CREATE OR REPLACE FUNCTION archive_record()
RETURNS trigger
LANGUAGE plpgsql
SECURITY DEFINER
AS $BODY$
DECLARE
    -- Audit table name: the source table name plus the _delete suffix.
    tbl_name text := TG_TABLE_NAME || '_delete';
    -- The deleted row converted to a JSON object.
    archive_row jsonb := row_to_json(OLD.*);
    usr_name text := session_user;
BEGIN
    EXECUTE 'INSERT INTO ' || quote_ident(tbl_name)
        || '(record_fld,del_ts,del_user)'
        || ' VALUES(' || quote_literal(archive_row) || ',now(),'
        || quote_literal(usr_name) || ')';
    RETURN OLD;
END;
$BODY$;
 
CREATE TRIGGER 
    p1_delete 
AFTER DELETE ON 
    archive_test 
FOR EACH ROW EXECUTE PROCEDURE archive_record();

The archive_record() trigger function uses the row_to_json() function (see JSON Functions in the Postgres documentation) to convert the OLD record, which holds the deleted values, into a JSON object. This is then combined with now() and session_user and inserted into the audit table. The audit table name is derived from the source table name with the addition of the suffix _delete.
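
To see what row_to_json() produces on its own, once the data below is loaded you can run something like this (output along these lines):

SELECT row_to_json(t.*) FROM archive_test t WHERE line_id = 1;

{"line_id":1,"varchar_fld":"test","integer_fld":1,"boolean_fld":true,"ts_fld":"2021-12-06T10:12:30-08:00"}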

Initial data values:

INSERT INTO 
    archive_test 
VALUES 
    (DEFAULT, 'test', 1, 't', '12/06/2021 10:12:30'),
    (DEFAULT, 'cat', 2, 'f', '12/06/2021 11:00:15'),
    (DEFAULT, 'dog', 3, 't', '12/06/2021 11:15:32');

SELECT * FROM archive_test;

-[ RECORD 1 ]-----------------------
line_id     | 1
varchar_fld | test
integer_fld | 1
boolean_fld | t
ts_fld      | 2021-12-06 10:12:30-08
-[ RECORD 2 ]-----------------------
line_id     | 2
varchar_fld | cat
integer_fld | 2
boolean_fld | f
ts_fld      | 2021-12-06 11:00:15-08
-[ RECORD 3 ]-----------------------
line_id     | 3
varchar_fld | dog
integer_fld | 3
boolean_fld | t
ts_fld      | 2021-12-06 11:15:32-08

What happens when you DELETE a record:

DELETE FROM archive_test WHERE line_id = 2;

SELECT * FROM archive_test_delete;
-[ RECORD 1 ]----------------------------------------------------
del_id     | 1
record_fld | {"ts_fld": "2021-12-06T11:00:15-08:00", "line_id": 2, "boolean_fld": false, "integer_fld": 2, "varchar_fld": "cat"}
del_ts     | 2021-12-06 13:52:29.266929-08
del_user   | postgres


Per the trigger function, the deleted row is added as a JSON object to the audit table, along with the time of deletion and the user that deleted it.

Change the table by adding a column:

ALTER TABLE 
    archive_test
ADD COLUMN
    json_fld json;
    
--INSERT values into new form of table:

INSERT INTO 
    archive_test 
VALUES 
    (DEFAULT, 'fish', 4, 't', '11/26/2021 10:12:30', '{"one": 1, "two": 2}'),
    (DEFAULT, 'rabbit', 5, 'f', '11/26/2021 11:00:15', '{"three": 3, "four":4}'),
    (DEFAULT, 'cow', 6, 't', '11/26/2021 11:15:32', '{"five": 5, "six": 6}');
    
--DELETE from changed table:
    
DELETE FROM archive_test WHERE line_id = 6;

SELECT * FROM archive_test_delete WHERE del_id = 2;

-[ RECORD 1 ]----------------------------------------------------
del_id     | 2
record_fld | {"ts_fld": "2021-11-26T11:15:32-08:00", "line_id": 6, "json_fld": {"six": 6, "five": 5}, "boolean_fld": true, "integer_fld": 6, "varchar_fld": "cow"}
del_ts     | 2021-12-06 13:56:22.47116-08
del_user   | postgres

The function picks up the new column and includes it in the delete audit table.

Retrieving data from the audit table:

SELECT 
    t.* 
FROM 
    archive_test_delete, 
LATERAL 
    jsonb_populate_record(null::archive_test, record_fld) t;
    
-[ RECORD 1 ]-----------------------
line_id     | 2
varchar_fld | cat
integer_fld | 2
boolean_fld | f
ts_fld      | 2021-12-06 11:00:15-08
json_fld    | 
-[ RECORD 2 ]-----------------------
line_id     | 6
varchar_fld | cow
integer_fld | 6
boolean_fld | t
ts_fld      | 2021-11-26 11:15:32-08
json_fld    | {"six": 6, "five": 5}

Again, using a function from JSON Functions, pull the data out of the audit table and expand the JSON object into a record. jsonb_populate_record takes as its first argument a composite type, in this case archive_test. In Postgres every table has a composite type defined for its structure, so using archive_test means that the second argument, record_fld (the JSON record), will be matched against the archive_test table's column names and types. The null:: just says to use NULL for any columns in the archive_test type/table that do not have corresponding fields in the JSON object. The LATERAL comes from SELECT and in this particular case is not strictly needed; it is there to emphasize that jsonb_populate_record is working against each row in the archive_test_delete table.
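
Since a function call in FROM can refer to earlier FROM items anyway, LATERAL is effectively a noise word here and the query can be written without it:

SELECT
    t.*
FROM
    archive_test_delete,
    jsonb_populate_record(null::archive_test, record_fld) t;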

What happens if you drop a column:

BEGIN;

ALTER TABLE archive_test DROP COLUMN json_fld ;

SELECT 
    t.* 
FROM 
    archive_test_delete, LATERAL jsonb_populate_record(null::archive_test, record_fld) t;


-[ RECORD 1 ]-----------------------
line_id     | 2
varchar_fld | cat
integer_fld | 2
boolean_fld | f
ts_fld      | 2021-12-06 11:00:15-08
-[ RECORD 2 ]-----------------------
line_id     | 6
varchar_fld | cow
integer_fld | 6
boolean_fld | t
ts_fld      | 2021-11-26 11:15:32-08

ROLLBACK;

Because jsonb_populate_record is using the archive_test composite type and json_fld no longer exists, it also 'disappears' from the query. The same applies if the column is renamed. The upshot is that if you only ever add columns to a table and never rename them, this system works without too much fuss on the retrieval end. Otherwise you will just have to pull out record_fld as a JSON object and view the data from there.
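
For that last case, a minimal example that shows the stored object as-is, regardless of what the live table now looks like:

SELECT del_id, jsonb_pretty(record_fld) FROM archive_test_delete;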