10. Automating SQL TasksÂ

There will be times when you need a task in SQL to be automated, so that it can be triggered from another application, or scheduled to run at a specific time. This module describes a few options for doing this.

10.1. psql

Full database functionality is available through the psql command line shell application. This can be accessed from within pgAdmin, or on the command line - it is installed auomatically with Postgres, normally in the bin folder (e.g. C:\Program Files\PostgreSQL\13\bin\pgsql.exe), and so can be called from a batch file.

Full documentation on psql commands can be found here.

10.1.1. psql in pgAdmin

Got to Tools > PSQL Tool to start psql - you will see a screen like this, already connected to your database
Now type in an SQL command into the prompt, remembering to close the statement with a semicolon, for example:
select * from geometry_columns;
The full table will appear in the console

10.1.2. Run pqsl from the Command Prompt

You can run pgsql directly from a command prompt (in AppStream, use the OSGeo4W Shell command from the menu).

Start psql by typing the following, which includes the parameters for the database connection (use your own database name)
psql -h appstream_data.astuntechnology.com -d chris_guest3 -p 5432 -U postgis
Enter your password (postgis) at the prompt - you will see the following:
You can now enter an SQL statement in the same way as you did in pgAdmin, and press Enter to run it - try this with a SELECT statement on one of your tables, and donât forget to add a semi-colon ';' at the end

If you are working from the command line, and always using the same database, set the environment variables for the connection to avoid having use the parameters each time - for example:

set PGHOST=appstream_data.astuntechnology.com
set PGUSER=postgis
set PGPASSWORD=ppostgis
set PGDATABASE=chris_guest3
set PGPORT=5432

10.1.3. psql Meta-commands

As well as being able to run SQL, psql includes a set of meta-commands, prefixed with a backslash, that are processed byÂ psqlÂ itself. Meta-commands makeÂ psqlÂ useful for administration or scripting.

10.1.3.1. Basics

\h	get help with SQL
\?	get help with PSQL
\g OR ;	execute a query
\q	quit
\cd	change directory

10.1.3.2. Input / Outputs

\echo [STRING]	write string to standard output
\i [FILE]	execute commands from file
\o [FILE]	send all query results to file

10.1.3.3. Â Information

\d [OBJECTNAME]	describes tables, sequences views etc.
\db	lists tablespaces
\df	lists functions
\dn	lists schemas
\dp OR \z	lists tables, views sequences etc.
\du	lists users
\l	lists all databases
\dt	lists tables

Visit the PostgreSQL documentation for a more in depth guide to PSQL.

10.2. Functions

PostgreSQLÂ functions, also known as Stored Procedures, allow you to package queries so that they can be called when needed by any database user, with parameters if appropriate. Postgres/PostGIS are installed with a large number of built in functions - you will find them in the public schema.

Functions can be created in the language of your choice , including SQL, PL/pgSQL, C, and Python - the most common language for creating function is PL/pgSQL.

The basic syntax of a function is below:

CREATE [OR REPLACE] FUNCTION function_name (arguments) 
RETURNS return_datatype AS 
$block_name$
  DECLARE
    declaration;
    [...]
  BEGIN
    < function_body >
    [...]
    RETURN { variable_name | value }
  END; 
$block_name$
LANGUAGE plpgsql;

Here is a very simple example:

CREATE OR REPLACE FUNCTION general.myfunction(a integer, b integer) 
RETURNS integer AS 
$$
BEGIN
  return a*b;
  END;
$$
LANGUAGE plpgsql;

Although function names do not need to be schema qualified, it is recommended that they are placed in a schema. If no schema is specified they will be created in the public schema.

To run the above function you would simply call the function via a select clause. e.g.

select general.myfunction(4,3);

For more details on function see https://www.postgresql.org/docs/9.4/static/sql-createfunction.html

For more details on PL/pgSQL see https://www.postgresql.org/docs/9.4/static/plpgsql.html

10.3. Python Functions

You can write functions in python (and many other languages), first create the plpython3u extension using CREATE EXTENSION IF NOT EXISTS plpython3u. Then you can create a function using code like:

CREATE OR REPLACE FUNCTION fngetxlspts(
    param_filename text,
    OUT place text, OUT lon float, OUT lat float
)
RETURNS SETOF RECORD AS
$$
import xlrd
book = xlrd.open_workbook(param_filename)
sh = book.sheet_by_index(0)
for rx in range(1,sh.nrows):
    yield(
        sh.cell_value(rowx=rx,colx=0),
        sh.cell_value(rowx=rx,colx=1),
        sh.cell_value(rowx=rx,colx=2)
    )
$$
LANGUAGE plpython3u VOLATILE;

The lines above the first $$ define the function and it's input and output parameters, then the code between the two $$ marks is a string containing the python code.

First we import a module (xlrd) that handles reading .xls files from Excel. Then we open the workbook contained in the file who's name was passed in, and find the first sheet in the workbook. Finally, we loop through the rows (after skipping row 0 - the header) and return (or yield) the value of the first three cells in the current row. Using yield rather than return means our function can keep track of which row it's on.

The PL/Python extension also provides a python module plpy that provides access to the database. This allows you to query the catalog tables and run {{UPDATE}}s on tables based on the answers.

CREATE OR REPLACE FUNCTION add_wgs84(
    param_schema text,
    OUT result boolean
)
RETURNS boolean as
$BODY$
# get tables
plpy.info("Param is "+plpy.quote_literal(param_schema))
prep = plpy.prepare("select f_table_name as tablename, f_geometry_column as geom_col, type as gtype from public.geometry_columns where f_table_schema = $1",["text"])
rv = prep.execute([param_schema])

plpy.info("got "+str(len(rv))+" rows")
for r in range(0,len(rv)):
    plpy.execute('alter table '+param_schema+'.'+rv[r]['tablename']+' add column geom_wgs84 GEOMETRY('+rv[r]['gtype']+', 4326)')
    plpy.execute('update '+param_schema+'.'+rv[r]['tablename']+' set geom_wgs84 = st_transform('+rv[r]['geom_col']+', 4326)')

return True
$BODY$
LANGUAGE plpython3u VOLATILE;

10.4. Triggers

A trigger is a set of actions that are run automatically when a specified change operation (SQL INSERT, UPDATE, DELETE or TRUNCATE statement) is performed on a specified table. Triggers are useful for tasks such as enforcing business rules, validating input data, and keeping an audit trail.

10.4.1. Create trigger

A trigger is a named database object that is associated with a table, and it activates when a particular event (e.g. an insert, update or delete) occurs for the table/views. The statement CREATE TRIGGER creates a new trigger in PostgreSQL. Here is the syntax :

10.4.1.1. Syntax

CREATE [ CONSTRAINT ] TRIGGER name { BEFORE | AFTER | INSTEAD OF } { event [ OR ... ] }
    ON table_name
    [ FROM referenced_table_name ]
    [ NOT DEFERRABLE | [ DEFERRABLE ] { INITIALLY IMMEDIATE | INITIALLY DEFERRED } ]
    [ FOR [ EACH ] { ROW | STATEMENT } ]
    [ WHEN ( condition ) ]
    EXECUTE PROCEDURE function_name ( arguments )

10.4.1.2. Parameters

Name	Description
name	The name of the trigger. A trigger must be distinct from the name of any other trigger for the same table. The name cannot be schema-qualified â the trigger inherits the schema of its table.Â
BEFOREÂ AFTER INSTEAD OF	Determines whether the function is called before, after, or instead of the event. A constraint trigger can only be specified as AFTER.
event	One of INSERT, UPDATE, DELETE, or TRUNCATE, that will fire the trigger.
table_name	The name of the table or view the trigger is for.
referenced_table_name	The (possibly schema-qualified) name of another table referenced by the constraint. This option is used for foreign-key constraints and is not recommended for general use. This can only be specified for constraint triggers.
DEFERRABLE NOTÂ DEFERRABLEÂ INITIALLY IMMEDIATEÂ INITIALLY DEFERRED	The default timing of the trigger.
FOR EACH ROWÂ FOR EACH STATEMENT	Specifies whether the trigger procedure should be fired once for every row affected by the trigger event, or just once per SQL statement. If neither is specified, FOR EACH STATEMENT is the default.
condition	A Boolean expression that determines whether the trigger function will actually be executed.
function_name	A user-supplied function that is declared as taking no arguments and returning type trigger, which is executed when the trigger fires.
arguments	An optional comma-separated list of arguments to be provided to the function when the trigger is executed. The arguments are literal string constants.

Triggers that are specified to fire INSTEAD OF the trigger event must be marked FOR EACH ROW, and can only be defined on views. BEFORE and AFTER triggers on a view must be marked as FOR EACH STATEMENT. In addition, triggers may be defined to fire for TRUNCATE, though only FOR EACH STATEMENT. The following table summarizes which types of triggers may be used on tables and views:

When	Event	Row-level	Statement-level
BEFORE	INSERT/UPDATE/DELETE	Tables	Tables and views
BEFORE	TRUNCATE	â	Tables
AFTER	INSERT/UPDATE/DELETE	Tables	Tables and views
AFTER	TRUNCATE	â	Tables
INSTEAD OF	INSERT/UPDATE/DELETE	Views	â
INSTEAD OF	TRUNCATE	â	â

Here is a simple example of trigger function.:

CREATE TABLE test_table( col1 text, col2 text, col3 date);


CREATE OR REPLACE FUNCTION test_table_trig() RETURNS TRIGGER as
$$
BEGIN
   new.col3 = current_date;
   return new;
END;
$$
LANGUAGE plpgsql;

Now we can create the trigger which will fire at the time of execution; the event as specified in the trigger for the associated tables.

CREATE TRIGGER test_table_trigger
  BEFORE INSERT
  ON test_table
  FOR EACH ROW
  EXECUTE PROCEDURE test_table_trig();

In the above trigger function there is new keyword 'NEW' which is a PostgreSQL extension to triggers. There are two PostgreSQL extensions to trigger 'OLD' and 'NEW'. OLD and NEW are not case sensitive.

Within the trigger body, the OLD and NEW keywords enable you to access columns in the rows affected by a trigger
In an INSERT trigger, only NEW.col_name can be used.
In a UPDATE trigger, you can use OLD.col_name to refer to the columns of a row before it is updated and NEW.col_name to refer to the columns of the row after it is updated.
In a DELETE trigger, only OLD.col_name can be used; there is no new row.

A column named with OLD is read only. You can refer to it (if you have the SELECT privilege), but not modify it. You can refer to a column named with NEW if you have the SELECT privilege for it. In a BEFORE trigger, you can also change its value with SET NEW.col_name = value if you have the UPDATE privilege for it. This means you can use a trigger to modify the values to be inserted into a new row or used to update a row. (Such a SET statement has no effect in an AFTER trigger because the row change will have already occurred.)

Here is another example of a trigger, which writes to an audit table.

-- table of employees
CREATE TABLE employees(
   id serial primary key,
   first_name varchar(40) NOT NULL,
   last_name varchar(40) NOT NULL
);

-- audit table for last name changes
CREATE TABLE employees_audits(
   id serial primary key,
   employee_id integer,
   last_name varchar(40) NOT NULL,
   changed_on timestamp
);

-- trigger function to log last name changed in audit table
CREATE OR REPLACE FUNCTION log_last_name_changes()
  RETURNS trigger AS
$BODY$
BEGIN
 IF NEW.last_name <> OLD.last_name THEN
    INSERT INTO employees_audits(employee_id,last_name,changed_on)
    VALUES(OLD.id,OLD.last_name,now());
 END IF;
 RETURN NEW;
END;
$BODY$
language plpgsql;

-- trigger
CREATE TRIGGER last_name_changes
  AFTER UPDATE
  ON employees
  FOR EACH ROW
  EXECUTE PROCEDURE log_last_name_changes();

For more information on triggers and trigger functions see https://www.postgresql.org/docs/9.4/static/plpgsql-trigger.html

10.5. Python

To use PostGIS from a Python application you need the Psycopg adapter so you can access PostgreSQL from Python.

10.5.1. Psycopg

PsycopgÂ is the most popular PostgreSQLÂ database adapter for theÂ PythonÂ programming language. Its main features are the complete implementation of the PythonÂ DB API 2.0Â specification and the thread safety (several threads can share the same connection). It was designed for heavily multi-threaded applications that create and destroy lots of cursors and make a large number of concurrentÂ INSERTs orÂ UPDATEs.

Psycopg 2 is mostly implemented in C as aÂ libpqÂ wrapper, resulting in being both efficient and secure. It features client-side andÂ server-sideÂ cursors,Â asynchronous communicationÂ andÂ notifications,Â COPYÂ TO/COPYÂ FROMÂ support. Many Python types are supported out-of-the-box andÂ adapted to matching PostgreSQL data types; adaptation can be extended and customized thanks to a flexibleÂ objects adaptation system.

Psycopg 2 is both Unicode and Python 3 friendly.

On Windows machines useÂ the following to install psycopg2.

pip install psycopg2-binary

Very simple examples of using Psycopg2

import psycopg2
conn = psycopg2.connect("dbname='pg_training' user='pgis' host='localhost' password='pgis'")
cur = conn.cursor()
cur.execute("""Select datname from pg_database;""")
rows = cur.fetchall()
for row in rows:
... print " ", row[0]
...

template1
template0
postgres
aileen

cur.execute("""SELECT general.myfunction(4,3);""")
cur.fetchall()

10.5.2. What about geometry?

For a simply point you can easily use the appropriate PostGIS functions e.g.

>>> import psycopg2

>>> conn = psycopg2.connect('...')
>>> curs = conn.cursor()

>>> cur.execute("select st_astext(ST_SetSRID(ST_MakePoint(32, 34, 0),4326));")
>>> cur.fetchall()
[('POINT Z (32 34 0)',)]
>>>

For more complicated geometries, such as LineString and Polygon geometries, you can handle them with a number of tools including

Python GDAL/OGR for reading and writing geo-spatial data (https://pcjericks.github.io/py-gdalogr-cookbook/ )
Shapley for analysing and manipulating geo-spatial data (https://pypi.python.org/pypi/Shapely )
GeoDjango (https://docs.djangoproject.com/en/1.10/ref/contrib/gis/tutorial/ )
GeoAlchemy 2 (http://geoalchemy-2.readthedocs.io/en/latest/ )

10.5.2.1. Useful references:

Python Geospatial Development, Erik Westra	Geoprocessing with Python, Chris Garrard

10.6. Simple example using Shapely

Shapely doesÂ manipulating and analyzing data. Itâs based on GEOS, the libraries used by PostGIS. With Shapely, you can do things like buffers, unions, intersections,Â centroids,Â convex hulls.

Shapely, then passes them through psycopg2 as hex-encoded WKB. Note that Shapely 1.3 or later is required to handle the export of 3D geometries with theÂ wkb_hexÂ property.

import psycopg2
from shapely.geometry import LineString
from shapely import wkb

conn = psycopg2.connect("dbname='pg_training' user='pgis' host='localhost' password='pgis'")
curs = conn.cursor()

# Make a Shapely geometry
ls = LineString([(2.2, 4.4, 10.2), (3.3, 5.5, 8.4)])
ls.wkt  # LINESTRING Z (2.2 4.4 10.2, 3.3 5.5 8.4)
ls.wkb_hex  # 0102000080020000009A999999999901409A999999999911406666666666662440666666...

# Send it to PostGIS
curs.execute('CREATE TEMP TABLE my_lines(geom geometry, name text)')
curs.execute(
    'INSERT INTO my_lines(geom, name) VALUES (ST_SetSRID(%(geom)s::geometry, %(srid)s), %(name)s)',
    {'geom': ls.wkb_hex, 'srid': 4326, 'name': 'First Line'})

# Fetch the data from PostGIS, reading hex-encoded WKB into a Shapely geometry
curs.execute('SELECT name, geom FROM my_lines')
curs.fetchall()

# Fetch the data from PostGIS, reading hex-encoded WKB into a Shapely geometry
curs.execute('SELECT name, geom FROM my_lines')
for name, geom_wkb in curs:
    geom = wkb.loads(geom_wkb, hex=True)
    	print('{0}: {1}'.format(name, geom.wkt))
# First Line: LINESTRING Z (2.2 4.4 10.2, 3.3 5.5 8.4)

Of course this can be accomplished by sending the geometry's WKT, however since it is converted to text, it is lossy and may reduce angstroms of precision. Transferring geometries as hex-encoded WKB is lossless, and preserves the exact precision of each coordinate.

PGIS-09 12: Automating SQL Tasks