Numberedheadings | ||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ||||||||||||||||||||||
Automating SQL TaskspsqlThe course has focused on carrying out database tasks in pgAdmin with SQL, and in QGIS. However full database functionality is also available through command line shell applications, where an experienced database user has full and better control over the database. Included in pgAdmin is the command line application Tools > PSQL Tool. In a normal PostgreSQL installation it is There will be times when you need a task in SQL to be automated, so that it can be triggered from another application, or scheduled to run at a specific time. This module describes a few options for doing this. psqlFull database functionality is available through the psql command line shell application. This can be accessed from within pgAdmin, or on the command line - it is installed auomatically with Postgres, normally in the bin folder (e.g. C:\Program Files\PostgreSQL\13\bin\pgsql.exe), and so can be called from a batch file. Full documentation on psql commands can be found here.Using psqlfrom pgAdmin means that your database connection is already established. If you run pgsql from a command prompt, you canin pgAdmin
Run pqsl from the Command PromptYou can run pgsql directly from a command prompt (in AppStream, use the OSGeo4W Shell command from the menu).
If you are working from the command line, and always using the same database, set the environment variables for the database connection to avoid having use the parameters each time - for example: Alternatively, connect as in the following example (you will be prompted for a password): Running SQL in psqlTo run an SQL command, use the -c option e.g.: psql -f script/vml_count_features.sql
psql Meta-commandsAs well as being able to run SQL, psql includes a set of meta-commands, prefixed with a backslash, that are processed by psql itself. Meta-commands make psql useful for administration or scripting . Basics | ||||||||||||||||||||||
\h | get help with SQL | |||||||||||||||||||||
\? | get help with PSQL | |||||||||||||||||||||
\g OR ; | execute a query | |||||||||||||||||||||
\q | quit | |||||||||||||||||||||
\cd | change directory | |||||||||||||||||||||
\echo [STRING] | write string to standard output | |||||||||||||||||||||
\i [FILE] | execute commands from file | |||||||||||||||||||||
\o [FILE] | send all query results to file | |||||||||||||||||||||
\d [OBJECTNAME] | describes tables, sequences views etc. | |||||||||||||||||||||
\db | lists tablespaces | |||||||||||||||||||||
\df | lists functions | |||||||||||||||||||||
\dn | lists schemas | |||||||||||||||||||||
\dp OR \z | lists tables, views sequences etc. | |||||||||||||||||||||
\du | lists users | |||||||||||||||||||||
\l | lists all databases | |||||||||||||||||||||
\dt | lists tables |
|
Create a simple function below by running the code below in the Query Tool
CREATE OR REPLACE FUNCTION general.myfunction(a integer, b integer)
RETURNS integer AS
$$
BEGIN
return a*b;
END;
$$
LANGUAGE plpgsql;
Note that a schema name is not required - if it is not supplied, the function will be created in the public schema
Call the function using a SELECT statement in the Query Tool
select general.
myfunction(4,3)
If you have time, make a change to the function and call it again - note that to update the function, you can select it in pgAdmin then right-click > Scripts > CREATE Script, then amend the function definition run the statement
For more details on function functions see https://www.postgresql.org/docs/9.4/static/sql-createfunction.htmlFor more , and for details on PL/pgSQL see https://www.postgresql.org/docs/9.4/static/plpgsql.html
Python Functions
You can write functions in python (and many other languages), first create the plpython3u
extension using CREATE EXTENSION IF NOT EXISTS plpython3u
. Then you can create a function using code like:
The lines above the first $$
define the function and it's input and output parameters, then the code between the two $$
marks is a string containing the python code.
First we import a module (xlrd
) that handles reading .xls
files from Excel. Then we open the workbook contained in the file who's name was passed in, and find the first sheet in the workbook. Finally, we loop through the rows (after skipping row 0 - the header) and return (or yield
) the value of the first three cells in the current row. Using yield
rather than return
means our function can keep track of which row it's on.
The PL/Python extension also provides a python module plpy
that provides access to the database. This allows you to query the catalog tables and run {{UPDATE}}s on tables based on the answers.
Triggers
A trigger is a set of actions that are run automatically when a specified change operation (SQL INSERT, UPDATE, DELETE or TRUNCATE statement) is performed on a specified table. Triggers are useful for tasks such as enforcing business rules, validating input data, and keeping an audit trail.
Create
A trigger is a named database object that is associated with a table, and it activates when a particular event (e.g. an insert, update or delete) occurs for the table/views. The statement CREATE TRIGGER creates a new trigger in PostgreSQL. Here is the syntax :
Syntax
Parameters
Name
Description
name
The name of the trigger. A trigger must be distinct from the name of any other trigger for the same table. The name cannot be schema-qualified — the trigger inherits the schema of its table.
BEFORE
AFTER
INSTEAD OF
Determines whether the function is called before, after, or instead of the event. A constraint trigger can only be specified as AFTER.
event
One of INSERT, UPDATE, DELETE, or TRUNCATE, that will fire the trigger.
table_name
The name of the table or view the trigger is for.
referenced_table_name
The (possibly schema-qualified) name of another table referenced by the constraint. This option is used for foreign-key constraints and is not recommended for general use. This can only be specified for constraint triggers.
DEFERRABLE NOT
DEFERRABLE
INITIALLY IMMEDIATE
INITIALLY DEFERRED
The default timing of the trigger.
FOR EACH ROW
FOR EACH STATEMENT
Specifies whether the trigger procedure should be fired once for every row affected by the trigger event, or just once per SQL statement. If neither is specified, FOR EACH STATEMENT is the default.
condition
A Boolean expression that determines whether the trigger function will actually be executed.
function_name
A user-supplied function that is declared as taking no arguments and returning type trigger, which is executed when the trigger fires.
arguments
An optional comma-separated list of arguments to be provided to the function when the trigger is executed. The arguments are literal string constants.
Triggers that are specified to fire INSTEAD OF the trigger event must be marked FOR EACH ROW, and can only be defined on views. BEFORE and AFTER triggers on a view must be marked as FOR EACH STATEMENT. In addition, triggers may be defined to fire for TRUNCATE, though only FOR EACH STATEMENT. The following table summarizes which types of triggers may be used on tables and views:
When
Event
Row-level
Statement-level
BEFORE
INSERT/UPDATE/DELETE
Tables
Tables and views
TRUNCATE
—
Tables
AFTER
INSERT/UPDATE/DELETE
Tables
Tables and views
TRUNCATE
—
Tables
INSTEAD OF
INSERT/UPDATE/DELETE
Views
—
TRUNCATE
—
—
Here is a simple example of trigger function.:
Now we cana test table by running the SQL below
CREATE TABLE test_table( col1 text, col2 text, col3 date);
Create a trigger function which sets col3 to today’s date when the trigger is executed - this is the action the trigger will generated
CREATE OR REPLACE FUNCTION test_table_trig() RETURNS TRIGGER as
$$
BEGIN
new.col3 = current_date;
return new;
END;
$$
LANGUAGE plpgsql;
Now create the trigger which will fire at the time of execution
In the above trigger function there is new keyword 'NEW' which is a PostgreSQL extension to triggers. There are two PostgreSQL extensions to trigger 'OLD' and 'NEW'. OLD and NEW are not case sensitive.
Within the trigger body, the OLD and NEW keywords enable you to access columns in the rows affected by a trigger
In an INSERT trigger, only NEW.col_name can be used.
In a UPDATE trigger, you can use OLD.col_name to refer to the columns of a row before it is updated and NEW.col_name to refer to the columns of the row after it is updated.
In a DELETE trigger, only OLD.col_name can be used; there is no new row.
A column named with OLD is read only. You can refer to it (if you have the SELECT privilege), but not modify it. You can refer to a column named with NEW if you have the SELECT privilege for it. In a BEFORE trigger, you can also change its value with SET NEW.col_name = value if you have the UPDATE privilege for it. This means you can use a trigger to modify the values to be inserted into a new row or used to update a row. (Such a SET statement has no effect in an AFTER trigger because the row change will have already occurred.)
Here is another example of a trigger, which writes to an audit table.CREATE TRIGGER test_table_trigger
BEFORE INSERT
ON test_table
FOR EACH ROW
EXECUTE PROCEDURE test_table_trig();
Test the trigger by using INSERT to add a row to the table
INSERT INTO test_table(col1,col2) VALUES('1','2')
Look at the table data, and check that today’s date has been added as a value for column 3
For more information on triggers and trigger functions see https://www.postgresql.org/docs/9.4/static/plpgsql-trigger.html .
Python
To use PostGIS from a Python application you need the Psycopg adapter so you can access PostgreSQL from Python.
Psycopg
Psycopg is the most popular PostgreSQL database adapter for the Python programming language. Its main features are the complete implementation of the Python DB API 2.0 specification and the thread safety (several threads can share the same connection). It was designed for heavily multi-threaded applications that create and destroy lots of cursors and make a large number of concurrent INSERT
s or UPDATE
s.
COPY TO/COPY FROM
support. Many Python types are supported out-of-the-box and adapted to matching PostgreSQL data types; adaptation can be extended and customized thanks to a flexible objects adaptation system.
Psycopg 2 is both Unicode and Python 3 friendly.
On Windows machines use the following to install psycopg2.
Very simple examples of using Psycopg2
What about geometry?
For a simply point you can easily use the appropriate PostGIS functions e.g.
For more complicated geometries, such as LineString and Polygon geometries, you can handle them with a number of tools including
Python GDAL/OGR for reading and writing geo-spatial data (https://pcjericks.github.io/py-gdalogr-cookbook/ )
Shapley for analysing and manipulating geo-spatial data (https://pypi.python.org/pypi/Shapely )
GeoDjango (https://docs.djangoproject.com/en/1.10/ref/contrib/gis/tutorial/ )
GeoAlchemy 2 (http://geoalchemy-2.readthedocs.io/en/latest/ )
Useful references:
Python Geospatial Development, Erik Westra
Geoprocessing with Python, Chris Garrard
Simple example using Shapely
Shapely does manipulating and analyzing data. It’s based on GEOS, the libraries used by PostGIS. With Shapely, you can do things like buffers, unions, intersections, centroids, convex hulls.
Shapely, then passes them through psycopg2 as hex-encoded WKB. Note that Shapely 1.3 or later is required to handle the export of 3D geometries with the wkb_hex
property.
QGIS
Because QGIS can be used to run SQL on a Postgres database, and it also has a workflow tool (the Graphical Modeller), it can be used to automate tasks in Postgres.
Open QGIS, and go to Processing > Graphical Modeler
In the Algorithms panel, search for Postgres, and double-click on PostgreSQL Execute SQL to include the process in your model
Double-click on the process in the model window to open it
In the dialog:
Set the Description to Add column
Set the Database (connection name) to the connection you created earlier
In the SQL query field, add
ALTER TABLE test_table ADD COLUMN testedit_date date
Click OK to save these settings
In the Model Designer, click the Run model button or F5 to run the model
In pgAdmin, check that the new field has been added to the data, and right-click > Delete/Drop to remove it so we can run the process again
In the Graphical Modeler, add another PostgreSQL Execute SQL process to the model in the same way as before
This time, rename the process Add values, and add some data to the table you have just updated, using the following SQL in the SQL query field
INSERT INTO test_table(col1,col2, test_from_qgis) VALUES('3','4','test_value')
Under Dependencies, add a dependency on the previous process - this will make this process run after the previous one
Click OK - your model should look something like this
Run the model, then check the results in pgAdmin
When you are finished, save the model for future use