SQL-02 13: Further SQL
Further SQL
This module explores further tools and techniques in SQL for creating, updating, and transforming data, as well as some techniques for writing economical queries.
Aliases
Aliases are temporary names given to tables or columns as convenient references.
For columns, the syntax takes the form:
SELECT column_name AS column_alias
This gives you the ability to output a ‘friendlier’ name for a column, or to give a name to a column generated by a function. Note that the alias cannot then be used in a WHERE clause.
For tables, the alias is set in the FROM clause - normally a single letter is used.
So, to display distribution of crime in an area you might construct a query like this, then symbolise the results in QGIS.
SELECT n.gid, n.link_name AS "Locality", c.*, n.wkb_geometry FROM general.neighbourhoods n, crime.neighbourhood_crime c WHERE n.name=c.neighbourhood;
Apply column alias
Open the Query Tool
Build a query to display the EstablishmentNumber aliased as “DFEE”, TypeOfEstablishment (name), and SchoolCapacity from the general.school table
JOIN
Joins enable queries on two or more tables to be made, using relationships between specific columns, usually using keys where the data is present in both tables - for example:
SELECT tb1.name, tb2.description FROM schema.table1 tb1 JOIN schema.table2 tb2 ON tb1.name=tb2.name;
Create join
In pgAdmin, view the first 100 rows of general.imd_2019 (Index of Multiple Deprivation data for 2019 by LLSOA) and general.llsoa_2011_ew (LLSOA boundaries)
Create a SELECT statement which contains a join between the two tables to enable the IMD data to be visualized, using the example as a pattern, selecting geometry column from the LLSOA data, and all the columns from the IMD data
Use table aliases as shown in the example above to make the query more economical to write
VIEW
A view is a virtual table, defined by an SQL statement, and based on data from one or more tables. Data in Views can’t be edited directly, but otherwise behave much as a real table does, and like tables, they exist within a schema. The data within simple views can be changed through UPDATE statements. A view can be created using a statement such as:
CREATE VIEW view_name AS SELECT(....)
Create view
The query you developed in the last exercise returns a large number of features. Before you view the data in QGIS lets create the view so that only the top 50 most deprived areas by rank are selected. Create a view which joins the two tables, in the general schema e.g.
CREATE general.imd_llsoa_top_50 AS SELECT imd.*, llsoa.wkb_geometry FROM general.imd_2019 imd JOIN general.llsoa_2011_ew llsoa ON imd."LSOA code (2011)" = llsoa.code WHERE imd."Index of Multiple Deprivation (IMD) Rank" < 51
In QGIS, refresh the connection to your database and check that the view is present in the schema
Load the view in QGIS, and look at the attribute table to see the results
In pgAdmin, find the view in the object tree and right-click > Scripts > CREATE Script to see the SQL used to create it
Amend the view to add another of the columns from llsoa_2011_ew by copying the SQL into the query window and editing to add a new column. You may have to drop the view first by using:
DROP VIEW general.imd_llsoa_top_50;
Run the SQL to update the view, then check the results in pgAdmin
UPDATE
Update is used to change data in an existing record. Run the following SQL to view settlements that contain the phrase ‘Bassett’.
SELECT name from general.settlement WHERE name like '%Bassett%';
Lets change Royal Wooton Bassett to Royal Wooton Basset. In the example below, the SQL looks for records which meet the criteria in the WHERE clause, and updates the name and amended columns of those records:
UPDATE general.settlement SET name='Wootton Bassett', amended='2012-01-30' WHERE name='Royal Wootton Bassett';
Update records
In the general.wind_turbines_point table, write a query to SELECT all records where the value in the height column is greater than 20
Run the query and check the results
Change the SQL so that the status column is updated to ‘Application Refused’ for all those records where height is greater than 20
Run the query, and check that the change has been applied
CAST
The CAST command can be used to change a value or field from one datatype to another - for example a character varying to integer. It’s normally used in a SELECT statement.
SELECT CAST('100' AS integer);
CAST can also be written in shorthand using two colons:
SELECT '100'::integer;
Create point data from x, y coordinates
In this exercise we’ll take a non-spatial table which contains coordinates, and turn it into a spatial table. This will require a few steps.
Go to the general.bath_house_price_2016 table, and look at the postcode_centroid_latitude and postcode_centroid_longitude columns - you will see their datatype is character varying
Update the type to numeric using the following SQL:
ALTER TABLE general.bath_house_price_2016 ALTER column postcode_centroid_latitude type numeric using postcode_centroid_latitude::numeric;
This changes the datatype of the column, then populates it using CAST
Do the same thing for the longitude column
You can now use a PostGIS function to ‘spatialise’ the table
Add a new column called wkb_geometry of type geometry to the table
Populate the geometry column using the code below - this uses the PostGIS functions ST_Setsrid() to set the coordinate reference system to 4326 (lat/long), and ST_Makepoint() to create the geometry
UPDATE general.bath_house_price_2016 SET wkb_geometry = ST_Setsrid(ST_Makepoint(postcode_centroid_longitude, postcode_centroid_latitude), 4326);
Open the table in QGIS to check the result
Common Table Expressions (CTE)
A common table expression is a temporary result set which you can reference within another SQL statement including SELECT, INSERT, UPDATE or DELETE. A CTE is defined by a WITH statement which defines the query the CTE executes, and gives it a name. The CTE name can then be used within the rest of the query as if it were a table.
For example:
WITH schools_primary_small AS ( SELECT * FROM general.school WHERE "PhaseOfEducation (name)" = 'Primary' AND pupil_numbers::integer < 100 ) SELECT "EstablishmentName", "Postcode" FROM schools_primary_small ORDER BY "EstablishmentName";
Run SQL with a CTE
Using the example above, write SQL on the general.wind_turbines_point table which contains a CTE
The CTE should select only those points where Full Planning Permission was granted before 2008
The final query should return the reference, geometry and status of each point which meets the criteria in the CTE