...
Numberedheadings | ||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ||||||||||||||||||||||
Loading Data to PostgresData can be loaded into Postgres in a number of ways - this module covers:
QGISIf you have QGIS installed, it provides a GUI-based means of loading data to a Posgres database. There are a few ways of getting data into Postgres using QGIS. Load data with DB ManagerThis approach gives the the greatest degree of control over the conversion from your source format to Postgres.
Load data using the BrowserThe QGIS Browser panel also contains functionality to load and manage data in the database.
Import CSV to PostgresThe Copy command can be used to import Comma Separated Value files (CSV) / text files to and from PostgreSQL.
ogr2ogrThe GDAL library contains an ogr2ogr executable which will convert between most spatial formats, including PosgresQL/PostGIS, Shapefile, MapInfo and GeoPackage. It can be run from the command line with appropriate parameters - see the GDAL site for full details. It’s also possible to use ogr2ogr from QGIS, so in this exercise we will use QGIS to create a command line, then run it in the command prompt. Load data with ogr2ogr
|
...
skip-headings | h3, h4 |
---|---|
h1 | [h1.decimal]. |
start-numbering-with | 2 |
h2 | [h1.decimal].[h2.upper-latin] |
GDAL/OGR og2ogr
A versatile command line took which will translate many different formats, including shapefile, .TAB and GML. There are some GUI interfaces that support ogr2ogr (e.g. http://www.ogr2gui.ca), and QGIS uses it as its translation engine.
A full list of supported formats is at http://www.gdal.org/ogr_formats.html.
Desktop GIS
If you are already using desktop GIS with support for PostGIS, it's usually easiest to use this, though bear in mind that some GIS products are more successful at doing this than others.
QGIS has several methods for uploading data into PostGIS. Probably the best is DB Manager plugin - this supports loading files as well as existing layers, and provides control over how the data will be saved to the database. However you can also drag and drop in the QGIS Browser panel.
MapInfo uses Easy loader. Data loaded is referenced in a custom map catalogue table, and a connection to it from MapInfo can then be made. Data loaded through this method is not immediately stored, so that other GIS and spatial tools can access it. Then datasets loaded though other methods are not available spatially in MapInfo until they have been referenced in the MapInfo map catalogue table.
Cadcorp can be used to load most formats directly into PostGIS, using PostGIS’s standard method of referencing spatial tables. Cadcorp requires data to adhere to strict rules, and it is sometimes difficult in Cadcorp to access data in PostGIS that has been loaded using other tools. Data needed to be used by Cardcorp should therefore be loaded using Cadcorp if possible.
ArcGIS does not support loading data into PostGIS, although it is possible using ArcSDE, as PostGIS can be used as an alternative backend database server for ArcSDE. You can view spatial data in PostGIS from ArcGIS version 10 and later using the Add Query Layer tool.
Copy
The Copy command is a command-line operated tool to import/export non-spatial data between PostgreSQL and CSV or other text formats.
copy to (export data to file)
copy from (import data from file)
Connecting to PostGIS from QGIS
...
Open QGIS
...
Add the PostGIS layer by right clicking on Postgis in the data browser, or by selecting new connection in the add data→PostGIS window.
...
Click Test Connect, then OK if the connection is successful
...
Select the connection and click Connect
...
Select the table you need, then click Add
The table will be added as a layer in the map.
Spatial Tables
The specific characteristics of spatial tables are important. These include:
Geometry field and datatype
Unique id/primary key
Spatial Indexes
Geometry fields
PostGIS expands PostgreSQL by introducing the datatype GEOMETRY, which holds spatial information in the OpenGIS standard: WKB (Well Known Binary) format. WKB is not easily interpreted by eye, so it can be converted to WKT (Well Known Text) to display the geometry in textual form, using the functions ST_AsText() or ST_AsEWKT().
WKB GEOMETRY
WKT GEOMETRY
PostGIS also supports 3D geometry using EWKB and EWKT.
The name of the geometry field can vary, and different methods of loading data use different names - e.g. the_geom, geometry, wkb_geometry. It's recommended to use the same name for all tables, and the datatype must always be geometry.
Type Modifier
Typemod is a type modifier, used to modify data types - for example on variable length character columns, they specify the number of character available on the data type.
description VARCHAR(255)
For geometry columns, PostGIS uses type modifiers to define:
The geometry type
The coordinate reference system
Dimension
These are written as follows:
Geometry_name GEOMETRY([datatype][(Z)], [SRID]
Example 2D: wkb_geometry GEOMETRY(Point, 27700)
Example 3D: wkb_geometry GEOMETRY(PointZ, 27700)
These modifiers specify:
the geometry type: Point
the dimension: Z, which includes 3rd dimension
the spatial reference identifier (SRID): 27700
Only geometry that meets these criteria can be inserted or updated into the table.
...
Type modifiers
Type modifiers replace constraints in PostGIS 1.5.
Unique Identifier
Every table should have a unique identifier field, with unique non-null values. It will normally be introduced by the importing tool as a serial or integer datatype, and be set as the primary key. However any unique value can also be set as a primary key.
If a primary key does not exist, you will not be able edit data in pgAdmin, and there may be problems viewing and editing in GIS applications.
Geometry Types
PostGIS geometry data types reflect OpenGIS standard definitions for spatial information. The three simplest types of geometry are points, lines and polygons. The diagram below shows how these are represented in WKT format:
For example:
POINT(2 3) LINESTRING(1 2, 3 3, 3 1) POLYGON((1 2, 3 3, 3 1, 1 2))
or
POLYGON((571178 6337246,571178 6402217,598061 6402217,598061 6337246,571178 6337246))
Collection Geometries
Collection geometries are groups of geometries that locally belong together. There is a collection type for each geometry type. GeometryCollections enable different geometry types to be stored together as one logical grouping.
Multipoints
Mulitlines
Multipolygons
Geometry Collections
Curved Geometries
3D Geometries
Spatial Reference Systems
PostGIS follows EPSG (European Petroleum Survey Group) numberings for spatial reference system identification.
Commonly used SRIDs (Spatial Reference IDs) used in the UK are:
SRID 27700: OSGB 1936 British National Grid
SRID 4326: WGS 84 Longitude/Latitude
SRID 3857: Web Mercator
Geometry Columns
PostGIS uses the geometry_columns view in the public schema to list metadata, based on and associated with the geometry field in the spatial tables. This view is automatically created when the PostGIS extension is added to the database. When a spatial table is created using a standard process (for example one of the ones listed above), it will automatically appear in the geometry_columns view.
The geometry_columns view is used in clients such as desktop GIS to generate a list of spatial tables without having to examine all tables for geometry types etc, so it acts as a catalogue of spatial data. Some clients will not open a spatial table if it does not appear in the view.
geometry_columns fields
...
Field name
...
Purpose
...
f_table_catalog
...
The name of the database
...
f_table_schema
...
The table schema
...
f_table_name
...
The table name
...
f_geometry_column
...
The name of the geometry column in the table
...
coord_dimension
...
Either 2(D) or 3(D)
...
srid_integer
...
The SRID (e.g 27700)
...
type
...
The geometry type (e.g POINT or GEOMETRY)
Geography Datatype
The geography datatype provides native support for spatial features represented on geographic coordinates - i.e. spherical coordinates expressed in angular units (degrees). The basis for the geometry type is a plane. On the plane the shortest path between two objects is a straight line. This means calculations on geometries can be calculated using cartesian maths and straight line vectors.
The geography type is based on a sphere. In the sphere the shortest path between two objects is an arc. This means that calculations on geographies (e.g. areas, distances, lengths, intersections) must be calculated on the sphere, using more complex maths. For accurate measurements, the calculations must take the actual shape of the earth into account, and as this is more complex, there are few functions defined for the geography type, and only WGS 84 (SRID:4326) is supported.
While it's useful to be aware of the geography type, this course will focus on geometry types. The table below compares the two.
...
Geometry
...
Geography
...
Basis is a plane
...
Basis is a sphere
...
Shortest path between two objects is a straight line
...
Shortest path between two objects is an arc
...
Calculations using simple maths and straight line vectors
...
Calculations (areas, distances, lengths, intersections, etc.) must be calculated on the sphere
...
Supported in all SRS
...
Limited and only lat/long SRSs are supported. 4326 is the default and if no SRID is specified, 4326 is assumed.
Indexes
A database index is a data structure that improves the speed of data retrieval operations on a database table, at the cost of slower writes and increased storage space. Any field in a database can be indexed and more than one field can be involved in a single index. While indexes improve efficiency in search, they slow down update operations.
Normally there should be a spatial index on the geometry field to improve the efficiency of rendering, analysis and searching of spatial data. Most import tools provide an option to create an index when the data is loaded, and it is good practice to do so. The most common type of spatial index is a GIST index, but other indexes include B-Tree type.
Loader for MasterMap
Astun's Loader is a set of Python scripts which make use of OGR 1.9 to transform GML or KML data to other formats. It is principally use to load Ordnance Survey MasterMap data to PostGreSQL. The source data is pre-processed using Python to make it suitable for loading with OGR, or to add value by deriving attributes.
Although Loader was originally written to load Ordnance Survey OS MasterMap Topographic Layer data in GML/GZ format, it has been used to load other GML and KML data.
Loader is an open source tool, at https://github.com/AstunTechnology/Loader.