Step 1 - Pre-Process AddressBase Premium CSV files

A small Python script processes all of the ZIP files containing the AddressBase Premium CSV data (about 5.5 GB in total). You will also need about 37 GB of disk space for the combined CSV output files.

  1. Copy ALL of the ZIP files to a folder. They can be in different subfolders; the script walks through them recursively.
  2. Create an output folder to hold the combined output CSV files.
  3. Copy addressbase_csv.ini and the Python script addressbase_csv_import.py to a Python folder.
  4. Edit addressbase_csv.ini to set dataFolder (the folder from step 1) and outputFolder (the folder from step 2).
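
To see which archives a recursive walk of the data folder would pick up before starting a long run, something like the following can be used. This is only a sketch: the function names find_zip_files and list_csv_members are hypothetical and are not taken from addressbase_csv_import.py, whose internals may differ.

```python
import os
import zipfile


def find_zip_files(data_folder):
    """Return sorted paths of every .zip file under data_folder, at any depth."""
    found = []
    for root, _dirs, files in os.walk(data_folder):
        for name in files:
            if name.lower().endswith(".zip"):
                found.append(os.path.join(root, name))
    return sorted(found)


def list_csv_members(zip_path):
    """Return the names of the CSV members inside one ZIP archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return [n for n in archive.namelist() if n.lower().endswith(".csv")]
```

Calling find_zip_files on the folder from step 1 and comparing the count with the number of ZIP files you were supplied is a quick way to confirm nothing was missed when copying.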


addressbase_csv.ini
#####
# extractor settings
#####
[data]
# location of the CSV file ZIPs
dataFolder = /home/[user]/addressbase_epoch24
# where to put the output CSV files
outputFolder = /home/[user]/addressbase_epoch24/output
#####
# Notification email settings
#####
[notification]
mailto = [to-email]
gmail_user = [from-email]
gmail_pwd = [password]
smtpserver = [servername]
#####
# Database credentials for database
# which will hold coordinates and metadata
#####
[database]
pgdbname = osmm1231
pghost = [postgresql_server_address]
pgport = [postgresql_server_port]
pguser = [postgresql_username]
pgpassword = [postgresql_password]
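
A file in this layout can be read with Python's standard configparser module. The sketch below is an assumption about how the two [data] settings might be loaded, not the script's actual code; load_settings is a hypothetical helper. Interpolation is switched off so that a literal % in a value such as a password is not treated as interpolation syntax.

```python
import configparser


def load_settings(ini_path):
    """Read dataFolder and outputFolder from the [data] section of the ini file."""
    # interpolation=None: values are returned exactly as written in the file.
    parser = configparser.ConfigParser(interpolation=None)
    if not parser.read(ini_path):
        raise FileNotFoundError(ini_path)
    return {
        "dataFolder": parser["data"]["dataFolder"],
        "outputFolder": parser["data"]["outputFolder"],
    }
```

Checking that both folders exist right after loading the settings gives an early failure if either path in the ini file is mistyped.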

Open a command prompt, navigate to the Python folder and run the following:


Merge CSV via Python
python addressbase_csv_import.py


You will now have a folder with the following output files:

By default, only the files marked 'Required by iShare' are created. Specify -all to create the other output CSV files as well.

Filename                  Description                            Size      Required by iShare
------------------------  -------------------------------------  --------  ------------------
10_header_10.csv          Header records                         22 KB     No
11_street.csv             Street records                         203 MB    Yes
15_streetdescriptor.csv   Street Descriptor records              120 MB    Yes
21_blpu.csv               Basic Land & Property Unit records     5.4 GB    Yes
23_xref.csv               Cross Reference records                13.9 GB   No
24_lpi.csv                Land & Property Identifier records     6.6 GB    Yes
28_dpa.csv                Delivery Point Address records         5.4 GB    Yes
29_metadata.csv           Metadata records                       99 KB     No
30_successor.csv          Successor records                      0         No
31_organisation.csv       Organisation records                   110 MB    Yes
32_classification.csv     Classification records                 5.3 GB    No
99_trailer.csv            Trailer records                        12 KB     No
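
The numeric prefix of each output file matches the record identifier that AddressBase Premium CSV places in the first field of every row. As an illustration only, a merge of this kind could route rows to files as sketched below. It is an assumption that addressbase_csv_import.py works this way; REQUIRED and split_rows are hypothetical names, and only the files marked 'Required by iShare' are mapped.

```python
import csv
import os

# Record identifier -> output file name, for the files required by iShare.
REQUIRED = {
    "11": "11_street.csv",
    "15": "15_streetdescriptor.csv",
    "21": "21_blpu.csv",
    "24": "24_lpi.csv",
    "28": "28_dpa.csv",
    "31": "31_organisation.csv",
}


def split_rows(rows, output_folder, wanted=REQUIRED):
    """Append each row to the output file that matches its first field.

    Rows whose record identifier is not in `wanted` are skipped.
    Returns a dict of output file name -> number of rows written.
    """
    handles = {}
    counts = {}
    try:
        for row in rows:
            if not row:
                continue
            name = wanted.get(row[0])
            if name is None:
                continue
            if name not in handles:
                path = os.path.join(output_folder, name)
                f = open(path, "a", newline="", encoding="utf-8")
                handles[name] = (f, csv.writer(f))
            handles[name][1].writerow(row)
            counts[name] = counts.get(name, 0) + 1
    finally:
        # Close every file that was opened, even if a row caused an error.
        for f, _writer in handles.values():
            f.close()
    return counts
```

Here rows can be any iterable of already parsed CSV rows, for example a csv.reader over a member of one of the ZIP files. Because the files are opened in append mode, the output folder should be empty before a fresh run.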