Step 1 - Pre-Process AddressBase Premium CSV files
A small Python script processes all the ZIP files of CSV data (about 5.5 GB in total). You will also need about 37 GB of disk space for the combined CSV output files.
Copy ALL the ZIP files to a folder. They can be in different subfolders, as the script walks the folder recursively.
Create an OUTPUT folder to hold the combined output CSV files.
Copy the addressbase_csv.ini file and the addressbase_csv_import.py Python script to a Python folder.
Edit the addressbase_csv.ini file to specify your dataFolder (point 1) and your outputFolder (point 2).
addressbase_csv.ini
#####
# extractor settings
#####
[data]
# location of the CSV file ZIPs
dataFolder = /home/[user]/addressbase_epoch24
#where to put the output CSV files
outputFolder = /home/[user]/addressbase_epoch24/output
#####
# Notification email settings
#####
[notification]
mailto = [to-email]
gmail_user = [from-email]
gmail_pwd = [password]
smtpserver = [servername]
#####
# Database credentials for database
# which will hold coordinates and metadata
#####
[database]
pgdbname = osmm1231
pghost = [postgresql_server_address]
pgport = [postgresql_server_port]
pguser = [postgresql_username]
pgpassword = [postgresql_password]
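To check that the edited file parses and that the two folder settings are what you expect, the [data] section can be read with Python's standard configparser module. This is a minimal sketch, not the code used by addressbase_csv_import.py; the function name load_settings is hypothetical.

```python
import configparser


def load_settings(ini_path):
    """Return (dataFolder, outputFolder) from an ini file laid out like the one above.

    Hypothetical helper for checking the settings; the real script may read
    the file differently.
    """
    # interpolation=None stops a literal '%' (for example in a password)
    # from being treated as a placeholder by configparser.
    parser = configparser.ConfigParser(interpolation=None)
    if not parser.read(ini_path):
        raise FileNotFoundError(ini_path)
    data = parser["data"]
    # Option names are matched case-insensitively by default.
    return data["dataFolder"], data["outputFolder"]
```

Calling load_settings("addressbase_csv.ini") from the Python folder should return the two paths you entered.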
Go to a command prompt, navigate to the Python folder and run the following:
python addressbase_csv_import.py
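Because a full run handles several gigabytes of data, it can be worth confirming beforehand which ZIP files sit under dataFolder and how much free space the output drive has. The sketch below shows one way to do that with the standard library; find_zip_files and free_gigabytes are hypothetical helper names and are not part of the import script.

```python
import os
import shutil


def find_zip_files(data_folder):
    """Recursively collect the paths of all .zip files under data_folder."""
    found = []
    for root, _dirs, files in os.walk(data_folder):
        for name in files:
            if name.lower().endswith(".zip"):
                found.append(os.path.join(root, name))
    return sorted(found)


def free_gigabytes(folder):
    """Free space on the drive holding folder, in GB (10**9 bytes)."""
    return shutil.disk_usage(folder).free / 10**9
```

For example, len(find_zip_files(dataFolder)) gives the number of archives that a recursive walk would find, and free_gigabytes(outputFolder) can be compared against the roughly 37 GB needed for the output.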
You will now have a folder containing the output files listed below.
By default, only the files marked 'Required by iShare' are created. Specify -all to create the other output CSV files as well.
| Filename | Description | Size | Required by iShare |
|---|---|---|---|
| 10_header_10.csv | Header records | 22 KB | No |
| 11_street.csv | Street records | 203 MB | Yes |
| 15_streetdescriptor.csv | Street Descriptor records | 120 MB | Yes |
| 21_blpu.csv | Basic Land & Property Unit records | 5.4 GB | Yes |
| 23_xref.csv | Cross Reference records | 13.9 GB | No |
| 24_lpi.csv | Land & Property Identifier records | 6.6 GB | Yes |
| 28_dpa.csv | Delivery Point Address records | 5.4 GB | Yes |
| 29_metadata.csv | Metadata records | 99 KB | No |
| 30_successor.csv | Successor records | 0 | No |
| 31_organisation.csv | Organisation records | 110 MB | Yes |
| 32_classification.csv | Classification records | 5.3 GB | No |
| 99_trailer.csv | Trailer records | 12 KB | No |
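The numeric prefix of each output filename matches the record identifier held in the first field of every AddressBase Premium CSV row, so combining the data amounts to routing each row to a file chosen by that first field. The following is a rough sketch of that idea only, not the logic of addressbase_csv_import.py: the function name and the plain "<record id>.csv" output names are hypothetical, and UTF-8 encoding is assumed.

```python
import csv
import io
import os
import zipfile


def split_zip_by_record_id(zip_path, output_folder):
    """Append every CSV row in zip_path to <output_folder>/<record id>.csv.

    Returns a dict mapping record identifier to the number of rows written.
    Output names here are illustrative and differ from the script's own.
    """
    counts = {}
    writers = {}
    handles = []
    try:
        with zipfile.ZipFile(zip_path) as archive:
            for member in archive.namelist():
                if not member.lower().endswith(".csv"):
                    continue
                with archive.open(member) as raw:
                    # Assumption: the CSV members are UTF-8 encoded.
                    text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                    for row in csv.reader(text):
                        if not row:
                            continue  # skip blank lines
                        record_id = row[0]
                        if record_id not in writers:
                            path = os.path.join(output_folder, record_id + ".csv")
                            handle = open(path, "a", encoding="utf-8", newline="")
                            handles.append(handle)
                            writers[record_id] = csv.writer(handle)
                        writers[record_id].writerow(row)
                        counts[record_id] = counts.get(record_id, 0) + 1
    finally:
        for handle in handles:
            handle.close()
    return counts
```

Opening the output files in append mode means the function can be called once per ZIP file and the rows accumulate, which is why the output folder should start empty.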