Step 1 - Pre-Process AddressBase Premium CSV files

A small Python script processes all the AddressBase Premium ZIP files of CSV data (about 5.5 GB in total). You will also need about 37 GB of free disk space for the combined CSV output files.
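Before starting, it may be worth confirming that the target drive has room for the combined output. The following is a minimal sketch using only the standard library; the function name `has_enough_space` and the 37 GB default are illustrative and are not part of the supplied script.

```python
import shutil

REQUIRED_GB = 37  # approximate combined output size quoted above


def has_enough_space(folder, required_gb=REQUIRED_GB):
    """Return True if the filesystem holding `folder` has at least required_gb free."""
    free_bytes = shutil.disk_usage(folder).free
    return free_bytes >= required_gb * 1024 ** 3


if __name__ == "__main__":
    # "." is a placeholder; point this at the drive that will hold the output folder.
    print("Enough space:", has_enough_space("."))
```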

  1. Copy all the ZIP files to a folder. They can be in different subfolders; the script walks the folder recursively.

  2. Create an output folder to hold the combined output CSV files.

  3. Copy the addressbase_csv.ini file and the addressbase_csv_import.py Python script to a Python folder.

  4. Edit the addressbase_csv.ini file to specify your dataFolder (the folder from point 1) and your outputFolder (the folder from point 2).

 

addressbase_csv.ini

    #####
    # extractor settings
    #####
    [data]
    # location of the CSV file ZIPs
    dataFolder = /home/[user]/addressbase_epoch24
    #where to put the output CSV files
    outputFolder = /home/[user]/addressbase_epoch24/output

    #####
    # Notification email settings
    #####
    [notification]
    mailto = [to-email]
    gmail_user = [from-email]
    gmail_pwd = [password]
    smtpserver = [servername]

    #####
    # Database credentials for database
    # which will hold coordinates and metadata
    #####
    [database]
    pgdbname = osmm1231
    pghost = [postgresql_server_address]
    pgport = [postgresql_server_port]
    pguser = [postgresql_username]
    pgpassword = [postgresql_password]
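To check that the edited file parses and that the two folder settings are what you expect, a settings file of this shape can be read with Python's standard configparser module. This is a minimal sketch and not necessarily how addressbase_csv_import.py reads its settings; the helper name `load_folders` is hypothetical.

```python
import configparser


def load_folders(ini_path="addressbase_csv.ini"):
    """Return (dataFolder, outputFolder) from the [data] section of the ini file."""
    # interpolation=None so that a '%' in a password elsewhere in the file is harmless.
    config = configparser.ConfigParser(interpolation=None)
    if not config.read(ini_path):
        raise FileNotFoundError(ini_path)
    data = config["data"]
    return data["dataFolder"], data["outputFolder"]


if __name__ == "__main__":
    data_folder, output_folder = load_folders()
    print("ZIPs are read from:", data_folder)
    print("CSVs are written to:", output_folder)
```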


Go to a command prompt, navigate to the Python folder and run the following:

python addressbase_csv_import.py
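For orientation, the kind of work such a script does can be sketched as follows: walk the data folder recursively, open every ZIP, and append each CSV row to one combined file per record type. This is an illustration only, not the code of addressbase_csv_import.py. It assumes that the first field of every row is the record identifier (10, 11, 21 and so on) and that the files are UTF-8 encoded; the function names and the simplified output naming (`21.csv` rather than `21_blpu.csv`) are hypothetical.

```python
import csv
import io
import os
import zipfile


def iter_zip_rows(data_folder):
    """Yield every non-empty CSV row from every ZIP found under data_folder."""
    for root, _dirs, files in os.walk(data_folder):  # recursive walk
        for name in sorted(files):
            if not name.lower().endswith(".zip"):
                continue
            with zipfile.ZipFile(os.path.join(root, name)) as archive:
                for member in archive.namelist():
                    if not member.lower().endswith(".csv"):
                        continue
                    with archive.open(member) as raw:
                        text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                        yield from (row for row in csv.reader(text) if row)


def combine_by_record_type(data_folder, output_folder):
    """Write rows to one CSV per record identifier; return rows written per identifier."""
    os.makedirs(output_folder, exist_ok=True)
    outputs = {}  # record identifier -> (file handle, csv writer)
    counts = {}
    try:
        for row in iter_zip_rows(data_folder):
            record_id = row[0]  # assumed: first field is the record identifier
            if record_id not in outputs:
                path = os.path.join(output_folder, record_id + ".csv")
                handle = open(path, "w", newline="", encoding="utf-8")
                outputs[record_id] = (handle, csv.writer(handle))
            outputs[record_id][1].writerow(row)
            counts[record_id] = counts.get(record_id, 0) + 1
    finally:
        for handle, _writer in outputs.values():
            handle.close()
    return counts
```

Rows are streamed one at a time, so memory use stays small even though the combined output runs to tens of gigabytes.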

 

The output folder will now contain the combined CSV files listed below. By default, only the files marked 'Required by iShare' are created; specify -all to create the other output CSV files as well.

 

| Filename | Description | Size | Required by iShare |
| --- | --- | --- | --- |
| 10_header_10.csv | Header records | 22 KB | No |
| 11_street.csv | Street records | 203 MB | Yes |
| 15_streetdescriptor.csv | Street Descriptor records | 120 MB | Yes |
| 21_blpu.csv | Basic Land & Property Unit records | 5.4 GB | Yes |
| 23_xref.csv | Cross Reference records | 13.9 GB | No |
| 24_lpi.csv | Land & Property Identifier records | 6.6 GB | Yes |
| 28_dpa.csv | Delivery Point Address records | 5.4 GB | Yes |
| 29_metadata.csv | Metadata records | 99 KB | No |
| 30_successor.csv | Successor records | 0 | No |
| 31_organisation.csv | Organisation records | 110 MB | Yes |
| 32_classification.csv | Classification records | 5.3 GB | No |
| 99_trailer.csv | Trailer records | 12 KB | No |
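Once the script has finished, a quick check that the files required by iShare are present and non-empty can catch a failed or partial run. The following is a minimal sketch; the filenames are taken from the listing above, while the helper name `missing_required_files` is hypothetical and not part of the supplied script.

```python
import os

# Filenames listed above as required by iShare.
REQUIRED_FILES = [
    "11_street.csv",
    "15_streetdescriptor.csv",
    "21_blpu.csv",
    "24_lpi.csv",
    "28_dpa.csv",
    "31_organisation.csv",
]


def missing_required_files(output_folder):
    """Return the required filenames that are absent or empty in output_folder."""
    missing = []
    for name in REQUIRED_FILES:
        path = os.path.join(output_folder, name)
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            missing.append(name)
    return missing
```

For example, `missing_required_files("/home/[user]/addressbase_epoch24/output")` (with your own outputFolder substituted) returns an empty list when all six files were written.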