Step 1 - Pre-Process AddressBase Premium CSV files
A relatively small Python script has been built to process all the ZIP files of CSV data (about 5.5 GB). You will also need about 37 GB of disk space for the combined CSV output files.
- Copy ALL the ZIP files to a folder. They can be in different sub folders, as the script walks the folder recursively.
- Create an OUTPUT folder to hold the combined output CSV files.
- Copy the addressbase_csv.ini file and the addressbase_csv_import.py Python script to a Python folder.
- Edit the addressbase_csv.ini file to specify your dataFolder (point 1) and your outputFolder (point 2).
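The recursive walk mentioned in the first step can be sketched as follows. This is a minimal illustration and not the code of addressbase_csv_import.py: the function name `find_zip_files` is hypothetical, and it assumes the archives can be recognised by their .zip extension alone.

```python
from pathlib import Path


def find_zip_files(data_folder):
    """Return every .zip file under data_folder, including sub folders."""
    root = Path(data_folder)
    # rglob("*") visits the whole tree; comparing the lower-cased suffix
    # means .zip and .ZIP are both picked up
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() == ".zip"
    )
```

Calling `find_zip_files` on your dataFolder before the real run is a quick way to confirm that every archive you copied is actually visible from that folder.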
addressbase_csv.ini
    #####
    # extractor settings
    #####
    [data]
    # location of the CSV file ZIPs
    dataFolder = /home/[user]/addressbase_epoch24
    #where to put the output CSV files
    outputFolder = /home/[user]/addressbase_epoch24/output
    #####
    # Notification email settings
    #####
    [notification]
    mailto = [to-email]
    gmail_user = [from-email]
    gmail_pwd = [password]
    smtpserver = [servername]
    #####
    # Database credentials for database
    # which will hold coordinates and metadata
    #####
    [database]
    pgdbname = osmm1231
    pghost = [postgresql_server_address]
    pgport = [postgresql_server_port]
    pguser = [postgresql_username]
    pgpassword = [postgresql_password]
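To check that the edited file parses and that the two folders are what you expect, the [data] section can be read with Python's standard configparser module. This is a sketch only: `load_settings` is a hypothetical helper, and how the import script itself reads the file is not shown here.

```python
import configparser


def load_settings(ini_path):
    """Read dataFolder and outputFolder from the [data] section."""
    # interpolation=None so that a '%' in a value (for example in a
    # password) is kept literally instead of being treated as a placeholder
    config = configparser.ConfigParser(interpolation=None)
    # read_file raises an error if the file is missing, whereas
    # config.read() would silently skip it
    with open(ini_path, encoding="utf-8") as handle:
        config.read_file(handle)
    return config["data"]["dataFolder"], config["data"]["outputFolder"]
```

For example, `load_settings("addressbase_csv.ini")` returns the two paths as a tuple, so a typo in a section or key name shows up as a KeyError before any long-running processing starts.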
Go to a command prompt, navigate to the Python folder and run the following:
Merge CSV via Python
python addressbase_csv_import.py
Your output folder will now contain the output files listed below.
By default only the files marked 'Required by iShare' are created. Specify -all to create the other output CSV files as well.
Filename | Description | Size | Required by iShare |
---|---|---|---|
10_header_10.csv | Header records | 22 KB | No |
11_street.csv | Street records | 203 MB | Yes |
15_streetdescriptor.csv | Street Descriptor records | 120 MB | Yes |
21_blpu.csv | Basic Land & Property Unit records | 5.4 GB | Yes |
23_xref.csv | Cross Reference records | 13.9 GB | No |
24_lpi.csv | Land & Property Identifier records | 6.6 GB | Yes |
28_dpa.csv | Delivery Point Address records | 5.4 GB | Yes |
29_metadata.csv | Metadata records | 99 KB | No |
30_successor.csv | Successor records | 0 | No |
31_organisation.csv | Organisation records | 110 MB | Yes |
32_classification.csv | Classification records | 5.3 GB | No |
99_trailer.csv | Trailer records | 12 KB | No |
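The numeric prefix of each output file corresponds to the record identifier of the rows it holds. The sketch below shows one way such a merge could work for a single ZIP file. It is an illustration under stated assumptions, not the code of addressbase_csv_import.py: it assumes the record identifier is the first field of every CSV row and that the files are UTF-8 encoded, and the names `OUTPUT_NAMES` and `merge_zip` are hypothetical. Only the files marked 'Required by iShare' are mapped.

```python
import csv
import io
import zipfile
from pathlib import Path

# Output file for each record identifier (the 'Required by iShare' rows only)
OUTPUT_NAMES = {
    "11": "11_street.csv",
    "15": "15_streetdescriptor.csv",
    "21": "21_blpu.csv",
    "24": "24_lpi.csv",
    "28": "28_dpa.csv",
    "31": "31_organisation.csv",
}


def merge_zip(zip_path, output_folder, names=OUTPUT_NAMES):
    """Append each CSV row in zip_path to the output file for its record identifier."""
    output_folder = Path(output_folder)
    handles, writers = {}, {}
    try:
        with zipfile.ZipFile(zip_path) as archive:
            for member in archive.namelist():
                if not member.lower().endswith(".csv"):
                    continue
                with archive.open(member) as raw:
                    # Assumption: the CSV files inside the ZIP are UTF-8
                    text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                    for row in csv.reader(text):
                        if not row or row[0] not in names:
                            continue  # blank row or record type not requested
                        key = row[0]
                        if key not in writers:
                            # Open in append mode so rows from many ZIPs accumulate
                            handles[key] = open(
                                output_folder / names[key], "a",
                                newline="", encoding="utf-8",
                            )
                            writers[key] = csv.writer(handles[key])
                        writers[key].writerow(row)
    finally:
        for handle in handles.values():
            handle.close()
```

Because the output files are opened in append mode, running this sketch twice over the same ZIP would duplicate rows, so the output folder should be empty before a fresh run.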