Mosaic thousands of raster images
Requirement (Problem)
You have 1000's of raster images and you want to create a single mosaic (a complete stitched map) of those images.
Image courtesy of Stack Exchange
Example
A customer had over 3800 ECW images of their area which needed to be viewed at a variety of scales. The key difference here is the need to view at different scales - if that requirement isn't needed then a simple Raster Tile Index can be built.
Why?
To build a single image in ECW format from 3800 ECW files, where the average filesize is between 1-2 Mb would result in an ECW file approximately 7.5 Gb. This guesstimate is taken from 3800*2/1024) - and the assumption that the new overviews that are built within the ECW file would be made up from the spare space when up to 2Gb for each file. Regardless of the output size the requirements would be for a an ERDAS or MrSID license to write a file over 500Mb (£1000,s)
Solution
Potential options include using PostGIS Raster functionality (Database) or GDAL tools (file based)
The following uses GDAL tools to build a Virtual Raster dataset (similar to a tileindex via gdalbuildvrt) of all the files and then an overview image (via gdaladdo) that software that is GDAL enabled (MapServer, GeoServer, QGIS etc) can access - alternative programs would best be served via a WMS feed from MapServer or GeoServer.
Step-by-step guide
The following steps were required for the customer project and will serve you well to follow
- Check all files are valid
- Create Virtual Dataset (VRT)
- Create Overviews
- Load in mapping software (MapServer example)
GDAL with ECW support
We're presuming that you have GDAL installed with ECW support - the ECW SDK may need to be downloaded. On Linux you'll need to recompile GDAL for the support, on Windows you'll need to move the ECW plugin into the GDAL directory.
Check all files are valid
Before embarking on the build of the Overviews, which can take hours to process, it is good to double check that all files are valid. We found issues (possible file corruption) with 2 ECWs from the 3800 which although a very small amount bombed out (stopped) the Overviews build. Here we're going to run a script against ALL the ECW files and try translating them to GeoTIFF. This example is a Linux bash script but an equivalent windows batch file or powershell script wouldn't be difficult to write.
# loop through all the ECW (*.ecw) in the script directory (paths could easily be added i.e. /iShareData/Workshop/Raster/Aerial/*.ecw) for i in *.ecw; do # Process $i and translate to a new file of the same name with a .tif extension gdal_translate $i $i.tif # now delete the file rm $i.tif done
The use of gdal_translate essential as it will do a full read of the files via GDAL and will detect any corruption. gdalinfo will only scan the header of the raster image and not do a full read.
bash at_check_ecw_valid.sh 2> ecw_errors.txt | grep ERROR
- bash - to execute the script
- at_check_ecw_valid.sh - script name
- 2> - only pipe through the errors to the output file
- ecw_errors.txt - output file
- | grep ERROR - linux to only show the error output on screen that contains ERROR (possibly not needed with 2> where 2 specifies only errors)
Create Virtual Dataset (VRT)
The VRT is an XML file, similar to a TileIndex, that references all the ECW files holding the extent, projection etc without processing or translating the files.
gdalbuildvrt all_ecw.vrt *.ecw -a_srs "EPSG:27700"
gdalbuildvrt - The GDAL utility to exefute
- all_ecw.vrt - the output vrt file name
- *.ecw - the input raster images
- -a_srs - force an output projection
- "EPSG:27700" - British National Grid projection
Note regarding reproject
the reprojection to EPSG:27700 (British National Grid - BNG) was needed as our data was supplied with BNG coordinate references but without a projection system. If all your images are in the same projection then this can be ignored - to check a similar script to the above gdal_translate one would be written having a grep command to see if NOT British National Grid
Create Overviews
TIFF supports the creation of “overviews” within the file, which is basically a down sampled version of the raster data suitable for use at lower resolutions.
gdaladdo -ro --config COMPRESS_OVERVIEW JPEG --config PHOTOMETRIC_OVERVIEW YCBCR --config INTERLEAVE_OVERVIEW PIXEL --config BIGTIFF_OVERVIEW YES _all_ecw.vrt 2 4 8 16
- gdaladdo - GDAL utility to execute
- -ro - option to force the generation of external overview
- --config - various config parameters
- COMPRESS_OVERVIEW JPEG - the compressed overviews to be stored in JPEG format
- PHOTOMETRIC_OVERVIEW YCBCR - the photometric interpretation
- INTERLEAVE_OVERVIEW PIXEL - interleave by PIXEL not BAND
- BIGTIFF_OVERVIEW YES - essential if you think the output file will be greater than 4Gb (TIFF has a limit of 4Gb)
- _all_ecw.vrt - input vrt file
- 2 4 8 16 - overview samples
Reference
Referenced from following blog entry at Linfiniti and MapServer and GeoServer documentation from Boundless
Load in mapping software (MapServer example)
The following example shows a layer snippet using the VRT and is referenced in the same way as a standard Raster image or Raster tileindex. GDAL does the magic of looking for the relevant .ovr file or the source ECW images.
LAYER NAME "aerial1999" STATUS DEFAULT TYPE RASTER METADATA "wms_title" "Aerial Photography 1999-2000" "wms_srs" "EPSG:27700" "wms_name" "aerial1999" "wms_server_version" "1.1.1" "wms_format" "image/png" END DATA "Raster\Aerial_Photographs_1999-2000\_all_ecw.vrt" END
Related articles