Better Open Data
Cloud-native Geospatial Data for Northern Ireland

Alex Donald (Geological Survey of Northern Ireland)

Visit https://geoawd.github.io/better-open-data-foss4guk/FOSS4GUK2025.html to access the slides with videos

But first...

These formats were not around when these data were acquired

The source data is unsupported

I'm not criticising the original data - I ❤️ it

...which is why I've processed it.

Why Standards?

Open Geospatial Consortium (OGC) standards are internationally recognised specifications that let different systems exchange information seamlessly.

Interoperability

Enable seamless data exchange between different systems, platforms, and organisations, reducing integration costs and complexity.

Data Consistency and Quality

Promote uniform data formats, definitions, and structures, leading to higher data reliability and easier validation.

Efficiency and Cost Savings

Reduce duplication of effort in data collection, processing, and analysis by allowing reuse of tools and datasets.

Scalability and Future-Proofing

Support scalable solutions that can grow and adapt with evolving technologies and datasets.

Compliance and Collaboration

Facilitate compliance with national/international regulations and foster collaboration across sectors and borders.

Avoid Vendor Lock-in

Not tied to any particular software, storage, or delivery platforms.

Cloud Optimized GeoTIFF

Performance

COGs are designed for efficient access, so a client can retrieve just the data it needs without loading the entire file.

Compatibility

COGs are widely supported by GIS software, cloud platforms, and web mapping libraries, ensuring interoperability.

Reduced Duplication of Data

Store your primary data as a single file that can be served online, instead of needing to process, copy and cache the data.

Scalability

COGs can handle large datasets efficiently, making them suitable for big data applications in geospatial analysis.

How do COGs work?

File structure + HTTP technologies

File Structure

COGs are a specific type of GeoTIFF, optimised for cloud storage and access.
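Because a COG is still a TIFF, a reader starts by decoding the small fixed header at the front of the file, which says where the first image file directory (IFD) lives. The sketch below is only an illustration of that first step using the Python standard library; the function name `parse_tiff_header` is made up for this example and it is not how GDAL or geotiff.js are implemented.

```python
import struct

def parse_tiff_header(header: bytes) -> dict:
    """Decode the fixed header at the start of a TIFF or BigTIFF file.

    Hypothetical helper for illustration only.
    """
    # The first two bytes give the byte order: "II" little-endian, "MM" big-endian
    byte_order = {b"II": "<", b"MM": ">"}.get(header[:2])
    if byte_order is None:
        raise ValueError("not a TIFF file")
    (magic,) = struct.unpack(byte_order + "H", header[2:4])
    if magic == 42:
        # Classic TIFF: a 4-byte offset to the first IFD follows
        (first_ifd,) = struct.unpack(byte_order + "I", header[4:8])
        return {"bigtiff": False, "first_ifd_offset": first_ifd}
    if magic == 43:
        # BigTIFF: offset size (8), a zero field, then an 8-byte IFD offset
        _size, _zero, first_ifd = struct.unpack(byte_order + "HHQ", header[4:16])
        return {"bigtiff": True, "first_ifd_offset": first_ifd}
    raise ValueError("unknown TIFF magic number")

# Example with a hand-built little-endian classic TIFF header
example = struct.pack("<2sHI", b"II", 42, 8)
print(parse_tiff_header(example))
```

In a COG the IFDs and tile offsets sit near the start of the file, so a client that has read the first few kilobytes knows which byte ranges hold the tiles it wants.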

HTTP Range Requests

HTTP Range Requests allow clients to request specific parts of a file, enabling efficient data retrieval without downloading the entire file.

COGs are structured to support these range requests, allowing for efficient access to specific tiles or bands within the raster data.

You're probably all using HTTP range requests already (thanks, streaming video!)
and most of you have probably accessed COGs thanks to Planet.
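As a rough sketch of what a range request looks like from the client side, the standard-library code below asks a server for a slice of a file rather than the whole thing. The helper names and the `example.com` URL are placeholders invented for this example, not part of the talk, and `fetch_range` assumes the server supports range requests.

```python
import urllib.request

def build_range_request(url, start, length):
    """Build a request for `length` bytes beginning at byte `start`."""
    end = start + length - 1  # the end of an HTTP byte range is inclusive
    request = urllib.request.Request(url)
    request.add_header("Range", f"bytes={start}-{end}")
    return request

def fetch_range(url, start, length, timeout=30):
    """Send the request and return only the requested bytes."""
    request = build_range_request(url, start, length)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # 206 Partial Content means the server honoured the range
        if response.status != 206:
            raise RuntimeError("server did not return a partial response")
        return response.read()

# Hypothetical URL: ask for the first 16 KiB, where a COG keeps its header
request = build_range_request("https://example.com/dem.tif", 0, 16384)
print(request.get_header("Range"))
```

A COG reader does this repeatedly: one request for the header, then one request per tile or group of tiles that falls inside the map view.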

COG Compression

Lossy and lossless compression options available

TL;DR: lossless compression, in the case of these LiDAR data, can reduce the data to 1/3 of the original size with no loss of resolution.

View the speaker notes for details of methods and their suitability.
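To give a feel for why a predictor helps smooth elevation data, here is a small standard-library sketch of horizontal differencing (the idea behind TIFF's PREDICTOR=2) followed by lossless compression. It is an illustration under assumptions: the elevation row is synthetic, `zlib` (Deflate) stands in for ZSTD because it ships with Python, and the helper names are invented for this example.

```python
import struct
import zlib

def horizontal_difference(row):
    """Replace each 16-bit sample with its difference from the previous one."""
    previous = 0
    deltas = []
    for value in row:
        deltas.append((value - previous) & 0xFFFF)  # wrap to unsigned 16 bits
        previous = value
    return deltas

def undo_horizontal_difference(deltas):
    """Rebuild the original samples by accumulating the differences."""
    total = 0
    row = []
    for delta in deltas:
        total = (total + delta) & 0xFFFF
        row.append(total)
    return row

def pack(row):
    """Serialise samples as little-endian unsigned 16-bit integers."""
    return struct.pack(f"<{len(row)}H", *row)

# Synthetic, smoothly rising "elevation" row (not real LiDAR data)
row = [1000 + (i * i) // 4000 for i in range(10000)]

plain = zlib.compress(pack(row))
predicted = zlib.compress(pack(horizontal_difference(row)))
print(len(pack(row)), len(plain), len(predicted))

# The predictor is lossless: decoding gives back exactly the same samples
decoded = struct.unpack(f"<{len(row)}H", zlib.decompress(predicted))
assert undo_horizontal_difference(decoded) == row
```

Neighbouring elevation values are usually close together, so the differences are small numbers that compress far better than the raw values, and nothing is lost on the way back.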

How do I make a COG?


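from osgeo import gdal

# Source mosaic (a VRT here) and the COG to write
input_file = r'input.vrt'
output_file = r'output_file.tif'

translate_options = gdal.TranslateOptions(
    format='COG',  # GDAL's COG driver (GDAL 3.1 or later)
    creationOptions=[
        "COMPRESS=ZSTD",         # lossless compression
        "BIGTIFF=YES",           # allow output larger than 4 GB
        "PREDICTOR=2",           # horizontal differencing before compression
        "IGNORE_OVERVIEWS=YES",
        "NUM_THREADS=ALL_CPUS",  # compress using all available cores
    ],
    outputSRS='EPSG:29902',  # assign the Irish Grid SRS (no reprojection)
)
gdal.Translate(output_file, input_file, options=translate_options)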

import os
from osgeo import gdal

input_dir = r'lidar-source/'
output_dir = r'lidar-cog/'

# Create the output folder if it does not exist yet
os.makedirs(output_dir, exist_ok=True)

# The same options apply to every tile, so build them once
translate_options = gdal.TranslateOptions(
    format='COG',
    creationOptions=[
        "COMPRESS=ZSTD",
        "PREDICTOR=2",
        "IGNORE_OVERVIEWS=YES",
        "BIGTIFF=YES",
        "NUM_THREADS=ALL_CPUS",
        "STATISTICS=YES",
    ],
    outputSRS='EPSG:29902',
)

# Convert every GeoTIFF in the source folder, keeping the file names
for f in os.listdir(input_dir):
    if f.endswith('.tif'):
        input_file = os.path.join(input_dir, f)
        output_file = os.path.join(output_dir, f)
        gdal.Translate(output_file, input_file, options=translate_options)

better-open-data.com*

An experiment with COGs that might be useful...

*I needed a domain and couldn't think of anything...

OpenLayers and geotiff.js

Custom hillshade function to display the raw pixel data as an interactive hillshade
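The site's hillshade runs in the browser; the Python sketch below is not that code, just a generic illustration of how raw elevation pixels can be turned into shading. It dots each cell's surface normal with a vector pointing at the sun. The function name, the default sun position (azimuth 315°, altitude 45°) and the assumption that row 0 is the northern edge of the grid are all choices made for this example.

```python
import math

def hillshade(dem, cell_size=1.0, azimuth_deg=315.0, altitude_deg=45.0):
    """Return shading in the range 0..1 for a DEM given as a list of rows.

    Assumes row 0 is the northern edge and that elevations use the same
    units as cell_size. Edge cells fall back to one-sided differences.
    """
    az = math.radians(azimuth_deg)    # clockwise from north
    alt = math.radians(altitude_deg)  # angle above the horizon
    # Unit vector pointing from the ground towards the sun (east, north, up)
    light = (math.sin(az) * math.cos(alt),
             math.cos(az) * math.cos(alt),
             math.sin(alt))
    rows, cols = len(dem), len(dem[0])
    shaded = []
    for r in range(rows):
        north, south = max(r - 1, 0), min(r + 1, rows - 1)
        out_row = []
        for c in range(cols):
            west, east = max(c - 1, 0), min(c + 1, cols - 1)
            # Elevation gradients towards the east and towards the north
            dzdx = ((dem[r][east] - dem[r][west]) / ((east - west) * cell_size)
                    if east > west else 0.0)
            dzdy = ((dem[north][c] - dem[south][c]) / ((south - north) * cell_size)
                    if south > north else 0.0)
            # Surface normal is (-dzdx, -dzdy, 1) scaled to unit length
            length = math.sqrt(dzdx * dzdx + dzdy * dzdy + 1.0)
            dot = (-dzdx * light[0] - dzdy * light[1] + light[2]) / length
            out_row.append(max(0.0, dot))  # slopes facing away are fully dark
        shaded.append(out_row)
    return shaded

# A ramp that rises towards the east, lit from the west and then from the east
ramp = [[0.0, 1.0, 2.0] for _ in range(3)]
print(hillshade(ramp, azimuth_deg=270.0)[1][1], hillshade(ramp, azimuth_deg=90.0)[1][1])
```

Because the shading is computed from the elevation values themselves, the sun position can be changed interactively without producing a new raster.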

DEMO TIME

Let's hope the internet is working!




If you aren't at the FOSS4GUK 2025 conference, there is a series of videos on the following pages that you can view.

Where's the site/data...

Where should the data be...

  • Amazon S3
  • Cloudflare R2
  • Google Cloud Storage
  • Microsoft Azure Blob Storage
  • DigitalOcean Spaces
  • MinIO (self-hosted)

Not just geospatial data...

In summary

COGs are a really good geospatial storage *and* delivery format

If you have open LiDAR data get in touch...

All the code is on GitHub

Thanks!

awdo@bgs.ac.uk

Useful links

cogeo.org
GDAL - COG Driver
Rasterio
STAC in QGIS