Tutorial: Introduction to GDAL

Site: OpenCourseWare for GIS
Course: Programming for Geospatial Hydrological Applications
Book: Tutorial: Introduction to GDAL
Printed by: Guest user
Date: Thursday, 25 April 2024, 2:04 PM

1. Introduction

What is GDAL

GDAL is a translator library for raster geospatial data formats that is released under an X/MIT style Open Source license by the Open Source Geospatial Foundation. As a library, it presents a single abstract data model to the calling application for all supported formats. It also comes with a variety of useful commandline utilities for data translation and processing. See for more info: http://www.gdal.org

Learning objectives

After this course you will be able to:

  • Retrieve information from GIS data
  • Reproject GIS data
  • Change raster properties
  • Convert raster formats
  • Convert vector formats
  • Apply spatial queries on vectors
  • Convert comma separated values files
  • Perform batch conversion

Software

For these exercises GDAL needs to be installed, preferably using the OSGEO4W distribution package. If you have installed QGIS, the OSGeo4W distribution is already there. In the video you can see how to install the LTR version of QGIS, which we will also use later for PyQGIS.


Exercise data

The exercise data can be downloaded from the main page. It contains:

roads.shp: road map from OpenStreetMap (http://openstreetmap.org)

srtm_37_02.tif: tile of a Digital Elevation Model (DEM) from the Shuttle Radar Topography Mission (SRTM) (http://www2.jpl.nasa.gov/srtm/)

gem_2011_gn1.shp:  borders of Dutch communities, freely available from CBS (Statistics Netherlands) and Kadaster (http://www.cbs.nl).

Locations.csv: table with object locations.

landuse.zip: contains land-use time series in IDRISI format.


2. Retrieving information from GIS data

In this chapter you will learn how to:

  • Retrieve information from raster data
  • Retrieve information from vector data
We'll introduce the gdalinfo and ogrinfo commands.

2.1. Retrieve information from raster data

One of the easiest and most useful commands in GDAL is gdalinfo. When given an image as an argument, it retrieves and prints all relevant information that is known about the file. This is especially useful if the image contains additional tag data, as is the case with TIF files. When working with satellite imagery, this is an extremely useful way of keeping track of the images location in long/lat coordinates as well as the image projection.

1. Open the prompt where you have GDAL configured. If you have QGIS installed you can use the OSGeo4W Shell

2. With the command prompt go to the directory where you have saved the exercise data, e.g. Z:\gdal_exercises

3. Execute the following command:
gdalinfo srtm_37_02.tif <ENTER>



4. Try to answer these questions, based on the info printed on the screen:
  • What is the size of the image?
  • What is the coordinate system?
  • What is the EPSG code?
  • What are the minimum and maximum values?

More info on using the gdalinfo command can be found at https://gdal.org/programs/gdalinfo.html

EPSG codes are used to define projections. More info in the video below.

2.2. Retrieve information from vector data

For retrieving info from vector data we use the ogrinfo command.

1. Execute the following command:
ogrinfo -al roads.shp | more <ENTER>
ogrinfo -al gem_2011_gn1.shp | more <ENTER>

2. Answer these questions:

  • What are the projections of these two shapefile?
  • What are the EPSG codes?
As you can see there are several EPSG codes reported. That is because different elements of the projection have their own EPSG code. The rule of thumb is that you can take the last EPSG code. So the projection of roads.shp is EPSG: 4326, which means that it is in Geographic Coordinate System (GCS) with coordinates in Latitude/Longitude.
For gem_2011_gn1.shp the projection is EPSG: 28992, which is the Dutch projection (Amersfoort RD/New).

More info about the ogrinfo command can be found here.

3. Reprojecting GIS data

In this chapter you will learn to:

  • Reproject raster data
  • Reproject vector data
You will use the gdalwarp and ogr2ogr commands.

In this exercise we want to make a map of the community of Delft with the main roads and the relief. Because the datasets are in different formats we have to reproject them to a common coordinate system. Here we reproject all datasets to the Dutch Amersfoort/RD New projection.

3.1. Reproject raster data

GDAL has the capability to change a raster coordinate system using the following syntax:

gdalwarp -t_srs EPSG:... <input> <output>

The -t_srs argument specifies the target coordinate system. If the source coordinate system is unknown it must be specified with the -s_srs argument. EPSG:... specifies the EPSG code of the projection. <input> and <output> are the input and output data respectively.

We are now going to reproject a Digital Elevation Model (DEM) acquired by the Shuttle Radar Topography Mission (SRTM). You can  download DEM's for your own area of interest from USGS Earth Explorer. Here we'll use the provided course data.

In order to reproject the DEM from WGS-84 lat/lon to Amersfoort/RD New we use this command:

gdalwarp -t_srs EPSG:XXXXX srtm_37_02.tif dem_rd.tif

1. Replace XXXXX with the proper EPSG code for Amersfoort/RD New (see one of your previous answers using ogrinfo).

2. Execute the command:

gdalwarp -t_srs EPSG:28992 srtm_37_02.tif dem_rd.tif <ENTER>

3. Visualize the result in QGIS.

More info on the gdalwarp command can be found here.

3.2. Reprojecting vector data

For vector data the ogr2ogr command is used. The syntax is:

ogr2ogr -t_srs EPSG:XXXXX <output> <input>

Note that <output> and <input> are in a different order than for gdalwarp!

Here we'll convert the OpenStreetMaps road data to the Amersfoort/RD New projection.

1. Execute:
ogr2ogr -t_srs EPSG:28992 roadsreprojected.shp roads.shp

The command is executed (it takes some time), but gives a warning.

2. What does the warning mean?

More information about the ogr2ogr command can be found here.


4. Convert GIS formats

In this chapter you'll learn how to

  • Convert between different raster formats
  • Convert between different vector formats
You'll use the gdal_translate and the ogr2ogr commands.

4.1. Convert raster formats

gdal_translate's primary function is to change between image formats. The basic syntax is:

gdal_translate -of FORMAT inputFile outputFile

All supported formats can be found here.

Now we are going to convert the DEM from geoTiff to SAGA format. SAGA is a GIS which has its own -gdal-supported- binary format with the .sdat extension.

1. Execute the following command:
gdal_translate -of SAGA dem_rd.tif dem_rd.sdat

Some formats need more arguments. PCRaster for examples needs a specification of the data type (boolean, nominal, scalar, etc.). Let's convert the same data to PCRaster format.

2. Execute the following command:
gdal_translate -of PCRaster -ot Float32 -mo "VS_SCALAR" dem_rd.tif dem_rd.map

To convert to PCRaster format you need to know the data type of the raster. In the example above the DEM is continuous data, so the data type is scalar. If the layer was discrete (classes) then the data type was nominal.

Data Type
-of -mo
Boolean Byte "VS_BOOLEAN"
Nominal Int32 "VS_NOMINAL"
Ordinal Int32 "VS_ORDINAL"
Scalar Float32 or Float64
"VS_SCALAR"
Direction Float32 or Float64
"VS_DIRECTION"
LDD Int32 "VS_LDD"

4.2. Convert vector formats

GDAL allows conversions between vector formats using the ogr2ogr command.
Here's a list of supported vector formats.

The general syntax is:
ogr2ogr -f <"Format"> <output> <input>
The "Format" is the short name in the vector drivers table and needs to be given in double quotes.

We're going to convert gem_2011_gn1.shp to a Google KML file that can be opened in Google Earth.

1. Execute the following command:
ogr2ogr -f "KML" gem.kml gem_2011_gn1.shp

Ignore the warnings. They're related to formats and characters that are not supported.
Here you can find all options of the ogr2ogr command. Note that you can also reproject and convert at the same time by using -t_srs. There are many other things you can do, but we don't cover in this basic tutorial.

2. Open the file in Google Earth to check the result.


4.3. Spatial queries of vector data

For our map of Delft we want to do the following GIS analysis:

  • Select the community of Delft from the community map and save it into a new shapefile;
  • Intersect the community boundaries of Delft with the road map of the Netherlands.
We can use a spatial query to select a feature from a vector map.

1. What is the attribute in the community map containing the names of the communities? You can use either ogrinfo or QGIS to answer this question.

2. Execute the following command:
ogr2ogr -f "ESRI Shapefile" -where GM_NAAM='Delft' -a_srs EPSG:28992 delft.shp gem_2011_gn1.shp

This will save the feature with GM_NAAM Delft to a new file called delft.shp. The argument -a_srs EPSG:28992 is used to assign the Amersfoort/RD New projection to the output file. The argument -f defines the output format.

3. We can clip the roadsreprojected.shp vector so it only covers the municipality of Delft. We use the -clipsrc argument.
ogr2ogr -clipsrc delft.shp roadsdelft.shp roadsreprojected.shp
 
4. Now open in QGIS the reprojected DEM (dem_rd.tif), the clipped road map (roadsdelft.shp) and the community of Delft (delft.shp).


Result GDAL

4.4. Convert Comma Separated Values (CSV)

Sometimes you want to reproject coordinates in an ASCII file, e.g. which has been saved to text in a spreadsheet program. Here we will convert the coordinates in a comma separated ASCII file locations.csv to a new ASCII file locations_reprojected.csv.

It is good practice to first view the contents of a CSV file and to verify (1) the coordinates, and (2) the column separator. The separator is not always a comma and sometimes depends on the language settings used while exporting from a spreadsheet programme.

1. View the contents of locations.csv . You can use the type command from the command prompt as you learned in the Command Line tutorial.

We can see that coordinates are in lat/lon, which means that we can use EPSG:4326. The column separator is a comma. The last column gives objects with a string in quotes.

To change the projection of the CSV, we first have to create a virtual data source by creating an XML control file.

2. On your windows computer open notepad and type/copy the XML code given below. Use indentations of three spaces.

<OGRVRTDataSource>
   <OGRVRTLayer name="locations">
      <SrcDataSource>locations.csv</SrcDataSource>
      <GeometryType>wkbPoint</GeometryType>
      <LayerSRS>EPSG:4326</LayerSRS>
      <GeometryField encoding="PointFromColumns" x="lon" y="lat"/>
   </OGRVRTLayer>
</OGRVRTDataSource>

3. Save the file as locations.vrt in the gdal_exercises folder.

Some explanation about the XML file:

  • <OGRVRTLayer name="locations"> should correspond with the <SrcDataSource>locations.csv</SrcDataSource>
  • <LayerSRS>EPSG:4326</LayerSRS> should correspond with the EPSG code of the coordinate columns
  • <GeometryField encoding="PointFromColumns" x="lon" y="lat"/> indicates the columns with the coordinates that you want to convert.
4. Execute the following command:
ogr2ogr -t_srs EPSG:28992 -f "CSV" locations_reprojected.csv locations.vrt -lco GEOMETRY=AS_XY

In this example locations.csv with lat/lon WGS-84 coordinates is converted to locations_projected.csv with Amersfoort/RD New projection.

5. Use notepad to check locations_reprojected.csv. What is saved in each column?

In the same way we can convert the comma separated file to a shapefile.

6. Execute the following command:
ogr2ogr -f "ESRI Shapefile" -t_srs EPSG:28992 locations.shp locations.vrt

7. Check the result with ogrinfo.

8. Visualize the shapefile in QGIS by plotting the locations over the DEM, road map and Delft community border. Make a nice map.

9. Convert the locations file to Google KML, open in Google Earth and find out what the object locations are.

5. Batch conversion

Desktop GIS programmes are very useful for GIS operations, but are hard to use if we have to repeat the same task for many GIS layers. Then scripting can be a solution.

Here we have an example dataset from a land-use model of Dublin. The data are in IDRISI raster format (.rst), with one layer for each year between 1990 and 2030. Our task is to convert all layers to GeoTiff (.tif )format.

1. Unzip landuse.zip provided with the course data to a new folder and check the contents.

Remember that for one raster file conversion we would use this:
gdal_translate -of GTiff 01_State19900101.rst 01_State19900101.tif

Now we're going to make a batch file (see Command Line tutorial) which includes a loop to convert all files in the folder.

2. Open a text editor, e.g. Notepad
3. Add the following code:

for %%f in (*.rst) do (
   echo %%~nf
   gdal_translate -of GTiff %%f %%~nf.tif
)

4. Save the batchfile as rst2tif.bat in the folder with the land-use rasters (don't forget to change the extension if you use Notepad, classic mistake!)

Try to understand the code. This is a for loop that loops over all *.rst files in the folder. %%f is the variable that contains the filename of each file. With echo we can print something to the screen. Here we print %%~nf , which is the part of the filename before the dot that separates it from the extension. Then we use the gdal_translate command with output format GeoTiff. At the end of the line we add the .tif extension to the filename.

5. Execute the batchfile. Type
rst2tif <ENTER>

6. Check the results