perrygeo

About | Articles | CV

Python affine transforms

Sun 13 September 2015

Raster data coordinate handling with 6-element geotransforms is a pain. Use the affine Python library instead.

The typical geospatial coordinate reference system is defined on a cartesian plane with the 0,0 origin in the bottom left and X and Y increasing as you go up and to the right. But raster data, coming from its image processing origins, uses a different referencing system to access pixels. We refer to rows and columns with the 0,0 origin in the upper left and rows increase and you move down while the columns increase as you go right. Still a cartesian plane but not the same one. xyrowcol

So how do you transform between the two? Affine transformations provide a simple way to do it through the use of matrix algebra. Geospatial software of all varieties use an affine transform (sometimes refered to as "geotransform") to go from raster rows/columns to the x/y of the coordinate reference system. Converting from x/y back to row/col uses the inverse of the affine transform. Of course the software implementations vary widely.

For the remainder, I'll assume the simple case of a non-rotated "north up" raster as that is by far the most common case.

If you're coming from the matrix algebra perspective, you can ignore the constants in the affine matrix and refer to the the six paramters as a, b, c, d, e, f. This is the ordering and notation used by the affine Python library.

a = width of a pixel
b = row rotation (typically zero)
c = x-coordinate of the upper-left corner of the upper-left pixel
d = column rotation (typically zero)
e = height of a pixel (typically negative)
f = y-coordinate of the of the upper-left corner of the upper-left pixel

Perhaps the most pervasive implementation of affine transform encoding in the GIS world is the ESRI World File. The world file is a simple text file accompanying any raster image which uses six line-separated values in this order:

a = width of a pixel
d = column rotation (typically zero)
b = row rotation (typically zero)
e = height of a pixel (typically negative)
c = x-coordinate of the center of the upper-left pixel
f = y-coordinate of the center of the upper-left pixel

It's important to note that the c and f parameters refer to the center of the cell, not the origin!

GDAL also uses the 6 parameter transform in yet a different order with the "Geotransform" array

c = x-coordinate of the upper-left corner of the upper-left pixel
a = width of a pixel
b = row rotation (typically zero)
f = y-coordinate of the of the upper-left corner of the upper-left pixel
d = column rotation (typically zero)
e = height of a pixel (typically negative)

None of those orderings are particularly intutive but at least the first, as implemented by affine, is "correct" from the matrix algebra perspective.

For python programmers looking to work with raster data, the osgeo.gdal library has existed for quite a while. With it the notion of a 6-tuple geotransform in GDAL ordering has become pervasive. And if ordering were the only issue, it wouldn't necessarily be worth switching to the use of the affine library. The more convincing argument for the use of affine is the ease with which you can transform coordinates. In other words, why should you have to worry about ordering of parameters at all?

When dealing with the geotransform as a simple 6-element tuple, you'll probably end up writing code like this to do the actual conversion:

# Using osgeo.gdal and GDAL geotransform 6-tuples
gt = ds.GetGeoTransform()

# col, row to x, y
x = (col * gt[1]) + gt[0]
y = (row * gt[5]) + gt[3]

# x,y to col,row
col = int((x - gt[0]) / gt[1]) 
row = int((y - gt[3]) / gt[5])

I'd be willing to guess that variations of that formula exist in hundreds of python codebases. Not very complicated math but opaque enough not to commit to memory. It's also very easy to slip up ("Is the y origin element 4 or 5?") and introduce non-obvious bugs. Why should such a basic formulation be reimplemented by every programmer? Again, why rely on element ordering at all? affine, through the use of clever operation overloading, gives you a much simpler interface:

# Using rasterio and affine
a = ds.affine

# col, row to x, y
x, y = a * (col, row)

# x, y to col, row
col, row = ~a * (x, y)

Clean, nice looking code that's harder to get wrong, wouldn't you agree? And as @Asgerpetersen pointed out, if there were a non-zero rotation parameter, the affine example would handle it seamlessly while the geotransform formula would fail.

Also, interoperability with GDAL-style geotransforms is painless

# construct from our GDAL geotransform
a = Affine.from_gdal(*gt)
gt = a.to_gdal()

As is the ability to read/write from World Files

from affine import loadsw, dumpsw

# Read from World File
with open('raster.tfw') as tfw:
    a = loadsw(tfw.read())

# Write to World File
with open('other.wld', 'w') as dest:
    dest.write(dumpsw(a))

With rasterio planning to deprecate the use of GDAL-style geotransforms in the 1.0 release, it's never too early to start making the switch. Your cleaner raster coordinate code will be well worth the effort.