RASTER DATA: The Python developers’ playing ground

In your geospatial Python scripting endeavors, you are more than likely to work with various GIS raster data formats. Raster data is essentially a grid of valued cells organized in rows and columns. Unlike an ordinary digital image, a raster's cells are tied to ground locations and can hold measured values such as elevation or reflectance, often across several bands, and this is what allows Python to extract information that matters in raster analysis. For example, the bands of a multi-band image record different wavelength ranges, which gives a developer the ability to extract very meaningful information by comparing them.
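To make the "grid of valued cells" idea concrete, here is a minimal sketch that models a two-band raster with plain Python lists. The band names ("red", "nir"), the cell values and the helper functions are all made up for illustration; real projects would normally read the bands from a file through a raster library rather than type them in.

```python
# A tiny 2-band raster: band name -> rows -> columns.
# Band names and values are hypothetical sample data.
raster = {
    "red": [
        [50, 60, 70],
        [55, 65, 75],
    ],
    "nir": [
        [150, 160, 170],
        [155, 165, 175],
    ],
}


def cell_value(raster, band, row, col):
    """Return the value stored in one cell of one band."""
    return raster[band][row][col]


def band_shape(raster, band):
    """Return (rows, columns) for a band."""
    rows = raster[band]
    return len(rows), len(rows[0])


def band_difference(raster, band_a, band_b):
    """Subtract band_b from band_a, cell by cell."""
    return [
        [a - b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(raster[band_a], raster[band_b])
    ]


print(band_shape(raster, "red"))
print(cell_value(raster, "nir", 1, 2))
print(band_difference(raster, "nir", "red"))
```

Comparing bands cell by cell, as band_difference does, is the basic operation behind most multi-band analysis.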

Scientific computing libraries for Python have matured enough to process most of the commonly used GIS raster data formats. Below are raster data formats that you can process using the Python programming language:

TIFF (GeoTIFF)

TIFF stands for Tagged Image File Format, a popular raster data format for Python programmers because its tagging system is flexible enough to store almost any type of data. Nevertheless, this extensibility can be a drawback when different developers store their data in very different ways. A GeoTIFF stores its georeferencing in tags inside the file itself; a TIFF may also be accompanied by sidecar files such as a .tfw world file (geolocation), an .xml file (metadata) and an .aux file (auxiliary information such as projections).
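The tagging system can be inspected with nothing but the standard library. The sketch below reads the tag IDs from the first image file directory of a classic TIFF and checks for three tag numbers that GeoTIFF uses for georeferencing. The function names and the build_demo_tiff helper are hypothetical, and the demo bytes are a header-only stand-in rather than a real image; in practice a raster library would do this work for you.

```python
import struct

# Tag numbers that GeoTIFF uses for georeferencing.
GEOTIFF_TAGS = {
    33550: "ModelPixelScaleTag",
    33922: "ModelTiepointTag",
    34735: "GeoKeyDirectoryTag",
}


def list_tiff_tags(data):
    """Return the tag IDs in the first image file directory (IFD)."""
    if data[:2] == b"II":
        endian = "<"  # little-endian
    elif data[:2] == b"MM":
        endian = ">"  # big-endian
    else:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF file")
    (entry_count,) = struct.unpack(endian + "H", data[ifd_offset:ifd_offset + 2])
    tags = []
    for i in range(entry_count):
        start = ifd_offset + 2 + i * 12  # each IFD entry is 12 bytes
        tag_id, _type, _count, _value = struct.unpack(
            endian + "HHII", data[start:start + 12]
        )
        tags.append(tag_id)
    return tags


def has_georeferencing(tags):
    """True if any GeoTIFF georeferencing tag is present."""
    return any(tag in GEOTIFF_TAGS for tag in tags)


def build_demo_tiff(tag_ids):
    """Hypothetical helper: header-only little-endian TIFF with dummy entries."""
    header = b"II" + struct.pack("<HI", 42, 8)
    entries = b"".join(struct.pack("<HHII", t, 3, 1, 0) for t in tag_ids)
    return header + struct.pack("<H", len(tag_ids)) + entries + struct.pack("<I", 0)


demo = build_demo_tiff([256, 257, 33550, 34735])
tags = list_tiff_tags(demo)
print(tags, has_georeferencing(tags))
```

To try it on a real file, read the file's bytes in binary mode and pass them to list_tiff_tags.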

World files

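These are simple text files that hold georeferencing information for image formats that cannot store spatial information themselves. World files commonly accompany PNG, JPEG and GIF images. The naming convention makes them easy to handle, even in GIS software: the world file shares the image's base name and typically takes an extension built from the first and last letters of the image extension plus "w" (for example .jgw for .jpg, .pgw for .png and .tfw for .tif). A world file has six lines: the cell size along the x axis, two rotation terms, the cell size along the y axis (usually negative), and the x and y coordinates of the center of the upper-left cell. With these values you can overlay a .jpg file on an online base map layer, provided you know the coordinate system, which the world file itself does not record.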
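As a rough sketch of how those values are used, the code below parses the six lines of a world file and converts a pixel position (column, row) into map coordinates. The six values appear in the order A, D, B, E, C, F, where A and E are the cell sizes, D and B are the rotation terms, and C and F locate the center of the upper-left cell. The sample numbers and function names are invented for illustration.

```python
def parse_world_file(text):
    """Parse the six numeric lines of a world file into named parameters."""
    values = [float(token) for token in text.split()]
    if len(values) != 6:
        raise ValueError("a world file must contain exactly six values")
    a, d, b, e, c, f = values  # order in the file: A, D, B, E, C, F
    return {"a": a, "d": d, "b": b, "e": e, "c": c, "f": f}


def pixel_to_world(params, col, row):
    """Map a pixel (col, row) to the map coordinates of that cell's center."""
    x = params["a"] * col + params["b"] * row + params["c"]
    y = params["d"] * col + params["e"] * row + params["f"]
    return x, y


# Hypothetical world file: 30 m cells, no rotation, y size negative
# because rows run from north to south.
sample = "30.0\n0.0\n0.0\n-30.0\n500015.0\n4100985.0\n"

params = parse_world_file(sample)
print(pixel_to_world(params, 0, 0))  # center of the upper-left cell
print(pixel_to_world(params, 2, 3))  # two columns right, three rows down
```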

JPEG 2000 and MrSID

JPEG 2000 is an update of JPEG that improves compression and allows additional data, such as georeferencing information, to be stored with the image. A compression ratio of 20:1 is great, especially when dealing with background imagery. Both JPEG 2000 and MrSID support lossy compression (which discards some data to reduce file size) as well as lossless compression (which preserves the data at the cost of larger files). Both can be processed through Python scripting, given a library with the appropriate drivers. However, lossy compressed formats are not recommended for remote sensing analysis, because the stored cell values no longer match the original measurements.
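The practical difference between the two kinds of compression can be shown with a toy example. This sketch does not use JPEG 2000 or MrSID: zlib stands in for a lossless codec, and a simple rounding step (the hypothetical quantize function) stands in for a lossy one. The pixel values are made up.

```python
import zlib

# Eight made-up 8-bit cell values.
pixels = bytes([12, 13, 15, 200, 201, 203, 90, 91])

# Lossless stand-in: after a compress/decompress round trip,
# every value comes back unchanged.
restored = zlib.decompress(zlib.compress(pixels))


def quantize(values, step=8):
    """Lossy stand-in: round each value down to a multiple of `step`."""
    return bytes((v // step) * step for v in values)


approx = quantize(pixels)

# Largest difference between an original value and its lossy version.
max_error = max(abs(a - b) for a, b in zip(pixels, approx))

print(restored == pixels)
print(list(approx), max_error)
```

The lossy version is still a plausible picture, but the individual cell values have changed, which is the concern when those values are measurements to be analyzed.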

ASCII GRIDS

ASCII grids are mostly used to store elevation data, and the format is supported by most GIS software as well as Python scripting. The file is plain text: a short header describes the grid, and the cell values follow as rows and columns. The header contains the following information:

  • Number of columns and rows
  • X and Y coordinates of the lower-left cell (its corner or its center)
  • Size of the cells in map units.
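Because the format is plain text, a header like the one described above can be read without any GIS library. The following is a minimal sketch assuming the common Esri-style keywords (ncols, nrows, xllcorner, yllcorner, cellsize and an optional NODATA_value); the sample grid and the function name are invented for illustration.

```python
def parse_ascii_grid(text):
    """Split an ASCII grid into a header dict and a list of value rows."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = {}
    index = 0
    # Header lines start with a keyword; data lines start with a number.
    while index < len(lines):
        parts = lines[index].split()
        if not parts[0][0].isalpha():
            break
        header[parts[0].lower()] = float(parts[1])
        index += 1
    ncols = int(header["ncols"])
    nrows = int(header["nrows"])
    values = [float(v) for line in lines[index:] for v in line.split()]
    if len(values) != ncols * nrows:
        raise ValueError("cell count does not match ncols * nrows")
    # The first data row is the northernmost row of the grid.
    rows = [values[r * ncols:(r + 1) * ncols] for r in range(nrows)]
    return header, rows


# Hypothetical 3 x 2 elevation grid with one missing cell.
sample_grid = """ncols 3
nrows 2
xllcorner 500000.0
yllcorner 4100000.0
cellsize 30.0
NODATA_value -9999
10 12 -9999
11 13 15
"""

header, rows = parse_ascii_grid(sample_grid)
print(header["cellsize"], rows)
```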

Other raster data formats are also used in geospatial data processing, but the file formats mentioned above are the best place to start for Python developers!

Bonface Thaa
Bonface Thaa is a Python developer and a GIS software consultant who graduated from Dedan Kimathi University of Technology with a Bachelor of Science degree in Geomatic Engineering. He is passionate about GIS technologies and has expertise in designing and developing custom GIS solutions using Python as well as popular JavaScript mapping libraries. Bonface likes to read and write articles from time to time. He is also passionate about attending and participating in seminars, GIS hackathons and conferences.