How to process Shuttle data into a Paragliding Oracle

Finding locations with specific slope directions and inclines

Digital Elevation Mapping data

To begin with I needed to find some elevation data which I could process. After a lot of exploration I discovered that the Shuttle Radar Topographic Mission had generated a 90metre data set (that is, a grid of height information for every 3 arc-seconds around the curve of the earch - roughly 90m at our latitude). This seemed like it would be good enough. I had a sniff of a 25metre grid but like many data sets paid for by public money, these are deliberately prevented from being shared with the public. This is a kind of divisive and frankly selfish behaviour which I wouldn't expect from scientists or public service organisations, but I'm eternally surprised. If anyone has access to this data (limited only to academics with Athens credentials) and is willing to share it with me, then I'd be able to do a much better job.

With data in hand (covering a whole square of five longitude by five latitude) I had to find a way to process it in order to find hills.

NASA's SRTM Coordinate system

The first thing I did was to understand the structure of the data set. It turned out to be just a series of integer values for the height in metres for every 90 metre step, with one line per horizontal scan of the long-lat square in an ASCII text file This implicit coordinate system needs to be translated into longitude and latitude if you want the result to be plotted in something like Google Earth. Also if you're looking for hills near where you live, you need to do the neighbourhood search relative to a longitude latitude value too, so some translation is needed.

The file I was interested in carries the code Z_37_2.ASC and can be downloaded here. More information about the structure is available in this readme file. You can get a copy of the elevation data which covers your part of the globe - a 5 by 5 long,lat degree square - by browsing for the files through this Google earth view (requires Google Earth to be installed).

You can inspect files on a mac or linux system with a tool called wc short for 'word count' which is invoked as follows...

wc Z_37_2.ASC

...and reports the following for our file...

6006 36000012 179929191 Z_37_2.ASC

...which means basically 6000 lines (plus dross) with 6000 entries per line - a total of 36million records and 179 million bytes (MB) of data.

Here are some useful facts about the file's coordinate system...

  • Column 0, Row 0 (top left corner) is 55 degrees lat, 0 degrees long (on a vertical line with greenwich)
  • 6000,6000 (bottom right corner) is 50 degrees lat, 5 degrees long (stretching out into continental europe)
  • Each row and column implies a step of 3 arc seconds over the earth's surface

Making Ipswich the Centre of the Universe

Ipswich is at 1390, 3530 in [columns,rows] - in the ASCII file format, by the following calculations..



        

Ipswich is found at 52 03.30' N,  1 09.30' E (annotated in arc degrees, arc minutes and arc seconds from the equator and the meridian)



Take away 50 and 0 respectively for the bottom left corner (I find it easier to think in terms of x and y, which matches to lat and long).



There are 3600 arc seconds in a degree of lat/long and 60 arc seconds in a minute of lat/long



Northings position in arc seconds translated into grid columns and rows

= (2 * 3600) + (03 * 60) + (30)

= 7200 + 180 + 30

= 7410



Eastings position in arc seconds translated into grid columns and rows

= (1 * 3600) + (09 * 60) + (30)

= 3600 + 540 + 30

= 4170               



divide both by 3 to normalise for scale (3 degrees per square)

2470, 1390



The row coordinate system runs the opposite way from lat/long (top to bottom instead of bottom to top - invert the y coordinate)

= 6000 - 2470

= 3530

For final Ipswich coordinate switch northings with eastings to get columns and rows (as x,y)

= 1390, 3530

        

      

Processing the file for querying

The importance of columns and rows should become obvious. We're going to use a text processing tool called awk to grab the data from specific columns and rows by reading through the file, and then output that in MySQL format. We need to get the 4 million record subset of the nearest points to ipswich. In order to get the data for a 90km square, we're going to grab 1000 records either side of the Ipswich row and column.

Awk is fundamentally aware of rows and columns for processing tabular data, and can do this job very quickly. The script below turns the data from the flat ASCII file into commands which populates the MySQL database with individual rows for each elevation data point, recording the row and column it was grabbed from.

It's invoked by the following magic words which spits the whole file (cat) through a pipe (|) to be processed by the awk file (awk -f arcascii2mysql.awk) and then output to a new file (> Z_37_2.MYSQL)...



        

time cat Z_37_2.ASC | awk -f arcascii2mysql.awk > Z_37_2.MYSQL

        

      

The use of the prefix command 'time' is just to report how long it takes. On my machine (Macbook Pro, Core Duo) this takes about one minute to process all 36 million records. The full source of the arcascii2mysql.awk file is shown below. It does a lot of jobs in one as I'll explain.

arcascii2mysql.awk



      

BEGIN {

  print "CREATE DATABASE IF NOT EXISTS anglia;"

  print "USE anglia;"

  print "CREATE TABLE IF NOT EXISTS elevation ( id INT NOT NULL AUTO_INCREMENT PRIMARY KEY, col SMALLINT, row SMALLINT, z SMALLINT);"

  print "CREATE INDEX coord_col_row ON elevation (col,row);"

  print "CREATE INDEX coord_row ON elevation (col);"



  

  latdeg = 52;  

  latmin = 3;

  latsec = 30;

  

  longdeg = 1;

  longmin = 9;

  longsec = 30;

  



  gridoriginlatdeg  = 50;

  gridoriginlongdeg  = 0;

  gridwidth          = 6000;

  gridheight        = 6000;

  gridsearchradius   = 1000;

  

  centerrow = 6000 - ((((latdeg-gridoriginlatdeg) * 3600) + (latmin * 60) + latsec) / 3);

  centercol = ((((longdeg-gridoriginlongdeg) * 3600) + (longmin * 60) + longsec) / 3);

  lowercol  = centercol - gridsearchradius;

  uppercol  = centercol + gridsearchradius;

  lowerrow  = centerrow - gridsearchradius;

  upperrow  = centerrow + gridsearchradius;



  lineno = 0;



}



/^[-0-9]/{

  if(lineno > lowerrow && lineno < upperrow){

  print "INSERT IGNORE INTO elevation (id,col,row,z) VALUES ";

  for(fieldno=lowercol;fieldno<=NF && fieldno<=uppercol;fieldno++){

    if(match($fieldno, "[0-9]")){

      if($fieldno == -9999){

        $fieldno = "NULL";

      }

      if(fieldno == lowercol){

        sep = ""

      }

      else{

        sep = ","

      }

      print sep "("( (lineno * gridwidth) + fieldno - 1) ", " (fieldno - 1) ", " lineno ", " $fieldno ")";

    }

  }

  print ";"

  }

  lineno++;

}



END {

  print "ALTER TABLE elevation";

  for(wind = 45; wind <= 360; wind += 45){

    print "ADD COLUMN `" wind "` TINYINT "

    if(wind != 360){

      print ","

    }

  }

  print ";"



  PI=3.14159265358979;

  for(wind = 45; wind <= 360; wind += 45){

    rad = (wind * 2 * PI) / 360;

    print "UPDATE elevation, elevation AS neighbour SET elevation.`" wind "` = (SELECT elevation.z - neighbour.z) WHERE neighbour.row = elevation.row + " int(cos(rad) * -1.5) " AND neighbour.col = elevation.col + " int(sin(rad) * 1.5) " ;";

    print "SELECT id, `" wind "`, (abs(col-" centercol ") + abs(row - " centerrow ")) as mdist, concat( " gridoriginlongdeg " + (col / 1200), ',', (" gridoriginlatdeg " + ((6000-row) /1200))) as gpsfloat, " wind " as wind from elevation  ORDER BY `" wind "` DESC, mdist ASC LIMIT 1000 into outfile '/tmp/" wind ".txt';";

  }  

}

        

      

To make sense of this awk script, you have to understand that this is a program for writing a program. Awk is actually creating a MySQL command which inserts records, indexes them and then queries them to find the slope in every different wind direction. You can inspect the commands directly if you like, before feeding awk's output into MySQL. An example of the kind of command it writes is shown below. However, note that the MySQL query file written is nearly 108 megabytes just for the 3000 arc seconds around Ipswich - it contains a lot of rows of data.

Stepping through the Awk file

First of all, the file creates the relevant database, called 'anglia' and a table to store our grid of elevation points. It sets up this table with some useful indexes in place so that large queries can be processed rapidly.

Then it calculates the proper grid section to retrieve, given our central lat/long point.

It checks for every line, whether it should be written into the MySQL database (within 1000 grid points of Ipswich) or just ignored. If it writes the line, it will record the row and column of the original point, and the height.

Then it creates new fields for each elevation, which will store the slope in each direction from that point(45, 90, 135 degrees, and so on). It fills in these values by testing the neighbours.

Finally it retrieves the most significant slopes (nearest and highest) and outputs them in a tabulated format to text files named '45.txt', '90.txt' etc. for every different 45 degree angle - a form suitable for repurposing later as a Google Earth file.

Deriving slope from elevation

In principle this is pretty easy, we can take neighbouring grid points and generate a new field for each record in the table which is the difference between itself and its neighbours. We need 8 fields altogether for the 8 immediate neighbours, which if you name them according to the angle between are 45, 90, 135, 180, 225, 270, 315 and 360 correlating to a NE, E, SE, S, SW, W, NW and N wind direction for paragliding. It populates these by testing the difference between each elevation and its relevant neighbour (one up and one across for a 45 degree angle). There's a useful sin() and int() number rounding hack which translates from the angle in degrees into +1 or 0 or -1 depending which neighbour it needs to test for a given angle.

To create an individual slope record for a given direction, let us say 45 degrees (North East) , we can first create the extra field called `45`, then populate it. Note the use of backticks to escape the numerical field name and the fact that rows increase towards the equator, and columns increase away from the Greenwich meridian, so 45 degrees North East is one row less, and one column more than the grid point you are testing.

Below are the important extracts from the MySQL query which Awk writes. Note especially the final retrieval using proximity to Ipswich and incline to the North East as the fields which we order on. My preference was to order first by steepness, and secondly by proximity. You definitely don't want to retrieve all records, so be sure to add a LIMIT clause at the end. If you do order by proximity first instead then you need to add a clause which limits to height differences greater than 10 metres or some sensible value for your locality, at least the same as the glide angle of a paraglider. Without this it just retrieves all the nearest points.

Z_37_2.MYSQL (written by the preceding awk script)



CREATE DATABASE IF NOT EXISTS anglia;

USE anglia;

CREATE TABLE IF NOT EXISTS elevation ( id INT NOT NULL AUTO_INCREMENT PRIMARY KEY, col SMALLINT, row SMALLINT, z SMALLINT);

CREATE INDEX coord_col_row ON elevation (col,row);

CREATE INDEX coord_row ON elevation (col);

INSERT IGNORE INTO elevation (id,col,row,z) VALUES

(15186389, 389, 2531, NULL)

,(15186390, 390, 2531, NULL)



/...and so on.../



,(27176385, 2385, 4529, NULL)

,(27176386, 2386, 4529, NULL)

,(27176387, 2387, 4529, NULL)

,(27176388, 2388, 4529, NULL)

,(27176389, 2389, 4529, NULL);



ALTER TABLE elevation

ADD COLUMN `45` TINYINT,

ADD COLUMN `90` TINYINT,

ADD COLUMN `135` TINYINT,

ADD COLUMN `180` TINYINT,

ADD COLUMN `225` TINYINT,

ADD COLUMN `270` TINYINT,

ADD COLUMN `315` TINYINT,

ADD COLUMN `360` TINYINT;

UPDATE elevation, elevation AS neighbour SET elevation.`45` = (SELECT elevation.z - neighbour.z) WHERE neighbour.row = elevation.row + 1 AND neighbour.col = elevation.col + 1 ;

SELECT id, `45`, (abs(col-1390) + abs(row - 3530)) as mdist, concat( 0 + (col / 1200), ',', (50 + ((6000-row) /1200))) as gpsfloat from elevation ORDER BY `45` DESC, mdist ASC LIMIT 100 into outfile '/tmp/45.txt';

UPDATE elevation, elevation AS neighbour SET elevation.`90` = (SELECT elevation.z - neighbour.z) WHERE neighbour.row = elevation.row + 0 AND neighbour.col = elevation.col + 1 ;

SELECT id, `90`, (abs(col-1390) + abs(row - 3530)) as mdist, concat( 0 + (col / 1200), ',', (50 + ((6000-row) /1200))) as gpsfloat from elevation ORDER BY `90` DESC, mdist ASC LIMIT 100 into outfile '/tmp/90.txt';

UPDATE elevation, elevation AS neighbour SET elevation.`135` = (SELECT elevation.z - neighbour.z) WHERE neighbour.row = elevation.row + -1 AND neighbour.col = elevation.col + 1 ;

SELECT id, `135`, (abs(col-1390) + abs(row - 3530)) as mdist, concat( 0 + (col / 1200), ',', (50 + ((6000-row) /1200))) as gpsfloat from elevation ORDER BY `135` DESC, mdist ASC LIMIT 100 into outfile '/tmp/135.txt';

UPDATE elevation, elevation AS neighbour SET elevation.`180` = (SELECT elevation.z - neighbour.z) WHERE neighbour.row = elevation.row + -1 AND neighbour.col = elevation.col + 0 ;

SELECT id, `180`, (abs(col-1390) + abs(row - 3530)) as mdist, concat( 0 + (col / 1200), ',', (50 + ((6000-row) /1200))) as gpsfloat from elevation ORDER BY `180` DESC, mdist ASC LIMIT 100 into outfile '/tmp/180.txt';

UPDATE elevation, elevation AS neighbour SET elevation.`225` = (SELECT elevation.z - neighbour.z) WHERE neighbour.row = elevation.row + -1 AND neighbour.col = elevation.col + -1 ;

SELECT id, `225`, (abs(col-1390) + abs(row - 3530)) as mdist, concat( 0 + (col / 1200), ',', (50 + ((6000-row) /1200))) as gpsfloat from elevation ORDER BY `225` DESC, mdist ASC LIMIT 100 into outfile '/tmp/225.txt';

UPDATE elevation, elevation AS neighbour SET elevation.`270` = (SELECT elevation.z - neighbour.z) WHERE neighbour.row = elevation.row + 0 AND neighbour.col = elevation.col + -1 ;

SELECT id, `270`, (abs(col-1390) + abs(row - 3530)) as mdist, concat( 0 + (col / 1200), ',', (50 + ((6000-row) /1200))) as gpsfloat from elevation ORDER BY `270` DESC, mdist ASC LIMIT 100 into outfile '/tmp/270.txt';

UPDATE elevation, elevation AS neighbour SET elevation.`315` = (SELECT elevation.z - neighbour.z) WHERE neighbour.row = elevation.row + 1 AND neighbour.col = elevation.col + -1 ;

SELECT id, `315`, (abs(col-1390) + abs(row - 3530)) as mdist, concat( 0 + (col / 1200), ',', (50 + ((6000-row) /1200))) as gpsfloat from elevation ORDER BY `315` DESC, mdist ASC LIMIT 100 into outfile '/tmp/315.txt';

UPDATE elevation, elevation AS neighbour SET elevation.`360` = (SELECT elevation.z - neighbour.z) WHERE neighbour.row = elevation.row + 1 AND neighbour.col = elevation.col + 0 ;

SELECT id, `360`, (abs(col-1390) + abs(row - 3530)) as mdist, concat( 0 + (col / 1200), ',', (50 + ((6000-row) /1200))) as gpsfloat from elevation ORDER BY `360` DESC, mdist ASC LIMIT 100 into outfile '/tmp/360.txt';

      

Processing the output of Awk in MySQL

This is as simple as firing the contents of the file generated by the awk script (the previous step) directly into mysql. The invocation is as follows, with the '<' sign causing the file to be passed into MySQL...

time mysql -u root < Z_37_2.MYSQL

On my machine, (Macbook Pro, Core Duo), running the MySQL which awk has written for us takes about 30 minutes. The reason is that the records are being not merely processed, but indexed and then queried (as you can see from the SQL declarations which Awk writes).

The Money Shot - Google Earth Integration

Having mere tables isn't as useful as if we had it show up nice friendly wind arrows at the correct longitude and latitude in Google Earth, now is it?

Now we are in a position to find the most significant North Easterly inclines near Ipswich - a 3d visual document which I can refer to when the paragliding meteorological websites tell me I have a flyable North Easterly.

It's as simple as this...

cat /tmp/360.txt /tmp/45.txt /tmp/90.txt | awk -f high2kmz.awk > ne.kmz

Which is concatenating the North, North East and East tables, then firing them through the following awk script to get it to write a Google Earth (.kmz) file, the script looks like this...

And the output looks like this...



        

        

      <?xml version='1.0' encoding='UTF-8'?>

      <kml xmlns='http://earth.google.com/kml/2.1'>

        <Folder>

          <Placemark><Style><LabelStyle><scale>0.1</scale></LabelStyle><IconStyle><heading>360</heading><scale>0.925</scale><Icon><href>http://cefn.com/images/wind/n.gif</href></Icon></IconStyle></Style><name>0</name><description></description><Point><coordinates>0.4075,51.3400</coordinates></Point></Placemark>

          <Placemark><Style><LabelStyle><scale>0.1</scale></LabelStyle><IconStyle><heading>360</heading><scale>0.9</scale><Icon><href>http://cefn.com/images/wind/n.gif</href></Icon></IconStyle></Style><name>1</name><description></description><Point><coordinates>0.4067,51.3400</coordinates></Point></Placemark>

          <Placemark><Style><LabelStyle><scale>0.1</scale></LabelStyle><IconStyle><heading>360</heading><scale>0.875</scale><Icon><href>http://cefn.com/images/wind/n.gif</href></Icon></IconStyle></Style><name>2</name><description></description><Point><coordinates>0.4058,51.3400</coordinates></Point></Placemark>

          <Placemark><Style><LabelStyle><scale>0.1</scale></LabelStyle><IconStyle><heading>360</heading><scale>0.85</scale><Icon><href>http://cefn.com/images/wind/n.gif</href></Icon></IconStyle></Style><name>3</name><description></description><Point><coordinates>0.4308,51.3633</coordinates></Point></Placemark>

          /...and so on.../

          <Placemark><Style><LabelStyle><scale>0.1</scale></LabelStyle><IconStyle><heading>90</heading><scale>0.225</scale><Icon><href>http://cefn.com/images/wind/n.gif</href></Icon></IconStyle></Style><name>2996</name><description></description><Point><coordinates>1.4408,51.3850</coordinates></Point></Placemark>

          <Placemark><Style><LabelStyle><scale>0.1</scale></LabelStyle><IconStyle><heading>90</heading><scale>0.225</scale><Icon><href>http://cefn.com/images/wind/n.gif</href></Icon></IconStyle></Style><name>2997</name><description></description><Point><coordinates>1.7308,52.4433</coordinates></Point></Placemark>

          <Placemark><Style><LabelStyle><scale>0.1</scale></LabelStyle><IconStyle><heading>90</heading><scale>0.225</scale><Icon><href>http://cefn.com/images/wind/n.gif</href></Icon></IconStyle></Style><name>2998</name><description></description><Point><coordinates>1.4458,51.3733</coordinates></Point></Placemark>

          <Placemark><Style><LabelStyle><scale>0.1</scale></LabelStyle><IconStyle><heading>90</heading><scale>0.225</scale><Icon><href>http://cefn.com/images/wind/n.gif</href></Icon></IconStyle></Style><name>2999</name><description></description><Point><coordinates>1.4467,51.3733</coordinates></Point></Placemark>

        </Folder>

      </kml>

        

      

We're done! Try opening the North Easterly map in Google Earth to see the result. You can download the two awk source files in source.zip and the data file covering Ipswich from and can be downloaded at Z_37_2.ASC.zip

Now if only I could get hold of that 25 metre data set - anyone out there with an Athens login?