Wednesday, May 18, 2011

Compiling gdal with ECW and MrSID support

I wanted to generate an MBTiles database of some imagery using raster2mb, a Python script based on gdal2tiles, so I could use it as a base layer in the MapBox iPad app. The catch: the imagery was in ECW format.

I had gdal (Geospatial Data Abstraction Layer) installed, but the binary available from the repositories does not have ECW support built in.  So, svn checkout the latest stable trunk of gdal and find libecwj2_3.3-1_i386.deb.  That is not trivial: ERMapper doesn't support Linux in their new SDK builds [4.2] and they don't host version 3.3 anymore.  I eventually found someone hosting it on mediafire; remember, Google is your friend.

Tips and hints for building gdal from source can be found here.  Find yourself the aforementioned ECW SDK library.  It is read-only: the code is proprietary, and to write this file type you have to pay for a license, but reading ECW files and converting them to a friendlier format can be done with the read-only SDK.

While you're at it, you might as well download the MrSID SDK (free registration required) and add it to your ./configure arguments, because lots of data is freely available in this format and it'd be nice to manipulate and convert it with gdal.

Good luck. :)

gdalwarp -t_srs EPSG:900913 -of GTiff dallas.ecw dallas.tif
Creating output file that is 43163P x 50580L.
Processing input file dallas.ecw

Tuesday, May 10, 2011

Comparing two point datasets for error checking

Let's say you have been given a set of points that represent addresses of snake farms [insert object here].  You've also been given the raw address data, and you want to check the quality of the geocoded point file by geocoding the addresses yourself using the best road data you can find.  Then you might want to know the distances between all the "identical" points to check your data for errors.

As an exercise I generated 2 random sets of points (100 points each) and arbitrarily joined them based on an ID field of 0-99.  Here is one way to figure out the distance between points that should be the same in each file (although in this case NONE will be the same, because both were supposed to be random point sets).  Generate an X and Y field for each point in each dataset (x1,y1 and x2,y2).  Then join the two datasets, create a new field called perhaps "distance", and in the field calculator in ArcGIS use sqr((x1-x2)^2 + (y1-y2)^2).  Your results will be in whatever unit your coordinate system measures in.  I used NAD83 State Plane Texas North Central for the example.
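
If you'd rather do the same calculation outside of ArcGIS, here is a minimal Python sketch of the idea.  The function name paired_distances and the sample coordinates are made up for illustration; it assumes both point sets are already in the same projected coordinate system and keyed by the shared ID field.

```python
import math


def paired_distances(first, second):
    """Return {id: distance} for every ID present in both point sets.

    first and second map an ID to an (x, y) tuple. Both are assumed to be
    in the same projected coordinate system, so the distance comes out in
    that system's linear unit.
    """
    shared_ids = first.keys() & second.keys()
    return {
        point_id: math.hypot(
            first[point_id][0] - second[point_id][0],
            first[point_id][1] - second[point_id][1],
        )
        for point_id in shared_ids
    }


# Made-up coordinates, for illustration only.
given = {0: (2400000.0, 7000000.0), 1: (2400300.0, 7000400.0)}
mine = {0: (2400000.0, 7000000.0), 1: (2400000.0, 7000000.0)}
distances = paired_distances(given, mine)
```

Points that geocode to the same spot get a distance of 0, and anything large is worth a second look.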


Another method uses the free ET GeoWizards tool "point to polyline".  To do this, copy both sets of points into a 3rd shapefile and run the tool on it.  This method is nice because you get a line connecting the points with identical addresses, and you can color ramp the lines to look for the ones with large differences in distance.
2 point datasets with lines connecting related points
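
The connecting-line idea can also be roughed out in plain Python.  This is only a sketch and not what ET GeoWizards does internally: the function name connecting_lines and the sample points are hypothetical, and it writes each line as WKT text that you could load into a GIS and style by length.

```python
import math


def connecting_lines(first, second):
    """Return (id, length, wkt) tuples for shared IDs, longest line first.

    first and second map an ID to an (x, y) tuple in the same projected
    coordinate system.
    """
    records = []
    for point_id in sorted(first.keys() & second.keys()):
        (x1, y1), (x2, y2) = first[point_id], second[point_id]
        length = math.hypot(x1 - x2, y1 - y2)
        wkt = f"LINESTRING ({x1} {y1}, {x2} {y2})"
        records.append((point_id, length, wkt))
    # Longest lines first, so the suspicious pairs are at the top.
    records.sort(key=lambda record: record[1], reverse=True)
    return records


# Made-up points, for illustration only.
set_a = {0: (0.0, 0.0), 1: (10.0, 10.0)}
set_b = {0: (3.0, 4.0), 1: (10.0, 10.0)}
lines = connecting_lines(set_a, set_b)
```

Sorting by length puts the pairs with the biggest disagreement at the top of the list, which is the same thing the color ramp shows you visually.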