Thursday, January 5, 2012

5yr Abs Chg



The data compares absolute enrollment numbers from the 2005-2006 school year to 2010-2011.

This map was created by downloading PEIMS data from the TEA, joining it to a vector file of all the school districts, doing some simple calculations, and then styling the resulting data in TileMill. It's hosted using MapBox's basic hosting plan.
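For anyone who wants to script the join step instead of doing it in a desktop GIS, something like the following ogr2ogr call should work. The file and field names (districts.shp, peims.csv, DISTRICT_N, enroll_05, enroll_10) are made up for illustration, and the join key types have to match between the shapefile and the CSV:

# Join the PEIMS enrollment CSV onto the district polygons and write a new shapefile
ogr2ogr districts_enrollment.shp districts.shp -sql "SELECT districts.*, peims.enroll_05, peims.enroll_10 FROM districts LEFT JOIN 'peims.csv'.peims ON districts.DISTRICT_N = peims.DISTRICT_N"

The absolute-change field itself can then be calculated in QGIS's field calculator before styling in TileMill.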

Wednesday, November 16, 2011

Solar Power System

I spent the last two weeks near Bayfield, CO with my parents building a new solar power system.
Specs:
6x 230-watt Canadian Solar panels, configured as 2 strings of 3 in series at 90 VDC
4 kW Schneider Electric inverter/charger with an 8 kW 10-second surge capacity
60 A Schneider MPPT charge controller
12x 6-volt 370 AH deep-cycle batteries wired for 24 volts at 1110 AH

The build went smoothly and everything worked on the first flip of the switches. When I left at 10 AM, roughly 3 hours before solar noon, the output was 1180 watts: almost 30 volts at almost 40 amps. I'm not sure yet how high the output got by 1 PM, if the sky stayed clear.
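As a quick sanity check on those numbers, assuming the bank is wired as 3 parallel strings of 4 batteries in series:

# Back-of-the-envelope checks on the specs and the measured output
echo "Array nominal: $((6 * 230)) W"               # 1380 W
echo "Battery bank: $((4 * 6)) V, $((3 * 370)) AH" # 24 V, 1110 AH
echo "At 10 AM: roughly $((30 * 40)) W"            # ~1200 W, in line with the measured 1180 W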

Wednesday, October 19, 2011

Setting up a spatial postgresql database with postgis

My machine is running Ubuntu 10.04 x64. This assumes you already have PostgreSQL and PostGIS installed.
First create the PostgreSQL database:
createdb yourdatabase

Then add plpgsql support to that database:
createlang plpgsql yourdatabase

Then you need to import these two SQL files into that database to set up the PostGIS functions:
psql -d yourdatabase -f /usr/share/postgresql/8.4/contrib/postgis-1.5/postgis.sql
psql -d yourdatabase -f /usr/share/postgresql/8.4/contrib/postgis-1.5/spatial_ref_sys.sql
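To confirm the functions actually loaded, querying the PostGIS version should return a version string rather than an error:

psql -d yourdatabase -c "SELECT postgis_full_version();"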

That's it!
QGIS has a nice plugin for pushing shapefiles into PostGIS-enabled PostgreSQL databases called SPIT (Shapefile to PostgreSQL Import Tool).
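If you'd rather stay on the command line than use SPIT, the shp2pgsql utility that ships with PostGIS does the same job. The SRID, file, and table names below are just placeholders:

# Convert a shapefile to SQL and pipe it straight into the database (-I builds a spatial index)
shp2pgsql -s 4326 -I parcels.shp public.parcels | psql -d yourdatabase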

Friday, September 16, 2011

Change in unemployment rate, county by county, coupled with total economic recovery spending

Courtesy of Development Seed

SEXTANTE

"SEXTANTE is a spatial data analysis library written in Java. The main aim of SEXTANTE is to provide a platform for the easy implementation, deployment and usage of rich geoprocessing functionality. It currently contains more than three hundred algorithms for both raster and vector data processing, as well as tabular data analysis tools. SEXTANTE integrates seamlessly with many open source Java GIS (such as gvSIG, uDig or OpenJUMP) and non-GIS tools (such as the 52N WPS server or the spatial ETL Talend)."

Friday, September 9, 2011

Preparing 2010 census data

Preparing the 2010 census data found in the SF1 files. The Census Bureau's instructions are helpful, but the processes they describe for preparing the data for import into MS Access are extremely slow.

On page 5 of the instructions they tell you "All files with an .sf1 extension must be changed to .txt files. Right click on the first file with a .sf1 extension. Choose “Rename” and change the .sf1 portion of the name to .txt and hit Enter. Repeat for each file with a .sf1 extension"
This is incredibly slow. How about opening a CMD window and typing:
ren *.sf1 *.txt
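The Linux equivalent, since the rest of this post uses sed and awk anyway, is a short shell loop over the .sf1 files in the current directory:

for f in *.sf1; do mv "$f" "${f%.sf1}.txt"; done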


Next, on page 7, they tell you to use WordPad to find and replace text in several huge text files; this method turns out to be incredibly slow. How about doing it with the Linux sed command instead:

cat tx000062010.txt | sed -e 's/SF1ST,TX,000,06,//' > tx000062010mod.txt
This command finds the pattern between the first two forward slashes and replaces it with the pattern between the second and third forward slashes (in this case, nothing). It took about 4 seconds to process a 565 MB file on a quad-core AMD machine with 8 GB of memory; it was going to take hours with WordPad's find-and-replace tool. It turns out you don't need to cat the file and pipe it to sed, since sed takes the filename directly. A good friend with far more experience with Unix tools and programming than I says that awk is easier to use, and I have to agree.
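For what it's worth, the same prefix-stripping job in awk might look like this (the pattern is anchored to the start of the line, which is where the identifier columns sit):

# Strip the leading identifier columns, same as the sed command above
awk '{sub(/^SF1ST,TX,000,06,/, ""); print}' tx000062010.txt > tx000062010mod.txt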


Count the number of fields in a comma-delimited text file with awk.

gawk -F"," '{ print NF ":" $0}' textfile.csv
sample output: 260:SF1ST,TX,000,45,0000438,0,0,0,0,0,0,0,0,1,0,1,0,1.00,1.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,1.00,1.00,0.00,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,1 OR just awk -F"," '{print NF}' tx000452010.txt
sample output: 260 260 260 ....
This text file has 260 fields per line, and I want to extract the first 239 fields:
cut -d ',' -f1-239 tx000452010.txt > tx000452010part1.txt
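A quick way to confirm the cut worked is to re-count the fields in the new file; it should print a single value, 239:

awk -F"," '{print NF}' tx000452010part1.txt | sort -u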

Excluding a field range and writing to a new file
gawk -F"," -v f=6 -v t=239 '{ for (i=1; i<=260;i++) if( i>=f && i<=t) continue; else printf("%s%s", $i,(i!=260) ? OFS : ORS) }' tx000452010.txt > tx000452010part2.txt
Replace the default separator of space with a comma
awk '{gsub(/ /,",");print}' tx000452010part2.txt > tx000452010part_2.txt

This could be done in a single command if I knew how.
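One way it could probably be collapsed into a single pass is by setting awk's output separator to a comma up front and building each output line in a loop; this is an untested sketch, not the command I actually ran:

# Keep fields 1-5 and 240-260, already comma-separated, in one pass
gawk -F"," -v OFS="," -v f=6 -v t=239 '{ out=""; for (i=1; i<=NF; i++) if (i<f || i>t) out = (out=="" ? $i : out OFS $i); print out }' tx000452010.txt > tx000452010part2.txt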
Import the text files into a PostgreSQL database, because MS Access has a 2 GB file size limit and with all of this data in one database you're easily looking at a 10+ GB database.
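Loading one of the prepared files could look something like this, assuming a table with matching columns (here called "SF1_00001") has already been created:

# \copy runs client-side, so the path is relative to wherever psql is launched
psql -d yourdatabase -c "\copy \"SF1_00001\" from 'tx000452010part1.txt' with csv"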
Add a new field for building the geoid:
ALTER TABLE "SF1_Access2003_mdb"."SF1_00001" ADD COLUMN geoid text;
Concatenate the fields to build the geoid for the block summary level. Hint: if you take the left 12 characters of this result you get the geoid for the block-group level, and so on.
UPDATE "SF1_00002" SET geoid = "Txgeo2010"."STATE" || "Txgeo2010"."COUNTY" || "Txgeo2010"."TRACT" || "Txgeo2010"."BLOCK" FROM "Txgeo2010" WHERE ("SF1_00002"."LOGRECNO" = "Txgeo2010"."LOGRECNO");

...to be continued

Wednesday, June 1, 2011

Generalizing parcel data

Raw data
-> google-refine for clustering and filtering
-> join the refined data to the spatial data
-> dissolve the parcels using the refined subdivision attribute
-> buffer to fill in the ROWs
-> copy the buffer to the original dissolve layer
-> dissolve again on the subdivision attribute
-> set up a "must not have gaps" topology rule with a tolerance of ~1 foot
-> validate the topology (removes donut-hole slivers)
-> buffer by a negative value equal to the original buffer to remove the overlap between adjacent subdivision polygons
-> minor edits to clean up the data
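The dissolve/buffer/negative-buffer portion of that workflow could also be sketched in PostGIS if the parcels were loaded into a database like the one from the earlier post. The table, column, and buffer distance below are hypothetical, and this skips the topology validation step entirely:

-- Dissolve parcels by subdivision, buffer outward to swallow the rights-of-way,
-- then buffer back in by the same distance to undo the growth along the outer edges
CREATE TABLE subdivisions AS
SELECT subdivision,
       ST_Buffer(ST_Buffer(ST_Union(geom), 15), -15) AS geom
FROM parcels
GROUP BY subdivision;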