Python, the State Geocoder and Outputting Assembly Districts
This Python code takes a list of addresses, runs them through the NYS Address Management system geocoder, uses GeoPandas to look up the municipality and Assembly district from GeoPackages, then writes the results out to a CSV file. Nothing fancy, but it does the job. I could use it with any other shapefile I want. GeoPandas is nice because it’s fast and doesn’t require loading QGIS.
#!/usr/bin/python
import requests,sys,json,os,csv
import pandas as pd
import geopandas as gpd
lines=[]
# read list of addresses
with open(sys.argv[-1], newline='') as csvfile:
    for line in csv.DictReader(csvfile):
        lines.append(line)
# build address query; json.dumps handles quoting and avoids a trailing comma
records = [{"attributes": {"OBJECTID": i, "SINGLELINE": line['Address'].rstrip()}}
           for i, line in enumerate(lines)]
query = json.dumps({"records": records})
post = { 'f':'pjson', 'outSR': 4326, 'addresses': query }
url = 'https://gisservices.its.ny.gov/arcgis/rest/services/Locators/Street_and_Address_Composite/GeocodeServer/geocodeAddresses'
# send request to state geocoder
req = requests.post(url, data = post)
locations = json.loads(req.text)['locations']
# parse response
for loc in locations:
    i = loc['attributes']['ResultID']
    lines[i]['y'] = loc['location']['y']
    lines[i]['x'] = loc['location']['x']
    lines[i]['Match_addr'] = loc['attributes']['Match_addr']
    # hackish, might cause problems but keeps joins from erroring
    if lines[i]['x'] == 'NaN':
        lines[i]['x'] = 0
        lines[i]['y'] = 0
# convert to pandas
locPd = pd.DataFrame(lines,columns=lines[0].keys())
locPd = gpd.GeoDataFrame(locPd, geometry=gpd.points_from_xy(locPd.x.astype('float32'), locPd.y.astype('float32')))
# add county municipality column
# note: GeoPandas 0.10 and later take predicate=; older releases called it op=
cosub = gpd.read_file(r'/home/andy/Documents/GIS.Data/geocode/cosub.gpkg')
locPd = gpd.sjoin(locPd, cosub, predicate="within")
del locPd['index_right']
# add Assembly district (ad) column
ad = gpd.read_file(r'/home/andy/Documents/GIS.Data/geocode/ad.gpkg')
locPd = gpd.sjoin(locPd, ad, predicate="within")
del locPd['index_right']
# add sd column
sd = gpd.read_file(r'/home/andy/Documents/GIS.Data/geocode/sd.gpkg')
locPd = gpd.sjoin(locPd, sd, predicate="within")
del locPd['index_right']
# add cd column
cd = gpd.read_file(r'/home/andy/Documents/GIS.Data/geocode/cd.gpkg')
locPd = gpd.sjoin(locPd, cd, predicate="within")
# remove added geometry and index columns
del locPd['geometry']
del locPd['index_right']
# write pandas back to out csv
locPd.to_csv(os.path.splitext(sys.argv[-1])[0] + '-output.csv', index=False, header=True)
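The script above sends every address in one request. For longer lists it may be worth splitting the work into batches. Here is a minimal sketch of that idea, not something the original script does: the function name `build_geocode_batches` and the `batch_size` of 500 are my own assumptions (check the geocoder's own limit), and the sample addresses are placeholders. The `f`, `outSR` and `addresses` fields are the same ones the script posts.

```python
import json


def build_geocode_batches(addresses, batch_size=500):
    """Split a list of address strings into form payloads for geocodeAddresses.

    batch_size is an assumed value, not a documented limit of the NYS service.
    """
    payloads = []
    for start in range(0, len(addresses), batch_size):
        chunk = addresses[start:start + batch_size]
        records = [
            # OBJECTID is the position in the full list, so the ResultID in
            # each response can be used to index back into that list
            {"attributes": {"OBJECTID": start + offset, "SINGLELINE": addr.strip()}}
            for offset, addr in enumerate(chunk)
        ]
        payloads.append({
            "f": "pjson",
            "outSR": 4326,
            "addresses": json.dumps({"records": records}),
        })
    return payloads


if __name__ == "__main__":
    # placeholder addresses, only to show the shape of the output
    sample = ["1 Main St, Albany, NY", "2 State St, Albany, NY", "3 Broadway, Troy, NY"]
    for payload in build_geocode_batches(sample, batch_size=2):
        print(payload["addresses"])
```

Each payload could then be passed to `requests.post(url, data=payload)` in turn, with the responses parsed the same way as in the script.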
Sometimes I wish I had studied cartography and remote sensing in college
Truth be told, it wasn’t really a serious option some 20 years ago. While a handful of colleges offered classes in GIS and remote sensing, the field was very much in its infancy …
Back then computers had very limited memory, hard drive space and processing power (hard drives were rated in gigabytes and memory in hundreds of megabytes), and there were no ultra-fast solid-state drives
Large GIS files would have been impossible to download over a dial-up internet connection
The quality of GIS data back in 2000 was rather poor, as sensors were crude, GPS recorders were a new and expensive technology, and the Census TIGER/Line files were rather geographically inaccurate
In 2000, a lot of the data that is now freely available on the Internet could only be purchased, and it was expensive and came on multiple CD-ROMs
While the rather cryptic but powerful GRASS existed back then, there was no Quantum GIS, which has become a very powerful open-source GIS program in the past twenty years
The disadvantage of not having formal training is that one doesn’t see things in an ordered, formal way. I learned how to make maps and use GIS data in a hands-on way, and only learned the minimal terminology and methods required to get the desired output. I understand the doings of map making, but not so much the theory. I have been reading open college textbooks to pick up some more of the formal theory behind maps, but a lot of my methods are crude and just based on experience of what works.
But I am kind of glad that I don’t do map making professionally. It really frees me up to do my own thing, on my own time, not having to worry about conflicts with work. If I do a map or GIS research for an activist group, nobody can say I’m utilizing work equipment, skills or data I acquired from my job. Instead, I am doing it totally on my own based on my own studies and knowledge.
I’ve looked a bit at college classes nowadays, but they’re rather expensive, and I’m not sure how much they would benefit me. I guess it would be something to put on a resume, but I am generally happy with my current career, and the cost is high for getting a certification in something I am fairly familiar with, at least from a practical perspective. Plus, I often think any classroom learning would be out of date compared to what I am using nowadays with Quantum GIS and other open source tools and data.
Programs that use GDAL as a back-end should be able to open .e00 files natively: for example QGIS, GRASS, SAGA, ogr2ogr, Python’s GeoPandas or R Statistical Programming’s sf package. This is a default built-in driver in GDAL.
If you have GDAL installed (which you may already have if you have any open source GIS software installed), you can also use ogr2ogr to do the conversion. Due to the limitations of Shapefiles, it is better to convert to GeoPackage, as Arc/Info E00 files can have field names longer than the 10 characters that Shapefiles allow.
ogr2ogr output.gpkg input.e00 # note output comes before input
In the past, GDAL didn’t have native E00 support but all recent versions do.
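If there is a whole folder of .e00 files to convert, the ogr2ogr call can be wrapped in a few lines of Python. This is only a sketch under a couple of assumptions: the helper names `e00_to_gpkg_command` and `convert_folder` are made up for illustration, each output is written next to its input with a .gpkg extension, and `-f GPKG` is passed to select the GeoPackage driver explicitly.

```python
import shutil
import subprocess
from pathlib import Path


def e00_to_gpkg_command(src):
    """Return the ogr2ogr argument list that converts one .e00 file to a GeoPackage."""
    src = Path(src)
    dst = src.with_suffix(".gpkg")
    # ogr2ogr takes the destination first, then the source
    return ["ogr2ogr", "-f", "GPKG", str(dst), str(src)]


def convert_folder(folder):
    """Convert every .e00 file in folder; does nothing if ogr2ogr is not on PATH."""
    if shutil.which("ogr2ogr") is None:
        print("ogr2ogr not found on PATH")
        return
    for src in sorted(Path(folder).glob("*.e00")):
        # check=True raises CalledProcessError if a conversion fails
        subprocess.run(e00_to_gpkg_command(src), check=True)
```

Calling `convert_folder(".")` would convert everything in the current directory. Building the command as a list rather than a single string avoids trouble with spaces in file names.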