How to create a spreadsheet converting or crosswalking old Election Districts to new Election districts
It turns out with Python and GeoPANDAS available on pip it’s pretty easy to create a Microsoft Excel-style Spreadsheet with new election districts with the percentage of the old election districts within them. This code can also be used to calculate the percentage of a county in a Senate district. Obviously, you will need to update all of the paths and field names for your Shapefiles.
# requires the geopandas and pandas python libraries, available from pip
import pandas as pd
import geopandas as gpd
# Load Old EDs etc Shapefile
ac11 = gpd.read_file(r'/home/andy/Documents/GIS.Data/election.districts/2011albanyEDs.shp').to_crs(epsg='3857')
# Load the New EDs etc Shapefile
ac20 = gpd.read_file(r'/home/andy/Albany County October 2020 ED.gpkg').to_crs(epsg='3857')
# Save the area of new EDs
ac20['ed20_area']=ac20.area
# Create a GIS intersection of old and new EDs
edover=gpd.overlay(ac11, ac20, how='intersection')
# Calculate percentage of the New ED covered by the Old ED
edover['Percent of 2020 ED Area Covered by 2011 ED']=edover.area/edover['ed20_area']*100
# Rename columns to something more sensible
edover=edover.rename({'NAME_1':'2011 ED','NAME_2':'2020 ED'},axis=1)
# Exclude Portions of Old EDs that are less then 1% of New EDs
# Sort and Group by New ED and then Percentage of the Old ED
edover[(edover['Percent of 2020 ED Area Covered by 2011 ED']>1)][['2011 ED','2020 ED','Percent of 2020 ED Area Covered by 2011 ED']].sort_values(by=['2020 ED','Percent of 2020 ED Area Covered by 2011 ED']).groupby(by=['2020 ED','2011 ED']).median().to_excel('/tmp/New to Old Election District Crosswalk.xlsx')
Here is a sample of the Microsoft Excel-style spreadsheet that will be exported.
I rewrote the short little geocoder that I posted last night in Python, using the multiple address function in NY’s Street and Address Maintenance (SAM) Program. This should be able to encode up to 1,000 lines in a text file. You would use this script by typing python sam-geocode.csv with one address per line.
By running python sam-geocode.py address.txt, with a file like this:
5 Winding Brook Drive Guilderland NY 12084
1224 Union Street Schenectady NY 12308
6239 Randomwood Drive Schenectady NY 12030
1529 Western Avenue Suite 102 Albany NY 12203
4017b State Street Schenectady NY 12304
4017b State Street Schenectady NY 12304
1529 Western Avenue Suite 102 Albany NY 12203
1529 Western Avenue Suite 102 Albany NY 12203
2842 W. Lydius Street Schenectady NY 12306
200 Bloomingdale Road Altamont NY 12009
Gives you out a CSV file named address-geocoded.csv.
"1224 Union St, Schenectady, NY, 12308",42.81103508600006,-73.92220575399995
"6239 Randomwood Dr, Schenectady, NY, 12303",42.74713565800005,-73.93299995099994
"1529 Western Ave, Albany, NY, 12203",42.68293244500006,-73.84222237599994
"4017 B State St, Schenectady, NY, 12304",42.76677680800003,-73.88794539599996
"4017 B State St, Schenectady, NY, 12304",42.76677680800003,-73.88794539599996
"1529 Western Ave, Albany, NY, 12203",42.68293244500006,-73.84222237599994
"1529 Western Ave, Albany, NY, 12203",42.68293244500006,-73.84222237599994
"2842 W Lydius St, Schenectady, NY, 12306",42.75003186400005,-73.95573429299998
"23 Dresden Ct, Albany, NY, 12203",42.69198706100008,-73.87999566599996
"629 Salvia Ln, Schenectady, NY, 12303",42.73869920900006,-73.93268336399996
"5 Winding Brook Dr, Guilderland, NY, 12084",42.69401880400005,-73.90855096599995
"3 Feeney Ln, Schenectady, NY, 12303",42.72653497400006,-73.90924777399994
"730 Sachem Cir, Slingerlands, NY, 12159",42.67422884300004,-73.88860130999996
"200 Bloomingdale Ln, Altamont, NY, 12009",42.69634668300006,-73.94903687499993
Now, I could improve error capture on this script but for my purposes this is very simple and works wells for well-formed addresses. Maybe not as simple as the PHP script I posted last night, but this way of Geocoding is much faster and written in a language that’s more appropriate for such purposes.
#!/usr/bin/python
import requests,sys,json,csv,os
query = '{"records": ['
f = open(sys.argv[-1], 'r+')
with open(sys.argv[-1]) as f:
i = 0
for line in f:
query += '{ "attributes": { "OBJECTID":'+str(i)+', "SINGLELINE": "'+line.rstrip()+'"} },'+"n"
i+=1
query += ']}'
post = { 'f':'pjson', 'outSR': 4326, 'addresses': query }
url = 'https://gisservices.its.ny.gov/arcgis/rest/services/Locators/Street_and_Address_Composite/GeocodeServer/geocodeAddresses'
req = requests.post(url, data = post)
addresses = json.loads(req.text)['locations']
csv = open(os.path.splitext(sys.argv[-1])[0]+'-geocode.csv', "w")
for place in addresses:
csv.write('"'+place['address'].replace('"', '\"')+'",'+str(place['location']['y'])+','+str(place['location']['x'])+"n")
csv.close()
This also works for most other states, as most states have geocoding endpoints, as otherwise the fire department wouldn’t necessarily be able to find your house when it’s on fire when you call 911. To find your state, try searching for StateName "GeocodeServer/geocodeAddresses". For example, replace URL with:
Teaching myself about the Geopandas library. Not because I’m interested in the plotting capacities but it sure would be nice to have a fast “geocoder” that could join point data to other shapefiles I have like counties, municipalities, Assembly districts, etc. without using QGIS.
For example, I can feed the state’s address management geocoder up to 1,000 addresses and it will spit back a list of coordinates. In theory quite easily I could take those coordinates, spatially join them with Geopandas and add a column with the Assembly district and municipality – all without loading the shapefile or plotting a map or doing scripting within QGIS.
I really should try to get and save as many of the election district maps from county rest servers as possible before the year is up because with redistricting they will be changing the lines and for certain projects it sure would be nice to have the old ones to compare to new ones, especially in an automated fashion by using join by location in QGIS.
Been doing a lot of reading of some of the ERSI ArcGIS tutorials not because I’m interested in learning more about the commercial software but more seeking out more skills and terminology I can use to do neat new projects with QGIS.