Population Maths
2020 Population Maths !
These scripts require Python, PANDAS and GeoPandas. You will also need the PL 94-171 redistricting files, specifically the 2020 TIGER/Line Shapefiles and nygeo2020.pl, which comes in a zip file. The nygeo2020.pl file contains the population, households, and area from the 2020 census, among other things, for all census summary levels. It’s really handy to have.
This document is very helpful in understanding the Census files when you load them into PANDAS: 2020 Census State (P.L. 94-171) Redistricting Summary File Technical Documentation.
For all of these scripts, you will need to adjust the path variables to point to where the files are saved on your computer. The overlay shapefile can be anything, but you will need to update catField to match the field in the shapefile that you want to calculate the population for.
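To make the column positions used by the scripts concrete, here is a minimal sketch that parses a pipe-delimited file with only the standard library. The sample rows and the make_row helper are made up for illustration; the only things taken from the scripts are the positions (column 2 for summary level, column 9 for the GEOID, column 90 for population). GEOIDs are kept as text so that leading digits and long block IDs survive intact.

```python
import csv
import io

# Zero-based column positions, as used by the scripts in this post.
SUMLEV_COL, GEOID_COL, POP_COL = 2, 9, 90


def make_row(sumlev, geoid, pop, width=91):
    """Build a fake pipe-delimited geoheader line (illustration only)."""
    fields = [""] * width
    fields[SUMLEV_COL] = sumlev
    fields[GEOID_COL] = geoid
    fields[POP_COL] = str(pop)
    return "|".join(fields)


# Made-up records: one block group (150) and two blocks (750).
sample = "\n".join([
    make_row("150", "360010001001", 1200),
    make_row("750", "360010001001000", 45),
    make_row("750", "360010001001001", 0),
])


def population_by_geoid(handle, summary_level):
    """Map GEOID -> population for one summary level, keeping GEOIDs as text."""
    reader = csv.reader(handle, delimiter="|")
    return {
        row[GEOID_COL]: int(row[POP_COL])
        for row in reader
        if row[SUMLEV_COL] == summary_level
    }


blocks = population_by_geoid(io.StringIO(sample), "750")
print(blocks)
```

With the real file you would pass an open file handle instead of the io.StringIO sample. The scripts do the same filtering with PANDAS, where the summary level is read as a number rather than as text.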
Population of an Area
The code below calculates the population within each feature of an overlay layer, in this case an overlay shapefile with a series of rings extending out from the NYS Capitol. Because this covers a large area, we use block group sums, and then take the cumulative sum across the rings.
import pandas as pd
import geopandas as gpd
# path to overlay shapefile
overlayshp = r'/tmp/dis_to_albany.gpkg'
# summary level -- 750 is tabulation block, 150 is blockgroup
# large areas over about 50 miles much faster to use bg
summaryLevel = 150
#summaryLevel = 750
# path to block or blockgroup file
if summaryLevel == 150:
    blockshp = r'/home/andy/Documents/GIS.Data/census.tiger/36_New_York/tl_2020_36_bg20.shp.gpkg'
else:
    blockshp = r'/home/andy/Documents/GIS.Data/census.tiger/36_New_York/tl_2020_36_tabblock20.shp.gpkg'
# path to PL 94-171 redistricting geoheader file
pl94171File = '/home/andy/Desktop/nygeo2020.pl'
# field to categorize on (such as Ward -- required!)
catField = 'Name'
# geo header contains 2020 census population in column 90
# per PL 94-171 documentation, low memory chunking disabled
# as it causes issues with the geoid column being mixed types
df=pd.read_csv(pl94171File,delimiter='|',header=None, low_memory=False )
# column 2 is summary level
population=df[(df.iloc[:,2] == summaryLevel)][[9,90]]
# load overlay
overlay = gpd.read_file(overlayshp).to_crs(epsg='3857')
# shapefile of nys 2020 blocks, IMPORTANT (!) mask by output file for speed
blocks = gpd.read_file(blockshp,mask=overlay).to_crs(epsg='3857')
# geoid for linking to shapefile is column 9
joinedBlocks=blocks.set_index('GEOID20').join(population.set_index(9))
# store the size of unbroken blocks
# in case overlay lines break blocks into two
joinedBlocks['area']=joinedBlocks.area
# run union
unionBlocks=gpd.overlay(overlay, joinedBlocks, how='union')
# drop blocks outside of overlay
unionBlocks=unionBlocks.dropna(subset=[catField])
# estimate the population of each piece when a block crosses
# an overlay line -- avoids double counting -- this isn't perfect
# as we lose about 0.15 percent due to floating point errors
unionBlocks['sublock']=unionBlocks[90]*(unionBlocks.area/unionBlocks['area'])
# sum blocks in category (select the column first so the
# geometry column is never summed)
unionBlocks=pd.DataFrame(unionBlocks.groupby(catField)['sublock'].sum())
# rename columns
unionBlocks=unionBlocks.rename({'sublock': '2020 Census Population'},axis=1)
# calculate cumulative sum as you go out each ring
unionBlocks['millions']=unionBlocks.cumsum(axis=0)['2020 Census Population']/1000000
# each ring is 50 miles
unionBlocks['miles']=unionBlocks.index*50
# output
unionBlocks
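The area weighting and the running total in the script above can be checked by hand with a small sketch. All of the numbers here are hypothetical, and the apportion helper is just an illustrative name; it assumes, as the script does, that people are spread evenly across a block.

```python
from itertools import accumulate


def apportion(population, whole_area, piece_areas):
    """Split a block's population across its pieces in proportion to area."""
    return {
        name: population * area / whole_area
        for name, area in piece_areas.items()
    }


# A made-up block of 120 people and 4.0 area units that a ring boundary
# cuts into a 1.0-unit piece in ring 1 and a 3.0-unit piece in ring 2.
shares = apportion(120, 4.0, {1: 1.0, 2: 3.0})

# Made-up totals from blocks that lie wholly inside each ring.
whole_block_totals = {1: 1000.0, 2: 2500.0}
ring_totals = {
    ring: whole_block_totals[ring] + shares[ring] for ring in sorted(shares)
}

# Running total moving outward ring by ring, like cumsum() in the script.
cumulative = dict(zip(ring_totals, accumulate(ring_totals.values())))

print(shares)
print(cumulative)
```

Because the shares of a split block always add back up to the block's full population, no one is counted twice when a ring line cuts through a block.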
Redistricting / Discrepancy from Ideal Districts
This is a variant of the above script, calculating the deviation in population from an ideal district. As this covers a small area, we use data from the block level. See below and the comments.
import pandas as pd
import geopandas as gpd
# path to overlay shapefile
overlayshp = r'/home/andy/Documents/GIS.Data/election.districts/albany wards 2015.gpkg'
# summary level -- 750 is tabulation block, 150 is blockgroup
# large areas over about 50 miles much faster to use bg
#summaryLevel = 150
summaryLevel = 750
# path to block or blockgroup file
if summaryLevel == 150:
    blockshp = r'/home/andy/Documents/GIS.Data/census.tiger/36_New_York/tl_2020_36_bg20.shp.gpkg'
else:
    blockshp = r'/home/andy/Documents/GIS.Data/census.tiger/36_New_York/tl_2020_36_tabblock20.shp.gpkg'
# path to PL 94-171 redistricting geoheader file
pl94171File = '/home/andy/Desktop/nygeo2020.pl'
# field to categorize on (such as Ward -- required!)
catField = 'Ward'
# geo header contains 2020 census population in column 90
# per PL 94-171 documentation, low memory chunking disabled
# as it causes issues with the geoid column being mixed types
df=pd.read_csv(pl94171File,delimiter='|',header=None, low_memory=False )
# column 2 is summary level
population=df[(df.iloc[:,2] == summaryLevel)][[9,90]]
# load overlay
overlay = gpd.read_file(overlayshp).to_crs(epsg='3857')
# shapefile of nys 2020 blocks, IMPORTANT (!) mask by output file for speed
blocks = gpd.read_file(blockshp,mask=overlay).to_crs(epsg='3857')
# geoid for linking to shapefile is column 9
joinedBlocks=blocks.set_index('GEOID20').join(population.set_index(9))
# store the size of unbroken blocks
# in case overlay lines break blocks into two
joinedBlocks['area']=joinedBlocks.area
# run union
unionBlocks=gpd.overlay(overlay, joinedBlocks, how='union')
# drop blocks outside of overlay
unionBlocks=unionBlocks.dropna(subset=[catField])
# estimate the population of each piece when a block crosses
# an overlay line -- avoids double counting -- this isn't perfect
# as we lose about 0.15 percent due to floating point errors
unionBlocks['sublock']=unionBlocks[90]*(unionBlocks.area/unionBlocks['area'])
# sum blocks in category (select the column first so the
# geometry column is never summed)
unionBlocks=pd.DataFrame(unionBlocks.groupby(catField)['sublock'].sum())
# rename columns
unionBlocks=unionBlocks.rename({'sublock': '2020 Census Population'},axis=1)
# calculate ideal ward based on 15 districts, 2020 albany population 99,224
unionBlocks['Ideal']=99224/15
# calculate departure from ideal
unionBlocks['Departure']=unionBlocks['2020 Census Population']-unionBlocks['Ideal']
# calculate percent departure, relative to the ideal district size
unionBlocks['Percent Departure']=unionBlocks['Departure']/unionBlocks['Ideal']*100
# output
unionBlocks
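As a hand check of the arithmetic, here is a minimal sketch of the ideal-district calculation for a single ward. It assumes the percent departure is measured against the ideal district size, which is the usual convention for redistricting deviation. The ward population of 7,000 is made up; the total population and ward count are the ones used in the script.

```python
# 2020 Albany population and number of wards, as used in the script above.
TOTAL_POPULATION = 99224
WARD_COUNT = 15


def deviation(ward_population, total=TOTAL_POPULATION, wards=WARD_COUNT):
    """Return (ideal, departure, percent departure relative to the ideal)."""
    ideal = total / wards
    departure = ward_population - ideal
    return ideal, departure, departure / ideal * 100


# Hypothetical ward population, not a real census figure.
ideal, departure, percent = deviation(7000)
print(f"ideal {ideal:.1f}, departure {departure:+.1f}, {percent:+.2f}%")
```

A positive departure means the ward is over the ideal size, and a negative one means it is under.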
Catskill Park Hamlets
There are some odd and colorful hamlet and village names within the Catskill Park. Warmer colors show higher elevation while cooler colors are lower elevation hamlets.
Data Sources: Open Street Map, US Census, NYS DEC Catskill Park Boundaries.
Cayuta Route 13
Deep valley outside of Cayuta bordered by large farm fields.
Taken on Thursday August 20, 2020 at Cayuta.
Under the Railroad Bridge
Weather Update – August 20, 2021
It looks like the clouds will stick around for the weekend.
Just look how muggy things will be for most of next week. Yucky! I am so tired of the heat and humidity, although I guess it’s not going to last all that many more weeks as autumn is coming and there is a big cool down predicted in the 8-14 day forecast.
I was so set on going out to Schoharie this weekend, but I’m really not sure if that is going to remain the plan. Quite honestly, the latest forecast kind of sucks, although not a washout. I guess Thacher Park is always an option — and hold off for next week. I will have the hotspot plan back next week, so I certainly could work remote next Friday from Schoharie, and then maybe take off next Monday. It’s a possibility.
Today. Feels like … August 28th.
A chance of rain before 10am, then isolated showers and thunderstorms after 1pm. Mostly cloudy.
Northwest wind around 7 mph. Chance of precipitation is 30%. New rainfall amounts of less than a tenth of an inch, except higher amounts possible in thunderstorms.
80 degrees | 71 max dew point | 7:49 sunset

Tonight. Muggy!
Mostly cloudy.
Light northwest wind.
67 degrees | 71 max dew point | 6:09 sunrise

Saturday. Feels like … August 14th.
Isolated showers between 9am and noon, then scattered showers and thunderstorms after noon. Mostly cloudy.
Light and variable wind becoming southeast 5 to 7 mph in the morning. Chance of precipitation is 40%. New rainfall amounts of less than a tenth of an inch, except higher amounts possible in thunderstorms.
82 degrees | 72 max dew point | 7:47 sunset

Saturday Night. Muggy!
A chance of showers. Mostly cloudy.
Southeast wind around 5 mph becoming light and variable after midnight. Chance of precipitation is 30%. New precipitation amounts of less than a tenth of an inch possible.
68 degrees | 72 max dew point | 6:10 sunrise

Sunday. Feels like … August 3rd.
A chance of showers before 2pm, then a chance of rain and thunderstorms after 2pm. Mostly cloudy.
North wind 5 to 9 mph. Chance of precipitation is 40%. New rainfall amounts between a tenth and quarter of an inch, except higher amounts possible in thunderstorms.
83 degrees | 71 max dew point | 7:46 sunset

Sunday Night. Muggy!
A chance of rain and thunderstorms. Mostly cloudy.
Chance of precipitation is 30%. New precipitation amounts between a tenth and quarter of an inch, except higher amounts possible in thunderstorms.
67 degrees | 70 max dew point | 6:11 sunrise

Monday. Feels like … August 28th.
Showers likely and possibly a thunderstorm, mainly after 2pm. Mostly cloudy.
Chance of precipitation is 60%.
80 degrees | 70 max dew point | 7:44 sunset

Monday Night. Muggy!
Showers likely and possibly a thunderstorm before 8pm. Mostly cloudy.
Chance of precipitation is 60%.
66 degrees | 69 max dew point | 6:12 sunrise

Tuesday. Muggy!
Sunny.
85 degrees | 69 max dew point | 7:43 sunset

Tuesday Night. Muggy!
Mostly clear.
65 degrees | 68 max dew point | 6:13 sunrise

Wednesday. Muggy!
A chance of showers and thunderstorms. Mostly sunny.
Chance of precipitation is 30%.
86 degrees | 69 max dew point | 7:41 sunset

Wednesday Night. Muggy!
A chance of showers and thunderstorms. Partly cloudy.
Chance of precipitation is 30%.
66 degrees | 69 max dew point | 6:14 sunrise

Thursday. Muggy!
A chance of showers and thunderstorms. Mostly sunny.
Chance of precipitation is 40%.
85 degrees | 70 max dew point | 7:39 sunset
Good Morning – August 20, 2021
Good morning! Happy Friday.
I guess it’s good it’s a Friday, but I have a busy day on tap. It’s still raining out this morning! Which kind of sucks, because I’m up plenty early today.
Cloudy, rainy, dark and 69 degrees in Delmar, NY. There is a northwest breeze at 7 mph. The dew point is 68 degrees. The skies will clear Tuesday around 2 am.
Yesterday it rained and rained some more. I was going to go down to the library yesterday, but it was so wet into the evening, and I just wasn’t feeling real ambitious. And it was still raining this morning. It really is the story of the whole summer. Rain.
I finally got around to downloading the PL94-171 redistricting data to my computer and taking a look at it. I probably should have done that a while back, but I was less interested in that than in other American Community Survey data. The nice thing about the PL94-171 data is that it includes all publicly released data from the full-count census, which is nice to have locally. There are a lot of columns and rows, but PANDAS is more than happy to work with that. There are just a lot of ways I can analyze that data and use it for various projects.