Calculate Median Household Income for Albany Wards using R πΊ
Calculate Median Household Income for Albany Wards using R πΊ
A while back I posted similar code using Python, but with R the code is cleaner, simpler.
library(tidycensus)
library(tidyverse)
library(sf)
# shapefile of Albany Wards, project to web mercator 3857
# calculate area of each ward, load Ward field as numeric
# hide extra fields in shapefile
wards <- read_sf('/tmp/Albany Wards.gpkg') %>%
st_transform(3857) %>%
mutate(ward_area = st_area(.),
Ward = as.numeric(Ward)) %>%
dplyr::select(Ward, ward_area)
# get median household income, re-project
# to web mercator 3857
income<- get_acs(
geography = "tract",
state='ny',
county='Albany',
variables = "B19013_001",
year = 2020,
survey = "acs5",
geometry = T
) %>%
st_transform(3857)
# intersect income against wards
# calculate percent of ward's area in each census tract
# group back together as a single ward
# create a weighted-mean of median household inco
# rejoin back together with original ward shapefile
income %>%
st_intersection(wards) %>%
mutate(percent_of_ward = (st_area(.) / ward_area) %>% units::drop_units() ) %>%
st_drop_geometry() %>%
group_by(Ward) %>%
summarise(estimate = weighted.mean(estimate, percent_of_ward, na.rm=T)) %>%
inner_join(wards, ., by=c('Ward'))
Here is what is produced, with geometry columns suppressed (exclude the last line above when you don’t need the other information for mapping).
Ward estimate
<dbl> <dbl>
1 1 57500.
2 2 46544.
3 3 37274.
4 4 39713.
5 5 38127.
6 6 47070.
7 7 47915.
8 8 86490.
9 9 57688.
10 10 44964.
11 11 33290.
12 12 69262.
13 13 67333.
14 14 78489.
15 15 72368.