How to calculate what income percentile a person is in using R and Census Microdata
Here is how to create a table of income percentiles using R, Census Public Use Microdata and the quantile() function.
library(tidycensus)
library(tidyverse)
library(hutils)
library(gt)
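# Grab Public Use Microdata for Albany County, with
# PINCP = total person's income; see View(pums_variables).
# get_pums() needs a Census API key, see ?census_api_key.
# PUMA areas can be found with tigris and mapview for an interactive
# map of the PUMA areas: mapview::mapview(tigris::pumas('ny'))
# year and survey are set explicitly so the data matches the
# 2016-2020 5-year sample cited in the table footnote
aci <- get_pums(variables = 'PINCP', state = 'NY', puma = c('02001', '02002'),
                year = 2020, survey = 'acs5')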
# Next use hutils' weight2rows to expand the data frame so that
# each weighted observation gets one row per person it represents,
# then filter to only include persons with income greater than $2,
# extract only the PINCP variable, then calculate quantiles for
# 0, 5, 10, ... 90, 95, 99 percent
y <- aci %>%
  weight2rows('PWGTP') %>%
  filter(PINCP > 2) %>%
  .$PINCP %>%
  quantile(c(seq(0, 0.95, 0.05), 0.99))
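# Why expand by weight? Each PUMS record stands in for PWGTP people, so the
# quantiles should be taken over people, not over survey records.
# A minimal base-R sketch with made-up numbers (toy_income and toy_weight
# are hypothetical, not Census data):
toy_income <- c(10000, 50000, 200000)
toy_weight <- c(8, 1, 1)
# The unweighted median treats the three records equally: 50000
median(toy_income)
# Repeating each income by its weight, in effect what weight2rows() does to
# the whole data frame, gives the median of the 10 people represented: 10000
median(rep(toy_income, times = toy_weight))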
# Create a tibble with the quantile ranges
# (what percentage am I)
# and calculated quantile values (how much income)
df2 <- tibble(
  x = 100 - c(seq(0, 0.95, 0.05), 0.99) * 100,
  y = y
)
# Zero out the value in the 100% row (the lowest income), to overcome
# how the Census stores $1, $2, etc.
df2[1, 2] <- 0
# Create the table with gt
df2 %>%
  select(Salary = y, `Top Percent` = x) %>%
  gt() %>%
  fmt_currency(1, decimals = 0) %>%
  fmt_percent(2, scale_values = FALSE, decimals = 1) %>%
  opt_stylize() %>%
  opt_css('body { text-align: center} ') %>%
  cols_align('center') %>%
  tab_header('What percent is a salary in Albany County?',
             'Includes only persons with a yearly income of more than $2.') %>%
  tab_footnote(html('Andy Arthur, 1/22/23.<br /><em>Data Source:</em> 2016-2020 Public Use Microdata, <br />Total person\'s income, NY PUMA 02001 and 02002.'))
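# The table reads from percent to income. To go the other way and ask what
# "top percent" one particular income falls in, the same expanded incomes
# can be fed to base R's ecdf(). A minimal sketch, not part of the original
# analysis: top_percent() is a hypothetical helper, and the incomes and
# weights in the example call are made up so it runs without a Census API key.
top_percent <- function(income, incomes, weights) {
  # one value per person represented, the same idea as weight2rows()
  expanded <- rep(incomes, times = weights)
  # ecdf() gives the share of people at or below `income`;
  # the remainder is the share above it
  100 * (1 - ecdf(expanded)(income))
}
# 9 of the 10 people represented earn $60,000 or less, so this is about 10
top_percent(60000, incomes = c(20000, 40000, 60000, 150000),
            weights = c(4, 3, 2, 1))
# With the real data, assuming aci from above, something like:
# keep <- aci$PINCP > 2
# top_percent(75000, aci$PINCP[keep], aci$PWGTP[keep])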