State College, PA in 1938
Much less development in State College nearly 83 years ago.
East Berlin is a borough in Adams County, Pennsylvania, United States. The population was 1,521 at the 2010 census. East Berlin is served by the Bermudian Springs School District. East Berlin is located in the southern part of Pennsylvania, adjacent to the York County border and 13 miles west of York.
Same-sex couples data is probably the most interesting part of the demographics to be released at 10 am today, but there is much more worth poring over. I’m hoping the data will be available through tidycensus, as there is some interesting potential for map making.
Want to be able to work with American Community Survey data offline using your own local copy of the ACS 5-year Summary File? It’s pretty easy to do with PANDAS. If you are planning a lot of Census queries, this can be a very fast way to extract data.
Before you can use this script, you will need to download some data:
import pandas as pd
path = '/home/andy/Desktop/acs-summary-file/'
# list of geography
geo = pd.read_excel(path+'5_year_Mini_Geo.xlsx', sheet_name='ny',index_col='Logical Record Number')
# load headers
header = pd.read_excel(path+'ACS_5yr_Seq_Table_Number_Lookup.xlsx')
# create a column with census variable headers
header['COL_NAME'] = header['Table ID'] + '_' + header['Line Number'].apply(lambda a: "{0:.0f}".format(a).zfill(3))
# segment id, along with ACS year and state
segId = 135
year = 2019
state = 'ny'
# create a list of headers for segment file
segHead = ['FILEID','FILETYPE','STUSAB','CHARITER','SEQUENCE','LOGRECNO'] \
+ header.query('`Sequence Number` == '+str(segId)).dropna(subset=['Line Number'])['COL_NAME'].to_list()
# read the segment file, including column names above
seg = pd.read_csv(path+'e'+str(year)+'5'+state+(str(segId).zfill(4))+'000.txt',header=None, names=segHead, index_col=5)
# join the segment file to geography using Logical Record number
seg = geo.join(seg)
# calculate percentage of households with internet subscriptions -- codes from ACS_5yr_Seq_Table_Number_Lookup.xlsx
seg['Internet Subscription']=seg['B28011_002']/seg['B28011_001']*100
# output the percentage of households by county with internet subscriptions
seg[seg['Geography ID'].str.startswith('050')][['Geography Name','Internet Subscription']]
Logical Record Number | Geography Name | Internet Subscription |
---|---|---|
13 | Albany County, New York | 83.888889 |
14 | Allegany County, New York | 76.248050 |
15 | Bronx County, New York | 75.917821 |
16 | Broome County, New York | 82.222562 |
17 | Cattaraugus County, New York | 72.431480 |
… | … | … |
70 | Washington County, New York | 80.224036 |
71 | Wayne County, New York | 81.508715 |
72 | Westchester County, New York | 86.371288 |
73 | Wyoming County, New York | 78.387887 |
74 | Yates County, New York | 75.916583 |
# alternatively you can display human readable columns automatically
seg.rename(dict(zip(header['COL_NAME'],header['Table Title'])),axis=1)
Logical Record Number | State | Geography ID | Geography Name | FILEID | FILETYPE | STUSAB | CHARITER | SEQUENCE | Total: | Has one or more types of computing devices: |
---|---|---|---|---|---|---|---|---|---|---|
1 | NY | 04000US36 | New York | ACSSF | 201900000.0 | ny | 0.0 | 135.0 | 7343234.0 | 6581493.0 |
2 | NY | 04001US36 | New York — Urban | ACSSF | 201900000.0 | ny | 0.0 | 135.0 | 6433524.0 | 5771681.0 |
3 | NY | 04043US36 | New York — Rural | ACSSF | 201900000.0 | ny | 0.0 | 135.0 | 909710.0 | 809812.0 |
4 | NY | 040A0US36 | New York — In metropolitan or micropolitan st… | ACSSF | 201900000.0 | ny | 0.0 | 135.0 | 7189902.0 | 6449723.0 |
5 | NY | 040C0US36 | New York — In metropolitan statistical area | ACSSF | 201900000.0 | ny | 0.0 | 135.0 | 6796057.0 | 6109882.0 |
… | … | … | … | … | … | … | … | … | … | … |
28400 | NY | 97000US3631920 | Yonkers City School District, New York | ACSSF | 201900000.0 | ny | 0.0 | 135.0 | 74897.0 | 65767.0 |
28401 | NY | 97000US3631950 | York Central School District, New York | ACSSF | 201900000.0 | ny | 0.0 | 135.0 | 2116.0 | 1964.0 |
28402 | NY | 97000US3631980 | Yorktown Central School District, New York | ACSSF | 201900000.0 | ny | 0.0 | 135.0 | 7068.0 | 6751.0 |
28403 | NY | 97000US3632010 | Cuba-Rushford Central School District, New York | ACSSF | 201900000.0 | ny | 0.0 | 135.0 | 2629.0 | 2186.0 |
28404 | NY | 97000US3699999 | Remainder of New York, New York | ACSSF | 201900000.0 | ny | 0.0 | 135.0 | 79779.0 | 75425.0 |
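The file name and column names in the script above follow a fixed pattern, so they can be wrapped in two small helpers. This is only a sketch: `segment_filename` and `column_name` are hypothetical names that are not part of the original script, and the `kind` parameter assumes you may also want the margin-of-error files, which start with `m` instead of `e`.

```python
def segment_filename(year, state, seg_id, kind="e"):
    """Build a 5-year summary file segment name, e.g. e20195ny0135000.txt.

    kind is 'e' for estimates; 'm' (margins of error) is an assumption
    about the other files in the same download.
    """
    return f"{kind}{year}5{state.lower()}{seg_id:04d}000.txt"


def column_name(table_id, line_number):
    """Combine a table ID and line number into a variable code such as B28011_002."""
    return f"{table_id}_{int(line_number):03d}"


print(segment_filename(2019, "ny", 135))
print(column_name("B28011", 2.0))
```

With these, the `pd.read_csv` call above could take `path + segment_filename(year, state, segId)` instead of the long string concatenation.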
Too much work, or don’t want to download the summary file yourself? You can query the Census API directly using the censusdata library, which is available from PyPI via pip. For infrequent queries made while you are online, you are much better off just querying the API directly.
import pandas as pd
import censusdata as cd
# attributes to load
cdcol=['B28011_001','B28011_002']
cdf = cd.download('acs5', 2019,
cd.censusgeo([('state', '36'),
('county','*')]),
cdcol)
# separate out the geoid and geography name
geoid=[]
geoname=[]
for index in cdf.index.tolist():
    # each index entry holds (level, code) pairs, e.g. ('state', '36'), ('county', '001')
    geopart=''
    for part in index.geo:
        geopart = geopart + part[1]
    geoid.append(geopart)
    geoname.append(index.name)
cdf['geoid']=geoid
cdf['geoname']=geoname
# calculate percentage with internet subscriptions
cdf['Internet Subscription']=cdf['B28011_002']/cdf['B28011_001']*100
# output a similar table as above
cdf
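The loop that builds the geoid can be shortened to a one-line join. A minimal sketch, assuming each index entry's `geo` attribute is a sequence of `(level, code)` pairs, as the loop above implies; `geoid_from_parts` is a hypothetical helper name, and the tuple below is a stand-in so the example runs without hitting the API.

```python
def geoid_from_parts(parts):
    """Concatenate the code from each (level, code) pair, e.g. state + county."""
    return "".join(code for _level, code in parts)


# stand-in for index.geo on one row of the downloaded data frame
albany = (("state", "36"), ("county", "001"))
print(geoid_from_parts(albany))
```

On the real data frame this would be used as `cdf['geoid'] = [geoid_from_parts(i.geo) for i in cdf.index]`.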
Learn how to load the PL 94-171 2020 Redistricting Data into PANDAS, a process that is similar to, but not the same as, loading ACS data.
Also, calculate the population of an area and its average demographics, including areas that don’t have Census demographics, such as Election Districts or County Legislative districts.
I was pleasantly surprised how quick purrr is to use with tidycensus to get data for all Census Tracts in America. This code is based on Matt Herman’s example, but I used R’s built-in state.abb to get a list of all states. To get only the CONUS Census Tracts, you could use state.abb %>% setdiff(c('AK','HI')), which uses R’s built-in setdiff to remove AK and HI from the list of states pulled.
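library(tidycensus)
library(tidyverse)
# download total population (B01001_001) by tract for each state,
# then row-bind the per-state results into one data frame
acs <- map_dfr(
state.abb,
~get_acs("tract", survey='acs5', var='B01001_001', state=., cache_table = T,
geometry = T,
year = 2021)
)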
Here is how to create a table of income percentiles using R, Census Public Use Microdata, and the quantile function.
library(tidycensus)
library(tidyverse)
library(hutils)
library(gt)
# Grab Public Use Microdata for Albany County, with
# PINCP = Total Personal Income see View(pums_variables)
# puma areas can be found with tigris and mapview for an interactive
# map of the puma areas: mapview::mapview(tigris::pumas('ny'))
aci <- get_pums(variables = 'PINCP', state = "NY",puma = c('02001','02002'))
# next use hutils' weight2rows to expand out the dataframe so that
# each weighted observation is repeated once per person it represents
# then filter to only include persons with salary income, > $2
# extract out only the .$PINCP variable, then calculate quantile for
# 0, 5, 10, ... 90, 95, 99 percent
aci %>%
weight2rows('PWGTP') %>%
filter(PINCP > 2) %>%
.$PINCP %>%
quantile(c(seq(0,0.95,0.05),.99)) -> y
# create a tibble with the quantile ranges
# (what percentage am i)
# and calculated quantile values (how much income)
df2 <- tibble(
x=100-c(seq(0,0.95,0.05),.99)*100,
y=y
)
# Zero out the 100% value, to overcome
# how the Census stores $1, $2, etc.
df2[1,2] <- 0
# Create the table with gt
df2 %>%
select( Salary = y, `Top Percent` = x) %>%
gt() %>%
fmt_currency(1, decimals = 0) %>%
fmt_percent(2, scale_values = F, decimals = 1) %>%
opt_stylize() %>%
opt_css('body { text-align: center} ') %>%
cols_align('center') %>%
tab_header('What percent is a salary in Albany County?',
'Includes only persons with an income of more than $1 a year.') %>%
tab_footnote(html('Andy Arthur, 1/22/23.<br /><em>Data Source:</em> 2016-2020 Public Use Microdata, <br />Total person\'s income, NY PUMA 02001 and 02002.'))
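The weighting step is the part that is easy to get wrong, so here is the same idea sketched in plain Python: repeat each observation by its weight, sort, and interpolate between neighbors. This is an illustration rather than a port of the R code; `weighted_quantile` is a hypothetical name, the weights are assumed to be whole numbers (as person weights are), and the interpolation is the ordinary linear rule, which I believe matches R's default.

```python
def weighted_quantile(values, weights, prob):
    """Quantile of values after repeating each one by its integer weight."""
    # expand: a value with weight 3 appears three times
    expanded = sorted(v for v, w in zip(values, weights) for _ in range(int(w)))
    # fractional position of the requested quantile in the sorted list
    pos = prob * (len(expanded) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(expanded) - 1)
    # linear interpolation between the two neighboring values
    return expanded[lo] + (expanded[hi] - expanded[lo]) * (pos - lo)


incomes = [10, 20, 30]
person_weights = [1, 2, 1]
print(weighted_quantile(incomes, person_weights, 0.5))
```

Expanding rows is simple but memory hungry; for a whole state it may be worth computing quantiles from cumulative weights instead.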
State Population Estimates 2020 through 2022

State | April 2020 Census Count | July 2020 Estimate | July 2021 Estimate | July 2022 Estimate | Change April 2020 to July 2022 |
---|---|---|---|---|---|
New York | 20,201,230 | 20,108,296 | 19,857,492 | 19,677,151 | −2.6% |
District of Columbia | 689,546 | 670,868 | 668,791 | 671,803 | −2.6% |
Puerto Rico | 3,285,874 | 3,281,557 | 3,262,693 | 3,221,789 | −2.0% |
Illinois | 12,812,545 | 12,786,580 | 12,686,469 | 12,582,032 | −1.8% |
Louisiana | 4,657,749 | 4,651,664 | 4,627,098 | 4,590,241 | −1.4% |
California | 39,538,245 | 39,501,653 | 39,142,991 | 39,029,342 | −1.3% |
West Virginia | 1,793,755 | 1,791,420 | 1,785,526 | 1,775,156 | −1.0% |
Hawaii | 1,455,273 | 1,451,043 | 1,447,154 | 1,440,196 | −1.0% |
Mississippi | 2,961,288 | 2,958,141 | 2,949,586 | 2,940,057 | −0.7% |
Massachusetts | 7,029,949 | 6,995,729 | 6,989,690 | 6,981,974 | −0.7% |
Michigan | 10,077,325 | 10,069,577 | 10,037,504 | 10,034,113 | −0.4% |
Ohio | 11,799,374 | 11,797,517 | 11,764,342 | 11,756,058 | −0.4% |
Rhode Island | 1,097,371 | 1,096,345 | 1,096,985 | 1,093,734 | −0.3% |
New Jersey | 9,289,031 | 9,271,689 | 9,267,961 | 9,261,699 | −0.3% |
Pennsylvania | 13,002,689 | 12,994,440 | 13,012,059 | 12,972,008 | −0.2% |
Maryland | 6,177,213 | 6,173,205 | 6,174,610 | 6,164,660 | −0.2% |
New Mexico | 2,117,527 | 2,118,390 | 2,116,677 | 2,113,344 | −0.2% |
Kansas | 2,937,847 | 2,937,919 | 2,937,922 | 2,937,150 | −0.0% |
Wisconsin | 5,893,725 | 5,896,271 | 5,880,101 | 5,892,539 | −0.0% |
North Dakota | 779,091 | 779,518 | 777,934 | 779,261 | 0.0% |
Alaska | 733,378 | 732,923 | 734,182 | 733,583 | 0.0% |
Oregon | 4,237,291 | 4,244,795 | 4,256,301 | 4,240,137 | 0.1% |
Kentucky | 4,505,893 | 4,507,445 | 4,506,589 | 4,512,310 | 0.1% |
Minnesota | 5,706,504 | 5,709,852 | 5,711,471 | 5,717,184 | 0.2% |
Iowa | 3,190,372 | 3,190,571 | 3,197,689 | 3,200,517 | 0.3% |
Nebraska | 1,961,489 | 1,962,642 | 1,963,554 | 1,967,923 | 0.3% |
Missouri | 6,154,920 | 6,153,998 | 6,169,823 | 6,177,957 | 0.4% |
Connecticut | 3,605,942 | 3,597,362 | 3,623,355 | 3,626,205 | 0.6% |
Virginia | 8,631,384 | 8,636,471 | 8,657,365 | 8,683,619 | 0.6% |
Vermont | 643,085 | 642,893 | 646,972 | 647,064 | 0.6% |
Indiana | 6,785,668 | 6,788,799 | 6,813,532 | 6,833,037 | 0.7% |
Wyoming | 576,837 | 577,605 | 579,483 | 581,381 | 0.8% |
Alabama | 5,024,356 | 5,031,362 | 5,049,846 | 5,074,296 | 1.0% |
Washington | 7,705,247 | 7,724,031 | 7,740,745 | 7,785,786 | 1.0% |
Arkansas | 3,011,555 | 3,014,195 | 3,028,122 | 3,045,637 | 1.1% |
Colorado | 5,773,733 | 5,784,865 | 5,811,297 | 5,839,926 | 1.1% |
New Hampshire | 1,377,518 | 1,378,587 | 1,387,505 | 1,395,231 | 1.3% |
Oklahoma | 3,959,346 | 3,964,912 | 3,991,225 | 4,019,800 | 1.5% |
Maine | 1,362,341 | 1,363,557 | 1,377,238 | 1,385,340 | 1.7% |
Georgia | 10,711,937 | 10,729,828 | 10,788,029 | 10,912,876 | 1.9% |
Tennessee | 6,910,786 | 6,925,619 | 6,968,351 | 7,051,339 | 2.0% |
Nevada | 3,104,624 | 3,115,648 | 3,146,402 | 3,177,772 | 2.4% |
North Carolina | 10,439,414 | 10,449,445 | 10,565,885 | 10,698,973 | 2.5% |
South Dakota | 886,677 | 887,799 | 896,164 | 909,824 | 2.6% |
Delaware | 989,957 | 992,114 | 1,004,807 | 1,018,396 | 2.9% |
Arizona | 7,151,507 | 7,179,943 | 7,264,877 | 7,359,197 | 2.9% |
Texas | 29,145,428 | 29,232,474 | 29,558,864 | 30,029,572 | 3.0% |
South Carolina | 5,118,429 | 5,131,848 | 5,193,266 | 5,282,634 | 3.2% |
Florida | 21,538,226 | 21,589,602 | 21,828,069 | 22,244,823 | 3.3% |
Utah | 3,271,614 | 3,283,785 | 3,339,113 | 3,380,800 | 3.3% |
Montana | 1,084,197 | 1,087,075 | 1,106,227 | 1,122,867 | 3.6% |
Idaho | 1,839,092 | 1,849,202 | 1,904,314 | 1,939,033 | 5.4% |
Andy Arthur, 12/23/22. Data Source: US Census Population Estimates.
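The last column of the table is just the percent change between the April 2020 count and the July 2022 estimate. A small sketch of that arithmetic, using two rows copied from the table; `percent_change` is a hypothetical helper name.

```python
def percent_change(start, end):
    """Percent change from start to end, relative to start."""
    return (end - start) / start * 100


# (April 2020 Census count, July 2022 estimate), taken from the table above
estimates = {
    "New York": (20_201_230, 19_677_151),
    "Idaho": (1_839_092, 1_939_033),
}

for state, (april_2020, july_2022) in estimates.items():
    print(f"{state}: {percent_change(april_2020, july_2022):.1f}%")
```

Rounded to one decimal place, this gives -2.6% for New York and 5.4% for Idaho, matching the table.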