Search Results for: pa census

Farm Country, East Berlin, PA

East Berlin is a borough in Adams County, Pennsylvania, United States. The population was 1,521 at the 2010 census. East Berlin is served by the Bermudian Springs School District. East Berlin is located in the southern part of Pennsylvania, adjacent to the York County border and 13 miles west of York.

Working with PANDAS and the American Community Survey Summary File

Want to be able to work with American Community Survey data offline using your own local copy of the ACS 5-year Summary File? It’s pretty easy to do with PANDAS. If you are planning a lot of Census queries, this can be a very fast way to extract data.

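Before you can use this script, you will need to download some data into a local folder: the geography workbook (5_year_Mini_Geo.xlsx), the sequence and table number lookup (ACS_5yr_Seq_Table_Number_Lookup.xlsx), and the estimate segment file for your state and sequence number, all of which the script reads: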

import pandas as pd

path = '/home/andy/Desktop/acs-summary-file/'

# list of geography
geo = pd.read_excel(path+'5_year_Mini_Geo.xlsx', sheet_name='ny',index_col='Logical Record Number')

# load headers
header = pd.read_excel(path+'ACS_5yr_Seq_Table_Number_Lookup.xlsx')

# create a column with census variable headers
header['COL_NAME'] = header['Table ID'] + '_' + header['Line Number'].apply(lambda a: "{0:.0f}".format(a).zfill(3))

# segment id, along with ACS year and state
segId = 135
year = 2019
state = 'ny'

# create a list of headers for segment file
segHead = ['FILEID','FILETYPE','STUSAB','CHARITER','SEQUENCE','LOGRECNO'] \
    + header.query('`Sequence Number` == '+str(segId)).dropna(subset=['Line Number'])['COL_NAME'].to_list()

# read the segment file, including column names above    
seg = pd.read_csv(path+'e'+str(year)+'5'+state+(str(segId).zfill(4))+'000.txt',header=None, names=segHead, index_col=5)

# join the segment file to geography using Logical Record number
seg = geo.join(seg)

# calculate percentage of households with internet subscriptions -- codes from ACS_5yr_Seq_Table_Number_Lookup.xlsx
seg['Internet Subscription']=seg['B28011_002']/seg['B28011_001']*100

# output the percentage of households by county with internet subscriptions
seg[seg['Geography ID'].str.startswith('050')][['Geography Name','Internet Subscription']]

Logical Record Number | Geography Name | Internet Subscription
13 | Albany County, New York | 83.888889
14 | Allegany County, New York | 76.248050
15 | Bronx County, New York | 75.917821
16 | Broome County, New York | 82.222562
17 | Cattaraugus County, New York | 72.431480
... | ... | ...
70 | Washington County, New York | 80.224036
71 | Wayne County, New York | 81.508715
72 | Westchester County, New York | 86.371288
73 | Wyoming County, New York | 78.387887
74 | Yates County, New York | 75.916583

# alternatively you can display human readable columns automatically
seg.rename(dict(zip(header['COL_NAME'],header['Table Title'])),axis=1)

Logical Record Number | State | Geography ID | Geography Name | FILEID | FILETYPE | STUSAB | CHARITER | SEQUENCE | Total: | Has one or more types of computing devices:
1 | NY | 04000US36 | New York | ACSSF | 201900000.0 | ny | 0.0 | 135.0 | 7343234.0 | 6581493.0
2 | NY | 04001US36 | New York -- Urban | ACSSF | 201900000.0 | ny | 0.0 | 135.0 | 6433524.0 | 5771681.0
3 | NY | 04043US36 | New York -- Rural | ACSSF | 201900000.0 | ny | 0.0 | 135.0 | 909710.0 | 809812.0
4 | NY | 040A0US36 | New York -- In metropolitan or micropolitan st… | ACSSF | 201900000.0 | ny | 0.0 | 135.0 | 7189902.0 | 6449723.0
5 | NY | 040C0US36 | New York -- In metropolitan statistical area | ACSSF | 201900000.0 | ny | 0.0 | 135.0 | 6796057.0 | 6109882.0
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ...
28400 | NY | 97000US3631920 | Yonkers City School District, New York | ACSSF | 201900000.0 | ny | 0.0 | 135.0 | 74897.0 | 65767.0
28401 | NY | 97000US3631950 | York Central School District, New York | ACSSF | 201900000.0 | ny | 0.0 | 135.0 | 2116.0 | 1964.0
28402 | NY | 97000US3631980 | Yorktown Central School District, New York | ACSSF | 201900000.0 | ny | 0.0 | 135.0 | 7068.0 | 6751.0
28403 | NY | 97000US3632010 | Cuba-Rushford Central School District, New York | ACSSF | 201900000.0 | ny | 0.0 | 135.0 | 2629.0 | 2186.0
28404 | NY | 97000US3699999 | Remainder of New York, New York | ACSSF | 201900000.0 | ny | 0.0 | 135.0 | 79779.0 | 75425.0
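
The file and column names that the script assembles follow fixed patterns, so they can be checked without loading any data. Here is a minimal sketch, assuming the same naming the script uses; the helper names column_name and segment_filename are made up for illustration and are not part of PANDAS or the Summary File.

```python
def column_name(table_id, line_number):
    """Build a Census variable name such as 'B28011_002'."""
    # Line numbers come out of the lookup spreadsheet as floats, so drop the
    # decimals and left-pad to three digits, like the COL_NAME expression.
    return table_id + "_" + f"{line_number:.0f}".zfill(3)


def segment_filename(year, state, seg_id):
    """Build the estimate segment file name for a 5-year release."""
    # 'e' for estimates, the year, '5' for 5-year, the lowercase state
    # abbreviation, the four-digit sequence number, then '000.txt'.
    return f"e{year}5{state}{str(seg_id).zfill(4)}000.txt"


print(column_name("B28011", 2.0))
print(segment_filename(2019, "ny", 135))
```

Printing the names first is a quick way to confirm that the segment file you downloaded matches the sequence number you looked up.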

Too much work, or don’t want to download the Summary File yourself? You can query the Census API directly using the censusdata library, which installs from PyPI with pip. For infrequent queries made while you are online, you are much better off just querying the API directly.

import pandas as pd
import censusdata as cd

# attributes to load
cdcol=['B28011_001','B28011_002']

cdf = cd.download('acs5', 2019,
           cd.censusgeo([('state', '36'),
                         ('county','*')]),
          cdcol)


# separate out the geoid and geography name
geoid=[]
geoname=[]

for index in cdf.index.tolist():
    geopart=''
    for part in index.geo:
        geopart = geopart + part[1]
    geoid.append(geopart)
    geoname.append(index.name)

cdf['geoid']=geoid
cdf['geoname']=geoname

# calculate percentage with internet subscriptions
cdf['Internet Subscription']=cdf['B28011_002']/cdf['B28011_001']*100

# output a similar table as above
cdf
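
The loop that builds the geoid can also be written as a single join over the (level, code) pairs. This is only a sketch: GeoStub is a made-up stand-in for the index entries that censusdata returns, assumed to expose the same geo and name attributes the script above reads, so that the idea can be tried offline.

```python
from collections import namedtuple

# Hypothetical stand-in for a censusdata index entry (illustration only).
GeoStub = namedtuple("GeoStub", ["geo", "name"])


def flatten_geoid(entry):
    """Concatenate the code from each (level, code) pair into one geoid."""
    return "".join(code for _level, code in entry.geo)


albany = GeoStub(geo=(("state", "36"), ("county", "001")),
                 name="Albany County, New York")

geoid = flatten_geoid(albany)
print(geoid, albany.name)
```

With the real DataFrame, the same function could be applied to each entry of cdf.index in a list comprehension.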

Learn how to load the PL 94-171 2020 Redistricting Data into PANDAS, a process that is similar to, but not the same as, loading ACS data.

Also, calculate the population of an area and its average demographics, including areas that don’t have Census demographics, such as Election Districts or County Legislative districts.
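
One common way to get demographics for areas the Census does not publish is to assign small units such as blocks to each district, then add up the counts. The sketch below assumes that approach and uses made-up block records; it is not necessarily the method used in the linked post.

```python
from collections import defaultdict

# Made-up records for illustration: (district, population, population under 18)
blocks = [
    ("ED 1", 100, 20),
    ("ED 1", 300, 100),
    ("ED 2", 200, 50),
]

population = defaultdict(int)
under_18 = defaultdict(int)

# Sum the raw counts per district first; averaging block percentages
# directly would give small blocks too much weight.
for district, pop, kids in blocks:
    population[district] += pop
    under_18[district] += kids

percent_under_18 = {d: under_18[d] / population[d] * 100 for d in population}
print(percent_under_18)
```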

Obtaining Census Tract Data in All States using TidyCensus and purrr

I was pleasantly surprised by how quick purrr is to use with TidyCensus to get data for all Census Tracts in America. This code is based on Matt Herman’s example, but I used R’s built-in state.abb to get a list of all states. To get only the CONUS Census Tracts, you could use state.abb %>% setdiff(c('AK','HI')), which uses R’s built-in setdiff to remove AK and HI from the list of states pulled.

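library(tidycensus)
library(tidyverse)

# state.abb holds the 50 state abbreviations (no DC or Puerto Rico);
# map_dfr() calls get_acs() once per state and row-binds the results
acs <- map_dfr(
  state.abb,
  ~get_acs("tract", survey = 'acs5', variables = 'B01001_001', state = .x,
           cache_table = TRUE,
           geometry = TRUE,
           year = 2021)
  )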
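
For readers who work mostly in Python, the same map-then-row-bind pattern can be sketched without any Census libraries. Here fetch_tracts is a made-up placeholder for the per-state download, and the state list is shortened for illustration.

```python
STATES = ["AK", "AL", "HI", "NY"]  # shortened list, for illustration only


def fetch_tracts(state):
    """Hypothetical placeholder for a per-state request such as get_acs()."""
    return [{"state": state, "variable": "B01001_001"}]


# Equivalent of state.abb %>% setdiff(c('AK','HI')): keep only CONUS states.
conus = [s for s in STATES if s not in {"AK", "HI"}]

# Equivalent of map_dfr(): call the function once per state, then stack
# every returned row into one flat list.
rows = [row for state in conus for row in fetch_tracts(state)]
print(len(rows), "rows from", conus)
```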

How to calculate what income percentile a person is in using R and Census Microdata

Here is how to create a table of income percentiles using R, Census Public Use Microdata, and the quantile function.

library(tidycensus)
library(tidyverse)
library(hutils)
library(gt)

# Grab Public Use Microdata for Albany County, with 
# PINCP = Total Personal Income see View(pums_variables) 
# puma areas can be found with tigris and mapview for an interactive
# map of the puma areas: mapview::mapview(tigris::pumas('ny'))
aci <- get_pums(variables = 'PINCP', state = "NY",puma = c('02001','02002'))

# next use hutils' weight2rows to expand the data frame so that
# each weighted observation is repeated once per person it represents,
# then filter to only include persons with income > $2,
# extract only the .$PINCP variable, then calculate quantiles for
# 0, 5, 10, ... 90, 95, 99 percent
aci %>% 
  weight2rows('PWGTP') %>% 
  filter(PINCP > 2) %>%
  .$PINCP %>%
  quantile(c(seq(0,0.95,0.05),.99)) -> y

# create a tibble with the quantile ranges 
# (what percentage am i)
# and calculated quantile values (how much income)
df2 <- tibble(
  x=100-c(seq(0,0.95,0.05),.99)*100, 
  y=y
)

# Zero out the 100% value, to overcome 
# how the Census stores $1, $2, etc.
df2[1,2] <- 0

# Create the table with gt
df2 %>%
  select( Salary = y, `Top Percent` = x) %>%
  gt() %>%
  fmt_currency(1, decimals = 0) %>%
  fmt_percent(2, scale_values = F, decimals = 1) %>%
  opt_stylize() %>%
  opt_css('body { text-align: center} ') %>%
  cols_align('center') %>%
  tab_header('What percent is a salary in Albany County?',
             'Includes only persons with a yearly income of more than $1 a year.') %>%
  tab_footnote(html('Andy Arthur, 1/22/23.<br /><em>Data Source:</em> 2016-2020 Public Use Microdata, <br />Total person\'s income,  NY PUMA 02001 and 02002.'))
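
The core idea in the pipeline above, repeating each record by its weight and then taking ordinary quantiles, can be sketched in plain Python. The records here are made up, and the quantile helper uses linear interpolation, the same rule as the default (type 7) of R's quantile function.

```python
# Made-up microdata for illustration: (income, person weight)
records = [(1, 3), (10000, 2), (30000, 1), (50000, 2)]

# Repeat each income once per person it represents (what weight2rows does),
# keeping only incomes above $2, then sort for the quantile lookup.
expanded = sorted(
    income
    for income, weight in records
    if income > 2
    for _ in range(weight)
)


def quantile(sorted_values, p):
    """Linear-interpolation quantile of already-sorted values, 0 <= p <= 1."""
    pos = (len(sorted_values) - 1) * p
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * frac


for p in (0.0, 0.5, 1.0):
    # "Top percent" is 100 minus the percentile, as in the table above.
    print(f"top {100 - p * 100:.0f}%: {quantile(expanded, p):.0f}")
```

Expanding rows is simple but memory-hungry on large samples; it works here because only two PUMAs are pulled.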

2022 US Census Population Estimates

State Population Estimates 2020 through 2022
State | April 2020 Census Count | July 2020 Estimate | July 2021 Estimate | July 2022 Estimate | Change April 2020 to July 2022
New York | 20,201,230 | 20,108,296 | 19,857,492 | 19,677,151 | −2.6%
District of Columbia | 689,546 | 670,868 | 668,791 | 671,803 | −2.6%
Puerto Rico | 3,285,874 | 3,281,557 | 3,262,693 | 3,221,789 | −2.0%
Illinois | 12,812,545 | 12,786,580 | 12,686,469 | 12,582,032 | −1.8%
Louisiana | 4,657,749 | 4,651,664 | 4,627,098 | 4,590,241 | −1.4%
California | 39,538,245 | 39,501,653 | 39,142,991 | 39,029,342 | −1.3%
West Virginia | 1,793,755 | 1,791,420 | 1,785,526 | 1,775,156 | −1.0%
Hawaii | 1,455,273 | 1,451,043 | 1,447,154 | 1,440,196 | −1.0%
Mississippi | 2,961,288 | 2,958,141 | 2,949,586 | 2,940,057 | −0.7%
Massachusetts | 7,029,949 | 6,995,729 | 6,989,690 | 6,981,974 | −0.7%
Michigan | 10,077,325 | 10,069,577 | 10,037,504 | 10,034,113 | −0.4%
Ohio | 11,799,374 | 11,797,517 | 11,764,342 | 11,756,058 | −0.4%
Rhode Island | 1,097,371 | 1,096,345 | 1,096,985 | 1,093,734 | −0.3%
New Jersey | 9,289,031 | 9,271,689 | 9,267,961 | 9,261,699 | −0.3%
Pennsylvania | 13,002,689 | 12,994,440 | 13,012,059 | 12,972,008 | −0.2%
Maryland | 6,177,213 | 6,173,205 | 6,174,610 | 6,164,660 | −0.2%
New Mexico | 2,117,527 | 2,118,390 | 2,116,677 | 2,113,344 | −0.2%
Kansas | 2,937,847 | 2,937,919 | 2,937,922 | 2,937,150 | −0.0%
Wisconsin | 5,893,725 | 5,896,271 | 5,880,101 | 5,892,539 | −0.0%
North Dakota | 779,091 | 779,518 | 777,934 | 779,261 | 0.0%
Alaska | 733,378 | 732,923 | 734,182 | 733,583 | 0.0%
Oregon | 4,237,291 | 4,244,795 | 4,256,301 | 4,240,137 | 0.1%
Kentucky | 4,505,893 | 4,507,445 | 4,506,589 | 4,512,310 | 0.1%
Minnesota | 5,706,504 | 5,709,852 | 5,711,471 | 5,717,184 | 0.2%
Iowa | 3,190,372 | 3,190,571 | 3,197,689 | 3,200,517 | 0.3%
Nebraska | 1,961,489 | 1,962,642 | 1,963,554 | 1,967,923 | 0.3%
Missouri | 6,154,920 | 6,153,998 | 6,169,823 | 6,177,957 | 0.4%
Connecticut | 3,605,942 | 3,597,362 | 3,623,355 | 3,626,205 | 0.6%
Virginia | 8,631,384 | 8,636,471 | 8,657,365 | 8,683,619 | 0.6%
Vermont | 643,085 | 642,893 | 646,972 | 647,064 | 0.6%
Indiana | 6,785,668 | 6,788,799 | 6,813,532 | 6,833,037 | 0.7%
Wyoming | 576,837 | 577,605 | 579,483 | 581,381 | 0.8%
Alabama | 5,024,356 | 5,031,362 | 5,049,846 | 5,074,296 | 1.0%
Washington | 7,705,247 | 7,724,031 | 7,740,745 | 7,785,786 | 1.0%
Arkansas | 3,011,555 | 3,014,195 | 3,028,122 | 3,045,637 | 1.1%
Colorado | 5,773,733 | 5,784,865 | 5,811,297 | 5,839,926 | 1.1%
New Hampshire | 1,377,518 | 1,378,587 | 1,387,505 | 1,395,231 | 1.3%
Oklahoma | 3,959,346 | 3,964,912 | 3,991,225 | 4,019,800 | 1.5%
Maine | 1,362,341 | 1,363,557 | 1,377,238 | 1,385,340 | 1.7%
Georgia | 10,711,937 | 10,729,828 | 10,788,029 | 10,912,876 | 1.9%
Tennessee | 6,910,786 | 6,925,619 | 6,968,351 | 7,051,339 | 2.0%
Nevada | 3,104,624 | 3,115,648 | 3,146,402 | 3,177,772 | 2.4%
North Carolina | 10,439,414 | 10,449,445 | 10,565,885 | 10,698,973 | 2.5%
South Dakota | 886,677 | 887,799 | 896,164 | 909,824 | 2.6%
Delaware | 989,957 | 992,114 | 1,004,807 | 1,018,396 | 2.9%
Arizona | 7,151,507 | 7,179,943 | 7,264,877 | 7,359,197 | 2.9%
Texas | 29,145,428 | 29,232,474 | 29,558,864 | 30,029,572 | 3.0%
South Carolina | 5,118,429 | 5,131,848 | 5,193,266 | 5,282,634 | 3.2%
Florida | 21,538,226 | 21,589,602 | 21,828,069 | 22,244,823 | 3.3%
Utah | 3,271,614 | 3,283,785 | 3,339,113 | 3,380,800 | 3.3%
Montana | 1,084,197 | 1,087,075 | 1,106,227 | 1,122,867 | 3.6%
Idaho | 1,839,092 | 1,849,202 | 1,904,314 | 1,939,033 | 5.4%
Andy Arthur, 12/23/22.
Data Source: US Census Population Estimates.
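
The change column can be reproduced from the first and last population columns. This is a small sketch, assuming the change is measured from the April 2020 count to the July 2022 estimate as the column heading says; the function name percent_change is made up for illustration.

```python
def percent_change(start, end):
    """Percent change from start to end, relative to start."""
    return (end - start) / start * 100


# Figures taken from the New York and Idaho rows of the table.
new_york = round(percent_change(20_201_230, 19_677_151), 1)
idaho = round(percent_change(1_839_092, 1_939_033), 1)

print(new_york, idaho)
```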