PANDAS – Your source for unemployment statistics!

Another use for PANDAS is pulling the latest local area unemployment statistics. With the remotezip library, you can download only the CSV files you actually need and skip the statewide and metropolitan-region numbers, which I know I haven't ever used. This gives you 160 jurisdictions to look at: 62 counties plus the 98 towns, cities and villages in New York State whose population is greater than 25,000.
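To get a feel for picking individual files out of an archive, here is a small offline sketch using only Python's built-in zipfile module. As far as I know RemoteZip builds on zipfile.ZipFile, so the same namelist() and read() calls should carry over, but treat that as an assumption. The file contents are made up, and laus_state.txt is a hypothetical name standing in for the files you would skip.

```python
import io
import zipfile

# Build a small in-memory archive standing in for the DOL zip file.
# The contents are invented; 'laus_state.txt' is a hypothetical file name.
buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as zf:
    zf.writestr('laus_counties.txt', 'AREA,UNEMPRATE\nAlbany County,3.1\n')
    zf.writestr('laus_cities.txt', 'AREA,UNEMPRATE\nAlbany City,4.0\n')
    zf.writestr('laus_state.txt', 'AREA,UNEMPRATE\nNew York State,4.5\n')

# Only these two members get read; the third is never touched.
wanted = ['laus_counties.txt', 'laus_cities.txt']

with zipfile.ZipFile(buf) as zf:
    available = zf.namelist()
    texts = {name: zf.read(name).decode('utf-8') for name in wanted}

print(available)
print(sorted(texts))
```

With a local archive this saves nothing, of course. The payoff comes over HTTP, where the unread members never have to be downloaded.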

import pandas as pd

# RemoteZip (pip install remotezip) speeds up the download
# by fetching only the files in the zip file
# that we actually need from DOL
from remotezip import RemoteZip

# dolzip must hold the URL of the DOL zip archive

# download & load only cities and counties
with RemoteZip(dolzip) as rz:
    # extract() saves the file to the working directory and returns its path
    # pd.concat replaces DataFrame.append, which was removed in pandas 2.0
    df = pd.concat([
        pd.read_csv(rz.extract('laus_counties.txt')),
        pd.read_csv(rz.extract('laus_cities.txt')),
    ], ignore_index=True)

# get rid of double quotes in column names
df.columns = df.columns.str.replace('"', '', regex=False)

# get rid of spaces in column names
df.columns = df.columns.str.replace(' ', '', regex=False)

# convert year and month field to datetime, coerce makes the column NaT for yearly averages
df['DATETIME'] = pd.to_datetime({'year': df['YEAR'], 'month': df['MONTH'], 'day': 1}, errors='coerce')

# drop yearly averages, as they are NaT
df = df.dropna(subset=['DATETIME'])

# Convert City/Town to Census Style for joining against
# NAMELSAD20 in TIGER/Line Shapefiles (optional)
df['AREA'] = df['AREA'].str.replace('City', 'city')
df['AREA'] = df['AREA'].str.replace('Town', 'town')
df['AREA'] = df['AREA'].str.replace('Village', 'village')
df['AREA'] = df['AREA'].str.replace(' Ny', '')
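If the coerce step looks a bit magic, here is a tiny sketch of it on made-up rows. I am assuming an annual-average row shows up with a month value that is not a real month; the 13 below is just a stand-in for whatever code the DOL file actually uses.

```python
import pandas as pd

# Toy rows with invented numbers; MONTH 13 is a made-up stand-in
# for an annual-average row.
toy = pd.DataFrame({
    'AREA': ['Albany County'] * 3,
    'YEAR': [2020, 2020, 2020],
    'MONTH': [1, 2, 13],
    'UNEMPRATE': [4.0, 3.9, 8.0],
})

# An impossible month cannot be turned into a date, so errors='coerce'
# yields NaT for that row instead of raising.
toy['DATETIME'] = pd.to_datetime(
    {'year': toy['YEAR'], 'month': toy['MONTH'], 'day': 1}, errors='coerce')

# Dropping the NaT rows leaves only the true monthly observations.
monthly = toy.dropna(subset=['DATETIME'])
print(monthly)
```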

Create a quick pivot table of county unemployment rates from 2020 onward.

df[(df['AREA'].str.contains('County')) & (df['YEAR'] > 2019)].pivot(index='DATETIME', columns='AREA', values='UNEMPRATE')
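Here is the same filter-then-pivot idea on a handful of invented rows, split into named steps so it is easier to see what each part does. The column names follow the ones used in this post; the numbers are made up.

```python
import pandas as pd

# Invented rows: two counties, one city, and one pre-2020 observation.
toy = pd.DataFrame({
    'AREA': ['Albany County', 'Albany County', 'Erie County', 'Erie County',
             'Albany city', 'Albany County'],
    'YEAR': [2020, 2020, 2020, 2020, 2020, 2019],
    'DATETIME': pd.to_datetime(['2020-01-01', '2020-02-01', '2020-01-01',
                                '2020-02-01', '2020-01-01', '2019-12-01']),
    'UNEMPRATE': [4.0, 3.9, 5.1, 5.0, 4.4, 3.5],
})

is_county = toy['AREA'].str.contains('County')  # drops the city row
recent = toy['YEAR'] > 2019                     # drops the 2019 row

# One row per month, one column per county.
wide = toy[is_county & recent].pivot(
    index='DATETIME', columns='AREA', values='UNEMPRATE')
print(wide)
```

One thing to keep in mind: pivot raises a ValueError if the same date and area pair appears more than once, so duplicates need to be dealt with first.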

Or unemployment stats for the last 12 months for all 160 jurisdictions, transposed so the dates run along the top.

df.pivot(index='DATETIME', columns='AREA', values='UNEMPRATE').tail(12).T
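A quick sketch of what tail(12) and .T do here, using two made-up areas with 24 months of invented values. The pivoted index is sorted by date, so tail(12) keeps the newest 12 months, and .T flips the table so each area becomes a row.

```python
import pandas as pd

# 24 month-start dates, repeated for two areas; values are invented.
months = pd.date_range('2020-01-01', periods=24, freq='MS')
toy = pd.DataFrame({
    'DATETIME': list(months) * 2,
    'AREA': ['Albany County'] * 24 + ['Erie County'] * 24,
    'UNEMPRATE': [float(i) for i in range(24)]
                 + [float(i) + 0.5 for i in range(24)],
})

# Pivot to dates x areas, keep the last 12 months, then transpose
# so the areas run down the side and the dates along the top.
last_year = toy.pivot(
    index='DATETIME', columns='AREA', values='UNEMPRATE').tail(12).T
print(last_year.shape)
```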

Calculate the yearly average unemployment rate for each jurisdiction, going back to 1990.

df.groupby(by=['YEAR', 'AREA'])['UNEMPRATE'].mean().unstack()
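And a small sketch of the yearly average on invented numbers. Selecting UNEMPRATE before calling mean() keeps pandas from trying to average text columns, which newer pandas versions refuse to do.

```python
import pandas as pd

# Invented monthly rates: two years for one county, one year for another.
toy = pd.DataFrame({
    'YEAR': [2020, 2020, 2021, 2021, 2020, 2020],
    'AREA': ['Albany County'] * 4 + ['Erie County'] * 2,
    'UNEMPRATE': [4.0, 6.0, 3.0, 5.0, 7.0, 9.0],
})

# Average within each (year, area) pair, then spread the areas
# out into columns so each row is a year.
yearly = toy.groupby(['YEAR', 'AREA'])['UNEMPRATE'].mean().unstack()
print(yearly)
```

Areas with no rows for a given year come out as NaN in the unstacked table, as Erie County does for 2021 here.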
