PANDAS – Your source for unemployment statistics!
Another use for PANDAS is to get the latest local area unemployment statistics. With the RemoteZip library, you can download only the CSV files you actually need and skip the statewide and metropolitan-region numbers, which I know I have never used. This will give you 160 jurisdictions to look at: 62 counties and 98 towns, cities and villages in New York State whose population is greater than 25,000.
import pandas as pd
# by using RemoteZip (pip install remotezip) this speeds
# up downloads by only downloading the files in the zip file
# that we actually need from DOL
from remotezip import RemoteZip
dolzip='https://dol.ny.gov/statistics-lauszip'
# download & load only cities and counties
with RemoteZip(dolzip) as rz:
    df = pd.read_csv(rz.extract('laus_counties.txt'))
    # DataFrame.append was removed in pandas 2.0, so use pd.concat instead
    df = pd.concat([df, pd.read_csv(rz.extract('laus_cities.txt'))], ignore_index=True)
# get rid of double quotes in column names
df.columns = df.columns.str.replace('\"','')
# get rid of spaces in column names
df.columns=df.columns.str.replace(' ','')
# convert year and month fields to datetime; coerce turns the yearly-average rows into NaT
df['DATETIME']=pd.to_datetime({'year': df['YEAR'], 'month': df['MONTH'],'day': 1}, errors='coerce')
# drop yearly averages, as their DATETIME is NaT
df=df.dropna(subset=['DATETIME'])
# Convert City/Town to Census Style for joining against
# NAMELSAD20 in TIGER/Line Shapefiles (optional)
df['AREA']=df['AREA'].str.replace('City','city')
df['AREA']=df['AREA'].str.replace('Town','town')
df['AREA']=df['AREA'].str.replace('Village','village')
df['AREA']=df['AREA'].str.replace(' Ny','')
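To see what the date step does without downloading anything, here is a minimal sketch on a tiny hand-made frame. The rows are invented, and the month code 13 for the annual-average row is an assumption for illustration; I have not confirmed what value the DOL file actually uses.

```python
import pandas as pd

# Invented sample rows; the month code used for annual averages (13 here)
# is an assumption, not taken from the real DOL file.
sample = pd.DataFrame({
    'AREA': ['Albany County', 'Albany County', 'Albany County'],
    'YEAR': [2021, 2021, 2021],
    'MONTH': [1, 2, 13],
    'UNEMPRATE': [6.1, 5.9, 4.8],
})

# Assemble a date from year/month parts; an impossible month becomes NaT
sample['DATETIME'] = pd.to_datetime(
    {'year': sample['YEAR'], 'month': sample['MONTH'], 'day': 1},
    errors='coerce',
)

# Rows without a valid date (the annual average) are dropped
monthly = sample.dropna(subset=['DATETIME'])
print(monthly)
```

Only the two real months survive, each stamped with the first day of its month.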
Create a quick pivot table of county unemployment rates for the past two years.
df[df['AREA'].str.contains('County') & (df['YEAR'] > 2019)].pivot(index='DATETIME',columns='AREA',values='UNEMPRATE')
Or unemployment stats for the past year for all 160 jurisdictions, rotated so dates are up along the top.
df.pivot(index='DATETIME',columns='AREA',values='UNEMPRATE').tail(12).T
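If the long-to-wide reshaping is hard to picture, the sketch below runs the same pivot, tail and transpose on a few invented rows. The area names and rates are made up for illustration and are not real DOL figures.

```python
import pandas as pd

# Invented long-format rows standing in for the cleaned DOL data
long_df = pd.DataFrame({
    'AREA': ['Albany County', 'Albany County', 'Albany County',
             'Troy city', 'Troy city', 'Troy city'],
    'DATETIME': pd.to_datetime(['2021-01-01', '2021-02-01', '2021-03-01'] * 2),
    'UNEMPRATE': [6.1, 5.9, 5.5, 7.2, 7.0, 6.4],
})

# One row per month, one column per area
wide = long_df.pivot(index='DATETIME', columns='AREA', values='UNEMPRATE')

# Keep the two most recent months, then transpose so areas become rows
latest = wide.tail(2).T
print(latest)
```

One thing to keep in mind: pivot raises a ValueError when the same DATETIME and AREA pair appears more than once, so every jurisdiction needs a distinct AREA name.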
Calculate the yearly average unemployment rate for each jurisdiction, going back to 1990.
# select UNEMPRATE before mean() so non-numeric columns are not averaged
df.groupby(['YEAR','AREA'])['UNEMPRATE'].mean().unstack()
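Here is a small sketch of that yearly average on invented numbers, chosen so the result is easy to check by hand. The frame and its values are assumptions for illustration only.

```python
import pandas as pd

# Invented monthly rates: two observations per area per year
monthly = pd.DataFrame({
    'AREA': ['Albany County'] * 4 + ['Troy city'] * 4,
    'YEAR': [2020, 2020, 2021, 2021] * 2,
    'UNEMPRATE': [4.0, 6.0, 5.0, 7.0, 8.0, 10.0, 6.0, 8.0],
})

# Average the monthly rates per year and area, then spread areas into columns
yearly = monthly.groupby(['YEAR', 'AREA'])['UNEMPRATE'].mean().unstack()
print(yearly)
```

Note that this is a plain mean of the monthly rates, so it may not match exactly the annual averages that DOL publishes and that were dropped earlier.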