PANDAS – Your source for unemployment statistics !

Another use for PANDAS is to get the latest local area unemployment statistics. With the remotezip library, you can download only the CSV files you actually need from the zip archive and skip the statewide and metropolitan-region numbers, which I know I have never used. This gives you 160 jurisdictions to look at: the 62 counties plus the 98 towns, cities and villages in New York State with a population greater than 25,000.

import pandas as pd

# by using RemoteZip (pip install remotezip) this speeds
# up downloads by only downloading the files in the zip file
# that we actually need from DOL
from remotezip import RemoteZip

dolzip='https://dol.ny.gov/statistics-lauszip'

# download & load only cities and counties 
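with RemoteZip(dolzip) as rz:
    # DataFrame.append was removed in pandas 2.0, so combine the two files with pd.concat
    df=pd.concat([
        pd.read_csv(rz.extract('laus_counties.txt')),
        pd.read_csv(rz.extract('laus_cities.txt')),
    ], ignore_index=True)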

# get rid of double quotes in column names
df.columns = df.columns.str.replace('\"','')

# get rid of spaces in column names
df.columns=df.columns.str.replace(' ','')

# convert year and month fields to datetime; errors='coerce' sets yearly-average rows to NaT
df['DATETIME']=pd.to_datetime({'year': df['YEAR'], 'month': df['MONTH'],'day': 1}, errors='coerce')

# drop yearly averages, as their DATETIME is NaT
df=df.dropna(subset=['DATETIME'])

# Convert City/Town to Census Style for joining against 
# NAMELSAD20 in TIGER/Line Shapefiles (optional)
df['AREA']=df['AREA'].str.replace('City','city')
df['AREA']=df['AREA'].str.replace('Town','town')
df['AREA']=df['AREA'].str.replace('Village','village')
df['AREA']=df['AREA'].str.replace(' Ny','')
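
To see what the datetime step does without downloading anything, here is a minimal sketch on a made-up three-row frame. The column names match the ones used above, but the values, and the use of month 13 to mark an annual-average row, are assumptions for illustration and are not taken from the DOL files.

```python
import pandas as pd

# Made-up rows; MONTH 13 stands in for an annual-average row (an assumption)
sample = pd.DataFrame({
    'AREA': ['Albany County', 'Albany County', 'Albany County'],
    'YEAR': [2021, 2021, 2021],
    'MONTH': [1, 2, 13],
    'UNEMPRATE': [6.1, 5.9, 4.6],
})

# Month 13 is not a valid date, so errors='coerce' turns that row into NaT
sample['DATETIME'] = pd.to_datetime(
    {'year': sample['YEAR'], 'month': sample['MONTH'], 'day': 1},
    errors='coerce',
)

# Dropping the NaT rows leaves only the monthly observations
monthly = sample.dropna(subset=['DATETIME'])
print(monthly)
```

Whatever the real files use to flag annual averages, any year and month combination that cannot form a valid date should be handled the same way.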

Create a quick pivot table of county unemployment rates from 2020 onward.

df[(df['AREA'].str.contains('County')) & (df['YEAR'] > 2019)].pivot(index='DATETIME',columns='AREA',values='UNEMPRATE')

Or unemployment stats for the last 12 months for all 160 jurisdictions, transposed so the dates run along the top.

df.pivot(index='DATETIME',columns='AREA',values='UNEMPRATE').tail(12).T
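
If the reshaping is hard to picture, this small sketch runs the same pivot, tail and transpose chain on invented data. The two area names and all of the rates are illustrative only, and `long_df`, `wide` and `recent` are names made up for this example.

```python
import pandas as pd

# Made-up long-format data: two areas, three months each (values are illustrative)
dates = pd.to_datetime(['2021-01-01', '2021-02-01', '2021-03-01'])
long_df = pd.DataFrame({
    'DATETIME': list(dates) * 2,
    'AREA': ['Albany County'] * 3 + ['Troy city'] * 3,
    'UNEMPRATE': [6.1, 5.9, 5.5, 7.0, 6.8, 6.4],
})

# One row per date, one column per area
wide = long_df.pivot(index='DATETIME', columns='AREA', values='UNEMPRATE')

# Keep the last two dates, then transpose so areas become rows and dates become columns
recent = wide.tail(2).T
print(recent)
```

Note that `pivot` raises an error if any DATETIME and AREA pair appears more than once, so it only works on data with one row per area per month.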

Calculate the yearly average unemployment rate for each jurisdiction, going back to 1990.

df.groupby(by=['YEAR','AREA'])['UNEMPRATE'].mean().unstack()
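
As a rough check of the yearly average, here is the same groupby on a tiny invented frame where the means are easy to work out by hand. The data and the `rates` and `yearly` names are made up for this sketch. Selecting UNEMPRATE before calling `mean` keeps pandas from trying to average non-numeric columns.

```python
import pandas as pd

# Made-up data: two areas, two observations per year (values are illustrative)
rates = pd.DataFrame({
    'YEAR': [2020, 2020, 2021, 2021] * 2,
    'AREA': ['Albany County'] * 4 + ['Troy city'] * 4,
    'UNEMPRATE': [4.0, 8.0, 6.0, 5.0, 5.0, 9.0, 7.0, 6.0],
})

# Average the rate within each year and area, then spread areas across columns
yearly = rates.groupby(['YEAR', 'AREA'])['UNEMPRATE'].mean().unstack()
print(yearly)
```

This is a plain, unweighted mean of whatever monthly rows are present, so a partial year is averaged over fewer months.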
