R Script for Converting RPTL 1590 Reports into Excel Files

Real Property Tax Law 1590 requires that municipalities post their tax rolls within 10 days of the proposed and final rolls being approved. Below is an R script that converts those PDF reports into a Shapefile for use in a GIS program, or into a CSV file that opens as an Excel spreadsheet. Previously, I posted a PHP version of this script. Both versions can also be found on my GitHub.

This R script uses three R packages: tidyverse, pdftools and sf. It does not require any external dependencies and should work on macOS, Linux or Windows if you have the free RStudio program installed.

You can install the required R packages using: install.packages(c('tidyverse', 'pdftools', 'sf'))

library(pdftools)
library(tidyverse)
library(sf)

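# Read the PDF; pdf_text() returns one character string per page
pdf.text <- pdftools::pdf_text('knox_2022_final_roll.pdf')

# Split each page into parcel records at the rows of asterisks
tax.rolls <- pdf.text %>% 
  str_split('\\*{5,}') %>% 
  map(str_trim)

# Flatten the per-page lists into a single character vector
tax.rolls <- unlist(tax.rolls, recursive = FALSE) 

# Drop short fragments and the page headers
tax.rolls <- 
  tax.rolls[str_length(tax.rolls) > 50 &
          !grepl('STATE OF NEW YORK',
                tax.rolls) ]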

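# Pull each field out of the record text with a regular expression.
# The group= argument of str_extract() needs stringr 1.5.0 or newer.
tax.rolls.formatted <- 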
  tibble(text = tax.rolls) %>%
  transmute(taxid = str_extract(text, '^.*?\\n.(.*?) ', group = 1),
            property.address = str_extract(text, '^(.*?)(\\n| {2,})', group = 1),
            owner = str_extract(text, '(.*?\n){2}(.*?) {2}', group=2),
            # Capture only digits, commas and periods after ACRES
            acres = str_extract(text, 'ACRES *([\\d,.]+)', group = 1) %>%
              parse_number,
            property.type = str_extract(text, '^.*?\\n.*? {2,}(.*?) {2,}', group = 1),
            school.district = str_extract(text, '(.*?\n){2}(.*?) {2,}(.*?) {2,}', group=3),
            deed.book = str_extract(text, 'DEED BOOK(.*?)(\n| {6,})', group=1) %>% str_replace('\\s{2,}',' ') %>% str_trim,
            owner.address = str_extract(text, '(.*?\n){3}(.*?) {2}', group=2),
            owner.address2 = str_extract(text, '(.*?\n){4}(.*?) {2}', group=2),  
            owner.address3 = str_extract(text, '(.*?\n){5}(.*?) {2}', group=2), 
            land.assessment = str_extract(text, '(.*?\n){2}(.*?) {3,}(.*?) {2,}(.*?) {2,}', group=4) %>%
              parse_number,
            total.assessment = str_extract(text, 'TOWN.*?VALUE {4,}(.*?)\n', group=1) %>%
              parse_number,
            full.market.value = str_extract(text, 'FULL.*?MARKET.*?VALUE *?(.*?)(\\n|$)', group=1) %>%
              parse_number,
            town.taxable.value = str_extract(text, 'TOWN.*?TAXABLE.*?VALUE(.*?)\n', group = 1) %>%
              parse_number,
            city.taxable.value = str_extract(text, 'CITY.*?TAXABLE.*?VALUE(.*?)\n', group = 1) %>%
              parse_number,
            county.taxable.value = str_extract(text, 'COUNTY TAXABLE VALUE(.*?)\n', group = 1) %>%
              parse_number,
            school.taxable.value = str_extract(text, 'SCHOOL.*?TAXABLE.*?VALUE(.*?)\n', group = 1) %>%
              parse_number,
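            # str_c() returns NA when the notice is absent, where paste() would give the text "NA NA"
            ag.dist.law = str_c(str_extract(text,
                                      '(MAY BE SUBJECT TO PAYMENT)', group=1),
                                str_extract(text,
                                      '(UNDER .*?) {2,}', group=1),
                                sep = ' '
            ),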
            north =  str_extract(text, 'NRTH-(\\d*)', group=1) %>%
              parse_number %>% replace_na(0),
            east =  str_extract(text, 'EAST-(\\d*)', group=1) %>%
              parse_number  %>% replace_na(0),
            ) %>%
# ny east crs 32015, ny central crs 32016, ny west crs 32017, ny long island crs 32018
  st_as_sf(coords=c('east','north'), crs=32015) 

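# Write a Shapefile for GIS use (on Windows, change /tmp to a folder that exists)
tax.rolls.formatted %>%
  write_sf('/tmp/taxroll.shp')

# Write a CSV file that opens in Excel
tax.rolls.formatted %>%
  write_sf('/tmp/taxroll.csv')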
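
To see what the splitting and field-extraction steps do without a PDF on hand, here is a minimal base-R sketch run on an invented two-parcel snippet. The sample text, the names and the parcel values are all made up for illustration and only loosely imitate the layout of a roll, and the capture() helper is a hypothetical stand-in for str_extract(). Real rolls vary from county to county, so the patterns usually need adjusting.

```r
# Invented sample: two parcel records separated by rows of asterisks.
# This is NOT real tax roll data, only an imitation of the layout.
page <- paste(
  '******************** 12.-3-4 ********************',
  '123 Main St',
  '12.-3-4          210 1 Family Res',
  'Doe Jane         ACRES    1.50',
  'EAST-0601234 NRTH-1391234',
  '******************** 12.-3-5 ********************',
  '125 Main St',
  '12.-3-5          322 Rural vac>10',
  'Roe Richard      ACRES   12.25',
  'EAST-0601300 NRTH-1391300',
  sep = '\n')

# Split at runs of five or more asterisks, then trim whitespace
chunks <- trimws(strsplit(page, '\\*{5,}')[[1]])

# Keep only the long chunks; the short ones are the tax IDs
# that sat between the asterisks
records <- chunks[nchar(chunks) > 50]

# Hypothetical helper: first capture group of `pattern` in each string,
# or NA when there is no match
capture <- function(x, pattern) {
  m <- regmatches(x, regexec(pattern, x, perl = TRUE))
  vapply(m, function(g) if (length(g) >= 2) g[2] else NA_character_,
         character(1))
}

parcels <- data.frame(
  property.address = capture(records, '^(.*?)(\\n| {2,})'),
  acres = as.numeric(capture(records, 'ACRES +([0-9.]+)')),
  east  = as.numeric(capture(records, 'EAST-(\\d+)')),
  north = as.numeric(capture(records, 'NRTH-(\\d+)')),
  stringsAsFactors = FALSE
)

print(parcels)
```

The script above does the same job with str_split() and str_extract() from stringr; base R is used here only so the sketch runs with no packages installed.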

Tailings Pile Guardrail

Along NL Tahawus Road, the shoulder of the road consisted of tailings from the former mining operation. It was remarkably scenic up here.

Taken on Saturday May 21, 2011 at Tahawus.

NPR

The IRS is working on software to allow taxpayers to file online : NPR

The IRS is developing a system that would let taxpayers send electronic returns directly to the government for free, sidestepping commercial options such as TurboTax.

The agency plans a pilot test of the program next year.

Many other countries already offer taxpayers a government-run filing system. But the IRS plan is likely to face stiff opposition from the $14 billion tax-preparation industry.

Tahawus Blast Furnace Ruins

This massive blast furnace was built in 1854, as part of a speculative effort to mine iron up in Tahawus. However, because the railroad never made it up this far, and because of the wilderness conditions, it was abandoned within 3 years of its construction and barely ever used.

Taken on Saturday May 21, 2011 at Tahawus.

Good morning! Happy Pack Rat Day 🐀

Despite the arrival of green-up weeks ago, with the very low humidity and breeze they are talking about Red Flag conditions today. So it's probably best for pack rats everywhere to keep their piles of debris for Friday night bonfires rather than lighting them off today with a shit ton of used motor oil and diesel. That said, the rednecks who do the most burning are probably the biggest pack rats, as a lot of things like metal don't burn well and you never know when you can use that piece of junk.

Cloudy and a chilly 43 degrees in Delmar for the morning walk. ☁ There is a north-northwest breeze at 10 mph. 🍃

Talk about unmotivating weather to get yourself going this morning. I had to get out the jeans 👖 and flannel shirt and vest that I wore this weekend for camping before going out. My windows are closed upstairs. The rest of the week will be warmer but still cool.

I'm thinking this morning it will be quinoa and barley with cherries 🍒 for breakfast. Yesterday was reheated quiche with extra kale and tomatoes and Monday was oatmeal. Always trying to keep things mixed up, both to keep it interesting and delicious 😋 and to have a good mix of vitamins and minerals. With the growing season underway, I'm looking forward to adding more local foods soon too.

Today will start out mostly cloudy 🌥, then gradually become sunny 🌞, with a high of 57 degrees at 5pm. That is 15 degrees below normal, similar to a typical day around April 11th. Northwest wind 10 to 16 mph, with gusts as high as 28 mph. A year ago, we had mostly sunny skies in the morning, remaining cloudy in the afternoon. The high last year was 67 degrees. The record high of 92 was set in 2017.

Got out from work at 6:10 pm last night, 🚍 then had dinner around seven, voted 🗳 and walked 8.2 miles for the day. It really helps that the days are longer now. Sat out back for a while on the tailgate of my truck, as it was a mild evening, before retiring to bed 🛏 at a quarter after eight.

Heading to the dry cleaner this morning on my way into work. 🕴️ I'll have to put it on my card 💳 as I got busy again at work yesterday and forgot to get to the bank 🏦. It's fine. Then I'll get olives and chili 🌶 peppers at the store after work.

Solar noon 🌞 is at 12:53 pm with the sun at an altitude of 66.7° from the due south horizon (-4.2° vs. 6/21). A six foot person will cast a 2.6 foot shadow today compared to 2.2 feet on the first day of summer. The golden hour 🏅 starts at 7:32 pm with the sun in the west-northwest (291°). 📸 The sunset is in the west-northwest (298°) with the sun dropping below the horizon at 8:13 pm after setting for 3 minutes and 15 seconds, with dusk around 8:45 pm, which is one minute and 2 seconds later than yesterday. 🌇 The best time to look at the stars is after 9:26 pm. At sunset, look for clear skies 🌄 and temperatures around 52 degrees. There will be a north-northwest breeze at 10 mph. Today will have 14 hours and 44 minutes of daytime, an increase of 2 minutes over yesterday.

Tonight will be mostly clear 🌃, with a low of 31 degrees at 6am. That is 18 degrees below normal, similar to a typical night around April 1st. North wind 5 to 10 mph becoming light northwest after midnight. In 2022, we had mostly clear skies in the evening, which became partly cloudy by the early hours of the morning. It got down to 46 degrees. The record low of 29 occurred back in 1981.

Lupine Festival looks to be at least somewhat wet. 🦋 Saturday, showers likely, mainly after 2pm. Mostly cloudy, with a high near 68. Chance of precipitation is 70%. Maximum dew point of 54 at 12pm. Sunday, mostly sunny, with a high near 75. Maximum dew point of 50 at 7am. Typical average high for the weekend is 73 degrees.

Looking ahead, there are 5 weeks until summer ⛱️ when the sun will be setting at 8:38 pm with dusk at 9:11 pm. On that day in 2022, we had partly cloudy skies, rain showers and temperatures between 70 and 51 degrees. Typically, the high temperature is 81 degrees. We hit a record high of 97 back in 1938.
