Matt Herman – Getting Census data from multiple states using tidycensus and purrr
R Programming Language
ChatGPT coding assistant for RStudio
I often think about all the interesting and time-saving things I’ve learned in R over the past year and a half
I often think about all the interesting and time-saving things I’ve learned in R over the past year and a half. On Election Night, I wrote a script that used a headless Firefox browser and Selenium to pull election results in real time and process them into a Google Spreadsheet. I was pleasantly surprised by how well it worked, and by how, with a few extra lines of code, I could pipe the results into ggplot and make all kinds of maps.
The New York City gubernatorial maps I posted were made much the same way. I took data from several NYC Election Night Results pages, aggregated it into a data frame, cleaned up and formatted the data, then piped it through to ggplot with some nice styling.
Every day, I try to learn and experiment in new directions and build my skills further. I only become faster and more capable with R the more I use it and the more I learn of the many shortcuts and libraries out there for writing better code, quicker.
How to do Zonal Histograms in R
I couldn’t find documentation on the web on how to efficiently create a Zonal Histogram in R, similar to what ArcGIS and QGIS offer. But after discovering the exactextractr library, which is a fast way of doing Zonal Statistics, I found it can also be used to efficiently generate zonal histograms.
The key is to do the group_by and summarize inside the function passed to exact_extract, so you never load more un-summarized pixels into memory than those of a single zone of the Shapefile. This example uses an NLCD GeoTIFF of New York State, but the same approach works for any other raster dataset where you want to calculate a Zonal Histogram.
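To see what that per-zone summary does in isolation, here is a minimal sketch on a made-up data frame. It assumes the function receives one row per pixel with a value column (the NLCD class) and a coverage_area column (square meters of the pixel inside the zone), which is my understanding of what exactextractr passes when summarize_df and coverage_area are both turned on. The names pixels and zonal_hist, and the numbers, are hypothetical and only there for illustration.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for the pixels of ONE zone (made-up numbers):
# value = NLCD class of the pixel, coverage_area = m^2 of it inside the zone
pixels <- tibble(
  value         = c(11, 41, 41, 82, 82, 82),
  coverage_area = c(900, 900, 450, 900, 900, 225)
)

# Collapse the pixels of a zone to one row per NLCD class
zonal_hist <- function(df) {
  df %>%
    group_by(nlcd_code = as.character(value)) %>%
    summarize(area = sum(coverage_area), .groups = "drop")
}

# Long format: one row per class found in the zone
hist_long <- zonal_hist(pixels)

# Wide format: one column per class, like the ArcGIS / QGIS output
hist_wide <- hist_long %>%
  pivot_wider(names_from = nlcd_code, values_from = area,
              names_glue = "nlcd{nlcd_code}")

print(hist_long)
print(hist_wide)
```

Because only the summarized rows are returned for each zone, the full set of pixels for the whole state never has to sit in memory at once.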
library(sf)
library(tidyverse)
rm(list=ls())
# load raster of NLCD with terra
nlcd <- terra::rast('Documents/GIS.Data/agriculture/nlcd2019_ny.tif')
# load counties in sf using tigris
cos <- tigris::counties('ny')
# exactextractr is a fast way to do zonal statistics and can be used
# for zonal histograms using this code
#
# append_cols = T ## include columns from shapefile to allow rejoining at end
# summarize_df = T ## pass the data frame created by exact_extract to the function
# coverage_area = T ## give the function the covered area of each pixel, not the fraction
# fun = .. ## group the pixels of one zone by NLCD code and total their area
#
# the function below is one way to write the group_by / summarize step
# described above; the pivot further down expects its nlcd_code and area columns
nlcd_co <- exactextractr::exact_extract(nlcd, cos,
  append_cols=T,
  summarize_df=T,
  coverage_area=T,
  fun=function(df) {
    df %>%
      group_by(nlcd_code = as.character(value)) %>%
      summarize(area = sum(coverage_area))
  })
# this next step is optional, it just takes the long format and pivots it wider
# for an output similar to what you would get in ArcGIS or QGIS
# then inner join restores spatial data by joining on matching rows
# also we convert to acres, which is more useful than square meters
# (the area is first tagged as m^2 so that set_units really converts it)
nlcd_co %>%
  mutate(area = area %>% units::set_units('m^2') %>% units::set_units('acres')) %>%
  pivot_wider(names_from = 'nlcd_code', values_from = 'area',
              names_glue = "nlcd{nlcd_code}" ) %>%
  inner_join(cos, .) -> nlcd_co
# Create the Table Shown Below
nlcd_co %>%
  arrange(NAME) %>%
  select(NAME, starts_with('nlcd'), -nlcdNaN) %>%
  mutate(across(starts_with('nlcd'), ~round(.x))) %>%
  st_drop_geometry()
Here is the output, with totals in acres as computed by the code above. Each column is one NLCD class code.
NAME | nlcd11 | nlcd21 | nlcd22 | nlcd23 | nlcd24 | nlcd31 | nlcd41 | nlcd42 | nlcd43 | nlcd52 | nlcd71 | nlcd81 | nlcd82 | nlcd90 | nlcd95 |
Albany | 5,674 | 28,419 | 22,713 | 16,441 | 8,579 | 1,799 | 61,285 | 28,204 | 85,195 | 2,848 | 4,110 | 47,328 | 4,288 | 21,578 | 2,850 |
Allegany | 2,922 | 33,036 | 6,302 | 1,929 | 372 | 525 | 312,236 | 44,757 | 104,295 | 3,294 | 1,453 | 117,192 | 23,491 | 8,107 | 1,936 |
Bronx | 9,535 | 2,254 | 2,491 | 7,235 | 12,599 | 105 | 1,491 | 32 | 97 | 35 | 188 | 24 | 7 | 182 | 441 |
Broome | 5,965 | 27,307 | 15,001 | 9,458 | 3,559 | 1,264 | 157,988 | 18,589 | 115,551 | 3,534 | 1,668 | 82,410 | 6,257 | 6,340 | 2,651 |
Cattaraugus | 8,608 | 35,531 | 10,641 | 4,850 | 1,029 | 1,307 | 415,374 | 33,863 | 142,644 | 4,349 | 2,245 | 121,870 | 34,993 | 23,658 | 5,080 |
Cayuga | 110,144 | 20,540 | 12,288 | 4,087 | 1,352 | 273 | 99,634 | 7,617 | 9,707 | 4,043 | 704 | 62,032 | 169,443 | 47,271 | 3,613 |
Chautauqua | 281,032 | 36,321 | 17,812 | 7,502 | 2,131 | 667 | 269,362 | 19,162 | 81,455 | 2,540 | 2,525 | 130,216 | 62,878 | 40,613 | 5,526 |
Chemung | 1,698 | 15,510 | 9,454 | 4,515 | 1,819 | 552 | 103,572 | 5,059 | 52,799 | 1,704 | 900 | 51,669 | 8,278 | 3,380 | 1,832 |
Chenango | 3,761 | 27,136 | 6,498 | 2,188 | 544 | 487 | 215,263 | 36,235 | 114,218 | 3,490 | 1,716 | 119,688 | 19,917 | 21,168 | 2,812 |
Clinton | 50,352 | 17,782 | 15,114 | 7,343 | 2,883 | 1,778 | 224,491 | 129,229 | 61,857 | 10,957 | 9,519 | 60,944 | 35,983 | 81,798 | 5,012 |
Columbia | 8,196 | 15,288 | 14,371 | 6,332 | 1,513 | 752 | 183,782 | 15,927 | 33,647 | 1,920 | 1,665 | 79,024 | 20,163 | 29,060 | 3,013 |
Cortland | 1,821 | 13,538 | 6,047 | 2,540 | 709 | 347 | 136,075 | 12,173 | 29,889 | 2,016 | 1,351 | 78,709 | 26,070 | 8,070 | 1,615 |
Delaware | 13,158 | 33,367 | 10,567 | 3,347 | 855 | 1,278 | 557,928 | 28,577 | 145,478 | 5,956 | 3,263 | 118,891 | 3,832 | 10,780 | 1,760 |
Dutchess | 15,280 | 34,129 | 30,131 | 19,650 | 5,961 | 1,660 | 263,823 | 5,483 | 15,053 | 2,907 | 3,495 | 73,868 | 10,393 | 42,531 | 3,679 |
Erie | 117,246 | 64,290 | 60,617 | 38,092 | 19,725 | 2,806 | 157,077 | 15,104 | 63,183 | 3,003 | 3,419 | 102,592 | 66,339 | 66,036 | 4,010 |
Essex | 76,576 | 20,083 | 10,820 | 4,888 | 936 | 2,139 | 404,216 | 372,508 | 205,758 | 11,311 | 6,500 | 33,807 | 3,579 | 66,641 | 6,378 |
Franklin | 42,858 | 23,993 | 8,987 | 3,944 | 1,011 | 575 | 432,858 | 218,105 | 75,410 | 13,759 | 9,120 | 60,295 | 27,263 | 160,041 | 8,170 |
Fulton | 23,088 | 11,824 | 6,169 | 2,940 | 858 | 1,007 | 103,409 | 71,262 | 40,344 | 1,514 | 1,421 | 31,101 | 3,647 | 39,545 | 2,912 |
Genesee | 1,763 | 13,830 | 8,565 | 3,213 | 1,290 | 733 | 44,702 | 836 | 6,304 | 920 | 494 | 33,642 | 137,534 | 57,541 | 5,627 |
Greene | 7,146 | 19,733 | 7,389 | 3,312 | 961 | 1,017 | 189,471 | 38,987 | 100,020 | 1,581 | 1,842 | 28,620 | 3,263 | 15,675 | 2,134 |
Hamilton | 57,611 | 7,496 | 2,132 | 942 | 141 | 153 | 623,574 | 202,608 | 144,793 | 7,435 | 2,510 | 784 | 77 | 100,966 | 5,770 |
Herkimer | 29,521 | 21,256 | 8,024 | 4,759 | 1,052 | 772 | 461,155 | 88,983 | 87,687 | 5,495 | 5,177 | 100,766 | 32,355 | 81,290 | 4,831 |
Jefferson | 374,475 | 25,873 | 23,322 | 13,794 | 4,319 | 1,965 | 224,831 | 38,661 | 18,135 | 24,488 | 16,922 | 227,115 | 79,861 | 99,267 | 15,657 |
Kings | 17,060 | 1,138 | 2,124 | 8,788 | 29,713 | 303 | 852 | 10 | 19 | 325 | 687 | 28 | 8 | 101 | 828 |
Lewis | 10,796 | 22,200 | 5,028 | 2,102 | 434 | 543 | 362,686 | 112,380 | 30,160 | 10,240 | 9,196 | 75,506 | 47,355 | 127,488 | 9,443 |
Livingston | 6,648 | 20,811 | 8,554 | 2,819 | 830 | 547 | 110,777 | 6,945 | 24,135 | 3,241 | 1,031 | 52,942 | 153,519 | 13,796 | 3,168 |
Madison | 4,556 | 22,445 | 7,321 | 3,142 | 817 | 312 | 144,056 | 18,944 | 22,322 | 4,764 | 4,561 | 93,604 | 57,361 | 36,425 | 2,599 |
Monroe | 453,560 | 55,584 | 52,799 | 27,211 | 11,795 | 1,081 | 64,300 | 819 | 17,612 | 1,351 | 1,307 | 56,089 | 85,105 | 41,095 | 4,980 |
Montgomery | 3,549 | 11,690 | 7,207 | 4,130 | 1,322 | 705 | 35,552 | 13,325 | 32,928 | 1,125 | 1,624 | 99,312 | 30,616 | 16,982 | 2,523 |
Nassau | 105,059 | 31,939 | 34,548 | 62,920 | 21,252 | 2,406 | 9,327 | 440 | 8,779 | 251 | 585 | 649 | 31 | 1,420 | 10,362 |
New York | 6,617 | 990 | 917 | 2,902 | 9,819 | 23 | 249 | 36 | 11 | 5 | 31 | 8 | NA | 6 | 52 |
Niagara | 394,458 | 22,194 | 20,559 | 9,522 | 4,672 | 1,427 | 47,882 | 150 | 4,848 | 838 | 1,164 | 48,225 | 117,935 | 50,628 | 3,297 |
Oneida | 29,517 | 43,328 | 19,153 | 10,967 | 4,277 | 477 | 277,272 | 86,674 | 32,966 | 9,391 | 9,497 | 118,645 | 74,731 | 83,922 | 4,071 |
Onondaga | 17,605 | 39,680 | 40,707 | 22,036 | 10,109 | 2,214 | 137,384 | 9,466 | 10,888 | 7,316 | 2,521 | 74,274 | 89,651 | 48,348 | 3,394 |
Ontario | 12,339 | 25,414 | 12,607 | 5,148 | 1,779 | 971 | 112,382 | 3,729 | 19,066 | 2,319 | 993 | 59,918 | 140,000 | 25,811 | 1,557 |
Orange | 13,944 | 45,348 | 28,793 | 18,392 | 7,227 | 1,002 | 244,190 | 2,698 | 20,083 | 3,624 | 4,144 | 71,758 | 14,476 | 54,195 | 6,692 |
Orleans | 273,267 | 10,695 | 6,014 | 1,746 | 569 | 302 | 37,554 | 175 | 4,358 | 344 | 366 | 23,614 | 120,057 | 42,045 | 1,965 |
Oswego | 229,605 | 30,684 | 12,260 | 5,016 | 1,798 | 621 | 260,545 | 54,536 | 20,811 | 8,821 | 3,192 | 74,138 | 27,487 | 103,286 | 6,918 |
Otsego | 9,682 | 29,240 | 9,613 | 3,622 | 853 | 162 | 252,720 | 36,400 | 90,010 | 2,497 | 2,343 | 143,280 | 23,166 | 42,287 | 4,208 |
Putnam | 9,092 | 15,026 | 8,917 | 4,898 | 1,065 | 205 | 99,745 | 577 | 2,929 | 515 | 725 | 3,191 | 57 | 9,833 | 686 |
Queens | 45,785 | 2,461 | 6,260 | 23,052 | 33,126 | 1,025 | 1,288 | 2 | 33 | 143 | 248 | 24 | 84 | 440 | 1,773 |
Rensselaer | 7,549 | 18,263 | 16,366 | 12,655 | 4,196 | 1,122 | 114,784 | 28,503 | 115,602 | 2,878 | 3,183 | 56,235 | 17,933 | 23,445 | 2,974 |
Richmond | 28,777 | 2,943 | 4,920 | 13,956 | 6,033 | 334 | 3,623 | 5 | 20 | 250 | 959 | 94 | 37 | 2,177 | 1,617 |
Rockland | 16,177 | 22,066 | 17,801 | 11,792 | 3,459 | 546 | 46,216 | 117 | 1,021 | 572 | 528 | 1,219 | 1 | 5,097 | 825 |
Saratoga | 20,281 | 34,665 | 25,224 | 11,373 | 3,691 | 1,683 | 124,412 | 95,691 | 70,869 | 4,293 | 3,917 | 48,487 | 16,585 | 73,499 | 5,367 |
Schenectady | 2,029 | 12,619 | 10,388 | 7,203 | 3,004 | 291 | 19,334 | 11,924 | 33,390 | 994 | 1,594 | 19,559 | 1,473 | 9,040 | 1,203 |
Schoharie | 3,183 | 17,437 | 5,866 | 2,485 | 578 | 720 | 106,951 | 47,616 | 102,921 | 3,022 | 4,096 | 79,500 | 10,258 | 13,967 | 2,249 |
Schuyler | 8,929 | 10,228 | 3,615 | 1,038 | 283 | 179 | 73,356 | 5,368 | 39,766 | 1,750 | 665 | 50,810 | 15,170 | 5,884 | 2,055 |
Seneca | 41,512 | 11,793 | 6,673 | 2,275 | 722 | 433 | 30,630 | 909 | 4,834 | 2,284 | 1,200 | 27,001 | 97,112 | 19,363 | 3,128 |
St. Lawrence | 90,958 | 52,272 | 15,441 | 8,384 | 2,261 | 2,302 | 745,095 | 162,469 | 76,527 | 30,757 | 18,545 | 198,021 | 59,287 | 315,417 | 27,830 |
Steuben | 9,191 | 47,919 | 12,454 | 5,207 | 1,376 | 947 | 316,912 | 29,103 | 169,969 | 4,300 | 2,911 | 202,196 | 77,082 | 13,429 | 5,302 |
Suffolk | 933,426 | 118,182 | 120,902 | 82,108 | 25,586 | 13,298 | 87,783 | 32,704 | 25,969 | 2,132 | 7,974 | 23,980 | 11,567 | 10,883 | 21,549 |
Sullivan | 15,829 | 33,171 | 10,173 | 4,534 | 1,207 | 1,047 | 277,449 | 42,717 | 174,988 | 4,031 | 3,426 | 38,832 | 781 | 26,318 | 3,181 |
Tioga | 2,876 | 17,685 | 5,896 | 2,088 | 701 | 776 | 105,110 | 13,393 | 82,193 | 2,296 | 1,126 | 81,968 | 11,712 | 4,942 | 1,813 |
Tompkins | 9,988 | 16,485 | 9,337 | 3,692 | 1,184 | 505 | 87,930 | 7,489 | 50,782 | 2,311 | 844 | 75,221 | 31,921 | 14,703 | 2,197 |
Ulster | 21,680 | 39,266 | 16,272 | 7,671 | 2,574 | 1,460 | 357,131 | 28,665 | 156,829 | 2,756 | 2,577 | 47,469 | 7,220 | 48,681 | 2,632 |
Warren | 40,453 | 18,693 | 9,712 | 5,030 | 1,629 | 928 | 214,894 | 165,465 | 88,330 | 3,707 | 3,551 | 6,794 | 332 | 32,997 | 3,708 |
Washington | 8,215 | 15,235 | 12,199 | 6,133 | 1,386 | 1,207 | 176,808 | 81,087 | 41,830 | 3,317 | 4,488 | 109,332 | 40,799 | 31,921 | 7,081 |
Wayne | 499,860 | 23,991 | 8,903 | 3,132 | 1,123 | 459 | 97,481 | 2,060 | 14,076 | 1,373 | 889 | 53,483 | 115,038 | 57,967 | 5,317 |
Westchester | 42,324 | 61,073 | 38,771 | 28,337 | 11,463 | 327 | 107,659 | 1,581 | 10,136 | 774 | 1,422 | 5,875 | 35 | 9,291 | 803 |
Wyoming | 2,672 | 18,385 | 4,805 | 1,624 | 472 | 65 | 109,573 | 8,937 | 28,627 | 2,248 | 742 | 37,180 | 139,719 | 23,862 | 2,711 |
Yates | 24,396 | 12,685 | 3,647 | 1,101 | 245 | 115 | 65,509 | 4,243 | 18,514 | 2,190 | 538 | 25,781 | 73,002 | 7,461 | 1,069 |
A Year with R
A year ago I stumbled upon the R programming language, mostly by accident on YouTube. I wanted a better platform for making graphics and maps and was running up against a lot of limitations in Python with matplotlib. Matplotlib is powerful, but it often requires a lot of explicit code to make elegant, well-thought-out graphics.
R has proven to be a very worthwhile skill to learn. While I consider myself to be a fairly experienced Python programmer, R has proven a lot more valuable, especially when it comes to making good, basic, attractive maps in SVG files. Simply said, R's defaults with ggplot just make sense and are attractive. The pipe mechanism in R, based around magrittr, is fantastic for complicated data wrangling in a single line of code. Pipes are a wonderful thing in Unix and they make a lot of sense for processing data.
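As a small illustration of that pipe style, here is a sketch using the mtcars data set that ships with R and the dplyr verbs. The choice of columns and the name result are arbitrary and only for the example. Each step hands its output to the next, much like a Unix pipeline.

```r
library(dplyr)

# Filter, group, summarize and sort in one pipeline;
# %>% passes the left-hand result as the first argument of the next call
result <- mtcars %>%
  filter(cyl == 4) %>%                            # keep four-cylinder cars
  group_by(gear) %>%                              # one group per gear count
  summarize(mean_mpg = mean(mpg), n = n()) %>%    # average mpg and car count
  arrange(desc(mean_mpg))                         # best mileage first

print(result)
```

Written without pipes, the same work would be a nest of function calls read from the inside out, which is harder to follow once there are more than two or three steps.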
R is a weird language to get the hang of at first. It’s not necessarily bad – it’s actually pretty awesome for manipulating data with pipes. But it is different, with strange operators and syntax, and it is based around 1 indexing rather than the 0 indexing of Python and most C-derived languages. But I’ve really gotten the hang of it by doing a lot of reading and watching videos on R and just digging through the commands, reading help files and even the raw R code on objects.
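A quick sketch of the indexing and operator differences, using only base R (the variable names are made up for the example):

```r
# Assignment is usually written with <- rather than =
x <- c(10, 20, 30)

first         <- x[1]          # the first element lives at index 1, not 0
nothing       <- x[0]          # index 0 is not an error, it returns an empty vector
last          <- x[length(x)]  # the last element is at index length(x)
without_first <- x[-1]         # a negative index drops that element

print(first)
print(length(nothing))
print(last)
print(without_first)
```

The negative index is the one that tends to surprise Python users most: in Python, -1 means the last element, while in R it means everything except the first.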
R really excels at automating GIS processes and being a one stop shop from extract, transform, load to render. Interestingly, outside of academia it seems like R doesn’t get the credit it deserves – especially with Census data and tidycensus, it’s a one stop shop from obtaining data to manipulating it to rendering it on a map, often with just a single pipeline of code. It’s pretty neat.
I’m glad I taught myself R and it’s a technology I will probably continue to use daily for exploring my world.
I made the DEC Firetower lighting code entirely in the R programming language, directly querying the DEC ArcGIS servers
I made the DEC Firetower lighting code entirely in the R programming language, directly querying the DEC ArcGIS servers. You can get it here: https://github.com/AndyArthur/r_maps_and_graphs/blob/main/DEC%20Firetowers%20Lit%20Up.R