R is a programming language and free software environment for statistical computing and graphics supported by the R Core Team and the R Foundation for Statistical Computing. It is widely used among statisticians and data miners for developing statistical software and data analysis.
I often think about all the interesting — and time-saving — things I've learned in R over the past year and a half. On Election Night, I wrote a script that used a headless Firefox browser and Selenium to pull election results in real time and process them into a Google Spreadsheet. I was pleasantly surprised how well it worked, and how with a few extra lines of code I could pipe the data into ggplot and make all kinds of maps.
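A minimal sketch of that kind of pipeline, assuming the RSelenium, rvest, and googlesheets4 packages; the URL, sheet ID, and the assumption that results live in the first HTML table are all placeholders, not the actual script:

```r
library(RSelenium)
library(rvest)
library(googlesheets4)

# start a headless Firefox session
driver <- rsDriver(browser = "firefox",
                   extraCapabilities = list(
                     "moz:firefoxOptions" = list(args = list("--headless"))))
client <- driver$client

# load the results page and parse the rendered HTML
client$navigate("https://example.com/election-results")  # placeholder URL
page <- read_html(client$getPageSource()[[1]])

# assumes the results are in the first HTML table on the page
results <- page |>
  html_element("table") |>
  html_table()

# push the parsed table to a Google Sheet (placeholder sheet ID)
sheet_write(results, ss = "YOUR-SHEET-ID", sheet = "results")
client$close()
```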
The New York City gubernatorial maps I posted were made much the same way. I took data from several NYC Election Night results pages, aggregated it into a data frame, cleaned up and formatted the data, then piped it through to ggplot with some nice styling.
Every day, I try to learn and experiment in new directions and build my skills further. I only become faster and more capable with R the more I use it and learn the many shortcuts and libraries out there for writing better code, quicker.
I couldn't find documentation on the web on how to efficiently create a Zonal Histogram in R, similar to the tools in ArcGIS and QGIS. But after discovering the exactextractr library, which is a fast method of doing Zonal Statistics, I found it can also be used to efficiently generate zonal histograms.
The key is to do the group_by and summarize inside of the function passed to exact_extract, so you never load more un-summarized pixels into memory than a single zone of the Shapefile covers. This example is shown with an NLCD GeoTIFF of New York State, but it could be used for any other raster dataset where you want to calculate a Zonal Histogram.
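A sketch of the technique, assuming the exactextractr, raster, sf, and dplyr packages; the file names are placeholders. The summarizing function runs once per zone, so only one zone's pixels are in memory at a time:

```r
library(exactextractr)
library(raster)
library(sf)
library(dplyr)

nlcd  <- raster("nlcd_nys.tif")   # placeholder: NLCD land-cover GeoTIFF
zones <- st_read("zones.shp")     # placeholder: polygon zones

# exact_extract calls this function once per polygon, passing that zone's
# pixel values and the fraction of each pixel covered by the polygon
histo <- exact_extract(nlcd, zones, function(value, coverage_fraction) {
  data.frame(value = value, coverage = coverage_fraction) |>
    group_by(value) |>                       # one row per land-cover class
    summarize(cells = sum(coverage),         # coverage-weighted pixel count
              .groups = "drop")
})
```

The result is a list with one small summarized data frame per zone, which can then be bound together and reshaped into a wide zonal-histogram table.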
A year ago I stumbled upon the R programming language, mostly by accident on YouTube. I wanted a better platform for making graphics and maps and was running up against a lot of limitations in Python with matplotlib. Matplotlib is powerful, but it often requires a lot of explicit code to make elegant, well-thought-out graphics.
R has proven to be a very worthwhile skill to learn. While I consider myself to be a fairly experienced Python programmer, R has proven a lot more valuable, especially when it comes to making good basic, attractive maps in SVG files. Simply said, R defaults with ggplot just make sense and are attractive. The pipe mechanism in R, based around magrittr, is fantastic for complicated data wrangling in a single pipeline of code. Pipes are a wonderful thing in Unix, and they make a lot of sense for processing data.
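A small illustration of that pipe-based wrangling style, using dplyr and the built-in mtcars dataset:

```r
library(dplyr)

# filter, group, and summarize in one readable pipeline
mtcars %>%
  filter(cyl == 6) %>%
  group_by(gear) %>%
  summarize(avg_mpg = mean(mpg), .groups = "drop")
```

Each step reads left to right, the way the data actually flows, much like chaining Unix commands with `|`.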
R is a weird language to get the hang of at first. It's not necessarily bad — it's actually pretty awesome for manipulating data with pipes. But it is different, with strange operators and syntax, based around 1-indexing rather than the 0-indexing of most C-derived languages like Python. But I've really gotten the hang of it by doing a lot of reading and watching videos on R and just digging through the commands, reading help files and even the raw R code of objects.
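Two of those differences in one tiny example: indexing starts at 1, and a negative index drops elements rather than counting from the end as it does in Python:

```r
x <- c("a", "b", "c")
x[1]    # the first element: "a" (x[0] would be an empty vector, not "a")
x[-1]   # everything except the first element: "b" "c"
```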
R really excels at automating GIS processes and being a one-stop shop from extract, transform, load through to render. Interestingly, outside of academia it seems like R doesn't get the credit it deserves — especially with Census data and tidycensus, it's a one-stop shop from obtaining data to manipulating it to rendering it on a map, often with just a single pipeline of code. It's pretty neat.
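A hedged sketch of such a single pipeline with tidycensus and ggplot2 (it assumes a Census API key has been set with `census_api_key()`; the variable code is the ACS median household income estimate):

```r
library(tidycensus)
library(ggplot2)

# download ACS data with geometry and map it, all in one pipeline
get_acs(geography = "county",
        variables = "B19013_001",   # median household income
        state     = "NY",
        geometry  = TRUE) |>
  ggplot(aes(fill = estimate)) +
  geom_sf(color = NA) +
  scale_fill_viridis_c(labels = scales::dollar) +
  theme_void()
```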
I'm glad I taught myself R, and it's a technology I will probably continue to use daily for exploring my world.