R Programming Language

R is a programming language and free software environment for statistical computing and graphics supported by the R Core Team and the R Foundation for Statistical Computing. It is widely used among statisticians and data miners for developing statistical software and data analysis.

How to access WMS Servers in R Programming Language

I couldn’t find a good package for reading WMS data into a SpatRaster or for downloading it. However, I discovered that you can use sf’s built-in version of the GDAL utilities: gdal_utils('info') queries the WMS server’s layer information, then gdal_utils('translate') downloads the WMS image into a temporary file, which you can then load into memory.

library(sf)
library(tidyverse)
library(terra)
rm(list = ls())

# wms_url must hold the address of the WMS server
# (its definition is not shown here)

# obtain layer info from the WMS server using sf's built-in version of gdalinfo
ginfo <- sf::gdal_utils('info', str_c('WMS:', wms_url), quiet = TRUE)

# extract layer descriptions and layer URLs from the returned layer info,
# then combine them into a two-column table
ldesc <- ginfo %>% str_match_all('SUBDATASET_(\\d+)_DESC=(.*?)\n')
ldesc <- ldesc[[1]][,3]
lurl <- ginfo %>% str_match_all('SUBDATASET_(\\d+)_NAME=(.*?)\n')
lurl <- lurl[[1]][,3]
wms_layers <- cbind(ldesc, lurl)
rm(ldesc, lurl)

# obtain from the Census Bureau a shapefile of the Town of Cazenovia,
# then find the bounding box around it
bbx <- tigris::county_subdivisions('ny') %>%
  filter(NAME == 'Cazenovia') %>%
  st_transform(3857) %>%
  st_bbox()

# Revise the WMS URL based on the above bounding box.
# Web Mercator (3857) is best to use with WMS servers as
# that is the native projection. As I wanted multiple layers,
# I also revised the URL string to include all layers desired,
# separated by commas (explore the wms_layers table for details)
url <- wms_layers[4,2] %>%
  str_replace('LAYERS=(.*?)&', 'LAYERS=3,2,1,0&') %>%
  str_replace('SRS=(.*?)&', 'SRS=EPSG:3857&') %>%
  str_replace('BBOX=(.*?)$', str_c('BBOX=', paste(bbx, collapse = ',')))

# Use sf's built-in gdal_utils to download the image of Cazenovia based
# on the URL, with the output size listed below. You can also download
# based on resolution by using the -tr option that is commented out below.
# Then load the temporary file with rast() as a SpatRaster for further
# processing. In addition, adding -co COMPRESS=JPEG greatly reduces the size
# of aerial photography with minimal impact on quality or loading speed.

t <- tempfile()
reso <- c('-outsize', '1920', '1080', '-co', 'COMPRESS=JPEG')
#reso <- c('-tr', '1', '1', '-co', 'COMPRESS=JPEG') # "meters" in the 3857 projection
sf::gdal_utils('translate', url, t, reso)
r <- rast(t)

# display the SpatRaster or do whatever you would like with it;
# the tiff is stored in the temporary location held in the t variable
plot(r)
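
The URL rewriting step is the part most likely to need adjusting for another server, so here is a minimal sketch of it on its own, in base R and without any network access. The example URL, its host and the bounding box numbers are made up for illustration, and rewrite_wms_url is a hypothetical helper, not part of sf or GDAL. It assumes the subdataset URL carries LAYERS, SRS and BBOX parameters like the ones matched in the code above.

```r
# Hypothetical subdataset URL; the host and parameter values are invented.
example_url <- paste0(
  "WMS:https://example.com/wms?SERVICE=WMS&VERSION=1.1.1&REQUEST=GetMap",
  "&LAYERS=0&SRS=EPSG:4326&BBOX=-180,-90,180,90")

# Replace the LAYERS, SRS and BBOX parameters of a WMS subdataset URL.
# layers: layer ids to request; epsg: numeric EPSG code;
# bbox: numeric vector in xmin, ymin, xmax, ymax order.
rewrite_wms_url <- function(url, layers, epsg, bbox) {
  url <- sub("LAYERS=[^&]*", paste0("LAYERS=", paste(layers, collapse = ",")), url)
  url <- sub("SRS=[^&]*", paste0("SRS=EPSG:", epsg), url)
  # sprintf() keeps the coordinates out of scientific notation
  sub("BBOX=[^&]*", paste0("BBOX=", paste(sprintf("%.2f", bbox), collapse = ",")), url)
}

new_url <- rewrite_wms_url(example_url, layers = c(3, 2, 1, 0), epsg = 3857,
                           bbox = c(-8450000, 5290000, -8420000, 5310000))
print(new_url)
```

Matching each parameter up to the next & means the same pattern works whether the parameter sits in the middle or at the end of the URL.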

How to fix RSelenium / wdman unable to start error

If you are getting an error starting geckodriver when using RSelenium, try deleting the LICENSE.chromedriver file, which wdman::geckodriver accidentally attempts to execute. You can do this in a Linux terminal with the command below. The -r option makes xargs run rm only when find has matched at least one file to delete.

find ~/.local/share/ -name LICENSE.chromedriver -print | xargs -r rm
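
If you would rather stay inside R, the same cleanup can be sketched with base R file functions. This is only an illustration under the assumption that the file lives somewhere below ~/.local/share, as in the shell command above; remove_chromedriver_license is a hypothetical helper name, not something provided by wdman or RSelenium.

```r
# Delete every file named LICENSE.chromedriver below `root`.
# Returns the paths that were removed (character(0) when nothing matched).
remove_chromedriver_license <- function(root = path.expand("~/.local/share")) {
  hits <- list.files(root, pattern = "^LICENSE\\.chromedriver$",
                     recursive = TRUE, full.names = TRUE, all.files = TRUE)
  # file.remove() returns one TRUE/FALSE per path; keep the ones that succeeded
  hits[file.remove(hits)]
}
```

Calling remove_chromedriver_license() with no arguments searches the default location; pass a different root to try it on a scratch directory first.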

There was a time when PHP was my go-to language for everything

Sure, I used other languages – JavaScript, Python, C and even C++ – but I really liked PHP for having so many libraries and being so easy to tie into web apps.

Python was fine, but it was something I used when choices were limited, like for QGIS scripting and automation. Then over summer vacation I got that book about data science out of the library and discovered the benefits of an interactive interpreter with Jupyter. Python’s data frame model with pandas is quite powerful, and I enjoy working with it. But I also found matplotlib and the other graphing and mapping libraries for Python to be often lacking when trying to output quality, professional-looking graphics.

I heard a lot of good things about R so I decided to take it up last autumn, and I haven’t looked back…

R has a weird syntax, but the pipe model that can be used throughout is incredibly powerful, and ggplot2 with only a little bit of tweaking makes amazing graphs and maps that look like they came out of a professional GIS program. Things I used to do in QGIS, I’m increasingly doing with the R programming language, and I am even converting a lot of my PHP and Python code over to R now. I am really hooked on this language after struggling with it a bit at first.

R Programming – IDW interpolation of Missing / NA Census Tract Data

You can do IDW interpolation of missing Census Tracts fairly easily in R using the gstat library. The key is to make sure you use a projected dataset. Other interpolation methods are covered here: https://rspatial.org/raster/analysis/4-interpolation.html

library(tidycensus)
library(tidyverse)
library(sf)      # provides st_transform() used below
library(raster)
library(gstat)

# obtain census data on veteran status by tract and then
# reproject the shapefile geometry into a projected coordinate system
# (EPSG:26918, NAD83 / UTM zone 18N)

acs <- get_acs("tract",
  state = 'ny',
  survey = 'acs5',
  variables = c(
    'Total' = 'B21001_001',
    'Veteran' = 'B21001_002'
  ),
  cache_table = TRUE,
  geometry = TRUE,
  resolution = '20m',
  year = 2020,
  output = "wide"
) %>% st_transform(26918)

# calculate the percentage of veterans per census tract
acs <- mutate(acs, vet_per = VeteranE/TotalE)

# create a copy of census tracts, dropping any NA values
# from vet_per field
vetNA <- acs %>% drop_na(vet_per)

# create an empty raster with 1000 m cells to interpolate into
r <- raster(vetNA, res=1000)

# set the formula based on the field (vet_per) that contains
# the veterans percent to interpolate. This uses IDW interpolation
# for all points, weighting farther ones less
gs <- gstat(formula = vet_per~1, locations = vetNA)

# interpolate the data (average based on nearby points)
nn <- interpolate(r, gs)

# extract the median value of the raster interpolation from the original shapefile,
# into a new column set as est
acs<- cbind(acs, est=exactextractr::exact_extract(nn, acs, fun='median'))

# replace any NA values with interpolated data so the map doesn't contain
# holes. You should probably mention that missing data was interpolated when
# sharing your map.
acs <- acs %>% mutate(vet_per = ifelse(is.na(vet_per), est, vet_per))
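
For intuition about what the IDW step computes, here is a small base R sketch of inverse distance weighting at a single location. It is not what gstat runs internally, just the textbook weighted average; idw_estimate is a hypothetical helper, and the power of 2 is an assumption (I believe it matches gstat's default, but check the gstat documentation before relying on that).

```r
# Inverse distance weighted estimate at (x0, y0) from known points.
# x, y: coordinates of the known points (projected units, e.g. meters)
# values: observed value at each known point
# power: how quickly the influence of a point falls off with distance
idw_estimate <- function(x, y, values, x0, y0, power = 2) {
  d <- sqrt((x - x0)^2 + (y - y0)^2)
  # a known point at exactly this location wins outright
  if (any(d == 0)) return(values[which(d == 0)[1]])
  w <- 1 / d^power
  sum(w * values) / sum(w)
}

# two known points on a line, with values 10 and 20
halfway <- idw_estimate(c(0, 2), c(0, 0), c(10, 20), x0 = 1, y0 = 0)
near_first <- idw_estimate(c(0, 2), c(0, 0), c(10, 20), x0 = 0.5, y0 = 0)
print(c(halfway, near_first))
```

The second estimate sits much closer to 10 than a straight-line blend would, because the weights fall off with the square of the distance. The distances only make sense in a projected coordinate system, which is why the tracts are transformed first.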

Create Mile Points from a LINESTRING or MULTILINESTRING in R

Here is an R function that takes a LINESTRING or touching MULTILINESTRING and returns a series of points along the line string at each mile, much like the “tombstone” mile markers along an expressway. This is helpful for making maps when you want to plot distance for hikers, or for drivers watching their car’s odometer. It has the option to “reverse” the linestring so you can have the points going in the opposite direction of the linestring, such as south to north. The code can be modified for quarter-mile points or whatever spacing you find useful.

library(tidyverse)
library(sf)
library(units)

make_milepoint <- \(linestring, reverse = FALSE) {
  # make sure we're using a projected coordinate system
  linestring <- st_transform(linestring, 5070)

  # merge parts together into a multilinestring
  linestring <- linestring %>% group_by() %>% summarise()

  # if multiline string, then attempt to join together
  # this will raise an exception if the linestring is not contiguous
  if (st_geometry_type(linestring) == 'MULTILINESTRING')
    linestring <- st_line_merge(linestring)

  # reverse the string if we want to
  # go from the other end
  if (reverse)
    linestring <- st_reverse(linestring)

  # length of string in miles
  linestring.distance <- st_length(linestring) %>% set_units('mi') %>%
    drop_units()

  # fraction of the string at each mile, including start and end
  linestring.sample.percent <- seq(0, 1, 1/linestring.distance) %>% c(1)

  # sample line string, convert multi-points to points, convert to sf
  # add a column listing the mile-points, round to two digits
  st_line_sample(linestring, sample = linestring.sample.percent) %>%
    st_cast('POINT') %>%
    st_as_sf() %>%
    mutate(
      mile = round(linestring.sample.percent * linestring.distance, digits = 2))
}

Here is an example of the 1 mile points it outputs along Piseco-Powley Road.

And here is a static (paper) map I created using this code plus adding coordinates and elevation for this hiking map.

Soda Range Trail
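
To show how make_milepoint could be adapted to quarter-mile points, here is a base R sketch of just the spacing arithmetic, with no sf objects involved. marker_positions and its step_mi argument are hypothetical names introduced for this illustration. Unlike the function above, it drops a duplicated end point when the length is an exact multiple of the step.

```r
# Fractions of a line's length at which to place markers, plus their labels.
# length_mi: total line length in miles; step_mi: spacing between markers.
marker_positions <- function(length_mi, step_mi = 1) {
  stopifnot(length_mi > 0, step_mi > 0)
  # one fraction per full step, then the end of the line
  frac <- unique(c(seq(0, 1, by = step_mi / length_mi), 1))
  data.frame(fraction = frac, mile = round(frac * length_mi, digits = 2))
}

whole_miles <- marker_positions(2.5)
quarter_miles <- marker_positions(1, step_mi = 0.25)
print(whole_miles)
print(quarter_miles)
```

The fraction column is what would be passed as the sample argument of st_line_sample, and the mile column is the label for each point.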

An Introduction to Apache Arrow for R Users

I have been learning about Apache Arrow for loading and processing extremely large datasets in R quickly, without having to set up and use something like a PostgreSQL database. DMV vehicle file, you know what I'm thinking.