Python

Python is an interpreted high-level general-purpose programming language. Python’s design philosophy emphasizes code readability with its notable use of significant indentation.

Not afraid to use Python 🐍

I use R for a lot of projects, as it has superior libraries for working with Census data, maps and spatial data. The tigris package makes it easy to pull and process all kinds of Census boundary lines.

But sometimes projects are best done in Python. I recently had a project where I needed to run a regular expression to extract text from a PDF and put it in an Excel file. In Python that took only a few lines of code, while I think it would have been harder to do in R.
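As a rough idea of what that kind of job looks like, here is a minimal sketch using only the standard library. It is not the code from that project: the invoice pattern, the column names and the function names are all made up for illustration. It assumes the PDF text has already been pulled out as a plain string (a library such as pypdf can do that), and it writes a CSV that Excel opens directly instead of a native .xlsx file.

```python
import csv
import re

# Hypothetical pattern: an invoice number followed by a dollar amount,
# e.g. "Invoice #1001 $1,250.00". Swap in whatever the real PDF contains.
PATTERN = re.compile(r"Invoice\s+#(\d+)\s+\$([\d,]+\.\d{2})")


def extract_rows(text):
    """Return (invoice_number, amount) pairs found in the extracted text."""
    return [
        (number, float(amount.replace(",", "")))
        for number, amount in PATTERN.findall(text)
    ]


def write_csv(rows, path):
    """Write the rows to a CSV file with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["invoice", "amount"])  # illustrative column names
        writer.writerows(rows)
```

Calling `write_csv(extract_rows(text), "invoices.csv")` on the extracted text would produce the spreadsheet. The regular expression is the only part that really changes from one PDF to the next.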

I don’t really buy into the doctrinaire view that you should use the same tool for every project. Use what makes sense, what you are best at, what executes quickly and produces the best results.

I don’t love Python for Geospatial Projects 🐍

Last night I went back for a second look at the world of geospatial technology in Python. While Python’s ArcGIS and QGIS bindings are widely touted, and are the best way to automate things within ArcGIS or QGIS, you are much better off using the R programming language for quick, low-code GIS tasks outside of ArcGIS or QGIS.

Python has a lot of advantages for certain things:

  1. It is a good scripting language that is widely supported in applications.
  2. Python is generally a stronger language for building applications to run on web servers.
  3. Both ArcGIS and QGIS have really good Python bindings.

But as a stand-alone platform, the Python geospatial libraries rather suck and are underdeveloped. To be sure, you can make maps in Python, and you can perform various geospatial operations like transformations, raster math and geometric operations. But it takes a lot of work within Python to get nice-looking maps out of matplotlib, you don’t have the wealth of Census shapefiles or Census data at your fingertips, and Python’s dot-chaining of methods isn’t necessarily as elegant or readable.

I would argue that the R programming language and RStudio are superior in many ways to working directly with Python:

  1. R programs can use the tigris library, which gives you instant access to the Census Bureau TIGER/Line files with just a single command, and the result can be easily joined or queried against other data. There is nothing like tigris in Python. If you want County or County Subdivision lines in Python, you will have to manually download the shapefile and then load it into GeoPandas. I’ve looked for things like tigris in Python and found nothing comparable. In my experience the basis of most maps is Census TIGER/Line, at least in the United States. Cartopy in Python does have access to the Natural Earth dataset, but that isn’t as good as TIGER/Line in the United States.
  2. There are Census libraries in Python, but they aren’t nearly as up to date, and they don’t have access to nearly as much Census data or TIGER/Line geography. A lot of the maps you make involve plotting Census data, and that requires both the TIGER/Line and the raw data. tidycensus joins them together in one command, with no need to download the TIGER/Line separately like in Python.
  3. While you can chain commands in Python and GeoPandas, the chaining mechanism in R is much stronger and more flexible. Often in R you can extract, transform, load and output a map in a single chain of R commands using the tidyverse and ggplot2.
  4. ggplot2 is vastly superior to matplotlib for making maps. ggplot2 has sensible defaults, and the output is clean and easily themeable. ggplot2’s main limitation is that it is best for simpler, easy-to-read SVG maps, and it can be a bit strict in enforcing how Hadley Wickham thinks a map should be presented. matplotlib is more flexible in overlaying and designing maps. Of course, for complicated maps, it’s still best to export the data as a shapefile or GeoPackage and load it into a full GIS platform like QGIS or ArcGIS.
  5. In general, R has a quirky syntax with cute and weird names, but with its more sensible defaults it often gets geospatial work done with less code and effort than the same thing done in Python with Rasterio or GeoPandas. A lot of complicated extract-transform-load work can be done in one line of code with R. People say that Python has a compact syntax, but it really doesn’t compared to R’s geospatial libraries.
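For comparison, the manual download-and-load route from item 1 might look roughly like the sketch below. The helper names `tiger_url` and `load_layer` are invented for illustration, and the URL layout is an assumption based on how the Census Bureau's public TIGER/Line directory is organized, so check it against the year and layer you actually need.

```python
# Assumed base of the Census Bureau's public TIGER/Line file tree.
TIGER_BASE = "https://www2.census.gov/geo/tiger"


def tiger_url(year, layer, scope="us"):
    """Build the URL of a TIGER/Line zip file.

    layer is a short code such as "county" or "cousub"; scope is "us" for
    national files or a two-digit state FIPS code for state-level files.
    """
    return (
        f"{TIGER_BASE}/TIGER{year}/{layer.upper()}/"
        f"tl_{year}_{scope}_{layer.lower()}.zip"
    )


def load_layer(url):
    """Read the zipped shapefile at url into a GeoDataFrame."""
    # Imported here so the URL helper works without GeoPandas installed.
    import geopandas as gpd

    return gpd.read_file(url)
```

Something like `load_layer(tiger_url(2023, "cousub", "36"))` would then fetch New York's county subdivisions, which is still more bookkeeping than a single tigris call in R.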

I still use Python for some QGIS work, but I don’t recommend it for work outside of QGIS or ArcGIS. Python should be seen as a good language for automating processes within QGIS or ArcGIS, but the state of geospatial tools in Python is weak once you get away from automating those graphical GIS applications. If you want your work to end up in a graphical GIS program for additional manual tweaking after automating things, then use Python. But if you are processing GIS data from start to finish, your best bet is R.

MITO

I am a bit underwhelmed by MITO. For one, it’s not open source software, and it only works in JupyterLab, not Jupyter Notebook. Despite all the marketing and hype, it’s not much of an improvement over what you can do with a few simple lines of pandas code. Another thing I didn’t like was that you had to manually add a cell below the MITO window to see the generated code.

One thing that I thought was nice was that the pandas code it output was clean and easy to read. It’s a nice browser for data frames; it reminded me of View() in R. The extension is free for both personal and professional use; you just can’t redistribute it.