Python and Data Science π²
The two main ways I process data for the blog are QGIS, for mapping and spatial analysis, and the LibreOffice Calc spreadsheet, for pivot tables, formulas and statistical analysis. Sometimes I also use PHP to process data, for example to create graphs with the help of Chart.js.
It works, but it’s somewhat clunky and slow: it means opening large programs that use a lot of memory, and it takes a lot of pointing and clicking. Very large datasets can really push the software to its limits. That’s why I ended up looking into GeoPandas, and Python more generally. It’s kind of a gateway drug.
The GeoPandas library is great because you can quickly do spatial joins of coordinates and shapefiles in a few lines of code. I’ve done a little with the Pandas library in the past, but now I’m suddenly really interested in what I can do with data processing in Python. It seems like a much better option than the aforementioned programs. So I got a book out of the library, Python Data Science Essentials: Become an Efficient Data Science Practitioner by Thoroughly Understanding the Key Concepts of Python, by Alberto Boschetti and Luca Massaron, and now I’m fascinated by the topic of data science and what I can do with it, especially mixed with geospatial data.
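To give a rough idea of what a spatial join of coordinates against polygons looks like in GeoPandas, here is a minimal sketch. The region names, site names and coordinates are made up for illustration; in practice the polygons would more likely come from a shapefile loaded with `gpd.read_file()` and the points from a CSV of longitude and latitude columns. It also assumes a reasonably recent GeoPandas, where `sjoin` takes a `predicate` argument.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import Polygon

# Hypothetical polygons standing in for a shapefile of regions.
regions = gpd.GeoDataFrame(
    {"region": ["west", "east"]},
    geometry=[
        Polygon([(0, 0), (1, 0), (1, 1), (0, 1)]),
        Polygon([(1, 0), (2, 0), (2, 1), (1, 1)]),
    ],
    crs="EPSG:4326",
)

# Hypothetical table of coordinates, as it might be read from a CSV.
coords = pd.DataFrame(
    {"site": ["a", "b", "c"], "lon": [0.25, 1.5, 5.0], "lat": [0.5, 0.5, 5.0]}
)

# Turn the lon/lat columns into point geometries.
points = gpd.GeoDataFrame(
    coords,
    geometry=gpd.points_from_xy(coords["lon"], coords["lat"]),
    crs="EPSG:4326",
)

# Left spatial join: each point picks up the attributes of the polygon
# it falls within; points outside every polygon get NaN.
joined = gpd.sjoin(points, regions, how="left", predicate="within")
result = joined[["site", "region"]]
print(result)
```

A left join keeps every point, so sites that fall outside all of the regions remain in the output with an empty region instead of silently disappearing.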
To be sure, I’m an open source and open data guy; I’m only interested in building things out of information that is freely available. But I’m interested in learning new tools and finding new ways to process data, both for my blog and for other purposes. And despite what the internet advertising suggests, I’m not interested in data science to get a job. I just want to be able to do quick and dirty things in Python to build new ways of analyzing data and maps.