Code Monkey home page Code Monkey logo

ma-redistricting-2022's Introduction

Spatial data cooking show

This is a demo project and template repository showing how I use SpatiaLite with Datasette for quick spatial analysis.

Live show notes: https://docs.google.com/document/d/1z30ZCWGsR_jr6EuHKukJ4kTZQMBryrrEkIgo_5twZjo/edit

Why SQL, SQLite and Datasette

SQL is old. It's boring. It's widely adopted and not controlled by a company that might change it or shut it down next year.

SQLite is a database in a single file. It's fast and durable. It's easy, both in code and in my head, to make and delete and remake lots -- hundreds -- of little databases. They're just files.

There are a lot of shiny, brittle things out there. This is cast iron coding.

Libraries included

Run pipenv install to create a virtual environment and get the latest version of everything. I also recommend installing SpatiaLite.

You might also use ...

These libraries aren't included in the Pipfile but could be useful:

What should I use this for?

Over the past year, I've started using this set of libraries to quickly explore, reshape and reformat data for visualization in news projects. My workflow usually involves three phases: loading, exploration, export.

This stack isn't (yet) a replacement for R, or GeoPandas or QGIS. Those are good tools, and if you're more comfortable there, I'm not here to talk you out of them.

This approach is, for me, faster than QGIS for most datasets. It allows me to more quickly iterate on finding interesting subsets (because I can write SQL faster than I can navigate a GUI), and it allows me to see intermediate results, because each query generates a map.

Loading data

For self-contained datasets, I use geojson-to-sqlite. (For shapefiles, I usually transform data to GeoJSON first using ogr2ogr, but shapefile-to-sqlite is also available.)

For tabular data, I use sqlite-utils.

Exploration

If you are new to Datasette, this tutorial is a good place to start.

Once you have data in SpatiaLite, every SQL query that includes a geometry column will now include a map. (If you're using leaflet-marker-cluster, you'll see a map if your query includes longitude and latitude columns.) Use queries to see different slices of the data.

Here's a map of counties in Massachusetts.

Using datasette-geojson-map lets you style map features using Mapbox's simple-style-spec. Github also supports this for GeoJSON and TopoJSON files.

With sqlite-colorbrewer, it's possible to color features based on data.

Here are countries in Asia, in color.

Combine those, and we can start using Datasette for exploratory visualization:

Export

Finally, let's get data out of exploration mode and into something more presentable.

This stack is built for reporting, not for storytelling.

For interactive maps, I recommend exporting a version of your data as GeoJSON and building static versions for presentation with Leaflet or Mapbox. For more advanced visualizations, look at d3-geo.

Datasette allows queries to run from the command line using the --get option, which will hit any Datasette URL and return the result. Combine this with canned queries for scripted exports. I usually put this in a Makefile:

combined.geojson: project.db
    pipenv run datasette project.db --get /project/combined.geojson \
     --load-extension spatialite \
     --setting sql_time_limit_ms 5000 \
     --setting max_returned_rows 20000 \
     -m metadata.yml > $@

That assumes I've loaded whatever I need into project.db and created a canned query called combined (and then uses datasette-geojson for geojson output).

My goal in this step is to end up with a file that won't need any further filtering or processing once I pull it into a frontend tool.

Here are static versions of the examples above:

Datasets to test out

ma-redistricting-2022's People

Contributors

eyeseast avatar

Stargazers

Bryan Hermsen avatar

Watchers

 avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.