Taswira

An interactive visualisation tool for GCBM.

Screenshot of Taswira

Taswira aims to be an easy-to-use utility to help the users of the Generic Carbon Budget Model (GCBM). It takes output generated by GCBM and creates a browser-based UI that allows users to:

  • View previews of the spatial data overlaid on an interactive map.
  • View graphs of ecosystem indicators from the non-spatial output.
  • Visually cycle through the time-series of the spatial output.

Install

Requires Git and Miniconda (or Anaconda) with Python 3.6 or newer.

  1. Clone the repository and cd into it:
git clone https://github.com/moja-global/GCBM.Visualisation_Tool
cd GCBM.Visualisation_Tool
  2. Create a conda environment and activate it:
conda env create -f environment.yml
conda activate taswira
  3. Install the Python package:
pip install -e .

Taswira is now installed; see Usage below.

Using Docker

You can also use Taswira through Docker. For that, build a container image:

DOCKER_BUILDKIT=1 docker build -t taswira:latest .

And then use it to run Taswira:

docker run taswira

Usage

usage: taswira [-h] [--allow-unoptimized] config spatial_results db_results

Interactive visualization tool for GCBM

positional arguments:
  config               path to JSON config file
  spatial_results      path to GCBM spatial output directory
  db_results           path to compiled GCBM results database

optional arguments:
  -h, --help           show this help message and exit
  --allow-unoptimized  allow processing unoptimized raster files

NOTE: spatial_results directory should contain GeoTIFFs with filenames that match the pattern {title}_{year}.tiff.
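To illustrate the naming convention in the note above, here is a minimal sketch of how a filename could be split into its title and year. The function name `parse_raster_name` and the regular expression are assumptions made for this example; they are not taken from Taswira's source code.

```python
import re
from pathlib import Path

# Everything before the final underscore is treated as the title,
# followed by a four-digit year and the .tiff extension.
FILENAME_RE = re.compile(r"^(?P<title>.+)_(?P<year>\d{4})\.tiff$")


def parse_raster_name(path):
    """Return (title, year) for a matching filename, or None otherwise."""
    match = FILENAME_RE.match(Path(path).name)
    if match is None:
        return None
    return match.group("title"), int(match.group("year"))


print(parse_raster_name("NPP_2010.tiff"))
print(parse_raster_name("AG_Biomass_C_2015.tiff"))
print(parse_raster_name("notes.txt"))
```

Files that do not follow the pattern return None, so a caller can skip or report them before starting the tool.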

Configuration Schema

config should be a valid JSON file with an array of environment indicator configurations, each of which can have the following keys:

Key                 Description                                             Required
database_indicator  Database column that contains indicator's value         Yes
file_pattern        Pattern that matches to the filenames of the indicator  Yes
palette             A valid colormap string                                 Yes
title               Human-friendly title of the indicator                   No
graph_units         Unit to use in the graph                                No

Example config file:

[
  {
    "database_indicator": "NPP",
    "file_pattern": "NPP*.tiff",
    "graph_units": "Ktc",
    "palette": "Greens"
  },
  {
    "database_indicator": "NBP",
    "file_pattern": "NBP*.tiff",
    "palette": "Greens"
  },
  {
    "database_indicator": "NEP",
    "file_pattern": "NEP*.tiff",
    "palette": "Reds"
  },
  {
    "title": "AG Biomass",
    "database_indicator": "Aboveground Biomass",
    "file_pattern": "AG_Biomass_C_*.tiff",
    "palette": "YlGnBu",
    "graph_units": "Mtc"
  }
]
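
The schema above can be checked before Taswira is started. The following is a rough sketch that only looks at key names; the function `validate_config` is hypothetical, and Taswira's own validation may behave differently.

```python
import json

# Key names are taken from the schema table; everything else is illustrative.
REQUIRED_KEYS = {"database_indicator", "file_pattern", "palette"}
OPTIONAL_KEYS = {"title", "graph_units"}


def validate_config(text):
    """Parse a JSON config string and check the keys of every entry."""
    entries = json.loads(text)
    if not isinstance(entries, list):
        raise ValueError("config must be a JSON array")
    for index, entry in enumerate(entries):
        if not isinstance(entry, dict):
            raise ValueError(f"entry {index} must be a JSON object")
        missing = REQUIRED_KEYS - set(entry)
        unknown = set(entry) - REQUIRED_KEYS - OPTIONAL_KEYS
        if missing:
            raise ValueError(f"entry {index} is missing keys: {sorted(missing)}")
        if unknown:
            raise ValueError(f"entry {index} has unknown keys: {sorted(unknown)}")
    return entries


sample = '[{"database_indicator": "NPP", "file_pattern": "NPP*.tiff", "palette": "Greens"}]'
print(validate_config(sample))
```

The check does not verify that the palette is a valid colormap or that the database column exists; those depend on the installed libraries and the data.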

Repository Contributors

Thanks go to these wonderful people (emoji key):


  • moja global 📆
  • Abhineet Tamrakar 📖 💻
  • kaushik surya sangem 👀
  • Guy Janssen 📆

This project follows the all-contributors specification. Contributions of any kind are welcome!

Maintainers, Reviewers, Ambassadors and Coaches

The following people are Maintainers, Reviewers, Ambassadors or Coaches:

  • Abhineet Tamrakar 📖 💻
  • kaushik surya sangem 👀

Maintainers review and accept proposed changes
Reviewers check proposed changes before they go to the Maintainers
Ambassadors are available to provide training related to this repository
Coaches are available to provide information to new contributors to this repository

taswira's Issues

Fix typo in docstring

In the docstring of the find_units function of this module, the word "from" has been added by accident. It should be removed.

Pitch: Use Terracotta to Serve Raster Data

The GCBM spatial output is a list of tiled, DEFLATE compressed GeoTIFF files. In this form, the raster data cannot be used in my project's web application.

Also, the associated non-spatial data is not embedded into this raster data. It is present in a separate SQLite database.

To be able to create an interactive front-end, I need an easy way to be able to access this raster data along with its associated non-spatial data.

Terracotta

Terracotta is a tile server written in Python. It takes some raster data and then serves it through an easy to use HTTP API. This API can solve the problems that I mentioned above.

It can serve the various tiles of a raster in the form of PNGs, which can then be easily rendered in a browser. The nature of the API also allows us to easily overlay these images onto a map, as shown in this preview.

The API has the ability to return any arbitrary metadata associated with a particular raster. This can solve my problem of accessing non-spatial data on the front-end.
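
As a rough illustration of the two capabilities described above, the sketch below builds request URLs for Terracotta's singleband tile endpoint and its metadata endpoint. The helper names, the base address and the key values (an indicator name and a year) are assumptions made for this example.

```python
from urllib.parse import urlencode


def tile_url(base, keys, z, x, y, colormap=None):
    """Build the URL of one PNG tile for the dataset identified by keys."""
    url = f"{base}/singleband/{'/'.join(keys)}/{z}/{x}/{y}.png"
    if colormap:
        url += "?" + urlencode({"colormap": colormap})
    return url


def metadata_url(base, keys):
    """Build the URL that returns the metadata stored for a dataset."""
    return f"{base}/metadata/{'/'.join(keys)}"


# Hypothetical server address and dataset keys.
base = "http://localhost:5000"
print(tile_url(base, ["NPP", "2010"], 3, 2, 1, colormap="greens"))
print(metadata_url(base, ["NPP", "2010"]))
```

A map library can use the tile URL as a template for its z/x/y tile layer, while the front-end can fetch the metadata URL to read the non-spatial values attached to a raster.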

Terracotta & GCBM Output

Terracotta is not exactly a library. It is a command line app. It is used to serve and explore a directory of spatial data. You pass it a filename pattern as an argument. It parses the directory based on it, categorizes the raster data and then starts a server.

This makes it a great tool to explore GCBM's spatial data. For example, the following are the instructions for using it with the sample data available here.

First, rename the files so that they follow a pattern that Terracotta can understand. In bash:

$ for f in *.*; do mv -- "$f" "$(echo "$f" | sed 's/_//g' | sed -r 's/(.*)([0-9][0-9][0-9][0-9])(.*)/\1_\2\3/')"; done
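
For readers who prefer Python, the same renaming rule can be sketched as follows: remove every underscore, then put a single underscore in front of the last four-digit group. The function names are hypothetical, and `rename_all` only reports the planned renames unless `dry_run` is set to False.

```python
import re
from pathlib import Path


def terracotta_name(filename):
    """Drop all underscores, then insert one before the last 4-digit group."""
    stripped = filename.replace("_", "")
    return re.sub(r"(.*)([0-9]{4})(.*)", r"\1_\2\3", stripped, count=1)


def rename_all(directory, dry_run=True):
    """Return (old, new) name pairs; rename the files when dry_run is False."""
    renames = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        target = path.with_name(terracotta_name(path.name))
        renames.append((path.name, target.name))
        if not dry_run and target != path:
            path.rename(target)
    return renames


print(terracotta_name("AG_Biomass_C_2010.tiff"))
```

Running `rename_all(".")` first shows what would change, which is a cheap way to check the pattern against a new data set before touching any files.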

Next, start a Terracotta server by executing:

$ terracotta serve -r './{name}_{year}.tif'

Finally, connect this server to Terracotta's Preview interface:

$ terracotta connect localhost:5000

This will open an interface in your web browser, which you can then use to view raster data overlaid onto a map.

Terracotta & My Project

As mentioned above, Terracotta is not a library. So, how will I use it in my project?

For that, I plan on doing two things:

First, I'll use Terracotta's Python API to generate a database of raster data. This will allow me to add the non-spatial data as metadata. Also, instead of having Terracotta parse a directory, I can pass it just a single database.

Second, even though Terracotta is not advertised as a library, its code still follows a well organized and modular structure. This means that all of its features can be invoked programmatically by calling the right function (which would be this one in my case) with the right arguments.

Conclusion

I also looked at other solutions, like MapServer, before settling on Terracotta. None of them offered the combined benefits of Python, an HTTP API, raster optimization tools, etc.

So, in conclusion, Terracotta seemed to be the most pragmatic choice for my project.

Convert DEFLATE-compressed rasters to ZSTD before initializing

Is your feature request related to a problem? Please describe.
Terracotta is not designed to work with DEFLATE-compressed raster files. It is made for Cloud Optimized GeoTIFFs (COGs) that use ZSTD compression. It does work with DEFLATE-compressed files, but processing them is considerably slower.

Describe the solution you'd like
We can convert these DEFLATE-compressed files to ZSTD before starting Terracotta. Thankfully, Terracotta provides us with convenient methods for doing this. See here.

Describe alternatives you've considered
Asking the user to reconfigure GCBM on their end to produce COGs. This is what the program is currently doing.

Add Welcome Bot to create an inclusive environment for new contributors

Is your feature request related to a problem? Please describe.
Welcome is a simple way to welcome new users based on maintainer-defined comments.

It combines three plugins: new-issue-welcome (comment posted on a user's first issue), new-pr-welcome (comment posted on PRs from first-time contributors to your repository) and first-pr-merge (comment posted when a first-time contributor's pull request is merged).

Describe the solution you'd like
We can set up Welcome bot by adding the GitHub App to our organization repositories and configuring .github/config.yml according to the content of the messages we want.

This will make the new contributors feel welcomed and at ease in interacting with the community.

Additional context
Screenshots that depict the Welcome bot in action:

image

Make passing config file optional by adding a set of default configs

Is your feature request related to a problem? Please describe.
Users have to pass a JSON formatted configuration file to start Taswira. This i

Describe the solution you'd like
I believe that we can make this configuration file optional by hard-coding a set of configs for all the common ecosystem indicators. We can build on the example config file that is available in the README:

[
  {
    "database_indicator": "NPP",
    "file_pattern": "NPP*.tiff",
    "graph_units": "Ktc",
    "palette": "Greens"
  },
  {
    "database_indicator": "NBP",
    "file_pattern": "NBP*.tiff",
    "palette": "Greens"
  },
  {
    "database_indicator": "NEP",
    "file_pattern": "NEP*.tiff",
    "palette": "Reds"
  },
  {
    "title": "AG Biomass",
    "database_indicator": "Aboveground Biomass",
    "file_pattern": "AG_Biomass_C_*.tiff",
    "palette": "YlGnBu",
    "graph_units": "Mtc"
  }
]
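
A minimal sketch of the proposed fallback is shown below. The defaults are copied from the example config above; the function `load_config` and its `path` parameter are hypothetical and do not describe how Taswira currently parses its arguments.

```python
import json

# Defaults copied from the example config in the README.
DEFAULT_CONFIG = [
    {"database_indicator": "NPP", "file_pattern": "NPP*.tiff",
     "graph_units": "Ktc", "palette": "Greens"},
    {"database_indicator": "NBP", "file_pattern": "NBP*.tiff",
     "palette": "Greens"},
    {"database_indicator": "NEP", "file_pattern": "NEP*.tiff",
     "palette": "Reds"},
    {"title": "AG Biomass", "database_indicator": "Aboveground Biomass",
     "file_pattern": "AG_Biomass_C_*.tiff", "palette": "YlGnBu",
     "graph_units": "Mtc"},
]


def load_config(path=None):
    """Return the user's config if a path is given, else a copy of the defaults."""
    if path is None:
        return [dict(entry) for entry in DEFAULT_CONFIG]
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


print(len(load_config()))
```

Returning a copy keeps the built-in defaults unchanged if the caller later modifies the loaded entries.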

Publish a moja global Taswira package

Hey @abhineet97 - our DevOps group have been on a tear publishing images for popular moja global tools. Could we please add some CI to this repo and add your Docker image as a GitHub package?

If so, I wondered if you'd like to label your service taswira in light of your ambition to be model agnostic (#35)?

Unable to create test raster files

Initially, I had planned on generating random GCBM raster files for testing Terracotta. But recently I found out that GCBM uses ZSTD compression. This doesn't pose a problem when ingesting the files to Terracotta. However, the problem seems to appear when I try to create a raster using rasterio (more info here).

It appears that to fix this, I'll need to build rasterio from source with ZSTD support. And apart from that, I'll also need to figure out how I'll share this build so that others can also run the tests easily.

This is something that I want to figure out. However, it doesn't seem important at the moment. So, I'm putting it here for later.

Remove Terracotta

Terracotta utilizes the Rasterio Python Library for working with the raster files. Rasterio is a convenience wrapper that makes it easy to work with GDAL.

This means that, theoretically, we can do away with Terracotta and directly work with Rasterio.

Restructure and Publish CLI on PyPI using Flit

You can install Taswira by creating a Conda environment using the included environment.yml file. This arrangement allows a cross-platform way for setting up the development environment.

This will be a little too complicated for people who want to use the tool and do not need to set up a development environment.

In this document, I propose that we should upload the tool to PyPI so that users can install it by running:

$ pip install taswira

PyPI

The Python Package Index (PyPI) is a repository of software written in Python. It functions as the primary source for pip, a package installer that ships by default with all modern distributions of Python.

The above description makes PyPI the perfect place to publish our project.

Flit

setuptools is the most popular method for packaging Python packages. It's great but requires a lot of complicated configuration. Flit is much easier to configure as it supports the modern pyproject.toml package specification format.

Why Restructure Repository?

PyPI displays the repository's README on the project's page on its website. This is why I would need to change the README to describe the tool and remove the GSoC information.

At the moment, the tool is present in a sub-directory labelled taswira. I would need to move the contents of that sub-directory to the root of the repo. I would also like to change the repository's name to taswira and set its description to "An interactive visualisation tool for GCBM output". This will help increase the visibility of the tool and allow people to find it easily.

Project Plan

This is my project plan. Each section below represents a Milestone/Deliverable. Under each section is a list of tasks that must be completed for the milestone/deliverable to be considered achieved/delivered.

This is a living document and hence will be updated continuously as the project goes along. The list of changes can be viewed using the "edited" drop-down menu that's available above.

Community Bonding (May 4 - June 1)

  • Study the existing internal tool and understand its working.
  • Identify elements from the internal tool that can be reused.
  • Understand how to use spatial data in the browser.
  • Evaluate different frameworks and select the best one.
  • Learn more about the selected frameworks.
  • Set up coding environment (code editor, linting and styling tools, Python, etc.)

Terracotta Back-End (June 1 - July 3)

  • Set up the code repository, ensuring that it conforms with moja global's standards.
  • Add Terracotta and add associated documentation for setting up development environment.
  • Create an ingestion script that loads GCBM data into Terracotta.
  • Implement a command line interface for the tool.
  • Add tests for checking the implementation of the above features.
  • Test and verify the integration (using Terracotta's inbuilt exploration interface).

Dash Integration (July 3 - July 31)

  • Pitch a plan describing how Dash integration would work.
  • Implement overlaying raster files onto Leaflet.js Maps.
  • Identify the different controls that can be used for interacting with data.
  • Implement the interaction controls.
  • Set up continuous integration.

Final Touches (August 1 - August 24)

  • Test tool using different data sets and then fix any encountered bugs.
  • Identify potential features and enhancements for post-GSoC development.
  • Dockerize the environment to aid easy installation and development.
  • Update documentation and add installation instructions.
  • Publish tool on PyPI and then set up continuous deployment. (see #31 for an explanation of why this was not possible)

A buffer of 2 weeks has been taken to account for any unforeseeable absence on my end (about which I'll inform as early as possible).

DEFLATE-compressed raster processing is slow

Describe the bug

When using DEFLATE-compressed raster files, it takes a significant amount of time for the tool to start. This is because Terracotta is optimized to work with ZSTD-compressed raster files (i.e., Cloud Optimized GeoTIFFs).

Possible Resolutions

  • Optimize raster files on-the-fly using Terracotta's optimize-raster feature.
  • Ask the user to reconfigure GCBM and provide COGs instead.

Remove `Science` folder

The Science folder in the root of this repository was imported from a template when this repository was created. We don't have any use for this folder, so it can be safely deleted.

Taswira as a general purpose spatial+non-spatial visualization tool

Is your feature request related to a problem? Please describe.
At the moment, Taswira is able to visualize only GCBM output. It has the table names, raster patterns and other things hard-coded into it for the sole purpose of visualizing GCBM output. It would greatly increase the utility of Taswira if it were able to visualize arbitrary combinations of spatial and non-spatial data.

Describe the solution you'd like
I think that we can solve this problem if we are somehow able to offload the task of specifying table names, column names, raster patterns, SQL queries, etc. to the user.
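
One way this could look is sketched below: the user supplies the table and column names, and the tool builds the query from them. Every name here (`build_query`, the `table`, `year_column` and `value_column` keys, and the sample table) is hypothetical and only illustrates the idea.

```python
import re
import sqlite3

# Only plain identifiers are accepted, since names cannot be bound as
# SQL parameters and must be placed into the query text directly.
IDENTIFIER_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_query(source):
    """Build a per-year aggregation query from user-supplied names."""
    for key in ("table", "year_column", "value_column"):
        if not IDENTIFIER_RE.match(source[key]):
            raise ValueError(f"invalid identifier for {key}: {source[key]!r}")
    year, value, table = source["year_column"], source["value_column"], source["table"]
    return (
        f'SELECT "{year}", SUM("{value}") FROM "{table}" '
        f'GROUP BY "{year}" ORDER BY "{year}"'
    )


# Hypothetical non-spatial data held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (year INTEGER, npp REAL)")
conn.executemany("INSERT INTO results VALUES (?, ?)",
                 [(2010, 1.5), (2010, 2.5), (2011, 3.0)])
source = {"table": "results", "year_column": "year", "value_column": "npp"}
print(conn.execute(build_query(source)).fetchall())
```

Letting users write full SQL queries would be more flexible, but restricting the config to names keeps the input easy to validate.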

Keyerror in ingestion.py when trying to run Taswira

Describe the bug
A KeyError is raised in ingestion.py whenever I try to run Taswira.

To Reproduce
Steps to reproduce the behavior:

  1. Go to the GCBM.Visualisation_Tool directory.
  2. Activate the 'taswira' conda environment that you created on your machine.
  3. Run the following command, replacing the sample paths with the paths to your own data:
    $ taswira indicators.json sample_data/sample_1/output_files/spatial sample_data/sample_1/output_files/compiled_gcbm_output.db
  4. See the error:
    KeyError: '2016'

Expected behavior
Normally, my browser should automatically open with Taswira's interface.

Screenshots
table

Operating Environment:

  • Ubuntu 20.04.1 LTS

Pitch: Use Dash for the front-end

The front-end would be responsible for showing the raster files served by the back-end. In this document, I describe my rationale for using the Dash framework for this purpose.

Dash

Dash is a Python framework for creating web applications. Under the hood, it uses Flask, React.js and Plotly.js (a graphing library). It's made specially for building data visualisation apps, which is how I came to know about it.

Why Dash?

The CLI tool taswira and the back-end are all written in Python. Wouldn't it be great if we could build even the front-end in Python?

This was my primary motivation behind selecting Dash. The idea of creating a web app written entirely in Python is incredibly appealing to me. It's something that I've never done. I believe that it would be a great learning opportunity for me.

The other reason is that Dash and Terracotta both use Flask. This means that it should be a straightforward process to combine the two. Dash has even provided us with the necessary documentation for this.

Apart from all that, Leaflet, the map library that I've planned to use, has Dash-specific bindings already available. This, I believe, would greatly simplify the process of building the front-end.
