crmprtd

Utility to download near-real-time weather data and insert it into PCIC's database.

Installation

A Makefile handles the project installation:

make

If you do not wish to use make, follow the instructions below.

For production usage, install the latest tagged release from PCIC's PyPI server.

pip install -i https://pypi.pacificclimate.org/simple crmprtd
# or with JSON logging functionality
pip install -i https://pypi.pacificclimate.org/simple crmprtd[jsonlogger]

Or for development, clone the repo and install it from your local source tree.

git clone git@github.com:pacificclimate/crmprtd
cd crmprtd
virtualenv venv
source venv/bin/activate
pip install -r requirements.txt -i https://pypi.pacificclimate.org/simple
pip install .

Usage

The most common usage pattern for crmprtd is to configure a number of scripts to run on an hourly or daily basis.

Some of the data sources require authentication. For most scripts, credentials can be provided as command line arguments or, preferably, as entries in a YAML config file. A sample version of this file is provided with the project. The file is then sourced by passing its location with the --auth argument and the key with the --auth_key argument.

Each network has a download_[network_name] script which will download data. The standard output stream of this script can be redirected into a file or piped into crmprtd_process. crmprtd_process will take the data and run it through a series of formatting changes and checks before inserting the data into the database.

A list of all available network modules can be found in the online help for crmprtd_process:

(env) james@basalt:~/code/git/crmprtd$ crmprtd_process -h
usage: crmprtd_process [-h] -c CONNECTION_STRING [-D]
                       [--sample_size SAMPLE_SIZE]
                       [-N {bc_env_aq,bc_env_snow,bc_forestry,bc_tran,ec,moti,wamr,wmb}]
                       [-L LOG_CONF] [-l LOG_FILENAME]
                       [-o {DEBUG,INFO,WARNING,ERROR,CRITICAL}]
                       [-m ERROR_EMAIL]

optional arguments:
...
  -N {bc_env_aq,bc_env_snow,bc_forestry,bc_tran,ec,moti,wamr,wmb}, --network {bc_env_aq,bc_env_snow,bc_forestry,bc_tran,ec,moti,wamr,wmb}
                        The network from which the data is coming from. The
                        name will be used for a dynamic import of the module's
                        normalization function.
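
The dynamic import mentioned in the help text can be illustrated with a minimal sketch. The module path template `crmprtd.{network}.normalize` and the function name `normalize` are assumptions made for illustration; this README does not show how the package is laid out internally. Only the list of network names is taken from the help output above.

```python
import importlib

# Network names as listed in the crmprtd_process help output.
NETWORKS = (
    "bc_env_aq", "bc_env_snow", "bc_forestry", "bc_tran",
    "ec", "moti", "wamr", "wmb",
)


def load_network_function(
    network,
    module_template="crmprtd.{network}.normalize",  # hypothetical layout
    func_name="normalize",                           # hypothetical name
    allowed=NETWORKS,
):
    """Import a module chosen by network name and return one of its functions."""
    # Reject unknown names before they reach the import machinery.
    if network not in allowed:
        raise ValueError(f"unknown network: {network!r}")
    module = importlib.import_module(module_template.format(network=network))
    return getattr(module, func_name)
```

Checking the name against a fixed list before importing keeps arbitrary user input from being used as a module path.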

Input/Output Streams

Connecting the I/O of the download scripts to cache files and the processing script is as easy as using Unix pipes and I/O redirects. For example, fetching the SWOB-ML for the BC Forestry data and processing it looks like this:

download_bc_forestry > cache_filename
crmprtd_process -N bc_forestry < cache_filename
# Or
download_bc_forestry | crmprtd_process -N bc_forestry
# Or
download_bc_forestry | tee cache_filename | crmprtd_process -N bc_forestry

More generally:

download_[network_name] > cache_filename
crmprtd_process -N [network_name] < cache_filename
# Or
download_[network_name] | crmprtd_process -N [network_name]
# Or
download_[network_name] | tee cache_filename | crmprtd_process -N [network_name]
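
When the scripts are driven from Python rather than a shell, the cache-then-process pattern above can be sketched with the standard `subprocess` module. This is only an illustration of the redirects: the helper name `download_and_process` is hypothetical and not part of crmprtd, and the commands are passed in as argument lists.

```python
import subprocess


def download_and_process(download_cmd, process_cmd, cache_path):
    """Run download_cmd > cache_path, then process_cmd < cache_path.

    Returns the processing command's standard output as bytes.
    """
    # Step 1: redirect the downloader's standard output into the cache file.
    with open(cache_path, "wb") as cache:
        subprocess.run(download_cmd, stdout=cache, check=True)

    # Step 2: replay the cached bytes as the processor's standard input.
    with open(cache_path, "rb") as cache:
        result = subprocess.run(
            process_cmd, stdin=cache, stdout=subprocess.PIPE, check=True
        )
    return result.stdout


# Example call (the connection string is a placeholder to fill in):
# download_and_process(
#     ["download_bc_forestry"],
#     ["crmprtd_process", "-N", "bc_forestry", "-c", "<connection string>"],
#     "cache_filename",
# )
```

Because `check=True` raises `subprocess.CalledProcessError` on a non-zero exit status, a failed download stops the run before anything is processed, and the cache file remains available for inspection or replay.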

Logging

One thing to be aware of when using pipes and stdout is that you need to ensure that no logging or debugging output from the download script goes to standard out. The default console logger sends logging output to the standard error stream. However, this is configurable, so the user must take care to not configure the logging output to go to standard out, lest it get mixed up with the data output stream.
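
A minimal sketch of a logging setup that keeps the two streams apart, using only Python's standard `logging` module. The logger name `download_sketch` and the helper names are hypothetical; this is not crmprtd's own logging configuration.

```python
import logging
import sys


def configure_logging(level=logging.INFO):
    """Return a logger whose console handler writes to stderr only."""
    logger = logging.getLogger("download_sketch")  # hypothetical logger name
    logger.handlers.clear()   # avoid duplicate handlers on repeated calls
    logger.propagate = False  # keep records away from root handlers

    # StreamHandler defaults to sys.stderr; it is passed explicitly for clarity.
    handler = logging.StreamHandler(sys.stderr)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    logger.addHandler(handler)
    logger.setLevel(level)
    return logger


def emit(data, logger):
    """Write the data payload to stdout and a log line to stderr."""
    logger.info("writing %d characters", len(data))
    sys.stdout.write(data)
```

With this arrangement, a downstream process reading the pipe sees only the data payload, while log lines still appear on the terminal or in whatever file stderr is redirected to.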

Testing

Database tests use the testing.postgresql database fixture. This requires a PostgreSQL server in your PATH with the PostGIS extension. Setup should be as simple as:

apt-get install postgresql postgis
pip install -r test_requirements.txt
py.test -v tests

Releasing

  1. Increment __version__ in setup.py
  2. Summarize release changes in NEWS.md
  3. Commit these changes, then tag the release
git add setup.py NEWS.md
git commit -m"Bump to version x.x.x"
git tag -a -m"x.x.x" x.x.x
git push --follow-tags
  4. Our GitHub Actions workflow will build and release the package
