data2day-2022

Public repository containing the material for the 2022 data2day conference.

More on the program: From PoCs to Large Scale ML Operationalization Covering the End-to-End Pipeline.

Developers

This repository is owned and maintained by the E-Breuninger Developer Team.

For any feedback or inquiries related to this repository, you can contact Olivier Bénard (Data Software Engineer).

Dependencies

The dependencies are managed via poetry. We recommend using this tool and integrating it into your workflow. However, if you decide otherwise, the necessary requirements are also listed in the requirements.txt file.

Note: You might have to switch your Python version. We recommend pyenv as a Python version manager; it can be installed via brew install pyenv.

Quick Start

To install all the dependencies and rapidly start getting your hands dirty:

  1. Create a settings.toml file based on the following template:

     [default]
     LOG_LEVEL = "DEBUG"
     LATITUDE = "<google-map-latitude>"
     LONGITUDE = "<google-map-longitude>"
     APP_PATH = "/absolute/path/to/the/local/repository/"

  2. Create a .secrets.toml file based on the following template (you can leave the default value if you have no key):

     [default]
     google_map_api_key = "<your-google-map-api-key>"

  3. Install all the dependencies in the virtual environment via poetry:

     poetry install

  4. You are ready to go and can start the Jupyter notebook kernel:

     make notebook

The only thing left to do is to navigate through notebooks/ and play with the notebooks.

Bonus: If you want to publish some changes, you first need to install pre-commit:

    make pre-commit-install

This helps ensure that the code you push meets the project's development standards and that the GitHub CI/CD pipeline succeeds, i.e. that your code will be accepted.

Notes:

  • If you do not have poetry already, you need to install it via brew install poetry.
  • The Google Map API key is used to display the weather stations on Google Maps. You do not need it, however: the developer mode, which is activated by default if you have no key or an invalid one, offers fewer features but also does the job.
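
The fallback described in the last note can be pictured with a minimal sketch. The function name resolve_map_mode and the mode labels "full" and "developer" are hypothetical and do not come from the repository. The sketch only checks whether a key looks set; detecting an invalid key would require an actual call to the Google API, which is left out here.

```python
def resolve_map_mode(api_key):
    """Return "full" when a key looks set, otherwise "developer".

    Hypothetical helper: the name and the mode labels are assumptions
    made for illustration, not the repository's own API.
    """
    key = (api_key or "").strip()
    # The .secrets.toml template uses an angle-bracket placeholder
    # when no key has been filled in.
    is_placeholder = key.startswith("<") and key.endswith(">")
    if not key or is_placeholder:
        return "developer"
    return "full"


print(resolve_map_mode("<your-google-map-api-key>"))
```

With the untouched template value, the sketch selects the developer mode, which matches the behaviour described in the note.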

Architecture

  • The data2day_2022/ folder contains the reusable parts of the code, such as the SQL queries and the helpers package.
  • The datasets/ folder contains the template you have to fill in to make the forecast.
  • The notebooks/ folder contains a couple of Jupyter notebooks holding the main logic of the code.
  • The results/ folder contains the results generated by the notebooks.
  • The slides/ folder contains the anonymised presentation in .pdf format.
  • The tests/ folder contains a couple of unit tests for the code.
  • The .pre-commit-config.yaml file contains the checks executed at commit time, before the code can be pushed.
  • The Makefile contains a series of recurring commands, e.g. make check or make notebook.
  • The .secrets.toml and settings.toml files are parametrisation files containing the variables used in the code.
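
To illustrate how the two parametrisation files fit together, here is a minimal sketch that reads both [default] tables and merges them. It is not the repository's own loader: the README does not say which library the project uses (the layout resembles what tools such as Dynaconf expect), so the sketch relies on the standard tomllib module. The load_config function, the REQUIRED_KEYS tuple, and the coordinate and path values are assumptions made for illustration.

```python
try:
    import tomllib  # standard library from Python 3.11 onwards
except ModuleNotFoundError:  # older interpreters need: pip install tomli
    import tomli as tomllib

# Keys taken from the settings.toml template in the Quick Start section.
REQUIRED_KEYS = ("LOG_LEVEL", "LATITUDE", "LONGITUDE", "APP_PATH")


def load_config(settings_text, secrets_text=""):
    """Merge the [default] tables of both files (hypothetical helper)."""
    settings = tomllib.loads(settings_text).get("default", {})
    secrets = tomllib.loads(secrets_text).get("default", {})
    missing = [key for key in REQUIRED_KEYS if key not in settings]
    if missing:
        raise KeyError("settings.toml is missing: " + ", ".join(missing))
    # On a key clash, the value from the secrets file wins.
    return {**settings, **secrets}


# Illustrative values only; replace them with your own.
example_settings = """
[default]
LOG_LEVEL = "DEBUG"
LATITUDE = "48.78"
LONGITUDE = "9.18"
APP_PATH = "/tmp/data2day-2022/"
"""
example_secrets = """
[default]
google_map_api_key = "<your-google-map-api-key>"
"""

config = load_config(example_settings, example_secrets)
print(config["LOG_LEVEL"], config["APP_PATH"])
```

In practice the two strings would come from reading settings.toml and .secrets.toml from disk; inline strings keep the sketch self-contained.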

Running your own forecast

  • You can parametrise the series you want to predict using the datasets/customer_frequentation.csv file. Fill it with your own data, respecting the following template:
date quantity
<YYYY-MM-DD> <float>
  • Rainfall data for Stuttgart in 2018 has already been retrieved and collected in the results/weather_prpc.csv file. You can, however, query the initial tables on BigQuery yourself using notebooks/weather_data_on_biqguery.ipynb. The results are written to the results/ folder.
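
Before running a forecast, it can help to check that your own series follows the date / quantity template shown above. The sketch below is one possible way to do so and is not part of the repository: the read_series function is hypothetical, the file is assumed to be comma-separated (the README does not state the delimiter), and the sample rows are made up.

```python
import csv
import io
from datetime import datetime


def read_series(handle):
    """Parse (date, quantity) rows, raising ValueError on malformed input.

    Hypothetical helper; assumes a comma-separated file with the header
    "date,quantity".
    """
    reader = csv.DictReader(handle)
    if reader.fieldnames != ["date", "quantity"]:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    rows = []
    # Data starts on line 2, right after the header line.
    for line_number, row in enumerate(reader, start=2):
        try:
            day = datetime.strptime(row["date"], "%Y-%m-%d").date()
            quantity = float(row["quantity"])
        except (TypeError, ValueError) as error:
            raise ValueError(f"line {line_number}: {error}") from error
        rows.append((day, quantity))
    return rows


# Made-up sample standing in for datasets/customer_frequentation.csv.
sample = io.StringIO("date,quantity\n2018-01-01,120.0\n2018-01-02,98.5\n")
series = read_series(sample)
print(len(series), "rows parsed")
```

To check a real file, pass an open file object instead of the in-memory sample, e.g. open("datasets/customer_frequentation.csv", newline="").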

Troubleshooting

The troubleshooting section is empty so far. Should you encounter any issue not covered by the current documentation, please contact us.
