
digitalpreservation-dmp's Introduction


Correlating Alcohol Consumption and UFO Sightings in the USA

This experiment explores the connection between per-capita alcohol consumption and the number of UFO sightings in the USA.

Prerequisites

Before running the experiment, make sure that the following folders exist:

  • data/raw - Folder to store the external datasets
  • data/processed - Folder to store the intermediate dataset
  • reports/figures - Target folder for generated correlation plot
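The folders above can be created in one step. The following is a minimal sketch, assuming it is run from the repository root; the helper name `ensure_dirs` is hypothetical and not part of this repository.

```python
from pathlib import Path

# Folders listed as prerequisites in this README.
REQUIRED_DIRS = ["data/raw", "data/processed", "reports/figures"]


def ensure_dirs(root="."):
    """Create each required folder under root if it is missing; return the paths."""
    created = []
    for rel in REQUIRED_DIRS:
        path = Path(root) / rel
        # parents=True creates intermediate folders; exist_ok=True makes reruns safe.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling `ensure_dirs()` without arguments creates the folders relative to the current working directory, and calling it again changes nothing.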

Data Sources

The cited data sources have already been added to this repository.

Follow these instructions if you want to use updated versions of these data sources:

  1. Download CSV files to folder data/raw
  2. Set paths to CSV files in notebook 01_data-preprocessing.ipynb by changing the values of UFO_SIGHTINGS and ALC_CONSUMPTION
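As an illustration of step 2, the two settings might look like the sketch below. The file names are hypothetical placeholders, and whether the notebook stores the paths as strings or as `Path` objects is an assumption; `missing_inputs` is a hypothetical helper for checking that the downloads are in place.

```python
from pathlib import Path

RAW_DIR = Path("data/raw")

# Hypothetical file names: replace them with the names of the files you downloaded.
UFO_SIGHTINGS = RAW_DIR / "ufo_sightings.csv"
ALC_CONSUMPTION = RAW_DIR / "alcohol_consumption.csv"


def missing_inputs(paths):
    """Return the subset of paths that do not point to an existing file."""
    return [p for p in paths if not Path(p).is_file()]
```

Running `missing_inputs([UFO_SIGHTINGS, ALC_CONSUMPTION])` before the notebook gives an early hint if a path is wrong.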

Running the code

To run the code in this repository, you need a machine with Python (version 3.5 or later) and pip.

Run pip install -r requirements.txt to install the required dependencies.

Once the dependencies have been installed, start the Jupyter notebook server by running jupyter notebook and open http://localhost:8888 in your browser.

In the notebooks folder you'll find the following notebooks:

01_data-preprocessing.ipynb

Running this notebook generates a dataset containing the number of UFO sightings and the alcohol consumption in the USA per year. It does so by preprocessing and aggregating the data provided by the data sources mentioned above.

The resulting dataset is located at data/processed/ufo_alcohol.csv
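The per-year aggregation described above can be pictured roughly as follows. This is a dependency-free sketch, not the notebook's actual code: the column name `date`, the output field names, and the input values are all made up for illustration.

```python
import csv
import io
from collections import Counter


def sightings_per_year(rows, date_field="date"):
    """Count sightings per year; assumes each date starts with a four-digit year."""
    return Counter(int(row[date_field][:4]) for row in rows)


def merge_by_year(sightings, consumption):
    """Combine both inputs, keeping only years present in both, sorted ascending."""
    years = sorted(set(sightings) & set(consumption))
    return [
        {"year": y, "ufo_sightings": sightings[y], "alcohol_consumption": consumption[y]}
        for y in years
    ]


# Made-up placeholder inputs, not real measurements.
ufo_csv = io.StringIO("date,state\n2000-01-03,TX\n2000-07-19,CA\n2001-02-11,NY\n")
alc = {2000: 1.0, 2001: 2.0, 2002: 3.0}

merged = merge_by_year(sightings_per_year(csv.DictReader(ufo_csv)), alc)
```

Restricting the result to years found in both inputs is one possible design choice; it avoids rows with missing values at the cost of dropping years that only one source covers.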

02_visualization.ipynb

This notebook takes the data generated by running 01_data-preprocessing.ipynb as input and generates a plot to visualize correlations between the data points.

The resulting plot is stored at reports/figures/correlation.png
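Alongside the plot, a correlation can also be expressed as a single number. The sketch below computes the Pearson correlation coefficient with the standard library only; it is an illustrative assumption about how the relationship could be quantified, not a description of what the notebook does, and `pearson` is a hypothetical helper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)
```

A value near 1 or -1 indicates a strong linear relationship and a value near 0 a weak one; in neither case does the coefficient say anything about causation.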

Docker

Run docker build . to create a Docker image of this repository. The resulting image exposes the Jupyter notebook server on port 8888.

Start a container via docker run -i -p 8888:8888 <IMAGE_ID> to launch a Jupyter instance. The console output shows the URL to open in your browser to view the code, e.g.

 Copy/paste this URL into your browser when you connect for the first time,
    to login with a token:
        http://0.0.0.0:8888/?token=<SOME_TOKEN>

Architecture

System Architecture Diagram

digitalpreservation-dmp's People

Contributors

mdietrichstein, sorx
