
analyse-disaster-data

1. Introduction

During a disaster, millions of messages typically arrive, directly or via social media, right when disaster response organizations have the least capacity to filter them and pull out the ones that matter most. Machine learning is critical to helping organizations understand which messages are relevant to them and which to prioritize.

In this repo, I analyze thousands of real disaster messages from Figure Eight, a dataset of pre-labeled tweets and text messages from real-life disasters, to create a model for an API that classifies disaster messages.

To get a better understanding of how to create an ETL pipeline, an NLP pipeline, and a machine learning pipeline, go through these repositories respectively:

2. Prerequisites

To run the Flask app, you need:

python3
the Python packages listed in the requirements.txt file

Install the packages with

pip install -r requirements.txt
To create a conda environment instead, use: conda create --name <env-name> --file requirements.txt (replace <env-name> with a name of your choice)

3. Project Components

There are three components in this project:

  1. ETL Pipeline: First, I prepare the data with an ETL pipeline that processes the message and category data from the CSV files and loads them into a SQLite database. In the Python script, process_data.py, you will find the data cleaning pipeline (a minimal sketch follows this list) that:

    • Loads the messages and categories datasets
    • Merges the two datasets
    • Cleans the data
    • Stores it in a SQLite database
  2. ML Pipeline: Next, a machine learning pipeline reads data from the SQLite database and creates and saves a multi-output supervised learning model. In the Python script, train_classifier.py, you will find the machine learning pipeline (sketched second after this list) that:

    • Loads data from the SQLite database
    • Splits the dataset into training and test sets
    • Builds a text processing and machine learning pipeline
    • Trains and tunes a model using GridSearchCV
    • Outputs results on the test set
    • Exports the final model as a pickle file
  3. Flask Web App: Finally, I create a web application that uses the trained model (the pickle file) to classify incoming messages: an emergency worker can input a new message and get classification results in several categories (see the third sketch below).
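
To make the ETL component concrete, here is a minimal sketch of the kind of cleaning process_data.py performs. The function name run_etl, the table name DisasterMessages, and the exact column handling are illustrative assumptions based on the Figure Eight CSV format, not the repo's actual code:

    import pandas as pd
    from sqlalchemy import create_engine

    def run_etl(messages_csv, categories_csv, db_path):
        # Load and merge the two datasets on their shared "id" column
        messages = pd.read_csv(messages_csv)
        categories = pd.read_csv(categories_csv)
        df = messages.merge(categories, on="id")

        # Split the raw "categories" string (e.g. "related-1;offer-0;...")
        # into one binary column per category
        cats = df["categories"].str.split(";", expand=True)
        cats.columns = [value.split("-")[0] for value in cats.iloc[0]]
        for col in cats:
            cats[col] = cats[col].str[-1].astype(int)

        # Clean: replace the raw string column and drop duplicate rows
        df = pd.concat([df.drop(columns=["categories"]), cats], axis=1)
        df = df.drop_duplicates()

        # Store the result in a SQLite database
        engine = create_engine(f"sqlite:///{db_path}")
        df.to_sql("DisasterMessages", engine, index=False, if_exists="replace")

    if __name__ == "__main__":
        run_etl("disaster_messages.csv", "disaster_categories.csv", "Database.db")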
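
Likewise, a minimal sketch of the ML component. The pipeline shown here (TF-IDF features feeding a random forest wrapped in MultiOutputClassifier, tuned with GridSearchCV) is one reasonable reading of the steps above; the repo's train_classifier.py may differ in estimators, parameters, and column names:

    import pickle
    import pandas as pd
    from sqlalchemy import create_engine
    from sklearn.model_selection import train_test_split, GridSearchCV
    from sklearn.pipeline import Pipeline
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.multioutput import MultiOutputClassifier

    # Load data from the SQLite database (table and column names assumed
    # to match the ETL sketch above)
    engine = create_engine("sqlite:///../data/Database.db")
    df = pd.read_sql_table("DisasterMessages", engine)
    X = df["message"]
    Y = df.drop(columns=["id", "message", "original", "genre"])

    # Split the dataset into training and test sets
    X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2)

    # Text processing + multi-output supervised learning pipeline
    pipeline = Pipeline([
        ("tfidf", TfidfVectorizer()),
        ("clf", MultiOutputClassifier(RandomForestClassifier())),
    ])

    # Train and tune the model with GridSearchCV
    params = {"clf__estimator__n_estimators": [50, 100]}
    model = GridSearchCV(pipeline, param_grid=params, cv=3)
    model.fit(X_train, Y_train)

    # Output results on the test set (exact-match accuracy across all
    # categories) and export the final model as a pickle file
    print("Test accuracy:", model.score(X_test, Y_test))
    with open("model.pkl", "wb") as f:
        pickle.dump(model.best_estimator_, f)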
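
Finally, a minimal single-file sketch of how the web app can serve predictions from the pickled model. The real app is split across __init__.py, routes.py, and app.py (see the structure below); the route name, query parameter, and file paths here are illustrative:

    import pickle
    import pandas as pd
    from flask import Flask, request, render_template
    from sqlalchemy import create_engine

    app = Flask(__name__)

    # Load the trained model and recover the category names from the
    # database columns (assumes id, message, original, genre come first)
    with open("models/model.pkl", "rb") as f:
        model = pickle.load(f)
    engine = create_engine("sqlite:///data/Database.db")
    df = pd.read_sql_table("DisasterMessages", engine)
    category_names = df.columns[4:]

    @app.route("/go")
    def go():
        # Classify the message an emergency worker typed into the form
        query = request.args.get("query", "")
        labels = model.predict([query])[0]
        results = dict(zip(category_names, labels))
        return render_template("go.html", query=query,
                               classification_result=results)

    if __name__ == "__main__":
        app.run(host="0.0.0.0", port=3001)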

4. Structure

Below you can find the file structure of the project:



      - disaster_app
      | - template
      | |- master.html  # main page of web app
      | |- go.html  # classification result page of web app
      | - static
      | |- imgs
      | | |- githublogo.png 
      | | |- linkedinlogo.png 
      |- __init__.py  # Initial Flask file that runs app
      |- routes.py # Flask route file

      - data
      |- disaster_categories.csv  # data to process 
      |- disaster_messages.csv  # data to process
      |- process_data.py
      |- ETL Pipeline Preparation.ipynb  # details about creating the ETL pipeline
      |- Database.db   # database 
      
      - models
      |- train_classifier.py
      |- utils.py 
      |- ML Pipeline Preparation.ipynb  # details about creating the ML pipeline
      |- model.pkl  # saved model (the trained model is too big to host in this repo; rerun train_classifier.py to regenerate it)
      
      - README.md
      - app.py 
      

5. Instructions for running the Python scripts

Run the following commands from the directory of each script to set up your database and model:

  • To run the ETL pipeline, which cleans the data and stores it in the database:

           python process_data.py  --f1 disaster_messages.csv  --f2 disaster_categories.csv  --o Database.db
    
  • To run the ML pipeline, which trains the classifier and saves it:

           python train_classifier.py  --f1 ../data/Database.db
    
  • Run the following command in the app's directory to launch the web app:

           python app.py
           then go to http://0.0.0.0:3001/ in your browser
    
  • To get more information about deploying this app to the cloud, go through the Deploy the web app to the cloud step in this repository.

6. Screenshots of the web app
