Crypto-Reddit-Data-Streaming

Springboard Data Engineering Capstone Project

Overview

This project streams Reddit and yFinance data. The data feeds a visualization that lets us compare how many posts appear in the selected subreddits during a given timeframe with the price changes of the currency over the same period. We use Kafka, Python, Google Cloud Postgres, Faust, Confluent plugins, WebSockets, Docker containers, and Plotly for visuals.

Data Source

We use Kafka to stream the subreddit posts for our project. We run the Landoop Kafka Docker container and add the Confluent kafka-connect plugin at the correct file path within the container. We update the .properties files with the subreddits we want to follow and the Kafka topic that should receive the data. We create this topic in Kafka, then run kafka-connect to start the data pipeline. Specific instructions are included in our shell command file.
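As a rough illustration of the configuration step above, the sketch below renders a connector .properties file from a Python dict. Only `name`, `connector.class`, and `tasks.max` are standard Kafka Connect keys; the two `example.*` keys, the connector class value, and the topic name are placeholders, not the plugin's real settings, so check the plugin documentation for the actual names.

```python
def render_properties(settings):
    """Render a dict as Java-style .properties text, one key=value per line."""
    return "".join(f"{key}={value}\n" for key, value in settings.items())


# Hypothetical settings for illustration only.
settings = {
    "name": "reddit-source",                                 # standard Kafka Connect key
    "connector.class": "<connector class from plugin docs>", # placeholder value
    "tasks.max": "1",                                        # standard Kafka Connect key
    "example.subreddits": ",".join(["Bitcoin", "CryptoCurrency"]),  # placeholder key
    "example.topic": "reddit-posts",                         # placeholder key and topic
}

if __name__ == "__main__":
    print(render_properties(settings), end="")
```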

The yFinance data is streamed through a websocket created in our Python file. We send this data directly to Google Cloud Postgres through pgAdmin.
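A minimal sketch of the storage half of this step, assuming each websocket message arrives as JSON text. The field names (`symbol`, `price`, `timestamp`) and the `crypto_prices` table are hypothetical, and `conn` is assumed to be a psycopg2-style DB-API connection; the project's actual message format and schema may differ.

```python
import json

# Hypothetical table and columns; parameters are passed separately so the
# driver handles quoting.
INSERT_SQL = (
    "INSERT INTO crypto_prices (symbol, price, traded_at) "
    "VALUES (%s, %s, %s)"
)


def tick_to_row(message):
    """Parse one JSON price message into a (symbol, price, traded_at) tuple."""
    tick = json.loads(message)
    return (tick["symbol"], float(tick["price"]), tick["timestamp"])


def store_tick(conn, message):
    """Insert one parsed tick using a psycopg2-style connection."""
    with conn.cursor() as cur:
        cur.execute(INSERT_SQL, tick_to_row(message))
    conn.commit()
```

Keeping the parsing in `tick_to_row` separate from the insert makes the message handling easy to check without a database connection.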

Data Transformation

We use Faust to add a timestamp to the Reddit data coming in through Kafka. We then send this data to our cloud database. The yFinance data is sent to the cloud directly from our websocket.
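The timestamping step might look roughly like the sketch below. The app name, broker URL, topic name, and the `received_at` field are assumptions for illustration, not the project's actual values. Faust is imported inside `build_app` so the pure transformation can be used without Faust installed.

```python
from datetime import datetime, timezone


def add_timestamp(post, now=None):
    """Return a copy of the post with a UTC ISO-8601 'received_at' field."""
    now = now or datetime.now(timezone.utc)
    return {**post, "received_at": now.isoformat()}


def build_app(broker="kafka://localhost:9092", topic_name="reddit-posts"):
    """Wire add_timestamp into a Faust agent (names here are hypothetical)."""
    import faust  # lazy import: only needed when the worker is actually built

    app = faust.App("reddit-timestamper", broker=broker)
    posts = app.topic(topic_name, value_serializer="json")

    @app.agent(posts)
    async def stamp(stream):
        async for post in stream:
            stamped = add_timestamp(post)
            print(stamped)  # a database insert would replace this line

    return app
```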

Data Storage

We create a Google Cloud Postgres server, noting its IP address, user, and password, and install pgAdmin locally to connect to the server. We create tables, as well as a trigger that records created_at information. Specific instructions for this are included in the .sql file.
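For orientation, a table with a created_at trigger could be set up along the lines of the sketch below. The table, column, function, and trigger names are hypothetical and are not taken from the project's .sql file. `EXECUTE FUNCTION` requires PostgreSQL 11 or later (older versions use `EXECUTE PROCEDURE`), and `conn` is assumed to be a psycopg2-style connection.

```python
# Hypothetical schema for illustration only.
CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS reddit_posts (
    id          BIGSERIAL PRIMARY KEY,
    subreddit   TEXT NOT NULL,
    title       TEXT,
    created_at  TIMESTAMPTZ
)
"""

# Trigger function that stamps each new row with the insert time.
CREATE_FUNCTION = """
CREATE OR REPLACE FUNCTION set_created_at() RETURNS trigger AS $$
BEGIN
    NEW.created_at := now();
    RETURN NEW;
END;
$$ LANGUAGE plpgsql
"""

# Note: CREATE TRIGGER fails if the trigger already exists.
CREATE_TRIGGER = """
CREATE TRIGGER reddit_posts_set_created_at
BEFORE INSERT ON reddit_posts
FOR EACH ROW EXECUTE FUNCTION set_created_at()
"""

STATEMENTS = (CREATE_TABLE, CREATE_FUNCTION, CREATE_TRIGGER)


def apply_schema(conn):
    """Run the DDL statements in order on a psycopg2-style connection."""
    with conn.cursor() as cur:
        for statement in STATEMENTS:
            cur.execute(statement)
    conn.commit()
```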

Data Visual

We use Python and plotly.express to create visuals of the data for analysis. We create bar graphs of the number of subreddit posts over a given time and compare them to a line graph of the corresponding price of the currency. The visuals are included in the slide deck.
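A minimal sketch of how such a comparison could be built, assuming post timestamps and price samples are already loaded as Python lists. Hourly bucketing, the function names, and the axis labels are illustrative choices, not necessarily what the project uses. Plotly is imported inside `build_figures` so the counting helper works on its own.

```python
from collections import Counter
from datetime import datetime


def count_posts_per_hour(timestamps):
    """Bucket datetimes by hour and return sorted (hour, count) pairs."""
    counts = Counter(
        ts.replace(minute=0, second=0, microsecond=0) for ts in timestamps
    )
    return sorted(counts.items())


def build_figures(post_times, price_times, prices):
    """Return a bar chart of hourly post counts and a line chart of prices.

    Expects at least one post timestamp.
    """
    import plotly.express as px  # lazy import: only needed for plotting

    hours, counts = zip(*count_posts_per_hour(post_times))
    bar = px.bar(
        x=list(hours), y=list(counts),
        labels={"x": "Hour", "y": "Posts"},
        title="Subreddit posts per hour",
    )
    line = px.line(
        x=list(price_times), y=list(prices),
        labels={"x": "Time", "y": "Price"},
        title="Price over the same period",
    )
    return bar, line
```

Calling `.show()` on each returned figure displays the two charts for side-by-side comparison.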

Contributors

madams234
