streamingtwittersentimentanalysis's Introduction

Docker instructions

Overview

A big data project that combines a machine learning algorithm with stream processing on Apache Spark

Here is my post explaining this project in detail

Kafka Consumer Docker

docker run -it --name consumer --expose 8888 --expose 4040 -p 8888:8888 -p 4040:4040 -v ~/IdeaProjects/StreamingTwitterSentimentAnalysis/:/app -d jiangxiaoyong/consumer

Port 4040 serves the Spark UI
Port 8888 is the JVM debugging port

Complete running instructions for the Streaming Twitter Sentiment Analysis project

The complete project comprises four modules:

Twitter producer
Kafka cluster
- Apache ZooKeeper
- Apache Kafka
Kafka consumer
- Spark Streaming
- Naive Bayes Model
Scala-play server

Running procedure and instructions

Start the set of Docker containers

$ docker start zookeeper kafka producer consumer playScala

Run the ZooKeeper and Kafka servers inside their containers

$ docker exec zookeeper /zookeeper-3.4.8/bin/zkServer.sh start
$ docker exec -it kafka /kafka_2.11-0.10.0.0/bin/kafka-server-start.sh /kafka_2.11-0.10.0.0/config/server.properties -d --override zookeeper.connect=${ZOOKEEPER_PORT_2181_TCP_ADDR}:${ZOOKEEPER_PORT_2181_TCP_PORT}
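Kafka needs a reachable ZooKeeper when it starts, so it can help to wait until ZooKeeper answers before launching the broker. The following is a minimal sketch, not part of the project: `retry` is a hypothetical helper that reruns a command until it succeeds or the attempts run out.

```shell
# retry ATTEMPTS DELAY COMMAND [ARGS...]
# Hypothetical helper: run COMMAND until it exits 0, at most ATTEMPTS times,
# sleeping DELAY seconds between tries. Returns 1 if every try fails.
retry() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while :; do
    "$@" && return 0                      # success: stop retrying
    [ "$i" -ge "$attempts" ] && return 1  # out of attempts
    i=$((i + 1))
    sleep "$delay"
  done
}
```

One possible use, assuming `zkServer.sh status` exits with a non-zero code until the server is up: `retry 10 1 docker exec zookeeper /zookeeper-3.4.8/bin/zkServer.sh status`, then run the Kafka command above.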

Run the producer inside the producer container

$ docker exec -it producer /bin/bash
$ cd /app
$ sbt
>run

Run the consumer inside the consumer container

$ docker exec -it consumer /bin/bash
$ cd /app
$ sbt
>run

Run the Scala-play server inside the playScala container

$ docker exec -it playScala /bin/bash
$ cd /app
$ activator run

Connect and send messages to the Play server via WebSocket from the index (home) page of the web client
- see the detailed instructions in the Scala-play repo

Training Naive Bayes Model

Modify the sbt build file to specify which main class is the entry point
The training data set comes from Sentiment140 and contains 1,600,000 training examples
Run the consumer inside the consumer container
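As an alternative to editing the build file, sbt's `runMain` task can select the entry point from the command line. This is a sketch of a different technique from the one described above, and the class name in the usage note is a placeholder, not a name taken from this repository.

```shell
# Hypothetical wrapper around sbt's runMain task.
# $1: fully qualified name of the main class to run.
# Set SBT=echo to preview the sbt arguments without starting sbt.
run_main() {
  "${SBT:-sbt}" "runMain $1"
}
```

Inside the consumer container, from /app, this would be called as `run_main com.example.TrainNaiveBayes`, with the placeholder replaced by the project's actual training class.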

SBT instructions

Debug mode

cd /app
sbt -jvm-debug 8888
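With sbt started in debug mode, a debugger can attach to port 8888, which the consumer container publishes to the host. A minimal sketch using `jdb`, the command-line debugger shipped with the JDK; the helper name and its defaults are assumptions, and an IDE remote-debug configuration pointed at the same port works equally well.

```shell
# Hypothetical helper: attach jdb to a JVM that listens for a debugger.
# $1: host (default localhost), $2: port (default 8888, as published above).
# Set JDB=echo to preview the jdb arguments without attaching.
attach_debugger() {
  "${JDB:-jdb}" -attach "${1:-localhost}:${2:-8888}"
}
```

Running `attach_debugger` on the host attaches to localhost:8888.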
