A demo project that uses Spark Streaming to find popular hashtags in Twitter data streams. Tweets are read from the Twitter Streaming API and published to Kafka. The consumer service reads them from Kafka and processes them as a stream with Spark Streaming.
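To make the hashtag analysis concrete, here is a minimal sketch of the per-batch counting step in plain Java. It is not the project's Spark code: the class name `HashtagCountSketch`, its method names, and the hashtag pattern are assumptions chosen for illustration, and a real Spark Streaming job would express the same extract-then-count logic over a stream of micro-batches.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical illustration only; names are not taken from the project.
public class HashtagCountSketch {
    // '#' followed by one or more ASCII letters, digits, or underscores (assumed pattern).
    private static final Pattern HASHTAG = Pattern.compile("#(\\w+)");

    /** Extracts hashtags from one tweet text, lower-cased and without the leading '#'. */
    public static List<String> extractHashtags(String text) {
        List<String> tags = new ArrayList<>();
        Matcher matcher = HASHTAG.matcher(text);
        while (matcher.find()) {
            tags.add(matcher.group(1).toLowerCase(Locale.ROOT));
        }
        return tags;
    }

    /** Counts hashtag occurrences over one batch of tweet texts. */
    public static Map<String, Integer> countHashtags(List<String> tweets) {
        Map<String, Integer> counts = new HashMap<>();
        for (String tweet : tweets) {
            for (String tag : extractHashtags(tweet)) {
                counts.merge(tag, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Returns up to n hashtags, most frequent first; ties are broken alphabetically. */
    public static List<String> topHashtags(Map<String, Integer> counts, int n) {
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Integer>comparingByValue().reversed()
                        .thenComparing(Map.Entry.<String, Integer>comparingByKey()))
                .limit(n)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> batch = Arrays.asList(
                "Learning #Spark with #Kafka",
                "#spark streaming demo",
                "No tags here");
        // prints [spark, kafka]
        System.out.println(topHashtags(countHashtags(batch), 2));
    }
}
```

Lower-casing the tags before counting treats `#Spark` and `#spark` as the same hashtag, which is a design choice of this sketch rather than something the project is known to do.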
- Apache Maven 3.x
- JVM 8
- Docker machine
- A registered Twitter application. The following guide may also be helpful: How to create a Twitter application.
- Change the Twitter configuration in \twitter-producer\src\main\resources\application.properties
- Run docker-compose with the following command:
$ docker-compose up -d
- Check that ZooKeeper and Kafka are running (from the command prompt)
- Launch twitter-producer app:
$ cd twitter-producer
$ mvn spring-boot:run
- Launch spark-consumer app:
$ cd spark-consumer
$ mvn spring-boot:run
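
The steps above ask you to check that ZooKeeper and Kafka are running before launching the apps. One way to do that from the JVM is a simple TCP reachability probe, sketched below. The class name `PortCheckSketch` is hypothetical, and the host and ports (`localhost`, 2181 for ZooKeeper, 9092 for Kafka) are the usual defaults, assumed here rather than taken from this project's docker-compose file. With Docker Machine the host is typically the machine's IP rather than `localhost`.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper; not part of the project.
public class PortCheckSketch {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    public static boolean isPortOpen(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host unreachable.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed defaults; adjust to the host and ports your docker-compose setup exposes.
        String host = args.length > 0 ? args[0] : "localhost";
        System.out.println("ZooKeeper reachable: " + isPortOpen(host, 2181, 2000));
        System.out.println("Kafka reachable: " + isPortOpen(host, 9092, 2000));
    }
}
```

An open port only shows that something is listening there. It does not prove the broker is healthy, so treat this as a quick sanity check.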