example-cassandra-alpakka-twitter's Introduction

Alpakka Cassandra and Twitter

This project is a Scala application which uses Alpakka Cassandra 2.0, Akka Streams and Twitter4S (Scala Twitter Client) to pull new Tweets from Twitter for a given hashtag (or set of hashtags) using Twitter API v1.1 and write them into a local Cassandra database.

NOTE: The project saves only original tweets (retweets are skipped), and it currently stores only the truncated text of each tweet (<=140 chars).


Requirements

  • Scala 2.12+
  • JDK 8
  • sbt (this project uses 1.4.9)
  • Docker (and required RAM for running a Cassandra container)

Table of Contents

  1. Set up and run a local Cassandra using Docker
  2. Configure Twitter API keys
  3. Set up hashtags and run the project using sbt
  4. Observe results in Cassandra using cqlsh

1. Cassandra Setup

1.1 - Make sure you have Docker installed on your machine. Run the following command to start a local Cassandra container with port 9042 exposed:

docker run -p 9042:9042 --rm --name my-cassandra -d cassandra

1.2 - Make sure your container is running (may need to give the container a few minutes to boot up):

docker ps -a

In the output, the STATUS column shows how long the container has been running, and the PORTS column should show local port 9042 bound to port 9042 in the container (the default port for Cassandra).
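Because the container can take a few minutes to boot, a small retry loop can stand in for checking by hand. The following is a minimal sketch, not part of the project: the function names retry and cassandra_ready are hypothetical, and the probe assumes that cqlsh inside the my-cassandra container exits with a non-zero status until Cassandra accepts connections.

```shell
#!/usr/bin/env bash

# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS attempts have failed.
retry() {
  local max_attempts="$1" delay="$2"
  shift 2
  local attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "gave up after $attempt attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Hypothetical probe: succeeds once cqlsh in the container can run a trivial command.
cassandra_ready() {
  docker exec my-cassandra cqlsh -e 'DESCRIBE KEYSPACES' >/dev/null 2>&1
}
```

For example, retry 30 5 cassandra_ready polls every 5 seconds for up to 30 attempts and returns a non-zero status if Cassandra never becomes reachable.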

1.3 - Next, run CQLSH on the container in interactive terminal mode to set up the keyspace and table:

docker exec -it my-cassandra cqlsh

1.4 - Once CQLSH comes up, create the necessary keyspace and table for this demo.

CREATE KEYSPACE testkeyspace WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'} AND durable_writes = true;

CREATE TABLE testkeyspace.testtable(id bigint PRIMARY KEY, excerpt text);

INSERT INTO testkeyspace.testtable(id, excerpt)
VALUES (37, 'appletest');

exit
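The keyspace and table can also be created without opening an interactive session, which is convenient when the container is recreated often. This is a sketch under assumptions: create_schema is a hypothetical helper name, the statements are passed through cqlsh's -e option, and IF NOT EXISTS has been added (it is not in the statements above) so the function can be run repeatedly.

```shell
#!/usr/bin/env bash

# create_schema
# Creates the demo keyspace and table in the my-cassandra container.
# IF NOT EXISTS makes both statements safe to run more than once.
create_schema() {
  docker exec my-cassandra cqlsh -e "CREATE KEYSPACE IF NOT EXISTS testkeyspace WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'} AND durable_writes = true;" &&
  docker exec my-cassandra cqlsh -e "CREATE TABLE IF NOT EXISTS testkeyspace.testtable (id bigint PRIMARY KEY, excerpt text);"
}
```

Calling create_schema once the container is up leaves the same testkeyspace.testtable in place as the interactive steps, minus the sample row.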

2. Twitter Setup

2.1 - From the root folder of this repository, find the application.conf.example file in src/main/resources. Copy it into the same directory under the name application.conf:

cp src/main/resources/application.conf.example src/main/resources/application.conf

2.2 - Go to the Twitter developer dashboard, register an application, and insert its four Twitter API credentials into this portion of application.conf:

twitter {
  consumer {
    key = "consumer-key-here"
    secret = "consumer-secret-here"
  }
  access {
    key = "access-key-here"
    secret = "access-token-here"
  }
}
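It is easy to start the project with one of the template values still in place. The following sketch flags that case before running; has_placeholders is a hypothetical helper, and it assumes the template values all follow the "...-here" pattern shown above.

```shell
#!/usr/bin/env bash

# has_placeholders FILE
# Exits 0 if FILE still contains a template value such as "consumer-key-here",
# and non-zero once every value has been replaced.
has_placeholders() {
  grep -Eq '=[[:space:]]*"[a-z-]+-here"' "$1"
}
```

For example, has_placeholders src/main/resources/application.conf && echo "fill in the Twitter API keys first" >&2 prints a warning while any of the four values is unchanged.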

3. Running The Project

3.1 - From the root folder of the project, open src/main/scala/com/alptwitter/AlpakkaTwitter.scala and edit the line val trackedWords = Seq("#myHashtag") so that it lists the hashtags whose new tweets you want to track:

vim src/main/scala/com/alptwitter/AlpakkaTwitter.scala

To track more than one hashtag, add more strings to the sequence, separated by commas, for example Seq("#myHashtag", "#myOtherHashtag").

3.2 - The project can then be run by navigating to the root folder of the project and running:

sbt run

As new tweets containing any of the hashtags listed in the trackedWords variable are posted, the console prints a message for each one saying whether it was a retweet or a unique tweet.


4. Observe Tables

4.1 - As new tweets (not retweets) with your chosen hashtags are posted and found, they are saved to Cassandra as a (tweet id, tweet text) row in testkeyspace.testtable. To check that the tweets are being saved, run CQLSH on the Cassandra container and query the table:

docker exec -it my-cassandra cqlsh
SELECT * FROM testkeyspace.testtable; 
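While the stream is running, it can be handy to peek at the table from the host without entering the CQLSH prompt. This is a minimal sketch: show_saved_tweets is a hypothetical helper, and the default LIMIT of 10 is an arbitrary choice.

```shell
#!/usr/bin/env bash

# show_saved_tweets [LIMIT]
# Prints up to LIMIT rows (default 10) from testkeyspace.testtable.
show_saved_tweets() {
  local limit="${1:-10}"
  # Accept only digits so nothing unexpected ends up in the CQL statement.
  case "$limit" in
    ''|*[!0-9]*)
      echo "LIMIT must be a non-negative integer: $limit" >&2
      return 2
      ;;
  esac
  docker exec my-cassandra cqlsh -e "SELECT id, excerpt FROM testkeyspace.testtable LIMIT ${limit};"
}
```

Running show_saved_tweets 5 a few times while sbt run is active should show new rows appearing as matching tweets arrive.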

Twitter4S (Twitter for Scala) Github Repository

Twitter4S definition of Tweet object

Alpakka Cassandra Documentation

Contributors

adp8ke, nikita311
