
Kafka Demo App Using Ruby and Docker

Introduction

This application is a demo of using Apache Kafka for message passing in the Ruby programming language. The whole environment and application are packaged into Docker containers.

Requirements

  • docker
  • docker-compose

Installing and Running

$ docker-compose up

The first run may take a while, as it downloads the Kafka, Zookeeper, and Ruby images and builds the application.

Description

When the app starts, it consists of four separate units besides Zookeeper and Kafka. Each unit is a separate process, and these processes do not need to be on the same machine. The units are:

  1. API
  2. Processor
  3. Subscriber One
  4. Subscriber Two

The main API listens on port 8080 of the host OS and accepts requests via HTTP. You can send a new request using the curl command below, which requests the weather for sol 1 on Mars. A sol on Mars is the equivalent of a day on Earth.

$ curl -X GET http://localhost:8080/sol/1

The API unit then puts the request into a Kafka queue, the requests queue, as a message. The processor unit consumes the requests queue: it processes each message and retrieves the sol weather from the Mars Atmospheric Aggregation System (MAAS). After getting the weather, it puts the result into another Kafka topic, the results topic, as a message.
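For illustration, here is a minimal sketch of the API unit's producing side, using the Poseidon gem the app is built on. The topic name "requests", the client id, and the broker address are assumptions of this sketch, not taken from the repository:

require 'sinatra'
require 'poseidon'

# Connect to the Kafka broker (the address is an assumption for this sketch).
producer = Poseidon::Producer.new(["kafka:9092"], "api_producer")

# On each HTTP request, enqueue the requested sol number as a message
# on the requests topic.
get '/sol/:sol' do
  producer.send_messages([Poseidon::MessageToSend.new("requests", params[:sol])])
  "Request for sol #{params[:sol]} queued\n"
end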

There are also two subscriber units, which poll the results topic and consume its messages. When new results arrive in the topic, they pick them up and display the sol weather.
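A subscriber's polling loop might look roughly like this with Poseidon's PartitionConsumer; the topic name "results" and partition 0 are assumptions here:

require 'poseidon'

# Read partition 0 of the results topic from the beginning.
consumer = Poseidon::PartitionConsumer.new(
  "subscriber_one", "kafka", 9092, "results", 0, :earliest_offset)

loop do
  consumer.fetch.each do |message|
    puts "Sol weather: #{message.value} (offset #{message.offset})"
  end
  sleep 1 # simple polling interval
end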

These steps and the units’ outputs are displayed in the application log. A sample output is shown below.

(Screenshot of a sample application log: ./screenshot.png)

The first column says that the logs are from the Ruby container. The second column shows the name of the process (or unit). The API unit has received the HTTP request and put it in the queue. The processor unit has picked it up, fetched the weather, and put the sol weather results into the topic. Each subscriber unit (Subscriber1 and Subscriber2) has received the message and displays the weather.

Technical Details

The whole application is written in Ruby. It uses Sinatra and Rack to serve the API. For communicating with Kafka, it uses the Poseidon gem. The Kafka consumers and producers are in the kafka directory.

The four unit processes are described in the Procfile and are managed by foreman, which runs them as daemons. The units are in the units directory. Each line in the Procfile describes one process of the application.
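For illustration only, such a Procfile might look like the lines below; the actual commands and file names in the repository may differ:

api: bundle exec rackup -p 8080
processor: bundle exec ruby units/processor.rb
subscriber_one: bundle exec ruby units/subscriber_one.rb
subscriber_two: bundle exec ruby units/subscriber_two.rb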

Kafka Details, Topics and Queues

Kafka does not hold extra metadata about the consumers of messages. It simply keeps all messages of a topic in a log and assigns an offset to each message. It is the responsibility of the clients to know and keep their offset; they can request any offset they like and start consuming from there. This makes Kafka very flexible.

These logs are called partitions. Each topic can have one or many partitions, and clients can select the partition they are interested in. Each message of a topic goes into exactly one partition, and Kafka ensures that all messages with the same key end up in the same partition.

If we want a traditional queue that balances messages between clients, we can assign each client to one partition of the topic. Since messages are divided between partitions, they are effectively divided between clients, and each message is consumed exactly once, by only one client.
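As a sketch of that pattern with Poseidon, each client simply opens its own partition of the shared topic (the partition numbers and client ids are illustrative):

require 'poseidon'

# Each client owns one partition, so messages spread across the two
# partitions are effectively balanced between the two clients.
client_a = Poseidon::PartitionConsumer.new(
  "client_a", "kafka", 9092, "requests", 0, :earliest_offset)
client_b = Poseidon::PartitionConsumer.new(
  "client_b", "kafka", 9092, "requests", 1, :earliest_offset)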

If we want a publish/subscribe topic with multiple clients subscribed to it, we can use consumer groups. Kafka delivers each message to every consumer group once, so if each client has its own consumer group, a message is consumed multiple times. Alternatively, we can use one consumer group but tell each client to consume one partition and keep the offsets itself; then Kafka does not care who has consumed a message, or whether it has been consumed at all.

In this application I produce messages to one partition of one topic with the same key, and tell the consumers to consume that partition while keeping the offsets themselves. If I had more than one processor unit, I would have to use consumer groups and assign a different partition to each processor. That way requests would be balanced between the processors, and each request would be processed only once.
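A rough sketch of a consumer keeping its own offset, here persisted in a local file between runs (the file name, topic, and broker address are assumptions of this sketch):

require 'poseidon'

OFFSET_FILE = "subscriber.offset" # hypothetical location for the stored offset

# Resume from the stored offset, or start from the beginning on the first run.
start = File.exist?(OFFSET_FILE) ? File.read(OFFSET_FILE).to_i : :earliest_offset
consumer = Poseidon::PartitionConsumer.new(
  "subscriber_one", "kafka", 9092, "results", 0, start)

loop do
  messages = consumer.fetch
  messages.each { |m| puts m.value }
  # Remember the next offset to read; Kafka never tracks it for us.
  File.write(OFFSET_FILE, messages.last.offset + 1) unless messages.empty?
  sleep 1
end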

Docker Images

The application uses three Docker images: Kafka, Zookeeper, and Ruby. Zookeeper is needed because Kafka uses it to manage clusters and their nodes. The Ruby image is the main image for our application and is built from the provided Dockerfile. These images are in the public Docker Hub registry and are downloaded and built automatically the first time the above command runs.

These images are configured and run using docker-compose; the configuration is in the docker-compose.yml file. The main ports of Zookeeper and Kafka are exposed to the host OS, and the containers are linked. The application is accessible on port 8080 from the host OS.
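A minimal sketch of such a docker-compose.yml (the image names and ports below are assumptions; the real file is in the repository):

zookeeper:
  image: wurstmeister/zookeeper
  ports:
    - "2181:2181"
kafka:
  image: wurstmeister/kafka
  ports:
    - "9092:9092"
  links:
    - zookeeper
app:
  build: .
  ports:
    - "8080:8080"
  links:
    - zookeeper
    - kafka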

Issues

Each time before running the app, first stop and remove the containers using the commands below, then start it normally as above. Otherwise Kafka and Zookeeper fail to communicate.

$ docker-compose stop
$ docker-compose rm
