The goal of this project is to read, transform, and store JSON data using an Apache Kafka producer/consumer architecture.
- Producer: Read the JSON from an HTTPS endpoint as a stream, process it incrementally, and send chunks to Kafka (see the sketch after this list).
- Consumers: Read messages from Kafka and write them to a CSV file, terminating gracefully when a shutdown message is received.
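As a rough sketch of the producer side, the streaming read can be done with `encoding/json`'s token API, assuming the Confluent Kafka Go client and a top-level `dataset` array as in data.json catalogs. The hard-coded values below stand in for the real flags and are illustrative only:

```go
package main

import (
	"encoding/json"
	"log"
	"net/http"

	"github.com/confluentinc/confluent-kafka-go/v2/kafka"
)

func main() {
	// Illustrative hard-coded values; the real binary takes these as flags.
	url := "https://open.gsa.gov/data.json"
	topic := "data-transform"

	p, err := kafka.NewProducer(&kafka.ConfigMap{"bootstrap.servers": "localhost:9092"})
	if err != nil {
		log.Fatal(err)
	}
	defer p.Close()

	resp, err := http.Get(url)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	// Stream the body rather than loading it all into memory: scan tokens
	// until the "dataset" key (a naive scan that assumes no earlier string
	// token has that value), then decode one array element at a time.
	dec := json.NewDecoder(resp.Body)
	for {
		tok, err := dec.Token()
		if err != nil {
			log.Fatal(err)
		}
		if key, ok := tok.(string); ok && key == "dataset" {
			break
		}
	}
	if _, err := dec.Token(); err != nil { // consume the opening '['
		log.Fatal(err)
	}
	for dec.More() {
		var item json.RawMessage
		if err := dec.Decode(&item); err != nil {
			log.Fatal(err)
		}
		// Produce is asynchronous; a real producer would also drain
		// p.Events() to check delivery reports.
		if err := p.Produce(&kafka.Message{
			TopicPartition: kafka.TopicPartition{Topic: &topic, Partition: kafka.PartitionAny},
			Value:          item,
		}, nil); err != nil {
			log.Fatal(err)
		}
	}
	p.Flush(15000) // wait up to 15s for in-flight messages
}
```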
- Go
- Kafka broker
- Confluent Kafka Go client library (Optional)
$ confluent local kafka start
Note the Plaintext Ports printed in your terminal, which you will use when configuring the producer and consumer clients in upcoming steps.
$ go build producer.go
$ go build consumer.go
The project uses command-line arguments to configure settings such as the Kafka broker URL, the topic name, and the number of consumers. Provide these arguments when running each binary.
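This kind of configuration is typically wired up with Go's standard `flag` package; the sketch below mirrors a few of the option names documented next, but the project's actual parsing code may differ:

```go
package main

import (
	"flag"
	"fmt"
)

func main() {
	// Flag names and defaults mirror the options documented below.
	broker := flag.String("kafka-broker", "localhost:9092", "URL of the Kafka broker")
	topic := flag.String("kafka-topic", "data-transform", "name of the Kafka topic")
	consumers := flag.Int("consumers", 8, "number of consumers")
	flag.Parse()

	fmt.Printf("broker=%s topic=%s consumers=%d\n", *broker, *topic, *consumers)
}
```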
Producer
- -url: The endpoint from which the data will be fetched. Default: https://open.gsa.gov/data.json
- -kafka-broker: The URL of the Kafka broker. Default: localhost:9092
- -kafka-topic: The name of the Kafka topic. Default: data-transform
- -partitions: The number of partitions. Default: 8
- -repl-factor: The replication factor for the topic, matching your Kafka setup. Default: 1
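The -partitions and -repl-factor flags suggest the producer provisions the topic itself. A hedged sketch of doing so with the Confluent client's AdminClient follows; the project's actual approach may differ:

```go
package main

import (
	"context"
	"log"
	"time"

	"github.com/confluentinc/confluent-kafka-go/v2/kafka"
)

// createTopic sketches how -partitions and -repl-factor might be applied:
// create the topic up front via the AdminClient.
func createTopic(broker, topic string, partitions, replFactor int) error {
	admin, err := kafka.NewAdminClient(&kafka.ConfigMap{"bootstrap.servers": broker})
	if err != nil {
		return err
	}
	defer admin.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()

	results, err := admin.CreateTopics(ctx, []kafka.TopicSpecification{{
		Topic:             topic,
		NumPartitions:     partitions,
		ReplicationFactor: replFactor,
	}})
	if err != nil {
		return err
	}
	for _, r := range results {
		// A topic that already exists is fine on re-runs.
		if r.Error.Code() != kafka.ErrNoError && r.Error.Code() != kafka.ErrTopicAlreadyExists {
			return r.Error
		}
	}
	return nil
}

func main() {
	if err := createTopic("localhost:9092", "data-transform", 8, 1); err != nil {
		log.Fatal(err)
	}
}
```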
Consumer
- -kafka-broker: The URL of the Kafka broker. Default: localhost:9092
- -kafka-topic: The name of the Kafka topic. Default: data-transform
- -group-id: The consumer group ID. Default: data-transform-group
- -consumers: The number of consumers. Default: 8
- -output: The name of the output CSV file for transformed data. Default: output.csv
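Below is a minimal sketch of one consumer in the group; the real binary presumably runs -consumers of these loops concurrently. The hard-coded values stand in for the flags above, and the "SHUTDOWN" sentinel is a hypothetical placeholder for the project's actual shutdown message:

```go
package main

import (
	"encoding/csv"
	"log"
	"os"
	"time"

	"github.com/confluentinc/confluent-kafka-go/v2/kafka"
)

func main() {
	c, err := kafka.NewConsumer(&kafka.ConfigMap{
		"bootstrap.servers": "localhost:9092",
		"group.id":          "data-transform-group",
		"auto.offset.reset": "earliest",
	})
	if err != nil {
		log.Fatal(err)
	}
	defer c.Close()

	if err := c.SubscribeTopics([]string{"data-transform"}, nil); err != nil {
		log.Fatal(err)
	}

	f, err := os.Create("output.csv")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()
	w := csv.NewWriter(f)
	defer w.Flush()

	for {
		msg, err := c.ReadMessage(time.Second)
		if err != nil {
			// A timeout just means no message arrived yet; keep polling.
			if kerr, ok := err.(kafka.Error); ok && kerr.Code() == kafka.ErrTimedOut {
				continue
			}
			log.Fatal(err)
		}
		// "SHUTDOWN" is an illustrative sentinel for graceful termination.
		if string(msg.Value) == "SHUTDOWN" {
			return
		}
		// Real code would transform the JSON into CSV columns here;
		// this sketch writes the raw payload as a single field.
		if err := w.Write([]string{string(msg.Value)}); err != nil {
			log.Fatal(err)
		}
	}
}
```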
For example, with every flag given explicitly:
$ ./consumer -kafka-broker=localhost:9092 -kafka-topic=data-transform -group-id=data-transform-group -consumers=8 -output=output.csv
$ ./producer -url=https://open.gsa.gov/data.json -kafka-broker=localhost:9092 -kafka-topic=data-transform -partitions=8 -repl-factor=1
Or, to accept the defaults, simply run
$ ./consumer
$ ./producer
The two programs run independently, and it is recommended to run them in separate terminals.