
kafka-connect-adls's Introduction

Kafka Connect ADLS Sink Connector Overview

This Kafka Connect sink connector reads records from a Kafka topic and writes them concurrently to Azure Data Lake Store (ADLS) Gen 1.

Prerequisites

This connector authenticates with an Azure Service Principal, so you need an active Azure App Registration. The registered app provides the App ID and the Client Secret key; the Azure Active Directory properties provide the Directory ID.
All three values are required for the Sink config.

Configuration

ADLSSinkConnector

The connector will read from a topic and write to the specified ADLS Gen 1 Directory. The default file rotation pattern is hourly.
The directory structure is: /<specified dir path>/<topic name>/<current date>/<current hour>/file
This writing pattern can be easily extended as needed.
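
The layout above can be sketched in a few lines of Java. This is only an illustration, not the connector's actual code: the class name `AdlsPathSketch`, the `yyyy-MM-dd` and `HH` patterns, and the use of the file prefix as the file name are all assumptions made for the example.

```java
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper; the connector's real class and date patterns may differ.
class AdlsPathSketch {
    // Assumed formats for the <current date> and <current hour> path segments.
    private static final DateTimeFormatter DATE = DateTimeFormatter.ofPattern("yyyy-MM-dd");
    private static final DateTimeFormatter HOUR = DateTimeFormatter.ofPattern("HH");

    // Builds /<dir path>/<topic>/<date>/<hour>/<file> for the given timestamp.
    static String buildPath(String dirPath, String topic, String fileName, ZonedDateTime now) {
        // Drop a trailing slash so the joined path has no empty segment.
        String base = dirPath.endsWith("/") ? dirPath.substring(0, dirPath.length() - 1) : dirPath;
        return String.join("/", base, topic, now.format(DATE), now.format(HOUR), fileName);
    }

    public static void main(String[] args) {
        ZonedDateTime t = ZonedDateTime.of(2024, 3, 5, 14, 30, 0, 0, ZoneOffset.UTC);
        // prints /data/landing/orders/2024-03-05/14/orders-events
        System.out.println(buildPath("/data/landing", "orders", "orders-events", t));
    }
}
```

Because the hour is part of the path, a new directory starts every hour, which matches the default hourly rotation described above.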

name=adls-sink-connector1
tasks.max=3
connector.class=com.github.arao.kafka.connect.adls.ADLSSinkConnector

# Set these required values
adls.dir.path= <ADLS directory to place the files, directory will be created if not pre-existing>
file.prefix= <file name prefix - friendly name to define the data files>
auth.token.endpoint= <the login auth url - https://login.microsoftonline.com/<directory ID/tenant id>/oauth2/token>
account.fqdn= <the ADLS store's fully qualified domain name>
client.id= <azure app id>
client.key= <azure app client secret key>
topic= <topic to be used>
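
Before starting the connector, it can help to check that none of the required values was left blank. The sketch below is a hypothetical pre-flight check, not part of the connector: the class `AdlsSinkConfigCheck` and its methods are invented for illustration. The key names and the token endpoint pattern are taken from the listing above.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical pre-flight check; not part of the connector itself.
class AdlsSinkConfigCheck {
    // Required keys, copied from the configuration listing above.
    static final List<String> REQUIRED = List.of(
            "adls.dir.path", "file.prefix", "auth.token.endpoint",
            "account.fqdn", "client.id", "client.key", "topic");

    // Fills the Directory ID (tenant ID) into the login URL pattern shown above.
    static String tokenEndpoint(String directoryId) {
        return "https://login.microsoftonline.com/" + directoryId + "/oauth2/token";
    }

    // Returns the required keys that are absent or blank in the given properties.
    static List<String> missingKeys(Map<String, String> props) {
        return REQUIRED.stream()
                .filter(k -> props.get(k) == null || props.get(k).isBlank())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("client.id", "my-app-id");                            // placeholder value
        props.put("auth.token.endpoint", tokenEndpoint("my-tenant-id")); // placeholder tenant
        System.out.println("Missing keys: " + missingKeys(props));
    }
}
```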

Building on your workstation

git clone git@github.com:ArchanaRa0/kafka-connect-adls.git
cd kafka-connect-adls
mvn clean package

Starting Connect Distributed with ADLS Sink Connector plugins

./bin/start.sh

Starting an ADLS Sink

cd config
curl -X POST -d @adls-sink-config.json http://localhost:port/connectors --header "Content-Type: application/json"
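
The contents of adls-sink-config.json are not shown here. The Kafka Connect REST API expects the body of a POST to /connectors to be a JSON object with a `name` field and a `config` object of string properties. The sketch below shows one way to produce that shape from the properties listed under Configuration. The class `ConnectPayloadSketch` is hypothetical, the values are placeholders, and the escaping only covers backslashes and double quotes.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper that renders connector properties as a REST payload.
class ConnectPayloadSketch {
    // Minimal JSON string quoting: escapes backslashes and double quotes only.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Wraps the properties in the {"name": ..., "config": {...}} shape.
    static String toPayload(String name, Map<String, String> config) {
        String body = config.entrySet().stream()
                .map(e -> quote(e.getKey()) + ":" + quote(e.getValue()))
                .collect(Collectors.joining(","));
        return "{\"name\":" + quote(name) + ",\"config\":{" + body + "}}";
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the properties in insertion order.
        Map<String, String> config = new LinkedHashMap<>();
        config.put("connector.class", "com.github.arao.kafka.connect.adls.ADLSSinkConnector");
        config.put("tasks.max", "3");
        config.put("adls.dir.path", "/data/landing"); // placeholder value
        config.put("file.prefix", "orders-events");   // placeholder value
        config.put("topic", "orders");                // placeholder value
        // The remaining required keys (auth.token.endpoint, account.fqdn,
        // client.id, client.key) would be added the same way.
        System.out.println(toPayload("adls-sink-connector1", config));
    }
}
```

The printed JSON can be saved as adls-sink-config.json and posted with the curl command above.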
