
glue-streaming-etl-demo's Introduction

AWS serverless ETL and streaming demo

Glue Streaming ETL Demo

This demo shows how to use the AWS Glue streaming feature to manage continuous ingestion pipelines and process data on the fly. Glue streaming jobs extend AWS Glue jobs, which are based on Apache Spark, to run continuously and consume data from streaming platforms such as Amazon Kinesis Data Streams and Apache Kafka (including the fully managed Amazon MSK).

Glue can provision, manage, and scale the infrastructure needed to ingest data into data lakes on Amazon S3 or data warehouses such as Amazon Redshift, to store streaming data in a DynamoDB table for quick lookups, or to send it to Elasticsearch to look for specific operation patterns.

Glue Streaming builds on Spark Structured Streaming to implement data transformations such as aggregating, partitioning, and formatting, as well as joining with other data sets to enrich or cleanse the data for easier analysis.
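To make the aggregating and partitioning steps concrete, here is a minimal plain-Python sketch of a tumbling-window average over one micro-batch. It is not Glue or Spark code; it only illustrates the kind of grouping a Structured Streaming job would express with a window-based `groupBy`. The record fields (`device_id`, `ts`, `temperature`), the 60-second window, and the partition layout are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timezone

WINDOW_SECONDS = 60  # hypothetical window size, not taken from the demo


def window_start(ts, size=WINDOW_SECONDS):
    """Return the start (epoch seconds) of the tumbling window containing ts."""
    return int(ts // size) * size


def aggregate_batch(records, size=WINDOW_SECONDS):
    """Average temperature per (device_id, window start) for one micro-batch."""
    sums = defaultdict(lambda: [0.0, 0])  # key -> [running total, count]
    for rec in records:
        key = (rec["device_id"], window_start(rec["ts"], size))
        sums[key][0] += rec["temperature"]
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


def partition_path(device_id, start):
    """Build a Hive-style partition prefix (assumed layout) for a window."""
    day = datetime.fromtimestamp(start, tz=timezone.utc)
    return f"device_id={device_id}/dt={day:%Y-%m-%d}/hour={day:%H}"
```

In a real Glue streaming job, Spark would maintain these groups across micro-batches and handle late data; the sketch only covers a single batch held in memory.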

For more details, see the Adding Streaming ETL Jobs in AWS Glue guide.

IoT-Kafka-GlueStreaming-Demo

serverless-etl-diagram-kafka

IoT-Kinesis-GlueStreaming-Demo

serverless-etl-diagram

kinesis-kafka-connector-Demo

kinesis-kafka-connector

Kinesis Data Analytics Streaming Demo

This demo shows how to use Kinesis Data Analytics to manage continuous ingestion pipelines and process data on the fly. Kinesis Data Analytics applications run continuously and consume data from streaming platforms such as Amazon Kinesis Data Streams and Apache Kafka (including the fully managed Amazon MSK).

IoT-Kinesis-KinesisDataAnlytics-Demo

kinesis-kda-demo

IoT-Kafka-KinesisDataAnlytics-Demo

kafka-kda-demo

Ingesting RDS data with Glue

This demo shows how to use Glue to ingest data from an RDS database.

Architecture

mysql-glue

Ingest from MySQL 5.7 via the Glue connector

Ingest from MySQL 8.0 via the Glue connector

Connect to an RDS instance with SSL connections enabled

Data-On-Boarding-End2End-Demo

end2end-data-onboarding

Data On Boarding End2End Demo

Python code to send records to S3 via Kinesis Firehose

python-firehose-arch Pyhton-Send-Data-Firefose Demo
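As a rough idea of what sending a record looks like, the sketch below wraps the Firehose `PutRecord` call exposed by boto3 as `put_record`. It is not the demo's actual code: the function name `send_record`, the newline-delimited JSON encoding, and the payload fields are assumptions. The client is passed in so the function can be exercised without AWS access.

```python
import json


def send_record(client, stream_name, payload):
    """Send one JSON record to a Firehose delivery stream.

    `client` is expected to behave like boto3.client("firehose").
    A trailing newline is appended (an assumed convention) so that
    records land in S3 as newline-delimited JSON.
    """
    data = (json.dumps(payload) + "\n").encode("utf-8")
    response = client.put_record(
        DeliveryStreamName=stream_name,
        Record={"Data": data},
    )
    return response["RecordId"]
```

With boto3 installed and AWS credentials configured, a call might look like `send_record(boto3.client("firehose"), "my-demo-stream", {"device_id": "sensor-1", "temperature": 21.5})`, where the stream name is a placeholder for an existing delivery stream.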

IoT-Athena-QuickSight

Build a business intelligence capability for streaming IoT device data using AWS IoT Core, Amazon Firehose, Amazon S3, Amazon Athena and Amazon QuickSight

iot-athen-quicksight-achitect IoT-Athena-QuickSight
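For the Athena step of this pipeline, a query over the data that Firehose lands in S3 can be submitted programmatically. The sketch below uses the boto3 Athena calls `start_query_execution` and `get_query_execution`; the helper name `run_athena_query`, the polling parameters, and any database, table, or bucket names passed to it are assumptions and do not come from the demo.

```python
import time


def run_athena_query(client, sql, database, output_location,
                     poll_seconds=1.0, max_polls=60):
    """Start an Athena query and poll until it reaches a terminal state.

    `client` is expected to behave like boto3.client("athena").
    Returns (query_execution_id, final_state).
    """
    start = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = start["QueryExecutionId"]
    for _ in range(max_polls):
        status = client.get_query_execution(QueryExecutionId=query_id)
        state = status["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            return query_id, state
        time.sleep(poll_seconds)  # still QUEUED or RUNNING
    raise TimeoutError(f"query {query_id} still running after {max_polls} polls")
```

QuickSight would normally query Athena through its own data source connection; a helper like this is mainly useful for checking that the table over the S3 data returns rows before building a dashboard.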
