nusa

Another set of Dataflow templates, intended to complement the DataflowExamples from Google. I created this repository because I want to learn more about Java, Apache Beam, and Dataflow templates. I will try to keep the Apache Beam version in pom.xml up to date.

Disclaimer: Much of the code in this repo is copied from the DataflowExamples; I only modified some configuration, such as reading from a Pub/Sub subscription instead of a Pub/Sub topic. 🙇

Preparation

This repository uses Java and Apache Maven.

If both are installed, you will see something like this:

▶ java -version
openjdk version "1.8.0_272"
OpenJDK Runtime Environment (AdoptOpenJDK)(build 1.8.0_272-b10)
OpenJDK 64-Bit Server VM (AdoptOpenJDK)(build 25.272-b10, mixed mode)
▶ mvn -version
Apache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f)
Maven home: /usr/local/Cellar/maven/3.6.3_1/libexec
Java version: 1.8.0_272, vendor: AdoptOpenJDK, runtime: /Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home/jre
Default locale: en_ID, platform encoding: UTF-8
OS name: "mac os x", version: "10.15.1", arch: "x86_64", family: "mac"
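
If you would rather check for both tools in one step before building, a minimal sketch follows. The `check_tool` helper is hypothetical (it is not part of this repository) and only tests whether a command is on your `PATH`; it does not verify versions.

```shell
# check_tool NAME: succeeds if NAME is found on PATH (hypothetical helper).
check_tool() {
  command -v "$1" >/dev/null 2>&1
}

# Report the status of the two tools this repository relies on.
for tool in java mvn; do
  if check_tool "$tool"; then
    echo "$tool: found"
  else
    echo "$tool: missing"
  fi
done
```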

Creating Template

Based on this reference:

mvn compile exec:java \
     -Dexec.mainClass=com.irchanbani.beam.PubsubSubscriptionToAvro \
     -Dexec.cleanupDaemonThreads=false \
     -Dexec.args=" \
     --runner=DataflowRunner \
     --enableStreamingEngine \
     --diskSizeGb=30 \
     --project=[YOUR_PROJECT_ID] \
     --region=[YOUR_BUCKET_REGION] \
     --tempLocation=gs://[YOUR_BUCKET_NAME]/temp \
     --stagingLocation=gs://[YOUR_BUCKET_NAME]/staging \
     --templateLocation=gs://[YOUR_BUCKET_NAME]/templates/[BEAM_VERSION]/<template-name>"

You also need to copy the metadata file into the same folder as the template.

gsutil cp metadata/Cloud_PubSub_Subscription_to_Avro_metadata gs://[YOUR_BUCKET_NAME]/templates/[BEAM_VERSION]/<template-name>_metadata
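
Because the template path appears in several commands, it can help to build it once from variables. The sketch below is only an illustration: the helper functions and all values are placeholders, the template name is assumed from the metadata file name in this repository, and it assumes the metadata object is stored next to the template as `<template-name>_metadata`. It only prints the resulting arguments; it does not call `mvn` or `gsutil`.

```shell
# Placeholder values for illustration only; substitute your own.
BUCKET_NAME="example-bucket"
BEAM_VERSION="x.y.z"
TEMPLATE_NAME="Cloud_PubSub_Subscription_to_Avro"

# template_path BUCKET VERSION NAME: prints the template's Cloud Storage path.
template_path() {
  printf 'gs://%s/templates/%s/%s\n' "$1" "$2" "$3"
}

# metadata_path BUCKET VERSION NAME: prints the matching metadata path,
# assuming the <template-name>_metadata naming convention.
metadata_path() {
  printf '%s_metadata\n' "$(template_path "$1" "$2" "$3")"
}

TEMPLATE_LOCATION="$(template_path "$BUCKET_NAME" "$BEAM_VERSION" "$TEMPLATE_NAME")"
METADATA_LOCATION="$(metadata_path "$BUCKET_NAME" "$BEAM_VERSION" "$TEMPLATE_NAME")"

# Print the pieces you would paste into the commands above.
echo "--templateLocation=$TEMPLATE_LOCATION"
echo "gsutil cp metadata/${TEMPLATE_NAME}_metadata $METADATA_LOCATION"
```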

Run Dataflow

Based on this reference:

gcloud dataflow jobs run [JOB_NAME] \
    --gcs-location gs://[YOUR_BUCKET_NAME]/templates/[BEAM_VERSION]/<template-name> \
    --region [REGION_ID] \
    --network [NETWORK] \
    --subnetwork [SUBNETWORK] \
    --max-workers [MAX_WORKERS] \
    --worker-machine-type [WORKER_MACHINE_TYPE] \
    --disable-public-ips \
    --parameters \
inputSubscription=projects/[PROJECT_ID]/subscriptions/[SUBSCRIPTIONS_ID],\
outputDirectory=gs://[BUCKET_NAME],\
outputFilenamePrefix=[PREFIX],\
outputFilenameSuffix=[SUFFIX],\
inputAttributeTimestamp=[PUBSUB_TIMESTAMP_ATTRIBUTE],\
inputAttributeId=[PUBSUB_ID_ATTRIBUTE],\
numShards=[NUM_SHARDS],\
avroTempDirectory=gs://[BUCKET_NAME]/tmp/
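
The `--parameters` flag takes a single comma-separated list of `key=value` pairs, which is easy to mistype across continuation lines. As a rough sketch, you could assemble that list with a small helper; `join_params` is hypothetical and the values below are placeholders, not defaults of this template.

```shell
# join_params KEY=VALUE...: prints the arguments joined by commas.
# The body runs in a subshell so the IFS change does not leak out.
join_params() (
  IFS=,
  printf '%s\n' "$*"
)

# Placeholder values for illustration only.
PARAMS="$(join_params \
  "inputSubscription=projects/example-project/subscriptions/example-sub" \
  "outputDirectory=gs://example-bucket" \
  "numShards=1")"

# Print the flag you would append to the gcloud command above.
echo "--parameters $PARAMS"
```

This simple join assumes no value contains a comma; values that do would need gcloud's alternate-delimiter escaping (see `gcloud topic escaping`).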

Contribution
