de-zoomcamp2024's Introduction

Data Engineering Zoomcamp by DataTalks.club

Course Overview

Module 1: Containerization and Infrastructure as Code

  • Course Overview: Introduction to the course and its objectives.
  • Introduction to GCP: An overview of Google Cloud Platform.
  • Docker and docker-compose: Understanding containerization with Docker.
  • Running Postgres locally with Docker: Setting up a local Postgres instance using Docker.
  • Setting up infrastructure on GCP with Terraform: Introduction to Infrastructure as Code using Terraform.
  • Preparing the environment for the course: Setting up the development environment.
  • Homework: Practical exercises to reinforce learning.

Module 2: Workflow Orchestration

  • Data Lake: Understanding the concept of a Data Lake.
  • Workflow orchestration: Managing data workflows efficiently.
  • Workflow orchestration with Mage: Practical use of the Mage tool.
  • Homework: Hands-on assignments related to workflow orchestration.

Workshop 1: Data Ingestion

  • Practical workshop focusing on data ingestion techniques.

Module 3: Data Warehouse

  • Data Warehouse: Overview of Data Warehousing concepts.
  • BigQuery: Introduction to Google's BigQuery.
  • Partitioning and clustering: Optimizing data storage in BigQuery.
  • BigQuery best practices: Efficient practices for using BigQuery.
  • Internals of BigQuery: Understanding the inner workings.
  • BigQuery Machine Learning: Exploring machine learning capabilities in BigQuery.

Module 4: Analytics Engineering

  • Basics of analytics engineering: Foundational concepts in analytics engineering.
  • dbt (data build tool): Introduction to dbt for building analytics.
  • BigQuery and dbt: Integrating dbt with BigQuery.
  • Postgres and dbt: Utilizing dbt with Postgres.
  • dbt models: Creating and managing dbt models.
  • Testing and documenting: Ensuring data quality and documentation practices.
  • Deployment to the cloud and locally: Strategies for deploying analytics.
  • Visualizing the data with Google Data Studio and Metabase: Data visualization tools.

Module 5: Batch Processing

  • Batch processing: Overview of batch processing.
  • What is Spark: Introduction to Apache Spark.
  • Spark DataFrames: Working with Spark DataFrames.
  • Spark SQL: Executing SQL queries in Spark.
  • Internals: GroupBy and joins: Understanding Spark internals for grouping and joining.

Module 6: Streaming

  • Introduction to Kafka: Basics of Apache Kafka.
  • Schemas (Avro): Implementing data schemas with Avro.
  • Kafka Streams: Working with Kafka Streams.
  • Kafka Connect and KSQL: Utilizing Kafka Connect and KSQL for stream processing.

Workshop 2: Stream Processing with SQL

  • Practical workshop focused on stream processing using SQL.

de-zoomcamp2024's People

Contributors

muhammadatef
