This project is a sample implementation of a data warehouse pipeline using Apache Airflow. It demonstrates the steps required to extract data from a source database, transform it, and load it into a star schema in a data warehouse.
The `dags` folder contains the DAGs for each dimension table and the fact table. Each DAG is responsible for running the tasks that extract, transform, and load the data for that particular table.
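The extract → transform → load flow each DAG runs can be sketched as three plain-Python task callables, of the kind an Airflow `PythonOperator` would wrap. The function names, the `dim_customer` columns, and the in-memory hand-off are illustrative assumptions, not this project's actual task code — the real DAGs may exchange data through files or XCom instead.

```python
import csv
import io


def extract(csv_text):
    """Read raw rows from a source CSV export."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def transform(rows):
    """Clean rows and assign surrogate keys for a dimension table."""
    out = []
    for key, row in enumerate(rows, start=1):
        out.append({
            "customer_key": key,  # surrogate key, independent of the source id
            "customer_name": row["name"].strip().title(),
            "country": row["country"].strip().upper(),
        })
    return out


def load(dim_rows, target):
    """Append transformed rows to the warehouse table (a list stands in here)."""
    target.extend(dim_rows)
    return len(dim_rows)


# Example run on a tiny CSV sample:
sample = "name,country\nalice smith,us\nbob jones,uk\n"
warehouse_table = []
load(transform(extract(sample)), warehouse_table)
```

In a real DAG these three callables would be wired into tasks with dependencies set as `extract >> transform >> load`.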
The `data` folder contains sample CSV files used as the source data for this project.
The `scripts` folder contains the SQL scripts used to create the source tables and the star schema in the data warehouse.
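For a feel of what the star-schema DDL looks like, here is a minimal sketch using SQLite as a stand-in warehouse: dimension tables with surrogate keys and a fact table referencing them. The table and column names are illustrative assumptions; the project's actual DDL lives in `scripts` and targets your real warehouse.

```python
import sqlite3

# Illustrative star schema: two dimensions plus one fact table whose
# foreign keys point at the dimensions' surrogate keys.
DDL = """
CREATE TABLE dim_customer (
    customer_key  INTEGER PRIMARY KEY,
    customer_name TEXT NOT NULL
);
CREATE TABLE dim_date (
    date_key  INTEGER PRIMARY KEY,
    full_date TEXT NOT NULL
);
CREATE TABLE fact_sales (
    sale_id      INTEGER PRIMARY KEY,
    customer_key INTEGER NOT NULL REFERENCES dim_customer (customer_key),
    date_key     INTEGER NOT NULL REFERENCES dim_date (date_key),
    amount       REAL NOT NULL
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
```

Keeping all descriptive attributes in the dimensions and only keys plus measures in the fact table is what makes the fact table narrow and fast to scan.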
To run this project, you will need:

- Apache Airflow installed and configured
- Access to a data warehouse
To set up and run the pipeline:

- Clone this repository to your local machine.
- Create a virtual environment and activate it.
- Install the required packages with `pip install -r requirements.txt`.
- Create the necessary tables in your source database.
- Update the connection IDs in the DAGs to match your Airflow connections.
- Start the Airflow scheduler and webserver.
- Trigger the DAGs to start the data pipeline.
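The last two steps can be run from a terminal with the standard Airflow 2.x CLI. The DAG id `dim_customer_dag` below is a hypothetical placeholder; substitute the actual ids defined in this project's `dags` folder.

```shell
# Start the scheduler and webserver (run each in its own terminal,
# or background them as shown here).
airflow scheduler &
airflow webserver --port 8080 &

# Trigger a DAG manually; replace dim_customer_dag with a real DAG id
# from the dags/ folder.
airflow dags trigger dim_customer_dag

# Optionally, watch run status from the CLI instead of the web UI.
airflow dags list-runs -d dim_customer_dag
```

Triggering the dimension DAGs before the fact-table DAG keeps the fact table's foreign keys resolvable.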
This project was created by SAID AIT OUAKOUR. Feel free to use and modify it as needed.