
airflow-kube-setup's Introduction

Hi 👋🏻, I'm Bhavani Ravi

Backend Engineer | Technical Writer | Trainer

Hi, I'm Bhavani Ravi, an independent software consultant specializing in Python backend systems, DevOps infra, and data engineering. I build, teach, write, and speak tech.
Learn By Doing

Here's what I do well

🐍 Backend Systems (Python/Flask/Django) (Node/Express/Prisma) (Postgres/Mongo)

☁️ Deploy systems in Cloud, Docker/Kubernetes environments

🟩 Contribute to open source

🖊 Write blogs at bhavaniravi.com

👩🏻‍🏫 Teach PythonToProject Bootcamp

⌗ Data Engineering

📉 Marketing and Entrepreneurship

airflow-kube-setup's People

Contributors

bhavaniravi, jgn1982, lfreina


airflow-kube-setup's Issues

Namespace not found while deploying

While deploying I was getting a namespace not found error, so I decided to declare the namespace since I did not find it declared anywhere.

The pull request will include some other changes I needed to make in order to deploy the configuration. Maybe some of them are unnecessary, but they are included there :)
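
Declaring the namespace up front is one way to avoid this. A minimal sketch, assuming the airflow-example namespace used elsewhere on this page; apply it before (or at the top of) airflow.yaml so later objects that reference the namespace can be created:

apiVersion: v1
kind: Namespace
metadata:
  name: airflow-example   # must exist before the Deployments/ConfigMaps that target it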

Unable to start the worker pods

Hello,

Whenever I start/schedule a DAG I get the following error on the worker pod:
[screenshot of the worker pod error]
Not sure what is going wrong. Can you help?

Error while running airflow.yaml

error: error validating "airflow.yaml": error validating data: ValidationError(Deployment.spec): missing required field "selector" in io.k8s.api.apps.v1.DeploymentSpec; if you choose to ignore these errors, turn validation off with --validate=false
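
This validation error comes from the apps/v1 Deployment schema, which requires spec.selector; manifests written for the older apps/v1beta1 API often omit it. A minimal sketch of the required shape, assuming the name=airflow label the repo's pods already carry (the image value is a placeholder; keep the containers from the repo's airflow.yaml):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: airflow
  namespace: airflow-example
spec:
  replicas: 1
  selector:                 # required by apps/v1
    matchLabels:
      name: airflow         # must match the template labels below
  template:
    metadata:
      labels:
        name: airflow
    spec:
      containers:
        - name: webserver
          image: your-airflow-image:tag   # placeholder image
          args: ["webserver"]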


Init error

The repo overall is very helpful, but when I try to deploy the code, the airflow pod fails at init with init:error.
Following is the error I see:
[screenshot of the init container error]

I think the SQLAlchemy modules are installed with the airflow image. I tried installing the SQLAlchemy packages separately and still see the same issue.
Can you please guide me through this?

Out-of-the-box issues with namespace airflow-example

I cloned the repo and made the specific changes according to the readme/blog post. When airflow.yaml is created and launched, I get an error that it cannot find airflow-configmap.

MountVolume.SetUp failed for volume "airflow-configmap" : configmap "airflow-configmap" not found.

If I change all the namespace values to default in airflow.yaml, this works well enough, but that doesn't seem to be the intended design of this repo. The webserver/scheduler seem to be designed to work within their own namespace and communicate back to the default namespace for the DB backend/configmaps.

Am I missing a setup step?
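
A pod can only mount a ConfigMap that lives in its own namespace, so if the webserver/scheduler run in airflow-example the ConfigMap has to be created there as well. A minimal sketch, assuming the airflow-configmap name from the error above (the airflow.cfg contents are illustrative only):

apiVersion: v1
kind: ConfigMap
metadata:
  name: airflow-configmap
  namespace: airflow-example   # must match the namespace the webserver/scheduler pods run in
data:
  airflow.cfg: |
    [core]
    dags_folder = /root/airflow/dags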

Scheduler and postgres errors

I have deployed airflow on GKE and the pods are running fine, but when I check the logs of the scheduler and the postgres database I see the following errors:
postgres pod:
[screenshot of the postgres pod logs]

scheduler container:
[screenshot of the scheduler container logs]
Please provide some insights on these errors.

Trouble getting KubernetesPodOperator in a DAG running

This repo and the blog post are really good, but I'm having difficulty using the KubernetesPodOperator. When I use it I get a "/root/.kube/config not found" error in the Airflow log. I'm running the example_kubernetes_operator DAG. Any suggestions?
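
One likely cause: when in-cluster mode is not enabled, the operator falls back to reading /root/.kube/config, which does not exist inside the pod. A hedged sketch of the relevant fragment of the airflow-configmap, assuming Airflow 1.10's [kubernetes] config section, so the operator authenticates with the mounted service-account token instead:

data:
  airflow.cfg: |
    [kubernetes]
    # use the pod's service-account credentials instead of /root/.kube/config
    in_cluster = True
    namespace = airflow-example

The same effect can usually be achieved per task by passing in_cluster=True to the operator itself.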

Issue with RBAC permission resources

When launching a custom DAG using KubernetesPodOperator, airflow correctly launched the pod but returned a failed status on the job. The log produced this error:

"status":"Failure","message":"pods \"airflow-dag-c4eaf498\" is forbidden: User \"system:serviceaccount:airflow-example:airflow\" cannot get resource \"pods/log\"

This was caused by the RBAC role not having the correct permissions in its resources. It should include "pods/log" to allow the airflow user to read the pod logs.

Once this edit was made, the logs were accessible and airflow showed success.

REF: https://kubernetes.io/docs/reference/access-authn-authz/rbac/ (Referring to resources).
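
A minimal sketch of the edit described above, assuming the Role is named airflow and lives in the airflow-example namespace mentioned in the error (names and verbs may differ in the repo's rbac manifest):

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: airflow
  namespace: airflow-example
rules:
  - apiGroups: [""]
    resources: ["pods", "pods/log"]   # "pods/log" lets the airflow service account read pod logs
    verbs: ["get", "list", "watch", "create", "delete"]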

Unable to start the airflow component

Hi Bhavani,
I have tried to set up airflow using the script you shared, building my own docker image (mayurdhande007/my-airflow-example:airflow_v1.1 as well as mayurdhande007/my-airflow-example:airflow_v1.2), and ran deploy.sh as suggested after updating the AIRFLOW_IMAGE and AIRFLOW_TAG values in deploy.sh.
After this, it spun up the postgres pod but fails to start the init container.
PFB the events for the pods:

root@cks-master:~/airflow/airflow-kube-setup/scripts/kube# kubectl describe pod airflow-f499b4866-x748r
Name: airflow-f499b4866-x748r
Namespace: airflow-example
Priority: 0
Node: cks-worker/10.148.0.3
Start Time: Sun, 16 May 2021 05:23:03 +0000
Labels: name=airflow
pod-template-hash=f499b4866
Annotations:
Status: Pending
IP: 10.0.0.17
IPs:
IP: 10.0.0.17
Controlled By: ReplicaSet/airflow-f499b4866
Init Containers:
init:
Container ID: docker://621eacb558016a22f74405759b65831ede42ae9e465a67d233d112d12d8bd4e9
Image: mayurdhande007/my-airflow-example:airflow_v1.1
Image ID: docker-pullable://mayurdhande007/my-airflow-example@sha256:f04102718f01f44c3ac9a38ff7ea9ab8b076ec427002a104b32816f9e95e9783
Port:
Host Port:
Command:
bash
Args:
-cx
./tmp/airflow-test-env-init.sh
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Sun, 16 May 2021 05:23:32 +0000
Finished: Sun, 16 May 2021 05:23:34 +0000
Ready: False
Restart Count: 2
Environment:
SQL_ALCHEMY_CONN: <set to the key 'sql_alchemy_conn' in secret 'airflow-secrets'> Optional: false
Mounts:
/root/airflow/airflow.cfg from airflow-configmap (rw,path="airflow.cfg")
/root/airflow/dags from airflow-dags (rw)
/var/run/secrets/kubernetes.io/serviceaccount from airflow-token-bq59v (ro)
Containers:
webserver:
Container ID:
Image: mayurdhande007/my-airflow-example:airflow_v1.1
Image ID:
Port: 8080/TCP
Host Port: 0/TCP
Args:
webserver
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Environment:
AIRFLOW_KUBE_NAMESPACE: airflow-example (v1:metadata.namespace)
SQL_ALCHEMY_CONN: <set to the key 'sql_alchemy_conn' in secret 'airflow-secrets'> Optional: false
Mounts:
/root/airflow/airflow.cfg from airflow-configmap (rw,path="airflow.cfg")
/root/airflow/dags from airflow-dags (rw)
/root/airflow/logs from airflow-logs (rw)
/var/run/secrets/kubernetes.io/serviceaccount from airflow-token-bq59v (ro)
scheduler:
Container ID:
Image: mayurdhande007/my-airflow-example:airflow_v1.1
Image ID:
Port:
Host Port:
Args:
scheduler
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Environment:
AIRFLOW_KUBE_NAMESPACE: airflow-example (v1:metadata.namespace)
SQL_ALCHEMY_CONN: <set to the key 'sql_alchemy_conn' in secret 'airflow-secrets'> Optional: false
Mounts:
/root/airflow/airflow.cfg from airflow-configmap (rw,path="airflow.cfg")
/root/airflow/dags from airflow-dags (rw)
/root/airflow/logs from airflow-logs (rw)
/var/run/secrets/kubernetes.io/serviceaccount from airflow-token-bq59v (ro)
Conditions:
Type Status
Initialized False
Ready False
ContainersReady False
PodScheduled True
Volumes:
airflow-dags:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: airflow-dags
ReadOnly: false
airflow-dags-fake:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit:
airflow-dags-git:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit:
airflow-logs:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: airflow-logs
ReadOnly: false
airflow-configmap:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: airflow-configmap
Optional: false
airflow-token-bq59v:
Type: Secret (a volume populated by a Secret)
SecretName: airflow-token-bq59v
Optional: false
QoS Class: BestEffort
Node-Selectors:
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message


Normal Scheduled 52s default-scheduler Successfully assigned airflow-example/airflow-f499b4866-x748r to cks-worker
Normal Pulled 47s kubelet Successfully pulled image "mayurdhande007/my-airflow-example:airflow_v1.1" in 2.856313921s
Normal Pulled 41s kubelet Successfully pulled image "mayurdhande007/my-airflow-example:airflow_v1.1" in 2.743754591s
Normal Pulling 26s (x3 over 49s) kubelet Pulling image "mayurdhande007/my-airflow-example:airflow_v1.1"
Normal Created 23s (x3 over 46s) kubelet Created container init
Normal Started 23s (x3 over 46s) kubelet Started container init
Normal Pulled 23s kubelet Successfully pulled image "mayurdhande007/my-airflow-example:airflow_v1.1" in 2.745180192s
Warning BackOff 6s (x3 over 39s) kubelet Back-off restarting failed container

I have uploaded the updated templates to GitHub at https://github.com/dhandma/my-airflow-setup-k8s.git for reference.
Any help would be appreciated.

Many thanks,
Dhandma

Kubernetes Executor pods unable to pull the SQLAlchemy connection

I am converting a customized airflow helm chart to use the Kubernetes Executor instead of the Celery Executor. I have tried multiple configurations to make it work, but somehow the executor pods always try to connect to the default SQLAlchemy connection. Pasting the error below.
The scheduler and webserver are able to connect to the correct DB using the configuration variable AIRFLOW__CORE__SQL_ALCHEMY_CONN.

I have provided the pod template yaml under the kubernetesPodTemplate.stringOverride key in values.yaml.

Error: (logs from executor pod)
Traceback (most recent call last):
  File "/usr/local/bin/airflow", line 5, in <module>
    from airflow.__main__ import main
  File "/usr/local/lib/python3.8/site-packages/airflow/__init__.py", line 46, in <module>
    settings.initialize()
  File "/usr/local/lib/python3.8/site-packages/airflow/settings.py", line 447, in initialize
    configure_orm()
  File "/usr/local/lib/python3.8/site-packages/airflow/settings.py", line 222, in configure_orm
    engine = create_engine(SQL_ALCHEMY_CONN, connect_args=connect_args, **engine_args)
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/__init__.py", line 525, in create_engine
    return strategy.create(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/strategies.py", line 54, in create
    u = url.make_url(name_or_url)
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/url.py", line 229, in make_url
    return _parse_rfc1738_args(name_or_url)
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/url.py", line 290, in _parse_rfc1738_args
    raise exc.ArgumentError(
sqlalchemy.exc.ArgumentError: Could not parse rfc1738 URL from string ''postgresql+psycopg2://airflow:airflow@postgres:5432/airflow''
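
Note the doubled quotes around the URL in the error: the value the executor pod receives appears to contain literal quote characters, which the rfc1738 parser rejects. A hedged sketch of a pod template for kubernetesPodTemplate.stringOverride that pulls the connection string from the same secret the webserver/scheduler use (names assume the airflow-secrets / sql_alchemy_conn convention seen elsewhere on this page; the image is a placeholder, and the secret value should be stored without surrounding quotes):

apiVersion: v1
kind: Pod
metadata:
  name: placeholder-name          # the executor overrides the pod name at runtime
spec:
  restartPolicy: Never
  containers:
    - name: base                  # the executor's main container is conventionally named "base"
      image: your-airflow-image:tag   # placeholder image
      env:
        - name: AIRFLOW__CORE__SQL_ALCHEMY_CONN
          valueFrom:
            secretKeyRef:
              name: airflow-secrets
              key: sql_alchemy_conn   # keep the value unquoted, e.g. postgresql+psycopg2://...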

Kubernetes crashing as pod cannot be initiated

Defaulted container "webserver" out of: webserver, scheduler, init (init)
Error from server (BadRequest): container "webserver" in pod "airflow-86647577fd-rljv6" is waiting to start: PodInitializing

from kubectl logs
Warning BackOff 2m24s (x574 over 127m) kubelet Back-off restarting failed container
