canonical / namespace-node-affinity-operator
Juju Charm for the Namespace Node Affinity tool
License: Apache License 2.0
The charm gets into a state where it is logging TLS errors while staying Active/Idle. It is not working as expected: it is not injecting the configs specified in its settings_yaml config into the other pods (in the corresponding namespaces).
There is no proper visibility into the state of this charm outside of its logs. It would be nice if the charm's workload status reflected its actual state, and if it could forward its logs to COS (Loki).
N/A
namespace-node-affinity active 1 namespace-node-affinity 0.1/beta 5 REDACTED no
2024/01/29 1835 http: TLS handshake error from REDACTED remote error: tls: bad certificate
2024/01/29 1838 http: TLS handshake error from REDACTED remote error: tls: bad certificate
2024/01/29 1840 http: TLS handshake error from REDACTED remote error: tls: bad certificate
2024/01/29 1842 http: TLS handshake error from REDACTED remote error: tls: bad certificate
2024/01/29 1857 http: TLS handshake error from REDACTED remote error: tls: bad certificate
2024/01/29 1857 http: TLS handshake error from REDACTED remote error: tls: bad certificate
2024/01/29 1857 http: TLS handshake error from REDACTED remote error: tls: bad certificate
2024/01/29 1836 http: TLS handshake error from REDACTED remote error: tls: bad certificate
2024/01/29 1805 http: TLS handshake error from REDACTED remote error: tls: bad certificate
The settings_yaml config:
controller-k8s: |
  nodeSelectorTerms:
    - matchExpressions:
        - key: kubeflowserver
          operator: In
          values:
            - "true"
kubeflow: |
  nodeSelectorTerms:
    - matchExpressions:
        - key: kubeflowserver
          operator: In
          values:
            - "true"
metallb: |
  nodeSelectorTerms:
    - matchExpressions:
        - key: kubeflowserver
          operator: In
          values:
            - "true"
The nodeSelector term injected into DaemonSet pods will be ignored, because when multiple nodeSelectorTerms are associated with nodeAffinity types, the Pod can be scheduled onto a node if any one of the specified nodeSelectorTerms is satisfied.
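Within a single nodeSelectorTerm, matchExpressions and matchFields are ANDed, while separate terms are ORed. So for the injected expression to actually constrain a DaemonSet pod, it would need to be merged into the DaemonSet's existing term rather than appended as a new one. An illustrative sketch (not what the charm currently produces; `<node-name>` is a placeholder):

```yaml
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchFields:           # the DaemonSet's own per-node term
            - key: metadata.name
              operator: In
              values:
                - <node-name>
          matchExpressions:      # injected expression, ANDed within the same term
            - key: kubeflowserver
              operator: In
              values:
                - "true"
```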
1. juju deploy metallb --channel 1.28/stable --trust --config namespace=metallb
2. juju deploy namespace-node-affinity --trust
3. kubectl label namespaces metallb namespace-node-affinity=enabled
4. Create a settings.yaml:
~$ cat settings.yaml
metallb: |
  nodeSelectorTerms:
    - matchExpressions:
        - key: kubeflowserver
          operator: In
          values:
            - "true"
5. SETTINGS_YAML=$(cat settings.yaml)
   juju config namespace-node-affinity settings_yaml="$SETTINGS_YAML"
6. kubectl delete pods -n metallb metallb-0 (the operator pod for the metallb app)
7. kubectl get pods -n metallb metallb-0 -o yaml; in the yaml for the newly created pod, we can see:
  ownerReferences:
    - apiVersion: apps/v1
      blockOwnerDeletion: true
      controller: true
      kind: StatefulSet
      name: metallb
      uid: 5214cbc3-20b5-41ba-b7f9-69f2c16da6ca
  resourceVersion: "84536"
  uid: c0bf5e77-8b82-43e8-8670-e8ee3ecd43f9
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: kubeflowserver
                operator: In
                values:
                  - "true"
Also check the speaker pod speaker-ngkg8, which was spawned by the charm pod, before deletion: kubectl get pods -n metallb speaker-ngkg8 -o yaml
  ownerReferences:
    - apiVersion: apps/v1
      blockOwnerDeletion: true
      controller: true
      kind: DaemonSet
      name: speaker
      uid: 9332e100-c788-421d-b6a7-08751831a22d
  resourceVersion: "83722"
  uid: 1fffc42d-6d24-446f-bade-8ca1d7de6a15
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchFields:
              - key: metadata.name
                operator: In
                values:
                  - vm-0
  containers:
kubectl delete pods -n metallb speaker-ngkg8
kubectl get pods -n metallb speaker-k59pk -o yaml
Check this newly created pod:
  ownerReferences:
    - apiVersion: apps/v1
      blockOwnerDeletion: true
      controller: true
      kind: DaemonSet
      name: speaker
      uid: 9332e100-c788-421d-b6a7-08751831a22d
  resourceVersion: "84818"
  uid: d334c13c-fb14-487d-8cbd-b16352a42e21
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchFields:
              - key: metadata.name
                operator: In
                values:
                  - vm-0
          - matchExpressions:
              - key: kubeflowserver
                operator: In
                values:
                  - "true"
  containers:
We can see the nodeSelector term is injected. However, this newly created pod will stay on vm-0, because, as the Kubernetes docs state: "If you specify multiple nodeSelectorTerms associated with nodeAffinity types, then the Pod can be scheduled onto a node if one of the specified nodeSelectorTerms can be satisfied." I.e., multiple nodeSelectorTerms within nodeAffinity are evaluated using OR logic; if any one of the terms is satisfied, the Pod can be scheduled on that node.
In this case, the second term is going to be ignored.
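The OR-of-terms, AND-within-a-term evaluation described above can be sketched as a small standalone Python function (an illustration of the scheduler's semantics, not charm or scheduler code):

```python
def term_matches(node_labels, node_name, term):
    """A single nodeSelectorTerm matches only if ALL of its requirements hold (AND)."""
    for expr in term.get("matchExpressions", []):
        # Only the "In" operator is modeled here, matching the examples above.
        if expr["operator"] == "In" and node_labels.get(expr["key"]) not in expr["values"]:
            return False
    for field in term.get("matchFields", []):
        if field["operator"] == "In" and field["key"] == "metadata.name" \
                and node_name not in field["values"]:
            return False
    return True


def node_matches(node_labels, node_name, node_selector_terms):
    """The Pod can schedule onto the node if ANY term matches (OR)."""
    return any(term_matches(node_labels, node_name, t) for t in node_selector_terms)


# The speaker pod's two terms after injection, as shown above:
terms = [
    {"matchFields": [{"key": "metadata.name", "operator": "In", "values": ["vm-0"]}]},
    {"matchExpressions": [{"key": "kubeflowserver", "operator": "In", "values": ["true"]}]},
]

# vm-0 carries no kubeflowserver label, yet the pod still schedules there:
# the first (matchFields) term alone is enough, so the injected term is ignored.
print(node_matches({}, "vm-0", terms))  # True
```

This makes the failure mode concrete: appending a second term can only widen, never narrow, the set of schedulable nodes.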
juju version: 2.9.43
kubernetes: v1.24.17
N/A
No response
This charm should be deprecated. Instead, its functionality should be a feature built into Juju. Here's the corresponding feature request.
'juju remove-application' only removes the operator pod. It does NOT remove the webhook pod.
You will see two pods started: namespace-node-affinity-0 and namespace-node-affinity-pod-webhook-*****
You will see namespace-node-affinity-0 is deleted, but namespace-node-affinity-pod-webhook-***** is still running.
juju version: 2.9.43
kubernetes: v1.24.17
model-7c837905-ac2e-48e7-8d3e-29546adc5dfc: 12:35:58 INFO juju.worker.caasupgrader abort check blocked until version event received
model-7c837905-ac2e-48e7-8d3e-29546adc5dfc: 12:35:58 INFO juju.worker.caasupgrader unblocking abort check
model-7c837905-ac2e-48e7-8d3e-29546adc5dfc: 12:35:59 INFO juju.worker.muxhttpserver starting http server on [::]:17071
model-7c837905-ac2e-48e7-8d3e-29546adc5dfc: 12:35:59 INFO juju.worker.caasadmission ensuring model k8s webhook configurations
controller-0: 12:36:17 INFO juju.worker.caasapplicationprovisioner.runner start "namespace-node-affinity"
controller-0: 12:36:21 INFO juju.worker.caasapplicationprovisioner.namespace-node-affinity scaling application "namespace-node-affinity" to desired scale 1
controller-0: 12:36:22 INFO juju.worker.caasapplicationprovisioner.namespace-node-affinity scaling application "namespace-node-affinity" to desired scale 1
unit-namespace-node-affinity-0: 12:36:29 INFO juju.cmd running containerAgent [2.9.43 3cb3f8beac4a0b05e10bdfb8014f5666118a269d gc go1.20.4]
unit-namespace-node-affinity-0: 12:36:29 INFO juju.cmd.containeragent.unit start "unit"
unit-namespace-node-affinity-0: 12:36:29 INFO juju.worker.upgradesteps upgrade steps for 2.9.43 have already been run.
unit-namespace-node-affinity-0: 12:36:29 INFO juju.worker.probehttpserver starting http server on [::]:65301
unit-namespace-node-affinity-0: 12:36:29 INFO juju.api connection established to "wss://controller-service.controller-myk8scloud-localhost.svc.cluster.local:17070/model/7c837905-ac2e-48e7-8d3e-29546adc5dfc/api"
unit-namespace-node-affinity-0: 12:36:29 INFO juju.worker.apicaller [7c8379] "unit-namespace-node-affinity-0" successfully connected to "controller-service.controller-myk8scloud-localhost.svc.cluster.local:17070"
unit-namespace-node-affinity-0: 12:36:29 INFO juju.api connection established to "wss://controller-service.controller-myk8scloud-localhost.svc.cluster.local:17070/model/7c837905-ac2e-48e7-8d3e-29546adc5dfc/api"
unit-namespace-node-affinity-0: 12:36:29 INFO juju.worker.apicaller [7c8379] "unit-namespace-node-affinity-0" successfully connected to "controller-service.controller-myk8scloud-localhost.svc.cluster.local:17070"
unit-namespace-node-affinity-0: 12:36:29 INFO juju.worker.migrationminion migration phase is now: NONE
unit-namespace-node-affinity-0: 12:36:29 INFO juju.worker.logger logger worker started
unit-namespace-node-affinity-0: 12:36:29 WARNING juju.worker.proxyupdater unable to set snap core settings [proxy.http= proxy.https= proxy.store=]: exec: "snap": executable file not found in $PATH, output: ""
unit-namespace-node-affinity-0: 12:36:30 INFO juju.worker.caasupgrader abort check blocked until version event received
unit-namespace-node-affinity-0: 12:36:30 INFO juju.worker.caasupgrader unblocking abort check
unit-namespace-node-affinity-0: 12:36:30 INFO juju.worker.leadership namespace-node-affinity/0 promoted to leadership of namespace-node-affinity
unit-namespace-node-affinity-0: 12:36:30 INFO juju.agent.tools ensure jujuc symlinks in /var/lib/juju/tools/unit-namespace-node-affinity-0
unit-namespace-node-affinity-0: 12:36:30 INFO juju.worker.uniter unit "namespace-node-affinity/0" started
unit-namespace-node-affinity-0: 12:36:30 INFO juju.worker.uniter resuming charm install
unit-namespace-node-affinity-0: 12:36:30 INFO juju.worker.uniter.charm downloading ch:amd64/focal/namespace-node-affinity-18 from API server
unit-namespace-node-affinity-0: 12:36:30 INFO juju.downloader downloading from ch:amd64/focal/namespace-node-affinity-18
unit-namespace-node-affinity-0: 12:36:30 INFO juju.downloader download complete ("ch:amd64/focal/namespace-node-affinity-18")
unit-namespace-node-affinity-0: 12:36:30 INFO juju.downloader download verified ("ch:amd64/focal/namespace-node-affinity-18")
unit-namespace-node-affinity-0: 12:36:31 INFO juju.worker.uniter hooks are retried true
unit-namespace-node-affinity-0: 12:36:32 INFO juju.worker.uniter found queued "install" hook
unit-namespace-node-affinity-0: 12:36:32 INFO unit.namespace-node-affinity/0.juju-log Running legacy hooks/install.
unit-namespace-node-affinity-0: 12:36:33 INFO unit.namespace-node-affinity/0.juju-log _gen_certs_if_missing
unit-namespace-node-affinity-0: 12:36:33 WARNING unit.namespace-node-affinity/0.install Generating RSA private key, 2048 bit long modulus (2 primes)
unit-namespace-node-affinity-0: 12:36:33 WARNING unit.namespace-node-affinity/0.install .................................+++++
unit-namespace-node-affinity-0: 12:36:33 WARNING unit.namespace-node-affinity/0.install ......................................................+++++
unit-namespace-node-affinity-0: 12:36:33 WARNING unit.namespace-node-affinity/0.install e is 65537 (0x010001)
unit-namespace-node-affinity-0: 12:36:33 WARNING unit.namespace-node-affinity/0.install Generating RSA private key, 2048 bit long modulus (2 primes)
unit-namespace-node-affinity-0: 12:36:33 WARNING unit.namespace-node-affinity/0.install ...................................................................................................................................+++++
unit-namespace-node-affinity-0: 12:36:33 WARNING unit.namespace-node-affinity/0.install .........................................................................................................................+++++
unit-namespace-node-affinity-0: 12:36:33 WARNING unit.namespace-node-affinity/0.install e is 65537 (0x010001)
unit-namespace-node-affinity-0: 12:36:33 WARNING unit.namespace-node-affinity/0.install Signature ok
unit-namespace-node-affinity-0: 12:36:33 WARNING unit.namespace-node-affinity/0.install subject=C = GB, ST = Canonical, L = Canonical, O = Canonical, OU = Canonical, CN = 127.0.0.1
unit-namespace-node-affinity-0: 12:36:33 WARNING unit.namespace-node-affinity/0.install Getting CA Private Key
unit-namespace-node-affinity-0: 12:36:33 INFO unit.namespace-node-affinity/0.juju-log Starting main
unit-namespace-node-affinity-0: 12:36:33 INFO unit.namespace-node-affinity/0.juju-log _check_leader
unit-namespace-node-affinity-0: 12:36:33 INFO unit.namespace-node-affinity/0.juju-log _deploy_k8s_resources
unit-namespace-node-affinity-0: 12:36:33 INFO unit.namespace-node-affinity/0.juju-log Rendering manifests
unit-namespace-node-affinity-0: 12:36:33 INFO unit.namespace-node-affinity/0.juju-log Reconcile completed successfully
unit-namespace-node-affinity-0: 12:36:34 INFO juju.worker.uniter.operation ran "install" hook (via hook dispatching script: dispatch)
unit-namespace-node-affinity-0: 12:36:34 INFO juju.worker.uniter found queued "leader-elected" hook
unit-namespace-node-affinity-0: 12:36:34 INFO unit.namespace-node-affinity/0.juju-log _gen_certs_if_missing
unit-namespace-node-affinity-0: 12:36:34 INFO unit.namespace-node-affinity/0.juju-log Starting main
unit-namespace-node-affinity-0: 12:36:34 INFO unit.namespace-node-affinity/0.juju-log _check_leader
unit-namespace-node-affinity-0: 12:36:34 INFO unit.namespace-node-affinity/0.juju-log _deploy_k8s_resources
unit-namespace-node-affinity-0: 12:36:35 INFO unit.namespace-node-affinity/0.juju-log Rendering manifests
unit-namespace-node-affinity-0: 12:36:35 INFO unit.namespace-node-affinity/0.juju-log Reconcile completed successfully
unit-namespace-node-affinity-0: 12:36:35 INFO juju.worker.uniter.operation ran "leader-elected" hook (via hook dispatching script: dispatch)
unit-namespace-node-affinity-0: 12:36:36 INFO unit.namespace-node-affinity/0.juju-log _gen_certs_if_missing
unit-namespace-node-affinity-0: 12:36:36 INFO unit.namespace-node-affinity/0.juju-log Starting main
unit-namespace-node-affinity-0: 12:36:36 INFO unit.namespace-node-affinity/0.juju-log _check_leader
unit-namespace-node-affinity-0: 12:36:36 INFO unit.namespace-node-affinity/0.juju-log _deploy_k8s_resources
unit-namespace-node-affinity-0: 12:36:36 INFO unit.namespace-node-affinity/0.juju-log Rendering manifests
unit-namespace-node-affinity-0: 12:36:36 INFO unit.namespace-node-affinity/0.juju-log Reconcile completed successfully
unit-namespace-node-affinity-0: 12:36:36 INFO juju.worker.uniter.operation ran "config-changed" hook (via hook dispatching script: dispatch)
unit-namespace-node-affinity-0: 12:36:36 INFO juju.worker.uniter found queued "start" hook
unit-namespace-node-affinity-0: 12:36:37 INFO unit.namespace-node-affinity/0.juju-log Running legacy hooks/start.
unit-namespace-node-affinity-0: 12:36:37 INFO unit.namespace-node-affinity/0.juju-log _gen_certs_if_missing
unit-namespace-node-affinity-0: 12:36:37 INFO juju.worker.uniter.operation ran "start" hook (via hook dispatching script: dispatch)
controller-0: 12:37:23 INFO juju.worker.caasapplicationprovisioner.namespace-node-affinity scaling application "namespace-node-affinity" to desired scale 0
controller-0: 12:37:23 INFO juju.worker.caasapplicationprovisioner.namespace-node-affinity scaling application "namespace-node-affinity" to desired scale 0
unit-namespace-node-affinity-0: 12:37:23 WARNING juju.worker.uniter.operation we should run a leader-deposed hook here, but we can't yet
controller-0: 12:37:24 WARNING juju.worker.caasapplicationprovisioner.namespace-node-affinity update units application "namespace-node-affinity" not found
controller-0: 12:37:24 WARNING juju.worker.caasapplicationprovisioner.namespace-node-affinity update units application "namespace-node-affinity" not found
controller-0: 12:37:25 WARNING juju.worker.caasapplicationprovisioner.namespace-node-affinity update units application "namespace-node-affinity" not found
controller-0: 12:37:25 WARNING juju.worker.caasapplicationprovisioner.namespace-node-affinity update units application "namespace-node-affinity" not found
controller-0: 12:37:26 INFO juju.worker.caasapplicationprovisioner.runner stopped "namespace-node-affinity", err: cannot scale dying application to 0: application "namespace-node-affinity" not found
controller-0: 12:37:26 ERROR juju.worker.caasapplicationprovisioner.runner exited "namespace-node-affinity": cannot scale dying application to 0: application "namespace-node-affinity" not found
controller-0: 12:37:26 INFO juju.worker.caasapplicationprovisioner.runner restarting "namespace-node-affinity" in 3s
controller-0: 12:37:29 INFO juju.worker.caasapplicationprovisioner.runner start "namespace-node-affinity"
controller-0: 12:37:29 INFO juju.worker.caasapplicationprovisioner.runner stopped "namespace-node-affinity", err: <nil>
No response
The charm will go to Active/Idle before the workload has successfully stood up. This could result in a race condition (charm active before workload active) that resolves itself, but it can also hide other issues. If there is a problem with the deployment (e.g. the image does not exist and the Pod goes to ImagePullBackOff), the charm will still be Active/Idle.
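One way to avoid this would be for the charm to gate its status on the observed state of the webhook workload instead of unconditionally going Active after reconcile. A minimal sketch of the decision logic (the helper name, thresholds, and reason strings are assumptions, not the charm's actual code; in a real charm this would feed ops.ActiveStatus / WaitingStatus / BlockedStatus, with pod state read via the Kubernetes API, e.g. lightkube):

```python
def charm_status(ready_replicas, desired_replicas, pod_waiting_reason=None):
    """Hypothetical helper: map observed webhook workload state to a charm status.

    Returns a (status, message) tuple mirroring Juju workload statuses.
    """
    if pod_waiting_reason in ("ImagePullBackOff", "ErrImagePull", "CrashLoopBackOff"):
        # Surface pod-level errors instead of hiding them behind Active/Idle.
        return ("blocked", f"webhook pod: {pod_waiting_reason}")
    if ready_replicas < desired_replicas:
        # Workload not yet stood up: report Waiting, not Active.
        return ("waiting", f"webhook {ready_replicas}/{desired_replicas} ready")
    return ("active", "")


print(charm_status(0, 1, "ImagePullBackOff"))  # ('blocked', 'webhook pod: ImagePullBackOff')
print(charm_status(0, 1))                      # ('waiting', 'webhook 0/1 ready')
print(charm_status(1, 1))                      # ('active', '')
```

With logic like this, the ImagePullBackOff case described above would show up directly in juju status rather than leaving the charm Active/Idle.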
In a test Microk8s environment, I was able to deploy this charm and have it, upon deletion of existing pods, apply the specified affinity rules to new pods.
However, the namespace-node-affinity-pod-webhook-* pod cannot be deleted in the same way, for likely obvious reasons: I suspect it is the pod performing the work of injecting affinity rules, and it cannot apply them to itself.
Is there some way we could apply the same affinity rules to the webhook pod itself, e.g. through the operator pod? (I haven't written this type of Kubernetes charm, but it seems like the operator pod spawns the webhook pod? In that case, perhaps the rules need to be injected by the operator pod when spawning the webhook pod. Just an idea.)
Juju controller: 2.9.43
K8s: Microk8s v1.28.7
namespace-node-affinity charm version: revision 18, latest/stable
N/A
No response
During an upgrade from 1.6 to 1.7, the namespace-node-affinity-pod-webhook pod was not recreated, but the namespace-node-affinity-webhook-certs secret was refreshed.
This leaves the old secret still mounted in the pod, so the webhook can't work, with the following error message in the log:
"failed calling webhook "namespace-node-affinity-pod-webhook.default.svc": failed to call webhook: Post "https://namespace-node-affinity-pod-webhook.kubeflow.svc:443/mutate?timeout=5s": x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "127.0.0.1")"
Deleting the webhook pod was enough to fix the issue.