gardener / kupid
Inject scheduling criteria into target pods orthogonally by policy definition.
License: Apache License 2.0
What would you like to be added:
Readme says:
The OPA Gatekeeper allows defining policies to validate and mutate any Kubernetes resource. Technically, this can be used to dynamically inject anything, including scheduling policy, into pods. But it is too big a component to introduce just for dynamically injecting scheduling policy. Besides, defining the policy as code is undesirable in this context, because the policy itself would be non-declarative and hard to validate while deploying it.
However, this doesn't seem to justify building our own component (which is currently unmaintained?) compared to the relatively low effort of reusing a well-established project from the community.
This repository could basically be a few yaml files instead of thousands of lines of code.
Why is this needed:
What would you like to be added:
Why is this needed:
To complete the integration with gardener.
What would you like to be added:
I would like Kupid's mutating webhook to only handle the requests that are relevant to it, by using an objectSelector in the webhook configuration. The object selector can be set based on the PodSchedulingPolicies (PSPs) and ClusterPodSchedulingPolicies (CPSPs) that Kupid uses to mutate these resources.
Why is this needed:
Today Kupid receives every request in the cluster, while it only wishes to mutate specific resources (like the etcd StatefulSet) based on resource labels. An object selector allows for lower resource consumption by Kupid (by avoiding irrelevant requests to it) and reduces log volume by getting rid of unnecessary "Handling request..." logs.
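A sketch of what such a webhook configuration could look like (the webhook name and the label key/value below are hypothetical examples, not Kupid's actual configuration):

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: kupid
webhooks:
  - name: kupid.gardener.cloud        # hypothetical webhook name
    # objectSelector filters requests on the API-server side: only objects
    # carrying this label are sent to the webhook at all. The label would be
    # derived from the PSPs/CPSPs that Kupid is configured with.
    objectSelector:
      matchLabels:
        kupid.gardener.cloud/inject: "true"   # hypothetical label
    # clientConfig, rules, sideEffects, admissionReviewVersions etc. omitted
```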
What would you like to be added:
Log errors at a level that is convenient to configure without flooding the logs.
Why is this needed:
To help debug issues while not flooding the logs.
What would you like to be added:
Add support for a healthz endpoint and expose metrics.
Why is this needed:
To enable a livenessProbe and to collect relevant metrics.
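Once a healthz endpoint exists, the liveness probe in the deployment could be wired up roughly like this (the path and port here are assumptions):

```yaml
livenessProbe:
  httpGet:
    path: /healthz   # assumed health-check path
    port: 9443       # assumed port; use whatever the manager serves health checks on
  initialDelaySeconds: 10
  periodSeconds: 10
```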
What happened:
1. Define a Job which has a selector for the labels of its Pod.
2. Define a cluster- or namespace-scoped PodSchedulingPolicy using Kupid that selects the same labels, matching the Pod created by the above Job definition.
3. Once the Job has completed, change the PodSchedulingPolicy by changing its label selector.
4. Try to delete the Job created before the PodSchedulingPolicy was changed.
Once the new policy comes into effect, it prevents the existing Job from being deleted, because Kupid will try to update the PodSpec with the new labels, which is an immutable field.
We see the following errors in the kube-controller-manager (KCM) logs:
I1207 11:25:10.408731 1 garbagecollector.go:529] remove DeleteDependents finalizer for item [batch/v1/Job, namespace: shoot--dev--test-ash-3, name: a3fc17-compact-job, uid: 728ec193-2a8e-4f8a-befb-1fa9526ef7f8]
E1207 11:25:10.431940 1 garbagecollector.go:309] error syncing item &garbagecollector.node{identity:garbagecollector.objectReference{OwnerReference:v1.OwnerReference{APIVersion:"batch/v1", Kind:"Job", Name:"a3fc17-compact-job", UID:"728ec193-2a8e-4f8a-befb-1fa9526ef7f8",
BlockOwnerDeletion:(*bool)(0xc001d277fa)}}}: Job.batch "a3fc17-compact-job" is invalid: spec.template: Invalid value: core.PodTemplateSpec: field is immutable
This keeps the Job orphaned unless it is deleted manually.
What you expected to happen:
Kupid should not update the spec of a Job, since a Job runs to completion.
How to reproduce it (as minimally and precisely as possible):
Follow the steps above.
Anything else we need to know:
This happens only when the earlier Job is not deleted before the PodSchedulingPolicy is changed.
Environment:
k8s - v1.19.5
kupid - v0.1.6
What happened:
The kupid extension deployment lacks a priority class. If the extension is running in a cluster with limited capacity, existing Shoots which require the kupid extension can't be reconciled, as other components (e.g. control plane components) might have a higher priority.
Similar to all other extensions, I recommend using a priority class with value 1000000000.
Ref: https://github.com/gardener/gardener-extension-provider-aws/blob/master/charts/gardener-extension-provider-aws/templates/priorityclass.yaml#L5
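Following the referenced chart, the fix would be a small manifest along these lines (the class name is an assumption):

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: gardener-extension-kupid   # hypothetical name
value: 1000000000
globalDefault: false
description: "Priority class for the kupid extension"
```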
What you expected to happen:
How to reproduce it (as minimally and precisely as possible):
Anything else we need to know:
Environment:
/assign
What would you like to be added:
Users should be allowed to configure QPS and burst via the kupid Helm chart.
Why is this needed:
Currently, kupid provides flags that users can use to set QPS and burst for the manager config, but these are not exposed via the Helm chart. The Helm chart needs to be enhanced to allow setting these values.
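The enhancement could expose the existing flags via chart values along these lines (the key names are hypothetical):

```yaml
# values.yaml (hypothetical keys)
clientConnection:
  qps: 100
  burst: 130
```

The chart templates would then pass these values through to the existing QPS/burst command-line flags.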
What would you like to be added:
Currently, Kupid ignores failures when applying the scheduling policies. Either we should fail when a policy is not applied, so that there are no side effects of a pod being scheduled on workers not defined in the scheduling policies, or we should log such errors and raise alerts to bring such scheduling of pods to the operator's attention, giving them a possibility to react to the anomaly.
Why is this needed:
In a real scenario this can have pros and cons.
Pro: etcd pods are scheduled, even if not on the worker pool the policy describes, but wherever the scheduler prescribes.
Con: issues can occur that wouldn't have happened had the etcd pod been deployed on the intended worker as per the policy definitions.
What would you like to be added:
Support for Kubernetes 1.22+
Why is this needed:
With v1.22, Kubernetes dropped support for the beta versions of the ValidatingWebhookConfiguration and MutatingWebhookConfiguration APIs from admissionregistration.k8s.io/v1beta1, as these have moved to v1 now. It's the same case for the CustomResourceDefinition API from apiextensions.k8s.io/v1beta1, which has also progressed to v1. More details can be found in the k8s 1.22 API changes page.
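For the webhook configurations, the migration is mostly a matter of bumping the apiVersion and filling in the fields that v1 makes mandatory, sketched here:

```yaml
# before (removed in Kubernetes 1.22)
apiVersion: admissionregistration.k8s.io/v1beta1
kind: MutatingWebhookConfiguration
---
# after
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
# v1 additionally requires, per webhook entry:
#   sideEffects: None               # or NoneOnDryRun
#   admissionReviewVersions: ["v1"]
```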
/assign
What happened:
When Kupid mutates affinity rules for a resource which originally has no affinity rules, that mutation is not logged. Logging it is required for better diagnostics.
What you expected to happen:
All mutations (if any) done by kupid to a resource for node affinity changes should be logged.
How to reproduce it (as minimally and precisely as possible):
Create a new StatefulSet and ensure that it is a target for kupid to inject affinity rules. You will see that kupid injects the rules as defined in the ClusterPodSchedulingPolicy resource, but it will not log the mutation for the first change.
What would you like to be added:
If there is a conflict between the scheduling criteria (potentially more than one) being injected and what is already present in the target pod spec/template, the merge done by kupid is currently ad hoc.
It is desirable to support a full strategic merge patch in such cases.
Why is this needed:
Consistency with Kubernetes best practices.
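As a rough illustration of the semantics being requested (a simplified sketch, not Kupid's or apimachinery's actual implementation): in a strategic merge, maps merge recursively, and lists of objects merge by a designated merge key instead of the patch list replacing the original wholesale.

```python
def strategic_merge(original, patch, merge_keys=("key", "name")):
    """Simplified strategic-merge sketch: dicts merge recursively;
    lists of dicts merge by the first applicable merge key; all
    other values (including plain lists) are replaced by the patch."""
    if isinstance(original, dict) and isinstance(patch, dict):
        merged = dict(original)
        for k, v in patch.items():
            merged[k] = strategic_merge(original[k], v, merge_keys) if k in original else v
        return merged
    if isinstance(original, list) and isinstance(patch, list):
        key = next((k for k in merge_keys
                    if all(isinstance(i, dict) and k in i for i in original + patch)),
                   None)
        if key is None:
            return patch  # atomic list: the patch replaces the original
        merged = {i[key]: i for i in original}
        for item in patch:
            merged[item[key]] = strategic_merge(merged.get(item[key], {}), item, merge_keys)
        return list(merged.values())
    return patch

# Injecting a second node-selector term merges by "key" instead of
# overwriting the term that was already present:
existing = {"nodeSelectorTerms": [{"key": "pool", "values": ["etcd"]}]}
injected = {"nodeSelectorTerms": [{"key": "zone", "values": ["z1"]}]}
merged = strategic_merge(existing, injected)
# merged["nodeSelectorTerms"] now contains both terms
```

The real strategic-merge-patch behavior in Kubernetes is driven by per-field `patchStrategy`/`patchMergeKey` annotations on the API types, which decide whether a given list merges or is replaced atomically.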