kyma-project / infrastructure-manager
License: Apache License 2.0
A violation against the OSS Rules of Play has been detected.
Rule ID: rl-license_file-1
Explanation: Does it have a license file? No
Find more information at: https://sap.github.io/fosstars-rating-core/oss_rules_of_play_rating.html
A violation against the OSS Rules of Play has been detected.
Rule ID: rl-reuse_tool-4
Explanation: Is it compliant with REUSE rules? No
Find more information at: https://sap.github.io/fosstars-rating-core/oss_rules_of_play_rating.html
A violation against the OSS Rules of Play has been detected.
Rule ID: rl-reuse_tool-1
Explanation: Does README mention REUSE? No
Find more information at: https://sap.github.io/fosstars-rating-core/oss_rules_of_play_rating.html
Reason
We're using a kubeconfig defined in gardener-kubeconfig-path
. We should limit access to it to prevent unauthorized access to the Gardener project.
Acceptance criteria
Description
Enable creating multiple worker groups with different machine types, volume types, node labels, annotations, and taints.
Reasons
One size doesn't fit all. Many applications require specific nodes for particular services.
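A sketch of what such a multi-pool configuration could look like. The types and field names below are illustrative, not the actual Gardener or Infrastructure Manager API:

```go
package main

import "fmt"

// WorkerGroup is an illustrative sketch (not the real Gardener API type)
// of the per-pool settings the issue asks for.
type WorkerGroup struct {
	Name        string
	MachineType string
	VolumeType  string
	Labels      map[string]string
	Annotations map[string]string
	Taints      []string
}

// validateWorkerGroups checks that pool names are unique and non-empty,
// which any multi-pool spec would need before being applied.
func validateWorkerGroups(groups []WorkerGroup) error {
	seen := map[string]bool{}
	for _, g := range groups {
		if g.Name == "" {
			return fmt.Errorf("worker group without a name")
		}
		if seen[g.Name] {
			return fmt.Errorf("duplicate worker group %q", g.Name)
		}
		seen[g.Name] = true
	}
	return nil
}

func main() {
	groups := []WorkerGroup{
		{Name: "general", MachineType: "m5.xlarge", VolumeType: "gp2"},
		{Name: "gpu", MachineType: "p3.2xlarge", VolumeType: "gp3",
			Labels: map[string]string{"workload": "ml"},
			Taints: []string{"gpu=true:NoSchedule"}},
	}
	fmt.Println(validateWorkerGroups(groups))
}
```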
Description
As Prow will be discontinued in 2024, we have to move the Prow jobs used for the provisioner to an alternative CI/CD system. In our case, GitHub Actions is the preferred choice.
An overview of all existing Prow jobs is listed here: https://github.com/search?q=repo%3Akyma-project%2Ftest-infra+framefrog&type=code&p=1
AC:
Reasons
Migrate CI/CD jobs from Prow to GitHub Actions, as Prow will be discontinued in 2024.
Attachments
Description
We should verify how the operator behaves under load. To increase the stability and reliability of the Infrastructure Manager, a performance test has to be implemented which verifies common use cases. The goal is to regularly measure our internally defined performance KPIs (benchmarking/load test), verify the limits of the application (stress test), and detect performance-critical behaviours before the Infrastructure Manager gets deployed on a productive landscape (no memory leaks etc.).
Acceptance criteria:
Reasons
Before deploying the operator on production we must know its performance characteristic.
Reason
Prevent the possibility that an agent gets access to the Infrastructure Manager Operator.
Acceptance criteria
Description
Configure logging in the Infrastructure Manager to use JSON format.
Reasons
Have logs that are easy to consume.
Attachments
Description
The Infrastructure Manager should provide metrics to allow early issue detection.
Reasons
Infrastructure Manager is a component that in the long run will be responsible for cluster creation. In case of downtime, the impact on the Kyma Control Plane will be significant. We must prevent that by increasing observability.
Acceptance criteria
Reason
When the pod is down (even for a short duration, such as 10 seconds) and the GardenerCluster CR is removed by KEB, the IM controller will not receive the deletion event, and the corresponding secret will not be cleaned up.
What
Some mechanism (e.g., owner references or finalizers) should be introduced to ensure that when the GardenerCluster CR is removed, the corresponding secret is also removed.
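This is exactly what metadata.ownerReferences buy in-cluster: Kubernetes garbage collection removes the secret together with its owning CR even when the controller misses the event. Below is a simplified, dependency-free model of that sweep; the types are illustrative, not the real Kubernetes API:

```go
package main

import "fmt"

// obj is a minimal stand-in for Kubernetes object metadata.
type obj struct {
	Name  string
	Owner string // name of the owning GardenerCluster CR, "" if none
}

// sweepOrphans models Kubernetes garbage collection: any secret whose
// owner CR no longer exists is dropped, even if the controller never
// saw the deletion event (e.g. because the pod was down at the time).
func sweepOrphans(secrets []obj, liveCRs map[string]bool) []obj {
	var kept []obj
	for _, s := range secrets {
		if s.Owner == "" || liveCRs[s.Owner] {
			kept = append(kept, s)
		}
	}
	return kept
}

func main() {
	secrets := []obj{
		{Name: "kubeconfig-a", Owner: "cluster-a"},
		{Name: "kubeconfig-b", Owner: "cluster-b"},
	}
	live := map[string]bool{"cluster-a": true} // cluster-b CR was deleted
	fmt.Println(sweepOrphans(secrets, live))
}
```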
Description
Errors are being thrown in logs when using force rotation.
Expected result
No errors should be thrown in logs when using force rotation.
Actual result
Errors are being thrown in logs when using force rotation.
2023-12-20T12:29:44Z INFO Rotation of secret kubeconfig-01568d6b-e96f-4106-b8f5-f5a745f0390d in namespace kcp-system forced. {"GardenerCluster": "01568d6b-e96f-4106-b8f5-f5a745f0390d", "Namespace": "kcp-system"}
2023-12-20T12:29:44Z ERROR status update failed {"error": "Operation cannot be fulfilled on gardenerclusters.infrastructuremanager.kyma-project.io \"01568d6b-e96f-4106-b8f5-f5a745f0390d\": the object has been modified; please apply your changes to the latest version and try again"}
2023-12-20T12:29:44Z ERROR Reconciler error {"controller": "gardenercluster", "controllerGroup": "infrastructuremanager.kyma-project.io", "controllerKind": "GardenerCluster", "GardenerCluster": {"name":"01568d6b-e96f-4106-b8f5-f5a745f0390d","namespace":"kcp-system"}, "namespace": "kcp-system", "name": "01568d6b-e96f-4106-b8f5-f5a745f0390d", "reconcileID": "f1f60c6e-15c4-45cb-bcde-a3c60b8ce864", "error": "Operation cannot be fulfilled on gardenerclusters.infrastructuremanager.kyma-project.io \"01568d6b-e96f-4106-b8f5-f5a745f0390d\": the object has been modified; please apply your changes to the latest version and try again"}
2023-12-20T12:29:44Z INFO Starting reconciliation. {"GardenerCluster": "01568d6b-e96f-4106-b8f5-f5a745f0390d", "Namespace": "kcp-system"}
2023-12-20T12:29:44Z INFO rotation params {"GardenerCluster": "01568d6b-e96f-4106-b8f5-f5a745f0390d", "Namespace": "kcp-system", "lastSync": "0001-01-01 00:00:00", "requeueAfter": "6h50m24s"}
Steps to reproduce
1.27.6
and then hibernated before the rotation was forced.
/kind bug
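The "object has been modified" errors in the logs above are ordinary optimistic-concurrency conflicts: the status update was issued against a stale resourceVersion. The usual remedy is to re-read the object and retry the update, which client-go packages as retry.RetryOnConflict in k8s.io/client-go/util/retry. A dependency-free sketch of that pattern:

```go
package main

import (
	"errors"
	"fmt"
)

var errConflict = errors.New("the object has been modified; please apply your changes to the latest version and try again")

// retryOnConflict mimics client-go's retry.RetryOnConflict: it re-runs
// fn (which should re-fetch the latest object version before updating)
// as long as the update fails with a conflict, up to maxRetries times.
func retryOnConflict(maxRetries int, fn func() error) error {
	var err error
	for i := 0; i < maxRetries; i++ {
		if err = fn(); !errors.Is(err, errConflict) {
			return err // success or a non-conflict error: stop retrying
		}
	}
	return err
}

func main() {
	attempts := 0
	err := retryOnConflict(5, func() error {
		attempts++
		if attempts < 3 {
			return errConflict // stale resourceVersion on first tries
		}
		return nil // update succeeded against the latest version
	})
	fmt.Println(attempts, err)
}
```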
Description
Prepare a Go program/script that will iterate over Kyma resources. For each Kyma resource, it will create a GardenerCluster
CR. The GardenerCluster
CR must contain the fields defined here. The Kyma resource is created by KEB, and the labels it adds can be found here. Mind that the secret name is also defined by KEB.
Reasons
In order to migrate to the architecture in which the Infrastructure Manager is responsible for dynamic kubeconfig creation, some additional steps must be performed in the environment. When the Infrastructure Manager is deployed on the target environment, the existing Kyma clusters must be handled. The migration script is needed to make sure the Infrastructure Manager will control all the runtimes.
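A minimal, dependency-free sketch of the conversion step such a script would perform. The real script would use a Kubernetes client against the KCP, and all field, label, and secret names below are illustrative assumptions, not KEB's actual conventions:

```go
package main

import "fmt"

// KymaResource and GardenerCluster are simplified stand-ins; the real
// script would use the actual CRD types and the label keys set by KEB.
type KymaResource struct {
	Name   string
	Labels map[string]string
}

type GardenerCluster struct {
	Name       string
	Labels     map[string]string
	SecretName string
}

// toGardenerCluster derives one GardenerCluster CR per Kyma resource,
// copying the KEB-provided labels and reusing KEB's secret naming.
func toGardenerCluster(k KymaResource, secretName string) GardenerCluster {
	labels := map[string]string{}
	for key, v := range k.Labels {
		labels[key] = v
	}
	return GardenerCluster{Name: k.Name, Labels: labels, SecretName: secretName}
}

func main() {
	kymas := []KymaResource{
		{Name: "01568d6b", Labels: map[string]string{"kyma-project.io/runtime-id": "01568d6b"}},
	}
	for _, k := range kymas {
		// The secret name is defined by KEB; "kubeconfig-<name>" is an assumption here.
		fmt.Printf("%+v\n", toGardenerCluster(k, "kubeconfig-"+k.Name))
	}
}
```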
Description
Create a minimal structure for Cluster Inventory Infrastructure Manager.
Acceptance criteria:
make deploy
command - @akgalwas
Stretch:
Reasons
In order to kick off the implementation, we need to define the code structure and create pipelines. We also need to define the interface for the Kyma Environment Broker, which is supposed to create Cluster CRs.
A violation against the OSS Rules of Play has been detected.
Rule ID: rl-reuse_tool-3
Explanation: Is it registered in REUSE? No
Find more information at: https://sap.github.io/fosstars-rating-core/oss_rules_of_play_rating.html
A violation against the OSS Rules of Play has been detected.
Rule ID: rl-vulnerability_alerts-1
Explanation: Are vulnerability alerts enabled? No
Find more information at: https://sap.github.io/fosstars-rating-core/oss_rules_of_play_rating.html
Description
The Provisioner has to be replaced by the Kyma Infrastructure Manager. The logic of the Provisioner has to be migrated into the Infrastructure Manager, while also considering already planned new features. This could require rethinking the current software architecture to ensure a flexible, extensible, and maintainable software structure for the Infrastructure Manager.
AC:
Reasons
Replacing the old Kyma Provisioner with the Kyma Infrastructure Manager to follow new KCP architectural paradigm (K8s native application).
Attachments
Description
While working on #95, #97 and #99, and making bigger changes in the corresponding code, we've noticed that the tests require improvement.
Reasons
This is a crucial part of the Infrastructure Manager that has to be tested correctly so that future enhancements or bug fixes do not cause regressions.
Attachments
Description
As the Infrastructure Manager is a critical backend service of Kyma, monitoring its availability is essential to react in time to service degradations.
The goal is to set up an end-to-end test case for the Infrastructure Manager which verifies the correct functionality of this service on KCP. The test should be executed in intervals (e.g. hourly), create a full-fledged Gardener cluster, and destroy it afterwards.
If the cluster creation wasn't possible, an alert should be fired (e.g. via the SRE monitoring system) to inform the Framefrog team about the service degradation.
AC:
Reasons
Ensure high quality and proactive service monitoring.
Attachments
A violation against the OSS Rules of Play has been detected.
Rule ID: rl-reuse_tool-2
Explanation: Does it have LICENSES directory with licenses? No
Find more information at: https://sap.github.io/fosstars-rating-core/oss_rules_of_play_rating.html
Reason
Those important IM resources should be audit logged.
Acceptance Criteria
Ensure the following cases are recorded in the audit log:
Description
Acceptance Criteria
Reason
To be DPP compliant, we can't store personal data without a reason.
Acceptance criteria
Acceptance Criteria
Description
Configure:
Reasons
Be secure.
Attachments
/area control-plane
/area security
/kind feature
Description
The Infrastructure Manager must manage dynamic kubeconfigs.
Acceptance criteria:
Reasons
In the long term, the Infrastructure Manager will replace the Provisioner. In the first step, it will be responsible for kubeconfig management in the Kyma Control Plane.
Description
With #11 we are able to make the Infrastructure Manager transparent and also simplify our operational life by establishing smart metrics and alerting rules.
The goal of this task is to identify which metrics/KPIs are business relevant and what their critical thresholds are. We also have to define an action plan for when such a threshold is reached, triggering the required action to bring our business back on track. Finally, alerting rules have to be configured that inform us as soon as one of the thresholds is reached.
AC:
Reasons
Improve operational quality and simplify on-call shifts by establishing proper metrics/KPI measurement and alerting.
Extends #11
Attachments
Description
There should be a possibility to issue a kubeconfig for the cluster with limited access/privileges.
Kubernetes allows for creating kubeconfigs for specific ServiceAccounts. Having such SA-based kubeconfig makes it possible to limit its use with proper Roles/ClusterRoles.
Suggestions
this is just a proposal, feel free to refine/change/adapt it as you like
One of the options would be to have a new CRD used for issuing kubeconfigs - it could include ServiceAccount information along with the Role/ClusterRole assigned to that ServiceAccount. Based on this, the Infrastructure Manager could create the SA and (Cluster)Role, issue the kubeconfig, and save it as a secret in the KCP.
Such a solution would require introducing a controller for handling those CRs, but it would be a universal solution supporting multiple kubeconfigs issued for a single cluster (i.e. for KEB, KLM, and other KCP controllers that require cluster access).
Regarding the deletion logic - it can be solved with a finalizer that is set on all the CRs, when the deletion timestamp is picked up by the controller then cluster resources (SAs, Roles, etc.) are dropped and the finalizer is removed.
Reasons
It is generally recommended to keep the required privileges minimal for the specific roles. Right now the issued kubeconfigs are for the cluster-admin
role, which allows for unconstrained actions to be taken using this kubeconfig. From the security perspective, it would also be beneficial to differentiate between entities connecting to the SKR. Separate kubeconfigs for KEB or KLM would make it transparent from the audit-log perspective which component took which action in the cluster.
Acceptance Criteria
this is just a proposal, feel free to refine those as you like
Description
For our release management and to fulfil SAP product standards, we have to document what our testing strategy for the KIM looks like.
Some example links to such documentations are available here: https://wiki.one.int.sap/wiki/display/kyma/Testing+Strategy+-+Link+summary
For the AC, the testing strategy is already documented.
AC:
Area
Kyma Infrastructure Manager
Reasons
Mandatory part of the delivery process and required for a fast creation of Microdeliveries.
Assignees
@kyma-project/technical-writers
Attachments
Description
Allow only the needed actions/HTTP methods.
Reason
We don't want to provide a way to edit/delete cluster related data.
Acceptance criteria
Attachments
Description
Gardener now supports the option to force the deletion of a cluster (which avoids longer waiting periods during de-provisioning, e.g. when the K8s cluster couldn't be gracefully stopped because of hanging finalizers).
We agreed to use this feature flag, and the infrastructure manager / provisioner should set this flag properly.
AC:
confirmation.gardener.cloud/force-deletion
is set in the shoot specs of Gardener clusters.
Reasons
Enable/accept non-graceful shutdowns of Gardener clusters to avoid longer waiting periods during the de-provisioning.
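A minimal sketch of setting that annotation. The annotation key comes from the AC above; the function name and the plain-map representation are illustrative, and the real code would patch the Shoot object through the Gardener client:

```go
package main

import "fmt"

// Annotation key taken from the acceptance criteria above.
const forceDeletionAnnotation = "confirmation.gardener.cloud/force-deletion"

// annotateForceDeletion sets the confirmation annotation on a shoot's
// metadata.annotations map before de-provisioning, so Gardener accepts
// a non-graceful deletion. Operating on a plain map here; the real code
// would patch the Shoot resource.
func annotateForceDeletion(annotations map[string]string) map[string]string {
	if annotations == nil {
		annotations = map[string]string{}
	}
	annotations[forceDeletionAnnotation] = "true"
	return annotations
}

func main() {
	shootAnnotations := annotateForceDeletion(nil)
	fmt.Println(shootAnnotations[forceDeletionAnnotation])
}
```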
Attachments
[Moved from Provisioner to KIM]
Description
How it's going to be implemented is yet to be defined.
Reasons
Assure that the dynamic kubeconfigs feature is working e2e.
Acceptance criteria
Attachments
/area control-plane
/kind feature
Description
Configure a markdown link checker that will ensure that the links used in our *.md files are valid.
Reasons
Attachments
/area documentation
/area control-plane
/kind feature