
cluster-api-provider-oci's Introduction

Kubernetes Cluster API Provider OCI


Kubernetes-native declarative infrastructure for OCI.

Kubernetes Cluster API Provider for Oracle Cloud Infrastructure

The Cluster API Provider for OCI (CAPOCI) brings declarative, Kubernetes-style APIs to cluster creation, configuration and management.

The Cluster API itself is shared across multiple cloud providers allowing for true hybrid deployments of Kubernetes.

Features

  • Self-managed and OCI Container Engine for Kubernetes (OKE) clusters
  • Manages the bootstrapping of VCNs, gateways, subnets, and network security groups
  • Provides secure and sensible defaults

Getting Started

You can find detailed documentation as well as a getting started guide in the Cluster API Provider for OCI Book.

🤗 Community

The CAPOCI provider is developed in the open, and is constantly being improved by our users, contributors, and maintainers.

To ask questions or get the latest project news, please join us in the #cluster-api-oci channel on Slack.

Office Hours

The maintainers host office hours on the first Tuesday of every month at 06:00 PT / 09:00 ET / 14:00 CET / 18:00 IST via Zoom.

All interested community members are invited to join us. A recording of each session will be made available afterwards for folks who are unable to attend.

Previous meetings: [ notes | recordings (coming soon) ]

Support Policy

NOTE: As the versioning for this project is tied to the versioning of Cluster API, future modifications to this policy may be made to more closely align with other providers in the Cluster API ecosystem.

Cluster API Versions

CAPOCI supports the following Cluster API versions.

| Cluster API version | CAPOCI version |
| ------------------- | -------------- |
| v1beta1 (v1.x.x)    | v0.x.x         |

Kubernetes versions

The CAPOCI provider is able to install and manage the versions of Kubernetes supported by Cluster API (CAPI).

Contributing

This project welcomes contributions from the community. Before submitting a pull request, please review our contribution guide.

Security

Please consult the security guide for our responsible security vulnerability disclosure process.

License

Copyright (c) 2021, 2022 Oracle and/or its affiliates.

Released under the Apache License Version 2.0, as shown at http://www.apache.org/licenses/.

cluster-api-provider-oci's People

Contributors

dependabot[bot], djelibeybi, hyder, joekr, junior, preethambojja, shyamradhakrishnan, spavlusieva, tozastation, yimw

cluster-api-provider-oci's Issues

Clarify use of various templates and CCM

What would you like to be documented:

There are many templates currently provided in the repo. Some install the CCM while others do not, and the docs also provide steps to explicitly install the CCM.

Why is this needed:

For clarification purposes.

As a user I would expect to see OCI in the provider list when I run `clusterctl config repositories`

What would you like to be added:
The work for this issue will be done in the cluster-api repo, but it will impact this provider. We will need to modify https://github.com/kubernetes-sigs/cluster-api/blob/a52bb727d22f13042d9810a6bf6566204ea28345/cmd/clusterctl/client/config/providers_client.go to make sure OCI shows up in the list.

Why is this needed:
This will make it easier for customers to find our provider.

Add support for upgrading the machine pool's instance configuration.

What would you like to be added:
As part of #94, some investigation went into upgrading the machine pool's images (InstanceConfiguration). It turned out to be a bit more involved: we will need to figure out how to cordon/drain each node and then delete and recreate it, which is what machine deployments do.

As part of this we will need to do the following (a rough Go sketch follows the list):

  • If the image is changed, create a new InstanceConfiguration
  • Update the machine pool to use the new InstanceConfiguration (this won't update the existing instances)
  • Grab a list of the existing instances and iterate:
    • Cordon/drain the node (wait until done)
    • Terminate the instance (wait for the new instance to start; instance pools should do this automatically)
  • Delete the old InstanceConfiguration
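
A rough sketch of that per-instance rollout, using the kubectl drain helpers and the OCI Go SDK. The function and its signature are illustrative, not CAPOCI's actual implementation, and it assumes the node name and backing instance OCID are already known:

```go
// Sketch only: rollInstance is a hypothetical helper.
package rollout

import (
	"context"
	"os"
	"time"

	"github.com/oracle/oci-go-sdk/v65/common"
	"github.com/oracle/oci-go-sdk/v65/core"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/kubectl/pkg/drain"
)

// rollInstance cordons and drains one node, then terminates its backing
// instance so the instance pool replaces it using the new
// InstanceConfiguration.
func rollInstance(ctx context.Context, kube kubernetes.Interface, compute core.ComputeClient, nodeName, instanceID string) error {
	helper := &drain.Helper{
		Ctx:                 ctx,
		Client:              kube,
		Force:               true,
		IgnoreAllDaemonSets: true,
		DeleteEmptyDirData:  true,
		GracePeriodSeconds:  -1, // use each pod's own grace period
		Timeout:             5 * time.Minute,
		Out:                 os.Stdout,
		ErrOut:              os.Stderr,
	}

	// Cordon the node so no new pods are scheduled on it.
	node, err := kube.CoreV1().Nodes().Get(ctx, nodeName, metav1.GetOptions{})
	if err != nil {
		return err
	}
	if err := drain.RunCordonOrUncordon(helper, node, true); err != nil {
		return err
	}
	// Evict the workloads and wait for the drain to finish.
	if err := drain.RunNodeDrain(helper, nodeName); err != nil {
		return err
	}

	// Terminate the instance; the pool is expected to replace it with one
	// built from the new InstanceConfiguration.
	_, err = compute.TerminateInstance(ctx, core.TerminateInstanceRequest{
		InstanceId: common.String(instanceID),
	})
	return err
}
```

The caller would iterate over the pool's instances, waiting between terminations for the replacement instance to become Ready.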

Why is this needed:
If we want users to use machine pools, a major use case will be upgrading their cluster. We need to make sure they can upgrade without service interruption.

Add documentation for VCN Peering feature

What would you like to be documented:

Add documentation for VCN Peering feature

Why is this needed:

Customers can refer to documentation for using VCN peering feature.

Add more validation for OCICluster

What would you like to be added:
Now that #55 has been merged we need to work on adding more validation checks for things like the following:

  • NetworkSpec (valid CIDRs; see the sketch after this list)
  • Region is valid (we might need to call OCI for validation, or the OCI SDK might have pre-flight validation we could use)
  • ClusterName (length?)
  • For cluster update, region is considered an immutable field; we will want to handle other immutable fields as well
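
As an illustration of the first item, a minimal CIDR check a webhook could run over NetworkSpec fields; the example field path in the comment is hypothetical:

```go
// A minimal sketch, assuming the caller supplies the field path of the
// CIDR being checked (e.g. a hypothetical spec.networkSpec.vcn.cidr).
package webhooks

import (
	"net"

	"k8s.io/apimachinery/pkg/util/validation/field"
)

// validateCIDR returns a field.Error when the value is not a valid CIDR
// block such as "10.0.0.0/16".
func validateCIDR(path *field.Path, value string) *field.Error {
	if _, _, err := net.ParseCIDR(value); err != nil {
		return field.Invalid(path, value, "must be a valid CIDR block")
	}
	return nil
}
```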

Why is this needed:
Now that #55 is merged we should expand upon our validation. This issue isn't meant to cover 100% of the possibilities, but rather to cover the checks that make sense now.

Use instance_principal when installing CCM

What would you like to be documented:

We should use instance_principal when installing CCM instead of asking the user for credentials.

Why is this needed:
It's more secure.

Default NSG should allow port 10250 to be open in the control plane subnet for the control plane CIDR

What happened:
kubectl logs for control plane nodes is not working. The default NSG should allow port 10250 to be open in the control plane subnet for the control plane CIDR; a sketch of the missing rule is below.
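
Expressed with the OCI Go SDK, the rule would look roughly like this: an ingress rule on the control plane NSG allowing TCP 10250 (the kubelet API) from the control plane CIDR. The helper function and its parameters are illustrative; only the SDK types and calls are real:

```go
// Sketch: add an ingress rule for the kubelet port to an existing NSG.
package nsg

import (
	"context"

	"github.com/oracle/oci-go-sdk/v65/common"
	"github.com/oracle/oci-go-sdk/v65/core"
)

func allowKubeletFromControlPlane(ctx context.Context, vcn core.VirtualNetworkClient, nsgID, controlPlaneCIDR string) error {
	_, err := vcn.AddNetworkSecurityGroupSecurityRules(ctx, core.AddNetworkSecurityGroupSecurityRulesRequest{
		NetworkSecurityGroupId: common.String(nsgID),
		AddNetworkSecurityGroupSecurityRulesDetails: core.AddNetworkSecurityGroupSecurityRulesDetails{
			SecurityRules: []core.AddSecurityRuleDetails{{
				Direction:   core.AddSecurityRuleDetailsDirectionIngress,
				Protocol:    common.String("6"), // TCP
				Source:      common.String(controlPlaneCIDR),
				SourceType:  core.AddSecurityRuleDetailsSourceTypeCidrBlock,
				Description: common.String("Allow kubelet API from control plane CIDR"),
				TcpOptions: &core.TcpOptions{
					DestinationPortRange: &core.PortRange{
						Min: common.Int(10250),
						Max: common.Int(10250),
					},
				},
			}},
		},
	})
	return err
}
```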

What you expected to happen:
kubectl logs for pods running on control plane nodes should work.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • CAPOCI version: v0.1.0
  • Cluster-API version (use clusterctl version):
  • Kubernetes version (use kubectl version):
  • Docker version (use docker info):
  • OS (e.g. from /etc/os-release):

Bug in API provider Authentication Configuration documentation

What happened:
My capoci-controller-manager pod was in CrashLoopBackOff. Looking at the logs for it I saw:

E0322 15:17:19.001678 1 logr.go:270] setup "msg"="unable to create OCI VCN Client" "error"="can not create client, bad configuration: x509: decryption password incorrect"

Looking at the secret itself, the passphrase was set to this:

% echo -n "T0NJX0NSRURFTlRJQUxTX1BBU1NQSFJBU0U=" | base64 -d
OCI_CREDENTIALS_PASSPHRASE

Going back to the documentation, the Authentication Configuration section here uses the name of an environment variable instead of its value:

export OCI_CREDENTIALS_PASSPHRASE_B64="$(echo -n "OCI_CREDENTIALS_PASSPHRASE" | base64 | tr -d '\n')"

should be

export OCI_CREDENTIALS_PASSPHRASE_B64="$(echo -n "$OCI_CREDENTIALS_PASSPHRASE" | base64 | tr -d '\n')"

Broken Links on the Getting Started book.

What happened: The links to the installation instructions in the Getting Started part of the book's Introduction section are broken and return a 404 error.

What you expected to happen: The links should take me to the installation instructions.

How to reproduce it (as minimally and precisely as possible): Click on the links.

Anything else we need to know?:

Environment:

  • CAPOCI version: 0.1.0
  • Cluster-API version (use clusterctl version):
  • Kubernetes version (use kubectl version):
  • Docker version (use docker info):
  • OS (e.g. from /etc/os-release):

Support pre-flight checks of shape parameters

What would you like to be added:
When the user chooses a bare metal shape, which requires PV_TRANSIT_ENCRYPTION to be false, we should fast-fail before trying to launch the cluster. As part of this we should look at other checks we could perform. A minimal sketch of such a check follows.
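
A minimal sketch of the fast-fail check, assuming a simplified view of the inputs; the real provider would read these values from its machine spec:

```go
// Sketch: reject PV in-transit encryption on bare metal shapes before any
// instance is launched.
package preflight

import (
	"fmt"
	"strings"
)

// checkShape returns an error for combinations OCI would reject at launch
// time, such as in-transit encryption on "BM.*" shapes.
func checkShape(shape string, pvTransitEncryption bool) error {
	if strings.HasPrefix(shape, "BM.") && pvTransitEncryption {
		return fmt.Errorf("shape %s is bare metal and does not support in-transit encryption; set PV_TRANSIT_ENCRYPTION to false", shape)
	}
	return nil
}
```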

Why is this needed:
The current user experience is somewhat hostile in that users aren't informed the cluster will fail until the machines fail to start.

Document how MachinePools work

What would you like to be documented:
We need to document how machine pools work.

Why is this needed:
#89 added support for machine pools, but now we need to document how they work for users.

Remove mandatory field from `OCIClusterSpec` and use webhook validation instead

What would you like to be added:
We want to remove the mandatory tag from CompartmentId string `mandatory:"true" json:"compartmentId"` in the OCIClusterSpec and set up the mandatory validation in ocicluster_webhook instead (see the sketch below).

Why is this needed:
This seems to be the preferred way to do validation, and it will allow us to remove the compartmentId: REPLACE defined in templates/clusterclass-example.yaml.
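
A sketch of what the webhook-side replacement could look like, assuming a simplified OCIClusterSpec (the real type has many more fields):

```go
// Sketch: enforce presence of compartmentId in the webhook instead of via
// the struct tag.
package v1beta1

import (
	"k8s.io/apimachinery/pkg/util/validation/field"
)

type OCIClusterSpec struct {
	// No more `mandatory:"true"`; presence is enforced by the check below.
	CompartmentId string `json:"compartmentId,omitempty"`
}

func validateCompartmentId(spec OCIClusterSpec) field.ErrorList {
	var errs field.ErrorList
	if spec.CompartmentId == "" {
		errs = append(errs, field.Required(
			field.NewPath("spec", "compartmentId"),
			"compartmentId is required"))
	}
	return errs
}
```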

Test clusterctl 1.1.5

What would you like to be added:
We need to test the new 1.1.5 release (probably cut 2022-07-04) to make sure CAPOCI won't have issues with the release.

We will want to run e2e tests using 1.1.5 at the minimum.

Why is this needed:
Need to make sure things don't break.

Add more validation for OCIMachineTemplate

What would you like to be added:
Now that #55 has been merged we need to work on adding more validation checks for things like the following:

Why is this needed:
Now that #55 is merged we should expand upon our validation. This issue isn't meant to cover 100% of the possibilities, but rather to cover the checks that make sense now.

Add a section on using instance principal in CAPOCI

What would you like to be documented:
Add a section on using instance principal in CAPOCI

Why is this needed:
If the management cluster is in OCI, instance principal is the recommended, more secure authentication mechanism to use instead of user principal.

Provide ability to peer Management and Workload Cluster VCN

What would you like to be added:
Users should have the ability to peer the management cluster VCN with the workload cluster VCN. There are various techniques that can be used, such as a Local Peering Gateway or a Dynamic Routing Gateway. Using a Dynamic Routing Gateway provides a single approach to peering VCNs in the same region as well as across regions, and is hence recommended as the initial implementation.

Why is this needed:
This is required for private cluster support.

Webhook validation failing for blank subnet cidr

What happened:
Webhook validations fail for templates/cluster-template-arm-free-tier.yaml: since the subnet CIDRs are not set, the validation fails.

What you expected to happen:
The blank CIDRs would still pass validation, but would be automatically populated by the provider

How to reproduce it (as minimally and precisely as possible):
Try to create a workload cluster using the templates/cluster-template-arm-free-tier.yaml template.

Anything else we need to know?:

Environment:

  • CAPOCI version: 0.3.0
  • Cluster-API version (use clusterctl version):
  • Kubernetes version (use kubectl version):
  • Docker version (use docker info):
  • OS (e.g. from /etc/os-release):

Add pagination for Machine Pools

What would you like to be added:
This issue is specifically focused on ListInstancePoolInstances for machine pools; there is another issue (#97) for confirming that all the other list API calls are paginated. The standard pagination loop is sketched below.

Why is this needed:
Machine pools will potentially have many instances. We want to land this change quickly, as missing pagination could cause issues for MachinePools.
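
For reference, the OCI Go SDK pagination pattern for this call follows OpcNextPage until it is nil; the helper below is a sketch, not the provider's actual code:

```go
// Sketch: collect all instances in an instance pool across pages.
package pools

import (
	"context"

	"github.com/oracle/oci-go-sdk/v65/common"
	"github.com/oracle/oci-go-sdk/v65/core"
)

func listAllPoolInstances(ctx context.Context, client core.ComputeManagementClient, compartmentID, poolID string) ([]core.InstanceSummary, error) {
	var all []core.InstanceSummary
	req := core.ListInstancePoolInstancesRequest{
		CompartmentId:  common.String(compartmentID),
		InstancePoolId: common.String(poolID),
	}
	for {
		resp, err := client.ListInstancePoolInstances(ctx, req)
		if err != nil {
			return nil, err
		}
		all = append(all, resp.Items...)
		if resp.OpcNextPage == nil {
			break
		}
		// Request the next page of results.
		req.Page = resp.OpcNextPage
	}
	return all, nil
}
```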

Add e2e tests around MachinePools

What would you like to be added:
End-to-end tests for machine pools need to be added. At a minimum we should have tests for:

  • Launching a cluster with a pool of size X
  • Scaling up a cluster pool
  • Upgrading the pool image

Why is this needed:
#89 added support for MachinePool but now we need to add testing around it.

Update documentation with details about workload clusters in multiple regions

What would you like to be documented:
As a part of #44 we are adding multi-region support. Currently this functionality isn't exposed in our provided templates, but it should be.

Why is this needed:
This wasn't documented as part of the PR since we wanted to land the functionality then come back and document how it will work and provide a good user experience.

Update create cluster examples

What would you like to be documented:
Cover all available parameters in some examples

Why is this needed:
Our examples don't cover all the options. It would be good to provide example usage of every available parameter. They don't all have to be in a single example; rather, every parameter should be used at least once across all the provided examples.

CAPOCI should publish CRD reference

What would you like to be documented:
CAPOCI should publish CRD reference

Why is this needed:
This will help users understand the CRDs, which the cluster templates are based on. Any customisation of the templates will be greatly helped by this documentation.

As a user I should be able to specify shapes for control and worker nodes separately

What would you like to be added:
Right now, if a cluster is defined with a BM.Standard3.64 shape, both the control plane and all worker nodes get that same shape when the user uses our provided templates. We would like the ability to let users specify one shape for the control plane (for example, VM.Standard.E4.Flex) while the worker nodes use a different shape (for example, BM.Standard3.64).

Why is this needed:
Customers have asked for this feature.

CAPOCI should not use ObjectMeta.ClusterName

What would you like to be added:
We need to go through and make sure we are not using ObjectMeta.ClusterName, since there will be a change in cluster-api to remove it.

Why is this needed:
There is an upstream discussion about dropping ObjectMeta.ClusterName: kubernetes/kubernetes#108717. Related CAPI issue: kubernetes-sigs/cluster-api#6305

This field is not populated by Kubernetes and we should not use it as the CAPOCI cluster name.

This issue is about researching where we use it in CAPOCI and fixing it where it is used.

Add CAPOCI to clusterctl

What would you like to be added:
When a user runs `clusterctl config repositories`, we should show information about the CAPOCI provider. We will also want to coordinate with #20 to make sure the docs are updated as well.

Why is this needed:
This will make it easier for users to find our provider.

As a user I should be able to launch clusters in multiple regions

What would you like to be added:
There should be the ability to launch workload clusters in multiple regions.
Example: a user can launch a cluster in SYD and another cluster in IAD, with the same management cluster managing the lifecycle of both workload clusters. A sketch of a per-region client approach is below.
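
A sketch of one way to support this: construct one OCI SDK client per workload cluster region and point it at that region with SetRegion. The helper and region names are illustrative:

```go
// Sketch: one compute client per workload cluster region.
package multiregion

import (
	"github.com/oracle/oci-go-sdk/v65/common"
	"github.com/oracle/oci-go-sdk/v65/core"
)

func clientsForRegions(provider common.ConfigurationProvider, regions []string) (map[string]core.ComputeClient, error) {
	clients := make(map[string]core.ComputeClient, len(regions))
	for _, region := range regions {
		c, err := core.NewComputeClientWithConfigurationProvider(provider)
		if err != nil {
			return nil, err
		}
		// Point this client at the workload cluster's region, e.g.
		// "ap-sydney-1" (SYD) or "us-ashburn-1" (IAD).
		c.SetRegion(region)
		clients[region] = c
	}
	return clients, nil
}
```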

Why is this needed:
We have users asking for this.

clusterctl move does not work with CAPOCI

What happened:
clusterctl move does not work with CAPOCI. Reconciliation fails with the error below:

cluster api tags have been modified out of context

What you expected to happen:

Reconciliation should still happen

How to reproduce it (as minimally and precisely as possible):

Use clusterctl move on an OCICluster.

Anything else we need to know?:

CAPOCI uses the cluster UID as the resource identifier and tags all created resources with it. During the move operation, however, the cluster gets a new UID, which triggers the "modified out of context" error. The kind of ownership check that breaks is sketched below.
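
For illustration only, a sketch of the UID-based tagging and ownership check described above; the tag key is hypothetical, not the provider's actual key:

```go
// Sketch of the failure mode: resources keep the old UID, the moved
// cluster object gets a new one.
package tags

// clusterTags is the kind of freeform tag set stamped onto every OCI
// resource the provider creates.
func clusterTags(clusterUID string) map[string]string {
	return map[string]string{
		"cluster-uid": clusterUID, // hypothetical tag key
	}
}

// resourceBelongsToCluster is the ownership check that fails after a move:
// the OCI resource still carries the old UID while the moved cluster
// object carries a new one.
func resourceBelongsToCluster(freeformTags map[string]string, clusterUID string) bool {
	return freeformTags["cluster-uid"] == clusterUID
}
```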

Environment:

  • CAPOCI version:
  • Cluster-API version (use clusterctl version):
  • Kubernetes version (use kubectl version):
  • Docker version (use docker info):
  • OS (e.g. from /etc/os-release):

Add validations using validation webhooks

What would you like to be added:
Add validations via validation webhooks.

Why is this needed:
Validation webhooks make sure that mandatory parameters are set well before reconciliation, so users can catch errors much earlier. Currently all validations happen during reconciliation.

Provide ability to choose between public and private clusters

What would you like to be added:

Currently, clusters created by CAPOCI can all be publicly accessed via a public load balancer. This issue is a proposal to make it possible to have clusters whose API servers can only be accessed from a restricted CIDR range and are not exposed to the Internet.

Why is this needed:
Restricting public access to Kubernetes clusters is recommended in order to improve their security.

Set defaults via webhooks

What would you like to be added:
Add defaults via webhooks.

Why is this needed:

Currently all defaults are set during reconciliation. Webhooks are a better place to do this than reconciliation; a minimal sketch follows.
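
A minimal sketch of kubebuilder-style defaulting, assuming a simplified spec type and an illustrative default value; the real CAPOCI types and defaults differ:

```go
// Sketch: fill in blank fields in a mutating webhook, before validation
// and long before reconciliation runs.
package v1beta1

const defaultVCNCIDR = "10.0.0.0/16" // illustrative default

// NetworkSpec is a simplified stand-in for the provider's network types.
type NetworkSpec struct {
	VCNCIDR string `json:"vcnCIDR,omitempty"` // hypothetical field name
}

type ClusterSpec struct {
	NetworkSpec NetworkSpec `json:"networkSpec,omitempty"`
}

// Default is called by the mutating webhook on create and update.
func (s *ClusterSpec) Default() {
	if s.NetworkSpec.VCNCIDR == "" {
		s.NetworkSpec.VCNCIDR = defaultVCNCIDR
	}
}
```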

Support for local Windows management cluster

What would you like to be added:
We don't currently have anyone on the core team using Windows. It would be nice to have someone test that:

  1. local management cluster processes work on Windows
  2. local development processes work on Windows
  3. Update docs (or create issues) to document any needed Windows tools

Why is this needed:
We want to be able to support the community on whatever tools they choose to use.

Continually get OOMKilled when using Rancher Desktop

What happened:
When launching a new management cluster with the OCI provider, the capoci-controller-manager continually gets OOMKilled.

What you expected to happen:
I would expect the CAPOCI manager to launch.

How to reproduce it (as minimally and precisely as possible):
Use kind to create a cluster, then run clusterctl init --infrastructure oci; when the capoci-controller-manager attempts to launch, it will be killed.

Anything else we need to know?:

Environment:

  • CAPOCI version: 0.2.0/0.3.0
  • Rancher Desktop version: 1.3.0
  • Cluster-API version (use clusterctl version):
  • Kubernetes version (use kubectl version):
  • Docker version (use docker info):
  • OS (e.g. from /etc/os-release):

Add support for MachinePool

What would you like to be added:
Support for multiple MachinePools, implemented on top of OCI's InstancePool (see the sketch below).
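
A sketch of creating the backing instance pool with the OCI Go SDK; the display name and parameters are example values, not CAPOCI's actual implementation:

```go
// Sketch: back a MachinePool with an OCI instance pool.
package pools

import (
	"context"

	"github.com/oracle/oci-go-sdk/v65/common"
	"github.com/oracle/oci-go-sdk/v65/core"
)

func createPool(ctx context.Context, client core.ComputeManagementClient, compartmentID, instanceConfigID, ad, subnetID string, size int) (*core.InstancePool, error) {
	resp, err := client.CreateInstancePool(ctx, core.CreateInstancePoolRequest{
		CreateInstancePoolDetails: core.CreateInstancePoolDetails{
			CompartmentId:           common.String(compartmentID),
			InstanceConfigurationId: common.String(instanceConfigID),
			Size:                    common.Int(size),
			DisplayName:             common.String("capoci-machine-pool"), // example
			PlacementConfigurations: []core.CreateInstancePoolPlacementConfigurationDetails{{
				AvailabilityDomain: common.String(ad),
				PrimarySubnetId:    common.String(subnetID),
			}},
		},
	})
	if err != nil {
		return nil, err
	}
	return &resp.InstancePool, nil
}
```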

Why is this needed:

Currently, when we create a cluster, worker instances are created as individual instances. This means that failures in control plane/worker nodes have to be addressed manually.

We can address the above and make self-managed clusters created by CAPOCI more flexible and more resilient to failures.

Control plane nodes display provisioned despite desired state not met

What happened:
When setting the control plane to 3 nodes, the cluster object transitions to provisioned even though the desired state (3 provisioned control plane nodes) has not yet been met. This can happen when a control plane node takes longer to provision but the cluster is already available.

What you expected to happen:
The cluster object should only transition to provisioned after the desired state (i.e. 3 control plane nodes are ready) is met.

How to reproduce it (as minimally and precisely as possible):
export CONTROL_PLANE_MACHINE_COUNT=3

Anything else we need to know?:

Environment:

  • CAPOCI version: v0.1.0
  • Cluster-API version (use clusterctl version): v1.0
  • Kubernetes version (use kubectl version): v1.22.5
  • Docker version (use docker info):
  • OS (e.g. from /etc/os-release):
