operator's People

Contributors

afrittoli, akihikokuroda, alex-souslik-hs, barthy1, concaf, davidumea, dependabot[bot], ibotty, jacksgt, jimmyjones2, jkandasa, kabhiibm, kahirokunn, karthikjeeyar, khrm, nikhil-thomas, piyush-garg, pradeepitm12, pratap0007, puneetpunamiya, rupalibehera, sabre1041, savitaashture, sugardon, timmyers, vdemeester, vinamra28, vincent-pli, yselkowitz, zroubalik

operator's Issues

Discussion: pick a CRD name that makes more sense than "Config"

When I install tektoncd/operator on my private Kubernetes cluster, a cluster-scoped resource named Config feels uncomfortable: it confuses people, who assume the resource is a global configuration for the whole Kubernetes cluster.

So I think we could change the name. I checked knative/serving-operator, which names its resource KnativeServing, so could we choose Tekton or TektonPipeline?

Publish Operator on OperatorHub.io

Once we have our first release, we will want to publish the upstream operator on operatorhub.io. This issue is to track that.

Identify the scenarios for integration tests, and implement them

Expected Behavior

We need to identify the major use cases for integration tests to verify. For example: which CRs do we need to create? Which related resources should be created? Can any resources restore themselves? Are all the resources removed when the CR is gone?

Setting this up should be a crucial step to tackle ASAP, since we cannot rely on manual verification for each PR. A test skeleton for these scenarios is sketched under Additional Info below.

Actual Behavior

Steps to Reproduce the Problem

Additional Info
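
As a starting point, the scenarios above could be captured as a Go e2e test skeleton along the following lines. This is only a sketch: the helpers referenced in the comments (createCR, waitForDeployment, deleteCR, ...) do not exist in the repository and stand in for whatever client plumbing the test framework provides.

package e2e

import "testing"

// Sketch of the lifecycle scenarios described above; the commented helper
// calls are placeholders, not existing code.
func TestTektonPipelineLifecycle(t *testing.T) {
	t.Run("creating the CR installs the pipeline deployments", func(t *testing.T) {
		// createCR(t, configCRManifest)
		// waitForDeployment(t, "tekton-pipelines", "tekton-pipelines-controller")
		// waitForDeployment(t, "tekton-pipelines", "tekton-pipelines-webhook")
	})
	t.Run("deleted deployments are recreated by the reconciler", func(t *testing.T) {
		// deleteDeployment(t, "tekton-pipelines", "tekton-pipelines-controller")
		// waitForDeployment(t, "tekton-pipelines", "tekton-pipelines-controller")
	})
	t.Run("deleting the CR removes all managed resources", func(t *testing.T) {
		// deleteCR(t, "cluster")
		// waitForNamespaceGone(t, "tekton-pipelines")
	})
}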

Issue with `deployOperator` in the test case with operator-sdk

I found an issue with the operator-sdk we are using for the Tekton operator while working on PR #24.

The reconcile function can be kicked off every time there is a change to the CR or a deployment, if we launch the operator locally with the deployOperator function in the test case at https://github.com/tektoncd/operator/blob/master/test/testgroups/cluster_crd.go#L26.

If we check the status of the deployments and revive the pipeline deployments, the reconcile function may need to be kicked off multiple times to reach the expected status (CR created, reconcile triggered; deployment 1 created, reconcile triggered; deployment 2 created, reconcile triggered; deployment 1 deleted, reconcile triggered; deployment 2 deleted, reconcile triggered; ...).

However, with the operator launched LOCALLY via the deployOperator function in the test case, the reconcile function sometimes MISSES a call. If we build an image and run the operator with kubectl apply, the issue does not occur.

[`main` branch] TektonPipeline reconciler doesn't update the TektonPipeline instance status correctly

Expected Behavior

When the knative/pkg based implementation of the operator installs TektonPipeline, the status of the TektonPipeline CRD instance should be updated correctly.

Actual Behavior

The READY status of the CR is not updated correctly:

...
status:
  conditions:
...
  - lastTransitionTime: "2020-09-22T16:54:57Z"
    status: Unknown
    type: Ready
...

and

$ kubectl get tektonpipelines.operator.tekton.dev cluster
NAME      VERSION   READY     REASON
cluster   0.15.2    Unknown

Steps to Reproduce the Problem

  1. clone tektoncd/operator main branch
  2. ko apply -f config
  3. kubectl apply -f config/crds/operator_v1alpha1_pipeline_cr.yaml
  4. kubectl get tektonpipelines.operator.tekton.dev cluster
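
For illustration, here is a minimal sketch of how a knative/pkg based status could own a Ready condition that the reconciler marks once installation finishes. The status shape and the MarkInstallSucceeded name are assumptions made for this sketch, not the operator's actual API; the point is that the Ready condition has to be marked and the status update persisted, otherwise it stays Unknown as shown above.

package v1alpha1

import (
	"knative.dev/pkg/apis"
	duckv1 "knative.dev/pkg/apis/duck/v1"
)

// TektonPipelineStatus embeds the knative duck Status so it can carry
// standard conditions (illustrative shape).
type TektonPipelineStatus struct {
	duckv1.Status `json:",inline"`
}

// Ready is the "happy" condition managed by this set.
var pipelineCondSet = apis.NewLivingConditionSet()

func (s *TektonPipelineStatus) InitializeConditions() {
	pipelineCondSet.Manage(s).InitializeConditions()
}

// MarkInstallSucceeded (illustrative name) flips Ready to True; the reconciler
// still has to write the updated status back to the API server.
func (s *TektonPipelineStatus) MarkInstallSucceeded() {
	pipelineCondSet.Manage(s).MarkTrue(apis.ConditionReady)
}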

Refactor the backbone of operator from controller-runtime based to knative/pkg based

Expected Behavior

As discussed at length in issue #1, the benefits of a consistent architecture across all Tekton projects will pay off in the future, by making it easier to shift our efforts from one Tekton project to another.

A similar project, serving-operator, is undertaking the same transition now. I can move forward with this transition if there is no strong objection. @vdemeester @nikhil-thomas @vincent-pli @hrishin @bobcatfish @sthaha

Actual Behavior

We use a controller-runtime based structure generated by operator-sdk.

Steps to Reproduce the Problem

  1. Check the structure of the source code.

Additional Info

The tekton-pipelines namespace is not deleted after deleting the config CR

Expected Behavior

The tekton-pipelines namespace is deleted after deleting the config CR

Actual Behavior

The tekton-pipelines namespace is not deleted after deleting the config CR

Steps to Reproduce the Problem

  1. Start the Tekton operator
  2. Apply the config CR
  3. Delete the config CR

Additional Info

All resources in the tekton-pipelines namespace are deleted, but the namespace itself is still there. I think it should also be deleted.

export GO111MODULE=on makes `ko apply` extremely slow

To let ko apply build the image and run the operator, we have to get rid of export GO111MODULE=on in the scripts, since it severely slows down the image build and operator deployment, and can even make the pod fail to launch before the timeout.

README says 0.5.0 is installed?

Expected Behavior

Installing the operator installs a recent Tekton (based on 1ffa651, that's 0.7.0?), and that version is documented up front.

...or the specific version isn't documented up front, and the documentation just says "a recent Tekton will be installed".

Actual Behavior

README.md says:

The Operator will automatic install Tekton pipeline with v0.5.2 in the namespace tekton-pipeline

Add ko support to build and publish the image, and even install the operator

Expected Behavior

When we run ko apply -f config, the operator should be installed from the source code.

ko should be the primary command to build, publish, and install Tekton projects.

Actual Behavior

We do not support this yet; we rely on operator-sdk to do everything.

Steps to Reproduce the Problem

  1. Not supported yet; there is no command available to run.

Additional Info

Tekton operator security improvements

Expected Behavior

  1. The operator creates the tekton-pipelines-controller and tekton-pipelines-webhook deployments, but it should be possible to define a custom registry for the images (to use in air-gapped environments)
  2. The securityContext for the deployments should be configurable; as of now it is {}, which means root, privileged, etc. (a sketch of one possible approach is under Additional Info below)

Actual Behavior

Steps to Reproduce the Problem

  1. Install the standard Tekton operator using the currently defined config

Additional Info
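
For illustration, here is a rough sketch of how both requests could be met with a manifest transformer, assuming the operator applies its payload through manifestival (mf). The function name, the customRegistry parameter and the exact fields set are assumptions for this sketch, not existing operator configuration.

package transform

import (
	"path"

	mf "github.com/manifestival/manifestival"
	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime"
)

// SecurityTransformer rewrites Deployment images to point at a custom registry
// and sets a non-root securityContext on every container.
func SecurityTransformer(customRegistry string) mf.Transformer {
	return func(u *unstructured.Unstructured) error {
		if u.GetKind() != "Deployment" {
			return nil // leave non-Deployment resources untouched
		}
		d := &appsv1.Deployment{}
		if err := runtime.DefaultUnstructuredConverter.FromUnstructured(u.Object, d); err != nil {
			return err
		}
		runAsNonRoot := true
		for i := range d.Spec.Template.Spec.Containers {
			c := &d.Spec.Template.Spec.Containers[i]
			c.Image = customRegistry + "/" + path.Base(c.Image) // keep image name and tag, swap registry
			c.SecurityContext = &corev1.SecurityContext{RunAsNonRoot: &runAsNonRoot}
		}
		out, err := runtime.DefaultUnstructuredConverter.ToUnstructured(d)
		if err != nil {
			return err
		}
		u.SetUnstructuredContent(out)
		return nil
	}
}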

Is zz_generated.openapi.go still needed?

Expected Behavior

All files should be used.

Actual Behavior

I did not see anything that necessarily uses this file. If nothing does, we should remove it.

Steps to Reproduce the Problem

None

Additional Info

Failed to install after being removed

Expected Behavior

Be able to remove all traces of the Tekton operator and then reinstall it.

Actual Behavior

The Tekton operator goes into an error loop:

2020-09-03T05:15:34.708842160Z {"level":"info","ts":1599110134.7087033,"logger":"cmd","msg":"Go Version: go1.13"}
2020-09-03T05:15:34.708891309Z {"level":"info","ts":1599110134.7087312,"logger":"cmd","msg":"Go OS/Arch: linux/amd64"}
2020-09-03T05:15:34.708898679Z {"level":"info","ts":1599110134.7087388,"logger":"cmd","msg":"Version of operator-sdk: v0.17.0"}
2020-09-03T05:15:34.710152870Z {"level":"info","ts":1599110134.7100368,"logger":"leader","msg":"Trying to become the leader."}
2020-09-03T05:15:35.632809869Z {"level":"info","ts":1599110135.6326385,"logger":"leader","msg":"No pre-existing lock was found."}
2020-09-03T05:15:35.640752435Z {"level":"info","ts":1599110135.640669,"logger":"leader","msg":"Became the leader."}
2020-09-03T05:15:36.545345206Z {"level":"info","ts":1599110136.5452123,"logger":"controller-runtime.metrics","msg":"metrics server is starting to listen","addr":"0.0.0.0:8383"}
2020-09-03T05:15:36.545719820Z {"level":"info","ts":1599110136.5456192,"logger":"cmd","msg":"Registering Components."}
2020-09-03T05:15:36.559811421Z {"level":"info","ts":1599110136.5597262,"logger":"ctrl.tektonpipeline.add","msg":"Watching operator tektonpipeline CR"}
2020-09-03T05:15:36.559869170Z {"level":"info","ts":1599110136.5597873,"logger":"ctrl.tektonpipeline.create-cr","msg":"creating a clusterwide resource of tektonpipeline crd","name":"cluster"}
2020-09-03T05:15:39.317699016Z {"level":"info","ts":1599110139.317469,"logger":"metrics","msg":"Metrics Service object updated","Service.Name":"tekton-operator-metrics","Service.Namespace":"tekton-operator"}
2020-09-03T05:15:40.219461311Z {"level":"info","ts":1599110140.2192578,"logger":"cmd","msg":"Could not create ServiceMonitor object","error":"no ServiceMonitor registered with the API"}
2020-09-03T05:15:40.219492871Z {"level":"info","ts":1599110140.219319,"logger":"cmd","msg":"Install prometheus-operator in your cluster to create ServiceMonitor objects","error":"no ServiceMonitor registered with the API"}
2020-09-03T05:15:40.219500831Z {"level":"info","ts":1599110140.2193341,"logger":"cmd","msg":"Starting the Cmd."}
2020-09-03T05:15:40.219718438Z {"level":"info","ts":1599110140.219566,"logger":"controller-runtime.manager","msg":"starting metrics server","path":"/metrics"}
2020-09-03T05:15:40.219801556Z {"level":"info","ts":1599110140.2196357,"logger":"controller-runtime.controller","msg":"Starting EventSource","controller":"tektonaddon-controller","source":"kind source: /, Kind="}
2020-09-03T05:15:40.219868685Z {"level":"info","ts":1599110140.2197213,"logger":"controller-runtime.controller","msg":"Starting EventSource","controller":"tektonpipeline-controller","source":"kind source: /, Kind="}
2020-09-03T05:15:40.320204542Z {"level":"info","ts":1599110140.3200305,"logger":"controller-runtime.controller","msg":"Starting EventSource","controller":"tektonaddon-controller","source":"kind source: /, Kind="}
2020-09-03T05:15:40.320406909Z {"level":"info","ts":1599110140.3202844,"logger":"controller-runtime.controller","msg":"Starting EventSource","controller":"tektonpipeline-controller","source":"kind source: /, Kind="}
2020-09-03T05:15:40.420721977Z {"level":"info","ts":1599110140.420439,"logger":"controller-runtime.controller","msg":"Starting Controller","controller":"tektonaddon-controller"}
2020-09-03T05:15:40.420751816Z {"level":"info","ts":1599110140.4205706,"logger":"controller-runtime.controller","msg":"Starting Controller","controller":"tektonpipeline-controller"}
2020-09-03T05:15:40.420760486Z {"level":"info","ts":1599110140.4206328,"logger":"controller-runtime.controller","msg":"Starting workers","controller":"tektonpipeline-controller","worker count":1}
2020-09-03T05:15:40.420798455Z {"level":"info","ts":1599110140.4206774,"logger":"controller-runtime.controller","msg":"Starting workers","controller":"tektonaddon-controller","worker count":1}
2020-09-03T05:15:40.420872125Z {"level":"info","ts":1599110140.4208064,"logger":"ctrl.tektonpipeline.reconcile","msg":"reconciling tektonpipeline change","Request.Namespace":"","Request.NamespaceName":"/cluster","Request.Name":"cluster"}
2020-09-03T05:15:40.420883814Z {"level":"info","ts":1599110140.4208431,"logger":"ctrl.tektonpipeline.reconcile","msg":"installing pipelines","Request.Namespace":"","Request.NamespaceName":"/cluster","Request.Name":"cluster","path":"/var/run/ko/resources/pipelines/v0.15.2"}
2020-09-03T05:15:40.583869676Z time="2020-09-03T05:15:40Z" level=error msg="Operation cannot be fulfilled on tektonpipelines.operator.tekton.dev \"cluster\": the object has been modified; please apply your changes to the latest version and try againstatus update failed" source="pipeline_controller.go:328"
2020-09-03T05:15:40.583977214Z {"level":"error","ts":1599110140.583758,"logger":"controller-runtime.controller","msg":"Reconciler error","controller":"tektonpipeline-controller","request":"/cluster","error":"failed to apply non deployment manifest: Internal error occurred: failed calling webhook \"config.webhook.pipeline.tekton.dev\": Post \"https://tekton-pipelines-webhook.tekton-pipelines.svc:443/config-validation?timeout=30s\": service \"tekton-pipelines-webhook\" not found","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\tgithub.com/go-logr/[email protected]/zapr.go:128\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\tsigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:258\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\tsigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:232\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\tsigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:211\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\tk8s.io/[email protected]/pkg/util/wait/wait.go:152\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\tk8s.io/[email protected]/pkg/util/wait/wait.go:153\nk8s.io/apimachinery/pkg/util/wait.Until\n\tk8s.io/[email protected]/pkg/util/wait/wait.go:88"}
2020-09-03T05:15:41.584217446Z {"level":"info","ts":1599110141.5840812,"logger":"ctrl.tektonpipeline.reconcile","msg":"reconciling tektonpipeline change","Request.Namespace":"","Request.NamespaceName":"/cluster","Request.Name":"cluster"}
2020-09-03T05:15:41.584248705Z {"level":"info","ts":1599110141.5841324,"logger":"ctrl.tektonpipeline.reconcile","msg":"installing pipelines","Request.Namespace":"","Request.NamespaceName":"/cluster","Request.Name":"cluster","path":"/var/run/ko/resources/pipelines/v0.15.2"}
2020-09-03T05:15:41.746497008Z {"level":"error","ts":1599110141.7462823,"logger":"controller-runtime.controller","msg":"Reconciler error","controller":"tektonpipeline-controller","request":"/cluster","error":"failed to apply non deployment manifest: Internal error occurred: failed calling webhook \"config.webhook.pipeline.tekton.dev\": Post \"https://tekton-pipelines-webhook.tekton-pipelines.svc:443/config-validation?timeout=30s\": service \"tekton-pipelines-webhook\" not found","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\tgithub.com/go-logr/[email protected]/zapr.go:128\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\tsigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:258\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\tsigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:232\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\tsigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:211\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\tk8s.io/[email protected]/pkg/util/wait/wait.go:152\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\tk8s.io/[email protected]/pkg/util/wait/wait.go:153\nk8s.io/apimachinery/pkg/util/wait.Until\n\tk8s.io/[email protected]/pkg/util/wait/wait.go:88"}

Steps to Reproduce the Problem

  1. kubectl apply -f https://storage.googleapis.com/tekton-releases/operator/latest/release.notags.yaml
  2. Delete any resources and traces of tekton
  3. kubectl apply -f https://storage.googleapis.com/tekton-releases/operator/latest/release.notags.yaml

Handle changes of labels in pipeline, triggers and dashboard deployments

Expected Behavior

The pipeline, triggers and dashboard deployments can be updated to the latest release

Actual Behavior

The master versions of the three projects failed to update in the robocat cluster because of new labels that were added to the deployments, which cause errors when updating the existing deployments:

[deploy-tekton-project] Resource: "apps/v1, Resource=deployments", GroupVersionKind: "apps/v1, Kind=Deployment"
[deploy-tekton-project] Name: "tekton-pipelines-webhook", Namespace: "tekton-pipelines"
[deploy-tekton-project] for: "pipeline/overlays/robocat": Deployment.apps "tekton-pipelines-webhook" is invalid: spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{"app.kubernetes.io/component":"webhook", "app.kubernetes.io/instance":"default", "app.kubernetes.io/name":"webhook", "app.kubernetes.io/part-of":"tekton-pipelines"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable

This is a heads-up, as the operator may have to deal with this to ensure seamless upgrades. One possible approach is sketched under Additional Info below.

Steps to Reproduce the Problem

Additional Info

See tektoncd/plumbing#392 for more details
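
One way the operator could cope with the immutable selector is to fall back to deleting and recreating the Deployment when the in-place update is rejected. This is only a sketch, assuming deployments are applied through a controller-runtime client; it is not a description of current behavior.

package upgrade

import (
	"context"

	appsv1 "k8s.io/api/apps/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// recreateOnImmutableSelector tries an in-place update first and falls back to
// delete-and-recreate when the API server rejects the update (e.g. because
// spec.selector is immutable). The error check is deliberately coarse.
func recreateOnImmutableSelector(ctx context.Context, c client.Client, desired *appsv1.Deployment) error {
	err := c.Update(ctx, desired)
	if err == nil || !apierrors.IsInvalid(err) {
		return err
	}
	if err := c.Delete(ctx, desired); err != nil && !apierrors.IsNotFound(err) {
		return err
	}
	desired.ResourceVersion = "" // must be empty when creating a fresh object
	return c.Create(ctx, desired)
}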

Tekton Dashboard - Extensions Webhook - Error loading extension

Expected Behavior

Tekton Dashboard - Extensions Webhook - Page is available

Actual Behavior

Tekton Dashboard - Extensions Webhook
Page not available: "Error loading extension error loading dynamically imported module"

Steps to Reproduce the Problem

  1. OKD 4.5.0
  2. Install Operator (latest)
  3. cluster-admin role for operator
  4. Install Dashboard (openshift-v0.8.2)
  5. Install extensionwebhooks ( one of: v0.2.0, openshift-v0.2.0, v0.6.1, v0.2.1, openshift-v0.2.1)
  6. Create Route for Dashboard
  7. Authenticate
  8. Browse to: Tekton Dashboard - Extensions Webhook

Change tektonVersion to a command flag so that users can choose the Tekton version

Expected Behavior

Users can pass --version v0.7.0 to install a different Tekton version.

Actual Behavior

tektonVersion = "v0.8.0" version is hardcode in config controller

User cannot choose the version, just use the default v0.8.0 version now.

It is better to set it a parameter. then we can use it for the upgrade or future version maintain
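
A minimal sketch of what this could look like with the standard library flag package; the flag name and default are illustrative:

package main

import (
	"flag"
	"log"
)

// tektonVersion would replace the hardcoded constant in the config controller.
var tektonVersion = flag.String("version", "v0.8.0", "Tekton Pipelines version to install")

func main() {
	flag.Parse()
	log.Printf("installing Tekton Pipelines %s", *tektonVersion)
	// ... pass *tektonVersion down to the config controller instead of the constant
}

The operator could then be started with --version v0.7.0 to install a different payload.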

Add a short poll to CheckDeployment Stage

Ref

func CheckDeployments(ctx context.Context, manifest *mf.Manifest, instance v1alpha1.TektonComponent) error {

Current behavior

CheckDeployments fetches the deployments only once. Hence, the Ready status condition was not always updated correctly. This also made the e2e tests flaky.

So CheckDeployments was modified (#174) to return an error instead of nil when a deployment is not found, so that the reconciler is re-triggered.

However, this is not ideal: all the stages before CheckDeployments get applied again.

Expected behavior

CheckDeployments should retry a few times (with a short timeout) before returning an error, as sketched below.
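
A sketch of the suggested retry, assuming a client-go clientset is available to the stage; the interval and timeout values are illustrative:

package status

import (
	"context"
	"time"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/wait"
	"k8s.io/client-go/kubernetes"
)

// waitForDeploymentReady polls the deployment for a short while instead of
// failing the whole stage on the first NotFound or not-ready read.
func waitForDeploymentReady(kc kubernetes.Interface, ns, name string) error {
	return wait.PollImmediate(2*time.Second, 30*time.Second, func() (bool, error) {
		d, err := kc.AppsV1().Deployments(ns).Get(context.TODO(), name, metav1.GetOptions{})
		if apierrors.IsNotFound(err) {
			return false, nil // not created yet, keep polling
		}
		if err != nil {
			return false, err
		}
		return d.Spec.Replicas != nil && d.Status.ReadyReplicas == *d.Spec.Replicas, nil
	})
}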

Cannot designate PR to build nightly image

Expected Behavior

We should build the nightly image based on an existing commit in master. We should remove this line: https://github.com/tektoncd/operator/blob/master/test/presubmit-tests.sh#L57.

Actual Behavior

We build the nightly image on every pending PR.

Steps to Reproduce the Problem

  1. Open a PR
  2. The image gcr.io/tekton-nightly/tektoncd-operator is built with https://github.com/tektoncd/operator/blob/master/test/presubmit-tests.sh#L57.

Additional Info

Is the `no-auto-install` option of the CR still needed?

Expected Behavior

We use CRs to install Tekton Pipelines and all the other components; the CRs are the source of configuration.
The no-auto-install option lets us choose whether the pipeline CR is installed automatically or not. Is it still necessary for configuring the Tekton operator?

I suggest removing this option.
Users can install the operator, and then install the pipeline by manually applying the pipeline CR.

Actual Behavior

The pipeline CR can be installed automatically by setting no-auto-install to false.

Log of the pod that failed to run in the prow job for `ko apply -f config/`

I0827 16:28:54.492] ERROR: timeout waiting for pods to come up
I0827 16:28:54.492] tekton-operator-79f78d6994-t8w69   0/1   CrashLoopBackOff   5     5m36s
I0827 16:28:55.036] {"level":"info","ts":1566923178.1520412,"logger":"cmd","msg":"Go Version: go1.11.4"}
I0827 16:28:55.036] {"level":"info","ts":1566923178.1521137,"logger":"cmd","msg":"Go OS/Arch: linux/amd64"}
I0827 16:28:55.037] {"level":"info","ts":1566923178.152141,"logger":"cmd","msg":"Version of operator-sdk: v0.9.0"}
I0827 16:28:55.037] {"level":"info","ts":1566923178.152508,"logger":"leader","msg":"Trying to become the leader."}
I0827 16:28:55.038] {"level":"error","ts":1566923178.3067694,"logger":"k8sutil","msg":"Failed to get Pod","Pod.Namespace":"default","Pod.Name":"tekton-operator-79f78d6994-t8w69","error":"pods \"tekton-operator-79f78d6994-t8w69\" is forbidden: User \"system:serviceaccount:default:tekton-operator\" cannot get resource \"pods\" in API group \"\" in the namespace \"default\": RBAC: role.rbac.authorization.k8s.io \"tekton-operator\" not found","stacktrace":"github.com/tektoncd/operator/vendor/github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/src/github.com/tektoncd/operator/vendor/github.com/go-logr/zapr/zapr.go:128\ngithub.com/tektoncd/operator/vendor/github.com/operator-framework/operator-sdk/pkg/k8sutil.GetPod\n\t/go/src/github.com/tektoncd/operator/vendor/github.com/operator-framework/operator-sdk/pkg/k8sutil/k8sutil.go:107\ngithub.com/tektoncd/operator/vendor/github.com/operator-framework/operator-sdk/pkg/leader.myOwnerRef\n\t/go/src/github.com/tektoncd/operator/vendor/github.com/operator-framework/operator-sdk/pkg/leader/leader.go:133\ngithub.com/tektoncd/operator/vendor/github.com/operator-framework/operator-sdk/pkg/leader.Become\n\t/go/src/github.com/tektoncd/operator/vendor/github.com/operator-framework/operator-sdk/pkg/leader/leader.go:67\nmain.main\n\t/go/src/github.com/tektoncd/operator/cmd/manager/main.go:97\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:201"}
I0827 16:28:55.039] {"level":"error","ts":1566923178.3068914,"logger":"cmd","msg":"","error":"pods \"tekton-operator-79f78d6994-t8w69\" is forbidden: User \"system:serviceaccount:default:tekton-operator\" cannot get resource \"pods\" in API group \"\" in the namespace \"default\": RBAC: role.rbac.authorization.k8s.io \"tekton-operator\" not found","stacktrace":"github.com/tektoncd/operator/vendor/github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/src/github.com/tektoncd/operator/vendor/github.com/go-logr/zapr/zapr.go:128\nmain.main\n\t/go/src/github.com/tektoncd/operator/cmd/manager/main.go:99\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:201"}

Tekton ConfigMaps are overwritten by Operator

Expected Behavior

After installing the Operator, any modifications made to the Tekton ConfigMaps (config-defaults, feature-flags) should not be overwritten, even after operator restarts or upgrades.

Actual Behavior

If I restart the operator deployment container, the ConfigMaps are overwritten with default values, replacing anything I've defined. A possible mitigation is sketched after the reproduction steps below.

Steps to Reproduce the Problem

  • Install Tekton Operator
  • Wait for everything to be provisioned
  • Modify any tekton-pipelines/feature-flags config map entry
  • Scale-down tekton operator deployment to zero
  • Scale-up tekton operator deployment to one
  • Check the modified tekton-pipelines/feature-flags config map is overwritten
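
One possible mitigation, sketched here under the assumption that the payload is applied through manifestival (mf) and that a controller-runtime client is available, is to filter out ConfigMaps that already exist on the cluster before applying, so user edits survive operator restarts. This is not the current implementation, just an illustration.

package apply

import (
	"context"

	mf "github.com/manifestival/manifestival"
	corev1 "k8s.io/api/core/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// skipExistingConfigMaps keeps every resource except ConfigMaps that are
// already present on the cluster, so config-defaults, feature-flags, etc.
// are not reset to payload defaults on every reconcile.
func skipExistingConfigMaps(ctx context.Context, c client.Client) mf.Predicate {
	return func(u *unstructured.Unstructured) bool {
		if u.GetKind() != "ConfigMap" {
			return true
		}
		existing := &corev1.ConfigMap{}
		err := c.Get(ctx, client.ObjectKey{Namespace: u.GetNamespace(), Name: u.GetName()}, existing)
		return apierrors.IsNotFound(err) // apply only if it does not exist yet
	}
}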

Build an official image for Operator

Expected Behavior

End users should not need to build the image themselves unless they have specific requirements.

Actual Behavior

Users must currently build the Operator image themselves.

Additional Info

As @nikhil-thomas commented here: #15 (comment)
I think we need to supply an official image for end users before we switch to ko (maybe).

Received many `Failed to list *unstructured.Unstructured` errors in the operator pod log

Expected Behavior

No related error log.

Actual Behavior

There is no error log when using local debug mode, but when I built the operator image and deployed it properly, I received many error logs like:

E1127 07:26:09.322598       1 reflector.go:134] github.com/operator-framework/operator-sdk/pkg/kube-metrics/collector.go:67: Failed to list *unstructured.Unstructured: the server could not find the requested resource
E1127 07:26:09.324121       1 reflector.go:134] github.com/operator-framework/operator-sdk/pkg/kube-metrics/collector.go:67: Failed to list *unstructured.Unstructured: the server could not find the requested resource
E1127 07:26:10.328979       1 reflector.go:134] github.com/operator-framework/operator-sdk/pkg/kube-metrics/collector.go:67: Failed to list *unstructured.Unstructured: the server could not find the requested resource

Steps to Reproduce the Problem

  1. Build and push the image to Docker Hub
  2. Modify the operator yml to use this image
  3. kubectl apply and check the log: `kubectl logs -n tekton-operator <operator-pod>`

Additional Info

Separate API and Reconciler for Tektoncd Projects (Components)

Why

Using the same API (tektonaddons) and its reconciler for all Tekton components (triggers, dashboard, etc.) makes it difficult to run and automate upgrade tests. It is difficult to add component-specific upgrade logic to the reconciler each time we add a newer version of a component.

In addition, different components might support different install modes and/or watch modes. E.g., the dashboard supports features like TENANT_NAMESPACE in its install script.

Current Situation

The operator provides 2 APIs:

  • tektonpipelines.operator.tekton.dev
  • tektonaddon.operator.tekton.dev

There are 2 separate reconcilers, 1 for tektonpipeline and 1 for tektonaddon. The tektonaddon reconciler handles installation, update and deletion of all the other Tekton components.

Proposed Solution

Add new APIs

  • tektontriggers.operator.tekton.dev
  • tektondashboard.operator.tekton.dev
  • tektonexperimental.operator.tekton.dev (will function like the current tektonaddon API)

Add separate controllers for the additional APIs to handle component-specific operational logic and customizations. A rough sketch of one such API is shown below.
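
For illustration, a rough sketch of what one of the proposed APIs could look like as a Go type; the field names are placeholders rather than a finalized spec.

package v1alpha1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	duckv1 "knative.dev/pkg/apis/duck/v1"
)

// TektonTrigger sketches a dedicated API for the triggers component,
// mirroring TektonPipeline, so triggers get their own spec, status and
// reconciler.
type TektonTrigger struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   TektonTriggerSpec   `json:"spec,omitempty"`
	Status TektonTriggerStatus `json:"status,omitempty"`
}

type TektonTriggerSpec struct {
	// TargetNamespace is where the triggers payload is installed.
	TargetNamespace string `json:"targetNamespace,omitempty"`
}

type TektonTriggerStatus struct {
	duckv1.Status `json:",inline"`
}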

Additional Info

ref: knative/operator

The Knative Operator handles Knative Serving and Knative Eventing using 2 separate sets of APIs and reconcilers.
However, the 2 APIs and reconcilers do reuse code for common API specs and reconcile logic.

Pipeline deployment deletion does not kick off the reconcile function, leading to no deployment recreation

Expected Behavior

Deleting a pipeline deployment should kick off the reconcile function, so that the pipeline deployment can be recreated.

Actual Behavior

The deployment is not recreated, even though the Deployment kind is registered with the watch:


	// Watch Deployments and enqueue a reconcile request for the owning Config CR
	// whenever an owned Deployment changes (including deletion).
	err = c.Watch(
		&source.Kind{Type: &appsv1.Deployment{}},
		&handler.EnqueueRequestForOwner{
			IsController: true,
			OwnerType:    &op.Config{},
		})
	if err != nil {
		return err
	}

Steps to Reproduce the Problem

  1. Install the operator, then remove either deployment of the pipeline.
  2. The pipeline deployment is not revived.
  3. The reconcile function is not called.

Additional Info
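
One likely cause (an assumption, not verified): the pipeline deployments are applied from the release manifest without an owner reference pointing at the Config CR, so EnqueueRequestForOwner never matches their delete events. Below is a sketch of creating a deployment with the controller reference set, using controllerutil from sigs.k8s.io/controller-runtime; config, deployment, r.client and r.scheme are placeholders for the CR instance, the Deployment being created, and the reconciler's client and scheme.

	// The watch above only fires for Deployments owned by the Config CR, so the
	// reconciler has to set a controller reference before creating them.
	if err := controllerutil.SetControllerReference(config, deployment, r.scheme); err != nil {
		return err
	}
	if err := r.client.Create(context.TODO(), deployment); err != nil {
		return err
	}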

TektonPipeline Resource status.conditions always appended

Expected Behavior

I expect only relevant conditions to be appended to the list

Actual Behavior

There are multiple copies of the same status appended:

...
status:
  conditions:
  - code: installed
    version: v0.15.1
  - code: installing
    version: v0.15.1
  - code: installed
    version: v0.15.1
  - code: installing
    version: v0.15.1
  - code: installed
    version: v0.15.1
  - code: installing
    version: v0.15.1
  - code: installed
    version: v0.15.1
  - code: installing
    version: v0.15.1
  - code: installed
    version: v0.15.1
  - code: installing
    version: v0.15.1
  - code: installed
    version: v0.15.1
  - code: installing
    version: v0.15.1
  - code: installed
    version: v0.15.1
  - code: installing
    version: v0.15.1
  - code: installed
    version: v0.15.1
  - code: installing
    version: v0.15.1

Steps to Reproduce the Problem

  1. Just install the operator
  2. Check the resource

Additional Info

There is no check before append at pipeline_controller.go#L325
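
A sketch of the kind of guard that could go in front of that append; the condition type and field names here are inferred from the status shown above rather than taken from the code:

// Dedupe before appending so the same code/version pair is not added to
// status.conditions on every reconcile.
type condition struct {
	Code    string
	Version string
}

func setCondition(conds []condition, newCond condition) []condition {
	for i, c := range conds {
		if c.Code == newCond.Code && c.Version == newCond.Version {
			conds[i] = newCond // already recorded, just refresh it
			return conds
		}
	}
	return append(conds, newCond)
}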

`err := r.client.Get(context.TODO(), req.NamespacedName, res)` will not get the config `cluster` CR, since it is cluster scoped

I am not sure if this is a bug in the controller-runtime client or a race condition in the operator.

The config get call at
https://github.com/tektoncd/operator/blob/master/pkg/controller/config/config_controller.go#L166
returns a NOT FOUND error even though the CR was confirmed to be created.

We need to remove ignoreRequest, otherwise we cannot take any further action when the pipeline deployment is removed, as in #30.

We need to use err := r.client.Get(context.TODO(), types.NamespacedName{Name: req.Name}, res) to get the config cluster CR, as sketched below.
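
In context, the change would look roughly like this (a sketch; res, r and req are the reconciler variables already used around config_controller.go#L166, types is k8s.io/apimachinery/pkg/types, and errors is k8s.io/apimachinery/pkg/api/errors):

// Before: Get with the full request key, which was observed returning
// NOT FOUND for the cluster-scoped Config CR.
// err := r.client.Get(context.TODO(), req.NamespacedName, res)

// After: build the key from the name only, since the CR is cluster scoped.
err := r.client.Get(context.TODO(), types.NamespacedName{Name: req.Name}, res)
if err != nil && errors.IsNotFound(err) {
	// do not ignore the request here; requeue so the pipeline
	// deployment can still be revived (see #30)
}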

Tekton Dashboard - About - Could not find PipelineVersion

Expected Behavior

Pipeline Version available

Actual Behavior

Error getting date: Could not find PipelineVersion

Steps to Reproduce the Problem

  1. OKD 4.5.0
  2. Install Operator (latest)
  3. cluster-admin role for operator
  4. Install Dashboard (openshift-v0.8.2)
  5. Create Route for Dashboard
  6. Authenticate
  7. Browse to: Tekton Dashboard - About

Unable to deploy dashboard due to RBAC issue

Expected Behavior

The dashboard is deployed via kubectl apply -f deploy/crds/operator_v1alpha1_addon_dashboard_cr.yaml

Actual Behavior

The deployment failed.

Steps to Reproduce the Problem

  1. Build the operator
  2. Update the image in operator.yaml
  3. Install Tekton Pipelines (succeeds)
  4. Install the dashboard via kubectl apply -f deploy/crds/operator_v1alpha1_addon_dashboard_cr.yaml

Additional Info

Check the tekton-operator pod log and find the following:

{"level":"error","ts":1582611068.88362,"logger":"ctrl.addon.addon install","msg":"failed to apply release.yaml","Request.Namespace":"","Request.NamespaceName":"/dashboard","Request.Name":"dashboard","error":"clusterroles.rbac.authorization.k8s.io \"tekton-dashboard-minimal\" is forbidden: user \"system:serviceaccount:tekton-operator:tekton-operator\" (groups=[\"system:serviceaccounts\" \"system:serviceaccounts:tekton-operator\" \"system:authenticated\"]) is attempting to grant RBAC permissions not currently held:\n{APIGroups:[\"apps\"], Resources:[\"ingresses\"], Verbs:[\"get\" \"list\" \"watch\"]}\n{APIGroups:[\"extensions\"], Resources:[\"ingresses\"], Verbs:[\"get\" \"list\" \"watch\"]}","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\toperator/vendor/github.com/go-logr/zapr/zapr.go:128\ngithub.com/tektoncd/operator/pkg/controller/addon.(*ReconcileAddon).reconcileAddon\n\toperator/pkg/controller/addon/addon_controller.go:212\ngithub.com/tektoncd/operator/pkg/controller/addon.(*ReconcileAddon).Reconcile\n\toperator/pkg/controller/addon/addon_controller.go:126\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\toperator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:215\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1\n\toperator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:158\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\toperator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\toperator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134\nk8s.io/apimachinery/pkg/util/wait.Until\n\toperator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88"}

Solution

Grant the tekton-operator role access to the ingresses resource.

Dashboard versions mentioned for operator repository are old - check we can update

https://github.com/tektoncd/operator currently mentions

The current supported components and versions are:

  • dashboard: v0.1.1, v0.2.0, openshift-v0.2.0
  • extensionwebhooks: v0.2.0, openshift-v0.2.0
  • trigger: v0.1.0, v0.2.1

These versions are very much last year's, and it'd be useful to see what the latest version of the dashboard is that we can fit in here.

We know from our own testing using the OpenShift operator that Triggers 0.4.1, Pipelines 0.11.3 and Dashboard 0.6.1 is a good combination, so advocating that would be ideal.

Upgrade to use kubernetes api v1

Expected Behavior

No warnings when installing tekton

Actual Behavior

With recent kubectl, warnings are generated when installing tekton

Steps to Reproduce the Problem

$ kubectl apply -f https://storage.googleapis.com/tekton-releases/operator/latest/release.notags.yaml
namespace/tekton-operator created
Warning: apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition

Additional Info

  • Kubernetes version:

    Output of kubectl version:

Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.3", GitCommit:"1e11e4a2108024935ecfcb2912226cedeafd99df", GitTreeState:"clean", BuildDate:"2020-10-14T18:49:28Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.3", GitCommit:"1e11e4a2108024935ecfcb2912226cedeafd99df", GitTreeState:"clean", BuildDate:"2020-10-14T12:41:49Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"linux/amd64"}

Move to single version Payloads

Expected Behavior

Each branch HEAD/release represents a particular version of Pipelines, Triggers and Dashboard.

Actual Behavior

The pipeline version is fixed through the tektonVersion variable in the controller (tektonVersion = "v0.12.0"). However, we preserve release manifests of older pipeline versions here: https://github.com/tektoncd/operator/tree/master/deploy/resources

Addon versions can be specified through the Addons CRD.

Why would single version Payloads be better?

  1. Simpler to manage patches/payload components: new patches for new releases just involve adding release manifests at the correct path. This becomes difficult to manage as we have more and more items in the operator.
    ref: https://github.com/openshift/tektoncd-pipeline-operator/tree/master/deploy/resources/v0.11.3 (this repo will be archived soon)

  2. Easier to version operators: having a single version of each payload component makes it easier to version the operator (for example: payload version + operator version).

  3. Simplified testing: having multiple payload versions makes it difficult to resolve payload Go package dependencies (pipelines, triggers Go clients/APIs, ...).

Additional Info

Recommended Operator Best Practices
https://github.com/operator-framework/community-operators/blob/master/docs/best-practices.md
