operator's People

Contributors

afrittoli, akihikokuroda, alex-souslik-hs, barthy1, concaf, davidumea, dependabot[bot], ibotty, jacksgt, jimmyjones2, jkandasa, kabhiibm, kahirokunn, karthikjeeyar, khrm, nikhil-thomas, piyush-garg, pradeepitm12, pratap0007, puneetpunamiya, rupalibehera, sabre1041, savitaashture, sugardon, timmyers, vdemeester, vinamra28, vincent-pli, yselkowitz, zroubalik

operator's Issues

Discussion: pick a CRD name that makes more sense than "Config"

When I install tektoncd/operator on my private Kubernetes cluster, a cluster-scoped resource named Config feels uncomfortable: it confuses people, who assume the resource is a global configuration for the whole Kubernetes cluster.

So I think we could change the name. I checked knative/serving-operator, which names its resource KnativeServing, so could we choose Tekton or TektonPipeline?

Publish Operator on OperatorHub.io

Once we have our first release, we will want to publish the upstream operator on operatorhub.io. This issue is to track that.

Identify the scenarios for integration tests, and implement them

Expected Behavior

We need to identify the major use cases for integration tests to verify. For example: which CRs do we need to create? Which related resources should be created? Can any resources restore themselves? Are all the resources removed when the CR is gone?

Setting this up should be a crucial step to tackle ASAP, since we cannot rely on manual verification for each PR. A test skeleton for these scenarios is sketched under Additional Info below.

Actual Behavior

Steps to Reproduce the Problem

Additional Info
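
As a starting point, the scenarios above could be captured as a Go e2e test skeleton along the following lines. This is only a sketch: the helpers referenced in the comments (createCR, waitForDeployment, deleteCR, ...) do not exist in the repository and stand in for whatever client plumbing the test framework provides.

package e2e

import "testing"

// Sketch of the lifecycle scenarios described above; the commented helper
// calls are placeholders, not existing code.
func TestTektonPipelineLifecycle(t *testing.T) {
	t.Run("creating the CR installs the pipeline deployments", func(t *testing.T) {
		// createCR(t, configCRManifest)
		// waitForDeployment(t, "tekton-pipelines", "tekton-pipelines-controller")
		// waitForDeployment(t, "tekton-pipelines", "tekton-pipelines-webhook")
	})
	t.Run("deleted deployments are recreated by the reconciler", func(t *testing.T) {
		// deleteDeployment(t, "tekton-pipelines", "tekton-pipelines-controller")
		// waitForDeployment(t, "tekton-pipelines", "tekton-pipelines-controller")
	})
	t.Run("deleting the CR removes all managed resources", func(t *testing.T) {
		// deleteCR(t, "cluster")
		// waitForNamespaceGone(t, "tekton-pipelines")
	})
}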

Issue with `deployOperator` in the test case with operator-sdk

I found an issue with the operator-sdk we are using for the Tekton operator while working on PR #24.

The reconcile function can be kicked off every time there is a change to the CR or a deployment, if we launch the operator locally with the deployOperator function in the test case at https://github.com/tektoncd/operator/blob/master/test/testgroups/cluster_crd.go#L26.

If we check the status of the deployments and revive the pipeline deployments, the reconcile function may need to be kicked off multiple times to reach the expected status (CR created, reconcile triggered; deployment 1 created, reconcile triggered; deployment 2 created, reconcile triggered; deployment 1 deleted, reconcile triggered; deployment 2 deleted, reconcile triggered; ...).

However, with the operator launched LOCALLY via the deployOperator function in the test case, the reconcile function sometimes MISSES a call. If we build an image and run the operator with kubectl apply, the issue does not occur.

[`main` branch] TektonPipeline reconciler doesn't update the TektonPipeline instance status correctly

Expected Behavior

When the knative/pkg based implementation of the operator installs TektonPipeline, the status of the TektonPipeline CRD instance should be updated correctly.

Actual Behavior

The READY status of the CR is not updated correctly:

...
status:
  conditions:
...
  - lastTransitionTime: "2020-09-22T16:54:57Z"
    status: Unknown
    type: Ready
...

and

$ kubectl get tektonpipelines.operator.tekton.dev cluster
NAME      VERSION   READY     REASON
cluster   0.15.2    Unknown

Steps to Reproduce the Problem

  1. clone tektoncd/operator main branch
  2. ko apply -f config
  3. kubectl apply -f config/crds/operator_v1alpha1_pipeline_cr.yaml
  4. kubectl get tektonpipelines.operator.tekton.dev cluster
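
For illustration, here is a minimal sketch of how a knative/pkg based status could own a Ready condition that the reconciler marks once installation finishes. The status shape and the MarkInstallSucceeded name are assumptions made for this sketch, not the operator's actual API; the point is that the Ready condition has to be marked and the status update persisted, otherwise it stays Unknown as shown above.

package v1alpha1

import (
	"knative.dev/pkg/apis"
	duckv1 "knative.dev/pkg/apis/duck/v1"
)

// TektonPipelineStatus embeds the knative duck Status so it can carry
// standard conditions (illustrative shape).
type TektonPipelineStatus struct {
	duckv1.Status `json:",inline"`
}

// Ready is the "happy" condition managed by this set.
var pipelineCondSet = apis.NewLivingConditionSet()

func (s *TektonPipelineStatus) InitializeConditions() {
	pipelineCondSet.Manage(s).InitializeConditions()
}

// MarkInstallSucceeded (illustrative name) flips Ready to True; the reconciler
// still has to write the updated status back to the API server.
func (s *TektonPipelineStatus) MarkInstallSucceeded() {
	pipelineCondSet.Manage(s).MarkTrue(apis.ConditionReady)
}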

Refactor the backbone of operator from controller-runtime based to knative/pkg based

Expected Behavior

As discussed at length in issue #1, the benefits of a consistent architecture across all Tekton projects will pay off in the future, by making it easier to shift our efforts from one Tekton project to another.

A similar project, serving-operator, is undertaking the same transition now. I can move forward with this transition if there is no strong objection. @vdemeester @nikhil-thomas @vincent-pli @hrishin @bobcatfish @sthaha

Actual Behavior

We use a controller-runtime based structure generated by operator-sdk.

Steps to Reproduce the Problem

  1. Check the structure of the source code.

Additional Info

The tekton-pipelines namespace is not deleted after deleting the config CR

Expected Behavior

The tekton-pipelines namespace is deleted after deleting the config CR

Actual Behavior

The tekton-pipelines namespace is not deleted after deleting the config CR

Steps to Reproduce the Problem

  1. Start the Tekton operator
  2. Apply the config CR
  3. Delete the config CR

Additional Info

All resources in the tekton-pipelines namespace are deleted, but the namespace itself is still there. I think it should also be deleted.

export GO111MODULE=on makes `ko apply` extremely slow

To let ko apply build the image and run the operator, we have to get rid of export GO111MODULE=on in the scripts, since it severely slows down the image build and operator deployment, and can even make the pod fail to launch before the timeout.

README says 0.5.0 is installed?

Expected Behavior

Installing the operator installs a recent Tekton (based on 1ffa651, that's 0.7.0?), and that version is documented up front.

...or the specific version isn't documented up front, and the documentation just says "a recent Tekton will be installed".

Actual Behavior

README.md says:

The Operator will automatic install Tekton pipeline with v0.5.2 in the namespace tekton-pipeline

Add ko support to build and publish the image, and even install the operator

Expected Behavior

When we run ko apply -f config, the operator should be installed from the source code.

ko should be the primary command to build, publish, and install Tekton projects.

Actual Behavior

We do not support this yet; we rely on operator-sdk to do everything.

Steps to Reproduce the Problem

  1. Not supported yet; there is no command available to run.

Additional Info

Tekton operator security improvements

Expected Behavior

  1. The operator creates the tekton-pipelines-controller and tekton-pipelines-webhook deployments, but it should be possible to define a custom registry for the images (to use in air-gapped environments)
  2. The securityContext for the deployments should be configurable; as of now it is {}, which means root, privileged, etc. (a sketch of one possible approach is under Additional Info below)

Actual Behavior

Steps to Reproduce the Problem

  1. Install the standard Tekton operator using the currently defined config

Additional Info
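
For illustration, here is a rough sketch of how both requests could be met with a manifest transformer, assuming the operator applies its payload through manifestival (mf). The function name, the customRegistry parameter and the exact fields set are assumptions for this sketch, not existing operator configuration.

package transform

import (
	"path"

	mf "github.com/manifestival/manifestival"
	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime"
)

// SecurityTransformer rewrites Deployment images to point at a custom registry
// and sets a non-root securityContext on every container.
func SecurityTransformer(customRegistry string) mf.Transformer {
	return func(u *unstructured.Unstructured) error {
		if u.GetKind() != "Deployment" {
			return nil // leave non-Deployment resources untouched
		}
		d := &appsv1.Deployment{}
		if err := runtime.DefaultUnstructuredConverter.FromUnstructured(u.Object, d); err != nil {
			return err
		}
		runAsNonRoot := true
		for i := range d.Spec.Template.Spec.Containers {
			c := &d.Spec.Template.Spec.Containers[i]
			c.Image = customRegistry + "/" + path.Base(c.Image) // keep image name and tag, swap registry
			c.SecurityContext = &corev1.SecurityContext{RunAsNonRoot: &runAsNonRoot}
		}
		out, err := runtime.DefaultUnstructuredConverter.ToUnstructured(d)
		if err != nil {
			return err
		}
		u.SetUnstructuredContent(out)
		return nil
	}
}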

Is zz_generated.openapi.go still needed?

Expected Behavior

All files should be used.

Actual Behavior

I did not see anything that necessarily uses this file. If nothing does, we should remove it.

Steps to Reproduce the Problem

None

Additional Info

Failed to install after being removed

Expected Behavior

Be able to remove all traces of the Tekton operator and then reinstall it.

Actual Behavior

The Tekton operator goes into an error loop:

2020-09-03T05:15:34.708842160Z {"level":"info","ts":1599110134.7087033,"logger":"cmd","msg":"Go Version: go1.13"}
2020-09-03T05:15:34.708891309Z {"level":"info","ts":1599110134.7087312,"logger":"cmd","msg":"Go OS/Arch: linux/amd64"}
2020-09-03T05:15:34.708898679Z {"level":"info","ts":1599110134.7087388,"logger":"cmd","msg":"Version of operator-sdk: v0.17.0"}
2020-09-03T05:15:34.710152870Z {"level":"info","ts":1599110134.7100368,"logger":"leader","msg":"Trying to become the leader."}
2020-09-03T05:15:35.632809869Z {"level":"info","ts":1599110135.6326385,"logger":"leader","msg":"No pre-existing lock was found."}
2020-09-03T05:15:35.640752435Z {"level":"info","ts":1599110135.640669,"logger":"leader","msg":"Became the leader."}
2020-09-03T05:15:36.545345206Z {"level":"info","ts":1599110136.5452123,"logger":"controller-runtime.metrics","msg":"metrics server is starting to listen","addr":"0.0.0.0:8383"}
2020-09-03T05:15:36.545719820Z {"level":"info","ts":1599110136.5456192,"logger":"cmd","msg":"Registering Components."}
2020-09-03T05:15:36.559811421Z {"level":"info","ts":1599110136.5597262,"logger":"ctrl.tektonpipeline.add","msg":"Watching operator tektonpipeline CR"}
2020-09-03T05:15:36.559869170Z {"level":"info","ts":1599110136.5597873,"logger":"ctrl.tektonpipeline.create-cr","msg":"creating a clusterwide resource of tektonpipeline crd","name":"cluster"}
2020-09-03T05:15:39.317699016Z {"level":"info","ts":1599110139.317469,"logger":"metrics","msg":"Metrics Service object updated","Service.Name":"tekton-operator-metrics","Service.Namespace":"tekton-operator"}
2020-09-03T05:15:40.219461311Z {"level":"info","ts":1599110140.2192578,"logger":"cmd","msg":"Could not create ServiceMonitor object","error":"no ServiceMonitor registered with the API"}
2020-09-03T05:15:40.219492871Z {"level":"info","ts":1599110140.219319,"logger":"cmd","msg":"Install prometheus-operator in your cluster to create ServiceMonitor objects","error":"no ServiceMonitor registered with the API"}
2020-09-03T05:15:40.219500831Z {"level":"info","ts":1599110140.2193341,"logger":"cmd","msg":"Starting the Cmd."}
2020-09-03T05:15:40.219718438Z {"level":"info","ts":1599110140.219566,"logger":"controller-runtime.manager","msg":"starting metrics server","path":"/metrics"}
2020-09-03T05:15:40.219801556Z {"level":"info","ts":1599110140.2196357,"logger":"controller-runtime.controller","msg":"Starting EventSource","controller":"tektonaddon-controller","source":"kind source: /, Kind="}
2020-09-03T05:15:40.219868685Z {"level":"info","ts":1599110140.2197213,"logger":"controller-runtime.controller","msg":"Starting EventSource","controller":"tektonpipeline-controller","source":"kind source: /, Kind="}
2020-09-03T05:15:40.320204542Z {"level":"info","ts":1599110140.3200305,"logger":"controller-runtime.controller","msg":"Starting EventSource","controller":"tektonaddon-controller","source":"kind source: /, Kind="}
2020-09-03T05:15:40.320406909Z {"level":"info","ts":1599110140.3202844,"logger":"controller-runtime.controller","msg":"Starting EventSource","controller":"tektonpipeline-controller","source":"kind source: /, Kind="}
2020-09-03T05:15:40.420721977Z {"level":"info","ts":1599110140.420439,"logger":"controller-runtime.controller","msg":"Starting Controller","controller":"tektonaddon-controller"}
2020-09-03T05:15:40.420751816Z {"level":"info","ts":1599110140.4205706,"logger":"controller-runtime.controller","msg":"Starting Controller","controller":"tektonpipeline-controller"}
2020-09-03T05:15:40.420760486Z {"level":"info","ts":1599110140.4206328,"logger":"controller-runtime.controller","msg":"Starting workers","controller":"tektonpipeline-controller","worker count":1}
2020-09-03T05:15:40.420798455Z {"level":"info","ts":1599110140.4206774,"logger":"controller-runtime.controller","msg":"Starting workers","controller":"tektonaddon-controller","worker count":1}
2020-09-03T05:15:40.420872125Z {"level":"info","ts":1599110140.4208064,"logger":"ctrl.tektonpipeline.reconcile","msg":"reconciling tektonpipeline change","Request.Namespace":"","Request.NamespaceName":"/cluster","Request.Name":"cluster"}
2020-09-03T05:15:40.420883814Z {"level":"info","ts":1599110140.4208431,"logger":"ctrl.tektonpipeline.reconcile","msg":"installing pipelines","Request.Namespace":"","Request.NamespaceName":"/cluster","Request.Name":"cluster","path":"/var/run/ko/resources/pipelines/v0.15.2"}
2020-09-03T05:15:40.583869676Z time="2020-09-03T05:15:40Z" level=error msg="Operation cannot be fulfilled on tektonpipelines.operator.tekton.dev \"cluster\": the object has been modified; please apply your changes to the latest version and try againstatus update failed" source="pipeline_controller.go:328"
2020-09-03T05:15:40.583977214Z {"level":"error","ts":1599110140.583758,"logger":"controller-runtime.controller","msg":"Reconciler error","controller":"tektonpipeline-controller","request":"/cluster","error":"failed to apply non deployment manifest: Internal error occurred: failed calling webhook \"config.webhook.pipeline.tekton.dev\": Post \"https://tekton-pipelines-webhook.tekton-pipelines.svc:443/config-validation?timeout=30s\": service \"tekton-pipelines-webhook\" not found","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\tgithub.com/go-logr/[email protected]/zapr.go:128\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\tsigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:258\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\tsigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:232\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\tsigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:211\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\tk8s.io/[email protected]/pkg/util/wait/wait.go:152\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\tk8s.io/[email protected]/pkg/util/wait/wait.go:153\nk8s.io/apimachinery/pkg/util/wait.Until\n\tk8s.io/[email protected]/pkg/util/wait/wait.go:88"}
2020-09-03T05:15:41.584217446Z {"level":"info","ts":1599110141.5840812,"logger":"ctrl.tektonpipeline.reconcile","msg":"reconciling tektonpipeline change","Request.Namespace":"","Request.NamespaceName":"/cluster","Request.Name":"cluster"}
2020-09-03T05:15:41.584248705Z {"level":"info","ts":1599110141.5841324,"logger":"ctrl.tektonpipeline.reconcile","msg":"installing pipelines","Request.Namespace":"","Request.NamespaceName":"/cluster","Request.Name":"cluster","path":"/var/run/ko/resources/pipelines/v0.15.2"}
2020-09-03T05:15:41.746497008Z {"level":"error","ts":1599110141.7462823,"logger":"controller-runtime.controller","msg":"Reconciler error","controller":"tektonpipeline-controller","request":"/cluster","error":"failed to apply non deployment manifest: Internal error occurred: failed calling webhook \"config.webhook.pipeline.tekton.dev\": Post \"https://tekton-pipelines-webhook.tekton-pipelines.svc:443/config-validation?timeout=30s\": service \"tekton-pipelines-webhook\" not found","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\tgithub.com/go-logr/[email protected]/zapr.go:128\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\tsigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:258\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\tsigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:232\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\tsigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:211\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\tk8s.io/[email protected]/pkg/util/wait/wait.go:152\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\tk8s.io/[email protected]/pkg/util/wait/wait.go:153\nk8s.io/apimachinery/pkg/util/wait.Until\n\tk8s.io/[email protected]/pkg/util/wait/wait.go:88"}

Steps to Reproduce the Problem

  1. kubectl apply -f https://storage.googleapis.com/tekton-releases/operator/latest/release.notags.yaml
  2. Delete any resources and traces of tekton
  3. kubectl apply -f https://storage.googleapis.com/tekton-releases/operator/latest/release.notags.yaml

Handle changes of labels in pipeline, triggers and dashboard deployments

Expected Behavior

The pipeline, triggers and dashboard deployments can be updated to the latest release

Actual Behavior

The master versions of the three projects failed to update in the robocat cluster because of new labels that were added to the deployments, which cause errors when updating the existing deployments:

[deploy-tekton-project] Resource: "apps/v1, Resource=deployments", GroupVersionKind: "apps/v1, Kind=Deployment"
[deploy-tekton-project] Name: "tekton-pipelines-webhook", Namespace: "tekton-pipelines"
[deploy-tekton-project] for: "pipeline/overlays/robocat": Deployment.apps "tekton-pipelines-webhook" is invalid: spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{"app.kubernetes.io/component":"webhook", "app.kubernetes.io/instance":"default", "app.kubernetes.io/name":"webhook", "app.kubernetes.io/part-of":"tekton-pipelines"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable

This is a heads-up, as the operator may have to deal with this to ensure seamless upgrades. One possible approach is sketched under Additional Info below.

Steps to Reproduce the Problem

Additional Info

See tektoncd/plumbing#392 for more details
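
One way the operator could cope with the immutable selector is to fall back to deleting and recreating the Deployment when the in-place update is rejected. This is only a sketch, assuming deployments are applied through a controller-runtime client; it is not a description of current behavior.

package upgrade

import (
	"context"

	appsv1 "k8s.io/api/apps/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// recreateOnImmutableSelector tries an in-place update first and falls back to
// delete-and-recreate when the API server rejects the update (e.g. because
// spec.selector is immutable). The error check is deliberately coarse.
func recreateOnImmutableSelector(ctx context.Context, c client.Client, desired *appsv1.Deployment) error {
	err := c.Update(ctx, desired)
	if err == nil || !apierrors.IsInvalid(err) {
		return err
	}
	if err := c.Delete(ctx, desired); err != nil && !apierrors.IsNotFound(err) {
		return err
	}
	desired.ResourceVersion = "" // must be empty when creating a fresh object
	return c.Create(ctx, desired)
}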

Tekton Dashboard - Extensions Webhook - Error loading extension

Expected Behavior

Tekton Dashboard - Extensions Webhook - Page is available

Actual Behavior

Tekton Dashboard - Extensions Webhook
Page not available: "Error loading extension error loading dynamically imported module"

Steps to Reproduce the Problem

  1. OKD 4.5.0
  2. Install Operator (latest)
  3. cluster-admin role for operator
  4. Install Dashboard (openshift-v0.8.2)
  5. Install extensionwebhooks ( one of: v0.2.0, openshift-v0.2.0, v0.6.1, v0.2.1, openshift-v0.2.1)
  6. Create Route for Dashboard
  7. Authenticate
  8. Browse to: Tekton Dashboard - Extensions Webhook

Change tektonVersion to a command flag so that users can choose the Tekton version

Expected Behavior

Users can pass --version v0.7.0 to install a different Tekton version.

Actual Behavior

tektonVersion = "v0.8.0" version is hardcode in config controller

User cannot choose the version, just use the default v0.8.0 version now.

It is better to set it a parameter. then we can use it for the upgrade or future version maintain
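
A minimal sketch of what this could look like with the standard library flag package; the flag name and default are illustrative:

package main

import (
	"flag"
	"log"
)

// tektonVersion would replace the hardcoded constant in the config controller.
var tektonVersion = flag.String("version", "v0.8.0", "Tekton Pipelines version to install")

func main() {
	flag.Parse()
	log.Printf("installing Tekton Pipelines %s", *tektonVersion)
	// ... pass *tektonVersion down to the config controller instead of the constant
}

The operator could then be started with --version v0.7.0 to install a different payload.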

Add a short poll to CheckDeployment Stage

Ref

func CheckDeployments(ctx context.Context, manifest *mf.Manifest, instance v1alpha1.TektonComponent) error {

Current behavior

CheckDeployments fetches the deployments only once. Hence, the Ready status condition was not always updated correctly. This also made the e2e tests flaky.

So CheckDeployments was modified (#174) to return an error instead of nil when a deployment is not found, so that the reconciler is re-triggered.

However, this is not ideal: all the stages before CheckDeployments get applied again.

Expected behavior

CheckDeployments should retry a few times (with a short timeout) before returning an error, as sketched below.
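
A sketch of the suggested retry, assuming a client-go clientset is available to the stage; the interval and timeout values are illustrative:

package status

import (
	"context"
	"time"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/wait"
	"k8s.io/client-go/kubernetes"
)

// waitForDeploymentReady polls the deployment for a short while instead of
// failing the whole stage on the first NotFound or not-ready read.
func waitForDeploymentReady(kc kubernetes.Interface, ns, name string) error {
	return wait.PollImmediate(2*time.Second, 30*time.Second, func() (bool, error) {
		d, err := kc.AppsV1().Deployments(ns).Get(context.TODO(), name, metav1.GetOptions{})
		if apierrors.IsNotFound(err) {
			return false, nil // not created yet, keep polling
		}
		if err != nil {
			return false, err
		}
		return d.Spec.Replicas != nil && d.Status.ReadyReplicas == *d.Spec.Replicas, nil
	})
}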

Cannot designate PR to build nightly image

Expected Behavior

We should build the nightly image based on an existing commit in master. We should remove this line: https://github.com/tektoncd/operator/blob/master/test/presubmit-tests.sh#L57.

Actual Behavior

We build the nightly image on every pending PR.

Steps to Reproduce the Problem

  1. Open a PR
  2. The image gcr.io/tekton-nightly/tektoncd-operator is built with https://github.com/tektoncd/operator/blob/master/test/presubmit-tests.sh#L57.

Additional Info

Is the `no-auto-install` option of the CR still needed?

Expected Behavior

We use CRs to install Tekton Pipelines and all the other components; the CRs are the source of configuration.
The no-auto-install option lets us choose whether the pipeline CR is installed automatically or not. Is it still necessary for configuring the Tekton operator?

I suggest removing this option.
Users can install the operator, and then install the pipeline by manually applying the pipeline CR.

Actual Behavior

The pipeline CR can be installed automatically by setting no-auto-install to false.

Log of the pod that failed to run in the prow job for `ko apply -f config/`

I0827 16:28:54.492] ERROR: timeout waiting for pods to come up
I0827 16:28:54.492] tekton-operator-79f78d6994-t8w69   0/1   CrashLoopBackOff   5     5m36s
I0827 16:28:55.036] {"level":"info","ts":1566923178.1520412,"logger":"cmd","msg":"Go Version: go1.11.4"}
I0827 16:28:55.036] {"level":"info","ts":1566923178.1521137,"logger":"cmd","msg":"Go OS/Arch: linux/amd64"}
I0827 16:28:55.037] {"level":"info","ts":1566923178.152141,"logger":"cmd","msg":"Version of operator-sdk: v0.9.0"}
I0827 16:28:55.037] {"level":"info","ts":1566923178.152508,"logger":"leader","msg":"Trying to become the leader."}
I0827 16:28:55.038] {"level":"error","ts":1566923178.3067694,"logger":"k8sutil","msg":"Failed to get Pod","Pod.Namespace":"default","Pod.Name":"tekton-operator-79f78d6994-t8w69","error":"pods \"tekton-operator-79f78d6994-t8w69\" is forbidden: User \"system:serviceaccount:default:tekton-operator\" cannot get resource \"pods\" in API group \"\" in the namespace \"default\": RBAC: role.rbac.authorization.k8s.io \"tekton-operator\" not found","stacktrace":"github.com/tektoncd/operator/vendor/github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/src/github.com/tektoncd/operator/vendor/github.com/go-logr/zapr/zapr.go:128\ngithub.com/tektoncd/operator/vendor/github.com/operator-framework/operator-sdk/pkg/k8sutil.GetPod\n\t/go/src/github.com/tektoncd/operator/vendor/github.com/operator-framework/operator-sdk/pkg/k8sutil/k8sutil.go:107\ngithub.com/tektoncd/operator/vendor/github.com/operator-framework/operator-sdk/pkg/leader.myOwnerRef\n\t/go/src/github.com/tektoncd/operator/vendor/github.com/operator-framework/operator-sdk/pkg/leader/leader.go:133\ngithub.com/tektoncd/operator/vendor/github.com/operator-framework/operator-sdk/pkg/leader.Become\n\t/go/src/github.com/tektoncd/operator/vendor/github.com/operator-framework/operator-sdk/pkg/leader/leader.go:67\nmain.main\n\t/go/src/github.com/tektoncd/operator/cmd/manager/main.go:97\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:201"}
I0827 16:28:55.039] {"level":"error","ts":1566923178.3068914,"logger":"cmd","msg":"","error":"pods \"tekton-operator-79f78d6994-t8w69\" is forbidden: User \"system:serviceaccount:default:tekton-operator\" cannot get resource \"pods\" in API group \"\" in the namespace \"default\": RBAC: role.rbac.authorization.k8s.io \"tekton-operator\" not found","stacktrace":"github.com/tektoncd/operator/vendor/github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/src/github.com/tektoncd/operator/vendor/github.com/go-logr/zapr/zapr.go:128\nmain.main\n\t/go/src/github.com/tektoncd/operator/cmd/manager/main.go:99\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:201"}

Tekton ConfigMaps are overwritten by Operator

Expected Behavior

After installing the Operator, any modifications made to the Tekton ConfigMaps (config-defaults, feature-flags) should not be overwritten, even after operator restarts or upgrades.

Actual Behavior

If I restart the operator deployment container, the ConfigMaps are overwritten with default values, replacing anything I've defined. A possible mitigation is sketched after the reproduction steps below.

Steps to Reproduce the Problem

  • Install Tekton Operator
  • Wait for everything to be provisioned
  • Modify any tekton-pipelines/feature-flags config map entry
  • Scale-down tekton operator deployment to zero
  • Scale-up tekton operator deployment to one
  • Check the modified tekton-pipelines/feature-flags config map is overwritten
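
One possible mitigation, sketched here under the assumption that the payload is applied through manifestival (mf) and that a controller-runtime client is available, is to filter out ConfigMaps that already exist on the cluster before applying, so user edits survive operator restarts. This is not the current implementation, just an illustration.

package apply

import (
	"context"

	mf "github.com/manifestival/manifestival"
	corev1 "k8s.io/api/core/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// skipExistingConfigMaps keeps every resource except ConfigMaps that are
// already present on the cluster, so config-defaults, feature-flags, etc.
// are not reset to payload defaults on every reconcile.
func skipExistingConfigMaps(ctx context.Context, c client.Client) mf.Predicate {
	return func(u *unstructured.Unstructured) bool {
		if u.GetKind() != "ConfigMap" {
			return true
		}
		existing := &corev1.ConfigMap{}
		err := c.Get(ctx, client.ObjectKey{Namespace: u.GetNamespace(), Name: u.GetName()}, existing)
		return apierrors.IsNotFound(err) // apply only if it does not exist yet
	}
}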

Build an official image for Operator

Expected Behavior

End users should not need to build the image themselves unless they have specific requirements.

Actual Behavior

Users must currently build the Operator image themselves.

Additional Info

As @nikhil-thomas commented here: #15 (comment)
I think we need to supply an official image for end users before we switch to ko (maybe).

Received many `Failed to list *unstructured.Unstructured` errors in the operator pod log

Expected Behavior

No related error log.

Actual Behavior

There is no error log when using local debug mode, but when I built the operator image and deployed it properly, I received many error logs like:

E1127 07:26:09.322598       1 reflector.go:134] github.com/operator-framework/operator-sdk/pkg/kube-metrics/collector.go:67: Failed to list *unstructured.Unstructured: the server could not find the requested resource
E1127 07:26:09.324121       1 reflector.go:134] github.com/operator-framework/operator-sdk/pkg/kube-metrics/collector.go:67: Failed to list *unstructured.Unstructured: the server could not find the requested resource
E1127 07:26:10.328979       1 reflector.go:134] github.com/operator-framework/operator-sdk/pkg/kube-metrics/collector.go:67: Failed to list *unstructured.Unstructured: the server could not find the requested resource

Steps to Reproduce the Problem

  1. Build and push the image to Docker Hub
  2. Modify the operator yml to use this image
  3. kubectl apply and check the log: `kubectl logs -n tekton-operator <operator-pod>`

Additional Info

Separate API and Reconciler for Tektoncd Projects (Components)

Why

Using the same API (tektonaddons) and its reconciler for all Tekton components (triggers, dashboard, etc.) makes it difficult to run and automate upgrade tests. It is difficult to add component-specific upgrade logic to the reconciler each time we add a newer version of a component.

In addition, different components might support different install modes and/or watch modes. E.g., the dashboard supports features like TENANT_NAMESPACE in its install script.

Current Situation

The operator provides 2 APIs:

  • tektonpipelines.operator.tekton.dev
  • tektonaddon.operator.tekton.dev

There are 2 separate reconcilers, 1 for tektonpipeline and 1 for tektonaddon. The tektonaddon reconciler handles installation, update and deletion of all the other Tekton components.

Proposed Solution

Add new APIs

  • tektontriggers.operator.tekton.dev
  • tektondashboard.operator.tekton.dev
  • tektonexperimental.operator.tekton.dev (will function like the current tektonaddon API)

Add separate controllers for the additional APIs to handle component-specific operational logic and customizations. A rough sketch of one such API is shown below.
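
For illustration, a rough sketch of what one of the proposed APIs could look like as a Go type; the field names are placeholders rather than a finalized spec.

package v1alpha1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	duckv1 "knative.dev/pkg/apis/duck/v1"
)

// TektonTrigger sketches a dedicated API for the triggers component,
// mirroring TektonPipeline, so triggers get their own spec, status and
// reconciler.
type TektonTrigger struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   TektonTriggerSpec   `json:"spec,omitempty"`
	Status TektonTriggerStatus `json:"status,omitempty"`
}

type TektonTriggerSpec struct {
	// TargetNamespace is where the triggers payload is installed.
	TargetNamespace string `json:"targetNamespace,omitempty"`
}

type TektonTriggerStatus struct {
	duckv1.Status `json:",inline"`
}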

Additional Info

ref: knative/operator

The Knative Operator handles Knative Serving and Knative Eventing using 2 separate sets of APIs and reconcilers.
However, the 2 APIs and reconcilers do reuse code for common API specs and reconcile logic.

Pipeline deployment deletion does not kick off the reconcile function, leading to no deployment recreation

Expected Behavior

Deleting a pipeline deployment should kick off the reconcile function, so that the pipeline deployment can be recreated.

Actual Behavior

The deployment is not recreated, even though the Deployment kind is registered with the watch:


	// Watch Deployments and enqueue a reconcile request for the owning Config CR
	// whenever an owned Deployment changes (including deletion).
	err = c.Watch(
		&source.Kind{Type: &appsv1.Deployment{}},
		&handler.EnqueueRequestForOwner{
			IsController: true,
			OwnerType:    &op.Config{},
		})
	if err != nil {
		return err
	}

Steps to Reproduce the Problem

  1. Install the operator, then remove either deployment of the pipeline.
  2. The pipeline deployment is not revived.
  3. The reconcile function is not called.

Additional Info
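
One likely cause (an assumption, not verified): the pipeline deployments are applied from the release manifest without an owner reference pointing at the Config CR, so EnqueueRequestForOwner never matches their delete events. Below is a sketch of creating a deployment with the controller reference set, using controllerutil from sigs.k8s.io/controller-runtime; config, deployment, r.client and r.scheme are placeholders for the CR instance, the Deployment being created, and the reconciler's client and scheme.

	// The watch above only fires for Deployments owned by the Config CR, so the
	// reconciler has to set a controller reference before creating them.
	if err := controllerutil.SetControllerReference(config, deployment, r.scheme); err != nil {
		return err
	}
	if err := r.client.Create(context.TODO(), deployment); err != nil {
		return err
	}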

TektonPipeline Resource status.conditions always appended

Expected Behavior

I expect only relevant conditions to be appended to the list

Actual Behavior

There are multiple copies of the same status appended:

...
status:
  conditions:
  - code: installed
    version: v0.15.1
  - code: installing
    version: v0.15.1
  - code: installed
    version: v0.15.1
  - code: installing
    version: v0.15.1
  - code: installed
    version: v0.15.1
  - code: installing
    version: v0.15.1
  - code: installed
    version: v0.15.1
  - code: installing
    version: v0.15.1
  - code: installed
    version: v0.15.1
  - code: installing
    version: v0.15.1
  - code: installed
    version: v0.15.1
  - code: installing
    version: v0.15.1
  - code: installed
    version: v0.15.1
  - code: installing
    version: v0.15.1
  - code: installed
    version: v0.15.1
  - code: installing
    version: v0.15.1

Steps to Reproduce the Problem

  1. Just install the operator
  2. Check the resource

Additional Info

There is no check before append at pipeline_controller.go#L325
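
A sketch of the kind of guard that could go in front of that append; the condition type and field names here are inferred from the status shown above rather than taken from the code:

// Dedupe before appending so the same code/version pair is not added to
// status.conditions on every reconcile.
type condition struct {
	Code    string
	Version string
}

func setCondition(conds []condition, newCond condition) []condition {
	for i, c := range conds {
		if c.Code == newCond.Code && c.Version == newCond.Version {
			conds[i] = newCond // already recorded, just refresh it
			return conds
		}
	}
	return append(conds, newCond)
}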

`err := r.client.Get(context.TODO(), req.NamespacedName, res)` will not get the config `cluster` CR, since it is cluster scoped

I am not sure if this is a bug in the controller-runtime client or a race condition in the operator.

The config get call at
https://github.com/tektoncd/operator/blob/master/pkg/controller/config/config_controller.go#L166
returns a NOT FOUND error even though the CR was confirmed to be created.

We need to remove ignoreRequest, otherwise we cannot take any further action when the pipeline deployment is removed, as in #30.

We need to use err := r.client.Get(context.TODO(), types.NamespacedName{Name: req.Name}, res) to get the config cluster CR, as sketched below.
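
In context, the change would look roughly like this (a sketch; res, r and req are the reconciler variables already used around config_controller.go#L166, types is k8s.io/apimachinery/pkg/types, and errors is k8s.io/apimachinery/pkg/api/errors):

// Before: Get with the full request key, which was observed returning
// NOT FOUND for the cluster-scoped Config CR.
// err := r.client.Get(context.TODO(), req.NamespacedName, res)

// After: build the key from the name only, since the CR is cluster scoped.
err := r.client.Get(context.TODO(), types.NamespacedName{Name: req.Name}, res)
if err != nil && errors.IsNotFound(err) {
	// do not ignore the request here; requeue so the pipeline
	// deployment can still be revived (see #30)
}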

Tekton Dashboard - About - Could not find PipelineVersion

Expected Behavior

Pipeline Version available

Actual Behavior

Error getting date: Could not find PipelineVersion

Steps to Reproduce the Problem

  1. OKD 4.5.0
  2. Install Operator (latest)
  3. cluster-admin role for operator
  4. Install Dashboard (openshift-v0.8.2)
  5. Create Route for Dashboard
  6. Authenticate
  7. Browse to: Tekton Dashboard - About

Unable to deploy dashboard due to RBAC issue

Expected Behavior

The dashboard is deployed via kubectl apply -f deploy/crds/operator_v1alpha1_addon_dashboard_cr.yaml

Actual Behavior

The deployment failed.

Steps to Reproduce the Problem

  1. Build the operator
  2. Update the image in operator.yaml
  3. Install Tekton Pipelines (succeeds)
  4. Install the dashboard via kubectl apply -f deploy/crds/operator_v1alpha1_addon_dashboard_cr.yaml

Additional Info

Check the tekton-operator pod log and find the following:

{"level":"error","ts":1582611068.88362,"logger":"ctrl.addon.addon install","msg":"failed to apply release.yaml","Request.Namespace":"","Request.NamespaceName":"/dashboard","Request.Name":"dashboard","error":"clusterroles.rbac.authorization.k8s.io \"tekton-dashboard-minimal\" is forbidden: user \"system:serviceaccount:tekton-operator:tekton-operator\" (groups=[\"system:serviceaccounts\" \"system:serviceaccounts:tekton-operator\" \"system:authenticated\"]) is attempting to grant RBAC permissions not currently held:\n{APIGroups:[\"apps\"], Resources:[\"ingresses\"], Verbs:[\"get\" \"list\" \"watch\"]}\n{APIGroups:[\"extensions\"], Resources:[\"ingresses\"], Verbs:[\"get\" \"list\" \"watch\"]}","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\toperator/vendor/github.com/go-logr/zapr/zapr.go:128\ngithub.com/tektoncd/operator/pkg/controller/addon.(*ReconcileAddon).reconcileAddon\n\toperator/pkg/controller/addon/addon_controller.go:212\ngithub.com/tektoncd/operator/pkg/controller/addon.(*ReconcileAddon).Reconcile\n\toperator/pkg/controller/addon/addon_controller.go:126\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\toperator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:215\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1\n\toperator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:158\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\toperator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\toperator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134\nk8s.io/apimachinery/pkg/util/wait.Until\n\toperator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88"}

Solution

Grant the tekton-operator role access to the ingresses resource.

Dashboard versions mentioned for operator repository are old - check we can update

https://github.com/tektoncd/operator currently mentions

The current supported components and versions are:

  • dashboard: v0.1.1, v0.2.0, openshift-v0.2.0
  • extensionwebhooks: v0.2.0, openshift-v0.2.0
  • trigger: v0.1.0, v0.2.1

These versions are very much last year's, and it'd be useful to see what the latest version of the dashboard is that we can fit in here.

We know from our own testing using the OpenShift operator that Triggers 0.4.1, Pipelines 0.11.3 and Dashboard 0.6.1 is a good combination, so advocating that would be ideal.

Upgrade to use kubernetes api v1

Expected Behavior

No warnings when installing tekton

Actual Behavior

With recent kubectl, warnings are generated when installing tekton

Steps to Reproduce the Problem

$ kubectl apply -f https://storage.googleapis.com/tekton-releases/operator/latest/release.notags.yaml
namespace/tekton-operator created
Warning: apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition

Additional Info

  • Kubernetes version:

    Output of kubectl version:

Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.3", GitCommit:"1e11e4a2108024935ecfcb2912226cedeafd99df", GitTreeState:"clean", BuildDate:"2020-10-14T18:49:28Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.3", GitCommit:"1e11e4a2108024935ecfcb2912226cedeafd99df", GitTreeState:"clean", BuildDate:"2020-10-14T12:41:49Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"linux/amd64"}

Move to single version Payloads

Expected Behavior

Each branch HEAD/release represents a particular version of Pipelines, Triggers and Dashboard.

Actual Behavior

The pipeline version is fixed through the tektonVersion variable in the controller (tektonVersion = "v0.12.0"). However, we preserve release manifests of older pipeline versions here: https://github.com/tektoncd/operator/tree/master/deploy/resources

Addon versions can be specified through the Addons CRD.

Why would single version Payloads be better?

  1. Simpler to manage patches/payload components: new patches for new releases just involve adding release manifests at the correct path. This becomes difficult to manage as we have more and more items in the operator.
    ref: https://github.com/openshift/tektoncd-pipeline-operator/tree/master/deploy/resources/v0.11.3 (this repo will be archived soon)

  2. Easier to version operators: having a single version of each payload component makes it easier to version the operator (for example: payload version + operator version).

  3. Simplified testing: having multiple payload versions makes it difficult to resolve payload Go package dependencies (pipelines, triggers Go clients/APIs, ...).

Additional Info

Recommended Operator Best Practices
https://github.com/operator-framework/community-operators/blob/master/docs/best-practices.md
