
varnish-operator's Introduction

Varnish Operator

CII Best Practices

Project status: alpha

The project is in development and breaking changes can be introduced.

The purpose of the project is to provide a convenient way to deploy and manage Varnish instances in Kubernetes.

Kubernetes version >=1.21.0 is supported.

Varnish version 6.5.1 is supported.

Full documentation can be found at https://ibm.github.io/varnish-operator/

Overview

Varnish operator manages Varnish clusters using a CustomResourceDefinition that defines a new Kind called VarnishCluster.

The operator manages the whole lifecycle of the cluster: creating, deleting and keeping the cluster configuration up to date. The operator is responsible for building the VCL configuration using templates defined by the users and keeping the configuration up to date when relevant events occur (backend pod failure, scaling of the deployment, VCL configuration change).
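For orientation, a minimal VarnishCluster manifest looks roughly like the sketch below (adapted from the deployment example later on this page; the namespace and the nginx-backend selector are illustrative):

apiVersion: caching.ibm.com/v1alpha1
kind: VarnishCluster
metadata:
  name: varnishcluster-example
  namespace: varnish-cluster
spec:
  vcl:
    configMapName: vcl-config          # ConfigMap holding the VCL files; created if it doesn't exist
    entrypointFileName: entrypoint.vcl # main file Varnish compiles
  backend:
    port: 80
    selector:
      app: nginx-backend               # labels identifying the backend pods
  service:
    port: 80                           # port the Varnish pods are exposed on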

Features

  • Basic install
  • Full lifecycle support (create/update/delete)
  • Automatic VCL configuration updates (using user defined templates)
  • Prometheus metrics support
  • Scaling
  • Configurable update strategy
  • Persistence (for file storage backend support)
  • Multiple Varnish versions support
  • Autoscaling

Further reading

varnish-operator's People

Contributors

altvnk, arthurzenika, arthurzinck, niteman, ros-kamach, thurstonsand, tomashibm, tszyszko, wxdave


varnish-operator's Issues

Migrate from helm v2 to helm v3

Trying out make e2e-test I get

[snip]
Error: unknown command "init" for "helm"

Did you mean this?
	lint

Run 'helm --help' for usage.

I had Helm v3 installed; switching to Helm v2 seems to get rid of this error.

Cert issue

Hello,
First, thanks for this repository; the documentation is really good.
I initialize the operator in one subchart and then launch another subchart that creates a VarnishCluster, but this message occurs when I install my chart with these two subcharts:

"Internal error occurred: failed calling webhook "mvarnishcluster.kb.io": could not get REST client: unable to load root certificates: unable to parse bytes as PEM block"

My VarnishCluster consists of the backend, varnish and service options.

What have I done wrong?

Thank you in advance for helping me.

Don't return from VCL discard failure

https://github.com/IBM/varnish-operator/blob/main/pkg/varnishcontroller/controller/controller_varnish.go#L63

{"level":"error","ts":1618966483.733197,"logger":"controller-runtime.manager.controller.varnish-controller","caller":"controller/controller.go:301","msg":"Reconciler error","varnish_controller_version":"undefined","reconciler group":"caching.ibm.com","reconciler kind":"VarnishCluster","name":"vc-sun-location-services-varnish-0","namespace":"sun-location-services","error":"Can't delete VCL config \"v-63207441-1618966417\": Command failed with error code 106\nNo VCL named v-63207441-1618966417 known.\n: exit status 1","errorVerbose":"exit status 1\nCommand failed with error code 106\nNo VCL named v-63207441-1618966417 known.\n\ngithub.com/ibm/varnish-operator/pkg/varnishcontroller/varnishadm.(*VarnishAdm).Discard\n\t/go/src/github.com/ibm/varnish-operator/pkg/varnishcontroller/varnishadm/varnishadm.go:145\ngithub.com/ibm/varnish-operator/pkg/varnishcontroller/controller.(*ReconcileVarnish).reconcileVarnish\n\t/go/src/github.com/ibm/varnish-operator/pkg/varnishcontroller/controller/controller_varnish.go:61\ngithub.com/ibm/varnish-operator/pkg/varnishcontroller/controller.(*ReconcileVarnish).reconcileWithContext\n\t/go/src/github.com/ibm/varnish-operator/pkg/varnishcontroller/controller/controller.go:192\ngithub.com/ibm/varnish-operator/pkg/varnishcontroller/controller.(*ReconcileVarnish).Reconcile\n\t/go/src/github.com/ibm/varnish-operator/pkg/varnishcontroller/controller/controller.go:109\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:297\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:252\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:215\nk8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:185\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:155\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:156\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:185\nk8s.io/apimachinery/pkg/util/wait.UntilWithContext\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:99\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1374\nCan't delete VCL config \"v-63207441-1618966417\"\ngithub.com/ibm/varnish-operator/pkg/varnishcontroller/controller.(*ReconcileVarnish).reconcileVarnish\n\t/go/src/github.com/ibm/varnish-operator/pkg/varnishcontroller/controller/controller_varnish.go:63\ngithub.com/ibm/varnish-operator/pkg/varnishcontroller/controller.(*ReconcileVarnish).reconcileWithContext\n\t/go/src/github.com/ibm/varnish-operator/pkg/varnishcontroller/controller/controller.go:192\ngithub.com/ibm/varnish-operator/pkg/varnishcontroller/controller.(*ReconcileVarnish).Reconcile\n\t/go/src/github.com/ibm/varnish-operator/pkg/varnishcontroller/controller/controller.go:109\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email 
protected]/pkg/internal/controller/controller.go:297\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:252\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:215\nk8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:185\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:155\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:156\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:185\nk8s.io/apimachinery/pkg/util/wait.UntilWithContext\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:99\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1374\ngithub.com/ibm/varnish-operator/pkg/varnishcontroller/controller.(*ReconcileVarnish).reconcileWithContext\n\t/go/src/github.com/ibm/varnish-operator/pkg/varnishcontroller/controller/controller.go:193\ngithub.com/ibm/varnish-operator/pkg/varnishcontroller/controller.(*ReconcileVarnish).Reconcile\n\t/go/src/github.com/ibm/varnish-operator/pkg/varnishcontroller/controller/controller.go:109\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:297\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:252\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:215\nk8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:185\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:155\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:156\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:185\nk8s.io/apimachinery/pkg/util/wait.UntilWithContext\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:99\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1374"}

I'm not sure why v-63207441-1618966417 is erroring out, as it's not showing up when I run vcl.list. Regardless, I think we should just log these errors and proceed through discarding the rest of the inactive profiles.

Look into naming of PR based Helm charts

Helm 3.5.2 added stricter semver matching to prevent exploits (I'm not sure how that works, but it's released and is not backward compatible). I've removed invalid charts from our internal Helm repo, but we need to verify the naming convention used here as well. Additionally, we need to clean up any published PR-based charts here too.

Here's what the helm repo update output looks like w/invalid packages.

index.go:339: skipping loading invalid entry for chart "varnish-operator" "0.21.0-125_start_up_varnish_without_a_vcl" from https://na.artifactory.swg-devops.com/artifactory/wcp-icm-helm-virtual: validation: chart.metadata.version "0.21.0-125_start_up_varnish_without_a_vcl" is invalid
index.go:339: skipping loading invalid entry for chart "varnish-operator" "0.21.0-193_split_pod_to_individual_containers" from https://na.artifactory.swg-devops.com/artifactory/wcp-icm-helm-virtual: validation: chart.metadata.version "0.21.0-193_split_pod_to_individual_containers" is invalid
index.go:339: skipping loading invalid entry for chart "varnish-operator" "0.21.0-194_redesigned_crd" from https://na.artifactory.swg-devops.com/artifactory/wcp-icm-helm-virtual: validation: chart.metadata.version "0.21.0-194_redesigned_crd" is invalid

More info can be found here: helm/helm#9342
Semver rules for pre-releases: https://semver.org/#spec-item-9

can't deploy varnishcluster as shown in documentation

Hello,
I don't know how to achieve the creation of a cluster, since I get the following error:

Steps to reproduce:

$ helm install --namespace varnish-operator varnish-operator varnish-operator/varnish-operator
$ kubectl create deployment nginx-backend --image nginx -n varnish-cluster --port=80
$ cat <<EOF | kubectl create -f -
apiVersion: caching.ibm.com/v1alpha1
kind: VarnishCluster
metadata:
  name: varnishcluster-example
  namespace: varnish-cluster # the namespace we've created above
spec:
  vcl:
    configMapName: vcl-config # name of the config map that will store your VCL files. Will be created if it doesn't exist.
    entrypointFileName: entrypoint.vcl # main file used by Varnish to compile the VCL code.
  backend:
    port: 80
    selector:
      app: nginx-backend # labels that identify your backend pods
  service:
    port: 80 # Varnish pods will be exposed on that port
EOF

Error from server (InternalError): error when creating "STDIN": Internal error occurred: failed calling webhook "mvarnishcluster.kb.io": failed to call webhook: Post "https://varnish-operator-service.varnish-operator.svc:443/mutate-caching-ibm-com-v1alpha1-varnishcluster?timeout=10s": Service Unavailable

Thank you for your help.
Yann

Omit non-ready backends

The PWS team is currently encountering some latency issues due to requests going to pods that are just starting up on scale up. Tim brought up two things that may help them in this situation.

  1. Omit the non-ready backends so they're not getting hit.
  2. Possibility of using a service instead of the backends list.
  • this would have the added benefit of being able to use the route-to-non-ready-pods flag (name escapes me offhand; see the sketch below).
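A minimal sketch of option 2, assuming the flag referred to is the standard Kubernetes spec.publishNotReadyAddresses field on a Service; the name and selector labels are illustrative:

apiVersion: v1
kind: Service
metadata:
  name: nginx-backend-svc        # illustrative name
spec:
  selector:
    app: nginx-backend           # illustrative backend labels
  ports:
  - port: 80
    targetPort: 80
  # A Service only routes to ready endpoints by default; leaving this false
  # keeps not-ready pods out of rotation, which is the behavior wanted here.
  publishNotReadyAddresses: false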

[Question] Configuring backend from Service (ExternalName) or with ext ip:port

Trying out varnish-operator, I see that the backends are generated dynamically from the pods. I would like to know if it is possible to specify an ExternalName Service as a backend, or a set of ip:port endpoints external to the cluster running Varnish.

Additionally, when I try to specify the pods using backend.selector.app, the varnish cluster pods in the StatefulSet have only the default/dummy backend, resulting in "HTTP/1.1 503 No backends configured". How can one troubleshoot the processing of the backend configuration? "logLevel: debug" does not seem to produce any additional logs for this case.

Any help/guidance is much appreciated.
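For reference, the kind of ExternalName Service being asked about is a plain Kubernetes object like the sketch below (the hostname is illustrative; whether the operator can consume it as a backend is exactly the open question):

apiVersion: v1
kind: Service
metadata:
  name: external-backend
  namespace: varnish-cluster          # illustrative namespace
spec:
  type: ExternalName
  externalName: backend.example.com   # endpoint outside the cluster
  ports:
  - port: 80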

Varnish operator grafana dashboard pulls in too many metrics

The Grafana dashboard that is deployed with the operator pulls in metrics from other pods in the cluster that are unrelated to varnish. See the CPU usage and Memory usage graphs in the dashboard. We should narrow the scope of the metrics that are used in the dashboard. Additionally, the CPU usage graph appears to use time when it should be using percentage.

Container restart loses connectivity to backends

When Kubernetes restarts the container due to a liveness probe failure, the container comes back with 0 backends.

varnishadm backend.list
Backend name   Admin      Probe    Health     Last change
boot.default   healthy    0/0      healthy    Sun, 08 May 2022 02:30:49 GMT

I confirmed that /etc/varnish/backends.vcl is still populated correctly and other pods can still connect to the backends without a problem. Deleting the pod "fixes" it.

Here is our VarnishCluster manifest for context.

apiVersion: caching.ibm.com/v1alpha1
kind: VarnishCluster
metadata:
  labels:
    operator: varnish
  name: pcms-api
spec:
  backend:
    port: 80
    selector:
      app: pcms
      component: web
      purpose: api
  replicas: 3
  service:
    annotations:
      prometheus.io/path: /metrics
      prometheus.io/port: "9131"
      prometheus.io/scrape-only: "true"
    port: 80
  varnish:
    args:
    - -p
    - http_max_hdr=256
    - -p
    - http_resp_hdr_len=256k
    - -p
    - http_resp_size=1024k
    - -p
    - workspace_backend=256k
    - -s
    - malloc,756M
    resources:
      limits:
        cpu: 500m
        memory: 1028Mi
      requests:
        cpu: 500m
        memory: 1028Mi
  vcl:
    configMapName: pcms-varnishcluster
    entrypointFileName: default.vcl

Create title override for Grafana dashboard

We need a way to set or override the title that is used by the Grafana dashboard, such as monitoring.grafanaDashboard.title. This will allow users who have multiple namespaced varnish cluster instances within the same k8s cluster to have separate Grafana dashboards that do not conflict with each other.
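A sketch of how the proposed override might look; the surrounding monitoring.grafanaDashboard block is an assumption based on the wording above, and it is not specified here whether the setting would live in the operator's Helm values or the VarnishCluster spec:

monitoring:
  grafanaDashboard:
    title: "Varnish - team-a cache"   # proposed: a unique title per VarnishCluster instance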

Add support for backends that require SSL

There are a lot of different ways to handle this. One way is to set up an external nginx deployment that handles SSL, but this would introduce another hop. In order to eliminate the hop and not have to jump through hoops in VCL to find nginx pods that are colocated with varnish pods, we should consider adding an nginx sidecar so we can talk over localhost. It'd simplify config and hopefully allow us to support SSL by flipping a switch.

Stop attempting to delete VCLs that do not exist

When under load, the varnish-controller logs show the following error being spammed every reconcile loop:

{"level":"error","ts":1616597072.4668653,"caller":"controller/controller.go:116","msg":"exit status 1\nCommand failed with error code 106\nNo VCL named v-25889868-1616597009 known.\n\ngithub.com/ibm/varnish-operator/pkg/varnishcontroller/varnishadm.(*VarnishAdm).Discard\n\t/go/src/github.com/ibm/varnish-operator/pkg/varnishcontroller/varnishadm/varnishadm.go:145\ngithub.com/ibm/varnish-operator/pkg/varnishcontroller/controller.(*ReconcileVarnish).reconcileVarnish\n\t/go/src/github.com/ibm/varnish-operator/pkg/varnishcontroller/controller/controller_varnish.go:61\ngithub.com/ibm/varnish-operator/pkg/varnishcontroller/controller.(*ReconcileVarnish).reconcileWithContext\n\t/go/src/github.com/ibm/varnish-operator/pkg/varnishcontroller/controller/controller.go:192\ngithub.com/ibm/varnish-operator/pkg/varnishcontroller/controller.(*ReconcileVarnish).Reconcile\n\t/go/src/github.com/ibm/varnish-operator/pkg/varnishcontroller/controller/controller.go:109\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:297\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:252\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:215\nk8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:185\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:155\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:156\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:185\nk8s.io/apimachinery/pkg/util/wait.UntilWithContext\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:99\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1374\nCan't delete VCL config \"v-25889868-1616597009\"\ngithub.com/ibm/varnish-operator/pkg/varnishcontroller/controller.(*ReconcileVarnish).reconcileVarnish\n\t/go/src/github.com/ibm/varnish-operator/pkg/varnishcontroller/controller/controller_varnish.go:63\ngithub.com/ibm/varnish-operator/pkg/varnishcontroller/controller.(*ReconcileVarnish).reconcileWithContext\n\t/go/src/github.com/ibm/varnish-operator/pkg/varnishcontroller/controller/controller.go:192\ngithub.com/ibm/varnish-operator/pkg/varnishcontroller/controller.(*ReconcileVarnish).Reconcile\n\t/go/src/github.com/ibm/varnish-operator/pkg/varnishcontroller/controller/controller.go:109\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:297\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:252\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:215\nk8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1\n\t/go/pkg/mod/k8s.io/[email 
protected]/pkg/util/wait/wait.go:185\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:155\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:156\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:185\nk8s.io/apimachinery/pkg/util/wait.UntilWithContext\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:99\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1374\ngithub.com/ibm/varnish-operator/pkg/varnishcontroller/controller.(*ReconcileVarnish).reconcileWithContext\n\t/go/src/github.com/ibm/varnish-operator/pkg/varnishcontroller/controller/controller.go:193\ngithub.com/ibm/varnish-operator/pkg/varnishcontroller/controller.(*ReconcileVarnish).Reconcile\n\t/go/src/github.com/ibm/varnish-operator/pkg/varnishcontroller/controller/controller.go:109\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:297\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:252\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:215\nk8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:185\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:155\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:156\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:185\nk8s.io/apimachinery/pkg/util/wait.UntilWithContext\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:99\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1374","varnish_controller_version":"undefined","varnish_cluster":"vc-sun-location-services","pod_name":"vc-sun-location-services-varnish-0","namespace":"sun-location-services"}

We should check to see if a VCL can be deleted before attempting to delete it to avoid this situation.

Enable option to activate the control port on 6082

First off, thanks for varnish-operator; we're very happy with the way it works and are using it on one of our projects.

The application it is caching has a mechanism which invalidates URLs using the control port (usually on 6082) with a shared secret to authenticate.

By default the operator puts -T 127.0.0.1:6082, so we found a way around this with args, setting the secret with admAuth:

---
apiVersion: caching.ibm.com/v1alpha1
kind: VarnishCluster
metadata:
  labels:
    operator: varnish
  name: varnish-cluster
spec:
   [snip]
    admAuth:
        secretName: varnish-custom-secret
        key: varnish-custom-secret
    args: ["-T", "0.0.0.0:6082"]

We also need to add a Service object that points to it so it can be accessed from the pods using varnish-cli:6082.

Could there be a less "manual" way to make it easier for other people? We'd be happy to give feedback and help test if there is a way to enable this type of feature.
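For the current manual workaround, the extra Service might look like the sketch below; the selector labels are placeholders and need to match whatever labels the operator actually sets on the Varnish pods:

apiVersion: v1
kind: Service
metadata:
  name: varnish-cli               # matches the varnish-cli:6082 address mentioned above
spec:
  selector:
    app: varnish-cluster          # placeholder: use the real Varnish pod labels
  ports:
  - name: varnish-cli
    port: 6082
    targetPort: 6082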

Publish release info to icm-questions

This is a bit weird with this repo as it's open sourced. However, we should still at least discuss if and how we should approach a solution for this problem here. We've started a discussion here about the problem in general. The goal is to get the word out to users of the chart that there's an upgrade available (and ideally what's in it).

Exposing container lifecycle and pod terminationGracePeriodSeconds configuration in VarnishCluster CRD

Hello, great project! I have a use-case where I'd like to add container[].lifecycle.preStop configurations into the varnish statefulset that's generated by the VarnishCluster CRD, but unfortunately there doesn't seem to be a way to do that currently.

My use case is as follows:

  • configured VCL to be a self-routing cluster with consistent hashing
  • want 0 downtime when pods move during upgrades (or any other maintenance)
  • In conjunction with setting a deregistration timeout on the loadbalancer side of things, increasing the terminationGracePeriodSeconds config, and including unused "shutdown" vcl configurations in my vcl ConfigMap, the following lifecycle config on the varnish container will allow for graceful pod removal with no errors or dropped traffic:
lifecycle:
  preStop:
    exec:
      command:
      - /bin/bash
      - -c
      - varnishadm vcl.load shutdown /etc/varnish/entrypoint_shutdown.vcl
        && varnishadm vcl.use shutdown && sleep 60

Worth mentioning that entrypoint_shutdown.vcl is identical to my main entrypoint.vcl, except that the heartbeat probe between Varnish cluster pods returns a 404, marking the pod sick to all members so traffic drains and is not sent to that pod.

Default entrypointVCLFileContent issues

Varnish by default appends its builtin.vcl to custom logic.

Right now, the default entrypointVCLFileContent contains several return calls that prevent the builtin.vcl logic from executing; this can cause issues or weird behaviors.

IMHO, entrypointVCLFileContent should stick to what is strictly needed, and its documentation should be improved.

Issues/weirdness encountered with current entrypointVCLFileContent:

  • All requests are transformed to GET: Varnish by default only deals with GET & HEAD (see https://github.com/varnishcache/varnish-cache/blob/6.5/bin/varnishd/builtin.vcl#L65), but the current entrypointVCLFileContent includes an unconditional return (hash); that prevents the built-in logic from kicking in. AFAIK this is not documented.
  • Only the URL is used for request hashing: this makes the config unsafe when backends vary responses on the Host header. Imagine a PHP application listening on both api.my.app/node/1 and www.my.app/node/1. AFAIK this is not documented.
  • Stale objects are never used: AFAIK this is not documented either.

I suggest removing the unneeded returns from entrypointVCLFileContent; it would both limit weirdness and ease maintainability.

Best regards

Varnish operator context deadline exceeded error

I am getting this error while creating the cluster resource; I tried both versions 0.35.0 and 0.34.2.


C02W84XMHTD5:charts iahmad$ 
C02W84XMHTD5:charts iahmad$ k get svc -n varnish-operator
NAME                       TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)            AGE
varnish-operator-service   ClusterIP   10.202.30.16   <none>        8329/TCP,443/TCP   5m6s
C02W84XMHTD5:charts iahmad$ 
C02W84XMHTD5:charts iahmad$ 
C02W84XMHTD5:charts iahmad$ k apply -f varnish-cluster.yaml 
Error from server (InternalError): error when creating "varnish-cluster.yaml": Internal error occurred: failed calling webhook "mvarnishcluster.kb.io": failed to call webhook: Post "https://varnish-operator-service.varnish-operator.svc:443/mutate-caching-ibm-com-v1alpha1-varnishcluster?timeout=10s": context deadline exceeded
C02W84XMHTD5:charts iahmad$ 


No errors were found in the operator pod logs and the pod is up and running. This happens on GKE; it works on EKS.

Add Priority Class to VarnishClusters Pod

Hey,
I would like to add the possibility to set a priority class. For now there is no way to set one, so the pods take the default priority class and get evicted just like normal pods. This is an issue because these are not normal pods: they hold the whole cache in RAM.

Thanks,
Arthur

Storage Backends for Varnish

Wondering if the current implementation supports using file as the storage backend for Varnish, and if the file can be configured to use a PV/host mount?
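A hedged sketch of what a file storage backend might look like, reusing the spec.varnish.args pass-through shown in other manifests on this page; the path and size are illustrative, and whether that path can be backed by a PV/host mount is exactly what this question asks:

spec:
  varnish:
    args:
    - -s
    - file,/var/lib/varnish/storage.bin,10G   # illustrative storage file and size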

Is it possible to use PROXY protocol?

Hello,

Thank you for your work on varnish-operator! I'm investigating it right now and ran into one problem:

I'm exploring how to enable PROXY support on a Varnish cluster provisioned by the operator. This generally illustrates my use case: https://www.varnish-software.com/developers/tutorials/proxy-protocol-varnish/#proxy-protocol-over-tcp.

According to the docs, I should add something like the "-a :8443,PROXY" parameter.

Unfortunately, the operator doesn't allow me to add this option. Is there a way to put Varnish in "PROXY" mode, or simply bypass the operator validation?
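For illustration, the configuration the question is trying to express, reusing the spec.varnish.args shape from other manifests on this page (per the issue, the operator's validation currently rejects an extra -a listener, so this is a sketch of the desired state, not something that works today):

spec:
  varnish:
    args: ["-a", ":8443,PROXY"]   # desired: an additional PROXY-protocol listener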

Unauthorized when leader election enabled

When leader election is enabled with multiple operator replicas, the logs show an unauthorized error for getting the lease resource within the coordination.k8s.io api group. We need to update the RBAC configuration to allow the varnish operator service account to access this resource.
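A minimal sketch of the RBAC rule that would need to be added, assuming leader election uses standard coordination.k8s.io Lease objects; the role kind, name and namespace are illustrative:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: varnish-operator-leader-election   # illustrative
  namespace: varnish-operator              # illustrative
rules:
- apiGroups: ["coordination.k8s.io"]
  resources: ["leases"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]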

Upgrade PodDisruptionBudget version

Getting warnings in the logs:
W0921 11:26:08.102693 1 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
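For reference, the non-deprecated form is mostly an apiVersion bump; a minimal policy/v1 PodDisruptionBudget looks like the sketch below (name and selector illustrative):

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: varnish-cluster-pdb       # illustrative
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: varnish-cluster        # placeholder: the labels on the Varnish pods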

Frontend Requests on grafana dashboard shows wrong number.

If there are more than 10 Varnish pods and varnish-1 is selected on the Grafana dashboard, the dashboard shows the sum of requests of every pod whose name starts with varnish-1. For example, with pods named varnish-1 through varnish-10, selecting varnish-1 makes Frontend Requests show the sum of varnish-1's and varnish-10's requests.

Build prometheus metrics exporter for arm correctly

#83 introduced multi-arch builds for the Varnish exporter. However, it is still loading the amd64 build of the metrics exporter into the arm64 image.

Since the exporter is also Go-based, it can just be built from source at the same time as the other images.

Varnish Cluster using sharding director ?

Hi Craig, we are pretty satisfied with how VarnishCluster works, but since we have 3 VarnishCluster replicas for resiliency, we need 3 requests to build up the cache on the 3 instances.

I was wondering if there is a way to use sharding as they do here in this example, so that we avoid going back to the app when one of the Varnish instances may have the object in cache, but set up automatically by the controller inside backends.vcl.tmpl.

Thanks,
Arthur

How to broadcast PURGE/BAN request to all pods in cluster?

Hello,

As I'm exploring this operator (good work!) I found the next possible issue: is it possible to broadcast certain request calls to all Varnish pods?

The use case would be to call all Varnish pods and perform a cache-tags-based cache flush. I found that one possible solution is to use some sort of HTTP broadcaster as part of the Varnish containers. I see two alternative solutions:

Is this something that is part of the roadmap for the project? Are there alternative solutions that may be used with this project?

Need to add annotations to varnish cluster spec

I need to add the following annotations to the cluster definition so that all other objects created by the cluster definition (such as the StatefulSet and pods) inherit them; currently, this doesn't work.

annotations:
   prometheus.io/scrape: "true"
   prometheus.io/port: "9131" 

Enquiry about project status and ownership

First off, I am impressed with what this operator offers and the future plans look promising.

It is clear that the project is still currently in alpha status, and it appears that core contributions have slowed down somewhat.

Can more context be provided regarding the future of this project? Is this still something that IBM is actively developing?

cc @tomashibm @arthurzenika @cin

Helm ClusterRole does not include Namespace permissions

I have a varnish operator installed using the default values for 0.28.1. The operator logs are filled with the error:

E0114 13:37:40.569616       1 reflector.go:138] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1.Namespace: failed to list *v1.Namespace: namespaces is forbidden: User "system:serviceaccount:default:varnish-operator" cannot list resource "namespaces" in API group "" at the cluster scope

I assume the operator is trying to get a list of namespaces to scan them for VarnishCluster objects. The ClusterRole that is created by the Helm chart does not include permission to list the namespaces resource (a sketch of the missing rule follows the table):

  Resources                                      Non-Resource URLs  Resource Names  Verbs
  ---------                                      -----------------  --------------  -----
  configmaps                                     []                 []              [create delete get list update watch]
  leases.coordination.k8s.io                     []                 []              [create delete get list update watch]
  servicemonitors.monitoring.coreos.com          []                 []              [create delete get list update watch]
  serviceaccounts                                []                 []              [create delete list update watch]
  services                                       []                 []              [create delete list update watch]
  statefulsets.apps                              []                 []              [create delete list update watch]
  varnishclusters.caching.ibm.com                []                 []              [create delete list update watch]
  poddisruptionbudgets.policy                    []                 []              [create delete list update watch]
  clusterrolebindings.rbac.authorization.k8s.io  []                 []              [create delete list update watch]
  clusterroles.rbac.authorization.k8s.io         []                 []              [create delete list update watch]
  rolebindings.rbac.authorization.k8s.io         []                 []              [create delete list update watch]
  roles.rbac.authorization.k8s.io                []                 []              [create delete list update watch]
  secrets                                        []                 []              [create get list update watch]
  events                                         []                 []              [create patch]
  pods                                           []                 []              [get list update watch]
  varnishclusters.caching.ibm.com/finalizers     []                 []              [get patch update]
  varnishclusters.caching.ibm.com/status         []                 []              [get patch update]
  endpoints                                      []                 []              [list watch]
  nodes                                          []                 []              [list watch]
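A sketch of the ClusterRole rule that appears to be missing; the verbs are based on the error above (list) plus watch, which the informer presumably also needs:

- apiGroups: [""]
  resources: ["namespaces"]
  verbs: ["list", "watch"]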

Could we separate image/tag override?

Hey,
I would like to override the image but not the tag, as I synchronize images from ibmcom/varnish.* into my own container registry.
I don't want to override the tag because I want the latest versions, but I want to pull them from my ECR.
I can do the PR.
Thanks
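A sketch of how the requested split could look in the chart's values.yaml; these key names are hypothetical, not the chart's current schema:

container:
  image:
    repository: 123456789012.dkr.ecr.eu-west-1.amazonaws.com/varnish-operator   # hypothetical mirror in ECR
    tag: ""                                                                      # empty = keep the chart's default tag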

Standard installation doesn't work

https://ibm.github.io/varnish-operator/installation.html

Hello. It doesn't work on a managed cluster (Kubernetes as a service [Scaleway]), probably because there is no master server.

W1022 04:58:46.584001 1 client_config.go:608] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
{"level":"info","msg":"patching webhook configurations 'varnish-operator-webhook-configuration' mutating=true, validating=true, failurePolicy=Fail","source":"k8s/k8s.go:39","time":"2021-10-22T04:58:46Z"}
{"err":"the server could not find the requested resource","level":"fatal","msg":"failed getting validating webhook","source":"k8s/k8s.go:48","time":"2021-10-22T04:58:46Z"}

Regression in 0.34.1 (#87 ?)

Version 0.34.1 doesn't work out of the box; the operator's logs say:

kubectl logs -n varnish-operator        pod/varnish-operator-758487765-2xk58
2022/11/03 16:49:42 unable to read env vars: image is not properly formatted: repository name must have at least one component

Comparing template output from 0.34.0 and 0.34.1 shows that it might have been an oversight on #87

diff -u <(helm template varnish-operator varnish-operator/varnish-operator --version 0.34.0) <(helm template varnish-operator varnish-operator/varnish-operator)
--- /dev/fd/63	2022-11-03 18:02:19.818870658 +0100
+++ /dev/fd/62	2022-11-03 18:02:19.818870658 +0100
@@ -217,7 +217,7 @@
       serviceAccountName: varnish-operator
       containers:
       - name: varnish-operator
-        image: "ibmcom/varnish-operator:0.34.0"
+        image: ibmcom/varnish-operator:0.34.1
         imagePullPolicy: "Always"
         env:
         - name: NAMESPACE
@@ -225,7 +225,7 @@
         - name: LEADERELECTION_ENABLED
           value: "false"
         - name: CONTAINER_IMAGE
-          value: "ibmcom/varnish-operator:0.34.0"
+          value: 
         - name: WEBHOOKS_ENABLED
           value: "true"
         - name: LOGLEVEL
