fluxcd / flagger
Progressive delivery Kubernetes operator (Canary, A/B Testing and Blue/Green deployments)
Home Page: https://docs.flagger.app
License: Apache License 2.0
Extend the status object with phases for:
In our Istio deployment, we decided to put the VirtualService definitions in a different namespace from the one where the deployments run, in order to limit access to the VirtualServices. Is this a supported scenario with Flagger?
Push events to Slack on:
The Slack message should contain the workload name, namespace and the canary status.
Hey Stefan/others,
I was trying to get Flagger set up in our cluster and tested it in my own namespace rather than istio-system, but the flagger pods were stuck in initialization. When I deployed to istio-system, the install worked.
These are my steps:
helm template './flagger' --name flagger --namespace=user-aaqel --set metricsServer=http://prometheus.istio-system:9090 > $HOME/scratch/flagger.yaml
kubectl apply -f flagger.yaml -n user-aaqel
We are looking to migrate to your tool; however, we use Destination Rules all over the place. Will you be supporting them in the future?
Flagger should allow the canary name to differ from the target name. The target name should be used at bootstrap.
Sorry for all the Q's, but in my process of getting a demo of this up and running I was trying to see how to have the Canary definition specify service annotations. For example, we have a service annotation that tells our Prometheus instance that that service supports Prometheus scraping and the endpoint where those metrics can be scraped.
tl;dr Does the canary object support ClusterIP Service Annotations?
Currently Flagger's Cluster Role has no API restrictions.
The RBAC should allow read-write access only to:
We should use Alpine v3.9 as base image for Flagger and the load tester container images.
It would be great to have Prometheus metrics that show whether a deployment has been promoted from canary to primary or rolled back to primary.
Add timeout and retries to the Canary service spec.
Flagger metric checks will fail with "no values found for metric istio_requests_total" if the target workload doesn't receive any traffic during the canary analysis. For workloads that are not receiving constant traffic, Flagger could be configured with a webhook that, when called, will start a load test for the target workload.
We should provide a load test service based on rakyll/hey that will generate traffic during analysis.
Add A/B testing capabilities using fixed routing based on HTTP headers and cookies match conditions.
The canary analysis could be extended with webhooks. Flagger would call a URL and determine from the response status code if the canary is failing or not.
CRD spec:
kind: Canary
metadata:
  name: podinfo
  namespace: prod
spec:
  canaryAnalysis:
    webhooks:
      - name: check1
        url: http://validator.prod.svc.cluster.local:9898/check
        timeout: 10s
        # arbitrary key-value pairs
        metadata:
          key1: "value1"
          key2: "value2"
Webhook payload (HTTP POST JSON):
{
  "name": "podinfo",
  "namespace": "prod",
  "metadata": {
    "key1": "value1",
    "key2": "value2"
  }
}
Response status codes:
On a non-2xx response, include the response body (if any) in the failed checks log and event.
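The webhook check described above can be sketched roughly as follows. This is a minimal illustration in Python rather than Flagger's own code: the function names `build_payload` and `evaluate_response` are hypothetical, and treating any 2xx status as a passing check is an assumption inferred from the note about non-2xx responses.

```python
import json


def build_payload(name, namespace, metadata=None):
    """Serialize the JSON body sent with the HTTP POST (hypothetical helper)."""
    return json.dumps(
        {"name": name, "namespace": namespace, "metadata": metadata or {}}
    )


def evaluate_response(status_code, body=""):
    """Return (passed, message).

    Assumption: any 2xx status means the check passed. On any other
    status the check fails and the response body, if present, is
    appended to the message for the failed checks log and event.
    """
    if 200 <= status_code < 300:
        return True, ""
    message = f"webhook returned status {status_code}"
    if body:
        message += f": {body}"
    return False, message
```

A caller would POST `build_payload("podinfo", "prod", {"key1": "value1"})` to the webhook URL within the configured timeout and pass the resulting status code and body to `evaluate_response`.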
Flagger should track changes for secrets and config maps used by the target deployment.
Track changes tasks
Add progress deadline to CRD and rollback a canary deployment if the last transition time exceeds the deadline value. This prevents Flagger from retrying the canary indefinitely if the Kubernetes deployment is stuck.
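The deadline check itself could look something like the sketch below. It is written in Python for illustration only; `deadline_exceeded` and its parameters are hypothetical names, and how Flagger would read the last transition time from the deployment status is not shown.

```python
from datetime import datetime, timedelta, timezone


def deadline_exceeded(last_transition, deadline_seconds, now=None):
    """Return True when more than `deadline_seconds` have elapsed
    since `last_transition` (both timestamps timezone-aware)."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - last_transition > timedelta(seconds=deadline_seconds)
```

When the function returns True, the controller would roll back the canary instead of retrying.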
Right now, due to Istio limitations, it is not possible to create a VirtualService with a mesh and another host name. For example, if I have:
...
gateways:
  - www.myapp.com
  - mesh
http:
  - match:
      - uri:
          prefix: /api
    route:
      - destination:
          host: api.default.svc.cluster.local
          port:
            number: 80
and
...
gateways:
  - www.myapp.com
  - mesh
http:
  - match:
      - uri:
          prefix: /internal
    route:
      - destination:
          host: internal.default.svc.cluster.local
          port:
            number: 80
Istio will throw an error:
Only unique values for domains are permitted. Duplicate entry of domain www.myapp.com
I see two ways of fixing this; Flagger could either:
Let me know what you think!
The metrics validation should be extended to support any kind of metric by allowing custom Prometheus queries in the canary analysis spec.
Using Kubernetes Kind, e2e testing can be run inside the CI pipeline. This will increase the build time by 3-5 minutes.
Events:
Docker Hub has been highly unstable lately: one in two builds fails on login or image push. Moving to Quay would give better stats, security scanning and higher availability.
As per kubernetes/apiextensions-apiserver#25 CRD validation doesn't accept empty values for type "object" fields.
Workaround:
autoscalerRef:
  anyOf:
    - type: string
    - type: object
  required: ['apiVersion', 'kind', 'name']
  properties:
    apiVersion:
      type: string
    kind:
      type: string
    name:
      type: string
Flagger assumes that a deployment specifies the pod label selector using the app convention. We should support other label names like app.kubernetes.io/name or name.
The label name or names could be specified as a command flag.
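One way the lookup could work is to try each configured label name in order against the deployment's selector. The Python sketch below is illustrative only: `find_selector_label` is a hypothetical helper, and the default candidate order is an assumption, not an existing Flagger flag.

```python
def find_selector_label(match_labels, candidates=("app", "name", "app.kubernetes.io/name")):
    """Return the first (label, value) pair from `candidates` that is
    present in the deployment's matchLabels, or None if none match."""
    for label in candidates:
        if label in match_labels:
            return label, match_labels[label]
    return None
```

A deployment whose selector uses none of the configured names would return None and could then be rejected with a clear error.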
For emergency cases, one may wish to skip the canary analysis and do the promotion asap. The Canary CRD should include a flag that will signal Flagger to promote the canary without analysing it.
https://docs.flagger.app/install/flagger-install-on-google-cloud
Create a wildcard certificate (replace example.com with your domain):
...
Warning BadConfig 2s (x7 over 2m37s) cert-manager Resource validation failed: spec.secretName: Required value: must be specified
spec.secretname: istio-ingressgateway-certs
should be
spec.secretName: istio-ingressgateway-certs
To minimise 503s during a rolling update one could specify the following Envoy headers:
x-envoy-max-retries: "10"
x-envoy-retry-on: "gateway-error,connect-failure,refused-stream"
The Canary service spec should allow headers to be appended.
At startup Flagger should check whether the Kubernetes API version is lower than 1.11; if it is, it should log an unsupported error and exit. The validation should happen here https://github.com/stefanprodan/flagger/blob/master/cmd/flagger/main.go#L101
The semver comparison can be done with https://github.com/Masterminds/semver
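The shape of that check can be sketched as below. This is a simplified Python illustration, not the Masterminds/semver API: it compares only the major and minor components, and `parse_major_minor` and `is_supported` are hypothetical names.

```python
import re


def parse_major_minor(version):
    """Extract (major, minor) from strings such as 'v1.10.7-gke.2' or '1.13+'."""
    match = re.match(r"v?(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def is_supported(version, minimum=(1, 11)):
    """Return True when the version is at least the minimum (default 1.11)."""
    return parse_major_minor(version) >= minimum
```

At startup, a False result would lead to logging the unsupported-version error and exiting.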
Had a weird issue where I created a new deployment, and then updated the name field in the canary.yaml. After this, every time I did a canary release flagger would roll out podinfo.demo and demo.demo.
It seemed like the name: podinfo was cached and still being triggered along with name: demo? They would each roll out at the same time but with individual config: podinfo.demo had a stepWeight: 20 but demo.demo had stepWeight: 2.
The change that caused the behaviour to start: guyfedwards/istio-demo@397850d
We are using ambassador as our api gateway. I'd like to use flagger for our canary deployment however it seems tightly coupled with the istio ingress. Would it be possible to add support for ambassador? You can see here an example of how to do this:
Not sure if this is something you can support but it seems like all you would need to do is change the service annotation rather than create a virtual service?
Thanks!
Ted
Reject deployment if the pod label selector doesn't match app: <DEPLOYMENT_NAME>
Instead of using the same control loop interval for all canaries this setting should be configurable per canary deployment.
For user facing web apps a consistent hash-based load balancing would be more suitable than the default round robin one.
In order to enable session affinity, Flagger should create and use Istio destination rules. The Canary CRD should contain one of the following affinity options: source IP, HTTP header and cookie.
This implies a breaking change to the bootstrap process since the current Virtual Service created by Flagger is not using destination rules.
Use helm-gh-pages action to publish Flagger's charts to GitHub Pages.
Additional labels besides the app label do not get copied over to the Primary deployment.
Version: 0.9.0
Until Istio has an official k8s client, we'll have to maintain our own to keep up with the latest Istio CRDs.
Implement the new headers spec without breaking compatibility with Istio 1.0
For Flagger to support other service mesh technologies, the Istio routing management should be refactored into its own package.
Currently Flagger uses a Cluster Role and watches for canary objects in all namespaces. It should be possible to make Flagger watch a single namespace and enforce the restriction with RBAC.
The request success check takes into account only HTTP 5xx. A new metric could be added that would target 4xx status codes. This implies renaming the success rate metric in the CRD to have a clear distinction between 5xx and 4xx.
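To make the distinction concrete, the sketch below computes separate 5xx and 4xx percentages from per-status request counts. It is a Python illustration under assumptions: `error_rates` and the result keys are hypothetical, and in practice the counts would come from Prometheus queries rather than a dictionary.

```python
def error_rates(counts):
    """Given {status_code: request_count}, return the percentage of
    5xx and 4xx responses, or None when there was no traffic."""
    total = sum(counts.values())
    if total == 0:
        return None
    server = sum(n for code, n in counts.items() if 500 <= code < 600)
    client = sum(n for code, n in counts.items() if 400 <= code < 500)
    return {"5xx": 100 * server / total, "4xx": 100 * client / total}
```

Each percentage could then be compared against its own threshold in the canary analysis.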
Currently, if a canary analysis is underway and a new revision is applied on the cluster, Flagger will wait for the new revision to be rolled out and will resume the analysis. This could lead to an erroneous promotion since the new revision will not be fully validated. Instead of resuming the analysis, Flagger should reset the validation process (set the failed checks and canary weight to zero) and start a fresh analysis.
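The proposed reset can be sketched as a small state update. This is a Python illustration only: the `CanaryStatus` fields and the `sync_revision` helper are hypothetical and do not mirror Flagger's actual status type.

```python
from dataclasses import dataclass


@dataclass
class CanaryStatus:
    last_applied_revision: str
    canary_weight: int = 0
    failed_checks: int = 0


def sync_revision(status, observed_revision):
    """Reset the analysis state when a new revision is observed.

    Returns True if the state was reset, False if the revision is unchanged.
    """
    if observed_revision == status.last_applied_revision:
        return False
    status.last_applied_revision = observed_revision
    status.canary_weight = 0
    status.failed_checks = 0
    return True
```

Calling `sync_revision` at the start of each control loop iteration would restart the analysis from zero weight whenever the target changes mid-rollout.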
Changes to the canaryAnalysis.interval are not applied by Flagger. The canary analysis interval stays the same until Flagger restarts because a job is created and never updated.
Could you show an example of how to use this with the istio ingress? I can't seem to figure out how to point to the correct service!
More specifically, is it possible to tell the istio ingress to route based on certain criteria (i.e. a uri prefix, etc?)
So far 0.6 has been great!
We need one more thing: corsPolicy support. Hopefully this change should be fairly trivial.
Thank you!
Unsure if this question even makes sense, but would flagger be able to apply its canary magic to knative serving CRD? Maybe too much going on there since knative uses different autoscaling logic I think. Just curious if both could be used together or really should just stick with flagger and Istio/k8s features.
Really amazing project, love the feature set and how it rolls things out. We might try experimenting with it soon! Thanks!