flagger's Issues

Slack notifications

Push events to Slack on:

  • canary initialisation
  • canary revision changed
  • promotion succeeded
  • promotion failed

The Slack message should contain the workload name, namespace and the canary status.
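A minimal Slack incoming-webhook payload carrying those fields might look like the following (the message layout is an illustration, not Flagger's actual notification format):

```json
{
    "text": "podinfo.prod: promotion succeeded",
    "attachments": [
        {
            "fields": [
                {"title": "Workload", "value": "podinfo", "short": true},
                {"title": "Namespace", "value": "prod", "short": true},
                {"title": "Status", "value": "promotion succeeded", "short": false}
            ]
        }
    ]
}
```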

Flagger install is stuck in init when installing in a namespace other than istio-system

Hey Stefan/others,

I was trying to get Flagger set up in our cluster, testing it out in my own namespace rather than istio-system, and the Flagger pods were stuck in initialization. When I deployed to istio-system, the install worked.

These are my steps:
helm template './flagger' --name flagger --namespace=user-aaqel --set metricsServer=http://prometheus.istio-system:9090 > $HOME/scratch/flagger.yaml
kubectl apply -f $HOME/scratch/flagger.yaml -n user-aaqel

Destination Rules support

Looking to migrate to your tool; however, we use Destination Rules all over the place. Will you be supporting them in the future?

Annotations on ClusterIP services

Sorry for all the Q's, but while getting a demo of this up and running I was trying to see how the Canary definition can specify service annotations. For example, we have a service annotation that tells our Prometheus instance that a service supports scraping, and the endpoint where its metrics can be scraped.

tl;dr: does the Canary object support ClusterIP service annotations?
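For illustration, the request amounts to something along these lines in the Canary spec (the annotations field here is hypothetical, not an existing feature):

```yaml
kind: Canary
spec:
  service:
    port: 9898
    # hypothetical: annotations to be copied onto the generated ClusterIP services
    annotations:
      prometheus.io/scrape: "true"
      prometheus.io/path: /metrics
```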

Fine-grained RBAC

Currently, Flagger's ClusterRole has no API restrictions.

The RBAC should allow read-write access only to:

  • services
  • deployments
  • configmaps
  • secrets
  • virtualservices.networking.istio.io
  • canaries.flagger.app
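A restricted ClusterRole covering only those resources could be sketched as follows (the rule grouping and verb list are assumptions):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: flagger
rules:
- apiGroups: [""]
  resources: ["services", "configmaps", "secrets"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- apiGroups: ["apps"]
  resources: ["deployments"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- apiGroups: ["networking.istio.io"]
  resources: ["virtualservices"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- apiGroups: ["flagger.app"]
  resources: ["canaries"]
  verbs: ["get", "list", "watch", "update", "patch"]
```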

Upgrade Alpine to 3.9

We should use Alpine v3.9 as the base image for the Flagger and load tester container images.

Support AWS AppMesh

  • create CRDs and AppMesh K8s clientset
  • extend routing management with AppMesh virtual nodes and virtual services
  • set up metrics scraping for AppMesh sidecars on EKS
  • implement AppMesh metric checks

Load testing webhook

Flagger's metric checks will fail with "no values found for metric istio_requests_total" if the target workload doesn't receive any traffic during the canary analysis. For workloads that do not receive constant traffic, Flagger could be configured with a webhook that, when called, starts a load test for the target workload.
We should provide a load test service based on rakyll/hey that will generate traffic during analysis.
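Such a webhook might be wired into the analysis spec like this (the load-tester service URL and the hey arguments are illustrative):

```yaml
canaryAnalysis:
  webhooks:
  - name: load-test
    url: http://flagger-loadtester.test/
    timeout: 5s
    metadata:
      # illustrative: generate traffic for one minute at 10 req/s with 2 workers
      cmd: "hey -z 1m -q 10 -c 2 http://podinfo.test:9898/"
```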

Extend canary analysis with external checks

The canary analysis could be extended with webhooks. Flagger would call a URL and determine from the response status code if the canary is failing or not.

CRD spec:

kind: Canary
metadata:
  name: podinfo
  namespace: prod
spec:
  canaryAnalysis:
    webhooks:
    - name: check1
      url: http://validator.prod.svc.cluster.local:9898/check
      timeout: 10s
      # arbitrary key-value pairs 
      metadata:
        key1: "value1"
        key2: "value2"

Webhook payload (HTTP POST JSON):

{
    "name": "podinfo",
    "namespace": "prod", 
    "metadata": {
        "key1":  "value1",
        "key2":  "value2"
    }
}

Response status codes:

  • 200-202 - advance canary by increasing the traffic weight
  • timeout or non-2xx - halt advancement and increment failed checks

On a non-2xx response, include the response body (if any) in the failed-checks log and event.

Track secrets and config maps

Flagger should track changes for secrets and config maps used by the target deployment.

Track changes tasks

  • discover secrets and config maps referenced by the target deployment
  • detect changes in secrets and config maps referenced by the target deployment
  • the primary bootstrap and promotion procedure should clone the target configs
  • trigger the canary analysis when a secret or configmap changes

Rollback canary when deployment is stuck

Add progress deadline to CRD and rollback a canary deployment if the last transition time exceeds the deadline value. This prevents Flagger from retrying the canary indefinitely if the Kubernetes deployment is stuck.
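A sketch of the proposed field (the name and default value are assumptions):

```yaml
kind: Canary
spec:
  # roll back if the canary deployment makes no progress for this long
  progressDeadlineSeconds: 60
```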

Only unique values for domains are permitted error with Istio 1.1.0 RC1

Right now, due to Istio limitations, it is not possible for two virtual services to attach the same host to both a gateway and the mesh. For example, if I have:

...
gateways:
- www.myapp.com
- mesh
http:
  - match:
    - uri:
        prefix: /api
    route:
    - destination:
        host: api.default.svc.cluster.local
        port:
          number: 80

and

...
gateways:
- www.myapp.com
- mesh
http:
  - match:
    - uri:
        prefix: /internal
    route:
    - destination:
        host: internal.default.svc.cluster.local
        port:
          number: 80

Istio will throw an error:

Only unique values for domains are permitted. Duplicate entry of domain www.myapp.com

The two ways I see of fixing this are for Flagger to either:

  1. Create a separate virtualservice and maintain the canary settings for each one correlated to the particular service deployed
  2. Compile all virtualservices together into a singular virtualservice

Let me know what you think!

Support custom Prometheus metrics

The metrics validation should be extended to support any kind of metric by allowing custom Prometheus queries in the canary analysis spec.
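A custom check might then be declared with an inline PromQL query, along these lines (the field names and query are illustrative, not a confirmed spec):

```yaml
canaryAnalysis:
  metrics:
  - name: "404s percentage"
    threshold: 5
    interval: 1m
    # illustrative PromQL: share of 404 responses for the canary workload
    query: |
      100 * sum(rate(istio_requests_total{destination_workload="podinfo", response_code="404"}[1m]))
      / sum(rate(istio_requests_total{destination_workload="podinfo"}[1m]))
```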

Add details to Slack notifications

Events:

  • start analysis (metadata: max failed checks, max and step weight)
  • promotion (metadata: number of failed checks)
  • rollback (metadata: rollback reason, can be failed check or deadline exceeded)

Switch from Docker Hub to Quay

Docker Hub has been highly unstable lately; one in two builds fails on login or image push. Moving to Quay gives us better stats, security scanning and higher availability.

Skip canary analysis

For emergencies, one may wish to skip the canary analysis and promote as soon as possible. The Canary CRD should include a flag that signals Flagger to promote the canary without analysing it.

Support Envoy retry headers

To minimise 503s during a rolling update, one could specify the following Envoy headers:

x-envoy-max-retries: "10"
x-envoy-retry-on: "gateway-error,connect-failure,refused-stream"

The Canary service spec should allow headers to be appended.
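In the Canary service spec this could follow the shape of Istio's Headers API, e.g. (the field placement is an assumption):

```yaml
service:
  port: 9898
  headers:
    request:
      add:
        x-envoy-max-retries: "10"
        x-envoy-retry-on: "gateway-error,connect-failure,refused-stream"
```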

Deployment seems cached after updating name in canary.yaml

Had a weird issue where I created a new deployment and then updated the name field in the canary.yaml. After this, every time I did a canary release, Flagger would roll out both podinfo.demo and demo.demo.

It seemed like name: podinfo was cached and still being triggered along with name: demo. They would each roll out at the same time but with individual configs: podinfo.demo had stepWeight: 20 while demo.demo had stepWeight: 2.

The change that caused the behaviour to start: guyfedwards/istio-demo@397850d

Use with Ambassador ingress

We are using Ambassador as our API gateway. I'd like to use Flagger for our canary deployments, however it seems tightly coupled with the Istio ingress. Would it be possible to add support for Ambassador? You can see an example of how to do this here:

https://www.getambassador.io/docs/dev-guide/canary-release-concepts#flexible-kubernetes-canary-releases-smart-routing-with-ambassador

Not sure if this is something you can support, but it seems like all you would need to do is change a service annotation rather than create a virtual service?

Thanks!

Ted

Add support for session affinity

For user-facing web apps, consistent hash-based load balancing would be more suitable than the default round-robin.

In order to enable session affinity, Flagger should create and use Istio destination rules. The Canary CRD should offer one of the following affinity options: source IP, HTTP header or cookie.

This implies a breaking change to the bootstrap process, since the Virtual Service currently created by Flagger does not use destination rules.
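A cookie-based example of the kind of Istio DestinationRule Flagger would need to generate (the host and cookie name are illustrative):

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: podinfo
  namespace: prod
spec:
  host: podinfo
  trafficPolicy:
    loadBalancer:
      consistentHash:
        # route a given user to the same workload version based on a cookie
        httpCookie:
          name: user-session
          ttl: 0s
```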

Restart analysis if revision changes during validation

Currently, if a canary analysis is underway and a new revision is applied on the cluster, Flagger will wait for the new revision to be rolled out and then resume the analysis. This could lead to an erroneous promotion, since the new revision will not be fully validated. Instead of resuming the analysis, Flagger should reset the validation process (set the failed checks and canary weight to zero) and start a fresh analysis.

Add HTTP match conditions to Canary service spec

Could you show an example of how to use this with the Istio ingress? I can't seem to figure out how to point to the correct service!

More specifically, is it possible to tell the Istio ingress to route based on certain criteria (e.g. a URI prefix)?
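The ask is to surface Istio HTTPMatchRequest conditions on the Canary service spec, e.g. (the field placement is an assumption):

```yaml
service:
  port: 9898
  match:
  - uri:
      prefix: /api
  rewrite:
    uri: /
```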

Support VirtualService CorsPolicy

So far 0.6 has been great!

We are needing one more thing - corsPolicy support. Hopefully this change should be fairly trivial.

Thank you!
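Passing an Istio CorsPolicy through the Canary service spec might look like this (the placement is an assumption; the fields follow Istio's VirtualService API):

```yaml
service:
  port: 9898
  corsPolicy:
    allowOrigin:
    - "*"
    allowMethods:
    - GET
    - POST
    allowHeaders:
    - authorization
    maxAge: 24h
```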

Question: knative serving support

Unsure if this question even makes sense, but would Flagger be able to apply its canary magic to the Knative Serving CRDs? Maybe there is too much going on there, since Knative uses its own autoscaling logic. Just curious whether the two could be used together, or whether one should stick with Flagger and Istio/Kubernetes features.

Really amazing project, love the feature set and how it rolls things out. We might try experimenting with it soon! Thanks!
