litmuschaos / litmus-helm

Helm Charts for the Litmus Chaos Operator & CRDs

License: Apache License 2.0

Mustache 1.70% YAML 96.67% Smarty 1.63%

chart charts helm helm-chart helm-charts kubernetes

litmus-helm's People

Contributors

rahulchheda ispeakc0de hochuenw-dd kazukousen ksatchit jasstkn nlamirault radudd ishangupta-ds rberrelleza nosmolovskiy nikolay-o blame19 aditmeno psellars-hyprnz imrajdas niksko itustanic aldor007 gdsoumya dparbhakar bishtsaurabh5 jonsy13 mattslane oumkale sparepu tuananh luigi-bitonti talziv bbarin dipeshtripathi cryslam sethfduke mehdi-mameri calvinbui glagny kochsecurity stephen-harris stoehdoi mikelsid robinsegura carrefour-group romdum josephvano avaakash aslafy-z pchico83 ittireddy1 adarshkumar14 ludovic-pourrat dmyerscough vr00mm jimsheldon aristocrat888 bingwei-hong-partior sampriyadarshi davidcollom omerlh srivatsanraghunathan flockoftanks whereismybugfix mvlbarcelos calvinaud saranya-jena kbfu namkyu1999 nighthawq7 joannabs22 dwdraju vlzzz jouve andreaerco rxnew pmanghna rmullinnix461332 jlplasce hazem-nasser uditgaurav sambonbonne andoriyaprashant dusdjhyeon jongwooo

litmus-helm's Issues

Configure mongodb chart as a dependency for litmus-portal

Consider using an external MongoDB Helm chart as a dependency, to avoid the additional work of maintaining and supporting our own.

For example, we could use this chart from Bitnami: https://github.com/bitnami/charts/tree/master/bitnami/mongodb

Add support for specifying priorityClassName in litmus operator

We'd like to be able to set the priorityClassName in litmus components (runner, operator and exporter).
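
For reference, a minimal sketch of what the rendered manifests would need to contain. It assumes a PriorityClass named `litmus-critical` (a hypothetical name and value) and a heavily simplified operator Deployment; the chart values key that would feed `priorityClassName` is not specified in this issue and is left open here.

```yaml
# Hypothetical PriorityClass; the name and value are illustrative only.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: litmus-critical
value: 100000
globalDefault: false
description: "Priority for litmus chaos components (example)."
---
# Simplified Deployment showing where priorityClassName sits in the pod spec.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: chaos-operator-ce
  namespace: litmus
spec:
  replicas: 1
  selector:
    matchLabels:
      name: chaos-operator
  template:
    metadata:
      labels:
        name: chaos-operator
    spec:
      # The field requested in this issue; it references the PriorityClass above.
      priorityClassName: litmus-critical
      containers:
        - name: chaos-operator
          image: litmuschaos/chaos-operator:2.10.0   # illustrative tag
```

The same pod-level field would apply to the runner and exporter pods; how the operator passes it on to runner pods is outside the scope of this sketch.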

Figure out cleaner values structure for litmus portal's components

Context: #265 (comment)

Example of issues: #257 (impossible to change values independently)

This structure also makes the values harder for users to use and manage.

Error on subscriber pod

Hi guys, I ran into an error while trying to install a litmus environment with self-signed certificates.
I used the gdsoumya/litmusportal-subscriber:ci image and the SKIP_SSL_VERIFY: "true" env var, as suggested in a previous issue, but I see these logs from the pod:

Add contribution guide

Question: How to set NodeSelector for agent plane resources using helm chart?

I want to run the agent plane resources (subscriber, event-tracker, litmus-operator-ce, workflow-controller) on particular nodes.
I found that these resources can be run with a node selector (litmuschaos/litmus), but I can't work out how to do it using the chart.
Can someone help?

[BUG] Admin Login not Respected

Using v2.10.0 and v2.10.2 of the litmus chaos chart. I pass in the following values file via terraform:

adminConfig:
  ADMIN_USERNAME: admin
  ADMIN_PASSWORD: <password>

and I do see the user and password values updated in the secret on K8s. However, when I try to log in I get denied. I also tried the default admin/litmus creds and they didn't work either. Checking the auth logs I see:

time="2022-06-30T14:10:19Z" level=info msg="users's collection already exists, continuing with the existing mongo collection"
time="2022-06-30T14:10:19Z" level=info msg="project's collection already exists, continuing with the existing mongo collection"
time="2022-06-30T14:10:23Z" level=info msg="Admin already exists in the database, not creating a new admin"
time="2022-06-30T14:10:23Z" level=info msg="Listening and serving HTTP on :3000"
time="2022-06-30T14:10:23Z" level=info msg="Listening and serving gRPC on :3030"
time="2022-06-30T14:25:32Z" level=warning msg="password doesn't match"
[GIN] 2022/06/30 - 14:25:32 | 401 |  3.831298258s |     10.145.2.23 | POST     "/login"

This suggests to me that an admin account already exists in the database and a new one cannot be added. Any advice on how to pass in the username and password so that I can log in, or on how to access MongoDB to get the credentials, would be very helpful.

Additional Files:

main.tf

resource "random_password" "admin_password" {
  count = var.admin_password == null ? 1 : 0
  length           = 10
  special          = true
}

resource "helm_release" "litmus_chaos" {
  name             = "chaos"
  repository       = "https://litmuschaos.github.io/litmus-helm"
  chart            = "litmus"
  version          = "2.10.2"

  create_namespace = var.create_namespace
  namespace        = var.namespace

  values = [
    yamlencode(local.chart_values),
    var.additional_yaml_config
  ]
}

locals.tf

locals {
  chart_values = {
    adminConfig = {
        ADMIN_USERNAME = var.admin_username
        ADMIN_PASSWORD = var.admin_password == null ? random_password.admin_password.0.result : var.admin_password
    }  
    ingress = {
        enabled = true 
        host = {
            name = var.ingress_host_name == null ? format("%s-%s-%s.azure.lnrsg.io", (length(split("usgov", var.names.location)) > 1 ? "usgov" : "us"), var.names.product_name, var.names.environment) : var.ingress_host_name
        }
        ingress_class_name = var.ingress_class_name
        annotations = {
            "cert-manager.io/cluster-issuer" = "letsencrypt-issuer"
            "lnrs.io/zone-type" = "public"
            "nginx.ingress.kubernetes.io/rewrite-target" = "/$1"
        }
    }
  }
}

variables.tf

variable "names" {
  description = "Names to be applied to resources"
  type        = map(string)
}

variable "namespace" {
  type        = string
  description = "Namespace of aks helm release."
  default = "litmus"
  nullable = false
}

variable "create_namespace" {
  type        = bool
  description = "Create the namespace for the helm release."
  default = true
  nullable = false
}

variable "admin_username" {
  type        = string
  default     = "admin"
  description = "username for server admin."
}

variable "admin_password" {
  type        = string
  default = null
  description = "password for server admin."
}

# variable "cluster_issuer" {
#   type        = string
#   default     = "letsencrypt-acme-production"
#   description = "cert manager cluster issuer for ingress."
# }    

# Ingress
variable "ingress_enabled" {
  type = bool
  default = true
  nullable = false
}

variable "ingress_class_name" {
  type = string
  default = "nginx"
  nullable = false
}

variable "ingress_host_name" {
  type        = string
  description = "dns zone for ingress hosts."
  default = null
}

variable "additional_yaml_config" {
  type        = string
  default     = ""
  description = "values in raw yaml to pass to helm. Values will be merged, in order, as Helm does with multiple -f options."
}

Research CI migration to github actions

Expected benefits:
It should resolve the issue of missing jobs for forks.
Everything will be in one CI system.
Visibility will be better.

Provide helm logic for additional resource labels

Hello,
When deploying litmus resources to my environment, there are a few changes I need to make to meet infrastructure standards. Among them are identifying labels on resources to help correlate them with usage and owner. Following our discussion at https://kubernetes.slack.com/archives/CNXNB0ZTN/p1616624243043700 , @Jasstkn suggested that the helm charts could be refactored to render labels from values I provide, if they exist. This would be very helpful, let's please discuss.

Create a helmfile for litmuschaos charts

helmfile provides a declarative approach to managing helm charts/releases with the ability to bundle several disparate charts making up an application stack.

Add a helmfile for litmus, as there are several charts expected to come under this framework's purview
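
As a starting point, a minimal helmfile sketch. The repository URL and the `litmus` chart version are the ones that appear elsewhere on this page; the release names, the namespace, the second release, and the `needs` ordering are assumptions made for illustration.

```yaml
# helmfile.yaml (sketch; release names and namespace are illustrative)
repositories:
  - name: litmuschaos
    url: https://litmuschaos.github.io/litmus-helm

releases:
  # Control plane / portal chart.
  - name: chaos
    namespace: litmus
    chart: litmuschaos/litmus
    version: 2.10.2
    values:
      - ingress:
          enabled: false
  # Experiments chart, installed after the release above (assumed ordering).
  - name: k8s-chaos
    namespace: litmus
    chart: litmuschaos/kubernetes-chaos
    needs:
      - litmus/chaos
```

With a file like this, `helmfile apply` would install or upgrade both releases in one step.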

Litmus does not honor service.type field when ingress is enabled

We are hosting litmus on a Kubernetes cluster on AWS (EKS) and want to expose an external DNS name for litmus so that we can access it from outside the cluster.

We are using the AWS ingress controller (AWS Load Balancer Controller) for this, but it expects services to be of type NodePort as a prerequisite for creating the Application Load Balancer (ALB) and the external DNS record.

Here is the doc mentioning this: https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.4/guide/ingress/spec/
"The service, service-2048, must be of type NodePort in order for the provisioned ALB to route to it."

However, we observed that litmus forces services to be ClusterIP when ingress is enabled; it does not honor the service.type fields provided in values.yaml when the Helm release is installed.

Here is a link and a snippet from one of the chart's service templates:

Question: Why do resource names contain the resource type?

The manifest already has a ".kind" field that defines it.

It's a bit confusing.

[2.5.0] imagePullSecrets not used for internal pods

Hi everyone :), when I install release 2.5.0 (also tested with older versions), .Values.image.imageRegistryName is used as the prefix for image pulls (for the chaos-operator, event-tracker, subscriber and workflow-controller pods, for example), but .Values.image.imagePullSecrets is not used. This part of my values is configured as follows:

image:
  imagePullSecrets:
    - name: <SECRET NAME>
  imageRegistryName: <PRIVATE REGISTRY HERE>/litmuschaos

I still get ImagePullBackOff even though the image reference is correct (for example <PRIVATE REGISTRY HERE>/litmuschaos/litmusportal-event-tracker:2.5.0), because there is no imagePullSecrets entry in the rendered YAML. Has anyone else encountered this problem? Many thanks in advance!

Make ingress path configurable for litmus-2-0-0-beta

I would make a PR for this however I want to know if it is a good idea.

My team has been working to install Litmus on our system. We went with trying the Litmus Portal which is of course packaged into the litmus-2-0-0-beta helm package.

However, after setting up the ingress, we noticed that the path is not configurable. The path is by default set to /(star), however our team usually just sets the path to "/". I had to manually install the helm charts on our local repository and point the helm chart to our configuration to delete the extra (star), to get it working properly on our end.

Usually, this path field is configurable. I'm wondering if I can open a PR to patch this variable and make it configurable, or is there some reason that /(star) is the default path for the ingress currently that is hardcoded?

Thanks.

Update LITMUS_CORE_VERSION to v2.9.0

Should LITMUS_CORE_VERSION be updated to v2.9.0 to work well with v2.9.0? Thank you
Same question for LITMUS_CHAOS_OPERATOR_IMAGE, LITMUS_CHAOS_RUNNER_IMAGE and LITMUS_CHAOS_EXPORTER_IMAGE

chaos operator metrics initialization needs additional permissions

Upon start-up, chaos operator displays this failure in the log, before proceeding with reconcile tasks. Though the operator metrics are not really used/documented in litmus, it needs to be backed up with the right permissions:

{"level":"info","ts":1597150388.8992712,"logger":"cmd","msg":"Could not create metrics Service","error":"failed to initialize service object for metrics: &{{{%!w(string=) %!w(string=)} {%!w(string=) %!w(string=) %!w(string=) %!w(*int64=<nil>)} %!w(string=Failure) %!w(string=replicasets.apps \"litmus-6749f7f68b\" is forbidden: User \"system:serviceaccount:litmus:litmus\" cannot get resource \"replicasets\" in API group \"apps\" in the namespace \"litmus\") %!w(v1.StatusReason=Forbidden) %!w(*v1.StatusDetails=&{litmus-6749f7f68b apps replicasets  [] 0}) %!w(int32=403)}}"}

[2.0.0-beta]No self-cluster when deploying with helm

Hi,

When deploying the portal using the helm chart I don't have the Self-Cluster available in Targets.

When deploying with yaml file the cluster is there.

Is there a way to fix this? Should I install litmusctl?

Thank you,
John

[Feature] Allow external creation of admin secret

Currently this Helm chart forces its users to put sensitive information into values.yaml in clear text:

adminConfig:
  DBUSER: "admin"
  DBPASSWORD: "1234"
  JWTSecret: "litmus-portal@123"
  VERSION: "2.10.0"
  SKIP_SSL_VERIFY: "false"
  # -- leave empty if uses Mongo DB deployed by this chart
  DB_SERVER: ""
  DB_PORT: "27017"
  ADMIN_USERNAME: "admin"
  ADMIN_PASSWORD: "litmus"

This may force users to adopt bad practices, such as putting sensitive information into version control (like git). A good way to avoid this would be to add an "existingSecret" property or a flag that allows users to manage this secret outside of Helm.
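
A minimal sketch of what this could look like. The Secret is a standard Kubernetes object created outside Helm; the `existingSecret` key under `adminConfig` is the option proposed in this issue, not an existing chart value, and the Secret name is hypothetical.

```yaml
# Secret created outside Helm (for example with kubectl or a secrets operator).
apiVersion: v1
kind: Secret
metadata:
  name: litmus-admin-credentials   # hypothetical name
  namespace: litmus
type: Opaque
stringData:
  DBUSER: "admin"
  DBPASSWORD: "<db password>"
  JWTSecret: "<jwt secret>"
  ADMIN_USERNAME: "admin"
  ADMIN_PASSWORD: "<admin password>"
---
# values.yaml: existingSecret is the hypothetical key proposed here.
adminConfig:
  existingSecret: "litmus-admin-credentials"
  VERSION: "2.10.0"
  SKIP_SSL_VERIFY: "false"
  DB_SERVER: ""
  DB_PORT: "27017"
```

The chart templates would then skip rendering their own Secret when `existingSecret` is set and reference the named Secret from the pods instead, so only non-sensitive settings remain in version control.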

[BUG] Default ingress configuration does not match to default deployment

Ingress configuration instantiates the following rules (when values.ingress.enabled=true)

Host: my-host-name

Path: /(.*)
Link: http://my-host-name/(.*)
Backends: litmusportal-frontend-service:9091/backend/(.*)

Path: /backend/(.*)
Link: http://my-host-name/backend/(.*)
Backends: litmusportal-server-service:9002/backend/(.*)

But helm chart configures the following pods:

ReplicaSet [litmusportal-frontend-84b4986f48]
ReplicaSet [litmusportal-server-5c9965fbc]

As you can see, the names do not match, which leads to a 404 error.

Configure Ingress for Traefik 1.7 Ingress Controller

Hello Traefik users,

Here are the values you have to use to make litmus work with Traefik 1.7:

ingress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: traefik
    traefik.ingress.kubernetes.io/rewrite-target: /
  host:
    paths:
      frontend: "/"
      server: "/backend"

The first annotation tells Kubernetes to use the Traefik ingress controller.
The second annotation defines the rewrite target used for the server calls.
https://doc.traefik.io/traefik/v1.7/configuration/backends/kubernetes/#general-annotations

The paths have to change too, because Traefik doesn't handle (.*)

@ajesh

Allow specifying DB_URL instead of DB_SERVER and DB_PORT

When working with a managed database, there are cases where additional flags need to be added to the MongoDB connection string. Others require the use of multiple hosts. There should be a way to supply a custom database connection string.

Make ClusterRole and ServiceAccount optional on the operator chart

Today, deploying litmus, even in namespaced-mode, requires cluster admin privileges due to the need to create ClusterRoles and ClusterRoleBindings.

It would be great if there were an option to install litmus without creating cluster-level resources, or to let the installer pick an existing serviceAccount instead.

The openfaas chart is an example that follows this pattern. I already validated that the operator is able to run without this role, as long as the serviceAccount used by the operator has the correct permissions.

Missing feature when drag & drop workflow

Hi guys.

If I drag & drop a workflow, I can't modify the experiments graph and assign them a weight.
Am I doing something wrong, or is this a missing feature? If I use predefined workflows, everything works fine.

Thanks!

Prevent circle ci builds on the gh-pages branch

Currently, every merge commit triggers a build also on gh-pages branch, which doesn't contain any config file, resulting in multiple failed builds for this repo. There are two possible solutions (I think) for this:

Place a dummy ci config in gh-pages
Ignore gh-pages branch for the "lint" job in the workflow. The build job is anyways set to run "only" for master branch

The second option seems to be the better one, provided it works :)
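
A sketch of the second option using CircleCI's workflow branch filters. The workflow name is hypothetical and the job definitions are omitted; the job names `lint` and `build` are taken from the description above. Whether this actually prevents the failed builds depends on how CircleCI treats a branch that has no config file of its own, which is the open question in this issue.

```yaml
# .circleci/config.yml (fragment; job definitions omitted)
workflows:
  version: 2
  lint-and-build:            # hypothetical workflow name
    jobs:
      - lint:
          filters:
            branches:
              ignore: gh-pages   # skip the lint job on the gh-pages branch
      - build:
          filters:
            branches:
              only: master       # existing behaviour described in the issue
```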

Litmus front-end configmap custom changes

We have a frontend ConfigMap, but the default.conf in it is not templated, so there is nothing we can configure.

Could you accept the changes we make, or allow us to supply additional custom configuration, so that we don't have to apply it at the Kubernetes level and can manage it directly in the values file?
References : https://github.com/litmuschaos/litmus-helm/blob/master/charts/litmus/templates/controlplane-configs.yaml

Update litmus CRDs

Environment:

kind cluster
litmus installed via helm (following the docs)

Issue:

While trying to use httpProbe in an experiment, the chaos engine couldn't be initialized, e.g.:
Unable to initialize ChaosEngine, because of Update Error: ChaosEngine.litmuschaos.io \"nginx-chaos-1\" is invalid: spec.experiments.spec.probe.httpProbe/inputs.method: Invalid value: 1

Solution:

manually edit chaosengine with kubectl edit crd chaosengines.litmuschaos.io to remove maxProperties: 1 on httpProbe. Thanks to @ispeakc0de

Conclusion and bug report:

The latest CRDs from https://litmuschaos.github.io/litmus/litmus-operator-v1.13.0.yaml and the Helm CRDs are not in sync. The Helm CRDs need an update. I'm not sure whether other CRDs need to be synced as well.

Security Issue

Hardcoded passwords
#179

Wrong service-name

litmus-helm/charts/litmus/templates/ingress.yaml

Line 44 in ea9afc5

name: litmusportal-frontend-service

The service-name needs to be updated with the new value. It's the same on the other lines of this file where service name is referenced.

Add param in values to enable/disable google analytics

In Litmus, we collect the following usage metrics from the litmus deployments - (a) number of operator installations (b) run count of a particular experiment.
This is done in order to gather chaos trends that will help us to improve the project.
However, in some cases where there are compliance requirements or in air-gapped environments, it might be necessary to turn the GA off.
This is possible today via an ENV in the litmus chaos operator deployment: refer https://docs.litmuschaos.io/docs/faq-general/#does-litmus-track-any-usage-metrics-on-the-test-clusters
This needs to be added as a Helm tunable, with the default set to 'true'. For example, in the operator:
```
policies:
  monitoring:
    enabled: true
```
With the env set to "FALSE" only when the operator.policies.monitoring.enabled is false.

Allow for configuring service account for mongo

For users that use PSPs, we need to bind a PSP to the mongo service account, in order to be able to provide permissions to the mongo pod. Currently, the mongo pod crashloops with

chown: changing ownership of '/data/db': Operation not permitted
chown: changing ownership of '/data/db/lost+found': Operation not permitted

Service type hardcoded in litmus-portal chart

Can't install litmus with the namespace scope

When I try to install litmus 2.0.32 using a namespace-scoped account, the installation fails.

This is because the role we use asks for nodes and namespaces permissions, which is not namespace scoped.

k8s-chaos-admin service account permissions

In this file, the cluster role is missing a few permissions required for some of the experiments: https://github.com/litmuschaos/litmus-helm/blob/7537fcd3f8caa4921b68525842707bc22c092b89/charts/kubernetes-chaos/templates/clusterrole.yaml

resources: networkpolicies,replicationcontrollers, deploymentconfigs, rollouts
apiGroups: argoproj.io, apps.openshift.io
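
A sketch of how the missing rules might look. The resources and the two API groups named above come from this issue; `networking.k8s.io` and the core group (`""`) are the standard groups for networkpolicies and replicationcontrollers. The verbs and the ClusterRole name are assumptions, since the issue does not list them.

```yaml
# Sketch of additional rules for the chaos admin ClusterRole.
# The verbs and the role name are assumptions; the issue lists only
# resources and API groups.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: k8s-chaos-admin
rules:
  - apiGroups: ["networking.k8s.io"]
    resources: ["networkpolicies"]
    verbs: ["create", "delete", "get", "list"]
  - apiGroups: [""]
    resources: ["replicationcontrollers"]
    verbs: ["get", "list"]
  - apiGroups: ["apps.openshift.io"]
    resources: ["deploymentconfigs"]
    verbs: ["get", "list"]
  - apiGroups: ["argoproj.io"]
    resources: ["rollouts"]
    verbs: ["get", "list"]
```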

Release litmus-1.11.1.tgz

I evaluated the node-restart experiment, but it failed due to an exception in go-runner 1.11.0.
It seems go-runner 1.11.1 has been released. I would like to evaluate it, so could you release the chart? Thanks!

Feature request: is it possible to introduce a new role for an user which allows only start/rerun existing workflows

Hello!

We would like to be able to invite external users to a litmus project who can run workflows that are already configured.
We don't want them to create, delete, or edit existing workflows.

Thx!

Create job for full cycle test: deploy litmus-chaos -> deploy experiments -> deploy engine

Enable helm chart test

It would be better to use automated tests that have already been written.
I don't know which CI service you use, but we could do this with GitHub Actions or something else:
https://github.com/helm/chart-testing
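
A rough sketch of such a workflow with GitHub Actions and the chart-testing (`ct`) tool linked above. The file name, the action versions, and the `master` target branch are assumptions; `ct` looks for charts in a `charts` directory by default.

```yaml
# .github/workflows/lint-test.yaml (sketch; action versions are illustrative)
name: Lint and Test Charts
on: pull_request
jobs:
  lint-test:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
        with:
          fetch-depth: 0            # ct needs history to detect changed charts
      - name: Set up Helm
        uses: azure/setup-helm@v4.2.0
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.x"
      - name: Set up chart-testing
        uses: helm/chart-testing-action@v2.6.1
      - name: Lint changed charts
        run: ct lint --target-branch master
      - name: Create kind cluster
        uses: helm/kind-action@v1.10.0
      - name: Install changed charts
        run: ct install --target-branch master
```

The lint step alone is cheap to adopt; the kind cluster and `ct install` steps add an install test for each changed chart.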

Mongo DB_{USER,PASSWORD} and JWTSecret are stored in configmap and weak

The Mongo credentials are stored in a ConfigMap.
That's not good for security, and it will be harder to manage with RBAC roles.

Secrets should be stored in a Kubernetes Secret.

The default DB password is also weak.
We could generate a random secret instead.

Here is an implementation of random mongo db password into a kubernetes secret:
Vr00mm@b0d675e

That will break helm upgrades from beta8 to beta9 if the current password is not set into .Values.adminConfig.

Tell me what you think about this change.

Best Regards,
Rémi

Mongo StatefulSet doesn't set correct security context

The Mongo DB StatefulSet pod doesn't set the correct security context parameters. This means that even if you have configured it to use a service account, and have set up the correct PSP, the PSP may not be chosen because the pod doesn't declare that it needs to run as non root, and that it needs CAP_CHOWN, CAP_SETUID and CAP_SETGID.
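
A sketch of the relevant part of the pod template. In Kubernetes, capability names are written without the `CAP_` prefix. The service account name and image tag are hypothetical, and the run-as-user settings mentioned above are omitted because the right values depend on the image.

```yaml
# Fragment of a StatefulSet pod template (sketch; names and tag are illustrative).
spec:
  template:
    spec:
      serviceAccountName: litmus-mongo   # hypothetical account bound to the PSP
      containers:
        - name: mongo
          image: mongo:4.2               # illustrative tag
          securityContext:
            capabilities:
              # CAP_CHOWN, CAP_SETUID and CAP_SETGID from the issue text.
              add: ["CHOWN", "SETUID", "SETGID"]
```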

Administrator mode for helm-chart

I would like to give litmus cluster-wide access during HA tests, which will include:

installation of litmus
run test
generate report
delete litmus.

Could you advise whether it's possible to add such an option to the Helm chart?
I can help with the contribution. Let me know what do you think about that.
thanks!

I cloned the repo and installed Litmus as per docs
Connected to the portal front end and created my first project
< Pods created >...
The event-tracker pod reports that the 'litmus-portal-config' configmap is missing

NAME                                                      READY   STATUS             RESTARTS   AGE
litmuschaos-litmus-2-0-0-beta-frontend-6bbdb89479-pbm2d   1/1     Running            0          7m38s
litmuschaos-litmus-2-0-0-beta-server-7f68d99757-nnk9l     2/2     Running            0          7m38s
litmuschaos-litmus-2-0-0-beta-mongo-0                     1/1     Running            0          7m38s
workflow-controller-5f8d8c5646-td8l5                      1/1     Running            0          7m17s
chaos-operator-ce-86c98974b9-msw8q                        1/1     Running            0          7m17s
chaos-exporter-568df7fd76-gs9tc                           1/1     Running            0          7m17s
argo-server-cd89c4579-l2wlm                               1/1     Running            0          7m17s
subscriber-6bffc4b64c-7bh2b                               0/1     CrashLoopBackOff   6          7m17s
event-tracker-7dcd96fd85-4nl46                            0/1     CrashLoopBackOff   6          7m17s

kubectl logs event-tracker-7dcd96fd85-4nl46 -n litmus
2021/03/18 10:15:10 configmaps "litmus-portal-config" not found

The subscriber reports the following, but I'm guessing this is a consequence of the above.

kubectl logs subscriber-6bffc4b64c-7bh2b -n litmus
2021/03/18 10:15:04 Go Version: go1.14.15
2021/03/18 10:15:04 Go OS/Arch: linux/amd64
2021/03/18 10:15:04 Post "http://172.27.250.23:0/query": dial tcp 172.27.250.23:0: connect: connection refused

litmus portal
prometheus
Authentication

Allow users to define namespace for litmus-portal

We need to allow users to define namespace for litmus-portal. It's impossible for now because of: https://github.com/litmuschaos/litmus-helm/blob/master/charts/litmus-portal/values.yaml#L6