litmuschaos / litmus-helm
Helm Charts for the Litmus Chaos Operator & CRDs
License: Apache License 2.0
Consider using an external MongoDB Helm chart as a dependency, to avoid the extra work of maintaining and supporting our own.
For example, we could use this chart from Bitnami: https://github.com/bitnami/charts/tree/master/bitnami/mongodb
We'd like to be able to set the priorityClassName in litmus components (runner, operator and exporter).
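For reference, this is roughly what a chart value would need to render at the Kubernetes level. It is a minimal sketch: the class name `litmus-high-priority` and its numeric value are hypothetical, and the second document is only a fragment of a pod template. `PriorityClass` and the pod-level `priorityClassName` field are standard Kubernetes API.

```yaml
# A cluster-scoped priority class (name and value are hypothetical).
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: litmus-high-priority
value: 100000
globalDefault: false
description: "Illustrative priority class for Litmus components."
---
# Fragment of a Deployment: the pod template references the class by name.
spec:
  template:
    spec:
      priorityClassName: litmus-high-priority
```

Exposing this per component (runner, operator, exporter) would let each one reference a different class.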
Context: #265 (comment)
Example issue: #257 (impossible to change values independently)
This structure also makes the values harder for users to use and manage.
I want to run the agent plane resources (subscriber, event-tracker, litmus-operator-ce, workflow-controller) on particular nodes.
I found that these resources can be run with a node selector (litmuschaos/litmus), but I can't work out how to do it using the chart.
Can someone help?
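As a stopgap outside the chart, a node selector can be added to a Deployment with a strategic merge patch. This is only a sketch: the label `workload-type: chaos` is hypothetical, and the patch does not answer how to set the value through the chart itself. A later `helm upgrade` may also revert it.

```yaml
# patch-node-selector.yaml (hypothetical file name)
# Strategic merge patch that pins a Deployment's pods to labelled nodes.
spec:
  template:
    spec:
      nodeSelector:
        workload-type: chaos   # hypothetical node label
```

Assuming a Deployment named `subscriber` in the `litmus` namespace, it could be applied with `kubectl patch deployment subscriber -n litmus --patch-file patch-node-selector.yaml`, and likewise for the other agent plane Deployments.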
Using v2.10.0 and v2.10.2 of the litmus chaos chart. I pass in the following values file via Terraform:
adminConfig:
  ADMIN_USERNAME: admin
  ADMIN_PASSWORD: <password>
and do see the user and password values updated in the secret on K8s. However, when I try to log in I get denied. I also tried the default admin/litmus creds and they didn't work either. Checking the auth logs I see:
time="2022-06-30T14:10:19Z" level=info msg="users's collection already exists, continuing with the existing mongo collection"
time="2022-06-30T14:10:19Z" level=info msg="project's collection already exists, continuing with the existing mongo collection"
time="2022-06-30T14:10:23Z" level=info msg="Admin already exists in the database, not creating a new admin"
time="2022-06-30T14:10:23Z" level=info msg="Listening and serving HTTP on :3000"
time="2022-06-30T14:10:23Z" level=info msg="Listening and serving gRPC on :3030"
time="2022-06-30T14:25:32Z" level=warning msg="password doesn't match"
[GIN] 2022/06/30 - 14:25:32 | 401 | 3.831298258s | 10.145.2.23 | POST "/login"
To me this suggests that an admin account already exists in the database and a new one cannot be added. Any advice on how to pass in the username and password so that I can log in, or on how to access MongoDB to get the credentials, would be very helpful.
Additional Files:
main.tf
resource "random_password" "admin_password" {
  count   = var.admin_password == null ? 1 : 0
  length  = 10
  special = true
}

resource "helm_release" "litmus_chaos" {
  name             = "chaos"
  repository       = "https://litmuschaos.github.io/litmus-helm"
  chart            = "litmus"
  version          = "2.10.2"
  create_namespace = var.create_namespace
  namespace        = var.namespace
  values = [
    yamlencode(local.chart_values),
    var.additional_yaml_config
  ]
}
locals.tf
locals {
  chart_values = {
    adminConfig = {
      ADMIN_USERNAME = var.admin_username
      ADMIN_PASSWORD = var.admin_password == null ? random_password.admin_password.0.result : var.admin_password
    }
    ingress = {
      enabled = true
      host = {
        name = var.ingress_host_name == null ? format("%s-%s-%s.azure.lnrsg.io", (length(split(var.names.location, "usgov")) > 1 ? "usgov" : "us"), var.names.product_name, var.names.environment) : var.ingress_host_name
      }
      ingress_class_name = var.ingress_class_name
      annotations = {
        "cert-manager.io/cluster-issuer"             = "letsencrypt-issuer"
        "lnrs.io/zone-type"                          = "public"
        "nginx.ingress.kubernetes.io/rewrite-target" = "/$1"
      }
    }
  }
}
variables.tf
variable "names" {
  description = "Names to be applied to resources"
  type        = map(string)
}

variable "namespace" {
  type        = string
  description = "Namespace of aks helm release."
  default     = "litmus"
  nullable    = false
}

variable "create_namespace" {
  type        = bool
  description = "Create the namespace for the helm release."
  default     = true
  nullable    = false
}

variable "admin_username" {
  type        = string
  default     = "admin"
  description = "username for server admin."
}

variable "admin_password" {
  type        = string
  default     = null
  description = "password for server admin."
}

# variable "cluster_issuer" {
#   type        = string
#   default     = "letsencrypt-acme-production"
#   description = "cert manager cluster issuer for ingress."
# }

# Ingress
variable "ingress_enabled" {
  type     = bool
  default  = true
  nullable = false
}

variable "ingress_class_name" {
  type     = string
  default  = "nginx"
  nullable = false
}

variable "ingress_host_name" {
  type        = string
  description = "dns zone for ingress hosts."
  default     = null
}

variable "additional_yaml_config" {
  type        = string
  default     = ""
  description = "values in raw yaml to pass to helm. Values will be merged, in order, as Helm does with multiple -f options."
}
Hello,
When deploying litmus resources to my environment, there are a few changes I need to make to meet infrastructure standards. Among them are identifying labels on resources to help correlate them with usage and owner. Following our discussion at https://kubernetes.slack.com/archives/CNXNB0ZTN/p1616624243043700 , @Jasstkn suggested that the helm charts could be refactored to render labels from values I provide, if they exist. This would be very helpful, let's please discuss.
helmfile provides a declarative approach to managing helm charts/releases with the ability to bundle several disparate charts making up an application stack.
Add a helmfile for litmus, as there are several charts expected to come under this framework's purview
We are hosting Litmus on a Kubernetes cluster on AWS (EKS), and we want to expose an external DNS name for Litmus so that we can access it from outside.
We are using the AWS ingress controller for this (AWS Load Balancer Controller), but it expects services to be of type NodePort as a prerequisite for creating the application load balancer (ALB) and the external DNS record.
Here is the doc that mentions this: https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.4/guide/ingress/spec/
The service, service-2048, must be of type NodePort in order for the provisioned ALB to route to it.
However, we observed that Litmus forces services to be ClusterIP if ingress is enabled. When ingress is enabled, it does not honor the service.type fields provided in values.yaml when installing the Helm release.
Here is the link and a snippet from one of the chart's service templates:
Question: why do resource names contain the resource type?
The manifest already has a ".kind" field that defines it.
It's a bit confusing.
Hi everyone :), when I install release 2.5.0 (also tested with older versions), .Values.image.imageRegistryName is used as the prefix for image pulls (for example for the chaos-operator, event-tracker, subscriber, and workflow-controller pods), but .Values.image.imagePullSecrets is not applied. The relevant part of my values is:
image:
  imagePullSecrets:
    - name: <SECRET NAME>
  imageRegistryName: <PRIVATE REGISTRY HERE>/litmuschaos
I still get ImagePullBackOff with the correct image name (for example <PRIVATE REGISTRY HERE>/litmuschaos/litmusportal-event-tracker:2.5.0), but there is no imagePullSecrets entry in the rendered YAML. Has anyone else encountered this problem? Many thanks in advance!
I would open a PR for this, but first I want to know whether it is a good idea.
My team has been working to install Litmus on our system. We went with trying the Litmus Portal which is of course packaged into the litmus-2-0-0-beta helm package.
However, after setting up the ingress, we noticed that the path is not configurable. The path is set to /(star) by default, whereas our team usually just sets the path to "/". I had to manually install the Helm charts in our local repository and point the chart to our configuration to delete the extra (star) and get it working properly on our end.
Usually, this path field is configurable. I'm wondering whether I can open a PR to make it configurable, or whether there is some reason that /(star) is currently hardcoded as the default path for the ingress.
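For comparison, an Ingress with a plain "/" prefix path looks roughly like the sketch below. The Ingress name and host are hypothetical, and the backend service name and port (`litmusportal-frontend-service`, 9091) are assumptions borrowed from another report on this page. They may differ in the 2.0.0-beta chart.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: litmus-portal              # hypothetical name
spec:
  rules:
    - host: litmus.example.com     # hypothetical host
      http:
        paths:
          - path: /                # plain prefix instead of a regex path
            pathType: Prefix
            backend:
              service:
                name: litmusportal-frontend-service   # assumed service name
                port:
                  number: 9091                        # assumed port
```

Making the chart render `path` from a value, with the current regex as its default, would keep existing installs unchanged.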
Thanks.
Should LITMUS_CORE_VERSION be updated to v2.9.0 to work well with v2.9.0? Thank you
Same question for LITMUS_CHAOS_OPERATOR_IMAGE, LITMUS_CHAOS_RUNNER_IMAGE and LITMUS_CHAOS_EXPORTER_IMAGE
Upon start-up, chaos operator displays this failure in the log, before proceeding with reconcile tasks. Though the operator metrics are not really used/documented in litmus, it needs to be backed up with the right permissions:
{"level":"info","ts":1597150388.8992712,"logger":"cmd","msg":"Could not create metrics Service","error":"failed to initialize service object for metrics: &{{{%!w(string=) %!w(string=)} {%!w(string=) %!w(string=) %!w(string=) %!w(*int64=<nil>)} %!w(string=Failure) %!w(string=replicasets.apps \"litmus-6749f7f68b\" is forbidden: User \"system:serviceaccount:litmus:litmus\" cannot get resource \"replicasets\" in API group \"apps\" in the namespace \"litmus\") %!w(v1.StatusReason=Forbidden) %!w(*v1.StatusDetails=&{litmus-6749f7f68b apps replicasets [] 0}) %!w(int32=403)}}"}
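The error above names the missing permission directly: `get` on `replicasets` in the `apps` API group, for the `litmus` service account in the `litmus` namespace. A minimal sketch of a namespaced Role and RoleBinding granting exactly that follows. The object name `litmus-metrics-replicasets` is hypothetical, and the operator may need further permissions to finish creating the metrics Service.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: litmus-metrics-replicasets   # hypothetical name
  namespace: litmus
rules:
  # Grants only what the log message reports as forbidden.
  - apiGroups: ["apps"]
    resources: ["replicasets"]
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: litmus-metrics-replicasets   # hypothetical name
  namespace: litmus
subjects:
  - kind: ServiceAccount
    name: litmus                     # service account from the error message
    namespace: litmus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: litmus-metrics-replicasets
```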
Hi,
When deploying the portal using the helm chart I don't have the Self-Cluster available in Targets.
When deploying with yaml file the cluster is there.
Is there a way to fix this? Should I install litmusctl?
Thank you,
John
Currently this Helm chart forces its users to put sensitive information into values.yaml in clear text:
adminConfig:
  DBUSER: "admin"
  DBPASSWORD: "1234"
  JWTSecret: "litmus-portal@123"
  VERSION: "2.10.0"
  SKIP_SSL_VERIFY: "false"
  # -- leave empty if uses Mongo DB deployed by this chart
  DB_SERVER: ""
  DB_PORT: "27017"
  ADMIN_USERNAME: "admin"
  ADMIN_PASSWORD: "litmus"
This may force users into bad practices, such as committing sensitive information to version control (like git). A good way around this would be an "existingSecret" property, or a flag, that allows users to manage this secret outside of Helm.
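A sketch of how that could look from the user's side. The Secret is standard Kubernetes, and its name and namespace here are hypothetical. The `existingSecret` key in the second document is the property proposed above, not something the chart is confirmed to support.

```yaml
# Secret created and managed outside of Helm (for example by a secrets operator).
apiVersion: v1
kind: Secret
metadata:
  name: litmus-admin-credentials   # hypothetical name
  namespace: litmus                # assumed namespace
type: Opaque
stringData:
  DBUSER: "<db user>"
  DBPASSWORD: "<db password>"
  JWTSecret: "<jwt secret>"
  ADMIN_USERNAME: "<admin user>"
  ADMIN_PASSWORD: "<admin password>"
---
# Hypothetical values.yaml fragment: the chart would reference the Secret
# by name instead of rendering the credentials itself.
adminConfig:
  existingSecret: litmus-admin-credentials
```

With this shape, values.yaml can be committed to version control without containing any credentials.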
The ingress configuration instantiates the following rules (when values.ingress.enabled=true):
Host: my-host-name
Path: /(.*)
Link: http://my-host-name/(.*)
Backends: litmusportal-frontend-service:9091/backend/(.*)
Path: /backend/(.*)
Link: http://my-host-name/backend/(.*)
Backends: litmusportal-server-service:9002/backend/(.*)
But the Helm chart configures the following pods:
ReplicaSet [litmusportal-frontend-84b4986f48]
ReplicaSet [litmusportal-server-5c9965fbc]
As you can see, the names do not match, which leads to a 404 error.
Hello Traefik users,
Here are the values you need to make Litmus work with Traefik 1.7:
ingress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: traefik
    traefik.ingress.kubernetes.io/rewrite-target: /
  host:
    paths:
      frontend: "/"
      server: "/backend"
The first annotation tells Kubernetes to use the Traefik ingress controller.
The second annotation defines the rewrite target used for the server calls.
https://doc.traefik.io/traefik/v1.7/configuration/backends/kubernetes/#general-annotations
The paths have to change too, because Traefik doesn't handle (.*).
When working with a managed database, there are cases where additional flags need to be added to the MongoDB connection string. Others may require the use of multiple hosts. There should be a way to supply a custom database connection string.
Today, deploying litmus, even in namespaced-mode, requires cluster admin privileges due to the need to create ClusterRoles and ClusterRoleBindings.
It would be great if there were an option to install Litmus without creating cluster-level resources, or to let the installer pick an existing serviceAccount instead.
The openfaas chart is an example that follows this pattern. I already validated that the operator is able to run without this role, as long as the serviceAccount used by the operator has the correct permissions.
Hi guys.
If I drag & drop a workflow, I can't modify the experiments graph and assign them a weight.
Am I doing something wrong, or is it a missing feature? If I use predefined workflows, everything works fine.
Thanks!
Currently, every merge commit triggers a build also on gh-pages branch, which doesn't contain any config file, resulting in multiple failed builds for this repo. There are two possible solutions (I think) for this:
The second option seems to be the better one, provided it works :)
In the frontend ConfigMap, default.conf is not templated, so there is nothing we can configure.
Could you either accept the changes we make,
or allow us to supply additional custom configuration through the values file, so that we don't have to apply it at the Kubernetes level and can keep it in the values file directly?
Reference: https://github.com/litmuschaos/litmus-helm/blob/master/charts/litmus/templates/controlplane-configs.yaml
Environment:
kind cluster
Issue:
While trying to use httpProbe in an experiment, the chaos engine couldn't be initialized, e.g.
Unable to initialize ChaosEngine, because of Update Error: ChaosEngine.litmuschaos.io \"nginx-chaos-1\" is invalid: spec.experiments.spec.probe.httpProbe/inputs.method: Invalid value: 1
Solution:
Manually edit the ChaosEngine CRD with kubectl edit crd chaosengines.litmuschaos.io to remove maxProperties: 1 on httpProbe. Thanks to @ispeakc0de
Conclusion and bug report:
The latest CRDs from https://litmuschaos.github.io/litmus/litmus-operator-v1.13.0.yaml and the Helm CRDs are not in sync. The Helm CRDs need an update. I'm not sure whether other CRDs also need to be synced.
Hardcoded passwords
#179
The service-name needs to be updated with the new value. It's the same on the other lines of this file where service name is referenced.
In Litmus, we collect the following usage metrics from the litmus deployments - (a) number of operator installations (b) run count of a particular experiment.
This is done in order to gather chaos trends that will help us to improve the project.
However, in some cases where there are compliance requirements or in air-gapped environments, it might be necessary to turn the GA off.
This is possible today via an ENV in the litmus chaos operator deployment: see https://docs.litmuschaos.io/docs/faq-general/#does-litmus-track-any-usage-metrics-on-the-test-clusters
This needs to be added as a Helm tunable, with the default set to 'true'. For example, in the operator:
policies:
  monitoring:
    enabled: true
The env should be set to "FALSE" only when operator.policies.monitoring.enabled is false.
For users that use PSPs, we need to bind a PSP to the mongo service account, in order to be able to provide permissions to the mongo pod. Currently, the mongo pod crashloops with
chown: changing ownership of '/data/db': Operation not permitted
chown: changing ownership of '/data/db/lost+found': Operation not permitted
When I try to install litmus 2.0.32 using a namespace-scoped account, the installation fails.
This is because the role we use asks for nodes and namespaces permissions, which are not namespace scoped.
In this file, the cluster role is missing a few permissions required for some of the experiments: https://github.com/litmuschaos/litmus-helm/blob/7537fcd3f8caa4921b68525842707bc22c092b89/charts/kubernetes-chaos/templates/clusterrole.yaml
Resources: networkpolicies, replicationcontrollers, deploymentconfigs, rollouts
API groups: argoproj.io, apps.openshift.io
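Expressed as ClusterRole rules, the missing entries would look roughly like this fragment. The resources and the two API groups come from the report above, `networking.k8s.io` and the core group are the standard homes of the other two resources, and the verbs are an assumption, since each experiment may need a different set.

```yaml
# Fragment to merge into the existing ClusterRole's `rules:` list.
rules:
  - apiGroups: ["networking.k8s.io"]
    resources: ["networkpolicies"]
    verbs: ["get", "list"]          # assumed verbs
  - apiGroups: [""]                 # core API group
    resources: ["replicationcontrollers"]
    verbs: ["get", "list"]          # assumed verbs
  - apiGroups: ["apps.openshift.io"]
    resources: ["deploymentconfigs"]
    verbs: ["get", "list"]          # assumed verbs
  - apiGroups: ["argoproj.io"]
    resources: ["rollouts"]
    verbs: ["get", "list"]          # assumed verbs
```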
I evaluated the node-restart experiment, but it failed due to an exception in go-runner 1.11.0.
It seems go-runner 1.11.1 has been released. I would like to evaluate it, so could you release the chart? Thanks!
Hello!
We would like to be able to invite external users to a Litmus project who can run workflows that are already configured.
We don't want them to create, delete, or edit existing workflows.
Thx!
It would be better to have automated tests, using tooling that has already been written.
I don't know what your CI service is, but we can do it using GitHub Actions or something else.
https://github.com/helm/chart-testing
The Mongo credentials are stored in a ConfigMap.
That's not good for security, and it makes them harder to manage with RBAC roles.
A secret should be stored in a Kubernetes Secret.
The default DB password is also weak.
We can generate a random secret instead.
Here is an implementation that stores a random MongoDB password in a Kubernetes Secret:
Vr00mm@b0d675e
This will break Helm upgrades from beta8 to beta9 if the current password is not set in .Values.adminConfig.
Tell me what you think about this change.
Best Regards,
Rémi
The Mongo DB StatefulSet pod doesn't set the correct security context parameters. This means that even if you have configured it to use a service account, and have set up the correct PSP, the PSP may not be chosen because the pod doesn't declare that it needs to run as non root, and that it needs CAP_CHOWN, CAP_SETUID and CAP_SETGID.
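In a pod spec, those capabilities are requested in the container's `securityContext`, and Kubernetes expects the names without the `CAP_` prefix. The fragment below is a minimal sketch: the container name `mongo` is hypothetical, and the user-related settings (such as `runAsNonRoot` or `runAsUser`) are left out because they depend on how the image starts.

```yaml
# Fragment of the StatefulSet's pod template.
spec:
  template:
    spec:
      containers:
        - name: mongo                # hypothetical container name
          securityContext:
            capabilities:
              # Kubernetes capability names omit the CAP_ prefix.
              add: ["CHOWN", "SETUID", "SETGID"]
```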
I would like to provide cluster-wide access for Litmus during the HA tests, which will include:
Could you advise whether it's possible to add such an option to the Helm chart?
I can help with the contribution. Let me know what you think.
thanks!
We can either add an alternative MongoDB solution to our Helm chart itself, or add some documentation on how to use an external MongoDB instance.
Reference for MongoDB not working on Mac M1: bitnami/charts#7305
Hi,
Chaos scheduler is not part of the helm installation. If you can add it, it would be very helpful :)
https://litmuschaos.github.io/litmus/chaos-scheduler-v1.13.8.yaml
This request is to help this chart support best practices. Best practices for a Kubernetes deployment usually include ensuring that your deployment only allows images from your internal registry.
Currently, this chart does not support changing the image registry.
Hi
NAME                                                      READY   STATUS             RESTARTS   AGE
litmuschaos-litmus-2-0-0-beta-frontend-6bbdb89479-pbm2d   1/1     Running            0          7m38s
litmuschaos-litmus-2-0-0-beta-server-7f68d99757-nnk9l     2/2     Running            0          7m38s
litmuschaos-litmus-2-0-0-beta-mongo-0                     1/1     Running            0          7m38s
workflow-controller-5f8d8c5646-td8l5                      1/1     Running            0          7m17s
chaos-operator-ce-86c98974b9-msw8q                        1/1     Running            0          7m17s
chaos-exporter-568df7fd76-gs9tc                           1/1     Running            0          7m17s
argo-server-cd89c4579-l2wlm                               1/1     Running            0          7m17s
subscriber-6bffc4b64c-7bh2b                               0/1     CrashLoopBackOff   6          7m17s
event-tracker-7dcd96fd85-4nl46                            0/1     CrashLoopBackOff   6          7m17s
kubectl logs event-tracker-7dcd96fd85-4nl46 -n litmus
2021/03/18 10:15:10 configmaps "litmus-portal-config" not found
The subscriber reports the following, but I'm guessing this is a consequence of the above.
kubectl logs subscriber-6bffc4b64c-7bh2b -n litmus
2021/03/18 10:15:04 Go Version: go1.14.15
2021/03/18 10:15:04 Go OS/Arch: linux/amd64
2021/03/18 10:15:04 Post "http://172.27.250.23:0/query": dial tcp 172.27.250.23:0: connect: connection refused
We need to minimize this soon, by next release at least @rajdas98 !
Originally posted by @ksatchit in #55 (comment)
We need to allow users to define namespace for litmus-portal. It's impossible for now because of: https://github.com/litmuschaos/litmus-helm/blob/master/charts/litmus-portal/values.yaml#L6