stolostron / multicluster-observability-operator

Operator for Multi-Cluster Monitoring with Thanos.

License: Apache License 2.0

Dockerfile 0.93% Shell 6.18% Go 91.78% Makefile 1.12%

monitoring thanos prometheus grafana openshift-operator observability open-cluster-management kubernetes

multicluster-observability-operator's Issues

Object aliases are already widely used in Nginx Operator.

Describe the bug
Kubernetes resources and their short names are distinguished only by their API group (domain).
Resources that share the same name are common, and it is easy to forget about that.
In this case, however, ACM uses a resource name that is already used in Nginx-driven environments.
That can lead to tedious situations.

To Reproduce
Install Open Cluster Management (Red Hat Advanced Cluster Management) and the Nginx Ingress Controller Operator.
Try to get a policy (and here is the problem: which one?).

oc api-resources | egrep '^NAME| plc | pol '
NAME       SHORTNAMES   APIGROUP                            NAMESPACED   KIND
policies   plc          policy.open-cluster-management.io   true         Policy
policies   pol          k8s.nginx.org                       true         Policy

Workaround
Use either the short name or the fully qualified resource name:

plc or policies.policy.open-cluster-management.io
pol or policies.k8s.nginx.org
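
To spot this kind of collision before it causes confusion, one option is to scan the fully qualified names printed by `oc api-resources -o name` (lines of the form `name.group`) and report every resource name that appears in more than one API group. This is a minimal sketch; the function name `find_colliding_resources` is hypothetical and not part of any of the operators involved.

```shell
# Hypothetical helper: reads "name.group" lines on stdin and prints every
# resource name that is served by more than one API group.
find_colliding_resources() {
  awk '
    {
      dot = index($0, ".")
      if (dot == 0) {
        # Core resources such as "pods" carry no group suffix.
        name = $0; group = "(core)"
      } else {
        name = substr($0, 1, dot - 1)
        group = substr($0, dot + 1)
      }
      if (name in groups) {
        groups[name] = groups[name] "," group
        dup[name] = 1
      } else {
        groups[name] = group
      }
    }
    END { for (n in dup) print n ": " groups[n] }
  ' | sort
}
```

On a live cluster it could be used as `oc api-resources -o name | find_colliding_resources`.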

Solution
There is no solution apart from changing the names.
I won't get into who did it first,
but at least there is now a trace of it in the issues.

username has to be encoded in switch-to-grafana-admin.sh script

I followed the instructions to set up the Grafana developer instance:
https://access.redhat.com/documentation/en-us/red_hat_advanced_cluster_management_for_kubernetes/2.2/html/observing_environments/observing-environments-intro#designing-your-grafana-dashboard

When I ran "switch-to-grafana-admin.sh", I ran into a problem because my ID is "IAM#changwoo", since I installed RHACM on ROKS (Red Hat OpenShift on IBM Cloud).

Basically, the "#" in the user name breaks the lookup URL unless it is URL-encoded.

I suggest a fix along these lines at line 70:

encoded_user_name=$(echo -n "$user_name" | jq -sRr '@uri')
userID=$($curlCMD -s -X GET -H "Content-Type: application/json" -H "X-Forwarded-User: $XForwardedUser" "127.0.0.1:3001/api/users/lookup?loginOrEmail=$encoded_user_name" | $PYTHON_CMD -c "import sys, json; print(json.load(sys.stdin)['id'])" 2>/dev/null)

orgID=$($curlCMD -s -X GET -H "Content-Type: application/json" -H "X-Forwarded-User: $XForwardedUser" "127.0.0.1:3001/api/users/lookup?loginOrEmail=$encoded_user_name" | $PYTHON_CMD -c "import sys, json; print(json.load(sys.stdin)['orgId'])" 2>/dev/null)
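
If jq is not available where the script runs, the same percent-encoding step can be sketched in plain POSIX shell. This is only an illustration under that assumption: the function name `urlencode` is hypothetical, it is not part of the script, and it is meant for ASCII input such as the user name above.

```shell
# Hypothetical helper: percent-encode every character outside the RFC 3986
# "unreserved" set (letters, digits, ".", "~", "_", "-"). ASCII input only.
# The body runs in a subshell so its variables do not leak into the caller.
urlencode() (
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}          # everything after the first character
    ch=${s%"$rest"}      # the first character itself
    case $ch in
      [A-Za-z0-9.~_-]) out=$out$ch ;;
      # printf with a leading quote yields the character's numeric code.
      *) out=$out$(printf '%%%02X' "'$ch") ;;
    esac
    s=$rest
  done
  printf '%s\n' "$out"
)
```

For example, `urlencode 'IAM#changwoo'` prints `IAM%23changwoo`, which is safe to place after `loginOrEmail=` in the lookup URL.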

generate-dashboard-configmap-yaml.sh : asterisk character in panel query is incorrectly evaluated

The script generate-dashboard-configmap-yaml.sh does not correctly export a dashboard whose panel queries contain an asterisk character.

The issue occurs when the script is executed in a non-empty directory.

To reproduce this issue, execute the following steps:

1. Create a dashboard containing a panel with the following metric: "100 * 100".
2. Open a terminal session and verify that the current directory is not empty.
3. Execute the script generate-dashboard-configmap-yaml.sh to export the dashboard created in step 1.
4. Inspect the YAML file containing the dashboard exported to a ConfigMap: the metric "100 * 100" is not present. The character "*" has been replaced with the list of files in the current directory. This is not the expected behaviour.
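
The symptom is consistent with a shell variable being expanded without quotes somewhere in the script, although the exact line has not been identified here. The following sketch reproduces that behaviour in isolation; `demo_glob` and the file names are hypothetical and do not come from the script.

```shell
# Hypothetical reproduction of unquoted vs. quoted expansion of a query
# containing "*". Runs in a subshell so the caller's directory is untouched.
demo_glob() (
  workdir=$(mktemp -d) || exit 1
  cd "$workdir" || exit 1
  touch a.json b.yaml              # make the current directory non-empty
  query='100 * 100'
  # Unquoted: the value is word-split and "*" is expanded against the directory.
  echo unquoted: $query
  # Quoted: the value is passed through unchanged.
  echo "quoted: $query"
  cd / && rm -rf "$workdir"
)
```

Quoting every expansion of the dashboard content (for example `"$query"` rather than `$query`) keeps the asterisk intact regardless of the directory the script is run from.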

Troubleshooting the 'View in Alertmanager' Button in Email Notifications

This question concerns OpenShift. I have two clusters, and on one of them I deployed multicluster-observability-operator on Red Hat ACM:

https://access.redhat.com/documentation/en-us/red_hat_advanced_cluster_management_for_kubernetes/2.3/html-single/observability/index

It's a simple configuration that collects alerts from all clusters and sends them via email. I create a MultiClusterObservability resource, which creates a StatefulSet, and the StatefulSet manages the pods.

apiVersion: observability.open-cluster-management.io/v1beta2
kind: MultiClusterObservability
metadata:
  name: observability
spec:
  enableDownsampling: true
  imagePullPolicy: Always
  observabilityAddonSpec:
    enableMetrics: true
    interval: 60
  storageConfig:
    metricObjectStorage:
      name: thanos-object-store
      key: thanos.yaml
    storeStorageSize: 1Gi
    storageClass: xxx
  advanced:
    retentionConfig:
      blockDuration: 2h
      deleteDelay: 48h
      retentionInLocal: 24h
      retentionResolutionRaw: 10d
      retentionResolution5m: 90d
      retentionResolution1h: 10d

The emails work great (after configuring alertmanager.yaml), but there's a generic "View in Alertmanager" button in these emails. With the aforementioned configuration, this button doesn't work by default. The URL under the button points to:

http://observability-alertmanager-0:9093/#/alerts?receiver=XXX

Here, observability-alertmanager-0 is the name of the pod. As I mentioned, everything is deployed through MultiClusterObservability. Could someone guide me on how to fix this button so that it actually points to the Alertmanager's address, or at least point me to the documentation that controls this? I can’t find it in the official sources.

Thank you.

Image with latest tags is not picked up, that's why pods are giving error.

When I try to install multicluster-observability-operator on an OpenShift cluster, the snapshot (2021...) that it refers to is not present in the repository. If I update the tag to the latest snapshot present in the repository (2022....), the YAMLs are updated, but the image is then reverted to the previous snapshot. I am not sure whether some ArgoCD functionality is built into these deployments. Could you please look into this and help me resolve it? How can I update the image to the latest tag (snapshot)? Because of this issue, I cannot implement Thanos on the OpenShift cluster, and my pods are not running.

2.2 Readme for Setting up object storage section needs update

The Minio directory tests/e2e/minio/ is no longer present, or it has been moved to https://github.com/open-cluster-management/observability-kind-cluster/tree/master/minio. The README section needs an update that reflects these changes for setting up object storage using Minio.

Internal error occurred: error resolving resource when running on plain k8s

I see the README is geared toward the hub cluster being OpenShift,
however I also see this wording, which suggests that this may work on plain k8s as well:

Note: By default, the API conversion webhook use on the OpenShift service serving certificate feature to manage the certificate, you can replace it with cert-manager if you want to run the multicluster-observability-operator in a kubernetes cluster.

I'm using local kind clusters with k8s v1.26.0.
I've gotten as far as the command below, but it fails with an error.
I'm thinking it could be related to the webhook?

kubectl -n open-cluster-management-observability apply -f operators/multiclusterobservability/config/samples/observability_v1beta2_multiclusterobservability.yaml

Error from server (InternalError): error when retrieving current configuration of:
Resource: "observability.open-cluster-management.io/v1beta2, Resource=multiclusterobservabilities", GroupVersionKind: "observability.open-cluster-management.io/v1beta2, Kind=MultiClusterObservability"
Name: "observability", Namespace: ""
from server for: "operators/multiclusterobservability/config/samples/observability_v1beta2_multiclusterobservability.yaml": Internal error occurred: error resolving resource

Is there a way I can get this add-on to work with plain k8s?

Feature request - users should be able to secure Grafana observability route using own certificates

As of now, users cannot configure TLS settings for the Grafana route (that is, generate and use their own certificates), and the route uses the default OpenShift ingress certificate. There are no settings in the Observability CRD to specify a certificate for the route.
The idea is to give users more flexibility in configuring the Grafana route so that they can set its TLS settings. This could be done with a reference to a secret in the MultiClusterObservability CR for the Grafana route, so that the operator sets the route TLS settings using the certificates from that secret.

Reference to thanos.io does not exist

The link at https://github.com/open-cluster-management/multicluster-observability-operator/blob/main/bundle/manifests/observability.open-cluster-management.io_multiclusterobservabilities.yaml#L706
points to the URL https://thanos.io/storage.md/#configuration, which does not exist: it returns Page Not Found.

