SRE Monitoring for OCP

This setup assumes that you have installed ServiceMesh in the istio-system namespace and deployed the bookinfo app in the bookinfo namespace. Instructions to do so can be found here.

Prometheus and Grafana deployment

The following steps deploy a Grafana/Prometheus/Alertmanager stack that runs in parallel to the one that comes with ServiceMesh.

Preparation

export istio_cp_namespace=istio-system
export deploy_namespace=sre-monitoring
export istio_cp_name=basic-install

oc new-project ${deploy_namespace}
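The commands in the following sections read these variables through envsubst and shell expansion, and envsubst silently substitutes an empty string for any variable that is unset. A minimal pre-flight sketch, assuming a POSIX shell; the helper name require_vars is hypothetical and not part of this repository:

```shell
# Hypothetical helper: fail if any of the named variables is unset or empty.
require_vars() {
  missing=0
  for name in "$@"; do
    # Look up the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Same values as in the Preparation step above.
export istio_cp_namespace=istio-system
export deploy_namespace=sre-monitoring
export istio_cp_name=basic-install

require_vars istio_cp_namespace deploy_namespace istio_cp_name && echo "all variables set"
```

Running the check in the same shell session before each deploy step catches a lost export early.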

Deploy Prometheus Operator

envsubst < prometheus-operator.yaml | oc apply -f - -n ${deploy_namespace}

Deploy Grafana Operator

oc apply -f ./grafana-operator/crds
oc apply -f ./grafana-operator/manifests -n ${deploy_namespace}
envsubst < ./grafana-operator/cluster_role_binding_grafana_operator.yaml | oc apply -f -

Deploy Prometheus

export cert_chain_pem=$(oc get secret -n ${istio_cp_namespace} istio.default -o jsonpath="{.data['cert-chain\.pem']}")
export key_pem=$(oc get secret -n ${istio_cp_namespace} istio.default -o jsonpath="{.data['key\.pem']}")
export root_cert_pem=$(oc get secret -n ${istio_cp_namespace} istio.default -o jsonpath="{.data['root-cert\.pem']}")
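The values read from `.data` in a secret are base64-encoded, and the exports above pass them on in that form. A quick sanity check that each value decodes to something PEM-like could look like the sketch below. It assumes a `base64` that accepts `-d` (GNU coreutils, also shipped with Git Bash); the function name is_base64_pem is hypothetical:

```shell
# Hypothetical helper: succeed if the argument is base64 that decodes
# to text containing a PEM "-----BEGIN" marker.
is_base64_pem() {
  printf '%s' "$1" | base64 -d 2>/dev/null | grep -q -- '-----BEGIN'
}

# Usage, after the exports above:
#   is_base64_pem "${cert_chain_pem}" || echo "cert_chain_pem does not look like a PEM" >&2
#   is_base64_pem "${key_pem}"        || echo "key_pem does not look like a PEM" >&2
#   is_base64_pem "${root_cert_pem}"  || echo "root_cert_pem does not look like a PEM" >&2
```

An empty result from `oc get secret` (for example, a wrong key name in the jsonpath) fails this check instead of producing an empty Helm value.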

# Get the list of members, so that a rolebinding for prometheus-istio-system can be added in each control plane member namespace
echo "members: $(oc get ServiceMeshMemberRoll/default -n ${istio_cp_namespace} -o jsonpath="{.spec.members}" | sed 's/ /, /g')" > /tmp/members.yaml

helm template prometheus-sre --namespace ${deploy_namespace}  -f /tmp/members.yaml --set istio_control_plane.name=${istio_cp_name} --set istio_control_plane.namespace=${istio_cp_namespace} --set istio_cert.cert_chain=${cert_chain_pem} --set istio_cert.key=${key_pem} --set istio_cert.root_cert=${root_cert_pem} | oc apply -f -
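To make the members file easier to reason about, the sketch below reproduces the sed step on sample input. It assumes that `oc` prints the member list either as a space-separated `[a b]` or, in some versions, as a JSON array. The function name format_members and the namespace httpbin are hypothetical:

```shell
# Hypothetical helper: turn the jsonpath output into a one-line YAML document.
format_members() {
  printf 'members: %s\n' "$(printf '%s' "$1" | sed 's/ /, /g')"
}

# Space-separated output gets commas inserted.
format_members '[bookinfo httpbin]'       # prints: members: [bookinfo, httpbin]

# JSON-style output has no spaces, so it passes through unchanged.
format_members '["bookinfo","httpbin"]'   # prints: members: ["bookinfo","httpbin"]
```

Both results are YAML flow sequences, so the file can be passed to `helm template` with `-f` in either case.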

# Run after a few minutes, once the prometheus-sre-prometheus statefulset exists
oc patch statefulset/prometheus-sre-prometheus --type='json' -p='[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--discovery.member-roll-name=default" }]' -n ${deploy_namespace}
oc patch statefulset/prometheus-sre-prometheus --type='json' -p='[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--discovery.member-roll-namespace='${istio_cp_namespace}'" }]' -n ${deploy_namespace}
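The pause before patching the statefulset can be replaced with an explicit poll. A minimal sketch, assuming a POSIX shell; the helper wait_for and the WAIT_INTERVAL variable are hypothetical names, not part of this repository:

```shell
# Hypothetical helper: wait_for TIMEOUT_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it succeeds or roughly TIMEOUT_SECONDS have passed.
wait_for() {
  timeout_s=$1
  shift
  interval=${WAIT_INTERVAL:-5}
  elapsed=0
  until "$@" >/dev/null 2>&1; do
    if [ "$elapsed" -ge "$timeout_s" ]; then
      return 1
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  return 0
}

# Usage, before the two oc patch commands above:
#   wait_for 300 oc get statefulset/prometheus-sre-prometheus -n "${deploy_namespace}" \
#     || echo "statefulset not found after 5 minutes" >&2

wait_for 10 true && echo "ready"
```

The poll relies on `oc get` returning a non-zero exit status while the resource does not exist yet.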

Deploy Grafana with openshift-monitoring and SRE Prometheus datasources

helm template grafana-sre --namespace ${deploy_namespace} --set prometheus_datasource.openshift_monitoring.password=$(oc extract secret/grafana-datasources -n openshift-monitoring --keys=prometheus.yaml --to=- | jq -r '.datasources[0].basicAuthPassword') | oc apply -f -

If you are running Git Bash on Windows without jq, use the following steps instead:

oc extract secret/grafana-datasources -n openshift-monitoring --keys=prometheus.yaml --to=-
# Copy the quoted value of the "basicAuthPassword" key from the terminal output
export basic_auth_password="paste_here"
helm template grafana-sre --namespace ${deploy_namespace} --set prometheus_datasource.openshift_monitoring.password=${basic_auth_password} | oc apply -f -
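The manual copy step can also be scripted with grep and sed, which ship with Git Bash. A rough sketch, assuming the extracted prometheus.yaml is JSON (as the jq command above implies) and that the password contains no escaped double quotes. The function name extract_basic_auth_password is hypothetical:

```shell
# Hypothetical helper: read JSON on stdin and print the value of the
# first "basicAuthPassword" key found in the text.
extract_basic_auth_password() {
  grep -o '"basicAuthPassword"[[:space:]]*:[[:space:]]*"[^"]*"' \
    | head -n 1 \
    | sed -e 's/^"basicAuthPassword"[[:space:]]*:[[:space:]]*"//' -e 's/"$//'
}

# Usage:
#   export basic_auth_password=$(oc extract secret/grafana-datasources -n openshift-monitoring \
#     --keys=prometheus.yaml --to=- | extract_basic_auth_password)
```

Unlike the jq expression, this matches on text rather than on the JSON structure, so it only agrees with `.datasources[0].basicAuthPassword` when the first datasource is the first entry that carries that key.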

Error Budget Demo

The following steps run the error budget demo.

Deploy bookinfo and generate load

Follow the instructions here to deploy OCP Service Mesh and bookinfo, the example app. Then follow the steps below to deploy Locust, a load generator.

oc new-project locust
export istio_gateway_url=$(oc get route istio-ingressgateway -n istio-system -o jsonpath='{.spec.host}')
oc create configmap locust-tasks --from-file=tasks.py=./locust/locustfile.py -n locust
helm repo add stable https://charts.helm.sh/stable
helm install stable/locust --namespace locust --set master.config.target-host=http://${istio_gateway_url} -f ./locust/values.yaml --name-template locust
oc expose service locust-master-svc --port 8089 --name locust -n locust

Start Swarming

export locust_url=$(oc get route -n locust locust -o jsonpath='{.spec.host}')
curl -X POST -H "Content-Type: application/x-www-form-urlencoded" -d "locust_count=2&hatch_rate=2" http://${locust_url}/swarm

Fault injection

oc apply -f failure-injection.yaml -n bookinfo

Create SLO-based alerts

helm template sre-service-monitor-istio --namespace ${deploy_namespace} --set slo_percent=95 --set latency=1 --set prometheus=sre-prometheus --set destination_service=details.bookinfo.svc.cluster.local --set metrics_labels.destination_service_namespace=bookinfo | oc apply -f -
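As a rough illustration of what an error budget means for the slo_percent=95 setting above: if 95% of requests must meet the objective, the remaining 5% is the budget. The sketch below does that arithmetic in integer shell math. It assumes slo_percent is a whole-number percentage of requests; the function names are hypothetical, and the chart's actual alert rules are not shown here:

```shell
# Hypothetical helper: error_budget_requests TOTAL_REQUESTS SLO_PERCENT
# Number of requests that may miss the objective without breaking the SLO.
error_budget_requests() {
  echo $(( $1 * (100 - $2) / 100 ))
}

# Hypothetical helper: budget_consumed_percent FAILED TOTAL_REQUESTS SLO_PERCENT
# Share of the error budget already used, as a whole-number percentage.
budget_consumed_percent() {
  budget=$(error_budget_requests "$2" "$3")
  if [ "$budget" -le 0 ]; then
    echo "no error budget available" >&2
    return 1
  fi
  echo $(( $1 * 100 / budget ))
}

error_budget_requests 10000 95         # prints 500
budget_consumed_percent 125 10000 95   # prints 25
```

With the fault injection applied, the share of failing or slow requests to the details service rises, and the budget is consumed faster.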

Contributors

trevorbox, raffaelespazzoli
