
linkerd-viz's Introduction


Upgrade Notice

This repo is for Linkerd 1. For instructions on setting up the Linkerd 2 viz extension, see the Linkerd 2 documentation.

linkerd-viz

Dead simple monitoring for linkerd.


linkerd-viz is a monitoring application based on Prometheus and Grafana, autoconfigured to collect metrics from linkerd. linkerd-viz currently supports DC/OS and Kubernetes.

linkerd-viz assumes linkerd has already been deployed onto your cluster, and your applications have been configured to route via linkerd. You'll also need to configure linkerd with the io.l5d.prometheus telemeter, to expose the stats that are displayed by linkerd-viz. You should add this block to your linkerd configuration file:

telemetry:
- kind: io.l5d.prometheus

For more information on getting started with linkerd, have a look at our Getting Started guides.

Build Docker image

docker build -t buoyantio/linkerd-viz .

Local boot

docker run -p 3000:3000 -p 9191:9191 buoyantio/linkerd-viz

DC/OS Deploy

Install the official linkerd-viz DC/OS Universe package

dcos package install linkerd-viz

Custom installation

dcos marathon app add dcos/linkerd-viz.json

View dashboard

open $PUBLIC_NODE:3000

Kubernetes Deploy

kubectl apply -f k8s/linkerd-viz.yml

View dashboard

kubectl port-forward $(kubectl get po --selector=name=linkerd-viz -o jsonpath='{.items[*].metadata.name}') 3000:3000
open http://localhost:3000

Consul Deploy

First, examine and edit consul/prometheus-consul.yml to fit your ecosystem.

Run the Consul agent locally:

docker run -d --net=host consul:0.9.0 agent -bind=<external ip> -retry-join=<root agent ip>

For more information, see Running Consul Agent in Client Mode.

Boot linkerd-viz locally:

docker run -d --net=host -p 3000:3000 -p 9191:9191 buoyantio/linkerd-viz:0.2.0 consul

View dashboard

open http://localhost:3000

linkerd-viz's People

Contributors

adleong, akreiling, christophetd, eentzel, klingerf, pcalcado, siggy

linkerd-viz's Issues

Limit linkerd instances search to a particular Kubernetes namespace?

Hi,

I'm trying to deploy linkerd-viz in a Kubernetes cluster where I have rights on a single namespace, and I'm getting the following errors:

Failed to list *v1.Service: User \"system:serviceaccount:my-namespace:viewer\" cannot list all services in the cluster"
Failed to list *v1.Pod: User \"system:serviceaccount:my-namespace:viewer\" cannot list all pods in the cluster" 
Failed to list *v1.Endpoints: User \"system:serviceaccount:my-namespace:viewer\" cannot list all endpoints in the cluster" 

It seems that Prometheus is trying to list services / pods in the whole cluster. Is there a way to have it restrict itself to the namespace my-namespace only? I was thinking that using a __meta_kubernetes_namespace meta label could do the trick, but I'm unsure whether that will change the API call that Prometheus does, or just filter the services afterwards.

Note that linkerd-viz is run under a viewer service account that can list services / pods inside my namespace.

Thanks!
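
One option that may help here, assuming the Prometheus scrape config shipped with linkerd-viz can be edited: Prometheus's kubernetes_sd_configs accepts a namespaces block that limits discovery to the listed namespaces. As far as I understand, this changes which list/watch calls Prometheus makes, whereas relabeling on __meta_kubernetes_namespace only filters targets after discovery. The sketch below just writes such a fragment to a temporary file; the job name and the file location are hypothetical, and the namespace is taken from the error messages above.

```shell
#!/bin/sh
# Namespace from the error messages above; adjust to your own.
NAMESPACE=my-namespace
# Hypothetical scratch file; the real config lives inside the linkerd-viz image.
FRAGMENT=$(mktemp)

# Write a scrape job whose pod discovery is limited to a single namespace.
cat > "$FRAGMENT" <<EOF
scrape_configs:
  - job_name: linkerd
    kubernetes_sd_configs:
      - role: pod
        namespaces:
          names:
            - $NAMESPACE
EOF

cat "$FRAGMENT"
```

The same namespaces block would need to be repeated for every kubernetes_sd_configs entry (service, endpoints, pod) that the config uses.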

Improvement for documentation regarding Grafana

Hi,

The grafana.url parameter is not well documented.

I installed linkerd-viz and Grafana from the Helm chart in the same namespace, as described.
For grafana.url I configured http://grafana and then http://grafana/, but the Grafana links only return a 502.

What happened?
linkerd-viz does something with the URL and produces error log entries like
2023/03/14 10:09:15 http: proxy error: dial tcp: lookup http://grafana: no such host
2023/03/14 10:19:15 http: proxy error: dial tcp: lookup http://grafana/: no such host
depending on the configuration.

So I ended up reading https://linkerd.io/2.12/tasks/grafana/index.html and using the configuration grafana:80 from the example.

Either something is wrong in the URL rewriting, or the documentation needs an update.

Best...
Uwe
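
The log lines above suggest the value is looked up as a plain host, so a host:port form without a scheme appears to be what is expected. A minimal sketch of converting a URL into that form follows; the helper name to_host_port and the default port of 80 are assumptions made for illustration, not part of linkerd-viz.

```shell
#!/bin/sh
# Hypothetical helper: reduce a URL such as http://grafana/ to host:port form.
# $1 is the URL, $2 an optional default port (assumed to be 80 if omitted).
to_host_port() {
  hostport=${1#*://}        # drop a leading scheme, if any
  hostport=${hostport%%/*}  # drop any path or trailing slash
  case $hostport in
    *:*) ;;                                # a port is already present
    *) hostport="$hostport:${2:-80}" ;;    # append the assumed default port
  esac
  printf '%s\n' "$hostport"
}

to_host_port "http://grafana/"
```

The result could then be passed as the grafana.url value. The sketch does not handle IPv6 literals.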

Missing Network Policy Guidelines

We use Linkerd in a cluster that pretty much blocks every INGRESS/EGRESS not white listed with NetworkPolicies or GlobalNetworkPolicies (via Calico's CRD).

After successfully upgrading Linkerd from 2.9.4 to 2.10.1, we can't figure out what the viz plugin needs, and the fact that it's installed in its own namespace makes all our previous configuration useless...

Can anyone help with some guidelines on how to proceed? What ports are used to where? If a cluster-wide configuration is needed, what would it look like?

Configurable scrape_interval is broken, breaks 1m Grafana graphs

The changes in #33 made for a surprise, when the success rate and request volume graphs became empty.

The default of 1m matches the query in Grafana, so the graphs become empty when they don't have two data points. I'd recommend a different default, like 30s.
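
As a rough illustration of why the default leaves the graphs empty: the number of samples inside a query window is approximately the window length divided by the scrape interval, and irate() needs at least two. The function below is a hypothetical back-of-the-envelope helper that ignores jitter and boundary alignment.

```shell
#!/bin/sh
# Rough count of samples that fall inside a query window.
# $1 is the window length and $2 the scrape interval, both in seconds.
samples_in_window() {
  echo $(( $1 / $2 ))
}

samples_in_window 60 60   # 1m window, 1m scrape interval
samples_in_window 60 30   # 1m window, 30s scrape interval
```

The first call prints 1, which is not enough for a rate; the second prints 2.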

However, this can't be overridden, because the replacement with $SCRAPE_INTERVAL fails.

The root of the issue is here:

sed -i "" "s@scrape_interval:.*@scrape_interval: $SCRAPE_INTERVAL@" $PROMETHEUS_CONF
sed -i "" "s@ evaluation_interval:.*@ evaluation_interval: $SCRAPE_INTERVAL@" $PROMETHEUS_CONF

With sed -i "", GNU sed treats the "" as a separate argument rather than as the suffix for -i, and the sed command fails. So the replacements never happen, and the command is only ever executed with the default "1m" in prometheus-$PLATFORM.yml (whichever file that turns out to be, "k8s" in my case).

This should be sed -i"", with no space. (verified inside the container)

It would be a two-character PR, but the default causing empty graphs is also a surprise, so the defaults should probably be lowered to ensure the irate() call has data.
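
For reference, a sketch of an alternative that avoids -i altogether and so does not depend on GNU versus BSD sed behavior: write the result to a temporary file and move it over the original. The config contents, the set_interval function name, and the 30s value are stand-ins for illustration, not the project's actual script.

```shell
#!/bin/sh
# Hypothetical stand-in for the Prometheus config inside the container.
PROMETHEUS_CONF=$(mktemp)
SCRAPE_INTERVAL=30s

cat > "$PROMETHEUS_CONF" <<'EOF'
global:
  scrape_interval: 1m
  evaluation_interval: 1m
EOF

# Portable in-place edit: write to a temp file, then replace the original.
set_interval() {
  sed -e "s@scrape_interval:.*@scrape_interval: $SCRAPE_INTERVAL@" \
      -e "s@evaluation_interval:.*@evaluation_interval: $SCRAPE_INTERVAL@" \
      "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

set_interval "$PROMETHEUS_CONF"
cat "$PROMETHEUS_CONF"
```

The two-character fix described above is smaller; this form is only worth considering if the script also has to run outside the container.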

Support more flexible router labels

Right now the linkerd-viz dashboard requires that the router from which it pulls stats be labeled as "incoming", which might not be the case in all setups. We should think about ways to support other labels, possibly using Grafana templating.

linkerd-viz assumes marathon master runs on localhost

In prometheus-mesos-marathon.yml line 11:

marathon_sd_configs:
  - servers:
    - 'http://localhost:8080'

should be changed to:

marathon_sd_configs:
  - servers:
    - 'http://marathon.mesos:8080'

To reflect other linkerd mesos-marathon examples.

You can also modify the linkerd-viz.json under mesos-marathon to use add-host parameter if that dns entry will not resolve:

{
  "id": "linkerd-viz",
  "instances": 1,
  "cpus": 1.0,
  "mem": 512.0,
  "acceptedResourceRoles": ["*", "slave_public"],
  "maintainer": "[email protected]",
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "buoyantio/linkerd-viz:latest",
      "parameters": [
        {
          "key": "add-host",
          "value": "marathon.mesos:192.168.250.11"
        }
      ],
      "forcePullImage": true,
      "network": "HOST",
      "privileged": true
    }
  },
  "args": ["mesos-marathon"],
  ...

DC/OS application group support

linkerd-viz does not properly aggregate metrics from DC/OS applications deployed as part of groups.

For example, an app named my-group/webapp yields metrics like this:

rt:outgoing:dst:id:_:io_l5d_marathon:my_group:webapp:requests 11

The last two metrics_relabel steps defined at:
https://github.com/BuoyantIO/linkerd-viz/blob/master/dcos/prometheus-dcos.yml#L32

cause the metric to be rewritten as:

linkerd:incoming:webapp:requests{instance="10.0.2.164",job="linkerd",service="my_group"}

when the expected metric should be:

linkerd:incoming:requests{instance="10.0.2.164",job="linkerd",service="my_group/webapp"}

Consul startup doc error

The README section for the Consul Deploy says to start Consul with Docker in host networking mode, then start the linkerd-viz Docker container without host networking mode. This will not work, because the linkerd-viz configuration for Consul is set to use localhost:8500. That localhost address is only reachable if the linkerd-viz container is also run in host networking mode.

display connection and client pool metrics

Additional stats around connection counts and client pools can be helpful in diagnosing performance issues. Consider adding these to the dashboard.

relevant connection stats:

  • rt:client:connections
  • rt:client:connects
  • rt:server:connections
  • rt:server:connects

relevant client pool stats:

  • rt:client:pool_cached
  • rt:client:pool_num_too_many_waiters
  • rt:client:pool_num_waited
  • rt:client:pool_size
  • rt:client:pool_waiters

grafana was not installed

In terminal:
linkerd viz install | kubectl apply -f -
then:
kubectl get pods -n linkerd-viz
NAME READY STATUS RESTARTS AGE
metrics-api-57c76d5c5c-jztcg 2/2 Running 0 56m
prometheus-5bcd95c8fc-vsrhb 2/2 Running 0 56m
tap-6694cbcb97-lsntm 2/2 Running 0 17m
tap-injector-779b797dbf-lrtj6 2/2 Running 0 17m
web-7bff7b8d89-7d9b8 2/2 Running 0 56m


DC/OS service URL

Modify the linkerd-viz DC/OS Universe package to support a DC/OS service URL:
http://<DCOS_URL>/service/linkerd-viz/

Attempted to set the server/root_url in the grafana.ini to the DCOS_URL, but got:

{"message":"Invalid Basic Auth Header"}

How to integrate auth?

@siggy The viz looks great. I am deploying a cluster for my company on GKE, and really need to be able to secure the public-facing Grafana auth to use the same Google Account users.

While I can normally do this from config files and the UI, I'm struggling with this setup. It logs in anonymously, but the Grafana admin functions are not available. I assume this is a feature of anonymous logins, which I haven't used before.

Do you have a reference config to get it running with google cloud IAM you could share?

Prometheus does not install under minikube/podman

Problem description

As reported in Slack, prom/prometheus does not install properly if you are running minikube with podman.


Error: ImageInspectError
  Warning  InspectFailed  3m58s (x259 over 128m)  kubelet            Failed to inspect image "prom/prometheus:v2.47.0": rpc error: code = Unknown desc = short-name "prom/prometheus:v2.47.0" did not resolve to an alias and no unqualified-search registries are defined in "/etc/containers/registries.conf"

The reason is that, when running minikube with podman, no unqualified search registries are defined. If you instead run minikube with docker, this works without any issues.

Expected behavior

I have to say I'm not entirely sure. I think I would want podman to provide these unqualified search registries out of the box. But on the other hand, maybe it would be good if linkerd was friendly enough to specify the prometheus dependency prefixed with the intended search registry?

Workaround

  1. Log into minikube with minikube ssh.
  2. Run sudo vi /etc/containers/registries.conf.
  3. Add unqualified-search-registries = ["docker.io", "quay.io"].
  4. Restart minikube with minikube stop && minikube start.
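
Steps 2 and 3 could also be scripted rather than done in vi. The sketch below appends the setting only if it is missing; it runs against a temporary stand-in file so it is safe to try anywhere, and the function name add_search_registries is hypothetical. On a real node the target would be /etc/containers/registries.conf and the write would need sudo. Note that the key has to stay at the top level of the TOML file, so appending at the end is only safe if the file has no table sections after which the line would land.

```shell
#!/bin/sh
# Stand-in for /etc/containers/registries.conf so the sketch runs anywhere.
REGISTRIES_CONF=$(mktemp)
LINE='unqualified-search-registries = ["docker.io", "quay.io"]'

# Append the setting only if no such key is present yet.
add_search_registries() {
  grep -q '^unqualified-search-registries' "$1" 2>/dev/null \
    || printf '%s\n' "$LINE" >> "$1"
}

add_search_registries "$REGISTRIES_CONF"
add_search_registries "$REGISTRIES_CONF"   # second call changes nothing
cat "$REGISTRIES_CONF"
```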

linkerd health metrics

The current dashboard is very top-level request volume / success rate focused. Consider displaying linkerd health metrics (gc, etc), either on the existing dashboard, or as a separate "health" dashboard.

Linkerd2-viz chart?

Where's the chart for linkerd2? And don't point me to the docs that tell me to install a CLI on my local system and install it manually there - that's ridiculous. This should be doable via IaC.

DCOS 1.9 linkerd-viz never deploys

I have tried to deploy in DCOS using the Universe package. The service never deploys or runs. I have looked on each node for the Docker image buoyantio/linkerd-viz, and no image has been pulled. I have also tried creating a service directly from the JSON, with the same result. I'm not sure where else to look to resolve this. It's not urgent, but I would certainly like to see this working in 1.9.

support for minutely/hourly graphs

The dashboard today calculates rates per second. For lower request volumes, it would be useful to support per-minute and per-hour rates, likely via a template variable.
