Comments (13)
Additionally, are you using long-term storage with prometheus to feed VPA?
Yes we use thanos
from charts.
Additional remark: we have multiple clients using our setup, and all of the EKS clients are suffering from this while the AKS customers are not, after the same upgrade.
Relevant parameters:
I0816 16:14:07.067381 1 flags.go:57] FLAG: --add-dir-header="false"
I0816 16:14:07.067486 1 flags.go:57] FLAG: --address=":8942"
I0816 16:14:07.067492 1 flags.go:57] FLAG: --alsologtostderr="false"
I0816 16:14:07.067495 1 flags.go:57] FLAG: --checkpoints-gc-interval="10m0s"
I0816 16:14:07.067499 1 flags.go:57] FLAG: --checkpoints-timeout="1m0s"
I0816 16:14:07.067504 1 flags.go:57] FLAG: --container-name-label="container"
I0816 16:14:07.067509 1 flags.go:57] FLAG: --container-namespace-label="namespace"
I0816 16:14:07.067514 1 flags.go:57] FLAG: --container-pod-name-label="pod"
I0816 16:14:07.067517 1 flags.go:57] FLAG: --cpu-histogram-decay-half-life="24h0m0s"
I0816 16:14:07.067522 1 flags.go:57] FLAG: --cpu-integer-post-processor-enabled="false"
I0816 16:14:07.067526 1 flags.go:57] FLAG: --history-length="8d"
I0816 16:14:07.067531 1 flags.go:57] FLAG: --history-resolution="1h"
I0816 16:14:07.067535 1 flags.go:57] FLAG: --kube-api-burst="10"
I0816 16:14:07.067541 1 flags.go:57] FLAG: --kube-api-qps="5"
I0816 16:14:07.067547 1 flags.go:57] FLAG: --kubeconfig=""
I0816 16:14:07.067552 1 flags.go:57] FLAG: --log-backtrace-at=":0"
I0816 16:14:07.067566 1 flags.go:57] FLAG: --log-dir=""
I0816 16:14:07.067571 1 flags.go:57] FLAG: --log-file=""
I0816 16:14:07.067575 1 flags.go:57] FLAG: --log-file-max-size="1800"
I0816 16:14:07.067579 1 flags.go:57] FLAG: --logtostderr="true"
I0816 16:14:07.067584 1 flags.go:57] FLAG: --memory-aggregation-interval="24h0m0s"
I0816 16:14:07.067589 1 flags.go:57] FLAG: --memory-aggregation-interval-count="8"
I0816 16:14:07.067593 1 flags.go:57] FLAG: --memory-histogram-decay-half-life="24h0m0s"
I0816 16:14:07.067597 1 flags.go:57] FLAG: --memory-saver="false"
I0816 16:14:07.067601 1 flags.go:57] FLAG: --metric-for-pod-labels="kube_pod_labels{job=\"kube-state-metrics\"}[8d]"
I0816 16:14:07.067605 1 flags.go:57] FLAG: --min-checkpoints="10"
I0816 16:14:07.067609 1 flags.go:57] FLAG: --one-output="false"
I0816 16:14:07.067613 1 flags.go:57] FLAG: --oom-bump-up-ratio="1.2"
I0816 16:14:07.067618 1 flags.go:57] FLAG: --oom-min-bump-up-bytes="1.048576e+08"
I0816 16:14:07.067623 1 flags.go:57] FLAG: --pod-label-prefix=""
I0816 16:14:07.067627 1 flags.go:57] FLAG: --pod-name-label="pod"
I0816 16:14:07.067631 1 flags.go:57] FLAG: --pod-namespace-label="namespace"
I0816 16:14:07.067635 1 flags.go:57] FLAG: --pod-recommendation-min-cpu-millicores="5"
I0816 16:14:07.067640 1 flags.go:57] FLAG: --pod-recommendation-min-memory-mb="25"
I0816 16:14:07.067645 1 flags.go:57] FLAG: --prometheus-address="http://thanos-query-frontend.prometheus-stack:9090"
I0816 16:14:07.067649 1 flags.go:57] FLAG: --prometheus-cadvisor-job-name="kubelet"
I0816 16:14:07.067653 1 flags.go:57] FLAG: --prometheus-query-timeout="5m"
I0816 16:14:07.067657 1 flags.go:57] FLAG: --recommendation-margin-fraction="0.15"
I0816 16:14:07.067662 1 flags.go:57] FLAG: --recommender-interval="1m0s"
I0816 16:14:07.067667 1 flags.go:57] FLAG: --recommender-name="default"
I0816 16:14:07.067671 1 flags.go:57] FLAG: --skip-headers="false"
I0816 16:14:07.067675 1 flags.go:57] FLAG: --skip-log-headers="false"
I0816 16:14:07.067679 1 flags.go:57] FLAG: --stderrthreshold="2"
I0816 16:14:07.067683 1 flags.go:57] FLAG: --storage="prometheus"
I0816 16:14:07.067686 1 flags.go:57] FLAG: --target-cpu-percentile="0.9"
I0816 16:14:07.067690 1 flags.go:57] FLAG: --v="10"
I0816 16:14:07.067693 1 flags.go:57] FLAG: --vmodule=""
I0816 16:14:07.067697 1 flags.go:57] FLAG: --vpa-object-namespace=""
I0816 16:14:07.067702 1 main.go:82] Vertical Pod Autoscaler 0.13.0 Recommender: 0xc00004d820
Full logs are in your mail :) so as not to leak any sensitive info here.
Helm values are not much different:
vpa:
  recommender:
    extraArgs:
      storage: "prometheus"
      # The prometheus_server_endpoint should have the form http://<service-name>.<namespace-name>.svc:portnumber
      prometheus-address: "http://thanos-query-frontend.prometheus-stack:9090"
      prometheus-cadvisor-job-name: kubelet
      pod-label-prefix: ""
      pod-namespace-label: namespace
      pod-name-label: pod
      container-pod-name-label: pod
      container-name-label: container
      metric-for-pod-labels: kube_pod_labels{job="kube-state-metrics"}[8d]
      pod-recommendation-min-cpu-millicores: 5
      pod-recommendation-min-memory-mb: 25
      v: 10
  updater:
    enabled: false
  admissionController:
    enabled: false
How are you pulling these metrics into Grafana? Is it possible there's actually just an issue with the metrics reporting rather than the actual VPA recommendation itself? The changes from 1.7.5 to 2.x are almost entirely unrelated to the recommender deployment itself.
We use kube-state-metrics to scrape the VPA recommendations. The values in the Grafana dashboard are the same as when checking using kubectl get vpa.
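For reference, these are the series kube-state-metrics has typically exposed for VPA recommendations (the exact names and labels depend on your kube-state-metrics version and configuration, so treat these as assumptions to verify against your own /metrics output):

```
kube_verticalpodautoscaler_status_recommendation_containerrecommendations_target{resource="cpu"}
kube_verticalpodautoscaler_status_recommendation_containerrecommendations_uncappedtarget{resource="cpu"}
```

Graphing both series side by side makes it easy to see whether the oscillation is already present in the raw recommendation or only in one of the derived values.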
I also cannot understand why this change would lead to this behaviour. Have you not seen anything like this before?
The only time I've seen erratic recommendations is when I'm not using Prometheus data to feed the recommendations and I don't wait long enough for VPA to generate a good recommendation. Here's a cluster with 53 VPAs, using prometheus data, and the latest chart. (also using kube-state-metrics to poll the VPA data)
Maybe try turning the log level on the recommender up to 10?
I just realized the cluster that I'm showing in that graph above uses the vpa 0.14.0 image. Perhaps there's a bugfix in that version. Worth trying.
It would help if you could share your exact values so I can try to reproduce the issue.
Aha. You're using uncappedTarget, which does not respect the limits set on the VPA or in the defaults: kubernetes/autoscaler#2747 (comment)
"Uncapped Target gives the recommendation before applying constraints specified in the VPA spec, such as min or max."
I would imagine that switching that metric to target would provide more consistent data (that's what my graph above uses).
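To illustrate the difference, here is a minimal sketch (plain Python, not the actual VPA code) of the clamping step: target is effectively uncappedTarget clamped to the minAllowed/maxAllowed bounds from the VPA spec, so a noisy raw recommendation can still produce a stable target.

```python
# Simplified illustration of why "target" can be more stable than
# "uncappedTarget": target is the raw recommendation clamped to the
# minAllowed/maxAllowed bounds, while uncappedTarget is reported as-is.

def capped_target(uncapped: float, min_allowed: float, max_allowed: float) -> float:
    """Clamp a raw recommendation to the allowed range."""
    return max(min_allowed, min(uncapped, max_allowed))

# Noisy uncapped CPU recommendations (millicores) over successive cycles...
raw = [600, 2400, 800, 5100]
# ...all collapse to the same capped value with minAllowed=50m, maxAllowed=500m:
print([capped_target(r, 50, 500) for r in raw])  # → [500, 500, 500, 500]
```

So if your dashboard plots uncappedTarget, you see every swing of the raw estimate; plotting target only shows movement within the configured bounds.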
That was just the first graph being shown by Grafana :) similar images for Target:
Well, now I'm at a loss. Perhaps the VPA folks can help explain why the recommendation status would oscillate so much; I personally haven't seen it do this in my various tests.
I'm guessing that the chart change has nothing to do with it, and that it's something triggered by the re-deploy of the VPA pods. But that's just a hunch.