
Comments (9)

solsson avatar solsson commented on July 24, 2024

What is "spreading"?

It's the most common term I found among the discussions in #70 (comment).

My own conclusion: let's say you have a cluster where nodes come and go (which is common) and you run, for example, kubectl set image (which is also common). A scheduler must have some default logic to avoid placing all the replacement pods on the same recently added node. The problem is that I've observed the opposite behavior in GKE. I've also observed the expected behavior, but I fail to identify the causes.

from kubernetes-kafka.

solsson avatar solsson commented on July 24, 2024

Good that you brought this up. I noticed that both are present in the recent blog post http://blog.kubernetes.io/2017/09/kubernetes-statefulsets-daemonsets.html too, and they explain it there. I'd gladly accept contributions, but preferably with some proof of the benefits.

PodDisruptionBudgets: We just haven't found the time to investigate the need for them, and our policy is to run with defaults until we learn exactly why we should override them. Manifests need maintenance too.

AntiAffinity: I did quite a bit of research on this for other services in our cluster, and "spreading" should actually be the default behavior, at least if the service is created first. Below is an excerpt from our internal issue tracker. We've observed this to be flaky for Deployments, sometimes spreading and sometimes not (possibly due to services or resource limits), but we've actually never seen our Kafka and Zookeeper pods on the same node in production.

kubernetes/kubernetes#2312
"The code we have already spreads all pods belonging to the same replication controller."

kubernetes/kubernetes#11144 (comment)
"there are other reasons services should be started first"

kubernetes/kubernetes#11369
"We don't handle the case of multiple services matching the same Pod very well"

kubernetes/kubernetes#10242
"Scheduler needs to deal with pods without resource limits"
"Create an rc with 100 pods in a custom namespace and you'll end up with all 100 on the same node"

kubernetes/kubernetes#21074
"selector_spreading functionality in scheduler"

https://github.com/kubernetes/kubernetes/pull/21235/files#diff-d44336036b627f815adec0707e648e4fL68
"CalculateSpreadPriority" removed, "SelectorSpreadPriority" added?

kubernetes/kubernetes#4971
"We are now spreading by both."

kubernetes/kubernetes#41708
"The scheduler SelectorSpread priority function didn't have the code to spread pods of StatefulSets."
(merged after 1.5)

kubernetes/kubernetes#27484
"We should investigate (1) if you request and limit 0, does it spread evenly"

https://stackoverflow.com/questions/37784480/avoiding-kubernetes-scheduler-to-run-all-pods-in-single-node-of-kubernetes-clust
"The scheduler should spread your pods if your containers specify resource request for the amount of memory and CPU they need"
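For reference, opting out of default spreading and making the constraint explicit would mean adding a podAntiAffinity stanza to the pod template of the StatefulSet. This is a hedged sketch, not taken from the kubernetes-kafka manifests; the app: kafka label and the rest of the template are assumptions:

```yaml
# Hypothetical pod template excerpt (not from kubernetes-kafka).
# Requires that no two pods with label app=kafka land on the same node;
# scheduling fails if no conforming node is available.
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values: ["kafka"]
        topologyKey: kubernetes.io/hostname
```

The hard requiredDuringScheduling form trades availability of scheduling for strict spreading; the soft preferred form discussed later in the thread relaxes that.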


adamresson avatar adamresson commented on July 24, 2024

Yeah, the PDB in this case seems quite valuable to me, as it ensures that both Kafka and Zookeeper keep the minimum number of members they need in the cluster to function properly.

I created a simple 3 node cluster in GKE with the following commands:

gcloud container clusters create kafka --cluster-version=1.7.5


kc apply -f ./zookeeper/
# let things settle
kc apply -f ./
# let things settle again

# upgrade master
gcloud container clusters upgrade kafka --cluster-version=1.7.6 --master

# upgrade nodes
gcloud container clusters upgrade kafka --cluster-version=1.7.6

Without the PDB, all my zookeeper pods ended up on the same node near the end of the migration (the third one) and then all got terminated simultaneously while the last node was migrated.
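A minimal PDB covering the scenario above might look roughly like the following; the name and the app: zookeeper label are assumptions about the manifests, and minAvailable: 2 preserves quorum for a 3-member ensemble:

```yaml
# Hypothetical PodDisruptionBudget sketch (labels assumed).
apiVersion: policy/v1beta1   # the API group available in 1.7-era clusters
kind: PodDisruptionBudget
metadata:
  name: zk-pdb
spec:
  minAvailable: 2            # never voluntarily evict below quorum of 3
  selector:
    matchLabels:
      app: zookeeper
```

With this in place, a node drain during the upgrade would block (and retry) rather than evict a second Zookeeper pod while only two are running.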


solsson avatar solsson commented on July 24, 2024

That's a simple and useful test, and I realize that "spreading" is an argument for getting rid of the split between persistent and non-persistent zookeeper (the reason to keep it is to better support quorums among 5 zk in a 3-zone cluster).

For a production cluster that has services scaling horizontally to 5 instances, wouldn't 6 nodes be considered the minimum? Absence of a single node should always be a non-issue, and as soon as you co-locate instances on a node you're increasing risk.


solsson avatar solsson commented on July 24, 2024

In #13 (comment) @BenjaminDavison also suggests use of affinity. On the other hand, just this week I heard of positive results with "spreading" achieved by re-creating resources with the service created first. I'd like to see a discussion about the current state of this in the Kubernetes community. It feels like an anti-pattern to have to use AntiAffinity in every manifest when in fact spreading is a crucial behavior for any horizontally scaled service.


StevenACoffman avatar StevenACoffman commented on July 24, 2024

Just FYI, podAntiAffinity is extremely inefficient for large clusters as of Kubernetes 1.7. Until the implementation is rewritten, it is best to avoid it.


solsson avatar solsson commented on July 24, 2024

Thanks for the heads up. I still haven't investigated what the state of "spreading" is in 1.8, but I'll be testing quite a bit in the upcoming weeks.


StevenACoffman avatar StevenACoffman commented on July 24, 2024

The official docs mention the performance problems here

Inter-pod affinity and anti-affinity require substantial amount of processing which can slow down scheduling in large clusters significantly. We do not recommend using them in clusters larger than several hundred nodes.
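If anti-affinity is used at all despite that caveat, the soft preferred form lets the scheduler try to spread pods while still placing them when no other node fits. A hedged sketch, with the app: kafka label again being an assumption:

```yaml
# Hypothetical "soft" anti-affinity stanza (labels assumed).
# The scheduler prefers nodes without an app=kafka pod, but will
# co-locate rather than leave the pod unschedulable.
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      podAffinityTerm:
        labelSelector:
          matchLabels:
            app: kafka
        topologyKey: kubernetes.io/hostname
```

Note that the performance warning quoted above applies to both the required and the preferred forms.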



coderroggie avatar coderroggie commented on July 24, 2024

@solsson It has been a while since this has been updated. Are there any updates on what the recommended approach is? I'm wondering if it is worth the trouble to add anti-affinity rules into all the zookeeper and kafka pods or if it is better just to let it ride.

