
Comments (10)

shrinandj avatar shrinandj commented on July 24, 2024 1

@solsson I've tried to rephrase the reason for having pzoo and zoo below. Let me know what you think:

AFAICT, there are at least two types of failures for which there should be some protection.

  • Software errors: This is where something goes wrong with a Zookeeper pod that results in it going down. There is nothing wrong with the underlying infrastructure.

  • Infra errors: Underlying AWS/cloud infrastructure went down.

If there are 3 AZs, the 5 ZK pods are spread across these 3 AZs. If an AZ goes down, there is little benefit to be had from having 5 ZK pods, since the AZ that went down could take 2 ZK pods with it. The ZK cluster is then only 1 more failure away from being unavailable. The situation would be the same if there were only 3 ZK pods and 1 AZ went down.

However, for software errors, each pod can fail independently, and having 5 ZK nodes helps because the ensemble can tolerate 2 individual pod failures (instead of 1 in the 3-ZK case).
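To make the arithmetic concrete: a ZooKeeper ensemble stays available only while a strict majority of its N voters is up, i.e. at least floor(N/2) + 1.

  • N=3: quorum is 2, so 1 pod failure is tolerated.

  • N=5: quorum is 3, so 2 pod failures are tolerated.

  • N=5 spread 2+2+1 across 3 AZs: losing a 2-pod AZ leaves exactly 3 voters, the bare quorum, so no further failure is tolerated. That is the same margin as N=3 after losing 1 pod.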

While having only 3 EBS volumes instead of 5 does keep costs low, to avoid confusion, it would be better to have a single statefulset of pzoo with 5 nodes.


solsson avatar solsson commented on July 24, 2024

It was introduced in #34 and discussed in #26 (comment).

The case for this has weakened though, with increased support for dynamic volume provisioning across different Kubernetes setups, and with this setup being used for heavier workloads. I'd prefer if the two statefulsets could simply be scaled up and down individually. For example, if you're on a single zone you don't have the volume portability issue. In a setup like #118 with local volumes, however, it's quite difficult to ensure quorum capability on a single node failure.

Unfortunately Zookeeper's configuration is static prior to 3.5, which is in development. Adapting to the initial scale would be doable, I think. For example, the init script could use the Kubernetes API to read the desired number of replicas for both StatefulSets and generate the server.X strings accordingly.
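A minimal sketch of what that init-script step could look like, assuming kubectl is available in the init container, the kafka namespace used by this repo, and the pzoo/zoo StatefulSet and headless Service names; the target config path below is hypothetical:

```bash
#!/bin/bash
# Sketch: generate ZooKeeper server.X entries from the desired replica counts
# of the two StatefulSets, instead of hard-coding them in the ConfigMap.
set -e

CONF=/etc/zookeeper/zoo.cfg   # hypothetical target path

# Desired (spec) replica counts; .status.replicas could be read instead.
PZOO_REPLICAS=$(kubectl -n kafka get statefulset pzoo -o=jsonpath='{.spec.replicas}')
ZOO_REPLICAS=$(kubectl -n kafka get statefulset zoo -o=jsonpath='{.spec.replicas}')

ID=1
for i in $(seq 0 $((PZOO_REPLICAS - 1))); do
  echo "server.$ID=pzoo-$i.pzoo:2888:3888" >> "$CONF"
  ID=$((ID + 1))
done
for i in $(seq 0 $((ZOO_REPLICAS - 1))); do
  echo "server.$ID=zoo-$i.zoo:2888:3888" >> "$CONF"
  ID=$((ID + 1))
done
```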


thoslin avatar thoslin commented on July 24, 2024

Hi @solsson. I have the same confusion. I've read your comment on the other issue, but it didn't help me understand why two nodes use emptyDir rather than a persistent volume. Could you elaborate a little more on the scenarios where they are useful? How does that compare to using persistent volumes for all 5 nodes? I'm running my Kubernetes cluster on AWS, with 6 worker nodes spread across 3 availability zones. Thanks.


solsson avatar solsson commented on July 24, 2024

Good that you question this. The complexity should be removed if it can't be motivated. I'm certainly prepared to switch to all-persistent Zookeeper.

The design goal was to make the persistence layer as robust as the services layer. Probably not as robust as bucket stores or 3rd-party hosted databases, but the same uptime as your frontend is good enough.

Thus workloads will have to migrate when an availability zone is lost, as non-stateful apps certainly will with Kubernetes. I recall https://medium.com/spire-labs/mitigating-an-aws-instance-failure-with-the-magic-of-kubernetes-128a44d44c14 and "a sense of awe watching the automatic mitigation".

Unless you have a volume type that can migrate, the problem is that stateful pods will only start in the zone where the volume was provisioned. With both 5- and 7-node zk across 3 zones, if a zone with 2 or 3 zk pods respectively goes out, you're -1 pod away from losing a majority of your zk. My assumption is that lost majority means your service goes down. Zone outages can be extensive, as in the AWS case above, and due to zk's static configuration you can't reconfigure to adapt to the situation, as doing so would itself cause the -1.

With kafka brokers you can throw money at the problem: increase your replication factor. With zk you can't. Or maybe you can, with scale=9?


solsson avatar solsson commented on July 24, 2024

While having only 3 EBS volumes instead of 5 does keep costs low, to avoid confusion, it would be better to have a single statefulset of pzoo with 5 nodes.

@shrinandj I think I agree at this stage. What would be even better, in particular now (unlike in the k8s 1.2 days) that support for automatic volume provisioning can be expected, would be to support scaling of the zookeeper statefulset(s). That way everyone can decide for themselves, and we can default to 5 persistent pods. Should be quite doable in the init script, by retrieving the desired number of replicas with kubectl. I'd be happy to accept PRs for such things.


shrinandj avatar shrinandj commented on July 24, 2024

Can you elaborate a bit on that?

  • The default will be a statefulset with 5 pods.
  • Users can scale this up if needed by increasing the replica count from 5 using kubectl scale statefulsets pzoo --replicas=<new-replicas>. This should create the new PVCs and then run the pods.
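For concreteness, that step might look something like this (namespace and numbers are illustrative):

```bash
# Scale the persistent ZooKeeper StatefulSet up, e.g. from 5 to 7 replicas
kubectl -n kafka scale statefulsets pzoo --replicas=7

# The StatefulSet controller creates the new PVCs and pods one ordinal at a time
kubectl -n kafka get pvc
kubectl -n kafka get pods -w

# Note: the new members only join the ensemble once the server.X entries in the
# ZooKeeper config cover them, which is what the init-script change is about.
```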

What changes are required in the init script?


solsson avatar solsson commented on July 24, 2024

Sounds like a good summary, and my ideas for how are sketchy at best. Sadly(?) this repo has come of age already and needs to consider backwards compatibility. Hence we might want a multi-step solution:

  1. Add volume claims to the zoo statefulset, keep the init script as is.
  2. Add an ezoo (ephemeral) statefulset as a copy of the "old" zoo, for the multi-zone frugal use case, but with replicas=0.
  3. Include the above kubernetes-kafka release.
  4. Add a branch (for evaluation by those who dare) that generates the server entries based on kubectl -n kafka get statefulset zoo -o=jsonpath='{.status.replicas}' (and equivalent for pzoo - deprecated - and ezoo).
  5. If this is looking good, change defaults to replicas=5 for zoo and replicas=0 for pzoo+ezoo, with a documented migration procedure in release notes.


AndresPineros avatar AndresPineros commented on July 24, 2024

@solsson I understand that the steps mentioned above are needed for backwards compatibility, but if I just want 5 pzoos, I only need to change the replicas to 5 and remove the zoo statefulset, right?


solsson avatar solsson commented on July 24, 2024

@AndresPineros You'll also need to change the server.4 and server.5 lines in 10zookeeper-config.yml and prepend the p.
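In other words, the last two ensemble entries in the ConfigMap have to point at the extra pzoo pods. Presumably something along these lines; the exact address format and myid offsets depend on the version of 10zookeeper-config.yml and the init script, so treat this as illustrative:

```
# before: the last two members point at the ephemeral zoo StatefulSet
server.4=zoo-0.zoo:2888:3888
server.5=zoo-1.zoo:2888:3888

# after: point them at the two extra persistent pzoo pods
server.4=pzoo-3.pzoo:2888:3888
server.5=pzoo-4.pzoo:2888:3888
```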


solsson avatar solsson commented on July 24, 2024

See #191 (comment) for the suggested way forward.

