Comments (9)
@ashahba That is not a good idea. There is a fix for this in #37. There was an error in how pod anti-affinity was set up.
from vck.
@ashahba We already do what you suggested, in a way, here, but in this case that was not the cause.
from vck.
At the moment, disk pressure is not taken into consideration while scheduling the pod. That's one of the things we'd like to include, and it should be possible once we implement the reconciler. Right now, pods are scheduled on nodes even with high disk pressure, which could result in pod and CR failures.
from vck.
NAME                                                READY   STATUS    RESTARTS   AGE   IP             NODE
kvc-845d468d84-gd4m5                                1/1     Running   0          25m   10.72.13.156   gke-dls-us-n1-standard-4-1ba3a893-gvr1
kvc-resource-8360b237-4cce-11e8-9e1e-0a580a480d9c   1/1     Running   0          1m    10.72.0.227    gke-dls-us-n1-highmem-8-skylake-82af83b4-7gc4
kvc-resource-836203d3-4cce-11e8-9e1e-0a580a480d9c   1/1     Running   0          1m    10.72.0.228    gke-dls-us-n1-highmem-8-skylake-82af83b4-7gc4
kvc-resource-83630e86-4cce-11e8-9e1e-0a580a480d9c   1/1     Running   0          1m    10.72.21.145   gke-dls-us-n1-highmem-8-skylake-82af83b4-w7jt
kvc-resource-83642634-4cce-11e8-9e1e-0a580a480d9c   1/1     Running   0          1m    10.72.19.150   gke-dls-us-n1-highmem-8-skylake-82af83b4-8nvh
kvc-resource-8365b972-4cce-11e8-9e1e-0a580a480d9c   1/1     Running   0          1m    10.72.0.229    gke-dls-us-n1-highmem-8-skylake-82af83b4-7gc4
kvc-resource-83675c8b-4cce-11e8-9e1e-0a580a480d9c   1/1     Running   0          1m    10.72.16.149   gke-dls-us-n1-standard-4-1ba3a893-wvlw
from vck.
We probably need a check: if `replicas` is greater than the number of hosts provided in `nodeAffinity`, then reduce the replicas to match the number of hosts.
from vck.
@ashahba I don't understand your suggestion here. Can you explain?
from vck.
@balajismaniam I think if the user asked for `6` replicas, but while populating `nodeAffinity` during scheduling we only found `3` nodes that meet the criteria (this could be due to several reasons, for example the rest of the nodes don't have enough disk space left), then we should not deploy pods multiple times on the same node just to meet the requested replicas.
We need to decide what action we take in that scenario before implementing a solution.
My suggestion is: Print an error message to users notifying them that they can only ask for (for example) 3 replicas at the most.
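To make the two options discussed in this thread concrete, here is a minimal sketch in Go. It is not taken from the vck code base: `checkReplicas`, `clampReplicas`, and the node names are hypothetical, and it assumes the list of eligible nodes has already been computed elsewhere.

```go
package main

import (
	"errors"
	"fmt"
)

// checkReplicas is a hypothetical helper (not part of vck). It returns an
// error when more replicas are requested than there are eligible nodes,
// telling the user the maximum they can ask for.
func checkReplicas(requested int, eligibleNodes []string) error {
	if requested <= 0 {
		return errors.New("replicas must be a positive integer")
	}
	if requested > len(eligibleNodes) {
		return fmt.Errorf(
			"requested %d replicas but only %d nodes meet the scheduling criteria; ask for at most %d",
			requested, len(eligibleNodes), len(eligibleNodes))
	}
	return nil
}

// clampReplicas is the alternative policy (also hypothetical): silently
// reduce the replica count to the number of eligible nodes.
func clampReplicas(requested int, eligibleNodes []string) int {
	if requested > len(eligibleNodes) {
		return len(eligibleNodes)
	}
	return requested
}

func main() {
	// Assumed example: 6 replicas requested, only 3 nodes qualify.
	nodes := []string{"node-a", "node-b", "node-c"}
	if err := checkReplicas(6, nodes); err != nil {
		fmt.Println("error:", err)
	}
	fmt.Println("clamped replicas:", clampReplicas(6, nodes))
}
```

Returning an error keeps the user's request explicit, while clamping hides the mismatch; which one is preferable is the decision raised above.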
from vck.
Thanks @balajismaniam and @Ajay191191.
One last question:
What if `/mnt/stateful_partition` is `100%` full on half of the nodes and the user asks for exactly the same number of replicas as `len(nodeList)`? Do we end up placing multiple pods on the same node, which in turn would copy the data to the same node twice?
from vck.
Fixed by #37.
from vck.