Comments (11)
I'm a little wary of "upon promotion". I don't think our code should have any concept of promotion, as it introduces statefulness into the cluster. Perhaps this is acceptable in the optimization case, but it still smells weird to me.
I do agree that we don't want to be eating the cost of flipping all the resources around.
from kserve.
I might be more comfortable if we tracked a hash of the kfspec. If a knconfiguration owned by our KFService is already serving that spec, we can re-use it. Perhaps we can mark the configuration with a hash?
This optimizes both the case where canary == default and the case of flipping canary to default and deleting canary. We just need to make sure that neither canary nor default is still using a configuration before cleaning it up. Perhaps we can track this with owner references on the configuration?
I don't think this is too different from what you're suggesting. What do you think?
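To make the spec-hash idea concrete, here is a minimal Python sketch of the reconciliation checks described above. Everything in it is hypothetical: the label key, function names, and spec shape are illustrative assumptions, not KFServing's actual implementation (which is written in Go).

```python
import hashlib
import json
from typing import Optional

# Hypothetical label key used to mark a Configuration with its spec hash.
SPEC_HASH_LABEL = "serving.kubeflow.org/kfspec-hash"

def spec_hash(spec: dict) -> str:
    """Deterministic hash of a spec: canonical JSON, then SHA-256."""
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]

def can_reuse_configuration(existing_labels: dict, desired_spec: dict) -> bool:
    """A Configuration can be reused iff it already serves the desired spec."""
    return existing_labels.get(SPEC_HASH_LABEL) == spec_hash(desired_spec)

def safe_to_garbage_collect(config: str, default_ref: str,
                            canary_ref: Optional[str]) -> bool:
    """Only clean a Configuration up once neither endpoint references it."""
    return config != default_ref and config != canary_ref
```

Because the hash is computed over canonically serialized JSON, flipping canary to default reuses the existing Configuration (same hash) instead of recreating it, and the cleanup guard keeps a still-referenced Configuration alive.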
Seems like an optimization that we can punt on for now?
Going back to the philosophy I commented on earlier - why is the focus here on enhancing Knative functionality? If Knative is going to serve 80% of the world's microservices (hypothetically), what is it about model serving that is so fundamentally different? If we are solving Knative's deficiencies, let's target that community and contribute there. The goal here should be NOT to deviate from Knative's defaults, so that the tools and ecosystem which emerge in that community can be used (e.g. monitoring, etc.).
If there are domain-specific use cases in model serving which require a fundamentally different approach than Knative's, that makes sense. To me, a lot of the discussion here seems to be around general enhancements to canary deployments, routing, etc., which ideally should be taken up on the Knative side rather than implemented here.
+1 @animeshsingh. I do think this is an artifact of how we're treating Knative Configurations, and it may not be a common enough pattern for Knative to optimize around; but either Knative should offer a better way for us to use it, or it should implement this optimization.
I think the principle you allude to of "KFServing solves ML problems, Knative solves deployment problems" is a good guiding principle for us as we prioritize our efforts.
@animeshsingh definitely agree with your points. We have discussed the solution with the Knative team, and Knative Serving is designed flexibly enough that its components can support more specific patterns (here we use two Knative Configurations instead of one Knative Service). That said, it may be worth discussing with them whether our pattern is common enough to be pushed down into Knative itself.
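For context, a rough sketch of the two-Configuration pattern discussed above: the KFService's Route splits traffic between a default and a canary Configuration. This is an illustrative Python sketch, not KFServing's actual code; the dictionary field names mirror the `configurationName`/`percent` fields of Knative's Route traffic targets, but everything else is an assumption.

```python
from typing import Dict, List, Optional

def route_traffic(default_config: str, canary_config: Optional[str],
                  canary_percent: int) -> List[Dict]:
    """Build Route traffic targets for the two-Configuration pattern.

    With no canary, 100% of traffic goes to the default Configuration;
    otherwise traffic is split between the two by canary_percent.
    """
    if canary_config is None or canary_percent == 0:
        return [{"configurationName": default_config, "percent": 100}]
    return [
        {"configurationName": default_config, "percent": 100 - canary_percent},
        {"configurationName": canary_config, "percent": canary_percent},
    ]
```

Promotion then becomes a pure traffic change: setting `canary_percent` to 0 (or 100 and swapping the references) updates the Route without flipping the underlying Configurations around.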
/area control-plane
/priority p2
"here we use two knative configurations instead of one service" - where can find the rationale for this? we have many knative contributors in IBM, and i would want to pass it by them to get opinion as well
/area performance
/kind feature