Comments (10)
@VikasAbhishek Can you post the response of http://${Host}:${Port}/v1/models
> @VikasAbhishek Can you post the response of http://${Host}:${Port}/v1/models
*   Trying 127.0.0.1:8080...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8080 (#0)
> GET /v1/models HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/7.68.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 404 Not Found
< date: Mon, 13 May 2024 06:49:27 GMT
< server: istio-envoy
< content-length: 0
<
* Connection #0 to host localhost left intact
Have you added the Host header?
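In case it helps, here is a minimal sketch of probing the models endpoint with an explicit Host header from Python. The hostname below is a placeholder; with Istio in front, routing is done on the InferenceService hostname, not localhost:

import requests

# Placeholder hostname: take the real value from the InferenceService status,
# e.g. kubectl get inferenceservice <name> -o jsonpath='{.status.url}'
SERVICE_HOSTNAME = "<isvc-name>.<namespace>.example.com"

resp = requests.get(
    "http://localhost:8080/v1/models",
    headers={"Host": SERVICE_HOSTNAME},  # istio-envoy routes on this header
    timeout=10,
)
print(resp.status_code, resp.text)

Without a Host header that matches a route, istio-envoy has nothing to match against, which is consistent with the empty 404 above.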
@VikasAbhishek The response is not visible in your comment. In any case, the /v1/models response lets you verify that the model is ready and shows the model name; try using that name for inference. If the response is empty, the model is likely not loaded, and in that case please verify the model server logs.
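For completeness, a sketch of the two follow-up calls under the v1 protocol. The model name and payload shape are placeholders; use the name that /v1/models actually returns and an input shape that matches your model:

import requests

BASE = "http://localhost:8080"
HEADERS = {"Host": "<isvc-name>.<namespace>.example.com"}  # placeholder, as above
MODEL = "wine-classifier"  # placeholder: the name returned by /v1/models

# Per-model readiness under the v1 protocol
ready = requests.get(f"{BASE}/v1/models/{MODEL}", headers=HEADERS, timeout=10)
print(ready.status_code, ready.text)

# Inference against the same model name; the instances shape is an assumption
payload = {"instances": [[5.1, 3.5, 1.4, 0.2]]}
pred = requests.post(f"{BASE}/v1/models/{MODEL}:predict",
                     headers=HEADERS, json=payload, timeout=10)
print(pred.status_code, pred.text)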
@VikasAbhishek As mentioned earlier, the model is not loaded. Please verify the model server logs and the storage initializer logs.
I have a similar issue using an InferenceService on Azure AKS with an MLflow-tracked model.
This is the structure of my model directory:
└── model
├── MLmodel
├── conda.yaml
├── metadata
│ ├── MLmodel
│ ├── conda.yaml
│ ├── python_env.yaml
│ └── requirements.txt
├── model-settings.json
├── model.pkl
├── python_env.yaml
└── requirements.txt
I followed the Azure guide here and the MLflow guide here, but I can't seem to get the InferenceService to deploy correctly.
This is my model.yml:
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "wine-classifier"
  namespace: "mlflow-kserve-test"
spec:
  predictor:
    serviceAccountName: sa
    model:
      modelFormat:
        name: mlflow
      protocolVersion: v2
      storageUri: "https://{SA}.blob.core.windows.net/azureml/ExperimentRun/dcid.{RUNID}/model"
I tried using storageUri: "https://{SA}.blob.core.windows.net/azureml/ExperimentRun/dcid.{RUNID}/model/model.pkl", but then the service doesn't start properly because it cannot build the environment.
Here are the log outputs from the kserve-container:
Environment tarball not found at '/mnt/models/environment.tar.gz'
Environment not found at './envs/environment'
2024-06-05 14:36:17,236 [mlserver.parallel] DEBUG - Starting response processing loop...
2024-06-05 14:36:17,238 [mlserver.rest] INFO - HTTP server running on http://0.0.0.0:8080
INFO: Started server process [1]
INFO: Waiting for application startup.
2024-06-05 14:36:17,267 [mlserver.metrics] INFO - Metrics server running on http://0.0.0.0:8082
2024-06-05 14:36:17,267 [mlserver.metrics] INFO - Prometheus scraping endpoint can be accessed on http://0.0.0.0:8082/metrics
INFO: Started server process [1]
INFO: Waiting for application startup.
INFO: Application startup complete.
2024-06-05 14:36:18,568 [mlserver.grpc] INFO - gRPC server running on http://0.0.0.0:9000
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8080 (Press CTRL+C to quit)
INFO: Uvicorn running on http://0.0.0.0:8082 (Press CTRL+C to quit)
2024/06/05 14:36:19 WARNING mlflow.pyfunc: Detected one or more mismatches between the model's dependencies and the current Python environment:
- mlflow (current: 2.3.1, required: mlflow==2.12.2)
- cloudpickle (current: 2.2.1, required: cloudpickle==3.0.0)
- numpy (current: 1.23.5, required: numpy==1.24.4)
- packaging (current: 23.1, required: packaging==23.2)
- psutil (current: uninstalled, required: psutil==5.9.8)
- pyyaml (current: 6.0, required: pyyaml==6.0.1)
- scikit-learn (current: 1.2.2, required: scikit-learn==1.3.2)
- scipy (current: 1.9.1, required: scipy==1.10.1)
To fix the mismatches, call `mlflow.pyfunc.get_model_dependencies(model_uri)` to fetch the model's environment and install dependencies using the resulting environment file.
2024-06-05 14:36:19,791 [mlserver] INFO - Couldn't load model 'wine-classifier'. Model will be removed from registry.
2024-06-05 14:36:19,791 [mlserver.parallel] ERROR - An error occurred processing a model update of type 'Load'.
Traceback (most recent call last):
File "/opt/conda/lib/python3.8/site-packages/mlserver/parallel/worker.py", line 158, in _process_model_update
await self._model_registry.load(model_settings)
File "/opt/conda/lib/python3.8/site-packages/mlserver/registry.py", line 293, in load
return await self._models[model_settings.name].load(model_settings)
File "/opt/conda/lib/python3.8/site-packages/mlserver/registry.py", line 148, in load
await self._load_model(new_model)
File "/opt/conda/lib/python3.8/site-packages/mlserver/registry.py", line 165, in _load_model
model.ready = await model.load()
File "/opt/conda/lib/python3.8/site-packages/mlserver_mlflow/runtime.py", line 155, in load
self._model = mlflow.pyfunc.load_model(model_uri)
File "/opt/conda/lib/python3.8/site-packages/mlflow/pyfunc/__init__.py", line 582, in load_model
model_meta = Model.load(os.path.join(local_path, MLMODEL_FILE_NAME))
File "/opt/conda/lib/python3.8/site-packages/mlflow/models/model.py", line 468, in load
return cls.from_dict(yaml.safe_load(f.read()))
File "/opt/conda/lib/python3.8/site-packages/mlflow/models/model.py", line 478, in from_dict
model_dict["signature"] = ModelSignature.from_dict(model_dict["signature"])
File "/opt/conda/lib/python3.8/site-packages/mlflow/models/signature.py", line 83, in from_dict
inputs = Schema.from_json(signature_dict["inputs"])
File "/opt/conda/lib/python3.8/site-packages/mlflow/types/schema.py", line 360, in from_json
return cls([read_input(x) for x in json.loads(json_str)])
File "/opt/conda/lib/python3.8/site-packages/mlflow/types/schema.py", line 360, in <listcomp>
return cls([read_input(x) for x in json.loads(json_str)])
File "/opt/conda/lib/python3.8/site-packages/mlflow/types/schema.py", line 358, in read_input
return TensorSpec.from_json_dict(**x) if x["type"] == "tensor" else ColSpec(**x)
TypeError: __init__() got an unexpected keyword argument 'required'
2024-06-05 14:36:19,793 [mlserver] INFO - Couldn't load model 'wine-classifier'. Model will be removed from registry.
2024-06-05 14:36:19,795 [mlserver.parallel] ERROR - An error occurred processing a model update of type 'Unload'.
Traceback (most recent call last):
File "/opt/conda/lib/python3.8/site-packages/mlserver/parallel/worker.py", line 160, in _process_model_update
await self._model_registry.unload_version(
File "/opt/conda/lib/python3.8/site-packages/mlserver/registry.py", line 302, in unload_version
await model_registry.unload_version(version)
File "/opt/conda/lib/python3.8/site-packages/mlserver/registry.py", line 201, in unload_version
model = await self.get_model(version)
File "/opt/conda/lib/python3.8/site-packages/mlserver/registry.py", line 237, in get_model
raise ModelNotFound(self._name, version)
mlserver.errors.ModelNotFound: Model wine-classifier not found
2024-06-05 14:36:19,796 [mlserver] ERROR - Some of the models failed to load during startup!
Traceback (most recent call last):
File "/opt/conda/lib/python3.8/site-packages/mlserver/server.py", line 125, in start
await asyncio.gather(
File "/opt/conda/lib/python3.8/site-packages/mlserver/registry.py", line 293, in load
return await self._models[model_settings.name].load(model_settings)
File "/opt/conda/lib/python3.8/site-packages/mlserver/registry.py", line 148, in load
await self._load_model(new_model)
File "/opt/conda/lib/python3.8/site-packages/mlserver/registry.py", line 161, in _load_model
model = await callback(model)
File "/opt/conda/lib/python3.8/site-packages/mlserver/parallel/registry.py", line 152, in load_model
loaded = await pool.load_model(model)
File "/opt/conda/lib/python3.8/site-packages/mlserver/parallel/pool.py", line 74, in load_model
await self._dispatcher.dispatch_update(load_message)
File "/opt/conda/lib/python3.8/site-packages/mlserver/parallel/dispatcher.py", line 123, in dispatch_update
return await asyncio.gather(
File "/opt/conda/lib/python3.8/site-packages/mlserver/parallel/dispatcher.py", line 138, in _dispatch_update
return await self._dispatch(worker_update)
File "/opt/conda/lib/python3.8/site-packages/mlserver/parallel/dispatcher.py", line 146, in _dispatch
return await self._wait_response(internal_id)
File "/opt/conda/lib/python3.8/site-packages/mlserver/parallel/dispatcher.py", line 152, in _wait_response
inference_response = await async_response
mlserver.parallel.errors.WorkerError: builtins.TypeError: __init__() got an unexpected keyword argument 'required'
2024-06-05 14:36:19,796 [mlserver.parallel] INFO - Waiting for shutdown of default inference pool...
2024-06-05 14:36:19,997 [mlserver.parallel] INFO - Shutdown of default inference pool complete
2024-06-05 14:36:19,997 [mlserver.grpc] INFO - Waiting for gRPC server shutdown
2024-06-05 14:36:20,001 [mlserver.grpc] INFO - gRPC server shutdown complete
INFO: Shutting down
INFO: Shutting down
INFO: Waiting for application shutdown.
INFO: Waiting for application shutdown.
INFO: Application shutdown complete.
INFO: Finished server process [1]
INFO: Application shutdown complete.
INFO: Finished server process [1]
It manages to create the environment but cannot load the model. Calling mlflow.pyfunc.load_model(model_uri) locally loads the model, and testing with mlserver start . was also successful.
Deployments of real-time endpoints for the tracked model in Azure Machine Learning also work, but I need an alternative for deploying on-prem.
Any help would be much appreciated.
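For what it's worth, the dependency warning and the traceback above point at a version mismatch rather than a storage problem: the model was logged with mlflow 2.12.2, which (as far as I know) writes a "required" field into each column of the signature in MLmodel, while the serving image bundles mlflow 2.3.1, whose ColSpec does not accept that keyword. A minimal sketch reproducing the failure under that assumption (the column name is arbitrary):

# Assumes mlflow==2.3.1, the "current" version reported in the warning above.
from mlflow.types.schema import ColSpec

# Newer MLflow releases serialize "required" per column into the signature;
# ColSpec in 2.3.1 has no such parameter, so parsing the MLmodel raises:
ColSpec(type="double", name="alcohol", required=True)
# TypeError: __init__() got an unexpected keyword argument 'required'

If that is the cause, aligning the serving runtime with the MLflow version the model was logged with (for example via a newer serving image, or the environment tarball the first log lines look for) should let the model load.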
I have a similar issue. I'm pretty sure it's due to the model's dependencies not loading, but the error is less than helpful. In my case I'm using a private PyPI index, so I suspect that part isn't working.
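If the runtime can't reach your index at startup, one way to sidestep installs entirely, hinted at by the "Environment tarball not found at '/mnt/models/environment.tar.gz'" line in the logs above, is to pre-pack the environment and ship it next to the model. A minimal sketch, assuming conda-pack is installed and the model's conda environment is named model-env (both names are placeholders):

# Pack the (hypothetical) "model-env" conda environment into a tarball;
# upload it alongside the model artifacts so it ends up at
# /mnt/models/environment.tar.gz inside the kserve-container.
import conda_pack

conda_pack.pack(name="model-env", output="environment.tar.gz")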