Comments (12)
Looks like you are using KServe serverless mode, which uses Knative.
Knative always tries to resolve image tags to digests, which is an operation that requires access to the registry (reference: https://knative.dev/docs/serving/tag-resolution/)
Thus, you may want to try using the digest of your image in the InferenceService, instead of 0.9.0-gpu.
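For example, pinning by digest would look roughly like this (a sketch; the sha256 value is a placeholder for your image's real digest, which you can get with docker inspect --format='{{index .RepoDigests 0}}' yurkoff/torchserve-kfs:0.9.0-gpu):
spec:
  predictor:
    containers:
      - name: kserve-container
        # Digest-pinned image: Knative does not need registry access
        # to resolve the tag, since the digest is already given.
        image: docker.io/yurkoff/torchserve-kfs@sha256:<digest>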
Hi, @israel-hdez, @spolti! Thanks a lot! This works for me!
I edited the ConfigMap config-deployment
microk8s kubectl edit configmap config-deployment -n knative-serving
by adding the following line:
registries-skipping-tag-resolving: "kind.local,ko.local,dev.local,index.docker.io"
and the local image was successfully applied in InferenceService.
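For reference, the edited ConfigMap ends up looking roughly like this (a sketch; unrelated keys omitted):
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-deployment
  namespace: knative-serving
data:
  # Registries listed here are exempt from Knative's tag-to-digest resolution.
  registries-skipping-tag-resolving: "kind.local,ko.local,dev.local,index.docker.io"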
Hi, I never used microk8s before, but there are a few things that might be causing it:
First, shouldn't you use the complete image name instead of just yurkoff/torchserve-kfs:0.9.0-gpu?
Secondly, this looks strange:
"https://index.docker.io/v2 /"
notice the space before the last /.
You might need to investigate why this API address has an extra space at the end.
Hello!
Thanks for the answer.
There is no space there; apparently it was copied incorrectly from Linux. I tried using the full name (docker.io/yurkoff/torchserve-kfs:0.9.0-gpu) too.
Revision "llm-predictor-00001" failed with message: Unable to fetch image "docker.io/yurkoff/torchserve-kfs:0.9.0-gpu": failed to resolve image to digest: Get "https://index.docker.io/v2/": read tcp 10.1.22.219:40004->54.236.113.205:443: read: connection reset by peer.
Interestingly, Kubeflow itself deploys automatically from the local images, but the InferenceService does not.
you might need to do this in your isvc: https://kserve.github.io/website/0.11/modelserving/v1beta1/custom/custom_model/#deploy-the-rest-custom-serving-runtime-on-kserve
Using the SHA digest might be helpful as well.
The PodSpec is exposed inline in the isvc, so any PodSpec field is available, as in the linked example.
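In isvc terms, something like this (an untested sketch, reusing your image name):
spec:
  predictor:
    containers:
      - name: kserve-container
        image: docker.io/yurkoff/torchserve-kfs:0.9.0-gpu
        # Any PodSpec container field can go here, for example:
        imagePullPolicy: IfNotPresent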
I don't quite understand what exactly I need to do. I built the image with Docker. It downloads and deploys successfully in a cluster with Internet access. From that cluster I export the image as a tar file and import it into the cluster without Internet access. For some reason the InferenceService thinks the image does not exist and tries to download it, while a Deployment using the same image considers it present.
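Concretely, roughly these commands (the tar filename is arbitrary; microk8s uses containerd, hence ctr):
docker save yurkoff/torchserve-kfs:0.9.0-gpu -o torchserve-kfs.tar
# on the offline cluster:
microk8s ctr images import torchserve-kfs.tar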
See the InferenceService structure from the link I sent you. imagePullPolicy is a property of the container, which goes under the containers field.
Sorry, but I didn't find any mention of imagePullPolicy in the link provided. However, this parameter is in the description of V1beta1TorchServeSpec.
I tried setting up a local registry. I pushed my image yurkoff/torchserve-kfs:0.9.0-gpu there, but I get the following error:
Message: Revision "llm-predictor-00001" failed with message: Unable to fetch image "127.0.0.1:32000/yurkoff/torchserve-kfs:0.9.0-gpu": failed to resolve image to digest: Get "https://127.0.0.1:32000/v2/": dial tcp 127.0.0.1:32000: connect: connection refused; Get "http://127.0.0.1:32000/v2/": dial tcp 127.0.0.1:32000: connect: connection refused.
This is despite the registry being available:
curl -v http://127.0.0.1:32000/v2/
* Trying 127.0.0.1:32000...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 32000 (#0)
> GET /v2/ HTTP/1.1
> Host: 127.0.0.1:32000
> User-Agent: curl/7.68.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Content-Length: 2
< Content-Type: application/json; charset=utf-8
< Docker-Distribution-Api-Version: registry/2.0
< X-Content-Type-Options: nosniff
< Date: Thu, 16 May 2024 11:03:13 GMT
<
{}
* Connection #0 to host 127.0.0.1 left intact
curl -v http://127.0.0.1:32000/v2/_catalog
* Trying 127.0.0.1:32000...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 32000 (#0)
> GET /v2/_catalog HTTP/1.1
> Host: 127.0.0.1:32000
> User-Agent: curl/7.68.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Content-Type: application/json; charset=utf-8
< Docker-Distribution-Api-Version: registry/2.0
< X-Content-Type-Options: nosniff
< Date: Thu, 16 May 2024 12:00:45 GMT
< Content-Length: 44
<
{"repositories":["yurkoff/torchserve-kfs"]}
* Connection #0 to host 127.0.0.1 left intact
I can't understand what information the InferenceService wants to fetch from outside if everything is available locally.
Hi, what I meant was to use this structure:
spec:
  predictor:
    containers:
      - name: kserve-container
        image: xxx
        ports: xxx
or you can define it in your custom Serving Runtime as well.
Hi, @spolti !
I tried this, same result.
My YAML file:
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: "llm"
  namespace: "kubeflow-megaputer"
spec:
  predictor:
    containers:
      - name: kserve-container
        image: "yurkoff/torchserve-kfs:0.9.0-gpu"
        imagePullPolicy: IfNotPresent
        # storageUri: pvc://torchserve-claim/llm
        env:
          - name: STORAGE_URI
            value: pvc://torchserve-claim/llm
        resources:
          requests:
            cpu: "2"
            memory: 16Gi
            nvidia.com/gpu: "1"
          limits:
            cpu: "4"
            memory: 24Gi
            nvidia.com/gpu: "1"
Nice @israel-hdez, didn't spot it :D