
Comments (13)

ukclivecox avatar ukclivecox commented on May 29, 2024

Can you check the logs of the cluster-manager and confirm that the pods are running? There should always be a status, so we need to track this down further.

from example-seldon.

DavidLangworthy avatar DavidLangworthy commented on May 29, 2024

What specifically do I need to look for? Kubeflow starts up so much it's hard to find my way around.


DavidLangworthy avatar DavidLangworthy commented on May 29, 2024

!kubectl get pods -n kubeflow

NAME READY STATUS RESTARTS AGE
ambassador-c9647fb66-fl4zr 1/1 Running 0 1d
ambassador-c9647fb66-g6n9r 1/1 Running 0 1d
ambassador-c9647fb66-z7p27 1/1 Running 0 1d
argo-ui-755fcfc656-s2rgl 1/1 Running 0 1d
centraldashboard-7c948d9df6-jh8zj 1/1 Running 0 1d
jupyter-0 1/1 Running 0 1d
jupyter-web-app-6ffc57d749-mqtgr 0/1 CrashLoopBackOff 318 1d
katib-ui-6dc644d54-jg6mj 1/1 Running 0 1d
kubeflow-r-train-srxtq-1399384440 0/1 Completed 0 23h
kubeflow-sk-train-6llnn-122502152 0/1 Completed 0 23h
kubeflow-tf-train-nc5kg-1269457206 0/1 Completed 0 23h
metacontroller-0 1/1 Running 0 1d
minio-b7595688d-4xhbq 1/1 Running 0 1d
ml-pipeline-59459675dd-npjh6 1/1 Running 0 1d
ml-pipeline-persistenceagent-7f6d4555d7-hdkmn 1/1 Running 1 1d
ml-pipeline-scheduledworkflow-5f4d44fb4f-65xt9 1/1 Running 0 1d
ml-pipeline-ui-f5d595697-z8cl5 1/1 Running 0 1d
ml-pipeline-viewer-controller-deployment-5b4954fb4c-4ldm8 1/1 Running 0 1d
mnist-train-5-worker-0 0/1 Completed 0 23h
mykubeflowapp2-controller-b5677fccf-5fpsm 1/1 Running 0 1d
mysql-5b7578d9f5-8mjld 1/1 Running 0 1d
notebooks-controller-9c5f6b7f5-t2xlh 1/1 Running 0 1d
profiles-7bfcbd5f76-2ht9w 1/1 Running 0 1d
pytorch-operator-847d884f4d-cvwpm 1/1 Running 0 1d
r-train-mfs75 0/1 Completed 0 23h
sk-train-svnwb 0/1 Completed 0 23h
spartakus-volunteer-7787b4cf54-z79tj 1/1 Running 0 1d
studyjob-controller-5995857687-46xrn 1/1 Running 0 1d
tf-job-dashboard-c899cd664-94wtf 1/1 Running 0 1d
tf-job-operator-785546f859-rfzrm 1/1 Running 0 1d
vizier-core-6d56d75f76-969ks 1/1 Running 3 1d
vizier-core-rest-79bdbfbfb8-qnvz9 1/1 Running 0 1d
vizier-db-79d57d5667-f7nst 1/1 Running 0 1d
vizier-suggestion-bayesianoptimization-759f6c56c8-54p6x 1/1 Running 0 1d
vizier-suggestion-grid-59f7f5646d-fqcfg 1/1 Running 0 1d
vizier-suggestion-hyperband-84b8ddc658-xm9fb 1/1 Running 0 1d
vizier-suggestion-random-64b4467f6b-gptpl 1/1 Running 0 1d
workflow-controller-8564bd964f-df7x2 1/1 Running 0 1d
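
As a rough guide to what to look for in a listing like the one above, the sketch below parses `kubectl get pods` text output, flags pods that are neither `Completed` nor fully ready and `Running`, and checks whether any pod name contains a given substring such as `seldon`. The names `parse_pods`, `is_healthy`, `matching`, and `SAMPLE` are hypothetical helpers written for this illustration, not part of kubectl or Seldon, and `SAMPLE` is just a few rows copied from the listing.

```python
from typing import List, NamedTuple


class Pod(NamedTuple):
    name: str
    ready: str      # e.g. "1/1"
    status: str     # e.g. "Running", "CrashLoopBackOff"
    restarts: int


def parse_pods(listing: str) -> List[Pod]:
    """Parse whitespace-separated `kubectl get pods -n <ns>` output."""
    pods = []
    for line in listing.strip().splitlines():
        fields = line.split()
        # Skip the header row and anything too short to be a pod row.
        if len(fields) < 5 or fields[0] == "NAME":
            continue
        name, ready, status, restarts = fields[:4]
        pods.append(Pod(name, ready, status, int(restarts)))
    return pods


def is_healthy(pod: Pod) -> bool:
    """Finished jobs are fine; everything else should be Running and fully ready."""
    if pod.status == "Completed":
        return True
    ready, _, total = pod.ready.partition("/")
    return pod.status == "Running" and ready == total


def matching(pods: List[Pod], needle: str) -> List[Pod]:
    """Pods whose name contains the given substring."""
    return [p for p in pods if needle in p.name]


# A few rows copied from the listing in this thread.
SAMPLE = """
NAME READY STATUS RESTARTS AGE
ambassador-c9647fb66-fl4zr 1/1 Running 0 1d
jupyter-web-app-6ffc57d749-mqtgr 0/1 CrashLoopBackOff 318 1d
kubeflow-tf-train-nc5kg-1269457206 0/1 Completed 0 23h
vizier-core-6d56d75f76-969ks 1/1 Running 3 1d
"""

if __name__ == "__main__":
    pods = parse_pods(SAMPLE)
    for pod in pods:
        if not is_healthy(pod):
            print("needs attention:", pod.name, pod.status, pod.restarts)
    if not matching(pods, "seldon"):
        print("no pod with 'seldon' in its name was found")
```

On these sample rows, only the `jupyter-web-app` pod is flagged, and no pod name contains `seldon`.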


ukclivecox avatar ukclivecox commented on May 29, 2024

I don't see the seldon cluster-manager. Did you install seldon as per the docs?


DavidLangworthy avatar DavidLangworthy commented on May 29, 2024

Yes, but I gather it was not successful. I will try again.

Thank you


DavidLangworthy avatar DavidLangworthy commented on May 29, 2024

The deployment worked this time and the cluster manager is up:
dlan@loadclient:~$ kubectl get pods --all-namespaces | grep seldon
kube-system seldon-spartakus-volunteer-57647c7679-vb6pt 1/1 Running 0 1d
kubeflow seldon-core-ambassador-6bb6fb974d-qwg79 1/1 Running 0 1m
kubeflow seldon-core-redis-685dd67c95-grv2h 1/1 Running 0 1m
kubeflow seldon-core-seldon-cluster-manager-dd8497ccf-xtm46 1/1 Running 0 1m

However, I am still getting an error when calling the prediction service.

ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response',))

The port forward window gives me the following:

dlan@loadclient:~$ kubectl port-forward $(kubectl get pods -n kubeflow -l service=ambassador -o jsonpath='{.items[0].metadata.name}') -n kubeflow 8002:80
Forwarding from 127.0.0.1:8002 -> 80
Forwarding from [::1]:8002 -> 80
Handling connection for 8002
E0419 21:38:55.183309 12957 portforward.go:400] an error occurred forwarding 8002 -> 80: error forwarding port 80 to pod baa7cdd3e0fc3d4ce1d30ff49cd8602421ebce99f6895fdb5aa70e1e362051f9, uid : exit status 1: 2019/04/19 21:38:55 socat[9620] E connect(6, AF=2 127.0.0.1:80, 16): Connection refused
Handling connection for 8002
E0419 21:38:58.731598 12957 portforward.go:400] an error occurred forwarding 8002 -> 80: error forwarding port 80 to pod baa7cdd3e0fc3d4ce1d30ff49cd8602421ebce99f6895fdb5aa70e1e362051f9, uid : exit status 1: 2019/04/19 21:38:58 socat[9798] E connect(6, AF=2 127.0.0.1:80, 16): Connection refused
Handling connection for 8002
E0419 21:39:27.769533 12957 portforward.go:400] an error occurred forwarding 8002 -> 80: error forwarding port 80 to pod baa7cdd3e0fc3d4ce1d30ff49cd8602421ebce99f6895fdb5aa70e1e362051f9, uid : exit status 1: 2019/04/19 21:39:27 socat[10904] E connect(6, AF=2 127.0.0.1:80, 16): Connection refused


ukclivecox avatar ukclivecox commented on May 29, 2024

OK. Can you check whether Ambassador still exposes port 80 or has moved to 8080 now?


DavidLangworthy avatar DavidLangworthy commented on May 29, 2024

I have two Ambassador services:
ambassador ClusterIP 10.0.233.236 80/TCP
seldon-core-ambassador NodePort 10.0.158.182 80:30489/TCP, 443:31294/TCP

Thanks for your help.


ukclivecox avatar ukclivecox commented on May 29, 2024

I would try connecting to each Ambassador directly to see which one works, and also check the Ambassador diagnostics.
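
One possible way to compare the two Ambassadors is to port-forward each one to its own local port and send a plain HTTP request to both. The sketch below uses only the Python standard library. The function `probe`, the `CANDIDATES` mapping, and local port 8003 are assumptions made for this illustration (8002 is the port used earlier in the thread), and the root path `/` is a placeholder rather than a known prediction route.

```python
import http.client
import urllib.error
import urllib.request

# Hypothetical local endpoints, each assumed to be backed by its own
# `kubectl port-forward` to one of the two Ambassador services.
CANDIDATES = {
    "ambassador": "http://localhost:8002/",
    "seldon-core-ambassador": "http://localhost:8003/",
}


def probe(url: str, timeout: float = 5.0) -> str:
    """Return a short description of what happened when requesting `url`."""
    # An empty ProxyHandler bypasses any system proxy, since the targets are local.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as response:
            return f"HTTP {response.status}"
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (for example 404).
        return f"HTTP {err.code}"
    except (OSError, http.client.HTTPException) as err:
        # Covers refused connections, timeouts, and a remote end that
        # closes the connection without sending a response.
        return f"error: {err}"


if __name__ == "__main__":
    for name, url in CANDIDATES.items():
        print(name, "->", probe(url))
```

Any `HTTP ...` result, even a 404, shows that something is listening behind the forward, whereas an `error: ...` result points at the connection itself, as with the `Connection refused` lines in the port-forward output.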


DavidLangworthy avatar DavidLangworthy commented on May 29, 2024


DavidLangworthy avatar DavidLangworthy commented on May 29, 2024

I can hit the predictor directly and it works fine. The routes look fine in Ambassador. However, I do not see any requests in the Ambassador logs.

Any suggestions?

I'll keep looking around.


ukclivecox avatar ukclivecox commented on May 29, 2024

Sorry, missed this. I don't think you will see requests in the Ambassador logs by default, as Ambassador doesn't log every request. Are the requests working?


DavidLangworthy avatar DavidLangworthy commented on May 29, 2024

The requests were not working. I've recycled this cluster. I'll bring up a fresh one and see if there is a repro.

Thank you.

