Comments (13)
Can you check the logs of the cluster-manager and confirm that the pods are running? There should always be a status, so we need to track this down further.
from example-seldon.
What specifically do I need to look for? Kubeflow starts so many components that it's hard to find my way around.
!kubectl get pods -n kubeflow
NAME READY STATUS RESTARTS AGE
ambassador-c9647fb66-fl4zr 1/1 Running 0 1d
ambassador-c9647fb66-g6n9r 1/1 Running 0 1d
ambassador-c9647fb66-z7p27 1/1 Running 0 1d
argo-ui-755fcfc656-s2rgl 1/1 Running 0 1d
centraldashboard-7c948d9df6-jh8zj 1/1 Running 0 1d
jupyter-0 1/1 Running 0 1d
jupyter-web-app-6ffc57d749-mqtgr 0/1 CrashLoopBackOff 318 1d
katib-ui-6dc644d54-jg6mj 1/1 Running 0 1d
kubeflow-r-train-srxtq-1399384440 0/1 Completed 0 23h
kubeflow-sk-train-6llnn-122502152 0/1 Completed 0 23h
kubeflow-tf-train-nc5kg-1269457206 0/1 Completed 0 23h
metacontroller-0 1/1 Running 0 1d
minio-b7595688d-4xhbq 1/1 Running 0 1d
ml-pipeline-59459675dd-npjh6 1/1 Running 0 1d
ml-pipeline-persistenceagent-7f6d4555d7-hdkmn 1/1 Running 1 1d
ml-pipeline-scheduledworkflow-5f4d44fb4f-65xt9 1/1 Running 0 1d
ml-pipeline-ui-f5d595697-z8cl5 1/1 Running 0 1d
ml-pipeline-viewer-controller-deployment-5b4954fb4c-4ldm8 1/1 Running 0 1d
mnist-train-5-worker-0 0/1 Completed 0 23h
mykubeflowapp2-controller-b5677fccf-5fpsm 1/1 Running 0 1d
mysql-5b7578d9f5-8mjld 1/1 Running 0 1d
notebooks-controller-9c5f6b7f5-t2xlh 1/1 Running 0 1d
profiles-7bfcbd5f76-2ht9w 1/1 Running 0 1d
pytorch-operator-847d884f4d-cvwpm 1/1 Running 0 1d
r-train-mfs75 0/1 Completed 0 23h
sk-train-svnwb 0/1 Completed 0 23h
spartakus-volunteer-7787b4cf54-z79tj 1/1 Running 0 1d
studyjob-controller-5995857687-46xrn 1/1 Running 0 1d
tf-job-dashboard-c899cd664-94wtf 1/1 Running 0 1d
tf-job-operator-785546f859-rfzrm 1/1 Running 0 1d
vizier-core-6d56d75f76-969ks 1/1 Running 3 1d
vizier-core-rest-79bdbfbfb8-qnvz9 1/1 Running 0 1d
vizier-db-79d57d5667-f7nst 1/1 Running 0 1d
vizier-suggestion-bayesianoptimization-759f6c56c8-54p6x 1/1 Running 0 1d
vizier-suggestion-grid-59f7f5646d-fqcfg 1/1 Running 0 1d
vizier-suggestion-hyperband-84b8ddc658-xm9fb 1/1 Running 0 1d
vizier-suggestion-random-64b4467f6b-gptpl 1/1 Running 0 1d
workflow-controller-8564bd964f-df7x2 1/1 Running 0 1d
I don't see the seldon cluster-manager. Did you install seldon as per the docs?
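For anyone doing the same check on a long listing, here is a minimal sketch that scans `kubectl get pods` output for pods in an unexpected state and for a missing component. It assumes the five-column layout shown in the listing above (NAME READY STATUS RESTARTS AGE); the sample rows are copied from that listing, and the function names are made up for illustration.

```python
# Sample rows copied from the listing above: NAME READY STATUS RESTARTS AGE
SAMPLE = """\
NAME READY STATUS RESTARTS AGE
ambassador-c9647fb66-fl4zr 1/1 Running 0 1d
jupyter-web-app-6ffc57d749-mqtgr 0/1 CrashLoopBackOff 318 1d
mnist-train-5-worker-0 0/1 Completed 0 23h
vizier-core-6d56d75f76-969ks 1/1 Running 3 1d
"""

# Statuses treated as fine for this check (an assumption, adjust as needed).
HEALTHY = {"Running", "Completed"}


def parse_pods(text):
    """Turn whitespace-separated pod rows into dicts, skipping the header."""
    pods = []
    for line in text.splitlines():
        fields = line.split()
        # Assumes exactly five columns per row, as in the listing above.
        if len(fields) != 5 or fields[0] == "NAME":
            continue
        name, ready, status, restarts, age = fields
        pods.append({
            "name": name,
            "ready": ready,
            "status": status,
            "restarts": int(restarts),
            "age": age,
        })
    return pods


def unhealthy(pods):
    """Names of pods whose STATUS is neither Running nor Completed."""
    return [p["name"] for p in pods if p["status"] not in HEALTHY]


def has_component(pods, keyword):
    """True if any pod name contains the keyword, e.g. 'seldon'."""
    return any(keyword in p["name"] for p in pods)


pods = parse_pods(SAMPLE)
print("unhealthy:", unhealthy(pods))
print("seldon present:", has_component(pods, "seldon"))
```

On the sample rows this flags only the `jupyter-web-app` pod and reports that no pod name contains `seldon`, which matches what was spotted by eye.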
Yes, but I gather it was not successful. I will try again.
Thank you
The deployment worked this time and the cluster manager is up:
dlan@loadclient:~$ kubectl get pods --all-namespaces | grep seldon
kube-system seldon-spartakus-volunteer-57647c7679-vb6pt 1/1 Running 0 1d
kubeflow seldon-core-ambassador-6bb6fb974d-qwg79 1/1 Running 0 1m
kubeflow seldon-core-redis-685dd67c95-grv2h 1/1 Running 0 1m
kubeflow seldon-core-seldon-cluster-manager-dd8497ccf-xtm46 1/1 Running 0 1m
However I am still getting an error calling the prediction service.
ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response',))
The port forward window gives me the following:
dlan@loadclient:~$ kubectl port-forward $(kubectl get pods -n kubeflow -l service=ambassador -o jsonpath='{.items[0].metadata.name}') -n kubeflow 8002:80
Forwarding from 127.0.0.1:8002 -> 80
Forwarding from [::1]:8002 -> 80
Handling connection for 8002
E0419 21:38:55.183309 12957 portforward.go:400] an error occurred forwarding 8002 -> 80: error forwarding port 80 to pod baa7cdd3e0fc3d4ce1d30ff49cd8602421ebce99f6895fdb5aa70e1e362051f9, uid : exit status 1: 2019/04/19 21:38:55 socat[9620] E connect(6, AF=2 127.0.0.1:80, 16): Connection refused
Handling connection for 8002
E0419 21:38:58.731598 12957 portforward.go:400] an error occurred forwarding 8002 -> 80: error forwarding port 80 to pod baa7cdd3e0fc3d4ce1d30ff49cd8602421ebce99f6895fdb5aa70e1e362051f9, uid : exit status 1: 2019/04/19 21:38:58 socat[9798] E connect(6, AF=2 127.0.0.1:80, 16): Connection refused
Handling connection for 8002
E0419 21:39:27.769533 12957 portforward.go:400] an error occurred forwarding 8002 -> 80: error forwarding port 80 to pod baa7cdd3e0fc3d4ce1d30ff49cd8602421ebce99f6895fdb5aa70e1e362051f9, uid : exit status 1: 2019/04/19 21:39:27 socat[10904] E connect(6, AF=2 127.0.0.1:80, 16): Connection refused
OK. Can you check whether Ambassador still exposes port 80 or has moved to 8080 now?
I have two Ambassador services:
ambassador ClusterIP 10.0.233.236 80/TCP
seldon-core-ambassador NodePort 10.0.158.182 80:30489/TCP, 443:31294/TCP
Thanks for your help.
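As a reading aid for the service listing above: in `kubectl get svc` output, a PORT(S) entry written as `80:30489/TCP` means service port 80 is also exposed on node port 30489, while a plain `80/TCP` has no node port. A small sketch that splits such a value apart (the function name is made up for illustration):

```python
def parse_ports(ports_column):
    """Split a PORT(S) value such as '80:30489/TCP, 443:31294/TCP'."""
    result = []
    for item in ports_column.split(","):
        item = item.strip()
        if not item:
            continue
        # 'port[:nodePort]/PROTOCOL'
        spec, _, protocol = item.partition("/")
        port, _, node_port = spec.partition(":")
        result.append({
            "port": int(port),
            "node_port": int(node_port) if node_port else None,
            "protocol": protocol or "TCP",
        })
    return result


# Values copied from the two services listed above.
print(parse_ports("80/TCP"))
print(parse_ports("80:30489/TCP, 443:31294/TCP"))
```

So the `seldon-core-ambassador` service should be reachable on node port 30489 (HTTP) of any node, whereas the `ambassador` ClusterIP service is only reachable inside the cluster or through a port-forward.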
I would try connecting to each Ambassador directly to see which one works, and also check the Ambassador diagnostics.
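One way to compare the two endpoints is a small HTTP probe. Note that a plain TCP connect is not enough here: as the port-forward log above shows, `kubectl port-forward` accepts the local connection ("Handling connection for 8002") and only then fails to reach the pod, which the client sees as a connection closed without a response. The sketch below uses only the Python standard library; `probe_http` is a made-up helper name, and which ports you pass depends on your own forwards.

```python
import http.client
import socket


def probe_http(host, port, path="/", timeout=3.0):
    """Return a short description of how an HTTP GET to host:port ended."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        return f"HTTP {resp.status}"
    except http.client.RemoteDisconnected:
        # Same symptom as the ConnectionError reported earlier in the thread:
        # the connection was accepted, then closed with no response.
        return "closed without response"
    except ConnectionRefusedError:
        # Nothing is listening on host:port at all.
        return "connection refused"
    except (socket.timeout, OSError) as exc:
        return f"error: {exc.__class__.__name__}"
    finally:
        conn.close()
```

For example, with the existing forward on local port 8002 and a second, hypothetical forward on 8003 pointing at the other Ambassador, calling `probe_http("127.0.0.1", 8002)` and `probe_http("127.0.0.1", 8003)` would show which one returns an HTTP status and which one is dropped. Any HTTP status, even a 404, means the request reached something that speaks HTTP.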
I can hit the predictor directly and it works fine. The routes look fine in ambassador. However I do not see requests in the ambassador logs.
Any suggestions?
I'll keep looking around.
Sorry, missed this. I think you won't see requests in the Ambassador logs by default, as Ambassador doesn't log every request. Are the requests working?
The requests were not working. I've recycled this cluster. I'll bring up a fresh one and see if there is a repro.
Thank you.