Comments (5)
@anencore94 Can you share your cluster spec?
- If you explicitly specify 10001 in your container spec, it would work.
kuberay/ray-operator/config/samples/ray-cluster.mini.yaml
Lines 60 to 62 in 6743385
- Another way is to remove container ports and controller will open all default ports (10001 and dashboard etc).
kuberay/ray-operator/controllers/common/service.go
Lines 34 to 38 in 7965aab
I think this is a little bit tricky. If you could file a PR for option 1 after your testing, that would be great.
from kuberay.
I'm just using a minikube node (k8s-version=v1.19.2, cpus=4, driver=docker) for testing before using kuberay on my multi-node cluster at work.
If you explicitly specify 10001 in your container spec, it would work.
Thank you! Connecting to my k8s RayCluster from outside the k8s cluster with the Python client worked with your suggestion, but now my raycluster-mini-head service only exposes 10001.
Here's the resulting svc:
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2021-09-30T06:19:52Z"
  labels:
    ray.io/cluster: raycluster-mini
    ray.io/identifier: raycluster-mini-head
    ray.io/node-type: head
  name: raycluster-mini-head-svc
  namespace: ray-system
  ownerReferences:
  - apiVersion: ray.io/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: RayCluster
    name: raycluster-mini
    uid: 141efbba-65e0-4460-9d20-502320f46252
  resourceVersion: "865"
  selfLink: /api/v1/namespaces/ray-system/services/raycluster-mini-head-svc
  uid: 506e0a71-9e5d-4fd9-8e22-9b0d55bc797f
spec:
  clusterIP: 10.100.96.184
  ports:
  - port: 10001
    protocol: TCP
    targetPort: 10001
  selector:
    ray.io/cluster: raycluster-mini
    ray.io/identifier: raycluster-mini-head
    ray.io/node-type: head
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}
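To compare what a Service exposes across different cluster specs, a small script can list the entries under spec.ports. The following is only a sketch: it assumes the manifest was exported as JSON (for example with `kubectl get svc raycluster-mini-head-svc -n ray-system -o json`), and the helper name `service_ports` and the trimmed `SERVICE_JSON` sample are made up for illustration.

```python
import json
from typing import List, Tuple


def service_ports(manifest: dict) -> List[Tuple[str, int]]:
    """Return (name, port) for every entry under spec.ports.

    Unnamed ports are reported with an empty string as their name.
    """
    ports = manifest.get("spec", {}).get("ports", [])
    return [(p.get("name", ""), p["port"]) for p in ports]


# Trimmed, hand-written JSON equivalent of the Service shown above (assumption:
# only the fields needed for this example are kept).
SERVICE_JSON = """
{
  "kind": "Service",
  "metadata": {"name": "raycluster-mini-head-svc", "namespace": "ray-system"},
  "spec": {
    "ports": [{"port": 10001, "protocol": "TCP", "targetPort": 10001}]
  }
}
"""

for name, port in service_ports(json.loads(SERVICE_JSON)):
    print(f"{name or '<unnamed>'}: {port}")
```

For the Service above this reports a single unnamed port, 10001, which matches the problem described: 6379 and 8265 are missing.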
From this raycluster.yaml file:
apiVersion: ray.io/v1alpha1
kind: RayCluster
metadata:
  labels:
    controller-tools.k8s.io: "1.0"
  # A unique identifier for the head node and workers of this cluster.
  name: raycluster-mini
spec:
  rayVersion: '0.8.6' # should match the Ray version in the image of the containers
  ######################headGroupSpecs#################################
  # head group template and specs, (perhaps 'group' is not needed in the name)
  headGroupSpec:
    # Kubernetes Service Type, valid values are 'ClusterIP', 'NodePort' and 'LoadBalancer'
    serviceType: ClusterIP
    # the pod replicas in this group typed head (assuming there could be more than 1 in the future)
    replicas: 1
    # logical group name, for this called head-group, also can be functional
    # pod type head or worker
    # rayNodeType: head # Not needed since it is under the headgroup
    # the following params are used to complete the ray start: ray start --head --block --redis-port=6379 ...
    rayStartParams:
      port: '6379' # should match headService targetPort
      object-manager-port: '12345'
      node-manager-port: '12346'
      #include_webui: 'true'
      object-store-memory: '100000000'
      redis-password: 'LetMeInRay'
      # webui_host: "10.1.2.60"
      dashboard-host: '0.0.0.0'
      num-cpus: '1' # can be auto-completed from the limits
      node-ip-address: $MY_POD_IP # auto-completed as the head pod IP
    #pod template
    template:
      metadata:
        labels:
          # custom labels. NOTE: do not define custom labels starting with `raycluster.`; they may be used by the controller.
          # Refer to https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/
          rayCluster: raycluster-sample # will be injected if missing
          rayNodeType: head # will be injected if missing, must be head or worker
          groupName: headgroup # will be injected if missing
        # annotations for pod
        annotations:
          key: value
      spec:
        containers:
        - name: ray-head
          image: rayproject/ray:1.6.0
          #image: rayproject/ray:nightly
          #image: bonsaidev.azurecr.io/bonsai/lazer-0-9-0-cpu:dev
          # you can have any command and args here to run your code.
          # the below command/args will be appended after the Ray start command and its args, and executed after Ray start.
          command: ["sleep"]
          args:
          - '10000'
          env:
          - name: MY_POD_IP
            valueFrom:
              fieldRef:
                fieldPath: status.podIP
          ports:
          - containerPort: 6379
          - containerPort: 8265 # Ray dashboard
          - containerPort: 10001 #################### this is the only change ####################
Also, regarding your second comment:
Another way is to remove container ports and controller will open all default ports (10001 and dashboard etc).
The svc actually exposed all default ports like this:
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2021-09-30T06:29:04Z"
  labels:
    ray.io/cluster: raycluster-mini
    ray.io/identifier: raycluster-mini-head
    ray.io/node-type: head
  name: raycluster-mini-head-svc
  namespace: ray-system
  ownerReferences:
  - apiVersion: ray.io/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: RayCluster
    name: raycluster-mini
    uid: 00aebf70-3fb5-4aed-a469-d71531247469
  resourceVersion: "1293"
  selfLink: /api/v1/namespaces/ray-system/services/raycluster-mini-head-svc
  uid: ae39272d-f257-490c-a092-c4fa72268194
spec:
  clusterIP: 10.106.64.226
  ports:
  - name: client
    port: 10001
    protocol: TCP
    targetPort: 10001
  - name: redis
    port: 6379
    protocol: TCP
    targetPort: 6379
  - name: dashboard
    port: 8265
    protocol: TCP
    targetPort: 8265
  selector:
    ray.io/cluster: raycluster-mini
    ray.io/identifier: raycluster-mini-head
    ray.io/node-type: head
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}
I found the reason 😄:
ports:
- containerPort: 6379
- containerPort: 8265 # Ray dashboard
- containerPort: 10001
When I specify the ports this way, port.name is the same for every element (no name is set), so the entries overwrite one another and only one port survives.
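The overwriting described above can be illustrated with a toy model. This is not the kuberay controller's code, which is not shown in this thread; it only assumes that ports end up in some structure keyed by port name, and the function name `collapse_by_name` is hypothetical.

```python
from typing import Dict, List


def collapse_by_name(container_ports: List[dict]) -> Dict[str, int]:
    """Toy model: index container ports by name, as a name-keyed map would.

    Ports that share a name (including the empty name of unnamed ports)
    overwrite each other, so only the last one is kept.
    """
    by_name: Dict[str, int] = {}
    for port in container_ports:
        by_name[port.get("name", "")] = port["containerPort"]
    return by_name


unnamed = [
    {"containerPort": 6379},
    {"containerPort": 8265},
    {"containerPort": 10001},
]
named = [
    {"containerPort": 6379, "name": "a"},
    {"containerPort": 8265, "name": "b"},
    {"containerPort": 10001, "name": "c"},
]

print(collapse_by_name(unnamed))  # {'': 10001}
print(collapse_by_name(named))    # {'a': 6379, 'b': 8265, 'c': 10001}
```

Under that assumption, the unnamed list collapses to its last entry, 10001, which is consistent with the Service that exposed only that port.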
So I changed my YAML to this:
ports:
- containerPort: 6379
  name: a
- containerPort: 8265 # Ray dashboard
  name: b
- containerPort: 10001
  name: c
Then finally, generated svc exposes all ports as expected:
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2021-09-30T08:17:55Z"
  labels:
    ray.io/cluster: raycluster-mini
    ray.io/identifier: raycluster-mini-head
    ray.io/node-type: head
  name: raycluster-mini-head-svc
  namespace: ray-system
  ownerReferences:
  - apiVersion: ray.io/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: RayCluster
    name: raycluster-mini
    uid: 29353c10-5127-4fb4-b2a3-94688007dd05
  resourceVersion: "562"
  selfLink: /api/v1/namespaces/ray-system/services/raycluster-mini-head-svc
  uid: bba0da0b-329f-41d3-92c7-28bd8428291d
spec:
  clusterIP: 10.109.63.104
  ports:
  - name: a
    port: 6379
    protocol: TCP
    targetPort: 6379
  - name: b
    port: 8265
    protocol: TCP
    targetPort: 8265
  - name: c
    port: 10001
    protocol: TCP
    targetPort: 10001
  selector:
    ray.io/cluster: raycluster-mini
    ray.io/identifier: raycluster-mini-head
    ray.io/node-type: head
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}
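When cluster specs are generated by a script rather than written by hand, the same workaround (give every container port its own name) could be applied before the manifest is submitted. The sketch below is an illustration only: `ensure_port_names` and the `port-<number>` naming scheme are made up here, not anything kuberay provides.

```python
from typing import List


def ensure_port_names(ports: List[dict]) -> List[dict]:
    """Return a copy of the container ports in which every entry has a unique name.

    Unnamed ports get the hypothetical name "port-<containerPort>";
    existing names are kept, and a duplicate name raises ValueError.
    """
    seen = set()
    result = []
    for port in ports:
        entry = dict(port)  # copy so the caller's list is left untouched
        name = entry.get("name") or f"port-{entry['containerPort']}"
        if name in seen:
            raise ValueError(f"duplicate port name: {name}")
        seen.add(name)
        entry["name"] = name
        result.append(entry)
    return result


head_ports = [
    {"containerPort": 6379},
    {"containerPort": 8265},
    {"containerPort": 10001},
]
for entry in ensure_port_names(head_ports):
    print(entry["name"], entry["containerPort"])
```

Raising on duplicates, instead of silently renaming, keeps a clash visible to whoever wrote the spec.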
I guess we can close this issue. /close