Comments (4)
https://kuttl.dev/docs/#pre-requisites
Maybe we could consider using this tool?
from kuberay.
@Irvingwangjr Kuttl looks cool! I read the README. It seems suitable for testing some sample YAMLs, but it doesn't seem to handle well the case where we need to execute commands inside the Pods to trigger certain behaviors.
Yeah, that might be a problem. Right now we use a TestStep to execute scripts that trigger those behaviors.
Here is an example: we simulate the eviction of the Ray head Pod and check whether the RayJob eventually enters the failed state:
apiVersion: kuttl.dev/v1beta1
kind: TestStep
commands:
  - script: |
      # Look up the head Pod by its label.
      pod_name=$(kubectl get pod -l "xx.com/ray-pod-name=mlrayjob-head-failed-on-running-head-0" -n=ns -o jsonpath="{.items[0].metadata.name}")
      # Fill the container's disk with a large file to provoke an eviction.
      cmd="kubectl exec -it $pod_name -n=ns -c ray-container -- dd if=/dev/zero of=/tmp/test.txt bs=1M count=100000"
      $cmd &
      cmd_pid=$!
      wait $cmd_pid
      exit_code=$?
      # 137 = 128 + 9, i.e. the process was terminated by SIGKILL.
      if [ $exit_code -eq 137 ]; then
        echo "the process was killed, exit with return code of 137"
      else
        echo "the process was not killed by SIGKILL, exit with return code of $exit_code"
      fi
Then we assert that the RayJob is in the Failed state:
apiVersion: kuttl.dev/v1beta1
kind: TestAssert
commands:
  - command: kubectl assert exist-enhanced rayjob/mlrayjob-head-failed-on-running -n=ns --field-selector status.phase=Failed
We also use kube-assert here; it might be helpful.
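The exit-code check in the TestStep script above relies on the shell convention that a process terminated by a signal is reported as 128 plus the signal number, so 137 corresponds to SIGKILL (9). A minimal sketch of that interpretation follows. The helper name `describe_exit_code` is hypothetical and not part of kuttl or kubectl, and the demonstration kills a local child shell rather than a Pod, so it only illustrates the convention, not the eviction itself.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of kuttl or kubectl): turn an exit code
# into a short human-readable description.
describe_exit_code() {
  local code=$1
  if [ "$code" -gt 128 ]; then
    # Shells report "terminated by signal N" as exit code 128 + N.
    local sig=$((code - 128))
    echo "killed by signal $sig ($(kill -l "$sig" 2>/dev/null || echo unknown))"
  elif [ "$code" -eq 0 ]; then
    echo "exited normally"
  else
    echo "exited with error code $code"
  fi
}

# Local demonstration without a cluster: a child shell sends SIGKILL to itself.
code=0
bash -c 'kill -KILL $$' || code=$?
describe_exit_code "$code"
```

In the TestStep, the same function could be called with the `exit_code` captured after `wait`, assuming `kubectl exec` passes the remote command's exit code through.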
https://github.com/open-feature/open-feature-operator
OpenFeature (a CNCF project) has adopted this tool, and its repository also provides some examples.