Comments (5)
We need a better design to fix this issue. Things would fail if a user ran two instances of `python examples/resnet_app.py` simultaneously.
from skypilot.
I'm currently working on sky-executing multiple independent tasks, where each task relies on a different yml config file.
@gmittal and I chatted about this; Gautam, are you taking charge of designing this within the CLI?
We need to support both requirements:
- Support launching scripts on different clusters
- Support launching scripts on the same cluster (mostly for debugging).
We can take inspiration from the Ray CLI:
ray up cluster.yaml
ray up cluster.yaml # Same cluster is used.
ray up cluster.yaml -n <name> # Override the cluster name -> a new cluster is used.
So we could, for example, do:
sky run <script or app>.py
sky run <script or app>.py # Same cluster is used.
sky run <script or app>.py --cluster=<name> # A new cluster is used.
# However, it remains to be seen how to conveniently override the cluster name
# when sky.execute(dag) is called from within the script itself:
python app.py
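One way to think about the open question above is to let the script itself resolve which cluster name to use before it calls `sky.execute(dag)`. The following is only a minimal sketch under stated assumptions, not the design adopted in the CLI: the helper `resolve_cluster_name`, the `SKY_CLUSTER` environment variable, and the `sky-<script>` default naming are all hypothetical and are not part of SkyPilot. Only the `--cluster` flag name comes from the proposal above.

```python
import argparse
import os
from pathlib import Path
from typing import Mapping, Optional, Sequence


def resolve_cluster_name(
    script_path: str,
    argv: Optional[Sequence[str]] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Hypothetical helper: choose a cluster name for a script.

    Precedence: --cluster flag, then the (hypothetical) SKY_CLUSTER
    environment variable, then a default derived from the script name.
    """
    env = os.environ if env is None else env

    # Parse only --cluster and ignore any other arguments the app defines.
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--cluster", default=None)
    args, _unknown = parser.parse_known_args(argv)

    if args.cluster:
        return args.cluster
    if env.get("SKY_CLUSTER"):
        return env["SKY_CLUSTER"]
    # Same script -> same default name, so repeated runs reuse one cluster.
    return "sky-" + Path(script_path).stem.replace("_", "-")


if __name__ == "__main__":
    # With no override, two runs of the same script share a cluster name;
    # passing --cluster=<name> selects a different one.
    print(resolve_cluster_name(__file__))
```

Under this sketch, `python app.py` twice would target the same name, while `python app.py --cluster=<name>` would target another, mirroring the `ray up` behavior described above. How the resolved name would then be passed to `sky.execute(dag)` is left open here, since the thread does not specify that interface.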
In #58 there's a fix for this. Any feedback would be appreciated.
Fixed by CLI.