Comments (15)
The PR is merged so it should be available in the next Ray release (2.1) or nightly wheel.
from prefect-ray.
Really interesting. Let me know if you need anything.
from prefect-ray.
Could you check to see if ray-project/ray#28043 fixes the issue?
from prefect-ray.
Ugh, the classic "standard library doesn't have a dedicated namespace" issue. What do you mean regarding this workaround?
from prefect-ray.
In the ray.worker module:

    if mode == SCRIPT_MODE:
        # Add the directory containing the script that is running to the Python
        # paths of the workers. Also add the current directory. Note that this
        # assumes that the directory structures on the machines in the clusters
        # are the same.
        # When using an interactive shell, there is no script directory.
        if not interactive_mode:
            script_directory = os.path.abspath(os.path.dirname(sys.argv[0]))
            worker.run_function_on_all_workers(
                lambda worker_info: sys.path.insert(1, script_directory)
            )
This happens with deployments but not when running flows locally because Prefect runs deployed code through .../lib/python3.8/site-packages/prefect/engine.py, so the script directory that Ray adds to the workers' paths is .../lib/python3.8/site-packages/prefect, where packaging also lives. A module in that directory can then take precedence over a same-named package installed elsewhere.
We are currently working with the Ray OSS team to resolve this.
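For anyone following along, here is a minimal sketch of the effect of that `sys.path.insert(1, script_directory)` call, using only the standard library rather than Ray itself. The module name `shadow_demo_module` and the two temporary directories are hypothetical stand-ins: one plays the role of site-packages, the other the script directory.

```python
import importlib
import os
import sys
import tempfile


def write_module(directory, name, origin):
    """Create <directory>/<name>.py exposing a single ORIGIN constant."""
    with open(os.path.join(directory, name + ".py"), "w") as f:
        f.write(f"ORIGIN = {origin!r}\n")


def import_fresh(name):
    """Import `name` from scratch, ignoring any previously cached module."""
    sys.modules.pop(name, None)
    importlib.invalidate_caches()
    return importlib.import_module(name)


def demo():
    name = "shadow_demo_module"  # hypothetical name, not a real package
    saved_path = list(sys.path)
    with tempfile.TemporaryDirectory() as installed_dir, \
            tempfile.TemporaryDirectory() as script_dir:
        write_module(installed_dir, name, "installed")
        write_module(script_dir, name, "script_directory")
        try:
            # Stand-in for site-packages: searched late, at the end of sys.path.
            sys.path.append(installed_dir)
            before = import_fresh(name).ORIGIN
            # Same call as in the snippet above: the script directory is now
            # searched before the "installed" location.
            sys.path.insert(1, script_dir)
            after = import_fresh(name).ORIGIN
        finally:
            # Restore the interpreter state so the demo has no side effects.
            sys.path[:] = saved_path
            sys.modules.pop(name, None)
    return before, after


if __name__ == "__main__":
    print(demo())
```

The first import resolves to the "installed" copy; after the insert, the copy in the script directory wins because it comes earlier in the search order.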
from prefect-ray.
How does engine.py execute the Ray script?
from prefect-ray.
python -m prefect.engine <UUID>
from prefect-ray.
python -m prefect.engine <UUID> is the entrypoint to Prefect flow runs. The flow run process then calls ray.init() (see the RayTaskRunner implementation) and task runs are submitted to the Ray cluster.
from prefect-ray.
We have this behavior because of how Python initializes sys.path (https://docs.python.org/3/library/sys.html#sys.path):
As initialized upon program startup, the first item of this list, path[0], is the directory containing the script that was used to invoke the Python interpreter. If the script directory is not available (e.g. if the interpreter is invoked interactively or if the script is read from standard input), path[0] is the empty string, which directs Python to search modules in the current directory first. Notice that the script directory is inserted before the entries inserted as a result of [PYTHONPATH](https://docs.python.org/3/using/cmdline.html#envvar-PYTHONPATH).
Currently there is no easy way to disable this behavior besides the workaround Andrew mentioned.
from prefect-ray.
I think https://blog.cykerway.com/posts/2020/12/15/python-name-conflict-with-built-in-module.html is essentially the problem we are facing here.
from prefect-ray.
We're not running a script though, we're using python -m <module>. I'm not sure why Ray needs to modify the path in this case.
For example:
# example.py
import sys
print(sys.path[0])
Displays the path of the script
❯ python src/prefect/example.py
/Users/mz/dev/prefect/src/prefect
Displays my current working directory
❯ python -m prefect.example
/Users/mz/dev/prefect
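The same comparison can be reproduced in a throwaway directory. This is a rough sketch under a few assumptions: `demo_pkg` is a hypothetical package created just for the test, and the probe also prints `os.path.abspath(os.path.dirname(sys.argv[0]))`, the expression from the Ray snippet quoted earlier in the thread, so the two values can be compared side by side.

```python
import os
import subprocess
import sys
import tempfile

# Probe script: prints sys.path[0], then the directory derived from sys.argv[0].
PROBE = (
    "import os, sys\n"
    "print(sys.path[0])\n"
    "print(os.path.abspath(os.path.dirname(sys.argv[0])))\n"
)


def probe(args, cwd):
    """Run the interpreter with `args` from `cwd`; return both printed paths."""
    # Drop PYTHONSAFEPATH so sys.path[0] keeps its default meaning.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONSAFEPATH"}
    result = subprocess.run(
        [sys.executable, *args],
        cwd=cwd, env=env, capture_output=True, text=True, check=True,
    )
    first, second = result.stdout.splitlines()[:2]
    return os.path.realpath(first), os.path.realpath(second)


def compare():
    with tempfile.TemporaryDirectory() as root:
        root = os.path.realpath(root)
        pkg = os.path.join(root, "demo_pkg")  # hypothetical package
        os.mkdir(pkg)
        open(os.path.join(pkg, "__init__.py"), "w").close()
        script = os.path.join(pkg, "example.py")
        with open(script, "w") as f:
            f.write(PROBE)
        as_script = probe([script], cwd=root)
        as_module = probe(["-m", "demo_pkg.example"], cwd=root)
    return root, pkg, as_script, as_module


if __name__ == "__main__":
    root, pkg, as_script, as_module = compare()
    print("script:", as_script)
    print("module:", as_module)
```

Run as a script, both values are the package directory. Run with -m, sys.path[0] is the working directory, but sys.argv[0] is still the full path of the module file, so the directory derived from it is the package directory again.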
from prefect-ray.
Oh, got it. Then I think it's a bug on our side: we diverge from the default Python behavior.
from prefect-ray.
Neat! It works! Thanks @jjyao for the quick fix!
10:27:39.736 | INFO | prefect.agent - Submitting flow run '6d01c2cd-0af1-4f55-91b6-9349b524e3f3'
10:27:40.393 | INFO | prefect.infrastructure.process - Opening process 'cornflower-puma'...
10:27:40.399 | INFO | prefect.agent - Completed submission of flow run '6d01c2cd-0af1-4f55-91b6-9349b524e3f3'
10:27:41.785 | WARNING | ray.worker - Failed to set SIGTERM handler, processes mightnot be cleaned up properly on exit.
10:27:42.029 | INFO | prefect.task_runner.ray - Creating a local Ray instance
2022-08-22 10:27:43,661 INFO services.py:1470 -- View the Ray dashboard at http://127.0.0.1:8265
10:27:45.034 | INFO | prefect.task_runner.ray - Using Ray cluster with 1 nodes.
10:27:45.034 | INFO | prefect.task_runner.ray - The Ray UI is available at 127.0.0.1:8265
10:27:46.189 | INFO | Flow run 'cornflower-puma' - Created task run 'say_hello-0b28e502-0' for task 'say_hello'
10:27:46.196 | INFO | Flow run 'cornflower-puma' - Submitted task run 'say_hello-0b28e502-0' for execution.
10:27:50.279 | INFO | Flow run 'cornflower-puma' - Finished in state Completed('All states completed.')
(begin_task_run pid=29639) hello Ford
10:27:50.447 | INFO | prefect.infrastructure.process - Process 'cornflower-puma' exited cleanly.
If possible, let us know when you merge and release so we can pin that specific version!
from prefect-ray.
Exciting news! Will close this once Ray 2.1 is released and we pin 2.1 in our requirements.
from prefect-ray.
2.1 has been released; closing this now.
from prefect-ray.