
Comments (7)

rynewang commented on July 24, 2024

We use this code:

def get_num_cpus(

to detect CPUs. It reads from cgroup files or from the multiprocessing package.
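
As a rough illustration of the two sources mentioned above, the sketch below reads a CPU limit from cgroup files and compares it with multiprocessing.cpu_count(). It is not Ray's get_num_cpus implementation: the helper names (parse_cpu_max, cgroup_cpu_limit, detect_num_cpus) and the choice of taking the smaller of the two values are assumptions made for illustration. The file names are the standard Linux cgroup v2 and v1 ones.

```python
import math
import multiprocessing
from pathlib import Path
from typing import Optional


def parse_cpu_max(text: str) -> Optional[float]:
    """Parse cgroup v2 cpu.max content ("<quota> <period>" or "max <period>")."""
    quota, period = text.split()
    if quota == "max":  # no CPU limit configured for this cgroup
        return None
    return int(quota) / int(period)


def cgroup_cpu_limit(root: str = "/sys/fs/cgroup") -> Optional[float]:
    """Return the CPU limit found in cgroup files, or None if there is none."""
    v2 = Path(root) / "cpu.max"
    if v2.is_file():
        return parse_cpu_max(v2.read_text())
    # cgroup v1 stores quota and period in two separate files.
    quota_file = Path(root) / "cpu" / "cpu.cfs_quota_us"
    period_file = Path(root) / "cpu" / "cpu.cfs_period_us"
    if quota_file.is_file() and period_file.is_file():
        quota = int(quota_file.read_text())
        period = int(period_file.read_text())
        if quota > 0:  # -1 means unlimited
            return quota / period
    return None


def detect_num_cpus(root: str = "/sys/fs/cgroup") -> int:
    """Hypothetical helper: smaller of the cgroup limit and the host count."""
    host = multiprocessing.cpu_count()
    limit = cgroup_cpu_limit(root)
    if limit is None:
        return host
    return max(1, min(host, math.floor(limit)))


if __name__ == "__main__":
    print("cgroup limit:", cgroup_cpu_limit())
    print("detected CPUs:", detect_num_cpus())
```

Running this on the affected machine shows whether a cgroup limit, rather than the hardware count, is what a detector of this kind would see.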

from ray.

guangzlu commented on July 24, 2024

Here is our raylet.out log: raylet_out_2024-05-20-16-54-06.txt

guangzlu commented on July 24, 2024

Update: it runs when we specify both num_cpus and num_gpus. But why does it not work when we set only num_gpus? How does Ray detect num_cpus by default, and how should we set num_cpus properly?

rynewang commented on July 24, 2024

We don't have AMD GPU environments. If you can provide us with an environment to reproduce this, please ping us on Slack: https://ray-distributed.slack.com/team/U055TQCDAAY

rynewang commented on July 24, 2024

If you don't set num_cpus or num_gpus, Ray will auto-detect them. Have you tried leaving num_gpus unset to see whether it can detect the CPU and GPU counts?
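
To make the two modes concrete, the sketch below builds the keyword arguments for ray.init(): leaving both out lets Ray auto-detect, while passing num_cpus and num_gpus overrides detection. The num_cpus and num_gpus arguments of ray.init() are the ones discussed in this thread; the helper names build_init_kwargs and start_ray are hypothetical and exist only for illustration.

```python
from typing import Any, Dict, Optional


def build_init_kwargs(num_cpus: Optional[int] = None,
                      num_gpus: Optional[int] = None) -> Dict[str, Any]:
    """Hypothetical helper: include only the resources that were set explicitly."""
    kwargs: Dict[str, Any] = {}
    if num_cpus is not None:
        kwargs["num_cpus"] = num_cpus
    if num_gpus is not None:
        kwargs["num_gpus"] = num_gpus
    return kwargs


def start_ray(num_cpus: Optional[int] = None,
              num_gpus: Optional[int] = None) -> Dict[str, float]:
    """Start Ray, return the resources it registered, then shut it down."""
    import ray  # imported lazily so build_init_kwargs works without Ray installed

    ray.init(**build_init_kwargs(num_cpus, num_gpus))
    try:
        return ray.cluster_resources()
    finally:
        ray.shutdown()
```

Calling start_ray() with no arguments corresponds to a plain ray.init(), and the returned dictionary shows what Ray detected. A call such as start_ray(num_cpus=10, num_gpus=1) pins both values; those numbers are placeholders.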

guangzlu commented on July 24, 2024

> If you don't set num_cpus or num_gpus, Ray will auto-detect them. Have you tried leaving num_gpus unset to see whether it can detect the CPU and GPU counts?

Yes, we tried it. Even if we set no arguments and just call ray.init(), it still crashes. Sorry that we cannot provide an AMD environment right now. But I think the problem is on the CPU side: because Ray cannot detect the CPUs automatically, we need to set num_cpus manually. Can you tell me how Ray detects CPUs? And is there any way to find out more about the problem, for example to check whether the CPU threads are working well?

guangzlu commented on July 24, 2024

Update: we can use multiprocessing.cpu_count() to get the CPU count successfully. But we cannot set num_cpus too large. We have 192 CPUs on the machine, but we can only set num_cpus up to 10. If we set it to 20, the run gets interrupted. Here is the log for num_cpus=20: ray-num-cpu-20-log.txt
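
One way to dig further, sketched under assumptions: compare the CPU counts that Python reports from different standard-library sources. A gap between the machine total and the set of CPUs this process may run on would hint at affinity or container limits rather than a hardware problem. os.sched_getaffinity exists only on some platforms (Linux, for example), so the sketch guards for it; the helper name cpu_report is hypothetical.

```python
import multiprocessing
import os
from typing import Dict, Optional


def cpu_report() -> Dict[str, Optional[int]]:
    """Collect CPU counts from several standard-library sources."""
    report: Dict[str, Optional[int]] = {
        "multiprocessing.cpu_count": multiprocessing.cpu_count(),
        "os.cpu_count": os.cpu_count(),
        "sched_getaffinity": None,  # stays None where the call is unavailable
    }
    if hasattr(os, "sched_getaffinity"):
        # Number of CPUs the current process is allowed to run on.
        report["sched_getaffinity"] = len(os.sched_getaffinity(0))
    return report


if __name__ == "__main__":
    for source, count in cpu_report().items():
        print(f"{source}: {count}")
```

This does not test whether individual CPU threads are healthy; it only shows which counts the process can see.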
