Comments (18)
Thanks for fixing that. It makes sense!
Consider using ray
What I've learned is that in order to do a batch sampler with Ray:
- We need to share agent parameters with the workers in between sampler iterations, and use those shared parameters to update the model inside the individual workers (see the sketch after this list).
- This unfortunately means that we can't make a sampler that is agnostic of the framework we are using, as we will need custom ways of updating an agent's parameters between sampler iterations (e.g. one set of parameter getters and setters for PyTorch, and one for TensorFlow).
- The most efficient way of communicating objects with Ray is to put them into numpy arrays. Ray also supports reasonably fast serialization of primitive types (ints, floats, longs, bools, strings, unicode, and numpy arrays) and of any list, dictionary, or tuple whose elements Ray can serialize, but definitely not of arbitrary Python objects.
- It is easy to get the parameters of torch models using model.state_dict(), and to set them using model.load_state_dict().
- With TensorFlow, we can get and set all the variables by following the same approach as the distributed SGD Ray code.
- The distributed SGD Ray code passes around plain Python lists, rather than dictionaries or numpy arrays, when moving parameters. It's probably worth trying dictionaries, numpy arrays, and lists to see which gives the fastest serialization; perhaps converting lists to arrays is too slow an operation on larger models.
Further discoveries:
- Garage TF policies have an option to get and set their parameters as numpy arrays, making the job of passing parameters even easier.
- This approach will again make it hard to ultimately separate the sampler from the rest of the library as a standalone component, assuming the sampler class should let users bring an arbitrary agent.
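To make the parameter-sharing and state_dict points above concrete, here is a minimal sketch, not garage's actual sampler: the Policy network and the rollout body are placeholders, and parameters are converted to numpy and shared with Ray actors once per iteration.

```python
import ray
import torch
import torch.nn as nn


class Policy(nn.Module):
    """Stand-in for an agent's policy network."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, obs):
        return self.fc(obs)


@ray.remote
class SamplerWorker:
    """One remote worker holding its own copy of the policy."""

    def __init__(self):
        self.policy = Policy()

    def set_params(self, arrays):
        # `arrays` is a dict of numpy arrays; copy to avoid the read-only
        # views Ray hands out, then load them back into the torch model.
        self.policy.load_state_dict(
            {k: torch.from_numpy(v.copy()) for k, v in arrays.items()})

    def rollout(self):
        # Placeholder for collecting one episode with the current policy.
        obs = torch.zeros(1, 4)
        with torch.no_grad():
            return self.policy(obs).numpy()


if __name__ == '__main__':
    ray.init()
    policy = Policy()
    workers = [SamplerWorker.remote() for _ in range(4)]

    for itr in range(3):
        # Convert parameters to numpy so Ray serializes them cheaply, put
        # them in the object store once, and share the handle with workers.
        arrays = {k: v.detach().cpu().numpy()
                  for k, v in policy.state_dict().items()}
        params_ref = ray.put(arrays)
        ray.get([w.set_params.remote(params_ref) for w in workers])
        batch = ray.get([w.rollout.remote() for w in workers])
        # ... update `policy` from `batch` here ...
```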
I spent some time looking into and working on this problem last week. I ended up writing a sampler which appears to be faster than any of our previous attempts. I'll put it in a garage branch today so we can discuss the details, but basically it uses a simple state machine per worker and queues directly. This appears to be a better abstraction for our purposes than a pool, which the rllab sampler uses.
One of the surprising things is how hard it is to avoid pickling pytorch tensors when using pytorch. Even the state dict doesn't contain numpy arrays. Because of this, I'm just using multiprocessing's queues. We might consider converting them to something that can memmap numpy arrays, such as joblib's queues (or maybe zmq?).
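To illustrate that point, here is a minimal sketch (names are illustrative, not the sampler described above) that copies a state_dict out to numpy arrays before putting it on a multiprocessing queue, so only plain arrays cross the queue instead of pickled torch tensors:

```python
import multiprocessing as mp

import torch
import torch.nn as nn


def to_numpy(state_dict):
    # Copy the tensors out so only plain numpy arrays cross the queue.
    return {k: v.detach().cpu().numpy() for k, v in state_dict.items()}


def from_numpy(arrays):
    return {k: torch.from_numpy(v) for k, v in arrays.items()}


def worker(param_queue):
    # Each worker owns its own model and refreshes it from the queue.
    model = nn.Linear(4, 2)
    arrays = param_queue.get()
    model.load_state_dict(from_numpy(arrays))


if __name__ == '__main__':
    model = nn.Linear(4, 2)
    queue = mp.Queue()
    proc = mp.Process(target=worker, args=(queue,))
    proc.start()
    queue.put(to_numpy(model.state_dict()))
    proc.join()
```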
related (perhaps for design inspiration): https://www.python.org/dev/peps/pep-0554/
If it meets/exceeds the current speed, is cleaner and easier to test, and doesn't compromise the performance of any algorithm, then let's work on committing it and then iterate.
If we use Python multiprocessing or joblib, don't those both rule out distributed sampling and using multiple GPUs/CPUs? I thought that's something we wanted, and why we were looking into Ray.
But I guess speed is the most important thing here. I'll write a Ray sampler for the TensorFlow branch based on the previous observations, and see if there is any apparent speedup.
related (perhaps for design inspiration): https://www.python.org/dev/peps/pep-0554/
Can you elaborate? I'm unsure what I'm looking for in this design document. Thanks!
It is an RFC for allowing Python processes to have multiple real system threads without having to use multiprocessing.
closed via #793
More general sampler refactoring should have a separate issue.
Hi friends, @krzentner @avnishn. Thanks for introducing this Ray sampler feature. I wonder why MAML only supports RaySampler (I found that the more generic MultiprocessingSampler does not work here).
The Ray sampler is better than the multiprocessing sampler in terms of resource usage and runtime efficiency. This library is pretty much in maintenance mode right now anyway, but I'd reject the idea of adding support for different types of multiprocess samplers.
I don't know why MAML only works with RaySampler. It should work with MultiprocessingSampler, since they implement the same API (with the exception that RaySampler controls GPU usage, and MultiprocessingSampler doesn't).
@avnishn @krzentner If I simply replace the RaySampler in maml_trpo_half_cheetah_dir.py with MultiprocessingSampler, it gives the following error:
2022-04-16 00:05:04 | [maml_trpo_half_cheetah_dir] Logging to /garage/src/garage/examples/torch/data/local/experiment/maml_trpo_half_cheetah_dir_13
/garage/src/garage/experiment/deterministic.py:37: UserWarning: Enabeling deterministic mode in PyTorch can have a performance impact when using GPU.
  'Enabeling deterministic mode in PyTorch can have a performance '
2022-04-16 00:05:04 | [maml_trpo_half_cheetah_dir] Obtaining samples...
Sampling [------------------------------------] 0%
Traceback (most recent call last):
  File "/opt/conda/envs/gym-fish/lib/python3.6/multiprocessing/queues.py", line 234, in _feed
    obj = _ForkingPickler.dumps(obj)
  File "/opt/conda/envs/gym-fish/lib/python3.6/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
AttributeError: Can't pickle local object 'maml_trpo_half_cheetah_dir.<locals>.<lambda>'
It seems that the worker cannot receive any instructions from the queue in the waiting loop here; it appears to get stuck.
Update: the reason is that SetTaskUpdate is unpicklable, so it cannot be passed through the Queue(). We need to do something else here for the meta-RL environments.
cloudpickle is supposed to fix this, since it can pickle SetTaskUpdate. The bug is probably that we don't pass cloudpickle.dumps to prepare_worker_messages here (and then use cloudpickle.loads here).
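For reference, here is a tiny generic sketch (not garage code) of the difference: the stdlib pickler that multiprocessing uses rejects locally defined lambdas, which is exactly the error in the traceback above, while cloudpickle serializes them fine:

```python
import pickle

import cloudpickle


def make_env_update():
    # A locally defined lambda, like the env update the MAML example passes.
    task = {'direction': 1.0}
    return lambda: task


update = make_env_update()

try:
    pickle.dumps(update)
except (AttributeError, pickle.PicklingError) as err:
    print('stdlib pickle failed:', err)

restored = cloudpickle.loads(cloudpickle.dumps(update))
print(restored())  # {'direction': 1.0}
```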
It turns out to be slightly more complicated than that. @peppacat Please try to use the branch (or cherry-pick the change from) PR #2322