Comments (18)
Thanks for fixing that. It makes sense!
Consider using ray
What I've learned is that in order to do a batch sampler with Ray:
- We need to share agent parameters with the workers in between sampler iterations, and use those shared parameters to update the model inside the individual workers (see the sketch after this list).
- This unfortunately means that we can't make a sampler that is agnostic of the framework we are using, as we will need custom ways of updating an agent's parameters between sampler iterations (e.g. one set of parameter getters and setters for PyTorch, and one for TensorFlow).
- The most efficient way of communicating objects with Ray is to put them into numpy arrays. Ray also supports reasonably fast serialization of primitive types (ints, floats, longs, bools, strings, unicode, and numpy arrays) and of any list, dictionary, or tuple whose elements Ray can serialize, but definitely not of arbitrary Python objects.
- It is easy to get the parameters of torch models using model.state_dict(), and to set them using model.load_state_dict().
- With TensorFlow, we can get and set all the variables by following the same approach as the distributed SGD Ray code.
- The distributed SGD Ray code passes around plain Python lists, rather than dictionaries or numpy arrays, when moving parameters. It's probably worth trying dictionaries, numpy arrays, and lists to see which gives the fastest serialization; perhaps converting lists to arrays is too slow an operation on larger models.
Further discoveries:
- Garage TF policies have an option to get and set their parameters as numpy arrays, making the job of passing parameters even easier.
- This approach will again make it hard to ultimately separate the sampler from the rest of the library as a standalone component, assuming the sampler class should let users bring an arbitrary agent.
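To make the parameter-sharing and state_dict points above concrete, here is a minimal sketch, not garage's actual sampler: the Policy network and the rollout body are placeholders, and parameters are converted to numpy and shared with Ray actors once per iteration.

```python
import ray
import torch
import torch.nn as nn


class Policy(nn.Module):
    """Stand-in for an agent's policy network."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, obs):
        return self.fc(obs)


@ray.remote
class SamplerWorker:
    """One remote worker holding its own copy of the policy."""

    def __init__(self):
        self.policy = Policy()

    def set_params(self, arrays):
        # `arrays` is a dict of numpy arrays; copy to avoid the read-only
        # views Ray hands out, then load them back into the torch model.
        self.policy.load_state_dict(
            {k: torch.from_numpy(v.copy()) for k, v in arrays.items()})

    def rollout(self):
        # Placeholder for collecting one episode with the current policy.
        obs = torch.zeros(1, 4)
        with torch.no_grad():
            return self.policy(obs).numpy()


if __name__ == '__main__':
    ray.init()
    policy = Policy()
    workers = [SamplerWorker.remote() for _ in range(4)]

    for itr in range(3):
        # Convert parameters to numpy so Ray serializes them cheaply, put
        # them in the object store once, and share the handle with workers.
        arrays = {k: v.detach().cpu().numpy()
                  for k, v in policy.state_dict().items()}
        params_ref = ray.put(arrays)
        ray.get([w.set_params.remote(params_ref) for w in workers])
        batch = ray.get([w.rollout.remote() for w in workers])
        # ... update `policy` from `batch` here ...
```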
I spent some time looking into and working on this problem last week. I ended up writing a sampler which appears to be faster than any of our previous attempts. I'll put it in a garage branch today so we can discuss the details, but basically it uses a simple state machine per worker and queues directly. This appears to be a better abstraction for our purposes than a pool, which the rllab sampler uses.
One of the surprising things is how hard it is to avoid pickling pytorch tensors when using pytorch. Even the state dict doesn't contain numpy arrays. Because of this, I'm just using multiprocessing's queues. We might consider converting them to something that can memmap numpy arrays, such as joblib's queues (or maybe zmq?).
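To illustrate that point, here is a minimal sketch (names are illustrative, not the sampler described above) that copies a state_dict out to numpy arrays before putting it on a multiprocessing queue, so only plain arrays cross the queue instead of pickled torch tensors:

```python
import multiprocessing as mp

import torch
import torch.nn as nn


def to_numpy(state_dict):
    # Copy the tensors out so only plain numpy arrays cross the queue.
    return {k: v.detach().cpu().numpy() for k, v in state_dict.items()}


def from_numpy(arrays):
    return {k: torch.from_numpy(v) for k, v in arrays.items()}


def worker(param_queue):
    # Each worker owns its own model and refreshes it from the queue.
    model = nn.Linear(4, 2)
    arrays = param_queue.get()
    model.load_state_dict(from_numpy(arrays))


if __name__ == '__main__':
    model = nn.Linear(4, 2)
    queue = mp.Queue()
    proc = mp.Process(target=worker, args=(queue,))
    proc.start()
    queue.put(to_numpy(model.state_dict()))
    proc.join()
```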
related (perhaps for design inspiration): https://www.python.org/dev/peps/pep-0554/
If it meets/exceeds the current speed, is cleaner and easier to test, and doesn't compromise the performance of any algorithm, then let's work on committing it and then iterate.
If we use Python multiprocessing or joblib, don't those both rule out distributed sampling and using multiple GPUs/CPUs? I thought that's something we wanted, and why we were looking into Ray.
But I guess speed is the most important thing here. I'll write a Ray sampler for the TensorFlow branch based on the previous observations, and see if there is any apparent speedup.
related (perhaps for design inspiration): https://www.python.org/dev/peps/pep-0554/
Can you elaborate? I'm unsure what I'm looking for in this design document. Thanks!
It is an RFC for allowing Python processes to have multiple real system threads without having to use multiprocessing.
closed via #793
More general sampler refactoring should have a separate issue.
Hi friends, @krzentner @avnishn. Thanks for introducing this Ray sampler feature. I wonder why MAML only supports RaySampler (I found that the more generic MultiprocessingSampler does not work here).
The Ray sampler is better than the multiprocessing sampler in terms of resource usage and runtime efficiency. This library is pretty much in maintenance mode right now anyway, but I'd reject the idea of adding support for different types of multiprocess samplers.
I don't know why MAML only works with RaySampler. It should work with MultiprocessingSampler, since they implement the same API (with the exception that RaySampler controls GPU usage, and MultiprocessingSampler doesn't).
@avnishn @krzentner If I simply replace the RaySampler in maml_trpo_half_cheetah_dir.py with MultiprocessingSampler, it gives the following error:
2022-04-16 00:05:04 | [maml_trpo_half_cheetah_dir] Logging to /garage/src/garage/examples/torch/data/local/experiment/maml_trpo_half_cheetah_dir_13
/garage/src/garage/experiment/deterministic.py:37: UserWarning: Enabeling deterministic mode in PyTorch can have a performance impact when using GPU.
  'Enabeling deterministic mode in PyTorch can have a performance '
2022-04-16 00:05:04 | [maml_trpo_half_cheetah_dir] Obtaining samples...
Sampling [------------------------------------] 0%
Traceback (most recent call last):
  File "/opt/conda/envs/gym-fish/lib/python3.6/multiprocessing/queues.py", line 234, in _feed
    obj = _ForkingPickler.dumps(obj)
  File "/opt/conda/envs/gym-fish/lib/python3.6/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
AttributeError: Can't pickle local object 'maml_trpo_half_cheetah_dir.<locals>.<lambda>'
It seems that the worker cannot receive any instructions from the queue in the waiting loop here; it appears to get stuck.
Update: the reason is that SetTaskUpdate is unpicklable, so it cannot be passed through the Queue(). We need to do something else here for the meta-RL environments.
cloudpickle is supposed to fix this, since it can pickle SetTaskUpdate. The bug is probably that we don't pass cloudpickle.dumps to prepare_worker_messages here (and then use cloudpickle.loads here).
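For reference, here is a tiny generic sketch (not garage code) of the difference: the stdlib pickler that multiprocessing uses rejects locally defined lambdas, which is exactly the error in the traceback above, while cloudpickle serializes them fine:

```python
import pickle

import cloudpickle


def make_env_update():
    # A locally defined lambda, like the env update the MAML example passes.
    task = {'direction': 1.0}
    return lambda: task


update = make_env_update()

try:
    pickle.dumps(update)
except (AttributeError, pickle.PicklingError) as err:
    print('stdlib pickle failed:', err)

restored = cloudpickle.loads(cloudpickle.dumps(update))
print(restored())  # {'direction': 1.0}
```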
It turns out to be slightly more complicated than that. @peppacat Please try to use the branch (or cherry-pick the change from) PR #2322