farama-foundation / minari
A standard format for offline reinforcement learning datasets, with popular reference datasets and related utilities
Home Page: https://minari.farama.org
License: Other
I have a use case where we have a dataset consisting of image-based observations, and I notice that sampling speed seems to be slower than with 1D observations. I checked how sampling works internally and noticed that Minari samples episodes serially, instead of in parallel. I figured parallelizing this call may have been considered already, so I was curious about any recommendations on the best way to do it. I was also wondering if this is something that will be added in the future.
I have one more layer of complexity on top of this: instead of 1 dataset, I have, say, 10 datasets from different envs, each with image-based observations. Think multi-task Atari. I have 10 Minari datasets and want, say, 30 episodes from each for every gradient update. I also want to do this in parallel; I will experiment with different parallelization techniques, but I'm curious whether others have intuition about this.
Minari/minari/dataset/minari_storage.py
Lines 153 to 180 in c0669fc
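Until something lands upstream, here is a minimal sketch of fanning the per-dataset sampling out across workers, assuming each dataset exposes sample_episodes() (the dataset ids below are hypothetical):
import minari
from concurrent.futures import ThreadPoolExecutor

# Hypothetical ids standing in for 10 multi-task Atari datasets
dataset_ids = [f"atari-task{i}-v0" for i in range(10)]
datasets = [minari.load_dataset(ds_id) for ds_id in dataset_ids]

def sample_episodes_for(dataset):
    # Draw 30 random episodes from a single dataset
    return dataset.sample_episodes(30)

# One worker per dataset; h5py holds the GIL for parts of each read, so it
# is worth benchmarking this against a process pool as well
with ThreadPoolExecutor(max_workers=len(datasets)) as pool:
    episode_batches = list(pool.map(sample_episodes_for, datasets))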
Add gymnasium env id, file size, and dataset group to the table displayed when running the command minari list remote or minari list local.
So this would mean adding new columns named something like "env_id", "size on disk", and "dataset group".
Right now, the datasets do not have a dataset_group value, so for backwards compatibility the PR should check for a dataset_group attribute, and if there is none, it should use the string "Unknown" as a placeholder value.
This should be a useful hint for getting started with getting the file size for remote datasets: https://stackoverflow.com/questions/50875461/google-cloud-storage-get-object-size-api
To get started with this, it would be useful to look at the code in cli.py, local.py, and hosting.py.
The docs will also need to be updated to reflect the existence of the new dataset_group field: definitely on https://minari.farama.org/main/content/dataset_standards/, and probably also on the individual dataset pages.
This is partially to address #79, and also it's useful to know how large each dataset is to get an idea of how long it will take to download or process a particular dataset.
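For the size column on remote datasets, a rough sketch along the lines of that Stack Overflow answer, using the public bucket name that appears elsewhere in these issues (the blob path is just an example):
from google.cloud import storage

# The Minari bucket is public, so an anonymous client suffices
client = storage.Client.create_anonymous_client()
bucket = client.bucket("minari-datasets")
blob = bucket.get_blob("door-human-v1/data/main_data.hdf5")  # example path
if blob is not None:
    print(f"{blob.size / 1e6:.1f} MB")  # blob.size is reported in bytes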
The line:
Line 15 in e24113b
currently causes an import error in gymnasium 1.0.0rc. It should be changed to:
from gymnasium.wrappers import RecordEpisodeStatistics
I have just created a dataset from the deployment of an agent on the Farama Minigrid environment, and I am wondering how to ensure that when data is loaded from the dataset, it is done in the original order, without any steps being loaded redundantly. Is there a way to do that with the PyTorch DataLoader class? For instance, given the example shown here https://minari.farama.org/content/basic_usage/, how would I go about getting batches from the DataLoader such that the steps in the batches are in the same order as that in which they were created when the agent was deployed on the environment? Also, is there a way to do that without any of the steps being repeated?
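Not an official answer, but a minimal sketch: with shuffle=False the DataLoader uses a sequential sampler, so each episode is yielded exactly once per epoch, in stored order (the dataset id below is hypothetical):
import minari
from torch.utils.data import DataLoader

dataset = minari.load_dataset("MinigridDeployment-v0")  # hypothetical id

# Sequential, non-repeating iteration over episodes; the steps inside each
# episode are already stored in the order they were collected
loader = DataLoader(dataset, batch_size=1, shuffle=False,
                    collate_fn=lambda batch: batch[0])
for episode in loader:
    observations = episode.observations  # in collection order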
I was wondering, is there a separate Bibtex citation that we should use specifically for Minari, or should we use the one given for the D4RL project?
I think it would be a great idea to port the Antmaze data originally proposed in D4RL to Minari.
Many offline RL papers use antmaze. Many papers will even skip pointmaze and only report numbers on antmaze, because the latter is more difficult. I think adding this dataset and environment, with Minari's clean and easier-to-use code, will boost Minari's usage among offline RL researchers.
Describe the bug
The pointmaze-open-v1
dataset is not found in the list of remote datasets and fails to download.
Code example
minari.download_dataset('pointmaze-open-v1')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File ".../lib/python3.8/site-packages/minari/storage/hosting.py", line 117, in download_dataset
raise ValueError(
ValueError: Couldn't find any compatible version of dataset pointmaze-open with the local installed version of Minari, 0.4.1.
System Info
Describe the characteristic of your environment:
macOS creates a .DS_Store file when you access a folder using Finder.
If this is done on the datasets folder, it will produce an error:
NotADirectoryError: [Errno 20] Not a directory: '/Users/username/.minari/datasets/.DS_Store'
because Minari tries to read .DS_Store as a dataset.
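A sketch of the kind of guard that would avoid this, skipping anything in the datasets folder that is not a directory:
import os

def list_dataset_dirs(root):
    # Stray files like .DS_Store are not dataset directories
    return [
        name for name in os.listdir(root)
        if os.path.isdir(os.path.join(root, name))
    ]

print(list_dataset_dirs(os.path.expanduser("~/.minari/datasets")))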
The Discord link does not work.
While HDF5 and h5py are the most popular approach for multi-dimensional array storage, they have some major limitations, for example the inability to read data from multiple processes / threads simultaneously, which can be important for implementing efficient data loading.
There is an alternative, Zarr, which is very similar but a bit more capable. I think a discussion on this would be useful to the community.
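For illustration, a small sketch with the zarr v2 API; unlike a single HDF5 file, a Zarr store can be read concurrently:
import numpy as np
import zarr

# Write chunked arrays for one episode
root = zarr.open_group("episode_0.zarr", mode="w")
root.create_dataset("observations",
                    data=np.zeros((1001, 84, 84), dtype=np.uint8),
                    chunks=(100, 84, 84))
root.create_dataset("actions", data=np.zeros(1000, dtype=np.int64))

# Safe to do from several processes or threads at once
readonly = zarr.open_group("episode_0.zarr", mode="r")
first_obs = readonly["observations"][:100]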
Writing this without fixes for now, things I'm noticing going through the library:
- 'door-cloned-v0', 'door-expert-v0', 'door-human-v0', ...: one issue here, at least reading the list initially, is that the dataset names don't really tell you much unless you already know the context. We should probably mention that they're specifically for mujoco, probably for some specific model; maybe we can list the name of the gymnasium env?
- minari.download_dataset("door-human-v0") results in a pretty ugly/uncaught import error for gymnasium_robotics. Since gymnasium-robotics isn't a necessary dependency, we should have a more elegant way of handling this.
- Downloading a dataset that already exists locally should probably require an overwrite or force argument to do that behavior, and by default it should just skip downloading it. It might be worthwhile to think about whether we want to do explicit downloads like this in the first place, but that's another thing.
- EpisodeData is a namedtuple, which leads to the default representation being an enormous printout of all observations, rewards etc. This is not very useful, so we should probably make it a proper dataclass instead and override the default representation (see the sketch below).
Stopping here for now, @rodrigodelazcano let me know if I'm missing anything for any of those points (like some additional reasons for doing X thing in a certain way that I'm unaware of). Otherwise we can make this into a to-do list of improvements.
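On the EpisodeData point, a minimal sketch (field set abbreviated) of what a dataclass with a compact representation could look like:
import numpy as np
from dataclasses import dataclass

@dataclass(repr=False)
class EpisodeData:
    id: int
    observations: np.ndarray
    actions: np.ndarray
    rewards: np.ndarray

    def __repr__(self) -> str:
        # Report array shapes instead of dumping full contents
        return (
            f"EpisodeData(id={self.id}, "
            f"observations=ndarray{self.observations.shape}, "
            f"actions=ndarray{self.actions.shape}, "
            f"rewards=ndarray{self.rewards.shape})"
        )

print(EpisodeData(0, np.zeros((11, 4)), np.zeros(10), np.zeros(10)))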
Hey, I am trying to make a dataset of Ant-v5 and I am getting this error:
$ py create_dataset.py # expert-v0
Traceback (most recent call last):
File "/home/master-andreas/gym/rl/project/create_dataset.py", line 56, in <module>
obs, rew, terminated, truncated, info = collector_env.step(action)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/master-andreas/gym/rl/project/temp_env/lib/python3.11/site-packages/minari/data_collector/data_collector.py", line 222, in step
self._add_step_data(self._buffer[-1], step_data)
File "/home/master-andreas/gym/rl/project/temp_env/lib/python3.11/site-packages/minari/data_collector/data_collector.py", line 162, in _add_step_data
raise ValueError(
ValueError: Info structure inconsistent with info structure returned by original reset.
For reference: self._reference_info is set using the info returned by reset, which is then compared to the info returned by step. In general, step can have additional info compared to reset, such as transition information. Also, it is not in Env's specification that infos need to have the same structure each time:
https://gymnasium.farama.org/main/api/env/#gymnasium.Env.step
pip install git+... (minari install from github) succeeds:
pip install cython numpy
pip install git+https://github.com/Farama-Foundation/Minari.git
Everything works ok.
pip install minari fails. Even when the prereqs numpy and cython are installed, pip install minari results in:
Building wheels for collected packages: minari
Building wheel for minari (setup.py) ... error
error: subprocess-exited-with-error
× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [28 lines of output]
running bdist_wheel
running build
running build_py
creating build
creating build/lib.linux-x86_64-cpython-310
creating build/lib.linux-x86_64-cpython-310/minari
copying minari/__init__.py -> build/lib.linux-x86_64-cpython-310/minari
copying minari/logger.py -> build/lib.linux-x86_64-cpython-310/minari
copying minari/dataset.pyx -> build/lib.linux-x86_64-cpython-310/minari
copying minari/dataset.pxd -> build/lib.linux-x86_64-cpython-310/minari
copying minari/dataset.pyi -> build/lib.linux-x86_64-cpython-310/minari
running build_ext
building 'minari.dataset' extension
creating build/temp.linux-x86_64-cpython-310
creating build/temp.linux-x86_64-cpython-310/minari
x86_64-linux-gnu-gcc -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -std=c++11 -fPIC -I/home/will/testminari/venv/lib/python3.10/site-packages/numpy/core/include -Iminari/cpp/include -I/home/will/testminari/venv/include -I/usr/include/python3.10 -c minari/dataset.cpp -o build/temp.linux-x86_64-cpython-310/minari/dataset.o -std=c++11 -O3 -ffast-math
In file included from /home/will/testminari/venv/lib/python3.10/site-packages/numpy/core/include/numpy/ndarraytypes.h:1940,
from /home/will/testminari/venv/lib/python3.10/site-packages/numpy/core/include/numpy/ndarrayobject.h:12,
from /home/will/testminari/venv/lib/python3.10/site-packages/numpy/core/include/numpy/arrayobject.h:5,
from minari/dataset.cpp:792:
/home/will/testminari/venv/lib/python3.10/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:17:2: warning: #warning "Using deprecated NumPy API, disable it with " "#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
17 | #warning "Using deprecated NumPy API, disable it with " \
| ^~~~~~~
minari/dataset.cpp:806:10: fatal error: minari/dataset.h: No such file or directory
806 | #include "minari/dataset.h"
| ^~~~~~~~~~~~~~~~~~
compilation terminated.
error: command '/usr/bin/x86_64-linux-gnu-gcc' failed with exit code 1
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for minari
Running setup.py clean for minari
Failed to build minari
Installing collected packages: pyasn1, gymnasium-notices, urllib3, typing_extensions, structlog, six, rsa, pyasn1-modules, protobuf, jax-jumpy, idna, h5py, google-crc32c, cloudpickle, charset-normalizer, certifi, cachetools, tensorboardX, requests, googleapis-common-protos, google-resumable-media, google-auth, google-api-core, google-cloud-core, google-cloud-storage, shimmy, gymnasium, minari
Running setup.py install for minari ... error
error: subprocess-exited-with-error
× Running setup.py install for minari did not run successfully.
│ exit code: 1
╰─> [30 lines of output]
running install
/home/will/testminari/venv/lib/python3.10/site-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
warnings.warn(
running build
running build_py
creating build
creating build/lib.linux-x86_64-cpython-310
creating build/lib.linux-x86_64-cpython-310/minari
copying minari/__init__.py -> build/lib.linux-x86_64-cpython-310/minari
copying minari/logger.py -> build/lib.linux-x86_64-cpython-310/minari
copying minari/dataset.pyx -> build/lib.linux-x86_64-cpython-310/minari
copying minari/dataset.pxd -> build/lib.linux-x86_64-cpython-310/minari
copying minari/dataset.pyi -> build/lib.linux-x86_64-cpython-310/minari
running build_ext
building 'minari.dataset' extension
creating build/temp.linux-x86_64-cpython-310
creating build/temp.linux-x86_64-cpython-310/minari
x86_64-linux-gnu-gcc -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -std=c++11 -fPIC -I/home/will/testminari/venv/lib/python3.10/site-packages/numpy/core/include -Iminari/cpp/include -I/home/will/testminari/venv/include -I/usr/include/python3.10 -c minari/dataset.cpp -o build/temp.linux-x86_64-cpython-310/minari/dataset.o -std=c++11 -O3 -ffast-math
In file included from /home/will/testminari/venv/lib/python3.10/site-packages/numpy/core/include/numpy/ndarraytypes.h:1940,
from /home/will/testminari/venv/lib/python3.10/site-packages/numpy/core/include/numpy/ndarrayobject.h:12,
from /home/will/testminari/venv/lib/python3.10/site-packages/numpy/core/include/numpy/arrayobject.h:5,
from minari/dataset.cpp:792:
/home/will/testminari/venv/lib/python3.10/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:17:2: warning: #warning "Using deprecated NumPy API, disable it with " "#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
17 | #warning "Using deprecated NumPy API, disable it with " \
| ^~~~~~~
minari/dataset.cpp:806:10: fatal error: minari/dataset.h: No such file or directory
806 | #include "minari/dataset.h"
| ^~~~~~~~~~~~~~~~~~
compilation terminated.
error: command '/usr/bin/x86_64-linux-gnu-gcc' failed with exit code 1
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure
× Encountered error while trying to install package.
╰─> minari
note: This is an issue with the package mentioned above, not pip.
Describe the bug
According to the documentation, the length of the observations array should be num_steps + 1. However, for a handful of episodes, the length is num_steps.
Code example
import minari
dataset = minari.load_dataset("antmaze-umaze-v0", download=True)
for episode_data in dataset.iterate_episodes():
    if episode_data.observations["observation"].shape[0] == episode_data.total_timesteps:
        print("Warning: bad episode")
Output:
Warning: bad episode
Warning: bad episode
Warning: bad episode
Warning: bad episode
System Info
Describe the characteristic of your environment:
Additional context
I haven't checked the other Ant environments.
I tried to set the render mode in recover_environment, as is allowed in gym.make:
data_complete = minari.load_dataset("kitchen-complete-v1")
env = data_complete.recover_environment(render_mode="rgb_array")
But I got the following error:
TypeError: MinariDataset.recover_environment() got an unexpected keyword argument 'render_mode'
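Until recover_environment forwards keyword arguments, one possible workaround, assuming dataset.spec.env_spec holds the recorded EnvSpec (which gym.make accepts directly):
import gymnasium as gym
import minari

data_complete = minari.load_dataset("kitchen-complete-v1")
# gym.make takes an EnvSpec and forwards extra kwargs to the env
env = gym.make(data_complete.spec.env_spec, render_mode="rgb_array")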
Describe the bug
I have a custom gym env that inherits from MujocoRobotEnv, and I wrap it with DataCollector. However, I found that all image observations stored in DataCollector are black. When not using DataCollector, a custom environment wrapped by PixelObservationV0 can successfully obtain a colored image from the step function. Besides, the ValueError is also triggered when the info dict returned by the step function is not empty. Below is a minimal working example to reproduce these errors. The panda_mujoco_gym env was created by myself and can be acquired from here.
Thanks in advance for your precious time in checking this issue!
Code example
import gymnasium as gym
from gymnasium.experimental.wrappers import PixelObservationV0
import matplotlib.pyplot as plt
from minari import DataCollector
import panda_mujoco_gym
seed = 42  # example value
max_buffer_steps = 100_000  # example value

env = gym.make("FrankaPickAndPlaceSparse-v0", render_mode="rgb_array")
env = PixelObservationV0(env, pixels_only=True)
obs, info = env.reset(seed=seed)  # rgb image can be observed
env = DataCollector(env, record_infos=True, max_buffer_steps=max_buffer_steps)
obs, info = env.reset(seed=seed)
plt.imshow(obs)  # rgb image can not be observed
plt.show()
System Info
Describe the characteristic of your environment:
Tests for the file minari_dataset should be added.
Hi, isn't it too harsh to raise an assertion error when observations or actions are not contained in their defined spaces?
Perhaps an option to disable it, or to set the error level to ignore/warn/error?
Minari/minari/data_collector/data_collector.py
Lines 198 to 203 in e24113b
Describe the bug
Have you tried combining more than two datasets at a time? When I try to do that, I get an error, while combining only two datasets works fine. I think this might be related to h5py/h5py#1385.
Error
Traceback (most recent call last):
File "rllib_mini/train_ppo.py", line 171, in <module>
train()
File "/usr/local/lib/python3.8/dist-packages/pyrallis/argparsing.py", line 158, in wrapper_inner
response = fn(cfg, *args, **kwargs)
File "rllib_mini/train_ppo.py", line 164, in train
comdined_dataset = minari.combine_datasets(datasets, new_dataset_id=f"{config.env_name}-dataset-v{config.version}")
File "/usr/local/lib/python3.8/dist-packages/minari/utils.py", line 132, in combine_datasets
combined_data_file.attrs["author"] = dataset_file.attrs["author"]
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "/usr/local/lib/python3.8/dist-packages/h5py/_hl/attrs.py", line 103, in __setitem__
self.create(name, data=value)
File "/usr/local/lib/python3.8/dist-packages/h5py/_hl/attrs.py", line 212, in create
h5a.delete(self._id, self._e(name))
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py/h5a.pyx", line 145, in h5py.h5a.delete
KeyError: 'Unable to delete attribute (record is not in B-tree)'
I am trying to build a dataset for the MiniHack environment. The environment has a fairly large dictionary space, with many keys but only one level of nesting. I don't want to save all the keys, because it greatly bloats the size of the dataset. I thought it could be done with StepDataCallback, like this:
class TTYStepDataCallback(minari.StepDataCallback):
    def __call__(self, env, obs, info, action=None, rew=None, terminated=None, truncated=None):
        tty_filter_keys = ["tty_chars", "tty_colors", "tty_cursor"]
        obs = {k: v for k, v in obs.items() if k in tty_filter_keys}
        step_data = super().__call__(env, obs, info, action, rew, terminated, truncated)
        return step_data
However, this raises an error, as some keys from the original observation space are missing. Why can't I just save only the ones I want? I am logging data during distributed PPO training, which uses the other keys too, so they are needed for online training :(
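One possible way around the check, assuming your Minari version lets DataCollector take an explicit observation_space override (reusing the TTYStepDataCallback above; the MiniHack env id is just an example):
import gymnasium as gym
from gymnasium import spaces
from minari import DataCollector

env = gym.make("MiniHack-River-v0")  # example MiniHack task
tty_keys = ["tty_chars", "tty_colors", "tty_cursor"]
# Declare the reduced space up front so the stored dataset and the
# contains-check both see only the tty keys
tty_space = spaces.Dict({k: env.observation_space[k] for k in tty_keys})
env = DataCollector(
    env,
    step_data_callback=TTYStepDataCallback,
    observation_space=tty_space,
)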
The DataCollectorV0 wrapper can optionally record the info from each step. However, when creating a dataset from the collector, the info data is lost.
Shouldn't EpisodeData also optionally keep the infos? Otherwise, what is the purpose of recording the info on the wrapper?
>>> python pointmaze_dataset.py
/home/graham/micromamba/envs/ctx/lib/python3.10/site-packages/gymnasium/core.py:297: UserWarning: WARN: env.maze to get variables from other wrappers is deprecated and will be removed in v1.0, to get this variable you can do `env.unwrapped.maze` for environment variables or `env.get_attr('maze')` that will search the reminding wrappers.
logger.warn(
Traceback (most recent call last):
File "/home/graham/code/inctxdt/compare/pointmaze/pointmaze_dataset.py", line 369, in <module>
obs, rew, terminated, truncated, info = collector_env.step(action)
File "/home/graham/micromamba/envs/ctx/lib/python3.10/site-packages/minari/data_collector/data_collector.py", line 216, in step
assert self.dataset_action_space.contains(
AssertionError: Actions are not in action space.
There also seems to be an issue with the spaces' .contains method, whereby the assert in data_collector.py can fail if the action dtype is float64 and the space dtype is float32.
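For the dtype mismatch, a simple workaround is to cast actions to the space's dtype before stepping, roughly:
import numpy as np

def step_with_cast(collector_env, action):
    # Box.contains is strict about dtype, so align the policy output with
    # the action space (e.g. float64 -> float32) before stepping
    action = np.asarray(action, dtype=collector_env.action_space.dtype)
    return collector_env.step(action)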
Hi,
the Discord link is invalid now. Would you like to update the invitation link?
Best
Hi,
I am curious whether there is any timeline for migrating the d4rl mujoco environments to minari, e.g. ant-medium/expert/...-v2/v4, hopper-medium/expert/...-v2/v4?
Though mujoco's -v2 environments are no longer maintained, they are still a popular baseline for research.
Thanks for your great work!
Hi,
would it be better to let "ref_max_score" and "ref_min_score" be attributes of the environment rather than the dataset, so that different datasets share a uniform normalization metric?
The d4RL implementation actually does this: for the environments corresponding to different datasets, like "env-xxx-xxx", ref_max_score and ref_min_score come from the same macro definition. https://github.com/Farama-Foundation/D4RL/blob/71a9549f2091accff93eeff68f1f3ab2c0e0a288/d4rl/gym_mujoco/__init__.py#L23-L31
So why don't we make them attributes of the environment directly, or a dictionary in the minari package?
Describe the bug
Total steps for the "pen-human-v0" dataset is None or doesn't exist. This seems to be true for all adroit datasets.
Code example
import minari
minari.download_dataset("pen-human-v0")
dataset = minari.load_dataset("pen-human-v0")
print(dataset.total_steps)
Additional context
However, this will work
print(dataset.spec.total_steps)
Why doesn't the code just take it from spec? Why is it either None or re-read from self._data?
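As a stopgap on the user side, reading from spec with a fallback, roughly:
import minari

minari.download_dataset("pen-human-v0")
dataset = minari.load_dataset("pen-human-v0")

# Prefer the spec value; recompute from episode lengths when it is missing
total_steps = dataset.spec.total_steps
if total_steps is None:
    total_steps = sum(ep.total_timesteps for ep in dataset.iterate_episodes())
print(total_steps)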
Hello, the majority of data collection from simulated environments comes from parallel data collectors such as VecEnvs, i.e. methods that return either step data with a batch dimension or, at the end of collection, a list-like collection of episode data.
For now the simpler scenario is to sample using a VecEnv, so is there any simple recipe for adapting the current DataCollector to the VecEnv format?
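There doesn't seem to be a built-in recipe. One workaround, sketched under the assumption that DataCollector.create_dataset and minari.combine_datasets work as in recent releases (the env and dataset ids are just examples), is to run one single-env collector per worker and merge afterwards:
import gymnasium as gym
import minari
from minari import DataCollector

datasets = []
for worker in range(4):
    env = DataCollector(gym.make("CartPole-v1"))
    env.reset(seed=worker)
    for _ in range(500):
        _, _, terminated, truncated, _ = env.step(env.action_space.sample())
        if terminated or truncated:
            env.reset()
    datasets.append(env.create_dataset(dataset_id=f"cartpole-worker{worker}-v0"))

# Merge the per-worker datasets into one
combined = minari.combine_datasets(datasets, new_dataset_id="cartpole-combined-v0")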
Describe the bug
It looks like whenever we run the pre-commit check, it throws errors about typing issues related to gymnasium. If gymnasium is installed, the pyright check in pre-commit reports errors related to the type hints of env_spec and the like. If we uninstall gymnasium, the errors disappear.
For more detail, check the code example and attached output. I think it's clear that the errors are caused by Optional type hints: the env attribute could be None, e.g. the environment's env_spec could be None, and then env_spec.make would be None.make(), so pyright throws an error.
Also, I believe that if we add a step pip install gymnasium (or install minari's dependencies) to the pre-commit action in the github workflows, pre-commit will throw these errors anyway.
Code example
To reproduce, just do the following:
$ conda create -n test3.11 python=3.11
$ conda activate test3.11
$ git clone [email protected]:Farama-Foundation/Minari.git
$ cd Minari
$ pip install -e . # or run pip install gymnasium
$ pip install pre-commit
$ pre-commit run --all-files
Then you will find there are four errors and quite a few warnings. If we run $ pip uninstall gymnasium, the errors and warnings disappear.
xxxxx/Minari/minari/utils.py
xxxxx/Minari/minari/utils.py:177:13 - error: Argument of type "Generator[int | None, None, None]" cannot be assigned to parameter "__iterable" of type "Iterable[SupportsRichComparisonT@max]" in function "max"
"Generator[int | None, None, None]" is incompatible with "Iterable[SupportsRichComparisonT@max]"
TypeVar "_T_co@Iterable" is covariant
Type "int | None" cannot be assigned to type "SupportsRichComparison"
Type "int | None" cannot be assigned to type "SupportsRichComparison"
Type "None" cannot be assigned to type "SupportsRichComparison" (reportGeneralTypeIssues)
xxxxx/Minari/minari/utils.py:344:60 - error: Argument of type "ActType@get_average_reference_score" cannot be assigned to parameter "action" of type "WrapperActType@Wrapper" in function "step"
Type "ActType@get_average_reference_score" cannot be assigned to type "WrapperActType@Wrapper" (reportGeneralTypeIssues)
xxxxx/Minari/minari/dataset/minari_dataset.py
xxxxx/Minari/minari/dataset/minari_dataset.py:279:41 - warning: "_total_episodes" is protected and used outside of the class in which it is declared (reportPrivateUsage)
xxxxx/Minari/minari/dataset/minari_dataset.py:314:41 - warning: "_total_episodes" is protected and used outside of the class in which it is declared (reportPrivateUsage)
xxxxx/Minari/tests/common.py
xxxxx/Minari/tests/common.py:566:32 - warning: "_buffer" is protected and used outside of the class in which it is declared (reportPrivateUsage)
xxxxx/Minari/tests/common.py:568:28 - warning: "_buffer" is protected and used outside of the class in which it is declared (reportPrivateUsage)
xxxxx/Minari/tests/data_collector/callbacks/test_step_data_callback.py
xxxxx/Minari/tests/data_collector/callbacks/test_step_data_callback.py:104:34 - warning: "_data" is protected and used outside of the class in which it is declared (reportPrivateUsage)
xxxxx/Minari/tests/dataset/test_dataset_download.py
xxxxx/Minari/tests/dataset/test_dataset_download.py:59:34 - warning: "_data" is protected and used outside of the class in which it is declared (reportPrivateUsage)
xxxxx/Minari/tests/dataset/test_minari_dataset.py
xxxxx/Minari/tests/dataset/test_minari_dataset.py:94:32 - warning: "_buffer" is protected and used outside of the class in which it is declared (reportPrivateUsage)
xxxxx/Minari/tests/dataset/test_minari_dataset.py:96:28 - warning: "_buffer" is protected and used outside of the class in which it is declared (reportPrivateUsage)
xxxxx/Minari/tests/dataset/test_minari_dataset.py:107:34 - warning: "_data" is protected and used outside of the class in which it is declared (reportPrivateUsage)
xxxxx/Minari/tests/dataset/test_minari_dataset.py:154:26 - warning: "_data" is protected and used outside of the class in which it is declared (reportPrivateUsage)
xxxxx/Minari/tests/dataset/test_minari_dataset.py:168:32 - warning: "_buffer" is protected and used outside of the class in which it is declared (reportPrivateUsage)
xxxxx/Minari/tests/dataset/test_minari_dataset.py:170:28 - warning: "_buffer" is protected and used outside of the class in which it is declared (reportPrivateUsage)
xxxxx/Minari/tests/dataset/test_minari_dataset.py:199:19 - warning: "_episode_indices" is protected and used outside of the class in which it is declared (reportPrivateUsage)
xxxxx/Minari/tests/dataset/test_minari_dataset.py:200:29 - warning: "_data" is protected and used outside of the class in which it is declared (reportPrivateUsage)
xxxxx/Minari/tests/dataset/test_minari_dataset.py:201:20 - warning: "_data" is protected and used outside of the class in which it is declared (reportPrivateUsage)
xxxxx/Minari/tests/dataset/test_minari_dataset.py:288:19 - warning: "_episode_indices" is protected and used outside of the class in which it is declared (reportPrivateUsage)
xxxxx/Minari/tests/dataset/test_minari_dataset.py:289:29 - warning: "_data" is protected and used outside of the class in which it is declared (reportPrivateUsage)
xxxxx/Minari/tests/dataset/test_minari_dataset.py:290:20 - warning: "_data" is protected and used outside of the class in which it is declared (reportPrivateUsage)
xxxxx/Minari/tests/dataset/test_minari_dataset.py:466:34 - warning: "_data" is protected and used outside of the class in which it is declared (reportPrivateUsage)
xxxxx/Minari/tests/utils/test_dataset_combine.py
xxxxx/Minari/tests/utils/test_dataset_combine.py:74:18 - error: "max_episode_steps" is not a known member of "None" (reportOptionalMemberAccess)
xxxxx/Minari/tests/utils/test_dataset_combine.py:75:24 - error: "make" is not a known member of "None" (reportOptionalMemberAccess)
xxxxx/Minari/tests/utils/test_dataset_creation.py
xxxxx/Minari/tests/utils/test_dataset_creation.py:59:32 - warning: "_buffer" is protected and used outside of the class in which it is declared (reportPrivateUsage)
xxxxx/Minari/tests/utils/test_dataset_creation.py:61:28 - warning: "_buffer" is protected and used outside of the class in which it is declared (reportPrivateUsage)
xxxxx/Minari/tests/utils/test_dataset_creation.py:80:34 - warning: "_data" is protected and used outside of the class in which it is declared (reportPrivateUsage)
xxxxx/Minari/tests/utils/test_dataset_creation.py:173:34 - warning: "_data" is protected and used outside of the class in which it is declared (reportPrivateUsage)
xxxxx/Minari/tests/utils/test_dataset_creation.py:287:34 - warning: "_data" is protected and used outside of the class in which it is declared (reportPrivateUsage)
4 errors, 24 warnings, 0 informations
Completed in 1.5sec
System Info
Describe the characteristic of your environment:
I've been running into this error lately when trying to minari download
any dataset. Replicated on both Windows and Linux:
>minari download door-human-v1
with the result
Python\Python310\lib\site-packages\minari\storage\hosting.py:252: UserWarning: Misconfigured dataset named door-human-v1/data/main_data.hdf5 on remote warnings.warn(f"Misconfigured dataset named {blob.name} on remote")
Running minari show on the dataset returns an AssertionError. (It's nicely formatted in the terminal, but it does not lend itself to being copy-pasted, so I put the content in the attached .txt file.)
minarierror.txt
I tried running the dataset-related test suites, and they all passed. After running them, everything worked fine. But I've replicated this behavior after deleting the datasets folder and reinstalling Minari, both on Windows and Linux (Ubuntu). It's not the worst problem in the world, since at least one of the tests (or some combination of them - I didn't pin down which one) fixes it, but it would be nice to use the datasets without having to clone the repository and run the tests beforehand.
I did edit hosting.py to indicate which blob was returning the error, and it gives <Blob: minari-datasets, door-human-v1/data/main_data.hdf5, 1713638657975740>, even if I've deleted the door-human-v1 dataset. I'm guessing there's some sort of corrupt config file being referenced, which is not deleted when Minari is reinstalled, though perhaps it's being deleted and recreated by the tests? That would make sense, since manually deleting the datasets doesn't update the config file, though it doesn't explain why there would be an error even after they're redownloaded, and it doesn't explain why I ran into this problem in the first place.
Is there a software citation for Minari available for citing it in a paper?
Add docs to the webpage for each dataset in the remote GCP bucket (https://minari.farama.org/main/datasets/coming_soon/). Generate the docs automatically when building the website; this can be done similarly to the Gymnasium environment docs.
The side menu in the docs webpage should have the names of the datasets classified by environment id:
door-human-v0
door-expert-v0
door-cloned-v0
For the docs of each dataset include the following:
Spec | Value
---|---
total_episodes | -
total_timesteps | -
flatten_observations | -
flatten_actions | -
algorithm | -
code_permalink | -
author | -
author email | -
These specs can be retrieved with the minari.list_remote_datasets() function.
Describe the bug
Hi, glad to know that the info in the dataset is also iterable now. However, I just found that the existing datasets lose the info from every environment reset.
According to the current data collection standard, https://minari.farama.org/content/dataset_standards/#additional-information-formatting, the collector collects info on env.reset() and on every env.step(), i.e. it should satisfy len(observations) == len(info[xxx]). However, I tested some datasets and they only satisfy len(actions) == len(info[xxx]). I guess this may be related to d4rl: when d4rl was released, env.reset() didn't return any infos.
We may need to fix this issue.
Code example
I tested two datasets, but the others should be the same:
import minari

for name in ['door-human-v1', 'hammer-expert-v1']:
    dt = minari.load_dataset(name)
    obs = dt[1].observations
    act = dt[1].actions
    info = dt[1].infos
    info_key = list(info.keys())[0]
    info_i = info[info_key]
    print(len(info_i) == len(act))
    print(len(info_i) == len(obs))
System Info
I think it is not important to report system info for this issue.
Describe the bug
Setting the author email to None, then saving the dataset, will lead to the following exception:
dataset.save()
File "minari/dataset.pyx", line 576, in minari.dataset.MinariDataset.save
File "minari/dataset.pyx", line 582, in minari.dataset.MinariDataset.save
File "/Users/markus/OpenSource/venvs/gymnasium_env/lib/python3.10/site-packages/h5py/_hl/group.py", line 161, in create_dataset
dsid = dataset.make_new_dset(group, shape, dtype, data, name, **kwds)
File "/Users/markus/OpenSource/venvs/gymnasium_env/lib/python3.10/site-packages/h5py/_hl/dataset.py", line 54, in make_new_dset
raise TypeError("One of data, shape or dtype must be specified")
TypeError: One of data, shape or dtype must be specified
I would expect None to be a valid choice.
Code example
Just set the author email to None in tutorials/LocalStorage/local_storage.py.
System Info
Describe the characteristic of your environment:
PS C:\Users\noahs> py -m pip install --upgrade minari
Collecting minari
Using cached minari-0.3.1-py3-none-any.whl (30 kB)
Requirement already satisfied: numpy>=1.21.0 in c:\users\noahs\appdata\local\programs\python\python311\lib\site-packages (from minari) (1.23.5)
Collecting h5py==3.7.0 (from minari)
Using cached h5py-3.7.0.tar.gz (392 kB)
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing metadata (pyproject.toml) ... done
Collecting tqdm>=4.65.0 (from minari)
Using cached tqdm-4.65.0-py3-none-any.whl (77 kB)
Requirement already satisfied: typing-extensions==4.4.0 in c:\users\noahs\appdata\local\programs\python\python311\lib\site-packages (from minari) (4.4.0)
Collecting google-cloud-storage==2.5.0 (from minari)
Using cached google_cloud_storage-2.5.0-py2.py3-none-any.whl (106 kB)
Collecting typer[all]==0.7.0 (from minari)
Using cached typer-0.7.0-py3-none-any.whl (38 kB)
Requirement already satisfied: gymnasium>=0.28.1 in c:\users\noahs\appdata\local\programs\python\python311\lib\site-packages (from minari) (0.28.1)
Requirement already satisfied: google-auth<3.0dev,>=1.25.0 in c:\users\noahs\appdata\local\programs\python\python311\lib\site-packages (from google-cloud-storage==2.5.0->minari) (2.16.2)
Collecting google-api-core!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.0,<3.0.0dev,>=1.31.5 (from google-cloud-storage==2.5.0->minari)
Using cached google_api_core-2.11.0-py3-none-any.whl (120 kB)
Collecting google-cloud-core<3.0dev,>=2.3.0 (from google-cloud-storage==2.5.0->minari)
Using cached google_cloud_core-2.3.2-py2.py3-none-any.whl (29 kB)
Collecting google-resumable-media>=2.3.2 (from google-cloud-storage==2.5.0->minari)
Using cached google_resumable_media-2.5.0-py2.py3-none-any.whl (77 kB)
Requirement already satisfied: requests<3.0.0dev,>=2.18.0 in c:\users\noahs\appdata\local\programs\python\python311\lib\site-packages (from google-cloud-storage==2.5.0->minari) (2.28.2)
Requirement already satisfied: click<9.0.0,>=7.1.1 in c:\users\noahs\appdata\local\programs\python\python311\lib\site-packages (from typer[all]==0.7.0->minari) (8.1.3)
Requirement already satisfied: colorama<0.5.0,>=0.4.3 in c:\users\noahs\appdata\local\programs\python\python311\lib\site-packages (from typer[all]==0.7.0->minari) (0.4.6)
Collecting shellingham<2.0.0,>=1.3.0 (from typer[all]==0.7.0->minari)
Using cached shellingham-1.5.0.post1-py2.py3-none-any.whl (9.4 kB)
Requirement already satisfied: rich<13.0.0,>=10.11.0 in c:\users\noahs\appdata\local\programs\python\python311\lib\site-packages (from typer[all]==0.7.0->minari) (12.6.0)
Requirement already satisfied: jax-jumpy>=1.0.0 in c:\users\noahs\appdata\local\programs\python\python311\lib\site-packages (from gymnasium>=0.28.1->minari) (1.0.0)
Requirement already satisfied: cloudpickle>=1.2.0 in c:\users\noahs\appdata\local\programs\python\python311\lib\site-packages (from gymnasium>=0.28.1->minari) (2.2.1)
Requirement already satisfied: farama-notifications>=0.0.1 in c:\users\noahs\appdata\local\programs\python\python311\lib\site-packages (from gymnasium>=0.28.1->minari) (0.0.4)
Collecting googleapis-common-protos<2.0dev,>=1.56.2 (from google-api-core!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.0,<3.0.0dev,>=1.31.5->google-cloud-storage==2.5.0->minari)
Using cached googleapis_common_protos-1.59.0-py2.py3-none-any.whl (223 kB)
Requirement already satisfied: protobuf!=3.20.0,!=3.20.1,!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<5.0.0dev,>=3.19.5 in c:\users\noahs\appdata\local\programs\python\python311\lib\site-packages (from google-api-core!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.0,<3.0.0dev,>=1.31.5->google-cloud-storage==2.5.0->minari) (4.22.1)
Requirement already satisfied: cachetools<6.0,>=2.0.0 in c:\users\noahs\appdata\local\programs\python\python311\lib\site-packages (from google-auth<3.0dev,>=1.25.0->google-cloud-storage==2.5.0->minari) (5.3.0)
Requirement already satisfied: pyasn1-modules>=0.2.1 in c:\users\noahs\appdata\local\programs\python\python311\lib\site-packages (from google-auth<3.0dev,>=1.25.0->google-cloud-storage==2.5.0->minari) (0.2.8)
Requirement already satisfied: six>=1.9.0 in c:\users\noahs\appdata\local\programs\python\python311\lib\site-packages (from google-auth<3.0dev,>=1.25.0->google-cloud-storage==2.5.0->minari) (1.16.0)
Requirement already satisfied: rsa<5,>=3.1.4 in c:\users\noahs\appdata\local\programs\python\python311\lib\site-packages (from google-auth<3.0dev,>=1.25.0->google-cloud-storage==2.5.0->minari) (4.9)
Collecting google-crc32c<2.0dev,>=1.0 (from google-resumable-media>=2.3.2->google-cloud-storage==2.5.0->minari)
Using cached google_crc32c-1.5.0-cp311-cp311-win_amd64.whl (27 kB)
Requirement already satisfied: charset-normalizer<4,>=2 in c:\users\noahs\appdata\local\programs\python\python311\lib\site-packages (from requests<3.0.0dev,>=2.18.0->google-cloud-storage==2.5.0->minari) (3.0.1)
Requirement already satisfied: idna<4,>=2.5 in c:\users\noahs\appdata\local\programs\python\python311\lib\site-packages (from requests<3.0.0dev,>=2.18.0->google-cloud-storage==2.5.0->minari) (3.4)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in c:\users\noahs\appdata\local\programs\python\python311\lib\site-packages (from requests<3.0.0dev,>=2.18.0->google-cloud-storage==2.5.0->minari) (1.26.14)
Requirement already satisfied: certifi>=2017.4.17 in c:\users\noahs\appdata\local\programs\python\python311\lib\site-packages (from requests<3.0.0dev,>=2.18.0->google-cloud-storage==2.5.0->minari) (2022.12.7)
Requirement already satisfied: commonmark<0.10.0,>=0.9.0 in c:\users\noahs\appdata\local\programs\python\python311\lib\site-packages (from rich<13.0.0,>=10.11.0->typer[all]==0.7.0->minari) (0.9.1)
Requirement already satisfied: pygments<3.0.0,>=2.6.0 in c:\users\noahs\appdata\local\programs\python\python311\lib\site-packages (from rich<13.0.0,>=10.11.0->typer[all]==0.7.0->minari) (2.14.0)
Requirement already satisfied: pyasn1<0.5.0,>=0.4.6 in c:\users\noahs\appdata\local\programs\python\python311\lib\site-packages (from pyasn1-modules>=0.2.1->google-auth<3.0dev,>=1.25.0->google-cloud-storage==2.5.0->minari) (0.4.8)
Building wheels for collected packages: h5py
Building wheel for h5py (pyproject.toml) ... error
error: subprocess-exited-with-error
× Building wheel for h5py (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [72 lines of output]
running bdist_wheel
running build
running build_py
creating build
creating build\lib.win-amd64-cpython-311
creating build\lib.win-amd64-cpython-311\h5py
copying h5py\h5py_warnings.py -> build\lib.win-amd64-cpython-311\h5py
copying h5py\ipy_completer.py -> build\lib.win-amd64-cpython-311\h5py
copying h5py\version.py -> build\lib.win-amd64-cpython-311\h5py
copying h5py\__init__.py -> build\lib.win-amd64-cpython-311\h5py
creating build\lib.win-amd64-cpython-311\h5py\_hl
copying h5py\_hl\attrs.py -> build\lib.win-amd64-cpython-311\h5py\_hl
copying h5py\_hl\base.py -> build\lib.win-amd64-cpython-311\h5py\_hl
copying h5py\_hl\compat.py -> build\lib.win-amd64-cpython-311\h5py\_hl
copying h5py\_hl\dataset.py -> build\lib.win-amd64-cpython-311\h5py\_hl
copying h5py\_hl\datatype.py -> build\lib.win-amd64-cpython-311\h5py\_hl
copying h5py\_hl\dims.py -> build\lib.win-amd64-cpython-311\h5py\_hl
copying h5py\_hl\files.py -> build\lib.win-amd64-cpython-311\h5py\_hl
copying h5py\_hl\filters.py -> build\lib.win-amd64-cpython-311\h5py\_hl
copying h5py\_hl\group.py -> build\lib.win-amd64-cpython-311\h5py\_hl
copying h5py\_hl\selections.py -> build\lib.win-amd64-cpython-311\h5py\_hl
copying h5py\_hl\selections2.py -> build\lib.win-amd64-cpython-311\h5py\_hl
copying h5py\_hl\vds.py -> build\lib.win-amd64-cpython-311\h5py\_hl
copying h5py\_hl\__init__.py -> build\lib.win-amd64-cpython-311\h5py\_hl
creating build\lib.win-amd64-cpython-311\h5py\tests
copying h5py\tests\common.py -> build\lib.win-amd64-cpython-311\h5py\tests
copying h5py\tests\conftest.py -> build\lib.win-amd64-cpython-311\h5py\tests
copying h5py\tests\test_attribute_create.py -> build\lib.win-amd64-cpython-311\h5py\tests
copying h5py\tests\test_attrs.py -> build\lib.win-amd64-cpython-311\h5py\tests
copying h5py\tests\test_attrs_data.py -> build\lib.win-amd64-cpython-311\h5py\tests
copying h5py\tests\test_base.py -> build\lib.win-amd64-cpython-311\h5py\tests
copying h5py\tests\test_big_endian_file.py -> build\lib.win-amd64-cpython-311\h5py\tests
copying h5py\tests\test_completions.py -> build\lib.win-amd64-cpython-311\h5py\tests
copying h5py\tests\test_dataset.py -> build\lib.win-amd64-cpython-311\h5py\tests
copying h5py\tests\test_dataset_getitem.py -> build\lib.win-amd64-cpython-311\h5py\tests
copying h5py\tests\test_dataset_swmr.py -> build\lib.win-amd64-cpython-311\h5py\tests
copying h5py\tests\test_datatype.py -> build\lib.win-amd64-cpython-311\h5py\tests
copying h5py\tests\test_dimension_scales.py -> build\lib.win-amd64-cpython-311\h5py\tests
copying h5py\tests\test_dims_dimensionproxy.py -> build\lib.win-amd64-cpython-311\h5py\tests
copying h5py\tests\test_dtype.py -> build\lib.win-amd64-cpython-311\h5py\tests
copying h5py\tests\test_errors.py -> build\lib.win-amd64-cpython-311\h5py\tests
copying h5py\tests\test_file.py -> build\lib.win-amd64-cpython-311\h5py\tests
copying h5py\tests\test_file2.py -> build\lib.win-amd64-cpython-311\h5py\tests
copying h5py\tests\test_file_alignment.py -> build\lib.win-amd64-cpython-311\h5py\tests
copying h5py\tests\test_file_image.py -> build\lib.win-amd64-cpython-311\h5py\tests
copying h5py\tests\test_filters.py -> build\lib.win-amd64-cpython-311\h5py\tests
copying h5py\tests\test_group.py -> build\lib.win-amd64-cpython-311\h5py\tests
copying h5py\tests\test_h5.py -> build\lib.win-amd64-cpython-311\h5py\tests
copying h5py\tests\test_h5d_direct_chunk.py -> build\lib.win-amd64-cpython-311\h5py\tests
copying h5py\tests\test_h5f.py -> build\lib.win-amd64-cpython-311\h5py\tests
copying h5py\tests\test_h5o.py -> build\lib.win-amd64-cpython-311\h5py\tests
copying h5py\tests\test_h5p.py -> build\lib.win-amd64-cpython-311\h5py\tests
copying h5py\tests\test_h5pl.py -> build\lib.win-amd64-cpython-311\h5py\tests
copying h5py\tests\test_h5t.py -> build\lib.win-amd64-cpython-311\h5py\tests
copying h5py\tests\test_objects.py -> build\lib.win-amd64-cpython-311\h5py\tests
copying h5py\tests\test_selections.py -> build\lib.win-amd64-cpython-311\h5py\tests
copying h5py\tests\test_slicing.py -> build\lib.win-amd64-cpython-311\h5py\tests
copying h5py\tests\__init__.py -> build\lib.win-amd64-cpython-311\h5py\tests
creating build\lib.win-amd64-cpython-311\h5py\tests\data_files
copying h5py\tests\data_files\__init__.py -> build\lib.win-amd64-cpython-311\h5py\tests\data_files
creating build\lib.win-amd64-cpython-311\h5py\tests\test_vds
copying h5py\tests\test_vds\test_highlevel_vds.py -> build\lib.win-amd64-cpython-311\h5py\tests\test_vds
copying h5py\tests\test_vds\test_lowlevel_vds.py -> build\lib.win-amd64-cpython-311\h5py\tests\test_vds
copying h5py\tests\test_vds\test_virtual_source.py -> build\lib.win-amd64-cpython-311\h5py\tests\test_vds
copying h5py\tests\test_vds\__init__.py -> build\lib.win-amd64-cpython-311\h5py\tests\test_vds
copying h5py\tests\data_files\vlen_string_dset.h5 -> build\lib.win-amd64-cpython-311\h5py\tests\data_files
copying h5py\tests\data_files\vlen_string_dset_utc.h5 -> build\lib.win-amd64-cpython-311\h5py\tests\data_files
copying h5py\tests\data_files\vlen_string_s390x.h5 -> build\lib.win-amd64-cpython-311\h5py\tests\data_files
running build_ext
Loading library to get build settings and version: hdf5.dll
error: Unable to load dependency HDF5, make sure HDF5 is installed properly
error: Could not find module 'hdf5.dll' (or one of its dependencies). Try using the full path with constructor syntax.
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for h5py
Failed to build h5py
ERROR: Could not build wheels for h5py, which is required to install pyproject.toml-based projects
Hi, please state clearly in the documentation and dataset definition whether, in a time step, "r_0" is a consequence of "a_0".
With previous offline RL libs, there has been some confusion in this respect.
With the standard in RL being (s, a, r, s'), one assumes that r is a consequence of applying action a in state s.
If r is not, please state it clearly, because then r(s, a) should be r_1 and not r_0.
Thanks!
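For what it's worth, the dataset standard discussed elsewhere in these issues (observations one entry longer than actions and rewards) implies the usual (s, a, r, s') reading; schematically:
def check_reward_convention(episode):
    # Per episode: observations has length T + 1, while actions and rewards
    # have length T; rewards[t] is received after taking actions[t]
    # from observations[t], i.e. r_t = r(s_t, a_t)
    assert len(episode.observations) == len(episode.actions) + 1
    assert len(episode.rewards) == len(episode.actions)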
Is this compatible with PettingZoo? I've adapted a Gym environment into a custom PettingZoo environment, and I wanted to use Minari for some offline RL. However, it seems DataCollector needs an environment instantiated by "gym.make", and whenever I try to make my environment "registrable" I get errors, because the required interface does not seem to be compatible with PettingZoo's.
Is this a problem with PettingZoo or Minari? Or mine?
Thanks
Always remove the data generated by tests, whether they fail or not.
Currently, tests leave leftovers. This can make all the tests fail after one fails, because of minari.list_local_datasets().
We can use pytest.fixture and a temporary directory to make sure everything is cleaned up afterwards.
See here for an example.
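A sketch of such a fixture, assuming Minari honors the MINARI_DATASETS_PATH environment variable for its local storage root:
import pytest

@pytest.fixture(autouse=True)
def isolated_minari_storage(tmp_path, monkeypatch):
    # Redirect dataset reads/writes to a per-test temporary directory;
    # pytest cleans up old tmp_path dirs whether tests pass or fail
    monkeypatch.setenv("MINARI_DATASETS_PATH", str(tmp_path))
    yield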
Describe the bug
When creating a BabyAI dataset, I get the following NotImplementedError: No serialization method available for MissionSpace(<function BabyAIMissionSpace._gen_mission at 0x7f7a93fcdb90>, None)
Code example
import gymnasium as gym
import minari
from minari import DataCollectorV0
dataset_id = 'BabyAI-GoToLocal-v0'
env = gym.make(dataset_id)
env = DataCollectorV0(env, record_infos=True, max_buffer_steps=100000)
total_episodes = 100
for _ in range(total_episodes):
env.reset(seed=123)
while True:
# random action policy
action = env.action_space.sample()
obs, rew, terminated, truncated, info = env.step(action)
if terminated or truncated:
break
dataset = minari.create_dataset_from_collector_env(dataset_id=dataset_id,
collector_env=env,
algorithm_name="Random-Policy"
)
System Info
Describe the characteristic of your environment:
Additional context
BabyAI is part of MiniGrid: BabyAI GoToLocal
Shouldn't iterating over episodes produce "all" the episode data, including infos and other stored keys?
After that, we could use more specific iterators to select just a subset of the keys.
I.e., for what reason are we able to store the x-position in MuJoCo Hopper if we can't read it when iterating? And why can't we access infos per episode?
If download=True, fetch the dataset by name from the remote Farama server when it is not found locally.
This functionality is similar to PyTorch dataset loading.
Proposal
I was wondering if there's interest in making humanoid part of D4RL. The dataset was introduced in the following paper
Paper: https://arxiv.org/abs/2305.14550
The link for the data is here: https://dl.fbaipublicfiles.com/prajj/rl_paradigm/humanoid_offline_rl_data.tar.gz
Repo : https://github.com/prajjwal1/rl_paradigm
Motivation
Making humanoid available would be helpful to the offline RL community for the same reasons the existing datasets are. Humanoid is more challenging in some ways than the existing D4RL datasets, for instance in its state space dimension.
We provide medium, medium-expert, and expert data for humanoid, all in the same format as D4RL. It contains a lot more timestep data than the existing D4RL datasets provide.
Hi,
it looks strange that the install requires typing-extensions==4.4.0 exactly, since an exact pin can frequently cause conflicts.
I checked the usage of typing-extensions; it is only used in two places:
- TypedDict, used for StepData, here
- Annotated, used here
So, could the typing-extensions==4.4.0 requirement be removed?
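If those are really the only two uses, a version-guarded import could drop the hard pin on Python 3.9+ (a sketch, not tested against minari's packaging):
import sys

if sys.version_info >= (3, 9):
    # TypedDict landed in typing in 3.8, Annotated in 3.9
    from typing import Annotated, TypedDict
else:
    from typing_extensions import Annotated, TypedDict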
This is more a question than a bug report.
In torchrl we test a random set of datasets extracted from list_remote_datasets
Since the last release, this test has been broken: https://github.com/pytorch/rl/actions/runs/8649362912/job/23715408536
The reason is that a dataset picked from the list has the right major version (0.4) but the wrong minor, and isn't available for this minor anymore.
My question is: would making list_remote_datasets display non-compatible datasets be useful? (I'd expect people to be surprised to see datasets they can't work with in the list, except if they ask for them.)

I am trying to load a dataset created from a Minigrid-type environment using the following code:
import os
import gymnasium as gym
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from gymnasium import spaces
from stable_baselines3 import PPO
from torch.utils.data import DataLoader
from tqdm.auto import tqdm
import minari
from minari import DataCollectorV0
def collate_fn(batch):
    return {
        "id": torch.Tensor([x.id for x in batch]),
        "seed": torch.Tensor([x.seed for x in batch]),
        "total_timesteps": torch.Tensor([x.total_timesteps for x in batch]),
        "observations": torch.nn.utils.rnn.pad_sequence(
            [torch.as_tensor(x.observations) for x in batch],
            batch_first=True
        ),
        "actions": torch.nn.utils.rnn.pad_sequence(
            [torch.as_tensor(x.actions) for x in batch],
            batch_first=True
        ),
        "rewards": torch.nn.utils.rnn.pad_sequence(
            [torch.as_tensor(x.rewards) for x in batch],
            batch_first=True
        ),
        "terminations": torch.nn.utils.rnn.pad_sequence(
            [torch.as_tensor(x.terminations) for x in batch],
            batch_first=True
        ),
        "truncations": torch.nn.utils.rnn.pad_sequence(
            [torch.as_tensor(x.truncations) for x in batch],
            batch_first=True
        )
    }
torch.manual_seed(42)
minari_testset = minari.load_dataset("MinigridRandomWall-6Spots-v0")
dataloader = DataLoader(minari_testset, batch_size=64, shuffle=True, collate_fn=collate_fn)
for batch in dataloader:
    print("Observation shape: " + str(batch['observations'].shape))
    print("Action shape: " + str(batch['actions'].shape))
    print("Reward shape: " + str(batch['rewards'].shape))
    print("Timestep shape " + str(batch["infos"]["timestep"].shape))
When I run the code, I get this error:
minari_testset = minari.load_dataset("MinigridRandomWall-6Spots-v0")
File "/home/justin/Minari/minari/storage/local.py", line 22, in load_dataset
return MinariDataset(data_path)
File "/home/justin/Minari/minari/dataset/minari_dataset.py", line 133, in __init__
self._data = MinariStorage(data)
File "/home/justin/Minari/minari/dataset/minari_storage.py", line 22, in __init__
flatten_observations = f.attrs["flatten_observation"].item()
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "/usr/local/lib/python3.8/dist-packages/h5py/_hl/attrs.py", line 56, in __getitem__
attr = h5a.open(self._id, self._e(name))
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py/h5a.pyx", line 80, in h5py.h5a.open
KeyError: "Can't open attribute (can't locate attribute in name index)"
When I created the dataset, I used an ImgObsWrapper for the environment. Could that be the source of the problem?