Comments (7)
How often does it happen?
It happens every time I run that environment, usually before 50k steps. It never happened with basic or defend the center.
Info about my environment:
I'm running everything on Linux Mint 21.1 Vera, under Anaconda, using the Spyder IDE.
ViZDoom version: 1.2.0
Gymnasium version: 0.26.3
Stable-Baselines3 version: 2.0.0a5
Let me know if you need any more information about my environment.
EDIT: Updated Gymnasium to 0.28.1, still getting the same problem.
Here's the traceback:
File ~/anaconda3/lib/python3.9/site-packages/spyder_kernels/py3compat.py:356 in compat_exec
exec(code, globals, locals)
File ~/TFM/Doom_RL/vizdoom_A2C.py:248
model.learn(total_timesteps=3000000, callback=callback, progress_bar=True)
File ~/anaconda3/lib/python3.9/site-packages/stable_baselines3/a2c/a2c.py:194 in learn
return super().learn(
File ~/anaconda3/lib/python3.9/site-packages/stable_baselines3/common/on_policy_algorithm.py:259 in learn
continue_training = self.collect_rollouts(self.env, callback, self.rollout_buffer, n_rollout_steps=self.n_steps)
File ~/anaconda3/lib/python3.9/site-packages/stable_baselines3/common/on_policy_algorithm.py:178 in collect_rollouts
new_obs, rewards, dones, infos = env.step(clipped_actions)
File ~/anaconda3/lib/python3.9/site-packages/stable_baselines3/common/vec_env/base_vec_env.py:171 in step
return self.step_wait()
File ~/anaconda3/lib/python3.9/site-packages/stable_baselines3/common/vec_env/vec_transpose.py:95 in step_wait
observations, rewards, dones, infos = self.venv.step_wait()
File ~/anaconda3/lib/python3.9/site-packages/stable_baselines3/common/vec_env/dummy_vec_env.py:69 in step_wait
obs, self.reset_infos[env_idx] = self.envs[env_idx].reset()
File ~/anaconda3/lib/python3.9/site-packages/stable_baselines3/common/monitor.py:83 in reset
return self.env.reset(**kwargs)
File ~/TFM/Doom_RL/vizdoom_A2C.py:208 in reset
state = self.game.get_state().screen_buffer
AttributeError: 'NoneType' object has no attribute 'screen_buffer'
from vizdoom.
So at the moment, I think the reason might be that your .cfg or .wad files were somehow modified and, for example, now allow the agent to be killed before the episode starts
Thanks for this! I modified my cfg file and set the episode start time to 1. After 100k steps it was running smoothly.
It seems that, for some reason, you can get killed sooner in that scenario than in the others.
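For anyone who wants to check their own config before training, here is a minimal sketch that reads the start time out of a ViZDoom .cfg file. It assumes the key is spelled `episode_start_time`, that `#` starts a comment, and that the default is 1 when the key is missing; the helper name `read_episode_start_time` and the sample values are made up for illustration.

```python
import re


def read_episode_start_time(cfg_text, default=1):
    """Return the episode_start_time found in ViZDoom config text.

    Assumption: `default` mirrors ViZDoom's own default when the key is absent.
    """
    for raw_line in cfg_text.splitlines():
        # Drop anything after a '#', assumed to be the comment marker.
        line = raw_line.split("#", 1)[0].strip()
        match = re.fullmatch(r"episode_start_time\s*=\s*(\d+)", line, flags=re.IGNORECASE)
        if match:
            return int(match.group(1))
    return default


# Hypothetical config text, not copied from any official scenario file.
SAMPLE_CFG = """
doom_scenario_path = deadly_corridor.wad
episode_start_time = 14   # illustrative value only
episode_timeout = 2100
"""

start_time = read_episode_start_time(SAMPLE_CFG)
if start_time > 1:
    print(f"episode_start_time is {start_time}: the agent can take damage before the first state")
```

As far as I know the same value can also be set from Python with `game.set_episode_start_time(1)` before `game.init()`, which avoids editing the file.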
Hi @MetallicaSPA! I may need some help to fully understand what is happening. If you mean that from time to time, you get None from get_state(), then this is expected. In the original ViZDoom API get_state() will return None if the episode ends/reaches the terminal state. So you should always check if it's None or use the self.game.is_episode_finished() check.
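The None check described above could look roughly like the sketch below. Only `get_state()`, `is_episode_finished()`, `new_episode()` and `make_action()` are real `DoomGame` methods; `FakeGame` is a hypothetical stand-in so the snippet runs without ViZDoom installed, and `read_screen` is an illustrative helper name, not part of any library.

```python
from types import SimpleNamespace


class FakeGame:
    """Hypothetical stand-in for vizdoom.DoomGame, only for this sketch."""

    def __init__(self, frames_per_episode=2):
        self.frames_per_episode = frames_per_episode
        self.frame = 0

    def new_episode(self):
        self.frame = 0

    def make_action(self, action, tics=1):
        self.frame += 1
        return 0.0

    def is_episode_finished(self):
        return self.frame >= self.frames_per_episode

    def get_state(self):
        # Same contract as described above: no state once the episode is over.
        if self.is_episode_finished():
            return None
        return SimpleNamespace(screen_buffer=[[self.frame]])


def read_screen(game, fallback):
    """Return the current screen buffer, or `fallback` when there is no state."""
    state = game.get_state()  # call once and reuse the result
    if state is None:         # terminal state: nothing to read
        return fallback
    return state.screen_buffer


game = FakeGame(frames_per_episode=1)
game.new_episode()
first = read_screen(game, fallback="blank")   # a real frame
game.make_action([1, 0, 0])
after = read_screen(game, fallback="blank")   # episode finished, fallback used
```

In a custom `reset()` the fallback would typically be a zero-filled array with the observation shape, so the method never dereferences None.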
If your problem is that self.game.new_episode() doesn't reset your episode then this is unexpected, but I would need a code sample to run to see what is happening.
Also, we now provide official wrappers for Gym and Gymnasium, so you don't need to implement them yourself! Check https://github.com/Farama-Foundation/ViZDoom/tree/master/examples/python directory for Gym, Gymnasium and StableBaselines examples.
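As a rough sketch of what using the official Gymnasium wrapper might look like: importing `vizdoom.gymnasium_wrapper` should register the `Vizdoom*` environment IDs. The ID `VizdoomCorridor-v0` for the deadly corridor scenario is my assumption, so please check it against the linked examples. `make_official_env` and `run_random_episode` are illustrative helper names.

```python
ENV_ID = "VizdoomCorridor-v0"  # assumed ID for deadly corridor; verify in the examples


def make_official_env(env_id=ENV_ID, **kwargs):
    """Create an environment through the official wrapper."""
    # Imported lazily so this file can be loaded without ViZDoom installed.
    import gymnasium
    from vizdoom import gymnasium_wrapper  # noqa: F401  (import registers the IDs)

    return gymnasium.make(env_id, **kwargs)


def run_random_episode(env):
    """Play one episode with random actions and return the summed reward."""
    obs, info = env.reset()
    terminated = truncated = False
    total_reward = 0.0
    while not (terminated or truncated):
        action = env.action_space.sample()
        obs, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
    return total_reward
```

Calling `run_random_episode(make_official_env())` would then play a single episode; the wrapper handles the terminal state itself, so no manual None check on get_state() is needed.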
If your problem is that self.game.new_episode() doesn't reset your episode then this is unexpected, but I would need a code sample to run to see what is happening.
That's what seems to be happening. I tried several times and it fails at different steps, so it looks random to me.
Here's the full code:
import vizdoom as vzd
import numpy as np
import cv2
import os
from vizdoom import *
from gymnasium import Env
from gymnasium.spaces import Discrete, Box
from stable_baselines3.common.callbacks import CallbackList, EvalCallback, ProgressBarCallback, CheckpointCallback
from stable_baselines3 import A2C

DEFAULT_CONFIG = "/home/joaquin/TFM/Doom_RL/scenarios/deadly_corridor.cfg"
SCENARIO_PATH = '/home/joaquin/TFM/Doom_RL/scenarios_official/deadly_corridor.wad'
CHECKPOINT_DIR = './train/train_deadly_corridor'
LOG_DIR = './logs/log_deadly_corridor'
render = False # True will show the window while training, False don't but will make the training faster

class VizDoomGym(Env):
    # Function that is called when we start the env
    def __init__(self, render=render):
        # Inherit from Env
        super().__init__()
        # Setup the game
        self.game = vzd.DoomGame()
        self.game.load_config(DEFAULT_CONFIG)
        self.game.set_doom_scenario_path(SCENARIO_PATH)
        self.game.set_doom_game_path("/home/joaquin/TFM/Doom_RL/DOOM2.WAD")
        self.game.set_render_hud(False)
        self.game.set_screen_resolution(vzd.ScreenResolution.RES_640X480)
        # self.game.set_screen_resolution(vzd.ScreenResolution.RES_160X120)
        # Set cv2 friendly format.
        # self.game.set_screen_format(vzd.ScreenFormat.BGR24)
        # Enables labeling of the in game objects.
        self.game.set_labels_buffer_enabled(True)
        # Enables depth buffer (turned off by default).
        self.game.set_depth_buffer_enabled(True)
        # Render frame logic
        if render == False:
            self.game.set_window_visible(False)
        else:
            self.game.set_window_visible(True)
        self.game.clear_available_game_variables()
        self.game.set_available_game_variables([
            vzd.GameVariable.AMMO0,
            vzd.GameVariable.AMMO1,
            vzd.GameVariable.AMMO2,
            vzd.GameVariable.AMMO3,
            vzd.GameVariable.AMMO4,
            vzd.GameVariable.AMMO5,
            vzd.GameVariable.AMMO6,
            vzd.GameVariable.AMMO7,
            vzd.GameVariable.AMMO8,
            vzd.GameVariable.AMMO9,
            vzd.GameVariable.ARMOR,
            vzd.GameVariable.HEALTH,
            vzd.GameVariable.POSITION_X,
            vzd.GameVariable.POSITION_Y,
            vzd.GameVariable.POSITION_Z,
            vzd.GameVariable.SELECTED_WEAPON,
            vzd.GameVariable.SELECTED_WEAPON_AMMO,
            vzd.GameVariable.WEAPON0,
            vzd.GameVariable.WEAPON1,
            vzd.GameVariable.WEAPON2,
            vzd.GameVariable.WEAPON3,
            vzd.GameVariable.WEAPON4,
            vzd.GameVariable.WEAPON5,
            vzd.GameVariable.WEAPON6,
            vzd.GameVariable.WEAPON7,
            vzd.GameVariable.WEAPON8,
            vzd.GameVariable.WEAPON9,
            vzd.GameVariable.DAMAGE_TAKEN,
            vzd.GameVariable.HITCOUNT
        ])
        # Start the game
        self.game.init()
        # Get game variables:
        self.damage_taken = 0
        self.hitcount = 0
        self.ammo = 52
        # Create the action space and observation space
        self.observation_space = Box(low=0, high=255, shape=(160,120,1), dtype=np.uint8)
        self.action_space = Discrete(14)

    # This is how we take a step in the environment
    def step(self, action):
        # Specify action and take step
        actions = np.identity(14)
        action_reward = self.game.make_action(actions[action], 4)
        # Get all the other stuff we need to return
        if self.game.get_state():
            state = self.game.get_state().screen_buffer
            state = self.grayscale(state)
            ammo0 = self.game.get_state().game_variables[0]
            ammo1 = self.game.get_state().game_variables[1]
            ammo2 = self.game.get_state().game_variables[2]
            ammo3 = self.game.get_state().game_variables[3]
            ammo4 = self.game.get_state().game_variables[4]
            ammo5 = self.game.get_state().game_variables[5]
            ammo6 = self.game.get_state().game_variables[6]
            ammo7 = self.game.get_state().game_variables[7]
            ammo8 = self.game.get_state().game_variables[8]
            ammo9 = self.game.get_state().game_variables[9]
            armor = self.game.get_state().game_variables[10]
            health = self.game.get_state().game_variables[11]
            pos_x = self.game.get_state().game_variables[12]
            pos_y = self.game.get_state().game_variables[13]
            pos_z = self.game.get_state().game_variables[14]
            selected_weapon = self.game.get_state().game_variables[15]
            selected_weapon_ammo = self.game.get_state().game_variables[16]
            weapon0 = self.game.get_state().game_variables[17]
            weapon1 = self.game.get_state().game_variables[18]
            weapon2 = self.game.get_state().game_variables[19]
            weapon3 = self.game.get_state().game_variables[20]
            weapon4 = self.game.get_state().game_variables[21]
            weapon5 = self.game.get_state().game_variables[22]
            weapon6 = self.game.get_state().game_variables[23]
            weapon7 = self.game.get_state().game_variables[24]
            weapon8 = self.game.get_state().game_variables[25]
            weapon9 = self.game.get_state().game_variables[26]
            damage_taken = self.game.get_state().game_variables[27]
            hitcount = self.game.get_state().game_variables[28]
            info = {"ammo0":ammo0, "ammo1":ammo1, "ammo2":ammo2, "ammo3":ammo3,
                    "ammo4":ammo4,"ammo5":ammo5,"ammo6":ammo6,"ammo7":ammo7, "ammo8":ammo8,
                    "ammo9":ammo9, "armor":armor, "health":health, "pos_x":pos_x,
                    "pos_y":pos_y, "pos_z":pos_z, "selected_weapon":selected_weapon,
                    "selected_weapon_ammo":selected_weapon_ammo, "weapon0":weapon0,
                    "weapon1":weapon1,"weapon2":weapon2,"weapon3":weapon3,
                    "weapon4":weapon4,"weapon5":weapon5,"weapon6":weapon6,
                    "weapon7":weapon7,"weapon8":weapon8,"weapon9":weapon9,
                    'damage_taken':damage_taken, 'hitcount':hitcount}
            # Calculate rewards:
            total_damage_taken = -damage_taken + self.damage_taken
            self.damage_taken = total_damage_taken
            total_hitcount = hitcount - self.hitcount
            total_ammo = ammo0 + ammo1 + ammo2 + ammo3 + ammo4 + ammo5 + ammo6 + ammo7 + ammo8 + ammo9 - self.ammo
            self.ammo = total_ammo
            reward = action_reward + total_damage_taken*10 + total_hitcount*200 + total_ammo*5
            truncated = False
        else:
            state = np.zeros(self.observation_space.shape)
            info = 0
            reward = 0
            truncated = True
        info = {"info":info}
        done = self.game.is_episode_finished()
        return state, reward, done, truncated, info

    # Define how to render the game or environment
    def render():
        pass

    # What happens when we start a new game
    def reset(self):
        self.game.new_episode()
        state = self.game.get_state().screen_buffer
        info = 0
        info = {"info":info}
        return self.grayscale(state), info

    # Grayscale the game frame and resize it
    def grayscale(self, observation):
        gray = cv2.cvtColor(np.moveaxis(observation, 0, -1), cv2.COLOR_BGR2GRAY)
        resize = cv2.resize(gray, (160,120), interpolation=cv2.INTER_CUBIC)
        state = np.reshape(resize, (160,120,1))
        return state

    # Call to close down the game
    def close(self):
        self.game.close()

# ENVIROMENT CHECK:
# env = VizDoomGym(render=True)
# state = env.reset()
# env_checker.check_env(env)

# TRAIN MODEL
env = VizDoomGym()
checkpoint_callback = CheckpointCallback(save_freq=50000, save_path=CHECKPOINT_DIR,
                                         save_replay_buffer=True, save_vecnormalize=True)
eval_callback = EvalCallback(env, best_model_save_path=CHECKPOINT_DIR, log_path=LOG_DIR,
                             eval_freq=50000, deterministic=False, render=True, verbose=1)
callback = CallbackList([checkpoint_callback, eval_callback])
model = A2C('CnnPolicy', env, tensorboard_log=LOG_DIR, verbose=1, learning_rate=0.0001, n_steps=8192)
# model = A2C.load('/home/joaquin/TFM/Doom_RL/train/train_basic/best_model_1800000', env)
model.learn(total_timesteps=3000000, callback=callback, progress_bar=True)
model.save('vizdoom_A2C')
env.close()
How often does it happen? I'm running your code using Stable-Baselines3 2.0.0a5 alpha (one with Gymnasium support), installed in the following way:
pip install "sb3_contrib>=2.0.0a1" --upgrade
pip install "stable_baselines3>=2.0.0a1" --upgrade
and I don't see any problem with the reset method after 200k timesteps. I'm afraid I will need more details to help you: details about your environment and detailed instructions on how to reproduce the problem (and how it occurs).
@MetallicaSPA, I replicated your environment and ran a slightly modified script (I attached the modified version below). I only changed the paths to the config/log/model files. After 3 million timesteps, no error. I checked the deathmatch and deadly corridor environments.
So at the moment, I think the reason might be that your .cfg or .wad files were somehow modified and, for example, now allow the agent to be killed before the episode starts. This is, for example, possible if the episode's start_time in the config is set to a large number. If you are sure that your .cfg/.wad files were not modified, then I will need to ask you to prepare a docker file that I can run to replicate the problem.
import vizdoom as vzd
import numpy as np
import cv2
import os
from vizdoom import *
from gymnasium import Env
from gymnasium.spaces import Discrete, Box
from stable_baselines3.common.callbacks import CallbackList, EvalCallback, ProgressBarCallback, CheckpointCallback
from stable_baselines3 import A2C

SCENARIO = "deadly_corridor"
DEFAULT_CONFIG = os.path.join(scenarios_path, f"{SCENARIO}.cfg")
CHECKPOINT_DIR = f'./vizdoom_train/train_{SCENARIO}'
LOG_DIR = f'./vizdoom_logs/log_{SCENARIO}'
render = False # True will show the window while training, False don't but will make the training faster

class VizDoomGym(Env):
    # Function that is called when we start the env
    def __init__(self, render=render):
        # Inherit from Env
        super().__init__()
        # Setup the game
        self.game = vzd.DoomGame()
        self.game.load_config(DEFAULT_CONFIG)
        self.game.set_doom_game_path("doom2.wad")
        self.game.set_render_hud(False)
        #self.game.set_screen_resolution(vzd.ScreenResolution.RES_640X480)
        self.game.set_screen_resolution(vzd.ScreenResolution.RES_160X120)
        # Set cv2 friendly format.
        # self.game.set_screen_format(vzd.ScreenFormat.BGR24)
        # Enables labeling of the in game objects.
        self.game.set_labels_buffer_enabled(True)
        # Enables depth buffer (turned off by default).
        self.game.set_depth_buffer_enabled(True)
        # Render frame logic
        if render == False:
            self.game.set_window_visible(False)
        else:
            self.game.set_window_visible(True)
        self.game.clear_available_game_variables()
        self.game.set_available_game_variables([
            vzd.GameVariable.AMMO0,
            vzd.GameVariable.AMMO1,
            vzd.GameVariable.AMMO2,
            vzd.GameVariable.AMMO3,
            vzd.GameVariable.AMMO4,
            vzd.GameVariable.AMMO5,
            vzd.GameVariable.AMMO6,
            vzd.GameVariable.AMMO7,
            vzd.GameVariable.AMMO8,
            vzd.GameVariable.AMMO9,
            vzd.GameVariable.ARMOR,
            vzd.GameVariable.HEALTH,
            vzd.GameVariable.POSITION_X,
            vzd.GameVariable.POSITION_Y,
            vzd.GameVariable.POSITION_Z,
            vzd.GameVariable.SELECTED_WEAPON,
            vzd.GameVariable.SELECTED_WEAPON_AMMO,
            vzd.GameVariable.WEAPON0,
            vzd.GameVariable.WEAPON1,
            vzd.GameVariable.WEAPON2,
            vzd.GameVariable.WEAPON3,
            vzd.GameVariable.WEAPON4,
            vzd.GameVariable.WEAPON5,
            vzd.GameVariable.WEAPON6,
            vzd.GameVariable.WEAPON7,
            vzd.GameVariable.WEAPON8,
            vzd.GameVariable.WEAPON9,
            vzd.GameVariable.DAMAGE_TAKEN,
            vzd.GameVariable.HITCOUNT
        ])
        # Start the game
        self.game.init()
        # Get game variables:
        self.damage_taken = 0
        self.hitcount = 0
        self.ammo = 52
        # Create the action space and observation space
        self.observation_space = Box(low=0, high=255, shape=(160,120,1), dtype=np.uint8)
        self.action_space = Discrete(14)

    # This is how we take a step in the environment
    def step(self, action):
        # Specify action and take step
        actions = np.identity(14)
        action_reward = self.game.make_action(actions[action], 4)
        # Get all the other stuff we need to return
        if self.game.get_state():
            state = self.game.get_state().screen_buffer
            state = self.grayscale(state)
            ammo0 = self.game.get_state().game_variables[0]
            ammo1 = self.game.get_state().game_variables[1]
            ammo2 = self.game.get_state().game_variables[2]
            ammo3 = self.game.get_state().game_variables[3]
            ammo4 = self.game.get_state().game_variables[4]
            ammo5 = self.game.get_state().game_variables[5]
            ammo6 = self.game.get_state().game_variables[6]
            ammo7 = self.game.get_state().game_variables[7]
            ammo8 = self.game.get_state().game_variables[8]
            ammo9 = self.game.get_state().game_variables[9]
            armor = self.game.get_state().game_variables[10]
            health = self.game.get_state().game_variables[11]
            pos_x = self.game.get_state().game_variables[12]
            pos_y = self.game.get_state().game_variables[13]
            pos_z = self.game.get_state().game_variables[14]
            selected_weapon = self.game.get_state().game_variables[15]
            selected_weapon_ammo = self.game.get_state().game_variables[16]
            weapon0 = self.game.get_state().game_variables[17]
            weapon1 = self.game.get_state().game_variables[18]
            weapon2 = self.game.get_state().game_variables[19]
            weapon3 = self.game.get_state().game_variables[20]
            weapon4 = self.game.get_state().game_variables[21]
            weapon5 = self.game.get_state().game_variables[22]
            weapon6 = self.game.get_state().game_variables[23]
            weapon7 = self.game.get_state().game_variables[24]
            weapon8 = self.game.get_state().game_variables[25]
            weapon9 = self.game.get_state().game_variables[26]
            damage_taken = self.game.get_state().game_variables[27]
            hitcount = self.game.get_state().game_variables[28]
            info = {"ammo0":ammo0, "ammo1":ammo1, "ammo2":ammo2, "ammo3":ammo3,
                    "ammo4":ammo4,"ammo5":ammo5,"ammo6":ammo6,"ammo7":ammo7, "ammo8":ammo8,
                    "ammo9":ammo9, "armor":armor, "health":health, "pos_x":pos_x,
                    "pos_y":pos_y, "pos_z":pos_z, "selected_weapon":selected_weapon,
                    "selected_weapon_ammo":selected_weapon_ammo, "weapon0":weapon0,
                    "weapon1":weapon1,"weapon2":weapon2,"weapon3":weapon3,
                    "weapon4":weapon4,"weapon5":weapon5,"weapon6":weapon6,
                    "weapon7":weapon7,"weapon8":weapon8,"weapon9":weapon9,
                    'damage_taken':damage_taken, 'hitcount':hitcount}
            # Calculate rewards:
            total_damage_taken = -damage_taken + self.damage_taken
            self.damage_taken = total_damage_taken
            total_hitcount = hitcount - self.hitcount
            total_ammo = ammo0 + ammo1 + ammo2 + ammo3 + ammo4 + ammo5 + ammo6 + ammo7 + ammo8 + ammo9 - self.ammo
            self.ammo = total_ammo
            reward = action_reward + total_damage_taken*10 + total_hitcount*200 + total_ammo*5
            truncated = False
        else:
            state = np.zeros(self.observation_space.shape)
            info = 0
            reward = 0
            truncated = True
        info = {"info":info}
        done = self.game.is_episode_finished()
        return state, reward, done, truncated, info

    # Define how to render the game or environment
    def render():
        pass

    # What happens when we start a new game
    def reset(self):
        self.game.new_episode()
        state = self.game.get_state().screen_buffer
        info = 0
        info = {"info":info}
        #print("Reseting!")
        return self.grayscale(state), info

    # Grayscale the game frame and resize it
    def grayscale(self, observation):
        gray = cv2.cvtColor(np.moveaxis(observation, 0, -1), cv2.COLOR_BGR2GRAY)
        resize = cv2.resize(gray, (160,120), interpolation=cv2.INTER_CUBIC)
        state = np.reshape(resize, (160,120,1))
        return state

    # Call to close down the game
    def close(self):
        self.game.close()

# ENVIROMENT CHECK:
# env = VizDoomGym(render=True)
# state = env.reset()
# env_checker.check_env(env)

# TRAIN MODEL
env = VizDoomGym(render=True)
checkpoint_callback = CheckpointCallback(save_freq=50000, save_path=CHECKPOINT_DIR,
                                         save_replay_buffer=True, save_vecnormalize=True)
eval_callback = EvalCallback(env, best_model_save_path=CHECKPOINT_DIR, log_path=LOG_DIR,
                             eval_freq=50000, deterministic=False, render=True, verbose=1)
callback = CallbackList([checkpoint_callback, eval_callback])
model = A2C('CnnPolicy', env, verbose=1, learning_rate=0.0001, n_steps=8192)
# model = A2C.load('/home/joaquin/TFM/Doom_RL/train/train_basic/best_model_1800000', env)
model.learn(total_timesteps=3000000, callback=callback, progress_bar=True)
model.save('vizdoom_A2C')
env.close()
Happy that we've figured this out! :)