kevslinger / dtqn

Deep Transformer Q-Networks for Partially Observable Reinforcement Learning

License: MIT License

Python 100.00%

dtqn's Introduction

Hi there 👋, I'm Kevin Esslinger! But you can call me Kev

Want to chat? Feel free to send me a message 📬, a follow 🐦, an invitation to connect 📥, or an email 📧! (Or all of the above 😁)

Bio

👨‍💻 I'm working as a Graduate Software Developer at Flow Traders in Amsterdam, The Netherlands.
🎓 Master of Science in Computer Science from Northeastern University in the Khoury College of Computer Sciences in Boston, Massachusetts. Graduated December 2022. Co-advised by Chris Amato and Robert Platt, worked on using transformers to solve challenging, partially observable tasks with reinforcement learning
🎓 Bachelor of Science in Computer Science and Mathematics with a minor in Data Science from Temple University in Philadelphia, Pennsylvania. Graduated Summa Cum Laude in May 2020
🦅 Achieved the rank of Eagle Scout

My favourite technologies:

Programming languages:
Text editors:
Machine learning libraries:
Operating system:

News

📓 My first paper, Deep Transformer Q-Networks for Partially Observable Reinforcement Learning, was published at the NeurIPS 2022 Workshop on Foundation Models for Decision Making! It's publicly available here on arXiv. The code for the paper is available at my DTQN repo

More about me

🚴‍♂️ Casual city biker
🎮 Teamfight Tactics and Magic: the Gathering player
🏀 76ers fan
🏈 Ravens and Eagles fan
☕ Latte art amateur
📜 Check out my resume here

dtqn's People

Contributors

Stargazers

Watchers

Forkers

timckai mahyardana lyu-xg mhahn0106 marisgg ibagur eejuncao richardjozsa sci-i toughstyle darcstar-solutions-tech hust1booze alireza-ebrahimi-ai dianabessie

dtqn's Issues

Getting an error stating environment D does not exist

I tried to run the basic version of the code (python run.py) without installing any of the additional packages, and got this error:

WARNING: ``gym_gridverse`` is not installed. This means you cannot run an experiment with the `gv_*` domains.
WARNING: ``gym_gridverse`` is not installed. This means you cannot run an experiment with the gv_*.yaml domains.
WARNING: ``gym_pomdps`` is not installed. This means you cannot run an experiment with the HeavenHell or Hallway domain. 
WARNING: ``mini_hack`` is not installed. This means you cannot run an experiment with any of the MH- domains.
Loading using gym.make
Environment with id D not found.
Loading using YAML
Traceback (most recent call last):
  File "/...../DTQN-main/utils/env_processing.py", line 34, in make_env
    env = gym.make(id_or_path)
  File "/......./lib/python3.10/site-packages/gym/envs/registration.py", line 569, in make
    _check_version_exists(ns, name, version)
  File "/......./lib/python3.10/site-packages/gym/envs/registration.py", line 219, in _check_version_exists
    _check_name_exists(ns, name)
  File "/....../lib/python3.10/site-packages/gym/envs/registration.py", line 197, in _check_name_exists
    raise error.NameNotFound(
gym.error.NameNotFound: Environment D doesn't exist. 

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/....../DTQN-main/run.py", line 533, in <module>
    run_experiment(get_args())
  File "/......./DTQN-main/run.py", line 415, in run_experiment
    envs.append(env_processing.make_env(env_str))
  File "/....../DTQN-main/utils/env_processing.py", line 39, in make_env
    inner_env = factory_env_from_yaml(
NameError: name 'factory_env_from_yaml' is not defined

I then installed gridverse and ran python3 run.py --envs DiscreteCarFlags-v0 --device mps, and it ran successfully.
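
The NameError in the second traceback appears because `factory_env_from_yaml` is only defined when `gym_gridverse` imports successfully. A minimal sketch of a more readable failure mode follows. The import path is taken from the traceback in a later issue on this page; the helper `load_yaml_env` and its `factory` parameter are hypothetical and are not part of the repository.

```python
# Optional dependency: only needed for the gv_*.yaml domains.
try:
    from gym_gridverse.envs.yaml.factory import factory_env_from_yaml
except ImportError:
    # Keep the name defined so later code can check for it explicitly.
    factory_env_from_yaml = None


def load_yaml_env(path, factory=factory_env_from_yaml):
    """Hypothetical helper: load a YAML-defined environment, or explain why not.

    `factory` defaults to the gym_gridverse loader when it is installed;
    passing it explicitly makes the function easy to test.
    """
    if factory is None:
        raise ImportError(
            f"Cannot load {path!r} from YAML: gym_gridverse is not installed."
        )
    return factory(path)
```

With a guard like this, a missing package surfaces as an ImportError that names the package, rather than as a NameError deep inside `make_env`.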

I'm trying to connect DTQN network with SUMO. I want to know inputs of DTQN.

Hello again.
As mentioned in the title, I'm trying to connect the DTQN network to SUMO as an environment, and I want to know what inputs DTQN expects.
From reading the paper, I think the inputs are observations,
but I don't know their format.
I'm struggling to find the input format of DTQN.
Which Python file should I check? Could you point me to its location, please?
Furthermore, I'm trying to use sequences of numbers as states, e.g. the relative distance between the agent and other cars, the agent's velocity, and the current lane number.
I wonder whether the DTQN architecture can handle this kind of state as an observation,
because DQN's inputs seem to be sequences of images. Are DTQN's inputs sequences of images too?
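
To make the question concrete, here is a minimal sketch of how per-step vector observations (relative distance, velocity, lane number) could be collected into a fixed-length history. It assumes the agent consumes a window of recent observations; the class name `ObservationHistory`, the zero padding, and the window length are all hypothetical and are not taken from the repository, so the actual expected format should be checked in the code.

```python
from collections import deque


class ObservationHistory:
    """Hypothetical rolling window over the most recent vector observations."""

    def __init__(self, context_len, obs_dim, pad_value=0.0):
        self.context_len = context_len
        self.obs_dim = obs_dim
        self._pad = [pad_value] * obs_dim
        # deque with maxlen drops the oldest observation automatically.
        self._buffer = deque(maxlen=context_len)

    def reset(self):
        """Clear the window at the start of an episode."""
        self._buffer.clear()

    def append(self, obs):
        """Add one observation vector, e.g. [rel_distance, velocity, lane]."""
        if len(obs) != self.obs_dim:
            raise ValueError(f"expected {self.obs_dim} values, got {len(obs)}")
        self._buffer.append([float(x) for x in obs])

    def as_sequence(self):
        """Return context_len rows: real observations first, then padding."""
        missing = self.context_len - len(self._buffer)
        return list(self._buffer) + [list(self._pad) for _ in range(missing)]


history = ObservationHistory(context_len=3, obs_dim=3)
history.append([12.5, 8.0, 1])  # relative distance, velocity, lane number
window = history.as_sequence()
```

Each row of `window` is one time step, so the result has shape (context_len, obs_dim) regardless of how many steps have elapsed in the episode.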

Error run.py

(DTQN) mds@mds:~/DTQN$ python -u "/home/mds/DTQN/run.py"
Loading using gym.make
Environment with id D not found.
Loading using YAML
Traceback (most recent call last):
  File "/home/mds/DTQN/utils/env_processing.py", line 34, in make_env
    env = gym.make(id_or_path)
  File "/home/mds/anaconda3/envs/DTQN/lib/python3.8/site-packages/gym/envs/registration.py", line 142, in make
    return registry.make(id, **kwargs)
  File "/home/mds/anaconda3/envs/DTQN/lib/python3.8/site-packages/gym/envs/registration.py", line 86, in make
    spec = self.spec(path)
  File "/home/mds/anaconda3/envs/DTQN/lib/python3.8/site-packages/gym/envs/registration.py", line 115, in spec
    raise error.Error('Attempted to look up malformed environment ID: {}. (Currently all IDs must be of the form {}.)'.format(id.encode('utf-8'), env_id_re.pattern))
gym.error.Error: Attempted to look up malformed environment ID: b'D'. (Currently all IDs must be of the form ^(?:[\w:-]+/)?([\w:.-]+)-v(\d+)$.)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/mds/DTQN/run.py", line 533, in <module>
    run_experiment(get_args())
  File "/home/mds/DTQN/run.py", line 415, in run_experiment
    envs.append(env_processing.make_env(env_str))
  File "/home/mds/DTQN/utils/env_processing.py", line 39, in make_env
    inner_env = factory_env_from_yaml(
  File "/home/mds/anaconda3/envs/DTQN/lib/python3.8/site-packages/gym_gridverse/envs/yaml/factory.py", line 243, in factory_env_from_yaml
    with open(path) as f:
FileNotFoundError: [Errno 2] No such file or directory: '/home/mds/DTQN/envs/gridverse/D'

When I executed run.py, I got the error above. I think I installed everything correctly, but I might have installed it in the wrong path. I don't understand where envs/gridverse/D comes from.
Please help me run run.py.
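
One plausible explanation for the stray `D`, inferred only from the tracebacks on this page and not confirmed against the repository: if `--envs` accepts several values but its default is a plain string, looping over the default yields single characters, and the first character of `DiscreteCarFlags-v0` is `D`. The sketch below reproduces that behaviour with `argparse`. The parser definition is hypothetical; only the flag name `--envs` and the environment ID appear in the issues above.

```python
import argparse


def build_parser(default):
    """Hypothetical parser with a multi-valued --envs flag."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--envs", type=str, nargs="+", default=default)
    return parser


# Default given as a string: args.envs stays a str when the flag is omitted.
buggy = build_parser("DiscreteCarFlags-v0").parse_args([])
# Default given as a list: args.envs is a list either way.
fixed = build_parser(["DiscreteCarFlags-v0"]).parse_args([])
# Passing the flag explicitly always produces a list.
explicit = build_parser("DiscreteCarFlags-v0").parse_args(
    ["--envs", "DiscreteCarFlags-v0"]
)

first_buggy = next(iter(buggy.envs))        # a single character
first_fixed = next(iter(fixed.envs))        # the full environment ID
first_explicit = next(iter(explicit.envs))  # the full environment ID
```

If this is indeed the cause, passing the environment explicitly, as in `python run.py --envs DiscreteCarFlags-v0`, sidesteps the problem, which matches the workaround reported in the first issue.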

Question about transformer with DQN

Hi Kev,
I'm glad to have come across your work on DTQN.
I'm curious why there is so little work combining Transformers with DQN, even though both techniques have been around for quite a while.
I expected there to be a lot of work in this area, but there isn't.
As far as I know, there may be just one paper before yours, 'TRANSFORMER BASED REINFORCEMENT LEARNING FOR GAMES', and in that paper DTQN does not perform as well as DRQN.
Do you have any insight into this?

Reproduction of GridVerse results

Hi!

Thanks for sharing your intriguing ideas on how to set up a transformer-based memory DRL algorithm. I'm interested in how the interface of the transformer works during inference and optimization, so I started out by simply trying to reproduce your GridVerse results as stated in your readme.

I'm currently running 3 repetitions of this experiment:

python run.py --env gv_memory.7x7.yaml --inembed 128 --disable-wandb --verbose

The success rate stays zero for the entire training so far.

[ December 15, 14:00:03 ] Training Steps: 699000, Success Rate: 0.00, Return: -25.00, Episode Length: 500.00, Hours: 3.77
[ December 15, 14:00:23 ] Training Steps: 700000, Success Rate: 0.00, Return: -25.00, Episode Length: 500.00, Hours: 3.78
[ December 15, 14:00:42 ] Training Steps: 701000, Success Rate: 0.00, Return: -25.00, Episode Length: 500.00, Hours: 3.78
[ December 15, 14:01:02 ] Training Steps: 702000, Success Rate: 0.00, Return: -25.00, Episode Length: 500.00, Hours: 3.79

I'm pretty sure I missed something. It would be great if you could help.

edit:
Training on 5x5 looks pretty volatile in comparison to the reported results.

[ December 15, 14:10:52 ] Training Steps: 891000, Success Rate: 0.30, Return: -2.24, Episode Length: 4.90, Hours: 3.92
[ December 15, 14:11:04 ] Training Steps: 892000, Success Rate: 0.70, Return: 1.78, Episode Length: 4.40, Hours: 3.92
[ December 15, 14:11:17 ] Training Steps: 893000, Success Rate: 0.40, Return: -1.20, Episode Length: 4.00, Hours: 3.92
[ December 15, 14:11:30 ] Training Steps: 894000, Success Rate: 0.70, Return: -0.42, Episode Length: 48.40, Hours: 3.93
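
To put a number on "volatile", the log lines above can be parsed and summarized. This is a minimal sketch that assumes the log format shown in this issue stays fixed; the function names and the choice of population standard deviation are illustrative, not part of the repository.

```python
import re
from statistics import mean, pstdev

# Matches the metric fields of one log line; the timestamp prefix is ignored.
LOG_PATTERN = re.compile(
    r"Training Steps: (?P<steps>\d+), "
    r"Success Rate: (?P<success>-?\d+\.\d+), "
    r"Return: (?P<ret>-?\d+\.\d+), "
    r"Episode Length: (?P<length>-?\d+\.\d+)"
)


def parse_log(lines):
    """Extract the metrics from every line that matches the log format."""
    rows = []
    for line in lines:
        match = LOG_PATTERN.search(line)
        if match:
            rows.append({
                "steps": int(match["steps"]),
                "success": float(match["success"]),
                "return": float(match["ret"]),
                "length": float(match["length"]),
            })
    return rows


def summarize(rows, key="success"):
    """Return (mean, population standard deviation) of one metric."""
    values = [row[key] for row in rows]
    return mean(values), pstdev(values)


sample = [
    "[ December 15, 14:10:52 ] Training Steps: 891000, Success Rate: 0.30, Return: -2.24, Episode Length: 4.90, Hours: 3.92",
    "[ December 15, 14:11:04 ] Training Steps: 892000, Success Rate: 0.70, Return: 1.78, Episode Length: 4.40, Hours: 3.92",
    "[ December 15, 14:11:17 ] Training Steps: 893000, Success Rate: 0.40, Return: -1.20, Episode Length: 4.00, Hours: 3.92",
    "[ December 15, 14:11:30 ] Training Steps: 894000, Success Rate: 0.70, Return: -0.42, Episode Length: 48.40, Hours: 3.93",
]
rows = parse_log(sample)
avg_success, spread_success = summarize(rows)
```

Running the same summary over a longer stretch of the log, or over several seeds, would give a fairer comparison with the reported curves than four consecutive evaluations.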
