awjuliani / neuro-nav
A library for neuroscience-inspired navigation and decision making research.
License: Other
As of right now, neuro-nav only includes tabular agent policies. This restricts observation spaces to simple index-based ones, which prevents agents from learning anything generalizable about their environments from contextual information.
I currently have a work-in-progress branch https://github.com/awjuliani/neuro-nav/tree/dev-linear-policies, which includes changes to a subset of the agent algorithms to support linear policies where applicable. Further steps would be to support gradient-based learning in addition to TD updates.
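As an illustration of the direction (not the actual code on the dev-linear-policies branch), a linear policy replaces the table lookup with Q(s, a) = w_a · φ(s) while keeping the same TD update; all names below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, n_actions = 4, 2
w = np.zeros((n_actions, n_features))  # one weight vector per action


def q_values(phi):
    """Q(s, .) for a feature vector phi(s), instead of a table row."""
    return w @ phi


def td_update(phi, a, r, phi_next, alpha=0.1, gamma=0.99, done=False):
    """One TD(0) update on the weights for the taken action."""
    target = r if done else r + gamma * np.max(q_values(phi_next))
    td_error = target - q_values(phi)[a]
    w[a] += alpha * td_error * phi  # gradient of the linear Q w.r.t. w_a is phi
    return td_error


phi = rng.random(n_features)
phi_next = rng.random(n_features)
err = td_update(phi, a=0, r=1.0, phi_next=phi_next)
```

Because the gradient of a linear Q-function with respect to its weights is just φ(s), this is also the simplest stepping stone toward full gradient-based learning.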
Hi there, is it possible to run the deep agents on a GraphEnv?
Thank you for the awesome package,
q
Hi there!
My task is quite peculiar and has a few requirements. I'm almost where I want to be, but I have a few questions:
I would like rewards to be optional to collect, with collection as its own action (in my task there is a separate action for collecting rewards). How would I go about implementing that?
My task wraps around: if you travel to the rightmost edge of the maze, you should appear at the leftmost. I was able to solve this using portals; do you see a better way?
I would like every action to also move my agent one step north. Any ideas on how to do this? So far I have come up with resetting the environment with the new position one step above the current one, but I want it to happen within the same episode.
I have different types of rewards (differently valued rewards); is this possible?
My task is hexxed
Thank you so much for the incredible package.
Best,
q
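For what it's worth, the wrap-around, optional-collection, and differently-valued-rewards questions can all be sketched outside of neuro-nav itself. The toy class below is entirely hypothetical (not the GridEnv API): it uses modular arithmetic for the wrap instead of portals, a dedicated collect action, and per-location reward values:

```python
class ToyWrapGrid:
    COLLECT = 4  # actions 0-3 move; 4 collects a reward if one is present

    def __init__(self, size=5, rewards=None):
        self.size = size
        self.rewards = dict(rewards or {})  # (x, y) -> reward value
        self.pos = (0, 0)

    def step(self, action):
        x, y = self.pos
        if action == self.COLLECT:
            # rewards are only granted when the agent explicitly collects
            reward = self.rewards.pop(self.pos, 0.0)
            return self.pos, reward
        dx, dy = [(0, -1), (0, 1), (-1, 0), (1, 0)][action]
        # wrap-around: leaving one edge re-enters on the opposite edge
        self.pos = ((x + dx) % self.size, (y + dy) % self.size)
        return self.pos, 0.0


env = ToyWrapGrid(size=5, rewards={(0, 0): 2.0})
pos, r = env.step(2)                       # moving left from x=0 wraps to x=4
pos2, r2 = env.step(3)                     # moving right wraps back to x=0
pos3, r3 = env.step(ToyWrapGrid.COLLECT)   # collects the 2.0 reward at (0, 0)
```

The always-move-north requirement could be handled the same way, by applying an extra fixed offset inside step after the chosen move, rather than resetting the environment.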
I am interested in implementing an environment wherein grid locations can be "colored", i.e. act as observable information for the agent: for instance, a red square might signal context A and a blue square context B, where A and B are different reward contingencies. This does not currently seem trivially possible, but I was wondering whether it could be done with some work. If so, I wouldn't mind taking a crack at it if I could be pointed in the right direction.
Thanks for this awesome library!
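One way to prototype the idea before touching neuro-nav internals is to keep a per-cell color map and append a one-hot color code to the observation. The sketch below is standalone numpy; all names are hypothetical:

```python
import numpy as np

color_map = np.zeros((5, 5), dtype=int)  # 0 = uncolored
color_map[0, :] = 1                      # "red" row -> context A
color_map[4, :] = 2                      # "blue" row -> context B


def observe(pos, n_colors=3):
    """Flat state index plus a one-hot color code for the current cell."""
    x, y = pos
    idx = x * color_map.shape[1] + y
    color = np.eye(n_colors)[color_map[x, y]]
    return idx, color


idx, color = observe((0, 2))  # a cell in the "red" context-A row
```

With a tabular agent, the color can alternatively be folded into the state index itself (index + color × n_states), which keeps the rest of the learning code unchanged.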
Hi, thanks for putting together this cool package! I seem to have found a minor issue. I've set enable_noop=True on a GridEnv, but when I call env.step(4), the following error is thrown:
---------------------------------------------------------------------------
UnboundLocalError                         Traceback (most recent call last)
Cell In[20], line 3
      1 obs, reward, done, _ = env.step(4)
      2 print(f"Reward: {reward}", f"Done: {done}")
----> 3 env.render()

File python3.8/site-packages/neuronav/envs/grid_env.py:292, in GridEnv.render(self, provide)
    288 def render(self, provide=False):
    289     """
    290     Renders the environment in a pyplot window.
    291     """
--> 292     image = self.make_visual_obs()
    293     if self.obs_mode == GridObservation.rendered_3d:
    294         img_first = self.renderer.render_frame(self)

File ~/python3.8/site-packages/neuronav/envs/grid_env.py:486, in GridEnv.make_visual_obs(self, resize)
    477     elif agent_dir == 1:
    478         # facing right
    479         pts = np.array(
    480             [
    481                 (x_offset, y_offset),
    (...)
    484             ]
    485         )
--> 486 cv.fillConvexPoly(img, pts, agent_color)
    487 if resize:
    488     img = cv.resize(img, (128, 128))

UnboundLocalError: local variable 'pts' referenced before assignment
It seems to be caused by the fact that agent_dir=4 is not supported by the current code. I think this is likely a simple fix, but I figured it might be worthy of attention. Thank you!
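A minimal reproduction of the failure mode, with one possible defensive fix (hypothetical code, not the actual GridEnv.make_visual_obs implementation): when agent_dir takes a value the if/elif chain doesn't cover, pts is never assigned, so the later draw call raises UnboundLocalError.

```python
import numpy as np


def agent_triangle(agent_dir):
    pts = None  # initialize so an unhandled direction can't crash later code
    if agent_dir == 0:    # facing up
        pts = np.array([(0, 0), (1, 0), (0.5, -1)])
    elif agent_dir == 1:  # facing right
        pts = np.array([(0, 0), (0, 1), (1, 0.5)])
    elif agent_dir == 2:  # facing down
        pts = np.array([(0, 0), (1, 0), (0.5, 1)])
    elif agent_dir == 3:  # facing left
        pts = np.array([(1, 0), (1, 1), (0, 0.5)])
    if pts is None:
        # no-op / unknown direction: skip drawing the heading marker
        # (or fall back to the agent's last valid heading)
        return None
    return pts


marker = agent_triangle(4)  # returns None instead of raising
```

Equivalently, the render path could map the no-op action back to the agent's previous direction before the if/elif chain, so the marker is always drawn.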
Hi there, great repo! What's the easiest way to adapt your code to support something fully POMDP, i.e. imagine a mouse in the dark with only whisker information? The closest agent observation types to this seem to be the "window" and "boundary" observations. For "window" I would like to shrink the window to 3x3, i.e. only the cells adjacent to the agent; for "boundary", I would like the cardinal rays to extend only one square/cell away from the agent. Presumably (hopefully) these two implementations would give identical results. Would it be possible for you to implement this, or to provide some instructions? Both approaches appear to require hacking the openai gym spaces.Box code at a minimum, which seems painful.
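One way to get a 3x3 egocentric window without touching the observation-space machinery is to pad the full grid and slice around the agent; padding with a wall value keeps the crop in bounds at the edges. The sketch below is standalone numpy, with WALL and local_window as hypothetical names:

```python
import numpy as np

WALL = -1  # assumed sentinel for out-of-bounds cells


def local_window(grid, pos, radius=1):
    """Return the (2*radius+1)^2 patch centred on the agent."""
    padded = np.pad(grid, radius, constant_values=WALL)
    x, y = pos[0] + radius, pos[1] + radius  # shift into padded coordinates
    return padded[x - radius : x + radius + 1, y - radius : y + radius + 1]


grid = np.arange(25).reshape(5, 5)
win = local_window(grid, (0, 0))  # corner: out-of-bounds cells read as WALL
```

The corresponding gym space could then be declared as a spaces.Box of shape (3, 3). Note that one-cell "boundary" rays are just the four off-center entries of this same window, so the two observation styles should indeed carry the same information at radius 1.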
Provide a replication of the main computational results from the paper "A distributional code for value in dopamine-based reinforcement learning".
This will involve adding support for stochastic transitions in the graph environments, as well as implementing a distributional RL algorithm. Given that both of these paradigms are highly relevant to the computational neuroscience community, they are high-priority additions to the library in the near future.
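As one possible starting point for the distributional piece, the categorical projection at the heart of C51-style distributional TD can be sketched in a few lines; the support, names, and parameters below are illustrative, not an existing neuro-nav API:

```python
import numpy as np

atoms = np.linspace(0.0, 10.0, 11)  # fixed support z_0 .. z_10
dz = atoms[1] - atoms[0]


def project(next_probs, reward, gamma=0.9):
    """Project the shifted/shrunk return distribution back onto the support."""
    target = np.zeros_like(next_probs)
    for p, z in zip(next_probs, atoms):
        tz = np.clip(reward + gamma * z, atoms[0], atoms[-1])
        b = (tz - atoms[0]) / dz  # fractional index on the support
        lo, hi = int(np.floor(b)), int(np.ceil(b))
        if lo == hi:
            target[lo] += p
        else:
            # split the probability mass between the two neighbouring atoms
            target[lo] += p * (hi - b)
            target[hi] += p * (b - lo)
    return target


probs = np.zeros(11)
probs[10] = 1.0                        # next state: all mass on z = 10
new = project(probs, reward=1.0)       # mass moves to clip(1 + 0.9 * 10) = 10
```

A tabular distributional agent would hold one such probability vector per state (or state-action pair) and move it toward the projected target with a small learning rate, mirroring the existing TD agents.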
Graph Experiments 1 and 2 are missing renders of the MDPs for the task.