
Comments (7)

seolhokim commented on August 9, 2024

This is already accounted for here:

observation = samples.all_observation[:-1] # [t, t+batch_length+1] -> [t, t+batch_length]
action = samples.all_action[1:] # [t-1, t+batch_length] -> [t, t+batch_length]
reward = samples.all_reward[1:] # [t-1, t+batch_length] -> [t, t+batch_length]

from dreamer-pytorch.

gunnxx commented on August 9, 2024

Hi, shouldn't it be

observation = samples.all_observation[:-1]  # [t, t+batch_length+1] -> [t, t+batch_length] 
action = samples.all_action[:-1]            # [t-1, t+batch_length] -> [t-1, t+batch_length-1] 
reward = samples.all_reward[1:]             # [t-1, t+batch_length] -> [t, t+batch_length] 

so that

self.representation_model(obs_embed[t], action[t], prev_state)

will be $p(s_t | s_{t-1}, a_{t-1})$ for the prior and $p(s_t | s_{t-1}, a_{t-1}, o_t)$ for the posterior.

Current code is computing $p(s_t | s_{t-1}, a_t)$ for the prior and $p(s_t | s_{t-1}, a_t, o_t)$ for the posterior. Did I miss something?


seolhokim commented on August 9, 2024

all_observation holds observations, not states. Check the comments on those lines :)


gunnxx commented on August 9, 2024

Hi, sorry, maybe I was not clear. My question was about how the action is indexed. The code is

        observation = samples.all_observation[:-1]  # [t, t+batch_length+1] -> [t, t+batch_length]
        action = samples.all_action[1:]  # [t-1, t+batch_length] -> [t, t+batch_length]
        reward = samples.all_reward[1:]  # [t-1, t+batch_length] -> [t, t+batch_length]
        reward = reward.unsqueeze(2)
        done = samples.done
        done = done.unsqueeze(2)

        # Extract tensors from the Samples object
        # They all have the batch_t dimension first, but we'll put the batch_b dimension first.
        # Also, we convert all tensors to floats so they can be fed into our models.

        lead_dim, batch_t, batch_b, img_shape = infer_leading_dims(observation, 3)
        # squeeze batch sizes to single batch dimension for imagination roll-out
        batch_size = batch_t * batch_b

        # normalize image
        observation = observation.type(self.type) / 255.0 - 0.5
        # embed the image
        embed = model.observation_encoder(observation)

        prev_state = model.representation.initial_state(batch_b, device=action.device, dtype=action.dtype)
        # Rollout model by taking the same series of actions as the real model
        prior, post = model.rollout.rollout_representation(batch_t, embed, action, prev_state)

which means embed is $o_{t:t+K}$ and action is $a_{t:t+K}$ (judging by the comment in the code). Don't we need $a_{t-1:t+K-1}$ instead?
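To make the two readings concrete, here is a minimal sketch that tracks only time offsets relative to t. It assumes the code comments describe half-open ranges, so all_observation covers offsets 0..K and all_action covers offsets -1..K-1. The names K, action_current and action_proposed are hypothetical and do not appear in the repository.

```python
# Toy illustration: each entry is a time offset relative to t, not real data.
K = 4  # hypothetical batch_length

# Assumption from the code comments, read as half-open ranges:
# all_observation covers [t, t+K+1) and all_action covers [t-1, t+K).
all_observation = list(range(0, K + 1))  # offsets 0, 1, ..., K
all_action = list(range(-1, K))          # offsets -1, 0, ..., K-1

observation = all_observation[:-1]   # offsets 0 .. K-1
action_current = all_action[1:]      # slicing used in the code: offsets 0 .. K-1
action_proposed = all_action[:-1]    # slicing suggested above: offsets -1 .. K-2

# Pair each observation offset with the action offset at the same index.
pairs_current = list(zip(observation, action_current))
pairs_proposed = list(zip(observation, action_proposed))

print(pairs_current)
print(pairs_proposed)
```

Under these assumed labels, [1:] lines up each observation with the action carrying the same offset, while [:-1] lines it up with the action one step earlier. Whether the entry labelled with offset t in the replay buffer is the action that led to $o_t$ or the action chosen after seeing $o_t$ depends on the buffer's storage convention, which this sketch does not settle.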


seolhokim commented on August 9, 2024

No. The action is $a_{t-1 : t+K-1}$. The observation time steps are [t, t+batch_length+1], sliced with [:-1], and the action time steps are [t-1, t+batch_length], sliced with [1:].


gunnxx commented on August 9, 2024

Because the comment for observation is

# [t, t+batch_length+1] -> [t, t+batch_length]

and for action is

# [t-1, t+batch_length] -> [t, t+batch_length]

That's why I thought it was wrong: read that way, the two sequences are $o_{t:t+K}$ and $a_{t:t+K}$. I also said I was not really sure, because I did not know how the replay buffer samples. Thanks for the confirmation!


seolhokim commented on August 9, 2024

Okay. All of the code is fine. Depending on where you slice the array, the data can start from t-1 or from t. Thanks.

