Sorry for the simple question. I'm trying to replicate my results across runs in S

How to achieve replicable results in SenseAct? about senseact HOT 3 OPEN

kindredresearch commented on June 4, 2024

How to achieve replicable results in SenseAct?

from senseact.

Comments (3)

gauthamvasan commented on June 4, 2024

Hi @fisherxue, we had a similar discussion in this issue. Please look at the entire discussion and try the steps listed there. If you're still running into issues, let me know.


fisherxue commented on June 4, 2024

What I've done: saved the random state to a file, then loaded it using pickle. I do this before the env is created. I then set the TensorFlow random seed and the Python random seed after sess.__enter__().

I am also passing the random state I load from file into the environment.
I'm fairly sure my hardware is not the bottleneck.

I am able to get relatively consistent results when I run two simulations at the same time. However, when I run one after the other, I get vastly different results. Any advice?

This is what I have:

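import pickle
import random

import numpy as np
import tensorflow as tf
import baselines.common.tf_util as U
# SenseAct import paths assumed from the repository's example scripts
from senseact.envs.sim_double_pendulum.sim_double_pendulum import DoubleInvertedPendulumEnv
from senseact.utils import tf_set_seeds

# Use a fixed random state loaded from file
with open('random.obj', 'rb') as f:
    rand_state = pickle.load(f)
np.random.set_state(rand_state)
tf_set_seeds(np.random.randint(1, 2**31 - 1))

# Create asynchronous simulation of the InvertedDoublePendulum-v2 MuJoCo environment
env = DoubleInvertedPendulumEnv(agent_dt=0.005,
                                sensor_dt=[0.01, 0.0033333],
                                is_render=False,
                                random_state=rand_state
                               )
# Start environment processes
env.start()

# Create baselines PPO policy function
sess = U.single_threaded_session()
sess.__enter__()
# Seed TensorFlow and Python's random module from the restored NumPy state
seed = np.random.randint(1, 2**31 - 1)
tf.set_random_seed(seed)
random.seed(seed)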

Thanks!
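
For reference, here is a minimal sketch of the save/restore round trip described in this comment. The file name `random.obj` comes from the post; the helper names `save_state`, `load_state`, and `draws_after_restore` are hypothetical and not part of SenseAct. It only checks that NumPy's global generator replays the same draws once the state is restored. It does not address timing-related differences in an asynchronous environment.

```python
import os
import pickle
import tempfile

import numpy as np


def save_state(path):
    """Pickle NumPy's current global random state to `path` (hypothetical helper)."""
    with open(path, "wb") as f:
        pickle.dump(np.random.get_state(), f)


def load_state(path):
    """Load a pickled state, install it as NumPy's global state, and return it."""
    with open(path, "rb") as f:
        state = pickle.load(f)
    np.random.set_state(state)
    return state


def draws_after_restore(path, n=5):
    """Restore the state from `path`, then draw n seed-like integers."""
    load_state(path)
    return [int(np.random.randint(1, 2**31 - 1)) for _ in range(n)]


# Round trip in a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    state_path = os.path.join(tmp, "random.obj")
    save_state(state_path)
    first = draws_after_restore(state_path)
    second = draws_after_restore(state_path)

print(first == second)
```

If the two lists match, the seeds derived from the restored state are identical across runs, so any remaining run-to-run variation has to come from somewhere other than these seeds.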


fisherxue commented on June 4, 2024

@gauthamvasan
I'm getting this warning when running on one machine:
WARNING:root:Agent has over-run its allocated dt, it has been 0.008300065994262695 since the last observation, 0.003300065994262695 more than allowed
However, on the other machine I'm running it on, I only get that warning at the start of each iteration.
I'm still failing to get tight repeatability curves on double pendulum.

Any tips?
Thanks!
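
To make the arithmetic in the warning concrete: with `agent_dt=0.005`, an elapsed time of about 0.0083 s between observations exceeds the budget by about 0.0033 s. The sketch below reproduces that calculation with a hypothetical `overrun` helper; it is an illustration of the numbers in the log line, not SenseAct's actual implementation.

```python
def overrun(elapsed, agent_dt):
    """Return how far `elapsed` exceeds the `agent_dt` budget (0.0 if within budget)."""
    return max(0.0, elapsed - agent_dt)


agent_dt = 0.005                  # agent time step from the environment setup in the post
elapsed = 0.008300065994262695    # time since last observation, from the warning message
excess = overrun(elapsed, agent_dt)
print(f"over-run by {excess:.6f} s")
```

When steps over-run like this, the agent may see observations at slightly different times on each run, so fixed seeds alone may not be enough to produce identical learning curves.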

