Hi, I used run_experiment.py in order to retrieve the results for dqn. I

Retrieving results from paper using run_experiment.py (phyre issue, closed)

facebookresearch commented on August 31, 2024

Retrieving results from paper using run_experiment.py

from phyre.

Comments (3)

akhti commented on August 31, 2024

base_dqn ranks 10k actions during evaluation. We did a sweep over the number of actions to rank on the validation set and found that the optimal number depends on the generalization setting, due to an overcrowding problem. See lines 145-150 in run_experiment.py:

    dqn_ranks = dict(
        ball_cross_template='--dqn-rank-size=1000',
        ball_within_template='--dqn-rank-size=10000',
        two_balls_cross_template='--dqn-rank-size=100000',
        two_balls_within_template='--dqn-rank-size=100000',
    )

To get the final results you need to run the finals arg-generator. It takes the pretrained DQN from base_dqn and uses it to run evaluation with the optimal number of actions.
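To illustrate why one pretrained checkpoint can be re-evaluated with a different rank size: ranking happens at evaluation time, after training is done. Below is a minimal sketch of that idea, not the PHYRE implementation. `rank_actions`, `score_fn`, and `toy_score` are hypothetical names, and the toy scorer merely stands in for a trained DQN.

```python
import random


def rank_actions(score_fn, candidate_actions, rank_size, seed=0):
    """Score a random subset of `rank_size` candidates and sort best-first.

    Hypothetical helper: `score_fn` stands in for a trained model's
    predicted success score for an action.
    """
    rng = random.Random(seed)
    size = min(rank_size, len(candidate_actions))
    subset = rng.sample(candidate_actions, size)
    return sorted(subset, key=score_fn, reverse=True)


# Toy stand-in for a trained scorer (assumption): prefers actions near 0.5.
def toy_score(action):
    return -abs(action - 0.5)


candidates = [i / 1000 for i in range(1000)]
# The "model" stays fixed; only the evaluation-time rank size changes.
small_ranking = rank_actions(toy_score, candidates, rank_size=10)
large_ranking = rank_actions(toy_score, candidates, rank_size=1000)
```

Because the scorer is untouched between the two calls, changing the rank size requires no retraining, which matches the split between base_dqn and the final evaluation described above.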

Wonder1905 commented on August 31, 2024

Let me get this straight: first you ran with these settings:

    dqn_ranks = dict(
        ball_cross_template='--dqn-rank-size=1000',
        ball_within_template='--dqn-rank-size=10000',
        two_balls_cross_template='--dqn-rank-size=100000',
        two_balls_within_template='--dqn-rank-size=100000',
    )

Afterward, you did another run to find the optimal number of actions to rank?

akhti commented on August 31, 2024

The full sequence of experiments is listed in agents/train_all_baseline.sh.

First we train a DQN on 3 dev folds. Then we use the trained models to rank different numbers of actions and measure AUCCESS:

    python $RUN_EXPERIMENT_SCRIPT --use-test-split 0 --arg-generator base_dqn --num-seeds $DEV_SEEDS
    python $RUN_EXPERIMENT_SCRIPT --use-test-split 0 --arg-generator rank_and_online_sweep --num-seeds $DEV_SEEDS
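For readers unfamiliar with the metric, here is a rough sketch of how AUCCESS can be computed. It assumes the definition recalled from the PHYRE paper: weights w_k = log(k+1) - log(k) for k = 1..100 attempts, applied to the fraction of tasks solved within k attempts. The function name and input format are hypothetical; the library's own evaluation code remains the authority.

```python
import math


def auccess(attempts_to_solve, max_attempts=100):
    """Weighted average of success rates over attempt budgets 1..max_attempts.

    `attempts_to_solve` has one entry per task: the 1-based attempt at which
    the task was first solved, or None if it was never solved.
    Assumed weighting: w_k = log(k + 1) - log(k), favoring early solutions.
    """
    num_tasks = len(attempts_to_solve)
    total = 0.0
    weight_sum = 0.0
    for k in range(1, max_attempts + 1):
        w_k = math.log(k + 1) - math.log(k)
        solved = sum(1 for a in attempts_to_solve if a is not None and a <= k)
        total += w_k * solved / num_tasks
        weight_sum += w_k
    return total / weight_sum
```

A task solved on the first attempt contributes its full weight, while one solved late contributes only to the low-weight tail, so the order in which ranked actions are tried matters.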

Then we manually chose the best number of actions to rank (see figure 4 in the paper). These values are used to get the final numbers:
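The choice was made by hand, but the same selection could be mirrored in code roughly as follows. This is only a sketch: `best_rank_size` is a hypothetical helper, and the numbers in `toy_sweep` are made-up placeholders, not results from the paper or from any run.

```python
from statistics import mean


def best_rank_size(auccess_by_rank_size):
    """Return the rank size with the highest mean AUCCESS across seeds."""
    return max(
        auccess_by_rank_size,
        key=lambda size: mean(auccess_by_rank_size[size]),
    )


# Placeholder numbers for illustration only; NOT results from the paper.
toy_sweep = {
    1000: [0.30, 0.32, 0.31],
    10000: [0.35, 0.36, 0.34],
    100000: [0.33, 0.35, 0.32],
}
chosen = best_rank_size(toy_sweep)
# Format the choice like the dqn_ranks entries in run_experiment.py.
flag = f'--dqn-rank-size={chosen}'
```

With one such sweep per evaluation setting, the chosen values would fill in a dictionary like dqn_ranks.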

    python $RUN_EXPERIMENT_SCRIPT --use-test-split 1 --arg-generator base_dqn --num-seeds $FINAL_SEEDS
    wait_for_results "results/final/$DQN_BASE_NAME" $FINAL_SEEDS
    python $RUN_EXPERIMENT_SCRIPT --use-test-split 1 --arg-generator finals --num-seeds $FINAL_SEEDS

The first command trains the DQN on the final (non-dev) folds. It also ranks the default 10k actions during evaluation, but that evaluation is ignored, so it doesn’t matter. The second command uses the pre-trained checkpoints to rank the optimal number of actions for each evaluation setting. It also evaluates the other baseline algorithms (like MEM) on the final folds. That’s why these 2 commands are separate.

You can see exactly what each arg-generator command does in agents/run_experiment.py.
