
Comments (6)

Itsukara commented on August 16, 2024

(Sample report 1)

Summary

  • Total run steps: 48M
  • Average Score (last 16 lines): 610-710
  • Learning curve: https://cdn-ak.f.st-hatena.com/images/fotolife/I/Itsukara/20160907/20160907105007.png
  • Experimenter: Itsukara
  • Web site: http://itsukara.hateblo.jp/

Run Structure

  • 0M - 48M STEP: --train-episode-steps=30 --lives-lost-reward=-0.03 --reset-max-reward=True --psc-use=True --randomness-time=3000 --color-maximizing-in-gs=False --color-averaging-in-ale=True

Details

  • 0M - 48M STEP: --train-episode-steps=30 --lives-lost-reward=-0.03 --reset-max-reward=True --psc-use=True --randomness-time=3000 --color-maximizing-in-gs=False --color-averaging-in-ale=True
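
Judging by the flag names and the values in the options dump below (reward_clip=1.0, terminate_on_lives_lost=False), --lives-lost-reward appears to add a small penalty to the clipped game reward when a life is lost, instead of ending the episode. A minimal sketch of that interpretation, which is an assumption rather than the repository's exact code:

```python
def shaped_reward(raw_reward, lives_lost,
                  reward_clip=1.0, lives_lost_reward=-0.03):
    """Clip the game reward and penalize losing a life.

    Interpretation of --lives-lost-reward / --reward-clip is inferred
    from flag names and values in the dumps; the actual code may differ.
    """
    r = max(-reward_clip, min(reward_clip, raw_reward))  # reward clipping
    if lives_lost:
        r += lives_lost_reward  # small negative shaping term
    return r

print(shaped_reward(100.0, lives_lost=False))  # -> 1.0 (clipped)
print(shaped_reward(0.0, lives_lost=True))     # -> -0.03
```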

******************** options ********************
Namespace(action_size=18, average_score_log_interval=10, basic_income=0.0, basic_income_time=100000000000000000000, checkpoint_dir='checkpoints', color_averaging_in_ale=True, color_maximizing_in_gs=False, display=True, end_mega_step=80, end_time_step=80000000, entropy_beta=0.01, frames_skip_in_ale=4, frames_skip_in_gs=1, gamma=0.99, grad_norm_clip=40.0, initial_alpha_high=0.01, initial_alpha_log_rate=0.4226, initial_alpha_low=0.0001, lives_lost_reward=-0.03, lives_lost_rratio=1.0, lives_lost_weight=1.0, local_t_max=5, log_file='tmp/a3c_log', log_interval=800, max_mega_step=100, max_play_steps=4500, max_play_time=300, max_time_step=100000000, max_to_keep=5, no_reward_steps=225, no_reward_time=15, num_experiments=1, parallel_size=8, performance_log_interval=1500, psc_beta=0.01, psc_frsize=42, psc_maxval=127, psc_use=True, randomness=2.2222222222222223e-05, randomness_log_interval=1500.0, randomness_log_num=30, randomness_steps=45000, randomness_time=3000.0, record_gs_screen_dir=None, record_new_record_dir=None, record_screen_dir=None, repeat_action_probability=0.0, reset_max_reward=True, reward_clip=1.0, rmsp_alpha=0.99, rmsp_epsilon=0.1, rom='montezuma_revenge.bin', save_mega_interval=3, save_time_interval=3000000, score_averaging_length=100, score_highest_ratio=0.5, score_log_interval=900, terminate_on_lives_lost=False, train_episode_steps=30, train_in_eval=False, use_gpu=True, use_lstm=False, verbose=True)
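
The options block is the repr of a Python argparse Namespace, so every --flag-with-dashes on the command line shows up as a flag_with_underscores attribute. A minimal sketch of how a few of these flags could be declared (flag names are taken from the dump; the string-to-boolean handling and the defaults are assumptions, not async_deep_reinforce's actual code):

```python
import argparse

def str2bool(s):
    # The reports pass booleans as --flag=True / --flag=False strings.
    return s == "True"

parser = argparse.ArgumentParser(description="A3C training options (sketch)")
parser.add_argument("--rom", default="montezuma_revenge.bin")
parser.add_argument("--parallel-size", type=int, default=8)
parser.add_argument("--train-episode-steps", type=int, default=30)
parser.add_argument("--lives-lost-reward", type=float, default=0.0)
parser.add_argument("--psc-use", type=str2bool, default=False)
parser.add_argument("--color-averaging-in-ale", type=str2bool, default=False)

args = parser.parse_args(
    ["--train-episode-steps=30", "--lives-lost-reward=-0.03", "--psc-use=True"]
)
print(args)  # prints a Namespace(...) line like the dumps in these reports
```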

<Average Episode score (last 16 lines)>
@@@ Average Episode score = 698.000000, s= 47597186,th=3
@@@ Average Episode score = 641.000000, s= 47601045,th=2
@@@ Average Episode score = 665.000000, s= 47609651,th=4
@@@ Average Episode score = 675.000000, s= 47616326,th=7
@@@ Average Episode score = 610.000000, s= 47674768,th=0
@@@ Average Episode score = 692.000000, s= 47705725,th=5
@@@ Average Episode score = 698.000000, s= 47758888,th=3
@@@ Average Episode score = 661.000000, s= 47791572,th=1
@@@ Average Episode score = 701.000000, s= 47798075,th=4
@@@ Average Episode score = 672.000000, s= 47815701,th=6
@@@ Average Episode score = 700.000000, s= 47821598,th=2
@@@ Average Episode score = 610.000000, s= 47858878,th=0
@@@ Average Episode score = 689.000000, s= 47859319,th=7
@@@ Average Episode score = 710.000000, s= 47928128,th=5
@@@ Average Episode score = 661.000000, s= 47993088,th=3
@@@ Average Episode score = 665.000000, s= 47993704,th=1
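
Each @@@ line is one thread's running-average score (likely over the last 100 episodes, per score_averaging_length=100 in the dump): s= is the global training step and th= is the thread index, 0-7 for parallel_size=8. A small sketch for pulling the numbers out of these lines:

```python
import re

# Matches lines like:
#   @@@ Average Episode score = 698.000000, s= 47597186,th=3
LINE_RE = re.compile(
    r"@@@ Average Episode score = ([\d.]+), s= ?(\d+),th=(\d+)"
)

def parse_scores(lines):
    """Yield (step, thread, score) from raw @@@ log lines."""
    for line in lines:
        m = LINE_RE.search(line)
        if m:
            yield int(m.group(2)), int(m.group(3)), float(m.group(1))

log = [
    "@@@ Average Episode score = 698.000000, s= 47597186,th=3",
    "@@@ Average Episode score = 641.000000, s= 47601045,th=2",
]
scores = [score for _, _, score in parse_scores(log)]
print(min(scores), max(scores))  # the min-max range quoted in the Summary
```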


Itsukara commented on August 16, 2024

(Sample report 2)

Summary

  • Total run steps: 26M
  • Average Score (last 16 lines): 359-424
  • Learning curve: https://cdn-ak.f.st-hatena.com/images/fotolife/I/Itsukara/20160907/20160907105132.png
  • Experimenter: Itsukara
  • Web site: http://itsukara.hateblo.jp/

Run Structure

  • 0M - 3M STEP: --train-episode-steps=30 --lives-lost-reward=-0.02 --reset-max-reward=True --psc-use=True --randomness-time=3000 --color-maximizing-in-gs=False --color-averaging-in-ale=True
  • 3M - 26M STEP: --train-episode-steps=30 --lives-lost-reward=-0.02 --reset-max-reward=True --psc-use=True --randomness-time=3000 --color-maximizing-in-gs=True --color-averaging-in-ale=False

Details

  • 0M - 3M STEP: --train-episode-steps=30 --lives-lost-reward=-0.02 --reset-max-reward=True --psc-use=True --randomness-time=3000 --color-maximizing-in-gs=False --color-averaging-in-ale=True

******************** options ********************
Namespace(action_size=18, average_score_log_interval=10, basic_income=0.0, basic_income_time=100000000000000000000, checkpoint_dir='checkpoints', color_averaging_in_ale=True, color_maximizing_in_gs=False, display=True, end_mega_step=3, end_time_step=3000000, entropy_beta=0.01, frames_skip_in_ale=4, frames_skip_in_gs=1, gamma=0.99, grad_norm_clip=40.0, initial_alpha_high=0.01, initial_alpha_log_rate=0.4226, initial_alpha_low=0.0001, lives_lost_reward=-0.02, lives_lost_rratio=1.0, lives_lost_weight=1.0, local_t_max=5, log_file='tmp/a3c_log', log_interval=800, max_mega_step=100, max_play_steps=4500, max_play_time=300, max_time_step=100000000, max_to_keep=5, no_reward_steps=225, no_reward_time=15, num_experiments=1, parallel_size=8, performance_log_interval=1500, psc_beta=0.01, psc_frsize=42, psc_maxval=127, psc_use=True, randomness=2.2222222222222223e-05, randomness_log_interval=1500.0, randomness_log_num=30, randomness_steps=45000, randomness_time=3000.0, record_gs_screen_dir=None, record_new_record_dir=None, record_screen_dir=None, repeat_action_probability=0.0, reset_max_reward=True, reward_clip=1.0, rmsp_alpha=0.99, rmsp_epsilon=0.1, rom='montezuma_revenge.bin', save_mega_interval=3, save_time_interval=3000000, score_averaging_length=100, score_highest_ratio=0.5, score_log_interval=900, terminate_on_lives_lost=False, train_episode_steps=30, train_in_eval=False, use_gpu=True, use_lstm=False, verbose=True)


  • 3M - 26M STEP: --train-episode-steps=30 --lives-lost-reward=-0.02 --reset-max-reward=True --psc-use=True --randomness-time=3000 --color-maximizing-in-gs=True --color-averaging-in-ale=False

******************** options ********************
Namespace(action_size=18, average_score_log_interval=10, basic_income=0.0, basic_income_time=100000000000000000000, checkpoint_dir='checkpoints', color_averaging_in_ale=False, color_maximizing_in_gs=True, display=True, end_mega_step=80, end_time_step=80000000, entropy_beta=0.01, frames_skip_in_ale=1, frames_skip_in_gs=4, gamma=0.99, grad_norm_clip=40.0, initial_alpha_high=0.01, initial_alpha_log_rate=0.4226, initial_alpha_low=0.0001, lives_lost_reward=-0.02, lives_lost_rratio=1.0, lives_lost_weight=1.0, local_t_max=5, log_file='tmp/a3c_log', log_interval=800, max_mega_step=100, max_play_steps=4500, max_play_time=300, max_time_step=100000000, max_to_keep=5, no_reward_steps=225, no_reward_time=15, num_experiments=1, parallel_size=8, performance_log_interval=1500, psc_beta=0.01, psc_frsize=42, psc_maxval=127, psc_use=True, randomness=2.2222222222222223e-05, randomness_log_interval=1500.0, randomness_log_num=30, randomness_steps=45000, randomness_time=3000.0, record_gs_screen_dir=None, record_new_record_dir=None, record_screen_dir=None, repeat_action_probability=0.0, reset_max_reward=True, reward_clip=1.0, rmsp_alpha=0.99, rmsp_epsilon=0.1, rom='montezuma_revenge.bin', save_mega_interval=3, save_time_interval=3000000, score_averaging_length=100, score_highest_ratio=0.5, score_log_interval=900, terminate_on_lives_lost=False, train_episode_steps=30, train_in_eval=False, use_gpu=True, use_lstm=False, verbose=True)
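
The only change at 3M steps is how skipped frames are composited: ALE's color averaging is turned off in favor of a per-pixel maximum taken on the 'gs' side (presumably the game-screen/game-state wrapper; the dumps also swap frames_skip_in_ale and frames_skip_in_gs accordingly). A sketch of the difference between the two reductions, assuming grayscale uint8 frames:

```python
import numpy as np

def composite(frames, mode="max"):
    """Collapse a stack of skipped frames into one observation.

    mode="average" mimics ALE's color averaging; mode="max" takes the
    per-pixel maximum, as in the standard DQN preprocessing, which keeps
    sprites that flicker on alternate frames visible.
    """
    stack = np.stack(frames).astype(np.float32)
    if mode == "average":
        return stack.mean(axis=0).astype(np.uint8)
    return stack.max(axis=0).astype(np.uint8)

# A flickering sprite: visible in one frame, absent in the next.
f1 = np.zeros((4, 4), dtype=np.uint8); f1[1, 1] = 200
f2 = np.zeros((4, 4), dtype=np.uint8)
print(composite([f1, f2], "average")[1, 1])  # 100 -- sprite dimmed
print(composite([f1, f2], "max")[1, 1])      # 200 -- sprite preserved
```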


<Average Episode score (last 16 lines)>
@@@ Average Episode score = 400.000000, s= 25752794,th=6
@@@ Average Episode score = 419.000000, s= 25781176,th=1
@@@ Average Episode score = 409.000000, s= 25791626,th=2
@@@ Average Episode score = 362.000000, s= 25795054,th=3
@@@ Average Episode score = 381.000000, s= 25810532,th=4
@@@ Average Episode score = 376.000000, s= 25812920,th=7
@@@ Average Episode score = 396.000000, s= 25814558,th=5
@@@ Average Episode score = 424.000000, s= 25949909,th=1
@@@ Average Episode score = 359.000000, s= 25950264,th=0
@@@ Average Episode score = 373.000000, s= 25951475,th=7
@@@ Average Episode score = 397.000000, s= 25957375,th=6
@@@ Average Episode score = 369.000000, s= 25963558,th=3
@@@ Average Episode score = 386.000000, s= 25996207,th=5
@@@ Average Episode score = 392.000000, s= 26015968,th=4
@@@ Average Episode score = 388.000000, s= 26053439,th=6
@@@ Average Episode score = 415.000000, s= 26065448,th=2


tflare commented on August 16, 2024

Summary

Details

  • --train-episode-steps=30 --lives-lost-reward=-0.02 --lives-lost-rratio=0.0 --reset-max-reward=True --psc-use=True --repeat-action-probability=0.0 --randomness-time=3000 --end-mega-step=80 --log-interval=800 --max-to-keep=5 --color-maximizing-in-gs=False --color-averaging-in-ale=True

options

Namespace(action_size=18, average_score_log_interval=10, basic_income=0.0, basic_income_time=100000000000000000000, checkpoint_dir='checkpoints', color_averaging_in_ale=True, color_maximizing_in_gs=False, display=True, end_mega_step=80, end_time_step=80000000, entropy_beta=0.01, frames_skip_in_ale=4, frames_skip_in_gs=1, gamma=0.99, grad_norm_clip=40.0, initial_alpha_high=0.01, initial_alpha_log_rate=0.4226, initial_alpha_low=0.0001, lives_lost_reward=-0.02, lives_lost_rratio=0.0, lives_lost_weight=1.0, local_t_max=5, log_file='tmp/a3c_log', log_interval=800, max_mega_step=100, max_play_steps=4500, max_play_time=300, max_time_step=100000000, max_to_keep=5, no_reward_steps=225, no_reward_time=15, num_experiments=1, parallel_size=8, performance_log_interval=1500, psc_beta=0.01, psc_frsize=42, psc_maxval=127, psc_use=True, randomness=2.2222222222222223e-05, randomness_log_interval=1500.0, randomness_log_num=30, randomness_steps=45000, randomness_time=3000.0, record_gs_screen_dir=None, record_new_record_dir=None, record_screen_dir=None, repeat_action_probability=0.0, reset_max_reward=True, reward_clip=1.0, rmsp_alpha=0.99, rmsp_epsilon=0.1, rom='montezuma_revenge.bin', save_mega_interval=3, save_time_interval=3000000, score_averaging_length=100, score_highest_ratio=0.5, score_log_interval=900, terminate_on_lives_lost=False, train_episode_steps=30, train_in_eval=False, use_gpu=True, use_lstm=False, verbose=True)
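
All of these runs keep psc_use=True, which the psc_* options suggest is a pseudo-count-style exploration bonus in the spirit of Bellemare et al. (2016): psc_frsize=42 and psc_maxval=127 point to counting over screens downsampled to 42x42 with quantized pixel values, scaled by psc_beta=0.01. The sketch below shows only the shape of such a bonus; a plain visit-count table stands in for the density-model pseudo-counts, which is a deliberate simplification:

```python
import numpy as np
from collections import defaultdict

PSC_BETA = 0.01    # psc_beta in the dump
PSC_FRSIZE = 42    # psc_frsize: downsampled frame size
PSC_MAXVAL = 127   # psc_maxval: pixel quantization ceiling

visit_counts = defaultdict(int)  # stand-in for density-model pseudo-counts

def exploration_bonus(screen):
    """Return psc_beta / sqrt(N) for a coarsely quantized screen.

    N counts visits to the quantized screen, including this one; a real
    pseudo-count method estimates N from a sequential density model.
    """
    step = max(1, screen.shape[0] // PSC_FRSIZE)
    small = screen[::step, ::step]
    key = (small.astype(np.float32) * PSC_MAXVAL / 255).astype(np.uint8).tobytes()
    visit_counts[key] += 1
    return PSC_BETA / np.sqrt(visit_counts[key])

screen = np.zeros((84, 84), dtype=np.uint8)
print(exploration_bonus(screen))  # 0.01 on the first visit ...
print(exploration_bonus(screen))  # ... ~0.007 on the second
```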

<Average Episode score (last 20 lines)>

@@@ Average Episode score = 317.000000, s= 43205428,th=5
@@@ Average Episode score = 332.000000, s= 43229035,th=4
@@@ Average Episode score = 321.000000, s= 43232792,th=6
@@@ Average Episode score = 312.000000, s= 43250069,th=3
@@@ Average Episode score = 336.000000, s= 43256872,th=2
@@@ Average Episode score = 298.000000, s= 43301059,th=0
@@@ Average Episode score = 350.000000, s= 43349114,th=7
@@@ Average Episode score = 331.000000, s= 43352754,th=1
@@@ Average Episode score = 315.000000, s= 43362133,th=6
@@@ Average Episode score = 307.000000, s= 43393477,th=5
@@@ Average Episode score = 329.000000, s= 43425115,th=4
@@@ Average Episode score = 340.000000, s= 43439061,th=2
@@@ Average Episode score = 301.000000, s= 43456306,th=3
@@@ Average Episode score = 329.000000, s= 43512264,th=1
@@@ Average Episode score = 307.000000, s= 43519510,th=0
@@@ Average Episode score = 298.000000, s= 43525898,th=6
@@@ Average Episode score = 337.000000, s= 43529651,th=7
@@@ Average Episode score = 323.000000, s= 43626017,th=4
@@@ Average Episode score = 346.000000, s= 43626579,th=2
@@@ Average Episode score = 312.000000, s= 43634385,th=5


Itsukara commented on August 16, 2024

Summary

Run Structure

  • 0M - 84M STEP: --train-episode-steps=30 --tes-extend=True --lives-lost-reward=0.0 --lives-lost-rratio=0.0 --reset-max-reward=True --psc-use=True --repeat-action-probability=0.0 --randomness-time=3000 --end-mega-step=100 --log-interval=800 --color-maximizing-in-gs=False --color-averaging-in-ale=True --max-mega-step=200

Details

  • 0M - 84M STEP: --train-episode-steps=30 --tes-extend=True --lives-lost-reward=0.0 --lives-lost-rratio=0.0 --reset-max-reward=True --psc-use=True --repeat-action-probability=0.0 --randomness-time=3000 --end-mega-step=100 --log-interval=800 --color-maximizing-in-gs=False --color-averaging-in-ale=True --max-mega-step=200

******************** options ********************
Namespace(action_size=18, average_score_log_interval=10, basic_income=0.0, basic_income_time=100000000000000000000, checkpoint_dir='checkpoints', color_averaging_in_ale=True, color_maximizing_in_gs=False, display=True, end_mega_step=100, end_time_step=100000000, entropy_beta=0.01, frames_skip_in_ale=4, frames_skip_in_gs=1, gamma=0.99, grad_norm_clip=40.0, initial_alpha_high=0.01, initial_alpha_log_rate=0.4226, initial_alpha_low=0.0001, lives_lost_reward=0.0, lives_lost_rratio=0.0, lives_lost_weight=1.0, local_t_max=5, log_file='tmp/a3c_log', log_interval=800, max_mega_step=200, max_play_steps=4500, max_play_time=300, max_time_step=200000000, max_to_keep=None, no_reward_steps=225, no_reward_time=15, num_experiments=1, parallel_size=8, performance_log_interval=1500, psc_beta=0.01, psc_frsize=42, psc_maxval=127, psc_use=True, randomness=2.2222222222222223e-05, randomness_log_interval=1500.0, randomness_log_num=30, randomness_steps=45000, randomness_time=3000.0, record_gs_screen_dir=None, record_new_record_dir=None, record_screen_dir=None, repeat_action_probability=0.0, reset_max_reward=True, reward_clip=1.0, rmsp_alpha=0.99, rmsp_epsilon=0.1, rom='montezuma_revenge.bin', save_mega_interval=3, save_time_interval=3000000, score_averaging_length=100, score_highest_ratio=0.5, score_log_interval=900, terminate_on_lives_lost=False, tes_extend=True, tes_extend_ratio=5.0, train_episode_steps=30, train_in_eval=False, use_gpu=True, use_lstm=False, verbose=True)


<Average Episode score (last 16 lines just before 84M STEP)>
@@@ Average Episode score = 1114.000000, s= 83401163,th=4
@@@ Average Episode score = 1280.000000, s= 83447970,th=3
@@@ Average Episode score = 1130.000000, s= 83486089,th=7
@@@ Average Episode score = 1193.000000, s= 83510661,th=6
@@@ Average Episode score = 1213.000000, s= 83570385,th=2
@@@ Average Episode score = 1081.000000, s= 83572954,th=5
@@@ Average Episode score = 1023.000000, s= 83623526,th=1
@@@ Average Episode score = 1012.000000, s= 83633493,th=0
@@@ Average Episode score = 1156.000000, s= 83691654,th=4
@@@ Average Episode score = 1265.000000, s= 83771600,th=3
@@@ Average Episode score = 1007.000000, s= 83772564,th=1
@@@ Average Episode score = 1169.000000, s= 83777307,th=6
@@@ Average Episode score = 1054.000000, s= 83779671,th=5
@@@ Average Episode score = 1065.000000, s= 83787417,th=7
@@@ Average Episode score = 1255.000000, s= 83913801,th=2
@@@ Average Episode score = 1017.000000, s= 83935966,th=0


Itsukara commented on August 16, 2024

Summary

  • Total run steps: 65.2M
  • Average Score (last 16 lines): 1090-1573
  • Learning curve: https://cdn-ak.f.st-hatena.com/images/fotolife/I/Itsukara/20160916/20160916141555.png
  • Checkpoints data:
  • Experimenter: Itsukara
  • Web site: http://itsukara.hateblo.jp/

Run Structure

  • 0M - 62.5M STEP: --train-episode-steps=30 --tes-extend=True --lives-lost-reward=0.0 --lives-lost-rratio=0.0 --reset-max-reward=True --psc-use=True --no-reward-time=600 --end-mega-step=100 --log-interval=800 --color-maximizing-in-gs=False --color-averaging-in-ale=True --max-mega-step=200 --greediness=0.01 --repeat-action-ratio=0.25

Details

  • 0M - 62.5M STEP: --train-episode-steps=30 --tes-extend=True --lives-lost-reward=0.0 --lives-lost-rratio=0.0 --reset-max-reward=True --psc-use=True --no-reward-time=600 --end-mega-step=100 --log-interval=800 --color-maximizing-in-gs=False --color-averaging-in-ale=True --max-mega-step=200 --greediness=0.01 --repeat-action-ratio=0.25

******************** options ********************
Namespace(action_size=18, average_score_log_interval=10, basic_income=0.0, basic_income_time=100000000000000000000, checkpoint_dir='checkpoints', color_averaging_in_ale=True, color_maximizing_in_gs=False, display=True, end_mega_step=100, end_time_step=100000000, entropy_beta=0.01, frames_skip_in_ale=4, frames_skip_in_gs=1, gamma=0.99, grad_norm_clip=40.0, greediness=0.01, initial_alpha_high=0.01, initial_alpha_log_rate=0.4226, initial_alpha_low=0.0001, lives_lost_reward=0.0, lives_lost_rratio=0.0, lives_lost_weight=1.0, local_t_max=5, log_file='tmp/a3c_log', log_interval=800, max_mega_step=200, max_play_steps=4500, max_play_time=300, max_time_step=200000000, max_to_keep=None, no_reward_steps=9000, no_reward_time=600, num_experiments=1, parallel_size=8, performance_log_interval=1500, psc_beta=0.01, psc_frsize=42, psc_maxval=127, psc_use=True, randomness=0.00022222222222222223, randomness_log_interval=150.0, randomness_log_num=30, randomness_steps=4500, randomness_time=300, record_gs_screen_dir=None, record_new_record_dir=None, record_screen_dir=None, repeat_action_probability=0.0, repeat_action_ratio=0.25, reset_max_reward=True, reward_clip=1.0, rmsp_alpha=0.99, rmsp_epsilon=0.1, rom='montezuma_revenge.bin', save_mega_interval=3, save_time_interval=3000000, score_averaging_length=100, score_highest_ratio=0.5, score_log_interval=900, terminate_on_lives_lost=False, tes_extend=True, tes_extend_ratio=5.0, train_episode_steps=30, train_in_eval=False, use_gpu=True, use_lstm=False, verbose=True)
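
New in this run are --greediness=0.01 and --repeat-action-ratio=0.25. Reading the names at face value (an assumption; the repository defines the exact semantics), the first mixes a small amount of uniform random action choice into the policy, and the second makes the agent repeat its previous action with probability 0.25, similar to sticky actions. A sketch of that interpretation:

```python
import random

GREEDINESS = 0.01            # per --greediness
REPEAT_ACTION_RATIO = 0.25   # per --repeat-action-ratio
ACTION_SIZE = 18             # per action_size in the dump

def select_action(policy_probs, prev_action):
    """Sample an action with sticky repeats and a uniform-random mixture.

    The meaning of the two flags is inferred from their names only.
    """
    if prev_action is not None and random.random() < REPEAT_ACTION_RATIO:
        return prev_action                    # sticky action
    if random.random() < GREEDINESS:
        return random.randrange(ACTION_SIZE)  # uniform exploration
    # otherwise sample from the policy's action distribution
    return random.choices(range(ACTION_SIZE), weights=policy_probs)[0]

probs = [1.0 / ACTION_SIZE] * ACTION_SIZE
a = select_action(probs, prev_action=None)
print(select_action(probs, prev_action=a))
```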


<Average Episode score (last 16 lines, up to 65.2M STEP)>
@@@ Average Episode score = 1369.000000, s= 64695930,th=2
@@@ Average Episode score = 1441.000000, s= 64752913,th=6
@@@ Average Episode score = 1183.000000, s= 64781579,th=5
@@@ Average Episode score = 1217.000000, s= 64783400,th=3
@@@ Average Episode score = 1431.000000, s= 64823778,th=0
@@@ Average Episode score = 1252.000000, s= 64847953,th=4
@@@ Average Episode score = 1508.000000, s= 64858656,th=7
@@@ Average Episode score = 1090.000000, s= 64887300,th=1
@@@ Average Episode score = 1522.000000, s= 64988743,th=6
@@@ Average Episode score = 1349.000000, s= 65035796,th=2
@@@ Average Episode score = 1202.000000, s= 65065045,th=5
@@@ Average Episode score = 1201.000000, s= 65143090,th=3
@@@ Average Episode score = 1492.000000, s= 65156749,th=0
@@@ Average Episode score = 1573.000000, s= 65180094,th=7
@@@ Average Episode score = 1357.000000, s= 65206441,th=4
@@@ Average Episode score = 1191.000000, s= 65217488,th=1


Itsukara commented on August 16, 2024

Summary

Run Structure

  • 0M - 100M STEP: --train-episode-steps=30 --tes-extend=True --lives-lost-reward=0.0 --lives-lost-rratio=0.0 --reset-max-reward=True --psc-use=True --no-reward-time=600 --end-mega-step=100 --log-interval=800 --color-maximizing-in-gs=False --color-averaging-in-ale=True --max-mega-step=200 --greediness=0.01 --repeat-action-ratio=0.25

Details

  • 0M - 100M STEP: --train-episode-steps=30 --tes-extend=True --lives-lost-reward=0.0 --lives-lost-rratio=0.0 --reset-max-reward=True --psc-use=True --no-reward-time=600 --end-mega-step=100 --log-interval=800 --color-maximizing-in-gs=False --color-averaging-in-ale=True --max-mega-step=200 --greediness=0.01 --repeat-action-ratio=0.25

******************** options ********************
Namespace(action_size=18, average_score_log_interval=10, basic_income=0.0, basic_income_time=100000000000000000000, checkpoint_dir='checkpoints', color_averaging_in_ale=True, color_maximizing_in_gs=False, display=True, end_mega_step=100, end_time_step=100000000, entropy_beta=0.01, frames_skip_in_ale=4, frames_skip_in_gs=1, gamma=0.99, grad_norm_clip=40.0, greediness=0.01, initial_alpha_high=0.01, initial_alpha_log_rate=0.4226, initial_alpha_low=0.0001, lives_lost_reward=0.0, lives_lost_rratio=0.0, lives_lost_weight=1.0, local_t_max=5, log_file='tmp/a3c_log', log_interval=800, max_mega_step=200, max_play_steps=4500, max_play_time=300, max_time_step=200000000, max_to_keep=None, no_reward_steps=9000, no_reward_time=600, num_experiments=1, parallel_size=8, performance_log_interval=1500, psc_beta=0.01, psc_frsize=42, psc_maxval=127, psc_use=True, randomness=0.00022222222222222223, randomness_log_interval=150.0, randomness_log_num=30, randomness_steps=4500, randomness_time=300, record_gs_screen_dir=None, record_new_record_dir=None, record_screen_dir=None, repeat_action_probability=0.0, repeat_action_ratio=0.25, reset_max_reward=True, reward_clip=1.0, rmsp_alpha=0.99, rmsp_epsilon=0.1, rom='montezuma_revenge.bin', save_mega_interval=3, save_time_interval=3000000, score_averaging_length=100, score_highest_ratio=0.5, score_log_interval=900, stack_frames_in_gs=False, terminate_on_lives_lost=False, tes_extend=True, tes_extend_ratio=5.0, train_episode_steps=30, train_in_eval=False, use_gpu=True, use_lstm=False, verbose=True)


<Average Episode score (36 lines @ 78.459M - 79.864M STEP)>
@@@ Average Episode score = 1683.000000, s= 78459385,th=2
@@@ Average Episode score = 1570.000000, s= 78474074,th=5
@@@ Average Episode score = 1600.000000, s= 78534232,th=3
@@@ Average Episode score = 1528.000000, s= 78534463,th=1
@@@ Average Episode score = 1561.000000, s= 78602286,th=0
@@@ Average Episode score = 1730.000000, s= 78649996,th=7
@@@ Average Episode score = 1596.000000, s= 78763734,th=4
@@@ Average Episode score = 1539.000000, s= 78787510,th=6
@@@ Average Episode score = 1585.000000, s= 78790811,th=5
@@@ Average Episode score = 1723.000000, s= 78799980,th=2
@@@ Average Episode score = 1532.000000, s= 78864929,th=1
@@@ Average Episode score = 1561.000000, s= 78889484,th=3
@@@ Average Episode score = 1541.000000, s= 78962613,th=0
@@@ Average Episode score = 1754.000000, s= 78976248,th=7
@@@ Average Episode score = 1621.000000, s= 78996270,th=4
@@@ Average Episode score = 1619.000000, s= 79100351,th=5
@@@ Average Episode score = 1664.000000, s= 79115006,th=2
@@@ Average Episode score = 1529.000000, s= 79128643,th=6
@@@ Average Episode score = 1552.000000, s= 79202649,th=3
@@@ Average Episode score = 1573.000000, s= 79225666,th=1
@@@ Average Episode score = 1604.000000, s= 79308819,th=4
@@@ Average Episode score = 1734.000000, s= 79311025,th=7
@@@ Average Episode score = 1561.000000, s= 79323126,th=0
@@@ Average Episode score = 1680.000000, s= 79414686,th=5
@@@ Average Episode score = 1703.000000, s= 79427641,th=2
@@@ Average Episode score = 1600.000000, s= 79437383,th=6
@@@ Average Episode score = 1569.000000, s= 79558727,th=1
@@@ Average Episode score = 1592.000000, s= 79562370,th=3
@@@ Average Episode score = 1562.000000, s= 79625564,th=4
@@@ Average Episode score = 1520.000000, s= 79657699,th=0
@@@ Average Episode score = 1636.000000, s= 79668274,th=7
@@@ Average Episode score = 1678.000000, s= 79721917,th=5
@@@ Average Episode score = 1553.000000, s= 79739129,th=6
@@@ Average Episode score = 1668.000000, s= 79739390,th=2
@@@ Average Episode score = 1520.000000, s= 79784754,th=1
@@@ Average Episode score = 1546.000000, s= 79864012,th=3
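
The learning-curve PNGs linked from the Summaries plot exactly this kind of data. A sketch that scans a log file for the @@@ lines (the tmp/a3c_log path is the log_file value from the dumps; adjust to wherever the log actually lives, and the output file name is hypothetical) and plots score against global step:

```python
import re
import matplotlib.pyplot as plt

LINE_RE = re.compile(r"@@@ Average Episode score = ([\d.]+), s= ?(\d+),th=(\d+)")

steps, scores = [], []
with open("tmp/a3c_log") as f:  # log_file from the dumps; adjust as needed
    for line in f:
        m = LINE_RE.search(line)
        if m:
            scores.append(float(m.group(1)))
            steps.append(int(m.group(2)))

plt.plot(steps, scores, ".", markersize=2)
plt.xlabel("global step (s=)")
plt.ylabel("average episode score")
plt.title("Learning curve, all threads")
plt.savefig("learning_curve.png")  # hypothetical output name
```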

