
Comments (4)

JudasDie commented on September 25, 2024

I ran the tuning process of SiamFC on OTB-2015 in the cloud with 8 GPUs, but it takes a long time to get results.

The population and group size of the GA in tune_gune are both set to 100. The population size seems quite large, so I am thinking of reducing it; is that possible? Have you tested the tuning process of SiamFC with GA using a small population size?

  1. Yes, tuning the hyper-parameters takes a long time.
  2. The total number of hyper-parameter samples needs to reach a certain scale even with a small population size; it takes about one day with 8 GPUs.
  3. Try TPE for better results (see the sketch below).
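
For reference, here is a minimal sketch of what such a TPE run could look like with Ray Tune's HyperOpt integration, over the four hyper-parameters that appear in the trial names of the log below (scale_lr, scale_penalty, scale_step, w_influence). The search ranges, the fitness trainable, and the AUC metric name are illustrative assumptions rather than the repository's actual tuning script, and the import path and reporting call vary across Ray versions:

```python
# Sketch only: a TPE (HyperOpt) search over the hyper-parameters that appear
# in the trial names of the log below. Ranges, the trainable, and the "AUC"
# metric are illustrative assumptions, not the repository's actual script.
from ray import tune
from ray.tune.suggest.hyperopt import HyperOptSearch  # ray.tune.search.hyperopt in newer Ray

def fitness(config):
    # Placeholder objective: replace with a real OTB-2015 evaluation of the
    # tracker under the sampled hyper-parameters, reporting AUC back to Tune.
    auc = -(config["scale_lr"] - 0.5) ** 2  # dummy score so the sketch runs
    tune.report(AUC=auc)

search_space = {
    "scale_lr":      tune.uniform(0.2, 0.8),
    "scale_penalty": tune.uniform(0.95, 1.0),
    "scale_step":    tune.uniform(1.0, 1.2),
    "w_influence":   tune.uniform(0.1, 0.7),
}

tune.run(
    fitness,
    config=search_space,
    num_samples=200,                              # total trial budget
    search_alg=HyperOptSearch(metric="AUC", mode="max"),
    resources_per_trial={"cpu": 1, "gpu": 0.5},   # two trials share one GPU
)
```

With TPE there is no population to size: num_samples alone controls how many configurations are evaluated, so the budget can be reduced without changing anything else.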


iminfine commented on September 25, 2024

Thanks. TPE still does not seem to work after you fixed the TAB/SPACE bugs in test_siamfc.py, and a comment in the code says GA is faster than TPE (though not fast enough; I would prefer TPE). So I ran GA instead and got error messages like this:
2019-07-15 05:55:28,916 WARNING experiment.py:30 -- `trial_resources` is deprecated. Please use `resources_per_trial`. `trial_resources` will be removed in future versions of Ray.
2019-07-15 05:55:28,916 INFO tune.py:139 -- Did not find checkpoint file in ./TPE_results/zp_tune.
2019-07-15 05:55:28,916 INFO tune.py:145 -- Starting a new experiment.
== Status ==
Using AsyncHyperBand: num_stopped=0
Bracket: Iter 180.000: None | Iter 60.000: None | Iter 20.000: None
Bracket: Iter 180.000: None | Iter 60.000: None
Bracket: Iter 180.000: None
Resources requested: 0/4 CPUs, 0/1 GPUs
Memory usage on this node: 6.7/16.7 GB

2019-07-15 05:55:29,000 WARNING logger.py:105 -- Could not instantiate <class 'ray.tune.logger._TFLogger'> - skipping.
== Status ==
Using AsyncHyperBand: num_stopped=0
Bracket: Iter 180.000: None | Iter 60.000: None | Iter 20.000: None
Bracket: Iter 180.000: None | Iter 60.000: None
Bracket: Iter 180.000: None
Resources requested: 1/4 CPUs, 0.5/1 GPUs
Memory usage on this node: 6.8/16.7 GB
Result logdir: ./TPE_results/zp_tune
PENDING trials:

  • fitness_2_scale_lr=0.5674,scale_penalty=0.9596,scale_step=1.0614,w_influence=0.3149: PENDING
  • fitness_3_scale_lr=0.5008,scale_penalty=0.9529,scale_step=1.1539,w_influence=0.2413: PENDING
  • fitness_4_scale_lr=0.6948,scale_penalty=0.9551,scale_step=1.011,w_influence=0.4641: PENDING
  • fitness_5_scale_lr=0.6827,scale_penalty=0.9836,scale_step=1.0539,w_influence=0.6642: PENDING
RUNNING trials:

  • fitness_1_scale_lr=0.2358,scale_penalty=0.9937,scale_step=1.1995,w_influence=0.5447: RUNNING

2019-07-15 05:55:29,109 WARNING logger.py:105 -- Could not instantiate <class 'ray.tune.logger._TFLogger'> - skipping.
2019-07-15 05:55:29,346 WARNING logger.py:27 -- Couldn't import TensorFlow - disabling TensorBoard logging.
2019-07-15 05:55:29,396 WARNING logger.py:27 -- Couldn't import TensorFlow - disabling TensorBoard logging.
2019-07-15 05:55:29,997 ERROR worker.py:1632 -- Failed to unpickle actor class 'WrappedFunc' for actor ID 4081809b3f48f44d15f26536b32befdb77815c5f. Traceback:
Traceback (most recent call last):
File "/home/bo/anaconda3/envs/siamDW/lib/python3.6/site-packages/ray/function_manager.py", line 632, in fetch_and_register_actor
unpickled_class = pickle.loads(pickled_class)
ModuleNotFoundError: No module named 'tracker'

2019-07-15 05:55:29,998 ERROR worker.py:1632 -- Failed to unpickle actor class 'WrappedFunc' for actor ID 7e05f09aa98b7d34320bcf1366eed7cd1bd0e8f1. Traceback:
Traceback (most recent call last):
File "/home/bo/anaconda3/envs/siamDW/lib/python3.6/site-packages/ray/function_manager.py", line 632, in fetch_and_register_actor
unpickled_class = pickle.loads(pickled_class)
ModuleNotFoundError: No module named 'tracker'

2019-07-15 05:55:29,999 ERROR trial_runner.py:413 -- Error processing event.
Traceback (most recent call last):
File "/home/bo/anaconda3/envs/siamDW/lib/python3.6/site-packages/ray/tune/trial_runner.py", line 378, in _process_events
result = self.trial_executor.fetch_result(trial)
File "/home/bo/anaconda3/envs/siamDW/lib/python3.6/site-packages/ray/tune/ray_trial_executor.py", line 228, in fetch_result
result = ray.get(trial_future[0])
File "/home/bo/anaconda3/envs/siamDW/lib/python3.6/site-packages/ray/worker.py", line 2132, in get
raise value
ray.worker.RayTaskError: ray_worker (pid=28562, host=bo-Surface-Book-2)
Exception: The actor with name WrappedFunc failed to be imported, and so cannot execute this method

2019-07-15 05:55:30,037 WARNING logger.py:105 -- Could not instantiate <class 'ray.tune.logger._TFLogger'> - skipping.
2019-07-15 05:55:30,129 ERROR trial_runner.py:413 -- Error processing event.
Traceback (most recent call last):
File "/home/bo/anaconda3/envs/siamDW/lib/python3.6/site-packages/ray/tune/trial_runner.py", line 378, in _process_events
result = self.trial_executor.fetch_result(trial)
File "/home/bo/anaconda3/envs/siamDW/lib/python3.6/site-packages/ray/tune/ray_trial_executor.py", line 228, in fetch_result
result = ray.get(trial_future[0])
File "/home/bo/anaconda3/envs/siamDW/lib/python3.6/site-packages/ray/worker.py", line 2132, in get
raise value
ray.worker.RayTaskError: ray_worker (pid=28563, host=bo-Surface-Book-2)
Exception: The actor with name WrappedFunc failed to be imported, and so cannot execute this method

2019-07-15 05:55:30,173 WARNING logger.py:105 -- Could not instantiate <class 'ray.tune.logger._TFLogger'> - skipping.
Any suggestions on this?


JudasDie commented on September 25, 2024

Rerun the code several times. Sometimes it reports No module named 'tracker' or No module named 'model' for unknown reasons.
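
For what it's worth, the No module named 'tracker' / No module named 'model' failures usually mean the Ray worker process that unpickles the trainable cannot import the repository's own modules. A common workaround (an assumption about this setup, not a verified fix for siamdw) is to make sure the project root is on PYTHONPATH before Ray starts, so locally launched workers inherit it:

```python
# Sketch of a common workaround, not a verified fix for this repository:
# make the project's own modules (e.g. 'tracker', 'model') importable in
# Ray worker processes before any trainable is pickled.
import os
import sys

# Adjust to the root of the siamdw checkout (illustrative path handling).
REPO_ROOT = os.path.dirname(os.path.abspath(__file__))

sys.path.insert(0, REPO_ROOT)  # covers the driver process itself
os.environ["PYTHONPATH"] = REPO_ROOT + os.pathsep + os.environ.get("PYTHONPATH", "")

import ray
ray.init()  # start Ray after PYTHONPATH is set so local workers inherit it
```

The shell equivalent is export PYTHONPATH=/path/to/siamdw:$PYTHONPATH before launching the tuning script.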


iminfine commented on September 25, 2024

Thanks.

