
Comments (4)

JudasDie commented on September 25, 2024

I ran the tuning process of SiamFC on OTB-2015 in the cloud with 8 GPUs, but it takes a long time to get results.

The population and group size of the GA in tune_gune are both set to 100. The population size seems quite large, so I am thinking of reducing it; is that possible? Have you tested the tuning process of SiamFC with GA using a small population size?

  1. Yes, tuning the hyper-parameters takes a long time.
  2. The total number of hyper-parameter samples needs to reach a certain scale even with a small population size; it takes about one day with 8 GPUs.
  3. Try TPE for better results (see the sketch below).
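
For reference, here is a minimal sketch of what such a TPE run could look like with Ray Tune's HyperOpt integration, over the four hyper-parameters that appear in the trial names of the log below (scale_lr, scale_penalty, scale_step, w_influence). The search ranges, the fitness trainable, and the AUC metric name are illustrative assumptions rather than the repository's actual tuning script, and the import path and reporting call vary across Ray versions:

```python
# Sketch only: a TPE (HyperOpt) search over the hyper-parameters that appear
# in the trial names of the log below. Ranges, the trainable, and the "AUC"
# metric are illustrative assumptions, not the repository's actual script.
from ray import tune
from ray.tune.suggest.hyperopt import HyperOptSearch  # ray.tune.search.hyperopt in newer Ray

def fitness(config):
    # Placeholder objective: replace with a real OTB-2015 evaluation of the
    # tracker under the sampled hyper-parameters, reporting AUC back to Tune.
    auc = -(config["scale_lr"] - 0.5) ** 2  # dummy score so the sketch runs
    tune.report(AUC=auc)

search_space = {
    "scale_lr":      tune.uniform(0.2, 0.8),
    "scale_penalty": tune.uniform(0.95, 1.0),
    "scale_step":    tune.uniform(1.0, 1.2),
    "w_influence":   tune.uniform(0.1, 0.7),
}

tune.run(
    fitness,
    config=search_space,
    num_samples=200,                              # total trial budget
    search_alg=HyperOptSearch(metric="AUC", mode="max"),
    resources_per_trial={"cpu": 1, "gpu": 0.5},   # two trials share one GPU
)
```

With TPE there is no population to size: num_samples alone controls how many configurations are evaluated, so the budget can be reduced without changing anything else.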


iminfine commented on September 25, 2024

Thanks. TPE still does not seem to work after you fixed the TAB/SPACE bugs in test_siamfc.py, and a comment in the code says GA is faster than TPE (though not fast enough; I would prefer TPE). So I ran GA instead and got error messages like this:
2019-07-15 05:55:28,916 WARNING experiment.py:30 -- `trial_resources` is deprecated. Please use `resources_per_trial`. `trial_resources` will be removed in future versions of Ray.
2019-07-15 05:55:28,916 INFO tune.py:139 -- Did not find checkpoint file in ./TPE_results/zp_tune.
2019-07-15 05:55:28,916 INFO tune.py:145 -- Starting a new experiment.
== Status ==
Using AsyncHyperBand: num_stopped=0
Bracket: Iter 180.000: None | Iter 60.000: None | Iter 20.000: None
Bracket: Iter 180.000: None | Iter 60.000: None
Bracket: Iter 180.000: None
Resources requested: 0/4 CPUs, 0/1 GPUs
Memory usage on this node: 6.7/16.7 GB

2019-07-15 05:55:29,000 WARNING logger.py:105 -- Could not instantiate <class 'ray.tune.logger._TFLogger'> - skipping.
== Status ==
Using AsyncHyperBand: num_stopped=0
Bracket: Iter 180.000: None | Iter 60.000: None | Iter 20.000: None
Bracket: Iter 180.000: None | Iter 60.000: None
Bracket: Iter 180.000: None
Resources requested: 1/4 CPUs, 0.5/1 GPUs
Memory usage on this node: 6.8/16.7 GB
Result logdir: ./TPE_results/zp_tune
PENDING trials:

  • fitness_2_scale_lr=0.5674,scale_penalty=0.9596,scale_step=1.0614,w_influence=0.3149: PENDING
  • fitness_3_scale_lr=0.5008,scale_penalty=0.9529,scale_step=1.1539,w_influence=0.2413: PENDING
  • fitness_4_scale_lr=0.6948,scale_penalty=0.9551,scale_step=1.011,w_influence=0.4641: PENDING
  • fitness_5_scale_lr=0.6827,scale_penalty=0.9836,scale_step=1.0539,w_influence=0.6642: PENDING
RUNNING trials:

  • fitness_1_scale_lr=0.2358,scale_penalty=0.9937,scale_step=1.1995,w_influence=0.5447: RUNNING

2019-07-15 05:55:29,109 WARNING logger.py:105 -- Could not instantiate <class 'ray.tune.logger._TFLogger'> - skipping.
2019-07-15 05:55:29,346 WARNING logger.py:27 -- Couldn't import TensorFlow - disabling TensorBoard logging.
2019-07-15 05:55:29,396 WARNING logger.py:27 -- Couldn't import TensorFlow - disabling TensorBoard logging.
2019-07-15 05:55:29,997 ERROR worker.py:1632 -- Failed to unpickle actor class 'WrappedFunc' for actor ID 4081809b3f48f44d15f26536b32befdb77815c5f. Traceback:
Traceback (most recent call last):
File "/home/bo/anaconda3/envs/siamDW/lib/python3.6/site-packages/ray/function_manager.py", line 632, in fetch_and_register_actor
unpickled_class = pickle.loads(pickled_class)
ModuleNotFoundError: No module named 'tracker'

2019-07-15 05:55:29,998 ERROR worker.py:1632 -- Failed to unpickle actor class 'WrappedFunc' for actor ID 7e05f09aa98b7d34320bcf1366eed7cd1bd0e8f1. Traceback:
Traceback (most recent call last):
File "/home/bo/anaconda3/envs/siamDW/lib/python3.6/site-packages/ray/function_manager.py", line 632, in fetch_and_register_actor
unpickled_class = pickle.loads(pickled_class)
ModuleNotFoundError: No module named 'tracker'

2019-07-15 05:55:29,999 ERROR trial_runner.py:413 -- Error processing event.
Traceback (most recent call last):
File "/home/bo/anaconda3/envs/siamDW/lib/python3.6/site-packages/ray/tune/trial_runner.py", line 378, in _process_events
result = self.trial_executor.fetch_result(trial)
File "/home/bo/anaconda3/envs/siamDW/lib/python3.6/site-packages/ray/tune/ray_trial_executor.py", line 228, in fetch_result
result = ray.get(trial_future[0])
File "/home/bo/anaconda3/envs/siamDW/lib/python3.6/site-packages/ray/worker.py", line 2132, in get
raise value
ray.worker.RayTaskError: ray_worker (pid=28562, host=bo-Surface-Book-2)
Exception: The actor with name WrappedFunc failed to be imported, and so cannot execute this method

2019-07-15 05:55:30,037 WARNING logger.py:105 -- Could not instantiate <class 'ray.tune.logger._TFLogger'> - skipping.
2019-07-15 05:55:30,129 ERROR trial_runner.py:413 -- Error processing event.
Traceback (most recent call last):
File "/home/bo/anaconda3/envs/siamDW/lib/python3.6/site-packages/ray/tune/trial_runner.py", line 378, in _process_events
result = self.trial_executor.fetch_result(trial)
File "/home/bo/anaconda3/envs/siamDW/lib/python3.6/site-packages/ray/tune/ray_trial_executor.py", line 228, in fetch_result
result = ray.get(trial_future[0])
File "/home/bo/anaconda3/envs/siamDW/lib/python3.6/site-packages/ray/worker.py", line 2132, in get
raise value
ray.worker.RayTaskError: ray_worker (pid=28563, host=bo-Surface-Book-2)
Exception: The actor with name WrappedFunc failed to be imported, and so cannot execute this method

2019-07-15 05:55:30,173 WARNING logger.py:105 -- Could not instantiate <class 'ray.tune.logger._TFLogger'> - skipping.
Any suggestions on this?


JudasDie commented on September 25, 2024

Rerun the code several times. Sometimes it reports No module named 'tracker' or No module named 'model' for unknown reasons.
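
For what it's worth, the No module named 'tracker' / No module named 'model' failures usually mean the Ray worker process that unpickles the trainable cannot import the repository's own modules. A common workaround (an assumption about this setup, not a verified fix for siamdw) is to make sure the project root is on PYTHONPATH before Ray starts, so locally launched workers inherit it:

```python
# Sketch of a common workaround, not a verified fix for this repository:
# make the project's own modules (e.g. 'tracker', 'model') importable in
# Ray worker processes before any trainable is pickled.
import os
import sys

# Adjust to the root of the siamdw checkout (illustrative path handling).
REPO_ROOT = os.path.dirname(os.path.abspath(__file__))

sys.path.insert(0, REPO_ROOT)  # covers the driver process itself
os.environ["PYTHONPATH"] = REPO_ROOT + os.pathsep + os.environ.get("PYTHONPATH", "")

import ray
ray.init()  # start Ray after PYTHONPATH is set so local workers inherit it
```

The shell equivalent is export PYTHONPATH=/path/to/siamdw:$PYTHONPATH before launching the tuning script.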


iminfine commented on September 25, 2024

Thanks.

