
Comments (8)

jik0730 avatar jik0730 commented on August 23, 2024

Hi, thanks for your interest in the code.
What about trying params['meta_learner.features.1.bn1.running_mean']?
The reason is that the code optimizes an instance of the 'MetaLearner' class, which stores the network in an attribute 'self.meta_learner'; as a result, every key in the parameter dictionary starts with the 'meta_learner.' prefix.
If there are other issues, let me know :)
Thank you!
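A minimal sketch of why the prefix appears (the module names here are illustrative, not the repo's actual architecture — any `nn.Module` that stores its network in an attribute behaves this way):

```python
import torch.nn as nn

# Hypothetical wrapper mirroring the structure described above: the optimized
# model holds the actual network in an attribute, so every state_dict key is
# prefixed with that attribute's name.
class MetaLearner(nn.Module):
    def __init__(self):
        super().__init__()
        self.meta_learner = nn.Sequential(
            nn.Conv2d(1, 4, kernel_size=3),
            nn.BatchNorm2d(4),
        )

    def forward(self, x):
        return self.meta_learner(x)

model = MetaLearner()
params = model.state_dict()

# Every key carries the attribute name as a prefix,
# e.g. 'meta_learner.1.running_mean' for the BatchNorm buffer.
assert all(k.startswith('meta_learner.') for k in params)
assert 'meta_learner.1.running_mean' in params
```

The same rule applies recursively, which is why the full key in the repo also includes the inner submodule names (`features`, `bn1`, ...).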

from maml-in-pytorch.

rshaojimmy avatar rshaojimmy commented on August 23, 2024

> Hi, thanks for your interest in the code.
> What about trying params['meta_learner.features.1.bn1.running_mean']?
> The reason is that the code optimizes an instance of the 'MetaLearner' class, which stores the network in an attribute 'self.meta_learner'; as a result, every key in the parameter dictionary starts with the 'meta_learner.' prefix.
> If there are other issues, let me know :)
> Thank you!

Yes, that solved my previous problem. But I have found another implementation noting that BatchNorm's running_mean and running_var carry no gradient, so these two should be excluded when computing theta_old and theta_new.

May I ask whether you agree with it? Thanks.

jik0730 avatar jik0730 commented on August 23, 2024

Yes, running_mean and running_var in BatchNorm have no gradient, since they are just buffers. And yes, those two should be excluded when computing theta.
In our implementation those buffers are never used, because we never switch the model to eval() mode. For each task, in training and evaluation alike, the batch norm layers use the current task's (mini-batch's) statistics in the forward pass. This has to be the case, since each task's data distribution may differ from the others'.
Thanks!
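The buffer-vs-parameter distinction can be checked directly in PyTorch (a small self-contained sketch, independent of this repo's code):

```python
import torch.nn as nn

bn = nn.BatchNorm2d(4)

# weight/bias are nn.Parameters and receive gradients; running_mean and
# running_var are registered buffers and never appear in bn.parameters().
param_names = {name for name, _ in bn.named_parameters()}
buffer_names = {name for name, _ in bn.named_buffers()}

assert param_names == {'weight', 'bias'}
assert {'running_mean', 'running_var'} <= buffer_names
assert not bn.running_mean.requires_grad

# So when computing theta_new = theta_old - lr * grad, iterating over
# named_parameters() excludes the buffers automatically.
```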

rshaojimmy avatar rshaojimmy commented on August 23, 2024

> For each task given in training or evaluation, in batch norm layer, it uses current task's (mini-batch's) statistics for forward path. This has to be true since for each task its data distribution may differ from other tasks.

Thanks for your reply!

Does this mean that the momentum for batch norm should be set to 1?

jik0730 avatar jik0730 commented on August 23, 2024

Yes, but only if you put the model into eval() mode. In our current code the momentum value does not matter, since we never call eval().
However, if you do use eval() mode, and test points arrive one by one, the momentum should be set to 1, which means evaluation will use the support set's batch norm statistics. That said, many meta-learning papers assume that each task (episode) provides both a support set and a query set.
I think the more correct way to benefit from this transductive setting is to forward the support and query sets together: update the meta-parameters to task-specific parameters with the support set, then put the model into eval() mode with momentum=1 to evaluate the query set.
Correct me if I'm wrong, thanks!
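That recipe can be sketched with toy tensors (the model and set sizes below are illustrative, not the repo's; with momentum=1, one forward pass over the support set in train mode overwrites the running statistics with the support set's statistics, which eval() then reuses for the query set):

```python
import torch
import torch.nn as nn

# Illustrative model: momentum=1 makes the running stats equal the last batch's stats.
model = nn.Sequential(nn.BatchNorm1d(8, momentum=1.0), nn.Linear(8, 5))

support = torch.randn(25, 8)   # e.g. a 5-way 5-shot support set
query = torch.randn(75, 8)

model.train()
_ = model(support)             # running stats <- support set statistics

# With momentum=1 the history is discarded: running_mean is exactly the support mean.
bn = model[0]
assert torch.allclose(bn.running_mean, support.mean(dim=0), atol=1e-5)

model.eval()
query_logits = model(query)    # query set normalized with the support statistics
```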

rshaojimmy avatar rshaojimmy commented on August 23, 2024

Yes, but I find that the momentum is set to 1 by default in your implementation, with the comment "NOTE we do not need to care about running_mean and var since momentum=1". However, the default value in PyTorch's batch norm is 0.1.

I wonder whether the value of the momentum will affect the process of meta-learning, and may I ask why you set the value of momentum to 1 by default?

Thank you very much.

jik0730 avatar jik0730 commented on August 23, 2024
  1. I wonder whether the value of the momentum will affect the process of meta-learning ...
    I'm pretty sure the momentum value will not affect the training of meta-learning (here, few-shot learning), because it is used only to compute the moving average of the batch-norm statistics, which are used in evaluation after calling model.eval().

  2. may I ask why you set the value of momentum by 1 by default?
    If we set momentum=1, the batch-norm statistics are set to the previous mini-batch's statistics, which is not desirable in a general supervised learning task (that's why the default value is 0.1). However, in the meta-learning context, in the meta-evaluation phase we want the batch-norm statistics to be computed only from the current task's support set, and then to compute predictions for the query set with those statistics. That's why we set momentum=1 by default.

I agree the NOTE in the code might be confusing, and it will be corrected soon. Thanks for asking! If you want further clarification on this issue, let me know.
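The effect of the momentum value can be verified directly. PyTorch's update rule is running_stat <- (1 - momentum) * running_stat + momentum * batch_stat, so the default 0.1 blends all past mini-batches while momentum=1 replaces the stats with the current batch's (a small sketch, independent of this repo's code):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x1, x2 = torch.randn(32, 8), torch.randn(32, 8)

bn_default = nn.BatchNorm1d(8)                 # momentum=0.1 (PyTorch default)
bn_replace = nn.BatchNorm1d(8, momentum=1.0)   # replace stats each batch

for bn in (bn_default, bn_replace):
    bn.train()
    bn(x1)
    bn(x2)

# momentum=1: running_mean is exactly the last batch's mean
assert torch.allclose(bn_replace.running_mean, x2.mean(dim=0), atol=1e-5)
# momentum=0.1: running_mean still mixes in x1's statistics
assert not torch.allclose(bn_default.running_mean, x2.mean(dim=0))
```

Either way, in train mode the forward pass itself always normalizes with the current batch's statistics; momentum only affects the buffers that eval() mode reads.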

rshaojimmy avatar rshaojimmy commented on August 23, 2024
>   1. I wonder whether the value of the momentum will affect the process of meta-learning ...
>     I'm pretty sure the momentum value will not affect the training of meta-learning (here, few-shot learning), because it is used only to compute the moving average of the batch-norm statistics, which are used in evaluation after calling model.eval().
>   2. may I ask why you set the value of momentum by 1 by default?
>     If we set momentum=1, the batch-norm statistics are set to the previous mini-batch's statistics, which is not desirable in a general supervised learning task (that's why the default value is 0.1). However, in the meta-learning context, in the meta-evaluation phase we want the batch-norm statistics to be computed only from the current task's support set, and then to compute predictions for the query set with those statistics. That's why we set momentum=1 by default.
>
> I agree the NOTE in the code might be confusing, and it will be corrected soon. Thanks for asking! If you want further clarification on this issue, let me know.

Noted! Thank you very much!
