
Comments (6)

brjathu commented on August 25, 2024

Thanks. Yes, theta is updated for every task, while psi_i is updated only for its corresponding task. The gradients are not calculated separately as you suspect: once theta and psi_i have been updated in the inner loop via the backward pass, only theta is updated in the outer loop, using a weighted average over all tasks.
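A minimal NumPy sketch of that outer step, assuming a Reptile-style update (names and shapes here are illustrative, not taken from the iTAML code):

```python
import numpy as np

def outer_update(theta, task_thetas, alpha):
    # Move the shared parameters theta toward the mean of the per-task
    # adapted parameters, blended by alpha. The task-specific psi_i are
    # NOT touched here; they were already updated in the inner loop.
    return alpha * np.mean(task_thetas, axis=0) + (1 - alpha) * theta

theta = np.zeros(3)                       # shared parameters before the outer step
task_thetas = [np.ones(3), 3 * np.ones(3)]  # adapted copies from two tasks
new_theta = outer_update(theta, task_thetas, alpha=0.5)
# mean of the task copies is [2, 2, 2]; blending with theta gives [1, 1, 1]
```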

from itaml.

genghuanlee commented on August 25, 2024

Thanks, I get it. I have another question. When I train and test on MNIST, I find something strange about the accuracy after the meta-test. The accuracy of the first task reaches almost 100 percent, while the accuracies of the second task, the third task, and so on become lower and lower. This confuses me and I can't find the answer in the code or the paper. Can you explain it to me? Thanks!


brjathu commented on August 25, 2024

Are you referring to this issue?
#10


genghuanlee commented on August 25, 2024

Thanks, I get it. I have found the reason for my issue. I want to apply the method to a new dataset, but the number of images belonging to different tasks is unbalanced. Because of this, the accuracies of the different tasks differ widely.


genghuanlee commented on August 25, 2024

Sorry to disturb you again; I have another question. When I read this code:

main_learner = Learner(model=model, args=args, trainloader=train_loader, testloader=test_loader, use_cuda=use_cuda)
main_learner.learn()
memory = inc_dataset.get_memory(memory, for_memory)
acc_task = main_learner.meta_test(main_learner.best_model, memory, inc_dataset)

I find that Learner() doesn't use the memory, while you do use memory in meta_test. But when I read the paper, I find that during training you use both the new task and the memory. Can you explain this to me?

And about my first question on the update of theta and psi: when I read the code, I am also confused. Here is the code of the outer update:

for i, (p, q) in enumerate(zip(model.parameters(), model_base.parameters())):
    alpha = np.exp(-self.args.beta * ((1.0 * self.args.sess) / self.args.num_task))
    ll = torch.stack(reptile_grads[i])
    p.data = torch.mean(ll, 0) * alpha + (1 - alpha) * q.data

Here p iterates over all of the model's parameters, which doesn't seem to match what you said: 'in the outer loop only theta is updated using a weighted average of all tasks'.


brjathu commented on August 25, 2024

No worries.

Learner takes the dataloaders as inputs, which are generated here:

task_info, train_loader, val_loader, test_loader, for_memory = inc_dataset.new_task(memory)

The train_loader contains data from both the new task and the memory.
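A hypothetical sketch of how a single loader can mix new-task samples with replayed memory samples (the dataset names and shapes here are illustrative, not iTAML's API):

```python
import torch
from torch.utils.data import TensorDataset, ConcatDataset, DataLoader

# 8 samples from the new task, 4 replayed samples from memory
new_task = TensorDataset(torch.randn(8, 4), torch.zeros(8, dtype=torch.long))
memory   = TensorDataset(torch.randn(4, 4), torch.ones(4, dtype=torch.long))

# ConcatDataset makes the loader draw batches from both sources
train_loader = DataLoader(ConcatDataset([new_task, memory]),
                          batch_size=4, shuffle=True)
```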

For the next part, the gradients are calculated only for a slice of the fully connected layer, so only the classification parameters for the current task are updated:

loss = F.binary_cross_entropy_with_logits(class_pre_ce[:, ai:bi], class_tar_ce[:, ai:bi])
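To see why slicing the logits restricts the update to one task's classifier, here is a small self-contained check (the shapes and the column range ai:bi are illustrative): backprop through the sliced loss leaves zero gradient on every output column outside the slice.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(5, 10, requires_grad=True)  # 5 samples, 10 output units
targets = torch.zeros(5, 10)
ai, bi = 2, 4                                    # this task's column range

# loss only sees columns ai:bi, so gradients flow only into that slice
loss = F.binary_cross_entropy_with_logits(logits[:, ai:bi], targets[:, ai:bi])
loss.backward()
```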

And, about updating all the parameters at once: that was much faster. However, we can use the version below as well, which updates only the task-specific rows of the classifier:

for i, (p, q) in enumerate(zip(model.parameters(), model_base.parameters())):
    alpha = np.exp(-self.args.beta * ((1.0 * self.args.sess) / self.args.num_task))
    ll = torch.stack(reptile_grads[i])
    if p.data.size()[0] == 10 and p.data.size()[1] == 256:
        for ik in sessions:
            p.data[2*ik[0]:2*(ik[0]+1), :] = ll[ik[1]][2*ik[0]:2*(ik[0]+1), :] * alpha + (1 - alpha) * q.data[2*ik[0]:2*(ik[0]+1), :]
    else:
        p.data = torch.mean(ll, 0) * alpha + (1 - alpha) * q.data
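The alpha schedule in both snippets is worth making explicit. A small sketch (beta=1.0 and num_task=10 are illustrative defaults, not the repo's settings): in early sessions alpha is near 1, so the meta-average of the task-adapted parameters dominates; in later sessions alpha shrinks, so more weight stays on the previous base parameters q.

```python
import math

def alpha(sess, beta=1.0, num_task=10):
    # matches the expression in the snippets above:
    # alpha = exp(-beta * sess / num_task), decaying with the session index
    return math.exp(-beta * (1.0 * sess) / num_task)
```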

from itaml.

