Comments (3)
MAML requires second-order gradients, which are computationally expensive. For faster training, we use a first-order approximation of the gradients at the beginning of training, and then switch to the full second-order gradients after SECOND_ORDER_GRAD_ITER iterations. In other words, SECOND_ORDER_GRAD_ITER decides for how many steps the gradients are approximated to first order.
Regarding self.total_loss1: you are right, it is not used for training. You may ignore that loss and its corresponding optimizer.
from mzsr.
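To make the first-order vs. second-order distinction concrete, here is a minimal, hand-differentiated sketch (not the MZSR code; the function names and the value of SECOND_ORDER_GRAD_ITER are illustrative assumptions). It uses a scalar quadratic task loss L(w) = (w - t)^2 and a single inner gradient step, so the exact meta-gradient and its first-order approximation can both be written by hand:

```python
# Minimal sketch, NOT the MZSR implementation: scalar task loss L(w) = (w - t)^2,
# one inner gradient step w' = w - alpha * dL/dw.
def inner_step(w, t, alpha):
    return w - alpha * 2.0 * (w - t)

def meta_grad(w, t, alpha, second_order):
    w_adapted = inner_step(w, t, alpha)
    g_query = 2.0 * (w_adapted - t)       # dL/dw' evaluated at the adapted weights
    if second_order:
        # Full MAML: chain rule through the inner update, dw'/dw = 1 - 2*alpha.
        return g_query * (1.0 - 2.0 * alpha)
    # First-order approximation: treat dw'/dw as 1 (ignore the inner-update Jacobian).
    return g_query

SECOND_ORDER_GRAD_ITER = 1000             # assumed value, for illustration only

def grad_at(step, w, t, alpha):
    # Cheap first-order gradients early on, exact second-order gradients later.
    return meta_grad(w, t, alpha, second_order=(step >= SECOND_ORDER_GRAD_ITER))
```

For w = 1, t = 0, alpha = 0.1, the adapted weight is w' = 0.8, so the first-order meta-gradient is 2(0.8) = 1.6 while the exact one is 1.6 * (1 - 0.2) = 1.28: the approximation drops the (1 - 2*alpha) factor, which is exactly the term that requires backpropagating through the inner update.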
Dear Sir,
Amazing work! Congratulations!!
I have a question: could you kindly tell me the full path where I should place the checkpoint of the model from large-scale training, so that it can be used as a pre-trained model for meta-transfer training? At the moment the code says there is no checkpoint file.
I'm looking forward to your reply.
Thanks in advance.
from mzsr.
Could you kindly explain how this loss weight is calculated?
def get_loss_weights(self):
    # Start uniform: each of the TASK_ITER inner-update losses gets weight 1/TASK_ITER.
    loss_weights = tf.ones(shape=[self.TASK_ITER]) * (1.0 / self.TASK_ITER)
    decay_rate = 1.0 / self.TASK_ITER / (10000 / 3)
    min_value = 0.03 / self.TASK_ITER
    # Weights of all but the last inner update decay linearly toward min_value.
    loss_weights_pre = tf.maximum(
        loss_weights[:-1] - tf.to_float(self.global_step) * decay_rate, min_value)
    # The last update's weight grows by the same total amount, so the weights sum to 1.
    loss_weight_cur = tf.minimum(
        loss_weights[-1] + tf.to_float(self.global_step) * (self.TASK_ITER - 1) * decay_rate,
        1.0 - (self.TASK_ITER - 1) * min_value)
    loss_weights = tf.concat([[loss_weights_pre], [[loss_weight_cur]]], axis=1)
    return loss_weights
from mzsr.
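To trace the schedule without running TensorFlow, here is a NumPy re-implementation of the same arithmetic (the function name is mine; the constants 10000/3 and 0.03 are taken from the snippet above). The TASK_ITER per-step weights start uniform at 1/TASK_ITER; the first TASK_ITER - 1 weights decay linearly to a floor of 0.03/TASK_ITER while the last weight absorbs exactly what the others lose, so the weights sum to 1 at every step:

```python
import numpy as np

def loss_weight_schedule(task_iter, global_step):
    """NumPy sketch of the MZSR loss-weight schedule (mirrors the TF code above)."""
    weights = np.ones(task_iter) / task_iter           # uniform start: 1/TASK_ITER each
    decay_rate = 1.0 / task_iter / (10000 / 3)         # full transition after ~3333 steps
    min_value = 0.03 / task_iter                       # floor for the early-step weights
    # All but the last weight decay linearly toward the floor...
    pre = np.maximum(weights[:-1] - global_step * decay_rate, min_value)
    # ...while the last weight grows by the same total amount, capped so the sum stays 1.
    cur = np.minimum(weights[-1] + global_step * (task_iter - 1) * decay_rate,
                     1.0 - (task_iter - 1) * min_value)
    return np.append(pre, cur)
```

With task_iter = 5, the weights start at [0.2, 0.2, 0.2, 0.2, 0.2]; after roughly 3333 steps they settle at [0.006, 0.006, 0.006, 0.006, 0.976]. The effect is that early in meta-training every inner-update loss contributes equally, while later in training the loss after the final inner update dominates.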
Related Issues (20)
- Variable dimensions are incompatible while calculating l1_loss (during Large-Scale_Training)
- about the high_resolution image
- How to obtain X3 experimental results
- Unable to create event file
- Why is the downsampling operator implemented by a model rather than an algorithm in the meta-test step
- When I ran large-scale training code, I have some problems. Could you help me?
- UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd5 in position 86: invalid continuation byte
- Reproduction with the given model
- During reproducing “bicubic” downsampling scenario...
- Where is the path I should insert of checkpoint the trained large scale training model ?
- Error during Large Scale Training
- Problem when i load the pretrained model , specially when it reads the checkpoint
- I do not understand how to calculate the weight loss ?
- Error during large scale training
- AlreadyExistsError during Meta-training
- Use MZSR without CUDA?
- Sir,I have a problem when training
- train the model
- Model results
- distributed training