Comments (3)
MAML requires second-order gradients, which are computationally expensive. For faster training, we use a first-order approximation of the gradients at the beginning of training, and then switch to the full second-order gradients after SECOND_ORDER_GRAD_ITER iterations. In other words, SECOND_ORDER_GRAD_ITER decides for how many steps the gradients are approximated to first order.
Regarding self.total_loss1: you are right, it is not used for training. You may ignore that loss and its corresponding optimizer.
from mzsr.
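To make the first-order vs. second-order distinction concrete, here is a minimal, hand-differentiated sketch (not the MZSR code; the function names and the value of SECOND_ORDER_GRAD_ITER are illustrative assumptions). It uses a scalar quadratic task loss L(w) = (w - t)^2 and a single inner gradient step, so the exact meta-gradient and its first-order approximation can both be written by hand:

```python
# Minimal sketch, NOT the MZSR implementation: scalar task loss L(w) = (w - t)^2,
# one inner gradient step w' = w - alpha * dL/dw.
def inner_step(w, t, alpha):
    return w - alpha * 2.0 * (w - t)

def meta_grad(w, t, alpha, second_order):
    w_adapted = inner_step(w, t, alpha)
    g_query = 2.0 * (w_adapted - t)       # dL/dw' evaluated at the adapted weights
    if second_order:
        # Full MAML: chain rule through the inner update, dw'/dw = 1 - 2*alpha.
        return g_query * (1.0 - 2.0 * alpha)
    # First-order approximation: treat dw'/dw as 1 (ignore the inner-update Jacobian).
    return g_query

SECOND_ORDER_GRAD_ITER = 1000             # assumed value, for illustration only

def grad_at(step, w, t, alpha):
    # Cheap first-order gradients early on, exact second-order gradients later.
    return meta_grad(w, t, alpha, second_order=(step >= SECOND_ORDER_GRAD_ITER))
```

For w = 1, t = 0, alpha = 0.1, the adapted weight is w' = 0.8, so the first-order meta-gradient is 2(0.8) = 1.6 while the exact one is 1.6 * (1 - 0.2) = 1.28: the approximation drops the (1 - 2*alpha) factor, which is exactly the term that requires backpropagating through the inner update.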
Dear Sir,
Amazing work! Congratulations!!
I have a question: could you kindly tell me the full path where I should place the checkpoint of the model from large-scale training, so that it can be used as a pre-trained model for meta-transfer training? At the moment the code says there is no checkpoint file.
I'm looking forward to your reply.
Thanks in advance.
from mzsr.
Could you kindly explain how this loss weight is calculated?
def get_loss_weights(self):
    # Start uniform: each of the TASK_ITER inner-update losses gets weight 1/TASK_ITER.
    loss_weights = tf.ones(shape=[self.TASK_ITER]) * (1.0 / self.TASK_ITER)
    decay_rate = 1.0 / self.TASK_ITER / (10000 / 3)
    min_value = 0.03 / self.TASK_ITER
    # Weights of all but the last inner update decay linearly toward min_value.
    loss_weights_pre = tf.maximum(
        loss_weights[:-1] - tf.to_float(self.global_step) * decay_rate, min_value)
    # The last update's weight grows by the same total amount, so the weights sum to 1.
    loss_weight_cur = tf.minimum(
        loss_weights[-1] + tf.to_float(self.global_step) * (self.TASK_ITER - 1) * decay_rate,
        1.0 - (self.TASK_ITER - 1) * min_value)
    loss_weights = tf.concat([[loss_weights_pre], [[loss_weight_cur]]], axis=1)
    return loss_weights
from mzsr.
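To trace the schedule without running TensorFlow, here is a NumPy re-implementation of the same arithmetic (the function name is mine; the constants 10000/3 and 0.03 are taken from the snippet above). The TASK_ITER per-step weights start uniform at 1/TASK_ITER; the first TASK_ITER - 1 weights decay linearly to a floor of 0.03/TASK_ITER while the last weight absorbs exactly what the others lose, so the weights sum to 1 at every step:

```python
import numpy as np

def loss_weight_schedule(task_iter, global_step):
    """NumPy sketch of the MZSR loss-weight schedule (mirrors the TF code above)."""
    weights = np.ones(task_iter) / task_iter           # uniform start: 1/TASK_ITER each
    decay_rate = 1.0 / task_iter / (10000 / 3)         # full transition after ~3333 steps
    min_value = 0.03 / task_iter                       # floor for the early-step weights
    # All but the last weight decay linearly toward the floor...
    pre = np.maximum(weights[:-1] - global_step * decay_rate, min_value)
    # ...while the last weight grows by the same total amount, capped so the sum stays 1.
    cur = np.minimum(weights[-1] + global_step * (task_iter - 1) * decay_rate,
                     1.0 - (task_iter - 1) * min_value)
    return np.append(pre, cur)
```

With task_iter = 5, the weights start at [0.2, 0.2, 0.2, 0.2, 0.2]; after roughly 3333 steps they settle at [0.006, 0.006, 0.006, 0.006, 0.976]. The effect is that early in meta-training every inner-update loss contributes equally, while later in training the loss after the final inner update dominates.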
Related Issues (20)
- Variable dimensions are incompatible while calculating l1_loss (during Large-Scale_Training)
- about the high_resolution image
- How to obtain X3 experimental results
- Unable to create event file
- Why is the downsampling operator implemented by a model rather than an algorithm in the meta-test step
- When I ran large-scale training code, I have some problems. Could you help me?
- UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd5 in position 86: invalid continuation byte
- Reproduction with the given model
- During reproducing “bicubic” downsampling scenario...
- Where is the path I should insert of checkpoint the trained large scale training model ?
- Error during Large Scale Training
- Problem when i load the pretrained model , specially when it reads the checkpoint
- I do not understand how to calculate the weight loss ?
- Error during large scale training
- AlreadyExistsError during Meta-training
- Use MZSR without CUDA?
- Sir,I have a problem when training
- train the model
- Model results
- distributed training