Comments (9)
see https://github.com/wy1iu/LargeMargin_Softmax_Loss#notes-for-training
from sphereface.
For large and difficult datasets, you should first try to set lambda_min as 5 or 10
Is this the one?
However, when I use sphereface-20, I can use a small lambda_min and the results are good.
It really depends on the dataset and the network architecture. There is no universally good hyper-parameter in all cases. You should change it depending on your task, dataset and the network.
OK, how do I determine lambda_min?
Should I go by the training softmax_loss, the training accuracy, or the test results?
Trying a large value first and then gradually decreasing it could be a good strategy.
Besides lambda_min, the other hyperparameters also affect convergence.
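As a rough illustration of the "start large, then decrease" idea, here is a minimal sketch. It assumes lambda is annealed as `max(lambda_min, base * (1 + gamma * iteration) ** -power)`; that formula, the parameter names `base`, `gamma` and `power`, and the default values are assumptions made for illustration, not something stated in this thread. `candidate_floors` is a hypothetical helper that lists lambda_min values to try in successive training runs.

```python
def annealed_lambda(iteration, base=1000.0, gamma=0.12, power=1.0, lambda_min=5.0):
    """Decay lambda from `base` toward the floor `lambda_min` (assumed schedule)."""
    value = base * (1.0 + gamma * iteration) ** (-power)
    # lambda never drops below the floor, however long training runs
    return max(value, lambda_min)


def candidate_floors(start=1000.0, stop=5.0, factor=0.5):
    """Yield lambda_min candidates from large to small, one per training run."""
    value = start
    while value > stop:
        yield value
        value *= factor
    # always finish with the smallest floor we are willing to try
    yield stop
```

One way to use this is to train once per value from `candidate_floors()`, keep the smallest floor at which training still converges, and compare those runs on held-out data rather than on training loss alone.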
Okay, I'm decreasing it from 1000, and my boss wants me to drop it to 5.
I'll try more values!
Thanks again.
Best!
87.3365 is a magic number: if the data is NaN, the softmax loss will be 87.3365. You can print the debug info to see what is going wrong. I think you can also print x_norm and the weight norm in the margin_inner_product layer; x_norm may be 0, which would make the data NaN.
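For anyone wondering where the number comes from: 87.3365 matches -log of the smallest positive normal float32 (FLT_MIN = 2^-126). The sketch below assumes the loss clamps an unusable probability to FLT_MIN before taking the log, and that a zero-norm feature produces NaN through 0/0. The function names `clamped_neg_log` and `l2_normalize` are made up for this illustration and are not part of any library.

```python
import math

FLT_MIN = 2.0 ** -126  # smallest positive normal float32


def clamped_neg_log(prob):
    """Return -log(prob), with NaN or tiny probabilities clamped to FLT_MIN (assumed behaviour)."""
    # NaN fails every comparison, so it has to be checked explicitly
    if math.isnan(prob) or prob < FLT_MIN:
        prob = FLT_MIN
    return -math.log(prob)


def l2_normalize(x):
    """Divide x by its L2 norm; a zero norm yields NaN, mirroring 0/0 in floating point."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm == 0.0:
        return [float("nan")] * len(x)
    return [v / norm for v in x]
```

Under these assumptions, a zero-norm input turns into NaN features, and both `clamped_neg_log(float("nan"))` and `clamped_neg_log(0.0)` come out at about 87.3365. That is why a loss stuck at exactly this value points to NaN or zero probabilities rather than to slow learning.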
@fromwhzz I ran into the same problem. Following your suggestion, I found L1 norm = (nan, nan) and L2 norm = (nan, nan), and the loss is always 87.3365. What should I do to solve it? Thanks a lot.
@wy1iu
For a custom dataset, I have 850 ids for training and 877 ids for testing, with 909 unique ids in total.
What number should I use for the margin_inner_product in the softmax loss? Also, I have the same problem: softmaxloss=87.335, and the final output of net.forward with a known image is NaN.
Thanks
Chandra