Comments (6)
No, it still does not work. The basic classification training loss gets stuck at 2.3, i.e. random guessing. I don't think initialization can play such a huge role. It might really have to do with the fact that the final output is 2D, and in this case throwing away 3/4 of the space is too difficult for the network.
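As a quick check on why 2.3 reads as random guessing: assuming 10 classes and a natural-log cross-entropy loss (both are assumptions here, not stated in the comment), a classifier that assigns every class the same probability has a loss of ln(10). A minimal sketch, with `uniform_cross_entropy` as a hypothetical helper name:

```python
import math


def uniform_cross_entropy(num_classes: int) -> float:
    """Cross-entropy (natural log) when every class gets probability 1/num_classes."""
    return -math.log(1.0 / num_classes)


# Assumed 10-class problem: the loss of a uniform guess.
print(round(uniform_cross_entropy(10), 4))  # 2.3026
```

A loss that stays near this value suggests the network output carries no class information yet.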
Update:
- Lowering the starting learning rate to 1e-3 can help make a pure ReLU network trainable. It still converges much lower than with PReLU. I should have spent a little more time on hyper-parameter search. The embedding looks as expected: 10 beams squeezed into the first quadrant.
- Using vanilla ReLU for all layers and simply changing the nonlinearity in the classification net to a PReLU, with default initialization, makes the network converge faster and gives good-looking embeddings. Constraining the embeddings to a quadrant is really too difficult.
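The quadrant effect can be illustrated without any framework. This is only a sketch, assuming the nonlinearity is applied elementwise to the 2D embedding; the sample points are made up, and the slope of 0.25 is the default initial value of PyTorch's `nn.PReLU` (in a real network the slope is learned).

```python
def relu(x: float) -> float:
    """ReLU: negative inputs are clamped to zero."""
    return max(0.0, x)


def prelu(x: float, slope: float = 0.25) -> float:
    """PReLU with a fixed slope for negative inputs (learned in a real network)."""
    return x if x >= 0 else slope * x


# Illustrative 2D pre-activation embeddings spread over several quadrants.
points = [(1.5, -2.0), (-0.5, 0.75), (-3.0, -1.0)]

relu_out = [(relu(a), relu(b)) for a, b in points]
prelu_out = [(prelu(a), prelu(b)) for a, b in points]

print(relu_out)   # every coordinate is >= 0, so all points sit in the first quadrant
print(prelu_out)  # negative coordinates survive, scaled by the slope
```

With ReLU, all classes have to share the non-negative quarter of the plane, which matches the "beams squeezed in the first quadrant" picture above.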
from siamese-triplet.
ReLU should work just as well. I chose PReLU because its outputs serve better for visualizations. And personally I like the idea of a learnable slope in the activation function (although various papers show it's not always better).
But I tried replacing PReLU with ReLU, keeping the other config unchanged, and the training does not converge. The default setup seems reasonable though. It seems a little surprising.
For training with ReLU, you need to initialize the convolutional layers more carefully. E.g. you can use Kaiming initialization with the gain for the ReLU nonlinearity. You can do it with these lines in the initialization of EmbeddingNet:
for m in self.modules():
    if isinstance(m, nn.Conv2d):
        nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
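For context, here is how that loop might sit inside a complete module. This is a sketch under assumptions: `TinyEmbeddingNet` and its layer sizes are hypothetical stand-ins, not the repository's actual EmbeddingNet, and the input is assumed to be a 1x28x28 image.

```python
import torch
import torch.nn as nn


class TinyEmbeddingNet(nn.Module):
    """Hypothetical stand-in for EmbeddingNet; layer sizes are illustrative only."""

    def __init__(self):
        super().__init__()
        self.convnet = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2, stride=2),
            nn.Conv2d(8, 16, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2, stride=2),
        )
        # 28 -> 24 -> 12 -> 8 -> 4 spatially, so 16 * 4 * 4 features feed the 2D output.
        self.fc = nn.Linear(16 * 4 * 4, 2)

        # Kaiming (He) normal initialization for every conv layer, ReLU gain.
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')

    def forward(self, x):
        out = self.convnet(x)
        return self.fc(out.flatten(start_dim=1))


net = TinyEmbeddingNet()
embedding = net(torch.randn(4, 1, 28, 28))  # batch of 4 random "images"
print(embedding.shape)
```

The loop only touches `nn.Conv2d` weights; biases and the linear layer keep PyTorch's default initialization.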
@w-hc did you try training with this change?
I will try in a minute. Sorry.