idstcv / softtriple
PyTorch Implementation for SoftTriple Loss
License: Apache License 2.0
Hi Qi Qian,
I like your paper, especially the elegant derivation. Could you clarify the roles of gamma and lambda?
As I understand it, gamma and lambda can both be viewed as SoftMax temperatures that control how concentrated or spread out the intra-class and inter-class similarity distributions are. I noticed that you set gamma to 0.1 and 1/lambda to 0.2. Does this mean you want the intra-class similarity distribution to be very concentrated and the inter-class similarity distribution to be somewhat smoother? Also, do you require gamma and 1/lambda to be less than 1, so that the temperature stays low and over-smoothing is avoided?
Thanks!
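To make the temperature reading in this question concrete, here is a minimal sketch of a SoftMax with a temperature parameter. It is not the repository's code: the function name `softmax_with_temperature` and the toy similarity values are assumptions chosen for illustration only.

```python
import math

def softmax_with_temperature(scores, temperature):
    """SoftMax over scores / temperature (hypothetical helper, not from the repo)."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)                       # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Toy cosine similarities between one example and three centers (made up).
sims = [0.9, 0.8, 0.5]
for t in (1.0, 0.2, 0.1):
    probs = softmax_with_temperature(sims, t)
    print(t, [round(p, 3) for p in probs])
```

Lower temperatures put more of the probability mass on the largest similarity, which is the "concentrated" behaviour the question describes; a temperature of 1 leaves the distribution close to uniform when the inputs are cosine similarities in [-1, 1].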
Hello, I'm trying to train the network on a dataset with more than 8k classes and I get this error:
RuntimeError: CUDA out of memory. Tried to allocate 14.00 MiB (GPU 0; 11.17 GiB total capacity; 10.68 GiB already allocated; 11.31 MiB free; 10.72 GiB reserved in total by PyTorch)
I use a Tesla K80. Training on 3k classes or fewer works fine; the limit seems to be around 3k to 4k classes. I don't think the number of images is the cause, since reducing the batch size doesn't fix the error.
Have you tried running it with a large number of classes? Is there any way to reduce the allocated memory?
Thanks!
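A rough back-of-the-envelope estimate may help frame this report. It assumes, without confirmation from this thread, that the loss keeps K centers per class and materializes a dense pairwise matrix over all C*K centers (for example for a regularizer over centers). K = 10 is the setting quoted from the paper elsewhere on this page, float32 storage is assumed, and the function name is hypothetical.

```python
def pairwise_center_matrix_gib(num_classes, centers_per_class=10, bytes_per_element=4):
    """Size in GiB of a dense (C*K) x (C*K) matrix (assumed layout, see lead-in)."""
    n = num_classes * centers_per_class
    return n * n * bytes_per_element / 2**30

for num_classes in (98, 3000, 8000):
    print(num_classes, round(pairwise_center_matrix_gib(num_classes), 2))
```

Under these assumptions a single such matrix would take about 3.4 GiB at 3,000 classes and about 23.8 GiB at 8,000 classes. That would be consistent with a limit between 3k and 4k classes on an 11 GiB card and with batch size having no effect, but it is only an estimate, not a diagnosis.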
The paper mentions that the data (x) has unit length. This even seems to be a requirement for using SoftTriple.
Is this correct? How did you guarantee it in your experiments?
Thanks!
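One common way to obtain unit-length embeddings is to L2-normalize each output vector. The sketch below shows that operation in NumPy; it is an illustration only, the helper name `l2_normalize` is an assumption, and this thread does not state where or how the repository normalizes its features.

```python
import numpy as np

def l2_normalize(X, eps=1e-12):
    """Scale each row of X to unit Euclidean length (hypothetical helper)."""
    X = np.asarray(X, dtype=np.float64)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    # eps guards against division by zero for all-zero rows.
    return X / np.maximum(norms, eps)

embeddings = np.array([[3.0, 4.0], [0.5, 0.0]])
unit = l2_normalize(embeddings)
print(np.linalg.norm(unit, axis=1))  # each row now has norm 1
```

With unit-length rows, the inner product between two embeddings equals their cosine similarity, so it stays within [-1, 1].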
Hi, @qian-qi
I ran your code with the default settings on a single Tesla V100 card, but the following error occurred:
"RuntimeError: CUDA out of memory. Tried to allocate 10.52 GiB (GPU 0; 31.72 GiB total capacity; 22.42 GiB already allocated; 8.18 GiB free; 15.33 MiB cached)"
Why does the SoftTriple loss use so much memory? Could you share your experimental settings in detail? Thank you.
Hello, I'm very interested in your deep metric learning work.
I have a question about this code in the evaluation:
    YNN = Y[indices]
    for i in range(0, len(Kset)):
        pos = 0.
        for j in range(0, num):
            if Y[j] in YNN[j, :Kset[i]]:
                pos += 1.
        recallK[i] = pos/num
    return nmi, recallK
This looks more like computing Rank-k accuracy.
Could you explain why Recall@K is computed this way? Thanks!
For reference, a document on recall@k:
https://ils.unc.edu/courses/2013_spring/inls509_001/lectures/10-EvaluationMetrics.pdf
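For readers comparing the two definitions, the sketch below reproduces what the quoted loop computes on a toy input: a query counts as a hit if at least one of its K nearest neighbours shares its label. This is the convention commonly used for Recall@K in metric learning benchmarks, and it differs from the information-retrieval definition (the fraction of all relevant items that were retrieved). The labels and neighbour table are made up, and `recall_at_k` is a hypothetical helper.

```python
import numpy as np

def recall_at_k(Y, YNN, Kset):
    """Fraction of queries with at least one same-label item among the top k."""
    Y = np.asarray(Y)
    YNN = np.asarray(YNN)
    match = YNN == Y[:, None]            # match[j, r]: r-th neighbour shares label of j
    return [float(np.any(match[:, :k], axis=1).mean()) for k in Kset]

Y = np.array([0, 0, 1, 1])               # query labels
YNN = np.array([[1, 0],                  # labels of each query's 2 nearest neighbours
                [0, 1],
                [0, 0],
                [0, 1]])
print(recall_at_k(Y, YNN, [1, 2]))       # [0.25, 0.75]
```

Only the second query has a same-label neighbour at rank 1, and the third query has none within rank 2, which gives 1/4 and 3/4.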
Because all the images are concatenated during evaluation, the function below runs out of memory:
```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import normalized_mutual_info_score


def evaluation(X, Y, Kset):
    num = X.shape[0]
    classN = np.max(Y)+1
    kmax = np.max(Kset)
    recallK = np.zeros(len(Kset))
    # compute NMI
    kmeans = KMeans(n_clusters=classN).fit(X)
    nmi = normalized_mutual_info_score(Y, kmeans.labels_, average_method='arithmetic')
    # compute Recall@K
    sim = X.dot(X.T)
    # push each item's self-similarity below every other entry
    minval = np.min(sim) - 1.
    sim -= np.diag(np.diag(sim))
    sim += np.diag(np.ones(num) * minval)
    indices = np.argsort(-sim, axis=1)[:, : kmax]
    YNN = Y[indices]
    for i in range(0, len(Kset)):
        pos = 0.
        for j in range(0, num):
            if Y[j] in YNN[j, :Kset[i]]:
                pos += 1.
        recallK[i] = pos/num
    return nmi, recallK
```
The error is raised at this line:
    sim += np.diag(np.ones(num) * minval)
with num = 122994:
MemoryError: Unable to allocate 113. GiB for an array with shape (122994, 122994) and data type float64
Could you fix it?
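One possible workaround, sketched here under assumptions and not taken from the repository, is to compute the similarities in row blocks so that the full num x num matrix is never materialized. The function name `recall_at_k_chunked` and the `chunk` parameter are hypothetical. The sketch covers only the Recall@K part (not NMI) and assumes `X` is a floating-point array of embeddings.

```python
import numpy as np

def recall_at_k_chunked(X, Y, Kset, chunk=1024):
    """Recall@K computed block by block over the query rows (illustrative sketch)."""
    X = np.asarray(X)
    Y = np.asarray(Y)
    num = X.shape[0]
    kmax = int(np.max(Kset))
    hits = np.zeros(len(Kset))
    for start in range(0, num, chunk):
        stop = min(start + chunk, num)
        # Negated similarities for this block only: shape (stop - start, num).
        neg_sim = -(X[start:stop] @ X.T)
        rows = np.arange(stop - start)
        neg_sim[rows, start + rows] = np.inf      # exclude each query itself
        # Unordered top-kmax per row, then sort just those kmax candidates.
        top = np.argpartition(neg_sim, kmax - 1, axis=1)[:, :kmax]
        order = np.argsort(np.take_along_axis(neg_sim, top, axis=1), axis=1)
        top = np.take_along_axis(top, order, axis=1)
        match = Y[top] == Y[start:stop, None]
        for i, k in enumerate(Kset):
            hits[i] += np.any(match[:, :k], axis=1).sum()
    return hits / num
```

With num = 122994 and chunk = 1024, each float64 block holds about 0.94 GiB instead of 113 GiB, although temporaries add to that figure. Results should match the full-matrix version except where exact ties in similarity are broken differently.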
Hello, thanks for your excellent work and for generously sharing the code!
I'm trying to use the SoftTriple loss function in my project. At first I kept most of your hyperparameters and only changed the number of classes (cN) to 247, but the loss quickly turned to NaN. After that I tried changing K and the margin. I also changed the initial learning rate of the loss function; with a smaller learning rate, it takes longer before the loss turns to NaN.
I still can't solve this problem, so it would be great if you had any suggestions. Thank you again!
Thanks for all your contributions. I wonder whether SoftTriple can work on one-shot problems. Also, do you have any pretrained models?
The current implementation does not appear to freeze the BatchNorm layers completely.
The following lines only stop the running mean and variance from being updated; the BN weight and bias will still change:
Lines 138 to 141 in 2417374
To avoid changing the weight and bias, the following lines could be modified as shown below:
Lines 82 to 86 in 2417374
    if args.freeze_BN:
        for m in model.modules():
            if isinstance(m, torch.nn.BatchNorm2d):
                m.weight.requires_grad = False
                m.bias.requires_grad = False
    # define loss function (criterion) and optimizer
    criterion = loss.SoftTriple(args.la, args.gamma, args.tau, args.margin, args.dim, args.C, args.K).cuda()
    optimizer = torch.optim.Adam([{"params": filter(lambda p: p.requires_grad, model.parameters()),  # model.parameters(),
                                   "lr": args.modellr},
                                  {"params": criterion.parameters(), "lr": args.centerlr}],
                                 eps=args.eps, weight_decay=args.weight_decay)
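The distinction drawn in this issue can be checked on a toy model. The sketch below assumes that the referenced training lines put the BatchNorm modules into eval mode (the thread does not show them); the helper `freeze_batchnorm` and the two-layer model are hypothetical and stand in for the real network.

```python
import torch

def freeze_batchnorm(model):
    """Stop both running-statistics updates and affine-parameter updates for BN layers."""
    for m in model.modules():
        if isinstance(m, torch.nn.BatchNorm2d):
            m.eval()                          # running mean/var are no longer updated
            m.weight.requires_grad = False    # affine scale receives no gradient
            m.bias.requires_grad = False      # affine shift receives no gradient
    return model

# Toy stand-in for the backbone: one convolution followed by one BN layer.
model = torch.nn.Sequential(torch.nn.Conv2d(3, 4, 3), torch.nn.BatchNorm2d(4))
model.train()
freeze_batchnorm(model)

trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print(trainable)  # only the convolution's parameters remain trainable
```

Calling `model.train()` again switches the BN modules back to training mode, so the eval step has to be reapplied after each such call, whereas the `requires_grad` flags persist.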
Hi, thank you for your outstanding contribution. I don't understand the proof of Proposition 1, though. Could you explain the entropy of the distribution p in detail?
Thanks very much!
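As background for this question, the following is the standard derivation showing that log-sum-exp is an entropy-regularized maximum over the simplex. It is a generic sketch, not the authors' proof: the scores s_j are placeholders, and the final substitution s_j = λ xᵀ(w_j − w_y) is an assumption about how the identity relates to the normalized SoftMax loss.

```latex
% Entropy of a distribution p on the simplex \Delta:
%   H(p) = -\sum_j p_j \log p_j
% Problem:
%   \max_{p \in \Delta} \; \sum_j p_j s_j + H(p)
%
% Lagrangian with multiplier \mu for the constraint \sum_j p_j = 1:
\mathcal{L}(p, \mu) = \sum_j p_j s_j - \sum_j p_j \log p_j + \mu \Big( 1 - \sum_j p_j \Big)

% Stationarity in p_j:
\frac{\partial \mathcal{L}}{\partial p_j} = s_j - \log p_j - 1 - \mu = 0
\;\Longrightarrow\; p_j = \exp(s_j - 1 - \mu)

% Normalizing so that \sum_j p_j = 1 gives the SoftMax distribution:
p_j^{*} = \frac{\exp(s_j)}{\sum_k \exp(s_k)}

% Substituting \log p_j^{*} = s_j - \log \sum_k \exp(s_k) into the objective:
\sum_j p_j^{*} s_j - \sum_j p_j^{*} \Big( s_j - \log \sum_k \exp(s_k) \Big)
= \log \sum_k \exp(s_k)

% With the assumed scores s_j = \lambda x^{\top} (w_j - w_y):
\log \sum_j \exp\big( \lambda x^{\top} (w_j - w_y) \big)
= -\log \frac{\exp(\lambda x^{\top} w_y)}{\sum_j \exp(\lambda x^{\top} w_j)}
```

The objective is concave in p and the stationary point has strictly positive entries, so the non-negativity constraints are inactive and the stationary point is the maximizer.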
When I set pretrained=None, the training loss does not decrease even after a very large number of epochs. How long does training the model take? Could you also share some training tips or advice about the loss?
@idstcv
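Hello, thank you for the open-source code, it is great ^^
I am using it in my own framework and have a few questions:
1. Are args.dim and args.C the dimension of the output features and the total number of classes in the training data? The parameter is set to 98.
2. If the loss is used for other retrieval tasks, do the other parameters below need to be adjusted?
3. Have you tried joint training with the SoftTriple loss plus a classification loss on re-ID tasks?
4. Why are the learning rates for the model and for the loss set differently?
5. In which situations does this loss not perform well?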
    criterion = loss.SoftTriple(args.la, args.gamma, args.tau, args.margin, args.dim, args.C, args.K).cuda()
    optimizer = torch.optim.Adam([{"params": model.parameters(), "lr": args.modellr},
                                  {"params": criterion.parameters(), "lr": args.centerlr}],
                                 eps=args.eps, weight_decay=args.weight_decay)
We refer the objective in Eqn. 10 as SoftTriple. We set τ = 0.2 and γ = 0.1 for SoftTriple. Besides, we set a small margin as δ = 0.01 to break the tie explicitly. The number of centers is set to K = 10.
Can you provide the detailed calculation for the proof of Proposition 1? I cannot reproduce your results, and I think there is something wrong with your proof.