
softtriple's Issues

Clarification on gamma and lambda

Hi Qi Qian,
I like your paper, especially the elegant derivation part. Could I ask for some clarification on gamma and lambda?
From my understanding, gamma and lambda can also be viewed as temperatures in the softmax, which control how concentrated or spread out the intra-class and inter-class similarity distributions are. I noticed that you set gamma to 0.1 and 1/lambda to 0.2, which means you want the intra-class similarity distribution to be very concentrated and the inter-class similarity distribution to be a little smoother. Is my understanding correct? Also, do you require gamma and 1/lambda to be less than 1 to give a low temperature and avoid over-smoothing?
Thanks!
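
For context, a tiny sketch (my own illustration, not taken from the paper or the repository) of how a temperature-like scale changes how concentrated a softmax distribution is; here `scale` plays the role of 1/gamma or lambda:

```python
# Illustrative only: larger scales (lower temperatures) concentrate the softmax
# mass on the largest similarity, smaller scales spread it out.
import torch
import torch.nn.functional as F

sims = torch.tensor([0.9, 0.7, 0.2])           # example similarity scores

for scale in [1.0, 1.0 / 0.2, 1.0 / 0.1]:       # scale = 1, 5, 10
    p = F.softmax(scale * sims, dim=0)
    print(f"scale={scale:4.1f} -> {p.tolist()}")
```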

Out of Memory error when training on big number of classes

Hello, I'm trying to train the network on a dataset of 8k+ classes and I have this error:

RuntimeError: CUDA out of memory. Tried to allocate 14.00 MiB (GPU 0; 11.17 GiB total capacity; 10.68 GiB already allocated; 11.31 MiB free; 10.72 GiB reserved in total by PyTorch)
I use a Tesla K80. When training on 3k classes or fewer everything is fine; the limit seems to be around 3k-4k classes. I don't think this is a matter of the number of images, since reducing the batch size doesn't fix the error.

Did you try to run it with a large number of classes? Are there any ways to reduce the memory allocated?

Thanks!
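
For what it's worth, a rough back-of-the-envelope estimate (my own sketch; it assumes the loss materialises a (C·K) × (C·K) similarity matrix between centers for its regularizer, which is my reading of the implementation) of how memory grows with the number of classes C:

```python
# Rough, illustrative estimate only: if a (C*K) x (C*K) float32 matrix of
# center similarities is built, memory grows quadratically with C.
def center_matrix_gib(C, K=10, bytes_per_elem=4):
    n = C * K
    return n * n * bytes_per_elem / 1024**3

for C in [3000, 4000, 8000]:
    print(C, f"{center_matrix_gib(C):.1f} GiB")
# ~3.4 GiB, ~6.0 GiB, ~23.8 GiB -- which would be consistent with ~3k-4k classes
# fitting on an 11 GiB K80 while 8k+ classes do not.
```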

Question - data with unit length

The paper mentions that the data (x) is of unit length. It even seems to be a requirement for using SoftTriple.
Is this correct? How did you guarantee this in your experiments?

Thanks!
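
For what it's worth, a minimal sketch (my assumption of the usual practice, not taken from the repository) of constraining embeddings to unit length before a cosine-similarity-based loss such as SoftTriple:

```python
# Illustrative only: L2-normalise each embedding so it lies on the unit sphere.
import torch
import torch.nn.functional as F

emb = torch.randn(32, 64)              # a batch of raw embeddings
emb = F.normalize(emb, p=2, dim=1)     # unit-length rows
print(emb.norm(dim=1))                 # all approximately 1.0
```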

out of memory error

Hi, @qian-qi
I ran your code with the default settings on a single Tesla V100 card, but an error occurred:

"RuntimeError: CUDA out of memory. Tried to allocate 10.52 GiB (GPU 0; 31.72 GiB total capacity; 22.42 GiB already allocated; 8.18 GiB free; 15.33 MiB cached)"

Why does the SoftTriple loss cost so much memory? Or could you share your experimental settings in detail? Thank you.

MemoryError: Unable to allocate 113. GiB for an array with shape (122994, 122994) and data type float64

Since it concatenates all the images during evaluation:

```python
def evaluation(X, Y, Kset):
    num = X.shape[0]
    classN = np.max(Y)+1
    kmax = np.max(Kset)
    recallK = np.zeros(len(Kset))
    #compute NMI
    kmeans = KMeans(n_clusters=classN).fit(X)
    nmi = normalized_mutual_info_score(Y, kmeans.labels_, average_method='arithmetic')
    #compute Recall@K
    sim = X.dot(X.T)
    minval = np.min(sim) - 1.
    sim -= np.diag(np.diag(sim))
    sim += np.diag(np.ones(num) * minval)
    indices = np.argsort(-sim, axis=1)[:, : kmax]
    YNN = Y[indices]
    for i in range(0, len(Kset)):
        pos = 0.
        for j in range(0, num):
            if Y[j] in YNN[j, :Kset[i]]:
                pos += 1.
        recallK[i] = pos/num
    return nmi, recallK
```

The failing line is `sim += np.diag(np.ones(num) * minval)` with num = 122994, which gives:

MemoryError: Unable to allocate 113. GiB for an array with shape (122994, 122994) and data type float64

Could you fix it?
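
One possible workaround for the Recall@K part (my own sketch, not part of the repository, and it leaves the NMI computation aside) is to compute the similarities block by block in float32 instead of materialising the full num × num float64 matrix:

```python
import numpy as np

def recall_at_k_chunked(X, Y, Kset, chunk=1024):
    """Recall@K without building the full num x num similarity matrix."""
    X = X.astype(np.float32)
    num = X.shape[0]
    kmax = int(np.max(Kset))
    recallK = np.zeros(len(Kset))
    for start in range(0, num, chunk):
        end = min(start + chunk, num)
        sim = X[start:end].dot(X.T)                      # (chunk, num) block
        for r, i in enumerate(range(start, end)):
            sim[r, i] = -np.inf                          # exclude the query itself
        # kmax most similar items per row: partition first, then order the candidates
        idx = np.argpartition(-sim, kmax, axis=1)[:, :kmax]
        order = np.argsort(-np.take_along_axis(sim, idx, axis=1), axis=1)
        idx = np.take_along_axis(idx, order, axis=1)
        YNN = Y[idx]                                     # neighbour labels
        for ki, K in enumerate(Kset):
            recallK[ki] += sum(Y[i] in YNN[r, :K]
                               for r, i in enumerate(range(start, end)))
    return recallK / num
```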

loss is nan

Hello, thanks for your excellent work and for generously sharing the code!

I'm trying to use the SoftTriple loss function in my project. At first, I followed most of your hyperparameters and only changed the class number (cN) to 247, and the loss turned to 'nan' quickly. After that, I tried changing K and the margin. I also changed the initial learning rate of the loss function; it seemed that the smaller the learning rate, the longer it took before the loss turned to nan.

I still can't solve this problem; it would be great if you have any suggestions! Thank you again!

pretrained models

Thanks for all your contributions. I wonder whether SoftTriple can work on one-shot problems or not? And do you have any pretrained models?

The BatchNorm layers are not completely frozen

In the current implementation, it seems that the BatchNorm layers are not completely frozen.

The following lines only stop updating the running mean & var, but the weight & bias of BN will still be changed:

SoftTriple/train.py

Lines 138 to 141 in 2417374

```python
if args.freeze_BN:
    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d):
            m.eval()
```

To avoid changing the weight & bias, the following lines:

SoftTriple/train.py

Lines 82 to 86 in 2417374

```python
# define loss function (criterion) and optimizer
criterion = loss.SoftTriple(args.la, args.gamma, args.tau, args.margin, args.dim, args.C, args.K).cuda()
optimizer = torch.optim.Adam([{"params": model.parameters(), "lr": args.modellr},
                              {"params": criterion.parameters(), "lr": args.centerlr}],
                             eps=args.eps, weight_decay=args.weight_decay)
```

can be changed to:

```python
if args.freeze_BN:
    for m in model.modules():
        if isinstance(m, torch.nn.BatchNorm2d):
            m.weight.requires_grad = False
            m.bias.requires_grad = False

# define loss function (criterion) and optimizer
criterion = loss.SoftTriple(args.la, args.gamma, args.tau, args.margin, args.dim, args.C, args.K).cuda()
optimizer = torch.optim.Adam([{"params": filter(lambda p: p.requires_grad, model.parameters()),  # model.parameters(),
                               "lr": args.modellr},
                              {"params": criterion.parameters(), "lr": args.centerlr}],
                             eps=args.eps, weight_decay=args.weight_decay)
```

proposition1

Hi, thank you for your outstanding contribution, but I don't understand the proof of Proposition 1. Can you explain the entropy of the distribution p in detail?

Thanks very much!
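
For context, here is the standard entropy-regularised maximisation result that, as far as I can tell, underlies Proposition 1; the notation below is mine and may differ slightly from the paper's:

```latex
% Standard result (my sketch): for scores s_1, ..., s_K and the entropy
% H(p) = -\sum_k p_k \log p_k, the entropy-regularised maximum over the simplex
\[
  \max_{p \in \Delta} \; \sum_k p_k s_k + \gamma H(p)
\]
% is attained at the softmax distribution
\[
  p_k = \frac{\exp(s_k/\gamma)}{\sum_{k'} \exp(s_{k'}/\gamma)},
  \qquad
  \text{with optimal value } \gamma \log \sum_k \exp\!\big(s_k/\gamma\big)
  \;\xrightarrow{\,\gamma \to 0\,}\; \max_k s_k .
\]
% As gamma -> 0 the smooth maximum recovers the hard max; as gamma grows,
% p approaches the uniform (maximum-entropy) distribution.
```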

train loss

When I set pretrained=None and train for a large number of epochs, I find that the training loss can't decrease. How long does training the model take? And can you share some training tricks or details about the loss?

softtriploss param

@idstcv
Hello, thanks for open-sourcing your code, it's great ^^
I'm using it in my own framework and have a few questions:
1. args.dim and args.C: are these the dimension of the output features and the total number of classes in the training data? In the arguments it is 98.
2. If I use it on other retrieval tasks, do the other parameters below need to be adjusted?
3. On ReID tasks, have you tried training SoftTriple loss jointly with a classification loss?
4. Why are the learning rates for the model and the loss set differently?
5. In what situations does this loss not perform well?

```python
criterion = loss.SoftTriple(args.la, args.gamma, args.tau, args.margin, args.dim, args.C, args.K).cuda()
optimizer = torch.optim.Adam([{"params": model.parameters(), "lr": args.modellr},
                              {"params": criterion.parameters(), "lr": args.centerlr}],
                             eps=args.eps, weight_decay=args.weight_decay)
```

We refer the objective in Eqn. 10 as SoftTriple. We set τ = 0.2 and γ = 0.1 for SoftTriple. Besides, we set a small margin as δ = 0.01 to break the tie explicitly. The number of centers is set to K = 10.
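
For reference, a sketch of how the loss might be instantiated with the values quoted above; the λ (la), dim, and class-count values are assumptions on my part rather than confirmed defaults:

```python
# Illustrative only: gamma, tau, margin (delta) and K are taken from the quoted
# paper excerpt; la=20, dim=64 and C=98 are assumed placeholder values.
import loss  # the repository's loss module

la, gamma, tau, margin = 20, 0.1, 0.2, 0.01    # lambda, gamma, tau, delta
dim, C, K = 64, 98, 10                         # embedding dim, #classes, centers per class

criterion = loss.SoftTriple(la, gamma, tau, margin, dim, C, K).cuda()
```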

Code and Paper don't seem to match...

When mapping the paper to the code, I am thinking as follows.

[image]

In the paper...
[image]

But I think this code is:
[image]

```python
lossClassify = F.cross_entropy(self.la*(simClass-marginM), target)
# ->
lossClassify = F.cross_entropy(F.softmax(self.la * (simClass-marginM), dim=1), target)
```
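
For context on the suggestion above: PyTorch's F.cross_entropy already applies log_softmax internally. A minimal check of that behaviour (my own sketch):

```python
# F.cross_entropy(logits, target) == F.nll_loss(F.log_softmax(logits, dim=1), target),
# i.e. the softmax is already applied inside cross_entropy.
import torch
import torch.nn.functional as F

logits = torch.randn(4, 98)                 # e.g. scaled (simClass - marginM)
target = torch.randint(0, 98, (4,))

a = F.cross_entropy(logits, target)
b = F.nll_loss(F.log_softmax(logits, dim=1), target)
print(torch.allclose(a, b))                 # True
```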

proof of proposition1

Can you provide the detailed calculation process for the proof of Proposition 1? I cannot obtain your result, and I think there is something wrong with the proof.
