
softtriple's Issues

Clarification on gamma and lambda

Hi Qi Qian,
I like your paper, especially the elegant derivation part. Could I ask for some clarification on gamma and lambda?
From my understanding, gamma and lambda can also be viewed as temperatures in the softmax, which control how concentrated or spread out the intra-class and inter-class similarity distributions are. I noticed that you set gamma to 0.1 and 1/lambda to 0.2, which means you want the intra-class similarity distribution to be very concentrated and the inter-class similarity distribution to be a little smoother. Is my understanding correct? Also, do you require gamma and 1/lambda to be less than 1 to give a low temperature and avoid over-smoothing?
Thanks!
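
For context, a tiny sketch (my own illustration, not taken from the paper or the repository) of how a temperature-like scale changes how concentrated a softmax distribution is; here `scale` plays the role of 1/gamma or lambda:

```python
# Illustrative only: larger scales (lower temperatures) concentrate the softmax
# mass on the largest similarity, smaller scales spread it out.
import torch
import torch.nn.functional as F

sims = torch.tensor([0.9, 0.7, 0.2])           # example similarity scores

for scale in [1.0, 1.0 / 0.2, 1.0 / 0.1]:       # scale = 1, 5, 10
    p = F.softmax(scale * sims, dim=0)
    print(f"scale={scale:4.1f} -> {p.tolist()}")
```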

Out of Memory error when training on big number of classes

Hello, I'm trying to train the network on a dataset of 8k+ classes and I have this error:

RuntimeError: CUDA out of memory. Tried to allocate 14.00 MiB (GPU 0; 11.17 GiB total capacity; 10.68 GiB already allocated; 11.31 MiB free; 10.72 GiB reserved in total by PyTorch)
I use a Tesla K80. When training on 3k classes or fewer everything is fine; the limit seems to be around 3k-4k classes. I don't think this is a matter of the number of images, since reducing the batch size doesn't fix the error.

Did you try to run it with a large number of classes? Are there any ways to reduce the memory allocated?

Thanks!
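
For what it's worth, a rough back-of-the-envelope estimate (my own sketch; it assumes the loss materialises a (C·K) × (C·K) similarity matrix between centers for its regularizer, which is my reading of the implementation) of how memory grows with the number of classes C:

```python
# Rough, illustrative estimate only: if a (C*K) x (C*K) float32 matrix of
# center similarities is built, memory grows quadratically with C.
def center_matrix_gib(C, K=10, bytes_per_elem=4):
    n = C * K
    return n * n * bytes_per_elem / 1024**3

for C in [3000, 4000, 8000]:
    print(C, f"{center_matrix_gib(C):.1f} GiB")
# ~3.4 GiB, ~6.0 GiB, ~23.8 GiB -- which would be consistent with ~3k-4k classes
# fitting on an 11 GiB K80 while 8k+ classes do not.
```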

Question - data with unit length

The paper mentions that the data (x) is of unit length. It even seems to be a requirement for using SoftTriple.
Is this correct? How did you guarantee this in your experiments?

Thanks!
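
For what it's worth, a minimal sketch (my assumption of the usual practice, not taken from the repository) of constraining embeddings to unit length before a cosine-similarity-based loss such as SoftTriple:

```python
# Illustrative only: L2-normalise each embedding so it lies on the unit sphere.
import torch
import torch.nn.functional as F

emb = torch.randn(32, 64)              # a batch of raw embeddings
emb = F.normalize(emb, p=2, dim=1)     # unit-length rows
print(emb.norm(dim=1))                 # all approximately 1.0
```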

out of memory error

Hi, @qian-qi
I ran your code with the default settings on a single Tesla V100 card, but an error occurred:

"RuntimeError: CUDA out of memory. Tried to allocate 10.52 GiB (GPU 0; 31.72 GiB total capacity; 22.42 GiB already allocated; 8.18 GiB free; 15.33 MiB cached)"

Why does the SoftTriple loss cost so much memory? Or could you share your experimental settings in detail? Thank you.

MemoryError: Unable to allocate 113. GiB for an array with shape (122994, 122994) and data type float64

Since it concatenates all the images during evaluation:

```python
def evaluation(X, Y, Kset):
    num = X.shape[0]
    classN = np.max(Y)+1
    kmax = np.max(Kset)
    recallK = np.zeros(len(Kset))
    #compute NMI
    kmeans = KMeans(n_clusters=classN).fit(X)
    nmi = normalized_mutual_info_score(Y, kmeans.labels_, average_method='arithmetic')
    #compute Recall@K
    sim = X.dot(X.T)
    minval = np.min(sim) - 1.
    sim -= np.diag(np.diag(sim))
    sim += np.diag(np.ones(num) * minval)
    indices = np.argsort(-sim, axis=1)[:, : kmax]
    YNN = Y[indices]
    for i in range(0, len(Kset)):
        pos = 0.
        for j in range(0, num):
            if Y[j] in YNN[j, :Kset[i]]:
                pos += 1.
        recallK[i] = pos/num
    return nmi, recallK
```

The failing line is `sim += np.diag(np.ones(num) * minval)` with num = 122994, which gives:

MemoryError: Unable to allocate 113. GiB for an array with shape (122994, 122994) and data type float64

Could you fix it?
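
One possible workaround for the Recall@K part (my own sketch, not part of the repository, and it leaves the NMI computation aside) is to compute the similarities block by block in float32 instead of materialising the full num × num float64 matrix:

```python
import numpy as np

def recall_at_k_chunked(X, Y, Kset, chunk=1024):
    """Recall@K without building the full num x num similarity matrix."""
    X = X.astype(np.float32)
    num = X.shape[0]
    kmax = int(np.max(Kset))
    recallK = np.zeros(len(Kset))
    for start in range(0, num, chunk):
        end = min(start + chunk, num)
        sim = X[start:end].dot(X.T)                      # (chunk, num) block
        for r, i in enumerate(range(start, end)):
            sim[r, i] = -np.inf                          # exclude the query itself
        # kmax most similar items per row: partition first, then order the candidates
        idx = np.argpartition(-sim, kmax, axis=1)[:, :kmax]
        order = np.argsort(-np.take_along_axis(sim, idx, axis=1), axis=1)
        idx = np.take_along_axis(idx, order, axis=1)
        YNN = Y[idx]                                     # neighbour labels
        for ki, K in enumerate(Kset):
            recallK[ki] += sum(Y[i] in YNN[r, :K]
                               for r, i in enumerate(range(start, end)))
    return recallK / num
```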

loss is nan

Hello, thanks for your excellent work and for generously sharing the code!

I'm trying to use the SoftTriple loss function in my project. At first, I followed most of your hyperparameters and only changed the class number (cN) to 247, and the loss turned to 'nan' quickly. After that, I tried changing K and the margin. I also changed the initial learning rate of the loss function; it seemed that the smaller the learning rate, the longer it took before the loss turned to nan.

I still can't solve this problem; it would be great if you have any suggestions! Thank you again!

pretrained models

Thanks for all your contributions. I wonder whether SoftTriple can work on one-shot problems or not? And do you have any pretrained models?

The BatchNorm layers are not completely frozen

In the current implementation, it seems that the BatchNorm layers are not completely frozen.

The following lines only stop updating the running mean & var, but the weight & bias of BN will still be changed:

SoftTriple/train.py

Lines 138 to 141 in 2417374

```python
if args.freeze_BN:
    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d):
            m.eval()
```

To avoid changing the weight & bias, the following lines:

SoftTriple/train.py

Lines 82 to 86 in 2417374

```python
# define loss function (criterion) and optimizer
criterion = loss.SoftTriple(args.la, args.gamma, args.tau, args.margin, args.dim, args.C, args.K).cuda()
optimizer = torch.optim.Adam([{"params": model.parameters(), "lr": args.modellr},
                              {"params": criterion.parameters(), "lr": args.centerlr}],
                             eps=args.eps, weight_decay=args.weight_decay)
```

can be changed to:

```python
if args.freeze_BN:
    for m in model.modules():
        if isinstance(m, torch.nn.BatchNorm2d):
            m.weight.requires_grad = False
            m.bias.requires_grad = False

# define loss function (criterion) and optimizer
criterion = loss.SoftTriple(args.la, args.gamma, args.tau, args.margin, args.dim, args.C, args.K).cuda()
optimizer = torch.optim.Adam([{"params": filter(lambda p: p.requires_grad, model.parameters()),  # model.parameters(),
                               "lr": args.modellr},
                              {"params": criterion.parameters(), "lr": args.centerlr}],
                             eps=args.eps, weight_decay=args.weight_decay)
```

proposition1

Hi, thank you for your outstanding contribution, but I don't understand the proof of Proposition 1. Can you explain the entropy of the distribution p in detail?

Thanks very much!
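
For context, here is the standard entropy-regularised maximisation result that, as far as I can tell, underlies Proposition 1; the notation below is mine and may differ slightly from the paper's:

```latex
% Standard result (my sketch): for scores s_1, ..., s_K and the entropy
% H(p) = -\sum_k p_k \log p_k, the entropy-regularised maximum over the simplex
\[
  \max_{p \in \Delta} \; \sum_k p_k s_k + \gamma H(p)
\]
% is attained at the softmax distribution
\[
  p_k = \frac{\exp(s_k/\gamma)}{\sum_{k'} \exp(s_{k'}/\gamma)},
  \qquad
  \text{with optimal value } \gamma \log \sum_k \exp\!\big(s_k/\gamma\big)
  \;\xrightarrow{\,\gamma \to 0\,}\; \max_k s_k .
\]
% As gamma -> 0 the smooth maximum recovers the hard max; as gamma grows,
% p approaches the uniform (maximum-entropy) distribution.
```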

train loss

When I set pretrained=None and train for a large number of epochs, I find that the training loss can't decrease. How long does training the model take? And can you share some training tricks or details about the loss?

softtriploss param

@idstcv
Hello, thanks for open-sourcing your code, it's great ^^
I'm using it in my own framework and have a few questions:
1. args.dim and args.C: are these the dimension of the output features and the total number of classes in the training data? In the arguments it is 98.
2. If I use it on other retrieval tasks, do the other parameters below need to be adjusted?
3. On ReID tasks, have you tried training SoftTriple loss jointly with a classification loss?
4. Why are the learning rates for the model and the loss set differently?
5. In what situations does this loss not perform well?

```python
criterion = loss.SoftTriple(args.la, args.gamma, args.tau, args.margin, args.dim, args.C, args.K).cuda()
optimizer = torch.optim.Adam([{"params": model.parameters(), "lr": args.modellr},
                              {"params": criterion.parameters(), "lr": args.centerlr}],
                             eps=args.eps, weight_decay=args.weight_decay)
```

We refer the objective in Eqn. 10 as SoftTriple. We set τ = 0.2 and γ = 0.1 for SoftTriple. Besides, we set a small margin as δ = 0.01 to break the tie explicitly. The number of centers is set to K = 10.
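
For reference, a sketch of how the loss might be instantiated with the values quoted above; the λ (la), dim, and class-count values are assumptions on my part rather than confirmed defaults:

```python
# Illustrative only: gamma, tau, margin (delta) and K are taken from the quoted
# paper excerpt; la=20, dim=64 and C=98 are assumed placeholder values.
import loss  # the repository's loss module

la, gamma, tau, margin = 20, 0.1, 0.2, 0.01    # lambda, gamma, tau, delta
dim, C, K = 64, 98, 10                         # embedding dim, #classes, centers per class

criterion = loss.SoftTriple(la, gamma, tau, margin, dim, C, K).cuda()
```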

Code and Paper don't seem to match...

When mapping the paper to the code, I am thinking as follows.

[image]

In the paper...
[image]

But I think this code is:
[image]

```python
lossClassify = F.cross_entropy(self.la*(simClass-marginM), target)
# ->
lossClassify = F.cross_entropy(F.softmax(self.la * (simClass-marginM), dim=1), target)
```
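
For context on the suggestion above: PyTorch's F.cross_entropy already applies log_softmax internally. A minimal check of that behaviour (my own sketch):

```python
# F.cross_entropy(logits, target) == F.nll_loss(F.log_softmax(logits, dim=1), target),
# i.e. the softmax is already applied inside cross_entropy.
import torch
import torch.nn.functional as F

logits = torch.randn(4, 98)                 # e.g. scaled (simClass - marginM)
target = torch.randint(0, 98, (4,))

a = F.cross_entropy(logits, target)
b = F.nll_loss(F.log_softmax(logits, dim=1), target)
print(torch.allclose(a, b))                 # True
```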

proof of proposition1

Can you provide the detailed calculation process for the proof of Proposition 1? I cannot obtain your result, and I think there is something wrong with the proof.
