msight-tech / research-ms-loss
MS-Loss: Multi-Similarity Loss for Deep Metric Learning
License: Other
In the XBM module, I can't seem to find the function that builds the memory data.
There is no "lr_mul" option in torch.optim, so will the backbone's learning rate be the same as the embedding layer's?
I cannot see any experiment settings for Cars-196. Can you share the yaml config as well?
When I used the CUB-200 yaml config to train on Cars-196 (following the paper's training split rule, 98 classes), I only got a best R@1 of 78.4% (with an embedding size of 512), much lower than the 84.1% in your paper.
Thanks,
Adam
Hello, I found that this code only works for single-label tasks. I want to use MS loss for an image retrieval task, and I am not sure how to handle a multi-label dataset.
pos_pair_ = sim_mat[i][labels == labels[i]]
pos_pair_ = pos_pair_[pos_pair_ < 1 - ep]
neg_pair_ = sim_mat[i][labels != labels[i]]
This is the original code. My first idea was to change "labels == labels[i]" to "labels[i]": I pass in one-hot labels and first compute (label = label @ label.t() > 0).
But the loss easily becomes INF or NaN.
I suspect the problem is on my side, so I'm asking you for advice.
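One way to read the multi-label setup described above, sketched under assumptions: treat two samples as a positive pair when their multi-hot label vectors share at least one class, and index row i of that pair mask instead of comparing labels to labels[i]. The helper name `multilabel_pair_masks`, the toy labels, and the use of NumPy instead of torch are illustrative choices, not part of the repository. Anchors with no positives or no negatives are a plausible (unconfirmed) source of errors or non-finite values, so the sketch also reports which anchors have both.

```python
import numpy as np

def multilabel_pair_masks(labels):
    """labels: (n, c) multi-hot array. Returns boolean (n, n) positive and negative masks."""
    labels = np.asarray(labels)
    shared = labels @ labels.T > 0            # True where two samples share a class
    self_pair = np.eye(len(labels), dtype=bool)
    pos_mask = shared & ~self_pair            # drop each anchor's pair with itself
    neg_mask = ~shared & ~self_pair
    return pos_mask, neg_mask

# Toy batch: 4 samples, 3 classes (hypothetical data)
labels = np.array([
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
])
pos_mask, neg_mask = multilabel_pair_masks(labels)

# Anchors that have at least one positive and one negative in the batch
usable = pos_mask.any(axis=1) & neg_mask.any(axis=1)
```

Inside the loop, `sim_mat[i][pos_mask[i]]` and `sim_mat[i][neg_mask[i]]` would then take the place of the `labels == labels[i]` selections, and anchors where `usable[i]` is False can be skipped.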
I trained models with the same loss settings as in the paper: alpha=2 and beta=50. The models do not seem to produce good enough embedding features for the minority class (judging from a t-SNE visualization), but they do an obviously better job for the majority class, which led to poor classification results. I'd like some advice on how to adjust the hyperparameters or the mining settings of the MS loss to better handle a highly imbalanced dataset (say, class 0 has 10x more samples than class 1). For additional details, I used an embedding size of 512 and a batch size of 8 (the maximum my GPU can hold, because the images are quite large).
Thanks in advance.
class MultiSimilarityLoss(nn.Module):
    def __init__(self, configer=None):
        super(MultiSimilarityLoss, self).__init__()
        self.is_norm = True
        self.eps = 0.1
        self.lamb = 1
        self.alpha = 2
        self.beta = 50

    def forward(self, inputs, targets):
        n = inputs.size(0)
        if self.is_norm:
            inputs = inputs / torch.norm(inputs, dim=1, keepdim=True)
        similari_matrix = inputs.matmul(inputs.t())
        mask = targets.expand(n, n).eq(targets.expand(n, n).t())
        loss = None
        for i in range(n):
            temp_sim, temp_mask = similari_matrix[i], mask[i]
            min_ap, max_an = temp_sim[temp_mask].min(), temp_sim[temp_mask==0].max()
            temp_AP = temp_sim[(temp_mask==1) & (temp_sim < max_an + self.eps)]  # may be tensor([])
            temp_AN = temp_sim[(temp_mask==0) & (temp_sim > min_ap - self.eps)]  # torch.sum(tensor([])) = tensor(0.)
            L1 = torch.log(1 + torch.sum(torch.exp(-self.alpha * (temp_AP - self.lamb)))) / self.alpha
            L2 = torch.log(1 + torch.sum(torch.exp(self.beta * (temp_AN - self.lamb)))) / self.beta
            L = L1 + L2
            if loss is None:
                loss = L
            else:
                loss += L
        loss /= n
        return loss
First of all thank you for the great work.
In ret_benchmark/data/transforms/build.py, you have:
.......
normalize_transform = T.Normalize(mean=cfg.INPUT.PIXEL_MEAN,
                                  std=cfg.INPUT.PIXEL_STD)
if is_train:
    transform = T.Compose([
        T.Resize(size=cfg.INPUT.ORIGIN_SIZE),
        T.RandomResizedCrop(
            scale=cfg.INPUT.CROP_SCALE,
            size=cfg.INPUT.CROP_SIZE
        ),
        T.RandomHorizontalFlip(p=cfg.INPUT.FLIP_PROB),
        T.ToTensor(),
        normalize_transform,
    ])
............
I wonder how PIXEL_MEAN and PIXEL_STD are calculated. Are they calculated after Resize(), RandomResizedCrop(), RandomHorizontalFlip() and ToTensor()? Or are they calculated by applying only ToTensor() (which converts a PIL image from [0, 255] to [0, 1]) to all the pictures in the dataset?
Hi, I tried to run the code, but it showed "No module named 'ret_benchmark'". Why?
Hi there, thanks for sharing the code and the beautiful work!
In multi_similarity_loss.py line 35 :
pos_pair_ = pos_pair_[pos_pair_ < 1 - epsilon]
Why do we need this line?
And what is the logic of using the output of average pooling as the network's embeddings?
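A small check of what the filter quoted above does, assuming the features are L2-normalized (an assumption; the quoted line does not show normalization): each row of the similarity matrix then contains the anchor's similarity with itself, which equals 1 up to rounding, and `labels == labels[i]` selects it as a positive. Keeping only values below `1 - epsilon` drops that self-pair. The toy features and the use of NumPy are illustrative.

```python
import numpy as np

# Toy features (hypothetical), L2-normalized row by row
feats = np.array([[3.0, 4.0], [4.0, 3.0], [1.0, 0.0]])
feats = feats / np.linalg.norm(feats, axis=1, keepdims=True)
labels = np.array([0, 0, 1])

sim_mat = feats @ feats.T
epsilon = 1e-5

i = 0
pos_pair_ = sim_mat[i][labels == labels[i]]      # includes sim(i, i), about 1.0
filtered = pos_pair_[pos_pair_ < 1 - epsilon]    # self-pair removed
```

The same filter would also discard an exact duplicate of the anchor, since its similarity is 1 as well.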
First of all, thank you for the great work!
Can you upload the train.txt and test.txt files for training on the CUB-200-2011 dataset?
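I want to adapt this for anomaly detection. There are 6 different products, but the samples are split into only two classes: OK and Anomaly.
I wrote my own dataloader so that each batch contains samples of both classes from a single product, evenly split between OK and Anomaly. The OK label is 1 and the Anomaly label is 0.
I modified the loss as shown below, but during training pos_pair_ quickly becomes 0. Did I change something incorrectly?
One more question: would the XBM repository be a better fit for my case?
Many thanks!!!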
@LOSS.register('ms_loss')
class MultiSimilarityLoss(nn.Module):
    def __init__(self, cfg):
        super(MultiSimilarityLoss, self).__init__()
        self.thresh = 0.5
        self.margin = 0.1
        self.scale_pos = cfg.LOSSES.MULTI_SIMILARITY_LOSS.SCALE_POS
        self.scale_neg = cfg.LOSSES.MULTI_SIMILARITY_LOSS.SCALE_NEG

    def forward(self, feats, labels):
        assert feats.size(0) == labels.size(0), \
            f"feats.size(0): {feats.size(0)} is not equal to labels.size(0): {labels.size(0)}"
        batch_size = feats.size(0)
        sim_mat = torch.matmul(feats, torch.t(feats))
        epsilon = 1e-5
        loss = list()
        for i in range(batch_size):
            pos_pair_ = sim_mat[i][labels == 1]  # modified here
            pos_pair_ = pos_pair_[pos_pair_ < 1 - epsilon]
            neg_pair_ = sim_mat[i][labels == 0]  # modified here
            neg_pair = neg_pair_[neg_pair_ + self.margin > min(pos_pair_)]
            pos_pair = pos_pair_[pos_pair_ - self.margin < max(neg_pair_)]
            if len(neg_pair) < 1 or len(pos_pair) < 1:
                continue
            # weighting step
            pos_loss = 1.0 / self.scale_pos * torch.log(
                1 + torch.sum(torch.exp(-self.scale_pos * (pos_pair - self.thresh))))
            neg_loss = 1.0 / self.scale_neg * torch.log(
                1 + torch.sum(torch.exp(self.scale_neg * (neg_pair - self.thresh))))
            loss.append(pos_loss + neg_loss)
        if len(loss) == 0:
            return torch.zeros([], requires_grad=True)
        loss = sum(loss) / batch_size
        return loss
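One difference from the original loop quoted in an earlier issue (`labels == labels[i]`) is that the modified code always treats label 1 as positive and label 0 as negative, whatever the anchor's own label is. The sketch below contrasts the two selections for an Anomaly anchor, using toy labels and NumPy for illustration. Whether this explains the shrinking `pos_pair_` is not confirmed here; it is only a candidate to check.

```python
import numpy as np

labels = np.array([1, 1, 0, 0])   # 1 = OK, 0 = Anomaly (toy batch)
i = 2                             # an Anomaly anchor

# Selection in the modified code: fixed classes, independent of the anchor
pos_fixed = labels == 1
neg_fixed = labels == 0

# Selection relative to the anchor, as in the original loop
pos_rel = labels == labels[i]
neg_rel = labels != labels[i]
```

For the Anomaly anchor, the fixed selection takes the OK samples as its positives and lists the anchor itself among its negatives, while the relative selection pairs it with the other Anomaly sample and pushes it away from the OK samples.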
Hi,
I read your paper "Multi-similarity loss with general pair weighting for deep metric learning", and based on its content I cannot understand the logic of the "_prepare_batch" function in "RandomIdentitySampler", because it selects only positive pairs (if available).
This would lead to poor training!
For convenience, I have copied the relevant part below:
for label in self.labels:
    idxs = copy.deepcopy(self.label_index_dict[label])
    # load all data indices with this label into idxs
    if len(idxs) < self.K:
        idxs.extend(np.random.choice(idxs, size=self.K - len(idxs), replace=True))
Hi, the unified framework for all kinds of pair-based losses proposed in the paper is great. However, I found that the best "test recall" appears to be selected on val_dataset, as in the original code referenced below:
According to the figure above, the "val dataset" also plays the role of the "test dataset", which means the "test dataset" is visible during training.
So doesn't this amount to choosing a "best training iteration" parameter, with the risk of overfitting the training hyperparameters?
(I have found similar practice in several other papers, and I know the datasets were built without a separate test set, as in the common protocol "construct query+gallery from the raw val+test split in DeepFashion".)
@mscottml
First of all thank you for sharing the code, this is really great work.
I ran the experiment and got good results, but I can't understand how Recall@K is computed in your code. Can you explain it to me? The two lines in question are marked with comments below.
def recall_k(self, k=1):
    m = len(self.sim_mat)
    match_counter = 0
    for i in range(m):
        pos_sim = self.sim_mat[i][self.gallery_labels == self.query_labels[i]]
        neg_sim = self.sim_mat[i][self.gallery_labels != self.query_labels[i]]
        thresh = np.sort(pos_sim)[-2] if self.is_equal_query else np.max(pos_sim)
        if np.sum(neg_sim > thresh) < k:  # the lines that I cannot understand
            match_counter += 1            # (this line as well)
    return float(match_counter) / m
Thank you!
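The check can be read as a rank test: a query counts as a hit at K when fewer than K negatives score higher than its best positive, which means that positive appears among the top K results. The sketch below, with toy scores and hypothetical helper names, compares that count against an explicit top-K lookup. When the queries and the gallery are the same set, the largest "positive" similarity is the query matched with itself, which is presumably why the second-largest value is used as the threshold in that case.

```python
import numpy as np

def hit_by_threshold(sims, is_pos, k):
    """Hit if fewer than k negatives score above the best positive."""
    thresh = np.max(sims[is_pos])
    return bool(np.sum(sims[~is_pos] > thresh) < k)

def hit_by_topk(sims, is_pos, k):
    """Hit if any of the k highest-scoring gallery items is a positive."""
    topk = np.argsort(-sims)[:k]
    return bool(is_pos[topk].any())

# One query against a gallery of five items (toy similarities, no ties)
sims = np.array([0.9, 0.8, 0.7, 0.6, 0.5])
is_pos = np.array([False, False, True, False, True])

# (k, result of the threshold test, result of the explicit top-k test)
results = [(k, hit_by_threshold(sims, is_pos, k), hit_by_topk(sims, is_pos, k))
           for k in (1, 2, 3)]
```

Here two negatives outrank the best positive, so the query misses at K=1 and K=2 and hits at K=3 under both formulations.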
https://github.com/geonm/tf_ms_loss
I uploaded only the MS loss code.
I'm wondering whether my implementation is correct.
Please comment here if you find something wrong.
The only hyperparameter settings we can refer to are those in example.yaml, but as you said in your paper, you experimented on more than one dataset. How should the hyperparameters be set for those datasets? Please provide more config files for reference, thanks.
Hi,
Do you have NMI results for your method on the datasets used in the paper? If not, do you have the trained models?
I am interested in comparing our method with yours on the NMI metric.
Thanks in advance!
I use ResNet-50 + MS loss to train on my own dataset, but sometimes the loss becomes NaN. The loss does not seem very stable.
Using a for loop in the loss calculation is a little slow.
You can find a way to remove the for loop.
In my case, only the pairs on the diagonal are positive, so I removed the for loop as follows.
simi_mat = torch.matmul(y1, torch.t(y2))
simi_sub = simi_mat - ms_gama
pos_pair_sub = torch.unsqueeze(torch.diag(simi_sub), 1)
neg_pair_sub_plus1 = simi_sub
neg_pair_sub_plus1[range(batch_size), range(batch_size)] = 0
pos_loss = torch.log(1 + torch.sum(torch.exp(-ms_alpha * pos_pair_sub), dim = 1)) / ms_alpha
neg_loss = torch.log(torch.sum(torch.exp(ms_beta * neg_pair_sub_plus1), dim = 1)) / ms_beta
loss = torch.mean(pos_loss + neg_loss)
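For the general case where positives are defined by labels rather than by the diagonal, the per-anchor loop can also be replaced by masked array operations. The following is a sketch in NumPy (chosen for illustration; the same masking carries over to torch), with the mining and weighting steps written to mirror the loop version posted in an earlier issue. The function names, the default hyperparameters, and the toy data are assumptions, and the loop reference here skips anchors with no positives or negatives instead of failing on them.

```python
import numpy as np

def ms_loss_loop(feats, labels, thresh=0.5, margin=0.1,
                 scale_pos=2.0, scale_neg=50.0, eps=1e-5):
    """Reference: per-anchor loop with pair mining."""
    sim = feats @ feats.T
    n = len(labels)
    total = 0.0
    for i in range(n):
        pos_ = sim[i][labels == labels[i]]
        pos_ = pos_[pos_ < 1 - eps]                 # drop the self-pair
        neg_ = sim[i][labels != labels[i]]
        if len(pos_) == 0 or len(neg_) == 0:
            continue
        neg = neg_[neg_ + margin > pos_.min()]      # mined negatives
        pos = pos_[pos_ - margin < neg_.max()]      # mined positives
        if len(neg) == 0 or len(pos) == 0:
            continue
        total += np.log1p(np.sum(np.exp(-scale_pos * (pos - thresh)))) / scale_pos
        total += np.log1p(np.sum(np.exp(scale_neg * (neg - thresh)))) / scale_neg
    return total / n

def ms_loss_vectorized(feats, labels, thresh=0.5, margin=0.1,
                       scale_pos=2.0, scale_neg=50.0, eps=1e-5):
    """Same computation with boolean masks instead of a Python loop."""
    sim = feats @ feats.T
    same = labels[:, None] == labels[None, :]
    pos_all = same & (sim < 1 - eps)
    neg_all = ~same
    # Row-wise hardest positive and hardest negative (inf fill for empty rows)
    min_pos = np.where(pos_all, sim, np.inf).min(axis=1, keepdims=True)
    max_neg = np.where(neg_all, sim, -np.inf).max(axis=1, keepdims=True)
    neg = neg_all & (sim + margin > min_pos)
    pos = pos_all & (sim - margin < max_neg)
    valid = pos.any(axis=1) & neg.any(axis=1)
    # Masked-out pairs are multiplied by 0 and add nothing to the row sums
    pos_term = np.log1p(np.sum(np.exp(-scale_pos * (sim - thresh)) * pos, axis=1)) / scale_pos
    neg_term = np.log1p(np.sum(np.exp(scale_neg * (sim - thresh)) * neg, axis=1)) / scale_neg
    return float(np.sum((pos_term + neg_term)[valid]) / len(labels))

# Toy batch: 8 L2-normalized features, 4 classes with 2 samples each
rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 4))
feats = feats / np.linalg.norm(feats, axis=1, keepdims=True)
labels = np.array([0, 0, 1, 1, 2, 2, 3, 3])

loop_value = ms_loss_loop(feats, labels)
vec_value = ms_loss_vectorized(feats, labels)
```

The two values should agree up to floating-point rounding, which gives a quick way to validate a vectorized rewrite against the loop before switching over.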
First of all thank you for sharing the code, this is really great work.
I am trying to reproduce the results on other datasets (CARS, SOP, In-Shop), with both resnet50 and inception-bn. Could you please share the hyperparameters that you used? I tried the default ones and a few tweaks, but I could not get the numbers in the paper.
Thank you!
Dear @mscottml, I have just read the paper "Multi-Similarity Loss with General Pair Weighting for Deep Metric Learning". It is some of the best work I have seen in metric learning.
Would you please release the code and share more details? Thank you very much.