shi-labs / self-similarity-grouping

Self-similarity Grouping: A Simple Unsupervised Cross Domain Adaptation Approach for Person Re-identification (ICCV 2019, Oral)

Python 99.87% Shell 0.13%

deep-learning computer-vision person-reidentification domain-adaptation

self-similarity-grouping's Issues

About the source_train.py

I ran source_train.py and got the following error:
Traceback (most recent call last):
File "source_train.py", line 311, in <module>
main(parser.parse_args())
File "source_train.py", line 238, in main
top1 = rank_score.allshots[0]
AttributeError: 'numpy.float64' object has no attribute 'allshots'
Why?
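
The traceback suggests that, in this run, the evaluation call returned a plain number rather than an object exposing `allshots`. A minimal sketch of a defensive read follows, assuming the return value is either a scalar score or an object with an `allshots` sequence. The helper name `top1_from` is hypothetical and not part of the repository.

```python
from types import SimpleNamespace


def top1_from(rank_score):
    """Return a rank-1 score from a scalar or from an object with `allshots`."""
    allshots = getattr(rank_score, "allshots", None)
    if allshots is not None:
        return float(allshots[0])
    # Assumption: a bare number is already the score we want.
    return float(rank_score)


# Works for both shapes of return value.
print(top1_from(0.73))
print(top1_from(SimpleNamespace(allshots=[0.80, 0.91])))
```

Whether a bare float really is the rank-1 score depends on which evaluator version is in use, so that should be checked before relying on it.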

Where is source_train.py?

Hi, I am reproducing your work. Could you please update the code and fix the error? Thanks.

ValueError: => No checkpoint found

When I run selftraining.py, it shows:
ValueError: => No checkpoint found at '/data/liangchen.song/models/torch/trained/model_best.pth.tar'
Can you give me some advice?
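
The path in the message does not look like one that would exist on the reporter's machine, so the checkpoint location probably has to be supplied explicitly (another traceback in this listing shows the script calling `load_checkpoint(args.resume)`). A small sketch of validating such a path before loading follows. `require_checkpoint` is a hypothetical helper, not a function from the repository.

```python
import os
import tempfile


def require_checkpoint(path):
    """Return `path` if it is an existing file, else raise a descriptive error."""
    if not os.path.isfile(path):
        raise FileNotFoundError(
            "No checkpoint found at '{}'; point the script at a model "
            "trained on the source dataset".format(path)
        )
    return path


# Demonstration with a temporary file standing in for a real checkpoint.
with tempfile.NamedTemporaryFile(suffix=".pth.tar") as tmp:
    print(require_checkpoint(tmp.name) == tmp.name)
```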

I ran into a problem when I tried to run source_train.py

Traceback (most recent call last):
File "source_train.py", line 315, in <module>
main(parser.parse_args())
File "source_train.py", line 241, in main
rank_score = evaluator.evaluate(val_loader, dataset.val, dataset.val)
File "/data0/network/SSG-master/SSG-master/reid/evaluators.py", line 190, in evaluate
features, _ = extract_features(self.model, data_loader, print_freq=self.print_freq)
File "/data0/network/SSG-master/SSG-master/reid/evaluators.py", line 28, in extract_features
outputs = extract_cnn_feature(model, imgs, for_eval)
File "/data0/network/SSG-master/SSG-master/reid/feature_extraction/cnn.py", line 16, in extract_cnn_feature
outputs = model(inputs, for_eval)[0]
File "/home/dongwh/anaconda3/envs/SSG/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/home/dongwh/anaconda3/envs/SSG/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 143, in forward
outputs = self.parallel_apply(replicas, inputs, kwargs)
File "/home/dongwh/anaconda3/envs/SSG/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 153, in parallel_apply
return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
File "/home/dongwh/anaconda3/envs/SSG/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 83, in parallel_apply
raise output
File "/home/dongwh/anaconda3/envs/SSG/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 59, in _worker
output = module(*input, **kwargs)
File "/home/dongwh/anaconda3/envs/SSG/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
TypeError: forward() takes 2 positional arguments but 3 were given
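
The message says the model's `forward` accepts one input besides `self`, while the caller in `cnn.py` passes two (`inputs` and `for_eval`). The mismatch can be reproduced without PyTorch. In the sketch below, `OldModel`, `NewModel` and `extract` are hypothetical stand-ins, and adding an optional `for_eval` parameter is one possible fix, not necessarily the authors'.

```python
class OldModel:
    def forward(self, x):
        return (x,)


class NewModel:
    # Hypothetical signature: accept the extra flag that the caller passes.
    def forward(self, x, for_eval=False):
        return (x, for_eval)


def extract(model, inputs, for_eval):
    # Same call shape as in the traceback: model(inputs, for_eval)[0]
    return model.forward(inputs, for_eval)[0]


try:
    extract(OldModel(), [1, 2], True)
except TypeError as err:
    print("old signature:", err)

print("new signature:", extract(NewModel(), [1, 2], True))
```

The other direction also works: keep the model as it is and stop passing `for_eval` at the call site.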

Market2Duke results

I ran the code on Market2Duke and Duke2Market. The Duke2Market result matches the reported numbers, while the Market2Duke result shows a drop in performance. The results are shown below. (I ran on Linux LTS 16.04 with pytorch 0.4.0 and python3.6.)

| SSG method | rank-1 | mAP |
|---|---|---|
| reported | 73.0% | 53.4% |
| observed | 70.2% | 49.8% |

| SSG++ method | rank-1 | mAP |
|---|---|---|
| reported | 76.0% | 60.3% |
| observed | 72.7% | 53.7% |

No changes were made to the training code. Can you please give me some advice on what the reasons might be? Thank you.

An error in eug.py

When I run semitrain.py, whether I use cluster or random sampling, it reports: No such file or directory: 'random_split/random_marker1501.pkl'. The failing code is in eug.py, line 406:
with open(load_path, "wb") as fp:
    pickle.dump({"label set": label_dataset, "unlabel set":unlabel_dataset}, fp)

Another question: what do the parser options --sample cluster and --sample random mean?
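
Opening a file in "wb" mode creates the file but not its parent directory, so this error is what one would expect if the `random_split/` directory does not exist yet. A minimal sketch of creating the directory first follows. `save_split` is a hypothetical wrapper around the two quoted lines, and the sample data is made up.

```python
import os
import pickle
import tempfile


def save_split(load_path, label_dataset, unlabel_dataset):
    """Pickle the two splits, creating the parent directory if needed."""
    parent = os.path.dirname(load_path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    with open(load_path, "wb") as fp:
        pickle.dump({"label set": label_dataset, "unlabel set": unlabel_dataset}, fp)


# Demonstration inside a throwaway directory.
with tempfile.TemporaryDirectory() as root:
    path = os.path.join(root, "random_split", "random_marker1501.pkl")
    save_split(path, [("a.jpg", 0)], [("b.jpg", 1)])
    with open(path, "rb") as fp:
        print(sorted(pickle.load(fp)))
```

Creating the `random_split` directory by hand before running the script should have the same effect.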

some problems in source_train.py

Hello, when I read source_train.py, I could not find the definition of the function "get_one_shot_in_cam1", so I cannot understand the meaning of the variables "l_data" and "u_data". Thank you.

So using re_ranking when pseudo label assignment is not using the re_ranking method?

parameter about dce_loss

In your training file:
parser.add_argument('--no-rerank', action='store_true', help="train without rerank")
parser.add_argument('--dce-loss', action='store_true', help="train without rerank")
Both args.dce_loss and args.no_rerank default to False, so reranking appears to be enabled by default. If I don't want to use reranking, how should I set --no-rerank and --dce-loss? I can't find any description of --dce-loss in your paper.
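
For reference, this is how `store_true` flags behave in `argparse`. The sketch reuses the two quoted `add_argument` lines verbatim and only shows the parsing side. What `--dce-loss` does during training is not stated in this thread, and its help string in the quoted snippet appears to be copied from `--no-rerank`.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--no-rerank', action='store_true', help="train without rerank")
parser.add_argument('--dce-loss', action='store_true', help="train without rerank")

# No flags given: both attributes are False.
default_args = parser.parse_args([])
print(default_args.no_rerank, default_args.dce_loss)   # False False

# Passing --no-rerank alone flips only that attribute.
no_rerank_args = parser.parse_args(['--no-rerank'])
print(no_rerank_args.no_rerank, no_rerank_args.dce_loss)   # True False
```

Note that the attribute is spelled `args.no_rerank`, because `argparse` converts the dash in the option name to an underscore.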

TypeError: Can't instantiate abstract class Euclidean with abstract methods get_metric, score_pairs

Hi, I ran into another problem when running source_train.py:

Files already downloaded and verified
Market1501 dataset loaded
subset | # ids | # images

train | 676 | 11744
val | 75 | 1192
trainval | 751 | 12936
query | 750 | 3368
gallery | 751 | 15913
Traceback (most recent call last):
File "/home/node3/xxxxxx/SSG/source_train.py", line 311, in <module>
main(parser.parse_args())
File "/home/node3/xxxxxx/SSG/source_train.py", line 196, in main
metric = DistanceMetric(algorithm=args.dist_metric)
File "/home/node3/xxxxxx/SSG/reid/dist_metric.py", line 13, in __init__
self.metric = get_metric(algorithm, *args, **kwargs)
File "/home/node3/xxxxxx/SSG/reid/metric_learning/__init__.py", line 25, in get_metric
return __factory[algorithm](*args, **kwargs)
TypeError: Can't instantiate abstract class Euclidean with abstract methods get_metric, score_pairs

Thanks in advance for your kind reply!
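
Python raises this `TypeError` when a class inherits abstract methods that it does not override. One plausible cause, not confirmed in this thread, is that the base class of `Euclidean` comes from an installed package whose version declares `get_metric` and `score_pairs` as abstract. The sketch below reproduces the mechanism with a hypothetical `BaseMetric`. `PatchedEuclidean` is an illustration, not the repository's class.

```python
from abc import ABC, abstractmethod


class BaseMetric(ABC):  # hypothetical stand-in for the real base class
    @abstractmethod
    def get_metric(self):
        ...

    @abstractmethod
    def score_pairs(self, pairs):
        ...


class Euclidean(BaseMetric):
    pass  # overrides nothing, so it cannot be instantiated


class PatchedEuclidean(BaseMetric):
    def get_metric(self):
        # Plain Euclidean distance between two equal-length sequences.
        return lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    def score_pairs(self, pairs):
        metric = self.get_metric()
        return [metric(a, b) for a, b in pairs]


try:
    Euclidean()
except TypeError as err:
    print(err)

print(PatchedEuclidean().score_pairs([((0, 0), (3, 4))]))
```

Installing the package version the code was written against would be the other route, if that version can be identified.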

Cannot obtain the reported performance by directly running the run.sh

Hi, first of all, thank you for releasing the code.

Following readme.md, we ran run.sh without any modification but failed to obtain the reported performance. Can you help us figure out what the problem is?

To be more specific, we obtained Mean AP: 54.3% and top1: 77.3% (best) after training for 30 epochs by running run.sh for SSG on Duke->Market1501, where the reported numbers are Mean AP: 58.3% and top1: 80.0%.

The generation of the pseudo labels.

In selftraining.py, line 276, generate_selflabel appears to generate the pseudo labels only once, from the pre-trained model, rather than at every iteration. Is that reasonable?

The dataset code is hard to understand

It is verbose and completely unnecessary. Who would start from extracting the archive files?

I think the results in the paper were obtained with re_ranking

features about rerank

Reranking is usually applied within a single dataset, but your rerank.py reranks source features together with target features. Why? Or is my understanding wrong? In that case, what do e_dist and r_dist mean? Are they still distances between samples in the target dataset?

Something in JointTrainer2 confuses me

Why is yt used as the label for the features ft, ft_up and ft_low? That differs from your paper.
Here is the code that calculates the loss Lsemi:
loss_global_eug, prec_global_eug = self.criterions[1](outputs_eug[1], pids_eug, epoch)
for i, output_p in enumerate(outputs_eug[0]):
    loss_tri, prec_tri = self.criterions[0](output_p, pids_eug, epoch)
    loss_os += loss_tri
It differs from the code for the loss Lssg:
loss_global, prec_global = self.criterions[1](outputs[1], pids[0], epoch)
for i, output_p in enumerate(outputs[0]):
    loss_tri, prec_tri = self.criterions[0](output_p, pids[i], epoch)
    loss_uns += loss_tri
I'm trying to reproduce your paper. Please help me, thank you.

Error on source_train.py

Hi,

I ran the code and got the following error.

Traceback (most recent call last):
File "source_train.py", line 311, in <module>
main(parser.parse_args())
File "source_train.py", line 237, in main
rank_score = evaluator.evaluate(val_loader, dataset.val, dataset.val)
File "/media/mahfuj/DATA/preid/SSG-master/reid/evaluators.py", line 190, in evaluate
features, _ = extract_features(self.model, data_loader, print_freq=self.print_freq)
File "/media/mahfuj/DATA/preid/SSG-master/reid/evaluators.py", line 28, in extract_features
outputs = extract_cnn_feature(model, imgs, for_eval)
File "/media/mahfuj/DATA/preid/SSG-master/reid/feature_extraction/cnn.py", line 16, in extract_cnn_feature
outputs = model(inputs, for_eval)[0]
File "/home/mahfuj/pytorch_new_python3/lib/python3.5/site-packages/torch/nn/modules/module.py", line 477, in __call__
result = self.forward(*input, **kwargs)
File "/home/mahfuj/pytorch_new_python3/lib/python3.5/site-packages/torch/nn/parallel/data_parallel.py", line 121, in forward
return self.module(*inputs[0], **kwargs[0])
File "/home/mahfuj/pytorch_new_python3/lib/python3.5/site-packages/torch/nn/modules/module.py", line 477, in __call__
result = self.forward(*input, **kwargs)
TypeError: forward() takes 2 positional arguments but 3 were given

Are you sure this code runs? There are a lot of mistakes

Hi, thank you for sharing.
But your code has a lot of mistakes; I'm inclined to say it can't execute unless we modify it. Please make sure you upload code that runs out of the box.
For example, here you used cross_entropy rather than self.cross_entropy:
https://github.com/OasisYang/SSG/blob/b607ab650d2e01a984861a041b7cd248e727d621/reid/loss/weight_cross_entropy.py#L21

In reid/loss/triplet.py, line 35, the code is unreachable because it is guarded by "if False"

module 'reid.models' has no attribute 'create'


Why is reranking so time-consuming?

When I run selftraining.py, calculating the source distance and the original distance takes a very long time. Why?

TypeError: Can't instantiate abstract class Euclidean with abstract methods get_metric, score_pairs

File "C:\Project\Self-Similarity-Grouping-master\reid\metric_learning\__init__.py", line 25, in get_metric
return __factory[algorithm](*args, **kwargs)
TypeError: Can't instantiate abstract class Euclidean with abstract methods get_metric, score_pairs

I hit this problem at Self-Similarity-Grouping-master\reid\metric_learning\__init__.py, line 25.

I think your code is taking up an unnecessary 6 GB of memory in selftraining.py

You don't need cluster_list on line 366. It takes up 6 GB because cluster.components_ is a matrix of shape (15xxx,16552), as you can see in site-packages/sklearn/cluster/dbscan_.py.
I think eps_list can have the same effect.
I don't know if I am right; please tell me.
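
A rough size estimate for one dense array of the reported shape can be sketched as follows. The row count 15000 is a hypothetical stand-in for the elided "15xxx", the 8-byte element size assumes float64, and how many such arrays are alive at once in selftraining.py is not established in this thread.

```python
def array_gib(rows, cols, itemsize=8):
    """Size in GiB of a dense rows x cols array with `itemsize` bytes per element."""
    return rows * cols * itemsize / 2**30


rows = 15000   # hypothetical: the report only says "15xxx"
cols = 16552   # as written in the report
per_array = array_gib(rows, cols)

# One array, and three arrays if a list keeps several clusterings alive (assumption).
print(round(per_array, 2), round(3 * per_array, 2))
```

Under these assumptions a single array is a little under 2 GiB, so a list holding a few fitted clustering objects could plausibly account for the reported figure.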

Errors everywhere in the code; not working. Please fix them and upload

There might be a bug in line 195 of SSG-master/reid/trainers.py

To my understanding, should it be "loss += loss_tri" instead of "loss + loss_tri"?
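
The difference between the two spellings can be shown in plain Python: `loss + loss_tri` computes a sum and discards it, while `loss += loss_tri` accumulates. The function names and numbers below are illustrative only and do not come from trainers.py.

```python
def total_discarding(parts):
    loss = 0.0
    for part in parts:
        loss + part  # expression statement: the sum is computed, then thrown away
    return loss


def total_accumulating(parts):
    loss = 0.0
    for part in parts:
        loss += part  # rebinding: the running total is kept
    return loss


parts = [0.5, 0.25, 0.25]
print(total_discarding(parts), total_accumulating(parts))   # 0.0 1.0
```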

Seems a typo in the paper

paper link
Table 2 says the model was trained on the DukeMTMC-reID dataset and tested on the Market1501 dataset, but the result in Table 2 (mAP: 53.4, R1: 73.0) is actually the Market1501 → DukeMTMC-ReID result from Table 1.

License question

Hi,

Thank you for the interesting paper and sharing the code.
I am currently working on a project based on your implementation, and before publishing it I need to figure out what kind of licence your work is under. Since I understood it's based on open-reid and DomainAdaptiveReID, whose licenses are MIT, I figured yours was also, although I can't find any license information in your project.

Thank you again for your great work,

Error while training with batch_size=32, double gpus, File "selftraining.py"

Hi,
Thank you for your great work and the provided code.
I've got a bug while following your suggestion with batch_size=32 and two gpus:

File "selftraining.py", line 452, in <module>
    main(parser.parse_args())
  File "selftraining.py", line 253, in main
    criterion, args.epochs, args.logs_dir, args.print_freq, iter_n, old_max_it, multi_disc=args.multi_disc)
  File "selftraining.py", line 286, in iter_trainer
    trainer.train(epoch, train_loader_list, optimizer, optimizer_disc = optimizer_disc)
  File "reid/trainers.py", line 618, in train
    loss, prec, losses_cam, precs_cam = self._forward(inputs, pids, epoch, camid)
  File "reid/trainers.py", line 685, in _forward
    loss_tri, prec_tri = self.criterions[0](output_p, pids[i], epoch)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/reid/loss/triplet.py", line 57, in forward
    dist_an.append(neg_examples.min().view(1))
RuntimeError: invalid argument 1: cannot perform reduction function min on tensor with no elements because the operation does not have an identity at /opt/conda/conda-bld/pytorch_1573049301898/work/aten/src/THC/generic/THCTensorMathReduce.cu:64

Do you have any clues about the error? And just to be sure: when you suggest 2 GPUs with batch size 32, do you mean batch_size=32 per GPU or across both GPUs?

Thank you again,
sincerely,
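
The error indicates that `neg_examples` was empty for some anchor, which would happen if every sample in the batch carried the same (pseudo) label as that anchor. A plain-Python sketch of guarding hard-negative selection follows. It is not the repository's PyTorch code, and `hardest_negative` is a hypothetical helper.

```python
def hardest_negative(dists, labels, anchor):
    """Smallest distance from `anchor` to a sample with a different label.

    Returns None when the batch holds no negatives for this anchor,
    instead of calling min() on an empty sequence.
    """
    negatives = [d for d, lab in zip(dists, labels) if lab != labels[anchor]]
    if not negatives:
        return None
    return min(negatives)


# Distances from anchor 0 to each sample in a made-up batch of three.
print(hardest_negative([0.0, 0.4, 0.9], [1, 1, 2], anchor=0))
print(hardest_negative([0.0, 0.4, 0.9], [1, 1, 1], anchor=0))
```

In a real triplet loss the anchors without negatives would then have to be skipped, or the sampler adjusted so that each batch contains at least two distinct labels.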

DukeMTMC-reID dataset download is broken, can you share this dataset download link?

Hi, thanks for your excellent work and for releasing your code.
However, I found that the DukeMTMC dataset download is broken. Can you share your download link for the data?

market1501_trained.pth.tar fails to load (downloaded from your Google Drive link)

Here is my problem: I downloaded your pretrained models from Google Drive.
dukemtmc.pth.tar loads successfully, but market1501 fails.
Any suggestions? Alternatively, I could retrain a Market1501 model using source_train.py.
Thanks!

Files already downloaded and verified
Market1501 dataset loaded
subset | # ids | # images

train | 676 | 11744
val | 75 | 1192
trainval | 751 | 12936
query | 750 | 3368
gallery | 751 | 15913
Files already downloaded and verified
DukeMTMC dataset loaded
subset | # ids | # images

train | 632 | 14923
val | 70 | 1599
trainval | 702 | 16522
query | 702 | 2228
gallery | 1110 | 17661
Resuming checkpoints from finetuned model on another dataset...

Traceback (most recent call last):
File "/home/node3/xxx/SSG/selftraining.py", line 405, in <module>
main(parser.parse_args())
File "/home/node3/xxx/SSG/selftraining.py", line 136, in main
checkpoint = load_checkpoint(args.resume)
File "/home/node3/xxx/SSG/reid/utils/serialization.py", line 34, in load_checkpoint
checkpoint = torch.load(fpath)['state_dict']
File "/home/node3/anaconda3/envs/py36_cu10/lib/python3.6/site-packages/torch/serialization.py", line 386, in load
return _load(f, map_location, pickle_module, **pickle_load_args)
File "/home/node3/anaconda3/envs/py36_cu10/lib/python3.6/site-packages/torch/serialization.py", line 573, in _load
result = unpickler.load()
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe8 in position 0: ordinal not in range(128)
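
This is the message Python 3's `pickle` gives when it meets a Python 2 byte string containing non-ASCII bytes, so one plausible cause (not confirmed here) is that this checkpoint was saved under Python 2. The sketch below reproduces the mechanism with a hand-written pickle and shows the `encoding` argument of `pickle.loads`. The traceback shows `torch.load` forwarding extra keyword arguments as `pickle_load_args`, so passing `encoding='latin1'` there is a commonly tried workaround; whether it fixes this particular file is untested.

```python
import pickle

# Hand-written protocol-2 pickle of a one-byte Python 2 `str` holding byte 0xe8.
py2_pickle = b"\x80\x02U\x01\xe8q\x00."

try:
    pickle.loads(py2_pickle)  # default encoding is ASCII
except UnicodeDecodeError as err:
    print("default:", err)

# latin1 maps every byte value to a code point, so decoding cannot fail.
value = pickle.loads(py2_pickle, encoding="latin1")
print(repr(value))
```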

SSG+&&SSG++

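Hello, I have just finished reading the paper and have not yet had time to read the code. I have some questions about the training process of SSG+ and SSG++. For SSG+, do you first train the SSG model to full convergence with the unsupervised method, then use the stable clusters to select the labeled reference set, assign a label to every sample in the target set, and then fine-tune SSG?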

The parameters in selftraining.py and in semitrain.py

SSG++: use rerank in both selftraining.py and semitrain.py
SSG: use rerank only in selftraining.py and do not run semitrain.py
Is my understanding right?

where is the script "source_train.py"?

I cannot find the "source_train.py" in this project, where can I find it?

Can you share the models trained on the Market and Duke datasets?

Why is the same compute_dist method used in both selftraining.py and semitraining.py?

I am confused about why your code for SSG computes the distance between the source features and the target ones, much as SSG+ does. Isn't Self-similarity Grouping only about grouping target features?
If you could, please point out the mistakes in my understanding of your paper or code.

Thanks for your great work!

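How does the PK sampler ensure that the three triplet losses can be computed when one image has different labels?

Hello, I have two questions.

First, each image has three labels, one from each of the three clustering results. The code does PK sampling on the first label to build a batch of images, which is then used to compute the triplet loss. But when the pseudo labels from the other two clustering results are applied, how do you guarantee that every image in the batch still has a positive sample in that batch, so that the triplet loss is computed correctly?
Second, the compute_dist() function in selftraining.py uses source-domain features to compute the distance. How should this be explained? I did not see it mentioned in the paper.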
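
The concern about positives can be made concrete with a small check: for a batch sampled by one label set, count how many images have no same-label partner under another label set. The sketch below uses made-up labels. `anchors_without_positive` is a hypothetical helper, and the batch does not come from the repository.

```python
from collections import Counter


def anchors_without_positive(batch_labels):
    """Count samples whose label appears only once in the batch."""
    counts = Counter(batch_labels)
    return sum(1 for label in batch_labels if counts[label] == 1)


# Hypothetical batch of 4 images: P=2 identities x K=2 images under the first label set.
global_labels = [0, 0, 1, 1]
# Hypothetical labels for the same 4 images from a second clustering.
upper_labels = [0, 2, 1, 3]

print(anchors_without_positive(global_labels))   # 0
print(anchors_without_positive(upper_labels))    # 4
```

Running such a check on real batches would show how often the second and third label sets leave an anchor without a positive.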
