kaiyangzhou / deep-person-reid
Torchreid: Deep learning person re-identification in PyTorch.
Home Page: https://kaiyangzhou.github.io/deep-person-reid/
License: MIT License
==> Test
Extracted features for query set, obtained 3368-by-2048 matrix
Extracted features for gallery set, obtained 15913-by-2048 matrix
==> BatchTime(s)/BatchSize(img): 0.014/64
Segmentation fault (core dumped)
This problem occurs during testing. Thanks.
In Table 1 of the paper, we can see the structure of the three parts of layers. There are some differences between your code and the paper. In Multi-scale-A, streams 3 and 4 seem to be wrong in the paper, because the size of the last layer's output does not match the size of the next layer's input. What do you think about it?
@luzai @KaiyangZhou Hi, I took the default DukeMTMC-VideoReID dataset and trained on it, but where does the model get saved in the end? Should I create a directory called save-model and pass it as an argument? Can you please share the command/process to view the saved model file?
Thanks for your repo!
Ref. commit 0adb4f2. I did not try multiprocess.Pool, but I tried Cython, and it speeds things up: the time for the function eval_market1501 on the Market1501 dataset drops from 163.857s to 7.393s.
The modifications are mainly: eval_market1501_wrap in eval.pyx, and setup.py to compile it.
eval.pyx may seem lengthy, but I ran some basic tests, and it computes approximately the same mAP and CMC as the original eval_market1501 function. I am sorry that it still seems to have some precision problems; I will try to fix them as soon as possible.
# cython: boundscheck=False, wraparound=False, nonecheck=False, cdivision=True
import numpy as np
cimport cython

cpdef eval_market1501_wrap(distmat,
                           q_pids,
                           g_pids,
                           q_camids,
                           g_camids,
                           max_rank):
    distmat = np.asarray(distmat, dtype=np.float32)
    q_pids = np.asarray(q_pids, dtype=np.int64)
    g_pids = np.asarray(g_pids, dtype=np.int64)
    q_camids = np.asarray(q_camids, dtype=np.int64)
    g_camids = np.asarray(g_camids, dtype=np.int64)
    return eval_market1501(distmat, q_pids, g_pids, q_camids, g_camids, max_rank)

cpdef eval_market1501(
        float[:,:] distmat,
        long[:] q_pids,
        long[:] g_pids,
        long[:] q_camids,
        long[:] g_camids,
        long max_rank,
):
    cdef:
        long num_q = distmat.shape[0], num_g = distmat.shape[1]
    if num_g < max_rank:
        max_rank = num_g
        print("Note: number of gallery samples is quite small, got {}".format(num_g))
    cdef:
        long[:,:] indices = np.argsort(distmat, axis=1)
        long[:,:] matches = (np.asarray(g_pids)[np.asarray(indices)] == np.asarray(q_pids)[:, np.newaxis]).astype(np.int64)
        float[:,:] all_cmc = np.zeros((num_q, max_rank), dtype=np.float32)
        float[:] all_AP = np.zeros(num_q, dtype=np.float32)
        long q_pid, q_camid
        long[:] order = np.zeros(num_g, dtype=np.int64), keep = np.zeros(num_g, dtype=np.int64)
        long num_valid_q = 0, q_idx, idx
        float[:] orig_cmc = np.zeros(num_g, dtype=np.float32)
        float[:] cmc = np.zeros(num_g, dtype=np.float32), tmp_cmc = np.zeros(num_g, dtype=np.float32)
        long num_orig_cmc = 0  # valid size of orig_cmc, cmc and tmp_cmc
        float num_rel = 0.
        float tmp_cmc_sum = 0.
        unsigned int orig_cmc_flag = 0

    for q_idx in range(num_q):
        # get query pid and camid
        q_pid = q_pids[q_idx]
        q_camid = q_camids[q_idx]

        # remove gallery samples that have the same pid and camid as the query
        order = indices[q_idx]
        for idx in range(num_g):
            keep[idx] = (g_pids[order[idx]] != q_pid) | (g_camids[order[idx]] != q_camid)

        # compute cmc curve
        num_orig_cmc = 0
        orig_cmc_flag = 0
        for idx in range(num_g):
            if keep[idx]:
                orig_cmc[num_orig_cmc] = matches[q_idx, idx]
                num_orig_cmc += 1
                orig_cmc_flag = 1
        if not orig_cmc_flag:
            # this condition is true when the query identity does not appear in the gallery
            continue
        my_cusum(orig_cmc, cmc, num_orig_cmc)
        for idx in range(num_orig_cmc):
            if cmc[idx] > 1:
                cmc[idx] = 1
        all_cmc[q_idx] = cmc[:max_rank]
        num_valid_q += 1

        # compute average precision
        # reference: https://en.wikipedia.org/wiki/Evaluation_measures_(information_retrieval)#Average_precision
        num_rel = 0.
        for idx in range(num_orig_cmc):
            num_rel += orig_cmc[idx]
        my_cusum(orig_cmc, tmp_cmc, num_orig_cmc)
        for idx in range(num_orig_cmc):
            tmp_cmc[idx] = tmp_cmc[idx] / (idx + 1.) * orig_cmc[idx]
        tmp_cmc_sum = my_sum(tmp_cmc, num_orig_cmc)
        if num_rel < 1e-32:
            all_AP[q_idx] = 0
        else:
            all_AP[q_idx] = tmp_cmc_sum / num_rel

    assert num_valid_q > 0, "Error: all query identities do not appear in gallery"
    return np.mean(all_AP), np.asarray(all_cmc).astype(np.float32).sum(axis=0) / num_valid_q

cpdef void my_cusum(
        cython.numeric[:] src,
        cython.numeric[:] dst,
        long size
) nogil:
    cdef:
        long idx
    for idx in range(size):
        if idx == 0:
            dst[idx] = src[idx]
        else:
            dst[idx] = src[idx] + dst[idx - 1]

cpdef cython.numeric my_sum(
        cython.numeric[:] src,
        long size
) nogil:
    cdef:
        long idx
        cython.numeric ttl = 0
    for idx in range(size):
        ttl += src[idx]
    return ttl
import numpy as np
from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext

numpy_include = np.get_include()

ext_modules = [
    Extension("cython_eval",
              ["eval.pyx"],
              libraries=["m"],
              include_dirs=[numpy_include],
              extra_compile_args=["-ffast-math", "-Wno-cpp", "-Wno-unused-function"]),
]

setup(
    name='lib',
    cmdclass={"build_ext": build_ext},
    ext_modules=ext_modules)
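As a sanity check on the precision issue mentioned above, the per-query AP loop in eval.pyx can be compared against a plain NumPy reference. This is a hypothetical helper, not part of the repo; orig_cmc stands for the binary match vector over the kept gallery entries:

```python
import numpy as np

# Vectorized reference for the per-query average precision computed in eval.pyx:
# precision at each rank, masked to the relevant positions, averaged over the
# number of relevant gallery items.
def average_precision(orig_cmc):
    orig_cmc = np.asarray(orig_cmc, dtype=np.float64)
    num_rel = orig_cmc.sum()
    if num_rel < 1e-32:
        return 0.0
    tmp_cmc = orig_cmc.cumsum() / np.arange(1, len(orig_cmc) + 1) * orig_cmc
    return float(tmp_cmc.sum() / num_rel)
```

Comparing this against the Cython output on a handful of queries should localize where the precision drift comes from (e.g. float32 accumulation in the typed buffers).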
Hi! Thanks for the evaluation baseline.
Can we use cuhk03_eval to evaluate the VIPeR dataset?
Thanks.
Currently using GPU 0
Initializing dataset prid
Initializing model: resnet50
Model size: 23.69039M
Loading checkpoint from 'saved-models/resnet50_xent_prid.pth.tar'
Evaluate only
Extracted features for query set, obtained 89-by-2048 matrix
Traceback (most recent call last):
File "train_vid_model_xent.py", line 269, in <module>
main()
File "train_vid_model_xent.py", line 150, in main
test(model, queryloader, galleryloader, args.pool, use_gpu)
File "train_vid_model_xent.py", line 232, in test
imgs = imgs.view(b*s, c, h, w)
RuntimeError: invalid argument 2: size '[15 x 3 x 256 x 128]' is invalid for input with 2949120 elements at /pytorch/torch/lib/TH/THStorage.c:41
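The numbers in the error are self-consistent, which points at the cause: the flattened batch holds more frames than the view() call accounts for. A small diagnostic sketch (the 15 in the message is b*s, i.e. the batch-size/seq-len split, which evidently does not match the loaded clip length):

```python
# The tensor has 2,949,120 values; one 3-channel 256x128 frame has 98,304,
# so the data actually contains 30 frames while view() asked for only 15.
elements = 2949120
frame_numel = 3 * 256 * 128            # values per frame
actual_frames = elements // frame_numel  # frames really present in the batch
requested = 15 * frame_numel             # what imgs.view(b*s, c, h, w) asked for
```

So the batch/seq_len product passed to view() is half the number of frames actually loaded; checking how b and s are computed for this dataset should reveal the mismatch.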
Thank you for providing the code; I learned a lot from it. However, I ran into some problems when training HACNN with xent loss. I set 120 epochs, and the parameters are:
--height 160 --width 64 --max-epoch 120 --train-batch 32 --test-batch 32 --stepsize 20 --eval-step 20
However, the Rank-1 result on Market-1501 is only 82.1% with 61.2% mAP, which is worse than your reported results. When I tried HACNN with xent+htri, the results at epoch 150 were only 57.2% Rank-1 with 35.6% mAP. I don't know whether the hyperparameters are set correctly. Could you tell me how you set the parameters when training HACNN?
Thanks!
When I try to run the example as specified in the README, like this:
python train_imgreid_xent.py -d market1501 -a resnet50 --evaluate --resume saved-models/resnet50_xent_market1501.pth.tar --save-dir log/resnet50-xent-market1501 --test-batch 100 --gpu-devices 0
I get the following error:
Warning: Cython evaluation is UNAVAILABLE
==========
Args:Namespace(arch='resnet50', cuhk03_classic_split=False, cuhk03_labeled=False, dataset='market1501', eval_step=-1, evaluate=True, fixbase_epoch=0, fixbase_lr=0.0003, freeze_bn=False, gamma=0.1, gpu_devices='0', height=256, load_weights='', lr=0.0003, max_epoch=60, optim='adam', print_freq=10, resume='saved-models/resnet50_xent_market1501.pth.tar', root='data', save_dir='log/resnet50-xent-market1501', seed=1, split_id=0, start_epoch=0, start_eval=0, stepsize=[20, 40], test_batch=100, train_batch=32, use_cpu=False, use_lmdb=False, use_metric_cuhk03=False, vis_ranked_res=False, weight_decay=0.0005, width=128, workers=4)
==========
Currently using GPU 0
Initializing dataset market1501
=> Market1501 loaded
Dataset statistics:
------------------------------
subset | # ids | # images
------------------------------
train | 751 | 12936
query | 750 | 3368
gallery | 751 | 15913
------------------------------
total | 1501 | 32217
------------------------------
Initializing model: resnet50
Model size: 23.508 M
Traceback (most recent call last):
File "train_imgreid_xent.py", line 379, in <module>
main()
File "train_imgreid_xent.py", line 196, in main
checkpoint = torch.load(args.resume)
File "/opt/conda/lib/python3.6/site-packages/torch/serialization.py", line 303, in load
return _load(f, map_location, pickle_module)
File "/opt/conda/lib/python3.6/site-packages/torch/serialization.py", line 469, in _load
result = unpickler.load()
UnicodeDecodeError: 'ascii' codec can't decode byte 0xa9 in position 1: ordinal not in range(128)
It appears that something is wrong with the reading of the saved model.
A Google search yields this SO answer; however, it is only valid for Python 2, not the Python 3 I am using. (I tried it, but it did not work.)
Is it possible that the code is not Python 3 compatible? Or is there something else?
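The checkpoint itself is fine; it was pickled under Python 2, whose str objects are raw bytes, and Python 3's unpickler decodes them as ASCII by default. A minimal sketch of the failure and the usual workaround, using a hand-built Python 2-style pickle (in recent PyTorch versions the same keyword can be forwarded as torch.load(path, encoding='latin1'); whether your torch build accepts it is version-dependent):

```python
import pickle

# A protocol-2 pickle of a Python-2 str containing the bytes 0xa9 0xaa,
# i.e. the same kind of payload that triggers the UnicodeDecodeError above.
PY2_PICKLE = b"\x80\x02U\x02\xa9\xaaq\x00."

def load_py2_pickle(data):
    # encoding='latin1' maps every Python-2 str byte to a character one-to-one,
    # so old byte-string payloads unpickle without an ASCII decode error.
    return pickle.loads(data, encoding="latin1")
```

If your PyTorch version does not forward the encoding argument, re-saving the checkpoint from a Python 2 environment as a plain state_dict also sidesteps the issue.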
Hello, following your environment (PyTorch 0.4.0, torchvision 0.2.1, Python 2) and method, I reproduced the cross entropy loss + triplet loss re-id on the market1501 dataset, but the final results are only:
Single GPU: Rank-1/Rank-5/Rank-10: 81.0/92.6/95.4, mAP: 63.1
Multi (4) GPU: Rank-1/Rank-5/Rank-10: 43.2/66.7/75.8, mAP: 24.2
Reproducing cross entropy loss alone reaches your results normally; the situation above only appears after adding triplet loss. Testing with the pretrained model you provide also reaches the normal Rank-1 of 87.
Where might the problem be?
Thanks
Hello, I set the number of epochs to 30, but it stops after running 20 epochs (on the market1501 dataset as well). Is there any problem with the following commands?
The error hints are as follows:
Epoch: [20][3970/3983] Time 0.359 (0.362) Data 0.009 (0.008) Loss 1.0619 (1.0747)
Epoch: [20][3980/3983] Time 0.341 (0.362) Data 0.008 (0.008) Loss 1.0882 (1.0747)
==> Test
Extracted features for query set, obtained 1980-by-2048 matrix
Extracted features for gallery set, obtained 9330-by-2048 matrix
==> BatchTime(s)/BatchSize(img): 0.109/960
[3]+ Killed
The commands are as follows:
python train_img_model_xent.py -d market1501 -a resnet50 --max-epoch 30 --train-batch 128 --test-batch 64 --stepsize 20 --eval-step 20 --save-dir log/resnet50-xent-market1501 --gpu-devices 3
python train_vid_model_xent.py -d mars -a resnet50 --max-epoch 30 --train-batch 128 --test-batch 64 --stepsize 20 --eval-step 20 --save-dir log/resnet50-xent-mars --gpu-devices 0
Dear author, thanks a lot for your excellent work. If I may, can I ask you to provide the details of your hyperparameter settings? For example, I am training MobileNetV2 on Market1501: when I use the default settings, such as lr=0.0003, training only reaches 11.9% accuracy. With lr=0.01, decayed by 0.1 every 20 epochs, training accuracy hits 98%, but the test performance is 68.1%/85.5%/90.4% (Rank-1/5/10) with 44.6% mAP, which is much lower than your results. Can you point out what's wrong? FYI: I didn't use cross entropy with label smoothing; I used standard cross entropy. Will this have an impact?
When I run HACNN on Duke MTMC using
python train_vidreid_xent.py -d dukemtmcvidreid -a hacnn --evaluate --resume saved-models/hacnn_xent_dukemtmcreid.pth.tar --save-dir log/resnet50-xent-dukemtmc --test-batch 2 --gpu-devices 0
I get the following error:
Warning: Cython evaluation is UNAVAILABLE
==========
Args:Namespace(arch='hacnn', dataset='dukemtmcvidreid', eval_step=-1, evaluate=True, fixbase_epoch=0, fixbase_lr=0.0003, freeze_bn=False, gamma=0.1, gpu_devices='0', height=256, label_smooth=False, load_weights='', lr=0.0003, max_epoch=15, optim='adam', pool='avg', print_freq=10, resume='saved-models/hacnn_xent_dukemtmcreid.pth.tar', root='data', save_dir='log/resnet50-xent-dukemtmc', seed=1, seq_len=15, start_epoch=0, start_eval=0, stepsize=[20, 40], test_batch=2, train_batch=32, use_cpu=False, vis_ranked_res=False, weight_decay=0.0005, width=128, workers=4)
==========
Currently using GPU 0
Initializing dataset dukemtmcvidreid
This dataset has been downloaded.
Note: if root path is changed, the previously generated json files need to be re-generated (so delete them first)
=> Automatically generating split (might take a while for the first time, have a coffe)
Processing data/dukemtmc-vidreid/DukeMTMC-VideoReID/train with 702 person identities
Saving split to data/dukemtmc-vidreid/split_train.json
=> Automatically generating split (might take a while for the first time, have a coffe)
Processing data/dukemtmc-vidreid/DukeMTMC-VideoReID/query with 702 person identities
Saving split to data/dukemtmc-vidreid/split_query.json
=> Automatically generating split (might take a while for the first time, have a coffe)
Processing data/dukemtmc-vidreid/DukeMTMC-VideoReID/gallery with 1110 person identities
Warn: index name F0001 in data/dukemtmc-vidreid/DukeMTMC-VideoReID/gallery/0002/2197 is missing, jump to next
Saving split to data/dukemtmc-vidreid/split_gallery.json
=> DukeMTMC-VideoReID loaded
Dataset statistics:
------------------------------
subset | # ids | # tracklets
------------------------------
train | 702 | 2196
query | 702 | 702
gallery | 1110 | 2636
------------------------------
total | 1404 | 5534
number of images per tracklet: 1 ~ 9324, average 167.6
------------------------------
Initializing model: hacnn
Model size: 3.649 M
Loaded checkpoint from 'saved-models/hacnn_xent_dukemtmcreid.pth.tar'
- start_epoch: 299
- rank1: 0.8070017695426941
Evaluate only
Traceback (most recent call last):
File "train_vidreid_xent.py", line 395, in <module>
main()
File "train_vidreid_xent.py", line 216, in main
distmat = test(model, queryloader, galleryloader, args.pool, use_gpu, return_distmat=True)
File "train_vidreid_xent.py", line 328, in test
features = model(imgs)
File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "/opt/conda/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 112, in forward
return self.module(*inputs[0], **kwargs[0])
File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "/data/rooijenalv/Projects/gitlab-dv/projects/fietsreid/deployment/external/deep-person-reid/torchreid/models/hacnn.py", line 283, in forward
"Input size does not match, expected (160, 64) but got ({}, {})".format(x.size(2), x.size(3))
AssertionError: Input size does not match, expected (160, 64) but got (256, 128)
How can I fix this?
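The checkpoint was trained at 160x64 while the evaluation command loads data at the 256x128 default, so the fix is to evaluate at the model's input size (the --height 160 --width 64 flags appear in an earlier issue above). A minimal sketch mirroring the assert in torchreid/models/hacnn.py:

```python
# HACNN is hard-wired for 160x64 inputs; this reproduces the check that fires
# in hacnn.py when the data loader feeds it 256x128 images.
def check_hacnn_input(height, width):
    assert (height, width) == (160, 64), \
        "Input size does not match, expected (160, 64) but got ({}, {})".format(height, width)
    return True
```

Re-running the evaluation command with --height 160 --width 64 should therefore make the forward pass succeed.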
@luzai @KaiyangZhou Hi, I just wanted to know how to test the trained model on a video. Since the trained model has the extension "pth.tar", is there any other method we can use to test it?
Please share the command. Thank you.
I don't understand how you generate triplet images to train the model. It seems that you rewrote the sampler class. But why did you set num_instances to 4? Thanks for your answer.
Is there a way to prepare our own dataset and test the models on it? Thanks
I've met python error:
==> Start training
Traceback (most recent call last):
File "train_img_model_xent.py", line 299, in <module>
main()
File "train_img_model_xent.py", line 167, in main
train(epoch, model, criterion, optimizer, trainloader, use_gpu)
File "train_img_model_xent.py", line 212, in train
outputs = model(imgs)
File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 357, in call
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/torch/nn/parallel/data_parallel.py", line 69, in forward
inputs, kwargs = self.scatter(inputs, kwargs, self.device_ids)
File "/usr/local/lib/python2.7/dist-packages/torch/nn/parallel/data_parallel.py", line 80, in scatter
return scatter_kwargs(inputs, kwargs, device_ids, dim=self.dim)
File "/usr/local/lib/python2.7/dist-packages/torch/nn/parallel/scatter_gather.py", line 38, in scatter_kwargs
inputs = scatter(inputs, target_gpus, dim) if inputs else []
File "/usr/local/lib/python2.7/dist-packages/torch/nn/parallel/scatter_gather.py", line 31, in scatter
return scatter_map(inputs)
File "/usr/local/lib/python2.7/dist-packages/torch/nn/parallel/scatter_gather.py", line 18, in scatter_map
return list(zip(*map(scatter_map, obj)))
File "/usr/local/lib/python2.7/dist-packages/torch/nn/parallel/scatter_gather.py", line 16, in scatter_map
assert not torch.is_tensor(obj), "Tensors not supported in scatter."
AssertionError: Tensors not supported in scatter.
I encounter the following error:
TypeError: test() missing 1 required positional argument: 'use_gpu'
when running:
python train_vidreid_xent.py -d dukemtmcvidreid -a hacnn --evaluate --resume saved-models/hacnn_xent_dukemtmcreid.pth.tar --save-dir log/resnet50-xent-dukemtmc --test-batch 2 --gpu-devices 0
Changing line 214 in file train_vidreid_xent.py from:
distmat = test(model, queryloader, galleryloader, use_gpu, return_distmat=True)
to
distmat = test(model, queryloader, galleryloader, 'avg', use_gpu, return_distmat=True)
or to
distmat = test(model, queryloader, galleryloader, 'max', use_gpu, return_distmat=True)
Seems to solve it.
Perhaps the pool param was forgotten?
The HACNN download link is unavailable on the benchmark page.
deep-person-reid/eval_metrics.py
Line 25 in b19c5a3
Traceback (most recent call last):
File "eval_metrics.py", line 25, in eval_cuhk03
matches = (g_pids[indices] == q_pids[:, np.newaxis]).astype(np.int32)
IndexError: index 2 is out of bounds for axis 0 with size 1
Kindly check this
Hi, my understanding of samplers.py is that you are trying to use every image in data_source, only limiting the identities in each batch.
So I was wondering whether there is a problem in train_vid_xent_htri.py when it defines RandomIdentitySampler for the trainloader:
RandomIdentitySampler(dataset.train ...
Should it be:
RandomIdentitySampler(new_train, ...
Thanks very much for your excellent work! It helps me a lot!
I tried to run the test on the MARS dataset with the provided trained model densenet121_xent_htri_mars.pth.tar, using this command:
python train_vid_model_xent_htri.py -d mars -a densenet121 --evaluate --resume saved-models/densenet121_xent_htri_mars.pth.tar --save-dir log/densenet121-xent-htri-mars --test-batch 2
And I got the following error info:
RuntimeError: Error(s) in loading state_dict for DenseNet121:
Missing key(s) in state_dict: "base.denseblock1.denselayer1.norm1.running_var",
...
(for detailed info, please refer to the console log)
1) Although I had already put the provided model file in the directory deep-person-reid/saved-models/, the console log shows that PyTorch still automatically downloaded a pre-trained model from "https://download.pytorch.org/models/densenet121-a639ec97.pth" to /home/user/.torch/models/densenet121-a639ec97.pth. After the download completed, PyTorch loaded the provided model densenet121_xent_htri_mars.pth.tar (see the log for details).
It seems that the provided model densenet121_xent_htri_mars.pth.tar, which contains the model's parameters ONLY, is not consistent with the model auto-downloaded from download.pytorch.org.
2) Why did PyTorch auto-download a pre-trained model from download.pytorch.org before loading densenet121_xent_htri_mars.pth.tar? Because in the __init__ function of DenseNet.py, "pretrained" is set to True, on this line:
densenet121 = torchvision.models.densenet121(pretrained=True)
3) I tried to run the test on the MARS dataset with the ResNet model (resnet50_xent_mars.pth.tar). PyTorch again automatically downloaded a pre-trained model from download.pytorch.org, but this time NO runtime error occurred. Command:
python train_vid_model_xent.py -d mars -a resnet50 --evaluate --resume saved-models/resnet50_xent_mars.pth.tar --save-dir log/resnet50-xent-mars --test-batch 2
I have found a similar issue (#4), but it doesn't solve this problem.
Does anyone know how to solve it? Thanks in advance.
May 30, 2018
File Name : densenet121_xent_htri_mars.pth.tar
URL : http://www.eecs.qmul.ac.uk/~kz303/deep-person-reid/model-zoo/video-models/densenet121_xent_htri_mars.pth.tar
MD5 : 544FFC7520B5719B2F63CAA44F412F49
Ubuntu 14.04 x64
Anaconda 2 4.4.0 x86_64
Python 2.7.13
PyTorch 0.4.0
torchvision-cpu 0.2.1
use CPU only (no CUDA installed)
Because the result of my training is always about 3% worse than the result you provided, I guess it's down to the training parameters. My training parameters are as follows:
python train_vid_model_xent_htri.py -d mars -a resnet50m --max-epoch 500 --train-batch 128 --test-batch 32 --stepsize 200 --eval-step 20 --save-dir log/resnet50m-xent-htri-mars --gpu-devices 0
Hello, why is the gap in len(trainloader) between xent and xent+htri so large? The former is 3983 while the latter is only 19, and train_batch is 128 for both. Is that normal? If so, can you explain why?
Dear author.
Can you publish the hyperparameters you used to get 90.7 | 97.0 | 98.2 | 76.8 (Rank-1, Rank-5, Rank-10, mAP) on market1501? I have tried the default configuration and only got 81.4 | 92.5 | 95.4 | 62.7; it gets stuck at that point.
Thanks a lot for the great work! :)
Thank you for your great work on person re-identification. The table makes it clear which model can achieve what performance. I want to ask: have you kept timing logs for these models? Could you share them (for example, forward time per batch size)?
Thank you again for your great work.
Yours,
Hao
densenet121_xent_htri_mars.pth.tar was not found on this server
The link for "densenet121_xent_htri_mars.pth.tar" is broken.
Hi!
I trained your hacnn architecture and ended up with:
Rank-1 : 86.1%
Rank-5 : 94.7%
Rank-10 : 96.6%
Rank-20 : 97.8%
mAP: 67.3%
which differ from the original results reported in the paper: 91.2 for Rank-1 and 75.7 for mAP.
Please, I want to make sure that this is the expected result and that you are getting the same.
Thanks
Just wondering, has anyone tried? Do you think it would be useful to try it or the results will be awful?
Is there a validation set used for choosing the best model before testing the accuracy on the test set?
From what I see from the code, the model with Best Rank 1 is chosen based on test set result. Won't this mean that the Best Rank 1 result is overfitting on the test set?
Hello @KaiyangZhou,
Thank you for this repo. I get an error while trying to evaluate some of the pretrained models. Here are the details:
This is my first time trying this repo. In a clean conda python2.7 environment, I installed PyTorch and torchvision, and the installations seem fine. I prepared the Market1501 dataset and downloaded some pretrained model weights as described in the readme. Then I tried to evaluate the ResNet50M xent+htri pretrained model for Market1501. I used the command as:
python train_img_model_xent_htri.py -d market1501 -a resnet50m --evaluate --resume saved-models/resnet50m_xent_htri_market1501.pth.tar --save-dir log/resnet50m-xenthtri-market1501 --test-batch 32
I got the following error logs:
==========
Args:Namespace(arch='resnet50m', cuhk03_classic_split=False, cuhk03_labeled=False, dataset='market1501', eval_step=-1, evaluate=True, gamma=0.1, gpu_devices='0', height=256, htri_only=False, lr=0.0003, margin=0.3, max_epoch=180, num_instances=4, optim='adam', print_freq=10, resume='saved-models/resnet50m_xent_htri_market1501.pth.tar', root='data', save_dir='log/resnet50m-xenthtri-market1501', seed=1, split_id=0, start_epoch=0, stepsize=60, test_batch=32, train_batch=32, use_cpu=False, use_metric_cuhk03=False, weight_decay=0.0005, width=128, workers=4)
==========
Currently using GPU 0
Initializing dataset market1501
=> Market1501 loaded
Dataset statistics:
------------------------------
subset | # ids | # images
------------------------------
train | 751 | 12936
query | 750 | 3368
gallery | 751 | 15913
------------------------------
total | 1501 | 32217
------------------------------
Traceback (most recent call last):
File "train_img_model_xent_htri.py", line 295, in <module>
main()
File "train_img_model_xent_htri.py", line 115, in main
T.Resize((args.height, args.width)),
AttributeError: 'module' object has no attribute 'Resize'
I thought this might be a problem with torchvision, but the latest version of it (0.2.1) is installed. What am I missing here? Can you give some guidance?
I am trying to reproduce the HACNN result. I downloaded the CUHK03 dataset and processed it as described in the README. However, when I apply the HACNN train command described in the README, I get a Python error.
I am a beginner in ReID. I am running the repo on my personal AWS K80 GPU Ubuntu 16.04 LTS server, with Python 3.6.4 in Anaconda3, torch 0.4.0, and torchvision 0.2.1.
Where do you think the problem is? Can you give me some hints, based on the capture image above?
Thanks.
Hi, I used the default settings when training resnet50 with cross-entropy-label-smooth loss on Market1501, i.e. --max-epoch 60 --stepsize 20 40, and got mAP 67.3, Rank-1 85.2. I then tried --max-epoch 180 --stepsize 60, but my results still seem to be much worse than yours.
Thank you very much.
@luzai @KaiyangZhou Hi, what is the command to perform incremental training? Let's say I trained my net for 20 epochs and got the weights; now I want to train it for another 30 epochs, resuming from the 20th epoch. How can we do that?
Hi,
I was able to run the testing command without any errors, but how do I visualise the results? How can I see the performance on our own videos or real-time video streams? Where can I find the test results for market1501 tested with the resnet50 model?
Any help would be appreciated !!
Thanks.
Thanks for providing the elegant code.
When I trained densenet121 with xent+htri loss, I set 80 epochs.
I trained it three times but did not get good results:
batch size = 32, epoch = 80: Rank-1 = 60.6%
batch size = 16, epoch = 80: Rank-1 = 61.2%
batch size = 48, epoch = 60: Rank-1 = 58.4%
I don't know why my results are not as good as yours. Can you tell me how you set the parameters when training densenet121?
Thanks
Hello, thank you for your code; it has helped a beginner like me a lot.
May I ask what method is generally used to visualize feature maps in papers such as HACNN?
Hi, I'd like to use your pre-trained models to finetune a Re-ID model, but I can't extract the *.tar files you uploaded. Is anything wrong? Looking forward to your reply, thanks.
Hi, I noticed that all the models on the main page are gone. Did you move them to another server? Are they available? Thank you.
I want to train other nets with weights pretrained on ImageNet; how can I get them?
@luzai @KaiyangZhou Hi, first of all, thanks for the wonderful code. I have a few queries about training on the DukeMTMC-VideoReID dataset:
1. When I download the dataset from the source, I only have train, gallery, and video folders; there is no .json file. Can you share how to generate the .json file? It gives me this error:
camid = int(img_name[5]) - 1 # index-0
ValueError: invalid literal for int() with base 10: 'C'
Can you please help me solve this issue?
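The error shows the fixed-position parse int(img_name[5]) landing on the letter 'C' rather than the camera digit, i.e. this download's filenames are laid out differently from what the loader expects. A hedged workaround sketch (parse_camid is a hypothetical helper, and the exact filename layout is an assumption to verify against your files):

```python
import re

# Parse the camera id by searching for the 'C<digits>' token instead of a
# fixed character position, so small filename-layout differences don't break.
def parse_camid(img_name):
    m = re.search(r'C(\d+)', img_name)
    if m is None:
        raise ValueError("no camera id found in {}".format(img_name))
    return int(m.group(1)) - 1  # index-0, matching the repo's convention
```

Printing one offending img_name first will confirm where the 'C' token actually sits before patching the dataset loader.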
I ran this code (train_img_model_xent_htri.py) on Ubuntu 16.04 with Python 3.5 and got this problem:
Loading checkpoint from 'saved-models/resnet50_xent_htri_market1501.pth.tar'
Traceback (most recent call last):
File "train_img_model_xent_htri.py", line 290, in <module>
main()
File "train_img_model_xent_htri.py", line 155, in main
model.load_state_dict(checkpoint['state_dict'])
KeyError: 'state_dict'
What does this mean? I downloaded resnet50-19c8e357.pth, renamed it resnet50_xent_htri_market1501.pth.tar, and put it into the saved-models folder.
I don't know what to do.
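The KeyError comes from the file's contents, not the rename: the repo's --resume checkpoints are dicts with a 'state_dict' key (plus metadata like the epoch), while resnet50-19c8e357.pth is torchvision's bare ImageNet state_dict, so checkpoint['state_dict'] cannot succeed — and note that ImageNet weights are not a trained re-id model in any case. A tolerant-load sketch with a hypothetical helper:

```python
# The repo's saved checkpoints look like {'state_dict': ..., 'epoch': ...};
# a raw torchvision weight file *is* the state_dict itself. Accept both.
def extract_state_dict(checkpoint):
    if isinstance(checkpoint, dict) and 'state_dict' in checkpoint:
        return checkpoint['state_dict']
    return checkpoint  # assume the object is already a bare state_dict
```

For evaluation you still need the actual re-id checkpoint from the model zoo; the helper only makes the failure mode explicit.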
Hello @KaiyangZhou , thank you for make this project.
I tried to run the test on the prid2011 dataset using the ResNet50M pretrained model you provide, but got an error:
UnicodeDecodeError: 'ascii' codec can't decode byte 0x9b in position 1: ordinal not in range(128)
I think this error appears because the pretrained model resnet50m_xent_prid.pth.tar can only be opened on Microsoft Windows (I don't know whether that is true or not), because when I type file resnet50m_xent_prid.pth.tar I get this message:
resnet50m_xent_prid.pth.tar: 8086 relocatable (Microsoft)
How can I use that pretrained model on Linux, given that PyTorch can't be installed on Windows?
Thank you :)
When I run the hacnn model, I only get Rank-1 84.0, mAP 62.7, which is far below Rank-1 88.7, mAP 71.2.
Do you know what might cause this?
This is a minor issue with the RandomIdentitySampler: the implementation does not guarantee that the N identities in a batch are unique. For example, ID 28 is sampled twice in this batch.
tensor([ 554, 554, 554, 554, 195, 195, 195, 195, 399, 399,
399, 399, 527, 527, 527, 527, 28, 28, 28, 28,
501, 501, 501, 501, 252, 252, 252, 252, 136, 136,
136, 136, 700, 700, 700, 700, 125, 125, 125, 125,
120, 120, 120, 120, 68, 68, 68, 68, 577, 577,
577, 577, 455, 455, 455, 455, 28, 28, 28, 28,
9, 9, 9, 9, 387, 387, 387, 387, 564, 564,
564, 564], device='cuda:0')
Can be reproduced by calling any of the htri demo examples.
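A sketch of a sampler that enforces the uniqueness described above (a hypothetical alternative, not the repo's RandomIdentitySampler) could partition a shuffled identity list into batches first, then draw instances per identity:

```python
import random
from collections import defaultdict

# Build batches of num_pids_per_batch *distinct* identities with num_instances
# images each, so an identity can never appear twice in one batch.
# data_source items are assumed to be (img_path, pid, camid) tuples.
def unique_identity_batches(data_source, num_instances=4, num_pids_per_batch=8):
    index_dic = defaultdict(list)
    for index, (_, pid, _) in enumerate(data_source):
        index_dic[pid].append(index)
    pids = list(index_dic.keys())
    random.shuffle(pids)
    batches = []
    for i in range(0, len(pids) - num_pids_per_batch + 1, num_pids_per_batch):
        batch = []
        for pid in pids[i:i + num_pids_per_batch]:
            idxs = index_dic[pid]
            if len(idxs) >= num_instances:
                batch.extend(random.sample(idxs, num_instances))
            else:  # identity has too few images: sample with replacement
                batch.extend(random.choice(idxs) for _ in range(num_instances))
        batches.append(batch)
    return batches
```

Because each batch draws from a disjoint slice of the shuffled identity list, duplicates like the repeated ID 28 above cannot occur by construction.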
In transforms.py, the probability variable p works the opposite way: increasing p decreases the probability of performing the transformation. Line 31 can be changed to: if self.p < random.random():
By the way, have you tried training a model with a transformation probability other than p=0.5? In the triplet loss paper (In Defense of the Triplet Loss for Person Re-Identification), this probability seems to be 1; it is stated that the transformation is applied to all images. Can we expect higher scores with higher transformation probabilities?
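The corrected convention can be stated in a few lines; maybe_transform is a hypothetical stand-in for the transform's __call__, not code from transforms.py:

```python
import random

# With the fix, the transform fires with probability p, so increasing p
# increases how often it is applied (p=1 applies it to every image,
# matching the triplet-loss paper's setting).
def maybe_transform(img, p, transform):
    if random.random() < p:
        return transform(img)
    return img
```

Equivalently, the suggested "if self.p < random.random(): return img" makes the early return (the skip path) happen with probability 1 - p.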
Code Version
Aug 17, 2018
Command
python train_vidreid_xent.py -d ilidsvid -a resnet50 \
--save-dir log/ilidsvid_test-resnet50-xent --gpu-devices 6,7 \
--evaluate --resume log/ilidsvid_train-resnet50-xent/best_model.pth.tar
Error Info
Traceback (most recent call last):
File "train_vidreid_xent.py", line 379, in <module>
main()
File "train_vidreid_xent.py", line 200, in main
distmat = test(model, queryloader, galleryloader, use_gpu, return_distmat=True)
TypeError: test() takes at least 5 arguments (5 given)
Solution
It seems that at train_vidreid_xent.py/line 200, argument "pool" is missing when invoking the test function:
distmat = test(model, queryloader, galleryloader, use_gpu, return_distmat=True)
fix:
distmat = test(model, queryloader, galleryloader,args.pool, use_gpu, return_distmat=True)
Hi, would you please add ResNeXt as baseline?
Hi. Great code.
I have one question about the scheduler. Initially scheduler.last_epoch = -1, and in Python 3, -1 // 60 == -1, which means the actual learning rate of the first epoch is 10 times args.lr. I think the correct code sequence is:
for epoch in range(args.max_epoch):
    scheduler.step()
    train(...)
    validate(...)
When I try to evaluate market1501, I get the following error:
File "/home/konstantinou/virtualenvs/pytorch_python2/local/lib/python2.7/site-packages/torch/nn/modules/linear.py", line 49, in reset_parameters
stdv = 1. / math.sqrt(self.weight.size(1))
RuntimeError: dimension out of range (expected to be in range of [-1, 0], but got 1)
The command I use is the following:
python train_img_model_xent.py -d market1501 -a resnet50 --evaluate --resume saved-models/resnet50_xent_market1501.pth.tar --save-dir log/resnet50m_xent_market1501 --test-batch 32