csrnet-pytorch's Issues

Model: Predictions

How do I predict the estimated count from the trained model in Val.ipynb?
Where is the output of the predicted count?
@leeyeehoo

High inference time compared to the Keras implementation

As the title says, the CSRNet-keras implementation is a lot faster than this one. In my case, a single inference on a Full HD image took ~6100 ms, whereas the Keras implementation took ~460 ms.

Is there any way to make this implementation faster? (A timing sketch follows the machine details below.)

My machine:

  • PyTorch: 1.0.1
  • GTX 1060 Drivers: 410.104
  • CUDA 10
  • Intel(R) Core(TM) i5-7500T CPU @ 2.70GHz
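
A common reason PyTorch inference looks slow is timing the very first forward pass (which pays CUDA initialization and cudnn autotuning costs) and leaving autograd enabled. A minimal timing sketch under those assumptions; the input size is a stand-in for a Full HD frame:

import time
import torch
from model import CSRNet  # this repo's model.py

model = CSRNet().cuda().eval()              # eval() freezes dropout/batch-norm behavior
img = torch.randn(1, 3, 1080, 1920).cuda()  # dummy Full HD input

with torch.no_grad():                       # skip autograd bookkeeping at inference
    model(img)                              # warm-up call: CUDA init, cudnn autotune
    torch.cuda.synchronize()                # CUDA is asynchronous; sync before timing
    start = time.time()
    output = model(img)
    torch.cuda.synchronize()
print('inference: %.0f ms' % ((time.time() - start) * 1000))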

train.py is empty?

Thanks for your great work!
I found that train.py, dataset.py, and image.py are empty. Will you update them in the future?

Bug in val code

While executing the val code, one gets an unexpected (seemingly random) output for the crowd count.
The probable cause is that incorrect normalization values are subtracted.
Using the same normalization as during training (via the torch transform) seems to solve the problem.

Code akin to this should solve the issue:

import numpy as np
from PIL import Image

im = Image.open(path).convert('RGB')
im = np.array(im, dtype=np.float32)
im = im / 255.0

# same ImageNet mean/std as the training transform
im[:,:,0] = (im[:,:,0] - 0.485) / 0.229
im[:,:,1] = (im[:,:,1] - 0.456) / 0.224
im[:,:,2] = (im[:,:,2] - 0.406) / 0.225

Thank you!

Model does not converge when training on Part_B

Environment: Win10 + CUDA 9.0 + PyTorch 0.4 (GTX 1070)
I loaded the VGG16 pretrained parameters.
With the 8x upsampling added directly, the model does not converge.
Without the upsampling, the MAE stays around 68. I have tried lr = 1e-6 and 1e-7 and still don't know what is going on.
Are there any training logs available, or any training tricks?

Region of Interest at test time

Great code, thank you!

For datasets with regions of interest (ROI) such as UCSD and WorldExpo, do you use the ROI to mask the input image at test time? I think it makes sense for the ROI to be accessible at test time (as it is unique to each scene/camera). Thanks.

Question

First, image preprocessing means generating, for every training and test image, a density map from the annotated head-point coordinates, which together with the total annotated head count serves as the ground truth. The training stage means feeding all the training data (including the ground truth produced by preprocessing) into a network whose front end is the first ten layers of VGG16 for head-feature extraction, then passing the extracted features into the dilated convolutional back end for training, and finally generating the corresponding density map from the extracted head-location features.
Could you check whether my understanding is correct? Thanks.

MAE of Part_A

Thanks for your PyTorch implementation! I ran the code on Part_A of the ShanghaiTech dataset. In testing, I obtain an MAE of 73.26. How can I reach the 66.4 MAE mentioned in the README? Or is this reimplementation unfinished for now? Thanks!

Why multiply by 64 after cv2.resize() in image.py?

Hi leeyeehoo, thank you for the released code.
In image.py, I don't understand the line target = cv2.resize(target,(target.shape[1]/8,target.shape[0]/8),interpolation = cv2.INTER_CUBIC)*64. The resize scale is 1/8, but the factor 64 seems unrelated to that scale; it only scales the pixel values. If so, why *64 and not some other factor? I just want to know why.
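
Downsampling by 8 in each dimension merges roughly 8 x 8 = 64 original pixels into each output pixel, so the sum of the density map shrinks by about a factor of 64; multiplying by 64 restores the total count. A minimal check, with a random array standing in for a density map:

import cv2
import numpy as np

target = np.random.rand(512, 512).astype(np.float32)  # stand-in density map
small = cv2.resize(target, (target.shape[1] // 8, target.shape[0] // 8),
                   interpolation=cv2.INTER_CUBIC) * 64
# the two totals should be close; interpolation makes this approximate, not exact
print(target.sum(), small.sum())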

Error when running under Python 3

self.frontend.state_dict().items()[i][1].data[:] = mod.state_dict().items()[i][1].data[:]
TypeError: 'odict_items' object does not support indexing

Does anyone know how to fix this?
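
In Python 3, dict.items() returns a view object that cannot be indexed. A minimal sketch of a fix for the weight-copying loop in model.py is to materialize both views as lists first:

# in model.py, replacing the Python 2 indexing of state_dict().items()
frontend_items = list(self.frontend.state_dict().items())
mod_items = list(mod.state_dict().items())
for i in range(len(frontend_items)):   # range() also replaces Python 2 xrange()
    frontend_items[i][1].data[:] = mod_items[i][1].data[:]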

IOError: Unable to create file

Hi, @leeyeehoo
When I run two training jobs at the same time, I encounter this issue.

Traceback (most recent call last):
File "train.py", line 249, in <module>
main()
File "train.py", line 99, in main
train(train_list, model, criterion, optimizer, epoch)
File "train.py", line 145, in train
for i,(img, target) in enumerate(train_loader):
File "/app/anaconda2/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 314, in __next__
batch = self.collate_fn([self.dataset[i] for i in indices])
File "/disk1/kevin/prjs/crowdcounting/CSRNet-pytorch_seg/dataset.py", line 33, in __getitem__
img,target = load_data(img_path,self.train)
File "/disk1/kevin/prjs/crowdcounting/CSRNet-pytorch_seg/image.py", line 12, in load_data
gt_file = h5py.File(gt_path)
File "/app/anaconda2/lib/python2.7/site-packages/h5py/_hl/files.py", line 269, in __init__
fid = make_fid(name, mode, userblock_size, fapl, swmr=swmr)
File "/app/anaconda2/lib/python2.7/site-packages/h5py/_hl/files.py", line 124, in make_fid
fid = h5f.create(name, h5f.ACC_EXCL, fapl=fapl, fcpl=fcpl)
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py/h5f.pyx", line 98, in h5py.h5f.create
IOError: Unable to create file (unable to open file: name = '/path/IMG_48.h5', errno = 17, error message = 'File exists', flags = 15, o_flags = c2)

What should I do about this issue?
Many thanks.
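
The traceback shows h5py trying to create the ground-truth file exclusively (h5f.ACC_EXCL) because no mode was passed to h5py.File, so the second job fails once the first has created the file. Since load_data only reads the ground truth, a minimal fix, assuming image.py as in the traceback, is to open the file read-only:

# image.py, load_data: open the .h5 ground truth read-only so that
# concurrent training jobs never race to create it
gt_file = h5py.File(gt_path, 'r')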

How to get the count from the density map

Hi,

You said that the method to generate the ground truth is from https://arxiv.org/abs/1608.06197

That paper says:

We generate our ground truth by simply blurring each head annotation using a Gaussian kernel normalized to sum to one. This kind of blurring causes the sum of the density map to be the same as the total number of people in the crowd

But I found that the sum of the density map is not exactly the same as the total number of people.

Here is what I've done:

import numpy as np
import matplotlib.pyplot as plt
from scipy import io

img_path = '/home/tumh/crowd/part_A_final/train_data/images/IMG_211.jpg'
mat = io.loadmat(img_path.replace('.jpg','.mat').replace('images','ground_truth').replace('IMG_','GT_IMG_'))
img = plt.imread(img_path)
k = np.zeros((img.shape[0], img.shape[1]))
gt = mat["image_info"][0,0][0,0][0]

# place a unit impulse at every annotated head location inside the image
for i in range(0, len(gt)):
    if int(gt[i][1]) < img.shape[0] and int(gt[i][0]) < img.shape[1]:
        k[int(gt[i][1]), int(gt[i][0])] = 1
k = gaussian_filter_density(k)  # gaussian_filter_density from make_dataset.ipynb
print np.sum(k)
assert np.sum(k) == len(gt), '{}!= {}'.format(np.sum(k), len(gt))

which results in:

Traceback (most recent call last):
  File "make_dataset.py", line 71, in <module>
    assert(np.sum(k) == len(gt)), '{}!= {}'.format(np.sum(k), len(gt))
AssertionError: 185.91192627!= 188

the ground truth is 188, but from the sum of the density map, the count is 185.91192627.

Any advice?
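
The missing mass sits where the Gaussian kernel is truncated at the image border: a head near the edge gets a kernel whose tail falls outside the map, so that head contributes less than 1 to the sum. A minimal illustration with scipy's gaussian_filter, which the notebook's gaussian_filter_density builds on:

import numpy as np
from scipy.ndimage import gaussian_filter

k = np.zeros((100, 100))
k[50, 50] = 1   # head in the middle: the kernel fits entirely inside the map
k[0, 0] = 1     # head in the corner: most of the kernel falls outside
d = gaussian_filter(k, sigma=15, mode='constant')
print(d.sum())  # noticeably less than 2 because of the truncated corner kernel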

Crowd count is inconsistent before and after processing of the density map

Thanks for releasing the code; it helps me a lot. I have 2 questions:
1. In image.py, to resize the ground-truth density map to 1/64 of its original size (1/8 per dimension), you used the following code:
target = cv2.resize(target,(target.shape[1]/8,target.shape[0]/8),interpolation = cv2.INTER_CUBIC)*64
but I found that the resize operation does not guarantee that the crowd count of the resized density map shrinks to exactly 1/64 of the original. For example, if the crowd count of the original density map is 640, the count after resizing may be 9, and 9 * 64 != 640. Am I wrong?
2. My second doubt is the same as #18. I read the commentary on that question but I still don't know how to solve the problem. You said "don't mind this slight variation" in make_dataset.ipynb, but I found the variation cannot be neglected (especially when using geometry-adaptive kernels).
Considering 1 and 2, the crowd count of the ground truth we use for training and the crowd count given by the dataset are not equal. Any idea how to solve these problems? Thank you!

Testing on UCF_CC_50 data

Thanks for your previous work! Testing on the challenging UCF_CC_50 data is also important for a counting method. Could you provide the training and testing code for UCF_CC_50, or for the WorldExpo'10 dataset?

About the patch size

Hello, and thank you for taking time out of your busy schedule to answer my question.

I have one question: after cropping patches of 1/4 the size of the original image, do you train on the patches directly, or resize them to some other resolution before training?

Thanks for your advice!

About MAE, MSE calculation

Hello, thank you for releasing the PyTorch code. I want to ask about the calculation of MAE and MSE. When calculating them, the ground truth is the integral of the target density map. However, that integral is lower than the ground-truth crowd count because of the mass lost at the edges when generating the target density map. How can this be solved?

Density maps and crowd counts

The ground-truth density map is generated from the labels, and the crowd count is then obtained by integrating (summing) the density map, right?
But when testing on an arbitrary image with no labels, how is the density map generated?
Could you explain?

ValueError encountered when running make_dataset

When I run make_dataset, I encounter the error "ValueError: not enough values to unpack (expected 2, got 0)" as below:

ValueError Traceback (most recent call last)
<ipython-input> in <module>()
      8 if int(gt[i][1])<img.shape[0] and int(gt[i][0])<img.shape[1]:
      9     k[int(gt[i][1]),int(gt[i][0])]=1
---> 10 k = gaussian_filter_density(k)
     11 with h5py.File(img_path.replace('.jpg','.h5').replace('images','ground_truth'), 'w') as hf:
     12     hf['density'] = k

<ipython-input> in gaussian_filter_density(gt)
     10 leafsize = 2048
     11 # build kdtree
---> 12 tree = scipy.spatial.KDTree(pts.copy(), leafsize=leafsize)
     13 # query kdtree
     14 distances, locations = tree.query(pts, k=4)

~/anaconda3/lib/python3.6/site-packages/scipy/spatial/kdtree.py in __init__(self, data, leafsize)
    233 def __init__(self, data, leafsize=10):
    234     self.data = np.asarray(data)
--> 235     self.n, self.m = np.shape(self.data)
    236     self.leafsize = int(leafsize)
    237     if self.leafsize < 1:

ValueError: not enough values to unpack (expected 2, got 0)
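
This error means the image has no annotated heads: pts is then empty, and KDTree cannot infer its dimensionality from an empty array. A minimal guard, assuming the pts and density variables defined at the top of the notebook's gaussian_filter_density:

# near the top of gaussian_filter_density, before building the KDTree
if len(pts) == 0:
    return density   # image with no annotated heads: keep the all-zero density map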

Can we use our own photos and print density maps?

Hello,
I'm a researcher with my team at NCTU, Taipei, Taiwan.
I face a problem: whenever I replace the photos in the dataset with other pictures and run make_dataset.ipynb in Jupyter Notebook, the result is wrong; it is still the result for the previous ShaungHi'10 Expo dataset.
My method is to clear all the photos in the "images" folder and replace them with my own.
Could this action be the problem?

Thank you!

AttributeError: module 'torch.nn.init' has no attribute 'normal_'

In val.ipynb, while executing
model = CSRNet()
I got this error:


AttributeError Traceback (most recent call last)
<ipython-input> in <module>()
----> 1 model = CSRNet()

~\CSRNet-pytorch-master\model.py in __init__(self, load_weights)
     15 if not load_weights:
     16     mod = models.vgg16(pretrained = True)
---> 17     self._initialize_weights()
     18     for i in xrange(len(self.frontend.state_dict().items())):
     19         self.frontend.state_dict().items()[i][1].data[:] = mod.state_dict().items()[i][1].data[:]

~\CSRNet-pytorch-master\model.py in _initialize_weights(self)
     26 for m in self.modules():
     27     if isinstance(m, nn.Conv2d):
---> 28         nn.init.normal_(m.weight, std=0.01)
     29         if m.bias is not None:
     30             nn.init.constant_(m.bias, 0)

AttributeError: module 'torch.nn.init' has no attribute 'normal_'
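
nn.init.normal_ (with the trailing underscore) only exists from PyTorch 0.4 onward, so older installs raise this AttributeError. Besides upgrading PyTorch, a minimal workaround sketch is the pre-0.4 spelling of the same in-place initializers:

# model.py, _initialize_weights: PyTorch < 0.4 spelling
nn.init.normal(m.weight, std=0.01)
nn.init.constant(m.bias, 0)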

Models

@leeyeehoo
How do I run training and testing? The problem is what "task id" to use.

~/ML/Crowd-count/final/CSRNet$ python2 train.py TRAIN /home/tikam/ML/Crowd-count/final/CSRNet/train_A.json GPU 0 TASK 0
usage: train.py [-h] [--pre PRETRAINED] TRAIN TEST GPU TASK
train.py: error: unrecognized arguments: TASK 1

Are the defaults GPU 0 and TASK 0 okay?

What should I use for TASK [ID]?

validation yields very high crowd counts on the pretrained model

Hello,

Thanks a lot for the very easy-to-follow instructions and the code. I am having difficulty reproducing your results: using your pretrained model 'PartAmodel_best.pth.tar' on the ShanghaiTech test_data set, I get a very high MAE, in the thousands.

PyTorch 0.4.1
CUDA 9.0
Python 2.7

Please let me know if I am doing anything wrong.
Thanks,
Buvana

Snippet of code (based on your val.ipynb) and result:

model = CSRNet()
model = model.cuda()
checkpoint = torch.load('PartAmodel_best.pth.tar')
model.load_state_dict(checkpoint['state_dict'])

import random
n=random.randint(0,182)
image_file=img_paths[n]
print n, image_file

17 /home/scene/CSRNet-pytorch/ShanghaiTech/part_A/test_data/images/IMG_34.jpg

gt_file = h5py.File(image_file.replace('.jpg','.h5').replace('images','ground-truth'),'r')
groundtruth = np.asarray(gt_file['density'])
print np.sum(groundtruth)
plt.imshow(groundtruth,cmap=CM.jet)

84.752235

img = 255.0 * F.to_tensor(Image.open(image_file).convert('RGB'))
img[0,:,:]=img[0,:,:]-92.8207477031
img[1,:,:]=img[1,:,:]-95.2757037428
img[2,:,:]=img[2,:,:]-104.877445883

img = img.cuda()
output = model(img.unsqueeze(0))
print (output.detach().cpu().sum().numpy(), np.sum(groundtruth))
630.8671 84.752235

mae=abs(output.detach().cpu().sum().numpy()-np.sum(groundtruth))
print mae
546.11487
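
The snippet above subtracts raw per-channel means from a 0-255 tensor, while the "Bug in val code" issue earlier reports that the released models expect the standard torchvision ImageNet normalization. A minimal sketch of the matching preprocessing, assuming the same val.ipynb variables:

import torchvision.transforms as transforms
from PIL import Image

# the same normalization as the training transform (ImageNet mean/std on 0-1 tensors)
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

img = transform(Image.open(image_file).convert('RGB')).cuda()
output = model(img.unsqueeze(0))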


Testing the trained model

  • 1) How do I test the trained models?
  • 2) How do I test the trained models on other custom crowd images for estimation?

A question about the environment setup

Python: 2.7 + PyTorch: 0.4.0 + CUDA: 9.2
Running the official PyTorch command conda install pytorch torchvision cuda92 -c pytorch on Ubuntu 16.04 installs pytorch 0.1.12 by default. I tried to upgrade PyTorch, but conda list still shows pytorch:0.1.12-py27cuda8.0cudnn6.0_1. I also tried rewriting the code in Python 3 syntax and moving the ipynb code into .py files for debugging, but it still does not run. Is there another environment setup that would work?

Using multiple GPUs

@leeyeehoo I was trying to use more than one GPU to train.

I modified the code by adding nn.DataParallel, but it doesn't seem to work.

What is the fix for this issue?
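
A minimal sketch of the usual nn.DataParallel wiring with this repo's CSRNet; a common reason it "doesn't work" is that a checkpoint saved from the wrapped model prefixes every state_dict key with module., so loading has to go through the inner module:

import torch.nn as nn

model = CSRNet()
model = nn.DataParallel(model).cuda()   # replicate across all visible GPUs

# when loading a checkpoint that was saved without DataParallel:
# model.module.load_state_dict(checkpoint['state_dict'])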

Training-validation split

Hello,

Thank you for sharing such beautiful work.

May I ask which experimental setup you used to achieve the results stated in the paper?

I saw that you have two different setups in your json files. I would be glad to find out which one is the official one.

Did you use validation? If yes, how many of the training images did you reserve for the validation set?

Why are the validation results so poor during training?

When I train, the validation results are very poor, fluctuating around 300-400. What could be the reason? I didn't change anything else; I only replaced the Python 2 constructs that are incompatible with Python 3. The biggest change is in model.py:

if not load_weights:
    mod = models.vgg16(pretrained = True)
    self._initialize_weights()
    # read the pretrained parameters
    pretrained_dict = mod.state_dict()
    self.frontend_dict = self.frontend.state_dict()
    # drop the keys in pretrained_dict that do not belong to frontend_dict
    pretrained_dict = {k: v for k, v in pretrained_dict.items() if k in self.frontend_dict}
    # update the existing frontend_dict
    self.frontend_dict.update(pretrained_dict)
    # load the state_dict we actually need
    self.frontend.load_state_dict(self.frontend_dict)

Python 3 does not support the original indexing, so I rewrote it this way. I am not sure whether it is correct, and the validation results stay very poor.

Training error

When I train the model, the loss is nan:
Epoch: [0][1170/1200] Time 0.709 (0.417) Data 0.020 (0.017) Loss nan (nan)

About the UCF_CC_50 dataset

Thanks for your previous work and your PyTorch implementation!

I wonder how to train on the UCF_CC_50 dataset.

With your implementation I reach the paper's MAE/MSE on the ShanghaiTech dataset, but using 5-fold cross-validation on UCF_CC_50 I cannot reach the MAE/MSE in your paper. Is there something I forgot to implement or missed?

Could you please offer some training details or a code release?

Thank you!

Error in load_state_dict in val.py

Thank you for your sharing; it's really outstanding work.
There is an error when I test my model after training, in the line model.load_state_dict(checkpoint['checkpoint.pth.tar']) of val.ipynb.

It says "KeyError: 'checkpoint.pth.tar'".
Could you please give me some advice?
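
The filename is being used as a dictionary key. As in the snippet quoted in the "validation yields very high crowd counts" issue above, the checkpoints store the weights under the 'state_dict' key, so a minimal sketch of the loading pattern is:

import torch

checkpoint = torch.load('checkpoint.pth.tar')      # path to the saved checkpoint
model.load_state_dict(checkpoint['state_dict'])    # the weights live under 'state_dict'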

How are the normalization parameters calculated?

Thanks for the released code!
We can see there is normalization in both train and test, as follows:

transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

So why do we need normalization, and how are the parameters mean and std calculated?
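
Those particular numbers are the standard ImageNet statistics expected by torchvision's pretrained VGG16: the per-channel mean and standard deviation of the training images after scaling to [0, 1]. A minimal sketch of how such statistics can be computed for a custom image set; img_paths is an illustrative list of training image paths:

import numpy as np
from PIL import Image

means, stds = [], []
for p in img_paths:
    im = np.array(Image.open(p).convert('RGB'), dtype=np.float64) / 255.0
    means.append(im.reshape(-1, 3).mean(axis=0))
    stds.append(im.reshape(-1, 3).std(axis=0))
print('mean:', np.mean(means, axis=0))  # per-channel mean over the dataset
print('std:', np.mean(stds, axis=0))    # per-channel std (averaged per image, approximate)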

GT generation

Hello, and thank you for your beautiful work. While reading the code I noticed that the ground truth for part_A and part_B is generated with different Gaussian methods. Is there a reason for this?

Where is the Sobel kernel mentioned in the paper?

Hi~
When I read the code, I couldn't find the two approaches (Fig. 4) you mention in section 3.1.1. Also, in the paper the final output size is the same as the input, but in the code the output is 1/8 of the input.

I wonder, is there any other code?

How to generate the density map from the output of the model

Thank you for releasing the code.
I have some questions about the source code.
1. How do I generate the density map from the output of the model? When I run val.ipynb I only get the ground-truth density map, not the predicted one. How do I transform the output of the model into a density map? (A visualization sketch follows my code below.)
2. I found that the sum of the ground-truth density map is not the same as the ground-truth count in the ShanghaiTech dataset's .mat files; sometimes they differ badly, as in the table below. If the sum of the ground truth is far from the real count, what is the point of computing the error against it?
number  output     groundtruth
1       218.73355  265.32138
2       242.14491  232.52945
3       113.73707  65.99532
4       371.46692  372.23004
5       432.0569   306.81796
6       144.08446  190.22192
7       269.57242  269.53113
8       565.20056  565.5759
9       235.49048  307.76044
10      507.46124  510.92313
11      297.7716   298.135
12      317.20828  493.84286
13      665.5448   580.9203
14      536.8564   468.91245
15      189.81647  241.52519
16      558.702    583.2348
17      102.62774  95.6099
18      401.4803   376.6646
19      244.39113  211.23973
20      343.18445  373.51428
Notice the difference between the two counts for a single picture: the count from the .mat file (image_info number) and the count from the sum of the generated density map. For the first image in ShanghaiTech A, the sum of the generated density map above is 265.32, while the count in the .mat file for that image is 172. And it is not only this image; most of the data have the same problem.
Here is my code:
mae = 0
for i in xrange(len(img_paths)):
    img = 255.0 * F.to_tensor(Image.open(img_paths[i]).convert('RGB'))
    img[0,:,:] = img[0,:,:] - 92.8207477031
    img[1,:,:] = img[1,:,:] - 95.2757037428
    img[2,:,:] = img[2,:,:] - 104.877445883
    img = img.cuda()
    # note: this line overwrites the manually normalized tensor built above
    img = transform(Image.open(img_paths[i]).convert('RGB')).cuda()
    gt_file = h5py.File(img_paths[i].replace('.jpg','.h5').replace('images','ground_truth'),'r')
    groundtruth = np.asarray(gt_file['density'])
    output = model(img.unsqueeze(0))
    print i+1, output.detach().cpu().sum().numpy(), np.sum(groundtruth)
    #print mse
    I will appreciate the help if anyone has idea of the problem, thank you!
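
For question 1: the model's output is itself the predicted density map (at 1/8 of the input resolution), so it can be displayed directly. A minimal sketch, assuming the output variable from the loop above:

import matplotlib.pyplot as plt
from matplotlib import cm as CM

pred = output.detach().cpu().numpy().squeeze()   # drop the batch and channel dims
plt.imshow(pred, cmap=CM.jet)
plt.title('predicted density map, sum = %.1f' % pred.sum())
plt.show()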

Data augmentation

The data augmentation in the code is not the same as in the paper. There is no data augmentation in the code; it just uses the same image four times.

Is the predicted map the same size as the input image?

Hello, I ran into a problem with the val.ipynb Jupyter notebook.
I tried to inspect the sizes of the output and the ground truth
and found that the size of the output is 1/8 of the input image,
e.g. (704,1024) vs (88,128).
So my question is: do I need to upsample the output of the model 8x (e.g. bilinearly) before counting it?
Thanks a lot
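
Counting only needs the sum, and the *64 factor applied to the targets in image.py already compensates for the 1/8 downsampling, so summing the raw output is enough. If a full-resolution map is wanted for display, a sketch of one count-preserving way to upsample; this is an illustration, not code from the repo:

import torch.nn.functional as nnf   # aliased so it doesn't clash with torchvision's F

count = output.sum().item()         # the count itself needs no upsampling

up = nnf.interpolate(output, scale_factor=8, mode='bilinear', align_corners=False)
up = up / 64.0                      # 64x more pixels, so rescale to preserve the sum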

How to train this model?

How do I train this model? With your pretrained parameters I get good results, but not all fine-tuned parameters are released. I want to train this model on the MALL dataset, but the evaluation result stays the same across different epochs. My batch_size is 1, lr is 0.001, and the optimizer is SGD. Can you give me some advice on how to train this model? @leeyeehoo
