leeyeehoo / csrnet-pytorch
CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes
How do I get the predicted count from the trained model in val.ipynb?
Where is the predicted count output?
@leeyeehoo
As stated in the issue title, the CSRNet-keras implementation is a lot faster than this implementation.
In my case, a single inference on a Full HD image took ~6100 ms, whereas the Keras implementation took ~460 ms.
Is there any way to make this implementation faster?
My machine:
Hi, thanks very much for your work. Could you release the pretrained model that has not been fine-tuned on a dataset? I want to fine-tune it on another dataset. Thanks again. @leeyeehoo
Thanks for your great work!
But I find that train.py, dataset.py and image.py are empty. Will you update them in the future?
When running the val code, one gets an unexpected (seemingly random) crowd count.
The probable cause is that incorrect normalization values are subtracted.
Using the same normalization as during training (the torch transform) seems to solve the problem.
Code along these lines should solve the issue:
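import numpy as np
from PIL import Image
# Scale to [0, 1], then standardise each RGB channel with the training mean/std.
im = Image.open(path).convert('RGB')
im = np.array(im)
im = im/255.0
im[:,:,0]=(im[:,:,0]-0.485)/0.229
im[:,:,1]=(im[:,:,1]-0.456)/0.224
im[:,:,2]=(im[:,:,2]-0.406)/0.225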
Thank you!
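In image.py, an error is raised at line 13: Unable to open object (object 'density' doesn't exist)
Environment: Windows 10 + CUDA 9.0 + PyTorch 0.4.0 (GTX 1070)
I loaded the pretrained VGG16 parameters.
When I directly add 8x upsampling, the model does not converge.
Without upsampling, the MAE stays around 68. I have tried changing lr to 1e-6 and 1e-7, and I don't know what is going on.
Are there any training logs, or any training tricks?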
Great code thank you!
For datasets with regions of interest (ROI), such as UCSD and WorldExpo, do you use the ROI to mask the input image at test time? I think it makes sense for the ROI to be available at test time (as it is unique to each scene/camera). Thanks.
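First, image preprocessing means generating, for every training and test image, the corresponding density map from the coordinates of the annotated head points, and using it together with the annotated head count as the ground truth. The training stage feeds the whole training set (including the ground truth generated during preprocessing) into a network whose front end is the first ten layers of VGG16, which extracts head features. These features are then passed to the dilated convolutional network for training, and finally the extracted head-position features are used to generate the corresponding density map.
Could you check whether my understanding is correct? Thanks.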
Thanks for your PyTorch implementation! I have run the code on Part_A of the ShanghaiTech dataset. In testing, I obtain an MAE of 73.26 on this data. How can I reach the 66.4 MAE mentioned in the README? Or is this reimplementation unfinished for now? Thanks!
Thank you for releasing the code. The network uses the density map as the ground truth. I wonder whether it is possible to generate point coordinates from the density map?
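One rough way to approximate head positions, offered only as a sketch, is to look for local maxima of the density map. The function name `local_maxima`, the threshold value and the toy two-blob map below are all made up for illustration. In dense crowds the Gaussians overlap and peaks merge, so this will generally not recover every annotated point.

```python
import numpy as np

def local_maxima(density, threshold=1e-3):
    """Return (row, col) of pixels that are >= all 8 neighbours and above threshold."""
    # Pad with -inf so border pixels can be compared like interior ones.
    padded = np.pad(density, 1, mode="constant", constant_values=-np.inf)
    h, w = density.shape
    is_peak = density > threshold
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            neighbour = padded[1 + dr:1 + dr + h, 1 + dc:1 + dc + w]
            is_peak &= density >= neighbour
    return np.argwhere(is_peak)

# Toy density map: two well-separated Gaussian blobs (hypothetical data).
yy, xx = np.mgrid[0:40, 0:40]
density = (np.exp(-((yy - 10) ** 2 + (xx - 12) ** 2) / 8.0)
           + np.exp(-((yy - 30) ** 2 + (xx - 25) ** 2) / 8.0))
peaks = local_maxima(density)
print(peaks)   # one (row, col) pair per blob centre
```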
Hi leeyeehoo, thank you for releasing the code.
In image.py, I don't understand the meaning of this line: " target = cv2.resize(target,(target.shape[1]/8,target.shape[0]/8),interpolation = cv2.INTER_CUBIC)*64 ". Maybe the reason is that the resize scale is 1/8, but multiplying by 64 changes the pixel values, which seems unrelated to the scale. Or does it only increase the pixel values? If so, why 64 and not some other factor?
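A possible reading of the factor 64: a density map's pixels sum to the head count, and shrinking each side by 8 leaves 1/64 as many pixels with values of roughly the same magnitude, so the sum drops by about 64x. Multiplying by 64 brings the count back. The sketch below uses 8x8 block averaging in NumPy as a stand-in for `cv2.resize` with `INTER_CUBIC`. That substitution is an assumption: block averaging preserves the count exactly, while cubic interpolation only does so approximately.

```python
import numpy as np

def downsample_by_8_mean(density):
    """Shrink a density map 8x per side by averaging each 8x8 block.

    Block averaging is a simple stand-in for cv2.resize(..., INTER_CUBIC);
    it is an assumption made for illustration, not the repository's code.
    """
    h, w = density.shape
    h8, w8 = h // 8, w // 8
    # Drop any remainder rows/columns so the map tiles exactly into 8x8 blocks.
    blocks = density[:h8 * 8, :w8 * 8].reshape(h8, 8, w8, 8)
    return blocks.mean(axis=(1, 3))

rng = np.random.default_rng(0)
density = rng.random((64, 96))
density *= 640.0 / density.sum()     # toy map whose values sum to 640 people

small = downsample_by_8_mean(density)
print(density.sum())                 # about 640
print(small.sum())                   # about 640 / 64 = 10
print((small * 64).sum())            # back to about 640
```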
self.frontend.state_dict().items()[i][1].data[:] = mod.state_dict().items()[i][1].data[:]
TypeError: 'odict_items' object does not support indexing
Does anyone know how to fix this?
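The TypeError looks like the usual Python 3 behaviour: `dict.items()` returns a view that cannot be indexed. Wrapping it in `list()` is one common workaround. The sketch below uses `OrderedDict`s of plain lists as hypothetical stand-ins for the two state dicts, so it runs without PyTorch. The key names and values are invented.

```python
from collections import OrderedDict

# Hypothetical stand-ins for the two state dicts; real ones hold tensors.
source = OrderedDict([("features.0.weight", [1.0, 2.0]), ("features.0.bias", [0.5])])
target = OrderedDict([("0.weight", [0.0, 0.0]), ("0.bias", [0.0])])

source_items = list(source.items())   # list() makes the view indexable
target_items = list(target.items())

for i in range(len(target_items)):    # range replaces Python 2's xrange
    # Slice assignment copies in place, mirroring `.data[:] = ...` on tensors.
    target_items[i][1][:] = source_items[i][1][:]

print(target)
```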
Hi @leeyeehoo,
When I run two training jobs, I encounter this issue.
Traceback (most recent call last):
  File "train.py", line 249, in <module>
    main()
  File "train.py", line 99, in main
    train(train_list, model, criterion, optimizer, epoch)
  File "train.py", line 145, in train
    for i,(img, target)in enumerate(train_loader):
  File "/app/anaconda2/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 314, in __next__
    batch = self.collate_fn([self.dataset[i] for i in indices])
  File "/disk1/kevin/prjs/crowdcounting/CSRNet-pytorch_seg/dataset.py", line 33, in __getitem__
    img,target = load_data(img_path,self.train)
  File "/disk1/kevin/prjs/crowdcounting/CSRNet-pytorch_seg/image.py", line 12, in load_data
    gt_file = h5py.File(gt_path)
  File "/app/anaconda2/lib/python2.7/site-packages/h5py/_hl/files.py", line 269, in __init__
    fid = make_fid(name, mode, userblock_size, fapl, swmr=swmr)
  File "/app/anaconda2/lib/python2.7/site-packages/h5py/_hl/files.py", line 124, in make_fid
    fid = h5f.create(name, h5f.ACC_EXCL, fapl=fapl, fcpl=fcpl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 98, in h5py.h5f.create
IOError: Unable to create file (unable to open file: name = '/path/IMG_48.h5', errno = 17, error message = 'File exists', flags = 15, o_flags = c2)
What should I do about this issue?
Many thanks.
I am confused by the code: how is VGG16 connected to the network that you defined yourself?
Hi, I am fine-tuning on the WorldExpo dataset, but I cannot reach the 8.6 reported in the paper; I only get an MAE of 25.494. Can you give me some advice? @leeyeehoo
Hi,
You said that the method for generating the ground truth is from https://arxiv.org/abs/1608.06197
In that paper, they say:
We generate our ground truth by simply blurring each head annotation using a Gaussian kernel normalized to sum to one. This kind of blurring causes the sum of the density map to be the same as the total number of people in the crowd
But I found that the sum of the density map is not exactly equal to the total number of people.
Here is what I've done:
import numpy as np
import scipy.io as io
import matplotlib.pyplot as plt
# gaussian_filter_density is the helper defined in make_dataset.ipynb
img_path = '/home/tumh/crowd/part_A_final/train_data/images/IMG_211.jpg'
mat = io.loadmat(img_path.replace('.jpg','.mat').replace('images','ground_truth').replace('IMG_','GT_IMG_'))
img= plt.imread(img_path)
k = np.zeros((img.shape[0],img.shape[1]))
gt = mat["image_info"][0,0][0,0][0]
for i in range(0,len(gt)):
    # only annotations that fall inside the image are marked
    if int(gt[i][1])<img.shape[0] and int(gt[i][0])<img.shape[1]:
        k[int(gt[i][1]),int(gt[i][0])]=1
k = gaussian_filter_density(k)
print(np.sum(k))
assert(np.sum(k) == len(gt))
This results in:
Traceback (most recent call last):
  File "make_dataset.py", line 71, in <module>
    assert(np.sum(k) == len(gt)), '{}!= {}'.format(np.sum(k), len(gt))
AssertionError: 185.91192627!= 188
The ground truth is 188, but the sum of the density map gives a count of 185.91192627.
Any advice?
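Two things in the snippet itself can already drop heads before any blurring: the `if` skips annotations that fall outside the image, and `k[...] = 1` collapses two annotations on the same pixel into one. A third possible source, sketched below, is blurring near the border: if the kernel is clipped at the image edge, some of its mass is lost. The helper `stamp_gaussian`, its `sigma` and its `radius` are hypothetical, and whether `gaussian_filter_density` clips this way depends on its boundary handling, which is not shown here.

```python
import numpy as np

def stamp_gaussian(shape, row, col, sigma=4.0, radius=12):
    """Add one unit-mass Gaussian centred at (row, col), clipped to the image."""
    ax = np.arange(-radius, radius + 1)
    k1d = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    kernel = np.outer(k1d, k1d)
    kernel /= kernel.sum()                     # the full kernel sums to 1

    out = np.zeros(shape)
    r0, r1 = max(row - radius, 0), min(row + radius + 1, shape[0])
    c0, c1 = max(col - radius, 0), min(col + radius + 1, shape[1])
    # Keep only the part of the kernel that falls inside the image.
    out[r0:r1, c0:c1] = kernel[r0 - (row - radius):r1 - (row - radius),
                               c0 - (col - radius):c1 - (col - radius)]
    return out

interior = stamp_gaussian((100, 100), 50, 50)
near_edge = stamp_gaussian((100, 100), 2, 50)
print(interior.sum())    # close to 1: the whole kernel fits
print(near_edge.sum())   # below 1: part of the kernel fell outside the image
```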
Thanks for releasing the code. It has helped me a lot. I have two questions:
1. In image.py, to resize the ground-truth density map to 1/64 of its original size, you use the following code:
target = cv2.resize(target,(target.shape[1]/8,target.shape[0]/8),interpolation = cv2.INTER_CUBIC)*64
But I found that the resize operation does not guarantee that the crowd count of the resized density map drops to exactly 1/64 of the original. For example, if the crowd count of the original density map is 640, the count after resizing may be 9, and 9 * 64 != 640. Am I wrong?
2. My second doubt is the same as #18. I read the comments on that issue but still don't know how to solve the problem. You said 'don't mind this slight variation' in make_dataset.ipynb, but I found that the variation cannot be neglected (especially when using geometry-adaptive kernels).
Considering 1 and 2, the crowd count of the ground truth used for training is not equal to the crowd count given by the dataset. Is there any way to solve these problems? Thank you!
Thanks for the previous work! Testing on the challenging UCF_CC_50 dataset is also important for a counting method. Could you provide the training and testing code for UCF_CC_50, or for the WorldExpo'10 dataset?
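Hello, thank you for taking the time to answer my question.
I would like to ask: after cropping patches that are 1/4 the size of the original image, do you train on the patches directly, or do you resize them to another resolution before training?
Thank you for your advice!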
Hello, thank you for releasing the PyTorch code. I want to ask about the calculation of MAE and MSE. When they are calculated, the ground truth is the integral of the target density map. However, that integral is lower than the annotated crowd count because mass is lost at the image edges when the target density map is generated. How can this be solved?
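One option, given here only as a sketch, is to evaluate against the annotated head count rather than the sum of the generated density map. The function name and the numbers below are hypothetical. The second value is a root mean squared error, which is what crowd-counting papers often report under the name "MSE"; that reading is an assumption worth checking against the paper's definition.

```python
import numpy as np

def mae_and_rmse(predicted_counts, true_counts):
    """Mean absolute error and root mean squared error over per-image counts."""
    pred = np.asarray(predicted_counts, dtype=float)
    true = np.asarray(true_counts, dtype=float)
    errors = pred - true
    mae = np.mean(np.abs(errors))
    rmse = np.sqrt(np.mean(errors ** 2))
    return mae, rmse

# Hypothetical numbers: per-image sums of predicted density maps, and
# annotated head counts read from the dataset's .mat files.
pred = [210.0, 95.5, 402.0]
true = [200, 100, 400]
mae, rmse = mae_and_rmse(pred, true)
print(mae, rmse)
```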
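The ground-truth density map is generated from the labels, and the head count is then obtained by integrating (summing) the density map, right?
But when testing on an arbitrary image that has no labels, how is the density map generated?
Could you explain?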
When I run the make_dataset, I encounter the error "ValueError: not enough values to unpack (expected 2, got 0)" as below
ValueError Traceback (most recent call last)
in ()
8 if int(gt[i][1])<img.shape[0] and int(gt[i][0])<img.shape[1]:
9 k[int(gt[i][1]),int(gt[i][0])]=1
---> 10 k = gaussian_filter_density(k)
11 with h5py.File(img_path.replace('.jpg','.h5').replace('images','ground_truth'), 'w') as hf:
12 hf['density'] = k
in gaussian_filter_density(gt)
10 leafsize = 2048
11 # build kdtree
---> 12 tree = scipy.spatial.KDTree(pts.copy(), leafsize=leafsize)
13 # query kdtree
14 distances, locations = tree.query(pts, k=4)
~/anaconda3/lib/python3.6/site-packages/scipy/spatial/kdtree.py in __init__(self, data, leafsize)
233 def __init__(self, data, leafsize=10):
234 self.data = np.asarray(data)
--> 235 self.n, self.m = np.shape(self.data)
236 self.leafsize = int(leafsize)
237 if self.leafsize < 1:
ValueError: not enough values to unpack (expected 2, got 0)
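"expected 2, got 0" is what unpacking `np.shape` of a 0-dimensional array produces, and the traceback shows Python 3.6. One plausible cause, assuming the notebook builds `pts` with something like `np.array(zip(...))` (that line is not visible in the traceback, so this is a guess), is that `zip` is a lazy iterator in Python 3 and NumPy wraps it as a single object. The toy `gt` map below is invented.

```python
import numpy as np

gt = np.zeros((5, 5))
gt[1, 2] = 1
gt[3, 4] = 1
rows, cols = np.nonzero(gt)

# Python 3: zip() is a lazy iterator, so NumPy wraps it as one object.
broken = np.array(zip(cols, rows))
print(np.shape(broken))          # () -> nothing to unpack into (n, m)

# Materialising the pairs first gives the expected (num_points, 2) array.
pts = np.array(list(zip(cols, rows)))
print(pts.shape)                 # (2, 2)
```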
Hello,
I'm a researcher working with my team at NCTU, Taipei, Taiwan.
I have run into a problem.
Whenever I want to replace the photos in the dataset with other pictures,
I run make_dataset.ipynb in Jupyter Notebook.
But the result is wrong: it is still the result for the previous Shanghai'10 Expo dataset.
My method is to clear all the photos in the folder "image" and replace them with mine.
Could this approach be the problem?
Thank you!
In val.ipynb, while executing
model = CSRNet()
I got this error:
AttributeError                            Traceback (most recent call last)
in ()
----> 1 model = CSRNet()
~\CSRNet-pytorch-master\model.py in __init__(self, load_weights)
     15         if not load_weights:
     16             mod = models.vgg16(pretrained = True)
---> 17             self._initialize_weights()
     18             for i in xrange(len(self.frontend.state_dict().items())):
     19                 self.frontend.state_dict().items()[i][1].data[:] = mod.state_dict().items()[i][1].data[:]
~\CSRNet-pytorch-master\model.py in _initialize_weights(self)
     26         for m in self.modules():
     27             if isinstance(m, nn.Conv2d):
---> 28                 nn.init.normal_(m.weight, std=0.01)
     29                 if m.bias is not None:
     30                     nn.init.constant_(m.bias, 0)
AttributeError: module 'torch.nn.init' has no attribute 'normal_'
I don't understand the MATLAB code, for example GAME_recursive(density, gt, currentLevel, targetLevel).
What do currentLevel and targetLevel mean?
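As far as I understand the GAME metric (Grid Average Mean absolute Error), level L splits the image into 4^L non-overlapping cells and sums the absolute count error over the cells. Under that reading, `targetLevel` would be the L being evaluated and `currentLevel` the recursion depth reached so far. The Python sketch below is a guess at that logic, not a translation of the MATLAB file, and the 4x4 maps are invented.

```python
import numpy as np

def game_recursive(density, gt, current_level, target_level):
    """Sum of absolute count errors over the 4**target_level grid cells."""
    if current_level == target_level:
        return abs(float(density.sum()) - float(gt.sum()))
    h, w = density.shape
    mh, mw = h // 2, w // 2
    total = 0.0
    # Split both maps into the same four quadrants and recurse one level deeper.
    for rs in (slice(0, mh), slice(mh, h)):
        for cs in (slice(0, mw), slice(mw, w)):
            total += game_recursive(density[rs, cs], gt[rs, cs],
                                    current_level + 1, target_level)
    return total

pred = np.zeros((4, 4))
pred[0, 0] = 1.0          # one person predicted in the top-left corner
true = np.zeros((4, 4))
true[3, 3] = 1.0          # one person actually in the bottom-right corner
game0 = game_recursive(pred, true, 0, 0)
game1 = game_recursive(pred, true, 0, 1)
print(game0)              # 0.0: the total counts agree
print(game1)              # 2.0: two quadrants are each off by one
```

Level 0 reduces to the plain absolute count error, while higher levels also penalise people counted in the wrong place.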
Hello,
When I run python train.py ./part_A_train_with_val.json ./part_A_val.json 0 , I get RuntimeError: Expected object of type torch.cuda.FloatTensor but found type torch.FloatTensor for argument #3 'other'.
@leeyeehoo
How do I run training and testing? My problem is that I don't know what the "task id to use" is.
~/ML/Crowd-count/final/CSRNet$ python2 train.py TRAIN /home/tikam/ML/Crowd-count/final/CSRNet/train_A.json GPU 0 TASK 0
usage: train.py [-h] [--pre PRETRAINED] TRAIN TEST GPU TASK
train.py: error: unrecognized arguments: TASK 1
Is it okay to default to GPU 0 and TASK 0?
What should I use for TASK [ID]?
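The usage line suggests that TRAIN, TEST, GPU and TASK are placeholders for four positional values rather than keywords to type, so the literal words `TRAIN`, `GPU` and `TASK` in the command would be read as values and the leftovers rejected. The parser below is reconstructed from the usage line only. The destination names and help strings are hypothetical, and what the script does with the task value should be checked in train.py itself.

```python
import argparse

def build_parser():
    # Reconstructed from the usage line only; the dest names are hypothetical.
    parser = argparse.ArgumentParser(prog="train.py")
    parser.add_argument("train_json", metavar="TRAIN", help="path to train json")
    parser.add_argument("test_json", metavar="TEST", help="path to test json")
    parser.add_argument("--pre", metavar="PRETRAINED", default=None,
                        help="path to a pretrained checkpoint")
    parser.add_argument("gpu", metavar="GPU", help="GPU id to use")
    parser.add_argument("task", metavar="TASK", help="task id to use")
    return parser

# Pass four bare values; TRAIN, TEST, GPU and TASK are placeholders, not keywords.
args = build_parser().parse_args(["train.json", "val.json", "0", "0"])
print(args.train_json, args.test_json, args.gpu, args.task)
```

On the command line this corresponds to something like `python train.py train.json val.json 0 0`, with the two file names replaced by real paths.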
Hello,
Thanks a lot for the very easy-to-follow instructions and the code. I am having difficulty reproducing your results. I am using your pretrained model 'PartAmodel_best.pth.tar'. And I am using the test_data set of ShanghaiTech. Getting very high MAE - in the 1000's.
PyTorch 0.4.1
CUDA 9.0
Python 2.7
Please let me know if I am doing anything wrong.
Thanks,
Buvana
Snippet of code (based on your val.ipynb) and result:
model = CSRNet()
model = model.cuda()
checkpoint = torch.load('PartAmodel_best.pth.tar')
model.load_state_dict(checkpoint['state_dict'])
import random
n=random.randint(0,182)
image_file=img_paths[n]
print n, image_file
17 /home/scene/CSRNet-pytorch/ShanghaiTech/part_A/test_data/images/IMG_34.jpg
gt_file = h5py.File(image_file.replace('.jpg','.h5').replace('images','ground-truth'),'r')
groundtruth = np.asarray(gt_file['density'])
print np.sum(groundtruth)
plt.imshow(groundtruth,cmap=CM.jet)
84.752235
img = 255.0 * F.to_tensor(Image.open(image_file).convert('RGB'))
img[0,:,:]=img[0,:,:]-92.8207477031
img[1,:,:]=img[1,:,:]-95.2757037428
img[2,:,:]=img[2,:,:]-104.877445883
img = img.cuda()
output = model(img.unsqueeze(0))
print (output.detach().cpu().sum().numpy(), np.sum(groundtruth))
630.8671 84.752235
mae=abs(output.detach().cpu().sum().numpy()-np.sum(groundtruth))
print mae
546.11487
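Python: 2.7 + PyTorch: 0.4.0 + CUDA: 9.2
However, when I run the command from the PyTorch website: conda install pytorch torchvision cuda92 -c pytorch
on Ubuntu 16.04, pytorch 0.1.12 is installed by default. I tried to upgrade pytorch, but conda list still shows pytorch: 0.1.12-py27cuda8.0cudnn6.0_1. I also tried rewriting the code in Python 3 syntax and moving the ipynb code into a .py file for debugging, but the code still does not run properly. Is there another combination of environment versions that solves this?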
@leeyeehoo I was trying to use more than 1 GPU to train.
I did modify the code by adding nn.DataParallel, but it doesn't seem to work.
What is the fix for the issue?
Hello
Thank you for sharing such beautiful work.
May I ask which experimental setup you have used to achieve the results stated in the paper?
I saw that you have two different setups in your json files. I would be glad if I could find out which one is the official.
Did you use validation? If yes, how many of the training images have you reserved for validation set?
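When I train, the results on the validation set are very poor, hovering around 300-400. What could be the reason? I did not change anything else, only the parts that are incompatible between Python 2 and Python 3. The biggest change is in model.py:
if not load_weights:
    mod = models.vgg16(pretrained = True)
    self._initialize_weights()
    # read the parameters
    pretrained_dict=mod.state_dict()
    self.frontend_dict=self.frontend.state_dict()
    # drop the keys in pretrained_dict that do not belong to frontend_dict
    pretrained_dict={k:v for k, v in pretrained_dict.items() if k in self.frontend_dict}
    # update the existing frontend_dict
    self.frontend_dict.update(pretrained_dict)
    # load the state_dict we actually need
    self.frontend.load_state_dict(self.frontend_dict)
I made this change because Python 3 does not support that kind of indexing. I am not sure whether the change is correct, and the validation results remain very poor.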
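One possible explanation, offered as a guess: torchvision's VGG16 names its convolutional parameters with a `features.` prefix, while a bare `nn.Sequential` frontend would use plain indices, so filtering by key name keeps nothing and the frontend stays at its random initialisation. The sketch below shows the effect with plain dictionaries of placeholder strings. The exact key names are assumptions and are easy to confirm by printing both `state_dict().keys()`.

```python
from collections import OrderedDict

# Hypothetical key names: torchvision's VGG16 prefixes conv layers with
# "features.", while a bare nn.Sequential frontend uses plain indices.
pretrained = OrderedDict([
    ("features.0.weight", "vgg_w0"), ("features.0.bias", "vgg_b0"),
    ("features.2.weight", "vgg_w2"), ("features.2.bias", "vgg_b2"),
    ("classifier.0.weight", "vgg_fc"),
])
frontend = OrderedDict([
    ("0.weight", "random"), ("0.bias", "random"),
    ("2.weight", "random"), ("2.bias", "random"),
])

# Filtering by key name, as in the snippet above, matches nothing here.
by_name = {k: v for k, v in pretrained.items() if k in frontend}
print(len(by_name))                       # 0 -> frontend keeps its random init

# Pairing by position reproduces what the original index-based loop did.
by_position = OrderedDict(
    (fk, pv) for fk, (_, pv) in zip(frontend.keys(), pretrained.items())
)
print(by_position)
```

Printing `len(pretrained_dict)` right after the filtering step in model.py would show whether this is what happens in practice.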
When I train the model, the loss is nan
Epoch: [0][1170/1200] Time 0.709 (0.417) Data 0.020 (0.017) Loss nan (nan)
Thanks for your previous work and your pytorch implementation!
I wonder how to train on the UCF_CC_50 dataset.
With your implementation, I reach the MAE/MSE reported in the paper on the ShanghaiTech dataset. However, using 5-fold cross-validation on the UCF_CC_50 dataset, I could not get the MAE/MSE in your paper. Is there something I forgot to implement or missed?
Could you please share some training details or release the code?
Thank you!
Thank you for sharing. It's really outstanding work.
There is an error when I test my model after training; it happens at the line "model.load_state_dict(checkpoint['checkpoint.pth.tar']) " in val.ipynb.
It says "KeyError: 'checkpoint.pth.tar' ".
Could you please give me some advice?
Thanks for the released code!
We can see that there is normalization in both train and test, as follows:
transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
So why do we need normalization? How are the parameters 'mean' and 'std' calculated?
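For context, these particular numbers are the per-channel mean and standard deviation of ImageNet that are commonly used with torchvision's pretrained models. Since the frontend starts from ImageNet-pretrained VGG16 weights, feeding inputs with the same statistics is the usual choice. The NumPy sketch below mirrors what `ToTensor` followed by `Normalize` does to the values. The function name and the all-white test image are made up.

```python
import numpy as np

IMAGENET_MEAN = np.array([0.485, 0.456, 0.406])
IMAGENET_STD = np.array([0.229, 0.224, 0.225])

def normalize_image(rgb_uint8):
    """HxWx3 uint8 RGB image -> 3xHxW float array, scaled and standardised."""
    x = rgb_uint8.astype(np.float64) / 255.0        # what ToTensor does to the range
    x = (x - IMAGENET_MEAN) / IMAGENET_STD          # broadcasts over the channel axis
    return np.transpose(x, (2, 0, 1))               # channels first, as PyTorch expects

image = np.full((2, 2, 3), 255, dtype=np.uint8)     # hypothetical all-white image
out = normalize_image(image)
print(out.shape)        # (3, 2, 2)
print(out[:, 0, 0])     # (1 - mean) / std for each channel
```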
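Hello, thank you for your beautiful work. While reading the code, I noticed that the ground truth for part_A and part_B is generated with different Gaussian methods. Is there a reason for this?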
Hi,
When I read the code, I didn't find the two approaches (Fig. 4) you mentioned in Section 3.1.1. Also, in the paper the final output size is the same as the input, but in the code the output is 1/8 of the input.
Is there any other code?
Thank you for releasing the code.
I have some questions about the source code.
1. How do I generate the density map from the output of the model? When I run val.ipynb I only get the ground-truth density map, not a density map from the model output. In other words, how do I transform the model output into a density map?
2. I found that the sum of the ground-truth density map is not the same as the ground-truth count of the ShanghaiTech dataset in the .mat file. Even worse, they can be very different, as the data below shows. If the sum of the ground truth is far from the real count, what is the point of calculating the error against it?
number output groundtruth
1 218.73355 265.32138
2 242.14491 232.52945
3 113.73707 65.99532
4 371.46692 372.23004
5 432.0569 306.81796
6 144.08446 190.22192
7 269.57242 269.53113
8 565.20056 565.5759
9 235.49048 307.76044
10 507.46124 510.92313
11 297.7716 298.135
12 317.20828 493.84286
13 665.5448 580.9203
14 536.8564 468.91245
15 189.81647 241.52519
16 558.702 583.2348
17 102.62774 95.6099
18 401.4803 376.6646
19 244.39113 211.23973
20 343.18445 373.51428
Two counts are available for each picture: the count from the .mat file (.mat image_info number) and the count from the sum of the generated density map, and they differ. For the first image in ShanghaiTech Part A, the sum of the generated density map is 265.32 according to the data above, whereas the count in the .mat file for this image is 172. And it is not only this image; most of the data have the same problem.
Here is my code:
mae = 0
for i in xrange(len(img_paths)):
    img = 255.0 * F.to_tensor(Image.open(img_paths[i]).convert('RGB'))
    img[0,:,:]=img[0,:,:]-92.8207477031
    img[1,:,:]=img[1,:,:]-95.2757037428
    img[2,:,:]=img[2,:,:]-104.877445883
    img = img.cuda()
    img = transform(Image.open(img_paths[i]).convert('RGB')).cuda()
    gt_file = h5py.File(img_paths[i].replace('.jpg','.h5').replace('images','ground_truth'),'r')
    groundtruth = np.asarray(gt_file['density'])
    output = model(img.unsqueeze(0))
    print i+1,output.detach().cpu().sum().numpy(),np.sum(groundtruth)
#print mse
I would appreciate help from anyone who has an idea about this problem. Thank you!
The data augmentation in the code is not the same as in the paper. There is no data augmentation in the code; it just uses the same image four times.
Hello, I have a problem when I run the Jupyter notebook val.ipynb.
I checked the sizes of the output and the ground truth
and found that the output is 1/8 the size of the input image,
for example: (704,1024) (88,128)
So my question is: do I need to upsample the model output 8x with bilinear interpolation before counting?
Thanks a lot
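For counting alone, upsampling does not seem necessary: the validation snippets in this thread sum the model output directly (`output.detach().cpu().sum()`). If the map is upsampled for visualisation, interpolation multiplies the sum by roughly 64, so it has to be divided by 64 to remain a count. The sketch below uses nearest-neighbour repetition in NumPy as a stand-in for bilinear upsampling, which is an assumption made to keep the arithmetic exact. The toy map values are invented.

```python
import numpy as np

def upsample_nearest_8x(density_small):
    """Repeat every pixel 8 times along both axes (nearest-neighbour upsampling)."""
    return np.kron(density_small, np.ones((8, 8)))

small = np.zeros((88, 128))
small[10, 20] = 3.0
small[40, 100] = 2.5            # toy 1/8-resolution output whose sum is 5.5

big = upsample_nearest_8x(small)
print(big.shape)                # (704, 1024)
print(small.sum())              # 5.5
print(big.sum())                # 352.0 = 5.5 * 64
print((big / 64).sum())         # 5.5 again
```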
How do I train this model? Using your pretrained parameters gives good results, but not all fine-tuned parameters are released. I want to train this model on the MALL dataset, but the evaluation result is always the same across epochs. My batch_size is 1, lr is 0.001, and the optimizer is SGD. Can you give me some advice on how to train this model? @leeyeehoo