
bayesian-crowd-counting's Introduction

Bayesian-Crowd-Counting (ICCV 2019 oral)

arXiv | CVF

Official implementation of the ICCV 2019 oral paper "Bayesian Loss for Crowd Count Estimation with Point Supervision"

Visualization

Qualitative results of the Bayesian, Bayesian+, and Density methods (images omitted).

Citation

If you use this code for your research, please cite our paper:

@inproceedings{ma2019bayesian,
  title={Bayesian loss for crowd count estimation with point supervision},
  author={Ma, Zhiheng and Wei, Xing and Hong, Xiaopeng and Gong, Yihong},
  booktitle={Proceedings of the IEEE International Conference on Computer Vision},
  pages={6142--6151},
  year={2019}
}

Code

Install dependencies

torch >= 1.0, torchvision, opencv, numpy, and scipy; all of the dependencies can be easily installed with pip or conda.

This code was tested with Python 3.6.
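For example, a minimal install sketch (the README does not pin exact versions, and opencv-python as the pip package name is my assumption):

pip install "torch>=1.0" torchvision opencv-python numpy scipy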

Train and Test

1. Download the UCF-QNRF dataset: Link

2. Preprocess the data (resize images and split train/validation):

python preprocess_dataset.py --origin-dir <directory of original data> --data-dir <directory of processed data>

3. Train the model (validated on a single GTX Titan X):

python train.py --data-dir <directory of processed data> --save-dir <directory of log and model>

4. Test the model:

python test.py --data-dir <directory of processed data> --save-dir <directory of log and model>

The result is slightly influenced by the random seed. Fixing the random seed (which requires setting cuda_benchmark to False) makes training take extraordinarily long, so occasionally you may get a slightly worse result than the reported one, but most of the time you will get a better one. If you find this code useful, please give us a star and cite our paper. Have fun.
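For reference, a minimal seeding sketch (my own, not part of the training script) illustrating the trade-off described above: fixing the seed only gives reproducible results if cuDNN benchmarking is also disabled, and that is what makes training so slow.

import random
import numpy as np
import torch

def set_seed(seed=42):
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    torch.backends.cudnn.benchmark = False      # the cuda_benchmark switch mentioned above
    torch.backends.cudnn.deterministic = True   # trade speed for reproducibility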

5. Training on the ShanghaiTech dataset

Change the dataloader to crowd_sh.py.

For ShanghaiTech Part A, set the learning rate to 1e-6 and bg_ratio to 0.1, as in the command sketch below.
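As a command-line sketch (the --lr and --background-ratio flag names are my assumption; check the argparse definitions in train.py for the exact names):

python train.py --data-dir <directory of processed ShanghaiTech A> --save-dir <directory of log and model> --lr 1e-6 --background-ratio 0.1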

Pretrained Weights

UCF-QNRF

Baidu Yun Link (extraction code: x9wc)

Google Drive Link

ShanghaiTech A

Baidu Yun Link (extraction code: tx0m)

Google Drive Link

ShanghaiTech B

Baidu Yun Link (extraction code: a15u)

Google Drive Link

License

GNU GENERAL PUBLIC LICENSE Version 3, 29 June 2007 Copyright © 2007 Free Software Foundation, Inc. http://fsf.org/

bayesian-crowd-counting's People

Contributors

ZhihengCV

bayesian-crowd-counting's Issues

How do I preprocess JHU++ datasets?

Hi,
I am trying to run your project and a few others that rely on your preprocessing step. I would like to preprocess the JHU++ dataset, and I am not clear on how to accomplish that. Can you point out the steps? Thank you.
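For what it's worth, a hedged sketch of converting JHU annotations into the point arrays this preprocessing expects. I'm assuming the JHU-CROWD++ per-image .txt annotation format, where the first two columns of each line are a head's x, y coordinates; verify against the dataset's own documentation:

import numpy as np

def load_jhu_points(txt_path):
    # Read one JHU-CROWD++ annotation file into an (N, 2) array of x, y
    # head positions, the same point format this repo's preprocessing uses.
    points = []
    with open(txt_path) as f:
        for line in f:
            parts = line.split()
            if len(parts) >= 2:
                points.append([float(parts[0]), float(parts[1])])
    return np.array(points, dtype=np.float32).reshape(-1, 2)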

What is the shape of the .npy file?

Hi! When I use my own dataset, I have a problem with the keypoints: File "D:\Crowed_recognition\code\GeneralizedLoss-Counting-Pytorch-main\datasets\crowd.py", line 112, in train_transform_with_crop, nearest_dis = np.clip(keypoints[:, 2], 4.0, 128.0), IndexError: index 2 is out of bounds for axis 1 with size 2.

By the way, the shape of my own .npy file is (n, 2), such as [[1,2],[2,3]]. I do not know the expected shape of the .npy file. Please help, thanks!
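For reference: the loader expects (n, 3) arrays, where the third column is the mean distance from each point to its three nearest neighbors, as computed by find_dis in preprocess_dataset.py (quoted in a later issue). A minimal sketch for extending an (n, 2) array, assuming at least 4 points (see the out-of-range issue further down):

import numpy as np

def add_nearest_dis(points):
    # points: (n, 2) head coordinates; returns (n, 3) with the mean distance
    # to the 3 nearest neighbors appended, mirroring find_dis.
    square = np.sum(points * points, axis=1)
    dis = np.sqrt(np.maximum(square[:, None] - 2 * np.matmul(points, points.T) + square[None, :], 0.0))
    nearest = np.mean(np.partition(dis, 3, axis=1)[:, 1:4], axis=1, keepdims=True)  # skip column 0 (self-distance)
    return np.concatenate([points, nearest], axis=1)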

the abs in forward

This work is very interesting and useful.
I have a question about the code:
why do you use torch.abs(x) at the end of the VGG forward?
In my view, the result of self.reg_layer could be used directly as the model output.

So I want to know the reason you use torch.abs.

Thank you very much
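For context, a paraphrased sketch of the pattern being asked about (my reading, not the verbatim repo code): the regression head's raw output is unconstrained in sign, while a density map must be non-negative everywhere, and wrapping the output in torch.abs is one way to enforce that.

def forward(self, x):
    x = self.features(x)    # VGG backbone
    x = self.reg_layer(x)   # regression head; raw output may be negative
    return torch.abs(x)     # a density map must be non-negative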

Annotation (.mat) file needed to preprocess the input dataset

Hello everyone, I'm interested in testing this repo with my personal dataset against one of the pretrained models. Reviewing the UCF-QNRF dataset which I downloaded, I realised that for each image (jpg file) there is another file (img_xxx_ann.mat), and when trying the script "preprocess_dataset.py" I observed that this .mat file is necessary to preprocess the dataset before testing. So my question is: what exactly is this .mat file? How can I generate it?

Thanks in advance!
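For anyone else wondering: the img_xxx_ann.mat files hold the point annotations (one x, y coordinate per head). A hedged sketch of writing one for your own images, assuming the key name is annPoints as in UCF-QNRF (check how preprocess_dataset.py loads the .mat files to confirm):

import numpy as np
from scipy.io import savemat

points = np.array([[123.0, 45.0], [200.5, 310.2]])   # one (x, y) per head
savemat('img_0001_ann.mat', {'annPoints': points})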

About releasing the code

Hello!
The idea of applying a Bayesian method to crowd counting is awesome! I'd like to reproduce or improve on your work, but is there any specific plan to share your code with us (e.g., when the code will be released)? It has been 27 days since you posted the rough plan. I'd really appreciate it if the code were released. Thanks!

Performance with UCF-QNRF

Hi,

Thanks for sharing the code.
After running your code with all the default configurations, I can only get ~100 MAE on the UCF-QNRF dataset.
May I ask if there is anything I need to modify in order to get the results reported in the paper?

Thanks!
Ze

Issue about dataloader

In crowd.py, key_points has a shape of [x, 2] and target has a shape of [y], so their first dimensions are different. But when I read a batch from the dataloader defined in regression_train.py in utils, target and points have the same first dimension. Why? And what is the meaning of the targets; does anyone know?

Training data

Hi

What is the ideal dataset size you would recommend training with?
How many pictures would you recommend as a good starting point to train on and get accurate results?

Thanks

code environment

Could you publish your requirements.txt?
I would like to know your environment, such as the PyTorch and Python versions.
Thank you!
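Based only on what the README above states (Python 3.6, torch >= 1.0), a minimal requirements.txt might look like the following; the exact versions the authors tested with are not published, so treat these as assumptions:

torch>=1.0
torchvision
opencv-python
numpy
scipy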

One loss or two loss functions?

There are two losses in your model, right? One is the Bayesian loss for point supervision, and the other is for training D^est. Is the loss for training D^est a pixel-wise loss?

One more question is about p(y_n|x_m). The value of y_n varies from one image to another; that is, N is different for each image. The N of a test image also differs from those of the training images. What is the range of values of y_n during training?
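For reference, my paraphrase of the paper's formulation (worth checking against the paper itself): there is a single Bayesian loss, and D^est is simply the network's output density map, not something trained by a separate pixel-wise term. With a Gaussian likelihood p(x_m|y_n) = N(x_m; z_n, sigma^2 * I) around each annotated point z_n and a uniform prior, the posterior is p(y_n|x_m) = p(x_m|y_n) / sum_k p(x_m|y_k), the expected count for annotation n is E[c_n] = sum_m p(y_n|x_m) * D^est(x_m), and the loss is L^Bayes = sum_n F(1 - E[c_n]) with F a distance such as the l1 norm. N therefore varies per image by construction; y_n just indexes that image's annotations.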

How to visualize the test result ?

I have downloaded your pre-trained model and I'm running test.py:

python3 test.py --data-dir processed_data --save-dir logs
In the output I get:

('img_0001',) -317.4151611328125 975 1292.4151611328125
('img_0002',) 234.6561279296875 923 688.3438720703125
.
.
.
('img_0017',) 11.425949096679688 191 179.5740509033203

My question is: where is the visualization (heatmap) image saved?
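As far as I can tell from the output above, test.py only prints per-image counts and errors and does not save heatmaps. A hedged sketch of rendering one yourself, assuming inputs and model are set up as in test.py and the model returns a (1, 1, H, W) density map:

import torch
import matplotlib.pyplot as plt

with torch.no_grad():
    outputs = model(inputs)               # (1, 1, H, W) predicted density map
density = outputs[0, 0].cpu().numpy()
plt.imshow(density, cmap='jet')           # render the density as a heatmap
plt.title('predicted count: %.1f' % density.sum())
plt.axis('off')
plt.savefig('img_0001_heatmap.png', bbox_inches='tight')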

Parameters for ShanghaiTech Part A

Hello @ZhihengCV ,

I changed --crop-size to 256 as mentioned in the paper. During data preparation, min_size is fixed at 256 and max_size at 5096. All other parameters are left at their defaults. I ran it on ShanghaiTech Part A. Unlike what is reported in the paper, I could only get an MAE of 66.0, a difference of 3.2. I believe the sigma and background-ratio parameters might differ from the default values. Can you please let me know the changes from this codebase to the one used for ShanghaiTech Part A? One more thing: I used just the train/test split; if you have a different split, please let me know that as well. Thanks.

Multi-GPU training

Hello,

Thank you for the open-sourced code. I am trying to replace the current backbone with a bigger network for my own dataset; however, I cannot run this on a single GPU. Could you please let me know how to achieve multi-GPU training in this case? Is it different from other approaches that do multi-GPU training using DataParallel?

Thank you in advance for this help !
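A hedged note, not specific to this repo: the standard PyTorch pattern is nn.DataParallel, which splits the image batch across GPUs. The wrinkle here is that the loss side consumes per-image point lists (see the Post_Prob code quoted below), which DataParallel does not split automatically, so only the backbone forward pass parallelizes cleanly. A minimal sketch, where the vgg19 constructor and import path are my assumptions:

import torch
from models.vgg import vgg19   # assumed import path; adjust to your backbone

model = vgg19()
if torch.cuda.device_count() > 1:
    model = torch.nn.DataParallel(model)   # data-parallel over the image batch
model = model.cuda()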

How to set the right paths in Linux?

python preprocess_dataset.py --origin_dir data/ --data_dir processed/
usage: preprocess_dataset.py [-h] [--origin-dir ORIGIN_DIR]
[--data-dir DATA_DIR]
preprocess_dataset.py: error: unrecognized arguments: --origin_dir data/ --data_dir processed/
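Per the usage message, the flags are defined with hyphens rather than underscores:

python preprocess_dataset.py --origin-dir data/ --data-dir processed/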

What does self.cood mean in post_prob? Could you explain this part of the source code?

Could you explain the computation principle of the parts marked with 【brackets】? Thanks.

import torch
from torch.nn import Module

class Post_Prob(Module):
    def __init__(self, sigma, c_size, stride, background_ratio, use_background, device):
        super(Post_Prob, self).__init__()
        assert c_size % stride == 0

        self.sigma = sigma
        self.bg_ratio = background_ratio
        self.device = device
        # coordinate is same to image space, set to constant since crop size is same
        self.cood = torch.arange(0, c_size, step=stride,
                                 dtype=torch.float32, device=device) + stride / 2
        # 【What is self.cood used for?】
        # (it holds the pixel-grid center coordinates of the crop, against
        # which each head point's distance is measured)
        self.cood.unsqueeze_(0)
        self.softmax = torch.nn.Softmax(dim=0)
        self.use_bg = use_background

    def forward(self, points, st_sizes):
        num_points_per_image = [len(points_per_image) for points_per_image in points]  # number of heads per image
        all_points = torch.cat(points, dim=0)  # head coordinates, shape (n, 2)
        if len(all_points) > 0:
            x = all_points[:, 0].unsqueeze_(1)  # column of x coordinates
            y = all_points[:, 1].unsqueeze_(1)  # column of y coordinates

            # 【What do these four lines compute?】
            # (they expand the squared distance (a - b)^2 = a^2 - 2ab + b^2
            # between every head coordinate and every grid coordinate)
            x_dis = -2 * torch.matmul(x, self.cood) + x * x + self.cood * self.cood
            y_dis = -2 * torch.matmul(y, self.cood) + y * y + self.cood * self.cood
            y_dis.unsqueeze_(2)
            x_dis.unsqueeze_(1)
            dis = y_dis + x_dis
            dis = dis.view((dis.size(0), -1))

            dis_list = torch.split(dis, num_points_per_image)
            prob_list = []
            for dis, st_size in zip(dis_list, st_sizes):
                if len(dis) > 0:
                    if self.use_bg:
                        min_dis = torch.clamp(torch.min(dis, dim=0, keepdim=True)[0], min=0.0)
                        d = st_size * self.bg_ratio
                        bg_dis = (d - torch.sqrt(min_dis))**2
                        dis = torch.cat([dis, bg_dis], 0)  # concatenate background distance to the last
                    dis = -dis / (2.0 * self.sigma ** 2)
                    prob = self.softmax(dis)
                else:
                    prob = None
                prob_list.append(prob)
        else:
            prob_list = []
            for _ in range(len(points)):
                prob_list.append(None)
        return prob_list

Issue about batch size

As we know, the ground-truth keypoints and target have different shapes for different inputs. I just wonder whether the batch_size can be greater than 1 in the training stage?

license?

Could you please add a license? Happy to do a PR (for MIT?) if you want.

Could you share the pre-trained model via Google Drive or something other than the Baidu service?

Hello, first of all, I appreciate your work and your sharing of the code.
However, I couldn't download the pretrained model, because I don't have the Chinese VPN and mobile phone number that are necessary to register for the Baidu service (as far as I understand it).
So it would be nicer if you could share the pretrained model via other methods, such as Google Drive.
Thanks in advance.

The meaning of the density map?

I am a new learner in this area, so this is a naive question. With this model, I can get a density map for my own pictures. What is the meaning of each point's value in the density map? Is it the value of crowd density at that pixel?

I have googled this for a long time without figuring it out. In the final crowd-density heatmap, what does each point's value represent? Does it represent the crowd density at that point, since higher-density areas have larger values? If so, can we say that a point with a value of 3 means the network predicts three people at that point? I know this question may be silly; thanks to anyone who answers.

What is the p(x_m|y_n) in the testing stage?

Hi. There are no parameters learned for the Bayesian model. In the training stage, p(x_m|y_n) can be obtained from the ground-truth density map. But in the testing stage, how do we get p(x_m|y_n)? Did I miss something?

What's the meaning of background_ratio?

Hi!

Can you tell me the meaning of the code that calculates the background distance, bg_dis = (st_size * self.bg_ratio) ** 2 / (min_dis + 1e-5), and what background_ratio means? Thank you very much.

Code for NWPU

Could you release the settings you used to preprocess the NWPU dataset? I am running into CUDA memory errors while implementing this network on the NWPU-Crowd dataset. Sharing a pretrained model for NWPU would be great too!

Thanks in Advance !

Preprocessing scenes with < 4 people gives an out-of-range error

I am processing the GCC synthetic dataset (GTA V scenes) for training the Bayesian Loss model. When there are fewer than 4 people in a scene, the find_dis function in preprocess_dataset.py raises an out-of-range error. The specific line throwing the error is the following:

Line 39: dis = np.mean(np.partition(dis, 3, axis=1)[:, 1:4], axis=1, keepdims=True)

When there are fewer than 4 people, the kth=3 argument to np.partition leads to the error.
While changing the kth argument to a smaller number (for example, 0) eliminates the error, I am curious about the logic of using the partition function. And why average only 3 columns (columns 1, 2, and 3) of the partitioned array?

Thank you,
Casey
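For what it's worth: column 0 of each partitioned row is a point's zero distance to itself, so columns 1-3 are its three nearest neighbors, and their mean serves as a local head-size estimate. A hedged guard for scenes with fewer than 4 people (my own workaround, not from the repo):

import numpy as np

def find_dis_safe(point):
    # Same logic as find_dis, but clamps the neighbor count so images with
    # fewer than 4 annotated heads do not raise an out-of-range error.
    n = len(point)
    square = np.sum(point * point, axis=1)
    dis = np.sqrt(np.maximum(square[:, None] - 2 * np.matmul(point, point.T) + square[None, :], 0.0))
    k = min(3, n - 1)                       # use up to 3 nearest neighbors
    if k < 1:
        return np.zeros((n, 1))             # a single point has no neighbors
    return np.mean(np.partition(dis, k, axis=1)[:, 1:k + 1], axis=1, keepdims=True)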

How is the density map D^est generated?

Hello, @ZhihengCV. In your paper, there is not much explanation of the density map D^{est}. How is D^{est} generated? Is it the output of the regressor? What loss is used to learn the density map? Is it the same as the loss L^{Bayes} for the posterior estimation? Thank you.

Issue about the dataset used in your code

In crowd.py, the code shows that .npy is the annotation file format, so which public dataset does it belong to? I found that the annotation files of many public datasets are in .mat format...
BTW, the variable keypoints seems to have a shape of (n, 3); keypoints[:, :2] holds the coordinates of the keypoints, so what does keypoints[:, 2] represent?

Where is the variable 'points' defined?

Where is the variable 'points' defined, in the first line of the function find_dis()? Thank you very much!

def find_dis(point):
    square = np.sum(point*points, axis=1)
    dis = np.sqrt(np.maximum(square[:, None] - 2*np.matmul(point, point.T) + square[None, :], 0.0))
    dis = np.mean(np.partition(dis, 3, axis=1)[:, 1:4], axis=1, keepdims=True)
    return dis
