
bayesian-crowd-counting's Introduction

Bayesian-Crowd-Counting (ICCV 2019 oral)

arXiv | CVF

Official implementation of the ICCV 2019 oral paper "Bayesian Loss for Crowd Count Estimation with Point Supervision"

Visualization

Qualitative results of the Bayesian, Bayesian+, and Density methods (images omitted).

Citation

If you use this code for your research, please cite our paper:

@inproceedings{ma2019bayesian,
  title={Bayesian loss for crowd count estimation with point supervision},
  author={Ma, Zhiheng and Wei, Xing and Hong, Xiaopeng and Gong, Yihong},
  booktitle={Proceedings of the IEEE International Conference on Computer Vision},
  pages={6142--6151},
  year={2019}
}

Code

Install dependencies

torch >= 1.0, torchvision, opencv, numpy, and scipy; all of the dependencies can be easily installed with pip or conda.

This code was tested with Python 3.6.
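For example, a minimal install sketch (the README does not pin exact versions, and opencv-python as the pip package name is my assumption):

pip install "torch>=1.0" torchvision opencv-python numpy scipy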

Train and Test

1. Download the UCF-QNRF dataset: Link

2. Preprocess the data (resize images and split train/validation):

python preprocess_dataset.py --origin-dir <directory of original data> --data-dir <directory of processed data>

3. Train the model (validated on a single GTX Titan X):

python train.py --data-dir <directory of processed data> --save-dir <directory of log and model>

4. Test the model:

python test.py --data-dir <directory of processed data> --save-dir <directory of log and model>

The result is slightly influenced by the random seed. Fixing the random seed (which requires setting cuda_benchmark to False) makes training take extraordinarily long, so occasionally you may get a slightly worse result than the reported one, but most of the time you will get a better one. If you find this code useful, please give us a star and cite our paper. Have fun.
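For reference, a minimal seeding sketch (my own, not part of the training script) illustrating the trade-off described above: fixing the seed only gives reproducible results if cuDNN benchmarking is also disabled, and that is what makes training so slow.

import random
import numpy as np
import torch

def set_seed(seed=42):
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    torch.backends.cudnn.benchmark = False      # the cuda_benchmark switch mentioned above
    torch.backends.cudnn.deterministic = True   # trade speed for reproducibility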

5. Training on the ShanghaiTech dataset

Change the dataloader to crowd_sh.py.

For ShanghaiTech Part A, set the learning rate to 1e-6 and bg_ratio to 0.1, as in the command sketch below.
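As a command-line sketch (the --lr and --background-ratio flag names are my assumption; check the argparse definitions in train.py for the exact names):

python train.py --data-dir <directory of processed ShanghaiTech A> --save-dir <directory of log and model> --lr 1e-6 --background-ratio 0.1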

Pretrained Weights

UCF-QNRF

Baidu Yun Link (extraction code: x9wc)

Google Drive Link

ShanghaiTech A

Baidu Yun Link (extraction code: tx0m)

Google Drive Link

ShanghaiTech B

Baidu Yun Link (extraction code: a15u)

Google Drive Link

License

GNU GENERAL PUBLIC LICENSE Version 3, 29 June 2007 Copyright © 2007 Free Software Foundation, Inc. http://fsf.org/

bayesian-crowd-counting's People

Contributors

ZhihengCV

bayesian-crowd-counting's Issues

How do I preprocess JHU++ datasets?

Hi,
I am trying to run your project and a few others that rely on your preprocessing step. I would like to preprocess the JHU++ dataset, and I am not clear on how to accomplish that. Can you point out the steps? Thank you.
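For what it's worth, a hedged sketch of converting JHU annotations into the point arrays this preprocessing expects. I'm assuming the JHU-CROWD++ per-image .txt annotation format, where the first two columns of each line are a head's x, y coordinates; verify against the dataset's own documentation:

import numpy as np

def load_jhu_points(txt_path):
    # Read one JHU-CROWD++ annotation file into an (N, 2) array of x, y
    # head positions, the same point format this repo's preprocessing uses.
    points = []
    with open(txt_path) as f:
        for line in f:
            parts = line.split()
            if len(parts) >= 2:
                points.append([float(parts[0]), float(parts[1])])
    return np.array(points, dtype=np.float32).reshape(-1, 2)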

What is the shape of the .npy file?

Hi! When I use my own dataset, I have a problem with the keypoints: File "D:\Crowed_recognition\code\GeneralizedLoss-Counting-Pytorch-main\datasets\crowd.py", line 112, in train_transform_with_crop, nearest_dis = np.clip(keypoints[:, 2], 4.0, 128.0), IndexError: index 2 is out of bounds for axis 1 with size 2.

By the way, the shape of my own .npy file is (n, 2), such as [[1,2],[2,3]]. I do not know the expected shape of the .npy file. Please help, thanks!
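For reference: the loader expects (n, 3) arrays, where the third column is the mean distance from each point to its three nearest neighbors, as computed by find_dis in preprocess_dataset.py (quoted in a later issue). A minimal sketch for extending an (n, 2) array, assuming at least 4 points (see the out-of-range issue further down):

import numpy as np

def add_nearest_dis(points):
    # points: (n, 2) head coordinates; returns (n, 3) with the mean distance
    # to the 3 nearest neighbors appended, mirroring find_dis.
    square = np.sum(points * points, axis=1)
    dis = np.sqrt(np.maximum(square[:, None] - 2 * np.matmul(points, points.T) + square[None, :], 0.0))
    nearest = np.mean(np.partition(dis, 3, axis=1)[:, 1:4], axis=1, keepdims=True)  # skip column 0 (self-distance)
    return np.concatenate([points, nearest], axis=1)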

the abs in forward

This work is very interesting and useful.
I have a question about the code:
why do you use torch.abs(x) at the end of the VGG forward?
In my view, the result of self.reg_layer could be used directly as the model output.

So I want to know the reason you use torch.abs.

Thank you very much
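For context, a paraphrased sketch of the pattern being asked about (my reading, not the verbatim repo code): the regression head's raw output is unconstrained in sign, while a density map must be non-negative everywhere, and wrapping the output in torch.abs is one way to enforce that.

def forward(self, x):
    x = self.features(x)    # VGG backbone
    x = self.reg_layer(x)   # regression head; raw output may be negative
    return torch.abs(x)     # a density map must be non-negative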

Annotation (.mat) file needed to preprocess the input dataset

Hello everyone, I'm interested in testing this repo with my personal dataset against one of the pretrained models. Reviewing the UCF-QNRF dataset which I downloaded, I realised that for each image (jpg file) there is another file (img_xxx_ann.mat), and when trying the script "preprocess_dataset.py" I observed that this .mat file is necessary to preprocess the dataset before testing. So my question is: what exactly is this .mat file? How can I generate it?

Thanks in advance!
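For anyone else wondering: the img_xxx_ann.mat files hold the point annotations (one x, y coordinate per head). A hedged sketch of writing one for your own images, assuming the key name is annPoints as in UCF-QNRF (check how preprocess_dataset.py loads the .mat files to confirm):

import numpy as np
from scipy.io import savemat

points = np.array([[123.0, 45.0], [200.5, 310.2]])   # one (x, y) per head
savemat('img_0001_ann.mat', {'annPoints': points})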

About releasing the code

Hello!
The idea of applying a Bayesian method to crowd counting is awesome! I'd like to reproduce or improve on your work, but is there any specific plan to share your code with us (e.g., when the code will be released)? It has been 27 days since you posted the rough plan. I'd really appreciate it if the code were released. Thanks!

Performance with UCF-QNRF

Hi,

Thanks for sharing the code.
After running your code with all the default configurations, I can only get ~100 MAE on the UCF-QNRF dataset.
May I ask if there is anything I need to modify in order to get the results reported in the paper?

Thanks!
Ze

Issue about dataloader

In crowd.py, key_points has a shape of [x, 2] and target has a shape of [y], so their first dimensions are different. But when I read a batch from the dataloader defined in regression_train.py in utils, target and points have the same first dimension. Why? And what is the meaning of the targets; does anyone know?

Training data

Hi

What is the ideal dataset size you would recommend training with?
How many pictures would you recommend as a good starting point to train on and get accurate results?

Thanks

code environment

Could you publish your requirements.txt?
I would like to know your environment, such as the PyTorch and Python versions.
Thank you!
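Based only on what the README above states (Python 3.6, torch >= 1.0), a minimal requirements.txt might look like the following; the exact versions the authors tested with are not published, so treat these as assumptions:

torch>=1.0
torchvision
opencv-python
numpy
scipy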

One loss or two loss functions?

There are two losses in your model, right? One is the Bayesian loss for point supervision, and the other is for training D^est. Is the loss for training D^est a pixel-wise loss?

One more question is about p(y_n|x_m). The value of y_n varies from one image to another; that is, N is different for each image. The N of a test image also differs from those of the training images. What is the range of values of y_n during training?
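For reference, my paraphrase of the paper's formulation (worth checking against the paper itself): there is a single Bayesian loss, and D^est is simply the network's output density map, not something trained by a separate pixel-wise term. With a Gaussian likelihood p(x_m|y_n) = N(x_m; z_n, sigma^2 * I) around each annotated point z_n and a uniform prior, the posterior is p(y_n|x_m) = p(x_m|y_n) / sum_k p(x_m|y_k), the expected count for annotation n is E[c_n] = sum_m p(y_n|x_m) * D^est(x_m), and the loss is L^Bayes = sum_n F(1 - E[c_n]) with F a distance such as the l1 norm. N therefore varies per image by construction; y_n just indexes that image's annotations.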

How to visualize the test result ?

I have downloaded your pre-trained model and I'm running test.py:

python3 test.py --data-dir processed_data --save-dir logs
In the output I get:

('img_0001',) -317.4151611328125 975 1292.4151611328125
('img_0002',) 234.6561279296875 923 688.3438720703125
.
.
.
('img_0017',) 11.425949096679688 191 179.5740509033203

My question is: where is the visualization (heatmap) image saved?
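As far as I can tell from the output above, test.py only prints per-image counts and errors and does not save heatmaps. A hedged sketch of rendering one yourself, assuming inputs and model are set up as in test.py and the model returns a (1, 1, H, W) density map:

import torch
import matplotlib.pyplot as plt

with torch.no_grad():
    outputs = model(inputs)               # (1, 1, H, W) predicted density map
density = outputs[0, 0].cpu().numpy()
plt.imshow(density, cmap='jet')           # render the density as a heatmap
plt.title('predicted count: %.1f' % density.sum())
plt.axis('off')
plt.savefig('img_0001_heatmap.png', bbox_inches='tight')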

Parameters for ShanghaiTech Part A

Hello @ZhihengCV ,

I changed --crop-size to 256 as mentioned in the paper. During data preparation, min_size is fixed at 256 and max_size at 5096. All other parameters are left at their defaults. I ran it on ShanghaiTech Part A. Unlike what is reported in the paper, I could only get an MAE of 66.0, a difference of 3.2. I believe the sigma and background-ratio parameters might differ from the default values. Can you please let me know the changes from this codebase to the one used for ShanghaiTech Part A? One more thing: I used just the train/test split; if you have a different split, please let me know that as well. Thanks.

Multi-GPU training

Hello,

Thank you for the open-sourced code. I am trying to replace the current backbone with a bigger network for my own dataset; however, I cannot run this on a single GPU. Could you please let me know how to achieve multi-GPU training in this case? Is it different from other approaches that do multi-GPU training using DataParallel?

Thank you in advance for this help !
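A hedged note, not specific to this repo: the standard PyTorch pattern is nn.DataParallel, which splits the image batch across GPUs. The wrinkle here is that the loss side consumes per-image point lists (see the Post_Prob code quoted below), which DataParallel does not split automatically, so only the backbone forward pass parallelizes cleanly. A minimal sketch, where the vgg19 constructor and import path are my assumptions:

import torch
from models.vgg import vgg19   # assumed import path; adjust to your backbone

model = vgg19()
if torch.cuda.device_count() > 1:
    model = torch.nn.DataParallel(model)   # data-parallel over the image batch
model = model.cuda()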

How to set the right paths in Linux?

python preprocess_dataset.py --origin_dir data/ --data_dir processed/
usage: preprocess_dataset.py [-h] [--origin-dir ORIGIN_DIR]
[--data-dir DATA_DIR]
preprocess_dataset.py: error: unrecognized arguments: --origin_dir data/ --data_dir processed/
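Per the usage message, the flags are defined with hyphens rather than underscores:

python preprocess_dataset.py --origin-dir data/ --data-dir processed/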

What does self.cood mean in post_prob? Could you explain this part of the source code?

Could you explain the computation principle of the parts marked with 【brackets】? Thanks.

import torch
from torch.nn import Module

class Post_Prob(Module):
    def __init__(self, sigma, c_size, stride, background_ratio, use_background, device):
        super(Post_Prob, self).__init__()
        assert c_size % stride == 0

        self.sigma = sigma
        self.bg_ratio = background_ratio
        self.device = device
        # coordinate is same to image space, set to constant since crop size is same
        self.cood = torch.arange(0, c_size, step=stride,
                                 dtype=torch.float32, device=device) + stride / 2
        # 【What is self.cood used for?】
        # (it holds the pixel-grid center coordinates of the crop, against
        # which each head point's distance is measured)
        self.cood.unsqueeze_(0)
        self.softmax = torch.nn.Softmax(dim=0)
        self.use_bg = use_background

    def forward(self, points, st_sizes):
        num_points_per_image = [len(points_per_image) for points_per_image in points]  # number of heads per image
        all_points = torch.cat(points, dim=0)  # head coordinates, shape (n, 2)
        if len(all_points) > 0:
            x = all_points[:, 0].unsqueeze_(1)  # column of x coordinates
            y = all_points[:, 1].unsqueeze_(1)  # column of y coordinates

            # 【What do these four lines compute?】
            # (they expand the squared distance (a - b)^2 = a^2 - 2ab + b^2
            # between every head coordinate and every grid coordinate)
            x_dis = -2 * torch.matmul(x, self.cood) + x * x + self.cood * self.cood
            y_dis = -2 * torch.matmul(y, self.cood) + y * y + self.cood * self.cood
            y_dis.unsqueeze_(2)
            x_dis.unsqueeze_(1)
            dis = y_dis + x_dis
            dis = dis.view((dis.size(0), -1))

            dis_list = torch.split(dis, num_points_per_image)
            prob_list = []
            for dis, st_size in zip(dis_list, st_sizes):
                if len(dis) > 0:
                    if self.use_bg:
                        min_dis = torch.clamp(torch.min(dis, dim=0, keepdim=True)[0], min=0.0)
                        d = st_size * self.bg_ratio
                        bg_dis = (d - torch.sqrt(min_dis))**2
                        dis = torch.cat([dis, bg_dis], 0)  # concatenate background distance to the last
                    dis = -dis / (2.0 * self.sigma ** 2)
                    prob = self.softmax(dis)
                else:
                    prob = None
                prob_list.append(prob)
        else:
            prob_list = []
            for _ in range(len(points)):
                prob_list.append(None)
        return prob_list

Issue about batch size

As we know, the ground-truth keypoints and target have different shapes for different inputs. I just wonder whether the batch_size can be greater than 1 in the training stage?

license?

Could you please add a license? Happy to do a PR (for MIT?) if you want.

Could you share the pre-trained model via Google Drive or something other than the Baidu service?

Hello, first of all, I appreciate your work and your sharing of the code.
However, I couldn't download the pretrained model, because I don't have the Chinese VPN and mobile phone number that are necessary to register for the Baidu service (as far as I understand it).
So it would be nicer if you could share the pretrained model via other methods, such as Google Drive.
Thanks in advance.

The meaning of the density map?

I am a new learner in this area, so this is a naive question. With this model, I can get a density map for my own pictures. What is the meaning of each point's value in the density map? Is it the value of crowd density at that pixel?

I have googled this for a long time without figuring it out. In the final crowd-density heatmap, what does each point's value represent? Does it represent the crowd density at that point, since higher-density areas have larger values? If so, can we say that a point with a value of 3 means the network predicts three people at that point? I know this question may be silly; thanks to anyone who answers.

What is the p(x_m|y_n) in the testing stage?

Hi. There are no parameters learned for the Bayesian model. In the training stage, p(x_m|y_n) can be obtained from the ground-truth density map. But in the testing stage, how do we get p(x_m|y_n)? Did I miss something?

What's the meaning of background_ratio?

Hi!

Can you tell me the meaning of the code that calculates the background distance, bg_dis = (st_size * self.bg_ratio) ** 2 / (min_dis + 1e-5), and what background_ratio means? Thank you very much.

Code for NWPU

Could you release the settings you used to preprocess the NWPU dataset? I am running into CUDA memory errors while implementing this network on the NWPU-Crowd dataset. Sharing a pretrained model for NWPU would be great too!

Thanks in Advance !

Preprocessing scenes with < 4 people gives an out-of-range error

I am processing the GCC synthetic dataset (GTA V scenes) for training the Bayesian Loss model. When there are fewer than 4 people in a scene, the find_dis function in preprocess_dataset.py raises an out-of-range error. The specific line throwing the error is the following:

Line 39: dis = np.mean(np.partition(dis, 3, axis=1)[:, 1:4], axis=1, keepdims=True)

When there are fewer than 4 people, the kth=3 argument to np.partition leads to the error.
While changing the kth argument to a smaller number (for example, 0) eliminates the error, I am curious about the logic of using the partition function. And why average only 3 columns (columns 1, 2, and 3) of the partitioned array?

Thank you,
Casey
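For what it's worth: column 0 of each partitioned row is a point's zero distance to itself, so columns 1-3 are its three nearest neighbors, and their mean serves as a local head-size estimate. A hedged guard for scenes with fewer than 4 people (my own workaround, not from the repo):

import numpy as np

def find_dis_safe(point):
    # Same logic as find_dis, but clamps the neighbor count so images with
    # fewer than 4 annotated heads do not raise an out-of-range error.
    n = len(point)
    square = np.sum(point * point, axis=1)
    dis = np.sqrt(np.maximum(square[:, None] - 2 * np.matmul(point, point.T) + square[None, :], 0.0))
    k = min(3, n - 1)                       # use up to 3 nearest neighbors
    if k < 1:
        return np.zeros((n, 1))             # a single point has no neighbors
    return np.mean(np.partition(dis, k, axis=1)[:, 1:k + 1], axis=1, keepdims=True)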

How is the density map D^est generated?

Hello, @ZhihengCV. In your paper, there is not much explanation of the density map D^{est}. How is D^{est} generated? Is it the output of the regressor? What loss is used to learn the density map? Is it the same as the loss L^{Bayes} for the posterior estimation? Thank you.

Issue about the dataset used in your code

In crowd.py, the code shows that .npy is the annotation file format, so which public dataset does it belong to? I found that the annotation files of many public datasets are in .mat format...
BTW, the variable keypoints seems to have a shape of (n, 3); keypoints[:, :2] holds the coordinates of the keypoints, so what does keypoints[:, 2] represent?

Where is the variable 'points' defined?

Where is the variable 'points' defined, in the first line of the function find_dis()? Thank you very much!

def find_dis(point):
    square = np.sum(point*points, axis=1)
    dis = np.sqrt(np.maximum(square[:, None] - 2*np.matmul(point, point.T) + square[None, :], 0.0))
    dis = np.mean(np.partition(dis, 3, axis=1)[:, 1:4], axis=1, keepdims=True)
    return dis
