
pongpisit-thanasutives / variations-of-sfanet-for-crowd-counting


The official implementation of "Encoder-Decoder Based Convolutional Neural Networks with Multi-Scale-Aware Modules for Crowd Counting"

Home Page: https://ieeexplore.ieee.org/document/9413286

License: GNU General Public License v3.0

Languages: Jupyter Notebook 94.03%, Python 5.97%
Topics: crowd-counting, ucf-qnrf, segnet, convolutional-neural-networks, encoder-decoder

variations-of-sfanet-for-crowd-counting's Introduction

Encoder-Decoder Based Convolutional Neural Networks with Multi-Scale-Aware Modules for Crowd Counting (ICPR 2020)


Official Implementation of "Encoder-Decoder Based Convolutional Neural Networks with Multi-Scale-Aware Modules for Crowd Counting" LINK

Many thanks to BL, SFANet and CAN for their useful publications and repositories.

For complete UCF-QNRF and Shanghaitech training code, please refer to BL and SFANet respectively.

Please see models for our M-SFANet and M-SegNet implementations.
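For quick reference, a minimal sketch of instantiating the two ShanghaiTech model variants; the module and class names follow the comment in the Getting Started example below, so check the models directory for the exact file names:

# Minimal sketch: instantiate the ShanghaiTech variants of M-SFANet and M-SegNet.
from models import M_SFANet, M_SegNet

m_sfanet = M_SFANet.Model()
m_segnet = M_SegNet.Model()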

Density Maps Visualization

Datasets (NEW)

To reproduce the results reported in the paper, you may use these preprocessed datasets. The list is not complete yet and might be updated in the future.

Shanghaitech B dataset preprocessed using a Gaussian kernel: Link

Bayesian-preprocessed Shanghaitech datasets (A&B), following BL: Link

The Beijing-BRT dataset: Link (originally from BRT)

Pretrained Weights

Shanghaitech A&B: Link

To test the visualization code, you should use the pretrained M_SegNet* on UCF-QNRF: Link (the pretrained weights of M_SFANet* are also included).

Getting started

Example code showing how to use the pretrained M-SFANet* on UCF-QNRF to count the number of people in an image. The test image is ./images/img_0071.jpg (from the UCF-QNRF test set).

import cv2
from PIL import Image
import numpy as np

import torch
from torchvision import transforms

from datasets.crowd import Crowd
from models import M_SFANet_UCF_QNRF

# Simple preprocessing.
trans = transforms.Compose([transforms.ToTensor(), 
                            transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
                           ])

# An example image with the label = 1236.
img = Image.open("./images/img_0071.jpg").convert('RGB')
height, width = img.size[1], img.size[0]
height = round(height / 16) * 16
width = round(width / 16) * 16
img = cv2.resize(np.array(img), (width, height), interpolation=cv2.INTER_CUBIC)
img = trans(Image.fromarray(img))[None, :]

model = M_SFANet_UCF_QNRF.Model()
# Weights are stored in the Google drive link.
# The model was originally trained on a GPU, but we can also test it on a CPU.
# For ShanghaitechWeights, use torch.load("./ShanghaitechWeights/...")["model"] with M_SFANet.Model() or M_SegNet.Model()
model.load_state_dict(torch.load("./Paper's_weights_UCF_QNRF/best_M-SFANet*_UCF_QNRF.pth", 
                                 map_location = torch.device('cpu')))

# Evaluation mode (run inference under torch.no_grad() so no gradients are stored)
model.eval()
with torch.no_grad():
    density_map = model(img)
# Est. count = 1168.37 (deviates from the ground truth of 1236 by 67.63)
print(torch.sum(density_map).item())
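To inspect the prediction visually (the notebook Visualize.ipynb covers this in more detail), a minimal matplotlib sketch could look as follows; the colormap and output file name are arbitrary choices, not part of the repository:

import matplotlib.pyplot as plt

# Move the predicted density map to a 2-D numpy array and plot it.
dm = density_map.squeeze().detach().cpu().numpy()
plt.imshow(dm, cmap='jet')
plt.title(f"Estimated count: {dm.sum():.2f}")
plt.axis('off')
plt.savefig("density_map.png", bbox_inches='tight')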

Citation

If you find the code useful for your research, please cite our paper:

@inproceedings{thanasutives2021encoder,
  title={Encoder-decoder based convolutional neural networks with multi-scale-aware modules for crowd counting},
  author={Thanasutives, Pongpisit and Fukui, Ken-ichi and Numao, Masayuki and Kijsirikul, Boonserm},
  booktitle={2020 25th International Conference on Pattern Recognition (ICPR)},
  pages={2382--2389},
  year={2021},
  organization={IEEE}
}

Erratum: In Fig. 1 of the paper, "ASSP" should be "ASPP".
You may watch this 6-minute presentation video as a short introduction.

variations-of-sfanet-for-crowd-counting's People

Contributors

pongpisit-thanasutives


variations-of-sfanet-for-crowd-counting's Issues

Output Image

Hi author,
This code is very interesting. I read the code, but I didn't see any inference code to save and visualize the output image.
Can you release it?

Session crashed after using all available RAM

Hello,
Thanks for your great effort; it really helps me a lot. However, I have an issue when testing on the image (img_0071.jpg): does the suggested model take more than 13 GB of RAM? I have already followed the instructions in the example code, but it gave me "Session crashed after using all available RAM".
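One way to keep memory usage down on very large UCF-QNRF images is to cap the input size before the multiple-of-16 rounding and run inference under torch.no_grad(). This is only a sketch that assumes the model and trans objects from the Getting Started example are already set up; the 2048-pixel cap is an arbitrary choice, and heavy downscaling can change the predicted count.

import numpy as np
import cv2
from PIL import Image
import torch

img = Image.open("./images/img_0071.jpg").convert('RGB')
width, height = img.size
# Cap the longer side (hypothetical limit) before rounding to a multiple of 16.
scale = min(1.0, 2048 / max(width, height))
width = round(width * scale / 16) * 16
height = round(height * scale / 16) * 16
img = cv2.resize(np.array(img), (width, height), interpolation=cv2.INTER_CUBIC)

with torch.no_grad():  # no autograd buffers are kept, which lowers memory use
    density_map = model(trans(Image.fromarray(img))[None, :])
print(torch.sum(density_map).item())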

How to make GT ground truth map suitable for UCF trained model?

Hello!

I have ground truth for images in the form of coordinate points. Is there a way to turn this into a ground-truth density map that I can compare against the M-SFANet (UCF) predicted density map and also visualize? That way I could perhaps build a confusion matrix.

Also, for the Bayesian-processed ShanghaiTech datasets (the .npy files), I understand that the first two columns are the x and y coordinates, but what does the third column represent? Some sort of distance or radius? Is it used to create the ground-truth map, or is it for something else?

Thank you for your constant updates and help!
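For reference, a common generic way to turn head coordinates into a ground-truth density map is to place a unit impulse at each point and blur it with a fixed Gaussian kernel. This is only a sketch with an arbitrary sigma, not the repository's preprocessing script:

import numpy as np
from scipy.ndimage import gaussian_filter

def points_to_density(points, height, width, sigma=4.0):
    """points: (N, 2) array of (x, y) head coordinates."""
    density = np.zeros((height, width), dtype=np.float32)
    for x, y in points:
        xi = min(max(int(round(x)), 0), width - 1)
        yi = min(max(int(round(y)), 0), height - 1)
        density[yi, xi] += 1.0
    # The Gaussian blur approximately preserves the total count: density.sum() ~= len(points).
    return gaussian_filter(density, sigma)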

Question about the different checkpoints

Hello,

First of all, thank you for sharing your work along with the code.

I have a few questions regarding the available checkpoints.
I tried to visualize the density map and the crowd count on a few images taken from Google, but with the checkpoints trained on the ShanghaiTech dataset (M_SFANet and M_SegNet), the results seem quite bad. I was wondering whether some preprocessing, or part of the code from the notebook Visualize.ipynb, needs to be changed?

Moreover, if I want to fine-tune these different models on other datasets (crowd datasets where we have the count and the head coordinates), what are the steps to follow?

Why do I get an error with keypoints?

Keypoints look like an array with three columns, but my ground-truth .npy contains only two columns, representing the x and y coordinates. Can you please help me prepare my ground truth so I can run the code? What does the third column mean?

License

Dear author,

Thank you so much for your great repository.
Could you tell me the license of this repository?

CUDA out of memory during validation

I have tried to test the SFANet model on the UCF-QNRF dataset. It is okay during the training epochs, but when validation starts, a CUDA out-of-memory error appears. I tried to run on both a local machine (5 GiB) and Google Colab (12 GiB), and both gave the same error message. Could you give me some suggestions? Thank you.

The exception is raised at models\M_SFANet_UCF_QNRF.py:
"input = torch.cat([input, conv3_3, F.interpolate(conv5_4, scale_factor=2, mode='bilinear', align_corners=True)], 1)"

Problems when training

Hi, I trained M-SFANet on part of the ShanghaiTech samples, but the loss converges too slowly, and the MSE and MAE remain large after training for a few hundred epochs. Do you know why?

The version of PyTorch

Hello, thanks for your great job. But I find that PyTorch 1.2.0 doesn't have "default_collate" in "dataloader". Please tell me the version of your PyTorch. Best wishes.

Could not reproduce the results in the paper

The published code does not include the CANet branch, which is inconsistent with the paper. The SFANet baseline reaches an MAE of 59 on SHA in my experiments; however, when I add ASPP (i.e., use the M-SFANet model according to the author's code), the SHA MAE is only 61. The results in the paper cannot be reproduced (I do not know whether the ICPR reviewers were aware of this).

Cannot use the pretrained model

I want to use the pretrained model to test the visualization code, but I get errors when loading the state_dict for Model: the errors are about missing keys for the VGG weights and biases.
How can I use the pretrained model?

How can I train the model on the ShanghaiTech dataset?

We generated training data with preprocess_dataset.py from the BL code, but during training we hit nearest_dis = np.clip(keypoints[:, 2], 4.0, 128.0) → IndexError: index 2 is out of bounds for axis 1 with size 2. Our keypoints array has only two columns. So I want to ask how you trained the model on the ShanghaiTech dataset.

Create GT for a different dataset

I'm trying to understand how I can obtain the third column of the .npy training file.

What is the meaning of that value? Is there a script to generate it?

How do I use the weights trained on ShanghaiTech?

How do I use the .pth files for the ShanghaiTech A and B datasets, like checkpoint_best_MSFANet_B.pth, for inference? I am able to use your Getting Started script to run inference with M_SFANet_UCF_QNRF.py and the best_M-SFANet*_UCF_QNRF.pth file, and to get the head count and heatmap. But for the ShanghaiTech weights, is there a corresponding .py file with the Model class to put in /models? So far, using the ShanghaiTech weights gives an error regarding the state_dict.

Thanks for your great work!

Crop size

Hi, I am new to this field. I have some samples with a size of less than 512. Do you have any advice on how to handle this in the cropping phase, other than reducing the crop size?

GroundTruth files

Hello, where can I get the .npy files? Or how can I make them myself? Thanks!

Training details for ShanghaiTech A and B

I trained on the ShanghaiTech A and B datasets, but the result (MAE) of SFANet is 68.02 for A and 10.32 for B.
The experiment follows your article exactly, and I used the model pretrained on the UCF dataset for pre-training.

UCF-QNRF weights missing

Hi, I've tried downloading Paper's_weights_UCF_QNRF.zip from both Google Drive and OneDrive, but both zip files are empty and do not contain the .pth files. Could you please upload them again? Thanks!

find dis

Hi,
I noticed in find_dis:
dis = np.mean(np.partition(dis, 3, axis=1)[:, 1:4], axis=1, keepdims=True)

Does this mean an image with fewer than 4 people is invalid?
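For context, that line computes the mean distance to the three nearest neighbours, which is the third column stored in the .npy files. Below is a minimal sketch with a guard for images that contain fewer than four points; the guard and the fallback value are assumptions, not part of the BL code:

import numpy as np
from scipy.spatial.distance import cdist

def find_dis(points):
    """points: (N, 2) array of (x, y) head coordinates.
    Returns an (N, 1) column: mean distance to the (up to 3) nearest neighbours."""
    dis = cdist(points, points)            # pairwise distances, shape (N, N)
    k = min(3, len(points) - 1)            # guard against fewer than 4 points
    if k <= 0:
        return np.full((len(points), 1), 4.0)   # arbitrary fallback radius
    return np.mean(np.partition(dis, k, axis=1)[:, 1:k + 1], axis=1, keepdims=True)

# keypoints are then stored as (x, y, nearest_dis):
# keypoints = np.hstack([points, find_dis(points)])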

About "max_unpool2d" function

When I use another backbone, I run into this problem (screenshot omitted). I checked conv5_3.shape and id4.shape, and they meet the requirements of the "max_unpool2d" function (screenshot omitted). Could you please help me?

Training details?

I have trained for 1000 epochs on ShanghaiTech Part A using M_SFANet, but the test MAE is 68.08. The learning rate was 5e-4, the batch size was 8, and the samples were cropped to 400x400. Can you tell me the training details?

How to add attention and density maps to M_SFANet_UCF_QNRF

Dear author:
It is nice to have access to your algorithms. I noticed that the M_SFANet_UCF_QNRF visualization code only shows heat maps. How can I add density maps and attention maps for M_SFANet_UCF_QNRF? Looking forward to your guidance and clarification.

Trained model weights

I downloaded the Shanghaitech A&B pretrained weights ("checkpoint_best_MSFANet_A.pth" & "checkpoint_best_MSFANet_B.pth") from the link provided, but I get an error.
How can I use them?

RuntimeError: Error(s) in loading state_dict for Model:
Missing key(s) in state_dict: "vgg.conv1_1.conv.weight", "vgg.conv1_1.conv.bias", "vgg.conv1_1.bn.weight", "vgg.conv1_1.bn.bias", "vgg.conv1_1.bn.running_mean", "vgg.conv1_1.bn.running_var", "vgg.conv1_2.conv.weight", "vgg.conv1_2.conv.bias", "vgg.conv1_2.bn.weight", "vgg.conv1_2.bn.bias", "vgg.conv1_2.bn.running_mean", "vgg.conv1_2.bn.running_var", "vgg.conv2_1.conv.weight", "vgg.conv2_1.conv.bias", "vgg.conv2_1.bn.weight", "vgg.conv2_1.bn.bias", "vgg.conv2_1.bn.running_mean", "vgg.conv2_1.bn.running_var", "vgg.conv2_2.conv.weight", "vgg.conv2_2.conv.bias", "vgg.conv2_2.bn.weight", "vgg.conv2_2.bn.bias", "vgg.conv2_2.bn.running_mean", "vgg.conv2_2.bn.running_var", "vgg.conv3_1.conv.weight", "vgg.conv3_1.conv.bias", "vgg.conv3_1.bn.weight", "vgg.conv3_1.bn.bias", "vgg.conv3_1.bn.running_mean", "vgg.conv3_1.bn.running_var", "vgg.conv3_2.conv.weight", "vgg.conv3_2.conv.bias", "vgg.conv3_2.bn.weight", "vgg.conv3_2.bn.bias", "vgg.conv3_2.bn.running_mean", "vgg.conv3_2.bn.running_var", "vgg.conv3_3.conv.weight", "vgg.conv3_3.conv.bias", "vgg.conv3_3.bn.weight", "vgg.conv3_3.bn.bias", "vgg.conv3_3.bn.running_mean", "vgg.conv3_3.bn.running_var", "vgg.conv4_1.conv.weight", "vgg.conv4_1.conv.bias", "vgg.conv4_1.bn.weight", "vgg.conv4_1.bn.bias", "vgg.conv4_1.bn.running_mean", "vgg.conv4_1.bn.running_var", "vgg.conv4_2.conv.weight", "vgg.conv4_2.conv.bias", "vgg.conv4_2.bn.weight", "vgg.conv4_2.bn.bias", "vgg.conv4_2.bn.running_mean", "vgg.conv4_2.bn.running_var", "vgg.conv4_3.conv.weight", "vgg.conv4_3.conv.bias", "vgg.conv4_3.bn.weight", "vgg.conv4_3.bn.bias", "vgg.conv4_3.bn.running_mean", "vgg.conv4_3.bn.running_var", "vgg.conv5_1.conv.weight", "vgg.conv5_1.conv.bias", "vgg.conv5_1.bn.weight", "vgg.conv5_1.bn.bias", "vgg.conv5_1.bn.running_mean", "vgg.conv5_1.bn.running_var", "vgg.conv5_2.conv.weight", "vgg.conv5_2.conv.bias", "vgg.conv5_2.bn.weight", "vgg.conv5_2.bn.bias", "vgg.conv5_2.bn.running_mean", "vgg.conv5_2.bn.running_var", "vgg.conv5_3.conv.weight", "vgg.conv5_3.conv.bias", "vgg.conv5_3.bn.weight", "vgg.conv5_3.bn.bias", "vgg.conv5_3.bn.running_mean", "vgg.conv5_3.bn.running_var", "spm.assp.aspp1.atrous_conv.weight", "spm.assp.aspp1.bn.weight", "spm.assp.aspp1.bn.bias", "spm.assp.aspp1.bn.running_mean", "spm.assp.aspp1.bn.running_var", "spm.assp.aspp2.atrous_conv.weight", "spm.assp.aspp2.bn.weight", "spm.assp.aspp2.bn.bias", "spm.assp.aspp2.bn.running_mean", "spm.assp.aspp2.bn.running_var", "spm.assp.aspp3.atrous_conv.weight", "spm.assp.aspp3.bn.weight", "spm.assp.aspp3.bn.bias", "spm.assp.aspp3.bn.running_mean", "spm.assp.aspp3.bn.running_var", "spm.assp.aspp4.atrous_conv.weight", "spm.assp.aspp4.bn.weight", "spm.assp.aspp4.bn.bias", "spm.assp.aspp4.bn.running_mean", "spm.assp.aspp4.bn.running_var", "spm.assp.global_avg_pool.1.weight", "spm.assp.global_avg_pool.2.weight", "spm.assp.global_avg_pool.2.bias", "spm.assp.global_avg_pool.2.running_mean", "spm.assp.global_avg_pool.2.running_var", "spm.assp.conv1.weight", "spm.assp.bn1.weight", "spm.assp.bn1.bias", "spm.assp.bn1.running_mean", "spm.assp.bn1.running_var", "spm.can.scales.0.1.weight", "spm.can.scales.1.1.weight", "spm.can.scales.2.1.weight", "spm.can.scales.3.1.weight", "spm.can.bottleneck.weight", "spm.can.bottleneck.bias", "spm.can.weight_net.weight", "spm.can.weight_net.bias", "amp.conv1.conv.weight", "amp.conv1.conv.bias", "amp.conv1.bn.weight", "amp.conv1.bn.bias", "amp.conv1.bn.running_mean", "amp.conv1.bn.running_var", "amp.conv2.conv.weight", "amp.conv2.conv.bias", "amp.conv2.bn.weight", "amp.conv2.bn.bias", 
"amp.conv2.bn.running_mean", "amp.conv2.bn.running_var", "amp.conv3.conv.weight", "amp.conv3.conv.bias", "amp.conv3.bn.weight", "amp.conv3.bn.bias", "amp.conv3.bn.running_mean", "amp.conv3.bn.running_var", "amp.conv4.conv.weight", "amp.conv4.conv.bias", "amp.conv4.bn.weight", "amp.conv4.bn.bias", "amp.conv4.bn.running_mean", "amp.conv4.bn.running_var", "amp.conv5.conv.weight", "amp.conv5.conv.bias", "amp.conv5.bn.weight", "amp.conv5.bn.bias", "amp.conv5.bn.running_mean", "amp.conv5.bn.running_var", "amp.conv6.conv.weight", "amp.conv6.conv.bias", "amp.conv6.bn.weight", "amp.conv6.bn.bias", "amp.conv6.bn.running_mean", "amp.conv6.bn.running_var", "amp.conv7.conv.weight", "amp.conv7.conv.bias", "amp.conv7.bn.weight", "amp.conv7.bn.bias", "amp.conv7.bn.running_mean", "amp.conv7.bn.running_var", "dmp.conv1.conv.weight", "dmp.conv1.conv.bias", "dmp.conv1.bn.weight", "dmp.conv1.bn.bias", "dmp.conv1.bn.running_mean", "dmp.conv1.bn.running_var", "dmp.conv2.conv.weight", "dmp.conv2.conv.bias", "dmp.conv2.bn.weight", "dmp.conv2.bn.bias", "dmp.conv2.bn.running_mean", "dmp.conv2.bn.running_var", "dmp.conv3.conv.weight", "dmp.conv3.conv.bias", "dmp.conv3.bn.weight", "dmp.conv3.bn.bias", "dmp.conv3.bn.running_mean", "dmp.conv3.bn.running_var", "dmp.conv4.conv.weight", "dmp.conv4.conv.bias", "dmp.conv4.bn.weight", "dmp.conv4.bn.bias", "dmp.conv4.bn.running_mean", "dmp.conv4.bn.running_var", "dmp.conv5.conv.weight", "dmp.conv5.conv.bias", "dmp.conv5.bn.weight", "dmp.conv5.bn.bias", "dmp.conv5.bn.running_mean", "dmp.conv5.bn.running_var", "dmp.conv6.conv.weight", "dmp.conv6.conv.bias", "dmp.conv6.bn.weight", "dmp.conv6.bn.bias", "dmp.conv6.bn.running_mean", "dmp.conv6.bn.running_var", "dmp.conv7.conv.weight", "dmp.conv7.conv.bias", "dmp.conv7.bn.weight", "dmp.conv7.bn.bias", "dmp.conv7.bn.running_mean", "dmp.conv7.bn.running_var", "conv_att.conv.weight", "conv_att.conv.bias", "conv_att.bn.weight", "conv_att.bn.bias", "conv_att.bn.running_mean", "conv_att.bn.running_var", "conv_out.conv.weight", "conv_out.conv.bias", "conv_out.bn.weight", "conv_out.bn.bias", "conv_out.bn.running_mean", "conv_out.bn.running_var".
Unexpected key(s) in state_dict: "epoch", "model", "optimizer", "mae", "mse".
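For anyone hitting this: the unexpected keys suggest that the ShanghaiTech files are full training checkpoints rather than bare state dicts. Following the comment in the Getting Started example, a minimal loading sketch (the exact file name and path may differ) would be:

import torch
from models import M_SFANet  # or M_SegNet for the M-SegNet checkpoints

model = M_SFANet.Model()
checkpoint = torch.load("./ShanghaitechWeights/checkpoint_best_MSFANet_B.pth",
                        map_location=torch.device('cpu'))
# The checkpoint is a dict with "epoch", "model", "optimizer", "mae", "mse";
# the network weights live under the "model" key.
model.load_state_dict(checkpoint["model"])
model.eval()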
