
Semantic-Aware Scene Recognition


Official PyTorch implementation of Semantic-Aware Scene Recognition by Alejandro López-Cifuentes, Marcos Escudero-Viñolo, Jesús Bescós and Álvaro García-Martín (Elsevier Pattern Recognition).

[Figure: example focus visualization]

Summary

This paper proposes to improve scene recognition by using object information to focalize learning during the training process. The main contributions of the paper are threefold:

  • We propose an end-to-end multi-modal deep learning architecture which gathers both image and context information using a two-branched CNN architecture.
  • We propose to use semantic segmentation as an additional information source to automatically create, through a convolutional neural network, an attention model to reinforce the learning of relevant contextual information.
  • We validate the effectiveness of the proposed method through experimental results on public scene recognition datasets such as ADE20K, MIT Indoor 67, SUN 397 and Places365 obtaining state-of-the-art results.

The proposed CNN architecture is as follows:

[Figure: network architecture diagram]
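As a rough illustration of the two-branch idea, a minimal PyTorch sketch might look like the following. This is not the repository's SASceneNet implementation: the layer sizes, the semantic-branch design and the gating scheme are assumptions for exposition only.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class TwoBranchSceneNet(nn.Module):
    """Toy two-branch scene classifier: semantic features gate RGB features."""
    def __init__(self, num_classes, num_sem_classes=151):
        super().__init__()
        # RGB branch: a ResNet-18 trunk without its classification head.
        # (torchvision >= 0.13; use pretrained=False on older versions)
        rgb = models.resnet18(weights=None)
        self.rgb_branch = nn.Sequential(*list(rgb.children())[:-2])
        # Semantic branch: a small CNN over one-hot segmentation maps.
        self.sem_branch = nn.Sequential(
            nn.Conv2d(num_sem_classes, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(7),
        )
        # Attention: per-location gates derived from the semantic features.
        self.attention = nn.Sequential(nn.Conv2d(128, 512, 1), nn.Sigmoid())
        self.classifier = nn.Linear(512, num_classes)

    def forward(self, rgb, sem):
        f_rgb = self.rgb_branch(rgb)            # (B, 512, 7, 7) for 224x224 input
        f_sem = self.sem_branch(sem)            # (B, 128, 7, 7)
        gated = f_rgb * self.attention(f_sem)   # reinforce contextually relevant cells
        return self.classifier(gated.mean(dim=(2, 3)))
```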

State-of-the-art Results

ADE20K Dataset

| RGB | Semantic | Top@1 | Top@2 | Top@5 | MCA |
|:---:|:--------:|:-----:|:-----:|:-----:|:-----:|
| ✓ |   | 55.90 | 67.25 | 78.00 | 20.96 |
|   | ✓ | 50.60 | 60.45 | 72.10 | 12.17 |
| ✓ | ✓ | 62.55 | 73.25 | 82.75 | 27.00 |

MIT Indoor 67 Dataset

| Method | Backbone | Number of Parameters | Top@1 |
|--------|----------|----------------------|-------|
| PlaceNet | Places-CNN | 62 M | 68.24 |
| MOP-CNN | CaffeNet | 62 M | 68.90 |
| CNNaug-SVM | OverFeat | 145 M | 69.00 |
| HybridNet | Places-CNN | 62 M | 70.80 |
| URDL + CNNaug | AlexNet | 62 M | 71.90 |
| MPP-FCR2 | AlexNet | 62 M | 75.67 |
| DSFL + CNN (7 Scales) | AlexNet | 62 M | 76.23 |
| MPP + DSFL | AlexNet | 62 M | 80.78 |
| CFV | VGG-19 | 143 M | 81.00 |
| CS | VGG-19 | 143 M | 82.24 |
| SDO (1 Scale) | 2 x VGG-19 | 276 M | 83.98 |
| VSAD | 2 x VGG-19 | 276 M | 86.20 |
| SDO (9 Scales) | 2 x VGG-19 | 276 M | 86.76 |
| Ours | ResNet-18 + Sem Branch + G-RGB-H | 47 M | 85.58 |
| Ours* | ResNet-50 + Sem Branch + G-RGB-H | 85 M | 87.10 |

SUN 397 Dataset

| Method | Backbone | Number of Parameters | Top@1 |
|--------|----------|----------------------|-------|
| Decaf | AlexNet | 62 M | 40.94 |
| MOP-CNN | CaffeNet | 62 M | 51.98 |
| HybridNet | Places-CNN | 62 M | 53.86 |
| Places-CNN | Places-CNN | 62 M | 54.23 |
| Places-CNN ft | Places-CNN | 62 M | 56.20 |
| CS | VGG-19 | 143 M | 64.53 |
| SDO (1 Scale) | 2 x VGG-19 | 276 M | 66.98 |
| VSAD | 2 x VGG-19 | 276 M | 73.00 |
| SDO (9 Scales) | 2 x VGG-19 | 276 M | 73.41 |
| Ours | ResNet-18 + Sem Branch + G-RGB-H | 47 M | 71.25 |
| Ours* | ResNet-50 + Sem Branch + G-RGB-H | 85 M | 74.04 |

Places 365 Dataset

| Network | Number of Parameters | Top@1 | Top@2 | Top@5 | MCA |
|---------|----------------------|-------|-------|-------|-----|
| AlexNet | 62 M | 47.45 | 62.33 | 78.39 | 49.15 |
| AlexNet* | 62 M | 53.17 | - | 82.59 | - |
| GoogLeNet* | 7 M | 53.63 | - | 83.88 | - |
| ResNet-18 | 12 M | 53.05 | 68.87 | 83.86 | 54.40 |
| ResNet-50 | 25 M | 55.47 | 70.40 | 85.36 | 55.47 |
| ResNet-50* | 25 M | 54.74 | - | 85.08 | - |
| VGG-19* | 143 M | 55.24 | - | 84.91 | - |
| DenseNet-161 | 29 M | 56.12 | 71.48 | 86.12 | 56.12 |
| Ours | 47 M | 56.51 | 71.57 | 86.00 | 56.51 |

Setup

Requirements

The repository has been tested with the following software versions:

  • Ubuntu 16.04
  • Python 3.6
  • Anaconda 4.6

Clone Repository

Clone the repository by running the following command:

$ git clone https://github.com/vpulab/Semantic-Aware-Scene-Recognition.git

Anaconda Environment

To create and set up the Anaconda environment, run the following terminal commands from the repository folder:

$ conda env create -f Config/Conda_Env.yml
$ conda activate SA-Scene-Recognition

Datasets

Download and setup instructions for each dataset are provided in the following links:

Evaluation

Model Zoo

To evaluate the models independently, download them from the following links and indicate the path in the YAML configuration files (usually /Data/Model Zoo/DATASET FOLDER).

[Recommended] Alternatively, you can run the following script from the repository folder to download all available Model Zoo models:

bash ./Scripts/download_ModelZoo.sh

ADE20K

MIT Indoor 67

SUN 397

Places 365

Run Evaluation

To evaluate the models, run the evaluation.py file from the repository folder, indicating the dataset's YAML configuration path:

python evaluation.py --ConfigPath [PATH to configuration file]

Example for ADE20K Dataset:

python evaluation.py --ConfigPath Config/config_ADE20K.yaml

All desired configuration options (backbone architecture, model to load, batch size, etc.) should be changed in each dataset's separate YAML configuration file.
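For example, the MODEL section of a configuration file looks roughly like the following (the values are taken from the ADE20K config as printed in one of the issues below; the ARCH entry is an assumption inferred from the ResNet-18 checkpoint name):

```yaml
MODEL:
  ARCH: ResNet-18                               # backbone architecture to use
  PATH: ./Data/Model Zoo/ADEChallengeData2016/  # where the checkpoint lives
  NAME: SAScene_ResNet18_ADE                    # checkpoint file to load
  ONLY_RGB: FALSE                               # evaluate the RGB branch alone
  ONLY_SEM: FALSE                               # evaluate the semantic branch alone
```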

Computed performance metrics for both training and validation sets are:

  • Top@1
  • Top@2
  • Top@5
  • Mean Class Accuracy (MCA)
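For reference, these metrics can be computed from model logits roughly as follows. This is a hedged sketch mirroring the metric definitions, not the repository's utils.py.

```python
import torch

def topk_accuracy(logits, targets, ks=(1, 2, 5)):
    """Fraction of samples whose true label is among the k highest logits."""
    maxk = max(ks)
    _, pred = logits.topk(maxk, dim=1)            # (B, maxk) predicted class ids
    correct = pred.eq(targets.unsqueeze(1))       # (B, maxk) boolean hits
    return [correct[:, :k].any(dim=1).float().mean().item() for k in ks]

def mean_class_accuracy(logits, targets, num_classes):
    """Average of per-class accuracies; insensitive to class imbalance."""
    pred = logits.argmax(dim=1)
    accs = []
    for c in range(num_classes):
        mask = targets == c
        if mask.any():                            # skip classes absent from the set
            accs.append((pred[mask] == c).float().mean())
    return torch.stack(accs).mean().item()
```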

Citation

If you find this code and work useful, please consider citing:

@article{lopez2020semantic,
  title={Semantic-Aware Scene Recognition},
  author={L{\'o}pez-Cifuentes, Alejandro and Escudero-Vi{\~n}olo, Marcos and Besc{\'o}s, Jes{\'u}s and Garc{\'\i}a-Mart{\'\i}n, {\'A}lvaro},
  journal={Pattern Recognition},
  pages={107256},
  year={2020},
  publisher={Elsevier}
}

Acknowledgment

This study has been partially supported by the Spanish Government through its TEC2017-88169-R MobiNetVideo project.


Contributors

alexlopezcifuentes, dhatwalia, jiahangwu

Issues

ResNet-50 RGB-branch model

I used the ResNet-50 RGB-branch model on the MIT Indoor 67 dataset to predict images from the same dataset, but most of the results are wrong. Why is that?
Thanks a lot!

Question about two attention modules

There are two attention modules used in SASceneNet: one is the chained 3xChAM and the other is the "Attention Module".

Q1: What happens when features pass through the 3xChAM? (Does it concentrate on several specific channels strongly related to the scene?)
Q2: Why do we need 3 ChAMs rather than fewer or more? (Is it because 3 modules make the features concentrate more on the decisive features that help determine the scene?)
Q3: Why do we need the "Attention Module", and how does it differ from ChAM in function? (Is it like one judging "what" and the other "where", as in CBAM?)

I very much look forward to your reply.
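For readers unfamiliar with channel attention, a generic CBAM-style channel-attention block looks roughly like the sketch below. This is a reference illustration only; the repository's ChAM may differ in its details.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """CBAM-style channel attention: learn per-channel gates from pooled stats."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):                      # x: (B, C, H, W)
        avg = self.mlp(x.mean(dim=(2, 3)))     # squeeze via global average pool
        mx = self.mlp(x.amax(dim=(2, 3)))      # ...and via global max pool
        weights = torch.sigmoid(avg + mx)      # per-channel gates in (0, 1)
        return x * weights[:, :, None, None]   # reweight the channels
```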

How to test this model?

Thank you for the amazing repo!
However, it appears as if this repo is used for training the model. Is it possible to create a file that just takes an input image and generates an output from the trained model?

Places-365 noisy training data not found

I tried the bash ./Scripts/download_Places365_extra.sh ./Data/Datasets/places365_standard command to get the precomputed semantic segmentation masks.

The training masks are not downloaded; only the validation masks are included.

Please update/fix the .sh file to provide the training data for the precomputed semantic segmentation masks.

Thanks.

How to train your model: need the train.py file or the command to train

Hi, thank you for releasing your code. Could you provide the command to train your model?
evaluation.py quits with "No checkpoint found". I don't want to use your checkpoint; I would like to have my own created during training. Any idea how to do that with your current repository?

Thank you

noisy_annotations vs noisy_scores

|----images
	|--- training
		ADE_train_00000001.jpg
		...
	|--- validation
		ADE_val_00000001.jpg
		...
|----noisy_annotations_RGB
	|--- training
		ADE_train_00000001.png
		...
	|--- validation
		ADE_train_00000001.png
		...
|----noisy_scores_RGB
	|--- training
		ADE_train_00000001.png
		...
	|--- validation
		ADE_train_00000001.png
		...

The difference between the images folder and the other two folders seems to be .jpg vs .png. But what's the difference between the noisy_annotations_RGB folder and the noisy_scores_RGB folder? Or is there a typo?

And the validation folders of the noisy_annotations_RGB and noisy_scores_RGB folders should not contain train set files, as shown above, right?

The training code

Hello! I want to reproduce the experimental results, but I could not find the training code. Could you upload it, please?

Pre-trained model

First off, thank you very much for open-sourcing the code for your state-of-the-art results!

Would it be possible to open-source the best performing model you obtained?

I understand that you have provided some code, but I don't know if enough code/details are provided to train the model from scratch (like there is no train.py, no mention of how much computational power is required, etc.)...

Could one of the two things above be provided please?

How to convert semantic segmentation results into required format

Hi,
Thanks for your amazing work.

I ran into a question when applying this model to other, unseen data. The model requires two extra inputs: sem_labels and sem_scores. I checked your paper and couldn't find specific instructions on how to convert the original semantic segmentation results into these two new inputs.

The semantic segmentation model is this. The model will predict a W x H x L score matrix. Can you explain a little about the following operations?

Best,
Neo
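One plausible conversion is sketched below, assuming (from the "_RGB" suffix in the dataset layout shown in an earlier issue) that the three PNG channels hold the top-3 per-pixel class indices and their scores. This is a guess at the format, not the authors' documented procedure; save_semantic_outputs is a hypothetical helper.

```python
import numpy as np
from PIL import Image

def save_semantic_outputs(scores, label_path, score_path):
    """scores: float array of shape (H, W, L) with per-pixel class probabilities."""
    # Top-3 class ids per pixel, best first; ADE20K ids (<= 151) fit in uint8.
    top3 = np.argsort(scores, axis=2)[:, :, ::-1][:, :, :3]
    top3_scores = np.take_along_axis(scores, top3, axis=2)   # their probabilities
    # Pack the three ids / scores into the three channels of an RGB PNG.
    Image.fromarray(top3.astype(np.uint8), mode="RGB").save(label_path)
    Image.fromarray((top3_scores * 255).astype(np.uint8), mode="RGB").save(score_path)
```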

When I use the RGB_ResNet50_SUN model and select ONLY_RGB: TRUE, evaluation.py reports an error

In config_SUN397.yaml:

```yaml
MODEL:
  ARCH: ResNet-50
  PATH: ./Data/Model Zoo/SUN397/
  NAME: RGB_ResNet50_SUN
  ONLY_RGB: TRUE
  ONLY_SEM: FALSE

TRAINING:
  PRINT_FREQ: 10
  PRECOMPUTED_SEM: FALSE
  BATCH_SIZE:
    TRAIN: 100
    TEST: 1
  LR: 2.5e-4
  LR_DECAY: 10
  MOMENTUM: 0.9
  OPTIMIZER: DFW
  POLY_POWER: 0.9
  WEIGHT_DECAY: 5.0e-4
  AVERAGE_LOSS: 20

VALIDATION:
  PRINT_FREQ: 10
  BATCH_SIZE:
    TRAIN: 100
    TEST: 1
  TEN_CROPS: TRUE
```

error:
FileNotFoundError: [Errno 2] No such file or directory: './Data/Datasets/SUN397/noisy_annotations_RGB/val/conference_room/sun_aatxlublfjchvvzu.png'

The config sets PRECOMPUTED_SEM: FALSE; are the noisy_annotations_RGB and noisy_scores_RGB folders still required?

Thank you very much for answering my question!

Model zoo links expired

Hello, I am trying to download the models from the model zoo links, but it seems that they have expired. Could you fix that for us please? Thanks a lot.

Runtime Error while evaluating the model

I am getting the following error on running evaluation.py

RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.

This is the stack trace:

Traceback (most recent call last):
  File "evaluation.py", line 300, in <module>
    val_top1, val_top2, val_top5, val_loss, val_ClassTPDic = evaluationDataLoader(val_loader, model, set='Validation')
  File "evaluation.py", line 110, in evaluationDataLoader
    prec1, prec2, prec5 = utils.accuracy(outputSceneLabel.data, sceneLabelGT, topk=(1, 2, 5))
  File "/content/Semantic-Aware-Scene-Recognition/Libs/Utils/utils.py", line 108, in accuracy
    correct_k = correct[:k].view(-1).float().sum(0)
RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.

Could you please help me with this issue?
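A likely fix, following the error message's own suggestion, is to swap the .view call in Libs/Utils/utils.py for .reshape, which also handles non-contiguous tensors:

```python
# Libs/Utils/utils.py, line 108 — .reshape copes with non-contiguous tensors
# where .view cannot (this mirrors the error message's own suggestion).
correct_k = correct[:k].reshape(-1).float().sum(0)
```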

download dataset

I cannot download the datasets, such as ADE20K. Could you check that problem? I am in China. Thank you very much!

Why did the semantic score map become 152 channels?

Hello! First of all, thank you very much for sharing the source code of your paper, which has played a very positive role in my research work.
It is well known that the number of object classes in the ADE20K dataset is 150, but for some reason you set the number of channels in the source code to 152. Can you explain why?

Once again, I would like to express my sincere respect for your work

Thank you!

Hi,

I just wanted to say thank you for sharing your code and the pretrained weights. It's a super cool work! :)

AttributeError: Can't pickle local object 'ADE20KDataset.__init__.<locals>.<lambda>'

Hi, I downloaded all the available Model Zoo models and ran evaluation.py. However, there are some bugs in my project. Please give me some suggestions and solutions.
Thanks.

The error is the following:

Traceback (most recent call last):
  File "evaluation.py", line 279, in <module>
    sample = next(iter(val_loader))
  File "D:\Software\Code\Anaconda3\envs\SA-Scene-Recognition\lib\site-packages\torch\utils\data\dataloader.py", line 279, in __iter__
    return _MultiProcessingDataLoaderIter(self)
  File "D:\Software\Code\Anaconda3\envs\SA-Scene-Recognition\lib\site-packages\torch\utils\data\dataloader.py", line 719, in __init__
    w.start()
  File "D:\Software\Code\Anaconda3\envs\SA-Scene-Recognition\lib\multiprocessing\process.py", line 112, in start
    self._popen = self._Popen(self)
  File "D:\Software\Code\Anaconda3\envs\SA-Scene-Recognition\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "D:\Software\Code\Anaconda3\envs\SA-Scene-Recognition\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "D:\Software\Code\Anaconda3\envs\SA-Scene-Recognition\lib\multiprocessing\popen_spawn_win32.py", line 89, in __init__
    reduction.dump(process_obj, to_child)
  File "D:\Software\Code\Anaconda3\envs\SA-Scene-Recognition\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'ADE20KDataset.__init__.<locals>.<lambda>'
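For what it's worth, this error typically arises because Windows uses spawn-based multiprocessing, which must pickle the dataset object, and a lambda defined inside __init__ cannot be pickled. A minimal illustration (not the repository's code):

```python
import pickle

def double(x):                             # module-level function: picklable
    return x * 2

class GoodDataset:
    def __init__(self):
        self.transform = double

class BadDataset:
    def __init__(self):
        self.transform = lambda x: x * 2   # local lambda: NOT picklable

pickle.dumps(GoodDataset())                # fine
# pickle.dumps(BadDataset())               # AttributeError: Can't pickle local object
# Alternatively, setting num_workers=0 on the DataLoader keeps loading in the
# main process, so the dataset never needs to be pickled at all.
```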

Model Zoo Links Broken

I tried downloading the models on the MIT Indoor 67, SUN 397, and Places 365 dataset, and all of these links seem to be broken. The only link that works is for the ADE20K dataset.

Can these please be fixed?

questions about the precomputed segmentation mask

Hi, I have just read your paper and I wonder how to get the precomputed semantic segmentation masks. Since the original feature maps have a dimension of 150, I want to know the code for generating the semantic segmentation maps and processing them into '.png' files.
Thanks very much!

evaluation.py KeyError: 'ARCH'

When I run python evaluation.py --ConfigPath Config/config_ADE20K.yaml after doing everything in the README.md, I get the following error:

-----------------------------------------------------------------
Evaluation starting...
-----------------------------------------------------------------
Evaluating complete model
Traceback (most recent call last):
  File "evaluation.py", line 168, in <module>
    print('Selected RG backbone architecture: ' + CONFIG['MODEL']['ARCH'])
KeyError: 'ARCH'

When I print out the contents of CONFIG['MODEL'], I get:

CONFIG[MODEL]: {'PATH': './Data/Model Zoo/ADEChallengeData2016/', 'NAME': 'SAScene_ResNet18_ADE', 'ONLY_RGB': False, 'ONLY_SEM': False}

Any ideas why I may be getting this error?
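Judging from the printed CONFIG contents, the MODEL section of this particular configuration file appears to be missing its ARCH entry. A plausible fix (the value is an assumption inferred from the SAScene_ResNet18_ADE checkpoint name) is to add it, mirroring the SUN 397 config shown in an earlier issue:

```yaml
MODEL:
  ARCH: ResNet-18   # assumed backbone, inferred from the checkpoint name
```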
