
DCNet's Introduction

Dense Relation Distillation with Context-aware Aggregation for Few-Shot Object Detection, CVPR 2021

Our code is based on https://github.com/facebookresearch/maskrcnn-benchmark and developed with Python 3.6.5 and PyTorch 1.1.0.

Abstract

Conventional deep learning based methods for object detection require a large number of bounding box annotations for training, and such high-quality annotated data is expensive to obtain. Few-shot object detection, which learns to adapt to novel classes with only a few annotated examples, is very challenging, since the fine-grained features of novel objects can easily be overlooked when only a few samples are available. In this work, aiming to fully exploit the features of annotated novel objects and capture the fine-grained features of query objects, we propose Dense Relation Distillation with Context-aware Aggregation (DCNet) to tackle the few-shot detection problem. Built on a meta-learning based framework, the Dense Relation Distillation module targets fully exploiting support features: support features and the query feature are densely matched, covering all spatial locations, in a feed-forward fashion. This abundant use of guidance information endows the model with the capability to handle common challenges such as appearance changes and occlusion. Moreover, to better capture scale-aware features, the Context-aware Aggregation module adaptively harnesses features from different scales for a more comprehensive feature representation. Extensive experiments show that our proposed approach achieves state-of-the-art results on the PASCAL VOC and MS COCO datasets. For more details, please refer to our CVPR paper (arXiv).
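
To make the dense matching described above concrete, here is a minimal, hedged PyTorch sketch of the dense relation distillation idea. This is not the repository's actual module; the layer names, channel split, and the way the read-out is fused with the query value are all assumptions:

import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseRelationDistillSketch(nn.Module):
    # Hedged sketch: read a support key/value memory densely into the query.
    def __init__(self, channels, key_dim):
        super().__init__()
        # 1x1 convs embed query and supports into key/value spaces
        self.key_q = nn.Conv2d(channels, key_dim, 1)
        self.val_q = nn.Conv2d(channels, channels // 2, 1)
        self.key_s = nn.Conv2d(channels, key_dim, 1)
        self.val_s = nn.Conv2d(channels, channels // 2, 1)

    def forward(self, query, supports):
        # query:    (B, C, H, W) feature map of the query image
        # supports: (N, C, h, w) features of the N support examples
        B, _, H, W = query.shape
        kq = self.key_q(query).flatten(2)     # (B, Ck, H*W)
        vq = self.val_q(query)                # (B, C//2, H, W)
        ks = self.key_s(supports).flatten(2)  # (N, Ck, h*w)
        vs = self.val_s(supports).flatten(2)  # (N, C//2, h*w)
        read = torch.zeros_like(vq)
        for n in range(supports.shape[0]):    # sum the read-outs over N supports
            # dense affinity between every query location and every support location
            attn = F.softmax(torch.einsum('bci,cj->bij', kq, ks[n]), dim=-1)
            out = torch.einsum('bij,cj->bci', attn, vs[n])
            read = read + out.view(B, -1, H, W)
        # fuse the distilled read-out with the query's own value
        return torch.cat([vq, read], dim=1)   # (B, C, H, W)

For example, query = torch.randn(2, 256, 32, 32) and supports = torch.randn(3, 256, 16, 16) would yield a (2, 256, 32, 32) refined query feature.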


Installation

Check INSTALL.md for installation instructions. Since maskrcnn-benchmark has been deprecated, please follow these instructions carefully (e.g. the exact versions of the Python packages).
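
For reference, a minimal environment matching the versions quoted above might look like the following. The cudatoolkit version is an assumption (one issue below reports CUDA 9.0), and INSTALL.md remains the authoritative guide:

conda create -n dcnet python=3.6.5
conda activate dcnet
# Python 3.6.5 and PyTorch 1.1.0 are the versions stated in this README;
# cudatoolkit 9.0 is an assumption -- pick the build matching your driver.
conda install pytorch==1.1.0 torchvision==0.3.0 cudatoolkit=9.0 -c pytorch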

Prepare datasets

Prepare original Pascal VOC & MS COCO datasets

First, download the VOC and COCO datasets. We recommend symlinking the dataset paths into datasets/ as follows.

We use minival and valminusminival sets from Detectron (filelink).

mkdir -p datasets/coco
ln -s /path_to_coco_dataset/annotations datasets/coco/annotations
ln -s /path_to_coco_dataset/train2014 datasets/coco/train2014
ln -s /path_to_coco_dataset/test2014 datasets/coco/test2014
ln -s /path_to_coco_dataset/val2014 datasets/coco/val2014

ln -s /path_to_VOCdevkit_dir datasets/voc

Prepare base and few-shot datasets

For multiple runs, you need to specify the seed in the script.

bash tools/fewshot_exp/datasets/init_fs_dataset_standard.sh

This will also generate the datasets on base classes for base training.

Training and Evaluation

Scripts for training and evaluation on the PASCAL VOC dataset:

experiments/DRD/
├── prepare_dataset.sh
├── base_train.sh
├── fine_tune.sh
└── get_result.sh

Configurations for the base and few-shot experiments are:

experiments/DRD/configs/
├── base
│   └── e2e_voc_split*_base.yaml
└── standard
    └── e2e_voc_split*_*shot_finetune.yaml

Modify them if needed. If you have any questions about these parameters (e.g. batch size), please refer to maskrcnn-benchmark for quick answers.
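
Since the configs are in the maskrcnn-benchmark format, individual options can also be overridden on the command line instead of editing the YAML files. A hedged example, assuming this fork keeps the standard maskrcnn-benchmark option names and launch interface (paths are relative to experiments/DRD, as in the provided scripts):

# override the global batch size and learning rate without editing the YAML
python -m torch.distributed.launch --nproc_per_node=1 ../../tools/train_net.py \
    --config-file configs/base/e2e_voc_split1_base.yaml \
    SOLVER.IMS_PER_BATCH 4 SOLVER.BASE_LR 0.0025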

Perform few-shot training on VOC dataset

  1. Run the following for base training on the 3 VOC splits:
cd experiments/DRD
bash base_train.sh

This will generate base models (e.g. model_voc_split1_base.pth) and corresponding pre-trained models (e.g. voc0712_split1base_pretrained.pth).

  2. Run the following for few-shot fine-tuning:
bash fine_tune.sh

This will perform evaluation on the 1/2/3/5/10-shot settings of all 3 splits. Results are written to fs_exp/voc_standard_results by default, and you can get a quick summary with:

bash get_result.sh

Citation

@inproceedings{hu2021dense,
  title={Dense Relation Distillation with Context-aware Aggregation for Few-Shot Object Detection},
  author={Hu, Hanzhe and Bai, Shuai and Li, Aoxue and Cui, Jinshi and Wang, Liwei},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={10185--10194},
  year={2021}
}

TODO

  • Context-aware Aggregation
  • Training scripts on COCO dataset


DCNet's Issues

Cannot reproduce results

I'm sorry, I don't know why I can't reproduce the results in your paper, and we couldn't find the FPN2MLPFeatureExtractor_PCB module referenced by e2e_voc_split1_2shot_finetune.yaml in the code. Can you tell me what mAP you obtained after base training?

Training for voc2012

In your code, I want to know whether it supports only VOC2007, or whether I need to add extra code to use the VOC2012 dataset. When I train with VOC2012, an error occurs:
FileNotFoundError: [Errno 2] No such file or directory: '/home/dl/VSST/zwm/DCNet-main/VOCdevkit/VOC2012/Annotations/000005.xml'
As we know, VOC2012 annotations use the 'year_id.xml' filename format. Thank you for your answer.

About the dense relation distill module

def forward(self, features, attentions):
    # features:   query feature maps, one per level
    # attentions: support features produced by the meta extractor
    features = list(features)
    if isinstance(attentions, dict):
        # stack the per-class support features into an (ncls, C, h, w) tensor
        for i in range(len(attentions)):
            if i == 0:
                atten = attentions[i].unsqueeze(0)
            else:
                atten = torch.cat((atten, attentions[i].unsqueeze(0)), dim=0)
        attentions = atten.cuda()
    output = []
    h, w = attentions.shape[2:]
    ncls = attentions.shape[0]
    key_t = self.key_t(attentions)    # keys from the support features
    val_t = self.value_t(attentions)  # values from the support features
    for idx in range(len(features)):  # iterate over the query feature levels
        ...                           # (truncated in the original issue)

I'd like to know which one is the support and which is the query. From the code, it seems that attentions is the support and features is the query; is that right?
What are the dimensions of attentions and features, respectively?
Looking forward to your reply, thanks!

get a visual interface

I would like to know how to get a visual interface. For example: randomly input an image, run detection on it, and output the result image. Thank you for your reply.

A question about experimental details

Hi,
I recently read your CVPR 2021 few-shot paper, "Dense Relation Distillation...", and think it's great. But I'm confused about some of the experimental details and hope you can clarify them.

1. Are the 10 random runs on VOC based on TFA's 30 seeds, taking the results of the top-10 seeds?
Thanks

A problem encountered when running 'bash base_train.sh'

2021-12-19 07:57:32,080 maskrcnn_benchmark.utils.checkpoint INFO: Saving checkpoint to ./model_final.pth
/home/hp/anaconda3/envs/dcnet/lib/python3.6/site-packages/apex-0.1-py3.6-linux-x86_64.egg/apex/amp/_initialize.py:25: UserWarning: An input tensor was not cuda.
warnings.warn("An input tensor was not cuda.")
Traceback (most recent call last):
File "../../tools/train_net.py", line 213, in <module>
main()
File "../../tools/train_net.py", line 206, in main
model = train(cfg, args.local_rank, args.distributed, phase, shot, split)
File "../../tools/train_net.py", line 97, in train
arguments
File "/media/hp/new/DCNet/DCNet/maskrcnn_benchmark/engine/trainer.py", line 149, in do_train
attentions = model(images, targets, meta_input, meta_label, average_shot=True)
File "/home/hp/anaconda3/envs/dcnet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "/home/hp/anaconda3/envs/dcnet/lib/python3.6/site-packages/apex-0.1-py3.6-linux-x86_64.egg/apex/amp/_initialize.py", line 197, in new_fwd
**applier(kwargs, input_caster))
File "/media/hp/new/DCNet/DCNet/maskrcnn_benchmark/modeling/detector/generalized_rcnn.py", line 107, in forward
attentions = self.meta_extractor(meta_input,dr=self.dense_relation)
File "/media/hp/new/DCNet/DCNet/maskrcnn_benchmark/modeling/detector/generalized_rcnn.py", line 83, in meta_extractor
base_feat = self.backbone((meta_data,1))[2]
File "/home/hp/anaconda3/envs/dcnet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "/home/hp/anaconda3/envs/dcnet/lib/python3.6/site-packages/torch/nn/modules/container.py", line 92, in forward
input = module(input)
File "/home/hp/anaconda3/envs/dcnet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "/media/hp/new/DCNet/DCNet/maskrcnn_benchmark/modeling/backbone/resnet.py", line 148, in forward
x = self.stem(x,meta=meta)
File "/home/hp/anaconda3/envs/dcnet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "/media/hp/new/DCNet/DCNet/maskrcnn_benchmark/modeling/backbone/resnet.py", line 366, in forward
x = self.conv2(x)
File "/home/hp/anaconda3/envs/dcnet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "/media/hp/new/DCNet/DCNet/maskrcnn_benchmark/layers/misc.py", line 33, in forward
return super(Conv2d, self).forward(x)
File "/home/hp/anaconda3/envs/dcnet/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 338, in forward
self.padding, self.dilation, self.groups)
RuntimeError: Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same
Traceback (most recent call last):
File "/home/hp/anaconda3/envs/dcnet/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/home/hp/anaconda3/envs/dcnet/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/hp/anaconda3/envs/dcnet/lib/python3.6/site-packages/torch/distributed/launch.py", line 235, in <module>
main()
File "/home/hp/anaconda3/envs/dcnet/lib/python3.6/site-packages/torch/distributed/launch.py", line 231, in main
cmd=process.args)
subprocess.CalledProcessError: Command '['/home/hp/anaconda3/envs/dcnet/bin/python', '-u', '../../tools/train_net.py', '--local_rank=0', '--config-file', 'configs/base/e2e_voc_split3_base.yaml']' returned non-zero exit status 1.
mv: cannot stat 'inference/voc_2007_test_split3_base/result.txt': No such file or directory

What files are supposed to be stored under the inference folder? Mine is empty.
Also, why is the above error raised right after model_final.pth is saved, and how can I fix it?
Looking forward to your reply, thanks.
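
For reference, this RuntimeError generally means an input tensor stayed on the CPU while the model weights live on the GPU (the apex warning above says the same). A minimal, hedged sketch of the usual fix, with variable names taken from the traceback rather than from the actual DCNet code:

# move the offending inputs onto the same device as the model weights
device = next(model.parameters()).device
images = images.to(device)          # maskrcnn-benchmark's ImageList supports .to()
meta_input = meta_input.to(device)  # assuming meta_input is a tensor
attentions = model(images, targets, meta_input, meta_label, average_shot=True)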

Environmental issues of the project

I followed the INSTALL.md build instructions four or five times, and every time I hit an ImportError:
/DCNet-main/maskrcnn_benchmark/_C.cpython-36m-x86_64-linux-gnu.so: undefined symbol: __cudaPopCallConfiguration
But I installed everything exactly according to INSTALL.md. The CUDA version on my machine is 9.0.176, and I tried gcc 5.5 and 7.5, but both reported the same error. Which exact torch, cuda, torchvision, gcc, etc. versions are you using? Why do I get an error even when following the installation tutorial? Looking forward to your quick reply! @hzhupku

Details of method

Awesome work! But I have a minor question:

In page 4, you mentioned

Noted that there are N support features, which brings N key-value pairs. We perform summation over N output results to obtain the final result, which is a refined query feature, activated by support features where there are co-existing classes of objects in query and support images.

Could you please use mathematical equations to further clarify this?
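
One plausible formalization, offered as a hedged sketch (the notation is mine, not necessarily the paper's): let q_p be the query key at spatial location p, and (k^{(n)}_j, v^{(n)}_j) the key/value pair of the n-th support feature at location j. The read-out summed over the N supports would then be

\[
  y_p = \sum_{n=1}^{N} \sum_{j} \mathrm{softmax}_j\!\left( q_p \cdot k^{(n)}_j \right) v^{(n)}_j ,
\]

with the refined query feature at p obtained by fusing y_p with the query's own value, e.g. \tilde{f}_p = [\, v^{q}_p ,\ y_p \,].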

ASDF

There are a lot of errors; please improve the code.

Test my dataset.

Hello, I have some questions. How can I use your code to test my own dataset? And if I use a custom dataset, how should I prepare it?

Folder not found ! ! !

The /workspace/code/Meta FSOD folder was not found. How can I obtain it? @hzhupku Looking forward to your quick reply!

fair comparison?

Hello, first of all, thank you very much for your paper and open-source code.
After a rough read of the paper, I see that this is a meta-learning based FSOD method built on Meta R-CNN and FsDetView. But I have some doubts: under the Meta R-CNN repo, many people have raised issues arguing that its comparisons are unfair.
In another recent CVPR 2021 paper, GFSD, after adopting the same sampling strategy, FsDetView (COCO 10-shot) does not actually reach 12.5.
So I would like to ask: did you keep their original sampling method without any changes?

what is DCNet w/CFA?

Hi.

I understand that "DCNet w/o CFA" is Faster R-CNN with DRD only, and "DCNet" is your full architecture, i.e. DRD+CFA.

But what is "DCNet w/ CFA"? It must be different from "DCNet" (DRD+CFA), yet there isn't any description of this variant.
