soco's Introduction

SoCo

[NeurIPS 2021 Spotlight] Aligning Pretraining for Detection via Object-Level Contrastive Learning

By Fangyun Wei*, Yue Gao*, Zhirong Wu, Han Hu, Stephen Lin.

* Equal contribution.

Introduction

Image-level contrastive representation learning has proven to be highly effective as a generic model for transfer learning. Such generality for transfer learning, however, sacrifices specificity if we are interested in a certain downstream task. We argue that this could be sub-optimal and thus advocate a design principle which encourages alignment between the self-supervised pretext task and the downstream task. In this paper, we follow this principle with a pretraining method specifically designed for the task of object detection. We attain alignment in the following three aspects:

  1. object-level representations are introduced via selective search bounding boxes as object proposals;
  2. the pretraining network architecture incorporates the same dedicated modules used in the detection pipeline (e.g. FPN);
  3. the pretraining is equipped with object detection properties such as object-level translation invariance and scale invariance.

Our method, called Selective Object COntrastive learning (SoCo), achieves state-of-the-art results for transfer performance on COCO detection using a Mask R-CNN framework.

Architecture

Main results

The pretrained and finetuned models, together with their logs, are available on Google Drive and Baidu Pan (code: 4662).

The paths listed in the tables below are relative to the shared folder.

SoCo pre-trained models

| Model | Arch | Epochs | Scripts | Pretrained Model (relative path) |
|---|---|---|---|---|
| SoCo | ResNet50-C4 | 100 | SoCo_C4_100ep | log (pretrain/SoCo_C4_100ep/log.txt); raw model (pretrain/SoCo_C4_100ep/ckpt_epoch_100.pth); converted d2 model (pretrain/SoCo_C4_100ep/current_detectron2_C4.pkl) |
| SoCo | ResNet50-C4 | 400 | SoCo_C4_400ep | log (pretrain/SoCo_C4_400ep/log.txt); raw model (pretrain/SoCo_C4_400ep/ckpt_epoch_400.pth); converted d2 model (pretrain/SoCo_C4_400ep/current_detectron2_C4.pkl) |
| SoCo | ResNet50-FPN | 100 | SoCo_FPN_100ep | log (pretrain/SoCo_FPN_100ep/log.txt); raw model (pretrain/SoCo_FPN_100ep/ckpt_epoch_100.pth); converted d2 model (pretrain/SoCo_FPN_100ep/current_detectron2_Head.pkl) |
| SoCo | ResNet50-FPN | 400 | SoCo_FPN_400ep | log (pretrain/SoCo_FPN_400ep/log.txt); raw model (pretrain/SoCo_FPN_400ep/ckpt_epoch_400.pth); converted d2 model (pretrain/SoCo_FPN_400ep/current_detectron2_Head.pkl) |
| SoCo* | ResNet50-FPN | 400 | SoCo_FPN_Star_400ep | log (pretrain/SoCo_FPN_Star_400ep/log.txt); raw model (pretrain/SoCo_FPN_Star_400ep/ckpt_epoch_400.pth); converted d2 model (pretrain/SoCo_FPN_Star_400ep/current_detectron2_Head.pkl) |

Results on LVIS with MaskRCNN R50-FPN

| Methods | Epoch | APbb | APbb50 | APbb75 | APmk | APmk50 | APmk75 | config | Detectron2 trained (relative path) |
|---|---|---|---|---|---|---|---|---|---|
| Supervised | 90 | 20.4 | 32.9 | 21.7 | 19.4 | 30.6 | 20.5 | -- | -- |
| SoCo* | 400 | 26.3 | 41.2 | 27.8 | 25.0 | 38.5 | 26.8 | config | log (finetune/mask_rcnn_lvis_SoCo_FPN_Star_400ep_1x/log.txt); model (finetune/mask_rcnn_lvis_SoCo_FPN_Star_400ep_1x/model_final.pth) |

Results on COCO with MaskRCNN R50-FPN

| Methods | Epoch | APbb | APbb50 | APbb75 | APmk | APmk50 | APmk75 | config | Detectron2 trained (relative path) |
|---|---|---|---|---|---|---|---|---|---|
| Scratch | - | 31.0 | 49.5 | 33.2 | 28.5 | 46.8 | 30.4 | -- | -- |
| Supervised | 90 | 38.9 | 59.6 | 42.7 | 35.4 | 56.5 | 38.1 | -- | -- |
| SoCo | 100 | 42.3 | 62.5 | 46.5 | 37.6 | 59.1 | 40.5 | config | log (finetune/mask_rcnn_coco_SoCo_FPN_100ep_1x/log.txt); model (finetune/mask_rcnn_coco_SoCo_FPN_100ep_1x/model_final.pth) |
| SoCo | 400 | 43.0 | 63.3 | 47.1 | 38.2 | 60.2 | 41.0 | config | log (finetune/mask_rcnn_coco_SoCo_FPN_400ep_1x/log.txt); model (finetune/mask_rcnn_coco_SoCo_FPN_400ep_1x/model_final.pth) |
| SoCo* | 400 | 43.2 | 63.5 | 47.4 | 38.4 | 60.2 | 41.4 | config | log (finetune/mask_rcnn_coco_SoCo_FPN_Star_400ep_1x/log.txt); model (finetune/mask_rcnn_coco_SoCo_FPN_Star_400ep_1x/model_final.pth) |

Results on COCO with MaskRCNN R50-C4

| Methods | Epoch | APbb | APbb50 | APbb75 | APmk | APmk50 | APmk75 | config | Detectron2 trained (relative path) |
|---|---|---|---|---|---|---|---|---|---|
| Scratch | - | 26.4 | 44.0 | 27.8 | 29.3 | 46.9 | 30.8 | -- | -- |
| Supervised | 90 | 38.2 | 58.2 | 41.2 | 33.3 | 54.7 | 35.2 | -- | -- |
| SoCo | 100 | 40.4 | 60.4 | 43.7 | 34.9 | 56.8 | 37.0 | config | log (finetune/mask_rcnn_coco_SoCo_C4_100ep_1x/log.txt); model (finetune/mask_rcnn_coco_SoCo_C4_100ep_1x/model_final.pth) |
| SoCo | 400 | 40.9 | 60.9 | 44.3 | 35.3 | 57.5 | 37.3 | config | log (finetune/mask_rcnn_coco_SoCo_C4_400ep_1x/log.txt); model (finetune/mask_rcnn_coco_SoCo_C4_400ep_1x/model_final.pth) |

Get started

Requirements

A Dockerfile is included; refer to it for the required environment.

Prepare data with Selective Search

  1. Generate Selective Search proposals:
    python selective_search/generate_imagenet_ss_proposals.py
  2. Filter out invalid proposals using the filtering strategy:
    python selective_search/filter_ss_proposals_json.py
  3. Post-process images that are left with no proposals:
    python selective_search/filter_ss_proposals_json_post_no_prop.py
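
To give a feel for what step 2 might do, here is a minimal sketch of filtering proposals by aspect ratio and relative size. It is not the repository's implementation: the function name `filter_proposals`, the `(x, y, w, h)` box format, and the default thresholds (maximum aspect ratio 3, relative scale between 0.3 and 0.8) are all assumptions chosen for illustration.

```python
from typing import List, Tuple

# Box format is an assumption: (x, y, width, height) in pixels.
Box = Tuple[float, float, float, float]


def filter_proposals(
    boxes: List[Box],
    img_w: int,
    img_h: int,
    max_ratio: float = 3.0,  # hypothetical aspect-ratio limit
    min_scale: float = 0.3,  # hypothetical lower bound on relative size
    max_scale: float = 0.8,  # hypothetical upper bound on relative size
) -> List[Box]:
    """Keep boxes whose aspect ratio and relative size are within bounds."""
    img_scale = (img_w * img_h) ** 0.5
    kept = []
    for x, y, w, h in boxes:
        if w <= 0 or h <= 0:
            continue  # degenerate box
        ratio = max(w / h, h / w)
        scale = (w * h) ** 0.5 / img_scale
        if ratio <= max_ratio and min_scale <= scale <= max_scale:
            kept.append((x, y, w, h))
    return kept


proposals = [(0, 0, 100, 100), (0, 0, 10, 10), (0, 0, 200, 20), (0, 0, 0, 50)]
print(filter_proposals(proposals, img_w=200, img_h=200))
```

Images for which every proposal is rejected are the case that step 3 has to handle.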

Pretrain with SoCo

Using the SoCo FPN 100-epoch schedule as an example:

bash ./tools/SoCo_FPN_100ep.sh

Finetune detector

  1. Copy the folder detectron2_configs to the root folder of Detectron2
  2. Train the detectors with Detectron2

Citation

@article{wei2021aligning,
  title={Aligning Pretraining for Detection via Object-Level Contrastive Learning},
  author={Wei, Fangyun and Gao, Yue and Wu, Zhirong and Hu, Han and Lin, Stephen},
  journal={arXiv preprint arXiv:2106.02637},
  year={2021}
}

soco's People

Contributors

hologerry

soco's Issues

some question about selective search

Hi hologerry, thanks for open-sourcing the code. Would using a detector trained on an object detection dataset to generate proposals give better results than selective search, given that a detector produces better proposals? More generally, how important is the quality of the proposal generation method to the SoCo framework (detector vs. selective search)?

How to start pre training

Hi, hologerry.
Thanks for your wonderful work. I'm a beginner and am trying to run all of the code. I use only a part of ImageNet because of limited computing resources. I successfully ran the 'Prepare data with Selective Search' part and ended up with imagenet_filtered_proposals/train_ratio3size0308post.json and imagenet_root_proposals_mp/train (.pkl). But when I move on to 'Pretrain with SoCo', I run into trouble.
I have some questions for you:

In 'SoCo_FPN_100ep.sh', data_dir="./data/ImageNet-Zip". What data format is expected in this folder?

Also, when are the generated data files (.json and .pkl) used?

Thank you.

Question about the roi_box_head

Hi, thanks for your great work!
I notice that in your code the FastRCNNConvFCHead is composed of 4 Conv layers and 1 FC layer, but in FasterRCNN there are only 2 FC layers in the roi box head. Why is it designed this way?

TypeError: __call__() missing 1 required positional argument: 'view_size'

Hi hologerry, thanks for your work. When we run 'bash SoCo_C4_100ep.sh', we get the error below. Our environment is PyTorch 1.10, torchvision 0.11, an A100 GPU, and CUDA 11.4.
Thanks for your reply.

Traceback (most recent call last):
  File "../main_pretrain.py", line 280, in <module>
    main(opt)
  File "../main_pretrain.py", line 185, in main
    train(epoch, train_loader, model, optimizer, scheduler, args, summary_writer)
  File "../main_pretrain.py", line 204, in train
    for idx, data in enumerate(train_loader):
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
    data = self._next_data()
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1203, in _next_data
    return self._process_data(data)
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1229, in _process_data
<zipfile.ZipFile filename='../../imagenet/traintry.zip' mode='r'>
    data.reraise()
  File "/opt/conda/lib/python3.8/site-packages/torch/_utils.py", line 425, in reraise
path
    raise self.exc_type(msg)
TypeError: Caught TypeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/workspace/SoCo-main/contrast/data/dataset.py", line 506, in __getitem__
    img2_1x_cut = self.transform[5](img2_1x, resized_bboxs2) # cutout
TypeError: __call__() missing 1 required positional argument: 'view_size'

finetune detector

Thank you for sharing your amazing work.

Do you have the mmdetection configs for finetuning the detectors?

How to turn off scale-aware assignment

Hi,

If I understand correctly, the scale aware assignment is handled by the correspondence matrices (corres_12, corres_13, etc.).
From the ablations in your paper I can see that you train a model without it but I'm not sure how that translates into code.
Do I adjust the correspondence matrices to disable this, and if so, how? It's unclear from the paper if this means assigning the proposals to all pyramid levels or just one, or something else entirely.

Thanks,
Linus

training efficiency

Hi hologerry,

I'm currently running your code on 4 V100 32G GPUs. Each iteration takes about 1.3 s (batch size = 128 per GPU), so the total training time for 100 epochs is about 7 days.

Does 1.3 s sound normal to you? I ran MoCo on the same machines, and it took about 0.5 s per iteration.

I'd appreciate it if you could help me with this.
Thanks!

Lower COCO AP after pre-training SoCo_FPN_100ep

Hi,

I've managed to run the SoCo_FPN_100ep model and subsequently evaluate it on COCO using the provided configs. The performance I achieve is 39.8 bb AP and 36.0 mk AP.

I've checked that my training hyperparameters are the same as yours (as reported in the google drive config.json/log.txt). The only difference is that I ran mine on 8xV100 instead of 16. This should therefore give similar results to your Table 5.b where batch size is 1024, so 41.9 bb AP and 37.6 mk AP.

Do you have any idea why my numbers are lower? Any help would be appreciated.

Thanks,
Linus

What are the format requirements for the data?

Hi @hologerry ,

Thanks for your contribution. I'm confused about the data format this code requires, as there doesn't seem to be any explanation.

I organized the data like this:

| imagenet
| ---- train
| --------n01440766
| ---- val

However, when I try to run the code

bash ./tools/SoCo_FPN_100ep.sh

There is an error and I have no idea how to fix it:

FileNotFoundError: [Errno 2] No such file or directory: './data/imagenet/train_map.txt'

Could you please refine the README and provide the available data formats?
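
The README does not document the layout of `train_map.txt`. As a starting point only, the sketch below builds map lines from the folder layout shown above, assuming one line per image of the form `<class folder>/<file name><TAB><class index>`. That line format is a guess, and the helper `build_train_map` is hypothetical; check the dataset loader in `contrast/data/dataset.py` before relying on it.

```python
import os
from typing import List


def build_train_map(train_dir: str) -> List[str]:
    """Return '<class>/<file>\\t<index>' lines (assumed format, not confirmed by the repo)."""
    classes = sorted(
        d for d in os.listdir(train_dir)
        if os.path.isdir(os.path.join(train_dir, d))
    )
    lines = []
    for idx, cls in enumerate(classes):
        for name in sorted(os.listdir(os.path.join(train_dir, cls))):
            lines.append(f"{cls}/{name}\t{idx}")
    return lines


# Example usage (paths are assumptions):
# with open("./data/imagenet/train_map.txt", "w") as f:
#     f.write("\n".join(build_train_map("./data/imagenet/train")) + "\n")
```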

pretrained models of R101

Hi, thanks for the great work. Is there any plan to release pretrained ResNet101 models? Thanks in advance.

jitter_prob is always 0

It seems that jitter_prob is not initialized from args.jitter_prob in contrast/data/__init__.py, so it always keeps the default value 0. Is this because the BoxJitter operation brings only a marginal improvement according to Table 4 in the paper, so it can be ignored?

dataset

Hi, hologerry.
Thank you very much for your work!
I would like to understand how the dataset is organized. Could you describe the expected layout?

Issues about cutout

In the __getitem__ function of class ImageFolderImageAsymBboxAwareMultiJitter1Cutout:

the cutout is applied as: img2_cutout = self.transform[7](img2, resized_bboxs2).

However, if the flip operation has been applied, resized_bboxs2 is no longer aligned with bboxs2 and img2. Is this a bug?

Should it be img2_cutout = self.transform[7](img2, flip_box(resized_bboxs2))? (flip_box is just a pseudo function.)
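
For reference, the pseudo function mentioned above could be sketched as a horizontal flip of box coordinates. This is an illustration, not code from the repository: it assumes boxes are `(x1, y1, x2, y2)` in pixel coordinates of an image of width `img_w`, and the name `hflip_boxes` is hypothetical.

```python
from typing import List, Tuple

# Box format is an assumption: (x1, y1, x2, y2) with x1 <= x2, in pixels.
Box = Tuple[float, float, float, float]


def hflip_boxes(boxes: List[Box], img_w: float) -> List[Box]:
    """Mirror boxes around the vertical centre line of an image of width img_w."""
    # The old right edge becomes the new left edge, and vice versa.
    return [(img_w - x2, y1, img_w - x1, y2) for x1, y1, x2, y2 in boxes]


print(hflip_boxes([(10, 20, 50, 60)], img_w=100))
```

Applying the flip twice returns the original boxes, which is a quick sanity check when wiring it into a transform pipeline.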

Do you plan for Training longer?

Thanks for your wonderful work. The paper already reports finetuning with Detectron2 on a longer schedule (4x). Do you plan to train even longer, say 6x as in the paper "Rethinking ImageNet Pre-training", or until the model converges, and compare against random initialization to show the generalization and regularization characteristics of your method?
Thank you.

About BatchNormalization in finetuning Stage

Hi, thanks for your great work and your code!

I notice that SyncBN is used in your models (e.g. SoCo_FPN_400ep), and that there are SyncBN operations in the backbone, FPN and RoI head. So when I use one of your pretrained models such as SoCo_FPN_400ep for COCO detection (say, faster_rcnn_fpn_res50), there are three possible choices for BN when finetuning on COCO:

  1. load all BN parameters (backbone, FPN and RoI head) from SoCo_FPN_400ep, freeze all of them and use normal BN when finetuning on COCO;
  2. load all BN parameters (backbone, FPN and RoI head) from SoCo_FPN_400ep, freeze all of them and also use SyncBN when finetuning on COCO;
  3. load all BN parameters (backbone, FPN and RoI head) from SoCo_FPN_400ep, keep them trainable (not frozen) and use normal BN when finetuning on COCO.

Could you please give me some suggestions? Many thanks in advance!

Best,

configuration is not consistent with that in paper

Thanks for sharing such good work!
I am confused about the training configuration. The base learning rate in this repository is 0.03 but 1.0 in the paper, and the weight decay in this repository is 0.000025 but 0.00001 in the paper. Do these two configurations lead to significantly different performance?

imagenet_root

First of all, thank you for sharing the code. Can you tell me how the files under the imagenet_root folder are organized?

Where is the Base-RCNN-C4-BN.yaml file?

Thanks for your great work.
SoCo/detectron2_configs/R_50_C4_1x.yaml refers to Base-RCNN-C4-BN.yaml:

_BASE_: "Base-RCNN-C4-BN.yaml"

However, I cannot find Base-RCNN-C4-BN.yaml under the SoCo/detectron2_configs/ folder. Could you please upload this file or tell me the details of this setting?
Thanks a lot, looking forward to your reply.

HifaFace code

Hello, thanks for your exciting work.
May I ask you when the code of HifaFace will be released?

Question about COCO pretrained model

Thank you for your great work!
I noticed that the revised version of the paper adds results for a model pretrained on the MS-COCO dataset.
Could you please upload that model or release the training script, if it is convenient for you?
Thank you so much. Looking forward to your reply.

Mini COCO

Hello! Thanks for your work. Could you please provide more information on the mini-coco experiment? I want to reproduce it, but I couldn't find train splits and training hyperparameters.

The settings between paper and code

Hi, Thanks for your great work!

I have recently been studying your paper, and I notice that some settings differ between the SoCo paper and the SoCo code.

For example, Table 1 of the paper says that lr_base=1.0, wd=1e-5 and LARS are used, while the code uses lr_base=0.03, wd=1e-4 and SGD. If I want to use the LARS optimizer, do I need to decrease the weight decay? Thanks in advance!

In Table 1, BYOL pretrained for 300 epochs achieves 40.4 APbbox. May I also ask whether you reproduced BYOL yourselves for Table 1, and could you give me some hints on the settings (e.g. batch size, lr, wd) for reproducing the BYOL results on COCO? (I'm working on a project and am really struggling to reproduce BYOL.)

Looking forward to your answers and your many excellent works in the future!

TypeError: 'tuple' object is not callable

Hi hologerry, thanks for open-sourcing the code.
When I run bash tools/SoCo_C4_100ep.sh, I get this error:
contrast/data/dataset.py", line 371, in __getitem__
img = self.transform(image)
TypeError: 'tuple' object is not callable
