
isc21-descriptor-track-1st's Introduction

ISC21-Descriptor-Track-1st

The 1st Place Solution of the Facebook AI Image Similarity Challenge (ISC21): Descriptor Track.

You can read our solution tech report: Contrastive Learning with Large Memory Bank and Negative Embedding Subtraction for Accurate Copy Detection

Main features:

  • The weights of the competition-winning models are publicly available and easy to use.
  • Without any fine-tuning, our models work well for image/video copy detection, image retrieval, and related tasks.
    • In the video copy detection task, our model is reported to achieve the best result among recent frame feature extractors, despite having the smallest feature dimensionality (ref: https://github.com/alipay/VCSL).

Installation

pip install git+https://github.com/lyakaap/ISC21-Descriptor-Track-1st

Usage

import requests
import torch
from PIL import Image

from isc_feature_extractor import create_model

recommended_weight_name = 'isc_ft_v107'
model, preprocessor = create_model(weight_name=recommended_weight_name, device='cpu')

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
x = preprocessor(image).unsqueeze(0)

y = model(x)
print(y.shape)  # => torch.Size([1, 256])
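
The descriptors can be compared directly for copy detection. Below is a minimal sketch (not part of the original usage example) that L2-normalizes two descriptors and scores them with cosine similarity; the image file names are placeholders and the decision threshold is task-dependent.

import torch
import torch.nn.functional as F
from PIL import Image

from isc_feature_extractor import create_model

model, preprocessor = create_model(weight_name='isc_ft_v107', device='cpu')
model.eval()

# 'query.jpg' and 'candidate.jpg' are placeholder file names.
x1 = preprocessor(Image.open('query.jpg').convert('RGB')).unsqueeze(0)
x2 = preprocessor(Image.open('candidate.jpg').convert('RGB')).unsqueeze(0)

with torch.no_grad():
    y1 = F.normalize(model(x1), dim=1)  # (1, 256) descriptor
    y2 = F.normalize(model(x2), dim=1)

similarity = (y1 * y2).sum(dim=1).item()  # cosine similarity in [-1, 1]
print(similarity)  # higher values suggest a likely copy; the threshold is task-dependent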

isc21-descriptor-track-1st's People

Contributors

lyakaap

isc21-descriptor-track-1st's Issues

About the memory size

python v107.py \
  -a tf_efficientnetv2_m_in21ft1k --dist-url 'tcp://localhost:10001' --multiprocessing-distributed --world-size 1 --rank 0 --seed 99999 \
  --epochs 10 --lr 0.5 --wd 1e-6 \
  --gem-p 1.0 --pos-margin 0.0 --neg-margin 1.1 --weight ./v98/train/checkpoint_0001.pth.tar \
  --input-size 512 --sample-size 1000000 --memory-size 1000 \
  ../input/training_images/

Why not set --memory-size to a larger value, such as 20000? Thanks in advance.
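
For context, --memory-size sets the length of the cross-batch memory bank described in the tech report: a FIFO buffer of embeddings from past batches that are reused as extra negatives. A larger value such as 20000 keeps more negatives but also costs more GPU memory and retains staler embeddings. The sketch below is an illustrative simplification under these assumptions, not the repository's implementation.

import torch

class EmbeddingQueue:
    """Illustrative cross-batch memory: a FIFO buffer of past embeddings reused as negatives."""

    def __init__(self, memory_size=1000, dim=256):
        self.bank = torch.zeros(memory_size, dim)
        self.labels = torch.full((memory_size,), -1, dtype=torch.long)
        self.ptr = 0
        self.size = memory_size

    @torch.no_grad()
    def enqueue(self, embeddings, labels):
        # Overwrite the oldest entries with the newest batch (wrap-around FIFO).
        n = embeddings.size(0)
        idx = torch.arange(self.ptr, self.ptr + n) % self.size
        self.bank[idx] = embeddings.detach()
        self.labels[idx] = labels
        self.ptr = (self.ptr + n) % self.size

    def negatives(self):
        # Every stored embedding is a candidate negative for the contrastive loss.
        return self.bank[self.labels >= 0]

# --memory-size simply controls how many rows the buffer holds;
# 20000 would keep 20x more negatives than 1000.
queue = EmbeddingQueue(memory_size=1000, dim=256)
queue.enqueue(torch.randn(16, 256), torch.arange(16))
print(queue.negatives().shape)  # torch.Size([16, 256])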

Will v107 overfit for Phase 2?

Congratulations and thanks for your sharing.

I find that v107 only uses the roughly 5k query-reference pairs (i.e., the ground truth of Phase 1) as positives. How can we know whether it overfits on Phase 2?

Unable to reproduce Stage 1 results

Hi, I attempted to reproduce the Stage 1 training using your provided code, but was unable to obtain the reported muAP of 0.5831. I instead obtained this result at epoch 9 (indexed from 0):

Average Precision: 0.49554
Recall at P90    : 0.32701
Threshold at P90 : -0.375733
Recall at rank 1:  0.62448
Recall at rank 10: 0.65961

I also saw that you continued training from epoch 5, but these are the results I obtained at epoch 5:

Average Precision: 0.47977
Recall at P90    : 0.32501
Threshold at P90 : -0.376619
Recall at rank 1:  0.61409
Recall at rank 10: 0.64903

Both sets of results were obtained on the private ground truth set of Phase 1, using image size 512. Is it possible to provide some insight as to what is happening here? Thank you.

Access denied for dataset on AWS

Thanks for your work! I have problems downloading the dataset from the given AWS buckets:

$ aws s3 cp s3://drivendata-competition-fb-isc-data/all/query_images/ input/query_images/ --recursive --no-sign-request
fatal error: An error occurred (AccessDenied) when calling the ListObjectsV2 operation: Access Denied

Do I need special permissions to download the data?

Final optimizer state for the model

Hello @lyakaap

Thanks a lot for this work. I am trying to take this model and fine-tune it on a certain task. Would it be possible to provide the final optimizer state after the 4th stage of training? We want to try an experiment where it would be very useful.

Thank you.

patch_embed in disc21_ft_vit_base_r50_s16_224_in21k.pth

Hi!

Awesome repo. Thank you for sharing.

I noticed that there is a 'disc21_ft_vit_base_r50_s16_224_in21k.pth' file in Release v1.0.3, which I assume is based on timm's 'vit_base_r50_s16_224_in21k'.

The default patch embedding setting does not work with this checkpoint. I speculate that you changed the patch size from 16 to 8, but I am unable to make it work.

Could you kindly share the change you made to patch_embed (the HybridEmbed class)?

Thanks!

Cannot load state dict for model

Thanks for your amazing work. But I encounter a problem when I use the checkpoint_0009.pth.tar checkpoint:

  • When I don't remove model = nn.DataParallel(model), I encounter this error:
        size mismatch for module.backbone.bn1.weight: copying a param with shape torch.Size([24]) from checkpoint, the shape in current model is torch.Size([64]).
        size mismatch for module.backbone.bn1.bias: copying a param with shape torch.Size([24]) from checkpoint, the shape in current model is torch.Size([64]).
        size mismatch for module.backbone.bn1.running_mean: copying a param with shape torch.Size([24]) from checkpoint, the shape in current model is torch.Size([64]).
        size mismatch for module.backbone.bn1.running_var: copying a param with shape torch.Size([24]) from checkpoint, the shape in current model is torch.Size([64]).
        size mismatch for module.fc.weight: copying a param with shape torch.Size([256, 512]) from checkpoint, the shape in current model is torch.Size([256, 2048]).
  • When I remove the line model = nn.DataParallel(model), the model seems to load the checkpoint successfully, but when I feed the same input to the model, the output feature vector differs each time I run it. I guess the state dict is not actually loaded, so the model keeps its randomly initialized weights.
  • When I change strict=False to strict=True in model.load_state_dict(state_dict=state_dict, strict=False), I encounter RuntimeError: Error(s) in loading state_dict for ISCNet: Missing key(s) in state_dict:. I found that the keys of the model state dict and the checkpoint state dict are completely different, even in their naming patterns. The keys of both are attached below.
    checkpoint.txt
    model.txt
    How can I solve this problem?
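
One common cause of such mismatches is loading a checkpoint saved under nn.DataParallel (keys prefixed with module.) into a model built with a different backbone than the one used for training; the bn1 shapes above are consistent with tf_efficientnetv2_m rather than a ResNet. The snippet below is a hedged, generic loading sketch that assumes the model is rebuilt with the same backbone as in the training command; it strips the prefix and reports any remaining mismatches.

import torch

# Path is a placeholder; use the checkpoint you downloaded.
ckpt = torch.load('checkpoint_0009.pth.tar', map_location='cpu')
state_dict = ckpt.get('state_dict', ckpt)

# Strip the 'module.' prefix added by nn.DataParallel / DistributedDataParallel.
state_dict = {
    (k[len('module.'):] if k.startswith('module.') else k): v
    for k, v in state_dict.items()
}

# `model` is assumed to be the ISCNet built with the same backbone used for
# training (tf_efficientnetv2_m_in21ft1k in the command above); a different
# backbone produces exactly the BatchNorm / fc shape mismatches shown in this issue.
missing, unexpected = model.load_state_dict(state_dict, strict=False)
print('missing keys   :', missing)
print('unexpected keys:', unexpected)  # both lists should be empty for a clean load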

Bugs?

Congratulations! We really appreciate the work. When I run the following command:

python v107.py \
  -a tf_efficientnetv2_m_in21ft1k --dist-url 'tcp://localhost:10001' --multiprocessing-distributed --world-size 1 --rank 0 --seed 99999 \
  --epochs 10 --lr 0.5 --wd 1e-6 --batch-size 16 --ncrops 2 \
  --gem-p 1.0 --pos-margin 0.0 --neg-margin 1.1 --weight ./v98/train/checkpoint_0001.pth.tar \
  --input-size 512 --sample-size 1000000 --memory-size 1000 \
  ../input/training_images/

I come across this error:

Traceback (most recent call last):                                              
  File "v107.py", line 774, in <module>
    train(args)
  File "v107.py", line 425, in train
    mp.spawn(main_worker, nprocs=ngpus_per_node, args=(ngpus_per_node, args))
  File "/home/wangwenhao/anaconda3/envs/ISC/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 230, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/home/wangwenhao/anaconda3/envs/ISC/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 188, in start_processes
    while not context.join():
  File "/home/wangwenhao/anaconda3/envs/ISC/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 150, in join
    raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException: 

-- Process 5 terminated with the following error:
Traceback (most recent call last):
  File "/home/wangwenhao/anaconda3/envs/ISC/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
    fn(i, *args)
  File "/home/wangwenhao/fbisc-descriptor-1st/exp/v107.py", line 573, in main_worker
    train_one_epoch(train_loader, model, loss_fn, optimizer, scaler, epoch, args)
  File "/home/wangwenhao/fbisc-descriptor-1st/exp/v107.py", line 595, in train_one_epoch
    labels = torch.cat([torch.tile(i, dims=(args.ncrops,)), torch.tensor(j)])
ValueError: only one element tensors can be converted to Python scalars

Do you know how to fix it?
Thanks.
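
This particular ValueError is what torch.tensor() raises when it is given a sequence of tensors that each contain more than one element, which is one plausible failure mode for the labels line above if j arrives as a list of tensors. The following is a minimal reproduction and a possible workaround, not a confirmed fix for v107.py.

import torch

i = torch.tensor([0, 1])
j = [torch.tensor([2, 3]), torch.tensor([4, 5])]  # a list of multi-element tensors

# torch.tensor(j) raises:
#   ValueError: only one element tensors can be converted to Python scalars
# because torch.tensor() tries to turn each inner tensor into a Python scalar.

# Concatenating the tensors directly avoids that conversion:
labels = torch.cat([torch.tile(i, dims=(2,)), torch.cat(j)])
print(labels)  # tensor([0, 1, 0, 1, 2, 3, 4, 5])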

About the trained model's output features

Sorry to bother you again. I want to train the model with a small backbone such as ResNet-50, because I only have three GPUs, and I run with this command:

CUDA_VISIBLE_DEVICES=0,1,2 python v83.py  --dist-url 'tcp://localhost:10001' --multiprocessing-distributed --world-size 1 --rank 0 --seed 9 \
  --epochs 5 --lr 0.1 --wd 1e-6 --batch-size 96 --ncrops 2 \
  --gem-p 1.0 --pos-margin 0.0 --neg-margin 1.0 \
  --input-size 256 --sample-size 1000000 --memory-size 20000 \
/root/zhx3/data/fb_train_data/train

I find a strange problem. I tested the checkpoint_000{0..4}.pth.tar models, and only checkpoint_0002.pth.tar produces a different output when the input changes; the other models output the same embedding no matter what input you give them. Thanks in advance.
The loss log output is as follows:

epoch 5:   0%|          | 0/15873 [00:00<?, ?it/s]=> loading checkpoint './v83/train/checkpoint_0004.pth.tar'
=> loaded checkpoint './v83/train/checkpoint_0004.pth.tar' (epoch 5)
epoch 6:   0%|          | 0/15873 [00:00<?, ?it/s]epoch=5, loss=1.0154363534772417
epoch 7:   0%|          | 0/15873 [00:00<?, ?it/s]epoch=6, loss=1.012835873522891

Data augmentation is wrong

train_dataset = ISCDataset(
    train_paths,
    NCropsTransform(
        transforms.Compose(aug_moderate),
        transforms.Compose(aug_hard),
        args.ncrops,
    ),
)

error log: apply_transform() takes from 2 to 3 positional arguments but 5 were given
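
For reference, the dataset's transform is expected to return a list of ncrops augmented views per image, and the apply_transform() signature error usually points to a mismatch between the installed augly version and the one the augmentation code was written against. The sketch below only illustrates the assumed call interface of NCropsTransform; it is not the repository's exact implementation.

class NCropsTransform:
    """Assumed interface: return `ncrops` augmented views of one image,
    one 'moderate' view plus ncrops - 1 'hard' views."""

    def __init__(self, aug_moderate, aug_hard, ncrops=2):
        self.aug_moderate = aug_moderate
        self.aug_hard = aug_hard
        self.ncrops = ncrops

    def __call__(self, x):
        # Each element is an augmented tensor; the dataloader concatenates the views later.
        return [self.aug_moderate(x)] + [self.aug_hard(x) for _ in range(self.ncrops - 1)]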

How to export an ONNX model?

When I use this code to export an ONNX model:

import requests
import torch
from PIL import Image

from isc_feature_extractor import create_model

recommended_weight_name = 'isc_ft_v107'
model, preprocessor = create_model(weight_name=recommended_weight_name, device='cpu')

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
x = preprocessor(image).unsqueeze(0)
input_name = 'input'
output_name = 'output'

torch.onnx.export(
    model,
    x,
    'model.onnx',
    input_names=[input_name],
    output_names=[output_name],
    opset_version=11,
)
y = model(x)
print(y.shape) # => torch.Size([1, 256])

It fails with the following error: RuntimeError: Failed to export an ONNX attribute 'onnx::Squeeze', since it's not constant, please try to make things (e.g., kernel size) static if possible

Can you tell me how to export an ONNX model correctly? Thank you!
