yanbeic / val

TensorFlow implementation of the CVPR 2020 paper "Image Search with Text Feedback by Visiolinguistic Attention Learning"

License: Apache License 2.0

Python 91.72% Shell 8.28%

image-search vision-and-language attention cvpr2020 tensorflow retrieval

val's Introduction

Visiolinguistic-Attention-Learning

TensorFlow code for the VAL model.

Chen et al. Image Search with Text Feedback by Visiolinguistic Attention Learning. CVPR 2020.

Getting Started

Prerequisites:

Datasets: Fashion200k [1], FashionIQ [2], Shoes [3,4].
Python 3.6.8
TensorFlow 1.10.0

Preparation:

(1) Download the ImageNet-pretrained models (mobilenet and resnet) and place them under the directory pretrain_model.

(2) Follow the steps in scripts/prepare_data.sh to prepare the datasets. Note: fashion200k and shoes can be downloaded manually. The relevant .py files for data preparation are listed below.

download_fashion_iq.py: crawl the image data from Amazon websites. Note that some URLs might be broken.
generate_groundtruth.py: generate the .npy files that characterize the ground-truth annotations used at test time.
read_glove.py: prepare the pre-trained GloVe word embeddings used to initialize the text model (i.e. LSTM).
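
As a rough illustration of the GloVe step, the sketch below builds an embedding table for a fixed vocabulary from lines in the standard GloVe text format (a word followed by space-separated floats). It is not the repository's read_glove.py: the function name load_glove, the zero-vector fallback for unknown words, and the sample values are all assumptions made for this example.

```python
def load_glove(lines, vocab, dim):
    """Return one embedding row per word in `vocab`, in vocab order."""
    wanted = set(vocab)
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        word, values = parts[0], parts[1:]
        # Skip words outside the vocabulary and malformed rows.
        if word in wanted and len(values) == dim:
            vectors[word] = [float(v) for v in values]
    # Words without a pre-trained vector fall back to zeros (an assumption;
    # random initialization is another common choice).
    return [vectors.get(word, [0.0] * dim) for word in vocab]


# Toy input in GloVe text format; the numbers are made up.
sample = ["dress 0.1 0.2 0.3", "red -0.5 0.0 0.5", "the 0.9 0.9 0.9"]
table = load_glove(sample, ["red", "dress", "sleeveless"], dim=3)
```

With a real GloVe file, an open file handle can be passed as lines, since the function only iterates over it line by line.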

Running Experiments

Training & Testing:

Each of the following scripts trains and then tests the VAL model on one dataset.

bash scripts/run_fashion200k.sh

bash scripts/run_fashion_iq.sh

bash scripts/run_shoes.sh

The final test results are written to results/results_fashion_iq.log.
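
For readers unfamiliar with how retrieval results of this kind are usually scored, here is a minimal sketch of Recall@K over extracted features. It assumes the reported metric is Recall@K with cosine distance, which is common for these benchmarks; the repository's test_val.py may differ in distance, batching, and file handling. The names cosine_distance and recall_at_k are hypothetical.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; returns 1.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm > 0 else 1.0


def recall_at_k(query_feats, gallery_feats, gt_indices, k):
    """Fraction of queries with at least one ground-truth item in the top k."""
    hits = 0
    for query, gt in zip(query_feats, gt_indices):
        # Rank gallery items from nearest to farthest.
        order = sorted(range(len(gallery_feats)),
                       key=lambda j: cosine_distance(query, gallery_feats[j]))
        if any(j in gt for j in order[:k]):
            hits += 1
    return hits / len(query_feats)


# Toy features: two queries against a gallery of three items.
gallery = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
queries = [[0.9, 0.1], [0.1, 0.9]]
ground_truth = [{0}, {2}]
r1 = recall_at_k(queries, gallery, ground_truth, k=1)
r2 = recall_at_k(queries, gallery, ground_truth, k=2)
```

In this toy case the first query retrieves its target at rank 1 and the second at rank 2, so r1 is 0.5 and r2 is 1.0.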

Our implementation includes the following .py files. Note that fashion200k is formatted differently from fashion_iq and shoes: a triplet of source image, text and target image is not given in advance, but is instead sampled randomly during training. Therefore, there are two implementations that build and run the training graph.

train_val.py: build and run the training graph on dataset fashion_iq or shoes.
train_val_fashion200k.py: build and run the training graph on dataset fashion200k.
model.py: define the model and losses.
config.py: define image preprocessing and other configurations.
extract_features_val.py: extract features from the model.
test_val.py: compute distance, perform retrieval, and report results in the log file.
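
To make the fashion200k difference concrete, the sketch below samples a (source image, text, target image) triplet on the fly. It assumes one common convention for this dataset: two images are paired when their captions differ by exactly one word, and the text is built as "replace X with Y". Whether train_val_fashion200k.py follows this rule is not stated here, and the names one_word_diff and sample_triplet, as well as the toy captions, are hypothetical.

```python
import random


def one_word_diff(cap_a, cap_b):
    """Return (word only in a, word only in b) if captions differ by one word."""
    words_a, words_b = set(cap_a.split()), set(cap_b.split())
    only_a, only_b = words_a - words_b, words_b - words_a
    if len(only_a) == 1 and len(only_b) == 1:
        return only_a.pop(), only_b.pop()
    return None


def sample_triplet(captions, rng, max_tries=1000):
    """Randomly draw image pairs until one differs by a single caption word."""
    ids = sorted(captions)
    for _ in range(max_tries):
        src, tgt = rng.sample(ids, 2)
        diff = one_word_diff(captions[src], captions[tgt])
        if diff is not None:
            text = "replace {} with {}".format(*diff)
            return src, text, tgt
    return None  # no valid pair found within the budget


# Toy captions keyed by image id.
captions = {
    "img_a": "blue long dress",
    "img_b": "red long dress",
    "img_c": "black leather jacket",
}
triplet = sample_triplet(captions, random.Random(0))
```

Because the triplet is drawn at training time, the same source image can be paired with different targets and texts across iterations, unlike fashion_iq or shoes, where the triplets are fixed in advance.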

Bibtex:

@inproceedings{chen2020image,
  title={Image Search with Text Feedback by Visiolinguistic Attention Learning},
  author={Chen, Yanbei and Gong, Shaogang and Bazzani, Loris},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={3001--3011},
  year={2020}
}

License

This project is licensed under the Apache-2.0 License - see the LICENSE file for details.

References

[1] Automatic Spatially-aware Fashion Concept Discovery, ICCV 2017
[2] The Fashion IQ Dataset: Retrieving Images by Combining Side Information and Relative Natural Language Feedback, CVPRW 2019
[3] Dialog-based Interactive Image Retrieval, NeurIPS 2018
[4] Automatic Attribute Discovery and Characterization from Noisy Web Data, ECCV 2010

val's Issues

Hello, I tried to download the Fashion200k dataset from Google Drive using your link, but my access request has not been approved yet. Could you approve my request?

Train Error

When running run_fashion_iq.sh, I got this error at the train_val.py step:

scripts/run_fashion_iq.sh: line 44:  1876 Segmentation fault      (core dumped) python train_val.py --checkpoint_dir=${STAGE1_DIR} --pretrain_checkpoint_dir=${PRETRAIN_DIR} --image_model=${CNN} --data_path=${DATA_DIR} --batch_size=32 --print_span=20 --save_length=10000 --text_model=${TEXT_MODEL} --joint_embedding_size=${JOINT_SIZE} --word_embedding_size=${WORD_SIZE} --text_embedding_size=${TEXT_SIZE} --margin=0.2 --text_projection_dropout=0.9 --init_learning_rate=0.0002 --dataset=${DATASET} --train_length=50000 --constant_lr=True --image_size=${IMG_SIZE} --image_feature_name='before_pool' --augmentation=${AUGMENT} --word_embedding_dir=${PRE_TEXT}

What could cause this segmentation fault? Please help me, thank you.

Problems about Fashion200k dataset

Hi, there.

Thank you for the great work!

According to the fashion200k repo, there are two versions of the Fashion200k images: cropped detected images and original images. Which one do you use in your implementation?

Test error

When I run "python test_val.py --feature_dir=${FEAT_DIR} --batch_size=20 --dataset=fashoin_iq --subset=toptee", I get this error:

Traceback (most recent call last):
File "test_val.py", line 218, in <module>
main()
File "test_val.py", line 179, in main
gt_label = order[gt_mask[i, :]]
IndexError: index 3920 is out of bounds for axis 0 with size 3920

When the subset is "shirt", the error is: IndexError: boolean index did not match indexed array along dimension 0; dimension is 2902 but corresponding boolean dimension is 3089

When the subset is "dress", the error is: IndexError: boolean index did not match indexed array along dimension 0; dimension is 2902 but corresponding boolean dimension is 2628

Data preparation

I got this error when I tried to run python generate_caption_pairs.py --dataset='fashion_iq':

Traceback (most recent call last):
File "generate_tags.py", line 10, in <module>
'dataset', "fashion_iq or shoes")
File "C:\ProgramData\Miniconda3\lib\site-packages\tensorflow_core\python\platform\flags.py", line 58, in wrapper
return original_function(*args, **kwargs)
TypeError: DEFINE_string() missing 1 required positional argument: 'help'
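
The error message says the flag definition was called without its help argument: the two values in the traceback are taken as name and default, which leaves help unset. A likely fix is to pass all three arguments explicitly. The sketch below reproduces the failure with a hypothetical stand-in, define_string, that only mirrors the (name, default, help) argument order. It is not the TensorFlow function, and the default value 'fashion_iq' is an assumption.

```python
def define_string(name, default, help):
    """Hypothetical stand-in mirroring the (name, default, help) order."""
    return {"name": name, "default": default, "help": help}


# Two positional arguments, as in the traceback: the second one is
# consumed as `default`, so `help` is missing and a TypeError is raised.
try:
    define_string('dataset', "fashion_iq or shoes")
    message = None
except TypeError as err:
    message = str(err)

# Passing name, default and help explicitly avoids the error.
# The default 'fashion_iq' is an assumption for this example.
flag = define_string('dataset', 'fashion_iq', "fashion_iq or shoes")
```

The Windows path with tensorflow_core in the traceback suggests a newer TensorFlow than the 1.10.0 listed in the prerequisites, so installing the listed version may also be worth trying.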

Cannot reproduce the same results when reimplementing in PyTorch!

Dear everyone who's working in this field,
I am reimplementing this work in PyTorch. My code does not reproduce the results reported in the paper. Could you help me out by looking at my code? I would appreciate it very much.

How to solve the problem of broken links in FashionIQ

Hello,
some URLs in FashionIQ are broken. How can I get the correct images?

Where is the code for the auxiliary visual-semantic matching?

Hello. I cannot find the code for the Lvs loss. Please help me. Thank you.

No such file or directory: 'save_model/fashion_iq/val_resnet_v2_50_ml/query_images.npy'

I have some problems. Please help me.

I am running bash scripts/run_fashion_iq.sh and ran into this error.

Traceback (most recent call last):
  File "test_val.py", line 220, in <module>
    main()
  File "test_val.py", line 48, in main
    query_images = np.load(filename)
  File "/home/huynhtruc0309/anaconda3/envs/truchlp/lib/python3.7/site-packages/numpy/lib/npyio.py", line 416, in load
    fid = stack.enter_context(open(os_fspath(file), "rb"))
FileNotFoundError: [Errno 2] No such file or directory: 'save_model/fashion_iq/val_resnet_v2_50_ml/query_images.npy'

Also, could you provide the checkpoint used in the paper?

Is FashionIQ evaluation comparable?

In the FashionIQ dataset, two captions are available for each pair. In the paper "The Fashion IQ Dataset: Retrieving Images by Combining Side Information and Relative Natural Language Feedback", the authors concatenate the two captions with the special symbol "<and>" and thereby treat them as one caption. So each ref-cap-tgt entry in the JSON file yields only one pair.
For more detail, see their released source code.
However, in your evaluation process, the file "fashion_iq-val-cap.txt" shows that you treat them as two individual pairs. If the results in your paper follow this evaluation process, I suspect they are not comparable with earlier published work.
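
The two protocols described in this issue can be sketched as follows. The entry keys candidate, captions and target, the function name build_queries, and the exact " <and> " spacing are assumptions for illustration, not taken from either code base.

```python
def build_queries(entries, merge_captions):
    """Turn annotation entries into (reference, text, target) queries."""
    queries = []
    for entry in entries:
        ref, tgt = entry["candidate"], entry["target"]
        if merge_captions:
            # One query per entry: both captions joined by a special token.
            queries.append((ref, " <and> ".join(entry["captions"]), tgt))
        else:
            # One query per caption: the entry is counted twice.
            for caption in entry["captions"]:
                queries.append((ref, caption, tgt))
    return queries


# Toy annotation entry with made-up ids and captions.
entries = [{
    "candidate": "ref_001",
    "target": "tgt_001",
    "captions": ["is darker", "has longer sleeves"],
}]
merged = build_queries(entries, merge_captions=True)
split = build_queries(entries, merge_captions=False)
```

With merging, the entry produces one query whose text carries both captions. Without merging, it produces two shorter queries, so the number of evaluated queries doubles and each query sees less text, which is why the two sets of numbers may not be directly comparable.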

Where to get 'datasets/shoes/shoes-tag-test.txt'?

I am running python generate_groundtruth.py --dataset='shoes' --data_path='' and ran into this error.
FileNotFoundError: [Errno 2] No such file or directory: 'datasets/shoes/shoes-tag-test.txt'

PyTorch implementation

Hi, I know this is a lot to ask, but just in case: is there a PyTorch implementation available, or an easy way to port the model to PyTorch?

Thanks.

Data preparation (tags)

I got this error when I tried to run "python generate_groundtruth.py --dataset='fashion_iq' --data_path='fashion_iq'" and "python read_glove.py --dataset='fashion_iq' --data_path='datasets/fashion_iq/image_data'":
...
file = open(filename, "r")
FileNotFoundError: [Errno 2] No such file or directory: 'datasets/fashion_iq/tags/asin2attr-test.txt'

I found that the .txt files in 'datasets/fashion_iq/tags/' should be generated by "python generate_tags.py --dataset='fashion_iq'", but the code does not generate "asin2attr-test.txt". It only generates the following 9 .txt files:
writepaths = [
'datasets/fashion_iq/tags/asin2attr-train-dress.txt',
'datasets/fashion_iq/tags/asin2attr-train-shirt.txt',
'datasets/fashion_iq/tags/asin2attr-train-toptee.txt',
'datasets/fashion_iq/tags/asin2attr-val-dress.txt',
'datasets/fashion_iq/tags/asin2attr-val-shirt.txt',
'datasets/fashion_iq/tags/asin2attr-val-toptee.txt',
'datasets/fashion_iq/tags/asin2attr-test-dress.txt',
'datasets/fashion_iq/tags/asin2attr-test-shirt.txt',
'datasets/fashion_iq/tags/asin2attr-test-toptee.txt',
]

Is there something wrong in your code?