val's Introduction

Visiolinguistic-Attention-Learning

TensorFlow code for the VAL model

Chen et al. Image Search with Text Feedback by Visiolinguistic Attention Learning. CVPR 2020

Getting Started

Prerequisites:

Preparation:

(1) Download the ImageNet-pretrained models (mobilenet and resnet) and put them under the directory pretrain_model.

(2) Follow the steps in scripts/prepare_data.sh to prepare the datasets. Note: fashion200k and shoes can be downloaded manually. The relevant .py files for data preparation are detailed below.

  • download_fashion_iq.py: crawl the image data from Amazon websites. Note that some URLs might be broken.
  • generate_groundtruth.py: generate the .npy files that characterize the groundtruth annotations at test time.
  • read_glove.py: prepare the pre-trained glove word embeddings to initialize the text model (i.e. LSTM).
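
As a rough illustration of the GloVe step, the sketch below parses the standard GloVe text format (one word per line followed by space-separated floats) and builds an embedding table for a vocabulary. It is not the code in read_glove.py: the function names, the random fallback for out-of-vocabulary words, and the toy three-dimensional vectors are all assumptions made for the example.

```python
import io
import random


def load_glove(lines):
    """Parse GloVe text lines ("word v1 v2 ...") into a dict of float lists."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def build_embedding_matrix(vocab, vectors, dim, seed=0):
    """Return one row per vocabulary word; unseen words get small random values."""
    rng = random.Random(seed)
    matrix = []
    for word in vocab:
        if word in vectors:
            matrix.append(list(vectors[word]))
        else:
            # Assumed fallback: small uniform noise for out-of-vocabulary words.
            matrix.append([rng.uniform(-0.1, 0.1) for _ in range(dim)])
    return matrix


# Tiny in-memory stand-in for a GloVe file (real vectors are much longer).
sample = io.StringIO("red 0.1 0.2 0.3\ndress 0.4 0.5 0.6\n")
glove = load_glove(sample)
matrix = build_embedding_matrix(["red", "dress", "sleeveless"], glove, dim=3)
print(len(matrix), len(matrix[0]))
```

A table built this way could then be used as the initial value of the word embedding variable that feeds the LSTM.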

Running Experiments

Training & Testing:

Train and test the VAL model on different datasets in one script file as follows.

bash scripts/run_fashion200k.sh
bash scripts/run_fashion_iq.sh
bash scripts/run_shoes.sh

The final test results are reported in results/results_fashion_iq.log.

Our implementation includes the following .py files. Note that fashion200k is formatted differently from fashion_iq and shoes: a triplet of source image, text and target image is not given in advance, but is instead sampled randomly during training. Therefore, there are two implementations that build and run the training graph.

  • train_val.py: build and run the training graph on dataset fashion_iq or shoes.
  • train_val_fashion200k.py: build and run the training graph on dataset fashion200k.
  • model.py: define the model and losses.
  • config.py: define image preprocessing and other configurations.
  • extract_features_val.py: extract features from the model.
  • test_val.py: compute distance, perform retrieval, and report results in the log file.
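
The retrieval step can be pictured with the minimal sketch below. It assumes Euclidean distance between a composed query feature and each gallery feature, and recall@K as the reported metric; the distance and the metric actually used in test_val.py may differ, and all names and toy features here are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_gallery(query, gallery):
    """Return gallery indices sorted from nearest to farthest."""
    distances = [euclidean(query, g) for g in gallery]
    return sorted(range(len(gallery)), key=lambda i: distances[i])


def recall_at_k(queries, gallery, targets, k):
    """Fraction of queries whose target index appears in the top-k results."""
    hits = 0
    for query, target in zip(queries, targets):
        if target in rank_gallery(query, gallery)[:k]:
            hits += 1
    return hits / len(queries)


# Toy 2-D features; real features come from the trained model.
gallery = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
queries = [[0.9, 0.1], [0.1, 0.8]]
targets = [1, 3]  # index of the correct gallery item for each query

print(recall_at_k(queries, gallery, targets, k=1))
print(recall_at_k(queries, gallery, targets, k=3))
```

In this toy setup the first query retrieves its target at rank 1, while the second query only finds its target at rank 3, so recall rises as K grows.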

Bibtex:

@inproceedings{chen2020image,
  title={Image Search with Text Feedback by Visiolinguistic Attention Learning},
  author={Chen, Yanbei and Gong, Shaogang and Bazzani, Loris},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={3001--3011},
  year={2020}
}

License

This project is licensed under the Apache-2.0 License - see the LICENSE file for details.

References

[1] Automatic Spatially-aware Fashion Concept Discovery, ICCV 2017
[2] The Fashion IQ Dataset: Retrieving Images by Combining Side Information and Relative Natural Language Feedback, CVPRW 2019
[3] Dialog-based Interactive Image Retrieval, NeurIPS 2018
[4] Automatic Attribute Discovery and Characterization from Noisy Web Data, ECCV 2010

val's Issues

Train Error

When running run_fashion_iq.sh, the train_val.py step fails with this error:

scripts/run_fashion_iq.sh: line 44:  1876 Segmentation fault      (core dumped) python train_val.py --checkpoint_dir=${STAGE1_DIR} --pretrain_checkpoint_dir=${PRETRAIN_DIR} --image_model=${CNN} --data_path=${DATA_DIR} --batch_size=32 --print_span=20 --save_length=10000 --text_model=${TEXT_MODEL} --joint_embedding_size=${JOINT_SIZE} --word_embedding_size=${WORD_SIZE} --text_embedding_size=${TEXT_SIZE} --margin=0.2 --text_projection_dropout=0.9 --init_learning_rate=0.0002 --dataset=${DATASET} --train_length=50000 --constant_lr=True --image_size=${IMG_SIZE} --image_feature_name='before_pool' --augmentation=${AUGMENT} --word_embedding_dir=${PRE_TEXT}

How can a segmentation fault happen here? Please help me, thank you.

Problems about Fashion200k dataset

Hi, there.

Thank you for the great work!

According to the fashion200k repo, there are two versions of the Fashion200k pictures: cropped detected images and original images. Which one do you use in your implementation?

Test error

When I run "python test_val.py --feature_dir=${FEAT_DIR} --batch_size=20 --dataset=fashoin_iq --subset=toptee", I get this error:

Traceback (most recent call last):
File "test_val.py", line 218, in <module>
main()
File "test_val.py", line 179, in main
gt_label = order[gt_mask[i, :]]
IndexError: index 3920 is out of bounds for axis 0 with size 3920

When the subset is "shirt", the error is: IndexError: boolean index did not match indexed array along dimension 0; dimension is 2902 but corresponding boolean dimension is 3089

When the subset is "dress", the error is: IndexError: boolean index did not match indexed array along dimension 0; dimension is 2902 but corresponding boolean dimension is 2628
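
The "shirt" and "dress" messages both say that a boolean mask was applied to an array of a different length. One plausible reading, offered here only as an assumption, is that the extracted features and the groundtruth .npy files were produced for different subsets or dataset versions. The sketch below is a hypothetical pre-check, not part of test_val.py: it compares the number of gallery items with the mask shape before any indexing happens. The query count of 100 is a placeholder.

```python
def check_mask_alignment(num_gallery, mask_shape):
    """Return a list of problems found; an empty list means the shapes agree.

    num_gallery: number of gallery items that the ranking array indexes.
    mask_shape:  expected (num_queries, num_gallery) shape of the boolean mask.
    """
    problems = []
    if len(mask_shape) != 2:
        problems.append("mask should be 2-D, got shape %r" % (mask_shape,))
    elif mask_shape[1] != num_gallery:
        problems.append(
            "mask has %d columns but there are %d gallery items"
            % (mask_shape[1], num_gallery))
    return problems


# Sizes taken from the "shirt" error message; 100 queries is a placeholder.
for problem in check_mask_alignment(2902, (100, 3089)):
    print(problem)
```

With NumPy arrays, the two arguments would be the length of the ranking array and the `.shape` of the mask. If the check reports a mismatch, regenerating both the features and the groundtruth files for the same subset is a reasonable first thing to try.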

Data prepare.

I got this error when I tried to run python generate_caption_pairs.py --dataset='fashion_iq'

Traceback (most recent call last):
File "generate_tags.py", line 10, in <module>
'dataset', "fashion_iq or shoes")
File "C:\ProgramData\Miniconda3\lib\site-packages\tensorflow_core\python\platform\flags.py", line 58, in wrapper
return original_function(*args, **kwargs)
TypeError: DEFINE_string() missing 1 required positional argument: 'help'

No such file or directory: 'save_model/fashion_iq/val_resnet_v2_50_ml/query_images.npy'

I have some problems. Please help me.

  1. I am running bash scripts/run_fashion_iq.sh and hitting this error.
Traceback (most recent call last):
  File "test_val.py", line 220, in <module>
    main()
  File "test_val.py", line 48, in main
    query_images = np.load(filename)
  File "/home/huynhtruc0309/anaconda3/envs/truchlp/lib/python3.7/site-packages/numpy/lib/npyio.py", line 416, in load
    fid = stack.enter_context(open(os_fspath(file), "rb"))
FileNotFoundError: [Errno 2] No such file or directory: 'save_model/fashion_iq/val_resnet_v2_50_ml/query_images.npy'
  2. Could you also provide the checkpoint used in the paper?
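
Regarding the FileNotFoundError in item 1: test_val.py tries to load query_images.npy from the feature directory, so the file was presumably never written, for example because an earlier step of the script failed. That cause is an assumption. The sketch below shows a hypothetical pre-flight check that lists missing files before the test step runs; only the name query_images.npy comes from the traceback, and the helper is not part of the repository.

```python
import os
import tempfile


def missing_files(directory, filenames):
    """Return the names from `filenames` that do not exist under `directory`."""
    return [name for name in filenames
            if not os.path.isfile(os.path.join(directory, name))]


# Demonstration on a temporary directory so the snippet runs anywhere.
with tempfile.TemporaryDirectory() as feature_dir:
    required = ["query_images.npy"]  # name taken from the traceback above
    before = missing_files(feature_dir, required)
    # Create an empty placeholder file to simulate a finished extraction step.
    open(os.path.join(feature_dir, "query_images.npy"), "wb").close()
    after = missing_files(feature_dir, required)

print(before)
print(after)
```

Pointing such a check at save_model/fashion_iq/val_resnet_v2_50_ml would show whether the feature files exist before test_val.py is launched.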

Is FashionIQ evaluation comparable?

In the Fashion IQ dataset, two captions are available for each pair. In the paper "The Fashion IQ Dataset: Retrieving Images by Combining Side Information and Relative Natural Language Feedback", the authors concatenate the two captions with the special symbol "<and>" and so treat them as a single caption. Each ref-cap-tgt entry in the JSON file therefore yields only one pair.
For more detail, see their released source code.
However, in your evaluation process, the file "fashion_iq-val-cap.txt" shows that you treat them as two individual pairs. If the results in your paper follow the same evaluation process, I suspect they are not comparable with earlier published work.
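
The difference between the two protocols can be made concrete with a small sketch. The field names "candidate", "target" and "captions" are assumed for the annotation entry, and the IDs and captions are made up; neither function is taken from either code base.

```python
def queries_concatenated(entry, joiner=" <and> "):
    """One query per annotation: both captions joined into a single text."""
    text = joiner.join(entry["captions"])
    return [(entry["candidate"], text, entry["target"])]


def queries_separate(entry):
    """One query per caption: each caption forms its own triplet."""
    return [(entry["candidate"], caption, entry["target"])
            for caption in entry["captions"]]


# Hypothetical annotation entry; IDs and captions are invented for the example.
entry = {
    "candidate": "REF001",
    "target": "TGT001",
    "captions": ["is darker", "has longer sleeves"],
}

print(queries_concatenated(entry))
print(queries_separate(entry))
```

The first protocol evaluates one query carrying both pieces of feedback, while the second evaluates two queries that each carry half of it, so recall numbers from the two setups are averaged over different query sets.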

Where to get 'datasets/shoes/shoes-tag-test.txt'?

I am running python generate_groundtruth.py --dataset='shoes' --data_path='' and hitting this error.
FileNotFoundError: [Errno 2] No such file or directory: 'datasets/shoes/shoes-tag-test.txt'

PyTorch implementation

Hi, I know this is a lot to ask, but just in case: is there a PyTorch implementation available, or an easy way to port the model to PyTorch?

Thanks.

Data prepare (tags)

I got this error when I tried to run " python generate_groundtruth.py --dataset='fashion_iq' --data_path='fashion_iq' " and " python read_glove.py --dataset='fashion_iq' --data_path='datasets/fashion_iq/image_data' " :
...
file = open(filename, "r")
FileNotFoundError: [Errno 2] No such file or directory: 'datasets/fashion_iq/tags/asin2attr-test.txt'

I found that the .txt files in 'datasets/fashion_iq/tags/' should be generated by " python generate_tags.py --dataset='fashion_iq' ", but the code does not generate "asin2attr-test.txt". It only generates the following 9 .txt files:
writepaths = [
'datasets/fashion_iq/tags/asin2attr-train-dress.txt',
'datasets/fashion_iq/tags/asin2attr-train-shirt.txt',
'datasets/fashion_iq/tags/asin2attr-train-toptee.txt',
'datasets/fashion_iq/tags/asin2attr-val-dress.txt',
'datasets/fashion_iq/tags/asin2attr-val-shirt.txt',
'datasets/fashion_iq/tags/asin2attr-val-toptee.txt',
'datasets/fashion_iq/tags/asin2attr-test-dress.txt',
'datasets/fashion_iq/tags/asin2attr-test-shirt.txt',
'datasets/fashion_iq/tags/asin2attr-test-toptee.txt',
]

Is there something wrong in the code?
