ads2018's Introduction

Implementation of our approach for the Automatic Understanding of Visual Advertisements Challenge (1st place in the 2018 challenge).

Requirements

You need the following packages:

chainer
chainercv
keras
cupy
gensim
nltk
pandas
pytables
parse
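
To confirm that the packages above are installed before running any script, a minimal sketch like the following can help. The helper name missing_modules is hypothetical and not part of this repository. Note that PyTables is imported under the module name tables, so that is the name checked here.

```python
from importlib.util import find_spec

# Import names for the packages listed above.
# PyTables is imported as "tables", not "pytables".
REQUIRED_MODULES = [
    "chainer", "chainercv", "keras", "cupy", "gensim",
    "nltk", "pandas", "tables", "parse",
]


def missing_modules(modules=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

The check only looks the modules up; it does not import them, so it also runs on a machine without a GPU.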

We also provide a Dockerfile to set up the dependencies.

We use Google's pretrained word2vec vectors to compute word embeddings. Download GoogleNews-vectors-negative300.bin.gz here, decompress it, and set WORD2VEC_PATH to the resulting .bin file.

export WORD2VEC_PATH=/path/to/Word2Vec/GoogleNews-vectors-negative300.bin
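
As a quick sanity check that WORD2VEC_PATH is set and points to a readable file, something along these lines may be useful. This is only a sketch: resolve_word2vec_path and load_word2vec are hypothetical helpers, and how the training script itself reads the variable is not shown here. The loading call is gensim's KeyedVectors.load_word2vec_format with binary=True, which is the usual way to read the GoogleNews binary file.

```python
import os


def resolve_word2vec_path(environ=None):
    """Return the path stored in WORD2VEC_PATH, or raise a clear error."""
    environ = os.environ if environ is None else environ
    path = environ.get("WORD2VEC_PATH", "")
    if not path:
        raise RuntimeError("WORD2VEC_PATH is not set")
    if not os.path.isfile(path):
        raise FileNotFoundError(f"WORD2VEC_PATH does not point to a file: {path}")
    return path


def load_word2vec(path):
    """Load the binary word2vec file with gensim (needs several GB of RAM)."""
    # Imported lazily so the path check above works without gensim installed.
    from gensim.models import KeyedVectors
    return KeyedVectors.load_word2vec_format(path, binary=True)
```

Typical use would be vectors = load_word2vec(resolve_word2vec_path()), after which vectors["word"] returns a 300-dimensional vector.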

Data

You can get the competition dataset here. Download the training/test datasets and extract them in the data directory.

We also use OCR results. Download the OCR results (figshare) and save them in the data directory.

Preparation

Before training, pre-compute Faster-RCNN features of ad images.

VA_DATASET_ROOT=/path/to/VisualAdvertisementDataset/ python script/save_feat.py

Alternatively, download the precomputed Faster-RCNN features (figshare) and copy them to data/frcnn_feat/.
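
Large downloads sometimes arrive damaged, so it can be worth checking the feature files before training. The sketch below assumes the features are HDF5 files named train.h5 and test.h5 inside data/frcnn_feat/ (the file names are an assumption; adjust them to what the archive actually contains). It only tests for the 8-byte HDF5 signature at the start of each file, using the standard library.

```python
import os

# Every HDF5 file begins with this 8-byte signature (at offset 0 unless
# the file was written with a user block, which this sketch ignores).
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_hdf5(path):
    """Cheap check: does the file start with the HDF5 signature?"""
    with open(path, "rb") as f:
        return f.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE


def check_features(feat_dir="data/frcnn_feat", names=("train.h5", "test.h5")):
    """Map each expected file name to 'missing', 'not HDF5', or 'ok'."""
    report = {}
    for name in names:
        path = os.path.join(feat_dir, name)
        if not os.path.isfile(path):
            report[name] = "missing"
        elif not looks_like_hdf5(path):
            report[name] = "not HDF5"
        else:
            report[name] = "ok"
    return report


if __name__ == "__main__":
    for name, status in check_features().items():
        print(f"{name}: {status}")
```

This catches gross damage such as an empty file or a saved error page. A truncated download can still pass, so comparing the file size with the size shown on the download page is a useful second check.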

Training

To train our full model, run

python script/train.py --model_name ocr+vis --text_net cnn

An output directory is created under /output/checkpoint/, and the trained model and other output files are saved there.
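
To locate the directory of the most recent run, for example to pass it to the evaluation command, a small helper such as the one below may be convenient. It is a sketch under one assumption: that the newest run is the most recently modified subdirectory of the checkpoint root. How the training script names its output directories is not assumed, and latest_subdirectory is a hypothetical name.

```python
import os


def latest_subdirectory(root):
    """Return the most recently modified subdirectory of root, or None."""
    subdirs = [entry.path for entry in os.scandir(root) if entry.is_dir()]
    return max(subdirs, key=os.path.getmtime) if subdirs else None
```

Pass the checkpoint root that applies to your setup as root, then use the returned path as the argument to --eval.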

Evaluation

To evaluate a model, run

python script/train.py --eval /path/to/output/directory

Reproduce the competition results

Download two figshare items: the Chainer model file, and the tokenizer and word embeddings (figshare). Copy wordvec.npy and tokenizer.pickle to the data directory, then run

python script/train.py --eval /path/to/directory/of/Chainer_model_file

Visualizing the results

We included some code snippets for visualization. See notebook/visualize inference.ipynb.

ads2018's Issues

frcnn_feat is corrupted

As the title says, during data preparation I can't unzip frcnn_feat after downloading it.

I looked in more detail and found that the problem is with train.h5: test.h5 can be extracted, but train.h5 can't.

Can I get an uncorrupted copy of train.h5?

Repository: mayu-ot / ads2018
