Symbol detection in online handwritten graphics using Faster R-CNN

This repository contains the implementation of the models described in the paper "Symbol detection in online handwritten graphics using Faster R-CNN". Each model is a Faster R-CNN network that takes an image of a handwritten graphic (a flowchart or a mathematical expression) as input and predicts the bounding box coordinates of the symbols that compose the graphic. The models are implemented using a fork of the TensorFlow Object Detection API.

Symbol detection in flowchart

Symbol detection in mathematical expression

Citing this work

In case you use this work, please consider citing:

@inproceedings{frankdas:2018,
  title={Symbol detection in online handwritten graphics using Faster R-CNN},
  author={Frank Julca-Aguilar and Nina Hirata},
  booktitle={13th IAPR International Workshop on Document Analysis Systems (DAS)},
  year={2018}
}

Contents

  1. Installation
  2. Evaluating the models
  3. Training new models

Installation

  1. Clone the repository (with --recursive)
git clone --recursive https://github.com/vision-ime/faster-rcnn-graphics.git

The --recursive option is necessary to download the forked version of the TensorFlow Object Detection API used in our experiments.

  2. Follow the TensorFlow Object Detection API installation instructions to set up the API, which was cloned into the tf-models folder (the tf-models folder corresponds to the models folder described in the API installation instructions).
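
If the repository was cloned without --recursive, the tf-models folder will be missing the API scripts. A minimal sketch of a check, assuming it is run from the repository root; the helper name `api_scripts_present` is hypothetical, while the script paths are the ones used in the commands later in this README:

```python
from pathlib import Path

# Scripts that this README invokes from the forked Object Detection API.
REQUIRED_SCRIPTS = ("eval.py", "train.py")


def api_scripts_present(repo_root="."):
    """Return True if tf-models contains the scripts used in this README."""
    api_dir = Path(repo_root) / "tf-models" / "research" / "object_detection"
    return all((api_dir / name).is_file() for name in REQUIRED_SCRIPTS)


if __name__ == "__main__":
    if api_scripts_present():
        print("tf-models looks populated.")
    else:
        # Standard git command for fetching submodules after a plain clone.
        print("tf-models is incomplete; try: git submodule update --init --recursive")
```

This only confirms that the files exist; it does not verify that the API itself was installed correctly.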

Evaluating the models

  1. Download the datasets. In the directory where you cloned this repository, run:
./download_datasets.sh

The datasets will be saved in the datasets folder. Each dataset consists of TensorFlow .record files, images of handwritten graphics, and XML metadata for each image. As described in the paper, the datasets were built from the CROHME-2016 and flowcharts datasets.
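
To confirm that the download completed, it can help to count the files by type. The following is a minimal sketch, assuming only that the files land under the datasets folder; the helper name `count_files_by_extension` is hypothetical, and since the image file extension is not stated here, the script simply reports whatever extensions it finds:

```python
from collections import Counter
from pathlib import Path


def count_files_by_extension(dataset_dir):
    """Count files under dataset_dir, grouped by lowercase file extension."""
    counts = Counter()
    root = Path(dataset_dir)
    if not root.is_dir():
        return counts  # nothing downloaded yet
    for path in root.rglob("*"):
        if path.is_file():
            counts[path.suffix.lower()] += 1
    return counts


if __name__ == "__main__":
    # "datasets" is the folder created by download_datasets.sh.
    for extension, total in sorted(count_files_by_extension("datasets").items()):
        print(f"{extension or '(no extension)'}: {total}")
```

Since each image has its own XML metadata file, the image and .xml counts would be expected to match.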

  2. Download a model. Models can be downloaded from: http://www.vision.ime.usp.br/~frank.aguilar/graphics/models/

For example, to download the model for symbol detection in flowcharts, trained with Inception V2:

wget http://www.vision.ime.usp.br/~frank.aguilar/graphics/models/flowcharts/flowcharts_inceptionv2.tar.gzip

To keep the different files organized, we can save the model in the corresponding flowcharts folder.

mkdir -p models/flowcharts/inceptionv2/trained
mv flowcharts_inceptionv2.tar.gzip models/flowcharts/inceptionv2/trained/
cd models/flowcharts/inceptionv2/trained
tar -xf flowcharts_inceptionv2.tar.gzip
  3. Execute the evaluation script. From the folder in which you cloned this repository, you can evaluate the model downloaded in step 2 with:
python tf-models/research/object_detection/eval.py \
    --logtostderr \
    --pipeline_config_path=models/flowcharts/inceptionv2/pipeline.config \
    --checkpoint_dir=models/flowcharts/inceptionv2/trained \
    --eval_dir=models/flowcharts/inceptionv2/eval \
    --gpudev=0 \
    --run_once=True

The gpudev parameter selects the GPU device used to evaluate the model. A value of -1 runs the evaluation on the CPU. The remaining parameters are defined as in the Object Detection API documentation.
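
When evaluating several models, the command can be assembled programmatically. The following is a minimal sketch, assuming the folder layout used above (pipeline.config, trained and eval under one model folder); the helper name `build_eval_command` is hypothetical, and every flag is copied from the evaluation command in this README:

```python
import shlex


def build_eval_command(model_dir, gpudev=0):
    """Build the eval.py argument list for a model folder,
    for example models/flowcharts/inceptionv2."""
    return [
        "python",
        "tf-models/research/object_detection/eval.py",
        "--logtostderr",
        f"--pipeline_config_path={model_dir}/pipeline.config",
        f"--checkpoint_dir={model_dir}/trained",
        f"--eval_dir={model_dir}/eval",
        f"--gpudev={gpudev}",  # -1 runs on the CPU, as described above
        "--run_once=True",
    ]


if __name__ == "__main__":
    # Print the CPU evaluation command for the flowcharts model.
    command = build_eval_command("models/flowcharts/inceptionv2", gpudev=-1)
    print(shlex.join(command))
```

The returned list can be passed directly to `subprocess.run` from the repository root, which avoids shell quoting and line-continuation issues.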

Training new models

New models can be trained using

python tf-models/research/object_detection/train.py \
    --logtostderr \
    --pipeline_config_path=models/math/inceptionv2/pipeline.config \
    --train_dir=models/math/inceptionv2/new_trained \
    --gpudev=0 &

As with evaluation, the parameters are defined as in the Object Detection API documentation.
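
Because the training command runs in the background, it can be useful to check how far it has progressed. The following is a minimal sketch, assuming that checkpoints are written to the training folder with the usual TensorFlow 1.x naming (model.ckpt-<step>.index); that naming is an assumption not stated in this README, and the helper name `latest_checkpoint_step` is hypothetical:

```python
import re
from pathlib import Path

# Assumed TensorFlow 1.x checkpoint naming: model.ckpt-<step>.index
CHECKPOINT_PATTERN = re.compile(r"^model\.ckpt-(\d+)\.index$")


def latest_checkpoint_step(train_dir):
    """Return the highest training step with a checkpoint in train_dir, or None."""
    root = Path(train_dir)
    if not root.is_dir():
        return None
    matches = (CHECKPOINT_PATTERN.match(path.name) for path in root.iterdir())
    steps = [int(match.group(1)) for match in matches if match]
    return max(steps) if steps else None


if __name__ == "__main__":
    # Folder passed as --train_dir in the training command above.
    print(latest_checkpoint_step("models/math/inceptionv2/new_trained"))
```

A result of None means that no checkpoint matching the assumed naming has been written yet.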
