dhSegment

dhSegment allows you to extract content (segment regions) from different types of documents. See some examples here.

The corresponding paper is now available on arXiv and will be presented as an oral at ICFHR 2018.

It was created by Benoit Seguin and Sofia Ares Oliveira at DHLAB, EPFL.

Installation and requirements

See INSTALL.md to set up the environment and to use the dh_segment package.

NB: a good NVIDIA GPU (at least 6 GB of memory) is most likely necessary to train your own models. We assume CUDA and cuDNN are installed.

Usage

Training

  • Your training data must be in a folder containing an images folder and a labels folder. Each image and its label must have the same file name (the file extensions may differ, but we recommend saving the label images as .png files).
  • The annotated images in the labels folder are (usually) RGB images in which each region to segment is annotated with a specific color.
  • The file containing the classes has the format shown below: each row corresponds to one class (including the 'negative' or 'background' class) and holds the 3 RGB values of that class. Each class must have a different color code.
0 0 0
0 255 0
...
  • The sacred package is used to manage experiments and training runs. Have a look at its documentation to use it properly.
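
The class file format described above is simple enough to check before training. The following is a minimal sketch, not part of dh_segment: the function name parse_classes is hypothetical, and the checks (three integers per row, values in 0..255, no duplicate color) only restate the rules listed above.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]


def parse_classes(text: str) -> List[RGB]:
    """Parse a class file: one 'R G B' triple per non-empty line.

    Hypothetical helper for illustration; it is not the dh_segment loader.
    """
    colors: List[RGB] = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        parts = line.split()
        if len(parts) != 3:
            raise ValueError(f"line {line_no}: expected 3 values, got {len(parts)}")
        r, g, b = (int(p) for p in parts)  # int() raises ValueError on non-numeric text
        if not all(0 <= v <= 255 for v in (r, g, b)):
            raise ValueError(f"line {line_no}: RGB values must be in 0..255")
        if (r, g, b) in colors:
            raise ValueError(f"line {line_no}: color {(r, g, b)} is already used by another class")
        colors.append((r, g, b))
    return colors
```

To check a file on disk, read its text first (for example with open(path).read()) and pass it to parse_classes; the index of each color in the returned list follows the row order of the file.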

To train a model, run python train.py with <config.json> (with is the sacred keyword that passes the configuration file to the experiment).
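
Before launching a training run, it can help to verify the folder layout described above, where every image needs a label with the same file name. The sketch below is an illustration only: the function unmatched_pairs is hypothetical, and it assumes that the two subfolders are named images and labels and that files are matched by name without extension.

```python
from pathlib import Path
from typing import Dict, List


def unmatched_pairs(data_dir: str) -> Dict[str, List[str]]:
    """Report file names (without extension) present in only one of images/ and labels/.

    Hypothetical helper; assumes data_dir contains 'images' and 'labels' subfolders.
    """
    root = Path(data_dir)
    image_stems = {p.stem for p in (root / "images").iterdir() if p.is_file()}
    label_stems = {p.stem for p in (root / "labels").iterdir() if p.is_file()}
    return {
        "images_without_label": sorted(image_stems - label_stems),
        "labels_without_image": sorted(label_stems - image_stems),
    }
```

Two empty lists in the result mean that every image has a matching label and vice versa, whatever the file extensions are.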

Demo

This demo shows how to use dhSegment for page extraction from documents. It optionally trains a model from scratch using the READ-BAD dataset and the pagenet annotations (those of annotator1). To limit memory usage, the images in the dataset we provide have been downsized to 1M pixels each.
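
The 1M pixel budget mentioned above translates into a simple scale factor. The source does not say how the provided images were resized, so the following is only a sketch under two assumptions: the aspect ratio is preserved, and sizes are rounded down. The function name size_for_pixel_budget is hypothetical.

```python
import math
from typing import Tuple


def size_for_pixel_budget(width: int, height: int, max_pixels: int = 1_000_000) -> Tuple[int, int]:
    """Return a (width, height) with the same aspect ratio and at most max_pixels pixels.

    Assumption: uniform scaling by sqrt(max_pixels / (width * height)), rounded down.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    n_pixels = width * height
    if n_pixels <= max_pixels:
        return width, height  # already within the budget, keep the original size
    scale = math.sqrt(max_pixels / n_pixels)
    # Rounding down keeps the product at or below max_pixels.
    return max(1, int(width * scale)), max(1, int(height * scale))
```

For example, a 4000 x 3000 scan has 12M pixels, so the scale factor is sqrt(1/12), about 0.289, and the function returns 1154 x 866.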

How to

  1. Get the annotated dataset here, which already contains the images and labels folders for the training, validation and testing sets. Unzip it into demo/pages.
cd demo/
wget https://github.com/dhlab-epfl/dhSegment/releases/download/v0.2/pages.zip
unzip pages.zip
cd ..
  2. (Only needed if training from scratch) Download the pretrained weights for ResNet:
cd pretrained_models/
python download_resnet_pretrained_model.py
cd ..
  3. You can train the model from scratch with python train.py with demo/demo_config.json. Because this takes quite some time, we recommend skipping this step and downloading the provided model instead (download and unzip it in demo/model):
cd demo/
wget https://github.com/dhlab-epfl/dhSegment/releases/download/v0.2/model.zip
unzip model.zip
cd ..
  4. (Only if training from scratch) You can monitor the training progress in TensorBoard by running tensorboard --logdir . in the demo folder.
  5. Run python demo.py
  6. Have a look at the results in demo/processed_images
