P2PaLA

Page to PAGE Layout Analysis (P2PaLA) is a toolkit for Document Layout Analysis based on Neural Networks.

💥 Try our new DEMO for online baseline detection. ❗❗

If you find this toolkit useful in your research, please cite:

@misc{p2pala2017,
  author = {Lorenzo Quirós},
  title = {P2PaLA: Page to PAGE Layout Analysis toolkit},
  year = {2017},
  publisher = {GitHub},
  note = {GitHub repository},
  howpublished = {\url{https://github.com/lquirosd/P2PaLA}},
}

See the paper on arXiv for more details.

Requirements

  • Linux (OSX may work, but is untested).
  • Python 2.7 or 3.6 (a conda virtual environment is recommended).
  • NumPy.
  • PyTorch (1.0). A PyTorch 0.3.1 compatible version is available on this branch.
  • OpenCV (3.4.5.20).
  • NVIDIA GPU + CUDA CuDNN (CPU mode and CUDA without CuDNN work, but are not recommended for training).
  • tensorboard-pytorch (v0.9) [Optional]: pip install tensorboardX. A separate conda env is recommended to keep TensorFlow apart from PyTorch.
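
Before installing, it can help to check which of the Python packages above are already importable. The following is a minimal sketch, not part of P2PaLA: the helper name `missing_modules` is hypothetical, the import names (`numpy`, `torch`, `cv2`, `tensorboardX`) are the usual ones for the listed packages, and the script assumes Python 3 because it relies on `importlib.util`. It does not check versions.

```python
import importlib.util

# Usual import names for the packages listed above (assumption: default
# installs of NumPy, PyTorch, OpenCV and tensorboardX). tensorboardX is optional.
REQUIRED = ["numpy", "torch", "cv2"]
OPTIONAL = ["tensorboardX"]


def missing_modules(names):
    """Return the names that the import system cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    for name in missing_modules(REQUIRED):
        print("missing required module:", name)
    for name in missing_modules(OPTIONAL):
        print("missing optional module:", name)
```

If the script prints nothing, all listed modules were found in the active environment.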

Install

python setup.py install

To install only the Python dependencies, use the requirements file:
conda env create --file conda_requirements.yml

Usage

  1. Input data must follow the folder structure data_tag/page: images go in the data_tag folder and their XML files in its page subfolder. For example:
mkdir -p data/{train,val,test,prod}/page;
tree data;
data
├── prod
│   ├── page
│   │   ├── prod_0.xml
│   │   └── prod_1.xml
│   ├── prod_0.jpg
│   └── prod_1.jpg
├── test
│   ├── page
│   │   ├── test_0.xml
│   │   └── test_1.xml
│   ├── test_0.jpg
│   └── test_1.jpg
├── train
│   ├── page
│   │   ├── train_0.xml
│   │   └── train_1.xml
│   ├── train_0.jpg
│   └── train_1.jpg
└── val
    ├── page
    │   ├── val_0.xml
    │   └── val_1.xml
    ├── val_0.jpg
    └── val_1.jpg
  2. Run the tool.
python P2PaLA.py --config config.txt --tr_data ./data/train --te_data ./data/test --log_comment "_foo"

❗ Pre-trained models available here

  3. Use TensorBoard to visualize training status:
tensorboard --logdir ./work/runs
  4. The resulting PAGE-XML files should be at "./work/results/test/".

We recommend Transkribus or nw-page-editor to visualize and edit PAGE-xml files.

  5. For details about arguments and the config file, see docs or python P2PaLA.py -h.
  6. For more detailed examples, see egs:
    • Bozen dataset
    • cBAD complex competition dataset
    • OHG dataset
License

GNU General Public License v3.0. See LICENSE for the full text.

Acknowledgments

Code is inspired by pix2pix and pytorch-CycleGAN-and-pix2pix
