midv-500-models

This repository contains a model for binary semantic segmentation of documents: each pixel is labeled as either document or background.

Left: input.

Center: prediction.

Right: overlay of the image and predicted mask.

For more details: Example notebook

Dataset

The model is trained on MIDV-500: A Dataset for Identity Documents Analysis and Recognition on Mobile Devices in Video Stream.

Preparation

Download the dataset from the FTP server:

wget -r ftp://smartengines.com/midv-500/

Unpack the dataset

cd smartengines.com/midv-500/dataset/
unzip \*.zip

The resulting folder structure will be

smartengines.com
    midv-500
        dataset
            01_alb_id
                ground_truth
                    CA
                        CA01_01.json
                    ...
                images
                    CA
                        CA01_01.tif
                    ...
                ...
            ...
        ...
    ...
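
As an illustration of the layout above, the following minimal Python sketch walks the unpacked dataset and pairs each file under images with the same-named file under ground_truth. The function name pair_frames is hypothetical and not part of this repository; matching files by their stem is an assumption based on the tree shown above.

```python
from pathlib import Path


def pair_frames(dataset_dir):
    """Yield (image_path, ground_truth_path) pairs that share a file stem."""
    dataset_dir = Path(dataset_dir)
    # One sub-folder per document type, e.g. 01_alb_id.
    for doc_dir in sorted(p for p in dataset_dir.iterdir() if p.is_dir()):
        images_dir = doc_dir / "images"
        truth_dir = doc_dir / "ground_truth"
        if not images_dir.is_dir() or not truth_dir.is_dir():
            continue
        # Index annotations by (condition folder, stem), e.g. ("CA", "CA01_01").
        truth = {
            (p.parent.name, p.stem): p
            for p in truth_dir.glob("*/*")
            if p.is_file()
        }
        for image in sorted(images_dir.glob("*/*")):
            match = truth.get((image.parent.name, image.stem))
            # Skip frames that have no annotation with the same name.
            if image.is_file() and match is not None:
                yield image, match
```

Calling list(pair_frames("smartengines.com/midv-500/dataset")) would then give one pair per annotated frame; frames without a matching annotation are skipped rather than reported.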

To preprocess the data use the script

python midv500models/preprocess_data.py -i <input_folder> \
                                          -o <output_folder>

where input_folder is the folder with the unpacked dataset. The output folder will look like this:

images
    CA01_01.jpg
    ...
masks
    CA01_01.png
    ...

Target binary masks take one of two values, 0 or 255, where 0 is background and 255 is the document.
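
To make the mask convention concrete, here is a minimal pure-Python sketch that rasterizes a document quadrilateral into a mask with 0 for background and 255 for the document. It assumes the annotation supplies the four corners as (x, y) pairs listed in order around the document. quad_to_mask is a hypothetical helper, not the code in preprocess_data.py, and a real pipeline would more likely use an image library for this step.

```python
def quad_to_mask(quad, width, height):
    """Return `height` rows of `width` values: 255 inside the quad, 0 outside."""
    mask = [[0] * width for _ in range(height)]
    n = len(quad)
    for y in range(height):
        py = y + 0.5  # sample each pixel at its centre
        # Collect the x-coordinates where polygon edges cross this scanline.
        crossings = []
        for i in range(n):
            (x0, y0), (x1, y1) = quad[i], quad[(i + 1) % n]
            if (y0 <= py) != (y1 <= py):  # the edge straddles the scanline
                crossings.append(x0 + (py - y0) * (x1 - x0) / (y1 - y0))
        crossings.sort()
        # Even-odd rule: fill between successive pairs of crossings.
        for left, right in zip(crossings[0::2], crossings[1::2]):
            for x in range(width):
                if left <= x + 0.5 < right:
                    mask[y][x] = 255
    return mask
```

For example, quad_to_mask([(1, 1), (4, 1), (4, 3), (1, 3)], width=6, height=5) marks a block 3 pixels wide and 2 pixels tall as 255 and leaves every other pixel at 0.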

Training

python midv500models/train.py -c midv500models/configs/2020-05-19.yaml \
                              -i <path to train>

Inference

python midv500models/inference.py -c midv500models/configs/2020-05-19.yaml \
                                  -i <path to images> \
                                  -o <path to save predictions> \
                                  -w <path to weights>

Example notebook

Example notebook

Weights

U-Net with ResNet34 backbone: Config, Weights

Contributors

ternaus, zackpashkin