Label Studio For LayoutLM

Scripts

Load data into a Label Studio project, and write ImageMetaData and ImageData to MongoDB:

LS_QA_ENDPOINT=<LABEL_STUDIO_ENDPOINT> LS_QA_TOKEN=<LABEL_STUDIO_TOKEN> LS_QA_PII_PROJECT_ID=<LABEL_STUDIO_PROJECT> MONGODB_USERNAME=<MONGODB_USERNAME> MONGODB_PASSWORD=<MONGODB_PASSWORD> python ls_loader_ocr_data.py --mongo_host <MONGODB_HOST> -d 2022-12-16 -f images

Export a combined dataset that joins the Label Studio annotations with ImageData:

MONGODB_USERNAME=<MONGODB_USERNAME> MONGODB_PASSWORD=<MONGODB_PASSWORD> LS_QA_ENDPOINT=<LABEL_STUDIO_ENDPOINT> LS_QA_TOKEN=<LABEL_STUDIO_TOKEN> LS_QA_PII_PROJECT_ID=<LABEL_STUDIO_PROJECT> python3 ls_exporter_combine_data.py --mongo_host <MONGODB_HOST>
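
Both commands pass their credentials through the same five environment variables. The sketch below shows one way a script could read and validate them before connecting to anything. Only the variable names come from the commands above; the `Settings` class and `load_settings` function are hypothetical and are not taken from the repository's scripts.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Variable names as used in the commands above.
REQUIRED_VARS = (
    "LS_QA_ENDPOINT",
    "LS_QA_TOKEN",
    "LS_QA_PII_PROJECT_ID",
    "MONGODB_USERNAME",
    "MONGODB_PASSWORD",
)


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the values the scripts need."""
    ls_endpoint: str
    ls_token: str
    ls_project_id: int
    mongo_username: str
    mongo_password: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the required variables, failing early if any is missing or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return Settings(
        # Drop a trailing slash so URLs can be joined consistently.
        ls_endpoint=env["LS_QA_ENDPOINT"].rstrip("/"),
        ls_token=env["LS_QA_TOKEN"],
        # Assumption: the project id is numeric, as the schema section suggests.
        ls_project_id=int(env["LS_QA_PII_PROJECT_ID"]),
        mongo_username=env["MONGODB_USERNAME"],
        mongo_password=env["MONGODB_PASSWORD"],
    )
```

Validating up front means a missing token is reported before any task is imported or any record is written.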

Flowchart

graph TD
    A[Prepare data] --> B(Enable Label-studio project)
    B --> |trigger|C(layoutlmv3_data_loader.py)
    C --> |write ImageMetaData and ImageData to databases| D(MongoDB)
    C --> |import new task_id record to label-studio project| E{labeling data on Label Studio}
    C --> |check task_id|C
    E --> |no|E
    E --> |yes, trigger|F(layoutlmv3_combine.py)
    F --> |fetch ImageMetaData and ImageData|D
    F --> |generate data|G(Done)
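
The yes/no decision at the labeling step can be read as a gate: the combine script should only run once labeling is finished. A minimal sketch of such a check follows, assuming each task is a dict with an `annotations` list, as in Label Studio's JSON export. The function name `all_tasks_labeled` is hypothetical, and the repository may decide readiness differently.

```python
from typing import Any, Iterable, Mapping


def all_tasks_labeled(tasks: Iterable[Mapping[str, Any]]) -> bool:
    """Return True when every task carries at least one annotation.

    Assumption: a task is a mapping with an "annotations" list.
    """
    tasks = list(tasks)
    # An empty project has nothing to export, so treat it as not ready.
    return bool(tasks) and all(task.get("annotations") for task in tasks)
```

A caller would fetch the project's tasks, run this check, and trigger the combine step only when it returns `True`.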

Schema

classDiagram
  class ImageMetaData {
    -filename : string
    -task_id: string uuid
    -project_id: integer id
    -type: train or test
    -text: document fulltext
  }
  class ImageData {
    -task_id: string uuid
    -project_id: integer id
    -token: list of strings
    -bbox: list of 4-tuples
  }
  class AnnotatedImageData {
    -task_id: string uuid
    -label_studio_export_data: LabelStudioExportAnnotations
  }
  ImageMetaData "1" -- "1" ImageData : link by task_id
  ImageMetaData "1" -- "1" AnnotatedImageData : link by task_id
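
Since the three record types are linked one-to-one by `task_id`, the combine step amounts to a keyed join. The sketch below illustrates that join with in-memory dataclasses that mirror the schema. It makes several assumptions: `bbox` is taken to be a list of integer 4-tuples with one box per token, the `combine` function and its output layout are hypothetical, and no MongoDB or Label Studio calls are shown.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, List, Tuple


@dataclass
class ImageMetaData:
    filename: str
    task_id: str
    project_id: int
    type: str  # "train" or "test"
    text: str  # full text of the document


@dataclass
class ImageData:
    task_id: str
    project_id: int
    token: List[str]
    bbox: List[Tuple[int, int, int, int]]  # assumed: one box per token


@dataclass
class AnnotatedImageData:
    task_id: str
    label_studio_export_data: Dict[str, Any]


def combine(
    metas: Iterable[ImageMetaData],
    images: Iterable[ImageData],
    annotations: Iterable[AnnotatedImageData],
) -> List[Dict[str, Any]]:
    """Join the three record types on task_id, skipping incomplete tasks."""
    images_by_task = {img.task_id: img for img in images}
    annotations_by_task = {ann.task_id: ann for ann in annotations}
    combined = []
    for meta in metas:
        img = images_by_task.get(meta.task_id)
        ann = annotations_by_task.get(meta.task_id)
        if img is None or ann is None:
            continue  # not yet OCR-processed or not yet annotated
        if len(img.token) != len(img.bbox):
            raise ValueError(f"token/bbox length mismatch for task {meta.task_id}")
        combined.append({
            "task_id": meta.task_id,
            "filename": meta.filename,
            "split": meta.type,
            "tokens": img.token,
            "bboxes": img.bbox,
            "annotations": ann.label_studio_export_data,
        })
    return combined
```

Indexing `ImageData` and `AnnotatedImageData` by `task_id` first keeps the join linear in the number of records.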

Contributors

sean830314