pdf-data-extraction's Introduction

Install dependencies (including devDependencies)

'yarn'

Run

'yarn start'

Package

'yarn package'

User Guide

PDF Entity Annotation Tool (PEAT)

Version 1.1.1

  1. Scope and Purpose

    The purpose of this project is to further the research and development of tools that NCEA can use to create machine-readable datasets and to support machine learning research. This effort consists of the following objectives:

    1. Research and develop software for NCEA that can annotate scientific publications for use in machine learning algorithms. The software should accept a list of tags provided by NCEA, let the user apply these tags to PDF documents in a web interface, and then extract the annotated information in machine-readable formats suitable for machine learning.
  2. Select the PDF and Schema (tags.json is included in the test folder).

  3. Annotate PDF

    1. Highlight the text you wish to annotate and select Add Annotation.
    2. Select the annotation type.
    3. Click Save.

  4. Save Annotations

    1. Click File in the menu bar and select Save Annotations.

    2. Select a save location on your computer and click Save Annot File.

  5. Load Annotations

    1. Click File in the menu bar and select Load Annotations.

    2. Select an annotation file.

  6. Delete Annotations

    1. Select the annotation you wish to delete from the table in the sidebar.
    2. Click the Delete selected row button.
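
The annotate, save, load, and delete steps above can be pictured as operations on a small in-memory store. The sketch below is illustrative only and is not PEAT's actual code or file format: the `Tag` and `Annotation` shapes, the `AnnotationStore` class, and the tag names `chemical` and `dose` are all assumptions made for the example, since the real schema of tags.json and of the saved annotation file is not described in this guide.

```typescript
// Hypothetical data model: PEAT's real schema and annotation formats may differ.
interface Tag {
  name: string;
}

interface Annotation {
  id: number;
  page: number;
  text: string; // the highlighted text
  tag: string;  // must match a Tag name from the loaded schema
}

class AnnotationStore {
  private tags: Tag[];
  private annotations: Annotation[] = [];
  private nextId = 1;

  constructor(tags: Tag[]) {
    this.tags = tags;
  }

  // "Annotate PDF": attach a schema tag to a highlighted span of text.
  add(page: number, text: string, tag: string): Annotation {
    if (!this.tags.some((t) => t.name === tag)) {
      throw new Error(`Unknown tag: ${tag}`);
    }
    const annotation: Annotation = { id: this.nextId++, page, text, tag };
    this.annotations.push(annotation);
    return annotation;
  }

  // "Delete Annotations": remove the selected row; returns false if not found.
  remove(id: number): boolean {
    const before = this.annotations.length;
    this.annotations = this.annotations.filter((a) => a.id !== id);
    return this.annotations.length < before;
  }

  list(): readonly Annotation[] {
    return this.annotations;
  }

  // "Save Annotations": serialize to a machine-readable JSON string.
  serialize(): string {
    return JSON.stringify(this.annotations, null, 2);
  }

  // "Load Annotations": replace the current annotations with a saved set.
  load(json: string): void {
    const parsed = JSON.parse(json) as Annotation[];
    this.annotations = parsed;
    this.nextId = parsed.reduce((max, a) => Math.max(max, a.id), 0) + 1;
  }
}

// Usage: annotate two spans, save, delete one, then restore from the saved copy.
const store = new AnnotationStore([{ name: "chemical" }, { name: "dose" }]);
const first = store.add(1, "benzene", "chemical");
store.add(2, "10 mg/kg", "dose");
const saved = store.serialize();
store.remove(first.id);
store.load(saved);
```

Validating the tag at insertion time keeps every saved record consistent with the schema, which matters if the output is later consumed as machine learning training data.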

pdf-data-extraction's People

Contributors

chrstahl

Forkers

shapiromatron

pdf-data-extraction's Issues

Schema editor fixes

The schema editor appears to expect a different JSON format. Update it to the newest schema to fix the schema editing functions.
