SageMaker GroundTruth - Named Entity Recognition

This is a sample template for a SageMaker Ground Truth named entity recognition (NER) labeling solution. It has the following components:

  1. The HTML template that the workers will use to work on the task
  2. Lambda functions for pre and post processing rules.

Note: This template currently supports only one type of entity.

Preview

Known issues

  1. Does not support multi-word entities on the user interface.

Setup

  1. If you want to try this sample but don't have data to test it with, you can use the sample file tests/sample_input_data_pubtator.txt as input data to evaluate the workflow.

  2. Create pre and post processing lambda functions

    • Note: Using the naming convention SageMaker-* for your Lambda functions automatically gives SageMaker access to them when using the standard template. Otherwise you have to create an IAM policy that allows SageMaker to execute the Lambda function.

    • Create a Lambda function SageMaker-EntityAnnotationPreProcessing with the Python 3.6 runtime using the code in source/lambda_preprocess/preprocess_handler.py.

    • Create a Lambda function SageMaker-EntityAnnotationPostProcessing with the Python 3.6 runtime using the code in source/lambda_postprocess/postprocess_handler.py. Make sure this function has read access to the S3 bucket containing the results of the SageMaker Ground Truth job you are about to create.

  3. Configure SageMaker Ground Truth as follows:

    • Choose custom template in SageMaker Ground Truth.

    • In the custom template section, paste the HTML from source/template/entityrecognition.html.

    • In the Pre-labelling task lambda function, select SageMaker-EntityAnnotationPreProcessing.

    • In the Post-labelling task lambda function, select SageMaker-EntityAnnotationPostProcessing.
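For orientation, the two Lambda functions configured above follow the Ground Truth custom-workflow contract: the pre-labelling function turns one data object into the input for the HTML template, and the post-labelling function consolidates worker answers. The following is a minimal sketch of that contract, not the code in source/lambda_preprocess or source/lambda_postprocess. The function names, the "text" template variable, and the "take the first worker's answer" rule are assumptions made for illustration.

```python
import json


def pre_handler(event, context):
    """Pre-labelling sketch: pass the text of one data object to the template."""
    data_object = event["dataObject"]
    # Inline text arrives under "source"; an S3 reference would arrive
    # under "source-ref" instead and is not handled in this sketch.
    text = data_object.get("source", "")
    return {
        # "text" is an assumed variable name; it must match whatever
        # the HTML template actually reads from the task input.
        "taskInput": {"text": text},
        "isHumanAnnotationRequired": "true",
    }


def consolidate(payload, label_attribute_name):
    """Post-labelling sketch: keep the first worker's answer per data object.

    `payload` is the parsed JSON list that Ground Truth writes to the
    S3 URI given in event["payload"]["s3Uri"].
    """
    results = []
    for item in payload:
        first = item["annotations"][0]
        # annotationData.content is a JSON string produced by the HTML form.
        content = json.loads(first["annotationData"]["content"])
        results.append({
            "datasetObjectId": item["datasetObjectId"],
            "consolidatedAnnotation": {
                "content": {label_attribute_name: content}
            },
        })
    return results
```

A real post-labelling handler would first download the object at event["payload"]["s3Uri"] (which is why the function needs read access to the results bucket), parse it as JSON, and pass it to a function like consolidate together with event["labelAttributeName"].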


Run tests

export PYTHONPATH=./source
pytest
