AWS Sagemaker Workbench Demo

Status - Work in progress -- Non-Functional

This project is an experiment in designing custom data science workbenches on AWS Sagemaker.

The goals of the project are as follows:

  • Demonstrate how to loosely couple data engineering and modelling.
  • Illustrate how to train a combination of Sagemaker and bespoke models.
  • Perform model selection using a flexible, independent model comparison Notebook.
  • Deploy a chosen model.

Approach

We achieve this with a combination of convention, configuration, and prebuilt applications that depend on both.

  • Data is partitioned in an independent job, and every model should respect that partition.
  • Models are then built independently according to the data scientist's ideas and requirements.
  • Models are deployed to an endpoint and registered in order to permit comparison.
  • Comparison is performed using these endpoints on independent data.
  • After selection and final deployment, all artefacts are cleaned up to reduce costs.
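
One way to make a partition that every model can respect is to derive it deterministically from a record identifier, so that any job recomputes the same assignment without sharing state. The sketch below is illustrative only and is not taken from this repository: the function name `assign_partition`, the salt, and the 70/15/15 split are all assumptions.

```python
import hashlib

# Hypothetical cumulative split boundaries: 70% train, 15% validation, 15% test.
SPLITS = (("train", 0.70), ("validation", 0.85), ("test", 1.00))


def assign_partition(record_id: str, salt: str = "workbench-demo") -> str:
    """Map a record id to a partition name deterministically."""
    digest = hashlib.sha256(f"{salt}:{record_id}".encode("utf-8")).hexdigest()
    # Turn the first 8 hex digits into a fraction in [0, 1).
    fraction = int(digest[:8], 16) / 16**8
    for name, upper_bound in SPLITS:
        if fraction < upper_bound:
            return name
    return SPLITS[-1][0]


if __name__ == "__main__":
    # Count how many of 1000 made-up record ids land in each partition.
    counts = {}
    for i in range(1000):
        part = assign_partition(f"row-{i}")
        counts[part] = counts.get(part, 0) + 1
    print(counts)
```

Because the assignment depends only on the record id and the salt, a bespoke model and a Sagemaker training job can each filter the raw data and arrive at the same split.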

Key Conventions

Usage

Clone this repository into an instance of Sagemaker Studio.

There are then two usage pathways you can follow: the GUI/Notebook Workflow and the Script Workflow. Both rely on the same underlying scripts and configuration.

GUI/Notebook Workflow

Data Prep

Follow the Notebook data/prepare_data.ipynb to understand how we get the data and prepare it for modelling.

Experiments

Examples of modelling approaches are shown in the experiments directory.

The proposed flow is as follows:

  1. Build a Simple Baseline - Using a scikit-learn script.
  2. Build an XGBoost Model - Using a pre-built training job container.
  3. Run an Autopilot Job.

With these models built we can then explore their performance.

Analysis

The Model Comparisons Notebook allows you to compare any model that has been built following the conventions shown in the Experiments section.

This notebook makes extensive use of configuration and GUI widgets so that you can always return and perform additional comparisons after additional models have been run.
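
The comparison idea can be sketched as a registry of named predictors scored against the same held-out data. This is a minimal illustration under stated assumptions, not the notebook's actual code: the `registry` dictionary, the `compare_models` function, and the stand-in predictors are hypothetical, and in the real workflow each predictor would presumably wrap a call to a deployed endpoint.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical type: a predictor takes feature rows and returns class labels.
Predictor = Callable[[Sequence[Sequence[float]]], List[int]]


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot score an empty data set")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def compare_models(
    registry: Dict[str, Predictor],
    features: Sequence[Sequence[float]],
    labels: Sequence[int],
) -> List[Tuple[str, float]]:
    """Score every registered model on the same data, best first."""
    scores = [
        (name, accuracy(labels, predict(features)))
        for name, predict in registry.items()
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Stand-in predictors; real ones would invoke a registered endpoint.
def always_zero(rows: Sequence[Sequence[float]]) -> List[int]:
    return [0 for _ in rows]


def threshold(rows: Sequence[Sequence[float]]) -> List[int]:
    return [1 if row[0] > 0.5 else 0 for row in rows]


registry = {"baseline": always_zero, "threshold": threshold}
holdout_x = [[0.1], [0.9], [0.7], [0.2]]
holdout_y = [0, 1, 1, 0]
ranking = compare_models(registry, holdout_x, holdout_y)
print(ranking)
```

Adding a newly trained model to the comparison then only requires a new registry entry, which is what allows the comparison to be rerun as more models become available.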

Deployment

The Deployment Notebook demonstrates how to select any of the models built and create an endpoint. In some instances, additional configuration is required to add pre-processing to the endpoint.

Script Workflow

The same steps as above can be executed using the RUN script in the root of the repository. This script is parameterised so that you can run individual steps separately, or the entire process in sequence.

The goal of this workflow is to demonstrate how you might automate certain elements of your data science workflow and develop a code base that is easier to deploy.
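
A parameterised runner of this kind could look roughly like the sketch below. It is not the repository's RUN script: the step names, the `--steps` flag, and the placeholder step functions are assumptions chosen to mirror the four workflow stages described above.

```python
import argparse
from typing import Callable, Dict, List, Optional


# Placeholder steps; each would call the real project code in practice.
def prepare_data() -> str:
    return "data prepared"


def run_experiments() -> str:
    return "experiments run"


def compare_models_step() -> str:
    return "models compared"


def deploy_model() -> str:
    return "model deployed"


# Insertion order of this dict defines the pipeline order.
STEPS: Dict[str, Callable[[], str]] = {
    "prepare": prepare_data,
    "experiments": run_experiments,
    "analysis": compare_models_step,
    "deployment": deploy_model,
}


def run(selected: Optional[List[str]] = None) -> List[str]:
    """Run the selected steps, or every step when none are selected."""
    names = selected or list(STEPS)
    unknown = [name for name in names if name not in STEPS]
    if unknown:
        raise ValueError(f"unknown steps: {unknown}")
    # Execute in pipeline order regardless of the order requested.
    return [STEPS[name]() for name in STEPS if name in names]


def main(argv: Optional[List[str]] = None) -> None:
    parser = argparse.ArgumentParser(description="Run workflow steps.")
    parser.add_argument(
        "--steps",
        nargs="+",
        choices=list(STEPS),
        default=None,
        help="steps to run; omit to run the whole process in sequence",
    )
    args = parser.parse_args(argv)
    for message in run(args.steps):
        print(message)
```

Calling `main(["--steps", "prepare", "analysis"])` runs only those two steps, while `main([])` runs all four in order. Keeping the steps in one ordered mapping means the notebooks and the script can share the same underlying functions.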

Contributors

john-hawkins
