page_type: sample
ms.custom: team=cse
ms.contributors: prdeb-12/21/2021, anchugh-12/21/2021
languages: python
products: azure-databricks, azure-blob-storage, azure-monitor

Azure Databricks MLOps using MLflow

This is a template (sample) for MLOps on Python-based source code in Azure Databricks that uses MLflow without relying on MLflow Project.

This template provides the following features:

  • A way to run Python-based MLOps without MLflow Project while still using MLflow to manage the end-to-end machine learning lifecycle.
  • A sample machine learning source code structure with unit test cases.
  • A sample MLOps code structure with unit test cases.
  • A demo setup to try in the user's own subscription.
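
As a rough illustration of the first point, the sketch below shows how a plain Python function could be tracked through the MLflow tracking API directly, with no MLproject file. It is not taken from this repository: run_training, FakeTracking and train_taxi_fares are hypothetical names, the returned metric is a placeholder, and the tracking object is assumed to be the mlflow module (only set_experiment, start_run, log_param and log_metric are used).

```python
from contextlib import contextmanager


def run_training(tracking, experiment_name, params, train_fn):
    """Run a plain Python training function and record it with MLflow.

    `tracking` is assumed to be the `mlflow` module, or any object exposing
    set_experiment, start_run, log_param and log_metric with the same
    behaviour. Passing it in keeps the function easy to unit test.
    """
    tracking.set_experiment(experiment_name)
    with tracking.start_run():
        for key, value in params.items():
            tracking.log_param(key, value)
        # train_fn is an ordinary function, e.g. one shipped in a wheel package.
        metrics = train_fn(**params)
        for key, value in metrics.items():
            tracking.log_metric(key, value)
    return metrics


class FakeTracking:
    """Hypothetical stand-in for the mlflow module, for local tests only."""

    def __init__(self):
        self.calls = []

    def set_experiment(self, name):
        self.calls.append(("set_experiment", name))

    @contextmanager
    def start_run(self):
        self.calls.append(("start_run",))
        yield self

    def log_param(self, key, value):
        self.calls.append(("log_param", key, value))

    def log_metric(self, key, value):
        self.calls.append(("log_metric", key, value))


def train_taxi_fares(learning_rate, n_estimators):
    # Placeholder for real training code; the metric value is not a real result.
    return {"rmse": 0.0}
```

On a Databricks cluster one would pass the real module instead of the fake, for example run_training(mlflow, "<experiment path>", params, train_taxi_fares) after import mlflow.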

Problem Summary

Products/Technologies/Languages Used

  • Products & Technologies:
    • Azure Databricks
    • Azure Blob Storage
    • Azure Monitor
  • Languages:
    • Python

Architecture

Model Training

[Figure: model training architecture diagram]

Batch Scoring

[Figure: batch scoring architecture diagram]

Individual Components

  • ml_experiment - sample ML experiment notebook
  • ml_data - dummy data for the sample model
  • ml_ops - sample MLOps code with unit test cases, orchestrator, and deployment setup
  • ml_source - sample ML code with unit test cases
  • Makefile - build and test targets for the local environment
  • requirements.txt - Python dependencies

Getting Started

Prerequisites

Development

  1. git clone https://github.com/Azure-Samples/azure-databricks-mlops-mlflow.git
  2. cd azure-databricks-mlops-mlflow
  3. Open the cloned repository in a Visual Studio Code Remote Container
  4. Open a terminal in the Remote Container from Visual Studio Code
  5. Run make install to install the sample packages (taxi_fares and taxi_fares_mlops) locally
  6. Run make test to unit test the code locally

Package

  1. Run make dist to build the ML and MLOps wheel packages (taxi_fares and taxi_fares_mlops) locally

Deployment

  1. If any code has changed, run make databricks-deploy-code to deploy the Databricks Orchestrator Notebooks and the ML and MLOps Python wheel packages.
  2. If any job specs have changed, run make databricks-deploy-jobs to deploy the Databricks Jobs.

Run training and batch scoring

  1. To trigger training, execute make run-taxi-fares-model-training
  2. To trigger batch scoring, execute make run-taxi-fares-batch-scoring

NOTE: The Databricks environment must be created before deploying and running. To create a demo environment, follow the Demo chapter.
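
The package, deployment and run steps above can also be chained from a small Python driver, for example in a CI job. This is a minimal sketch and not part of the repository: only the make target names are taken from the steps above, while run_targets and its dry_run and runner parameters are assumptions made for illustration.

```python
import subprocess

# Target names as listed in the Package, Deployment and Run sections above.
DEPLOY_AND_RUN_TARGETS = [
    "dist",
    "databricks-deploy-code",
    "databricks-deploy-jobs",
    "run-taxi-fares-model-training",
    "run-taxi-fares-batch-scoring",
]


def run_targets(targets, dry_run=False, runner=subprocess.run):
    """Run make targets in order and stop at the first failure.

    With dry_run=True nothing is executed; the commands are only returned,
    which is handy for reviewing what would run.
    """
    commands = [["make", target] for target in targets]
    if not dry_run:
        for command in commands:
            # check=True raises CalledProcessError on a non-zero exit code.
            runner(command, check=True)
    return commands
```

Calling run_targets(DEPLOY_AND_RUN_TARGETS, dry_run=True) returns the command list without touching the workspace.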

Observability

Check logs, create alerts, etc. in Application Insights. The following sample Kusto queries check logs, traces, exceptions, etc.

  • Check for Error, Info, Debug Logs

    Kusto Query for checking general logs for a specific MLflow experiment, filtered by mlflow_experiment_id

    traces
    | extend mlflow_experiment_id = customDimensions.mlflow_experiment_id
    | where timestamp > ago(30m) 
    | where mlflow_experiment_id == <mlflow experiment id>
    | limit 1000

    Kusto Query for checking general logs for a specific Databricks job execution filtered by mlflow_experiment_id and mlflow_run_id

    traces
    | extend mlflow_run_id = customDimensions.mlflow_run_id
    | extend mlflow_experiment_id = customDimensions.mlflow_experiment_id
    | where timestamp > ago(30m) 
    | where mlflow_experiment_id == <mlflow experiment id>
    | where mlflow_run_id == "<mlflow run id>"
    | limit 1000
  • Check for Exceptions

    Kusto Query for checking the exception log, if any

    exceptions 
    | where timestamp > ago(30m)
    | limit 1000
  • Check for duration of different stages in MLOps

    Sample Kusto Query for checking duration of different stages in MLOps

    dependencies 
    | where timestamp > ago(30m) 
    | where cloud_RoleName == 'TaxiFares_Training'
    | limit 1000
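
The queries above rely on mlflow_experiment_id and mlflow_run_id being present in customDimensions. One possible way to attach them to every log record is sketched below with the standard logging module. It assumes an exporter that reads a custom_dimensions dictionary from the record's extra fields (the convention used by the opencensus-ext-azure AzureLogHandler); whether this sample uses that exporter is not stated here, and get_run_logger and ListHandler are hypothetical names.

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical in-memory handler standing in for a real exporter."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


def get_run_logger(name, mlflow_experiment_id, mlflow_run_id):
    """Return a logger that tags every record with the MLflow identifiers."""
    extra = {
        "custom_dimensions": {
            "mlflow_experiment_id": mlflow_experiment_id,
            "mlflow_run_id": mlflow_run_id,
        }
    }
    # LoggerAdapter passes `extra` along with each logging call, so the
    # dictionary ends up as the `custom_dimensions` attribute of the record.
    return logging.LoggerAdapter(logging.getLogger(name), extra)
```

With an Application Insights handler attached to the underlying logger instead of ListHandler, each trace would carry the two identifiers that the queries filter on.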

To correlate dependencies, exceptions and traces, operation_Id can be used as a filter in the Kusto queries above.
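
When the same queries are needed repeatedly, the filters can be assembled programmatically. The helper below is only a sketch: build_traces_query and its parameters are hypothetical, and it simply reproduces the text of the traces queries above with an optional operation_Id filter. It assumes the experiment id is numeric and the run id is a string, as the placeholders in those queries suggest.

```python
def build_traces_query(experiment_id, run_id=None, operation_id=None,
                       window="30m", limit=1000):
    """Build a Kusto query over `traces` filtered by MLflow identifiers."""
    for value in (run_id, operation_id):
        # Refuse values that would break out of the quoted string literal.
        if value is not None and '"' in str(value):
            raise ValueError("identifiers must not contain double quotes")

    lines = [
        "traces",
        "| extend mlflow_run_id = customDimensions.mlflow_run_id",
        "| extend mlflow_experiment_id = customDimensions.mlflow_experiment_id",
        f"| where timestamp > ago({window})",
        f"| where mlflow_experiment_id == {int(experiment_id)}",
    ]
    if run_id is not None:
        lines.append(f'| where mlflow_run_id == "{run_id}"')
    if operation_id is not None:
        lines.append(f'| where operation_Id == "{operation_id}"')
    lines.append(f"| limit {int(limit)}")
    return "\n".join(lines)
```

The returned string can be pasted into the Application Insights Logs blade; passing operation_id narrows the result to a single correlated operation.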

Demo

  1. Create Databricks workspace, a storage account (Azure Data Lake Storage Gen2) and Application Insights
    1. Create an Azure Account
    2. Deploy resources from custom ARM template
  2. Initialize Databricks (create cluster, base workspace, MLflow experiment, secret scope)
    1. Get Databricks CLI Host and Token
    2. Authenticate the Databricks CLI: make databricks-authenticate
    3. Execute make databricks-init
  3. Create Azure Data Lake Storage Gen2 Container and upload data
    1. Create an Azure Data Lake Storage Gen2 container named taxifares
    2. Upload the taxi-fares data files as blobs into the Azure Data Lake Storage Gen2 container named taxifares
  4. Put secrets to mount ADLS Gen2 storage using a Shared Access Key
    1. Get Azure Data Lake Storage Gen2 account name created in step 1
    2. Get Shared Key for Azure Data Lake Storage Gen2 account
    3. Execute make databricks-secrets-put to put secret in Databricks secret scope
  5. Put Application Insights Key as a secret in Databricks secret scope (optional)
    1. Get Application Insights Key created in step 1
    2. Execute make databricks-add-app-insights-key to put secret in Databricks secret scope
  6. Package and deploy into Databricks (Databricks Jobs, Orchestrator Notebooks, ML and MLOps Python wheel packages)
    1. Execute make deploy
  7. Run Databricks Jobs
    1. To trigger training, execute make run-taxifares-model-training
    2. To trigger batch scoring, execute make run-taxifares-batch-scoring
  8. Expected results
    1. Azure resources
    2. Databricks jobs
    3. Databricks MLflow experiment
    4. Databricks MLflow model registry
    5. Output of batch scoring

Additional Details

  1. Continuous Integration (CI) & Continuous Deployment (CD)
  2. Registered Models Stages and Transitioning

Related resources

  1. Azure Databricks
  2. MLflow
  3. MLflow Project
  4. Run MLflow Projects on Azure Databricks
  5. Databricks Widgets
  6. Databricks Notebook-scoped Python libraries
  7. Databricks CLI
  8. Azure Data Lake Storage Gen2
  9. Application Insights
  10. Kusto Query Language

Glossaries

  1. Application developer: a role that works mainly on operationalizing machine learning.
  2. Data scientist: a role that performs the data science parts of the project.

Contributors

anandchugh, dependabot[bot], microsoft-github-operations[bot], microsoftopensource, mtrilbybassett, prabdeb

azure-databricks-mlops-mlflow's Issues

Can we get this demo document with Azure Devops?

Please provide us with the following information:

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Any log messages given by the failure

Expected/desired behavior

An Azure DevOps integrated demo, rather than an offline demo, so that it can be used directly in projects.

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)

Versions

Mention any other details that might be useful


Thanks! We'll be in touch soon.

Added custom logger class

Doc structure change in main Readme, Individual Components

ML Feature Eng Code and Unit tests

Fix make databricks-authenticate

Please provide us with the following information:

This issue is for a: (mark with an x)

- [ x ] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

git clone https://github.com/Azure-Samples/azure-databricks-mlops-mlflow.git
make databricks-authenticate

Any log messages given by the failure

$ make databricks-authenticate
Authenticate Databricks CLI
Follow https://docs.microsoft.com/en-us/azure/databricks/dev-tools/cli/ for getting Host and token value
Taking Backup of .databrickscfg file in .env/databrickscfg
Creating env script file for mlflow
databricks configure --token
Databricks Host (should begin with https://): https://adb-xxx.x.azuredatabricks.net
Token:
cp ~/.databrickscfg .env/.databrickscfg
cp: cannot create regular file ‘.env/.databrickscfg’: No such file or directory
make: *** [databricks-authenticate] Error 1

Expected/desired behavior

$ make databricks-authenticate
Authenticate Databricks CLI
Follow https://docs.microsoft.com/en-us/azure/databricks/dev-tools/cli/ for getting Host and token value
Taking Backup of .databrickscfg file in .env/databrickscfg
Creating env script file for mlflow
databricks configure --token
Databricks Host (should begin with https://): https://adb-xxx.x.azuredatabricks.net
Token:
mkdir .env
cp ~/.databrickscfg .env/.databrickscfg
DATABRICKS_HOST="$(cat ~/.databrickscfg | grep '^host' | cut -d' ' -f 3)";
DATABRICKS_TOKEN="$(cat ~/.databrickscfg | grep '^token' | cut -d' ' -f 3)";
echo "export MLFLOW_TRACKING_URI=databricks"> .env/.databricks_env.sh;
echo "export DATABRICKS_HOST=$DATABRICKS_HOST" >> .env/.databricks_env.sh;
echo "export DATABRICKS_TOKEN=$DATABRICKS_TOKEN" >> .env/.databricks_env.sh

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)
RHEL 7.9

Versions

Mention any other details that might be useful

It seems the .env directory needs to be created first. Adding a simple mkdir .env statement before the cp solves this.
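
For comparison, the same steps can be sketched in Python, where creating the directory idempotently avoids the failure shown in the log. This is an illustration only and not the Makefile's implementation: write_mlflow_env is a hypothetical name, and the sketch assumes that host and token live in the [DEFAULT] section of .databrickscfg.

```python
import configparser
from pathlib import Path


def write_mlflow_env(cfg_path, env_dir):
    """Back up .databrickscfg and write an env script for MLflow."""
    cfg_path = Path(cfg_path)
    env_dir = Path(env_dir)

    # interpolation=None keeps '%' characters in tokens from being interpreted.
    parser = configparser.ConfigParser(interpolation=None)
    with cfg_path.open() as handle:
        parser.read_file(handle)
    defaults = parser.defaults()
    host, token = defaults["host"], defaults["token"]

    # Equivalent of `mkdir -p .env`: no error if the directory already exists.
    env_dir.mkdir(parents=True, exist_ok=True)
    (env_dir / ".databrickscfg").write_text(cfg_path.read_text())

    script = env_dir / ".databricks_env.sh"
    script.write_text(
        "export MLFLOW_TRACKING_URI=databricks\n"
        f"export DATABRICKS_HOST={host}\n"
        f"export DATABRICKS_TOKEN={token}\n"
    )
    return script
```

Because mkdir is called with exist_ok=True, running the function a second time does not fail the way a bare mkdir .env would.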


Thanks! We'll be in touch soon.
