page_type | languages | products
---|---|---
sample | |
This sample sets up a pipeline to train, package, and deploy machine learning models to IoT Edge devices. The pipeline has two phases:
- Training on AzureML - Train the Tiny YOLO v3 model in Azure Machine Learning and convert it to ONNX.
- Continuous Deployment - Build Docker container images and deploy them to NVIDIA Jetson devices on Azure IoT Edge.
Specifically, you will learn how to:
- Train a Tiny YOLO v3 model using AzureML
- Set up an NVIDIA Jetson Nano as a Linux self-hosted DevOps agent for building the Edge solution
- Trigger a release pipeline for continuous deployment when a newly trained model is registered in the AzureML model registry
This sample builds on a Keras implementation of YOLOv3 (TensorFlow backend) inspired by allanzelener/YAD2K. It reuses the recipes from the [qqwweee/keras-yolo3](https://github.com/qqwweee/keras-yolo3) repo to train a keras-yolo3 model on the Pascal VOC dataset using AzureML. If you are familiar with the original repository, you may want to jump right to the section Train On AzureML below.
- Set up your Azure account: An Azure subscription (with pre-paid credits or billing through existing payment channels) is required for this sample. Create the account in the Azure portal using this tutorial. (If you are creating an account for the first time, you get 12 months free and $200 in credits to start with.)
- Devices: This sample requires at least two NVIDIA Jetson devices. Read more about the NVIDIA Jetson Developer Kit here.
We will use one of the Jetson devices as the Azure DevOps self-hosted agent to run the jobs in the DevOps pipeline; this is the Dev machine. The other Jetson device(s) will be used to deploy the IoT application containers; we will refer to these as the Test Device(s).
Note: If you are ordering these devices, we recommend you get power adapters (rather than relying on USB as a power source) and a wireless antenna (unless you are fine with using ethernet).
Note: If you don't have an NVIDIA Jetson device, this sample still demonstrates how to train a YOLO v3 model using AzureML. For deployment, you could switch the deployment platform from ARM to amd64 to, for example, deploy to your workstation for testing.
- Before you create a DevOps release pipeline, we recommend that you familiarize yourself with this easy-to-use getting-started sample to deploy an ML model manually to an IoT Edge device like the Jetson.
In this step we will use weights for Tiny YOLO v3 from the original release by Darknet. This lets us fine-tune a model that has already been trained on another dataset, rather than training the model from scratch. We first convert the weights to Keras format (h5 extension) for training with the TensorFlow backend. We can then fine-tune the model on the VOC dataset. After training, the trained model is converted to ONNX so that we can deploy it in different execution environments.
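The export step above can be sketched as follows. This is a minimal illustration, assuming tf2onnx is installed (`pip install tf2onnx`); the checkpoint filename is hypothetical, and the actual conversion in this sample is done inside the training notebook.

```python
# Sketch of the Keras (.h5) -> ONNX export flow described above.
# Assumes tf2onnx is available; file names are illustrative only.
import os


def onnx_path_for(keras_path: str) -> str:
    """Derive the ONNX output path from a Keras .h5 checkpoint path."""
    root, _ = os.path.splitext(keras_path)
    return root + ".onnx"


def export_to_onnx(keras_path: str) -> str:
    """Load a fine-tuned Keras model and export it to ONNX with tf2onnx."""
    import tensorflow as tf
    import tf2onnx

    model = tf.keras.models.load_model(keras_path, compile=False)
    output = onnx_path_for(keras_path)
    # tf2onnx converts the Keras model graph and writes the .onnx file.
    tf2onnx.convert.from_keras(model, output_path=output)
    return output


# Usage (on a machine with the trained checkpoint present):
# export_to_onnx("tiny_yolo_voc.h5")
```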
Create your Azure Machine Learning Workspace. (You can skip this step if you already have a workspace setup.)
Setup the Jupyter Notebook Environment in Azure Machine Learning Workspace to run your Jupyter Notebooks directly in your workspace in AML studio.
Clone this repo to your AML Workspace to run the training notebook. You can use regular CLI commands from the Notebook Terminal in AML, e.g. `git clone --recursive https://github.com/Azure-Samples/AzureDevOps-onnxrutime-jetson`, to clone this repository into a desired folder in your workspace.
Get Started to Train: Open the notebook Training-keras-yolo3-AML.ipynb and start executing the cells to train the Tiny YOLO model.
In this step we will create the pipeline of steps to build the Docker images for the Jetson devices. We will use Azure DevOps to create this pipeline.
Go to https://dev.azure.com/ and create a new organization and project.
The Dev machine is set up to run the jobs for the CI/CD pipeline. Since the test device is an Ubuntu/ARM64 platform, we need to build the ARM64 Docker images on a host with the same hardware configuration. An alternative approach is to set up a Docker cross-build environment in Azure, but that is beyond the scope of this tutorial and not fully validated for the ARM64 configuration.
Install the self-hosted Azure DevOps agent. Follow the instructions on this page: https://docs.microsoft.com/en-us/azure/devops/pipelines/agents/v2-linux?view=azure-devops.
The IoT Edge Dev Tool greatly simplifies Azure IoT Edge development down to simple commands driven by environment variables. We recommend that you install the tool manually on the ARM64 device: https://github.com/Azure/iotedgedev/wiki/manual-dev-machine-setup
We will use the AzureML SDK for Python to download the model to the DevOps agent.
You are welcome to install the SDK system-wide. Alternatively, you might want to install it inside a Conda environment for easier housekeeping.
Because there is no official release of Anaconda/Miniconda for ARM64 devices, we recommend that you use Archiconda.
Then, you can install the SDK like so:
```shell
conda create -n onnx python=3.7
conda activate onnx
pip install -U pip
pip install azureml-core
```
A Service Principal enables non-interactive authentication, without a specific user login. This is useful for running a machine learning workflow as an automated process.
Note that you must have administrator privileges over the Azure subscription to complete these steps.
Follow the instructions from the section Service Principal Authentication in this notebook to create a service principal for your project. We recommend scoping the Service Principal to the Resource Group.
Note the configuration details of your AML Workspace in a `config.json` file. This file will later be needed for your Release Pipeline to have access to your AzureML workspace. You can find most of the info in the Azure Portal.
```json
{
    "subscription_id": "subscription_id",
    "resource_group": "resource_group",
    "workspace_name": "workspace_name",
    "workspace_region": "workspace_region",
    "service_principal_id": "service_principal_id",
    "service_principal_password": "service_principal_password",
    "tenant_id": "tenant_id"
}
```
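The config file above is what the pipeline scripts use to authenticate. A minimal sketch of that, assuming azureml-core is installed (the config path is illustrative):

```python
# Sketch: authenticate non-interactively with the settings from config.json.
import json


def load_aml_config(path: str) -> dict:
    """Read the workspace/service-principal settings from config.json."""
    with open(path) as f:
        cfg = json.load(f)
    required = {"subscription_id", "resource_group", "workspace_name",
                "service_principal_id", "service_principal_password",
                "tenant_id"}
    missing = required - cfg.keys()
    if missing:
        raise ValueError(f"config.json is missing keys: {sorted(missing)}")
    return cfg


def get_workspace(cfg: dict):
    """Return the AzureML Workspace via Service Principal authentication."""
    from azureml.core import Workspace
    from azureml.core.authentication import ServicePrincipalAuthentication

    auth = ServicePrincipalAuthentication(
        tenant_id=cfg["tenant_id"],
        service_principal_id=cfg["service_principal_id"],
        service_principal_password=cfg["service_principal_password"],
    )
    return Workspace(
        subscription_id=cfg["subscription_id"],
        resource_group=cfg["resource_group"],
        workspace_name=cfg["workspace_name"],
        auth=auth,
    )


# Usage (inside a pipeline step, after the secure file is copied):
# ws = get_workspace(load_aml_config("aml/config.json"))
```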
Important: Add `service_principal_id`, `service_principal_password`, and `tenant_id` to the `config.json` file above. You can then upload the `config.json` file to the secure file library of your DevOps project. Make sure to enable all pipelines to have access to the secure file.
Add `config.json` to the Library of secure files in the Azure DevOps project. Select the Pipelines icon on the left, then Library. In your library, go to Secure files and click + Secure File. Upload the `config.json` file and make sure to allow all pipelines to use it.
Next, we configure the project so that the release pipeline has access to the code in the GitHub repo, to your AzureML Workspace, and to the Azure Container Registry (for Docker images).
Go to the settings of your project, then Service Connections, and click on New Service Connection.
- Create one Service Connection of type GitHub.
- Create one of type Azure Resource Manager, using the Service Principal credentials from above.
You can install the MLOps extension from here: https://marketplace.visualstudio.com/items?itemName=ms-air-aiagility.vss-services-azureml.
Now we can build the Release pipeline for the project by selecting Create Pipeline under Pipelines in the Azure DevOps project.
Connect Artifacts

Our pipeline is connected to two Artifacts:
- your fork of this GitHub repository, and
- your Model Registry from the AzureML Workspace.

You can add these by clicking the + Add button next to Artifacts.
The final pipeline should look like this:

When the pipeline is triggered, it will execute the tasks in Stage 1:
Let's go through the steps:
Download Secure file

The `config.json` is downloaded from the Secure Library of our DevOps project so the pipeline can authenticate with the different Azure services. We called our file wopauli_onnx_config.json in this example. Feel free to give it a different name; it helps to add some kind of identifier, in case you have other release pipelines that work with other AzureML Workspaces or Service Principals.
Copy Secure file

Copy the `config.json` file from the Agent.TempDirectory into the aml folder of the cloned code repository (`cp $(Agent.TempDirectory)/wopauli_onnx_config.json ./_wmpauli_onnxruntime-iot-edge/aml/config.json`).
Agent.TempDirectory is a predefined variable. Check out what other predefined variables exist: https://docs.microsoft.com/en-us/azure/devops/pipelines/build/variables
Download Model from AzureML Model Registry

This pipeline is triggered when a new model is available in the AML Model Registry. We use the AzureML SDK for Python to download the latest model from the Model Registry. This is the ONNX model we saved as the last step of training above (`$(System.DefaultWorkingDirectory)/_wmpauli_onnxruntime-iot-edge/aml/download_model.py`).
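The core of such a download script looks roughly like this. A sketch only, assuming azureml-core is installed and an authenticated `Workspace` object; the model name is illustrative and must match the name you registered.

```python
# Sketch of downloading the latest registered model with the AzureML SDK.
import os


def downloaded_model_dir(base_dir: str, model_name: str) -> str:
    """Folder the model files are downloaded into."""
    return os.path.join(base_dir, model_name)


def download_latest_model(ws, model_name: str, base_dir: str = ".") -> str:
    """Fetch the newest version of the named model from the registry."""
    from azureml.core.model import Model

    target = downloaded_model_dir(base_dir, model_name)
    # Constructing Model without a version resolves to the latest version.
    model = Model(ws, name=model_name)
    model.download(target_dir=target, exist_ok=True)
    return target


# Usage (ws is an authenticated Workspace; name is hypothetical):
# download_latest_model(ws, "tiny_yolo_onnx")
```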
Build Modules

Next, we rebuild the IoT modules (Docker images) of our solution to include the new ONNX model. Make sure you point the task to the correct `deployment.template.json` file, and pick the correct Default platform and Action.
Push Modules to ACR

After the modules are built, we push them to the Azure Container Registry. You can use the Azure Container Registry that was created along with your Azure ML Workspace above. As Azure Subscription, pick the Service Connection you created above to connect to your workspace.
Deploy to Edge Device
The last step in the pipeline is to deploy the modules to the Edge device.
Now you can run `src/model_registration.py` to manually trigger a run of this release pipeline.
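The essence of such a registration script is a single SDK call; registering a new model version is what fires the continuous deployment trigger. A sketch, with illustrative path and name (the actual script lives at src/model_registration.py):

```python
# Sketch: registering a model in the AzureML Model Registry.
# Each call creates a new version, which triggers the release pipeline.
def register_model(ws, model_path: str = "model.onnx",
                   model_name: str = "TinyYOLO"):
    """Upload and register the ONNX model; returns the registered Model."""
    from azureml.core.model import Model

    return Model.register(
        workspace=ws,
        model_path=model_path,   # local file (or folder) to upload
        model_name=model_name,   # registry entry; version is auto-incremented
        description="Tiny YOLO v3 trained on Pascal VOC, exported to ONNX",
    )


# Usage (ws is an authenticated Workspace):
# register_model(ws, model_path="aml/model.onnx")
```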
Make sure to click the Lightning Icon (continuous deployment trigger) on the Artifact for _TinyYOLO (i.e. the Model Registry in AzureML), so that the release pipeline is triggered when you register a new model in the Model Registry. You can do that from Pipeline -> Release -> Edit.
You can expand this sample to address more complex Machine Learning scenarios.
- Use AzureML MLOps to increase the efficiency of your ML workflows.
- Deploy to different hardware configurations by using different Docker base images in the Build step. You can update the Dockerfiles to change the base image while reusing the same application code and model.
Contributing
==============================
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.
When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repositories using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.
Security issues and bugs should be reported privately, via email, to the Microsoft Security Response Center (MSRC) at [email protected]. You should receive a response within 24 hours. If for some reason you do not, please follow up via email to ensure we received your original message. Further information, including the MSRC PGP key, can be found in the Security TechCenter.
Copyright (c) Microsoft Corporation. All rights reserved.
MIT License
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.