nm-gcp-vertex

Example using DeepSparse in a GCP Vertex model serving endpoint for Sentiment Analysis

GCP has great documentation on using a custom container to create a Model for usage in inference Endpoints. Let's see how we can use this functionality to run DeepSparse in a Vertex Endpoint!

Overview

The following steps are required to provision and deploy DeepSparse to GCP Vertex for inference:

  • Create a new Project in the GCP Console
  • Create an Artifact Repository to host a Docker image with DeepSparse
  • Build a Docker Image and push to the Artifact Repository
  • Create a Vertex Model with the hosted Docker image
  • Build a Vertex Endpoint for model serving
  • Deploy the Model to the Endpoint

We will use the gcloud CLI for each step.

1. Create a New Project In the GCP Console

From the GCP Console, create a new project. We named our project gcp-vertex-deepsparse-example.

Set the PROJECT_ID environment variable on your machine to match the ID of the project. In our case:

export PROJECT_ID=gcp-vertex-deepsparse-example

Once the project is created, switch to the project on your machine with the gcloud CLI command:

gcloud config set project $PROJECT_ID

Choose the region you want to run in and set it as an environment variable. For example:

export REGION=us-east1
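Every later command depends on PROJECT_ID and REGION, so it can help to check that both are set before continuing. The following is a minimal sketch, not part of the original example; the function name require_vars is hypothetical.

```shell
# Return non-zero and name the first variable that is unset or empty.
# require_vars is a hypothetical helper, not part of gcloud or DeepSparse.
require_vars() {
  for name in "$@"; do
    # Indirect lookup that works in POSIX sh as well as bash.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      return 1
    fi
  done
}

# Example: require_vars PROJECT_ID REGION && gcloud config set project "$PROJECT_ID"
```

Calling require_vars PROJECT_ID REGION at the top of a script stops it early instead of letting a later gcloud command fail with an empty argument.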

2. Create Artifact Repository

  • Configure Docker to authenticate with Artifact Registry through gcloud
gcloud auth configure-docker $REGION-docker.pkg.dev
  • Enable the Artifact Registry API
gcloud services enable artifactregistry.googleapis.com

Wait a few minutes for this to propagate through GCP's systems.

  • Create Artifact Repository with Docker format
gcloud artifacts repositories create deepsparse-server-images --repository-format=docker --location=$REGION
  • Build Docker Image Locally

We provide a Dockerfile that downloads DeepSparse and launches the server based on the provided server-config.yaml file. To use a different model or Server configuration, update server-config.yaml as needed.

docker build -t $REGION-docker.pkg.dev/$PROJECT_ID/deepsparse-server-images/sentiment-analysis .

  • Push Docker Image to Artifact Repository

docker push $REGION-docker.pkg.dev/$PROJECT_ID/deepsparse-server-images/sentiment-analysis
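The same image URI appears in the build, push, and model upload commands, so composing it once avoids typos. This is a small sketch under that assumption; image_uri and IMAGE_URI are hypothetical names introduced here for illustration.

```shell
# Compose the Artifact Registry image URI from its parts.
# image_uri is a hypothetical helper; the URI layout matches the commands above.
image_uri() {
  # $1 = region, $2 = project ID, $3 = repository, $4 = image name
  printf '%s-docker.pkg.dev/%s/%s/%s\n' "$1" "$2" "$3" "$4"
}

# Example (requires Docker and the repository created earlier):
# IMAGE_URI=$(image_uri "$REGION" "$PROJECT_ID" deepsparse-server-images sentiment-analysis)
# docker build -t "$IMAGE_URI" . && docker push "$IMAGE_URI"
```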

Setup Model and Endpoint

Enable Vertex AI

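Navigate to the Vertex AI Dashboard and click "Enable Recommended APIs." The Vertex AI API itself can also be enabled from the CLI with gcloud services enable aiplatform.googleapis.com, but the dashboard button turns on the other recommended APIs in the same step.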

Create a Model:

The GCP Docs specify a few arguments that must be passed, including the region, the location of the container in the Artifact Repository, the port used by the Server in the container, the health route, and the prediction route. We configured DeepSparse Server to use port 5543 and the routes /health and /predict (these are the defaults), so we pass these values when creating the Model.

gcloud ai models upload --region=$REGION --display-name=sparse-sentiment-analysis --container-image-uri=$REGION-docker.pkg.dev/$PROJECT_ID/deepsparse-server-images/sentiment-analysis --container-ports=5543 --container-health-route=/health --container-predict-route=/predict

This command may fail with an error about IAM policies:

ERROR: (gcloud.ai.models.upload) FAILED_PRECONDITION: Vertex AI Service Agent service-XXX@gcp-sa-aiplatform.iam.gserviceaccount.com does not have permission to access Artifact Registry repository projects/gcp-vertex-deepsparse-example/locations/us-east1/repositories/deepsparse-server-images.

Set the Service Agent named in the error message as an environment variable:

export SERVICE_AGENT=service-XXX@gcp-sa-aiplatform.iam.gserviceaccount.com

Grant access to the repository as a reader:

gcloud artifacts repositories add-iam-policy-binding deepsparse-server-images --location=$REGION --member=serviceAccount:$SERVICE_AGENT --role=roles/artifactregistry.reader

Re-running the upload command should now work.
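The Service Agent address can also be built rather than copied from the error message. The sketch below assumes the XXX part of the address is the numeric project number, which is how Vertex AI names its service agent; the function names are hypothetical.

```shell
# Build the Vertex AI Service Agent address from a project number.
# Assumption: the "XXX" in the error message is the numeric project number.
vertex_service_agent() {
  printf 'service-%s@gcp-sa-aiplatform.iam.gserviceaccount.com\n' "$1"
}

# Look up the project number and export SERVICE_AGENT.
# Requires gcloud and PROJECT_ID; nothing runs until the function is called.
set_service_agent() {
  project_number=$(gcloud projects describe "$PROJECT_ID" \
    --format='value(projectNumber)') || return 1
  SERVICE_AGENT=$(vertex_service_agent "$project_number")
  export SERVICE_AGENT
}
```

After calling set_service_agent, the add-iam-policy-binding command above can be run unchanged.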

Create An Endpoint

gcloud ai endpoints create --region=$REGION --display-name=deepsparse-endpoint

This will take a few minutes to complete.

Deploy Model to the Endpoint

Get Model and Endpoint IDs

gcloud ai models list --region=$REGION
gcloud ai endpoints list --region=$REGION

Save the numbers that appear in the ENDPOINT_ID and MODEL_ID columns as environment variables, such as:

export ENDPOINT_ID=5488427794322948096
export MODEL_ID=4905880145648156672

Deploy the Model to the Endpoint:

gcloud ai endpoints deploy-model $ENDPOINT_ID --region=$REGION --model=$MODEL_ID --display-name=sparse-model --machine-type=n1-highcpu-8 --min-replica-count=1

This will take a few minutes to complete. When the command finishes, it prints the ID of the deployed model. Save it as an environment variable:

export DEPLOYED_MODEL_ID=7899507260454338560
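When scripting these steps, ENDPOINT_ID and MODEL_ID can be extracted from the full resource names instead of being copied from the list output. This is a sketch that assumes the display names used above are unique in the region; resource_id and lookup_ids are hypothetical helper names.

```shell
# Strip everything up to the last "/" of a full resource name,
# e.g. projects/P/locations/R/endpoints/123 -> 123.
resource_id() {
  printf '%s\n' "${1##*/}"
}

# Look up both IDs by display name and export them.
# Requires gcloud and REGION; assumes each display name matches one resource.
lookup_ids() {
  endpoint_name=$(gcloud ai endpoints list --region="$REGION" \
    --filter='displayName=deepsparse-endpoint' --format='value(name)') || return 1
  model_name=$(gcloud ai models list --region="$REGION" \
    --filter='displayName=sparse-sentiment-analysis' --format='value(name)') || return 1
  ENDPOINT_ID=$(resource_id "$endpoint_name")
  MODEL_ID=$(resource_id "$model_name")
  export ENDPOINT_ID MODEL_ID
}
```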

Note that our endpoint does not use any accelerators and is running on a standard CPU-only instance.

We are ready to start performing inferences!

Send Requests To The Server

The GCP Raw Prediction API enables you to send an arbitrary HTTP payload.

Create a JSON file to send to the server. As per the documentation, the Sentiment Analysis Pipeline expects an array of sequences. The file name is up to you; the command below assumes it is saved as input.json.

{"sequences": ["The man dislikes going to the store", "The man loves going to the store"]}

Send A Request to the Endpoint:

gcloud ai endpoints raw-predict $ENDPOINT_ID --region=$REGION --request=@input.json

# {"labels":["LABEL_0","LABEL_1"],"scores":[0.9987145662307739,0.9933835864067078]}
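The same request can also be sent straight to the Vertex AI REST endpoint, which is useful for clients that do not have gcloud in the request path. This is a hedged sketch: it assumes curl is installed and gcloud is used only to obtain an access token, and both function names are hypothetical.

```shell
# Build the rawPredict URL for an endpoint.
# $1 = region, $2 = project ID, $3 = endpoint ID
raw_predict_url() {
  printf 'https://%s-aiplatform.googleapis.com/v1/projects/%s/locations/%s/endpoints/%s:rawPredict\n' \
    "$1" "$2" "$1" "$3"
}

# POST a JSON file to the endpoint.
# $1 = path to the JSON request file; uses REGION, PROJECT_ID, ENDPOINT_ID.
send_raw_predict() {
  curl -s -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    --data-binary "@$1" \
    "$(raw_predict_url "$REGION" "$PROJECT_ID" "$ENDPOINT_ID")"
}
```

Calling send_raw_predict with the path of the JSON file created above should return the same labels and scores as the gcloud command.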

We successfully made a prediction!

Cleaning Up

Undeploy your Model from the Endpoint.

gcloud ai endpoints undeploy-model $ENDPOINT_ID --region=$REGION --deployed-model-id $DEPLOYED_MODEL_ID

Delete your Endpoint.

gcloud ai endpoints delete $ENDPOINT_ID --region=$REGION

Delete your Model.

gcloud ai models delete $MODEL_ID --region=$REGION
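The three cleanup commands have to run in this order, so they can be wrapped in one function. The sketch below adds a dry-run switch so the commands can be reviewed before anything is deleted; run, cleanup, and DRY_RUN are hypothetical names, and --quiet is the standard gcloud flag that skips confirmation prompts.

```shell
# Print the command instead of running it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Undeploy, then delete the Endpoint, then delete the Model.
# Stops at the first failing step so later deletes are not attempted.
cleanup() {
  run gcloud ai endpoints undeploy-model "$ENDPOINT_ID" --region="$REGION" \
    --deployed-model-id="$DEPLOYED_MODEL_ID" &&
  run gcloud ai endpoints delete "$ENDPOINT_ID" --region="$REGION" --quiet &&
  run gcloud ai models delete "$MODEL_ID" --region="$REGION" --quiet
}
```

Running DRY_RUN=1 cleanup prints the three commands without executing them; running cleanup on its own performs the deletion.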

Next Steps

Refer to the GCP documentation for more information on deploying custom models with Vertex.

Refer to the GCP documentation for more information on compute resources and autoscaling your endpoints.
