
bento_emotion

Steps

Installation

Install requirements (Python 3.8)

pip install -r requirements.txt

Load models into bento

python load-model.py

Run prediction service

Create bento endpoint on port 3000

BENTOML_CONFIG=./cfg.yml bentoml serve --production

Or: Run containerized prediction service

Build the bento

$ bentoml build
Successfully built Bento(tag="emotion_recognition:zd2vyafvicgucaav").

Docker

Generate Docker image

$ bentoml containerize emotion_recognition:zd2vyafvicgucaav
Successfully built Bento container for "emotion_recognition:zd2vyafvicgucaav" with tag(s) "emotion_recognition:zd2vyafvicgucaav"

$ docker run -p 3000:3000 --rm -v $(pwd)/cfg.yml:/home/bentoml/configuration.yml \
             -e BENTOML_CONFIG=/home/bentoml/configuration.yml \
             emotion_recognition:zd2vyafvicgucaav serve --production
Starting production BentoServer from "emotion_recognition:zd2vyafvicgucaav" running on http://0.0.0.0:3000

Kubernetes

Install the Yatai components and follow their instructions to push the bento to Yatai:

$ bentoml yatai login --api-token {YOUR_TOKEN} --endpoint http://127.0.0.1:8080
$ bentoml push emotion_recognition:zd2vyafvicgucaav
│ Successfully pushed model "emotion:uouhv7vvg2ka6aav"
│ Successfully pushed model "blazeface_back:uprkd2fvg2ka6aav"
│ Successfully pushed bento "emotion_recognition:zd2vyafvicgucaav" 

Go to the deployments page at http://127.0.0.1:8080/deployments, click the Create button, and follow the instructions in the UI. Paste the following into the BentoML configuration text field:

runners.timeout=1 runners.batching.enabled=true runners.batching.max_batch_size=100 runners.batching.max_latency_ms=60
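
The dotted keys above address nested settings. The following is a minimal sketch, assuming each dotted key maps onto the nested layout of a BentoML configuration file such as cfg.yml (whose contents are not shown in this README). parse_dotted_options is a hypothetical helper written for illustration and is not part of BentoML or Yatai.

```python
def parse_dotted_options(text):
    """Turn 'a.b=1 a.c.d=true' into {'a': {'b': 1, 'c': {'d': True}}}."""
    config = {}
    for item in text.split():
        dotted_key, raw_value = item.split("=", 1)
        # Interpret booleans and integers; leave everything else as a string.
        if raw_value in ("true", "false"):
            value = raw_value == "true"
        else:
            try:
                value = int(raw_value)
            except ValueError:
                value = raw_value
        # Walk (and create) the parent dictionaries, then set the leaf key.
        *parents, leaf = dotted_key.split(".")
        node = config
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return config


options = (
    "runners.timeout=1 runners.batching.enabled=true "
    "runners.batching.max_batch_size=100 runners.batching.max_latency_ms=60"
)
nested = parse_dotted_options(options)
print(nested)
```

Read this way, timeout sits directly under runners, while enabled, max_batch_size, and max_latency_ms sit one level deeper under runners.batching.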

Note: The yatai-deployment component script installs a metrics server, which is needed for auto-scaling. If the metrics-server pod fails during installation, add the following to spec.template.spec.containers[].args in the metrics-server Deployment in components.yaml, then apply the file manually.

- --kubelet-insecure-tls
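
To make the path spec.template.spec.containers[].args concrete, here is a small sketch that applies the same change to a Deployment manifest already loaded as a Python dictionary. The manifest below is a hypothetical, heavily trimmed stand-in (the "--existing-arg" entry is a placeholder), and add_container_arg is an illustrative helper rather than anything shipped with Yatai or metrics-server.

```python
def add_container_arg(deployment, arg):
    """Append arg to every container's args list in a Deployment manifest dict."""
    containers = deployment["spec"]["template"]["spec"]["containers"]
    for container in containers:
        args = container.setdefault("args", [])
        if arg not in args:  # keep the edit idempotent
            args.append(arg)
    return deployment


# Hypothetical, trimmed stand-in for the metrics-server Deployment.
metrics_server = {
    "kind": "Deployment",
    "metadata": {"name": "metrics-server"},
    "spec": {
        "template": {
            "spec": {
                "containers": [
                    {"name": "metrics-server", "args": ["--existing-arg"]}
                ]
            }
        }
    },
}

add_container_arg(metrics_server, "--kubelet-insecure-tls")
patched_args = metrics_server["spec"]["template"]["spec"]["containers"][0]["args"]
print(patched_args)
```

In practice the same edit can simply be made by hand in components.yaml; the sketch only shows where the flag lands in the structure.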

Test

Run a simple test script. To send many requests at once, run python request.py {URL} {REQUEST_AMOUNT}. To send them one after another, add the --one-by-one flag: python request.py {URL} {REQUEST_AMOUNT} --one-by-one

$ cd tests && python request.py http://emotion-yatai.127.0.0.1.sslip.io/predict_async 2000
...
Response content: b'{"userId":345786,"conferenceId":890678,"clientFaceDetection":true,"emotions":[{"raw":{"neutral":0.05500756949186325,"happy":0.1018536314368248,"sad":0.16155335307121277,"surprise":0.09337092190980911,"fear":0.31956708431243896,"disgust":0.2149820476770401,"anger":0.04546269401907921,"contempt":0.008202659897506237},"dominantEmotion":"fear"}],"date":"2023-02-25T19:39:19.216+00:00","duration":0.063389253616333}'
Response code: 500
Response code: 503
...
  • Response code 500 usually means the request was cancelled because a Runner was too busy (runners.timeout in the config).
  • Response code 503 usually means BentoML removed the request from the emotion model's adaptive batching queue because it estimated that inference time plus queue wait time would exceed max_latency_ms in the config.
  • Response code 502 usually indicates a general resource issue. When using Yatai, try allocating more memory or CPU.
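
The two sending modes and the status codes listed above can be sketched as a small tally helper. This is not the repository's request.py: run_requests, summarize, and LIKELY_CAUSE are hypothetical names, and the sender is a stand-in so the sketch runs without a server. A real sender would POST a frame to {URL} and return the response's status code.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Likely causes per status code, as described in the list above.
LIKELY_CAUSE = {
    200: "success",
    500: "request cancelled: a Runner was too busy (runners.timeout)",
    502: "general resource issue: consider more memory/CPU",
    503: "dropped from the batching queue (max_latency_ms would be exceeded)",
}


def run_requests(send, amount, one_by_one=False, workers=32):
    """Call send() amount times and count the returned status codes.

    send is any zero-argument callable returning an HTTP status code.
    """
    if one_by_one:
        codes = [send() for _ in range(amount)]
    else:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            futures = [pool.submit(send) for _ in range(amount)]
            codes = [f.result() for f in futures]
    return Counter(codes)


def summarize(counts):
    """Return one readable line per status code, sorted by code."""
    return [
        f"{code}: {n} ({LIKELY_CAUSE.get(code, 'unknown')})"
        for code, n in sorted(counts.items())
    ]


# Stand-in sender with made-up responses, used only to exercise the helpers.
fake_codes = iter([200, 200, 500, 503, 200])
counts = run_requests(lambda: next(fake_codes), 5, one_by_one=True)
print("\n".join(summarize(counts)))
```

Passing the sender in as a callable keeps the counting logic independent of the HTTP client, so the same helper works for both the burst mode and the one-by-one mode.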

Starting the service without the BentoML config options timeout and max_latency_ms would greatly increase request latency when the system is under load, because no requests would be cancelled and every one of them would have to wait its turn.
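
The effect described above can be illustrated with a toy queue model. This is a deliberately simplified sketch with a single FIFO worker and made-up timings. It is not BentoML's adaptive batching algorithm, and simulate is a hypothetical function. It only shows why latency keeps growing when nothing is ever cancelled, and why a wait limit bounds it.

```python
def simulate(num_requests, arrival_ms, service_ms, max_wait_ms=None):
    """Single FIFO worker; returns (latencies of served requests, number cancelled).

    A request is cancelled on arrival if its queue wait would exceed max_wait_ms.
    """
    latencies = []
    cancelled = 0
    worker_free_at = 0.0
    for i in range(num_requests):
        arrival = i * arrival_ms
        wait = max(0.0, worker_free_at - arrival)
        if max_wait_ms is not None and wait > max_wait_ms:
            cancelled += 1
            continue
        worker_free_at = arrival + wait + service_ms
        latencies.append(wait + service_ms)
    return latencies, cancelled


# Made-up load: requests arrive every 10 ms, but each takes 15 ms to serve.
unbounded, _ = simulate(1000, arrival_ms=10, service_ms=15)
bounded, dropped = simulate(1000, arrival_ms=10, service_ms=15, max_wait_ms=60)
print("no limit: worst latency", max(unbounded), "ms")
print("60 ms limit: worst latency", max(bounded), "ms,", dropped, "cancelled")
```

Without a limit, each request waits longer than the one before it because the backlog never drains. With a limit, served requests never exceed the wait limit plus one service time, at the cost of some requests being rejected.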

Contributors

thomasfen
