
fastdeploy's Introduction

fastDeploy

easy and performant microservices for Python deep learning inference pipelines

  • Deploy any Python inference pipeline with minimal extra code
  • Auto-batching of concurrent inputs is enabled out of the box
  • No changes to inference code (unlike tf-serving etc.); the entire pipeline is run as-is
  • Prometheus metrics (OpenMetrics) are exposed for monitoring
  • Auto-generates clean Dockerfiles and Kubernetes health-check and scaling friendly APIs
  • Sequentially chained inference pipelines are supported out of the box
  • Can be queried from any language via easy-to-use REST APIs
  • Easy to understand (simple consumer-producer architecture) and a simple code base

Installation:

pip install --upgrade fastdeploy fdclient
# fdclient is optional, only needed if you want to use the Python client

Start fastDeploy server on a recipe:

# Invoke fastdeploy 
python -m fastdeploy --help
# or
fastdeploy --help

# Start prediction "loop" for recipe "echo"
fastdeploy --loop --recipe recipes/echo

# Start REST APIs for recipe "echo"
fastdeploy --rest --recipe recipes/echo

Send a request and get predictions:
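
A minimal sketch using the optional Python client installed above (the exact FDClient constructor and infer signature are assumptions; check the fdclient documentation):

# Query a running fastDeploy server with the optional Python client (fdclient).
# Assumes the "echo" recipe's REST API is serving on port 8080.
from fdclient import FDClient

client = FDClient("http://localhost:8080")

# Send a list of inputs; concurrent requests are auto-batched server-side
results = client.infer(["hello", "world"])
print(results)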

Auto-generate a Dockerfile and build the Docker image:

# Writes the Dockerfile for recipe "echo"
# and builds the Docker image if Docker is installed
# Base image defaults to python:3.8-slim
fastdeploy --build --recipe recipes/echo

# Run docker image
docker run -it -p8080:8080 fastdeploy_echo

Serving your model (recipe):
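
A recipe is a folder containing requirements.txt, predictor.py and example.pkl (the CLI itself reports this, as seen in the issues below). Below is a minimal sketch of a hypothetical predictor.py, assuming the predictor(inputs, batch_size) convention and the example.pkl role shown in the issues and logs further down:

# predictor.py -- sketch of a recipe's predictor with a placeholder "model"
import pickle

def predictor(inputs=[], batch_size=4):
    # Replace with your real model call; return one JSON-serializable
    # prediction per input, in the same order as inputs
    return [{"echo": inp} for inp in inputs]

if __name__ == "__main__":
    # example.pkl holds sample inputs that fastDeploy uses for warm-up
    # and batch size search
    sample_inputs = ["hello", "world"]
    pickle.dump(sample_inputs, open("example.pkl", "wb"))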

Where to use fastDeploy?

  • To deploy any non-ultra-lightweight model, i.e. most DL models with >50ms inference time per example
  • If the model/pipeline benefits from batch inference, fastDeploy is perfect for your use case
  • If you are going to have individual inputs (for example, a user's search query that needs to be vectorized, or an image to be classified)
  • In the case of individual inputs, requests coming in at close intervals are batched together and sent to the model as a single batch
  • Perfect for creating internal microservices that separate your model and pre/post-processing from business logic
  • Since the prediction loop and the inference endpoints are separate and connected via an SQLite-backed queue, they can be scaled independently

Where not to use fastDeploy?

  • Non CPU/GPU-heavy models that are better off running in parallel rather than in batches
  • If your predictor calls an external API, uploads to S3, etc. in a blocking way
  • IO-heavy, non-batching use cases (e.g., querying ES or a DB for each input)
  • For these cases it is better to call the model directly from the REST API code (instead of via the consumer-producer mechanism) so that high concurrency can be achieved

fastdeploy's People

Contributors

bedapudi6788 · preetham · shashikg · vamsibedapudi


fastdeploy's Issues

The app does not proceed past "[_app.py:23] Waiting for batch size search to finish." when running the Docker container on macOS.

Trying to run the yolo recipe in a Docker container using the "notaitech/temp:fastdeploy_license" image on macOS.

System Specifications -
System Version: macOS 12.3 (21E230)
Kernel Version: Darwin 21.4.0
Boot Volume: Macintosh HD
Boot Mode: Normal

Model Name: MacBook Pro
Model Identifier: MacBookPro18,3
Chip: Apple M1 Pro
Total Number of Cores: 8 (6 performance and 2 efficiency)
Memory: 16 GB


The app doesn't proceed past the following output:

2022-05-02:16:10:39,667 INFO [_utils.py:67] REQUEST_INDEX: /recipe/fastdeploy_dbs/default.request_index RESULTS_INDEX: /recipe/fastdeploy_dbs/default.results_cache META_INDEX: /recipe/fastdeploy_dbs/default.META_INDEX IS_FILE_INPUT: True FASTDEPLOY_UI_PATH: /home/user/miniconda/lib/python3.9/site-packages/fastdeploy/fastdeploy-ui
2022-05-02:16:10:40,988 INFO [_utils.py:67] REQUEST_INDEX: /recipe/fastdeploy_dbs/default.request_index RESULTS_INDEX: /recipe/fastdeploy_dbs/default.results_cache META_INDEX: /recipe/fastdeploy_dbs/default.META_INDEX IS_FILE_INPUT: True FASTDEPLOY_UI_PATH: /home/user/miniconda/lib/python3.9/site-packages/fastdeploy/fastdeploy-ui
2022-05-02:16:10:40,992 INFO [_app.py:23] Waiting for batch size search to finish.
2022-05-02:16:10:46,5 INFO [_app.py:23] Waiting for batch size search to finish.
2022-05-02:16:10:47,400 INFO [_loop.py:25] ACCEPTS_EXTRAS: False
2022-05-02:16:10:47,403 INFO [_utils.py:97] Warming up ..
2022-05-02:16:10:51,9 INFO [_app.py:23] Waiting for batch size search to finish.
2022-05-02:16:10:56,24 INFO [_app.py:23] Waiting for batch size search to finish.
2022-05-02:16:11:01,35 INFO [_app.py:23] Waiting for batch size search to finish.
2022-05-02:16:11:06,42 INFO [_app.py:23] Waiting for batch size search to finish.
2022-05-02:16:11:11,53 INFO [_app.py:23] Waiting for batch size search to finish.
2022-05-02:16:11:16,64 INFO [_app.py:23] Waiting for batch size search to finish.
2022-05-02:16:11:21,75 INFO [_app.py:23] Waiting for batch size search to finish.
2022-05-02:16:11:26,82 INFO [_app.py:23] Waiting for batch size search to finish.
2022-05-02:16:11:31,94 INFO [_app.py:23] Waiting for batch size search to finish.
2022-05-02:16:11:36,105 INFO [_app.py:23] Waiting for batch size search to finish.
2022-05-02:16:11:41,117 INFO [_app.py:23] Waiting for batch size search to finish.
2022-05-02:16:11:46,130 INFO [_app.py:23] Waiting for batch size search to finish.
2022-05-02:16:11:51,143 INFO [_app.py:23] Waiting for batch size search to finish.
2022-05-02:16:11:56,153 INFO [_app.py:23] Waiting for batch size search to finish.
2022-05-02:16:12:01,161 INFO [_app.py:23] Waiting for batch size search to finish.
2022-05-02:16:12:06,169 INFO [_app.py:23] Waiting for batch size search to finish.
2022-05-02:16:12:11,177 INFO [_app.py:23] Waiting for batch size search to finish.
2022-05-02:16:12:16,186 INFO [_app.py:23] Waiting for batch size search to finish.
2022-05-02:16:12:21,194 INFO [_app.py:23] Waiting for batch size search to finish.
2022-05-02:16:12:26,201 INFO [_app.py:23] Waiting for batch size search to finish.
2022-05-02:16:12:31,214 INFO [_app.py:23] Waiting for batch size search to finish.
2022-05-02:16:12:36,228 INFO [_app.py:23] Waiting for batch size search to finish.
2022-05-02:16:12:41,236 INFO [_app.py:23] Waiting for batch size search to finish.

Cannot create custom recipe

As fastpunct needs the old TF 1.14, I wanted to wrap it into a container using fastDeploy.

When I call ./fastDeploy.py --build fastpunct --source_dir recipes/fastpunct I am asked about the base image. I select tf_1_14_cpu but then I get an error:


tf_1_14_cpu: Pulling from notaitech/fastdeploy
Digest: sha256:c0b3277e87b578e6d4396f94087171a928721c7c1fa8e60584f629d462339935
Status: Image is up to date for notaitech/fastdeploy:tf_1_14_cpu
docker.io/notaitech/fastdeploy:tf_1_14_cpu

  --port defaults to 8080 
fastpunct
 Your folder must contain a file called requirements.txt, predictor.py and example.pkl 

I created the folder recipes/fastpunct; it contains:

example.pkl     predictor.py    requiremets.txt

predictor.py:

from fastpunct import FastPunct
fp = FastPunct('en')


def predictor(inputs=[], batch_size=1):
    return fp.punct(inputs, batch_size)


if __name__ == '__main__':
    import json
    import pickle
    # Sample inputs for your predictor function
    sample_inputs = ['Some text and another text I always wanted to say what i']

    # Verify that the inputs are json serializable
    json.dumps(sample_inputs)

    # Verify that predictor works as expected
    # preds = predictor(sample_inputs)
    # assert len(preds) == len(sample_inputs)

    # Verify that the predictions are json serializable
    json.dumps(sample_inputs)

    pickle.dump(sample_inputs, open('example.pkl', 'wb'))

requiremets.txt:

tensorflow==1.14.0
keras==2.2.4
numpy==1.16
fastpunct==1.0.2

How can I deploy fastpunct easily?
P.S. I need to chain it with DeepSegment and transform YouTube transcripts into sentences. Thanks for the awesome work!

Add shortcuts to cli options

The current fastDeploy.py CLI has --list_recipes, --source_dir and --build options, which could be given shortcuts such as -l, -s and -b respectively.

Add background run option for recipes

The fastDeploy CLI (fastDeploy.py) provides an option to run the pre-built recipes listed via the --list_recipes argument, but this runs the recipe (Docker container) in the foreground. An option to run the container in the background would be good to have.

fastDeploy.py --h should not work.

Steps to replicate:

  • Download the latest fastDeploy.py cli from repo.
  • Execute fastDeploy.py --l

This shouldn't work as the shortcut is supposed to be fastDeploy.py -l (which also works) and the long form trigger is fastDeploy.py --list_recipes.
