
multi-model-server's Introduction

Multi Model Server


Multi Model Server (MMS) is a flexible and easy-to-use tool for serving deep learning models trained using any ML/DL framework.

Use the MMS Server CLI, or the pre-configured Docker images, to start a service that sets up HTTP endpoints to handle model inference requests.

A quick overview and examples for both serving and packaging are provided below. Detailed documentation and examples are provided in the docs folder.

Join our Slack channel to get in touch with the development team, ask questions, find out what's cooking, and more!

Contents of this Document

Other Relevant Documents

Quick Start

Prerequisites

Before proceeding further with this document, make sure you have the following prerequisites.

  1. Ubuntu, CentOS, or macOS. Windows support is experimental. The following instructions will focus on Linux and macOS only.

  2. Python - Multi Model Server requires Python to run the workers.

  3. pip - pip is the Python package management system.

  4. Java 8 - Multi Model Server requires Java 8 to start. You have the following options for installing Java 8:

    For Ubuntu:

    sudo apt-get install openjdk-8-jre-headless

    For CentOS:

    sudo yum install java-1.8.0-openjdk

    For macOS:

    brew tap homebrew/cask-versions
    brew update
    brew cask install adoptopenjdk8
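
Whichever option you use, you can confirm that a Java 8 runtime is visible on your PATH before moving on (a quick sanity check, not an MMS command):

# Print the active Java version; it should report 1.8.x
java -version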

Installing Multi Model Server with pip

Setup

Step 1: Setup a Virtual Environment

We recommend installing and running Multi Model Server in a virtual environment. It's good practice to install and run all of the Python dependencies in virtual environments; this isolates the dependencies and eases dependency management.

One option is Virtualenv, which is used to create virtual Python environments. You can install it and activate a virtualenv for Python 2.7 as follows:

pip install virtualenv

Then create a virtual environment:

# Assuming we want to run python2.7 in /usr/local/bin/python2.7
virtualenv -p /usr/local/bin/python2.7 /tmp/pyenv2
# Enter this virtual environment as follows
source /tmp/pyenv2/bin/activate

Refer to the Virtualenv documentation for further information.
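
If you are working with Python 3, the built-in venv module is an equivalent option; a minimal sketch (the environment path /tmp/pyenv3 is just an example):

# Create and activate a Python 3 virtual environment using the built-in venv module
python3 -m venv /tmp/pyenv3
source /tmp/pyenv3/bin/activate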

Step 2: Install MXNet

MMS won't install the MXNet engine by default. If it isn't already installed in your virtual environment, you must install one of the MXNet pip packages.

For CPU inference, mxnet-mkl is recommended. Install it as follows:

# Recommended for running Multi Model Server on CPU hosts
pip install mxnet-mkl

For GPU inference, mxnet-cu92mkl is recommended. Install it as follows:

# Recommended for running Multi Model Server on GPU hosts
pip install mxnet-cu92mkl
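
Either way, you can verify that MXNet is importable from the active environment before installing MMS (a quick sanity check):

# Print the installed MXNet version to confirm the package is importable
python -c "import mxnet; print(mxnet.__version__)"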

Step 3: Install or Upgrade MMS as follows:

# Install latest released version of multi-model-server 
pip install multi-model-server

To upgrade from a previous version of multi-model-server, please refer to the migration reference document.
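
If you prefer to upgrade in place with pip (after reviewing the migration notes), the usual command is:

# Upgrade an existing installation to the latest released version
pip install --upgrade multi-model-server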

Notes:

  • A minimal version of model-archiver will be installed with MMS as a dependency. See model-archiver for more options and details.
  • See the advanced installation page for more options and troubleshooting.

Serve a Model

Once installed, you can get the MMS model server up and running very quickly. Try out --help to see all the CLI options available.

multi-model-server --help

For this quick start, we'll skip over most of the features, but be sure to take a look at the full server docs when you're ready.

Here is an easy example for serving an object classification model:

multi-model-server --start --models squeezenet=https://s3.amazonaws.com/model-server/model_archive_1.0/squeezenet_v1.1.mar

With the command above executed, you have MMS running on your host, listening for inference requests. Note that if you specify model(s) during MMS startup, it automatically scales backend workers to the number of available vCPUs (on a CPU instance) or available GPUs (on a GPU instance). On powerful hosts with many compute resources (vCPUs or GPUs), this startup and autoscaling process can take considerable time. To minimize MMS startup time, you can avoid registering and scaling models at startup and instead do it later via the corresponding Management API calls, which also gives you finer-grained control over how many resources are allocated to any particular model.
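
As an illustration of that approach, the sketch below starts MMS with no models and then registers and scales SqueezeNet afterwards. It assumes the default management port (8081) and the register/scale endpoints described in the Management API docs; adjust to your configuration:

# Start the server without registering any models
multi-model-server --start

# Register the SqueezeNet archive without creating any workers yet
curl -X POST "http://127.0.0.1:8081/models?url=https://s3.amazonaws.com/model-server/model_archive_1.0/squeezenet_v1.1.mar&initial_workers=0"

# Scale the model to exactly two workers when you are ready to serve traffic
curl -X PUT "http://127.0.0.1:8081/models/squeezenet?min_worker=2&synchronous=true"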

To test it out, you can open a new terminal window next to the one running MMS. Then you can use curl to download one of these cute pictures of a kitten; curl's -O flag will save it as kitten.jpg for you. Then you will curl a POST to the MMS predict endpoint with the kitten's image.


In the example below, we provide a shortcut for these steps.

curl -O https://s3.amazonaws.com/model-server/inputs/kitten.jpg
curl -X POST http://127.0.0.1:8080/predictions/squeezenet -T kitten.jpg

The predict endpoint will return a prediction response in JSON. It will look something like the following result:

[
  {
    "probability": 0.8582232594490051,
    "class": "n02124075 Egyptian cat"
  },
  {
    "probability": 0.09159987419843674,
    "class": "n02123045 tabby, tabby cat"
  },
  {
    "probability": 0.0374876894056797,
    "class": "n02123159 tiger cat"
  },
  {
    "probability": 0.006165083032101393,
    "class": "n02128385 leopard, Panthera pardus"
  },
  {
    "probability": 0.0031716004014015198,
    "class": "n02127052 lynx, catamount"
  }
]

You will see this result in the response to your curl call to the predict endpoint, and in the server logs in the terminal window running MMS. The result is also logged locally, along with metrics.
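
While the server is up you can also hit the health endpoint on the same port (assuming the default /ping route):

# Liveness check; a healthy server responds with a status message
curl http://127.0.0.1:8080/ping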

Other models can be downloaded from the model zoo, so try out some of those as well.

Now you've seen how easy it can be to serve a deep learning model with MMS! Would you like to know more?

Stopping the running model server

To stop the current running model-server instance, run the following command:

$ multi-model-server --stop

You will see output confirming that multi-model-server has stopped.

Create a Model Archive

MMS enables you to package all of your model artifacts into a single model archive, which makes it easy to share and deploy your models. To package a model, check out the model archiver documentation.
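
As a rough illustration (not a substitute for the model archiver docs), an invocation typically looks like the following; the model name, path, and handler here are placeholders:

# Package local model artifacts into a .mar archive (illustrative arguments)
model-archiver --model-name my_model --model-path /path/to/model/artifacts --handler my_service:handle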

Recommended production deployments

  • MMS doesn't provide authentication. You have to run your own authentication proxy in front of MMS.
  • MMS doesn't provide request throttling and is therefore vulnerable to DDoS attacks. It is recommended to run MMS behind a firewall.
  • MMS only allows localhost access by default; see Network configuration for details.
  • SSL is not enabled by default; see Enable SSL for details.
  • MMS uses a config.properties file to configure its behavior; see the Manage MMS page for details on how to configure MMS.
  • For better security, we recommend running MMS inside a Docker container. This project includes Dockerfiles to build containers recommended for production deployments; these containers demonstrate how to customize your own production MMS deployment. Basic usage can be found in the Docker readme, and a minimal run command is sketched below.
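
As a minimal sketch of the container route (assuming the published awsdeeplearningteam/multi-model-server CPU image; image names and tags may differ for your deployment):

# Run the prebuilt CPU container, exposing the inference (8080) and management (8081) ports
docker run -itd --name mms -p 8080:8080 -p 8081:8081 awsdeeplearningteam/multi-model-server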

Other Features

Browse over to the Docs readme for the full index of documentation. This includes more examples, how to customize the API service, API endpoint details, and more.

External demos powered by MMS

Here are some example demos of deep learning applications, powered by MMS:

  • Product Review Classification
  • Visual Search
  • Facial Emotion Recognition
  • Neural Style Transfer

Contributing

We welcome all contributions!

To file a bug or request a feature, please file a GitHub issue. Pull requests are welcome.

multi-model-server's People

Contributors

aaronmarkham, abhinavs95, alexgl-github, alexwong, ankkhedia, c2zwdjnlcg, ddavydenko, dhanainme, ericangelokim, frankfliu, goswamig, jamesewoo, jesterhazy, jiajiechen, kevinthesun, knjcode, lupesko, lxning, maaquib, nskool, nswamy, photoszzt, piyushghai, sandeep-krishnamurthy, saravsak, thomasdelteil, vdantu, vrakesh, yuruofeifei, zachgk


multi-model-server's Issues

model export failure - consumes all disk space

Env:
Windows 10 / Conda / Python 2.7

Attempted a caffenet export. Used files from the model zoo. First time failed because I had several symbol files in the same directory, but it made a caffenet.model file anyway. I tried serving this file but it said it was not a zip file. (this is probably a separate bug)
The second time, I ran export after moving the other symbol files away, and then I see the error below and...

It will eat up all available disk space! It made a massive ~3 GB file. Then I cleared some space and now have a 6 GB caffenet.model, but this model should only be about 238 MB.

Repeatable by just dropping a 0-byte x.model file and running export on the x model's params/symbol/signature files. It breaks when there's already an x.model file there.

Note that you will get this broken warning message (which should also be fixed):

  warnings.warn("%s.model already in %s and will be overwritten." % (model_name, model_path))

In the middle of this error:

(dms_p27) C:\Users\Aaron\Source\Repos\dms\examples\models\resnet-18>deep-model-export --model-name resnet-18 --model-path .
c:\users\aaron\appdata\local\conda\conda\envs\dms_p27\lib\site-packages\urllib3\contrib\pyopenssl.py:46: DeprecationWarning: OpenSSL.rand is deprecated - you should use os.urandom instead
  import OpenSSL.SSL
c:\users\aaron\appdata\local\conda\conda\envs\dms_p27\lib\site-packages\dms\export_model.py:90: UserWarning: resnet-18.model already in . and will be overwritten.
  warnings.warn("%s.model already in %s and will be overwritten." % (model_name, model_path))
Traceback (most recent call last):
  File "c:\users\aaron\appdata\local\conda\conda\envs\dms_p27\lib\runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "c:\users\aaron\appdata\local\conda\conda\envs\dms_p27\lib\runpy.py", line 72, in _run_code
    exec code in run_globals
  File "C:\Users\Aaron\AppData\Local\conda\conda\envs\dms_p27\Scripts\deep-model-export.exe\__main__.py", line 9, in <module>
  File "c:\users\aaron\appdata\local\conda\conda\envs\dms_p27\lib\site-packages\dms\export_model.py", line 214, in export
    _export_model(args)
  File "c:\users\aaron\appdata\local\conda\conda\envs\dms_p27\lib\site-packages\dms\export_model.py", line 94, in _export_model
    zip_file.write(item, os.path.basename(item))
  File "c:\users\aaron\appdata\local\conda\conda\envs\dms_p27\lib\zipfile.py", line 801, in __exit__
    self.close()
  File "c:\users\aaron\appdata\local\conda\conda\envs\dms_p27\lib\zipfile.py", line 1347, in close
    " would require ZIP64 extensions")
zipfile.LargeZipFile: Central directory offset would require ZIP64 extensions

(dms_p27) C:\Users\Aaron\Source\Repos\dms\examples\models\resnet-18>

Failed download not handled

If the initial download fails, a partial file is created. Retrying just gives an error that the file isn't a zip file rather than checking the source, etc. Maybe it should be downloaded as a temp file and then swapped into place? Or check with the server that we have the most recent file (date, size, etc.)?

Model name overlap not supported?

After the initial download, changing the name or URL (anything but the actual model file name) results in no download attempt for the new item. I would expect the model_name= to be the "unique" identifier for another download, or something combined with the URL, since "resnet-18.model" may be a common name.

Provide details on dms_app.config settings

Several questions on this:

  1. When would you change the Gunicorn arguments; for what purpose/effect?
  2. I noticed the config changed from 1 worker to 4, and OMP_NUM_THREADS from 4 to 1. What's up with that? Why? Are they linked such that on a bigger instance I could go to 8 workers with two threads? Why not 4 workers and 4 threads? Or 64 workers and 4 threads?
  3. What is worker-class? What options are there?
  4. What is limit-request-line? What's the max? What impact does this have when changed?

params file inclusion (over/under)

dms requires a prefix-0000.params file.
dme does not.

Maybe if dms requires it, so should dme, or dms should be made to support non-0000 params files.

Example:
When exporting Inception-BN from the model zoo, the checkpoint is Inception-BN-0126.params. The export is successful, but when serving you get an error that it can't find 0000.params.
I copied the file and renamed the copy to 0000; now the export doubles the size of the model file because it includes both. Changing the name to Inception-BN-0126.params.bak doesn't matter - it's still included.
Also, if you already ran the server in that folder, you now have a subfolder with a .params file in it. If you run export with just the params file in the parent, it finds that other params file in the subfolder and adds it too. Very greedy. :)
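
For reference, the workaround described above boils down to something like the following (file names follow the Inception-BN example; adjust for your checkpoint):

# Give the checkpoint the prefix-0000 name the server expects; note that export then bundles both copies
cp Inception-BN-0126.params Inception-BN-0000.params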

So two things here:

  1. If it is so picky about the file being specifically prefix-0000.params at the server CLI, what is going on with the export CLI!?
  2. Why is it that I can just rename the checkpoint to 0000 and eventually get it to serve properly? Why not just pick prefix-####.params and rename it internally?

2.5 (and warn people before all params are rolled up into the .model file or don't let that happen)

Installing on python3 throws import error

Installing collected packages: itsdangerous, click, Flask, flask-cors, deep-model-server
Successfully installed Flask-0.12.2 click-6.7 deep-model-server-0.1 flask-cors-3.0.3 itsdangerous-0.24
[hadoop@ip-10-0-0-121 dms]$ deep-model-export
Traceback (most recent call last):
File "/usr/local/bin/deep-model-export", line 7, in
from mms.export_model import export
File "/usr/local/lib/python3.4/site-packages/mms/export_model.py", line 7, in
from arg_parser import ArgParser
ImportError: No module named 'arg_parser'

Export CLI model params confusing

Current export CLI --model parameters are designed with a multiple key-value pair, e.g. dms --model <model_name>=<model_path> which has a few issues:

  1. It allows multiple models in a single package, which is not supported by DMS
  2. It expects a single JSON+Weights files pair in the model_path, but there can be cases where there are multiple pairs in the target path, and it is not clear which one to package

We can consider a few alternatives:

  1. dms --model <model_prefix_path> --output <model_archive_name>
  2. dms --model <model_archive_name>=<model_prefix_path> and return an error if more than one model_archive_name is specified.

model_prefix_path is the path plus the file name prefix of the JSON and Weights (so file names minus the extensions)
model_archive_name is the generated model archive file name, not including the prefix that gets added by the tool
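
To make alternative 1 concrete, the proposed invocation would look roughly like this (a hypothetical interface sketched from the proposal above, not an implemented command):

# Hypothetical form of alternative 1: explicit prefix path in, explicit archive name out
dms --model ./models/resnet-18/resnet-18 --output resnet-18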

server cannot bind to public IP

Trying to run:

deep-model-server --models squeezenet=https://s3.amazonaws.com/model-server/models/squeezenet_v1.1/squeezenet_v1.1.model --host 34.228.254.167 --port 8080

Gives the error Cannot assign requested address:

[ERROR 2017-11-09 20:50:18,592 PID:6760 /home/ec2-user/anaconda3/envs/mxnet_p27/lib/python2.7/site-packages/dms/deep_model_server.py:start_model_serving:97] Failed to start model serving host: Flask handler failed to start: [Errno 99] Cannot assign requested address

Tried different ports. Tried tinkering with inbound rules, but same error.
Binding to 0.0.0.0 doesn't throw an error, but trying to access the API gives a timeout.

Binding to an internal IP seems to work, but that's inaccessible.

Error in example signature.json

In the readme, I tried to copy and modify the example signature.json for my application. We need to fix the following issues:

  1. Single quotes do not work; I get a JSON parser error. Double quotes work fine.
  2. input => inputs, output => outputs.

arg_parser dependency missing

I assume that the pip install should handle all of the project dependencies, but I'm getting an error that I'm missing arg_parser.

(mxnet3.6) 8c8590217d26:Development markhama$ deep-model-server
Traceback (most recent call last):
  File "/Users/markhama/Development/mxnet3.6/bin/deep-model-server", line 7, in <module>
    from mms.mxnet_model_server import start_serving
  File "/Users/markhama/Development/mxnet3.6/lib/python3.6/site-packages/mms/mxnet_model_server.py", line 1, in <module>
    from arg_parser import ArgParser
ModuleNotFoundError: No module named 'arg_parser'

how do you make sure your signature.json is correct?

When all you have is a symbol.json and a params file?

I think I'm seeing this fail with the caffenet conversion. The model exports just fine - no errors - but it can't be served. I tried messing with the outputs in the signature, and that leads me to believe there's a problem there, but when I try other models like nin or inception, I can just use the resnet-18 signature and there's no problem. I want to make sure, however, that the signature.json is really correct before adding a .model file to the model zoo.

dms error output:

[21:27:10] C:\projects\mxnet-distro-win\mxnet-build\src\nnvm\legacy_json_util.cc:190: Loading symbol saved by previous version v0.8.0. Attempting to upgrade...
[21:27:10] C:\projects\mxnet-distro-win\mxnet-build\src\nnvm\legacy_json_util.cc:198: Symbol successfully upgraded!
[21:27:10] C:\projects\mxnet-distro-win\mxnet-build\dmlc-core\include\dmlc/logging.h:308: [21:27:10] c:\projects\mxnet-distro-win\mxnet-build\src\operator\tensor\../elemwise_op_common.h:122: Check failed: assign(&dattr, (*vec)[i]) Incompatible attr in node  at 0-th output: expected (4096,9216), got (4096,57600)
E1026 21:27:10 14580 c:\users\aaron\appdata\local\conda\conda\envs\dms_p27\lib\site-packages\dms\mxnet_model_server.py:_arg_process:140] Failed to process arguments: [21:27:10] c:\projects\mxnet-distro-win\mxnet-build\src\operator\tensor\../elemwise_op_common.h:122: Check failed: assign(&dattr, (*vec)[i]) Incompatible attr in node  at 0-th output: expected (4096,9216), got (4096,57600)

docker docs updates needed

  1. Clone the repo to get the files.
  2. Add a mention of the GPU docs below.
  3. Remove the mention that GPU is not supported.
  4. Add a reminder about opening ports.

Usage of relative imports is generally not preferred

In many parts of the code, relative imports are used. Example: from ..log import logger.
Relative imports are generally discouraged. We should revisit this and consider using full paths or absolute_import from __future__.

For example, running unit tests from pytest fails with

from ..log import get_logger
22:07:28 E ValueError: Attempted relative import beyond toplevel package

help output incorrect

When you don't provide the minimum required inputs, you get this response:

usage: mxnet-model-serving

It should say usage: mxnet-model-server or deep-model-server.

Also, when you use the -h flag, it mentions MXNet Model Serving; it should probably say Deep Model Server instead.

swagger instructions needed

The guide tells you to use swagger_client for generated client code, but doesn't tell you how to install it or where to get it.

python export example doesn't work

I tried a couple of variations.
It seems like mms.export_model still exists, but I would have thought it would have been renamed from mms to dms. Even the original example still doesn't work.

>>> import mxnet as mx
>>> from dms.export_model import export_serving
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named dms.export_model
>>> from mms.export_model import export_serving
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: cannot import name export_serving
>>> 

Update name to "Model Server for Apache MXNet" in code

We've decided to brand the product as "Model Server for Apache MXNet".
More specifically:

Product documentation should use the full name "Model Server for Apache MXNet" or the shorthand "Model Server"
PIP package name will be "mxnet-model-server"
PIP commands will be "mxnet-model-server" and "mxnet-model-export"
Folders and files should be changed accordingly, e.g. the "/dms" folder will be changed to "model_server"

The task is to perform all updates in the code.
Issue #73 is to update the docs.

Server start output

It would be great if, when the server starts, it gave you a valid URL rather than this one, which returns a 404 and makes you think it's broken:
Service started at 127.0.0.1:8080/

Why not:

Service started. 
DMS API Description: http://127.0.0.1:8080/api-description
DMS API Health: http://127.0.0.1:8080/ping

Input shape in signature enforces batch shape and replace with 1 during loading

https://github.com/awslabs/deep-model-server/blob/master/dms/model_service/mxnet_model_service.py#L94

For MXNet Module binding, we assume the 1st dimension of the input shape is the batch dimension and replace it with 1.

For example, if my input is a color image of 512*512 and I specify input shape (3, 512, 512), then at load time MMS changes it to (1, 512, 512), assuming 3 is the batch_size.

So the input shape in the signature is basically the input shape along with the batch size used in training. We need to revisit this part.

Not all logs are written to the log file with the --log-file option.

I started the server with --log-file=/tmp/x.log. Only the metrics and the responses are written to this file; everything else is dumped to the console.
The logs I find in the log file:

Initialized model serving.
Adding endpoint: squeezenet_predict to Flask
Adding endpoint: ping to Flask
Adding endpoint: api-description to Flask
Metric error_number for last 300 seconds is 0.000000
Metric requests_number for last 300 seconds is 0.000000
Metric cpu for last 300 seconds is 0.222000
Metric memory for last 300 seconds is 0.005583
Metric disk for last 300 seconds is 0.955000
Metric overall_latency for last 300 seconds is 0.000000
Metric inference_latency for last 300 seconds is 0.000000
Metric preprocess_latency for last 300 seconds is 0.000000
Service started successfully.
Service description endpoint: 127.0.0.1:8080/api-description
Service health endpoint: 127.0.0.1:8080/ping

  • Running on http://127.0.0.1:8080/ (Press CTRL+C to quit)
    Request input: input0 should be image with jpeg format.
    Getting file data from request.
    Response is text.
    Jsonifying the response: {'prediction': [[{'class': 'n02123394 Persian cat', 'probability': 0.8297115564346313}, {'class': 'n02086079 Pekinese, Pekingese, Peke', 'probability': 0.04721757397055626}, {'class': 'n02098413 Lhasa, Lhasa apso', 'probability': 0.019571054726839066}, {'class': 'n02113624 toy poodle', 'probability': 0.018856260925531387}, {'class': 'n02085936 Maltese dog, Maltese terrier, Maltese', 'probability': 0.016827790066599846}]]}
    127.0.0.1 - - [09/Nov/2017 12:47:05] "POST /squeezenet/predict HTTP/1.1" 200

warning message not populating

Env:
Windows 10 / Conda / Python 2.7

Ran an export successfully:

('Successfully exported %s model. Model file is located at %s.', 'resnet-18', 'C:\\Users\\Aaron\\Source\\Repos\\dms\\examples\\models\\resnet-18\\resnet-18.model')

Unable to predict on large image

I tested with an image of size ~2 MB and it returns this HTML response:

<html>
<head><title>413 Request Entity Too Large</title></head>
<body bgcolor="white">
<center><h1>413 Request Entity Too Large</h1></center>
<hr><center>nginx/1.4.6 (Ubuntu)</center>
</body>
</html>

Readme's "Start Serving" and other CLI examples include text that creates errors

Example: https://github.com/deep-learning-tools/deep-model-server#start-serving
The CLI includes params in squared brackets ([]) which fails the CLI, e.g. "deep-model-server --models resnet-18=https://s3.amazonaws.com/mms-models/resnet-18.model [--service mxnet_vision_service] [--gen-api python] [--port 8080] [--host 127.0.0.1]"

These need to be removed; the CLI examples need to work as-is, and optional parameters can be described later.
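
For reference, the working form of that example, with the optional flags simply dropped, would be:

# Optional parameters removed; this form should run as-is
deep-model-server --models resnet-18=https://s3.amazonaws.com/mms-models/resnet-18.model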

numpy version warning

Should the pip package include the appropriate version of numpy?

8c8590217d26:Development markhama$ deep-model-server
RuntimeError: module compiled against API version 0xb but this version of numpy is 0x9
RuntimeError: module compiled against API version 0xb but this version of numpy is 0x9

I think it's "fixed" it via (I don't see the warning anymore):

8c8590217d26:Development markhama$ pip install -U numpy
Collecting numpy
  Downloading numpy-1.13.3-cp27-cp27m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl (4.6MB)
    100% |████████████████████████████████| 4.6MB 271kB/s 
Installing collected packages: numpy
  Found existing installation: numpy 1.8.0rc1
    Not uninstalling numpy at /System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python, outside environment /Users/markhama/Development/caffe2-2.7
Successfully installed numpy-1.13.3

Scalable behavior?

Are concurrent requests handled serially? When sending many concurrent requests, the handling pattern appears to be serial (i.e. throughput seems similar to serial calls of the same count).
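
One rough way to observe this (a crude probe, not a benchmark; the /squeezenet/predict endpoint and input0 form field follow the request format shown in the log output earlier in these issues) is to fire several requests in parallel and compare the wall-clock time with running them one at a time:

# Time 8 parallel predictions; if handling is serial, this takes roughly 8x a single request
time ( for i in $(seq 1 8); do
  curl -s -o /dev/null -X POST http://127.0.0.1:8080/squeezenet/predict -F "input0=@kitten.jpg" &
done; wait )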

Exception Handling

We need to revisit exception handling - the "error messages" and "error codes" returned to users in various use cases such as invalid input, timeouts, unknown exceptions, and more.

I sent a wrong input and received a 500 error without an error message.

SSD integration test downloads 125 MB of model binaries

The integration test for SSD in dms/tests/integration_tests/ downloads MXNet model files (~125 MB).
We should probably have an integration test that needs a smaller model file. A developer running these tests remotely would not be able to complete the integration test quickly.

Ideally, we should create dms/tests/nightly_tests and move the SSD test to that folder; this test can then run nightly on Jenkins CI.

Revisit effect of logging pre-process, post-process and inference time

Currently we log pre-process, post-process, and inference time in "Debug Mode".
We need to revisit to answer the following:

  1. Writing to the log file is costly; in a performance-sensitive service use case this can become a significant bottleneck for users. What is the optimal level of log detail that gives enough information without becoming a bottleneck? Can we give users an option?
  2. [Imp] Probably keep an average inference time and log it once every 5 minutes, to reduce the effect of logging every request? We can do this at log.INFO level.

README grammar

README: "Data shape is a list of integer" should read "integers". It "should contain" or "contains" ("would contain" if the user creates it manually).

  • Ok, various grammar changes in README

Update name to "Model Server for Apache MXNet" in docs

We've decided to brand the product as "Model Server for Apache MXNet".
More specifically:

  • Product documentation should use the full name "Model Server for Apache MXNet" or the shorthand "Model Server"
  • PIP package name will be "mxnet-model-server"
  • PIP commands will be "mxnet-model-server" and "mxnet-model-export"
  • Folders and files should be changed accordingly, e.g. the "/dms" folder will be changed to "model_server"

The task is to perform all updates in the docs.
Issue #74 handles the required code updates.

Extending DMS code samples needs update

There has been a lot of restructuring of the code related to utils and the model service. Our documentation examples, mainly the extended-service Python code samples, need to be fixed.

For example:

  1. No module named mxnet_utils. This is used in the overriding example.
    It should be: from dms.utils.mxnet import image
  2. No module named mxnet_model_service.
    It should be: from dms.model_service import mxnet_model_service
  3. In preprocess, we should use data[0]; data is a list of inputs.
  4. "Input Data must be a list": _preprocess must return a list.

Define output shape in signature file

We take an output shape in the signature file. This is the output shape of the NDArray that we get after the forward pass (inference).
This shape is not, and should not be, enforced on the output from the service.
We need to clearly document how we use this output_shape.

custom service docs update

Now that the custom service file is inside the model file, all of the references and instructions around export and custom services need updating.
