
inference_results_v1.1's Introduction

MLPerf Inference v1.1

GitHub Submission HOWTO

Clone the MLPerf Inference v1.1 submission tree

Clone the submission tree e.g. under your home directory:

$ export SUBMISSION_ROOT=$HOME/submissions_inference_1_1
$ git clone git@github.com:mlcommons/submissions_inference_1_1.git $SUBMISSION_ROOT
$ cd $SUBMISSION_ROOT

Create a branch

We recommend creating a new branch for every logically connected group of results, e.g. all results from your System-Under-Test (SUT) or only those relating to a particular benchmark. Prefix your branch name with your organization's name. Feel free to include the SUT name, implementation name, benchmark name, etc.

For example:

$ git checkout master && git pull
$ git checkout -b dividiti-closed-aws-g4dn.4xlarge-openvino

Populate your branch according to the submission rules.
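
For a closed division submission from an organization called, say, your_org, the populated tree is expected to look roughly as follows (a sketch mirroring the directories shown in the git status output below):

        closed/your_org/code/
        closed/your_org/compliance/
        closed/your_org/measurements/
        closed/your_org/results/
        closed/your_org/systems/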

You can inspect your changes:

$ git status
On branch dividiti-closed-aws-g4dn.4xlarge-openvino
Untracked files:
  (use "git add <file>..." to include in what will be committed)
        closed/dividiti/code/
        closed/dividiti/compliance/
        closed/dividiti/measurements/
        closed/dividiti/results/
        closed/dividiti/systems/

nothing added to commit but untracked files present (use "git add" to track)

and make intermediate commits as usual:

$ git add closed/dividiti
$ git commit -m "Dump repo:mlperf-closed-aws-g4dn.4xlarge-openvino."

Run the submission checker

Once you are happy with the tree structure, truncate the accuracy logs and run the submission checker, culminating in e.g.:

      INFO:main:Results=2, NoResults=0
      INFO:main:SUMMARY: submission looks OK
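
For reference, these two steps can look roughly like the following (a hedged sketch: it assumes the mlcommons/inference repository is cloned at $INFERENCE_ROOT and that the v1.1 truncation and checker scripts keep these names and flags under tools/submission/):

# $INFERENCE_ROOT, <your_org> and <backup_dir> below are placeholders, not part of the official flow.
$ python3 $INFERENCE_ROOT/tools/submission/truncate_accuracy_log.py \
    --input $SUBMISSION_ROOT --submitter <your_org> --backup <backup_dir>
$ python3 $INFERENCE_ROOT/tools/submission/submission_checker.py \
    --input $SUBMISSION_ROOT --submitter <your_org> --version v1.1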

Push the changes

Once you and the submission checker are happy with the tree structure, you can push the changes:

$ git push

fatal: The current branch dividiti-closed-aws-g4dn.4xlarge-openvino has no upstream branch.
To push the current branch and set the remote as upstream, use

    git push --set-upstream origin dividiti-closed-aws-g4dn.4xlarge-openvino

Do exactly as suggested:

$ git push --set-upstream origin dividiti-closed-aws-g4dn.4xlarge-openvino

Create a pull request

If you now go to https://github.com/mlcommons/submissions_inference_1_1/, you should see a notification about your branch being recently pushed and can immediately create a pull request (PR). You can also select your branch from the dropdown menu under <> Code. (Aren't you happy you prefixed your branch's name with the submitter's name?)

As usual, you can continue committing to the branch until the PR is merged, with any changes being reflected in the PR.
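
If you prefer the command line, the GitHub CLI can open the same PR (a hedged alternative; the title below is purely illustrative and the web flow described above works just as well):

$ gh pr create --base master \
    --title "dividiti: closed, aws-g4dn.4xlarge, OpenVINO" \
    --body "MLPerf Inference v1.1 submission."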

inference_results_v1.1's People

Contributors

georgelyuan, gfursin, keithachorn-intel, nathanw-mlc, tjablin

inference_results_v1.1's Issues

DLRM 99.9 performance and accuracy runs getting stuck on Xeon Icelake CPU

1157-0  : Complete load query samples !!
1158-3  : Complete load query samples !!
1158-2  : Complete load query samples !!
1158-1  : Complete load query samples !!
1158-4  : Complete load query samples !!
1158-0  : Complete load query samples !!
1157-2  : Complete load query samples !!
1157-1  : Complete load query samples !!
1157-4  : Complete load query samples !!
1157-3  : Complete load query samples !!

The DLRM 99.9 runs get stuck after this output. They have been stuck at this spot for more than 4-5 hours, and I am not sure what is causing the issue. The runs do not produce anything in the mlperf_log_summary.txt file.

I have also attached the performance run's mlperf_log_detail.txt file, in case it helps figure out what the issue is.
mlperf_log_detail.txt

RuntimeError: module compiled against API version 0xf but this version of numpy is 0xd

According to the NVIDIA README documentation, I created the image and started the container, but the following error occurs inside the container:

root@mlperf-inference-x86_64:/work# CUDA_VISIBLE_DEVICES=0 make run RUN_ARGS="--benchmarks=ssd-resnet34 --scenarios=server --test_mode=PerformanceOnly"
make[1]: Entering directory '/work'
[2022-01-22 15:51:41,568 __init__.py:185 INFO] Running command: CUDA_VISIBLE_ORDER=PCI_BUS_ID nvidia-smi --query-gpu=gpu_name,pci.device_id,uuid --format=csv
[2022-01-22 15:51:43,563 main.py:760 INFO] Detected System ID: GeForceRTX3090x1
[2022-01-22 15:51:44,072 main.py:108 INFO] Building engines for ssd-resnet34 benchmark in Server scenario...
[2022-01-22 15:51:44,074 main.py:128 INFO] Building GPU engine for GeForceRTX3090x1_ssd-resnet34_Server
RuntimeError: module compiled against API version 0xf but this version of numpy is 0xd
Process Process-1:
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/work/code/main.py", line 129, in handle_generate_engine
    b = get_benchmark(config)
  File "/work/code/__init__.py", line 80, in get_benchmark
    cls = get_cls(G_BENCHMARK_CLASS_MAP[benchmark])
  File "/work/code/__init__.py", line 63, in get_cls
    return getattr(import_module(module_loc.module_path), module_loc.cls_name)
  File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 848, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/work/code/ssd-resnet34/tensorrt/SSDResNet34.py", line 37, in <module>
    SSDResNet34EntropyCalibrator = import_module("code.ssd-resnet34.tensorrt.calibrator").SSDResNet34EntropyCalibrator
  File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 848, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/work/code/ssd-resnet34/tensorrt/calibrator.py", line 16, in <module>
    import pycuda.driver as cuda
  File "/usr/local/lib/python3.8/dist-packages/pycuda/driver.py", line 65, in <module>
    from pycuda._driver import * # noqa
SystemError: initialization of _driver raised unreported exception
Traceback (most recent call last):
  File "code/main.py", line 763, in <module>
    main(main_args, system)
  File "code/main.py", line 736, in main
    dispatch_action(main_args, config_dict, workload_id, equiv_engine_setting=equiv_engine_setting)
  File "code/main.py", line 556, in dispatch_action
    launch_handle_generate_engine(*_gen_args, **_gen_kwargs)
  File "code/main.py", line 92, in launch_handle_generate_engine
    raise RuntimeError("Building engines failed!")
RuntimeError: Building engines failed!
make[1]: *** [Makefile:619: generate_engines] Error 1
make[1]: Leaving directory '/work'
make: *** [Makefile:613: run] Error 2

But the installed numpy version is the same as the one in the requirements:

(attached screenshot of the installed numpy version)
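
A minimal diagnostic sketch for this class of mismatch (assuming the problem is which numpy the container's Python actually imports, rather than what the requirements file pins):

# Print the version and location of the numpy that gets imported inside the container:
$ python3 -c "import numpy; print(numpy.__version__, numpy.__file__)"
# pycuda was built against a newer numpy C API than the copy found above; force-reinstalling
# numpy (or rebuilding pycuda against it) is the usual way to bring the two back in sync:
$ python3 -m pip install --force-reinstall numpy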

Xavier: failed to compile PyTorch 1.4

&& USE_CUDA=0 USE_CUDNN=0 USE_MKLDNN=0 sudo -E python3 setup.py install \

The Xavier dependency build fails with the following errors. The Xavier I used started from a fresh installation, with only pip installed to support the python3.8 dependencies.

c++: internal compiler error: Segmentation fault (program cc1plus)
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-7/README.Bugs> for instructions.
caffe2/CMakeFiles/torch.dir/build.make:2818: recipe for target 'caffe2/CMakeFiles/torch.dir/__/aten/src/ATen/CPUType.cpp.o' failed
make[2]: *** [caffe2/CMakeFiles/torch.dir/__/aten/src/ATen/CPUType.cpp.o] Error 4
make[2]: *** Waiting for unfinished jobs....
c++: internal compiler error: Segmentation fault (program cc1plus)
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-7/README.Bugs> for instructions.
caffe2/CMakeFiles/torch.dir/build.make:2883: recipe for target 'caffe2/CMakeFiles/torch.dir/__/aten/src/ATen/TypeDefault.cpp.o' failed
make[2]: *** [caffe2/CMakeFiles/torch.dir/__/aten/src/ATen/TypeDefault.cpp.o] Error 4
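
An internal compiler error (segmentation fault) from cc1plus on Jetson boards is often memory pressure rather than a source problem. A hedged retry sketch is to limit build parallelism, which PyTorch's setup.py honors via MAX_JOBS, and to add swap if the segfaults persist:

# Hedged workaround: fewer parallel compile jobs means lower peak memory use.
$ export MAX_JOBS=2
$ USE_CUDA=0 USE_CUDNN=0 USE_MKLDNN=0 sudo -E python3 setup.py install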

Can not run "make build" on Xavier AGX

After following the README, the next step was to run the command make build. This, however, fails on the Xavier AGX when it tries to build the server, which is somewhat expected. The following error is thrown:

[ 81%] Building CXX object src/core/CMakeFiles/server-library.dir/tritonserver.cc.o
cd /media/nvmedrive/inference_results_v1.1/closed/NVIDIA/build/triton-inference-server/out/tritonserver/build/server/src/core && /usr/bin/c++ -DTRITON_ENABLE_CUDA_GRAPH=1 -DTRITON_ENABLE_GPU=1 -DTRITON_ENABLE_LOGGING=1 -DTRITON_ENABLE_TENSORRT=1 -DTRITON_MIN_COMPUTE_CAPABILITY=6.0 -DTRITON_VERSION=\"2.13.0dev\" -I/media/nvmedrive/inference_results_v1.1/closed/NVIDIA/build/triton-inference-server/build/server/../.. -I/media/nvmedrive/inference_results_v1.1/closed/NVIDIA/build/triton-inference-server/out/tritonserver/build/server -I/media/nvmedrive/inference_results_v1.1/closed/NVIDIA/build/triton-inference-server/out/tritonserver/build/third-party/protobuf/include -I/usr/local/cuda/include -I/media/nvmedrive/inference_results_v1.1/closed/NVIDIA/build/triton-inference-server/out/tritonserver/build/third-party/cnmem/include -I/media/nvmedrive/inference_results_v1.1/closed/NVIDIA/build/triton-inference-server/out/tritonserver/build/server/_deps/repo-common-build/protobuf -I/media/nvmedrive/inference_results_v1.1/closed/NVIDIA/build/triton-inference-server/out/tritonserver/build/server/_deps/repo-core-src/include -I/media/nvmedrive/inference_results_v1.1/closed/NVIDIA/build/triton-inference-server/out/tritonserver/build/server/_deps/repo-common-src/include -Wall -Wextra -Wno-unused-parameter -Werror -Wno-deprecated-declarations -O3 -fPIC -std=gnu++11 -o CMakeFiles/server-library.dir/tritonserver.cc.o -c /media/nvmedrive/inference_results_v1.1/closed/NVIDIA/build/triton-inference-server/src/core/tritonserver.cc
[ 82%] Building CXX object src/core/CMakeFiles/server-library.dir/cuda_memory_manager.cc.o
cd /media/nvmedrive/inference_results_v1.1/closed/NVIDIA/build/triton-inference-server/out/tritonserver/build/server/src/core && /usr/bin/c++ -DTRITON_ENABLE_CUDA_GRAPH=1 -DTRITON_ENABLE_GPU=1 -DTRITON_ENABLE_LOGGING=1 -DTRITON_ENABLE_TENSORRT=1 -DTRITON_MIN_COMPUTE_CAPABILITY=6.0 -DTRITON_VERSION=\"2.13.0dev\" -I/media/nvmedrive/inference_results_v1.1/closed/NVIDIA/build/triton-inference-server/build/server/../.. -I/media/nvmedrive/inference_results_v1.1/closed/NVIDIA/build/triton-inference-server/out/tritonserver/build/server -I/media/nvmedrive/inference_results_v1.1/closed/NVIDIA/build/triton-inference-server/out/tritonserver/build/third-party/protobuf/include -I/usr/local/cuda/include -I/media/nvmedrive/inference_results_v1.1/closed/NVIDIA/build/triton-inference-server/out/tritonserver/build/third-party/cnmem/include -I/media/nvmedrive/inference_results_v1.1/closed/NVIDIA/build/triton-inference-server/out/tritonserver/build/server/_deps/repo-common-build/protobuf -I/media/nvmedrive/inference_results_v1.1/closed/NVIDIA/build/triton-inference-server/out/tritonserver/build/server/_deps/repo-core-src/include -I/media/nvmedrive/inference_results_v1.1/closed/NVIDIA/build/triton-inference-server/out/tritonserver/build/server/_deps/repo-common-src/include -Wall -Wextra -Wno-unused-parameter -Werror -Wno-deprecated-declarations -O3 -fPIC -std=gnu++11 -o CMakeFiles/server-library.dir/cuda_memory_manager.cc.o -c /media/nvmedrive/inference_results_v1.1/closed/NVIDIA/build/triton-inference-server/src/core/cuda_memory_manager.cc
make[8]: Leaving directory '/media/nvmedrive/inference_results_v1.1/closed/NVIDIA/build/triton-inference-server/out/tritonserver/build/server'
make[7]: *** [CMakeFiles/Makefile2:851: src/backends/tensorrt/CMakeFiles/tensorrt-backend-library.dir/all] Error 2
make[7]: *** Waiting for unfinished jobs....
make[8]: Leaving directory '/media/nvmedrive/inference_results_v1.1/closed/NVIDIA/build/triton-inference-server/out/tritonserver/build/server'
[ 82%] Built target server-library
make[7]: Leaving directory '/media/nvmedrive/inference_results_v1.1/closed/NVIDIA/build/triton-inference-server/out/tritonserver/build/server'
make[6]: *** [Makefile:149: all] Error 2
make[6]: Leaving directory '/media/nvmedrive/inference_results_v1.1/closed/NVIDIA/build/triton-inference-server/out/tritonserver/build/server'
make[5]: *** [CMakeFiles/server.dir/build.make:133: server/src/server-stamp/server-build] Error 2
make[5]: Leaving directory '/media/nvmedrive/inference_results_v1.1/closed/NVIDIA/build/triton-inference-server/out/tritonserver/build'
make[4]: *** [CMakeFiles/Makefile2:150: CMakeFiles/server.dir/all] Error 2
make[4]: Leaving directory '/media/nvmedrive/inference_results_v1.1/closed/NVIDIA/build/triton-inference-server/out/tritonserver/build'
make[3]: *** [CMakeFiles/Makefile2:157: CMakeFiles/server.dir/rule] Error 2
make[3]: Leaving directory '/media/nvmedrive/inference_results_v1.1/closed/NVIDIA/build/triton-inference-server/out/tritonserver/build'
make[2]: *** [Makefile:137: server] Error 2
make[2]: Leaving directory '/media/nvmedrive/inference_results_v1.1/closed/NVIDIA/build/triton-inference-server/out/tritonserver/build'
error: make server failed
make[1]: *** [Makefile:485: build_triton] Error 1
make[1]: Leaving directory '/media/nvmedrive/inference_results_v1.1/closed/NVIDIA'
make: *** [Makefile:447: build] Error 2

I have run both install_xavier_dependencies.sh and pip3 install -r requirements_xavier.txt.

Now, when I try to run anything else, such as: make run RUN_ARGS="--benchmarks=resnet50 --scenarios=offline,singlestream"

It throws the following:

make[1]: Entering directory '/media/nvmedrive/inference_results_v1.1/closed/NVIDIA'
[2023-09-06 19:29:33,883 main.py:760 INFO] Detected System ID: AGX_Xavier
[2023-09-06 19:29:35,437 main.py:108 INFO] Building engines for resnet50 benchmark in Offline scenario...
[2023-09-06 19:29:35,441 main.py:117 INFO] Building DLA engine for AGX_Xavier_resnet50_Offline
/home/jetson/.local/lib/python3.8/site-packages/onnx/mapping.py:27: FutureWarning: In the future `np.object` will be defined as the corresponding NumPy scalar.
  int(TensorProto.STRING): np.dtype(np.object)
Process Process-1:
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/media/nvmedrive/inference_results_v1.1/closed/NVIDIA/code/main.py", line 118, in handle_generate_engine
    b = get_benchmark(config)
  File "/media/nvmedrive/inference_results_v1.1/closed/NVIDIA/code/__init__.py", line 80, in get_benchmark
    cls = get_cls(G_BENCHMARK_CLASS_MAP[benchmark])
  File "/media/nvmedrive/inference_results_v1.1/closed/NVIDIA/code/__init__.py", line 63, in get_cls
    return getattr(import_module(module_loc.module_path), module_loc.cls_name)
  File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 848, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/media/nvmedrive/inference_results_v1.1/closed/NVIDIA/code/resnet50/tensorrt/ResNet50.py", line 20, in <module>
    import onnx
  File "/home/jetson/.local/lib/python3.8/site-packages/onnx/__init__.py", line 20, in <module>
    import onnx.helper  # noqa
  File "/home/jetson/.local/lib/python3.8/site-packages/onnx/helper.py", line 17, in <module>
    from onnx import mapping
  File "/home/jetson/.local/lib/python3.8/site-packages/onnx/mapping.py", line 27, in <module>
    int(TensorProto.STRING): np.dtype(np.object)
  File "/home/jetson/.local/lib/python3.8/site-packages/numpy/__init__.py", line 305, in __getattr__
    raise AttributeError(__former_attrs__[attr])
AttributeError: module 'numpy' has no attribute 'object'.
`np.object` was a deprecated alias for the builtin `object`. To avoid this error in existing code, use `object` by itself. Doing this will not modify any behavior and is safe. 
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
    https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
Traceback (most recent call last):
  File "code/main.py", line 763, in <module>
    main(main_args, system)
  File "code/main.py", line 736, in main
    dispatch_action(main_args, config_dict, workload_id, equiv_engine_setting=equiv_engine_setting)
  File "code/main.py", line 556, in dispatch_action
    launch_handle_generate_engine(*_gen_args, **_gen_kwargs)
  File "code/main.py", line 92, in launch_handle_generate_engine
    raise RuntimeError("Building engines failed!")
RuntimeError: Building engines failed!
make[1]: *** [Makefile:619: generate_engines] Error 1
make[1]: Leaving directory '/media/nvmedrive/inference_results_v1.1/closed/NVIDIA'
make: *** [Makefile:613: run] Error 2
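
A hedged workaround sketch for the second failure: np.object was deprecated in NumPy 1.20 and removed in NumPy 1.24, so the onnx version used here only imports cleanly against an older NumPy. Pinning one (or moving to an onnx release that no longer uses the alias) avoids the failing attribute lookup:

# Hedged: keep NumPy below 1.24 so onnx's np.object reference still resolves.
$ pip3 install "numpy<1.24"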

Install tensorrt for python3.8 on Jetson Xavier NX

I am trying to reproduce the results on an NVIDIA Jetson Xavier NX. After setting up the environment, I got the following error message:

Makefile:236: *** MLPerf Inference v1.1 code requires NVIDIA Driver Version >= 465.xx.  Stop

Then I checked the environment and found that tensorrt is not in my python3.8 environment, so how can I install tensorrt for python3.8?

Why is the difference between the offline performance and the single stream performance for RNNT so big?

Hi,

I have noticed that the difference between the "offline" and "single stream" performance is much larger for RNN-T than for the other benchmarks.

For example, in NVIDIA's submission "1.1-100", the ratio of the "offline" samples/s to the calculated "single stream" samples/s is below 20 for all networks except RNN-T, but for RNN-T it is about 292.

Any help in clarifying this difference is appreciated.

Thank you!
