resnet50_netdef does not run on CPU (server, CLOSED)

cc911 commented on May 16, 2024

resnet50_netdef does not run on CPU

from server.

Comments (5)

deadeyegoodwin commented on May 16, 2024

From the config.pbtxt that you show, I don't see why you expect it to run on the CPU. To request CPU execution, config.pbtxt would contain:

instance_group [
  {
    kind: KIND_CPU
  }
]
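
As a quick sanity check, the instance kinds that a configuration requests can be listed with a few lines of Python. This is only a minimal sketch: it scans the text with a regular expression instead of parsing the protobuf schema, and the helper name `instance_kinds` is hypothetical, not part of the inference server.

```python
import re


def instance_kinds(config_text):
    """Return every `kind: KIND_*` value that appears in the config text."""
    # A plain text scan; it does not verify that the match sits inside
    # an instance_group block.
    return re.findall(r"\bkind\s*:\s*(KIND_\w+)", config_text)


sample = """
instance_group [
  {
    kind: KIND_CPU
  }
]
"""

print(instance_kinds(sample))  # prints ['KIND_CPU']
```

An empty list would mean the file does not set a kind at all.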


cc911 commented on May 16, 2024

@deadeyegoodwin unfortunately that didn't work. Any other recommendation?

As a side note, running the model on CPU would allow me to do some benchmarks.

My /docs/examples/model_repository/resnet50_netdef/config.pbtxt looks like this:
name: "resnet50_netdef"
platform: "caffe2_netdef"
max_batch_size: 128
input [
  {
    name: "data"
    data_type: TYPE_FP32
    format: FORMAT_NCHW
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: "softmax"
    data_type: TYPE_FP32
    dims: [ 1000 ]
    label_filename: "resnet50_labels.txt"
  }
]
instance_group [
  {
    kind: KIND_CPU
  }
]

Output:

===============================
== TensorRT Inference Server ==

NVIDIA Release 18.12 (build 880120)

Copyright (c) 2018, NVIDIA CORPORATION. All rights reserved.
Copyright 2018 The TensorFlow Authors. All rights reserved.
Copyright 2018 The TensorFlow Serving Authors. All rights reserved.
Copyright (c) 2016-present, Facebook Inc. All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION. All rights reserved.
NVIDIA modifications are covered by the license terms that apply to the underlying
project or file.
ERROR: Detected NVIDIA Quadro M1000M GPU, which is not supported by this container
ERROR: No supported GPU(s) detected to run this container

NOTE: The SHMEM allocation limit is set to the default of 64MB. This may be
insufficient for the inference server. NVIDIA recommends the use of the following flags:
nvidia-docker run --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 ...

I0128 09:56:37.392247 1 server.cc:701] Initializing TensorRT Inference Server
I0128 09:56:37.392330 1 server.cc:751] Reporting prometheus metrics on port 8002
I0128 09:56:37.394590 1 metrics.cc:148] found 1 GPUs supporting NVML metrics
I0128 09:56:37.400388 1 metrics.cc:158] GPU 0: Quadro M1000M
I0128 09:56:37.401003 1 server.cc:1121] Starting server 'inference:0' listening on
I0128 09:56:37.401019 1 server.cc:1125] localhost:8001 for gRPC requests
I0128 09:56:37.401149 1 server.cc:1029] Building nvrpc server
I0128 09:56:37.401170 1 server.cc:1035] Register TensorRT GRPCService
I0128 09:56:37.401185 1 server.cc:1038] Register Infer RPC
I0128 09:56:37.401196 1 server.cc:1042] Register Status RPC
I0128 09:56:37.401202 1 server.cc:1046] Register Profile RPC
I0128 09:56:37.401207 1 server.cc:1050] Register Health RPC
I0128 09:56:37.401213 1 server.cc:1054] Register Executor
I0128 09:56:37.405083 1 server.cc:1135] localhost:8000 for HTTP requests
[warn] getaddrinfo: address family for nodename not supported
[evhttp_server.cc : 237] RAW: Entering the event loop ...
I0128 09:56:37.457394 1 server_status.cc:105] New status tracking for model 'inception_graphdef'
I0128 09:56:37.457415 1 server_status.cc:105] New status tracking for model 'resnet50_netdef'
I0128 09:56:37.457423 1 server_status.cc:105] New status tracking for model 'simple'
I0128 09:56:37.458118 1 server_core.cc:465] Adding/updating models.
I0128 09:56:37.458129 1 server_core.cc:562] (Re-)adding model: inception_graphdef
I0128 09:56:37.458134 1 server_core.cc:562] (Re-)adding model: resnet50_netdef
I0128 09:56:37.458158 1 server_core.cc:562] (Re-)adding model: simple
I0128 09:56:37.558482 1 basic_manager.cc:739] Successfully reserved resources to load servable {name: resnet50_netdef version: 1}
I0128 09:56:37.558526 1 loader_harness.cc:66] Approving load for servable version {name: resnet50_netdef version: 1}
I0128 09:56:37.558545 1 loader_harness.cc:74] Loading servable version {name: resnet50_netdef version: 1}
I0128 09:56:37.658411 1 basic_manager.cc:739] Successfully reserved resources to load servable {name: simple version: 1}
I0128 09:56:37.658431 1 loader_harness.cc:66] Approving load for servable version {name: simple version: 1}
I0128 09:56:37.658457 1 loader_harness.cc:74] Loading servable version {name: simple version: 1}
I0128 09:56:37.659081 1 base_bundle.cc:168] Creating instance simple_0_gpu0 on GPU 0 (5.0) using model.graphdef
I0128 09:56:37.706952 1 netdef_bundle.cc:215] Creating instance resnet50_netdef_0_0_cpu on CPU using init_model.netdef and model.netdef
E0128 09:56:37.707187 1 init_intrinsics_check.cc:43] CPU feature avx is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
E0128 09:56:37.707202 1 init_intrinsics_check.cc:43] CPU feature avx2 is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
E0128 09:56:37.707207 1 init_intrinsics_check.cc:43] CPU feature fma is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
I0128 09:56:37.728223 1 cuda_gpu_executor.cc:957] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I0128 09:56:37.728769 1 gpu_device.cc:1432] Found device 0 with properties:
name: Quadro M1000M major: 5 minor: 0 memoryClockRate(GHz): 1.0715
pciBusID: 0000:01:00.0
totalMemory: 1.96GiB freeMemory: 683.38MiB
I0128 09:56:37.728783 1 gpu_device.cc:1482] Ignoring visible gpu device (device: 0, name: Quadro M1000M, pci bus id: 0000:01:00.0, compute capability: 5.0) with Cuda compute capability 5.0. The minimum required Cuda capability is 5.2.
I0128 09:56:37.728792 1 gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
I0128 09:56:37.728801 1 gpu_device.cc:988] 0
I0128 09:56:37.728812 1 gpu_device.cc:1001] 0: N
I0128 09:56:37.749739 1 loader_harness.cc:86] Successfully loaded servable version {name: simple version: 1}
I0128 09:56:37.749735 1 infer.cc:788] Starting runner thread 0 at nice 5...
I0128 09:56:37.758545 1 basic_manager.cc:739] Successfully reserved resources to load servable {name: inception_graphdef version: 1}
I0128 09:56:37.758587 1 loader_harness.cc:66] Approving load for servable version {name: inception_graphdef version: 1}
I0128 09:56:37.758596 1 loader_harness.cc:74] Loading servable version {name: inception_graphdef version: 1}
I0128 09:56:37.759763 1 base_bundle.cc:168] Creating instance inception_graphdef_0_0_gpu0 on GPU 0 (5.0) using model.graphdef
I0128 09:56:37.759823 1 gpu_device.cc:1482] Ignoring visible gpu device (device: 0, name: Quadro M1000M, pci bus id: 0000:01:00.0, compute capability: 5.0) with Cuda compute capability 5.0. The minimum required Cuda capability is 5.2.
I0128 09:56:37.759840 1 gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
I0128 09:56:37.759851 1 gpu_device.cc:988] 0
I0128 09:56:37.759861 1 gpu_device.cc:1001] 0: N
I0128 09:56:38.127098 1 base_bundle.cc:168] Creating instance inception_graphdef_0_1_gpu0 on GPU 0 (5.0) using model.graphdef
I0128 09:56:38.127202 1 gpu_device.cc:1482] Ignoring visible gpu device (device: 0, name: Quadro M1000M, pci bus id: 0000:01:00.0, compute capability: 5.0) with Cuda compute capability 5.0. The minimum required Cuda capability is 5.2.
I0128 09:56:38.127228 1 gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
I0128 09:56:38.127233 1 gpu_device.cc:988] 0
I0128 09:56:38.127237 1 gpu_device.cc:1001] 0: N
I0128 09:56:38.308329 1 base_bundle.cc:168] Creating instance inception_graphdef_0_2_gpu0 on GPU 0 (5.0) using model.graphdef
I0128 09:56:38.308400 1 gpu_device.cc:1482] Ignoring visible gpu device (device: 0, name: Quadro M1000M, pci bus id: 0000:01:00.0, compute capability: 5.0) with Cuda compute capability 5.0. The minimum required Cuda capability is 5.2.
I0128 09:56:38.308411 1 gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
I0128 09:56:38.308420 1 gpu_device.cc:988] 0
I0128 09:56:38.308426 1 gpu_device.cc:1001] 0: N
I0128 09:56:38.526304 1 base_bundle.cc:168] Creating instance inception_graphdef_0_3_gpu0 on GPU 0 (5.0) using model.graphdef
I0128 09:56:38.526370 1 gpu_device.cc:1482] Ignoring visible gpu device (device: 0, name: Quadro M1000M, pci bus id: 0000:01:00.0, compute capability: 5.0) with Cuda compute capability 5.0. The minimum required Cuda capability is 5.2.
I0128 09:56:38.526379 1 gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
I0128 09:56:38.526384 1 gpu_device.cc:988] 0
I0128 09:56:38.526389 1 gpu_device.cc:1001] 0: N
W0128 09:56:38.612561 1 workspace.cc:170] Blob gpu_0/data not in the workspace.
terminate called after throwing an instance of 'c10::Error'
what(): [enforce fail at operator.cc:46] blob != nullptr. op Conv: Encountered a non-existing input blob: gpu_0/data
frame #0: c10::ThrowEnforceNotMet(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, void const*) + 0x76 (0x7fae9499e416 in /opt/tensorrtserver/lib/libc10.so)
frame #1: caffe2::OperatorBase::OperatorBase(caffe2::OperatorDef const&, caffe2::Workspace*) + 0x6aa (0x7faee32d171a in /opt/tensorrtserver/lib/libcaffe2.so)
frame #2: + 0x1527a75 (0x7faee3412a75 in /opt/tensorrtserver/lib/libcaffe2.so)
frame #3: + 0x152b43c (0x7faee341643c in /opt/tensorrtserver/lib/libcaffe2.so)
frame #4: + 0x15eaad2 (0x7faee34d5ad2 in /opt/tensorrtserver/lib/libcaffe2.so)
frame #5: std::_Function_handler<std::unique_ptr<caffe2::OperatorBase, std::default_delete<caffe2::OperatorBase> > (caffe2::OperatorDef const&, caffe2::Workspace*), std::unique_ptr<caffe2::OperatorBase, std::default_delete<caffe2::OperatorBase> > (*)(caffe2::OperatorDef const&, caffe2::Workspace*)>::_M_invoke(std::_Any_data const&, caffe2::OperatorDef const&, caffe2::Workspace*&&) + 0x23 (0x7faee30c3fd3 in /opt/tensorrtserver/lib/libcaffe2.so)
frame #6: + 0x13e45b8 (0x7faee32cf5b8 in /opt/tensorrtserver/lib/libcaffe2.so)
frame #7: + 0x13e6a19 (0x7faee32d1a19 in /opt/tensorrtserver/lib/libcaffe2.so)
frame #8: caffe2::CreateOperator(caffe2::OperatorDef const&, caffe2::Workspace*, int) + 0x5c2 (0x7faee32d28b2 in /opt/tensorrtserver/lib/libcaffe2.so)
frame #9: caffe2::SimpleNet::SimpleNet(std::shared_ptr<caffe2::NetDef const> const&, caffe2::Workspace*) + 0x455 (0x7faee3277355 in /opt/tensorrtserver/lib/libcaffe2.so)
frame #10: + 0x1390c3e (0x7faee327bc3e in /opt/tensorrtserver/lib/libcaffe2.so)
frame #11: + 0x138cbb3 (0x7faee3277bb3 in /opt/tensorrtserver/lib/libcaffe2.so)
frame #12: caffe2::CreateNet(std::shared_ptr<caffe2::NetDef const> const&, caffe2::Workspace*) + 0xb67 (0x7faee32a0dc7 in /opt/tensorrtserver/lib/libcaffe2.so)
frame #13: caffe2::Workspace::CreateNet(std::shared_ptr<caffe2::NetDef const> const&, bool) + 0x14b (0x7faee32be24b in /opt/tensorrtserver/lib/libcaffe2.so)
frame #14: caffe2::Workspace::CreateNet(caffe2::NetDef const&, bool) + 0x9f (0x7faee32bf91f in /opt/tensorrtserver/lib/libcaffe2.so)
frame #15: + 0x140d229 (0x7faee32f8229 in /opt/tensorrtserver/lib/libcaffe2.so)
frame #16: Caffe2WorkspaceCreate + 0x1307 (0x7faee32fa277 in /opt/tensorrtserver/lib/libcaffe2.so)
frame #17: + 0x870054 (0x55facd54a054 in trtserver)
frame #18: + 0x87120e (0x55facd54b20e in trtserver)
frame #19: + 0x867e1f (0x55facd541e1f in trtserver)
frame #20: + 0x863cc8 (0x55facd53dcc8 in trtserver)
frame #21: + 0x863ebc (0x55facd53debc in trtserver)
frame #22: + 0x86834a (0x55facd54234a in trtserver)
frame #23: + 0x8e6869 (0x55facd5c0869 in trtserver)
frame #24: + 0x8e8c87 (0x55facd5c2c87 in trtserver)
frame #25: + 0x8e7ba6 (0x55facd5c1ba6 in trtserver)
frame #26: + 0x8e3eed (0x55facd5bdeed in trtserver)
frame #27: + 0x8e430c (0x55facd5be30c in trtserver)
frame #28: + 0x8e5b66 (0x55facd5bfb66 in trtserver)
frame #29: + 0x8e5c5f (0x55facd5bfc5f in trtserver)
frame #30: + 0x66b41a9 (0x55fad338e1a9 in trtserver)
frame #31: + 0x66b2347 (0x55fad338c347 in trtserver)
frame #32: + 0xb8c80 (0x7fae8507cc80 in /usr/lib/x86_64-linux-gnu/libstdc++.so.6)
frame #33: + 0x76ba (0x7fae85c846ba in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #34: clone + 0x6d (0x7fae84aeb41d in /lib/x86_64-linux-gnu/libc.so.6)
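
The fatal line in the trace names the blob that the Caffe2 network could not find, so one thing worth checking is whether that blob name matches a tensor name declared in config.pbtxt. The following is a minimal sketch of that comparison, assuming the error text and the configuration are available as strings. The helper names `missing_blob` and `declared_names` are hypothetical, and the regular expressions are tuned only to the message and config layout shown above.

```python
import re


def missing_blob(log_text):
    """Extract the blob name from a 'non-existing input blob' error, if any."""
    m = re.search(r"non-existing input blob:\s*(\S+)", log_text)
    return m.group(1) if m else None


def declared_names(config_text):
    """Collect `name:` values that appear inside `{ ... }` entries.

    The top-level model name is skipped because it is not inside braces.
    """
    return re.findall(r'\{[^{}]*?\bname\s*:\s*"([^"]+)"', config_text)


log = ("what(): [enforce fail at operator.cc:46] blob != nullptr. "
       "op Conv: Encountered a non-existing input blob: gpu_0/data")

config = '''name: "resnet50_netdef"
input [
{
name: "data"
}
]
output [
{
name: "softmax"
}
]
'''

blob = missing_blob(log)
names = declared_names(config)
print(blob, names, blob in names)  # prints: gpu_0/data ['data', 'softmax'] False
```

Here the network asks for `gpu_0/data` while the configuration declares `data`, which is one plausible explanation for the failure.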


GuanLuo commented on May 16, 2024

@cc911 I can't reproduce the issue. What command did you use to start the container?
I am using the 18.12 container with a GT 710 GPU (compute capability: 3.5).
This is my config.pbtxt for resnet50_netdef:

name: "resnet50_netdef"
platform: "caffe2_netdef"
max_batch_size: 128
input [
  {
    name: "gpu_0/data"
    data_type: TYPE_FP32
    format: FORMAT_NCHW
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: "gpu_0/softmax"
    data_type: TYPE_FP32
    dims: [ 1000 ]
    label_filename: "resnet50_labels.txt"
  }
]
instance_group [
  {
    kind: KIND_CPU
  }
]


GuanLuo commented on May 16, 2024

Update: to use the CPU, you only need to change the instance_group in the config file. Notice that the config I used still keeps the "gpu_0/" prefix in the input and output names.
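
A minimal sketch of that kind of edit, assuming the configuration is handled as plain text: it replaces the first instance_group block with a CPU-only one and leaves every other line, including the "gpu_0/" tensor names, untouched. The helper name `use_cpu` and the sample input (with an illustrative KIND_GPU block) are hypothetical, not taken from the example repository.

```python
CPU_GROUP = "instance_group [\n  {\n    kind: KIND_CPU\n  }\n]\n"


def use_cpu(config_text):
    """Replace the first instance_group block (if any) with a CPU-only one."""
    start = config_text.find("instance_group")
    if start != -1:
        depth = 0
        end = None
        # Walk from the opening '[' to its matching ']' so that nested
        # lists such as `gpus: [ 0 ]` are skipped correctly.
        for i in range(config_text.index("[", start), len(config_text)):
            ch = config_text[i]
            if ch == "[":
                depth += 1
            elif ch == "]":
                depth -= 1
                if depth == 0:
                    end = i + 1
                    break
        if end is None:
            raise ValueError("unbalanced brackets in instance_group")
        config_text = config_text[:start] + config_text[end:]
    # Append the CPU block; input and output entries are not modified.
    return config_text.rstrip() + "\n" + CPU_GROUP


original = '''name: "resnet50_netdef"
input [
  {
    name: "gpu_0/data"
  }
]
instance_group [
  {
    kind: KIND_GPU
    gpus: [ 0 ]
  }
]
'''

updated = use_cpu(original)
print(updated)
```

The sketch handles only a single instance_group block; a file with several would need the loop repeated.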


deadeyegoodwin commented on May 16, 2024

Closing. @cc911 if you still see a failure after fixing your model configuration please re-open.
