Comments (7)
The most reliable way to reduce startup time on the RTX A4000 (or any Ampere card with compute capability 8.6) is to precompile SASS targeting sm_86.
The tensorflow v2.6 release has the following options:
https://github.com/tensorflow/tensorflow/blob/v2.6.3/.bazelrc#L576
build:release_gpu_base --repo_env=TF_CUDA_COMPUTE_CAPABILITIES="sm_35,sm_50,sm_60,sm_70,sm_75,compute_80"
Note that compute_80 means the code is compiled to PTX for compute capability 8.0. When running on an RTX A4000, the Nvidia driver has to convert that PTX 8.0 to SASS 8.6, and that conversion is where the delay comes from.
The tensorflow v2.8 release has the following options:
https://github.com/tensorflow/tensorflow/blob/v2.8.0/.bazelrc#L608
build:release_gpu_base --repo_env=TF_CUDA_COMPUTE_CAPABILITIES="sm_35,sm_50,sm_60,sm_70,sm_75,compute_80"
The configuration is unchanged, so it won't speed up the startup time. The same applies to the tensorflow v2.7 release.
If you add sm_86 to the compute capabilities and recompile Emgu TF with that flag, e.g. change it to
build:release_gpu_base --repo_env=TF_CUDA_COMPUTE_CAPABILITIES="sm_35,sm_50,sm_60,sm_70,sm_75,sm_86,compute_80"
the RTX A4000 should start up much faster. The resulting binary will be larger, though.
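To make the reasoning above concrete, here is a minimal Python sketch that classifies a device against a TF_CUDA_COMPUTE_CAPABILITIES string. It is a simplified model that follows the reading of the flags used in this thread (an exact sm_XY match counts as precompiled SASS, a compute_XY entry at or below the device counts as PTX that can be JIT-compiled). The function name `classify` and the return labels are hypothetical, and this is not how TensorFlow or the Nvidia driver actually makes the decision.

```python
import re

def classify(capabilities: str, device_cc: int) -> str:
    """Classify a device (e.g. 86 for compute capability 8.6).

    Returns 'native', 'jit', or 'unsupported' under the simplified
    model described in the thread (assumption, not TensorFlow's logic).
    """
    sass, ptx = set(), set()
    for entry in capabilities.split(","):
        m = re.fullmatch(r"(sm|compute)_(\d+)", entry.strip())
        if not m:
            raise ValueError(f"unrecognized entry: {entry!r}")
        # sm_XY -> precompiled SASS, compute_XY -> PTX
        (sass if m.group(1) == "sm" else ptx).add(int(m.group(2)))
    if device_cc in sass:
        return "native"       # precompiled SASS for this exact target
    if any(p <= device_cc for p in ptx):
        return "jit"          # PTX present, converted at startup
    return "unsupported"

OFFICIAL = "sm_35,sm_50,sm_60,sm_70,sm_75,compute_80"
WITH_SM86 = "sm_35,sm_50,sm_60,sm_70,sm_75,sm_86,compute_80"

print(classify(OFFICIAL, 86))   # jit
print(classify(WITH_SM86, 86))  # native
```

Under this model the official list leaves an 8.6 card on the PTX path, and adding sm_86 moves it to the precompiled path.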
from emgutf.
Btw, our release will use the same default Tensorflow compilation flags to make sure Emgu TF behaves the same as the official Tensorflow release.
from emgutf.
It does look like we're running into PTX compilation on Ampere and it sounds like this may be resolved whenever the official Tensorflow release is updated to natively support sm_80 (or even sm_86).
However, on upgrading from EmguTF 2.2 to EmguTF 2.6, we're now seeing PTX compilation for older cards as well, including all Turing (sm_75) hardware. We've confirmed that EmguTF 2.6 reports:
TensorFlow was not built with CUDA kernel binaries compatible with compute capability 7.5. CUDA kernels will be jit-compiled from PTX, which could take 30 minutes or longer.
This seems unintended based on the flags listed for tensorflow 2.6 above.
Is there a way to tell what compute EmguTF 2.6 was precompiled for?
from emgutf.
I found this command, which lists the compute capabilities included in the dll:
cuobjdump -ptx .\tfextern.dll > out.txt
I tested it against the Emgu TF 2.6 release. The header says:
Fatbin ptx code:
================
arch = sm_52
code version = [7,3]
producer = <unknown>
host = windows
compile_size = 64bit
compressed
...
That means PTX for sm_52 is included in the Emgu TF 2.6 with CUDA for Windows release.
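If the dump is long, the `arch = ...` lines can be pulled out with a few lines of Python. This is only a sketch: the helper name `embedded_archs` is hypothetical, and it assumes the dump uses the same `arch = sm_XY` header format as the output quoted above.

```python
import re

def embedded_archs(dump_text: str) -> list:
    """Return the sorted, de-duplicated `arch = ...` values in a cuobjdump text dump."""
    found = re.findall(r"^\s*arch\s*=\s*(\S+)", dump_text, flags=re.MULTILINE)
    return sorted(set(found))

# Sample header copied from the Emgu TF 2.6 output shown above.
SAMPLE = """\
Fatbin ptx code:
================
arch = sm_52
code version = [7,3]
producer = <unknown>
host = windows
compile_size = 64bit
compressed
"""

print(embedded_archs(SAMPLE))  # ['sm_52']
```

To use it on a real dump, pass in the contents of out.txt, decoded with whatever encoding your shell used when writing the file.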
I traced this back through the build script. For the 2.6 release, we used this build script:
https://github.com/emgucv/emgutf/blob/2.6.0/platforms/windows/bazel_build_tf.bat#L111
The highlighted line sets the compute capability to 5.2.
The reason is that we are using the tensorflow build script for windows here:
https://github.com/tensorflow/tensorflow/blob/v2.6.0/tensorflow/tools/ci_build/windows/libtensorflow_gpu.sh
which references the common_env.sh script here:
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/tools/ci_build/windows/bazel/common_env.sh#L60
The highlighted line there enables only compute capability 6.0 for the windows build. This overrides the default bazel configuration here:
https://github.com/tensorflow/tensorflow/blob/v2.6.0/.bazelrc#L576
I remember there was a request to make Emgu TF compatible with older 5.2 devices, and that's why our build script changes that to 5.2 instead of 6.0.
I will work on enabling a full list of architectures for our windows build. I will try to set:
TF_CUDA_COMPUTE_CAPABILITIES="sm_35,sm_50,sm_60,sm_70,sm_75,sm_86,compute_80"
I will keep you posted with updates.
from emgutf.
The current build script has been updated:
https://github.com/emgucv/emgutf/blob/master/platforms/windows/bazel_build_tf.bat#L134
SET TF_CUDA_COMPUTE_CAPABILITIES=sm_35,sm_50,sm_60,sm_70,sm_75,sm_80,sm_86,compute_80
The next release will have the above compute capabilities enabled.
from emgutf.
FYI, the new release v2.8.0 is out with the above compute capabilities enabled.
from emgutf.
Closing ticket now.
from emgutf.