Comments (10)
You are not doing anything wrong. There is an issue with SYCL and the SIMD instructions (i.e. the -mavx and -msse flags). I am running tests right now to see if a simple fix is possible that would allow keeping the SIMD instructions for the CPU.
Otherwise the only solution would be to remove these flags.
I will keep you posted.
Also, I noticed that you are using clang. You should switch to at least gcc-4.8 (and g++-4.8).
from tensorflow-opencl.
Ok this seems to compile and pass the tests just fine.
Make sure you first switch to gcc-4.8 (the link you mentioned needs to be updated).
If this still does not work, please try to apply my fix that you can find here: lukeiwanski/tensorflow@fabe385
Ok so this is just some incomplete information.
If you have a look here: http://deep-beta.co.uk/setting-up-tensorflow-with-opencl-using-sycl/ you can find a very similar guide. The last command you need to launch from the _python_build directory is
python setup.py develop
This will actually create an egg-link in the dist-package folder that points to the _python_build folder and acts just like a package installed from pip.
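For what it's worth, the egg-link mechanism is simple enough to sketch. Here is a minimal simulation of what `python setup.py develop` writes (the paths are made up, and a temp directory stands in for the real site-packages folder):

```python
import os
import tempfile

# Simulate what `python setup.py develop` drops into site-packages:
# a tensorflow.egg-link file whose first line is the source checkout path.
site_packages = tempfile.mkdtemp()                   # stand-in for site-packages
source_dir = "/home/user/tensorflow/_python_build"   # hypothetical path

link_path = os.path.join(site_packages, "tensorflow.egg-link")
with open(link_path, "w") as f:
    f.write(source_dir + "\n.")

# At startup, setuptools' site hook reads the first line and appends it to
# sys.path, so imports resolve into _python_build as if pip-installed.
with open(link_path) as f:
    print(f.readline().strip())  # -> /home/user/tensorflow/_python_build
```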
Note that if you want to launch python and import tensorflow you will have to be outside the repository project.
Also, I wasn't able to create a .whl the way you tried either. Not sure why.
- When the GPU is enabled, whether with CUDA or OpenCL, all the heavy work is done there, so the CPU won't be used much. The optimization flags don't actually affect CPU usage, so it is expected to observe usage around 100%.
- I think your VRAM is definitely the bottleneck here, as you suspected. For comparison, I have 4 GB. It is true that a lot of copies happen between the CPU and GPU; some work still needs to be done to avoid that. Your question is actually related to SYCL, so you can probably get more help here: https://github.com/lukeiwanski/tensorflow.
Well, assuming you don't want to spend more money on a GPU, the only possibility I see is contributing to this repository ;) There are other optimizations to do to avoid copies.
(ok that's very nice of you ^^)
Thank you. That helped with compiling! Compiling now finishes successfully. However, I am not quite sure how to proceed from here to get a working build within a virtual environment. I tried to create a wheel like so:
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
Then, I tried:
pip install /tmp/tensorflow_pkg/NAME_OF_WHEEL.whl
Now when I try to just import TensorFlow in a Jupyter notebook, I get the following errors:
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
/home/anthonyle/anaconda3/envs/deep_learning_gpu3/lib/python3.5/site-packages/tensorflow/python/__init__.py in <module>()
60 sys.setdlopenflags(_default_dlopen_flags | ctypes.RTLD_GLOBAL)
---> 61 from tensorflow.python import pywrap_tensorflow
62 sys.setdlopenflags(_default_dlopen_flags)
/home/anthonyle/anaconda3/envs/deep_learning_gpu3/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow.py in <module>()
27 return _mod
---> 28 _pywrap_tensorflow = swig_import_helper()
29 del swig_import_helper
/home/anthonyle/anaconda3/envs/deep_learning_gpu3/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow.py in swig_import_helper()
23 try:
---> 24 _mod = imp.load_module('_pywrap_tensorflow', fp, pathname, description)
25 finally:
/home/anthonyle/anaconda3/envs/deep_learning_gpu3/lib/python3.5/imp.py in load_module(name, file, filename, details)
241 else:
--> 242 return load_dynamic(name, filename, file)
243 elif type_ == PKG_DIRECTORY:
/home/anthonyle/anaconda3/envs/deep_learning_gpu3/lib/python3.5/imp.py in load_dynamic(name, path, file)
341 name=name, loader=loader, origin=path)
--> 342 return _load(spec)
343
ImportError: libComputeCpp.so: cannot open shared object file: No such file or directory
During handling of the above exception, another exception occurred:
ImportError Traceback (most recent call last)
<ipython-input-1-c61832825467> in <module>()
28 from sklearn.cross_validation import KFold, StratifiedKFold
29 from sklearn.model_selection import train_test_split
---> 30 from keras.applications import ResNet50, InceptionV3
31 from keras.models import Sequential, Model
32 from keras.layers.core import Dense, Dropout, Flatten
/home/anthonyle/anaconda3/envs/deep_learning_gpu3/lib/python3.5/site-packages/keras/__init__.py in <module>()
1 from __future__ import absolute_import
2
----> 3 from . import activations
4 from . import applications
5 from . import backend
/home/anthonyle/anaconda3/envs/deep_learning_gpu3/lib/python3.5/site-packages/keras/activations.py in <module>()
1 from __future__ import absolute_import
2 import six
----> 3 from . import backend as K
4 from .utils.generic_utils import deserialize_keras_object
5
/home/anthonyle/anaconda3/envs/deep_learning_gpu3/lib/python3.5/site-packages/keras/backend/__init__.py in <module>()
71 elif _BACKEND == 'tensorflow':
72 sys.stderr.write('Using TensorFlow backend.\n')
---> 73 from .tensorflow_backend import *
74 else:
75 raise ValueError('Unknown backend: ' + str(_BACKEND))
/home/anthonyle/anaconda3/envs/deep_learning_gpu3/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py in <module>()
----> 1 import tensorflow as tf
2 from tensorflow.python.training import moving_averages
3 from tensorflow.python.ops import tensor_array_ops
4 from tensorflow.python.ops import control_flow_ops
5 from tensorflow.python.ops import functional_ops
/home/anthonyle/anaconda3/envs/deep_learning_gpu3/lib/python3.5/site-packages/tensorflow/__init__.py in <module>()
22
23 # pylint: disable=wildcard-import
---> 24 from tensorflow.python import *
25 # pylint: enable=wildcard-import
26
/home/anthonyle/anaconda3/envs/deep_learning_gpu3/lib/python3.5/site-packages/tensorflow/python/__init__.py in <module>()
70 for some common reasons and solutions. Include the entire stack trace
71 above this error message when asking for help.""" % traceback.format_exc()
---> 72 raise ImportError(msg)
73
74 # Protocol buffers
ImportError: Traceback (most recent call last):
File "/home/anthonyle/anaconda3/envs/deep_learning_gpu3/lib/python3.5/site-packages/tensorflow/python/__init__.py", line 61, in <module>
from tensorflow.python import pywrap_tensorflow
File "/home/anthonyle/anaconda3/envs/deep_learning_gpu3/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow.py", line 28, in <module>
_pywrap_tensorflow = swig_import_helper()
File "/home/anthonyle/anaconda3/envs/deep_learning_gpu3/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow.py", line 24, in swig_import_helper
_mod = imp.load_module('_pywrap_tensorflow', fp, pathname, description)
File "/home/anthonyle/anaconda3/envs/deep_learning_gpu3/lib/python3.5/imp.py", line 242, in load_module
return load_dynamic(name, filename, file)
File "/home/anthonyle/anaconda3/envs/deep_learning_gpu3/lib/python3.5/imp.py", line 342, in load_dynamic
return _load(spec)
ImportError: libComputeCpp.so: cannot open shared object file: No such file or directory
Failed to load the native TensorFlow runtime.
See https://github.com/tensorflow/tensorflow/blob/master/tensorflow/g3doc/get_started/os_setup.md#import_error
for some common reasons and solutions. Include the entire stack trace
above this error message when asking for help.
Am I too ambitious trying to make a wheel and then do a pip install into my virtual environment? I see that in the tutorial your colleague is actually:
```shell
$ mkdir _python_build
$ cd _python_build
$ ln -s ../bazel-bin/tensorflow/tools/pip_package/build_pip_package.runfiles/org_tensorflow/* .
$ ln -s ../tensorflow/tools/pip_package/* .
```
...creating symbolic links into the _python_build folder. I fail to understand how this will install a Python package into the site-packages folder... or make tensorflow available to the system. Could you help enlighten me on that?
Hi Zakor,
I thought it would be wise for me to also run the tests as recommended. I am noticing 2 things:
1.) There is a timeout option set in the tutorial like so:
bazel test --config=sycl -k --test_timeout 1600 -- //tensorflow/... -//tensorflow/contrib/... -//tensorflow/java/... -//tensorflow/compiler/...
It's rather high. Are these limits high because the test computations are rather expensive?
2.) A lot of these tests time out on my setup. Is this an indication that the integrated GPU is not quite powerful enough to perform these tests in time? As of this writing, the tests have been running for 2 days, and I expect them to finish within one more day. Is this an indication of deeper problems, e.g. that the GPU is not actually used?
Update:
After compiling the build without errors, I proceeded to compare some performance:
TensorFlow from pip: 1 epoch = ~1,700 sec (CPU utilization ~350%)
TensorFlow+SSE (4.1+4.2)+AVX+Keras on the CIFAR-10 dataset: 1 epoch = ~1,100 sec (CPU utilization ~350%)
TensorFlow+OpenCL: 1 epoch = ~11,000 sec (CPU utilization ~150%)
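As a sanity check on these numbers:

```python
# Reported seconds per epoch from the runs above.
pip_epoch = 1700        # stock pip TensorFlow (CPU)
optimized_epoch = 1100  # SSE4.1/4.2 + AVX build (CPU)
opencl_epoch = 11000    # OpenCL/SYCL build

# Slowdown of the OpenCL build relative to the optimized CPU build.
print(opencl_epoch / optimized_epoch)   # -> 10.0
```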
This is about 10x worse than what I would get with just an optimized TensorFlow compilation. I am not currently getting the hoped-for performance increases. How can I test whether the GPU is used at all?
Hi,
I am not sure why we need such a high timeout either.
Two days is definitely too much! I usually only test -- //tensorflow/... -//tensorflow/compiler/..., which takes less than an hour. Also try adding --local_test_jobs=8. Even when I ran all the tests, I don't remember any timeouts.
Maybe it is, as you say, because the integrated GPU is not powerful enough. To make sure, I think you should use an external tool such as aticonfig --odgc --odgt, since you have an AMD.
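As for the libComputeCpp.so import error from your earlier post: the dynamic loader has to be able to locate ComputeCpp's library directory before Python starts. A sketch, assuming ComputeCpp lives under /usr/local/computecpp (adjust to your actual install path):

```shell
# Assumed install prefix: change this to wherever ComputeCpp is unpacked.
COMPUTECPP_DIR=/usr/local/computecpp
export LD_LIBRARY_PATH="$COMPUTECPP_DIR/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
# Then retry:  python -c "import tensorflow"
```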
Thank you for guiding me along:
I have reinstalled tensorflow-opencl because I had specified incorrect paths to the ComputeCpp, g++ and gcc compilers last time. I have also rerun my CIFAR-10 benchmark of tensorflow-opencl with the integrated GPU on my AMD A10-7850 setup. This time I also looked at GPU usage, and I was able to verify that the GPU is used at 100% capacity. The CPU usage is still at 150%. The time to process 1 epoch is still about 5,800 seconds.
So this is still about 5X worse than just using optimized compilation flags. These are my general observations and thoughts after having tried out tensorflow-opencl:
1.) The fact that on my 4-core setup the CPU usage is only 150% indicates that multithreading is somehow not fully efficient in the OpenCL setup, because without OpenCL all 4 cores get used.
2.) I am not sure what it is, but I thought any matrix computation should be faster on a GPU than on a CPU. I think in the case of OpenCL 1.2 we are still talking about copying data from CPU RAM to GPU RAM before any computations can be done, and if the GPU only has 1 GB of RAM, there is potentially a lot of copying back and forth. Could this be the cause of the slow computations? With OpenCL 2.0, I think one does not have to copy data back and forth but can just pass pointers. May I ask why you went with OpenCL 1.2 instead of OpenCL 2.0? As it stands, the full potential of APUs cannot currently be exploited.
3.) It appears that OpenCL 1.2 is not a full substitute for handcrafted/optimized assembly libraries at this time.
How can I further reduce computation times without NVIDIA GPUs? I still have a few months before I commit. One of the things I am waiting for is the release of the Vega GPUs. In the absence of any software framework on top to exploit the GPU's potential, it will be hard to go with the Vega cards, though. For one, the much-discussed ROCm and MIOpen initiatives have not been released yet.
In general what are your thoughts?
(Oh and by the way, if you need help with benchmarking on an APU system, now that things are working for me, I would be happy to help out!)
Well, I will spend more money on a GPU in about 2-5 months. The question then will be which GPU to get (NVIDIA vs. AMD). At this point, the answer is tilting toward NVIDIA, but I really want to give AMD a good chance first. Between MIOpen, ROCm, and this project, the most robust path on AMD GPUs seems less well defined.
This probably goes outside the scope of this discussion...in which case, could you point me to lists of optimizations that still need to be done?
Yes this is definitely getting out of scope. Please open an issue on https://github.com/lukeiwanski/tensorflow about the optimizations that can be done to avoid copies.