
custom-op's Introduction


Documentation

TensorFlow is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries, and community resources that lets researchers push the state-of-the-art in ML and developers easily build and deploy ML-powered applications.

TensorFlow was originally developed by researchers and engineers working within the Machine Intelligence team at Google Brain to conduct research in machine learning and neural networks. However, the framework is versatile enough to be used in other areas as well.

TensorFlow provides stable Python and C++ APIs, as well as a non-guaranteed backward compatible API for other languages.

Keep up-to-date with release announcements and security updates by subscribing to announce@tensorflow.org. See all the mailing lists.

Install

See the TensorFlow install guide for the pip package, to enable GPU support, use a Docker container, and build from source.

To install the current release, which includes support for CUDA-enabled GPU cards (Ubuntu and Windows):

$ pip install tensorflow

Other devices (DirectX and MacOS-metal) are supported using Device plugins.

A smaller CPU-only package is also available:

$ pip install tensorflow-cpu

To update TensorFlow to the latest version, add the --upgrade flag to the above commands.

Nightly binaries are available for testing using the tf-nightly and tf-nightly-cpu packages on PyPI.

Try your first TensorFlow program

$ python
>>> import tensorflow as tf
>>> tf.add(1, 2).numpy()
3
>>> hello = tf.constant('Hello, TensorFlow!')
>>> hello.numpy()
b'Hello, TensorFlow!'

For more examples, see the TensorFlow tutorials.

Contribution guidelines

If you want to contribute to TensorFlow, be sure to review the contribution guidelines. This project adheres to TensorFlow's code of conduct. By participating, you are expected to uphold this code.

We use GitHub issues for tracking requests and bugs; please use the TensorFlow Forum for general questions and discussion, and direct specific questions to Stack Overflow.

The TensorFlow project strives to abide by generally accepted best practices in open-source software development.

Patching guidelines

Follow these steps to patch a specific version of TensorFlow, for example, to apply fixes to bugs or security vulnerabilities:

  • Clone the TensorFlow repo and switch to the corresponding branch for your desired TensorFlow version, for example, branch r2.8 for version 2.8.
  • Apply (that is, cherry-pick) the desired changes and resolve any code conflicts.
  • Run TensorFlow tests and ensure they pass.
  • Build the TensorFlow pip package from source.

Continuous build status

You can find more community-supported platforms and configurations in the TensorFlow SIG Build community builds table.

Official Builds

Build Type Status Artifacts
Linux CPU Status PyPI
Linux GPU Status PyPI
Linux XLA Status TBA
macOS Status PyPI
Windows CPU Status PyPI
Windows GPU Status PyPI
Android Status Download
Raspberry Pi 0 and 1 Status Py3
Raspberry Pi 2 and 3 Status Py3
Libtensorflow MacOS CPU Status Temporarily Unavailable Nightly Binary Official GCS
Libtensorflow Linux CPU Status Temporarily Unavailable Nightly Binary Official GCS
Libtensorflow Linux GPU Status Temporarily Unavailable Nightly Binary Official GCS
Libtensorflow Windows CPU Status Temporarily Unavailable Nightly Binary Official GCS
Libtensorflow Windows GPU Status Temporarily Unavailable Nightly Binary Official GCS

Resources

Learn more about the TensorFlow community and how to contribute.

Courses

License

Apache License 2.0

custom-op's People

Contributors

bhack, brendanlundy, byvoid, dustindorroh, kdbanman, leimao, mihaimaruseac, prouast, seanpmorgan, vatai, yifeif



custom-op's Issues

CUDA 10.1 support?

I can build and test the custom op in the container, but I can't use it outside of my container. Apparently the CUDA versions don't match.

$ python -c 'import tensorflow as tf; tf.load_op_library("./_time_two_ops.so")' 
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python3.7/site-packages/tensorflow/python/framework/load_library.py", line 61, in load_op_library
    lib_handle = py_tf.TF_LoadLibrary(library_filename)
tensorflow.python.framework.errors_impl.NotFoundError: libcudart.so.10.0: cannot open shared object file: No such file or directory

Will this be fixed? What is the proper course of action in these situations? I have a workaround of sorts (running my code through a container), but I wanted to check: what is the proper response to these errors?
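A quick way to see the mismatch explicitly, assuming a newer TensorFlow (tf.sysconfig.get_build_info only exists in TF 2.3+; on 1.x you can only compare the libcudart version in the error against your installed CUDA):

import tensorflow as tf

# Prints the CUDA/cuDNN versions this TensorFlow wheel was built against;
# a custom op must be built with a matching CUDA toolkit to load outside
# the build container.
info = tf.sysconfig.get_build_info()
print(info.get("cuda_version"), info.get("cudnn_version"))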

Is libtensorflow_framework.so required in this mode of custom op?

Tensorflow: 1.13.1

In tensorflow's .bazelrc file, there is a config setting framework_shared_object, which defaults to true, so that after compiling we get a libtensorflow_framework.so in the wheel package.

However, for certain reasons, I built my tensorflow wheel package with framework_shared_object set to false, so there is no libtensorflow_framework.so in the wheel package, which means I cannot build my custom op according to this repo.

In short, is libtensorflow_framework.so required for this style of custom op? Is there any other way to compile the custom op without this .so file?

@yifeif @vatai Could you please give some suggestions? Thanks a lot!
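One small diagnostic that may help here (a sketch, not an authoritative answer): the flags TensorFlow publishes for custom-op builds reveal whether the installed build expects libtensorflow_framework at link time.

import tensorflow as tf

# With framework_shared_object=true (the default), the link flags reference
# libtensorflow_framework; a build without it will not list that library,
# and this repo's dependency on @local_config_tf//:libtensorflow_framework
# would need to be removed or replaced.
print(tf.sysconfig.get_compile_flags())
print(tf.sysconfig.get_link_flags())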

Seg Fault when evaluating custom GPU op

System information

  • OS Platform and Distribution: Linux Ubuntu 16.04 but using tensorflow/tensorflow:custom-op-ubuntu16 docker image.
  • TensorFlow version (use command below): 1.14.0 (from tensorflow/tensorflow:custom-op-ubuntu16 docker image)
  • Python version: 2.7.12 (from tensorflow/tensorflow:custom-op-ubuntu16 docker image)
  • CUDA/cuDNN version: 10.0 (from tensorflow/tensorflow:custom-op-ubuntu16 docker image)
  • GPU model and memory: GeForce GTX 980M

Describe the problem

I am using the tensorflow/tensorflow:custom-op-ubuntu16 docker image and within the container I took the following steps from the instructions.

  • ran configure.sh

  • built with bazel build tensorflow_time_two:python/ops/_time_two_ops.so, which succeeded

  • Navigated to the directory of the resulting shared object and attempted to load and evaluate the op with

import tensorflow as tf
time_two_module = tf.load_op_library('./_time_two_ops.so')
with tf.Session(''):
  time_two_module.time_two([[1, 2], [3, 4]]).eval()

and I get a seg fault with the following output

2019-08-24 17:27:36.880958: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2019-08-24 17:27:36.894117: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-08-24 17:27:36.894651: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties: 
name: GeForce GTX 980M major: 5 minor: 2 memoryClockRate(GHz): 1.1265
pciBusID: 0000:01:00.0
2019-08-24 17:27:36.894677: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2019-08-24 17:27:36.895645: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2019-08-24 17:27:36.896448: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2019-08-24 17:27:36.896671: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2019-08-24 17:27:36.897767: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2019-08-24 17:27:36.907920: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2019-08-24 17:27:36.911424: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2019-08-24 17:27:36.911531: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-08-24 17:27:36.912194: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-08-24 17:27:36.912787: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2019-08-24 17:27:36.913094: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-08-24 17:27:36.987692: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-08-24 17:27:36.988139: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x60162f0 executing computations on platform CUDA. Devices:
2019-08-24 17:27:36.988156: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): GeForce GTX 980M, Compute Capability 5.2
2019-08-24 17:27:37.008712: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 4008000000 Hz
2019-08-24 17:27:37.008995: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x6088260 executing computations on platform Host. Devices:
2019-08-24 17:27:37.009010: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): <undefined>, <undefined>
2019-08-24 17:27:37.009185: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-08-24 17:27:37.009568: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties: 
name: GeForce GTX 980M major: 5 minor: 2 memoryClockRate(GHz): 1.1265
pciBusID: 0000:01:00.0
2019-08-24 17:27:37.009601: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2019-08-24 17:27:37.009616: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2019-08-24 17:27:37.009630: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2019-08-24 17:27:37.009642: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2019-08-24 17:27:37.009656: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2019-08-24 17:27:37.009667: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2019-08-24 17:27:37.009679: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2019-08-24 17:27:37.009730: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-08-24 17:27:37.010112: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-08-24 17:27:37.010474: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2019-08-24 17:27:37.040655: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-08-24 17:27:37.040677: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187]      0 
2019-08-24 17:27:37.040683: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0:   N 
2019-08-24 17:27:37.040899: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-08-24 17:27:37.041315: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-08-24 17:27:37.041705: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7201 MB memory) -> physical GPU (device: 0, name: GeForce GTX 980M, pci bus id: 0000:01:00.0, compute capability: 5.2)
Segmentation fault (core dumped)
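One way to narrow this down (a debugging sketch, not a fix): pin the op to the CPU kernel; if this runs cleanly while the default placement crashes, the problem is isolated to the CUDA kernel or its launch configuration.

import tensorflow as tf

time_two_module = tf.load_op_library('./_time_two_ops.so')

# Force the CPU kernel; if this succeeds while the default (GPU) placement
# segfaults, the crash is in the CUDA kernel path.
with tf.Session(''):
  with tf.device('/cpu:0'):
    print(time_two_module.time_two([[1, 2], [3, 4]]).eval())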

undefined symbol error on building official example with GPU support with bazel

Hi,
I have problems building the official example with GPU support with bazel in the tensorflow source folder. I use the tf_custom_op_library rule defined in //tensorflow:tensorflow.bzl.
Using nvcc + g++ works fine.
Any ideas to solve the problem? Or should I just use the bazel rules in this repo?

Here is my code:

I put this folder under /tensorflow/tensorflow

cd [path to]/tensorflow
bazel build tensorflow/custom_op:test_op.so
# remember to use the right .so path
# comment out the load_lib call for ./test_op.so, uncomment the ./bazel-bin...... one
python3 ./tensorflow/custom_op/test.py

got errors:

Traceback (most recent call last):
  File "tensorflow/test/test.py", line 2, in <module>
    test_op = tf.load_op_library('bazel-bin/tensorflow/test/test_op.so')
  File "/home/wendyh/.local/lib/python3.5/site-packages/tensorflow/python/framework/load_library.py", line 60, in load_op_library
    lib_handle = py_tf.TF_LoadLibrary(library_filename)
tensorflow.python.framework.errors_impl.NotFoundError: bazel-bin/tensorflow/test/test_op.so: undefined symbol: _ZN10tensorflow14kernel_factory17OpKernelRegistrar12InitInternalEPKNS_9KernelDefEN4absl11string_viewESt10unique_ptrINS0_15OpKernelFactoryESt14default_deleteIS8_EE

If I use bash make.sh and test it (using the right .so path), it outputs the desired result.

Env:
tensorflow master branch
Ubuntu 16.04
Nvidia cuda toolkit 10.0

Thank you.

Error while importing the library


NotFoundError                             Traceback (most recent call last)
<ipython-input> in <module>()
----> 1 import multidim_image_augmentation

/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/load_library.py in load_op_library(library_filename)
     56       RuntimeError: when unable to load the library or get the python wrappers.
     57     """
---> 58     lib_handle = py_tf.TF_LoadLibrary(library_filename)
     59     try:
     60       wrappers = _pywrap_python_op_gen.GetPythonWrappers(

NotFoundError: /content/custom-op/multidim_image_augmentation/python/ops/_augmentation_ops.so: cannot open shared object file: No such file or directory

Contributing rules/guides

I would like to contribute to this repo an example of how to create a CUDA custom op, and then show how to add and use it in serving. Is there anything I should be aware of, contributing rules etc., before I create a PR? Also, is the guide for serving something I should add here or elsewhere?

As a side note, there is currently no documentation on how to add and use cuda ops with serving, so my method may not be the ideal way to do it.

Thanks

Centralize copts within repository

Currently the compiler flags are stored in the BUILD file of each custom op. This becomes difficult to maintain if there are many op directories in a repository. It also makes customized builds specific to the OS and architecture (e.g. conda or Gentoo Linux) difficult.

Seeing as custom-op is used as a template, this may be worth restructuring?

cc @perfinion
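A minimal sketch of one way to centralize the flags, assuming a hypothetical copts.bzl at the repository root that each op's BUILD file loads:

# copts.bzl (hypothetical): one place to define compiler flags, which
# OS/architecture-specific builds can then patch or override centrally.
CUSTOM_OP_COPTS = [
    "-pthread",
    "-std=c++11",
    "-D_GLIBCXX_USE_CXX11_ABI=0",
]

Each op's BUILD file would then use load("//:copts.bzl", "CUSTOM_OP_COPTS") and pass copts = CUSTOM_OP_COPTS to its targets.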

How to get the correct version of protoc when building with nightly or 2.3.0rc0?

I don't know if this issue is related to the version of protocol buffers used in the current existing image tensorflow/tensorflow:2.2.0-custom-op-gpu-ubuntu16 for building custom ops.

I'm trying to build GRPC ops from seed_rl https://github.com/google-research/seed_rl/tree/master/grpc with tf-2.3.0rc0 or tf-nightly. I reported the issue here google-research/seed_rl#14 as well.

I tested two options: upgrade the version of TensorFlow of tensorflow/tensorflow:2.2.0-custom-op-gpu-ubuntu16 from 2.2 to 2.3.0rc0 or use tensorflow/tensorflow:nightly-custom-op-gpu-ubuntu16.

My starting point was the Dockerfile provided to build the ops for tf-2.2:

FROM tensorflow/tensorflow:2.2.0-custom-op-gpu-ubuntu16 as grpc_compile

RUN git clone https://github.com/tensorflow/custom-op.git
WORKDIR custom-op

RUN ./configure.sh

RUN echo '\n\
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")\n\
\n\
http_archive(\n\
    name = "com_github_grpc_grpc",\n\
    urls = [\n\
        "https://github.com/grpc/grpc/archive/ac1c5de1b36da4a1e3d72ca40b0e43f24266121a.tar.gz",\n\
    ],\n\
    strip_prefix = "grpc-ac1c5de1b36da4a1e3d72ca40b0e43f24266121a",\n\
)\n\
\n\
load("@com_github_grpc_grpc//bazel:grpc_deps.bzl", "grpc_deps")\n\
grpc_deps()\n\
load("@com_github_grpc_grpc//bazel:grpc_extra_deps.bzl", "grpc_extra_deps")\n\
grpc_extra_deps()' >> WORKSPACE

ADD grpc/ grpc/
RUN bazel build grpc:ops/grpc.so grpc:service_py_proto --incompatible_remove_legacy_whole_archive=0

ADD . /seed_rl
RUN cp bazel-bin/grpc/ops/grpc.so /seed_rl/grpc/grpc_cc.so
RUN cp bazel-bin/grpc/service_pb2.py /seed_rl/grpc/service_pb2.py
WORKDIR /seed_rl/
RUN pip3 install tensorflow-gpu==2.2.0
RUN PYTHONPATH=/ python3 grpc/python/ops_test.py

Both of my attempts fail with

ERROR: /custom-op/grpc/BUILD.bazel:4:1: C++ compilation of rule '//grpc:ops/grpc.so' failed (Exit 1)
In file included from bazel-out/k8-opt/bin/grpc/service.grpc.pb.h:21:0,
                 from grpc/ops/grpc.cc:23:
bazel-out/k8-opt/bin/grpc/service.pb.h:17:2: error: #error This file was generated by an older version of protoc which is
 #error This file was generated by an older version of protoc which is
  ^~~~~
bazel-out/k8-opt/bin/grpc/service.pb.h:18:2: error: #error incompatible with your Protocol Buffer headers. Please
 #error incompatible with your Protocol Buffer headers. Please
  ^~~~~
bazel-out/k8-opt/bin/grpc/service.pb.h:19:2: error: #error regenerate this file with a newer version of protoc.
 #error regenerate this file with a newer version of protoc.
  ^~~~~

My understanding is that there is an incompatibility between the protoc version used to generate the tensorflow library and the protoc being used during this build. Is that the problem here, and how can we fix it?
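That is the likely cause: in a WORKSPACE file the first definition of a repository name wins, and grpc_deps() pulls in its own com_google_protobuf unless one is already declared. A sketch of the usual workaround, declaring protobuf before grpc_deps() (the 3.9.2 version is an assumption; match it to the protobuf version in your TensorFlow release's tensorflow/workspace.bzl):

# Declare protobuf *before* grpc_deps() so the whole build resolves to one
# version. Fill in the sha256 for whichever release matches your TF build.
http_archive(
    name = "com_google_protobuf",
    sha256 = "...",
    strip_prefix = "protobuf-3.9.2",
    urls = ["https://github.com/protocolbuffers/protobuf/archive/v3.9.2.tar.gz"],
)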

Create Op and perform mixed mode debugging with Visual Studio

Hey,

First of all, thank you for this great tutorial page; I was able to set up everything to build my own shared library under Windows. To develop it further, I want debugging capabilities. But if I build the .so (or .dll) file with the following command:

cd custom-op && bazel build --strip=never --compilation_mode dbg -c fastbuild tensorflow_my_MaxPool:python/ops/my_MaxPool_ops.so --verbose_failures

I can create a library file and the .pdb file needed for debugging, but if I then load this .so file in my Python script with
max_pooling_ops = tf.load_library('my_MaxPool_ops.so')
and start mixed mode debugging in Visual Studio, I get the error:

Exception thrown at 0x000007FEBBCBB108 (my_MaxPool_ops.so) in python.exe: 0xC0000005: Access violation reading location 0xFFFFFFFFFFFFFFFF.

Is there any trick to creating a working shared library with debugging capabilities under Windows?
Using the 'normal' build steps works fine; I can easily load the library into Python with the above-mentioned command (the path is the same, of course).

I would be very happy to receive an answer to my problem :)

Greetings

Errors creating pip package on windows

Hi, I've been experimenting with this on Windows 10 (TF version 1.15.2, Python 3.7, VS 2017, CUDA 10.0) and I'm stuck with creating the pip package.

The instructions say that 2 commands should suffice:

bazel build :build_pip_pkg

followed by

bazel-bin/build_pip_pkg artifacts

The first command runs through without problems. After one small change (a compilation flag for VS 2017), the packages are built.
For the second command I have to run "bazel-bin/build_pip_pkg" artifacts for it to run, but either way I then face this error:

cp: cannot stat 'bazel-bin/build_pip_pkg.runfiles/main/setup.py': No such file or directory

There's no build_pip_pkg.runfiles/__main__ folder created at all.

The folder bazel-bin looks like this:

.
├── build_pip_pkg.exe.runfiles
├── external
├── tensorflow_time_two
├── tensorflow_zero_out
├── build_pip_pkg
├── build_pip_pkg.exe
└── build_pip_pkg.exe.runfiles_manifest

It seems like a problem with the bazel version, but I've tried several versions of bazel (0.24.1, 0.25.2, 0.26.1, 1.2.1, 2.1.0) and none works.

Weirdly enough, on the Linux ubuntu16 docker image there's no such problem (bazel 1.2.1, tf-1.15.2).

tensorflow serving with custom-op

What is the cleanest way to get a custom-op working with tensorflow serving? I have a working custom operation, but I'm having trouble getting tensorflow serving to load it.

Thank you!

I am getting "undefined symbol: _ZN10tensorflow8OpKernel11TraceStringEPNS_15OpKernelContextEb" error when trying to build the pip package

I followed the steps as mentioned in the README,

docker run -it tensorflow/tensorflow:2.1.0-custom-op-gpu-ubuntu16 /bin/bash
git clone https://github.com/tensorflow/custom-op.git
cd custom-op
./configure.sh
bazel build build_pip_pkg
bazel-bin/build_pip_pkg artifacts
pip install artifacts/*.whl
cd ..
python -c "import tensorflow as tf;import tensorflow_zero_out;print(tensorflow_zero_out.zero_out([[1,2], [3,4]]))"

I get the following error when I try to import the installed package.

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/tensorflow_zero_out/__init__.py", line 19, in <module>
    from tensorflow_zero_out.python.ops.zero_out_ops import zero_out
  File "/usr/local/lib/python2.7/dist-packages/tensorflow_zero_out/python/ops/zero_out_ops.py", line 25, in <module>
    resource_loader.get_path_to_datafile('_zero_out_ops.so'))
  File "/usr/local/lib/python2.7/dist-packages/tensorflow_core/python/framework/load_library.py", line 57, in load_op_library
    lib_handle = py_tf.TF_LoadLibrary(library_filename)
tensorflow.python.framework.errors_impl.NotFoundError: /usr/local/lib/python2.7/dist-packages/tensorflow_zero_out/python/ops/_zero_out_ops.so: undefined symbol: _ZN10tensorflow8OpKernel11TraceStringEPNS_15OpKernelContextEb

I tried to build with the Makefile as well but I am facing the same issue.
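This symbol exists only in some TensorFlow versions, so the usual cause is that the .so was compiled against different TensorFlow headers than the runtime loading it. A consistency check to run in both the build container and the runtime environment (a sketch, not a fix):

import tensorflow as tf

# The version and the library location should match exactly between the
# environment that built _zero_out_ops.so and the one importing it.
print(tf.__version__)
print(tf.sysconfig.get_lib())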

GPU Kernel not executed

The time_two_ops kernel seems to compile (with warnings), but the GPU kernel does not appear to be executed.

To reproduce the symptoms, change the multiplier from 2 to 3, i.e. in the file tensorflow_time_two/cc/kernels/time_two_kernels.cu.cc, line 33, change out[i] = 2 * ldg(in + i); to out[i] = 3 * ldg(in + i);. Compile and run the time_two_ops test expecting a failure (because the test expects a times-2 multiplier); the tests pass anyway because the (unaltered) CPU kernel is executed.

Everything is run in the Docker container, and I verified that CUDA/TensorFlow sees the GPU:

from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())

indicates that the GPUs are detected from the container.

As far as I can tell, I'm following the guide/readme to the letter.
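One way to make the silent fallback visible (a debugging sketch using the TF1 API from this setup): disable soft placement so TensorFlow raises an error instead of quietly running the CPU kernel, and log device placement to see which kernel actually executed.

import tensorflow as tf

time_two_module = tf.load_op_library('./_time_two_ops.so')

# allow_soft_placement=False forbids the silent CPU fallback;
# log_device_placement prints the device each op actually ran on.
config = tf.ConfigProto(allow_soft_placement=False, log_device_placement=True)
with tf.Session(config=config) as sess:
  with tf.device('/gpu:0'):
    print(sess.run(time_two_module.time_two([[1, 2], [3, 4]])))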

tensorflow/tensorflow:custom-op-gpu-ubuntu16 only installed tensorflow-cpu version

I tried docker pull tensorflow/tensorflow:custom-op-gpu-ubuntu16 and ran it.
Then I tried ./configure.sh.
However, it detected that no tensorflow-gpu was installed in this container, and had to uninstall the already installed tensorflow and install a tensorflow-gpu version. Shouldn't your docker image already have tensorflow-gpu installed?

Does this guide also apply to TF2.0?

It's not clear from the README if the guide, Docker image, etc. are set up for TF1, TF2, or both. Ideally both would be supported, but I'd settle for explicit documentation as to what is (and is not) supported at present.

Compiling TF serve with my new op

Compiling tensorflow-serving with a custom op

First of all, thanks so much for the awesome examples of custom ops. They've been extremely helpful. I'm wondering whether you have any working examples of how to include the custom ops built with this format in TensorFlow serving?

What I've tried

I built a library of custom ops, call it custom_ops. At a high level, here's the structure:

custom_ops
|_ custom_ops/
|   |_first_op/
|   |   |_cc/
|   |   |   |_ ...
|   |   |_python/
|   |   |   |_ ...
|   |_second_op/
|   |   |_cc/
|   |   |   |_ ...
|   |   |_python/
|   |   |   |_ ...
|   |_BUILD
|_ third_party/
|   |_ ...
|_ tf/
|   |_BUILD
|   |_BUILD.tpl
|   |_tf_configure.bzl
|_ BUILD
|_ WORKSPACE

FWIW, I'm able to build the entire project, load the python bindings, and all unit tests for all ops pass. So now I want to get these ops into TensorFlow serving...

Here's the cc_library target that builds the shared custom ops library:

# /custom_ops/custom_ops/BUILD
cc_library(
  name="custom_ops_lib",
  srcs=[
    "first_op/cc/fo_kernel.cc",
    "first_op/cc/fo_op.cc",
    "second_op/cc/so_kernel.cc",
    "second_op/cc/so_op.cc",
  ],
  deps=[
    "@local_config_tf//:libtensorflow_framework",
    "@local_config_tf//:tf_header_lib",
  ],
  alwayslink=1,
  includes=[
    "custom_ops",
    "third_party",
  ],
  copts=[
    "-fPIC",
    "-pthread",
    "-std=c++11",
    "-D_GLIBCXX_USE_CXX11_ABI=0",
    "-ggdb",
  ],
  visibility=["//visibility:public"],
)

The structure of my file system:

...
/custom_ops
/tensorflow-serving
...

Inside of tensorflow-serving/tensorflow_serving/model_servers/BUILD, I've added my op:

SUPPORTED_TENSORFLOW_OPS = [
    "@org_tensorflow//tensorflow/contrib:contrib_kernels",
    "@org_tensorflow//tensorflow/contrib:contrib_ops_op_lib",
    "@my_custom_ops//custom_ops:custom_ops_lib",
]

I've added this to the tensorflow-serving/WORKSPACE file (after copying custom_ops/tf into tensorflow-serving—this part I am not sure about):

load("//tf:tf_configure.bzl", "tf_configure")
tf_configure(name = "local_config_tf")

local_repository(
    name = "my_custom_ops",
    path = "/custom-ops"
)

... and this into the tensorflow-serving/.bazelrc file:

build --workspace_status_command=/proc/self/cwd/tools/gen_status_stamp.sh
# A reduced version of the custom_ops bazelrc.
build --action_env TF_HEADER_DIR="/usr/local/lib/python3.6/dist-packages/tensorflow/include"
build --action_env TF_SHARED_LIBRARY_DIR="/usr/local/lib/python3.6/dist-packages/tensorflow"
build --action_env TF_SHARED_LIBRARY_NAME="libtensorflow_framework.so"
build --action_env TF_NEED_CUDA="0"

Compiling

Here's how I compile tf serve, which succeeds:

$ bazel build --color=yes --curses=yes \
    ${TF_SERVING_BAZEL_OPTIONS} \
    --verbose_failures \
    --output_filter=DONT_MATCH_ANYTHING \
    --config=mkl --copt=-march=skylake-avx512 --copt=-O3 \
    tensorflow_serving/model_servers:tensorflow_model_server && \
    cp bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server \
    /usr/local/bin/

Running (and stack trace)

When I run TensorFlow serving (after updating LIBRARY_PATH and LD_LIBRARY_PATH), here's what I get:

$ tensorflow_model_server --model_base_path="/path-to-servable" --model_name="my-model" --enable_batching
2019-05-29 21:19:22.564025: F external/org_tensorflow/tensorflow/stream_executor/host/host_platform.cc:94] Non-OK-status: MultiPlatformManager::RegisterPlatform(std::move(platform)) status: Internal: platform is already registered with name: "Host"
Aborted (core dumped)

It would seem from rooting around that what's happening is the libtensorflow_framework.so dependency is being loaded twice (I'm guessing due to the custom_ops dependency on it?)

This is where I am out of my depth in debugging things and I cannot seem to find any end-to-end example on this anywhere. Do you have an example of how you'd integrate the custom ops in this lib (on which mine were modeled) into TF serve?
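Not a definitive diagnosis, but the duplicate "Host" platform registration is consistent with two copies of the TensorFlow runtime in one process: the model server already links one via @org_tensorflow, and @local_config_tf//:libtensorflow_framework pulls in a second from the pip package. A sketch of the usual approach when linking ops directly into the server binary, depending on serving's own TensorFlow instead (target names here are assumptions to adapt):

# custom_ops/BUILD sketch for the tensorflow-serving build: drop the
# @local_config_tf deps so only serving's TensorFlow runtime is linked.
cc_library(
    name = "custom_ops_serving_lib",
    srcs = [
        "first_op/cc/fo_kernel.cc",
        "first_op/cc/fo_op.cc",
        "second_op/cc/so_kernel.cc",
        "second_op/cc/so_op.cc",
    ],
    deps = ["@org_tensorflow//tensorflow/core:framework"],
    alwayslink = 1,
    visibility = ["//visibility:public"],
)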

Model accuracy problem for inference only with custom-op

Hello~

Thank you for your guide to make custom-op.

Following your guide, I was able to make a custom op, and I added that operation as a layer substituting for a conv layer.

I built the pip package of my custom op with the Makefile.

on ubuntu 16.04
tensorflow ver. 1.15.3
gcc 7.5
python 3.7
cuda 10.0

When I add this custom op as a layer and build the model,

the training accuracy is similar to the original paper,

but the inference accuracy is under 30%.

I use tensorpack.

At first I thought it was a version restriction,

since the InferenceGraph didn't include the new custom op.

However, by using tf.disable_eager_execution() I can verify that the new custom op is also included in the InferenceGraph.

But the inference accuracy is still under 30%.

Is there any guidance for solving this problem?

Thank you in advance~

fatal: Remote branch test not found in upstream origin

Hi.
Inside the Docker container, I tried to clone this repository by running the following command:

git clone -b test https://github.com/tensorflow/custom_op.git

I get "fatal: Remote branch test not found in upstream origin"

The full error message is:

Cloning into 'custom_op'...
fatal: Remote branch test not found in upstream origin
Unexpected end of command stream

Building a custom-op on Windows

A heavily requested feature for tensorflow/addons is that we package a windows version:
tensorflow/addons#173

AFAIK to build this we'll need a libtensorflow_framework.dll of some sort within the pip installed package. Looking at the link flags & installed directory it looks like this is not available.

python -c "import tensorflow as tf; print(tf.version.VERSION)"
2.0.0-beta1

python -c "import tensorflow as tf; print(tf.sysconfig.get_link_flags())"
[]

Is this by design or has it just not been implemented yet? There is a _pywrap_tensorflow_internal.pyd but I don't believe we've been linking to the python shared objects so I figured I was headed down the wrong path.

cc @meteorcloudy @yifeif @perfinion

FileNotFoundError: [Errno 2] No such file or directory: '/dt7/usr/bin/gcc'

I am using:
gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.12)
Cuda compilation tools, release 10.0, V10.0.130
>>> print(tf.__version__)
1.15.2

I got the following error when running "bazel build build_pip_pkg --verbose_failures":
++++++++++++++++++++++++++++++++++++++++++++++++++++++
WARNING: /root/.cache/bazel/_bazel_root/8538317fd780b9092f73dd9691ce04d0/external/local_config_tf/BUILD:7167:1: target 'libtensorflow_framework.so' is both a rule and a file; please choose another name for the rule
INFO: Analysed target //:build_pip_pkg (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
ERROR: /root/ssy/custom-op/tensorflow_zero_out/BUILD:10:1: C++ compilation of rule '//tensorflow_zero_out:python/ops/_zero_out_ops.so' failed (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing command
  (cd /root/.cache/bazel/_bazel_root/8538317fd780b9092f73dd9691ce04d0/execroot/__main__ && \
  exec env - \
    CUDA_TOOLKIT_PATH=/usr/local/cuda \
    CUDNN_INSTALL_PATH=/usr/lib/x86_64-linux-gnu \
    LD_LIBRARY_PATH=/usr/local/nvidia/lib:/usr/local/nvidia/lib64 \
    PATH=/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin \
    PWD=/proc/self/cwd \
    TF_CUDA_VERSION=10.0 \
    TF_CUDNN_VERSION=7 \
    TF_HEADER_DIR=/usr/local/lib/python3.5/dist-packages/tensorflow_core/include \
    TF_NEED_CUDA=1 \
    TF_SHARED_LIBRARY_DIR=/usr/local/lib/python3.5/dist-packages/tensorflow_core \
    TF_SHARED_LIBRARY_NAME=libtensorflow_framework.so.1 \
  third_party/toolchains/preconfig/ubuntu16.04/gcc7_manylinux2010-nvcc-cuda10.0/clang/bin/crosstool_wrapper_driver_is_not_gcc -MD -MF bazel-out/k8-opt/bin/tensorflow_zero_out/_objs/python/ops/_zero_out_ops.so/zero_out_ops.pic.d '-frandom-seed=bazel-out/k8-opt/bin/tensorflow_zero_out/_objs/python/ops/_zero_out_ops.so/zero_out_ops.pic.o' -iquote . -iquote bazel-out/k8-opt/genfiles -iquote bazel-out/k8-opt/bin -iquote external/local_config_tf -iquote bazel-out/k8-opt/genfiles/external/local_config_tf -iquote bazel-out/k8-opt/bin/external/local_config_tf -iquote external/bazel_tools -iquote bazel-out/k8-opt/genfiles/external/bazel_tools -iquote bazel-out/k8-opt/bin/external/bazel_tools -isystem external/local_config_tf/include -isystem bazel-out/k8-opt/genfiles/external/local_config_tf/include -isystem bazel-out/k8-opt/bin/external/local_config_tf/include '-std=c++11' -Wno-builtin-macro-redefined '-D__DATE__="redacted"' '-D__TIMESTAMP__="redacted"' '-D__TIME__="redacted"' -fPIC -U_FORTIFY_SOURCE '-D_FORTIFY_SOURCE=1' -fstack-protector -Wall -fno-omit-frame-pointer -no-canonical-prefixes -fno-canonical-system-headers -DNDEBUG -g0 -O2 -ffunction-sections -fdata-sections -pthread '-std=c++11' '-D_GLIBCXX_USE_CXX11_ABI=0' -c tensorflow_zero_out/cc/ops/zero_out_ops.cc -o bazel-out/k8-opt/bin/tensorflow_zero_out/_objs/python/ops/_zero_out_ops.so/zero_out_ops.pic.o)
Execution platform: @bazel_tools//platforms:host_platform
Traceback (most recent call last):
  File "third_party/toolchains/preconfig/ubuntu16.04/gcc7_manylinux2010-nvcc-cuda10.0/clang/bin/crosstool_wrapper_driver_is_not_gcc", line 267, in <module>
    sys.exit(main())
  File "third_party/toolchains/preconfig/ubuntu16.04/gcc7_manylinux2010-nvcc-cuda10.0/clang/bin/crosstool_wrapper_driver_is_not_gcc", line 264, in main
    return subprocess.call([CPU_COMPILER] + cpu_compiler_flags)
  File "/usr/lib/python3.5/subprocess.py", line 557, in call
    with Popen(*popenargs, **kwargs) as p:
  File "/usr/lib/python3.5/subprocess.py", line 947, in __init__
    restore_signals, start_new_session)
  File "/usr/lib/python3.5/subprocess.py", line 1551, in _execute_child
    raise child_exception_type(errno_num, err_msg)
FileNotFoundError: [Errno 2] No such file or directory: '/dt7/usr/bin/gcc'
Target //:build_pip_pkg failed to build
INFO: Elapsed time: 0.466s, Critical Path: 0.18s

++++++++++++++++++++++++++++++++++++++++++++++++++++++

tensorflow-serving with custom-op: when starting tensorflow_model_server, got "Op type not registered 'ActiveShift2DOp' in binary running on server"

Hello

My TensorFlow version is 1.15.3.

I built the custom op as described in the guide.

I made a shift_op custom op, packaged it as a wheel, and installed it,

using the package target in the Makefile:

active_shift_pip_pkg: $(ACTIVE_SHIFT_TARGET_LIB)
	./build_pip_pkg.sh make artifacts

Using the wheel file under the artifacts directory, I can use the custom op by calling:

from tensorflow_active_shift.python.ops import active_shift2d_ops

After I built the model using tensorpack with my new custom op (shift), I got a checkpoint and a meta file, and by creating the following generation .py file I can get a .pb file:
import tensorflow as tf
from tensorflow_active_shift.python.ops import active_shift2d_ops
from tensorflow.python.saved_model import builder as saved_model_builder
from tensorflow.python.saved_model import signature_constants, signature_def_utils, tag_constants, utils
from tensorflow.python.util import compat
import os

tf.contrib.resampler

config = tf.ConfigProto(allow_soft_placement=True)
sess = tf.Session(config=config)

saver = tf.train.import_meta_graph('./train_log/cifar10-Shift20/graph-0629-155507.meta')
saver.restore(sess, tf.train.latest_checkpoint('./train_log/cifar10-Shift20/'))

graph = tf.get_default_graph()
model_signature = tf.saved_model.signature_def_utils.predict_signature_def(
    inputs={'image': graph.get_tensor_by_name('input:0')},
    outputs={'label': graph.get_tensor_by_name('label:0')})
# print(inputs, outputs)

model_path = "./model"
model_version = 1
export_path = os.path.join(compat.as_bytes(model_path), compat.as_bytes(str(model_version)))
builder = saved_model_builder.SavedModelBuilder(export_path)
builder.add_meta_graph_and_variables(
    sess, [tag_constants.SERVING],
    clear_devices=True,
    signature_def_map={
        signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: model_signature,
    })
builder.save()

I run
tensorflow_model_server --port=8501 --rest_api_port=8502 --model_base_path="/path /model/"

but I get the following error:

SavedModel load for tags { serve }; Status: fail: Not found: Op type not registered 'ActiveShift2DOp' in binary running on grace-Alienware-Area-51-R5. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) tf.contrib.resampler should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed..

How can I solve this issue?

Because of a version problem, I can't add tf.contrib.resampler to the tensorflow/tensorpack code.
When I just use tf.disable_eager_execution(), I can check that the InferenceGraph also includes the new op, as follows:

for op in graph.get_operations(): print(op.name)
...
InferenceTower/res3.1/shift1/ActiveShift2DOp/ActiveShift2DOp/shift
....

How can this problem be solved?

Thank you in advance~

Has anyone successfully compiled and tested "time_two_op" on the "2.1.0-custom-op-gpu-ubuntu16" image?

My environment is Ubuntu 18.04. (Do I need to provide anything else about the environment?)
I tried for a long time without successfully completing the test of time_two_op.
The initial problem was that nvcc couldn't handle the -fPIC option: the Makefile puts -fPIC in CFLAGS, which is passed to nvcc without a preceding -Xcompiler, so nvcc rejects it.
Later I encountered some more problems; I forget exactly what changes I made, but finally I compiled the .so library.
The following are my compilation commands and output:

root@c2f387c31644:/working_dir/my_op# make time_two_gpu_only
2020-05-09 03:33:32.838500: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer.so.6
2020-05-09 03:33:32.839752: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer_plugin.so.6
2020-05-09 03:33:34.059140: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer.so.6
2020-05-09 03:33:34.060345: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer_plugin.so.6
nvcc -std=c++11 -c -o tensorflow_time_two/python/ops/_time_two_ops.cu.o tensorflow_time_two/cc/kernels/time_two_kernels.cu.cc -I/usr/local/lib/python2.7/dist-packages/tensorflow_core/include -D_GLIBCXX_USE_CXX11_ABI=0 -O2 -std=c++11 -D_GLIBCXX_USE_CXX11_ABI=0 -L/usr/local/lib/python2.7/dist-packages/tensorflow_core -l:libtensorflow_framework.so.2 -x cu -Xcompiler -fPIC -DNDEBUG --expt-relaxed-constexpr

root@c2f387c31644:/working_dir/my_op# make time_two_op
2020-05-09 03:34:06.279300: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer.so.6
2020-05-09 03:34:06.280573: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer_plugin.so.6
2020-05-09 03:34:07.469400: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer.so.6
2020-05-09 03:34:07.470612: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer_plugin.so.6
g++ -I/usr/local/lib/python2.7/dist-packages/tensorflow_core/include -D_GLIBCXX_USE_CXX11_ABI=0 -O2 -std=c++11 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -o tensorflow_time_two/python/ops/_time_two_ops.so tensorflow_time_two/cc/kernels/time_two_kernels.cc tensorflow_time_two/cc/kernels/time_two.h tensorflow_time_two/cc/ops/time_two_ops.cc tensorflow_time_two/python/ops/_time_two_ops.cu.o -shared -L/usr/local/lib/python2.7/dist-packages/tensorflow_core -l:libtensorflow_framework.so.2 -D GOOGLE_CUDA=1 -I/usr/local/cuda/targets/x86_64-linux/include -L/usr/local/cuda-10.1/targets/x86_64-linux/lib -lcudart

root@c2f387c31644:/working_dir/my_op# make time_two_test
2020-05-09 03:34:26.682548: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer.so.6
2020-05-09 03:34:26.683766: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer_plugin.so.6
2020-05-09 03:34:27.891841: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer.so.6
2020-05-09 03:34:27.893596: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer_plugin.so.6
python tensorflow_time_two/python/ops/time_two_ops_test.py
2020-05-09 03:34:29.092042: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer.so.6
2020-05-09 03:34:29.093241: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer_plugin.so.6
Traceback (most recent call last):
File "tensorflow_time_two/python/ops/time_two_ops_test.py", line 28, in
import time_two_ops
File "/working_dir/my_op/tensorflow_time_two/python/ops/time_two_ops.py", line 25, in
resource_loader.get_path_to_datafile('_time_two_ops.so'))
File "/usr/local/lib/python2.7/dist-packages/tensorflow_core/python/framework/load_library.py", line 57, in load_op_library
lib_handle = py_tf.TF_LoadLibrary(library_filename)
tensorflow.python.framework.errors_impl.NotFoundError: /working_dir/my_op/tensorflow_time_two/python/ops/_time_two_ops.so: undefined symbol: _ZN10tensorflow7functor14TimeTwoFunctorIN5Eigen9GpuDeviceEiEclERKS3_iPKiPi
Makefile:42: recipe for target 'time_two_test' failed
make: *** [time_two_test] Error 1

As you can see, an undefined symbol turns up when the .so is finally loaded.
Later, I tried to write my own GPU op, and the same "undefined symbol" error appeared when loading its .so library. The zero_out op example works normally.
The problem occurs whether or not I start the image with --runtime=nvidia --privileged. I don't know how to solve it; can someone help me?

Unable to build pip package for python 3.7 using docker container

If I install python 3.7 using miniconda on the custom op docker container and then run configure.sh I get the following error:

ImportError: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by /root/miniconda3/lib/python3.7/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so)

Is python 3.7 not supported in this docker container?

Thanks!

Some questions I encountered when building this example in my own environment, and my solution

My environment:

  • Ubuntu 16.04
  • Gcc 4.8
  • Cuda 9.0
  • Python 3.6 (anaconda)
  • Tensorflow-gpu 1.12

I am a beginner with TensorFlow and would like to try customizing an operator. I was fortunate to find this project, and I am very grateful for the author's work.
First, the compilation part:
I ran configure.sh under the "custom-op" folder to configure it. From the printed information, there is no problem:

Do you want to build ops again TensorFlow CPU pip package? Y or enter for CPU (tensorflow), N for GPU (tensorflow-gpu). [Y/n] n
Build with GPU pip package.
Using installed tensorflow-gpu

After I executed the "make time_two_op" command, I got an error:
nvcc fatal : Unknown option 'fPIC'
I added a new variable:
CFLAGS_G = ${TF_CFLAGS} -O2 -std=c++11
and replaced the build rule with:

$(TIME_TWO_GPU_ONLY_TARGET_LIB): tensorflow_time_two/cc/kernels/time_two_kernels.cu.cc
$(NVCC) -c -o $@ $^ $(CFLAGS_G) $(TF_LFLAGS) -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC -DNDEBUG --expt-relaxed-constexpr

I don't know what problems this might cause, but it does resolve the error. If anything is wrong, please let me know.
The second time "make time_two_op" was executed, a header file "cuda_runtime.h", included by the Tensor header under "eigen3", was not found. You only need to modify the include path to point to the file, e.g. cuda/include/cuda_runtime.h.
The third time "make time_two_op" was executed, the cudart library was not found during the g++ step; this is caused by a path error. Just modify the two paths before "-lcudart" in "TIME_TWO_TARGET_LIB"; the include and lib64 directories under the CUDA folder can be used.
The fourth time I executed "make time_two_op", everything seemed to go through: the nvcc part still prints a bunch of warnings, while the g++ part has no warnings at all, which makes me wonder whether it executed correctly.
Then there is the test part:
The first time "make time_two_test" was executed, I got:
Module 'tensorflow.python.framework.test_util' has no attribute 'run_gpu_only'
I simply deleted "@test_util.run_gpu_only".
The second time, "make time_two_test" did not show an error message, but it did not print any computation output either, which made it difficult for me to judge whether it executed correctly. The following is what was printed when I ran it. Does this indicate a successful run?
I look forward to your reply, thank you

python tensorflow_time_two/python/ops/time_two_ops_test.py
2019-07-19 16:48:34.603115: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-07-19 16:48:34.730713: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:964] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-19 16:48:34.731039: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties:
Name: GeForce GTX 1050 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.493
pciBusID: 0000:01:00.0
totalMemory: 3.94GiB freeMemory: 1.46GiB
2019-07-19 16:48:34.731055: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2019-07-19 16:48:36.084652: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-07-19 16:48:36.084740: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0
2019-07-19 16:48:36.084770: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N
2019-07-19 16:48:36.085739: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1209 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
..
----------------------------------------------------------------------
Ran 2 tests in 1.568s

OK

Building a GPU Custom Operation on Windows 10

Please make sure that this is a build/installation issue. As per our GitHub Policy, we only address code/doc bugs, performance issues, feature requests and build/installation issues on GitHub. tag:build_template

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Microsoft Windows 10
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: N/A
  • TensorFlow installed from (source or binary): Source
  • TensorFlow version: 1.12.3
  • Python version: 3.5.0
  • Installed using virtualenv? pip? conda?: conda
  • Bazel version (if compiling from source): 0.17.2
  • GCC/Compiler version (if compiling from source): 7.4.0
  • CUDA/cuDNN version: 9
  • GPU model and memory: NVIDIA Quadro P100

Describe the problem

Hello, I'm very new to the TensorFlow package. I'm looking into creating a gradient function for complex numbers for an FFT custom operator.

I've been trying to get a custom operation working with a Windows version of TensorFlow GPU version from the guide: https://www.tensorflow.org/guide/extend/op#compiling_the_kernel_for_the_gpu_device

Provide the exact sequence of commands / steps that you executed before running into the problem

  1. Clone the repository:
git clone https://github.com/tensorflow/tensorflow.git 

I put my C++ implementation of sqrt_kernels.cc, sqrt.h, and sqrt_kernels.cu.cc in tensorflow\tensorflow\core\user_ops

sqrt_kernels.cc

#if GOOGLE_CUDA
#define EIGEN_USE_GPU
#endif

#include "sqrt.h"
#include <math.h>
#include "tensorflow/core/framework/op_kernel.h"

namespace tensorflow {
namespace functor {

using CPUDevice = Eigen::ThreadPoolDevice;
using GPUDevice = Eigen::GpuDevice;

// CPU specialization of actual computation
template <typename T>
struct SqrtFunctor<CPUDevice, T>{

    void operator()(const CPUDevice& d, int size, const T* in, T* out){
        for(int i = 0; i < size; ++i){
            out[i] = sqrt(in[i]);
        }
    }
};

// OpKernel definition.
// Template parameter T is the datatype of the tensors.

template <typename Device, typename T>
class SqrtOp : public OpKernel {
    public:
     explicit SqrtOp(OpKernelConstruction* context) : OpKernel(context) {}

     void Compute(OpKernelContext* context) override {

         const Tensor& input_tensor = context->input(0);

         Tensor* output_tensor = NULL;

         OP_REQUIRES_OK(context, context->allocate_output(0, input_tensor.shape(), &output_tensor));

         OP_REQUIRES(context, input_tensor.NumElements() <= tensorflow::kint32max, errors::InvalidArgument("Too many elements in tensor"));

         // Invoke the device-appropriate functor.
         SqrtFunctor<Device, T>()(
             context->eigen_device<Device>(),
             static_cast<int>(input_tensor.NumElements()),
             input_tensor.flat<T>().data(),
             output_tensor->flat<T>().data());
     }

};

//Register the Kernels
#define REGISTER_CPU(T)                     \
 REGISTER_KERNEL_BUILDER(                   \
        Name("Sqrt").Device(DEVICE_CPU).TypeConstraint<T>("T"), \
        SqrtOp<CPUDevice, T>);
 REGISTER_CPU(float);
 REGISTER_CPU(int32);

#ifdef GOOGLE_CUDA
#define REGISTER_GPU(T)                                         \
 extern template struct SqrtFunctor<GPUDevice, T>;              \
 REGISTER_KERNEL_BUILDER(                                       \
        Name("Sqrt").Device(DEVICE_GPU).TypeConstraint<T>("T"), \
        SqrtOp<GPUDevice, T>);
REGISTER_GPU(float);
REGISTER_GPU(int32);
#endif

}  // namespace functor
}  // namespace tensorflow

sqrt_kernels.cu.cc

#if GOOGLE_CUDA

#define EIGEN_USE_GPU

#include "sqrt.h"
#include "tensorflow/core/util/cuda_kernel_helper.h"


namespace tensorflow {
namespace functor {

using GPUDevice = Eigen::GpuDevice;

template <typename T>

__global__ void SqrtCudaKernel(const int size, const T* in, T* out){

    for(int i = blockIdx.x * blockDim.x + threadIdx.x; i < size; i += blockDim.x * gridDim.x){

        out[i] = sqrt(ldg(in + i));
    }
}

template <typename T>
struct SqrtFunctor<GPUDevice, T> {
    void operator()(const GPUDevice& d, int size, const T* in, T* out){

        int block_count = 1024;
        int thread_per_block = 20;

        SqrtCudaKernel<T> <<<block_count, thread_per_block, 0, d.stream()>>>(size,in,out);

    }
};

template struct SqrtFunctor<GPUDevice, float>;
template struct SqrtFunctor<GPUDevice, int32>;
}  // namespace functor
}  // namespace tensorflow

#endif

sqrt.h

#ifndef SQRT_H_
#define SQRT_H_

namespace tensorflow {

namespace functor{


template <typename Device, typename T>

struct SqrtFunctor{
    void operator()(const Device& d, int size, const T* in, T* out);
};

#if GOOGLE_CUDA

// Partial specialization for the GPU device; defined in sqrt_kernels.cu.cc.
template <typename T>
struct SqrtFunctor<Eigen::GpuDevice, T>{
    void operator()(const Eigen::GpuDevice& d, int size, const T* in, T* out);
};
#endif

}

}

#endif
  2. Afterwards I configured and built the tensorflow library using bazel:
python ./configure.py
  3. This is where I get my error after running this command:
nvcc -std=c++11 -c -o cuda_op_kernel.cu.o cuda_op_kernel.cu.cc \
  ${TF_CFLAGS[@]} -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC
nvcc warning : The -std=c++11 flag is not supported with the configured host compiler. Flag will be ignored.
sqrt_kernels.cu.cc
cl : Command line warning D9002 : ignoring unknown option '-fPIC'
sqrt_kernels.cu.cc
C://Users//alayu//AppData//Local//Programs//Python//Python35//lib//site-packages//tensorflow//include\third_party/eigen3/unsupported/Eigen/CXX11/Tensor(1): fatal error C1083: Cannot open include file: 'unsupported/Eigen/CXX11/Tensor': No such file or directory

I found the file "unsupported/Eigen/CXX11/Tensor" in the path where it said it couldn't be found, so I don't know what the error could be. I also don't know why nvcc is not recognizing the -std=c++11 flag. Any help would be appreciated.


Tests failed in 1.12.0-devel-py3 docker

It seems that resource_loader.get_path_to_datafile is returning the wrong directory. I did not try any other way to build.

15:41 $ docker run -it -v ~/git/custom-op:/home/custom-op tensorflow/tensorflow:1.12.0-devel-py3 /bin/bash
root@52727958a85b:~# ls
root@52727958a85b:~# cd /home/custom-op/
root@52727958a85b:/home/custom-op# ls
BUILD  LICENSE  MANIFEST.in  Makefile  README.md  WORKSPACE  build_pip_pkg.sh  configure.sh  setup.py  tensorflow_zero_out  tf
root@52727958a85b:/home/custom-op# ./configure.sh
rm: cannot remove '.bazelrc': No such file or directory
using installed tensorflow
root@52727958a85b:/home/custom-op# bazel test ...
Extracting Bazel installation...
WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
WARNING: /root/.cache/bazel/_bazel_root/f59710f949de61c873019b8044b9e9cf/external/local_config_tf/BUILD:3506:1: target 'libtensorflow_framework.so' is both a rule and a file; please choose another name for the rule
WARNING: /root/.cache/bazel/_bazel_root/f59710f949de61c873019b8044b9e9cf/external/local_config_tf/BUILD:5:12: in hdrs attribute of cc_library rule @local_config_tf//:tf_header_lib: file '_api_implementation.so' from target '@local_config_tf//:tf_header_include' is not allowed in hdrs
WARNING: /root/.cache/bazel/_bazel_root/f59710f949de61c873019b8044b9e9cf/external/local_config_tf/BUILD:5:12: in hdrs attribute of cc_library rule @local_config_tf//:tf_header_lib: file '_message.so' from target '@local_config_tf//:tf_header_include' is not allowed in hdrs
INFO: Analysed 5 targets (22 packages loaded).
INFO: Found 4 targets and 1 test target...
FAIL: //tensorflow_zero_out:zero_out_ops_py_test (see /root/.cache/bazel/_bazel_root/f59710f949de61c873019b8044b9e9cf/execroot/__main__/bazel-out/k8-fastbuild/testlogs/tensorflow_zero_out/zero_out_ops_py_test/test.log)
INFO: Elapsed time: 33.232s, Critical Path: 16.92s
INFO: 6 processes: 6 local.
INFO: Build completed, 1 test FAILED, 18 total actions
//tensorflow_zero_out:zero_out_ops_py_test                               FAILED in 3.0s
  /root/.cache/bazel/_bazel_root/f59710f949de61c873019b8044b9e9cf/execroot/__main__/bazel-out/k8-fastbuild/testlogs/tensorflow_zero_out/zero_out_ops_py_test/test.log

INFO: Build completed, 1 test FAILED, 18 total actions
root@52727958a85b:/home/custom-op# more /root/.cache/bazel/_bazel_root/f59710f949de61c873019b8044b9e9cf/execroot/__main__/bazel-out/k8-fastbuild/testlogs/tensorflow_zero_out/zero_out_ops_py_test/test.log
exec ${PAGER:-/usr/bin/less} "$0" || exit 1
Executing tests from //tensorflow_zero_out:zero_out_ops_py_test
-----------------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/.cache/bazel/_bazel_root/f59710f949de61c873019b8044b9e9cf/execroot/__main__/bazel-out/k8-fastbuild/bin/tensorflow_zero_out/zero_out_ops_py_test.runfiles/__main__/tensorfl
ow_zero_out/python/ops/zero_out_ops_test.py", line 23, in <module>
    from zero_out_ops import zero_out
  File "/home/custom-op/tensorflow_zero_out/python/ops/zero_out_ops.py", line 25, in <module>
    resource_loader.get_path_to_datafile('_zero_out_ops.so'))
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/load_library.py", line 60, in load_op_library
    lib_handle = py_tf.TF_LoadLibrary(library_filename)
tensorflow.python.framework.errors_impl.NotFoundError: /home/custom-op/tensorflow_zero_out/python/ops/_zero_out_ops.so: cannot open shared object file: No such file or directory
root@52727958a85b:/home/custom-op#

Problem adding an external library to custom-ops when building a pip package

Hi all,
I'm working on a project and I'd like to use an external C++ library in my TF custom ops.
To do this, I've tried the following:

I've added the following to the custom-ops WORKSPACE:

load("//tf:tf_configure.bzl", "tf_configure")
load("//gpu:cuda_configure.bzl", "cuda_configure")

tf_configure(name = "local_config_tf")

cuda_configure(name = "local_config_cuda")

new_local_repository(
    name = "custom_ops_logic",
    path = "../custom_ops_logic/",
    build_file = "custom_ops_logic.BUILD",
)

I've also added custom_ops_logic.BUILD in the custom-ops dir:

cc_import(
    name = "custom_ops_logic/zero_out",
    hdrs = ["zero_out_logic/zero_out_logic.h"],
    shared_library = "build/zero_out_logic/libzero_out_logic.so",
    visibility = ["//visibility:public"],
)

Finally, I've added custom_ops_logic/zero_out into the BUILD file of the zero_out custom op.
Now, I've successfully run the following:

bazel build :build_pip_pkg
bazel-bin/build_pip_pkg artifacts
pip install artifacts/*.whl

But when I try to test the pip package:

python -c "import tensorflow as tf;import tensorflow_zero_out;print(tensorflow_zero_out.zero_out([[1,2], [3,4]]))"

I get the following error:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/tensorflow_zero_out/__init__.py", line 19, in <module>
    from tensorflow_zero_out.python.ops.zero_out_ops import zero_out
  File "/usr/local/lib/python2.7/dist-packages/tensorflow_zero_out/python/ops/zero_out_ops.py", line 25, in <module>
    resource_loader.get_path_to_datafile('_zero_out_ops.so'))
  File "/usr/local/lib/python2.7/dist-packages/tensorflow_core/python/framework/load_library.py", line 57, in load_op_library
    lib_handle = py_tf.TF_LoadLibrary(library_filename)
tensorflow.python.framework.errors_impl.NotFoundError: libzero_out_logic.so: cannot open shared object file: No such file or directory

I'm new to bazel, so there's probably something I don't get (I tried changing the BUILD files with no success).

Thanks
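
One workaround worth trying for this class of NotFoundError is to preload the external library by absolute path before the op library is imported, so the dynamic linker can already resolve it. This is a hedged sketch (the path is a hypothetical placeholder), not a confirmed fix for this exact setup:

# Hedged workaround sketch: preload the external dependency so that the later
# dlopen of _zero_out_ops.so can resolve libzero_out_logic.so.
# '/path/to/libzero_out_logic.so' is a hypothetical placeholder path.
import ctypes

ctypes.CDLL('/path/to/libzero_out_logic.so', mode=ctypes.RTLD_GLOBAL)

from tensorflow_zero_out.python.ops.zero_out_ops import zero_out
print(zero_out([[1, 2], [3, 4]]))

A more permanent fix would be packaging libzero_out_logic.so into the wheel itself, or arranging an rpath/LD_LIBRARY_PATH that covers it, but that depends on the Bazel setup and is not verified here.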

'libnvinfer.so.6' and plugin missing when building custom-op-gpu on ubuntu14

I followed the tutorial steps for building a custom GPU op against the manylinux1 docker image (ubuntu14). When running the ./configure.sh command I got this error:


2020-02-19 16:14:12.072849: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer.so.6'; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64
2020-02-19 16:14:12.072947: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer_plugin.so.6'; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64
2020-02-19 16:14:12.072968: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
Fatal Python error: GC object already tracked

It seems this is related to CUDA 10, and some similar issues have been reported recently for TF 2.1.

I tried the nightly docker version, but hit the same problem.

XLA custom op?

I'm curious whether there is any known limitation to compiling an external custom op for an XLA device (i.e. XLA_GPU)?

It looks to me that instead of registering with Device(DEVICE_GPU) one could try Device(XLA_GPU), but I'm not sure whether it will work. Can someone please comment?

Testing custom operations from C++

Hi,

I've been looking into how to test custom operations from C++. I see this repo has some tests in Python, but it'd be great to be able to run tests from C++ as well. I looked into how TensorFlow does it: it uses test helpers from tensorflow/core/platform/test.h, which as far as I can tell are not packaged with the TensorFlow headers. Any tips here would be much appreciated.

Thanks!

Parsing shared library name fails on MacOS

Since tensorflow/tensorflow@7efc611, custom-op's configure script no longer properly parses the shared library name for macOS TensorFlow:
https://github.com/tensorflow/custom-op/blob/master/configure.sh#L94

As a quick fix in Addons we just made the string parsing OS-dependent, but a more elegant solution is probably possible. I tried passing the linkflag to copts, but ran into issues unless the library was copied into the bazel build dir by a genrule.

Happy to submit a string-parsing fix PR to this repo, or to modify the Addons script to match whatever better solution comes out of this.
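
For what it's worth, one direction for reducing the string parsing (a sketch only, not a drop-in replacement for configure.sh) is to ask the installed TensorFlow for its library directory and match the framework library by pattern, which absorbs the .so/.dylib difference:

# Hedged sketch: discover the framework library name from the installed TF
# instead of parsing strings in configure.sh; fnmatch absorbs the
# platform-specific suffix (.so vs .dylib, with or without version numbers).
import fnmatch
import os
import tensorflow as tf

libdir = tf.sysconfig.get_lib()
candidates = [f for f in os.listdir(libdir)
              if fnmatch.fnmatch(f, 'libtensorflow_framework*')]
print(libdir, candidates)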

Failed compiling with TF 1.14.1 nightly package

TF version: tf-nightly==1.14.1.dev20190419
Python: 2.7
OS: macOS

$ ./configure.sh
Do you want to build ops again TensorFlow CPU pip package? Y or enter for CPU (tensorflow), N for GPU (tensorflow-gpu). [Y/n]
Build with CPU pip package.
Using installed tensorflow
$ bazel build //tensorflow_time_two:time_two_py
WARNING: /private/var/tmp/_bazel_byronyi/ee0318db429a8a8bc1314ce19d988028/external/local_config_tf/BUILD:3589:1: target 'libtensorflow_framework.so' is both a rule and a file; please choose another name for the rule
WARNING: /private/var/tmp/_bazel_byronyi/ee0318db429a8a8bc1314ce19d988028/external/local_config_tf/BUILD:5:12: in hdrs attribute of cc_library rule @local_config_tf//:tf_header_lib: file '_api_implementation.so' from target '@local_config_tf//:tf_header_include' is not allowed in hdrs
WARNING: /private/var/tmp/_bazel_byronyi/ee0318db429a8a8bc1314ce19d988028/external/local_config_tf/BUILD:5:12: in hdrs attribute of cc_library rule @local_config_tf//:tf_header_lib: file '_message.so' from target '@local_config_tf//:tf_header_include' is not allowed in hdrs
INFO: Analysed target //tensorflow_time_two:time_two_py (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
ERROR: /private/var/tmp/_bazel_byronyi/ee0318db429a8a8bc1314ce19d988028/external/local_config_tf/BUILD:3589:1: Executing genrule @local_config_tf//:libtensorflow_framework.so failed (Exit 1)
cp: /usr/local/lib/python2.7/site-packages/tensorflow/libtensorflow_framework.so: No such file or directory
Target //tensorflow_time_two:time_two_py failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 0.361s, Critical Path: 0.15s
INFO: 0 processes.
FAILED: Build did NOT complete successfully
$ make
g++ -I/usr/local/lib/python2.7/site-packages/tensorflow/include -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -O2 -std=c++11 -o tensorflow_zero_out/python/ops/_zero_out_ops.so tensorflow_zero_out/cc/kernels/zero_out_kernels.cc tensorflow_zero_out/cc/ops/zero_out_ops.cc -shared -L/usr/local/lib/python2.7/site-packages/tensorflow -l:libtensorflow_framework.1.dylib
In file included from tensorflow_zero_out/cc/kernels/zero_out_kernels.cc:16:
In file included from /usr/local/lib/python2.7/site-packages/tensorflow/include/tensorflow/core/framework/op_kernel.h:24:
In file included from /usr/local/lib/python2.7/site-packages/tensorflow/include/tensorflow/core/framework/allocator.h:26:
In file included from /usr/local/lib/python2.7/site-packages/tensorflow/include/tensorflow/core/framework/resource_handle.h:19:
In file included from /usr/local/lib/python2.7/site-packages/tensorflow/include/tensorflow/core/platform/tensor_coding.h:24:
/usr/local/lib/python2.7/site-packages/tensorflow/include/tensorflow/core/platform/protobuf.h:29:10: fatal error: 'google/protobuf/io/tokenizer.h' file not found
#include "google/protobuf/io/tokenizer.h"
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
In file included from tensorflow_zero_out/cc/ops/zero_out_ops.cc:16:
In file included from /usr/local/lib/python2.7/site-packages/tensorflow/include/tensorflow/core/framework/op.h:23:
In file included from /usr/local/lib/python2.7/site-packages/tensorflow/include/tensorflow/core/framework/op_def_builder.h:24:
/usr/local/lib/python2.7/site-packages/tensorflow/include/tensorflow/core/framework/op_def.pb.h:10:10: fatal error: 'google/protobuf/port_def.inc' file not found
#include <google/protobuf/port_def.inc>
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
make: *** [tensorflow_zero_out/python/ops/_zero_out_ops.so] Error 1

cc @yifeif @perfinion @martinwicke
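
A quick, hedged check for this class of missing-protobuf-header failure (it also applies to the Windows report further below) is to confirm that the installed wheel actually ships the headers being included:

# Hedged diagnostic: verify the installed TF wheel ships the protobuf headers
# that op_kernel.h / op_def.pb.h try to include.
import os
import tensorflow as tf

inc = tf.sysconfig.get_include()
for h in ('google/protobuf/io/tokenizer.h',
          'google/protobuf/port_def.inc',
          'google/protobuf/stubs/common.h'):
    print(h, '->', os.path.exists(os.path.join(inc, h)))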

tensorflow.python.framework.errors_impl.NotFoundError: libtensorflow_framework.so.2:

I have successfully compiled and tested my custom op in the docker environment.

However, when I try to run the script my-custom-op_test.py on my local host machine, I run into the following problem.
How can I solve it?

Traceback (most recent call last):
  File "tensorflow_nearest_neighbor/python/ops/nearest_neighbor_ops_test.py", line 24, in <module>
    from tensorflow_nearest_neighbor.python.ops.nearest_neighbor_ops import nearest_neighbor
ModuleNotFoundError: No module named 'tensorflow_nearest_neighbor'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "tensorflow_nearest_neighbor/python/ops/nearest_neighbor_ops_test.py", line 26, in <module>
    from nearest_neighbor_ops import nearest_neighbor
  File "/home/my_op/tensorflow_nearest_neighbor/python/ops/nearest_neighbor_ops.py", line 31, in <module>
    resource_loader.get_path_to_datafile('_nearest_neighbor_ops.so'))
  File "/home/tan/anaconda3/lib/python3.7/site-packages/tensorflow/python/framework/load_library.py", line 61, in load_op_library
    lib_handle = py_tf.TF_LoadLibrary(library_filename)
tensorflow.python.framework.errors_impl.NotFoundError: libtensorflow_framework.so.2: cannot open shared object file: No such file or directory

Solution: [Updated 2 April 2020]

The configure.sh script by default installs the latest version of tensorflow, which is tensorflow 2.1.0. However, I am using tensorflow version 1.15.0, so I only have libtensorflow_framework.so.1. I had to pin tensorflow==1.15.0 to compile my custom operation:

if [[ "$TF_NEED_CUDA" == "0" ]]; then
if [[ $(pip show tensorflow) == tensorflow ]] || [[ $(pip show tf-nightly) == tf-nightly ]] ; then
echo 'Using installed tensorflow'
else
if [[ $(pip show tensorflow-gpu) == tensorflow-gpu ]]; then
echo 'Already have gpu version of tensorflow installed. Uninstalling......\n'
pip uninstall tensorflow-gpu
elif [[ $(pip show tf-nightly-gpu) == tf-nightly-gpu ]]; then
echo 'Already have gpu version of tensorflow installed. Uninstalling......\n'
pip uninstall tf-nightly-gpu
fi
# Install CPU version
echo 'Installing tensorflow......\n'
pip install tensorflow==1.15.0
fi
else
if [[ $(pip show tensorflow-gpu) == tensorflow-gpu ]] || [[ $(pip show tf-nightly-gpu) == tf-nightly-gpu ]]; then
echo 'Using installed tensorflow-gpu'
else
# Uninstall CPU version if it is installed.
if [[ $(pip show tensorflow) == tensorflow ]]; then
echo 'Already have tensorflow non-gpu installed. Uninstalling......\n'
pip uninstall tensorflow
elif [[ $(pip show tf-nightly) == tf-nightly ]]; then
echo 'Already have tensorflow non-gpu installed. Uninstalling......\n'
pip uninstall tf-nightly
fi
echo 'Installing tensorflow-gpu .....\n'
pip install tensorflow-gpu==1.15.0
fi
fi
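
As a sanity check consistent with the fix above, the link flags an installed TensorFlow exports name the framework soname directly, so you can confirm .so.1 vs .so.2 before building:

# Hedged sanity check: the -l: entry in the exported link flags names the
# framework library version (.so.1 for TF 1.15.x, .so.2 for TF 2.x), which
# must match the libtensorflow_framework.* available at runtime.
import tensorflow as tf

print(tf.__version__)
print(tf.sysconfig.get_link_flags())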

Missing header error in building the example on Windows

Hi,

I've got the following error when trying to build the example "tensorflow_zero_out" on Windows:

(env1) D:\MyResearch\code\customop\custom-op-master>bazel build build_pip_pkg
WARNING: The following rc files are no longer being read, please transfer their contents or import their path into one of the standard rc files:
d:\myresearch\code\customop\custom-op-master/.bazelrc
Starting local Bazel server and connecting to it...
INFO: Invocation ID: 3c77b750-5cfa-43d8-b9f9-156f1630ec5e
WARNING: C:/users/02/_bazel_02/ni2xhpqp/external/local_config_tf/BUILD:5:12: in hdrs attribute of cc_library rule @local_config_tf//:tf_header_lib: file '_api_implementation.so' from target '@local_config_tf//:tf_header_include' is not allowed in hdrs
WARNING: C:/users/02/_bazel_02/ni2xhpqp/external/local_config_tf/BUILD:5:12: in hdrs attribute of cc_library rule @local_config_tf//:tf_header_lib: file '_message.so' from target '@local_config_tf//:tf_header_include' is not allowed in hdrs
INFO: Analysed target //:build_pip_pkg (14 packages loaded, 100 targets configured).
INFO: Found 1 target...
ERROR: D:/myresearch/code/customop/custom-op-master/tensorflow_zero_out/BUILD:10:1: C++ compilation of rule '//tensorflow_zero_out:python/ops/_zero_out_ops.so' failed (Exit 2)
bazel-out/x64_windows-opt/genfiles/external/local_config_tf/include\tensorflow/core/framework/op_def.pb.h(9): fatal error C1083: Cannot open include file: 'google/protobuf/stubs/common.h': No such file or directory
Target //:build_pip_pkg failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 10.147s, Critical Path: 3.76s
INFO: 0 processes.
FAILED: Build did NOT complete successfully

Here are the details:

Since no docker container is provided for Windows, to build the example I created a conda environment and used pip to install TensorFlow CPU version 1.13.2.

Then, to run the shell script configure.sh in the conda env on Windows, I installed m2-base in conda, so Bazel actually runs in an MSYS2 shell.

I got the following .bazelrc file after running configure.sh using Bazel 0.21:

build:cuda --define=using_cuda=true --define=using_cuda_nvcc=true
build:cuda --crosstool_top=@local_config_cuda//crosstool:toolchain
build --spawn_strategy=standalone
build --strategy=Genrule=standalone
build -c opt
build --action_env TF_HEADER_DIR="D:/MyProgram/Anaconda/envs/env1/lib/site-packages/tensorflow/include"
build --action_env TF_SHARED_LIBRARY_DIR="D:/MyProgram/Anaconda/envs/env1/lib/site-packages/tensorflow/python"
build --action_env TF_SHARED_LIBRARY_NAME="_pywrap_tensorflow_internal.lib"
build --action_env TF_NEED_CUDA="0"

Then, when using the command "bazel build build_pip_pkg", I got the missing header error mentioned above.


Actually, the protobuf package (version 3.12.2) was installed in the conda env, but I still got the same error.

Any idea how to solve the problem?

Thank you.

nvcc fatal : Unknown option 'fPIC'

The following error is reported when compiling:
$ make time_two_op
nvcc -std=c++11 -c -o tensorflow_time_two/python/ops/_time_two_ops.cu.o tensorflow_time_two/cc/kernels/time_two_kernels.cu.cc -I/usr/local/lib/python3.6/dist-packages/tensorflow_core/include -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -O2 -std=c++11 -L/usr/local/lib/python3.6/dist-packages/tensorflow_core -l:libtensorflow_framework.so.1 -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC -DNDEBUG --expt-relaxed-constexpr
nvcc fatal : Unknown option 'fPIC'
Makefile:35: recipe for target 'tensorflow_time_two/python/ops/_time_two_ops.cu.o' failed
make: *** [tensorflow_time_two/python/ops/_time_two_ops.cu.o] Error 1

(Note that the command line contains both a bare -fPIC, inherited from the CPU CFLAGS, and the nvcc-safe -Xcompiler -fPIC; nvcc only accepts the latter form, so the bare flag is presumably what it is rejecting.)

How to depend on another shared library libtensorflow_cc.so?

My project has to depend on both of the shared libraries libtensorflow_cc.so and libtensorflow_framework.so.
So I tried to edit the tf/BUILD.tpl file and added another rule as follows:

cc_library(
    name = "libtensorflow_cc",
    srcs = [":libtensorflow_cc.so"],
    visibility = ["//visibility:public"],
)

and edited the BUILD file like this:

deps = [
    "@local_config_tf//:libtensorflow_framework",
    "@local_config_tf//:tf_header_lib",
    "@local_config_tf//:libtensowflow_cc",  # this line is added
],

But Bazel does not seem to recognize this, and I got an error:
ERROR: /home/ubuntu/custom-op/tensorflow_zero_out/BUILD:5:1: no such target '@local_config_tf//:libtensowflow_cc': target 'libtensowflow_cc' not declared in package ''

I realize that I have to edit the tf/tf_configure.bzl file for my modification to take effect. Could anyone please give me some advice on how to modify it so that the added rule works?
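
One detail worth noting from the snippets above: the deps entry spells the label @local_config_tf//:libtensowflow_cc while the rule added to tf/BUILD.tpl is named libtensorflow_cc, and Bazel's "no such target" message matches the misspelled label, which suggests the misspelling, rather than a missing tf_configure.bzl change, triggers this particular error. A corrected deps fragment (a Starlark sketch; it still assumes tf_configure.bzl materializes libtensorflow_cc.so in the @local_config_tf repository, which is not verified here):

# Starlark sketch: deps with the label spelled to match the rule name
# defined in tf/BUILD.tpl above.
deps = [
    "@local_config_tf//:libtensorflow_framework",
    "@local_config_tf//:tf_header_lib",
    "@local_config_tf//:libtensorflow_cc",  # spelling now matches the rule
],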

Cuda Configuration Error: Failed to run find_cuda_config.py

I run a docker container with the TensorFlow 1.15.2 GPU py3 image on Ubuntu 16.04.
The image has tensorflow 1.15.2, gcc/g++ 7.4.0, and Python 3.6.9 (built with gcc 8.3.0).
After successfully building my_custom_op, I tried to build the pip package according to the custom-op GitHub guide, i.e. running 'bazel build :build_pip_pkg'.
But I met the following error.

ERROR: Analysis of target '//:build_pip_pkg' failed; build aborted: error loading package 'tensorflow_active_shift': Encountered error while reading extension file 'cuda/build_defs.bzl': no such package '@local_config_cuda//cuda': Traceback (most recent call last):
  File "/work_dir/gpu/cuda_configure.bzl", line 1254
    _create_local_cuda_repository(repository_ctx)
  File "/work_dir/gpu/cuda_configure.bzl", line 985, in _create_local_cuda_repository
    _get_cuda_config(repository_ctx)
  File "/work_dir/gpu/cuda_configure.bzl", line 714, in _get_cuda_config
    find_cuda_config(repository_ctx, ["cuda", "cudnn"])
  File "/work_dir/gpu/cuda_configure.bzl", line 694, in find_cuda_config
    auto_configure_fail(("Failed to run find_cuda_config...))
  File "/work_dir/gpu/cuda_configure.bzl", line 325, in auto_configure_fail
    fail(("\n%sCuda Configuration Error:%...)))

Cuda Configuration Error: Failed to run find_cuda_config.py: Could not find any cublas_api.h in any subdirectory:
''
'include'
'include/cuda'
'include/*-linux-gnu'
'extras/CUPTI/include'
'include/cuda/CUPTI'
of:
'/usr/local/cuda'

INFO: Elapsed time: 0.162s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (0 packages loaded, 0 targets configured)

My system is Ubuntu 16.04 and my GPU is an RTX 2080. The container's CUDA version is 10.0, and the patch to find_cuda_config.py for the similar CUDA 10.1 error has already been applied.

When I try to find the file 'cublas_api.h' from the root, e.g. with 'find / -name cublas_api.h', I cannot find it.
However, my original Ubuntu system has CUDA 10.2 installed, and there I can find cublas_api.h.

How can I solve this problem?
Thank you in advance
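
One point that may explain the discrepancy: configure runs inside the container, so the CUDA 10.2 headers on the host are invisible to it. A hedged sketch that reproduces the probe from the error message, run inside the container, makes this concrete:

# Hedged sketch: reproduce find_cuda_config's header probe (locations taken
# from the error message above), to be run inside the container.
import glob
import os

root = '/usr/local/cuda'
subdirs = ['', 'include', 'include/cuda', 'include/*-linux-gnu',
           'extras/CUPTI/include', 'include/cuda/CUPTI']
for sub in subdirs:
    pattern = os.path.join(root, sub, 'cublas_api.h')
    print(pattern, '->', glob.glob(pattern) or 'not found')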

Missing dependency declarations for gcc when building against manylinux1

Docker: tensorflow/tensorflow:custom-op-gpu-ubuntu14 and nightly
manylinux1 and TF < 2.1
In configure.sh I changed the TF version to 1.14.0

When building the custom GPU op I encountered this error:

ERROR: /working_dir/hdrnet/BUILD:47:1: undeclared inclusion(s) in rule '//hdrnet:python/ops/_hdrnet_ops.so':
this rule is missing dependency declarations for the following files included by 'hdrnet/cc/bilateral_slice.cc':
  '/usr/lib/gcc/x86_64-linux-gnu/4.8/include-fixed/limits.h'
  '/usr/lib/gcc/x86_64-linux-gnu/4.8/include-fixed/syslimits.h'
  '/usr/lib/gcc/x86_64-linux-gnu/4.8/include/stddef.h'
  '/usr/lib/gcc/x86_64-linux-gnu/4.8/include/stdarg.h'
  '/usr/lib/gcc/x86_64-linux-gnu/4.8/include/stdint.h'
  '/usr/lib/gcc/x86_64-linux-gnu/4.8/include/stdbool.h'
  '/usr/lib/gcc/x86_64-linux-gnu/4.8/include/x86intrin.h'
  '/usr/lib/gcc/x86_64-linux-gnu/4.8/include/ia32intrin.h'
  '/usr/lib/gcc/x86_64-linux-gnu/4.8/include/mmintrin.h'
  '/usr/lib/gcc/x86_64-linux-gnu/4.8/include/xmmintrin.h'
  '/usr/lib/gcc/x86_64-linux-gnu/4.8/include/mm_malloc.h'
  '/usr/lib/gcc/x86_64-linux-gnu/4.8/include/emmintrin.h'
  '/usr/lib/gcc/x86_64-linux-gnu/4.8/include/immintrin.h'
  '/usr/lib/gcc/x86_64-linux-gnu/4.8/include/fxsrintrin.h'
  '/usr/lib/gcc/x86_64-linux-gnu/4.8/include/adxintrin.h'
  '/usr/lib/gcc/x86_64-linux-gnu/4.8/include/float.h'
Target //hdrnet:python/ops/_hdrnet_ops.so failed to build
INFO: Elapsed time: 21.516s, Critical Path: 14.68s
INFO: 4 processes: 4 local.
FAILED: Build did NOT complete successfully

The build works when using the manylinux2010 docker image and TF > 2.1.

Building a TensorFlow custom operation after just changing its name in all files (.cc, .cu, BUILD) results in a segmentation fault

The error happens only with GPU. I built the .so library with bazel using the ubuntu16-gpu docker image.

root@6d23afbfc05f:/home/sbs/my_op# python tensorflow_crop_and_resize1/python/ops/crop_and_resize1_ops.py
2019-11-28 15:28:50.468388: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2019-11-28 15:28:50.472866: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2019-11-28 15:28:50.475121: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce GTX 1660 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.815
pciBusID: 0000:01:00.0
2019-11-28 15:28:50.475137: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2019-11-28 15:28:50.475893: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2019-11-28 15:28:50.476576: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2019-11-28 15:28:50.476748: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2019-11-28 15:28:50.477763: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2019-11-28 15:28:50.478716: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2019-11-28 15:28:50.481232: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2019-11-28 15:28:50.481787: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2019-11-28 15:28:50.482261: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-11-28 15:28:50.506346: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3600000000 Hz
2019-11-28 15:28:50.507012: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x60e19f0 executing computations on platform Host. Devices:
2019-11-28 15:28:50.507027: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): Host, Default Version
2019-11-28 15:28:50.564176: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x61052a0 executing computations on platform CUDA. Devices:
2019-11-28 15:28:50.564202: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): GeForce GTX 1660 Ti, Compute Capability 7.5
2019-11-28 15:28:50.564550: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce GTX 1660 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.815
pciBusID: 0000:01:00.0
2019-11-28 15:28:50.564582: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2019-11-28 15:28:50.564596: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2019-11-28 15:28:50.564607: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2019-11-28 15:28:50.564618: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2019-11-28 15:28:50.564629: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2019-11-28 15:28:50.564639: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2019-11-28 15:28:50.564650: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2019-11-28 15:28:50.565124: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2019-11-28 15:28:50.760387: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-11-28 15:28:50.760411: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165] 0
2019-11-28 15:28:50.760416: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0: N
2019-11-28 15:28:50.760997: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3769 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1660 Ti, pci bus id: 0000:01:00.0, compute capability: 7.5)
2019-11-28 15:28:50.762516: I tensorflow/core/common_runtime/eager/execute.cc:574] Executing op CropAndResize1 in device /job:localhost/replica:0/task:0/device:GPU:0
2019-11-28 15:28:51.049562: E tensorflow/stream_executor/cuda/cuda_event.cc:29] Error polling for event status: failed to query event: CUDA_ERROR_ILLEGAL_ADDRESS: an illegal memory access was encountered
2019-11-28 15:28:51.049591: F tensorflow/core/common_runtime/gpu/gpu_event_mgr.cc:273] Unexpected Event status: 1
Aborted (core dumped)
