Disable mkl inside eigen - tensorflow

System information
OS Platform and Distribution : Debian GNU/Linux 10
TensorFlow installed from (source or binary): Source
TensorFlow version: latest
Python version: 3.7.3
Installed using virtualenv? pip? conda?: virtualenv
Bazel version (if compiling from source): 3.3.0
GCC/Compiler version (if compiling from source): gcc 8.3
I did NOT use the --config=mkl flag while building
The image is a call graph of a 1024x1024 MatMul executed 100 times. If you look carefully, mkldnn_sgemm is called internally by Eigen; this is what I want to disable.
After some reading I found out that MKL can be called internally by Eigen. After reading the Bazel documentation and seeing how TensorFlow is structured, I want to disable MKL/MKL-DNN completely whenever Eigen is used.
The Eigen BUILD file loads the if_mkl symbol from the MKL directory.
My next step was to look at the if_mkl function inside build_defs.bzl.
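For context, this is a sketch of the kind of select() wrapper I expected to find there and of the change I was attempting; the config_setting label is an assumption, not a verbatim copy of the upstream file:

# third_party/mkl/build_defs.bzl (sketch; labels are assumptions, not verbatim)
def if_mkl(if_true, if_false = []):
    # Upstream this is roughly a select() of the form:
    #   return select({
    #       "//third_party/mkl:build_with_mkl": if_true,
    #       "//conditions:default": if_false,
    #   })
    # Hard-wiring the non-MKL branch should, in principle, drop every dep and
    # define that is only added when MKL is enabled:
    return if_false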
I changed the includes attribute in the cc_library rule in Eigen's BUILD file to ["//conditions:default"] and tried building again,
which gave me an error saying ModuleNotFoundError: No module named 'portpicker'.
So I installed it with pip: pip install portpicker
The build completed successfully after I did this, but the profile shows literally no difference: mkldnn_sgemm is still being called the same number of times internally by Eigen.
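A quick way to double-check whether a rebuild actually dropped the dependency, before profiling again, is to look for MKL-DNN symbols in the rebuilt libraries; the paths below are from my build tree and may differ:

# Count mkldnn symbols exported by the freshly built framework library.
$ nm -D bazel-bin/tensorflow/libtensorflow_framework.so | grep -ci mkldnn
# And check whether any MKL shared objects are pulled in at load time.
$ ldd bazel-bin/tensorflow/libtensorflow_framework.so | grep -i mkl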
NOTE: The ultimate aim is to disable MKL-DNN, which is being called from inside Eigen.

Related

Bazel + numpy + zip cross compile for arm

I am using Bazel to make a Python zip (--build_python_zip) from a py_binary rule. It works great on the same architecture, but when I try to run the x86-built app on the ARM it crashes with:
ModuleNotFoundError: No module named 'numpy.core._multiarray_umath'
I think this is because there are some C libs in NumPy that are included but built for x86. From looking around, it seems like I need to define a toolchain in Bazel and build with that. Does this work with the rules_python pip_install mechanism? How do I define/use the toolchain?
I have a minimal example at https://github.com/CruxML/MinimalCrossCompile. Run make_zip.sh to build and run. Verified that this has the issue described.
This appears to have been solved in rules_python 0.12 and above in PR #773
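Independent of that fix, cross-building generally requires telling Bazel what the target platform is (and registering a C/C++ toolchain for it). A minimal sketch with hypothetical names, assuming a 64-bit ARM Linux target:

# BUILD at the workspace root (names are hypothetical)
platform(
    name = "linux_arm64",
    constraint_values = [
        "@platforms//os:linux",
        "@platforms//cpu:aarch64",
    ],
)

# Then build the zip for that platform, assuming an ARM toolchain is registered:
# bazel build --platforms=//:linux_arm64 --build_python_zip //app:main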

check tensorflow version in CMake

I'm trying to check the TensorFlow (built from source) version in CMake.
If TensorFlow is built from source, there's an include folder ("eager", c_api.h, c_api_experimental.h, LICENSE) and a lib folder (libtensorflow.so, libtensorflow_framework.so).
I tried find_package because of the PACKAGE_FIND_VERSION variable. Although the TensorFlow_FOUND variable was set, the package version variable was not. Maybe something like a .version file is needed.
The reason I'm trying to do this is a version check: my program needs TensorFlow 1.10. If there is a pre-built TensorFlow already on the user's system (/usr/include, /usr/lib), it should check whether the version is 1.10.
Is there any good method for doing this?
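One approach (a sketch, not a polished find module) is to compile and run a tiny program that prints TF_Version() from the C API and compare the result; TF_INCLUDE_DIR and TF_LIB_DIR are hypothetical variables pointing at the folder containing c_api.h and at the lib folder, and you may need LD_LIBRARY_PATH or an rpath so the test binary can run:

# Sketch: query the runtime version via TF_Version() from the C API.
file(WRITE ${CMAKE_BINARY_DIR}/tf_version_check.c
  "#include <stdio.h>\n#include \"c_api.h\"\nint main(void){printf(\"%s\", TF_Version());return 0;}\n")
try_run(TF_RUN_RESULT TF_COMPILE_RESULT
  ${CMAKE_BINARY_DIR} ${CMAKE_BINARY_DIR}/tf_version_check.c
  CMAKE_FLAGS "-DINCLUDE_DIRECTORIES=${TF_INCLUDE_DIR}"
  LINK_LIBRARIES ${TF_LIB_DIR}/libtensorflow.so
  RUN_OUTPUT_VARIABLE TF_VERSION_STRING)
message(STATUS "Found TensorFlow ${TF_VERSION_STRING}")
if(NOT TF_VERSION_STRING MATCHES "^1\\.10")
  message(FATAL_ERROR "TensorFlow 1.10 is required, found ${TF_VERSION_STRING}")
endif()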

How to force tensorflow to use a custom version of Eigen?

I am compiling TensorFlow 1.5, and I want to force Bazel to include a custom version of the Eigen header files, which are at:
usr/local/lib/python2.7/dist-packages/...
However, whenever I try to compile (even after a bazel clean --expunge), TensorFlow uses different files, which are copied during the build into:
/root/.cache/bazel/_bazel_root/
Is there any way to force TensorFlow to use my custom headers instead?
You can change the tf_http_archive rule for eigen_archive (you must not change the name) in tensorflow/workspace.bzl to new_local_repository and use TensorFlow's Eigen BUILD file (//third_party:eigen.BUILD).
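A sketch of what that change could look like inside tensorflow/workspace.bzl; the local path is a placeholder for wherever your Eigen headers live, and the exact build_file syntax may differ between Bazel versions:

# tensorflow/workspace.bzl (sketch): replace the eigen_archive tf_http_archive with
native.new_local_repository(
    name = "eigen_archive",                # the name must stay "eigen_archive"
    path = "/path/to/your/eigen",          # placeholder for your custom Eigen checkout
    build_file = str(Label("//third_party:eigen.BUILD")),
)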

Integrating tensorflow in a larger C++ project -- Library conflicts

Objective: Integrate TensorFlow into a larger project.
Solution: 1) Integrate TensorFlow into CMake by passing appropriate arguments to Bazel and get a working build.
2) Unzip the *.whl file to get the library and headers.
Problem: TensorFlow builds but has its own header files for protobuf and Eigen. My project also depends on these two libraries, and the versions might mismatch. However, I could use the libraries that TensorFlow fetches and replace the ones we currently use; we currently build protobuf ourselves.
Question: I can find the protobuf and Eigen header files used by TensorFlow inside the built whl files, but I cannot find the .so files.
My understanding of Bazel is limited, but it might be that it removes the .so files from the sandbox it uses; I am not sure.
What can I do to always keep the lib and include folders for the dependencies that TensorFlow downloads, namely protobuf? (Eigen is header-only.)
Tried already: searching in the ~/.cache/bazel/ directory.
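One thing worth trying is to build protobuf's library target through TensorFlow's own workspace and then look at what ends up under bazel-bin/external; the repository name here is an assumption and differs between TensorFlow versions (older checkouts call it @protobuf_archive):

# From the TensorFlow source tree (repository/target names are assumptions):
bazel build @com_google_protobuf//:protobuf
ls bazel-bin/external/com_google_protobuf/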

how to know the lowest cuda toolkit version that supports for one specific gpu like gtx1080

For a specific GPU like the GTX 1080, I want to know which CUDA toolkit versions support it. I have scanned the official NVIDIA website but found no specific result.
You can first check the Compute Capability (CC) of the GTX 1080 on the following site. It is CC 6.1.
https://developer.nvidia.com/cuda-gpus
Then check the CUDA documentation below to see whether it is in the supported CC list of the current release, CUDA 7.5. It is not.
http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#compute-capabilities
So it should be supported by a future release. The upcoming release is CUDA 8; you will find it in its documentation.
If you are not sure which release the documents correspond to, you can look at the documents shipped with a specific CUDA installation, such as
/usr/local/cuda-7.5/doc
The help message of the CUDA compiler also gives the supported CC list:
$ nvcc --help
--gpu-code <code>,... (-code)
Specify the name of the NVIDIA GPU to assemble and optimize PTX for.
nvcc embeds a compiled code image in the resulting executable for each specified
<code> architecture, which is a true binary load image for each 'real' architecture
(such as sm_20), and PTX code for the 'virtual' architecture (such as compute_20).
During runtime, such embedded PTX code is dynamically compiled by the CUDA
runtime system if no binary load image is found for the 'current' GPU.
Architectures specified for options '--gpu-architecture' and '--gpu-code'
may be 'virtual' as well as 'real', but the <code> architectures must be
compatible with the <arch> architecture. When the '--gpu-code' option is
used, the value for the '--gpu-architecture' option must be a 'virtual' PTX
architecture.
For instance, '--gpu-architecture=compute_35' is not compatible with '--gpu-code=sm_30',
because the earlier compilation stages will assume the availability of 'compute_35'
features that are not present on 'sm_30'.
Allowed values for this option: 'compute_20','compute_30','compute_32',
'compute_35','compute_37','compute_50','compute_52','compute_53','sm_20',
'sm_21','sm_30','sm_32','sm_35','sm_37','sm_50','sm_52','sm_53'.
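Once a CUDA release that lists sm_61 is installed (CUDA 8 or later for the GTX 1080), you can target its compute capability explicitly; the source file name below is just a placeholder:

$ nvcc -gencode arch=compute_61,code=sm_61 my_kernel.cu -o my_kernel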