After installing tensorflow==2.9.1 and running it for the first time I got the following message:
2022-08-19 11:51:23.381523: I tensorflow/core/platform/cpu_feature_guard.cc:193]
This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)
to use the following CPU instructions in performance-critical operations:
AVX2 AVX512F AVX512_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate
compiler flags.
This message is a bit confusing. At first glance, it seems like a bad thing. But if you read it carefully, you realize it's actually a good thing: it is using all those nice extensions in "performance-critical operations". But then the last sentence makes it sound not so good, because they are not enabled in "other operations" (whatever those are).
Searching for the above message on the interwebs, I came across the Intel® Optimization for TensorFlow* Installation Guide, which said:
If your machine has AVX512 instruction set supported please use the below packages for better performance.
pip install intel-tensorflow-avx512==2.9.1 # linux only
Since my box supports AVX512, I installed intel-tensorflow-avx512==2.9.1 and now get this message:
2022-08-19 11:43:00.187298: I tensorflow/core/platform/cpu_feature_guard.cc:193]
This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)
to use the following CPU instructions in performance-critical operations:
AVX512_VNNI
To enable them in other operations, rebuild TensorFlow with the appropriate
compiler flags.
Hmm...
So, my questions are:
Since the Intel "optimized" version of TensorFlow only "complains" about using AVX512_VNNI in "performance-critical operations", does that mean it's using AVX2, AVX512F and FMA everywhere, including all "other operations"? Or does it mean it's not using them at all?
If it's not using them at all, does that mean it's "inferior" to the official version of TensorFlow and there is no point in using it? (A quick way to probe this empirically is sketched right after this list.)
BONUS QUESTION: Why are those cool AVX2/512F/512_VNNI and FMA instructions only enabled in "performance-critical operations" and not for all "other operations"?
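A quick way to probe question 2 empirically: time the same workload under each wheel in two separate environments. A minimal sketch (the matrix sizes and iteration count are arbitrary choices of mine, and tf.linalg.matmul is just a convenient oneDNN-backed op):

import time
import tensorflow as tf

x = tf.random.normal([2048, 2048])
y = tf.random.normal([2048, 2048])
_ = tf.linalg.matmul(x, y).numpy()  # warm-up, so one-time startup costs don't skew the timing

start = time.perf_counter()
for _ in range(50):
    z = tf.linalg.matmul(x, y)
_ = z.numpy()  # force the result to materialize before stopping the clock
print("50 matmuls:", time.perf_counter() - start, "seconds")

If the Intel wheel were really ignoring AVX2, AVX512F and FMA, you would expect it to be measurably slower than the official wheel, not faster.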
Related
I have a Windows 11 computer with an 11th Gen Intel Core i7-1185G7, which supports SSE4.1, SSE4.2, AVX, AVX2 and AVX512. The computer has no GPU.
I created a conda environment with Python 3.10 and ran pip install intel-tensorflow. According to the documentation, the command pip install intel-tensorflow-avx512 should only be used on Linux platforms. It mentions that AVX512 is automatically used and enabled in the pip wheels:
All Intel TensorFlow binaries are optimized with oneAPI Deep Neural Network Library (oneDNN), which will use the AVX2 or AVX512F FMA etc CPU instructions automatically in performance-critical operations based on the supported Instruction sets on your machine for both Windows and Linux OS.
However, when I start a new project that uses TensorFlow, the following message is shown:
I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Therefore, I am not sure that TensorFlow is using AVX512 as the default instruction set.
Questions
How can I check that TensorFlow is indeed using AVX512? (One way to check is sketched right after this list.)
If TensorFlow is not using AVX512, how can I force it to? Is it a bug that should be reported to Intel?
Is AVX512 really worth it in comparison with AVX and AVX2 when training a model in TensorFlow on a CPU?
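Regarding question 1, one way I know of to see which ISA is actually dispatched is oneDNN's verbose mode. A minimal sketch (this is a oneDNN feature rather than a TensorFlow API, and I'm assuming the oneDNN bundled in these wheels honors it):

import os
os.environ["ONEDNN_VERBOSE"] = "1"  # must be set before oneDNN initializes, i.e. before importing TF
# os.environ["TF_ENABLE_ONEDNN_OPTS"] = "1"  # may be needed on stock (non-Intel) builds
import tensorflow as tf

x = tf.random.normal([256, 256])
tf.linalg.matmul(x, x)  # any oneDNN-backed op triggers the verbose log

Among the onednn_verbose lines printed to stderr there should be an info line naming the detected ISA; if it mentions AVX-512, oneDNN is dispatching AVX512 kernels. If no onednn_verbose lines appear at all, oneDNN ops are not enabled in your build.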
This may not be ideal, but you could try WSL and run TF there using the intel-tensorflow-avx512 package as a test.
It is supposed to be the default in the TF Windows package as well (no need to use the avx512 pip package), but I'm confirming that now. Will get back to you asap.
When I call tf.Session() in TensorFlow, I obtain the following warning message:
I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports
instructions that this TensorFlow binary was not compiled to use: AVX2
AVX512F FMA
My questions are:
How can I solve this? In particular, I wish to be able to keep the current TensorFlow version (1.12.0).
Will I obtain a considerable gain, considering that I work on a GPU?
I use Ubuntu 18.04.1 LTS.
Thank you ;)
I do not know how to keep 1.12.0; however, the TensorFlow page has a good build guide: https://www.tensorflow.org/install/source#setup_for_linux_and_macos
According to comments from this thread at the TensorFlow GitHub project, no. Quote:
From my experiments I found CPU-optimized GPU TF doesn't boost the performance significantly, but it can make the CPU cooler.
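If "solving" it just means keeping 1.12.0 and silencing the notice (this hides the message, it does not enable the instructions), you can raise TensorFlow's C++ log threshold. A minimal sketch:

import os
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "1"  # 1 filters INFO messages like this one; 2 also filters warnings
import tensorflow as tf

sess = tf.Session()  # the cpu_feature_guard message is no longer printed

To actually use AVX2, AVX512F and FMA you still need a source build, as in the guide linked above.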
In the official Julia 0.6 release, if I Pkg.add TensorFlow and run Pkg.test, shortly after the test starts I get a message about how my CPU supports various instruction sets such as AVX, AVX2, FMA, SSE and so on. Then later in the test process I get another message stating that AVX2 and FMA are not available. The AVX issue is broadly addressed in other Stack Overflow questions.
After recompiling a custom version of TensorFlow to include AVX / FMA and copying the resulting tensorflow.so files to the Julia TensorFlow deps/usr/bin, running the same Pkg.test() results in no first message, which seems to confirm that AVX2 and FMA are now in the binary; but the second message repeats, informing me again that AVX2 and FMA are not compiled in.
Test Summary: | Pass Total
shape_inference | 255 255
2018-06-08 09:55:41.794208: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
TensorBoard 1.8.0 at http://linux-k18k.suse:6006 (Press CTRL+C to quit)
Test Summary: |
show | No tests
This may or may not be a contradiction in the messages from TensorFlow. Given a tensorflow.so library file, is there a way to confirm independently whether the AVX / FMA instructions were successfully compiled in?
Edit1: Ok, so I found objdump and verified that some opcodes for AVX2 are in fact included in the .so library. This issue seems to involve tensorboard rather than tensorflow, but I don't qualify to add a tag for tensorboard (can someone help?). I'm wondering if the standalone tensorboard is pointed at the right libtensorflow? If it is getting information from another version, this might explain why it thinks that the codes for AVX2 are missing.
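For anyone who wants to run the same check, this is the kind of objdump invocation I mean (the library path is whatever your build produced; vfmadd* mnemonics are FMA instructions, and ymm registers indicate AVX/AVX2 code):

objdump -d /path/to/tensorflow.so | grep -cE 'vfmadd|ymm'

A nonzero count means the compiler emitted at least some AVX/FMA instructions into the binary.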
This is now resolved. For me the confusing thing was that it was tensorboard generating the message, not (as I thought) tensorflow itself. Tensorflow was quiet because it saw a valid binary capable of AVX2 and FMA, but tensorboard was doing a separate check which failed, at least in version 1.8. Tensorboard in fact does not do anything requiring AVX2 or FMA so the issue can be safely ignored. Version 1.9 of tensorflow/tensorboard now assesses AVX2 and FMA capability correctly and does not generate the warning message.
I have many big deep learning tasks in Python 3.6 ahead and wanted to build TensorFlow (CPU only) from source, as my 13-inch MacBook Pro with Touch Bar noted that TensorFlow would run faster if it were built with SSE4.1, SSE4.2, AVX, AVX2 and FMA support. There are quite a lot of questions on Stack Overflow and GitHub regarding that topic and I have read them all. None of them addresses why it is not working for me.
I strictly followed the instructions provided by https://www.tensorflow.org/install/install_sources
My configure looks like this:
./configure
Please specify the location of python. [Default is /anaconda/bin/python]: /anaconda/python.app/Contents/MacOS/python
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]:
Do you wish to build TensorFlow with Google Cloud Platform support? [y/N] n
No Google Cloud Platform support will be enabled for TensorFlow
Do you wish to build TensorFlow with Hadoop File System support? [y/N] n
No Hadoop File System support will be enabled for TensorFlow
Do you wish to build TensorFlow with the XLA just-in-time compiler (experimental)? [y/N] n
No XLA JIT support will be enabled for TensorFlow
Do you wish to build TensorFlow with VERBS support? [y/N] n
No VERBS support will be enabled for TensorFlow
Found possible Python library paths:
/anaconda/python.app/Contents/lib/python3.6/site-packages
Please input the desired Python library path to use. Default is [/anaconda/python.app/Contents/lib/python3.6/site-packages]
Using python library path: /anaconda/python.app/Contents/lib/python3.6/site-packages
Do you wish to build TensorFlow with OpenCL support? [y/N] n
No OpenCL support will be enabled for TensorFlow
Do you wish to build TensorFlow with CUDA support? [y/N] n
No CUDA support will be enabled for TensorFlow
INFO: Starting clean (this may take a while). Consider using --async if the clean takes more than several minutes.
Configuration finished
With bazel 0.4.5, I then try to do the build as in the instructions:
bazel build --config=opt //tensorflow/tools/pip_package:build_pip_package
This executes without error, but it gives literally hundreds of warnings. I could provide some as examples, but there are hardly any snippets that compile without a warning.
I appreciate any help, thank you all very much.
Unfortunately compiler warnings are a fact of life. However, many of these come from external libraries which are pulled into the build. These can be filtered out with the "output_filter" argument to Bazel:
bazel build --config=opt --output_filter='^//tensorflow' //tensorflow/tools/pip_package:build_pip_package
This limits output to warnings generated by TensorFlow code (you can also turn warnings off entirely this way, but that takes all the fun out of compiling). Since the tooling used to build matches what TensorFlow is developed with more closely, there are fewer warnings (I get some about multi-line comment continuations, a bunch of signed/unsigned integer comparisons, and some about variables which "may" be uninitialized).
None of these indicate definite bugs, just patterns of code which are sometimes bug-prone. If the compiler knew something was wrong, it would emit an error instead. Which is a long way of saying there's nothing to worry about.
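If you ever do want warnings off entirely, I believe the convention suggested in Bazel's documentation is a filter that matches no target:

bazel build --config=opt --output_filter=DONT_MATCH_ANYTHING //tensorflow/tools/pip_package:build_pip_package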
When I use the following command after importing TensorFlow in Python 2.7:
sess = tf.Session()
Warnings/errors:
tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow
library wasn't compiled to use SSE4.2 instructions, but these are
available on your machine and could speed up CPU computations.
2017-02-02 00:41:48.616602: W
tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow
library wasn't compiled to use AVX instructions, but these are
available on your machine and could speed up CPU computations.
2017-02-02 00:41:48.616614: W
tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow
library wasn't compiled to use AVX2 instructions, but these are
available on your machine and could speed up CPU computations.
2017-02-02 00:41:48.616624: W
tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow
library wasn't compiled to use FMA instructions, but these are
available on your machine and could speed up CPU computations.
Please help me fix this so I can use my machine at its full power.
Those warnings are just saying that if you build TensorFlow from source, it can run faster on your machine. There is no fix, as this is not an issue but intended behavior: the message exists to give users this information.
Those CPU instructions are not enabled by default in order to provide broader compatibility with most machines.
As the docs say:
TensorFlow checks on startup whether it has been compiled with the optimizations available on the CPU. If the optimizations are not included, TensorFlow will emit warnings, e.g. AVX, AVX2, and FMA instructions not included.
For all details on that see the Performance Guide.
These warnings are telling you that the compiled code does not use these instructions, which your CPU has but many CPUs out there do not. When maintainers compile code for distribution, they need to build it so that it supports the majority of CPUs out there, which means they tell the compiler not to use architecture-specific instructions.
If you want the package to use all the instructions your CPU has, you need to compile it yourself, or, as it's called, install from source. You can find documentation about how to do that here, and once you're comfortable compiling TensorFlow from source, you should go and read the performance-specific instructions.
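For reference, the invocation typically used for such a build looks like this (a sketch; the --copt list should mirror whichever instruction sets your warnings name, and the resulting binary will only run on CPUs that support them):

bazel build -c opt --copt=-msse4.2 --copt=-mavx --copt=-mavx2 --copt=-mfma //tensorflow/tools/pip_package:build_pip_package

Alternatively, --copt=-march=native lets the compiler target exactly the machine you are building on.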
However, at the end of the day, for real-world applications you might really need a GPU. It is true that these CPU instructions give you a bit of a performance boost, but that is not comparable to using a GPU.