I'm trying to build the ArrayFire examples, and everything goes well until I get to the CUDA ones. They are supposed to be skipped, since I have an AMD processor/GPU. However, during the build process the CUDA section is built anyway, failing for obvious reasons and interrupting the rest of the process.
I could manually change the CMakeLists.txt files. However, is there a higher-level way to let the build system (CMake) know that I do not have a CUDA-compatible GPU?
It looks like the ArrayFire_CUDA_FOUND and CUDA_FOUND macros are erroneously defined on my system.
The ArrayFire CMake build provides a flag to disable the CUDA backend. Simply set AF_BUILD_CUDA to NO via -DAF_BUILD_CUDA=NO at the command line to disable CUDA.
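For example, from a fresh build directory (the source path here is just a placeholder):

cmake -DAF_BUILD_CUDA=NO /path/to/arrayfire
cmake --build .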
I'm attempting to build TensorFlow's C++ library for Windows XP. While I've been able to build and use it on Windows 10, 32-bit XP isn't working. The background: I'm working on a COM module that calls functions from tensorflow.dll. My build environment:
Visual Studio 2017 15.7
CMake 3.11.1
TensorFlow 1.8
Windows 10
The sequence I use to build tensorflow.dll is:
Open "x64_x86 Cross Tools Command Prompt for VS 2017"
Try to force the use of functions available in Windows XP: set CXXFLAGS=/D_WINVER=0x0501 /D_WIN32_WINNT=0x0501
Add Git to path: set PATH=%PATH%;C:\Program Files (x86)\Git\bin
Fix CMake file for converting *.proto files to *.pb.h files as described here.
Configure CMake: cmake .. -A Win32 -T v141_xp,host=x64 -DCMAKE_SYSTEM_VERSION=7.0 -DCMAKE_BUILD_TYPE=Release -DPYTHON_EXECUTABLE=C:\Users\williams\AppData\Local\Continuum\Anaconda3\envs\tensorflow\python.exe -Dtensorflow_BUILD_SHARED_LIB=ON -Dtensorflow_BUILD_PYTHON_BINDINGS=OFF -Dtensorflow_WIN_CPU_SIMD_OPTIONS="/arch:IA32"
Build: cmake --build . --target tensorflow --config Release -- /fileLogger /m:1 /p:CL_MPCount=1
The last step also involves some manual labour, as the build process doesn't copy .lib files from the third-party dependencies to where they are needed. For whatever reason, a bunch of INSTALL projects never get run, so I had to do that manually each time the build failed while looking for a missing .lib file. Once that was done, the build completed successfully.
Next I copy my COM module (a DLL) and the TensorFlow DLL over to a Windows XP virtual machine for testing and try to register the COM module, but get the error LoadLibrary("MyDLL.dll") - The specified procedure could not be found. I don't know which procedure it is looking for, so the best I can offer is that Dependency Walker highlights WS2_32.DLL and reports that it can't find inet_ntop and inet_pton.
Any suggestions on how to build TensorFlow so that it doesn't use these two functions?
P.S. Suggestions of "stop using XP, it's old and no longer supported" don't help here. Upgrading to Windows 10 is an absolute last resort because of the disruption it would cause at the facility where this software will be tested.
Edit 1:
These two functions, inet_pton and inet_ntop, were only used in one file, which forms part of Google Cloud Storage support in TensorFlow. The build process generated a tensorflow_static.lib in addition to tensorflow.dll. Linking against the static version and adding a few dependencies that aren't included in tensorflow_static.lib got rid of the code using the inet_* functions.
My COM module still isn't working on Windows XP, though, because the file tensorflow\core\platform\windows\env.cc uses functions like CloseThreadpoolWork, SubmitThreadpoolWork, etc. that were only introduced in Windows Vista. It looks like I'll have to replace them with something else, as I don't see an alternative implementation in TensorFlow.
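As a first idea, QueueUserWorkItem has been available since Windows 2000, so an XP-safe replacement shim could look roughly like this (a sketch only; names are illustrative, and it ignores the cleanup semantics of CloseThreadpoolWork):

#include <windows.h>

/* Sketch of an XP-safe substitute for SubmitThreadpoolWork:
   QueueUserWorkItem exists since Windows 2000. */
static DWORD WINAPI RunTask(LPVOID context)
{
    /* ... do what the threadpool callback used to do ... */
    return 0;
}

static BOOL ScheduleTask(void *context)
{
    /* WT_EXECUTEDEFAULT queues the item to a non-I/O worker thread */
    return QueueUserWorkItem(RunTask, context, WT_EXECUTEDEFAULT);
}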
Additionally, I found that tensorflow\contrib\cmake\CMakeLists.txt forces _WIN32_WINNT=0x0A00 and that CXXFLAGS is the wrong environment variable to use. Changing it to CMAKE_CXX_FLAGS at least gets my macro definitions included, FWIW.
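For anyone following along, that means passing the defines on the configure line like this (only the relevant part is shown; the _WIN32_WINNT forced in contrib\cmake may still override them):

cmake .. -DCMAKE_CXX_FLAGS="/D_WINVER=0x0501 /D_WIN32_WINNT=0x0501" [other options as above]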
It is almost impossible to port TensorFlow to Windows XP, because:
TF's platform-dependent code requires some Windows APIs newer than XP, such as the Thread Pool API. This could possibly be bypassed by using third-party thread pool libraries.
nsync, protobuf, and Eigen, which are core parts of TF, use C++11 thread_local, which makes them unable to be loaded at run time as a DLL; see https://learn.microsoft.com/en-us/cpp/parallel/thread-local-storage-tls?view=vs-2017 for details. This latter feature can theoretically be replaced by the old Windows TLS API, but that would require many modifications to TF's core framework.
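For reference, the old TLS API that could in theory replace thread_local looks like this (illustrative only; wiring it through TF's core is the hard part):

#include <windows.h>

/* Emulating a thread_local pointer with the classic TLS API,
   which works in run-time-loaded DLLs on pre-Vista Windows. */
static DWORD tls_index;

void tls_init(void)       { tls_index = TlsAlloc(); }    /* once at startup  */
void tls_set(void *value) { TlsSetValue(tls_index, value); }
void *tls_get(void)       { return TlsGetValue(tls_index); }
void tls_shutdown(void)   { TlsFree(tls_index); }        /* once at shutdown */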
Anyway, if you really need XP support, good luck with that.
In the end I gave up on this as simply being impossible. Even replacing the thread pool functions with something from Boost didn't help. If someone else manages to get this working, I'll gladly accept that as the answer, but so far this looks impossible.
I am wondering if there is a definitive recipe for using CMake to build TensorFlow and TensorFlow-based apps. I followed the instructions at https://github.com/cjweeks/tensorflow-cmake without much success and ended up having to build Eigen and Protobuf by hand and then copy the relevant header files into the header file tree created by the Bazel build of TensorFlow.
I just built TF with CMake, VS2017, and CUDA 9.2, but had to make two manual changes:
Patch Half.h in Eigen
Change CUDA version from "9.0" to "9.2" in the main CMakeLists.txt.
The build has to be single-threaded, otherwise VS runs out of heap (on my 16GB laptop). It takes a while and one project fails, but it builds enough libraries to run all the examples I wanted.
Another problem with the CMake build, vs. Bazel, is that the former rebuilds a bunch of projects (involving protobuf-generated files) even when nothing in them changes. Bazel is smarter and only compiles the changed code, then statically links all object files into a single executable, which is still faster than the CMake build.
TensorFlow 1.0 introduced XLA support, which includes JIT compilation and AOT compilation. For JIT compilation I found a Python test script with which it can be unit-tested. However, I've not found any Python test for AOT compilation. There are Bazel tests, though, which can be run on the source tree.
TensorFlow's page https://www.tensorflow.org/performance/xla/tfcompile provides information on how to test. But tfcompile does not make it into TensorFlow's distribution content. I may be wrong here, but I could not see tfcompile anywhere in the directory where the TF distribution is installed.
Could anyone please help me understand how to test AOT compilation with the existing distribution content, or do I need to tweak something in the code to allow the AOT stuff to go into the distribution?
Thanks in advance.
I know you're asking specifically about AOT, but I recommend you first read this page: https://www.tensorflow.org/performance/xla/
And then read this one: https://www.tensorflow.org/performance/xla/jit
In particular, note that XLA is not included in our binary distributions; you must build from source at the moment. Note that you must pick "enable XLA" when you run ./configure in order for XLA support to be enabled.
Once you've done that, Yaroslav Bulatov's advice is correct; you can build the binaries yourself, or run the tests via bazel.
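If it helps, once XLA is enabled the AOT (tfcompile) tests can be run from the source checkout with something like the following (the exact target path is an assumption based on where the AOT code lives in the tree):

bazel test //tensorflow/compiler/aot/...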
I'm doing some experiments with an evaluation version of the WindRiver dcc diab compiler. I would like to do some testing on my Windows PC.
However, I think I have the wrong target setup.
I've got as far as using the 'dctrl -t' command to get the list of target architectures, but selecting options so far hasn't produced anything I can run on Windows.
I'm simply doing:
dcc main.c -o main.exe
Am I missing a step?
Do I have the wrong target?
Or is it simply not possible to create Windows binaries?
I believe that the Diab compiler targets a free-standing environment, so it would not produce a Windows executable. Moreover, x86 is not a supported target processor in any case; see the product brief.
The compiler is intended for use with VxWorks, though can be separately licensed. The toolchain includes an instruction-set simulator for executing target code in a simulated environment, and if you are using VxWorks, that includes a VxWorks simulator.
If you want to build your code as a native Windows application, you will have to use a Windows-targeted compiler. I suggest MinGW/GCC, since WindRiver supports both their own WindRiver/Diab compiler and GCC for VxWorks development, and the two share a great deal of commonality with respect to compiler switches and extension syntax.
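For example, with MinGW on your PATH, the native-Windows equivalent of your dcc invocation is simply:

gcc main.c -o main.exe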
I am new to writing kernel modules, so I am facing a few non-technical problems.
Since building a kernel module for a specific kernel version (say 3.0.0-10, where 10 is the patch number) requires the kernel headers for that same version, the straightforward approach is to install those headers and start development there.
But the kernel headers for the patched kernel version are not available.
I have a guest kernel vmlinuz-3.0.0-10 running on the machine, and when I try to download the matching kernel headers I am told they are not found.
Another approach is to get the source for that specific kernel, but the problem is the same: the source for the patched kernel is not available (it is not necessarily possible to get the sources of linux-kernel-3.0.0-10, or even linux-kernel-3.0.0 plus the 10th patch). In some situations it is possible to get the source of the running kernel, but not always.
Yet another approach is to build a kernel other than the running one and place the built kernel on the machine. But that requires building all the modules of that kernel, which is a time-consuming and space-consuming process.
So my intention in asking this is to find out the preferences of kernel driver developers. Are there other alternatives?
Is it possible to compile a kernel module against one version and run it on another? (It will normally give an error, but are there any workarounds?)
So, building a new kernel is not a good option, as it will require:
building kernel
building modules and firmware
building headers
moving all of the above to the appropriate locations (if your target machine is not the same one on which you are developing the module)
So if you have the kernel headers for the running system, you don't need to download the source code for any kernel version; while building the module, use
make -C /lib/modules/$(uname -r)/build M=$(pwd) modules
and your module will be ready.
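For completeness, a minimal sketch of the out-of-tree Makefile that command expects, assuming the module source is in a file named mymodule.c (the name is hypothetical):

obj-m += mymodule.o

all:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules

clean:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean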
If better answers come along, I will not hesitate to accept one of them.
I know it's been a long time since this question was asked. I am new to kernel development. I also encountered the same error, but now I am able to load my module into a kernel different from the one I built it against. Here is the solution:
Download the kernel-devel package related to the image you are running. Its version should be as close as possible to that of the running kernel.
Check that the functions you are using in the module are covered by the header files in that kernel-devel package.
Change the UTS_RELEASE value in the include/generated/utsrelease.h file to the version of the kernel image running on your hardware (see the snippet after this list).
Compile the module using this kernel tree.
Now you can insert your module into the kernel.
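For the utsrelease.h step, the file contains a single macro definition; assuming the target machine runs the 3.0.0-10 kernel mentioned in the question, the edited line would read:

#define UTS_RELEASE "3.0.0-10"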
Note: it may cause some unwanted events to happen, as Shahbaz mentioned above. But if you are doing this just for experiments, I think it's good to go. :)
There is a way to build a module on one kernel and insert it into another. It is by turning off a certain configuration option. I am not telling you which one, because this is ABSOLUTELY DANGEROUS. The reason is that there may be changes between the kernels that could cause your module to behave differently, often resulting in a total freeze.
What you should do is build the module against an already-built kernel (or at least a configured one). If you have a patched kernel, the best thing you can do is build that kernel and boot your OS with it.
I know this is time-consuming. I have done it many, many times and I know how boring it can get, but once you do it right, it makes your life much easier. Kernel compilation takes about two hours or so, but you can parallelize it if you have a multi-core CPU (see the example below). Also, you can always start the compilation before you leave the office (or, at home, before going to bed) and let it work overnight.
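For example, from the kernel source tree, something like the following uses all available cores (the -j value is simply whatever matches your machine):

make -j$(nproc)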
In short, I strongly recommend that you build the kernel you are interested in yourself.