Is there a way to show where LLVM is auto-vectorising?

Context: I have several loops in an Objective-C library I am writing which deal with processing large text arrays. I can see that right now it is running in a single threaded manner.
I understand that LLVM is now capable of auto-vectorising loops, as described at Apple's session at WWDC. It is however very cautious in the way it does it, one reason being the possibility of variables being modified due to CPU pipelining.
My question: how can I see where LLVM has vectorised my code, and, more usefully, how can I receive debug messages that explain why it can't vectorise my code? I'm sure if it can see why it can't auto-vectorise it, it could point that out to me and I could make the necessary manual adjustments to make it vectorisable.
I would be remiss if I didn't point out that this question has been more or less asked already, but quite obtusely, here.

Identifies loops that were successfully vectorized:
clang -Rpass=loop-vectorize
Identifies loops that failed vectorization and indicates if vectorization was specified:
clang -Rpass-missed=loop-vectorize
Identifies the statements that caused vectorization to fail:
clang -Rpass-analysis=loop-vectorize
Source: http://llvm.org/docs/Vectorizers.html#diagnostics
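To try these flags on something small, here is a minimal sketch in C. The file name vec_demo.c and both function names are made up for illustration and are not from the question. The first loop has independent iterations and restrict-qualified pointers, so it is a plausible candidate for vectorization; the second carries a dependence from one iteration to the next, so a missed remark would be expected. Whether a remark is actually printed depends on the clang version, the target, and the optimization level (the vectorizer generally only runs with optimization enabled, e.g. -O2).

```c
/* vec_demo.c (hypothetical file name) */
#include <assert.h>
#include <stddef.h>

/* Independent iterations: each dst[i] depends only on dst[i] and src[i].
   restrict tells the compiler the two arrays do not overlap. */
void scale_add(float *restrict dst, const float *restrict src, float k, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; ++i)
        dst[i] = dst[i] + k * src[i];
}

/* Loop-carried dependence: a[i] needs the a[i - 1] written by the
   previous iteration, which normally prevents straightforward vectorization. */
void prefix_sum(float *a, size_t n)
{
    assert(a != NULL);
    for (size_t i = 1; i < n; ++i)
        a[i] += a[i - 1];
}
```

Compiling with something like clang -O2 -Rpass=loop-vectorize -Rpass-missed=loop-vectorize -Rpass-analysis=loop-vectorize -c vec_demo.c should then print one or more remarks per loop, each tagged with the source line it refers to.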

The standard llvm toolchain provided by Xcode doesn't seem to support getting debug info from the optimizer. However, if you roll your own llvm and use that, you should be able to pass flags as mishr suggested above. Here's the workflow I used:
1. Using homebrew, install llvm
brew tap homebrew/versions
brew install llvm33 --with-clang --with-asan
This should install the full and relatively current llvm toolchain. It's linked into /usr/local/bin/*-3.3 (i.e. clang++-3.3). The actual on-disk location is available via brew info llvm33 - probably /usr/local/Cellar/llvm33/3.3/bin.
2. Build the single file you're optimizing, with homebrew llvm and flags
If you've built in Xcode, you can easily copy-paste the build parameters, and use your clang++-3.3 instead of Xcode’s own clang.
Appending -mllvm -debug-only=loop-vectorize will get you the auto-vectorization report. Note: this will likely NOT work with any remotely complex build, e.g. if you've got PCHs, but it is a simple way to tweak a single cpp file to make sure it's vectorizing correctly.
3. Create a compiler plugin from the new llvm
I was able to build my entire project with homebrew llvm by:
Grabbing this Xcode compiler plugin: http://trac.seqan.de/browser/trunk/util/xcode/Clang%20LLVM%20MacPorts.xcplugin.zip?order=name
Modifying the clang-related paths to point to my homebrew llvm and clang bin names (by appending '-3.3')
Placing it in /Library/Application Support/Developer/5.0/Xcode/Plug-ins/
Relaunching Xcode should show this plugin in the list of available compilers. At this point, the -mllvm -debug-only=loop-vectorize flag will show the auto-vectorization report.
I have no idea why this isn't exposed in the Apple builds.
UPDATE: This is exposed in current (8.x) versions of Xcode. The only thing required is to enable one or more of the loop-vectorize flags.

Assuming you are using opt and you have a debug build of llvm, you can do it as follows:
opt -O1 -loop-vectorize -debug-only=loop-vectorize code.ll
where code.ll is the IR you want to vectorize.
If you are using clang, you will need to pass the -debug-only=loop-vectorize flag through to LLVM using the -mllvm option.

Related

CMake idiom for overcoming libstdc++ filesystem weirdness?

If you build C++14 code with G++ and libstdc++, there's a library named libstdc++fs, which is separate from the rest of libstdc++, and contains the code for std::experimental::filesystem. If you don't link against it, you'll get undefined references.
The "trick" I'm using for overcoming this right now is:
if ("${CMAKE_CXX_COMPILER_ID}" STREQUAL "GNU")
set(CXX_FILESYSTEM_LIBRARIES "stdc++fs")
endif()
and later:
target_link_libraries(my_target PUBLIC ${CXX_FILESYSTEM_LIBRARIES})
but - I don't like having to place this code in every project I work on. Is there a simpler or more standard idiom I could use? Some way this will all happen implicitly perhaps, with some CMake behind-the-scenes magic?
tl;dr: Nothing right now, wait for a newer CMake version
As @Pedro graciously points out, this is a known problem, and there is an open issue about it at Kitware's GitLab site for CMake:
Portable linking for C++17 std::filesystem
If using CMAKE_CXX_STANDARD=17 and std::filesystem, GCC requires linking of an extra library: stdc++fs. ... If C++17 is enabled, would it be worth automatically linking to stdc++fs for GCC versions which require this? Likewise for any quirks in other compilers or libraries.
The Kitware issue is about C++17, for which apparently you still need the separate extra library (i.e. it's not just because of the "experimentality" in C++14). Hopefully we'll see some traction on this matter - but
Note: If you're experiencing this problem with C++17's std::filesystem, you're in luck - that code is built into libstdc++ beginning with GCC 9, so if you're using g++ 9 or later, and std::filesystem, you should no longer experience this problem.

Bazel build using different compiler

How can I specify the compiler for Bazel to use? I see the --compiler option here, but no explanation of its use.
I have read about making new toolchains, but it appears that it is per project or something. For Tensorflow in particular, I want to use an icecc install I have on my machines so I can distribute the build.
For a wrapper around gcc, doing export CC=/path/to/icecc should just work and start using icecc with bazel 0.4.5. If icecc requires special environment variables, you might have to add --action_env flags.
Note that Bazel was created to run with the Google compilation cluster and as a consequence separates each compilation action, which might interact badly with icecc's assumptions.

cmake: How to debug bad flags

I am currently having a bear of a time trying to compile a moderate sized library with a brand new toolchain, Assimp on Xcode6 with the new iOS 8.0 SDK.
Bundled with the project are various scripts and Xcode projects that have configurations for building on iOS, but unfortunately none of them work out of the box.
So far the farthest I have gotten is by using a build script which uses the cmake "Unix Makefiles" method to assemble static libs. Another method would be using cmake to generate Xcode projects to build with. I tried that too, to no avail, and the Xcode project that comes with the project in the repository did not work either (I later learned it was marked deprecated in one of the readme files).
Okay, so with this "Unix Makefiles" cmake script I have been able to generate some of the static libs (after manually forcing static lib generation inside the main CMakeLists.txt), but when it went on to build for i386 and x86_64 architectures for iPhoneSimulator it kept pulling in the headers for iOS which caused a torrent of compiler errors.
Luckily I followed a hunch and found assimp/code/CMakeFiles/assimp.dir/flags.make which is one of the cmake-generated files, and lo and behold, the entire cflags was in here, and once I removed the rogue header include path, the make call finally succeeds and I have my iPhoneSimulator static lib!
Okay so the question that I have is basically where do I get started when debugging these frustrating cmake problems. My relationship with cmake has always been a strained one because none of cmake's complexity and design principles ever made sense to me, and very infrequent are the times when cmake builds work for me out of the box... it is always something that almost works but then I have to spend hours debugging with make VERBOSE=1 and then haphazardly poking at generated files, which are of course all marked with warnings to not edit them as they are generated files.
I realize that some of the variables here are perhaps relevant to my troubles. But it isn't clear to me how I can debug these variables. Where do I go to print out these variables so that I can find which variable contains erroneous values? For example, in this most recent situation I had a -I flag that was cropping up in the wrong place. Luckily I was able to find a file that contained it using various large-hammer methods that involve grep but I am not close to actually fixing the build configuration to make the process any less painful in the future.
For complex CMakeLists.txt files I have found the variable_watch command can sometimes be useful (documentation here). It doesn't make it easy, but gives you another level of information.

Build and link µIP library with no OS

I'm relatively new to embedded development and I have a question, or more of a request for feedback, on building and linking the µIP library on an embedded device. For what it's worth, the following is using a FOX G20 V board with an ATMEL AT91SAM9G20 processor with no OS.
I have done some research, and the way I see myself building and linking the library on the board is one of the following two options.
Option 1: The first option would be to compile the whole library (the .c files) in order to have a built static library in the form of a .a file. Then, I can link the created static library with my application code, before loading it on the device. Of course, the device driver will have to be programmed in order to allow the library to work on the platform (help was found here). This first option is using a Linux machine. For this first option as well, in order to load the static library linked with my application code, do I do so with an "scp"?
Option 2: The second option would be to compile and link the library to my application code directly without going through an intermediate static library. However, since my platform does not contain an OS, I would need to install an appropriate GCC compiler in order to compile and link (if anyone has any leads for such an installation, that would be very helpful as well). I'm quite unfamiliar with the second option, but I've been told that it is easier to implement, so if anyone has an idea on how to implement it, it would be very helpful.
I would appreciate some feedback along with the answers as to whether these options seem correct to you, and to be sure that I have not mentioned something that is false.
There is no real difference between these options. In any case, the host toolchain is responsible for creating a binary file that contains a fully linked executable with no external dependencies, so you need a cross compiler either way, and it is indeed easiest to just compile uIP along with the rest of the application.
The toolchain will typically have a cross compiler (if you use gcc, it should be named arm-eabi-gcc or arm-none-eabi-gcc), cross linker (arm-eabi-ld), cross archiver (arm-eabi-ar) etc. You would use these instead of the native tools. For Debian, you can find a cross compiler for ARM targets without an OS in testing/unstable.
Whether you build a static library
arm-eabi-gcc -c uip.c
arm-eabi-ar cru uip.a uip.o
arm-eabi-ranlib uip.a
arm-eabi-gcc -o executable application.c uip.a
or directly link
arm-eabi-gcc -c application.c
arm-eabi-gcc -c uip.c
arm-eabi-gcc -o executable application.o uip.o
or directly compile and link
arm-eabi-gcc -o executable application.c uip.c
makes no real difference.
If you use an integrated development environment, it is usually easiest to just add uip.c as a source file.

Build System and portability

I'm wondering how I can make a portable build system (step-by-step). I currently use cmake because it was easy to set up in the first place, with only one arch target, but now that I have to package the library I'm developing, I'm wondering what the best way is to make it portable across the archs I'm testing.
I know I need a config.h to define things depending on the arch but I don't know how automatic this can be.
Any other suggestions for a build system are warmly welcome!
You can just use CMake, it's pretty straightforward.
You need these things:
First, means to find out the configuration specifics. For example, if you know that some function is named differently on some platform, you can use TRY_COMPILE to discover that:
TRY_COMPILE(HAVE_ALTERNATIVE_FUNC
${CMAKE_BINARY_DIR}
${CMAKE_SOURCE_DIR}/alternative_function_test.cpp
CMAKE_FLAGS -DINCLUDE_DIRECTORIES=xxx
)
where alternative_function_test.cpp is a file in your source directory that compiles only with the alternative definition.
This sets the variable HAVE_ALTERNATIVE_FUNC to TRUE if the compile succeeds, and to FALSE otherwise.
Second, you need to make this definition affect your sources. Either you can add it to compile flags
IF(HAVE_ALTERNATIVE_FUNC)
ADD_DEFINITIONS(-DHAVE_ALTERNATIVE_FUNC)
ENDIF(HAVE_ALTERNATIVE_FUNC)
or you can make a config.h file. Create config.h.in with the following line
#cmakedefine HAVE_ALTERNATIVE_FUNC
and create a config.h file by this line in CMakeLists.txt (see CONFIGURE_FILE)
CONFIGURE_FILE(config.h.in config.h @ONLY)
the #cmakedefine will be translated to #define or #undef depending on the CMake variable.
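On the source side, the generated macro is typically consumed with a plain #ifdef. The following is a minimal sketch in C, not part of the original answer: the wrapper name bounded_len is hypothetical, and strnlen merely stands in for "a function that exists only on some platforms". The include of config.h is left as a comment so the sketch compiles on its own; without it the macro is undefined and the fallback branch is used.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* In a real build this would be:
       #include "config.h"
   i.e. the file produced by CONFIGURE_FILE. It is omitted here (assumption)
   so that the sketch is self-contained. */

/* Hypothetical portable wrapper: length of s, but never more than max. */
size_t bounded_len(const char *s, size_t max)
{
    assert(s != NULL);
#ifdef HAVE_ALTERNATIVE_FUNC
    /* The configure step found the platform function, so use it. */
    return strnlen(s, max);
#else
    /* Fallback for platforms where the check failed. */
    size_t n = 0;
    while (n < max && s[n] != '\0')
        ++n;
    return n;
#endif
}
```

Keeping the #ifdef inside one small wrapper means the rest of the code base calls bounded_len and never needs to know which branch was selected.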
BTW, for testing endianness, see this mail
I have been using the GNU autoconf/automake toolchain, which has worked well for me so far. I am only really focussed on Linux/x86 (32- and 64-bit) and the Mac; the latter matters if you are building on a PowerPC, due to endian issues.
With autoconf you can check the host platform with the macro:
AC_CANONICAL_HOST
And check the endianness using:
AC_C_BIGENDIAN
Autoconf will then add definitions to config.h which you can use in your code.
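As a rough illustration of using those definitions: AC_C_BIGENDIAN defines WORDS_BIGENDIAN on big-endian hosts, and code can branch on it. The sketch below is in C and all function names in it are hypothetical. The include of config.h is left as a comment so the sketch compiles standalone, which means that as written it silently assumes a little-endian host; in a real project the generated header decides.

```c
#include <assert.h>
#include <stdint.h>

/* In a real autoconf project this would be:
       #include "config.h"
   Omitted here (assumption), so WORDS_BIGENDIAN is undefined and the
   little-endian branches below are the ones that get compiled. */

/* Runtime probe: look at the first byte of a known 16-bit value. */
int host_is_big_endian(void)
{
    const uint16_t probe = 0x0102;
    return *(const unsigned char *)&probe == 0x01;
}

/* Reverse the byte order of a 32-bit value. */
uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0xFF000000u) >> 24);
}

/* Convert a host-order value to big-endian (network) order,
   selected at compile time from the configure result. */
uint32_t host_to_be32(uint32_t v)
{
#ifdef WORDS_BIGENDIAN
    return v;
#else
    return swap32(v);
#endif
}

/* Optional sanity check: the configure-time answer should agree
   with what the running machine reports. */
void check_endianness_config(void)
{
#ifdef WORDS_BIGENDIAN
    assert(host_is_big_endian());
#else
    assert(!host_is_big_endian());
#endif
}
```

Calling check_endianness_config() once at startup is a cheap way to catch a config.h that was generated for a different machine, for example after cross-compiling with the wrong settings.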
I am not certain (have never tried) how well the GNU autotools work on Windows, so if Windows is one of your targets then you may be better off finding similar functionality with your existing cmake build system.
For a good primer on the autotools, have a look here:
http://www.freesoftwaremagazine.com/books/autotools_a_guide_to_autoconf_automake_libtool