Nvidia OptiX: rtProgramCreateFromPTXFile or rtProgramCreateFromPTXString returns RT_ERROR_INVALID_SOURCE? - optix

RTresult code = rtProgramCreateFromPTXString(context, pBuf, "draw_solid_color", &ray_gen_program);
The result code is RT_ERROR_INVALID_SOURCE.
My project generates the PTX file, and CUDA and OptiX are configured.
How can I resolve this problem?

Since these steps solved your issue, I'm making this an answer:
Make sure the string pBuf points to is well formed and null-terminated.
Check the PTX bitness (it should probably be 64 from now on).
Make sure there is no debug info in the PTX.
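Of the three checks, the bitness one can be verified directly from the generated file; a minimal sketch, where the here-doc stands in for the PTX your build actually emits (the header values shown are just example output from nvcc):

```shell
# Stand-in for the generated PTX file; the header lines are what matter.
cat > sample.ptx <<'EOF'
.version 4.2
.target sm_30
.address_size 64
EOF
# ".address_size 64" must match a 64-bit host application; feeding 32-bit
# PTX to a 64-bit OptiX build is one cause of RT_ERROR_INVALID_SOURCE.
grep '^\.address_size' sample.ptx
# prints: .address_size 64
```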


cmake -D CMAKE_CXX_FLAGS="-march=armv8-a" for aarch64 compiling

I need to adapt a series of programs and scripts written for the Raspberry Pi (1st gen, which ran an ARM11 CPU) to run on an Allwinner H6-based board (an ARM Cortex-A53).
I have already changed CMAKE_SYSTEM_PROCESSOR from armv7l to aarch64.
To launch the build, I previously used the command
cmake -D CMAKE_CXX_FLAGS="-march=armv7-a" /..path
and I thought of substituting -march=armv7-a with -march=armv8-a.
My doubts are: is this correct for compiling for the 64-bit Allwinner H6? Why can't I put aarch64 directly instead of armv8-a? And finally, what is the difference between "armv8" and "armv8-a"?
Sorry, I am a little bit confused here.
1) Yes, -march=armv8-a would be correct, but less specific than, say, adding -mtune=cortex-a53, since the Allwinner H6 is a Cortex-A53.
My guess is that you cannot put -march=aarch64 instead of -march=armv8-a because that would be too generic: after all, you can already specify ‘armv8-a’, ‘armv8.1-a’, ‘armv8.2-a’, ‘armv8.3-a’, ‘armv8.4-a’ and ‘armv8.5-a’, as documented in the GCC manual.
armv8 is the umbrella name for ARMv8-A, ARMv8-R and ARMv8-M. A, R and M are 'profiles' in Arm terminology, targeting different types of applications: A for application processors, R for real-time systems, and M for microcontrollers.
See the Arm architecture documentation for more details.
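Concretely, the two flags can be combined in the same CMake invocation; a sketch, with the source path as a placeholder:

```shell
# -march selects the architecture level; -mtune schedules code for the
# specific core. "/path/to/source" is a placeholder for your project dir.
cmake -D CMAKE_CXX_FLAGS="-march=armv8-a -mtune=cortex-a53" /path/to/source
```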

Patching AIX binary

I am attached to a running process using dbx on AIX. There is a bug in the program: the offset in the opcode below is 0x9b8, but it should be 0xbe8:
(dbx) listi 0x100001b14
0x100001b14 (..........+0x34) e88109b8 ld r4,0x9b8(r1)
I am able to fix it with the command below:
(dbx) assign 0x100001b14 = 0xe8810be8
but that affects only the running process and its memory. How can I change the on-disk binary? I am not able to locate the pattern e88109b8 in the binary file;
otherwise I would use e.g. the dd utility to patch it.
Best regards,
Pavel Filipensky
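One possible approach, sketched with GNU tools on a small stand-in file rather than a real XCOFF binary (on AIX you may need GNU grep from the toolbox packages, or perl): PowerPC is big-endian, so the instruction word e88109b8 is stored on disk as the byte sequence e8 81 09 b8, which can be located by byte offset and overwritten in place.

```shell
# Stand-in for the real executable: a file containing just the buggy word.
printf '\xe8\x81\x09\xb8' > prog

# 1. Find the byte offset of the pattern (GNU grep: -b prints the byte
#    offset, -o prints only the match, -a treats binary input as text).
offset=$(LC_ALL=C grep -obUa $'\xe8\x81\x09\xb8' prog | head -n 1 | cut -d: -f1)

# 2. Overwrite those four bytes with the corrected word e8810be8;
#    conv=notrunc keeps the rest of the file intact.
printf '\xe8\x81\x0b\xe8' | dd of=prog bs=1 seek="$offset" conv=notrunc 2>/dev/null

# 3. Verify the patch.
od -A n -t x1 prog
# prints:  e8 81 0b e8
```

On the real binary, first confirm the pattern occurs exactly once (drop `head -n 1` and count the matches), and patch a copy, not the original.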

Funny font in the build messages in Codeblocks using g++-4 (Cygwin) as compiler

I am using CodeBlocks 10.05 with Cygwin 1.7 to compile some C++ code. The operating system is WinXP SP3. The compiler is g++ 4.5.3.
When I build the following program:
#include <stdio.h>
#include <stdlib.h>
using namespace std;
int main()
{
    unsigned long long a = 12345678901234;
    printf("%u\n", a);
    return 0;
}
it outputs the following in the build log:
C:\Documents and Settings\Zhi Ping\Desktop\UVa\143\main.cpp||In function ‘int main()’:|
C:\Documents and Settings\Zhi Ping\Desktop\UVa\143\main.cpp|9|warning: format ‘%u’ expects type ‘unsigned int’, but argument 2 has type ‘long long unsigned int’|
C:\Documents and Settings\Zhi Ping\Desktop\UVa\143\main.cpp|9|warning: format ‘%u’ expects type ‘unsigned int’, but argument 2 has type ‘long long unsigned int’|
||=== Build finished: 0 errors, 2 warnings ===|
I do not know why CodeBlocks prints the ‘ etc. symbols. Is there a way for CodeBlocks to properly display the characters?
Cygwin defaults to the UTF-8 encoding, whereas it looks like CodeBlocks assumes that output is in CP1252. Furthermore, since Cygwin tells it that UTF-8 is available, gcc uses separate left and right versions of quote characters instead of the usual ASCII ones. The result is what you're seeing. There are two ways to tackle this: either tell CodeBlocks to use UTF-8, or tell gcc to stick to ASCII by setting LANG=C. I don't know how to do either of these in CodeBlocks though.
Add the following Environment Variable to your computer:
LANG=C
In Windows 7, you can add it by going to Computer > Properties > Advanced System Settings > Environment Variables, then "New...". The menus should be similar in Windows XP.
I hope it's ok to answer an old question. This happened to me today as well, and it took me a while to fix it.

Unable to link Intel MKL

I'm unable to link my program correctly. I use the following command line, but get an error.
g++ -I/home/blah/intel/composerxe/mkl/include dotProduct.cpp /home/blah/intel/composerxe/mkl/lib/intel64/libmkl_core.a
The output is this:
/tmp/ccvw6w13.o: In function `main':
dotProduct.cpp:(.text+0x108): undefined reference to `cblas_sdot'
collect2: ld returned 1 exit status
I also tried running a script that links one by one against each of the .a files, but they all fail. Can anybody please suggest a solution?
Thanks.
Here's a KB article from Intel:
http://software.intel.com/en-us/articles/intel-math-kernel-library-intel-mkl-for-linux-linking-applications-with-intel-mkl-version-100/
On a side note, if you can use the Intel compiler instead of gcc, this works (at least it does for me):
icpc files -mkl
Notice there's no l in front; it's just -mkl, not -lmkl.
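For gcc, the underlying issue is that libmkl_core.a alone is not enough: MKL also needs an interface layer and a threading (or sequential) layer, and since the three archives reference each other, they must be wrapped in a linker group. A sketch, assuming the 64-bit LP64 sequential layout under the questioner's install path (library names and paths vary by MKL version; check the linked article or Intel's link-line advisor for yours):

```shell
g++ -I/home/blah/intel/composerxe/mkl/include dotProduct.cpp \
    -Wl,--start-group \
    /home/blah/intel/composerxe/mkl/lib/intel64/libmkl_intel_lp64.a \
    /home/blah/intel/composerxe/mkl/lib/intel64/libmkl_sequential.a \
    /home/blah/intel/composerxe/mkl/lib/intel64/libmkl_core.a \
    -Wl,--end-group \
    -lpthread -lm
```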

cmake: Target-specific preprocessor definitions for CUDA targets seems not to work

I'm using cmake 2.8.1 on Mac OSX 10.6 with CUDA 3.0.
So I added a CUDA target which needs BLOCK_SIZE set to some number in order to compile.
cuda_add_executable(SimpleTestsCUDA
    SimpleTests.cu
    BlockMatrix.cpp
    Matrix.cpp
)
set_target_properties(SimpleTestsCUDA PROPERTIES COMPILE_FLAGS -DBLOCK_SIZE=3)
When running make VERBOSE=1 I noticed that nvcc is invoked without -DBLOCK_SIZE=3, which results in an error because BLOCK_SIZE is used in the code but defined nowhere. The same definition works for a CPU target (using add_executable(...)).
So now the questions: how do I figure out what cmake does with the set_target_properties line when it points to a CUDA target? Googling didn't help so far, and a workaround would be welcome.
I think the best way to do this is by adding "OPTIONS -DBLOCK_SIZE=3" to cuda_add_executable. So your line would look like this:
cuda_add_executable(SimpleTestsCUDA
    SimpleTests.cu
    BlockMatrix.cpp
    Matrix.cpp
    OPTIONS -DBLOCK_SIZE=3
)
Or you can set it before cuda_add_executable:
SET(CUDA_NVCC_FLAGS -DBLOCK_SIZE=3)
The only workaround I found so far is using remove_definitions:
remove_definitions(-DBLOCK_SIZE=3)
add_definitions(-DBLOCK_SIZE=32)
Doing this before the target definition seems to help.
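Either way, a quick sanity check is to rebuild verbosely and confirm the define actually reaches the nvcc command line (run from this project's build directory):

```shell
# Prints every -DBLOCK_SIZE=<n> occurrence in the verbose build output;
# no output means the flag never reached the compiler.
make VERBOSE=1 2>&1 | grep -o -- '-DBLOCK_SIZE=[0-9]*'
```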