Lua API unload luaL_loadfile - api

I working on Lua scripter in game that users are able to write custom Lua script and load it.
I already done with implementation and functions
L = luaL_newstate()
luaL_openlibs()
lua_register(L, cFunction, "luaFunction")
lua_register(L, cFunction2, "luaFunction2")
lua_register(L, cFunctionN, "luaFunctionN")
What I trying now?
Possibility to execute/kill script on button press, below picture for more clarity : https://i.stack.imgur.com/eJQzg.png
All scripts from scripter need to access to Lib.lua so i do:
luaL_loadfile(L, "Lib.lua")
lua_pcall(L, 0, 0, 0)
Then to load script i could use this same thing and is ok until user want to kill/unload script.
luaL_loadfile(L, "script.lua")
lua_pcall(L, 0, 0, 0)
I digging in many threads about Lua API and currently no good answer for this problem.
Ppls talking about lua_newthread that i already tried to implement but no success.
T = lua_newthread(L)
luaL_loadfile(L, "script.lua")
lua_pcall(L, 0, 0, 0)
Return >> Attempt to call a nill value. Look like new thread has no access to Lib.lua
Also there is other problem, when i register function i use:
lua_register(L, cFunction, "luaFunction")
Then even i able to create a thread functions from luaL_newstate*L required to pass handle L to working while handle of thread is T.
Example of C function passing to lua:
static int add (lua_State *L) {
double a = lua_tonumber(L, -1); /* get argument 1 */
double b = lua_tonumber(L, -2); /* get argument 2 */
lua_pushnumber(L, a+b); /* push result */
return 1; /* number of results */
}
Any king of is appreciated.
Regards,
Ascer

Related

Parallel Dynamic Programming with CUDA

It is my first attempt to implement recursion with CUDA. The goal is to extract all the combinations from a set of chars "12345" using the power of CUDA to parallelize dynamically the task. Here is my kernel:
__device__ char route[31] = { "_________________________"};
__device__ char init[6] = { "12345" };
__global__ void Recursive(int depth) {
// up to depth 6
if (depth == 5) return;
// newroute = route - idx
int x = depth * 6;
printf("%s\n", route);
int o = 0;
int newlen = 0;
for (int i = 0; i<6; ++i)
{
if (i != threadIdx.x)
{
route[i+x-o] = init[i];
newlen++;
}
else
{
o = 1;
}
}
Recursive<<<1,newlen>>>(depth + 1);
}
__global__ void RecursiveCount() {
Recursive <<<1,5>>>(0);
}
The idea is to exclude 1 item (the item corresponding to the threadIdx) in each different thread. In each recursive call, using the variable depth, it works over a different base (variable x) on the route device variable.
I expect the kernel prompts something like:
2345_____________________
1345_____________________
1245_____________________
1234_____________________
2345_345_________________
2345_245_________________
2345_234_________________
2345_345__45_____________
2345_345__35_____________
2345_345__34_____________
..
2345_245__45_____________
..
But it prompts ...
·_____________
·_____________
·_____________
·_____________
·_____________
·2345
·2345
·2345
·2345
...
What I´m doing wrong?
What I´m doing wrong?
I may not articulate every problem with your code, but these items should get you a lot closer.
I recommend providing a complete example. In my view it is basically required by Stack Overflow, see item 1 here, note use of the word "must". Your example is missing any host code, including the original kernel call. It's only a few extra lines of code, why not include it? Sure, in this case, I can deduce what the call must have been, but why not just include it? Anyway, based on the output you indicated, it seems fairly evident the launch configuration of the host launch would have to be <<<1,1>>>.
This doesn't seem to be logical to me:
I expect the kernel prompts something like:
2345_____________________
The very first thing your kernel does is print out the route variable, before making any changes to it, so I would expect _____________________. However we can "fix" this by moving the printout to the end of the kernel.
You may be confused about what a __device__ variable is. It is a global variable, and there is only one copy of it. Therefore, when you modify it in your kernel code, every thread, in every kernel, is attempting to modify the same global variable, at the same time. That cannot possibly have orderly results, in any thread-parallel environment. I chose to "fix" this by making a local copy for each thread to work on.
You have an off-by-1 error, as well as an extent error in this loop:
for (int i = 0; i<6; ++i)
The off-by-1 error is due to the fact that you are iterating over 6 possible items (that is, i can reach a value of 5) but there are only 5 items in your init variable (the 6th item being a null terminator. The correct indexing starts out over 0-4 (with one of those being skipped). On subsequent iteration depths, its necessary to reduce this indexing extent by 1. Note that I've chosen to fix the first error here by increasing the length of init. There are other ways to fix, of course. My method inserts an extra _ between depths in the result.
You assume that at each iteration depth, the correct choice of items is the same, and in the same order, i.e. init. However this is not the case. At each depth, the choices of items must be selected not from the unchanging init variable, but from the choices passed from previous depth. Therefore we need a local, per-thread copy of init also.
A few other comments about CUDA Dynamic Parallelism (CDP). When passing pointers to data from one kernel scope to a child scope, local space pointers cannot be used. Therefore I allocate for the local copy of route from the heap, so it can be passed to child kernels. init can be deduced from route, so we can use an ordinary local variable for myinit.
You're going to quickly hit some dynamic parallelism (and perhaps memory) limits here if you continue this. I believe the total number of kernel launches for this is 5^5, which is 3125 (I'm doing this quickly, I may be mistaken). CDP has a pending launch limit of 2000 kernels by default. We're not hitting this here according to what I see, but you'll run into that sooner or later if you increase the depth or width of this operation. Furthermore, in-kernel allocations from the device heap are by default limited to 8KB. I don't seem to be hitting that limit, but probably I am, so my design should probably be modified to fix that.
Finally, in-kernel printf output is limited to the size of a particular buffer. If this technique is not already hitting that limit, it will soon if you increase the width or depth.
Here is a worked example, attempting to address the various items above. I'm not claiming it is defect free, but I think the output is closer to your expectations. Note that due to character limits on SO answers, I've truncated/excerpted some of the output.
$ cat t1639.cu
#include <stdio.h>
__device__ char route[31] = { "_________________________"};
__device__ char init[7] = { "12345_" };
__global__ void Recursive(int depth, const char *oroute) {
char *nroute = (char *)malloc(31);
char myinit[7];
if (depth == 0) memcpy(myinit, init, 6);
else memcpy(myinit, oroute+(depth-1)*6, 6);
myinit[6] = 0;
if (nroute == NULL) {printf("oops\n"); return;}
memcpy(nroute, oroute, 30);
nroute[30] = 0;
// up to depth 6
if (depth == 5) return;
// newroute = route - idx
int x = depth * 6;
//printf("%s\n", nroute);
int o = 0;
int newlen = 0;
for (int i = 0; i<(6-depth); ++i)
{
if (i != threadIdx.x)
{
nroute[i+x-o] = myinit[i];
newlen++;
}
else
{
o = 1;
}
}
printf("%s\n", nroute);
Recursive<<<1,newlen>>>(depth + 1, nroute);
}
__global__ void RecursiveCount() {
Recursive <<<1,5>>>(0, route);
}
int main(){
RecursiveCount<<<1,1>>>();
cudaDeviceSynchronize();
}
$ nvcc -o t1639 t1639.cu -rdc=true -lcudadevrt -arch=sm_70
$ cuda-memcheck ./t1639
========= CUDA-MEMCHECK
2345_____________________
1345_____________________
1245_____________________
1235_____________________
1234_____________________
2345__345________________
2345__245________________
2345__235________________
2345__234________________
2345__2345_______________
2345__345___45___________
2345__345___35___________
2345__345___34___________
2345__345___345__________
2345__345___45____5______
2345__345___45____4______
2345__345___45____45_____
2345__345___45____5______
2345__345___45____5_____5
2345__345___45____4______
2345__345___45____4_____4
2345__345___45____45____5
2345__345___45____45____4
2345__345___35____5______
2345__345___35____3______
2345__345___35____35_____
2345__345___35____5______
2345__345___35____5_____5
2345__345___35____3______
2345__345___35____3_____3
2345__345___35____35____5
2345__345___35____35____3
2345__345___34____4______
2345__345___34____3______
2345__345___34____34_____
2345__345___34____4______
2345__345___34____4_____4
2345__345___34____3______
2345__345___34____3_____3
2345__345___34____34____4
2345__345___34____34____3
2345__345___345___45_____
2345__345___345___35_____
2345__345___345___34_____
2345__345___345___45____5
2345__345___345___45____4
2345__345___345___35____5
2345__345___345___35____3
2345__345___345___34____4
2345__345___345___34____3
2345__245___45___________
2345__245___25___________
2345__245___24___________
2345__245___245__________
2345__245___45____5______
2345__245___45____4______
2345__245___45____45_____
2345__245___45____5______
2345__245___45____5_____5
2345__245___45____4______
2345__245___45____4_____4
2345__245___45____45____5
2345__245___45____45____4
2345__245___25____5______
2345__245___25____2______
2345__245___25____25_____
2345__245___25____5______
2345__245___25____5_____5
2345__245___25____2______
2345__245___25____2_____2
2345__245___25____25____5
2345__245___25____25____2
2345__245___24____4______
2345__245___24____2______
2345__245___24____24_____
2345__245___24____4______
2345__245___24____4_____4
2345__245___24____2______
2345__245___24____2_____2
2345__245___24____24____4
2345__245___24____24____2
2345__245___245___45_____
2345__245___245___25_____
2345__245___245___24_____
2345__245___245___45____5
2345__245___245___45____4
2345__245___245___25____5
2345__245___245___25____2
2345__245___245___24____4
2345__245___245___24____2
2345__235___35___________
2345__235___25___________
2345__235___23___________
2345__235___235__________
2345__235___35____5______
2345__235___35____3______
2345__235___35____35_____
2345__235___35____5______
2345__235___35____5_____5
2345__235___35____3______
2345__235___35____3_____3
2345__235___35____35____5
2345__235___35____35____3
2345__235___25____5______
2345__235___25____2______
2345__235___25____25_____
2345__235___25____5______
2345__235___25____5_____5
2345__235___25____2______
2345__235___25____2_____2
2345__235___25____25____5
2345__235___25____25____2
2345__235___23____3______
2345__235___23____2______
2345__235___23____23_____
2345__235___23____3______
2345__235___23____3_____3
2345__235___23____2______
2345__235___23____2_____2
2345__235___23____23____3
2345__235___23____23____2
2345__235___235___35_____
2345__235___235___25_____
2345__235___235___23_____
2345__235___235___35____5
2345__235___235___35____3
2345__235___235___25____5
2345__235___235___25____2
2345__235___235___23____3
2345__235___235___23____2
2345__234___34___________
2345__234___24___________
2345__234___23___________
2345__234___234__________
2345__234___34____4______
2345__234___34____3______
2345__234___34____34_____
2345__234___34____4______
2345__234___34____4_____4
2345__234___34____3______
2345__234___34____3_____3
2345__234___34____34____4
2345__234___34____34____3
2345__234___24____4______
2345__234___24____2______
2345__234___24____24_____
2345__234___24____4______
2345__234___24____4_____4
2345__234___24____2______
2345__234___24____2_____2
2345__234___24____24____4
2345__234___24____24____2
2345__234___23____3______
2345__234___23____2______
2345__234___23____23_____
2345__234___23____3______
2345__234___23____3_____3
2345__234___23____2______
2345__234___23____2_____2
2345__234___23____23____3
2345__234___23____23____2
2345__234___234___34_____
2345__234___234___24_____
2345__234___234___23_____
2345__234___234___34____4
2345__234___234___34____3
2345__234___234___24____4
2345__234___234___24____2
2345__234___234___23____3
2345__234___234___23____2
2345__2345__345__________
2345__2345__245__________
2345__2345__235__________
2345__2345__234__________
2345__2345__345___45_____
2345__2345__345___35_____
2345__2345__345___34_____
2345__2345__345___45____5
2345__2345__345___45____4
2345__2345__345___35____5
2345__2345__345___35____3
2345__2345__345___34____4
2345__2345__345___34____3
2345__2345__245___45_____
2345__2345__245___25_____
2345__2345__245___24_____
2345__2345__245___45____5
2345__2345__245___45____4
2345__2345__245___25____5
2345__2345__245___25____2
2345__2345__245___24____4
2345__2345__245___24____2
2345__2345__235___35_____
2345__2345__235___25_____
2345__2345__235___23_____
2345__2345__235___35____5
2345__2345__235___35____3
2345__2345__235___25____5
2345__2345__235___25____2
2345__2345__235___23____3
2345__2345__235___23____2
2345__2345__234___34_____
2345__2345__234___24_____
2345__2345__234___23_____
2345__2345__234___34____4
2345__2345__234___34____3
2345__2345__234___24____4
2345__2345__234___24____2
2345__2345__234___23____3
2345__2345__234___23____2
1345__345________________
1345__145________________
1345__135________________
1345__134________________
1345__1345_______________
1345__345___45___________
1345__345___35___________
1345__345___34___________
1345__345___345__________
1345__345___45____5______
1345__345___45____4______
1345__345___45____45_____
1345__345___45____5______
1345__345___45____5_____5
1345__345___45____4______
1345__345___45____4_____4
1345__345___45____45____5
1345__345___45____45____4
1345__345___35____5______
1345__345___35____3______
1345__345___35____35_____
1345__345___35____5______
1345__345___35____5_____5
1345__345___35____3______
1345__345___35____3_____3
1345__345___35____35____5
1345__345___35____35____3
1345__345___34____4______
1345__345___34____3______
1345__345___34____34_____
1345__345___34____4______
1345__345___34____4_____4
1345__345___34____3______
1345__345___34____3_____3
1345__345___34____34____4
1345__345___34____34____3
1345__345___345___45_____
1345__345___345___35_____
1345__345___345___34_____
1345__345___345___45____5
1345__345___345___45____4
1345__345___345___35____5
1345__345___345___35____3
1345__345___345___34____4
1345__345___345___34____3
1345__145___45___________
1345__145___15___________
1345__145___14___________
1345__145___145__________
1345__145___45____5______
1345__145___45____4______
1345__145___45____45_____
1345__145___45____5______
1345__145___45____5_____5
1345__145___45____4______
1345__145___45____4_____4
1345__145___45____45____5
1345__145___45____45____4
1345__145___15____5______
1345__145___15____1______
1345__145___15____15_____
1345__145___15____5______
1345__145___15____5_____5
1345__145___15____1______
1345__145___15____1_____1
1345__145___15____15____5
1345__145___15____15____1
1345__145___14____4______
1345__145___14____1______
1345__145___14____14_____
1345__145___14____4______
1345__145___14____4_____4
1345__145___14____1______
1345__145___14____1_____1
1345__145___14____14____4
1345__145___14____14____1
1345__145___145___45_____
1345__145___145___15_____
1345__145___145___14_____
1345__145___145___45____5
1345__145___145___45____4
1345__145___145___15____5
1345__145___145___15____1
1345__145___145___14____4
1345__145___145___14____1
1345__135___35___________
1345__135___15___________
1345__135___13___________
1345__135___135__________
1345__135___35____5______
1345__135___35____3______
1345__135___35____35_____
1345__135___35____5______
1345__135___35____5_____5
1345__135___35____3______
1345__135___35____3_____3
1345__135___35____35____5
1345__135___35____35____3
1345__135___15____5______
1345__135___15____1______
1345__135___15____15_____
1345__135___15____5______
1345__135___15____5_____5
1345__135___15____1______
1345__135___15____1_____1
1345__135___15____15____5
1345__135___15____15____1
1345__135___13____3______
1345__135___13____1______
1345__135___13____13_____
1345__135___13____3______
1345__135___13____3_____3
1345__135___13____1______
1345__135___13____1_____1
1345__135___13____13____3
1345__135___13____13____1
1345__135___135___35_____
1345__135___135___15_____
1345__135___135___13_____
1345__135___135___35____5
1345__135___135___35____3
1345__135___135___15____5
1345__135___135___15____1
1345__135___135___13____3
1345__135___135___13____1
1345__134___34___________
1345__134___14___________
1345__134___13___________
1345__134___134__________
1345__134___34____4______
1345__134___34____3______
1345__134___34____34_____
1345__134___34____4______
1345__134___34____4_____4
1345__134___34____3______
1345__134___34____3_____3
1345__134___34____34____4
1345__134___34____34____3
1345__134___14____4______
1345__134___14____1______
1345__134___14____14_____
1345__134___14____4______
1345__134___14____4_____4
1345__134___14____1______
1345__134___14____1_____1
1345__134___14____14____4
1345__134___14____14____1
1345__134___13____3______
1345__134___13____1______
1345__134___13____13_____
1345__134___13____3______
1345__134___13____3_____3
1345__134___13____1______
1345__134___13____1_____1
1345__134___13____13____3
1345__134___13____13____1
1345__134___134___34_____
1345__134___134___14_____
1345__134___134___13_____
1345__134___134___34____4
1345__134___134___34____3
1345__134___134___14____4
1345__134___134___14____1
1345__134___134___13____3
1345__134___134___13____1
1345__1345__345__________
1345__1345__145__________
1345__1345__135__________
1345__1345__134__________
1345__1345__345___45_____
1345__1345__345___35_____
1345__1345__345___34_____
1345__1345__345___45____5
1345__1345__345___45____4
1345__1345__345___35____5
1345__1345__345___35____3
1345__1345__345___34____4
1345__1345__345___34____3
1345__1345__145___45_____
1345__1345__145___15_____
1345__1345__145___14_____
1345__1345__145___45____5
1345__1345__145___45____4
1345__1345__145___15____5
1345__1345__145___15____1
1345__1345__145___14____4
1345__1345__145___14____1
1345__1345__135___35_____
1345__1345__135___15_____
1345__1345__135___13_____
1345__1345__135___35____5
1345__1345__135___35____3
1345__1345__135___15____5
1345__1345__135___15____1
1345__1345__135___13____3
1345__1345__135___13____1
1345__1345__134___34_____
1345__1345__134___14_____
1345__1345__134___13_____
1345__1345__134___34____4
1345__1345__134___34____3
1345__1345__134___14____4
1345__1345__134___14____1
1345__1345__134___13____3
1345__1345__134___13____1
1245__245________________
1245__145________________
1245__125________________
1245__124________________
1245__1245_______________
1245__245___45___________
1245__245___25___________
1245__245___24___________
1245__245___245__________
1245__245___45____5______
1245__245___45____4______
1245__245___45____45_____
1245__245___45____5______
1245__245___45____5_____5
1245__245___45____4______
1245__245___45____4_____4
1245__245___45____45____5
1245__245___45____45____4
1245__245___25____5______
1245__245___25____2______
1245__245___25____25_____
1245__245___25____5______
1245__245___25____5_____5
1245__245___25____2______
1245__245___25____2_____2
1245__245___25____25____5
1245__245___25____25____2
1245__245___24____4______
1245__245___24____2______
1245__245___24____24_____
1245__245___24____4______
1245__245___24____4_____4
1245__245___24____2______
1245__245___24____2_____2
1245__245___24____24____4
1245__245___24____24____2
1245__245___245___45_____
1245__245___245___25_____
1245__245___245___24_____
1245__245___245___45____5
1245__245___245___45____4
1245__245___245___25____5
1245__245___245___25____2
1245__245___245___24____4
1245__245___245___24____2
1245__145___45___________
1245__145___15___________
1245__145___14___________
1245__145___145__________
1245__145___45____5______
1245__145___45____4______
1245__145___45____45_____
1245__145___45____5______
1245__145___45____5_____5
1245__145___45____4______
...
1235__1235__235___25_____
1235__1235__235___23_____
1235__1235__235___35____5
1235__1235__235___35____3
1235__1235__235___25____5
1235__1235__235___25____2
1235__1235__235___23____3
1235__1235__235___23____2
1235__1235__135___35_____
1235__1235__135___15_____
1235__1235__135___13_____
1235__1235__135___35____5
1235__1235__135___35____3
1235__1235__135___15____5
1235__1235__135___15____1
1235__1235__135___13____3
1235__1235__135___13____1
1235__1235__125___25_____
1235__1235__125___15_____
1235__1235__125___12_____
1235__1235__125___25____5
1235__1235__125___25____2
1235__1235__125___15____5
1235__1235__125___15____1
1235__1235__125___12____2
1235__1235__125___12____1
1235__1235__123___23_____
1235__1235__123___13_____
1235__1235__123___12_____
1235__1235__123___23____3
1235__1235__123___23____2
1235__1235__123___13____3
1235__1235__123___13____1
1235__1235__123___12____2
1235__1235__123___12____1
1234__234________________
1234__134________________
1234__124________________
1234__123________________
1234__1234_______________
1234__234___34___________
1234__234___24___________
1234__234___23___________
1234__234___234__________
1234__234___34____4______
1234__234___34____3______
1234__234___34____34_____
1234__234___34____4______
1234__234___34____4_____4
1234__234___34____3______
1234__234___34____3_____3
1234__234___34____34____4
1234__234___34____34____3
1234__234___24____4______
1234__234___24____2______
1234__234___24____24_____
1234__234___24____4______
1234__234___24____4_____4
1234__234___24____2______
1234__234___24____2_____2
1234__234___24____24____4
1234__234___24____24____2
1234__234___23____3______
1234__234___23____2______
1234__234___23____23_____
1234__234___23____3______
1234__234___23____3_____3
1234__234___23____2______
1234__234___23____2_____2
1234__234___23____23____3
1234__234___23____23____2
1234__234___234___34_____
1234__234___234___24_____
1234__234___234___23_____
1234__234___234___34____4
1234__234___234___34____3
1234__234___234___24____4
1234__234___234___24____2
1234__234___234___23____3
1234__234___234___23____2
1234__134___34___________
1234__134___14___________
1234__134___13___________
1234__134___134__________
1234__134___34____4______
1234__134___34____3______
1234__134___34____34_____
1234__134___34____4______
1234__134___34____4_____4
1234__134___34____3______
1234__134___34____3_____3
1234__134___34____34____4
1234__134___34____34____3
1234__134___14____4______
1234__134___14____1______
1234__134___14____14_____
1234__134___14____4______
1234__134___14____4_____4
1234__134___14____1______
1234__134___14____1_____1
1234__134___14____14____4
1234__134___14____14____1
1234__134___13____3______
1234__134___13____1______
1234__134___13____13_____
1234__134___13____3______
1234__134___13____3_____3
1234__134___13____1______
1234__134___13____1_____1
1234__134___13____13____3
1234__134___13____13____1
1234__134___134___34_____
1234__134___134___14_____
1234__134___134___13_____
1234__134___134___34____4
1234__134___134___34____3
1234__134___134___14____4
1234__134___134___14____1
1234__134___134___13____3
1234__134___134___13____1
1234__124___24___________
1234__124___14___________
1234__124___12___________
1234__124___124__________
1234__124___24____4______
1234__124___24____2______
1234__124___24____24_____
1234__124___24____4______
1234__124___24____4_____4
1234__124___24____2______
1234__124___24____2_____2
1234__124___24____24____4
1234__124___24____24____2
1234__124___14____4______
1234__124___14____1______
1234__124___14____14_____
1234__124___14____4______
1234__124___14____4_____4
1234__124___14____1______
1234__124___14____1_____1
1234__124___14____14____4
1234__124___14____14____1
1234__124___12____2______
1234__124___12____1______
1234__124___12____12_____
1234__124___12____2______
1234__124___12____2_____2
1234__124___12____1______
1234__124___12____1_____1
1234__124___12____12____2
1234__124___12____12____1
1234__124___124___24_____
1234__124___124___14_____
1234__124___124___12_____
1234__124___124___24____4
1234__124___124___24____2
1234__124___124___14____4
1234__124___124___14____1
1234__124___124___12____2
1234__124___124___12____1
1234__123___23___________
1234__123___13___________
1234__123___12___________
1234__123___123__________
1234__123___23____3______
1234__123___23____2______
1234__123___23____23_____
1234__123___23____3______
1234__123___23____3_____3
1234__123___23____2______
1234__123___23____2_____2
1234__123___23____23____3
1234__123___23____23____2
1234__123___13____3______
1234__123___13____1______
1234__123___13____13_____
1234__123___13____3______
1234__123___13____3_____3
1234__123___13____1______
1234__123___13____1_____1
1234__123___13____13____3
1234__123___13____13____1
1234__123___12____2______
1234__123___12____1______
1234__123___12____12_____
1234__123___12____2______
1234__123___12____2_____2
1234__123___12____1______
1234__123___12____1_____1
1234__123___12____12____2
1234__123___12____12____1
1234__123___123___23_____
1234__123___123___13_____
1234__123___123___12_____
1234__123___123___23____3
1234__123___123___23____2
1234__123___123___13____3
1234__123___123___13____1
1234__123___123___12____2
1234__123___123___12____1
1234__1234__234__________
1234__1234__134__________
1234__1234__124__________
1234__1234__123__________
1234__1234__234___34_____
1234__1234__234___24_____
1234__1234__234___23_____
1234__1234__234___34____4
1234__1234__234___34____3
1234__1234__234___24____4
1234__1234__234___24____2
1234__1234__234___23____3
1234__1234__234___23____2
1234__1234__134___34_____
1234__1234__134___14_____
1234__1234__134___13_____
1234__1234__134___34____4
1234__1234__134___34____3
1234__1234__134___14____4
1234__1234__134___14____1
1234__1234__134___13____3
1234__1234__134___13____1
1234__1234__124___24_____
1234__1234__124___14_____
1234__1234__124___12_____
1234__1234__124___24____4
1234__1234__124___24____2
1234__1234__124___14____4
1234__1234__124___14____1
1234__1234__124___12____2
1234__1234__124___12____1
1234__1234__123___23_____
1234__1234__123___13_____
1234__1234__123___12_____
1234__1234__123___23____3
1234__1234__123___23____2
1234__1234__123___13____3
1234__1234__123___13____1
1234__1234__123___12____2
1234__1234__123___12____1
========= ERROR SUMMARY: 0 errors
$
The answer given by Robert Crovella is correct at the 5th point, the mistake was in the using of init in every recursive call, but I want to clarify something that can be useful for other beginners with CUDA.
I used this variable because when I tried to launch a child kernel passing a local variable I always got the exception: Error: a pointer to local memory cannot be passed to a launch as an argument.
As I´m C# expert developer I´m not used to using pointers (Ref does the low-level-work for that) so I thought there was no way to do it in CUDA/c programming.
As Robert shows in its code it is possible copying the pointer with memalloc for using it as a referable argument.
Here is a kernel simplified as an example of deep recursion.
__device__ char init[6] = { "12345" };
__global__ void Recursive(int depth, const char* route) {
// up to depth 6
if (depth == 5) return;
//declaration for a referable argument (point 6)
char* newroute = (char*)malloc(6);
memcpy(newroute, route, 5);
int o = 0;
int newlen = 0;
for (int i = 0; i < (6 - depth); ++i)
{
if (i != threadIdx.x)
{
newroute[i - o] = route[i];
newlen++;
}
else
{
o = 1;
}
}
printf("%s\n", newroute);
Recursive <<<1, newlen>>>(depth + 1, newroute);
}
__global__ void RecursiveCount() {
Recursive <<<1, 5>>>(0, init);
}
I don't add the main call because I´m using ManagedCUDA for C# but as Robert says it can be figured-out how the call RecursiveCount is.
About ending arrays of char with /0 ... sorry but I don't know exactly what is the benefit; this code works fine without them.

Why can't I call the child process when I put it C:/Windows/System32/?

I wrote a Project.exe that shows Hello World. Write a main process to create a child process of Project.exe. However, when I put Projetc.exe on C:/Windows/System32/, I can't create the child process of Project.exe correctly, but it can be created normally when I put it in other directories.
The main program is as follows:
#include <Windows.h>
#include <string.h>
#include <stdio.h>
int main() {
STARTUPINFO si;
memset(&si, 0, sizeof(si));
si.cb = sizeof(si);
PROCESS_INFORMATION pi;
wchar_t *p = (wchar_t*)TEXT("C:/Windows/System32/Project1.exe");
CreateProcess(p, 0, 0, 0, 0, 0, 0, 0, &si, &pi);
DWORD CurId = GetCurrentProcessId(); //Get the ID of the current process
DWORD Pid = pi.dwProcessId; //ID of the created process
DWORD Tid = pi.dwThreadId; //The main thread ID of the created child process
printf("the ID of the current process: %d\nID of the created process: %d\nThe main thread ID of the created child process %d\n", CurId, Pid, Tid);
WaitForSingleObject(pi.hProcess, -1);
}
The result of the operation is:
If I put Project.exe under E:/, the result is
What is the reason for this, how do I need to change it to run successfully under /System32/?
The general answer is, if you want to know why a call isn't doing what you expect, you should first look at the return status to find out whether the call worked.
The documentation for CreateProcess tells you it returns non-zero for success, zero for failure. You are not looking at the return value, so your code has no knowledge of success or failure.
Furthermore, on failure, GetLastError() will return a reason for failure. That might tell you what's going on.
In short, you need "if return value is zero, print value of GetLastError() and exit".

Vulkan load vkCreateDebugReportCallbackEXT

I am getting into Vulkan and stumbled on my first problem. When trying to create a debug report callback (validation layers and debug extensions are available on my intel hd vulkan driver, at least it says so), it fails telling me vkCreateDebugReportCallbackEXT is an unresolved symbol. When trying to get the function pointer it fails telling me vkCreateDebugReportCallbackEXT is already defined.
Which it is, in the Vulkan header. I could set VK_NO_PROTOTYPES but then I would have to load everything by hand. Is there a way around this? Just using a different name for the function pointer won't work, since I am using Vulkan-Hpp and it uses vkCreateDebugReportCallbackEXT as it is.
Is this a driver bug, telling me debug extensions are available, but there are not?
Btw, I am using VS2015.
Thanks for any help
That's normal. vulkan.h defines them as a global functions. But the loader commands obviously return function pointer.
Normally you would just use a different name you like. But I like to have the canonical names too...
I solve it by defining the function myself (using the declaration from vulkan.h) which in turn calls the loaded pointer:
VKAPI_ATTR VkResult VKAPI_CALL vkCommandEXT( /*...*/ ){
return fpCommandEXT( /*...*/ );
}
(Shameless self-promotion) Like so:
https://github.com/krOoze/Hello_Triangle/blob/8227220/ErrorHandling.h#L181
I make the command to self-load on its first use — if you don't like that, in older commit I had more conventional loader:
https://github.com/krOoze/Hello_Triangle/blob/699ab57/HelloTriangle.cpp#L731
PS:
Khronos themselves just added loader code that illustrates that nicely:
https://github.com/KhronosGroup/Vulkan-Docs/blob/1.0/src/ext_loader/vulkan_ext.c
If you handle multiple VkInstances or VkDevices the loaded functions have to be dispatched to the correct instance or device. For example, I do that (likely inefficiently) here:
https://github.com/krOoze/Hello_Triangle/blob/a691de5/ExtensionLoader.h
I had same issue, couldn't find solution so i solved it this way
(might be wrong, but i just want to share in case it will help somebody):
struct DebugDispatch {
//KHRONOS
PFN_vkCreateDebugReportCallbackEXT vkCreateDebugReportCallbackEXT = 0;
PFN_vkDestroyDebugReportCallbackEXT vkDestroyDebugReportCallbackEXT = 0;
//LUNARG
PFN_vkCreateDebugUtilsMessengerEXT vkCreateDebugUtilsMessengerEXT = 0;
PFN_vkDestroyDebugUtilsMessengerEXT vkDestroyDebugUtilsMessengerEXT = 0;
}
VKAPI_ATTR vk::Bool32 VKAPI_CALL debugReportCallback(...){...}
VKAPI_ATTR vk::Bool32 VKAPI_CALL debugUtilsMessengerCallback(...){...}
enum class ValidationFlagsBits : unsigned int {
NONE = 0,
KHRONOS = 1,
LUNARG = 1 << 1
};
typedef vk::Flags<ValidationFlagsBits> ValidationFlags;
void Example(){
...
vk::Instance instance;
instance = vk::createInstance(...);
DebugDispatch debug_dispatch;
vk::DebugReportCallbackEXT debug_report_callback;
vk::DebugUtilsMessengerEXT debug_utils_messenger;
if(validation_flags & ValidationFlagsBits::KHRONOS){
debug_dispatch.vkCreateDebugReportCallbackEXT =
(PFN_vkCreateDebugReportCallbackEXT)instance.getProcAddr("vkCreateDebugReportCallbackEXT");
debug_dispatch.vkDestroyDebugReportCallbackEXT =
(PFN_vkDestroyDebugReportCallbackEXT)instance.getProcAddr("vkDestroyDebugReportCallbackEXT");
vk::DebugUtilsMessengerCreateInfoEXT create_info{};
create_info.messageSeverity = ...;
create_info.messageType = ...;
create_info.pfnUserCallback = reinterpret_cast<PFN_vkDebugUtilsMessengerCallbackEXT>(&debugUtilsMessengerCallback);
debug_utils_messenger = instance.createDebugUtilsMessengerEXT(create_info, nullptr, debug_dispatch);
}
if(validation_flags & ValidationFlagsBits::LUNARG){
debug_dispatch.vkCreateDebugUtilsMessengerEXT =
(PFN_vkCreateDebugUtilsMessengerEXT)instance.getProcAddr("vkCreateDebugUtilsMessengerEXT");
debug_dispatch.vkDestroyDebugUtilsMessengerEXT =
(PFN_vkDestroyDebugUtilsMessengerEXT)instance.getProcAddr("vkDestroyDebugUtilsMessengerEXT");
vk::DebugReportCallbackCreateInfoEXT create_info{};
create_info.flags = ...;
create_info.pfnCallback = reinterpret_cast<PFN_vkDebugReportCallbackEXT>(&debugReportCallback);
debug_report_callback = instance.createDebugReportCallbackEXT(create_info, nullptr, debug_dispatch);
}
...
if(validation_flags & ValidationFlagsBits::KHRONOS){
instance.destroyDebugUtilsMessengerEXT(debug_utils_messenger, nullptr, debug_dispatch);
}
if(validation_flags & ValidationFlagsBits::LUNARG){
instance.destroyDebugReportCallbackEXT(debug_report_callback, nullptr, debug_dispatch);
}
instance.destroy();
}

MPI message received in different communicator - erroneous program or MPI implementation bug?

This is a follow-up to this previous question of mine, for which the conclusion was that the program was erroneous, and therefore the expected behavior was undefined.
What I'm trying to create here is a simple error-handling mechanism, for which I use that Irecv request for the empty message as an "abort handle", attaching it to my normal MPI_Wait call (and turning it into MPI_WaitAny), in order to allow me to unblock process 1 in case an error occurs on process 0 and it can no longer reach the point where it's supposed to post the matching MPI_Recv.
What's happening is that, due to internal message buffering, the MPI_Isend may succeed right away, without the other process being able to post the matching MPI_Recv. So there's no way of canceling it anymore.
I was hoping that once all processes call MPI_Comm_free I can just forget about that message once and for all, but, as it turns out, that's not the case. Instead, it's being delivered to the MPI_Recv in the following communicator.
So my questions are:
Is this also an erroneous program, or is it a bug in the MPI implementation (Intel MPI 4.0.3)?
If I turn my MPI_Isend calls into MPI_Issend, the program works as expected - can I at least in that case rest assured that the program is correct?
Am I reinventing the wheel here? Is there a simpler way to achieve this?
Again, any feedback is much appreciated!
#include "stdio.h"
#include "unistd.h"
#include "mpi.h"
#include "time.h"
#include "stdlib.h"
int main(int argc, char* argv[]) {
int rank, size;
MPI_Group group;
MPI_Comm my_comm;
srand(time(NULL));
MPI_Init(&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
MPI_Comm_group(MPI_COMM_WORLD, &group);
MPI_Comm_create(MPI_COMM_WORLD, group, &my_comm);
if (rank == 0) printf("created communicator %d\n", my_comm);
if (rank == 1) {
MPI_Request req[2];
int msg = 123, which;
MPI_Isend(&msg, 1, MPI_INT, 0, 0, my_comm, &req[0]);
MPI_Irecv(NULL, 0, MPI_INT, 0, 0, my_comm, &req[1]);
MPI_Waitany(2, req, &which, MPI_STATUS_IGNORE);
MPI_Barrier(my_comm);
if (which == 0) {
printf("rank 1: send succeed; cancelling abort handle\n");
MPI_Cancel(&req[1]);
MPI_Wait(&req[1], MPI_STATUS_IGNORE);
} else {
printf("rank 1: send aborted; cancelling send request\n");
MPI_Cancel(&req[0]);
MPI_Wait(&req[0], MPI_STATUS_IGNORE);
}
} else {
MPI_Request req;
int msg, r = rand() % 2;
if (r) {
printf("rank 0: receiving message\n");
MPI_Recv(&msg, 1, MPI_INT, 1, 0, my_comm, MPI_STATUS_IGNORE);
} else {
printf("rank 0: sending abort message\n");
MPI_Isend(NULL, 0, MPI_INT, 1, 0, my_comm, &req);
}
MPI_Barrier(my_comm);
if (!r) {
MPI_Cancel(&req);
MPI_Wait(&req, MPI_STATUS_IGNORE);
}
}
if (rank == 0) printf("freeing communicator %d\n", my_comm);
MPI_Comm_free(&my_comm);
sleep(2);
MPI_Comm_create(MPI_COMM_WORLD, group, &my_comm);
if (rank == 0) printf("created communicator %d\n", my_comm);
if (rank == 0) {
MPI_Request req;
MPI_Status status;
int msg, cancelled;
MPI_Irecv(&msg, 1, MPI_INT, 1, 0, my_comm, &req);
sleep(1);
MPI_Cancel(&req);
MPI_Wait(&req, &status);
MPI_Test_cancelled(&status, &cancelled);
if (cancelled) {
printf("rank 0: receive cancelled\n");
} else {
printf("rank 0: OLD MESSAGE RECEIVED!!!\n");
}
}
if (rank == 0) printf("freeing communicator %d\n", my_comm);
MPI_Comm_free(&my_comm);
MPI_Finalize();
return 0;
}
outputs:
created communicator -2080374784
rank 0: sending abort message
rank 1: send succeed; cancelling abort handle
freeing communicator -2080374784
created communicator -2080374784
rank 0: STRAY MESSAGE RECEIVED!!!
freeing communicator -2080374784
As mentioned in one of the above comments by #kraffenetti, this is an erroneous program because the sent messages are not being matched by receives. Even though the messages are cancelled, they still need to have a matching receive on the remote side because it's possible that the cancel might not be successful for sent messages due to the fact that they were already sent before the cancel can be completed (which is the case here).
This question started a thread on this on a ticket for MPICH, which you can find here that has more details.
I tried to build your code using open mpi and it did not work. mpicc complained about status.cancelled
error: ‘MPI_Status’ has no member named ‘cancelled’
I suppose this is a feature of intel mpi. What happens if you switch for :
...
int flag;
MPI_Test_cancelled(&status, &flag);
if (flag) {
...
This gives the expected output using open mpi (and it makes your code less dependant). Is it the case using intel mpi ?
We need an expert to tell us what is status.cancelled in intel mpi, because i don't know anything about it !
Edit : i tested my answer many times and i found that the output was random, sometimes correct, sometimes not. Sorry for that... As if something in status was not set. Part of the answer may be in MPI_Wait(), http://www.mpich.org/static/docs/v3.1/www3/MPI_Wait.html ,
" The MPI_ERROR field of the status return is only set if the return from the MPI routine is MPI_ERR_IN_STATUS. That error class is only returned by the routines that take an array of status arguments (MPI_Testall, MPI_Testsome, MPI_Waitall, and MPI_Waitsome). In all other cases, the value of the MPI_ERROR field in the status is unchanged. See section 3.2.5 in the MPI-1.1 specification for the exact text. " If MPI_Test_cancelled() makes use of the MPI_ERROR, things might get bad.
So here is the trick : use MPI_Waitall(1,&req, &status) ! The output is correct at last !

DirectShow's PushSource filters cause IMediaControl::Run to return S_FALSE

I'm messing around with the PushSource sample filter shipped with the DirectShow SDK and I'm having the following problem:
When I call IMediaControl::Run(), it returns S_FALSE which means "the graph is preparing to run, but some filters have not completed the transition to a running state". MSDN suggests to then call IMediaControl::GetState() and wait for the transition to finish.
And so, I call IMediaControl::GetState(INFINITE, ...) which is supposed to solve the problem.
However, to the contrary, it returns VFW_S_STATE_INTERMEDIATE even though I've specified an infinite waiting time.
I've tried all three variations (Bitmap, Bitmap Set and Desktop) and they all behave the same way, which initially lead me to believe there is a bug in there somewhere.
However, then, I tried using IFilterGraph::AddSourceFilter to do the same and it did the same thing, which must mean it's my rendering code that is the problem:
CoInitialize(0);
IGraphBuilder *graph = 0;
assert(S_OK == CoCreateInstance(CLSID_FilterGraph, 0, CLSCTX_INPROC_SERVER, IID_IGraphBuilder, (void**)&graph));
IBaseFilter *pushSource = 0;
graph->AddSourceFilter(L"sample.bmp", L"Source", &pushSource);
IPin *srcOut = 0;
assert(S_OK == GetPin(pushSource, PINDIR_OUTPUT, &srcOut));
graph->Render(srcOut);
IMediaControl *c = 0;
IMediaEvent *pEvent;
assert(S_OK == graph->QueryInterface(IID_IMediaControl, (void**)&c));
assert(S_OK == graph->QueryInterface(IID_IMediaEvent, (void**)&pEvent));
HRESULT hr = c->Run();
if(hr != S_OK)
{
if(hr == S_FALSE)
{
OAFilterState state;
hr = c->GetState(INFINITE, &state);
assert(hr == S_OK );
}
}
long code;
assert(S_OK == pEvent->WaitForCompletion(INFINITE, &code));
Anyone knows how to fix this?
IBaseFilter *pushSource = 0;
graph->AddSourceFilter(L"sample.bmp", L"Source", &pushSource);
AddSourceFilter adds a default source filter, I don't think it will add your pushsource samplefilter.
I would recommend to add the graph to the ROT, so you can inspect it with graphedit.
And what happens if you don't call GetState()?
hr = pMediaControl->Run();
if(FAILED(hr)) {
/// handle error
}
long evCode=0;
while (evCode == 0)
{
pEvent->WaitForCompletion(1000, &evCode);
/// other code
}
Open GraphEditPlus, add your filter, render its pin and press Run. Then you'll see states of each filter separately, so you'll see what filter didn't run and why.