How to log sc_logic values using semantic? - systemc

I need to convert this code:
bool hw = (gpio_ctrl(idx) >> 5) & 1;
uint8_t cnfg = (gpio_ctrl(idx) >> 3) & 3;
sc_dt::sc_logic oe_n = Log_1; // Default disable
sc_dt::sc_logic od = Log_Z; // Default disable
DEBUG_PRINTF(("GPIO update_ouput: CTRL=0x%x IDX=%d, HW=%d CNFG=%d, OE_N=%d, od=%d\n", gpio_ctrl(idx), idx, hw, cnfg, oe_n, od));
into being logged with Semantic.
I tried simplistically doing this:
SEM_MSG(gpio_output_logic_update, SEM_INFO, "GPIO output logic update", "Control register index", ("value", SEM_ATTR_HEX), "HW (active low)", "Config style", "OE_N", "OD");
// ...
SEM_TRACE(gpio_output_logic_update, source, idx, gpio_ctrl(idx), hw, cnfg, oe_n, od);
but SEM_MSG doesn't magically understand sc_logic.

What I currently have done is to declare a semantic enum for the sc_logic values, and trace that using the value of the sc_logic. It's slightly convoluted to read ("why do I have to take the value() of the sc_logic variable?") but it works...
// Support logging sc_logic values
SEM_ENUM(sc_dt::sc_logic_value_t, (sc_dt::Log_0, "0"), (sc_dt::Log_1, "1"), (sc_dt::Log_Z, "Z"), (sc_dt::Log_X, "X"));
//...
SEM_MSG(gpio_output_logic_update, SEM_INFO, "GPIO output logic update", "index", ("value", SEM_ATTR_HEX), "HW", "CNFG", "OE_N", "OD");
//..
SEM_TRACE(gpio_output_logic_update, source, idx, gpio_ctrl(idx), hw, cnfg, oe_n.value(), od.value());

You can use to_char() method returning a char version of sc_logic. (or to_bool()). See 7.9.2.2 section of IEEE Std 1666-2011.
SEM_TRACE(gpio_output_logic_update, source, idx, gpio_ctrl(idx), hw, cnfg, oe_n.to_char(), od.to_char());

Related

Parallel Dynamic Programming with CUDA

It is my first attempt to implement recursion with CUDA. The goal is to extract all the combinations from a set of chars "12345" using the power of CUDA to parallelize dynamically the task. Here is my kernel:
__device__ char route[31] = { "_________________________"};
__device__ char init[6] = { "12345" };
__global__ void Recursive(int depth) {
// up to depth 6
if (depth == 5) return;
// newroute = route - idx
int x = depth * 6;
printf("%s\n", route);
int o = 0;
int newlen = 0;
for (int i = 0; i<6; ++i)
{
if (i != threadIdx.x)
{
route[i+x-o] = init[i];
newlen++;
}
else
{
o = 1;
}
}
Recursive<<<1,newlen>>>(depth + 1);
}
__global__ void RecursiveCount() {
Recursive <<<1,5>>>(0);
}
The idea is to exclude 1 item (the item corresponding to the threadIdx) in each different thread. In each recursive call, using the variable depth, it works over a different base (variable x) on the route device variable.
I expect the kernel prompts something like:
2345_____________________
1345_____________________
1245_____________________
1234_____________________
2345_345_________________
2345_245_________________
2345_234_________________
2345_345__45_____________
2345_345__35_____________
2345_345__34_____________
..
2345_245__45_____________
..
But it prompts ...
·_____________
·_____________
·_____________
·_____________
·_____________
·2345
·2345
·2345
·2345
...
What I´m doing wrong?
What I´m doing wrong?
I may not articulate every problem with your code, but these items should get you a lot closer.
I recommend providing a complete example. In my view it is basically required by Stack Overflow, see item 1 here, note use of the word "must". Your example is missing any host code, including the original kernel call. It's only a few extra lines of code, why not include it? Sure, in this case, I can deduce what the call must have been, but why not just include it? Anyway, based on the output you indicated, it seems fairly evident the launch configuration of the host launch would have to be <<<1,1>>>.
This doesn't seem to be logical to me:
I expect the kernel prompts something like:
2345_____________________
The very first thing your kernel does is print out the route variable, before making any changes to it, so I would expect _____________________. However we can "fix" this by moving the printout to the end of the kernel.
You may be confused about what a __device__ variable is. It is a global variable, and there is only one copy of it. Therefore, when you modify it in your kernel code, every thread, in every kernel, is attempting to modify the same global variable, at the same time. That cannot possibly have orderly results, in any thread-parallel environment. I chose to "fix" this by making a local copy for each thread to work on.
You have an off-by-1 error, as well as an extent error in this loop:
for (int i = 0; i<6; ++i)
The off-by-1 error is due to the fact that you are iterating over 6 possible items (that is, i can reach a value of 5) but there are only 5 items in your init variable (the 6th item being a null terminator. The correct indexing starts out over 0-4 (with one of those being skipped). On subsequent iteration depths, its necessary to reduce this indexing extent by 1. Note that I've chosen to fix the first error here by increasing the length of init. There are other ways to fix, of course. My method inserts an extra _ between depths in the result.
You assume that at each iteration depth, the correct choice of items is the same, and in the same order, i.e. init. However this is not the case. At each depth, the choices of items must be selected not from the unchanging init variable, but from the choices passed from previous depth. Therefore we need a local, per-thread copy of init also.
A few other comments about CUDA Dynamic Parallelism (CDP). When passing pointers to data from one kernel scope to a child scope, local space pointers cannot be used. Therefore I allocate for the local copy of route from the heap, so it can be passed to child kernels. init can be deduced from route, so we can use an ordinary local variable for myinit.
You're going to quickly hit some dynamic parallelism (and perhaps memory) limits here if you continue this. I believe the total number of kernel launches for this is 5^5, which is 3125 (I'm doing this quickly, I may be mistaken). CDP has a pending launch limit of 2000 kernels by default. We're not hitting this here according to what I see, but you'll run into that sooner or later if you increase the depth or width of this operation. Furthermore, in-kernel allocations from the device heap are by default limited to 8KB. I don't seem to be hitting that limit, but probably I am, so my design should probably be modified to fix that.
Finally, in-kernel printf output is limited to the size of a particular buffer. If this technique is not already hitting that limit, it will soon if you increase the width or depth.
Here is a worked example, attempting to address the various items above. I'm not claiming it is defect free, but I think the output is closer to your expectations. Note that due to character limits on SO answers, I've truncated/excerpted some of the output.
$ cat t1639.cu
#include <stdio.h>
__device__ char route[31] = { "_________________________"};
__device__ char init[7] = { "12345_" };
__global__ void Recursive(int depth, const char *oroute) {
char *nroute = (char *)malloc(31);
char myinit[7];
if (depth == 0) memcpy(myinit, init, 6);
else memcpy(myinit, oroute+(depth-1)*6, 6);
myinit[6] = 0;
if (nroute == NULL) {printf("oops\n"); return;}
memcpy(nroute, oroute, 30);
nroute[30] = 0;
// up to depth 6
if (depth == 5) return;
// newroute = route - idx
int x = depth * 6;
//printf("%s\n", nroute);
int o = 0;
int newlen = 0;
for (int i = 0; i<(6-depth); ++i)
{
if (i != threadIdx.x)
{
nroute[i+x-o] = myinit[i];
newlen++;
}
else
{
o = 1;
}
}
printf("%s\n", nroute);
Recursive<<<1,newlen>>>(depth + 1, nroute);
}
__global__ void RecursiveCount() {
Recursive <<<1,5>>>(0, route);
}
int main(){
RecursiveCount<<<1,1>>>();
cudaDeviceSynchronize();
}
$ nvcc -o t1639 t1639.cu -rdc=true -lcudadevrt -arch=sm_70
$ cuda-memcheck ./t1639
========= CUDA-MEMCHECK
2345_____________________
1345_____________________
1245_____________________
1235_____________________
1234_____________________
2345__345________________
2345__245________________
2345__235________________
2345__234________________
2345__2345_______________
2345__345___45___________
2345__345___35___________
2345__345___34___________
2345__345___345__________
2345__345___45____5______
2345__345___45____4______
2345__345___45____45_____
2345__345___45____5______
2345__345___45____5_____5
2345__345___45____4______
2345__345___45____4_____4
2345__345___45____45____5
2345__345___45____45____4
2345__345___35____5______
2345__345___35____3______
2345__345___35____35_____
2345__345___35____5______
2345__345___35____5_____5
2345__345___35____3______
2345__345___35____3_____3
2345__345___35____35____5
2345__345___35____35____3
2345__345___34____4______
2345__345___34____3______
2345__345___34____34_____
2345__345___34____4______
2345__345___34____4_____4
2345__345___34____3______
2345__345___34____3_____3
2345__345___34____34____4
2345__345___34____34____3
2345__345___345___45_____
2345__345___345___35_____
2345__345___345___34_____
2345__345___345___45____5
2345__345___345___45____4
2345__345___345___35____5
2345__345___345___35____3
2345__345___345___34____4
2345__345___345___34____3
2345__245___45___________
2345__245___25___________
2345__245___24___________
2345__245___245__________
2345__245___45____5______
2345__245___45____4______
2345__245___45____45_____
2345__245___45____5______
2345__245___45____5_____5
2345__245___45____4______
2345__245___45____4_____4
2345__245___45____45____5
2345__245___45____45____4
2345__245___25____5______
2345__245___25____2______
2345__245___25____25_____
2345__245___25____5______
2345__245___25____5_____5
2345__245___25____2______
2345__245___25____2_____2
2345__245___25____25____5
2345__245___25____25____2
2345__245___24____4______
2345__245___24____2______
2345__245___24____24_____
2345__245___24____4______
2345__245___24____4_____4
2345__245___24____2______
2345__245___24____2_____2
2345__245___24____24____4
2345__245___24____24____2
2345__245___245___45_____
2345__245___245___25_____
2345__245___245___24_____
2345__245___245___45____5
2345__245___245___45____4
2345__245___245___25____5
2345__245___245___25____2
2345__245___245___24____4
2345__245___245___24____2
2345__235___35___________
2345__235___25___________
2345__235___23___________
2345__235___235__________
2345__235___35____5______
2345__235___35____3______
2345__235___35____35_____
2345__235___35____5______
2345__235___35____5_____5
2345__235___35____3______
2345__235___35____3_____3
2345__235___35____35____5
2345__235___35____35____3
2345__235___25____5______
2345__235___25____2______
2345__235___25____25_____
2345__235___25____5______
2345__235___25____5_____5
2345__235___25____2______
2345__235___25____2_____2
2345__235___25____25____5
2345__235___25____25____2
2345__235___23____3______
2345__235___23____2______
2345__235___23____23_____
2345__235___23____3______
2345__235___23____3_____3
2345__235___23____2______
2345__235___23____2_____2
2345__235___23____23____3
2345__235___23____23____2
2345__235___235___35_____
2345__235___235___25_____
2345__235___235___23_____
2345__235___235___35____5
2345__235___235___35____3
2345__235___235___25____5
2345__235___235___25____2
2345__235___235___23____3
2345__235___235___23____2
2345__234___34___________
2345__234___24___________
2345__234___23___________
2345__234___234__________
2345__234___34____4______
2345__234___34____3______
2345__234___34____34_____
2345__234___34____4______
2345__234___34____4_____4
2345__234___34____3______
2345__234___34____3_____3
2345__234___34____34____4
2345__234___34____34____3
2345__234___24____4______
2345__234___24____2______
2345__234___24____24_____
2345__234___24____4______
2345__234___24____4_____4
2345__234___24____2______
2345__234___24____2_____2
2345__234___24____24____4
2345__234___24____24____2
2345__234___23____3______
2345__234___23____2______
2345__234___23____23_____
2345__234___23____3______
2345__234___23____3_____3
2345__234___23____2______
2345__234___23____2_____2
2345__234___23____23____3
2345__234___23____23____2
2345__234___234___34_____
2345__234___234___24_____
2345__234___234___23_____
2345__234___234___34____4
2345__234___234___34____3
2345__234___234___24____4
2345__234___234___24____2
2345__234___234___23____3
2345__234___234___23____2
2345__2345__345__________
2345__2345__245__________
2345__2345__235__________
2345__2345__234__________
2345__2345__345___45_____
2345__2345__345___35_____
2345__2345__345___34_____
2345__2345__345___45____5
2345__2345__345___45____4
2345__2345__345___35____5
2345__2345__345___35____3
2345__2345__345___34____4
2345__2345__345___34____3
2345__2345__245___45_____
2345__2345__245___25_____
2345__2345__245___24_____
2345__2345__245___45____5
2345__2345__245___45____4
2345__2345__245___25____5
2345__2345__245___25____2
2345__2345__245___24____4
2345__2345__245___24____2
2345__2345__235___35_____
2345__2345__235___25_____
2345__2345__235___23_____
2345__2345__235___35____5
2345__2345__235___35____3
2345__2345__235___25____5
2345__2345__235___25____2
2345__2345__235___23____3
2345__2345__235___23____2
2345__2345__234___34_____
2345__2345__234___24_____
2345__2345__234___23_____
2345__2345__234___34____4
2345__2345__234___34____3
2345__2345__234___24____4
2345__2345__234___24____2
2345__2345__234___23____3
2345__2345__234___23____2
1345__345________________
1345__145________________
1345__135________________
1345__134________________
1345__1345_______________
1345__345___45___________
1345__345___35___________
1345__345___34___________
1345__345___345__________
1345__345___45____5______
1345__345___45____4______
1345__345___45____45_____
1345__345___45____5______
1345__345___45____5_____5
1345__345___45____4______
1345__345___45____4_____4
1345__345___45____45____5
1345__345___45____45____4
1345__345___35____5______
1345__345___35____3______
1345__345___35____35_____
1345__345___35____5______
1345__345___35____5_____5
1345__345___35____3______
1345__345___35____3_____3
1345__345___35____35____5
1345__345___35____35____3
1345__345___34____4______
1345__345___34____3______
1345__345___34____34_____
1345__345___34____4______
1345__345___34____4_____4
1345__345___34____3______
1345__345___34____3_____3
1345__345___34____34____4
1345__345___34____34____3
1345__345___345___45_____
1345__345___345___35_____
1345__345___345___34_____
1345__345___345___45____5
1345__345___345___45____4
1345__345___345___35____5
1345__345___345___35____3
1345__345___345___34____4
1345__345___345___34____3
1345__145___45___________
1345__145___15___________
1345__145___14___________
1345__145___145__________
1345__145___45____5______
1345__145___45____4______
1345__145___45____45_____
1345__145___45____5______
1345__145___45____5_____5
1345__145___45____4______
1345__145___45____4_____4
1345__145___45____45____5
1345__145___45____45____4
1345__145___15____5______
1345__145___15____1______
1345__145___15____15_____
1345__145___15____5______
1345__145___15____5_____5
1345__145___15____1______
1345__145___15____1_____1
1345__145___15____15____5
1345__145___15____15____1
1345__145___14____4______
1345__145___14____1______
1345__145___14____14_____
1345__145___14____4______
1345__145___14____4_____4
1345__145___14____1______
1345__145___14____1_____1
1345__145___14____14____4
1345__145___14____14____1
1345__145___145___45_____
1345__145___145___15_____
1345__145___145___14_____
1345__145___145___45____5
1345__145___145___45____4
1345__145___145___15____5
1345__145___145___15____1
1345__145___145___14____4
1345__145___145___14____1
1345__135___35___________
1345__135___15___________
1345__135___13___________
1345__135___135__________
1345__135___35____5______
1345__135___35____3______
1345__135___35____35_____
1345__135___35____5______
1345__135___35____5_____5
1345__135___35____3______
1345__135___35____3_____3
1345__135___35____35____5
1345__135___35____35____3
1345__135___15____5______
1345__135___15____1______
1345__135___15____15_____
1345__135___15____5______
1345__135___15____5_____5
1345__135___15____1______
1345__135___15____1_____1
1345__135___15____15____5
1345__135___15____15____1
1345__135___13____3______
1345__135___13____1______
1345__135___13____13_____
1345__135___13____3______
1345__135___13____3_____3
1345__135___13____1______
1345__135___13____1_____1
1345__135___13____13____3
1345__135___13____13____1
1345__135___135___35_____
1345__135___135___15_____
1345__135___135___13_____
1345__135___135___35____5
1345__135___135___35____3
1345__135___135___15____5
1345__135___135___15____1
1345__135___135___13____3
1345__135___135___13____1
1345__134___34___________
1345__134___14___________
1345__134___13___________
1345__134___134__________
1345__134___34____4______
1345__134___34____3______
1345__134___34____34_____
1345__134___34____4______
1345__134___34____4_____4
1345__134___34____3______
1345__134___34____3_____3
1345__134___34____34____4
1345__134___34____34____3
1345__134___14____4______
1345__134___14____1______
1345__134___14____14_____
1345__134___14____4______
1345__134___14____4_____4
1345__134___14____1______
1345__134___14____1_____1
1345__134___14____14____4
1345__134___14____14____1
1345__134___13____3______
1345__134___13____1______
1345__134___13____13_____
1345__134___13____3______
1345__134___13____3_____3
1345__134___13____1______
1345__134___13____1_____1
1345__134___13____13____3
1345__134___13____13____1
1345__134___134___34_____
1345__134___134___14_____
1345__134___134___13_____
1345__134___134___34____4
1345__134___134___34____3
1345__134___134___14____4
1345__134___134___14____1
1345__134___134___13____3
1345__134___134___13____1
1345__1345__345__________
1345__1345__145__________
1345__1345__135__________
1345__1345__134__________
1345__1345__345___45_____
1345__1345__345___35_____
1345__1345__345___34_____
1345__1345__345___45____5
1345__1345__345___45____4
1345__1345__345___35____5
1345__1345__345___35____3
1345__1345__345___34____4
1345__1345__345___34____3
1345__1345__145___45_____
1345__1345__145___15_____
1345__1345__145___14_____
1345__1345__145___45____5
1345__1345__145___45____4
1345__1345__145___15____5
1345__1345__145___15____1
1345__1345__145___14____4
1345__1345__145___14____1
1345__1345__135___35_____
1345__1345__135___15_____
1345__1345__135___13_____
1345__1345__135___35____5
1345__1345__135___35____3
1345__1345__135___15____5
1345__1345__135___15____1
1345__1345__135___13____3
1345__1345__135___13____1
1345__1345__134___34_____
1345__1345__134___14_____
1345__1345__134___13_____
1345__1345__134___34____4
1345__1345__134___34____3
1345__1345__134___14____4
1345__1345__134___14____1
1345__1345__134___13____3
1345__1345__134___13____1
1245__245________________
1245__145________________
1245__125________________
1245__124________________
1245__1245_______________
1245__245___45___________
1245__245___25___________
1245__245___24___________
1245__245___245__________
1245__245___45____5______
1245__245___45____4______
1245__245___45____45_____
1245__245___45____5______
1245__245___45____5_____5
1245__245___45____4______
1245__245___45____4_____4
1245__245___45____45____5
1245__245___45____45____4
1245__245___25____5______
1245__245___25____2______
1245__245___25____25_____
1245__245___25____5______
1245__245___25____5_____5
1245__245___25____2______
1245__245___25____2_____2
1245__245___25____25____5
1245__245___25____25____2
1245__245___24____4______
1245__245___24____2______
1245__245___24____24_____
1245__245___24____4______
1245__245___24____4_____4
1245__245___24____2______
1245__245___24____2_____2
1245__245___24____24____4
1245__245___24____24____2
1245__245___245___45_____
1245__245___245___25_____
1245__245___245___24_____
1245__245___245___45____5
1245__245___245___45____4
1245__245___245___25____5
1245__245___245___25____2
1245__245___245___24____4
1245__245___245___24____2
1245__145___45___________
1245__145___15___________
1245__145___14___________
1245__145___145__________
1245__145___45____5______
1245__145___45____4______
1245__145___45____45_____
1245__145___45____5______
1245__145___45____5_____5
1245__145___45____4______
...
1235__1235__235___25_____
1235__1235__235___23_____
1235__1235__235___35____5
1235__1235__235___35____3
1235__1235__235___25____5
1235__1235__235___25____2
1235__1235__235___23____3
1235__1235__235___23____2
1235__1235__135___35_____
1235__1235__135___15_____
1235__1235__135___13_____
1235__1235__135___35____5
1235__1235__135___35____3
1235__1235__135___15____5
1235__1235__135___15____1
1235__1235__135___13____3
1235__1235__135___13____1
1235__1235__125___25_____
1235__1235__125___15_____
1235__1235__125___12_____
1235__1235__125___25____5
1235__1235__125___25____2
1235__1235__125___15____5
1235__1235__125___15____1
1235__1235__125___12____2
1235__1235__125___12____1
1235__1235__123___23_____
1235__1235__123___13_____
1235__1235__123___12_____
1235__1235__123___23____3
1235__1235__123___23____2
1235__1235__123___13____3
1235__1235__123___13____1
1235__1235__123___12____2
1235__1235__123___12____1
1234__234________________
1234__134________________
1234__124________________
1234__123________________
1234__1234_______________
1234__234___34___________
1234__234___24___________
1234__234___23___________
1234__234___234__________
1234__234___34____4______
1234__234___34____3______
1234__234___34____34_____
1234__234___34____4______
1234__234___34____4_____4
1234__234___34____3______
1234__234___34____3_____3
1234__234___34____34____4
1234__234___34____34____3
1234__234___24____4______
1234__234___24____2______
1234__234___24____24_____
1234__234___24____4______
1234__234___24____4_____4
1234__234___24____2______
1234__234___24____2_____2
1234__234___24____24____4
1234__234___24____24____2
1234__234___23____3______
1234__234___23____2______
1234__234___23____23_____
1234__234___23____3______
1234__234___23____3_____3
1234__234___23____2______
1234__234___23____2_____2
1234__234___23____23____3
1234__234___23____23____2
1234__234___234___34_____
1234__234___234___24_____
1234__234___234___23_____
1234__234___234___34____4
1234__234___234___34____3
1234__234___234___24____4
1234__234___234___24____2
1234__234___234___23____3
1234__234___234___23____2
1234__134___34___________
1234__134___14___________
1234__134___13___________
1234__134___134__________
1234__134___34____4______
1234__134___34____3______
1234__134___34____34_____
1234__134___34____4______
1234__134___34____4_____4
1234__134___34____3______
1234__134___34____3_____3
1234__134___34____34____4
1234__134___34____34____3
1234__134___14____4______
1234__134___14____1______
1234__134___14____14_____
1234__134___14____4______
1234__134___14____4_____4
1234__134___14____1______
1234__134___14____1_____1
1234__134___14____14____4
1234__134___14____14____1
1234__134___13____3______
1234__134___13____1______
1234__134___13____13_____
1234__134___13____3______
1234__134___13____3_____3
1234__134___13____1______
1234__134___13____1_____1
1234__134___13____13____3
1234__134___13____13____1
1234__134___134___34_____
1234__134___134___14_____
1234__134___134___13_____
1234__134___134___34____4
1234__134___134___34____3
1234__134___134___14____4
1234__134___134___14____1
1234__134___134___13____3
1234__134___134___13____1
1234__124___24___________
1234__124___14___________
1234__124___12___________
1234__124___124__________
1234__124___24____4______
1234__124___24____2______
1234__124___24____24_____
1234__124___24____4______
1234__124___24____4_____4
1234__124___24____2______
1234__124___24____2_____2
1234__124___24____24____4
1234__124___24____24____2
1234__124___14____4______
1234__124___14____1______
1234__124___14____14_____
1234__124___14____4______
1234__124___14____4_____4
1234__124___14____1______
1234__124___14____1_____1
1234__124___14____14____4
1234__124___14____14____1
1234__124___12____2______
1234__124___12____1______
1234__124___12____12_____
1234__124___12____2______
1234__124___12____2_____2
1234__124___12____1______
1234__124___12____1_____1
1234__124___12____12____2
1234__124___12____12____1
1234__124___124___24_____
1234__124___124___14_____
1234__124___124___12_____
1234__124___124___24____4
1234__124___124___24____2
1234__124___124___14____4
1234__124___124___14____1
1234__124___124___12____2
1234__124___124___12____1
1234__123___23___________
1234__123___13___________
1234__123___12___________
1234__123___123__________
1234__123___23____3______
1234__123___23____2______
1234__123___23____23_____
1234__123___23____3______
1234__123___23____3_____3
1234__123___23____2______
1234__123___23____2_____2
1234__123___23____23____3
1234__123___23____23____2
1234__123___13____3______
1234__123___13____1______
1234__123___13____13_____
1234__123___13____3______
1234__123___13____3_____3
1234__123___13____1______
1234__123___13____1_____1
1234__123___13____13____3
1234__123___13____13____1
1234__123___12____2______
1234__123___12____1______
1234__123___12____12_____
1234__123___12____2______
1234__123___12____2_____2
1234__123___12____1______
1234__123___12____1_____1
1234__123___12____12____2
1234__123___12____12____1
1234__123___123___23_____
1234__123___123___13_____
1234__123___123___12_____
1234__123___123___23____3
1234__123___123___23____2
1234__123___123___13____3
1234__123___123___13____1
1234__123___123___12____2
1234__123___123___12____1
1234__1234__234__________
1234__1234__134__________
1234__1234__124__________
1234__1234__123__________
1234__1234__234___34_____
1234__1234__234___24_____
1234__1234__234___23_____
1234__1234__234___34____4
1234__1234__234___34____3
1234__1234__234___24____4
1234__1234__234___24____2
1234__1234__234___23____3
1234__1234__234___23____2
1234__1234__134___34_____
1234__1234__134___14_____
1234__1234__134___13_____
1234__1234__134___34____4
1234__1234__134___34____3
1234__1234__134___14____4
1234__1234__134___14____1
1234__1234__134___13____3
1234__1234__134___13____1
1234__1234__124___24_____
1234__1234__124___14_____
1234__1234__124___12_____
1234__1234__124___24____4
1234__1234__124___24____2
1234__1234__124___14____4
1234__1234__124___14____1
1234__1234__124___12____2
1234__1234__124___12____1
1234__1234__123___23_____
1234__1234__123___13_____
1234__1234__123___12_____
1234__1234__123___23____3
1234__1234__123___23____2
1234__1234__123___13____3
1234__1234__123___13____1
1234__1234__123___12____2
1234__1234__123___12____1
========= ERROR SUMMARY: 0 errors
$
The answer given by Robert Crovella is correct at the 5th point, the mistake was in the using of init in every recursive call, but I want to clarify something that can be useful for other beginners with CUDA.
I used this variable because when I tried to launch a child kernel passing a local variable I always got the exception: Error: a pointer to local memory cannot be passed to a launch as an argument.
As I´m C# expert developer I´m not used to using pointers (Ref does the low-level-work for that) so I thought there was no way to do it in CUDA/c programming.
As Robert shows in its code it is possible copying the pointer with memalloc for using it as a referable argument.
Here is a kernel simplified as an example of deep recursion.
__device__ char init[6] = { "12345" };
__global__ void Recursive(int depth, const char* route) {
// up to depth 6
if (depth == 5) return;
//declaration for a referable argument (point 6)
char* newroute = (char*)malloc(6);
memcpy(newroute, route, 5);
int o = 0;
int newlen = 0;
for (int i = 0; i < (6 - depth); ++i)
{
if (i != threadIdx.x)
{
newroute[i - o] = route[i];
newlen++;
}
else
{
o = 1;
}
}
printf("%s\n", newroute);
Recursive <<<1, newlen>>>(depth + 1, newroute);
}
__global__ void RecursiveCount() {
Recursive <<<1, 5>>>(0, init);
}
I don't add the main call because I´m using ManagedCUDA for C# but as Robert says it can be figured-out how the call RecursiveCount is.
About ending arrays of char with /0 ... sorry but I don't know exactly what is the benefit; this code works fine without them.

GoLang, REST, PATCH and building an UPDATE query

since few days I was struggling on how to proceed with PATCH request in Go REST API until I have found an article about using pointers and omitempty tag which I have populated and is working fine. Fine until I have realized I still have to build an UPDATE SQL query.
My struct looks like this:
type Resource struct {
Name *string `json:"name,omitempty" sql:"resource_id"`
Description *string `json:"description,omitempty" sql:"description"`
}
I am expecting a PATCH /resources/{resource-id} request containing such a request body:
{"description":"Some new description"}
In my handler I will build the Resource object this way (ignoring imports, ignoring error handling):
var resource Resource
resourceID, _ := mux.Vars(r)["resource-id"]
d := json.NewDecoder(r.Body)
d.Decode(&resource)
// at this point our resource object should only contain
// the Description field with the value from JSON in request body
Now, for normal UPDATE (PUT request) I would do this (simplified):
stmt, _ := db.Prepare(`UPDATE resources SET description = ?, name = ? WHERE resource_id = ?`)
res, _ := stmt.Exec(resource.Description, resource.Name, resourceID)
The problem with PATCH and omitempty tag is that the object might be missing multiple properties, thus I cannot just prepare a statement with hardcoded fields and placeholders... I will have to build it dynamically.
And here comes my question: how can I build such UPDATE query dynamically? In the best case I'd need some solution with identifying the set properties, getting their SQL field names (probably from the tags) and then I should be able to build the UPDATE query. I know I can use reflection to get the object properties but have no idea hot to get their sql tag name and of course I'd like to avoid using reflection here if possible... Or I could simply check for each property it is not nil, but in real life the structs are much bigger than provided example here...
Can somebody help me with this one? Did somebody already have to solve the same/similar situation?
SOLUTION:
Based on the answers here I was able to come up with this abstract solution. The SQLPatches method builds the SQLPatch struct from the given struct (so no concrete struct specific):
import (
"fmt"
"encoding/json"
"reflect"
"strings"
)
const tagname = "sql"
type SQLPatch struct {
Fields []string
Args []interface{}
}
func SQLPatches(resource interface{}) SQLPatch {
var sqlPatch SQLPatch
rType := reflect.TypeOf(resource)
rVal := reflect.ValueOf(resource)
n := rType.NumField()
sqlPatch.Fields = make([]string, 0, n)
sqlPatch.Args = make([]interface{}, 0, n)
for i := 0; i < n; i++ {
fType := rType.Field(i)
fVal := rVal.Field(i)
tag := fType.Tag.Get(tagname)
// skip nil properties (not going to be patched), skip unexported fields, skip fields to be skipped for SQL
if fVal.IsNil() || fType.PkgPath != "" || tag == "-" {
continue
}
// if no tag is set, use the field name
if tag == "" {
tag = fType.Name
}
// and make the tag lowercase in the end
tag = strings.ToLower(tag)
sqlPatch.Fields = append(sqlPatch.Fields, tag+" = ?")
var val reflect.Value
if fVal.Kind() == reflect.Ptr {
val = fVal.Elem()
} else {
val = fVal
}
switch val.Kind() {
case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64:
sqlPatch.Args = append(sqlPatch.Args, val.Int())
case reflect.String:
sqlPatch.Args = append(sqlPatch.Args, val.String())
case reflect.Bool:
if val.Bool() {
sqlPatch.Args = append(sqlPatch.Args, 1)
} else {
sqlPatch.Args = append(sqlPatch.Args, 0)
}
}
}
return sqlPatch
}
Then I can simply call it like this:
type Resource struct {
Description *string `json:"description,omitempty"`
Name *string `json:"name,omitempty"`
}
func main() {
var r Resource
json.Unmarshal([]byte(`{"description": "new description"}`), &r)
sqlPatch := SQLPatches(r)
data, _ := json.Marshal(sqlPatch)
fmt.Printf("%s\n", data)
}
You can check it at Go Playground. The only problem here I see is that I allocate both the slices with the amount of fields in the passed struct, which may be 10, even though I might only want to patch one property in the end resulting in allocating more memory than needed... Any idea how to avoid this?
I recently had same problem. about PATCH and looking around found this article. It also makes references to the RFC 5789 where it says:
The difference between the PUT and PATCH requests is reflected in the way the server processes the enclosed entity to modify the resource identified by the Request-URI. In a PUT request, the enclosed entity is considered to be a modified version of the resource stored on the origin server, and the client is requesting that the stored version be replaced. With PATCH, however, the enclosed entity contains a set of instructions describing how a resource currently residing on the origin server should be modified to produce a new version. The PATCH method affects the resource identified by the Request-URI, and it also MAY have side effects on other resources; i.e., new resources may be created, or existing ones modified, by the application of a PATCH.
e.g:
[
{ "op": "test", "path": "/a/b/c", "value": "foo" },
{ "op": "remove", "path": "/a/b/c" },
{ "op": "add", "path": "/a/b/c", "value": [ "foo", "bar" ] },
{ "op": "replace", "path": "/a/b/c", "value": 42 },
{ "op": "move", "from": "/a/b/c", "path": "/a/b/d" },
{ "op": "copy", "from": "/a/b/d", "path": "/a/b/e" }
]
This set of instructions should make it easier to build the update query.
EDIT
This is how you would obtain sql tags but you will have to use reflection:
type Resource struct {
Name *string `json:"name,omitempty" sql:"resource_id"`
Description *string `json:"description,omitempty" sql:"description"`
}
sp := "sort of string"
r := Resource{Description: &sp}
rt := reflect.TypeOf(r) // reflect.Type
rv := reflect.ValueOf(r) // reflect.Value
for i := 0; i < rv.NumField(); i++ { // Iterate over all the fields
if !rv.Field(i).IsNil() { // Check it is not nil
// Here you would do what you want to having the sql tag.
// Creating the query would be easy, however
// not sure you would execute the statement
fmt.Println(rt.Field(i).Tag.Get("sql")) // Output: description
}
}
I understand you don't want to use reflection, but still this may be a better answer than the previous one as you comment state.
EDIT 2:
About the allocation - read this guide lines of Effective Go about Data structures and Allocation:
// Here you are allocating an slice of 0 length with a capacity of n
sqlPatch.Fields = make([]string, 0, n)
sqlPatch.Args = make([]interface{}, 0, n)
With make(Type, Length, Capacity (optional))
Consider the following example:
// newly allocated zeroed value with Composite Literal
// length: 0
// capacity: 0
testSlice := []int{}
fmt.Println(len(testSlice), cap(testSlice)) // 0 0
fmt.Println(testSlice) // []
// newly allocated non zeroed value with make
// length: 0
// capacity: 10
testSlice = make([]int, 0, 10)
fmt.Println(len(testSlice), cap(testSlice)) // 0 10
fmt.Println(testSlice) // []
// newly allocated non zeroed value with make
// length: 2
// capacity: 4
testSlice = make([]int, 2, 4)
fmt.Println(len(testSlice), cap(testSlice)) // 2 4
fmt.Println(testSlice) // [0 0]
In your case, may want to the following:
// Replace this
sqlPatch.Fields = make([]string, 0, n)
sqlPatch.Args = make([]interface{}, 0, n)
// With this or simple omit the capacity in make above
sqlPatch.Fields = []string{}
sqlPatch.Args = []interface{}{}
// The allocation will go as follow: length - capacity
testSlice := []int{} // 0 - 0
testSlice = append(testSlice, 1) // 1 - 2
testSlice = append(testSlice, 1) // 2 - 2
testSlice = append(testSlice, 1) // 3 - 4
testSlice = append(testSlice, 1) // 4 - 4
testSlice = append(testSlice, 1) // 5 - 8
Alright, I think the solution I used back in 2016 was quite over-engineered for even more over-engineered problem and was completely unnecessary. The question asked here was very generalized, however we were building a solution that was able to build its SQL query on its own and based on the JSON object or query parameters and/or Headers sent in the request. And that to be as generic as possible.
Nowadays I think the best solution is to avoid PATCH unless truly necessary. And even then you still can use PUT and replace the whole resource with patched property/ies coming already from the client - i.e. not giving the client the option/possibility to send any PATCH request to your server and to deal with partial updates on their own.
However this is not always recommended, especially in cases of bigger objects to save some C02 by reducing the amount of redundant transmitted data. Whenever today if I need to enable a PATCH for the client I simply define what can be patched - this gives me clarity and the final struct.
Note that I am using a IETF documented JSON Merge Patch implementation. I consider that of JSON Patch (also documented by IETF) redundant as hypothetically we could replace the whole REST API by having one single JSON Patch endpoint and let clients control the resources via allowed operations. I also think the implementation of such JSON Patch on the server side is way more complicated. The only use-case I could think of using such implementation is if I was implementing a REST API over a file system...
So the struct may be defined as in my OP:
type ResourcePatch struct {
ResourceID some.UUID `json:"resource_id"`
Description *string `json:"description,omitempty"`
Name *string `json:"name,omitempty"`
}
In the handler func I'd decode the ID from the path into the ResourcePatch instance and unmarshall JSON from the request body into it, too.
Sending only this
{"description":"Some new description"}
to PATCH /resources/<UUID>
I should end up with with this object:
ResourcePatch
* ResourceID {"UUID"}
* Description {"Some new description"}
And now the magic: use a simple logic to build the query and exec parameters. For some it may seem tedious or repetitive or unclean for bigger PATCH objects, but my reply to this would be: if your PATCH object consists of more than 50% of the original resource' properties (or simply too many for your liking) use PUT and expect the clients to send (and replace) the whole resource instead.
It could look like this:
func (s Store) patchMyResource(r models.ResourcePatch) error {
q := `UPDATE resources SET `
qParts := make([]string, 0, 2)
args := make([]interface{}, 0, 2)
if r.Description != nil {
qParts = append(qParts, `description = ?`)
args = append(args, r.Description)
}
if r.Name != nil {
qParts = append(qParts, `name = ?`)
args = append(args, r.Name)
}
q += strings.Join(qParts, ',') + ` WHERE resource_id = ?`
args = append(args, r.ResourceID)
_, err := s.db.Exec(q, args...)
return err
}
I think there's nothing simpler and more effective. No reflection, no over-kills, reads quite good.
Struct tags are only visible through reflection, sorry.
If you don't want to use reflection (or, I think, even if you do), I think it is Go-like to define a function or method that "marshals" your struct into something that can easily be turned into a comma-separated list of SQL updates, and then use that. Build small things to help solve your problems.
For example given:
type Resource struct {
Name *string `json:"name,omitempty" sql:"resource_id"`
Description *string `json:"description,omitempty" sql:"description"`
}
You might define:
func (r Resource) SQLUpdates() SQLUpdates {
var s SQLUpdates
if (r.Name != nil) {
s.add("resource_id", *r.Name)
}
if (r.Description != nil) {
s.add("description", *r.Description)
}
}
where the type SQLUpdates looks something like this:
type SQLUpdates struct {
assignments []string
values []interface{}
}
func (s *SQLUpdates) add(key string, value interface{}) {
if (s.assignments == nil) {
s.assignments = make([]string, 0, 1)
}
if (s.values == nil) {
s.values = make([]interface{}, 0, 1)
}
s.assignments = append(s.assignments, fmt.Sprintf("%s = ?", key))
s.values = append(s.values, value)
}
func (s SQLUpdates) Assignments() string {
return strings.Join(s.assignments, ", ")
}
func (s SQLUpdates) Values() []interface{} {
return s.values
}
See it working (sorta) here: https://play.golang.org/p/IQAHgqfBRh
If you have deep structs-within-structs, it should be fairly easy to build on this. And if you change to an SQL engine that allows or encourages positional arguments like $1 instead of ?, it's easy to add that behavior to just the SQLUpdates struct without changing any code that used it.
For the purpose of getting arguments to pass to Exec, you would just expand the output of Values() with the ... operator.

NSObject description and custom summaries in Xcode

I override object's -(NSString*)description however Xcode always displays error: summary string parsing error in summary field in variables view.
My current implementation is the following:
- (NSString*)description {
return [NSString stringWithFormat:#"<%# %p> x=%f, y=%f", self.class, self, _x, _y];
}
If I type po objectName in console, LLDB shows a fine output as expected, however Xcode and command p objectName always indicate error, so what's the proper debug description format to make summary field work? Worth to notice that the output of "p" command is the same as a summary message that you see in Xcode for instances of Foundation classes.
Update:
As far as I can see from "WWDC 2012 session Debugging in Xcode", custom summaries can be implemented using Custom python script only. -(NSString*)description or -(NSString*)debugDescription methods are not connected anyhow to summary messages. I thought they are because I got an error displayed, but it seems it's a standard message for classes that do not have their own formatters.
I would suggest at least:
- (NSString*)description {
return [NSString stringWithFormat:#"%#; x=%f, y=%f", [super description], _x, _y];
}
So that you're not manually replicating the NSObject default and thereby blocking any non-default behaviour your superclass may have opted to include.
Beyond that, "summary string parsing error" is an lldb error. It's being reported by the debugger only. Per its documentation, po is correct for Objective-C objects; p is for C or C++ objects. So you needn't heed that error — it's essentially just telling you that you used the wrong lldb command.
EDIT: for what it's worth, the method used by CFArray is open source and looks like:
static CFStringRef __CFArrayCopyDescription(CFTypeRef cf) {
CFArrayRef array = (CFArrayRef)cf;
CFMutableStringRef result;
const CFArrayCallBacks *cb;
CFAllocatorRef allocator;
CFIndex idx, cnt;
cnt = __CFArrayGetCount(array);
allocator = CFGetAllocator(array);
result = CFStringCreateMutable(allocator, 0);
switch (__CFArrayGetType(array)) {
case __kCFArrayImmutable:
CFStringAppendFormat(result, NULL, CFSTR("<CFArray %p [%p]>{type = immutable, count = %u, values = (%s"), cf, allocator, cnt, cnt ? "\n" : "");
break;
case __kCFArrayDeque:
CFStringAppendFormat(result, NULL, CFSTR("<CFArray %p [%p]>{type = mutable-small, count = %u, values = (%s"), cf, allocator, cnt, cnt ? "\n" : "");
break;
}
cb = __CFArrayGetCallBacks(array);
for (idx = 0; idx < cnt; idx++) {
CFStringRef desc = NULL;
const void *val = __CFArrayGetBucketAtIndex(array, idx)->_item;
if (NULL != cb->copyDescription) {
desc = (CFStringRef)INVOKE_CALLBACK1(cb->copyDescription, val);
}
if (NULL != desc) {
CFStringAppendFormat(result, NULL, CFSTR("\t%u : %#\n"), idx, desc);
CFRelease(desc);
} else {
CFStringAppendFormat(result, NULL, CFSTR("\t%u : <%p>\n"), idx, val);
}
}
CFStringAppend(result, CFSTR(")}"));
return result;
}
As with the other comments above, I'm willing to gamble that the answer is: Xcode's debugger isn't smart in any sense and definitely isn't smart enough to use the correct po means of getting an Objective-C description; if your object is an uninflected Objective-C object then the debugger isn't going to be able to figure it out.

Convert (vectorize) code with per32-bit element conditional to SSE2 SSE3

I want to vectorize code for Core2. I think, I can use intrinsic functions from gcc or icc, and the SSE, SSE2, SSE3, SSSE3 instructions are allowed.
My code works on arrays of 8 uint32_t elements and it is like this (only hotspot is here):
const uint32_t p[8] = {2147483743, 2147483713, 2147483693, 2147483659,
2147483647, 2147483629, 2147483587, 2147483579};
void vector_mod_add(uint32_t *a /* a[8] */, uint32_t *b /* b[8] */) {
int n;
for(n=0;n<8;n++)
a[n]+=b[n];
for(n=0;n<8;n++)
if(a[n]>=p[n])
a[n]-=p[n];
}
Addition is rather easy, but I don't know how it is possible to do an conditional subtraction.
Also, I have no experience in manual vectorizing with SSE2, so, please, tell me how should I define all types here.
You can write it as a[n] -= p[n] & ~(a[n] < p[n]). Note that the < here is not the C one, it's the SSE one (pcmpltd) that returns -1 in each true element and 0 in each false element (to allow the AND operation), and &~ is pandn. Here is an attempt at the code:
__m128i a, p;
a = _mm_sub_epi32(a, _mm_andnot_si128(_mm_cmplt_epi32(a, p), p));
Note that this uses signed operations, and so your numbers will need to stay below 2^31 - 1 for it to work correctly. If you need to go beyond that, change _mm_cmplt_epi32(a, p) to _mm_cmplt_epi32(_mm_xor_si128(a, signs), _mm_xor_si128(p, signs)), where signs is a vector of 32-bit words whose elements are all 0x80000000. Here is a version that seems like it will handle wider ranges more efficiently:
__m128i a, p;
a = _mm_sub_epi32(a, p);
a = _mm_add_epi32(a, _mm_and_si128(_mm_srai_epi32(a, 31), p));

Trying to parse OpenCV YAML ouput with yaml-cpp

I've got a series of OpenCv generated YAML files and would like to parse them with yaml-cpp
I'm doing okay on simple stuff, but the matrix representation is proving difficult.
# Center of table
tableCenter: !!opencv-matrix
rows: 1
cols: 2
dt: f
data: [ 240, 240]
This should map into the vector
240
240
with type float. My code looks like:
#include "yaml.h"
#include <fstream>
#include <string>
struct Matrix {
int x;
};
void operator >> (const YAML::Node& node, Matrix& matrix) {
unsigned rows;
node["rows"] >> rows;
}
int main()
{
std::ifstream fin("monsters.yaml");
YAML::Parser parser(fin);
YAML::Node doc;
Matrix m;
doc["tableCenter"] >> m;
return 0;
}
But I get
terminate called after throwing an instance of 'YAML::BadDereference'
what(): yaml-cpp: error at line 0, column 0: bad dereference
Abort trap
I searched around for some documentation for yaml-cpp, but there doesn't seem to be any, aside from a short introductory example on parsing and emitting. Unfortunately, neither of these two help in this particular circumstance.
As I understand, the !! indicate that this is a user-defined type, but I don't see with yaml-cpp how to parse that.
You have to tell yaml-cpp how to parse this type. Since C++ isn't dynamically typed, it can't detect what data type you want and create it from scratch - you have to tell it directly. Tagging a node is really only for yourself, not for the parser (it'll just faithfully store it for you).
I'm not really sure how an OpenCV matrix is stored, but if it's something like this:
class Matrix {
public:
Matrix(unsigned r, unsigned c, const std::vector<float>& d): rows(r), cols(c), data(d) { /* init */ }
Matrix(const Matrix&) { /* copy */ }
~Matrix() { /* delete */ }
Matrix& operator = (const Matrix&) { /* assign */ }
private:
unsigned rows, cols;
std::vector<float> data;
};
then you can write something like
void operator >> (const YAML::Node& node, Matrix& matrix) {
unsigned rows, cols;
std::vector<float> data;
node["rows"] >> rows;
node["cols"] >> cols;
node["data"] >> data;
matrix = Matrix(rows, cols, data);
}
Edit It appears that you're ok up until here; but you're missing the step where the parser loads the information into the YAML::Node. Instead, se it like:
std::ifstream fin("monsters.yaml");
YAML::Parser parser(fin);
YAML::Node doc;
parser.GetNextDocument(doc); // <-- this line was missing!
Matrix m;
doc["tableCenter"] >> m;
Note: I'm guessing dt: f means "data type is float". If that's the case, it'll really depend on how the Matrix class handles this. If you have a different class for each data type (or a templated class), you'll have to read that field first, and then choose which type to instantiate. (If you know it'll always be float, that'll make your life easier, of course.)