How can I force Realloc to behave like calloc?
For instance:
I have the following structs:
typedef struct bucket0{
int hashID;
Registry registry;
}Bucket;
typedef struct table0{
int tSize;
int tElements;
Bucket** content;
}Table;
and I have the following code in order to grow the table:
int grow(Table* table){
Bucket** tempPtr;
//grow will add 1 to the number available buckets, and double it.
table->tSize++; //add 1
table->tSize *= 2; //double element
if(!table->content){
//table will be generated for the first time
table->content = (Bucket**)(calloc(sizeof(Bucket*), table->tSize));
} else {
//realloc content
tempPtr = (Bucket**)realloc(table->content, sizeof(Bucket)*table->tSize);
if(tempPtr){
table->content = tempPtr;
return 0;
}else{
return 1000;//table could not grow
}
}
}
When I execute it, the table grows properly, and MOST of the "Buckets" in it are initialized as a NULL ptr. However, not all of them are.
How can I make Realloc behave like calloc? in the sense that when it creates new "buckets" they initialize to NULL
Strictly speaking, you shouldn't be relying on calloc (or memset, for that matter) to set pointers to null. C doesn't guarantee that null pointers are represented by all-zero bytes in memory.
Quoting from the comp.lang.C FAQ question 7.31:
Don't rely on calloc's zero fill too much (see below); usually, it's best to initialize data structures yourself, on a field-by-field basis, especially if there are pointer fields.
calloc's zero fill is all-bits-zero, and is therefore guaranteed to yield the value 0 for all integral types (including '\0' for character types). But it does not guarantee useful null pointer values (see section 5 of this list) or floating-point zero values.
It's safer to initialize the individual structure fields yourself. You can create a static const one as a template, with its content initialized to NULL, and then memcpy it to each element of your dynamically-allocated array.
Related
I'm trying to implement my own absolute function for gonum dense vectors in Go. I'm wandering if there's a better way of getting the absolute value of an array than squaring and then square rooting?
My main issue is that I've had to implement my own element wise Newtonian square-root function on these vectors and there's a balance between implementation speed and accuracy. If I could avoid using this square-root function I'd be happy.
NumPy source code can be tricky to navigate, because it has so many functions for so many data types. You can find the C-level source code for the absolute value function in the file scalarmath.c.src. This file is actually a template with function definitions that are later replicated by the build system for several data types. Note each function is the "kernel" that is run for each element of the array (looping through the array is done somewhere else). The functions are always called <name of the type>_ctype_absolute, where <name of the type> is the data type it applies to and is generally templated. Let's go through them.
/**begin repeat
* #name = ubyte, ushort, uint, ulong, ulonglong#
*/
#define #name#_ctype_absolute #name#_ctype_positive
/**end repeat**/
This one is for unsigned types. In this case, the absolute value is the same as np.positive, which just copies the value without doing anything (it is what you get if you have an array a and you do +a).
/**begin repeat
* #name = byte, short, int, long, longlong#
* #type = npy_byte, npy_short, npy_int, npy_long, npy_longlong#
*/
static void
#name#_ctype_absolute(#type# a, #type# *out)
{
*out = (a < 0 ? -a : a);
}
/**end repeat**/
This one is for signed integers. Pretty straightforward.
/**begin repeat
* #name = float, double, longdouble#
* #type = npy_float, npy_double, npy_longdouble#
* #c = f,,l#
*/
static void
#name#_ctype_absolute(#type# a, #type# *out)
{
*out = npy_fabs#c#(a);
}
/**end repeat**/
This is for floating-point values. Here npy_fabsf, npy_fabs and npy_fabsl functions are used. These are declared in npy_math.h, but defined through templated C code in npy_math_internal.h.src, essentially calling the C/C99 counterparts (unless C99 is not available, in which case fabsf and fabsl are emulated with fabs). You might think that the previous code should work as well for floating-point types, but actually these are more complicated, since they have things like NaN, infinity or signed zeros, so it is better to use the standard C functions that deal with everything reliably.
static void
half_ctype_absolute(npy_half a, npy_half *out)
{
*out = a&0x7fffu;
}
This is actually not templated, it is the absolute value function for half-precision floating-point values. Turns out you can change sign by just doing that bitwise operation (set the first bit to 0), since half-precision is simpler (if more limited) than other floating-point types (it's usually the same for those, but with special cases).
/**begin repeat
* #name = cfloat, cdouble, clongdouble#
* #type = npy_cfloat, npy_cdouble, npy_clongdouble#
* #rtype = npy_float, npy_double, npy_longdouble#
* #c = f,,l#
*/
static void
#name#_ctype_absolute(#type# a, #rtype# *out)
{
*out = npy_cabs#c#(a);
}
/**end repeat**/
This last one is for complex types. These use npy_cabsf, npycabs and npy_cabsl functions, again declared in npy_math.h but in this case template-implemented in npy_math_complex.c.src using C99 functions (unless that is not available, in which case it is emulated with np.hypot).
I clearly see the need to deepen my knowledge in RenderScript memory allocation and data types (I'm still confused about the sheer number of data types and finding the correct corresponding types on either side - allocations and elements. (or when to refer the forEach to input, to output or to both, etc.) Therefore I will read and re-read the documentation, which is really not bad - but it needs some time to get the necessary "intuition" how to use it correctly. But for now, please help me with this basic one (and I will return later with hopefully less stupid questions...). I need a very simple kernel that takes an ARGB Color Bitmap and returns an integer Array of gray-values. My attempt was the following:
#pragma version(1)
#pragma rs java_package_name(com.example.xxxx)
#pragma rs_fp_relaxed
uint __attribute__((kernel)) grauInt(uchar4 in) {
uint gr= (uint) (0.2125*in.r + 0.7154*in.g + 0.0721*in.b);
return gr;
}
and Java side:
int[] data1 = new int[width*height];
ScriptC_gray graysc;
graysc=new ScriptC_gray(rs);
Type.Builder TypeOut = new Type.Builder(rs, Element.U8(rs));
TypeOut.setX(width).setY(height);
Allocation outAlloc = Allocation.createTyped(rs, TypeOut.create());
Allocation inAlloc = Allocation.createFromBitmap(rs, bmpfoto1,
Allocation.MipmapControl.MIPMAP_NONE, Allocation.USAGE_SCRIPT);
graysc.forEach_grauInt(inAlloc, outAlloc);
outAlloc.copyTo(data1);
This crashed with the message cannot locate symbol "convert_uint". What's wrong with this conversion? Is the code otherwise correct?
UPDATE: isn't that ridiculous? I don't get this "easy one" run, even after 2 hours trying. I still struggle with the different Element- and variable-types. Let's recap: Input is a Bitmap. Output is an int[] Array. So, why doesnt it work when I use U8 in the Java-side Out-allocation, createFromBitmap in the Java-side In-allocation, uchar4 as kernel Input and uint as the kernel Output (RSRuntimeException: Type mismatch with U32) ?
There is no convert_uint() function. How about simple casting? Other than that, the code looks alright (assuming width and height have correct values).
UPDATE: I have just noticed that you allocate Element.I32 (i.e. signed integer type), but return uint from the kernel. These should match. And in any case, unless you need more than 8-bit precision, you should be able to fit your result in U8.
UPDATE: If you are changing the output type, make sure you change it in all places, e.g. if the kernel returns an uint, the allocation should use U32. If the kernel returns a char, the allocation should use I8. And so on...
You can't use a Uint[] directly because the input Bitmap is actually 2-dimensional. Can you create the output Allocation with a proper width/height and try that? You should still be able to extract the values into a Java array when you are finished.
I used to think that in 64-bit Obj-C runtime BOOL is actually _Bool and it's a real type so it's safe to write like this:
BOOL a = YES;
BOOL b = NO;
if (a != b) {...}
It's been working seemingly fine but today I found a problem when I use bit field structs like this:
typedef struct
{
BOOL flag1 : 1;
} FlagsType;
FlagsType f;
f.flag1 = YES;
BOOL b = YES;
if (f.flag1 != b)
{
// DOES GET HERE!!!
}
It seems that BOOL returned from the bit field is equal to -1 while the regular BOOL is 1, and they are not equal!!!
Note that I am aware of the situation when an arbitrary integer number is cast to BOOL and therefore becomes a "strange" BOOL which is not safe to compare.
However in this situation, both flag1 field and b were declared as BOOL and never cast. What is the problem? Is this a compiler bug?
The bigger question is if it's really safe to compare BOOLs at all or should I write a XORing helper function? (It would be such a chore, because boolean comparisons are so ubiquitous...)
I do not repeat that using a C boolean type solves the problems one can have with BOOL. That's true – in particular here, as you can read below –, but most of the problems resulted from a wrong storage into a boolean (C) object. But in this case _Bool or unsigned (int) seem to be the only possible solution. (Except of solutions with extra code.) There is a reason for it:
I cannot find a precise documentation of the new behavior of BOOL in Objective-C, but the behavior you found is something between bad and buggy. I expected the latest behavior to be analogous to _Bool. That's not true in your case. (Thanks for finding that out!) Maybe this is for backwards compatibility. To tell the full story:
In C an object of the type int is signed int. (This is a difference to char. For this type the signedess is implementation defined.)
— int, signed, or signed int
ISO/IEC 9899:TC3, 6.7.2-2
Each of the comma-separated sets designates the same type, […]
ISO/IEC 9899:TC3, 6.7.2-5
But there is a weird exception for historical reasons:
If the int object is a bit-field, it is implementation defined, whether it is a signed int or an unsigned int. (Likely this is because some CPUs in the past could not automatically expand the sign of a partial byte integer. So having an unsigned integer is easier, because nulling the top bits is enough.)
On clang the default is signed int. So according to full-width integers int always denotes a signed integer, even it has only one bit. An int member : 1 can only store 0 and -1! (Therefore it is no solution to use int instead.)
Each of the comma-separated sets designates the same type, except that for bit-fields, it is implementation-defined whether the specifier int designates the same type as signed int or the same type as unsigned int.
ISO/IEC 9899:TC3, 6.7.2-5
The C standard says that a boolean bit-field is an integer type and therefore takes part on the weird integer signedness rule for bit-fields:
A bit-field is interpreted as a signed or unsigned integer type consisting of the specified number of bits.
ISO/IEC 9899:TC3, 6.7.2.1-9
This is the behavior you found. Because this is meaningless for 1 bit booleans types, the C standard explicitly denotes that storing a 1 into a boolean bit-field has to compare equal to 1 in every case:
If the value 0 or 1 is stored into a nonzero-width bit-field of type _Bool, the value of the bit-field shall compare equal to the value stored.
ISO/IEC 9899:TC3, 6.7.2.1-9
This leads to the strange situation, that an implementation can implement booleans of width 1 as { 0, -1 }, but has to fulfill 1 == -1. Great.
So, the short story: BOOL behaves like an integer bit-field (conforming to the standard), but does not take part on the extra requirement for _Bools.
I think this is, because of legacy code. (One could expect -1 in the past.)
Not very familiar with in-line assembly to begin with, and much less with that of the blackfin processor. I am in the process of migrating a legacy C application over to C++, and ran into a problem this morning regarding the following routine:
//
void clear_buffer ( short * buffer, int len ) {
__asm__ (
"/* clear_buffer */\n\t"
"LSETUP (1f, 1f) LC0=%1;\n"
"1:\n\t"
"W [%0++] = %2;"
:: "a" ( buffer ), "a" ( len ), "d" ( 0 )
: "memory", "LC0", "LT0", "LB0"
);
}
I have a class that contains an array of shorts that is used for audio processing:
class AudProc
{
enum { buffer_size = 512 };
short M_samples[ buffer_size * 2 ];
// remaining part of class omitted for brevity
};
Within the AudProc class I have a method that calls clear_buffer, passing it the samples array:
clear_buffer ( M_samples, sizeof ( M_samples ) / 2 );
This generates a "Bus Error" and aborts the application.
I have tried making the array public, and that produces the same result. I have also tried making it static; that allows the call to go through without error, but no longer allows for multiple instances of my class as each needs its own buffer to work with. Now, my first thought is, it has something to do with where the buffer is in memory, or from where it is being accessed. Does something need to be changed in the in-line assembly to make this work, or in the way it is being called?
Thought that this was similar to what I was trying to accomplish, but it is using a different dialect of asm, and I can't figure out if it is the same problem I am experiencing or not:
GCC extended asm, struct element offset encoding
Anyone know why this is occurring and how to correct it?
Does anyone know where there is helpful documentation regarding the blackfin asm instruction set? I've tried looking on the ADSP site, but to no avail.
I would suspect that you could define your clear_buffer as
inline void clear_buffer (short * buffer, int len) {
memset (buffer, 0, sizeof(short)*len);
}
and probably GCC is able to optimize (when invoked with -O2 or -O3) that cleverly (because GCC knows about memset).
To understand assembly code, I suggest running gcc -S -O -fverbose-asm on some small C file, then to look inside the produced .s file.
I would have take a guess, because I don't know Blackfin assembler:
That LC0 sounds like "loop counter", LSETUP looks like a macro/insn, which, well, setups a loop between two labels and with a certain loop counter.
The "%0" operands is apparently the address to write to and we can safely guess it's incremented in the loop, in other words it's both an input and output operand and should be described as such.
Thus, I suggest describing it as in input-output operand, using "+" constraint modifier, as follows:
void clear_buffer ( short * buffer, int len ) {
__asm__ (
"/* clear_buffer */\n\t"
"LSETUP (1f, 1f) LC0=%1;\n"
"1:\n\t"
"W [%0++] = %2;"
: "+a" ( buffer )
: "a" ( len ), "d" ( 0 )
: "memory", "LC0", "LT0", "LB0"
);
}
This is, of course, just a hypothesis, but you could disassemble the code and check if by any chance GCC allocated the same register for "%0" and "%2".
PS. Actually, only "+a" should be enough, early-clobber is irrelevant.
For anyone else who runs into a similar circumstance, the problem here was not with the in-line assembly, nor with the way it was being called: it was with the classes / structs in the program. The class that I believed to be the offender was not the problem - there was another class that held an instance of it, and due to other members of that outer class, the inner one was not aligned on a word boundary. This was causing the "Bus Error" that I was experiencing. I had not come across this before because the classes were not declared with __attribute__((packed)) in other code, but they are in my implementation.
Giving Type Attributes - Using the GNU Compiler Collection (GCC) a read was what actually sparked the answer for me. Two particular attributes that affect memory alignment (and, thus, in-line assembly such as I am using) are packed and aligned.
As taken from the aforementioned link:
aligned (alignment)
This attribute specifies a minimum alignment (in bytes) for variables of the specified type. For example, the declarations:
struct S { short f[3]; } __attribute__ ((aligned (8)));
typedef int more_aligned_int __attribute__ ((aligned (8)));
force the compiler to ensure (as far as it can) that each variable whose type is struct S or more_aligned_int is allocated and aligned at least on a 8-byte boundary. On a SPARC, having all variables of type struct S aligned to 8-byte boundaries allows the compiler to use the ldd and std (doubleword load and store) instructions when copying one variable of type struct S to another, thus improving run-time efficiency.
Note that the alignment of any given struct or union type is required by the ISO C standard to be at least a perfect multiple of the lowest common multiple of the alignments of all of the members of the struct or union in question. This means that you can effectively adjust the alignment of a struct or union type by attaching an aligned attribute to any one of the members of such a type, but the notation illustrated in the example above is a more obvious, intuitive, and readable way to request the compiler to adjust the alignment of an entire struct or union type.
As in the preceding example, you can explicitly specify the alignment (in bytes) that you wish the compiler to use for a given struct or union type. Alternatively, you can leave out the alignment factor and just ask the compiler to align a type to the maximum useful alignment for the target machine you are compiling for. For example, you could write:
struct S { short f[3]; } __attribute__ ((aligned));
Whenever you leave out the alignment factor in an aligned attribute specification, the compiler automatically sets the alignment for the type to the largest alignment that is ever used for any data type on the target machine you are compiling for. Doing this can often make copy operations more efficient, because the compiler can use whatever instructions copy the biggest chunks of memory when performing copies to or from the variables that have types that you have aligned this way.
In the example above, if the size of each short is 2 bytes, then the size of the entire struct S type is 6 bytes. The smallest power of two that is greater than or equal to that is 8, so the compiler sets the alignment for the entire struct S type to 8 bytes.
Note that although you can ask the compiler to select a time-efficient alignment for a given type and then declare only individual stand-alone objects of that type, the compiler's ability to select a time-efficient alignment is primarily useful only when you plan to create arrays of variables having the relevant (efficiently aligned) type. If you declare or use arrays of variables of an efficiently-aligned type, then it is likely that your program also does pointer arithmetic (or subscripting, which amounts to the same thing) on pointers to the relevant type, and the code that the compiler generates for these pointer arithmetic operations is often more efficient for efficiently-aligned types than for other types.
The aligned attribute can only increase the alignment; but you can decrease it by specifying packed as well. See below.
Note that the effectiveness of aligned attributes may be limited by inherent limitations in your linker. On many systems, the linker is only able to arrange for variables to be aligned up to a certain maximum alignment. (For some linkers, the maximum supported alignment may be very very small.) If your linker is only able to align variables up to a maximum of 8-byte alignment, then specifying aligned(16) in an __attribute__ still only provides you with 8-byte alignment. See your linker documentation for further information.
.
packed
This attribute, attached to struct or union type definition, specifies that each member (other than zero-width bit-fields) of the structure or union is placed to minimize the memory required. When attached to an enum definition, it indicates that the smallest integral type should be used.
Specifying this attribute for struct and union types is equivalent to specifying the packed attribute on each of the structure or union members. Specifying the -fshort-enums flag on the line is equivalent to specifying the packed attribute on all enum definitions.
In the following example struct my_packed_struct's members are packed closely together, but the internal layout of its s member is not packed—to do that, struct my_unpacked_struct needs to be packed too.
struct my_unpacked_struct
{
char c;
int i;
};
struct __attribute__ ((__packed__)) my_packed_struct
{
char c;
int i;
struct my_unpacked_struct s;
};
You may only specify this attribute on the definition of an enum, struct or union, not on a typedef that does not also define the enumerated type, structure or union.
The problem which I was experiencing was specifically due to the use of packed. I attempted to simply add the aligned attribute to the structs and classes, but the error persisted. Only removing the packed attribute resolved the problem. For now, I am leaving the aligned attribute on them and testing to see if I find any improvements in the efficiency of the code as mentioned above, simply due to their being aligned on word boundaries. The application makes use of arrays of these structures, so perhaps there will be better performance, but only profiling the code will say for certain.
Jumping straight to code, this is what I would like to do:
size_t len = obj->someLengthFunctionThatReturnsTypeSizeT();
array<int>^ a = gcnew array<int>(len);
When I try this, I get the error
conversion from size_t to int, possible loss of data
Is there a way I can get this code to compile without explicitly casting to int? I find it odd that I can't initialize an array to this size, especially because there is a LongLength property (and how could you get a length as a long - bigger than int - if you can only initialize a length as an int?).
Thanks!
P.S.: I did find this article that says that it may be impractical to allocate an array that is truly size_t, but I don't think that is a concern. The point is that the length I would like to initialize to is stored in a size_t variable.
Managed arrays are implemented for using Int32 as indices, there is no way around that. You cannot allocate arrays larger than Int32.MaxValue.
You could use the static method Array::CreateInstance (the overload that takes a Type and an array of Int64), and then cast the resulting System::Array to the appropriate actual array type (e.g. array<int>^). Note that the passed values must not be larger than Int32.MaxValue. And you would still need to cast.
So you have at least two options. Either casting:
// Would truncate the value if it is too large
array<int>^ a = gcnew array<int>((int)len);
or this (no need to cast len, but the result of CreateInstance):
// Throws an ArgumentOutOfRangeException if len is too large
array<int>^ a = (array<int>^)Array::CreateInstance(int::typeid, len);
Personally, i find the first better. You still might want to check the actual size of len so that you don't run into any of the mentioned errors.