Valgrind "Invalid read of size 4", but neither out-of-bounds nor "not stack'd, malloc'd or (recently) free'd"

I get many 'Invalid read of size N' errors while running my C++ program under Valgrind-3.11.0 on 64-bit Ubuntu. The messages look like the following, with N varying among 1, 4, and 8:
Invalid read of size N
Address 0xblahblah is 88 bytes inside a block of size 176 alloc'd
The block of size 176 is a C++ class object allocated with the new operator, and N is small enough that this cannot be an out-of-bounds access. So why doesn't Valgrind give a reason such as 'not stack'd', 'not malloc'd', or '(recently) free'd'? Does anyone know why Valgrind decided this is an invalid read when none of those messages appear?

N corresponds to the size of the underlying primitive data type. Roughly speaking, 1 would be a char or a bool, 4 would be an int or a float, and 8 would be a long int, a double, or a pointer.
XX bytes inside a block of size YY alloc'd
First of all, this means that the memory is dynamic (roughly, on the heap) rather than automatic (roughly, on the stack). Secondly, you know the size of the object(s) allocated. If you know the size per object (you can use sizeof to get that), then you can work out how many objects were allocated.
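For example, here is a toy sketch (MyClass is a hypothetical stand-in for whatever type was actually allocated, sized so that sizeof(MyClass) is 176) turning Valgrind's byte counts into object counts and offsets:

#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the allocated C++ class; 22 doubles = 176 bytes. */
typedef struct { double d[22]; } MyClass;

int main(void) {
    const size_t block_size = 176;  /* "block of size 176 alloc'd" */
    const size_t offset     = 88;   /* "88 bytes inside a block"   */
    printf("objects in block: %zu\n", block_size / sizeof(MyClass));
    printf("offset %zu is %zu bytes into object #%zu\n",
           offset, offset % sizeof(MyClass), offset / sizeof(MyClass));
    return 0;
}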
Are there any other serious errors before this?
Do you use any Valgrind client requests in your code?

Related

Deploying a TinyML model that I created via Google Colab

When I compile the code (on Arduino) I get the following error:
8 bytes lost due to alignment. To avoid this loss, please make sure the tensor_arena is 16 bytes aligned.
constexpr int tensorArenaSize = 8 * 1024;
byte tensorArena[tensorArenaSize];
Can someone help me fix this problem?
For reasons unbeknownst to me, the compiler wants to make sure your large byte array is 16-byte aligned. Because of variables already declared above the two lines you included, it needs to "move forward" the large array by 8 bytes so that it starts at an address on a 16-byte boundary. To fix the error (to me this should just be a warning), either add a dummy 8-byte variable before your large array, or move 8 bytes' worth of variables from before your large array to after it. In the first case you just lose 8 bytes of variable space.
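Alternatively, you can ask the compiler for the alignment directly instead of padding by hand. Here is a minimal sketch, assuming a GCC-based toolchain like Arduino's; in a C++ sketch the equivalent spelling would be alignas(16) byte tensorArena[tensorArenaSize];

#include <stdint.h>

#define TENSOR_ARENA_SIZE (8 * 1024)

/* _Alignas(16) (C11) forces the array to start on a 16-byte boundary,
 * so no bytes are lost to alignment padding. */
_Alignas(16) static uint8_t tensorArena[TENSOR_ARENA_SIZE];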

Handling magic constants during 64-bit migration

I confess I did something dumb and it now bites me. I used a magic number constant defined as NSUIntegerMax to define a special case index. The value is normally used as index to access selected item in NSArray. In the special case, denoted by the magic number I get the value from elsewhere, instead of from the array.
This index value is serialized in User Defaults as NSNumber.
With Xcode 5.1 my iOS app gets compiled with standard architecture that now also includes arm64. This changed the value of NSUIntegerMax, so now after deserialization I get 32-bit value of NSUIntegerMax, which no longer matches in comparisons with the magic number, whose value is now 64-bit NSUIntegerMax. And it results in NSRangeException with reason: -[__NSArrayI objectAtIndex:]: index 4294967295 beyond bounds [0 .. 10].
It is a minor issue in my code: given that the normal range of that array is small, I may just get away with redefining my magic number as 4294967295. But it doesn't feel right. How should I have handled this issue properly?
I guess avoiding the magic number altogether would be the most robust approach?
Note
I think the problem with my magic number is roughly equivalent to what happened to the NSNotFound constant. Apple's 64-bit Transition Guide for Cocoa Touch says, in the section about Common Type-Conversion Problems in Cocoa Touch:
Working with constants defined in the framework as NSInteger. Of particular note is the NSNotFound constant. In the 64-bit runtime, its value is larger than the maximum range of an int type, so truncating its value often causes errors in your app.
… but it does not say what should be done, except to be careful ;-)
If you use NSInteger/NSUInteger, it's 4 bytes on a 32-bit OS and 8 bytes on a 64-bit OS.
If you want an integer of the same size on both, consider using int (4 bytes), long long (8 bytes), or the fixed-width types int32_t/int64_t. The corresponding maximum-value constants are:
INT_MAX
// or LLONG_MAX, INT32_MAX, INT64_MAX
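Concretely, pinning the sentinel to a fixed-width type keeps serialized values comparable across 32- and 64-bit builds. A minimal sketch in plain C (the constant name kNoSelection is hypothetical):

#include <stdint.h>
#include <stdio.h>

/* Hypothetical fixed-width sentinel: UINT32_MAX has the same value on
 * every architecture, unlike NSUIntegerMax. */
static const uint32_t kNoSelection = UINT32_MAX;

int main(void) {
    uint32_t stored = 4294967295u;  /* value deserialized from user defaults */
    if (stored == kNoSelection)
        puts("special case: take the value from elsewhere");
    else
        printf("normal case: index %u into the array\n", (unsigned)stored);
    return 0;
}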

Memory addresses, pointers, variables, values - what goes on behind the scenes

This is going to be a pretty loaded question but ever since I started learning about pointers I've been very curious about what happens behind the scenes when a program is run.
As far as I know, computer memory is commonly thought of as a long strip of memory divided evenly into individual bytes. Certainly the typical diagrams, which draw memory as a row of numbered boxes, evoke such a metaphor.
One thing I've been wondering, what do the memory addresses themselves represent? I'm sure it's no coincidence that memory addresses appear as 8-digit hexadecimal values (e.g. 00EB5748). Why is this?
Furthermore, when I declare a variable x, what is happening at the memory level? Is the compiler simply reserving a random address (+however many consecutive addresses it needs for the variable type) for data storage?
Now suppose x is an unsigned int that occupies 2 bytes of memory (i.e. values ranging from 0 to 65535). When I declare x = 12, what is happening? What is it that I'm making equal to 12? When I draw conceptual diagrams, I usually have a box for an address (say &x) pointing to a variable (x) that occupies seemingly nothing, and I'm sure that can't be a fully accurate picture of what's going on.
And what's happening at the binary level? Is the address 00EB5748 treated as 111010110101011101001000 and storing a value of 12 somewhere, or 1100?
Mostly my confusion and curiosity stem from the relationship between memory addresses and the actual values being declared (e.g. 12, 'a', -355.2). As another example, suppose our address 00EB5748 points to a char 's' whose value is 115 according to ASCII charts. Is the address describing a position that stores the value 115 in 1 byte, by flipping the appropriate 1s and 0s at that position in memory?
Just open any book. You will see pages. Every page has a number, and consecutive pages are numbered with consecutive numbers. Do you have any confusion with numbered pages? I think not. Then you should not be confused by computer memory.
Books were the main information-storage devices before the computer era, and computer memory borrowed its basic concept from books: a book has pages -> computer memory has memory cells; a book has page numbers -> computer memory has memory addresses.
One thing I've been wondering, what do the memory addresses themselves represent?
Numbers. Every memory cell has a number, like every page in a book.
Furthermore, when I declare a variable x, what is happening at the memory level? Is the compiler simply reserving a random address (+however many consecutive addresses it needs for the variable type) for data storage?
The memory manager marks some memory cells as occupied and tells the compiler the address of the first reserved cell. The compiler associates the variable's name and type with this address. (This picture is from my head; it may be inaccurate.)
When I declare x = 12, what is happening?
When you declared the variable x, memory cells were reserved for it. Now you write 12 into these memory cells. Note that 12 is binary-coded in some way, depending on the type of the variable x. If x is an unsigned int which occupies 2 memory cells, then one cell will contain 0 and the other will contain 12, because the binary integer representation of 12 is

0000 0000 0000 1100
|_______| |_______|
  cell      cell

If 12 were a floating-point number it would be coded in another way.
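You can watch this byte-by-byte layout on a real machine. A small sketch; whether the 12 shows up in the first or the last cell depends on the machine's endianness:

#include <stdio.h>

int main(void) {
    unsigned int x = 12;
    const unsigned char *bytes = (const unsigned char *)&x;
    /* Print each byte ("cell") of x. On a little-endian machine the 12
     * appears in the first byte; on a big-endian machine, in the last. */
    for (size_t i = 0; i < sizeof x; ++i)
        printf("cell %zu: %u\n", i, bytes[i]);
    return 0;
}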
A memory address is simply the position of a given byte in memory. The zeroth byte is at 0x00000000. The tenth at 0x0000000A. The 65535th at 0x0000FFFF. And so on.
Local variables live on the stack*. When compiling a block of code, the compiler counts how many bytes are needed to hold all the local variables, and then increments the stack pointer so that all the variables can fit below it (along with some other stuff like frame pointers and return addresses and whatnot). Then it just remembers that, for example, local variable x is at an offset -2 from the stack pointer, foo is at an offset -4 and so on, and uses those addresses whenever those variables are referenced in the following code.
Since the compiler knows that x is at address (stack pointer - 2), that's the location that is set to the value 12 when you do x = 12.
Not entirely sure if I understand this question, but say you want to read the memory at address 0x00EB5748. The control unit in the CPU reads the instruction, sees that it is a load instruction, and passes the address (in binary of course) to the load/store unit, along with some other junk like how many bytes to read. Then the LSU sends that address to some memory (probably L1 cache), and after a certain time gets the value 12 back. Then this data is available to, say, put in a register, or send to the ALU to do arithmetic, or whatever.
That seems to be accurate, yes. Going back to the first question, an address simply means "byte number 0xWHATEVER in memory".
Hope this clarified things a bit at least.
*I should probably explain the stack as well. A stack is a portion of memory reserved for local variables (and some other stuff). It starts at a fixed location in memory and stops at the memory address contained in a special register called the stack pointer. To begin with, the stack is empty, so the stack pointer just contains the start of the stack. As you put more data on the stack, the SP is incremented. This means that you can always put more data on it simply by putting it at the address in the SP, and then incrementing the SP so that once again anything past that address is free memory. (On most real architectures the stack actually grows downward, so the SP is decremented as data is pushed, but the principle is the same.)
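To make the stack story concrete, here is a minimal sketch that prints the addresses the compiler chose for two locals. The exact addresses, their order, and their spacing are implementation-defined, so treat the output as illustrative only:

#include <stdio.h>

int main(void) {
    int x = 12;
    int foo = 7;
    /* %p prints an address. Locals typically sit near one another in the
     * current stack frame, at fixed offsets chosen by the compiler. */
    printf("&x   = %p (value %d)\n", (void *)&x, x);
    printf("&foo = %p (value %d)\n", (void *)&foo, foo);
    return 0;
}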

Memcpy and Memset on structures of Short Type in C

I have a query about using memset and memcpy on structures and their reliability. For example, I have code that looks like this:
typedef struct
{
    short a[10];
    short b[10];
} tDataStruct;

tDataStruct m, n;
memset(&m, 2, sizeof(m));
memcpy(&n, &m, sizeof(m));
My questions are:
1) With memset, setting 0 is fine, but when setting 2 I get the elements of m.a and m.b as 514 instead of 2. When I make them char instead of short it is fine. Does it mean we cannot use memset for any initialization other than 0? Is it a limitation of short, for example?
2) Is it reliable to memcpy between the two structures of short above? I have huge arrays a, b, c, d, e... and I need to make sure the copy is a perfect one-to-one.
3) Am I better off using memset and memcpy on the individual arrays rather than collecting them in a structure as above?
One more query: in the structure above I have arrays of variables. But suppose instead I am passed pointers to these arrays and I want to collect these pointers in a structure:
typedef struct
{
    short *pa[10];
    short *pb[10];
} tDataStruct;

tDataStruct m, n;
memset(&m, 2, sizeof(m));
memcpy(&n, &m, sizeof(m));
In this case memset and memcpy only change the addresses rather than the values. How do I change the values instead? Is the prototype wrong?
Please suggest; your inputs are very important.
Thanks,
dsp guy
1. memset sets bytes, not shorts. Always. 514 = (256*2) + (1*2): the byte value 2 lands in both byte positions of each short.
1a. This does, admittedly, lessen its usefulness for purposes such as the one you're attempting (array fill).
2. Reliable, as long as both structs are of the same type. Just to be clear, these structures are NOT of "type short" as you suggest.
3. If I understand your question, I don't believe it matters, as long as they are of the same type.
Just remember, these are byte-level operations, nothing more, nothing less.
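To see point 1 in action, here is a short sketch (assuming 2-byte shorts) showing why memset with 2 yields 514, together with the element-wise loop that does what was intended:

#include <stdio.h>
#include <string.h>

int main(void) {
    short a[10];
    memset(a, 2, sizeof a);          /* writes the byte 0x02 into every byte */
    printf("%d\n", a[0]);            /* prints 514, i.e. 0x0202 */
    for (size_t i = 0; i < 10; ++i)  /* element-wise fill does it right */
        a[i] = 2;
    printf("%d\n", a[0]);            /* prints 2 */
    return 0;
}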
For the second part of your question, try
memset(m.pa, 0, sizeof(*(m.pa)));
memset(m.pb, 0, sizeof(*(m.pb)));
Note the two operations, one for each of the two different addresses (m.pa and m.pb are effectively addresses, as you recognized). Note also the sizeof: not the size of the pointers themselves, but the size of what is being pointed to. Similarly for memcpy.
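With the struct exactly as you declared it (arrays of ten pointers), no single memset can reach the pointed-to data: each pointer's target must be zeroed individually, and you need to know how long each target array is. A sketch, where LEN is a hypothetical length:

#include <string.h>

#define LEN 10  /* hypothetical length of each pointed-to short array */

typedef struct
{
    short *pa[10];
    short *pb[10];
} tDataStruct;

void zero_targets(tDataStruct *s)
{
    /* Zero the values the pointers refer to, not the pointers themselves. */
    for (int i = 0; i < 10; ++i) {
        if (s->pa[i]) memset(s->pa[i], 0, LEN * sizeof(short));
        if (s->pb[i]) memset(s->pb[i], 0, LEN * sizeof(short));
    }
}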

What mechanism detects accesses of unallocated memory?

From time to time, I'll have an off-by-one error like the following:
unsigned int* x = calloc(2000, sizeof(unsigned int));
printf("%d", x[2000]);
I've gone beyond the end of the allocated region, so I get an EXC_BAD_ACCESS signal at runtime. My question is: how is this detected? It seems like this would just silently return garbage, since I'm only off by one element and not, say, a full page. What part of the system prevents me from just getting back the garbage bytes at x + 2000?
The memory system has sentinel values at the beginning and end of its memory fields, beyond your allocated bytes. When you free the memory, it checks to see if those values are intact. If not, it tells you.
Perhaps you are just lucky because you are using 2000 as the size. Depending on the size of int, the total size is divisible by 32 or 64, so chances are high that the end of your data really coincides with the end of the underlying allocation. Try some odd number of bytes (better to use a char array for that) and see if your system still detects the access.
In any case, you shouldn't rely on finding these bugs this way. Always use valgrind or a similar tool to check your memory accesses.
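Here is a small sketch reproducing the question's off-by-one. Run on its own it may or may not crash, depending on whether the byte after the block happens to be mapped; run under valgrind (e.g. valgrind ./a.out) the read is reported deterministically as an "Invalid read of size 4":

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    unsigned int *x = calloc(2000, sizeof(unsigned int));
    if (!x) return 1;
    /* Valid indices are 0..1999; x[2000] reads 4 bytes past the block.
     * This bug is deliberate, kept to show what Valgrind flags. */
    printf("%u\n", x[2000]);
    free(x);
    return 0;
}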