How to find where an uninitialised value comes from in Valgrind

I am debugging TinyScheme version 1.41. Valgrind reports:
==16675== Conditional jump or move depends on uninitialised value(s)
==16675== at 0x4062C4: opexe_0 (scheme.c:2579)
==16675== by 0x403C5E: Eval_Cycle (scheme.c:4471)
==16675== by 0x40A3AC: scheme_load_named_file (scheme.c:4830)
==16675== by 0x40A878: main (scheme.c:5118)
==16675==
==16675== Conditional jump or move depends on uninitialised value(s)
==16675== at 0x406324: opexe_0 (scheme.c:2586)
==16675== by 0x403C5E: Eval_Cycle (scheme.c:4471)
==16675== by 0x40A3AC: scheme_load_named_file (scheme.c:4830)
==16675== by 0x40A878: main (scheme.c:5118)
This uninitialised value is the type information inside some object. It appears that some object is being created with no type information. I would be interested to see where that memory was allocated, or whether that location was ever overwritten with other uninitialised data.
Is there a way to tell Valgrind, "tell me when and where that memory was allocated"?

The option
--track-origins=no|yes show origins of undefined values? [no]
instructs Valgrind to give more information about the origin of undefined values.
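With origin tracking enabled, each "uninitialised value" complaint gains an extra "Uninitialised value was created by ..." stack trace pointing at the allocation site (stack or heap). A typical invocation might look like this (the binary and script names here are hypothetical, not taken from the question):

```shell
# Slower than a plain run, but reports where each undefined value was created
valgrind --track-origins=yes ./scheme init.scm
```

Note that origin tracking makes Memcheck noticeably slower, which is why it defaults to `no`.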

Related

What does "VkImageMemoryBarrier::srcAccessMask = 0" mean?

I just read the Images chapter of the Vulkan tutorial, and I didn't understand the line "VkImageMemoryBarrier::srcAccessMask = 0".
code:
barrier.srcAccessMask = 0;
barrier.dstAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT;
and the tutorial says:
Since the transitionImageLayout function executes a command buffer with only a single command, you could use this implicit synchronization and set srcAccessMask to 0 if you ever needed a VK_ACCESS_HOST_WRITE_BIT dependency in a layout transition.
Q1: If a function executes a command buffer with multiple commands, can it not use this implicit synchronization?
Q2: According to the manual page, VK_ACCESS_HOST_WRITE_BIT is 0x00004000, but the tutorial uses "0". Why?
Does "0" mean implicit
and "VK_ACCESS_HOST_WRITE_BIT" mean explicit?
Am I understanding correctly?
An access mask of 0 means "nothing": the barrier introduces no memory dependency.
Implicit synchronization means Vulkan does it for you. As the tutorial says:
One thing to note is that command buffer submission results in implicit VK_ACCESS_HOST_WRITE_BIT synchronization
Specifically this is Host Write Ordering Guarantee.
Implicit means you don't have to do anything. Any host write to mapped memory is already automatically visible to any device access of any vkQueueSubmit called after the mapped memory write.
Explicit in this case would mean to submit a barrier with VK_PIPELINE_STAGE_HOST_BIT and VK_ACCESS_HOST_*_BIT.
Note that the sync guarantees only work one way. CPU → GPU is automatic/implicit, but GPU → CPU always needs to be explicit (you need a barrier with dst = VK_PIPELINE_STAGE_HOST_BIT to perform the memory domain transfer operation).
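A minimal sketch of that explicit GPU → CPU direction, assuming a command buffer `cmd` currently in the recording state and a compute shader as the writer (the stages and handle name are illustrative, not from the question):

```c
// Explicit GPU -> CPU memory dependency: make device shader writes
// available to subsequent host reads (after the fence/queue wait).
VkMemoryBarrier barrier = {
    .sType = VK_STRUCTURE_TYPE_MEMORY_BARRIER,
    .srcAccessMask = VK_ACCESS_SHADER_WRITE_BIT,
    .dstAccessMask = VK_ACCESS_HOST_READ_BIT,
};
vkCmdPipelineBarrier(cmd,
    VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT, // src: the stage that wrote
    VK_PIPELINE_STAGE_HOST_BIT,           // dst: host access
    0,                                    // no dependency flags
    1, &barrier, 0, NULL, 0, NULL);
```

The host must still wait on a fence (or vkQueueWaitIdle) before reading; the barrier only handles the memory domain transfer, not the execution ordering with the CPU.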

VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT VkAccessFlags set to 0?

In the Vulkan spec it defines:
VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT is equivalent to VK_PIPELINE_STAGE_ALL_COMMANDS_BIT with
VkAccessFlags set to 0 when specified in the second synchronization scope, but specifies no
stages in the first scope.
and similarly:
VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT is equivalent to VK_PIPELINE_STAGE_ALL_COMMANDS_BIT with
VkAccessFlags set to 0 when specified in the first synchronization scope, but specifies no stages
in the second scope.
I'm unclear on what "with VkAccessFlags set to 0" means in this context.
Technically VkAccessFlags is a type, not a variable, so it can't be set to anything.
(It seems to be adjusting the definitions of TOP/BOTTOM_OF_PIPE for some special property of VK_PIPELINE_STAGE_ALL_COMMANDS_BIT with respect to VkAccessFlags, but I can't quite see what that special property is or where it is specified.)
Anyone know what it's talking about?
(Or, put another way: if we removed those two occurrences of "with VkAccessFlags set to 0" from the spec, what would break?)
It is a roundabout way of saying that the stage flags are interpreted differently for a memory dependency.
For execution dependency in src it takes the stage bits you provide, and logically-earlier stages are included automatically. Similarly for dst, logically-later stages are included automatically.
But this applies only to the execution dependency. For a memory dependency, only the stage flags you provide count (and none are added automatically).
For example, let's say you have VK_PIPELINE_STAGE_ALL_COMMANDS_BIT + VK_ACCESS_MEMORY_WRITE_BIT in src. That means all memory writes from all previous commands will be made available. But if you have VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT + VK_ACCESS_MEMORY_WRITE_BIT in src, that means all memory writes from only BOTTOM_OF_PIPE stage are made available, so no memory writes are made available (because that particular stage doesn't make any).
Either way, IMO it is better for code clarity to state all pipeline stages explicitly whenever possible.
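The contrast can be made concrete in code. Both fragments below use the same access mask, but only the first actually makes any writes available (a sketch assuming a recording command buffer `cmd` and an arbitrary TRANSFER dst scope; none of these names come from the original answer):

```c
// src = ALL_COMMANDS + MEMORY_WRITE:
//   all memory writes from all previous commands are made available.
VkMemoryBarrier all = { .sType = VK_STRUCTURE_TYPE_MEMORY_BARRIER,
                        .srcAccessMask = VK_ACCESS_MEMORY_WRITE_BIT };
vkCmdPipelineBarrier(cmd, VK_PIPELINE_STAGE_ALL_COMMANDS_BIT,
                     VK_PIPELINE_STAGE_TRANSFER_BIT,
                     0, 1, &all, 0, NULL, 0, NULL);

// src = BOTTOM_OF_PIPE + MEMORY_WRITE:
//   for the *memory* dependency only the stage named counts, and
//   BOTTOM_OF_PIPE performs no writes, so nothing is made available.
VkMemoryBarrier none = { .sType = VK_STRUCTURE_TYPE_MEMORY_BARRIER,
                         .srcAccessMask = VK_ACCESS_MEMORY_WRITE_BIT };
vkCmdPipelineBarrier(cmd, VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT,
                     VK_PIPELINE_STAGE_TRANSFER_BIT,
                     0, 1, &none, 0, NULL, 0, NULL);
```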

Storing things in isa

The 64-bit runtime took away the ability to directly access the isa field of an object, something Clang engineers had been warning us about for a while. Direct access has been replaced by a rather inventive (and magic) set of ever-changing ABI rules about which sections of the newly christened isa header contain information about the object, or even other state (in the case of NSNumber/NSString). There seems to be a loophole, in that you can opt out of the new "magic" isa and use one of your own (a raw isa) at the expense of taking the slow road through certain runtime code paths.
My question is twofold, then:
If it's possible to opt out and object_setClass() an arbitrary class into an object in +allocWithZone:, is it also possible to put anything up there in the extra space with the class, or will the runtime try to read it through the fast paths?
What exactly in the isa header is tagged to let the runtime differentiate it from a normal isa?
If it's possible to opt out and object_setClass() an arbitrary class into an object in +allocWithZone:
According to this article by Greg Parker
If you override +allocWithZone:, you may initialize your object's isa field to a "raw" isa pointer. If you do, no extra data will be stored in that isa field and you may suffer the slow path through code like retain/release. To enable these optimizations, instead set the isa field to zero (if it is not already) and then call object_setClass().
So yes, you can opt out and manually set a raw isa pointer. To inform the runtime about this, you have to set the least significant bit of the isa to 0 (see below).
Also, there's an environment variable that you can set, named OBJC_DISABLE_NONPOINTER_ISA, which is pretty self-explanatory.
is it also possible to put anything up there in the extra space with the class, or will the runtime try to read it through the fast paths?
The extra space is not being wasted. It's used by the runtime for useful in-place information about the object, such as the current state and - most importantly - its retain count (this is a big improvement since it used to be fetched every time from an external hash table).
So no, you cannot use the extra space for your own purposes, unless you opt out (as discussed above). In that case the runtime will go through the long path, ignoring the information contained in the extra bits.
Again according to Greg Parker's article, here's the new layout of the isa (note that this is very likely to change over time, so don't rely on it):
(LSB)
1 bit | indexed | 0 is raw isa, 1 is non-pointer isa.
1 bit | has_assoc | Object has or once had an associated reference. Object with no associated references can deallocate faster.
1 bit | has_cxx_dtor | Object has a C++ or ARC destructor. Objects with no destructor can deallocate faster.
30 bits | shiftcls | Class pointer's non-zero bits.
9 bits | magic | Equals 0xd2. Used by the debugger to distinguish real objects from uninitialized junk.
1 bit | weakly_referenced | Object is or once was pointed to by an ARC weak variable. Objects not weakly referenced can deallocate faster.
1 bit | deallocating | Object is currently deallocating.
1 bit | has_sidetable_rc | Object's retain count is too large to store inline.
19 bits | extra_rc | Object's retain count above 1. (For example, if extra_rc is 5 then the object's real retain count is 6.)
(MSB)
What exactly in the isa header is tagged to let the runtime differentiate it from a normal isa?
As mentioned above, you can discriminate between a raw isa and the new non-pointer isa by looking at the least significant bit.
To wrap up: while it looks feasible to opt out and start messing with the extra bits available on a 64-bit architecture, I personally discourage it. The new isa layout is carefully crafted to optimize runtime performance and is far from guaranteed to stay the same over time.
Apple may also decide in the future to drop backward compatibility with the raw isa representation, preventing the opt-out. Any code assuming the isa to be a plain pointer would then break.
You can't safely do this, since if (when, really) the usable address space expands beyond 33 bits, the layout will presumably need to change again. Currently though, the bottom bit of the isa controls whether it's treated as having extra info or not.

Tell LLVM optimizer contents of variables

I'm writing a compiler using LLVM as backend and have a lot of reference counting. When I borrow an object, I increment the object's reference counter. When I release an object, I decrement the reference counter, and free the object if it goes to zero.
However, if I only do a small piece of code, like this one:
++obj->ref;
global_variable_A = obj->a;
if (--obj->ref == 0)
free_object(obj);
LLVM optimizes this to (shown as the equivalent C; the optimization actually happens on the IR):
global_variable_A = obj->a;
if (obj->ref == 0)
free_object(obj);
But since I know that the reference counter is always positive before the first statement, it could be optimized to just
global_variable_A = obj->a;
My question: is there any way to tell the LLVM optimizer that a register or some memory, at the time of reading it, is known to contain non-zero data?
An equivalent question: can I tell the optimizer that a pointer is non-null? That would also be great.
You could write a custom FunctionPass that replaces the comparison with a known-true value; the dead branch should then be removed by DCE or SimplifyCFG.
http://llvm.org/docs/WritingAnLLVMPass.html

Is using fflush(stdout) as fprintf() argument safe?

I came upon this line of code:
fprintf(stdout, "message", fflush(stdout));
Note that the message does not contain any %-tag.
Is that safe in visual c++? fflush() returns 0 on success and EOF on failure. What will fprintf() do with this extra parameter?
I first thought that this was a strange hack to add a fflush() call without needing an extra line. But written like this, the fflush() call will be executed before the fprintf() call so it does not flush the message being printed right now but the ones waiting to be flushed, if any... am I right?
It's safe. Here's what C (C99 at least, §7.19.6.1) says about it:
If the format is exhausted while
arguments remain, the excess arguments
shall be evaluated but are otherwise
ignored.
If the goal was to avoid a line, I'd rather do
fflush(stdout); fprintf(stdout, "message");
if for nothing else than to prevent the person later reading that code from hunting me down with a bat.
fprintf doesn't know the exact number of parameters; it only tries to load one argument per conversion specifier. If you provide fewer arguments than conversion specifiers, the behavior is undefined; if you provide more, the extras are ignored.
As for the second question: yes, this would only flush messages already in the buffer; the new message won't be flushed.
I think fprintf uses varargs to process its parameters, so any extra parameters should be safely ignored (not that it's good practice or anything). And you are right that fflush will be called before fprintf, so this is a somewhat pointless hack.
With enough warning flags enabled (like -Wall for gcc) you will get a warning about the extra argument.