How does an Objective-C block capture a non-object value?

int anInteger = 42;
void (^testBlock)(void) = ^{
    NSLog(@"Integer is: %i", anInteger);
};
anInteger = 84;
testBlock();
Integer is: 42
This is an example from Apple's official guide.
For an object value this is easy to understand: the block keeps its own reference to the object. If the original reference later changes to point to something else, or simply goes away, the block's reference is still there, so the reference count does not drop to zero and the original object is kept alive.
But the value in the example code above is not an object. I assumed the block keeps a reference to the variable, and that assigning 84 changes the variable itself rather than a copy, which would mean the value the block's pointer points to has changed. How can it still print 42?

From the Blocks and Variables section of the documentation:
The following rules apply to variables used within a block:
1. Global variables are accessible, including static variables that exist within the enclosing lexical scope.
2. Parameters passed to the block are accessible (just like parameters to a function).
3. Stack (non-static) variables local to the enclosing lexical scope are captured as const variables. Their values are taken at the point of the block expression within the program. In nested blocks, the value is captured from the nearest enclosing scope.
4. Variables local to the enclosing lexical scope declared with the __block storage modifier are provided by reference and so are mutable. Any changes are reflected in the enclosing lexical scope, including any other blocks defined within the same enclosing lexical scope. These are discussed in more detail in The __block Storage Type.
5. Local variables declared within the lexical scope of the block, which behave exactly like local variables in a function. Each invocation of the block provides a new copy of that variable. These variables can in turn be used as const or by-reference variables in blocks enclosed within the block.
Rule 3 applies to the code in your question.

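The block makes this happen by copying, not by indirection. A non-__block local variable that a block captures stays where it is on the stack; the block stores its own const copy of the value, taken when the block literal is evaluated. That copy lives inside the block object itself, so it moves to the heap together with the block if the block is copied, which is how the captured value can outlive the function it was declared in. Only variables declared with __block are shared storage that the compiler may relocate to the heap.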

To make a long story short: integral values are copied. (To be more precise, structs and object references are copied too; in the case of an object reference, what gets copied is the reference, not the object.)
By the way, this is what a closure means and what closures are made for; it is the reason for their existence. You want exactly this behavior. Otherwise you would have to ensure that a value is not changed before the block runs, which may be seconds or minutes later.
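To make the copy semantics concrete, here is a minimal sketch in plain C (which also compiles as Objective-C) of roughly what capture by value amounts to. The struct and function names are made up for illustration; this is not the layout clang actually generates, only a model of the behavior described in the answers above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical model of a block that captures one int by value. */
struct int_capturing_block {
    void (*invoke)(const struct int_capturing_block *self);
    const int captured;   /* snapshot taken when the "block" is created */
};

static int last_seen;     /* records what the block body observed */

static void block_body(const struct int_capturing_block *self) {
    last_seen = self->captured;
    printf("Integer is: %i\n", self->captured);
}

/* Mirrors the question: capture 42, change the original, then call. */
static int run_capture_demo(void) {
    int anInteger = 42;
    struct int_capturing_block testBlock = { block_body, anInteger };

    anInteger = 84;                     /* changes only the original variable */
    assert(testBlock.captured == 42);   /* the snapshot is unaffected */

    testBlock.invoke(&testBlock);
    return last_seen;
}
```

Calling run_capture_demo() prints "Integer is: 42" and returns 42: the assignment of 84 touches only the original variable, never the snapshot stored in the struct.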

Related

How does Objective-C deal with primitive __block variables on the stack when a block is copied?

I have the following code under ARC:
-(void) foo {
    __block int spam = 42;
    self.myProperty = ^{
        spam++;
    };
    self.myProperty(); // Increment first time
}
-(void) bar {
    self.myProperty(); // Increment second time
}
When "Increment first time" is called, ObjC uses a pointer to spam (which resides on the stack) to increment it, right?
After that, foo's stack frame is discarded, so the pointer is no longer valid.
What will bar do? What should happen when I call it?
Everything is clear for objects, since they are created on the heap and copying the block (which happens at the moment of the property assignment) leads to a retain call. But what about auto variables?
I can use the debugger to find the answer, but I want to fully understand it: which part of the ObjC/clang specification covers this?
Update: I used "&" to get the address of my variable and found that the address changes at the moment I assign the block to my property (actually at the moment the block is copied). I believe that is the moment my variable is moved from the stack to the heap.
-(void) foo {
__block int spam = 42;
int* addrOfSpam = &spam;
*addrOfSpam = 43; // OK, spam = 43 now
self.myProperty = ^{ // The moment when spam is moved from stack to heap
spam++;
};
*addrOfSpam = 44; // Fail, spam not changed and 'addrOfSpam' points to nowhere
spam++; // Spam looks like auto/stack var here, BUT IT IS NOT!
// It is on heap now, and "spam" is changed by compiler to:
// *(newAddressOfSpamInHeap_OnlyCompilerKnowsWhere)++;
}
The salient passage in the doc is here. (boldface added by me)
__block variables live in storage that is shared between the lexical scope of the variable and all blocks and block copies declared or created within the variable’s lexical scope. Thus, the storage will survive the destruction of the stack frame if any copies of the blocks declared within the frame survive beyond the end of the frame (for example, by being enqueued somewhere for later execution). Multiple blocks in a given lexical scope can simultaneously use a shared variable.
The __block qualifier places the variable in storage that will persist at least as long as the block that refers to it. So the first and second invocations of the block are identical with respect to the variable. After the second call, spam should equal 44.
First of all, __block is a storage qualifier for variables, and it works the same way for all types of variables. Whether it's a variable of primitive type, pointer-to-object type, or whatever (by the way, a variable cannot have "object type") doesn't matter. Variables of all types normally reside on the stack if they are local variables of a function. Referring to a local variable of pointer-to-object type after its scope has exited equally results in undefined behavior.
About your question, the answer is that __block variables are not only stored on the stack. They can also be stored on the heap. How it works is an implementation detail but it's guaranteed by the specification to be valid to be used in all the blocks that capture it.
If you want to know how it is actually implemented, you can read Clang's Block implementation specification (although I think there are some typos). Essentially, when you have a __block variable, you actually have a data structure, where the variable is a field. And the data structure also has a pointer to keep track of where the current version of itself is. Accesses to the variable are implicitly translated by the compiler to access it through these levels of indirection.
Just like with blocks, this data structure starts out on the stack for optimization, but can be "moved" to the heap (it gets moved the first time any block that captures it gets moved to the heap, because that's when its lifetime might need to exceed the scope it was created in). When it's "moved" to the heap, the pointer that says where it is is updated, so that people will know to use the new copy. This data structure, when in the heap, is memory-managed by a system of reference counting.
Blocks that capture __block variables have a copy of the "current version" pointer, which is updated when the data structure is moved, so they know where to access the variable both before and after the move.
About your tests, you have to be really careful about taking the address of a __block variable, because the location of the variable moves over its lifetime! (something that doesn't usually happen in C).
When you first take the address of spam, it is still on the stack, so you have the address of the stack version of it. spam is captured in a block which is then assigned to the property myProperty whose setter copies the block, moving it to the heap and also moving spam to the heap. addrOfSpam still points to the stack version of the variable which is no longer the version being used; using it to change the integer is changing the wrong version of the variable, because everyone is using the heap copy now.
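The forwarding scheme described above can be sketched in plain C. This is a simplified model under stated assumptions: the names byref_int, byref_access and byref_move_to_heap are hypothetical, and the real structure emitted by clang carries more fields (flags, size, copy and dispose helpers, a reference count). The sketch replays the second foo from the question and then adds one increment through the heap copy, standing in for a call of the block.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the structure behind `__block int`. */
struct byref_int {
    struct byref_int *forwarding;  /* where the current version lives */
    int value;
};

/* Every read or write of the variable goes through the forwarding pointer. */
static int *byref_access(struct byref_int *ref) {
    return &ref->forwarding->value;
}

/* "Move" the variable to the heap, as happens when a capturing block is copied. */
static struct byref_int *byref_move_to_heap(struct byref_int *stack_ref) {
    struct byref_int *heap_ref = malloc(sizeof *heap_ref);
    if (heap_ref == NULL) abort();
    heap_ref->value = stack_ref->forwarding->value;
    heap_ref->forwarding = heap_ref;
    stack_ref->forwarding = heap_ref;   /* the old location now redirects */
    return heap_ref;
}

/* Replays the question's test; returns the final value of spam. */
static int run_byref_demo(void) {
    struct byref_int spam = { &spam, 42 };
    int *addrOfSpam = byref_access(&spam);   /* address of the stack version */
    *addrOfSpam = 43;                        /* OK: spam is 43 */

    struct byref_int *heap = byref_move_to_heap(&spam);  /* block is copied */
    *addrOfSpam = 44;                        /* writes the stale stack copy */
    assert(*byref_access(&spam) == 43);      /* the live variable is unchanged */

    (*byref_access(&spam))++;                /* spam++ in foo: 44 */
    (*byref_access(heap))++;                 /* spam++ inside the block: 45 */

    int result = *byref_access(&spam);
    free(heap);
    return result;
}
```

run_byref_demo() returns 45. The write through addrOfSpam after the move lands in the abandoned stack copy, while both the enclosing scope and the block reach the same heap storage through the forwarding pointer.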

No matching function for call to... but only inside a block?

I've got a strange situation. I have some local variables in a function:
JSContext *cx = ...;
jsval successCb = ...;
There is a function call which takes these parameters:
//JS_RemoveValueRoot(JSContext *cx, jsval *vp);
JS_RemoveValueRoot(cx, &successCb); //works
The above compiles fine. However, if I instead have the following, I get a compile time error:
id foo = ^() {
    JS_RemoveValueRoot(cx, &successCb);
};
Literally, if I copy and paste the line, if it's outside of the block it compiles, yet if it's not, it doesn't. The error is:
No matching function for call to 'JS_RemoveValueRoot'
I suspect something is going on behind the scenes in terms of how block closures are implemented but I'm not familiar enough with Objective C to figure this out. Why does this generate a compile-time error and how do I fix it?
EDIT: It seems that if I do the following I no longer get a compile-time error, but this makes no sense to me, which is always a bad thing, so I'd still like an explanation...
id foo = ^() {
jsval localSuccessCb = successCb;
JS_RemoveValueRoot(cx, &localSuccessCb);
};
It's more complicated than that. Yes, the immediate issue is that all non-__block captured variables are const inside the block. Therefore, inside the block cx has type JSContext * const and successCb has type const jsval. And const jsval * cannot be passed to jsval *. But you have to first understand why the variables are const.
Blocks capture non-__block variables by value at the time that they are created. That means the copy of the variable inside the block and the copy outside are different, independent variables, even though they have the same name. If it were not const, you might be tempted to change the variable inside the block and expect it to change outside, but it does not. (Of course, the opposite problem still happens -- you can still change the variable outside the block, since it's not const, and wonder why it does not change inside the block.) __block resolves this issue by making it so there's only one copy of the variable, that is shared between the inside and outside of the block.
Then it's important to think about why a const variable is not sufficient. If you just need the value of the variable, then a const copy is just as good. When const won't work, usually it's because of the need to assign to the variable. We need to ask: what does JS_RemoveValueRoot do that requires a non-const pointer to the variable? Is it to assign to the variable? (And if it does, do we care about the new value outside the block? Because if not, we can just assign the const variable to a non-const variable inside the block.)
It turns out it's more complicated. According to the documentation of JS_Remove*Root, it neither uses the value of the variable pointed to, nor needs to set the variable; rather, it needs the address of the variable, and this needs to match the address passed to JS_Add*Root. (Actually, I am not even sure whether a non-const pointer is needed for what they're doing.) I am assuming that JS_AddValueRoot was called in the body of the function that encloses the block, outside the block. (I assume this since you said successCb is a local variable, so it must be within this function; if it were within the block, it wouldn't make sense because then successCb could just be a local variable of the block, and thus not need to be captured.)
Because the address of the variable itself is significant, let us consider what happens in the various block variable capture modes. A non-__block variable is now clearly not appropriate, since there are two separate copies (and thus two separate addresses) for the inside and outside. Thus, the addresses given to Add and Remove won't match. A __block variable is shared, and is much better.
However, there is still an issue with __block variables that may make it not match -- the address of a __block variable may change over time! This goes into the specifics of how blocks are implemented. In the current implementation, a __block variable is held in a special structure (a kind of an "object") that starts out on the stack, but when any block capturing it is copied, it is "moved" to the heap as a dynamically-allocated structure. This is very similar to how block objects that capture variables start out on the stack, but are moved to the heap upon being copied. Putting it on the stack first is an optimization, and is not guaranteed to happen; but currently it does. The __block variable itself is actually an access of the variable inside this structure, accessed through a pointer that tracks where this structure is. As the structure is moved from the stack to the heap, you can see the value of the expression &successCb change. (This is not possible in normal C.) Therefore, to have matching addresses, you must ensure that the move has already occurred when you pass the address of the variable to Add. You may be able to do this by forcibly copying a block that captures it.
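Here is a small plain-C sketch of why the local-copy workaround compiles but cannot match the rooted address. It assumes, as this answer does, that the root was added outside the block and that roots are matched purely by address. Everything in it (fake_value, fake_add_root, fake_remove_root, capturing_block) is a hypothetical stand-in, not the SpiderMonkey API.

```c
#include <assert.h>
#include <stddef.h>

typedef long fake_value;              /* hypothetical stand-in for jsval */

static fake_value *registered_root;   /* one-slot registry keyed by address */

static void fake_add_root(fake_value *vp) { registered_root = vp; }

/* Succeeds (returns 1) only when given the same address that was added. */
static int fake_remove_root(fake_value *vp) {
    if (vp != registered_root) return 0;
    registered_root = NULL;
    return 1;
}

/* What a block holds for a non-__block capture: its own const copy. */
struct capturing_block {
    const fake_value successCb;
};

/* Body of the block, using the workaround from the question's edit. */
static int block_body(const struct capturing_block *self) {
    fake_value localSuccessCb = self->successCb;   /* non-const local copy */
    return fake_remove_root(&localSuccessCb);      /* compiles, wrong address */
}

/* Returns 1 if the removal attempted from inside the block succeeded. */
static int removal_from_block_succeeds(void) {
    fake_value successCb = 5;
    fake_add_root(&successCb);                     /* rooted outside the block */

    struct capturing_block blk = { successCb };    /* capture by value */
    int removed_inside = block_body(&blk);

    int removed_outside = fake_remove_root(&successCb);
    assert(removed_outside == 1);                  /* the original address matches */
    return removed_inside;
}
```

removal_from_block_succeeds() returns 0: the copy inside the block has a different address from the variable that was rooted, so only the removal done with the original address finds the entry.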
Ah I believe this is the issue. From this article on closures:
Here comes a first difference. The variables available in a block by closure are typed as «const». It means their values can't be modified from inside the block.
Thus the error is that I was passing JS_RemoveValueRoot a const jsval * instead of a jsval *. Creating a local copy that wasn't constant "resolved" the issue (depending on whether that behavior is acceptable, which in this case it is).
Alternatively I could also declare the jsval as:
__block jsval successCb = ...;
In which case I don't have to create a local non-const copy.
Xcode did provide quite the unhelpful error message in this case...

When *exactly* is it necessary to copy a block in Objective-C under ARC?

I've been getting conflicting information about when I need to copy a block when using ARC in objective-C. Advice varies from "always" to "never", so I'm really not sure what to make of it.
I happen to have a case I don't know how to explain:
-(RemoverBlock)whenSettledDo:(SettledHandlerBlock)settledHandler {
    // without this local assignment of the argument, one of the tests fails. Why?
    SettledHandlerBlock handlerFixed = settledHandler;
    [removableSettledHandlers addObject:handlerFixed];
    return ^{
        [removableSettledHandlers removeObject:handlerFixed];
    };
}
Which is called with a block inline like this:
-(void) whatever {
[self whenSettledDo:^(...){
...
}];
}
(The actual code this snippet was adapted from is here.)
What does copying the argument to the local variable change here? Is the version without the local making two distinct copies, one for addObject and one for removeObject, so the removed copy doesn't match the added copy?
Why or when isn't ARC handling blocks correctly? What does it guarantee and what are my responsibilities? Where is all of this documented in a non-vague fashion?
In C, correctness cannot be inferred from running any number of tests, because you could be seeing undefined behavior. To properly know what is correct, you need to consult the language specification. In this case, the ARC specification.
It is instructive to first review when it is necessary to copy a block under MRC. Basically, a block that captures variables can start out on the stack. What this means is when you see a block literal, the compiler can replace it with a hidden local variable in that scope that contains the object structure itself, by value. Since local variables are only valid in the scope they are declared in, that is why blocks from block literals are only valid in the scope the literal is in, unless it is copied.
Furthermore, there is the additional rule that, if a function takes a parameter of block pointer type, it makes no assumptions about whether it's a stack block or not. It is only guaranteed that the block is valid at the time the block is called. However, this pretty much means that the block is valid for the entire duration of the function call, because 1) if it is a stack block, and it is valid when the function was called, that means somewhere up the stack where the block was created, the call is still within the scope of the stack literal; therefore it will still be in scope by the end of the function call; 2) if it is a heap block or global block, it is subject to the same memory management rules as other objects.
From this, we can deduce where it is necessary to copy. Let's consider some cases:
If the block from a block literal is returned from the function: It needs to be copied, since the block escapes from the scope of the literal
If the block from a block literal is stored in an instance variable: It needs to be copied, since the block escapes from the scope of the literal
If the block is captured by another block: It does not need to be copied, since the capturing block, if copied, will retain all captured variables of object type AND copy all captured variables of block type. Thus, the only situation where our block would escape this scope would be if the block that captures it escapes the scope; but in order to do that, that block must be copied, which in turn copies our block.
If the block from a block literal is passed to another function, and that function's parameter is of block pointer type: It does not need to be copied, since the function does not assume that it was copied. This means that any function that takes a block and needs to "store it for later" must take responsibility for copying the block. And indeed this is the case (e.g. dispatch_async).
If the block from a block literal is passed to another function, and that function's parameter is not of block pointer type (e.g. -addObject:): It needs to be copied if you know that this function stores it for later. The reason it needs to be copied is that the function cannot take responsibility for copying the block, since it does not know it is taking a block.
So if your code in the question was in MRC, -whatever would not need to copy anything. -whenSettledDo: will need to copy the block, since it is passed to addObject:, a method that takes a generic object, type id, and doesn't know it's taking a block.
Now, let's look at which of these copies ARC takes care of for you. Section 7.5 says
With the exception of retains done as part of initializing a __strong parameter variable or reading a __weak variable, whenever these semantics call for retaining a value of block-pointer type, it has the effect of a Block_copy. The optimizer may remove such copies when it sees that the result is used only as an argument to a call.
What the first part means is that, in most places where you assign to a strong reference of block pointer type (which normally causes a retain for object pointer types), it will be copied. However, there are some exceptions: 1) In the beginning of the first sentence, it says that a parameter of block pointer type is not guaranteed to be copied; 2) In the second sentence, it says that if a block is only used as an argument to a call, it is not guaranteed to be copied.
What does this mean for the code in your question? handlerFixed is a strong reference of block pointer type, and the result is used in two places, more than just as an argument to a call, so assigning to it assigns a copy. If, however, you had passed a block literal directly to addObject:, then a copy is not guaranteed (since the block is used only as an argument to a call), and you would need to copy it explicitly (as discussed, a block passed to addObject: needs to be copied).
When you used settledHandler directly, since settledHandler is a parameter, it is not automatically copied, so when you pass it to addObject: you need to copy it explicitly, for the same reason.
So in conclusion, in ARC you need to explicitly copy when passing a block to a function that doesn't specifically take block arguments (like addObject:), if it's a block literal, or it's a parameter variable that you're passing.
I've confirmed that my particular issue was in fact making two distinct copies of the block. Tricky tricky. This implies the proper advice is "never copy, unless you want to be able to compare the block to itself".
Here's the code I used to test it:
-(void) testMultipleCopyShenanigans {
    NSMutableArray* blocks = [NSMutableArray array];
    NSObject* v = nil;
    TOCCancelHandler remover = [self addAndReturnRemoverFor:^{ [v description]; }
                                                         to:blocks];
    test(blocks.count == 1);
    remover();
    test(blocks.count == 0); // <--- this test fails
}
-(void(^)(void))addAndReturnRemoverFor:(void(^)(void))block to:(NSMutableArray*)array {
    NSLog(@"Argument: %@", block);
    [array addObject:block];
    NSLog(@"Added___: %@", array.lastObject);
    return ^{
        NSLog(@"Removing: %@", block);
        [array removeObject:block];
    };
}
The logging output when running this test is:
Argument: <__NSStackBlock__: 0xbffff220>
Added___: <__NSMallocBlock__: 0x2e283d0>
Removing: <__NSMallocBlock__: 0x2e27ed0>
The argument is an NSStackBlock, stored on the stack. In order to be placed in the array or the closure it must be copied to the heap. But this happens once for the addition to the array and once for the closure.
So the NSMallocBlock in the array has an address ending in 83d0 whereas the one in the closure that is removed from the array has an address ending in 7ed0. They are distinct. Removing one doesn't count as removing the other.
Bleh, guess I need to watch out for that in the future.
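The two-copies effect can be modelled in plain C. This is only a sketch: fake_block and fake_block_copy are hypothetical names, and the single assumption carried over from the real runtime is the one visible in the log above, namely that copying a stack block yields a fresh heap object each time, while copying a block that is already on the heap returns the same object.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a block object; `on_heap` plays the role of the
   stack/malloc distinction seen in the log output. */
struct fake_block {
    int on_heap;
    int captured;
};

/* Modelled on Block_copy: a stack block gets a fresh heap copy each time,
   while copying a heap block hands back the same object. */
static struct fake_block *fake_block_copy(struct fake_block *b) {
    if (b->on_heap) return b;
    struct fake_block *copy = malloc(sizeof *copy);
    if (copy == NULL) abort();
    *copy = *b;
    copy->on_heap = 1;
    return copy;
}

/* Returns 1 if the pointer stored by "addObject:" equals the pointer the
   remover closure would later hand to "removeObject:". */
static int pointers_match(int copy_first) {
    struct fake_block stack_block = { 0, 7 };
    struct fake_block *arg = &stack_block;
    if (copy_first) arg = fake_block_copy(arg);   /* like assigning to a strong local */

    struct fake_block *stored   = fake_block_copy(arg);  /* copy made for the array */
    struct fake_block *captured = fake_block_copy(arg);  /* copy made for the closure */
    int same = (stored == captured);
    assert(stored->on_heap && captured->on_heap);

    if (stored != captured) free(captured);
    free(stored);
    return same;
}
```

pointers_match(0) returns 0 and pointers_match(1) returns 1, which matches the observation in the question that assigning the parameter to a local variable first makes the test pass.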
A block must be copied when the application leaves the scope where the block was defined. A bad example:
BOOL yesno;
dispatch_block_t aBlock;
if (yesno)
{
    aBlock = ^(void) { printf("yesno is true\n"); };
}
else
{
    aBlock = ^(void) { printf("yesno is false\n"); };
}
aBlock = [aBlock copy];
It's too late already! The block has left its scope (the { brackets } ) and things can go wrong. This could have been fixed trivially by not having the { brackets }, but it is one of the rare cases where you call copy yourself.
When you store a block away somewhere, 99.99% of the time you are leaving the scope where the block was declared; usually this is solved by making block properties "copy" properties. If you call dispatch_async etc. the block needs to be copied, but the called function will do that. The block based iterators for NSArray and NSDictionary typically don't have to make copies of the block because you are still running inside the scope where the block was declared.
[aBlock copy] when the block was already copied doesn't do anything, it just returns the block itself.

Should I use the __block specifier on an object pointer even though it works without it?

I am using a CAAnimation completion block (using CAAnimationBlocks) to provide processing at the end of an animation and part of that completion block modifies the animated CALayer. This works even if layer isn't declared with the __block specifier, given the object pointer remains constant, however I am really treating the object as read/write.
One aspect of the Apple Guide that bothers me is:
__block variables live in storage that is shared between the lexical scope of the variable and all blocks and block copies declared or
created within the variable’s lexical scope.
Given the layer is a collection iterator, that looks to me like it will actually break if I do use the __block specifier.
Here is the code in question:
for (CALayer *layer in _myLayers) // _myLayers is an ivar of the containing object
{
    CAAnimationGroup *group = ...;
    ...
    group.completion = ^(BOOL finished)
    {
        [CATransaction begin];
        [CATransaction setValue:(id)kCFBooleanTrue
                         forKey:kCATransactionDisableActions];
        layer.position = [self _findPosition];
        [CATransaction commit];
        [layer removeAnimationForKey:@"teleportEffect"];
    };
    [layer addAnimation:group forKey:@"teleportEffect"];
}
My actual question is: am I doing this right? (My spider sense is tingling.)
EDIT I should also add that my app uses MRR, however there are no issues with retain/release given the layers are static in nature (their lifetime is that of the containing NSView). Also I appear to be doing precisely what the Patterns to Avoid section of the guide say I shouldn't do, although it's not clear (to me) why.
__block variables are mutable within the block (and the enclosing scope) and are preserved if any referencing block is copied to the heap.
I don't think you need a __block variable in your case: you are changing the state of the object that layer points to, not the layer variable itself. Since the layer belongs to the _myLayers array, which seems to be an instance variable, it is unlikely that the object will be released before each block has run. Note that the __block storage type modifier does not retain the object: without ARC, a block retains a plain captured object variable when it is copied and releases it when the block itself is released, whereas a __block object variable is not retained. With ARC, object variables are retained and released automatically in both cases.
EDIT:
As to your concern with the anti-patterns you mention, I think that in both anti-pattern examples the critical point is that the variable declaration and the "block literal" assigned to it have different scopes. Take the for case presented there:
void dontDoThis() {
void (^blockArray[3])(void); // an array of 3 block references
for (int i = 0; i < 3; ++i) {
blockArray[i] = ^{ printf("hello, %d\n", i); };
// WRONG: The block literal scope is the "for" loop
}
}
blockArray is visible within the whole method body;
in the for loop, you create a block; a block is an object (some storage in memory) and has an address; the block as an object has "stack-local data structure" (from the reference above), i.e., it is allocated on the stack when you enter the method;
the fact that the "block literal" is treated as a variable local to the for loop, means that that storage can be reused at each successive iteration;
the block addresses are assigned to blockArray elements;
when you exit the for loop, blockArray will contain addresses of blocks that are possibly not there anymore and/or have been overwritten at each step depending on what the compiler does to stack data structure created within a for scope.
Your case is different, since your local variable is also within the for scope, and it will not be visible outside of it.
The case presented as an anti-pattern is similar to this:
{
    int *array[3];
    for (int i = 0; i < 3; ++i) {
        int a = i;
        array[i] = &a;
        // WRONG: the scope of a is the "for" loop
    }
}
Very likely, the a variable within the for scope will be allocated on the stack just once and then reused at each iteration of the loop, so all three pointers end up equal. According to the C standard, the lifetime of a ends when the loop body's block exits, so dereferencing those pointers after the loop is undefined behavior; either way, the code does not do anything sensible.
OLD ANSWER:
__block variables live in storage that is shared between the lexical scope of the variable and all blocks and block copies declared or created within the variable’s lexical scope.
I think this can be better understood like this: the lexical scope of the __block variable and all blocks (as per above definition) will share the same storage for that variable. So, if one block (or the original lexical scope) modifies the variable (I mean here the variable pointing to the object), that change will be visible to all others.
Given this, one effect of declaring a variable as __block is that in the non-ARC case the object pointed-to by it will not be automatically retained by each block where it is passed into (with ARC, the retain is done also for __block variables).
Both when using ARC and not using ARC, you need to use the __block specifier when you want to change the variable value and want that all blocks use the new value. Imagine that you had a block to initialize your _myLayers ivar: in this case, you would need to pass the _myLayers variable into the block as a __block variable, so that it (vs. a copy of it) can be modified.
In your case, if you are not using ARC, then, it all depends on whether the object pointed to by layer will be still there when the block is executed. Since layer comes from _myLayers, this converts into whether the object owning _myLayers will be still there when the block is executed. The answer to this is normally yes, since the block we are talking about is the completion block for an animation on that layer. (It would have been different, say, if it were the completion block for a network request).
Hope this helps.

What is meant by the term "local object variable"?

Are "local object variables" the variables that are used or initialized in a method, or are they the arguments taken in? I can't find this term in Xcode's documentation or Google.
I found this in the Objective-C book that I'm using. The full quote is
Local variables that are basic C data types have no default initial value, so you must set them to some value before using them. The three local variables in the reduce method are set to values before they are used, so that's not a problem here. Local object variables are initialized to nil by default. Unlike your instance variables (which retain their values through method calls), these local variables have no memory. Therefore, after the method returns, the values of these variables disappear. Every time a method is called, each local variable defined in that method is reinitialized to the value specified (if any) with the variable's declaration."
Based on your comment, I understand what the book means. Local variables are variables local to a particular scope (denoted by braces '{}' in C and Objective-C). Local variables are declared in the scope where they're used, as opposed to global variables which can be seen and used globally (to a file, multiple files or the whole program depending on declaration visibility). Instance variables are part of a class instance and can be used by any of its methods (and other classes too if declared using @public, though that's generally not good practice).
Primitive local variables are local variables whose type is a C primitive like int, float, char, etc. What the book is calling "local object variables" are simply local variables whose type is (a pointer to) an Objective-C object. Examples are NSString *, NSDictionary * and id.
Local variables are stored on the stack, as opposed to the heap. Variables on the stack go away at the end of the method or function call where they were declared. This Stack Overflow question has some good answers explaining the difference between the stack and the heap: What and where are the stack and heap?
The first result of a Google search for "local variables objective-c": http://blog.ablepear.com/2010/04/objective-c-tuesdays-local-variables.html .
Local variables are defined inside a method, and their scope is limited to that method.