Where does ARC write the release instructions? - objective-c

I know each time I press CMD + B on my keyboard:
Xcode does wake up ARC
ARC analyzes my code and writes all the retain/release/autorelease invocations
finally the code is compiled by LLVM
A few more things happen in the process, however what I am asking is...
Where does ARC write the release instructions?
Just before the variable that references an instance of a class goes out of scope
Just after the last time that variable is used
Example
class HugeObject {
    func doVeryImportantStuff() { print("The answer is \((10*4)+2)") }
}
func foo() {
    let a = HugeObject()
    a.doVeryImportantStuff()
    // <-- Point A
    let b = HugeObject()
    b.doVeryImportantStuff()
    // <-- Point B
}
Where is ARC going to write the a.release() line?
At Point B?
Or, better, is it able to understand that a.release() can safely be moved to Point A?
I suspect there could be important implications in terms of memory footprint, but I could not find any information about this.

First of all: ARC is part of the compiler, whose name is clang, not LLVM. The retain/release calls are inserted while compiling.
To your Q:
Short answer: This depends on the annotations you gave the compiler. Default is not too early and not too late.
Long answer:
Semantically, the release of a local var (local scope, automatic storage) is sent when the extent of the local var ends. But this is subject to optimization. Therefore it is possible that the release is actually sent earlier, that is, anywhere between the last visible use of the local var and the end of its extent.
If you have a reason to keep the object retained for the whole extent, you have to annotate the variable with objc_precise_lifetime.
In general, ARC maintains an invariant that a retainable object pointer held in a __strong object will be retained for the full formal lifetime of the object. Objects subject to this invariant have precise lifetime semantics.
By default, local variables of automatic storage duration do not have precise lifetime semantics. Such objects are simply strong references which hold values of retainable object pointer type, and these values are still fully subject to the optimizations on values under local control.
[…]
A local variable of retainable object owner type and automatic storage duration may be annotated with the objc_precise_lifetime attribute to indicate that it should be considered to be an object with precise lifetime semantics.
http://clang.llvm.org/docs/AutomaticReferenceCounting.html#precise-lifetime-semantics

Related

Weak and strong property implementation

I want to better understand how strong and weak pointers are implemented, so I have worked out some assumptions about what their setter methods would look like (please correct me if I am wrong).
First, the strong pointer setter would look like:
- (void)setObj:(NSObject*)Obj // Setting Strong Obj
{
    // First, check whether we are trying to set the old value again
    if (_Obj == Obj)
    {
        return;
    }
    NSObject* oldObj = _Obj;
    _Obj = [Obj retain];
    [oldObj release];
    // Set pointer to old "chunk" of memory, containing old value,
    // assign new value to backed instance variable and release
    // old value
}
That is my assumption of what a strong setter may look like. So, my first question is: am I correct in my assumption?
Second, the weak reference. I guess it should look similar, but without the retain.
- (void)setObj:(NSObject*)Obj // Setting Weak Obj
{
    if (_Obj == Obj)
    {
        return;
    }
    NSObject* oldObj = _Obj;
    _Obj = Obj; // setting value without incrementing reference count
    [oldObj release];
}
Is that a correct assumption about how a weak reference works?
OK, one more question. Consider a situation like this (in manual memory management):
- (void)testFunc
{
strongObj = val; // Retain count about >= 2
weakObj = val; // Retain count about >=1
}
// Now strongObj lives in memory with value of val, with retain count >=1
// weakObj is destroyed, since after end of a scope (function body) it retain count decreased by 1
So, actually i want to know, whether retain count decremented each time, when method that own variable finishes?
I know this question is familiar to many developers, but I want clarification on these cases. Thanks.
Your strong implementation is correct.
The weak one is wrong. You are not allowed to release the value if you have not previously retained it. You would just set the new value without issuing any memory management calls here.
Then again, that wouldn't really be weak, but assign. The special thing about weak is that the reference is zeroed out if the referenced object is deallocated.
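The retain/release bookkeeping described above can be modelled outside Objective-C. The following is only a sketch in plain C++ (which also compiles as Objective-C++): Obj, Holder, retain, release and demoRetainCountAfterSetters are hypothetical names invented for the illustration, not part of any Apple API, and the model ignores autorelease and zeroing weak references entirely.

```cpp
#include <cassert>

// Hypothetical stand-in for an object under manual reference counting.
struct Obj {
    int retainCount = 1;       // a freshly created object is owned by its creator
    bool deallocated = false;  // flagged instead of freed so the demo can inspect it
};

Obj *retain(Obj *o) { if (o) ++o->retainCount; return o; }
void release(Obj *o) { if (o && --o->retainCount == 0) o->deallocated = true; }

struct Holder {
    Obj *strongObj = nullptr;  // models a strong (retain) instance variable
    Obj *assignObj = nullptr;  // models an assign instance variable

    void setStrong(Obj *newObj) {
        if (strongObj == newObj) return;
        Obj *old = strongObj;
        strongObj = retain(newObj);  // take ownership of the new value first
        release(old);                // then give up the old one
    }

    void setAssign(Obj *newObj) {
        assignObj = newObj;  // never retained, so the old value is never released
    }
};

// Walks one object through both setters and reports its final count.
int demoRetainCountAfterSetters() {
    Obj a;                 // count 1, owned by this function
    Holder h;
    h.setStrong(&a);       // count 2
    h.setAssign(&a);       // count unchanged
    h.setStrong(nullptr);  // count back to 1
    assert(!a.deallocated);
    return a.retainCount;
}
```

If setAssign released the old value, as the weak setter in the question does, the count would fall below what the owners actually hold, which is the over-release the answer warns about.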
For the first and second Q I refer to @rmaddy's comment and Christian's answer.
So, actually i want to know, whether retain count decremented each time, when method that own variable finishes?
First I want to be more precise: When you say "when method that own a variable finishes" you probably mean "when a local strong reference variable of automatic storage class loses its extent". This is not exactly the same. But it is what you likely wanted to say. ("A usual local var.")
In this case it is correct that the referred object is released.
But things are more difficult behind the scenes. For example: What happens if the local var (more precisely again: the referred object) is returned? What happens in this case, depending on whether the method transfers ownership or not?
The basic problem is that automatic reference counting has to take edge cases formally into account, even where "usual" code could not break. A human developer can say: "Oh, there is a very special situation in which the code can break, but I know that this never happens." A compiler cannot. So ARC typically generates a great many memory management calls. Fortunately many of them are optimized away.
If you want to have a deep view into what is done in which situation, you have two good approaches:
Read clang's documentation, which is far more precise than Apple's, but also more complicated.
Create a class in a separate file that implements the methods for manual reference counting (-retain, -release, …) and logs their execution. Then compile it with manual reference counting, which is possible through compiler flags. Use that class in ARC code. You will see what ARC does. (You should not rely on the results, because they are subject to optimization and the strategy can change in the future. But it is a good tool to understand how ARC works.)
It may be helpful to think about strong and weak references in terms of balloons.
A balloon will not fly away as long as at least one person is holding on to a string attached to it. The number of people holding strings is the retain count. When no one is holding on to a string, the balloon will fly away (dealloc). Many people can have strings to that same balloon. You can get/set properties and call methods on the referenced object with both strong and weak references.
A strong reference is like holding on to a string to that balloon. As long as you are holding on to a string attached to the balloon, it will not fly away.
A weak reference is like looking at the balloon. You can see it, access its properties, call its methods, but you have no string to that balloon. If everyone holding onto the string lets go, the balloon flies away, and you cannot access it anymore.
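The balloon picture can be tried out with C++ smart pointers, which behave in a comparable way. This is an analogy only, not how ARC is implemented: std::shared_ptr plays the role of a strong reference, std::weak_ptr the role of a zeroing weak reference, and Balloon and weakSeesBalloonAfterRelease are hypothetical names made up for the sketch.

```cpp
#include <cassert>
#include <memory>

struct Balloon { int color = 7; };

// Returns true if the weak observer can still reach the balloon
// after every strong holder has let go of its string.
bool weakSeesBalloonAfterRelease() {
    std::shared_ptr<Balloon> holder1 = std::make_shared<Balloon>();  // first string
    std::shared_ptr<Balloon> holder2 = holder1;                      // second string
    std::weak_ptr<Balloon> watcher = holder1;                        // just looking

    assert(holder1.use_count() == 2);  // the watcher does not add to the count
    holder1.reset();                   // one person lets go
    assert(!watcher.expired());        // one string is still held
    holder2.reset();                   // last string released, balloon is destroyed
    return watcher.lock() != nullptr;  // lock() yields null once the balloon is gone
}
```

The watcher never keeps the balloon alive; it can only find out whether the balloon is still there.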

Cocoa blocks as strong pointers vs copy

I have worked with blocks several times as pointers to which I held a strong reference.
I have heard that you should use copy, but what is the implication of working with blocks as pointers and not with the raw object?
I never got a complaint from the compiler that I should not use
@property (nonatomic, strong) MyBlock block;
but should use
@property (nonatomic, copy) MyBlock block;
As far as I know, a block is just an object, so why prefer copy anyway?
Short Answer
The answer is that it is historical. You are completely correct that in current ARC code there is no need to use copy and a strong property is fine. The same goes for instance, local and global variables.
Long Answer
Unlike other objects a block may be stored on the stack, this is an implementation optimisation and as such should, like other compiler optimisations, not have direct impact on the written code. This optimisation benefits a common case where a block is created, passed as a method/function argument, used by that function, and then discarded - the block can be quickly allocated on the stack and then disposed of without the heap (dynamic memory pool) being involved.
Compare this to local variables, which (a) are created on the stack, (b) are automatically destroyed when the owning function/method returns and (c) can be passed by address to methods/functions called by the owning function. The address of a local variable cannot be stored and used after its owning function/method has returned - the variable no longer exists.
However objects are expected to outlast their creating function/method (if required), so unlike local variables they are allocated on the heap and are not automatically destroyed based on their creating function/method returning but rather based on whether they are still needed - and "need" here is determined automatically by ARC these days.
Creating a block on the stack may optimise a common case but it also causes a problem - if the block needs to outlast its creator, as objects often do, then it must be moved to the heap before its creator's stack frame is destroyed.
When the block implementation was first released the optimisation of storing blocks on the stack was made visible to programmers as the compiler at that time was unable to automatically handle moving the block to the heap when needed - programmers had to use a function Block_copy() to do it themselves.
While this approach might not be out of place in the low-level C world (and blocks are a C construct), having high-level Objective-C programmers manually manage a compiler optimisation is really not good. As Apple released newer versions of the compiler, improvements were made. Early on, programmers were told they could replace Block_copy(block) with [block copy], fitting in with normal Objective-C objects. Then the compiler started to automatically copy blocks off the stack as needed, but this was not always officially documented.
There has been no need to manually copy blocks off the stack for a while, though Apple cannot shrug off its origins and refers to doing so as "best practice" - which is certainly debatable. In the latest version, Sept 2014, of Apple's Working with Blocks, they state that block-valued properties should use copy, but then immediately come clean (emphasis added):
Note: You should specify copy as the property attribute, because a block needs to be copied to keep track of its captured state outside of the original scope. This isn’t something you need to worry about when using Automatic Reference Counting, as it will happen automatically, but it’s best practice for the property attribute to show the resultant behavior.
There is no need to "show the resultant behavior" - storing the block on the stack in the first place is an optimisation and should be transparent to the code - just like other compiler optimisations the code should gain the performance benefit without the programmer's involvement.
So as long as you use ARC and the current Clang compilers you can treat blocks like other objects, and as blocks are immutable that means you don't need to copy them. Trust Apple, even if they appear to be nostalgic for the "good old days when we did things by hand" and encourage you to leave historical reminders in your code, copy is not needed.
Your intuition was right.
You are asking about the ownership modifier for a property. This affects the getter and/or setter for the property if it is synthesized (or auto-synthesized).
The answer to this question will differ between MRC and ARC.
In MRC, property ownership modifiers include assign, retain, and copy. strong was introduced with ARC, and when strong is used in MRC, it is synonymous with retain. So the question would be about the difference between retain and copy, and there is a lot of difference, because copy's setter saves a copy of the given value.
A block needs to be copied to be used outside the scope where it was created (with a block literal). Since your property will be storing the value as an instance variable that persists across function calls, and it's possible that someone will assign an uncopied block from the scope where it was created, the convention is that you must copy it. copy is the right ownership modifier.
In ARC, strong makes the underlying instance variable __strong, and copy also makes it __strong and adds copying semantics to the setter. However, ARC also guarantees that whenever a value is saved into a __strong variable of block-pointer type, a copy is done. Your property has type MyBlock, which I assume is a typedef for a block pointer type. Therefore, a copy will still be done in the setter if the ownership qualifier were strong. So, in ARC, there is no difference between using strong and copy for this property.
If this declaration might be used in both MRC and ARC though (e.g. a header in a library), it would be a good idea to use copy so that it works correctly in both cases.
what is the implication in working with blocks as pointers and not with the raw object?
You are never using the raw value, you always have a pointer to a block: a block is an object.
Looking at your specific example, I am assuming you want to keep the block around, "so why to preferrer copy anyway"? Well, it's a matter of safety (this example is taken from Mike Ash's blog). Since blocks are allocated on the stack (and not on the heap like the rest of the objects in Objective-C), when you do something like this:
[dictionary setObject: ^{ printf("hey hey\n"); } forKey: key];
You are allocating the block on the stack frame of your current scope, so when the scope ends (for example, when you're returning the dictionary), the stack frame is destroyed and the block goes with it. So you've got yourself a dangling pointer. I would advise reading Mike's article fully. Anyway, you can go with a strong property if you copy the block when assigning it:
self.block = [^{} copy];
Edit: After looking at the date of Mike's article, I am assuming this was the pre-ARC behaviour. Under ARC it seems to work as expected, and it won't crash.
Edit2: After experimenting without ARC, it doesn't crash either. But this example shows different results depending on whether ARC is used or not:
void (^block[10])();
int i = -1;
while(++i < 10)
    block[i] = ^{ printf("%d\n", i); };
for(i = 0; i < 10; i++)
    block[i]();
Quoting Mike Ash on the different outcomes:
The reason it prints out ten 9s in the first case is quite simple: the
block that's created within the loop has a lifetime that's tied to the
loop's inner scope. The block is destroyed at the next iteration of
the loop, and when leaving the loop. Of course, "destroy" just means
that its slot on the stack is available to be overwritten. It just
happens that the compiler reuses the same slot each time through the
loop, so in the end, the array is filled with identical pointers, and
thus you get identical behavior.
As far as I understand copy is required when the object is mutable. Use this if you need the value of the object as it is at this moment, and you don't want that value to reflect any changes made by other owners of the object. You will need to release the object when you are finished with it because you are retaining the copy.
On the other hand, strong means that you own the object for as long as it is needed. It is a replacement for the retain attribute, as part of ARC.
Source: Objective-C declared @property attributes (nonatomic, copy, strong, weak)
Note: You should specify copy as the property attribute, because a block needs to be copied to keep track of its captured state outside of the original scope. This isn’t something you need to worry about when using Automatic Reference Counting, as it will happen automatically, but it’s best practice for the property attribute to show the resultant behavior. For more information, see Blocks Programming Topics.

Objective C++ block semantics

Consider the following C++ method:
class Worker {
    ....
private:
    Node *node;
};
void Worker::Work()
{
    NSBlockOperation *op = [NSBlockOperation blockOperationWithBlock: ^{
        Tool hammer(node);
        hammer.Use();
    }];
    ....
}
What, exactly, does the block capture when it captures "node"? The language specification for blocks, http://clang.llvm.org/docs/BlockLanguageSpec.html, is clear for other cases:
Variables used within the scope of the compound statement are bound to the Block in the normal manner with the exception of those in automatic (stack) storage. Thus one may access functions and global variables as one would expect, as well as static local variables. [testme]
Local automatic (stack) variables referenced within the compound statement of a Block are imported and captured by the Block as const copies.
But here, do we capture the current value of this? A copy of this using Worker’s copy constructor? Or a reference to the place where node is stored?
In particular, suppose we say
{
Worker fred(someNode);
fred.Work();
}
The object fred may not exist any more when the block gets run. What is the value of node? (Assume that the underlying Node objects live forever, but Workers come and go.)
If instead we wrote
void Worker::Work()
{
    Node *myNode = node;
    NSBlockOperation *op = [NSBlockOperation blockOperationWithBlock: ^{
        Tool hammer(myNode);
        hammer.Use();
    }];
    ....
}
is the outcome different?
According to this page:
In general you can use C++ objects within a block. Within a member
function, references to member variables and functions are via an
implicitly imported this pointer and thus appear mutable. There are
two considerations that apply if a block is copied:
If you have a __block storage class for what would have been a stack-based C++ object, then the usual copy constructor is used.
If you use any other C++ stack-based object from within a block, it must have a const copy constructor. The C++ object is then copied using that constructor.
Empirically, I observe that it const copies the this pointer into the block. If the C++ instance pointed to by this is no longer at that address when the block executes (for instance, if the Worker instance on which Worker::Work() is called was stack-allocated on a higher frame), then you will get an EXC_BAD_ACCESS or worse (i.e. pointer aliasing). So it appears that:
It is capturing this, not copying instance variables by value.
Nothing is being done to keep the object pointed to by this alive.
Alternately, if I reference a locally stack-allocated (i.e. declared in this stack frame/scope) C++ object, I observe that its copy constructor gets called when it is initially captured by the block, and then again whenever the block is copied (for instance, by the operation queue when you enqueue the operation.)
To address your questions specifically:
But here, do we capture the current value of this? A copy of this using Worker’s copy constructor? Or a reference to the place where node is stored?
We capture this. Consider it a const-copy of an intptr_t if that helps.
The object fred may not exist any more when the block gets run. What is the value of node? (Assume that the underlying Node objects live forever, but Workers come and go.)
In this case, this has been captured by value, and node is effectively read through the address this + <offset of node in Worker>. Since the Worker instance is gone, that read yields a garbage pointer.
I would infer no magic or other behavior other than exactly what's described in those docs.
In C++, when you write an instance variable node, without explicitly writing something->node, it is implicitly this->node. (Similar to how in Objective-C, if you write an instance variable node, without explicitly writing something->node, it is implicitly self->node.)
So the variable which is being used is this, and it is this that is captured. (Technically this is described in the standard as a separate expression type of its own, not a variable; but for all intents and purposes it acts as an implicit local variable of type Worker *const.) As with all non-__block variables, capturing it makes a const copy of this.
Blocks have memory management semantics when they capture a variable of Objective-C object pointer type. However, this does not have Objective-C object pointer type, so nothing is done with it in terms of memory management. (There is nothing that can be done in terms of C++ memory management anyway.) So yes, the C++ object pointed to by this could be invalid by the time the block runs.
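The difference between the two versions of Work() in the question can be reproduced in plain C++. The sketch below swaps the Objective-C block for a C++ lambda with an explicit capture list, which is a different mechanism, so treat it as an analogy under that assumption. Node and Worker are minimal stand-ins for the question's types, and workCapturingThis, workCapturingNode and runAfterWorkerIsGone are hypothetical names.

```cpp
#include <cassert>
#include <functional>

struct Node { int id; };

class Worker {
public:
    explicit Worker(Node *n) : node(n) {}

    // Like the first version in the question: the closure holds `this` and
    // reads this->node only when it runs, so it must not outlive the Worker.
    std::function<int()> workCapturingThis() {
        return [this] { return node->id; };
    }

    // Like the second version: the Node pointer is copied into the closure
    // when it is created, so the Worker may go away afterwards.
    std::function<int()> workCapturingNode() {
        Node *myNode = node;
        return [myNode] { return myNode->id; };
    }

private:
    Node *node;
};

int runAfterWorkerIsGone(Node *longLived) {
    std::function<int()> op;
    {
        Worker fred(longLived);
        op = fred.workCapturingNode();
        // Storing fred.workCapturingThis() here instead would leave op with a
        // dangling `this` once fred goes out of scope.
    }
    return op();  // fred is gone, but the captured Node pointer is still valid
}
```

Copying the member into a local before creating the closure is therefore the version that survives the Worker, provided the Node itself lives long enough, as the question assumes.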

ARC - why do object pointers require explicit ownership type in function definitions?

void testFunction (id testArgument[]) {
return;
}
I'm getting the error "Must explicitly describe intended ownership of an object array parameter". Why does ARC need me to specify the ownership type of the objects in the testArgument array?
To expand on Jeremy's answer, ARC had two primary goals when designed:
make memory management as fully automatic as possible in pure Objective-C code while also preserving or maximizing efficiency (in fact, ARC can be more efficient than manual retain release).
require an exact, explicit declaration of memory management intent when crossing the boundary between C and Objective-C.
As well, the implementation of ARC is extremely conservative. That is, anywhere the behavior has traditionally been "undefined", ARC will emit a diagnostic.
Thus, in this case, the declaration of intent is required so that the compiler can apply a consistent and specific set of memory management rules to the contents of the array.
Because ARC needs to know whether to insert retain/release calls for you to avoid memory leaks.

objective-c memory management--how long is object guaranteed to exist?

I have ARC code of the following form:
NSMutableData* someData = [NSMutableData dataWithLength:123];
...
CTRunGetGlyphs(run, CFRangeMake(0, 0), someData.mutableBytes);
...
const CGGlyph *glyphs = [someData mutableBytes];
...
...followed by code that reads memory from glyphs but does nothing with someData, which isn't referenced anymore. Note that CGGlyph is not an object type but an unsigned integer.
Do I have to worry that the memory in someData might get freed before I am done with glyphs (which is actually just pointing inside someData)?
All this code is WITHIN the same scope (i.e., a single selector), and glyphs and someData both fall out of scope at the same time.
PS In an earlier draft of this question I referred to 'garbage collection', which didn't really apply to my project. That's why some answers below give it equal treatment with what happens under ARC.
You are potentially in trouble whether you use GC or, as others have recommended instead, ARC. What you are dealing with is an internal pointer which is not considered an owning reference in either GC or ARC in general - unless the implementation has special-cased NSData. Without that owning reference either GC or ARC might remove the object. The problem you face is peculiar to internal pointers.
As you describe your situation, the safest thing to do is to hang onto the real reference. You could do this by assigning the NSData reference to either an instance variable or a static (method-local if you wish) variable and then assigning nil to that variable when you're done with the internal pointer. In the case of static, be careful about concurrency!
In practice your code will probably work in both GC and ARC, probably more likely in ARC, but either could conceivably bite you especially as compilers change. For the cost of one variable declaration and one extra assignment you avoid the problem, cheap insurance.
[See this discussion as an example of short lifetime under ARC.]
Under actual, real garbage collection that code is potentially a problem. Objects may be released as soon as there is no longer any reference to them and the compiler may discard the reference at any time if you never use it again. For optimisation purposes scope is just a way of putting an upper limit on that sort of thing, not a way of dictating it absolutely.
You can use NSAllocateCollectable to attach lifecycle calculation to C primitive pointers, though it's messy and slightly convoluted.
Garbage collection was never implemented in iOS and is now deprecated on the Mac (as referenced at the very bottom of this FAQ), in both cases in favour of automatic reference counting (ARC). ARC adds retains and releases where it can see that they're implicitly needed. Sadly it can perform some neat tricks that weren't previously possible, such as retrieving objects from the autorelease pool if they've been used as return results. So that has the same net effect as the garbage collection approach — the object may be released at any point after the final reference to it vanishes.
A workaround would be to create a class like:
@interface PFDoNothing
+ (void)doNothingWith:(id)object;
@end
Which is implemented to do nothing. Post your autoreleased object to it after you've finished using the internal memory. Objective-C's dynamic dispatch means that it isn't safe for the compiler to optimise the call away — it has no way of knowing you (or the KVO mechanisms or whatever other actor) haven't done something like a method swizzle at runtime.
EDIT: NSData being a special case because it offers direct C-level access to object-held memory, it's not difficult to find explicit discussions of the GC situation at least. See this thread on Cocoabuilder for a pretty good one though the same caveat as above applies, i.e. garbage collection is deprecated and automatic reference counting acts differently.
The following is a generic answer that does not necessarily reflect Objective-C GC support. However, various GC implementations, including ref-counting, can be thought of in terms of Reachability, quirks aside.
In a GC language, an object is guaranteed to exist as long as it is Strongly-Reachable; the "roots" of these Strong-Reachability graphs can vary by language and executing environment. The exact meaning of "Strongly" also varies, but generally means that the edges are Strong-References. (In a manual ref-counting scenario each edge can be thought of as an unmatched "retain" from a given "owner".)
C# on the CLR/.NET is one such implementation where a variable can remain in scope and yet not function as a "root" for a reachability-graph. See the System.Timers.Timer class and look for GC.KeepAlive:
If the timer is declared in a long-running method, use KeepAlive to prevent garbage collection from occurring [on the timer object] before the method ends.
As of summer 2012, things are in the process of change for Apple objects that return inner pointers of non-object type. In the release notes for Mountain Lion, Apple says:
NS_RETURNS_INNER_POINTER
Methods which return pointers (other than Objective C object type)
have been decorated with the clang compiler attribute
objc_returns_inner_pointer (when compiling with clang) to prevent the
compiler from aggressively releasing the receiver expression of those
messages, which no longer appear to be referenced, while the returned
pointer may still be in use.
Inspection of the NSData.h header file shows that this also applies from iOS 6 onward.
Also note that NS_RETURNS_INNER_POINTER is defined as __attribute__((objc_returns_inner_pointer)) in the clang specification, which makes it such that
the object's lifetime will be extended until at least the earliest of:
the last use of the returned pointer, or any pointer derived from it, in the calling function;
or the autorelease pool is restored to a previous state.
Caveats:
If you're using anything older than Mountain Lion or iOS 6 you will still need to use one of the methods discussed here (e.g., __attribute__((objc_precise_lifetime))) when declaring your local NSData or NSMutableData objects.
Also, even with the newest compiler and Apple libraries, if you use older or third-party libraries with objects that do not decorate their inner-pointer-returning methods with __attribute__((objc_returns_inner_pointer)), you will need to decorate your local variable declarations of such objects with __attribute__((objc_precise_lifetime)) or use one of the other methods discussed in the answers.