It may be simple question, but why implementing NSCopying protocol in my class, I get zone == nil
- (id)copyWithZone:(NSZone *)zone
{
if (zone == nil)
NSLog(#"why this is allways nil");
(...)
}
This is called using this method for copy array with objects.
[[NSArray alloc] initWithArray:myArray copyItems:YES]];
Kevin's and Robin's answer is the most accurate. Oscar's answer is pretty close to correct. But neither the Gnustep documentation nor logancautrell's reasons for the existence of zones is quite correct.
Zones were originally created -- first NXZone, then NSZone -- to ensure that objects allocated from a single zone would be relatively contiguous in memory, that much is true. As it turns out, this does not reduce the amount of memory an app uses; it ends up increasing it slightly, in most cases.
The larger purpose was to be able to mass destroy a set of objects.
For example, if you were to load a complicated document into a document based application, tear-down of the object graph when the document was closed could actually be quite significantly expensive.
Thus, if all the objects for a document were allocated from a single zone and the allocation metadata for that zone was also in that zone, then destruction of all objects related to the document would be as cheap as simply destroying the zone (which was really cheap -- "here, system, have these pages back" -- one function call).
This proved unworkable. If a single reference to an object in the zone leaked out of the zone, then your app would go BOOM as soon as the document was closed and there was no way for the object to tell whatever was referring to it to stop. Secondly, this model also fell prey to the "scarce resource" problem so often encountered in GC'd system. That is, if the object graph of the document held onto non-memory resources, there was no way to clean up said resources efficiently prior to zone destruction.
In the end, the combination of not nearly enough of a performance win (how often do you really close complex documents) with all the added fragility made zones a bad idea. Too late to change the APIs, though, and we are left with the vestiges.
A NULL zone just means 'use the default zone'. Zones aren't used by the modern Objective C runtime anymore and cannot be used with ARC at all.
See documentation
NSZone was deprecated a long time ago. The fact that it's still in method signatures (e.g. +allocWithZone: and -copyWithZone:) are for backwards compatibility.
NSZone is now an undocumented class because it is quite old, its purpose was to allocate objects on the heap using the same set of virtual memory pages. However it is mostly not used anymore, but since it was used before, that parameter is still there for backwards compatibility.
Zone is a legacy from the old days when computers had 8 megs or less of RAM.
Check this out (3.1.2 Memory Allocation and Zones):
http://www.gnustep.org/resources/documentation/Developer/Base/ProgrammingManual/manual_3.html
There is also a good discussion of this over on cocoa builder (well it was on the cocoa dev mailing list) from about 10 years ago. This is exactly what #bbum was saying.
http://www.cocoabuilder.com/archive/cocoa/65056-what-an-nszone.html
Apparently this used to be documented in the Apple docs, but it was changed at some point since 2007-06-06.
http://www.cocoadev.com/index.pl?NSZone
Related
Question
I am working on a project where I am concerned about the thread safety of an object's properties. I know that when a property is an object such as an NSString, I can run into situations where multiple threads are reading and writing simultaneously. In this case you can get a corrupt read and the app will either crash or result in corrupted data.
My question is for primitive value type properties such as BOOLs or NSIntegers. I am wondering if I can get into a similar situation where I read a corrupt value when reading and writing from multiple threads (and the app will crash)? In either case, I am interested in why.
Clarification - 1/13/17
I am mostly interested in if a primitive value type property is differently susceptible to crashing due to multiple threads accessing it at the same time than an object such as NSMutableString, custom created object, etc. In addition, if there is a difference when accessing memory on the stack vs heap relative to multithreading.
Clarification - 12/1/17
Thank you to #Rob for pointing me to the answer here: stackoverflow.com/a/34386935/1271826! This answer has a great example that shows that depending on the type of architecture you are on (32-bit vs 64-bit), you can get an undefined result when using a primitive property.
Although this is a great step towards answering my question, I still wonder two things:
If there is a multithreading difference when accessing a primitive value property on the stack vs heap (as noted in my previous clarification)?
If you restrict a program to running on one architecture, can you still find yourself in an undefended state when access a primitive value property and why?
I should note that here has been a lot of conversation around atomic vs nonatomic in response to this question. Although this is generally an important concept, this question has little to do with preventing undefined multithreading behavior by using the atomic property modifier or any other thread safety approach such as using GCD.
If your primitive value type property is atomic, then you're assured it cannot be corrupted because your reading it from one thread while setting it from another (as long as you only use the accessor methods, and not interact with the backing ivar directly). That's the entire purpose of atomic. And, as you suggest, this only applicable to fundamental data types (or objects that are both immutable and stateless). But in these narrow cases, atomic can be useful.
Having said that, this is a far cry from concluding that the app is thread-safe. It only assures you that the access to that one property is thread-safe. But often thread-safety must be considered within a broader context. (I know you assure us that this is not the case here, but I qualify this for future readers who too quickly jump to the conclusion that atomic is sufficient to achieve thread-safety. It often is not.)
For example, if your NSInteger property is "how many items are in this cache object", then not only must that NSInteger have its access synchronized, but it must be also be synchronized in conjunction with all interactions with the cache object (e.g. the "add item to cache" and "remove item from cache" tasks, too). And, in these cases, since you'll synchronize all interaction with this broader object somehow (e.g. with GCD queue, locks, #synchronized directive, whatever), making the NSInteger property atomic then becomes redundant and therefore modestly less efficient.
Bottom line, in limited situations, atomic can provide thread-safety for fundamental data types, but frequently it is insufficient when viewed in a broader context.
You later say that you don't care about race conditions. For what it's worth, Apple argues that there is no such thing as a benign race. See WWDC 2016 video Thread Sanitizer and Static Analysis (about 14:40 into it).
Anyway, you suggest you are merely concerned whether the value can be corrupted or whether the app will crash:
I am wondering if I can get into a similar situation where I read a corrupt value when reading and writing from multiple threads (and the app will crash)?
The bottom line is that if you're reading from one thread while mutating on another, the behavior is simply undefined. It could vary. You are simply well advised to avoid this scenario.
In practice, it's a function of the target architecture. For example on 64-bit type (e.g. long long) on 32-bit x86 target, you can easily retrieve a corrupt value, where one half of the 64-bit value is set and the other is not. (See https://stackoverflow.com/a/34386935/1271826 for example.) This results in merely non-sensical, invalid numeric values when dealing with primitive types. For pointers to objects, this obviously would have catestrophic implications.
But even if you're in an environment where no problems are manifested, it's an incredibly fragile approach to eschew synchronization to achieve thread-safety. It could easily break when run on new, unanticipated hardware architectures or compiled under different configuration. I'd encourage you to watch that Thread Sanitizer and Static Analysis video for more information.
Reading through the other questions that are similar to mine, I see that most people want to know why you would need to know the size of an instance, so I'll go ahead and tell you although it's not really central to the problem. I'm working on a project that requires allocating thousands to hundreds of thousands of very small objects, and the default allocation pattern for objects simply doesn't cut it. I've already worked around this issue by creating an object pool class, that allows a tremendous amount of objects to be allocated and initialized all at once; deallocation works flawlessly as well (objects are returned to the pool).
It actually works perfectly and isn't my issue, but I noticed class_getInstanceSize was returning unusually large sizes. For instance, a class that stores one size_t and two (including isA) Class instance variables is reported to be 40-52 bytes in size. I give a range because calling class_getInstanceSize multiple times, even in a row, has no guarantee of returning the same size. In fact, every object but NSObject seemingly reports random sizes that are far from what they should be.
As a test, I tried:
printf("Instance Size: %zu\n", class_getInstanceSize(objc_getClass("MyClassName"));
That line of code always returns a value that corresponds to the size that I've calculated by hand to be correct. For instance, the earlier example comes out to 12 bytes (32-bit) and 24 bytes (64-bit).
Thinking that the runtime may be doing something behind the scenes that requires more memory, I watched the actual memory use of each object. For the example given, the only memory read from or written to is in that 12/24 byte block that I've calculated to be the expected size.
class_getInstanceSize acts like this on both the Apple & GNU 2.0 runtime. So is there a known bug with class_getInstanceSize that causes this behavior, or am I doing something fundamentally wrong? Before you blame my object pool; I've tried this same test in a brand new project using both the traditional alloc class method and by allocating the object using class_createInstance(self, 0); in a custom class method.
Two things I forgot to mention before: I'm almost entirely testing this on my own custom classes, so I know the trickery isn't down to the class actually being a class cluster or any of that nonsense; second, class_getInstanceSize([MyClassName class]) and class_getInstanceSize(self) \\ Ran inside a class method rarely produce the same result, despite both simply referencing isA. Again, this happens in both runtimes.
I think I've solved the problem and it was due to possibly the dumbest reason ever.
I use a profiling/debugging library that is old; in fact, I don't know its actual name (the library is libcsuomm; the header for it has no identifying info). All I know about it is that it was a library available on the computers in the compsci labs (I did a year of Comp-Sci before switching to a Geology major, graduating and never looking back).
Anyway, the point of the library is that it provides a number of profiling and debugging functionalities; the one I use it most for is memory leak detection, since it actually tracks per object unlike my other favorite memory-leak library (now unsupported, MSS) which is based in C and not aware of objects outside of raw allocations.
Because I use it so much when debugging, I always set it up by default without even thinking about it. So even when creating my test projects to try and pinpoint the bug, I set it up without even putting any thought into it. Well, it turns out that the library works by pulling some runtime trickery, so it can properly track objects. Things seem to work correctly now that I've disabled it, so I believe that it was the source of my problems.
Now I feel bad about jumping to conclusions about it being a bug, but at the time I couldn't see anything in my own code that could possibly cause that problem.
I have ARC code of the following form:
NSMutableData* someData = [NSMutableData dataWithLength:123];
...
CTRunGetGlyphs(run, CGRangeMake(0, 0), someData.mutableBytes);
...
const CGGlyph *glyphs = [someData mutableBytes];
...
...followed by code that reads memory from glyphs but does nothing with someData, which isn't referenced anymore. Note that CGGlyph is not an object type but an unsigned integer.
Do I have to worry that the memory in someData might get freed before I am done with glyphs (which is actually just pointing insidesomeData)?
All this code is WITHIN the same scope (i.e., a single selector), and glyphs and someData both fall out of scope at the same time.
PS In an earlier draft of this question I referred to 'garbage collection', which didn't really apply to my project. That's why some answers below give it equal treatment with what happens under ARC.
You are potentially in trouble whether you use GC or, as others have recommended instead, ARC. What you are dealing with is an internal pointer which is not considered an owning reference in either GC or ARC in general - unless the implementation has special-cased NSData. Without that owning reference either GC or ARC might remove the object. The problem you face is peculiar to internal pointers.
As you describe your situation the safest thing to do is to hang onto the real reference. You could do this by assigning the NSData reference to either an instance variable or a static (method local if you wish) variable and then assigning nil to that variable when you've done with the internal pointer. In the case of static be careful about concurrency!
In practice your code will probably work in both GC and ARC, probably more likely in ARC, but either could conceivably bite you especially as compilers change. For the cost of one variable declaration and one extra assignment you avoid the problem, cheap insurance.
[See this discussion as an example of short lifetime under ARC.]
Under actual, real garbage collection that code is potentially a problem. Objects may be released as soon as there is no longer any reference to them and the compiler may discard the reference at any time if you never use it again. For optimisation purposes scope is just a way of putting an upper limit on that sort of thing, not a way of dictating it absolutely.
You can use NSAllocateCollectable to attach lifecycle calculation to C primitive pointers, though it's messy and slightly convoluted.
Garbage collection was never implemented in iOS and is now deprecated on the Mac (as referenced at the very bottom of this FAQ), in both cases in favour of automatic reference counting (ARC). ARC adds retains and releases where it can see that they're implicitly needed. Sadly it can perform some neat tricks that weren't previously possible, such as retrieving objects from the autorelease pool if they've been used as return results. So that has the same net effect as the garbage collection approach — the object may be released at any point after the final reference to it vanishes.
A workaround would be to create a class like:
#interface PFDoNothing
+ (void)doNothingWith:(id)object;
#end
Which is implemented to do nothing. Post your autoreleased object to it after you've finished using the internal memory. Objective-C's dynamic dispatch means that it isn't safe for the compiler to optimise the call away — it has no way of knowing you (or the KVO mechanisms or whatever other actor) haven't done something like a method swizzle at runtime.
EDIT: NSData being a special case because it offers direct C-level access to object-held memory, it's not difficult to find explicit discussions of the GC situation at least. See this thread on Cocoabuilder for a pretty good one though the same caveat as above applies, i.e. garbage collection is deprecated and automatic reference counting acts differently.
The following is a generic answer that does not necessarily reflect Objective-C GC support. However, various GC implementaitons, including ref-counting, can be thought of in terms of Reachability, quirks aside.
In a GC language, an object is guaranteed to exist as long as it is Strongly-Reachable; the "roots" of these Strong-Reachability graphs can vary by language and executing environment. The exact meaning of "Strongly" also varies, but generally means that the edges are Strong-References. (In a manual ref-counting scenario each edge can be thought of as an unmatched "retain" from a given "owner".)
C# on the CLR/.NET is one such implementation where a variable can remain in scope and yet not function as a "root" for a reachability-graph. See the Systems.Timer.Timer class and look for GC.KeepAlive:
If the timer is declared in a long-running method, use KeepAlive to prevent garbage collection from occurring [on the timer object] before the method ends.
As of summer 2012, things are in the process of change for Apple objects that return inner pointers of non-object type. In the release notes for Mountain Lion, Apple says:
NS_RETURNS_INNER_POINTER
Methods which return pointers (other than Objective C object type)
have been decorated with the clang compiler attribute
objc_returns_inner_pointer (when compiling with clang) to prevent the
compiler from aggressively releasing the receiver expression of those
messages, which no longer appear to be referenced, while the returned
pointer may still be in use.
Inspection of the NSData.h header file shows that this also applies from iOS 6 onward.
Also note that NS_RETURNS_INNER_POINTER is defined as __attribute__((objc_returns_inner_pointer)) in the clang specification, which makes it such that
the object's lifetime will be extended until at least the earliest of:
the last use of the returned pointer, or any pointer derived from it,
in the calling function;
or the autorelease pool is restored to a
previous state.
Caveats:
If you're using anything older then Mountain Lion or iOS 6 you will still need to use any of the methods discussed here (e.g., __attribute__((objc_precise_lifetime))) when declaring your local NSData or NSMutableData objects.
Also, even with the newest compiler and Apple libraries, if you use older or third party libraries with objects that do not decorate their inner-pointer-returning methods with __attribute__((objc_returns_inner_pointer)) you will need to decorate your local variables declarations of such objects with __attribute__((objc_precise_lifetime)) or use one of the other methods discussed in the answers.
this a question I always wanted to ask.
When I am running an iOS application in Profiler looking for allocation issues, I found out that NSManagedObject stays in memory long after they have been used and displayed, and the UIViewController who recall has been deallocated. Of course when the UIViewController is allocated again, the number is not increasing, suggesting that there's no leak, and there's some kind of object reuse by CoreData.
If I have a MyManagedObject class which has been given 'mobjc' as name, then in profiler I can see an increasing number of:
MyManagedObject_mobjc_
the number may vary, and for small amount of data, for example 100 objects in sqllite, it grows to that limit and stays there.
But it also seems that sometimes during the application lifecycle the objects are deallocated, so I suppose that CoreData itself is doing some kind of memory optimizations.
It also seems that not the whole object is retained, but rather the 'fault' of it (please forgive my english :-) ) because of the small live byte size.
Even tough a lot of fault objects would also occupy memory.
But at this point I would like some confirmation:
is CoreData really managing and optimizing object in memory ?
is there anything I can do for helping my application to retain less object as possible ?
related to the point above, do I really need to take care of this issue or not ?
do you have some link, possibly by Apple, where this specific subject is explained ?
maybe it is relevant, the app I used for testing rely on ARC and iOS 5.1.
thanks
In this SO topic, Core Data Memory Management, you can find the info you are looking for.
This, instead, is the link to Apple doc on Core Data Memory Managament.
Few tips here.
First, when you deal with Core Data you deal with an object graph. To reduce memory consumption (to prune your graph) you can do a reset on the context you are using or turn objects into fauts passing NO to refreshObject:(NSManagedObject *)object mergeChanges:(BOOL)flag method. If you pass NO to that method, you can lose unsaved changes, so pay attention to it.
Furthermore, don't use Undo Management if you don't need it. It increases memory use (by default in iOS, no undo manager is created).
Hope that helps.
Or, Why I Didn't Use retainCount On My Summer Vacation
This post is intended to solicit detailed write-ups about the whys and wherefores of that infamous method, retainCount, in order to consolidate the relevant information floating around SO.*
The basics: What are the official reasons to not use retainCount? Is there ever any situation at all when it might be useful? What should be done instead?** Feel free to editorialize.
Historical/explanatory: Why does Apple provide this method in the NSObject protocol if it's not intended to be used? Does Apple's code rely on retainCount for some purpose? If so, why isn't it hidden away somewhere?
For deeper understanding: What are the reasons that an object may have a different retain count than would be assumed from user code? Can you give any examples*** of standard procedures that framework code might use which cause such a difference? Are there any known cases where the retain count is always different than what a new user might expect?
Anything else you think is worth metioning about retainCount?
*
Coders who are new to Objective-C and Cocoa often grapple with, or at least misunderstand, the reference-counting scheme. Tutorial explanations may mention retain counts, which (according to these explanations) go up by one when you call retain, alloc, copy, etc., and down by one when you call release (and at some point in the future when you call autorelease).
A budding Cocoa hacker, Kris, could thus quite easily get the idea that checking an object's retain count would be useful in resolving some memory issues, and, lo and behold, there's a method available on every object called retainCount! Kris calls retainCount on a couple of objects, and this one is too high, and that one's too low, and what the heck is going on?! So Kris makes a post on SO, "What's wrong with my memory management?" and then a swarm of <bold>, <large> letters descend saying "Don't do that! You can't rely on the results.", which is well and good, but our intrepid coder may want a deeper explanation.
I'm hoping that this will turn into an FAQ, a page of good informational essays/lectures from any of our experts who are inclined to write one, that new Cocoa-heads can be pointed to when they wonder about retainCount.
** I don't want to make this too broad, but specific tips from experience or the docs on verifying/debugging retain and release pairings may be appropriate here.
***In dummy code; obviously the general public don't have access to Apple's actual code.
The basics: What are the official reasons to not use retainCount?
Autorelease management is the most obvious -- you have no way to be sure how many of the references represented by the retainCount are in a local or external (on a secondary thread, or in another thread's local pool) autorelease pool.
Also, some people have trouble with leaks, and at a higher level reference counting and how autorelease pools work at fundamental levels. They will write a program without (much) regard to proper reference counting, or without learning ref counting properly. This makes their program very difficult to debug, test, and improve -- it's also a very time consuming rectification.
The reason for discouraging its use (at the client level) is twofold:
The value may vary for so many reasons. Threading alone is reason enough to never trust it.
You still have to implement correct reference counting. retainCount will never save you from imbalanced reference counting.
Is there ever any situation at all when it might be useful?
You could in fact use it in a meaningful way if you wrote your own allocators or reference counting scheme, or if your object lived on one thread and you had access to any and all autorelease pools it could exist in. This also implies you would not share it with any external APIs. The easy way to simulate this is to create a program with one thread, zero autorelease pools, and do your reference counting the 'normal' way. It's unlikely that you'll ever need to solve this problem/write this program for anything other than "academic" reasons.
As a debugging aid: you could use it to verify that the retain count is not unusually high. If you take this approach, be mindful of the implementation variances (some are cited in this post), and don't rely on it. Don't even commit the tests to your SCM repository.
This may be a useful diagnostic in extremely rare circumstances. It can be used to detect:
Over-retaining: An allocation with a positive imbalance in retain count would not show up as a leak if the allocation is reachable by your program.
An object which is referenced by many other objects: One illustration of this problem is a (mutable) shared resource or collection which operates in a multithreaded context - frequent access or changes to this resource/collection can introduce a significant bottleneck in your program's execution.
Autorelease levels: Autoreleasing, autorelease pools, and retain/autorelease cycles all come with a cost. If you need to minimize or reduce memory use and/or growth, you could use this approach to detect excessive cases.
From commentary with Bavarious (below): a high value may also indicate an invalidated allocation (dealloc'd instance). This is completely an implementation detail, and again, not usable in production code. Messaging this allocation would result in a error when zombies are enabled.
What should be done instead?
If you're not responsible for returning the memory at self (that is, you did not write an allocator), leave it alone - it is useless.
You have to learn proper reference counting.
For a better understanding of release and autorelease usage, set up some breakpoints and understand how they are used, in what cases, etc. You'll still have to learn to use reference counting correctly, but this can aid your understanding of why it's useless.
Even simpler: use Instruments to track allocs and ref counts, then analyze the ref counting and callstacks of several objects in an active program.
Historical/explanatory: Why does Apple provide this method in the NSObject protocol if it's not intended to be used? Does Apple's code rely on retainCount for some purpose? If so, why isn't it hidden away somewhere?
We can assume that it is public for two primary reasons:
Reference counting proper in managed environments. It's fine for the allocators to use retainCount -- really. It's a very simple concept. When -[NSObject release] is called, the ref counter (unless overridden) may be called, and the object can be deallocated if retainCount is 0 (after calling dealloc). This is all fine at the allocator level. Allocators and zones are (largely) abstracted so... this makes the result meaningless for ordinary clients. See commentary with bbum (below) for details on why retainCount cannot be equal to 0 at the client level, object deallocation, deallocation sequences, and more.
To make it available to subclassers who want a custom behavior, and because the other reference counting methods are public. It may be handy in a few cases, but it's typically used for the wrong reasons (e.g. immortal singletons). If you need your own reference counting scheme, then this family may be worth overriding.
For deeper understanding: What are the reasons that an object may have a different retain count than would be assumed from user code? Can you give any examples*** of standard procedures that framework code might use which cause such a difference? Are there any known cases where the retain count is always different than what a new user might expect?
Again, a custom reference counting schemes and immortal objects. NSCFString literals fall into the latter category:
NSLog(#"%qu", [#"MyString" retainCount]);
// Logs: 1152921504606846975
Anything else you think is worth mentioning about retainCount?
It's useless as a debugging aid. Learn to use leak and zombie analyses, and use them often -- even after you have a handle on reference counting.
Update: bbum has posted an article entitled retainCount is useless. The article contains a thorough discussion of why -retainCount isn’t useful in the vast majority of cases.
The general rule of thumb is if you're using this method, you better be damn sure you know what you're doing. If you are using it for debugging a memory leak you're doing it wrong, if you're doing it to see what is going on with an object, you're doing it wrong.
There is one case where I have used it, and found it useful. That is in doing a shared object cache where I wanted to flush the object when nothing had a reference to it anymore. In this situation I waited until the retainCount is equal to 1, and then I can release it knowing that nothing else is holding onto it, this will obviously not work properly in garbage collected environments and there are better ways to do it. But this is still the only 'valid' use case I've seen for it, and isn't something a lot of people will be doing.