Key-value coding for mutable collections & atomic accessors

My question, in brief: Is there any way to make the mutable-collection KVC accessors thread-safe using the same lock that protects the @synthesized methods?
Explanation: I have a controller class which contains a collection (NSMutableArray) of Post objects. These objects are downloaded from a website, and thus the collection changes from time to time. I would like to be able to use key-value observing to observe the array, so that I can keep my interface updated.
My controller has a posts property, declared as follows:
@property (retain) NSMutableArray *posts;
If I call @synthesize in my .m file, it will create the -(NSMutableArray *)posts and -(void)setPosts:(NSMutableArray *)obj methods for me. Further, they will be protected by a lock such that two threads cannot stomp on each other while setting (or getting) the value.
However, in order to be key-value coding compliant for a mutable ordered collection, there are a few other methods I need to implement. Specifically, I need to implement at least the following:
-insertObject:inPostsAtIndex:
-removeObjectFromPostsAtIndex:
However, since the posts are downloaded asynchronously, I would like to be able to insert new posts into the array on a background thread as well. This means that access needs to be thread-safe.
So, my question. Is there any way to make those accessors thread-safe with the same lock that protects the @synthesized methods? Or do I have to resort to specifying the setPosts: and posts methods myself in order to guarantee full atomicity across all accessors?

The Objective-C docs at developer.apple.com[1] don't describe any way to use the lock that guards your @synthesized accessors from your own explicitly defined methods. Given that, I'd say that to be completely safe it would be better to define all of the accessors yourself, so you can be sure they all use the same lock.
You may be able to use the debugger to determine which lock the @synthesized accessors use, but that's not something I'd rely on.
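A minimal sketch of the "define them all yourself" approach, assuming ARC. The PostController class name and the choice of NSLock are assumptions for illustration, not something the question or Apple's docs prescribe. Unlike the declared property, the getter here hands out an immutable snapshot, so callers never hold the array that another thread may be mutating.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical controller (assumes ARC). Every accessor for "posts"
// takes the same NSLock, so none of them can interleave.
@interface PostController : NSObject {
    NSMutableArray *_posts;  // only touched while _postsLock is held
    NSLock *_postsLock;
}
- (NSArray *)posts;
- (void)setPosts:(NSArray *)newPosts;
- (void)insertObject:(id)post inPostsAtIndex:(NSUInteger)index;
- (void)removeObjectFromPostsAtIndex:(NSUInteger)index;
@end

@implementation PostController

- (instancetype)init {
    self = [super init];
    if (self) {
        _posts = [[NSMutableArray alloc] init];
        _postsLock = [[NSLock alloc] init];
    }
    return self;
}

// Returns an immutable snapshot taken under the lock.
- (NSArray *)posts {
    [_postsLock lock];
    NSArray *snapshot = [NSArray arrayWithArray:_posts];
    [_postsLock unlock];
    return snapshot;
}

// Replaces the contents of the backing array under the lock.
- (void)setPosts:(NSArray *)newPosts {
    [_postsLock lock];
    [_posts setArray:newPosts];
    [_postsLock unlock];
}

// KVC mutator for the ordered to-many key "posts".
- (void)insertObject:(id)post inPostsAtIndex:(NSUInteger)index {
    [_postsLock lock];
    [_posts insertObject:post atIndex:index];
    [_postsLock unlock];
}

// KVC mutator for the ordered to-many key "posts".
- (void)removeObjectFromPostsAtIndex:(NSUInteger)index {
    [_postsLock lock];
    [_posts removeObjectAtIndex:index];
    [_postsLock unlock];
}

@end
```

The lock only makes each individual call safe. A sequence such as "read the count, then insert at that index" can still race unless the caller does both steps under one lock, and any KVO notifications are still delivered on whichever thread made the change.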

You probably don't really want to do this. If you do succeed, KVO-notifications will be received on the same thread that makes the change, and if it's a background thread, will be unsuitable for updating the UI.
Instead, why not have your background thread update the property using the main thread? Then you don't even need the property to be atomic.
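A minimal sketch of that suggestion, assuming ARC. FeedController, deliverPosts:waitUntilDone: and appendPosts: are hypothetical names made up for illustration. The download thread never touches the array itself; it only asks the main thread to do the mutation.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical controller (assumes ARC). The posts array is only ever
// mutated on the main thread, so the property can be nonatomic.
@interface FeedController : NSObject
@property (nonatomic, strong) NSMutableArray *posts;
// Entry point a download thread would call; it never touches the array itself.
- (void)deliverPosts:(NSArray *)downloaded waitUntilDone:(BOOL)wait;
@end

@implementation FeedController

- (instancetype)init {
    self = [super init];
    if (self) {
        _posts = [[NSMutableArray alloc] init];
    }
    return self;
}

- (void)deliverPosts:(NSArray *)downloaded waitUntilDone:(BOOL)wait {
    // Hop to the main thread before mutating anything.
    [self performSelectorOnMainThread:@selector(appendPosts:)
                           withObject:downloaded
                        waitUntilDone:wait];
}

// Runs on the main thread only. Mutating through the KVC proxy means
// observers of "posts" are notified on the main thread as well.
- (void)appendPosts:(NSArray *)downloaded {
    [[self mutableArrayValueForKey:@"posts"] addObjectsFromArray:downloaded];
}

@end
```

A background thread would normally pass NO for waitUntilDone: so that it does not block on the UI.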

Related

Is it thread-safe to read an instance variable while calling a setter from another thread?

I have an object with a property:
@interface Car : NSObject
@property(strong) NSLicensePlate *licensePlate;
@end
I use the property in a method:
- (void) doSomething {
[_licensePlate frobnicate];
}
And the property value can be changed in another method:
- (void) doSomethingElse {
[self setLicensePlate:[_licensePlateDealer cleanLicensePlate]];
}
Now, if the -doSomethingElse method is called from another thread while I access the license plate property using the instance variable (as seen in the -doSomething method), is it possible to get a segfault?
Is it possible that the -setLicensePlate setter releases the value stored in _licensePlate right before I call -frobnicate on it and before a new valid value is assigned? And would it help to call [self licensePlate] instead of using _licensePlate directly?

If you want to enjoy the atomic behavior (which is the default behavior that you get because you didn't specify the nonatomic qualifier) of this property, you must use the getter (self.licensePlate or [self licensePlate]), not use the ivar (_licensePlate).
In general, it's usually prudent to use the getters and setters everywhere except (a) the init method; and (b) any custom accessor methods. The overhead is negligible and you avoid a spectrum of potential problems ranging from atomicity, memory semantics, KVO, future-proofing code in case you customize accessor methods at some future date, etc.
But, assuming you access your property only through the accessor methods (the getters and setters), the atomic qualifier, as described by Programming with Objective-C: Encapsulating Data, ensures that the pointer itself that you are retrieving/setting will not be corrupted by another thread:
[Atomic] means that the synthesized accessors ensure that a value is always fully retrieved by the getter method or fully set via the setter method, even if the accessors are called simultaneously from different threads.
In answer to your question, if another thread changes the licensePlate property while the frobnicate method is running on the first thread, the original object will not be released until that method returns.
But to be clear, the atomic qualifier does not ensure thread safety. As the above guide goes on to warn us:
Note: Property atomicity is not synonymous with an object’s thread safety.
Consider an XYZPerson object in which both a person’s first and last names are changed using atomic accessors from one thread. If another thread accesses both names at the same time, the atomic getter methods will return complete strings (without crashing), but there’s no guarantee that those values will be the right names relative to each other. If the first name is accessed before the change, but the last name is accessed after the change, you’ll end up with an inconsistent, mismatched pair of names.
This example is quite simple, but the problem of thread safety becomes much more complex when considered across a network of related objects. Thread safety is covered in more detail in Concurrency Programming Guide.
So, it might be thread-safe to use frobnicate on one thread while doing other stuff on another thread, but it also might not. It depends upon all the different things that can be done with this license plate object. Because the protections offered by atomic are so minimalist, we frequently will employ some synchronization (via GCD serial queue or GCD reader-writer pattern, or via any of the synchronization methods outlined in the Threading Programming Guide: Synchronization such as locks) to coordinate interaction from different threads.
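As a rough illustration of the lock-based option, here is a sketch (assuming ARC) of the two-names scenario from the quoted note. The Person class and its methods are hypothetical. Both values are written and read under one NSLock, so a reader always sees a matching pair, which atomic accessors on two separate properties cannot guarantee.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical class (assumes ARC): one lock guards both names.
@interface Person : NSObject {
    NSLock *_lock;
    NSString *_firstName;
    NSString *_lastName;
}
- (void)setFirstName:(NSString *)first lastName:(NSString *)last;
- (NSString *)fullName;
@end

@implementation Person

- (instancetype)init {
    self = [super init];
    if (self) {
        _lock = [[NSLock alloc] init];
        _firstName = @"";
        _lastName = @"";
    }
    return self;
}

// Both names change inside one critical section.
- (void)setFirstName:(NSString *)first lastName:(NSString *)last {
    [_lock lock];
    _firstName = [first copy];
    _lastName = [last copy];
    [_lock unlock];
}

// Both names are read inside one critical section.
- (NSString *)fullName {
    [_lock lock];
    NSString *result = [NSString stringWithFormat:@"%@ %@", _firstName, _lastName];
    [_lock unlock];
    return result;
}

@end
```

The same shape works with a serial GCD queue in place of the lock; the point is that the unit of synchronization is the whole update, not a single pointer.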

When you define properties, you can set them as atomic (the default) or nonatomic.
Since you're using the atomic default, you should be fine about thread safety, but that also depends on how you implemented frobnicate, setLicensePlate: and cleanLicensePlate.
Please refer to this question to get more details about atomic vs nonatomic: What's the difference between the atomic and nonatomic attributes?

In Objective-C, if @property and @synthesize will add getter and setter, why not just make an instance variable public?

In Objective-C, we can add @property and @synthesize to create a property -- like an instance variable with getter and setter which are public to the users of this class.
In this case, isn't it just the same as declaring an instance variable and making it public? Then there won't be the overhead of calling the getter and setter as methods. There might be a chance that we might put in validation for the setter, such as limiting a number to be between 0 and 100, but other than that, won't a public instance variable just achieve the same thing, and faster?

Even if you're only using the accessors generated by @synthesize, they get you several benefits:
Memory management: generated setters retain the new value for a (retain) property. If you try to access an object ivar directly from outside the class, you don't know whether the class might retain it. (This is less of an issue under ARC, but still important.)
Threadsafe access: generated accessors are atomic by default, so you don't have to worry about race conditions accessing the property from multiple threads.
Key-Value Coding & Observation: KVC provides convenient access to your properties in various scenarios. You can use KVC when setting up predicates (say, for filtering a collection of your objects), or use key paths for getting at properties in collections (say, a dictionary containing objects of your class). KVO lets other parts of your program automatically respond to changes in a property's value -- this is used a lot with Cocoa Bindings on the Mac, where you can have a control bound to the value of a property, and also used in Core Data on both platforms.
In addition to all this, properties provide encapsulation. Other objects (clients) using an instance of your class don't have to know whether you're using the generated accessors -- you can create your own accessors that do other useful stuff without client code needing changes. At some point, you may decide your class needs to react to an externally made change to one of its ivars: if you're using accessors already, you only need to change them, rather than make your clients start using them. Or Apple can improve the generated accessors with better performance or new features in a future OS version, and neither the rest of your class' code nor its clients need changes.

Overhead Is Not a Real Issue
To answer your last question, yes there will be overhead—but the overhead of pushing one more frame and popping it off the stack is negligible, especially considering the power of modern processors. If you are that concerned with performance you should profile your application and decide where actual problems are—I guarantee you you'll find better places to optimize than removing a few accessors.
It's Good Design
Encapsulating your private members and protecting them with accessors and mutators is simply a fundamental principle of good software design: it makes your software easier to maintain, debug, and extend. You might ask the same question about any other language: for example why not just make all fields public in your Java classes? (except for a language like Ruby, I suppose, which makes it impossible to expose instance variables). The bottom line is that certain software design practices are in place because as your software grows larger and larger, you will be saving yourself from a veritable hell.
Lazy Loading
Validation in setters is one possibility, but there's more you can do than that. You can override your getters to implement lazy loading. For example, say you have a class that has to load some fields from a file or database. Traditionally this is done at initialization. However, it might be possible that not all fields will actually be used by whoever is instantiating the object, so instead you wait to initialize those members until it's requested via the getter. This cleans up initialization and can be a more efficient use of processing time.
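A small sketch of a lazy getter, assuming ARC and automatic property synthesis. LazyDocument, lines and loadCount are hypothetical names, and the hard-coded array stands in for whatever expensive file or database read the real class would do.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical class (assumes ARC and automatic property synthesis).
@interface LazyDocument : NSObject
@property (nonatomic, strong) NSArray *lines;          // loaded on first access
@property (nonatomic, readonly) NSUInteger loadCount;  // how many times the load ran
@end

@implementation LazyDocument

// Custom getter: the backing ivar _lines is still synthesized,
// and stays nil until someone actually asks for the value.
- (NSArray *)lines {
    if (_lines == nil) {
        // Stand-in for an expensive file or database read.
        _lines = @[@"first line", @"second line"];
        _loadCount += 1;
    }
    return _lines;
}

@end
```

Callers just write document.lines; whether the value was loaded at init time or on demand is invisible to them, which is the encapsulation argument in practice. This simple form is not thread-safe on its own.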
Helps Avoid Retain Cycles in ARC
Finally, properties make it easier to avoid retain loops with blocks under ARC. The problem with ivars is that when you access them, you are implicitly referencing self. So, when you say:
_foo = 7;
what you're really saying is
self->_foo = 7;
So say you have the following:
[self doSomethingWithABlock:^{
_foo = 7;
}];
You've now got yourself a retain cycle. What you need is a weak pointer.
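__weak typeof(self) weakSelf = self;
[self doSomethingWithABlock:^{
    // The compiler rejects "->" on a weak pointer, so take a strong reference first.
    __strong typeof(self) strongSelf = weakSelf;
    if (strongSelf) {
        strongSelf->_foo = 7;
    }
}];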
Now, obviously this is still a problem with setters and getters; however, you are less likely to forget to use weakSelf since you have to explicitly call self.property, whereas ivars are referenced by self implicitly. The static analyzer will help you pick this problem up if you're using properties.

@property is a published fact. It tells other classes that they can get, and maybe set, a property of the class. Properties are not variables, they are literally what the word says. For example, count is a property of an NSArray. Is it necessarily an instance variable? No. And there's no reason why you should care whether it is.
@synthesize creates a default getter, setter and instance variable unless you've defined any of those things yourself. It's an implementation detail. It's how your class chooses to satisfy its contractual obligation to provide the property. It's just one way of providing a property, and you can change your implementation at any time without telling anyone else about it.
So why not expose instance variables instead of providing getters and setters? Because that binds your hands on the implementation of the class. It makes other actors rely on the specific way it has been coded rather than merely the interface you've chosen to publish for it. That quickly creates fragile and inter-dependent code that will break. It's anathema to object-oriented programming.

Because one would normally be interested in encapsulation and hiding data and implementations. It is easier to maintain: you have to change one implementation, rather than all of them. Implementation details are hidden from the client. Also, the client shouldn't have to think about whether the class is a derived class.

You are correct... for a few very limited cases. Properties are horrible in terms of CPU cycle performance when they are used in the inner loops of pixel, image and real-time audio DSP (etc.) code. For less frequent uses, they bring a lot of benefits in terms of readable maintainable reusable code.

@property and @synthesize give you the getter and setter methods.
Another use is that you can then access the variable from other classes as well.
If you want to use a plain instance variable with your own custom getter and setter methods, you can, but if those accessors don't retain the value, the object you stored may be deallocated before you read it back, leaving a zombie that can crash your app.
A retaining property keeps the stored object alive until it is replaced or your own object is deallocated.
Hope it helps.

How to receive notifications from an NSMutableArray subclass

I have subclassed NSMutableArray to allow for a datasource. This is called BaseObjectArray. The array actually only holds a list of rowids (as uint64_t), and when asking for objectAtIndex it asks the datasource delegate for the object with that rowid (to allow for lazy DB queries).
The internal list of rowids is a class in its own right (a RowIDSet, or the OrderedRowIDSet subclass, which is just a subclass of NSObject), that maintains just the list of unique rowids.
What I need is to somehow listen for changes to the BaseObjectArray (which is actually listening to changes on its RowIDSet object, perhaps through a similar method).
As objects may be added/removed from the BaseObjectArray not using the standard addObject:, but instead with addRowID:, the object that owns the BaseObjectArray will probably not get standard KVO notifications.
Possible solutions I have considered:
The BaseObjectArray has owner and ownerKey properties, and the BaseObjectArray triggers [owner willChangeValueForKey:ownerKey]; whenever anything changes.
Use will/didChangeNotificationBlocks - listeners can simply add a block to the BaseObjectArray (retaining these blocks in an NSMutableArray), and all the blocks in this array are triggered when something in the BaseObjectArray changes. I am uncertain about the possible retain-cycle nightmare that may ensue.
KVO on a 'contents' property of the BaseObjectArray. Anyone wanting to observe the BaseObjectArray actually observes the keyPath 'contents', and inside the BOArray it calls [self willChangeValueForKey:@"contents"]. The contents property just returns self.
... something obvious that i have missed ...
Please let me know if any of these make the most (or any) sense, or if there is a better solution out there.
Thanks :)

Unless you know what you are doing, you should not subclass NSMutableArray. NSMutableArray is a class cluster and requires special treatment.
Why not just create a custom object that uses a plain NSMutableArray as its storage class? There seems to be no good reason to subclass NSMutableArray in your case, but maybe I'm misunderstanding your question.

I don't know if this will work, but if it does, it's probably the best way.
Make sure your NSMutableArray subclass is KVC compliant for the key self (if this doesn't work for self, add a new property, e.g. rows, which returns self or a copy of self). To make self (or whatever new property you use) KVC compliant you need to follow the Indexed To-Many Relationship Compliance rules for mutable ordered collections:
Implement a method named -<key> that returns an array.
Or have an array instance variable named <key> or _<key>.
Or implement the method -countOf<Key> and one or both of -objectIn<Key>AtIndex: or -<key>AtIndexes:.
Optionally, you can also implement -get<Key>:range: to improve performance.
self ticks the box on the first of these. Also:
Implement one or both of the methods -insertObject:in<Key>AtIndex: or -insert<Key>:atIndexes:.
Implement one or both of the methods -removeObjectFrom<Key>AtIndex: or -remove<Key>AtIndexes:.
Optionally, you can also implement -replaceObjectIn<Key>AtIndex:withObject: or -replace<Key>AtIndexes:with<Key>: to improve performance.
So you'll need e.g. -insertObject:inSelfAtIndex: and -removeObjectFromSelfAtIndex:
Then you can use manual KVO notifications wherever you want to notify observers of the self property on that object. So you might use
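NSIndexSet *indexes = // index set containing the index or indexes of objects to remove
[self willChange:NSKeyValueChangeRemoval valuesAtIndexes:indexes forKey:@"self"];
// ... remove the objects here ...
[self didChange:NSKeyValueChangeRemoval valuesAtIndexes:indexes forKey:@"self"];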
when removing objects.
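A self-contained sketch of the manual-notification idea, assuming ARC and using the alternative "rows" key mentioned above rather than self. RowStore, RowObserver and discardRowsAtIndexes: are hypothetical names; the mutator is deliberately named so that it does not match a KVC pattern such as -remove<Key>AtIndexes:, and automatic notification is switched off for the key so that observers are not told twice.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical container (assumes ARC) exposing its contents under the key "rows".
@interface RowStore : NSObject {
    NSMutableArray *_storage;
}
- (instancetype)initWithRows:(NSArray *)rows;
- (NSArray *)rows;
- (void)discardRowsAtIndexes:(NSIndexSet *)indexes;
@end

@implementation RowStore

- (instancetype)initWithRows:(NSArray *)rows {
    self = [super init];
    if (self) {
        _storage = [rows mutableCopy];
    }
    return self;
}

// "rows" is announced by hand below, so switch automatic notification off for it.
+ (BOOL)automaticallyNotifiesObserversForKey:(NSString *)key {
    if ([key isEqualToString:@"rows"]) {
        return NO;
    }
    return [super automaticallyNotifiesObserversForKey:key];
}

- (NSArray *)rows {
    return [_storage copy];
}

// Custom mutator that brackets the change with will/did notifications.
- (void)discardRowsAtIndexes:(NSIndexSet *)indexes {
    [self willChange:NSKeyValueChangeRemoval valuesAtIndexes:indexes forKey:@"rows"];
    [_storage removeObjectsAtIndexes:indexes];
    [self didChange:NSKeyValueChangeRemoval valuesAtIndexes:indexes forKey:@"rows"];
}

@end

// Hypothetical observer that records which indexes were reported as removed.
@interface RowObserver : NSObject
@property (nonatomic, strong) NSIndexSet *lastRemovedIndexes;
@end

@implementation RowObserver

- (void)observeValueForKeyPath:(NSString *)keyPath
                      ofObject:(id)object
                        change:(NSDictionary *)change
                       context:(void *)context {
    NSUInteger kind = [[change objectForKey:NSKeyValueChangeKindKey] unsignedIntegerValue];
    if (kind == NSKeyValueChangeRemoval) {
        self.lastRemovedIndexes = [change objectForKey:NSKeyValueChangeIndexesKey];
    }
}

@end
```

An owner would register with addObserver:forKeyPath:options:context: on the key path "rows" and remove itself before either object goes away.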

Should I use properties or direct reference when accessing instance variables internally?

Say I have a class like this:
@interface MyAwesomeClass : NSObject
{
@private
NSString *thing1;
NSString *thing2;
}
@property (retain) NSString *thing1;
@property (retain) NSString *thing2;
@end
@implementation MyAwesomeClass
@synthesize thing1, thing2;
@end
When accessing thing1 and thing2 internally (i.e, within the implementation of MyAwesomeClass), is it better to use the property, or just reference the instance variable directly (assuming cases in which we do not do any work in a "custom" access or mutator, i.e., we just set and get the variable). Pre-Objective C 2.0, we usually just access the ivars directly, but what's the usual coding style/best practice now? And does this recommendation change if an instance variable/property is private and not accessible outside of the class at all? Should you create a property for every ivar, even if they're private, or only for public-facing data? What if my app doesn't use key-value observing features (since KVO only fires for property access)?
I'm interested in looking beyond the low-level technical details. For example, given (sub-optimal) code like:
@interface MyAwesomeClass : NSObject
{
id myObj;
}
@property id myObj;
@end
@implementation MyAwesomeClass
@synthesize myObj;
@end
I know that myObj = anotherObject is functionally the same as self.myObj = anotherObj.
But properties aren't merely fancy syntax for instructing the compiler to write accessors and mutators for you, of course; they're also a way to better encapsulate data, i.e., you can change the internal implementation of the class without rewriting classes that rely on those properties. I'm interested in answers that address the importance of this encapsulation issue when dealing with the class's own internal code. Furthermore, properly-written properties can fire KVO notifications, but direct ivar access won't; does this matter if my app isn't utilizing KVO features now, just in case it might in the future?

If you spend time on the cocoa-dev mailing list, you'll find that this is a very contentious topic.
Some people think ivars should only ever be used internally and that properties should never (or rarely) be used except externally. There are various concerns with KVO notifications and accessor side effects.
Some people think that you should always (or mostly) use properties instead of ivars. The main advantage here is that your memory management is well contained inside of accessor methods instead of strewn across your implementation logic. The KVO notifications and accessor side effects can be overcome by creating separate properties that point to the same ivar.
Looking at Apple's sample code will reveal that they are all over the place on this topic. Some samples use properties internally, some use ivars.
I would say, in general, that this is a matter of taste and that there is no right way to do it. I myself use a mix of both styles.

I don't think any way is 'better'. You see both styles in common use, so there isn't even a usual/best practice now. In my experience, the style used has very little impact on how well I digest some implementation file I am looking at. You certainly want to be comfortable with both styles (and any in between) when looking at other people's code.
Using a property for every internal ivar might be going slightly overboard, in terms of maintenance. I've done it, and it added a non-trivial amount of work that I don't think paid off for me. But if you have a strong desire/OCD for seeing consistent code like self.var everywhere, and you have it in the back of your mind every time you look at a class, then use it. Don't discount the effect that a nagging feeling can have on productivity.
Exceptions- Obviously, for custom getters (e.g. lazy creation), you don't have much of a choice. Also, I do create and use a property for internal setters when it makes it more convenient (e.g. setting objects with ownership semantics).
"Just in case" or "might" is not a compelling reason to do something without more data, since the time required to implement it is non-zero. A better question might be, what is the probability that all the private ivars in some class will require KVO notifications in the future, but not now? For most of my own classes, the answer is exceedingly low, so I now avoid a hard rule about creating properties for every private ivar.
I've found that when dealing with internal implementations, I quickly get a good handle on how each ivar should be accessed regardless.
If you are interested, my own approach is this:
Reading ivars: Direct access, unless there is a custom getter (e.g. lazy creation)
Writing ivars: Directly in init/dealloc. Elsewhere, through a private property if one exists.

The only difference in an assignment of thing1 = something; and self.thing1 = something; is that if you want to have the property assignment operation (retain, copy, etc), done on the assigned object, then you need to use a property. Assigning without properties will effectively be just that, assigning a reference to the provided object.
I think that defining a property for internal data is unnecessary. Only define properties for ivars that will be accessed often and need specific mutator behavior.

If thing1 is used with KVO it is a good idea to use self.thing1= when you set it. If thing1 is @public, then it is best to assume that someone someday will want to use it with KVO.
If thing1 has complex set semantics that you don't want to repeat everywhere you set it (for example retain, or atomic), then setting it through self.thing1= is a good idea.
If benchmarking shows that calling setThing1: is taking significant time then you might want to think about ways to set it without use of self.thing1= -- maybe note that it can not be KVO'ed, or see if manually implementing KVO is better (for example if you set it 3000 times in a loop somewhere, you might be able to set it via self->thing1 3000 times, and make 2 KVO calls about the value being about to change and having changed).
That leaves the case of a trivial setter on a private variable where you know you aren't using KVO. At that point it stops being a technical issue, and falls under code style. At least as long as the accessor doesn't show up as a bottleneck in the profiler. I tend to use direct ivar access at that point (unless I think I will KVO that value in the future, or might want to make it public and thus think others may want to KVO it).
However when I set things with direct ivar access I try to only do it via self->thing1=, that makes it a lot simpler to find them all and change them if I ever find the need to use KVO, or to make it public, or to make a more complex accessor.
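The batching idea mentioned above (write the ivar directly in a loop and bracket the whole loop with one pair of KVO calls) can be sketched as follows, assuming ARC. Tally, ChangeCounter and addValues:count: are hypothetical names. Because there is no setTotal: method, nothing is announced automatically; observers hear about the change exactly once.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical class (assumes ARC) with a read-only "total" key.
@interface Tally : NSObject {
    NSInteger _total;
}
- (NSInteger)total;
- (void)addValues:(const NSInteger *)values count:(NSUInteger)count;
@end

@implementation Tally

- (NSInteger)total {
    return _total;
}

- (void)addValues:(const NSInteger *)values count:(NSUInteger)count {
    // One will/did pair around the whole loop instead of one per write.
    [self willChangeValueForKey:@"total"];
    for (NSUInteger i = 0; i < count; i++) {
        self->_total += values[i];  // direct ivar access, no accessor call
    }
    [self didChangeValueForKey:@"total"];
}

@end

// Hypothetical observer that simply counts the notifications it receives.
@interface ChangeCounter : NSObject
@property (nonatomic) NSUInteger notifications;
@end

@implementation ChangeCounter

- (void)observeValueForKeyPath:(NSString *)keyPath
                      ofObject:(id)object
                        change:(NSDictionary *)change
                       context:(void *)context {
    self.notifications += 1;
}

@end
```

Whether this is worth doing is a profiling question, as the answer says; the sketch only shows that the observer contract can be kept while skipping the per-write accessor call.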

Other things mentioned here are all right on. A few things that the other answers missed are:
First, always keep in mind the implications of accessors/mutators being virtual (as all Objective-C methods are.) In general, it's been said that one should avoid calling virtual methods in init and dealloc, because you don't know what a subclass will do that could mess you up. For this reason, I generally try to access the iVars directly in init and dealloc, and access them through the accessor/mutators everywhere else. On the other hand, if you don't consistently use the accessors in all other places, subclasses that override them may be impacted.
Relatedly, atomicity guarantees of properties (i.e. your @properties are declared atomic) can't be maintained for anyone if you're accessing the iVar directly anywhere outside of init & dealloc. If you needed something to be atomic, don't throw away the atomicity by accessing the iVar directly. Similarly, if you don't need those guarantees, declare your property nonatomic (for performance.)
This also relates to the KVO issue. In init, no one can possibly be observing you yet (legitimately), and in dealloc, any remaining observer has a stale unretained (i.e. bogus) reference. The same reasoning also applies to the atomicity guarantees of properties (i.e. how would concurrent accesses happen before init returns? And accesses that happen during dealloc are inherently errors).
If you mix and match direct and accessor/mutator use, you risk running afoul of not only KVO and atomicity, but of subclassers as well.

In ObjC, how to describe balance between alloc/copy/retain and auto-/release, in terms of location

As is common knowledge, calls to alloc/copy/retain in Objective-C imply ownership and need to be balanced by a call to autorelease/release. How do you succinctly describe where this should happen? The word "succinct" is key. I can usually use intuition to guide me, but would like an explicit principle in case intuition fails and that can be used in discussions.
Properties simplify the matter (the rule is auto-/release happens in -dealloc and setters), but sometimes properties aren't a viable option (e.g. not everyone uses ObjC 2.0).
Sometimes the release should be in the same block. Other times the alloc/copy/retain happens in one method, which has a corresponding method where the release should occur (e.g. -init and -dealloc). It's this pairing of methods (where a method may be paired with itself) that seems to be key, but how can that be put into words? Also, what cases does the method-pairing notion miss? It doesn't seem to cover where you release properties, as setters are self-paired and -dealloc releases objects that aren't alloc/copy/retained in -init.
It feels like the object model is involved with my difficulty. There doesn't seem to be an element of the model that I can attach retain/release pairing to. Methods transform objects from valid state to valid state and send messages to other objects. The only natural pairings I see are object creation/destruction and method enter/exit.
Background:
This question was inspired by: "NSMutableDictionary does not get added into NSMutableArray". The asker of that question was generally balancing alloc/copy/retain calls with releases, but in a way that could cause memory leaks. The class was a delegate; some members were created in a delegate method (-parser:didStartElement:...) and released in -dealloc rather than in the corresponding (-parser:didEndElement:...) method. In this instance, properties seemed a good solution, but the question still remained of how to handle releasing when properties weren't involved.

Properties simplify the matter (the rule is auto-/release happens in -dealloc and setters), but sometimes properties aren't a viable option (e.g. not everyone uses ObjC 2.0).
This is a misunderstanding of the history of properties. While properties are new, accessors have always been a key part of ObjC. Properties just made it easier to write accessors. If you always use accessors, and you should, then most of these questions go away.
Before we had properties, we used Xcode's built-in accessor-writer (in the Script>Code menu), or useful tools like Accessorizer to simplify the job (Accessorizer still simplifies property code). Or we just typed a lot of getters and setters by hand.

The question isn't where it should happen, it's when.
Release or autorelease an object if you have created it with +alloc, +new or -copy, or if you have sent it a -retain message.
Send -release when you don't care if the object continues to exist. Send -autorelease if you want to return it from the method you're in, but you don't care what happens to it after that.

I wouldn't say that dealloc is where you would call autorelease. And unless your object, whatever it may be, is tied to the lifetime of the class instance, it doesn't necessarily need to be kept around until a release in dealloc.
Here are my rules of thumb. You may do things in other ways.
I use release if the life of the object I am using is limited to the routine I am in now. Thus the object gets created and released in that routine. This is also the preferred way if I am creating a lot of objects in a routine, such as in a loop, and I might want to release each object before the next one is created in the loop.
If the object I created in a method needs to be passed back to the caller, but I assume that the use of the object will be transient and limited to this run of the runloop, I use autorelease. Here, I am trying to mimic many of Apple's convenience routines. (Want a quick string to use for a short period? Here you go, don't worry about owning it and it will get disposed appropriately.)
If I believe the object is to be kept on a semi-permanent basis (like longer than this run of the runloop), I use create/new/copy in my method name so the caller knows that they are the owner of the object and will have to release the object.
Any objects that are created by a class and kept as a property with retain (whether through the property declaration or not), I release those in dealloc (or in viewDidUnload as appropriate).
Try not to let all this memory management overwhelm you. It is a lot easier than it sounds, and looking at a bunch of Apple's samples, and writing your own (and suffering bugs) will make you understand it better.
