Should I use == or [NSManagedObject isEqual:] to compare managed objects in the same context? - objective-c

Let's say variable A and B hold instances of managed objects in the same managed object context. I need to make sure that they are associated with the same "record" in the persistent store. The section on Faulting and Uniquing in the Core Data Programming Guide says that:
Core Data ensures that—in a given managed object context—an entry in a persistent store is associated with only one managed object.
From this, it seems that a pointer comparison is sufficient for my purpose. Or does it ever make sense to use isEqual: to compare managed objects in the same context?

Use == to determine if two pointers point to the same object. Use -isEqual: to determine if two objects are "equal", where the notion of equality depends on the objects being compared. -isEqual: has to agree with -hash: two objects that compare equal must return the same hash value. I wrote previously that it seemed possible that -isEqual: might return true if two managed objects contain the same values. That's clearly not right. The docs carry some caveats: the hash value of a mutable object must not change while it's in a collection, and it can be difficult to know whether a given object is in a collection. It seems certain that the hash for a managed object doesn't depend on the data that the object contains, and much more likely that it's connected to something immutable about the object; the object's -objectID value seems a likely candidate.
Given all that, I'm changing my opinion ;-). Each record is only represented once in a given context, so == is probably safe, but -isEqual: seems to better express your intention.

Pointer comparison is fine for objects retrieved from a single managed object context; the documentation on uniquing you quote promises as much.
The objectID should be used for testing object equality across managed object contexts.
isEqual: does not do attribute tests, because it is documented not to fault the object. In fact, looking at the disassembled function, it is definitely just a pointer compare.
So the semantics of the equality test for managed objects are simply "points to the same object (record) in the managed object context" and will compare false for objects in different contexts.
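A minimal sketch of the distinction drawn in this answer, assuming both objects come from contexts that share one persistent store coordinator. The helper name RefersToSameRecord is hypothetical and not part of Core Data.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper (not a Core Data API): returns YES when two managed
// objects refer to the same record, even if they live in different contexts.
static BOOL RefersToSameRecord(NSManagedObject *a, NSManagedObject *b) {
    if (a == b) {
        return YES; // same instance; within one context, uniquing makes this the only case
    }
    // Across contexts the instances differ, so compare the object IDs instead.
    return [a.objectID isEqual:b.objectID];
}
```

Within a single context the pointer check alone should be enough; the objectID comparison only matters once a second context is involved.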

Warning: Since -[NSManagedObject isEqual:] compares objectIDs, a comparison can fail if one instance is using the temporary objectID and the other instance is using the permanent objectID.
Background: When an NSManagedObject is created, it is assigned a temporary objectID. It is converted into a permanent objectID when the NSManagedObject is actually persisted into the store. You can see the difference if you print the objectID:
x-coredata:///MyEntity/t03BF9735-A005-4ED9-96BA-462BD65FA25F118 (temporary ID)
x-coredata://EB8922D9-DC06-4256-A21B-DFFD47D7E6DA/MyEntity/p3 (permanent ID)
When an objectID is converted to permanent, instances of the NSManagedObject in other threads and collections are not updated. So if you put an NSManagedObject into an NSArray while it has a temporary objectID, methods like containsObject: will fail if you try to find the object with the permanent objectID. Remember that containsObject: uses isEqual:.
Finally, a couple of useful methods are -[NSManagedObjectID isTemporaryID] and -[NSManagedObjectContext obtainPermanentIDsForObjects:error:].
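As a rough sketch of how those two methods might be combined: the helper below asks for a permanent ID before the object goes into a collection, so the ID does not change later when the context is saved. The name AddWithPermanentID is made up for illustration, and the object is assumed to be already inserted into a context that has a persistent store.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: gives `object` a permanent ID (if it still has a
// temporary one) before adding it to `set`. Returns NO if Core Data could
// not obtain a permanent ID; in that case the set is left untouched.
static BOOL AddWithPermanentID(NSManagedObject *object,
                               NSMutableSet *set,
                               NSError **error) {
    if ([object.objectID isTemporaryID]) {
        NSManagedObjectContext *context = object.managedObjectContext;
        if (![context obtainPermanentIDsForObjects:@[object] error:error]) {
            return NO;
        }
    }
    [set addObject:object];
    return YES;
}
```

Obtaining permanent IDs involves the persistent store, so it is probably worth batching several objects into one call when many of them are created at once.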

Related

Does NSDictionary's objectForKey: rely on identity or equality?

Say I have an object called Person which has the property socialSecurityNumber, and this class overrides the isEqual: method to return true when the social security number properties are equal. And say I've put a bunch of instances of Person into an NSDictionary.
If I now instantiate a newPerson object which happens to have the same social security number as one already in the dictionary, and I do [myDictionary objectForKey:newPerson], will it use the isEqual: and return YES, or will it compare pointers and return NO?
I know I can write a simple test to find out, but I want to understand how exactly objectForKey: finds a match in a dictionary, and generally how consistent this is across Cocoa (e.g. does NSArray's indexOfObject: work the same?)
NSDictionary works like a hashtable. So it uses both -hash and -isEqual: to find the object in the dictionary corresponding to the given key.
So to answer your question for NSDictionary, this uses isEqual: and not pointer comparison. But you also should implement hash in addition to isEqual: on your Person class for this to work.
From the NSDictionary Class Reference documentation:
A key-value pair within a dictionary is called an entry. Each entry consists of one object that represents the key and a second object that is that key’s value. Within a dictionary, the keys are unique. That is, no two keys in a single dictionary are equal (as determined by isEqual:).
From the isEqual: method documentation:
If two objects are equal, they must have the same hash value. This last point is particularly important if you define isEqual: in a subclass and intend to put instances of that subclass into a collection. Make sure you also define hash in your subclass.
This behavior is consistent across the various container classes in Cocoa. For example, from the NSArray's indexOfObject: method documentation:
Starting at index 0, each element of the array is sent an isEqual: message until a match is found or the end of the array is reached. This method passes the anObject parameter to each isEqual: message. Objects are considered equal if isEqual: (declared in the NSObject protocol) returns YES.
You should always read the documentation: as the extracts quoted above show, these kinds of details are often explained in the "Discussion" or "Special Considerations" sections of the method documentation, or in the "Overview" section of the class documentation itself.
how consistent this is across Cocoa (e.g. does NSArray's indexOfObject: work the same?)
It is consistent and at the same time it isn't. What I mean is that there are two methods that could be used: isEqual: and hash. You should not be too concerned about which is used when. What you should focus on instead is respecting the NSObject protocol requirements: make sure that if two objects are equal according to isEqual:, they also have the same hash.
From the isEqual: documentation in the NSObject Protocol Reference:
If two objects are equal, they must have the same hash value. This last point is particularly important if you define isEqual: in a subclass and intend to put instances of that subclass into a collection. Make sure you also define hash in your subclass.
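To make the requirement concrete, here is a minimal sketch of the Person class from the question, assuming ARC. The initializer name and the decision to return self from copyWithZone: are illustrative choices, not something the question specifies. NSDictionary copies its keys, so a class used as a key also has to adopt NSCopying.

```objc
#import <Foundation/Foundation.h>

// Hypothetical Person class; equality is based only on the social security number.
@interface Person : NSObject <NSCopying>
@property (nonatomic, copy, readonly) NSString *socialSecurityNumber;
- (instancetype)initWithSocialSecurityNumber:(NSString *)ssn;
@end

@implementation Person

- (instancetype)initWithSocialSecurityNumber:(NSString *)ssn {
    self = [super init];
    if (self) {
        _socialSecurityNumber = [ssn copy];
    }
    return self;
}

- (BOOL)isEqual:(id)other {
    if (other == self) return YES;
    if (![other isKindOfClass:[Person class]]) return NO;
    return [self.socialSecurityNumber
        isEqualToString:[(Person *)other socialSecurityNumber]];
}

// Equal objects must return equal hashes, so hash the same field that
// -isEqual: compares.
- (NSUInteger)hash {
    return [self.socialSecurityNumber hash];
}

// The object is immutable, so under ARC returning self is a valid "copy".
- (id)copyWithZone:(NSZone *)zone {
    return self;
}

@end
```

With this in place, a dictionary built with one Person as a key returns the stored value when queried with a second Person carrying the same number, and -[NSArray indexOfObject:] finds the stored instance the same way.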

How does one effectively handle temporary objects in Core Data since the objectID changes between temporary objects and permanent objects?

What is the best way to handle temporary objects in Core Data? I've seen solutions where temporary contexts are created, where they are inserted into nil contexts, etc.
However, here's the issue I'm seeing in both of these solutions. I'm using Core Data for my object model and in some of my views store an NSSet of Core Data objects. The problem I have is when the object is stored, the objectID changes, which effectively invalidates anything stored in any NSSet since the isEqual: and hash results are now different. While I could invalidate the object stored in the NSSet, it often is not practical and certainly not always easy.
Here are the things I've considered:
1) override the isEqual: and hash methods on NSManagedObject (obviously bad)
2) do not place any NSManagedObject in an NSSet (use an NSDictionary where the key is always fixed)
3) use an entirely different type to store in the NSSet, where I could correctly implement the isEqual: and hash methods
Does anyone have a better solution for this?
ManagedObjects in an NSSet -- that sounds like a Core Data relationship. Why not simply store your temporary managedObjects in a relationship, and have Core Data take care of the problems you're now running into. Then you can concentrate on when and how to delete the temporary objects, or break the relationship or whatever is needed.
However, here's the issue I'm seeing in both of these solutions. I'm using Core Data for my object model and in some of my views store an NSSet of Core Data objects. The problem I have is when the object is stored, the objectID changes, which effectively invalidates anything stored in any NSSet since the isEqual: and hash results are now different.
tjg184,
Your problem here is not the transition to permanent IDs but that your container class depends upon an immutable hash. Hence, change your container class to an array or dictionary and this problem goes away. (You give up uniquing with an array but that is easy to handle with a trip through a transient set to perform the uniquing.)
Andrew
A possible solution would be to convert the temporary IDs to permanent ones using [NSManagedObjectContext obtainPermanentIDsForObjects:error:].
But be aware that this may be expensive, especially if you have a lot of objects you need to process this way.
You could possibly subclass NSManagedObject and override the willSave and didSave methods to remove and then re-add your objects to your set.
I actually ended up using a different approach, that of using a nil context and providing a base class to handle insertion into a context. It works really well and is the cleanest solution I have found. Code can be found here... Temporary Core Data

Is there any reason not to return a mutable object where one is not expected?

I have a number of functions similar to the following:
+ (NSArray *)arrayOfSomething
{
    NSMutableArray *array = [NSMutableArray array];
    // Add objects to the array
    return [[array copy] autorelease];
}
My question is about the last line of this method: is it better to return the mutable object and avoid a copy operation, or to return an immutable copy? Are there any good reasons to avoid returning a mutable object where one is not expected?
(I know that it is legal to return a NSMutableArray since it is a subclass of NSArray. My question is whether or not this is a good idea.)
This is a complex topic. I think it's best to refer you to Apple's guidelines on object mutability.
Apple has this to say on the subject of using introspection to determine a returned object's mutability:
To determine whether it can change a received object, the receiver must rely on the formal type of the return value. If it receives, for instance, an array object typed as immutable, it should not attempt to mutate it. It is not an acceptable programming practice to determine if an object is mutable based on its class membership
(my emphasis)
The article goes on to give several very good reasons why you should not use introspection on a returned object to determine if you can mutate it e.g.
You read a property list from a file. When the Foundation framework processes the list it notices that various subsets of the property list are identical, so it creates a set of objects that it shares among all those subsets. Afterwards you look at the created property list objects and decide to mutate one subset. Suddenly, and without being aware of it, you’ve changed the tree in multiple places.
and
You ask NSView for its subviews (subviews method) and it returns an object that is declared to be an NSArray but which could be an NSMutableArray internally. Then you pass that array to some other code that, through introspection, determines it to be mutable and changes it. By changing this array, the code is mutating NSView’s internal data structures.
Given the above, it is perfectly acceptable for you to return the mutable array in your example (provided of course, you never mutate it yourself after having returned it, because then you would be breaking the contract).
Having said that, almost nobody has read that section of the Cocoa Objects Guide, so defensive programming would call for you to make an immutable copy and return that unless performance profiling shows that it is a problem to do that.
Short Answer: Don't do it
Long Answer: It depends. If the array is getting changed while being used by someone who expects it to be static, you can cause some baffling errors that would be a pain to track down. It would be better to just do the copy/autorelease like you've done and only come back and revisit the return type of that method if it turns out that there is a significant performance hit.
In response to the comments, I think it's unlikely that returning a mutable array would cause any trouble, but, if it does cause trouble, it could be difficult to track down exactly what the issue is. If making a copy of the mutable array turns out to be a big performance hit, it will be very easy to determine what's causing the problem. You have a choice between two very unlikely issues, one that's easy to solve, one that's very difficult.
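The hazard discussed in these answers only shows up when the array is mutated after it has been handed out, which cannot happen with the local array in the question but can with an instance variable. A small sketch of that contrast, assuming ARC; the Roster class and its method names are made up for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical class, used only to contrast the two return strategies.
@interface Roster : NSObject
- (void)addName:(NSString *)name;
- (NSArray *)namesSharingStorage; // hands out the internal mutable array
- (NSArray *)namesSnapshot;       // hands out an immutable copy
@end

@implementation Roster {
    NSMutableArray *_names;
}

- (instancetype)init {
    self = [super init];
    if (self) {
        _names = [NSMutableArray array];
    }
    return self;
}

- (void)addName:(NSString *)name {
    [_names addObject:name];
}

- (NSArray *)namesSharingStorage {
    return _names; // no copy: later -addName: calls are visible to the caller
}

- (NSArray *)namesSnapshot {
    return [_names copy]; // under ARC no explicit autorelease is needed
}

@end
```

A caller holding the result of namesSharingStorage sees the array grow when addName: is called afterwards, while the result of namesSnapshot stays as it was at the time of the call.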

Using non-copyable object as key for NSMutableDictionary?

I tried to figure out this code referencing: Cocoa: Dictionary with enum keys?
+ (NSValue *)valueWithReference:(id)target
{
    return [NSValue valueWithBytes:&target objCType:@encode(id*)];
}
And,
[table setObject:anObject forKey:[NSValue valueWithReference:keyObject]];
But something about it doesn't feel right. Any recommendations?
You're absolutely right it's not good.
For one, you're encoding the wrong type (it should be @encode(id), not @encode(id*)), but in most cases this shouldn't cause a big problem.
The bigger problem is that this completely ignores memory management. The object won't be retained or copied. If some other code releases it, it could just disappear, and then your dictionary key will be a boxed pointer to garbage or even a completely different object. This is basically the world's most advanced dangling pointer.
You have two good options:
1) You could either add NSCopying to the class or create a copyable subclass.
   - This option will only work for objects that can meaningfully be copied. This is most classes, but not necessarily all (e.g. it might be bad to have multiple objects representing the same input stream).
   - Implementing copying can be a pain even for classes where it makes sense: not difficult, per se, but kind of annoying.
2) You could instead create the dictionary with the CFDictionary API. Since Core Foundation types don't have a generic copy function, CFDictionary just retains its keys by default (though you can customize its behavior however you like). But CFDictionary is also toll-free bridged with NSDictionary, which means that you can just cast a CFDictionaryRef to an NSDictionary* (or NSMutableDictionary*) and then treat it like any other NSDictionary.
   - This means that the object you're using as a key must not change (at least not in a way that affects its hash value) while it's in the dictionary; ensuring this doesn't happen is why NSDictionary normally wants to copy its keys.
For later reference:
Now I know that there are some more options.
1) Implement the NSCopying protocol and return self instead of making a copy (retain it if you are not using ARC). Also make sure the object always returns the same value from -hash.
2) Make a simple copyable container class that holds a strong reference to the original key object. The container is copyable, but it just passes along the original key when it is copied. Override the equality and hash methods as well to match these semantics. Even an NSArray instance that contains only the key object works well.
Method #1 looks pretty safe, but I'm not actually sure that it is, because I don't know the internal behavior of NSDictionary. So I usually use method #2, which is completely safe under Cocoa conventions.
Update
NSHashTable and NSMapTable are now also available on iOS, since version 6.0.
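A minimal sketch of the NSMapTable route, assuming ARC. The function name MakeIdentityKeyedTable is hypothetical. The key options ask the table to hold keys strongly and compare them by pointer, so the key class needs neither NSCopying nor custom -hash / -isEqual: implementations.

```objc
#import <Foundation/Foundation.h>

// Hypothetical factory: returns a map table that retains its keys instead of
// copying them, and compares keys by pointer identity rather than -isEqual:.
static NSMapTable *MakeIdentityKeyedTable(void) {
    NSPointerFunctionsOptions keyOptions =
        NSPointerFunctionsStrongMemory | NSPointerFunctionsObjectPointerPersonality;
    return [NSMapTable mapTableWithKeyOptions:keyOptions
                                 valueOptions:NSPointerFunctionsStrongMemory];
}
```

Values are then stored and fetched with setObject:forKey: and objectForKey:, much like with a dictionary. If equality-based lookup without copying is wanted instead, +[NSMapTable strongToStrongObjectsMapTable] is the variant to look at.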
I'm not 100% sure about the correctness of this solution, but I'm posting it just in case.
If you do not want to use a CFDictionary, maybe you could use this simple category:
@implementation NSMutableDictionary (NonCopyableKeys)
- (void)setObject:(id)anObject forNonCopyableKey:(id)aKey {
    [self setObject:anObject forKey:[NSValue valueWithPointer:aKey]];
}
- (id)objectForNonCopyableKey:(id)aKey {
    return [self objectForKey:[NSValue valueWithPointer:aKey]];
}
- (void)removeObjectForNonCopyableKey:(id)aKey {
    [self removeObjectForKey:[NSValue valueWithPointer:aKey]];
}
@end
This is a generalization of a similar method I saw online (can't find the original source) for using an NSMutableDictionary that can store objects with UITouch keys.
The same restriction as in Chuck's answer applies: the object you're using as a key must not change in a way that affects its hash value and must not be freed while it's in the dictionary.
Also make sure you don't mix -(void)setObject:(id)anObject forNonCopyableKey:(id)aKey and - (id)objectForKey:(id)aKey methods, as it won't work (the latter will return nil).
This seems to work fine, but there might be some unwanted side effects that I am not thinking of. If anybody finds out that this solution has any additional problems or caveats, please comment.

Implementing -hash / -isEqual: / -isEqualTo...: for Objective-C collections

Note: The following SO questions are related, but neither they nor the linked resources seem to fully answer my questions, particularly in relation to implementing equality tests for collections of objects.
Best practices for overriding -isEqual: and -hash
Techniques for implementing -hash on mutable Cocoa objects
Background
NSObject provides default implementations of -hash (which returns the address of the instance, like (NSUInteger)self) and -isEqual: (which returns NO unless the addresses of the receiver and the parameter are identical). These methods are designed to be overridden as necessary, but the documentation makes it clear that you should provide both or neither. Further, if -isEqual: returns YES for two objects, then the result of -hash for those objects must be the same. If not, problems can ensue when objects that should be the same — such as two string instances for which -compare: returns NSOrderedSame — are added to a Cocoa collection or compared directly.
Context
I develop CHDataStructures.framework, an open-source library of Objective-C data structures. I have implemented a number of collections, and am currently refining and enhancing their functionality. One of the features I want to add is the ability to compare collections for equality with one another.
Rather than comparing only memory addresses, these comparisons should consider the objects present in the two collections (including ordering, if applicable). This approach has quite a precedent in Cocoa, and generally uses a separate method, including the following:
-[NSArray isEqualToArray:]
-[NSDate isEqualToDate:]
-[NSDictionary isEqualToDictionary:]
-[NSNumber isEqualToNumber:]
-[NSSet isEqualToSet:]
-[NSString isEqualToString:]
-[NSValue isEqualToValue:]
I want to make my custom collections robust to tests of equality, so they may safely (and predictably) be added to other collections, and allow others (like an NSSet) to determine whether two collections are equal/equivalent/duplicates.
Problems
An -isEqualTo...: method works great on its own, but classes which define these methods usually also override -isEqual: to invoke [self isEqualTo...:] if the parameter is of the same class (or perhaps subclass) as the receiver, or [super isEqual:] otherwise. This means the class must also define -hash such that it will return the same value for disparate instances that have the same contents.
In addition, Apple's documentation for -hash stipulates the following: (emphasis mine)
"If a mutable object is added to a collection that uses hash values to determine the object's position in the collection, the value returned by the hash method of the object must not change while the object is in the collection. Therefore, either the hash method must not rely on any of the object's internal state information or you must make sure the object's internal state information does not change while the object is in the collection. Thus, for example, a mutable dictionary can be put in a hash table but you must not change it while it is in there. (Note that it can be difficult to know whether or not a given object is in a collection.)"
Edit: I definitely understand why this is necessary and totally agree with the reasoning — I mentioned it here to provide additional context, and skirted the topic of why it's the case for the sake of brevity.
All of my collections are mutable, and the hash will have to consider at least some of the contents, so the only option here is to consider it a programming error to mutate a collection stored in another collection. (My collections all adopt NSCopying, so collections like NSDictionary can successfully make a copy to use as a key, etc.)
It makes sense for me to implement -isEqual: and -hash, since (for example) an indirect user of one of my classes may not know the specific -isEqualTo...: method to call, or even care whether two objects are instances of the same class. They should be able to call -isEqual: or -hash on any variable of type id and get the expected result.
Unlike -isEqual: (which has access to two instances being compared), -hash must return a result "blindly", with access only to the data within a particular instance. Since it can't know what the hash is being used for, the result must be consistent for all possible instances that should be considered equal/identical, and must always agree with -isEqual:. (Edit: This has been debunked by the answers below, and it certainly makes life easier.) Further, writing good hash functions is non-trivial — guaranteeing uniqueness is a challenge, especially when you only have an NSUInteger (32/64 bits) in which to represent it.
Questions
1) Are there best practices when implementing equality comparisons and -hash for collections?
2) Are there any peculiarities to plan for in Objective-C and Cocoa-esque collections?
3) Are there any good approaches for unit testing -hash with a reasonable degree of confidence?
4) Any suggestions on implementing -hash to agree with -isEqual: for collections containing elements of arbitrary types? What pitfalls should I know about? (Edit: Not as problematic as I first thought. As @kperryua points out, "equal -hash values do not imply -isEqual:".)
Edit: I should have clarified that I'm not confused about how to implement -isEqual: or -isEqualTo...: for collections, that's straightforward. I think my confusion stemmed mainly from (mistakenly) thinking that -hash MUST return a different value if -isEqual: returns NO. Having done cryptography in the past, I was thinking that hashes for different values MUST be different. However, the answers below made me realize that a "good" hash function is really about minimizing bucket collisions and chaining for collections that use -hash. While unique hashes are preferable, they are not a strict requirement.
I think trying to come up with some generally useful hash function that will generate unique hash values for collections is an exercise in futility. U62's suggestion of combining the hashes of all the contents will not scale well, as it makes the hash function O(n). Hash functions should really be O(1) to ensure good performance, otherwise the purpose of the hash is defeated. (Consider the common Cocoa construct of plists, which are dictionaries containing arrays and other dictionaries, potentially ad nauseam. Attempting to take the hash of the top-level dictionary of a large plist would be excruciatingly slow if the collections' hash functions were O(n).)
My suggestion would be not to worry a great deal about a collection's hash. As you stated, -isEqual: implies equal -hash values. On the other hand, equal -hash values do not imply -isEqual:. That fact gives you a lot of leeway to create a simple hash.
If you're really worried about collisions though (and you have proof in concrete measurements of real-world situations that confirm it is something to be worried about), you could still follow U62's advice to some degree. For example, you could take the hash of, say, the first and/or last element in the collection, and combine that with, say, the -count of the collection. That should be enough to provide a decent hash.
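The first/last-element idea can be sketched as follows, assuming ARC. SketchList is a made-up ordered collection that stands in for a real one, and the multiplier 31 is an arbitrary mixing constant rather than anything Cocoa prescribes.

```objc
#import <Foundation/Foundation.h>

// Hypothetical ordered collection, used only to illustrate the hashing idea.
@interface SketchList : NSObject
- (void)addObject:(id)object;
- (BOOL)isEqualToSketchList:(SketchList *)other;
@end

@implementation SketchList {
    NSMutableArray *_items;
}

- (instancetype)init {
    self = [super init];
    if (self) {
        _items = [NSMutableArray array];
    }
    return self;
}

- (void)addObject:(id)object {
    [_items addObject:object];
}

// Element-by-element comparison, in order.
- (BOOL)isEqualToSketchList:(SketchList *)other {
    return other != nil && [_items isEqualToArray:other->_items];
}

- (BOOL)isEqual:(id)object {
    if (object == self) return YES;
    if (![object isKindOfClass:[SketchList class]]) return NO;
    return [self isEqualToSketchList:object];
}

// O(1): mixes the count with the hashes of the first and last elements.
// Equal lists have equal counts and equal end elements, so they hash alike.
- (NSUInteger)hash {
    NSUInteger result = _items.count;
    result = result * 31 + [_items.firstObject hash];
    result = result * 31 + [_items.lastObject hash];
    return result;
}

@end
```

Unequal lists may still collide (same length, same ends, different middle), which is allowed; it only costs an extra -isEqual: call in hashed containers.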
I hope that answers at least one of your questions.
As for No. 1: Implementing -isEqual: is pretty cut and dry. You enumerate the contents, and check isEqual: on each of the elements.
There is one thing to be careful of that may affect what you decide to do for your collections' -hash functions. Clients of your collections must also understand the rules governing -isEqual: and -hash. If you use the contents' -hash in your collection's -hash, your collection will break if the contents' isEqual: and -hash don't agree. It's the client's fault, of course, but that's another argument against basing your -hash off of the collection's contents.
No. 2 is kind of vague. Not sure what you have in mind there.
Two collections should be considered equal if they contain the same elements, and further if the collections are ordered, that the elements are in the same order.
On the subject of hashes for collections, it should be enough to combine the hashes of the elements in some way (XOR them or add them with wraparound). Note that while the rules state that two objects that are equal according to -isEqual: need to return the same hash, the opposite does not hold: although uniqueness of hashes is desirable, it is not necessary for correctness of the solution. Thus an ordered collection need not take account of the order of the elements.
The excerpt from the Apple documentation is a necessary restriction, by the way. An object could not maintain the same hash value under mutation while also ensuring that objects with the same value have the same hash. That applies to the simplest of objects as well as to collections. Of course, it usually only matters that an object's hash changes when it is inside a container that uses the hash to organise its elements. The upshot of all this is that mutable collections shouldn't mutate when placed inside another container, but then neither should any object that has a true hash function.
I have done some investigation into the default NSArray and NSMutableArray hash implementation and (unless I have misunderstood something) it seems like Apple does not follow their own rules:
If a mutable object is added to a collection that uses hash values to
determine the object's position in the collection, the value returned
by the hash method of the object must not change while the object is
in the collection. Therefore, either the hash method must not rely on
any of the object's internal state information or you must make sure
the object's internal state information does not change while the
object is in the collection. Thus, for example, a mutable dictionary
can be put in a hash table but you must not change it while it is in
there. (Note that it can be difficult to know whether or not a given
object is in a collection.)
Here is my test code
NSMutableArray *myMutableArray = [NSMutableArray arrayWithObjects:@"a", @"b", @"c", nil];
NSMutableArray *containerForMutableArray = [NSMutableArray arrayWithObject:myMutableArray];
NSUInteger hashBeforeMutation = [[containerForMutableArray objectAtIndex:0] hash];
[[containerForMutableArray objectAtIndex:0] removeObjectAtIndex:1];
NSUInteger hashAfterMutation = [[containerForMutableArray objectAtIndex:0] hash];
// NSUInteger is wider than int on 64-bit, so cast and print with %lu.
NSLog(@"Hash Before: %lu", (unsigned long)hashBeforeMutation);
NSLog(@"Hash After : %lu", (unsigned long)hashAfterMutation);
The output is:
Hash Before: 3
Hash After : 2
So it seems like the default implementation of the -hash method on both NSArray and NSMutableArray is the count of the array, and it doesn't care whether it's inside a collection or not.