Fastest way to find an NSManagedObject in an NSSet - cocoa-touch

I have two NSSets of NSManagedObjects. The objects in each set are fetched on different threads, so some of them have matching objectIDs even though the objects themselves are different instances. Now I want to remove the managed objects in one set from the other.
NSSet* oldObjects;
NSMutableSet* currentObjects;
// I want to remove the managedObjects in oldObjects from currentObjects, all objects in oldObjects are also in currentObjects
// This doesn't work, since the objects don't match
[currentObjects removeObjectsInArray:[oldObjects allObjects]];
// But strangely enough, this doesn't add any objects to currentObjects, but if the objects don't match, shouldn't it?
//[currentObjects addObjectsFromArray:[oldObjects allObjects]];
// This does work for me but this code is running on the main thread and I can see this becoming rather slow for large data sets
NSArray* oldObjectIDs = [[oldObjects allObjects] valueForKey:@"objectID"];
[currentObjects filterUsingPredicate:[NSPredicate predicateWithFormat:@"NOT (objectID IN %@)", oldObjectIDs]];
Is there a faster way I can filter these out? Would fast enumeration be faster even in this case?

Sorry for getting back to you with such a delay.
I re-read your question, and now that I've understood the setting completely, I might have a solution for you.
This is not tested, but try something like this:
//Since the current objects set has registered its objects in the current context
//lets use that registration to see which of them is contained in the old object set
NSMutableSet* oldRegisteredSet = [NSMutableSet new];
for (NSManagedObject* o in oldObjects) {
    NSManagedObject* regObject = [context objectRegisteredForID:[o objectID]];
    if (regObject) {
        //You could do here instead: [currentObjects removeObject:regObject];
        //You should optimize here after testing performance
        [oldRegisteredSet addObject:regObject];
    }
}
[currentObjects minusSet:oldRegisteredSet];
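As a further untested alternative that does not depend on the objects being registered in a particular context, the object IDs can be collected into an NSSet once, so that each membership test is a hash lookup instead of a scan through an array. The helper name RemoveObjectsMatchingIDs is made up for this sketch, and, like the predicate version in the question, it assumes it is acceptable to read objectID from both sets on the calling thread.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper (not from the original posts): removes from `currentObjects`
// every managed object whose objectID also appears in `oldObjects`.
static void RemoveObjectsMatchingIDs(NSMutableSet *currentObjects, NSSet *oldObjects) {
    // -[NSSet valueForKey:] returns a set of the results, here a set of NSManagedObjectIDs.
    NSSet *oldIDs = [oldObjects valueForKey:@"objectID"];
    NSMutableSet *matches = [NSMutableSet set];
    for (NSManagedObject *object in currentObjects) {
        if ([oldIDs containsObject:object.objectID]) {
            [matches addObject:object];
        }
    }
    // Remove after the loop; a set must not be mutated while it is being enumerated.
    [currentObjects minusSet:matches];
}
```

Whether this actually beats the predicate is something to measure on your own data.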

Related

Core data find-or-create most efficient way

I have around 10000 objects of entity 'Message'. When I add a new 'Message' I want to first see whether it exists; if it does, just update its data, and if it doesn't, create it.
Right now the "find-or-create" algorithm works by saving the objectIDs of all the Message objects in one array, then filtering through them and getting the messages with existingObjectWithID:error:
This works fine, but when I fetch a 'Message' using existingObjectWithID:error:, set one of its properties, and call save: on its context, the change isn't saved properly. Has anyone come across a problem like this?
Is there a more efficient way to implement a find-or-create algorithm?
First, Message is a "bad" name for a Core Data entity, as Apple uses it internally and it causes problems later in development.
You can read a little more about it HERE
I've noticed that all suggested solutions here use an array or a fetch request.
You might want to consider a dictionary based solution ...
In a single-threaded, single-context application this can be done without much overhead: pre-populate a cache (a dictionary) that maps each existing key to its object ID, and add newly inserted objects (of type Message) to the cache as they are created.
Consider this interface:
@interface UniquenessEnforcer : NSObject
@property (readonly,nonatomic,strong) NSPersistentStoreCoordinator* coordinator;
@property (readonly,nonatomic,strong) NSEntityDescription* entity;
@property (readonly,nonatomic,strong) NSString* keyProperty;
@property (nonatomic,readonly,strong) NSError* error;
- (instancetype) initWithEntity:(NSEntityDescription *)entity
                    keyProperty:(NSString*)keyProperty
                    coordinator:(NSPersistentStoreCoordinator*)coordinator;
- (NSArray*) existingObjectIDsForKeys:(NSArray*)keys;
- (void) unregisterKeys:(NSArray*)keys;
- (void) registerObjects:(NSArray*)objects;//objects must have permanent objectIDs
- (NSArray*) findOrCreate:(NSArray*)keys
                  context:(NSManagedObjectContext*)context
                    error:(NSError* __autoreleasing*)error;
@end
flow:
1) on application start, allocate a "uniqueness enforcer" and populate your cache:
//private method of uniqueness enforcer
- (void) populateCache
{
    NSManagedObjectContext* context = [[NSManagedObjectContext alloc] init];
    context.persistentStoreCoordinator = self.coordinator;
    NSFetchRequest* r = [NSFetchRequest fetchRequestWithEntityName:self.entity.name];
    [r setResultType:NSDictionaryResultType];
    NSExpressionDescription* objectIdDesc = [NSExpressionDescription new];
    objectIdDesc.name = @"objectID";
    objectIdDesc.expression = [NSExpression expressionForEvaluatedObject];
    objectIdDesc.expressionResultType = NSObjectIDAttributeType;
    r.propertiesToFetch = @[self.keyProperty,objectIdDesc];
    NSError* error = nil;
    NSArray* results = [context executeFetchRequest:r error:&error];
    self.error = error;
    if (results) {
        for (NSDictionary* dict in results) {
            _cache[dict[self.keyProperty]] = dict[@"objectID"];
        }
    } else {
        _cache = nil;
    }
}
2) when you need to test existence simply use:
- (NSArray*) existingObjectIDsForKeys:(NSArray *)keys
{
    return [_cache objectsForKeys:keys notFoundMarker:[NSNull null]];
}
3) when you like to actually get objects and create missing ones:
- (NSArray*) findOrCreate:(NSArray*)keys
                  context:(NSManagedObjectContext*)context
                    error:(NSError* __autoreleasing*)error
{
    NSMutableArray* fullList = [[NSMutableArray alloc] initWithCapacity:[keys count]];
    NSMutableArray* needFetch = [[NSMutableArray alloc] initWithCapacity:[keys count]];
    NSManagedObject* object = nil;
    for (id<NSCopying> key in keys) {
        NSManagedObjectID* oID = _cache[key];
        if (oID) {
            object = [context objectWithID:oID];
            if ([object isFault]) {
                [needFetch addObject:oID];
            }
        } else {
            object = [NSEntityDescription insertNewObjectForEntityForName:self.entity.name
                                                   inManagedObjectContext:context];
            [object setValue:key forKey:self.keyProperty];
        }
        [fullList addObject:object];
    }
    if ([needFetch count]) {
        NSFetchRequest* r = [NSFetchRequest fetchRequestWithEntityName:self.entity.name];
        r.predicate = [NSPredicate predicateWithFormat:@"SELF IN %@",needFetch];
        if([context executeFetchRequest:r error:error] == nil) {//load the missing faults from store
            fullList = nil;
        }
    }
    return fullList;
}
In this implementation you need to keep track of objects deletion/creation yourself.
You can use the register/unregister methods (trivial implementation) for this after a successful save.
You could make this a bit more automatic by hooking into the context "save" notification and updating the cache with relevant changes.
The multi-threaded case is much more complex (same interface, but a different implementation altogether once performance is taken into account).
For instance, the enforcer must save new items to the store before returning them to the requesting context, because otherwise they don't have permanent IDs; and even if you obtain permanent IDs for them, the requesting context might never save.
You will also need a dispatch queue of some sort (concurrent or serial) to guard access to your cache dictionary.
Some math:
Given:
10K (10*1024) unique key objects
average key length of 256[byte]
objectID length of 128[byte]
we are looking at:
10K * (256 + 128) bytes = 3,932,160 bytes, or about 4 MB of memory
This might be a high estimate, but you should take this into account ...
OK, many things can go wrong here. This is how to do it:
1) Create an NSManagedObjectContext (MOC).
2) Create an NSFetchRequest with the right entity.
3) Create the NSPredicate and attach it to the fetch request.
4) Execute the fetch request on the newly created context.
5) The fetch request returns an array of objects matching the predicate (it should contain only one object if your IDs are distinct).
6) Take the first element of the array as an NSManagedObject.
7) Change its property.
8) Save the context.
The most important thing is to use the same context for fetching and saving, and to do both on the same thread, because an MOC is not thread safe. That is the most common mistake people make.
Currently you say you maintain an array of objectIDs. When you need to, you:
filter through them and get the messages with existingObjectWithID:error:
and after this you need to check if the message you got back:
exists
matches the one you want
This is very inefficient. It is inefficient because you are always fetching objects back from the data store into memory. You are also doing it individually (not batching). This is basically the slowest way you could possibly do it.
Why changes to that object aren't saved properly isn't clear. You should get an error of some kind. But, you should really change your search approach:
Instead of looping and loading, use a single fetch request with a predicate:
NSFetchRequest *request = ...;
NSPredicate *filterPredicate = [NSPredicate predicateWithFormat:@"XXX == %@", YYY];
[request setPredicate:filterPredicate];
[request setFetchLimit:1];
where XXX is the name of the attribute in the message to test, and YYY is the value to test it against.
When you execute this fetch on the MOC you should get one or zero responses. If you get zero, create + insert a new message and save the MOC. If you get one, update it and save the MOC.
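Put together, the single-fetch approach might look like the sketch below. The function name FindOrCreate and its parameters are invented for illustration; it assumes the caller invokes it on the thread or queue that owns the context, and it leaves the update and the save to the caller.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: returns the object of `entityName` whose `attribute` equals `value`,
// inserting a new one if none exists. Call it on the context's own thread or queue.
static NSManagedObject *FindOrCreate(NSManagedObjectContext *context,
                                     NSString *entityName,
                                     NSString *attribute,
                                     id value,
                                     NSError **error) {
    NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:entityName];
    // %K substitutes a key path, %@ substitutes the value to compare against.
    request.predicate = [NSPredicate predicateWithFormat:@"%K == %@", attribute, value];
    request.fetchLimit = 1;

    NSArray *results = [context executeFetchRequest:request error:error];
    if (results == nil) {
        return nil; // the fetch itself failed; *error describes why
    }
    NSManagedObject *object = results.firstObject;
    if (object == nil) {
        // Zero results: create and insert a new object with the key already set.
        object = [NSEntityDescription insertNewObjectForEntityForName:entityName
                                               inManagedObjectContext:context];
        [object setValue:value forKey:attribute];
    }
    return object;
}
```

After the call, set whatever properties need updating and call save: on the same context.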

Obj-C: using mutable and returning non mutable classes in methods

In Objective-C I find myself creating a lot of mutable objects and then returning them as immutable objects. Is the way I am doing it here, simply returning the NSMutableSet as an NSSet, good practice? I was thinking maybe I should make a copy of it explicitly.
/** Returns the names of all the variables used in a given
 * program. If none are used, it returns nil. */
+ (NSSet *)variablesUsedInProgram:(id)program
{
    NSMutableSet* variablesUsed = [[NSMutableSet alloc]init];
    if ([program isKindOfClass:[NSArray class]]) {
        for (NSString *str in program)
        {
            if ([str isEqual:@"x"] || [str isEqual:@"y"] || [str isEqual:@"a"] || [str isEqual:@"b"])
                [variablesUsed addObject:str];
        }
    }
    if ([variablesUsed count] > 0) {
        return variablesUsed;
    } else {
        return nil;
    }
}
If I were you, I would do it this way.
+ (NSSet *)variablesUsedInProgram:(id)program
{
    NSSet *variablesUsed = nil;
    if ([program isKindOfClass:[NSArray class]]) {
        // Match the same four variable names as the code in the question
        NSPredicate *predicate = [NSPredicate predicateWithFormat:@"SELF = 'x' OR SELF = 'y' OR SELF = 'a' OR SELF = 'b'"];
        variablesUsed = [NSSet setWithArray:[program filteredArrayUsingPredicate:predicate]];
    }
    return [variablesUsed count] > 0 ? variablesUsed : nil;
}
I find using a predicate to filter an array quite comprehensible and easy. Rather than creating a new mutable collection, testing a condition, and adding to it inside a loop, it seems easier in this scenario to use a predicate. Hope this helps you.
It depends how much safety you require. If you return the object as an NSSet it will still be an NSMutableSet, so it could easily be cast back to one and modified.
Certainly, if you're creating a public API, I'd recommend returning a copy. For an internal project, perhaps the method signature already makes the intention clear enough.
It's worth noting that the performance impact of returning a copy is generally negligible: copying an immutable instance is effectively free, whereas every copy made of a mutable object passed off as immutable creates another real copy. So I would say copying is a good default.
No. This is an absolutely correct OOP approach (it takes advantage of polymorphism). Every NSMutableSet is a proper NSSet. Don't copy superfluously.
This is not a full answer (consider NSProxy's for that), but I want to clarify something.
In your case you create your object from scratch, and you don't set any ivar to point to that object. In my opinion in a good percentage of cases you don't need to make a copy of the mutable object returned. But if there is a good reason to deny the class client from mutating the class, then you should copy the variable.
Consider a property like this:
@property (nonatomic,assign) NSSet* set;
The class client could do this:
NSMutableSet* set= ... ; // initialized to some value
classInstance.set= set;
// Mutate the set
Once the set has been mutated, the class could be left in an inconsistent state.
That's why when I have a property with the type of a class that has also a mutable version, I always put copy instead of assign in the property.
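The difference can be seen in a small sketch. The Roster class and its property names are made up for this illustration, and strong is used instead of assign for the non-copying property so that the example is safe under ARC.

```objc
#import <Foundation/Foundation.h>

// Hypothetical class used only for this illustration.
@interface Roster : NSObject
@property (nonatomic, copy) NSSet *names;          // stores an immutable snapshot
@property (nonatomic, strong) NSSet *sharedNames;  // stores the caller's object as-is
@end

@implementation Roster
@end

static void DemonstrateCopyVersusStrong(void) {
    NSMutableSet *input = [NSMutableSet setWithObject:@"x"];
    Roster *roster = [[Roster alloc] init];
    roster.names = input;
    roster.sharedNames = input;

    [input addObject:@"y"]; // the caller mutates its set afterwards

    NSLog(@"copy: %lu, strong: %lu",
          (unsigned long)roster.names.count,        // still 1
          (unsigned long)roster.sharedNames.count); // now 2
}
```

The copy property is unaffected by the later mutation, while the strong property silently changes along with the caller's set.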

Creating local objects, preference or simply better?

Is it better to create a local object for later use like
NSDictionary *dic = [NSDictionary dictionary];
or
NSDictionary * dic = nil;
Is it a matter of preference, or is one better than the other?
It's not like 'the one is better', it's like 'the other is bad'.
If you're going to assign a new object to it later, initialize it to nil. Otherwise you create an object only to discard it. (EDIT: I originally wrote that this leaks memory by losing the reference to the first object. It doesn't, thanks to either autorelease or automatic reference counting, but it is still an extra, unneeded method call.) That is bad.
If it's a mutable collection, create it before you use it, else it will continue being nil and ignoring essentially all messages sent to it, which is also bad.
Conclusion: it's not a matter of preference - you must think logically and choose whichever is suited for the specific purpose you are using it for.
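The point about nil mutable collections can be checked with a few lines. The function name CountAfterAdding is invented for this sketch.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: adds one string to `array` and reports how many
// elements it holds afterwards.
static NSUInteger CountAfterAdding(NSMutableArray *array) {
    [array addObject:@"string"]; // if array is nil, this message is silently ignored
    return array.count;          // messaging nil returns 0
}

static void DemonstrateNilCollection(void) {
    NSMutableArray *missing = nil;                     // never created
    NSMutableArray *created = [NSMutableArray array];  // created before use
    NSLog(@"nil: %lu, created: %lu",
          (unsigned long)CountAfterAdding(missing),    // 0
          (unsigned long)CountAfterAdding(created));   // 1
}
```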
If you will use that object later, then you should instantiate it with the first option. If you will have to create an object in some if-else block where you will be reinitializing it with some custom values, then the second option is the way to go.
For example the first option:
NSMutableArray *arr = [NSMutableArray array];
for (int i = 0; i < 5; i++) {
    [arr addObject:@"string"];
}
or
NSDictionary *dictionary = nil;
BOOL flag = YES; // placeholder condition
if (flag) {
    dictionary = [NSDictionary dictionaryWithObject:@"string" forKey:@"myKey"];
}
else {
    NSArray *objects = @[@"string1", @"string2"]; // placeholder values
    NSArray *keys = @[@"key1", @"key2"];
    dictionary = [NSDictionary dictionaryWithObjects:objects forKeys:keys];
}

Emptying a Core Data NSSet (multiple relationships)

If I need to programmatically empty an NSSet automatically created by Core Data (a to-many relationship), what should I do? Something like this?
[self willChangeValueForKey:@"MyRelationship"];
[[self MyRelationship] release];
[self MyRelationship] = [NSSet alloc] init];
[self didChangeValueForKey:@"MyRelationship"];
Not sure it is correct at all...
thanks
[[self mutableSetValueForKey:@"MyRelationship"] removeAllObjects];
For some reason, I can never get the "cascade" delete rule to work, so when I want the objects deleted as well I have to iterate over the set and call [self.managedObjectContext deleteObject:obj] or else I'll get validation errors, if the relationship is defined as required.
Patrick,
Relations are, unsurprisingly, special in Core Data. They provide some specialized methods to remove objects from those relations. Rather than trying to override the accessors, you should use those methods. As in this snippet:
[self removeMyRelationship: self.myRelationship];
I also think you should remove your overridden accessor methods.
I have no insight into your deletion problem. I recommend that you just iterate over the group and delete the objects. I think it is important that your enumerator be a copy of your relationship. As in the following ARC code:
for (Relation *r in [self.myRelationship copy]) {
    [moc deleteObject: r];
}
Andrew

What's the way to communicate a set of Core Data objects stored in the background to the main thread?

Part of my iOS project polls a server for sets of objects, then converts and saves them to Core Data, to then update the UI with the results. The server tasks happen in a collection of NSOperation classes I call 'services' that operate in the background. If NSManagedObject and its ~Context were thread safe, I would have had the services call delegate methods on the main thread like this one:
- (void)service:(NSOperation *)service retrievedObjects:(NSArray *)objects;
Of course you can't pass around NSManagedObjects like this, so this delegate method is doomed. As far as I can see there are two solutions to get to the objects from the main thread. But I like neither of them, so I was hoping the great StackOverflow community could help me come up with a third.
I could perform an NSFetchRequest on the main thread to pull in the newly added or modified objects. The problem is that the Core Data store contains many more of these objects, so I have to add quite some verbosity to communicate the right set of objects. One way would be to add a property to the object like batchID, which I could then pass back to the delegate so it would know what to fetch. But adding data to the store to fix my concurrency limitations feels wrong.
I could also collect the newly added objects' objectID properties, put them in a list and send that list to the delegate method. The unfortunate thing though is that I have to populate the list after I save the context, which means I have to loop over the objects twice in the background before I have the correct list (first time is when parsing the server response). Then I still only have a list of objectIDs, which I have to individually reel in with existingObjectWithID:error: from the NSManagedObjectContext on the main thread. This just seems so cumbersome.
What piece of information am I missing? What's the third solution to bring a set of NSManagedObjects from a background thread to the main thread, without losing thread confinement?
epologee,
While you obviously have a solution you are happy with, let me suggest that you lose some valuable information, whether items are updated, deleted or inserted, with your mechanism. In my code, I just migrate the userInfo dictionary to the new MOC. Here is a general purpose routine to do so:
// Migrate a userInfo dictionary as defined by NSManagedObjectContextDidSaveNotification
// to the receiver context.
- (NSDictionary *) migrateUserInfo: (NSDictionary *) userInfo {
    NSMutableDictionary *ui = [NSMutableDictionary dictionaryWithCapacity: userInfo.count];
    NSSet * sourceSet = nil;
    NSMutableSet *migratedSet = nil;
    for (NSString *key in [userInfo allKeys]) {
        sourceSet = [userInfo valueForKey: key];
        migratedSet = [NSMutableSet setWithCapacity: sourceSet.count];
        for (NSManagedObject *mo in sourceSet) {
            [migratedSet addObject: [self.moc objectWithID: mo.objectID]];
        }
        [ui setValue: migratedSet forKey: key];
    }
    return ui;
} // -migrateUserInfo:
The above routine assumes it is a method of a class which has an @property NSManagedObjectContext *moc.
I hope you find the above useful.
Andrew
There's a section of the Core Data Programming Guide that addresses Concurrency with Core Data. In a nutshell, each thread should have its own managed object context and then use notifications to synchronize the contexts.
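A minimal sketch of that notification-based synchronization follows. The function name ObserveSaves is invented here, and the sketch assumes that mainContext is only ever used on the main thread, which is why the block is delivered on the main operation queue.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: keeps `mainContext` in sync with saves made by `backgroundContext`.
// Returns the observer token; pass it to -[NSNotificationCenter removeObserver:] when done.
static id ObserveSaves(NSManagedObjectContext *backgroundContext,
                       NSManagedObjectContext *mainContext) {
    return [[NSNotificationCenter defaultCenter]
        addObserverForName:NSManagedObjectContextDidSaveNotification
                    object:backgroundContext
                     queue:[NSOperationQueue mainQueue]
                usingBlock:^(NSNotification *note) {
                    // Runs on the main queue, so touching mainContext here is safe
                    // under the assumption stated above.
                    [mainContext mergeChangesFromContextDidSaveNotification:note];
                }];
}
```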
After a little experimentation, I decided to go for a slight alteration to my proposed method number 2. While performing background changes on the context, keep track of the objects you want to hand back to the main thread, say in an NSMutableArray *objectsOfInterest. We eventually want the objectID of every object in this array, but because a newly inserted object's temporary objectID is replaced by a permanent one when you save the context, we first have to perform that [context save:&error]. Right after the save, use the arrayFromObjectsAtKey: method from the NSArray category below to generate a list of objectID instances, like so:
NSArray *objectIDs = [objectsOfInterest arrayFromObjectsAtKey:@"objectID"];
That array you can pass back safely to the main thread via the delegate (do make sure your main thread context is updated with mergeChangesFromContextDidSaveNotification by listening to the NSManagedObjectContextDidSaveNotification). When you're ready to reel in the objects of the background operation, use the existingObjectsWithIDs:error: method from the category below to turn the array of objectID's back into a list of working NSManagedObjects.
Any suggestions to improve the conciseness or performance of these methods are appreciated.
@implementation NSArray (Concurrency)
- (NSArray *)arrayFromObjectsAtKey:(NSString *)key {
    NSMutableArray *objectsAtKey = [NSMutableArray array];
    for (id value in self) {
        [objectsAtKey addObject:[value valueForKey:key]];
    }
    return objectsAtKey;
}
@end
@implementation NSManagedObjectContext (Concurrency)
- (NSArray *)existingObjectsWithIDs:(NSArray *)objectIDs error:(NSError **)error {
    NSMutableArray *entities = [NSMutableArray array];
    @try {
        for (NSManagedObjectID *objectID in objectIDs) {
            // existingObjectWithID might return nil if it can't find the objectID, but if you're not prepared for this,
            // don't use this method but write your own.
            [entities addObject:[self existingObjectWithID:objectID error:error]];
        }
    }
    @catch (NSException *exception) {
        return nil;
    }
    return entities;
}
@end