I have a dictionary that stores an object using a combination of the class name and selector as the key. I'm using the following function in order to calculate the hash:
+(NSString*) getKeyForClass:(Class) clazz andSelector:(SEL) selector {
return [NSString stringWithFormat:@"%@_%@", NSStringFromClass(clazz), NSStringFromSelector(selector)];
}
While running a profiler, I discovered that this function is the bottleneck of the computation. Is there a better (that is, more efficient) way to create a key from a class and a selector?
A few alternatives.
Keep using a string as a key, but do it faster:
Using a string is a bit more heavyweight than you really need, but it is at least simple.
Using -[NSString stringByAppendingString:] would be faster, because parsing a format string is a lot of work.
return [[NSStringFromClass(clazz) stringByAppendingString:@"_"] stringByAppendingString:NSStringFromSelector(selector)];
It may be better to use a single NSMutableString instead of making intermediate strings. Profile it and see.
NSMutableString* result = [NSStringFromClass(clazz) mutableCopy];
[result appendString:@"_"];
[result appendString:NSStringFromSelector(selector)];
return result;
Use a custom object as a key:
You can make a custom object as the key that refers to the class and selector. Implement NSCopying and -isEqual: and -hash on it, so you can use it as a key in a dictionary.
@interface MyKey : NSObject <NSCopying>
{
Class _clazz;
SEL _selector;
}
- (id)initWithClass:(Class)clazz andSelector:(SEL)selector;
@end
@implementation MyKey
- (id)initWithClass:(Class)clazz andSelector:(SEL)selector
{
if ((self = [super init])) {
_clazz = clazz;
_selector = selector;
}
return self;
}
- (id)copyWithZone:(NSZone*)zone
{
return self; // this object is immutable, so no need to actually copy it
}
- (BOOL)isEqual:(id)other
{
if ([other isKindOfClass:[MyKey class]]) {
MyKey* otherKey = (MyKey*)other;
return _clazz == otherKey->_clazz && _selector == otherKey->_selector;
} else {
return NO;
}
}
// Hash combining method from http://www.mikeash.com/pyblog/friday-qa-2010-06-18-implementing-equality-and-hashing.html
#define NSUINT_BIT (CHAR_BIT * sizeof(NSUInteger))
#define NSUINTROTATE(val, howmuch) ((((NSUInteger)val) << howmuch) | (((NSUInteger)val) >> (NSUINT_BIT - howmuch)))
- (NSUInteger)hash
{
return NSUINTROTATE([_clazz hash], NSUINT_BIT / 2) ^ (NSUInteger)_selector;
}
@end
+ (MyKey*)keyForClass:(Class)clazz andSelector:(SEL)selector
{
return [[MyKey alloc] initWithClass:clazz andSelector:selector];
}
Eliminate the middleman:
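If you never need to pull the class and selector out of your key object, then you can just use the hash as computed above, stored in an NSNumber. Be aware that two different class/selector pairs can in principle produce the same hash, in which case they would share a key.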
// Hash combining method from http://www.mikeash.com/pyblog/friday-qa-2010-06-18-implementing-equality-and-hashing.html
#define NSUINT_BIT (CHAR_BIT * sizeof(NSUInteger))
#define NSUINTROTATE(val, howmuch) ((((NSUInteger)val) << howmuch) | (((NSUInteger)val) >> (NSUINT_BIT - howmuch)))
+ (NSNumber*)keyForClass:(Class)clazz andSelector:(SEL)selector
{
NSUInteger hash = NSUINTROTATE([clazz hash], NSUINT_BIT / 2) ^ (NSUInteger)selector;
return [NSNumber numberWithUnsignedInteger:hash];
}
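The rotate-and-XOR combination used in both versions above can be tried out in isolation. The following is a minimal sketch in plain C (which also compiles as Objective-C); the names rotate_left, combine_hashes and WORD_BITS are made up for this illustration and stand in for the macros above, with uintptr_t playing the role of NSUInteger.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Number of bits in a word-sized unsigned integer. */
#define WORD_BITS (CHAR_BIT * sizeof(uintptr_t))

/* Rotate `value` left by `bits`. Requires 0 < bits < WORD_BITS, because
   shifting by the full word width is undefined behaviour in C. */
static uintptr_t rotate_left(uintptr_t value, unsigned bits) {
    assert(bits > 0 && bits < WORD_BITS);
    return (value << bits) | (value >> (WORD_BITS - bits));
}

/* Hypothetical helper: combine two word-sized values (for example a class
   hash and a selector address) into one hash. Rotating the first value by
   half a word before the XOR makes the result depend on argument order. */
static uintptr_t combine_hashes(uintptr_t first, uintptr_t second) {
    return rotate_left(first, (unsigned)(WORD_BITS / 2)) ^ second;
}
```

Because of the rotation, combine_hashes(a, b) and combine_hashes(b, a) generally differ, which a plain a ^ b would not give you. The result is still only a hash: different inputs can produce the same value.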
SELs themselves are unique; you could use an NSValue and wrap just that:
[NSValue valueWithBytes:&selector objCType:@encode(SEL)];
In my class I override isEqual:.
@interface MyClass : NSObject
@property (nonatomic, strong) NSString * customID;
@end
I override isEqual: so that it checks only the equality of customID:
- (BOOL)isEqual:(id)object {
if ([object isKindOfClass:[MyClass class]]) {
if (self.customID == nil) {
return NO;
}
return [self.customID isEqual:[object customID]];
}
return [super isEqual:object];
}
Now, an NSSet is essentially a hash table, which makes it fast to check whether it contains a given value; that much we know.
But imagine this code:
NSArray * instancesToCheck = ...;
NSArray * allInstances = ...;
for (MyClass * instance in allInstances) {
if ([instancesToCheck containsObject:instance]) {
// do smth
}
}
I would like to "optimize" it like this (using an NSSet for membership testing):
NSArray * instancesToCheck = ...;
NSArray * allInstances = ...;
NSSet * instancesToCheckAsSet = [NSSet setWithArray:instancesToCheck];
for (MyClass * instance in allInstances) {
if ([instancesToCheckAsSet containsObject:instance]) {
// do smth
}
}
Does the second version provide any performance benefit at all (assuming that the array the set was created from contained no duplicates, and that instancesToCheck contains different pointers, but some of the objects have the same customID, so that isEqual: returns YES although pointer comparison returns NO)?
When I looked up the docs, I found that containsObject: calls isEqual:, so it has to iterate over all objects anyway.
What are the performance implications of using NSSet with objects that override isEqual:? Does the NSSet become less effective?
Does the second code provide any performance benefit at all
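Absolutely. An array has to walk through its elements, sending isEqual: to each one until it finds a match, so every lookup takes time proportional to the array's length. A set is a hash table: it uses the object's hash to go more or less straight to the few candidates that could be equal, and only sends isEqual: to those. Indeed, this sort of thing is exactly what a set is for.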
You MUST override hash if you override isEqual:. Otherwise hash-based collections such as NSSet and NSDictionary may fail to find objects that isEqual: considers equal.
Two objects that are considered "equal" must return the same hash value.
- (BOOL)isEqual:(id)object {
if ([object isKindOfClass:[MyClass class]]) {
if (self.customID == nil) {
return NO;
}
return [self.customID isEqual:[object customID]];
}
return [super isEqual:object];
}
// MUST override hash as well
- (NSUInteger)hash {
return [self.customID hash];
}
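To see why the two methods have to agree, here is a minimal sketch of a bucketed set in plain C (which also compiles as Objective-C). It is not how NSSet is implemented; Item, ItemSet, the serial field and all function names are assumptions made up for this illustration. The serial field plays the role of the default, identity-based hash.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define BUCKET_COUNT 8
#define BUCKET_CAPACITY 4

/* Hypothetical stand-in for MyClass: equality is decided by custom_id. */
typedef struct {
    const char *custom_id;
    unsigned serial;   /* stands in for an identity-based default hash */
} Item;

typedef unsigned (*HashFn)(const Item *);

typedef struct {
    const Item *slots[BUCKET_COUNT][BUCKET_CAPACITY];
    size_t counts[BUCKET_COUNT];
    HashFn hash;
} ItemSet;

static void set_init(ItemSet *set, HashFn hash) {
    memset(set, 0, sizeof *set);
    set->hash = hash;
}

static bool items_equal(const Item *a, const Item *b) {
    return strcmp(a->custom_id, b->custom_id) == 0;
}

/* Consistent with items_equal: equal IDs always give equal hashes. */
static unsigned hash_by_id(const Item *item) {
    unsigned h = 5381;
    for (const char *p = item->custom_id; *p != '\0'; ++p) {
        h = h * 33u + (unsigned char)*p;
    }
    return h;
}

/* Inconsistent with items_equal: equal IDs can give different hashes. */
static unsigned hash_by_identity(const Item *item) {
    return item->serial;
}

static void set_add(ItemSet *set, const Item *item) {
    size_t bucket = set->hash(item) % BUCKET_COUNT;
    assert(set->counts[bucket] < BUCKET_CAPACITY);
    set->slots[bucket][set->counts[bucket]++] = item;
}

/* Only the bucket selected by the hash is searched with items_equal. */
static bool set_contains(const ItemSet *set, const Item *item) {
    size_t bucket = set->hash(item) % BUCKET_COUNT;
    for (size_t i = 0; i < set->counts[bucket]; ++i) {
        if (items_equal(set->slots[bucket][i], item)) {
            return true;
        }
    }
    return false;
}
```

With hash_by_id, a second Item that has the same custom_id lands in the same bucket and is found. With hash_by_identity, the equal item is looked up in a different bucket and missed, even though items_equal would have returned true.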
I wonder whether there are any drawbacks to using malloc/free with a pure C array inside an Objective-C class.
For example:
#import "CVPatternGrid.h"
@implementation CVPatternGrid
@synthesize row = _row;
@synthesize column = _column;
@synthesize count = _count;
@synthesize score = _score;
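- (id)initWithRow:(NSInteger)row column:(NSInteger)column {
if (self = [super init]) {
// Remember the dimensions; freeGrid needs _row to release every row.
_row = row;
_column = column;
_grid = [self allocateNewGrid:row column:column];
}
return self;
}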
- (NSInteger)moveCount {
return _count;
}
- (bool**)allocateNewGrid:(NSInteger)row column:(NSInteger)column {
bool **p = malloc(row * sizeof(bool*));
for (int i = 0; i < row; ++i) {
p[i] = malloc(column * sizeof(bool));
}
return p;
}
- (void)generateNewGrid:(NSInteger)row column:(NSInteger)column {
[self freeGrid]; // frees the old grid using the old _row
_row = row;
_column = column;
_grid = [self allocateNewGrid:row column:column];
_count = [self.algorithmDelegate generateGrid:_grid];
_score = _count * 100;
}
- (BOOL)isMarkedAtRow:(NSInteger)row column:(NSInteger)column {
return YES;
}
- (void)freeGrid {
for (int i = 0; i < _row; ++i) {
free(_grid[i]);
}
free(_grid);
}
- (void)dealloc {
[self freeGrid];
}
@end
It's perfectly normal to use a C array in an Obj-C class. Obj-C has no low-level data types of its own; every class, including NSArray, NSString, etc., uses primitive C types internally.
However you are doing a few things wrong:
Do not use @synthesize unless you need to. In this case you don't need it, so delete those lines of code.
Do not use _foo to access variables unless you need to. In this case you don't need it in any of your use cases, except arguably in your init and dealloc methods, and I would argue it should not even be used there (other people disagree with me). My rule is to use _foo only when I run into performance issues with the self.foo syntax. There are also edge cases, such as KVO, where using an accessor inside init/dealloc might cause problems. In the real world I have never run into any of those edge cases in more than 10 years of writing Obj-C: I always use accessors, unless they're too slow.
Some implementation details about how to declare an @property of a C array can be found in the question "Objective-C. Property for C array".
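One practical point with a malloc'ed grid is that freeing it requires knowing the dimensions it was allocated with. A possible way to keep the two together, sketched here in plain C (which also compiles as Objective-C), is a small struct that owns a single contiguous buffer. The Grid type and the grid_* functions are hypothetical names for this illustration, not part of the question's class.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical grid type: the dimensions travel with the buffer. */
typedef struct {
    size_t rows;
    size_t columns;
    bool *cells;   /* rows * columns entries, row-major; NULL if empty */
} Grid;

/* Allocate a zeroed grid (all cells false). On failure, or when either
   dimension is 0, the returned grid is empty and cells is NULL. */
static Grid grid_make(size_t rows, size_t columns) {
    Grid grid = { rows, columns, NULL };
    if (rows != 0 && columns != 0 && rows <= SIZE_MAX / columns) {
        grid.cells = calloc(rows * columns, sizeof(bool));
    }
    if (grid.cells == NULL) {
        grid.rows = 0;
        grid.columns = 0;
    }
    return grid;
}

static bool grid_get(const Grid *grid, size_t row, size_t column) {
    assert(row < grid->rows && column < grid->columns);
    return grid->cells[row * grid->columns + column];
}

static void grid_set(Grid *grid, size_t row, size_t column, bool marked) {
    assert(row < grid->rows && column < grid->columns);
    grid->cells[row * grid->columns + column] = marked;
}

/* One free() releases everything; calling it twice is harmless. */
static void grid_free(Grid *grid) {
    free(grid->cells);
    grid->cells = NULL;
    grid->rows = 0;
    grid->columns = 0;
}
```

Compared with one malloc per row, this needs a single allocation and a single free, at the cost of computing the index by hand. Whether that trade-off is worth it depends on how the grid is used.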
I've got a method that takes an NSDictionary argument. This NSDictionary has some predefined keys it will accept. All the objects should be strings, but only certain strings are valid for each key.
So my approach was to typedef NSString for each valid string per key. I'm hoping not to extend the NSString class.
I've typedef'd some NSString's...
typedef NSString MyStringType;
Then I define a few...
MyStringType * const ValidString = @"aValidString";
Here's what I'd like to do in my sample method..
- (void)setAttributes:(NSDictionary *)attributes {
    NSArray *keys = [attributes allKeys];
    for (NSString *key in keys) {
        if ([key isEqualToString:@"ValidKey"]) {
            id obj = [attributes objectForKey:key];
            // Here's where I'd like to check (pseudocode):
            // if (obj is MyStringType) { ... }
        }
    }
}
I'm open to other ideas if there's a better approach to solve the obj type problem of an NSDictionary.
It doesn't work like that. A typedef is a compile-time alias; at runtime the object is just an NSString, so nothing about MyStringType survives a trip through a dictionary.
In any case, using typedefs for something like this would be unwieldy.
I suggest you create a property list, either as a file in your project or in code, that contains the specifications of your various keys and their valid values. Then write a small validator that, given a key and a value, checks whether that key-value pair is valid.
This also gives you the flexibility to extend your validator in the future. For example, you might have an @"Duration" key whose value must be in the range 1 to 20.
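The table-driven idea can be sketched in a few lines of plain C (which also compiles as Objective-C). This is only an illustration of the shape of such a validator: in a real app the rules would be loaded from the property list, and Rule, kRules, is_valid and the value "anotherValidString" are all invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef enum { RULE_ONE_OF, RULE_INT_RANGE } RuleKind;

/* One rule per key: either a list of allowed strings or an integer range. */
typedef struct {
    const char *key;
    RuleKind kind;
    const char *const *allowed;  /* NULL-terminated, used by RULE_ONE_OF */
    long min, max;               /* inclusive, used by RULE_INT_RANGE */
} Rule;

static const char *const kValidKeyValues[] = {
    "aValidString", "anotherValidString", NULL
};

/* Hypothetical rule table standing in for the property list. */
static const Rule kRules[] = {
    { "ValidKey", RULE_ONE_OF, kValidKeyValues, 0, 0 },
    { "Duration", RULE_INT_RANGE, NULL, 1, 20 },
};

/* Returns true if `value` is acceptable for `key`; unknown keys are rejected. */
static bool is_valid(const char *key, const char *value) {
    assert(key != NULL && value != NULL);
    for (size_t i = 0; i < sizeof kRules / sizeof kRules[0]; ++i) {
        const Rule *rule = &kRules[i];
        if (strcmp(rule->key, key) != 0) {
            continue;
        }
        if (rule->kind == RULE_ONE_OF) {
            for (const char *const *v = rule->allowed; *v != NULL; ++v) {
                if (strcmp(*v, value) == 0) {
                    return true;
                }
            }
            return false;
        }
        /* RULE_INT_RANGE: the whole string must parse as an integer. */
        char *end = NULL;
        long number = strtol(value, &end, 10);
        return end != value && *end == '\0'
            && number >= rule->min && number <= rule->max;
    }
    return false;
}
```

Adding a new key or a new kind of rule then means adding a table entry or a case, rather than a new type.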
Instead of setting up a typedef for your special values, one possible option would be to create an NSSet of the special values. Then in your code you can verify that the object in the dictionary is in your set.
What about a combination of category on NSString + associated object?
Something along the lines (untested!!):
@interface NSString (BBumSpecial)
- (NSString *) setSpecial: (BOOL) special ;
- (BOOL) special ;
@end
and:
#import <objc/runtime.h>

@implementation NSString (BBumSpecial)
static void * key ;
- (NSString *) setSpecial: (BOOL) special {
    // Retain the NSNumber; OBJC_ASSOCIATION_ASSIGN would store it unretained.
    objc_setAssociatedObject(self, &key, special ? @YES : @NO, OBJC_ASSOCIATION_RETAIN_NONATOMIC) ;
    return self ;
}
- (BOOL) special {
    id obj = objc_getAssociatedObject(self, &key) ;
    return obj && [obj boolValue] ;
}
@end
Which you could then use as:
NSString * mySpecialString = [@"I'm Special" setSpecial:YES] ;
I ran into an issue where I got the same hash value for different dictionaries. Maybe I'm doing something obviously wrong, but I thought that objects with different content (that is, objects that are not equal) should have different hash values.
NSDictionary *dictA = @{ @"foo" : @YES };
NSDictionary *dictB = @{ @"foo" : @NO };
BOOL equal = [dictA hash] == [dictB hash];
NSAssert(!equal, @"Assuming, that different dictionaries have different hash values.");
Any thoughts?
There is no guarantee that two different objects will have different hash values.
In the latest open-source version of CoreFoundation, the hash of a CFDictionary (which is equivalent to an NSDictionary) is defined as:
static CFHashCode __CFDictionaryHash(CFTypeRef cf) {
return __CFBasicHashHash((CFBasicHashRef)cf);
}
and __CFBasicHashHash is defined as:
__private_extern__ CFHashCode __CFBasicHashHash(CFTypeRef cf) {
CFBasicHashRef ht = (CFBasicHashRef)cf;
return CFBasicHashGetCount(ht);
}
which is simply the number of entries stored in the collection. In other words, [dictA hash] and [dictB hash] are both 1, the number of entries in each dictionary.
While this is a very weak hash algorithm, CF didn't do anything wrong here: equal objects must have equal hashes, but unequal objects are allowed to share one. If you need a more discriminating hash value, you can provide one yourself in an Obj-C category.
For a dictionary that contains only integers, strings, etc., I would use dict.description.hash as a quick hash code.
A solution based on igor-kulagin's answer which is not order dependent:
@implementation NSDictionary (Extensions)
- (NSUInteger) hash
{
    NSUInteger prime = 31;
    NSUInteger result = 1;
    for (NSObject *key in [[self allKeys] sortedArrayUsingSelector:@selector(compare:)]) {
        result = prime * result + [key hash];
        result = prime * result + [self[key] hash];
    }
    return result;
}
@end
However, there is still a possibility of collision if the dictionary contains other dictionaries as values.
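Sorting the keys is one way to make the result independent of enumeration order, but it requires every key to respond to compare:. A different technique, not taken from the answers here, is to mix each entry on its own and then combine the per-entry values with a commutative operation such as addition. The sketch below shows the idea in plain C (which also compiles as Objective-C); EntryHash and unordered_hash are hypothetical names, and the per-entry hashes are assumed to have been computed already.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pair of precomputed hashes for one dictionary entry. */
typedef struct {
    uint64_t key_hash;
    uint64_t value_hash;
} EntryHash;

/* Order-independent hash: each entry is mixed separately, and the results
   are summed. Addition is commutative, so the order in which the entries
   are visited does not matter. Unsigned overflow simply wraps around. */
static uint64_t unordered_hash(const EntryHash *entries, size_t count) {
    assert(entries != NULL || count == 0);
    uint64_t result = 0;
    for (size_t i = 0; i < count; ++i) {
        uint64_t entry = 31u * entries[i].key_hash + entries[i].value_hash;
        result += entry;
    }
    return result;
}
```

The per-entry mix here is deliberately simple and still collides easily (for example, key hash 1 with value hash 41 gives the same entry value as key hash 2 with value hash 10), so treat it as an illustration of order independence rather than a recommended hash.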
The 'hash' method is not a real hash function. It gives different values for different strings (all predictable), but for collections (arrays and dictionaries) it just returns the count. If you want a hash that is far less likely to collide, you can calculate it yourself using primes, or the functions srandom() and random(), or explore a real hash function like SHA1, available in CommonCrypto/CommonDigest.h.
I created an NSDictionary category and overrode the hash method, based on this answer: Best practices for overriding isEqual: and hash
@implementation NSDictionary (Extensions)
- (NSUInteger) hash {
    NSUInteger prime = 31;
    NSUInteger result = 1;
    NSArray *sortedKeys = [self.allKeys sortedArrayUsingSelector: @selector(compare:)];
    for (NSObject *key in sortedKeys) {
        result = prime * result + key.hash;
        id value = self[key];
        if ([value conformsToProtocol: @protocol(NSObject)] == YES) {
            result = prime * result + [value hash];
        }
    }
    return result;
}
@end
And a Swift implementation:
extension Dictionary where Key: Comparable, Value: Hashable {
    public var hashValue: Int {
        let prime = 31
        var result = 1
        let sortedKeys = self.keys.sorted()
        for key in sortedKeys {
            let value = self[key]!
            // &* and &+ wrap on overflow instead of trapping
            result = prime &* result &+ key.hashValue
            result = prime &* result &+ value.hashValue
        }
        return result
    }
}
Ideally, you would also implement the Equatable protocol for Dictionary, so that you can add Hashable conformance as well.
For example, the object is something like this:
@interface MyUser : NSObject {
    NSString *firstName;
    NSString *lastName;
    NSString *gender;
    int age;
}
@end
and I would like to compare two users: if their attributes are the same, I will treat them as equal. Instead of writing a method that compares the attributes one by one, is there a lazy way to have all the attributes compared automatically? Thanks.
For comparison, this is what you're trying to avoid writing.
-(NSUInteger)hash {
    return [firstName hash] ^ [lastName hash] ^ [gender hash] ^ age;
}
-(BOOL)isEqual:(id)other {
    if (![other isKindOfClass:[self class]]) {
        return NO;
    }
    // Cast so the compiler knows which properties are being accessed.
    MyUser *user = (MyUser *)other;
    return age == user.age
        && [gender isEqualToString:user.gender]
        && [firstName isEqualToString:user.firstName]
        && [lastName isEqualToString:user.lastName];
}
Using XOR is an extremely simple way of combining hashes, and I mostly include it as a stand-in. It may hurt the quality of the hash value, depending on distribution of the underlying hash functions. If the hashes have a uniform distribution, it should be all right. Note also that combining hashes only works because NSStrings that are equal in content have the same hashes. This approach won't work with all types; in particular, it won't work with types that use the default implementation of hash.
To get around writing hash and isEqual: by hand as above, first change the type of the age property to NSNumber, so it doesn't have to be handled as a special case. You don't have to change the ivar, though you can if you want.
@interface MyUser : NSObject {
...
unsigned int age; // Or just make this an NSNumber*
}
...
@property (assign,nonatomic) NSNumber *age;
@implementation MyUser
@synthesize firstName, lastName, gender;
/* if the age ivar is an NSNumber*, the age property can be synthesized
instead of explicitly defining accessors.
*/
@dynamic age;
-(NSNumber*)age {
return [NSNumber numberWithUnsignedInt:age];
}
-(void)setAge:(NSNumber*)newAge {
age = [newAge unsignedIntValue];
}
Second, make sure your class supports the fast enumeration protocol. If it doesn't, you can implement -countByEnumeratingWithState:objects:count: by making use of reflection (with the Objective-C runtime functions) to get the list of properties for instances of your class. For example (taken in part from "Implementing countByEnumeratingWithState:objects:count:" on Cocoa With Love):
#import <objc/runtime.h>
...
@interface MyUser (NSFastEnumeration) <NSFastEnumeration>
-(NSUInteger)countByEnumeratingWithState:(NSFastEnumerationState *)state objects:(id *)stackbuf count:(NSUInteger)len;
@end
@implementation MyUser
@synthesize firstName, lastName, gender;
/* defined in the main implementation rather than a category, since there
can be only one +[MyUser initialize].
*/
static NSString **propertyNames=0;
static unsigned int cProperties=0;
+(void)initialize {
    // +initialize also runs for subclasses that don't override it.
    if (self != [MyUser class]) {
        return;
    }
    unsigned int i;
    const char *propertyName;
    objc_property_t *properties = class_copyPropertyList([self class], &cProperties);
    if ((propertyNames = malloc(cProperties * sizeof(*propertyNames)))) {
        for (i=0; i < cProperties; ++i) {
            propertyName = property_getName(properties[i]);
            propertyNames[i] = [[NSString alloc]
                initWithCString:propertyName
                encoding:NSASCIIStringEncoding];
        }
    } else {
        cProperties = 0;
        // Can't initialize property names. Fast enumeration won't work. What do?
    }
    free(properties); // class_copyPropertyList returns a malloc'ed array
}
...
@end
@implementation MyUser (NSFastEnumeration)
-(NSUInteger)
countByEnumeratingWithState:(NSFastEnumerationState *)state
objects:(id *)stackbuf
count:(NSUInteger)len
{
if (state->state >= cProperties) {
return 0;
}
state->itemsPtr = propertyNames;
state->state = cProperties;
state->mutationsPtr = (unsigned long *)self;
return cProperties;
}
@end
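Last, implement hash (using fast enumeration) and isEqual:. hash should calculate the hashes of all properties, then combine them to create the hash for the MyUser instance. isEqual: should check that the other object is an instance of MyUser (or a subclass thereof) and then compare the property values; comparing only the hashes would treat any two users whose hashes happen to collide as equal. For example:
-(NSUInteger)hash {
    NSUInteger myHash=0;
    for (NSString *property in self) {
        // Note: extremely simple way of combining hashes. Will likely lead
        // to bugs
        myHash ^= [[self valueForKey:property] hash];
    }
    return myHash;
}
-(BOOL)isEqual:(id)other {
    if (![other isKindOfClass:[self class]]) {
        return NO;
    }
    for (NSString *property in self) {
        id mine = [self valueForKey:property];
        id theirs = [other valueForKey:property];
        // Two nil values count as equal; otherwise defer to isEqual:.
        if (mine != theirs && ![mine isEqual:theirs]) {
            return NO;
        }
    }
    return YES;
}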
Now, ask yourself which is less work overall. If you want a single approach that will work for all your classes, it might be the second (with some changes, such as turning +initialize into a class method on NSObject that returns the property name array and length), but in all likelihood the former is the winner.
There's a danger in both of the above hash implementations with calculating the hash based on property values. From Apple's documentation on hash:
If a mutable object is added to a collection that uses hash values to determine the object’s position in the collection, the value returned by the hash method of the object must not change while the object is in the collection. Therefore, either the hash method must not rely on any of the object’s internal state information or you must make sure the object’s internal state information does not change while the object is in the collection.
Since you want isEqual: to be true whenever two objects have the same property values, the hashing scheme must depend directly or indirectly on the object's state, so there's no getting around this danger.