NSDictionary lookups with +valueWithPointer: keys are too slow - objective-c

I have an Objective C++ class, instances of which are required to store an arbitrary set of C++ objects and associate each with a corresponding Objective C object. Looking up the Objective C objects when given the C++ object is killing my performance, so I'm looking for a better solution.
I'm currently storing the pairs in an NSMutableDictionary after creating the keys using [NSValue valueWithPointer:]. The lookup time, in which +valueWithPointer: is about twice as expensive as -objectForKey:, is simply too slow.
The C++ objects are in a third-party framework, and do not provide any unique identifier.
The sets of C++ objects are always smaller than a dozen elements.
What is a faster approach to performing these lookups?

I see three approaches that seem worth trying:
Use NSMapTable
Use objc_setAssociatedObject
Use std::unordered_map or std::map
objc_setAssociatedObject uses std::unordered_map behind the scenes.
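For the third option, a minimal C++ sketch might look like the following. `HandleMap`, `ThirdPartyObject` and the `void*` handle are names invented for illustration, not part of any framework; in an Objective-C++ file the mapped value would be the Objective-C object itself rather than a `void*`. Keying on the raw pointer avoids creating an NSValue for every lookup.

```cpp
#include <unordered_map>

// Hypothetical stand-in for the third-party C++ type; only its address is used.
struct ThirdPartyObject { int payload; };

// Maps the address of a C++ object to an opaque handle.
// In Objective-C++ the mapped type would be the Objective-C object itself.
class HandleMap {
public:
    void set(const ThirdPartyObject* key, void* handle) { table_[key] = handle; }

    // Returns the associated handle, or nullptr if the key is unknown.
    void* get(const ThirdPartyObject* key) const {
        auto it = table_.find(key);
        return it == table_.end() ? nullptr : it->second;
    }

    void remove(const ThirdPartyObject* key) { table_.erase(key); }

private:
    std::unordered_map<const ThirdPartyObject*, void*> table_;
};
```

With fewer than a dozen entries, a linear scan over a small array of pairs may well be just as fast; only measurement will tell.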

I had a similar issue recently with [NSValue valueWithNonRetainedObject:] being too slow.
The solution I went with was to drop down a level and use CoreFoundation's CFMutableDictionary. I would suggest you take a look at it: CFMutableDictionary Reference. CFDictionaryAddValue() and CFDictionaryRemoveValue() take regular C pointers, so it's the best thing to use to interface Obj-C and C++ (C is their common denominator).
The reason I'd rather do that than use an std::unordered_map is that I tend to want to minimise C++ in this kind of thing. Obj-C++ is a bit of a hack and it's best to just reduce it to glue code between real Obj-C and existing C++ code.

Related

objective-c complexity reference

For the C++ STL, there is a de facto standard location (besides the de jure standard, I mean) to find information about the complexity guarantees of standard container operations.
Is there an analogous, web-accessible document listing complexity guarantees for NSArray, NSDictionary, etc.?
For example, I cannot find a reference that gives complexity for [NSArray count]
Correct. There isn't one. C++ / the STL (based on my limited understanding) have a significant performance focus. Objective-C / Foundation basically don't.
NSArray, NSDictionary and friends are interfaces. They tell you how to use them, not how they behave. This gives them the freedom to switch implementation under the hood for performance reasons. The point is, you don't need to care, and this won't be specified in the API so you can't even if you want to ;)
For a really good read on this subject, highlighting implementation switches and giving a rough comparison between Foundation classes and STL / C data structures, check out the Ridiculous Fish blog post (by someone on the Apple AppKit team) about "Our arrays, aren't".
"Is there an analogous, web-accessible document listing complexity guarantees for NSArray, NSDictionary, etc.?"
No. If you understand what the different containers do, you'll have a pretty good idea of how they behave (e.g. dictionary == map -> nearly constant-time lookups). But don't assume that you know exactly how these structures behave, because they may change their behavior based on circumstances. In other words, a class like NSArray may not be (certainly isn't) implemented as an actual array in the sense of a C-style array even though it has that same "ordered sequence of elements" behavior.
You can, of course, analyze the complexity of your own code: your own binary search through an NSArray is always going to take O(log n) operations any way you slice it. Just don't assume that inserting an element into an NSMutableArray is going to require moving all the subsequent elements, because your "array" might really be a linked list or something else.
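As a rough illustration of analysing your own code, here is a binary search that counts its probes. It is written in C++ with `std::vector` standing in for a sorted NSArray (an assumption made so the sketch is self-contained); `binary_search_probes` is a name made up for this example. The number of element lookups is bounded by the algorithm, whatever each individual lookup costs in the container.

```cpp
#include <cstddef>
#include <vector>

// Classic binary search over a sorted vector.
// Returns the index of target, or -1 if it is absent.
// probes is set to the number of elements that were inspected.
long binary_search_probes(const std::vector<int>& sorted, int target, int& probes) {
    probes = 0;
    std::size_t lo = 0, hi = sorted.size();   // search the half-open range [lo, hi)
    while (lo < hi) {
        std::size_t mid = lo + (hi - lo) / 2;
        ++probes;
        if (sorted[mid] == target) return static_cast<long>(mid);
        if (sorted[mid] < target) lo = mid + 1;   // discard the lower half
        else hi = mid;                            // discard the upper half
    }
    return -1;
}
```

For 1024 elements the loop inspects at most 11 of them, since the range at least halves on every pass.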

Objective-C vs. C speed

This is probably a naive question here but I'll ask it anyway.
I'm working with Core Audio (C API) on iOS and am mixing C with Objective-C. My class has the .mm extension and everything is working so far.
I've read in different places about Objective-C being slow (without much detail given - and I am not making any declaration that it is). I understand about not calling Objective-C from a Core Audio render callback, etc. and the reasons why.
On the other hand, I need to call in to the class that handles the Core Audio stuff from my GUI in order to make various adjustments at runtime. There would be some walking of arrays, mostly, shifting data around that is used by Core Audio. Would there be any benefit speed-wise from writing my functions in C and storing my variables in, say, vectors rather than NSMutableArrays?
I've only been working with Objective-C/iOS for a few months so I don't have any perspective on this.
Objective-C is slightly slower than straight C function calls because of the lookups involved in its dynamic nature. I'll edit this answer with more detail on how it works later if nobody else adds in the detail.
However, more importantly, you are optimizing prematurely. There's a VERY high chance that the extra overhead of Objective-C will have zero noticeable impact on your application's performance.
Take advantage of Objective-C's strengths to design the best written, most object-oriented application possible. If, and only if, testing shows performance problems, optimize those particular areas of the application.
The main performance hit with Objective-C is in the work required to dispatch a method invocation. Objective-C is dynamically bound, which means that the object receiving the message (selector) decides what to do with it at run time. This is implemented with a hash table: the selector is hashed (at compile time, I think) and mapped to the method that gets invoked, and that lookup takes time.
Having said that, the method lookup, which happens in objc_msgSend(), is highly optimised. In fact, it is hand-crafted in assembler. I've heard it said that the overhead compared to a C function call is about 20 machine instructions. Normally this is not a big deal, but if you are running through a 100,000-element NSArray, looking up each element with -objectAtIndex:, that becomes quite a bit of overhead.
In almost every case, however, the extra flexibility and functionality is worth the cost. This is why wadersworld's answer contains fine advice.
Bill Bumgarner has written an awesome set of articles on objc_msgSend().
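The selector-to-implementation lookup described in this answer can be modelled very roughly as below. This is a toy in C++, not the runtime's actual data structures: `Class`, `Object`, `Method`, `msg_send` and `add_to_value` are all invented for the sketch, and the real objc_msgSend is considerably more elaborate.

```cpp
#include <string>
#include <unordered_map>

struct Object;                                   // forward declaration
using Method = int (*)(Object* self, int arg);   // toy implementation pointer

// Toy class record: a table from selector name to implementation.
struct Class {
    std::unordered_map<std::string, Method> methods;
};

struct Object {
    const Class* isa;   // each object points at its class
    int value;
};

// Toy analogue of message sending: look the selector up at run time, then call it.
// Returns 0 for an unknown selector (an assumption made to keep the toy small).
int msg_send(Object* receiver, const std::string& selector, int arg) {
    auto it = receiver->isa->methods.find(selector);
    if (it == receiver->isa->methods.end()) return 0;
    return it->second(receiver, arg);
}

// An example method implementation.
int add_to_value(Object* self, int arg) { return self->value + arg; }
```

The extra cost over a direct call to `add_to_value` is the table lookup inside `msg_send`, which is the overhead the answer is describing.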
While other answers have noted that dynamic method dispatch (objc_msgSend), being hand-tuned assembly, adds only about 20 machine instructions, there's another possible cause of poorer performance in Objective-C as compared to C: Objective-C has a richer Foundation library, and its conveniences are not free.
One such performance comparison had a game generating terrain, as follows:
The pure C version gave 60 fps
The objective-C version gave 39 fps
The reason for the slowdown was that the NSMutableArray being used includes all kinds of safety checks and is able to grow and shrink to the required size, whereas the C array was fixed-size: go ahead and write beyond the bounds if you want, just be ready for bad things to happen.
Fortunately, as others have said, it's very easy to do performance analysis later and swap in some pure C code in the places where it will count.
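The trade-off between a fixed-size raw array and a checked, growable container can be sketched in C++ as follows. `std::vector` with `at()` is used here as a stand-in for NSMutableArray, which is an analogy rather than a description of how NSMutableArray works; the function names and `kCells` are made up for the example.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

const std::size_t kCells = 8;

// Fixed-size C array: no bounds checks and no resizing.
// Writing past kCells would be undefined behaviour.
int sum_raw() {
    int cells[kCells];
    for (std::size_t i = 0; i < kCells; ++i) cells[i] = static_cast<int>(i);
    int total = 0;
    for (std::size_t i = 0; i < kCells; ++i) total += cells[i];
    return total;
}

// Growable, checked container: push_back may reallocate,
// and at() validates every index before reading.
int sum_checked() {
    std::vector<int> cells;
    for (std::size_t i = 0; i < kCells; ++i) cells.push_back(static_cast<int>(i));
    int total = 0;
    for (std::size_t i = 0; i < kCells; ++i) total += cells.at(i);
    return total;
}

// Returns true if an out-of-range read is caught rather than silently performed.
bool checked_access_throws() {
    std::vector<int> cells(kCells, 0);
    try {
        (void)cells.at(kCells);   // one past the end
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```

Both functions compute the same sum; the checked version simply does more work per element in exchange for safety and flexibility.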
Slow is relative.
Objective C messaging is slow relative to accessing lots of small data type elements (every pixel in a large image bitmap or every audio sample in an entire song) inside innermost loops. Objective C is really fast relative to doing anything at the rate of UI or even display refresh events.
For handling Core Audio raw samples, stick with using C. For handling Core Audio events related to UI (stop, start, properties, etc.), encapsulating them in Objective C won't make any measurable speed difference.
Objective-C is not slow; it is literally C with objects.
A class in Objective-C consists of a few different things:
A map of selectors to functions (method implementations)
A map of names to types (instance variables)
A map of names to types & functions (properties)
So, Objective-C will be just about as fast as calling the raw C functions yourself, with a little bit of overhead for looking up a function.
Objective-C is about as fast as C, because Objective-C code is ultimately expressed in terms of C structures and function pointers. The original implementations were preprocessors that emitted plain C.
Objective-C is a way of doing object-oriented programming in C. All of its object-oriented features (which it borrows from Smalltalk) are built using C.
In fact, we can define an object in C using a structure (which holds the instance variables of a class) plus related functions that manipulate that data. Message passing, that is, calling a method on an object, is done with the function
objc_msgSend(receiver, selector, arg1, arg2, ...)
That is plain C. Conceptually, compiling Objective-C code means translating it into C constructs like this one and then compiling those. The speed difference between C and Objective-C comes down to the cost of this dispatch.
It all depends on what you are doing. Using Core Audio, 99% of your execution time should be spent in library functions anyway. Now if you do something stupid (take a second's worth of samples, turn each into an NSNumber, store them in an NSMutableArray, and do a hand-written FFT with calls of [[myArray objectAtIndex:i] doubleValue]), you get what you deserve. The slowest iPhone can do quite a few method calls per microsecond.
Whether you use a C function or an Objective-C method doesn't make a difference. The only difference is how many Objective-C methods you call. Lots of tiny Objective-C methods called a million times is a lot of overhead. And there is no law that forbids the use of C arrays in Objective-C code.
The rule for speeding up things: Use Instruments. Measure the execution time. Pick where the execution time is high, speed things up, measure again. And most of the time you don't get speedup by replacing good code with better code, but by replacing massively stupid code with reasonably good code.

NSSet implementation

This question is just out of curiosity but, how is NSSet implemented? What data structure is behind it and what are the access times for adding and removing elements? If I had to guess, I'd say it was some sort of hashtable/dictionary data structure, but in that case why differentiate between NSSet and NSMutableSet?
Well, as Bavarious pointed out in a comment, Apple's actual CoreFoundation source is open and available for your perusal too. NSSet is implemented on top of CFSet, whose code is generated (as is that of CFDictionary) from a hash table template, using CFBasicHash to do the work.
The difference between mutability and immutability seems to be the matter of a flag in the structure (line 91 of CFBasicHash.h), and from my reading so far just affects calls to functions such as CFBasicHashAddValue; there's a simple check for the mutability. It seems likely, however, that Cobbal is right about the copy/retain behavior between the two (I just haven't read that far yet).
PREVIOUSLY:
I find it interesting and educational occasionally to peruse the GNUstep sources when I'm wondering about implementation details. They are, of course, not at all guaranteed to be implemented the way that Apple did it, but they can be helpful in some cases. Their version of Foundation: http://gnu.ethz.ch/debian/gnustep/gnustep-base-1.20.0/Headers/Foundation/ (I hope that's the most recent version. If not, someone please correct me.)
To answer the second half of your question: one benefit of having a non-mutable version is that it allows for a very fast copy method that simply calls retain.
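The idea that copying an immutable collection can be reduced to a retain can be modelled in C++ with reference counting. This is only an analogy: `ImmutableSet`, `copy_immutable` and `copy_mutable` are names invented here, and `std::shared_ptr` stands in for retain/release.

```cpp
#include <memory>
#include <set>

using ImmutableSet = std::shared_ptr<const std::set<int>>;

// "Copying" an immutable set: share the storage and bump the reference count.
// Nobody can change the contents, so sharing is safe.
ImmutableSet copy_immutable(const ImmutableSet& original) {
    return original;
}

// Copying a mutable set: the elements must be duplicated so that later
// changes to the original are not visible through the copy.
std::shared_ptr<std::set<int>> copy_mutable(const std::set<int>& original) {
    return std::make_shared<std::set<int>>(original);
}
```

The immutable copy costs one reference-count increment regardless of size, whereas the mutable copy is proportional to the number of elements.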
I find this link to be an interesting answer to your question. Apple's data structures (NSArray, NSSet, NSDictionary, etc.) are not implemented in a straightforward and "standard way." In most cases, they perform in the same way any other set would perform, but overall, they optimize automatically for the best performance. So, in truth, it's rather difficult to say. While Apple provides documentation on the efficiency of arrays in CFArray.h (equivalent for NSArrays), it offers no such documentation on the efficiency of sets, though you're free to poke around /System/Library/Frameworks/CoreFoundation.framework/Headers/ to look through other data structure implementations.
In addition, there has to be a distinction between a set and its mutable counterpart, just as there is a distinction between NSString and NSMutableString, NSArray and NSMutableArray, and NSDictionary and NSMutableDictionary (among others). For data structures and strings (and few other classes), Apple offers 'readonly' versions of classes to retain generality, along with standard 'mutable' counterparts for manipulation. It's simply Apple's standard practice.

Is it possible to replace malloc on iOS?

I'd like to use a custom malloc and free for some allocations in an iOS app, including those made by classes like NSMutableData.
Is this possible?
If so, how do I do it?
What I'd actually like to do is zero out certain data after I've used it, in order to guarantee forward security (in case the device is lost or stolen) as much as possible. If there's an easier way to do this that doesn't involve replacing malloc then that's great.
I believe I need to replace malloc in order to do this because the sensitive data is stored in the keychain --- and I have no option other than to use NSDictionary, NSString and NSData in order to access this data (I can't even use the mutable versions).
Instead of overwriting generic memory management functions you can use custom allocators on the sensitive objects.
The keychain services API is written in C and uses Core Foundation objects, like CFDictionary, CFData and CFString. While it's true that these objects are "toll free" bridged to their Objective-C counterparts and are usually interchangeable they have some abilities not available from Objective-C. One of these features is using custom allocators.
CFDictionaryCreate, for example, takes an argument of type CFAllocatorRef, which in turn can be created using CFAllocatorCreate. The allocator holds pointers to functions for allocation and deallocation, among others. You can use custom functions to overwrite the sensitive data.
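The same idea, an allocator whose deallocation hook wipes the block before releasing it, can be sketched in plain C++. This is an analogue of what a custom Core Foundation allocator's deallocate callback might do, not Core Foundation code; `secure_zero`, `ZeroingAllocator`, `g_bytes_wiped` and `SecretBytes` are names invented for the sketch.

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Overwrite a buffer with zeroes through a volatile pointer, so the
// compiler is less likely to discard the stores as dead writes.
void secure_zero(void* p, std::size_t n) {
    volatile unsigned char* bytes = static_cast<volatile unsigned char*>(p);
    for (std::size_t i = 0; i < n; ++i) bytes[i] = 0;
}

// Illustration only: counts how many bytes have been wiped so far.
std::size_t g_bytes_wiped = 0;

// Minimal C++ allocator that wipes every block before releasing it.
template <typename T>
struct ZeroingAllocator {
    using value_type = T;

    ZeroingAllocator() = default;
    template <typename U>
    ZeroingAllocator(const ZeroingAllocator<U>&) {}

    T* allocate(std::size_t n) {
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }

    void deallocate(T* p, std::size_t n) {
        secure_zero(p, n * sizeof(T));     // wipe first
        g_bytes_wiped += n * sizeof(T);
        ::operator delete(p);              // then release
    }
};

template <typename T, typename U>
bool operator==(const ZeroingAllocator<T>&, const ZeroingAllocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const ZeroingAllocator<T>&, const ZeroingAllocator<U>&) { return false; }

// A byte buffer whose storage is zeroed when it is freed or reallocated.
using SecretBytes = std::vector<unsigned char, ZeroingAllocator<unsigned char>>;
```

One difference to keep in mind: a C++ allocator is told the block size on deallocation, whereas the Core Foundation deallocate callback receives only the pointer, so a Core Foundation version would have to record each block's size itself.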
Why do you need to go so low-level about it? I'd just overwrite the data in the NSMutableData instance with zeroes instead. If you really need to mess with malloc - I'd probably write a category on NSObject and override the memory-handling functions.
Disclaimer: I have no iOS experience, but I understand that it uses GCC. Assuming that is correct...
I have done this, albeit with GCC on the PlayStation3. I don't know how much of this is transferable to your case. I used the GCC objcopy utility with --weaken-symbol. (You may need to use nm to list the symbols in your library.)
Once you've "weakened" the library's malloc, you just write your own, which is then used instead of the original when linked (rather than giving you a link error). To delegate to the original you may have to give it another name somehow (can't remember -- presumably doable with one of the binutils or else there's both a malloc and a _malloc in the library -- sorry, it's been a while.)
Hope that helps.
I'd encourage you to use the Objective-C memory management system based on ownership (retain/release). Memory Management Programming Guide
Another option would be to use C structures with C memory management rules like malloc.
NSMutableData methods like dataWithBytes:length: use calloc / bzero internally already. Is that good enough for you?

Managing collections of tuples in Objective-C

I am fairly new to Objective-C and was wondering what the best way to manage collections of tuples was. In C I would use a 2D array or a struct.
Should I be creating objects to contain these tuples? It seems like overkill for just sorting lists or is there no real extra load generated by object initialisation?
There definitely is some overhead in the generation of objects. For a small number of objects, using ObjC data structures is still appropriate. If you have a large number of tuples, I would manage them in a C array of structs. Remember, Objective-C is really just C. It is appropriate and common to use C constructs in Objective-C (to a point; learning where that point is represents a major milestone in becoming a good Objective-C developer).
Typically for this kind of data structure, I would probably create a single Objective-C object that managed the entire collection. So external callers would see an Objective-C interface, but the internal data would be stored in a more efficient C structure.
If it is common to access a lot of tuples quickly, my collection object would probably provide "get" methods similar to -[NSArray getObjects:range:]. ObjC methods that begin with "get" indicate that the first parameter is a pointer that will be overwritten by the method. This is commonly used for high-performance C-like access to things managed by an ObjC object.
This kind of data structure is exactly the way ObjC developers merge the elegance and maintainability of ObjC with the performance and simplicity of C.
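A rough sketch of that design, written as a C++ class so it is self-contained: one object owns a contiguous array of structs and exposes a small interface, including a "get"-style bulk accessor. In a real project this would be an Objective-C class wrapping the same storage; `Pair`, `PairStore` and its method names are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// One tuple, stored by value.
struct Pair {
    int key;
    double value;
};

// Owns a contiguous array of tuples and hides it behind a small interface.
class PairStore {
public:
    void add(int key, double value) { items_.push_back(Pair{key, value}); }

    std::size_t count() const { return items_.size(); }

    // "get"-style bulk accessor: copies up to length tuples, starting at
    // location, into the caller's buffer and returns how many were copied.
    std::size_t getPairs(Pair* out, std::size_t location, std::size_t length) const {
        if (location >= items_.size()) return 0;
        std::size_t n = std::min(length, items_.size() - location);
        std::copy(items_.data() + location, items_.data() + location + n, out);
        return n;
    }

    // Sort in place by key; no per-tuple objects are allocated.
    void sortByKey() {
        std::sort(items_.begin(), items_.end(),
                  [](const Pair& a, const Pair& b) { return a.key < b.key; });
    }

private:
    std::vector<Pair> items_;
};
```

Callers see only the interface, so the storage could later be swapped for something else without touching them.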
I think you'll have to settle for an NSArray of NSArray objects, or maybe an NSArray of NSDictionary objects. You can always roll your own class, or do it the way you would do in C.
There are a couple different ways you could go about this:
1. CoreData. While it's not technically a database, it can behave a lot like one. If you don't need persistence between app runs, then consider using the NSInMemoryStoreType store type, as opposed to an NSSQLiteStoreType or other option. However, if you're going to want to join tuples together, using CoreData will absolutely not work (this, IMO, is the main reason why CoreData is not a database).
2. Use a real database. SQLite ships on every Mac and iPhone and is pretty easy to use if you use wrappers like FMDB or SQLite Persistent Objects or PLDatabase or EGODatabase or the GTMSQLite wrapper by Google.
3. A tuple is really just a collection of key-value pairs, so you could just use an NSMutableArray of NSMutableDictionaries. You obviously won't get to use SQL syntax, and any joins/queries you have to run yourself, but this would definitely have the easiest setup.
4. Write a tuple class and store those in an NSMutableArray (similar to #3, just enforcing a common set of attributes on your tuples).