Is enumerateObjectsUsingBlock: faster than a for-in loop? Why? [duplicate] - objective-c

This question already has answers here:
Objective-C enumerateObjectsUsingBlock vs fast enumeration?
(2 answers)
Closed 8 years ago.
I was reading the NSHipster article on enumeration, which claims that for-in loops are faster than enumerateObjectsUsingBlock::
Unless you actually need the numerical index while iterating, it's almost always faster to use a for/in NSFastEnumeration loop instead.
This answer provides some rebuttal for that quote:
Fast enumeration requires translation from an internal representation to the representation for fast enumeration. There is overhead therein. Block-based enumeration allows the collection class to enumerate contents as quickly as the fastest traversal of the native storage format.
What is the translation process to move from the internal representation to the representation for fast enumeration? I understand that there is some overhead there, but how much?

The real answer: no difference that matters to pretty much any real-world program. Don't worry about it until you find an actual issue during performance quantification. And if the speed of execution of a loop matters, then your overall app architecture is likely the bug.
With that said, there is certainly some academic curiosity worth pursuing.
See:
Objective-C enumerateObjectsUsingBlock vs fast enumeration?
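As for the "translation" the quoted answer mentions: NSFastEnumeration is a batching protocol. Each call, the collection either copies a batch of object pointers into a caller-supplied stack buffer, or points the caller directly at its own contiguous storage. The overhead is whatever work that batch call has to do. A rough model of the contract, sketched in plain C (the struct and function names here are invented for illustration; they are not the real Foundation API):

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for NSFastEnumerationState: the collection either
 * copies items into the caller's stack buffer, or points directly at its
 * own contiguous storage when it has some (the zero-copy fast path). */
typedef struct {
    size_t cursor;          /* how far the enumeration has progressed */
    const int *items_ptr;   /* where the caller should read items from */
} enum_state;

/* One "batch" call: returns the number of items made available. */
static size_t next_batch(const int *storage, size_t count,
                         enum_state *state, int *stack_buf, size_t buf_len) {
    if (state->cursor >= count) return 0;        /* enumeration finished */
    /* A contiguous collection can expose its storage directly: no copying. */
    state->items_ptr = storage + state->cursor;
    size_t batch = count - state->cursor;
    if (batch > buf_len) batch = buf_len;        /* batches are bounded */
    state->cursor += batch;
    (void)stack_buf;  /* a non-contiguous collection would copy into this */
    return batch;
}

/* What the compiler-generated for-in loop boils down to: an outer loop
 * fetching batches, an inner loop walking each batch. */
static long sum_all(const int *storage, size_t count) {
    enum_state state = {0};
    int buf[16];
    long total = 0;
    size_t n;
    while ((n = next_batch(storage, count, &state, buf, 16)) > 0)
        for (size_t i = 0; i < n; i++)
            total += state.items_ptr[i];
    return total;
}
```

For a flat array the translation is nearly free: hand out a pointer and advance a cursor. For a hash- or tree-backed collection, each batch call must walk internal nodes and copy pointers into the buffer, and that copying is the overhead the quoted answer is talking about; block enumeration can instead walk the native storage directly.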


What is the advantage of implementing NSMutableArray as a 2-3 tree?

I read elsewhere that NSMutableArray is implemented several different ways, depending on size, and is sometimes implemented as a 2-3 tree instead of just an array in memory. The implication is that doing things like removing an object is not as expensive as having to shift the entire tail end of the array over by one.
I was wondering if there was a quick summary of the implementation, so that I don't have to dig through the source code.
How is the array ordered?
Is the root node the middle of the array?
How does the array find the nth element?
Are there other advantages to implementing the array as a 2-3 tree other than fast removal of objects near the front of the array?
Searching the interwebs, I couldn't find anything that discussed implementing a mutable array as a tree.
Edit: Short Answer: it appears that NSMutableArray is implemented as a circular buffer, so the question makes no sense. If there is an advantage to implementing an array as a 2-3 tree, I would still like to know it.
Bartosz Ciechanowski has your answer in incredible detail at his blog post "Exposing NSMutableArray"
tl;dr: It's a "circular buffer:"
Data Structure
As you might have guessed, __NSArrayM makes use of a circular buffer. This data structure is extremely simple, but a little bit more sophisticated than a regular array/buffer. The contents of a circular buffer can wrap around when either end is reached.
A circular buffer has some very cool properties. Notably, unless the buffer is full, insertion/deletion at either end doesn't require any memory to be moved. Let's analyze how the class utilizes a circular buffer to be superior in its behavior in comparison to a C array.
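The O(1) insert/delete at either end falls out of the index arithmetic. A minimal sketch of the idea in plain C (not Apple's actual implementation; a real __NSArrayM also reallocates and grows its buffer, and stores object pointers rather than ints):

```c
#include <assert.h>
#include <stddef.h>

#define CAP 8  /* fixed capacity for the sketch; real buffers reallocate */

typedef struct {
    int items[CAP];
    size_t head;   /* index of the first element */
    size_t count;
} ring;

/* Both ends wrap around modulo the capacity, so neither push nor pop
 * ever needs to memmove the existing elements. */
static void push_front(ring *r, int v) {
    assert(r->count < CAP);
    r->head = (r->head + CAP - 1) % CAP;   /* step back, wrapping */
    r->items[r->head] = v;
    r->count++;
}

static void push_back(ring *r, int v) {
    assert(r->count < CAP);
    r->items[(r->head + r->count) % CAP] = v;
    r->count++;
}

static int pop_front(ring *r) {
    assert(r->count > 0);
    int v = r->items[r->head];
    r->head = (r->head + 1) % CAP;         /* no shifting of the tail */
    r->count--;
    return v;
}

/* Random access stays O(1): offset from head, modulo capacity. */
static int get(const ring *r, size_t i) {
    assert(i < r->count);
    return r->items[(r->head + i) % CAP];
}
```

Compare this with a plain C array, where pop_front would have to move every remaining element down by one slot.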
Historically, __NSArrayM is a newer-than-Cocoa class name; it wasn't called that in OS X 10.0, but I can't find a good link for the older names. Maybe it used to be a tree. The link in your first comment makes it sound like in 2005, at large sizes, it was not a circular buffer internally.

Objective C's ARC vs C++'s Manual Memory Management [duplicate]

This question already has answers here:
What are the advantages and disadvantages of using ARC? [closed]
(4 answers)
Closed 9 years ago.
One possible exam question reads as follows:
"Explain the benefits and drawbacks of Objective-C's memory management when compared to C++'s"
I do know Objective C uses ARC, and ARC enables us to avoid destroying an object that is still being referenced by something else (meaning, its still needed). But I can't find any drawbacks at all, anywhere. I can only think of "There are no drawbacks" as an answer, but since the question explicitly asks for drawbacks, I'm guessing there must be at least one.
Reference counting may solve a problem that you don't have. It comes at a price, and you'll end up paying the price no matter whether you wanted the solution in the first place.
Contrary to what the gut feeling may say, most objects actually don't need to be shared at all and have a well-defined, unique ownership throughout their life. All that's needed is the ability to pass those objects around; reference counting provides that, but it provides much more, and has a greater cost.
(This answer compares reference counting in Objective C to C++-style lifetime management. It does not consider whether reference counting in Obj-C is sensible in the first place. ARC is simply an automated form of MRC, and if you were using MRC in the past and it made sense, then the question whether to migrate to ARC is not the point of this post. Rather, this post applies equally to the comparison of "MRC in Obj-C vs C++".)
Reference counting 'frees' you from always thinking about WHEN to delete an object. Anybody using the object just says "I still need it" (retains it) or "I am done with it" (releases it). That makes memory management way easier and also makes the code more manageable.
BUT
it comes at the price of two additional method calls whenever you pass stuff around: you have to retain the object, THEN save the pointer, and later also call release on it.
When you deal with LOTS of objects, that becomes a real problem. Just the extra calls can kill your algorithm's performance, and especially if you don't need any reference counting, because the scope where the object is used is clear, the overhead is just annoying.
So it is convenience + maintainability vs. speed.
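The "two additional method calls" point can be made concrete with a toy manual refcount in plain C (all names here are invented for illustration; this is not Apple's runtime):

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    int refcount;
    double payload;
} obj;

static obj *obj_new(double payload) {
    obj *o = malloc(sizeof *o);
    o->refcount = 1;           /* the creator owns one reference */
    o->payload = payload;
    return o;
}

static obj *obj_retain(obj *o) { o->refcount++; return o; }

static void obj_release(obj *o) {
    if (--o->refcount == 0)    /* the last owner pays for the delete */
        free(o);
}

/* Every consumer that holds on to the pointer must bracket it with a
 * retain/release pair -- that is the per-handoff overhead described
 * above, versus a bare pointer copy under unique C++-style ownership. */
static double use_and_drop(obj *o) {
    obj_retain(o);             /* "I still need it" */
    double v = o->payload;
    obj_release(o);            /* "I am done with it" */
    return v;
}
```

With unique ownership, use_and_drop would be a plain read; the retain/release bookkeeping is pure overhead whenever the lifetime was already clear from scope.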

How does Apple's Objective-C runtime do multithreaded reference counting without degraded performance?

So I was reading this article about an attempt to remove the global interpreter lock (GIL) from the Python interpreter to improve multithreading performance and saw something interesting.
It turns out that one of the places where removing the GIL actually made things worse was in memory management:
With free-threading, reference counting operations lose their thread-safety. Thus, the patch introduces a global reference-counting mutex lock along with atomic operations for updating the count. On Unix, locking is implemented using a standard pthread_mutex_t lock (wrapped inside a PyMutex structure) and the following functions...
...On Unix, it must be emphasized that simple reference count manipulation has been replaced by no fewer than three function calls, plus the overhead of the actual locking. It's far more expensive...
...Clearly fine-grained locking of reference counts is the major culprit behind the poor performance, but even if you take away the locking, the reference counting performance is still very sensitive to any kind of extra overhead (e.g., function call, etc.). In this case, the performance is still about twice as slow as Python with the GIL.
and later:
Reference counting is a really lousy memory-management technique for free-threading. This was already widely known, but the performance numbers put a more concrete figure on it. This will definitely be the most challenging issue for anyone attempting a GIL removal patch.
So the question is, if reference counting is so lousy for threading, how does Objective-C do it? I've written multithreaded Objective-C apps, and haven't noticed much of an overhead for memory management. Are they doing something else? Like some kind of per object lock instead of a global one? Is Objective-C's reference counting actually technically unsafe with threads? I'm not enough of a concurrency expert to really speculate much, but I'd be interested in knowing.
There is overhead and it can be significant in rare cases (like, for example, micro-benchmarks ;), regardless of the optimizations that are in place (of which, there are many). The normal case, though, is optimized for un-contended manipulation of the reference count for the object.
So the question is, if reference counting is so lousy for threading, how does Objective-C do it?
There are multiple locks in play and, effectively, a retain/release on any given object selects a random lock (but always the same lock for that object). This reduces lock contention while not requiring one lock per object.
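The "random but stable lock per object" scheme is classic lock striping: hash the object's address into a small fixed table of locks. A hedged sketch in C with pthreads (the table size and hash are made up; the real runtime's side tables are considerably more involved):

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define NLOCKS 64   /* small, fixed pool of locks shared by all objects */

static pthread_mutex_t locks[NLOCKS];

static void init_locks(void) {
    for (int i = 0; i < NLOCKS; i++)
        pthread_mutex_init(&locks[i], NULL);
}

/* The same object always hashes to the same lock, so updates to one
 * object's refcount are serialized, while two unrelated objects only
 * rarely land on the same lock and contend. */
static pthread_mutex_t *lock_for(const void *obj) {
    uintptr_t p = (uintptr_t)obj;
    return &locks[(p >> 4) % NLOCKS];  /* drop low bits: pointers are aligned */
}

static void locked_retain(const void *obj, int *refcount) {
    pthread_mutex_t *m = lock_for(obj);
    pthread_mutex_lock(m);
    (*refcount)++;
    pthread_mutex_unlock(m);
}
```

This is the middle ground between Python's single global refcount mutex (maximum contention) and one lock per object (maximum memory overhead).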
(And what Catfish_man said; some classes will implement their own reference counting scheme to use class-specific locking primitives to avoid contention and/or optimize for their specific needs.)
The implementation details are more complex.
Is Objective-C's reference counting actually technically unsafe with threads?
Nope -- it is safe in regards to threads.
In reality, typical code will call retain and release quite infrequently, compared to other operations. Thus, even if there were significant overhead on those code paths, it would be amortized across all the other operations in the app (where, say, pushing pixels to the screen is really expensive, by comparison).
If an object is shared across threads (bad idea, in general), then the locking overhead protecting the data access and manipulation will generally be vastly greater than the retain/release overhead because of the infrequency of retaining/releasing.
As far as Python's GIL overhead is concerned, I would bet that it has more to do with how often the reference count is incremented and decremented as a part of normal interpreter operations.
In addition to what bbum said, a lot of the most frequently thrown around objects in Cocoa override the normal reference counting mechanisms and store a refcount inline in the object, which they manipulate with atomic add and subtract instructions rather than locking.
(edit from the future: Objective-C now automatically does this optimization on modern Apple platforms, by mixing the refcount in with the 'isa' pointer)
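The inline-refcount-plus-atomics approach described above can be sketched with C11 atomics (a toy layout for illustration; the real modern runtime packs the count into spare bits of the isa pointer, as the edit notes):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    _Atomic long refcount;   /* stored inline in the object itself */
    /* ... object payload would follow ... */
} inline_obj;

/* No lock at all: a single atomic add, cheap when uncontended. */
static void inline_retain(inline_obj *o) {
    atomic_fetch_add_explicit(&o->refcount, 1, memory_order_relaxed);
}

/* Returns true when the caller dropped the last reference and should
 * deallocate. The release/acquire pairing makes all prior writes to the
 * payload visible to whichever thread performs the final cleanup. */
static bool inline_release(inline_obj *o) {
    if (atomic_fetch_sub_explicit(&o->refcount, 1,
                                  memory_order_release) == 1) {
        atomic_thread_fence(memory_order_acquire);
        return true;   /* we took the count from 1 to 0 */
    }
    return false;
}
```

Contrast with the Python patch quoted above: there, every increment took a mutex lock plus several function calls; here the common path is one atomic instruction, which is why the overhead mostly disappears in real apps.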

Which type of array to use for large amounts of numbers?

I need to store large amounts of unsigned chars and/or ints (potentially 100,000,000 and up) in an array. Mathematical operations will frequently be performed on the numbers in this array, so the array will be modified often, and the length of the array can potentially change often as well.
I can use C or Objective-C (or both). Performance wise, would it be better to use a plain C array and realloc it as necessary, or just go for an NSMutableArray? Or does anyone have any better ideas?
Please note that performance is my main concern, I am willing to write extensive reallocation code if necessary.
Also: Memory usage is a consideration, but not a concern (as long as it doesn't end up using multiple gigabytes).
Using an NSMutableArray means you have the overhead of two Objective-C message sends every time you want to get or set the value of an array element. (One message to get the element as an object, and a second to get its value as a primitive int.) A message send is much slower than a direct array access.
You could use a CFMutableArray instead of an NSMutableArray, and specify callbacks that let you store bare numbers instead of objects. But you would still need to use a function call to get or set each array value.
If you need peak performance, just use a plain C array, or a std::vector if you want to use Objective-C++.
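A minimal sketch of the "plain C array plus realloc" option, using geometric growth so appends stay amortized O(1) (the helper names are invented for illustration):

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    unsigned char *data;
    size_t count;
    size_t capacity;
} byte_vec;

/* Doubling the capacity on each growth keeps the total reallocation
 * cost linear in the number of appends -- which matters at the
 * 100,000,000-element scale the question describes. Growing by a fixed
 * amount instead would make N appends cost O(N^2) in copying. */
static int byte_vec_push(byte_vec *v, unsigned char value) {
    if (v->count == v->capacity) {
        size_t new_cap = v->capacity ? v->capacity * 2 : 1024;
        unsigned char *p = realloc(v->data, new_cap);
        if (!p) return -1;           /* caller handles out-of-memory */
        v->data = p;
        v->capacity = new_cap;
    }
    v->data[v->count++] = value;
    return 0;
}
```

Element access is then a direct index with no message send or function call, which is the whole point of preferring this over NSMutableArray here.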
Will your array need to grow much, and by how much?
Using realloc is not very performant.
That's why I would recommend a linked list, such as GSList in GLib.
The container:
How about C++? Objective-C++ and the STL might be an option; the STL was made by smart people and is actually quite efficient in skilled hands. That said, having potentially up to 100,000,000 entries requires some optimization tricks in any case.
The framework:
You haven't specified the task itself; could it be suitable to use something like Core Data or maybe SQLite? The math can be done with SQL procedures.
The first is good if you have some, mmm, data samples: pixels, audio chunks, or something like that. The second way is definitely preferred in most other cases.

Cocoa NSArray/NSSet: -makeObjectsPerformSelector: vs. fast enumeration

I want to perform the same action over several objects stored in a NSSet.
My first attempt was using a fast enumeration:
for (id item in mySetOfObjects)
[item action];
which works pretty fine. Then I thought of:
[mySetOfObjects makeObjectsPerformSelector:@selector(action)];
And now, I don't know which is the better choice. As far as I understand, the two solutions are equivalent. But are there arguments for preferring one solution over the other?
I would argue for using makeObjectsPerformSelector, since it allows the NSSet object to take care of its own indexing, looping and message dispatching. The people who wrote the NSSet code are most likely to know the best way to implement that particular loop.
At worst, they would simply implement the exact same loop, and all you gain is slightly cleaner code (no need for the enclosing loop). At best, they made some internal optimizations and the code will actually run faster.
The topic is briefly mentioned in Apple's Code Speed Performance document, in the section titled "Unrolling Loops".
If you're concerned about performance, the best thing to do is set up a quick program which performs some selector on the objects in a set. Have it run several million times, and time the difference between the two different cases.
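A rough shape for that kind of micro-benchmark, sketched in C for portability (the two work functions merely stand in for the two enumeration styles; the timing method and iteration counts are placeholders to adjust):

```c
#include <stdio.h>
#include <time.h>

/* Time many repetitions of one approach; returns elapsed CPU seconds. */
static double time_loop(void (*work)(void), long reps) {
    clock_t start = clock();
    for (long i = 0; i < reps; i++)
        work();
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

static volatile long sink;  /* volatile keeps the optimizer from
                               deleting the "work" entirely */

/* Stand-ins for "selector on each element" vs "block on each element". */
static void approach_a(void) { for (int i = 0; i < 1000; i++) sink += i; }
static void approach_b(void) { for (int i = 0; i < 1000; i++) sink += i * 2; }
```

Run each case several times, discard the warm-up runs, and compare the medians; a single run of a micro-benchmark like this is mostly noise.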
I too was presented with this question. I found in the Apple docs "Collections Programming Topics", under "Sets: Unordered Collections of Objects", the following:
The NSSet method objectEnumerator lets you traverse elements of the set one by one. And the makeObjectsPerformSelector: and makeObjectsPerformSelector:withObject: methods provide for sending messages to individual objects in the set. In most cases, fast enumeration should be used because it is faster and more flexible than using an NSEnumerator or the makeObjectsPerformSelector: method. For more on enumeration, see "Enumeration: Traversing a Collection's Elements."
This leads me to believe that Fast Enumeration is still the most efficient means for this application.
I would not use makeObjectsPerformSelector: for the simple reason that it is the kind of call you don't see all that often. Here is an example of why: suppose I need to add debugging code as the set is enumerated. You really can't do that with makeObjectsPerformSelector: unless you change how the code works in Release mode, which is a real no-no.
for (id item in mySetOfObjects)
{
#if MY_DEBUG_BUILD
if ([item isAllMessedUp])
NSLog(@"we found that wily bug that has been haunting us");
#endif
[item action];
}
--Tom
makeObjectsPerformSelector: might be slightly faster, but I doubt there's going to be any practical difference 99% of the time. It is a bit more concise and readable though, I would use it for that reason.
If pure speed is the only issue (i.e. you're creating some rendering engine where every tiny CPU cycle counts), the fastest possible way to iterate through any of the Foundation collection objects (as of iOS 5.0 ~ 6.0) is the various enumerateObjectsUsingBlock: methods. I have no idea why this is, but I tested it and this seems to be the case...
I wrote a small test creating collections of hundreds of thousands of objects, each of which has a method that sums a simple array of ints. Each of those collections was forced to perform the various types of iteration (for loop, fast enumeration, makeObjectsPerformSelector:, and enumerateObjectsUsingBlock:) millions of times, and in almost every case the enumerateObjectsUsingBlock: methods won handily over the course of the tests.
The only time when this wasn't true was when memory began to fill up (when I began to run it with millions of objects), after which it began to lose to "makeObjectsPerformSelector".
I'm sorry I didn't take a snapshot of the code, but it's a very simple test to run, I highly recommend giving it a try and see for yourself. :)