I need to allocate lot's of NSString objects from cStrings (which come that way from a database), as fast as possible. cStringUsingEncoding and the likes are just too slow - about 10-15 times slower compared to allocating a cString.
However, creating a NSString with a NSString is getting pretty close to cString allocation (about 1.2s for 1M allocations). EDIT: Fixed alloc to use a copy of the string.
const char *n;
const char *s = "Office für iPad: Steve Ballmer macht Hoffnung";
NSString *str = [NSString stringWithUTF8String:s];
int len = strlen(s);
for (int i = 0; i<10000000; i++) {
NSString *s = [[NSString alloc] initWithString:[str copy]];
s = s;
}
cString allocation test (also about 1s for 1M allocations):
for (int i = 0; i<10000000; i++) {
n = malloc(len);
memccpy((void*)n, s, 0, len) ;
n = n;
free(n);
}
But as I said, using stringWithCString and the likes is an order of magnitude slower. The fastest I could get was using initWithBytesNoCopy (about 8s, therefore 8 times slower compared to stringWithString):
NSString *so = [[NSString alloc] initWithBytesNoCopy:(void*)n length:len encoding:NSUTF8StringEncoding freeWhenDone:YES];
So, is there another magic way to make allocations from cStrings faster? I'd even not rule out to subclass NSString (and yes, I know it's a cluster class).
EDIT: In instruments I see that NSString's call to CFStringUsingByteStream3 is the root issue.
EDIT 2: The root issue is according to instuments __CFFromUTF8. Just looking at the sources [1], this seems indeed to be quite inefficient and handling some legacy cases.
https://www.opensource.apple.com/source/CF/CF-476.17/CFBuiltinConverters.c?txt
This seems to me to not be a fair test.
cString allocation test looks to be allocating a byte array and copying data. I can't tell for sure because the variable definitions are not included.
NSString *s = [[NSString alloc] initWithString:str]; is taking an existing NSString (data already in the correct format) and maybe just increments the retain count. Even if a copy is forced the data is still already in the correct encoding and just needs to be copied.
[NSString stringWithUTF8String:s]; has to handle the UTF8 encoding and convert from one encoding (UTF8) to the internal NSString/CFString encoding. The method being used (CFStreamUsingByteStream) has support for multiple encodings (UTF8/UTF16/UTF32/others). A specialized UTF8 only method could be faster but that leads to the question of is this really a performance problem or just an exercise.
You can see the source code for CFStringUsingByteStream3 in this file.
As per my comment, and Brian's answer, I think the problem here is that to create NSStrings you're having to parse the UTF-8 strings. So the question arises: do you really need to parse them, then?
If parsing-on-demand is an option then I'd suggest you write a proxy that can impersonate NSString with an interface along the lines of:
#interface BJLazyUTF8String: NSProxy
- (id)initWithBytes:(const char *)bytes length:(size_t)length;
#end
So it's not a subclass of NSString and it doesn't try to provide any real functionality. Inside the init just keep the bytes, e.g. as _bytes, doing whatever is correct for your C memory ownership. Then:
- (NSString *)bjRealString
{
// we'd better create the NSString if we haven't already
if(!_string)
_string = [NSString stringWithUTF8String:_bytes];
return _string;
}
- (void)forwardInvocation:(NSInvocation *)anInvocation
{
// if this is invoked then someone is trying to
// make a call to what they think is a string;
// let's forward that call to a string so that
// it does what they expect
[anInvocation setTarget:[self bjRealString]];
[anInvocation invoke];
}
- (NSMethodSignature *)methodSignatureForSelector:(SEL)aSelector
{
return [[self bjRealString] methodSignatureForSelector:aSelector];
}
You can then do:
NSString *myString = [[BJLazyUTF8String alloc] initWithBytes:... length:...];
And subsequently treat myString exactly as though it were an NSString.
Microbenchmarks are a great distraction, but rarely useful. In this case, though, there is validity.
Assuming, for the moment, that you've actually measured string creation as being a real source of performance issues, then the real problem can be better expressed as how do I reduce memory bandwidth? because that is really where your problems lie; you causing tons and tons of data to be copied into freshly allocated buffers.
As you've discovered, the fastest you can go is to not copy at all. initWithBytesNoCopy:... exists exactly to solve this case. Thus, you'll want to create a data construct that holds the original string buffer and manages all the NSString instances that point to it as one cohesive unit.
Without thinking it through in detail, you could likely encapsulate the raw buffer in an NSData instance, then use associated objects to create a strong reference from your string instances to that NSData instance. That way, the NSData (and associated memory) will be deallocated when the last string is deallocated.
With the additional detail that this is for a CoreData-esque ORM layer (and, no, I'm not going to suggest yer doin' it wrong because your description really does sound like you need that level of control), then it would seem that your ORM layer would be the ideal place to manage these strings as described above.
I'd also encourage you to investigate something like FMDB to see if it can provide both the encapsulation you need and the flexibility to add your additional features (and the hooks to make it fast).
I usually see this question asked the other way, such as Must every ivar be a property? (and I like bbum's answer to this Q).
I use properties almost exclusively in my code. Every so often, however, I work with a contractor who has been developing on iOS for a long time and is a traditional game programmer. He writes code that declares almost no properties whatsoever and leans on ivars. I assume he does this because 1.) he's used to it since properties didn't always exist until Objective C 2.0 (Oct '07) and 2.) for the minimal performance gain of not going through a getter / setter.
While he writes code that doesn't leak, I'd still prefer him to use properties over ivars. We talked about it and he more or less sees not reason to use properties since we weren't using KVO and he's experienced with taking care of the memory issues.
My question is more... Why would you ever want to use an ivar period - experienced or not. Is there really that great of a performance difference that using an ivar would be justified?
Also as a point of clarification, I override setters and getters as needed and use the ivar that correlates with that property inside of the getter / setter. However, outside of a getter / setter or init, I always use the self.myProperty syntax.
Edit 1
I appreciate all of the good responses. One that I'd like to address that seems incorrect is that with an ivar you get encapsulation where with a property you don't. Just define the property in a class continuation. This will hide the property from outsiders. You can also declare the property readonly in the interface and redefine it as readwrite in the implementation like:
// readonly for outsiders
#property (nonatomic, copy, readonly) NSString * name;
and have in the class continuation:
// readwrite within this file
#property (nonatomic, copy) NSString * name;
To have it completely "private" only declare it in the class continuation.
Encapsulation
If the ivar is private, the other parts of the program can't get at it as easily. With a declared property, the clever people can access and mutate quite easily via the accessors.
Performance
Yes, this can make a difference in some cases. Some programs have constraints where they can not use any objc messaging in certain parts of the program (think realtime). In other cases, you may want to access it directly for speed. In other cases, it's because objc messaging acts as an optimization firewall. Finally, it can reduce your reference count operations and minimize peak memory usage (if done correctly).
Nontrivial Types
Example: If you have a C++ type, direct access is just the better approach sometimes. The type may not be copyable, or it may not be trivial to copy.
Multithreading
Many of your ivars are codependent. You must ensure your data integrity in multithreaded context. Thus, you may favor direct access to multiple members in critical sections. If you stick with accessors for codependent data, your locks must typically be reentrant and you will often end up making many more acquisitions (significantly more at times).
Program Correctness
Since the subclasses can override any method, you may eventually see there is a semantic difference between writing to the interface versus managing your state appropriately. Direct access for program correctness is especially common in partially constructed states -- in your initializers and in dealloc, it's best to use direct access. You may also find this common in the implementations of an accessor, a convenience constructor, copy, mutableCopy, and archiving/serialization implementations.
It's also more frequent as one moves from the everything has a public readwrite accessor mindset to one which hides its implementation details/data well. Sometimes you need to correctly step around side effects a subclass' override may introduce in order to do the right thing.
Binary Size
Declaring everything readwrite by default usually results in many accessor methods you never need, when you consider your program's execution for a moment. So it will add some fat to your program and load times as well.
Minimizes Complexity
In some cases, it's just completely unnecessary to add+type+maintain all that extra scaffolding for a simple variable such as a private bool that is written in one method and read in another.
That's not at all to say using properties or accessors is bad - each has important benefits and restrictions. Like many OO languages and approaches to design, you should also favor accessors with appropriate visibility in ObjC. There will be times you need to deviate. For that reason, I think it's often best to restrict direct accesses to the implementation which declares the ivar (e.g. declare it #private).
re Edit 1:
Most of us have memorized how to call a hidden accessor dynamically (as long as we know the name…). Meanwhile, most of us have not memorized how to properly access ivars which aren't visible (beyond KVC). The class continuation helps, but it does introduce vulnerabilities.
This workaround's obvious:
if ([obj respondsToSelector:(#selector(setName:)])
[(id)obj setName:#"Al Paca"];
Now try it with an ivar only, and without KVC.
For me it is usually performance. Accessing an ivar of an object is as fast as accessing a struct member in C using a pointer to memory containing such a struct. In fact, Objective-C objects are basically C structs located in dynamically allocated memory. This is usually as fast as your code can get, not even hand optimized assembly code can be any faster than that.
Accessing an ivar through a getter/setting involves an Objective-C method call, which is much slower (at least 3-4 times) than a "normal" C function call and even a normal C function call would already be multiple times slower than accessing a struct member. Depending on the attributes of your property, the setter/getter implementation generated by the compiler may involve another C function call to the functions objc_getProperty/objc_setProperty, as these will have to retain/copy/autorelease the objects as needed and further perform spinlocking for atomic properties where necessary. This can easily get very expensive and I'm not talking about being 50% slower.
Let's try this:
CFAbsoluteTime cft;
unsigned const kRuns = 1000 * 1000 * 1000;
cft = CFAbsoluteTimeGetCurrent();
for (unsigned i = 0; i < kRuns; i++) {
testIVar = i;
}
cft = CFAbsoluteTimeGetCurrent() - cft;
NSLog(#"1: %.1f picoseconds/run", (cft * 10000000000.0) / kRuns);
cft = CFAbsoluteTimeGetCurrent();
for (unsigned i = 0; i < kRuns; i++) {
[self setTestIVar:i];
}
cft = CFAbsoluteTimeGetCurrent() - cft;
NSLog(#"2: %.1f picoseconds/run", (cft * 10000000000.0) / kRuns);
Output:
1: 23.0 picoseconds/run
2: 98.4 picoseconds/run
This is 4.28 times slower and this was a non-atomic primitive int, pretty much the best case; most other cases are even worse (try an atomic NSString * property!). So if you can live with the fact that each ivar access is 4-5 times slower than it could be, using properties is fine (at least when it comes to performance), however, there are plenty of situations where such a performance drop is completely unacceptable.
Update 2015-10-20
Some people argue, that this is not a real world problem, the code above is purely synthetic and you will never notice that in a real application. Okay then, let's try a real world sample.
The code following below defines Account objects. An account has properties that describe name (NSString *), gender (enum), and age (unsigned) of its owner, as well as a balance (int64_t). An account object has an init method and a compare: method. The compare: method is defined as: Female orders before male, names order alphabetically, young orders before old, balance orders low to high.
Actually there exists two account classes, AccountA and AccountB. If you look their implementation, you'll notice that they are almost entirely identical, with one exception: The compare: method. AccountA objects access their own properties by method (getter), while AccountB objects access their own properties by ivar. That's really the only difference! They both access the properties of the other object to compare to by getter (accessing it by ivar wouldn't be safe! What if the other object is a subclass and has overridden the getter?). Also note that accessing your own properties as ivars does not break encapsulation (the ivars are still not public).
The test setup is really simple: Create 1 Mio random accounts, add them to an array and sort that array. That's it. Of course, there are two arrays, one for AccountA objects and one for AccountB objects and both arrays are filled with identical accounts (same data source). We time how long it takes to sort the arrays.
Here's the output of several runs I did yesterday:
runTime 1: 4.827070, 5.002070, 5.014527, 5.019014, 5.123039
runTime 2: 3.835088, 3.804666, 3.792654, 3.796857, 3.871076
As you can see, sorting the array of AccountB objects is always significant faster than sorting the array of AccountA objects.
Whoever claims that runtime differences of up to 1.32 seconds make no difference should better never do UI programming. If I want to change the sorting order of a large table, for example, time differences like these do make a huge difference to the user (the difference between an acceptable and a sluggish UI).
Also in this case the sample code is the only real work performed here, but how often is your code just a small gear of a complicated clockwork? And if every gear slows down the whole process like this, what does that mean for the speed of the whole clockwork in the end? Especially if one work step depends on the output of another one, which means all the inefficiencies will sum up. Most inefficiencies are not a problem on their own, it's their sheer sum that becomes a problem to the whole process. And such a problem is nothing a profiler will easily show because a profiler is about finding critical hot spots, but none of these inefficiencies are hot spots on their own. The CPU time is just averagely spread among them, yet each of them only has such a tiny fraction of it, it seems a total waste of time to optimize it. And it's true, optimizing just one of them would help absolutely nothing, optimizing all of them can help dramatically.
And even if you don't think in terms of CPU time, because you believe wasting CPU time is totally acceptable, after all "it's for free", then what about server hosting costs caused by power consumption? What about battery runtime of mobile devices? If you would write the same mobile app twice (e.g. an own mobile web browser), once a version where all classes access their own properties only by getters and once where all classes access them only by ivars, using the first one constantly will definitely drain the battery much faster than using the second one, even though they are functional equivalent and to the user the second one would even probably even feel a bit swifter.
Now here's the code for your main.m file (the code relies on ARC being enabled and be sure to use optimization when compiling to see the full effect):
#import <Foundation/Foundation.h>
typedef NS_ENUM(int, Gender) {
GenderMale,
GenderFemale
};
#interface AccountA : NSObject
#property (nonatomic) unsigned age;
#property (nonatomic) Gender gender;
#property (nonatomic) int64_t balance;
#property (nonatomic,nonnull,copy) NSString * name;
- (NSComparisonResult)compare:(nonnull AccountA *const)account;
- (nonnull instancetype)initWithName:(nonnull NSString *const)name
age:(const unsigned)age gender:(const Gender)gender
balance:(const int64_t)balance;
#end
#interface AccountB : NSObject
#property (nonatomic) unsigned age;
#property (nonatomic) Gender gender;
#property (nonatomic) int64_t balance;
#property (nonatomic,nonnull,copy) NSString * name;
- (NSComparisonResult)compare:(nonnull AccountB *const)account;
- (nonnull instancetype)initWithName:(nonnull NSString *const)name
age:(const unsigned)age gender:(const Gender)gender
balance:(const int64_t)balance;
#end
static
NSMutableArray * allAcocuntsA;
static
NSMutableArray * allAccountsB;
static
int64_t getRandom ( const uint64_t min, const uint64_t max ) {
assert(min <= max);
uint64_t rnd = arc4random(); // arc4random() returns a 32 bit value only
rnd = (rnd << 32) | arc4random();
rnd = rnd % ((max + 1) - min); // Trim it to range
return (rnd + min); // Lift it up to min value
}
static
void createAccounts ( const NSUInteger ammount ) {
NSArray *const maleNames = #[
#"Noah", #"Liam", #"Mason", #"Jacob", #"William",
#"Ethan", #"Michael", #"Alexander", #"James", #"Daniel"
];
NSArray *const femaleNames = #[
#"Emma", #"Olivia", #"Sophia", #"Isabella", #"Ava",
#"Mia", #"Emily", #"Abigail", #"Madison", #"Charlotte"
];
const NSUInteger nameCount = maleNames.count;
assert(maleNames.count == femaleNames.count); // Better be safe than sorry
allAcocuntsA = [NSMutableArray arrayWithCapacity:ammount];
allAccountsB = [NSMutableArray arrayWithCapacity:ammount];
for (uint64_t i = 0; i < ammount; i++) {
const Gender g = (getRandom(0, 1) == 0 ? GenderMale : GenderFemale);
const unsigned age = (unsigned)getRandom(18, 120);
const int64_t balance = (int64_t)getRandom(0, 200000000) - 100000000;
NSArray *const nameArray = (g == GenderMale ? maleNames : femaleNames);
const NSUInteger nameIndex = (NSUInteger)getRandom(0, nameCount - 1);
NSString *const name = nameArray[nameIndex];
AccountA *const accountA = [[AccountA alloc]
initWithName:name age:age gender:g balance:balance
];
AccountB *const accountB = [[AccountB alloc]
initWithName:name age:age gender:g balance:balance
];
[allAcocuntsA addObject:accountA];
[allAccountsB addObject:accountB];
}
}
int main(int argc, const char * argv[]) {
#autoreleasepool {
#autoreleasepool {
NSUInteger ammount = 1000000; // 1 Million;
if (argc > 1) {
unsigned long long temp = 0;
if (1 == sscanf(argv[1], "%llu", &temp)) {
// NSUIntegerMax may just be UINT32_MAX!
ammount = (NSUInteger)MIN(temp, NSUIntegerMax);
}
}
createAccounts(ammount);
}
// Sort A and take time
const CFAbsoluteTime startTime1 = CFAbsoluteTimeGetCurrent();
#autoreleasepool {
[allAcocuntsA sortedArrayUsingSelector:#selector(compare:)];
}
const CFAbsoluteTime runTime1 = CFAbsoluteTimeGetCurrent() - startTime1;
// Sort B and take time
const CFAbsoluteTime startTime2 = CFAbsoluteTimeGetCurrent();
#autoreleasepool {
[allAccountsB sortedArrayUsingSelector:#selector(compare:)];
}
const CFAbsoluteTime runTime2 = CFAbsoluteTimeGetCurrent() - startTime2;
NSLog(#"runTime 1: %f", runTime1);
NSLog(#"runTime 2: %f", runTime2);
}
return 0;
}
#implementation AccountA
- (NSComparisonResult)compare:(nonnull AccountA *const)account {
// Sort by gender first! Females prior to males.
if (self.gender != account.gender) {
if (self.gender == GenderFemale) return NSOrderedAscending;
return NSOrderedDescending;
}
// Otherwise sort by name
if (![self.name isEqualToString:account.name]) {
return [self.name compare:account.name];
}
// Otherwise sort by age, young to old
if (self.age != account.age) {
if (self.age < account.age) return NSOrderedAscending;
return NSOrderedDescending;
}
// Last ressort, sort by balance, low to high
if (self.balance != account.balance) {
if (self.balance < account.balance) return NSOrderedAscending;
return NSOrderedDescending;
}
// If we get here, the are really equal!
return NSOrderedSame;
}
- (nonnull instancetype)initWithName:(nonnull NSString *const)name
age:(const unsigned)age gender:(const Gender)gender
balance:(const int64_t)balance
{
self = [super init];
assert(self); // We promissed to never return nil!
_age = age;
_gender = gender;
_balance = balance;
_name = [name copy];
return self;
}
#end
#implementation AccountB
- (NSComparisonResult)compare:(nonnull AccountA *const)account {
// Sort by gender first! Females prior to males.
if (_gender != account.gender) {
if (_gender == GenderFemale) return NSOrderedAscending;
return NSOrderedDescending;
}
// Otherwise sort by name
if (![_name isEqualToString:account.name]) {
return [_name compare:account.name];
}
// Otherwise sort by age, young to old
if (_age != account.age) {
if (_age < account.age) return NSOrderedAscending;
return NSOrderedDescending;
}
// Last ressort, sort by balance, low to high
if (_balance != account.balance) {
if (_balance < account.balance) return NSOrderedAscending;
return NSOrderedDescending;
}
// If we get here, the are really equal!
return NSOrderedSame;
}
- (nonnull instancetype)initWithName:(nonnull NSString *const)name
age:(const unsigned)age gender:(const Gender)gender
balance:(const int64_t)balance
{
self = [super init];
assert(self); // We promissed to never return nil!
_age = age;
_gender = gender;
_balance = balance;
_name = [name copy];
return self;
}
#end
Semantics
What #property can express that ivars can't: nonatomic and copy.
What ivars can express that #property can't:
#protected: public on subclasses, private outside.
#package: public on frameworks on 64 bits, private outside. Same as #public on 32 bits. See Apple's 64-bit Class and Instance Variable Access Control.
Qualifiers. For example, arrays of strong object references: id __strong *_objs.
Performance
Short story: ivars are faster, but it doesn't matter for most uses. nonatomic properties don't use locks, but direct ivar is faster because it skips the accessors call. For details read the following email from lists.apple.com.
Subject: Re: when do you use properties vs. ivars?
From: John McCall <email#hidden>
Date: Sun, 17 Mar 2013 15:10:46 -0700
Properties affect performance in a lot of ways:
As already discussed, sending a message to do a load/store is slower than just doing the load/store inline.
Sending a message to do a load/store is also quite a bit more code that needs to be kept in i-cache: even if the getter/setter
added zero extra instructions beyond just the load/store, there'd be a
solid half-dozen extra instructions in the caller to set up the
message send and handle the result.
Sending a message forces an entry for that selector to be kept in the method cache, and that memory generally sticks around in
d-cache. This increases launch time, increases the static memory
usage of your app, and makes context switches more painful. Since the
method cache is specific to the dynamic class for an object, this
problem increases the more you use KVO on it.
Sending a message forces all values in the function to be spilled to the stack (or kept in callee-save registers, which just means
spilling at a different time).
Sending a message can have arbitrary side-effects and therefore
forces the compiler to reset all of its assumptions about non-local memory
cannot be hoisted, sunk, re-ordered, coalesced, or eliminated.
In ARC, the result of a message send will always get retained, either by the callee or the caller, even for +0 returns: even if the
method doesn't retain/autorelease its result, the caller doesn't know
that and has to try to take action to prevent the result from getting
autoreleased. This can never be eliminated because message sends are
not statically analyzable.
In ARC, because a setter method generally takes its argument at +0, there is no way to "transfer" a retain of that object (which, as
discussed above, ARC usually has) into the ivar, so the value
generally has to get retain/released twice.
None of this means that they're always bad, of course — there are a
lot of good reasons to use properties. Just keep in mind that, like
many other language features, they're not free.
John.
The most important reason is the OOP concept of information hiding: If you expose everything via properties and thus make allow external objects to peek at another object's internals then you will make use of these internal and thus complicate changing the implementation.
The "minimal performance" gain can quickly sum up and then become a problem. I know from experience; I work on an app that really takes the iDevices to their limits and we thus need to avoid unnecessary method calls (of course only where reasonably possible). To aid with this goal, we're also avoiding the dot syntax since it makes it hard to see the number of method calls on first sight: for example, how many method calls does the expression self.image.size.width trigger? By contrast, you can immediately tell with [[self image] size].width.
Also, with correct ivar naming, KVO is possible without properties (IIRC, I'm not an KVO expert).
Properties vs. instance variables is a trade-off, in the end the choice comes down to the application.
Encapsulation/Information Hiding This is a Good Thing (TM) from a design perspective, narrow interfaces and minimal linkage is what makes software maintainable and understandable. It is pretty hard in Obj-C to hide anything, but instance variables declared in the implementation come as close as you'll get.
Performance While "premature optimisation" is a Bad Thing (TM), writing badly performing code just because you can is at least as bad. Its hard to argue against a method call being more expensive than a load or store, and in computational intensive code the cost soon adds up.
In a static language with properties, such as C#, calls to setters/getters can often be optimised away by the compiler. However Obj-C is dynamic and removing such calls is much harder.
Abstraction An argument against instance variables in Obj-C has traditionally been memory management. With MRC instance variables require calls to retain/release/autorelease to be spread throughout the code, properties (synthesized or not) keep the MRC code in one place - the principle of abstraction which is a Good Thing (TM). However with GC or ARC this argument goes away, so abstraction for memory management is no longer an argument against instance variables.
Properties expose your variables to other classes. If you just need a variable that is only relative to the class you're creating, use an instance variable. Here's a small example: the XML classes for parsing RSS and the like cycle through a bunch of delegate methods and such. It's practical to have an instance of NSMutableString to store the result of each different pass of the parse. There's no reason why an outside class would need to ever access or manipulate that string. So, you just declare it in the header or privately and access it throughout the class. Setting a property for it might only be useful to make sure there are no memory issues, using self.mutableString to invoke the getter/setters.
Backwards compatibility was a factor for me. I couldn't use any Objective-C 2.0 features because I was developing software and printer drivers that had to work on Mac OS X 10.3 as part of a requirement. I know your question seemed targeted around iOS, but I thought I'd still share my reasons for not using properties.
In order to promote good programming habits and increase the efficiency of my code (Read: "My brother and I are arguing over some code"), I propose this question to experienced programmers:
Which block of code is "better"?
For those who can't be bothered to read the code, is it worth putting a conditional within a for-loop to decrease the amount of redundant code than to put it outside and make 2 for-loops? Both pieces of code work, the question is efficiency vs. readability.
- (NSInteger)eliminateGroup {
NSMutableArray *blocksToKill = [[NSMutableArray arrayWithCapacity:rowCapacity*rowCapacity] retain];
NSInteger numOfBlocks = (NSInteger)[self countChargeOfGroup:blocksToKill];
Block *temp;
NSInteger chargeTotal = 0;
//Start paying attention here
if (numOfBlocks > 3)
for (NSUInteger i = 0; i < [blocksToKill count]; i++) {
temp = (Block *)[blocksToKill objectAtIndex:i];
chargeTotal += temp.charge;
[temp eliminate];
temp.beenCounted = NO;
}
}
else {
for (NSUInteger i = 0; i < [blocksToKill count]; i++) {
temp = (Block *)[blocksToKill objectAtIndex:i];
temp.beenCounted = NO;
}
}
[blocksToKill release];
return chargeTotal;
}
Or...
- (NSInteger)eliminateGroup {
NSMutableArray *blocksToKill = [[NSMutableArray arrayWithCapacity:rowCapacity*rowCapacity] retain];
NSInteger numOfBlocks = (NSInteger)[self countChargeOfGroup:blocksToKill];
Block *temp;
NSInteger chargeTotal = 0;
//Start paying attention here
for (NSUInteger i = 0; i < [blocksToKill count]; i++) {
temp = (Block *)[blocksToKill objectAtIndex:i];
if (numOfBlocks > 3) {
chargeTotal += temp.charge;
[temp eliminate];
}
temp.beenCounted = NO;
}
[blocksToKill release];
return chargeTotal;
}
Keep in mind that this is for a game. The method is called anytime the user double-taps the screen and the for loop normally runs anywhere between 1 and 15 iterations, 64 at maximum. I understand that it really doesn't matter that much, this is mainly for helping me understand exactly how costly conditional statements are. (Read: I just want to know if I'm right.)
The first code block is cleaner and more efficient because the check numOfBlocks > 3 is either true or false throughout the whole iteration.
The second code block avoids code duplication and might therefore pose lesser risk. However, it is conceptually more complicated.
The second block can be improved by adding
bool increaseChargeTotal = (numOfBlocks > 3)
before the loop and then using this boolean variable instead of the actual check inside the loop, emphasizing the fact that during the iteration it doesn't change.
Personally, in this case I would vote for the first option (duplicated loops) because the loop bodies are small and this shows clearly that the condition is external to the loop; also, it's more efficient and might fit the pattern "make the common case fast".
There is no way to answer this without defining your requirements for "better". Is it runtime efficiency? compiled size? code readability? code maintainability? code portability? code reuseability? algorithmic provability? developer efficiency? (Please leave comments on any popular measurements I've missed.)
Sometimes absolute runtime efficiency is all that matters, but not as often as people generally imagine, as you give a nod towards in your question—but this is at least easy to test! Often it's a mix of all these concerns, and you'll have to make a subjective judgement in the end.
Every answer here is applying a personal mix of these aspects, and people often get into vigorous Holy Wars because everyone's right—in the right circumstance. These approaches are ultimately wrong. The only correct approach is to define what matters to you, and then measure against it.
All other things being equal, having two separate loops will generally be faster, because you do the test once instead of every iteration of the loop. The branch inside the loop each iteration will often slow you down significantly due to pipeline stalls and branch mispredictions; however, since the branch always goes the same way, the CPU will almost certainly predict the branch correctly for every iteration except for the first few, assuming you're using a CPU with branch prediction (I'm not sure if the ARM chip used in the iPhone has a branch predictor unit).
However, another thing to consider is code size: the two loops approach generates a lot more code, especially if the rest of the body of the loop is large. Not only does this increase the size of your program's object code, but it also hurts your instruction cache performance -- you'll get a lot more cache misses.
All things considered, unless the code is a significant bottleneck in your application, I would go with the branch inside of the loop, as it leads to clearer code, and it doesn't violate the don't repeat yourself principle. If you make a change to once of the loops and forget to change the other loop in the two-loops version, you're in for a world of hurt.
I would go with the second option. If all of the logic in the loop was completely different, then it would make sense to make 2 for loops, but the case is that some of the logic is the same, and some is additional based upon the conditional. So the second option is cleaner.
The first option would be faster, but marginally so, and I would only use it if I found there to be a bottleneck there.
You would probably waste more time in the pointless and unnessesary [blocksToKill retain]/[blocksToKill release] at the start/end of the method than the time taken to execute a few dozens comparisons. There is no need to retain the array since you wont need it after you return and it will never be cleaned up before then.
IMHO, code duplication is a leading cause of bugs which should be avoided whenever possible.
Adding Jens recomendation to use fast enumeration and Antti's recomendation to use a clearly named boolean, you'd get something like:
- (NSInteger)eliminateGroup {
NSMutableArray *blocksToKill = [NSMutableArray arrayWithCapacity:rowCapacity*rowCapacity];
NSInteger numOfBlocks = (NSInteger)[self countChargeOfGroup:blocksToKill];
NSInteger chargeTotal = 0;
BOOL calculateAndEliminateBlocks = (numOfBlocks > 3);
for (Block* block in blocksToKill) {
if (calculateAndEliminateBlocks) {
chargeTotal += block.charge;
[block eliminate];
}
block.beenCounted = NO;
}
return chargeTotal;
}
If you finish your project and if your program is not running fast enough (two big ifs), then you can profile it and find the hotspots and then you can determine whether the few microseconds you spend contemplating that branch is worth thinking about — certainly it is not worth thinking about at all now, which means that the only consideration is which is more readable/maintainable.
My vote is strongly in favor of the second block.
The second block makes clear what the difference in logic is, and shares the same looping structure. It is both more readable and more maintainable.
The first block is an example of premature optimization.
As for using a bool to "save" all those LTE comparisons--in this case, I don't think it will help, the machine language will likely require exactly the same number and size of instructions.
The overhead of the "if" test is a handful of CPU instructions; way less than a microsecond. Unless you think the loop is going to run hundreds of thousands of times in response to user input, that's just lost in the noise. So I would go with the second solution because the code is both smaller and easier to understand.
In either case, though, I would change the loop to be
for (temp in blocksToKill) { ... }
This is both cleaner-reading and considerably faster than manually getting each element of the array.
Readability (and thus maintainability) can and should be sacrificed in the name of performance, but when, and only when, it's been determined that performance is an issue.
The second block is more readable, and unless/until speed is an issue, it is better (in my opinion). During testing for your app, if you find out this loop is responsible for unacceptable performance, then by all means, seek to make it faster, even if it becomes harder to maintain. But don't do it until you have to.