In this simple test, once I'm sure that the index is valid, is it worth assigning a variable instead of calling the objectAtIndex: method twice?
NSString *s = [myArray objectAtIndex:2];
if (s) {
    Test *t = [Test initFromString:s];
}
instead of
if ([myArray objectAtIndex:2]) {
    Test *t = [Test initFromString:[myArray objectAtIndex:2]];
}
From a performance point of view it's not worth it, unless the code lies on a really hot path (and you would know if it did). Sending a message is practically free, and looking up an object at a given index is also too fast to matter in most situations.
The change makes the code more readable, though. First, you can name the thing that you pull from the container (like testName). Second, when reading the two repeated calls to objectAtIndex: you have to check that they really are the same code. Once you introduce the separate variable that is obvious, so there's less cognitive load.
When I pass a string to a method the Apple-style way and call it a billion times, it takes ~42.001 seconds:
- (void)test:(NSString *)str {
    NSString *test = str;
    if (test) {
        return;
    }
}
NSString *value = @"Value 1";
NSLog(@"START");
for (int i = 0; i < 1e9; i++) {
    [self test:value];
}
NSLog(@"END");
But passing a pointer to the pointer instead (assuming my test method only reads it), like so:
- (void)test:(NSString **)str {
    NSString *test = *str;
    if (test) {
        return;
    }
}
NSLog(@"START");
for (int i = 0; i < 1e9; i++) {
    [self test:&value];
}
NSLog(@"END");
...only takes ~26.804 seconds.
Why does Apple promote the first example as normal practice when the latter seems to perform so differently?
I read about the toll-free bridging that Foundation applies, but if the difference is relatively this big, what's the added value? If a whole application could run more than 1.5 times as fast just by changing some major method arguments like this, isn't that a considerable flaw in the way Apple teaches people to build apps in Objective-C?
You wouldn't use the NSString ** syntax, as that suggests that the method you're calling can change what value points to. You would never do that unless this is really what was taking place.
The plain NSString * version may be taking longer because, in the absence of any optimization, it is probably adding and removing a strong reference to value when the method is called and when it returns.
If you turn on optimization, the behavior changes. For example, when I used the -Os "Fastest, Smallest" build setting, the NSString * version was actually faster than the NSString ** one. And even if the performance were worse, I wouldn't write code that exposed me to all sorts of problems down the line just because it was roughly 15 nanoseconds faster per call (about 15.2 seconds saved over a billion calls). I'd find other ways to optimize the code.
To quote Donald Knuth:
Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%. [Emphasis added]
The goal is always to write code whose functional intent is clear, whose type handling is safest and then, where possible, use the compiler's own internal optimization capabilities to tackle the performance issues. Only sacrifice the code readability and ease of maintenance and debugging when it's absolutely essential.
So I have these two methods:
- (void)importEvents:(NSArray *)allEvents {
    NSMutableDictionary *subjectAssociation = [[NSMutableDictionary alloc] init];
    for (id thisEvent in allEvents) {
        if (classHour.SubjectShort && classHour.Subject) {
            [subjectAssociation setObject:classHour.Subject forKey:classHour.SubjectShort];
        }
    }
    [self storeSubjects:subjectAssociation];
}
- (void)storeSubjects:(NSMutableDictionary *)subjects {
    NSArray *documentPaths = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES);
    NSString *documentsDir = [documentPaths objectAtIndex:0];
    NSString *subjectsList = [documentsDir stringByAppendingPathComponent:@"Subjects.plist"];
    [subjects writeToFile:subjectsList atomically:YES];
}
The first loops through an array of let's say 100 items, and builds a NSMutableDictionary of about 10 unique key/value pairs.
The second method writes this dictionary to a file for reference elsewhere in my app.
The first method is called quite often, and so is the second. However, I know that once the dictionary is built and saved, its contents won't ever change, no matter how often I call these methods, since the number of possible values is limited.
Question: given the fact that the second method essentially needs to be executed only once, should I add some lines that check if the file already exists, essentially adding code that needs to be executed, or can I just leave it as is, overwriting an existing file over and over again?
Should I care? I should add that I don't seem to suffer from any performance issues, so this is more of a philosophical/hygienic question.
thanks
It depends.
You say
once the dictionary is built and saved, its contents won't ever change
until they do :-)
If your app is not suffering from any performance issues in this particular loop, I wouldn't try to cache. Unless you somehow remember that the file is written only once, you are storing up a bug for later.
This could be mitigated by giving the method an intention-revealing name, e.g.
-(void)storeSubjectsOnceOnlyPerLaunch:(NSDictionary*)subjects
If I got back the time I've spent tracking down bugs caused by caching, I would have several days of my life back.
Your solution is totally over-engineered, and has tons of potential to go wrong. What if the user's drive is full? Does this file get backed up? Does it need backing up, or are you wasting the user's time backing it up? Can the write fail? Are you handling that? You are concentrating on entering and storing the data when you should be focusing on accessing it.
I'd have a readwrite property allEvents and a property eventAssociations, declared readonly in the interface, but readwrite in the implementation file.
The allEvents setter stores allEvents and sets _eventAssociations to nil.
The eventAssociations getter checks whether _eventAssociations is nil and recalculates it when needed. A simple and bullet-proof pattern.
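A minimal sketch of the pattern described above, assuming ARC. The class name EventStore, the recalculationCount counter, and the assumption that each event is an NSDictionary with "SubjectShort" and "Subject" keys are all hypothetical, chosen only to keep the example self-contained:

```objc
#import <Foundation/Foundation.h>

// Hypothetical class; EventStore and recalculationCount are illustrative names.
@interface EventStore : NSObject
@property (nonatomic, copy) NSArray *allEvents;
@property (nonatomic, strong, readonly) NSDictionary *eventAssociations;
// Demonstration only: counts how often the dictionary was rebuilt.
@property (nonatomic, readonly) NSUInteger recalculationCount;
@end

// Class extension (normally in the .m file): readwrite inside the implementation.
@interface EventStore ()
@property (nonatomic, strong, readwrite) NSDictionary *eventAssociations;
@end

@implementation EventStore

- (void)setAllEvents:(NSArray *)allEvents {
    _allEvents = [allEvents copy];
    _eventAssociations = nil;   // invalidate the cached dictionary
}

- (NSDictionary *)eventAssociations {
    if (_eventAssociations == nil) {
        // Rebuild only when the cache was invalidated.
        NSMutableDictionary *result = [NSMutableDictionary dictionary];
        for (NSDictionary *event in _allEvents) {
            // Assumed event shape: a dictionary with these two keys.
            NSString *shortName = event[@"SubjectShort"];
            NSString *subject = event[@"Subject"];
            if (shortName && subject) {
                result[shortName] = subject;
            }
        }
        _eventAssociations = [result copy];
        _recalculationCount++;
    }
    return _eventAssociations;
}

@end
```

Reading eventAssociations repeatedly costs one nil check after the first access, and assigning a new allEvents array is the only thing that triggers a rebuild, so there is no file on disk that can go stale.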
I need to allocate lots of NSString objects from cStrings (which come that way from a database), as fast as possible. cStringUsingEncoding and the like are just too slow: about 10-15 times slower than allocating a cString.
However, creating an NSString from an NSString gets pretty close to cString allocation (about 1.2s for 1M allocations). EDIT: Fixed alloc to use a copy of the string.
const char *n;
const char *s = "Office für iPad: Steve Ballmer macht Hoffnung";
NSString *str = [NSString stringWithUTF8String:s];
int len = strlen(s);
for (int i = 0; i < 10000000; i++) {
    NSString *s = [[NSString alloc] initWithString:[str copy]];
    s = s;
}
cString allocation test (also about 1s for 1M allocations):
for (int i = 0; i < 10000000; i++) {
    n = malloc(len);
    memccpy((void*)n, s, 0, len);
    n = n;
    free(n);
}
But as I said, using stringWithCString and the like is an order of magnitude slower. The fastest I could get was using initWithBytesNoCopy (about 8s, so 8 times slower than stringWithString):
NSString *so = [[NSString alloc] initWithBytesNoCopy:(void*)n length:len encoding:NSUTF8StringEncoding freeWhenDone:YES];
So, is there another magic way to make allocations from cStrings faster? I wouldn't even rule out subclassing NSString (and yes, I know it's a class cluster).
EDIT: In instruments I see that NSString's call to CFStringUsingByteStream3 is the root issue.
EDIT 2: According to Instruments, the root issue is __CFFromUTF8. Just looking at the source [1], this does indeed seem to be quite inefficient and to handle some legacy cases.
[1] https://www.opensource.apple.com/source/CF/CF-476.17/CFBuiltinConverters.c?txt
This seems to me to not be a fair test.
cString allocation test looks to be allocating a byte array and copying data. I can't tell for sure because the variable definitions are not included.
NSString *s = [[NSString alloc] initWithString:str]; is taking an existing NSString (data already in the correct format) and maybe just increments the retain count. Even if a copy is forced the data is still already in the correct encoding and just needs to be copied.
[NSString stringWithUTF8String:s]; has to handle the UTF8 encoding and convert from one encoding (UTF8) to the internal NSString/CFString encoding. The method being used (CFStreamUsingByteStream) has support for multiple encodings (UTF8/UTF16/UTF32/others). A specialized UTF8 only method could be faster but that leads to the question of is this really a performance problem or just an exercise.
You can see the source code for CFStringUsingByteStream3 in this file.
As per my comment, and Brian's answer, I think the problem here is that to create NSStrings you're having to parse the UTF-8 strings. So the question arises: do you really need to parse them, then?
If parsing-on-demand is an option then I'd suggest you write a proxy that can impersonate NSString with an interface along the lines of:
@interface BJLazyUTF8String : NSProxy
- (id)initWithBytes:(const char *)bytes length:(size_t)length;
@end
So it's not a subclass of NSString and it doesn't try to provide any real functionality. Inside the init just keep the bytes, e.g. as _bytes, doing whatever is correct for your C memory ownership. Then:
- (NSString *)bjRealString
{
    // we'd better create the NSString if we haven't already
    if(!_string)
        _string = [NSString stringWithUTF8String:_bytes];
    return _string;
}
- (void)forwardInvocation:(NSInvocation *)anInvocation
{
    // if this is invoked then someone is trying to
    // make a call to what they think is a string;
    // let's forward that call to a string so that
    // it does what they expect
    [anInvocation setTarget:[self bjRealString]];
    [anInvocation invoke];
}
- (NSMethodSignature *)methodSignatureForSelector:(SEL)aSelector
{
    return [[self bjRealString] methodSignatureForSelector:aSelector];
}
You can then do:
NSString *myString = [[BJLazyUTF8String alloc] initWithBytes:... length:...];
And subsequently treat myString exactly as though it were an NSString.
Microbenchmarks are a great distraction, but rarely useful. In this case, though, there is validity.
Assuming, for the moment, that you've actually measured string creation as being a real source of performance issues, then the real problem can be better expressed as "how do I reduce memory bandwidth?", because that is really where your problems lie: you are causing tons and tons of data to be copied into freshly allocated buffers.
As you've discovered, the fastest you can go is to not copy at all. initWithBytesNoCopy:... exists exactly to solve this case. Thus, you'll want to create a data construct that holds the original string buffer and manages all the NSString instances that point to it as one cohesive unit.
Without thinking it through in detail, you could likely encapsulate the raw buffer in an NSData instance, then use associated objects to create a strong reference from your string instances to that NSData instance. That way, the NSData (and associated memory) will be deallocated when the last string is deallocated.
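A rough sketch of that idea, not thought through any further than the paragraph above. The helper name StringOverBuffer and the key variable are hypothetical; initWithBytesNoCopy:length:encoding:freeWhenDone: and objc_setAssociatedObject are the real Foundation and runtime calls. Note that "no copy" is only a request: Foundation may still copy or convert the bytes (for example for very short strings or for non-ASCII UTF-8), in which case the association is harmless but unnecessary.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// Only the address of this variable is used, as the association key.
static const char kBackingDataKey;

// Hypothetical helper: creates a string over `range` of `buffer` and keeps
// `buffer` alive for as long as the returned string lives.
// Returns nil if the range is out of bounds or the bytes are not valid UTF-8.
static NSString *StringOverBuffer(NSData *buffer, NSRange range) {
    if (range.location > buffer.length ||
        range.length > buffer.length - range.location) {
        return nil;
    }
    const char *bytes = (const char *)buffer.bytes + range.location;
    // freeWhenDone:NO because the NSData owns the memory, not the string.
    NSString *string = [[NSString alloc] initWithBytesNoCopy:(void *)bytes
                                                      length:range.length
                                                    encoding:NSUTF8StringEncoding
                                                freeWhenDone:NO];
    if (string != nil) {
        // Strong reference from the string to its backing buffer.
        objc_setAssociatedObject(string, &kBackingDataKey, buffer,
                                 OBJC_ASSOCIATION_RETAIN_NONATOMIC);
    }
    return string;
}
```

A caller would wrap one database result buffer in a single NSData and carve every column value out of it with this helper, so the buffer goes away only after the last string that might point into it has been deallocated.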
With the additional detail that this is for a CoreData-esque ORM layer (and, no, I'm not going to suggest yer doin' it wrong because your description really does sound like you need that level of control), then it would seem that your ORM layer would be the ideal place to manage these strings as described above.
I'd also encourage you to investigate something like FMDB to see if it can provide both the encapsulation you need and the flexibility to add your additional features (and the hooks to make it fast).
Given the following Objective-C example, is it simply a matter of style and ease of reading to keep separate statements or to bundle them into one? Are there any actual benefits of either? Is it a waste of memory to declare individual variables?
NSDictionary *theDict = [anObject methodToCreateDictionary];
NSArray *theValues = [theDict allValues];
NSString *theResult = [theValues componentsJoinedByString:@" "];
or
NSString *theResult = [[[anObject methodToCreateDictionary] allValues] componentsJoinedByString:@" "];
I take the following into consideration when I declare a separate variable:
If I might want to see its value in the debugger.
If I am accessing the variable more than once.
If the line is too long.
There is no practical difference between the two approaches, however.
Also, you haven't asked directly about this, but be aware, when you access objects using dot notation, for example:
myObject.myObjectProperty1.myObjectProperty1Property;
If you are going to access myObjectProperty1Property more than once, it can be advisable to assign it to a local named variable. If you don't, the look-up will be executed more than once.
Now I can't emphasise enough: for many if not most situations this time saving is so infinitesimal as to seriously call into question whether it is even worth the extra typing for the assignment! So why am I raising this? Because, stylistic "anality" apart (I just made up a new word), if the section of code you are writing runs in a tight loop, it can be worth taking the extra care. An example would be the code which populates the cells of a UICollectionView that contains a large number of cells. Additionally, if you are using Core Data and using dot notation to refer to properties of NSManagedObject properties, there is far greater overhead with each and every look-up, in which case it is much more clearly worth taking the time to assign any values reached through "nested" dot notation to a local variable first.
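To make the repeated look-up visible, here is a small sketch, assuming ARC. Style, Theme, styleLookups and the two functions are hypothetical names invented for the illustration; the counter exists only so the number of getter calls can be observed:

```objc
#import <Foundation/Foundation.h>

// Hypothetical model classes, for illustration only.
@interface Style : NSObject
@property (nonatomic, copy) NSString *fontName;
@end

@implementation Style
@end

@interface Theme : NSObject
@property (nonatomic, strong) Style *style;
@property (nonatomic) NSUInteger styleLookups;   // counts calls to the style getter
@end

@implementation Theme
- (Style *)style {
    _styleLookups++;
    return _style;
}
@end

// Nested dot notation inside the loop: both getters run on every iteration.
static NSArray *LabelsWithRepeatedLookup(Theme *theme, NSUInteger count) {
    NSMutableArray *labels = [NSMutableArray arrayWithCapacity:count];
    for (NSUInteger i = 0; i < count; i++) {
        [labels addObject:theme.style.fontName];
    }
    return labels;
}

// The value is pulled into a local variable once, before the loop.
static NSArray *LabelsWithLocalVariable(Theme *theme, NSUInteger count) {
    NSString *fontName = theme.style.fontName;
    NSMutableArray *labels = [NSMutableArray arrayWithCapacity:count];
    for (NSUInteger i = 0; i < count; i++) {
        [labels addObject:fontName];
    }
    return labels;
}
```

Both functions return the same array, but the first calls the style getter once per iteration and the second calls it once in total. The local variable is only equivalent when the property cannot change while the loop runs.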
This works -- it does compile -- but I just wanted to check if it would be considered good practice or something to be avoided?
NSString *fileName = @"image";
fileName = [fileName stringByAppendingString:@".png"];
NSLog(@"TEST : %@", fileName);
OUTPUT: TEST : image.png
Might be better written with a temporary variable:
NSString *fileName = @"image";
NSString *tempName;
tempName = [fileName stringByAppendingString:@".png"];
NSLog(@"TEST : %@", tempName);
just curious.
Internally, compilers will normally break your code up into a representation called "Static Single Assignment" (SSA), where a given variable is only ever assigned one value and all statements are as simple as possible (compound elements are separated out into different lines). Your second example follows this approach.
Programmers do sometimes write like this. It is considered the clearest way of writing code since you can write all statements as basic tuples: A = B operator C. But it is normally considered too verbose for code that is "obvious", so it is an uncommon style (outside of situations where you're trying to make very cryptic code comprehensible).
Generally speaking, programmers will not be confused by your first example, and it is considered acceptable where you don't need the original fileName again. However, many Obj-C programmers encourage the following style:
NSString *fileName = [@"image" stringByAppendingString:@".png"];
NSLog(@"TEST : %@", fileName);
or even (depending on horizontal space on the line):
NSLog(@"TEST : %@", [@"image" stringByAppendingString:@".png"]);
i.e. if you only use a variable once, don't name it (just use it in place).
On a stylistic note though, if you were following the Static Single Assignment approach, you shouldn't use tempName as your variable name, since it doesn't explain the role of the variable; you'd instead use something like fileNameWithExtension. In a broader sense, I normally avoid using "temp" as a prefix since it is too easy to start naming everything "temp" (all local variables are temporary, so it has little meaning).
The first line is declaring an NSString literal. It has storage that lasts the lifetime of the process, so doesn't need to be released.
The call to stringByAppendingString returns an autoreleased NSString. That should not be released either, but will last until it gets to the next autorelease pool drain.
So assigning the result of the stringByAppendingString call back to the fileName pointer is perfectly fine in this case. In general, however, you should check what your object lifetimes are and handle them accordingly (e.g. if fileName had been declared as a string whose memory you own, you would need to release it, so using a temp would be necessary).
The other thing to check is whether you're doing anything with fileName after this snippet, e.g. holding on to it in an instance variable, in which case you will need to retain it.
The difference is merely whether you still need the reference to the literal string or not. From the memory management POV and the object creation POV it really shouldn't matter. One thing to keep in mind, though, is that the second example makes debugging slightly easier. My preferred version would look like this:
NSString *fileName = @"image";
NSString *tempName = [fileName stringByAppendingString:@".png"];
NSLog(@"TEST : %@", tempName);
But in the end this is just a matter of preference.
I think you're right, this is really down to preferred style.
Personally I like your first example: the code's not complicated, and the first version is concise and easier on the eyes. There's too much of the 'language' hiding what it's doing in the second example.
As noted memory management doesn't seem to be an issue in the examples.