Normally, we would create 1000000 NSObjects this way.
NSMutableArray* objs = [[NSMutableArray alloc] initWithCapacity:1000000];
for (int i = 0; i < 1000000; i++) {
    MyObject* o = [[MyObject alloc] init];
    o.str = [NSString stringWithFormat:@"%d", i];
    [objs addObject:o];
}
However, this can be very slow. Maybe a lot of the memory allocation should be merged, or there is some other trick to speed it up.
How do I allocate space for several thousand NSObjects more time-efficiently?
Addition:
Using malloc to grab one large block of memory and then initializing objects inside it does not work either: treating raw memory as an NSObject fails with a bad access. See below:
There is no short-cut to this problem. If you need to allocate 1 million objects then it will take time and possibly fail due to consuming too much memory.
Given that limitation, you need to think about a different solution. For example, if each of these objects represents some item, then have an ItemManager object that manages as many of these items as necessary. The manager class can then allocate memory in chunks rather than for individual items, which will perform much better and be more scalable.
However, given that you don't explain exactly what these objects represent, I cannot provide a more detailed alternative.
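The chunked manager suggested in this answer could be sketched in plain C (which also compiles inside an Objective-C file). Everything here is hypothetical: the Item fields, the chunk size of 4096, and the function names are assumptions for illustration, not part of any Apple API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical item type; stands in for whatever each object represents. */
typedef struct {
    int  id;
    char label[16];
} Item;

enum { ITEMS_PER_CHUNK = 4096 };  /* assumed chunk size */

/* One chunk holds many items obtained with a single allocation. */
typedef struct Chunk {
    struct Chunk *next;
    size_t        used;
    Item          items[ITEMS_PER_CHUNK];
} Chunk;

/* Hypothetical manager: hands out items and owns all of their memory. */
typedef struct {
    Chunk *head;        /* most recently allocated chunk */
    size_t chunkCount;  /* number of allocations made so far */
    size_t itemCount;   /* number of items handed out */
} ItemManager;

/* Returns a zeroed item, or NULL if memory runs out. */
static Item *ItemManagerNewItem(ItemManager *m) {
    assert(m != NULL);
    if (m->head == NULL || m->head->used == ITEMS_PER_CHUNK) {
        Chunk *c = calloc(1, sizeof *c);
        if (c == NULL) return NULL;
        c->next = m->head;
        m->head = c;
        m->chunkCount++;
    }
    m->itemCount++;
    return &m->head->items[m->head->used++];
}

/* Frees every chunk at once; all items handed out become invalid. */
static void ItemManagerDestroy(ItemManager *m) {
    while (m->head != NULL) {
        Chunk *next = m->head->next;
        free(m->head);
        m->head = next;
    }
    m->chunkCount = 0;
    m->itemCount = 0;
}

/* Usage: ItemManager m = {0}; Item *it = ItemManagerNewItem(&m); ... ItemManagerDestroy(&m); */
```

With these assumed sizes, one million items cost 245 allocations instead of one million. The trade-off is that items cannot be freed individually, only together with their manager.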
I need to allocate lots of NSString objects from cStrings (which come that way from a database), as fast as possible. cStringUsingEncoding and the like are just too slow: about 10-15 times slower than allocating a cString.
However, creating an NSString from an NSString gets pretty close to cString allocation (about 1.2s for 1M allocations). EDIT: Fixed alloc to use a copy of the string.
const char *n;
const char *s = "Office für iPad: Steve Ballmer macht Hoffnung";
NSString *str = [NSString stringWithUTF8String:s];
int len = strlen(s);
for (int i = 0; i<10000000; i++) {
    NSString *s = [[NSString alloc] initWithString:[str copy]];
    s = s;
}
cString allocation test (also about 1s for 1M allocations):
for (int i = 0; i<10000000; i++) {
    n = malloc(len);
    memccpy((void*)n, s, 0, len);
    n = n;
    free((void*)n); // cast needed because n is declared const char *
}
But as I said, using stringWithCString and the like is an order of magnitude slower. The fastest I could get was using initWithBytesNoCopy (about 8s, therefore 8 times slower compared to stringWithString):
NSString *so = [[NSString alloc] initWithBytesNoCopy:(void*)n length:len encoding:NSUTF8StringEncoding freeWhenDone:YES];
So, is there another magic way to make allocations from cStrings faster? I wouldn't even rule out subclassing NSString (and yes, I know it's a class cluster).
EDIT: In Instruments I see that NSString's call to CFStringUsingByteStream3 is the root issue.
EDIT 2: According to Instruments, the root issue is __CFFromUTF8. Just looking at the sources [1], this does indeed seem to be quite inefficient and to handle some legacy cases.
[1] https://www.opensource.apple.com/source/CF/CF-476.17/CFBuiltinConverters.c?txt
This seems to me to not be a fair test.
The cString allocation test appears to allocate a byte array and copy data into it. I can't tell for sure because the variable definitions are not included.
NSString *s = [[NSString alloc] initWithString:str]; takes an existing NSString (data already in the correct format) and may just increment the retain count. Even if a copy is forced, the data is already in the correct encoding and only needs to be copied.
[NSString stringWithUTF8String:s]; has to handle the UTF8 encoding and convert from one encoding (UTF8) to the internal NSString/CFString encoding. The method being used (CFStreamUsingByteStream) supports multiple encodings (UTF8/UTF16/UTF32/others). A specialized UTF8-only method could be faster, but that raises the question of whether this is really a performance problem or just an exercise.
You can see the source code for CFStringUsingByteStream3 in this file.
As per my comment, and Brian's answer, I think the problem here is that to create NSStrings you're having to parse the UTF-8 strings. So the question arises: do you really need to parse them, then?
If parsing-on-demand is an option then I'd suggest you write a proxy that can impersonate NSString with an interface along the lines of:
@interface BJLazyUTF8String: NSProxy
- (id)initWithBytes:(const char *)bytes length:(size_t)length;
@end
So it's not a subclass of NSString and it doesn't try to provide any real functionality. Inside the init just keep the bytes, e.g. as _bytes, doing whatever is correct for your C memory ownership. Then:
- (NSString *)bjRealString
{
    // we'd better create the NSString if we haven't already
    if(!_string)
        _string = [NSString stringWithUTF8String:_bytes];
    return _string;
}
- (void)forwardInvocation:(NSInvocation *)anInvocation
{
    // if this is invoked then someone is trying to
    // make a call to what they think is a string;
    // let's forward that call to a string so that
    // it does what they expect
    [anInvocation setTarget:[self bjRealString]];
    [anInvocation invoke];
}
- (NSMethodSignature *)methodSignatureForSelector:(SEL)aSelector
{
    return [[self bjRealString] methodSignatureForSelector:aSelector];
}
You can then do:
NSString *myString = [[BJLazyUTF8String alloc] initWithBytes:... length:...];
And subsequently treat myString exactly as though it were an NSString.
Microbenchmarks are a great distraction, but rarely useful. In this case, though, there is validity.
Assuming, for the moment, that you've actually measured string creation as being a real source of performance issues, then the real problem can be better expressed as "how do I reduce memory bandwidth?", because that is really where your problems lie: you are causing tons and tons of data to be copied into freshly allocated buffers.
As you've discovered, the fastest you can go is to not copy at all. initWithBytesNoCopy:... exists exactly to solve this case. Thus, you'll want to create a data construct that holds the original string buffer and manages all the NSString instances that point to it as one cohesive unit.
Without thinking it through in detail, you could likely encapsulate the raw buffer in an NSData instance, then use associated objects to create a strong reference from your string instances to that NSData instance. That way, the NSData (and associated memory) will be deallocated when the last string is deallocated.
With the additional detail that this is for a CoreData-esque ORM layer (and, no, I'm not going to suggest yer doin' it wrong because your description really does sound like you need that level of control), then it would seem that your ORM layer would be the ideal place to manage these strings as described above.
I'd also encourage you to investigate something like FMDB to see if it can provide both the encapsulation you need and the flexibility to add your additional features (and the hooks to make it fast).
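The "one buffer, many strings, no copying" arrangement described in this answer can be sketched in plain C. This is only an illustration of the ownership idea: SharedBuffer plays the role the answer gives to NSData, StringView plays the role of a string created with initWithBytesNoCopy:..., and all names here are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical owner of the raw bytes (the role NSData plays above). */
typedef struct {
    char  *bytes;
    size_t length;
    size_t refCount;
} SharedBuffer;

/* Hypothetical non-copying string: a window into a SharedBuffer. */
typedef struct {
    SharedBuffer *owner;
    const char   *start;
    size_t        length;
} StringView;

static int liveBuffers = 0;  /* demonstration only: buffers not yet freed */

/* Copies the source bytes once; every view created later shares them. */
static SharedBuffer *SharedBufferCreate(const char *src, size_t length) {
    SharedBuffer *b = malloc(sizeof *b);
    if (b == NULL) return NULL;
    b->bytes = malloc(length ? length : 1);
    if (b->bytes == NULL) { free(b); return NULL; }
    memcpy(b->bytes, src, length);
    b->length = length;
    b->refCount = 1;  /* reference held by the creator */
    liveBuffers++;
    return b;
}

/* Drops one reference; the bytes are freed with the last one. */
static void SharedBufferRelease(SharedBuffer *b) {
    assert(b != NULL && b->refCount > 0);
    if (--b->refCount == 0) {
        free(b->bytes);
        free(b);
        liveBuffers--;
    }
}

/* Makes a view of [offset, offset + length) without copying any bytes. */
static StringView StringViewMake(SharedBuffer *b, size_t offset, size_t length) {
    assert(b != NULL && offset <= b->length && length <= b->length - offset);
    b->refCount++;  /* the view keeps the buffer alive */
    StringView v = { b, b->bytes + offset, length };
    return v;
}

static void StringViewRelease(StringView *v) {
    SharedBufferRelease(v->owner);
    v->owner = NULL;
    v->start = NULL;
    v->length = 0;
}
```

The creator can release its own reference right after making the views; the buffer then lives exactly as long as the last view, which is the lifetime the answer is after.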
Possible Duplicate:
NSMutableArray initWithCapacity nuances
Objective-c NSArray init versus initWithCapacity:0
What is the difference between the following lines of code? What exactly are the advantages and disadvantages? Suppose that I will next perform 3 addObject operations.
NSMutableArray *array = [[NSMutableArray alloc] initWithCapacity: 5];
NSMutableArray *array = [[NSMutableArray alloc] init];
Functionally both statements are identical.
In the first case, you are giving the run time a hint that you will soon be adding five objects to the array so it can, if it likes, preallocate some space for them. That means that the first five addObject: invocations may be marginally faster using the first statement.
However, there's no guarantee that the run time does anything but ignore the hint. I never use initWithCapacity: myself. If I ever get to a situation where addObject: is a significant performance bottleneck, I might try it to see if things improve.
In general, with object-oriented languages and arrays of varying sizes, the difference lies in the overhead you incur and the page faults at the memory level.
To put it another way, imagine one array that reserves 5 slots in memory up front (like your first example) and a second that reserves no space. When an object is added to the first, there is already room in memory for it to drop into; the array that reserved nothing first has to request memory and only then add the object. This doesn't sound so bad at this scale, but it becomes more important as your arrays grow.
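A small C sketch can make the effect of reserving space visible by counting how often the storage has to grow. PtrArray and its functions are hypothetical, and the doubling policy is an assumption; how NSMutableArray actually grows (and whether it honours the hint at all) is not specified.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable array of pointers. */
typedef struct {
    void **slots;
    size_t count;
    size_t capacity;
    size_t reallocations;  /* how many times the storage had to grow */
} PtrArray;

/* Reserves room up front; returns 0 on success, -1 on allocation failure. */
static int PtrArrayInit(PtrArray *a, size_t capacityHint) {
    a->slots = capacityHint ? malloc(capacityHint * sizeof *a->slots) : NULL;
    if (capacityHint && a->slots == NULL) return -1;
    a->count = 0;
    a->capacity = capacityHint;
    a->reallocations = 0;
    return 0;
}

/* Appends one element, doubling the storage when it is full (assumed policy). */
static int PtrArrayAdd(PtrArray *a, void *obj) {
    assert(a != NULL);
    if (a->count == a->capacity) {
        size_t newCap = a->capacity ? a->capacity * 2 : 4;
        void **grown = realloc(a->slots, newCap * sizeof *grown);
        if (grown == NULL) return -1;
        a->slots = grown;
        a->capacity = newCap;
        a->reallocations++;
    }
    a->slots[a->count++] = obj;
    return 0;
}

static void PtrArrayFree(PtrArray *a) {
    free(a->slots);
    a->slots = NULL;
    a->count = a->capacity = 0;
}

/* Usage: PtrArray a; PtrArrayInit(&a, 5); PtrArrayAdd(&a, obj); ... PtrArrayFree(&a); */
```

In this sketch, three additions after a hint of 5 never touch the allocator again, while three additions with no hint grow the storage once. For an array this small the difference is negligible, which matches the advice above to measure before bothering.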
From Apple's documentation:
arrayWithCapacity:
Creates and returns an NSMutableArray object with enough allocated memory to initially hold a given number of objects.
...
Parameters: numItems, the initial capacity of the new array.
Return Value: A new NSMutableArray object with enough allocated memory to hold numItems objects.
I have a question regarding memory allocation for Objects in an array. I am looking to create an array of Objects, but at compile time, I have no way of knowing how many objects I will need, and thus don't want to reserve more memory than needed.
What I would like to do is allocate the memory as needed. The way I would like to do this is when the user clicks an "Add" button, the array is increased by one additional object and the needed memory for the new object is allocated.
In my novice understanding of Objective C (I was a professional programmer about 20 years ago, and have only recently begun to write code again) I have come up with the following code segments:
First, I declared my object:
NSObject *myObject[1000]; // a maximum number of objects.
Then, when the user clicks an Add button it runs a method with the allocation code: (note: the variable i starts out at a value of 1 and is increased each time the Add button is clicked)
++i;
myObject[i] = [[NSObject alloc] init];
Thus, I'm hoping to only allocate the memory for the objects actually needed, rather than all 1000 array objects immediately.
Am I interpreting this correctly? In other words, am I correct that the number of array elements stated in the declaration is the MAXIMUM possible number of array elements, not how much memory is allocated at that moment? If this is correct, then theoretically, the declaration:
NSObject *myObject[10000];
Wouldn't pull any more memory than the declaration:
NSObject *myObject[5];
Can someone confirm that I'm understanding this process correctly, or enlighten me if I've got this mixed up in my mind? :)
Thanks!
Why not use NSMutableArray? You can initWithCapacity or simply allocate with [NSMutableArray array]. It will grow and shrink as you add and remove objects. For example:
NSMutableArray *array = [NSMutableArray array];
NSObject *object = [[NSObject alloc] init];
[array addObject:object]; // array has one object
[array removeObjectAtIndex:0]; // array is back to 0 objects
// remember to relinquish ownership of your objects if you alloc them
// the NSMutable array was autoreleased but the NSObject was not
[object release];
Your understanding is mostly correct. When you do:
NSObject *myObject[1000];
You immediately allocate storage for 1000 pointers to NSObject instances. The NSObject instances themselves are not allocated until you do [[NSObject alloc] init].
However, doing NSObject *myObject[10000] will consume more space than doing NSObject *myObject[5], because 10,000 pointers certainly require more memory to represent than 5 pointers.
Remember that both things consume space: the pointer to the NSObject and the NSObject instance itself. In practice, the space consumed by an instance is usually larger than the 4 bytes (8 bytes on 64-bit platforms) consumed by the pointer that refers to it.
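The cost of the declaration itself can be checked with sizeof. The following plain C sketch uses a hypothetical opaque Object type in place of NSObject; only pointers to it are ever stored, so no object is created.

```c
#include <assert.h>
#include <stddef.h>

/* Opaque stand-in for an object type; only pointers to it appear below. */
typedef struct Object Object;

/* Bytes reserved by an array of elementCount object pointers. */
static size_t BytesForPointerArray(size_t elementCount) {
    return elementCount * sizeof(Object *);
}

/* What the compiler reports for the two declarations in the question. */
static size_t SizeOfFiveSlots(void)        { return sizeof(Object *[5]); }
static size_t SizeOfTenThousandSlots(void) { return sizeof(Object *[10000]); }

/* Usage: printf("%zu vs %zu\n", SizeOfFiveSlots(), SizeOfTenThousandSlots()); */
```

On a typical 64-bit platform, where a pointer takes 8 bytes, that is 40 bytes against 80,000 bytes, reserved whether or not any object is ever allocated.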
Anyhow, perhaps more importantly, there is a better way to manage dynamic object allocation in Cocoa. Use the built-in NSMutableArray class. Like:
NSMutableArray* objects = [[NSMutableArray alloc] init];
[objects addObject: [[[NSObject alloc] init] autorelease]];
This is a bit of a silly question, but if I want to add an object to an array I can do it with both NSMutableArray and NSArray, which should I use?
NSMutableArray * array1;
[array1 addObject:obj];
NSArray * array2;
array2 = [array2 arrayByAddingObject:obj];
Use NSMutableArray; that is what it is there for. If I were looking at code and saw NSArray, I would expect its collection to stay constant forever, whereas if I see NSMutableArray I know that the collection is destined to change.
It might not sound like much right now, but as your project grows and as you spend more time on it you will see the value of this eventually.
NSMutableArray is not threadsafe, while NSArray is. This could be a huge problem if you're multithreading.
NSMutableArray and NSArray are both built on CFArray, so their performance and complexity should be the same. Quoting the CFArray header:
The access time for a value in the array is guaranteed to be at worst O(lg N) for any implementation, current and future, but will often be O(1) (constant time). Linear search operations similarly have a worst case complexity of O(N*lg N), though typically the bounds will be tighter, and so on. Insertion or deletion operations will typically be linear in the number of values in the array, but may be O(N*lg N) clearly in the worst case in some implementations.
When deciding which is best to use:
NSMutableArray is primarily used for when you are building collections and you want to modify them. Think of it as dynamic.
NSArray is used for read-only information and is either:
used to populate an NSMutableArray, to perform modifications
used to temporarily store data that is not meant to be edited
What you are actually doing here:
NSArray * array2;
array2 = [array2 arrayByAddingObject:obj];
is creating a new NSArray and pointing array2 at the new array.
Under manual reference counting, this leaks the old array if you own it (for example, if it came from alloc/init), because nothing releases it before the pointer is overwritten.
If you still want to do this, you will need to clean up as follows:
NSArray *oldArray;
NSArray *newArray;
newArray = [oldArray arrayByAddingObject:obj];
[oldArray release];
But the best practice is to do the following:
NSMutableArray *mutableArray;
// Initialisation etc
[mutableArray addObject:obj];
An NSArray object manages an immutable array—that is, after you have created the array, you cannot add, remove, or replace objects. You can, however, modify individual elements themselves (if they support modification). The mutability of the collection does not affect the mutability of the objects inside the collection. You should use an immutable array if the array rarely changes, or changes wholesale.
An NSMutableArray object manages a mutable array, which allows the addition and deletion of entries, allocating memory as needed. For example, given an NSMutableArray object that contains just a single dog object, you can add another dog, or a cat, or any other object. You can also, as with an NSArray object, change the dog’s name—and in general, anything that you can do with an NSArray object you can do with an NSMutableArray object. You should use a mutable array if the array changes incrementally or is very large—as large collections take more time to initialize.
Even though the Q and the answers are very old, someone has to correct them.
What does "better" mean? Better at what? Your Q lacks information about what the problem is, and it is highly opinion-based. However, it has not been closed.
If you are talking about performance, you can measure it yourself. But remember Donald Knuth: "Premature optimization is the root of all evil".
If I take your Q seriously, "better" can mean runtime performance, memory footprint, or architecture. The first two are easy to check yourself, so no answer is needed.
From an architectural point of view, things become more complicated.
First of all, I have to mention that having an instance of NSArray does not mean that it is immutable. This is because in Cocoa the mutable variants of collections are subclasses of the immutable variants. Therefore an instance of NSMutableArray is an instance of NSArray, but obviously mutable.
One can say that this was not a good idea, especially when thinking about Barbara and Jeannette (the Liskov substitution principle), and there is a relation to the circle-ellipse problem, which is not easy to solve. However, it is what it is.
So only the docs can tell you whether a returned instance is immutable or not. Or you do a runtime check. For this reason, some people always do a -copy on every mutable collection.
However, mutability is another root of all evil. Therefore, if possible, always create an instance of NSArray as the final result. State in your docs whether an instance you return from a method (esp. a getter) is immutable or not, so that everyone knows whether they can rely on immutability. This prevents unexpected changes "behind the scenes". That is what matters, not 0.000000000003 sec of runtime or 130 bytes of memory.
This test gives the best answer:
Method 1:
NSTimeInterval start = [NSDate timeIntervalSinceReferenceDate];
NSMutableArray *mutableItems = [[NSMutableArray alloc] initWithCapacity:1000];
for (int i = 0; i < 10000; i++) {
    [mutableItems addObject:[NSDate date]];
}
NSTimeInterval end = [NSDate timeIntervalSinceReferenceDate];
NSLog(@"elapsed time = %g", (end - start) * 1000.0);
Method 2:
...
NSArray *items = [[[NSArray alloc] init] autorelease];
for (int i = 0; i < 10000; i++) {
    items = [items arrayByAddingObject:[NSDate date]];
}
...
Output:
Method 1: elapsed time = 0.011135 seconds.
Method 2: elapsed time = 9.712520 seconds.
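The gap between the two methods follows from how much data each one moves. The plain C sketch below counts element copies for the two append strategies; it is a model under stated assumptions (a doubling growth policy with a starting capacity of 16 for the mutable case), not a description of how Foundation implements either class.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Appends by building a brand-new array each time, as an immutable
   append has to. Returns the number of elements copied, or -1 on failure. */
static long long CopiesForImmutableAppend(size_t appends) {
    int *items = NULL;
    long long copied = 0;
    for (size_t n = 0; n < appends; n++) {
        int *bigger = malloc((n + 1) * sizeof *bigger);
        if (bigger == NULL) { free(items); return -1; }
        if (n > 0) memcpy(bigger, items, n * sizeof *bigger);
        copied += (long long)n;      /* all n existing elements are moved */
        bigger[n] = (int)n;
        free(items);
        items = bigger;
    }
    free(items);
    return copied;
}

/* Appends in place, doubling the storage when full (assumed policy). */
static long long CopiesForMutableAppend(size_t appends) {
    int *items = NULL;
    size_t capacity = 0;
    long long copied = 0;
    for (size_t n = 0; n < appends; n++) {
        if (n == capacity) {
            size_t newCap = capacity ? capacity * 2 : 16;
            int *grown = malloc(newCap * sizeof *grown);
            if (grown == NULL) { free(items); return -1; }
            if (n > 0) memcpy(grown, items, n * sizeof *grown);
            copied += (long long)n;  /* elements move only when storage grows */
            free(items);
            items = grown;
            capacity = newCap;
        }
        items[n] = (int)n;
    }
    free(items);
    return copied;
}

/* Usage: long long c = CopiesForImmutableAppend(10000); */
```

For 10,000 appends the immutable strategy copies 49,995,000 elements in this model, against 16,368 for in-place growth: quadratic versus roughly linear work.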
NSNumber* n = [[NSNumber alloc] initWithInt:100];
NSNumber* n1 = n;
In the code above, why is n's retainCount 2? In the second line of the code, I didn't use retain to increase the retainCount.
I found a strange situation. Actually the retainCount depends on the initial number:
NSNumber *n = [[NSNumber alloc] initWithInt:100];
// n has a retainCount of 1
NSNumber *n2 = [[NSNumber alloc] initWithInt:11];
// n has a retainCount of 2
Stop. Just stop. Never look at the retainCount of an object. Ever. It should never have been API and available. You're asking for pain.
There's too much going on for retainCount to be meaningful.
Based on this link here, it's possible that there's some optimization going on under the covers for common NSNumbers (which may not happen in all implementations, hence a possible reason why @dizy's retainCount is 1).
Basically, because NSNumbers are immutable, the underlying code is free to hand you the same instance of a number again, which would explain why the retain count is two.
What is the address of n and n1? I suspect they're the same.
NSNumber* n = [[NSNumber alloc] initWithInt:100];
NSLog(@"Count of n : %i",[n retainCount]);
NSNumber* n1 = n;
NSLog(@"Count of n : %i",[n retainCount]);
NSLog(@"Count of n1: %i",[n1 retainCount]);
NSLog(@"Address of n : %p", n);
NSLog(@"Address of n1: %p", n1);
Based on your update, that link I gave you is almost certainly the issue. Someone ran a test and found out that the NSNumbers from 0 to 12 will give you duplicates of those already created (they may in fact be created by the framework even before a user requests them). Others above 12 seemed to give a retain count of 1. Quoting:
From the little bit of examination I've been able to do, it looks as if you will get "shared" versions of integer NSNumbers for values in the range [0-12]. Anything larger than 12 gets you a unique instance even if the values are equal. Why twelve? No clue. I don't even know if that's a hard number or circumstantial.
Try it with 11, 12 and 13 - I think you'll find 13 is the first to give you a non-shared NSNumber.
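To see how a cache of shared instances would produce exactly these counts, here is a plain C model. It is purely illustrative: BoxedInt, the cache, and the cutoff of 12 (taken from the observation quoted above) are assumptions, and nothing here claims to be how Foundation actually implements NSNumber.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical boxed integer with a reference count. */
typedef struct {
    int    value;
    size_t refCount;
} BoxedInt;

enum { CACHE_MAX = 12 };  /* cutoff taken from the observation quoted above */

static BoxedInt *cache[CACHE_MAX + 1];

/* Returns a shared box for 0..CACHE_MAX, otherwise a fresh one.
   The caller owns one reference in either case. */
static BoxedInt *BoxedIntCreate(int value) {
    if (value >= 0 && value <= CACHE_MAX) {
        if (cache[value] == NULL) {
            cache[value] = malloc(sizeof *cache[value]);
            if (cache[value] == NULL) return NULL;
            cache[value]->value = value;
            cache[value]->refCount = 1;  /* reference held by the cache itself */
        }
        cache[value]->refCount++;        /* reference handed to the caller */
        return cache[value];
    }
    BoxedInt *fresh = malloc(sizeof *fresh);
    if (fresh == NULL) return NULL;
    fresh->value = value;
    fresh->refCount = 1;
    return fresh;
}

/* Drops one reference; cached boxes never reach zero because the
   cache keeps its own reference. */
static void BoxedIntRelease(BoxedInt *b) {
    assert(b != NULL && b->refCount > 0);
    if (--b->refCount == 0) free(b);
}

/* Usage: BoxedInt *n = BoxedIntCreate(11); ... BoxedIntRelease(n); */
```

In this model a freshly created 11 already reports a count of 2 while a freshly created 100 reports 1, even though the caller followed the same ownership rules both times. That is why the raw count says nothing about whether your own code is correct.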
Retain counts are an implementation detail. They can be kindasorta useful in debugging, sometimes, but in general you should not care about them. All you should care about is that you're following the memory management rules.
For an example of why looking at retain counts is unreliable, this is a perfectly legal class that obeys the API contract and will behave correctly in all circumstances:
@implementation CrazyClass
- (id)retain {
    for(int i=0; i<100; i++) {
        [super retain];
    }
    return self; // retain must return the receiver
}
- (void)release {
    for(int i=0; i<100; i++) {
        [super release];
    }
}
@end
…but if you inspected its retain count, you'd think you had an "issue."
This precise case doesn't happen too often in practice, but it illustrates why looking at retain counts is useless for telling if something is wrong. Objects do get retained behind the scenes by code outside of your control. NSNumber, for example, will sometimes cache instances. Objects get autoreleased, which isn't reflected in the retain count. Lots of things can happen that will confuse the retain count. Some classes might not even keep their retain counts where you can see them.
If you suspect you have a leak, you should check with the real debugging tools meant for that purpose, not by poking at retain counts. And for code you're writing, you should primarily be concerned with following the guidelines I linked above.
You should never rely on the retainCount of an object. You should only use it as a debugging aid, never for normal control flow.
Why? Because it doesn't take into account autoreleases. If an object is retained and subsequently autoreleased, its retainCount will increment, but as far as you're concerned, its real retain count hasn't been changed. The only way to get an object's real retain count is to also count how many times it's been added to any of the autorelease pools in the autorelease pool chain, and trying to do so is asking for trouble.
In this case, the retainCount is 2 because somewhere inside alloc or initWithInt:, the object is being retained and autoreleased. But you shouldn't need to know or care about that, it's an implementation detail.
I think you have something else going on...
NSNumber* n = [[NSNumber alloc] initWithInt:100];
NSNumber* n1 = n;
NSLog(@"n = %i",[n retainCount]);
Result is 1
There is, however, a fly in this ointment. I've seen crashes due to retain count overflows from NSNumber instances containing small integers. In large systems that run for a very long time you can exceed the max int and get an internal inconsistency exception:
NSInternalInconsistencyException: NSIncrementExtraRefCount() asked to increment too far for <NSIntNumber: 0x56531f7969f0> - 0