Reducing peak memory consumption - objective-c

I generate bitmaps using the following code (simplified for brevity):
for (int frameIndex = 0; frameIndex < 90; frameIndex++) {
    UIGraphicsBeginImageContextWithOptions(CGSizeMake(130, 130), NO, 0);
    // Do some rendering on the context.
    // Save the current snapshot from the context.
    UIImage *snapshot = UIGraphicsGetImageFromCurrentImageContext();
    [self.snapshots addObject:snapshot];
    UIGraphicsEndImageContext();
}
So nothing non-trivial, but everything gets complicated when the operating system gives you only about 30 MB of memory for everything (in this particular case it is watchOS 2, but the question is not OS-specific) and kills the application's process as soon as the quota is exceeded.
The following graph from the Allocations instrument illustrates the issue:
It is the same graph with memory-consumption annotations at three points - before, during, and after the code above runs. As can be seen, about 5.7 MB of bitmaps remain in the end, which is an absolutely acceptable result. What is not acceptable is the consumption at the peak of the graph (44.6 MB) - all of that memory is taken by CoreUI: image data. Given that the work happens on a background thread, execution time is not that important.
So the questions: what is the right approach to reducing peak memory consumption (perhaps at the cost of a longer execution time) so the quota is not exceeded, and why does memory consumption keep growing even though UIGraphicsEndImageContext is called?
Update 1:
I think splitting the whole operation up with NSOperation, NSTimer, etc. would do the trick, but I am still trying to come up with a synchronous solution.
I tried to gather all the answers together and tested the following piece of code:
UIGraphicsBeginImageContextWithOptions(CGSizeMake(130, 130), NO, 0);
for (int frameIndex = 0; frameIndex < 45; frameIndex++) {
    // Do some rendering on the context.
    @autoreleasepool {
        UIImage *snapshot = UIGraphicsGetImageFromCurrentImageContext();
        [self.snapshots addObject:snapshot];
    }
    CGContextClearRect(UIGraphicsGetCurrentContext(), CGRectMake(0, 0, 130, 130));
}
for (int frameIndex = 0; frameIndex < 45; frameIndex++) {
    // Do some rendering on the context.
    @autoreleasepool {
        UIImage *snapshot = UIGraphicsGetImageFromCurrentImageContext();
        [self.snapshots addObject:snapshot];
    }
    CGContextClearRect(UIGraphicsGetCurrentContext(), CGRectMake(0, 0, 130, 130));
}
UIGraphicsEndImageContext();
What has changed:
Split 90 iterations into 2 parts of 45.
Moved the graphics context outside the loop and cleared it after each iteration instead of creating a new one each time.
Wrapped taking and storing the snapshots in an autorelease pool.
As a result, nothing changed; memory consumption remains at the same level.
Also, removing the taking and storing of snapshots entirely decreases memory consumption by only about 4 MB, i.e. less than 10%.
Update 2:
Doing the rendering on a timer every 3 seconds produces the following graph:
As you can see, memory is not freed (to be precise, not fully freed) even when the rendering is spread out over time intervals. Something tells me that the memory is not freed as long as the object that performs the rendering exists.
Update 3:
The problem has been solved by combining 3 approaches:
Splitting the whole rendering task into subtasks. For example, the 90 drawings are split into 6 subtasks of 15 drawings each (the number 15 was found empirically).
Executing all subtasks serially using dispatch_after, with a small interval after each (0.05 s in my case).
And the last and most important: to avoid the memory leak seen on the last graph, each subtask should be executed in the context of a new object. For example:
self.snapshots = [[SnapshotRender new] renderSnapshotsInRange:NSMakeRange(0, 15)];
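A minimal sketch of how the three approaches might fit together (the SnapshotRender class, its renderSnapshotsInRange: method, and the subtask bookkeeping are illustrative, not the exact original code):
- (void)renderSubtaskAtIndex:(NSUInteger)subtaskIndex {
    if (subtaskIndex >= 6) return; // all 90 frames rendered

    // A fresh object per subtask, so whatever the renderer holds on to
    // is released as soon as the object is deallocated.
    SnapshotRender *renderer = [SnapshotRender new];
    NSRange range = NSMakeRange(subtaskIndex * 15, 15);
    [self.snapshots addObjectsFromArray:[renderer renderSnapshotsInRange:range]];

    // Schedule the next subtask after a small interval (0.05 s).
    dispatch_after(dispatch_time(DISPATCH_TIME_NOW, (int64_t)(0.05 * NSEC_PER_SEC)),
                   dispatch_get_main_queue(), ^{
        [self renderSubtaskAtIndex:subtaskIndex + 1];
    });
}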
Thanks to everyone for answering, but @EmilioPelaez was closest to the right answer.

Corrected to match the frame count in the updated question.
The total byte size of the images could be 130 * 130 * 4 (bytes per pixel) * 90 = ~6MB.
I'm not too sure about the watch, but temporary memory might be building up. You could try wrapping the snapshot code in an @autoreleasepool block:
@autoreleasepool {
    UIImage *snapshot = UIGraphicsGetImageFromCurrentImageContext();
    [self.snapshots addObject:snapshot];
}

I think your problem is that you are doing all the snapshots in the same context (the context the for loop is in). I believe the memory is not released until that context ends, which is when the graph goes down.
I would suggest you reduce the scope of the context: instead of using a for loop to draw all frames, keep track of the progress with some ivars and draw just one frame per call; whenever you finish rendering a frame, call the function again with dispatch_after and update the variables. Even if the delay is 0, it allows the context to end and clean up the memory that is no longer being used.
PS. When I say context I don't mean a graphics context, I mean a certain scope in your code.
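A rough sketch of that suggestion, assuming a hypothetical drawNextFrame method and a _currentFrame ivar tracking progress (not code from the question):
- (void)drawNextFrame {
    if (_currentFrame >= 90) return; // done

    // Each invocation creates and ends its own short-lived context.
    UIGraphicsBeginImageContextWithOptions(CGSizeMake(130, 130), NO, 0);
    // ... render frame number _currentFrame ...
    UIImage *snapshot = UIGraphicsGetImageFromCurrentImageContext();
    UIGraphicsEndImageContext();
    [self.snapshots addObject:snapshot];
    _currentFrame++;

    // Even a zero delay returns control to the run loop, letting the
    // enclosing scope end and its temporary memory be reclaimed.
    dispatch_after(dispatch_time(DISPATCH_TIME_NOW, 0),
                   dispatch_get_main_queue(), ^{
        [self drawNextFrame];
    });
}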

Wait - image resolution plays a big role. For example, a one-megabyte JPEG image with dimensions of 5000 * 3000 px consumes about 60 MB of RAM (5000 * 3000 * 4 bytes = 60 MB), because images get decompressed into RAM. So let's troubleshoot: first, please tell us what image sizes (dimensions) you use.

Edit (after clarifications):
I think the proper way is not to store UIImage objects directly, but compressed NSData objects (e.g. created with UIImageJPEGRepresentation), and then, when needed, to convert them back to UIImage objects.
However, if you use many of them simultaneously, you are going to run out of memory quite rapidly.
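A minimal sketch of that idea, assuming a mutable array self.snapshotData for the compressed frames:
// Store a compressed representation instead of the decoded bitmap.
UIImage *snapshot = UIGraphicsGetImageFromCurrentImageContext();
NSData *jpegData = UIImageJPEGRepresentation(snapshot, 0.8); // 0.8 = quality
[self.snapshotData addObject:jpegData];

// Later, decompress a frame only when it is actually needed:
UIImage *restored = [UIImage imageWithData:self.snapshotData.firstObject];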
Original answer:
Actually, the total size can be higher than 10 MB (probably more than 20 MB) depending on the scale factor (as seen here). Note that UIGraphicsBeginImageContextWithOptions takes a scale parameter, in contrast to UIGraphicsBeginImageContext. So my guess is that this is somehow related to a screenshot, isn't it?
Also, the method UIGraphicsGetImageFromCurrentImageContext is thread-safe, so it might be returning a copy or using more memory. Then you have your 40 MB.
This answer states that iOS somehow stores images compressed when they are not displayed. That could be the reason for the 6 MB of usage afterwards.
My final guess is that the device detects a memory peak and then tries to save memory somehow. Since the images are not being used, it compresses them internally and recycles the memory.
So I wouldn't worry, because it looks like the system is taking care of it by itself. But you can also do as others have suggested: save each image to a file and don't keep it in memory if you're not going to use it.
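For the file-based variant, something along these lines (file names and paths are illustrative):
// Write each snapshot to a temporary file instead of keeping it in memory.
NSString *path = [NSTemporaryDirectory() stringByAppendingPathComponent:
                  [NSString stringWithFormat:@"frame-%d.png", frameIndex]];
[UIImagePNGRepresentation(snapshot) writeToFile:path atomically:YES];

// Reload on demand:
UIImage *restored = [UIImage imageWithContentsOfFile:path];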

Related

Multiple renders to single texture without blocking MTLCommandBuffer

I am trying to render 3 separate things to one texture in Metal.
I have an MTLTexture that is used as a destination in 3 different MTLCommandBuffers. I commit them one after another. Each MTLCommandBuffer renders to a separate part of the texture - the first draws into the 0 - 1/3 part, the second into the middle 1/3 - 2/3 part, and the last into the 2/3 - 1 part.
id<MTLTexture> dst_texture = ...;
id<MTLCommandBuffer> buffer1 = [self drawToTexture:dst_texture];
[buffer1 commit];
id<MTLCommandBuffer> buffer2 = [self drawToTexture:dst_texture];
[buffer2 commit];
id<MTLCommandBuffer> buffer3 = [self drawToTexture:dst_texture];
[buffer3 commit];
The problem is that it seems I can't share the destination texture between the different command buffers - I get glitches, and sometimes I can see only partial results on the destination texture...
Inside drawToTexture I use dst_texture this way:
_renderPassDescriptor.colorAttachments[0].texture = dst_texture;
_renderPassDescriptor.colorAttachments[0].loadAction = MTLLoadActionLoad;
The problem gets fixed when I call [buffer waitUntilCompleted] after each individual commit, but I suppose that affects performance, and I would love to have this working without blocking/waiting.
This works:
id<MTLTexture> dst_texture = ...;
id<MTLCommandBuffer> buffer1 = [self drawToTexture:dst_texture];
[buffer1 commit];
[buffer1 waitUntilCompleted];
id<MTLCommandBuffer> buffer2 = [self drawToTexture:dst_texture];
[buffer2 commit];
[buffer2 waitUntilCompleted];
id<MTLCommandBuffer> buffer3 = [self drawToTexture:dst_texture];
[buffer3 commit];
[buffer3 waitUntilCompleted];
What else could I do here to avoid the waitUntilCompleted calls?
To answer the questions "I am trying to render 3 separate things in one texture in Metal" and "What else could I do here to avoid waitUntilCompleted calls?" (Hamid has already explained why the problem occurs): you shouldn't be using multiple command buffers for basic rendering with multiple draw calls. If you're rendering to one texture, you need one command buffer, which you use to create one renderPassDescriptor with the texture attached. From that renderPassDescriptor you create one encoder, in which you encode all the draw calls, buffer changes, state changes, and so on. As I said in the comment: you bind the shaders, set the buffers, etc., then draw - and instead of calling endEncoding, you set shaders and buffers again and again, for as many draw calls and buffer changes as you want.
If you wanted to draw to multiple textures, you would typically create multiple renderPassDescriptors (but still use one command buffer). Generally, you use one command buffer per frame, or per set of offscreen render passes.
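A sketch of that structure - one command buffer, one encoder, several draw calls (the pipeline states and vertex buffers are placeholders):
id<MTLCommandBuffer> commandBuffer = [commandQueue commandBuffer];

MTLRenderPassDescriptor *passDesc = [MTLRenderPassDescriptor renderPassDescriptor];
passDesc.colorAttachments[0].texture = dst_texture;
passDesc.colorAttachments[0].loadAction = MTLLoadActionClear;
passDesc.colorAttachments[0].storeAction = MTLStoreActionStore;

id<MTLRenderCommandEncoder> encoder =
    [commandBuffer renderCommandEncoderWithDescriptor:passDesc];

// Draw call 1 (first third of the texture).
[encoder setRenderPipelineState:pipeline1];
[encoder setVertexBuffer:vertices1 offset:0 atIndex:0];
[encoder drawPrimitives:MTLPrimitiveTypeTriangle vertexStart:0 vertexCount:6];

// Draw call 2 (middle third) - rebind state, no endEncoding in between.
[encoder setRenderPipelineState:pipeline2];
[encoder setVertexBuffer:vertices2 offset:0 atIndex:0];
[encoder drawPrimitives:MTLPrimitiveTypeTriangle vertexStart:0 vertexCount:6];

// Draw call 3 (last third).
[encoder setRenderPipelineState:pipeline3];
[encoder setVertexBuffer:vertices3 offset:0 atIndex:0];
[encoder drawPrimitives:MTLPrimitiveTypeTriangle vertexStart:0 vertexCount:6];

[encoder endEncoding]; // end encoding once, after all draw calls
[commandBuffer commit];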
The manual synchronization is only required:
For Untracked Resources.
Across Multiple Devices.
Between a GPU and the CPU.
Between separate command queues.
Otherwise, Metal automatically synchronizes tracked resources between command buffers, even when they are running in parallel.
If a command buffer includes write or read operations on a given MTLTexture, you must ensure that these operations complete before reading or writing the MTLTexture contents. You can use the addCompletedHandler: method, waitUntilCompleted method, or custom semaphores to signal that a command buffer has completed execution.
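For example, a completion handler can chain the passes without blocking the CPU thread (a sketch, not the only option):
// The handler must be added before committing the buffer.
[buffer1 addCompletedHandler:^(id<MTLCommandBuffer> completed) {
    // buffer1 has finished on the GPU; dst_texture is safe to reuse here,
    // so the next pass can be encoded and committed without a CPU stall.
    id<MTLCommandBuffer> buffer2 = [self drawToTexture:dst_texture];
    [buffer2 commit];
}];
[buffer1 commit];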

Memory management issue with CIImage / CGImageRef

Good morning,
I am encountering a memory-management issue in the video-processing software I'm trying to write (video capture + (almost) real-time processing + display + recording).
The following code is part of the "..didOutputSampleBuffer.." method of an AVCaptureVideoDataOutputSampleBufferDelegate.
capturePreviewLayer is a CALayer.
ctx is a CIContext I reuse over and over.
outImage is a vImage_Buffer.
With the commented section kept commented, memory usage is stable and acceptable, but if I uncomment it, memory won't stop increasing. Note that if I leave the filtering operation commented and keep only the CIImage creation and the conversion back to CGImageRef, the problem remains (I mean: I don't think it is related to the filter itself).
If I run Xcode's Analyze, it points out a potential memory leak when this part is uncommented, but none when it is commented.
Does anybody have an idea how to explain and fix this?
Thank you very much!
Note: I prefer not to use AVCaptureVideoPreviewLayer and its filters property.
CGImageRef convertedImage = vImageCreateCGImageFromBuffer(&outImage, &outputFormat, NULL, NULL, 0, &err);
//CIImage *img = [CIImage imageWithCGImage:convertedImage];
////[acc setValue:img forKey:@"inputImage"];
////img = [acc valueForKey:@"outputImage"];
//convertedImage = [self.ctx createCGImage:img fromRect:img.extent];
dispatch_sync(dispatch_get_main_queue(), ^{
    self.capturePreviewLayer.contents = (__bridge id)(convertedImage);
});
CGImageRelease(convertedImage);
free(outImage.data);
Both vImageCreateCGImageFromBuffer() and -[CIContext createCGImage:fromRect:] give you a reference you are responsible for releasing. You are only releasing one of them.
When you replace the value of convertedImage with the new CGImageRef, you are losing the reference to the previous one without releasing it. You need to add another call to CGImageRelease(convertedImage) after your last use of the old image and before you lose that reference.
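In other words, the uncommented version needs to keep both references and release each one. A sketch based on the code above:
// +1 reference from vImageCreateCGImageFromBuffer().
CGImageRef convertedImage = vImageCreateCGImageFromBuffer(&outImage, &outputFormat,
                                                          NULL, NULL, 0, &err);
CIImage *img = [CIImage imageWithCGImage:convertedImage];
[acc setValue:img forKey:@"inputImage"];
img = [acc valueForKey:@"outputImage"];

// +1 reference from createCGImage:fromRect: - use a second variable
// so the first reference is not lost.
CGImageRef filteredImage = [self.ctx createCGImage:img fromRect:img.extent];
CGImageRelease(convertedImage); // release the vImage-created image

dispatch_sync(dispatch_get_main_queue(), ^{
    self.capturePreviewLayer.contents = (__bridge id)(filteredImage);
});

CGImageRelease(filteredImage); // release the CIContext-created image
free(outImage.data);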

Why does this ternary operation cause memory growth

The following line causes memory growth (no releases, only one malloc line in Instruments) when testing with the mark-generation feature of the Allocations instrument:
- (instancetype)initWithTitle:(NSString *)title andImageNamed:(NSString *)imageName andButtonProperties:(NOZSKButtonNodeProperties *)buttonProperties
{
    ...
    textNode.text = buttonProperties.buttonTitleIsUppercase ?
        [title uppercaseStringWithLocale:[NSLocale currentLocale]] : title;
    ...
}
Here is the code that calls it
NOZSKButtonNodeProperties *props = [self getThreeButtonProps];
...
props.buttonTitleIsUppercase = YES;
...
// This initializer calls the above initializer, passing nil for the imageName arg.
NOZSKButtonNode *btn = [[NOZSKButtonNode alloc] initWithTitle:@"Play Again" andButtonProperties:props];
-uppercaseStringWithLocale: makes an upper-case copy of the title string. This requires allocation. And although you didn't show this, I assume textNode both retains its text property and has a wider scope than the -initWithTitle:andImageNamed:andButtonProperties: method and therefore continues to live afterwards.
Without more complete code it is not possible to be sure, but here is a guess in case it helps.
It sounds like you might be chasing a ghost. Are you seeing increasing memory due to this code across different iterations of your run loop?
Explanation: Even with ARC some memory is placed in the "autorelease pool" rather than being immediately deallocated when no longer required. This is an unfortunate legacy of MRC. ARC is able to avoid some uses of the autorelease pool and deallocate memory quicker, but not all uses.
The autorelease pool is typically emptied once per iteration of your main run loop. If you allocate and release a lot of memory in response to a single event, say in a long loop, it can be worthwhile wrapping the offending code in an @autoreleasepool { ... } block, which creates a local autorelease pool and empties it on exit. This helps keep peak memory usage down - it doesn't deallocate any more memory overall, it just does the cleanup sooner.
When you altered your code and saw an apparent improvement you may just have reorganised your code in a way more amenable to ARC's optimisations which reduce use of the autorelease pool, and so memory is deallocated sooner.
You only need to be concerned if (a) memory is increasing across multiple events or (b) you are hitting too high a peak memory use. Under ARC, (a) should be rare, while (b) requires locating the source and wrapping it in an @autoreleasepool { ... } block.
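A sketch of case (b), with illustrative names (titles and results are not from the question):
// Without the pool, every temporary string would live until the run loop
// drains the thread's autorelease pool; with it, each iteration cleans up
// after itself, keeping peak memory low.
for (NSString *title in titles) {
    @autoreleasepool {
        NSString *upper = [title uppercaseStringWithLocale:[NSLocale currentLocale]];
        [results addObject:upper];
    }
}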
HTH
textNode.text = buttonProperties.buttonTitleIsUppercase ?
[title uppercaseStringWithLocale:[NSLocale currentLocale] ] : title;
When you write [title uppercaseStringWithLocale:], this method has to create a new instance in order to change the given string to upper case. This is going to increase the memory usage of your program because this new string has to be allocated.
Also, it would have helped to know whether or not you are using ARC, because I don't think this should be an issue with ARC.

memset error in iPad app

Hi, I'm currently building an iPad app. I was using memset() as below, but every time it runs I get a bad access error:
arrayPointer = malloc(sizeof(int) * size);
memset(arrayPointer, 0, sizeof(int)* size); //sets all the values in the array to 0
Cheers
You could use calloc(); it basically does the same as malloc() but also sets all bits to 0 in the allocated memory. It is also well suited to array initialization. For your example:
arrayPointer = calloc(sizeof(int), size);
EDIT: You should consider inspecting the returned pointer. NULL will be returned when the memory allocation fails.
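A sketch of that check:
arrayPointer = calloc(sizeof(int), size); // zero-initialized allocation
if (arrayPointer == NULL) {
    // Allocation failed - bail out instead of dereferencing NULL.
    NSLog(@"Failed to allocate %lu bytes", (unsigned long)(size * sizeof(int)));
    return;
}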

Optimize a view's drawing code

In a simple drawing application I have a model with an NSMutableArray curvedPaths holding all the lines the user has drawn. A line itself is also an NSMutableArray containing the point objects. As I draw curved NSBezierPaths, my point array has the following structure: linePoint, controlPoint, controlPoint, linePoint, controlPoint, controlPoint, etc. I thought having one array holding all the points plus control points would be more efficient than dealing with 2 or 3 different arrays.
Obviously my view draws the paths it gets from the model, which leads to the actual question: is there a way to optimize the following code (inside the view's drawRect: method) in terms of speed?
int lineCount = [[model curvedPaths] count];
// Go through paths
for (int i = 0; i < lineCount; i++)
{
    // Get the color
    NSColor *theColor = [model getColorOfPath:[[model curvedPaths] objectAtIndex:i]];
    // Get the points
    NSArray *thePoints = [model getPointsOfPath:[[model curvedPaths] objectAtIndex:i]];
    // Create a new path for performance reasons
    NSBezierPath *path = [[NSBezierPath alloc] init];
    // Set the color
    [theColor set];
    // Move to first point without drawing
    [path moveToPoint:[[thePoints objectAtIndex:0] myNSPoint]];
    int pointCount = [thePoints count] - 3;
    // Go through points
    for (int j = 0; j < pointCount; j += 3)
    {
        [path curveToPoint:[[thePoints objectAtIndex:j+3] myNSPoint]
             controlPoint1:[[thePoints objectAtIndex:j+1] myNSPoint]
             controlPoint2:[[thePoints objectAtIndex:j+2] myNSPoint]];
    }
    // Draw the path
    [path stroke];
    // Bye stuff
    [path release];
    [theColor release];
}
Thanks,
xonic
Hey xon1c, the code looks good. In general it is impossible to optimize without measuring performance in specific cases.
For example, let's say the code above is only ever called once. It draws a picture in a view and it never needs redrawing. Say the code above takes 50 milliseconds to run. You could rewrite it in OpenGL, do every optimisation under the sun, and get that time down to 20 milliseconds - and realistically, the 30 milliseconds you saved make no difference to anyone; you pretty much just wasted your time and added a load of code bloat that is going to be more difficult to maintain.
However, if the code above is called 50 times a second and most of those times it is drawing the same thing then you could meaningfully optimise it.
When it comes to drawing, the best way to optimise is to eliminate unnecessary drawing.
Each time you draw, you recreate the NSBezierPaths - are they always different? You may want to maintain the list of NSBezierPaths to draw, keep it in sync with your model, and keep drawRect: solely for drawing the paths (see the sketch after these suggestions).
Are you drawing to areas of your view that don't need redrawing? The argument to drawRect: is the area of the view that needs redrawing - you could test against that (or getRectsBeingDrawn:count:), though it may be that in your case you know the entire view needs to be redrawn.
If the paths themselves don't change often but the view needs redrawing often - e.g. when the shapes of the paths aren't changing but their positions are animating and they overlap in different ways - you could draw the paths to images (textures) and then, inside drawRect:, draw the texture to the view instead of the path. This can be faster because the texture is only created once and uploaded to video memory, which is faster to draw to the screen. You should look at Core Animation if this is what you need to do.
If you find that drawing the paths is too slow, you could look at CGPath.
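As a sketch of the caching idea mentioned above (the cachedPaths ivar and the rebuild method are hypothetical, not part of the question's code):
// Rebuild the NSBezierPath objects only when the model changes,
// not on every drawRect: call.
- (void)rebuildCachedPaths {
    [self.cachedPaths removeAllObjects];
    for (NSArray *points in [model curvedPaths]) {
        NSBezierPath *path = [NSBezierPath bezierPath];
        [path moveToPoint:[[points objectAtIndex:0] myNSPoint]];
        for (NSUInteger j = 0; j + 3 < [points count]; j += 3) {
            [path curveToPoint:[[points objectAtIndex:j + 3] myNSPoint]
                 controlPoint1:[[points objectAtIndex:j + 1] myNSPoint]
                 controlPoint2:[[points objectAtIndex:j + 2] myNSPoint]];
        }
        [self.cachedPaths addObject:path];
    }
}
// drawRect: then only sets the colour and strokes the paths in self.cachedPaths.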
So, on the whole, it really does depend on what you are doing. The best advice is, as ever, not to get sucked into premature optimisation. If your app isn't actually too slow for your users, your code is just fine.