I have a raw bitmap image of RGBA malloc'ed data; rows are naturally a multiple of 4 bytes. The data actually originates from an AVI (24-bit BGR format), but I convert it to 32-bit ARGB. That works out to about 8 MB of 32-bit data (1920x1080) per frame.
For each frame:
I convert that frame's data into an NSData object via -[NSData initWithBytes:length:].
I then convert that into a CIImage object via +[CIImage imageWithBitmapData:bytesPerRow:size:format:colorSpace:].
From that CIImage, I draw into my final NSOpenGLView context using drawImage:inRect:fromRect:. Due to the "mosaic" nature of the target images, there are approximately 15-20 such calls per frame with various source/destination rects.
Using a 30 Hz NSTimer that calls [self setNeedsDisplay:YES] on the NSOpenGLView, I can attain about 20-25 fps on a 2012 Mac mini (2.6 GHz i7); it's not rock solid at 30 Hz. That is to be expected with an NSTimer instead of a CVDisplayLink.
But... ignoring the NSTimer issue for now, are there any suggestions/pointers on making this frame-by-frame rendering a little more efficient?
Thanks!
NB: I would like to stick with CIImage objects as I'll want to access transition effects at some point.
Every frame, the call to NSData's initWithBytes:length: causes an 8 MB memory allocation and an 8 MB copy.
You can get rid of this per-frame allocation/copy by replacing the NSData object with a persistent NSMutableData object (set up once at the beginning), and using its mutableBytes as the destination buffer for the frame's 24- to 32-bit conversion.
(Alternatively, if you prefer to manage the destination-buffer memory yourself, keep the object as an NSData, but initialize it with initWithBytesNoCopy:length:freeWhenDone: and pass NO as the last parameter.)
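Here's a minimal sketch of that persistent-buffer setup, assuming ARC, a hypothetical ConvertBGR24ToARGB32() routine for the 24- to 32-bit conversion, and _frameData/_colorSpace ivars; the CIImage call mirrors the one from the question:
// One-time setup: allocate the destination buffer and color space once.
static const size_t kWidth = 1920, kHeight = 1080;
static const size_t kBytesPerRow = kWidth * 4;            // already a multiple of 4 bytes
_frameData  = [[NSMutableData alloc] initWithLength:kBytesPerRow * kHeight];
_colorSpace = CGColorSpaceCreateDeviceRGB();

// Per frame: convert straight into the reused buffer -- no new allocation, no extra copy.
ConvertBGR24ToARGB32(aviFrameBytes, _frameData.mutableBytes, kWidth, kHeight); // hypothetical helper
CIImage *frame = [CIImage imageWithBitmapData:_frameData
                                  bytesPerRow:kBytesPerRow
                                         size:CGSizeMake(kWidth, kHeight)
                                       format:kCIFormatARGB8
                                   colorSpace:_colorSpace];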
I am implementing a custom control to present images, and that control uses an array that stores either UIImage or NSString objects. I want to implement the following mechanism: if memory usage gets high, the control writes some of the big UIImage objects to files, then replaces those UIImage objects with their corresponding file paths (NSString objects).
So the only question is: how do I measure the memory usage of a UIImage? Thanks!
Whenever your app's memory usage becomes a problem for iOS, the system calls the - (void)didReceiveMemoryWarning method, so that is where you should do the UIImage-to-file-path conversion.
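Here's a sketch of that idea, assuming the code lives in the view controller that owns the control (a plain view could observe UIApplicationDidReceiveMemoryWarningNotification instead); self.items (an NSMutableArray) and writeImageToCacheFile: are hypothetical names, and the decompressed footprint is approximated as bytesPerRow x height of the backing CGImage:
// Rough estimate of the memory a decompressed UIImage occupies.
static size_t ApproximateImageBytes(UIImage *image)
{
    CGImageRef cgImage = image.CGImage;
    return CGImageGetBytesPerRow(cgImage) * CGImageGetHeight(cgImage);
}

- (void)didReceiveMemoryWarning
{
    [super didReceiveMemoryWarning];
    // Swap big UIImages for their file paths, as described above.
    for (NSUInteger i = 0; i < self.items.count; i++) {
        id item = self.items[i];
        if ([item isKindOfClass:[UIImage class]] &&
            ApproximateImageBytes(item) > 1024 * 1024)           // > 1 MB; arbitrary threshold
        {
            NSString *path = [self writeImageToCacheFile:item];  // hypothetical helper
            if (path) [self.items replaceObjectAtIndex:i withObject:path];
        }
    }
}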
I have my own image downloader class; it holds a queue and downloads images one (or a certain amount) at a time, writes them to the cache folder and retrieves them from the cache folder when necessary. I also have a UIImageView subclass to which I can pass a URL; through the image downloader class it checks whether the image already exists on the device and shows it if it does, or downloads and shows it after it finishes.
After an image finishes downloading I do the following. I create a UIImage from the downloaded NSData, save the downloaded NSData to disk and return the UIImage.
// This is executed in a background thread
downloadedImage = [UIImage imageWithData:downloadedData];
BOOL saved = [fileManager createFileAtPath:filePath contents:downloadedData attributes:attributes];
// Send downloadedImage to the main thread and do something with it
To retrieve an existing image I do this.
// This is executed in a background thread
if ([fileManager fileExistsAtPath:filePath])
{
    NSData *imageData = [fileManager contentsAtPath:filePath];
    retrievedImage = [UIImage imageWithData:imageData];
    // Send retrievedImage to the main thread and do something with it
}
As you can see, I always create the UIImage directly from the downloaded NSData; I never create NSData using UIImagePNGRepresentation, so the image never gets re-encoded. When you create a UIImage from compressed NSData, UIImage decompresses it right before rendering on the main thread and thus blocks the UI. Since I now have a UITableView with a ton of small images in it that have to be downloaded or retrieved from disk, this would be unacceptable, as it would slow down my scrolling immensely.
Now my problem. The user is also able to select a photo from the camera roll, save it and it also has to appear in my UITableView. But I can't seem to find a way to turn the UIImage from the camera roll into NSData without using UIImagePNGRepresentation. So here's my question.
How can I convert a UIImage into uncompressed NSData so I can convert it back to a UIImage later using imageWithData so that it doesn't have to be decompressed before rendering?
or
Is there any way I can do the decompression before sending the UIImage to the main thread and cache it so it only has to be decompressed once?
Thanks in advance.
How can I convert a UIImage into uncompressed NSData so I can convert it back to a UIImage later using imageWithData so that it doesn't have to be decompressed before rendering?
What you're really asking here, I take it, is how to store the UIImage on disk in such a way that you can later read the UIImage from disk as fast as possible. You don't really care whether it is stored as NSData; you just want to be able to read it quickly. I suggest you use the ImageIO framework. Save by way of an image destination and fetch later by way of an image source.
http://developer.apple.com/library/ios/#documentation/GraphicsImaging/Conceptual/ImageIOGuide/ikpg_dest/ikpg_dest.html
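Here's a sketch of that round trip with ImageIO, assuming ARC and that image and imageURL are yours to supply:
#import <ImageIO/ImageIO.h>
#import <MobileCoreServices/MobileCoreServices.h>   // for kUTTypePNG

// Save: push the CGImage through an image destination.
CGImageDestinationRef destination =
    CGImageDestinationCreateWithURL((__bridge CFURLRef)imageURL, kUTTypePNG, 1, NULL);
CGImageDestinationAddImage(destination, image.CGImage, NULL);
CGImageDestinationFinalize(destination);
CFRelease(destination);

// Fetch later: read it back through an image source.
CGImageSourceRef source = CGImageSourceCreateWithURL((__bridge CFURLRef)imageURL, NULL);
CGImageRef cgImage = CGImageSourceCreateImageAtIndex(source, 0, NULL);
UIImage *restored = [UIImage imageWithCGImage:cgImage];
CGImageRelease(cgImage);
CFRelease(source);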
Is there any way I can do the decompression before sending the UIImage to the main thread and cache it so it only has to be decompressed once?
Yes, good question. That was going to be my second suggestion: use threading. This is what people have to do with tables all the time. When the table asks for the image, you either have the image already or you don't. If you don't, you supply a filler image and, in the background, fetch the real image. When the real image is ready, you have arranged to get a notification. Back on the main thread, you tell the table view to ask for the data for that row again; this time you've got the image and you supply it. The user will thus see a slight delay before the image appears. I'm sure you've seen lots of apps that behave this way (New York Times is a good example).
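A minimal sketch of that filler-then-replace flow in a table view data source; self.imageCache and self.imageURLs are hypothetical properties, and the background fetch is reduced to a plain dataWithContentsOfURL: for brevity:
- (UITableViewCell *)tableView:(UITableView *)tableView
         cellForRowAtIndexPath:(NSIndexPath *)indexPath
{
    UITableViewCell *cell = [tableView dequeueReusableCellWithIdentifier:@"Cell"
                                                            forIndexPath:indexPath];
    NSURL *url = self.imageURLs[indexPath.row];
    UIImage *cached = self.imageCache[url];
    if (cached) {
        cell.imageView.image = cached;                               // already have it
    } else {
        cell.imageView.image = [UIImage imageNamed:@"placeholder"];  // filler image
        dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
            UIImage *real = [UIImage imageWithData:[NSData dataWithContentsOfURL:url]];
            dispatch_async(dispatch_get_main_queue(), ^{
                if (real) {
                    self.imageCache[url] = real;
                    // Ask the table for this row again, now that the image is ready.
                    [tableView reloadRowsAtIndexPaths:@[indexPath]
                                     withRowAnimation:UITableViewRowAnimationNone];
                }
            });
        });
    }
    return cell;
}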
I have one further suggestion, and it may be the best of all. You speak of it taking time to decompress the image from disk. But this should take almost no time if the image is small, and the image should be small, because it's going into a small place: a table cell. In other words, you should shrink the images beforehand, when you first receive them, so that you are ready with the small version of each image when asked. It is a huge waste of time and memory to supply a large image that is going into a small space.
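ImageIO can also do that shrinking for you when the file is read; here's a sketch, assuming fileURL points at the downloaded file and cellPixelSize is the largest dimension a cell needs:
CGImageSourceRef source = CGImageSourceCreateWithURL((__bridge CFURLRef)fileURL, NULL);
NSDictionary *options = @{
    (id)kCGImageSourceCreateThumbnailFromImageAlways : @YES,
    (id)kCGImageSourceThumbnailMaxPixelSize          : @(cellPixelSize),
    (id)kCGImageSourceCreateThumbnailWithTransform   : @YES   // honor EXIF orientation
};
CGImageRef thumbnail =
    CGImageSourceCreateThumbnailAtIndex(source, 0, (__bridge CFDictionaryRef)options);
UIImage *smallImage = [UIImage imageWithCGImage:thumbnail];    // ready for the table cell
CGImageRelease(thumbnail);
CFRelease(source);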
ADDED LATER: Of course you do understand that a lot of this worry would be unnecessary if you weren't saving the images to disk. I'm not at all clear on why you need to do that. I hope you have a good reason for it; but it's a heck of a lot faster, obviously, if you just hold the images ready in memory.
I found a solution:
CGImageRef downloadedImageRef = downloadedImage.CGImage;
CGDataProviderRef provider = CGImageGetDataProvider(downloadedImageRef);
NSData *data = CFBridgingRelease(CGDataProviderCopyData(provider));
// Then you can save the data
If you download the data and save it to disk, then the data is compressed in either PNG, JPEG, or GIF format; you are not going to be downloading uncompressed image data. So the root of your question about doing the decompression first needs to be addressed before you save the file to disk. Decompressing before you save will make the file a lot bigger, but it means that decompression is not needed before the data is read back into a CGImageRef or UIImage. It is the loading and then decompressing of a bunch of images that is slowing down your CPU and making scrolling slow.

But it is not a solution to simply hold everything in memory already decompressed, because that will use up all your app memory and crash your phone before long. You might be able to get away with it for some small number of images, but this is a basic design flaw that you need to address when first writing your code.

If you like, you can have a look at my blog post on this topic, video-and-memory-usage-on-ios-devices; the post deals with video, but you have the exact same issue when dealing with lots of different images. I would suggest that you write your small images to disk in an uncompressed format like TIFF or BMP; that way, reading them back in is easy, as long as ImageIO supports that specific format.
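A sketch of writing a small, already-decompressed image out as TIFF with ImageIO; tiffURL and smallImage are yours to supply, and treating TIFF compression value 1 as "no compression" is my assumption, so verify the output against your own files:
#import <ImageIO/ImageIO.h>
#import <MobileCoreServices/MobileCoreServices.h>   // for kUTTypeTIFF

// Assumption: an uncompressed container is requested through the TIFF
// compression property (1 = no compression in the TIFF spec).
NSDictionary *properties = @{
    (id)kCGImagePropertyTIFFDictionary : @{ (id)kCGImagePropertyTIFFCompression : @1 }
};
CGImageDestinationRef destination =
    CGImageDestinationCreateWithURL((__bridge CFURLRef)tiffURL, kUTTypeTIFF, 1, NULL);
CGImageDestinationAddImage(destination, smallImage.CGImage, (__bridge CFDictionaryRef)properties);
CGImageDestinationFinalize(destination);
CFRelease(destination);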
My class is rendering images offscreen. I thought reusing the CGContext instead of creating the same context again and again for every image would be a good thing. I set a member variable _imageContext so I would only have to create a new context if _imageContext is nil like so:
if(!_imageContext)
_imageContext = [self contextOfSize:imageSize];
instead of:
CGContextRef imageContext = [self contextOfSize:imageSize];
Of course I do not release the CGContext anymore.
These are the only changes I made, and it turns out that reusing the context slowed rendering down from about 10 ms to 60 ms. Have I missed something? Do I have to clear the context before drawing into it again? Or is recreating the context for each image the correct approach after all?
EDIT
I found the weirdest connection...
While I was searching for the reason the app's memory usage increases dramatically once it starts rendering images, I found that the problem was where I assign the rendered image to an NSImageView.
imageView.image = nil;
imageView.image = [[NSImage alloc] initWithCGImage:_imageRef size:size];
It looks like ARC is not releasing the previous NSImage. My first way to avoid that was to draw the new image into the old one:
[imageView.image lockFocus];
[[[NSImage alloc] initWithCGImage:_imageRef size:size] drawInRect:NSMakeRect(0, 0, size.width, size.height) fromRect:NSZeroRect operation:NSCompositeSourceOver fraction:1.0];
[imageView.image unlockFocus];
[imageView setNeedsDisplay];
The memory problem was gone, but look what happened to the CGContext-reuse timings:
Not reusing the context now takes 20 ms instead of 10 ms; of course, drawing into an image takes longer than just setting it.
Reusing the context also takes 20 ms instead of 60 ms. But why? I don't see how there could be any connection, yet I can reproduce the old behavior, where reusing takes more time, just by setting the NSImageView's image instead of drawing into it.
I investigated this, and I observe the same slowdown. Looking with Instruments set to sample kernel calls as well as userland calls shows the culprit. @RyanArtecona's comment was on the right track. I focused Instruments on the bottommost userland call, CGSColorMaskCopyARGB8888_sse, in two test runs (one reusing contexts, the other making a new one every time), and then inverted the resulting call tree. In the case where the context is not reused, the heaviest kernel trace is:
Running Time Self Symbol Name
668.0ms 32.3% 668.0 __bzero
668.0ms 32.3% 0.0 vm_fault
668.0ms 32.3% 0.0 user_trap
668.0ms 32.3% 0.0 CGSColorMaskCopyARGB8888_sse
This is the kernel zeroing out pages of memory that are being faulted in by virtue of CGSColorMaskCopyARGB8888_sse accessing them. What this means is that the CGContext maps VM pages to back the bitmap context but the kernel doesn't actually do the work associated with that operation until someone actually accesses that memory. The actual mapping/fault happens on first access.
Now let's look at the heaviest kernel trace when we DO reuse the context:
Running Time Self Symbol Name
1327.0ms 35.0% 1327.0 bcopy
1327.0ms 35.0% 0.0 user_trap
1327.0ms 35.0% 0.0 CGSColorMaskCopyARGB8888_sse
This is the kernel copying pages. My money would be on this being the underlying copy-on-write mechanism that delivers the behavior @RyanArtecona was talking about in his comment:
In the Apple docs for CGBitmapContextCreateImage, it says the actual bit-copying operation doesn't happen until more drawing is done on the original context.
In the contrived case I used to test, the non-reuse case took 3392ms to execute and the reuse case took 4693ms (significantly slower). Considering just the single heaviest trace from each case, the kernel trace indicates that we spend 668.0ms zero filling new pages on the first access, and 1327.0ms writing into the copy-on-write pages on the first write after the image gets a reference to those pages. This is a difference of 659ms. This one difference alone accounts for ~50% of the gap between the two cases.
So, to distill it down a little, the non-reused context is faster because when you create the context it knows the pages are empty, and there's no one else with a reference to those pages to force them to be copied when you write to them. When you reuse the context, the pages are referenced by someone else (the image you created) and must be copied on the first write, so as to preserve the state of the image when the state of the context changes.
You could further explore what's going on here by looking at the virtual memory map of the process as you step through in the debugger. vmmap is the helpful tool for that.
Practically speaking, you should probably just create a new CGContext every time.
To complement @ipmcc's excellent and thorough answer, here is an instructional overview.
In the Apple docs for CGBitmapContextCreateImage it is stated:
The CGImage object returned by this function is created by a copy operation. In some cases the copy operation actually follows copy-on-write semantics, so that the actual physical copy of the bits occur only if the underlying data in the bitmap graphics context is modified.
So, when this function is called, the image's underlying bits may not be copied right away; instead, they may wait to be copied when the bitmap context is next modified. This bit-copying may be expensive (depending on the size and colorspace of the context), and may disguise itself in an Instruments profile as part of whatever CGContext... drawing function gets called next on the context (when the bits are forced to copy). This is probably what is happening here with CGContextDrawImage.
However, the docs go on to say this:
As a consequence, you may want to use the resulting image and release it before you perform additional drawing into the bitmap graphics context. In this way, you can avoid the actual physical copy of the data.
This implies that if you will be finished using the in-memory created image (i.e. it has been saved to disk, sent over the network, etc.) by the time you need to do more drawing in the context, the image would never need to be physically copied at all!
TL;DR
If at some point you need to pull a CGImage out of a bitmap context, and you won't need to keep any references to it (including setting it as a UIImageView's image) before you do any more drawing in the context, then it is probably a good idea to use CGBitmapContextCreateImage. If not, your image will be physically copied at some point, which may take a while, and it may be better to just use a new context each time.
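Here's a sketch of that "use the snapshot and let it go before drawing again" pattern; WriteImageToDisk() and the width/height values are stand-ins:
CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
CGContextRef context = CGBitmapContextCreate(NULL, width, height, 8, 0, colorSpace,
                                             (CGBitmapInfo)kCGImageAlphaPremultipliedFirst);

// ... draw one offscreen image into `context` ...

CGImageRef snapshot = CGBitmapContextCreateImage(context);  // copy-on-write snapshot
WriteImageToDisk(snapshot);   // hypothetical: finish with the image here
CGImageRelease(snapshot);     // drop the last reference *before* the next draw,
                              // so the bits never have to be physically copied

// ... now it is safe to draw the next image into `context` ...

CGContextRelease(context);
CGColorSpaceRelease(colorSpace);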
I'm familiar with how to stream audio data from the ipod library using AVAssetReader, but I'm at a loss as to how to seek within the track. e.g. start playback at the halfway point, etc. Starting from the beginning and then sequentially getting successive samples is easy, but surely there must be a way to have random access?
AVAssetReader has a property, timeRange, which determines the time range of the asset from which media data will be read.
@property(nonatomic) CMTimeRange timeRange
The intersection of the value of this property and CMTimeRangeMake(kCMTimeZero, asset.duration) determines the time range of the asset from which media data will be read.
The default value is CMTimeRangeMake(kCMTimeZero, kCMTimePositiveInfinity). You cannot change the value of this property after reading has started.
So, if you want to seek to the middle of the track, you'd build a CMTimeRange that starts at asset.duration/2 and runs to the end of the asset, and set that as the timeRange on the AVAssetReader before you start reading.
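A sketch of that, assuming asset is the AVAsset you are reading from:
NSError *error = nil;
AVAssetReader *reader = [AVAssetReader assetReaderWithAsset:asset error:&error];

// Second half of the track: start at duration/2 and run to the end.
CMTime halfway = CMTimeMultiplyByFloat64(asset.duration, 0.5);
reader.timeRange = CMTimeRangeMake(halfway, kCMTimePositiveInfinity);

// ... add your AVAssetReaderOutput(s) here, then:
[reader startReading];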
AVAssetReader is amazingly slow when seeking. If you try to recreate an AVAssetReader to seek while the user is dragging a slider, your app will bring iOS to its knees.
Instead, you should use an AVAssetReader for fast forward only access to video frames, and then also use an AVPlayerItem and AVPlayerItemVideoOutput when the user wants to seek with a slider.
It would be nice if Apple combined AVAssetReader and AVPlayerItem / AVPlayerItemVideoOutput into a new class that was performant and was able to seek quickly.
Be aware that AVPlayerItemVideoOutput will not give back pixel buffers unless there is an AVPlayer attached to the AVPlayerItem. This is obviously a strange implementation detail, but it is what it is.
If you are using AVPlayer and AVPlayerLayer, then you can simply use the seek methods on AVPlayer itself. The above details are only important if you are doing custom rendering with the pixel buffers and/or need to send the pixel buffers to an AVAssetWriter.
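Here's a sketch of that slider-driven setup; asset, sliderValue (in 0..1), and the rendering step are left to you, and the player is kept in a property because the output only vends buffers while an AVPlayer stays attached to the item:
AVPlayerItem *item = [AVPlayerItem playerItemWithAsset:asset];
AVPlayerItemVideoOutput *output = [[AVPlayerItemVideoOutput alloc]
    initWithPixelBufferAttributes:@{ (id)kCVPixelBufferPixelFormatTypeKey :
                                         @(kCVPixelFormatType_32BGRA) }];
[item addOutput:output];
self.player = [AVPlayer playerWithPlayerItem:item];   // must stay attached to the item

// When the slider moves:
CMTime target = CMTimeMakeWithSeconds(sliderValue * CMTimeGetSeconds(asset.duration), 600);
[self.player seekToTime:target
        toleranceBefore:kCMTimeZero
         toleranceAfter:kCMTimeZero
      completionHandler:^(BOOL finished) {
          CMTime itemTime = item.currentTime;
          if (finished && [output hasNewPixelBufferForItemTime:itemTime]) {
              CVPixelBufferRef pixelBuffer =
                  [output copyPixelBufferForItemTime:itemTime itemTimeForDisplay:NULL];
              // ... render or hand off the buffer, then release it:
              if (pixelBuffer) CVPixelBufferRelease(pixelBuffer);
          }
      }];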
This is a multiple part question, mostly because my ignorance on the matter has multiple layers.
First, I put together a caching system for caching CGImageRef objects. I keep it at the CGImageRef level (rather than UIImage) as I am loading images on background threads. When an image is loaded, I put it into an NSMutableDictionary. I had to do a bit of arm twisting to get CGImageRefs into the dictionary:
//Bunch of stuff drawing into a context
CGImageRef imageRef = CGBitmapContextCreateImage(context);
CGContextRelease(context);
[(id)imageRef autorelease];
[self.cache setObject:(id)imageRef forKey:@"SomeKey"];
So, as you can see, I'm trying to treat the Image Ref as an NSObject, setting it to autorelease then placing it in the dictionary. My expectation is this will allow the image to be cleaned up after being removed from the dictionary. Now, I am beginning to have my doubts.
My application clears the cache when the user "restarts" to play with different images. Running the application in Instruments shows that the memory does not drop back to the "start" level on restart, but instead remains steady. My gut tells me that when all objects are removed from the dictionary, the CGImageRefs are not being released.
However, I'm unable to confirm this, as I don't quite know how to track down the actual source of the memory in Instruments. It's just a list of allocations (Malloc 16 Bytes, Malloc 32 Bytes, etc.), and drilling into them just shows a list of dyld callers. I'm not sure how to read it properly.
So, first question: is my way of caching CGImageRef objects completely flawed? And is there a better way to confirm such things in Instruments?
First of all, caching CGImages is OK and I don't see any problems with the code you posted.
Am I correctly assuming you use an NSMutableDictionary as the cache? If so, you can clear it by sending it -removeAllObjects, which should release all the keys and values. If you just set different images for the same keys, memory usage may remain roughly the same because you replace previous images with new ones. If the images have the same size, memory usage should be constant except brief spikes when you create a new batch of images.
As for Instruments, I've seen it both report false positives and miss real leaks. Try running it several times, pausing when possible so the Leaks instrument can "catch up". This sounds crazy, but I think it may make it a bit more reliable.
If all else fails, you can log the contents of the cache before and after loading a set of images to make sure the cache itself works as expected.
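For instance (a trivial sketch, assuming self.cache is that NSMutableDictionary):
// Called when the user "restarts"; every autoreleased CGImageRef the
// dictionary retained should be released here.
NSLog(@"cache before restart: %lu images", (unsigned long)self.cache.count);
[self.cache removeAllObjects];
NSLog(@"cache after restart: %lu images", (unsigned long)self.cache.count);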
Why not just cache UIImage objects? You can create them fine on a background thread.
It's UIImageView objects that you have to be more careful with, and even they are OK for most operations in the background.