Objective-C: improve CIImage filter speed - objective-c

I wrote the following code to apply a Sepia filter to an image:
- (void)applySepiaFilter {
    // Save the previous image so it can be restored later
    NSData *buffer = [NSKeyedArchiver archivedDataWithRootObject:self.mainImage.image];
    [_images push:[NSKeyedUnarchiver unarchiveObjectWithData:buffer]];

    UIImage *u = self.mainImage.image;
    CIImage *image = [[CIImage alloc] initWithCGImage:u.CGImage];
    CIFilter *filter = [CIFilter filterWithName:@"CISepiaTone"
                                  keysAndValues:kCIInputImageKey, image,
                                                @"inputIntensity", @0.8, nil];
    CIImage *outputImage = [filter outputImage];
    self.mainImage.image = [self imageFromCIImage:outputImage];
}
- (UIImage *)imageFromCIImage:(CIImage *)ciImage {
    CIContext *ciContext = [CIContext contextWithOptions:nil];
    CGImageRef cgImage = [ciContext createCGImage:ciImage fromRect:[ciImage extent]];
    UIImage *image = [UIImage imageWithCGImage:cgImage];
    CGImageRelease(cgImage);
    return image;
}
When I run this code it seems to lag for 1-2 seconds. I heard that Core Image is faster than Core Graphics, but I am unimpressed with the rendering time. I was wondering if this would be faster to process in Core Graphics, or even in OpenCV (which is being used elsewhere in the project)? If not, is there any way I can optimize this code to run faster?

I can almost guarantee it will be slower in Core Graphics than using Core Image, depending on the size of the image. If the image is small, Core Graphics may be fine, but if you are doing a lot of processing, it will be much slower than rendering using the GPU.
Core Image is very fast, however, you have to be very conscious of what is going on. Most of the performance hit with Core Image is due to setting up of the context, and copying images to/from Core Image. In addition to just copying bytes, Core Image may be converting between image formats as well.
Your code is doing the following every time:
Creating a CIContext (slow).
Taking bytes from a CGImage and creating a CIImage.
Copying image data to the GPU (slow).
Processing the sepia filter (fast).
Copying the result image back to a CGImage (slow).
This is not a recipe for peak performance. Bytes from CGImage will typically live in CPU memory, but Core Image wants to use the GPU for its processing.
An excellent reference for performance considerations is provided in the Getting the Best Performance documentation for Core Image:
Don’t create a CIContext object every time you render.
Contexts store a lot of state information; it’s more efficient to reuse them.
Evaluate whether your app needs color management. Don't use it unless you need it. See Does Your App Need Color Management?.
Avoid Core Animation animations while rendering CIImage objects with a GPU context.
If you need to use both simultaneously, you can set up both to use the CPU.
Make sure images don’t exceed CPU and GPU limits. (iOS)
Use smaller images when possible.
Performance scales with the number of output pixels. You can have Core Image render into a smaller view, texture, or framebuffer, and allow Core Animation to upscale to display size.
Use Core Graphics or Image I/O functions to crop or downsample, such as CGImageCreateWithImageInRect or CGImageSourceCreateThumbnailAtIndex (a downsampling sketch follows this list).
The UIImageView class works best with static images.
If your app needs to get the best performance, use lower-level APIs.
Avoid unnecessary texture transfers between the CPU and GPU.
Render to a rectangle that is the same size as the source image before applying a contents scale factor.
Consider using simpler filters that can produce results similar to algorithmic filters.
For example, CIColorCube can produce output similar to CISepiaTone, and do so more efficiently.
Take advantage of the support for YUV images in iOS 6.0 and later.
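To illustrate the Image I/O suggestion above, here is a minimal sketch of downsampling before filtering; the helper name, the URL parameter, and the pixel limit are assumptions for the example, not part of the original code:

#import <ImageIO/ImageIO.h>

// Hypothetical helper: build a downsampled UIImage straight from disk with Image I/O,
// avoiding a full-size decode before the Core Image pass.
static UIImage *DownsampledImageAtURL(NSURL *url, CGFloat maxPixelSize) {
    CGImageSourceRef source = CGImageSourceCreateWithURL((__bridge CFURLRef)url, NULL);
    if (!source) return nil;
    NSDictionary *options = @{
        (id)kCGImageSourceCreateThumbnailFromImageAlways : @YES,
        (id)kCGImageSourceThumbnailMaxPixelSize : @(maxPixelSize),
        (id)kCGImageSourceCreateThumbnailWithTransform : @YES   // honor EXIF orientation
    };
    CGImageRef scaled = CGImageSourceCreateThumbnailAtIndex(source, 0, (__bridge CFDictionaryRef)options);
    CFRelease(source);
    if (!scaled) return nil;
    UIImage *image = [UIImage imageWithCGImage:scaled];
    CGImageRelease(scaled);
    return image;
}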
If you demand real-time processing performance, you will want to use an OpenGL view that CoreImage can render its output to, and read your image bytes directly into the GPU instead of pulling it from a CGImage. Using a GLKView, and overriding drawRect: is a fairly simple solution to get a view that Core Image can render directly to. Keeping data on the GPU is the best way to get peak performance out of Core Image.
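As a rough sketch of that setup (the frame variable and the choice to disable color management are assumptions, not taken from the question):

#import <GLKit/GLKit.h>

// Create a GL-backed Core Image context once and keep it around.
EAGLContext *eaglContext = [[EAGLContext alloc] initWithAPI:kEAGLRenderingAPIOpenGLES2];
GLKView *glView = [[GLKView alloc] initWithFrame:frame context:eaglContext]; // frame: your view's frame
CIContext *ciContext = [CIContext contextWithEAGLContext:eaglContext
                                                 options:@{kCIContextWorkingColorSpace : [NSNull null]}];
// Inside glkView:drawInRect: (or an overridden drawRect:), draw the filter output directly:
// [ciContext drawImage:filter.outputImage inRect:destRect fromRect:filter.outputImage.extent];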
Try to reuse as much as possible. Keep a CIContext around for subsequent renders (like the doc says). If you end up using an OpenGL view, these are also things you may want to re-use as much as possible.
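Applied to the code in the question, a minimal sketch of that reuse might look like the following; the property and method names are assumptions, and the context and filter are created only once rather than on every call:

// In the class extension (hypothetical property names):
// @property (nonatomic, strong) CIContext *ciContext;
// @property (nonatomic, strong) CIFilter *sepiaFilter;

- (void)applySepiaFilterReusingContext {
    if (!self.ciContext) {
        self.ciContext = [CIContext contextWithOptions:nil];   // created once, reused for every render
    }
    if (!self.sepiaFilter) {
        self.sepiaFilter = [CIFilter filterWithName:@"CISepiaTone"];
        [self.sepiaFilter setValue:@0.8 forKey:@"inputIntensity"];
    }
    CIImage *input = [[CIImage alloc] initWithCGImage:self.mainImage.image.CGImage];
    [self.sepiaFilter setValue:input forKey:kCIInputImageKey];

    CIImage *output = self.sepiaFilter.outputImage;
    CGImageRef cgImage = [self.ciContext createCGImage:output fromRect:output.extent];
    self.mainImage.image = [UIImage imageWithCGImage:cgImage];
    CGImageRelease(cgImage);
}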
You may also be able to get better performance by using software rendering, which avoids the copy to and from the GPU: [CIContext contextWithOptions:@{kCIContextUseSoftwareRenderer: @(YES)}]. However, this has performance limitations in the actual render, since a CPU render is usually slower than a GPU render.
So, you can choose your level of difficulty to get maximum performance. The best performance can be more challenging, but a few tweaks may get you to "acceptable" performance for your use case.

Related

Objective-C: Load part of an image file

I searched the Core Graphics API but did not find any way to load only a subset of the pixels in a given image.
I need to load a really big image into OpenGL at runtime (the requirement is that I can't resize it at compile time). The texture size is too big (> GL_MAX_TEXTURE_SIZE), so I subdivide it into smaller images so OpenGL doesn't complain.
Right now this is what I do to load the big image:
NSData *texData = [[NSData alloc] initWithContentsOfFile:textureFilePath];
UIImage *srcImage = [[UIImage alloc] initWithData:texData];
Then I use Core Graphics to subdivide the image with CGImageCreateWithImageInRect() ... and it's ready to be sent to OpenGL.
The problem is that on an iPod touch the app crashes because it takes too much memory after loading the big image. I would like to load only the pixels of interest without creating a huge memory peak, then release that memory and load the next chunk I need. Does anyone know if this is possible?

Performance when frequently drawing CGPaths

I am working on an iOS App that visualizes data as a line-graph. The graph is drawn as a CGPath in a fullscreen custom UIView and contains at most 320 data-points. The data is frequently updated and the graph needs to be redrawn accordingly – a refresh rate of 10/sec would be nice.
So far so easy. It seems however, that my approach takes a lot of CPU time. Refreshing the graph with 320 segments at 10 times per second results in 45% CPU load for the process on an iPhone 4S.
Maybe I underestimate the graphics-work under the hood, but to me the CPU load seems a lot for that task.
Below is my drawRect: method, which gets called each time a new set of data is ready. N holds the number of points and points is a CGPoint* array with the coordinates to draw.
- (void)drawRect:(CGRect)rect {
    CGContextRef context = UIGraphicsGetCurrentContext();

    // set attributes
    CGContextSetStrokeColorWithColor(context, [UIColor lightGrayColor].CGColor);
    CGContextSetLineWidth(context, 1.f);

    // create path
    CGMutablePathRef path = CGPathCreateMutable();
    CGPathAddLines(path, NULL, points, N+1);

    // stroke path
    CGContextAddPath(context, path);
    CGContextStrokePath(context);

    // clean up
    CGPathRelease(path);
}
I tried rendering the path to an offline CGContext first before adding it to the current layer as suggested here, but without any positive result. I also fiddled with an approach drawing to the CALayer directly but that too made no difference.
Any suggestions how to improve performance for this task? Or is the rendering simply more work for the CPU that I realize? Would OpenGL make any sense/difference?
Thanks /Andi
Update: I also tried using UIBezierPath instead of CGPath. This post here gives a nice explanation why that didn't help. Tweaking CGContextSetMiterLimit et al. also didn't bring great relief.
Update #2: I eventually switched to OpenGL. It was a steep and frustrating learning curve, but the performance boost is just incredible. However, CoreGraphics' anti-aliasing algorithms do a nicer job than what can be achieved with 4x-multisampling in OpenGL.
This post here gives a nice explanation why that didn't help.
It also explains why your drawRect: method is slow.
You're creating a CGPath object every time you draw. You don't need to do that; you only need to create a new CGPath object every time you modify the set of points. Move the creation of the CGPath to a new method that you call only when the set of points changes, and keep the CGPath object around between calls to that method. Have drawRect: simply retrieve it.
You already found that rendering is the most expensive thing you're doing, which is good: You can't make rendering faster, can you? Indeed, drawRect: should ideally do nothing but rendering, so your goal should be to drive the time spent rendering as close as possible to 100%—which means moving everything else, as much as possible, out of drawing code.
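A minimal sketch of that split might look like this; the _cachedPath ivar and the setter name are assumptions for illustration:

// Rebuild the path only when the data actually changes...
- (void)setPoints:(CGPoint *)newPoints count:(size_t)count {
    CGMutablePathRef path = CGPathCreateMutable();
    CGPathAddLines(path, NULL, newPoints, count);
    CGPathRelease(_cachedPath);     // _cachedPath is a CGPathRef instance variable
    _cachedPath = path;
    [self setNeedsDisplay];
}

// ...and let drawRect: do nothing but stroke the cached path.
- (void)drawRect:(CGRect)rect {
    CGContextRef context = UIGraphicsGetCurrentContext();
    CGContextSetStrokeColorWithColor(context, [UIColor lightGrayColor].CGColor);
    CGContextSetLineWidth(context, 1.f);
    CGContextAddPath(context, _cachedPath);
    CGContextStrokePath(context);
}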
Depending on how you make your path, it may be that drawing 300 separate paths is faster than one path with 300 points. The reason for this is that often the drawing algorithm will be looking to figure out overlapping lines and how to make the intersections look 'perfect' - when perhaps you only want the lines to opaquely overlap each other. Many overlap and intersection algorithms are N**2 or so in complexity, so the speed of drawing scales with the square of the number of points in one path.
It depends on the exact options (some of them default) that you use. You need to try it.
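If you want to try the many-small-paths variant, a quick experimental sketch could look like the following (points, N, and context are the variables from the question; the loop structure itself is an assumption):

// Stroke each segment as its own tiny path instead of one long path,
// then compare the two approaches in the Time Profiler.
for (size_t i = 0; i < N; i++) {
    CGContextBeginPath(context);
    CGContextMoveToPoint(context, points[i].x, points[i].y);
    CGContextAddLineToPoint(context, points[i + 1].x, points[i + 1].y);
    CGContextStrokePath(context);
}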
tl;dr: You can set the drawsAsynchronously property of the underlying CALayer, and your CoreGraphics calls will use the GPU for rendering.
There is a way to control the rendering policy in CoreGraphics. By default, all CG calls are done via CPU rendering, which is fine for smaller operations, but is hugely inefficient for larger render jobs.
In that case, simply setting the drawsAsynchronously property of the underlying CALayer switches the Core Graphics rendering engine to a GPU-based (Metal) renderer and vastly improves performance. This is true on both macOS and iOS.
I ran a few performance comparisons (involving several different CG calls, including CGContextDrawRadialGradient, CGContextStrokePath, and CoreText rendering using CTFrameDraw), and for larger render targets there was a massive performance increase of over 10x.
As can be expected, as the render target shrinks the GPU advantage fades until at some point (generally for render target smaller than 100x100 or so pixels), the CPU actually achieves a higher framerate than the GPU. YMMV and of course this will depend on CPU/GPU architectures and such.
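Concretely, that is a one-line change on the backing layer of the view doing the drawing (on NSView, make sure the view is layer-backed first):

// Opt the layer into asynchronous, GPU-assisted drawing of Core Graphics content.
self.layer.drawsAsynchronously = YES;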
Have you tried using UIBezierPath instead? UIBezierPath uses CGPath under-the-hood, but it'd be interesting to see if performance differs for some subtle reason. From Apple's Documentation:
For creating paths in iOS, it is recommended that you use UIBezierPath
instead of CGPath functions unless you need some of the capabilities
that only Core Graphics provides, such as adding ellipses to paths.
For more on creating and rendering paths in UIKit, see “Drawing Shapes
Using Bezier Paths.”
I'd also try setting different properties on the CGContext, in particular different line join styles using CGContextSetLineJoin(), to see if that makes any difference.
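For example (the bevel join here is just an arbitrary choice to test against the default miter join):

CGContextSetLineJoin(context, kCGLineJoinBevel);   // or kCGLineJoinRound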
Have you profiled your code using the Time Profiler instrument in Instruments? That's probably the best way to find where the performance bottleneck is actually occurring, even when the bottleneck is somewhere inside the system frameworks.
I am no expert on this, but what I would suspect first is that the time is being spent updating 'points' rather than in the rendering itself. In that case, you could simply stop updating the points and repeatedly render the same path, and see whether it takes nearly the same CPU time. If not, you can improve performance by focusing on the updating algorithm.
If the rendering truly is the problem, I think OpenGL should certainly improve performance because, in theory, it renders all 320 lines at the same time.

Scaling of image (scriptable image processing system)

I want to scale images to 400x400 (I am creating thumbnails). I am using the Scriptable Image Processing System (SIPS) in a Cocoa application, but the problem is poor efficiency. SIPS takes 70-90% CPU while converting 300 images in 20 seconds. Should I use the CIImage class (CIImage is the type required to use the various GPU-optimized Core Image filters) or NSImage class? Can anyone suggest a better method?
A very simple and fast way to generate thumbnails on OS X is to use QLThumbnailImageCreate.
It's just one line of code so you can easily try out how it compares to SIPS & Core Image.
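A minimal sketch of that call (the fileURL variable and the 400x400 size are assumptions based on the question):

#import <QuickLook/QuickLook.h>

// Ask Quick Look for a thumbnail no larger than 400x400 for the file at fileURL.
CGImageRef thumb = QLThumbnailImageCreate(kCFAllocatorDefault,
                                          (__bridge CFURLRef)fileURL,
                                          CGSizeMake(400.0, 400.0),
                                          NULL);
if (thumb) {
    NSImage *thumbnail = [[NSImage alloc] initWithCGImage:thumb size:NSZeroSize];
    CGImageRelease(thumb);
    // ... use or save the thumbnail ...
}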
I tried thumbnail generation using NSImage, CIImage, and sips. All take about the same CPU usage (70-90%), but sips is the fastest.

UIImage drawInRect: is very slow; is there a faster way?

This does exactly what it needs to, except that it takes about 400 milliseconds, which is 350 milliseconds too much:
- (void)updateCompositeImage { // blends together the background and the sprites
    UIGraphicsBeginImageContext(CGSizeMake(480, 320));

    [bgImageView.image drawInRect:CGRectMake(0, 0, 480, 320)];
    for (int i = 0; i < numSprites; i++) {
        [spriteImage[spriteType[i]] drawInRect:spriteRect[i] blendMode:kCGBlendModeScreen alpha:spriteAlpha[i]];
    }

    compositeImageView.image = UIGraphicsGetImageFromCurrentImageContext();
    UIGraphicsEndImageContext();
}
The images are fairly small, and there are only three of them (the for loop only iterates twice)
Is there any way of doing this faster? While still being able to use kCGBlendModeScreen and alpha?
You can:
get the UIImages' CGImages
draw them into a CGBitmapContext
produce an image from that
Using Core Graphics by itself may be faster (a rough sketch follows at the end of this answer). The other bonus is that you can perform the rendering on a background thread. Also consider how you can optimize that loop, and profile using Instruments.
other considerations:
Can you reduce the interpolation quality?
Are the source images being resized in any way? (It can help to resize them up front.)
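Here is a rough sketch of the CGBitmapContext approach, using the same 480x320 size and the variables from the question; note that Core Graphics uses a bottom-left origin, so the sprite rects may need flipping:

// Build an offscreen bitmap context; this can run on a background queue.
CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
CGContextRef ctx = CGBitmapContextCreate(NULL, 480, 320, 8, 0, colorSpace,
                                         kCGImageAlphaPremultipliedLast);
CGColorSpaceRelease(colorSpace);

// Draw the background, then each sprite with screen blending and its own alpha.
CGContextDrawImage(ctx, CGRectMake(0, 0, 480, 320), bgImageView.image.CGImage);
CGContextSetBlendMode(ctx, kCGBlendModeScreen);
for (int i = 0; i < numSprites; i++) {
    CGContextSetAlpha(ctx, spriteAlpha[i]);
    CGContextDrawImage(ctx, spriteRect[i], spriteImage[spriteType[i]].CGImage);
}

CGImageRef composited = CGBitmapContextCreateImage(ctx);
UIImage *result = [UIImage imageWithCGImage:composited];
CGImageRelease(composited);
CGContextRelease(ctx);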
drawInRect: is slow. Period. Even with small images it's grossly inefficient.
If you are doing a lot of repeat drawing, have a look at CGLayer, which is designed to facilitate repeated rendering of the same bits.

NSImage vs. CIImage vs. CGImage?

When should I use each?
NSImage is an abstract data type that can represent many different types of images, as well as multiple representations of an image. It is often useful when the actual type of image is not important for what you're trying to do. It is also the only image class that AppKit will accept in its APIs (NSImageView and so forth).
CGImage can only represent bitmaps. If you need to get down and dirty with the actual bitmap data, CGImage is an appropriate type to use. The operations in Core Graphics, such as blend modes and masking, require CGImageRefs. CGImageRefs can be used to create NSBitmapImageReps, which can then be added to an NSImage.
I think the documentation describes a CIImage best:
Although a CIImage object has image data associated with it, it is not an image. You can think of a CIImage object as an image “recipe.” A CIImage object has all the information necessary to produce an image, but Core Image doesn’t actually render an image until it is told to do so. This “lazy evaluation” method allows Core Image to operate as efficiently as possible.
CIImages are the type required to use the various GPU-optimized Core Image filters that come with Mac OS X, but, like CGImageRefs, they can also be converted to NSBitmapImageReps.
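As a rough illustration of how the three types relate on OS X (the image name here is just a placeholder):

#import <Cocoa/Cocoa.h>
#import <QuartzCore/QuartzCore.h>

NSImage *nsImage = [NSImage imageNamed:@"photo"];       // abstract container, AppKit-friendly
CGImageRef cgImage = [nsImage CGImageForProposedRect:NULL context:nil hints:nil]; // concrete bitmap
CIImage *ciImage = [CIImage imageWithCGImage:cgImage];  // lazy "recipe" for Core Image filters

// Wrap the CIImage back into an NSImage via an image rep.
NSCIImageRep *rep = [NSCIImageRep imageRepWithCIImage:ciImage];
NSImage *wrapped = [[NSImage alloc] initWithSize:rep.size];
[wrapped addRepresentation:rep];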