Number of hardware draw calls in Core Animation - core-animation

I’m building a Metal app which renders a few objects to a view, similar to how Core Animation renders its CALayers. I soon realized that rendering each layer with a separate draw call can be expensive and inefficient when there are many objects, compared to drawing all objects from a single vertex buffer with a single draw call. However, the single buffer may then become large, and if only a few of the objects change frequently, the entire buffer must be sent to the GPU, which may also be inefficient.
So I’m wondering how Core Animation works in this regard. Does it render each layer with a separate draw call, unify all layers into a single draw call, or do something in between?
I tried to profile a Core Animation app with Instruments but the Metal draw calls don’t seem to be recorded, even though Core Animation is said to use Metal under the hood. Any insights?
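To make the trade-off in the question concrete, here is a minimal CPU-side sketch of the "something in between" option: all quads live in one shared vertex array, but only the ranges that changed are reported for upload. This is not how Core Animation is known to work, and nothing here is Metal API. The names `BatchedQuads`, `setQuad`, and `takeDirtyRanges` are made up for illustration, and the position-only vertex layout is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical vertex layout: position only.
struct Vertex { float x, y; };

// CPU-side model of one shared vertex buffer holding every object's quad.
// All names are illustrative; none of this is Core Animation's or Metal's API.
class BatchedQuads {
public:
    static constexpr std::size_t kVerticesPerQuad = 6; // two triangles

    explicit BatchedQuads(std::size_t quadCount)
        : vertices_(quadCount * kVerticesPerQuad), dirty_(quadCount, false) {}

    // Rewrite one quad's vertices and remember that it changed.
    void setQuad(std::size_t index, float x, float y, float w, float h) {
        assert(index < dirty_.size());
        Vertex* v = &vertices_[index * kVerticesPerQuad];
        v[0] = {x, y};     v[1] = {x + w, y};     v[2] = {x, y + h};
        v[3] = {x + w, y}; v[4] = {x + w, y + h}; v[5] = {x, y + h};
        dirty_[index] = true;
    }

    // Return (firstVertex, vertexCount) pairs covering the changed quads,
    // merging neighbouring quads into one range. Clears the dirty flags.
    std::vector<std::pair<std::size_t, std::size_t>> takeDirtyRanges() {
        std::vector<std::pair<std::size_t, std::size_t>> ranges;
        for (std::size_t i = 0; i < dirty_.size(); ++i) {
            if (!dirty_[i]) continue;
            const std::size_t first = i;
            while (i < dirty_.size() && dirty_[i]) { dirty_[i] = false; ++i; }
            ranges.emplace_back(first * kVerticesPerQuad,
                                (i - first) * kVerticesPerQuad);
        }
        return ranges;
    }

    std::size_t vertexCount() const { return vertices_.size(); }

private:
    std::vector<Vertex> vertices_;
    std::vector<bool> dirty_;
};
```

With 100 quads, changing quads 3, 4 and 10 yields two ranges (18 vertices onward for 12 vertices, and 60 onward for 6), so only 18 of the 600 vertices would need to be copied to the GPU while the draw itself could still be a single call.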

Related

How to improve MTKView rendering when using MPSImageScale and MTLBlitCommandEncoder

TL;DR: From within my MTKView's delegate drawInMTKView: method, part of my rendering pass involves adding an MPSImageBilinearScale performance shader and zero or more MTLBlitCommandEncoder requests for generateMipmapsForTexture. Is that a smart thing to do from within drawInMTKView:, which happens on the main thread? Does either of them block the main thread while running, or are they only encoded and then executed later, entirely on the GPU?
Longer Version:
I'm playing around with Metal within the context of an imaging application. I use Core Image to load an image and apply filters. The output image is displayed as a 2D plane in a metal view with a single texture. This works, but to improve performance I wanted to experiment with Core Image's ability to render out smaller tiles at a time. Each tile is rendered into its own IOSurface.
On each render pass, I check if there are any tiles that have been recently rendered. For each rendered tile (which is now an IOSurface), I create a Metal texture from a CVMetalTextureCache that is backed by the surface.
I then use a scaling MPS to copy from the tile texture into the "master" texture. If a tile was copied over, I then issue a blit command to generate the mipmaps on the master texture.
What I'm seeing is that if my master texture is quite large, then generating the mipmaps can take "a bit of time". The same is true if I have a lot of tiles. It appears this is blocking the main thread because my FPS drops significantly. (The MTKView is running at the standard 60fps.)
If I play around with tile sizes, then I can improve performance in some areas but decrease it in others. For example, increasing the tile size that Core Image renders creates fewer tiles, and thus fewer calls to generate mipmaps and blits, but at the cost of Core Image taking longer to render a region.
If I decrease the size of my "master" texture, then mipmap generation goes faster since only the dirty textures are updated, but there appears to be a lower bound on how small I should make the master texture, because if I make it too small, then I need to pass a large number of textures to the fragment shader. (And it looks like that limit might be 128?)
What's not entirely clear to me is how much of this I can move off the main thread while still using MTKView. If part of the rendering pass is going to block the main thread, then I'd prefer to move it to a background thread so that UI elements (like sliders and checkboxes) remain fully responsive.
Or maybe this isn't the right strategy in the first place? Is there a better way to display really large images in Metal other than tiling? (i.e.: Images larger than Metal's texture size limit of 16384?)
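For reference, the arithmetic behind the tiling question can be sketched as below: how many tiles an image needs when no tile may exceed a size limit, and how many mip levels a full chain has for a given texture size. This is plain C++ with no Metal calls. `TileRect`, `makeTiles`, and `mipLevelCount` are hypothetical names, and the 16384 limit is simply the figure quoted in the question.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// One tile's position and size inside the full image, in pixels.
struct TileRect { std::size_t x, y, width, height; };

// Split an image into a grid of tiles, none larger than maxTileSize on
// either axis. Tiles on the right and bottom edges may be smaller.
std::vector<TileRect> makeTiles(std::size_t imageWidth,
                                std::size_t imageHeight,
                                std::size_t maxTileSize) {
    assert(maxTileSize > 0);
    std::vector<TileRect> tiles;
    for (std::size_t y = 0; y < imageHeight; y += maxTileSize) {
        for (std::size_t x = 0; x < imageWidth; x += maxTileSize) {
            tiles.push_back({x, y,
                             std::min(maxTileSize, imageWidth - x),
                             std::min(maxTileSize, imageHeight - y)});
        }
    }
    return tiles;
}

// Number of levels in a full mip chain: floor(log2(max(w, h))) + 1.
std::size_t mipLevelCount(std::size_t width, std::size_t height) {
    std::size_t levels = 1;
    for (std::size_t size = std::max(width, height); size > 1; size /= 2) {
        ++levels;
    }
    return levels;
}
```

For example, a 40000x20000 image with a 16384 limit splits into a 3x2 grid of six tiles, and a 16384x16384 texture has 15 mip levels, which gives a feel for how much work each mipmap regeneration of a large master texture involves.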

Metal drawable musings... ugh

I have two issues in my Metal App.
My call to currentPassDescriptor is stalling. I have too many drawables, apparently.
I'm wholly confused on how to most performantly configure the multiple MTKViews I am using.
Issue (1)
I have a problem with currentPassDescriptor in my app. It is occasionally blocking (for 1.00s) which, according to the docs, is because there is no currentDrawable available.
Background: I have 4 HD 1920x1080 videos playing concurrently, tiled out onto a 3840x2160 second external display as a debugging configuration. The pixel buffers of these AVPlayer instances are captured by 4 independent CVDisplayLink callbacks and, from within each callback, there is the draw call to its assigned MTKView. A total of 4 MTKViews are subviews tiled on a single NSWindow, and are configured for manual drawing.
I'm using CVDisplayLink callbacks manually. If I don't, then I get stutter when mousing up on the app’s menus, for example.
Within each draw call, I do a bit of kernel shader work then attempt to obtain the currentPassDescriptor. If successful, I do one pass of a fragment/vertex shader and then present the drawable. My code flow follows Apple’s sample code as well as published examples.
According to the Metal System Trace, most of draw calls take under 5ms. The GPU is about 20-25% utilized and there’s about 25% of the GPU memory free. I can also cause the main thread to usleep() for 1 second without any hiccups.
Without any user interaction, there’s about a 5% chance of the videos stalling out in the first minute. If there’s some UI work going then I see that as windowServer work in Instruments. I also note that AVFoundation seems to cache about 15 frames of video onto the GPU for each AVPlayer.
If the cadence of the draw calls is upset, there's about a 10% chance that everything stalls completely or that some of the videos do: some will completely stall, some will stall with 1 Hz updates, and some won't stall at all. There's also less chance of stalling when running Metal System Trace. The movies that have stalled seem to have done so on obtaining a currentPassDescriptor.
It is really a poor design to have this currentPassDescriptor block for ≈1s during a render loop. So much so that I’m thinking of eschewing the MTKView altogether and just drawing to a CAMetalLayer myself. But the docs on CAMetalLayer seem to indicate the same blocking behaviour will occur.
I also grab these 4 pixel buffers on the fly and render sub-size regions-of-interest to 4 smaller MTKViews on the main monitor; but the stutters still occur if this code is removed.
Is the drawable buffer limit per MTKView or per the backing CALayer? The docs for maximumDrawableCount on CAMetalLayer say the number needs to be 2 or 3. This question ties into the configuration of the views.
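As a way to reason about the stall described above, here is a toy model of a fixed pool of drawables. It assumes the pool behaves like a small set of slots that must be released before they can be handed out again; that is an assumption for illustration, not a description of CAMetalLayer's internals. `DrawablePool`, `tryAcquire`, and `release` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model of a layer's fixed pool of drawables. The names are hypothetical;
// the real pool is internal to the layer and is not exposed like this.
class DrawablePool {
public:
    static constexpr int kNone = -1;

    explicit DrawablePool(std::size_t maximumDrawableCount)
        : inFlight_(maximumDrawableCount, false) {
        assert(maximumDrawableCount > 0);
    }

    // Returns the index of a free drawable, or kNone if all are in flight.
    // This model returns immediately; a call that waits for a free slot
    // instead is where a stall would show up.
    int tryAcquire() {
        for (std::size_t i = 0; i < inFlight_.size(); ++i) {
            if (!inFlight_[i]) {
                inFlight_[i] = true;
                return static_cast<int>(i);
            }
        }
        return kNone;
    }

    // Called once the frame that used this drawable is finished with it.
    void release(int index) {
        assert(index >= 0 &&
               static_cast<std::size_t>(index) < inFlight_.size());
        inFlight_[static_cast<std::size_t>(index)] = false;
    }

private:
    std::vector<bool> inFlight_;
};
```

With a pool of three, a fourth acquire fails until one of the earlier drawables is released, so anything that delays the release (for example, frames queued faster than they are shown) would starve the render loop in this model.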
Issue (2)
My current setup is a 3840x2160 NSWindow with a single content view. This subclass of NSView does some hiding/revealing of the mouse cursor by introducing an NSTrackingRectTag. The MTKViews are tiled subviews on this content view.
Is this the best configuration? Namely, one NSWindow with tiled MTKViews… or should I do one MTKView per window?
I'm also not sure how to best configure these windows/layers — ie. by setting (or clearing) wantsLayer, wantsUpdateLayer, and/or canDrawSubviewsIntoLayer. I'm currently just setting wantsLayer to YES on the single content view. Any hints on this would be great.
Does adjusting these properties collapse all the available drawables to the backing layer only; are there still 2 or 3 per MTKView?
NB: I've attached a sample run of my Metal app. The longest 'work' on the top graph is just under 5ms. The clumps of green/blue are rendering on the 4 MTKViews. The 'work' alternates a bit because one of the videos is a 60fps source; the others are all 30fps.

Rendering multiple objects with different textures, vertex buffers, and uniform values in Vulkan

My background is in OpenGL and I'm attempting to learn Vulkan. I'm having a little trouble with setting up a class so I can render multiple objects with different textures, vertex buffers, and UBO values. I've run into an issue where two of my images are drawn, but they flicker and alternate. I'm thinking it must be due to presenting the image after the draw call. Is there a way to delay presentation of an image? Or merge different images together before presenting? My code can be found here, I'm hoping it is enough for someone to get an idea of what I'm trying to do: https://gitlab.com/cwink/Ingin/blob/master/ingin.cpp
Thanks!
You call render twice per frame. And render calls vkQueuePresentKHR, so obviously the two renderings of yours alternate.
You can delay presentation simply by delaying vkQueuePresentKHR call. Let's say you want to show each image for ~1 s. You can simply std::this_thread::sleep_for (std::chrono::seconds(1)); after each render call. (Possibly not the bestest way to do it, but just to get the idea where your problem lies.)
vkQueuePresentKHR does not do any kind of "merging" for you. Typically you "merge images" by simply drawing them into the same swapchain VkImage in the first place, and then present it once.
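The difference between the two shapes can be sketched with a small model. This is not Vulkan code: `Image`, `Display`, `draw`, and `present` are hypothetical stand-ins, with `present` playing the role that vkQueuePresentKHR plays in the real program.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for a swapchain image: records what was drawn into it.
struct Image { std::vector<std::string> contents; };

// Stand-in for the screen: records what each presented frame showed.
struct Display { std::vector<std::vector<std::string>> presentedFrames; };

// Stand-in for recording a draw of one object into the target image.
void draw(Image& target, const std::string& object) {
    target.contents.push_back(object);
}

// Stand-in for the presentation step (vkQueuePresentKHR in the real code).
void present(Display& display, const Image& image) {
    display.presentedFrames.push_back(image.contents);
}

// Problematic shape: every object gets its own image and its own present,
// so the screen alternates between objects.
void renderAndPresentEach(Display& display,
                          const std::vector<std::string>& objects) {
    for (const std::string& object : objects) {
        Image image;
        draw(image, object);
        present(display, image);
    }
}

// Suggested shape: draw every object into the same image, present once.
void renderAllThenPresent(Display& display,
                          const std::vector<std::string>& objects) {
    Image image;
    for (const std::string& object : objects) {
        draw(image, object);
    }
    present(display, image);
}
```

With two objects, the first function produces two presented frames that each contain a single object, which is the flicker described in the question. The second produces one frame containing both.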

Cocoa 2D graphics: Quartz, Core Image or Core Animation?

I have been reading documentation for several hours now about drawing two-dimensional graphics in an Objective-C Cocoa application. There appear to be several different technologies, each specific to certain tasks. My understanding is that the following technologies do the following things. Please correct me if I'm wrong.
Quartz 2D: The primary library for drawing shapes, text, and images to the screen.
Core Graphics: this is the name of the framework that contains Quartz. This can be used as a synonym for Quartz.
QuartzGL: A GPU acceleration mode for Quartz that is not enabled by default and not necessarily faster for drawing things on the screen.
OpenGL: The lowest-level library; it talks directly to the graphics card at the cost of more lines of code. More suited for 3D graphics.
Core Image: A library for displaying images and text, but not so much for drawing shape primitives.
Core Animation: A library for automatically animating objects. Apparently not suited for moving large numbers of objects. Nor for continuous animation of objects.
QuickTime: A library that apparently also does images and text in addition to video, but probably not good for drawing primitive shapes.
What I would like to do is create a browser for some specific type of data. The view would not be very complicated and would consist of drawing rectangles at specific locations. However, the user should be able to move around by dragging the view to the left or the right, and this movement should be fluid. Here is an example that is very close to what I'm trying to make:
http://jbrowse.org/ucsc/hg19/
What drawing technology would you recommend I start coding with?
You want Quartz. Unless you're graphing MASSIVE amounts of data, any Mac (I'm assuming Mac not iOS) should handle it easily. It is easy, efficient, and will probably get you where you need to go. For the dragging movement, you'll probably manage that with Core Animation layers.
Note: Everything in the end is handled by AppKit (Mac) or UIKit (iOS) and, eventually, Core Animation. If you're doing graphics, you will encounter Core Animation at some point, as it manages everything displayed.
Note: If you are graphing that much data, you can use OpenGL, but even then, the need shouldn't be too great until you start displaying probably many millions of vertices or complex visualisations.

Using Core Animation/CALayer for simple layered painting

I would like to create a custom NSView that takes a layered approach to painting. I imagine the majority of the layers would be the same width and height as the backing view.
Is it appropriate to use the Core Animation classes like CALayer for this task, even though I don't expect to need much animation? Is there a more appropriate approach?
To clarify, the view is not meant to be like a canvas in a Photoshop-like application. It is more of a data display that should allow for user interaction (selecting, moving, scrolling, etc.)
If it's display and layout you're after, I'd say that a CALayer-based architecture is a good choice. For the open source Core Plot framework, we construct all of our graphs and plot elements out of CALayers, and organize them in a regular hierarchy. CALayers are lightweight and use almost identical APIs between Mac and iPhone. They can even be made to respond to touch or mouse events.
For another example of a CALayer-based user interface, my iPhone application's entire equation entry interface is composed of CALayers, including the menu that slides up from below. Performance is slightly better than that of my previous UIView-based implementation, but the same code also works within my preliminary desktop version of the application.
For a drawing program, I would imagine it would be important to hold a buffer of the bitmap data. The only issue with using a CALayer is that the contents property is a CGImageRef. To turn that back into a graphics context for doing further drawing can be a bit of a pain. You'd have to initialize a new context, draw the bitmap data into it, then do whatever drawing operations you wanted to do, and finally turn that back into a CGImageRef. You probably wouldn't be able to avoid doing a number of pretty large memory allocations, which is virtually guaranteed to slow your program way down.
I would consider holding an off-screen buffer for each layer. Take a look at the Quartz CGLayerRef object. I think it probably does what you want to do: it's an off-screen buffer that holds things you might want to draw repeatedly. You can also quickly get a CGContextRef whenever you need it so you can do additional drawing. And you can always use that CGContextRef with NSGraphicsContext if you want to use Cocoa drawing methods.