In Vulkan, using FIFO, what happens when you don't have an image ready to present when the vertical blank arrives? - vulkan

I'm new to graphics, and I've been looking at Vulkan presentation modes. I was wondering: in a situation where we've only got 2 images in our swapchain (one that the screen's currently reading from and one that's free), what happens if we don't manage to finish drawing to the currently free image before the next vertical blank? Do we do the presentation and get weird tearing, or do we skip the presentation and draw the same image again (I guess giving a "stuttering" effect)? Do we need to define what happens, or is it automatic?
As a side note, is this why people use longer swap chains? i.e. so that if you managed to draw out 2 images to your swap chain while the screen was displaying the last image but now you're running late, at least you can present the newer of the 2 images from before?
I'm not sure how much of this is specific to FIFO or mailbox mode: I guess with mailbox you'll already have used the newest image you've got, so you're stuck again?
[2-image swapchain][1]
[1]: https://i.stack.imgur.com/rxe51.png

Tearing never happens in regular FIFO (or mailbox) mode. When you present an image, this image will be used for all subsequent vblanks until a new image is presented. And since FIFO disallows tearing, in your case, the image will be fully displayed twice.
If you are using a 2-deep swapchain with FIFO, you have to produce each image on time in order to avoid stuttering. With longer swapchains and FIFO, you have more leeway to avoid visible stuttering. With longer swapchains and mailbox, you can get a similar effect, but there will be less visible latency when your application is running on-time.
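To make the behaviour described above concrete, here is a minimal sketch of which frame is on screen at each vblank under FIFO and mailbox. It is a toy model, not Vulkan code: the function name `frames_shown`, its parameters, and the sample timings are all made up for illustration, and it ignores the back-pressure a real swapchain applies when no image can be acquired.

```python
from collections import deque

def frames_shown(present_times, vblank_period, num_vblanks, mode="fifo"):
    """Return the frame index on screen after each vblank (toy model).

    present_times: sorted times at which the app queues frame 0, 1, 2, ...
    mode: "fifo" shows queued frames in order, one per vblank;
          "mailbox" shows only the newest queued frame and drops the rest.
    """
    if mode not in ("fifo", "mailbox"):
        raise ValueError("mode must be 'fifo' or 'mailbox'")
    pending = deque()   # frames queued for presentation but not yet shown
    next_frame = 0
    current = None      # nothing is on screen before the first present
    shown = []
    for k in range(1, num_vblanks + 1):
        t = k * vblank_period
        # Collect every frame that was queued before this vblank.
        while next_frame < len(present_times) and present_times[next_frame] <= t:
            pending.append(next_frame)
            next_frame += 1
        if pending:
            if mode == "fifo":
                current = pending.popleft()
            else:
                current = pending.pop()
                pending.clear()
        # With nothing pending, the previous image is simply shown again.
        shown.append(current)
    return shown

# Frame 1 misses the vblank at t=32, so frame 0 is displayed twice.
print(frames_shown([10, 40, 45], 16, 4, "fifo"))     # [0, 0, 1, 2]
print(frames_shown([10, 40, 45], 16, 4, "mailbox"))  # [0, 0, 2, 2]
```

In both modes the late frame causes a repeated image rather than tearing; the difference is that mailbox jumps straight to the newest frame once it catches up.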

Related

How do Swapchain minImageCount and VK_PRESENT_MODE_IMMEDIATE_KHR relate to each other?

If we assign VkSwapchainCreateInfoKHR.presentMode = VK_PRESENT_MODE_IMMEDIATE_KHR, then as I understand it we have a single buffer and the image is presented to the surface immediately,
but at the same time SwapChainDetails.surfaceCapabilities.minImageCount = 2 on my GTX 1070.
Does minImageCount mean that the minimum number of buffers should be 2?
How does it even work?
They don't relate to each other at all. How many buffers you have has nothing to do with how they get displayed.
Immediate presentation means that, when you present an image, it is displayed immediately. As stated in the standard:
"the presentation engine does not wait for a vertical blanking period to update the current image"
If the presentation engine only allowed you one swapchain image, then this would mean that the acquire would have to wait until the image is finished being shown. This would make immediate presentation pointless.
Now, you might think that acquiring such an image ought to happen immediately. But that would mean that you could render to the image while the presentation engine is still reading from it, causing potential corruption. But that's not what immediate presentation is for. Immediate presentation changes nothing about the ownership dynamics between the presentation engine and user code; it only changes the timing of how swapchain images are presented.
The ability to write to an image that is actually being presented is governed by the "shared" present modes (VK_PRESENT_MODE_SHARED_DEMAND_REFRESH_KHR and VK_PRESENT_MODE_SHARED_CONTINUOUS_REFRESH_KHR).
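Since the image count is chosen from the surface capabilities and not from the present mode, the selection can be sketched as a small pure function. This is only an illustration: `choose_image_count` and its `extra` parameter are hypothetical names, and asking for one image more than the minimum is a common convention, not something the answer above prescribes. The one rule taken from the Vulkan specification is that a `maxImageCount` of 0 means there is no upper limit.

```python
def choose_image_count(min_image_count, max_image_count, extra=1):
    """Pick a swapchain image count from surface capabilities (sketch).

    min_image_count / max_image_count mirror the fields of the same name in
    VkSurfaceCapabilitiesKHR. The present mode is deliberately not an input.
    """
    count = min_image_count + extra
    # In Vulkan, maxImageCount == 0 means "no upper limit".
    if max_image_count > 0:
        count = min(count, max_image_count)
    return count

print(choose_image_count(2, 0))  # 3
print(choose_image_count(2, 2))  # 2
```

The result would go into VkSwapchainCreateInfoKHR.minImageCount, whichever present mode is requested.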

3 vs 2 VkSwapchain images?

So a Vulkan swapchain basically has a pool of images, user-defined in number, that it allocates when it is created, and images from the pool cycle like this:
1. An unused image is acquired by the program from the swapchain
2. The image is rendered by the program
3. The image is delivered for presentation to the surface.
4. The image is returned to the pool.
Given that, of what advantage is having 3 images in the swapchain rather than 2?
(I'm asking because I received a BestPractice validation complaint that I was using 2 rather than 3.)
With 2 images, isn't it the case that one will be being presented (3.) while the other is rendered (2.), and then they alternate those roles back and forth?
With 3 images, isn't it the case that one of the three will be idle? Particularly if the loop is locked to the refresh rate anyway?
I'm probably missing something.
So what happens if one image is being presented, one has finished being rendered to by the GPU, and you have some work to render for later presentation? Well, if you're rendering to a swapchain image, that GPU work cannot start until the image currently on screen (call it image 0) has finished being presented.
Double buffering will therefore lead to stalling the GPU if the time to present a frame is greater than the time to render a frame. Yes, triple buffering will do the same if the render time is consistently shorter than the present time, but if your frame time is right on the edge of the present time, then double buffering has a greater chance of wasting GPU time.
Of course, the downside is latency; triple buffering means that the image you display will be seen one frame later than double buffering.
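The stall can be seen in a small timing model. The sketch below is a simplification under stated assumptions, not a description of any real driver: frames are presented in FIFO order, an image only becomes reusable once the next image has replaced it on screen, the CPU is never the bottleneck, and all times are integers in arbitrary units. The name `simulate_swapchain` and its parameters are invented for this example.

```python
def simulate_swapchain(num_images, render_time, vblank_period, num_frames):
    """Return (display_times, gpu_idle) for a toy FIFO swapchain model."""
    if num_images < 2:
        raise ValueError("this model needs at least 2 swapchain images")
    finish = []    # time the GPU finishes rendering frame i
    display = []   # vblank at which frame i first appears on screen
    gpu_idle = 0   # total time the GPU waited for a free image
    for i in range(num_frames):
        gpu_free = finish[i - 1] if i > 0 else 0
        if i >= num_images:
            # Frame i reuses the image of frame i - num_images, which is
            # released when the frame after it takes over the screen.
            image_free = display[i - num_images + 1]
        else:
            image_free = 0
        start = max(gpu_free, image_free)
        gpu_idle += start - gpu_free
        done = start + render_time
        # First vblank at or after the render finishes (integer ceiling).
        shown_at = -(-done // vblank_period) * vblank_period
        if display:
            # FIFO: at most one new image per vblank, in order.
            shown_at = max(shown_at, display[-1] + vblank_period)
        finish.append(done)
        display.append(shown_at)
    return display, gpu_idle

# Render time (17) just over the vblank period (16):
print(simulate_swapchain(2, 17, 16, 4))  # ([32, 48, 80, 112], 29)
print(simulate_swapchain(3, 17, 16, 4))  # ([32, 48, 64, 80], 0)
```

With 2 images the GPU sits idle waiting for the on-screen image to be released, so new frames settle at one every two vblanks; with 3 images it keeps rendering without waiting.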

Metal drawable musings... ugh

I have two issues in my Metal App.
My call to currentPassDescriptor is stalling. I have too many drawables, apparently.
I'm wholly confused on how to most performantly configure the multiple MTKViews I am using.
Issue (1)
I have a problem with currentPassDescriptor in my app. It is occasionally blocking (for 1.00s) which, according to the docs, is because there is no currentDrawable available.
Background: I have 4 HD 1920x1080 videos playing concurrently, tiled out onto a 3840x2160 second external display as a debugging configuration. The pixel buffers of these AVPlayer instances are captured by 4 independent CVDisplayLink callbacks and, from within each callback, there is the draw call to its assigned MTKView. A total of 4 MTKViews are subviews tiled on a single NSWindow, and are configured for manual drawing.
I'm using CVDisplayLink callbacks manually. If I don't, then I get stutter when mousing up on the app’s menus, for example.
Within each draw call, I do a bit of kernel shader work then attempt to obtain the currentPassDescriptor. If successful, I do one pass of a fragment/vertex shader and then present the drawable. My code flow follows Apple’s sample code as well as published examples.
According to the Metal System Trace, most of the draw calls take under 5ms. The GPU is about 20-25% utilized and there’s about 25% of the GPU memory free. I can also cause the main thread to usleep() for 1 second without any hiccups.
Without any user interaction, there’s about a 5% chance of the videos stalling out in the first minute. If there’s some UI work going then I see that as windowServer work in Instruments. I also note that AVFoundation seems to cache about 15 frames of video onto the GPU for each AVPlayer.
If the cadence of the draw calls is upset, there's about a 10% chance that either everything stalls or some of the videos do: some will completely stall, some will stall with 1 Hz updates, some won't stall at all. There's also less chance of stalling when running Metal System Trace. The movies that have stalled seem to have done so on obtaining a currentPassDescriptor.
It seems like really poor design to have currentPassDescriptor block for ≈1s during a render loop. So much so that I’m thinking of eschewing MTKView altogether and just drawing to a CAMetalLayer myself. But the docs on CAMetalLayer seem to indicate the same blocking behaviour will occur.
I also grab these 4 pixel buffers on the fly and render sub-size regions-of-interest to 4 smaller MTKViews on the main monitor; but the stutters still occur if this code is removed.
Is the drawable buffer limit per MTKView or per the backing CALayer? The docs for maximumDrawableCount on CAMetalLayer say the number needs to be 2 or 3. This question ties into the configuration of the views.
Issue (2)
My current setup is a 3840x2160 NSWindow with a single content view. This subclass of NSView does some hiding/revealing of the mouse cursor by introducing an NSTrackingRectTag. The MTKViews are tiled subviews on this content view.
Is this the best configuration? Namely, one NSWindow with tiled MTKViews… or should I do one MTKView per window?
I'm also not sure how best to configure these windows/layers, i.e. by setting (or clearing) wantsLayer, wantsUpdateLayer, and/or canDrawSubviewsIntoLayer. I'm currently just setting wantsLayer to YES on the single content view. Any hints on this would be great.
Does adjusting these properties collapse all the available drawables to the backing layer only; are there still 2 or 3 per MTKView?
NB: I've attached a sample run of my Metal app. The longest 'work' on the top graph is just under 5ms. The clumps of green/blue are rendering on the 4 MTKViews. The 'work' alternates a bit because one of the videos is a 60fps source; the others are all 30fps.

SDL Window shows final frame from the last time the program was run in the background of a new window when I start up a new instance

I'm making a simple game and just messing around with SDL. I have two images currently, and I am practicing making them the background. I make one the background by calling RenderCopy, then DestroyTexture to clear it from memory, and then I present it. I changed the file path from one image to the other to change the background. I ran the program, and now the new image is layered on top of the other. I can fix this problem if I manually do a clean of my computer's memory, and it renders properly. For some reason, SDL is not clearing the image called previously. There is no mention of the old image anywhere else in the code, and all the Destroy functions are called. What is going on?
Edit: I did a little bit of snooping and it's not that it even loads the old image; whatever the final render displayed by any program running SDL was before this one was run, it will be the background. It is literally like a TV burning an image into the screen; it's always there.
The problem was fixed by simply clearing the frame between frames. Didn't have any performance impact.
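Why clearing helps can be shown with a toy model of a back buffer that still holds whatever was last written to it. This is not SDL code: `render_frame`, the list-of-integers "buffer", and the sample values are invented for illustration. In SDL2 the real call that clears the render target is SDL_RenderClear, although the post does not say which call was actually used.

```python
def render_frame(backbuffer, draws, clear_first):
    """Draw into a reused buffer and return a snapshot of the result.

    backbuffer: list of pixel values, possibly holding stale contents.
    draws: (index, value) pairs standing in for RenderCopy-style calls.
    clear_first: whether to wipe the buffer before drawing.
    """
    if clear_first:
        for i in range(len(backbuffer)):
            backbuffer[i] = 0
    for index, value in draws:
        backbuffer[index] = value
    return list(backbuffer)

stale = [7, 7, 7, 7]  # leftovers from an earlier frame or program
print(render_frame(list(stale), [(1, 5)], clear_first=False))  # [7, 5, 7, 7]
print(render_frame(list(stale), [(1, 5)], clear_first=True))   # [0, 5, 0, 0]
```

Without the clear, any pixel the new frame does not overwrite keeps its old contents, which matches the "burned in" background described in the question.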

In a double buffer opengl context, is it possible that front and back buffer be the same?

I have a situation in which I ask for and get a double-buffered OpenGL context, but when I draw in it, both the front and back buffer are affected. The draw buffer is set to the back buffer (and only the back buffer). If I look in OpenGL Profiler, I do see all that: the value for GL_DRAW_BUFFER (GL_BACK) and the actual back and front buffer being drawn to.
Since I'm working with an NSWindow that has a backing store, we do not see any of this happening on the screen. The problem is that I'm getting screenshots of this window with CGWindowListCreateImage. This function seems to be fetching the image from the front buffer, and not from the screen buffer (wherever that is...). So the image returned is incomplete: it only contains the elements that are drawn at the moment it is grabbed, even if no flush has been called.
There is a utility in the Mac developer package called Pixie. It basically grabs the screen at the mouse position and displays it zoomed in so you can analyze it. This program has the same behavior as calling CGWindowListCreateImage: you can see incomplete images. So I guess the problem is not with the way I use CGWindowListCreateImage, but rather with my window or my display...
Also, it does not seem to happen all the time. Not every window shows this behavior, and even for a given window, it seems to come and go, especially if I move the window to a different screen (in a dual-display setup).
Anyone faced this before?