How can I overlay my UI render target onto the back buffer using DirectX 11? - rendering

I have two render targets, the back buffer and a UI render target where all 2d UI will be drawn.
I have used the graphics debugger to confirm that both render targets are being written to with the correct data, but I'm having trouble combining the two right at the end.
My world objects are drawn directly to the backbuffer so there is no problem displaying these, but how do I now overlay the UI render target OVER the backbuffer?
Desired effect:
Back buffer render target
UI render target

There are several ways to do this. The easiest is to render your UI elements to a texture that has both a RenderTargetView and a ShaderResourceView, then render the whole texture to the back buffer as a single full-screen quad using an orthographic projection. This effectively draws a 2D rectangle containing your UI in screen space on the back buffer. It also has the benefit of allowing transparency, provided alpha blending is enabled when the quad is drawn.
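To make the transparency part concrete, here is a minimal CPU-side sketch of the arithmetic the blend stage would perform when the quad is drawn with conventional source-over blending (SrcBlend = SRC_ALPHA, DestBlend = INV_SRC_ALPHA, BlendOp = ADD). The Rgba type and both function names are made up for illustration, and the sketch assumes the UI texture holds straight (non-premultiplied) alpha and was cleared to fully transparent before the UI was drawn.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical pixel type for this sketch: straight (non-premultiplied)
// RGBA with every channel in [0, 1].
struct Rgba {
    float r, g, b, a;
};

// Source-over blend of one UI pixel onto one back-buffer pixel.
// Color mirrors SrcBlend = SRC_ALPHA, DestBlend = INV_SRC_ALPHA, BlendOp = ADD.
inline Rgba blendOver(const Rgba& ui, const Rgba& back) {
    const float inv = 1.0f - ui.a;
    return Rgba{
        ui.r * ui.a + back.r * inv,
        ui.g * ui.a + back.g * inv,
        ui.b * ui.a + back.b * inv,
        ui.a + back.a * inv  // alpha: SrcBlendAlpha = ONE, DestBlendAlpha = INV_SRC_ALPHA
    };
}

// Composite a whole UI layer over a back buffer of the same size, in place.
inline void compositeUi(const std::vector<Rgba>& ui, std::vector<Rgba>& backBuffer) {
    assert(ui.size() == backBuffer.size());
    for (std::size_t i = 0; i < ui.size(); ++i) {
        backBuffer[i] = blendOver(ui[i], backBuffer[i]);
    }
}
```

A fully transparent UI pixel leaves the world pixel untouched, and a fully opaque one replaces it, which is the overlay behavior the question asks for.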
You could also use the OutputMerger stage to blend the UI render target with the back buffer during rendering of the world geometry. You'd need to be careful how you set up your blend operations, as it could result in items being drawn over the UI, or blending inappropriately.
If your UI is not transparent, you could do the UI rendering first and mark the area under the UI in the stencil buffer, then do your world rendering while the stencil test is enabled. This would cause the GPU to ignore any pixels underneath the UI, and not send them to the pixel shader.
The above could also be modified to write the minimum depth value to the pixels within the UI render target, ensuring all geometry underneath it would fail the depth test. This modification would free up the stencil buffer for mirrors/shadows/etc.
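The stencil-masking idea can be modelled on the CPU roughly as follows. This is only a sketch of the logic, not D3D11 code: the Frame struct, drawUi and drawWorld are hypothetical names, colors are plain ints, and the stencil settings named in the comments (REPLACE with ref 1, then NOT_EQUAL) are one plausible configuration rather than the only one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU model: one stencil byte and one color value per pixel.
struct Frame {
    std::vector<std::uint8_t> stencil;
    std::vector<int> color;
};

// UI pass: draw opaque UI pixels and write reference value 1 into the
// stencil buffer under them (think: stencil op REPLACE with ref = 1).
inline void drawUi(Frame& frame, const std::vector<bool>& uiCoverage, int uiColor) {
    assert(uiCoverage.size() == frame.color.size());
    assert(frame.stencil.size() == frame.color.size());
    for (std::size_t i = 0; i < uiCoverage.size(); ++i) {
        if (uiCoverage[i]) {
            frame.color[i] = uiColor;
            frame.stencil[i] = 1;
        }
    }
}

// World pass: stencil test NOT_EQUAL against ref = 1, so pixels covered by
// the UI are rejected before shading. Returns how many pixels were shaded.
inline std::size_t drawWorld(Frame& frame, int worldColor) {
    assert(frame.stencil.size() == frame.color.size());
    std::size_t shaded = 0;
    for (std::size_t i = 0; i < frame.color.size(); ++i) {
        if (frame.stencil[i] != 1) {
            frame.color[i] = worldColor;
            ++shaded;
        }
    }
    return shaded;
}
```

The returned count is the point of the technique: world pixels hidden behind the UI never reach the (here simulated) pixel shader.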
All of the above work for flat UIs drawn over the existing 3D world. To draw more complex UIs that appear to be part of the world, you'll need to render the elements onto 3D objects in world space, or do complex projection operations to make it seem like they are.


Vulkan Rendering - Portion of Surface

How can I render a Vulkan framebuffer (VkImage) into only a portion of a surface?
When I draw to the framebuffer, Vulkan clears the entire surface with the clear color.
The surface is 800x600, but I would like Vulkan to render a 300x200 region at an offset of 100x100, for example.
When you begin a render pass, you provide the VkRenderPassBeginInfo object. In this object is the renderArea rectangle, which defines the area of each of the attachment images that the render pass will affect. Any pixels of attachments outside of this area are unaffected by render pass operations, including the clear load op and vkCmdClearAttachments.
Note that the renderArea is subject to the limitations of the render area granularity, as queried from vkGetRenderAreaGranularity.
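As a rough illustration of the granularity point, the helper below grows a requested rectangle outward until its offset is a multiple of the granularity and its extent is either a multiple of it or reaches the framebuffer edge, which is my reading of the alignment conditions described for vkGetRenderAreaGranularity. The structs are hypothetical stand-ins for VkOffset2D, VkExtent2D and VkRect2D so the sketch compiles without the Vulkan headers, and alignRenderArea is an invented name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for VkOffset2D / VkExtent2D / VkRect2D.
struct Offset2D { std::int32_t x, y; };
struct Extent2D { std::uint32_t width, height; };
struct Rect2D { Offset2D offset; Extent2D extent; };

// Expand `wanted` so it lines up with the render area granularity.
// Assumes `wanted` lies entirely inside the framebuffer.
inline Rect2D alignRenderArea(const Rect2D& wanted,
                              const Extent2D& granularity,
                              const Extent2D& framebuffer) {
    assert(granularity.width > 0 && granularity.height > 0);
    assert(wanted.offset.x >= 0 && wanted.offset.y >= 0);
    const std::uint32_t gx = granularity.width;
    const std::uint32_t gy = granularity.height;
    const std::uint32_t x0 = static_cast<std::uint32_t>(wanted.offset.x);
    const std::uint32_t y0 = static_cast<std::uint32_t>(wanted.offset.y);
    assert(x0 + wanted.extent.width <= framebuffer.width);
    assert(y0 + wanted.extent.height <= framebuffer.height);

    // Round the top-left corner down to a multiple of the granularity.
    const std::uint32_t left = (x0 / gx) * gx;
    const std::uint32_t top = (y0 / gy) * gy;

    // Round the bottom-right corner up, but never past the framebuffer edge.
    const std::uint32_t right =
        std::min(((x0 + wanted.extent.width + gx - 1) / gx) * gx, framebuffer.width);
    const std::uint32_t bottom =
        std::min(((y0 + wanted.extent.height + gy - 1) / gy) * gy, framebuffer.height);

    return Rect2D{
        Offset2D{static_cast<std::int32_t>(left), static_cast<std::int32_t>(top)},
        Extent2D{right - left, bottom - top}};
}
```

For the 300x200 region at offset 100x100 from the question and an assumed granularity of 32x32, this yields a render area starting at (96, 96) with extent 320x224. Pixels in the added border would still be touched by the clear, so a scissor rectangle would be needed to confine drawing to the original region.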
You can subset a window by setting the scissor rectangle and viewport in the VkPipelineViewportStateCreateInfo referenced by VkGraphicsPipelineCreateInfo to the subregion you wish to render. If the pipeline declares the viewport as dynamic state, you can instead configure it at draw time using vkCmdSetViewport().
For vkCmdClearAttachments() you can set the clear area via the pRects argument (it ignores the viewport).
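For intuition about what the viewport does, the following sketch maps normalized device coordinates to framebuffer pixel coordinates the way a viewport with a given x, y, width and height would. Viewport and Point are hypothetical structs that only mirror the relevant VkViewport fields (depth range is left out), and the sketch assumes a positive width and height.

```cpp
#include <cassert>

// Hypothetical mirror of the x, y, width and height fields of VkViewport.
struct Viewport { float x, y, width, height; };
struct Point { float x, y; };

// Map normalized device coordinates (each in [-1, 1]) to framebuffer
// coordinates: NDC -1 lands on the viewport's origin, NDC +1 on its far edge.
inline Point ndcToFramebuffer(const Viewport& vp, float ndcX, float ndcY) {
    assert(vp.width > 0.0f && vp.height > 0.0f);  // assumption of this sketch
    return Point{
        vp.x + (ndcX + 1.0f) * 0.5f * vp.width,
        vp.y + (ndcY + 1.0f) * 0.5f * vp.height};
}
```

With a viewport of {100, 100, 300, 200}, NDC (-1, -1) maps to pixel (100, 100) and NDC (1, 1) maps to (400, 300), so the scene is drawn into the 300x200 subregion while the rest of the 800x600 image is left alone.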

Can a VkSurfaceKHR represent only a whole window? Or also a portion of a window (ie some rectangular widget)? [duplicate]

We have an application which has a window with a horizontal toolbar at the top. The windows-level handle we pass to Vulkan to create the surface ends up including the area behind the toolbar i.e. Vulkan is completely unaware of the toolbar and the surface includes the space "behind" it.
My question is, can a surface represent only a portion of this window? We obviously need not process data for the pixels that lie behind the toolbar, and so want to avoid creating a frame buffer, depth buffer etc. bigger than necessary.
I fully understand that I can accomplish this visually using a viewport which e.g. has an origin offset and height compensation, however to my understanding the frame buffer actually still contains information for pixels the full size of the surface (e.g. 800x600 for an 800x600 client-area window) even if I am only rendering to a portion of that window. The frame buffer then gets "mapped" and therefore squished to the viewport area.
All of this has sort of left me wondering what the purpose of a viewport is. If it simply defines a mapping from your image buffer to an area in the surface, is that not highly inefficient if your framebuffer contains considerably more pixels than the area it is being mapped to? Would it not make sense to rather section off portions in your application using e.g. different window HWNDs FIRST, and then create different surfaces from then onwards?
How can I avoid rendering to an area bigger than necessary?
The way this gets handled for pretty much every application is that the client area of a window (ie: the stuff that isn't toolbars and the like) is a child window of the main frame window. When the frame is resized, you resize the client window to match the new client area (taking into account the new sizes of the toolbars/etc).
It is this client window which should have a Vulkan surface created for it.

How to improve MTKView rendering when using MPSImageScale and MTLBlitCommandEncoder

TL;DR: From within my MTKView's delegate drawInMTKView: method, part of my rendering pass involves adding an MPSImageBilinearScale performance shader and zero or more MTLBlitCommandEncoder requests for generateMipmapsForTexture. Is that a smart thing to do from within drawInMTKView:, which happens on the main thread? Do either of them block the main thread while running or are they only being encoded and then executed later and entirely on the GPU?
Longer Version:
I'm playing around with Metal within the context of an imaging application. I use Core Image to load an image and apply filters. The output image is displayed as a 2D plane in a metal view with a single texture. This works, but to improve performance I wanted to experiment with Core Image's ability to render out smaller tiles at a time. Each tile is rendered into its own IOSurface.
On each render pass, I check if there are any tiles that have been recently rendered. For each rendered tile (which is now an IOSurface), I create a Metal texture from a CVMetalTextureCache that is backed by the surface.
I then use a scaling MPS to copy from the tile-texture into the "master" texture. If a tile was copied over, then I issue a blit command to generate the mipmaps on the master texture.
What I'm seeing is that if my master texture is quite large, then generating the mipmaps can take "a bit of time". The same is true if I have a lot of tiles. It appears this is blocking the main thread because my FPS drops significantly. (The MTKView is running at the standard 60fps.)
If I play around with tile sizes, then I can improve performance in some areas but decrease it in others. For example, increasing the tile size that Core Image renders creates fewer tiles, and thus fewer calls to generate mipmaps and blits, but at the cost of Core Image taking longer to render a region.
If I decrease the size of my "master" texture, then mipmap generation goes faster since only the dirty textures are updated, but there appears to be a lower bound on how small I should make the master texture because if I make it too small, then I need to pass in a large number of textures to the fragment shader. (And it looks like that limit might be 128?)
What's not entirely clear to me is how much of this I can move off the main thread while still using MTKView. If part of the rendering pass is going to block the main thread, then I'd prefer to move it to a background thread so that UI elements (like sliders and checkboxes) remain fully responsive.
Or maybe this isn't the right strategy in the first place? Is there a better way to display really large images in Metal other than tiling? (i.e.: Images larger than Metal's texture size limit of 16384?)
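To put a number on why regenerating mipmaps for a large master texture is expensive, here is a small sketch that counts the levels in a full mip chain and the total texels it contains, assuming each level halves both dimensions (rounding down, minimum 1). Both function names are made up for illustration, and nothing here calls Metal.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Number of levels in a full mip chain: keep halving the larger dimension
// until it reaches 1.
inline std::uint32_t mipLevelCount(std::uint32_t width, std::uint32_t height) {
    assert(width > 0 && height > 0);
    std::uint32_t levels = 1;
    std::uint32_t size = std::max(width, height);
    while (size > 1) {
        size /= 2;
        ++levels;
    }
    return levels;
}

// Total texels across every level of the chain, including the base level.
inline std::uint64_t mipChainTexels(std::uint32_t width, std::uint32_t height) {
    assert(width > 0 && height > 0);
    std::uint64_t total = 0;
    for (;;) {
        total += static_cast<std::uint64_t>(width) * height;
        if (width == 1 && height == 1) {
            break;
        }
        width = std::max<std::uint32_t>(1, width / 2);
        height = std::max<std::uint32_t>(1, height / 2);
    }
    return total;
}
```

A 16384x16384 master texture has 15 levels, and the levels below the base add roughly a third again on top of its 268 million texels, all of which are rewritten each time the mipmaps are regenerated for the whole texture.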

Metal -- skipping commandBuffer.present(drawable) to not display a frame?

In my Metal app for macOS, I have a situation where I only want to display the render results every so often. I want to complete the rendering pass every frame, and save the drawable texture image to a file, but I only want to display the render every sixteenth frame or so. I tried just skipping commandBuffer.present(drawable) when I don't want to display, but it is not working. It just stops displaying new frames once I do that. After skipping one call to commandBuffer.present(), it just doesn't display any new frames. It does continue to run, however.
Why would that happen? Once I commit a command buffer, is it required for it to be presented?
If I can't get this to work, then I will try to render into an offscreen buffer for these frames I don't want displayed. But it would be extra work and require more memory for the offscreen render buffer, so I'd rather just be able to use my regular onscreen render buffer if possible.
It's not required that a command buffer present a drawable. I think the issue is that, once you've obtained the drawable, it's not returned to the pool maintained by the CAMetalLayer (or, indirectly, MTKView) that provided it until it is presented.
Do not render to a drawable's texture if you don't plan on presenting. Rendering to an off-screen texture is the right approach. In fact, if you always render first to an off-screen texture and then, only for the frames you want to display, copy that to a drawable's texture, then you can leave the framebufferOnly property of the CAMetalLayer with its default true value. In that case, there's a decent chance that you won't increase the memory required (because the drawable's texture is really just part of the screen's backing store).

Using WebGL or OpenGL ES 2, how do I render the contents of an RBO onscreen?

Using WebGL (which is constrained to the OpenGL ES 2 API), I am successfully rendering to texture and then displaying that texture onscreen. Because it is a texture, it is not being antialiased. If I were rendering to an RBO and then displaying that onscreen, I would be able to take advantage of AA.
My render target setup looks like this:
1. Create FBO
2. Bind FBO
3. Create texture (to be rendered to)
4. Create and bind depth buffer as RBO
5. Attach texture and RBO to FBO
And my rendering update loop looks like this:
1. Render the scene to the FBO created in step #2 above
2. Render a screen aligned quad with the texture created in step #3 above
With desktop OpenGL, I would call glBlitFramebuffer() instead of drawing the screen aligned quad.
How do I render my scene with antialiasing? Do I need to replace the texture with an RBO? If so, what calls do I use to bind the RBO to draw a screen-aligned quad?
You cannot blit the contents of an RBO to screen in WebGL unless you perform a readback and re-upload to texture to blit, which is rather slow.
WebGL 1 (the OpenGL ES 2 based API in question) has no support for MSAA on FBOs in any form (neither as RBO nor as RTT).
You can implement your own antialiasing in a variety of ways.
Render at 2x size in each dimension and scale down (Google Maps with WebGL does this)
Render at 1:1 size, run a Sobel or Laplace edge detection on color and depth, and run a bilateral Gaussian blur using edge strength as weight (I've used this technique in some of my demos, it works well)
Use the morphological antialiasing recipe from GPU Pro 2 (I've yet to try that)
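The first option, rendering oversized and scaling down, amounts to averaging each 2x2 block of the large image into one output pixel. Here is a minimal CPU sketch of that box filter. The single-channel Image struct and downsample2x are hypothetical names for illustration. On the GPU the same averaging should fall out of drawing the oversized texture into a half-size target with LINEAR filtering, since each sample then sits at the center of four texels.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical single-channel, row-major image for this sketch.
struct Image {
    std::size_t width, height;
    std::vector<float> pixels;
};

// Average every 2x2 block of `src` into one pixel of the result.
inline Image downsample2x(const Image& src) {
    assert(src.width % 2 == 0 && src.height % 2 == 0);
    assert(src.pixels.size() == src.width * src.height);
    Image dst{src.width / 2, src.height / 2,
              std::vector<float>((src.width / 2) * (src.height / 2))};
    for (std::size_t y = 0; y < dst.height; ++y) {
        for (std::size_t x = 0; x < dst.width; ++x) {
            const std::size_t sx = 2 * x;
            const std::size_t sy = 2 * y;
            const float sum = src.pixels[sy * src.width + sx] +
                              src.pixels[sy * src.width + sx + 1] +
                              src.pixels[(sy + 1) * src.width + sx] +
                              src.pixels[(sy + 1) * src.width + sx + 1];
            dst.pixels[y * dst.width + x] = sum * 0.25f;
        }
    }
    return dst;
}
```

A hard edge that crosses a 2x2 block comes out as an intermediate value, which is the smoothing effect you are after. The cost is shading four times as many pixels.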