I have an imported external buffer, which starts out with PREINITIALIZED layout, which is then switched to TRANSFER_SRC_OPTIMAL with a memory barrier before copying the data out.
That works well once, but now I'd like to reuse the image in the next frame, i.e. after the image copy is complete, I'd have the external hardware fill new data into memory, and make sure that the next frame uses the new data.
Is an image barrier without a layout change the correct approach here, as I'm not allowed to take the layout back to PREINITIALIZED?
In an existing renderer which draws geometry to the swapchain, I need to render some parts of this geometry into a texture, while the other parts must remain on screen. All the geometry is recorded into one command buffer. I won't need to render this texture every time.
I created the destination image, image view and framebuffer, but I don't know what to do now.
I don't think I need a specific pipeline, nor a new specific descriptor set, as everything is correctly rendered on screen.
Do I need another render pass, or a subpass, or anything else?
Exactly, you need a separate render pass that fills your destination images. As the render pass references the images (as attachments), a separate one is required.
Within that render pass you can then use subpass dependencies to transition the destination images to the proper layout. Your first dependency should go from VK_ACCESS_SHADER_READ_BIT to VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT for writing to the destination image, and once that's done a second dependency goes back from VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT to VK_ACCESS_SHADER_READ_BIT so you can, e.g., sample your destination images in the visual pass. An alternative would be blitting them to the swap chain if the device supports that.
If you need a reference, you can check out my offscreen rendering sample.
TL;DR: From within my MTKView's delegate drawInMTKView: method, part of my rendering pass involves adding an MPSImageBilinearScale performance shader and zero or more MTLBlitCommandEncoder requests for generateMipmapsForTexture. Is that a smart thing to do from within drawInMTKView:, which happens on the main thread? Do either of them block the main thread while running or are they only being encoded and then executed later and entirely on the GPU?
Longer Version:
I'm playing around with Metal within the context of an imaging application. I use Core Image to load an image and apply filters. The output image is displayed as a 2D plane in a metal view with a single texture. This works, but to improve performance I wanted to experiment with Core Image's ability to render out smaller tiles at a time. Each tile is rendered into its own IOSurface.
On each render pass, I check if there are any tiles that have been recently rendered. For each rendered tile (which is now an IOSurface), I create a Metal texture from a CVMetalTextureCache that is backed by the surface.
I then use a scaling MPS kernel to copy from the tile texture into the "master" texture. If a tile was copied over, then I issue a blit command to generate the mipmaps on the master texture.
What I'm seeing is that if my master texture is quite large, then generating the mipmaps can take "a bit of time". The same is true if I have a lot of tiles. It appears this is blocking the main thread because my FPS drops significantly. (The MTKView is running at the standard 60fps.)
If I play around with tile sizes, then I can improve performance in some areas but decrease it in others. For example, increasing the tile size that Core Image renders creates fewer tiles, and thus fewer calls to generate mipmaps and fewer blits, but at the cost of Core Image taking longer to render a region.
If I decrease the size of my "master" texture, then mipmap generation goes faster since only the dirty textures are updated, but there appears to be a lower bound on how small I should make the master texture, because if I make it too small, then I need to pass a large number of textures to the fragment shader. (And it looks like that limit might be 128?)
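To make that trade-off concrete, here is a rough C++ sketch (plain arithmetic, not Metal code) that counts how many square master textures of a given size are needed to cover an image. The function names and the 40000 x 30000 example image are hypothetical, and 128 is only the guessed limit mentioned above.

```cpp
#include <cassert>
#include <cstdint>

// Number of textures of side `textureSize` needed to cover `extent` pixels
// along one axis (ceiling division).
std::uint64_t texturesAlong(std::uint64_t extent, std::uint64_t textureSize) {
    assert(textureSize > 0);
    return (extent + textureSize - 1) / textureSize;
}

// Total number of square master textures needed to cover a width x height image.
std::uint64_t masterTextureCount(std::uint64_t width, std::uint64_t height,
                                 std::uint64_t textureSize) {
    return texturesAlong(width, textureSize) * texturesAlong(height, textureSize);
}
```

For a 40000 x 30000 image, `masterTextureCount(40000, 30000, 16384)` gives 6, a size of 4096 gives 80, and a size of 2048 gives 300, which would exceed a limit of 128.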
What's not entirely clear to me is how much of this I can move off the main thread while still using MTKView. If part of the rendering pass is going to block the main thread, then I'd prefer to move it to a background thread so that UI elements (like sliders and checkboxes) remain fully responsive.
Or maybe this isn't the right strategy in the first place? Is there a better way to display really large images in Metal other than tiling? (i.e.: Images larger than Metal's texture size limit of 16384?)
In my Metal app for macOS, I have a situation where I only want to display the render results every so often. I want to complete the rendering pass every frame, and save the drawable texture image to a file, but I only want to display the render every sixteenth frame or so. I tried just skipping commandBuffer.present(drawable) when I don't want to display, but it is not working. It just stops displaying new frames once I do that. After skipping one call to commandBuffer.present(), it just doesn't display any new frames. It does continue to run, however.
Why would that happen? Once I commit a command buffer, is it required for it to be presented?
If I can't get this to work, then I will try to render into an offscreen buffer for these frames I don't want displayed. But it would be extra work and require more memory for the offscreen render buffer, so I'd rather just be able to use my regular onscreen render buffer if possible.
Thanks!
It's not required that a command buffer present a drawable. I think the issue is that, once you've obtained the drawable, it's not returned to the pool maintained by the CAMetalLayer (or, indirectly, MTKView) that provided it until it is presented.
Do not render to a drawable's texture if you don't plan on presenting. Rendering to an off-screen texture is the right approach. In fact, if you always render first to an off-screen texture and then, only for the frames you want to display, copy that to a drawable's texture, then you can leave the framebufferOnly property of the CAMetalLayer with its default true value. In that case, there's a decent chance that you won't increase the memory required (because the drawable's texture is really just part of the screen's backing store).
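A minimal sketch of that control flow, written in C++ rather than as Metal code: the three callbacks are hypothetical placeholders for the real work (encoding the off-screen pass, saving the result, and acquiring a drawable, copying into it and presenting it). None of the names below are Metal API. The only point is that a drawable is requested solely in the frames that will actually be presented.

```cpp
#include <cassert>
#include <functional>

// Hypothetical hooks standing in for the real rendering work.
struct FrameHooks {
    std::function<void(int)> renderOffscreen;  // render the frame into an off-screen texture
    std::function<void(int)> saveToFile;       // save the off-screen result
    std::function<void(int)> copyAndPresent;   // acquire a drawable, copy into it, present it
};

// Runs `frameCount` frames and presents every `presentInterval`-th one.
// Returns the number of frames that were presented.
int runFrames(int frameCount, int presentInterval, const FrameHooks& hooks) {
    assert(presentInterval > 0);
    int presented = 0;
    for (int frame = 0; frame < frameCount; ++frame) {
        hooks.renderOffscreen(frame);  // happens every frame
        hooks.saveToFile(frame);       // happens every frame
        if (frame % presentInterval == 0) {
            // Only here is a drawable needed, and it is always presented.
            hooks.copyAndPresent(frame);
            ++presented;
        }
    }
    return presented;
}
```

With `frameCount = 64` and `presentInterval = 16`, the off-screen and save hooks run 64 times while `copyAndPresent` runs for frames 0, 16, 32 and 48 only.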
I want to render my scene to a texture and then use that texture in a shader, so I created a framebuffer using an image view and recorded a command buffer for that. I successfully uploaded and executed the command buffer on the GPU, but the image read through the image view's descriptor is black. I'm creating the descriptor from the image view before the rendering loop. Is it black because I create it before anything is rendered to the framebuffer? If so, I will have to update the descriptor every frame. Will I have to create a new descriptor from the image view every frame? Or is there another way I can do this?
I have read the other thread with this title. Don't mark this as a duplicate, because that thread is about textures and this is a texture from an image view.
Thanks.
@IAS0601 I will answer the questions from Your comment in an answer, as it allows for much longer text, and its formatting is much better. I hope this also answers Your original question, but You don't have to treat it as the answer. As I wrote, I'm not sure what You are asking about.
1) In practically all cases, the GPU accesses images through image views. They specify additional parameters which define how the image is accessed (for example, which part of the image is accessed), but it is still the original image that gets accessed. An image view, as the name suggests, is just a view, a list of access parameters. It doesn't have any memory bound to it and it doesn't contain any data (apart from the parameters specified during image view creation).
So when You create a framebuffer and render into it, You render into original images or, to be more specific, to those parts of original images which were specified in image views. For example, You have a 2D texture with 3 array layers. You create a 2D image view for the middle (second) layer. Then You use this image view during framebuffer creation. And now when You render into this framebuffer, in fact You are rendering into the second layer of the original 2D texture array.
Another thing - when You later access the same image, and when You use the same image view, You still access the original image. If You rendered something into the image, then You will get the updated data (provided You have done everything correctly, like perform appropriate synchronization operations, layout transition if necessary etc.). I hope this is what You mean by updating image view.
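The layer example above can be modelled with a small toy in C++. `ToyImage`, `ToyImageView`, `renderThrough` and `readThrough` are made-up names, not Vulkan API. The sketch only illustrates the idea that a view stores access parameters while the image stores the data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy stand-ins, NOT Vulkan types: the "image" owns one color value per
// array layer, and the "view" is only a set of access parameters.
struct ToyImage {
    std::vector<std::uint32_t> layers;  // the only place where data is stored
};

struct ToyImageView {
    ToyImage* image;        // which image is accessed
    std::size_t baseLayer;  // which layer of that image is accessed
};

// "Rendering" through a view writes into the viewed layer of the original image.
void renderThrough(const ToyImageView& view, std::uint32_t color) {
    assert(view.image != nullptr && view.baseLayer < view.image->layers.size());
    view.image->layers[view.baseLayer] = color;
}

// Reading through a view returns whatever the original image currently holds.
std::uint32_t readThrough(const ToyImageView& view) {
    assert(view.image != nullptr && view.baseLayer < view.image->layers.size());
    return view.image->layers[view.baseLayer];
}
```

With a three-layer `ToyImage` and a view whose `baseLayer` is 1, `renderThrough` changes only the middle layer, and a later `readThrough` with the same view returns the value that was written.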
2) I'm not sure what You mean by updating descriptor set. In Vulkan when we update a descriptor set, this means that we specify handles of Vulkan resources that should be used through given descriptor set.
If I understand You correctly - You want to render something into an image. You create an image view for that image and provide that image view during framebuffer creation. Then You render something into that framebuffer. Now You want to read data from that image. You have two options. If You want to access only one sample location that is associated with fragment shader's location, You can do this through an input attachment in the next subpass of the same render pass. But this way You can only perform operations which don't require access to multiple texels, for example a color correction.
But if You want to do something more advanced, like blurring or shadow mapping, if You need access to several texels, You must end a render pass and start another one. In this second render pass, You can read data from the original image through a descriptor set. It doesn't matter when this descriptor set was created and updated (when the handle of image view was specified). If You don't change the handles of resources - meaning, if You don't create a new image or a new image view, You can use the same descriptor set and You will access the data rendered in the first render pass.
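As a rough illustration of why the descriptor set does not need to be recreated every frame, here is a toy model in C++. All types and functions are invented for this sketch and are not Vulkan API. It ignores synchronization and layouts, which the real code still has to get right.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-ins, NOT Vulkan types.
struct ToyImage { std::uint32_t texel = 0; };           // 0 stands for black
struct ToyImageView { ToyImage* image; };               // access parameters only
struct ToyDescriptorSet { const ToyImageView* view; };  // stores a handle, not data

// First render pass: writes into the image behind the view.
void firstPassRender(const ToyImageView& target, std::uint32_t color) {
    assert(target.image != nullptr);
    target.image->texel = color;
}

// Second render pass: reads the image through the descriptor set.
std::uint32_t secondPassSample(const ToyDescriptorSet& set) {
    assert(set.view != nullptr && set.view->image != nullptr);
    return set.view->image->texel;
}

// Writes the descriptor set once, before the loop, and counts the frames in
// which the second pass saw exactly what the first pass rendered.
int countFramesWithFreshData(int frameCount) {
    ToyImage image;
    ToyImageView view{&image};
    ToyDescriptorSet set{&view};  // "updated" once, before anything is rendered
    int fresh = 0;
    for (int frame = 1; frame <= frameCount; ++frame) {
        const std::uint32_t color = static_cast<std::uint32_t>(frame);
        firstPassRender(view, color);
        if (secondPassSample(set) == color) ++fresh;
    }
    return fresh;
}
```

In this model the set is written before the first frame is rendered, yet every frame's second pass reads that frame's data, because the set only refers to the view and the view only refers to the image.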
If You have problems accessing the data, for example (as You wrote) You get only black colors, this suggests You didn't perform everything correctly - render pass load or store ops are incorrect, or initial and final layouts are incorrect. Or synchronization isn't performed correctly. Unfortunately, without access to Your project, we can't be sure what is wrong.
I am rendering using SlimDX to a control in a form. Since the size of that control might change very often, and there are lots of complex meshes, the traditional free-reset-construct method may be too slow for my taste. Is there any way to speed it up?
Create an additional swap chain linked to your current window using the IDirect3DDevice9::CreateAdditionalSwapChain method.
Then get the back buffer of the new swap chain and use the IDirect3DDevice9::SetRenderTarget method to set it as the render target.
When you have finished drawing, call the Present method of the new swap chain instead of IDirect3DDevice9::Present.
When your window is resized, just release the additional swap chain, re-create it with the new back buffer size, and set the render target again. That way you don't have to do the device reset, which is very slow.
if you have any more questions, email me : xux660#hotmail.com
I am a chinese so my english is not so good, forgive me.