I'm a little fuzzy on the concepts of RenderPass, Image, and Framebuffer, especially Framebuffer. Is it a piece of memory for storing pixel data or not? And if so, how is it different from Image?
Or does rendering flow like this: the rendered pixels are stored in the Framebuffer first, and then in the Image?
Objects in Vulkan are pretty much what the API suggests that they are. VkFramebuffer does not get attached to any VkDeviceMemory allocations (directly), so it clearly is not "a piece of memory for storing pixel data". The Vulkan specification lays out the role of a VkFramebuffer pretty simply:
Framebuffers represent a collection of specific memory attachments that a render pass instance uses.
That's all they do: they hold all of the image attachments that a particular render pass instance will use.
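For illustration, here is a minimal sketch of creating one; everything it references (device, renderPass, colorView, depthView, width, height) is assumed to have been created elsewhere:

// A framebuffer is just a bundle of image views that a compatible
// render pass instance will use as its attachments.
VkImageView attachments[] = { colorView, depthView };

VkFramebufferCreateInfo fbInfo = {};
fbInfo.sType           = VK_STRUCTURE_TYPE_FRAMEBUFFER_CREATE_INFO;
fbInfo.renderPass      = renderPass;   // must be compatible with this render pass
fbInfo.attachmentCount = 2;
fbInfo.pAttachments    = attachments;  // the "collection of memory attachments"
fbInfo.width           = width;
fbInfo.height          = height;
fbInfo.layers          = 1;

VkFramebuffer framebuffer;
vkCreateFramebuffer(device, &fbInfo, nullptr, &framebuffer);

Note that the pixel storage itself belongs to the VkImages behind those views; the framebuffer merely points at them.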
I'm following vulkan-tutorial.com and in the images portion of the tutorial it mentions layout transitions, but doesn't elaborate on what they are. I don't like to copy and paste code without knowing exactly what it does and I can't find a sufficient explanation in the tutorial or on google.
A "layout transition" is exactly what those words mean. It's when you transition the layout of an image sub-resource from one layout to another. So your question really seems to be... what is a layout?
In the Vulkan abstraction, images are composed of sub-resources. These represent distinct sections of an image which can be manipulated independently of other sections. For example, each mipmap level of a mipmapped image is a sub-resource.
At any particular time that an image sub-resource is being used by a GPU process, that sub-resource has a layout. This is part of the Vulkan abstraction of GPU operations, so exactly what it means to the GPU will vary from chip to chip.
The important part is this: layouts restrict how you can use an image sub-resource. Or more to the point, in order to use an image sub-resource in a particular way, it must be in a layout which permits that usage.
When a sub-resource is in the VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL layout, you can only perform operations which read from the sub-resource within a shader. The shader cannot write to the image, nor can the image be used as a render target.
Now, the VK_IMAGE_LAYOUT_GENERAL layout allows pretty much any use at any time while within that layout. However, this can also mean less optimal performance. Any of the more restricted layouts can make accesses to the image more performance-friendly (depending on hardware).
So it is your job to keep track of the layout of any image sub-resources you plan to use. For most images, you will use the transfer destination layout to upload to them and then just leave them as shader read-only, because most images aren't used in more arbitrary ways. So in practice, this means keeping track of render targets that you want to read from, swapchain images (you have to transition them to the present layout before presenting them), and storage images.
Layout transitions typically happen as part of an explicit dependency between two operations. This makes sense; if you're uploading data to an image, and you later want to read from it, you need a dependency between the upload and the read. You may as well do the layout transition then, since the transition can modify the way the bytes of the image are stored, so you need the transfer to be done first.
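As a sketch, here is what such a combined dependency-plus-transition might look like for a texture that was just filled by a transfer and is about to be sampled (the image and command buffer handles are assumed to exist):

// Transition from the upload layout to the sampling layout, and make
// the transfer write visible to fragment-shader reads, in one barrier.
VkImageMemoryBarrier barrier = {};
barrier.sType               = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER;
barrier.srcAccessMask       = VK_ACCESS_TRANSFER_WRITE_BIT;
barrier.dstAccessMask       = VK_ACCESS_SHADER_READ_BIT;
barrier.oldLayout           = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL;
barrier.newLayout           = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL;
barrier.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
barrier.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
barrier.image               = image;
barrier.subresourceRange    = { VK_IMAGE_ASPECT_COLOR_BIT, 0, 1, 0, 1 };

vkCmdPipelineBarrier(cmdBuffer,
    VK_PIPELINE_STAGE_TRANSFER_BIT,         // after the upload finishes...
    VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,  // ...and before shader reads begin
    0, 0, nullptr, 0, nullptr, 1, &barrier);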
I want to load a texture which is laid out in memory as ARGB (with 1-byte components). Therefore, when loading it with Vulkan, I need to apply a texture swizzle to it, as there is no ARGB pixel format in Vulkan.
My question is, should I prefer using VK_FORMAT_R8G8B8A8_UNORM over VK_FORMAT_B8G8R8A8_UNORM or the other way around?
PS: This texture will be used only once and then updated.
Edit: Please note that the available alternatives are:
VK_FORMAT_R8G8B8A8_UNORM
VK_FORMAT_B8G8R8A8_UNORM
VK_FORMAT_A8B8G8R8_UNORM_PACK32 (equivalent to VK_FORMAT_R8G8B8A8_UNORM on LE systems, therefore I will ignore it)
Both of the non-packed types are required image formats in Vulkan and will therefore always be present in Vulkan implementations for most kinds of image usage. So long as you're not using the format for buffer views, image load/store, or as an attachment, there is no Vulkan-detectable reason to prefer one format over the other.
Since you are going to apply a swizzle mask to the image view, the order of components in the format isn't all that relevant to texture sampling operations.
As such, which one you use is just a matter of taste.
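As a sketch, here is what the swizzle might look like if you upload the raw A,R,G,B bytes into a VK_FORMAT_R8G8B8A8_UNORM image (the rest of the image-view setup is assumed to be filled in elsewhere):

// In R8G8B8A8_UNORM the raw A,R,G,B bytes land as r=A, g=R, b=G, a=B,
// so the view's component mapping puts them back in place.
VkImageViewCreateInfo viewInfo = {};
viewInfo.sType    = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO;
viewInfo.image    = image;
viewInfo.viewType = VK_IMAGE_VIEW_TYPE_2D;
viewInfo.format   = VK_FORMAT_R8G8B8A8_UNORM;
viewInfo.components = {
    VK_COMPONENT_SWIZZLE_G,   // R comes from the second byte
    VK_COMPONENT_SWIZZLE_B,   // G comes from the third byte
    VK_COMPONENT_SWIZZLE_A,   // B comes from the fourth byte
    VK_COMPONENT_SWIZZLE_R,   // A comes from the first byte
};
viewInfo.subresourceRange = { VK_IMAGE_ASPECT_COLOR_BIT, 0, 1, 0, 1 };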
In Direct3D12, you can use "ID3D12Resource::WriteToSubresource" to enable zero-copy optimizations for UMA adapters.
What is the equivalent of "ID3D12Resource::WriteToSubresource" in Vulkan?
What WriteToSubresource seems to do (in Vulkan-equivalent terms) is write pixel data from CPU memory to an image whose storage is in CPU-writable memory (hence the requirement that it first be mapped), to do so immediately without the need for a command buffer, and to do so regardless of linear/tiled layout.
Vulkan doesn't have a way to do that. You can write directly to the backing storage for linear images (in the generic layout), but not for tiled ones. You have to use a proper transfer command for that, even on UMA architectures. Which means building a command buffer and submitting to a transfer-capable queue, since Vulkan doesn't have any immediate copy commands like that.
A Vulkan way to do this would essentially be a function that takes a mapped pointer to device memory and writes your pixel data into it, formatted appropriately for a tiled VkImage in the pre-initialized layout that you intend to bind to that region of memory. You could then bind the image to that location and transition the layout to whatever you want.
But that would require adding such a function and allowing the pre-initialized layout to be used for tiled images (so long as the data is written by this function).
So, from the ID3D12Resource::WriteToSubresource documentation, I read that it performs one copy, with some marketing speak sprinkled on top.
Vulkan is an explicit API, which perfectly well allows you to do a one-copy upload on UMA (or on anything else). It even allows you to do a real zero-copy, if you stick with linear tiling.
UMA may look like this: https://vulkan.gpuinfo.org/displayreport.php?id=4919#memorytypes
That is, it has only one heap, and the memory type is both DEVICE_LOCAL and HOST_VISIBLE.
So, if you create a linearly tiled image/buffer in Vulkan, vkMapMemory its memory, and then produce your data directly into that mapped pointer, there you have a (real) zero-copy.
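A sketch of that path, with illustrative names (the image was created with VK_IMAGE_TILING_LINEAR and bound to DEVICE_LOCAL | HOST_VISIBLE memory):

// Query where the implementation put the subresource, and its row pitch.
VkImageSubresource sub = { VK_IMAGE_ASPECT_COLOR_BIT, 0, 0 };
VkSubresourceLayout layout;
vkGetImageSubresourceLayout(device, image, &sub, &layout);

void* mapped = nullptr;
vkMapMemory(device, imageMemory, 0, VK_WHOLE_SIZE, 0, &mapped);

// Produce pixel data straight into the mapped storage, one row at a
// time, honoring the implementation's row pitch. decodeRowInto() is a
// hypothetical producer writing width * 4 bytes per row.
for (uint32_t y = 0; y < height; ++y) {
    uint8_t* row = static_cast<uint8_t*>(mapped) + layout.offset + y * layout.rowPitch;
    decodeRowInto(row, y);
}
vkUnmapMemory(device, imageMemory);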
Since this is not always practical (you cannot always choose how things are allocated, e.g. if the data is returned from a library function), there is an extension, VK_EXT_external_memory_host (assuming your ICD supports it, of course), which allows you to import your host data directly, without having to first make a Vulkan memory mapping.
Now, there are optimally tiled images. Optimal tiling is opaque in Vulkan (so far), and implementation-dependent, so you do not even know the addressing scheme without some reverse engineering. You, generally speaking, want to use optimally tiled images, because supposedly accessing them has better performance characteristics (at least in common situations).
This is where the single copy comes in. You take your linearly tiled image (or buffer) and vkCmdCopy* it into your optimally tiled image. That copy is performed by the device/GPU with all its bells and whistles, potentially faster than the CPU could do it, which is what I suspect they would call "near zero-copy".
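A sketch of that single device-side copy from a staging buffer, with illustrative names (the destination image is assumed to have already been transitioned to the transfer-destination layout):

// One copy: host-visible staging buffer -> optimally tiled image.
VkBufferImageCopy region = {};
region.bufferOffset     = 0;
region.imageSubresource = { VK_IMAGE_ASPECT_COLOR_BIT, 0, 0, 1 };
region.imageExtent      = { width, height, 1 };

vkCmdCopyBufferToImage(cmd, stagingBuffer, optimalImage,
                       VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, 1, &region);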
In DirectX12, you render multiple objects in different locations using the equivalent of a single uniform buffer for the world transform like:
// Basic simplified pseudocode
SetRootSignature();
SetPrimitiveTopology();
SetPipelineState();
SetDepthStencilTarget();
SetViewportAndScissor();

for (auto object : objects)
{
    SetIndexBuffer();
    SetVertexBuffer();

    struct VSConstants
    {
        QEDx12::Math::Matrix4 modelToProjection;
    } vsConstants;
    vsConstants.modelToProjection = ViewProjMat * object->GetWorldProj();

    SetDynamicConstantBufferView(0, sizeof(vsConstants), &vsConstants);
    DrawIndexed();
}
However, in Vulkan, if you do something similar with a single uniform buffer, all the objects are rendered in the location of the last world matrix:
for (auto object : objects)
{
    SetIndexBuffer();
    SetVertexBuffer();
    UploadUniformBuffer(object->GetWorldProj());
    DrawIndexed();
}
Is there a way to draw multiple objects with a single uniform buffer in Vulkan, just like in DirectX12?
I'm aware of Sascha Willems' dynamic uniform buffer example (https://github.com/SaschaWillems/Vulkan/tree/master/dynamicuniformbuffer) where he packs many matrices into one big uniform buffer, and while useful, it is not exactly what I am looking for.
Thanks in advance for any help.
I cannot find a function called SetDynamicConstantBufferView in the D3D 12 API. I presume this is some function of your invention, but without knowing what it does, I can only really guess.
It looks like you're uploading data to the buffer object while rendering. If that's the case, well, Vulkan can't do that. And that's a good thing. Uploading to memory that you're currently reading from requires synchronization. You have to issue a barrier between the last rendering command that was reading the data you're about to overwrite, and the next rendering command. It's just not a good idea if you like performance.
But again, I'm not sure exactly what that function is doing, so my understanding may be wrong.
In Vulkan, descriptors are generally not meant to be changed in the middle of rendering a frame. However, the makers of Vulkan realized that users sometimes want to draw using different subsets of the same VkBuffer object. This is what dynamic uniform/storage buffers are for.
You technically don't have multiple uniform buffers; you just have one. But you can use the offset(s) provided to vkCmdBindDescriptorSets to shift where in that buffer the next rendering command(s) will get their data from. So it's a light-weight way to supply different rendering commands with different data.
Basically, you rebind your descriptor sets, but with different pDynamicOffset array values. To make these work, you need to plan ahead. Your pipeline layout has to explicitly declare those descriptors as being dynamic descriptors. And every time you bind the set, you'll need to provide the offset into the buffer used by that descriptor.
That being said, it would probably be better to make your uniform buffer store larger arrays of matrices, using the dynamic offset to jump from one block of matrices to the other. You would then use some per-draw index to pick the right matrix out of the bound block.
The point of that is that the uniform data you provide (depending on hardware) will remain in shader memory unless you do something to change the offset or shader. There is some small cost to uploading such data, so minimizing the need for such uploads is probably not a bad idea.
So you should upload all of your objects' buffer data in a single DMA operation. Then you issue a barrier and do your rendering, using dynamic offsets to point each draw at its block of data.
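A minimal sketch of that loop, assuming a single VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC descriptor and per-object blocks padded to minUniformBufferOffsetAlignment (all names are illustrative):

for (uint32_t i = 0; i < objectCount; ++i)
{
    // Same descriptor set every time; only the dynamic offset changes.
    uint32_t dynamicOffset = i * alignedBlockSize;
    vkCmdBindDescriptorSets(cmd, VK_PIPELINE_BIND_POINT_GRAPHICS,
                            pipelineLayout, 0, 1, &descriptorSet,
                            1, &dynamicOffset);
    vkCmdDrawIndexed(cmd, indexCounts[i], 1, 0, 0, 0);
}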
You either have to use push constants or have separate uniform buffers for each location. These can be bound either with a descriptor per location or with a dynamic offset.
In Sascha's example you can have more than just the one matrix inside the uniform buffer.
That means that inside UploadUniformBuffer you append the new matrix to the buffer and bind the new location.
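For the push-constant route, a minimal sketch (the pipeline layout is assumed to declare a matching VkPushConstantRange for the vertex stage; Matrix4 and the object interface are illustrative):

for (auto object : objects)
{
    // The matrix travels inside the command buffer itself; no uniform
    // buffer update or synchronization is needed per draw.
    Matrix4 mvp = viewProj * object->GetWorldProj();
    vkCmdPushConstants(cmd, pipelineLayout, VK_SHADER_STAGE_VERTEX_BIT,
                       0, sizeof(mvp), &mvp);
    vkCmdDrawIndexed(cmd, object->indexCount, 1, 0, 0, 0);
}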
I'm creating a DirectX 11 game that renders complex meshes in 3D space. I'm using vertex/index buffers/shaders and this all works fine. However I now want to perform some basic 'overlay' rendering - more specifically, I want to render wireframe boxes in 3D space to show the bounds of a particular area. There would only ever be one or two boxes in view at any one time, and their vertices would change position each frame.
I've therefore been searching for simpler DX11 rendering methods but most articles I find still prepare a vertex/index buffer for very simple rendering. I know that hardware is well optimised for processing vertex streams, but is the overhead of building and filling a vertex buffer every frame just to process 8 vertices really the most efficient method?
My question is therefore, what is the most efficient method for performing this very simple rendering in DX11? Is there any more primitive method ("DrawLine", "DrawLineList(D3DXVECTOR3[])", ...) that would be a better solution? It could be less efficient per-vertex than the standard method of passing vertex buffers because it's only ever going to be used for a handful of vertices per frame.
Thanks in advance
Rob
You should create a single vertex/index buffer for each primitive shape (box, sphere, ...) and use a transformation matrix to place it correctly in the world.
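A minimal sketch of that approach for the wireframe-box case, assuming a unit-cube line-list vertex buffer (g_boxVB, 24 vertices for 12 edges) and a small world-matrix constant buffer (g_cbWorld) created once at startup; all names are illustrative:

// Scale/translate the unit box to the bounds you want to show.
DirectX::XMMATRIX world =
    DirectX::XMMatrixScaling(extents.x, extents.y, extents.z) *
    DirectX::XMMatrixTranslation(center.x, center.y, center.z);

// Update only the tiny constant buffer each frame, not the vertices.
D3D11_MAPPED_SUBRESOURCE mapped;
g_context->Map(g_cbWorld, 0, D3D11_MAP_WRITE_DISCARD, 0, &mapped);
memcpy(mapped.pData, &world, sizeof(world));
g_context->Unmap(g_cbWorld, 0);

UINT stride = sizeof(DirectX::XMFLOAT3), offset = 0;
g_context->IASetVertexBuffers(0, 1, &g_boxVB, &stride, &offset);
g_context->IASetPrimitiveTopology(D3D11_PRIMITIVE_TOPOLOGY_LINELIST);
g_context->VSSetConstantBuffers(0, 1, &g_cbWorld);
g_context->Draw(24, 0);  // 12 edges * 2 vertices, no index buffer needed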