Is it possible to accidentally overwrite "meta-data" in VkDeviceMemory? - vulkan

From the answer to this reddit post:
...drivers may need some additional meta-data to manage given resource. That's why we need to call vkGet...MemoryRequirements() functions and allocate enough memory.
However, according to the above answer, vkMapMemory maps the allocated memory, inside which the meta-data required by the driver may be contained.
When I write to that mapped memory, how am I supposed to know whether I am overwriting the meta-data?

You should ignore that part of the post. It's nonsense.
Implementations may in fact need metadata for VkBuffer and VkImage objects. But that metadata is not stored within the VkDeviceMemory you provide when you bind those objects to memory. It is associated with the VkBuffer and VkImage objects themselves: it is allocated when those objects are created (which is why creating them takes a VkAllocationCallbacks) and deallocated when they are destroyed.
The purpose of querying the memory requirements for buffers and images is not metadata associated with the buffer/image per se. It's for layout purposes. Some buffer usages have to have specific alignments, and the sizes may need to be rounded up to that alignment. Optimally tiled images have a layout that is completely opaque; you're not allowed to directly write to the bytes of a tiled image. You can only copy into/out of them with image copy commands.
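To make that concrete, here is a minimal sketch of the intended usage, assuming device and buffer already exist; FindMemoryType is a hypothetical helper that picks a memory type index allowed by memoryTypeBits with the requested property flags:

// Query the size/alignment/memory-type constraints the driver imposes.
VkMemoryRequirements reqs;
vkGetBufferMemoryRequirements(device, buffer, &reqs);

VkMemoryAllocateInfo allocInfo = {};
allocInfo.sType           = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO;
allocInfo.allocationSize  = reqs.size;   // already rounded up by the driver
allocInfo.memoryTypeIndex = FindMemoryType(reqs.memoryTypeBits,   // hypothetical helper
                                           VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT);

VkDeviceMemory memory;
vkAllocateMemory(device, &allocInfo, nullptr, &memory);

// The offset passed here must be a multiple of reqs.alignment (0 trivially is).
vkBindBufferMemory(device, buffer, memory, 0);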

Related

Why are Resource Descriptors needed?

When a VkPipeline accesses a vertex buffer:
The VkPipeline needs a vertex buffer.
vkCmdBindVertexBuffers() finds that VkBuffer object.
When a VkPipeline accesses a uniform buffer:
The VkPipeline needs a uniform buffer.
vkCmdBindDescriptorSets() finds a VkDescriptorSet.
The VkDescriptorSet finds the corresponding VkDescriptorBufferInfo.
The VkDescriptorBufferInfo finds that VkBuffer object.
My question is: why can't there be a hypothetical function called vkCmdBindUniformBuffer(), just like vkCmdBindVertexBuffers()?
Why are descriptors needed?
Edit: I asked this because I first thought descriptors were like pointers in C/C++. However, I don't spend hundreds of lines just to create a pointer in C/C++. Descriptors feel like an over-complication of something that could be as easy as calling a vkCmdBind...() function.
What does "needed" mean in this context? Vulkan defines an abstraction of what's going on in actual hardware. Nothing is strictly "needed"; there are merely different consequences of different abstractions.
One consequence of the descriptor set abstraction is that descriptor set layouts tell both the pipeline-building code and the descriptor-set-binding code what to expect as far as the mapping between any particular Vulkan resource and the underlying hardware resources is concerned. The pipeline layout defines a direct mapping.
The pipeline layout represents a mapping from Vulkan set/binding indices to internal resource indices. The internal hardware has different kinds of resources. So which internal resources a particular binding takes depends on the kind of binding. Sampled images take up a set of resources that is separate from SSBOs, for example. The hardware potentially has X sampled image indices and Y indices for storage buffers.
However, some hardware doesn't have different internal resource types for certain Vulkan constructs. For example, some hardware doesn't have a "uniform buffer" as a distinct construct. The implementation implements UBOs as a read-only storage buffer. But this means that any Vulkan UBO also takes up the same resource indices from an SSBO.
As such, we need a way to map from Vulkan resources to the internal resource lists, one which allows the implementation to hide details like this. This is what the pipeline layout is for: the layout defines a mapping from each descriptor in the layout to a particular internal resource.
If set 0 assigns a storage buffer to internal index 0, the system knows that if set 1 uses a UBO, that UBO must use internal index 1, since internal index 0 was already taken by set 0. That is because, on this hardware, UBOs and SSBOs use the same list of resources.
This can't be done without pipeline layouts or some similar system. You need something which tells the system what all of the resources are, so that it can build a mapping table.
Descriptors are grouped into sets to make it easier to change a large number of resources at once. If you want to switch to a different set of 8 textures, your way might require 8 different function calls. Furthermore, because you can have different kinds of descriptors bundled into the same set, you can change 3 textures, 2 UBOs, and one SSBO all in a single bind call.
Set groupings also allow you to have pipeline layouts that are partially compatible. Two pipelines can have the same set 0 but different set 1s. This means when you switch pipelines, you can bind a different descriptor for set 1 without changing set 0's binding. This is useful for descriptors whose update frequency is different.
For example, every object in a scene may use the same perspective and camera matrices, but different world matrices and textures. You can put the former into set 0 and the latter into set 1.
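As a rough illustration of that split, here is a sketch with made-up names (device, perFrameLayout, perObjectLayout are assumptions, not code from the question): set 0 holds the per-frame camera UBO, set 1 the per-object texture, and both are declared in one pipeline layout.

// Set 0, binding 0: camera/perspective matrices, updated once per frame.
VkDescriptorSetLayoutBinding cameraBinding = {};
cameraBinding.binding         = 0;
cameraBinding.descriptorType  = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER;
cameraBinding.descriptorCount = 1;
cameraBinding.stageFlags      = VK_SHADER_STAGE_VERTEX_BIT;

// Set 1, binding 0: per-object texture, rebound per object.
VkDescriptorSetLayoutBinding textureBinding = {};
textureBinding.binding         = 0;
textureBinding.descriptorType  = VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER;
textureBinding.descriptorCount = 1;
textureBinding.stageFlags      = VK_SHADER_STAGE_FRAGMENT_BIT;

VkDescriptorSetLayoutCreateInfo layoutInfo = {};
layoutInfo.sType        = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO;
layoutInfo.bindingCount = 1;

VkDescriptorSetLayout perFrameLayout, perObjectLayout;
layoutInfo.pBindings = &cameraBinding;
vkCreateDescriptorSetLayout(device, &layoutInfo, nullptr, &perFrameLayout);
layoutInfo.pBindings = &textureBinding;
vkCreateDescriptorSetLayout(device, &layoutInfo, nullptr, &perObjectLayout);

// The position in this array is the set index: perFrameLayout is set 0,
// perObjectLayout is set 1. Pipelines sharing set 0 can swap set 1 freely.
VkDescriptorSetLayout setLayouts[] = { perFrameLayout, perObjectLayout };

VkPipelineLayoutCreateInfo pipelineLayoutInfo = {};
pipelineLayoutInfo.sType          = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO;
pipelineLayoutInfo.setLayoutCount = 2;
pipelineLayoutInfo.pSetLayouts    = setLayouts;

VkPipelineLayout pipelineLayout;
vkCreatePipelineLayout(device, &pipelineLayoutInfo, nullptr, &pipelineLayout);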

How to write to the image directly by CPU when loading it in Vulkan?

In Direct3D12, you can use "ID3D12Resource::WriteToSubresource" to enable zero-copy optimizations for UMA adapters.
What is the equivalent of "ID3D12Resource::WriteToSubresource" in Vulkan?
What WriteToSubresource seems to do (in Vulkan-equivalent terms) is write pixel data from CPU memory to an image whose storage is in CPU-writable memory (hence the requirement that it first be mapped), to do so immediately without the need for a command buffer, and to do so regardless of the image's tiling (linear or optimal).
Vulkan doesn't have a way to do that. You can write directly to the backing storage for linear images (in the general layout), but not for tiled ones. You have to use a proper transfer command for that, even on UMA architectures. That means building a command buffer and submitting it to a transfer-capable queue, since Vulkan doesn't have any immediate copy commands like that.
A Vulkan way to do this would essentially be a function that writes pixel data through a mapped pointer into device memory, laid out as appropriate for a tiled VkImage in the pre-initialized layout that you intend to bind to that particular region of memory. That way, you could then bind the image to that location of memory, and you'd be able to transition the layout to whatever you want.
But that would require adding such a function and allowing the pre-initialized layout to be used for tiled images (so long as the data is written by this function).
So, from the ID3D12Resource::WriteToSubresource documentation I read that it performs one copy, with marketese sprinkled on top.
Vulkan is an explicit API, which perfectly well allows you to do a one-copy upload on UMA (or on anything else). It even allows you to do a real zero-copy, if you stick with linear tiling.
UMA may look like this: https://vulkan.gpuinfo.org/displayreport.php?id=4919#memorytypes
I.e. it has only one heap, and its memory type is both DEVICE_LOCAL and HOST_VISIBLE.
So, if you create a linearly tiled image/buffer in Vulkan, vkMapMemory its memory, and then produce your data directly into that mapped pointer, there you have a (real) zero-copy.
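A minimal sketch of that path, assuming image is a VK_IMAGE_TILING_LINEAR image already bound to memory from such a DEVICE_LOCAL | HOST_VISIBLE type, and width, height and srcPixels stand in for whatever produces your RGBA8 data:

// Ask the driver how the linear image is laid out (offset, rowPitch).
VkImageSubresource subresource = { VK_IMAGE_ASPECT_COLOR_BIT, 0, 0 };
VkSubresourceLayout layout;
vkGetImageSubresourceLayout(device, image, &subresource, &layout);

void* mapped = nullptr;
vkMapMemory(device, memory, 0, VK_WHOLE_SIZE, 0, &mapped);

// Write the pixels row by row, respecting the driver-reported rowPitch.
// Ideally you generate the data straight into this pointer (true zero-copy);
// the memcpy here just stands in for that production step.
uint8_t* dst = static_cast<uint8_t*>(mapped) + layout.offset;
for (uint32_t y = 0; y < height; ++y)
    memcpy(dst + y * layout.rowPitch, srcPixels + y * width * 4, width * 4);

// If the memory type is not HOST_COHERENT, vkFlushMappedMemoryRanges is
// needed before the device reads this data.
vkUnmapMemory(device, memory);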
Since this is not always practical (i.e. you cannot always choose how things are allocated, e.g. if the data is returned from a library function), there is the extension VK_EXT_external_memory_host (assuming your ICD supports it, of course), which allows you to import your host data directly, without having to first copy it into a mapped Vulkan allocation.
Now, there are optimally tiled images. Optimal tiling is opaque in Vulkan (so far), and implementation-dependent, so you do not even know the addressing scheme without some reverse engineering. You, generally speaking, want to use optimally tiled images, because supposedly accessing them has better performance characteristics (at least in common situations).
This is where the single copy comes in. You would take your linearly tiled image (or buffer) and vkCmdCopy* it into your optimally tiled image. That copy is performed by the Device/GPU with all its bells and whistles, potentially faster than the CPU could do it, i.e. what I suspect they would call "near zero-copy".
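For reference, the device-side copy itself is a single command. A rough sketch, assuming cmd is a command buffer being recorded, stagingBuffer holds the pixel data, and optimalImage has already been transitioned to TRANSFER_DST_OPTIMAL:

VkBufferImageCopy region = {};
region.bufferOffset      = 0;
region.bufferRowLength   = 0;   // 0 means tightly packed
region.bufferImageHeight = 0;
region.imageSubresource  = { VK_IMAGE_ASPECT_COLOR_BIT, 0, 0, 1 };
region.imageOffset       = { 0, 0, 0 };
region.imageExtent       = { width, height, 1 };

vkCmdCopyBufferToImage(cmd, stagingBuffer, optimalImage,
                       VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, 1, &region);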

Vulkan descriptor binding

In my Vulkan application I used to draw meshes like this when all the meshes used the same texture:
UpdateDescriptorSets(texture)
Command buffer record
{
    for each mesh
        bind transform UBO
        draw mesh
}
But now I want each mesh to have a unique texture, so I tried this:
Command buffer record
{
    for each mesh
        bind transform UBO
        UpdateDescriptorSets(textures[meshIndex])
        draw mesh
}
But it gives an error saying the descriptor set is destroyed or updated. I looked in the Vulkan documentation and found out that I can't update a descriptor set during command buffer recording. So how can I give each mesh a unique texture?
vkUpdateDescriptorSets is not synchronized with anything. Therefore, you cannot update a descriptor set while it is in use. You must ensure that all rendering operations that use the descriptor set in question have finished, and that no commands have been placed in command buffers that use the set in question.
It's basically like a global variable; you can't have people accessing a global variable from numerous threads without some kind of synchronization. And Vulkan doesn't synchronize access to descriptor sets.
There are several ways to deal with this. You can give each object its own descriptor set. This is usually done by having the frequently changing descriptor set data be of a higher index than the less frequently changing data. That way, you're not changing every descriptor for each object, only the ones that change on a per-object basis.
You can use push constant data to index into large tables/array textures. So the descriptor set would have an array texture or an array of textures (if you have dynamic indexing for arrays of textures). A push constant would provide an index, which is used by the shader to fetch that particular object's texture from the array texture/array of textures. This makes frequent changes fairly cheap, and the same index can also be used to give each object its own transformation matrices (by fetching into an array of matrices).
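A sketch of the push constant variant (illustrative names; it assumes the pipeline layout declares a push-constant range for the fragment stage, and the fragment shader uses the pushed index to select from an array of combined image samplers):

for (uint32_t meshIndex = 0; meshIndex < meshCount; ++meshIndex)
{
    // Push the per-mesh texture index; cheap, and recorded directly in the command buffer.
    vkCmdPushConstants(cmd, pipelineLayout, VK_SHADER_STAGE_FRAGMENT_BIT,
                       0, sizeof(uint32_t), &meshIndex);
    vkCmdDrawIndexed(cmd, meshes[meshIndex].indexCount, 1, 0, 0, 0);
}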
If you have the extension VK_KHR_push_descriptor available, then you can integrate changes to descriptors directly into the command buffer. How much better this is than the push constant mechanism is of course implementation-dependent.
If you update a descriptor set, then all command buffers in which that descriptor set is bound become invalid. Invalid command buffers cannot be submitted or executed by the GPU.
What you basically need to do is to update descriptor sets before you bind them.
This odd behavior is there because in vkCmdBindDescriptorSets some implementations take the Vulkan descriptor set, translate it to native descriptor tables, and then store it in the command buffer. So if you update the descriptor set after vkCmdBindDescriptorSets, the command buffer will be seeing stale data. The VK_EXT_descriptor_indexing extension relaxes this behavior under some circumstances.
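Putting those answers together, here is a sketch of the usual pattern (names like meshSets and textures are illustrative): give each mesh its own descriptor set, update all of them first, and only then record the binds.

// 1) Update every per-mesh set before any command buffer uses it.
for (size_t i = 0; i < meshes.size(); ++i)
{
    VkDescriptorImageInfo imageInfo = {};
    imageInfo.sampler     = sampler;
    imageInfo.imageView   = textures[i].view;
    imageInfo.imageLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL;

    VkWriteDescriptorSet write = {};
    write.sType           = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
    write.dstSet          = meshSets[i];
    write.dstBinding      = 0;
    write.descriptorCount = 1;
    write.descriptorType  = VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER;
    write.pImageInfo      = &imageInfo;
    vkUpdateDescriptorSets(device, 1, &write, 0, nullptr);
}

// 2) Record the command buffer, binding the pre-updated set for each mesh.
for (size_t i = 0; i < meshes.size(); ++i)
{
    vkCmdBindDescriptorSets(cmd, VK_PIPELINE_BIND_POINT_GRAPHICS, pipelineLayout,
                            0, 1, &meshSets[i], 0, nullptr);
    vkCmdDrawIndexed(cmd, meshes[i].indexCount, 1, 0, 0, 0);
}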

What would be the value of VkAccessFlags for a VkBuffer or VkImage after being allocated memory?

So I create a bunch of buffers and images, and I need to set up a memory barrier for some reason.
How do I know what to specify in the srcAccessMask field for the barrier struct of a newly created buffer or image, seeing as at that point I wouldn't have specified the access flags for it? How do I decide what initial access flags to specify for the first memory barrier applied to a buffer or image?
Specifying initial values for other parameters in Vk*MemoryBarrier is easy since I can clearly know, say, the original layout of an image, but it isn't apparent to me what the value of srcAccessMask could be the first time I set up a barrier.
Is it based on the usage flags specified during creation of the object concerned? Or is there some other way that can be used to find out?
So, let's assume vkCreateImage and VK_IMAGE_LAYOUT_UNDEFINED.
Nowhere does the specification say that it defines some scheduled operation. So it is healthy to assume all its work is done as soon as it returns. Plus, it does not even have memory yet.
So any synchronization needs would be those of the memory you bind to it. Let's assume it is just fresh memory from vkAllocateMemory. Similarly, nowhere does the specification say that it defines some scheduled operation.
Even so, there are really only two options. Either the implementation does nothing with the memory, or it zero-fills it (for security reasons). In the case it zero-fills it, that must be done in a way that you cannot access the original data (even by exploiting synchronization errors). So it is healthy to assume the memory has no "synchronization baggage" on it.
So simply srcStageMask = VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT (no previous outstanding scheduled operation) and srcAccessMask = 0 (no previous writes) should be correct.
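As a sketch, the first barrier for such a freshly created image might look like the following (cmd is the command buffer and image the new image; the destination stage/access and the new layout are just an example, here preparing it as a transfer destination):

VkImageMemoryBarrier barrier = {};
barrier.sType               = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER;
barrier.srcAccessMask       = 0;                                    // no previous writes
barrier.dstAccessMask       = VK_ACCESS_TRANSFER_WRITE_BIT;
barrier.oldLayout           = VK_IMAGE_LAYOUT_UNDEFINED;
barrier.newLayout           = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL;
barrier.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
barrier.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
barrier.image               = image;
barrier.subresourceRange    = { VK_IMAGE_ASPECT_COLOR_BIT, 0, 1, 0, 1 };

vkCmdPipelineBarrier(cmd,
                     VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT,              // nothing to wait on
                     VK_PIPELINE_STAGE_TRANSFER_BIT,
                     0, 0, nullptr, 0, nullptr, 1, &barrier);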

Vulkan: Is there a way to draw multiple objects in different locations like in DirectX12?

In DirectX12, you render multiple objects in different locations using the equivalent of a single uniform buffer for the world transform like:
// Basic simplified pseudocode
SetRootSignature();
SetPrimitiveTopology();
SetPipelineState();
SetDepthStencilTarget();
SetViewportAndScissor();
for (auto object : objects)
{
    SetIndexBuffer();
    SetVertexBuffer();

    struct VSConstants
    {
        QEDx12::Math::Matrix4 modelToProjection;
    } vsConstants;
    vsConstants.modelToProjection = ViewProjMat * object->GetWorldProj();

    SetDynamicConstantBufferView(0, sizeof(vsConstants), &vsConstants);
    DrawIndexed();
}
However, in Vulkan, if you do something similar with a single uniform buffer, all the objects are rendered at the location of the last world matrix:
for (auto object : objects)
{
    SetIndexBuffer();
    SetVertexBuffer();
    UploadUniformBuffer(object->GetWorldProj());
    DrawIndexed();
}
Is there a way to draw multiple objects with a single uniform buffer in Vulkan, just like in DirectX12?
I'm aware of Sascha Willems' dynamic uniform buffer example (https://github.com/SaschaWillems/Vulkan/tree/master/dynamicuniformbuffer), where he packs many matrices into one big uniform buffer; while useful, it is not exactly what I am looking for.
Thanks in advance for any help.
I cannot find a function called SetDynamicConstantBufferView in the D3D 12 API. I presume this is some function of your invention, but without knowing what it does, I can only really guess.
It looks like you're uploading data to the buffer object while rendering. If that's the case, well, Vulkan can't do that. And that's a good thing. Uploading to memory that you're currently reading from requires synchronization. You have to issue a barrier between the last rendering command that was reading the data you're about to overwrite, and the next rendering command. It's just not a good idea if you like performance.
But again, I'm not sure exactly what that function is doing, so my understanding may be wrong.
In Vulkan, descriptors are generally not meant to be changed in the middle of rendering a frame. However, the makers of Vulkan realized that users sometimes want to draw using different subsets of the same VkBuffer object. This is what dynamic uniform/storage buffers are for.
You technically don't have multiple uniform buffers; you just have one. But you can use the offset(s) provided to vkCmdBindDescriptorSets to shift where in that buffer the next rendering command(s) will get their data from. So it's a light-weight way to supply different rendering commands with different data.
Basically, you rebind your descriptor sets, but with different pDynamicOffset array values. To make these work, you need to plan ahead. Your pipeline layout has to explicitly declare those descriptors as being dynamic descriptors. And every time you bind the set, you'll need to provide the offset into the buffer used by that descriptor.
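In terms of API calls, a minimal sketch of that bind loop (illustrative names: dynamicAlignment stands for the per-object block size rounded up to minUniformBufferOffsetAlignment, and descriptorSet contains one VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC descriptor):

for (uint32_t i = 0; i < objectCount; ++i)
{
    // Same set every time; only the window into the uniform buffer moves.
    uint32_t dynamicOffset = i * dynamicAlignment;
    vkCmdBindDescriptorSets(cmd, VK_PIPELINE_BIND_POINT_GRAPHICS, pipelineLayout,
                            0, 1, &descriptorSet, 1, &dynamicOffset);
    vkCmdDrawIndexed(cmd, objects[i].indexCount, 1, 0, 0, 0);
}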
That being said, it would probably be better to make your uniform buffer store larger arrays of matrices, using the dynamic offset to jump from one block of matrices to another.
The point of that is that the uniform data you provide (depending on hardware) will remain in shader memory unless you do something to change the offset or shader. There is some small cost to uploading such data, so minimizing the need for such uploads is probably not a bad idea.
So you should upload all of your objects' buffer data in a single DMA operation. Then you issue a barrier and do your rendering, using dynamic offsets and such to tell each rendering command where its data is.
You either have to use push constants or have separate uniform buffers for each location. These can be bound either with a descriptor per location or with a dynamic offset.
In Sascha's example you can have more than just the one matrix inside the uniform buffer.
That means that inside UploadUniformBuffer you append the new matrix to the buffer and bind the new location.