I'm trying to draw geometry for multiple models using a single draw call. All of the geometry therefore resides within the same vertex/index buffers. The geometry for the different models shares the same vertex format, but the vertex counts for each model can differ.
In the vertex/fragment shaders, what technique can be used to differentiate between the different models, so each can access its appropriate transforms/textures/etc.?
Are these static models? For traditional static batching:
You only need a single transform relative to the batch origin (position the individual models relative to the batch origin as part of the offline data packaging step).
You can batch your textures into a single atlas (either a single 2D image with different coordinates for each object, or a texture array with a different layer for each object).
If you do it this way you don't need to differentiate between the component models at all - they are effectively just "one large model", which has nice performance properties.
For more modern methods, you can try indirect draws with a "drawCount" greater than one, using the per-draw index to look up the settings you want. This allows variable buffer offsets and triangle counts per draw, but the rest of the pipeline state needs to be the same.
As an alternative to texture arrays, with bindless texturing you can programmatically select which texture to use in the shader at runtime. But you generally still want the selection to be at least warp-uniform to avoid a performance hit.
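As a concrete illustration of the indirect-draw route, here is a minimal GLSL vertex shader sketch, assuming OpenGL 4.6 (or ARB_shader_draw_parameters) so that gl_DrawID is available; the buffer layout and all names are illustrative, not from the question:

    // Vertex shader for one multi-draw-indirect call: gl_DrawID is the
    // sub-draw index, used to fetch that model's transform and texture layer.
    #version 460

    layout(location = 0) in vec3 inPosition;
    layout(location = 1) in vec2 inUV;

    struct PerModel {
        mat4 modelToWorld;  // transform for this model within the batch
        uint textureLayer;  // layer in a texture array (atlas alternative)
        uint pad0, pad1, pad2;
    };

    layout(std430, binding = 0) readonly buffer ModelData {
        PerModel models[];
    };

    layout(location = 0) uniform mat4 worldToClip;

    layout(location = 0) out vec2 uv;
    layout(location = 1) flat out uint texLayer;

    void main() {
        PerModel m = models[gl_DrawID];  // constant per sub-draw (warp-uniform)
        gl_Position = worldToClip * m.modelToWorld * vec4(inPosition, 1.0);
        uv = inUV;
        texLayer = m.textureLayer;
    }

The fragment shader can then sample a sampler2DArray at texLayer, which keeps the texture selection warp-uniform as noted above.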
I want to enable heterogeneous multi-GPU support in my Vulkan application. My goal is to increase the number of draw calls (and thereby visible objects) without increasing frame times. I decided it would be best to divide the geometry across multiple GPUs and let each one render its part to the GPU's local framebuffer. Each local framebuffer (including the depth buffer) would be copied to the logical device containing the presentation engine at the end of the frame.
Is there any way to "join" those framebuffers depending on the depth values? The pixel from the framebuffer containing the smallest depth value should be copied to the final output framebuffer. A version of vkCmdBlitImage taking a parameter of type VkCompareOp would be nice to have.
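Core Vulkan's vkCmdBlitImage indeed has no VkCompareOp parameter, so a merge like this is usually expressed as a small fullscreen compositing pass instead. A minimal GLSL fragment shader sketch of the idea, assuming each GPU's color and depth results have already been copied over and bound as samplers (all names and bindings are illustrative):

    // Illustrative fullscreen pass: keep the fragment with the smaller
    // depth from two per-GPU render results.
    #version 450

    layout(set = 0, binding = 0) uniform sampler2D colorA;
    layout(set = 0, binding = 1) uniform sampler2D depthA;
    layout(set = 0, binding = 2) uniform sampler2D colorB;
    layout(set = 0, binding = 3) uniform sampler2D depthB;

    layout(location = 0) in vec2 uv;
    layout(location = 0) out vec4 outColor;

    void main() {
        float dA = texture(depthA, uv).r;
        float dB = texture(depthB, uv).r;
        // Standard depth range: the smaller value is closer to the camera.
        outColor = (dA <= dB) ? texture(colorA, uv) : texture(colorB, uv);
        gl_FragDepth = min(dA, dB); // preserve depth for any later passes
    }

A compute shader writing to a storage image would work equally well for this step.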
I thought of two ways to write my OpenGL ES 2.0 code.
First, I issue many draw calls to put the elements on screen, using many VAOs and VBOs, or a single VAO with many VBOs.
Second, I save the coordinates of all the elements in one list, write all of their vertices into a single VAO and a single VBO, and draw all the vertices in one call.
Which is the better way to follow?
These are the approaches I have thought of; what other ways are there?
The VAO is meant to save you some setup calls when setting the vertex attributes pointers and enabling/disabling the pipeline states related to that setup. Having just one VAO isn't saving you anything, because you will repeatedly re-bind the vertex buffers and change some settings. So you should aim to have multiple VAOs, one per "static" rendering batch, but not necessarily one per object drawn.
As to having all vertices in single VBO or many VBOs - that really depends on the task.
Having all the data in a single VBO has no benefit if you still draw it with many calls. But there's also no point in allocating one VBO per sprite. It's always about the balance between the costs of the different calls needed to set up the pipeline, so ideally you try different approaches and decide what's best for you in your particular case.
There might be restrictions on buffer sizes, and there are definitely "reasonable" sizes preferred by specific implementations. I remember some issues with old Intel drivers where rendering a portion of the buffer would process the entire buffer, skipping the unneeded vertices.
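To make the trade-off concrete, here is a small sketch of the middle ground: one VBO holding several meshes, each drawn with its own call. All calls used exist in ES 2.0 (note that VAOs in ES 2.0 require the OES_vertex_array_object extension); meshA, meshB, and Vertex are illustrative placeholders:

    // Pack two meshes into one VBO; draw each with its own call by varying
    // the first-vertex argument. Offsets and sizes are illustrative.
    GLuint vbo;
    glGenBuffers(1, &vbo);
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glBufferData(GL_ARRAY_BUFFER, meshA.bytes + meshB.bytes, NULL, GL_STATIC_DRAW);
    glBufferSubData(GL_ARRAY_BUFFER, 0,           meshA.bytes, meshA.data);
    glBufferSubData(GL_ARRAY_BUFFER, meshA.bytes, meshB.bytes, meshB.data);

    // Both meshes share the same vertex format, so one attribute setup serves both.
    glEnableVertexAttribArray(0);
    glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, sizeof(Vertex), (void*)0);

    glDrawArrays(GL_TRIANGLES, 0,                 meshA.vertexCount);
    glDrawArrays(GL_TRIANGLES, meshA.vertexCount, meshB.vertexCount);

Whether those two glDrawArrays calls can instead be merged into one (the second option in the question) depends on whether anything else, such as a texture bind, has to change between them.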
Let me describe the scenario: we have several meshes with the same shader (material type, e.g. a PBR material), but the meshes' materials differ in the uniform buffer and textures used to render them.
For uniform buffers there is the dynamic uniform buffer technique, where an offset can be specified for each draw in the command buffer; but for images I haven't found a way of specifying the image view for a descriptor set in the command buffer. In all the sample code I have seen so far, there is a new pipeline, new descriptor sets, and so on for every mesh and every material of that mesh.
I think that is not the best way; there must be a way to have only one pipeline, descriptor set, and so on per material type, and change only the uniform buffer offset, texture image view, and sampler. Am I right?
If I'm wrong, are these samples doing it the best way?
How should I specify VkDescriptorPoolCreateInfo.maxSets (or other limits like that) for a dynamic scene where meshes are added and removed every minute?
Update:
I think it is possible to have the same pipeline and descriptor set layout for all of the objects, but the problem with VkDescriptorPoolCreateInfo.maxSets (and other limits like it) and the question of best practice still remain.
This is not a duplicate.
I was looking for a way of specifying textures similar to what dynamic uniform buffers allow (to reduce the number of descriptor sets), and along with this question there were complementary questions, mostly about finding best practices for whatever approach an answer would suggest.
You have many options.
The simplest mechanism is to divide your descriptor set layout into sets based on the frequency of changes. Things that change per-scene would be in set 0, things that change per-kind-of-object (character, static mesh, etc.) would be in set 1, and things that change per-object would be in set 2. Or whatever. The point is that the things that change with greater frequency go in higher numbered sets.
This texture is per-object, so it would be in the highest numbered set. So you would give each object its own descriptor set containing that texture, then apply that descriptor set when you go to render.
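In Vulkan GLSL terms the frequency split might look like this (set/binding numbers and block contents are illustrative):

    // Illustrative fragment shader: descriptor sets split by update frequency.
    #version 450

    // Set 0: per-scene data, bound once per frame.
    layout(set = 0, binding = 0) uniform SceneData { mat4 viewProj; };

    // Set 1: per-kind-of-object data, bound when the material type changes.
    layout(set = 1, binding = 0) uniform MaterialParams { vec4 baseColor; };

    // Set 2: per-object data, rebound for each object.
    layout(set = 2, binding = 0) uniform sampler2D objectTexture;

    layout(location = 0) in vec2 uv;
    layout(location = 0) out vec4 outColor;

    void main() {
        outColor = baseColor * texture(objectTexture, uv);
    }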
As for VkDescriptorPoolCreateInfo.maxSets, you set that to whatever you feel is appropriate for your system. And if you run out, you can always create another pool; nobody's forcing you to use just one.
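A minimal C++ sketch of that "create another pool when one fills up" strategy, with arbitrary placeholder counts:

    #include <vector>
    #include <vulkan/vulkan.h>

    // Arbitrary per-pool capacities; tune these for your scene.
    VkDescriptorPool createPool(VkDevice device) {
        VkDescriptorPoolSize sizes[] = {
            { VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER, 1024 },
            { VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC, 1024 },
        };
        VkDescriptorPoolCreateInfo info{};
        info.sType         = VK_STRUCTURE_TYPE_DESCRIPTOR_POOL_CREATE_INFO;
        info.maxSets       = 1024;
        info.poolSizeCount = 2;
        info.pPoolSizes    = sizes;
        VkDescriptorPool pool = VK_NULL_HANDLE;
        vkCreateDescriptorPool(device, &info, nullptr, &pool);
        return pool;
    }

    // Allocate from the newest pool; if it is exhausted, add another pool.
    // 'pools' must start out holding one pool from createPool().
    VkDescriptorSet allocateSet(VkDevice device,
                                std::vector<VkDescriptorPool>& pools,
                                VkDescriptorSetLayout layout) {
        VkDescriptorSetAllocateInfo alloc{};
        alloc.sType              = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO;
        alloc.descriptorPool     = pools.back();
        alloc.descriptorSetCount = 1;
        alloc.pSetLayouts        = &layout;
        VkDescriptorSet set = VK_NULL_HANDLE;
        if (vkAllocateDescriptorSets(device, &alloc, &set) != VK_SUCCESS) {
            pools.push_back(createPool(device)); // pool exhausted: grow
            alloc.descriptorPool = pools.back();
            vkAllocateDescriptorSets(device, &alloc, &set);
        }
        return set;
    }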
However, this is only one option. You can also employ array textures or arrays of textures (depending on your hardware capabilities). In either method, you have an array of different images (either as a single arrayed image view, or multiple views bound to the same arrayed descriptor). Your per-object uniform data would hold that object's texture index, so that the shader can fetch from the array texture/array of textures at that index.
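And a GLSL sketch of the arrays-of-textures variant; indexing with a value read from a uniform buffer is dynamically uniform within a draw, which the shaderSampledImageArrayDynamicIndexing feature covers (array size and names are illustrative):

    // Illustrative fragment shader: one arrayed descriptor, per-object index.
    #version 450

    layout(set = 0, binding = 0) uniform sampler2D textures[64];
    layout(set = 0, binding = 1) uniform PerObject { uint textureIndex; };

    layout(location = 0) in vec2 uv;
    layout(location = 0) out vec4 outColor;

    void main() {
        outColor = texture(textures[textureIndex], uv);
    }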
In DirectX12, you render multiple objects in different locations using the equivalent of a single uniform buffer for the world transform like:
    // Basic simplified pseudocode
    SetRootSignature();
    SetPrimitiveTopology();
    SetPipelineState();
    SetDepthStencilTarget();
    SetViewportAndScissor();

    for (auto object : objects)
    {
        SetIndexBuffer();
        SetVertexBuffer();

        struct VSConstants
        {
            QEDx12::Math::Matrix4 modelToProjection;
        } vsConstants;
        vsConstants.modelToProjection = ViewProjMat * object->GetWorldProj();

        SetDynamicConstantBufferView(0, sizeof(vsConstants), &vsConstants);
        DrawIndexed();
    }
However, in Vulkan, if you do something similar with a single uniform buffer, all the objects are rendered at the location given by the last world matrix:
    for (auto object : objects)
    {
        SetIndexBuffer();
        SetVertexBuffer();
        UploadUniformBuffer(object->GetWorldProj());
        DrawIndexed();
    }
Is there a way to draw multiple objects with a single uniform buffer in Vulkan, just like in DirectX12?
I'm aware of Sascha Willems' dynamic uniform buffer example (https://github.com/SaschaWillems/Vulkan/tree/master/dynamicuniformbuffer) where he packs many matrices into one big uniform buffer, and while useful, it is not exactly what I am looking for.
Thanks in advance for any help.
I cannot find a function called SetDynamicConstantBufferView in the D3D 12 API. I presume this is some function of your invention, but without knowing what it does, I can only really guess.
It looks like you're uploading data to the buffer object while rendering. If that's the case, well, Vulkan can't do that. And that's a good thing. Uploading to memory that you're currently reading from requires synchronization. You have to issue a barrier between the last rendering command that was reading the data you're about to overwrite, and the next rendering command. It's just not a good idea if you like performance.
But again, I'm not sure exactly what that function is doing, so my understanding may be wrong.
In Vulkan, descriptors are generally not meant to be changed in the middle of rendering a frame. However, the makers of Vulkan realized that users sometimes want to draw using different subsets of the same VkBuffer object. This is what dynamic uniform/storage buffers are for.
You technically don't have multiple uniform buffers; you just have one. But you can use the offset(s) provided to vkCmdBindDescriptorSets to shift where in that buffer the next rendering command(s) will get their data from. So it's a light-weight way to supply different rendering commands with different data.
Basically, you rebind your descriptor sets, but with different pDynamicOffset array values. To make these work, you need to plan ahead. Your pipeline layout has to explicitly declare those descriptors as being dynamic descriptors. And every time you bind the set, you'll need to provide the offset into the buffer used by that descriptor.
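A C++ sketch of the recording loop under that scheme, reusing the VSConstants struct from the question; alignUp is an assumed helper that rounds the stride up to the device's minUniformBufferOffsetAlignment, and objects is an assumed array carrying per-object index ranges:

    // One buffer, one descriptor set with a UNIFORM_BUFFER_DYNAMIC binding;
    // each draw is pointed at its own block purely via the dynamic offset.
    const uint32_t stride = alignUp(sizeof(VSConstants),
                                    limits.minUniformBufferOffsetAlignment);
    for (uint32_t i = 0; i < objectCount; ++i) {
        uint32_t dynamicOffset = i * stride;
        vkCmdBindDescriptorSets(cmd, VK_PIPELINE_BIND_POINT_GRAPHICS,
                                pipelineLayout, 0, 1, &descriptorSet,
                                1, &dynamicOffset);
        vkCmdDrawIndexed(cmd, objects[i].indexCount, 1,
                         objects[i].firstIndex, 0, 0);
    }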
That being said, it would probably be better to make your uniform buffer store larger arrays of matrices, using the dynamic offset to jump from one block of matrices to the other.
The point of that is that the uniform data you provide (depending on hardware) will remain in shader memory unless you do something to change the offset or shader. There is some small cost to uploading such data, so minimizing the need for such uploads is probably not a bad idea.
So you should upload all of your objects' buffer data in a single DMA operation, then issue a barrier, and do your rendering, using dynamic offsets to point each draw at its block of data.
You either have to use push constants or have a separate uniform buffer location for each object. These can be bound either with a descriptor per location or with a dynamic offset.
In Sascha's example you can have more than just the one matrix inside the uniform buffer.
That means that inside UploadUniformBuffer you append the new matrix to the buffer and bind the new location.
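For completeness, a sketch of the push constant variant, reusing the names from the question; object->indexCount is an assumed field, and note that the guaranteed minimum size for push constants is only 128 bytes:

    // The matrix is recorded directly into the command buffer; no uniform
    // buffer or descriptor update is needed per object.
    for (auto object : objects) {
        QEDx12::Math::Matrix4 modelToProjection =
            ViewProjMat * object->GetWorldProj();
        vkCmdPushConstants(cmd, pipelineLayout, VK_SHADER_STAGE_VERTEX_BIT,
                           0, sizeof(modelToProjection), &modelToProjection);
        vkCmdDrawIndexed(cmd, object->indexCount, 1, 0, 0, 0);
    }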
I'm creating a DirectX 11 game that renders complex meshes in 3D space. I'm using vertex/index buffers/shaders and this all works fine. However I now want to perform some basic 'overlay' rendering - more specifically, I want to render wireframe boxes in 3D space to show the bounds of a particular area. There would only ever be one or two boxes in view at any one time, and their vertices would change position each frame.
I've therefore been searching for simpler DX11 rendering methods but most articles I find still prepare a vertex/index buffer for very simple rendering. I know that hardware is well optimised for processing vertex streams, but is the overhead of building and filling a vertex buffer every frame just to process 8 vertices really the most efficient method?
My question is therefore, what is the most efficient method for performing this very simple rendering in DX11? Is there any more primitive method ("DrawLine", "DrawLineList(D3DXVECTOR3[])", ...) that would be a better solution? It could be less efficient per-vertex than the standard method of passing vertex buffers because it's only ever going to be used for a handful of vertices per frame.
Thanks in advance
Rob
You should create a single vertex/index buffer once for each primitive shape (box, sphere, ...) and use a transformation matrix to place it correctly in the world.
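A D3D11 sketch of that idea for the wireframe box: build a unit cube line list once at startup, then each frame update only a small constant buffer with the box's transform. createBuffer and BoxConstants are illustrative helpers, and the multiplication order depends on your matrix convention:

    // Unit cube centered on the origin: 8 corners, 12 edges as a line list.
    static const DirectX::XMFLOAT3 corners[8] = {
        {-.5f,-.5f,-.5f}, {+.5f,-.5f,-.5f}, {-.5f,+.5f,-.5f}, {+.5f,+.5f,-.5f},
        {-.5f,-.5f,+.5f}, {+.5f,-.5f,+.5f}, {-.5f,+.5f,+.5f}, {+.5f,+.5f,+.5f},
    };
    static const UINT16 edges[24] = { 0,1, 1,3, 3,2, 2,0,   // near face
                                      4,5, 5,7, 7,6, 6,4,   // far face
                                      0,4, 1,5, 2,6, 3,7 }; // connecting edges

    // Created once, reused every frame.
    ID3D11Buffer* vb = createBuffer(device, corners, sizeof(corners),
                                    D3D11_BIND_VERTEX_BUFFER);
    ID3D11Buffer* ib = createBuffer(device, edges, sizeof(edges),
                                    D3D11_BIND_INDEX_BUFFER);

    // Per frame: scale/translate the unit cube onto the current bounds.
    BoxConstants c;
    c.modelToProjection = boxWorld * viewProj;
    context->UpdateSubresource(constantBuffer, 0, nullptr, &c, 0, 0);
    context->IASetPrimitiveTopology(D3D11_PRIMITIVE_TOPOLOGY_LINELIST);
    UINT stride = sizeof(DirectX::XMFLOAT3), offset = 0;
    context->IASetVertexBuffers(0, 1, &vb, &stride, &offset);
    context->IASetIndexBuffer(ib, DXGI_FORMAT_R16_UINT, 0);
    context->DrawIndexed(24, 0, 0);

This also answers the efficiency concern in the question: there is no per-frame vertex buffer rebuild at all, since only the matrix changes.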