How to stream process a flatbuffer bigger than RAM? - flatbuffers

We have the following scenario:
We are building a FlatBuffers application in an environment where only static memory allocation is allowed.
Our incoming flatbuffers contain software update images which are bigger than the entire RAM available on the processing unit.
We need to process the incoming flatbuffer partially and stream the images to another unit for storage.
The processing unit has no persistent storage, so we cannot dump the big flatbuffer to disk and use mmap().
We want to use the flatbuffers::Verifier class to check that the received flatbuffer is correct.
Our approach was to store meta information about the objects in [ObjectInfo], binary-concatenate all the image objects, and put them into a ubyte vector.
But then we have to manually track the pointer into the buffer and where we are within the objects vector, and we do not benefit from the FlatBuffers-generated code to access the objects.
Schema example for the problem:
table Container {
  // Information about the big objects; has to fit into RAM.
  metaInfo:[ObjectInfo];
  // Does not fit into RAM; has to be streamed with an arbitrarily sized buffer.
  bigObjects:[Objects];
}
table ObjectInfo {
  type:int;
  name:string;
  offset:double; // Offset within Objects
  length:double; // Length of object
}
table Objects {
  objects:[ubyte]; // Contains the objects, concatenated
}
root_type Container;
We need to partially process flatbuffers.
What could we do?

There is no elegant way to do this with FlatBuffers; it would require a lot of hacks. You can't verify a partial buffer, you can't safely follow references to other tables, etc.
If you want to stream, put each piece of data in its own FlatBuffer that individually fits in RAM, then stream a sequence of such buffers. You can use the SizePrefixed functionality to make these buffers easy to stream, and the file_identifier to recognize different kinds of buffers.
Also, don't use double for offset/length data, use an appropriately sized integer type.
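A minimal sketch of that chunked approach (not code from this answer), assuming a hypothetical per-chunk schema such as table Chunk { name:string; index:uint; payload:[ubyte]; } with root_type Chunk; and file_identifier "CHNK";. The CreateChunk, ChunkIdentifier and VerifySizePrefixedChunkBuffer names are what flatc would generate for such a schema:

    #include <cstddef>
    #include <cstdint>
    #include <vector>
    #include "flatbuffers/flatbuffers.h"
    #include "chunk_generated.h"  // generated from the hypothetical Chunk schema

    // Sender: serialize one chunk at a time, each with a size prefix.
    // (Heap allocation is used here for brevity; FlatBufferBuilder can be given a
    // custom allocator backed by a statically allocated arena.)
    std::vector<uint8_t> BuildChunk(const char *name, uint32_t index,
                                    const uint8_t *data, size_t len) {
      flatbuffers::FlatBufferBuilder fbb;
      auto chunk = CreateChunk(fbb, fbb.CreateString(name), index,
                               fbb.CreateVector(data, len));
      fbb.FinishSizePrefixed(chunk, ChunkIdentifier());
      return {fbb.GetBufferPointer(), fbb.GetBufferPointer() + fbb.GetSize()};
    }

    // Receiver: each chunk fits in RAM on its own, so it can be verified normally.
    const Chunk *ParseChunk(const uint8_t *buf, size_t size) {
      flatbuffers::Verifier verifier(buf, size);
      if (!VerifySizePrefixedChunkBuffer(verifier)) return nullptr;
      return flatbuffers::GetSizePrefixedRoot<Chunk>(buf);
    }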

Related

Why can't I store un-serialized data structure on disk the same way I can store them in memory?

Firstly, I am assuming that data structures, like a hash-map for example, can only be stored in-memory but not on disk unless they are serialized. I want to understand why not?
What is holding us back from dumping the block of memory that stores the data structure directly onto disk, without any modifications?
Something like JSON could be thought of as a "serialized" Python dictionary. We can very well store JSON in files, so why not a dict?
You may ask how you would represent non-string values like booleans or objects on disk. I could argue: "the same way you store them in memory". Am I missing something here?
Naming a few problems:
Big-endian vs. little-endian byte order makes reading data from disk depend on the CPU architecture, so if you just dumped the bytes, you might not be able to read them again on a different device.
Items are not contiguous in memory: a list (or dictionary), for example, only contains pointers to things that exist "somewhere" in memory. You can only dump contiguous memory; otherwise you are only storing the memory locations the data happened to occupy, which won't be the same when you load the program again (see the sketch after this list).
The way structures are laid out in memory can change between two compiled versions of the same program, so if you just recompile your application you may get different layouts for structures in memory, and your data is lost.
Different versions of the same application may wish to change the shape of the structures to allow extra functionality; this won't be possible if the data shape on disk is the same as in memory (which is one of the reasons you shouldn't be using pickle for portable data storage, despite it using a memory serializer).
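To make the pointer problem concrete, here is a small, purely illustrative C++ example of why dumping a container's bytes stores pointers rather than data:

    #include <cstdio>
    #include <vector>

    int main() {
      std::vector<int> v = {1, 2, 3, 4};

      // The vector object itself is only a handful of pointers into the heap.
      // Dumping sizeof(v) bytes writes those pointer values, not the elements.
      std::FILE *f = std::fopen("dump.bin", "wb");
      std::fwrite(&v, sizeof(v), 1, f);
      std::fclose(f);

      // A program reading dump.bin back gets addresses into an address space
      // that no longer exists; following them is undefined behavior. This is
      // exactly what serialization (JSON, pickle, FlatBuffers, ...) avoids.
      return 0;
    }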

Is it possible to accidentally overwrite "meta-data" in VkDeviceMemory?

From the answer to this reddit post:
...drivers may need some additional meta-data to manage given resource. That's why we need to call vkGet...MemoryRequirements() functions and allocate enough memory.
However, according to the above answer, vkMapMemory maps the allocated memory, inside which the meta-data required by the driver may be contained.
When I write to that mapped memory, how am I supposed to know whether I am overwriting the meta-data?
You should ignore that part of the post. It's nonsense.
Implementations may in fact need metadata for VkBuffer and VkImage objects. But that metadata is not stored within the VkDeviceMemory you provide when you bind those objects to memory. It is associated with the VkBuffer and VkImage objects themselves; it is allocated when those objects are created (which is why creating them takes a VkAllocationCallbacks) and deallocated when they are destroyed.
The purpose of querying the memory requirements for buffers and images is not metadata associated with the buffer/image per se. It's for layout purposes. Some buffer usages have to have specific alignments, and sizes may need to be rounded up to that alignment. Optimally-tiled images have a layout that is completely opaque; you're not allowed to directly write to the bytes of a tiled image. You can only copy into/out of them with image copy commands.
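A minimal sketch of what the requirements query is actually for (size and alignment for layout), assuming device and buffer are valid handles created elsewhere; ChooseMemoryType is a hypothetical helper that scans memReqs.memoryTypeBits:

    VkMemoryRequirements memReqs{};
    vkGetBufferMemoryRequirements(device, buffer, &memReqs);

    VkMemoryAllocateInfo allocInfo{};
    allocInfo.sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO;
    allocInfo.allocationSize = memReqs.size;  // size the driver needs, already rounded up
    allocInfo.memoryTypeIndex = ChooseMemoryType(memReqs.memoryTypeBits);

    VkDeviceMemory memory = VK_NULL_HANDLE;
    vkAllocateMemory(device, &allocInfo, nullptr, &memory);

    // Any offset you bind at must respect memReqs.alignment; none of this
    // allocation is reserved for driver metadata about the buffer itself.
    vkBindBufferMemory(device, buffer, memory, /*memoryOffset=*/0);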

Why are Resource Descriptors needed?

When a VkPipeline accesses a vertex buffer:
the VkPipeline needs a vertex buffer;
vkCmdBindVertexBuffers() finds that VkBuffer object.
When a VkPipeline accesses a uniform buffer:
the VkPipeline needs a uniform buffer;
vkCmdBindDescriptorSets() finds a VkDescriptorSet;
the VkDescriptorSet finds the corresponding VkDescriptorBufferInfo;
the VkDescriptorBufferInfo finds that VkBuffer object.
My question is: why can't there be a hypothetical function called vkCmdBindUniformBuffer(), just like vkCmdBindVertexBuffers()?
Why are descriptors needed?
Edit: I asked this because I first thought descriptors are like pointers in C/C++. However, I don't spend hundreds of lines just to create a pointer in C/C++. Descriptors feel like an over-complication of something that could be as easy as calling a vkCmdBind...() function.
What does "needed" mean in this context? Vulkan defines an abstraction of what's going on in actual hardware. Nothing is strictly "needed"; there are merely different consequences of different abstractions.
One consequence of the descriptor set abstraction is that descriptor set layouts tell both the pipeline building and the descriptor set binding code what to expect in so far as the mapping between any particular Vulkan resource and the underlying hardware resources. The pipeline layout defines a direct mapping.
The pipeline layout represents a mapping from Vulkan set/binding indices to internal resource indices. The internal hardware has different kinds of resources. So which internal resources a particular binding takes depends on the kind of binding. Sampled images take up a set of resources that is separate from SSBOs, for example. The hardware potentially has X sampled image indices and Y indices for storage buffers.
However, some hardware doesn't have different internal resource types for certain Vulkan constructs. For example, some hardware doesn't have a "uniform buffer" as a distinct construct. The implementation implements UBOs as a read-only storage buffer. But this means that any Vulkan UBO also takes up the same resource indices from an SSBO.
As such, we need a way to map from Vulkan resources to the internal resource lists, one which allows the implementation to hide details like this. This is what the pipeline layout is for: the layout defines a mapping from each descriptor in the layout to a particular internal resource.
If set 0 assigns a storage image to internal index 0, the system knows that if set 1 uses a UBO, that UBO must use internal index 1, since internal index 0 was already taken by set 0. Because in this hardware, UBOs and SSBOs use the same list of resources.
This can't be done without pipeline layouts or some similar system. You need something which tells the system what all of the resources are, so that it can build a mapping table.
Descriptors are grouped into sets to make it easier to change a large number of resources at once. If you want to switch to a different set of 8 textures, your way would require 8 different function calls. Furthermore, because you can have different kinds of descriptors bundled into the same set, you can change 3 textures, 2 UBOs, and one SSBO all in a single bind call.
Set groupings also allow you to have pipeline layouts that are partially compatible. Two pipelines can have the same set 0 but different set 1s. This means when you switch pipelines, you can bind a different descriptor for set 1 without changing set 0's binding. This is useful for descriptors whose update frequency is different.
For example, every object in a scene may use the same perspective and camera matrices, but different world matrices and textures. You can put the former into set 0 and the latter into set 1.
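A hedged sketch of that grouping: set 0 holds the per-frame camera/projection UBO, set 1 holds per-object data; device is assumed valid and error handling is omitted:

    // Set 0: one UBO with view/projection matrices, updated once per frame.
    VkDescriptorSetLayoutBinding perFrame =
        {0, VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER, 1, VK_SHADER_STAGE_VERTEX_BIT, nullptr};

    // Set 1: per-object world-matrix UBO plus a texture.
    VkDescriptorSetLayoutBinding perObject[2] = {
        {0, VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER, 1, VK_SHADER_STAGE_VERTEX_BIT, nullptr},
        {1, VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER, 1, VK_SHADER_STAGE_FRAGMENT_BIT, nullptr},
    };

    VkDescriptorSetLayout setLayouts[2];
    VkDescriptorSetLayoutCreateInfo layoutInfo{VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO};
    layoutInfo.bindingCount = 1;
    layoutInfo.pBindings = &perFrame;
    vkCreateDescriptorSetLayout(device, &layoutInfo, nullptr, &setLayouts[0]);
    layoutInfo.bindingCount = 2;
    layoutInfo.pBindings = perObject;
    vkCreateDescriptorSetLayout(device, &layoutInfo, nullptr, &setLayouts[1]);

    // The pipeline layout is the mapping table described above: it tells the
    // implementation every resource the pipeline will see, per set and binding.
    VkPipelineLayoutCreateInfo plInfo{VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO};
    plInfo.setLayoutCount = 2;
    plInfo.pSetLayouts = setLayouts;
    VkPipelineLayout pipelineLayout = VK_NULL_HANDLE;
    vkCreatePipelineLayout(device, &plInfo, nullptr, &pipelineLayout);

Two pipelines that share the set 0 layout can then keep the per-frame set bound while only the per-object set changes between draws.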

Vulkan descriptor binding

In my Vulkan application I used to draw meshes like this when all the meshes used the same texture:
UpdateDescriptorSets(texture)
Record command buffer
{
    for each mesh
        Bind transform UBO
        Draw mesh
}
But now I want each mesh to have a unique texture, so I tried this:
Record command buffer
{
    for each mesh
        Bind transform UBO
        UpdateDescriptorSets(textures[meshIndex])
        Draw mesh
}
But it gives an error saying the descriptor set has been destroyed or updated. I looked in the Vulkan documentation and found that I can't update a descriptor set during command buffer recording. So how can I give each mesh its own texture?
vkUpdateDescriptorSets is not synchronized with anything. Therefore, you cannot update a descriptor set while it is in use. You must ensure that all rendering operations that use the descriptor set in question have finished, and that no commands have been placed in command buffers that use the set in question.
It's basically like a global variable; you can't have people accessing a global variable from numerous threads without some kind of synchronization. And Vulkan doesn't synchronize access to descriptor sets.
There are several ways to deal with this. You can give each object its own descriptor set. This is usually done by having the frequently changing descriptor set data be of a higher index than the less frequently changing data. That way, you're not changing every descriptor for each object, only the ones that change on a per-object basis.
You can use push constant data to index into large tables/array textures. So the descriptor set would have an array texture or an array of textures (if you have dynamic indexing for arrays of textures). A push constant would provide an index, which is used by the shader to fetch that particular object's texture from the array texture/array of textures. This makes frequent changes fairly cheap, and the same index can also be used to give each object its own transformation matrices (by fetching into an array of matrices).
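A rough sketch of that push-constant indexing approach (the shader snippet and names are illustrative, not from the question; dynamically indexing an array of sampled images also requires the corresponding device feature):

    // GLSL fragment shader (assumed):
    //   layout(push_constant) uniform Push { uint objectIndex; } pc;
    //   layout(set = 0, binding = 1) uniform sampler2D textures[64];
    //   color = texture(textures[pc.objectIndex], uv);

    // C++ side, while recording the command buffer:
    for (uint32_t i = 0; i < meshCount; ++i) {
      vkCmdPushConstants(cmd, pipelineLayout, VK_SHADER_STAGE_FRAGMENT_BIT,
                         /*offset=*/0, sizeof(uint32_t), &i);
      vkCmdDrawIndexed(cmd, meshes[i].indexCount, 1, 0, 0, 0);
    }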
If you have the extension VK_KHR_push_descriptor available, then you can integrate changes to descriptors directly into the command buffer. How much better this is than the push constant mechanism is of course implementation-dependent.
If you update a descriptor set then all command buffers that this descriptor set is bound to will become invalid. Invalid command buffers cannot be submitted or be executed by the GPU.
What you basically need to do is to update descriptor sets before you bind them.
This odd behavior exists because in vkCmdBindDescriptorSets some implementations take the Vulkan descriptor set, translate it into native descriptor tables, and then store those in the command buffer. So if you update the descriptor set after vkCmdBindDescriptorSets, the command buffer will be seeing stale data. The VK_EXT_descriptor_indexing extension relaxes this behavior under some circumstances.
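For completeness, a sketch of that relaxation: with descriptor indexing (VK_EXT_descriptor_indexing, core in Vulkan 1.2) a binding can be flagged as update-after-bind, so writing it after recording no longer invalidates the command buffer. The spellings below are the core 1.2 names and availability must be checked first; samplerBinding is a VkDescriptorSetLayoutBinding defined elsewhere:

    VkDescriptorBindingFlags bindingFlags = VK_DESCRIPTOR_BINDING_UPDATE_AFTER_BIND_BIT;

    VkDescriptorSetLayoutBindingFlagsCreateInfo flagsInfo{
        VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_BINDING_FLAGS_CREATE_INFO};
    flagsInfo.bindingCount = 1;
    flagsInfo.pBindingFlags = &bindingFlags;

    VkDescriptorSetLayoutCreateInfo layoutInfo{VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO};
    layoutInfo.pNext = &flagsInfo;
    layoutInfo.flags = VK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT;
    layoutInfo.bindingCount = 1;
    layoutInfo.pBindings = &samplerBinding;
    // The descriptor pool that the set comes from must also be created with
    // VK_DESCRIPTOR_POOL_CREATE_UPDATE_AFTER_BIND_BIT.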

Vulkan: Is there a way to draw multiple objects in different locations like in DirectX12?

In DirectX12, you render multiple objects in different locations using the equivalent of a single uniform buffer for the world transform like:
// Basic simplified pseudocode
SetRootSignature();
SetPrimitiveTopology();
SetPipelineState();
SetDepthStencilTarget();
SetViewportAndScissor();
for (auto object : objects)
{
    SetIndexBuffer();
    SetVertexBuffer();
    struct VSConstants
    {
        QEDx12::Math::Matrix4 modelToProjection;
    } vsConstants;
    vsConstants.modelToProjection = ViewProjMat * object->GetWorldProj();
    SetDynamicConstantBufferView(0, sizeof(vsConstants), &vsConstants);
    DrawIndexed();
}
However, in Vulkan, if you do something similar with a single uniform buffer, all the objects are rendered at the location of the last world matrix:
for (auto object : objects)
{
    SetIndexBuffer();
    SetVertexBuffer();
    UploadUniformBuffer(object->GetWorldProj());
    DrawIndexed();
}
Is there a way to draw multiple objects with a single uniform buffer in Vulkan, just like in DirectX12?
I'm aware of Sascha Willems' dynamic uniform buffer example (https://github.com/SaschaWillems/Vulkan/tree/master/dynamicuniformbuffer), which packs many matrices into one big uniform buffer; while useful, it is not exactly what I am looking for.
Thanks in advance for any help.
I cannot find a function called SetDynamicConstantBufferView in the D3D 12 API. I presume this is some function of your invention, but without knowing what it does, I can only really guess.
It looks like you're uploading data to the buffer object while rendering. If that's the case, well, Vulkan can't do that. And that's a good thing. Uploading to memory that you're currently reading from requires synchronization. You have to issue a barrier between the last rendering command that was reading the data you're about to overwrite, and the next rendering command. It's just not a good idea if you like performance.
But again, I'm not sure exactly what that function is doing, so my understanding may be wrong.
In Vulkan, descriptors are generally not meant to be changed in the middle of rendering a frame. However, the makers of Vulkan realized that users sometimes want to draw using different subsets of the same VkBuffer object. This is what dynamic uniform/storage buffers are for.
You technically don't have multiple uniform buffers; you just have one. But you can use the offset(s) provided to vkCmdBindDescriptorSets to shift where in that buffer the next rendering command(s) will get their data from. So it's a light-weight way to supply different rendering commands with different data.
Basically, you rebind your descriptor sets, but with different pDynamicOffset array values. To make these work, you need to plan ahead. Your pipeline layout has to explicitly declare those descriptors as being dynamic descriptors. And every time you bind the set, you'll need to provide the offset into the buffer used by that descriptor.
That being said, it would probably be better to make your uniform buffer store larger arrays of matrices, using the dynamic offset to jump from one block of matrices to the other.
The point of that is that the uniform data you provide (depending on hardware) will remain in shader memory unless you do something to change the offset or shader. There is some small cost to uploading such data, so minimizing the need for such uploads is probably not a bad idea.
So you should upload all of your objects' buffer data in a single DMA operation, then issue a barrier, and then do your rendering, using dynamic offsets to point each rendering command at its block of data.
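A hedged sketch of that dynamic-offset approach, assuming binding 0 of descriptorSet was created as VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC and all per-object matrices were uploaded into one VkBuffer before recording, each block padded to the device's minUniformBufferOffsetAlignment:

    VkDeviceSize alignment = 256;  // query VkPhysicalDeviceLimits::minUniformBufferOffsetAlignment
    for (uint32_t i = 0; i < objectCount; ++i) {
      uint32_t dynamicOffset = static_cast<uint32_t>(i * alignment);
      vkCmdBindDescriptorSets(cmd, VK_PIPELINE_BIND_POINT_GRAPHICS, pipelineLayout,
                              /*firstSet=*/0, /*descriptorSetCount=*/1, &descriptorSet,
                              /*dynamicOffsetCount=*/1, &dynamicOffset);
      vkCmdDrawIndexed(cmd, objects[i].indexCount, 1, 0, 0, 0);
    }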
You either have to use push constants or have separate uniform buffer regions for each location. These can be bound either with one descriptor per location or with a dynamic offset.
In Sascha's example you can have more than just the one matrix inside the uniform buffer.
That means that inside UploadUniformBuffer you append the new matrix to the buffer and bind the new location.