Vulkan: Using buffer aliasing to implement a defragmentation scheme

Vulkan: Using buffer aliasing to implement a defragmentation scheme - vulkan

Say I have one VkBuffer bound to every device allocation, and use appropriate combinations of vkCmdCopyBuffer to perform defragmentation of an arena block-by-block.
Say an arena may contain linear and non-linear data in any appropriately-aligned arrangement. Due to the immutability of VkImage after binding, defragmentation will involve constructing and binding new VkImages at the new locations of image data that have been moved.
None of the resources within an arena undergoing defragmentation are bound to anything, or could be considered "in-use".
This isn't difficult to implement, though I have a concern:
Is it UB to use vkCmdCopyBuffer to move an image's data (as to avoid redundant layout transitions), then construct a new VkImage at the new location?
My thoughts are that perhaps an implementation would do something strange, like rely on absolute device addresses within some internal bookkeeping structure, making it UB to treat image data as POD until bound to a new object.

Well, let's look at this systematically.
So you find a suitable destination area for your image. You do a vkCmdCopyBuffer to copy from the source area to the destination area. Now you create a new VkImage for that destination area, and the initial layout you specify is... what?
See, there are only two valid initialLayout values in VkCreateImageInfo: undefined or preinitialized. And preinitialized is only works for images that use linear tiling, since there is no well-defined layout for optimal tiled images.
So you can't use preinitialized layout. And using undefined layout means that the next image transition you use will potentially mangle whatever data is there. Now, undefined layout might work on some implementations. On implementations that don't care about layout, it might work. It also might work on implementations if the source image was in the general layout.
But none of that is guaranteed to work by the standard. As far as the standard is concerned, if you set the layout to be undefined, then the data will not be preserved. So regardless of questions of buffer/image aliasing, this can't work.
You have to create the VkImage at the destination, then use vkCmdCopyImage to copy from the source image to the destination.
It should also be noted that, even if the layout issue worked, the aliasing rules tell us that copying from non-host-accessible memory (ie: images in optimal tiling or non-general/preinitialized layout) to host-accessible memory yields undefined values. So even if the layout issue weren't a problem, the copy itself doesn't work. In theory, at least.

Related

How to Instance an object in Godot?

So I basically have some fair knowledge of Opengl 4.0. In OpenGL you can render the same object at many places. This is a technique called Instancing. This saves up some CPU calls or something.
I wanted to do this in Godot. So I looked up in the docs and it basically just tells me to duplicate an object. But I think this does not save the CPU calls to the GPU, like how Instancing does (please let me know if I'm wrong about this).
Plus I cannot have all the nodes beforehand. Because the number of times I need to render the object(at different places) is determined during runtime and can change.
Is there a solution to this?
Any help would be appreciated.
Thank you

Instancing can be thought of as making copies of an object from a blueprint. The reason it saves memory and draw calls is that essentially, only the "blueprint" must be kept in memory. The recommended way that Godot addresses this (as per the documentation) is through (packed) scenes.
To do this, create the object as it's own scene - remember that you can right click on the root node of a scene (even an empty one) and change the type to whatever you want. Once you have the object set up the way you like, save it as it's own scene (ex: myInstance.tscn).
From there, you can call upon the instance from your main scene (or whatever scene you need it in). To do this you need to do a couple of things:
First, create a variable for your instance in the script you want to call it from by declaring something like onready var instancedObject = preload("res://myInstance.tscn"). (Using whatever path you used for the scene).
From there, you call the variable from whatever function you need by writing something like: var myObject = instancedObject.instance()
You then must add the instance to the current scene with add_child(myObject)
After this, you can (optionally) specify things like transforms and rotations to determine where the instance gets put (Ex: myObject.transform.origin = Vector3(0,10,0) - For 3D, or myObject.position = Vector2(10,0) for 2D)
Alternatively, you can initialize and instance the object at the same time by writing onready var instancedObject = preload(res://myInstance.tscn).instance(), and then adding it in functions by using add_child(instancedObject), however although it requires fewer steps, there are limitations to doing it this way, and I personally have had much more success using the first approach.
If, however, you are looking to instance multiple thousands of objects (or more) in the same scene, I recommend using Calinou's answer and using a MultiMeshInstance. However, one of the limitations of the MultiMeshInstance is that it uses an all or nothing approach to drawing, meaning all instances will either be all drawn at once, or not drawn at all. There is no in-between. This could be good or bad depending on what you need it for.
Hope this helps.

Instancing in Godot is handled using the MultiMeshInstance node. It's the instanced counterpart to MeshInstance. See Optimization using MultiMeshes in the documentation for more information.
Keep in mind MultiMeshes aren't suited if you need to move the objects in different directions every frame (although you can can achieve this by using INSTANCE_ID in a shader shared among all instances). MultiMeshInstance lets you change how many instances are visible by setting its visible_instance_count property.

When to use multiple MTLRenderCommandEncoders to perform my Metal rendering?

I'm learning Metal, and there's a conceptual question that I'm trying to wrap my head around: at what level, exactly, should my code handle successive drawing operations that require different pipeline states? As I understand it (from answers like this: https://stackoverflow.com/a/43827775/2752221), I can use a single MTLRenderCommandEncoder and change its pipeline state, the vertex buffer it's using, etc., between calls to drawPrimitives:, and the encoder state that was current at the time of each call to drawPrimitives: will be preserved. So that's great. But it also seems like the design of Metal is such that one can make multiple MTLRenderCommandEncoder instances, and use them to sequentially throw batches of commands into a MTLCommandBuffer. Given that the former works – using one MTLRenderCommandEncoder and changing its state – why would one do the latter? Under what circumstances is it correct to do the former, and under what circumstances is it necessary to do the latter? What is an example of a situation where the latter would be necessary/appropriate?
If it matters, I'm working on a macOS app, using Objective-C. Thanks.

Ignoring multithreaded encoding cases, which are somewhat advanced, the main reason you'd want to create multiple render command encoders during a frame is because you need to change which textures you're rendering to.
You'll notice that you need to provide a render pass descriptor when creating a render command encoder. For this reason, we often say that the sequence of commands belonging to a particular encoder constitute a render pass. The attachments of that descriptor refer to the textures that will be written to by the commands encoded by the encoder.
Many different techniques, including shadow mapping and postprocessing effects like bloom require multiple passes to produce. Since you can't change attachments in the midst of a pass, creating a new encoder is the only way to encode multiple passes in a frame.
Relatedly, you should ordinarily use one command buffer per frame. You can, however, sometimes reduce frame time by splitting your passes across multiple command buffers, but this is highly dependent on the shape of your workload and should only be done in tandem with profiling, as it's not always an optimization.

In addition to Warren's answer, another way to look at the question is by examining the API. A number of Metal objects are created from descriptors. The properties of the descriptor at the time an object is created from it govern that object for its lifetime. Those are aspects of the object that can't be changed after creation.
By contrast, the object will have various setter methods to modify other properties over its lifetime.
For a render command encoder, the properties that are fixed for its lifetime are those specified by the MTLRenderPassDescriptor used to create it. If you want to render with different values for any of those properties, the only way to do so is to create a new encoder from a different descriptor. On the other hand, if you can do everything you need/want to do by using the encoder's setter methods, then you don't need a new encoder.

glVertexAttributePointer scope?

I'm currently programming an OpenGL ES 2.0 application on both iOS and Android platforms.
In this application I render multiple meshes that all use VBOs. In the process of optimizing the rendering, I realized that the meshes I render share two vertex formats. So I wanted to do the following:
First, setup all vertex attribute pointer offsets, and then simply bind each VBO that uses this vertex format and render it without calling the glVertexAttribPointer function again.
But it gives me strange results.
My question is: Do we have to do the calls glVertexAttribPointer each time we bind a new VBO?

First of all, like every OpenGL state, the state set with glVertexAttribPointer keeps unchanged until someone else calls glVertexAttribPointer again (for the same attribute index). But the important thing here is, that the internal state changed with glVertexAttribPointer doesn't just store the buffer offset to be used for rendering, offsetting into the VBO bound when calling glDraw.... It also stores the actual buffer object bound when calling glVertexAttribPointer.
So yes, whenever you want your vertex data sourced from another VBO, you need to bind this VBO and do the appropriate glVertexAttribPointer calls while this VBO is bound. While this may seem cumbersome in your case, this is in fact a good thing. This way you don't need to worry about the currently bound buffer when rendering something, but only about the things set up with glVertexAttribPointer. And even more important it let's bind a different VBO before rendering, thus you can source different vertex attributes from different VBOs in a single render call (how else would you do that?).
EDIT: You can however use Vertex Array Objects to ease the process of setting up your vertex data. They encapsulate all the state neccessary for rendering from a bunch of arrays (and thus all the things changed by glVertexAttribPointer, gl(En/Dis)ableVertexAttribArray and the buffer bound to GL_ELEMENT_ARRAY_BUFFER, but like said, not the buffer bound to GL_ARRAY_BUFFER). You still have to properly bind the buffer before calling glVertexAttribPointer of course. But using a VAO you only need this code in some set up routine and all you need to do for rendering is calling glBindVertexArray. Though I don't know if your particular ES device supports them.

Some cool resource I found on drawing with VAO / VBO
http://www.arcsynthesis.org/gltut/Positioning/Tutorial%2005.html#d0e4720
It shows how you can drive several objects, with several VAOs and a single VBO for example (each VAO holding pointers with different offsets to the same VBO)
Definitively worth a look, not mentioning the part where you learn that
glBindBuffer(GL_ARRAY_BUFFER...doesn't bind any data, it's just a
global pointer for the following glVertexAttribPointer calls which do the actual data binding
BUT
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER...DOES bind/save the element array into the current VAO

Best practice of copying and serializing Quartz references

I have objects containing Quartz-2D references (describing colors, fill patterns, gradients and shadows) in Cocoa. I would like to implement the NSCoding protocol in my objects and thus need to serialize those opaque Quartz-2D structures.
Possible solutions might be:
Define a set of properties in my objects that allow to setup the data structures from scratch whenever they are needed. Those can then be serialized easily. Example: Store four floats for red, green, blue, and alpha, then use CGColorCreate. Disadvantage: Duplication of information, thus potential consistency and (so far minor) space consumption issues. I would need to write property setters manually that recreate the Quartz structure whenever a component changes. That would bloat my code substantially.
Read out the properties using Quartz functions. Example: Use CGColorGetComponents for colors. Disadvantage: It seems to work for colors. But there are no equivalent functions for other structures, so I don't see how this could work for things like gradients, shadings, shadows etc.
Read out the properties directly from the raw, opaque structures. Disadvantage: As the documentation says, the structures are supposed to be opaque. So in case something changes under the hood, my code will break. (Apple certainly would not have provided a function like CGColorGetComponents if that was supposed to be done.) Furthermore, things like the CGFunctionRef inside a CGShadingRef would really be asking for trouble.
What is the best practice for serializing Quartz structures?

The answer pretty much varies from one class to the next:
CGImage: Use CGImageDestination to make a TIFF file of it. (Equivalent to NSImage's TIFFRepresentation method.)
CGPath: Write an applier function you can use to describe the path's elements as, say, PostScript code. Write a simple interpreter to go the other direction.
CGColorSpace: You can export the ICC representation.
CGColor: As you described, but don't forget to include the color space.
CGLayer: Convoluted: Make a bitmap context, draw the layer into it, and dump an image of the context, then serialize that.
CGFont: The name should be enough for most applications. If you're being really fancy (i.e., using the variations feature), you'll want to include the font's variations dictionary. You'll have to maintain your knowledge of the font size separately, since CGFont doesn't have one and CGContext doesn't let you get the one you set in it.
CGPDFDocument: From a quick look, it looks like CGPDFObjects are all immutable, so you'd just archive the original PDF data or the URL you got it from.
CGGradient, CGPattern, CGShading, and most other classes: Yup, you're screwed. You'll just need to maintain all of the information you created the object with separately.

wxVListBox with "dynamic" data

I have a stream of data that I want to place into a container. This container will either be of fixed size or dynamically constrained to a certain size at runtime. The latter may be preferable.
When the container is full, the oldest data will be removed.
I want to display this data using a wxVListBox because I need full control of the display. However there is a problem: the calls to OnDrawItem are not atomic meaning that once the container is full, each call the OnDrawItem will be accessing moving data, the result will be a non-contiguous display with missing elements.
This is certainly true with any container with native array-like indexing, are required by OnDrawItem.
I can simulate array-like indexing in a std::map using iterator indexing, if the key is sequential integer, then all the items will be ordered and the map can be pruned quite easily, but that seems like an inefficient hack.
How can I solve this? Any other ideas or containers I haven't thought of?

The best approach seems to be to manage the full container state lazily within OnDrawBackground. That way the UI itself ensures the data remains static in the subsequent calls to OnDrawItem, using a deque as the container.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas