In Vulkan can you have a depth buffer for each colour attachment? - vulkan

In Vulkan, if you want to write to a colour buffer with depth testing, you create one framebuffer attachment for the colour and one for the depth buffer. Then, when you create the subpass description, you make the depth and stencil attachment pointer point to your depth buffer, but it seems there's only one pointer:
VkSubpassDescription subpass_description = {};
subpass_description.pipelineBindPoint = VK_PIPELINE_BIND_POINT_GRAPHICS;
subpass_description.colorAttachmentCount = uint32_t(vk_attachment_references.size());
subpass_description.pColorAttachments = vk_attachment_references.data();
subpass_description.pDepthStencilAttachment = &vk_depth_attachment_reference; // only one pointer?
Although the subpass accepts multiple colour attachments it seems as though there's only one pDepthStencilAttachment pointer. Does Vulkan only allow one depth and stencil buffer when writing to multiple colour attachments?

Does Vulkan only allow one depth and stencil buffer when writing to multiple colour attachments?
Yes. A subpass description has a single pDepthStencilAttachment pointer, so a subpass has at most one depth/stencil attachment, shared by all of its colour attachments.
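For reference, a sketch of a subpass description with two colour attachments sharing the single depth/stencil attachment (the attachment indices and variable names here are hypothetical):

```c
/* Hypothetical attachment indices: 0 and 1 are colour, 2 is depth/stencil. */
VkAttachmentReference color_refs[2] = {
    { 0, VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL },
    { 1, VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL },
};
VkAttachmentReference depth_ref = {
    2, VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL
};

VkSubpassDescription subpass = {0};
subpass.pipelineBindPoint       = VK_PIPELINE_BIND_POINT_GRAPHICS;
subpass.colorAttachmentCount    = 2;
subpass.pColorAttachments       = color_refs;
/* One pointer, one VkAttachmentReference: shared by both colour outputs. */
subpass.pDepthStencilAttachment = &depth_ref;
```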

Related

Vertex buffer with vertices of different formats

I want to draw a model that's composed of multiple meshes, where each mesh has different vertex formats. Is it possible to put all the various vertices within the same vertex buffer, and to point to the correct offset at vkCmdBindVertexBuffers time?
Or must all vertices within a buffer have the same format, thus necessitating multiple vbufs for such a model?
Looking at the manual for vkCmdBindVertexBuffers, it's not clear whether the offset is in bytes or in vertex-strides.
https://www.khronos.org/registry/vulkan/specs/1.2-extensions/man/html/vkCmdBindVertexBuffers.html
Your question really breaks down into three questions:
Does the pOffsets parameter for vkCmdBindVertexBuffers accept bytes or vertex strides?
Can I put more than one vertex format into a vertex buffer?
Should I put more than one vertex format into a vertex buffer?
The short version is
Bytes
Yes
Probably not
Does the pOffsets parameter for vkCmdBindVertexBuffers accept bytes or vertex strides?
The function signature is
void vkCmdBindVertexBuffers(
VkCommandBuffer commandBuffer,
uint32_t firstBinding,
uint32_t bindingCount,
const VkBuffer* pBuffers,
const VkDeviceSize* pOffsets);
Note the VkDeviceSize type for pOffsets. That unambiguously means "bytes", not strides: a VkDeviceSize is always an offset or size in raw memory. Vertex strides aren't raw memory, they're simply a count, so their type would be a uint32_t or uint64_t.
Furthermore there's nothing in that function signature that specifies the vertex format so there would be no way to convert the vertex stride count to actual memory sizes. Remember that unlike OpenGL, Vulkan is not a state machine, so this function doesn't have any "memory" of a rendering pipeline that you might have previously bound.
Can I put more than one vertex format into a vertex buffer?
As a consequence of the above answer, yes. You can put pretty much whatever you want into a vertex buffer, although I believe some hardware will have alignment restrictions on what are valid offsets for vertex buffers, so make sure you check that.
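As an illustration, binding two differently-formatted meshes out of the same buffer is just two binds at different byte offsets. This is a sketch; the buffer layout, strides, and names are hypothetical:

```c
/* Hypothetical layout: mesh A (position only, 12-byte stride) occupies the
   first mesh_a_bytes of the buffer; mesh B (position + normal + uv,
   32-byte stride) starts right after it. */
VkDeviceSize offset_a = 0;
VkDeviceSize offset_b = mesh_a_bytes; /* a raw byte offset, not a vertex index */

vkCmdBindVertexBuffers(cmd, 0, 1, &vertex_buffer, &offset_a);
/* ... bind the pipeline whose vertex input matches format A, draw mesh A ... */

vkCmdBindVertexBuffers(cmd, 0, 1, &vertex_buffer, &offset_b);
/* ... bind the pipeline whose vertex input matches format B, draw mesh B ... */
```

Note that offset_b may need to respect the hardware alignment restrictions mentioned above.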
Should I put more than one vertex format into a vertex buffer?
Generally speaking you want to render your scene in as few draw calls as possible, and having lots of arbitrary vertex formats runs counter to that. I would argue that if possible, the only time you want to change vertex formats is when you're switching to a different rendering pass, such as when switching between rendering opaque items to rendering transparent ones.
Instead you should try to make format normalization part of your asset pipeline, taking your source assets and converting them to a single consistent format. If that's not possible, then you could consider doing the normalization at load time. This adds complexity to the loading code, but should drastically reduce the complexity of the rendering code, since you now only have to think in terms of a single vertex format.

Metal fragment shader appears to write twice for a single triangle?

I'm trying to debug my Metal shader. Using GPU frame capture, I can see that the framebuffer values are twice what I would expect.
For example, I am using the ADD blend operation, with source and destination multipliers of 1. After I draw a single triangle, the buffer value is twice what I output from the fragment shader. That indicates to me that two writes have happened, and they added together. But why would two writes take place when I draw a single triangle? Anyone know?
If I change the blend operation -- for example I set the destination multiplier to 0 and the source to 1, so that successive writes don't add together -- then I see the value I expect in the buffer. That is, I see the same value output from the shader.
I'm just wondering if there's something simple I am missing that would cause two writes to the same pixel from a fragment shader when drawing a single triangle?

How to process a 24-bit 3 channel color image with SSE2/SSE3/SSE4?

I just started using SSE2 to optimize image processing, but I have no idea how to handle 3-channel, 24-bit color images.
My pixel data is arranged BGR BGR BGR ..., as unsigned 8-bit chars. If I want to implement Color2Gray with SSE2/SSE3/SSE4 intrinsics in C/C++, how would I do it? Do I need to align my pixel data (4/8/16 bytes)?
I have read this article: http://supercomputingblog.com/windows/image-processing-with-sse/
But it covers ARGB 4-channel 32-bit color, which processes exactly 4 pixels at a time.
Thanks!
// Assume the original pixels:
unsigned char* pDataColor = (unsigned char*)malloc(src.width * src.height * 3); // 3 channels
// ... initialize every pixel of pDataColor ...
// The destination pixels:
unsigned char* pDataGray = (unsigned char*)malloc(src.width * src.height * 1); // 1 channel
// RGB->Gray: Y = 0.212671*R + 0.715160*G + 0.072169*B
I have slides on de-interleaving of 24-bit RGB pixels, which explain how to do it with SSE2 and SSSE3.
Here are some answers to your questions:
For how to use SSE2 intrinsics in C/C++, these references may be helpful:
Optimization of Image Processing Algorithms: A Case Study
Speeding up some SSE2 Intrinsics for color conversion
SSE intrinsic functions reference
For the alignment: yes, 16-byte alignment is needed for the aligned load/store intrinsics such as _mm_load_si128 (the unaligned variants like _mm_loadu_si128 work on any address, but were slower on older hardware). If you're using MSVC, you can request alignment with __declspec(align(16)); with GCC it is __attribute__((aligned(16))).
The reason why align is necessary can be found here: Why does instruction/data alignment exist?
For the 3-channel RGB conversion, I am not an image-processing expert, so I can't give specific advice. There are also open-source image-processing libraries that may already contain the code you want.
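One way to do the de-interleaving those slides describe is with SSSE3's _mm_shuffle_epi8, which can gather the B, G, and R bytes of 16 BGR pixels into separate vectors. The sketch below is my own illustration, not the slides' code: the fixed-point weights are a rounding of the Rec. 709 coefficients from the question, and the GCC/Clang target attribute is used so no special compile flags are needed. Because it uses unaligned loads, the pixel data does not need to be 16-byte aligned.

```c
#include <stddef.h>
#include <stdint.h>
#include <tmmintrin.h> /* SSSE3, for _mm_shuffle_epi8 */

/* Fixed-point Rec. 709 weights, scaled by 256 and rounded to sum to 256:
   Y ~= (54*R + 183*G + 19*B + 128) >> 8 */
static uint8_t gray_scalar(uint8_t b, uint8_t g, uint8_t r)
{
    return (uint8_t)((19u * b + 183u * g + 54u * r + 128u) >> 8);
}

__attribute__((target("ssse3")))
static void bgr_to_gray(const uint8_t *bgr, uint8_t *gray, size_t npix)
{
    /* Masks that collect the B, G, and R bytes of 16 BGR pixels (48 bytes,
       loaded as three 16-byte vectors) into one vector each; -1 lanes
       shuffle in zero, and the three results are OR-ed together. */
    const __m128i b0 = _mm_setr_epi8(0,3,6,9,12,15,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1);
    const __m128i b1 = _mm_setr_epi8(-1,-1,-1,-1,-1,-1,2,5,8,11,14,-1,-1,-1,-1,-1);
    const __m128i b2 = _mm_setr_epi8(-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,1,4,7,10,13);
    const __m128i g0 = _mm_setr_epi8(1,4,7,10,13,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1);
    const __m128i g1 = _mm_setr_epi8(-1,-1,-1,-1,-1,0,3,6,9,12,15,-1,-1,-1,-1,-1);
    const __m128i g2 = _mm_setr_epi8(-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,2,5,8,11,14);
    const __m128i r0 = _mm_setr_epi8(2,5,8,11,14,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1);
    const __m128i r1 = _mm_setr_epi8(-1,-1,-1,-1,-1,1,4,7,10,13,-1,-1,-1,-1,-1,-1);
    const __m128i r2 = _mm_setr_epi8(-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,0,3,6,9,12,15);
    const __m128i zero = _mm_setzero_si128();
    const __m128i half = _mm_set1_epi16(128);

    size_t i = 0;
    for (; i + 16 <= npix; i += 16) {
        __m128i v0 = _mm_loadu_si128((const __m128i *)(bgr + 3 * i));
        __m128i v1 = _mm_loadu_si128((const __m128i *)(bgr + 3 * i + 16));
        __m128i v2 = _mm_loadu_si128((const __m128i *)(bgr + 3 * i + 32));
        __m128i b = _mm_or_si128(_mm_or_si128(_mm_shuffle_epi8(v0, b0),
                                              _mm_shuffle_epi8(v1, b1)),
                                 _mm_shuffle_epi8(v2, b2));
        __m128i g = _mm_or_si128(_mm_or_si128(_mm_shuffle_epi8(v0, g0),
                                              _mm_shuffle_epi8(v1, g1)),
                                 _mm_shuffle_epi8(v2, g2));
        __m128i r = _mm_or_si128(_mm_or_si128(_mm_shuffle_epi8(v0, r0),
                                              _mm_shuffle_epi8(v1, r1)),
                                 _mm_shuffle_epi8(v2, r2));
        /* Widen to 16 bits, apply the weighted sum, shift back, repack.
           The worst-case sum 256*255 + 128 still fits in 16 bits. */
        __m128i lo = _mm_add_epi16(half,
            _mm_add_epi16(_mm_mullo_epi16(_mm_unpacklo_epi8(b, zero), _mm_set1_epi16(19)),
            _mm_add_epi16(_mm_mullo_epi16(_mm_unpacklo_epi8(g, zero), _mm_set1_epi16(183)),
                          _mm_mullo_epi16(_mm_unpacklo_epi8(r, zero), _mm_set1_epi16(54)))));
        __m128i hi = _mm_add_epi16(half,
            _mm_add_epi16(_mm_mullo_epi16(_mm_unpackhi_epi8(b, zero), _mm_set1_epi16(19)),
            _mm_add_epi16(_mm_mullo_epi16(_mm_unpackhi_epi8(g, zero), _mm_set1_epi16(183)),
                          _mm_mullo_epi16(_mm_unpackhi_epi8(r, zero), _mm_set1_epi16(54)))));
        __m128i y = _mm_packus_epi16(_mm_srli_epi16(lo, 8), _mm_srli_epi16(hi, 8));
        _mm_storeu_si128((__m128i *)(gray + i), y);
    }
    for (; i < npix; i++) /* scalar tail for the last few pixels */
        gray[i] = gray_scalar(bgr[3*i], bgr[3*i+1], bgr[3*i+2]);
}
```

The SIMD loop and the scalar tail use the same weights and rounding, so the output is bit-identical regardless of where the vector loop stops.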

create a new image with only masked part (without transparent area) with new size

I have a mask and an image; the mask is applied to the image to extract a portion of it.
The problem is that when I apply the mask to the image, the resulting image is the same size as the original, even though the unmasked part is transparent. What I need is an image that contains only the masked part of the original, without the transparent area, so that the resulting image is smaller and contains only the masked portion.
Thanks
You can:
Draw the image to a new CGBitmapContext at actual size, providing a buffer for the bitmap. CGBitmapContextCreate
Read alpha values from the bitmap to determine the transparent boundaries. You will have to determine how to read this based on the pixel data you have specified.
Create a new CGBitmapContext providing the external buffer, using some variation or combination of: a) a pixel offset, b) offset bytes per row, or c) manually move the bitmap's data (in place to reduce memory usage, if possible). CGBitmapContextCreate
Create a CGImage from the second bitmap context. CGBitmapContextCreateImage
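Step 2 above is plain buffer arithmetic. A minimal sketch of the boundary scan, assuming 32-bit RGBA pixels with alpha in the last byte of each pixel (one common CGBitmapContext configuration; the function name is mine):

```c
#include <stddef.h>
#include <stdint.h>

/* Scan a 32-bit RGBA bitmap for the bounding box of all pixels with
   non-zero alpha. Returns 0 if the image is fully transparent, 1 otherwise;
   on success, [*minX..*maxX] x [*minY..*maxY] is the opaque region. */
int alpha_bounding_box(const uint8_t *rgba, size_t width, size_t height,
                       size_t *minX, size_t *minY, size_t *maxX, size_t *maxY)
{
    int found = 0;
    for (size_t y = 0; y < height; y++) {
        for (size_t x = 0; x < width; x++) {
            uint8_t a = rgba[(y * width + x) * 4 + 3]; /* alpha byte */
            if (a == 0) continue;
            if (!found) {
                *minX = *maxX = x;
                *minY = *maxY = y;
                found = 1;
            } else {
                if (x < *minX) *minX = x;
                if (x > *maxX) *maxX = x;
                if (y < *minY) *minY = y;
                if (y > *maxY) *maxY = y;
            }
        }
    }
    return found;
}
```

The resulting box gives you the pixel offset and the width/height for the second CGBitmapContext in step 3. If your context uses a different pixel layout (e.g. alpha-first, or a row stride larger than width * 4), adjust the indexing accordingly.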

Comparing Kinect depth to OpenGL depth efficiently

Background:
This problem relates to 3D object tracking.
My system projects object samples from known parameters (X, Y, Z) with OpenGL and
tries to match them against the image and depth information obtained from a Kinect sensor to infer the object's 3D position.
Problem:
Kinect depth->process-> value in millimeters
OpenGL->depth buffer-> value between 0-1 (which is nonlinearly mapped between near and far)
Though I could recover the Z value from OpenGL using the method described at http://www.songho.ca/opengl/gl_projectionmatrix.html, this yields very slow performance.
I am sure this is a common problem, so I hope some clever solution exists.
Question:
What is an efficient way to recover the eye-space Z coordinate from OpenGL?
Or is there another way around the above problem?
Now my problem is Kinect depth is in mm
No, it is not. The Kinect reports its depth as a value in an 11-bit range of arbitrary units. Only after calibration has been applied can the depth value be interpreted as a physical unit. You're right, though, that OpenGL perspective-projection depth values are nonlinear.
So if I understand you correctly, you want to emulate a Kinect by retrieving the contents of the depth buffer, right? Then the easiest solution would be a combination of vertex and fragment shader, in which the vertex shader passes the linear depth to the fragment shader as an additional varying, and the fragment shader then overwrites the fragment's depth value with the passed value. (You could also use an additional render target for this.)
Another method would be a 1D texture, projected into the depth range of the scene, where the texture values encode the depth; the desired value would then be in the color buffer.
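Alternatively, the nonlinear window-space depth can be mapped back to an eye-space distance in closed form, which is one division per sample on the CPU. A minimal sketch, assuming a standard OpenGL perspective projection and the default [0, 1] depth range (the function name and the near/far parameters n and f are mine):

```c
/* Map a window-space depth-buffer sample d in [0, 1] back to a positive
   eye-space distance, assuming a standard OpenGL perspective projection
   with near plane n and far plane f. Derived by inverting the projection's
   depth mapping: d = 0 maps to n, d = 1 maps to f, nonlinearly in between. */
double depth_to_eye_distance(double d, double n, double f)
{
    return (n * f) / (f - d * (f - n));
}
```

With n = 0.5 and f = 100, d = 0 maps back to 0.5 and d = 1 to 100. Scale the result into the same physical units as your calibrated Kinect depth (e.g. multiply by 1000 if your world units are metres and you compare in millimetres).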