Metal fragment shader appears to write twice for a single triangle? - fragment-shader

I'm trying to debug my Metal shader -- using GPU frame capture, I can see that the framebuffer values are twice what I expect.
For example, I am using the ADD blend operation, with source and destination multipliers of 1. After I draw a single triangle, the buffer value is twice what I output from the fragment shader. That indicates to me that two writes have happened, and they added together. But why would two writes take place when I draw a single triangle? Anyone know?
If I change the blend operation -- for example I set the destination multiplier to 0 and the source to 1, so that successive writes don't add together -- then I see the value I expect in the buffer. That is, I see the same value output from the shader.
I'm just wondering if there's something simple I am missing that would cause two writes to the same pixel from a fragment shader when drawing a single triangle?

Related

Vertex buffer with vertices of different formats

I want to draw a model that's composed of multiple meshes, where each mesh has different vertex formats. Is it possible to put all the various vertices within the same vertex buffer, and to point to the correct offset at vkCmdBindVertexBuffers time?
Or must all vertices within a buffer have the same format, thus necessitating multiple vbufs for such a model?
Looking at the manual for vkCmdBindVertexBuffers, it's not clear whether the offset is in bytes or in vertex-strides.
https://www.khronos.org/registry/vulkan/specs/1.2-extensions/man/html/vkCmdBindVertexBuffers.html
Your question really breaks down into three questions:
Does the pOffsets parameter for vkCmdBindVertexBuffers accept bytes or vertex strides?
Can I put more than one vertex format into a vertex buffer?
Should I put more than one vertex format into a vertex buffer?
The short version is:
Bytes
Yes
Probably not
Does the pOffsets parameter for vkCmdBindVertexBuffers accept bytes or vertex strides?
The function signature is
void vkCmdBindVertexBuffers(
VkCommandBuffer commandBuffer,
uint32_t firstBinding,
uint32_t bindingCount,
const VkBuffer* pBuffers,
const VkDeviceSize* pOffsets);
Note the VkDeviceSize type for pOffsets. This unambiguously means "bytes", not strides: a VkDeviceSize is always an offset or size in raw memory. Vertex strides aren't raw memory, they're simply a count, so the type would have to be uint32_t or uint64_t.
Furthermore there's nothing in that function signature that specifies the vertex format so there would be no way to convert the vertex stride count to actual memory sizes. Remember that unlike OpenGL, Vulkan is not a state machine, so this function doesn't have any "memory" of a rendering pipeline that you might have previously bound.
Can I put more than one vertex format into a vertex buffer?
As a consequence of the above answer, yes. You can put pretty much whatever you want into a vertex buffer, although I believe some hardware will have alignment restrictions on what are valid offsets for vertex buffers, so make sure you check that.
Should I put more than one vertex format into a vertex buffer?
Generally speaking you want to render your scene in as few draw calls as possible, and having lots of arbitrary vertex formats runs counter to that. I would argue that if possible, the only time you want to change vertex formats is when you're switching to a different rendering pass, such as when switching between rendering opaque items to rendering transparent ones.
Instead you should try to make format normalization part of your asset pipeline, taking your source assets and converting them to a single consistent format. If that's not possible, then you could consider doing the normalization at load time. This adds complexity to the loading code, but should drastically reduce the complexity of the rendering code, since you now only have to think in terms of a single vertex format.

DirectX 11 What is a Fragment?

I have been learning DirectX 11, and the book I am reading states that the Rasterizer outputs Fragments. It is my understanding that these Fragments are the output of the Rasterizer (which takes geometric primitives as input), and are in fact just 2D positions (on your 2D Render Target View).
Here is what I think I understand, please correct me.
The Rasterizer takes geometric primitives (spheres, cubes or boxes, toroids, cylinders, pyramids, triangle meshes or polygon meshes) (https://en.wikipedia.org/wiki/Geometric_primitive). It then translates these primitives into pixels (or dots) that are mapped to your Render Target View (which is 2D). This is what a Fragment is. For each Fragment, it executes the Pixel Shader to determine its color.
However, I am only assuming because there is no simple explanation of what it is (That I can find).
So my questions are ...
1: What is a Rasterizer? What are the inputs, and what is the output?
2: What is a fragment, in relation to the Rasterizer output?
3: Why is a fragment a float4 value (SV_Position), if it's just 2D screen space for the Render Target View?
4: How does it correlate to the Render Target Output (the 2D screen texture)?
5: Is this why we clear the Render Target View (to whatever color)? Because the Rasterizer and Pixel Shader will not execute on every X,Y location of the Render Target View?
Thank you!
I do not use DirectX but OpenGL instead; still, the terminology should be similar if not the same. My understanding is this:
(scene geometry) -> [Vertex shader] -> (per vertex data)
(per vertex data) -> [Geometry & Tessellation shader] -> (per primitive data)
(per primitive data) -> [rasterizer] -> (per fragment data)
(per fragment data) -> [Fragment shader] -> (fragment)
(fragment) -> [depth/stencil/alpha/blend...]-> (pixels)
So in the Vertex shader you can perform any per-vertex operations, like transforming coordinate systems, pre-computing needed parameters, etc.
In the geometry and tessellation shaders you can compute normals from geometry, emit/convert primitives, and much more.
The Rasterizer then converts geometry (primitives) into fragments. This is done by interpolation: it basically divides the viewed part of any primitive into fragments; see convex polygon rasterizer.
Fragments are not pixels, nor super-pixels, but they are close to it. The difference is that they may or may not be output, depending on the circumstances and pipeline configuration (pixels are the visible outputs). You can think of them as potential pixels.
The Fragment shader converts per-fragment data into final fragments. Here you compute per-fragment/pixel lighting and shading, do all the texture work, compute colors, etc. The output is also a fragment, which is basically a pixel plus some additional info, so it does not have just a position and color, but can have other properties as well (like more colors, depth, alpha, stencil, etc.).
This goes into the final combiner, which performs the depth test and any other enabled tests or functionality, like blending. Only that output goes into the framebuffer, as a pixel.
I think that answered #1,#2,#4.
Now #3 (I may be wrong here due to my lack of knowledge about DirectX): in the per-fragment data you often need the 3D position of a fragment, for proper lighting or whatever other computations, and as homogeneous coordinates are used, we need a 4D (x,y,z,w) vector for it. The fragment itself has 2D coordinates, but the 3D position is interpolated from the geometry passed on from the Vertex shader. So it may not contain the screen position, but world coordinates instead (or any other space).
#5: Yes, the scene may not cover the whole screen, and/or you need to preset buffers like Depth, Stencil, and Alpha so the rendering works as it should and is not invalidated by the previous frame's results. So we usually need to clear the framebuffers at the start of the frame. Some techniques require multiple clears per frame; others (like glow effects) clear once per several frames.

Comparing Kinect depth to OpenGL depth efficiently

Background:
This problem is related with 3D tracking of object.
My system projects objects/samples from known parameters (X, Y, Z) into OpenGL and
tries to match them with the image and depth information obtained from the Kinect sensor, to infer the object's 3D position.
Problem:
Kinect depth->process-> value in millimeters
OpenGL->depth buffer-> value between 0-1 (which is nonlinearly mapped between near and far)
Though I could recover the Z value from OpenGL using the method described at http://www.songho.ca/opengl/gl_projectionmatrix.html, doing it that way would be very slow.
I am sure this is a common problem, so I hope some clever solution exists.
Question:
Efficient way to recover eye Z coordinate from OpenGL?
Or is there any other way around to solve above problem?
Now my problem is Kinect depth is in mm
No, it is not. The Kinect reports its depth as a value in an 11-bit range of arbitrary units. Only after some calibration has been applied can the depth value be interpreted as a physical unit. You're right insofar as OpenGL perspective-projection depth values are nonlinear.
So if I understand you correctly, you want to emulate a Kinect by retrieving the contents of the depth buffer, right? Then the easiest solution would be a combination of vertex and fragment shader, in which the vertex shader passes the linear depth as an additional varying to the fragment shader, and the fragment shader then overwrites the fragment's depth value with the passed value. (You could also use an additional render target for this.)
Another method would be to use a 1D texture, projected into the depth range of the scene, where the texture values encode the depth value. Then the desired value would be in the color buffer.

How could I read information off the graphics card after processing?

Supposing, for example, that I had a 10x10 "cloth" mesh, with each square being two triangles. Now, if I wanted to animate this, I could do the spring calculations on the CPU. Each vertex would have its own "spring" data and would, hopefully, bounce like whatever type of "cloth" it was supposed to represent.
However, that would involve a minimum of about 380? spring calculations per frame. Happily, the per-vertex calculations are "embarrassingly parallel" - had I one CPU per vertex, each vertex's calculation could run on its own CPU. GPUs, therefore, are theoretically an excellent choice for running such calculations.
Except (and this is using DirectX/SlimDX) - I have no idea/am not sure how I would/should:
1) Send all this vertex data to the graphics card (yes, I know how to render things, and have even written my own per-pixel and texture-blending global lighting effect file; however, each vertex needs to be able to access the position data of at least three other vertices). I suppose I could stick the relevant vertex positions and the number of vertex positions in TextureCoords, but there may be a different, standard solution.
2) Read all the vertex data back afterwards, so I can update the mesh in memory. Otherwise, each update will act on the exact same data with the exact same result, rather like computing 2 + 3 = 5, 2 + 3 = 5, 2 + 3 = 5 when what you want is 2 + 3 = 5 - 2 = 3 + 1.5 = 4.5.
And it may be that I'm looking in the wrong direction to do this.
Thanks.
You could use the approach you described: pack the data into textures and write special HLSL shaders to compute the spring forces and then update the vertex positions. That approach is totally valid, but it can be troublesome when you try to debug problems, because you are using texture pixels in an unconventional way (you can draw the texture, and maybe write some code to watch the values in a given pixel). It is probably easier in the long run to use something like CUDA, DirectCompute, or OpenCL. CUDA allows you to "bind" the DirectX vertex buffer for access in CUDA. Then, in a CUDA kernel, you can use the positions to calculate forces and write the new positions back to the vertex buffer (in parallel on the GPU) before you render the updated positions.
There is a cloth demo that uses DirectCompute in the DirectX 10/11 DirectX SDK.

Seam Carving – Accessing pixel data in cocoa

I want to implement the seam carving algorithm by Avidan/Shamir. After the energy-computation stage, which can be implemented using a Core Image filter, I need to compute the seams with the lowest energy, which can't be implemented as a Core Image filter because it uses dynamic programming (and you don't have access to previous computations in the OpenGL shading language).
So I need a way to access the pixel data of an image efficiently in Objective-C/Cocoa.
Pseudo code omitting boundary checks:
for y in 0..lines(image) do:
    for x in 0..columns(image) do:
        output[x][y] = value(image, x, y) +
            min{ output[x-1][y-1]; output[x][y-1]; output[x+1][y-1] }
The best way to get access to the pixel values of an image is to create a CGBitmapContextRef with CGBitmapContextCreate. The important part is that when you create the context, you pass in the pointer that will be used as the backing store for the bitmap's data. That buffer will hold the pixel values, and you can do whatever you want with them.
So the steps should be:
Allocate a buffer with malloc or another suitable allocator.
Pass that buffer as the first parameter to CGBitmapContextCreate.
Draw your image into the returned CGBitmapContextRef.
Release the context.
Now your original data pointer is filled with pixels in the format specified in the call to CGBitmapContextCreate.