OpenGL 4.5 - Shader storage: write in vertex shader, read in fragment shader

Both my fragment and vertex shaders contain the following two declarations:
struct Light {
    mat4 view;
    mat4 proj;
    vec4 fragPos;
};

layout (std430, binding = 0) buffer Lights {
    Light lights[];
};
My problem is that the last field, fragPos, is computed by the vertex shader like this, but the fragment shader does not see the changes the vertex shader makes to fragPos (or any changes at all):
aLight.fragPos = bias * aLight.proj * aLight.view * vec4(vs_frag_pos, 1.0);
... where aLight is lights[i] in a loop. As you can imagine, I'm computing the position of the vertex in the coordinate system of each light present, to be used in shadow mapping. Any idea what's wrong here? Am I doing something fundamentally wrong?
Here is how I initialize my storage:
struct LightData {
    glm::mat4 view;
    glm::mat4 proj;
    glm::vec4 fragPos;
};

glGenBuffers(1, &BBO);
glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 0, BBO);
glNamedBufferStorage(BBO, lights.size() * sizeof(LightData), NULL, GL_DYNAMIC_STORAGE_BIT);

// lights is a vector of a wrapper class for LightData.
for (unsigned int i = 0; i < lights.size(); i++) {
    glNamedBufferSubData(BBO, i * sizeof(LightData), sizeof(LightData), &(lights[i]->data));
}
It may be worth noting that if I move fragPos out of the struct into a fixed-size output array in the vertex shader (out fragPos[2]), leave the results there, add the matching in fragPos[2] in the fragment shader and use that for the rest of my work, then things are fine. So what I want to understand here is why my fragment shader does not see the numbers crunched by the vertex shader.

I won't be perfectly precise, but I will try to explain why your fragment shader does not see what your vertex shader writes.
When your vertex shader writes values into the buffer, those values are not required to reach video memory right away; they may sit in a cache. The same applies when your fragment shader reads the buffer: it may read values from a cache (and not the same cache the vertex shader used).
To avoid this problem, you must do two things. First, declare your buffer as coherent in the GLSL: layout(std430) coherent buffer ...
Second, after your writes you must issue a barrier (roughly, it says: careful, I have written values into this buffer; cached values you might read may be stale, please pick up the new values I wrote).
How do you do that? By calling memoryBarrierBuffer after your writes: https://www.khronos.org/registry/OpenGL-Refpages/gl4/html/memoryBarrierBuffer.xhtml
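Putting the two pieces together on the declarations from the question, the vertex-shader side might look roughly like this (a sketch, not the poster's actual code: the bias matrix and the loop are taken from the question, and the fragment shader must declare the same block coherent as well):

// Vertex shader (sketch): block declared coherent, barrier issued after the writes.
struct Light {
    mat4 view;
    mat4 proj;
    vec4 fragPos;
};
layout (std430, binding = 0) coherent buffer Lights {
    Light lights[];
};

uniform mat4 bias;              // assumed: the bias matrix used in the question

void computeLightFragPos(vec3 vs_frag_pos)
{
    for (int i = 0; i < lights.length(); i++) {
        lights[i].fragPos = bias * lights[i].proj * lights[i].view
                            * vec4(vs_frag_pos, 1.0);
    }
    memoryBarrierBuffer();      // make the buffer writes visible before they are read
}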
BTW: don't forget to divide by w after your projection.

Does order within push constant structs matter, even when using alignas()?

I have a "Sprite" struct that I hand over as a push constant.
struct Sprite
{
    glm::vec2 position;
    alignas(16) Rect uvRect;          // a "Rect" is just 2 glm::vec2s
    alignas(4) glm::uint32 textureIndex;
    alignas(4) glm::float32 rotation;
};
In my .vert file, I describe its layout as:
layout(push_constant) uniform Push
{
    vec2 offset;    // the 'position' part of the Sprite
    vec2 origin;    // the first part of the Sprite struct's "uvRect"
    vec2 extent;    // the second part of the Sprite struct's "uvRect"
    uint textureIndex;
    float rotation;
} push;
This doesn't work: I get a black screen.
However, if I rearrange Sprite so that it goes:
struct Sprite
{
    glm::vec2 position;
    alignas(4) glm::uint32 textureIndex;
    alignas(16) Rect uvRect;          // a "Rect" is just 2 glm::vec2s
    alignas(4) glm::float32 rotation;
};
...and then change the push constant layout in the .vert file accordingly, it suddenly works.
Does anyone know why this might be?
In your first structure you align the Rect to a 16-byte boundary, but the push constant block in Vulkan expects the next vec2 to be tightly packed. Assuming I'm reading the alignment spec correctly, structures are aligned on a multiple of a 16-byte boundary, whereas a two-component vector has an alignment twice that of its base component:
An array or structure type has an extended alignment equal to the largest extended alignment of any of its members, rounded up to a multiple of 16.
A scalar or vector type has an extended alignment equal to its base alignment.
A two-component vector has a base alignment equal to twice its scalar alignment.
A scalar of size N has a scalar alignment of N.
Thus the offset of uvRect in C is 16, but in GLSL the offset of origin is 8 and of extent is 16.
By changing the order, origin still only needs 8-byte alignment in GLSL, but it now sits after three dwords (vec2 offset plus uint textureIndex, 12 bytes) and is rounded up to offset 16, which matches what the C struct provides.
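One way to catch this kind of mismatch early on the C++ side is to assert on the offsets the shader will read from. A small sketch, assuming GLM and the two struct layouts from the question (the expected offsets are the ones derived above; offsetof on these types is technically conditionally supported, but works on the major compilers):

#include <cstddef>
#include <glm/glm.hpp>

struct Rect { glm::vec2 origin; glm::vec2 extent; };

// Original ordering: alignas(16) pushes uvRect to offset 16,
// but the push constant block reads 'origin' at offset 8.
struct SpriteV1 {
    glm::vec2 position;
    alignas(16) Rect uvRect;
    alignas(4) glm::uint32 textureIndex;
    alignas(4) glm::float32 rotation;
};
static_assert(offsetof(SpriteV1, uvRect) == 16, "");  // GLSL, however, reads origin at offset 8

// Reordered version: textureIndex fills the hole after position, so uvRect's
// offset (16) matches the 16-byte offset GLSL computes for 'origin'.
struct SpriteV2 {
    glm::vec2 position;
    alignas(4) glm::uint32 textureIndex;
    alignas(16) Rect uvRect;
    alignas(4) glm::float32 rotation;
};
static_assert(offsetof(SpriteV2, textureIndex) == 8, "");
static_assert(offsetof(SpriteV2, uvRect) == 16, "");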

What are the normal methods for achieving texture mapping with raytracing?

When you create a BLAS (bottom level acceleration structure) you specify any number of vertex/index buffers to be part of the structure. How does that end up interacting with the shader and getting specified in the descriptor set? How should I link these structures with materials?
How is texture mapping usually done with raytracing? I saw some sort of "materials table" in Q2RTX but the documentation is non-existent and the code is sparsely commented.
A common approach is to use a material buffer in combination with a texture array that is indexed in the shaders wherever you require the texture data. You then pass the material id, e.g. per-vertex or per-primitive, and use that to dynamically fetch the material, and with it the texture index. Due to the requirements of Vulkan ray tracing you can simplify this by using the VK_EXT_descriptor_indexing extension (Spec), which makes it possible to create a large descriptor set containing all the textures required to render your scene.
The relevant shader parts:
// Enable required extension
...
#extension GL_EXT_nonuniform_qualifier : enable

// Material definition
struct Material {
    int albedoTextureIndex;
    int normalTextureIndex;
    ...
};

// Bindings
layout(binding = 6, set = 0) readonly buffer Materials { Material materials[]; };
layout(binding = 7, set = 0) uniform sampler2D[] textures;
...

// Usage
void main()
{
    Primitive primitive = unpackTriangle(gl_Primitive, ...);
    Material material = materials[primitive.materialId];
    vec4 color = texture(textures[nonuniformEXT(material.albedoTextureIndex)], uv);
    ...
}
In your application you then create a buffer that stores the materials generated on the host, and bind it to the binding point of the shader.
For the textures, you pass them as an array of textures. An array texture would be an option too, but isn't as flexible due to its same-size-per-slice limitation. Note that the texture array in the example above has no declared size, which is made possible by VK_EXT_descriptor_indexing and is only allowed for the final binding in a descriptor set. This adds some flexibility to your setup.
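On the application side, the unsized texture array translates into a descriptor set layout whose last binding is flagged as variable-count. The following is a rough sketch only: createMaterialSetLayout, maxTextureCount and the closest-hit stage flag are assumptions of mine, the binding numbers match the shader above, and it uses the Vulkan 1.2 names of the flags that VK_EXT_descriptor_indexing introduced.

#include <vulkan/vulkan.h>
#include <array>

VkDescriptorSetLayout createMaterialSetLayout(VkDevice device, uint32_t maxTextureCount)
{
    std::array<VkDescriptorSetLayoutBinding, 2> bindings{};
    bindings[0].binding         = 6;                                  // material buffer
    bindings[0].descriptorType  = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
    bindings[0].descriptorCount = 1;
    bindings[0].stageFlags      = VK_SHADER_STAGE_CLOSEST_HIT_BIT_KHR;

    bindings[1].binding         = 7;                                  // texture array
    bindings[1].descriptorType  = VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER;
    bindings[1].descriptorCount = maxTextureCount;                    // upper bound you pick
    bindings[1].stageFlags      = VK_SHADER_STAGE_CLOSEST_HIT_BIT_KHR;

    // The unsized array must be the last binding; mark it variable-count.
    std::array<VkDescriptorBindingFlags, 2> flags = {
        0,
        VK_DESCRIPTOR_BINDING_VARIABLE_DESCRIPTOR_COUNT_BIT |
        VK_DESCRIPTOR_BINDING_PARTIALLY_BOUND_BIT
    };
    VkDescriptorSetLayoutBindingFlagsCreateInfo flagsInfo{};
    flagsInfo.sType         = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_BINDING_FLAGS_CREATE_INFO;
    flagsInfo.bindingCount  = static_cast<uint32_t>(flags.size());
    flagsInfo.pBindingFlags = flags.data();

    VkDescriptorSetLayoutCreateInfo layoutInfo{};
    layoutInfo.sType        = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO;
    layoutInfo.pNext        = &flagsInfo;
    layoutInfo.bindingCount = static_cast<uint32_t>(bindings.size());
    layoutInfo.pBindings    = bindings.data();

    VkDescriptorSetLayout layout = VK_NULL_HANDLE;
    vkCreateDescriptorSetLayout(device, &layoutInfo, nullptr, &layout);
    return layout;
}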
As for passing the material index that you fetch the data with: the easiest approach is to pass that information along with your vertex data, which you have to access/unpack in your shaders anyway:
struct Vertex {
    vec4 pos;
    vec4 normal;
    vec2 uv;
    vec4 color;
    int32_t materialIndex;
};

How to cut a polyhedron with a plane or bounding box?

Within a polyhedron, how do I obtain the handle to any edge that intersects a given plane (the purpose being that I can then cut further with CGAL::polyhedron_cut_plane_3)?
I currently have this snippet, but it doesn't work. I constructed it from pieces found in the CGAL documentation and examples:
CGAL 4.14 - 3D Fast Intersection and Distance Computation (AABB Tree)
typedef CGAL::Simple_cartesian<double> Kernel;
typedef CGAL::Polyhedron_3<Kernel> Polyhedron;
typedef CGAL::AABB_face_graph_triangle_primitive<Polyhedron> Primitive;
typedef CGAL::AABB_traits<Kernel, Primitive> Traits;

Polyhedron poly = load_obj(argv[1]);  // load from file using a helper
Kernel::Plane_3 plane(1, 0, 0, 0);    // I am certain this plane intersects the given mesh

CGAL::AABB_tree<Traits> tree(faces(poly).first, faces(poly).second, poly);
auto intersection = tree.any_intersection(plane);
if (intersection) {
    if (boost::get<Kernel::Segment_3>(&(intersection->first))) {
        // SHOULD enter here and I can do things with intersection->second
    } else {
        // BUT it enters here
    }
} else {
    std::cout << "No intersection." << std::endl;
}
Edit on 9/9/2019:
I changed the title from the original (old title: "How to obtain the handle to some edge found in a plane-polyhedron intersection"). With the methods provided in CGAL/Polygon_mesh_processing/clip.h, it is unnecessary to use an AABB tree to find the intersection.
To clip with one plane, one line is enough: CGAL::Polygon_mesh_processing::clip(poly, plane);
To clip within some bounding box, as suggested by #sloriot, there is an internal function CGAL::Polygon_mesh_processing::internal::clip_to_bbox. Here is an example.
The simplest way to do it would be to use the undocumented function clip_to_bbox() from the file CGAL/Polygon_mesh_processing/clip.h to turn a plane into a clipping bbox, and then call corefine() to embed the plane intersection into your mesh. If you want to get the intersection edges, pass an edge-constrained map to corefine() in the named parameters.
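For the single-plane case from the edit above, a minimal sketch of the documented clip() call, using the same types as the question (note that clip() expects a triangle mesh, and an exact-predicates kernel such as Exact_predicates_inexact_constructions_kernel is generally more robust than Simple_cartesian for this operation):

#include <CGAL/Simple_cartesian.h>
#include <CGAL/Polyhedron_3.h>
#include <CGAL/Polygon_mesh_processing/clip.h>

typedef CGAL::Simple_cartesian<double> Kernel;
typedef CGAL::Polyhedron_3<Kernel>     Polyhedron;

// Clips 'poly' in place, keeping the part on the negative side of the plane.
void clip_with_plane(Polyhedron& poly)
{
    Kernel::Plane_3 plane(1, 0, 0, 0);   // x = 0, as in the question
    CGAL::Polygon_mesh_processing::clip(poly, plane);
}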

How does one add vertices to a mesh object in OpenGL?

I am new to OpenGL and I have been using The Red Book and the Super Bible. In the SB, I have gotten to the section about using objects loaded from files. So far, I don't think I have a problem understanding what is going on and how to do it, but it got me thinking about making my own mesh within my own app -- in essence, a modeling app. I have done a lot of searching through both of my references as well as the internet, and I have yet to find a nice tutorial about implementing such functionality in one's own app. I found an API that provides this functionality, but I am trying to understand the implementation, not just the interface.
Thus far, I have created an "app" (I use the term lightly) that gives you a view you can click in to add vertices. The vertices don't connect; they are just displayed where you click. My concern is that the method I stumbled upon while experimenting is not the way I should be implementing this process.
I am working on a Mac and using Objective-C and C in Xcode.
MyOpenGLView.m
#import "MyOpenGLView.h"
@interface MyOpenGLView () {
    NSTimer *_renderTimer;
    GLuint VAO, VBO;
    GLuint totalVertices;
    GLsizei bufferSize;
}
@end

@implementation MyOpenGLView
/* Set up OpenGL view with a context and pixelFormat with doubleBuffering */
/* NSTimer implementation */

- (void)drawS3DView {
    currentTime = CACurrentMediaTime();

    NSOpenGLContext *currentContext = self.openGLContext;
    [currentContext makeCurrentContext];
    CGLLockContext([currentContext CGLContextObj]);

    const GLfloat color[] = {
        sinf(currentTime * 0.2),
        sinf(currentTime * 0.3),
        cosf(currentTime * 0.4),
        1.0
    };
    glClearBufferfv(GL_COLOR, 0, color);

    glUseProgram(shaderProgram);
    glBindVertexArray(VAO);
    glPointSize(10);
    glDrawArrays(GL_POINTS, 0, totalVertices);

    CGLFlushDrawable([currentContext CGLContextObj]);
    CGLUnlockContext([currentContext CGLContextObj]);
}
#pragma mark - User Interaction

- (void)mouseUp:(NSEvent *)theEvent {
    NSPoint mouseLocation = [theEvent locationInWindow];
    NSPoint mouseLocationInView = [self convertPoint:mouseLocation fromView:self];
    GLfloat x = -1 + mouseLocationInView.x * 2 / (GLfloat)self.bounds.size.width;
    GLfloat y = -1 + mouseLocationInView.y * 2 / (GLfloat)self.bounds.size.height;

    NSOpenGLContext *currentContext = self.openGLContext;
    [currentContext makeCurrentContext];
    CGLLockContext([currentContext CGLContextObj]);
    [_renderer addVertexWithLocationX:x locationY:y];
    CGLUnlockContext([currentContext CGLContextObj]);
}
- (void)addVertexWithLocationX:(GLfloat)x locationY:(GLfloat)y {
    glBindBuffer(GL_ARRAY_BUFFER, VBO);

    GLfloat vertices[(totalVertices * 2) + 2];
    // Note: glGetBufferSubData expects a size in bytes.
    glGetBufferSubData(GL_ARRAY_BUFFER, 0, totalVertices * 2 * sizeof(GLfloat), vertices);

    for (int i = 0; i < ((totalVertices * 2) + 2); i++) {
        if (i == (totalVertices * 2)) {
            vertices[i] = x;
        } else if (i == (totalVertices * 2) + 1) {
            vertices[i] = y;
        }
    }

    glBufferData(GL_ARRAY_BUFFER, sizeof(vertices), vertices, GL_STATIC_DRAW);
    totalVertices++;
}

@end
The app is supposed to take the location of the mouse click and use it as a vertex location. With each added vertex, I first bind the VBO to make sure it is active. Next, I create a new array to hold my current vertex locations (totalVertices of them) plus space for one more vertex (+2 for x and y). Then I use glGetBufferSubData to bring the data back from the VBO into this array. Using a for loop I add the x and y values to the end of the array. Finally, I send this data back to the GPU into the VBO and call totalVertices++ so I know how many vertices I have in the array next time I want to add a vertex.
This brings me to my question: am I doing this right? Put another way, should I be keeping a copy of the buffer data on the CPU side so that I don't have to call out to the GPU and have the data sent back for editing? That way I wouldn't call glGetBufferSubData; I would just create a bigger array, add the new vertex to the end, and then call glBufferData to reallocate the VBO with the updated vertex data.
** I tried to include my thinking process so that someone like myself who is very inexperienced in programming can hopefully understand what I am trying to do. I don't want anyone to be offended by my explanations of what I did. **
I would certainly avoid reading the data back. Not only because of the extra data copy, but also to avoid synchronization between CPU and GPU.
When you make an OpenGL call, you can picture the driver building a GPU command, queuing it up for later submission to the GPU, and then returning. These commands will then be submitted to the GPU at a later point. The idea is that the GPU can run as independently as possible from whatever runs on the CPU, which includes your application. CPU and GPU operating in parallel with minimal dependencies is very desirable for performance.
For most glGet*() calls, this asynchronous execution model breaks down. They will often have to wait until the GPU completed all (or at least some) pending commands before they can return the data. So the CPU might block while only the GPU is running, which is undesirable.
For that reason, you should definitely keep your CPU copy of the data so that you don't ever have to read it back.
Beyond that, there are a few options. It will all depend on your usage pattern, the performance characteristics of the specific platform, etc. To really get the maximum out of it, there's no way around implementing multiple variations, and benchmarking them.
For what you're describing, I would probably start with something that works similar to a std::vector in C++. You allocate a certain amount of memory (typically named capacity) that is larger than what you need at the moment. Then you can add data without reallocating, until you fill the allocated capacity. At that point, you can for example double the capacity.
Applying this to OpenGL, you can reserve a certain amount of memory by calling glBufferData() with NULL as the data pointer. Keep track of the capacity you allocated, and populate the buffer with calls to glBufferSubData(). When adding a single point in your example code, you would call glBufferSubData() with just the new point. Only when you run out of capacity, you call glBufferData() with a new capacity, and then fill it with all the data you already have.
In pseudo-code, the initialization would look something like this:
int capacity = 10;
glBufferData(GL_ARRAY_BUFFER, capacity * sizeof(Point), NULL, GL_DYNAMIC_DRAW);
std::vector<Point> data;
Then each time you add a point:
data.push_back(newPoint);
if (data.size() <= capacity) {
    glBufferSubData(GL_ARRAY_BUFFER,
                    (data.size() - 1) * sizeof(Point), sizeof(Point), &newPoint);
} else {
    capacity *= 2;
    glBufferData(GL_ARRAY_BUFFER, capacity * sizeof(Point), NULL, GL_DYNAMIC_DRAW);
    glBufferSubData(GL_ARRAY_BUFFER, 0, data.size() * sizeof(Point), &data[0]);
}
As an alternative to glBufferSubData(), glMapBufferRange() is another option to consider for updating buffer data. Going farther, you can look into using multiple buffers, and cycle through them, instead of updating just a single buffer. This is where benchmarking comes into play, because there isn't a single approach that will be best for every possible platform and use case.
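As a rough illustration of the mapping route, the single-point update from the pseudo-code above could be written like this (a sketch: Point, the buffer bound to GL_ARRAY_BUFFER, and the appropriate OpenGL header are assumed from the surrounding code):

#include <cstring>

void appendPointMapped(const Point& newPoint, size_t index)
{
    // Map only the range being written; invalidating it tells the driver
    // we don't care about the previous contents of those bytes.
    void* dst = glMapBufferRange(GL_ARRAY_BUFFER,
                                 index * sizeof(Point), sizeof(Point),
                                 GL_MAP_WRITE_BIT | GL_MAP_INVALIDATE_RANGE_BIT);
    if (dst) {
        std::memcpy(dst, &newPoint, sizeof(Point));
        glUnmapBuffer(GL_ARRAY_BUFFER);
    }
}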

What is the correct way to structure a Cg program?

This tutorial uses explicit OUT structures, e.g.:
struct C3E1v_Output {
    float4 position : POSITION;
    float4 color : COLOR;
};

C3E1v_Output C3E1v_anyColor(float2 position : POSITION,
                            uniform float4 constantColor)
{
    C3E1v_Output OUT;
    OUT.position = float4(position, 0, 1);
    OUT.color = constantColor; // Some RGBA color
    return OUT;
}
But looking at one of my shaders I have explicit in/out parameters:
float4 slice_vp(
    // Vertex inputs
    in float4 position : POSITION,     // Vertex position in model space
    out float4 oposition : POSITION,
    // Model level inputs
    uniform float4x4 worldViewProj) : TEXCOORD6
{
    // Calculate output position
    float4 p = mul(worldViewProj, position);
    oposition = p;
    return p;
}
I'm having some problems using HLSL2GLSL with this and wondered if my Cg format is to blame (even though it works fine as a Cg script). Is there a 'right' way or are the two simply different ways to the same end?
As you've seen, both ways work. However, I strongly endorse using structs -- especially for the output of vertex shaders (input of fragment shaders). The reasons are less to do with what the machine likes (it doesn't care), and more to do with creating code that can be safely re-used and shared between projects and people. The last thing you want to have to find and debug is a case where one programmer has assigned a value to TEXCOORD1 in some cases and is trying to read it from TEXCOORD2 in (some) other cases. Or any permutation of register mis-match. Use structs, your life will be better.
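For the concrete slice_vp above, a struct-based rewrite might look like this (a sketch; the struct name is made up, and the TEXCOORD6 member stands in for the original function's return value):

struct Slice_VP_Output {
    float4 position : POSITION;
    float4 slicePos : TEXCOORD6;
};

Slice_VP_Output slice_vp(float4 position : POSITION,
                         uniform float4x4 worldViewProj)
{
    Slice_VP_Output OUT;
    float4 p = mul(worldViewProj, position);
    OUT.position = p;
    OUT.slicePos = p;   // same value the original returned through TEXCOORD6
    return OUT;
}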