I'm looking for a better way (or confirmation that this is already the best way) to convert a pixel coordinate into its corresponding ray direction from an arbitrary camera position/direction.
My current method is as follows. I define a "camera" as a position vector, a lookat vector, and an up vector, named as such. (Note that here the lookat vector is a unit vector in the direction the camera is facing, NOT a target point from which the direction is derived, as is the convention in XNA's Matrix.CreateLookAt.) These three vectors uniquely define a camera. Here's the code (a simplified, abstracted version of the actual code; the language is HLSL):
// Map the pixel to [-1, 1] on each axis, with aspect correction on x.
float xPixelCoordShifted = (xPixelCoord / screenWidth * 2 - 1) * aspectRatio;
float yPixelCoordShifted = yPixelCoord / screenHeight * 2 - 1;
// Build an orthonormal basis from the camera vectors (normalize in case
// lookat and up are not exactly perpendicular).
float3 right = normalize(cross(lookat, up));
float3 actualUp = cross(right, lookat);
// Offset the view direction along the basis vectors by the pixel position.
float3 rightShift = right * xPixelCoordShifted;
float3 upShift = actualUp * yPixelCoordShifted;
return normalize(lookat + rightShift + upShift);
(the return value is the direction of the ray)
So what I'm asking is this: what's a better way to do this, perhaps using matrices? The problem with this method is that with too wide a viewing angle, the edges of the screen get "radially stretched".
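For reference, the method above is effectively a pinhole camera with an implicit 90-degree vertical field of view (the y offset spans [-1, 1] at unit distance along lookat). A common variant makes the field of view an explicit parameter by scaling the offsets by tan(fov/2); a minimal sketch, where fov is an assumed new parameter in radians:
// Pinhole ray with an explicit vertical field of view.
float tanHalfFov = tan(fov * 0.5);
float3 right = normalize(cross(lookat, up));
float3 actualUp = cross(right, lookat);
float3 rayDir = normalize(lookat
    + right * xPixelCoordShifted * tanHalfFov
    + actualUp * yPixelCoordShifted * tanHalfFov);
Note that some stretching toward the edges at wide angles is inherent to any planar (rectilinear) projection; only a non-planar mapping such as a fisheye projection avoids it.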
You can calculate it (the ray) in the pixel shader. HLSL code:
float4x4 WorldViewProjMatrix; // World*View*Proj
float4x4 WorldViewProjMatrixInv; // (World*View*Proj)^(-1)
float3 CameraPosWS; // camera position in world space (added here for the ray)
void VS( float4 vPos : POSITION,
out float4 oPos : POSITION,
out float4 pos : TEXCOORD0 )
{
oPos = mul(vPos, WorldViewProjMatrix);
pos = oPos; // pass the clip-space position through to the pixel shader
}
float4 PS( float4 pos : TEXCOORD0 ) : COLOR0
{
// Unproject the interpolated clip-space position back to world space.
float4 posWS = mul(pos, WorldViewProjMatrixInv);
// The ray runs from the camera through this world-space point.
float3 ray = normalize(posWS.xyz / posWS.w - CameraPosWS);
return float4(0, 0, 0, 1); // placeholder: use ray here
}
The information about your camera's position and direction is in the View matrix (Matrix.CreateLookAt).
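In XNA you can recover the world-space camera position needed above with Matrix.Invert(view).Translation; the CameraPosWS constant in the listing is an assumed addition that you would set from that value.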
I have this shader:
#version 450
layout(binding = 0) buffer b0 {
vec2 src[];
};
layout(binding = 1) buffer b1 {
vec2 dst[];
};
layout(binding = 2) buffer b2 {
int index[];
};
layout (local_size_x = 1, local_size_y = 1, local_size_z = 1) in;
void main()
{
int ind = int(gl_GlobalInvocationID.x);
vec2 norm;
// Average the two edge vectors pointing away from vertex 0 of triangle ind.
norm = src[index[ind*3+2]] - src[index[ind*3]] + src[index[ind*3+1]] - src[index[ind*3]];
norm /= 2.0;
dst[index[ind*3]] += norm; // not atomic: triangles sharing this vertex race here
// Same for vertex 1.
norm = src[index[ind*3+0]] - src[index[ind*3+1]] + src[index[ind*3+2]] - src[index[ind*3+1]];
norm /= 2.0;
dst[index[ind*3+1]] += norm;
// Same for vertex 2.
norm = src[index[ind*3+1]] - src[index[ind*3+2]] + src[index[ind*3+0]] - src[index[ind*3+2]];
norm /= 2.0;
dst[index[ind*3+2]] += norm;
}
Because writes to the dst buffer are not atomic, the summation is incorrect.
Is there any way to solve this problem? My guess is no, but maybe I missed something.
For each vertex of a polygon I am calculating a vector from the vertex to the center of the polygon. Different polygons share the same vertices.
dst is a vertex buffer holding the result of summing those shifts from vertex to polygon center.
Each run produces different results.
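One standard workaround does exist: accumulate into an integer buffer using fixed-point encoding, since both GLSL (atomicAdd) and HLSL (InterlockedAdd) provide atomic adds only for int/uint. A minimal sketch, written in HLSL compute syntax to match the rest of this page (the GLSL version with atomicAdd on int SSBO members is analogous); the buffer layout and FIXED_SCALE are assumptions:
// Accumulate float2 shifts atomically via 16.16 fixed point (names are hypothetical).
RWStructuredBuffer<int> dstFixed; // 2 ints per vertex (x, y), zeroed before dispatch
static const float FIXED_SCALE = 65536.0;
void AtomicAddVec2(uint vertexIndex, float2 v)
{
    int2 fx = int2(round(v * FIXED_SCALE)); // encode as fixed point
    InterlockedAdd(dstFixed[vertexIndex * 2 + 0], fx.x);
    InterlockedAdd(dstFixed[vertexIndex * 2 + 1], fx.y);
}
// After the dispatch, decode: dst[i] = float2(dstFixed[i*2], dstFixed[i*2+1]) / FIXED_SCALE.
The other common fix is to invert the loop: dispatch one invocation per vertex that gathers over the triangles touching that vertex, so each dst element has exactly one writer and no atomics are needed.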
In the hair rendering slides presented by Scheuermann of ATI at GDC 2004, I found code like this:
float StrandSpecular (float3 T, float3 V, float3 L, float exponent)
{
float3 H = normalize(L + V); // half vector between light and view
float dotTH = dot(T, H);
float sinTH = sqrt(1.0 - dotTH*dotTH); // sine of the angle between tangent and H
float dirAtten = smoothstep(-1.0, 0.0, dot(T, H));
return dirAtten * pow(sinTH, exponent);
}
I truly have no idea what the value dirAtten means. What exactly does it represent in this shading model?
I regard dirAtten as an attenuation coefficient that controls the range of directions over which the specular is visible. Note that sinTH is symmetric in dot(T, H), so the pow(sinTH, exponent) term alone would produce the same highlight whether H points along or against the hair tangent; dirAtten breaks that symmetry by ramping smoothly from 0 at dot(T, H) = -1 to 1 at dot(T, H) = 0, fading the strand specular out on the back-facing side.
I am rewriting my application using modern OpenGL (3.3+) in Jogl.
I am using all the conventional matrices, that is objectToWorld, worldToCamera and cameraToClip (or model, view and projection).
I created a class for handling all the mouse movements, as McKesson does in his "Learning Modern 3D Graphics Programming", with a method to offset the camera target position:
private void offsetTargetPosition(MouseEvent mouseEvent){
Mat4 currMat = calcMatrix();
Quat orientation = currMat.toQuaternion();
Quat invOrientation = orientation.conjugate();
Vec2 current = new Vec2(mouseEvent.getX(), mouseEvent.getY());
Vec2 diff = current.minus(startDragMouseLoc);
// Rotate the screen-space drag by the inverse camera orientation to get a world-space offset.
Vec3 worldOffset = invOrientation.mult(new Vec3(-diff.x*10, diff.y*10, 0.0f));
currView.setTargetPos(currView.getTargetPos().plus(worldOffset));
startDragMouseLoc = current;
}
calcMatrix() returns the camera matrix; the rest should be clear.
What I want is to move my object along with the mouse. Right now mouse movement and object translation don't correspond, that is, they are not linear, because I am dealing with different spaces, I guess.
I learned that if I want to apply a transformation T to space O, but expressed relative to space C, I should do the following, with p as a vertex:
C * (C * T * C^-1) * O * p
Should I do something similar?
I solved it with a damn simple proportion...
// Map the pixel drag to world units: the ortho volume is 2*10000*aspect wide
// and 2*10000 tall, spread over the viewport's width/height in pixels.
float x = (float) (10000 * 2 * EC_Main.viewer.getAspect() * diff.x / EC_Main.viewer.getWidth());
float y = (float) (10000 * 2 * diff.y / EC_Main.viewer.getHeight());
Vec3 worldOffset = invOrientation.mult(new Vec3(-x, y, 0.0f));
Taking into account my projection matrix:
Mat4 orthographicMatrix = Jglm.orthographic(-10000.0f * (float) EC_Main.viewer.getAspect(), 10000.0f * (float) EC_Main.viewer.getAspect(),
-10000.0f, 10000.0f, -10000.0f, 10000.0f);
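The proportion is just the world-units-per-pixel of that orthographic volume: horizontally it spans right - left = 2 * 10000 * aspect world units across getWidth() pixels, and vertically 2 * 10000 world units across getHeight() pixels, which is exactly the factor applied to diff.x and diff.y above.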
I'm trying to create a custom CIFilter (like Adobe's warp filter). How do I move only a few pixels (those in the ROI) to some other location in the kernel language? Could someone point me to some info about this? I have read all the Apple docs about creating a custom CIFilter, but haven't found any example of the kernel portion of this type of filter. There are some CIFilters that do something similar (like CITwirlDistortion and CIBumpDistortion). Maybe there is some place where I could find their kernels?
You have to do this in reverse. Instead of saying "I want to put these input pixels at that position in the output", you have to answer the question: for this output pixel, where in the input are its source pixels?
Take a look at this kernel:
kernel vec4 coreImageKernel(sampler image, float minX, float maxX, float shift)
{
vec2 coord = samplerCoord( image );
float x = coord.x;
// compare(a, b, c) returns b if a < 0 and c otherwise; nested here it
// yields 1.0 when minX < x < maxX and 0.0 otherwise, with no branching.
float inRange = compare( minX - x, compare( x - maxX, 1., 0. ), 0. );
coord.x = coord.x + inRange * shift;
return sample( image, coord );
}
It replaces a vertical stripe between minX and maxX with the contents of the image shift pixels to the right. For example, minX = 100, maxX = 300 and shift = 500 fills that stripe with the pixels originally located at x between 600 and 800.
So the effect is that you move the pixels in the range (minX + shift, maxX + shift) to (minX, maxX).
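The same inverse-mapping idea generalizes to a warp filter: for each output pixel, compute (or fetch from a displacement texture) an offset and sample the input at coord plus that offset, rather than trying to push input pixels to new output positions.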
What are successful strategies to optimize HLSL shader code in terms of computational complexity (meaning: minimizing runtime of the shader)?
I guess one way would be to minimize the number of arithmetic operations that result from compiling the shader.
How could this be done (a) manually and (b) with automated tools, if any exist?
Collection of manual techniques (Updated)
Avoid branching (But how to do that best?)
Whenever possible: precompute outside shader and pass as argument.
Example code:
float2 DisplacementScroll;
// Parameter that limit the water effect
float glowHeight;
float limitTop;
float limitTopWater;
float limitLeft;
float limitRight;
float limitBottom;
sampler TextureSampler : register(s0); // Original color
sampler DisplacementSampler : register(s1); // Displacement
float fadeoutWidth = 0.05;
// External rumble displacement
int enableRumble;
float displacementX;
float displacementY;
float screenZoom;
float4 main(float4 color : COLOR0, float2 texCoord : TEXCOORD0) : COLOR0
{
// Calculate minimal distance to next border
float dx = min(texCoord.x - limitLeft, limitRight - texCoord.x);
float dy = min(texCoord.y - limitTop, limitBottom - texCoord.y);
///////////////////////////////////////////////////////////////////////////////////////
// RUMBLE //////////////////////
///////////////////////////////////////////////////////////////////////////////////////
if (enableRumble!=0)
{
// Limit rumble strength by distance to HLSL-active region (think map)
// The factor of 100 is chosen by hand and controls slope with which dimfactor goes to 1
float dimfactor = clamp(100.0f * min(dx, dy), 0, 1); // Maximum is 1.0 (do not amplify)
// Shift texture coordinates by rumble
texCoord.x += displacementX * dimfactor * screenZoom;
texCoord.y += displacementY * dimfactor * screenZoom;
}
//////////////////////////////////////////////////////////////////////////////////////////
// Water refraction (optical distortion) and water like-color tint //////////////////////
//////////////////////////////////////////////////////////////////////////////////////////
if (dx >= 0)
{
float dyWater = min(texCoord.y - limitTopWater, limitBottom - texCoord.y);
if (dyWater >= 0)
{
// Look up the amount of displacement from texture
float2 displacement = tex2D(DisplacementSampler, DisplacementScroll + texCoord / 3).xy;
float finalFactor = min(dx,dyWater) / fadeoutWidth;
if (finalFactor > 1) finalFactor = 1;
// Apply displacement by water refraction
texCoord.x += (displacement.x * 0.2 - 0.15) * finalFactor * 0.15 * screenZoom; // Why these strange numbers ?
texCoord.y += (displacement.y * 0.2 - 0.15) * finalFactor * 0.15 * screenZoom;
// Look up the texture color of the original underwater pixel.
color = tex2D(TextureSampler, texCoord);
// Additional color transformation (blue shift)
color.r = color.r - 0.1f;
color.g = color.g - 0.1f;
color.b = color.b + 0.3f;
}
else if (dyWater > -glowHeight)
{
// No water distortion...
color = tex2D(TextureSampler, texCoord);
// Scales from 0 (upper glow limit) ... 1 (near water surface)
float glowFactor = 1 - (dyWater / -glowHeight);
// ... but bluish glow
// Additional color transformation
color.r = color.r - (glowFactor * 0.1); // 24 = 1/(30f/720f); // Prelim: depends on screen resolution, must fit to value in HLSL Update
color.g = color.g - (glowFactor * 0.1);
color.b = color.b + (glowFactor * 0.3);
}
else
{
// Return original color (no water distortion above and below)
color = tex2D(TextureSampler, texCoord);
}
}
else
{
// Return original color (no water distortion left or right)
color = tex2D(TextureSampler, texCoord);
}
return color;
}
technique Refraction
{
pass Pass0
{
PixelShader = compile ps_2_0 main();
}
}
I'm not very familiar with HLSL internals, but from what I've learned from GLSL: never branch if you can avoid it. The GPU will likely execute both sides of the branch and then select the valid result.
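For instance, a branch that merely selects between two values can usually be rewritten branch-free with intrinsics. A generic sketch (not from the code above):
// Branch-free select, replacing: c = (x >= 0.5) ? a : b;
float t = step(0.5, x); // 1.0 when x >= 0.5, else 0.0
float4 c = lerp(b, a, t); // a when t == 1.0, b when t == 0.0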
As far as I know there are no automatic tools except the compiler itself. For very low-level optimization you can use fxc with the /Fc parameter to get the assembly listing; the possible assembly instructions are documented in the DirectX shader assembly reference. One low-level optimization worth mentioning is MAD: multiply-and-add. The following may not be optimized to a MAD operation (I'm not sure, just try it out yourself):
a *= b;
a += c;
but this should be optimized to a MAD:
a = (a * b) + c;
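To check, compile both versions with fxc /Fc and compare the listings: see whether each form produces a single mad instruction or a separate mul and add.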
You can also optimize your code by rearranging the math and folding constants together. Take this snippet:
// Shift texture coordinates by rumble
texCoord.x += displacementX * dimfactor * screenZoom;
texCoord.y += displacementY * dimfactor * screenZoom;
Here you multiply three values, but only one of them (dimfactor) is computed per pixel; the other two are constants, so you can pre-multiply them and store the product in a global constant.
// Shift texture coordinates by rumble
texCoord.x += dimfactor * pre_zoom_dispx; // displacementX * screenZoom
texCoord.y += dimfactor * pre_zoom_dispy; // displacementY * screenZoom
Another example:
// Apply displacement by water refraction
texCoord.x += (displacement.x * 0.2 - 0.15) * finalFactor * 0.15 * screenZoom; // Why these strange numbers ?
texCoord.y += (displacement.y * 0.2 - 0.15) * finalFactor * 0.15 * screenZoom;
0.15 * screenZoom can be precomputed into one global; in fact everything here except displacement and finalFactor is constant across the frame.
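A minimal sketch of that folding, where preRefract and preBias are hypothetical shader constants computed once on the CPU:
// preRefract = 0.2 * 0.15 * screenZoom;  preBias = -0.15 * 0.15 * screenZoom;
texCoord.x += (displacement.x * preRefract + preBias) * finalFactor;
texCoord.y += (displacement.y * preRefract + preBias) * finalFactor;
This also collapses each line into a MAD-friendly form.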
The HLSL compiler in Visual Studio 2012 has an option in the project properties to enable optimizations. But the best optimization you can make is to write the HLSL code as simply as possible and to use the intrinsic functions: http://msdn.microsoft.com/en-us/library/windows/desktop/ff471376(v=vs.85).aspx
Those functions are like C's memcpy: their implementations are hand-tuned to the hardware's fast paths, much as memcpy exploits 128-bit SIMD registers on CPUs (http://en.wikipedia.org/wiki/Streaming_SIMD_Extensions), so prefer them over hand-written equivalents.
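A trivial illustration (generic, not from the post above):
// Hand-rolled dot product: three multiplies and two adds.
float d1 = a.x*b.x + a.y*b.y + a.z*b.z;
// Intrinsic: typically compiles to a single dp3 instruction.
float d2 = dot(a, b);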