Passing array of images to compute shader - compute-shader

I am currently working on a project using the draft for compute shaders in WebGL 2.0. Nevertheless, I do not think my question is WebGL-specific; it is more of a general OpenGL problem. The goal is to build a pyramid of images to be used in a compute shader. Each level should be a square of 2^(n-level) values (down to 1x1) and contain simple integer/float values.
I already have the values needed for this pyramid, stored in different OpenGL images, each with its corresponding size. My question is: how would you pass this image array to a compute shader? I have no problem restricting the number of images passed to, say, 14 or 16, but it needs to be at least 12. If there is a mechanism to store the images in a texture and then use textureLod, that would solve the problem, but I could not figure out how to do that with images.
Thank you for your help. I am pretty sure there is an obvious way to do this, but I am relatively new to OpenGL in general.

Passing in a "pyramid of images" is a normal thing to do in WebGL. It's called a mipmap.
gl.texImage2D(gl.TEXTURE_2D, 0, gl.R32F, 128, 128, 0, gl.RED, gl.FLOAT, some128x128FloatData);
gl.texImage2D(gl.TEXTURE_2D, 1, gl.R32F, 64, 64, 0, gl.RED, gl.FLOAT, some64x64FloatData);
gl.texImage2D(gl.TEXTURE_2D, 2, gl.R32F, 32, 32, 0, gl.RED, gl.FLOAT, some32x32FloatData);
gl.texImage2D(gl.TEXTURE_2D, 3, gl.R32F, 16, 16, 0, gl.RED, gl.FLOAT, some16x16FloatData);
gl.texImage2D(gl.TEXTURE_2D, 4, gl.R32F, 8, 8, 0, gl.RED, gl.FLOAT, some8x8FloatData);
gl.texImage2D(gl.TEXTURE_2D, 5, gl.R32F, 4, 4, 0, gl.RED, gl.FLOAT, some4x4FloatData);
gl.texImage2D(gl.TEXTURE_2D, 6, gl.R32F, 2, 2, 0, gl.RED, gl.FLOAT, some2x2FloatData);
gl.texImage2D(gl.TEXTURE_2D, 7, gl.R32F, 1, 1, 0, gl.RED, gl.FLOAT, some1x1FloatData);
The second argument is the mip level, followed by the internal format, then the width and height of that mip level.
You can access any individual texel from a shader with texelFetch(someSampler, integerTexelCoord, mipLevel), as in:
uniform sampler2D someSampler;
...
ivec2 texelCoord = ivec2(17, 12);
int mipLevel = 2;
float r = texelFetch(someSampler, texelCoord, mipLevel).r;
But you said you need 12, 14, or 16 levels. 16 levels means a base level of 2^(16-1) = 32768x32768 texels. With R32F (4 bytes per texel) that's 4 GiB for the base level alone, plus about a third more for the rest of the mip chain, so you'll need a GPU with enough memory and you'll have to pray the browser lets you allocate that much.
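As a back-of-the-envelope check (a plain-Python sketch, not WebGL code), the memory for a full square R32F mip chain works out like this:

```python
# Rough memory estimate for a full mip chain of square R32F levels.
# Level k has side 2**k; R32F uses 4 bytes per texel.

def mip_chain_bytes(levels, bytes_per_texel=4):
    """Total bytes for levels 0..levels-1, level k being 2**k x 2**k."""
    return sum((2 ** k) ** 2 * bytes_per_texel for k in range(levels))

base = (2 ** 15) ** 2 * 4          # the 32768 x 32768 base level
total = mip_chain_bytes(16)        # all 16 levels, 1x1 up to 32768x32768

print(base / 2 ** 30)              # 4.0 -> 4 GiB for the base level alone
print(total / base)                # ~1.33 -> the smaller mips add about a third
```

The familiar "mipmaps add one third" rule falls out of the geometric series: each smaller level is a quarter the size of the one above it.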
You can save some memory by allocating your texture with texStorage2D and then uploading data with texSubImage2D.
You mentioned using images. If by that you mean <img> tags or Image objects, you can't get float or integer data from those; they are generally 8 bits per channel.
Of course, rather than using mip levels, you could also arrange your data into a texture atlas.


Emboss an Image in HALCON

Original image: [image]
OpenCV processed image: [image]
The first image is the original; the second is OpenCV's processed result.
I want to realize the effect in HALCON too.
Can someone advise me on which method or HALCON operator to use?
According to Wikipedia (Image embossing) this can be achieved by a convolutional filter. In the example you provided it seems that the embossing direction is South-West.
In HALCON you can use the operator convol_image to calculate the correlation between an image and an arbitrary filter mask. For a south-west embossing direction, the 3x3 filter mask would look like this:
 1  0  0
 0  0  0
 0  0 -1
To apply such a filter matrix in HDevelop you can use the following line of code:
convol_image (OriginalImage, EmbossedImage, [3, 3, 1, 1, 0, 0, 0, 0, 0, 0, 0, -1], 0)
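For intuition about what that mask does (a plain-Python sketch, not HALCON code), the correlation can be written out directly; flat regions produce 0 and intensity gradients produce a relief-like response:

```python
# Apply the 3x3 emboss mask to a grayscale image (list of lists) by correlation.
# Border pixels are left at 0 for simplicity; HALCON handles borders itself.

MASK = [[1, 0, 0],
        [0, 0, 0],
        [0, 0, -1]]

def emboss(img):
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = sum(MASK[j][i] * img[y - 1 + j][x - 1 + i]
                            for j in range(3) for i in range(3))
    return out

flat = emboss([[10] * 5 for _ in range(5)])      # uniform image -> all zeros
ramp = emboss([[x for x in range(5)] for _ in range(5)])  # horizontal gradient
print(flat[2][2], ramp[2][2])
```

Note that convol_image's tuple in the call above packs the mask height, width, and scaling factor before the nine coefficients; the sketch uses only the coefficients.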

How to get real prediction from TensorFlow

I'm really new to TensorFlow and ML in general. I've been reading a lot and searching for days, but haven't really found anything useful, so...
My main problem is that every single tutorial/example uses images/words/etc., and the outcome is just a vector of numbers between 0 and 1 (i.e. probabilities). Like that beginner tutorial where they want to identify digits in images: the result is just a map of each digit's (0-9) probability. My problem is different: I have no idea what the result could be; it could be 1, 2, 999, anything.
So basically:
input: [1, 2, 3, 4, 5]
output: [2, 4, 6, 8, 10]
This really is just a simplified example. I would always have the same number of inputs and outputs, but how can I get the predicted values back from TensorFlow, not just something like [0.1, 0.1, 0.2, 0.2, 0.4]? I'm not really sure how clear my question is; if it's not, please ask.
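What you are describing is regression rather than classification: if the output layer is linear (no softmax) and the loss is mean squared error, the network's output *is* the predicted value directly. A framework-free sketch of that idea (plain Python, not TensorFlow; the single weight `w` stands in for the network's parameters):

```python
# Minimal regression sketch: learn y = w * x with mean-squared-error loss
# and gradient descent, so the model outputs real values, not probabilities.

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]

w = 0.0                       # single trainable weight
lr = 0.01                     # learning rate
for _ in range(1000):
    # dL/dw for L = mean((w*x - y)^2)
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    w -= lr * grad

preds = [w * x for x in xs]
print([round(p, 2) for p in preds])   # converges close to [2, 4, 6, 8, 10]
```

In TensorFlow the same change amounts to using a linear output layer and an MSE-style loss instead of softmax plus cross-entropy.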

How does tf.nn.conv2d behave with an even-sized filter?

I've read this question for which the accepted answer only mentions square odd-sized filters (1x1, 3x3), and I am intrigued about how tf.nn.conv2d() behaves when using a square even-sized filter (e.g. 2x2) given that none of its elements can be considered its centre.
If padding='VALID' then I assume that tf.nn.conv2d() will stride across the input in the same way as it would if the filter were odd-sized.
However, if padding='SAME' how does tf.nn.conv2d() choose to centre the even-sized filter over the input?
See the description here:
https://www.tensorflow.org/versions/r0.9/api_docs/python/nn.html#convolution
For VALID padding, you're exactly right. You simply walk the filter over the input without any padding, moving the filter by the stride each time.
For SAME padding, you do the same as VALID padding, but conceptually you pad the input with some zeros before and after each dimension before computing the convolution. If an odd number of padding elements must be added, the right/bottom side gets the extra element.
Use the pad_... formulae in the link above to work out how much padding to add. For example, for simplicity, consider a 1D convolution: for an input of size 7, a window of size 2, and a stride of 1, you would add 1 element of padding on the right and 0 elements on the left.
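Those formulae can be sketched in a few lines of Python (following TensorFlow's documented SAME-padding rule for one dimension):

```python
import math

# TensorFlow's SAME-padding rule for one dimension: the output size is
# ceil(input / stride); total padding is whatever makes that work out,
# with the extra element (if the total is odd) going on the right/bottom.

def same_pad_1d(in_size, filt_size, stride):
    out_size = math.ceil(in_size / stride)
    pad_total = max((out_size - 1) * stride + filt_size - in_size, 0)
    pad_left = pad_total // 2
    pad_right = pad_total - pad_left
    return pad_left, pad_right

print(same_pad_1d(7, 2, 1))   # (0, 1): one zero on the right, none on the left
```

For odd-sized filters the padding splits evenly, e.g. a size-3 window on the same input gives one element on each side.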
Hope that helps!
To make this concrete, suppose the filter in the conv layer simply averages each patch (i.e. weights [0.5, 0.5]):
Taking the example of a 1D input of size 5, [1, 2, 3, 4, 5],
with a size-2 filter, stride 1, and VALID padding,
you get a size-4 output: [(1+2)/2, (2+3)/2, (3+4)/2, (4+5)/2],
which is [1.5, 2.5, 3.5, 4.5].
If you do SAME padding with stride 1, you get a size-5 output: [(1+2)/2, (2+3)/2, (3+4)/2, (4+5)/2, (5+0)/2], where the final 0 is the zero padding,
which is [1.5, 2.5, 3.5, 4.5, 2.5].
Likewise, if you have a 224x224 input and do 2x2 filtering with VALID padding, it will output 223x223.
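The same walk-through as a runnable sketch (plain Python; a real tf.nn.conv2d call would use learned weights rather than this fixed averaging filter):

```python
# 1D convolution with an averaging filter [0.5, 0.5] and stride 1,
# illustrating VALID vs SAME padding on the input [1, 2, 3, 4, 5].

def conv1d(xs, weights):
    n, k = len(xs), len(weights)
    return [sum(w * x for w, x in zip(weights, xs[i:i + k]))
            for i in range(n - k + 1)]

xs = [1, 2, 3, 4, 5]
avg = [0.5, 0.5]

valid = conv1d(xs, avg)          # no padding: output shrinks by k - 1
same = conv1d(xs + [0], avg)     # one zero on the right (SAME rule for k=2)

print(valid)   # [1.5, 2.5, 3.5, 4.5]
print(same)    # [1.5, 2.5, 3.5, 4.5, 2.5]
```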

Asymmetrical and inaccurate output from Mali-400MP GPU

I have the following simple fragment shader:
precision highp float;
void main()
{
    gl_FragColor = vec4(0.15, 0.15, 0.15, 0.15);
}
I'm rendering to texture using frame buffer object.
When reading back values from the frame buffer I get the following:
38, 38, 38, 38, 39, 39, 39, 39, 38, 38, 38, 38, 39, etc.
0.15 * 255 = 38.25, so I expect to get 38 uniformly for all pixels, which I do get on my desktop GPU (Intel 4000) and on Tegra 3.
I'll be glad if someone can shed some light on this issue.
It is critical for anyone doing GPGPU on mobile devices, as the Mali-400MP is used in the Samsung Galaxy S2, S3, and S3 Mini.
It looks like your output is being dithered: the quantization error left over from one pixel is carried over to the next, where it can round the value up. Remember that GL_DITHER is enabled by default in OpenGL; try calling glDisable(GL_DITHER).
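As an illustration of why dithering yields a mix of 38s and 39s rather than a uniform 38 (a plain-Python sketch of ordered dithering; the exact pattern depends on the GPU's dither matrix, which isn't documented here):

```python
import math

# Ordered (Bayer) dithering sketch: quantizing the constant 0.15 * 255 = 38.25
# with a per-pixel threshold gives a mix of 38s and 39s whose average equals
# the true value, instead of rounding every pixel down to 38.

BAYER_2X2 = [[0.00, 0.50],
             [0.75, 0.25]]   # per-pixel thresholds in [0, 1)

def dither(value, width, height):
    return [[math.floor(value + BAYER_2X2[y % 2][x % 2])
             for x in range(width)]
            for y in range(height)]

pixels = dither(38.25, 4, 4)
flat = [v for row in pixels for v in row]
print(sorted(set(flat)))        # [38, 39]: both values appear
print(sum(flat) / len(flat))    # 38.25: the average preserves the true value
```

Disabling dithering replaces this spatial averaging with plain truncation/rounding, which is what you want when reading back GPGPU results.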

Graph a series of planes as a solid object in Mathematica

I'm trying to graph a series of planes as a solid object in Mathematica. I first tried to use the RangePlot3D options as well as the fill options to graph the 3D volume, but was unable to find a working result.
The graphic I'm trying to create will show the deviation between the z axis and the radius from the origin of a 3D cuboid. The current equation I'm using is this:
Plot3D[Evaluate[{Sqrt[(C[1])^2 + x^2 + y^2]} /.
C[1] -> Range[6378100, 6379120]], {x, -1000000,
1000000}, {y, -1000000, 1000000}, AxesLabel -> Automatic]
(The output for a more manageable range looks as follows.)
Here C[1] was the original z-value at each plane, and the result of this equation is z + (r - z) for any point on the x, y plane.
However, this method is incredibly inefficient. Because this will be used to model large objects with original z-values of >6,000,000 and heights above 1000, Mathematica is unable to graph thousands of planes and render them responsively.
Additionally, because the Range of C[1] only includes integer values, there are discontinuities between these planes.
Is there a way to rewrite this using different Mathematica functionality that will generate a 3D plot that is both a reasonable load on my system and a smooth object?
Second, what can I do to improve my performance? When computing the above input for >30 minutes, Mathematica was only utilizing about 30% CPU and 4 GB of RAM, with a light load on my graphics card as well. This is only about twice as much as Chrome is using right now on my system.
I attempted to enable CUDALink, but it wouldn't enable properly. Would this offer a performance boost for this type of processing?
For reference, my system build is:
16 GB RAM
Intel i7 4770K running at stock settings
Nvidia GeForce GTX 760
256 GB Samsung SSD
Plotting a million planes and hoping that becomes a 3d solid seems unlikely to succeed.
Perhaps you could adapt something like this
Show[Plot3D[{Sqrt[6^2+x^2+y^2], Sqrt[20^2+x^2+y^2]}, {x, -10, 10}, {y, -10, 10},
AxesLabel -> Automatic, PlotRange -> {{-15, 15}, {-15, 15}, All}],
Graphics3D[{
Polygon[Join[
Table[{x, -10, Sqrt[6^2 + x^2 + (-10)^2]}, {x, -10, 10, 1}],
Table[{x, -10, Sqrt[20^2 + x^2 + (-10)^2]}, {x, 10, -10, -1}]]],
Polygon[Join[
Table[{-10, y, Sqrt[6^2 + (-10)^2 + y^2]}, {y, -10, 10, 1}],
Table[{-10, y, Sqrt[20^2 + (-10)^2 + y^2]}, {y, 10, -10, -1}]]],
Polygon[Join[
Table[{x, 10, Sqrt[6^2 + x^2 + 10^2]}, {x, -10, 10, 1}],
Table[{x, 10, Sqrt[20^2 + x^2 + 10^2]}, {x, 10, -10, -1}]]],
Polygon[Join[
Table[{10, y, Sqrt[6^2 + 10^2 + y^2]}, {y, -10, 10, 1}],
Table[{10, y, Sqrt[20^2 + 10^2 + y^2]}, {y, 10, -10, -1}]]]}]]
What that does is plot the top and bottom surfaces and then construct four polygons, each connecting the top and bottom surface along one side. One caution: if you look very closely you will see that, because they are polygons, the edges of the four faces are made up of short line segments rather than true curves, and thus do not perfectly join your two surfaces; there can be tiny gaps or tiny overlaps. This may or may not make any difference for your application.
That graphic displays in a fraction of a second on a machine that is a fraction of yours.
Mathematica does not automatically parallelize computations onto multiple cores.
CUDA programming is a considerably bigger challenge than turning the link on.
If you can simply define each face of your solid and combine them with Show then
I think you will have a much greater chance of success.
Another way:
xyrange = 10
cmin = 6
cmax = 20
RegionPlot3D[
Abs[x] < xyrange && Abs[y] < xyrange &&
cmin^2 < z^2 - ( x^2 + y^2) < cmax^2 ,
{x, -1.2 xyrange, 1.2 xyrange}, {y, -1.2 xyrange, 1.2 xyrange},
{z, cmin, Sqrt[ cmax^2 + 2 xyrange^2]}, MaxRecursion -> 15,
PlotPoints -> 100]
This is nowhere near as fast as Bill's approach, but it may be useful if you plot a more complicated region. Note that RegionPlot3D does not work for your original example because the volume is too small compared to the plot range.