How to properly use GL_HALF_FLOAT_OES for textures? - opengl-es-2.0

I'm using OpenGL ES 2.0 on an iPad 2 / 3. I'd like to use GL_FLOAT when creating textures:
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, texWidth, texHeight, 0, GL_RGBA, GL_FLOAT, rawData);
but the problem is that GL_LINEAR is not supported as a GL_TEXTURE_MAG_FILTER when GL_FLOAT is used unless GL_OES_texture_float_linear shows up in your list of supported extensions. (It doesn't on any of the iPads.)
But I do have GL_OES_texture_half_float_linear in my list of extensions. So using a half-float texture ought to work with linear interpolation.
Problem is, switching my texture creation to:
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, texWidth, texHeight, 0, GL_RGBA, GL_HALF_FLOAT_OES, rawData);
causes EXC_BAD_ACCESS when I run the app. (libGPUSupportMercury.dylib`gpus_ReturnGuiltyForHardwareRestart)
I haven't changed the format of the data since calling it with GL_FLOAT. Does the input data need to change for HALF_FLOAT somehow? If so, how? I don't know how I would convert my input floats to half precision. Each component is a GLfloat currently.
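For reference, GL_HALF_FLOAT_OES expects the client data to already be 16-bit IEEE 754 half floats, so a buffer of 32-bit GLfloats will be misinterpreted. A minimal CPU-side conversion sketch (needs <stdint.h>, <string.h>, <stdlib.h>; the helper name FloatToHalf is mine, and NaN handling is omitted):

// Convert one 32-bit float to a 16-bit half float (IEEE 754 binary16).
// Flushes denormals to zero and clamps overflow to infinity.
static uint16_t FloatToHalf(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof(bits));
    uint16_t sign     = (uint16_t)((bits >> 16) & 0x8000);
    int32_t  exponent = (int32_t)((bits >> 23) & 0xFF) - 127 + 15; // re-bias 8-bit exponent to 5 bits
    uint32_t mantissa = bits & 0x007FFFFF;
    if (exponent <= 0)  return sign;            // too small: signed zero
    if (exponent >= 31) return sign | 0x7C00;   // too large: infinity
    return sign | (uint16_t)(exponent << 10) | (uint16_t)(mantissa >> 13); // truncate mantissa to 10 bits
}

// Convert the whole RGBA buffer, then upload it as half floats.
size_t count = (size_t)texWidth * texHeight * 4;
uint16_t *halfData = (uint16_t *)malloc(count * sizeof(uint16_t));
for (size_t i = 0; i < count; i++)
    halfData[i] = FloatToHalf(((const GLfloat *)rawData)[i]);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, texWidth, texHeight, 0,
             GL_RGBA, GL_HALF_FLOAT_OES, halfData);
free(halfData);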

My "solution" for right now has been to add my own bilinear interpolation to the fragment shader, since OpenGL won't do it for me when GL_FLOAT is used.

Related

How to stop OpenGL background bleed on transparent textures

I have an iOS OpenGL ES 2.0 3D game and am working to get transparent textures working nicely, in this particular example for a fence.
I'll start with the final result. The bits of green background/clear color are coming through around the edges of the fence - note how it isn't ALL edges and some of it is ok:
The reason for the lack of bleed in the top right is order of operations. As you can see from the following shots, the order of draw includes some buildings that get drawn BEFORE the fence. But most of it is after the fence:
So one solution is to always draw my transparent textured objects last. I would like to explore other solutions, as my pipeline might not always allow this. I'm looking for other suggestions to solve this problem without sorting my draws.
This is likely a depth or blending issue, but I've tried a ton of things and nothing seems to work (different blend functions, different discard alpha levels, different background colors, different texture settings).
Here are some specifics of my implementation.
In my frag shader I'm throwing out fragments that have transparency - this way they won't render to depth:
lowp vec4 texVal = texture2D(sTexture, texCoord);
if(texVal.w < 0.5)
discard;
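For context, that test sits inside a fragment shader along these lines (a minimal sketch; only sTexture and texCoord come from the question, everything else is assumed):

uniform sampler2D sTexture;    // the atlas texture
varying mediump vec2 texCoord; // interpolated UVs from the vertex shader

void main()
{
    lowp vec4 texVal = texture2D(sTexture, texCoord);
    if (texVal.w < 0.5)
        discard;               // transparent texels never write color or depth
    gl_FragColor = texVal;
}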
I'm using one giant PVR texture atlas with mipmapping - the texture itself SHOULD just have 0 or 1 for alpha, but something with the blending could be causing this:
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR_MIPMAP_LINEAR);
I'm using the following blending when rendering:
glEnable(GL_BLEND);
glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
Any suggestions to fix this bleed would be great!
EDIT - tried a different min filter for the texture as suggested in the comments (LINEAR/NEAREST), but same result. Note I have also tried NEAREST/NEAREST with no luck.
Try increasing the alpha discard threshold:
lowp vec4 texVal = texture2D(sTexture, texCoord);
if(texVal.w < 0.9)
discard;
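Another variant sometimes suggested for atlas edge bleed (my addition, not part of the original answer): use premultiplied alpha. If your texture pipeline multiplies RGB by alpha at load time (as Xcode's default PNG processing does), filtered edge texels stop reintroducing stray colour, and the blend function becomes:

glEnable(GL_BLEND);
glBlendFunc(GL_ONE, GL_ONE_MINUS_SRC_ALPHA);  // source RGB already carries alpha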
I know this is an old question, but I came across it several times whilst trying to find an answer to my very similar OpenGL issue. Thought I'd share my findings here for anyone with similar problems. The culprit in my code looked like this:
glClearColor(1, 0, 1, 0);
glClear(GL_COLOR_BUFFER_BIT);
I used a pink transparent colour for ease of visual reference whilst debugging. Despite the fact that it was transparent, when blending between the background and the colour of the subject it would bleed in, much like the symptoms in the question's screenshot. What fixed it for me was wrapping this code with a colour mask around the glClear step. It looked like this:
glColorMask(false, false, false, true);
glClearColor(1, 0, 1, 0);
glClear(GL_COLOR_BUFFER_BIT);
glColorMask(true, true, true, true);
To my knowledge, this means that when the clear kicks in it only operates on the alpha channel. Afterwards, all channels are re-enabled so rendering continues as intended. If someone with a more solid knowledge of OpenGL can explain it better I'd love to hear!

Can't draw completely using Open GL ES 2.0 in cocos 2d 2.0 in a game tutorial

I am currently working through this tutorial http://www.raywenderlich.com/3888/how-to-create-a-game-like-tiny-wings-part-1 and trying to get the hills drawn correctly. However, the tutorial was written before cocos2d 2.0 and therefore doesn't involve shaders, so I'm trying to convert it. So far I have not succeeded, and I'm getting images like these:
As it can be seen, it doesn't draw the hills completely and colors of the hills are wrong too.
Also, it stops drawing them completely after 2 or 3 seconds. Here is the part of the code where I draw them:
- (void) draw {
    // bind the node's shader and upload the model-view-projection matrix
    [self.shaderProgram use];
    [self.shaderProgram setUniformForModelViewProjectionMatrix];
    ccGLBlendFunc( CC_BLEND_SRC, CC_BLEND_DST );
    ccGLBindTexture2D(_stripes.texture.name);
    // feed positions and UVs straight from the hill arrays
    glVertexAttribPointer(kCCVertexAttrib_Position, 2, GL_FLOAT, GL_FALSE, 0, _hillVertices);
    glEnableVertexAttribArray(kCCVertexAttrib_Position);
    glVertexAttribPointer(kCCVertexAttrib_TexCoords, 2, GL_FLOAT, GL_FALSE, 0, _hillTexCoords);
    glEnableVertexAttribArray(kCCVertexAttrib_TexCoords);
    glDrawArrays(GL_TRIANGLE_STRIP, 0, (GLsizei)_nHillVertices);
}
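For reference, in cocos2d 2.x a node's draw method usually goes through the CC_NODE_DRAW_SETUP() macro and the GL state cache rather than calling the shader program directly; a sketch of that idiom (assuming the stock cocos2d 2.x macros, untested against this tutorial):

- (void) draw {
    CC_NODE_DRAW_SETUP();   // binds the shader and sets the built-in uniforms
    ccGLBlendFunc(CC_BLEND_SRC, CC_BLEND_DST);
    ccGLBindTexture2D(_stripes.texture.name);
    // enable exactly the attributes this draw uses, via the state cache
    ccGLEnableVertexAttribs(kCCVertexAttribFlag_Position | kCCVertexAttribFlag_TexCoords);
    glVertexAttribPointer(kCCVertexAttrib_Position, 2, GL_FLOAT, GL_FALSE, 0, _hillVertices);
    glVertexAttribPointer(kCCVertexAttrib_TexCoords, 2, GL_FLOAT, GL_FALSE, 0, _hillTexCoords);
    glDrawArrays(GL_TRIANGLE_STRIP, 0, (GLsizei)_nHillVertices);
}

An attribute such as kCCVertexAttrib_Color left enabled by a previous draw, pointing at stale data, can produce exactly this kind of corrupted or vanishing geometry, which is why routing the enables through the state cache matters.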
If needed I can share more of the code. I am hoping that anybody who has had this problem before can help me or suggest a solution.
Thanks

Implementing non-power-of-2 textures in iSGL3d

I've written an app that relies heavily on iSGL3d for 3D rendering, and I've come to a point now where I need to start fiddling with texture sizes for memory allocation reasons.
My app uses very large textures (1024x1024), and going from that to 512x512 is unacceptable.
So, using GL ES 2.0 as a basis, I want to slightly reduce my textures to something closer to 700x700.
I know this is possible, because I've painstakingly handwritten OpenGL code in a previous life that used non-power-of-2 textures.
But I've had a hell of a time trying to sift through iSGL3d's code to find where I can effect this change... and the project appears to be abandoned now.
Basically, by default, even if you use a GLES 2.0 instance, iSGL3d will just make a power-of-two bitmap and dump your texture into it, leaving a bunch of transparent pixels. This is worthless.
Forcing the texture size to a non-power-of-two image generates GL errors. I am assuming this is because I am not properly forcing it everywhere it needs to be forced, or because iSGL3d isn't properly using GLES 2.0 as it should be.
Any pointers at all would be useful...
Simply by disabling mipmapping, even valid textures fail to draw.
Did you set the minification filter for these textures to not use mipmaps? It defaults to a mipmapped option (GL_NEAREST_MIPMAP_LINEAR), so you have to set it to something else if you don't use mipmaps.
e.g.
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
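One more constraint worth knowing: core OpenGL ES 2.0 treats a non-power-of-two texture as incomplete (it samples as black) unless mipmapping is off and both wrap modes are clamp-to-edge, so set the wrap state as well:

glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);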

Using depth and color buffers with different resolutions (sub-sampled depth buffer)

I want to use a sub-sampled depth buffer to increase the performance of a program. In my case, it does not matter if artifacts appear or geometry popping occurs.
I have set up my framebuffer like this:
// Color attachment
glBindTexture(GL_TEXTURE_2D, colorAttachment);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, 640, 360, 0, GL_RGBA, GL_UNSIGNED_BYTE, nil);
// Depth attachment
glBindRenderbuffer(GL_RENDERBUFFER, depthAttachment);
glRenderbufferStorage(GL_RENDERBUFFER, GL_DEPTH_COMPONENT16, 160, 90);
// Framebuffer
glBindFramebuffer(GL_FRAMEBUFFER, framebuffer);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, colorAttachment, 0);
glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_RENDERBUFFER, depthAttachment);
However, glCheckFramebufferStatus(GL_FRAMEBUFFER) now returns GL_FRAMEBUFFER_INCOMPLETE_DIMENSIONS, which stands for "Not all attached images have the same width and height" according to the documentation.
There exists a research paper called "Full-3D Edge Tracking with a Particle Filter" which describes in section 3.5 that they actually used a sub-sampled depth buffer to increase the performance in their application.
Sub-sampled depth buffer: Adjacent pixels along an image edge are so closely correlated that testing each individual edge pixel is redundant. For single-hypothesis trackers, it is common to spread sample points a distance of 10-20 pixels apart along an edge. Sampling only every nth edge pixel also reduces the graphics bandwidth required and so only every 4th pixel is sampled. Instead of explicitly drawing stippled lines, this is here achieved by using a sub-sampled depth buffer (160 x 120) since this further achieves a bandwidth reduction for clearing and populating the depth buffer. However, this also means that hidden line removal can be inaccurate to approximately four pixels. Apart from this, the accuracy of the system is unaffected.
The only obvious workarounds are:
Using a fragment shader to look up the previously rendered depth buffer and apply the depth test manually.
Rendering the depth buffer at the lower resolution, then resampling it to the higher resolution and using it as before.
Neither approach sounds like it would be particularly performant. What is the cleanest way to achieve a sub-sampled depth buffer?
The doc page you referenced refers to OpenGL ES 1.0 and 2.0. The OpenGL wiki has more information as to the difference between 2.0 and 3.0, namely that starting with 3.0 (and ARB_framebuffer_object), framebuffer textures can be of different sizes. However, if I recall correctly, when you have textures of different sizes attached, the actual texture size used is the intersection of all FBO attached textures. I don't think this is what you want.
In order to reduce the size of your depth texture, I suggest using glBlitFramebuffer to transform your large texture into a smaller one. This operation is completely done on the GPU so it's very fast. The final smaller texture can then be used as input for further rendering operations in your shaders which will definitely provide bandwidth savings. Instead of performing the averaging of multiple depth values for each pixel shader execution, it will be done once per texel in the smaller texture. A smaller texture is also inherently faster to sample since it fits in cache better.
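A sketch of that blit (my example; assumes an OpenGL ES 3.0 context and two already-complete framebuffers named fullResFBO and lowResFBO matching the sizes from the question):

// Downsample the full-resolution depth attachment into the small FBO.
glBindFramebuffer(GL_READ_FRAMEBUFFER, fullResFBO);
glBindFramebuffer(GL_DRAW_FRAMEBUFFER, lowResFBO);
glBlitFramebuffer(0, 0, 640, 360,    // source rectangle
                  0, 0, 160, 90,     // destination rectangle
                  GL_DEPTH_BUFFER_BIT,
                  GL_NEAREST);       // depth blits must use GL_NEAREST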
Keep in mind, however, that averaging depth samples can produce wild inaccuracies because depth values are not linearly distributed.

How do I draw 1000+ particles (w/ unique rotation, scale, and alpha) in iPhone OpenGL ES particle system without slowing down the game?

I am developing a game for iPhone using OpenGL ES 1.1. In this game, I have blood particles which emit from characters when they are shot, so there can be 1000+ blood particles on the screen at any one time. The problem is that when I have over 500 particles to render, the game's frame rate drops immensely.
Currently, each particle renders itself using glDrawArrays(..), and I know this is the cause of the slowdown. All particles share the same texture atlas.
So what is the best option to reduce the slowdown from drawing many particles? Here are the options I found:
Group all the blood particles together and render them using a single glDrawArrays(..) call. If I use this method, is there a way for each particle to have its own rotation and alpha? Or do all of them HAVE to share the same rotation when this method is used? If I can't render particles with unique rotation, then I cannot use this option.
Use point sprites in OpenGL ES 2.0. I am not using OpenGL ES 2.0 yet because I need to meet a deadline I have set to release my game on the App Store. Using OpenGL ES 2.0 would require preliminary research which, unfortunately, I do not have the time to perform. I will upgrade to OpenGL ES 2.0 in a later release, but for the first one I only want to use 1.1.
Here is each particle rendering itself. This is my original particle-rendering methodology, which caused the game's frame rate to drop significantly once 500+ particles were being rendered.
// original method: each particle renders itself.
// slow when many particles must be rendered
[[AtlasLibrary sharedAtlasLibrary] ensureContainingTextureAtlasIsBoundInOpenGLES:self.containingAtlasKey];
glPushMatrix();
// translate
glTranslatef(translation.x, translation.y, translation.z);
// rotate
glRotatef(rotation.x, 1, 0, 0);
glRotatef(rotation.y, 0, 1, 0);
glRotatef(rotation.z, 0, 0, 1);
// scale
glScalef(scale.x, scale.y, scale.z);
// alpha
glColor4f(1.0, 1.0, 1.0, alpha);
// load vertices
glVertexPointer(2, GL_FLOAT, 0, texturedQuad.vertices);
glEnableClientState(GL_VERTEX_ARRAY);
// load uv coordinates for texture
glTexCoordPointer(2, GL_FLOAT, 0, texturedQuad.textureCoords);
glEnableClientState(GL_TEXTURE_COORD_ARRAY);
// render
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
glDisableClientState(GL_VERTEX_ARRAY);
glDisableClientState(GL_TEXTURE_COORD_ARRAY);
glPopMatrix();
Then I used method 1, but particles can't have unique rotation, scale, or alpha using this method (that I know of).
// this is method 1: group all particles and call glDrawArrays(..) once
// declare vertex and uv-coordinate arrays
int numParticles = 2000;
CGFloat *vertices = (CGFloat *) malloc(2 * 6 * numParticles * sizeof(CGFloat));
CGFloat *uvCoordinates = (CGFloat *) malloc (2 * 6 * numParticles * sizeof(CGFloat));
...build vertex arrays based on particle vertices and uv-coordinates.
...this part works fine.
// get ready to render the particles
glPushMatrix();
glLoadIdentity();
// if the particles' texture atlas is not already bound in OpenGL ES, then bind it
[[AtlasLibrary sharedAtlasLibrary] ensureContainingTextureAtlasIsBoundInOpenGLES:((Particle *)[particles objectAtIndex:0]).containingAtlasKey];
glEnableClientState(GL_VERTEX_ARRAY);
glEnableClientState(GL_TEXTURE_COORD_ARRAY);
glDisableClientState(GL_NORMAL_ARRAY);
glVertexPointer(2, GL_FLOAT, 0, vertices);
glTexCoordPointer(2, GL_FLOAT, 0, uvCoordinates);
// render
glDrawArrays(GL_TRIANGLES, 0, vertexIndex);
glPopMatrix();
I'll reiterate my question:
How do I render 1000+ particles without frame rate drastically dropping and each particle can still have unique rotation, alpha, and scale?
Any constructive advice would really help and would be greatly appreciated!
Thanks!
Use about 1-10 textures, each made of, say, 200 red blood dots on a transparent background, and then draw each of them about 3-10 times. Then you have your thousands of dots. You draw all the images in a spherical pattern, etc., exploding in layers.
You can't always do a 1-to-1 correspondence with reality when gaming. Take a real close look at some games that run on the old Xbox or iPad, etc.; there are shortcuts you need to take, and they often look great when done.
There is significant overhead with each OpenGL ES API call, so it's no surprise that you're seeing a slowdown here with hundreds of passes through that drawing loop. It's not just glDrawArrays() that will get you here, but the individual glTranslatef(), glRotatef(), glScalef(), and glColor4f() calls as well. glDrawArrays() may appear to be the hotspot, due to the way that deferred rendering works on these GPUs, but those other calls will also hurt you.
You should group these particle vertices together in one array (preferably a VBO so that you can take advantage of streaming updated data to the GPU more efficiently). You definitely can replicate the effects of individual rotation, scale, etc. in your combined vertex array, but you're going to need to perform the calculations as to where the vertices should be as they are rotated, scaled, etc. This will place some burden on the CPU for every frame, but that could be offset a bit by using the Accelerate framework to do some vector processing of this.
Color and alpha can be provided per-vertex in an array as well, so you can control that for each one of your particles.
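To make that concrete, here is a sketch of baking rotation, scale, and per-particle alpha into the batched arrays each frame (all field and array names are mine, not from the question's code):

// CPU-side batching: two triangles (six vertices) per particle.
for (int i = 0; i < numParticles; i++) {
    Particle *p = &particleArray[i];
    float c = cosf(p->rotation), s = sinf(p->rotation);
    for (int v = 0; v < 6; v++) {
        // scale the unit-quad corner, rotate it, then translate to the particle
        float x = p->corners[v].x * p->scale;
        float y = p->corners[v].y * p->scale;
        vertices[(i * 6 + v) * 2 + 0] = p->x + x * c - y * s;
        vertices[(i * 6 + v) * 2 + 1] = p->y + x * s + y * c;
        // per-vertex RGBA carries each particle's own alpha
        colors[(i * 6 + v) * 4 + 0] = 1.0f;
        colors[(i * 6 + v) * 4 + 1] = 1.0f;
        colors[(i * 6 + v) * 4 + 2] = 1.0f;
        colors[(i * 6 + v) * 4 + 3] = p->alpha;
    }
}
// Alongside the vertex and texcoord pointers, one color array covers all particles:
glEnableClientState(GL_COLOR_ARRAY);
glColorPointer(4, GL_FLOAT, 0, colors);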
However, I think you're right in that OpenGL ES 2.0 could provide an even better solution for this by letting you write a custom shader program. You could send static vertices in a VBO for all your points, then only have to update matrices to manipulate each particle and the alpha values for each particle vertex. I do something similar to generate procedural impostors as stand-ins for spheres. I describe this process here, and you can download the source code to the application here.