Surface format is B8G8R8A8_UNORM, but vkCmdClearColorImage takes float?

I use vkGetPhysicalDeviceSurfaceFormatsKHR to get supported image formats for the swapchain, and (on Linux+Nvidia, using SDL) I get VK_FORMAT_B8G8R8A8_UNORM as the first option and I go ahead and create the swapchain with that format:
VkSwapchainCreateInfoKHR swapchain_info = {
    ...
    .imageFormat = format, /* taken from vkGetPhysicalDeviceSurfaceFormatsKHR */
    ...
};
So far, it all makes sense. The image format used to draw on the screen is the usual 8-bits-per-channel BGRA.
As part of my learning process, I have so far arrived at setting up a lot of stuff but not yet the graphics pipeline [1]. So I am trying the only command I can use that doesn't need a pipeline: vkCmdClearColorImage [2].
The VkClearColorValue used to define the clear color can take the color as float, uint32_t or int32_t, depending on the format of the image. I would have expected, based on the image format given to the swapchain, that I should give it uint32_t values, but that doesn't seem to be correct. I know because the screen color didn't change. I tried giving it floats and it works.
My question is, why does the clear color need to be specified in floats when the image format is VK_FORMAT_B8G8R8A8_UNORM?
[1] Actually I have, but I thought I would try out the simpler case of no pipeline first. I'm trying to use Vulkan incrementally (given its verbosity), particularly because I'm also writing tutorials on it as I learn.
[2] Actually, it technically doesn't need a render pass, but I figured: hey, I'm not using any pipeline stuff here, so let's try it without a pipeline. And it worked.
My rendering loop is essentially the following:
acquire image from swapchain
create a command buffer with the following:
    transition from VK_IMAGE_LAYOUT_UNDEFINED to VK_IMAGE_LAYOUT_GENERAL (because I'm clearing the image outside a render pass; sketched below)
    clear the image
    transition from VK_IMAGE_LAYOUT_GENERAL to VK_IMAGE_LAYOUT_PRESENT_SRC_KHR
submit command buffer to queue (taking care of synchronization with the swapchain via semaphores)
submit for presentation
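A minimal sketch of what that first transition could look like, assuming cmd_buffer and swapchain_image stand in for the handles created during setup (the clear counts as a transfer-stage write):

VkImageMemoryBarrier barrier = {
    .sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER,
    .srcAccessMask = 0,
    .dstAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT,
    .oldLayout = VK_IMAGE_LAYOUT_UNDEFINED,
    .newLayout = VK_IMAGE_LAYOUT_GENERAL,
    .srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
    .dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
    .image = swapchain_image,
    .subresourceRange = {
        .aspectMask = VK_IMAGE_ASPECT_COLOR_BIT,
        .levelCount = 1,
        .layerCount = 1,
    },
};
vkCmdPipelineBarrier(cmd_buffer,
    VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT, VK_PIPELINE_STAGE_TRANSFER_BIT,
    0, 0, NULL, 0, NULL, 1, &barrier);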

My question is, why does the clear color need to be specified in floats when the image format is VK_FORMAT_B8G8R8A8_UNORM?
Because the normalized, scaled, or sRGB image formats are really just various forms of floating-point compression. A normalized integer is a way of storing floating-point values on the range [0, 1] or [-1, 1], but using a much smaller amount of data than even a 16-bit float. A scaled integer is a way of storing floating-point values on the range [0, MAX] or [-MIN, MAX]. And sRGB is just a compressed way of storing linear color values on the range [0, 1], but in a gamma-corrected color space that puts precision in different places than the linear color values would suggest.
You see the same things with inputs to the vertex shader. A vec4 input type can be fed by normalized formats just as well as by floating-point formats.
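To make this concrete: for a VK_FORMAT_B8G8R8A8_UNORM image the clear color goes through the float32 member of the VkClearColorValue union, and the driver converts those [0, 1] floats into the 8-bit normalized channels. A minimal sketch (cmd_buffer and swapchain_image are placeholder names for the question's handles):

VkClearColorValue clear_color;
clear_color.float32[0] = 1.0f; /* R */
clear_color.float32[1] = 0.0f; /* G */
clear_color.float32[2] = 0.0f; /* B */
clear_color.float32[3] = 1.0f; /* A */

VkImageSubresourceRange range = {
    .aspectMask = VK_IMAGE_ASPECT_COLOR_BIT,
    .baseMipLevel = 0,
    .levelCount = 1,
    .baseArrayLayer = 0,
    .layerCount = 1,
};
vkCmdClearColorImage(cmd_buffer, swapchain_image,
    VK_IMAGE_LAYOUT_GENERAL, &clear_color, 1, &range);

The uint32 and int32 members are reserved for images whose format actually is *_UINT or *_SINT, respectively.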

Related

How to convert BGR TensorFlow Lite model to RGB?

I have a tflite model trained on BGR data. How can I make it work properly with RGB images?
UPDATE
I want to use it with the material-showcase app: https://github.com/googlesamples/mlkit/tree/master/android/material-showcase
@Farmaker @JaredJunyoungLim: thank you very much for your answers. I've updated the question. At first I was thinking about converting the model itself, so it wouldn't require any changes in the code. For example, the converter to the OpenVINO format has an option to reverse the input channels. I have also tried to set the BGR ColorSpace in the metadata, but have found out that it's most probably not possible.
I guess I'll go with your suggestion then. In the linked code, there is indeed the ByteBuffer (FrameProcessorBase.kt). I guess this is the place to change the order of the channels (after line 70):
val frame = processingFrame ?: return
However, how can I change the order of the channels if this is just a ByteBuffer? Do I need to figure out how the data is stored in it? For example, is it R,G,B,R,G,B,R,G,B,... for every pixel? Or maybe there is a more elegant way to do that?
I can see that the format is set to IMAGE_FORMAT_NV21, which is YCrCb.
UPDATE 2
From what I've tested (Log.d("ByteBuffer", frame.toString())), it seems that the ByteBuffer takes 1.5 bytes per pixel:
java.nio.HeapByteBuffer[pos=0 lim=3110401 cap=3110401]
(Resolution: 1920x1080; 3110400/1920/1080=1.5)
So it uses 12 bits per pixel, which would mean 4 bits per channel per pixel. That's a bit strange, because I would expect at least 8 bits per channel per pixel (0-255).
So I guess that maybe it's compressed.
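It isn't compressed; NV21 is a subsampled YUV 4:2:0 format. It stores a full-resolution Y (luma) plane of width*height bytes, followed by an interleaved V/U (chroma) plane of width*height/2 bytes, because every 2x2 block of pixels shares one V and one U sample, which averages out to exactly 1.5 bytes per pixel. As a hedged sketch (in C, since this is just the arithmetic; the layout is described in Android's ImageFormat.NV21 documentation), fetching one pixel and converting it to RGB with the common BT.601 integer approximation looks like this:

static int clamp255(int x) { return x < 0 ? 0 : (x > 255 ? 255 : x); }

void nv21_pixel_to_rgb(const unsigned char *nv21, int width, int height,
                       int x, int y,
                       unsigned char *r, unsigned char *g, unsigned char *b)
{
    int luma = nv21[y * width + x];
    /* chroma plane starts after the luma plane; one V,U pair per 2x2 block */
    int chroma = width * height + (y / 2) * width + (x / 2) * 2;
    int v = nv21[chroma] - 128;     /* Cr */
    int u = nv21[chroma + 1] - 128; /* Cb */

    int c = luma - 16; /* BT.601 video-range luma offset */
    *r = (unsigned char)clamp255((298 * c + 409 * v + 128) >> 8);
    *g = (unsigned char)clamp255((298 * c - 100 * u - 208 * v + 128) >> 8);
    *b = (unsigned char)clamp255((298 * c + 516 * u + 128) >> 8);
}

Swapping the r and b outputs after a conversion like this is all the channel reordering a BGR-trained model would need.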

Vertex buffer with vertices of different formats

I want to draw a model that's composed of multiple meshes, where each mesh has different vertex formats. Is it possible to put all the various vertices within the same vertex buffer, and to point to the correct offset at vkCmdBindVertexBuffers time?
Or must all vertices within a buffer have the same format, thus necessitating multiple vbufs for such a model?
Looking at the manual for vkCmdBindVertexBuffers, it's not clear whether the offset is in bytes or in vertex-strides.
https://www.khronos.org/registry/vulkan/specs/1.2-extensions/man/html/vkCmdBindVertexBuffers.html
Your question really breaks down into three questions:
Does the pOffsets parameter for vkCmdBindVertexBuffers accept bytes or vertex strides?
Can I put more than one vertex format into a vertex buffer?
Should I put more than one vertex format into a vertex buffer?
The short version is:
Bytes
Yes
Probably not
Does the pOffsets parameter for vkCmdBindVertexBuffers accept bytes or vertex strides?
The function signature is
void vkCmdBindVertexBuffers(
    VkCommandBuffer commandBuffer,
    uint32_t firstBinding,
    uint32_t bindingCount,
    const VkBuffer* pBuffers,
    const VkDeviceSize* pOffsets);
Note the VkDeviceSize type for pOffsets. This unambiguously means "bytes", not strides. A VkDeviceSize always means an offset or size in raw memory. Vertex strides aren't raw memory; they're simply a count, so their type would have to be uint32_t or uint64_t.
Furthermore there's nothing in that function signature that specifies the vertex format so there would be no way to convert the vertex stride count to actual memory sizes. Remember that unlike OpenGL, Vulkan is not a state machine, so this function doesn't have any "memory" of a rendering pipeline that you might have previously bound.
Can I put more than one vertex format into a vertex buffer?
As a consequence of the above answer, yes. You can put pretty much whatever you want into a vertex buffer, although I believe some hardware will have alignment restrictions on what are valid offsets for vertex buffers, so make sure you check that.
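To illustrate (buffer, cmd, and mesh_a_bytes are hypothetical placeholders), two meshes with different vertex layouts can live in one VkBuffer and be bound at plain byte offsets, as long as a pipeline with the matching vertex input state is bound before each draw:

/* mesh A (position only) occupies bytes [0, mesh_a_bytes);
   mesh B (position + uv) is packed right after it */
VkDeviceSize offset_a = 0;
VkDeviceSize offset_b = mesh_a_bytes; /* a byte offset, not a stride count */

vkCmdBindVertexBuffers(cmd, 0, 1, &buffer, &offset_a);
/* bind pipeline A, then vkCmdDraw for mesh A ... */

vkCmdBindVertexBuffers(cmd, 0, 1, &buffer, &offset_b);
/* bind pipeline B, then vkCmdDraw for mesh B ... */

If the hardware imposes vertex buffer offset alignment, round mesh_a_bytes up to that alignment when packing.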
Should I put more than one vertex format into a vertex buffer?
Generally speaking you want to render your scene in as few draw calls as possible, and having lots of arbitrary vertex formats runs counter to that. I would argue that, if possible, the only time you want to change vertex formats is when you're switching to a different rendering pass, such as when switching from rendering opaque items to rendering transparent ones.
Instead you should try to make format normalization part of your asset pipeline, taking your source assets and converting them to a single consistent format. If that's not possible, then you could consider doing the normalization at load time. This adds complexity to the loading code, but should drastically reduce the complexity of the rendering code, since you now only have to think in terms of a single vertex format.

Explain how premultiplied alpha works

Can somebody please explain why rendering with premultiplied alpha (and a corrected blending function) looks different from "normal" alpha when, mathematically speaking, the two are the same?
I've looked at this post to understand premultiplied alpha:
http://blogs.msdn.com/b/shawnhar/archive/2009/11/06/premultiplied-alpha.aspx
The author also said that the end computation is the same:
"Look at the blend equations for conventional vs. premultiplied alpha. If you substitute this color format conversion into the premultiplied blend function, you get the conventional blend function, so either way produces the same end result. The difference is that premultiplied alpha applies the (source.rgb * source.a) computation as a preprocess rather than inside the blending hardware."
Am I missing something? Why is the result different then?
The difference is in filtering.
Imagine that you have a texture with just two pixels and you are sampling it exactly in the middle between the two pixels. Also assume linear filtering.
Schematically:
R|G|B|A + R|G|B|A = R|G|B|A
non-premultiplied:
1|0|0|1 + 0|1|0|0 = 0.5|0.5|0|0.5
premultiplied:
1|0|0|1 + 0|0|0|0 = 0.5|0|0|0.5
Notice the difference in green channel.
Filtering premultiplied alpha produces correct results.
Note that all this has nothing to do with blending.
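Here is a tiny self-contained C sketch of the two-texel example, treating linear filtering at the midpoint as the plain average it is:

#include <stdio.h>

typedef struct { float r, g, b, a; } RGBA;

/* sampling exactly between two texels with linear filtering = average */
static RGBA midpoint(RGBA x, RGBA y)
{
    RGBA o = { (x.r + y.r) * 0.5f, (x.g + y.g) * 0.5f,
               (x.b + y.b) * 0.5f, (x.a + y.a) * 0.5f };
    return o;
}

int main(void)
{
    RGBA red   = { 1, 0, 0, 1 }; /* opaque red texel */
    RGBA green = { 0, 1, 0, 0 }; /* fully transparent green texel */

    /* straight alpha: the invisible texel's green leaks into the sample */
    RGBA straight = midpoint(red, green);

    /* premultiplied: green's RGB was already scaled by its alpha of 0 */
    RGBA green_pm = { 0, 0, 0, 0 };
    RGBA premul = midpoint(red, green_pm);

    printf("straight:      %.2f %.2f %.2f %.2f\n",
           straight.r, straight.g, straight.b, straight.a);
    printf("premultiplied: %.2f %.2f %.2f %.2f\n",
           premul.r, premul.g, premul.b, premul.a);
    return 0;
}

The straight-alpha sample comes out 0.5|0.5|0|0.5: half transparent, yet visibly green, even though the green texel was invisible.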
This is a guess, because there is not enough information yet to figure it out.
It should be the same. One common way of getting a different value is to use a different gamma correction method between the premultiply step and the rendering step.
I am going to guess that one of your stages, either the blending stage or the premultiplying stage, is being done with a different gamma value. If you generate your premultiplied textures with a tool like DirectXTex texconv and use the default srgb option for premultiplying alpha, then your sampler needs to be an _SRGB format and your render target should be _SRGB as well. If you are treating them linearly, then you may not be able to render to an _SRGB target or sample the texture with gamma correction, even if you are doing the premultiply in the same shader that samples (depending on 3D API and render target setup differences). Doing so will cause the alpha to be significantly different between the two methods in the midtones.
See: The Importance of Being Linear.
If you are generating the alpha in Photoshop then you should know a couple of things. Photoshop does not save alpha in linear OR sRGB format. It saves it with a gamma value about halfway between linear and sRGB. If you premultiply in Photoshop, it will compute the premultiply correctly but save the result with the wrong ramp. If you generate a normal alpha and then sample it as sRGB or LINEAR in your 3D API, it will be close but will not match the values Photoshop shows in either case.
For a more in-depth reply, the information we would need is:
What 3D API are you using?
How are your textures generated and sampled?
When and how are you premultiplying the alpha?
And preferably a code or shader example that shows the error.
I was researching why one would use premultiplied vs. non-premultiplied alpha and found this interesting info from Nvidia:
https://developer.nvidia.com/content/alpha-blending-pre-or-not-pre
It seems that their specific case has more precision when using premultiplied alpha rather than straight alpha.
I also read (I believe on here, but I cannot find it) that premultiplying the alpha (that is, multiplying the alpha into each RGB value) will save time. I still need to find out whether that's true, but there seems to be a reason why premultiplied alpha is preferred.

How to convert 32 bit PNG to RGB565?

How can I accomplish this? A programmatic solution (Objective-C) is great, but even a non-programmatic one is good.
I have Pixelmator, but that doesn't give you the option. I can't seem to do it with Preview either.
I have tried googling, but haven't been able to find a solution so far. The only tool I have been able to use to do this is TexturePacker, but that creates a sprite sheet.
You can use libpng to convert the PNG image to three-byte (8:8:8) RGB. Then you can downsample to the 5:6:5 16-bit color values of RGB565. If r, g, and b are the respective 8-bit colors (stored in an unsigned char type), then the 16-bit RGB565 value is:
((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)
You can improve a tad on this by rounding instead of chopping, being careful to not overflow the values. You can also force the green value to be equal to the blue and red values when they are all equal in the original 8-bit values. Otherwise it is possible to have colors that were originally gray inadvertently take on color after conversion.
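A sketch of that conversion with rounding, in C; the (x * max + 127) / 255 form rounds to nearest and cannot overflow an int:

unsigned short rgb888_to_rgb565(unsigned char r, unsigned char g, unsigned char b)
{
    /* rescale each channel to its target bit depth, rounding to nearest */
    unsigned r5 = (r * 31u + 127u) / 255u;
    unsigned g6 = (g * 63u + 127u) / 255u;
    unsigned b5 = (b * 31u + 127u) / 255u;
    return (unsigned short)((r5 << 11) | (g6 << 5) | b5);
}

The shift-only version above maps 255 to full intensity too, but biases everything else downward; rounding spreads the error evenly in both directions.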
Create a bitmap context with RGB565 color using Quartz, draw your PNG into that context, and save the bitmap context to a file.
PNG does not support an RGB565 packing. You can always apply a posterize to the image (programmatically, with ImageMagick, or with any image editor), which amounts to discarding the least significant bits in each channel. When saving to PNG, you will still be saving 8 bits per channel (unless you use a palette), but even then you will get an appreciable reduction in size because of the PNG compression.
A quick example (images omitted): the original, and the same image after a simple posterize with 32 levels (equivalent to RGB555) applied with XnView.
The size goes from 89KB to 47KB, with a small quality loss.
In the case of synthetic images with gradients, the quality loss could be much more noticeable (banding).
I received this answer from the creator of TexturePacker:
you can do it from command line - see
http://www.texturepacker.com/uncategorized/batch-converting-images-to-pvr-or-pvr-ccz/
Just adjust the opt and set output to .png instead of pvr.ccz
Make sure that you do not overwrite your source images.
According to Wikipedia, which is always right, the only 16-bit PNG is a greyscale PNG. http://en.wikipedia.org/wiki/Portable_Network_Graphics
If you just add your 32-bit (alpha) or 24-bit (no alpha) PNG to your project as normal, and then set the texture format in Cocos2D, all should be fine. The code for that is:
[CCTexture2D setDefaultAlphaPixelFormat:kCCTexture2DPixelFormat_RGB565];

Texture format for cellular automata in OpenGL ES 2.0

I need some quick advice.
I would like to simulate a cellular automaton (from "A Simple, Efficient Method for Realistic Animation of Clouds") on the GPU. However, I am limited to OpenGL ES 2.0 shaders (in WebGL), which do not support any bitwise operations.
Since every cell in this cellular automaton represents a boolean value, storing 1 bit per cell would be ideal. So what is the most efficient way of representing this data in OpenGL's texture formats? Are there any tricks, or should I just stick with a straightforward RGBA texture?
EDIT: Here are my thoughts so far...
At the moment I'm thinking of going with either plain GL_RGBA8, GL_RGBA4 or GL_RGB5_A1:
Possibly I could pick GL_RGBA8 and try to extract the original bits using floating-point ops. E.g. x*255.0 gives an approximate integer value. However, extracting the individual bits is a bit of a pain (i.e. dividing by 2 and rounding a couple of times). Also, I'm wary of precision problems.
If I pick GL_RGBA4, I could store 1.0 or 0.0 per component, and then try the same trick as with GL_RGBA8. In this case, it's only x*15.0. I'm not sure whether it would be faster, seeing as there should be fewer ops to extract the bits but less information per texture read.
Using GL_RGB5_A1 I could try to pack my cells together with some additional information, like a color per voxel, where the alpha channel stores the 1-bit cell state.
Create a second texture and use it as a lookup table. In each 256x256 block of the texture you can represent one boolean operation where the inputs are represented by the row/column and the output is the texture value. Actually in each RGBA texture you can represent four boolean operations per 256x256 region. Beware texture compression and MIP maps, though!
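A sketch of building such a table on the CPU, in C (the channel assignment is arbitrary; here R/G/B/A hold AND/OR/XOR/NOT of the two 8-bit inputs):

static unsigned char lut[256 * 256 * 4];

static void build_bitwise_lut(void)
{
    for (int a = 0; a < 256; ++a) {
        for (int b = 0; b < 256; ++b) {
            unsigned char *p = &lut[(a * 256 + b) * 4];
            p[0] = (unsigned char)(a & b); /* R = a AND b */
            p[1] = (unsigned char)(a | b); /* G = a OR b  */
            p[2] = (unsigned char)(a ^ b); /* B = a XOR b */
            p[3] = (unsigned char)(~a);    /* A = NOT a   */
        }
    }
}
/* upload as a 256x256 GL_RGBA / GL_UNSIGNED_BYTE texture, sample it with
   NEAREST filtering and no mipmaps, and use the two input bytes (scaled
   to [0,1]) as the texture coordinates */

Nearest filtering and disabled mipmaps matter here: any interpolation or compression would blend unrelated table entries.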