In a Vulkan program, fragment shaders generally output single-precision floating-point colors in the range 0.0 to 1.0 to each red/blue/green channel, and these are then written to (blended into) the swapchain image that is then presented to screen. The floating point values are encoded into bits according to the format of the swapchain image (specified when the swapchain is created).
When I change my swapchain format from VK_FORMAT_B8G8R8A8_UNORM to VK_FORMAT_B8G8R8A8_SRGB I observe that the overall brightness of the frames is greatly increased, and also there are some minor color shifts.
My understanding of the SRGB format was that it is much like the UNORM format, just with a different mapping of floating-point values to 8-bit integers, giving higher color resolution in some ranges and lower in others, while the actual meaning of the "pre-encoded" RGB floating-point values remained unchanged.
So I'm a little surprised about the brightness increase. Is my understanding of SRGB encoding wrong, and/or is such a brightness increase expected compared with UNORM?
Or maybe I have a bug and a brightness increase is not expected?
Update:
I've observed that if I use SRGB swapchain images and also load my images/textures in VK_FORMAT_B8G8R8A8_SRGB format rather than VK_FORMAT_B8G8R8A8_UNORM then the extra brightness goes away. It looks the same as if I use VK_FORMAT_B8G8R8A8_UNORM swapchain images and load my images/textures in VK_FORMAT_B8G8R8A8_UNORM format.
Also, if I put the swapchain image into VK_FORMAT_B8G8R8A8_UNORM format and then load the images/textures with VK_FORMAT_B8G8R8A8_SRGB, the frames look extra dark / almost black.
Some clarity about what is going on would be helpful.
This is a colorspace and display issue.
Fragment shaders are assumed to be writing values in a linear RGB colorspace. As such, if you are rendering to an image that has a linear RGB colorspace (UNORM), the values your FS produces are interpreted directly. When you render to an image which has an sRGB colorspace, you are writing values from one space (linear) into another space (sRGB). As such, these values are automatically converted into the sRGB colorspace. It's no different from transforming a position from model space to world space or whatever.
What is different is the fact that you've been looking at your scene incorrectly. See, odds are very good that your swapchain's VkSurfaceFormatKHR::colorSpace value is VK_COLOR_SPACE_SRGB_NONLINEAR_KHR.
VkSurfaceFormatKHR::colorSpace tells you how the display engine will interpret the pixel data in swapchain images you present. This is not a setting you provide; this is the display engine telling you something about how it is going to interpret the values you send it.
I say "odds are very good" that it is sRGB because, outside of extensions, this is the only possible value. You are rendering to an sRGB display device whether you like it or not.
So if you write to a UNORM image, the display device will read the actual bits of data and interpret them as if they are in the sRGB colorspace. This operation only makes sense if the data your fragment shader wrote itself is in the sRGB colorspace.
However, that's generally not how FS's generate data. The lighting computations you perform only make sense on color values in a linear RGB colorspace. So unless you wrote your FS to deliberately do sRGB conversion after computing the final color value, odds are good that all of your results have been in a linear RGB colorspace. And that's what you've been writing to your framebuffer.
And then the display engine mangles it.
By using an sRGB image as your destination, you force a colorspace conversion from linear RGB to sRGB, which will then be interpreted by the display engine as sRGB values. This means that your lighting equations are finally producing the correct results.
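To make the brightness change concrete, here is a minimal sketch of the standard sRGB encoding curve applied to a single channel value. The function names are made up for illustration, and the sketch only models the arithmetic, not any Vulkan API.

```python
def linear_to_srgb(c: float) -> float:
    """Encode a linear-light value in [0, 1] with the sRGB transfer function."""
    c = min(max(c, 0.0), 1.0)
    if c <= 0.0031308:
        return 12.92 * c
    return 1.055 * c ** (1.0 / 2.4) - 0.055


def to_unorm8(c: float) -> int:
    """Quantize a [0, 1] value to an 8-bit code, as a UNORM write would."""
    return int(round(min(max(c, 0.0), 1.0) * 255.0))


shader_output = 0.2  # hypothetical linear value produced by a fragment shader

# UNORM target: the value is stored as-is, then displayed as if it were sRGB.
unorm_bits = to_unorm8(shader_output)

# SRGB target: the value is encoded first, so the stored code is larger.
srgb_bits = to_unorm8(linear_to_srgb(shader_output))

print(unorm_bits, srgb_bits)
```

The second number is noticeably larger than the first, which is the brightness increase described in the question.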
Failure to do gamma-correct rendering properly leads to cases where linear light attenuation appears more correct than quadratic attenuation, even though quadratic is how reality works. Doing it properly includes the source texture images, which are almost certainly also in the sRGB colorspace, as this is the default colorspace for most image editors. The exceptions would be things like gloss maps, normal maps, or other images that aren't storing "colors".
This is gamma correction.
Using a swapchain with VK_FORMAT_B8G8R8A8_SRGB leverages the ability to apply gamma correction as the final step in your render pipeline. This happens for you automatically behind the scenes.
That is the only place you want gamma correction to happen. Make sure your shaders are not applying gamma correction. You might see it as:
color = pow(color, vec3(1.0/2.2));
If your swapchain does the gamma correction, you do not need to do it in your shaders.
Most images are SRGB (pictures, color textures, etc). Linear images are for specific data, like a blue noise texture or heightmap.
Load SRGB images w/ VK_FORMAT_R8G8B8A8_SRGB
Load LINEAR images w/ VK_FORMAT_R8G8B8A8_UNORM
No shader conversion is required if the rules outlined above are followed.
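The four texture/swapchain combinations from the question can be simulated with a rough model. It assumes an SRGB texture format decodes to linear on sampling, an SRGB swapchain format encodes on write, UNORM formats pass values through unchanged, and the shader copies the texel straight to the output. All names here are hypothetical.

```python
def srgb_to_linear(c: float) -> float:
    """Decode an sRGB-encoded value in [0, 1] to linear light."""
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def linear_to_srgb(c: float) -> float:
    """Encode a linear-light value in [0, 1] with the sRGB transfer function."""
    if c <= 0.0031308:
        return 12.92 * c
    return 1.055 * c ** (1.0 / 2.4) - 0.055


def pipeline(stored: float, texture_is_srgb: bool, swapchain_is_srgb: bool) -> float:
    """Return the value that ends up in the swapchain image for one texel."""
    sampled = srgb_to_linear(stored) if texture_is_srgb else stored
    return linear_to_srgb(sampled) if swapchain_is_srgb else sampled


stored = 0.5  # mid-grey as saved by a typical image editor (sRGB-encoded)
for tex in (False, True):
    for swap in (False, True):
        label = f"texture {'SRGB' if tex else 'UNORM'}, swapchain {'SRGB' if swap else 'UNORM'}"
        print(f"{label}: {pipeline(stored, tex, swap):.3f}")
```

In this model, the matched pairs return the stored value, UNORM texture with SRGB swapchain comes out brighter, and SRGB texture with UNORM swapchain comes out darker, in line with the observations in the update.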
Can anyone tell why the image in this pdf does not display as 100% Cyan?
clrtestc - NOPREBLEND32.PDF
Warning: I probably know just enough about pdf and colour to be dangerous!
I'm pretty sure each colour plane of the image is in a separate image. Here's a blended version if that helps.
I know the ColorSpace is DeviceCMYK
I'm pretty sure there is only 100% Cyan in the image, at least there was when it went into the PDF converter.
What went in:
CMYK: 100,0,0,0
RGB: 0,255,255
What I measure coming out:
CMYK: 100,27,0,6
RGB: 0,173,238
I'm foxed! Is there some filter affecting the rendering of the PDF?
There's also Magenta, Yellow and Black versions if they help.
Any help much appreciated.
The PDF file is extraordinarily complicated, it has numerous Forms, some of them nested, most of which are empty. However there only appears to be one image, which is defined in an Indexed CMYK space. So as far as I can see, this is indeed a 100% cyan image.
The extended graphics state does use the Multiply Blend mode, and there is no group and no page group specified, so the colour space used for the blending will depend on the colour model of the output device. If that's a monitor, then it's entirely possible that the resulting output will be RGB.
That's because your CMYK image needs to be converted to RGB in order to be blended using that colour space.
Incidentally, the image is in an Indexed colour space. In your image all the image samples have the same value, that value is then consulted in a lookup table, and that table returns the CMYK components. So no, there is not one image per colour plane, or at least, not in this file.
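As a rough illustration of how an Indexed lookup behaves, the sketch below uses a made-up one-entry table holding four bytes (C, M, Y, K). The table contents and function name are assumptions for the example, not values read from the file.

```python
def indexed_lookup(table: bytes, index: int, n_components: int = 4) -> tuple:
    """Return the base-space components stored for one palette index."""
    start = index * n_components
    return tuple(table[start:start + n_components])


# Hypothetical lookup table with a single entry: C=255 (100%), M=Y=K=0.
table = bytes([255, 0, 0, 0])

# Every image sample holds the same index, so every pixel gets the same colour.
samples = [0, 0, 0, 0]
colours = [indexed_lookup(table, s) for s in samples]
print(colours)
```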
To be honest, you're going to have to explain better how you are evaluating the content of the PDF file. As far as I can see the image is 100% cyan, and when rendered to a CMYK device, it will remain 100% cyan. If you render to an RGB device, it will be converted to RGB. A poor quality PDF consumer might decide to convert to RGB in the absence of a defined colour space for the blending operation.
Since the blending mode doesn't actually do anything (there's no defined alpha, SMask or any other transparency in the file) you could remove that and see if it sorts out your problem.
Edit
Your screen will be an RGB device, so no matter what the CMYK values in the PDF file are, there won't be any CMYK in the screenshot. The PDF rendering engine will have to convert the CMYK to RGB.
So the PDF rendering engine performs an opaque CMYK->RGB conversion. Then you take a picture of that RGB screen. You load that into an image editing application, and ask it what the RGB values are and presumably what it thinks are the CMYK equivalents.
If the CMYK->RGB calculation that the PDF viewer performs is not the inverse of the calculation that the RGB->CMYK image application performs, then you won't be getting the right values!
There's no way to predict what the RGB intermediate values 'should' be, because there is no 'right' answer here. Fundamentally this isn't a reliable technique for evaluating the colour.
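To show why the round trip is unreliable, here is a sketch using the simple textbook device conversions. These formulas are an assumption for illustration only; an actual viewer or image editor may use ICC profiles or some other transform.

```python
def cmyk_to_rgb_naive(c: float, m: float, y: float, k: float) -> tuple:
    """Naive CMYK (each 0..1) to 8-bit RGB, with no colour management."""
    return (
        round(255 * (1 - c) * (1 - k)),
        round(255 * (1 - m) * (1 - k)),
        round(255 * (1 - y) * (1 - k)),
    )


def rgb_to_cmyk_naive(r: int, g: int, b: int) -> tuple:
    """Naive 8-bit RGB to CMYK (each 0..1), with no colour management."""
    k = 1 - max(r, g, b) / 255
    if k == 1:
        return (0.0, 0.0, 0.0, 1.0)
    c = (1 - r / 255 - k) / (1 - k)
    m = (1 - g / 255 - k) / (1 - k)
    y = (1 - b / 255 - k) / (1 - k)
    return (c, m, y, k)


# 100% cyan through the naive formula gives the RGB value that "went in".
print(cmyk_to_rgb_naive(1.0, 0.0, 0.0, 0.0))

# The RGB value measured on screen, pushed back through the naive inverse.
print([round(100 * v) for v in rgb_to_cmyk_naive(0, 173, 238)])
```

The naive forward conversion of pure cyan yields 0,255,255, not the 0,173,238 that was measured, so the viewer evidently used a different CMYK->RGB mapping. Feeding 0,173,238 into the naive inverse gives a CMYK reading with a substantial magenta component even though the source had none.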
It's hard to make any kind of recommendation without knowing what you are trying to achieve (and possibly why), and what tools you are prepared to use. I believe Acrobat Pro would allow you to look at the colour values directly for example. Or you could use something like Ghostscript to create a CMYK TIFF file, then open that in an image application which supports CMYK (like Photoshop) and look at the values there.
But rendering to the screen, taking a screenshot and trying to figure out what the CMYK values might or might not have been is not really going to work.
Inches = Pixels / dpi
I noticed that PDF Clown uses measurements in float: how do I convert inches and pixels to float for width, height, etc. to properly work in PDF? Does anybody have a mathematical formula for this?
1) Inches --> float
2) Pixels --> float
PDF's coordinate system is based on the concept of device independence, that is the ability to preserve the geometric relationship between graphics objects and their page, no matter the device they are rendered through. Such device-independent coordinate system is called user space. By default, the length of its unit (approximately) corresponds to a typographic point (1/72 inch), defining the so-called default user space.
But, as mkl appropriately warned you, the relation between user space and device space can be altered through the current transformation matrix (CTM): this practically means that the user space unit's length matches the typographic point only until skewing or scaling are applied! Furthermore, since PDF 1.6 default user space unit may be overridden setting the UserUnit entry in the page dictionary.
So, the short answer is that, in PDF, one inch corresponds to 72 default user space units (assuming no CTM interference); on the other hand, as this coordinate system is (by definition) device-independent, it doesn't make sense to reason about pixels -- they exist in discrete spaces of samples only, whilst PDF defines a continuous space of vector graphics which is agnostic about device resolution!
If you need to map into PDF some graphics which are natively expressed in pixels, then prior conversion to inches may be a sensible approach.
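The conversion described above can be sketched as follows. The helper names are hypothetical and not part of PDF Clown; the sketch assumes no scaling from the CTM, and the user_unit parameter stands in for the page's UserUnit entry (1.0 when absent).

```python
POINTS_PER_INCH = 72.0


def inches_to_user_units(inches: float, user_unit: float = 1.0) -> float:
    """Convert inches to default user space units (1 unit = user_unit/72 inch)."""
    return inches * POINTS_PER_INCH / user_unit


def pixels_to_user_units(pixels: float, dpi: float, user_unit: float = 1.0) -> float:
    """Convert a pixel length to user space units via inches = pixels / dpi."""
    return inches_to_user_units(pixels / dpi, user_unit)


print(inches_to_user_units(8.5))        # width of a US Letter page
print(pixels_to_user_units(300, 150))   # 300 px at 150 dpi is 2 inches
```

The resulting floats are what you would pass wherever the library expects a width, height, or coordinate.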
BTW, floating-point data type was chosen to represent user space measurements just because it was obviously the most convenient approximation to map such a continuum -- I guess that after this explanation you won't confuse measures with measurements any more.
An extensive description about PDF's coordinate system can be found in § 4.2 of the current spec.
Can anybody explain how rendering differs from rasterization especially in the context of font rendering (why not font rasterization)?
Can rendering be called a special technique (like greyscale rendering and subpixel rendering) before the rasterizer rasterizes the image?
Rendering is a broad term that generally means transforming computer-readable information, for example objects in a 3d scene, to one or more images.
Rasterization is a more specific term that typically means the process of transforming a vector (curve based) image to a rasterized (pixel based) image.
Rendering involves performing the calculations for vectors and shape geometry for the elements to be drawn.
Rasterizing involves converting the rendered vectors and shapes into pixel bit maps for display.
What would be the most efficient way to remap the colors of an image to a gradient on iOS? This is described as "apply a color lookup table to the image" in the ImageMagick docs, and I think that is the general term for it. Is there something built into Core Image, for instance, to do this? I know it can be done with ImageMagick using convert -clut, but I'm not certain that is the most efficient way to do it.
The result of remapping the image to a gradient is as pictured here:
http://owolf.net/uploads/ny.jpg
The basic formula, copied from fraxel's comment is:
1. Open your image as grayscale, and RGB
2. Convert the RGB image to HSV (Hue, Saturation, Value/Brightness) color space. This is a cylindrical space, with hue represented by a single value on the polar axis.
3. Set the hue channel to the grayscale image we already opened, this is the crucial step.
4. Set value, and saturation channels both to maximal values.
5. Convert back to RGB space (otherwise display will be incorrect).
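The steps above can be sketched per pixel with only the standard library. This shows the math, not an efficient iOS implementation. The Rec. 601 luminance weights used for the grayscale step are an assumption, since the original comment does not say how the grayscale image is produced.

```python
import colorsys


def luminance(r: int, g: int, b: int) -> int:
    """Grayscale value (0..255) from 8-bit RGB, using Rec. 601 weights."""
    return min(255, round(0.299 * r + 0.587 * g + 0.114 * b))


def gray_to_hue_color(gray: int) -> tuple:
    """Use the gray level as hue, with saturation and value at maximum."""
    hue = gray / 255.0
    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
    return (round(r * 255), round(g * 255), round(b * 255))


def remap_pixel(rgb: tuple) -> tuple:
    """Steps 1-5 for one pixel: grayscale, use as hue, max S and V, back to RGB."""
    return gray_to_hue_color(luminance(*rgb))


# Hypothetical 2x2 image stored as rows of RGB tuples.
image = [[(0, 0, 0), (85, 85, 85)], [(170, 170, 170), (255, 255, 255)]]
remapped = [[remap_pixel(p) for p in row] for row in image]
print(remapped)
```

Because hue is cyclic, the darkest and brightest pixels both map to red in this sketch. Scaling the hue to a sub-range (for example 0 to 2/3) avoids the wrap-around if that matters for your gradient.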
I have a PDF document that is created by creating NSImages with size in 72dpi pts, each has a single representation which is measured in pixels. I then put these images into PDFPages with initWithImage, and then save the document.
When I open the document, I need the resolution of the original image. However, all of the rectangles that PDFPage gives me are measured in points, not pixels.
I know that the information is in there, and I suppose I can try to parse the PDF data myself, by going through the voyeur.app example... but that's a WHOLE lot of effort to do something that should be pretty normal...
Is there an easier way to do this?
Added:
I've tried two techniques:
1. Get the PDFRepresentation data from the page, and use it to make a new NSImage via initWithData. This works, however, the image has both size and pixel size in 72dpi.
2. Draw the PDFPage into a new off-screen context, and then get a CGImage from that. The problem is that when I'm making the context, it appears that I need to know the size in pixels already, which defeats part of the purpose...
There are a few things you need to understand about PDF:
The PDF Coordinate system is in points (1/72 inch) by default.
The PDF Coordinate system is devoid of resolution. (this is a white lie - the resolution is effectively the limits of 32 bit floating point numbers).
Images in PDF do not inherently have any resolution attached to them (this is a white lie - images compressed with JPEG2000 still have resolution in their embedded metadata).
An Image in PDF is represented by an object that contains a series of samples that are stored using some compression filter.
Image objects can be rendered on a page multiple times at any size.
Since resolution is defined as the number of pixels (or samples) per unit distance, resolution only means something for a particular rendering of an image on a page. So if you are rendering a particular image to fill the page, then the resolution in dpi is
xdpi = image_width / (pageWidthInPoints / 72.0);
ydpi = image_height / (pageHeightInPoints / 72.0);
If the image is not being rendered to the full size of the page, a complete solution is very tricky. Adobe prescribes that images should be treated as being 1x1 and that you change the page transformation matrix to determine how to render them. This means that you would need the matrix at the point of rendering the image and you would need to push the points (0, 0), (0, 1), (1, 0) through the matrix. The Euclidean distance between (0, 0)' and (1, 0)' will give you the width in points and the Euclidean distance between (0, 0)' and (0, 1)' will give you the height in points.
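Assuming you already have the CTM in effect when the image is painted, the arithmetic could look like the sketch below. The matrix is given as the six numbers a b c d e f used by PDF's cm operator, and the example matrix and image size are made up for illustration.

```python
import math


def apply_ctm(ctm: tuple, x: float, y: float) -> tuple:
    """Transform (x, y) by a PDF matrix [a b c d e f]."""
    a, b, c, d, e, f = ctm
    return (a * x + c * y + e, b * x + d * y + f)


def image_size_in_points(ctm: tuple) -> tuple:
    """Width and height of the unit-square image after the CTM, in points."""
    ox, oy = apply_ctm(ctm, 0, 0)
    wx, wy = apply_ctm(ctm, 1, 0)
    hx, hy = apply_ctm(ctm, 0, 1)
    return (math.hypot(wx - ox, wy - oy), math.hypot(hx - ox, hy - oy))


def image_dpi(ctm: tuple, width_px: int, height_px: int) -> tuple:
    """Effective resolution of this particular placement of the image."""
    width_pt, height_pt = image_size_in_points(ctm)
    return (width_px / (width_pt / 72.0), height_px / (height_pt / 72.0))


# Hypothetical placement: 144 x 72 points, offset to (100, 200).
ctm = (144, 0, 0, 72, 100, 200)
print(image_size_in_points(ctm))
print(image_dpi(ctm, 600, 300))
```

Using distances, not just the a and d entries, keeps the result correct when the matrix also rotates the image.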
So how do you get that matrix? Well, you need the content stream for the page and you need to write a PDF interpreter that can rip the content stream and keep track of changes to the CTM. When you reach your image, you extract the CTM for it.
That last step should take about an hour with a decent PDF toolkit, provided you are familiar with the toolkit. Writing that toolkit is several person-years of work.