I'm trying to port my OpenGL code to Metal. As part of that I need to draw PNG textures, and I'm using MTKTextureLoader to load them. Here is my pipeline code:
texturePipelineDescriptor.vertexFunction = textureVertexFunc;
texturePipelineDescriptor.fragmentFunction = textureFragmentFunc;
texturePipelineDescriptor.colorAttachments[0].pixelFormat = MTLPixelFormatRGBA8Unorm;
texturePipelineDescriptor.colorAttachments[0].blendingEnabled = YES;
texturePipelineDescriptor.colorAttachments[0].rgbBlendOperation = MTLBlendOperationAdd;
texturePipelineDescriptor.colorAttachments[0].sourceRGBBlendFactor = MTLBlendFactorSourceAlpha;
texturePipelineDescriptor.colorAttachments[0].destinationRGBBlendFactor = MTLBlendFactorOneMinusSourceAlpha;
And here is my loader code:
NSData* imageData = [NSData dataWithBytes:imageBuffer length:imageBufferSize];
id<MTLTexture> newMTLTexture = [m_metal_renderer.metalTextureLoader newTextureWithData:imageData options:nil error:&error];
And here is the result. The entire image should be orange, like the small parts.
If I change this line:
texturePipelineDescriptor.colorAttachments[0].pixelFormat = MTLPixelFormatRGBA8Unorm;
To this line:
texturePipelineDescriptor.colorAttachments[0].pixelFormat = MTLPixelFormatBGRA8Unorm;
I will get the following results:
As you can see, the R and B channels of the images are now swapped. When I load the textures, ALL OF THEM report
newMTLTexture.pixelFormat = MTLPixelFormatBGRA8Unorm;
regardless of whether they are drawn correctly or not.
I would suspect that something is wrong with the PNG files, but they look absolutely normal in any PNG viewer.
Any clues as to how I can resolve this? Is it a bug in MTKTextureLoader? Is there another way of loading a PNG picture?
P.S. The software is written for macOS, not iOS.
I found the problem. It seems that some of the PNG files were stored in RGB (24-bit) format, while others were stored in RGBA (32-bit) format. All colors became correct after converting the images to 32-bit format.
I would expect MTKTextureLoader to be able to handle that, but apparently it cannot. If someone knows of an option I missed, or another way to correctly load those PNGs into Metal (other than adding the 4th channel manually), that would be great!
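For reference, "adding the 4th channel manually" could look roughly like the sketch below. It assumes the PNG has already been decoded into tightly packed 8-bit RGB bytes with no row padding; the function name expandRgbToRgba is hypothetical and not part of Metal or MetalKit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Expand tightly packed 8-bit RGB pixels to RGBA by appending an opaque alpha byte.
// Assumption: `rgb` holds pixelCount * 3 bytes with no padding between rows.
std::vector<std::uint8_t> expandRgbToRgba(const std::vector<std::uint8_t>& rgb)
{
    assert(rgb.size() % 3 == 0);
    const std::size_t pixelCount = rgb.size() / 3;

    std::vector<std::uint8_t> rgba;
    rgba.reserve(pixelCount * 4);
    for (std::size_t i = 0; i < pixelCount; ++i) {
        rgba.push_back(rgb[i * 3 + 0]); // R
        rgba.push_back(rgb[i * 3 + 1]); // G
        rgba.push_back(rgb[i * 3 + 2]); // B
        rgba.push_back(0xFF);           // A, fully opaque
    }
    return rgba;
}
```

The resulting bytes could then be uploaded into an MTLPixelFormatRGBA8Unorm texture with replaceRegion:mipmapLevel:withBytes:bytesPerRow:, using width * 4 as the row stride.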
I have read (after running into the limitation myself) that for copying data from the host to a VK_IMAGE_TILING_OPTIMAL VkImage, you're better off using a VkBuffer rather than a VkImage for the staging image to avoid restrictions on mipmap and layer counts. (Here and Here)
So, when it came to implementing a glReadPixels-esque piece of functionality to read the results of a render-to-texture back to the host, I thought that reading to a staging VkBuffer with vkCmdCopyImageToBuffer instead of using a staging VkImage would be a good idea.
However, I haven't been able to get it to work yet. I'm seeing most of the intended image, but with rectangular blocks of the image in incorrect locations and even some parts duplicated.
There is a good chance that I've messed up my synchronization or layout transitions somewhere and I'll continue to investigate that possibility.
However, I couldn't figure out from the spec whether using vkCmdCopyImageToBuffer with an image source using VK_IMAGE_TILING_OPTIMAL is actually supposed to 'un-tile' the image, or whether I should actually expect to receive a garbled implementation-defined image layout if I attempt such a thing.
So my question is: Does vkCmdCopyImageToBuffer with a VK_IMAGE_TILING_OPTIMAL source image fill the buffer with linearly tiled data or optimally (implementation defined) tiled data?
Section 18.4 describes the layout of the data in the source/destination buffers, relative to the image being copied from/to. This is outlined in the description of the VkBufferImageCopy struct. There is no language in this section which would permit different behavior from tiled images.
The specification even has pseudo code for how copies work (this is for non-block compressed images):
rowLength = region->bufferRowLength;
if (rowLength == 0)
    rowLength = region->imageExtent.width;

imageHeight = region->bufferImageHeight;
if (imageHeight == 0)
    imageHeight = region->imageExtent.height;

texelSize = <texel size taken from the src/dstImage>;

address of (x,y,z) = region->bufferOffset + (((z * imageHeight) + y) * rowLength + x) * texelSize;
where x,y,z range from (0,0,0) to region->imageExtent.{width,height,depth}.
The x,y,z part is the location of the pixel in question from the image. Since this location is not dependent on the tiling of the image (as evidenced by the lack of anything stating that it would be), buffer/image copies will work equally on both kinds of tiling.
Also, do note that this specification is shared between vkCmdCopyImageToBuffer and vkCmdCopyBufferToImage. As such, if a copy works one way, it by necessity must work the other.
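As a worked illustration of that addressing rule, here is a minimal sketch in plain C++. CopyRegion is a hypothetical stand-in for the handful of VkBufferImageCopy fields the formula uses, not the real Vulkan struct, and texelByteOffset is an illustrative name.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the VkBufferImageCopy fields used by the formula.
struct CopyRegion {
    std::uint64_t bufferOffset;
    std::uint32_t bufferRowLength;   // 0 means tightly packed: use imageExtentWidth
    std::uint32_t bufferImageHeight; // 0 means tightly packed: use imageExtentHeight
    std::uint32_t imageExtentWidth;
    std::uint32_t imageExtentHeight;
};

// Byte offset in the buffer of texel (x, y, z), for a non-block-compressed format.
std::uint64_t texelByteOffset(const CopyRegion& r,
                              std::uint32_t x, std::uint32_t y, std::uint32_t z,
                              std::uint32_t texelSize)
{
    assert(x < r.imageExtentWidth && y < r.imageExtentHeight);
    const std::uint64_t rowLength =
        r.bufferRowLength ? r.bufferRowLength : r.imageExtentWidth;
    const std::uint64_t imageHeight =
        r.bufferImageHeight ? r.bufferImageHeight : r.imageExtentHeight;
    return r.bufferOffset + (((z * imageHeight) + y) * rowLength + x) * texelSize;
}
```

The result depends only on the region and the texel coordinates, so reading the buffer back on the host with this layout should yield a plain row-major image whatever the tiling of the source image.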
For example, there are QR scanners which scan a video stream in real time and extract the QR code information.
I would like to check whether a light source in the video is on or off. The light is quite powerful, so it should be easy to see.
I will probably take a video stream as input, maybe extract images from it, and analyze the images or the stream in real time for the presence of the light source (maybe by counting the pixels of a certain color in the image?).
How do I approach this problem? Is there some source code or a library for this?
It sounds like you are asking for information about several discrete steps. There are a multitude of ways to do each of them, and if you get stuck on any individual step it would be a good idea to post a question about it individually.
1: Get video Frame
Like chaitanya.varanasi said, the AVFoundation framework is the best way of getting access to a video frame on iOS. If you want something quicker to set up but less flexible, try OpenCV's video capture. The goal of this step is to get access to a pixel buffer from the camera. If you have trouble with this, ask about it specifically.
2: Put pixel buffer into OpenCV
This part is really easy. If you get the frame from OpenCV's video capture you are already done. If you get it from AVFoundation you will need to put it into OpenCV like this:
//Buffer is of type CVImageBufferRef, which is what AVFoundation should be giving you
//I assume it is BGRA or RGBA formatted; if it isn't, change CV_8UC4 to the appropriate format
CVPixelBufferLockBaseAddress( Buffer, 0 );
int bufferWidth = (int)CVPixelBufferGetWidth(Buffer);
int bufferHeight = (int)CVPixelBufferGetHeight(Buffer);
size_t bytesPerRow = CVPixelBufferGetBytesPerRow(Buffer); //rows may be padded, so pass the real stride
unsigned char *pixel = (unsigned char *)CVPixelBufferGetBaseAddress(Buffer);
cv::Mat image = cv::Mat(bufferHeight,bufferWidth,CV_8UC4,pixel,bytesPerRow); //put buffer in OpenCV, no memory copied
//Process image here (the Mat is only valid while the buffer stays locked)
//End processing
CVPixelBufferUnlockBaseAddress( Buffer, 0 );
Note: I am assuming you plan to do this in OpenCV since you used its tag. I also assume you can get the OpenCV framework to link to your project. If that is an issue, ask a specific question about it.
3: Process Image
This part is by far the most open-ended. All you have said about your problem is that you are trying to detect a strong light source. One very quick and easy way of doing that would be to compute the mean pixel value of a greyscale image. If you get the image in colour you can convert it with cvtColor. Then just call cv::mean on it (cvAvg in the old C API) to get the mean value. Hopefully you can tell whether the light is on by how that value fluctuates.
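The mean-brightness idea can also be sketched without OpenCV, directly on the locked pixel buffer. This is only an illustration under stated assumptions: the frame is 8-bit BGRA, the grey conversion uses the common BT.601 luma weights, and both function names and the default threshold of 128 are hypothetical values you would need to tune against real footage.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Mean grey level (0..255) of an 8-bit BGRA frame.
// bytesPerRow may be larger than width * 4 when rows are padded.
double meanBrightnessBGRA(const std::uint8_t* pixels,
                          std::size_t width, std::size_t height,
                          std::size_t bytesPerRow)
{
    assert(pixels != nullptr && width > 0 && height > 0 && bytesPerRow >= width * 4);
    double sum = 0.0;
    for (std::size_t y = 0; y < height; ++y) {
        const std::uint8_t* row = pixels + y * bytesPerRow;
        for (std::size_t x = 0; x < width; ++x) {
            const std::uint8_t* p = row + x * 4; // p[0]=B, p[1]=G, p[2]=R, p[3]=A
            sum += 0.114 * p[0] + 0.587 * p[1] + 0.299 * p[2];
        }
    }
    return sum / static_cast<double>(width * height);
}

// Assumed decision rule: the light counts as "on" above a tunable threshold.
bool lightIsOn(double meanBrightness, double threshold = 128.0)
{
    return meanBrightness > threshold;
}
```

In practice it may work better to compare each frame's mean against a running average of recent frames rather than a fixed threshold, since camera auto-exposure shifts the baseline.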
chaitanya.varanasi suggested another option, you should check it out too.
OpenCV is a very large library that can do a wide variety of things. Without knowing more about your problem I don't know what else to tell you.
Look at the AVFoundation Framework from Apple.
Hope it helps!
You can try this method: start by routing all frames to an AVCaptureVideoDataOutput. In the delegate method captureOutput:didOutputSampleBuffer:fromConnection: you can sample or calculate every pixel. Source: answer
Also, you can take a look at this SO question where they check if a pixel is black. If it's such a powerful light source, you can take the inverse of the pixel and then test it against a set threshold for black.
The above sample code only provides access to the pixel values stored in the buffer; you cannot run any other commands but those that change those values on a pixel-by-pixel basis:
for ( uint32_t y = 0; y < height; y++ )
{
    for ( uint32_t x = 0; x < width; x++ )
    {
        bgraImage.at<cv::Vec<uint8_t,4> >(y,x)[1] = 0;
    }
}
This, to use your example, will not work with the code you provided:
cv::Mat bgraImage = cv::Mat( (int)height, (int)extendedWidth, CV_8UC4, base );
cv::Mat grey = bgraImage.clone();
cv::cvtColor(grey, grey, 44);
I am trying to record a WAV file and then convert it to FLAC on iOS. However, the libflac library always gives me the following error:
invalid/unsupported WAVE file, only 16bps stereo WAVE in canonical form allowed
How can I record the file with this kind of properties? These are the properties that I am currently using:
AVFormatIDKey = kAudioFormatLinearPCM
AVSampleRateKey = 16000
AVNumberOfChannelsKey = 2
AVLinearPCMBitDepthKey = 16
AVLinearPCMIsBigEndianKey = NO
AVLinearPCMIsFloatKey = NO
How should I change these properties in order to use libflac?
It turned out that the settings were correct after all. The problem was with the WAV file format and libflac: Apple writes a file that differs slightly from the well-known WAVE format, and that caused the problems in my case.
Apple's wave format has a slightly different header. Check out Jason Hurt's code for converting Apple's waves to FLAC: https://github.com/jhurt/wav_to_flac.
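For orientation, the sketch below builds the classic 44-byte PCM WAVE header, on the assumption that this is what the error message means by "canonical form": a fmt chunk followed immediately by the data chunk, with nothing in between. The function names are hypothetical, and this is not taken from the linked project.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Append `bytes` bytes of `value` in little-endian order.
static void putLE(std::vector<std::uint8_t>& out, std::uint32_t value, int bytes)
{
    for (int i = 0; i < bytes; ++i)
        out.push_back(static_cast<std::uint8_t>((value >> (8 * i)) & 0xFF));
}

// Append a four-character chunk tag such as "RIFF".
static void putTag(std::vector<std::uint8_t>& out, const char* tag)
{
    for (int i = 0; i < 4; ++i)
        out.push_back(static_cast<std::uint8_t>(tag[i]));
}

// Build a 44-byte header for uncompressed PCM with `dataBytes` bytes of samples.
std::vector<std::uint8_t> canonicalWavHeader(std::uint32_t sampleRate,
                                             std::uint16_t channels,
                                             std::uint16_t bitsPerSample,
                                             std::uint32_t dataBytes)
{
    assert(channels > 0 && bitsPerSample % 8 == 0);
    const std::uint16_t blockAlign =
        static_cast<std::uint16_t>(channels * bitsPerSample / 8);
    const std::uint32_t byteRate = sampleRate * blockAlign;

    std::vector<std::uint8_t> h;
    h.reserve(44);
    putTag(h, "RIFF"); putLE(h, 36 + dataBytes, 4); // total size minus 8
    putTag(h, "WAVE");
    putTag(h, "fmt "); putLE(h, 16, 4);             // PCM fmt chunk is 16 bytes
    putLE(h, 1, 2);                                 // audio format 1 = PCM
    putLE(h, channels, 2);
    putLE(h, sampleRate, 4);
    putLE(h, byteRate, 4);
    putLE(h, blockAlign, 2);
    putLE(h, bitsPerSample, 2);
    putTag(h, "data"); putLE(h, dataBytes, 4);
    return h;
}
```

With the settings from the question this would be canonicalWavHeader(16000, 2, 16, dataBytes). Comparing the first 44 bytes of the recorded file against it is one way to see where the recorded header departs from this layout.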
Just a (maybe stupid) question: why does saving an NSImage as kUTTypeTIFF or kUTTypePNG (with kCGImageDestinationLossyCompressionQuality set to 1.0) produce a lossless file, while kUTTypeAppleICNS produces weird icons?
The most noticeable result I got came from loading the standard (I mean "not customized") macOS trash icon (/System/Library/CoreServices/CoreTypes.bundle/Contents/Resources/TrashIcon.icns) as an NSImage and trying to write it back to a file.
(I use a standard procedure: CGImageDestinationCreateWithURL + CGImageDestinationAddImage + CGImageDestinationFinalize)
Thank you
The requirement is as follows.
I will get a single large PNG image for a button. This single image contains the images to display for the hover, button clicked and mouse exit states.
The single PNG file is 1024 x 28, so each sub-image is 256 x 28.
I have been googling for the best approach but couldn't work out how to achieve this.
I have the following approach in mind:
NSImage *pBtnImage[MAX_BUTTON_IMAGES];
for (int i = 0; i < 4; i++) {
    pBtnImage[i] = [[NSImage alloc]initWithData:??????];
}
I want to know what I should pass as the NSData parameter.
Is it possible to load a single image and clip it as and when needed?
Thanks in advance
There's no simple Cocoa-supported way to read only a sub-rectangle of the image from its data. It's a simple matter, however, to read the whole image in and only use a select rectangle of the image when compositing. Thing is, with all the available API, you might be better off just using the standard +[NSImage imageNamed:] method to read the images in individually and letting the OS handle caching.
What actual, measured performance problem are you trying to solve? Does one really exist, or is this a case of premature optimization?
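If you do go the single-image route, the "select rectangle" for each button state is simple arithmetic. Below is a minimal sketch, assuming the states are laid out left to right in equal-width frames as in the 1024 x 28 strip from the question; Rect and splitStrip are hypothetical names, not Cocoa types.

```cpp
#include <cassert>
#include <vector>

// Plain rectangle in image pixel coordinates (hypothetical helper type).
struct Rect {
    int x;
    int y;
    int width;
    int height;
};

// Split a horizontal strip into `frameCount` equal-width source rectangles.
std::vector<Rect> splitStrip(int sheetWidth, int sheetHeight, int frameCount)
{
    assert(frameCount > 0 && sheetWidth % frameCount == 0);
    const int frameWidth = sheetWidth / frameCount;

    std::vector<Rect> frames;
    frames.reserve(frameCount);
    for (int i = 0; i < frameCount; ++i)
        frames.push_back(Rect{i * frameWidth, 0, frameWidth, sheetHeight});
    return frames;
}
```

Each rectangle would then be converted to an NSRect and passed as the fromRect: argument of -[NSImage drawInRect:fromRect:operation:fraction:] when drawing the corresponding state.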