How do I analyze video stream on iOS? - objective-c

For example, there are QR scanners which scan video stream in real time and get QR codes info.
I would like to check whether a light source in the video is on or off. The light is quite powerful, so detecting it should not be a problem.
I will probably take a video stream as input, perhaps extract images from it, and analyze the images or the stream in real time for the presence of the light source (maybe by counting pixels of a certain color in the image?).
How do I approach this problem? Is there perhaps a library for this?

It sounds like you are asking for information about several discrete steps. There are a multitude of ways to do each of them, and if you get stuck on any individual step it would be a good idea to post a question about it individually.
1: Get a video frame
Like chaitanya.varanasi said, the AVFoundation framework is the best way of getting access to a video frame on iOS. If you want something less flexible and quicker, try looking at OpenCV's video capture. The goal of this step is to get access to a pixel buffer from the camera. If you have trouble with this, ask about it specifically.
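For reference, a minimal sketch of OpenCV's capture interface looks like this (this is the desktop-style cv::VideoCapture API; on iOS you would more likely take the AVFoundation route described in step 2):
#include <opencv2/opencv.hpp>

int main() {
    cv::VideoCapture cap(0);      // open the default camera
    if (!cap.isOpened()) return 1;

    cv::Mat frame;
    while (cap.read(frame)) {     // grab frames until the stream ends
        // ... analyze `frame` here (see step 3) ...
    }
    return 0;
}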
2: Put pixel buffer into OpenCV
This part is really easy. If you get it from OpenCV's video capture, you are already done. If you get it from AVFoundation, you will need to put it into OpenCV like this:
//Buffer is of type CVImageBufferRef, which is what AVFoundation should be giving you
//I assume it is BGRA or RGBA formatted, if it isn't, change CV_8UC4 to the appropriate format
CVPixelBufferLockBaseAddress( Buffer, 0 );
int bufferWidth = CVPixelBufferGetWidth(Buffer);
int bufferHeight = CVPixelBufferGetHeight(Buffer);
unsigned char *pixel = (unsigned char *)CVPixelBufferGetBaseAddress(Buffer);
cv::Mat image = cv::Mat(bufferHeight,bufferWidth,CV_8UC4,pixel); //put buffer in open cv, no memory copied
//Process image Here
//End processing
CVPixelBufferUnlockBaseAddress( Buffer, 0 );
Note: I am assuming you plan to do this in OpenCV since you used its tag. Also, I assume you can get the OpenCV framework to link to your project. If that is an issue, ask a specific question about it.
3: Process Image
This part is by far the most open-ended. All you have said about your problem is that you are trying to detect a strong light source. One very quick and easy way of doing that would be to compute the mean pixel value of a greyscale image. If you get the image in colour, you can convert with cvtColor. Then just call cv::mean on it to get the mean value. Hopefully you can tell whether the light is on by how that value fluctuates.
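For illustration, here is a minimal sketch of that idea, assuming image is the BGRA cv::Mat from step 2; the threshold and pixel-count values are placeholders you would have to tune for your particular light source:
// Mean brightness of the whole frame
cv::Mat grey;
cv::cvtColor(image, grey, cv::COLOR_BGRA2GRAY);   // drop colour, keep intensity
double meanBrightness = cv::mean(grey)[0];        // average intensity over the frame

// Alternatively, count pixels brighter than a threshold; a strong light source
// should push this count up sharply when it is on.
cv::Mat bright;
cv::threshold(grey, bright, 200, 255, cv::THRESH_BINARY);
int litPixels = cv::countNonZero(bright);
bool lightIsOn = litPixels > 500;                 // placeholder count, tune empirically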
chaitanya.varanasi suggested another option; you should check it out too.
OpenCV is a very large library that can do a wide variety of things. Without knowing more about your problem, I don't know what else to tell you.

Look at the AVFoundation Framework from Apple.
Hope it helps!
You can try this method: start by sending all frames to an AVCaptureVideoDataOutput. In the delegate method captureOutput:didOutputSampleBuffer:fromConnection: you can sample or calculate every pixel.
Also, you can take a look at this SO question where they check whether a pixel is black. Since it is such a powerful light source, you can take the inverse of the pixel and then decide using a set threshold for black.

The sample code above only provides access to the pixel values stored in the buffer; the only operations you can run on it are those that change the values on a pixel-by-pixel basis, for example:
for ( uint32_t y = 0; y < height; y++ )
{
    for ( uint32_t x = 0; x < width; x++ )
    {
        bgraImage.at<cv::Vec<uint8_t,4> >(y,x)[1] = 0;
    }
}
To use your example, this will not work with the code you provided:
cv::Mat bgraImage = cv::Mat( (int)height, (int)extendedWidth, CV_8UC4, base );
cv::Mat grey = bgraImage.clone();
cv::cvtColor(grey, grey, 44);
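If the reason is that whole-image operations such as cvtColor fail on the wrapped buffer, one likely cause (an assumption here, since the full failing code is not shown) is row padding: the buffer's stride (bytes per row) can be larger than width * 4. A sketch of wrapping the buffer with its real stride so such calls work:
CVPixelBufferLockBaseAddress(Buffer, kCVPixelBufferLock_ReadOnly);
size_t width       = CVPixelBufferGetWidth(Buffer);
size_t height      = CVPixelBufferGetHeight(Buffer);
size_t bytesPerRow = CVPixelBufferGetBytesPerRow(Buffer);               // may be > width * 4
unsigned char *base = (unsigned char *)CVPixelBufferGetBaseAddress(Buffer);

cv::Mat bgraImage((int)height, (int)width, CV_8UC4, base, bytesPerRow); // stride-aware wrapper, no copy
cv::Mat grey;
cv::cvtColor(bgraImage, grey, cv::COLOR_BGRA2GRAY);                     // copies into a tightly packed Mat

CVPixelBufferUnlockBaseAddress(Buffer, kCVPixelBufferLock_ReadOnly);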

Related

Reliable access and modify captured camera frames under SceneKit

I try to add a B&W filter to the camera images of an ARSCNView, then render colored AR objects over it.
I'm almost there with the following code added to the beginning of - (void)renderer:(id<SCNSceneRenderer>)aRenderer updateAtTime:(NSTimeInterval)time
CVPixelBufferRef bg = self.sceneView.session.currentFrame.capturedImage;
if (bg) {
    // plane 1 of the YCbCr capture buffer holds the interleaved CbCr (chroma) samples
    char *k1 = CVPixelBufferGetBaseAddressOfPlane(bg, 1);
    if (k1) {
        size_t x1 = CVPixelBufferGetWidthOfPlane(bg, 1);
        size_t y1 = CVPixelBufferGetHeightOfPlane(bg, 1);
        memset(k1, 128, x1 * y1 * 2); // neutral chroma -> black-and-white image
    }
}
This works really fast on mobile, but here's the thing: sometimes a colored frame is displayed.
I've checked and my filtering code is executed, but I assume it runs too late: SceneKit's pipeline has already processed the camera input.
Calling the code earlier would help, but updateAtTime is the earliest point at which one can add custom frame-by-frame code.
Getting notifications on frame captures might help, but it looks like the whole AVCaptureSession is inaccessible.
The Metal ARKit example shows how to convert the camera image to RGB and that is the place where I would do filtering, but that shader is hidden when using SceneKit.
I've tried this possible answer but it's way too slow.
So how can I overcome the frame misses and convert the camera feed reliably to BW?
Here's the key for this problem:
session:didUpdateFrame:
Provides a newly captured camera image and accompanying AR information to the delegate.
So I just moved the CVPixelBufferRef manipulation (the image-filtering code) from
- (void)renderer:(id<SCNSceneRenderer>)aRenderer updateAtTime:(NSTimeInterval)time
to
- (void)session:(ARSession *)session didUpdateFrame:(ARFrame *)frame
I made sure to set self.sceneView.session.delegate = self so that this delegate method gets called.

Arduino + OV7670 - Without FIFO - Reading Snapshot

I know that there is a lot on the internet (http://forum.arduino.cc/index.php?topic=159557.0 for example) about the OV7670, and I have read a lot about it, but it seems something is missing.
First of all, I took a look at how we can read pixel by pixel from the camera to build the rectangular 640 x 480 image, and this was quite easy to understand considering HREF, VSYNC and PCLK as described in the documentation here: http://www.voti.nl/docs/OV7670.pdf. I understand XCLK as an input I need to give to the OV7670 as a kind of cycle controller, and RESET would be something to reset it.
So at this point I thought that the functionality of such a camera would be covered by wiring the following pins:
D0..D7 - data (pixel) pins, connected to Arduino digital pins 0 to 7 as INPUT on the Arduino board
XCLK - camera clock, connected to Arduino digital pin 8 as OUTPUT from the Arduino board
PCLK - pixel clock, connected to Arduino digital pin 9 as INPUT on the Arduino board
HREF - defines when a line starts / ends, connected to Arduino digital pin 10 as INPUT on the Arduino board
VSYNC - defines when a frame starts / ends, connected to Arduino digital pin 11 as INPUT on the Arduino board
GND - ground, connected to Arduino GND
3V3 - 3.3 V supply, connected to the Arduino 3.3 V pin
RESET - connected to Arduino RESET
PWDN - connected to Arduino GND
The implementation for such an approach, from my point of view, would be something like:
Code:
for each loop function do
    write high to XCLK
    if VSYNCH is HIGH
        return;
    if HREF is LOW
        return;
    if lastPCLOCK was HIGH and currentPCLOCK is LOW
        readPixelFromDataPins();
end for
My readPixelFromDataPins() basically reads just the first byte (as I'm just testing whether I can even read something from the camera), and it is written as follows:
Code:
byte readPixelFromDataPins() {
    byte result = 0;
    for (int i = 0; i < 8; i++) {
        result = result << 1 | digitalRead(data_p[i]);
    }
    return result;
}
In order to check whether something is being read from the camera, I just print the byte read from the data pins as a number to the serial port at 9600 baud. But currently I'm receiving only zero values. The code I'm using to retrieve an image is stored here: https://gist.github.com/franciscospaeth/8503747.
Has somebody who got the OV7670 working with an Arduino already figured out what I am doing wrong? I suppose I'm using XCLK wrongly, right? What shall I do to get it working?
I searched a lot and didn't find any SSCCE (http://sscce.org/) for this camera using an Arduino; if somebody has one, please let me know.
This question is present on arduino forum (http://forum.arduino.cc/index.php?topic=211741.0) too.
Your idea is not bad, but ...
the XCLK needs to be a clock (in your program it is just a transition from 0 to 1, and it freezes there)
you also need to use I2C (SIOC and SIOD) to configure the camera (or you can use the default settings, but I am not sure whether that is the correct output format for you: 30 fps, VGA, YUV format ...)
your code execution is slowed down by doing serial output in the same loop that reads the data
I recommend toggling the XCLK pin and moving the pixel print into an if(). Also, you will be able to read data only at a very precise time: if you want to read only one byte, then after a 0-to-1 transition of HREF you need to wait for a new 0-to-1 transition of PCLK (you will see only one 0-to-1 transition of HREF per 784 x 2 transitions of PCLK: (640 active pixels + 144 of dead time per line) x 2, since for YUV or RGB two bytes are received per pixel).
Hello, I am Mr_Arduino from the Arduino forums. Your issue is that you are reading pixels too slowly; please do not use digitalRead to do such a thing. Also, if you insist on using a separate function just to read a byte, make sure the function is being inlined. You can do this by declaring your function as static inline. Also, as mentioned above, how are you generating the clock? You can generate the XCLK using PWM on the Arduino.
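As a minimal illustration of that last suggestion (a sketch only, assuming an ATmega328-based board such as the Uno, where pin 9 is OC1A), XCLK can be generated in hardware with Timer1 instead of being toggled in the loop:
void setupXclk() {
    pinMode(9, OUTPUT);                // OC1A on an Arduino Uno
    TCCR1A = _BV(COM1A0);              // toggle OC1A on every compare match
    TCCR1B = _BV(WGM12) | _BV(CS10);   // CTC mode, no prescaler
    OCR1A  = 0;                        // toggle every tick -> F_CPU / 2 = 8 MHz square wave
}
Whether 8 MHz is acceptable for XCLK should be checked against the OV7670 datasheet; the point is only that the clock comes from a timer rather than from digitalWrite.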
I have created a working example here:
https://github.com/ComputerNerd/arduino-camera-tft/blob/master/captureimage.c
Edit: a 3rd party has copied part but not all of the code from the above link into the answer here. However, the link must remain as the code posted below requires additional files from that source to actually work.
Edit 2: Removed irrelevant code. You will need to modify what you do with the data.
void capImg(void){
    cli();
    uint8_t w,ww;
    uint8_t h;
    w=160;
    h=240;
    tft_setXY(0,0);
    CS_LOW;
    RS_HIGH;
    RD_HIGH;
    DDRA=0xFF;
    //DDRC=0;
#ifdef MT9D111
    while (PINE&32){}//wait for low
    while (!(PINE&32)){}//wait for high
#else
    while (!(PINE&32)){}//wait for high
    while (PINE&32){}//wait for low
#endif
    while (h--){
        ww=w;
        while (ww--){
            WR_LOW;
            while (PINE&16){}//wait for low
            PORTA=PINC;
            WR_HIGH;
            while (!(PINE&16)){}//wait for high
            WR_LOW;
            while (PINE&16){}//wait for low
            PORTA=PINC;
            WR_HIGH;
            while (!(PINE&16)){}//wait for high
            WR_LOW;
            while (PINE&16){}//wait for low
            PORTA=PINC;
            WR_HIGH;
            while (!(PINE&16)){}//wait for high
            WR_LOW;
            while (PINE&16){}//wait for low
            PORTA=PINC;
            WR_HIGH;
            while (!(PINE&16)){}//wait for high
        }
    }
    CS_HIGH;
    sei();
}
You can also find it on github.
You can use my instructions: how to retrieve image from ov7670. They contain all the steps you need. There are also instructions for setting up FrameGrabber: how to run framegrabber

DirectX 11.1 Disable the depth buffer

This question relates to a previous question I have asked.
I have a series of 48 textures on flat square meshes that I am rendering, and they all combine to form one "scene." They each have a large percentage of transparency with one or two smaller images, and when they are lined up, I should be able to see the full scene. I expected this would work without much issue, but when I went to test it, I see the top-most texture, and then anywhere it would have transparency, it is just the clear color.
At first, I thought it was an issue with how I was loading the image and somehow was disabling the alpha, but after playing around with the clear color, I realized that there was some transparency.
The second thing I tried was to enable blending; this works if all the textures are combined on a single z plane.
I have posted my image loading and blending code on the question I linked to above.
Now I am starting to think it may be an issue with the depth buffer, so I added the following code to my window dependent resources:
Microsoft::WRL::ComPtr<ID3D11DepthStencilState> DepthDefault;
D3D11_DEPTH_STENCIL_DESC depthstencilDesc;
ZeroMemory(&depthstencilDesc, sizeof(depthstencilDesc));
depthstencilDesc.DepthEnable = FALSE;
depthstencilDesc.DepthWriteMask = D3D11_DEPTH_WRITE_MASK_ALL;
depthstencilDesc.DepthFunc = D3D11_COMPARISON_ALWAYS;
depthstencilDesc.StencilEnable = FALSE;
depthstencilDesc.BackFace.StencilDepthFailOp = D3D11_STENCIL_OP_KEEP;
depthstencilDesc.BackFace.StencilFailOp = D3D11_STENCIL_OP_KEEP;
depthstencilDesc.BackFace.StencilFunc = D3D11_COMPARISON_ALWAYS;
depthstencilDesc.BackFace.StencilPassOp = D3D11_STENCIL_OP_KEEP;
depthstencilDesc.FrontFace.StencilDepthFailOp = D3D11_STENCIL_OP_KEEP;
depthstencilDesc.FrontFace.StencilFailOp = D3D11_STENCIL_OP_KEEP;
depthstencilDesc.FrontFace.StencilFunc = D3D11_COMPARISON_ALWAYS;
depthstencilDesc.FrontFace.StencilPassOp = D3D11_STENCIL_OP_KEEP;
DX::ThrowIfFailed( direct3d.device->CreateDepthStencilState(&depthstencilDesc, DepthDefault.GetAddressOf() ) );
direct3d.context->OMSetDepthStencilState(DepthDefault.Get(), 0);
Even with this code, I am only seeing the topmost layer. Am I missing something, or am I setting something incorrectly?
Edit: To visualize the problem, it's as if I had 48 panes of glass that are all the same size and they are all in a row. Each piece of glass has one image somewhere on it. When you look through all the glass panes, you get one composite image of all the smaller images combined. For me, DirectX or the pixel shader is only drawing the first glass pane and filling all the transparency of the first pane with the clear/background color.
Edit: The code I'm using to create the depth stencil view:
CD3D11_TEXTURE2D_DESC depthStencilDesc( DXGI_FORMAT_D24_UNORM_S8_UINT, backBufferDesc.Width, backBufferDesc.Height, 1, 1, D3D11_BIND_DEPTH_STENCIL );
ComPtr<ID3D11Texture2D> depthStencil;
DX::ThrowIfFailed( direct3d.device->CreateTexture2D( &depthStencilDesc, nullptr, &depthStencil ) );
auto viewDesc = CD3D11_DEPTH_STENCIL_VIEW_DESC(D3D11_DSV_DIMENSION_TEXTURE2D);
DX::ThrowIfFailed( direct3d.device->CreateDepthStencilView( depthStencil.Get(), &viewDesc, &direct3d.depthStencil ) );
That code is literally right above my depth test / D3D11_DEPTH_STENCIL_DESC code. I'm presuming that this is what creates the depth buffer.
I think you might need to sort the order in which you render your vertices if you want to render semi-transparency with a depth buffer. If you don't want to use a depth buffer, perhaps just don't define/create/set it?
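To illustrate the second option (a sketch only, with direct3d.renderTargetView standing in for however the code actually names its render target view), binding no depth stencil view at all removes depth testing from the picture entirely; draw order then decides what ends up on top, so the 48 quads should be drawn back to front:
ID3D11RenderTargetView* rtvs[] = { direct3d.renderTargetView.Get() };
direct3d.context->OMSetRenderTargets(1, rtvs, nullptr);   // no depth-stencil view bound
// ... then issue the 48 draw calls back to front so each layer alpha-blends over the previous ones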

How to correctly paint a huge image

I'm fairly new to wxWidgets so please bear with me. Let's say I have a 10Kx10K image and my wxScrolledWindow has a size of 640x480. I load the whole image into a wxBitmap which I use in my paint function.
Now in my OnPaint function I just say
wxPaintDC dc(this);
dc.DrawBitmap(_Bitmap, 0, 0 );
This somewhat works for the first few paints, but soon the window content is out of order and artifacts appear. This happens very quickly when I move a scroll bar back and forth rapidly.
I use the latest wxWidgets on a Windows 7 machine.
So, how can I improve my painting code?
Many thanks,
Christian
Using a 10000x10000 wxBitmap is a bad idea on its own; it may simply fail to be created on an older system (that's 400 MB of video RAM!). Drawing it entirely is sheer madness.
I don't know where your data comes from, but in the typical case of e.g. a map to be shown on screen, you should break it into tiles, convert the tiles that are currently visible on screen to a wxBitmap (or several of them) and draw only those.
Then you may optimize your drawing by using double buffering (which is relatively useless under Windows 7, which double-buffers everything on its own) and so on, but you should be using a reasonably sized backing-store bitmap.
This sounds like something that might be helped by using double buffering.
The first thing to start trying is to replace wxPaintDC with wxBufferedPaintDC
For more suggestions, here is a wiki article on the subject: http://wiki.wxwidgets.org/Flicker-Free_Drawing
As Ravenspoint kindly pointed out, there is an article on the wxWidgets wiki. According to that article, two things need to happen. First, override EVT_ERASE_BACKGROUND with an empty handler.
void Canvas::EraseBackground( wxEraseEvent& WXUNUSED(event))
{
}
And second to implement a basic double buffering scheme. Here is how I did it.
void Canvas::OnPaint(wxPaintEvent& WXUNUSED(event))
{
    int x, y;
    GetViewStart(&x, &y);
    wxRect Client_Area = GetClientRect();
    int width = Client_Area.width;
    int height = Client_Area.height;
    wxBitmap Current = _Bitmap.GetSubBitmap(wxRect( x * 10, y * 10, width, height ));
    wxPaintDC dc(this);
    dc.DrawBitmap(Current, 0, 0, false );
}
My scroll rate for both x and y is set to 10. That's why I multiply the view start coordinates.
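As a small, untested extension of that, the hard-coded 10 can be replaced by asking the window for its scroll rate, and the rectangle can be clamped so that GetSubBitmap never goes outside the bitmap's bounds:
void Canvas::OnPaint(wxPaintEvent& WXUNUSED(event))
{
    int xUnit, yUnit, x, y;
    GetScrollPixelsPerUnit(&xUnit, &yUnit);   // whatever was passed to SetScrollRate/SetScrollbars
    GetViewStart(&x, &y);

    wxRect visible(x * xUnit, y * yUnit, GetClientSize().GetWidth(), GetClientSize().GetHeight());
    visible.Intersect(wxRect(0, 0, _Bitmap.GetWidth(), _Bitmap.GetHeight())); // stay inside the bitmap

    wxPaintDC dc(this);
    dc.DrawBitmap(_Bitmap.GetSubBitmap(visible), 0, 0, false);
}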
Any more insight is very welcome.
Thanks,
Christian

NSImage loading a portion of image

The requirement is like this:
I will get a single large PNG image for a button; this single image contains the images for hover, button clicked and mouse exit that need to be displayed.
The single PNG file would be 1024 x 28, so each image has a size of about 256 x 28.
I have been googling the best possible approach but couldn't work out how to achieve this.
I have the following approach in mind:
NSImage *pBtnImage[MAX_BUTTON_IMAGES];
for (int i = 0; i < 4; i++) {
    pBtnImage[i] = [[NSImage alloc] initWithData:??????];
}
I want to know what I should pass for the NSData parameter.
Is it possible to load a single image and clip it accordingly as and when needed?
Thanks in advance.
There's no simple Cocoa-supported way to read only a sub-rectangle of the image from its data. It's a simple matter, however, to read the whole image in and only use a select rectangle of the image when compositing. Thing is, with all the available API, you might be better off just to use the standard +[NSImage imageNamed:] method to read the images in individually and let the OS handle caching.
What actual, measured performance problem are you trying to solve? Does one really exist, or is this a case of premature optimization?