Reducing CPU usage of navigator.webkitGetUserMedia (Electron: DesktopCapturer) - webkit

I'm using navigator.webkitGetUserMedia to capture a screenshot of a window once every second by assigning the returned stream to a <video> element, copying it to a <canvas>, and saving the resulting Buffer to a file.
The CPU usage in my application is consistently high and I've pinpointed it to this area.
Code
// Initialize the video, canvas, and ctx
var fs = require('fs'),   // Node's fs module, used in captureState below
    localStream,
    _video = document.querySelector('#video'),
    _canvas = document.querySelector('#canvas'),
    _ctx = _canvas.getContext('2d'),
    sourceName = 'my-window-id';

// Load the stream from navigator.webkitGetUserMedia
navigator.webkitGetUserMedia({
    audio: false,
    video: {
        mandatory: {
            chromeMediaSource: 'desktop',
            chromeMediaSourceId: sourceName,
            minWidth: 1920,
            maxWidth: 1920,
            minHeight: 1080,
            maxHeight: 1080
        }
    }
}, gotStream, getUserMediaError);

function gotStream(stream) {
    // Use the stream in our <video>
    _video.src = window.URL.createObjectURL(stream);
    // Reference the stream locally
    localStream = stream;
}

function getUserMediaError(err) {
    // Simplified error handler for the callback referenced above
    console.error(err);
}

function captureState() {
    var buffer,
        dataURL;
    // Draw <video> to <canvas> and convert to a buffer (image data)
    _ctx.drawImage(_video, 0, 0);
    dataURL = _canvas.toDataURL('image/png');
    buffer = new Buffer(dataURL.split(",")[1], 'base64');
    // Write the image data to file
    fs.writeFileSync('screenshot.png', buffer);
}

// Capture state every second
setInterval(function() {
    captureState();
}, 1000);
This code may not run as-is; it's a simplified version of what I have in my app, trimmed to keep it Stack Overflow readable.
Things I've Tried
_video.pause() and _video.play() when needed. Didn't seem to change CPU usage.
_video.stop(). This means I would have to get the stream again which causes a spike in CPU usage worse than keeping it open.
My best lead right now is to change the frame rate by adding:
optional: [
    { minFrameRate: 1 },
    { frameRate: 1 }
]
An extremely low frame rate would be fine. However, I haven't been able to determine whether the frameRate setting works in this case. The docs don't list it, and I don't have the newer mediaDevices.getUserMedia available.
Is it possible to set extremely low frame rates (or any at all) for navigator.webkitGetUserMedia?
Has anyone been able to reduce CPU usage of the stream in any other way?
Any alternative methods of achieving the same goal (state capture on interval) would also be helpful.
Thanks!
Side Note
This is in an Electron app on Windows using DesktopCapturer to get the chromeMediaSourceId.
Update on CPU Usage
Cost of running the stream: 6% CPU usage
Calling captureState every 1000 ms: 5% CPU usage
Current total: 11%
Currently working on reducing #2 based on Csaba Toth's recommendations so far. I should be able to reduce captureState's cost by changing how the canvas is captured. I'll update when that's done.
For #1, if I can't avoid keeping the video stream running, I'll just have to try to cap the total CPU usage at just over 6% by optimizing #2.

There's some unnecessary base64 encoding and decoding going on here; the way you get hold of the data is roundabout:
dataURL = _canvas.toDataURL('image/png');
buffer = new Buffer(dataURL.split(",")[1], 'base64');
Take a look at how the QR decoder accesses the image instead: https://github.com/bulldogearthday/booths/blob/master/scripts/qrdecoder.js#L1991
var canvas_qr = document.getElementById("qr-canvas");
var context = canvas_qr.getContext('2d');
qrcode.width = canvas_qr.width;
qrcode.height = canvas_qr.height;
qrcode.imagedata = context.getImageData(0, 0, qrcode.width, qrcode.height);
(The other side of the software did a drawImage to the canvas earlier.) Now the task would be to find a method which won't unnecessarily convert the PNG data into base64 and then decode it. I see this URI encoding advised everywhere because it takes fewer lines of code, but performance-wise an unnecessary encoding/decoding phase is undesirable. 1920x1080 PNGs are big and not meant for base64 inlining. Since you are in Node.js anyway, try using https://github.com/niegowski/node-pngjs or a similar library to save the image data.
There's always a trade-off between space and time, so if time really matters, you can trade lower compression for higher performance: https://github.com/wheany/js-png-encoder
There is a trade-off here, since the base64 URI examples take advantage of the browser's native (C++, fast) PNG encoding, but then do unnecessary base64 encoding and decoding. node-pngjs would perform the PNG encoding in JS land, which may not be as performant as the browser's internal encoding. The best would be to find a way to leverage the browser's encoding without the base64 round trip.
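canvas.toBlob() is one such way: it keeps the browser's native PNG encoder but never produces a base64 string. A minimal sketch, assuming the _canvas, _ctx, _video and fs from the question's code:
function captureState() {
    _ctx.drawImage(_video, 0, 0);
    // toBlob() encodes the PNG natively in the browser and hands back raw bytes
    _canvas.toBlob(function (blob) {
        var reader = new FileReader();
        reader.onload = function () {
            // reader.result is an ArrayBuffer holding the PNG bytes, no base64 involved
            fs.writeFileSync('screenshot.png', Buffer.from(reader.result));
        };
        reader.readAsArrayBuffer(blob);
    }, 'image/png');
}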
Earlier advice
According to what you show, I think your main problem is that you perform _ctx.drawImage(_video, 0, 0); and other operations in your gotStream.
Here is a Progressive Web App of mine; it performs QR code scanning too: https://github.com/bulldogearthday/booths/blob/master/scripts/app.js
Notice that in the "gotStream" (which is anonymous in my case https://github.com/bulldogearthday/booths/blob/master/scripts/app.js#L67) I only wire up the stream to the canvas.
My situation is easier because I don't have to enforce a size (I hope you don't hard-wire those screen-size pixel numbers), but I also perform processing periodically (a QR code scan attempt every 500 ms). I originally used a timer for that, but it stopped working after some iterations/ticks, so technically I issue a single timeout and re-issue a new one every time it fires. See the initial timeout https://github.com/bulldogearthday/booths/blob/master/scripts/app.js#L209 and the periodic re-issue: https://github.com/bulldogearthday/booths/blob/master/scripts/app.js#L231
As you can see, the only place I do "heavy lifting" is in app.scanQRCode, which happens only twice a second. There I process the content of the canvas:
https://github.com/bulldogearthday/booths/blob/master/scripts/app.js#L218
I advise you to restructure your code that way: set up either a timer ticking every second, or re-issue timeouts as I do (a minimal sketch follows below), and then do the capture + save in that section. Hopefully that will lighten the CPU load, although encoding a 1920x1080 PNG once a second may still stress the CPU (there will be PNG encoding).
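A minimal sketch of the re-issued timeout pattern, using the captureState function from the question:
function tick() {
    captureState();          // heavy lifting happens only here, once per second
    setTimeout(tick, 1000);  // re-issue the timeout instead of relying on setInterval
}
setTimeout(tick, 1000);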
(That's beneficial if you want to go for individual images. If you want to end up with a video anyway, then I'd try to go the route of enforcing a 1 FPS video as you suggested and capturing the video stream directly instead of individual images. But for the CPU load my suggestion should help, IMHO.)
In the README (https://github.com/bulldogearthday/booths) you can see one of the main sources I looked at for getUserMedia: https://github.com/samdutton/simpl/blob/gh-pages/getusermedia/sources/js/main.js
I don't fiddle with issuing .play() or .pause() or anything. As a matter of fact, my code waits until it receives the signal that playback has started (it starts by itself by default, at least for cameras): document.getElementById('qrVideo').addEventListener('playing', app.saveVideoSize, false); https://github.com/bulldogearthday/booths/blob/master/scripts/app.js#L67 My intention was to not disturb the natural process with anything if possible. In my case I detect the video size this gentle way. Looking at DesktopCapturer, they also don't perform anything extra in gotStream in their README https://github.com/electron/electron/blob/master/docs/api/desktop-capturer.md, and as shown, ideally you just wire the video stream up to the canvas.

Related

Buffer Not Large enough for pixel

I am trying to get a Bitmap from a byte array:
val bitmap_tmp = Bitmap.createBitmap(height, width, Bitmap.Config.ARGB_8888)
val buffer = ByteBuffer.wrap(decryptedText)
bitmap_tmp.copyPixelsFromBuffer(buffer)
callback.bitmap(bitmap_tmp)
I am facing an error in the line below:
bitmap_tmp.copyPixelsFromBuffer(buffer)
The error reads as:
java.lang.RuntimeException: Buffer not large enough for pixels
I have tried different solutions found on Stack Overflow, like adding the line below before the failing call, but it still crashes:
buffer.rewind()
The weird part is that the same code, in a different place, for the same image [same image with same dimensions], works perfectly and I get the bitmap, but here it crashes.
How do I solve this?
Thanks in advance.
The error message means what it says: the buffer you're copying from isn't large enough. It needs to contain at least as many bytes as are necessary to overwrite every pixel in your bitmap (which has a fixed size and pixel config).
The documentation for the method doesn't make it clear, but here's the source for the Bitmap class, and in that method:
if (bufferBytes < bitmapBytes) {
throw new RuntimeException("Buffer not large enough for pixels");
}
So yeah, you can't partially overwrite the bitmap; you need enough data to fill it completely. And if you check the source, the check depends on the buffer's current position and limit (it's not just its capacity, it's how much data remains to be read).
If it works elsewhere, I'm guessing decryptedText is different there, or maybe you're creating your Bitmap with a different Bitmap.Config (ARGB_8888, for example, requires 4 bytes per pixel).
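If it helps to pin down which case you're hitting, here's a hedged sketch of checking the buffer before the copy (toBitmapChecked is a hypothetical helper; decryptedText, width and height are the names from the question):
import android.graphics.Bitmap
import android.util.Log
import java.nio.ByteBuffer

fun toBitmapChecked(decryptedText: ByteArray, width: Int, height: Int): Bitmap? {
    val bitmap = Bitmap.createBitmap(width, height, Bitmap.Config.ARGB_8888)
    val buffer = ByteBuffer.wrap(decryptedText)
    val required = bitmap.byteCount  // width * height * 4 bytes for ARGB_8888
    if (buffer.remaining() < required) {
        Log.e("BitmapCopy", "Need $required bytes, buffer only has ${buffer.remaining()} remaining")
        return null
    }
    bitmap.copyPixelsFromBuffer(buffer)
    return bitmap
}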

How to extract frames from video using webcodecs from chrome 86

WebCodecs was released in Chrome 86, but there's no real code example of how to use it yet. Given a video URL, how do I extract video frames as ImageData using WebCodecs?
What you describe is the entire complex process of acquiring raw bitmap-like data (e.g. something you can dump on a canvas), from a formatted file or a stream of data chunks.
In case of files (including the case where your URL points to a complete file, such as an .mp4 file), this is generally made of 2 steps:
Parsing the container file into individual chunks of encoded video and/or audio
Decoding these chunks of encoded video/audio
WebCodecs only facilitates step 2 of this process, i.e. what is called decoding. The reasoning behind this decision was that parsing the container is computationally trivial, so you can efficiently do this with the File APIs already, but you still need to implement parsing/processing the container yourself.
Luckily, plenty of libraries exist already, many of which ironically existed long before the emergence of the WebCodecs API.
MP4Box is one example, helping you acquire encoded video and audio chunks, which you can then feed into a VideoDecoder or AudioDecoder.
With MP4Box, the key piece of your code will be centered around the onSamples callback you provide, and it'll look something like this:
mp4BoxFile.onSamples = (trackId, user, chunks) =>
{
    for (let i = 0; i < chunks.length; i++)
    {
        let chunk = chunks[i];
        let encodedChunk = new EncodedVideoChunk({
            // you'll need to deep-inspect chunk to figure these out
            type: "key", // or "delta"
            timestamp: ...
            duration: ...
            data: chunk.data
        });
        // pass encodedChunk to a VideoDecoder instance's decode method
    }
};
This is just a rough sketch of how your code will probably look; it probably won't work without more inspection, and it'll take a lot of trial and error, because this is very low-level stuff.
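On the decoding side, a similarly rough sketch of turning decoded frames into ImageData (the codec string and dimensions below are placeholders; the real values must come from the parsed track info):
const canvas = document.createElement('canvas');
const ctx = canvas.getContext('2d');

const decoder = new VideoDecoder({
    output: (frame) => {
        canvas.width = frame.displayWidth;
        canvas.height = frame.displayHeight;
        ctx.drawImage(frame, 0, 0);  // a VideoFrame can be drawn like any other image source
        const imageData = ctx.getImageData(0, 0, canvas.width, canvas.height);
        frame.close();               // release the frame's memory as soon as possible
        // ...use imageData...
    },
    error: (e) => console.error(e)
});

decoder.configure({
    codec: 'avc1.64001f',  // placeholder; read the real codec string from the track
    codedWidth: 1920,      // placeholder
    codedHeight: 1080      // placeholder
});

// inside onSamples: decoder.decode(encodedChunk);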
WebCodecs is not the silver bullet you probably expected, but it can help you build one.

Read binary files without having them buffered in the volume block cache

Older, now deprecated, macOS file system APIs provided flags to read a file unbuffered.
I seek a modern way to accomplish the same, so that I can read a file's data into memory without it being cached needlessly somewhere else in memory (such as the volume cache).
Reading with fread and first calling setvbuf(fp, NULL, _IONBF, 0) does not have the desired effect in my tests, for example. I am looking for other low-level functions that let me read into a prepared memory buffer and avoid buffering of the whole data elsewhere.
Background
I am writing a file search program. It reads large amounts of file content (many GBs) that isn't and won't be used by the user otherwise. It would be a waste to have all this data cached in the volume cache as it'll soon get purged by further reads again, anyway. It'll also likely lead to purging file data that's actually in use by the user or system, causing more cache misses.
Therefore, I should be able to tell the system that I do not need the file data cached. The little caching needed for cluster boundaries is not an issue; it's the many large chunks that I read briefly into memory to search that don't need to be cached.
Two suggestions:
Use the read() system call instead of stdio.
Disable data caching with the F_NOCACHE option for fcntl().
In Swift that would be something like (error checking omitted for brevity):
import Foundation

let path = "/path/to/file"
let fd = open(path, O_RDONLY)
fcntl(fd, F_NOCACHE, 1)  // ask the kernel not to cache data read through this descriptor

var buffer = Data(count: 1024 * 1024)
buffer.withUnsafeMutableBytes { ptr in
    // read() fills the raw buffer directly, bypassing stdio buffering
    let amount = read(fd, ptr.baseAddress, ptr.count)
}
close(fd)

Does vkCmdCopyImageToBuffer work when source image uses VK_IMAGE_TILING_OPTIMAL?

I have read (after running into the limitation myself) that for copying data from the host to a VK_IMAGE_TILING_OPTIMAL VkImage, you're better off using a VkBuffer rather than a VkImage for the staging image to avoid restrictions on mipmap and layer counts. (Here and Here)
So, when it came to implementing a glReadPixels-esque piece of functionality to read the results of a render-to-texture back to the host, I thought that reading to a staging VkBuffer with vkCmdCopyImageToBuffer instead of using a staging VkImage would be a good idea.
However, I haven't been able to get it to work yet, I'm seeing most of the intended image, but with rectangular blocks of the image in incorrect locations and even some bits duplicated.
There is a good chance that I've messed up my synchronization or layout transitions somewhere and I'll continue to investigate that possibility.
However, I couldn't figure out from the spec whether using vkCmdCopyImageToBuffer with an image source using VK_IMAGE_TILING_OPTIMAL is actually supposed to 'un-tile' the image, or whether I should actually expect to receive a garbled implementation-defined image layout if I attempt such a thing.
So my question is: Does vkCmdCopyImageToBuffer with a VK_IMAGE_TILING_OPTIMAL source image fill the buffer with linearly tiled data or optimally (implementation defined) tiled data?
Section 18.4 describes the layout of the data in the source/destination buffers, relative to the image being copied from/to. This is outlined in the description of the VkBufferImageCopy struct. There is no language in this section which would permit different behavior from tiled images.
The specification even has pseudo code for how copies work (this is for non-block compressed images):
rowLength = region->bufferRowLength;
if (rowLength == 0)
    rowLength = region->imageExtent.width;

imageHeight = region->bufferImageHeight;
if (imageHeight == 0)
    imageHeight = region->imageExtent.height;

texelSize = <texel size taken from the src/dstImage>;

address of (x,y,z) = region->bufferOffset
                   + (((z * imageHeight) + y) * rowLength + x) * texelSize;

where x, y, z range from (0,0,0) to (region->imageExtent.width, height, depth).
The x,y,z part is the location of the pixel in question from the image. Since this location is not dependent on the tiling of the image (as evidenced by the lack of anything stating that it would be), buffer/image copies will work equally on both kinds of tiling.
Also, do note that this specification is shared between vkCmdCopyImageToBuffer and vkCmdCopyBufferToImage. As such, if a copy works one way, it by necessity must work the other.
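For reference, a hedged sketch of such a readback in C (cmd, srcImage, readbackBuffer, width and height are assumed to exist, and the image must already have been transitioned to a transfer-source layout with the appropriate barrier):
// Tightly packed copy of one 2D colour image into a host-visible buffer.
VkBufferImageCopy region = {0};
region.bufferOffset      = 0;
region.bufferRowLength   = 0;  // 0 = tightly packed: rows are imageExtent.width texels
region.bufferImageHeight = 0;  // 0 = tightly packed
region.imageSubresource.aspectMask     = VK_IMAGE_ASPECT_COLOR_BIT;
region.imageSubresource.mipLevel       = 0;
region.imageSubresource.baseArrayLayer = 0;
region.imageSubresource.layerCount     = 1;
region.imageOffset = (VkOffset3D){ 0, 0, 0 };
region.imageExtent = (VkExtent3D){ width, height, 1 };

vkCmdCopyImageToBuffer(cmd, srcImage, VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL,
                       readbackBuffer, 1, &region);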

How do I analyze video stream on iOS?

For example, there are QR scanners which scan a video stream in real time and extract QR code info.
I would like to check a light source in the video and tell whether it is on or off; it is quite powerful, so detecting it should not be a problem.
I will probably take a video stream as input, maybe make images from it, and analyze the images or the stream in real time for the presence of the light source (maybe by counting the number of pixels of a certain colour in the image?).
How do I approach this problem? Maybe there is some source or library?
It sounds like you are asking for information about several discrete steps. There are a multitude of ways to do each of them, and if you get stuck on any individual step it would be a good idea to post a question about it individually.
1: Get video Frame
Like chaitanya.varanasi said, the AVFoundation framework is the best way to get access to a video frame on iOS. If you want something less flexible and quicker, try looking at OpenCV's video capture. The goal of this step is to get access to a pixel buffer from the camera. If you have trouble with this, ask about it specifically.
2: Put pixel buffer into OpenCV
This part is really easy. If you get it from OpenCV's video capture, you are already done. If you get it from AVFoundation, you will need to put it into OpenCV like this:
//Buffer is of type CVImageBufferRef, which is what AVFoundation should be giving you
//I assume it is BGRA or RGBA formatted; if it isn't, change CV_8UC4 to the appropriate format
CVPixelBufferLockBaseAddress( Buffer, 0 );
int bufferWidth = CVPixelBufferGetWidth(Buffer);
int bufferHeight = CVPixelBufferGetHeight(Buffer);
unsigned char *pixel = (unsigned char *)CVPixelBufferGetBaseAddress(Buffer);
cv::Mat image = cv::Mat(bufferHeight, bufferWidth, CV_8UC4, pixel); //put buffer in OpenCV, no memory copied
//Process image here
//End processing
CVPixelBufferUnlockBaseAddress( Buffer, 0 );
Note: I am assuming you plan to do this in OpenCV since you used its tag. I also assume you can get the OpenCV framework to link into your project. If that is an issue, ask a specific question about it.
3: Process Image
This part is by far the most open-ended. All you have said about your problem is that you are trying to detect a strong light source. One very quick and easy way of doing that would be to compute the mean pixel value of a greyscale image. If you get the image in colour, you can convert it with cvtColor, then just call mean() on it to get the average value. Hopefully you can tell if the light is on by how that value fluctuates.
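A short, hedged continuation of the snippet above (it reuses the cv::Mat named image and assumes BGRA input; the constant name is from the newer OpenCV C++ API):
cv::Mat grey;
cv::cvtColor(image, grey, cv::COLOR_BGRA2GRAY);  // colour frame -> greyscale
double meanBrightness = cv::mean(grey)[0];       // average pixel value, 0..255
// Compare meanBrightness against an experimentally chosen threshold to decide
// whether the light source is on.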
chaitanya.varanasi suggested another option, you should check it out too.
OpenCV is a very large library that can do a wide variety of things. Without knowing more about your problem, I don't know what else to tell you.
Look at the AVFoundation Framework from Apple.
Hope it helps!
You can try this method: start by getting all frames into an AVCaptureVideoDataOutput. From the delegate method captureOutput:didOutputSampleBuffer:fromConnection:, you can sample/calculate every pixel. Source: answer
Also, you can take a look at this SO question where they check if a pixel is black. If it's such a powerful light source, you can take the inverse of the pixel and then decide using a set threshold for black.
The above sample code only provides access to the pixel values stored in the buffer; you cannot run any commands other than those that change those values on a pixel-by-pixel basis:
for ( uint32_t y = 0; y < height; y++ )
{
    for ( uint32_t x = 0; x < width; x++ )
    {
        bgraImage.at<cv::Vec<uint8_t,4> >(y,x)[1] = 0;
    }
}
This—to use your example—will not work with the code you provided:
cv::Mat bgraImage = cv::Mat( (int)height, (int)extendedWidth, CV_8UC4, base );
cv::Mat grey = bgraImage.clone();
cv::cvtColor(grey, grey, 44);