How to capture and record video from webcam using JavaCV - video-capture

I'm new to JavaCV and I'm having a difficult time finding good tutorials on the topics I'm interested in. I've succeeded in implementing a basic real-time video stream from my webcam, but the problem is that I use this code snippet, which I found on the net:
public void run() {
FrameGrabber grabber = new VideoInputFrameGrabber(0); // use 1 for the next camera
int i = 0;
try {
IplImage img;
while (true) {
img = grabber.grab();
if (img != null) {
cvFlip(img, img, 1); // flip mode 1 = mirror the image left-right (flip around the vertical axis)
cvSaveImage((i++) + "-aa.jpg", img);
// show image on window
which results in a separate JPG file for every frame.
What I really want to do is capture my webcam input and, along with showing it, save it to a proper video file. I found out about FFmpegFrameRecorder but don't know how to use it. I've also been wondering what the different options are for the format of the video file, because FLV might be more useful for me.

It's been quite a journey. There are still a few things whose meaning I'm not sure about, but here is a working example for capturing and recording video from a webcam using JavaCV:
import com.googlecode.javacv.CanvasFrame;
import com.googlecode.javacv.FFmpegFrameRecorder;
import com.googlecode.javacv.OpenCVFrameGrabber;
import com.googlecode.javacv.cpp.avutil;
import com.googlecode.javacv.cpp.opencv_core.IplImage;
public class CameraTest {
public static final String FILENAME = "output.mp4";
public static void main(String[] args) throws Exception {
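OpenCVFrameGrabber grabber = new OpenCVFrameGrabber(0);
grabber.start(); // the grabber has to be started before the first grab()
IplImage grabbedImage = grabber.grab();
CanvasFrame canvasFrame = new CanvasFrame("Cam");
canvasFrame.setCanvasSize(grabbedImage.width(), grabbedImage.height());
System.out.println("framerate = " + grabber.getFrameRate());
FFmpegFrameRecorder recorder = new FFmpegFrameRecorder(FILENAME, grabber.getImageWidth(), grabber.getImageHeight());
recorder.setVideoBitrate(10 * 1024 * 1024);
recorder.start(); // opens the output file
while (canvasFrame.isVisible() && (grabbedImage = grabber.grab()) != null) {
canvasFrame.showImage(grabbedImage); // live preview
recorder.record(grabbedImage); // append the frame to the video file
}
// release the output file, the camera and the window
recorder.stop();
grabber.stop();
canvasFrame.dispose();
}
}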
It was somewhat hard for me to make this work, so for anyone who runs into the same issue: if you follow the official guide on setting up JavaCV on Windows 7 64-bit and want to capture video using the code above, create a new directory C:\ffmpeg and extract into it the files from the FFmpeg release that the official guide tells you to download. Then add C:\ffmpeg\bin to your PATH environment variable, and that's all. All credit for this step goes to karlphillip
and his post here
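If the recorder still fails to load after that step, it can help to confirm from Java that the directory really is visible on PATH. The following is only a minimal sketch: the class name `PathCheck` and its methods are hypothetical helpers, not part of JavaCV, and the case-insensitive comparison assumes Windows path semantics.

```java
import java.io.File;
import java.util.Locale;
import java.util.regex.Pattern;

/** Hypothetical helper: checks whether a directory is listed in a PATH-style string. */
final class PathCheck {

    /** Trims the entry, drops trailing slashes and lower-cases it (assumes Windows, where paths are case-insensitive). */
    static String normalize(String entry) {
        String s = entry.trim();
        while (s.length() > 1 && (s.endsWith("\\") || s.endsWith("/"))) {
            s = s.substring(0, s.length() - 1);
        }
        return s.toLowerCase(Locale.ROOT);
    }

    /** Returns true if dir appears as one of the entries of pathValue. */
    static boolean containsDir(String pathValue, String dir, String separator) {
        if (pathValue == null || pathValue.isEmpty()) {
            return false;
        }
        String wanted = normalize(dir);
        for (String entry : pathValue.split(Pattern.quote(separator))) {
            if (normalize(entry).equals(wanted)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // C:\ffmpeg\bin is the directory named in the answer above.
        boolean ok = containsDir(System.getenv("PATH"), "C:\\ffmpeg\\bin", File.pathSeparator);
        System.out.println(ok ? "C:\\ffmpeg\\bin is on PATH" : "C:\\ffmpeg\\bin is NOT on PATH");
    }
}
```

A process only sees the PATH it was started with, so the IDE or console has to be restarted after the variable is changed.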


Receive video from a source, preprocess and stream live preview to the client | ASP.NET Core

I need to implement a server that gets video from some source (for example, an IP camera), preprocesses each image, and streams the result down to the client if requested.
I already implemented the part with processing, it accepts a single frame (bitmap) and returns the processed bitmap. What I'm struggling with is the part of receiving video from the camera and then streaming it to the client.
What would be the right way to do it?
What libraries do you recommend using?
I use ASP.NET Core for the Server part, and Angular/React for the Client.
I tried to implement gRPC but a gRPC-Web client for typescript seems to be a pain in the ass.
Edit: 02.08.2022
What I achieved so far:
I figured out how to receive image output from the camera.
I found an RTSP client for C#. Source: C# RTSP Client for .NET
It works pretty fine. I can receive output with small to no delay, and I use my phone to emulate the RTSP camera/server.
So the RTSP client receives raw frames (in my case H.264 I-frames and P-frames). The problem is that I need to decode those frames, preferably to Bitmap, because I use a YoloV4 ONNX model for object detection.
Here's how I set up YoloV4 with ML.Net. Source: Machine Learning with ML.NET – Object detection with YOLO
To decode the raw frames I use FFmpeg. Sadly, I didn't find a working FFmpeg package for .NET Core: I tried AForge.NET and Accord, but in both packages the FFMPEG namespace is missing after installation for some reason, so I dug through GitHub and took the FrameDecoderCore project. It's not the best solution, but it works. Now I can receive the output and decode it to Bitmap.
Now I'm facing three major issues:
1. How to detect objects without delaying the process of receiving camera output, and how to properly build an ONNX model that only predicts, without training.
2. How to convert processed bitmaps back to a video stream. I also need to be able to save part of it as a video file on disk (the video format doesn't matter) whenever the desired object is detected.
3. How to stream processed or unprocessed output to the client when the client wants to see the camera output. I'm thinking of gRPC here, sending bitmaps and then displaying them on an HTML canvas.
Here's how my service looks at the moment:
public class CCTVService : BackgroundService
{
    private readonly RtspClient _rtspClient;
    private readonly ILogger<CCTVService> _logger;
    private const int streamWidth = 480;
    private const int streamHeight = 640;
    private static readonly FrameDecoder FrameDecoder = new FrameDecoder();
    private static readonly FrameTransformer FrameTransformer = new FrameTransformer(streamWidth, streamHeight);

    public CCTVService(ILogger<CCTVService> logger)
    {
        _logger = logger;
        _rtspClient = new RtspClient(new ConnectionParameters(new Uri("rtsp://")));
    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        using (_rtspClient)
        {
            try
            {
                await _rtspClient.ConnectAsync(stoppingToken);
                _logger.LogInformation("Connecting to RTSP");
            }
            catch(RtspClientException clientException)
            {
                // (handler body not shown in the original post)
            }
            _rtspClient.FrameReceived += (obj, rawFrame) =>
            {
                // ignore anything that is not a video frame
                if (rawFrame is not RawVideoFrame rawVideoFrame)
                    return;
                var decodedFrame = FrameDecoder.TryDecode(rawVideoFrame);
                if (decodedFrame == null)
                    return;
                using var bitmap = FrameTransformer.TransformToBitmap(decodedFrame);
                _logger.LogInformation($"Timestamp: {new DateTimeOffset(rawFrame.Timestamp).ToUnixTimeSeconds()} Timestamp-diff: {new DateTimeOffset(DateTime.Now).ToUnixTimeSeconds() - new DateTimeOffset(rawFrame.Timestamp).ToUnixTimeSeconds()}");
                // save bitmaps | Test
                //var t = new Thread(() =>
                // using var bitmap = FrameTransformer.TransformToBitmap(decodedFrame);
                // var name = "./test/" + new DateTimeOffset(rawFrame.Timestamp).ToUnixTimeMilliseconds().ToString() + " - " + new Random().NextInt64().ToString() + ".bmp";
                // bitmap.Save(name);
                //t.Priority = ThreadPriority.Highest;
            };
            try
            {
                await _rtspClient.ReceiveAsync(stoppingToken);
            }
            catch
            {
                // swallow
            }
        }
    }
}
I can't really help with parts 2 and 3 of your question, but with ML.NET, one thing you might consider is batching the predictions. Instead of processing frames one at a time, you could collect 10-20 frames and then, instead of using PredictionEngine, call Transform and pass it an IDataView instead of a single Bitmap.
Here are some samples of using ONNX models inside applications. The WPF sample might be of interest since it uses a webcam to capture inputs. I believe it uses the native Windows APIs, so different than how you'd do it for web but it might be worth looking at anyway.
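The batching idea above does not depend on ML.NET itself, so here is a small language-neutral sketch of it, written in Java. `FrameBatcher` is a hypothetical helper, not an ML.NET or RTSP client type: the frame callback hands each decoded frame to `add`, and only when a full batch comes back would the prediction step run on it.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that groups incoming items (for example decoded frames) into fixed-size batches. */
final class FrameBatcher<T> {
    private final int batchSize;
    private List<T> pending = new ArrayList<>();

    FrameBatcher(int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
    }

    /** Adds one frame; returns a full batch once batchSize frames are collected, otherwise null. */
    synchronized List<T> add(T frame) {
        pending.add(frame);
        if (pending.size() < batchSize) {
            return null;
        }
        List<T> full = pending;
        pending = new ArrayList<>();
        return full;
    }

    /** Returns whatever is still pending (possibly an empty list), for example on shutdown. */
    synchronized List<T> flush() {
        List<T> rest = pending;
        pending = new ArrayList<>();
        return rest;
    }

    public static void main(String[] args) {
        // Integers stand in for decoded frames in this demo.
        FrameBatcher<Integer> batcher = new FrameBatcher<>(3);
        for (int frameId = 1; frameId <= 7; frameId++) {
            List<Integer> batch = batcher.add(frameId);
            if (batch != null) {
                System.out.println("predict on batch " + batch);
            }
        }
        System.out.println("left over " + batcher.flush());
    }
}
```

Handing the full batch to a separate worker thread, rather than predicting inside the frame callback, is what would keep detection from delaying frame reception. That part is left out of the sketch.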

Adding text items to an Existing PDF w/ Telerik DocumentProcessing Library

I want to open an existing PDF document and add different annotations to it, namely bookmarks and some text.
I am using the Telerik Document Processing Library (DPL) v2019.3.1021.40.
I am new to DPL, but I believe RadFlowDocument is the way to go.
I am having trouble creating the RadFlowDocument:
FlowProvider.PdfFormatProvider provider = new FlowProvider.PdfFormatProvider();
using (Stream stream = File.OpenRead(sourceFile))
--> RadFlowDocument flowDoc = provider.Import(stream);
The line indicated with the arrow gives the error "Import Not Supported".
There is a Telerik blog post here
It seems relevant, but I'm not 100% sure.
It cautions you to make sure the providers are matched correctly; I believe they are in my example.
Again, the ultimate goal is to open a PDF and add some content to it. I think RadFlowDocument is the right direction. If there is a better solution, I'm happy to hear that too.
I figured it out. The DPL is pretty good, but the documentation is still growing; I hope this helps someone out.
This draws from a myriad of articles; I can't begin to cite them all.
There are two notions for working with PDFs in the DPL:
1. FixedDocument takes pages. I think this is meant for stitching documents together.
2. FlowDocument, I believe, lays things out the way an HTML renderer would.
I am using Fixed, mainly because I can get that to work.
using System;
using System.IO;
using System.Windows; //nec for Size struct
using System.Diagnostics; //nec for launching the pdf at the end
using Telerik.Windows.Documents.Fixed.Model;
//if you have fixed and flow provider, you have to specify, so I make a shortcut
using FixedProvider = Telerik.Windows.Documents.Fixed.FormatProviders.Pdf;
using Telerik.Windows.Documents.Fixed.Model.Editing;
using Microsoft.VisualStudio.TestTools.UnitTesting;
namespace DocAggregator
public class UnitTest2
public void EditNewFIle_SrcAsFixed_TrgAsFixed()
String dt = @"C:\USERS\greg\DESKTOP\DPL\";
String sourceFile = dt + "output.pdf";
//Open the sourceDoc so you can add stuff to it
RadFixedDocument sourceDoc;
//a provider parses the actual file into the model.
FixedProvider.PdfFormatProvider fixedProv = new FixedProvider.PdfFormatProvider();
using (Stream stream = File.OpenRead(sourceFile))
//'populate' the doc object from the file
//using the FLOW classes, I get "Import Not Supported".
sourceDoc = fixedProv.Import(stream);
int pages = sourceDoc.Pages.Count;
int pageCounter = 1;
int xoffset = 150;
int yoffset = 50;
//editor is the thing that lets you add elements into the source doc
//Like the provider, the Editor needs to match the document class (Fixed or Flow)
RadFixedDocumentEditor editor = new RadFixedDocumentEditor(sourceDoc);
foreach (RadFixedPage page in sourceDoc.Pages)
{
    FixedContentEditor pEd = new FixedContentEditor(page);
    Size ps = page.Size;
    pEd.Position.Translate(ps.Width - xoffset, ps.Height - yoffset);
    Block block = new Block();
    block.HorizontalAlignment = Telerik.Windows.Documents.Fixed.Model.Editing.Flow.HorizontalAlignment.Center;
    block.TextProperties.FontSize = 22;
    block.InsertText(string.Format("Page {0} of {1} ", pageCounter, pages));
    pEd.DrawBlock(block); //draw the block at the editor's current position
    pageCounter++; //otherwise every page would read "Page 1 of N"
}
string exportFileName = "addedPageNums.pdf";
if (File.Exists(exportFileName))
{
    File.Delete(exportFileName); //start from a clean file
}
File.WriteAllBytes(exportFileName, fixedProv.Export(sourceDoc));
//launch the app
Process.Start(exportFileName);

Read a file from the cache in CEFSharp

I need to navigate to a web site that ultimately contains a .pdf file and I want to save that file locally. I am using CEFSharp to do this. The nature of this site is such that once the .pdf appears in the browser, it cannot be accessed again. For this reason, I was wondering if once you have a .pdf displayed in the browser, is there a way to access the source for that file in the cache?
I have tried implementing IDownloadHandler and that works, but you have to click the save button on the embedded .pdf. I am trying to get around that.
OK, here is how I got it to work. There is a function in CEFSharp that allows you to filter an incoming web response, which gives you complete access to the incoming stream. My solution is a little on the dirty side and not particularly efficient, but it works for my situation. If anyone sees a better way, I am open to suggestions. There are two things I have to assume in order for my code to work:
1. GetResourceResponseFilter is called every time a new page is downloaded.
2. The PDF is the last thing to be downloaded during the navigation process.
Start with the CEF Minimal Example found here :
I used the WinForms version. Implement the IRequestHandler and IResponseFilter in the form definition as follows:
public partial class BrowserForm : Form, IRequestHandler, IResponseFilter
public readonly ChromiumWebBrowser browser;
public BrowserForm(string url)
browser = new ChromiumWebBrowser(url)
Dock = DockStyle.Fill,
browser.BrowserSettings.FileAccessFromFileUrls = CefState.Enabled;
browser.BrowserSettings.UniversalAccessFromFileUrls = CefState.Enabled;
browser.BrowserSettings.WebSecurity = CefState.Disabled;
browser.BrowserSettings.Javascript = CefState.Enabled;
browser.LoadingStateChanged += OnLoadingStateChanged;
browser.ConsoleMessage += OnBrowserConsoleMessage;
browser.StatusMessage += OnBrowserStatusMessage;
browser.TitleChanged += OnBrowserTitleChanged;
browser.AddressChanged += OnBrowserAddressChanged;
browser.FrameLoadEnd += browser_FrameLoadEnd;
browser.LifeSpanHandler = this;
browser.RequestHandler = this;
The declaration and the last two lines are the most important for this explanation. I implemented the IRequestHandler using the template found here:
I changed everything to what it recommends as default except for GetResourceResponseFilter which I implemented as follows:
IResponseFilter IRequestHandler.GetResourceResponseFilter(IWebBrowser browserControl, IBrowser browser, IFrame frame, IRequest request, IResponse response)
{
    // only PDF responses are routed through the filter
    if (request.Url.EndsWith(".pdf"))
        return this;
    return null;
}
I then implemented IResponseFilter as follows:
FilterStatus IResponseFilter.Filter(Stream dataIn, out long dataInRead, Stream dataOut, out long dataOutWritten)
{
    if (dataIn == null)
    {
        dataInRead = 0;
        dataOutWritten = 0;
        return FilterStatus.Done;
    }
    dataInRead = dataIn.Length;
    dataOutWritten = Math.Min(dataInRead, dataOut.Length);
    byte[] buffer = new byte[dataOutWritten];
    int bytesRead = dataIn.Read(buffer, 0, (int)dataOutWritten);
    string s = System.Text.Encoding.UTF8.GetString(buffer);
    // a segment starting with %PDF begins a new document: discard the previous contents
    if (s.StartsWith("%PDF"))
        File.Delete(pdfFileName);
    // append this segment to the local copy
    using (var sw = new BinaryWriter(File.Open(pdfFileName, FileMode.Append)))
    {
        sw.Write(buffer, 0, bytesRead);
    }
    // pass the data through unchanged so the browser still displays the PDF
    dataOut.Write(buffer, 0, bytesRead);
    return FilterStatus.Done;
}
bool IResponseFilter.InitFilter()
{
    return true;
}
What I found is that the PDF is actually downloaded twice when it is loaded. In any case, there might be header information and what not at the beginning of the page. When I get a stream segment that begins with %PDF, I know it is the beginning of a PDF so I delete the file to discard any previous contents that might be there. Otherwise, I just keep appending each segment to the end of the file. Theoretically, the PDF file will be safe until you navigate to another PDF, but my recommendation is to do something with the file as soon as the page is loaded just to be safe.
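The reset-then-append rule described above can be isolated from CEFSharp and tried on its own. The sketch below is an illustration in Java, not the answer's code: `PdfChunkAssembler` is a hypothetical class, and an in-memory buffer stands in for the file on disk. It compares the first four bytes directly instead of decoding the whole segment as UTF-8, which is a substitution of mine to avoid treating binary data as text.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical in-memory stand-in for the file that the response filter appends to. */
final class PdfChunkAssembler {
    private static final byte[] MAGIC = "%PDF".getBytes(StandardCharsets.US_ASCII);
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();

    /** Returns true if the first bytes of the chunk are the PDF signature "%PDF". */
    static boolean startsWithPdfMagic(byte[] chunk, int length) {
        if (length < MAGIC.length) {
            return false;
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (chunk[i] != MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    /** Starts over when a chunk begins a new PDF, then appends the chunk in either case. */
    void accept(byte[] chunk, int length) {
        if (startsWithPdfMagic(chunk, length)) {
            buffer.reset(); // plays the role of deleting the file
        }
        buffer.write(chunk, 0, length);
    }

    byte[] toByteArray() {
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        PdfChunkAssembler assembler = new PdfChunkAssembler();
        // The same document arriving twice, as observed in the answer above.
        String[] segments = {"%PDF-1.4 part one,", " part two", "%PDF-1.4 part one,", " part two"};
        for (String segment : segments) {
            byte[] bytes = segment.getBytes(StandardCharsets.US_ASCII);
            assembler.accept(bytes, bytes.length);
        }
        System.out.println(new String(assembler.toByteArray(), StandardCharsets.US_ASCII));
    }
}
```

Like the original, this relies on the assumption that a segment starts with %PDF only at the beginning of a document.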

Tess4J doOCR() for *First Page* of pdf / tif

Is there a way to tell Tess4J to OCR only a certain number of pages or characters?
I will potentially be working with 200+ page PDFs, but I really only want to OCR the first page, if that!
As far as I understand, the common sample
package net.sourceforge.tess4j.example;
import net.sourceforge.tess4j.*;
public class TesseractExample {
public static void main(String[] args) {
File imageFile = new File("eurotext.tif");
Tesseract instance = Tesseract.getInstance(); // JNA Interface Mapping
// Tesseract1 instance = new Tesseract1(); // JNA Direct Mapping
try {
String result = instance.doOCR(imageFile);
} catch (TesseractException e) {
would attempt to OCR the entire 200+ page document into a single String.
For my particular case, that is far more than I need, and I'm worried it could take a very long time if I let it process all 200+ pages and then just take a substring of the first 500 or so characters.
The library has a PdfUtilities class that can extract certain pages of a PDF.
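For the TIFF half of the question, one possible approach that avoids touching the other pages is to read only the first page with the JDK's own Image I/O API and hand that single image to the OCR engine. This is a sketch under assumptions: `FirstPageReader` is a hypothetical helper, reading TIFF relies on the TIFF plugin bundled with Java 9 and later, and the Tess4J call appears only as a comment because it should be checked against the installed Tess4J version.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;

/** Hypothetical helper that decodes a single page of an image file. */
final class FirstPageReader {

    /** Reads only the page at pageIndex (0 = first page) and leaves the other pages undecoded. */
    static BufferedImage readPage(File imageFile, int pageIndex) throws IOException {
        try (ImageInputStream in = ImageIO.createImageInputStream(imageFile)) {
            if (in == null) {
                throw new IOException("Cannot open " + imageFile);
            }
            Iterator<ImageReader> readers = ImageIO.getImageReaders(in);
            if (!readers.hasNext()) {
                throw new IOException("No image reader available for " + imageFile);
            }
            ImageReader reader = readers.next();
            try {
                reader.setInput(in);
                return reader.read(pageIndex);
            } finally {
                reader.dispose();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass the path of a (multi-page) TIFF as the first argument.
        BufferedImage firstPage = readPage(new File(args[0]), 0);
        System.out.println("first page: " + firstPage.getWidth() + " x " + firstPage.getHeight());
        // Assumption: Tess4J offers a doOCR overload that accepts a BufferedImage, e.g.
        // String text = instance.doOCR(firstPage);
    }
}
```

For PDFs the page still has to be extracted first, which is where the PdfUtilities class mentioned above comes in.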

Windows 8 Metro App MediaElement.SetSource (can not change the volume during playback)

I am making a Windows 8 Metro style app.
I want to be able to play different sounds at the same time and manage them. For this purpose I created a MediaPlayService containing the methods that let me do that.
I found one issue: after calling "_mediaElement.SetSource()" I cannot change the volume. I call SetVolume and nothing happens.
Play(); --- this sequence works
SetVolume(100); --- does not work (I can not change the volume during playback)
public void SetVolume(int volume)
//_mediaElement.Volume = Math.Round((double)((double)volume / 100), 2);
double dvolume = Math.Round((double)((double)volume / 100), 2);
_mediaElement.SetValue(MediaElement.VolumeProperty, dvolume);
string _mediaPath;
public void Initialize(Sound sound)
_mediaElement = new MediaElement();
_mediaPath = sound.FilePath;
_mediaElement.AudioCategory = Windows.UI.Xaml.Media.AudioCategory.Communications;
_mediaElement.IsLooping = true;
_mediaElement.MediaFailed += _mediaElement_MediaFailed;
_mediaElement.RealTimePlayback = true;
public async void Play()
var pack = Windows.ApplicationModel.Package.Current;
var installedLocation = pack.InstalledLocation;
var storageFile = await installedLocation.GetFileAsync(_mediaPath);
if (storageFile != null)
var stream = await storageFile.OpenAsync(Windows.Storage.FileAccessMode.Read);
_mediaElement.SetSource(stream, storageFile.ContentType);
Your MediaElement isn't part of the visual tree of your page. As a consequence, you run into strange behavior: setting the volume or the position won't work correctly.
As a solution, you can either create the MediaElement in your XAML file or add it to the visual tree from your code-behind (something like contentGrid.Children.Add(_mediaElement)). In the latter case you probably have to remove it before navigating to another page; otherwise it might not play the next time you navigate back.