Recording video simultaneously with audio in Chrome blocks the main thread, causing invalid audio - WebRTC

So, I have what I think is a fairly interesting and, hopefully, not intractable problem. I have an audio/video getUserMedia stream that I am recording in Chrome. Individually, the tracks record perfectly well. However, when attempting to record both, one blocks the main thread, hosing the other. I know that there is a way to resolve this. Muaz Khan has a few demos that seem to work without blocking.
Audio is recorded via the Web Audio API. I am piping the audio track into a ScriptProcessor node, which converts it to 16-bit mono and streams it to a Node.js server.
Video is recorded via the usual canvas hack and Whammy.js. While recording, video frames are drawn to a canvas and the resulting image data is pushed into a frames array, which is later encoded into a WebM container by Whammy and then uploaded to the Node.js server.
The two are then muxed together via ffmpeg server-side and the result stored.
The ideas I've had so far are:
Delegate one to a worker thread. Unfortunately, both the canvas and the stream are tied to the DOM as far as I know.
Install a headless browser in Node.js and establish an RTC connection with the client, thereby exposing the entire stream server-side.
The entire situation will eventually be solved by the Audio Worker implementation, but the working group seems to have stalled public progress updates on that while things are shuffled around a bit.
Any suggestions for resolving the thread blocking?
Web Audio Connections:
var context = new AudioContext();
var source = context.createMediaStreamSource(stream);
// ScriptProcessor with a 2048-sample buffer, mono in and out
var node = context.createScriptProcessor(2048, 1, 1);
node.onaudioprocess = audioProcess;
source.connect(node);
// the processor must be connected to a destination or onaudioprocess never fires
node.connect(context.destination);
Web Audio Processing:
function audioProcess(e) {
    if (!recording.audio) return;
    // grab the mono channel and ship it to the server as 16-bit PCM
    var leftChannel = e.inputBuffer.getChannelData(0);
    Socket.emit('record-audio', convertFloat32ToInt16(leftChannel));
}
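The convertFloat32ToInt16 helper isn't shown in the question; a common implementation (assumed here, not necessarily the author's exact code) clamps each Float32 sample to [-1, 1] and scales it to a signed 16-bit integer:
function convertFloat32ToInt16(buffer) {
    var result = new Int16Array(buffer.length);
    for (var i = 0; i < buffer.length; i++) {
        // clamp, then scale to the 16-bit signed range
        var s = Math.max(-1, Math.min(1, buffer[i]));
        result[i] = s < 0 ? s * 0x8000 : s * 0x7FFF;
    }
    return result.buffer; // ArrayBuffer, ready to emit over the socket
}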
Video Frame Buffering:
function drawFrame() {
    if (recording.video) {
        // paint the current video frame onto the recording canvas
        players.canvas.context.fillRect(0, 0, players.video.width, players.video.height);
        players.canvas.context.drawImage(players.video.element, 0, 0, players.video.width, players.video.height);
        frames.push({
            duration: 100,
            image: players.canvas.element.toDataURL('image/webp')
        });
        lastTime = new Date().getTime();
        requestAnimationFrame(drawFrame);
    } else {
        requestAnimationFrame(getBlob);
    }
}
Update: I've since managed to stop the two operations from completely blocking one another, but there's still enough contention to distort my audio.

There are a few key things that allow for successful getUserMedia recording in Chrome at the moment, pulled together from the helpful comments attached to the original question and from my own experience.
When harvesting data from the recording canvas, encode as JPEG. I had been using WebP to satisfy the requirements of Whammy.js, but generating a WebP data URI is apparently a cycle hog.
Delegate as much of the non-DOM work as possible to worker threads. This is especially true of any streaming/upload operations (e.g., streaming audio samples over WebSockets).
Avoid requestAnimationFrame as a means of driving the recording canvas. It is resource intensive and, as Aldel pointed out, can fail if the user switches tabs. Using setInterval is much more efficient and reliable, and it also allows for better frame-rate control (see the sketch after this list).
For Chrome at least, avoid client-side AV encoding for the time being. Stream audio samples and video frames to the server for processing. While client-side AV encoding libraries are very cool, they simply don't seem efficient enough for production quite yet.
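A rough sketch of the setInterval-based capture mentioned above (the players, frames and recording objects follow the question's code; the 100 ms interval and JPEG quality are illustrative, not prescriptive):
var captureTimer = setInterval(function () {
    if (!recording.video) {
        clearInterval(captureTimer);
        return;
    }
    players.canvas.context.drawImage(players.video.element, 0, 0,
        players.video.width, players.video.height);
    frames.push({
        duration: 100, // keep in step with the capture interval
        image: players.canvas.element.toDataURL('image/jpeg', 0.7) // far cheaper than webp
    });
}, 100);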
Also, for Node.js ffmpeg automation, I highly recommend fluent-ffmpeg. Special thanks to Benjamin Trent for some practical examples.
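For reference, a minimal fluent-ffmpeg muxing sketch on the Node.js side (file names are placeholders; it assumes the client uploaded a video-only WebM and the streamed audio samples were written out to a WAV file):
var ffmpeg = require('fluent-ffmpeg');

ffmpeg('capture.webm')            // video-only webm uploaded by the client
    .input('capture.wav')         // 16-bit PCM audio assembled from the socket stream
    .videoCodec('copy')           // don't re-encode the video
    .audioCodec('libvorbis')
    .on('end', function () { console.log('mux complete'); })
    .on('error', function (err) { console.error(err.message); })
    .save('recording.webm');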

@aldel is right. Increasing the bufferSize value fixes it, e.g. bufferSize = 16384.
Try this demo in Chrome and record audio+video: you'll hear clear recorded WAV audio in parallel with 720p video frames.
BTW, I agree with jesup that MediaRecorder solutions should be preferred.
The Chromium team is very close, and hopefully M47/48 will bring a MediaRecorder implementation, at least for video (VP8) recording.
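For reference, a minimal MediaRecorder sketch (assuming stream is the getUserMedia stream and that the browser accepts the requested MIME type; the one-second timeslice is illustrative):
var chunks = [];
var recorder = new MediaRecorder(stream, { mimeType: 'video/webm;codecs=vp8' });

recorder.ondataavailable = function (e) {
    if (e.data && e.data.size > 0) chunks.push(e.data);
};
recorder.onstop = function () {
    var blob = new Blob(chunks, { type: 'video/webm' });
    // upload the blob instead of hand-built Whammy output
};

recorder.start(1000); // deliver data in one-second chunks
// ... later:
recorder.stop();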
There is a Chrome-compatible alternative to Whammy.js as well:
https://github.com/streamproc/MediaStreamRecorder/issues/43

Related

Playing a live stream from a media server in an Android application

My setup is as follows:
OBS Studio to create the video feed
Ant Media Server to distribute the stream
Now I'm building an app that will display this stream. I'm currently using ExoPlayer, but I'm having a hard time getting it to work for both RTMP and HLS. I read somewhere that I could embed a web player in my app; would that be easier? Here is my code for ExoPlayer:
// RTMP URL
String url = "rtmp://192.168.1.244/WebRTCApp/379358104902020985845622";

BandwidthMeter bandwidthMeter = new DefaultBandwidthMeter();
TrackSelection.Factory videoTrackSelectionFactory =
        new AdaptiveTrackSelection.Factory();
TrackSelector trackSelector =
        new DefaultTrackSelector(videoTrackSelectionFactory);

SimpleExoPlayer player = ExoPlayerFactory.newSimpleInstance(this, trackSelector);
PlayerView playerView = findViewById(R.id.simple_player);
playerView.setPlayer(player);

// Create RTMP data source
RtmpDataSourceFactory rtmpDataSourceFactory = new RtmpDataSourceFactory();
MediaSource videoSource = new ExtractorMediaSource.Factory(rtmpDataSourceFactory)
        .createMediaSource(Uri.parse(url));

player.prepare(videoSource);
player.setPlayWhenReady(true);
Any help on this would be much appreciated.
Most online video streaming services use Adaptive Bit Rate (ABR) streaming protocols to deliver the video, mainly HLS and DASH these days.
Most media players, like ExoPlayer, support these protocols well, although they are complex and evolving protocols, so there are always edge cases.
Many video conferencing applications use WebRTC, which is a real-time-optimised protocol - the usual approach is to use a WebRTC client for this type of stream.
The difference between the two approaches from a streaming latency point of view, at a very high level, is:
ABR protocols prioritise quality and avoiding interruptions, buffering enough of the video to try to guarantee uninterrupted playback. They are usually aimed at movie and live video streaming services. Even for low-latency implementations, the latency is measured in multiple seconds or more.
WebRTC prioritises latency and sacrifices quality if necessary. It is typically aimed at real-time-sensitive applications like video conferencing, where it is important not to fall behind the discussion even if that means a temporary video glitch or a brief interruption. Latency is usually sub-second.
Ant Media Server comes from the WebRTC side, although recent versions support HLS/CMAF and Low Latency DASH (these generally still have higher latency than WebRTC, as noted above).
For your service, if you are able to use a DASH or HLS stream, you may find that an easier path with ExoPlayer. If you look at the demo app, for example, you will see DASH and HLS streams but no RTMP ones. You can easily extend or modify the demo app to play your own HLS or DASH stream, and this is often an easy way to start - look at the sample material in assets/media.exolist.json and add your own URL:
https://github.com/google/ExoPlayer/blob/aeb306a164911aa1491b46c2db4da0d329c83c65/docs/demo-application.md
However, ExoPlayer should also support RTMP via an extension if this is your preferred route - there is a specific extension that allows this:
https://github.com/google/ExoPlayer/blob/0ba317b1337eaa789f05dd6c5241246478a3d1e5/extensions/rtmp/README.md
In theory you simply need to add this dependency to your application - quoting the extension's README:
"If your application is using DefaultDataSource or DefaultDataSourceFactory, adding support for RTMP streams is as simple as adding a dependency to the RTMP extension."
It would be worth checking the issues list in this repository for any recent issues and/or workarounds.

Stream html5 camera output

Does anyone know how to stream HTML5 camera output to other users?
If that's possible, should I use sockets and stream images to the users, or some other technology?
Is there any video tutorial where I can take a look at this?
Many thanks.
The two most common approaches now are most likely:
stream from the source to a server, and allow users to connect to the server to stream to their devices, typically using some form of Adaptive Bit Rate (ABR) streaming protocol (ABR basically creates multiple bit rate versions of your content and chunks them, so the client can choose the next chunk at the best bit rate for the device and current network conditions).
Stream peer to peer, or via a conferencing hub, using WebRTC
In general, the latter is more focused on real time: any delay should be below the threshold that would interfere with audio and video conferences, usually less than 200 ms for audio, for example. To achieve this it may have to sacrifice quality at times, especially video quality.
There are some good WebRTC samples available online (here at the time of writing): https://webrtc.github.io/samples/
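As a rough illustration of the WebRTC route (a sketch only - it omits the signalling channel you would need to exchange the offer/answer and ICE candidates between peers, and the STUN server is just an example):
async function startPublishing() {
    var stream = await navigator.mediaDevices.getUserMedia({ video: true, audio: true });
    var pc = new RTCPeerConnection({ iceServers: [{ urls: 'stun:stun.l.google.com:19302' }] });

    // add the camera/microphone tracks to the peer connection
    stream.getTracks().forEach(function (track) { pc.addTrack(track, stream); });

    var offer = await pc.createOffer();
    await pc.setLocalDescription(offer);
    // send pc.localDescription to the remote peer (or conferencing hub) over your
    // signalling layer, then apply the answer with pc.setRemoteDescription(...)
}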

WebRTC video and photo at same time

I'm working on an application that transmits low-quality video using WebRTC. Periodically I want to send a single high-resolution frame from the same camera.
When I try to acquire another stream using getUserMedia I get the same low-quality one, and when I try to pass constraints to force a higher resolution, the operation fails with an OverconstrainedError (even though it works fine when there is no other stream).
Is it even possible to have multiple streams with different parameters from the same device at the same time? Or is it possible to acquire a high-resolution image without requesting a new stream?
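For reference, the kind of second request described above looks roughly like this (the constraint values are examples, not the asker's actual code):
navigator.mediaDevices.getUserMedia({
    video: { width: { exact: 1920 }, height: { exact: 1080 } }
}).then(function (hiResStream) {
    // grab the high-resolution frame here
}).catch(function (err) {
    // with the low-resolution stream already open, this is where the
    // OverconstrainedError described above is reported
    console.error(err.name);
});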

iOS: stream to rtmp server from GPUImage

Is it possible to stream video and audio to an rtmp:// server with GPUImage?
I'm using GPUImageVideoCamera and would love to stream (video + audio) directly to an RTMP server.
I tried VideoCore, which streams perfectly to e.g. YouTube, but whenever I try to overlay the video with different images I run into performance problems.
It seems as though GPUImage does a really great job there, but I don't know how to stream with it. I found issues on VideoCore about feeding VideoCore from GPUImage, but I don't have a starting point for how that's implemented...

Play audio stream using WebAudio API

I have a client/server audio synthesizer where the server (java) dynamically generates an audio stream (Ogg/Vorbis) to be rendered by the client using an HTML5 audio element. Users can tweak various parameters and the server immediately alters the output accordingly. Unfortunately the audio element buffers (prefetches) very aggressively so changes made by the user won't be heard until minutes later, literally.
Trying to disable preload has no effect, and apparently this setting is only 'advisory', so there's no guarantee that its behaviour would be consistent across browsers.
I've been reading everything that I can find on WebRTC and the evolving WebAudio API and it seems like all of the pieces I need are there but I don't know if it's possible to connect them up the way I'd like to.
I looked at RTCPeerConnection; it does provide low latency, but it brings in a lot of baggage that I don't want or need (STUN, ICE, offer/answer, etc.), and currently it seems to support only a limited set of codecs, mostly geared towards voice. Also, since the server side is in Java, I think I'd have to do a lot of work to teach it to 'speak' the various protocols and formats involved.
AudioContext.decodeAudioData works great for a static sample, but not for a stream since it doesn't process the incoming data until it's consumed the entire stream.
What I want is the exact functionality of the audio tag (i.e. HTMLAudioElement) without any buffering. If I could somehow create a MediaStream object that uses the server URL for its input then I could create a MediaStreamAudioSourceNode and send that output to context.destination. This is not very different than what AudioContext.decodeAudioData already does, except that method creates a static buffer, not a stream.
I would like to keep the Ogg/Vorbis compression and eventually use other codecs, but one thing I may try next is to send raw PCM and build audio buffers on the fly, just as if they were being generated programmatically by JavaScript code. But again, I think all of the parts already exist, and if there's any way to leverage that I would be most thrilled to know about it!
Thanks in advance,
Joe
How are you getting on? Did you resolve this question? I am solving a similar challenge. On the browser side I'm using the Web Audio API, which has nice ways to render streaming audio input data, and Node.js on the server side, using WebSockets as the middleware to send the browser streaming PCM buffers.
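A sketch of the approach that comment describes: the server pushes raw PCM (assumed here to be Float32, mono, 44.1 kHz) over a WebSocket and the client schedules each chunk back-to-back with AudioBufferSourceNodes. The endpoint URL and sample format are assumptions, not part of either poster's code:
var ctx = new AudioContext();
var playTime = 0;
var ws = new WebSocket('ws://localhost:8080/audio'); // hypothetical endpoint
ws.binaryType = 'arraybuffer';

ws.onmessage = function (msg) {
    var samples = new Float32Array(msg.data);
    var buffer = ctx.createBuffer(1, samples.length, 44100); // mono, 44.1 kHz
    buffer.getChannelData(0).set(samples);

    var src = ctx.createBufferSource();
    src.buffer = buffer;
    src.connect(ctx.destination);

    // schedule each chunk immediately after the previous one to avoid gaps
    playTime = Math.max(playTime, ctx.currentTime);
    src.start(playTime);
    playTime += buffer.duration;
};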