How do audio and video in a WebRTC peer connection stay in sync?

How do audio and video in a WebRTC peer connection stay in sync? I am using an API which publishes audio and video (I assume as one peer connection) to a media server. The audio can occasionally go out of sync by up to 200ms. I am attributing this to the possibility that the audio and video are separate streams, which would account for why the sync can drift.

In addition to Sean's answer:
The WebRTC player in browsers has a very low tolerance for the timestamp difference between arriving audio and video samples. Your audio and video streams must be aligned (interleaved) precisely, i.e. the timestamp of the last audio sample received from the network should be within roughly ±200ms of the timestamp of the last video frame received from the network. Otherwise the WebRTC player will stop using the NTP timestamps and will play the streams individually. This is because the WebRTC player tries to keep latency at a minimum; I'm not sure it was a good decision by the WebRTC team. If your bandwidth is not sufficient, or if the live encoder produces streams that are not timestamp-aligned, then you will get out-of-sync playback. In my opinion, the WebRTC player could have a setting for whether to use that tolerance value or to always play in sync, using the NTP timestamps, at the expense of latency.
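As a rough illustration of the decision described above, here is a minimal sketch of that tolerance check in Java. The 200 ms constant and the class/method names are assumptions for the example, not the actual libwebrtc code, and both timestamps are assumed to already be mapped onto a common clock (see the Sender Report sketch further down).

// Minimal sketch of the sync-tolerance decision described above.
// The threshold and names are illustrative, not the libwebrtc implementation.
final class AvSyncCheck {

    // Assumed tolerance between the latest audio and video capture times.
    static final long MAX_AV_DRIFT_MS = 200;

    /**
     * @param lastAudioCaptureMs capture time of the last received audio sample,
     *                           already mapped to the common (NTP) clock
     * @param lastVideoCaptureMs capture time of the last received video frame,
     *                           mapped to the same clock
     * @return true if the player keeps lip-syncing the streams, false if it
     *         falls back to playing them individually to protect latency
     */
    static boolean canLipSync(long lastAudioCaptureMs, long lastVideoCaptureMs) {
        long driftMs = Math.abs(lastAudioCaptureMs - lastVideoCaptureMs);
        return driftMs <= MAX_AV_DRIFT_MS;
    }
}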

RTP/RTCP (which WebRTC uses) traditionally uses the RTCP Sender Report. That allows each SSRC stream to be synced on an NTP timestamp. Browsers do use them today, so things should work.
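For context, each RTCP Sender Report carries a pair of timestamps per SSRC: the sender's NTP wall-clock time and the RTP timestamp sampled at the same instant. That pair is what lets a receiver put audio (e.g. 48 kHz Opus) and video (90 kHz) timestamps onto one shared clock and delay the earlier stream to line them up. Here is a minimal sketch of the mapping in Java, assuming millisecond NTP values and ignoring 32-bit RTP timestamp wraparound; the class and method names are illustrative only.

final class RtpToWallClock {

    // The (NTP time, RTP timestamp) pair carried in an RTCP Sender Report.
    static final class SenderReport {
        final long ntpTimeMs;    // sender's NTP timestamp, converted to milliseconds
        final long rtpTimestamp; // RTP timestamp sampled at the same instant

        SenderReport(long ntpTimeMs, long rtpTimestamp) {
            this.ntpTimeMs = ntpTimeMs;
            this.rtpTimestamp = rtpTimestamp;
        }
    }

    // Maps an RTP timestamp from an incoming packet to wall-clock milliseconds,
    // using the most recent Sender Report for the same SSRC.
    // clockRateHz is 90_000 for video and 48_000 for Opus audio.
    static long toWallClockMs(long rtpTimestamp, SenderReport sr, int clockRateHz) {
        long deltaTicks = rtpTimestamp - sr.rtpTimestamp; // may be negative
        long deltaMs = deltaTicks * 1000L / clockRateHz;
        return sr.ntpTimeMs + deltaMs;
    }
}

Once both streams are expressed on that shared clock, the player can compare them directly, which is exactly the comparison the tolerance check above performs.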
Are you doing any protocol bridging or anything that could be RTP only? What Media Server are you using?

Playing a Live stream from media server on android application

My setup is as follows:
OBS Studio to create the video feed
Ant Media Server to distribute the stream
Now I'm building an app that will display this stream. I'm currently using ExoPlayer, but I'm having a hard time getting it to work for both RTMP and HLS. I read somewhere that I could embed a web player in my app; would that be easier? Here is my code for ExoPlayer:
// RTMP URL of the stream published to Ant Media Server
String url = "rtmp://192.168.1.244/WebRTCApp/379358104902020985845622";

// Build the player with an adaptive track selector
// (bandwidthMeter is created but not used anywhere below)
BandwidthMeter bandwidthMeter = new DefaultBandwidthMeter();
TrackSelection.Factory videoTrackSelectionFactory =
        new AdaptiveTrackSelection.Factory();
TrackSelector trackSelector =
        new DefaultTrackSelector(videoTrackSelectionFactory);
SimpleExoPlayer player = ExoPlayerFactory.newSimpleInstance(this, trackSelector);

// Attach the player to the view from the layout
PlayerView playerView = findViewById(R.id.simple_player);
playerView.setPlayer(player);

// Create the RTMP data source (requires the ExoPlayer RTMP extension)
RtmpDataSourceFactory rtmpDataSourceFactory = new RtmpDataSourceFactory();
MediaSource videoSource = new ExtractorMediaSource.Factory(rtmpDataSourceFactory)
        .createMediaSource(Uri.parse(url));

// Prepare and start playback
player.prepare(videoSource);
player.setPlayWhenReady(true);
Any help on this would be much appreciated.
Most online video streaming uses Adaptive Bit Rate streaming (ABR) protocols to deliver the video, mainly HLS and DASH these days.
Most media players, like ExoPlayer, support these protocols well, although they are complex and evolving protocols, so there are always edge cases.
Many video conferencing applications use WebRTC which is a real time optimised protocol - the usual approach is to use a WebRTC client for this type of stream.
The difference between the two approaches from a streaming latency point of view, at a very high level, is:
ABR protocols prioritise quality and avoiding interruptions, buffering enough of the video to try to guarantee uninterrupted playback. They are usually aimed at movie and live video streaming services. Even for low-latency implementations, the latency is measured in multiple seconds or more.
WebRTC prioritises latency and sacrifices quality if necessary. It is typically aimed at real-time-sensitive applications like video conferencing, where it is important not to fall behind the discussion even if that means a temporary video glitch or a brief interruption in video. Latency is usually sub-second.
Ant Media Server comes from the WebRTC side, although recent versions support HLS/CMAF and Low Latency DASH (these still generally have higher latency than WebRTC, as noted above).
For your service, if you are able to use a DASH or HLS stream, you may find that it is an easier path with ExoPlayer. If you look at the demo app, for example, you will see DASH and HLS streams but no RTMP ones. You can easily extend or modify the demo app to play your own HLS or DASH stream, and this is often an easy way to start: look at the sample material in assets/media.exolist.json and add your own URL:
https://github.com/google/ExoPlayer/blob/aeb306a164911aa1491b46c2db4da0d329c83c65/docs/demo-application.md
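For example, using the same ExoPlayer 2.x style APIs as in your snippet, an HLS source might look like the sketch below. The URL is a placeholder (Ant Media Server typically exposes HLS under your app name and stream id, but the exact path depends on your server configuration), and HlsMediaSource requires the ExoPlayer HLS library module as a dependency.

// Placeholder HLS URL; adjust host, app name and stream id to your setup
String hlsUrl = "http://192.168.1.244:5080/WebRTCApp/streams/379358104902020985845622.m3u8";

// Plain HTTP data source; the string is the user agent reported to the server
DataSource.Factory httpDataSourceFactory =
        new DefaultHttpDataSourceFactory("my-player");

// HlsMediaSource comes from the ExoPlayer HLS library module
MediaSource hlsSource = new HlsMediaSource.Factory(httpDataSourceFactory)
        .createMediaSource(Uri.parse(hlsUrl));

player.prepare(hlsSource);
player.setPlayWhenReady(true);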
However, ExoPlayer should also support RTMP via an extension if this is your preferred route - there is a specific extension that allows this:
https://github.com/google/ExoPlayer/blob/0ba317b1337eaa789f05dd6c5241246478a3d1e5/extensions/rtmp/README.md
In theory you simply need to add this dependency to your application:
if your application is using DefaultDataSource or DefaultDataSourceFactory, adding support for RTMP streams is as simple as adding a dependency to the RTMP extension
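As a sketch of what that looks like in code, assuming the RTMP extension is already on the classpath and using the ExoPlayer 2.x class names from your snippet (context here stands for your Activity or Application):

// DefaultDataSourceFactory handles http:// and file:// URIs and, once the RTMP
// extension is a dependency, rtmp:// URIs as well; no RTMP-specific code needed.
DataSource.Factory dataSourceFactory =
        new DefaultDataSourceFactory(context, "my-player");

MediaSource source = new ExtractorMediaSource.Factory(dataSourceFactory)
        .createMediaSource(Uri.parse("rtmp://192.168.1.244/WebRTCApp/379358104902020985845622"));

player.prepare(source);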
It would be worth checking the issues list in this repository for any recent issues and/or workarounds.

Does Google webrtc native implementation have support for SFU?

Does the Google WebRTC native implementation have support for an SFU?
Does the Google WebRTC native implementation support integrating custom/hardware encoders/decoders?
Not without alteration.
Internally, WebRTC's audio/video pipelines are directly tied to the encoders/decoders.
PeerConnectionFactory allows you to provide a video encoder/decoder factory, so you can short-circuit the logic there, grab the encoded frames, mock up a stream, and feed them directly into a new PeerConnection as a relay by setting those streams onto it.
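As a rough sketch of that hook using the Android Java bindings (org.webrtc), assuming an appContext variable and that you would substitute your own factory implementations that expose the encoded frames for relaying:

// Sketch: wiring custom video codec factories into PeerConnectionFactory
// (Android libwebrtc Java API). A real SFU relay would replace the default
// factories below with implementations that hand you the encoded frames.
PeerConnectionFactory.initialize(
        PeerConnectionFactory.InitializationOptions.builder(appContext)
                .createInitializationOptions());

EglBase eglBase = EglBase.create();

VideoEncoderFactory encoderFactory = new DefaultVideoEncoderFactory(
        eglBase.getEglBaseContext(),
        /* enableIntelVp8Encoder= */ true,
        /* enableH264HighProfile= */ true);
VideoDecoderFactory decoderFactory =
        new DefaultVideoDecoderFactory(eglBase.getEglBaseContext());

PeerConnectionFactory factory = PeerConnectionFactory.builder()
        .setVideoEncoderFactory(encoderFactory)
        .setVideoDecoderFactory(decoderFactory)
        .createPeerConnectionFactory();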
The audio end is more difficult. There isn't a codec factory, so you will probably have to short-circuit the logic there by altering libwebrtc.
The final question is RTCP termination, and how to override the mechanisms for quality/bandwidth control so as not to create a "one goes out, they all go out" situation.
Since libwebrtc will be the SFU, it will receive RTCP feedback from its remote peer for the content it is proxying, and vice versa.
For a 1-1 situation, it needs to be able to forward the RTCP feedback to the remote peer.
For multipoint, it needs to perform some logic to determine whether one of the peers is problematic, and then stop sending it video, switch off its video feed, or attempt to switch to a lower-bitrate video stream. Basically, it needs to act as a conduit that attempts to predict why/how packet loss is occurring, and keep as many audio/video feeds as possible operating normally at the highest possible quality for each peer.
As for how exactly to hijack the RTCP feedback mechanisms in libwebrtc, I think that will again require some customization of, and hooks into, libwebrtc.
I think it will be easier to try the GStreamer implementation of WebRTC. Although it is still in the "Bad Plugins" set, it is much easier to get or provide encoded audio and video. It was actually implemented with that in mind: to make implementing an MCU or SFU easier.

Stream html5 camera output

Does anyone know how to stream HTML5 camera output to other users?
If that's possible, should I use sockets and stream images to the users, or some other technology?
Is there any video tutorial where I can learn about this?
Many thanks.
The two most common approaches now are most likely:
Stream from the source to a server, and allow users to connect to that server to stream to their devices, typically using some form of Adaptive Bit Rate (ABR) streaming protocol. ABR basically creates multiple bit rate versions of your content and chunks them, so the client can choose the next chunk at the best bit rate for the device and current network conditions.
Stream peer to peer, or via a conferencing hub, using WebRTC
In general, the latter is more focused on real time: any delay should stay below the threshold that would interfere with audio and video conferences, usually less than 200 ms for audio, for example. To achieve this it may sometimes have to sacrifice quality, especially video quality.
There are some good WebRTC samples available online (here at the time of writing): https://webrtc.github.io/samples/

WebRtc stream without loss of quality

My web application records video streams on the server side using WebRTC and Kurento Media Server. It's just writing the raw stream received from the client to disk. But I was faced with the fact that the quality of the video falls dramatically, all because of codecs and compression. Is it possible to send video without compression at all? The frame rate is not important to me; 5 FPS is plenty for my purpose. The main criterion is 100% quality, or close to it. How do I achieve this? Is there any codec that compresses without loss of video quality?
The server side of my app is written in Java with Spring.

How do Chrome/Firefox handle SRTCP reports coming from a WebRTC connection?

SRTCP tracks the number of sent and lost bytes and packets, last received sequence number, inter-arrival jitter for each SRTP packet, and other SRTP statistics.
Do the mentioned browsers do something with SRTCP reports when dealing with an audio stream, for example adjust the bitrate on the fly if network conditions change?
Given that Chrome does adjust the bitrate and resolution of VP8 on the fly within a connection, I would assume that Opus configurations are changed within the connection as well.
You can see the feedback on the sending audio in this image. The bitrate obviously drops slightly when using Opus. However, I would imagine that the video bitrate would be the first thing changed in a video call, as changing it would have the greater effect.
Obviously, one cannot change the bitrate on a codec that only supports constant bitrates.
All the other stats are a combination of what the RTCP reports give (packetsLost, RTT, bits sent, etc.) and Google's monitoring of the inputs/outputs (audio level, echo cancellation, etc.).
NOTE: this is taken from a session created by AppRTC in Chrome.
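If you want to read those RTCP-derived values programmatically rather than from a stats page, the getStats API exposes them. Here is a rough sketch using the Android Java bindings (org.webrtc) rather than the browser API; the stats types and member keys follow the standard WebRTC stats identifiers, but treat the exact names as assumptions that can vary between library versions.

// Sketch: reading RTCP-derived stats from a PeerConnection (Android Java bindings).
peerConnection.getStats(report -> {
    for (RTCStats stats : report.getStatsMap().values()) {
        // "remote-inbound-rtp" entries are built from the remote peer's RTCP
        // receiver reports: packetsLost, jitter, roundTripTime, etc.
        if ("remote-inbound-rtp".equals(stats.getType())) {
            Object packetsLost = stats.getMembers().get("packetsLost");
            Object jitter = stats.getMembers().get("jitter");
            Log.d("RtcpStats", "lost=" + packetsLost + " jitter=" + jitter);
        }
    }
});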