Client-server real-time audio streaming - UDP

I'm trying to develop a client/server real-time audio processing application. The client sends streaming audio to the server, and the server sends the modified streaming audio back to the client. Low latency is important to me, so TCP is not suitable as the transport protocol.
Right now I'm using gRPC, but the delay is too high. I tried to find a project similar to gRPC but based on UDP and didn't find one, and I don't have the resources to build a good UDP framework myself. Then I learned about RTSP (RTP), but I didn't understand whether it can be used for bidirectional streaming. Please point me in the right direction, and recommend open source projects if they exist. Thanks.

Related

Is it possible to save a video stream between two peers in WebRTC on the server, in real time?

Suppose I have 2 peers exchanging video with WebRTC. Now I need both of the streams to be saved as video files on a central server. Is it possible to do it in real time? (Storing/uploading the video from the peers is not an option.)
I thought of making a 3-node WebRTC connection, with the 3rd node being the server. This way, I can screen-record the 3rd node's stream or save it some other way. But I am not sure about the reliability/feasibility of that implementation.
This is for a mobile application, and I would like to avoid any method that involves uploading/saving.
PS: I'm using Agora.io for the video conferencing.
In my opinion, you can do it like the record demo: https://webrtc.github.io/samples/src/content/getusermedia/record/.
Record each stream to blobs and push them to your server over a WebSocket,
then convert the blobs into a WebM file on the server (or just attach them to a video element). A sketch of this is shown below.
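For example, a minimal sketch along the lines of that demo (the WebSocket URL, MIME type, and chunk interval are placeholders, and stream is assumed to come from getUserMedia):

// Record the stream in chunks and push each chunk to the server
// over a binary WebSocket.
const ws = new WebSocket('wss://example.com/record');   // placeholder URL
ws.binaryType = 'arraybuffer';

const recorder = new MediaRecorder(stream, { mimeType: 'video/webm;codecs=vp8,opus' });
recorder.ondataavailable = (event) => {
  if (event.data.size > 0 && ws.readyState === WebSocket.OPEN) {
    ws.send(event.data);               // each Blob is the next piece of a WebM stream
  }
};
recorder.start(1000);                  // emit a chunk roughly every second

On the server, appending the chunks of one session in order should give you a playable WebM file, which you can then keep as-is or convert.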
Agora doesn't offer on-premise recording out of the box, but they do provide the code for you to be able to launch your own on-premise recording using your own server. Agora has the code and instructions to deploy on GitHub: https://github.com/AgoraIO/Basic-Recording
The way it works: once you have set up the Agora Recording SDK, the client triggers the recording to start, either via user interaction (a button tap) or some other event (e.g. peer-joined or stream-subscribed). This causes the recording service to join the channel and record the streams. The service outputs the video file once recording has stopped.
You need a WebRTC media server.

WebRTC media servers make it possible to support more complex scenarios. They are servers that act as WebRTC clients but run on the server side; they are termination points for the media where we'd like to take action. Popular tasks done on WebRTC media servers include group calling, recording, broadcast and live streaming, gatewaying to other networks/protocols, server-side machine learning, and cloud rendering (gaming or 3D).

The adventurous and strong-hearted will go and develop their own WebRTC media server. Most would pick a commercial service or an open source one. For the latter, check out these tips for choosing a WebRTC open source media server framework.

In many cases, the thing developers are looking for is support for group calling, something that almost always requires a media server. In that case, you need to decide if you'd go with the classic (and now somewhat old) MCU mixing model or with the more accepted and modern SFU routing model. You will also need to think a lot about the sizing of your WebRTC media server.

For recording WebRTC sessions, you can either do that on the client side or the server side. In both cases you'll be needing a server, but what that server is and how it works will be very different in each case.

If it is broadcasting you're after, then you need to think about the broadcast size of your WebRTC session.

Link: https://bloggeek.me/webrtc-server/

Live streaming audio with WebRTC browser => server

I'm trying to send an audio stream from my browser to a server (over UDP; I also tried WebSockets).
I'm recording the audio stream with WebRTC, but I have problems transmitting the data from a Node.js client to my server.
Any idea? Is it possible to send an audio stream to the server using WebRTC (OpenWebRTC)?
To get audio from the browser to the server, you have a few different possibilities.
Web Sockets
Simply send the audio data over a binary web socket to your server. You can use the Web Audio API with a ScriptProcessorNode to capture raw PCM and send it losslessly. Or, you can use the MediaRecorder to record the MediaStream and encode it with a codec like Opus, which you can then stream over the Web Socket.
There is a sample for doing this with video over on Facebook's GitHub repo. Streaming audio only is conceptually the same thing, so you should be able to adapt the example.
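As a rough sketch of the raw PCM variant (ScriptProcessorNode is deprecated in favor of AudioWorklet but still widely available; the WebSocket URL and buffer size here are placeholders):

// Capture raw PCM with the Web Audio API and stream it to the server
// over a binary WebSocket.
const ws = new WebSocket('wss://example.com/audio');       // placeholder URL
ws.binaryType = 'arraybuffer';

navigator.mediaDevices.getUserMedia({ audio: true }).then((stream) => {
  const ctx = new AudioContext();
  const source = ctx.createMediaStreamSource(stream);
  const processor = ctx.createScriptProcessor(4096, 1, 1); // 4096-frame buffers, mono in/out

  processor.onaudioprocess = (event) => {
    const pcm = event.inputBuffer.getChannelData(0);        // Float32Array of raw samples
    if (ws.readyState === WebSocket.OPEN) {
      ws.send(pcm.slice(0));                                // copy, since the buffer gets reused
    }
  };

  source.connect(processor);
  processor.connect(ctx.destination);                       // keeps the processor node running
});

The server then receives 32-bit float samples at ctx.sampleRate (typically 44.1 or 48 kHz), which it has to know in order to interpret or transcode the data.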
HTTP (future)
In the near future, you'll be able to use a WritableStream as the request body with the Fetch API, allowing you to make a normal HTTP PUT with a stream source from a browser. This is essentially the same as what you would do with a Web Socket, just without the Web Socket layer.
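For reference, the request-streaming API that has started to ship (e.g. in Chromium-based browsers) ended up taking a ReadableStream as the body rather than a WritableStream; a minimal, hedged sketch with a placeholder URL:

// Stream audio chunks to the server as the body of a single HTTP PUT,
// in browsers that support fetch() request streaming.
const { readable, writable } = new TransformStream();

fetch('https://example.com/audio', {     // placeholder URL
  method: 'PUT',
  body: readable,
  duplex: 'half',                        // required when the request body is a stream
});

const writer = writable.getWriter();
// Whenever an encoded chunk is available (e.g. from MediaRecorder):
//   writer.write(new Uint8Array(chunk));
// and when the stream ends:
//   writer.close();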
WebRTC (data channel)
With a WebRTC connection and the server as a "peer", you can open a data channel and send that exact same PCM or encoded audio that you would have sent over Web Sockets or HTTP.
There's a ton of complexity added to this with no real benefit. Don't use this method.
WebRTC (media streams)
WebRTC calls support direct handling of MediaStreams. You can attach a stream and let the WebRTC stack take care of negotiating a codec, adapting for bandwidth changes, dropping data that doesn't arrive, maintaining synchronization, and negotiating connectivity around restrictive firewall environments. While this makes things easier on the surface, that's a lot of complexity as well. There aren't any packages for Node.js that expose the MediaStreams to you, so you're stuck dealing with other software... none of it as easy to integrate as it could be.
Most folks going this route will execute GStreamer as an RTP server to handle the media component. I'm not convinced this is the best way, but it's the best way I know of at the moment.
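For example, a minimal GStreamer receive pipeline for Opus over RTP might look roughly like this (the port, payload type, and caps are placeholders that have to match whatever the sender actually emits):

gst-launch-1.0 udpsrc port=5004 caps="application/x-rtp,media=audio,clock-rate=48000,encoding-name=OPUS,payload=111" ! rtpopusdepay ! opusdec ! autoaudiosink

Swap autoaudiosink for whatever processing, recording, or re-encoding element you actually need on the server.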

VoIP app development in Xamarin with an XMPP server

I want to develop a VoIP app with Xamarin and an XMPP server.
So far the only things I have found are Openfire and "Jitsi Meet" for the server side and Matrix for the client side. But Matrix has nothing to do with voice streaming and is just for text messaging, and "Jitsi Meet" doesn't have any SDK for the .NET client side.
I have also found Red5 Pro, but it has client SDKs only for the native Android and iOS development platforms and nothing for Mono.
So what should I look for?
First, let's clarify some basics:
Openfire is an XMPP server. Basically, this is all you need on the server side for basic VoIP support.
Alternatives include ejabberd and Prosody.
Jitsi Meet essentially already is a VoIP app, so if you want to develop your own, you don't really need that.
"Jitsi Videobridge" on the other hand can be used to provide a relay server for video conferences. For your first steps with a simple VoIP app, you won't need that either, but if you want your users to be able to create video conferences with many participants, then this helps.
(Explanation: Normally, when you create a P2P video conference, you have two options: either all users send their video data to all participants (so everybody needs lots of bandwidth), or you pick one participant ("host") that receives the video streams of every participant and sends them on to every other participant. In the second case, a normal participant only has to upload their stream once and download n streams, whereas the host does most of the work, so only that one user needs high bandwidth.

Jitsi Videobridge can run on a server and act as this conference host (a server usually has much better bandwidth than a home user), so that none of the participants has to act as the host.

In simple VoIP applications (without video), this may not be necessary, as audio streams are usually much smaller than video streams.)
Now, as I said above, in order to write a VoIP app, you basically only need an XMPP server (Openfire, Prosody and ejabberd should all be sufficient for this use case), a client library that supports Jingle, and client libraries for the RTP media streams (transfer and display).
Jingle is the name of an XMPP protocol extension that enables the negotiation of P2P data streams as they are needed for a VoIP call.
The relevant protocol specifications:
XEP-0166: Jingle
XEP-0167: Jingle RTP Sessions
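To give a feel for what this negotiation looks like on the wire, here is a trimmed-down, hypothetical session-initiate stanza for an audio call (the JIDs, session id, and payload type are made up; see XEP-0166/0167 for the authoritative examples):

<iq to='callee@example.com/phone' type='set' id='jingle1'>
  <jingle xmlns='urn:xmpp:jingle:1' action='session-initiate'
          initiator='caller@example.com/laptop' sid='a73sjjvkla37jfea'>
    <content creator='initiator' name='voice'>
      <description xmlns='urn:xmpp:jingle:apps:rtp:1' media='audio'>
        <payload-type id='111' name='opus' clockrate='48000' channels='2'/>
      </description>
      <transport xmlns='urn:xmpp:jingle:transports:ice-udp:1'/>
    </content>
  </jingle>
</iq>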
So what you need to find is an XMPP library with support for the Jingle protocol. The C# Matrix XMPP SDK (not to be confused with the "Matrix protocol", which is a different protocol and has nothing to do with XMPP except for having a common goal) is one example of such a library. According to their web site, there is support for Jingle, but I couldn't find any documentation about it.
However, as I mentioned above, Jingle is only about how to negotiate data streams, not the data streams and VoIP itself.
So what that library probably helps you with is parsing the Jingle XMPP messages that are needed to set up an RTP data stream.
For displaying and transferring the RTP stream, however, you need additional libraries. For that, have a look at the following SO questions and answers:
Open Source .net C# library for Real Time transport Protocol
Streaming Avi files from C# using RTP
I hope I could give you some useful hints...

Best protocol for UDP streaming

Currently, I'm using open source code from OBS Multiplatform for my video streaming application. But OBS uses RTMP for streaming, which only works over TCP. I want to use UDP since this is a live streaming scenario where I want minimal latency.
What is the best way to move to UDP? I read that protocols like RTP support UDP. Is this the best option to use? How about writing simple UDP socket code to send and receive video data? Please guide me.
The reason for trying to move to UDP is to ensure minimum latency between client and server. I'm looking for the quickest way to move to UDP-based streaming from the current RTMP, which uses TCP. I'm not very experienced in network-related coding. Will a simple socket program suffice to stream video data continuously, or should I implement a protocol like RTP? Or is there another alternative? Ultimately, the streaming should be smooth with minimum latency.

WebRTC Relay Server / Broadcast multiple clients

I've got WebRTC peer to peer working but when I want to broadcast a single camera to multiple clients obviously peer to peer isn't suitable.
I've found solutions like
http://lynckia.com
and
http://www.medooze.com/products/mcu/webrtc-support.aspx
But the first I can't get set up (and it seems to have cross-browser issues),
and the second just feels like we're hitting a nail with a nuclear missile.
All I need is a relay, I don't need to decode / recode streams.
I just need
The Broadcaster to connect to the server (peer to peer)
The clients to connect to the server (peer to peer)
The server to relay the stream from the broadcaster to the clients.
Is there any software out there that offers this solution that I've missed? Is there a working and scalable alternative?
Thanks
Jitsi Video Bridge works pretty much exactly how you describe.
On your server you can run Janus, to which your broadcaster can provide a stream via RTP.
Have a look at an example configuration file.
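For instance, a mountpoint in the streaming plugin's configuration might look roughly like this (the section name, id, ports, and payload types below are illustrative placeholders, not values taken from the Janus docs):

[broadcast-sample]
type = rtp
id = 1
description = RTP stream pushed by the broadcaster
audio = yes
audioport = 5002
audiopt = 111
audiortpmap = opus/48000/2
video = yes
videoport = 5004
videopt = 100
videortpmap = VP8/90000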
After writing a configuration file which defines how the server receives the stream from the broadcaster, you should be able to launch Janus in the background via a command-line interface tool:
$ janus --daemon --config=config_file.conf
Also, see the streaming test demo.
Note: I have not tested this thoroughly.
Have a look at this GitHub repo inspired by Muaz Khan's WebRTC p2p scalable broadcast. This can work great on a LAN. Over the internet, I am not sure how well it works as of now, though we are improving it as we go.
If you just want to broadcast from a peer to a set of peers, and they don't care about latency, the best solution is to convert WebRTC to live streaming, without transcoding, just muxing:
Peer(Publisher) ---WebRTC--> Server --RTMP/HLS/DASH--> Peers/Players
If this works for you, SRS is able to convert WebRTC to live streaming.
Live streaming allows you to use a CDN or TCP to deliver the streams, but the latency is about 3~5 s, so this solution is only suitable when the Peers/Players never need to talk back to the Peer (Publisher).
If you want all those peers to talk to each other, it's very complex and needs a WebRTC SFU cluster, because there will be a huge number of streams. For example, allowing 100 peers to talk to each other means roughly 100x100 = 10k streams.
It's too complicated, so I don't think there is a good open-source solution right now (as of 2022-02).