I am trying to build my own client RTMP library for an app that I am working on. So far everything has gone pretty successfully: I am able to connect to the RTMP server, negotiate the handshake, and then send all the necessary packets (FCPublish, Publish, etc.). From the server I then get the onStatus message NetStream.Publish.Start, which means I have successfully got the server to allow me to start publishing my live video broadcast. Wireshark also confirms that the information (and the data packetizing) is correct, as it shows up correctly there too.
Where I am having some trouble is RTMP chunking. The Adobe RTMP specification (pages 17-18) shows an example of how a message is chunked, and from that example I can see that it is broken down based on the chunk size (128 bytes there). In my case the chunk size is negotiated in the initial connect and exchange, and is always 4096 bytes. So when I am sending video data larger than 4096 bytes, I need to chunk the message: send the RTMP packet header combined with the first 4096 bytes of data, then send a one-byte continuation header of 0xC4 (0xC0 | the chunk stream ID, which is 0x04 here) followed by the next 4096 bytes of video data, and so on until the full payload declared in the header has been sent. Then a new frame comes in and the same process repeats.
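To make the splitting concrete, here is a minimal sketch of that loop, assuming a negotiated chunk size of 4096 and chunk stream ID 4; send_bytes() is a hypothetical stand-in for however the socket actually gets written:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical transport helper -- stands in for the real socket write. */
void send_bytes(const uint8_t *buf, size_t len);

/*
 * Send one RTMP message payload, split into chunks of `chunk_size` bytes.
 * The full (type 0/1/2) message header goes out with the first chunk; every
 * following chunk is prefixed by a one-byte type 3 basic header:
 * 0xC0 (fmt = 3) OR'ed with the chunk stream ID (4 here, hence 0xC4).
 */
void send_chunked_payload(const uint8_t *full_header, size_t header_len,
                          const uint8_t *payload, size_t payload_len,
                          size_t chunk_size, uint8_t chunk_stream_id)
{
    /* First chunk: full message header + up to chunk_size bytes of payload. */
    send_bytes(full_header, header_len);
    size_t offset = payload_len < chunk_size ? payload_len : chunk_size;
    send_bytes(payload, offset);

    /* Remaining chunks: one-byte continuation header, then up to chunk_size bytes. */
    while (offset < payload_len) {
        uint8_t basic_header = (uint8_t)(0xC0 | (chunk_stream_id & 0x3F));
        send_bytes(&basic_header, 1);

        size_t remaining = payload_len - offset;
        size_t this_chunk = remaining < chunk_size ? remaining : chunk_size;
        send_bytes(payload + offset, this_chunk);
        offset += this_chunk;
    }
}
```

Note that the message length field in the full header still describes the whole payload; chunking only changes how the bytes are framed on the wire, and the chunk size used here has to match whatever value was actually announced in the Set Chunk Size message.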
Checking other RTMP client examples written in different languages, this seems to be what they are all doing. Unfortunately the ingest server that I am trying to stream to is not picking up the broadcast video data. They don't close the connection on me; they just never show video or any sign that the video is arriving correctly. Wireshark shows that after the video atom packet is sent, most packets are marked Unknown (0x0) for a little while, then they switch to Video Data, and it sort of flip-flops between Unknown (0x0) and Video Data. However, if I restrict my maximum payload size to 20000 bytes, Wireshark shows everything as Video Data. Obviously the ingest server will not show video in that situation either, since I am throwing away chunks of data to keep it under 20k bytes.
Trying to figure out what is going wrong, I started another Xcode project that lets me spoof an RTMP server on my LAN so that I can see what the data looks like from libRTMP iOS as it comes into the server. With libRTMP I can also make it log the packets it sends, and it seems to inject the byte 0xC4 every 128 bytes even though I, acting as the server, have sent the Set Chunk Size message. When I try to replicate this in my RTMP client library by simply using a 128-byte chunk size, even though 4096 bytes has been negotiated, the server closes the connection on me. However, if I point libRTMP at the live RTMP server instead, it still logs that it is sending packets with a chunk size of 128, and the server seems to accept them, because video shows up. And when I look at the data coming in on my own RTMP server, I can see that it is all there.
Anyone have any idea what could be going on?
While I haven't worked specifically with RTMP, I have worked with RTSP/RTP/RTCP pretty extensively, so, based on that experience and the bruises I picked up along the way, here are some random, possibly applicable tips and things to look for that might be causing an issue:
Does your video encoding match what you're telling the server? In other words, if your video is encoded as H.264, is that what you're specifying to the server?
Does the data match the container format that the server is expecting? For example, if the server expects to receive an MPEG-4 movie (.m4v) file but you're sending only an encoded MPEG-4 (.mp4) stream, you'll need to encapsulate the MPEG-4 video stream into an MPEG-4 movie container. Conversely, if the server is expecting only a single MPEG-4 video stream but you're sending an encapsulated MPEG-4 Movie, you'll need to de-mux the MPEG-4 stream out of its container and send only that content.
Have you taken into account the MTU of your transmission medium? Regardless of chunk size, getting an MTU mismatch between the client and server can be hard to debug (and is possibly why you're getting some packets listed as "Unknown" type and others as "Video Data" type). Much of this will be taken care of by most OSes' built-in Segmentation-and-Reassembly (SAR) infrastructure so long as the MTU is consistent, but in cases where you have to do your own SAR logic it's very easy to get this wrong.
Have you tried capturing traffic in Wireshark with libRTMP iOS and your own client and comparing the packets side by side? Sometimes a "reference" packet trace can be invaluable in finding that one little bit (or many) that didn't originally seem important.
Good luck!
I've been playing with WebRTC using libdatachannel, experimenting and learning.
I wrote some code to parse RTP packets into NALUs, and I am testing by connecting to a "known good" server which sends H.264 video.
Problem:
I'm only seeing NALUs with type = 1 (fragmented into multiple FU-As) and sometimes type = 24 (STAP-A packets, which contain embedded SPS and PPS NALUs).
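For reference, the type here comes from the low five bits of the first RTP payload byte (RFC 6184); for FU-A fragments the real NALU type sits in the FU header instead. A rough sketch of that check, independent of how libdatachannel exposes the payload:

```c
#include <stdint.h>
#include <stddef.h>

/*
 * Recover the H.264 NAL unit type carried by one RTP payload (RFC 6184).
 * Returns the NALU type (1 = non-IDR slice, 5 = IDR/key frame, 7 = SPS, 8 = PPS),
 * 0 if this packet is a continuation fragment that starts no new NALU,
 * or -1 if the payload is too short / an unhandled packetization mode.
 */
int rtp_h264_nalu_type(const uint8_t *payload, size_t len)
{
    if (len < 1)
        return -1;

    uint8_t packet_type = payload[0] & 0x1F;   /* low 5 bits of the first byte */

    if (packet_type >= 1 && packet_type <= 23) /* single NAL unit packet */
        return packet_type;

    if (packet_type == 24) {                   /* STAP-A: first NALU follows a 2-byte size */
        if (len < 4)
            return -1;
        return payload[3] & 0x1F;
    }

    if (packet_type == 28) {                   /* FU-A: FU indicator + FU header */
        if (len < 2)
            return -1;
        uint8_t fu_header = payload[1];
        if (fu_header & 0x80)                  /* S bit: start of the fragmented NALU */
            return fu_header & 0x1F;
        return 0;                              /* middle/end fragment of an earlier NALU */
    }

    return -1;
}
```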
So I don't understand how to decode / render this stream. I would expect the server to send a NALU containing a key frame (NALU type 5) automatically to a newly connected client, but it does not.
What am I missing to be able to decode the stream? What should I do to receive a key frame quickly? If my understanding is correct, I need a key frame to start decoding / rendering.
I tried requesting a key frame from code; it does arrive (type 5), but after some delay, which is undesirable.
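For context, the key-frame request that goes out on the wire is typically an RTCP Picture Loss Indication (RFC 4585). A minimal sketch of the 12-byte packet, with placeholder SSRCs and ignoring how libdatachannel actually wraps and encrypts it:

```c
#include <stdint.h>
#include <stddef.h>

/*
 * Build an RTCP Picture Loss Indication (PLI) per RFC 4585:
 * payload-specific feedback (PT = 206) with FMT = 1, length = 2, i.e. 12 bytes.
 * sender_ssrc / media_ssrc are whatever SSRCs the session negotiated.
 */
size_t build_rtcp_pli(uint8_t out[12], uint32_t sender_ssrc, uint32_t media_ssrc)
{
    out[0] = 0x81;          /* V = 2, P = 0, FMT = 1 (PLI) */
    out[1] = 206;           /* PT = PSFB (payload-specific feedback) */
    out[2] = 0x00;
    out[3] = 0x02;          /* length = 2 -> (2 + 1) * 4 = 12 bytes total */

    /* SSRC of the packet sender, then SSRC of the media source, big-endian. */
    for (int i = 0; i < 4; i++) {
        out[4 + i] = (uint8_t)(sender_ssrc >> (24 - 8 * i));
        out[8 + i] = (uint8_t)(media_ssrc  >> (24 - 8 * i));
    }
    return 12;
}
```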
And yet the stream plays perfectly fine with a web browser client (Chrome, JavaScript) and starts up quickly.
Am I maybe overthinking this, and the browser also has a delay but I'm just perceiving it as instant?
In any case, what's the situation with key frames? Is a client supposed to request them (and without that, a server should not be expected to send them)?
If so what's a good interval? One second, two, three?
On AWS, how do you play video from MediaLive through the UDP output group?
For my use case, I'm building a live stream pipeline that takes an MPEG-2 transport stream from MediaLive, processes it through a UDP server (configured as an output group), and is consumed by a web client that plays it in an HTML5 video element.
The problem is: the data is flowing, but the video isn't rendering.
Previously, my output group was set to AWS MediaPackage, but because I need the ability to read and update frames in the live stream, I'm trying to feed it through UDP.
Is setting the output group to UDP the right approach?
The documentation is a bit sparse here. I'm wondering if there are resources or examples where others were able to play video this way, as opposed to HLS/DASH.
Thanks for your post. Yes the UDP or RTP output would be the right choice of output from MediaLive. Appropriate routing rules will need to be used on any intermediary routers or firewalls to ensure that the UDP traffic can reach the client.
You wrote that 'the data is flowing, but the video isn't rendering.' This suggests an issue with the web client.
I suggest adding another identical UDP output to your UDP server and sending its output to a computer (or AWS Workspace) running a copy of VLC player. Decoding that new stream will give you a confidence monitor on the output of the entire workflow up to that point. This will help isolate the problem.
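For example (the port number here is just a placeholder), pointing VLC at the URL udp://@:5000 on the receiving machine will make it listen on UDP port 5000 and decode whatever transport stream arrives there.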
You could achieve the same result with a packet capture or a TS stream analyzer if you prefer; in that case, I recommend trying to play back one of the packet captures locally with the web client.
In the last couple of weeks I've been developing a boot loader that performs a firmware update on a certain device. The setup is as follows:
The firmware binary and its respective SHA1 hash are stored on a web server;
The device is composed of an ESP8266 and an STM32 microcontroller (STM32F401 or STM32F030; there are two hardware versions, but the one I'm using is the F401). The ESP is used only via AT commands, i.e., I did not build its firmware, I just used the latest version from Espressif.
The idea is that the STM32 bootloader should use the ESP to download the firmware hash and binary from the web server and then boot the firmware if the hash is OK. The download is made using the ESP in passive mode, i.e. the STM32 has to manually request X bytes to read from the ESP buffer; currently I'm requesting 1 MTU (1460 bytes) at a time.
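For what it's worth, the download loop looks roughly like the sketch below. The helper names (esp_request_read, sha1_*, flash_write) are placeholders for the actual ESP AT glue, SHA-1 implementation, and flash driver, not real library calls:

```c
#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>
#include <string.h>

#define READ_CHUNK 1460u   /* one MTU-sized passive-mode request */

/* Hypothetical glue: ESP8266 passive-mode read, a SHA-1 implementation,
 * and the flash driver. Names and signatures are placeholders. */
typedef struct sha1_ctx sha1_ctx_t;
size_t esp_request_read(uint8_t *buf, size_t max_len);   /* bytes actually delivered */
void   sha1_init(sha1_ctx_t *ctx);
void   sha1_update(sha1_ctx_t *ctx, const uint8_t *data, size_t len);
void   sha1_final(sha1_ctx_t *ctx, uint8_t digest[20]);
void   flash_write(uint32_t offset, const uint8_t *data, size_t len);

/* Download total_len firmware bytes, hash them on the fly, and report whether
 * the computed SHA-1 matches the expected hash fetched from the web server. */
bool download_and_verify(sha1_ctx_t *ctx, uint32_t total_len,
                         const uint8_t expected_sha1[20])
{
    uint8_t buf[READ_CHUNK];
    uint32_t received = 0;

    sha1_init(ctx);
    while (received < total_len) {
        size_t n = esp_request_read(buf, sizeof buf);  /* may return far less than 1460 */
        if (n == 0)
            continue;                                  /* nothing buffered yet, ask again */
        flash_write(received, buf, n);
        sha1_update(ctx, buf, n);
        received += (uint32_t)n;
    }

    uint8_t digest[20];
    sha1_final(ctx, digest);
    return memcmp(digest, expected_sha1, 20) == 0;     /* true => safe to boot the firmware */
}
```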
At first, the connection to the web server was made using HTTP and everything worked perfectly. However, I had to change it to HTTPS, and that's where the problem starts. After the STM32 has received around 100 kB of the firmware (which is 110 kB in total), the ESP only provides about 30 bytes per request (instead of roughly 1 MTU), making the download time extremely long.
I've already done some digging trying to find out if this is related to the ESP, but didn't find anything. Also, the point where this 30-byte-per-request rate starts isn't always at the 100 kB mark; I've tested with a 170 kB firmware and it started to happen at around 160 kB, so it looks like it's always the last 10 kB or so.
I've also added some delays in the firmware when the packet size becomes smaller than 1 MTU, to give the ESP more time to process the packet, since the SSL decryption takes longer; but it did not help.
My question is: is there some characteristic of the HTTPS/SSL protocols that reduces the packet length? What could be causing what is happening here?
SRTCP tracks the number of sent and lost bytes and packets, last received sequence number, inter-arrival jitter for each SRTP packet, and other SRTP statistics.
Do the mentioned browsers do something with SRTCP reports when dealing with an audio stream, for example adjust the bitrate on the fly if network conditions change?
Given that Chrome does adjust the bitrate and resolution of VP8 on the fly within a connection, I would assume that Opus configurations are changed within the connection as well.
You can see the feedback on the sending audio in this image. The bitrate obviously drops slightly when using Opus. However, I would imagine that video bitrate would be the first thing changed in a video call, as changing it has the greater effect.
Obviously, one cannot change the bitrate on a codec that only supports constant bitrates.
All the other stats are a combination of what the RTCP reports give (packetsLost, RTT, bits sent, etc.) and Google's monitoring of the inputs/outputs (audio level, echo cancellation, etc.).
NOTE: this is taken from a session created by AppRTC in Chrome.
I am building an Arduino-based device that needs to send data over the internet to a remote server. It needs to do this as frequently as possible but also use as little bandwidth as possible. It will probably work over GSM/EDGE (cellular networking).
The data I'd like to send is about 40 bytes in size - really minimal. I'd like to send this packet to the server about once a minute, but also receive a packet of around that size in response once in a while.
The bandwidth on my server is no problem - the bandwidth on the device's internet connection is, i.e. the cellular data.
Do headers on mobile requests and responses count as part of the bandwidth?
Yes, the total packet size is all data that is sent. Assuming a TCP packet, you lose 20 bytes to the TCP header right from the start, plus another 20 bytes for the IPv4 header. If you get intimate with Wireshark you can see exactly what's happening.
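A quick back-of-the-envelope check (assuming IPv4 and TCP with no options, and ignoring the handshake, ACKs in the other direction, and any link-layer framing):

```c
#include <stdio.h>

int main(void)
{
    const int payload = 40;   /* application data per message */
    const int ip_hdr  = 20;   /* IPv4 header, no options */
    const int tcp_hdr = 20;   /* TCP header, no options */

    int  per_packet = payload + ip_hdr + tcp_hdr;       /* 80 bytes on the wire, minimum */
    long per_day    = (long)per_packet * 60 * 24;        /* one packet per minute */
    long per_month  = per_day * 30;

    printf("per packet: %d bytes\n", per_packet);                               /* 80 */
    printf("per day:    %ld bytes (~%.1f kB)\n", per_day, per_day / 1000.0);    /* ~115 kB */
    printf("per month:  %ld bytes (~%.1f MB)\n", per_month, per_month / 1e6);   /* ~3.5 MB */
    return 0;
}
```

So the fixed headers alone roughly double your 40-byte payload before anything else is counted.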