What are good strategies to transfer an audio file over HTTPS? - file-upload

My Android app is bandwidth-constrained. It should work on connections as slow as 64 kbps.
The user will record voice (max length 120 sec, average 60 sec)
and the app will then encode it. The encoder options are:
1. Lossless: FLAC or ALAC?
2. Lossy: MP3?
Say the file is 1024 KB, i.e. 1 MB.
I am thinking of sending the file in chunks, starting at 32 KB:
if the response is received within 1 sec of the request:
    increase the chunk size exponentially
else:
    binary search for the exact chunk size
3. Is my approach to transferring audio from Android to the server
feasible for low-speed connections?
4. Or is it better to push the entire file as
multipart form-data to the server in one HTTPS POST call?

Assuming you are doing this:
Record an audio file
Compress file
Upload file
You are uploading over HTTPS, which uses TCP. There is no reason to exponentially increase the chunk size, because TCP already does this internally to share bandwidth fairly. It is called Slow Start.
There is no reason to split the file into pieces and let them grow. Additionally, the maximum IP packet size is 64 KB, and real packets are usually limited to about 1500 bytes by the network MTU, so your chunk sizes would not map onto packets anyway.
Just open a stream and send it. Let the underlying network protocols take care of the details.
On your server, you will probably have to change the settings to allow large file uploads and increase the timeout.
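For a sense of scale, here is a rough back-of-the-envelope estimate of the upload time for the 1 MB file in the question. It is only a sketch: it assumes that 64 kbps means 64,000 bits per second and it ignores TCP, TLS and HTTP overhead, so real uploads will be somewhat slower. The function name is made up for illustration.

```python
def transfer_seconds(size_bytes: int, link_bits_per_second: int) -> float:
    """Best-case time to move size_bytes over a link, ignoring protocol overhead."""
    return size_bytes * 8 / link_bits_per_second


one_mb = 1024 * 1024          # the 1 MB file from the question
link = 64_000                 # 64 kbps, taken as 64,000 bits per second

print(f"{transfer_seconds(one_mb, link):.1f} s")  # prints "131.1 s"
```

At that rate the file needs more than two minutes however it is chunked, which is probably why the server timeout matters.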

Related

Requests - How to upload large chunks of file in requests?

I have to upload large files (~5 GB). I am dividing each file into small chunks (10 MB), because I can't send all the data (5 GB+) at once: the API I am calling fails for data larger than 5 GB sent in one request. The API specification also requires a minimum of 10 MB of data per request. I used read(10485760) and sent the result via requests, which works fine.
However, I do not want to read the whole 10 MB into memory; if I use multithreading in my script, each thread reading 10 MB would cost me too much memory.
Is there a way to send a total of 10 MB per request but read only 4096/8192 bytes at a time, transferring until I reach 10 MB, so that I do not overuse memory?
Please note that I cannot pass the file object to requests: that would use less memory, but I would not be able to break the chunk at 10 MB, and the entire 5 GB would go into one request, which I do not want.
Is there any way to do this via requests? I see that httplib supports it: https://github.com/python/cpython/blob/3.9/Lib/http/client.py - I would call send(fh.read(4096)) in a loop until I reach 10 MB, completing one 10 MB request without heavy memory usage.
This is what the documentation says:
In the event you are posting a very large file as a multipart/form-data request, you may want to stream the request. By default, requests does not support this, but there is a separate package which does - requests-toolbelt. You should read the toolbelt’s documentation for more details about how to use it.
So try streaming the upload first; if that doesn't meet your needs, go for requests-toolbelt.
To stream the upload, pass a file-like object (or a generator) as the data argument of post or put. Note that stream=True only controls how the response body is downloaded; it does not stream the upload.
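One way to cap each request at 10 MB while reading only a few kilobytes at a time is a generator that yields a bounded window of the file. The following is a minimal sketch, not the documented requests-toolbelt approach; the function name and parameters are made up for illustration.

```python
import io


def iter_window(fh, offset, length, block_size=8192):
    """Yield at most `length` bytes from `fh`, starting at `offset`,
    reading no more than `block_size` bytes into memory at a time."""
    fh.seek(offset)
    remaining = length
    while remaining > 0:
        block = fh.read(min(block_size, remaining))
        if not block:          # reached end of file before `length` bytes
            break
        remaining -= len(block)
        yield block


# Small self-check with an in-memory file instead of a 5 GB one.
fake_file = io.BytesIO(bytes(25_000))
blocks = list(iter_window(fake_file, offset=10_000, length=10_000))
print(len(blocks), sum(len(b) for b in blocks))  # prints "2 10000"
```

With requests, such a generator can be passed as the body, for example requests.post(url, data=iter_window(fh, start, 10 * 1024 * 1024)), where url and start are placeholders. requests sends a generator body using chunked transfer encoding, without a Content-Length header, so this only helps if the API accepts that; this is an assumption to verify. Each thread should also open its own file handle, since the generator seeks.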

Trouble with RTMP ingest chunk stream

I am trying to build my own RTMP client library for an app that I am working on. So far everything has gone pretty well: I am able to connect to the RTMP server, negotiate the handshake, and then send all the necessary packets (FCPublish, Publish, etc.). From the server I then get the onStatus message NetStream.Publish.Start, which means that the server is allowing me to start publishing my live video broadcast. Wireshark also confirms that the information (data packetizing) is correct, as it shows up correctly there too.
Where I am having trouble is RTMP chunking. Pages 17 and 18 of the Adobe RTMP Specification show an example of how a message is chunked. From this example I can see that the message is broken down based on the chunk size (128 bytes). For me the chunk size gets negotiated in the initial connect exchange and is always 4096 bytes. So when I am sending video data that is larger than 4096 bytes, I need to chunk the message: send the RTMP packet header combined with the first 4096 bytes of data, then send a small RTMP header, 0xc4 (0xc0 | packetHeaderType (0x04)), combined with the next 4096 bytes of video data, and repeat until the full packet specified by the header has been sent. Then a new frame comes in and the same process is repeated.
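The chunking described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the library in question: the function and parameter names are made up, the first header is passed in already built, only one-byte basic headers (chunk stream IDs 2 to 63) are handled, and extended timestamps are ignored.

```python
def chunk_message(first_header: bytes, payload: bytes,
                  chunk_size: int = 4096, chunk_stream_id: int = 4) -> bytes:
    """Split one RTMP message into chunks.

    `first_header` is the already-built full chunk header for the message
    (basic header plus message header); it is passed in as-is.
    """
    if not 2 <= chunk_stream_id <= 63:
        raise ValueError("this sketch only handles 1-byte basic headers")
    continuation = bytes([0xC0 | chunk_stream_id])   # 0xC4 for stream 4
    out = bytearray(first_header)
    out += payload[:chunk_size]
    for start in range(chunk_size, len(payload), chunk_size):
        out += continuation
        out += payload[start:start + chunk_size]
    return bytes(out)


header = bytes(12)                    # placeholder 12-byte header
wire = chunk_message(header, bytes(10_000))
print(len(wire))                      # prints "10014"
```

The low six bits of the 0xc4 byte are the chunk stream ID (4 here) and the top two bits select the header format. One thing that may be worth checking against the specification: the maximum chunk size is maintained independently for each direction, so a chunk size announced by the server does not necessarily change the size the client is expected to send with.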
From checking other RTMP client examples written in different languages, this seems to be what they all do. Unfortunately, the ingest server that I am trying to stream to is not picking up the broadcast video data. It doesn't close the connection on me; it just never shows video or any sign that the video is right. Wireshark shows that after the video atom packet is sent, most packets are Unknown (0x0) for a little while, and then they flip-flop between Unknown (0x0) and Video Data. However, if I restrict my maximum payload size to 20000 bytes, Wireshark shows everything as Video Data. Obviously the ingest server will not show video in this situation, as I am cutting data out to get down to 20k bytes.
To figure out what is going wrong, I started another Xcode project that lets me spoof an RTMP server on my LAN, so that I can see what the data from libRTMP on iOS looks like as it comes into the server. I can also make libRTMP log the packets it sends, and it seems to inject the byte 0xc4 every 128 bytes even though I have sent the Change Chunk Size message as the server. When I try to replicate this in my RTMP client library by using a 128-byte chunk size even though it has been set to 4096 bytes, the server closes my connection. However, if I point libRTMP at the live RTMP server, it still logs that it is sending packets with a chunk size of 128, and the server seems to accept it, as video shows up. When I look at the data coming into my own RTMP server, I can see that it is all there.
Anyone have any idea what could be going on?
While I haven't worked specifically with RTMP, I have worked with RTSP/RTP/RTCP pretty extensively, so, based on that experience and the bruises I picked up along the way, here are some random, possibly-applicable tips that might help/things to look for that might be causing an issue:
Does your video encoding match what you're telling the server? In other words, if your video is encoded as H.264, is that what you're specifying to the server?
Does the data match the container format that the server is expecting? For example, if the server expects to receive an MPEG-4 movie (.m4v) file but you're sending only an encoded MPEG-4 (.mp4) stream, you'll need to encapsulate the MPEG-4 video stream into an MPEG-4 movie container. Conversely, if the server is expecting only a single MPEG-4 video stream but you're sending an encapsulated MPEG-4 Movie, you'll need to de-mux the MPEG-4 stream out of its container and send only that content.
Have you taken into account the MTU of your transmission medium? Regardless of chunk size, getting an MTU mismatch between the client and server can be hard to debug (and is possibly why you're getting some packets listed as "Unknown" type and others as "Video Data" type). Much of this will be taken care of with most OS' built-in Segmentation-and-Reassembly (SAR) infrastructure so long as the MTU is consistent, but in cases where you have to do your own SAR logic it's very easy to get this wrong.
Have you tried capturing traffic in Wireshark with libRTMP iOS and your own client and comparing the packets side by side? Sometimes a "reference" packet trace can be invaluable in finding that one little bit (or many) that didn't originally seem important.
Good luck!

The most reliable and efficient udp packet size?

Would sending lots of small packets by UDP take more resources (CPU, compression by zlib, etc.)? I read here that sending one big packet of ~65 KB by UDP would probably fail, so I thought that sending lots of smaller packets would succeed more often, but then comes the computational overhead of using more processing power (or at least that's what I'm assuming). The question is basically this: what is the best scenario for getting the maximum number of packets through successfully while keeping computation to a minimum? Is there a specific size that works most of the time? I'm using Erlang for the server and ENet for the client (written in C++). I also use zlib compression, and I send the same packets to every client (broadcasting is the term, I guess).
The maximum size of a UDP payload that, most of the time, will not cause IP fragmentation is:
MTU of the host handling the PDU (in most cases 1500) -
size of the IP header (20 bytes) -
size of the UDP header (8 bytes)
1500 MTU - 20 IP hdr - 8 UDP hdr = 1472 bytes
@EJP talked about 534 bytes, but I would set it to 508. This is the number of bytes that FOR SURE will not cause fragmentation, because the minimum MTU size that a host can set is 576 and the IP header can be at most 60 bytes (508 = 576 MTU - 60 IP - 8 UDP).
By the way, I'd try to go with 1472 bytes, because 1500 is a standard-enough value.
Use 1492 instead of 1500 in the calculation if you're passing through a PPPoE connection.
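The arithmetic above can be written as a small helper. This is only a sketch of the same subtraction; the constant and function names are made up for illustration, and the header sizes are the IPv4 and UDP values quoted in this answer.

```python
IPV4_HEADER_MIN = 20   # bytes, header without options
IPV4_HEADER_MAX = 60   # bytes, header with the maximum options
UDP_HEADER = 8         # bytes


def max_udp_payload(mtu: int, ip_header: int = IPV4_HEADER_MIN) -> int:
    """Largest UDP payload that fits in one IP packet of size `mtu`."""
    return mtu - ip_header - UDP_HEADER


print(max_udp_payload(1500))                    # prints 1472 (Ethernet)
print(max_udp_payload(1492))                    # prints 1464 (PPPoE)
print(max_udp_payload(576, IPV4_HEADER_MAX))    # prints 508 (conservative)
```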
Would sending lots of small packets by UDP take more resources?
Yes, it would, definitely! I just did an experiment with a streaming app. The app sends 2000 frames of data each second, precisely timed. The data payload for each frame is 24 bytes. I used UDP with sendto() to send this data to a listener app on another node.
What I found was interesting. This level of activity brought my sending CPU to its knees! I went from having about 64% free CPU time to having about 5%! That was disastrous for my application, so I had to fix that. I decided to experiment with variations.
First, I simply commented out the sendto() call, to see what the packet assembly overhead looked like. About a 1% hit on CPU time. Not bad. OK... must be the sendto() call!
Then, I did a quick fakeout test... I called the sendto() API only once in every 10 iterations, but I padded the data record to 10 times its previous length, to simulate the effect of assembling a collection of smaller records into a larger one, sent less often. The results were quite satisfactory: 7% CPU hit, as compared to 59% previously. It would seem that, at least on my *NIX-like system, the operation of sending a packet is costly just in the overhead of making the call.
Just in case anyone doubts whether the test was working properly, I verified all the results with Wireshark observation of the actual UDP transmissions to confirm all was working as it should.
Conclusion: it uses MUCH less CPU time to send larger packets less often than to send the same amount of data as smaller packets more frequently. Admittedly, I do not know what happens if IP starts fragmenting an overly large UDP datagram; I mean, I don't know how much CPU overhead that adds. I will try to find out (I'd like to know myself) and update this answer.
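The idea of assembling small records into fewer, larger datagrams can be sketched like this. It is an illustration rather than the code used in the experiment: the function name is made up, and the 1472-byte default is the Ethernet payload limit worked out earlier in this thread.

```python
def batch_records(records, max_payload=1472):
    """Group small records into payloads of at most `max_payload` bytes,
    so that each payload can be sent with a single sendto() call."""
    batch = bytearray()
    for record in records:
        if len(record) > max_payload:
            raise ValueError("record larger than max_payload")
        if batch and len(batch) + len(record) > max_payload:
            yield bytes(batch)
            batch = bytearray()
        batch += record
    if batch:
        yield bytes(batch)


frames = [bytes(24)] * 2000            # one second of 24-byte frames
payloads = list(batch_records(frames))
print(len(payloads), len(payloads[0]))  # prints "33 1464"
```

Each payload would then go out with one call such as sock.sendto(payload, address), so 2000 calls per second become 33. The trade-off is added latency, and the receiver has to know where one record ends and the next begins (trivial here because every record is 24 bytes).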
534 bytes. That is required to be transmitted without fragmentation. It can still be lost altogether of course. The overheads due to retransmission of lost packets and the network overheads themselves are several orders of magnitude more significant than any CPU cost.
You're probably using the wrong protocol. UDP is almost always a poor choice for data you care about transmitting. You wind up layering sequencing, retry, and integrity logic atop it, and then you have TCP.

Do headers on mobile requests and responses count as part of the bandwidth?

I am building an Arduino-based device that needs to send data over the internet to a remote server. It needs to do this as frequently as possible but also use as little bandwidth as possible. It will probably work over GSM/EDGE (cellular networking).
The data I'd like to send is about 40 bytes in size - really minimal. I'd like to send this packet to the server about once a minute, but also receive a packet of around that size in response once in a while.
The bandwidth on my server is no problem - the bandwidth on the device's internet connection is, i.e. the cellular data.
Do headers on mobile requests and responses count as part of the bandwidth?
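Yes, everything that is sent counts, so the total packet size is what matters. Assuming TCP over IPv4, you lose at least 40 bytes per packet right from the start (20 for the IP header and 20 for the TCP header), before any application-level headers. If you get intimate with Wireshark you can see exactly what's happening.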
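As a rough worked example for the 40-byte reading sent once a minute, assuming TCP over IPv4 with no header options (20 bytes each for the IP and TCP headers). This is a lower bound only: it leaves out acknowledgements, connection setup, any HTTP headers and link-layer framing. The names are made up for illustration.

```python
IP_HEADER = 20    # bytes, IPv4 without options
TCP_HEADER = 20   # bytes, TCP without options


def bytes_on_wire(payload: int) -> int:
    """Size of one TCP/IPv4 packet carrying `payload` bytes of data."""
    return payload + IP_HEADER + TCP_HEADER


per_packet = bytes_on_wire(40)              # the 40-byte reading
per_day = per_packet * 60 * 24              # one packet a minute
print(per_packet, per_day)                  # prints "80 115200"
```

So under these assumptions, half of every packet is header.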

Does SSL cause a lot more bandwidth?

I know SSL has a performance hit on your HTTP communication in terms of speed but is there much of a difference in the amount of data transferred?
ie, If a mobile device is paying a lot per kb, is there a huge difference? Does anyone have an estimate of how much of a difference?
Thanks for the help!
Matt
No, there is not much of a difference, neither in terms of "performance" nor in terms of bandwidth.
According to Google, a company one would hope is a reliable source on large-scale networking, the network-bandwidth overhead is less than 2%.
As Borealid pointed out, the overhead is small. Usually. For an average request (which extends to multi-megabyte files).
However, if you have something like RESTful APIs to call, you need to ensure that a persistent connection is used; otherwise, with small request bodies, SSL will add significant overhead. I can't give you exact numbers now (simply because they vary depending on certificate size and the number of certificates in the chain), but if you have to establish an SSL session to send a 200-byte request and receive a 2 KB response, the SSL handshake can easily add another 5-7 KB, so you see the overhead.
I just did a test using wireshark, downloading a 5-byte file from Amazon S3 over http and https to an iPad using a simple NSURLConnection request.
For http, total traffic was 1310 bytes.
For https, total traffic was 7099 bytes.
This was just for a single download in each case, and includes all back-and-forth over-the-wire traffic associated with the request, including DNS (about 200 bytes) and TCP handshaking (about 400 bytes for the http case).
Obviously the actual totals would change according to URL length and your particular SSL certificate; you could certainly have leaner headers than S3 delivers.
In theory, the SSL bandwidth overhead for a 1MB file should be about the same as a 1-byte file, i.e. about 5800 bytes in the above example, as encryption shouldn't increase the size of the data transmitted beyond the initial certificate and key exchange. So for large files it's negligible, but for small files can be significant, as pointed out by Eugene.
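Using the two measurements above, the relative cost can be estimated for other file sizes. This sketch assumes, as the answer does "in theory", that the extra bytes stay roughly constant regardless of file size; the names are made up for illustration.

```python
HTTP_TOTAL = 1310    # bytes on the wire for the 5-byte file over http
HTTPS_TOTAL = 7099   # bytes on the wire for the same file over https

fixed_overhead = HTTPS_TOTAL - HTTP_TOTAL    # 5789 bytes


def overhead_percent(file_size: int) -> float:
    """Fixed overhead relative to the file size, in percent."""
    return 100 * fixed_overhead / file_size


print(fixed_overhead)                          # prints 5789
print(f"{overhead_percent(1024 * 1024):.2f}")  # prints "0.55"
print(f"{overhead_percent(2048):.0f}")         # prints "283"
```

That is about half a percent for a 1 MB file, but almost three times the payload for a 2 KB response on a fresh connection.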