WinhttpReadData slow network - winhttp

I am downloading an exe from the server using Winhttp C++. I use the sample code provided in MSDN
http://msdn.microsoft.com/en-us/library/aa384104%28v=vs.85%29.aspx
It works fine.I normally used to add up all the data read (Read from WinhttpReadData) and log it.
The expected result is, the added sum should match the exe size. It works fine in reasonably fast network.
In case of very slow network. The data read is too much larger than the original size. But when i check the downloaded exe size, it is same as that of the server.
The logs (which is adding up the data read) shows it reads more data than the original size.
Remember it only occurs in slow network. Have anyone faced this issue?

Are you respecting the value returned via the lpdwNumberOfBytesRead parameter? The number of bytes read with each call may be less than the buffer size you provided -- especially if fewer bytes are available at that time due to the slowness of the network.

Related

FileChannel.transferTo (supposedly zero-copy) not giving any performance gain

I am working on a REST API that has an endpoint to download a file that could be > 2 GB in size. I have read that Java's FileChannel.transferTo(...) will use zero-copy if the OS supports it. My server is running on localhost during development on my MacBook Pro OS 10.11.6.
I compared the following two methods of writing file to response stream:
Copying a fixed number of bytes from FileChannel to WritableByteChannel using transferTo
Reading a fixed number of bytes from FileInputStream into a byte array (size 4096) and writing to OutputStream in a loop.
The time taken for a 5.2GB file is between 20 and 23 seconds with both methods. I tried transferTo with the fixed number of bytes in single transfer set to following values: 4KB (i.e. 4 * 1024), 1MB and 50MB. The time taken to write is in the same range in all the 3 cases.
Time taken is measured from before entering the while-loop to after exiting the while-loop, in which bytes are read from the file. This is all on the server side. The network hop time does not figure into this.
Any ideas on what the reason could be? I am quite sure MacOS 10.11.6 should support zero-copy (i.e. sendfile system call).
EDIT (6/18/2018):
I found the following blog post from 2015, saying that sendfile on MacOS X is broken. Could it be that this problem still exists?
https://blog.phusion.nl/2015/06/04/the-brokenness-of-the-sendfile-system-call/
The (high) transfer rate that you are quoting is likely close to or at the limit of what a SATA device can do anyway. If my guess is right, you will not see a performance gain reflected in the time it takes to run your test - however there will likely be a change in the CPU load during the test. Given that you have a relatively powerful machine, your CPU and memory are fast enough. Any method (zero-copy or not) will work at the same speed - which is the speed of your disk. However, zero-copy will cause a lot less CPU load and will not grab unnecessary bandwidth from your memory, either. Therefore, you should test different methods and see which one ends up using the least amount of CPU and choose that method for your application.

WasapiLoopbackCapture to WaveOut

I'm using WasapiLoopbackCapture to capture sound coming from my speakers and then using onDataAvailable to send it to another device and I'm attempting to play the data sent using the WaveOut class and a BufferedWaveProvider and just adding a sample everytime data is sent from my client using the onDataAvailable. I'm having problems sending sound. The most functioning I've managed to get it is:
Not syncing the Wave format of the client and the server, just sending data and adding it to the sample. Problem is this is stutters very much even though I checked the buffer stored size and it has 51 seconds. I even have to increase the buffer size which eventually overflows anyway.
I tried syncing the Wave format and I just get clicks but have no problem with buffer size. I also tried making sure that at least a second was stored in the buffer but that had zero effect.
If anyone could point me in the right direction that would be great.
Uncompressed audio takes up a lot of space on a network. On my machine the WasapiLoopbackCapture object produces 32-bit (IeeeFloat) stereo samples at 44100 samples per second, for around 2.7Mbit/sec total raw bandwidth. Once you factor in TCP packet overheads and so on, that's quite a lot of data you're transferring.
The first thing I would suggest though is that you plug in some profiling code at each step in the process to get an idea of where your bottlenecks are happening. How fast is data arriving from the capture device? How big are your packets? How long does it take to service each call to your OnDataAvailable event handler? How much data are you sending per second across the network? How fast is the data arriving at the client? Figure out where the bottlenecks are and you get a much better idea of what the bottlenecks are.
Try building a simulated server that reads data from a wave file in various WaveFormats (channels, bits per sample and sample rate) and simulates sending that data across the network to the client. You might find that the problem goes away at lower bandwidth. And if bandwidth is the issue, compression might be the solution.
If you're using a single-threaded model, and servicing each OnDataAvailable event takes longer than the recording frequency (ie: number of expected calls to OnDataAvailable per second) then there's going to be a data loss issue. Multiple threads can help with this - one to get the data from the audio system, another to process and send the data. But you can end up in the same position: losing data because you're not dealing with it quickly enough. When that happens it's handy to know about it, because it indicates a problem in the program. Find out when and where it happens - overflow in input, processing or output buffers all have different potential reasons and need different attention.

Max file size for File.ReadAllLines

I need to read and process a text file. My processing would be easier if I could use the File.ReadAllLines method but I'm not sure what is the maximum size of the file that could be read with this method without reading by chunks.
I understand that the file size depends on the computer memory. But are still there any recommendations for an average machine?
On a 32-bit operating system, you'll get at most a contiguous chunk of memory around 550 Megabytes, allowing loading a file of half that size. That goes down hill quickly after your program has been running for a while and the virtual memory address space gets fragmented. 100 Megabytes is about all you can hope for.
This is not an issue on a 64-bit operating system.
Since reading a text file one line at a time is just as fast as reading all lines, this should never be a real problem.
I've done stuff like this with 1-2GB before, albeit in Python. I do not think .NET would have a problem, though. But I would only do this for one-off processing.
If you are doing this on a regular basis, you might want to go through the file line by line.
Its bad design unless you know the files sizes vs the computer memory that would be avaiable in the running app.
A better solution would be consider memory mapped files. They use themselvses as page fil storage,

Reading a whole file on a network drive the fast way (Windows, C/C++, C#, ...)

Lately I've been having problems reading big files on a network drive and I just can't pinpoint what I may be doing wrong. I tried both in C++ (Unmanaged) and in C# and had about the same performances on both...which were somewhat abysmal.
Sometimes it will read at 4 KB/s a file on the network, but if this file is located on the local HD it will achieve easily the maximum data rate the HD can output. That is with reading 64 KB chunks at a time... I tried with bigger buffers up to insane numbers, or smaller and it doesn't make much differences.
I tried async IO in C# with BeginRead on the FileStream and OVERLAPPED IO in C++ as well as synchronous reads and they all had the same problems, which is being slow on the network.
The only solution we came up with is to copy the file using the OS CopyFile function on the local HD before actually reading the file but I'm not too satisfied with this approach. It just seems like CopyFile is doing something we are not that makes it incredibly faster than our approach.
Anyone has a clue as to why this is?
We would have to guess, since you aren't showing us your code. So my guess is that Windows file copy is opening the file with the FILE_FLAG_SEQUENTIAL_SCAN flag which in turn causes the file system/cache to choose optimal block sizes and submit read requests in anticipation of read calls that havn't been submitted yet.
We only can assume that you have been trying really all possible methods of reading/writing. Have you been reading synchronously or asynchronously? Did you try I/O completion ports? Or ReadFileEx() function? I would guess that the Windows CopyFile() function detects that you want to read a file from network and will use different method for reading then it would use for disk access.
If you have really exhausted all possible reading methods, and if you really need thing to be solved, then I would suggest to check out a bit on what is the CopyFile() function doing. There are numerous tools for doing that. E.g.: this one (or some other -- links on the same page).

Grand Unified Theory of logging

Is their a Grand Unified Theory of logging? Shall we develop one? Question (just to show this is not a discussion :), how can I improve on the following? (note that I live mainly in the embedded world, but non-embedded suggestions are also welcome)
How do you log, when do you log, what do you log, what do you do with log files?
How do you log - I generally have macros, #ifdef TESTING, sort of thing. They write to RAM and a low priority process writes them out when the system is idle (using UDP, since I do embedded systems)
When do you log - same as voting, early and often. At every (in)significant program event, I log at varying levels. Events received, transaction succeed/fail, data updated, etc
What do you log - Fatal/Error/Warning/Info/Debug/Trace is covered in When to use the different log levels?
What do you do with log files - 1) keep them (in CVS), both pass and fail 2) capture everything and filter later in case I can't repeat a problem. I have tools to filter the log by "level" (Fatal/Error/etc), process, file, etc. And to draw message sequence charts, dump data structures, draw histograms of memory usage - what am I missing?
Hmmm, binary or ascii log file format? Ascii is bulkier, but binary requires more processing. I have done both, currently I use ascii
Question - did I miss anything, and how can I improve on this?
You could "instrument" your code in many different ways, everything from start-up/shut-down events to individual machine instruction execution (using a processor emulator). Of all the possibilities, what's worth doing? Don't just do it for the sake of completeness; have a specific goal in mind. A business case if you like, with a benefit you expect to receive. E.g.:
Insight into CPU task execution times/patterns to enable optimisation (if you need to improve performance).
Insight into other systems to resolve system integration issues (e.g. what messages is your VoIP box sending and receiving when it connects to a particular peer?)
Insight into the nature of errors (for field diagnostics)
Aid in development
Aid in validation testing
I imagine that there's no grand unified theory of logging, because what you do would depend on many details:
Quantity of data
Type of data
Events
Streamed audio/video
Available storage
Storage speed
Storage capacity
Available channels to extract data
Bandwidth
Cost
Availability
Internet connected 24×7
Site visit required
Need to unlock a rusty gate, climb a ladder onto a roof, to plug in a cable, after filling out OHS documentation
Need to wait until the Antarctic winter is over and the ice sheets thaw
Random access vs linear access (e.g. if you compress it, do you need to read from the start to decompress and access some random point?)
Need to survive error conditions
Watchdog reboots
Possible data corruption
Due to failing power supply
Due to unreliable storage media
Need to survive a plane crash
As for ASCII vs binary, I usually prefer to keep the logging simple, and put any nice presentation in a PC application that decodes the data. It's usually easier to create a user-friendly presentation in PC software (written in e.g. Python) rather than in the embedded system itself.
did I miss anything, and how can I
improve on this?
Asynchronous logging.
Using multiple log files for the same process for different logging abstractions. e.g. the process' activities are logged in a normal log file. And the process' stats (periodic statistics that you might be interested in) are logged in a separate stats log file.
Hmmm, binary or ascii log file format?
Ascii is bulkier, but binary requires
more processing. I have done both,
currently I use ascii
ASCII is good. More often than not, logs are meant to be used for debugging purposes. A human readable form eases and speeds this up.
However, if your logs are used mostly to record information which is used later on for analysis and generation of reports (e.g. stats or latencies etc.) a binary format would be preferred. You can go one step ahead and use a custom format along with a db service which does index based sorting, where the index can be a tuple of time with the event type.
--
One thing which may be helpful is to have a "maybeLogger" object which will accept log records for an operation which may or may not succeed, and then either ditch those records if the operation succeeds or fails in an uninteresting way, or log them if it does something interesting. This is relatively easy to do in something like .net. In an embedded system, it can only be done really easily if the amount of stuff to be logged is small enough to fit in free RAM, but one could probably use a garbage-collection-based approach to hold stuff in flash (have one 'stream' of data in flash for new log entries, and another for ones that are confirmed to be interesting; periodically move data which is known to be good from the first stream to the second).
Here's my $0.02.
I only log when I'm having a problem and need to track down the source. Usually this has to do with a customer's environment, so I can't just attach the debugger. My solution is to enable the Telnet port and use that to print out statements as to where the program is and values of variables.
I do ASCII only because it's over telnet.
Another aspect of telnet is that it is pretty simple. It's a TCP port with text being thrown out. Very little processing other than the normal TCP headaches.
The log files are dumped as soon as I get them because I have not tried to capture and save a telnet session. I guess I could with WireShark, but I don't need a history of that session. I just need to find the problem and verify a fix.