GCD dispatch_io_read() intermittent under CPU strain - objective-c

I'm building a macOS app in Objective-C that uses dispatch_io_read() to stream live data from a socket, and it works great 99% of the time. However, if the computer happens to be under high CPU load (for unrelated reasons), the reads start to become delayed and "batched/buffered", regardless of whether my application is in the foreground or not. For example, I can reproduce this by starting a code compilation in the background while running my app: sometimes I receive no data for several seconds, only to then get everything in one quick burst.
Does anyone know why this is happening? My assumption is that my reads are running at a lower CPU priority than the other tasks on the system, but how would I solve that?
I have tried using different dispatch queues, etc., but it does not make a difference.
void dispatch_io_read(dispatch_io_t channel,
                      off_t offset,
                      size_t length,
                      dispatch_queue_t queue,
                      dispatch_io_handler_t io_handler);
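
For reference, here is a simplified sketch of what I've been experimenting with (sockfd and the queue label are placeholders; the QoS attribute on the queue and the low-water mark on the channel are the two knobs I've been poking at to keep reads responsive under load):

dispatch_queue_attr_t attr =
    dispatch_queue_attr_make_with_qos_class(DISPATCH_QUEUE_SERIAL,
                                            QOS_CLASS_USER_INITIATED, 0);
dispatch_queue_t ioQueue = dispatch_queue_create("com.example.socket-io", attr);

dispatch_io_t channel =
    dispatch_io_create(DISPATCH_IO_STREAM, sockfd, ioQueue, ^(int error) {
        if (error) NSLog(@"channel closed, error %d", error);
    });

// Ask for delivery as soon as any bytes are available, rather than
// letting the channel accumulate a larger chunk before calling back.
dispatch_io_set_low_water(channel, 1);

dispatch_io_read(channel, 0, SIZE_MAX, ioQueue,
                 ^(bool done, dispatch_data_t data, int error) {
    if (data) {
        // consume the incoming bytes
    }
    if (done || error) {
        // tear down / reconnect
    }
});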

Related

How to improve throughput of TUN interface when using Erlang TUNCTL

I'm using TUNCTL with {active, true} to get UDP packets from a TUN interface. The process gets the packets and sends them to a different process, which does some work and sends them on to yet another process, which pushes them out a different interface using gen_udp. The same pipeline runs in the opposite direction: I use gen_udp to get packets and send them to the TUN interface.
I start seeing overruns on the incoming TUN interface when CPU load is close to 50%, at about 2500 packets/sec. I never lose any packets on the gen_udp side, only with tunctl. Why is my application not getting all the packets from the TUN interface when the CPU is not overloaded? My process has no messages in its message queue.
I've played with process priorities and buffer sizes, which didn't do much. Total CPU load makes a bit of a difference: I managed to lower it, but even though I saw a slight increase in TUN interface throughput, it now seems to max out at a lower CPU load, say 50% instead of 60%.
Is TUNCTL/Procket not able to read packets fast enough, or is TUNCTL/Procket not getting enough CPU time for some reason? My theory is that the Erlang scheduler doesn't know how much time my process needs, because it's calling a NIF and has no idea how many unhandled messages are sitting on the TUN interface. Do I need to get my hands dirty with C++ and/or write my own NIF? MSANTOS HELP!
As expected, it was a problem with TUNCTL not getting enough CPU time when active is true. I switched to procket:read, which reads the packet straight from the TUN buffer. With this approach you specify how often to check the buffer, which tells the Erlang scheduler how much time your process needs. This let me load the CPU up to 100% if needed and still get all the packets I needed from the TUN interface. Bottleneck solved.

How to determine GTK3 memory leaks on OS X

I've written a small, single-line-oriented, UDP-based display service to support the Raspberry Pi projects I frequently work on, where it is nice to see the results of the sensor data being captured. This is a rewrite of that program using GTK3 3.18.9 and GLib 2.46.2. I'm developing on OS X El Capitan.
It seems to double in real memory size every 30 minutes or so depending on traffic, so I'm presuming I have a memory leak somewhere, but for the life of me I can't see where in the code it could possibly be. Valgrind did not initially work for me, so I've got some studying to do to resolve whatever issue that is.
Meanwhile, I was hoping a different pair of eyes might be able to suggest a coding cause for a traffic-driven (at least I think so) memory leak. Here are the program and test client.
It starts up using about 10 MB of real memory, jumps to 14 MB once it has received 64 messages in total, then slowly grows from there. At 64 messages I start deleting the 65th message off the end of the list, presuming I should be saving memory, as this program might run for weeks.
Here is the code for the test client and the display service:
https://gist.github.com/skoona/c1218919cf393c98af474a4bf868009f
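
A minimal sketch of the kind of trim routine I mean is below (not the actual gist code; the string column index 0 is a placeholder). One thing worth double-checking in code like this is that every string fetched with gtk_tree_model_get() gets g_free()'d, since the model hands back a newly allocated copy for string columns and a missing free leaks in proportion to traffic:

#include <gtk/gtk.h>

static void
trim_list_store(GtkListStore *store, guint max_rows)
{
    GtkTreeModel *model = GTK_TREE_MODEL(store);
    GtkTreeIter   iter;
    gint          rows;

    while ((rows = gtk_tree_model_iter_n_children(model, NULL)) > (gint)max_rows) {
        if (!gtk_tree_model_iter_nth_child(model, &iter, NULL, rows - 1))
            break;

        gchar *text = NULL;
        /* e.g. fetched to log which message is being dropped */
        gtk_tree_model_get(model, &iter, 0, &text, -1);  /* hands back a copy */
        g_debug("dropping: %s", text ? text : "(null)");
        g_free(text);                                    /* required, or it leaks */

        gtk_list_store_remove(store, &iter);
    }
}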

Operating System Basics

I am reading about process management and I have a few doubts:
What is meant by an I/O request? For example, a process that is executing is in the running state, and it is in the waiting state if it is waiting for the completion of an I/O request. I don't follow what exactly an I/O request is; can you please give an example to elaborate?
Another doubt: let's say a process is executing and suddenly an interrupt occurs, so the process stops executing and is put in the ready state. Is it possible for some other process to begin executing while the interrupt is still being processed?
Regarding the first question:
A simple way to think about it...
Your computer has lots of components: CPU, hard drive, network card, sound card, GPU, and so on. They all work in parallel and independently of each other, and they are generally much slower than the CPU.
This means that whenever a process makes a call that, further down the line (on the OS side), ends up communicating with an external device, there is no point in the OS sitting there waiting for the result, because the time that operation takes to complete is practically an eternity from the CPU's point of view.
So the OS fires off whatever communication the process requested (call it an I/O request), flags the process as waiting for I/O, and switches execution to another process, so the CPU can do something useful instead of sitting around blocked waiting for the I/O request to complete.
When the external device finishes whatever operation was requested, it generates an interrupt, so the OS is informed the work is done, and it can then flag the blocked process as ready again.
This is all a very simplified view of course, but that's the main idea. It allows the CPU to do useful work instead of waiting for IO requests to complete.
Regarding the second question:
It's tricky, even for single CPU machines, and depends on how the OS handles interrupts.
For code simplicity, a simple OS might, for example, process each interrupt in one go as soon as it happens and only then resume whichever process it decides is appropriate. In that case, no other process would run until the interrupt handling is complete.
In practice, things get a bit more complicated for performance and latency reasons.
If you think of an interrupt's lifetime as just another task for the CPU (from the moment the interrupt arrives to the point the OS considers its handling complete), you can effectively code the interrupt handling to run in parallel with other things.
Just think of the interrupt as a notification for the OS to start another task (the interrupt handling). It grabs whatever context it needs at the point the interrupt arrived, then keeps processing that task in parallel with other processes.
An I/O request generally just means a request to do input, output, or both. The exact meaning varies depending on the context: HTTP, networking, console operations, or perhaps some process inside the CPU.
A process waiting for I/O: say, for example, you were writing a program in C to accept the user's name on the command line and then print 'Hello User' back. Your code will sit in the waiting state until the user enters their name and hits Enter. This is a high-level example, but even a very low-level process executing on your computer's processor works on the same basic principle.
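As a concrete version of that example (a minimal C sketch):

#include <stdio.h>

int main(void)
{
    char name[64];

    printf("What is your name? ");
    fflush(stdout);

    /* The process blocks here -- the OS marks it as waiting for I/O
       until the user types a name and presses Enter. */
    if (fgets(name, sizeof name, stdin) != NULL)
        printf("Hello %s", name);   /* fgets keeps the trailing newline */

    return 0;
}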
Can the processor work on other processes while the current one is interrupted and waiting on something? Yes! You'd better hope it does; that's what scheduling algorithms and stacks are for. The real answer, however, depends on the architecture you are on and whether it supports parallel or serial processing, and so on.

WasapiLoopbackCapture to WaveOut

I'm using WasapiLoopbackCapture to capture the sound coming from my speakers, and in onDataAvailable I send it to another device. I'm attempting to play the data back using the WaveOut class and a BufferedWaveProvider, adding a sample every time data is sent from my client via onDataAvailable. I'm having problems with the sound. The best I've managed to get working is:
Not syncing the wave format of the client and the server, just sending data and adding it to the buffer. The problem is that this stutters very badly, even though I checked the buffered size and it holds 51 seconds. I even have to keep increasing the buffer size, which eventually overflows anyway.
I tried syncing the wave format and I just get clicks, though then I have no problem with the buffer size. I also tried making sure that at least a second was stored in the buffer, but that had zero effect.
If anyone could point me in the right direction that would be great.
Uncompressed audio takes up a lot of space on a network. On my machine the WasapiLoopbackCapture object produces 32-bit (IeeeFloat) stereo samples at 44100 samples per second, for around 2.7Mbit/sec total raw bandwidth. Once you factor in TCP packet overheads and so on, that's quite a lot of data you're transferring.
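To spell that arithmetic out: 32 bits × 2 channels × 44,100 samples/sec = 2,822,400 bits/sec, which is the ~2.7 Mbit/sec figure (roughly 345 KB/sec of payload) before any packet framing is added.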
The first thing I would suggest, though, is that you plug in some profiling code at each step in the process to get an idea of where your bottlenecks are. How fast is data arriving from the capture device? How big are your packets? How long does it take to service each call to your OnDataAvailable event handler? How much data are you sending per second across the network? How fast is the data arriving at the client? Once you know where the slow-downs are, you'll have a much better idea of what to fix.
Try building a simulated server that reads data from a wave file in various WaveFormats (channels, bits per sample and sample rate) and simulates sending that data across the network to the client. You might find that the problem goes away at lower bandwidth. And if bandwidth is the issue, compression might be the solution.
If you're using a single-threaded model and servicing each OnDataAvailable event takes longer than the recording interval allows (i.e. you can't keep up with the expected number of OnDataAvailable calls per second), then there's going to be data loss. Multiple threads can help with this: one to pull the data from the audio system, another to process and send it. But you can still end up in the same position, losing data because you're not dealing with it quickly enough. When that happens it's handy to know about it, because it indicates a problem in the program. Find out when and where it happens - overflows in the input, processing and output buffers all have different likely causes and need different attention.

Using only free CPU time with Objective-C program

The BOINC client (which runs distributed processing jobs, as SETI@home does) is able to turn processing on or off based on whether other processes are using a certain percentage of CPU time. That is, if the user starts to do some work and their processes start using 60% CPU, BOINC can pause to avoid interfering with the user's work.
I would like to do the same thing (monitor CPU usage by other processes). The difficulty as I see it is not monitoring CPU usage, but rather making sure that the information isn't skewed by my own usage. For example, if my process is using a ton of CPU time it may prevent another process from using enough to trigger the pause.
Can someone point me in the right direction? Even a suggestion for what to search for would be useful. I'm not really sure what this feature would be called.
You can use NSTask to set the 'nice' value of the process when your process starts.
There is also [[NSThread mainThread] setThreadPriority:0.0], where the priority value is between 0.0 and 1.0; it's a Cocoa API, which may save you frakking about with sudo.
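
On the measuring side of the question, here is a rough sketch (simplified, no error handling, and the tick conversion is approximate): host_statistics() reports system-wide CPU ticks and getrusage() reports this process's own CPU time, so sampling both across an interval and subtracting your own share approximates the CPU being used by everyone else.

#include <mach/mach.h>
#include <sys/resource.h>
#include <stdint.h>
#include <unistd.h>

// One snapshot: system-wide CPU ticks plus this process's own CPU seconds.
typedef struct {
    uint64_t busy;   // user + system + nice ticks, summed over all CPUs
    uint64_t total;  // busy + idle ticks
    double   self;   // this process's CPU seconds (user + system)
} cpu_sample_t;

static cpu_sample_t take_sample(void) {
    host_cpu_load_info_data_t load;
    mach_msg_type_number_t count = HOST_CPU_LOAD_INFO_COUNT;
    host_statistics(mach_host_self(), HOST_CPU_LOAD_INFO,
                    (host_info_t)&load, &count);

    struct rusage ru;
    getrusage(RUSAGE_SELF, &ru);

    cpu_sample_t s;
    s.busy  = load.cpu_ticks[CPU_STATE_USER]
            + load.cpu_ticks[CPU_STATE_SYSTEM]
            + load.cpu_ticks[CPU_STATE_NICE];
    s.total = s.busy + load.cpu_ticks[CPU_STATE_IDLE];
    s.self  = ru.ru_utime.tv_sec + ru.ru_utime.tv_usec / 1e6
            + ru.ru_stime.tv_sec + ru.ru_stime.tv_usec / 1e6;
    return s;
}

// Fraction of total CPU (0.0 - 1.0) used by processes other than this one
// between two samples taken some interval apart.
static double other_cpu_fraction(cpu_sample_t a, cpu_sample_t b) {
    double total = (double)(b.total - a.total);
    if (total <= 0) return 0.0;
    double busy = (double)(b.busy - a.busy);
    // Approximate: convert our own CPU seconds into the same tick units.
    double self = (b.self - a.self) * (double)sysconf(_SC_CLK_TCK);
    double others = busy - self;
    return others > 0 ? others / total : 0.0;
}

Sample every second or so; if other_cpu_fraction() stays above your chosen threshold (BOINC-style, e.g. 60%), pause your own work until it drops back down.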