Using Cocoa to detect when a running application plays audio - objective-c

I'm looking into writing an app that runs as a background process and detects when an app (say, Safari) is playing audio. I can use NSWorkspace to get the process IDs of the currently running applications, but I'm at a loss when it comes to detecting what those processes are doing. I assume there is a way to listen in on a process and detect what public messages the objects are sending. I apologize for my ignorance on the subject.
Has anyone attempted anything like this or are aware of any resources that can help?

I don't think that your "answer" is an answer at all...
and there IS an answer (which is not "42")
Your best bet for doing this would be to write a pass-through audio output device, much like Soundflower, actually. Your audio output device would then load the actual (physical) audio output device and pass the audio data along to it directly (after first having a look at the audio stream, of course!). Then you only need to convince your users to configure your audio device as the default audio output device, so that the majority of applications which play sound will use it automatically. And voilà...
Your audio processing function will probably just do a quick RMS on the buffer before passing it along to the actual output device. When the audio power crosses a certain threshold (probably something like -54 dB with Apple audio hardware), you know that some app is making sound.
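A minimal sketch of that check, assuming interleaved float samples (the function name and the epsilon guard are mine, not part of any real driver API):

#include <cmath>
#include <cstddef>

// Compute the RMS of a buffer of float samples, convert it to dB, and
// compare against the threshold suggested above. Purely illustrative.
static bool BufferContainsAudio(const float* samples, size_t count)
{
    if (count == 0) return false;
    double sumSquares = 0.0;
    for (size_t i = 0; i < count; ++i)
        sumSquares += static_cast<double>(samples[i]) * samples[i];
    const double rms = std::sqrt(sumSquares / count);
    const double dB  = 20.0 * std::log10(rms + 1e-12); // epsilon avoids log(0)
    return dB > -54.0; // rough threshold for Apple audio hardware
}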

Soundflower is an open-source project that allows Mac OS X applications to pass audio to each other. It almost certainly does something similar to what you describe.

I've been informed on another thread that while this is possible, it is an extremely advanced technique and not recommended. It would involve using Application Enhancer (APE) and is not considered a 'nice' thing to do. Looks like that app idea is destined for the big recycling bin in the sky :)

Related

Get macOS Output Device Audio Buffers in Realtime

I'm trying to tap the currently selected output audio device on macOS, so I basically have a pass through listener that can monitor the audio stream currently being output without affecting it.
I want to copy this data to a ring buffer in real time so I can operate on it separately.
The combination of Apple docs and (outdated?) SO answers is confusing as to whether I need to write a hacky kernel extension, can utilise CoreAudio for this, or need to interface with the HAL?
I would like to work in Swift if possible.
Many thanks
(ps. I had been looking at this and this)
I don't know about kernel extensions - their use of special "call us" signing certificates or the necessity of turning off SIP discourages casual exploration.
However you can use a combination of CoreAudio and HAL AudioServer plugins to do what you want, and you don't even need to write the plugin yourself, there are several open source versions to choose from.
CoreAudio doesn't give you a way to record from (or "tap") output devices - you can only record from input devices - so the way to get around this is to create a virtual "pass-through" device (an AudioServerPlugin), not associated with any hardware, that copies output through to input, then set this pass-through device as the default output and record from its input. I've done this using open-source AudioServer plugins like BackgroundMusic and BlackHole [TODO: add more].
To tap/record from the resulting device you can simply add an AudioDeviceIOProc callback to it, or set the device as the kAudioOutputUnitProperty_CurrentDevice of a kAudioUnitSubType_HALOutput AudioUnit.
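For the IOProc route, a hedged sketch (passthroughDevice is assumed to already hold the AudioObjectID of the virtual device; the CoreAudio calls themselves are standard):

#include <CoreAudio/CoreAudio.h>

// Runs on the realtime audio thread; for a pass-through device the
// "input" side carries the audio that applications are playing out.
static OSStatus TapIOProc(AudioObjectID inDevice,
                          const AudioTimeStamp* inNow,
                          const AudioBufferList* inInputData,
                          const AudioTimeStamp* inInputTime,
                          AudioBufferList* outOutputData,
                          const AudioTimeStamp* inOutputTime,
                          void* inClientData)
{
    // Copy inInputData into a ring buffer here (realtime-safe code only).
    return noErr;
}

void StartTap(AudioObjectID passthroughDevice)
{
    AudioDeviceIOProcID procID = NULL;
    AudioDeviceCreateIOProcID(passthroughDevice, TapIOProc, NULL, &procID);
    AudioDeviceStart(passthroughDevice, procID);
    // Later: AudioDeviceStop(passthroughDevice, procID);
    //        AudioDeviceDestroyIOProcID(passthroughDevice, procID);
}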
There are two problems with the above virtual pass-through device approach:
1. you can't hear your output anymore, because it's being consumed by the pass-through device
2. changing the default output device will switch away from your device and the tap will fall silent.
If 1. is a problem, then a simple fix is to create a Multi-Output Device containing the pass-through device and a real output device, and set this as the default output device. Volume controls stop working, but you can still change the real output device's volume in Audio MIDI Setup.app.
For 2., you can add a listener to the default output device and update the multi-output device above when it changes.
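A sketch of such a listener (the property constants and AudioObjectAddPropertyListener are standard CoreAudio; the update logic is left as a stub):

#include <CoreAudio/CoreAudio.h>

// Fires whenever the system default output device changes; re-point the
// multi-output device at the new hardware device from here.
static OSStatus DefaultOutputChanged(AudioObjectID inObjectID,
                                     UInt32 inNumberAddresses,
                                     const AudioObjectPropertyAddress* inAddresses,
                                     void* inClientData)
{
    // ... update the multi-output device's sub-device list ...
    return noErr;
}

void WatchDefaultOutput()
{
    AudioObjectPropertyAddress addr = {
        kAudioHardwarePropertyDefaultOutputDevice,
        kAudioObjectPropertyScopeGlobal,
        kAudioObjectPropertyElementMaster
    };
    AudioObjectAddPropertyListener(kAudioObjectSystemObject, &addr,
                                   DefaultOutputChanged, NULL);
}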
You can do most of the above in Swift, although for ring-buffer stowing from the buffer-delivery callbacks you'll have to use C or some other language that can respect the realtime audio rules (no locks, no memory allocation, etc.). You could maybe try AVAudioEngine to do the tap, but IIRC changing the input device is a vale of tears.
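For the ring-buffer part, a minimal single-producer/single-consumer sketch that follows those rules (no locks, no allocation after construction; illustrative, not from any particular project):

#include <atomic>
#include <cstddef>
#include <vector>

// SPSC float ring buffer: the realtime IOProc pushes, a worker thread pops.
// Indices grow monotonically; no locks, no post-construction allocation.
class RingBuffer {
public:
    explicit RingBuffer(size_t capacity) : buf_(capacity), head_(0), tail_(0) {}

    // Producer side - safe to call from the realtime thread.
    bool Push(const float* src, size_t n) {
        const size_t head = head_.load(std::memory_order_relaxed);
        const size_t tail = tail_.load(std::memory_order_acquire);
        if (buf_.size() - (head - tail) < n) return false; // full: drop
        for (size_t i = 0; i < n; ++i)
            buf_[(head + i) % buf_.size()] = src[i];
        head_.store(head + n, std::memory_order_release);
        return true;
    }

    // Consumer side - runs on a normal (non-realtime) thread.
    size_t Pop(float* dst, size_t n) {
        const size_t tail = tail_.load(std::memory_order_relaxed);
        const size_t head = head_.load(std::memory_order_acquire);
        const size_t avail = head - tail;
        if (n > avail) n = avail;
        for (size_t i = 0; i < n; ++i)
            dst[i] = buf_[(tail + i) % buf_.size()];
        tail_.store(tail + n, std::memory_order_release);
        return n;
    }

private:
    std::vector<float> buf_;
    std::atomic<size_t> head_, tail_;
};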

Streaming IP Camera solutions that do not require a computer?

I want to embed a video stream into my web page, which is part of our own cloud based software. The video should be low-latency (like video conferencing), and it would be preferable, but not required, for it to include audio. I am comfortable serving streaming binary data from the server-side, and embedding it into the page using HTML5 video.
What I am not comfortable with is the ability to capture the video data to begin with. The client does not already have a solution in place, and is looking to us for assistance. The video would be routed through our server equipment, and not be an embedded piece that connects directly to the video source.
It is a known quantity for us to use a USB or built-in camera from the computer. What I would like more information about is stand-alone cameras.
Some models of cameras have their own API documentation (example). It would seem from what I am reading that a manufacturer would typically have their own API which they repeat on many or all of their models, and that each manufacturer would be different in their API. However, I have only done surface reading and hope to gain more knowledge from someone who has already researched this, or perhaps even had first hand experience.
Do stand-alone cameras generally include an API? (Wouldn't this be a common requirement, so that security software can use multiple lines of cameras?) Or if not an API, how is the data retrieved from the on-board webserver? Is it usually Flash-based? Perhaps there is a re-usable video stream I could capture from there? Or is the stream formatting usually diverse?
What would I run into when trying to get the server-side to capture that data?
How does latency on a stand-alone device compare with a USB camera solution?
Do you have tips on picking out a stand-alone camera that would be a good fit for streaming through a server?
I am experienced at using JavaScript (both HTML5 and Node.js), Perl and Java.
Each camera manufacturer has their own take on this in terms of access points; generally you should be able to ask for a snapshot or an MJPEG stream, but it can vary. Take a look at this entry on CodeProject; it tackles two common methodologies. Here's another one targeted at Foscam specifically.
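To illustrate the snapshot route, a hedged sketch using libcurl; the /snapshot.jpg path is made up and varies by manufacturer:

#include <curl/curl.h>
#include <string>
#include <vector>

// Appends received bytes to the std::vector<char> passed via userdata.
static size_t CollectBytes(char* ptr, size_t size, size_t nmemb, void* userdata)
{
    std::vector<char>* out = static_cast<std::vector<char>*>(userdata);
    out->insert(out->end(), ptr, ptr + size * nmemb);
    return size * nmemb;
}

// Fetch a single snapshot frame over HTTP. "/snapshot.jpg" is a
// hypothetical path - check the camera's API documentation.
std::vector<char> FetchSnapshot(const std::string& cameraHost)
{
    std::vector<char> jpeg;
    CURL* curl = curl_easy_init();
    if (!curl) return jpeg;
    const std::string url = "http://" + cameraHost + "/snapshot.jpg";
    curl_easy_setopt(curl, CURLOPT_URL, url.c_str());
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, CollectBytes);
    curl_easy_setopt(curl, CURLOPT_WRITEDATA, &jpeg);
    curl_easy_perform(curl);
    curl_easy_cleanup(curl);
    return jpeg;
}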
Get a good NAS; I suggest Synology - check out their long list of supported IP web cams. You can connect them with a hub or a router or whatever you wish. It's not a "computer" as in a "tower", but it does many computer jobs, and it can stay on while your computer is off or away, and do things like video feeds, torrents, backups, etc.
I'm not an expert on all the features, so I don't know how to get it to broadcast without recording, but even if it does record, at least it's separate. Synology is a popular brand and there are a lot of authorized and unauthorized plugins for it. Check them out and see if one suits you.

QTKit: Analog for VideoContext for the sound

I am writing a simple application for streaming video over the network, using an approach slightly different from the ordinary "H.264 over RTP" one (I am using my own codecs).
To achieve this, I need the raw frames and raw audio samples that QTMovie, when playing back a movie, implicitly sends to QTMovieView.
The most common way to retrieve raw video frames is to use a VisualContext - and then, using a display link callback, I "generate" a CVPixelBufferRef from this VisualContext. So I am getting frames with a frequency that is synchronized with my current refresh rate (not that I need this synchronization - I only need to have a "stream" of frames that I can transmit over the network - but the Core Video Programming Guide and most Apple samples related to video promote this approach).
The first problem I have faced is that when I attach a VisualContext to a QTMovie, the picture can't be rendered onto the QTMovieView anymore. I don't know why this happens (I guess it's related to the idea of GWorld and the rendering being "detached" from it when I attach the VisualContext). OK, at least I have frames, which I could render onto a simple NSView (though this sounds wrong and performance-unfriendly. Am I doing it right?)
As for the sound, I have no idea what to do. I need to get raw samples of sound as the movie is being played (ideally, something similar to what QTCaptureDecompressedAudioOutput returns in its callback).
I have prepared myself to delve into the deprecated Carbon QuickTime APIs if there is no other way. But I don't even know where to start. Should I use the same Core Video display link and periodically retrieve the sound somehow? Should I get a QTDataReference and locate the sound frames manually?
I am actually a beginner with programming video and audio services. If you could share some experience, I would REALLY appreciate any idea you could share with me :)
Thank you,
James

Mac OS X equivalent for DirectShow, GraphEdit

New to Mac OS X, familiar with Windows. Windows has DirectShow, a good number of built-in filters, COM programming, and GraphEdit for very fast prototyping and snooping on the graphs you've constructed in code.
I'm now about to go to the Mac to work with cameras, webcams, microphones, color spaces, files, splitting, synchronization, rendering, file reading, file saving, and many of the things I've come to take for granted with DirectShow when putting together applications for live performance. On the Mac side, so far I've found ... nothing! Either I don't know where to look, or I'm having the toughest time tying the Mac's reputation for ease of handling media to a coherent programmatic ability to get in there and start messin' with media-manipulatin' building blocks.
I've seen some weak suggestions to use gstreamer or some library for QT but I can't bring myself to believe that this is the Apple way to go. And I've come across some QuickTime documentation but I'm not looking to do transitions, sprites, broadcasting, ...
Having a brain trained on DirectShow means I don't even know how Apple thinks about providing DirectShow-like functionality. That means I don't know the right keywords and don't even know where to look. Books? Bought a few. Now I might be able to write some code that can edit your sister's wedding video (if I can't make decent headway on this topic I may next be asking what that'd be worth to you), but for identifying what filters are available and how to string them together ... nothing. Suggestions?
Video handling is going through a huge transition on the Mac at the moment. QuickTime is very old, but also big and powerful, so it's been undergoing an incremental replacement process for the past 5 years or so.
That said, QTKit is the QuickTime subset (capture, playback, format conversion and basic video editing) which is supported going forward. The legacy QuickTime APIs are still there for the moment, and will probably remain at least until their major features are available elsewhere, but they are 32-bit only. For some involved video work you may end up needing to use them in places.
At the moment, iOS is ahead of the Mac because it could start from scratch with AV Foundation. The future of the Mac media frameworks will probably either be AV Foundation directly (with QTKit being a lightweight shim over the top) or an extension of QTKit that looks very similar.
For audio there's Core Audio which is on Mac and iOS and isn't going away any time soon. It's quite powerful but somewhat obtuse in places. Luckily online support is very good; the mailing list is an essential resource.
For filters and frame-level processing you've got Core Video, as someone else mentioned, as well as Core Image. For motion graphics there's Quartz Composer, which includes a graphical editor and a plugin architecture for adding your own patches. For programmatic procedural animation and easily mixing rendering models (OpenGL, Quartz, video, etc.) there's Core Animation.
In addition to all of these, of course there's no reason you can't use open source libraries where the built-in stuff doesn't do what you want.
To address your comment below:
In QuickTime (and QTKit), individual data types like audio and video are represented as tracks. It may not be immediately clear that QuickTime can open audio as well as video file formats. A common way to combine audio and video would be:
1. Create a QTMovie with your video file.
2. Create a QTMovie with your audio file.
3. Take the QTTrack object representing the audio and add it to the QTMovie with the video in it.
4. Flatten the movie, so it doesn't simply contain a reference to the other movie but actually contains the audio data.
5. Write the movie to disk.
Here's an example from Blender. You'll see how the A/V muxing is done in the end_qt function. There's also some use of Core Audio in there (AudioConverter*). (There's some classic QuickTime export code in quicktime_export.c but it doesn't seem to do audio.)

Power Management in Symbian

Are there any "best practices" for writing a power-efficient background application in Symbian?
Specifically, is there any way (i.e. an API) for a Symbian app to give the OS a hint about its current state in order to reduce battery consumption?
In Android, for instance, there is the notion of Wake Locks, which prevents the device from going into standby mode - Is there anything similar in Symbian?
EDIT:
Are there any implications when running code as a separate thread with the Open-C library, and not as "native" Symbian C++, using Active Objects etc.? (the Open-C code is blocking on IO most of the time).
You can check user (in)activity with the RTimer::Inactivity() method. This approach is described on a Forum Nokia Wiki page, which also describes how to reset the inactivity timer.
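A minimal sketch (blocking for brevity; a real service would wrap the request in an active object):

#include <e32std.h>

// Waits until the user has been inactive for 30 seconds, then lets the
// caller throttle its background work.
void WaitForUserIdleL()
{
    RTimer timer;
    User::LeaveIfError(timer.CreateLocal());
    TRequestStatus status;
    timer.Inactivity(status, TTimeIntervalSeconds(30));
    User::WaitForRequest(status);
    // ... user is idle: reduce polling, close sessions, etc. ...
    timer.Close();
}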
You can check whether the device screen is turned on or off using the HAL API. See the classes HAL and HALData. You can use a call like this:
TInt displayState;
HAL::Get(HALData::EDisplayState, displayState);
After the call, displayState will hold 0 if the display is turned off, or 1 otherwise.
With these APIs you will know whether the user is currently active, so you'll be able to change the behavior of your background service to reduce its power consumption.
You can also use the Nokia Energy Profiler application to record the handset's power consumption under the different power-saving options of your background service. Also please refer to Nokia's document describing best practices for saving device power. This document is quite straightforward, but useful nonetheless.
Hope this helps.
EDIT: About the separate thread and Open C. As far as I know, Open C is just a plugin, and deep down all the implementations are still "native Symbian". So as long as you avoid periodically polling some resource and just use the usual blocking IO, your code is about as economical on power as the standard Symbian Active Object techniques (which use Symbian-specific semaphores to block threads).
I have not come across anything special in Symbian to keep the device out of stand-by mode. Basically the "best practices" are the same as for all mobile devices:
Don't loop waiting for things; always use whatever signalling services are available on the platform - for Symbian, Active Objects / User::WaitForXxx (see the sketch after this list)
Limit the number of background threads (currently all mobile devices still have only 1 CPU...)
Don't hang onto system services; close them ASAP (this is normally the main battery drain in my mobile applications - figuring out which system service causes the most battery drain can be a real pain, and WinMo is very bad for this).
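Here is the sketch referenced above: a skeleton active object that sleeps in the active scheduler until its request completes (CIdleWorker and the timer payload are illustrative):

#include <e32base.h>

// Standard active-object pattern: the thread sleeps in the active
// scheduler until the request completes, so nothing polls or spins.
class CIdleWorker : public CActive
{
public:
    static CIdleWorker* NewL()
    {
        CIdleWorker* self = new (ELeave) CIdleWorker();
        CleanupStack::PushL(self);
        self->ConstructL();
        CleanupStack::Pop(self);
        return self;
    }
    ~CIdleWorker() { Cancel(); iTimer.Close(); }

    void StartL(TTimeIntervalMicroSeconds32 aDelay)
    {
        iTimer.After(iStatus, aDelay); // issue the asynchronous request
        SetActive();                   // the scheduler wakes us in RunL()
    }

private:
    CIdleWorker() : CActive(EPriorityStandard) { CActiveScheduler::Add(this); }
    void ConstructL() { User::LeaveIfError(iTimer.CreateLocal()); }
    void RunL()     { /* request completed: do the work, re-issue if needed */ }
    void DoCancel() { iTimer.Cancel(); }

    RTimer iTimer;
};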
For me, I find that it mostly comes down to a tradeoff between battery life and the application's performance/responsiveness. Unfortunately the powers that be always seem to side with performance/responsiveness and damn the battery drain...
Give your application low priority (see the RProcess and RThread classes). Your approach will really depend on what your background application does. These things consume the most battery: radio (GSM/3G/WiFi/Bluetooth), screen backlight, and file accesses.
Symbian OS will always try to put your application to sleep; you don't need to tell it to do this. Just make sure your approach gives it the opportunity.
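A tiny sketch of the low-priority suggestion (both calls are standard Symbian APIs; whether the process-level call is permitted can depend on platform security):

#include <e32std.h>

// Lower the current process and thread priority so the scheduler favours
// foreground applications and the device can idle sooner.
void DropToBackgroundPriority()
{
    RProcess().SetPriority(EPriorityLow);
    RThread().SetPriority(EPriorityAbsoluteLow);
}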
Power management is a very important issue when developing an application.
In Symbian it depends on what you are using to run background activities: a thread or an active object.
For example, if you are developing a browser and want it to download something, that download activity should run in the background, report its progress as it runs, and come back to the foreground when it finishes.
If you use threads, it depends on how you manage them: you can decide which thread to pause when a long-running activity starts, and when to resume it once the background activity has finished executing.
This is in fact a very good topic you have come across.
There used to be an inactivity timer which could be reset by the application. This would prevent the screen from going into any screen saver mode.
If you use the various asynchronous functions in Symbian, your app will run when appropriate.
One of these methods should work depending on your needs. If you describe what you want to achieve in more detail it would be easier to help you.