QTKit: Analog for VideoContext for the sound - objective-c

I am writing a simple application for streaming video over the network, using a slightly different from the ordinary "H.264 over RTP" approach (i am using my own codecs).
To achieve this, i need raw frames and raw audio samples that QTMovie, when playing back a movie, implicitly sends to QTMovieView.
The most common way to retrieve raw video frames is to use VisualContext - and then, using a display link callback, i "generate" a CVPixelBufferRef, using this VisualContext. So i am getting frames with some frequency that is synchronized with my current refresh rate (not that i need this synchronization - i only need to have a "stream" of frames that i can transmit over the network - but CoreVideo Programming Guide and most Apple samples related to video promote this approach).
The first problem i have faced with - is when i attach a VisualContext to a QTMovie, the picture can't be rendered onto the QTMovieView anymore. I don't know why does this happen (i guess it's related to the idea of GWorld and the rendering being "detached" from it when i attach VisualContext). Ok, at least i have frames, which i could render onto a simple NSView (though this sounds wrong, and performance-unfriendly. Am i doing it right?)
What about the sound, i have no idea what to do. I need to get raw samples of sound as the movie being played (ideally - something similar to what QTCaptureDecompressedAudioOutput returns in its callback).
I have prepared myself to delving into deprecated Carbon QuickTime APIs, if there is no other way. But I don't know even where to start. Should i use the same CoreVideo Display link and periodically retrieve sound somehow? Should i get QTDataReference and locate the sound frames manually?
I am actually a beginner with programming video and audio services. If you could share some experience i would REALLY appreciate any idea you could share with me :)
Thank you,
James

Related

How to modify DirectX camera

Suppose I have a 3D (but not stereoscopic) DirectX game or program. Is there a way for a second program (or a driver) to change the camera position in the game?
I'm trying to build a head-tracking plugin or driver that I can use for my DirectX games/programs. An inertial motion sensor will give me the position of my head but my problem is using that position data to change the camera position, not with the hardware/math concerns of head tracking.
I haven't been able to find anything on how to do this so far, but iZ3D was able to create two cameras near the original camera and use it for stereoscopic stuff, so I know there exists some hook/link/connection into DirectX that makes camera manipulation by a second program possible.
If I am able to get this to work I'll release the code.
-Shane
Hooking Direct3D calls in its nature is just hooking DLL calls. I.e. its not something special to D3D but just a generic technique. Try googling for "hook dll" or start from here: [C++] Direct3D hooking sample. As it always happens with hooks there are many caveats and you'll have to make a pretty huge boilerplate to satisfy all needs of the hooked application.
Though, manipulation with camera in games usually gives not good results. There are at least two key features of modern PC game which will severely limit your idea:
Pre-clipping. Almost any game engine filters out objects that are behind the viewing plane. So when you rotate camera to a side you won't see the objects you'd expect to see in a real world - they were just not sent to D3D since game doesn't know that viewing plane has changed.
Multiple passes rendering. Many popular post processing effects are done in extra passes (either thru the whole scene or just part of it). Mirrors and "screens" are the most known such effects. Without knowing what camera you're manipulating with you'll most likely just break the scene.
Btw, #2 is the reason why stereoscopic mode is not 100% compatible with all games. For example, in Source engine HDR scenes are rendered in three passes and if you don't know how to distinguish them you'll do nothing but break the game. Take a look at how nVidia implements their stereoscopic mode: they make a separate hook for every popular game and even with this approach it's not always possible to get expected result.

Objective-C play sound

I know how to play mp3 files and whatnot in Xcode iOS. But how do I play a certain frequency, like if I just wanted to emit a C# note for 25 seconds; how might I do that? (The synth isn't as important to me as just the pitch of the note.)
You need to generate the PCM audio waveform that corresponds to the note you want to play and store that into a sample buffer in memory. Then you send that buffer to the audio hardware.
Here is a tutorial on generating waveforms of several types. The article goes into some details on the many aspects to a note you need to consider, including the frequency, volume, waveform shape, sampling rate, etc. The article comes with Flash source code, I think you should have no problem taking the concepts and adapting them to iOS.
If you also need a library that you can use to play the generated buffers on iOS, then I recommend the open source Finch.
I hope this helps!
You can synthesize waveforms of your desired frequency and feed them to the callbacks of either the Audio Queue or the RemoteIO Audio Unit API.
Here is a short tutorial on some of the code needed to create sine wave tones for iOS in C.

Mac OS X equivalent for DirectShow, GraphEdit

New to Mac OS X, familiar with Windows. Windows has DirectShow, a good number of built-in filters, COM programming, and GraphEdit for very fast prototyping and snooping on the graphs you've constructed in code.
I'm now about to go to the Mac to work with cameras, webcams, microphones, color spaces, files, splitting, synchronization, rendering, file reading, file saving, and many of things I've come to take for granted with DirecShow when putting together applications for live performance. On the Mac side, so far I've found ... nothing! Either I don't know where to look or I'm having the toughest time tying the Mac's reputation for its ease of handling media with a coherent programmatic ability to get in there and start messin' with media manipulatin' building blocks.
I've seen some weak suggestions to use gstreamer or some library for QT but I can't bring myself to believe that this is the Apple way to go. And I've come across some QuickTime documentation but I'm not looking to do transitions, sprites, broadcasting, ...
Having a brain trained on DirectShow means I don't even know how Apple thinks about providing DirectShow-like functionality. That means I don't know the right keywords and don't even know where to look. Books? Bought a few. Now I might be able to write some code that can edit your sister's wedding video (if I can't make decent headway on this topic I may next be asking what that'd be worth to you), but for identifying what filters are available and how to string them together ... nothing. Suggestions?
Video handling is going through a huge transition on the Mac at the moment. QuickTime is very old, but also big and powerful, so it's been undergoing an incremental replacement process for the past 5 years or so.
That said, QTKit is the QuickTime subset (capture, playback, format conversion and basic video editing) which is supported going forward. The legacy QuickTime APIs are still there for the moment, and probably will remain at least until its major features are available elsewhere, but are 32-bit only. For some involved video stuff you may end up needing to use it in places.
At the moment, iOS is ahead of the Mac because it could start from scratch with AV Foundation. The future of the Mac media frameworks will probably either be AV Foundation directly (with QTKit being a lightweight shim over the top) or an extension of QTKit that looks very similar.
For audio there's Core Audio which is on Mac and iOS and isn't going away any time soon. It's quite powerful but somewhat obtuse in places. Luckily online support is very good; the mailing list is an essential resource.
For filters and frame-level processing you've got Core Video as someone else mentioned, as well as Core Image. For motion graphics there's Quartz Composer which includes a graphical editor and a plugin architecture to add your own patches. For programmatic procedural animation and easily mixing rendering modelsĀ (OpenGL, Quartz, video, etc.) there's Core Animation.
In addition to all of these, of course there's no reason you can't use open source libraries where the built-in stuff doesn't do what you want.
To address your comment below:
In QuickTime (and QTKit), individual data types like audio and video are represented as tracks. It may not be immediately clear that QuickTime can open audio as well as video file formats. A common way to combine audio and video would be:
Create a QTMovie with your video file.
Create a QTMovie with your audio file.
Take the QTTrack object representing the audio and add it to the QTMovie with the video in it.
Flatten the movie, so it doesn't simply contain a reference to the other movie but actually contains the audio data.
Write the movie to disk.
Here's an example from Blender. You'll see how the A/V muxing is done in the end_qt function. There's also some use of Core Audio in there (AudioConverter*). (There's some classic QuickTime export code in quicktime_export.c but it doesn't seem to do audio.)

How to programmatically test for audio sync

I have a multimedia application that among other things converts video using FFMpeg. Video conversion being the pain that it is, I have in my test suits some tests that check our ability to convert various video formats, with emphasis on sample videos known not to work.
A common problem we've noticed from users is that some videos end up with their audio desynched after being processed, and I am looking for a way to check this in my tests.
Extracting the audio portion of the resulting videos is not a problem.
My best idea so far would be to check the offset of the first non-silence at both the beginning and end and compare each between the two videos, but I'm hoping someone smart has a better idea.
The application language/environment is Java, but since this is for testing, I'm free to use any toolset.
The basic problem is likely that the video and audio are different lengths. Extract the audio and test its length vs. the video length. If they are significantly different (more than maybe .05 sec, I'm not really sure what is detectable as "off"), then there's a problem.
To fix it, re-encode the audio to match the video length, and then put the audio and video back into a container format.

Using Cocoa to detect when a running application plays audio

I'm looking into writing an app that runs as a background process and detects when an app (say, Safari) is playing audio. I can use NSWorkspace to get the process ID's of the currently running applications but I'm at a loss when it comes to detecting what those processes are doing. I assume that there is a way to listen in on a process and detect what public messages the objects are sending. I apologize for my ignorance on the subject.
Has anyone attempted anything like this or are aware of any resources that can help?
I don't think that your "answer" is an answer at all...
and there IS an answer (which is not "42")
your best bet for doing this would be to write a pass-through audio output device. Much like soundflower, actually. so your audio output device would then load the actual (physical) audio output device and pass the audio data along to it directly (after first having a look at the audio stream, of course!). then you only need to convince your users to configure your audio device as the default audio output device so that the majority of applications which play sound will use it automatically. and voila...
your audio processing function will probably just do a quick RMS on the buffer before passing it along to the actual output device. and when the audio power crosses a certain threshold (probably something like -54dB with apple audio hardware), then you know that some app is making sound.
|K<
SoundFlower is an open-source project that allows Mac OS X applications to pass audio to each other. It almost certainly does something similar to what you describe.
I've been informed on another thread that while this is possible, it is an extremely advanced technique and not recommended. It would involve using Application Enhancer (APE) and is considered a not 'nice' thing to do. Looks like that app idea is destined for the big recycling bin in the sky :)