Objective C - detect sound - objective-c

How can I detect sound on the iPhone?
I currently have it somewhat working using the method described in this article http://mobileorchard.com/tutorial-detecting-when-a-user-blows-into-the-mic/
However, as mentioned in that article, in noisy rooms for example, the method won't work.
So in a way, I'm looking for methods that I can use to separate background noise from actual noise intended?
There probably won't be an existing code on the internet, but if someone can point me to the right direction, it would be very appreciated.
Thank you,
Tee

It's definitely not a great solution, but you could determine a baseline amplitude (ambient sound) and trigger your events when you determine that the amplitude is a certain amount greater for a set amount of time.

Related

Information about CGAL and alternatives

I'm working on a problem that will eventually run in an embedded microcontroller (ESP8266). I need to perform some fairly simple operations on linear equations. I don't need much, but do need to be able work with points and linear equations to:
Define an equations for lines either from two known points, or one
point and a gradient
Calculate a new x,y point on an equation line that is a specific distance from another point on that equation line
Drop a perpendicular onto an equation line from a point
Perform variations of cosine-rule calculations on points and triangle sides defined as equations
I've roughed up some code for this a while ago based on high school "y = mx + c" concepts, but it's flawed (it fails with infinities when lines are vertical), and currently in Scala. Since I suspect I'm reinventing a wheel that's not my primary goal, I'd like to use someone else's work for this!
I've come across CGAL, and it seems very likely it's capable of all this and more, but I have two questions about it (given that it seems to take ages to get enough understanding of this kind of huge library to actually be able to answer simple questions!)
It seems to assert some kind of mathematical perfection in it's calculations, but that's not important to me, and my system will be severely memory constrained. Does it use/offer memory efficient approximations?
Is it possible (and hopefully easy) to separate out just a limited subset of features, or am I going to find the entire library (or even a very large subset) heading into my memory limited machine?
And, I suppose the inevitable follow up: are there more suitable libraries I'm unaware of?
TIA!
The problems that you are mentioning sound fairly simple indeed, so I'm wondering if you really need any library at all. Maybe if you post your original code we could help you fix it--your problem sounds like you need to redo a calculation avoiding a division by zero.
As for your point (2) about separating a limited number of features from CGAL, giving the size and the coding style of that project, from my experience that will be significantly more complicated (if at all possible) than fixing your own code.
In case you want to try a simpler library than CGAL, maybe you could try Boost.Geometry
Regards,

Does ZedGraph offer any kind of Level-of-Detail Culling behavior?

I've searched and can't find an answer to this question. I could write the code myself to do it, but I don't want to reinvent the wheel. :)
Since ZedGraph uses an IPointList and its indexer for internal data access, you can assign any kind of data structure to it and dynamically change the data that ZedGraph receives when it calls the indexer.
It's a smart architecture, and naturally, it would be feasible to implement a Level-of-Detail system using a custom IPointList where the number of points is culled based on the xScale and yScale of the GraphPane.
This way you can have millions of points loaded, but when the zoomlevel of the graph would show all the points, they can be culled so that ZedGraph is only drawing a few thousand. As the zoom magnification is increased, fewer points would be culled in the region of interest.
I wanted to know if ZedGraph already offers anything like this out of the box. I haven't seen any indication of support for it.
Does anyone know?
I posted about this on Sourceforge and got no response there either.
Then I posted on a fork on Github and got a response. It's here:
https://github.com/ZedGraph/ZedGraph/issues/13
The answer:
There is a naive algorithm that filters points by simply skipping them blindly to reach a target display number.
Of course this naive approach can give completely wrong impressions of what the data looks like when peaks and valleys get dropped in a line graph, for instance. IMHO, an algorithm like that is completely unuseable.
So basically, there is no acceptable built-in culling in ZedGraph at the present time.

Tensorflow: how to detect audio direction

I have a task: to determine the sound source location.
I had some experience working with tensorflow, creating predictions on some simple features and datasets. I assume that for this task, there would be necessary to analyze the sound frequences and probably other related data on training and then prediction steps. The sound goes from the headset, so human ear is able to detect the direction.
1) Did somebody already perform that? (unfortunately couldn't find any similar project)
2) What kind of caveats could I meet while trying to achieve that?
3) Am I able to do that using this technology approach? Are there any other sound processing frameworks / technologies / open source projects that could help me ?
I am asking that here, since my research on google, github, stackoverflow didn't show me any relevant results on that specific topic, so any help is highly appreciated!
This is typically done with more traditional DSP with multiple sensors. You might want to look into time difference of arrival(TDOA) and direction of arrival(DOA). Algorithms such as GCC-PHAT and MUSIC will be helpful.
Issues that you might encounter are: DOA accuracy is function of the direct to reverberant ratio of the source, i.e. the more reverberant the environment the harder it is to determine the source location.
Also you might want to consider the number of location dimensions you want to resolve. A point in 3D space is much more difficult than a direction relative to the sensors
Using ML as an approach to this is not entirely without merit but you will have to consider what it is you would be learning, i.e. you probably don't want to learn the test rooms reverberant properties but instead the sensors spatial properties.

FFT Pitch Detection for iOS using Accelerate Framework?

I have been reading up on FFT and Pitch Detection for a while now, but I'm having trouble piecing it all together.
I have worked out that the Accelerate framework is probably the best way to go with this, and I have read the example code from apple to see how to use it for FFTs. What is the input data for the FFT if I wanted to be running the pitch detection in real time? Do I just pass in the audio stream from the microphone? How would I do this?
Also, after I get the FFT output, how can I get the frequency from that? I have been reading everywhere, and can't find any examples or explanations of this?
Thanks for any help.
Frequency and pitch are not the same thing - frequency is a physical quantity, pitch is a psychological percept - they are similar, but there are important differences, which may or may not matter to you, depending on the type of instrument for which you are trying to measure pitch.
You need to read up a little on the various pitch detection algorithms (and on the meaning of pitch itself), decide what algorithm you want to use and only then set about implementing it. See this Wikipedia page for a good overview of pitch and pitch detection (note that you can use FFT for the autocorrelation-based and frequency domain methods).
As for using the FFT to identify peaks in a spectrum and their associated frequencies, there are many questions and answers related to this on SO already, see for example: How do I obtain the frequencies of each value in an FFT?
I have an example implementation of an Autocorrelation function available online for ios 5.1. Look at this post for a link to the implementation AND functions on how to find the nearest note and how to create a string representing the pitch (A, A#, B, B#, etc...)
While the FFT is very useful in many applications, it might not be the most accurate if you're trying to do simple pitch detection. (It can be as accurate, but you have to deal with complex numbers to do a lot of phase calculations)

Creating simple waveforms with CoreAudio

I am new to CoreAudio, and I would like to output a simple sine wave and square wave with a given frequency and amplitude through the speakers using CA. I don't want to use sound files as I want to synthesize the sound.
What do I need to do this? And can you give me an example or tutorial? Thanks.
There are a number of errors in the previous answer. I, the legendary :-) James McCartney, not James Harkins wrote the sinewavedemo, I also wrote SuperCollider which is what the audiosynth.com website is about. I also now work at Apple on CoreAudio. The sinewavedemo DOES use CoreAudio, since it uses AudioHardware.h from CoreAudio.framework as its way to play the sound.
You should not use the sinewavedemo. It is very old code and it makes dangerous assumptions about the buffer layout of the audio hardware. The easiest way nowadays to play a sound that you are generating is to use the AudioQueue, or to use an output audio unit with a render callback set.
The best and easiest way to do that without files is to prepare a single cycle buffer, containing one cycle of the wave (this is called technically a wavetable)
In the playback function called by CoreAudio thread, fill the output buffer with samples read from the wave buffer.
Note however that you will face two problems very quickly :
- for the sine wave, if the playback frequency is not an integer multiple of the desired sine frequency, you will probably need to implement an interpolator if you want to have a good quality. Using only integer pointers will generate a significant level of harmonic noise.
for the square wave, avoid to just program an array with +1 / -1 values. Such a signal is not bandlimited and will alias a lot. Do not forget that the spectrum of a square wave is virtually infinite!
To get good algorithms for signal generation, take a look to musicdsp.org, that's probably one of the best resource for that
Are you new to audio programming in general? As a starting point i would check out
http://www.audiosynth.com/sinewavedemo.html
This is a minimum osx sinewave implementation by the legendary James Harkins. Note, it doesn't use CoreAudio at all.
If you specifically want to use CoreAudio for your sinewave you need to create an output unit (RemoteIO on the iphone, AUHAL on osx) and supply an input callback, where you can pretty much use the code from the above example. Check out
http://developer.apple.com/mac/library/technotes/tn2002/tn2091.html
The benefits of CoreAudio are chiefly, chain other effects with your sinewave, write plugins for hosts like Logic & provide the interfaces for them, write a host (like Logic) for plugins that can be chained together.
If you don't wont to write a plugin, or host plugins then CoreAudio might not actually be for you. But one of the best things about using CoreAudio is that once you get your sinewave callback working it is easy to add effects, or mix multiple sines together
To do this you need to put your output unit in a graph, to which you can effects, mixers, etc.
Here is some help on setting up graphs http://timbolstad.com/2010/03/16/core-audio-getting-started-pt2/
It isn't as difficult as it looks. Apple provides C++ helper classes for many things (/Developer/Examples/CoreAudio/PublicUtility) and even if you don't want to use C++ (you don't have to!) they can be a useful guide to the CoreAudio API.
If you are not doing this realtime, using the sin() function from math.h is not a bad idea. Just fill however many samples you need with sin() beforehand when it is time to play it, just send it to the audio buffer. sin() can be quite slow to call once every sample if you are doing this realtime, using an interpolated wavetable lookup method is much faster, but the resulting sound will not be as spectrally pure.
There is a good and well documented sine wave player code example in Chapter 7 of the Adamson/Avila "Learning Core Audio" book, published by Addison-Wesley Professional (ISBN-10: 0-321-63684-8 ):
http://www.informit.com/store/learning-core-audio-a-hands-on-guide-to-audio-programming-9780321636843
It is a rather new publication (2012) and addresses precisely the issue of this question. It's only a starting point, but it's a valuable starting point.
BTW. Don't jump to graphs before having this basic lesson (which involves some math) behind.
Concerning example code, a quick and efficient method I often use deals with a pre-filled sinewave lookup table which has as many members as sample rate, for 44100 Hz the table has size of 44100. In other words, cycle length equals sample rate. This gives an acceptable trade-off between speed and quality in many cases. You can initialize it with the program.
If you generate floating point samples (which is default in OSX), and use math functions, use sinf() rather than (float)sin(). Promotions in inner loop cycles of a render callback are always resource-expensive. So are repetitive multiplications of constants, such as 2.0*M_PI, which can too often be found in code examples.