Custom EQ AudioUnit on iOS - objective-c

The only effect AudioUnit on iOS is the "iTunes EQ", which only lets you use EQ pre-sets. I would like to use a customized eq in my audio graph
I came across this question on the subject and saw an answer suggesting using this DSP code in the render callback. This looks promising and people seem to be using this effectively on various platforms. However, my implementation has a ton of noise even with a flat eq.
Here's my 20 line integration into the "MixerHostAudio" class of Apple's "MixerHost" example application (all in one commit):
https://github.com/tassock/mixerhost/commit/4b8b87028bfffe352ed67609f747858059a3e89b
Any ideas on how I could get this working? Any other strategies for integrating an EQ?
Edit: Here's an example of the distortion I'm experiencing (with the eq flat):
http://www.youtube.com/watch?v=W_6JaNUvUjA

In the code in EQ3Band.c, the filter coefficients are used without being initialized. The init_3band_state method initialize just the gains and frequencies, but the coefficients themselves - es->f1p0 etc. are not initialized, and therefore contain some garbage values. That might be the reason for the bad output.

This code seems wrong in more then one way.
A digital filter is normally represented by the filter coefficients, which are constant, the filter inner state history (since in most cases the output depends on history) and the filter topology, which is the arithmetic used to calculate the output given the input and the filter (coeffs + state history). In most cases, and of course when filtering audio data, you expect to get 0's at the output if you feed 0's to the input.
The problems in the code you linked to:
The filter coefficients are changed in each call to the processing method:
es->f1p0 += (es->lf * (sample - es->f1p0)) + vsa;
The input sample is usually multiplied by the filter coefficients, not added to them. It doesn't make any physical sense - the sample and the filter coeffs don't even have the same physical units.
If you feed in 0's, you do not get 0's at the output, just some values which do not make any sense.
I suggest you look for another code - the other option is debugging it, and it would be harder.
In addition, you'd benefit from reading about digital filters:
http://en.wikipedia.org/wiki/Digital_filter
https://ccrma.stanford.edu/~jos/filters/

Related

How to access net displacements in pyiron

Using pyiron, I want to calculate the mean square displacement of the ions in my system. How do I see the total displacement (i.e. not folded back by periodic boundary conditions) without dumping very frequently and checking when an atom passes over the boundary and gets wrapped?
Try to compare job['output/generic/unwrapped_positions'][-1] and job.structure.positions+job.output.total_displacements[-1]. If they deliver the same values, it's definitely fine both ways. If not, you can post the relevant lines in your notebook here.
I'd like to add a few comments to Jan's answer:
While job['output/generic/unwrapped_positions'] returns the unwrapped positions parsed from the output files, job.output.total_displacements returns the displacement of atoms calculated from each pair of consecutive snapshots. So if an atom moves more than half the box length in any direction, job.output.total_displacements will give wrong coordinates. Therefore, job['output/generic/unwrapped_positions'] is generally more trustworthy, but it is not available in all the codes (since some codes simply do not provide an output for unwrapped positions).
Moreover, if an interactive job is used, it is possible that job.structure.positions does not return the initial positions, i.e. job.structure.positions+job.output.total_displacements won't be initial positions + displacements.
So, in short, my answer to your question would be rather "Use job['output/generic/unwrapped_positions'] and if it's not available, use job.structure.positions+job.output.total_displacements but be aware of potential problems you might be running into."

How to add a new syntax element in HM (HEVC test Model)

I've been working on the HM reference software for a while, to improve something in the intra prediction part. Now a new intra prediction algorithm is added to the code and I let the encoder choose between my algorithm and the default algorithm of HM (according to the RDCost of course).
What I need now, is to signal a flag for each PU, so that the decoder will be able to perform the same algorithm as the encoder decides in the rate distortion loop.
I want to know what exactly should I do to properly add this one bit flag to the stream, without breaking anything in the code.
Assuming that I want to use a CABAC context model to keep the track of my flag's statistics, what else should I do:
adding a new context model like ContextModel3DBuffer m_cCUIntraAlgorithmSCModel to the TEncSbac.h file.
properly initializing the model (both at encoder and decoder side) by looking at how the HM initialezes other context models.
calling the function m_pcBinIf->encodeBin(myFlag, cCUIntraAlgorithmSCModel) and m_pcTDecBinIfdecodeBin(myFlag, cCUIntraAlgorithmSCModel) at the encoder side and decoder side, respectively.
I take these three steps but apparently it breaks something.
PS: Even an equiprobable signaling (i.e. without using CABAC contexts) will be useful. I just want to send this flag peacefully!
Thanks in advance.
I could solve this problem finally. It was a bug in the CABAC context initialization.
But I want to share this experience as many people may want to do the same thing.
The three steps that I explained are essentially necessary to add a new syntax element, but one might be very careful with the followings:
In the beginning, you need to decide either you want to use a separate context model for your syntax element? Or you want to use an existing one? In case of CABAC separation, you should define a ContextModel3DBuffer and the best way to do that is: finding a similar syntax element in the code; then duplicating its ``ContextModel3DBuffer'' definition and ALL of its occurences in the code. This way assures that you are considering everything.
Encoding of each syntax elements happens in two different places: first, in the RDO loop to make a "decision", and second, during the actual encoding phase and when the decisions are being encoded (e.g. encodeCtu function).
The order of encoding/decoding syntaxt elements should be the same at the encoder/decoder sides. For example if your new syntax element is encoded after splitFlag and before predMode at the encoder side, you should decode it exactly between splitFlag and predMode at the decoder side.
The context model is implemented as a 3D matrix in order to let track the statistics of syntaxt elements separately for different block sizes, componenets etc. This means that when you want to call the function encodeBin, you may make sure that a correct index is being used. I've made stupid mistakes in this part!
Apart from the above remarks, I found a the function getState very useful for debugging. This function returns the state of your CABAC context model in an arbitrary place of the code when you have access to it. It is very useful to compare the state at the same place of the encoder and the decoder when you have a mismatch. For example, it happens a lot that you encode a 1 but you decode a 0. In this case, you need to check the state of your CABAC context before encoding and decoding. They should be the same. If they are not the same, track back the error to find the first place of mismatch.
I hope it was helpful.

GNU Radio block with variable number of intputs/outputs

I am currently trying to do the signal processing of multiple channels in parallel using a custom source block. Up to now I created an OOT-source block which streams data for only one channel into one output perfectly fine.
Now I am searching for a way to expand this module so that it can support a higher number of channels ( = outputs of the source block; up to 64) in parallel. Because the protocol used to pull the samples pulls them all at one it is not possible to use more instances of the same source block.
Things I have found so far:
A pdf which seems to explain exactly what to do already but it seems that it is outdated and that this functionality is no longer supported under GNU Radio.
The description of a feature which should be implemented in the future.
Is there are known solution or workaround for this problem?
Look at the add block: It has configurable many inputs!
Now, the trick here is threefold:
define an io_signature as input and output that allows for adjustable numbers. If you use gr_modtool add to create a new block, your io_signatures will be filled with <+MIN_IN+>, <+MAX_IN+>, <+MIN_OUT+> and <+MAX_OUT+>. Adjust these to reflect your actual minimum and maximum in- and output port numbers. If you want to have 1 to infinity inputs, use 1, -1.
in your (general_)work method, check for the number of inputs by doing something like ninputs = input_items.size(), and for the number of outputs by doing noutputs = output_items.size().
(optionally, if you want to use GRC) modify the <sink>/<source> definitions in your block GRC XML:
<sink>
<name>in</name>
<type>complex</type>
<nports>$num_inputs</nports>
</sink>
num_inputs could be a block parameter; compare the add_XX block source code.

How can I compare two NSImages for differences?

I'm attempting to gauge the percentage difference between two images.
Having done a lot of reading I seem to have a number of options but I'm not sure what the best method to follow for:
Ease of coding
Performance.
The methods I've seen are:
Non language specific - academic Image comparison - fast algorithm and Mac specific direct pixel access http://www.markj.net/iphone-uiimage-pixel-color/
Does anyone have any advice about what solutions make most sense for the above two cases and have code samples to show how to apply them?
I've had success calculating the difference between two images using the histogram technique mentioned here. redmoskito's answer in the SO question you linked to was actually my inspiration!
The following is an overview of the algorithm I used:
Convert the images to grayscale—compare one channel instead of three.
Divide each image into an n * n grid of "subimages". Then, for subimage pair:
Calculate their colour composition histograms.
Calculate the absolute difference between the two histograms.
The maximum difference found between two subimages is a measure of the two images' difference. Other metrics could also be used (e.g. the average difference betwen subimages).
As tskuzzy noted in his answer, if your ultimate goal is a binary "yes, these two images are (roughly) the same" or "no, they're not", you need some meaningful threshold value. You could produce such a value by passing images into the algorithm and tweaking the threshold based on its output and how similar you think the images are. A form of machine learning, I suppose.
I recently wrote a blog post on this very topic, albeit as part of a larger goal. I also created a simple iPhone app to demonstrate the algorithm. You can find the source on GitHub; perhaps it will help?
It is really difficult to suggest something when you don't tell us more about the images or the variations. Are they shapes? Are they the different objects and you want to know what class of objects? Are they the same object and you want to distinguish the object instance? Are they faces? Are they fingerprints? Are the objects in the same pose? Under the same illumination?
When you say performance, what exactly do you mean? How large are the images? All in all it really depends. With what you've said if it is only ease of coding and performance I would suggest to just find the absolute value of the difference of pixels. That is super easy to code and about as fast as it gets, but really unlikely to work for anything other than the most synthetic examples.
That being said I would like to point you to: DHOG, GLOH, SURF and SIFT.
You can use fairly basic subtraction technique that the lads above suggested. #carlosdc has hit the nail on the head with regard to the type of image this basic technique can be used for. I have attached an example so you can see the results for yourself.
The first shows a image from a simulation at some time t. A second image was subtracted away from the first which was taken some (simulation) time later t + dt. The subtracted image (in black and white for clarity) then shows how the simulation has changed in that time. This was done as described above and is very powerful and easy to code.
Hope this aids you in some way
This is some old nasty FORTRAN, but should give you the basic approach. It is not that difficult at all. Due to the fact that I am doing it on a two colour pallette you would do this operation for R, G and B. That is compute the intensities or values in each cell/pixal, store them in some array. Do the same for the other image, and subtract one array from the other, this will leave you with some coulorfull subtraction image. My advice would be to do as the lads suggest above, compute the magnitude of the sum of the R, G and B componants so you just get one value. Write that to array, do the same for the other image, then subtract. Then create a new range for either R, G or B and map the resulting subtracted array to this, the will enable a much clearer picture as a result.
* =============================================================
SUBROUTINE SUBTRACT(FNAME1,FNAME2,IOS)
* This routine writes a model to files
* =============================================================
* Common :
INCLUDE 'CONST.CMN'
INCLUDE 'IO.CMN'
INCLUDE 'SYNCH.CMN'
INCLUDE 'PGP.CMN'
* Input :
CHARACTER fname1*(sznam),fname2*(sznam)
* Output :
integer IOS
* Variables:
logical glue
character fullname*(szlin)
character dir*(szlin),ftype*(3)
integer i,j,nxy1,nxy2
real si1(2*maxc,2*maxc),si2(2*maxc,2*maxc)
* =================================================================
IOS = 1
nomap=.true.
ftype='map'
dir='./pictures'
! reading first image
if(.not.glue(dir,fname2,ftype,fullname))then
write(*,31) fullname
return
endif
OPEN(unit2,status='old',name=fullname,form='unformatted',err=10,iostat=ios)
read(unit2,err=11)nxy2
read(unit2,err=11)rad,dxy
do i=1,nxy2
do j=1,nxy2
read(unit2,err=11)si2(i,j)
enddo
enddo
CLOSE(unit2)
! reading second image
if(.not.glue(dir,fname1,ftype,fullname))then
write(*,31) fullname
return
endif
OPEN(unit2,status='old',name=fullname,form='unformatted',err=10,iostat=ios)
read(unit2,err=11)nxy1
read(unit2,err=11)rad,dxy
do i=1,nxy1
do j=1,nxy1
read(unit2,err=11)si1(i,j)
enddo
enddo
CLOSE(unit2)
! substracting images
if(nxy1.eq.nxy2)then
nxy=nxy1
do i=1,nxy1
do j=1,nxy1
si(i,j)=si2(i,j)-si1(i,j)
enddo
enddo
else
print *,'SUBSTRACT: Different sizes of image arrays'
IOS=0
return
endif
* normal finishing
IOS=0
nomap=.false.
return
* exceptional finishing
10 write (*,30) fullname
return
11 write (*,32) fullname
return
30 format('Cannot open file ',72A)
31 format('Improper filename ',72A)
32 format('Error reading from file ',72A)
end
! =============================================================
Hope this is of some use. All the best.
Out of the methods described in your first link, the histogram comparison method is by far the simplest to code and the fastest. However key point matching will provide far more accurate results since you want to know a precise number describing the difference between two images.
To implement the histogram method, I would do the following:
Compute the red, green, and blue histograms of each image
Add up the differences between each bucket
If the difference is above a certain threshold, then the percentage is 0%
Otherwise the colors found in the images are similar. So then do a pixel by pixel comparison and convert the difference into a percentage.
I don't know any precise algorithms for finding the key points of an image. However once you find them for each image you can do a pixel by pixel comparison for each of the key points.

Objective C - Cross-correlation for audio delay estimation

I would like to know if anyone knows how to perform a cross-correlation between two audio signals on iOS.
I would like to align the FFT windows that I get at the receiver (I am receiving the signal from the mic) with the ones at the transmitter (which is playing the audio track), i.e. make sure that the first sample of each window (besides a "sync" period) at the transmitter will also be the first window at the receiver.
I injected in every chunk of the transmitted audio a known waveform (in the frequency domain). I want estimate the delay through cross-correlation between the known waveform and the received signal (over several consecutive chunks), but I don't know how to do it.
It looks like there is the method vDSP_convD to do it, but I have no idea how to use it and whether I first have to perform the real FFT of the samples (probably yes, because I have to pass double[]).
void vDSP_convD (
const double __vDSP_signal[],
vDSP_Stride __vDSP_signalStride,
const double __vDSP_filter[],
vDSP_Stride __vDSP_strideFilter,
double __vDSP_result[],
vDSP_Stride __vDSP_strideResult,
vDSP_Length __vDSP_lenResult,
vDSP_Length __vDSP_lenFilter
)
The vDSP_convD() function calculates the convolution of the two input vectors to produce a result vector. It’s unlikely that you want to convolve in the frequency domain, since you are looking for a time-domain result — though you might, if you have FFTs already for some other reason, choose to multiply them together rather than convolving the time-domain sequences (but in that case, to get your result, you will need to perform an inverse DFT to get back to the time domain again).
Assuming, of course, I understand you correctly.
Then once you have the result from vDSP_convD(), you would want to look for the highest value, which will tell you where the signals are most strongly correlated. You might also need to cope with the case where the input signal does not contain sufficient of your reference signal, and in that case you may wish to (for example) ignore values in the result vector below a certain level.
Cross-correlation is the solution, yes. But there are many obstacles you need to handle. If you get samples from the audio files, they contain padding which cross-correlation function does not like. It is also very inefficient to perform correlation with all those samples - it takes a huge amount of time. I have made a sample code which demonstrates time shift of two audio files. If you are interested in the sample, look at my Github Project.