NAudio SineWaveProvider32 gives clicks when changing Amplitude

I am using NAudio with the SineWaveProvider32 code taken directly from http://mark-dot-net.blogspot.com/2009/10/playback-of-sine-wave-in-naudio.html to generate
sine wave tones. The relevant code in the SineWaveProvider32 class:
public override int Read(float[] buffer, int offset, int sampleCount)
{
    int sampleRate = WaveFormat.SampleRate;
    for (int n = 0; n < sampleCount; n++)
    {
        buffer[n + offset] =
            (float)(Amplitude * Math.Sin((2 * Math.PI * sample * Frequency) / sampleRate));
        sample++;
        if (sample >= sampleRate) sample = 0;
    }
    return sampleCount;
}
I was getting clicks/beats every second, so I changed
if (sample >= sampleRate) sample = 0;
to
if (sample >= (int)(sampleRate / Frequency)) sample = 0;
This fixed the clicks every second (so that "sample" was always relative to a zero-crossing, not the sample rate).
However, whenever I set the Amplitude variable, I get a click. I tried setting it only when the buffer[] was at a zero-crossing,
thinking that a sudden jump in amplitude might be causing the problem. That did not solve the problem. I am setting the Amplitude to values between
0.0 and 0.25.
I tried adjusting the latency and number of buffers as suggested in NAudio change volume in runtime, but that
had no effect either.
My code that changes the Amplitude:
public async void play(int durationMS, float amplitude = .25f)
{
    PitchPlayer pPlayer = new PitchPlayer(this.frequency, amplitude);
    pPlayer.play();
    await Task.Delay(durationMS / 2);
    pPlayer.provider.Amplitude = .15f;
    await Task.Delay(durationMS / 2);
    pPlayer.stop();
}

The clicks are caused by a discontinuity in the waveform. This is hard to fix in a class like this, because ideally you would slowly ramp the volume from one value to the other. This can be done by modifying the code to have a target amplitude; if the current amplitude is not equal to the target amplitude, you move towards it by a small delta amount calculated each time through the loop, so that over a period of, say, 10 ms you move from the old amplitude to the new one. But you'd need to write this yourself, unfortunately.
For a similar concept where the frequency is being changed gradually rather than the amplitude, take a look at my blog post on portamento in NAudio.
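To make the ramping idea concrete, here is a minimal sketch of it in Kotlin (plain sample generation rather than NAudio's ISampleProvider; names such as targetAmplitude, rampSeconds and read are assumptions for illustration, not NAudio API):

import kotlin.math.PI
import kotlin.math.sin

// Sine generator that moves the amplitude towards a target in small per-sample steps
// (a full-scale change is spread over roughly rampSeconds), so amplitude changes do not
// create a step, and therefore a click, in the waveform.
class RampingSineGenerator(
    private val sampleRate: Int = 44100,
    var frequency: Double = 440.0,
    rampSeconds: Double = 0.010
) {
    @Volatile var targetAmplitude: Double = 0.25            // set this from the control/UI thread
    private var currentAmplitude = 0.25
    private val maxStep = 1.0 / (rampSeconds * sampleRate)  // largest amplitude change per sample
    private var phase = 0.0

    fun read(buffer: FloatArray, offset: Int, sampleCount: Int): Int {
        val phaseIncrement = 2.0 * PI * frequency / sampleRate
        for (n in 0 until sampleCount) {
            // Move currentAmplitude a small step towards targetAmplitude on every sample.
            val diff = targetAmplitude - currentAmplitude
            currentAmplitude += diff.coerceIn(-maxStep, maxStep)
            buffer[offset + n] = (currentAmplitude * sin(phase)).toFloat()
            phase += phaseIncrement
            if (phase > 2.0 * PI) phase -= 2.0 * PI
        }
        return sampleCount
    }
}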

Angular speed
Instead of frequency, it is easier to think in terms of angular speed: how much to increase the angular argument of the sin() function for each sample.
When using radians for the angle, one period completing a full circle is 2*pi, so the angular velocity is (2*pi)/T = 2*pi*f; for a one Hz tone that is 1*2*pi [rad/s].
The sample rate is in [samples per second] and the angular velocity is in [radians per second], so to get the angle per sample you simply divide the angular velocity by the sample rate: [radians/second] / [samples/second] = [radians/sample].
That is the amount by which to continuously increase the angle of the sin() function for each sample - no per-sample multiplication is needed.
To sweep from one frequency to another you simply move from one angular increment to another in small steps over a number of samples.
By sweeping between frequencies, the samples form a continuous chain and the transient is spread out smoothly over time.
Moving from one amplitude to another can likewise be spread out over multiple samples to avoid sharp transients.
Fading in and fading out, incrementally adjusting the amplitude at the start and end of a sound, is more graceful than stepping the output from one level to another in a single sample.
Sharp steps produce rings on the water that propagate out into the world.
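A small sketch of that phase-increment approach in Kotlin (function and parameter names are assumed; both frequency and amplitude are swept linearly over the duration):

import kotlin.math.PI
import kotlin.math.sin

// Phase-accumulator oscillator: the angle is accumulated per sample rather than recomputed
// from the sample index, so frequency and amplitude can be swept without discontinuities.
fun sweepingSine(
    sampleRate: Int,
    startHz: Double, endHz: Double,
    startAmp: Double, endAmp: Double,
    durationSeconds: Double
): DoubleArray {
    val n = (durationSeconds * sampleRate).toInt()
    val out = DoubleArray(n)
    var angle = 0.0
    for (i in 0 until n) {
        val t = i.toDouble() / (n - 1).coerceAtLeast(1)    // 0..1 progress through the sweep
        val hz = startHz + (endHz - startHz) * t           // current frequency
        val amp = startAmp + (endAmp - startAmp) * t       // current amplitude
        val increment = 2.0 * PI * hz / sampleRate         // radians per sample
        out[i] = amp * sin(angle)
        angle += increment                                 // accumulate the angle
        if (angle > 2.0 * PI) angle -= 2.0 * PI
    }
    return out
}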
About sin() calculations
For speedy calculations it may be better to rotate a vector with the length of the amplitude, and to calculate sn = sin(delta), cs = cos(delta) only when the angular speed changes:
Wikipedia Link to theory
where amplitude^2 = x^2 + y^2, each new sample can be calculated as:
px = x * cs - y * sn;
py = x * sn + y * cs;
To increase the amplitude you simply multiply px and py by a factor, say 1.01. To make the next sample you set x = px, y = py and run the px, py calculation again, with cs and sn staying the same the whole time.
Either px or py can be used as the signal output; the two are 90 degrees out of phase.
On the first sample you can set x=amplitude and y=0.
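A sketch of that rotating-vector recurrence in Kotlin (sin()/cos() are evaluated only once for a fixed angular speed; names are assumed):

import kotlin.math.PI
import kotlin.math.cos
import kotlin.math.sin

// Generate a sine by rotating the vector (x, y) by a fixed angle delta each sample.
fun rotatedSine(sampleRate: Int, frequencyHz: Double, amplitude: Double, count: Int): DoubleArray {
    val delta = 2.0 * PI * frequencyHz / sampleRate  // radians per sample
    val sn = sin(delta)
    val cs = cos(delta)
    var x = amplitude   // first sample: x = amplitude, y = 0
    var y = 0.0
    val out = DoubleArray(count)
    for (i in 0 until count) {
        out[i] = y                     // y is the sine output; x is 90 degrees out of phase
        val px = x * cs - y * sn
        val py = x * sn + y * cs
        x = px
        y = py
        // To change the amplitude gradually, multiply x and y by a factor close to 1 here.
    }
    return out
}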

Related

Getting frequency and amplitude from an audio file using FFT - so close but missing some vital insights, eli5?

tl;dr: I've got two audio recordings of the same song without timestamps, and I'd like to align them. I believe FFT is the way to go, but while I've come a long way, it feels like I'm right on the edge of understanding enough to make it work, and I would greatly benefit from "you got this part wrong" advice on FFT. (My education never got into this area.) So I came here seeking ELI5 help.
The journey:
1. Get two recordings at the same sample rate. (done!)
2. Transform them into a waveform (DoubleArray). This doesn't keep any of the meta info like "samples/second", but the FFT math doesn't care until later.
3. Run an FFT on them using a simplified implementation for beginners.
4. Get an Array<Frame>, where each Frame contains an Array<Bin>, and each Bin has (amplitude, frequency), because the older implementation hid all the details (like frame width, and number of Bins, and ... stuff?) and outputs words I'm familiar with like "amplitude" and "frequency".
5. Try moving to a more robust FFT (Apache Commons).
6. Get an output of 'real' and 'imaginary' (uh oh).
7. Make the totally incorrect assumption that those were the same thing (amplitude and frequency). Surprise, they aren't!
8. Apache's FFT returns an Array<Complex>, which means it... er... is just one frame's worth? And I should be chopping the song into 1 second chunks, passing each one into the FFT, and calling it multiple times? That seems strange; how does it get lower frequencies?
To the best of my understanding, the complex number is a way to convey the phase shift and amplitude in one neat container (and you need phase shift if you want to do the FFT in reverse). And the frequency is calculated from the index of the array.
Which works out to (pseudocode in Kotlin)
val audioFile = File("Dream_On.pcm")
val (phases, frequencies, amplitudes) = AudioInputStream(
    audioFile.inputStream(),
    AudioFormat(
        /* encoding = */ AudioFormat.Encoding.PCM_SIGNED,
        /* sampleRate = */ 44100f,
        /* sampleSizeInBits = */ 16,
        /* channels = */ 2,
        /* frameSize = */ 4,
        /* frameRate = */ 44100f,
        /* bigEndian = */ false
    ),
    (audioFile.length() / /* frameSize */ 4)
).use { ais ->
    val bytes = ais.readAllBytes()
    val shorts = ShortArray(bytes.size / 2)
    ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().get(shorts)
    val allWaveform = DoubleArray(shorts.size)
    for (i in shorts.indices) {
        allWaveform[i] = shorts[i].toDouble()
    }
    val halfwayThroughSong = allWaveform.size / 2
    val moreThanOneSecond = allWaveform.copyOfRange(halfwayThroughSong, halfwayThroughSong + findNextPowerOf2(44100))
    val fft = FastFourierTransformer(DftNormalization.STANDARD)
    val fftResult: Array<Complex> = fft.transform(moreThanOneSecond, TransformType.FORWARD)
    println("fftResult size: ${fftResult.size}")
    val phases = DoubleArray(fftResult.size / 2)
    val amplitudes = DoubleArray(fftResult.size / 2)
    val frequencies = DoubleArray(fftResult.size / 2)
    fftResult.filterIndexed { index, _ -> index < fftResult.size / 2 }.forEachIndexed { idx, complex ->
        phases[idx] = atan2(complex.imaginary, complex.real)
        frequencies[idx] = idx * 44100.0 / fftResult.size
        amplitudes[idx] = hypot(complex.real, complex.imaginary)
    }
    Triple(phases, frequencies, amplitudes)
}
Is my step #8 at all close to the truth? Why would the FFT result return an array as big as my input number of samples? That makes me think I've got the "window" or "frame" part wrong.
I read up on
FFT real/imaginary/abs parts interpretation
Converting Real and Imaginary FFT output to Frequency and Amplitude
Java - Finding frequency and amplitude of audio signal using FFT
An audio recording in waveform is a series of sound energy levels, basically how much sound energy there should be at any one instant. Given the sample rate, you can think of the whole recording as a graph of energy versus time.
Sound is made of waves, which have frequencies and amplitudes. Unless your recording is of a pure sine wave, it will have many different waves of sound coming and going, which summed together create the total sound that you experience over time. At any one instant of time, you have energy from many different waves added together. Some of those waves may be at their peaks, and some at their valleys, or anywhere in between.
An FFT is a way to convert energy-vs.-time data into amplitude-vs.-frequency data. The input to an FFT is a block of waveform. You can't just give it a single energy level from a one-dimensional point in time, because then there is no way to determine all the waves that add together to make up the amplitude at that point of time. So, you give it a series of amplitudes over some finite period of time.
The FFT then does its math and returns a range of complex numbers that represent the waves of sound over that chunk of time, that when added together would create the series of energy levels over that block of time. That's why the return value is an array. It represents a bunch of frequency ranges. Together the total data of the array represents the same energy from the input array.
You can calculate from the complex numbers both phase shift and amplitude for each frequency range represented in the return array.
Ultimately, I don't see why performing an FFT would get you any closer to syncing your recordings. Admittedly it's not a task I've tried before, but I would think waveform data is already the perfect form for comparing the data and finding matching patterns. If you break your songs up into chunks to perform FFTs on, then you can try to find matching FFTs, but they will only match perfectly if your chunks are divided exactly along the same division points relative to the beginning of the original recording. And even if you could guarantee that and found matching FFTs, you will only have as much precision as the size of your chunks.
But when I think of apps like Shazam, I realize they must be doing some sort of manipulation of the audio that breaks it down into something simpler for rapid comparison. That possibly involves some FFT manipulation and filtering.
Maybe you could compare FFTs using some algorithm to just find ones that are pretty similar to narrow down to a time range and then compare wave form data in that range to find the exact point of synchronization.
I would imagine the approach that would work well is to find the offset with the maximum cross-correlation between the two recordings. This means calculating the cross-correlation between the two pieces at various offsets; you would expect the maximum cross-correlation to occur at the offset where the two pieces are best aligned.
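As a rough illustration of that idea, a brute-force cross-correlation in Kotlin might look like the sketch below (names assumed; for full-length songs you would want an FFT-based correlation or heavy downsampling, since this is O(n*m)):

// Brute-force search for the lag (in samples) at which recording b best lines up with recording a.
fun bestOffset(a: DoubleArray, b: DoubleArray, maxLagSamples: Int): Int {
    var bestLag = 0
    var bestScore = Double.NEGATIVE_INFINITY
    for (lag in -maxLagSamples..maxLagSamples) {
        var sum = 0.0
        var overlap = 0
        for (i in a.indices) {
            val j = i + lag
            if (j in b.indices) {
                sum += a[i] * b[j]   // cross-correlation term a[i] * b[i + lag]
                overlap++
            }
        }
        if (overlap > 0) {
            val score = sum / overlap   // average so different overlap lengths compare fairly
            if (score > bestScore) {
                bestScore = score
                bestLag = lag
            }
        }
    }
    return bestLag   // a[i] lines up best with b[i + bestLag]; divide by the sample rate for seconds
}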

STM32 Gyroscope angle tracking

I'm working with a gyroscope (L3GD20) with a 2000 DPS full-scale range.
Correct me if there is a mistake.
I start by reading the high and low register values for the 3 axes and concatenating them. Then I multiply every value by 0.07 to convert it into DPS.
My main goal is to track the angle over time, so I simply implemented a timer which reads the data every dt = 10 ms
to integrate ValueInDPS * 10 ms. Here is the code line I'm using:
angleX += (resultGyroX)*dt*0.001; //0.001 to get dt in [seconds]
This should give us the value of the angle in [degrees], am I right?
The problem is that the values I'm getting are a little bit weird, for example when I make a rotation of 90°, I get something like 70°...
Your method is a recipe for imprecision and accumulated error.
You should avoid using floating point (especially if there is no FPU), and especially if this code is in the timer interrupt handler.
You should also avoid unnecessarily converting to degrees/sec on every sample - that conversion is needed only for presentation, so perform it only when you need the value; internally, the integrator should work in raw gyro sample units.
Additionally, if you are doing floating point in both an ISR and in a normal thread and you have an FPU, you may also encounter unrelated errors, because FPU registers are not preserved and restored in an interrupt handler. All in all, floating point should only be used advisedly.
So let us assume you have a function gyroIntegrate() called precisely every 10ms:
static int32_t ax = 0 ;
static int32_t ay = 0 ;
static int32_t az = 0 ;

void gyroIntegrate( int32_t sample_x, int32_t sample_y, int32_t sample_z )
{
    ax += sample_x ;
    ay += sample_y ;
    az += sample_z ;
}
Note: ax etc. are the integration of the raw sample values and so are proportional to the angle relative to the starting position.
To convert ax to degrees:
degrees = ax × r / s
Where:
r is the gyro resolution in degrees per second per digit (0.07)
s is the sample rate in samples per second (100).
Now you would do well to avoid floating point, and here it is entirely unnecessary: s / r is a constant (1428.571 in this case), so you simply divide the integrator value by it. To read the current angle represented by the integrator, you might have a function:
#define GYRO_SIGMA_TO_DEGREESx10 14286

void getAngleXYZ( int32_t* xdeg, int32_t* ydeg, int32_t* zdeg )
{
    *xdeg = (ax * 10) / GYRO_SIGMA_TO_DEGREESx10 ;
    *ydeg = (ay * 10) / GYRO_SIGMA_TO_DEGREESx10 ;
    *zdeg = (az * 10) / GYRO_SIGMA_TO_DEGREESx10 ;
}
getAngleXYZ() should be called from the application layer when you need a result - not from the integrator - you do the math at the point of need and have CPU cycles left to do more useful stuff.
Note that in the above I have ignored the possibility of arithmetic overflow of the integrator. As it is, it is good for approximately +/-1.5 million degrees (+/-4175 rotations), so it may not be a problem in some applications. You could use an int64_t, or, if you are not interested in the number of rotations but just the absolute angle, then in the integrator:
    ax += sample_x ;
    ax %= GYRO_SIGMA_360 ;
Where GYRO_SIGMA_360 equals 514286 (360 x s / r).
Unfortunately, MEMS sensor math is quite complicated.
I would personally use the ready-made libraries provided by ST: https://www.st.com/en/embedded-software/x-cube-mems1.html.
I actually use them, and the results are very good.

How to simulate Mouse Acceleration?

I've written an iPhone/Mac, client/server app that allows the mouse to be used via the touchpad.
Right now, on every packet sent I move the cursor by a specific amount of pixels (currently 10 px).
It isn't accurate. When I change the sensitivity to 1 px, it's too slow.
I am wondering how to enhance usability and simulate mouse acceleration.
Any ideas?
I suggest the following procedure:
ON THE IPHONE:
1. Determine the distance moved in the x and y directions; let's name these dx and dy.
2. Calculate the total distance this corresponds to: dr = sqrt(dx^2+dy^2).
3. Determine how much time has passed, and calculate the speed of the movement: v = dr/dt.
4. Perform some non-linear transform on the velocity, e.g. v_new = a * v + b * v^2 (start with a=1 and b=0 for no acceleration, and then experiment for optimal values).
5. Calculate a new distance: dr_new = v_new * dt.
6. Calculate new distances in the x/y directions: dx_new = dx * dr_new / dr and dy_new = dy * dr_new / dr.
7. Send dx_new and dy_new to the Mac.
ON THE MAC:
Move the mouse by dx_new and dy_new pixels in x/y direction.
NOTE: This might jitter a lot; you can try averaging the velocity after step (3) with the previous two or three measured velocities if it jitters too much.
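A compact Kotlin sketch of steps 1 to 6 (illustrative only; a = 1.0 and b = 0.001 are just example tuning values, and the velocity smoothing from the note above is left out):

import kotlin.math.hypot

// Apply a non-linear "acceleration" curve to a raw touchpad delta, following the steps above.
fun accelerate(dx: Double, dy: Double, dtSeconds: Double, a: Double = 1.0, b: Double = 0.001): Pair<Double, Double> {
    val dr = hypot(dx, dy)                    // total distance moved
    if (dr == 0.0 || dtSeconds <= 0.0) return 0.0 to 0.0
    val v = dr / dtSeconds                    // speed of the movement
    val vNew = a * v + b * v * v              // non-linear transform (tune a and b)
    val drNew = vNew * dtSeconds              // new distance
    val scale = drNew / dr
    return dx * scale to dy * scale           // scaled deltas to send to the Mac
}

// Example: a slow 5 px move over 50 ms grows only ~10%, while a fast 50 px move roughly doubles.
fun main() {
    println(accelerate(5.0, 0.0, 0.050))
    println(accelerate(50.0, 0.0, 0.050))
}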

Animation independent of frame rate

To make animation independent of frame rate, is it necessary to multiply the delta value by both velocity and acceleration?
// Multiply both acceleration and velocity by delta?
vVelocity.x += vAcceleration.x * delta;
vVelocity.y += vAcceleration.y * delta;
position.x += vVelocity.x * delta;
position.y += vVelocity.y * delta;
Should I apply delta to the velocity only and not acceleration?
Assuming your "delta" is the amount of time passed since last update:
Short answer: yes.
Long answer:
One way to check this sort of thing is to see if the units work out. It's not guaranteed, but usually if your units work out, then you've figured things correctly.
Velocity measures distance per unit time, and delta is time. So velocity times delta is (picking arbitrary units meters and seconds) (m/s) * s = m. So you can see that velocity times delta does create a distance, so that appears reasonable for position.
Acceleration measures velocity per unit time, that is, with the same units (m/s)/s. So, acceleration times delta is ((m/s)/s) * s = m/s. Looks like a velocity to me. We're good!
Yes, it is necessary to involve delta with both the velocity and the acceleration. They're both properties that are defined with respect to time (m/s for one, m/s/s for the other - units may vary), so delta should be used whenever they have to change non-instantaneously.
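A tiny Kotlin sketch (names assumed) showing the effect: with delta applied to both acceleration and velocity, one simulated second at 60 fps and at 120 fps ends up in nearly the same place, differing only by ordinary Euler step-size error:

// Integrate constant acceleration for one simulated second at two different frame rates.
fun simulate(dt: Double, steps: Int, ax: Double): Double {
    var vx = 0.0
    var x = 0.0
    repeat(steps) {
        vx += ax * dt     // delta applied to acceleration
        x += vx * dt      // delta applied to velocity
    }
    return x
}

fun main() {
    // One second at 60 fps vs 120 fps, constant acceleration of 10 units/s^2.
    println(simulate(1.0 / 60.0, 60, 10.0))    // ~5.08
    println(simulate(1.0 / 120.0, 120, 10.0))  // ~5.04
}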

gravity simulation

I want to simulate a free fall and a collision with the ground (for example a bouncing ball). The object will fall in a vacuum, so air resistance can be omitted. A collision with the ground should cause some energy loss, so eventually the object will stop moving. I use JOGL to render a point, which is my falling object. Gravity is constant (-9.8 m/s^2).
I found the Euler method to calculate the new position of the point:
deltaTime = currentTime - previousTime;
vel += acc * deltaTime;
pos += vel * deltaTime;
but I'm doing something wrong. The point bounces a few times and then it keeps moving down (very slowly).
Here is the pseudocode (initial pos = (0.0f, 2.0f, 0.0f), initial vel = (0.0f, 0.0f, 0.0f), gravity = -9.8f):
display()
{
    calculateDeltaTime();
    velocity.y += gravity * deltaTime;
    pos.y += velocity.y * deltaTime;
    if (pos.y < -2.0f) // a collision with the ground
    {
        velocity.y = velocity.y * energyLoss * -1.0f;
    }
}
What is the best way to achieve a realistic effect? How does the Euler method relate to the constant acceleration equations?
Because floating-point numbers don't round up nicely, you'll never get a velocity that's exactly 0; you'd probably get something like -0.00000000000001 or so.
You need to make it 0.0 when it's close enough (define some delta).
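For example, a quick Kotlin sketch of that clamping (REST_EPSILON is an assumed tuning value, and snapping the position back up to ground level is my own added assumption, not something spelled out above; the -2.0f ground level matches the question's collision test):

import kotlin.math.abs

// Snap tiny velocities to zero and keep the point on the ground so the simulation comes to rest.
const val REST_EPSILON = 0.01f

data class State(var posY: Float, var velocityY: Float)

fun settle(s: State) {
    if (s.posY < -2.0f) {
        s.posY = -2.0f                        // don't let the point sink below the ground
        if (abs(s.velocityY) < REST_EPSILON) {
            s.velocityY = 0.0f                // close enough to zero: treat it as stopped
        }
    }
}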
To expand upon my comment above, and to answer Tobias, I'll add a complete answer here.
Upon initial inspection, I determined that you were bleeding off velocity too fast. Simply put, the relationship between kinetic energy and velocity is E = m v^2 / 2, so after taking the derivative with respect to velocity you get
delta_E = m v delta_v
Then, depending on how energyloss is defined, you can establish the relationship between delta_E and energyloss. For instance, in most cases energyloss = delta_E / E_initial, and then the above relationship can be simplified to
delta_v = energyloss*v_initial / 2
This assumes that the time interval is small, allowing you to replace v in the first equation with v_initial, so you should be able to get away with it for what you're doing. To be clear, delta_v is subtracted from velocity.y inside your collision block instead of what you have.
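A small Kotlin sketch of that collision update under the stated small-time-interval assumption (names assumed; energyLoss is the fraction of kinetic energy lost per bounce, e.g. 0.2 for 20%):

import kotlin.math.abs

// Reflect the velocity and reduce it by delta_v = energyloss * v_initial / 2, as derived above.
fun bounce(velocityY: Float, energyLoss: Float): Float {
    val vInitial = abs(velocityY)               // speed just before impact
    val deltaV = energyLoss * vInitial / 2.0f   // speed removed by the bounce (first-order)
    return vInitial - deltaV                    // reflected upwards with reduced speed
}

// Inside the collision block: velocity.y = bounce(velocity.y, energyLoss)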
As to the question of adding air resistance or not, the answer is: it depends. For small initial drop heights it won't matter, but it can start to matter with smaller energy losses due to bounce and higher drop points. For a 1 gram, 1 inch (2.54 cm) diameter, smooth sphere, I plotted the time difference between with and without air friction vs. drop height.
For low energy loss materials (80 - 90+ % energy retained), I'd consider adding it in for 10 meter, and higher, drop heights. But, if the drops are under 2 - 3 meters, I wouldn't bother.
If anyone wants the calculations, I'll share them.