NSSpeechSynthesizer and track duration - objective-c

This time I have a logic question. I hope someone can help me. Using `NSSpeechSynthesizer` you can set the rate, e.g. 235 words per minute, 100 words per minute, and so on.
I found that the words-per-minute average is generally calculated using a standardized word length of 5 characters, counting spaces and symbols too.
I need to automatically subdivide a long text into tracks of a pre-selected duration, let's say 15 minutes per track.
How can we calculate the correct number of characters to pass to the speech engine for each 'split'?
My solution is as follows:
// duration is the number of minutes per track
numberOfWordsPerTrack = [rateSlider floatValue] * duration;
splits = [[NSMutableArray alloc] init];
finished = NO;
NSUInteger position = 0;
while( !finished ) {
    NSRange range;
    // the idea is: I take 5*numberOfWordsPerTrack characters
    // until the text allows me to select them
    range = NSMakeRange( position, 5*numberOfWordsPerTrack );
    if( range.location+range.length > mainTextView.string.length ) {
        // If there are not another full character track we get
        // the tail of the remaining text
        finished = YES;
        range = NSMakeRange( position, mainTextView.string.length-position );
    }
    // Here we get the track and add it to the split list
    if( range.location+range.length <= mainTextView.string.length ) {
        currentSplit = [mainTextView.string substringWithRange:range];
        [splits addObject:currentSplit];
    }
    position += range.length;
}
The problem with this solution is that the track duration is not correct. It is not far from the desired value, but it is not right. For example, using 235 words per minute with a duration of 50 minutes, I get 40 minutes per track. If I set 120 minutes per track, I get 1h 39m per track, and so on.
Where do you think is the logic error?
EDIT AFTER JanX2 REPLY
Well, while thinking it over I came to the following hypothesis. Could you tell me what you think about it before I implement it? It is not a light change in my code.
If I used the speechSynthesizer:willSpeakWord:ofString: delegate method, I could check the .aiff file size frequently, i.e. before speaking the next word (a real word, not a standardized one). Because we know the sample rate, bit depth and channel count the synthesizer uses for those files, and because we know they are not compressed, we could estimate the current length of the track.
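The estimate described above boils down to dividing the number of audio bytes by the byte rate. A minimal C++ sketch of that arithmetic follows; the function name and the headerBytes parameter are hypothetical, and the actual sample rate, bit depth and channel count of the synthesizer's output would have to be checked rather than assumed.

```cpp
#include <cassert>
#include <cstdint>

// Estimated playing time, in seconds, of an uncompressed PCM file.
// headerBytes is whatever the container adds before the samples
// (hypothetical parameter; pass 0 if it is unknown).
double estimateSeconds(std::uint64_t fileBytes, std::uint64_t headerBytes,
                       double sampleRateHz, int bitsPerSample, int channels) {
    assert(sampleRateHz > 0 && bitsPerSample > 0 && channels > 0);
    if (fileBytes <= headerBytes) {
        return 0.0;  // nothing but header so far
    }
    // Bytes consumed by one second of audio.
    double bytesPerSecond = sampleRateHz * (bitsPerSample / 8.0) * channels;
    return static_cast<double>(fileBytes - headerBytes) / bytesPerSecond;
}
```

For example, 2,646,000 bytes of 16-bit mono audio at 22,050 Hz would come out at 60 seconds. The disk-access concern remains: the sketch only covers the arithmetic, not how often the size is polled.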
The biggest drawback of this solution could be the continuous disk access, which can badly degrade performance.
What do you think?

I can only guess, but the heuristic you use will include “silent” characters. Why not try to compensate for the measured error? You appear to have an error that is pretty much linear, so you could factor that into your calculation:
40 / 50 = 80%
99 / 120 = 82.5%
So you have an error of about 17.5-20%. Just multiply the time you calculate above by 0.8 or 0.825 and you are getting closer. This is crude, but you are already using a heuristic.
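One way to read this correction: since the tracks come out at roughly 80% of the target length, the character budget per track can be scaled up by the inverse of that ratio. The C++ sketch below is only an illustration of that reading; the function name and the measuredRatio parameter are hypothetical, and the 5-characters-per-word figure is the question's own heuristic.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Character budget for one track.
// measuredRatio = (measured track duration) / (requested track duration),
// e.g. 40.0 / 50.0 = 0.8 for the first measurement in the question.
std::size_t charsPerTrack(double wordsPerMinute, double minutes, double measuredRatio) {
    assert(wordsPerMinute > 0 && minutes > 0 && measuredRatio > 0);
    const double charsPerWord = 5.0;  // standardized word length from the question
    double nominal = wordsPerMinute * minutes * charsPerWord;
    // Tracks that come out too short need proportionally more text.
    return static_cast<std::size_t>(std::llround(nominal / measuredRatio));
}
```

With a ratio of 1.0 this reproduces the original calculation (235 wpm for 50 minutes gives 58,750 characters); with 0.8 the budget grows by a quarter.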
BTW: You probably should consider using -enumerateSubstringsInRange:options:usingBlock: to achieve sentence granularity instead of arbitrary word splits.
Using “-speechSynthesizer:willSpeakWord:ofString:” causes bigger issues: in my experience it can be out of sync with the position in the file being written by several hundred ms up to several seconds. And speaking up the next word seems to have problems when used with the Nuance voices.

Time and Space Complexity(for specific algorithm)

Despite the last 30 minutes I spent trying to understand time and space complexity better, I still can't confidently determine them for the algorithm below:
bool checkSubstr(std::string sub)
{
    //6 OR(||) connected if statement(checks whether the parameter
    //is among the items in the list)
}
void checkWords(int start,int end)
{
    int wordList[2] ={0};
    int j = 0;
    if (start < 0)
    {
        start = 0;
    }
    if (end>cAmount)
    {
        end = cAmount -1;
    }
    if (end-start < 2)
    {
        return;
    }
    for (int i = start; i <= end-2; i++)
    {
        if (crystals[i] == 'I' || crystals[i] == 'A')
        {
            continue;
        }
        if (checkSubstr(crystals.substr(i,3)))
        {
            wordList[j] = i;
            j++;
        }
    }
    if (j==1)
    {
        crystals.erase(wordList[0],3);
        cAmount -= 3;
        checkWords(wordList[0]-2,wordList[0]+1);
    }
    else if (j==2)
    {
        crystals.erase(wordList[0],(wordList[1]-wordList[0]+3));
        cAmount -= wordList[1]-wordList[0]+3;
        checkWords(wordList[0]-2,wordList[0]+1);
    }
}
The function basically checks a sub-string of the whole string for predetermined 3-letter combinations (e.g. "SAN"). The sub-string length can be 4 to 6; there is no real way to determine it in advance because it depends on the input (I'm pretty sure it's not relevant, although not 100%).
My reasoning:
If there are n letters in the string, worst case scenario, we have to check each of them. Again depending on the input, this can be done 3 ways.
All 6 length sub-strings: If this is the case the function runs n/6 times, each running 8(or 10?) processes, which(i think) means that its time complexity is O(n).
All 4 length sub-strings: Pretty much the same reason above, O(n).
4 and 6 length sub-strings mixed: Can't see why this would be different than previous 2. O(n)
As for the space complexity, i am completely lost. However, i have an idea:
If the function recurses the maximum number of times, it will require:
n/4 x (the amount of memory used in one run)
which made me think it should be O(n), although I'm not convinced this is correct. I thought that seeing someone else's thought process on this example might help me understand how to calculate time and space complexity better.
Thank you for your time.
EDIT: Let me provide clearer information. We read a combination of 6 different letters into a string, this can be (almost)any combination in any length. 'crystals' is the string, and we are looking for 6 different 3 letter combinations in that list of letters. Sort of like a jewel matching game. Now the starting list contains no matches(none of the 6 predetermined combinations exist in the first place). Therefore the only way matches can occur from then on is by swaps or matches disappearing. Once a swap is processed by top level code, the function is called to check for matches, and if a match is found the function recurs after deleting the "match" part of the string.
Now let's look at how the code is looking for a match. To demonstrate a swap of 2 letters:
ABA B-R ZIB(no spaces or '-' in the actual string, used for better demonstration),
B and R are being swapped. This swap only affects the 6 letters starting from the 2nd letter and ending on the 7th letter. In other words, the letters that the first A and the last B can form a match with are the same before and after the swap, so there is no point checking for matches that include those letters. So a sub-string of 6 letters is sent to the checking algorithm. Similarly, if a formed match disappears (gets deleted from the string), the range of affected letters is 4. So when I thought of a worst-case scenario, I imagined either 1 swap creating a whole chain reaction and matching all the way until there are not enough letters to form a match, or each match happening with a swap. Again, I am not saying this is how we should think when calculating time and space complexity, but this is how the code works. I hope this is clear enough; if not, let me know and I can provide more details. It's also important to note that the number of swaps and their positions are part of the input we read.
EDIT: Here is how the function is called on top level for the first time:
checkWords(swaps[i]-2,swaps[i]+3);
"Sub-string length can be 4-6 no real way to determine, depends on the input (pretty sure it's not relevant, although not 100%)."
That's not what the code shows; the line if (checkSubstr(crystals.substr(i,3))) conveys that substrings always have exactly 3 characters. If the substring length varies, it is relevant, since a naive substring match degrades to O(N*M) in the general case, where N is end-start+1 (the size of the range being scanned) and M is the size of the substring being searched for. This happens because in the worst case you compare M characters for each of the N characters of the source string.
The rest of this answer assumes that substrings are of size 3, since that's what the code shows.
If substrings are always 3 characters long, it's different: you can essentially assume checkSubstr() is O(1) because you will always compare at most 3 characters. The bulk of the work happens inside the for loop, which is O(N), where N is end-1-start.
After the loop, in the worst case (when one of the ifs is entered), you erase a bunch of characters from crystals. Assuming this is a string backed by an array in memory, this is an O(cAmount) operation, because all elements after wordList[0] must be shifted. The recursive call always passes in a range of size 4; it neither grows nor shrinks with the size of the input, so you can also say there are O(1) recursive calls.
Thus, time complexity is O(N+cAmount) (where N is end-1-start), and space complexity is O(1).
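To make the "constant work per probe, linear in the window" argument concrete, here is a small C++ sketch that counts how many 3-letter probes a scan performs. The word list and the function names are hypothetical (the question only gives "SAN" as an example), and the sketch leaves out the erase step, which is the part that costs O(cAmount).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical match list; the question does not list its six words.
const std::array<std::string, 6> kWords = {"SAN", "NAS", "BIS", "SIB", "NIB", "BAN"};

// Constant work: at most 6 comparisons of 3 characters each.
bool isMatch(const std::string& s, std::size_t i) {
    for (const std::string& w : kWords) {
        if (s.compare(i, 3, w) == 0) {
            return true;
        }
    }
    return false;
}

// Scans the inclusive window [start, end] and returns the number of probes made.
std::size_t probesInWindow(const std::string& s, std::size_t start, std::size_t end) {
    assert(start <= end && end < s.size());
    std::size_t probes = 0;
    for (std::size_t i = start; i + 2 <= end; ++i) {
        (void)isMatch(s, i);  // result ignored; only the amount of work is counted
        ++probes;
    }
    return probes;
}
```

A 6-letter window costs 4 probes whether the whole string holds ten letters or ten thousand, which is why the scan contributes O(N) in the window size rather than in the total string length.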

Faster calculation for large amounts of data / inner loop

So, I am programming a simple Mandelbrot renderer.
My inner loop (which is executed up to ~100,000,000 times each time I draw on screen) looks like this:
Complex position = {re,im};
Complex z = {0.0, 0.0};
uint32_t it = 0;
for (; it < maxIterations; it++)
{
    //Square z
    double old_re = z.re;
    z.re = z.re*z.re - z.im*z.im;
    z.im = 2*old_re*z.im;
    //Add c
    z.re = z.re+position.re;
    z.im = z.im+position.im;
    //Exit condition (mod(z) > 5)
    if (sqrt(z.re*z.re + z.im*z.im) > 5.0f)
        break;
}
//Color in the pixel according to value of 'it'
Just some very simple calculations. This takes between 0.5 and a couple of seconds, depending on the zoom and so on, but i need it to be much faster, to enable (almost) smooth scrolling.
My question is: What is my best bet to achieve the maximum possible calculation speed?
OpenCL to use the GPU? Coding it in assembly? Dividing the image into small pieces and dispatching the calculation of each piece to another thread? A combination of those?
Any help is appreciated!
I have written a Mandelbrot set renderer several times... and here are the things that you should keep in mind...
The things that take the longest are the ones that never escape and take all the iterations.
a. so you can make a region in the middle out of a few rectangles and check that first.
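Be careful with the bounds: not every starting point with real and imaginary parts between -1 and 1 stays bounded (c = 1, for example, escapes after a few iterations), so keep the rectangles inside the set, e.g. within the main cardioid and the period-2 bulb.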
You can cache points (20 or 30) in a rolling buffer; if a point you just calculated is already in the buffer, you have a cycle and it will never escape.
You can use a more general logic that doesn't require a square root... in that if any part is less than -2 or more than 2 it will race out of control and can be considered escaped.
But you can also break this up because each point is its own thing, so you can make a separate thread or gcd dispatch or whatever for each row or quadrant... it is a very easy problem to divide up and run in parallel.
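Two of the ideas above, skipping points that are known to lie inside the set and testing the squared magnitude instead of taking a square root, can be sketched in C++ as follows. This is an illustration under assumptions rather than the answerer's code: the function names are hypothetical, the interior test uses the standard main-cardioid and period-2-bulb formulas instead of rectangles, and the escape radius is 2 rather than the 5 used in the question. Cycle detection and threading are left out.

```cpp
#include <cassert>
#include <cstdint>

// True if c = (re, im) lies in the main cardioid or the period-2 bulb,
// two regions whose points never escape.
bool inMainBody(double re, double im) {
    double x = re - 0.25;
    double q = x * x + im * im;
    bool cardioid = q * (q + x) <= 0.25 * im * im;
    bool bulb = (re + 1.0) * (re + 1.0) + im * im <= 0.0625;
    return cardioid || bulb;
}

// Number of iterations completed before |z| exceeds 2, capped at maxIterations.
std::uint32_t escapeIterations(double re, double im, std::uint32_t maxIterations) {
    assert(maxIterations > 0);
    if (inMainBody(re, im)) {
        return maxIterations;  // skip the most expensive points entirely
    }
    double zr = 0.0, zi = 0.0;
    std::uint32_t it = 0;
    for (; it < maxIterations; ++it) {
        double zr2 = zr * zr, zi2 = zi * zi;
        if (zr2 + zi2 > 4.0) {
            break;  // |z|^2 > 2^2, no sqrt needed
        }
        zi = 2.0 * zr * zi + im;  // uses the old zr
        zr = zr2 - zi2 + re;
    }
    return it;
}
```

Because each pixel depends only on its own coordinates, a function like this can be called from any number of worker threads without shared state.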
In addition to the comments by @Grady Player, you could start just by optimising your code:
//Add c
z.re += position.re;
z.im += position.im;
//Exit condition (mod(z) > 5)
if (z.re*z.re + z.im*z.im > 25.0f)
break;
The compiler may optimise the first, but the second will certainly help.
Why are you coding your own complex type rather than using complex.h?

Do - While now i am Logically Stuck

I am proper confused.com now and my lack of programming knowledge is clearly showing... so time to call in the pro's! (as a side note i do feel that i have learned a great deal already from you all)
The Problem
I have a little app that takes the width of a virtual wall and then asks for the tile size. It then works out how many tiles you can fit across the width with no gap. If the tiles do not fit exactly, the user is presented with two buttons, "grow wall" or "shrink wall"; the code below is for growing the wall. It should run through the while loop trying to find, in increments of 0.1, how wide the wall needs to be for the tiles to fit exactly. It needs to handle double values, such as the wall being 12.5 feet wide with a tile width and length of 13.6; that is why I am using doubles, not ints.
I have the following code (but think i am simplifying this way too much)
either way it is not working i have spent half a day reading playing etc but can't figure it out, am i going down the wrong road or am i on the correct track
As always any advice would be warmly received
Mr H
{
    double wWdth;
    double tWdth;
    double wdivision = 3;
    NSString *growstring = [NSString stringWithFormat:@"%@,%@", self.wallWidth.text, self.tileWidth.text];
    NSArray *wall = [growstring componentsSeparatedByString:@","];
    NSLog(@"%@",wall);
    wWdth = [[wall objectAtIndex:0]doubleValue];
    blkWdth = [[wall objectAtIndex:1]doubleValue];
    do
    {
        NSLog(@" INSIDE WHILE %.f",wdivision);
        wWdth = wWdth;
        wWdth += 0.1;
        wdivision = fmod(qWdth,blkWdth);
    } while (wdivision ==! 0);
    NSString* newWdth = [NSString stringWithFormat:@"%.f", wWdth];
    self.wallWidth.text=newWdth;
}
Your while loop condition has several problems. It is based on the value of wdivision and the value of wdivision is from fmod(qWdth,blkWdth). But neither qWdth nor blkWdth ever change. So your loop will either execute once or forever.
Your loop condition needs to be based on a value that will eventually become false so the loop terminates at some point.
Also, you are using ==! for (what I assume) to be the "not equal" operator. The "not equal" operator is !=. What you have is really:
while (wdivision == !0);
and !0 equates to 1 so you really have:
while (wdivision == 1);
So your loop will end after one iteration unless it just so happens that wdivision is equal to 1. And if it is, your loop will never end.
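The parse described above can be checked directly. The C++ sketch below is a small illustration with hypothetical function names; in C++ `!0` is the boolean `true`, which converts to 1 in the comparison, so the outcome matches the C and Objective-C behaviour described.

```cpp
#include <cassert>

// Same tokens as in the question: the compiler reads `x == (!0)`.
bool looksLikeNotEqual(double x) {
    return x ==! 0;
}

// What was presumably intended.
bool reallyNotEqual(double x) {
    return x != 0;
}

// Self-check of the claims in the text.
void checkParse() {
    assert(looksLikeNotEqual(1.0));    // true only when x is exactly 1
    assert(!looksLikeNotEqual(0.5));   // "not zero", yet the test is false
    assert(!looksLikeNotEqual(0.0));
    assert(reallyNotEqual(0.5));
    assert(!reallyNotEqual(0.0));
}
```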
Since wdivision is a double based on the fmod function, it may never actually equal 0 exactly. You should probably use this instead:
while (wdivision > DBL_EPSILON);
But again, you need to change how wdivision is calculated so it is different each iteration.
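The reason an exact comparison with 0 is fragile can be seen by adding 0.1 repeatedly: the sum drifts away from the decimal value one would expect. The C++ sketch below shows this together with a tolerance-based test. It is a variation on the suggestion above rather than the same check: the function names are hypothetical, the tolerance is chosen by the caller instead of being DBL_EPSILON, and a remainder just below the unit is also treated as a fit.

```cpp
#include <cassert>
#include <cmath>

// Adds 0.1 to itself `count` times, the way the loop in the question grows the wall.
double addTenths(int count) {
    assert(count >= 0);
    double sum = 0.0;
    for (int i = 0; i < count; ++i) {
        sum += 0.1;
    }
    return sum;
}

// True if value is within `tolerance` of a whole multiple of unit.
bool isMultiple(double value, double unit, double tolerance) {
    assert(value >= 0 && unit > 0 && tolerance >= 0);
    double r = std::fmod(value, unit);
    // The remainder can land just above 0 or just below unit.
    return r <= tolerance || unit - r <= tolerance;
}
```

On IEEE 754 doubles, addTenths(10) is slightly less than 1.0, so fmod against 1.0 returns a value close to 1 instead of 0; the two-sided test still accepts it.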
If you want to stick with the code you have, recognize that "getting within half of the increment" is as close as you can get. So the lines
wWdth += 0.1;
wdivision = fmod(qWdth,blkWdth);
} while (wdivision ==! 0);
Should be replaced with
wWdth += increment;
wdivision = fmod(qWdth,blkWdth);
} while (fabs(wdivision) > 0.5*increment);
Maybe I'm not understanding the question completely correctly, so I apologize if I don't. You say you ask for a tile size right? And then you grow/shrink the wall if the tiles can't fit perfectly into that wall size? If so, can't you just take the current wall size and set it to the next multiple (up or down) of the tile size based on when you click grow/shrink wall? Setting it to a multiple of the tile size will allow for the tiles to fit exactly in it.
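The "next multiple" idea needs no loop at all. A minimal C++ sketch, with hypothetical function names, assuming both widths use the same unit:

```cpp
#include <cassert>
#include <cmath>

// Smallest multiple of tile that is >= wall ("grow wall").
double growToMultiple(double wall, double tile) {
    assert(wall >= 0 && tile > 0);
    return std::ceil(wall / tile) * tile;
}

// Largest multiple of tile that is <= wall ("shrink wall").
double shrinkToMultiple(double wall, double tile) {
    assert(wall >= 0 && tile > 0);
    return std::floor(wall / tile) * tile;
}
```

For a wall of 10 and a tile of 3 this gives 12 when growing and 9 when shrinking. With the question's example (wall 12.5, tile 13.6) growing gives 13.6 and shrinking gives 0, so the caller may want to reject a shrink that leaves no wall.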

audio sequencer with swing (shuffle) Obj-C

I'm working on a drum computer with sequencer for the iPad. The drum computer is working just fine and writing the sequencer wasn't that much of a problem either. However, the sequencer is currently only capable of a straight beat (each step has equal duration). I would like to add a swing (or shuffle as some seem to call it) option, but I'm having trouble figuring out how.
'Swing' according to Wikipedia
Straight beat (midi, low volume)
Beat with Swing (midi, low volume)
If I understand correctly, swing is pretty much achieved by offsetting the eighth notes between the 1-2-3-4 by a configurable amount. So instead of
1 + 2 + 3 + 4 +
it becomes something like
1 +2 +3 +4 +
The linked midi files illustrate this better...
However, the sequencer works with 1/16th or even 1/32nd steps, so if the 2/8th (4/16th) note is offset, how would that affect the 5/16th note?
I'm probably not approaching this the correct way. Any pointers?
Sequencer code
This is the basics of how I implemented the sequencer. I figured altering the stepDuration at certain points should give me the swing effect I want, but how?
#define STEPS_PER_BAR 32
// thread
- (void) sequencerLoop
{
    while(isRunning)
    {
        NSAutoreleasePool* pool = [[NSAutoreleasePool alloc] init];
        // prepare for step
        currentStep++;
        if(currentStep >= STEPS_PER_BAR * activePatternNumBars)
            currentStep = 0;
        // handle the step/tick
        ...
        //calculate the time to sleep until the next step
        NSTimeInterval stepDuration = (60.0f / (float)bpm) / (STEPS_PER_BAR / 4);
        nextStepStartTime = nextStepStartTime + stepDuration;
        NSTimeInterval now = [NSDate timeIntervalSinceReferenceDate];
        // sleep if there is time left
        if(nextStepStartTime > now)
            [NSThread sleepUntilDate:[NSDate dateWithTimeIntervalSinceReferenceDate:nextStepStartTime]];
        else {
            NSLog(@"WARNING: sequencer loop is lagging behind");
        }
        [pool release];
    }
}
Edit: added code
I'm not familiar with the sequencer on iOS, but usually sequencers subdivide steps or beats into "ticks", so the way to do this would be to shift the notes that don't fall right on a beat back by a few "ticks" during playback. So if the user programmed:
1 + 2 + 3 + 4 +
Instead of playing it back like that, you shift any notes falling on the "and" back by however many ticks (depending on exactly where it falls, how much "swing" was used, and how many "ticks" per beat)
1 +2 +3 +4 +
Sorry if that's not much help, or if I'm not much more than restating the question, but the point is you should be able to do this, probably using something called "ticks". You may need to access another layer of the API to do this.
Update:
So say there are 32 ticks per beat. That means the "+" in the diagram above is tick # 16 -- that's what needs to be shifted. (that's not really a lot of resolution, so having more ticks is better).
Let's call the amount we move it the "swing factor", s. For no swing, s = 1; for "infinite" swing, s = 2. You probably want to use a value like 1.1 or 1.2. For simplicity, we'll use linear interpolation to determine the new position. (As a side note, for more on linear interpolation and how it pertains to audio, I wrote a little tutorial.) We need to break the time before and after 16 into two sections, since the time before is going to be stretched and the time after is going to be compressed.
if( tick <= 16 )
    tick *= s; //stretch
else
    tick = (2-s)*tick + 32*(s-1); //compress
How you deal with rounding is up to you. Obviously, you'll want to do this on playback only and not store the new values, since you won't be able to recover the original value exactly due to rounding.
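The mapping above can be written as a pure function so the stored pattern is never modified. The C++ sketch below generalizes the hard-coded 16 and 32 to a ticksPerBeat parameter; the function name and that parameter are hypothetical, and the result is left as a double so that rounding stays with the caller.

```cpp
#include <cassert>

// Maps a tick position within one beat to its swung position.
// s = 1 leaves the beat straight; s = 2 pushes the off-beat onto the next beat.
double swingTick(double tick, double s, double ticksPerBeat = 32.0) {
    assert(s >= 1.0 && s <= 2.0);
    assert(tick >= 0.0 && tick <= ticksPerBeat);
    double half = ticksPerBeat / 2.0;
    if (tick <= half) {
        return tick * s;  // first half of the beat is stretched
    }
    // Second half is compressed so that the end of the beat stays put.
    return (2.0 - s) * tick + ticksPerBeat * (s - 1.0);
}
```

With s = 1.5 the off-beat at tick 16 moves to 24, a note at tick 24 moves to 28, and tick 32 stays at 32, so the two branches meet at the midpoint and the beat length is unchanged.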
Change the number of steps to 12 instead of 16. Then each beat has 3 steps instead of 4. Triplets instead of 16th notes. Put sounds on the first and third triplet and it swings. Musicians playing swing use the second triplet also.
Offsetting the notes to create a shuffle does not give you access to the middle triplet.

Bouncing ball not conforming to Conservation of Energy Rule

I am currently writing a small ball physics engine for my programming course, using the Win32 API and C++. I have finished the GDI backbuffer renderer and the whole GUI (a couple more things to adjust), so I am very near completion. The only big obstacles left are ball-to-ball collision (which I can fix on my own) and, the biggest problem of all, the bouncing of the balls. When I throw a ball it falls as expected, but once it bounces it rises higher than the point where I released it. The funny thing is, it only happens below a certain height. This part is the physics code:
(If you need any more code or explanation, please ask, but i would greatly appreciate it if you guys could have a look at my code.)
void RunPhysics(OPTIONS &o, vector<BALL*> &b)
{
    UINT simspeed = o.iSimSpeed;
    DOUBLE DT; //Delta T
    BOOL bounce; //for playing sound
    DT= 1/o.REFRESH;
    for(UINT i=0; i<b.size(); i++)
    {
        for(UINT k=0; k<simspeed; k++)
        {
            bounce=false;
            //handle the X bounce
            if( b.at(i)->rBall.left <= 0 && b.at(i)->dVelocityX < 0 ) //ball bounces against the left wall
            {
                b.at(i)->dVelocityX = b.at(i)->dVelocityX * -1 * b.at(i)->dBounceCof;
                bounce=true;
            }
            else if( b.at(i)->rBall.right >= SCREEN_WIDTH && b.at(i)->dVelocityX > 0) //ball bounces against the right wall
            {
                b.at(i)->dVelocityX = b.at(i)->dVelocityX * -1 * b.at(i)->dBounceCof;
                bounce=true;
            }
            //handle the Y bounce
            if( b.at(i)->rBall.bottom >= SCREEN_HEIGHT && b.at(i)->dVelocityY > 0 ) //ball bounces against the floor
            {
                //damping of the ball
                if(b.at(i)->dVelocityY < 2+o.dGravity/o.REFRESH)
                {
                    b.at(i)->dVelocityY = 0;
                }
                //decrease the Velocity of the ball according to the bouncecof
                b.at(i)->dVelocityY = b.at(i)->dVelocityY * -1*b.at(i)->dBounceCof;
                b.at(i)->dVelocityX = b.at(i)->dVelocityX * b.at(i)->dBounceCof;
                bounce=true;
            }
            //gravity
            b.at(i)->dVelocityY += (o.dGravity)/o.REFRESH;
            b.at(i)->pOrigin.y += b.at(i)->dVelocityY + (1/2)*o.dGravity/o.REFRESH*DT*METER;
            //METER IS DEFINED GLOBALLY AS 100 which is the amount of pixels in a meter
            b.at(i)->pOrigin.x += b.at(i)->dVelocityX/o.REFRESH*METER;
            b.at(i)->UpdateRect();
        }
    }
    return;
}
You are using the Euler method of integration. It is possible that your time step (DT) is too large. Also there seems to be a mistake on the row that updates the Y coordinate:
b.at(i)->pOrigin.y += b.at(i)->dVelocityY + (1/2)*o.dGravity/o.REFRESH*DT*METER;
You have already added the gravity to the velocity, so you don't need to add it to the position and you are not multiplying the velocity by DT. It should be like this:
b.at(i)->pOrigin.y += b.at(i)->dVelocityY * DT;
Furthermore there appears to be some confusion regarding the units (the way METER is used).
Okay, a few things here.
You have differing code paths for bounce against left wall and against right wall, but the code is the same. Combine those code paths, since the code is the same.
As to your basic problem: I suspect that your problem stems from the fact that you apply the gravity after you apply any damping forces / bounce forces.
When do you call RunPhysics? In a timer loop? This code is just an approximation, not an exact calculation. In the short interval of delta t, the ball has already changed its position and velocity a little bit, which isn't considered in your algorithm and produces small errors. You'll have to compute the time until the ball hits the ground and predict the changes.
And the gravity is already included in the velocity, so don't add it twice here:
b.at(i)->pOrigin.y += b.at(i)->dVelocityY + (1/2)*o.dGravity/o.REFRESH*DT*METER;
By the way: Save b.at(i) in a temporary variable, so you don't have to recompute it in every line.
Ball* CurrentBall = b.at(i);
ANSWER!!ANSWER!!ANSWER!! but i forgot my other account so i can't flag it :-(
Thanks for all the great replies, they really helped me a lot! The answers you gave were indeed correct: a couple of my formulas were wrong and some code optimisation could be done, but none of them was really a solution to the problem. So I just sat down with a piece of paper and started calculating every value I got from my program by hand. It took me about two hours :O But I did find the solution to my problem:
The problem is that when I update my velocity (with the corrected code) I get a decimal value, which is no problem at all. Later I increase the Y position by adding the velocity times Delta T, which is a very small value. The result is a very small value that needs to be added. The problem is that when you draw an Ellipse() in Win32 the point is a LONG, so all the decimal values are lost. That means that something only happens after a very long period, when the velocity grows past the decimal range, and along with that, the higher you drop the ball the better the results (one of my symptoms). The solution to this problem was really simple: add an extra DOUBLE value to my Ball class which contains the true position (including decimals) of my ball. During RenderFrame() you just take the floor or ceiling of the double to draw the ellipse, but for all the calculations you use the double value. Once again, thanks a lot for all your replies, STACKOVERFLOW PEOPLE ROCK!!!
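The effect described above is easy to reproduce in isolation. The C++ sketch below uses hypothetical function names and assumes positive screen coordinates. It also shows one consequence of conversion to an integer truncating toward zero: a small downward step is dropped entirely, while an equally small upward step removes a whole pixel, so the error favours upward motion.

```cpp
#include <cassert>

// Position stored as an integer, as when the LONG in the ball's rectangle
// is the only copy of the coordinate.
long moveTruncated(long start, double pixelsPerStep, int steps) {
    assert(steps >= 0);
    long y = start;
    for (int i = 0; i < steps; ++i) {
        y = static_cast<long>(y + pixelsPerStep);  // fraction is discarded
    }
    return y;
}

// Position kept as a double; convert only when drawing.
double moveExact(double start, double pixelsPerStep, int steps) {
    assert(steps >= 0);
    double y = start;
    for (int i = 0; i < steps; ++i) {
        y += pixelsPerStep;
    }
    return y;
}
```

Starting at y = 100, ten steps of +0.4 leave the truncated position at 100, while ten steps of -0.4 move it to 90; the double version ends near 104 and 96 respectively.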
If your dBounceCof is > 1 then, yes your ball will bounce higher.
We do not have all the values to be able to reply to your question.
I don't think your equation for position is right:
b.at(i)->dVelocityY += (o.dGravity)/o.REFRESH;
This is v=v0+gt - that seems fine, although I'd write dGravity*DT instead of dGravity/REFRESH_FREQ.
b.at(i)->pOrigin.y += b.at(i)->dVelocityY + (1/2)*o.dGravity/o.REFRESH*DT*METER;
But this seems off: it is equivalent to p = p0 + v + 1/2gt^2.
You ought to multiply velocity * time to get the units right
You are scaling the gravity term by pixels/meter, but not the velocity term. So that ought to be multiplied by METER also
You have already accounted for the effect of gravity when you updated velocity, so you don't need to add the gravity term again.
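Putting the three points above together, one integration step might look like the following C++ sketch. The struct, the function name and the parameter names are hypothetical; velocity is kept in meters per second and position in pixels, with a pixels-per-meter factor playing the role of METER in the question.

```cpp
#include <cassert>

struct BallState {
    double yPixels;        // vertical position, in pixels (grows downward)
    double vMetersPerSec;  // vertical velocity, in meters per second
};

// One Euler step: gravity changes the velocity, the velocity changes the position.
void eulerStep(BallState& b, double gravity, double dt, double pixelsPerMeter) {
    assert(dt > 0 && pixelsPerMeter > 0);
    b.vMetersPerSec += gravity * dt;                     // v = v0 + g*t
    b.yPixels += b.vMetersPerSec * dt * pixelsPerMeter;  // velocity * time, scaled to pixels
}
```

Gravity appears once, in the velocity update, and the velocity term is both multiplied by the time step and scaled to pixels.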
Thanks for the quick replies!!! Sorry, I should have been clearer: RunPhysics is run after a PeekMessage. I have also added a frame limiter which makes sure that no more calculations are done per second than the refresh rate of the monitor. My delta t is therefore 1 second divided by the refresh rate. Maybe my DT is actually too small to calculate with, although it's a double value??? My coefficient of restitution is adjustable but starts at 0.9.
You need to recompute your position on bounce, to make sure you bounce from the correct place on the wall.
I.e. resolve the exact point in time when the bounce occurred, and calculate the new velocity/position based on that direction change (partway into a "frame" of calculation) to make sure your ball does not move "beyond" the walls, more and more on each bounce.
W.r.t. time step, you might want to check out my answer here.
In a rigid body simulation, you need to run the integration up to the instant of collision, then adjust the velocities to avoid penetration at the collision, and then resume the integration. It's sort of an instantaneous kludge to cover the fact that rigid bodies are an approximation. (A real ball deforms during a collision. That's hard to model, and it's unnecessary for most purposes.)
You're combining these two steps (integrating the forces and resolving the collisions). For a simple simulation like you've shown, it's probably enough to skip the gravity bit on any iteration where you've handled a vertical bounce.
In a more advanced simulation, you'd split any interval (dt) that contains a collision at the actual instant of collision. Integrate up to the collision, then resolve the collision (by adjusting the velocity), and then integrate for the rest of the interval. But this looks like overkill for your situation.
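For a single ball under constant gravity, the split-at-collision scheme can be sketched in C++ as below. This is an illustration under assumptions, not the question's code: the names are hypothetical, y grows downward as on screen, units are consistent (no pixel scaling), and each sub-interval uses the closed-form constant-acceleration update instead of Euler so that the instant of impact can be solved for directly.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct FallingBall {
    double y;  // position, growing downward
    double v;  // velocity, positive when moving down
};

// Advances the ball by dt; the floor is at floorY, e is the restitution coefficient.
void bounceStep(FallingBall& b, double dt, double g, double floorY, double e) {
    assert(dt > 0 && g > 0 && e >= 0 && b.y <= floorY);
    double yEnd = b.y + b.v * dt + 0.5 * g * dt * dt;
    if (yEnd < floorY) {  // no collision inside this interval
        b.y = yEnd;
        b.v += g * dt;
        return;
    }
    // Instant of impact: solve 0.5*g*t^2 + v*t - (floorY - y) = 0 for t in [0, dt].
    double t = (-b.v + std::sqrt(b.v * b.v + 2.0 * g * (floorY - b.y))) / g;
    t = std::min(std::max(t, 0.0), dt);
    double vOut = -e * (b.v + g * t);  // resolve the collision at the floor
    double rest = dt - t;              // integrate the remainder of the interval
    b.y = floorY + vOut * rest + 0.5 * g * rest * rest;
    b.v = vOut + g * rest;
    if (b.y > floorY) {                // too slow to leave the floor: come to rest
        b.y = floorY;
        b.v = 0.0;
    }
}
```

With e = 1 a ball dropped from rest returns to its release height and no higher, because the bounce is applied exactly at the floor rather than at wherever the ball happened to be at the end of a frame.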