Speech recognition syllable counter - msdn

I'm looking for help finding or creating an app or program that counts syllables using voice recognition. The application is for stuttering therapy: slowing speech down and joining/slurring words at a slow rate (50 syllables per minute), then gradually speeding up while practicing a modified way of speaking. The speaker spends two days at 50 SPM (syllables per minute), the next few days at 80 SPM, then 100, and so on, until the stutterer is talking at 180-200 syllables per minute (normal speed) but with a modified speech pattern ("smooth speech") that significantly reduces the stutter. In the past I have used a handheld device, manually counted the syllables, and told the speaker to slow down or speed up depending on their syllable count.

There is a Praat script to detect syllable nuclei that has been published and validated. You may be able to use this script for your work, or it may be a good starting point for writing your own.
You may also be interested in the Android app SpeakRite, which displays speaking rate in real time.
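Once you have a running syllable count from such a detector (or from manual counting), the pacing feedback itself is simple arithmetic. A minimal sketch, assuming a hypothetical external source of syllable counts; the target rates come from the staged 50/80/100/... SPM schedule described above:

```java
/** Pacing helper for smooth-speech practice: compares the measured syllable
 *  rate against a target stage (e.g. 50, 80, 100 ... 200 SPM). The syllable
 *  counts are assumed to come from elsewhere (a recognizer or a
 *  syllable-nuclei detector). */
public class SpeechPacer {

    private final double targetSpm;

    public SpeechPacer(double targetSpm) {
        this.targetSpm = targetSpm;
    }

    /** @param syllableCount  syllables counted so far in this exercise
     *  @param elapsedSeconds speaking time so far, in seconds */
    public String feedback(int syllableCount, double elapsedSeconds) {
        double spm = syllableCount / (elapsedSeconds / 60.0);
        double tolerance = targetSpm * 0.10;  // +/-10% band, an arbitrary choice
        if (spm > targetSpm + tolerance) return "Slow down (" + Math.round(spm) + " SPM)";
        if (spm < targetSpm - tolerance) return "Speed up (" + Math.round(spm) + " SPM)";
        return "On target (" + Math.round(spm) + " SPM)";
    }

    public static void main(String[] args) {
        SpeechPacer pacer = new SpeechPacer(50);     // 50 SPM stage
        System.out.println(pacer.feedback(30, 30));  // 60 SPM -> "Slow down"
    }
}
```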

Related

OptaPlanner VRPPD with Traffic and Time Windows

Is there currently a way to incorporate traffic patterns into OptaPlanner with the package and delivery VRP problem?
Eg. Let's say I need to optimize 500 pickup and deliveries today and tomorrow amongst 30 vehicles where each pickup has a 1-4hr time window. I want to avoid busy areas of the city during rush hours when possible.
New pickups can also be added (or cancelled in the meantime).
I'm sure this is a common problem. Does a decent solution exist for this in OptaPlanner?
Thanks!
Users often do this, but there is no out-of-the-box example of it.
There are several ways to do it, but one way is to add a third dimension to the distanceMatrix, indicating the departure time. Typically that uses a granularity of 15 minutes, 30 minutes, or 1 hour.
There are 2 scaling concerns here:
Memory: a 15-minute granularity means 24 * 4 = 96 time buckets per day. Given that a 2-dimensional distanceMatrix for 10k locations already uses almost 2 GB of RAM, memory can clearly become a concern.
Pre-calculation time: calculating the distance matrix can be time consuming. "Bulk algorithms" can help here. For example, the GraphHopper community edition doesn't support bulk distance calculations, but their enterprise version does, as does OSRM (which is free). Fetching a 3-dimensional matrix from the remote Google Maps API, or the remote enterprise GraphHopper API, can also raise bandwidth concerns (see above: the distance matrix can grow to several GB in size, especially in non-binary formats such as JSON or CSV).
In any case, once that 3-dimensional matrix is there, it's just a matter of adjusting the OptaPlanner example's ArrivalTimeUpdateListener to use getDistance(from, to, departureTime).
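For illustration, a minimal sketch of such a time-bucketed lookup (the class and field names are hypothetical, not part of the OptaPlanner examples):

```java
/** Travel-time matrix with a third dimension for the departure time.
 *  Names are illustrative; only the lookup pattern matters. */
public class TimeDependentDistanceMatrix {

    private static final int BUCKET_MINUTES = 15;
    private static final int BUCKETS_PER_DAY = 24 * 60 / BUCKET_MINUTES; // 96

    // [fromLocationIndex][toLocationIndex][departureTimeBucket] -> travel time in seconds
    private final int[][][] travelTimeSeconds;

    public TimeDependentDistanceMatrix(int locationCount) {
        travelTimeSeconds = new int[locationCount][locationCount][BUCKETS_PER_DAY];
    }

    public void setTravelTime(int from, int to, int bucket, int seconds) {
        travelTimeSeconds[from][to][bucket] = seconds;
    }

    /** departureTimeInMinutesOfDay: 0..1439, e.g. 8:30 -> 510 */
    public int getDistance(int from, int to, int departureTimeInMinutesOfDay) {
        int bucket = (departureTimeInMinutesOfDay / BUCKET_MINUTES) % BUCKETS_PER_DAY;
        return travelTimeSeconds[from][to][bucket];
    }
}
```

At 15-minute granularity this is roughly 96 times the footprint of the 2-dimensional matrix, which is why a coarser granularity, smaller primitive types, or sharing identical buckets quickly becomes necessary at 10k locations.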

Precision in Game Engine Physics (eg TrackMania "Press Forward Maps")

For some time now, I've been thinking about how games calculate physics. Take as an example the game TrackMania. There are special routes where you only have to accelerate from the beginning to get to the finish. As an example, I take the following YouTube video (https://www.youtube.com/watch?v=uK7Y7zyP_SY). Unfortunately, I'm not a specialist in game development, but I know roughly how an engine works.
Most engines use a game loop, which means they use the delta value between the last call and the current call. This delta value is used to move objects, detect collisions and so on. The higher the delta value, the farther the object must have moved. The principle works fine with many games, but not with TrackMania.
A PC that can only display 25 FPS would calculate the physics differently than a PC with 120 FPS, because collision detection is more accurate at the higher frame rate (an impact is detected earlier, the speed is adjusted accordingly, ...). Now, if you assume that the delta value is always the same (as with Super Mario Maker, at least that's my assumption), then this would work. But that would cause problems similar to old games (https://superuser.com/questions/630769/why-do-some-old-games-run-much-to-quickly-on-modern-hardware/).
Now my question, why do such maps work on every PC and why is the physics always exactly the same? Did I miss any aspect of game development / engine development?
The answer is simple: first, the game's physics is deterministic; given the same input, the result will always be the same.
Second, the physics loop is not the same as the render loop: the game ensures the physics loop is called with exactly the same period every time during the whole execution. So yes, a delta is needed for the rendering part, but the physics uses a constant time in ms between each iteration.
One last thing: you won't find "Press Forward" maps in multiplayer; these kinds of maps will not work correctly there, which is directly linked to specifics in the physics meant to prevent TAS (tool-assisted speedrun) abuse.
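That second point is commonly implemented as a fixed-timestep loop with an accumulator. A minimal sketch (the class and method names are illustrative, not TrackMania's actual code):

```java
/** Fixed-timestep game loop: rendering uses the variable frame delta,
 *  but physics always advances in constant steps, so the simulation is
 *  deterministic for the same inputs regardless of frame rate. */
public class FixedTimestepLoop {

    private static final double PHYSICS_STEP_SECONDS = 1.0 / 100.0; // e.g. 100 Hz

    private double accumulator = 0.0;

    /** Called once per rendered frame with the real elapsed time. */
    public void onFrame(double frameDeltaSeconds) {
        accumulator += frameDeltaSeconds;
        // Run as many fixed physics steps as fit into the elapsed time.
        while (accumulator >= PHYSICS_STEP_SECONDS) {
            stepPhysics(PHYSICS_STEP_SECONDS);   // always the same dt
            accumulator -= PHYSICS_STEP_SECONDS;
        }
        render(); // may interpolate between the last two physics states
    }

    private void stepPhysics(double dt) { /* integrate positions, resolve collisions ... */ }

    private void render() { /* draw the current state */ }
}
```

A 25 FPS machine simply runs several physics steps per rendered frame, while a 120 FPS machine runs a step only every few frames; in both cases the sequence of physics states is identical, which is why a recorded input sequence replays the same way on every PC.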

Using Optaplanner to solve VRPTW with large number of customers and sophisticated constraints

I'm developing a solver for a VRPTW problem using OptaPlanner, and I have run into a problem when a large number of customers need to be serviced. By large I mean up to 10,000 customers. I have tried running the solver for about 48 hours, but no feasible solution was ever reached.
I use a highly customized VRPTW domain model that introduces an additional planning entity, the so-called "Workbreak". Workbreaks are like customers, but they can have a location that is actually another planning value, because every day a worker can return home or go to a hotel. Workbreaks have a fixed time of departure (usually the next morning) and a variable time of arrival (because it depends on the previous entity in the chain). A hard constraint ensures that a Workbreak is not "arrived at" after a certain point in time. There are other hard constraints too, like:
multiple service time windows per customer
every week the last customer in the chain must be a special "storage space visit" customer (workers need to gather materials before the next week)
long-job management (when a customer needs to be serviced for longer than a specified time, the job should be serviced before a specific hour of the day)
max number of jobs per workday
max total job duration per workday (a worker cannot work longer than a specified time)
a workbreak cannot use a hotel location that is too close to the worker's home
jobs cannot be serviced on Sundays
... and many more - there are 19 hard constraints in total that have to be applied, plus 3 soft constraints.
All the aforementioned constraints were initially written as Drools rules, but because of the many accumulation-based constraints (max jobs per day, max hours per day, overtime hours per week), the overall speed of the solver (in benchmarks) was only about 400 steps/sec.
At first I thought the solver's speed was simply too slow to reach a feasible solution in a reasonable time, so I rewrote all the rules as an easy score calculator, which had decent speed - about 4600 steps/sec. I knew it would only perform well for a really small number of customers, but I wanted to know whether Drools was the cause of the poor performance. Then I rewrote all these rules as an incremental score calculator (and survived the pain of corrupted-score bugs until all of them were fixed). Surprisingly, incremental score calculation is a bit slower than the easy score calculator for a small number of customers, but that is not an issue, because the overall speed stays at about 4000 steps/sec no matter how many entities I have.
The thing that bugs me the most is that above a certain number of customers (problems start at around 1,000 customers) the solver cannot reach a feasible solution. Currently I'm using the Late Acceptance and Step Counting Hill Climbing algorithms, because they perform really well for this kind of problem (at least for a smaller number of customers). I tried Simulated Annealing too, but without success, mostly because I could not find good values for its algorithm-specific parameters.
I have implemented some custom moves too:
A composite move that changes a workbreak's location when sibling entities are changed by other moves like change/swap moves (it helps escape many score traps, as an improving step usually requires at least two moves to be performed at once)
A move factory for better long-job assignment (it generates moves that try to put customers with longer service times at the front of a workday chain)
A workbreak assignment move factory (it generates moves that help put workbreaks in the proper sequence)
Now I'm scratching my head, wondering what I should do to diagnose the source of my problem. I suspected that maybe it was hitting a score trap, so I modified the solver to save a snapshot of the best score every minute. After reading these snapshots I realized that the score was still improving (the penalties were still decreasing). Can the number of hard constraints play a role? I suspect that many moves need to be evaluated before one that improves the score is found. Maybe 48 hours simply isn't that much for this kind of problem, and it needs a whole week of computation? Unfortunately I have nothing to compare against.
I would like to know how to find out whether this is purely a performance problem, or a solver configuration problem (algorithm, custom moves, hard/soft score).
I really apologize for my bad English.
TL;DR but FWIW:
To scale above 1k locations you need to use Nearby Selection.
To scale above 10k locations, add Partitioned Search too.
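To give an idea of what Nearby Selection involves: you supply a NearbyDistanceMeter implementation and reference it from the change/swap move selectors in the solver config. A minimal sketch, assuming a domain model along the lines of the OptaPlanner VRP example (Customer, Standstill, and a Location with a getDistanceTo method):

```java
import org.optaplanner.core.impl.heuristic.selector.common.nearby.NearbyDistanceMeter;

/** Tells Nearby Selection how "close" two planning objects are, so the move
 *  selectors can favor moves between nearby customers instead of random pairs.
 *  Customer, Standstill and Location are assumed to come from your own
 *  VRP-example-style domain model. */
public class CustomerNearbyDistanceMeter
        implements NearbyDistanceMeter<Customer, Standstill> {

    @Override
    public double getNearbyDistance(Customer origin, Standstill destination) {
        // Plain travel distance; time-window compatibility could be mixed in too.
        return origin.getLocation().getDistanceTo(destination.getLocation());
    }
}
```

The meter is then referenced from a nearbySelection element on the move selectors in the solver XML (together with a distribution size limit), as described in the OptaPlanner reference manual.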

Performance overhead for frequent (5Hz) Core Data saves

For an iPhone app that plays audio files, I'm working on a system to track the user's progress in any episode they've listened to (e.g., they listen to the first 4:35 of file1, then start another file, then go back to file1 and it resumes at 4:35).
I've set up a Core Data model to store the metadata, but I'm wondering how aggressively I could/should cache the current location during playback.
Currently I have just stuck the save: call in a method that was already being used to update the time labels and the UISlider playhead. That method is called by an NSTimer every 0.2 seconds.
0.2 seconds is much more precision than I need to keep track of for the progress cache. The values are rounded to the nearest second anyway, so essentially 4/5 of every save is redundant.
Given, though, that this is pretty much all Core Data is doing (it's only ever dealing with a single value for a single record at any given time), I'm wondering whether it makes more sense to just do the extra, unnecessary save: calls, or to manage a second timer that does the update less frequently.
As is, Instruments reports the Save Duration of each event as ~800, peaking around 2000. I'm not really sure how to interpret those results. Actual app performance in the simulator doesn't appear to be significantly impacted.
If this kind of save is so cheap that it makes sense to keep code complexity low (only managing a single timer), I would keep it as is, but my gut instinct is that that's a lot of operations, no matter how cheap.
You shouldn't see as much of a difference in performance as you may see in battery consumption.
Writing to disk with the flash storage in an iOS device is much faster than writing to a spinning-platter HDD on a computer. Also, a write to an HDD does not cost much electricity compared to just keeping the platters spinning anyway. However, writing to flash storage takes more power than a read, or than leaving the flash alone.
In other words, the power consumption for a write on an iOS device is not negligible. If you can get away without those extra 4 redundant saves per second, that could easily result in a notable improvement in battery consumption for your app.
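The throttling itself is trivial and independent of Core Data. A minimal sketch of the idea, with a hypothetical persist() standing in for whatever actually performs the save: call:

```java
/** Throttles progress persistence: the playhead callback still fires at 5 Hz,
 *  but persist() is only invoked when the value actually stored (the position
 *  rounded to whole seconds) changes. persist() is a hypothetical stand-in
 *  for the real write (e.g. a Core Data save). */
public class ProgressSaver {

    private long lastSavedSecond = -1;

    /** Called from the existing 0.2 s timer with the current playback position. */
    public void onPlaybackTick(double positionSeconds) {
        long roundedSecond = Math.round(positionSeconds);
        if (roundedSecond != lastSavedSecond) {
            lastSavedSecond = roundedSecond;
            persist(roundedSecond);   // at most ~1 write per second instead of 5
        }
    }

    private void persist(long positionSeconds) {
        // platform-specific write goes here
    }
}
```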

GPS signal cleaning & road network matching

I'm using GPS units and mobile computers to track individual pedestrians' travels. I'd like to "clean" the incoming GPS signal in real time to improve its accuracy. Also, after the fact (not necessarily in real time), I would like to "lock" individuals' GPS fixes to positions along a road network. Do you have any techniques, resources, algorithms, or existing software to suggest on either front?
A few things I am already considering in terms of signal cleaning:
- drop fixes for which num. of satellites = 0
- drop fixes for which speed is unnaturally high (say, 600 mph)
And in terms of "locking" to the street network (which I hear is called "map matching"):
- lock to the nearest network edge based on root mean squared error
- when fixes are far away from road network, highlight those points and allow user to use a GUI (OpenLayers in a Web browser, say) to drag, snap, and drop on to the road network
Thanks for your ideas!
I assume you want to "clean" your data to remove erroneous spikes caused by dodgy readings. This is a basic DSP process. There are several approaches you could take; it depends how clever you want it to be.
At a basic level, yes, you can just look for really large figures, but what is a really large figure? Sure, 600 mph is fast, but not if you're on Concorde. While you are looking for a value that is "out of the ordinary", you are effectively hard-coding "ordinary". A better approach is to examine past data to determine what "ordinary" is, and then look for deviations. You might consider calculating the variance of the data over a small local window and then checking whether the z-score of the current reading exceeds some threshold; if so, exclude it.
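A minimal sketch of that moving-window z-score check (the window size and threshold are arbitrary choices to tune against your own data):

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Rejects speed readings that deviate too far from the recent local average,
 *  instead of hard-coding a single "too fast" cutoff. */
public class SpeedSpikeFilter {

    private final Deque<Double> window = new ArrayDeque<>();
    private final int windowSize;
    private final double zThreshold;

    public SpeedSpikeFilter(int windowSize, double zThreshold) {
        this.windowSize = windowSize;
        this.zThreshold = zThreshold;
    }

    /** @return true if the reading looks like a spike and should be dropped */
    public boolean isOutlier(double speed) {
        if (window.size() < windowSize) {      // not enough history yet
            window.addLast(speed);
            return false;
        }
        double mean = window.stream().mapToDouble(Double::doubleValue).average().orElse(0);
        double variance = window.stream()
                .mapToDouble(v -> (v - mean) * (v - mean)).average().orElse(0);
        double stdDev = Math.sqrt(variance);
        boolean outlier = stdDev > 0 && Math.abs(speed - mean) / stdDev > zThreshold;
        if (!outlier) {                        // only let "ordinary" data shape the window
            window.addLast(speed);
            window.removeFirst();
        }
        return outlier;
    }
}
```

For example, new SpeedSpikeFilter(20, 3.0) drops any reading more than three standard deviations away from the mean of the last 20 accepted readings.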
One note: you should use 3 as the minimum satellites, not 0. A GPS needs at least three sources to calculate a horizontal location. Every GPS I have used includes a status flag in the data stream; less than 3 satellites is reported as "bad" data in some way.
You should also consider "stationary" data. How will you handle the pedestrian standing still for some period of time? Perhaps waiting at a crosswalk or interacting with a street vendor?
Depending on what you plan to do with the data, you may need to suppress those extra data points or average them into a single point or location.
You mention this is for pedestrian tracking, but you also mention a road network. Pedestrians can travel a lot of places where a car cannot, and, indeed, which probably are not going to be on any map you find of a "road network". Most road maps don't have things like walking paths in parks, hiking trails, and so forth. Don't assume that "off the road network" means the GPS isn't getting an accurate fix.
In addition to Andrew's comments, you may also want to consider interference factors such as multipath and how they show up in your incoming GPS data stream, e.g. the HDOP values in the GSA sentence of NMEA 0183. In my own GPS controller software, I allow user-specified rejection criteria against a range of QA-related parameters.
I also tend to work on a moving window principle in this regard, where you can consider rejecting data that represents a spike based on surrounding data in the same window.
Read the position fix indicator to see if the signal is valid (it's in the $GPGGA sentence if you parse raw NMEA strings). If it's 0, ignore the message.
Besides that you could look at the combination of HDOP and the number of satellites if you really need to be sure that the signal is very accurate, but in normal situations that shouldn't be necessary.
Of course it doesn't hurt to do some sanity checks on GPS signals (a sketch of these checks follows the list):
latitude between -90..90;
longitude between -180..180 (or E..W, N..S, 0..90 and 0..180 if you're reading raw NMEA strings);
speed between 0 and 255 (for normal cars);
distance to the previous measurement (based on lat/lon) matches roughly with the indicated speed;
time difference with the system time not larger than x (unless the system clock cannot be trusted or relies on GPS synchronisation :-) );
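A minimal sketch of those checks applied to an already-parsed fix (the field names and thresholds are made up; parsing the NMEA sentences themselves is left out):

```java
/** A parsed GPS fix; the fields mirror what a $GPGGA/$GPRMC pair provides. */
class GpsFix {
    int fixQuality;        // $GPGGA fix indicator: 0 = invalid
    double latitude;       // decimal degrees
    double longitude;      // decimal degrees
    double speedKmh;
    long timestampMillis;
}

/** Basic sanity checks on a fix, roughly following the list above. */
class GpsSanityChecker {

    private static final double MAX_SPEED_KMH = 255;           // "normal cars"
    private static final long MAX_CLOCK_SKEW_MILLIS = 60_000;  // the "x" from the list
    private static final double SPEED_MISMATCH_KMH = 50;       // arbitrary tolerance

    boolean isPlausible(GpsFix fix, GpsFix previous) {
        if (fix.fixQuality == 0) return false;
        if (fix.latitude < -90 || fix.latitude > 90) return false;
        if (fix.longitude < -180 || fix.longitude > 180) return false;
        if (fix.speedKmh < 0 || fix.speedKmh > MAX_SPEED_KMH) return false;
        // only meaningful if the system clock is trusted
        if (Math.abs(fix.timestampMillis - System.currentTimeMillis()) > MAX_CLOCK_SKEW_MILLIS) {
            return false;
        }
        if (previous != null) {
            double hours = (fix.timestampMillis - previous.timestampMillis) / 3_600_000.0;
            if (hours > 0) {
                double impliedKmh = haversineKm(previous, fix) / hours;
                // distance travelled should roughly match the reported speed
                if (Math.abs(impliedKmh - fix.speedKmh) > SPEED_MISMATCH_KMH) return false;
            }
        }
        return true;
    }

    private double haversineKm(GpsFix a, GpsFix b) {
        double r = 6371.0;
        double dLat = Math.toRadians(b.latitude - a.latitude);
        double dLon = Math.toRadians(b.longitude - a.longitude);
        double h = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(a.latitude)) * Math.cos(Math.toRadians(b.latitude))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * r * Math.asin(Math.sqrt(h));
    }
}
```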
To do map matching, you basically iterate through your road segments and check which segment is the most likely for your current position, direction, speed, and possibly previous GPS measurements and matches.
If you're not doing a realtime application, or if a delay in feedback is acceptable, you can even look into the 'future' to see which segment is the most likely.
Doing all that properly is an art by itself, and this space here is too short to go into it deeply.
It's often difficult to decide with 100% confidence which road segment somebody is on. For example, if there are two parallel roads that are equally close to the current position, it becomes a matter of creative heuristics.
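A minimal sketch of the simplest version of that idea: snapping a fix to the closest point on the closest segment, with coordinates treated as planar (fine for short segments) and no use of heading, speed, or history yet. The Point and RoadSegment types are made up for the example:

```java
import java.util.List;

/** Snaps a position to the nearest point on the nearest road segment.
 *  Heading, speed and previous matches would refine the choice. */
class SimpleMapMatcher {

    record Point(double x, double y) {}
    record RoadSegment(Point a, Point b) {}

    /** Returns the snapped point, or null if the network is empty. */
    Point snapToNetwork(Point p, List<RoadSegment> segments) {
        Point best = null;
        double bestDist = Double.MAX_VALUE;
        for (RoadSegment s : segments) {
            Point candidate = closestPointOnSegment(p, s.a(), s.b());
            double d = distance(p, candidate);
            if (d < bestDist) {
                bestDist = d;
                best = candidate;
            }
        }
        return best;
    }

    private Point closestPointOnSegment(Point p, Point a, Point b) {
        double dx = b.x() - a.x(), dy = b.y() - a.y();
        double lengthSq = dx * dx + dy * dy;
        if (lengthSq == 0) return a;                 // degenerate segment
        double t = ((p.x() - a.x()) * dx + (p.y() - a.y()) * dy) / lengthSq;
        t = Math.max(0, Math.min(1, t));             // clamp to the segment
        return new Point(a.x() + t * dx, a.y() + t * dy);
    }

    private double distance(Point p, Point q) {
        return Math.hypot(p.x() - q.x(), p.y() - q.y());
    }
}
```

For the parallel-road ambiguity mentioned above, the usual refinement is to score each candidate segment by a combination of distance, heading agreement, and continuity with the previously matched segment rather than by distance alone.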