How should I objectively test my program results? - testing

I have developed two differing methods in MATLAB which aim to analyse a pop song and then automatically create a 30 second audio thumbnail (a preview clip) containing part of the chorus section.
Both methods have varying results:
The first method can create a thumbnail for each track, managing to find a chorus section in 40 out of 50 tested songs
The second method only managed to work on 30 out of the 50 songs, and it found the chorus section 21 times out the 30.
Obviously I know which method is superior, but I need to describe and explain the results in a report which requires the demonstration of proper statistical testing.
Other academic papers have previously used an f-test to do this, but because their methods are vastly superior, their aims are usually involve the detection of chorus onset times with 100% accuracy.
My aim is more relaxed as I am just looking for the generated thumbnails to contain any part of the chorus, regardless of onset.
Can anyone suggest some objective tests that I could possibly explore with regards to my project? This is my first time conducting an investigation like this so my experience/knowledge is incredibly low.
Thank you!

Possibly, the way for you is formating your song track with time cuts for relevant information about type of sound(chorus, etc). In sound editor like CoolEdit, you can set time cuts and assign names for theirs like 'chorus', 'pause','music'... Then, you must extract cut information to import in Matlab. For Windows 32 can be used utility Wav2labs from http://www.pallier.org/ressources/wspot/sig2wav/toolswav.html; http://www.pallier.org/ressources/wspot/sig2wav/Wav2labs.exe This program extract cuts to text file and you can read with Matlab textscan function.
After all, only segmentation accuracy must be proceed, like percent time when signal type(chorus/not chorus) was recognized correctly
Or specify your question more exactly

Related

I'm looking to create an automated numbering system for custom paint by number kits in photoshop

So I know very little about programming all around. I'm adept at photoshop and I'm looking to automate the numbering system for making these paint by number kits. I convert the images into vector format and set a maximum number of color variations. I then use adobe illustrator to create the outlined partitions of the image by color. This is all well and good, it's automated and efficient as far as I need.
My dilemma is that I do not have a system that can number these partitions in a clear and uniform fashion. I must do this tediously in photoshop, taking hours to finish.
I am looking to create or find a system that will do this last step automatically.
My vison for how this would look would be numbers, 1-20 or so depending on the set color cap, evenly distributed across each partition in uniform font and size. The idea is that there would be a grid of 1 number (this number would be the reference to the color needed in this partition) spread across larger partitions and only a few of 1 number on the smaller partitions. It would hopefully look like so:
You can see here how tedious this can become.
I don't know how to accomplish this, but I'm wondering how complicated this process would be in theory and would it be better for me to learn how to do it myself, hire a professional, or continue the hand numbering. It's creating a labor cap on my small business that is preventing me from further growth.
Any and all help is very much appreciated; if I can provide more context or specifications I would be more than happy to do so. Thank you!
Just for fun I've managed to tweak old Johnware's script (Circle Fill). Now it can fill with given letters (numbers for example). It works to a degree, but the result far from ideal:
Probably it can be used for start.
I believe a real programmer could make it way better.
My tweaked version of the script is here: https://disk.yandex.ru/d/Ze4-1DQoNRVF1g
Update
I'm improved the script further. Now it:
works more precise
handles several selected paths
remembers values in the dialog window
sets font size
Here is the is the updated version of the script: https://disk.yandex.ru/d/0pcpLDGrfQKMJA
It took me about 15 minutes to do this:
But I had to to split some complex paths with a Knife tool. Sometimes the script throws a some mystical error. I've just selected another set of paths an run the scripts again and again.
It is not a final result but it's close. I think it's much faster that to do it manually.
It can be done with script to some degree. It will work fine for simply forms. But for complicated forms it will be too hard to calculate where you need to put all numbers and how many number will be enough.
But I saw scripts that can fill any form with any symbols. So it's possible to fill any form with numbers, I think, technically.
Of course, if you aren't a seasoned coder it makes no sense to try to do it at home. You need a pro (not even me).
And I see another very simply options as well:
It doesn't even need a script. What do you think?

Using tensorflow to identify lego bricks?

having read this article about a guy who uses tensorflow to sort cucumber into nine different classes I was wondering if this type of process could be applied to a large number of classes. My idea would be to use it to identify Lego parts.
At the moment, a site like Bricklink describes more than 40,000 different parts so it would be a bit different than the cucumber example but I am wondering if it sounds suitable. There is no easy way to get hundreds of pictures for each part but does the following process sound feasible :
take pictures of a part ;
try to identify the part using tensorflow ;
if it does not identify the correct part, take more pictures and feed the neural network with them ;
go on with the next part.
That way, each time we encounter a new piece we "teach" the network its reference so that it can better be recognized the next time. Like that and after hundreds of iterations monitored by a human, could we imagine tensorflow to be able to recognize the parts? At least the most common ones?
My question might sound stupid but I am not into neural networks so any advice is welcome. At the moment I have not found any way to identify a lego part based on pictures and this "cucumber example" sounds promising so I am looking for some feedback.
Thanks.
You can read about the work of Jacques Mattheij, he actually uses a customized version of Xception1 running on https://keras.io/.
The introduction is Sorting 2 Metric Tons of Lego.
In Sorting 2 Tons of Lego, The software Side you can read:
The hard challenge to deal with next was to get a training set large
enough to make working with 1000+ classes possible. At first this
seemed like an insurmountable problem. I could not figure out how to
make enough images and to label them by hand in acceptable time, even
the most optimistic calculations had me working for 6 months or longer
full-time in order to make a data set that would allow the machine to
work with many classes of parts rather than just a couple.
In the end the solution was staring me in the face for at least a week
before I finally clued in: it doesn’t matter. All that matters is that
the machine labels its own images most of the time and then all I need
to do is correct its mistakes. As it gets better there will be fewer
mistakes. This very rapidly expanded the number of training images.
The first day I managed to hand-label about 500 parts. The next day
the machine added 2000 more, with about half of those labeled wrong.
The resulting 2500 parts where the basis for the next round of
training 3 days later, which resulted in 4000 more parts, 90% of which
were labeled right! So I only had to correct some 400 parts, rinse,
repeat… So, by the end of two weeks there was a dataset of 20K images,
all labeled correctly.
This is far from enough, some classes are severely under-represented
so I need to increase the number of images for those, perhaps I’ll
just run a single batch consisting of nothing but those parts through
the machine. No need for corrections, they’ll all be labeled
identically.
A recent update is Sorting 2 Tons of Lego, Many Questions, Results.
1CHOLLET, François. Xception: Deep Learning with Depthwise Separable Convolutions. arXiv preprint arXiv:1610.02357, 2016.
I have started this using IBM Watson's Visual Recognition.
I had six different bricks to be recognized on the transport belt background.
I am actually thinking about tensorflow, since I can have it running locally.
The codelab : TensorFlow for Poets, describes almost exactly what you want to achieve,
For a demo of the Watson version:
https://www.ibm.com/developerworks/community/blogs/ibmandgoogle/entry/Lego_bricks_recognition_with_Watosn_lego_and_raspberry_pi?lang=en

Oxyplot: IsValidPoint on realtime LineSerie

I've been using oxyplot for a month now and I'm pretty happy with what it delivers. I'm getting data from an oscilloscope and, after a fast processing, I'm plotting it in real time to a graph.
However, if I compare my application CPU usage to the one provided by the oscilloscope manufacturer, I'm loading a lot more the CPU. Maybe they're using some gpu-based plotter, but I think I can reduce my CPU usage with some modifications.
I'm capturing 10.000 samples per second and adding it to a LineSeries. I'm not plotting all that data, I'm decimating it to a constant number of points, let's say 80 points for a 20 secs measure, so I have 4 points/sec while totally zoomed out and a bit more detail if I zoom in to a specific range.
With the aid of ReSharper, I've noticed that the application is calling a lot of times (I've 6 different plots) the IsValidPoint method (something like 400.000.000 times), which is taking a lot of time.
I think the problem is that, when I add new points to the series, it checks for every point if it is a valid point, instead of the added values only.
Also, it spends a lot of time in the MeasureText/DrawText method.
My question is: is there a way to override those methods and to adapt it to my needs? I'm adding 10.000 new values each second, but the first ones remain the same, so there's no need for re-validate them. Also, the text shown doesn't change.
Thank you in advance for any advice you can give me. Have a good day!

What methods to recognize sentence handwriting?

I mean posts per sentence, not per letter. Such a doctor's prescription handwriting which hard to read. Not just a normal handwriting.
In example :
I use a data mining or machine learning for doing a training from
paper handwrited.
User scanning a paper with hard to read writing.
The application doing an image processing.
And the output is some sentence from paper.
And what device to use? (Scanner or webcam)
I am newbie. If could i need some example in vb.net with emguCV/openCV and researches journals.
Any help would be appreciated.
Welcome to stack overflow! The answer to your question is twofold:
a. If you want to recognize handwriting that has already happened i.e. it is presented to you as an image you are in trouble. Computer Vision is still not good enough to provide you with reasonable accuracy.
b. If you have a chance to recognize handwriting “as it's happening” - you are in luck. Download, for example, a Gesture Search app from Android play store and you are in business.
The difference between the two scenarios is subtle but significant. In the second case you have an extra piece of information that makes handwriting recognition possible. This piece is timing of each stroke. In other words, instead of an image with handwriting you have a bunch of strokes that are all labeled with their time stamps. You can think about it as a sequence of lines and curves or as image segmentation - in any way this provides a big hint for the system. Additional help comes from the dictionary on your phone but this is typically used by any handwriting system.
Android of course has an open source library for stroke recognition (find more on your own). If you still want to go for recognizing images though, you have to first detect text (e.g. as a bounding box) and second use any of the existing engines to process detected regions. For text detection I can recommend MSER. But be careful trying to implement even text detection on your own - you are entering a world of pain here ;). Here is an article that can help.
As for learning how to recognize text from images on the Internet - this can be your plan B or C or Z when you master above mentioned stages. Don’t try to abuse learning methods and make them do hard work for you - you will hit a wall if you don’t understand what’s going on under the hood.

Creating simple waveforms with CoreAudio

I am new to CoreAudio, and I would like to output a simple sine wave and square wave with a given frequency and amplitude through the speakers using CA. I don't want to use sound files as I want to synthesize the sound.
What do I need to do this? And can you give me an example or tutorial? Thanks.
There are a number of errors in the previous answer. I, the legendary :-) James McCartney, not James Harkins wrote the sinewavedemo, I also wrote SuperCollider which is what the audiosynth.com website is about. I also now work at Apple on CoreAudio. The sinewavedemo DOES use CoreAudio, since it uses AudioHardware.h from CoreAudio.framework as its way to play the sound.
You should not use the sinewavedemo. It is very old code and it makes dangerous assumptions about the buffer layout of the audio hardware. The easiest way nowadays to play a sound that you are generating is to use the AudioQueue, or to use an output audio unit with a render callback set.
The best and easiest way to do that without files is to prepare a single cycle buffer, containing one cycle of the wave (this is called technically a wavetable)
In the playback function called by CoreAudio thread, fill the output buffer with samples read from the wave buffer.
Note however that you will face two problems very quickly :
- for the sine wave, if the playback frequency is not an integer multiple of the desired sine frequency, you will probably need to implement an interpolator if you want to have a good quality. Using only integer pointers will generate a significant level of harmonic noise.
for the square wave, avoid to just program an array with +1 / -1 values. Such a signal is not bandlimited and will alias a lot. Do not forget that the spectrum of a square wave is virtually infinite!
To get good algorithms for signal generation, take a look to musicdsp.org, that's probably one of the best resource for that
Are you new to audio programming in general? As a starting point i would check out
http://www.audiosynth.com/sinewavedemo.html
This is a minimum osx sinewave implementation by the legendary James Harkins. Note, it doesn't use CoreAudio at all.
If you specifically want to use CoreAudio for your sinewave you need to create an output unit (RemoteIO on the iphone, AUHAL on osx) and supply an input callback, where you can pretty much use the code from the above example. Check out
http://developer.apple.com/mac/library/technotes/tn2002/tn2091.html
The benefits of CoreAudio are chiefly, chain other effects with your sinewave, write plugins for hosts like Logic & provide the interfaces for them, write a host (like Logic) for plugins that can be chained together.
If you don't wont to write a plugin, or host plugins then CoreAudio might not actually be for you. But one of the best things about using CoreAudio is that once you get your sinewave callback working it is easy to add effects, or mix multiple sines together
To do this you need to put your output unit in a graph, to which you can effects, mixers, etc.
Here is some help on setting up graphs http://timbolstad.com/2010/03/16/core-audio-getting-started-pt2/
It isn't as difficult as it looks. Apple provides C++ helper classes for many things (/Developer/Examples/CoreAudio/PublicUtility) and even if you don't want to use C++ (you don't have to!) they can be a useful guide to the CoreAudio API.
If you are not doing this realtime, using the sin() function from math.h is not a bad idea. Just fill however many samples you need with sin() beforehand when it is time to play it, just send it to the audio buffer. sin() can be quite slow to call once every sample if you are doing this realtime, using an interpolated wavetable lookup method is much faster, but the resulting sound will not be as spectrally pure.
There is a good and well documented sine wave player code example in Chapter 7 of the Adamson/Avila "Learning Core Audio" book, published by Addison-Wesley Professional (ISBN-10: 0-321-63684-8 ):
http://www.informit.com/store/learning-core-audio-a-hands-on-guide-to-audio-programming-9780321636843
It is a rather new publication (2012) and addresses precisely the issue of this question. It's only a starting point, but it's a valuable starting point.
BTW. Don't jump to graphs before having this basic lesson (which involves some math) behind.
Concerning example code, a quick and efficient method I often use deals with a pre-filled sinewave lookup table which has as many members as sample rate, for 44100 Hz the table has size of 44100. In other words, cycle length equals sample rate. This gives an acceptable trade-off between speed and quality in many cases. You can initialize it with the program.
If you generate floating point samples (which is default in OSX), and use math functions, use sinf() rather than (float)sin(). Promotions in inner loop cycles of a render callback are always resource-expensive. So are repetitive multiplications of constants, such as 2.0*M_PI, which can too often be found in code examples.