What methods to recognize sentence handwriting? - vb.net

I mean posts per sentence, not per letter. Such a doctor's prescription handwriting which hard to read. Not just a normal handwriting.
In example :
I use a data mining or machine learning for doing a training from
paper handwrited.
User scanning a paper with hard to read writing.
The application doing an image processing.
And the output is some sentence from paper.
And what device to use? (Scanner or webcam)
I am newbie. If could i need some example in vb.net with emguCV/openCV and researches journals.
Any help would be appreciated.

Welcome to stack overflow! The answer to your question is twofold:
a. If you want to recognize handwriting that has already happened i.e. it is presented to you as an image you are in trouble. Computer Vision is still not good enough to provide you with reasonable accuracy.
b. If you have a chance to recognize handwriting “as it's happening” - you are in luck. Download, for example, a Gesture Search app from Android play store and you are in business.
The difference between the two scenarios is subtle but significant. In the second case you have an extra piece of information that makes handwriting recognition possible. This piece is timing of each stroke. In other words, instead of an image with handwriting you have a bunch of strokes that are all labeled with their time stamps. You can think about it as a sequence of lines and curves or as image segmentation - in any way this provides a big hint for the system. Additional help comes from the dictionary on your phone but this is typically used by any handwriting system.
Android of course has an open source library for stroke recognition (find more on your own). If you still want to go for recognizing images though, you have to first detect text (e.g. as a bounding box) and second use any of the existing engines to process detected regions. For text detection I can recommend MSER. But be careful trying to implement even text detection on your own - you are entering a world of pain here ;). Here is an article that can help.
As for learning how to recognize text from images on the Internet - this can be your plan B or C or Z when you master above mentioned stages. Don’t try to abuse learning methods and make them do hard work for you - you will hit a wall if you don’t understand what’s going on under the hood.

Related

IPA (International Phonetic Alphabet) Transcription with Tensorflow

I'm looking into designing a software platform that will aid linguists and anthropologists in their study of previously unstudied languages. Statistics show that around 1,000 languages exist that have never been studied by a person outside of their respective speaker groups.
My goal is to utilize TensorFlow to make a platform that will allow linguists to study and document these languages more efficiently, and to help them create written systems for the ones that don't have a written system already. One of their current methods of accomplishing such a task is three-fold: 1) Record a native speaker conversing in the language, 2) Listening to that recording and trying to transcribe it into the IPA, 3) From the phonetics, analyzing the phonemics and phonotactics of the language to eventually create a written system for the speaker.
My proposed platform would cut that research time down from a minimum of a year to a maximum of six months. Before I start, I have some questions...
What would be required to train TensorFlow to transcribe live audio into the IPA? Has this already been done? and if so, how would I utilize a previous solution for this project? Is a project like this even possible with TensorFlow? if not, what would you recommend using instead?
My apologies for the magnitude of this question. I don't have much experience in the realm of machine learning, as I am just beginning the research process for this project. Any help is appreciated!
I guess I will take a first shot at answering this. Since the question is pretty general, my answer will have to be pretty general as well.
What would be required. At the very least you would have to have a large dataset of pre-transcribed data. Ideally a large amount of spoken language audio mapped to characters in the phonetic alphabet, so the system could learn the sound of individual characters rather than whole transcribed words. If such a dataset doesn't exist, a less granular dataset could be used, mapping single words to their transcriptions. Then you would need a model, that is the actual neural network architecture implemented in code. And lastly you would need some computing resources. This is not something you can train casually, you would either have to buy some time in a cloud based machine learning framework (like Google Cloud ML) or build a fairly expensive machine to train at home.
Has this been done? I don't know. I don't think so. There have been published papers reporting various degrees of success at training systems to transcribe speech. Here is one, for example, http://deeplearning.stanford.edu/lexfree/lexfree.pdf It seems that since the alphabet you want to transcribe to is specifically designed to capture the way words sound rather than just write down the words you might have more success at training such a model.
Is it possible with TensorFlow. Yes, most likely. TensorFlow is well suited for implementing most modern deep learning architectures. Unless you end up designing some really weird and very original model for this purpose, TensorFlow should work just fine.
Edit: after some thought in part 1, you would have to use a dataset mapping spoken words to their transcriptions, since I expect that the same sound pronounced separately would be different from when the same sound is used in a word.
This has actually been done, albeit in PyTorch, by a group at CMU: https://github.com/xinjli/allosaurus

Robot odometry in labview

I am currently working on a (school-)project involving a robot having to navigate a corn field.
We need to make the complete software in NI Labview.
Because of the tasks the robot has to be able to perform the robot has to know it's position.
As sensors we have a 6-DOF IMU, some unrealiable wheel encoders and a 2D laser scanner (SICK TIM351).
Until now I am unable to figure out any algorithms or tutorials, and thus really stuck on this problem.
I am wondering if anyone ever attempted in making SLAM work in labview, and if so are there any examples or explanations to do this?
Or is there perhaps a toolkit for LabVIEW that contains this function/algorithm?
Kind regards,
Jesse Bax
3rd year mechatronic student
As Slavo mentioned, there's the LabVIEW Robotics module that contains algorithms like A* for pathfinding. But there's not very much there that can help you solve the SLAM problem, that I am aware of. The SLAM problem consist of the following parts: Landmark extraction, data association, state estimation and updating of state.
For landmark extraction, you have to pick one or multiple features that you want the robot to recognize. This can for example be a corner or a line(wall in 3D). You can for example use clustering, split and merge or the RANSAC algorithm. I believe your laser scanner extract and store the points in a list sorted by angle, this makes the Split and Merge algorithm very feasible. Although RANSAC is the most accurate of them, but also has a higher complexity. I recommend starting with some optimal data points for testing the line extraction. You can for example put your laser scanner in a small room with straight walls and perform one scan and save it to an array or a file. Make sure the contour is a bit more complex than just four walls. And remove noise either before or after measurement.
I haven't read up on good methods for data association, but you could for example just consider a landmark new if it is a certain distance away from any existing landmarks or update an old landmark if not.
State estimation and updating of state can be achieved with the complementary filter or the Extended Kalman Filter (EKF). EKF is the de facto for nonlinear state estimation [1] and tend to work very well in practice. The theory behind EKF is quite though, but it should be a tad easier to implement. I would recommend using the MathScript module if you are going to program EKF. The point of these two filters are to estimate the position of the robot from the wheel encoders and landmarks extracted from the laser scanner.
As the SLAM problem is a big task, I would recommend program it in multiple smaller SubVI's. So that you can properly test your parts without too much added complexity.
There's also a lot of good papers on SLAM.
http://www.cs.berkeley.edu/~pabbeel/cs287-fa09/readings/Durrant-Whyte_Bailey_SLAM-tutorial-I.pdf
http://ocw.mit.edu/courses/aeronautics-and-astronautics/16-412j-cognitive-robotics-spring-2005/projects/1aslam_blas_repo.pdf
The book "Probabalistic Robotics".
https://wiki.csem.flinders.edu.au/pub/CSEMThesisProjects/ProjectSmit0949/Thesis.pdf
LabVIEW provides LabVIEW Robotics module. There are also plenty of templates for robotics module. Firstly you can check the Starter Kit 2.0 template Which will provide you simple working self driving robot project. You can base on such template and develop your own application from working model, not from scratch.

Insert skeleton in 3D model programmatically

Background
I'm working on a project where a user gets scanned by a Kinect (v2). The result will be a generated 3D model which is suitable for use in games.
The scanning aspect is going quite well, and I've generated some good user models.
Example:
Note: This is just an early test model. It still needs to be cleaned up, and the stance needs to change to properly read skeletal data.
Problem
The problem I'm currently facing is that I'm unsure how to place skeletal data inside the generated 3D model. I can't seem to find a program that will let me insert the skeleton in the 3D model programmatically. I'd like to do this either via a program that I can control programmatically, or adjust the 3D model file in such a way that skeletal data gets included within the file.
What have I tried
I've been looking around for similar questions on Google and StackOverflow, but they usually refer to either motion capture or skeletal animation. I know Maya has the option to insert skeletons in 3D models, but as far as I could find that is always done by hand. Maybe there is a more technical term for the problem I'm trying to solve, but I don't know it.
I do have a train of thought on how to achieve the skeleton insertion. I imagine it to go like this:
Scan the user and generate a 3D model with Kinect;
1.2. Clean user model, getting rid of any deformations or unnecessary information. Close holes that are left in the clean up process.
Scan user skeletal data using the Kinect.
2.2. Extract the skeleton data.
2.3. Get joint locations and store as xyz-coordinates for 3D space. Store bone length and directions.
Read 3D skeleton data in a program that can create skeletons.
Save the new model with inserted skeleton.
Question
Can anyone recommend (I know, this is perhaps "opinion based") a program to read the skeletal data and insert it in to a 3D model? Is it possible to utilize Maya for this purpose?
Thanks in advance.
Note: I opted to post the question here and not on Graphics Design Stack Exchange (or other Stack Exchange sites) because I feel it's more coding related, and perhaps more useful for people who will search here in the future. Apologies if it's posted on the wrong site.
A tricky part of your question is what you mean by "inserting the skeleton". Typically bone data is very separate from your geometry, and stored in different places in your scene graph (with the bone data being hierarchical in nature).
There are file formats you can export to where you might establish some association between your geometry and skeleton, but that's very format-specific as to how you associate the two together (ex: FBX vs. Collada).
Probably the closest thing to "inserting" or, more appropriately, "attaching" a skeleton to a mesh is skinning. There you compute weight assignments, basically determining how much each bone influences a given vertex in your mesh.
This is a tough part to get right (both programmatically and artistically), and depending on your quality needs, is often a semi-automatic solution at best for the highest quality needs (commercial games, films, etc.) with artists laboring over tweaking the resulting weight assignments and/or skeleton.
There are algorithms that get pretty sophisticated in determining these weight assignments ranging from simple heuristics like just assigning weights based on nearest line distance (very crude, and will often fall apart near tricky areas like the pelvis or shoulder) or ones that actually consider the mesh as a solid volume (using voxels or tetrahedral representations) to try to assign weights. Example: http://blog.wolfire.com/2009/11/volumetric-heat-diffusion-skinning/
However, you might be able to get decent results using an algorithm like delta mush which allows you to get a bit sloppy with weight assignments but still get reasonably smooth deformations.
Now if you want to do this externally, pretty much any 3D animation software will do, including free ones like Blender. However, skinning and character animation in general is something that tends to take quite a bit of artistic skill and a lot of patience, so it's worth noting that it's not quite as easy as it might seem to make characters leap and dance and crouch and run and still look good even when you have a skeleton in advance. That weight association from skeleton to geometry is the toughest part. It's often the result of many hours of artists laboring over the deformations to get them to look right in a wide range of poses.

API to break voice into phonemes / synthesize new speech given speech samples?

You know those movies where the tech geeks record someone's voice, and their software breaks it into phonemes? Which they can then use to type in any phrase, and make it seem as if the target is saying it?
Does that software exist in an API Version? I don't even know what to Google.
There is no such software. Breaking arbitrary speech into its constituent phonemes is only a partially solved problem: speech-to-text software is still imperfect, as is text-to-speech.
The idea is to reproduce the timbre of the target's voice. Even if you were able to segment the audio perfectly, reordering the phonemes would produce audio with unnatural cadence and intonation, not to mention splicing artifacts. At that point you're getting into smoothing, time-scaling, and pitch correction, all of which are possible and well-understood in theory, but operate poorly on real-world data, especially when the audio sample in question is as short as a single phoneme, and further when the timbre needs to be preserved.
These problems are compounded on the phonetic side by allophonic variation in sounds based on accent and surrounding phonemes; in order to faithfully produce even a low-quality approximation of the audio, you'd need a detailed understanding of the target's language, accent, and speech patterns.
Furthermore, your ultimate problem is one of social engineering, and people are not easy to fool when it comes to the voices of people they know. Even with a large corpus of input data, at best you could get a short low-quality sample, hardly enough for a conversation.
So while it's certainly possible, it's difficult; even if it existed, it wouldn't always be good enough.
SRI International (the company that created Siri for iOS) has an SDK called EduSpeak, which will take audio input and break it down into individual phonemes. I know this because I sat through a demo of the product about a week ago. During the demo, the presenter showed us an application that was created using the SDK. The application gave a few lines of text for the presenter to read. After reading the text, the application displayed a bar chart where each bar represented a phoneme from his speech. The height of each bar represented a score of how well each phoneme was pronounced (the presenter was not a native English speaker, so he received lower scores on certain phonemes compared to others). The presenter could also click on each individual bar to have only that individual phoneme played back using the original audio.
So yes, software exists that divides audio up by phoneme, and it does a very good job of it. Now, whether or not those phonemes can be re-assembled into speech is an open question. If we end up getting a trial version of the SDK, I'll try it out and let you know.
If your aim is to mimic someone else's voice, then another attitude is to convert your own voice (instead of assembling phonemes). It is (surprisingly) called voice conversion, e.g http://www.busim.ee.boun.edu.tr/~speech/projects/Voice_Conversion.htm
The technology is called "voice synthesis" and "voice recognition"
The java API for this can be found here Java voice JSAPI
Apple has an API for this Apple speech
Microsoft has several ...one is discussed here Vista speech
Lyrebird is a start-up that is working on this very problem. Given samples of a person's voice and some written text, it can synthesize a spoken version of that written text in the voice of the person in the samples.
You can get interesting voice warping effects with a formant-aware pitch shift. Adobe Audition has a pretty good implementation. Antares produces some interesting vocal effects VST plugins.
These techniques use some form of linear predictive coding (LPC) to treat the voice as a source-filter model. LPC works on speech signals by estimating the resonance of the vocal tract (formant), reversing its effect with an inverse filter, and then coding the resulting residual signal. The residual signal is ideally an impulse train that represents the glottal impulse. This allows the scaling of pitch and formants independently, which leads to a much better gender conversion result than simple pitch shifting.
I dunno about a commercially available solution, but the concept isn't entirely out of the range of possibility. For example, the University of Delaware has fairly decent software for doing just that.
http://www.modeltalker.com

Models for 3d game programming? [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Want to improve this question? Update the question so it's on-topic for Stack Overflow.
Closed 10 years ago.
Improve this question
I'm a beginner in game development and game programming. I have experience in computer graphics - mainly OpenGL
In those days Finally, I have some spare time to polish my game coding skills.
But when coming to program a simple 3d game, I couldn't find any good resource for free textures and models for 3d graphics (for 2d game for example, I found many resources for sprite sheets and so on).
Is there any good resource you're familiar with for 3d game textures/models?
This is not a programming queston.
As far as I know, good, free and high-quality modeling resources does not exist (from "good", "free" and "high-quality", select two).
There are multiple free model repositories, but quality of content is generally poor, and there are few places where you can buy models.
There are free textures in multiple places (like this one), and they are easier to find than good free models.
Also, most of free content frequently includes some kind of catch - "non-commercial use only", "creative commons share alike"(i.e. if you make derivative, it should use same license), or it is under GPL.
Anyway, if you're okay with "Creative Commons share alike" and GPL, then you can probably use content from some of opensource games (OpenArena ), and get quite a lot of textures from wikipedia or wikimedia commons, flickr, and you can google for "free textures". You should be careful about using content from opensource games - some opensource projects (like war$ow and sauerbraten) use closed-source/restricted licenses for game content (i.e. you're free to reuse modify engine, but you cannot modify game content and you cannot use it with modified engine. Reasons are pretty obvious).
Anyway, it depends on what kind of model you want. It is pretty easy to find "easy" stuff like boxes, barrels, etc, because everyone can do that. When it comes to guns and vehicles, there will be a trouble - quality will drop, and number of good models will decrease. And if you want a fully rigged animated character with multiple animation, normally you can forget about it - such content is almost impossible to find. But you can probably use mods for Q3 and Q2 if you want characters (you can forget about physics in this case, though)
I'd recommend to forget about "free stuff", and try to make content yourself or hire someone to do that.
If you decide to make content yourself, then you'll need digital photo camera and (optionally) graphic tablet. You can make mediocre textures from photos (digital camera is cheap) using gimp, gimp-resynthesizer plugin, gimp-texturisze plugin, high-pass filters, etc. You can also make normal maps using blender or gimp, and there are even tutorials about extracting them from photos (you still will need to process them by hand). Modeling and animation can be done in blender (after 1 or two weeks of training) using reference photos. Low poly modeling is pretty quick (20 minutes to make a low-poly low-quality gun, hour or two to make simple character), but texture and animation will take more (setting up animation for character can take a few hours for amateur, making one animation for character will take at least several hours as well, making texture unwrap - hour, painting texture - up to few days, depending on quality you want, available reference material, availability of graphic tablet, etc). It is possible to cut corners a bit - for example, for making animations, you can film motion using photo camera(or video camera), and then use it for rotoscoping. Also, you'll need to find some kind of model format blender can export to, or you'll have to write an export plugin in python.
The Blender foundation has a large model repository which may be of use.
There are some free models at Turbosquid that I use sometimes for my XNA games.
But of course, the best stuff is not free.
My experience is that there is very little in the way of quality 3d models with animation and full rigging freely available. There a few companies like this who sell suitable models cheaply and I guess most hobbyists could afford one or two models from them fairly easily which would probably be sufficient for learning. (I have no connection to them but I did buy one model pack from them which I quite liked)
It would be nice if there were a few more generally freely available 3d animated models around though. I even think it might be in the interests of some of the companies that make them to give a few away. If I'd been able to get further in my hobby projects I might have spent £100-200 in total on some nice model packs to make my project better, but due to the lack of any real 3d animated models I ended up losing interest in all my 3d projects before I got to the point of thinking maybe I'd spend a little money on this hobby. I wonder if the availability of a few more free quality models would actually significantly increase the size of the market for those companies as more people got their projects to the point where they were willing to spend a little money on it.
Some company should make a nice model pack with a few static models and a couple of fully rigged and animated humans and "monsters" and say that if the community donates £10000 they'll release them for free use. I suspect there are enough people out there who would like a few quality models they might reach this target in the same way that Blender was originally sold to the public.
I know that it's been a long time since this question was asked, but I ran into same problem when programming in XNA and I found a good solution. As long as you don't need rigged / animated models, Google Warehouse is the best place to search. As far as I know, each model submitted to Google Warehouse is available on Creative Commons license. You just need to:
Download and install Google Sketchup (Sketchup download)
Browse to find a model (Google Warehouse) - there's a 3D preview for each one!
Get a plugin to export Sketchup models to .X - I recommend the '3D RAD' plugin (3D RAD download)
If your model does not look good after the export, try to separate it into several less complex ones.
you are looking for open game art ...
http://thefree3dmodels.com/ has a multitude of free 3D models. I've used a few of these for animation purpose, maybe it'll help you too.