What is the application of automata? - finite-automata

In other words, why should I learn about it? When am I going to say... oh I need to know about push down automata or turing machines for this.
I am not able to see the applications of the material.
Thanks

You should learn about automata theory because it will help you understand what is computationally possible in a given system. People who understand the difference between a push-down automata and a universal turing machine understand why trying to parse HTML with regular expressions is a bad idea. People who don't think it is just fine to try to parse HTML with REs.

There are problems that are nice fit to this kind of solutions, some of which are:
parsers
simulations of stateful systems
event-driven problems
There are probably many others. If you start writing code that has some ad-hoc state variable depending on which some functions can do this or that, you can probably benefit from proper FSA.

First off, it's my position that there are things worth learning not because they're immediately useful, but because they are inherently valuable. A great failing of modern education is that it does nothing to convince students of this when they're still impressionable.
That being said, automata theory is both inherently valuable and incredibly useful. Parsing text, compiling programs, and the capabilities of computing devices can only really be understood using the kinds of things automata theory gives us... and getting the most out of computational systems requires deep understanding. Automata theory allows us to answer some of the most fundamental questions we can ask about computation: what resources do we need to do computation? with given resources, what can we solve? are there problems which can't be solved no matter how many resources we possess? Let alone the fact the complexity theory - which deals with the efficiency of computations - requires automata theory in order to be meaningfully defined.

Learning about automata(which are nothing but machines) gives an idea about the limits of computation. When an automata does not accept a string, it mean a machine cannot take that string as an input. State diagrams generally gives the possible outcomes for an input which makes us build parsers/machines.
Good example would be checking the format of email-id. Softwares donot accept the email-ids while filling a form if the email format is not good. Here the software is accepting email-ids only in a specific format. We were able to build a software of such by basically sorting out this theoretically using automata and state machines.

Related

Limitations of optimisation software such as CPLEX

Which of the following optimisation methods can't be done in an optimisation software such as CPLEX? Why not?
Dynamic programming
Integer programming
Combinatorial optimisation
Nonlinear programming
Graph theory
Precedence diagram method
Simulation
Queueing theory
Can anyone point me in the right direction? I didn't find too much information regarding the limitations of CPLEX on the IBM website.
Thank you!
That's kind-of a big shopping list, and most of the things on it are not optimisation methods.
For sure CPLEX does integer programming, non-linear programming (just quadratic, SOCP, and similar but not general non-linear) and combinatoric optimisation out of the box.
It is usually possible to re-cast things like DP as MILP models, but will obviously require a bit of work. Lots of MILP models are also based on graphs, so yes it is certainly possible to solve a lot of graph problems using a MILP solver such as CPLEX.
Looking wider at topics like simulation, then that is quite a different approach. Simulation really is NOT an optimisation method, but it can be used alongside optimisation to get extra insights which may be useful in a business context. Might be used for example to discover some empirical relationships that could be used in an optimisation model by CPLEX.
The same can probably also be said for things like queuing theory, precedence, etc. Basically, use CPLEX as an optimisation tool to solve part or all of your problem once you have structured and analysed it via one of these other approaches.
Hope that helps.

What's the difference between code written for a desktop machine and a supercomputer?

Hypothetically speaking, if my scientific work was leading toward the development of functions/modules/subroutines (on a desktop), what would I need to know to incorporate it into a large-scale simulation to be run on a supercomputer (which might simulate molecules, fluids, reactions, and so on)?
My impression is that it has to do with taking advantage of certain libraries (e.g., BLAS, LAPLACK) where possible, revising algorithms (reducing iteration), profiling, parallelizing, considering memory-hard disk-processor use/access... I am aware of the adage, "want to optimize your code? don't do it", but if one were interested in learning about writing efficient code, what references might be available?
I think this question is language agnostic, but since many number-crunching packages for biomolecular simulation, climate modeling, etc. are written in some version of Fortran, this language would probably be my target of interest (and I have programmed rather extensively in Fortran 77).
Profiling is a must at any level of machinery. In common usage, I've found that scaling to larger and larger grids requires a better understanding of the grid software and the topology of the grid. In that sense, everything you learn about optimizing for one machine is still applicable, but understanding the grid software gets you additional mileage. Hadoop is one of the most popular and widespread grid systems, so learning about the scheduler options, interfaces (APIs and web interfaces), and other aspects of usage will help. Although you may not use Hadoop for a given supercomputer, it is one of the less painful methods for learning about distributed computing. For parallel computing, you may pursue MPI and other systems.
Additionally, learning to parallelize code on a single machine, across multiple cores or processors, is something you can begin learning on a desktop machine.
Recommendations:
Learn to optimize code on a single machine:
Learn profiling
Learn to use optimized libraries (after profiling: so that you see the speedup)
Be sure you know algorithms and data structures very well (*)
Learn to do embarrassingly parallel programming on multiple core machines.
Later: consider multithreaded programming. It's harder and may not pay off for your problem.
Learn about basic grid software for distributed processing
Learn about tools for parallel processing on a grid
Learn to program for alternative hardware, e.g. GPUs, various specialized computing systems.
This is language agnostic. I have had to learn the same sequence in multiple languages and multiple HPC systems. At each step, take a simpler route to learn some of the infrastructure and tools; e.g. learn multicore before multithreaded, distributed before parallel, so that you can see what fits for the hardware and problem, and what doesn't.
Some of the steps may be reordered depending on local computing practices, established codebases, and mentors. If you have a large GPU or MPI library in place, then, by all means, learn that rather than foist Hadoop onto your collaborators.
(*) The reason to know algorithms very well is that as soon as your code is running on a grid, others will see it. When it is hogging up the system, they will want to know what you're doing. If you are running a process that is polynomial and should be constant, you may find yourself mocked. Others with more domain expertise may help you find good approximations for NP-hard problems, but you should know that the concept exists.
Parallelization would be the key.
Since the problems you cited (e.g. CFD, multiphysics, mass transfer) are generally expressed as large-scale linear algebra problems, you need matrix routines that parallelize well. MPI is a standard for those types of problems.
Physics can influence as well. For example, it's possible to solve some elliptical problems efficiently using explicit dynamics and artificial mass and damping matricies.
3D multiphysics means coupled differential equations with varying time scales. You'll want a fine mesh to resolve details in both space and time, so the number of degrees of freedom will rise rapidly; time steps will be governed by the stability requirements of your problem.
If someone ever figures out how to run linear algebra as a map-reduce problem they'll have it knocked.
Hypothetically speaking, if my scientific work was leading toward the development of functions/modules/subroutines (on a desktop), what would I need to know to incorporate it into a large-scale simulation to be run on a supercomputer (which might simulate molecules, fluids, reactions, and so on)?
First, you would need to understand the problem. Not all problems can be solved in parallel (and I'm using the term parallel in as wide meaning as it can get). So, see how the problem is now solved. Can it be solved with some other method quicker. Can it be divided in independent parts ... and so on ...
Fortran is the language specialized for scientific computing, and during the recent years, along with the development of new language features, there has also been some very interesting development in terms of features that are aiming for this "market". The term "co-arrays" could be an interesting read.
But for now, I would suggest reading first into a book like Using OpenMP - OpenMP is a simpler model but the book (fortran examples inside) explains nicely the fundamentals. Message parsing interface (for friends, MPI :) is a larger model, and one of often used. Your next step from OpenMP should probably go in this direction. Books on the MPI programming are not rare.
You mentioned also libraries - yes, some of those you mentioned are widely used. Others are also available. A person who does not know exactly where the problem in performance lies should IMHO never try to undertake the task of trying to rewrite library routines.
Also there are books on parallel algorithms, you might want to check out.
I think this question is language agnostic, but since many number-crunching packages for biomolecular simulation, climate modeling, etc. are written in some version of Fortran, this language would probably be my target of interest (and I have programmed rather extensively in Fortran 77).
In short it comes down to understanding the problem, learning where the problem in performance is, re-solving the whole problem again with a different approach, iterating a few times, and by that time you'll already know what you're doing and where you're stuck.
We're in a position similar to yours.
I'm most in agreement with #Iterator's answer, but I think there's more to say.
First of all, I believe in "profiling" by the random-pausing method, because I'm not really interested in measuring things (It's easy enough to do that) but in pinpointing code that is causing time waste, so I can fix it. It's like the difference between a floodlight and a laser.
For one example, we use LAPACK and BLAS. Now, in taking my stack samples, I saw a lot of the samples were in the routine that compares characters. This was called from a general routine that multiplies and scales matrices, and that was called from our code. The matrix-manipulating routine, in order to be flexible, has character arguments that tell it things like, if a matrix is lower-triangular or whatever. In fact, if the matrices are not very large, the routine can spend more than 50% of its time just classifying the problem. Of course, the next time it is called from the same place, it does the same thing all over again. In a case like that, a special routine should be written. When it is optimized by the compiler, it will be as fast as it reasonably can be, and will save all that classifying time.
For another example, we use a variety of ODE solvers. These are optimized to the nth degree of course. They work by calling user-provided routines to calculate derivatives and possibly a jacobian matrix. If those user-provided routines don't actually do much, samples will indeed show the program counter in the ODE solver itself. However, if the user-provided routines do much more, samples will find the lower end of the stack in those routines mostly, because they take longer, while the ODE code takes roughly the same time. So, optimization should be concentrated in the user-provided routines, not the ODE code.
Once you've done several of the kind of optimization that is pinpointed by stack sampling, which can speed things up by 1-2 orders of magnitude, then by all means exploit parallelism, MPI, etc. if the problem allows it.

How to make a 2D Soft-body physics engine?

The definition of rigid body in Box2d is
A chunk of matter that is so strong
that the distance between any two bits
of matter on the chunk is completely
constant.
And this is exactly what i don't want as i would like to make 2D (maybe 3D eventually), elastic, deformable, breakable, and even sticky bodies.
What I'm hoping to get out of this community are resources that teach me the math behind how objects bend, break and interact. I don't care about the molecular or chemical properties of these objects, and often this is all I find when I try to search for how to calculate what a piece of wood, metal, rubber, goo, liquid, organic material, etc. might look like after a force is applied to it.
Also, I'm a very visual person, so diagrams and such are EXTREMELY HELPFUL for me.
================================================================================
Ignore these questions, they're old, and I'm only keeping them here for contextual purposes
1.Are there any simple 2D soft-body physics engines out there like this?
preferably free or opensource?
2.If not would it be possible to make my own without spending years on it?
3.Could i use existing engines like bullet and box2d as a start and simply transform their code, or would this just lead to more problems later, considering my 1 year of programming experience and bullet being 3D?
4.Finally, if i were to transform another library, would it be the best change box2D's already 2d code, Bullet's already soft code, or mixing both's source code?
Thanks!
(1) Both Bullet and PhysX have support for deformable objects in some capacity. Bullet is open source and PhysX is free to use. They both have ports for windows, mac, linux and all the major consoles.
(2) You could hack something together if you really know what you are doing, and it might even work. However, there will probably be bugs unless you have a damn good understanding of how Box2D's sequential impulse constraint solver works and what types of measures are going to be necessary to keep your system stable. That said, there are many ways to get deformable objects working with minimal fuss within a game-like environment. The first option is to take a second (or higher) order approximation of the deformation. This lets you deal with deformations in much the same way as you deal with rigid motions, only now you have a few extra degrees of freedom. See for example the following paper:
http://www.matthiasmueller.info/publications/MeshlessDeformations_SIG05.pdf
A second method is pressure soft bodies, which basically model the body as a set of particles with some distance constraints and pressure forces. This is what both PhysX and Bullet do, and it is a pretty standard technique by now (for example, Gish used it):
http://citeseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.4.2828%26rep%3Drep1%26type%3Dpdf
If you google around, you can find lots of tutorials on implementing it, but I can't vouch for their quality. Finally, there has been a more recent push to trying to do deformable objects the `right' way using realistic elastic models and finite element type approaches. This is still an area of active research, so it is not for the faint of heart. For example, you could look at any number of the papers in this year's SIGGRAPH proceedings:
http://kesen.realtimerendering.com/sig2011.html
(3) Probably not. Though there are certain 2D style games that can work with a 3D physics engine (for example top down type games) for special effects.
(4) Based on what I just said, you should probably know the answer by now. If you are the adventurous sort and got some time to kill and the will to learn, then I say go for it! Of course it will be hard at first, but like anything it gets easier over time. Plus, learning new stuff is lots of fun!
On the other hand, if you just want results now, then don't do it. It will take a lot of time, and you will probably fail (a lot). If you just want to make games, then stick to the existing libraries and build on whatever abstractions it provides you.
Quick and partial answer:
rigid body are easy to model due to their property (you can use physic tools, like "Torseur+ (link on french on wikipedia, english equivalent points to screw theory) to modelate forces applying at any point in your element.
in comparison, non-solid elements move from almost solid (think very hard rubber : it can move but is almost solid) to almost liquid (think very soft ruber, latex). Meaning that dynamical properties applying to that knd of objects are much complex and depend of the nature of the object
Take the example of a spring : it's easy to model independantly (f=k.x), but creating a generic tool able to model that specific case is a nightmare (especially if you think of corner cases : extension is not infinite, compression reaches a lower point, material is non linear...)
as far as I know, when dealing with "elastic" materials, people do their own modelisation for their own purpose (a generic one does not exist)
now the answers:
Probably not, not that I know at least
not easily, see previously why
Unless you got high level background in elastic materials, I fear it's gonna be painful
Hope this helped
Some specific cases such as deformable balls can be simulated pretty well using spring-joint bodies:
Here is an implementation example with cocos2d: http://2sa-studio.blogspot.com/2014/05/soft-bodies-with-cocos2d-v3.html
Depending on the complexity of the deformable objects that you need, you might be able to emulate them using box2d, constraining rigid bodies with joints or springs. I did it in the past using a box2d clone for xna (farseer) and it worked nicely. Hope this helps.
The physics of your question breaks down into two different topics:
Inelastic Collisions: The math here is easy, and you could write a pretty decent library yourself without too much work for 2D points/balls. (And with more work, you could learn the physics for extended bodies.)
Materials Bending and Breaking: This will be hard. In general, you will have to model many of the topics in Mechanical Engineering:
Continuum Mechanics
Structural Analysis
Failure Analysis
Stress Analysis
Strain Analysis
I am not being glib. Modeling the bending and breaking of materials is, in general, a very detailed and varied topic. It will take a long time. And the only way to succeed will be to understand the science well enough that you can make clever shortcuts in limiting the scope of the science you need to model in your game.
However, the other half of your problem (modeling Inelastic Collisions) is a much more achievable goal.
Good luck!

What's the best language for physics modeling?

I've been out of the modeling biz, so to speak, for a while now. When I was in college, most of the models I worked with were written in FORTRAN, which I never liked. I'm looking to get back into science, so I'm wondering if there are modern languages with feature sets suited for this kind of application. What would you consider to be an optimal language for simulating complex physics systems?
While certainly Fortran was the absolute ruler for this, Python is being used more and more exactly for this purpose. While it is very hard to say which is the BEST program for this, I've found python pretty useful for physics simulations and physics education.
It depends on the task
C++ is good at complicated data structures, but it is bad at slicing and multiply matrices. (This task equires you to spend a lot of time writing for loops.)
FORTRAN has a nice notation for slicing and multiplying matrices, but it is clumsy for creating complicated data structure such as graphs and linked lists.
Python/scipy has a nice notation for everything, but python is an interepreted language, so it is slow at certain tasks.
Some people are interested in languages like CUDA that allow you to use your GPU to speed up your simulations.
In the molecular dynamics community c++ seems to be popular, because you need somewhat complicated data structures to represent the shapes of the molecules.
I think it's arguable that FORTRAN is still dominant when it comes to solving large-scale problems in physics, as long as we're talking about serial calculations.
I know that parallelization is changing the game. I'm less certain about whether or not parallelized versions of LINPACK and other linear algebra packages are still written in FORTRAN.
A lot of engineers are using MATLAB and Mathematica these days, because they combine numerical and graphics capabilities.
I'd also point out that there's a difference between calculation and display engines. The former might still be written in FORTRAN, but the latter may be using more modern languages and OpenGL.
I'm also unsure about how much modeling has crept into biology. Physical chemistry might be a very different animal altogether.
If you write a terrific parallel linear algebra package in Scala or F# or Haskell that performs well, the world will beat a path to your door.
Python + Matplotlib + NumPy + α
The nuclear/particle/high energy physics community has moved heavily toward c++ (in part due to ROOT and Geant4), with some interest in Python (because it has ROOT bindings).
But you'll note that this is sub-discipline dependent..."physics" and "modeling" are big, broad topics, so there is no one answer.
Modelica is a specialized language for modeling (and simulating) physical systems. OpenModelica is an open source implementation of Modelica.
Python is very popular among science-oriented people, as is Matlab. The issue with these is that they are both VERY slow (to run). If you want to do large simulations that may take minutes/hours/days, you're going to have to pick another language.
As long as you are picking a language for speed, suck it up and use C/C++, maybe with CUDA depending on your needs.
Final thought though: if it takes you two days longer to write and debug your model in C than in python, and the resulting code takes 10 minutes to run instead of an hour, have your really saved any time?
There's also a lot of capability with MATLAB. Especially when interfacing your simulations with hardware, or if you need your results visualised.
I'll chime in with Python but you should also look to R for any statistical work you may need to do. You should really be asking more about what packages for which languages to use rather than the language itself.

What are the typical use cases of Genetic Programming?

Today I read this blog entry by Roger Alsing about how to paint a replica of the Mona Lisa using only 50 semi transparent polygons.
I'm fascinated with the results for that particular case, so I was wondering (and this is my question): how does genetic programming work and what other problems could be solved by genetic programming?
There is some debate as to whether Roger's Mona Lisa program is Genetic Programming at all. It seems to be closer to a (1 + 1) Evolution Strategy. Both techniques are examples of the broader field of Evolutionary Computation, which also includes Genetic Algorithms.
Genetic Programming (GP) is the process of evolving computer programs (usually in the form of trees - often Lisp programs). If you are asking specifically about GP, John Koza is widely regarded as the leading expert. His website includes lots of links to more information. GP is typically very computationally intensive (for non-trivial problems it often involves a large grid of machines).
If you are asking more generally, evolutionary algorithms (EAs) are typically used to provide good approximate solutions to problems that cannot be solved easily using other techniques (such as NP-hard problems). Many optimisation problems fall into this category. It may be too computationally-intensive to find an exact solution but sometimes a near-optimal solution is sufficient. In these situations evolutionary techniques can be effective. Due to their random nature, evolutionary algorithms are never guaranteed to find an optimal solution for any problem, but they will often find a good solution if one exists.
Evolutionary algorithms can also be used to tackle problems that humans don't really know how to solve. An EA, free of any human preconceptions or biases, can generate surprising solutions that are comparable to, or better than, the best human-generated efforts. It is merely necessary that we can recognise a good solution if it were presented to us, even if we don't know how to create a good solution. In other words, we need to be able to formulate an effective fitness function.
Some Examples
Travelling Salesman
Sudoku
EDIT: The freely-available book, A Field Guide to Genetic Programming, contains examples of where GP has produced human-competitive results.
Interestingly enough, the company behind the dynamic character animation used in games like Grand Theft Auto IV and the latest Star Wars game (The Force Unleashed) used genetic programming to develop movement algorithms. The company's website is here and the videos are very impressive:
http://www.naturalmotion.com/euphoria.htm
I believe they simulated the nervous system of the character, then randomised the connections to some extent. They then combined the 'genes' of the models that walked furthest to create more and more able 'children' in successive generations. Really fascinating simulation work.
I've also seen genetic algorithms used in path finding automata, with food-seeking ants being the classic example.
Genetic algorithms can be used to solve most any optimization problem. However, in a lot of cases, there are better, more direct methods to solve them. It is in the class of meta-programming algorithms, which means that it is able to adapt to pretty much anything you can throw at it, given that you can generate a method of encoding a potential solution, combining/mutating solutions, and deciding which solutions are better than others. GA has an advantage over other meta-programming algorithms in that it can handle local maxima better than a pure hill-climbing algorithm, like simulated annealing.
I used genetic programming in my thesis to simulate evolution of species based on terrain, but that is of course the A-life application of genetic algorithms.
The problems GA are good at are hill-climbing problems. Problem is that normally it's easier to solve most of these problems by hand, unless the factors that define the problem are unknown, in other words you can't achieve that knowledge somehow else, say things related with societies and communities, or in situations where you have a good algorithm but you need to fine tune the parameters, here GA are very useful.
A situation of fine tuning I've done was to fine tune several Othello AI players based on the same algorithms, giving each different play styles, thus making each opponent unique and with its own quirks, then I had them compete to cull out the top 16 AI's that I used in my game. The advantage was they were all very good players of more or less equal skill, so it was interesting for the human opponent because they couldn't guess the AI as easily.
http://en.wikipedia.org/wiki/Genetic_algorithm#Problem_domains
You should ask yourself : "Can I (a priori) define a function to determine how good a particular solution is relative to other solutions?"
In the mona lisa example, you can easily determine if the new painting looks more like the source image than the previous painting, so Genetic Programming can be "easily" applied.
I have some projects using Genetic Algorithms. GA are ideal for optimization problems, when you cannot develop a fully sequential, exact algorithm do solve a problem. For example: what's the best combination of a car characteristcs to make it faster and at the same time more economic?
At the moment I'm developing a simple GA to elaborate playlists. My GA has to find the better combinations of albums/songs that are similar (this similarity will be "calculated" with the help of last.fm) and suggests playlists for me.
There's an emerging field in robotics called Evolutionary Robotics (w:Evolutionary Robotics), which uses genetic algorithms (GA) heavily.
See w:Genetic Algorithm:
Simple generational genetic algorithm pseudocode
Choose initial population
Evaluate the fitness of each individual in the population
Repeat until termination: (time limit or sufficient fitness achieved)
Select best-ranking individuals to reproduce
Breed new generation through crossover and/or mutation (genetic
operations) and give birth to
offspring
Evaluate the individual fitnesses of the offspring
Replace worst ranked part of population with offspring
The key is the reproduction part, which could happen sexually or asexually, using genetic operators Crossover and Mutation.