What are Finite State Automata and why should a programmer know about them?

What are Finite State Automata and why should a programmer know about them? - finite-automata

Erm - what the question said. It's something I keep hearing about, but I've not got round to looking into it yet.
(updated) I could look up the definition... but why not (as pointed out by #erikson) get insight into your real experiences and anecdotes. Community Wiki'd incase that helps folks vote up the most insightful answer. Interesting reading so far, thanks!

Short answer, it is a technique that you can use to express systems with concrete states (as opposed to quantum states / probability distributions).
Quoting the Wikipedia article:
A finite state machine (FSM) or finite
state automaton (plural: automata) or
simply a state machine, is a model of
behavior composed of a finite number
of states, transitions between those
states, and actions. A finite state
machine is an abstract model of a
machine with a primitive internal
memory.
So, what does that mean to you? Put simply, it is an effective way to represent the path(s) from a starting state to the end state(s) of the system that you care about. Using regular expressions as a fairly easy to understand example, let's look at the pattern AB+C (imagine that that plus is a superscript). I would expect to this pattern to accept strings such as "ABC", "ABBC", "ABBBC", etc. A at the start, C at the end, some number of B's in the middle (greater than or equal to one).
If you think about it, it's almost easier to think about this in terms of a picture. Faking it with text (and that my parentheses are a loopback arc), you can see that A (on the left), is the starting state and C (on the right) is the end state on the right.
_
( )
A --> B --> C
From FSAs, you can continue your journey into computational complexity by heading over to the land of Turing Machines.
However, you can also use state machines to represent real behaviors and systems. In my world, we use them to model certain workflow of actual people working with components that are extremely intolerant of mistakes in state order. As in, "A had better happen before C or there will be a very serious problem. Make that be not possible right now."

You could look it up, but what the hell. Intuitively, a finite state automaton is an abstraction of something that has some finite number of states, and rules by which you can go from state to state. A state is something for which a true or false statement can be made, and a rule is a way that you change from one state to another. So, you could have, say, two states: "I'm at home" and "I'm at work" and two rules, "go to work" and "go home."
It turns out that you can look at machines like this mathematically, and find there are things they can and cannot do. Regular expressions are basically a way of describing a finite state machine in which the states are a set of different strings, and the rules move you from state to state based on the next character read. You can prove that. But you can also prove that no finite state machine can tell whether or not the parentheses in an expression are matched (via the pumping lemma for FSAs.)
The reason you should learn about FSAs is that they can be used to solve many problems: string matching, control of systems, business process descriptions, digital circuit design. They're also inherently pretty.
Formally, an FSA is a algebraic structure F = 〈Σ, S, s0, F, δ〉 where Σ is the input alphabet, S is a set of states, s0 ∈ S is a particular start state, F ⊆ S is a set of accepting states, and δ:S×Σ → S is the state transition function.

in OOP terms: if you have an object with methods that you call on certain events, and some (other) methods that have different behaviour depending on the previous calls.... surprise! you have a state machine!
now, if you know the theory, you don't have to rethink it all. you simply say: "piece of cake, it's just a state machine" and go on to implement it.
if you don't know the theory you'll think about it for a while, write some clever hacks, and get something that's difficult to explain and document... because you don't have the words to describe it

Good answers above. I would only add that FSA are primarily a thinking tool, not a programming technique. What makes them useful is they have nice properties, and anything that acts like one has those properties. If you can think of something as an FSA, there are many ways you can build it:
as a regular expression
as a state-transition table
as a while-switch-on-state loop
as a goto-net (horrors!)
as simple structured program code
etc. etc.
If somebody says something is a FSA, you can immediately know what they are talking about, no matter how it is built.

You need state machines whenever you have to release your thread before you have completed your operation.
Since web services are often not statefull, you don't usually see this in web services--you re-arrange your URL so that each URL corresponds to a single path through the code.
I guess another way to think about it could be that every web server is a FSM where the state information is kept in the URL.
You often see it when processing input. You have to release your thread before the input has all been completed, so you set a flag saying "input in progress" or something like that. When done you set the flag to "awaiting input". That flag is your state monitor.
More often than not, a FSM is implemented as a switch statement that switches on a variable. Each case is a different state. At the end of the case, you may set the state to a new value. You've almost certainly seen this somewhere.
The nice thing about a FSM is that you can make the state a part of your data rather than your code. Imagine that you need to fill out 1000 items in the database. The incoming data will address one of the 1000 items, but you generally don't have enough data to complete the operation.
Without an FSM you might have hundreds of threads waiting around for the rest of the data so they can complete processing and write the results to the DB. With a FSM, you write the state to the DB, then exit your thread. Next time you can check the incoming data, read the state from the thread and that should give you enough info to determine what code to run.
Nearly every FSM operation COULD be done by dedicating a thread to it, but probably not as well (The complexity multiplies based on number of states, whereas with a state machine the rise in complexity is more linear). Also, there are some conceptual design issues--examining your code at the state level is in some cases much easier than examining it at the line of code level.

Every programmer should know about them because they are an excellent tool for certain kinds of problems, where the usual 'iterative-thinking' approach would yield nasty, complex code.
A typical example is game AI, where NPCs have different states that change according to where the player is, something like:
NPC_STATE_IDLE
NPC_STATE_ALERT (player at less than 100 meters)
NPC_STATE_ENGAGE (player attacked NPC)
NPC_STATE_FLEE (low on health)
where a FSM can describe easily the transitions and help perform complex reasoning about the system the FSM is describing.

Important: If you are a "visual" style learner, stop everything you are doing and go to this link ... Right Now.
If you are a "visual" learner, here is an excellent link that gives a very accessible introduction.
Reanimator by Oliver Steele
It looks like you've already approved an answer, but if you appreciate "visual" introduction to new concepts, as is common, you really should check out the link. It is simply outstanding.
(Note: the link points to a discussion of DFA and NDFA in the context of regular expressions -- with animated interactive diagrams)

Yes! You could look it up!
http://en.wikipedia.org/wiki/Finite_state_machine

What it is is better answered on other sites (such as Wikipedia), because there are pretty extensive answers out there already.
Why you should know them: Because you probably implemented them already.
Any time your code has a limited number of possible states (that's the "finite state" part) and switches to another one once some input/event happend (that's the "machine" part) you've written a finite state machine.
It is a very common tool and knowing the theoretical basics for that, being able to reason about it and knowing how to combine two FSMs into a single one that does the same work can be a great help.

FSAs are great data structures to understand because any chance you have to implement them, you're working at the lowest level of computational complexity on the Chomsky hierarchy. A great example is in word morphology (how parts of words come together). A lot of work has been done to show that even the most severe cases can be analyzed in this extremely fast analytical framework. Take a look at Karttunnen and Beesley's work out of PARC.
FSAs are also a great place to start learning about machine learning concepts like hidden markov models, because in many ways, the problem can be broken down using the same ideas and vocabulary.

One item that hasn't been mentioned so far is the semantic equivalence of finite state automata and regular expressions. A regular expression can be compiled to a finite state automaton (this is how regex libraries work) and vice-versa.

FSA (including DFA and NFA) are very important for computer science and they are use in many fields including many fields. For instance hidden markov fields for speech recognition also regular expressions are converted to the FSA's before they are interpreted by the software and NLP (Natural Language Processing), AI (game programming), Robot Programming etc.
One of the disadvantage of FSA's are they are usually slow and usually hard to implement and hard to understand or visualize while reading the code, but they are good because they usually provide generic solutions to the problems and they are well-known with a lot of studies on FSA's.

Related

alpha/beta pruning, from which perspective should the evaluation be performed?

I'm trying to develop a chess program. It will be regular bruteforce searching of a tree, the only thing different will be the evaluation. Initially I will use a standard evaluator as designed by Claude Shannon so that it is more easy to test if it works fine at the base. Move list generation and all other infra around it works fine.
Now for the searching: I would like to use the alpha/beta pruning code example of wikipedia.
And that's what is a bit of a problem: it is ambiguous about one thing; from who's perspective should the evaluation be done? I've been googling for days (literally) but no example says this explicitly.
So: should the evaluation be
from the point of view of the "current mover" at that depth
from the point of view of the "opponent of current move" at that depth
from the point of view of the mover at the root of the tree, e.g. the person for who this tree-search is being done (the AI player)
from the point of view of the opponent of the mover at the root of the tree
?
I experimentally tried rootSide, side, rootOpponent, well basically all options and then let them play against each other. The outcome of that was that "current mover at that depth" would be the one to use (it won most frequently) but testing that version against any other engine result in 100% loss.
Of course wikipedia will be updated to be more clear about it! (please ignore the note on the side in the wikipedia history: that was from myself which might be incorrect so I removed it)

Evaluation should be performed from the perspective of the node invoking it. This is your first case, and it is the only one that really makes sense.
If you're testing your very basic engine against other complete engines then yes you can expect to lose every match. You are missing lots of techniques so playing against other engines is not a good method of testing right now. Once your engine is stronger then yes you can use this method to improve strength or perform regression testing.
I'd recommend setting up simple positions and playing with it manually to look at whether it can see basic captures, checks, checkmates, etc. Be warned that even with a perfect implementation of alpha-beta and simple evaluation that your engine will still blunder at times. This is because of the fixed search horizon so next you may want to look into quiescence search.

Basic A Star pathfinding for a platformer game

I'm currently writing a basic platformer game with the idea of learning the basis patterns of a 2D game development. One of them is the amazing A Star Pathfinding. After reading and study a few articles (two of them are the best: Link and http://theory.stanford.edu/~amitp/GameProgramming/Heuristics.html#S7), tutorials and source code, I did a simple A Star Pathfinding using the Manhattan method that works well for my personal learning process.
The next, and logical step, is to do the same on a platform scenario where the NPC must jump or climb ladders to reach the goal target. I've found these screencastings http://www.youtube.com/watch?v=W2xoJfT6sek and https://www.youtube.com/watch?v=Bc7tu-KwfoU that show perfectly what I want to do.
I haven't found for the moment any resource that explains the method to implement it. Can someone give me a starting point I can work with?
Thanks in advance

As far as I can tell, the only difference that jumping/ladders make is that they affect possible movement. So, all your algorithm has to do is take into account where a character can and can't move to, and the costs (if you're implementing that) of moving there.

I haven't any clean A* code around fit for an example but let me give you a few pointers when implementing A*.
First off, the most essential tip that I often find neglected in tutorials is use a priority queue.
Whenever you generate possible move states, put them into the priority queue with priority being the value of your heuristic for that move state. This differentiates A* from DFS (which uses a stack and backtracking) or simple greedy path finding (which simply goes to the best state with respect to the present state).
Next, mind those pointers! I don't know what language will you implement this but even if you do it in some language that abstracts pointers (unlike C), you have to understand the concept of a deep copy versus a plain shallow copy. I remember the first time I did A* (in Python) and I did not pay attention to this. My states were repeatedly overwritten leading to unpredictable behavior. Turns out I did not make deep copies!
That said, depending on how complex the task you have in mind, all those deep copies might clutter up your memory. While good memory management will help, for big problem instances that need real-world performance, I tend to just use iterative-deepening A* so that after every iteration, I get to unclutter my memory.

How can I decide when to use linear programming?

When I look at optimization problems I see a lot of options. One is linear programming. I understand in abstract terms how LP works, but I find it difficult to see whether a particular problem is suitable for LP or not. Are there any heuristics that can help guide this decision?
For example, the work described in Is there a good way to do this type of mining? took weeks before I saw how to structure the problem correctly. Is it possible to know "in advance" that problem could be solved by LP, without first seeing "how to phrase it"?
Is there a checklist I can use to decide whether a problem is suitable for LP? Is there a standard (readable) reference for this topic?

Heuristics (and/or checklists) to decide if the problem at hand is really a Linear Program.
Here's my attempt at answering, and I have also tried to outline how I'd approach this problem.
Questions that indicate that a given problem is suitable to be formulated as an LP/IP:
Are there decisions that need to be taken regularly, at different time intervals?
Are there a number of resources (workers, machines, vehicles) that need to be assigned tasks? (hours, jobs, destinations)
Is this a routing problem, where different "points" have to be visited?
Is this a location or a "layout" problem? (Whole class of Stock-cutting problems fall into this group)
Answering yes to these questions means that an LP formulation might work.
Commonly encountered LP's include: Resource allocation.: (Assignment, Transportation, Trans-shipment, knapsack) ,Portfolio Allocation, Job Scheduling, and network flow problems.
Here's a good list of LP Applications for anyone new to LPs or IPs.
That said, there are literally 1000s of different types of problems that can be formulated as LP/IP. The people I've worked with (researchers, colleagues) develop an intuition. They are good at recognizing that a problem is a certain type of an Integer Program, even if they don't remember the details, which they can then look up.
Why this question is tricky to answer:
There are many reasons why it is not always straightforward to know if an LP formulation will cut it.
There is a lot of "art" (subjectivity) in the approach to modeling/formulation.
Experience helps a lot. People get good at recognizing that this problem can be "likened" to another known formulation
Even if a problem is not a straight LP, there are many clever master-slave techniques (sub-problems), or nesting techniques that make the overall formulation work.
What looks like multiple objectives can be combined into one objective function, with an appropriate set of weights attached.
Experienced modelers employ decomposition and constraint-relaxation techniques and later compensate for it.
How to Proceed to get the basic formulation done?
The following has always steered me in the right direction. I typically start by listing the Decision Variables, Constraints, and the Objective Function. I then usually iterate among these three to make sure that everything "fits."
So, if you have a problem at hand, ask yourself:
What are the Decision Variables (DV)? I find that this is always a good place to start the process of formulation. How many types of DV's are there? (Which resource gets which task, and when should it start?)
What are the Constraints?
Some constraints are very readily visible. Others take a little bit of teasing out. The constraints have to be written in terms of your decision variables, and any constants/limits that are imposed.
What is the Objective Function?
What are the quantities that need to be maximized or minimized? Note: Sometimes, it is not clear what the objective function is. That is okay, because it could well be a constraint-satisfaction problem.
A couple of quick Sanity Checks once you think your LP formulation is done:
I always try to see if a trivial solution (all 0s or all big
numbers) is not part of the solution set. If yes, then the
formulation is most probably not correct. Some constraint is
missing.
Make sure that each and every constraint is "related"' to
the Decision Variables. (I occasionally find constraints that are
just "hanging out there." This means that a "bookkeeping constraint"
has been missed.)
In my experience, people who keep at it almost always develop the needed intuition. Hope this helps.

Applying Darwinian evolution to programming

A while back I recall reading a magazine article (in Wired I believe) about applying Darwinian evolution to programs to create better programs. Essentially multiple mutations of a program would be spawned, and the one that performed the best would be selected for the next round of mutations.
Unforunately I can't make the subject sound nearly as interesting as is sounded in the article, but I can't find the article.
Since this sounds like just the coolest thing ever to me, I was wondering what mutations one could have inside of a program

Yes. It is called Genetic Programming, where a master program that writes programs itself. And the programs it writes can evolve to a certain criterion.
E.g. 8 queen could be solved by GP.

I think you're referring to Genetic Algorithms. I want to work on this topic for my dissertation. I can't stop reading about it :-)

Found this article/paper - is this what you're referring to?. Also found this PDF. Quite an interesting topic
What it sounds like is that you could use self-modifying code that reproduces the program itself based on self-monitoring optimizations. This would currently point at interpreted-language programs.

I read an article on Coding Horror about something like that the other day: Go That Way, Really Fast. Basically, the idea I got from it was that software should constantly be improved which means constantly pushing out new versions/releases. This seems to match the idea of evolution in that your software is always improving into something better.

As said before it's called Genetic Programming (GP).
The interesting thing is that GP is a systematic, domain-independent method for getting computers to solve problems automatically starting from a high-level statement of what needs to be done.
Using ideas from natural evolution, GP starts from a population of random computer programs and progressively refines them through processes of mutation and crossover (recombination), until solutions emerge.
All this without the user having to know or specify the form or structure of solutions in advance.
GP has generated a plethora of human-competitive results and applications, including novel scientific discoveries and patentable inventions (see also What are good examples of genetic algorithms/genetic programming solutions?).
I was wondering what mutations one could have inside of a program
There are many genetic operators (not only mutation) and many implementations. The fundamental property they are required to have is closure (they must mantain the structural integrity of the genetic program).
In general mutation replaces a symbol of the program with a compatible terminal / function choosen from a group of available symbols. Crossover operator mixes the information of two or more programs.
Probably the best free introduction to the subject is A Field Guide to Genetic Programming
Some nice links are:
Genetic Programming: Evolution of Mona Lisa
Genetic Cars
Smart rockets

OOP Problems to use for Coding Tests during interviews

As a second interview I get people to sit down and write code...I try to make the problem really technology independent.
My programming problems that I have don't really exercise peoples OO abilities. I tend to try and keep the coding problem solvable within 2 hours ish. So, I've struggled to find a problem small enough and involved enough that it exposes peoples OO design skills.
Any suggestions?

This is a problem that I use with some trainings, looks simple but is tricky OOP-wise:
Create model classes that will properly represent the following constructs:
Define a Shape object, where the object is any two dimensional figure, and has the following characteristics: a name, a perimeter, and a surface area.
Define a Circle, retaining and accurately outputting the values of the aforementioned characteristics of a Shape.
Define a Triangle. This time, the name of the triangle should take into account if it is equilateral (all 3 sides are the same length), isoceles (only 2 sides are the same length), or scalene (no 2 sides are the same).
You can go on and on with quadrelaterals (which include squares, rectangles, rhombi, etc) and other polygons.
The way that they would solve the above problems would reveal the people who understand OOP apart from those who don't.

ideally, you want to present a problem that appears difficult, but has a simple, elegant, obvious solution if you think in OO terms
perhaps:
we need to control access to a customer web site
each customer may have one or more people to access the site
different people from different customers may be able to view different parts of the site
the same person may work for more than one customer
customers want to manage permissions based on the person, department, team, or project
design a solution for this using object-oriented techniques
one OO solution is to have a Person, a Customer, an Account, and AccountPermissions, where the Account specifies a Person and a Customer and an optional Parent Account. the use of a recursive Account object collapses the otherwise cumbersome person/team/department/project structure a direct ERD solution might yield

I have used the FizzBuzz Programming Test. And shockingly can corroborate the claims made by the article. As a second follow up I have asked candidates to compute the angle(s) between the hands on an analog clock. We set up a laptop with VS 2008 installed and the stub in place. all they have to do is fill in the implementation.
I am always stunned at how poorly candidates do on these two questions. I really am.

Designing Social Security Application is something which I ask a lot of people during interviews.
The nice thing about this is everyone is aware of how it works and what things to keep track of.
They also have to justify their design and this really helps me get inside their head :)
(As there is lots of flexibility here)
Kind regards,

Whether or not people do some coding in the interview, I make it a point to ask this:
Tell me about a problem you solved recently using object oriented programming. You'd be surprised how often people cannot answer that simple question. A lot of times I get a blank stare, or they say something like "what do you mean? I program in .NET, which is all object oriented."

These aren't specifically OO Questions, but check out the other questions tagged interview-questions
Edit: What about implementing some design patterns? I don't have the best knowledge in the area but it seems as if you would be getting two questions for the price of one. You can test for both OO and Design pattens in the one question.

How about some sort of simple GUI. It's got inheritance, overriding, possibly events. If you mean for them to actually implement as part of the test then you could hand them a blank windows-form with an OnPaint() and tell them to get to it.

You could do worse than ask them to design a MapReduce library with a single-process implementation. Will the interface still work for a distributed implementation? What's the exception-handling policy? Should there be special support for chaining MapReduce jobs in a pipeline? What's the interface to the inputs and outputs? How are inputs chunked up? Can different inputs in one job go to different mappers? What defaults are reasonable?
A good solution in Python takes about a page of code.

I've got a super simple set. The idea is mainly to use them to filter out people who really don't know their stuff rather than filtering in the rock stars.
These are all 5 minute white-board type questions, so they are really not that hard. But the act of writing up code, and talking through it reveals a lot about a candidate - and is brilliant for exposing those that can otherwise BS through the talk.
Write a method that takes a radius of a circle as an argument, and returns the area of the circle (You would be amazed how many people struggle on this one!)
Write a program that accepts a series of numbers as arguments from the command line. Add them up, and print the sum
Write a class that acts as a keyed counter (basically a map that keeps track of how many times each key is "counted")

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas