Turing Machines - finite-automata

I'm reading a book on Languages & Automata and I'm not understanding Turing Machines. I've taught myself about DFA's NFA's and Pushdown Automata without any problems. Can someone please explain what this is doing?
B = {w#w|w ∈ {0, 1}*}
The following figure contains several snapshots of Ml 's tape while it is computing
in stages 2 and 3 when started on input 011000#011000.
Thanks alot!

"Imagine an endless row of hotel rooms, and each room contains a lightbulb and a switch that controls it. Initially, all the rooms are dark. A robot starts at one of the rooms, and has the ability to operate switches and move to adjacent rooms.
The robot has several states that it can be in, and each state determines what it should do based on whether the current room is light or dark. For example, a robot's rules could include these states:
The "scared" state:
If the room is dark, turn on the light and move to the room to the left.
If the room is light, do nothing and go to the "normal" state.
The "normal" state:
If the room is light, turn off the light and move to the room on the right.
Otherwise, go to the "scared" state.
One special state is the "stop" state. When the robot finds itself in this state, the process is complete.
Suppose a robot has n states (not including the "stop" state), and it stops. What is the maximum number of light rooms at this point?
This system is in direct allegory to Turing machines. The hotel is the tape, the robot is the Turing machine, and dark rooms and light rooms are 0 and 1 cells."
It is from googology wiki. I gave an idea to it, but, of course, this text has been improved since me.

Turing machine is a hypothetical machine with a tape where symbols are stored. It can have multiple heads that can read symbols from the tape or write symbols to the tape.
Now your grammar says B = {w#w|w ∈ {0, 1}*}, that is any string of the form "w#w", where w is any combination of 0's and 1's or none at all. So let's say w = 011000 for this particular example. The resulting string will be 011000#011000 and your turing machine will be verifying if it follows this grammar.
Your turing machine has one head in this case. It starts at the beginning of string. Reads the first character which is 0. Mark it "x": meaning I've read this. Then goes immediately after the # and checks if what it just read is matching. In this case it's 0 as well so it marks it as matching "x". It then goes back to previous position and does the same for next character. It keeps doing this until it reaches #. When it reads hash or #, it checks for the end of the string and if it is the end of string, it accepts this string saying yes this follows the given grammar.

Related

Complexity of an NFA

I have seen in several sources (e.g. 1, 2 - page 160), that the complexity of running through an NFA is O(m²n). However I haven't understood why it is so.
My intuition is that the complexity should be O(m^n) (where m is the length of the string, and n is the number of states), because for each letter in the input string, there are n possible states that the NFA can move to them.
Can anyone explain this to me?
Thanks.
Let's think about how we'd do one step of simulating the NFA. We begin in some set of active states Qinit. For each state we're in, we look at all the outgoing transitions labeled with the current symbol, then gather them into a set Qmid. Next, we follow all possible ε-transitions from those states and keep repeating this process until there are no more ε-transitions to follow (that is, we find the ε-closure), giving us our new set of state Qnext. We then repeat this process once per character in the input.
So how much time does this take? Well, there are m characters in the input string, so the runtime is m times whatever work we do on one individual character. Each state can have at most n transitions leaving it on each character, so the time to iterate over all the characters in Qinit to find the set Qmid is O(n2): there are O(n) states in Qinit and we only need to scan O(n) transitions per state. From there, we have to keep following ε-transitions until we run out of transitions to follow. We can do that by doing a BFS or DFS starting simultaneously from each state in Qmin and only following ε-transitions. The NFA can have at most O(n2) ε-transitions total (one between each pair of states), so the BFS or DFS will run in time O(n2). Overall we're doing O(n2) work per character, so the total work done is O(mn2).

measuring time between two rising edges in beaglebone

I am reading sensor output as square wave(0-5 volt) via oscilloscope. Now I want to measure frequency of one period with Beaglebone. So I should measure the time between two rising edges. However, I don't have any experience with working Beaglebone. Can you give some advices or sample codes about measuring time between rising edges?
How deterministic do you need this to be? If you can tolerate some inaccuracy, you can probably do it on the main Linux OS; if you want to be fancy pants, this seems like a potential use case for the BBB's PRU's (which I unfortunately haven't used so take this with substantial amounts of salt). I would expect you'd be able to write PRU code that just sits with an infinite outerloop and then inside that loop, start looping until it sees the pin shows 0, then starts looping until the pin shows 1 (this is the first rising edge), then starts counting until either the pin shows 0 again (this would then be the falling edge) or another loop to the next rising edge... either way, you could take the counter value and you should be able to directly convert that into time (the PRU is states as having fixed frequency for each instruction, and is a 200Mhz (50ns/instruction). Assuming your loop is something like
#starting with pin low
inner loop 1:
registerX = loadPin
increment counter
jump if zero registerX to inner loop 1
# pin is now high
inner loop 2:
registerX = loadPin
increment counter
jump if one registerX to inner loop 2
# pin is now low again
That should take 3 instructions per counter increment, so you can get the time as 3 * counter * 50 ns.
As suggested by Foon in his answer, the PRUs are a good fit for this task (although depending on your requirements it may be fine to use the ARM processor and standard GPIO). Please note that (as far as I know) both the regular GPIOs and the PRU inputs are based on 3.3V logic, and connecting a 5V signal might fry your board! You will need an additional component or circuit to convert from 5V to 3.3V.
I've written a basic example that measures timing between rising edges on the header pin P8.15 for my own purpose of measuring an engine's rpm. If you decide to use it, you should check the timing results against a known reference. It's about right but I haven't checked it carefully at all. It is implemented using PRU assembly and uses the pypruss python module to simplify interfacing.

Markov decision process - how to use optimal policy formula?

I have a task, where I have to calculate optimal policy
(Reinforcement Learning - Markov decision process) in the grid world (agent movies left,right,up,down).
In left table, there are Optimal values (V*).
In right table, there is sollution (directions) which I don't know how to get by using that "Optimal policy" formula.
Y=0.9 (discount factor)
And here is formula:
So if anyone knows how to use that formula, to get solution (those arrows), please help.
Edit: there is whole problem description on this page:
http://webdocs.cs.ualberta.ca/~sutton/book/ebook/node35.html
Rewards: state A(column 2, row 1) is followed by a reward of +10 and transition to state A', while state B(column 4, row 1) is followed by a reward of +5 and transition to state B'.
You can move: up,down,left,right. You cannot move outside the grid or stay in same place.
Break the math down, piece by piece:
The arg max (....) is telling you to find the argument, a, which maximizes everything in the parentheses. The variables a, s, and s' are an action, a state you're in, and a state that results from that action, respectively. So the arg max (...) is telling you to find an action that maximizes that term.
You know gamma, and someone did the hard work of calculating V*(s'), which is the value of that resulting state. So you know what to plug in there, right?
So what is p(s,a,s')? That is the probability that, starting from s and doing a, you end in some s'. This is meant to represent some kind of faulty actuator-- you say "go forward!" and it foolishly decides to go left (or two squares forward, or remain still, or whatever.) For this problem, I'd expect it to be given to you, but you haven't shared it with us. And the summation over s' is telling you that when you start in s, and you pick an action a, you need to sum over all possible resulting s' states. Again, you need the details of that p(s,a,s') function to know what those are.
And last, there is r(s,a) which is the reward for doing action a in state s, regardless of where you end up. In this problem it might be slightly negative, to represent a fuel cost. If there is a series of rewards in the grid and a grab action, it might be positive. You need that, too.
So, what to do? Pick a state s, and calculate your policy for it. For each s, you're going have the possibility of (s,a1), and (s,a2), and (s,a3), etc. You have to find the a that gives you the biggest result. And of course for each pair (s,a) you may (in fact, almost certainly will) have multiple values of s' to stick in the summation.
If this sounds like a lot of work, well, that's why we have computers.
PS - Read the problem description very carefully to find out what happens if you run into a wall.

How to create an "intercept missile" for a game?

I have a game I am working on that has homing missiles in it. At the moment they just turn towards their target, which produces a rather dumb looking result, with all the missiles following the target around.
I want to create a more deadly flavour of missile that will aim at the where the target "will be" by the time it gets there and I am getting a bit stuck and confused about how to do it.
I am guessing I will need to work out where my target will be at some point in the future (a guess anyway), but I can't get my head around how far ahead to look. It needs to be based on how far the missile is away from the target, but the target it also moving.
My missiles have a constant thrust, combined with a weak ability to turn. The hope is they will be fast and exciting, but steer like a cow (ie, badly, for the non HitchHiker fans out there).
Anyway, seemed like a kind of fun problem for Stack Overflow to help me solve, so any ideas, or suggestions on better or "more fun" missiles would all be gratefully received.
Next up will be AI for dodging them ...
What you are suggesting is called "Command Guidance" but there is an easier, and better way.
The way that real missiles generally do it (Not all are alike) is using a system called Proportional Navigation. This means the missile "turns" in the same direction as the line-of-sight (LOS) between the missile and the target is turning, at a turn rate "proportional" to the LOS rate... This will do what you are asking for as when the LOS rate is zero, you are on collision course.
You can calculate the LOS rate by just comparing the slopes of the line between misile and target from one second to the next. If that slope is not changing, you are on collision course. if it is changing, calculate the change and turn the missile by a proportionate angular rate... you can use any metrics that represent missile and target position.
For example, if you use a proportionality constant of 2, and the LOS is moving to the right at 2 deg/sec, turn the missile to the right at 4 deg/sec. LOS to the left at 6 deg/sec, missile to the left at 12 deg/sec...
In 3-d problem is identical except the "Change in LOS Rate", (and resultant missile turn rate) is itself a vector, i.e., it has not only a magnitude, but a direction (Do I turn the missile left, right or up or down or 30 deg above horizontal to the right, etc??... Imagine, as a missile pilot, where you would "set the wings" to apply the lift...
Radar guided missiles, which "know" the rate of closure. adjust the proportionality constant based on closure (the higher the closure the faster the missile attempts to turn), so that the missile will turn more aggressively in high closure scenarios, (when the time of flight is lower), and less aggressively in low closure (tail chases) when it needs to conserve energy.
Other missiles (like Sidewinders), which do not know the closure, use a constant pre-determined proportionality value). FWIW, Vietnam era AIM-9 sidewinders used a proportionality constant of 4.
I've used this CodeProject article before - it has some really nice animations to explain the math.
"The Mathematics of Targeting and Simulating a Missile: From Calculus to the Quartic Formula":
http://www.codeproject.com/KB/recipes/Missile_Guidance_System.aspx
(also, hidden in the comments at the bottom of that article is a reference to some C++ code that accomplishes the same task from the Unreal wiki)
Take a look at OpenSteer. It has code to solve problems like this. Look at 'steerForSeek' or 'steerForPursuit'.
Have you considered negative feedback on the recent change of bearing over change of time?
Details left as an exercise.
The suggestions is completely serious: if the target does not maneuver this should obtain a near optimal intercept. And it should converge even if the target is actively dodging.
Need more detail?
Solving in a two dimensional space for ease of notation. Take \vec{m} to be the location of the missile and vector \vec{t} To be the location of the target.
The current heading in the direction of motion over last time unit: \vec{h} = \bar{\vec{m}_i - \vec{m}_i-1}}. Let r be the normlized vector between the missile and the target: \vec{r} = \bar{\vec{t} - \vec{m}}. The bearing is b = \vec{r} \dot \vec{h} Compute the bearing at each time tick, and the change thereof, and change heading to minimize that quantity.
The math is harrier in 3d because of the need to find the plane of action at each step, but the process is the same.
You'll want to interpolate the trajectory of both the target and the missile as a function of time. Then look for the times in which the coordinates of the objects are within some acceptable error.

What are the most hardcore optimisations you've seen?

I'm not talking about algorithmic stuff (eg use quicksort instead of bubblesort), and I'm not talking about simple things like loop unrolling.
I'm talking about the hardcore stuff. Like Tiny Teensy ELF, The Story of Mel; practically everything in the demoscene, and so on.
I once wrote a brute force RC5 key search that processed two keys at a time, the first key used the integer pipeline, the second key used the SSE pipelines and the two were interleaved at the instruction level. This was then coupled with a supervisor program that ran an instance of the code on each core in the system. In total, the code ran about 25 times faster than a naive C version.
In one (here unnamed) video game engine I worked with, they had rewritten the model-export tool (the thing that turns a Maya mesh into something the game loads) so that instead of just emitting data, it would actually emit the exact stream of microinstructions that would be necessary to render that particular model. It used a genetic algorithm to find the one that would run in the minimum number of cycles. That is to say, the data format for a given model was actually a perfectly-optimized subroutine for rendering just that model. So, drawing a mesh to the screen meant loading it into memory and branching into it.
(This wasn't for a PC, but for a console that had a vector unit separate and parallel to the CPU.)
In the early days of DOS when we used floppy discs for all data transport there were viruses as well. One common way for viruses to infect different computers was to copy a virus bootloader into the bootsector of an inserted floppydisc. When the user inserted the floppydisc into another computer and rebooted without remembering to remove the floppy, the virus was run and infected the harddrive bootsector, thus permanently infecting the host PC. A particulary annoying virus I was infected by was called "Form", to battle this I wrote a custom floppy bootsector that had the following features:
Validate the bootsector of the host harddrive and make sure it was not infected.
Validate the floppy bootsector and
make sure that it was not infected.
Code to remove the virus from the
harddrive if it was infected.
Code to duplicate the antivirus
bootsector to another floppy if a
special key was pressed.
Code to boot the harddrive if all was
well, and no infections was found.
This was done in the program space of a bootsector, about 440 bytes :)
The biggest problem for my mates was the very cryptic messages displayed because I needed all the space for code. It was like "FFVD RM?", which meant "FindForm Virus Detected, Remove?"
I was quite happy with that piece of code. The optimization was program size, not speed. Two quite different optimizations in assembly.
My favorite is the floating point inverse square root via integer operations. This is a cool little hack on how floating point values are stored and can execute faster (even doing a 1/result is faster than the stock-standard square root function) or produce more accurate results than the standard methods.
In c/c++ the code is: (sourced from Wikipedia)
float InvSqrt (float x)
{
float xhalf = 0.5f*x;
int i = *(int*)&x;
i = 0x5f3759df - (i>>1); // Now this is what you call a real magic number
x = *(float*)&i;
x = x*(1.5f - xhalf*x*x);
return x;
}
A Very Biological Optimisation
Quick background: Triplets of DNA nucleotides (A, C, G and T) encode amino acids, which are joined into proteins, which are what make up most of most living things.
Ordinarily, each different protein requires a separate sequence of DNA triplets (its "gene") to encode its amino acids -- so e.g. 3 proteins of lengths 30, 40, and 50 would require 90 + 120 + 150 = 360 nucleotides in total. However, in viruses, space is at a premium -- so some viruses overlap the DNA sequences for different genes, using the fact that there are 6 possible "reading frames" to use for DNA-to-protein translation (namely starting from a position that is divisible by 3; from a position that divides 3 with remainder 1; or from a position that divides 3 with remainder 2; and the same again, but reading the sequence in reverse.)
For comparison: Try writing an x86 assembly language program where the 300-byte function doFoo() begins at offset 0x1000... and another 200-byte function doBar() starts at offset 0x1001! (I propose a name for this competition: Are you smarter than Hepatitis B?)
That's hardcore space optimisation!
UPDATE: Links to further info:
Reading Frames on Wikipedia suggests Hepatitis B and "Barley Yellow Dwarf" virus (a plant virus) both overlap reading frames.
Hepatitis B genome info on Wikipedia. Seems that different reading-frame subunits produce different variations of a surface protein.
Or you could google for "overlapping reading frames"
Seems this can even happen in mammals! Extensively overlapping reading frames in a second mammalian gene is a 2001 scientific paper by Marilyn Kozak that talks about a "second" gene in rat with "extensive overlapping reading frames". (This is quite surprising as mammals have a genome structure that provides ample room for separate genes for separate proteins.) Haven't read beyond the abstract myself.
I wrote a tile-based game engine for the Apple IIgs in 65816 assembly language a few years ago. This was a fairly slow machine and programming "on the metal" is a virtual requirement for coaxing out acceptable performance.
In order to quickly update the graphics screen one has to map the stack to the screen in order to use some special instructions that allow one to update 4 screen pixels in only 5 machine cycles. This is nothing particularly fantastic and is described in detail in IIgs Tech Note #70. The hard-core bit was how I had to organize the code to make it flexible enough to be a general-purpose library while still maintaining maximum speed.
I decomposed the graphics screen into scan lines and created a 246 byte code buffer to insert the specialized 65816 opcodes. The 246 bytes are needed because each scan line of the graphics screen is 80 words wide and 1 additional word is required on each end for smooth scrolling. The Push Effective Address (PEA) instruction takes up 3 bytes, so 3 * (80 + 1 + 1) = 246 bytes.
The graphics screen is rendered by jumping to an address within the 246 byte code buffer that corresponds to the right edge of the screen and patching in a BRanch Always (BRA) instruction into the code at the word immediately following the left-most word. The BRA instruction takes a signed 8-bit offset as its argument, so it just barely has the range to jump out of the code buffer.
Even this isn't too terribly difficult, but the real hard-core optimization comes in here. My graphics engine actually supported two independent background layers and animated tiles by using different 3-byte code sequences depending on the mode:
Background 1 uses a Push Effective Address (PEA) instruction
Background 2 uses a Load Indirect Indexed (LDA ($00),y) instruction followed by a push (PHA)
Animated tiles use a Load Direct Page Indexed (LDA $00,x) instruction followed by a push (PHA)
The critical restriction is that both of the 65816 registers (X and Y) are used to reference data and cannot be modified. Further the direct page register (D) is set based on the origin of the second background and cannot be changed; the data bank register is set to the data bank that holds pixel data for the second background and cannot be changed; the stack pointer (S) is mapped to graphics screen, so there is no possibility of jumping to a subroutine and returning.
Given these restrictions, I had the need to quickly handle cases where a word that is about to be pushed onto the stack is mixed, i.e. half comes from Background 1 and half from Background 2. My solution was to trade memory for speed. Because all of the normal registers were in use, I only had the Program Counter (PC) register to work with. My solution was the following:
Define a code fragment to do the blend in the same 64K program bank as the code buffer
Create a copy of this code for each of the 82 words
There is a 1-1 correspondence, so the return from the code fragment can be a hard-coded address
Done! We have a hard-coded subroutine that does not affect the CPU registers.
Here is the actual code fragments
code_buff: PEA $0000 ; rightmost word (16-bits = 4 pixels)
PEA $0000 ; background 1
PEA $0000 ; background 1
PEA $0000 ; background 1
LDA (72),y ; background 2
PHA
LDA (70),y ; background 2
PHA
JMP word_68 ; mix the data
word_68_rtn: PEA $0000 ; more background 1
...
PEA $0000
BRA *+40 ; patched exit code
...
word_68: LDA (68),y ; load data for background 2
AND #$00FF ; mask
ORA #$AB00 ; blend with data from background 1
PHA
JMP word_68_rtn ; jump back
word_66: LDA (66),y
...
The end result was a near-optimal blitter that has minimal overhead and cranks out more than 15 frames per second at 320x200 on a 2.5 MHz CPU with a 1 MB/s memory bus.
Michael Abrash's "Zen of Assembly Language" had some nifty stuff, though I admit I don't recall specifics off the top of my head.
Actually it seems like everything Abrash wrote had some nifty optimization stuff in it.
The Stalin Scheme compiler is pretty crazy in that aspect.
I once saw a switch statement with a lot of empty cases, a comment at the head of the switch said something along the lines of:
Added case statements that are never hit because the compiler only turns the switch into a jump-table if there are more than N cases
I forget what N was. This was in the source code for Windows that was leaked in 2004.
I've gone to the Intel (or AMD) architecture references to see what instructions there are. movsx - move with sign extension is awesome for moving little signed values into big spaces, for example, in one instruction.
Likewise, if you know you only use 16-bit values, but you can access all of EAX, EBX, ECX, EDX , etc- then you have 8 very fast locations for values - just rotate the registers by 16 bits to access the other values.
The EFF DES cracker, which used custom-built hardware to generate candidate keys (the hardware they made could prove a key isn't the solution, but could not prove a key was the solution) which were then tested with a more conventional code.
The FSG 2.0 packer made by a Polish team, specifically made for packing executables made with assembly. If packing assembly isn't impressive enough (what's supposed to be almost as low as possible) the loader it comes with is 158 bytes and fully functional. If you try packing any assembly made .exe with something like UPX, it will throw a NotCompressableException at you ;)