How to implement a "return" number with statements in EBNF? - grammar

Here is my current grammar:
program -> stmt-sequence
stmt-sequence -> statement { TT_NEWLINE statement }
assign-stmt -> var identifier TT_EQ factor
print-stmt -> print factor
add-stmt -> add factor TT_COMMA factor
sub-stmt -> sub factor TT_COMMA factor
mul-stmt -> mul factor TT_COMMA factor
div-stmt -> div factor TT_COMMA factor
factor -> number | identifier
For my math statements (add-stmt, sub-stmt, mul-stmt, div-stmt), I want those statements to return a number, as if they were functions.
If you don't understand what I mean by "return", then here is one example:
print add 2, 4
I want the math statement to be "replaced" with a number: the result of the addition.
print 6
^
Basically, becoming like this.
factor -> number | identifier | add-stmt | sub-stmt | mul-stmt | div-stmt
I do not know if adding the math statements as alternatives in factor is appropriate.
How can I allow these math statements to "return" a value in the EBNF grammar?

The math statements should not contain the return terminal; defining the return terminal/statement is not their job. Instead, you define a new symbol like
return-stmt -> TT_RETURN expr
After that you define what expr is. It is something which results in a value:
expr -> factor | add-stmt | sub-stmt | mul-stmt | div-stmt
Be careful in how you define return-stmt and how you use the existing statements/expressions you have. You shouldn't be able to generate text/code like:
return return 4 + return 2 * 3
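To illustrate how a math statement can be "replaced" by its value, here is a minimal Python sketch of an evaluator for the grammar above. It is only an illustration under assumptions: the function names (tokenize, parse_factor, parse_expr, run) are hypothetical, numbers are treated as floats, the tokens TT_EQ and TT_COMMA are assumed to be "=" and ",", and both print and assignment are assumed to accept an expr (factor or math statement) while the operands of a math statement remain plain factors.

```python
import operator

# Keyword of each math statement mapped to the operation it performs.
OPS = {"add": operator.add, "sub": operator.sub,
       "mul": operator.mul, "div": operator.truediv}

def tokenize(line):
    # Put spaces around ',' and '=' so that str.split() isolates them.
    for symbol in (",", "="):
        line = line.replace(symbol, f" {symbol} ")
    return line.split()

def parse_factor(tokens, pos, env):
    # factor -> number | identifier
    tok = tokens[pos]
    value = env[tok] if tok in env else float(tok)
    return value, pos + 1

def parse_expr(tokens, pos, env):
    # expr -> factor | add-stmt | sub-stmt | mul-stmt | div-stmt
    tok = tokens[pos]
    if tok in OPS:
        left, pos = parse_factor(tokens, pos + 1, env)
        if tokens[pos] != ",":
            raise SyntaxError("expected ',' between operands")
        right, pos = parse_factor(tokens, pos + 1, env)
        # The math statement evaluates to a number, i.e. it "returns".
        return OPS[tok](left, right), pos
    return parse_factor(tokens, pos, env)

def run(source):
    env, printed = {}, []
    for line in source.splitlines():
        tokens = tokenize(line)
        if not tokens:
            continue
        if tokens[0] == "var":
            # assign-stmt -> var identifier '=' expr (assumed extension)
            if tokens[2] != "=":
                raise SyntaxError("expected '=' in assignment")
            value, _ = parse_expr(tokens, 3, env)
            env[tokens[1]] = value
        elif tokens[0] == "print":
            # print-stmt -> print expr (assumed extension)
            value, _ = parse_expr(tokens, 1, env)
            printed.append(value)
        else:
            raise SyntaxError(f"unknown statement: {tokens[0]}")
    return printed

print(run("print add 2, 4"))
```

Because parse_expr returns a value for every alternative, print never needs to know whether it received a literal, a variable, or a math statement.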

Cypher: How to create a recursive cost query with alternates?

I have the following structure:
(:pattern)-[:contains]->(:pattern)
...basically a hierarchy of patterns that use other patterns as content. These constitute trees.
Certain patterns are generated by certain generators:
(:generator)-[:canProduce]->(:pattern)
The canProduce relationship has a cost value associated with it as a property. Multiple generators can create the same pattern.
I would like to figure out, with a query, what patterns I need to generate to produce a particular output - and which generators to choose to have the lowest cost. I started like this:
MATCH (p:pattern {name: 'preciousPattern'})-[:contains *]->(ps:pattern) RETURN ps
so far so good. The results don't contain the starting pattern, so I made this:
MATCH (p:pattern {name: 'preciousPattern'})-[:contains *]->(ps:pattern)
WITH p+collect(ps) as list
UNWIND list as patterns
RETURN patterns
That does not feel elegant, and it also does not provide the hierarchy.
I can of course do a path query (MATCH path = MATCH...) but the results don't seem very useful.
Also, now I need to connect the cost from the generator relationship.
I tried this:
MATCH (p:pattern {name: 'awesome'})-[:contains *]->(ps:pattern)
WITH p+collect(ps) as list
UNWIND list as rec
CALL {
WITH rec
MATCH (rec)-[r:canGenerate]-(g:generator)
return r.GenCost as GenCost, g.name AS GenName
}
return rec.name, GenCost , GenName
The problem I have now is that if any of the patterns that are part of another pattern can be generated by multiple generators, I just get double entries in the list, but what I want is separate lists for each alternative possibility, so that I can generate the cost.
This is my pattern tree:
Awesome
input1
input2
input 3
Input 3 can be generated by 2 different generators. I now get:
Awesome | 2 | MainGen
input1 | 3 | TestGen1
input2 | 2.5 | TestGen2
input3 | 1.25 | TestGen3
input4 | 1.4 | TestGen4
What I want is this: Two lists (or n, in the general case, where I might have n possible paths), one
Awesome | 2 | MainGen
input1 | 3 | TestGen1
input2 | 2.5 | TestGen2
input3 | 1.25 | TestGen3
and one:
Awesome | 2 | MainGen
input1 | 3 | TestGen1
input2 | 2.5 | TestGen2
input4 | 1.4 | TestGen4
each set representing one alternative set, so that I can calculate the costs and compare.
I have no idea how to do something like that. Any suggestions?

Explain the Differential Evolution method

Can someone please explain the Differential Evolution method? The Wikipedia definition is extremely technical.
A dumbed-down explanation followed by a simple example would be appreciated :)
Here's a simplified description. DE is an optimisation technique which iteratively modifies a population of candidate solutions to make it converge to an optimum of your function.
You first initialise your candidate solutions randomly. Then at each iteration and for each candidate solution x you do the following:
1. You produce a trial vector: v = a + ( b - c ) / 2, where a, b, c are three distinct candidate solutions picked randomly from your population.
2. You randomly swap vector components between x and v to produce v'. At least one component from v must be swapped.
3. You replace x in your population with v' only if it is a better candidate (i.e. it better optimises your function).
(Note that the above algorithm is very simplified; don't code from it, find proper spec. elsewhere instead)
Unfortunately the Wikipedia article lacks illustrations. It is easier to understand with a graphical representation, you'll find some in these slides: http://www-personal.une.edu.au/~jvanderw/DE_1.pdf .
It is similar to genetic algorithm (GA) except that the candidate solutions are not considered as binary strings (chromosome) but (usually) as real vectors. One key aspect of DE is that the mutation step size (see step 1 for the mutation) is dynamic, that is, it adapts to the configuration of your population and will tend to zero when it converges. This makes DE less vulnerable to genetic drift than GA.
Answering my own question...
Overview
The principal difference between Genetic Algorithms and Differential Evolution (DE) is that Genetic Algorithms rely on crossover while evolutionary strategies use mutation as the primary search mechanism.
DE generates new candidates by adding a weighted difference between two population members to a third member (more on this below).
If the resulting candidate is superior to the candidate with which it was compared, it replaces it; otherwise, the original candidate remains unchanged.
Definitions
The population is made up of NP candidates.
Xi = A parent candidate at index i (indexes range from 0 to NP-1) from the current generation. Also known as the target vector.
Each candidate contains D parameters.
Xi(j) = The jth parameter in candidate Xi.
Xa, Xb, Xc = three random parent candidates.
Difference vector = (Xb - Xa)
F = A weight that determines the rate of the population's evolution.
Ideal values: [0.5, 1.0]
CR = The probability of crossover taking place.
Range: [0, 1]
Xc` = A mutant vector obtained through the differential mutation operation. Also known as the donor vector.
Xt = The child of Xi and Xc`. Also known as the trial vector.
Algorithm
For each candidate in the population
for (int i = 0; i<NP; ++i)
Choose three distinct parents at random (they must differ from each other and i)
do
{
a = random.nextInt(NP);
} while (a == i);
do
{
b = random.nextInt(NP);
} while (b == i || b == a);
do
{
c = random.nextInt(NP);
} while (c == i || c == b || c == a);
(Mutation step) Add a weighted difference vector between two population members to a third member
Xc` = Xc + F * (Xb - Xa)
(Crossover step) For every variable in Xi, apply uniform crossover with probability CR to inherit from Xc`; otherwise, inherit from Xi. At least one variable must be inherited from Xc`
int R = random.nextInt(D);
for (int j=0; j < D; ++j)
{
double probability = random.nextDouble();
if (probability < CR || j == R)
Xt[j] = Xc`[j];
else
Xt[j] = Xi[j];
}
(Selection step) If Xt is superior to Xi then Xt replaces Xi in the next generation. Otherwise, Xi is kept unmodified.
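The steps above can be sketched in Python roughly as follows. This is a minimal illustration, not a reference implementation: the function name differential_evolution, the defaults for NP, F, CR and generations, the fixed seed, and the sphere test function are all assumptions chosen for the example. It assumes a minimization problem and NP of at least 4, and it does not clip donor vectors back into the bounds.

```python
import random

def differential_evolution(objective, bounds, NP=20, F=0.8, CR=0.9,
                           generations=200, seed=0):
    rng = random.Random(seed)
    D = len(bounds)
    # Random initial population inside the given bounds.
    population = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(NP)]
    scores = [objective(x) for x in population]

    for _ in range(generations):
        next_population, next_scores = list(population), list(scores)
        for i in range(NP):
            # Three distinct parents, all different from i.
            a, b, c = rng.sample([k for k in range(NP) if k != i], 3)
            Xa, Xb, Xc = population[a], population[b], population[c]
            # Mutation: Xc` = Xc + F * (Xb - Xa)
            donor = [Xc[j] + F * (Xb[j] - Xa[j]) for j in range(D)]
            # Crossover: index R always inherits from the donor.
            R = rng.randrange(D)
            trial = [donor[j] if (rng.random() < CR or j == R) else population[i][j]
                     for j in range(D)]
            # Selection: the trial replaces Xi only if it is better (lower).
            trial_score = objective(trial)
            if trial_score < scores[i]:
                next_population[i], next_scores[i] = trial, trial_score
        population, scores = next_population, next_scores

    best = min(range(NP), key=lambda k: scores[k])
    return population[best], scores[best]

def sphere(x):
    # Simple test function with its minimum of 0 at the origin.
    return sum(v * v for v in x)

best, score = differential_evolution(sphere, [(-100.0, 100.0)] * 3)
print(best, score)
```

Parents are always drawn from the current generation, and replacements only take effect in the next one, which matches the selection step as described.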
Resources
See this for an overview of the terminology
See Optimization Using Differential Evolution by Vasan Arunachalam for an explanation of the Differential Evolution algorithm
See Differential Evolution: A Survey of the State-of-the-Art by Swagatam Das and Ponnuthurai Nagaratnam Suganthan for different variants of the Differential Evolution algorithm
See Differential Evolution Optimization from Scratch with Python for a detailed description of an implementation of a DE algorithm in python.
The way the DE algorithm works is very simple.
Suppose you need to minimize ∑Xi^2 (the sphere model) with each variable in a given range, say [-100,100]. We know that the minimum value is 0. Let's see how DE works.
DE is a population-based algorithm. Each individual in the population has a fixed number of chromosomes (imagine a set of human beings, each with their own chromosomes or genes).
Let me explain DE with respect to the above function.
We need to fix the population size and the number of chromosomes or genes (called parameters). For instance, consider a population of size 4 in which each individual has 3 chromosomes (or genes, or parameters). Let's call the individuals R1, R2, R3, R4.
Step 1 : Initialize the population
We need to randomly initialise the population within the range [-100,100]
G1 G2 G3 objective fn value
R1 -> |-90 | 2 | 1 | =>8105
R2 -> | 7 | 9 | -50 | =>2630
R3 -> | 4 | 2 | -9.2| =>104.64
R4 -> | 8.5 | 7 | 9 | =>202.25
The objective function value is calculated using the given objective function, in this case ∑Xi^2. So for R1, the value is (-90)^2 + 2^2 + 1^2 = 8105. The others are computed the same way.
Step 2 : Mutation
Fix a target vector, say R1, then randomly select three other vectors (individuals), say R2, R3, R4, and perform mutation. Mutation is done as follows:
MutantVector = R2 + F * (R3 - R4)
(The vectors can be chosen randomly and need not be in any order.) F (the scaling factor or mutation constant), within the range [0,1], is one of the few control parameters DE has. In simple words, it describes how different the mutated vector becomes. Let's keep F = 0.5.
| 7 | 9 | -50 |
+
0.5 * (
| 4 | 2 | -9.2|
-
| 8.5 | 7 | 9 |
)
Performing the mutation gives the following mutant vector:
MV = | 4.75 | 6.5 | -59.1 | =>3557.62
Step 3 : Crossover
Now that we have a target vector (R1) and a mutant vector MV formed from R2, R3 and R4, we need to do a crossover. Consider R1 and MV as two parents; we need a child from these two parents. Crossover determines how much information is taken from each parent. It is controlled by the crossover rate (CR). Every gene/chromosome of the child is determined as follows:
a random number between 0 and 1 is generated; if it is greater than CR, the gene is inherited from the target (R1), otherwise from the mutant (MV).
Let's set CR = 0.9. Since each individual has 3 chromosomes, we need to generate 3 random numbers between 0 and 1. Say those numbers are 0.21, 0.97 and 0.8 respectively. The first and last are less than CR, so those positions in the child's vector are filled with values from MV, and the second position is filled with the gene taken from the target (R1).
Target-> |-90 | 2 | 1 | Mutant-> | 4.75 | 6.5 | -59.1 |
random num - 0.21, => `Child -> |4.75| -- | -- |`
random num - 0.97, => `Child -> |4.75| 2 | -- |`
random num - 0.80, => `Child -> |4.75| 2 | -59.1 |`
Trial vector/child vector -> | 4.75 | 2 | -59.1 | =>3519.37
Step 4 : Selection
Now we have the child and the target. Compare the objective function values of both and see which is smaller (this is a minimization problem). Select that individual out of the two for the next generation.
R1 -> |-90 | 2 | 1 | =>8105
Trial vector/child vector -> | 4.75 | 2 | -59.1 | =>3519.37
Clearly, the child is better, so replace the target (R1) with the child. The new population becomes
G1 G2 G3 objective fn value
R1 -> | 4.75 | 2 | -59.1 | =>3519.37
R2 -> | 7 | 9 | -50 | =>2630
R3 -> | 4 | 2 | -9.2 | =>104.64
R4 -> | 8.5 | 7 | 9 | =>202.25
This procedure is repeated for every target vector, generation after generation, until either the desired number of generations has been reached or we get our desired value. Hope this helps.

How does the Soundex function work in SQL Server?

Here's an example of Soundex code in SQL:
SELECT SOUNDEX('Smith'), SOUNDEX('Smythe');
----- -----
S530 S530
How does 'Smith' become S530?
In this example, the first character is S because that's the first character in the input expression, but how are the remaining three digits calculated?
Take a look at this article
The first letter of the code corresponds to the first letter of the
name. The remainder of the code consists of three digits derived from
the syllables of the word according to the following code:
1 = B, F, P, V
2 = C, G, J, K, Q, S, X, Z
3 = D, T
4 = L
5 = M,N
6 = R
The double letters with the same Soundex code, A, E, I, O, U, H, W, Y,
and some prefixes are being disregarded...
So for Smith and Smythe the code is created like this:
S S -> S
m m -> 5
i y -> 0
t t -> 3
h h -> 0
e -> -
What is Soundex?
Soundex is:
a phonetic algorithm for indexing names by sound, as pronounced in English; first developed by Robert C. Russell and Margaret King Odell in 1918
How does it Work?
There are several implementations of Soundex, but most implement the following steps:
Retain the first letter of the name and drop all other occurrences of vowels and h,w:
|a, e, i, o, u, y, h, w | → "" |
Replace consonants with numbers as follows (after the first letter):
| b, f, p, v | → 1 |
| c, g, j, k, q, s, x, z | → 2 |
| d, t | → 3 |
| l | → 4 |
| m, n | → 5 |
| r | → 6 |
Replace identical adjacent numbers with a single value (if they were next to each other prior to step 1):
| M33 | → M3 |
Cut or pad with zeros to produce a 4-character result:
| M3 | → M300 |
| M34123 | → M341 |
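The steps above can be sketched in Python as follows. This is a simplified illustration, not SQL Server's exact implementation: the function name soundex and the CODES table are hypothetical, and the sketch assumes that vowels, H, W and Y all break a run of identical codes, which is one of several conventions used by real implementations.

```python
# Map each coded consonant to its Soundex digit.
CODES = {}
for digit, letters in enumerate(["BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"], start=1):
    for letter in letters:
        CODES[letter] = str(digit)

def soundex(name):
    letters = [ch for ch in name.upper() if ch.isalpha()]
    if not letters:
        return ""
    result = letters[0]                    # retain the first letter
    previous = CODES.get(letters[0], "")   # its code still suppresses a repeat
    for ch in letters[1:]:
        code = CODES.get(ch, "")           # vowels, H, W, Y have no code
        if code and code != previous:
            result += code                 # adjacent identical codes collapse
        previous = code
    return (result + "000")[:4]            # pad with zeros, then cut to 4

print(soundex("Smith"), soundex("Smythe"))  # S530 S530
```

Tracing Smith by hand: S is kept, m gives 5, i is dropped, t gives 3, h is dropped, and the result S53 is padded to S530.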
Here's an interactive demo in jsFiddle:
And here's a demo in SQL using SQL Fiddle
In SQL Server, SOUNDEX is often used in conjunction with DIFFERENCE, which scores how many of the resulting characters are identical (just like the game Mastermind), with higher numbers indicating a closer match.
What are the Alternatives?
It's important to understand the limitations and criticisms of Soundex and where people have tried to improve it. Notably, it is rooted only in English pronunciation, and it discards a lot of data, resulting in more false positives.
Both Metaphone & Double Metaphone still focus on English pronunciations, but add much more granularity to the nuances of speech in English (e.g. PH → F)
Phil Factor wrote a Metaphone Function in SQL with the source on github
Soundex is most commonly used for identifying similar names, and it will have a really hard time finding similar nicknames (e.g. Robert → Rob or Bob). Per this question on a Database of common name aliases / nicknames of people, you could incorporate a lookup against similar nicknames as well in your matching process.
Here are a couple of free lists of common nicknames:
SOEMPI - name_to_nick.csv | Github
carltonnorthern - names.csv | Github
Further Reading:
Fuzzy matching using T-SQL
SQL Server – Do You Know Soundex Functions?

What's the proper grammar for this language?

I have this language:
{a^n b^m | m+n is an even number}
What's the proper grammar for this?
S -> aaS | aB | bbC | ε
B -> bbB | b
C -> bbC | ε
You see, it is a regular language. 'S' stands for "we have constructed an even number of a's and more a's may follow", 'B' stands for "we have constructed an odd number of a's and now an odd number of b's follows", and 'C' stands for "we have constructed an even number of a's and now an even number of b's follows".
ε stands for "", the empty string
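Since the language is regular, one way to sanity-check the grammar is to transcribe it into a regular expression and compare it with the definition of the language on small inputs. The sketch below does that in Python; the regex is my own hand transcription of the three rules, and the helper names are hypothetical.

```python
import re

# Hand transcription of the grammar:
#   S -> aaS | aB | bbC | ε   gives  (aa)* followed by aB, bbC, or nothing
#   B -> bbB | b              gives  (bb)*b
#   C -> bbC | ε              gives  (bb)*
GRAMMAR_AS_REGEX = re.compile(r"(aa)*(a(bb)*b|bb(bb)*|)")

def grammar_accepts(word):
    return GRAMMAR_AS_REGEX.fullmatch(word) is not None

def in_language(n, m):
    # a^n b^m belongs to the language exactly when n + m is even.
    return (n + m) % 2 == 0

# Compare both definitions for all n, m below a small bound.
mismatches = [(n, m) for n in range(8) for m in range(8)
              if grammar_accepts("a" * n + "b" * m) != in_language(n, m)]
print(mismatches)  # []
```

An empty list only shows agreement up to the tested bound; it is a quick check, not a proof.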

How can I construct a grammar that generates this language?

I'm studying for a finite automata & grammars test and I'm stuck with this question:
Construct a grammar that generates L:
L = {a^n b^m c^(m+n) | n>=0, m>=0}
I believe my productions should go along these lines:
S->aA | aB
B->bB | bC
C->cC | c   (here's where I have doubts)
How can my production for C remember the numbers of m and n? I'm guessing this must rather be a context-free grammar, if so, how should it be?
Seems like it should be like:
A->aAc | aBc | ac | epsilon
B->bBc | bc | epsilon
You need to force the c's to be counted during the construction process. To show that the language is not regular, and therefore needs a context-free grammar, I would consider using the pumping lemma.
S -> X
X -> aXc | Y
Y -> bYc | e
where e == epsilon and X is unnecessary but
added for clarity
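To convince yourself that the X/Y grammar above generates exactly L, a small recognizer can be checked against the definition of L on all short strings. The Python sketch below is only an illustration: the function names are hypothetical, and the recognizer relies on the observation that the grammar can be parsed by looking at one symbol at a time ('a' selects X -> aXc, 'b' selects Y -> bYc, anything else selects e).

```python
import re
from itertools import product

def parse_y(word, i):
    # Y -> bYc | e ; returns the index just after Y, or None on failure.
    if i < len(word) and word[i] == "b":
        j = parse_y(word, i + 1)
        if j is not None and j < len(word) and word[j] == "c":
            return j + 1
        return None
    return i  # e (epsilon): consume nothing

def parse_x(word, i):
    # X -> aXc | Y
    if i < len(word) and word[i] == "a":
        j = parse_x(word, i + 1)
        if j is not None and j < len(word) and word[j] == "c":
            return j + 1
        return None
    return parse_y(word, i)

def generated(word):
    # S -> X, and the whole word must be consumed.
    return parse_x(word, 0) == len(word)

def in_language(word):
    # Direct definition: a^n b^m c^(m+n) with n, m >= 0.
    ordered = re.fullmatch(r"a*b*c*", word) is not None
    return ordered and word.count("c") == word.count("a") + word.count("b")

# Compare both definitions on every string over {a, b, c} up to length 6.
mismatches = ["".join(w) for k in range(7) for w in product("abc", repeat=k)
              if generated("".join(w)) != in_language("".join(w))]
print(mismatches)  # []
```

Each 'a' or 'b' consumed on the way down forces one 'c' on the way back up, which is how the grammar "remembers" m and n.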
Yes, this does sound like homework, but a hint:
Every time you match an 'a', you must match a 'c'. Same for matching a 'b'.
S->aSc|A
A->bAc|λ
This means that whenever you get an a you must have at least 1 c, and if you get an a and a b you must have 2 c's.
I hope it has been helpful.
Well guys, this is how I'll do it:
P={S::=X|epsilon,
X::=aXc|M|epsilon,
M::=bMc|epsilon}
My answer:
S -> aAc | aSc
A -> bc | bAc
where S is the start symbol.
S-> aBc/epsilon
B-> bBc/S/epsilon
This takes care of the order of the alphabets as well