Earley algorithm gone wrong - grammar

I am trying to implement Earley's algorithm for parsing a grammar, but I must have done something wrong: after the first entry in the chart, it doesn't go through the rest of the input string. My test grammar is the following:
S -> aXbX | bXaX
X -> aXbX | bXaX | epsilon
S and X are non-terminals; a and b are terminals.
The string I want to check for acceptance by the grammar is 'abba'.
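(For reference, 'abba' should indeed be accepted: one derivation is S → aXbX → abX (first X → ε) → abbXaX (X → bXaX) → abba (remaining X's → ε).)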
Here is my code:
rules = {
    "S": [
        ['aXbX'],
        ['bXaX'],
    ],
    "X": [
        ['aXbX'],
        ['bXaX'],
        ['']
    ]
}
def predictor(rule, state):
    if rule["right"][rule["dot"]].isupper():  # NON-TERMINAL
        return [{
            "left": rule["right"][rule["dot"]],
            "right": right,
            "dot": 0,
            "op": "PREDICTOR",
            "completor": []
        } for right in rules[rule["right"][rule["dot"]]]]
    else:
        return []
def scanner(rule, next_input):
    # TERMINAL
    if rule["right"][rule["dot"]].islower() and next_input in rules[rule["right"][rule["dot"]]]:
        print('scanner')
        return [{
            "left": rule["right"][rule["dot"]],
            "right": [next_input],
            "dot": 1,
            "op": "SCANNER",
            "completor": []
        }]
    else:
        return []
def completor(rule, charts):
    if rule["dot"] == len(rule["right"]):
        print('completor')
        return list(map(
            lambda filter_rule: {
                "left": filter_rule["left"],
                "right": filter_rule["right"],
                "dot": filter_rule["dot"] + 1,
                "op": "COMPLETOR",
                "completor": [rule] + filter_rule["completor"]
            },
            filter(
                lambda p_rule: p_rule["dot"] < len(p_rule["right"]) and rule["left"] == p_rule["right"][p_rule["dot"]],
                charts[rule["state"]]
            )
        ))
    else:
        return []
input_string = 'abba'
input_arr = [char for char in input_string] + ['']

charts = [[{
    "left": "S'",
    "right": ["S"],
    "dot": 0,
    "op": "-",
    "completor": []
}]]

for curr_state in range(len(input_arr)):
    curr_chart = charts[curr_state]
    next_chart = []
    for curr_rule in curr_chart:
        if curr_rule["dot"] < len(curr_rule["right"]):  # not finished
            curr_chart += [i for i in predictor(curr_rule, curr_state) if i not in curr_chart]
            next_chart += [i for i in scanner(curr_rule, input_arr[curr_state]) if i not in next_chart]
        else:
            print('else')
            curr_chart += [i for i in completor(curr_rule, charts) if i not in curr_chart]
    charts.append(next_chart)
def print_charts(charts, inp):
    for chart_no, chart in zip(range(len(charts)), charts):
        print("\t{}".format("S" + str(chart_no)))
        print("\t\n".join(map(
            lambda x: "\t{} --> {}, {} {}".format(
                x["left"],
                "".join(x["right"][:x["dot"]] + ["."] + x["right"][x["dot"]:]),
                str(chart_no) + ',',
                x["op"]
            ),
            chart
        )))
        print()

print_charts(charts[:-1], input_arr)
And this is the output I get (for states 1 to 4 I should get 5 to 9 entries):
S0
S' --> .S, 0, -
S --> .aXbX, 0, PREDICTOR
S --> .bXaX, 0, PREDICTOR
S1
S2
S3
S4


jq reducing stream to an array of all leaf values using input

I want to receive streamed JSON inputs and reduce them to an array containing the leaf values.
Demo: https://jqplay.org/s/cZxLguJFxv
Please consider this filter:
try reduce range(30) as $i ( []; (.+[until(length==2;input)[1]] // error(.)) )
catch empty
input:
[[0,0,"a"],null]
[[0,0,"a"]]
[[0,1,"b"],null]
[[0,1,"b"]]
[[0,1]]
[[1],0]
...
output:
empty
I expect the output [null, null, 0, ...], but I get empty instead.
I told reduce to iterate 30 times, but there are fewer inputs than that. I'm expecting it to skip the inputs whose length is not 2 and to produce an array containing all the leaf values.
I also don't know how this will behave when there is no input of length 2 left but reduce still has iterations remaining.
I want to know why my filter returns empty. What am I doing wrong? Thanks!
These filters should do what you want:
jq -n 'reduce inputs as $in ([]; if $in | has(1) then . + [$in[1]] else . end)'
Demo
jq -n '[inputs | select(has(1))[1]]'
Demo
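Both variants rely on -n (null input), so the first streamed value is not consumed before inputs runs, and on has(1), which keeps only the two-element stream events whose second element is the leaf value.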

How to enumerate files in channel to use `collectFile`

I am trying to enumerate files in a Channel to rename them before using collectFile:
files.flatten().merge(Channel.fromList([1, 2, 3, 4])).collectFile(storeDir: "$SCRATCH/intermediate") {
    item -> ["data${item[1]}.csv", item[0].text]
}
However, the latest documentation says that the merge operator for channels is deprecated, but it does not point to any alternative. What can I use instead of merge?
The migration notes say to use the join operator instead. If your inputs were lists, you could do something like:
def indexedChannel( items ) {
    return Channel.from( items.withIndex() ).map { item, idx -> tuple( idx, item ) }
}

ch1 = indexedChannel( [ 15, 20, 21 ] )
ch2 = indexedChannel( [ 'a', 'b', 'c' ] )
ch3 = indexedChannel( [ 1, 2, 3 ] )

ch1
    .join( ch2 )
    .join( ch3 )
    .view()
Results:
[0, 15, a, 1]
[1, 20, b, 2]
[2, 21, c, 3]
However, merging/joining two channels is unnecessary just to enumerate the files; use the map operator instead:
def c = 1

Channel
    .fromPath( './data/*.txt' )
    .map { tuple( it, c++ ) }
    .collectFile(storeDir: "$SCRATCH/intermediate") { fn, count ->
        ["data${count}.csv", fn.text]
    }

Minimum number of jumps to reach end dynamic programming

Given an array, determine how many jumps are needed to reach the end, starting from the first element.
Example: arr = [1, 3, 5, 8, 4, 2, 6, 7, 0, 7, 9]
1 -> 3 -> 8 (this is the shortest path)
3 steps.
So far, I have this code from GeeksforGeeks:
def jumpCount(x, n):
    jumps = [0 for i in range(n)]
    if (n == 0) or (x[0] == 0):
        return float('inf')
    jumps[0] = 0
    for i in range(1, n):
        jumps[i] = float('inf')
        for j in range(i):
            if (i <= j + x[j]) and (jumps[j] != float('inf')):
                jumps[i] = min(jumps[i], jumps[j] + 1)
                break
    return jumps[n-1]

def jumps(x):
    n = len(x)
    return jumpCount(x, n)

x = [1, 3, 5, 8, 4, 2, 6, 7, 0, 7, 9]
print(jumps(x))
But I want to print out what numbers made the shortest path (1-3-8). How can I adapt the code to do it?
I tried to keep a list of the j values, but since 5 is also tested in the loop, it gets appended too.
Link to the problem:
https://www.geeksforgeeks.org/minimum-number-of-jumps-to-reach-end-of-a-given-array/
The essential idea is that you need an auxiliary structure to keep track of the minimum path. These structures are usually called "backpointers" (in our case you could call them "forward pointers", since we are moving forward). My code solves the problem recursively, but the same could be done iteratively. The strategy is as follows:
import math

jumps_vector = [ 1, 3, 5, 8, 4, 2, 6, 7, 0, 7, 9 ]

"""
fwdpointers holds the relative jump size to reach the minimum number of jumps
for every component of the original vector
"""
fwdpointers = {}

def jumps( start ):
    if start == len( jumps_vector ) - 1:
        # Reached the end
        return 0
    if start > len( jumps_vector ) - 1:
        # Cannot go through that path
        return math.inf
    if jumps_vector[ start ] == 0:
        # Cannot go through that path (infinite loop with itself)
        return math.inf
    # Get the minimum in a traditional way
    current_min = jumps( start + 1 )
    fwdpointers[ start ] = 1    # relative jump of one position
    for i in range( 2, jumps_vector[ start ] + 1 ):
        aux_min = jumps( start + i )
        if current_min > aux_min:
            # Better path. Update minimum and fwdpointers
            current_min = aux_min
            # Store the (relative!) index of where I jump to
            fwdpointers[ start ] = i
    return 1 + current_min
In this case, the variable fwdpointers stores the relative indexes of where I jump to. For instance, fwdpointers[ 0 ] = 1, since I will jump to the adjacent number, but fwdpointers[ 1 ] = 2, since the next jump covers two positions.
Having done that, it's only a matter of postprocessing things a bit in the main() function:
if __name__ == "__main__":
    min_jumps = jumps( 0 )
    print( min_jumps )

    # Holds the index of the jump given such that
    # the sequence of jumps are the minimum
    i = 0

    # Remember that the contents of fwdpointers[ i ] are the relative indexes
    # of the jump, not the absolute ones
    print( jumps_vector[ 0 ] )    # first element of the path
    while i in fwdpointers and i + fwdpointers[ i ] < len( jumps_vector ):
        print( jumps_vector[ i + fwdpointers[ i ] ] )
        # Get the index of where I jump to
        i += fwdpointers[ i ]
        jumped_to = jumps_vector[ i ]
EDIT: I think the iterative version is more readable:
results = {}
backpointers = {}

def jumps_iter():
    results[ 0 ] = 0
    backpointers[ 0 ] = -1
    for i in range( len( jumps_vector ) ):
        for j in range( 1, jumps_vector[ i ] + 1 ):
            if ( i + j ) in results:
                results[ i + j ] = min( results[ i ] + 1, results[ i + j ] )
                if results[ i + j ] == results[ i ] + 1:
                    # Update where I come from
                    backpointers[ i + j ] = i
            elif i + j < len( jumps_vector ):
                results[ i + j ] = results[ i ] + 1
                # Set where I come from
                backpointers[ i + j ] = i
    return results[ len( jumps_vector ) - 1 ]
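Note that jumps_iter() has to be called once so that results and backpointers are filled before the path is reconstructed; a minimal driver, assuming the definitions above:
# Run the iterative DP once to populate results and backpointers
min_jumps_iter = jumps_iter()
print( min_jumps_iter )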
And the postprocessing:
i = len( jumps_vector ) - 1
print( jumps_vector[ len( jumps_vector ) - 1 ], end = " " )
while backpointers[ i ] >= 0:
    print( jumps_vector[ backpointers[ i ] ], end = " " )
    i = backpointers[ i ]
print()

Paste multi line code in elm-repl

I'm just trying to evaluate some expressions in elm-repl, but I don't know how to paste them in.
Something like:
List.map
    (\l ->
        li []
            [ span [ class "position filled" ]
                []
            ]
    )
    [ 1, 2, 3 ]
You can span multiple lines in Elm REPL by ending each line with a backslash (\) character:
List.map \
    (\l -> \
        li [] \
            [ span [ class "position filled" ] \
                [] \
            ] \
    ) \
    [ 1, 2, 3 ]

Tensorflow tfcompile: fetching gradients

I created a very simple TensorFlow model from which I fetch gradients:
import tensorflow as tf

# tf Graph Input
X = tf.placeholder(tf.float32, [1, 2], name="X")
Y = tf.placeholder(tf.float32, [1, 2], name="Y")

# Model parameter variables
W = tf.Variable([[1.0, 2.0], [3.0, 4.0]], name="weight")
B = tf.Variable([[5.0, 6.0]], name="bias")

# Construct a multivariate linear model
matmul = tf.matmul(X, W, name="matrixMul")
pred = tf.add(matmul, B, name="addition")

# Mean squared error
cost = tf.reduce_sum(tf.pow(pred - Y, 2) / 2)

# Fetch gradients
grads = tf.gradients(cost, [W, B])
I exported this graph into a protobuf and now I use tfcompile for AOT compilation. I want to use the compiled graph in a C++ program and fetch the computed gradients.
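For context, the export step is not shown above; a minimal sketch of one way to serialize the graph for tfcompile, assuming TensorFlow 1.x and a hypothetical output file name graph.pb:
# Hypothetical export step: write the GraphDef as a binary protobuf
# that tfcompile can consume (the file name "graph.pb" is an assumption).
tf.train.write_graph(tf.get_default_graph().as_graph_def(), '.', 'graph.pb', as_text=False)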
The config file for tfcompile looks like:
feed {
  id { node_name: "X" }
  shape {
    dim { size: 1 }
    dim { size: 2 }
  }
  name: "x"
}
feed {
  id { node_name: "Y" }
  shape {
    dim { size: 1 }
    dim { size: 2 }
  }
  name: "y"
}
feed {
  id { node_name: "weight" }
  shape {
    dim { size: 2 }
    dim { size: 2 }
  }
  name: "w"
}
feed {
  id { node_name: "bias" }
  shape {
    dim { size: 1 }
    dim { size: 2 }
  }
  name: "b"
}
fetch {
  id { node_name: "addition" }
  name: "prediction"
}
fetch {
  id { node_name: "gradients/matrixMul_grad/MatMul_1" }
  name: "weight_grad"
}
fetch {
  id { node_name: "gradients/addition_grad/Reshape" }
  name: "bias_grad"
}
Finally I run this C++ code:
obj.set_arg_x_data(x.data());
obj.set_arg_y_data(y.data());
obj.set_arg_w_data(w.data());
obj.set_arg_b_data(b.data());
obj.Run();
std::cout << "result_prediction =" << std::endl << obj.result_prediction(0,0) << " " << obj.result_prediction(0,1) << std::endl;
std::cout << "result_weight_grad =" << std::endl << obj.result_weight_grad(0,0) << " " << obj.result_weight_grad(0,1) << " " << obj.result_weight_grad(1,0) << " " << obj.result_weight_grad(1,1) << std::endl;
std::cout << "result_bias_grad =" << std::endl << obj.result_bias_grad(0,0) << " " << obj.result_bias_grad(0,1) << std::endl;
For result_prediction and result_bias_grad I get the expected values. Only for result_weight_grad do I get 0,0,0,0.
Maybe I am fetching the wrong node there:
fetch {
  id { node_name: "gradients/matrixMul_grad/MatMul_1" }
  name: "weight_grad"
}
Has somebody already tried to fetch computed gradients? TensorFlow only offers examples where tfcompile is used for prediction.