How to use the PLY module to parse the two-line input "1+1 \n 2+2" and output 2 and 4 respectively - yacc

I am parsing "1+1 \n 2+2" with PLY. I think these are two unrelated statements, but PLY reduces them together. How can I make it treat them independently?
def p_statement_expr(p):
    '''statement : expression'''
    print p[1]

def p_expr_num(p):
    '''expression : NUMBER'''
    p[0] = p[1]

if "__main__" == __name__:
    parser = yacc.yacc(tabmodule="parser_main")
    import time
    t = time.time()
    for i in range(1):
        result = parser.parse("1+1 \n 2+2", debug=debug)
    # print time.time() - t
    # print result
PLY: PARSE DEBUG START
State  : 0
Stack  : . LexToken(NUMBER,1,1,0)
Action : Shift and goto state 3

State  : 3
Stack  : NUMBER . LexToken(ADD,'+',1,1)
Action : Reduce rule [expression -> NUMBER] with [1] and goto state 5
Result : (1)

State  : 5
Stack  : expression . LexToken(ADD,'+',1,1)
Action : Shift and goto state 9

State  : 9
Stack  : expression ADD . LexToken(NUMBER,1,1,2)
Action : Shift and goto state 3

State  : 3
Stack  : expression ADD NUMBER . LexToken(NUMBER,2,2,6)
ERROR  : Error : expression ADD NUMBER . LexToken(NUMBER,2,2,6)
The error is reported when the parser reaches 2+2. How can I implement multi-line statement execution, so that each subsequent statement is executed automatically after the previous one?

PLY has not done anything with the second expression. Your grammar matches exactly one statement, assuming you are showing all of it. PLY expects the input to terminate at that point, but it doesn't, so PLY complains about an unexpected number.
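A common fix (a sketch of my own, not code from the question) is to make the grammar's start symbol accept a *list* of statements instead of exactly one. The self-contained example below assumes PLY is installed and invents a tiny lexer (tokens NUMBER and ADD) plus an addition rule, since the question only shows a fragment:

```python
import ply.lex as lex
import ply.yacc as yacc

# --- lexer (hypothetical: the question does not show its token rules) ---
tokens = ('NUMBER', 'ADD')

t_ADD = r'\+'
t_ignore = ' \t\n'   # newlines are plain whitespace between statements

def t_NUMBER(t):
    r'\d+'
    t.value = int(t.value)
    return t

def t_error(t):
    t.lexer.skip(1)

# --- grammar: the start symbol is now a list of statements ---
def p_program_multi(p):
    '''program : program statement'''
    p[0] = p[1] + [p[2]]

def p_program_single(p):
    '''program : statement'''
    p[0] = [p[1]]

def p_statement_expr(p):
    '''statement : expression'''
    print(p[1])
    p[0] = p[1]

def p_expr_add(p):
    '''expression : expression ADD NUMBER'''
    p[0] = p[1] + p[3]

def p_expr_num(p):
    '''expression : NUMBER'''
    p[0] = p[1]

def p_error(p):
    print("Syntax error at", p)

lexer = lex.lex()
parser = yacc.yacc(debug=False, write_tables=False)
results = parser.parse("1+1 \n 2+2", lexer=lexer)   # prints 2, then 4
```

Each statement is reduced and printed as soon as it completes, so the second line no longer triggers a syntax error.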

Related

Iterate over the tokens in the doc contains a dot in front a number

It looks like the output is not what I expected. Is this by design, or a bug in my program?
doc = nlp(
    "Line 1 50%. "
    "Line 2 40% end space and dot ."  # try comment
    # "Line 2 40% end space and dot."  # try comment
    "20% at line 3 where Line 2 end with or without space"
)
# Iterate over the tokens in the doc
for token in doc:
    # Check if the token resembles a number
    if token.like_num:
        # Get the next token in the document
        next_token = doc[token.i + 1]
        # Check if the next token's text equals "%"
        if next_token.text == "%":
            print("Percentage found:", token.text)
Try doc.text and see the reason for what you're getting:
doc.text
'Line 1 50%. Line 2 40% end space and dot .20% at line 3 where Line 2 end with or without space'
Tip: you have all your strings concatenated into one.
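The pitfall here is plain Python, not spaCy: adjacent string literals are implicitly concatenated at compile time, with no separator inserted. A minimal sketch:

```python
# Two adjacent string literals become ONE string before spaCy ever sees
# the text -- note there is no space between "." and "20%":
text = ("Line 2 40% end space and dot ."
        "20% at line 3")
print(text)   # Line 2 40% end space and dot .20% at line 3
```

That missing space is why the tokenizer glues ".20%" together and the "%" check fails for that line.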
To achieve what you want, feed your docs as a list of separate strings:
docs = nlp.pipe(["Line 1 50%. ",
                 "Line 2 40% end space and dot .",
                 "20% at line 3 where Line 2 end with or without space"])
for doc in docs:
    for token in doc:
        if token.like_num and doc[token.i + 1].text == "%":
            print("Percentage found:", token)
Percentage found: 50
Percentage found: 40
Percentage found: 20

Indentation of boxes in Format.fprintf

Please consider the function f:
open Format
let rec f i = match i with
  | x when x <= 0 -> ()
  | i ->
    pp_open_hovbox std_formatter 2;
    printf "This is line %d@." i;
    f (i-1);
    printf "This is line %d@." i;
    close_box ();
    ()
It recursively opens hovboxes and prints something, followed by a newline hint (@.). When I call f 3, I obtain the following output:
This is line 3
This is line 2
This is line 1
This is line 1
This is line 2
This is line 3
but I expected:
This is line 3
  This is line 2
    This is line 1
    This is line 1
  This is line 2
This is line 3
Can you explain why I obtain the first output and what I need to change to obtain the second one?
@. is not a newline hint; it is equivalent to print_newline, which calls print_flush, which closes all open boxes and then outputs a newline.
If you want to print line by line with Format, you should open a vertical box with open_vbox and use print_cut (the @, specifier) whenever you want to output a new line.
Instead of using @. you should use the @\n specifier. The former flushes the formatter and outputs a hard newline, actually breaking your pretty-printing. It is intended to be used at the end of a document and, since it is not composable, I would warn against using it at all.
With @\n, you will get output that is much closer to what you're expecting:
This is line 3
  This is line 2
    This is line 1
    This is line 1
  This is line 2
This is line 3
The same output, by the way, can be obtained by using a vbox and emitting @; good break hints, which is better.

SAS IML constraining a called function

How do I properly constrain this minimizing function?
Mincvf(cvf1) should minimize cvf1 with respect to h, and I want to require h >= 0.4.
proc iml;
EDIT kirjasto.basfraaka var "open";
read all var "open" into cp;
p=cp[1:150];
conh={0.4 . .,. . .,. . .};
m=nrow(p);
m2=38;
pi=constant("pi");
e=constant("e");

start Kmod(x,h,pi,e);
    k=1/(h#(2#pi)##(1/2))#e##(-x##2/(2#h##2));
    return (k);
finish;

start mhatx2(m2,newp,h,pi,e);
    t5=j(m2,1); /*mhatx omit x=t*/
    do x=1 to m2;
        i=T(1:m2);
        temp1=x-i;
        ue=Kmod(temp1,h,pi,e)#newp[i];
        le=Kmod(temp1,h,pi,e);
        t5[x]=(sum(ue)-ue[x])/(sum(le)-le[x]);
    end;
    return (t5);
finish;

start CVF1(h) global (newp,pi,e,m2);
    cv3=j(m2,1);
    cv3=1/m2#sum((newp-mhatx2(m2,newp,h,pi,e))##2);
    return(cv3);
finish;

start mincvf(CVF1);
    optn={0,0};
    init=1;
    call nlpqn(rc, res,"CVF1",init) blc="conh";
    return (res);
finish;

start outer(p,m) global(newp);
    wl=38;      /*window length*/
    m1=m-wl;    /*last window begins at m-wl*/
    newp=j(wl,1);
    hyi=j(m1,1);
    do x=1 to m1;
        we=x+wl-1;  /*window end*/
        w=T(x:we);  /*window*/
        newp=p[w];
        hyi[x]=mincvf(CVF1);
    end;
    return (hyi);
finish;

wl=38;      /*window length*/
m1=m-wl;    /*last window begins at m-wl*/
time=T(1:m1);
ttt=j(m1,1);
ttt=outer(p,m);
print time ttt p;
However I get lots of:
WARNING: Division by zero, result set to missing value.
count : number of occurrences is 2
operation : / at line 1622 column 22
operands : _TEM1003, _TEM1006
_TEM1003 1 row 1 col (numeric)
.
_TEM1006 1 row 1 col (numeric)
0
statement : ASSIGN at line 1622 column 1
traceback : module MHATX2 at line 1622 column 1
module CVF1 at line 1629 column 1
module MINCVF at line 1634 column 1
module OUTER at line 1651 column 1
This happens because of loss of precision as h approaches 0 and "le" in "mhatx2" approaches 0. At h=0.4, le is ~0.08, so I just picked that artificially as a lower bound, which is still precise enough.
Also, the output of the "outer" subroutine, ttt, which is the vector of h values fitted for the rolling windows, still contains values below the 0.4 constraint. Why?
I have solved loss of precision issues previously by simply applying a multiplication transformation to the input... Multiply it by 10,000 or whatever is necessary, and then revert the transformation at the end.
Not sure if it will work in your situation, but it may be worth a shot.
This way it works; I had to put both the option vector and the constraint matrix into the argument parentheses.
Now I get no division-by-zero warning. The points that were previously mis-specified due to loss of precision are now not specified at all and the value is substituted by 0.14, but the error is not likely to be big:
start mincvf(CVF1);
    con={0.14 . .,. . .,. . .};
    optn={0,0};
    init=1;
    call nlpqn(rc, res,"CVF1",init,optn,con);
    return (res);
finish;
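For readers without SAS/IML, the same idea can be sketched in plain Python (my own illustration, not the poster's code; the names kmod, cv_score and min_cv are invented). It mirrors the Kmod kernel, the leave-one-out estimate in mhatx2, and the CVF1 criterion, and it enforces h >= 0.4 by construction, simply never evaluating candidates below the bound:

```python
import math

def kmod(x, h):
    # Gaussian kernel with bandwidth h (sketch of the IML Kmod module)
    return math.exp(-x * x / (2.0 * h * h)) / (h * math.sqrt(2.0 * math.pi))

def cv_score(prices, h):
    # Leave-one-out Nadaraya-Watson cross-validation score
    # (sketch of CVF1 built on mhatx2)
    m = len(prices)
    total = 0.0
    for x in range(m):
        num = sum(kmod(x - i, h) * prices[i] for i in range(m) if i != x)
        den = sum(kmod(x - i, h) for i in range(m) if i != x)
        total += (prices[x] - num / den) ** 2
    return total / m

def min_cv(prices, h_min=0.4, h_max=5.0, steps=200):
    # Grid search restricted to h >= h_min: the constraint is enforced
    # by construction, so no value below the bound can ever be returned.
    best_h, best = h_min, float("inf")
    for k in range(steps + 1):
        h = h_min + (h_max - h_min) * k / steps
        s = cv_score(prices, h)
        if s < best:
            best_h, best = h, s
    return best_h
```

Because the grid starts at h_min, the returned bandwidth can never fall below the constraint, which is what the constraint matrix passed to NLPQN is meant to guarantee.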

gnuplot store one number from data file into variable

OSX v10.6.8 and Gnuplot v4.4
I have a data file with 8 columns. I would like to take the first value from the 6th column and make it the title. Here's what I have so far:
#m1 m2 q taua taue K avgPeriodRatio time
#1 2 3 4 5 6 7 8
K = #read in data here
graph(n) = sprintf("K=%.2e",n)
set term aqua enhanced font "Times-Roman,18"
plot file using 1:3 title graph(K)
And here is what the first few rows of my data file looks like:
1.00e-07 1.00e-07 1.00e+00 1.00e+05 1.00e+04 1.00e+01 1.310 12070.00
1.11e-06 1.00e-07 9.02e-02 1.00e+05 1.00e+04 1.00e+01 1.310 12070.00
2.12e-06 1.00e-07 4.72e-02 1.00e+05 1.00e+04 1.00e+01 1.310 12070.00
3.13e-06 1.00e-07 3.20e-02 1.00e+05 1.00e+04 1.00e+01 1.310 12090.00
I don't know how to correctly read in the data or if this is even the right way to go about this.
EDIT #1
Ok, thanks to mgilson I now have
#m1 m2 q taua taue K avgPeriodRatio time
#1 2 3 4 5 6 7 8
set term aqua enhanced font "Times-Roman,18"
K = "`head -1 datafile | awk '{print $6}'`"
print K+0
graph(n) = sprintf("K=%.2e",n)
plot file using 1:3 title graph(K)
but I get the error: Non-numeric string found where a numeric expression was expected
EDIT #2
file = "testPlot.txt"
K = "`head -1 file | awk '{print $6}'`"
K=K+0 #Cast K to a floating point number #this is line 9
graph(n) = sprintf("K=%.2e",n)
plot file using 1:3 title graph(K)
This gives the error--> head: file: No such file or directory
"testPlot.gnu", line 9: Non-numeric string found where a numeric expression was expected
You have a few options...
FIRST OPTION:
use columnheader
plot file using 1:3 title columnheader(6)
I haven't tested it, but this may prevent the first row from actually being plotted.
SECOND OPTION:
use an external utility to get the title:
TITLE="`head -1 datafile | awk '{print $6}'`"
plot 'datafile' using 1:3 title TITLE
If the variable is numeric and you want to reformat it in gnuplot, you can cast strings to a numeric type (integer/float) by adding 0 to them, e.g.:
print "36.5"+0
Then you can format it with sprintf or gprintf as you're already doing.
It's weird that there is no float function. (int will work if you want to cast to an integer).
EDIT
The script below worked for me (when I pasted your example data into a file called "datafile"):
K = "`head -1 datafile | awk '{print $6}'`"
K=K+0 #Cast K to a floating point number
graph(n) = sprintf("K=%.2e",n)
plot "datafile" using 1:3 title graph(K)
EDIT 2 (addresses comments below)
To expand a variable in backtics, you'll need macros:
set macro
file="mydatafile.txt"
#THE ORDER OF QUOTES (' and ") IS CRUCIAL HERE.
cmd='"`head -1 ' . file . ' | awk ''{print $6}''`"'
# . is string concatenation. (this string has 3 pieces)
# to get a single quote inside a single quoted string
# you need to double. e.g. 'a''b' yields the string a'b
data=@cmd
To address your question 2, it is a good idea to familiarize yourself with shell utilities -- sed and awk can both do it. I'll show a combination of head/tail:
cmd='"`head -2 ' . file . ' | tail -1 | awk ''{print $6}''`"'
should work.
EDIT 3
I recently learned that in gnuplot, system is a function as well as a command. To do the above without all the backtic gymnastics,
data=system("head -1 " . file . " | awk '{print $6}'")
Wow, much better.
This is a very old question, but here's a nice way to get access to a single value anywhere in your data file and save it as a gnuplot-accessible variable:
set term unknown #This terminal will not attempt to plot anything
plot 'myfile.dat' index 0 every 1:1:0:0:0:0 u (var=$1):1
The index number allows you to address a particular dataset (separated by two carriage returns), while every allows you to specify a particular line.
The colon-separated numbers after every should be of the form 1:1:<line_number>:<block_number>:<line_number>:<block_number>, where the line number is the line within the block (starting from 0), and the block number is the number of the block (separated by a single carriage return, again starting from 0). The first and second numbers say plot every 1 lines and every one data block; the third and fourth say start from line <line_number> and block <block_number>; the fifth and sixth say where to stop. This allows you to select a single line anywhere in your data file.
The last part of the plot command assigns the value in a particular column (in this case, column 1) to your variable (var). There needs to be two values to a plot command, so I chose column 1 to plot against my variable assignment statement.
Here is a less 'awk'-ward solution which assigns the value from the first row and 6th column of the file 'Data.txt' to the variable x16.
set table
# Syntax: u 0:($0==RowIndex?(VariableName=$ColumnIndex):$ColumnIndex)
# RowIndex starts with 0, ColumnIndex starts with 1
# 'u' is an abbreviation for the 'using' modifier
plot 'Data.txt' u 0:($0==0?(x16=$6):$6)
unset table
A more general example for storing several values is given below:
# Load data from file to variable
# Gnuplot can only access the data via the "plot" command
set table
# Syntax: u 0:($0==RowIndex?(VariableName=$ColumnIndex):$ColumnIndex)
# RowIndex starts with 0, ColumnIndex starts with 1
# 'u' is an abbreviation for the 'using' modifier
# Example: Assign all values according to: xij = Data33[i,j]; i,j = 1,2,3
plot 'Data33.txt' u 0:($0==0?(x11=$1):$1),\
'' u 0:($0==0?(x12=$2):$2),\
'' u 0:($0==0?(x13=$3):$3),\
'' u 0:($0==1?(x21=$1):$1),\
'' u 0:($0==1?(x22=$2):$2),\
'' u 0:($0==1?(x23=$3):$3),\
'' u 0:($0==2?(x31=$1):$1),\
'' u 0:($0==2?(x32=$2):$2),\
'' u 0:($0==2?(x33=$3):$3)
unset table
print x11, x12, x13 # Data from first row
print x21, x22, x23 # Data from second row
print x31, x32, x33 # Data from third row

"TypeError: bad operand type for unary ~: 'float'" not down to NA (not available)?

I'm trying to filter a pandas data frame. Following @jezrael's answer here, I can use the following to count up the rows I will be removing:
mask = ((analytic_events['section']==2) &
        ~(analytic_events['identifier'].str[0].str.isdigit()))
print (mask.sum())
However when I run this on my data I get the following error:
TypeError                                 Traceback (most recent call last)
<ipython-input> in <module>
      1 mask= ((analytic_events['section']==2) &
----> 2        ~(analytic_events['identifier'].str[0].str.isdigit()))
      3
      4 print (mask.sum())

c:\program files\python37\lib\site-packages\pandas\core\generic.py in __invert__(self)
   1454     def __invert__(self):
   1455         try:
-> 1456             arr = operator.inv(com.values_from_object(self))
   1457             return self.__array_wrap__(arr)
   1458         except Exception:

TypeError: bad operand type for unary ~: 'float'
The accepted wisdom for that error, bad operand type for unary ~: 'float', is that the unary operator encountered an NA value (for example, see this answer).
The problem is that I do not have any such missing data. Here's my analysis. Running
analytic_events[analytic_events['section']==2]['identifier'].str[0].value_counts(dropna=False)
gives the results:
2 1207791
3 39289
1 533
. 56
Or running
analytic_events[analytic_events['section']==2]['identifier'].str[0].str.isdigit().value_counts(dropna=False)
gives the results
True 1247613
False 56
(Note that the amounts above sum to the total number of rows, i.e. there are none missing.)
Using the more direct method suggested in @jezrael's answer below,
analytic_events[analytic_events['section']==2]['identifier'].isnull().sum()
analytic_events[analytic_events['section']==2]['identifier'].str[0].isnull().sum()
both produce the output zero. So there are no NA (not available) values.
Why am I getting the error
TypeError: bad operand type for unary ~: 'float'
from the code at the start of this post?
I believe you need to filter by the first condition first, and only then apply the second test within the filtered values:
m1 = analytic_events['section']==2
mask = ~analytic_events.loc[m1, 'identifier'].str[0].str.isdigit()
print (mask.sum())
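Why filtering first helps can be shown with a small hypothetical reproduction (invented data; column names follow the question). In the failing code the isdigit() mask is built from the whole identifier column before the section==2 condition prunes any rows, so a missing identifier in *another* section still injects a float NaN into the mask, and unary ~ on that mixed object column is what raises the TypeError:

```python
import pandas as pd

# Hypothetical data: the NaN identifier lives in section 1, not section 2,
# yet it still ends up inside the whole-column mask.
df = pd.DataFrame({
    'section':    [2, 2, 1],
    'identifier': ['2abc', '3def', None],
})

digit_mask = df['identifier'].str[0].str.isdigit()
# digit_mask has object dtype holding True, True, NaN -- applying ~ to it
# raises TypeError: bad operand type for unary ~: 'float'

# Filtering first, as in the answer, removes the NaN before ~ is applied:
m1 = df['section'] == 2
mask = ~df.loc[m1, 'identifier'].str[0].str.isdigit()
print(mask.sum())   # 0 -- both section-2 identifiers start with a digit
```

So the error is consistent with there being no NA *within* section 2: the NA only has to exist somewhere in the unfiltered column.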