Indentation of boxes in Format.fprintf - formatting

Please consider the function f:
open Format
let rec f i = match i with
| x when x <= 0 -> ()
| i ->
pp_open_hovbox std_formatter 2;
printf "This is line %d#." i;
f (i-1);
printf "This is line %d#." i;
close_box ();
()
It recursively opens hovboxes and prints something, followed by a newline hint (#.). When I call the f 3, i obtain the following output:
This is line 3
This is line 2
This is line 1
This is line 1
This is line 2
This is line 3
but I expected:
This is line 3
This is line 2
This is line 1
This is line 1
This is line 2
This is line 3
Can you explain why I obtain the first output and what I need to change to obtain the second one?

#. is not a newline hint, it is equivalent to print_newline which calls print_flush which closes all opened boxes and follows by a new line.
If you want to print line by line with Format you should open a vertical box with open_vbox and use print_cut ("#,") whenever you want to output a new line.

Instead of using #. you should use #\n specificator. The former, will flush the formatter and output a hard newline, actually breaking your pretty printing. It is intended to be used at the end of document, and, since it is not actually composable, I would warn against using it at all.
With #\n, you will get an output that is much closer to what you're expecting:
This is line 3
This is line 2
This is line 1
This is line 1
This is line 2
This is line 3
The same output, by the way, can be obtained by using vbox and emitting #; good break hints, that is better.

Related

finding if a line starts with a specified string in a clob and then extract

I have a CLOB column that I want to search for a line that starts with '305' then extract something from that line, some of my rows will have multiple lines that start with '305' or '305 somewhere in the entire cell, so I'd only want to find the first line where it starts with '305' the entire cell content is split into lines like this
301|10500000908|
302|20171021|20171104|
303|00001|8306.7|
302|20171008|20171020|
303|00001|13174.5|
302|20170704|20171007|
303|00001|2508.7|
302|20170419|20170703|
303|00001|6962.9|
302|20170330|20170418|
303|00001|7628.2|
302|20170305|20170329|
--- my instr(dbms_lob.substr(flow_data, 4000, 1 ),'305|', 1, 1) keeps finding this line
303|00001|8489.1|
302|20170120|20170304|
303|00001|1997.9|
302|20161021|20170119|
303|00001|12359.8|
302|20160722|20161020|
303|00001|7354.0|
302|20160516|20160721|
303|00001|26.4|
304|20171105|
305|00001|5936.1|
--- i want to find this line and then extract the '5936.1' from it
304|20171021|
305|00001|5710.4|
304|20171008|
305|00001|5163.1|
304|20170704|
304|20170419|
305|00001|7390.8|
304|20170330|
305|00001|7363.2|
304|20170305|
305|00001|7181.4|
304|20170120|
305|00001|9200.2|
304|20161021|
305|00001|4791.3|
305|00001|2877.5|
304|20160516|
305|00001|4116.9|
306|0393|20160511|
307|SUPP|20160511|
310|A|20160511|
311|E|20160516|
when I use instr(dbms_lob.substr(flow_data, 4000, 1 ),'305|', 1, 1) it keeps finding the wrong line. by the way there are no gaps between the lines, I inserted them to keep the text separated.
Thanks all
Mac
If I follow you correctly, you can use regexp_substr():
select regexp_substr(flow_data, '^305\|[^|]*\|([^|]*)', 1, 1, 'm', 1) as val
from t
Argument breakdown:
flow_data: the value to search (CLOBs are allowed)
'^305\|[^|]*\|([^|]*)': the regex. We search for 305 at the beginning of a line, and capture the third value in the CSV list
1: start the search at the beginning of source string
1: return the first match
m - multiline mode : ^ matches at the begin of each line
1: return the first captured part of the match

SAS - # symbol in the INPUT statement

I have the following program, but don't understand what # symbol in the end of the INPUT lines does:
data colors;
input #1 Var1 $ #8 Var2 $ #;
input #1 Var3 $ #8 Var4 $ #;
datalines;
RED ORANGE YELLOW GREEN
BLUE INDIGO PURPLE VIOLET
CYAN WHOTE FICSIA BLACK
GRAY BROWN PINK MAGENTA
run;
proc print data=colors;
run;
Output produced without # in the end of the INPUT line is different from the ouput with #.
Can you please clarify what does # in the end of the 2nd and 3rd INPUT lines do?
# at the end of an input statement means, do not advance the line pointer after semicolon. ## means, do not advance the line pointer after the run statement either.
Normally an input statement has an implicit advance the line pointer one after semicolon. So:
data want;
input a b;
datalines;
1 2 3 4
5 6 7 8
run;
proc print data=want;
run;
Will return
1 2
5 6
If you want to read 3 4 into another line, then, you might do something like:
data want;
input a b #;
output;
input a b;
datalines;
1 2 3 4
5 6 7 8
run;
proc print data=want;
run;
Which gives
1 2
3 4
5 6
7 8
Similarly you could simply write
data want;
input a b ##;
datalines;
1 2 3 4
5 6 7 8
run;
proc print data=want;
run;
To get the same result - ## would hold the line pointer even across the run statement. (It still would advance once it hit the end of the line.)
In Summary: I think you probably don't want the trailing # in this case. The Input statements do not seem fitting for the data you are reading. With the trailing #, you are reading the same data into var1 and var3, and the same data into var2 and var4, because it is reading the same line twice. Either way, you are not reading in what the data appears to be. You would be better off with:
input Var1 $ Var2 $ #;
input Var3 $ Var4 $;
Or, more simply:
input Var1 $ Var2 $ Var3 $ Var4 $;
Official details from the SAS support site, annotated:
http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000146292.htm
Using Line-Hold Specifiers
Line-hold specifiers keep the pointer on
the current input record when
a data record is read by more than one INPUT statement (trailing #)
Use a single trailing # to allow the next INPUT statement to read from the same record.
Normally, each INPUT statement in a DATA step reads a new data record
into the input buffer. When you use a trailing #, the following
occurs:
The pointer position does not change.
No new record is read into the input buffer.
The next INPUT statement for the same iteration of the DATA step continues to read the same record rather than a new one.
SAS releases a record held by a trailing # when
a null INPUT statement executes:
input;
an INPUT statement without a trailing # executes
the next iteration of the DATA step begins.

Header and repeating time information removal from a GPS TEC rinex file

I have a rinex file and is shown here..an image showing the first part of rinex file
http://imageshack.us/photo/my-images/593/65961409.jpg
The data (AOPR Rinex file) is downloaded from the site after entering a year and a day.
http://www.naic.edu/aisr/GPSTEC/gpstec.html
I want to open this file as a matrix in matlab for further processing..After the end of header at the 42nd line the time information is on 43 rd line. Then data starts. But time information is coming again after some rows say 64 the line, which should be discarded. Header should also be discarded. Also the last column is coming below the first column as a second row which should be transferred to the last column. Totally there are 55700 rows. Kindly help me with this.
I suspect the last column appearing on the line below it is just an artifact of how large the window of your text reader is...
For the rest, I think a trial-and-error loop is in place here:
fid = fopen('test.txt','r');
C = {};
while ~feof(fid)
% read lines with dictated format.
D = textscan(fid, '%d %d %d %d');
% this will fail on headerlines, empty lines, etc.
if isempty(D{1})
% in those cases, advance the file pointer by one line
fgetl(fid);
else
% if that's not the case, save the lines thus read
C = [C;D]; %#ok
end
end
fclose(fid);
% Post-process: concatenate all sub-arrays into one
C = arrayfun(#(ii) cat(1, C{:,ii}), 1:size(C,2), 'UniformOutput', false);
This works, at least with my test.txt:
header
random
garbage
1 2 3 4
4 5 6 7
4 6 7 8
more random garbage
2 5 6 7
5 6 7 8
8 6 3 7
I suspect the last column appearing on the line below it is just an artifact of how large >the window of your text reader is...
For the rest, I think a trial-and-error loop is in place here
Dear Rody I don't have any matlab background and just a beginner. It is actually a Rinex file..with 2780 epochs and 6 observables with 30 satellite values..Decoding it in matlab is tough. That is the problem. You can read a sample code at
http://web.ics.purdue.edu/~tdauterm/EAS591/Lab7/read_rinexo.m
But the problem is that the observables are six and there only 5 in the m-file which also is not in the correct order. I need C1 P2 L1 L2 S1 S2...but the code at the link gives L1 L2 C1 P1 P2. :( Can you just correct that..Then it will be a great help..

gnuplot store one number from data file into variable

OSX v10.6.8 and Gnuplot v4.4
I have a data file with 8 columns. I would like to take the first value from the 6th column and make it the title. Here's what I have so far:
#m1 m2 q taua taue K avgPeriodRatio time
#1 2 3 4 5 6 7 8
K = #read in data here
graph(n) = sprintf("K=%.2e",n)
set term aqua enhanced font "Times-Roman,18"
plot file using 1:3 title graph(K)
And here is what the first few rows of my data file looks like:
1.00e-07 1.00e-07 1.00e+00 1.00e+05 1.00e+04 1.00e+01 1.310 12070.00
1.11e-06 1.00e-07 9.02e-02 1.00e+05 1.00e+04 1.00e+01 1.310 12070.00
2.12e-06 1.00e-07 4.72e-02 1.00e+05 1.00e+04 1.00e+01 1.310 12070.00
3.13e-06 1.00e-07 3.20e-02 1.00e+05 1.00e+04 1.00e+01 1.310 12090.00
I don't know how to correctly read in the data or if this is even the right way to go about this.
EDIT #1
Ok, thanks to mgilson I now have
#m1 m2 q taua taue K avgPeriodRatio time
#1 2 3 4 5 6 7 8
set term aqua enhanced font "Times-Roman,18"
K = "`head -1 datafile | awk '{print $6}'`"
print K+0
graph(n) = sprintf("K=%.2e",n)
plot file using 1:3 title graph(K)
but I get the error: Non-numeric string found where a numeric expression was expected
EDIT #2
file = "testPlot.txt"
K = "`head -1 file | awk '{print $6}'`"
K=K+0 #Cast K to a floating point number #this is line 9
graph(n) = sprintf("K=%.2e",n)
plot file using 1:3 title graph(K)
This gives the error--> head: file: No such file or directory
"testPlot.gnu", line 9: Non-numeric string found where a numeric expression was expected
You have a few options...
FIRST OPTION:
use columnheader
plot file using 1:3 title columnheader(6)
I haven't tested it, but this may prevent the first row from actually being plotted.
SECOND OPTION:
use an external utility to get the title:
TITLE="`head -1 datafile | awk '{print $6}'`"
plot 'datafile' using 1:3 title TITLE
If the variable is numeric, and you want to reformat it, in gnuplot, you can cast strings to a numeric type (integer/float) by adding 0 to them (e.g).
print "36.5"+0
Then you can format it with sprintf or gprintf as you're already doing.
It's weird that there is no float function. (int will work if you want to cast to an integer).
EDIT
The script below worked for me (when I pasted your example data into a file called "datafile"):
K = "`head -1 datafile | awk '{print $6}'`"
K=K+0 #Cast K to a floating point number
graph(n) = sprintf("K=%.2e",n)
plot "datafile" using 1:3 title graph(K)
EDIT 2 (addresses comments below)
To expand a variable in backtics, you'll need macros:
set macro
file="mydatafile.txt"
#THE ORDER OF QUOTES (' and ") IS CRUCIAL HERE.
cmd='"`head -1 ' . file . ' | awk ''{print $6}''`"'
# . is string concatenation. (this string has 3 pieces)
# to get a single quote inside a single quoted string
# you need to double. e.g. 'a''b' yields the string a'b
data=#cmd
To address your question 2, it is a good idea to familiarize yourself with shell utilities -- sed and awk can both do it. I'll show a combination of head/tail:
cmd='"`head -2 ' . file . ' | tail -1 | awk ''{print $6}''`"'
should work.
EDIT 3
I recently learned that in gnuplot, system is a function as well as a command. To do the above without all the backtic gymnastics,
data=system("head -1 " . file . " | awk '{print $6}'")
Wow, much better.
This is a very old question, but here's a nice way to get access to a single value anywhere in your data file and save it as a gnuplot-accessible variable:
set term unknown #This terminal will not attempt to plot anything
plot 'myfile.dat' index 0 every 1:1:0:0:0:0 u (var=$1):1
The index number allows you to address a particular dataset (separated by two carriage returns), while every allows you to specify a particular line.
The colon-separated numbers after every should be of the form 1:1:<line_number>:<block_number>:<line_number>:<block_number>, where the line number is the line with the the block (starting from 0), and the block number is the number of the block (separated by a single carriage return, again starting from 0). The first and second numbers say plot every 1 lines and every one data block, and the third and fourth say start from line <line_number> and block <block_number>. The fifth and sixth say where to stop. This allows you to select a single line anywhere in your data file.
The last part of the plot command assigns the value in a particular column (in this case, column 1) to your variable (var). There needs to be two values to a plot command, so I chose column 1 to plot against my variable assignment statement.
Here is a less 'awk'-ward solution which assigns the value from the first row and 6th column of the file 'Data.txt' to the variable x16.
set table
# Syntax: u 0:($0==RowIndex?(VariableName=$ColumnIndex):$ColumnIndex)
# RowIndex starts with 0, ColumnIndex starts with 1
# 'u' is an abbreviation for the 'using' modifier
plot 'Data.txt' u 0:($0==0?(x16=$6):$6)
unset table
A more general example for storing several values is given below:
# Load data from file to variable
# Gnuplot can only access the data via the "plot" command
set table
# Syntax: u 0:($0==RowIndex?(VariableName=$ColumnIndex):$ColumnIndex)
# RowIndex starts with 0, ColumnIndex starts with 1
# 'u' is an abbreviation for the 'using' modifier
# Example: Assign all values according to: xij = Data33[i,j]; i,j = 1,2,3
plot 'Data33.txt' u 0:($0==0?(x11=$1):$1),\
'' u 0:($0==0?(x12=$2):$2),\
'' u 0:($0==0?(x13=$3):$3),\
'' u 0:($0==1?(x21=$1):$1),\
'' u 0:($0==1?(x22=$2):$2),\
'' u 0:($0==1?(x23=$3):$3),\
'' u 0:($0==2?(x31=$1):$1),\
'' u 0:($0==2?(x32=$2):$2),\
'' u 0:($0==2?(x33=$3):$3)
unset table
print x11, x12, x13 # Data from first row
print x21, x22, x23 # Data from second row
print x31, x32, x33 # Data from third row

Do .. While Loop/Textfile/Operation Problem

Hi I have a problem with the following code:
int skp = 1;
do{
file.seekp(skp);
file>>s;
cout<<s;
stats[s]++;
skp++;
skp++;
}while(skp <= 10);
The Textfile has the following:
0
1
2
3
0
1
0
1
0
What I want this programming to do is start from reading the second number which it does, then skip one read next, skip one read the next etc. etc. what it's doing is read the second number which is good, then reads it again for 2 times, then read the next number for 3 times and the next for 3 times. So the output i receive from the above textfile is
1112223330.
Can any one help me please!
Thank you!
That's because your lines are separated by line feeds (actually CR and LF). Also, file >> s will skip leading white space, so you end up with
<CR><LF>1
<LF>1
1
All of which result in s being 1.
The same is repeated for 2, 3 and so on.
Forget yout seekp() and simply use
while (file.good()) {
file >> s; // skip line
if (!file.good()) break;
file >> s;
cout << s;
stats[s]++;
}