Fortran read statement reading beyond an end of line - file-io

Do you know whether the following statement is guaranteed to be true by one of the Fortran 90/95/2003 standards?
"Suppose a read statement for a character variable is given a blank line (i.e., one containing only white space and a new-line character). If the format specifier is an asterisk (*), it continues to read the subsequent lines until a non-blank line is found. If the format specifier is '(A)', a blank string is assigned to the character variable."
For example, please look at the following minimal program and input file.
program code:
PROGRAM chk_read
  INTEGER, PARAMETER :: MAXLEN=30
  CHARACTER(len=MAXLEN) :: str1, str2
  str1='minomonta'
  read(*,*) str1
  write(*,'(3A)') 'str1_start|', str1, '|str1_end'
  str2='minomonta'
  read(*,'(A)') str2
  write(*,'(3A)') 'str2_start|', str2, '|str2_end'
END PROGRAM chk_read
input file:
----'input.dat' content is below this line----

yamanakako

kawaguchiko
----'input.dat' content is above this line----
Please note that there are four lines in 'input.dat' and the first and third lines are blank (contain only white spaces and new line characters). If I run the program as
$ ../chk_read < input.dat > output.dat
I get the following output
----'output.dat' content is below this line----
str1_start|yamanakako |str1_end
str2_start| |str2_end
----'output.dat' content is above this line----
The first read statement for the variable 'str1' seems to look at the first line of 'input.dat', find a blank line, move on to the second line, find the character value 'yamanakako', and store it in 'str1'.
In contrast, the second read statement for the variable 'str2' seems to be given the third line, which is blank, and store the blank line in 'str2', without moving on to the fourth line.
I tried compiling the program by Intel Fortran (ifort 12.0.4) and GNU Fortran (gfortran 4.5.0) and got the same result.
Some background on why I am asking: I am writing a subroutine to read a data file that uses a blank line as a separator between data blocks. I want to make sure that the blank line, and only the blank line, is thrown away while reading the data. The code also needs to be standard conforming and portable.
Thanks for your help.

From Fortran 2008 standard draft:
List-directed input/output allows data editing according to the type
of the list item instead of by a format specification. It also allows
data to be free-field, that is, separated by commas (or semicolons) or
blanks.
Then:
The characters in one or more list-directed records constitute a
sequence of values and value separators. The end of a record has the
same effect as a blank character, unless it is within a character
constant. Any sequence of two or more consecutive blanks is treated as
a single blank, unless it is within a character constant.
This implies that in list-directed input a blank record acts only as a value separator, so the read continues across records until it finds the next non-blank value.
With the fmt='(A)' format specifier, each read consumes exactly one record, so a blank line is read into str as blanks. With fmt=*, which selects list-directed input, blank lines are skipped until a non-blank character string is found. To test this, do something like:
PROGRAM chk_read
  INTEGER :: cnt
  INTEGER, PARAMETER :: MAXLEN=30
  CHARACTER(len=MAXLEN) :: str
  cnt=1
  do
    read(*,fmt='(A)',end=100) str
    write(*,'(I1,3A)') cnt, ' str_start|', str, '|str_end'
    cnt=cnt+1
  enddo
  100 continue
END PROGRAM chk_read
$ cat input.dat



yamanakako

kawaguchiko
EOF
Running the program gives this output:
$ a.out < input.dat
1 str_start| |str_end
2 str_start| |str_end
3 str_start| |str_end
4 str_start|yamanakako |str_end
5 str_start| |str_end
6 str_start|kawaguchiko |str_end
On the other hand, if you use list-directed input:
read(*,fmt=*,end=100)str
You end up with this output:
$ a.out < input.dat
1 str_start|yamanakako |str_end
2 str_start|kawaguchiko |str_end
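The difference between the two modes can also be illustrated with a toy model. The sketch below is Ruby, not Fortran, and it is not how any compiler implements input. The class name Records and its two methods are made up for illustration, and quoting, commas, slashes, and repeat counts in list-directed input are deliberately ignored.

```ruby
# Toy model of a file as a list of records (lines). Hypothetical names throughout.
class Records
  def initialize(lines)
    @lines = lines.dup
  end

  # Models read(*,*) str for a single character item:
  # a blank record only separates values, so it is skipped.
  def read_list_directed
    while (line = @lines.shift)
      token = line.split.first
      return token if token
    end
    nil # ran out of records (end of file)
  end

  # Models read(*,'(A)') str: exactly one record is consumed,
  # blank-padded or truncated to the variable length.
  def read_a(len)
    line = @lines.shift
    line && line.ljust(len)[0, len]
  end
end

# The four-line input from the question: records 1 and 3 are blank.
input = Records.new(["", "yamanakako", "   ", "kawaguchiko"])
str1 = input.read_list_directed # skips the blank first record
str2 = input.read_a(30)         # takes the blank third record as-is
puts "str1_start|#{str1}|str1_end"
puts "str2_start|#{str2}|str2_end"
```

For the blank-line-separated data file in the question, this suggests reading every record with '(A)' and testing for an all-blank string yourself, rather than relying on list-directed input to discard the separators.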

This part of the F2008 standard draft probably addresses your question:
10.10.3 List-directed input
7 When the next effective item is of type character, the input form
consists of a possibly delimited sequence of zero or more
rep-chars whose kind type parameter is implied by the kind of the
effective item. Character sequences may be continued from the end of
one record to the beginning of the next record, but the end of record
shall not occur between a doubled apostrophe in an
apostrophe-delimited character sequence, nor between a doubled quote
in a quote-delimited character sequence. The end of the record does
not cause a blank or any other character to become part of the
character sequence. The character sequence may be continued on as many
records as needed. The characters blank, comma, semicolon, and slash
may appear in default, ASCII, or ISO 10646 character sequences.

Related

How is input handled in Brainf***?

I can't really seem to find a standard for this. I know inputs are taken as ASCII values, but are they required to be single characters? If not, how are multi-character inputs handled?
Command line inputs in most (if not all) programming languages are taken a line at a time. When you hit enter into a console after typing a line, the whole line gets sent into the program as a return value from the function you called to get the input.
In brainfuck, you have more control over this: You can get as many characters as you want at a time, and stop when you want to.
A single comma "," reads one byte of input (i.e., one character). If you want to read a string until a newline is met, you can try something like the following code (10 is the ASCII value of newline, which is why each run of "+" and "-" is ten characters long):
[-]>,----------[++++++++++>,----------]<[<]
This leaves in memory an array of non-zero values, bounded by a zero cell at each end, holding the ASCII values of the input characters.
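To see what the snippet leaves on the tape, it can be run through a small interpreter. The following is a minimal sketch in Ruby, not a reference implementation: run_bf is a made-up helper, and it assumes 8-bit wrapping cells, a tape that grows to the right, balanced brackets, a pointer that never moves left of cell 0, and end of input stored as 0 (real interpreters differ on that last point).

```ruby
# Minimal Brainfuck interpreter sketch. Returns the final tape and pointer.
def run_bf(program, input)
  tape = [0]
  ptr = 0
  pc = 0
  bytes = input.bytes

  # Precompute the position of each bracket's partner.
  jump = {}
  stack = []
  program.each_char.with_index do |ch, i|
    if ch == "["
      stack.push(i)
    elsif ch == "]"
      start = stack.pop
      jump[start] = i
      jump[i] = start
    end
  end

  while pc < program.length
    case program[pc]
    when ">"
      ptr += 1
      tape << 0 if ptr == tape.length
    when "<" then ptr -= 1
    when "+" then tape[ptr] = (tape[ptr] + 1) % 256
    when "-" then tape[ptr] = (tape[ptr] - 1) % 256
    when "," then tape[ptr] = bytes.shift || 0 # one byte per comma
    when "." then print tape[ptr].chr
    when "[" then pc = jump[pc] if tape[ptr].zero?
    when "]" then pc = jump[pc] unless tape[ptr].zero?
    end
    pc += 1
  end
  [tape, ptr]
end

tape, ptr = run_bf("[-]>,----------[++++++++++>,----------]<[<]", "hi\n")
p tape # => [0, 104, 105, 0]
p ptr  # => 0
```

With the input "hi" followed by a newline, the tape holds the two character codes between zero cells, and the pointer is back on the leading zero.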

How to read elements from a line in VHDL?

I'm trying to use VHDL to read from a file that can have different formats. I know you're supposed to use the following two lines of code to read a line at a time, then read individual elements in that line.
readline(file, aline);
read(aline, element);
However my question is what will read(aline, element) return into element? What will it return if the line is empty? What will it return if I've used it let's say 5 times and my line only has 4 characters?
The reason I want to know is that if I am reading a file with an arbitrary number of spaces between valid data, how do I parse this valid data?
The file contains ASCII characters separated by arbitrary amounts of white space (any number of spaces, tabs, or new lines). If the line starts with a # that line is a comment and should be ignored.
Outside of these comments, the first part of the file contains characters that are only letters or numbers in combinations of variable size. In other words this:
123 ABC 12ABB3
However, the majority of the file (after a certain number of read words) will be purely numbers of arbitrary length, separated by an arbitrary amount of white space. In other words, the second part of the file is this:
255 0 2245 625 430
2222 33 111111
and I must be able to parse these numbers (and interpret them as such) individually.
As mentioned in the comments, all the read procedures in std.textio and ieee.std_logic_textio skip over leading spaces apart from the character and string versions (because a space is as much a character as any other).
You can test whether a line variable (the buffer) is empty like this:
if L'length > 0 then
where L is your line variable. There is also a set of overloaded read procedures with an extra status output:
procedure read (L : inout LINE;
VALUE: out <type> ;
GOOD : out BOOLEAN);
The extra output - GOOD - is true if the read was successful and false if it wasn't. The advantage of these is that if the read is unsuccessful, the simulation does not stop (as it does with the regular procedures). Also, with the versions in std.textio, if the read is unsuccessful, the read is non-destructive (i.e., whatever you were trying to read remains in the buffer). This is not the case with the versions in ieee.std_logic_textio, however.
If you really do not know what format you are trying to read, you could read the entire line into a string, like this:
variable S : string(1 to <some big number>);
...
readline(F, L);
assert L'length < S'length; -- make sure S is big enough
S := (others => ' '); -- make sure that the previous line is overwritten
if L'length > 0 then
  read(L, S(1 to L'length));
end if;
The line L is now in the string S. You can then write some code to parse it. You may find the type attribute 'value useful. This converts a string to some type, eg
variable I : integer;
...
I := integer'value(S(12 to 14));
would set integer I to the value contained in elements 12 to 14 of string S.
Another approach, as suggested by user1155120 below, is to peek at the values in the buffer, eg
if L'length > 0 then -- check that L isn't empty, otherwise the next line blows up
  if L.all(1) = '#' then
    -- the first character of the line is a '#' so the line must be a comment
  end if;
end if;
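When writing the VHDL parser, it can help to have a reference model to compare its results against. The sketch below writes the parsing rules from the question out in Ruby (not VHDL): lines starting with '#' are dropped and everything else is split on any amount of white space. The method name parse_words and the choice of three header words are assumptions for illustration; the question only says the header is "a certain number" of words.

```ruby
# Reference model of the file format described in the question (hypothetical helper).
def parse_words(text)
  text.each_line
      .reject { |line| line.start_with?("#") } # comment lines are ignored
      .flat_map(&:split)                       # split on spaces, tabs, newlines
end

sample = "# a comment line\n123 ABC   12ABB3\n\n255 0\t2245 625 430\n2222 33 111111\n"
words = parse_words(sample)

header_count = 3 # assumed; matches the "123 ABC 12ABB3" example
header  = words.take(header_count)
numbers = words.drop(header_count).map { |w| Integer(w, 10) }

p header  # => ["123", "ABC", "12ABB3"]
p numbers # => [255, 0, 2245, 625, 430, 2222, 33, 111111]
```

Blank lines need no special handling here because splitting an empty line yields no words, which corresponds to checking L'length > 0 before calling read in VHDL.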

handling strings with \n in plain text e-mail

I have a column in my database that contains a string like this:
"Warning set for 7 days.\nCritical Notice - Last Time Machine backup was 118 days ago at 2012-11-16 20:40:52\nLast Time Machine Destination was FreeAgent GoFlex Drive\n\nDefined Destinations:\nDestination Name: FreeAgent GoFlex Drive\nBackup Path: Not Specified\nLatest Backup: 2012-11-17"
I am displaying this data in an e-mail to users. I have been able to format the field in my HTML e-mails perfectly by doing the following:
simple_format(@servicedata.service_exit_details.gsub('\n', '<br>'))
The above code replaces the "\n" with "<br>" tags and simple_format handles the rest.
My issue arises with how to format it properly in the plain text template. Initially I thought I could just output the column: since it contains "\n", I assumed the plain text would interpret it and all would be well. However, this simply spits out the string with "\n" intact, just as displayed above, rather than creating line breaks as desired.
In an attempt to find a way to parse the string so that the line breaks are acknowledged, I have tried:
@servicedata.service_exit_details.gsub('\n', '"\r\n"')
@servicedata.service_exit_details.gsub('\n', '\r\n')
raw @servicedata.service_exit_details
markdown(@servicedata.service_exit_details, autolinks: false) # with all the necessary markdown setup
simple_format(@servicedata.service_exit_details.html_safe)
none of which worked.
Can anyone tell me what I'm doing wrong or how I can make this work?
What I want is for the plain text to acknowledge the line breaks and format the string as follows:
Warning set for 7 days.
Critical Notice - Last Time Machine backup was 118 days ago at 2012-11-16 20:40:52
Last Time Machine Destination was FreeAgent GoFlex Drive

Defined Destinations:
Destination Name: FreeAgent GoFlex Drive
Backup Path: Not Specified
Latest Backup: 2012-11-17
I see.
You need to differentiate a literal backslash followed by a letter n as a sequence of two characters, and a LF character (a.k.a. newline) that is usually represented as \n.
You also need to distinguish two different kinds of quoting you're using in Ruby: singles and doubles. Single quotes are literal: the only thing that is interpreted in single quotes specially is the sequence \', to escape a single quote, and the sequence \\, which produces a single backslash. Thus, '\n' is a two-character string of a backslash and a letter n.
Double quotes allow for all kinds of weird things in it: you can use interpolation with #{}, and you can insert special characters by escape sequences: so "\n" is a string containing the LF control character.
Now, in your database you seem to have the former (backslash and n), as hinted by two pieces of evidence: the fact that you're seeing literal backslash and n when you print it, and the fact that gsub finds a '\n'. What you need to do is replace the useless backslash-and-n with the actual line separator characters.
@servicedata.service_exit_details.gsub('\n', "\r\n")
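The quoting difference is easy to check in isolation. In this small sketch the variable stored is a made-up stand-in for the database column, shortened from the string in the question:

```ruby
# Single quotes keep the backslash and the letter n as two separate characters,
# which is what the database column appears to contain.
stored = 'Warning set for 7 days.\nCritical Notice'

puts '\n'.length # 2: a backslash and the letter n
puts "\n".length # 1: a single LF character

# Replace each literal backslash-n pair with a real CRLF line break.
plain = stored.gsub('\n', "\r\n")
puts plain
```

The first argument stays in single quotes because it has to match the two literal characters, while the replacement is in double quotes so that Ruby turns the escapes into real CR and LF characters.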

Why does Fortran output have a leading space?

I process lots of output from Fortran programs, and it generally appears that the output from these programs has a leading space, i.e., character column 1 is blank. Consider a short program leading.f:
program leading
print *, "<-- What's that space for?"
end program leading
It has the output
 <-- What's that space for?
(yes, there is one space character at the beginning). My question is contained in the program.
Back in the dinosaur era, when FORTRAN output usually went to a green-bar impact printer, certain characters in the first print column were often interpreted as control codes (line feeds, form feeds, etc). Many programmers learned to explicitly blank column 1 of their output, unless some special effect was intended -- and old habits die hard!
As has been noted in another answer here, and elsewhere, Fortran output had the concept of carriage control. For printers which used carriage control the first character being a blank was necessary for a new line to be started.
Fortran itself deleted the concept of carriage control in Fortran 2003, but for completeness we can see that Fortran still requires list-directed output to have (in most cases) this (default) leading blank (Fortran 2018, 13.10.4 p.13):
Except for new records created by explicit formatting within a defined output procedure or by continuation of delimited character sequences, each output record begins with a blank character.
Namelist formatting has a similar statement.
You can avoid having this leading blank by avoiding using list-directed output:
print '(A)', '<-- No space here'
end
Note that it isn't the print here, but the list-directed output, which is to blame. We see something similar with write:
write (*,*) '<-- Space from the list-directed output'
end
Finally, if we are using internal files we still get a leading blank with list-directed output:
character(len=20) :: internal
write (internal, *) '<-- Leading blank here'
end
(If we then output this internal file with list-directed output we'll see two leading blanks.)

How are error messages in StringTemplate to be interpreted?

I got this error message while using StringTemplate:
line 94:26: unexpected char: ')'
And after about 15 minutes of randomly adding and removing blank lines in my template, and observing how the number in that message changed, I finally isolated the line that caused trouble. It was line #152, position #35.
Is the value after "line " just normally totally wrong, or is there a way of deducing the real line number from that output?
In StringTemplate (ST) 4, it appears that the first number is the line number, within the specific template at issue and not the line number within the .stg file if that's what you're using (which most of us do).
When I'm using vim, this means I need to mentally offset that from the line number of the first line of the template (add them together) to get the actual line number within the .stg file.
The second number in the ST error is the character position within that line of the template. But wait, there's more - you know you love it...
When an error is on the first line of a multi-line template, the position needs a further adjustment: ST elides the starting newline in multi-line templates, so it effectively combines the declaration line (ending in "<<") with the first actual line of the template.
At least with ST-4.0.8, I therefore need to subtract the length of the template declaration line from the character position to get the actual character position.
The first "\n" eliding (for multi-line templates only) also means the line number may appear to be offset by 1, and possibly the character position, for an error on the "first line".
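The mental offsetting described above can be scripted. The sketch below is a hypothetical Ruby helper, not part of StringTemplate. It assumes each template is declared at the start of a line as "name(args) ::= <<", and it follows the "add them together" rule from this answer, so treat the result as a candidate line and check its neighbours, given the possible off-by-one noted above. The sample group text is made up.

```ruby
# Map a template-relative line number from an ST error to a candidate
# line number in the .stg file. Returns nil if the template is not found.
def stg_line_for(stg_text, template_name, reported_line)
  pattern = /\A#{Regexp.escape(template_name)}\s*\(.*\)\s*::=\s*<</
  decl_index = stg_text.lines.index { |line| line =~ pattern }
  return nil unless decl_index

  decl_line = decl_index + 1 # 1-based line of the declaration
  decl_line + reported_line  # assumption: offsets simply add
end

stg = [
  "header(title) ::= <<",
  "<h1><title></h1>",
  ">>",
  "",
  "row(item) ::= <<",
  "<tr>",
  "<td><item.name)></td>",
  "</tr>",
  ">>"
].join("\n")

puts stg_line_for(stg, "row", 2) # 7 under the assumption above
```

Here an error reported on line 2 of the template row maps to line 7 of the sample text, which is the line containing the stray ')'.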
The error should include the filename and template name, so it's enough information for an automated script or tool, but a bit cumbersome for us mere humans.
Good luck.