new-line exists in Rebol / Red for blocks; what about tabs?

I can't find new-tab, whereas there is new-line, so how do you preserve tabs in blocks?
help new-line
USAGE:
NEW-LINE position value
DESCRIPTION:
Sets or clears the new-line marker within a block or paren.
NEW-LINE is a native! value.
ARGUMENTS:
position [block! paren!] "Position to change marker (modified)".
value "Set TRUE for newline".
REFINEMENTS:
/all => Set/clear marker to end of series.
/skip => Set/clear marker periodically to the end of the series.
size [integer!]
RETURNS:
[block! paren!]

There is one newline flag per-cell in arrays ("any-block!"s), which indicates whether or not the molding process should put out a newline before that value.
Indentation is driven only from these flags. Indentation starts at the first newline flag, and further newlines will each align to that level, with an outdent at the end of the block if any newlines/indents occurred.
>> data: [a b c]
>> new-line next data true
>> data
== [a
    b c
]
Note there are 4 "candidate positions" for newlines inside the block [a b c] (e.g. the positions are [* a * b * c *]). Yet there are only three value cells, with a newline marker indicating a desire to output a newline before that cell. Lacking anywhere to put the fourth newline signal, the decision in Rebol2 and Red is to implicitly put the closing bracket on its own line if there were any newline markers processed.
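As a rough illustration of the rule described above, here is a toy model written in Python rather than Rebol or Red. It is not how either interpreter is implemented: the `mold` function, the `(text, newline_before)` pair layout, and the four-space indent are all assumptions made for this sketch.

```python
def mold(cells, indent="    "):
    """Render a flat block from (text, newline_before) pairs.

    Each cell carries one flag asking for a line break *before* it.
    There is no cell for the position after the last value, so the
    closing bracket gets its own line only if some flag was set.
    """
    out = "["
    saw_newline = False
    for i, (text, newline_before) in enumerate(cells):
        if newline_before:
            out += "\n" + indent   # break, then indent one level
            saw_newline = True
        elif i > 0:
            out += " "             # ordinary separator between values
        out += text
    if saw_newline:
        out += "\n"                # implicit break before the bracket
    return out + "]"


# Same shape as the console session: the flag is set on the cell holding b.
print(mold([("a", False), ("b", True), ("c", False)]))
```

Three cells give three flags, yet the block has four candidate positions; the model handles the fourth one with the `saw_newline` check instead of a stored flag.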
I've previously mentioned that it's non-obvious exactly how "out-of-band" information like this gets managed in the face of series modifications. It helps to work through your expectations. Even worrying about just one bit there is a lot of nuance, such as when you say:
compose [
1 + (block1)
(block2)
]
How should newline markers be merged, between what's in the COMPOSE and what's in the spliced data itself? That's just the logic related to one bit. Putting in some "indentation count" would introduce many more questions. Plus, there's not a lot of bits to spare for that count: one of the "rules of the game" is to keep things down to just 4 platform pointers per value cell.
Expanding the formatting features isn't too likely. One feature request that the tail get its own newline marker was accepted for open source Rebol3, but rejected by Red. I wouldn't expect to see much more done in this area.

Related

Kotlin: Printing string with array elements that cuts off left side of answers

I am writing a small text based game to familiarize myself with Kotlin. I am creating two strings that print out the multiple choice options. I have confirmed that all four array elements are captured appropriately, but when the string prints it cuts off the a) and c) options. I have used \t, spaces, etc. and it does the same thing. I have also tried to just use print() and then use a \n at the end
println(menuList[0])
println(menuList[1])
println(menuList[2])
println(menuList[3])
println("a) ${menuList[0]} b) ${menuList[1]}")
println("c) ${menuList[2]} d) ${menuList[3]}")
Output:
[Screenshot: erroneous output of multiple choice text]
The source text came from a file which was separating each line with \r\n, but the code reading it was splitting it with \n. The result was that each entry ended with \r. When printed out, this caused the first value to be overwritten.
The solution is, when reading the file, to split by \r\n rather than \n.
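The effect is easy to reproduce outside Kotlin. The following is a minimal sketch in Python, using made-up sample text in place of the real file, that shows the stray \r left behind by splitting on \n alone:

```python
# Hypothetical file contents with Windows-style line endings.
raw = "first option\r\nsecond option\r\n"

# Splitting on "\n" alone leaves a trailing "\r" on every entry
# (plus an empty string after the final line break).
naive = raw.split("\n")

# splitlines() treats "\r\n" as one line break, so nothing is left over.
clean = raw.splitlines()

print(naive)
print(clean)
```

When an entry ending in \r is printed in a terminal, the cursor returns to column 1 and the text that follows is written over the start of the line, which is why the a) and c) labels disappear.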

CMake multiline message with FATAL_ERROR

CMake documentation (for example current version 3.11.2) states
CMake Warning and Error message text displays using a simple markup language. Non-indented text is formatted in line-wrapped paragraphs delimited by newlines. Indented text is considered pre-formatted.
However, it doesn't mention any markup format. Unless the "non-indented" vs. "indented" is all there is about the "simple markup".
Anyway, I failed to make it work with FATAL_ERROR mode.
Furthermore, I noticed that in STATUS mode the message is printed with a leading "-- " (two dashes and a space), while with FATAL_ERROR every line break in the message is turned into two lines, which (IMHO) looks awful.
Now I have a multiline message which lists what is wrong in CMAKE_BUILD_TYPE and what values are accepted. Because of above-mentioned issues, I ended up printing the message as STATUS and indenting subsequent lines with three spaces (so they align well with the --). Then I do a simple FATAL_ERROR repeating only the "title line" (stating that CMAKE_BUILD_TYPE is wrong). This looks acceptable on both console output and cmake-gui. (Although the 3 spaces indentation is needless on cmake-gui...)
However, I'm surprised how poorly this topic is documented. And it seems to have been this way for a long time: see, for example, the question "[CMake] Extra blank lines with SEND_ERROR and FATAL_ERROR ?!", which has remained unanswered for almost 9 years now...
Are there any good practices, advice or tips for handling such messages? Or should they be avoided in the first place?
You're right. The "simple markup" is either non-indented (unformatted) or indented (formatted). Also, the non-indented text is in paragraphs delimited by newlines. That's why you end up with blank lines in between paragraphs.
Here's a running explanation of the various kinds of messages. Warning types and error types behave the same as far as formatted vs. unformatted text goes. The difference, of course, is what happens to the processing and generation phases of CMake. For readability, you can split strings into multiple double-quoted pieces that will be concatenated.
# STATUS
message(STATUS
"This is a status message. It is prefixed with \"-- \". It goes to stdout. The "
"lines will wrap according to the width of your terminal.\n"
"New lines will begin a new line at column 1, but without the \"-- \" prefix, "
"unless you provide it; they will not create a blank line (i.e., new "
"paragraph). Spacing between sentences is unaltered by CMake.\n"
"-- Here's a new paragraph with an explicit \"-- \" prefix added.")
# no mode given (informational)
message(
"This is an informational message. It goes to stderr. Each line begins at column "
"1. The lines will wrap according to the width of your terminal.\n"
"New lines will begin a new line at column 1; they will not create a blank line "
"(i.e., new paragraph). Spacing between sentences is unaltered by CMake (3 spaces "
"preceded this sentence.).")
# WARNING--unformatted
message(WARNING
"This is an unformatted warning message. It goes to stderr. Each line begins "
"at column 3. The lines will wrap at a particular column (it appears to be "
"column 77, set within CMake) and wrap back to column 3.\n"
"New lines will begin a new paragraph, so they will create a blank line. A final "
"thing about unformatted messages: They will separate sentences with 2 spaces, "
"even if your string had something different.")
# WARNING--formatted and unformatted
message(WARNING
" This is a formatted warning message. It goes to stderr. Formatted lines will"
" be indented an additional 2 spaces beyond what was provided in the output"
" string. The lines will wrap according to the width of your terminal.\n"
" Indented new lines will begin a new line. They will not create a blank line."
" If you separate sentences with 1 space, that's what you'll get. If you"
" separate them with 2 spaces, that's also what you'll get.\n"
" If you want to control the width of the formatted paragraphs\n"
" (a good practice), just keep track of the width of each line and place\n"
" a \"\\n\" at the end of each line.\n \n"
" And, if you want a blank line between paragraphs, just place \"\\n \\n\"\n"
" (i.e., 2 newlines separated by a space) at the end of the first paragraph.\n"
"Non-indented new lines, however, will be treated like unformatted warning "
"messages, described above. They will begin at and wrap to column 3. They begin "
"a new paragraph, so they will create a blank line. There will be 2 spaces "
"between sentences, regardless of how many you placed after the period (In the "
"script, there were 4 spaces before this sentence).\n"
"And, as you'd expect, a second unindented paragraph will be preceded by a "
"blank line. But why would you mix formatted and unformatted text?")
I saved this into Message.cmake and invoked it with cmake -P Message.cmake 2> output.txt. It results in the following stdout:
-- This is a status message. It is prefixed with "-- ". It goes to stdout. The lines will wrap according to the width of your terminal.
New lines will begin a new line at column 1, but without the "-- " prefix, unless you provide it; they will not create a blank line (i.e., new paragraph). Spacing between sentences is unaltered by CMake.
-- Here's a new paragraph with an explicit "-- " prefix added.
The file, output.txt, contains:
This is an informational message. It goes to stderr. Each line begins at column 1. The lines will wrap according to the width of your terminal.
New lines will begin a new line at column 1; they will not create a blank line (i.e., new paragraph). Spacing between sentences is unaltered by CMake (3 spaces preceded this sentence.).
CMake Warning at MessageScript.cmake:19 (message):
This is an unformatted warning message. It goes to stderr. Each line
begins at column 3. The lines will wrap at a particular column (it appears
to be column 77, set within CMake) and wrap back to column 3.
New lines will begin a new paragraph, so they will create a blank line. A
final thing about unformatted messages: They will separate sentences with 2
spaces, even if your string had something different.
CMake Warning at MessageScript.cmake:28 (message):
This is a formatted warning message. It goes to stderr. Formatted lines will be indented an additional 2 spaces beyond what was provided in the output string. The lines will wrap according to the width of your terminal.
Indented new lines will begin a new line. They will not create a blank line. If you separate sentences with 1 space, that's what you'll get. If you separate them with 2 spaces, that's also what you'll get.
If you want to control the width of the formatted paragraphs
(a good practice), just keep track of the width of each line and place
a "\n" at the end of each line.
And, if you want a blank line between paragraphs, just place "\n \n"
(i.e., 2 newlines separated by a space) at the end of the first paragraph.
Non-indented new lines, however, will be treated like unformatted warning
messages, described above. They will begin at and wrap to column 3. They
begin a new paragraph, so they will create a blank line. There will be 2
spaces between sentences, regardless of how many you placed after the
period (In the script, there were 4 spaces before this sentence).
And, as you'd expect, a second unindented paragraph will be preceded by a
blank line. But why would you mix formatted and unformatted text?
SUMMARY

INFORMATIONAL MESSAGES (no mode given)
- start at column 1
- wrap in terminal window until newline
- go to stderr
- new paragraphs begin without preceding blank line
- sentence and word spacing preserved

STATUS MESSAGES
- start at column 1, with "-- " prefix on first paragraph
- wrap in terminal window until newline
- go to stdout
- new paragraphs begin without preceding blank line
- sentence and word spacing preserved

UNFORMATTED WARNING AND ERROR MESSAGES (unindented strings)
- start at column 3
- wrap at column 77
- go to stderr
- new paragraphs are preceded by a blank line
- sentences separated by 2 spaces; words by 1 space

FORMATTED WARNING AND ERROR MESSAGES (indented strings)
- start at column 3, plus whatever indentation the string had
- wrap in terminal window until newline
- go to stderr
- new paragraphs begin without preceding blank line
- sentence and word spacing preserved
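To make the warning and error rules above concrete, here is a rough emulation in Python. It is not CMake's own code: the function name `render_warning_body` is hypothetical, the width of 77 is the value observed above rather than a documented constant, and only the body of the message is modelled (no "CMake Warning at ..." header).

```python
import textwrap


def render_warning_body(message, width=77):
    """Approximate the WARNING/ERROR rules summarized above."""
    blocks = []
    for line in message.split("\n"):
        if line.startswith(" "):
            # Indented text: treated as pre-formatted, shifted right by 2.
            blocks.append(("pre", "  " + line))
        elif line:
            # Non-indented text: collapse spacing, re-wrap, and put two
            # spaces after sentence endings.
            paragraph = textwrap.fill(
                " ".join(line.split()),
                width=width,
                initial_indent="  ",
                subsequent_indent="  ",
                fix_sentence_endings=True,
            )
            blocks.append(("para", paragraph))

    out = []
    for i, (kind, text) in enumerate(blocks):
        if kind == "para" and i > 0:
            out.append("")  # unformatted paragraphs get a blank line first
        out.append(text)
    return "\n".join(out)


print(render_warning_body(
    "CMAKE_BUILD_TYPE is wrong.    Accepted values are:\n"
    " Debug\n"
    " Release"
))
```

With this model in mind, keeping the list of accepted values indented (pre-formatted) avoids both the re-wrapping and the extra blank lines.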

What's the inverse of block: load text in rebol / red

Let's say I have some Rebol / Red code. If I load the source text, I get a block, but how can I get the source text back from the block? I tried form block, but it doesn't give back the source text.
text: {
Red [Title: "Red Pretty Printer"]
out: none ; output text
spaced: off ; add extra bracket spacing
indent: "" ; holds indentation tabs
emit-line: func [] [append out newline]
emit-space: func [pos] [
    append out either newline = last out [indent] [
        pick [#" " ""] found? any [
            spaced
            not any [find "[(" last out find ")]" first pos]
        ]
    ]
]
emit: func [from to] [emit-space from append out copy/part from to]
clean-script: func [
    "Returns new script text with standard spacing."
    script "Original Script text"
    /spacey "Optional spaces near brackets and parens"
    /local str new
] [
    spaced: found? spacey
    clear indent
    out: append clear copy script newline
    parse script blk-rule: [
        some [
            str:
            newline (emit-line) |
            #";" [thru newline | to end] new: (emit str new) |
            [#"[" | #"("] (emit str 1 append indent tab) blk-rule |
            [#"]" | #")"] (remove indent emit str 1) break |
            skip (set [value new] load/next str emit str new) :new
        ]
    ]
    remove out ; remove first char
]
print clean-script read %clean-script.r
}
block: load text
LOAD is a higher-level operation with complex behaviors, e.g. it can take a FILE!, a STRING!, or a BLOCK!. Because it does a lot of different things, it's hard to speak of its exact complement as an operation. (For instance, there is SAVE which might appear to be the "inverse" of when you LOAD from a FILE!)
But your example is specifically dealing with a STRING!:
If I load the source text, I get a block, but how can I get the source text back from the block?
As a general point, and very relevant matter: you can't "get back" source text.
In your example above, your source text contained comments, and after LOAD they will be gone. Also, a very limited amount of whitespace information is preserved, in the form of the NEW-LINE flag that each value carries. Yet what specific indentation style you used--or whether you used tabs or spaces--is not preserved.
On a more subtle note, small amounts of notational distinction are lost. STRING! literals which are loaded will lose knowledge of whether you wrote them "with quotes" or {with curly braces}...neither Rebol nor Red preserves that bit. (And even if they did, that wouldn't answer the question of what to do after mutations, or with new strings.) There are variations of DATE! input formats, and it doesn't remember which specific one you used. Etc.
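The same kind of loss shows up in other languages that turn source text into a data structure and back. As an analogy only (this is Python, not Rebol or Red, and it needs Python 3.9 or later for `ast.unparse`), parsing and regenerating a line drops the comment and forgets which quote style was written:

```python
import ast

# One line of source with a comment and a double-quoted string.
source = 'greeting = "hello"  # double quotes, plus a comment\n'

tree = ast.parse(source)        # roughly the LOAD step
regenerated = ast.unparse(tree)  # roughly the MOLD step

print(regenerated)
```

The regenerated text is equivalent code, but it is not the text that was originally typed.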
But when it comes to talking about code round-tripping as text, the formatting is minor compared to what happens with binding. Consider that you can build structures like:
>> o1: make object! [a: 1]
>> o2: make object! [a: 2]
>> o3: make object! [a: 3]
>> b: compose [(in o1 'a) (in o2 'a) (in o3 'a)]
== [a a a]
>> reduce b
== [1 2 3]
>> mold b
== "[a a a]"
You cannot simply serialize b to a string as "[a a a]" and have enough information to get equivalent source. Red obscures the impacts of this a bit more than in Rebol--since even operations like to block! on STRING! and system/lexer/transcode appear to do binding into the user context. But it's a problem you will face on anything but the most trivial examples.
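A small model of why the molded text is not enough, written in Python as an analogy. The `Word` class and the dictionaries standing in for object contexts are inventions for this sketch and do not reflect how Rebol or Red store bindings internally:

```python
class Word:
    """A symbol plus the context it is bound to."""

    def __init__(self, name, context):
        self.name = name
        self.context = context

    def get(self):
        # Look the word up in its own context, like evaluating a bound word.
        return self.context[self.name]

    def __repr__(self):
        # Only the spelling is written out; the binding is invisible.
        return self.name


o1, o2, o3 = {"a": 1}, {"a": 2}, {"a": 3}
b = [Word("a", o1), Word("a", o2), Word("a", o3)]

reduced = [w.get() for w in b]
molded = "[" + " ".join(repr(w) for w in b) + "]"

print(reduced)
print(molded)
```

Three words that evaluate differently all serialize to the same spelling, so loading `molded` again cannot tell which context each `a` belonged to.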
There are some binary formats for Rebol2 and Red that attempt to address this. For instance in "RedBin" a WORD! saves its context (and index into that context). But then you have to think about how much of your loaded environment you want dragged into the file to preserve context. So it's certainly opening a can of worms.
This isn't to say that the ability to MOLD things out isn't helpful. But there's no free lunch...so Rebol and Red programs wind up having to think about serialization as much as anyone else. If you're thinking of doing processing on any source code--for the reasons of comment preservation if nothing else--then PARSE should probably be the first thing you reach for.

Why does Fortran output have a leading space?

I process lots of output from Fortran programs, and it generally appears that the output from these programs has a leading space, i.e., character column 1 is blank. Consider a short program leading.f:
program leading
print *, "<-- What's that space for?"
end program leading
It has the output
<-- What's that space for?
(yes, there is one space character at the beginning). My question is contained in the program.
Back in the dinosaur era, when FORTRAN output usually went to a green-bar impact printer, certain characters in the first print column were often interpreted as control codes (line feeds, form feeds, etc). Many programmers learned to explicitly blank column 1 of their output, unless some special effect was intended -- and old habits die hard!
As has been noted in another answer here, and elsewhere, Fortran output had the concept of carriage control. For printers which used carriage control the first character being a blank was necessary for a new line to be started.
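For readers who have not met carriage control, here is a small sketch in Python of how a printer driver might interpret the first character of each record, using the traditional codes (blank for single spacing, 0 for double spacing, 1 for a new page, + for overprinting). The function name `apply_carriage_control` is hypothetical, and treating any other first character like a blank is an assumption of this sketch; real systems varied.

```python
def apply_carriage_control(records):
    """Turn carriage-controlled records into plain output lines."""
    lines = []
    for record in records:
        control, text = record[:1], record[1:]
        if control == "0":
            lines.append("")             # double spacing: skip a line first
            lines.append(text)
        elif control == "1":
            lines.append("\f" + text)    # form feed: start a new page
        elif control == "+" and lines:
            lines[-1] += "\r" + text     # no advance: overprint previous line
        else:
            lines.append(text)           # blank (or anything else): one line
    return lines


print(apply_carriage_control([" hello", "0world", "1next page"]))
```

In every branch the first character is consumed rather than printed, which is why output meant to appear unchanged had to start with a blank.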
Fortran itself deleted the carriage control concept in Fortran 2003, but for completeness we can see that Fortran still requires list-directed output to have (in most cases) this (default) leading blank (Fortran 2018, 13.10.4 p.13):
Except for new records created by explicit formatting within a defined output procedure or by continuation of delimited character sequences, each output record begins with a blank character.
Namelist formatting has a similar statement.
You can avoid having this leading blank by avoiding using list-directed output:
print '(A)', '<-- No space here'
end
Note that it isn't the print statement here, but the list-directed output, which is to blame. We see the same with write:
write (*,*) '<-- Space from the list-directed output'
end
Finally, if we are using internal files we still get a leading blank with list-directed output:
character(len=20) :: internal
write (internal, *) '<-- Leading blank here'
end
(If we then output this internal file with list-directed output we'll see two leading blanks.)

Fortran read statement reading beyond an end of line

Do you know if the following statement is guaranteed to be true by one of the Fortran 90/95/2003 standards?
"Suppose a read statement for a character variable is given a blank line (i.e., containing only white spaces and new line characters). If the format specifier is an asterisk (*), it continues to read the subsequent lines until a non-blank line is found. If the format specifier is '(A)', a blank string is substituted to the character variable."
For example, please look at the following minimal program and input file.
program code:
PROGRAM chk_read
  INTEGER, PARAMETER :: MAXLEN=30
  CHARACTER(len=MAXLEN) :: str1, str2
  str1='minomonta'
  read(*,*) str1
  write(*,'(3A)') 'str1_start|', str1, '|str1_end'
  str2='minomonta'
  read(*,'(A)') str2
  write(*,'(3A)') 'str2_start|', str2, '|str2_end'
END PROGRAM chk_read
input file:
----'input.dat' content is below this line----

yamanakako

kawaguchiko
----'input.dat' content is above this line----
Please note that there are four lines in 'input.dat' and the first and third lines are blank (contain only white spaces and new line characters). If I run the program as
$ ../chk_read < input.dat > output.dat
I get the following output
----'output.dat' content is below this line----
str1_start|yamanakako |str1_end
str2_start| |str2_end
----'output.dat' content is above this line----
The first read statement for the variable 'str1' seems to look at the first line of 'input.dat', find a blank line, move on to the second line, find the character value 'yamanakako', and store it in 'str1'.
In contrast, the second read statement for the variable 'str2' seems to be given the third line, which is blank, and store the blank line in 'str2', without moving on to the fourth line.
I tried compiling the program by Intel Fortran (ifort 12.0.4) and GNU Fortran (gfortran 4.5.0) and got the same result.
A little background on why I am asking: I am writing a subroutine to read a data file that uses a blank line as a separator between data blocks. I want to make sure that the blank line, and only the blank line, is thrown away while reading the data. I also need the code to be standard-conforming and portable.
Thanks for your help.
From Fortran 2008 standard draft:
List-directed input/output allows data editing according to the type
of the list item instead of by a format specification. It also allows
data to be free-field, that is, separated by commas (or semicolons) or
blanks.
Then:
The characters in one or more list-directed records constitute a
sequence of values and value separators. The end of a record has the
same effect as a blank character, unless it is within a character
constant. Any sequence of two or more consecutive blanks is treated as
a single blank, unless it is within a character constant.
This implicitly states that in list-directed input, blank lines are treated as blanks until the next non-blank value.
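The difference between the two kinds of read can be modelled in a few lines. This is a simplified emulation in Python, not Fortran semantics in full: the function names are hypothetical, and the list-directed model ignores quoted strings, commas, slashes, repeat counts, and padding to the variable's length.

```python
def read_a_format(records):
    """Model of read(*,'(A)'): take the next record exactly as it is."""
    return next(records)


def read_list_directed(records):
    """Model of read(*,*) for one character item.

    The end of a record acts like a blank, so records that contain only
    blanks are skipped until a value is found. The rest of that record
    is then discarded, as happens when the read statement completes.
    """
    for record in records:
        tokens = record.split()
        if tokens:
            return tokens[0]
    raise EOFError("end of file before a value was found")


# Four records, the first and third containing only blanks.
records = iter(["   ", "yamanakako", " ", "kawaguchiko"])

str1 = read_list_directed(records)  # skips the blank first record
str2 = read_a_format(records)       # takes the blank third record as is

print(repr(str1), repr(str2))
```

In this model, as in the question's output, the list-directed read lands on the second record and the '(A)' read returns the blank third record without moving on to the fourth.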
When reading with a fmt='(A)' format descriptor, blank lines are read into str. On the other hand, fmt=*, which implies list-directed I/O in free form, skips blank lines until it finds a non-blank character string. To test this, do something like:
PROGRAM chk_read
  INTEGER :: cnt
  INTEGER, PARAMETER :: MAXLEN=30
  CHARACTER(len=MAXLEN) :: str
  cnt=1
  do
    read(*,fmt='(A)',end=100)str
    write(*,'(I1,3A)')cnt,' str_start|', str, '|str_end'
    cnt=cnt+1
  enddo
100 continue
END PROGRAM chk_read
$ cat input.dat
yamanakako
kawaguchiko
EOF
Running the program gives this output:
$ a.out < input.dat
1 str_start| |str_end
2 str_start| |str_end
3 str_start| |str_end
4 str_start|yamanakako |str_end
5 str_start| |str_end
6 str_start|kawaguchiko |str_end
On the other hand, if you use list-directed input:
read(*,fmt=*,end=100)str
You end up with this output:
$ a.out < input.dat
1 str_start|yamanakako |str_end
2 str_start|kawaguchiko |str_end
This part of the F2008 standard draft probably addresses your problem:
10.10.3 List-directed input
7 When the next effective item is of type character, the input form
consists of a possibly delimited sequence of zero or more
rep-chars whose kind type parameter is implied by the kind of the
effective item. Character sequences may be continued from the end of
one record to the beginning of the next record, but the end of record
shall not occur between a doubled apostrophe in an
apostrophe-delimited character sequence, nor between a doubled quote
in a quote-delimited character sequence. The end of the record does
not cause a blank or any other character to become part of the
character sequence. The character sequence may be continued on as many
records as needed. The characters blank, comma, semicolon, and slash
may appear in default, ASCII, or ISO 10646 character sequences.