Kotlin: Printing string with array elements that cuts off left side of answers - kotlin

I am writing a small text-based game to familiarize myself with Kotlin. I am creating two strings that print out the multiple choice options. I have confirmed that all four array elements are captured correctly, but when the string prints it cuts off the a) and c) options. I have used \t, spaces, etc. and it does the same thing. I have also tried to just use print() and then a \n at the end.
println(menuList[0])
println(menuList[1])
println(menuList[2])
println(menuList[3])
println("a) ${menuList[0]} b) ${menuList[1]}")
println("c) ${menuList[2]} d) ${menuList[3]}")
Output:
erroneous output of multiple choice text

The source text came from a file which was separating each line with \r\n, but the code reading it was splitting it with \n. The result was that each entry ended with \r. When printed out, this caused the first value to be overwritten.
The solution is, when reading the file, to split by \r\n rather than \n.
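The effect is easy to reproduce outside Kotlin. The following is a minimal sketch in Python, used here purely for illustration; the sample text and the variable names are made up and are not from the original game. It shows that splitting CRLF text on \n leaves a trailing \r on each entry, which sends the terminal cursor back to column 0 so that the "b)" half of the line overwrites the "a)" half.

```python
# Hypothetical file contents using Windows line endings (\r\n).
raw = "North\r\nSouth\r\nEast\r\nWest"

# Splitting on "\n" alone keeps the "\r" at the end of every entry but the last.
broken = raw.split("\n")

# splitlines() treats "\r\n" as a single line break, so no "\r" survives.
fixed = raw.splitlines()

# repr() makes the hidden carriage returns visible when debugging.
print([repr(item) for item in broken])
print("a) {} b) {}".format(fixed[0], fixed[1]))
```

Printing the repr of each entry is a quick way to confirm whether stray control characters are present before changing the file-reading code.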

Related

How is input handled in Brainf***?

I can't really seem to find a standard for this. I know inputs are taken as ASCII values, but are they required to be single characters? If not, how are multi-character inputs handled?
Command line inputs in most (if not all) programming languages are taken a line at a time. When you hit enter into a console after typing a line, the whole line gets sent into the program as a return value from the function you called to get the input.
In brainfuck, you have more control over this: You can get as many characters as you want at a time, and stop when you want to.
A single comma "," reads one byte of input (i.e. one character). If you want to read a string until a newline is met, you can try something like the following code (10 is the ASCII value of newline, which is why each run of "+" and "-" characters has ten repetitions):
[-]>,----------[++++++++++>,----------]<[<]
The result is an array of non-zero values in memory, holding the ASCII values of the input characters and bounded by a zero cell at each end.
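To check what that snippet leaves in memory, it can be run through a tiny interpreter. The sketch below is a hypothetical helper written in Python, not part of any standard tool. It assumes 8-bit wrapping cells, and it assumes that reading past the end of input yields 0, which is only one of several conventions used by real implementations.

```python
READ_LINE = "[-]>,----------[++++++++++>,----------]<[<]"

def run_bf(code, input_bytes, tape_len=64):
    """Run a brainfuck program (without '.') and return (tape, pointer)."""
    # Pre-compute the position of each bracket's partner.
    stack, match = [], {}
    for i, c in enumerate(code):
        if c == "[":
            stack.append(i)
        elif c == "]":
            j = stack.pop()
            match[i], match[j] = j, i

    tape = [0] * tape_len
    ptr = pc = 0
    inp = iter(input_bytes)
    while pc < len(code):
        c = code[pc]
        if c == ">":
            ptr += 1
        elif c == "<":
            ptr -= 1
        elif c == "+":
            tape[ptr] = (tape[ptr] + 1) % 256
        elif c == "-":
            tape[ptr] = (tape[ptr] - 1) % 256
        elif c == ",":
            # Assumption: end of input is reported as 0.
            tape[ptr] = next(inp, 0)
        elif c == "[" and tape[ptr] == 0:
            pc = match[pc]  # skip the loop body
        elif c == "]" and tape[ptr] != 0:
            pc = match[pc]  # jump back to the start of the loop
        pc += 1
    return tape, ptr

# Feed the two characters "hi" followed by a newline.
tape, ptr = run_bf(READ_LINE, b"hi\n")
print(tape[:4], ptr)
```

The cell that received the newline ends up as zero because 10 was subtracted from it and never added back, which is what terminates the stored string.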

VBA replace certain carriage

All.
I am used to programming VBA in Excel, but am new to the structures in Word.
I am working through a library of text files to update them. Many of them are either OCR documents, or were manually entered.
Each has a recurring pattern, the most common of which is unnecessary carriage returns.
For example, I am looking at several text files where there is a double return after each line. A search and replace of all double carriage returns removes all paragraph distinctions.
However, each line is approximately 30 characters long, and if I manually perform the following logic, it gives me a functional document.
If there is a double carriage return after 30+ characters, I replace them with a space.
If there were less than 30 characters prior to the double return, I replace them with a single return.
Can anyone help me with some rudimentary code that would help me get started on that? I could then modify it for each "pattern" of text documents I have.
e.g.
In this case, there are more than
thirty characters per line. And I
will keep going to illustrate this
example.
This would be a new paragraph, and
would be separated by another of
the single returns.
I want code that would return:
In this case, there are more than thirty characters per line. And I will keep going to illustrate this example.
This would be a new paragraph, and would be separated by another of the single returns.
Let me know if anyone can throw something out that I can play with!
You can do this without code (which RegEx requires), simply using Word's own wildcard Find/Replace tools, where:
Find = ([!^13]{30,})[^13]{1,}
Replace = \1^32
and, to clean up the residual multi-paragraph breaks:
Find = [^13]{2,}
Replace = ^p
You could, of course, record the above as a macro...
Here is a RegEx that might work for you:
(\n\n)(?<!\.(\n\n))
The substitution is just a plain space, you can try it out (and modify / tweak it) here: https://regex101.com/r/zG9GPw/4
This pattern tells the RegEx engine to look for two consecutive newline characters, \n\n (this is taken from your question and might differ in your files, e.g. it could be \r\n), and it assumes that a valid line break will be preceded by a full stop: \..
In RegEx the full stop is a single-character wildcard, so it needs to be escaped with '\' (n and r are normal characters; escaping them tells the RegEx engine they represent newline and carriage return characters).
So the expression looks for a group of two newline characters, but then uses a negative look-behind to exclude any match where the preceding character was a full stop.
Anyway, it's all explained on the site.
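For trying the idea outside regex101, here is a minimal Python sketch. It assumes the lines are separated by \n\n, and it writes the look-behind without the inner capture group, which is a simplification of the pattern above rather than the answerer's exact expression. The function name and sample text are made up for illustration.

```python
import re

# Two newlines, unless the text immediately before them ends in a full stop.
SOFT_BREAK = re.compile(r"\n\n(?<!\.\n\n)")

def join_soft_breaks(text):
    """Replace double newlines that do not follow a full stop with a space."""
    return SOFT_BREAK.sub(" ", text)

sample = "one line\n\nnext line.\n\nNew paragraph"
result = join_soft_breaks(sample)
print(repr(result))
```

Note that this heuristic treats every full stop at the end of a line as a paragraph end, so a sentence that happens to end exactly at a line break will also start a new paragraph.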
Here is how you could do a RegEx find and replace using Notepad++ (I'm not sure whether it comes with RegEx support or needs a plugin; either way it is easy). You can set a location, filters (to target specific file types), and other options (such as searching in sub-directories).
Other than that, as @MacroPod pointed out, you could also do this with MS Word, document by document, without using any code :)
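The 30-character rule from the question can also be expressed directly, which may be a useful starting point if the text files are processed outside Word. This is a rough sketch in Python rather than VBA; the function name, the default threshold, and the sample text are all illustrative assumptions.

```python
import re

def rejoin_lines(text, min_len=30):
    """Join lines separated by double returns.

    A break after a line of min_len or more characters becomes a space;
    a break after a shorter line becomes a single newline.
    """
    # Treat any run of two or more line breaks (LF or CRLF) as one separator.
    lines = re.split(r"(?:\r?\n){2,}", text.strip())
    pieces = []
    for i, line in enumerate(lines):
        pieces.append(line)
        if i < len(lines) - 1:
            pieces.append(" " if len(line) >= min_len else "\n")
    return "".join(pieces)

sample = "\n\n".join([
    "In this case, there are more than",
    "thirty characters per line. And I",
    "will keep going to illustrate this",
    "example.",
    "This would be a new paragraph, and",
    "would be separated by another of",
    "the single returns.",
])
result = rejoin_lines(sample)
print(result)
```

The threshold is a parameter because each "pattern" of document is likely to need a different line length.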

Postgres 9.3 end-of-copy marker corrupt - Any way to change this setting?

I am trying to stream data through an AWK program to a Postgres COPY command. This usually works great. However, recently my data has included long text strings containing '\.' values.
Postgres Documentation mentions this combination of characters represents the end-of-data marker, http://www.postgresql.org/docs/9.2/static/sql-copy.html, and I am getting the associated errors when trying to insert with COPY.
My question is, is there a way to turn this off? Perhaps change the end-of-data marker to a different combination of characters? Or do I have to alter/remove these strings before trying to insert using the COPY command?
You can try to filter your data through sed 's:\\:\\\\:g' - this would change every \ in your data to \\, which is a correct escape sequence for single backslash in copy data.
But I think not only backslash would be problematic. Also newlines should be encoded by \n, carriage returns as \r and tabs as \t (tab is a default field delimiter in copy).
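The escaping described above can be sketched as a small helper. This is a Python illustration under the assumption that the data is loaded in COPY's default text format with tab as the field delimiter; the function name is hypothetical, and the helper is meant to be applied to each field value before the fields are joined with tabs.

```python
def escape_copy_field(value):
    """Escape one field value for COPY text format (tab-delimited)."""
    # The backslash must be handled first, otherwise the backslashes
    # introduced by the later replacements would be doubled as well.
    return (
        value.replace("\\", "\\\\")
        .replace("\t", "\\t")
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )

# A value containing the troublesome backslash-period pair, a tab and a newline.
escaped = escape_copy_field("a\\.b\tc\n")
print(escaped)
```

Once the backslash is doubled, the "\." inside a value is no longer read as the end-of-data marker.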

making a list of traditional Chinese characters from a string

I am currently trying to estimate the number of times each character is used in a large sample of traditional Chinese characters. I am interested in characters not words. The file also includes punctuation and western characters.
I am reading in an example file of traditional Chinese characters. The file contains a large sample of traditional Chinese characters. Here is a small subset:
首映鼓掌10分鐘 評語指不及《花樣年華》
該片在柏林首映,完場後獲全場鼓掌10分鐘。王家衛特別為該片剪輯「柏林版本
增減20處 趙本山香港戲分被刪
在柏林影展放映的《一代宗師》版本
教李小龍武功 葉問決戰散打王
另一增加的戲分是開場時葉問(梁朝偉飾)
My strategy is to read each line, split each line into a list, and go through and check each character to see if it already exists in a list or a dictionary of characters. If the character does not yet exist in my list or dictionary I will add it to that list, if it does exist in my list or dictionary, I will increase the counter for that specific character. I will probably use two lists, a list of characters, and a parallel list containing the counts. This will be more processing, but should also be much easier to code.
I have not gotten anywhere near this point yet.
I am able to read in the example file successfully. Then I am able to make a list for each line of my file. I am able to print out those individual lines into my output file and sort of reconstitute the original file, and the traditional Chinese comes out intact.
However, I run into trouble when I try to make a list of each character on a particular line.
I've read through the following article. I understood many of the comments, but unfortunately, was unable to understand enough of it to solve my problem.
How to do a Python split() on languages (like Chinese) that don't use whitespace as word separator?
My code looks like the following
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import codecs
wordfile = open('Chinese_example.txt', 'r')
output = open('Chinese_output_python.txt', 'w')
LINES = wordfile.readlines()
Through various tests I am sure the following line is not splitting the string LINES[0] into its component Chinese characters.
A_LINE = list(LINES[0])
output.write(A_LINE[0])
I mean you want to use this, from answerer 'flow' at How to do a Python split() on languages (like Chinese) that don't use whitespace as word separator? :
from re import compile as _Re
_unicode_chr_splitter = _Re( '(?s)((?:[\ud800-\udbff][\udc00-\udfff])|.)' ).split
def split_unicode_chrs( text ):
    return [ chr for chr in _unicode_chr_splitter( text ) if chr ]
I was able to successfully split a line of traditional Chinese characters. I just had to know the proper syntax to handle encoded characters, which is pretty basic:
my_new_list = list(LINES[0].decode('utf8'))  # Python 2: decode the bytes first, then split into characters
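For the counting step described in the question, a dictionary-style counter avoids the two parallel lists. The following is a minimal sketch assuming Python 3, where a str is already a sequence of Unicode characters, so no manual decoding or splitting is needed. The function name and the two sample lines are illustrative; a real file could be read with open(path, encoding="utf-8") instead.

```python
from collections import Counter

def count_characters(lines):
    """Count how often each non-whitespace character occurs."""
    counts = Counter()
    for line in lines:
        # Iterating over a str yields one character at a time.
        counts.update(ch for ch in line if not ch.isspace())
    return counts

# Two short lines adapted from the sample text in the question.
sample = ["首映鼓掌10分鐘", "完場後獲全場鼓掌10分鐘"]
counts = count_characters(sample)
print(counts.most_common(5))
```

This counts every non-whitespace character, including digits and punctuation; filtering those out would need an extra condition in the generator.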

Why does Fortran output have a leading space?

I process lots of output from Fortran programs, and it generally appears that the output from these programs has a leading space, i.e., character column 1 is blank. Consider a short program leading.f:
program leading
print *, "<-- What's that space for?"
end program leading
Has the output
 <-- What's that space for?
(yes, there is one space character at the beginning). My question is contained in the program.
Back in the dinosaur era, when FORTRAN output usually went to a green-bar impact printer, certain characters in the first print column were often interpreted as control codes (line feeds, form feeds, etc). Many programmers learned to explicitly blank column 1 of their output, unless some special effect was intended -- and old habits die hard!
As has been noted in another answer here, and elsewhere, Fortran output had the concept of carriage control. For printers which used carriage control the first character being a blank was necessary for a new line to be started.
Fortran itself deleted the carriage control concept in Fortran 2003, but for completeness we can see that Fortran still requires list-directed output to have (in most cases) this (default) leading blank (Fortran 2018, 13.10.4 p.13):
Except for new records created by explicit formatting within a defined output procedure or by continuation of delimited character sequences, each output record begins with a blank character.
Namelist formatting has a similar statement.
You can avoid having this leading blank by avoiding using list-directed output:
print '(A)', '<-- No space here'
end
Note that it isn't the print statement here, but the list-directed output, which is to blame. We see the same with write:
write (*,*) '<-- Space from the list-directed output'
end
Finally, if we are using internal files we still get a leading blank with list-directed output:
character(len=20) :: internal
write (internal, *) '<-- Leading blank here'
end
(If we then output this internal file with list-directed output we'll see two leading blanks.)
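When the Fortran source cannot be changed, the blank can be dropped while post-processing the output instead. This is a small Python sketch under the assumption that every record was written with list-directed output and therefore starts with exactly one padding blank; the function name is made up for illustration.

```python
def strip_leading_blank(line):
    """Remove the single leading blank that list-directed output adds."""
    # Only one character is removed, so any further indentation is kept.
    return line[1:] if line.startswith(" ") else line

cleaned = strip_leading_blank(" <-- What's that space for?")
print(cleaned)
```

Removing only one character, rather than calling a general strip, preserves any additional leading blanks that are part of the data.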