Kotlin - read only first n lines in file

Is there a way, without resorting to traditional Java-like for loops, to limit the number of lines read by a BufferedReader?
Take this code for example:
bufferedReader.useLines { it }
.take(10)
.toList()
The documentation for useLines states:
Calls the [block] callback giving it a sequence of all the lines in this file and closes the reader once the processing is complete.
To my understanding, this means that the entire file will be read, and only then the first ten will be filtered out of the sequence. I couldn't find anything online addressing this issue except fetching only the first line.

Sequence is a lazily evaluated collection. That means only the necessary items will be processed, so if you take(10), only the first 10 lines will be processed.
The key word in the documentation is thus "sequence":
Calls the [block] callback giving it a sequence of all the lines in
this file and closes the reader once the processing is complete.
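However, useLines closes the source as soon as its block completes, which makes your code incorrect: the block returns the sequence itself, and take(10).toList() only tries to read from it after the reader has already been closed. Instead, consume the sequence inside the block: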
val list: List<String> = bufferedReader
    .useLines { lines: Sequence<String> ->
        lines
            .take(10)
            .toList()
    }
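To see the laziness in action, here is a minimal sketch. The helper name firstLines and the counter are assumptions added for illustration, and a StringReader stands in for a real file. It counts how many lines are pulled from the sequence when only the first few are taken.

```kotlin
import java.io.BufferedReader
import java.io.StringReader

// Hypothetical helper: returns at most n lines and reports how many lines
// were actually pulled from the reader's line sequence.
fun firstLines(reader: BufferedReader, n: Int): Pair<List<String>, Int> {
    var pulled = 0
    val lines = reader.useLines { seq ->
        seq.onEach { pulled++ }   // counts each line as it is produced
            .take(n)
            .toList()             // consumed inside the block, before the reader closes
    }
    return lines to pulled
}

fun main() {
    // 1000 lines of input; only the first 3 should be pulled.
    val text = (1..1000).joinToString("\n") { "line $it" }
    val (lines, pulled) = firstLines(BufferedReader(StringReader(text)), 3)
    println(lines)   // [line 1, line 2, line 3]
    println(pulled)  // 3
}
```

The BufferedReader may still buffer a chunk of the input beyond the last requested line, but it does not read the whole file.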

Related

Why did this hashmap stop working out of nowhere?

I have used this HashMap for a few days now with no problems at all. Now I get errors mentioning FloatingDecimal, parseDouble, ReadWrite, FileReadWrite, and Looping.
The last thing I did to the program was add "$%.2f".format to my products.second element. It ran a few times, I left to eat, and came back to this!
I was able to narrow it down to when it pulls the data from the file and converts it to the hashmap setting.
Data in the file example 111,shoes,59.00
val fileName = "src/products.txt"
val products = HashMap<Int, Pair<String, Double>>()
File(fileName).forEachLine {
    val pieces = it.split(",")
    // println(pieces)
    // toDouble() throws NumberFormatException if the price is not a plain number
    products[pieces[0].toInt()] = Pair(pieces[1].trim(), pieces[2].toDouble())
}
The format of the data in the file had changed, so reading it back in crashed the whole program. I wanted my double to look like 9.99, but I had added a $ sign that was meant for display only. When the program expected a double (9.99), it only found $9.99, which caused the error.
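One way to avoid this kind of problem is to keep the price as a plain Double in the map and in the file, and to add the $ sign only when displaying it. The following is a minimal sketch under that assumption. The names parseProduct and displayPrice are hypothetical, and the "id,name,price" line format follows the example above.

```kotlin
import java.util.Locale

// Hypothetical helper: parses one "id,name,price" line, returning null
// instead of throwing when the line is malformed.
fun parseProduct(line: String): Pair<Int, Pair<String, Double>>? {
    val pieces = line.split(",")
    if (pieces.size != 3) return null
    val id = pieces[0].trim().toIntOrNull() ?: return null
    // removePrefix tolerates a stray "$" that leaked into the file
    val price = pieces[2].trim().removePrefix("$").toDoubleOrNull() ?: return null
    return id to (pieces[1].trim() to price)
}

// Hypothetical helper: formatting happens only at display time.
// Locale.US keeps the decimal point independent of the system locale.
fun displayPrice(price: Double): String = "$%.2f".format(Locale.US, price)

fun main() {
    val products = HashMap<Int, Pair<String, Double>>()
    listOf("111,shoes,59.00", "112,socks,\$9.99", "garbage").forEach { line ->
        parseProduct(line)?.let { (id, product) -> products[id] = product }
    }
    products.toSortedMap().forEach { (id, product) ->
        println("$id ${product.first} ${displayPrice(product.second)}")
    }
}
```

With toIntOrNull and toDoubleOrNull, a bad line is skipped rather than crashing the program while it loads the file.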

Limit input size in chars from stdin

I want to write an application in Rust that deals with input from terminal and I want to prevent it from crashing/being killed by running out of memory. It displays a prompt, processes a command and displays prompt again.
Basically I am looking for read_line_max(n) or read_until(delimiter, max_chars) API where at most n bytes or until delimiter is reached are read.
Possibilities I considered:
io::stdin().lock().take(n).lines() reads at most n bytes in total; I want an unlimited total but a limit on the size of each line.
io::stdin().lock().lines().take(n) limits the number of lines, not their length.
for line in io::stdin().lock().lines() {
    let line = line?.chars().take(n);
    println!("{}", respond_to(line));
}
This is too late: by then the line is already hogging 50GB of memory and a kill is imminent.
I found out that BufReader used to have .chars() iterator that could be used in this way but it was removed.
With io::stdin().lock().read(buff) there is an issue with bytes vs chars but it may be my best bet. And then try to throw it into a String to check UTF-8 validity but that seems like something I would do in C and very unidiomatic.
Actually while writing this I quickly put together this thing:
let inp = io::stdin();
let mut bufinp = inp.lock();
let mut linebytes = [0_u8; 10];
loop {
    match bufinp.read(&mut linebytes) {
        Ok(bytes_read) => {
            match String::from_utf8(linebytes[..bytes_read].to_vec()) {
                Ok(line) => println!("processed line: {}", &line),
                Err(err) => eprintln!("utf8 err: {:?}", err),
            }
        },
        Err(err) => {
            eprintln!("line read err: {:?}", err);
        }
    }
}
So... that kind of does what I want but I have some issues with it:
1) I need to trim the '\n' if input is smaller than buffer.
2) It doesn't clear rest of the stdin if input is larger than buffer. I'm guessing I need to put a skip_while() there at the end so that it doesn't spill over to the next read. Is there a nicer way to clear it?
3) It may split graphemes even though I could in fact handle those additional 3 bytes. I don't really care about reading up to a specific hard limit. I just want to prevent the input/memory usage from being "too much".
4) It's just too low-level and complicated and not in line with "make good and safe choices easy to code and unsafe ones less available" which makes me think I'm not doing it right. But at least cat /dev/zero | ./target/debug/test doesn't result in SIGQUIT anymore.
I find it strange that a language that prides itself on safety wouldn't provide a fool-proof way to deal with potentially large input. Am I missing something or thinking too much about it? Every article I found just closes its eyes and fires read_to_end() or read_line() without much thought.
How should I read user input safely and idiomatically?

difference between toList().take(10) and take(10).toList() in kotlin

I am just trying out the new Kotlin language. I came across sequences, which can generate infinite lists. I generated a sequence and tried printing the first 10 elements, but the code below didn't print anything:
fun main(args: Array<String>) {
    val generatePrimeFrom2 = generateSequence(3){ it + 2 }
    print(generatePrimeFrom2.toList().take(10))
}
But when I changed it to take(10).toList() in the print statement, it worked fine. Why is that?
This code worked fine for me:
fun main(args: Array<String>) {
    val generatePrimeFrom2 = generateSequence(3){ it + 2 }
    print(generatePrimeFrom2.take(10).toList())
}
The generateSequence function generates a sequence that is either infinite or finishes when the lambda passed to it returns null. In your case, it is { it + 2 }, which never returns null, so the sequence is infinite.
When you call .toList() on a sequence, it will try to collect all sequence elements and thus will never stop if the sequence is infinite (unless the index overflows or an out-of-memory error happens), so it does not print anything because it does not finish.
In the second case, on the contrary, you limit the number of elements in the sequence with .take(10) before trying to collect its items. Then the .toList() call simply collects the 10 items and finishes.
It may become more clear if you check this Q&A about differences between Sequence<T> and Iterable<T>: (link)
Here is the hint: "generate infinite list". In the first snippet you first try to create a list (which takes forever) and only then take the first 10 elements.
In the second snippet, you take only the first 10 elements from the infinite sequence and then convert them to a list.
generatePrimeFrom2.toList() tries to compute/create an infinite-length list.
generatePrimeFrom2.toList().take(10) then takes the first 10 elements from the infinite-length list.
It does not print because it is calculating that infinite-length list.
Whereas, generatePrimeFrom2.take(10) only tries to compute the first 10 elements.
generatePrimeFrom2.take(10).toList() converts the first 10 elements to the list.
generateSequence(3){ it + 2 } has no end, so it has infinite length.
Sequences do not hold actual values; their values are calculated when they are needed. Lists, however, must hold actual values.
I came across sequences which generate infinite list.
This is not actually correct. The main point is that a sequence is not a list. It is a lazily evaluated construct and only the items you request will actually become "materialized", i.e., their memory allocated on the heap.
That's why it's not interchangeable to write
infiniteSeq.toList().take(10)
and
infiniteSeq.take(10).toList()
The former will try to instantiate infinitely many items—and, predictably, fail at it.
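As a small illustration of the working order of operations, here is a sketch. The helper name firstOdds is an assumption introduced for this example. It limits the infinite sequence first and only then materializes it.

```kotlin
// Hypothetical helper: the first n odd numbers starting at 3.
fun firstOdds(n: Int): List<Int> =
    generateSequence(3) { it + 2 }  // infinite: the lambda never returns null
        .take(n)                    // still lazy, nothing is computed yet
        .toList()                   // pulls n elements, then stops

fun main() {
    println(firstOdds(10)) // [3, 5, 7, 9, 11, 13, 15, 17, 19, 21]
}
```

Swapping the last two calls would make toList() run on the unbounded sequence, which is the non-terminating case described above.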

How would I write idiomatic kotlin code that loops over a subprocess and processes output from it?

I want to write some kotlin code that essentially runs a command:
Runtime.getRuntime().exec("mycommand.sh")
But, in this case mycommand.sh will never exit. It will sporadically output text that I want to process. Imagine the output is like this:
FOOBAR 1234
BARFOO 54657
ETCETC 9876
Say the first line comes in at 5 seconds, then the second at 10 seconds, and the third at 15 seconds.
How would I write code that receives each line as it comes in, and processes it?
For example, maybe I want to extract the words in all caps and pull out the number that follows and then stores those two pieces of text as key-values in a hash map.
As a bonus, I would love to know how to terminate the subprocess (signal with SIGINT?) from within the kotlin program.
Maybe something like this:
val inStream = BufferedReader(InputStreamReader(proc.inputStream))
val map = inStream.lines()
    //maybe you need a more sufficient solution here
    .map { it.split(" ") }
    .map { it[0] to it[1] }.toList()
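This results in a List of Pairs. The infix function to creates the Pairs, which are simple key-value associations. Note that toList() only returns once the stream ends, that is, when the process closes its output, so for a command that never exits the lines need to be handled one at a time as they arrive.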
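For a command that keeps running, a lazy line sequence lets each line be handled as soon as it is written. The sketch below is one possible approach, not a definitive implementation. The names parseLine and collect, the limit parameter, and the sh command standing in for mycommand.sh are all assumptions made for illustration, and the example expects a POSIX shell to be available.

```kotlin
import java.io.BufferedReader

// Matches an all-caps word followed by a number, e.g. "FOOBAR 1234".
val lineRegex = Regex("""([A-Z]+)\s+(\d+)""")

// Hypothetical helper: extracts the key and the number, or null if the line does not match.
fun parseLine(line: String): Pair<String, String>? =
    lineRegex.matchEntire(line.trim())?.destructured?.let { (key, value) -> key to value }

// Hypothetical helper: handles lines one at a time as they arrive and stops
// after `limit` matching lines, so it also returns for a process that never exits.
fun collect(reader: BufferedReader, limit: Int): Map<String, String> {
    val result = LinkedHashMap<String, String>()
    reader.useLines { lines ->
        lines.mapNotNull(::parseLine)
            .take(limit)
            .forEach { (key, value) -> result[key] = value }
    }
    return result
}

fun main() {
    // Stand-in for mycommand.sh: prints two lines with a pause, then keeps running.
    val proc = ProcessBuilder("sh", "-c", "echo FOOBAR 1234; sleep 1; echo BARFOO 54657; sleep 60")
        .redirectErrorStream(true)
        .start()
    val map = collect(proc.inputStream.bufferedReader(), limit = 2)
    println(map)   // {FOOBAR=1234, BARFOO=54657}
    proc.destroy() // asks the subprocess to terminate
}
```

Process.destroy() requests termination in an implementation-dependent way; on Unix-like systems this is typically SIGTERM rather than SIGINT, and destroyForcibly() is available if the process ignores it.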

Turbo C++ : while(fin) vs while(!fin.eof())

I was told that I should be using while(fin) instead of while(!fin.eof()) when reading a file.
What exactly is the difference?
Edit: I do know that while(fin) actually checks the stream object and that when it becomes NULL, the loop breaks and it covers eof and fail flags.
But my course teacher says that fin.eof() is better so I need to understand the fundamental operation that's going on here.
Which one is the right practice?
Note: This is not a duplicate, I need assistance in Turbo C++ and with binary files.
I'm basically trying to read a file using a class object.
First of all I am assuming fin is your fstream object. In which case your teacher would not have told you to use while(fin.eof()) for reading from file. She would have told to use while(!fin.eof()).
Let me explain. eof() is a member of the fstream class which returns a true or false value depending on whether the End Of File (eof) of the file you are reading has been reached. Thus while eof() function returns 0 it means the end of file has not been reached and loop continues to execute, but when eof() returns 1 the end of the file has been reached and the loop exits.
The while(fin) loop works because the stream object fin, when tested as a condition, reflects an error state inside the object: it evaluates to false once any function like read or write or open has failed. Thus the loop runs as long as the read function inside the loop succeeds.
Personally I would not suggest either of them.
I would suggest
//assume a class abc.
abc ob;
while(fin.read((char*)&ob, sizeof(ob)))
{}
Or
while(fin.getline(parameters))
{}
This loop reads the file record inside the loop condition and if nothing was read due to the end of file being reached, the loop is exited.
The problem with while(!fin.eof()) is that eof() only returns 1 after a read operation has already tried to read past the end of the file. The flag is not set merely because the file pointer has reached the end of the data; a read has to fail first. All the function actually does is return this flag.
This works fine when you are reading lines or words, but when you are reading successive records of a class from a file, this method will fail.
Consider
class abc
{} a;
fstream fin("file");
while(!fin.eof())
{
    fin.read((char*)&a, sizeof(a));
    a.display(); // display is a member function which displays the info
}
This displays the last record twice. After the last record has been read, the file pointer is at the end of the file, but no read has failed yet, so eof() still returns 0. The loop is entered again, and this time the read function fails and sets the eof flag. The values already in the variable a, that is the previous record, are displayed again.
One good method is to do something like this:
while ( instream.read(...) && !instream.eof() ) { //Reading a binary file
    Statement1;
    Statement2;
}
or in case of a text file:
while ( (ch = instream.get()) && !instream.eof() ) { //To read a single character
    Statement1;
    Statement2;
}
Here, the object is being read within the while loop's condition statement and then the value of eof flag is being tested.
This wouldn't result in undesired outputs.
Here we are checking the status of the actual I/O operation and the eof together. You may also check for the fail flag.
I would like to point out that according to @RetiredNinja, we may only check for the I/O operation.
That is:
while ( instream.read(...) ) { //Reading a binary file
    Statement1;
    Statement2;
}
A quick and easy workaround that worked for me to avoid any problems when using eof is to check for it after the first reading and not as a condition of the while loop itself. Something like this:
while (true) // no conditions
{
    filein >> string; // an example reading, could be any kind of file reading instruction
    if (filein.eof()) break; // break the while loop if eof was reached
    // the rest of the code
}