How to read a single character from input as u8? - input

I am currently building a simple interpreter for this language for practice. The only problem left to overcome is reading a single byte as a character from user input. I have the following code so far, but I need a way to turn the String that the second lines makes into a u8 or another integer that I can cast:
let input = String::new()
let string = std::io::stdin().read_line(&mut input).ok().expect("Failed to read line");
let bytes = string.chars().nth(0) // Turn this to byte?
The value in bytes should be a u8 which I can cast to a i32 to use elsewhere. Perhaps there is a simpler way to do this, otherwise I will use any solution that works.

Reading just a byte and casting it to i32:
use std::io::Read;
let input: Option<i32> = std::io::stdin()
.bytes()
.next()
.and_then(|result| result.ok())
.map(|byte| byte as i32);
println!("{:?}", input);

First, make your input mutable, then use bytes() instead of chars().
let mut input = String::new();
let string = std::io::stdin().read_line(&mut input).ok().expect("Failed to read line");
let bytes = input.bytes().nth(0).expect("no byte read");
Please note that Rust strings are a sequence of UTF-8 codepoints, which are not necessarily byte-sized. Depending on what you are trying to achieve, using a char may be the better option.

Related

Setting the value of a System::Drawing::Imaging::PropertyItem^

I would like to give my Bitmap a PropertyItem value but i am not really sure how to give it a System::String^ value.
System::Drawing::Imaging::PropertyItem^ propItem = gcnew System::Drawing::Imaging::PropertyItem;
System::String^ newValue = gcnew System::String("newValue");
propItem->Id = PropertyTagImageTitle;
propItem->Len = 9;
propItem->Type = PropertyTagTypeASCII;
propItem->Value = newValue;
bmp->SetPropertyItem(propItem);
"System::Drawing::Imaging::PropertyItem::Value::set" cannot be called with the given argument list."
Argument types are(System::String^)
Object type is System::Drawing::Imaging::PropertyItem^
Hans Passant's answer is correct. I've implemented it like followed:
System::Drawing::Image^ theImage = System::Drawing::Image::FromFile("C:\\image.png");
System::Text::Encoding^ utf8 = System::Text::Encoding::UTF8;
array<System::Drawing::Imaging::PropertyItem^>^ propItem = theImage->PropertyItems;
System::String^ newValue = gcnew System::String("newValue");
propItem->Id = PropertyTagImageTitle;
propItem[0]->Len = 18;
propItem->Type = PropertyTagTypeByte;
array<Char>^propItemValue = newValue->ToCharArray();
array<byte>^ utf8Bytes = utf8->GetBytes(propItemValue);
propItem[0]->Value = utf8Bytes;
theImage->SetPropertyItem(propItem[0]);
The Value property type is array<Byte>^. That can be anything you want. But a conversion is required and for a string you have to fret a great deal about the encoding you use.
The MSDN docs and PropertyTagTypeASCII makes no bones about it, you are expected to use Encoding::ASCII to make the conversion, use its GetBytes() method to generate the array. But that tends to be a problem, it is where you live, the world does not speak ASCII and you may have to violate the warranty.
In general it is poorly standardized for image meta data, the spec rarely goes beyond specifying an array of bytes and leave it unspecified how it should be interpreted. Practically you can choose between Encoding::Default and Encoding::UTF8 to make the conversion. Encoding::Default is pretty likely to produce a readable string on the same machine on which the image was generated. But if the image is going to travel around the planet then utf8 tends to be the better choice. YMMV. In Germany you'd want to check if glyphs like ß and Ü come out as expected.

Why does .flat_map() with .chars() not work with std::io::Lines, but does with a vector of Strings?

I am trying to iterate over characters in stdin. The Read.chars() method achieves this goal, but is unstable. The obvious alternative is to use Read.lines() with a flat_map to convert it to a character iterator.
This seems like it should work, but doesn't, resulting in borrowed value does not live long enough errors.
use std::io::BufRead;
fn main() {
let stdin = std::io::stdin();
let mut lines = stdin.lock().lines();
let mut chars = lines.flat_map(|x| x.unwrap().chars());
}
This is mentioned in Read file character-by-character in Rust, but it does't really explain why.
What I am particularly confused about is how this differs from the example in the documentation for flat_map, which uses flat_map to apply .chars() to a vector of strings. I don't really see how that should be any different. The main difference I see is that my code needs to call unwrap() as well, but changing the last line to the following does not work either:
let mut chars = lines.map(|x| x.unwrap());
let mut chars = chars.flat_map(|x| x.chars());
It fails on the second line, so the issue doesn't appear to be the unwrap.
Why does this last line not work, when the very similar line in the documentation doesn't? Is there any way to get this to work?
Start by figuring out what the type of the closure's variable is:
let mut chars = lines.flat_map(|x| {
let () = x;
x.unwrap().chars()
});
This shows it's a Result<String, io::Error>. After unwrapping it, it will be a String.
Next, look at str::chars:
fn chars(&self) -> Chars
And the definition of Chars:
pub struct Chars<'a> {
// some fields omitted
}
From that, we can tell that calling chars on a string returns an iterator that has a reference to the string.
Whenever we have a reference, we know that the reference cannot outlive the thing that it is borrowed from. In this case, x.unwrap() is the owner. The next thing to check is where that ownership ends. In this case, the closure owns the String, so at the end of the closure, the value is dropped and any references are invalidated.
Except the code tried to return a Chars that still referred to the string. Oops. Thanks to Rust, the code didn't segfault!
The difference with the example that works is all in the ownership. In that case, the strings are owned by a vector outside of the loop and they do not get dropped before the iterator is consumed. Thus there are no lifetime issues.
What this code really wants is an into_chars method on String. That iterator could take ownership of the value and return characters.
Not the maximum efficiency, but a good start:
struct IntoChars {
s: String,
offset: usize,
}
impl IntoChars {
fn new(s: String) -> Self {
IntoChars { s: s, offset: 0 }
}
}
impl Iterator for IntoChars {
type Item = char;
fn next(&mut self) -> Option<Self::Item> {
let remaining = &self.s[self.offset..];
match remaining.chars().next() {
Some(c) => {
self.offset += c.len_utf8();
Some(c)
}
None => None,
}
}
}
use std::io::BufRead;
fn main() {
let stdin = std::io::stdin();
let lines = stdin.lock().lines();
let chars = lines.flat_map(|x| IntoChars::new(x.unwrap()));
for c in chars {
println!("{}", c);
}
}
See also:
How can I store a Chars iterator in the same struct as the String it is iterating on?
Is there an owned version of String::chars?

How to extract values from &mut iterator?

I am trying to make an iterator that maps a string to an integer:
fn main() {
use std::collections::HashMap;
let mut word_map = HashMap::new();
word_map.insert("world!", 0u32);
let sentence: Vec<&str> = vec!["Hello", "world!"];
let int_sentence: Vec<u32> = sentence.into_iter()
.map(|x| word_map.entry(x).or_insert(word_map.len() as u32))
.collect();
}
(Rust playground)
This fails with
the trait core::iter::FromIterator<&mut u32> is not implemented for the type collections::vec::Vec<u32>
Adding a dereference operator around the word_map.entry().or_insert() expression does not work as it complains about borrowing which is surprising to me as I'm just trying to copy the value.
The borrow checker uses lexical lifetime rules, so you can't have conflicting borrows in a single expression. The solution is to extract getting the length into a separate let statement:
let int_sentence: Vec<u32> = sentence.into_iter()
.map(|x| *({let len = word_map.len() as u32;
word_map.entry(x).or_insert(len)}))
.collect();
Such issues will hopefully go away when Rust supports non-lexical lifetimes.

Reading an integer from input and assigning it to a variable

I've been trying to find an easy way to read variables in Rust, but haven't had any luck so far. All the examples in the Rust Book deal with strings AFAIK, I couldn't find anything concerning integers or floats that would work.
I don't have a Rust compiler on this machine, but based in part on this answer that comes close, you want something like...
let user_val = match input_string.parse::<i32>() {
Ok(x) => x,
Err(_) => -1,
};
Or, as pointed out in the comments,
let user_val = input_string.parse::<i32>().unwrap_or(-1);
...though your choice in integer size and default value might obviously be different, and you don't always need that type qualifier (::<i32>) for parse() where the type can be inferred from the assignment.
To read user input, you always read a set of bytes. Sometimes, you can interpret those bytes as a UTF-8 string. You can then further interpret the string as an integral or floating point number (or lots of other things, like an IP address).
Here's a complete example of reading a single line of input and parsing it as a 32-bit signed integer:
use std::io;
fn main() {
let mut input = String::new();
io::stdin().read_line(&mut input).expect("Not a valid string");
let input_num: i32 = input.trim().parse().expect("Not a valid number");
println!("Your number plus one is {}", input_num + 1);
}
Note that no user-friendly error handling is taking place. The program simply panics if reading input or parsing fails. Running the program produces:
$ ./input
41
Your number plus one is 42
A set of bytes comprises an input. In Rust, you accept the input as a UTF-8 String. Then you parse the string to an integer or floating point number. In simple ways you accept the string and parse it, then write an expect`` statement for both, to display a message to the user what went wrong when the program panics during runtime.
fn main() {
let mut x = String::new();
std::io::stdin().read_line(&mut x)
.expect("Failed to read input.");
let x: u32 = x.trim().parse()
.expect("Enter a number not a string.");
println!("{:?}", x);
}
If the program fails to parse the input string then it panics and displays an error message. Notice that the program still panics and we are not handling an error perfectly. One more thing to notice is that we can use the same variable name x and not some x_int because of the variable shadowing feature. To handle the error better we can use the match construct.
fn main() {
let mut x = String::new();
match std::io::stdin().read_line(&mut x) {
Ok(_) => println!("String has been taken in."),
Err(_) => {
println!("Failed to read input.");
return;
},
};
let x: u32 = match x.trim().parse() {
Ok(n) => {
println!("Converted string to int.");
n
},
Err(_) => {
println!("Failed to parse.");
return;
},
};
println!("{:?}", x);
}
This is longer way but a nicer way to handle errors and input and parse a number.

Is it valid to rebind a variable in a while loop?

Is it valid to rebind a mutable variable in a while loop? I am having trouble getting the following trivial parser code to work. My intention is to replace the newslice binding with a progressively shorter slice as I copy characters out of the front of the array.
/// Test if a char is an ASCII digit
fn is_digit(c:u8) -> bool {
match c {
30|31|32|33|34|35|36|37|38|39 => true,
_ => false
}
}
/// Parse an integer from the front of an ascii string,
/// and return it along with the remainder of the string
fn parse_int(s:&[u8]) -> (u32, &[u8]) {
use std::str;
assert!(s.len()>0);
let mut newslice = s; // bytecopy of the fat pointer?
let mut n:Vec<u8> = vec![];
// Pull the leading digits into a separate array
while newslice.len()>0 && is_digit(newslice[0])
{
n.push(newslice[0]);
newslice = newslice.slice(1,newslice.len()-1);
//newslice = newslice[1..];
}
match from_str::<u32>(str::from_utf8(newslice).unwrap()) {
Some(i) => (i,newslice),
None => panic!("Could not convert string to int. Corrupted pgm file?"),
}
}
fn main(){
let s:&[u8] = b"12345";
assert!(s.len()==5);
let (i,newslice) = parse_int(s);
assert!(i==12345);
println!("length of returned slice: {}",newslice.len());
assert!(newslice.len()==0);
}
parse_int is failing to return a slice that is smaller than the one I passed in:
length of returned slice: 5
task '<main>' panicked at 'assertion failed: newslice.len() == 0', <anon>:37
playpen: application terminated with error code 101
Run this code in the rust playpen
As Chris Morgan mentioned, your call to slice passes the wrong value for the end parameter. newslice.slice_from(1) yields the correct slice.
is_digit tests for the wrong byte values. You meant to write 0x30, etc. instead of 30.
You call str::from_utf8 on the wrong value. You meant to call it on n.as_slice() rather than newslice.
Rebinding variables like that is perfectly fine. The general rule is simple: if the compiler doesn’t complain, it’s OK.
It’s a very simple error that you’ve made: your slice end point is incorrect.
slice produces the interval [start, end)—a half-open range, not closed. Therefore when you wish to just remove the first character, you should be writing newslice.slice(1, newslice.len()), not newslice.slice(1, newslice.len() - 1). You could also write newslice.slice_from(1).