Why does indexing a HashMap not return a reference?

I am writing the following test code.
fn test() {
let mut m = HashMap::new();
m.insert("aaa".to_string(), "bbb".to_string());
let a = m["aaa"]; // error [E0507] cannot move out of index of `HashMap<String, String>`
let a = m.index("aaa"); // ok, the type of a is &String. I think the compiler adds & to m.
let a :&String = (&m).index("aaa"); // ok, the type of a is &String.
println!("{:?}", m["aaa"]); // ok
}
I do not understand why the type of m["aaa"] is String rather than &String. Because index(&self, key: &Q) -> &V of the Index trait takes &self, I expected the compiler to add a & to m, so that m["aaa"] would have type &String and the String "bbb" would not be moved out of m.
If the compiler did not add & to m, it would not find the index() method, and the error would be something like: m cannot be indexed by "aaa".

From the docs for Index:
container[index] is actually syntactic sugar for *container.index(index)
So when you write m["aaa"], the compiler adds a * that dereferences the value returned by Index::index, whereas when you call m.index("aaa"), you get the &String reference directly.
As pointed out by @user4815162342, programmers are supposed to make their intent explicit by writing either &m["aaa"] or m["aaa"].clone().
Moreover println!("{:?}", m["aaa"]); works because the println! macro does add a & to all the values it accesses¹ to prevent accidental moves caused by display, and this cancels out the * added by the compiler.
(1) This is indirectly documented in the docs for the format_args! macro.
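The three explicit forms mentioned above can be sketched as follows. This is only an illustration; the helper sample_map is a name invented here, not part of the original question.

```rust
use std::collections::HashMap;
use std::ops::Index;

// Hypothetical helper that builds the map used in the question.
fn sample_map() -> HashMap<String, String> {
    let mut m = HashMap::new();
    m.insert("aaa".to_string(), "bbb".to_string());
    m
}

fn main() {
    let m = sample_map();

    // `m["aaa"]` desugars to `*m.index("aaa")`, a place of type `String`.
    // Taking `&` of that place borrows the value instead of moving it.
    let by_ref: &String = &m["aaa"];

    // Calling the trait method directly skips the implicit `*`.
    let by_method: &String = m.index("aaa");

    // Cloning produces an independent, owned `String`.
    let owned: String = m["aaa"].clone();

    assert_eq!(by_ref, by_method);
    assert_eq!(owned, "bbb");

    // `println!` borrows its arguments, so nothing is moved here either.
    println!("{:?}", m["aaa"]);
    assert_eq!(m.len(), 1);
}
```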

Related

fn foo() -> Result<()> throws "expected 2 type arguments"

Why isn't Result<()> allowed when compiling this bit of Rust code? Is it a breaking change between Rust editions?
fn run() -> Result<()> {
let (exit_tx, exit_rx) = channel();
thread::spawn(move || {
do_things_with_tx(&exit_tx);
});
match exit_rx.recv() {
Ok(result) => if let Err(reason) = result {
return Err(reason);
},
Err(e) => {
return Err(e.into());
},
}
Ok(())
}
The compiler says:
error[E0107]: wrong number of type arguments: expected 2, found 1
--> src/main.rs:1000:18
|
1000 | fn run_wifi() -> Result<()> {
| ^^^^^^^^^^ expected 2 type arguments
When I tweak the return type to Result<(), Err>, it says:
error[E0107]: wrong number of type arguments: expected 2, found 0
--> src/main.rs:1000:29
|
1000 | fn run() -> Result<(), Err> {
| ^^^ expected 2 type arguments
This is from the wifi-connect project.
The definition of Result is, and has always been, the following:
pub enum Result<T, E> {
Ok(T),
Err(E),
}
This definition is even presented in The Rust Programming Language to show how simple it is. As a generic sum type of an OK outcome and an error outcome, it always expects two type parameters, and the compiler will complain if it cannot infer them or if the list of type arguments does not have the expected length.
On the other hand, one may find many libraries and respective docs showing a Result with a single type argument, as in Result<()>. What gives?
It's still no magic. By convention, libraries create type aliases for result types at the level of a crate or module. This works pretty well because it is common for those to produce errors of the same, locally created type.
pub type Result<T> = std::result::Result<T, Error>;
Or alternatively, a definition that can still be used like the original two-parameter result type, because the error type is only a default:
pub type Result<T, E = Error> = std::result::Result<T, E>;
This pattern is so common that some error helper crates, such as error-chain, will automatically create a result alias type for each error declared.
As such, if you are using a library that may or may not use error-chain, you are expected to assume that mentions of Result<T> are local type aliases to a domain-specific Result<T, Error>. In case of doubt, clicking on that type in the generated documentation pages will direct you to the concrete definition (in this case, the alias).
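As a minimal sketch of the alias convention described above, assuming a made-up crate-level Error type and two made-up functions (parse_positive and parse_raw are not from any real library): the alias names std::result::Result by its full path, so it does not refer to itself.

```rust
use std::fmt;

// Hypothetical crate-level error type, invented for this sketch.
#[derive(Debug)]
pub struct Error(String);

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

impl std::error::Error for Error {}

// One-argument use picks the default `E = Error`; two-argument use still works.
// The right-hand side is fully qualified so the alias is not recursive.
pub type Result<T, E = Error> = std::result::Result<T, E>;

// Uses the alias with a single type argument.
fn parse_positive(s: &str) -> Result<u32> {
    let n = s
        .trim()
        .parse::<u32>()
        .map_err(|e| Error(format!("not a number: {}", e)))?;
    if n == 0 {
        return Err(Error("zero is not positive".to_string()));
    }
    Ok(n)
}

// Uses the same alias with an explicit error type.
fn parse_raw(s: &str) -> Result<u32, std::num::ParseIntError> {
    s.trim().parse::<u32>()
}

fn main() {
    println!("{:?}", parse_positive("42"));
    println!("{:?}", parse_positive("0"));
    println!("{:?}", parse_raw("x"));
}
```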
From The Rust Programming Language section The ? Operator Can Only Be Used in Functions That Return Result
use std::error::Error;
use std::fs::File;
fn main() -> Result<(), Box<dyn Error>> {
let f = File::open("hello.txt")?;
Ok(())
}
TL;DR
use std::io::Result;
Link to the type description
Long answer
I believe that the top-voted answer given by E_net4 the comment flagger is correct. But it doesn't work if applied blindly. In both cases
this
pub type Result<T> = Result<T, Error>;
and this
pub type Result<T, E = Error> = Result<T, E>;
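will give a cycle error, because inside the alias definition the name Result on the right-hand side refers to the alias itself rather than to std::result::Result: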
error[E0391]: cycle detected when expanding type alias `Result`
--> src\main.rs:149:33
|
149 | pub type Result<T, E = Error> = Result<T, E>;
| ^^^^^^^^^^^^
|
= note: ...which immediately requires expanding type alias `Result` again
= note: type aliases cannot be recursive
= help: consider using a struct, enum, or union instead to break the cycle
= help: see <https://doc.rust-lang.org/reference/types.html#recursive-types> for more information
So, as much as users of SO may not want to admit it, Gabriel soft is very close to an elegant solution, because this type alias
pub type Result<T> = result::Result<T, Error>;
is straight from the standard library.
Here it is: our desired Result with one generic argument is defined in std::io (docs). To fix the problem I added
use std::io::Result;
fn some_func() -> Result<()> {
...
}
or
use std::io;
fn some_func() -> io::Result<()> {
...
}
rustc 1.62.1
I solved my own error by making a generic Result type to handle the error.
As the message says, Result requires the generic parameters T and E, so to simplify things I followed this approach:
pub type Result<T> = result::Result<T, Error>;

Why does .flat_map() with .chars() not work with std::io::Lines, but does with a vector of Strings?

I am trying to iterate over characters in stdin. The Read.chars() method achieves this goal, but is unstable. The obvious alternative is to use Read.lines() with a flat_map to convert it to a character iterator.
This seems like it should work, but doesn't, resulting in borrowed value does not live long enough errors.
use std::io::BufRead;
fn main() {
let stdin = std::io::stdin();
let mut lines = stdin.lock().lines();
let mut chars = lines.flat_map(|x| x.unwrap().chars());
}
This is mentioned in Read file character-by-character in Rust, but it doesn't really explain why.
What I am particularly confused about is how this differs from the example in the documentation for flat_map, which uses flat_map to apply .chars() to a vector of strings. I don't really see how that should be any different. The main difference I see is that my code needs to call unwrap() as well, but changing the last line to the following does not work either:
let mut chars = lines.map(|x| x.unwrap());
let mut chars = chars.flat_map(|x| x.chars());
It fails on the second line, so the issue doesn't appear to be the unwrap.
Why does this last line not work, when the very similar line in the documentation does? Is there any way to get this to work?
Start by figuring out what the type of the closure's variable is:
let mut chars = lines.flat_map(|x| {
let () = x;
x.unwrap().chars()
});
This shows it's a Result<String, io::Error>. After unwrapping it, it will be a String.
Next, look at str::chars:
fn chars(&self) -> Chars
And the definition of Chars:
pub struct Chars<'a> {
// some fields omitted
}
From that, we can tell that calling chars on a string returns an iterator that has a reference to the string.
Whenever we have a reference, we know that the reference cannot outlive the thing that it is borrowed from. In this case, x.unwrap() is the owner. The next thing to check is where that ownership ends. In this case, the closure owns the String, so at the end of the closure, the value is dropped and any references are invalidated.
Except the code tried to return a Chars that still referred to the string. Oops. Thanks to Rust, the code didn't segfault!
The difference with the example that works is all in the ownership. In that case, the strings are owned by a vector outside of the loop and they do not get dropped before the iterator is consumed. Thus there are no lifetime issues.
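For contrast, here is a small sketch of the working case, where the strings are owned outside the closure. The function name chars_of is made up for this illustration.

```rust
// The slice (and the `String`s in it) outlives the iterator, so each
// `Chars<'_>` returned from the closure borrows data that is still alive.
fn chars_of(lines: &[String]) -> Vec<char> {
    lines.iter().flat_map(|line| line.chars()).collect()
}

fn main() {
    let lines = vec!["ab".to_string(), "cd".to_string()];
    let chars = chars_of(&lines);
    assert_eq!(chars, vec!['a', 'b', 'c', 'd']);
    // `lines` is still owned here; nothing was dropped inside the closure.
    assert_eq!(lines.len(), 2);
}
```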
What this code really wants is an into_chars method on String. That iterator could take ownership of the value and return characters.
Not the maximum efficiency, but a good start:
struct IntoChars {
s: String,
offset: usize,
}
impl IntoChars {
fn new(s: String) -> Self {
IntoChars { s: s, offset: 0 }
}
}
impl Iterator for IntoChars {
type Item = char;
fn next(&mut self) -> Option<Self::Item> {
let remaining = &self.s[self.offset..];
match remaining.chars().next() {
Some(c) => {
self.offset += c.len_utf8();
Some(c)
}
None => None,
}
}
}
use std::io::BufRead;
fn main() {
let stdin = std::io::stdin();
let lines = stdin.lock().lines();
let chars = lines.flat_map(|x| IntoChars::new(x.unwrap()));
for c in chars {
println!("{}", c);
}
}
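If an extra allocation per line is acceptable, a shorter alternative (not part of the answer above, just a sketch) is to collect each line's characters into an owned Vec<char> before the String is dropped. The function name all_chars and the use of Cursor as a stand-in for stdin are assumptions made for this example.

```rust
use std::io::{BufRead, Cursor};

// Collecting into a `Vec<char>` copies the characters out of the `String`,
// so nothing returned from the closure borrows from the dropped line.
fn all_chars<R: BufRead>(reader: R) -> Vec<char> {
    reader
        .lines()
        .flat_map(|line| line.unwrap().chars().collect::<Vec<char>>())
        .collect()
}

fn main() {
    // `Cursor` stands in for `stdin.lock()` so the sketch runs without input.
    let input = Cursor::new("ab\ncd\n");
    let chars = all_chars(input);
    assert_eq!(chars, vec!['a', 'b', 'c', 'd']);
}
```

Note that lines() strips the line terminators, so newline characters do not appear in the result.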
See also:
How can I store a Chars iterator in the same struct as the String it is iterating on?
Is there an owned version of String::chars?

How to extract values from &mut iterator?

I am trying to make an iterator that maps a string to an integer:
fn main() {
use std::collections::HashMap;
let mut word_map = HashMap::new();
word_map.insert("world!", 0u32);
let sentence: Vec<&str> = vec!["Hello", "world!"];
let int_sentence: Vec<u32> = sentence.into_iter()
.map(|x| word_map.entry(x).or_insert(word_map.len() as u32))
.collect();
}
(Rust playground)
This fails with
the trait core::iter::FromIterator<&mut u32> is not implemented for the type collections::vec::Vec<u32>
Adding a dereference operator around the word_map.entry().or_insert() expression does not work as it complains about borrowing which is surprising to me as I'm just trying to copy the value.
The borrow checker uses lexical lifetime rules, so you can't have conflicting borrows in a single expression. The solution is to extract getting the length into a separate let statement:
let int_sentence: Vec<u32> = sentence.into_iter()
.map(|x| *({let len = word_map.len() as u32;
word_map.entry(x).or_insert(len)}))
.collect();
Such issues will hopefully go away when Rust supports non-lexical lifetimes.
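A self-contained sketch of the same fix, with the length read into a local before the entry call. The function name index_words is invented for this example.

```rust
use std::collections::HashMap;

// Maps each word to its id, assigning the next free id to unseen words.
fn index_words<'a>(word_map: &mut HashMap<&'a str, u32>, sentence: &[&'a str]) -> Vec<u32> {
    sentence
        .iter()
        .map(|&word| {
            // Read the length first, so the shared borrow has ended
            // before `entry` takes its mutable borrow of the map.
            let next_id = word_map.len() as u32;
            // `or_insert` returns `&mut u32`; `*` copies the value out.
            *word_map.entry(word).or_insert(next_id)
        })
        .collect()
}

fn main() {
    let mut word_map = HashMap::new();
    word_map.insert("world!", 0u32);
    let ids = index_words(&mut word_map, &["Hello", "world!"]);
    assert_eq!(ids, vec![1, 0]);
    assert_eq!(word_map.len(), 2);
}
```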

Chaining iterators of different types

I get type errors when chaining different types of Iterator.
let s = Some(10);
let v = (1..5).chain(s.iter())
.collect::<Vec<_>>();
Output:
<anon>:23:20: 23:35 error: type mismatch resolving `<core::option::Iter<'_, _> as core::iter::IntoIterator>::Item == _`:
expected &-ptr,
found integral variable [E0271]
<anon>:23 let v = (1..5).chain(s.iter())
^~~~~~~~~~~~~~~
<anon>:23:20: 23:35 help: see the detailed explanation for E0271
<anon>:24:14: 24:33 error: no method named `collect` found for type `core::iter::Chain<core::ops::Range<_>, core::option::Iter<'_, _>>` in the current scope
<anon>:24 .collect::<Vec<_>>();
^~~~~~~~~~~~~~~~~~~
<anon>:24:14: 24:33 note: the method `collect` exists but the following trait bounds were not satisfied: `core::iter::Chain<core::ops::Range<_>, core::option::Iter<'_, _>> : core::iter::Iterator`
error: aborting due to 2 previous errors
But it works fine when zipping:
let s = Some(10);
let v = (1..5).zip(s.iter())
.collect::<Vec<_>>();
Output:
[(1, 10)]
Why is Rust able to infer the correct types for zip but not for chain and how can I fix it? n.b. I want to be able to do this for any iterator, so I don't want a solution that just works for Range and Option.
First, note that the iterators yield different types. I've added an explicit u8 to the numbers to make the types more obvious:
fn main() {
let s = Some(10u8);
let r = (1..5u8);
let () = s.iter().next(); // Option<&u8>
let () = r.next(); // Option<u8>
}
When you chain two iterators, both iterators must yield the same type. This makes sense as the iterator cannot "switch" what type it outputs when it gets to the end of one and begins on the second:
fn chain<U>(self, other: U) -> Chain<Self, U::IntoIter>
where U: IntoIterator<Item=Self::Item>
// ^~~~~~~~~~~~~~~ This means the types must match
So why does zip work? Because it doesn't have that restriction:
fn zip<U>(self, other: U) -> Zip<Self, U::IntoIter>
where U: IntoIterator
// ^~~~ Nothing here!
This is because zip returns a tuple with one value from each iterator; a new type, distinct from either source iterator's type. One iterator could be an integral type and the other could return your own custom type for all zip cares.
Why is Rust able to infer the correct types for zip but not for chain
There is no type inference happening here; that's a different thing. This is just plain-old type mismatching.
and how can I fix it?
In this case, your inner iterator yields a reference to an integer, a Clone-able type, so you can use cloned to make a new iterator that clones each value and then both iterators would have the same type:
fn main() {
let s = Some(10);
let v: Vec<_> = (1..5).chain(s.iter().cloned()).collect();
}
If you are done with the option, you can also use a consuming iterator with into_iter:
fn main() {
let s = Some(10);
let v: Vec<_> = (1..5).chain(s.into_iter()).collect();
}
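Since the question asks for something that works for any iterator, the same idea can be sketched generically. The function chain_copied is a made-up name, and it uses copied rather than cloned, which assumes the item type is Copy.

```rust
// Chains an iterator of owned values with an iterator of references
// by copying each referenced value, so both sides yield `T`.
fn chain_copied<'a, T, A, B>(owned: A, borrowed: B) -> Vec<T>
where
    T: Copy + 'a,
    A: IntoIterator<Item = T>,
    B: IntoIterator<Item = &'a T>,
{
    owned
        .into_iter()
        .chain(borrowed.into_iter().copied())
        .collect()
}

fn main() {
    let s = Some(10);
    let v = chain_copied(1..5, s.iter());
    assert_eq!(v, vec![1, 2, 3, 4, 10]);
}
```

For item types that are Clone but not Copy, replacing the bound with Clone and the adapter with cloned should work the same way.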

Is it valid to rebind a variable in a while loop?

Is it valid to rebind a mutable variable in a while loop? I am having trouble getting the following trivial parser code to work. My intention is to replace the newslice binding with a progressively shorter slice as I copy characters out of the front of the array.
/// Test if a char is an ASCII digit
fn is_digit(c:u8) -> bool {
match c {
30|31|32|33|34|35|36|37|38|39 => true,
_ => false
}
}
/// Parse an integer from the front of an ascii string,
/// and return it along with the remainder of the string
fn parse_int(s:&[u8]) -> (u32, &[u8]) {
use std::str;
assert!(s.len()>0);
let mut newslice = s; // bytecopy of the fat pointer?
let mut n:Vec<u8> = vec![];
// Pull the leading digits into a separate array
while newslice.len()>0 && is_digit(newslice[0])
{
n.push(newslice[0]);
newslice = newslice.slice(1,newslice.len()-1);
//newslice = newslice[1..];
}
match from_str::<u32>(str::from_utf8(newslice).unwrap()) {
Some(i) => (i,newslice),
None => panic!("Could not convert string to int. Corrupted pgm file?"),
}
}
fn main(){
let s:&[u8] = b"12345";
assert!(s.len()==5);
let (i,newslice) = parse_int(s);
assert!(i==12345);
println!("length of returned slice: {}",newslice.len());
assert!(newslice.len()==0);
}
parse_int is failing to return a slice that is smaller than the one I passed in:
length of returned slice: 5
task '<main>' panicked at 'assertion failed: newslice.len() == 0', <anon>:37
playpen: application terminated with error code 101
Run this code in the rust playpen
As Chris Morgan mentioned, your call to slice passes the wrong value for the end parameter. newslice.slice_from(1) yields the correct slice.
is_digit tests for the wrong byte values. You meant to write 0x30, etc. instead of 30.
You call str::from_utf8 on the wrong value. You meant to call it on n.as_slice() rather than newslice.
Rebinding variables like that is perfectly fine. The general rule is simple: if the compiler doesn’t complain, it’s OK.
It’s a very simple error that you’ve made: your slice end point is incorrect.
slice produces the interval [start, end)—a half-open range, not closed. Therefore when you wish to just remove the first character, you should be writing newslice.slice(1, newslice.len()), not newslice.slice(1, newslice.len() - 1). You could also write newslice.slice_from(1).
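The code in the question uses pre-1.0 APIs (slice, slice_from, from_str) that no longer exist. As a rough sketch of how the three fixes above might look in current Rust, assuming range indexing in place of slice_from and str::parse in place of from_str:

```rust
/// Test if a byte is an ASCII digit (0x30 through 0x39).
fn is_digit(c: u8) -> bool {
    matches!(c, 0x30..=0x39)
}

/// Parse an integer from the front of an ASCII string,
/// and return it along with the remainder of the string.
fn parse_int(s: &[u8]) -> (u32, &[u8]) {
    assert!(!s.is_empty());
    let mut rest = s; // copies the fat pointer, not the bytes
    let mut digits: Vec<u8> = Vec::new();
    // Pull the leading digits into a separate buffer.
    while !rest.is_empty() && is_digit(rest[0]) {
        digits.push(rest[0]);
        // Rebinding to a shorter slice; `[1..]` runs to the end.
        rest = &rest[1..];
    }
    // Convert the collected digits, not the remainder.
    let text = std::str::from_utf8(&digits).expect("ASCII digits are valid UTF-8");
    let value = text
        .parse::<u32>()
        .expect("Could not convert string to int. Corrupted pgm file?");
    (value, rest)
}

fn main() {
    let (value, rest) = parse_int(b"12345");
    assert_eq!(value, 12345);
    assert!(rest.is_empty());
    println!("length of returned slice: {}", rest.len());
}
```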