Why does indexing need to be referenced? [duplicate] - indexing

This question already has answers here:
What is the return type of the indexing operation?
(2 answers)
Closed 2 years ago.
I'm currently learning Rust coming from JavaScript.
My problem is the following:
fn main() {
let name = String::from("Tom");
let sliced = name[..2];
println!("{}, {}", name, sliced);
}
This doesn't work. Saying "doesn't have a size known at compile-time".
To fix this I need to add & the referencing operator.
fn main() {
let name = String::from("Tom");
let sliced = &name[..2];
println!("{}, {}", name, sliced);
}
I know I need to add & before name and & is the referencing operator. But I just don't know why I actually need to do that?
By referencing a variable the reference refers to variable name but does not own it. The original value will not get dropped if my reference gets out of scope. Does that mean that the variable gets out of scope if i do name[...] and the variable gets dropped and because of that i need to create a reference to it to prevent that?
Could somebody explain me that?

I know I need to add & before name and & is the referencing operator. But I just don't know why I actually need to do that.
I understand where the confusion come from, because when you look at index() it returns &Self::Output. So it already returns a reference, what's going on?
It's because the indexing operator is syntactic sugar and uses the Index trait. However, while it uses index() which does return a reference, that is not how it is desugared.
In short x[i] is not translated into x.index(i), but actually to *x.index(i), so the reference is immediately dereferenced. That's how you end up with a str instead of a &str.
let foo = "foo bar"[..3]; // str
// same as
let foo = *"foo bar".index(..3); // str
That's why you need to add the & to get it "back" to a reference.
let foo = &"foo bar"[..3]; // &str
// same as
let foo = &*"foo bar".index(..3); // &str
Alternatively, if you call index() directly, then it isn't implicitly dereferenced of course.
use std::ops::Index;
let foo = "foo bar".index(..3); // &str
Trait std::ops::Index - Rust Documentation:
container[index] is actually syntactic sugar for *container.index(index)
The same applies to IndexMut.

Related

Why does Kotlin's compiler not realize when a variable is initialized in an if statement?

The following example of Kotlin source code returns an error when compiled:
fun main() {
var index: Int // create an integer used to call an index of an array
val myArray = Array(5) {i -> i + 1} // create an array to call from
val condition = true // makes an if statement run true later
if (condition) {
index = 2 // sets index to 2
}
println( myArray[index] ) // should print 2; errors
}
The error says that the example did not initialize the variable index by the time it is called, even though it is guaranteed to initialize within the if statement. I understand that this problem is easily solved by initializing index to anything before the if statement, but why does the compiler not initialize it? I also understand that Kotlin is still in beta; is this a bug, or is it intentional? Finally, I am using Replit as an online IDE; is there a chance that the compiler on the website simply is an outdated compiler?
The compiler checks whether there is a path in your code that the index may not be initialized based on all the path available in your code apart from the value of the parameters. You have an if statement without any else. If you add the else statement you will not get any compile error.

What patterns exist for mocking a single function while testing? [duplicate]

This question already has answers here:
How to mock specific methods but not all of them in Rust?
(2 answers)
How to mock external dependencies in tests? [duplicate]
(1 answer)
How can I test stdin and stdout?
(1 answer)
Is there a way of detecting whether code is being called from tests in Rust?
(1 answer)
What is the proper way to use the `cfg!` macro to choose between multiple implementations?
(1 answer)
Closed 3 years ago.
I have a function generates a salted hash digest for some data. For the salt, it uses a random u32 value. It looks something like this:
use rand::RngCore;
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;
fn hash(msg: &str) -> String {
let salt = rand::thread_rng().next_u32();
let mut s = DefaultHasher::new();
s.write_u32(salt);
s.write(msg.as_bytes());
format!("{:x}{:x}", &salt, s.finish())
}
In a test, I'd like to validate that it produces expected values, given a known salt and string. How do I mock (swizzle?) rand::thread_rng().next_u32() in the test to generate a specific value? In other words, what could replace the comment in this example to make the test pass?
mod tests {
#[test]
fn test_hashes() {
// XXX How to mock ThreadRng::next_u32() to return 3892864592?
assert_eq!(hash("foo"), "e80866501cdda8af09a0a656");
}
}
Some approaches I've looked at:
I'm aware that the ThreadRng returned by rand::thread_rng() implements RngCore, so in theory I could set a variable somewhere to store a reference to a RngCore, and implement my own mocked variant to set during testing. I've taken this sort of approach in Go and Java, but I couldn't get the Rust type checker to allow it.
I looked at the list of mock frameworks, such as MockAll, but they appear to be designed to mock a struct or trait to pass to a method, and this code doesn't pass one, and I wouldn't necessarily want users of the library to be able to pass in a RngCore.
Use the #[cfg(test)] macro to call a different function specified in the tests module, then have that function read the value to return from elsewhere. This I got to work, but had to use an unsafe mutable static variable to set the value for the mocked method to find, which seems gross. Is there a better way?
As a reference, I'll post an answer using the #[cfg(test)] + unsafe mutable static variable technique, but hope there's a more straightforward way to do this sort of thing.
In the test module, use lazy-static to add a static variable with a Mutex for thread safety, create a function like next_u32() to return its value, and have tests set the static variable to a known value. It should fall back on returning a properly random number if it's not set, so here I've made it Vec<u32> so it can tell:
mod tests {
use super::*;
use lazy_static::lazy_static;
use std::sync::Mutex;
lazy_static! {
static ref MOCK_SALT: Mutex<Vec<u32>> = Mutex::new(vec![]);
}
// Replaces random salt generation when testing.
pub fn mock_salt() -> u32 {
let mut sd = MOCK_SALT.lock().unwrap();
if sd.is_empty() {
rand::thread_rng().next_u32()
} else {
let ret = sd[0];
sd.clear();
ret
}
}
#[test]
fn test_hashes() {
MOCK_SALT.lock().unwrap().push(3892864592);
assert_eq!(hash("foo"), "e80866501cdda8af09a0a656");
}
}
Then modify hash() to call tests::mock_salt() instead of rand::thread_rng().next_u32() when testing (the first three lines of the function body are new):
fn hash(msg: &str) -> String {
#[cfg(test)]
let salt = tests::mock_salt();
#[cfg(not(test))]
let salt = rand::thread_rng().next_u32();
let mut s = DefaultHasher::new();
s.write_u32(salt);
s.write(msg.as_bytes());
format!("{:x}{:x}", &salt, s.finish())
}
Then use of the macros allows Rust to determine, at compile time, which function to call, so there's no loss of efficiency in non-test builds. It does mean that there's some knowledge of the tests module in the source code, but it's not included in the binary, so should be relatively safe. I suppose there could be a custom derive macro to automate this somehow. Something like:
#[mock(rand::thread_rng().next_u32())]
let salt = rand::thread_rng().next_u32();
Would auto-generate the mocked method in the tests module (or elsewhere?), slot it in here, and provide functions for the tests to set the value --- only when testing, of course. Seems like a lot, though.
Playground.

How to rewrite this in terms of R.compose

var take = R.curry(function take(count, o) {
return R.pick(R.take(count, R.keys(o)), o);
});
This function takes count keys from an object, in the order, in which they appear. I use it to limit a dataset which was grouped.
I understand that there are placeholder arguments, like R.__, but I can't wrap my head around this particular case.
This is possible thanks to R.converge, but I don't recommend going point-free in this case.
// take :: Number -> Object -> Object
var take = R.curryN(2,
R.converge(R.pick,
R.converge(R.take,
R.nthArg(0),
R.pipe(R.nthArg(1),
R.keys)),
R.nthArg(1)));
One thing to note is that the behaviour of this function is undefined since the order of the list returned by R.keys is undefined.
I agree with #davidchambers that it is probably better not to do this points-free. This solution is a bit cleaner than that one, but is still not to my mind as nice as your original:
// take :: Number -> Object -> Object
var take = R.converge(
R.pick,
R.useWith(R.take, R.identity, R.keys),
R.nthArg(1)
);
useWith and converge are similar in that they accept a number of function parameters and pass the result of calling all but the first one into that first one. The difference is that converge passes all the parameters it receives to each one, and useWith splits them up, passing one to each function. This is the first time I've seen a use for combining them, but it seems to make sense here.
That property ordering issue is supposed to be resolved in ES6 (final draft now out!) but it's still controversial.
Update
You mention that it will take some time to figure this out. This should help at least show how it's equivalent to your original function, if not how to derive it:
var take = R.converge(
R.pick,
R.useWith(R.take, R.identity, R.keys),
R.nthArg(1)
);
// definition of `converge`
(count, obj) => R.pick(R.useWith(R.take, R.identity, R.keys)(count, obj),
R.nthArg(1)(count, obj));
// definition of `nthArg`
(count, obj) => R.pick(R.useWith(R.take, R.identity, R.keys)(count, obj), obj);
// definition of `useWith`
(count, obj) => R.pick(R.take(R.identity(count), R.keys(obj)), obj);
// definition of `identity`
(count, obj) => R.pick(R.take(count, R.keys(obj)), obj);
Update 2
As of version 18, both converge and useWith have changed to become binary. Each takes a target function and a list of helper functions. That would change the above slightly to this:
// take :: Number -> Object -> Object
var take = R.converge(R.pick, [
R.useWith(R.take, [R.identity, R.keys]),
R.nthArg(1)
]);

Modifying self in `iter_mut().map(..)`, aka mutable functional collection operations

How do I convert something like this:
let mut a = vec![1, 2, 3, 4i32];
for i in a.iter_mut() {
*i += 1;
}
to a one line operation using map and a closure?
I tried:
a.iter_mut().map(|i| *i + 1).collect::<Vec<i32>>();
The above only works if I reassign it to a. Why is this? Is map getting a copy of a instead of a mutable reference? If so, how can I get a mutable reference?
Your code dereferences the variable (*i) then adds one to it. Nowhere in there does the original value get changed.
The best way to do what you asked is to use Iterator::for_each:
a.iter_mut().for_each(|i| *i += 1);
This gets an iterator of mutable references to the numbers in your vector. For each item, it dereferences the reference and then increments it.
You could use map and collect, but doing so is non-idiomatic and potentially wasteful. This uses map for the side-effect of mutating the original value. The "return value" of assignment is the unit type () - an empty tuple. We use collect::<Vec<()>> to force the Iterator adapter to iterate. This last bit ::<...> is called the turbofish and allows us to provide a type parameter to the collect call, informing it what type to use, as nothing else would constrain the return type.:
let _ = a.iter_mut().map(|i| *i += 1).collect::<Vec<()>>();
You could also use something like Iterator::count, which is lighter than creating a Vec, but still ultimately unneeded:
a.iter_mut().map(|i| *i += 1).count();
As Ry- says, using a for loop is more idiomatic:
for i in &mut a {
*i += 1;
}

Are there Rust variable naming conventions for things like Option<T>?

Is there a convention for variable naming in cases like the following? I find myself having to have two names, one the optional and one for the unwrapped.
let user = match optional_user {
Some(u) => {
u
}
None => {
new("guest", "guest").unwrap()
}
};
I'm unsure if there is a convention per say, but I often see (and use) maybe for Options. i.e.
let maybe_thing: Option<Thing> = ...
let thing: Thing = ...
Also, in regards to your use of u and user in this situation, it is fine to use user in both places. i.e.
let user = match maybe_user {
Some(user) => user,
...
This is because the match expression will be evaluated prior to the let assignment.
However (slightly off topic) #Manishearth is correct, in this case it would be nicer to use or_else. i.e.
let user = maybe_user.or_else(|| new("guest", "guest")).unwrap();
I'd recommend becoming familiar with the rest of Option's methods too as they are excellent for reducing match boilerplate.
If you're going to use a variable to initialize another and you don't need to use the first variable anymore, you can use the same name for both variables.
let user = /* something that returns Option<?> */;
let user = match user {
Some(u) => {
u
}
None => {
new("guest", "guest").unwrap()
}
};
In the initializer for the second let binding, the identifier user resolves to the first user variable, rather than the one being defined, because that one is not initialized yet. Variables defined in a let statement only enter the scope after the whole let statement. The second user variable shadows the first user variable for the rest of the block, though.
You can also use this trick to turn a mutable variable into an immutable variable:
let mut m = HashMap::new();
/* fill m */
let m = m; // freeze m
Here, the second let doesn't have the mut keyword, so m is no longer mutable. Since it's also shadowing m, you no longer have mutable access to m (though you can still add a let mut later on to make it mutable again).
Firstly, your match block can be replaced by optional_user.or_else(|| new("guest", "guest")).unwrap()
Usually for destructures where the destructured variable isn't used in a large block, a short name like u is common. However it would be better to call it user if the block was larger with many statements.