Is it possible to filter on a vector in-place?

I'd like to remove some elements from a Vec, but vec.iter().filter().collect() creates a new vector with borrowed items.
I'd like to mutate the original Vec without extra memory allocation (and keep the memory of the removed elements as spare capacity of the vector).

If you want to remove elements, you can use retain(), which removes elements from the vector if the closure returns false:
let mut vec = vec![1, 2, 3, 4];
vec.retain(|&x| x % 2 == 0);
assert_eq!(vec, [2, 4]);
If you want to modify the elements in place, you have to do that in a for x in vec.iter_mut().
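As a minimal sketch of both points, the snippet below filters with retain() and then modifies the survivors through iter_mut(). The helper name add_hundred_in_place is made up for illustration; the capacity check relies on retain() never reallocating.

```rust
// Hypothetical helper: modifies every element in place, no new Vec is created.
fn add_hundred_in_place(values: &mut Vec<i32>) {
    for x in values.iter_mut() {
        *x += 100;
    }
}

fn main() {
    let mut vec = vec![1, 2, 3, 4];
    let capacity_before = vec.capacity();

    // Keep only the odd numbers; the buffer is reused, not reallocated.
    vec.retain(|&x| x % 2 != 0);
    assert_eq!(vec, [1, 3]);
    assert_eq!(vec.capacity(), capacity_before);

    add_hundred_in_place(&mut vec);
    assert_eq!(vec, [101, 103]);
}
```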

If you truly want to mutate the vector's elements while filtering it, you can use the nightly-only method Vec::drain_filter, an extremely flexible tool:
#![feature(drain_filter)]
fn main() {
let mut vec = vec![1, 2, 3, 4];
vec.drain_filter(|x| {
if *x % 2 == 0 {
true
} else {
*x += 100;
false
}
});
assert_eq!(vec, [101, 103]);
}
It also gives you access to the removed elements, because the method returns an iterator over them!

Until Vec::drain_filter is stabilized, we can solve the problem with homebrewed Rust:
fn main() {
let mut v = vec![1, 2, 3, 4];
let mut i = 0;
while i < v.len() {
if v[i] % 2 == 0 {
v.remove(i);
} else {
v[i] += 100;
i += 1;
}
}
println!("{:?}", v); // [101, 103]
}
Note that remove() is an O(n) operation (it shifts every later element down), so this loop is O(n^2) in the worst case, but it doesn't allocate memory.
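If a recent stable toolchain is available, Vec::retain_mut (stable since Rust 1.61) may be a simpler way to mutate while filtering in a single pass. This is only a sketch; the function name drop_even_and_bump_odd is made up for the example.

```rust
// Hypothetical helper: removes even numbers and adds 100 to the odd ones,
// all in one pass and without allocating a second vector.
fn drop_even_and_bump_odd(v: &mut Vec<i32>) {
    v.retain_mut(|x| {
        if *x % 2 == 0 {
            false // drop this element
        } else {
            *x += 100; // mutate the element we keep
            true
        }
    });
}

fn main() {
    let mut v = vec![1, 2, 3, 4];
    drop_even_and_bump_odd(&mut v);
    assert_eq!(v, [101, 103]);
}
```

Note that the closure here returns true to keep an element, which is the opposite of the drain_filter predicate shown above.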

I am providing my take for this problem as I was unaware of the retain method:
use std::borrow::BorrowMut;
impl<T> RemoveFilter<T> for Vec<T> {}
pub trait RemoveFilter<T>: BorrowMut<Vec<T>> {
fn remove_filter<F: for<'b> FnMut(&'b T) -> bool>(&mut self, mut cb: F) {
let vec: &mut Vec<T> = self.borrow_mut();
let mut write_to = 0;
let mut read_from = 0;
while read_from < vec.len() {
let maintain = cb(&vec[read_from]); // the callback only needs a shared reference
if maintain {
vec.as_mut_slice().swap(read_from, write_to);
write_to += 1;
}
read_from += 1;
}
vec.truncate(write_to); // drop everything left behind the write cursor
}
}
It shifts the kept elements towards the front as it iterates and then removes anything that is left behind. I think this code can easily be modified to solve other problems.
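For comparison, here is the same read/write cursor idea as a free function, which avoids the extra trait. This is a sketch rather than a drop-in replacement; the name compact_in_place is an assumption of this example.

```rust
// Hypothetical free-function variant of the two-cursor compaction.
// Kept elements are swapped towards the front, then the tail is truncated.
fn compact_in_place<T, F: FnMut(&T) -> bool>(vec: &mut Vec<T>, mut keep: F) {
    let mut write_to = 0;
    for read_from in 0..vec.len() {
        if keep(&vec[read_from]) {
            vec.swap(read_from, write_to);
            write_to += 1;
        }
    }
    // Everything at index >= write_to was rejected by the predicate.
    vec.truncate(write_to);
}

fn main() {
    let mut v = vec![1, 2, 3, 4, 5, 6];
    compact_in_place(&mut v, |x| x % 2 == 0);
    assert_eq!(v, [2, 4, 6]);
}
```

The relative order of the kept elements is preserved, and no allocation happens because truncate only shortens the vector.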

Related

Elegant way of capturing a reference to an integer variable?

I have this snippet:
let mut animation_index = 0 as usize;
let mut ptr : *mut usize = &mut animation_index as _;
{
io_context.window().add_key_callback(
Box::new(move |key_states| {
if key_states[KbKey::Space.to_index()] == KeyActionState::Press
{
unsafe {
*ptr += 1;
println!("{}", animation_index);
}
}
})
);
}
Basically it adds a callback such that if and when I press space, the integer variable animation_index goes up by 1. This works, but requires the use of mutable pointers and unsafe, which is very ugly.
I'd like to have the same logic, but ideally written in pure safe Rust instead.
It looks like you are trying to share a mutable value across threads.
Typically, this is done with atomics, Arc<Mutex<T>> or Arc<RwLock<T>>.
use std::sync::{Arc, RwLock};
let animation_index = Arc::new(RwLock::new(0usize));
{
// a clone of the counter that can be moved into the callback
let animation_index = animation_index.clone();
io_context.window().add_key_callback(
Box::new(move |key_states| {
if key_states[KbKey::Space.to_index()] == KeyActionState::Press
{
let mut index = animation_index.write().unwrap();
*index += 1;
println!("{}", *index);
}
})
);
}
With atomics it would look something like this:
use std::sync::Arc;
use std::sync::atomic::{AtomicUsize, Ordering};
let animation_index = Arc::new(AtomicUsize::new(0));
{
// a clone of the counter that can be moved into the callback
let animation_index = animation_index.clone();
io_context.window().add_key_callback(
Box::new(move |key_states| {
if key_states[KbKey::Space.to_index()] == KeyActionState::Press
{
// fetch_add returns the previous value, so add 1 to get the updated index
let index = animation_index.fetch_add(1, Ordering::SeqCst) + 1;
println!("{}", index);
}
})
);
}
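Since io_context is not available outside the asker's project, here is a self-contained sketch of the atomic approach that can actually be run. The Window struct, its add_key_callback and press_space methods, and the count_presses function are all hypothetical stand-ins, not the real windowing API.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

// Hypothetical stand-in for the windowing library's callback registry.
struct Window {
    callbacks: Vec<Box<dyn FnMut(bool) + Send>>,
}

impl Window {
    fn add_key_callback(&mut self, cb: Box<dyn FnMut(bool) + Send>) {
        self.callbacks.push(cb);
    }

    // Simulates a key event where space is pressed.
    fn press_space(&mut self) {
        for cb in self.callbacks.iter_mut() {
            cb(true);
        }
    }
}

fn count_presses(presses: usize) -> usize {
    let animation_index = Arc::new(AtomicUsize::new(0));
    let mut window = Window { callbacks: Vec::new() };
    {
        // A clone of the counter that is moved into the callback.
        let animation_index = Arc::clone(&animation_index);
        window.add_key_callback(Box::new(move |space_pressed| {
            if space_pressed {
                animation_index.fetch_add(1, Ordering::SeqCst);
            }
        }));
    }
    for _ in 0..presses {
        window.press_space();
    }
    // The original handle still sees every increment made by the callback.
    animation_index.load(Ordering::SeqCst)
}

fn main() {
    assert_eq!(count_presses(3), 3);
}
```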

How do I use a Peekable iterator in a filter closure?

I need to find only the numbers where the next number is the same: [1,2,2,3,4,4] should produce [2,4]. Since I need to peek at the next number, I figured I'd try out using a Peekable iterator and write a filter.
fn main() {
let xs = [1, 2, 2, 3, 4, 4];
let mut iter = xs.iter().peekable();
let pairs = iter.filter(move |num| {
match iter.peek() {
Some(next) => num == next,
None => false,
}
});
for num in pairs {
println!("{}", num);
}
}
I get an error:
error[E0382]: capture of moved value: `iter`
--> src/main.rs:6:15
|
5 | let pairs = iter.filter(move |num| {
| ---- value moved here
6 | match iter.peek() {
| ^^^^ value captured here after move
|
= note: move occurs because `iter` has type `std::iter::Peekable<std::slice::Iter<'_, i32>>`, which does not implement the `Copy` trait
I think this is because iter is being used by the closure, but it hasn't borrowed it, and it can't copy it.
How do I solve this problem of wanting to refer to the iterator inside a filter?
"refer to the iterator inside a filter"
I don't believe you can. When you call filter, it takes ownership of the base iterator:
fn filter<P>(self, predicate: P) -> Filter<Self, P>
where
P: FnMut(&Self::Item) -> bool,
Once you do that, it's gone. There is no more iter. In some similar cases, you can use Iterator::by_ref to mutably borrow the iterator, drive it for a while, then refer back to the original. That won't work in this case because the inner iterator would need to borrow it mutably a second time, which is disallowed.
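To illustrate the by_ref case mentioned above (the one that does work), here is a small sketch. The function name split_at_first_big and the threshold of 3 are made up for the example.

```rust
// Hypothetical example: drive an iterator for a while through by_ref,
// then keep using the original iterator afterwards.
fn split_at_first_big(xs: &[i32]) -> (Vec<i32>, Vec<i32>) {
    let mut iter = xs.iter();

    // by_ref borrows `iter` mutably instead of consuming it.
    let small: Vec<i32> = iter.by_ref().take_while(|&&x| x < 3).cloned().collect();

    // `iter` is still usable here. Note that take_while has already consumed
    // the first element that failed the predicate.
    let rest: Vec<i32> = iter.cloned().collect();

    (small, rest)
}

fn main() {
    let (small, rest) = split_at_first_big(&[1, 2, 3, 4, 5]);
    assert_eq!(small, [1, 2]);
    assert_eq!(rest, [4, 5]);
}
```

This works because the borrow taken by by_ref ends before the iterator is used again; the filter closure in the question would need a second mutable borrow while the first is still alive.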
"find only the numbers where the next number is the same" can be done with tuple_windows from the itertools crate:
extern crate itertools;
use itertools::Itertools;
fn main() {
let input = [1, 2, 2, 3, 4, 4];
let pairs = input
.iter()
.tuple_windows()
.filter_map(|(a, b)| if a == b { Some(a) } else { None });
let result: Vec<_> = pairs.cloned().collect();
assert_eq!(result, [2, 4]);
}
Or if you wanted something using only the standard library:
fn main() {
let xs = [1, 2, 2, 3, 4, 4];
let mut prev = None;
let pairs = xs.iter().filter_map(move |curr| {
let next = if prev == Some(curr) { Some(curr) } else { None };
prev = Some(curr);
next
});
let result: Vec<_> = pairs.cloned().collect();
assert_eq!(result, [2, 4]);
}
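Another standard-library option, assuming the input is a slice (as it is in the question), is slice::windows, which yields overlapping pairs directly. The function name adjacent_equal is made up for this sketch.

```rust
// Hypothetical helper: returns each number that is followed by an equal number.
fn adjacent_equal(xs: &[i32]) -> Vec<i32> {
    xs.windows(2) // overlapping sub-slices of length 2
        .filter(|pair| pair[0] == pair[1])
        .map(|pair| pair[0])
        .collect()
}

fn main() {
    assert_eq!(adjacent_equal(&[1, 2, 2, 3, 4, 4]), [2, 4]);
}
```

This only works for slices and other contiguous data; for arbitrary iterators, the filter_map versions above are the more general choice.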

Return Option inside Loop

The program aims to use a loop to check whether the index of an iterator variable meets certain criteria (e.g., index == 3). If the desired index is found, it should return Some(123), otherwise None.
fn main() {
fn foo() -> Option<i32> {
let mut x = 5;
let mut done = false;
while !done {
x += x - 3;
if x % 5 == 0 {
done = true;
}
for (index, value) in (5..10).enumerate() {
println!("index = {} and value = {}", index, value);
if index == 3 {
return Some(123);
}
}
return None; // catch every other possibility, so the while loop surely returns either a Some or a None
}
}
}
The compiler gives this error:
error[E0308]: mismatched types
--> <anon>:7:9
|
7 | while !done {
| ^ expected enum `std::option::Option`, found ()
|
= note: expected type `std::option::Option<i32>`
= note: found type `()`
I think the error source might be that a while loop evaluates to a (), thus it would return a () instead of Some(123). I don't know how to return a valid Some type inside a loop.
The value of any while true { ... } expression is always (). So the compiler expects your foo to return an Option<i32> but finds the last value in your foo body is ().
To fix this, you can add a return None outside the original while loop. You can also use the loop construct like this:
fn main() {
// run the code
foo();
fn foo() -> Option<i32> {
let mut x = 5;
loop {
x += x - 3;
for (index, value) in (5..10).enumerate() {
println!("index = {} and value = {}", index, value);
if index == 3 {
return Some(123);
}
}
if x % 5 == 0 {
return None;
}
}
}
}
The behaviour of while true { ... } statements is maybe a bit quirky and there have been a few requests to change it.
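The first fix mentioned above (keeping the while loop and putting None after it) could look roughly like the sketch below. The target parameter and the name find_index are additions made for this example, so that both outcomes can be exercised.

```rust
// Sketch: keep the `while` loop and make `None` the final expression.
// `target` is a hypothetical parameter added so both outcomes can be shown.
fn find_index(target: usize) -> Option<i32> {
    let mut x = 5;
    let mut done = false;
    while !done {
        x += x - 3;
        if x % 5 == 0 {
            done = true;
        }
        for (index, _value) in (5..10).enumerate() {
            if index == target {
                return Some(123);
            }
        }
    }
    // The `while` expression itself evaluates to (), so the function's
    // value has to come from this trailing expression.
    None
}

fn main() {
    assert_eq!(find_index(3), Some(123));
    assert_eq!(find_index(10), None);
}
```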

flat_map on Chars causes borrow checker error

I'm trying to generate a sequence like this: 1,2,3,4,5,6,7,8,9,1,0,1,1,1,2...
fn main() {
let iter = (1..).flat_map(|j| j.to_string().chars());
for i in iter {
println!("{}", i);
}
}
This does not work, because j.to_string() goes out of scope I believe (but why?)
p040.rs:2:35: 2:48 error: borrowed value does not live long enough
p040.rs:2 let iter = (1..).flat_map(|j| j.to_string().chars());
^~~~~~~~~~~~~
p040.rs:2:58: 6:2 note: reference must be valid for the block suffix following statement 0 at 2:57...
p040.rs:2 let iter = (1..).flat_map(|j| j.to_string().chars());
p040.rs:3 for i in iter {
p040.rs:4 println!("{}", i);
p040.rs:5 }
p040.rs:6 }
p040.rs:2:35: 2:56 note: ...but borrowed value is only valid for the block at 2:34
p040.rs:2 let iter = (1..).flat_map(|j| j.to_string().chars());
^~~~~~~~~~~~~~~~~~~~~
How could I solve this compiler error?
Iterators are lazy and can only live as long as the value they iterate over. The String returned by j.to_string() is a temporary that only lives inside the closure, and chars() borrows from it, hence the closure cannot return j.to_string().chars().
A simple solution would be to collect the characters before returning:
fn main() {
let iter = (1..).flat_map(|j| j.to_string().chars().collect::<Vec<_>>());
for i in iter {
println!("{}", i);
}
}
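A variation on the same idea, sketched here under the assumption that only decimal digits are produced (so every byte is ASCII), is to hand ownership of the String's bytes to flat_map. That skips the intermediate Vec<char>. The function name digits is made up for the example.

```rust
// Hypothetical helper: yields the digits of 1, 2, 3, ... as chars.
// into_bytes() moves the String's buffer into a Vec<u8>, which flat_map
// can own and iterate, so nothing borrows from a temporary.
fn digits() -> impl Iterator<Item = char> {
    (1u32..)
        .flat_map(|j| j.to_string().into_bytes())
        .map(char::from) // safe here because decimal digits are ASCII
}

fn main() {
    let first: String = digits().take(13).collect();
    assert_eq!(first, "1234567891011");
}
```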
One problem with the solution that uses collect is that it keeps allocating strings and vectors. If you need an implementation that does the minimum amount of allocation, you can implement your own iterator:
#[derive(Default)]
struct NumChars {
num: usize,
num_str: Vec<u8>,
next_index: usize,
}
impl Iterator for NumChars {
type Item = char;
fn next(&mut self) -> Option<char> {
use std::io::Write;
if self.next_index >= self.num_str.len() {
self.next_index = 0;
self.num += 1;
self.num_str.clear();
write!(&mut self.num_str, "{}", self.num).expect("write failed");
}
let index = self.next_index;
self.next_index += 1;
Some(self.num_str[index] as char)
}
}
fn main() {
assert_eq!(
vec!['1', '2', '3', '4', '5', '6', '7', '8', '9', '1', '0', '1', '1'],
NumChars::default().take(13).collect::<Vec<_>>()
);
}

Slow Swift Arrays and Strings performance

Here are two pretty similar Levenshtein distance algorithms.
Swift implementation:
https://gist.github.com/bgreenlee/52d93a1d8fa1b8c1f38b
And Objective-C implementation:
https://gist.github.com/boratlibre/1593632
The Swift one is dramatically slower than the Objective-C implementation.
I've spent a couple of hours trying to make it faster, but it seems that Swift array and String manipulation is not as fast as in Objective-C.
On 2000 random String calculations, the Swift implementation is about 100(!!!) times slower than the Objective-C one.
Honestly speaking, I have no idea what could be wrong, because even this part of the Swift code
func levenshtein(aStr: String, bStr: String) -> Int {
// create character arrays
let a = Array(aStr)
let b = Array(bStr)
...
is a few times slower than the whole algorithm in Objective-C.
Does anyone know how to speed up the Swift calculations?
Thank you in advance!
Update
After all the suggested improvements, the Swift code looks like this.
And it is 4 times slower than Objective-C in release configuration.
import Foundation
class Array2D {
var cols:Int, rows:Int
var matrix:UnsafeMutablePointer<Int>
init(cols:Int, rows:Int) {
self.cols = cols
self.rows = rows
matrix = UnsafeMutablePointer<Int>(malloc(UInt(cols * rows) * UInt(sizeof(Int))))
for i in 0..<cols*rows { // half-open range: 0...cols*rows would write one element past the buffer
matrix[i] = 0
}
}
subscript(col:Int, row:Int) -> Int {
get {
return matrix[cols * row + col] as Int
}
set {
matrix[cols*row+col] = newValue
}
}
func colCount() -> Int {
return self.cols
}
func rowCount() -> Int {
return self.rows
}
}
extension String {
func levenshteinDistanceFromStringSwift(comparingString: NSString) -> Int {
let aStr = self
let bStr = comparingString
// let a = Array(aStr.unicodeScalars)
// let b = Array(bStr.unicodeScalars)
let a:NSString = aStr
let b:NSString = bStr
var dist = Array2D(cols: a.length + 1, rows: b.length + 1)
for i in 1...a.length {
dist[i, 0] = i
}
for j in 1...b.length {
dist[0, j] = j
}
for i in 1...a.length {
for j in 1...b.length {
if a.characterAtIndex(i-1) == b.characterAtIndex(j-1) {
dist[i, j] = dist[i-1, j-1] // noop
} else {
dist[i, j] = min(
dist[i-1, j] + 1, // deletion
dist[i, j-1] + 1, // insertion
dist[i-1, j-1] + 1 // substitution
)
}
}
}
return dist[a.length, b.length]
}
func levenshteinDistanceFromStringObjC(comparingString: String) -> Int {
let aStr = self
let bStr = comparingString
// It is really strange, but I have to call into Objective-C because the Swift version is dramatically slower
return aStr.compareWithWord(bStr, matchGain: 0, missingCost: 1)
}
}
malloc?? NSString?? And in the end it is still 4 times slower? Does anybody need Swift anymore?
There are multiple reasons why the Swift code is slower than the Objective-C code.
I made a very simple test case by comparing two fixed strings 100 times.
Objective-C code: 0.026 seconds
Swift code: 3.14 seconds
The first reason is that a Swift Character represents an "extended grapheme cluster",
which can contain several Unicode code points (e.g. "flags"). This makes the
decomposition of a string into characters slow. On the other hand, Objective-C
NSString stores the strings as a sequence of UTF-16 code units.
If you replace
let a = Array(aStr)
let b = Array(bStr)
by
let a = Array(aStr.utf16)
let b = Array(bStr.utf16)
so that the Swift code works on UTF-16 sequences as well then the time goes down
to 1.88 seconds.
The allocation of the 2-dimensional array is also slow. It is faster to allocate
a single one-dimensional array. I found a simple Array2D class here:
http://blog.trolieb.com/trouble-multidimensional-arrays-swift/
class Array2D {
var cols:Int, rows:Int
var matrix: [Int]
init(cols:Int, rows:Int) {
self.cols = cols
self.rows = rows
matrix = Array(count:cols*rows, repeatedValue:0)
}
subscript(col:Int, row:Int) -> Int {
get {
return matrix[cols * row + col]
}
set {
matrix[cols*row+col] = newValue
}
}
func colCount() -> Int {
return self.cols
}
func rowCount() -> Int {
return self.rows
}
}
Using that class in your code
func levenshtein(aStr: String, bStr: String) -> Int {
let a = Array(aStr.utf16)
let b = Array(bStr.utf16)
var dist = Array2D(cols: a.count + 1, rows: b.count + 1)
for i in 1...a.count {
dist[i, 0] = i
}
for j in 1...b.count {
dist[0, j] = j
}
for i in 1...a.count {
for j in 1...b.count {
if a[i-1] == b[j-1] {
dist[i, j] = dist[i-1, j-1] // noop
} else {
dist[i, j] = min(
dist[i-1, j] + 1, // deletion
dist[i, j-1] + 1, // insertion
dist[i-1, j-1] + 1 // substitution
)
}
}
}
return dist[a.count, b.count]
}
the time in the test case goes down to 0.84 seconds.
The last bottleneck that I found in the Swift code is the custom min() function.
The Swift library has a built-in min() function which is faster. So just removing
the custom function from the Swift code reduces the time for the test case to
0.04 seconds, which is almost as good as the Objective-C version.
Addendum: Using Unicode scalars seems to be even slightly faster:
let a = Array(aStr.unicodeScalars)
let b = Array(bStr.unicodeScalars)
and has the advantage that it works correctly with surrogate pairs such
as Emojis.