How to "map" a function returning a Future - reactivex

Let's say you have a coroutine returning a task (using WinRT to illustrate):
winrt::Windows::Foundation::IAsyncOperation<int> my_coroutine(int i)
{
co_await winrt::resume_background();
Sleep(std::max(4 - i, 0));
co_return i;
}
and you want to use it in RxCpp like this:
range(0, 4)
| map(&my_coroutine)
| observe_on(observe_on_event_loop())
| subscribe<string>(println(cout))
;
// Output: 3 2 1 0
So the obvious problem is that we don't want map to pass an IAsyncOperation<int> to observe_on::on_next; rather, we want the coroutine to call observe_on::on_next with an int once it returns its value.
Perhaps we could convert IAsyncOperation<int> to rxcpp::observable<int> and use flat_map, although it seems a bit wasteful to use a range-like monad when we only have a future monad.
Is there an easy way to do this with RxCpp? If it's rather complicated, I'm happy to know that so I can move on.
P.S. I'm actually co_awaiting winrt::resume_on_signal, so maybe RxCpp provides some similar functionality with event HANDLEs.
Edit: I suppose the obvious thing to do is just poll the futures on the event loop, pushing them (i.e. calling on_next) only when they have completed.
I have decided to give up on RxCpp, and am instead pursuing writing my pipeline with templates to take advantage of type deduction from lambda arguments (static polymorphism for refactoring rather than reuse).

I think the correct thing to do is just to not have the future at all
int my_function(int i)
{
Sleep(std::max(4 - i, 0));
return i;
}
and run this function on a thread pool
range(0, 4)
| observe_on(observe_on_thread_pool()) // doesn't actually exist
| map(&my_function)
| observe_on(observe_on_event_loop())
| subscribe<string>(println(cout))
;
// Output: 3 2 1 0
making sure that the observer it calls next on is thread safe.
And in rxcppv3 it shouldn't be too hard to write a custom operator in place of observe_on(observe_on_thread_pool()) | map:
const auto map_on_thread_pool = [](auto func){
return make_lifter([=](auto r){
return make_observer(r, [=](auto& r, auto v){
auto f = [=](){ r.next(func(v)); };
//On the windows thread pool
TrySubmitThreadpoolCallback(..); // whatever hacks are needed to run f
});
});
};
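The same "run the mapped function off-thread and push each result downstream as it completes" idea can be sketched in Rust. This is only an illustration, not RxCpp's API: plain OS threads stand in for the thread pool, a std::sync::mpsc channel stands in for the thread-safe downstream observer, and the names my_function and map_on_threads are hypothetical. The sleep is scaled up so that completion order tends to differ from input order.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// Stand-in for the slow function: smaller inputs sleep longer.
fn my_function(i: u64) -> u64 {
    thread::sleep(Duration::from_millis(20 * 4u64.saturating_sub(i)));
    i
}

// Runs `my_function` for every input on its own thread and gathers the
// results in completion order (not input order).
fn map_on_threads(inputs: Vec<u64>) -> Vec<u64> {
    let (tx, rx) = mpsc::channel();
    for i in inputs {
        let tx = tx.clone();
        thread::spawn(move || {
            // The sender plays the role of the downstream `next` call.
            let _ = tx.send(my_function(i));
        });
    }
    // Drop the original sender so the receiver ends once all workers finish.
    drop(tx);
    rx.into_iter().collect()
}

fn main() {
    // Order depends on scheduling; with these sleeps it is usually reversed.
    println!("{:?}", map_on_threads(vec![0, 1, 2, 3]));
}
```

The channel is what makes the hand-off safe: each worker only sends a value, and a single consumer receives them one at a time.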

Related

Kotlin: Split Sequence<T> by N items into Sequence<Sequence<T>>?

How to "take(N)" iteratively - get a Sequence<Sequence>, each inner sequences having next N elements?
I am writing a high-load application in Kotlin.
I have tens of thousands of entries to insert to a database.
I want to batch them by, say, 1000.
So I created a loop:
val itemsSeq = itemsList.iterator().asSequence()
while (true) {
log.debug("Taking $BATCH_SIZE from $itemsSeq")
val batchSeq = itemsSeq.take(BATCH_SIZE)
val squareBatch = applySomething(batchSeq, something)
?: break
}
fun applySomething(batch: Sequence<Item>, something: Something) {
/* Fully consumes batch. Bulk-loads from DB by IDs, applies, bulk-saves. */
}
I thought that take() would advance the itemsSeq and the next call to take() would give me a sequence "view" of itemsSeq starting at the 10th item.
But with this code, I am getting:
DEBUG Taking 10 from kotlin.sequences.ConstrainedOnceSequence@53fe15ff
Exception in thread "main" java.lang.IllegalStateException: This sequence can be consumed only once.
at kotlin.sequences.ConstrainedOnceSequence.iterator(SequencesJVM.kt:23)
at kotlin.sequences.TakeSequence$iterator$1.<init>(Sequences.kt:411)
at kotlin.sequences.TakeSequence.iterator(Sequences.kt:409)
So it seems that the take() "opens" the itemsSeq again, while that can be consumed only once.
As a workaround, I can use chunked():
public fun <T> Sequence<T>.chunked(size: Int): Sequence<List<T>> {
But I would prefer not to create Lists, rather Sequences.
What I am looking for is something between take() and chunked().
Is there anything like that in the Kotlin SDK?
I can possibly create my own sequence { ... } but for readability, I would prefer something built-in.
There is a way to construct a Sequence by handing it an Iterator; see the Sequence function, whose documentation says:
Given an iterator function constructs a Sequence that returns values
through the Iterator provided by that function. The values are
evaluated lazily, and the sequence is potentially infinite.
Wrapped in an extension function it could look like this:
fun <T> Iterable<T>.toValuesThroughIteratorSequence(): Sequence<T> {
val iterator = this.iterator()
return Sequence { iterator }
}
Quick test:
data class Test(val id: Int)
val itemsList = List(45) { Test(it) }
val batchSize = 10
val repetitions = itemsList.size.div(batchSize) + 1
val itemsSeq = itemsList.toValuesThroughIteratorSequence()
(0 until repetitions).forEach { index ->
val batchSeq = itemsSeq.take(batchSize)
println("Batch no. $index: " + batchSeq.map { it.id.toString().padStart(2, ' ') }.joinToString(" "))
}
Output:
Batch no. 0: 0 1 2 3 4 5 6 7 8 9
Batch no. 1: 10 11 12 13 14 15 16 17 18 19
Batch no. 2: 20 21 22 23 24 25 26 27 28 29
Batch no. 3: 30 31 32 33 34 35 36 37 38 39
Batch no. 4: 40 41 42 43 44
Background
First of all, we need to be aware that there is a big difference between an object that we can iterate over and an object that represents a "live", already running iteration process. The first group includes Iterable (so List, Set and all other collections), Array, Flow, etc. The second group is mostly Iterator or the old Java Enumeration. The difference can also be compared to a file versus a file pointer when reading, or a database table versus a database cursor.
Sequence belongs to the first group. A Sequence object does not represent a live, already started iteration, but just a set of elements. These elements can be produced lazily, the sequence can have unbounded size, and internally it usually works by using iterators, but conceptually a sequence is not an iterator itself.
If we look at the documentation on sequences, it clearly compares them to Iterable, not to Iterator. All standard ways to construct sequences, such as sequenceOf(), sequence {}, and Iterable.asSequence(), produce sequences that return the same list of items every time we iterate over them. Iterator.asSequence() also follows this pattern, but because it can't reproduce the same items twice, it is intentionally protected against being iterated multiple times:
public fun <T> Iterator<T>.asSequence(): Sequence<T> = Sequence { this }.constrainOnce()
Problem
Your initial attempt with take() didn't work because it is a misuse of sequences. We expect that subsequent take() calls on the same sequence object will (usually) produce exactly the same items, not the next items, just as we expect multiple take() calls on a list to always produce the same items, each time starting from the beginning.
More specifically, your error was caused by the constrainOnce() shown above. When we invoke take() multiple times on a sequence, it has to restart from the beginning, but it can't do this if it was created from an iterator, so Iterator.asSequence() explicitly disallows it.
Simple solution
To fix the problem, you can just skip the constrainOnce() part, as suggested by @lukas.j. This solution is nice because the stdlib already provides tools like Sequence.take(), so if used carefully, it is the easiest to implement and it just works.
However, I personally consider this a kind of workaround, because the resulting sequence doesn't behave as sequences do. It is more like an iterator on steroids than a real sequence. You need to be careful when using this sequence with existing operators or 3rd party code, because such a sequence may work differently than they expect and, as a result, you may get incorrect results.
Advanced solution
We can follow your initial attempt of using subsequent take() calls. In this case our object is used for live iteration, so it is no longer a proper sequence but rather an iterator. The only thing missing from the stdlib is a way to create a sub-iterator for a single chunk. We can implement it ourselves:
fun main() {
val list = (0 .. 25).toList()
val iter = list.iterator()
while (iter.hasNext()) {
val chunk = iter.limited(10)
println(chunk.asSequence().toList())
}
}
fun <T> Iterator<T>.limited(n: Int): Iterator<T> = object : Iterator<T> {
var left = n
val iterator = this@limited
override fun next(): T {
if (left == 0)
throw NoSuchElementException()
left--
return iterator.next()
}
override fun hasNext(): Boolean {
return left > 0 && iterator.hasNext()
}
}
I named it limited(), because take() suggests we read items from the iterator. Instead, we only create another iterator on top of the provided iterator.
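For comparison, the same "bounded view on top of a live iterator" idea can be sketched in Rust, where by_ref().take(n) plays the role of limited(n) and peek() plays the role of hasNext(). This is only an analogy under stated assumptions: the function name chunk_sums is hypothetical, and summing each chunk merely stands in for whatever lazy processing a batch would receive.

```rust
// Walks one shared iterator in chunks of `size` without materializing the
// chunks; each chunk is a bounded view over the same underlying iterator.
fn chunk_sums(limit: u32, size: usize) -> Vec<u32> {
    assert!(size > 0, "a zero-sized chunk would never advance the iterator");
    let mut iter = (0..=limit).peekable();
    let mut sums: Vec<u32> = Vec::new();
    // `peek` is the counterpart of `hasNext()` in the Kotlin version.
    while iter.peek().is_some() {
        // `by_ref().take(size)` borrows the iterator instead of consuming it,
        // so the next chunk continues where this one stopped.
        let chunk = iter.by_ref().take(size);
        let total: u32 = chunk.sum();
        sums.push(total);
    }
    sums
}

fn main() {
    // Chunks are 0..=9, 10..=19 and 20..=25.
    println!("{:?}", chunk_sums(25, 10)); // prints [45, 145, 135]
}
```

As in the Kotlin version, a chunk that is not fully consumed would leave its remaining items to leak into the next chunk, so each view should be drained before the next one is created.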
Of course, sequences are easier to use than iterators, and the typical solution to this problem is chunked(). With the limited() function above, it is pretty straightforward to implement chunkedAsSequences():
fun main() {
val list = (0 .. 25).toList()
list.asSequence()
.chunkedAsSequences(10)
.forEach { println(it.toList()) }
}
fun <T> Sequence<T>.chunkedAsSequences(size: Int): Sequence<Sequence<T>> = sequence {
val iter = iterator()
while (iter.hasNext()) {
val chunk = iter.limited(size)
yield(chunk.asSequence())
chunk.forEach {} // required if chunk was not fully consumed
}
}
Please also note there is a tricky case of chunk being not fully consumed. chunkedAsSequences() is protected against this scenario. Previous simpler solutions aren't.

Binding of private attributes: nqp::bindattr vs :=

I'm trying to find how the binding operation works on attributes and what makes it so different from nqp::bindattr. Consider the following example:
class Foo {
has @!foo;
submethod TWEAK {
my $fval = [<a b c>];
use nqp;
nqp::bindattr( nqp::decont(self), $?CLASS, '@!foo',
#@!foo :=
Proxy.new(
FETCH => -> $ { $fval },
STORE => -> $, $v { $fval = $v }
)
);
}
method check {
say @!foo.perl;
}
}
my $inst = Foo.new;
$inst.check;
It prints:
$["a", "b", "c"]
Replacing nqp::bindattr with the binding operator from the comment gives correct output:
["a", "b", "c"]
Similarly, if foo is a public attribute and its accessor is used, the output is also correct, because decontainerization takes place within the accessor.
I use similar code in my AttrX::Mooish module, where using := would overcomplicate the implementation. So far, nqp::bindattr has done a good job for me, until the above problem arose.
I tried tracing through Rakudo's internals looking for the := implementation, but without any success so far. I would like to ask for advice either on how to simulate the operator or on where in the source to look for its implementation.
Before I dig into the answer: most things in this post are implementation-defined, and the implementation is free to define them differently in the future.
To find out what something (naively) compiles into under Rakudo Perl 6, use the --target=ast option (perl6 --target=ast foo.p6). For example, the bind in:
class C {
has $!a;
submethod BUILD() {
my $x = [1,2,3];
$!a := $x
}
}
Comes out as:
- QAST::Op(bind) :statement_id<7>
- QAST::Var(attribute $!a) <wanted> $!a
- QAST::Var(lexical self)
- QAST::WVal(C)
- QAST::Var(lexical $x) $x
While switching it for @!a like here:
class C {
has @!a;
submethod BUILD() {
my $x = [1,2,3];
@!a := $x
}
}
Comes out as:
- QAST::Op(bind) :statement_id<7>
- QAST::Var(attribute @!a) <wanted> @!a
- QAST::Var(lexical self)
- QAST::WVal(C)
- QAST::Op(p6bindassert)
- QAST::Op(decont)
- QAST::Var(lexical $x) $x
- QAST::WVal(Positional)
The decont instruction is the big difference here: it takes the contents of the Proxy by calling its FETCH, which is why the containerization is gone. You can therefore replicate the behavior by wrapping the Proxy in nqp::decont, although that rather raises the question of what the Proxy is doing there if the correct answer is obtained without it!
Both := and = are compiled using case analysis (namely, by looking at what is on the left hand side). := only works for a limited range of simple expressions on the left; it is a decidedly low-level operator. By contrast, = falls back to a sub call if the case analysis doesn't come up with a more efficient form to emit, though in most cases it manages something better.
The case analysis for := inserts a decont when the target is a lexical or attribute with sigil @ or %, since - at a Perl 6 level - having an item bound to an @ or % makes no sense. Using nqp::bindattr is going a level below Perl 6 semantics, and so it's possible to end up with the Proxy bound directly there using that. However, it also violates expectations elsewhere. Don't expect that to go well (but it seems you don't want to do that anyway.)

What's the most efficient way to reuse an iterator in Rust?

I'd like to reuse an iterator I made, so as to avoid paying to recreate it from scratch. But iterators don't seem to be cloneable and collect moves the iterator so I can't reuse it.
Here's more or less the equivalent of what I'm trying to do.
let my_iter = my_string.unwrap_or("A").chars().flat_map(|c|c.to_uppercase()).map(|c| Tag::from(c).unwrap() );
let my_struct = {
one: my_iter.collect(),
two: my_iter.map(|c|{(c,Vec::new())}).collect(),
three: my_iter.filter_map(|c|if c.predicate(){Some(c)}else{None}).collect(),
four: my_iter.map(|c|{(c,1.0/my_float)}).collect(),
five: my_iter.map(|c|(c,arg_time.unwrap_or(time::now()))).collect(),
//etc...
}
You should profile before you optimize something, otherwise you might end up making things both slower and more complex than they need to be.
The iterators in your example
let my_iter = my_string.unwrap_or("A").chars().flat_map(|c|c.to_uppercase()).map(|c| Tag::from(c).unwrap() );
are thin structures allocated on the stack. Cloning them isn't going to be much cheaper than building them from scratch.
Constructing an iterator with .chars().flat_map(|c| c.to_uppercase()) takes only a single nanosecond when I benchmark it.
According to the same benchmark, wrapping iterator creation in a closure takes more time than simply building the iterator in-place.
Cloning a Vec iterator is not much faster than building it in-place, both are practically instant.
test construction_only ... bench: 1 ns/iter (+/- 0)
test inplace_construction ... bench: 249 ns/iter (+/- 20)
test closure ... bench: 282 ns/iter (+/- 18)
test vec_inplace_iter ... bench: 0 ns/iter (+/- 0)
test vec_clone_iter ... bench: 0 ns/iter (+/- 0)
Iterators in general are Clone-able if all their "pieces" are Clone-able. You have a couple of them in my_iter that are not: the anonymous closures (like the one in flat_map) and the ToUppercase struct returned by to_uppercase.
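A minimal sketch of that rule, assuming a reasonably recent stable toolchain in which a closure is Clone when everything it captures is Clone and ToUppercase is Clone as well. Under those assumptions the char pipeline can be cloned directly. The helper name upper_chars is hypothetical, and the question's Tag type is left out because its definition is not shown.

```rust
// Builds the uppercase-char pipeline once. The `+ Clone` bound only compiles
// if every piece of the chain (Chars, the closure, ToUppercase) is Clone.
fn upper_chars(s: &str) -> impl Iterator<Item = char> + Clone + '_ {
    s.chars().flat_map(|c| c.to_uppercase())
}

fn main() {
    let base = upper_chars("ab1c");
    // Each clone is an independent iterator starting from the beginning.
    let one: String = base.clone().collect();
    let two: Vec<(char, u32)> = base.clone().map(|c| (c, 0)).collect();
    let three: Vec<char> = base.filter(|c| c.is_alphabetic()).collect();
    println!("{} {:?} {:?}", one, two, three);
}
```

Cloning copies only the small iterator state, not the underlying string, which is consistent with the benchmark observation that cloning and rebuilding cost about the same.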
What you can do is:
rebuild the whole thing (as @ArtemGr suggests). You could use a macro to avoid repetition. A bit ugly but should work.
collect my_iter into a Vec before populating my_struct (since you seem to collect it anyway in there): let my_iter: Vec<_> = my_string.unwrap_or("A").chars().flat_map(|c|c.to_uppercase()).map(|c| Tag::from(c).unwrap() ).collect();
create your own custom iterator. Without your definitions of my_string (since you call unwrap_or on it I assume it's not a String) and Tag it's hard to help you more concretely with this.
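The "collect once, then reuse" option from the list above might look roughly like this. It is a sketch under assumptions: plain char stands in for the question's Tag type, and the Fields struct and build function are hypothetical names.

```rust
#[derive(Debug, PartialEq)]
struct Fields {
    one: Vec<char>,
    two: Vec<(char, Vec<u8>)>,
    three: Vec<char>,
}

fn build(s: &str) -> Fields {
    // Run the (potentially expensive) pipeline exactly once.
    let tags: Vec<char> = s.chars().flat_map(|c| c.to_uppercase()).collect();
    Fields {
        // Borrowing iterators over the Vec can be created as often as needed.
        two: tags.iter().map(|&c| (c, Vec::new())).collect(),
        three: tags.iter().copied().filter(|c| c.is_alphabetic()).collect(),
        // Move the Vec itself into the struct last, after the borrows end.
        one: tags,
    }
}

fn main() {
    println!("{:?}", build("a1b"));
}
```

The trade-off is one allocation for the intermediate Vec in exchange for running the mapping closure only once per element.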
You may use closure to get identical iterators:
#[derive(Debug)]
struct MyStruct{
one:Vec<char>,
two:Vec<char>,
three:String
}
fn main() {
let my_string:String = "ABCD1234absd".into();
let my_iter = || my_string.chars();
let my_struct = MyStruct{
one: my_iter().collect(),
two: my_iter().filter(|x| x.is_numeric()).collect(),
three: my_iter().filter(|x| x.is_lowercase()).collect()
};
println!("{:?}", my_struct);
}
See also the question "Correct way to return an Iterator?".
You can also clone the iterator (see @Paolo Falabella's answer about iterator cloneability):
fn main() {
let v = vec![1,2,3,4,5,6,7,8,9];
let mut i = v.iter().skip(2);
let mut j = i.clone();
println!("{:?}", i.take(3).collect::<Vec<_>>());
println!("{:?}", j.filter(|&x| x%2==0).collect::<Vec<_>>());
}
Unfortunately, I can't tell which way is more efficient.

Conflicting lifetime requirement for iterator returned from function

This may be a duplicate. I don't know. I couldn't understand the other answers well enough to know that. :)
Rust version: rustc 1.0.0-nightly (b47aebe3f 2015-02-26) (built 2015-02-27)
Basically, I'm passing a bool to this function that's supposed to build an iterator that filters one way for true and another way for false. Then it kind of craps itself because it doesn't know how to keep that boolean value handy, I guess. I don't know. There are actually multiple lifetime problems here, which is discouraging because this is a really common pattern for me, since I come from a .NET background.
fn main() {
for n in values(true) {
println!("{}", n);
}
}
fn values(even: bool) -> Box<Iterator<Item=usize>> {
Box::new([3usize, 4, 2, 1].iter()
.map(|n| n * 2)
.filter(|n| if even {
n % 2 == 0
} else {
true
}))
}
Is there a way to make this work?
You have two conflicting issues, so let's break down a few representative pieces:
[3usize, 4, 2, 1].iter()
.map(|n| n * 2)
.filter(|n| n % 2 == 0)
Here, we create an array in the stack frame of the method, then get an iterator to it. Since we aren't allowed to consume the array, the iterator item is &usize. We then map from the &usize to a usize. Then we filter against a &usize - we aren't allowed to consume the filtered item, otherwise the iterator wouldn't have it to return!
The problem here is that we are ultimately rooted to the stack frame of the function. We can't return this iterator, because the array won't exist after the call returns!
To work around this for now, let's just make it static. Now we can focus on the issue with even.
filter takes a closure. Closures capture any variable used that isn't provided as an argument to the closure. By default, these variables are captured by reference. However, even is again a variable located on the stack frame. This time however, we can give it to the closure by using the move keyword. Here's everything put together:
fn main() {
for n in values(true) {
println!("{}", n);
}
}
static ITEMS: [usize; 4] = [3, 4, 2, 1];
fn values(even: bool) -> Box<Iterator<Item=usize>> {
Box::new(ITEMS.iter()
.map(|n| n * 2)
.filter(move |n| if even {
n % 2 == 0
} else {
true
}))
}
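For readers on a current toolchain, here is a sketch of the same two fixes in present-day syntax, where a boxed trait object is written Box<dyn Iterator<...>>. The functions keep and keep_owned are hypothetical variants of values, written so that the flag visibly changes the result; the second one owns its data instead of relying on a static.

```rust
static ITEMS: [usize; 4] = [3, 4, 2, 1];

// `move` copies the bool into the closure, so nothing borrows the stack
// frame of `keep` once it returns.
fn keep(even_only: bool) -> Box<dyn Iterator<Item = usize>> {
    Box::new(
        ITEMS
            .iter()
            .copied()
            .filter(move |n| !even_only || *n % 2 == 0),
    )
}

// Alternative to the static: the iterator takes ownership of the Vec, so the
// data lives exactly as long as the iterator does.
fn keep_owned(items: Vec<usize>, even_only: bool) -> impl Iterator<Item = usize> {
    items
        .into_iter()
        .filter(move |n| !even_only || *n % 2 == 0)
}

fn main() {
    println!("{:?}", keep(true).collect::<Vec<_>>()); // prints [4, 2]
    println!("{:?}", keep_owned(vec![3, 4, 2, 1], false).collect::<Vec<_>>()); // prints [3, 4, 2, 1]
}
```

Returning impl Iterator avoids the heap allocation of the Box when only one concrete iterator type is ever returned.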

Counter as variable in for-in-loops

When normally using a for-in-loop, the counter (in this case number) is a constant in each iteration:
for number in 1...10 {
// do something
}
This means I cannot change number in the loop:
for number in 1...10 {
if number == 5 {
++number
}
}
// doesn't compile, since the prefix operator '++' can't be performed on the constant 'number'
Is there a way to declare number as a variable, without declaring it before the loop, or using a normal for-loop (with initialization, condition and increment)?
Understanding why i can’t be mutable involves knowing what for…in is shorthand for. for i in 0..<10 is expanded by the compiler to the following:
var g = (0..<10).generate()
while let i = g.next() {
// use i
}
Every time around the loop, i is a freshly declared variable, the value of unwrapping the next result from calling next on the generator.
Now, that while can be written like this:
while var i = g.next() {
// here you _can_ increment i:
if i == 5 { ++i }
}
but of course, it wouldn’t help – g.next() is still going to generate a 5 next time around the loop. The increment in the body was pointless.
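The same mechanics can be observed in Rust, used here purely as an analogy: a for loop is also sugar for repeatedly calling next on an iterator, and a mutable loop binding changes only the local copy. The function name bump_five is hypothetical.

```rust
// Hand-written expansion of `for i in 0..limit`, with a mutable binding.
fn bump_five(limit: u32) -> Vec<u32> {
    let mut seen = Vec::new();
    let mut iter = (0..limit).into_iter();
    while let Some(mut i) = iter.next() {
        if i == 5 {
            // Changes only this iteration's copy; `iter` is unaffected.
            i += 1;
        }
        seen.push(i);
    }
    seen
}

fn main() {
    // The 5 shows up as 6, and the iterator then produces its own 6 anyway.
    println!("{:?}", bump_five(8)); // prints [0, 1, 2, 3, 4, 6, 6, 7]
}
```

The duplicated 6 in the output is exactly the "pointless increment" described above: the iterator's state never sees the change.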
Presumably for this reason, for…in doesn’t support the same var syntax for declaring its loop counter – it would be very confusing if you didn’t realize how it worked.
(This is unlike while, where you can see what is going on; there, the var form is occasionally useful, much as func f(var i) can be.)
If what you want is to skip certain iterations of the loop, your better bet (without resorting to C-style for or while) is to use a generator that skips the relevant values:
// iterate over every other integer
for i in 0.stride(to: 10, by: 2) { print(i) }
// skip a specific number
for i in (0..<10).filter({ $0 != 5 }) { print(i) }
let a = ["one","two","three","four"]
// ok so this one’s a bit convoluted...
let everyOther = a.enumerate().filter { $0.0 % 2 == 0 }.map { $0.1 }.lazy
for s in everyOther {
print(s)
}
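The three skipping strategies above translate fairly directly to iterator adaptors in other languages. As a sketch in Rust, with hypothetical function names, step_by plays the role of stride, and filter and enumerate behave like their Swift counterparts.

```rust
// Every other integer below `limit`.
fn every_other(limit: u32) -> Vec<u32> {
    (0..limit).step_by(2).collect()
}

// All integers below `limit` except `skip`.
fn skipping(limit: u32, skip: u32) -> Vec<u32> {
    (0..limit).filter(|&i| i != skip).collect()
}

// Every other element of a slice, selected by index.
fn every_other_item<'a>(items: &[&'a str]) -> Vec<&'a str> {
    items
        .iter()
        .enumerate()
        .filter(|(idx, _)| idx % 2 == 0)
        .map(|(_, s)| *s)
        .collect()
}

fn main() {
    println!("{:?}", every_other(10)); // prints [0, 2, 4, 6, 8]
    println!("{:?}", skipping(10, 5)); // prints [0, 1, 2, 3, 4, 6, 7, 8, 9]
    println!("{:?}", every_other_item(&["one", "two", "three", "four"])); // prints ["one", "three"]
}
```

In each case the skipping logic lives in the sequence being iterated, so the loop body never needs to modify its counter.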
The answer is "no", and that's a good thing. Otherwise, a grossly confusing behavior like this would be possible:
for number in 1...10 {
if number == 5 {
// This does not work
number = 5000
}
println(number)
}
Imagine the confusion of someone looking at the number 5000 in the output of a loop that is supposedly bound to a range of 1 through 10, inclusive.
Moreover, what would Swift pick as the next value after 5000? Should it stop? Should it continue to the next number in the range before the assignment? Should it throw an exception on out-of-range assignment? All three choices have some validity to them, so there is no clear winner.
To avoid situations like that, Swift designers made loop variables in range loops immutable.
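Update for Swift 5: the loop variable can be declared with var. Changing it inside the body still has no effect on the next value the loop produces: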
for var i in 0...10 {
print(i)
i+=1
}