Consider this code where a tap takes a while to complete. All the blocks run simultaneously (printing immediately) and then sleep. Most don't finish because the program ends sooner than they do:
my $supply = Supply.interval(0.2);
my $tap = $supply.tap: { say "1. $^a"; sleep 5; };
sleep 5;
The output (elided) has 25 lines (one for each tick of 0.2 in 5 seconds):
1. 0
1. 1
...
1. 24
Then I change that supply to .share:
my $supply = Supply.interval(0.2).share;
my $tap = $supply.tap: { say "1. $^a"; sleep 5 };
sleep 5;
I only see one line of output, but I expected the same output as before:
1. 1
The .share makes it possible for multiple taps to get the same values.
my $supply = Supply.interval(0.2).share;
my $tap = $supply.tap: { say "1. $^a"; sleep 5 };
my $tap2 = $supply.tap: { say "2. $^a"; };
sleep 5;
The output still comes only from the first tap, and it is still a single line. I expected 25 lines for each tap:
1. 1
The basic rules for Supply are:
1. No introduction of concurrency without it being explicitly asked for
2. Back-pressure through a sender-pays model
3. A message is processed in full before the next one (so .map({ ...something with state... }) can be trusted not to cause conflicts over the state)
Rule 3 doesn't really apply to share since there's separate downstream operation chains after that point, but rules 1 and 2 do. The purpose of share is to allow publish/subscribe, and also to provide for re-use of a chunk of processing by multiple downstream message processors. Introducing parallel message processing is a separate concern from this.
There are various options. One is to have the messages for parallel processing stuck into a Channel. This explicitly introduces a place for the messages to be buffered (well, until you run out of memory...which is exactly why Supply comes with a sender-pays back-pressure model). Coercing a Channel back into a Supply gets the values pulled from the Channel and emitted on that Supply on a pool thread. That way looks like:
my $supply = Supply.interval(0.2).share;
my $tap = $supply.Channel.Supply.tap: { say "1. $^a"; sleep 5 };
my $tap2 = $supply.tap: { say "2. $^a"; };
sleep 5;
Note that since whenever automatically coerces whatever it is asked to react to into a Supply, this would look like whenever $supply.Channel { }, which makes it a pretty short solution, but at the same time nicely explicit in that it indicates how the normal back-pressure mechanism is being side-stepped. The other property of this solution is that it retains the order of the messages and still gives one-at-a-time processing downstream of the Channel.
The alternative is to react to each message by instead starting some asynchronous piece of work to handle it. The start operation on a Supply schedules the block it is passed to run on the thread pool for each message that is received, thus not blocking the arrival of the next message. The result is a Supply of Supply. This forces one to tap each inner Supply to actually make anything happen, which seems slightly counter-intuitive at first, but actually is for the good of the programmer: it makes it clear there's an extra bit of async work to keep track of. I very strongly suggest using this in combination with the react/whenever syntax, which does subscription management and error propagation automatically. The most direct transformation of the code in the question is:
my $supply = Supply.interval(0.2).share;
my $tap = supply { whenever $supply.start({ say "1. $^a"; sleep 5 }) { whenever $_ {} } }.tap;
my $tap2 = $supply.tap: { say "2. $^a"; };
sleep 5;
Although it's also possible to instead write it as:
my $supply = Supply.interval(0.2).share;
my $tap = supply { whenever $supply -> $a { whenever start { say "1. $a"; sleep 5 } {} } }.tap;
my $tap2 = $supply.tap: { say "2. $^a"; };
sleep 5;
Which points to the possibility of writing a parallelize Supply combinator:
my $supply = Supply.interval(0.2).share;
my $tap = parallelize($supply, { say "1. $^a"; sleep 5 }).tap;
my $tap2 = $supply.tap: { say "2. $^a"; };
sleep 5;
sub parallelize(Supply $messages, &operation) {
    supply {
        whenever $messages -> $value {
            whenever start operation($value) {
                emit $_;
            }
        }
    }
}
The output of this approach is rather different from the Channel one, since the operations are all kicked off as soon as the message comes in. Also, it doesn't retain message order. There's still an implicit queue (unlike the explicit one with the Channel approach); it's just that now it's the thread pool scheduler's work queue and the OS scheduler that have to keep track of the in-progress work. And again, there's no back-pressure, but notice that it would be entirely possible to implement that by keeping track of outstanding Promises and blocking further incoming messages with an await Promise.anyof(@outstanding).
Finally, I'll note that there is some consideration of hyper whenever and race whenever constructs to provide some language-level mechanism for dealing with parallel processing of Supply messages. However the semantics of such, and how they play into the supply-block design goals and safety properties, represent significant design challenges.
The taps of a Supply are run sequentially within a single thread. So the code of the second tap will only be run after the first tap (which sleeps for 5 seconds). This shows in the following code:
my $supply = Supply.interval(0.2).share;
my $tap = $supply.tap: { say "1. $^a in #{+$*THREAD}" };
my $tap2 = $supply.tap: { say "2. $^a in #{+$*THREAD}" };
sleep 0.5;
===================
1. 1 in #4
2. 1 in #4
1. 2 in #4
2. 2 in #4
So the answer is currently: no
I am trying to use Kotlin coroutines to perform multiple HTTP calls concurrently, rather than one at a time, but I would like to avoid making all of the calls concurrently, to avoid rate limiting by the external API.
If I simply launch a coroutine for each request, they are all sent nearly instantly. So I looked into the limitedParallelism function, which sounds very close to what I need and which some Stack Overflow answers suggest is the recommended solution. Older answers to the same question suggested using newFixedThreadPoolContext.
The documentation for that function mentioned limitedParallelism as a preferred alternative "if you do not need a separate thread pool":
If you do not need a separate thread-pool, but only have to limit effective parallelism of the dispatcher, it is recommended to use CoroutineDispatcher.limitedParallelism instead.
However, when I write my code to use limitedParallelism, it does not reduce the number of concurrent calls, compared to newFixedThreadPoolContext which does.
In the example below, I replace my network calls with Thread.sleep, which does not change the behavior.
// method 1
val fixedThreadPoolContext = newFixedThreadPoolContext(2, "Pool")
// method 2
val limitedParallelismContext = Dispatchers.IO.limitedParallelism(2)
runBlocking {
    val jobs = (1..1000).map {
        // swap out the dispatcher here
        launch(limitedParallelismContext) {
            println("started $it")
            Thread.sleep(1000)
            println(" finished $it")
        }
    }
    jobs.joinAll()
}
The behavior for fixedThreadPoolContext is as expected: no more than 2 of the coroutines run at a time, and the total time to finish is several minutes (1000 calls of one second each, two at a time, is roughly 500 seconds).
However, for limitedParallelismContext, all "started #" lines print immediately, and one second later, all "finished #" lines print and the program completes in just over 1 total second.
Why does limitedParallelism not have the same effect as using a separate thread pool? What does it accomplish?
I modified your code slightly so that every coroutine takes 200ms to complete and it prints the time when it is completed. Then I pasted it to play.kotlinlang.org to check:
/**
* You can edit, run, and share this code.
* play.kotlinlang.org
*/
import kotlinx.coroutines.*
fun main() {
    // method 1
    val fixedThreadPoolContext = newFixedThreadPoolContext(2, "Pool")
    // method 2
    val limitedParallelismContext = Dispatchers.IO.limitedParallelism(2)
    runBlocking {
        val jobs = (1..10).map {
            // swap out the dispatcher here
            launch(limitedParallelismContext) {
                println("it at ${System.currentTimeMillis()}")
                Thread.sleep(200)
            }
        }
        jobs.joinAll()
    }
}
There, using Kotlin 1.6.21, the result is as expected:
it at 1652887163155
it at 1652887163157
it at 1652887163358
it at 1652887163358
it at 1652887163559
it at 1652887163559
it at 1652887163759
it at 1652887163759
it at 1652887163959
it at 1652887163959
Only 2 coroutines are executed at a time.
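The timestamps above suggest the limit, but it can also be measured directly. The following is a minimal sketch, assuming kotlinx.coroutines is on the classpath; peakConcurrency is a hypothetical helper (not part of any library) that counts how many blocking jobs overlap on a given dispatcher.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical helper: launches `tasks` blocking jobs on `dispatcher` and
// returns the highest number of jobs that were running at the same moment.
fun peakConcurrency(dispatcher: CoroutineDispatcher, tasks: Int = 10): Int {
    val running = AtomicInteger(0)
    val peak = AtomicInteger(0)
    runBlocking {
        (1..tasks).map {
            launch(dispatcher) {
                val now = running.incrementAndGet()
                peak.updateAndGet { current -> maxOf(current, now) }
                Thread.sleep(50) // stand-in for a blocking network call
                running.decrementAndGet()
            }
        }.joinAll()
    }
    return peak.get()
}

// The opt-in is only needed on library versions where limitedParallelism
// is still marked experimental.
@OptIn(ExperimentalCoroutinesApi::class)
fun main() {
    val limited = Dispatchers.IO.limitedParallelism(2)
    println("peak concurrency = ${peakConcurrency(limited)}")
}
```

On a library version where limitedParallelism behaves as described in this answer, the reported peak should not exceed 2. Note that the limit applies to coroutines that are actually executing: a coroutine suspended in a non-blocking call such as delay does not occupy one of the two slots, so this dispatcher alone does not cap the number of in-flight suspending requests.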
Let's say I'm making a simple D&D dice roller (because I am). I made it so it rolls a bunch of random numbers based on how many dice the user wants rolled and the type of dice. It then sends them to a text view one at a time (which is what I want). However, it only shows one number, the last one, because there is no delay to let the user see each number rolled.
How would I do that?
else if (numTimesRolled.progress <= 4) {
for (i in 0 until numTimesRolled.progress){
randNum = Random.nextInt(1, diceIsComfirm)
resultsArray[i] = randNum.toString()
}
for (i in 0 until numTimesRolled.progress){
randNumDisplay.text = resultsArray[i]
}
A non-coroutine solution is to post Runnables:
val delayPerNumber = 500L // 500ms
for (i in 0 until numTimesRolled.progress){
randNumDisplay.postDelayed({ randNumDisplay.text = resultsArray[i] }, i * delayPerNumber)
}
With a coroutine:
lifecycleScope.launch {
for (i in 0 until numTimesRolled.progress){
delay(500) // 500ms
randNumDisplay.text = resultsArray[i]
}
}
An advantage with the coroutine is it will automatically stop if the Activity or Fragment is destroyed, so if the Activity/Fragment is closed while the coroutine's still running, it won't hold your obsolete views in memory.
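The same loop can be tried outside Android. This is a rough sketch, assuming kotlinx.coroutines is available; showOneByOne and its show callback are hypothetical names, with the callback standing in for the assignment to randNumDisplay.text. Unlike the snippet above, it shows each value first and then waits, so the first number appears immediately.

```kotlin
import kotlinx.coroutines.delay
import kotlinx.coroutines.runBlocking

// Hypothetical helper: `show` stands in for `randNumDisplay.text = ...`.
suspend fun showOneByOne(
    results: List<String>,
    delayMs: Long,
    show: (String) -> Unit
) {
    for (result in results) {
        show(result)
        delay(delayMs) // suspends the coroutine without blocking the thread
    }
}

fun main() = runBlocking {
    // Prints each roll half a second apart.
    showOneByOne(listOf("4", "17", "9"), delayMs = 500) { println(it) }
}
```

In an Activity or Fragment the call would go inside lifecycleScope.launch instead of runBlocking, so that the loop is cancelled together with the screen.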
I have a small program which runs until a SIGINT is received or two lines (press enter twice) from stdin are received. The react block logic is:
react {
whenever signal(SIGINT) {
say "Got signal";
exit;
}
whenever $*IN.lines.Supply {
say "Got line";
exit if $++ == 1 ;
}
}
Program will exit on two entered lines as expected.
However CTRL-C will not do anything, unless it is followed by a line (enter).
If I switch the order of the whenever blocks, the program is interrupted by a SIGINT but doesn't execute the signal whenever block:
react {
whenever $*IN.lines.Supply {
say "Got line";
exit if $++ == 1 ;
}
whenever signal(SIGINT) {
say "Got signal";
exit;
}
}
Is there some other setup required before using the signal sub? Is the order of whenever blocks important in a react block?
Update
So it seems the lines() call is blocking the react block from executing (thanks @Håkon). I kind of get it.
When comparing to a similar code structure for reading a socket I'm confused though. The presence of data (or lack of) has no effect on the signal handler executing and it can read lines just fine in this example:
my $listener=IO::Socket::Async.listen("0.0.0.0",4432);
react {
whenever $listener {
whenever $_.Supply.lines() {
say "Got line";
}
}
whenever signal(SIGINT) {
say "Got signal";
exit;
}
}
#testing with:
# curl http://localhost:4432
Why does this behave so differently from my original code?
The order doesn't matter provided the data sources really behave in an asynchronous manner, which unfortunately is not the case here. The Supply coercer on a Seq does not introduce any concurrency, and does immediately try to produce a value to emit on the Supply, which in turn blocks on reading from $*IN. Thus, the second subscription doesn't have a chance to be set up; the same underlying issue causes the other problems observed.
The solution is to force the reading to happen "elsewhere". We can do that with Supply.from-list(...), plus telling it we really do want to use the current scheduler rather than its default CurrentThreadScheduler. Thus, this behaves as wanted:
react {
whenever Supply.from-list($*IN.lines, scheduler => $*SCHEDULER) {
say "Got line";
exit if $++ == 1 ;
}
whenever signal(SIGINT) {
say "Got signal";
exit;
}
}
It's likely this area will be revised somewhat in future Perl 6 versions. The current behavior was well-intended; the design principle was to avoid implicit introduction of concurrency, following the general principle that supplies are a tool for managing concurrency that inherently exists, rather than for introducing it. However, in reality, the lack of concurrency here has probably tripped up more folks than it has helped. (Further, we may look into offering real non-blocking file I/O, rather than building it from sync file I/O + threads.)
Here is a variant that runs the signal handler (based on this answer), but unfortunately autoflushing of $*IN seems to be turned off:
my $lines = supply {
whenever start $*IN.lines.Supply {
whenever .lines { .emit }
}
}.Channel;
react {
whenever signal(SIGINT) {
say "Got signal";
exit;
}
whenever $lines {
say "Got line: '{$_}'";
exit if $++ == 1;
}
}
Now you have to press CTRL-D to print the lines, and then it prints all the lines entered as a concatenated string, after which $*IN is closed. How can I turn on autoflushing for $*IN in this case?
I want to print the current time every second, and also want to sleep 10 seconds every 5 seconds:
react {
whenever Supply.interval(1) {
say DateTime.now.posix;
}
whenever Supply.interval(5) {
sleep 10;
say 'Sleep Done';
}
whenever signal(SIGINT) {
say "Done.";
done;
}
}
The output is not what I wanted:
1542371045
Sleep Done
1542371055
Sleep Done
1542371065
Sleep Done
1542371075
Done.
...
What I want is this:
1542371045
1542371046
1542371047
1542371048
1542371049
Sleep Done
1542371059
1542371060
1542371061
1542371062
1542371063
Sleep Done
Done.
I don't know much about Promise, Supply, or Raku in general. Is this possible?
Depending on exactly what else was needed, I'd probably write it something like this:
react {
    sub sequence() {
        whenever Supply.interval(1).head(5) {
            say DateTime.now.posix;
            LAST whenever Promise.in(10) {
                say "Sleep done";
                sequence();
            }
        }
    }
    sequence();
}
Which gives output like this:
1542395158
1542395159
1542395160
1542395161
1542395162
Sleep done
1542395172
1542395173
1542395174
1542395175
1542395176
Sleep done
1542395186
1542395187
1542395188
...
This will make absolutely sure you get 5 ticks out between the 10s pauses; doing it with two separate interval supplies - as in many solutions here - will not give any strict guarantees of that, and could miss a tick now and then. (One that doesn't is the cute one with rotor, which is a good bet if you don't need to actually print the "sleep done" thing). It's also free of state (variables) and conditions, which is rather nice.
While this looks like it might be recursive, since whenever is an asynchronous looping construct, it will not actually build up a call stack at all.
It's also fully built of asynchronous constructs, and so in Perl 6.d will not - if the react is triggered on the thread pool - ever block a real OS thread. So you could have thousands of these active. By contrast, sleep will block a real thread, which is what sleep traditionally would be expected to do, but isn't such a good fit if otherwise dealing with asynchronous constructs.
One mistake you are making is that you are assuming that supplies will lose values, or you are assuming they will stop generating values while the react is blocked.
They won't.
They keep generating values.
You should also try to have the code in a whenever run for as short of a time as possible.
(Pretend it is a CPU interrupt handler.)
There may be some exceptions to this rule, particularly for supply blocks.
Using the structure that you provided, this is one way to achieve what you want:
react {
    # Are we ignoring the interval(1) values?
    my Bool:D $ignore = False;
    # The sleeping status of interval(5).
    my Promise:D $sleep .= kept;
    whenever Supply.interval(1) {
        # Skip if it is supposed to be blocked.
        next if $ignore;
        say DateTime.now.posix;
    }
    # First one runs immediately, so skip it.
    whenever Supply.interval(5).skip {
        # Don't run while the “sleep” is pending.
        next unless $sleep.status; # Planned
        if $ignore {
            $ignore = False;
            say 'Sleep Done';
        } else {
            $ignore = True;
            # Must be less than the multiple of 5 we want
            # otherwise there may be a race condition.
            $sleep = Promise.in(9);
        }
    }
    whenever signal(SIGINT) {
        say "Done.";
        done;
    }
}
That isn't very clear.
How about we just use .rotor instead, to skip every third interval of 5?
react {
my Bool:D $ignore = True;
# Note that first one runs immediately. (no .skip)
# We also want it to always be a few milliseconds before
# the other Supply, so we put it first.
# (Should have done that with the previous example as well.)
whenever Supply.interval(5).rotor(1, 1 => 1) {
$ignore = !$ignore;
}
whenever Supply.interval(1) {
next if $ignore;
say DateTime.now.posix;
}
whenever signal(SIGINT) {
say "Done.";
done;
}
}
While we are at it, why not just use .rotor on the .interval(1) Supply?
react {
whenever Supply.interval(1).rotor(1 xx 4, 1 => 10) {
say DateTime.now.posix;
}
whenever signal(SIGINT) {
say "Done.";
done;
}
}
Note that we can't just use 5 => 10 because that batches them up, and we want them to be run singly.
Note that .grep also works on Supplies, so we could have used that instead to check the $ignore value.
react {
my Bool:D $ignore = True;
whenever Supply.interval(5).rotor(1, 1 => 1) {
$ignore = !$ignore;
}
whenever Supply.interval(1).grep({ !$ignore }) {
say DateTime.now.posix;
}
whenever signal(SIGINT) {
say "Done.";
done;
}
}
Maybe this can work:
loop {
react {
whenever Supply.interval(1) {
say DateTime.now.posix;
}
whenever Promise.in(5) {
done;
}
whenever signal(SIGINT) {
say "Done.";
done;
}
}
sleep 10;
}
The output is:
1542347961
1542347962
1542347963
1542347964
1542347965
1542347976 # <- 10s
1542347977
1542347978
1542347979
1542347980
1542347991 # <- 10s
The thing is the two Supplies are effectively running in different threads so don't interact with each other. Your sleep only puts the thread it's in to sleep (and then the fact it's a 5 second interval creates another sleep anyway).
To achieve the result you're looking for I went with this which uses the single 1 second interval and a couple of flags.
react {
    whenever Supply.interval(1) {
        state $slept = False;
        state $count = 0;
        if $count >= 0 {
            if $slept {
                say "Sleep Done";
                $slept = False
            }
            say DateTime.now.posix;
        }
        $count++;
        if ( $count == 5 ) {
            $count = -9;
            $slept = True
        }
    }
    whenever signal(SIGINT) {
        say "Done.";
        done;
    }
}
Note that we have to use state variables because the whenever block is effectively executed in its own thread each second. The state variables allow us to keep track of the current situation.
If it was running on a smaller interval I would maybe think about using atomic ints instead of normal ones (in case the code was executed while it was still running) but that block should never take more than a second to execute so I don't think it's a problem.
Because only one whenever will be executing at any time, the sleep in there will be halting all handling of things to react to. The easiest way to achieve what you want, is to do the sleep as an asynchronous job by wrapping the code of that whenever into a start block.
react {
whenever Supply.interval(1) {
say DateTime.now.posix;
}
whenever Supply.interval(5) {
start {
sleep 10;
say 'Sleep Done';
}
}
whenever signal(SIGINT) {
say "Done.";
done;
}
}
This gives the desired output, as far as I can see.
I have a never ending stream as a sequence.
What I am aiming for is to take a batch from the sequence both based on time and size.
What I mean is if my sequence has 2250 messages right now I want to send 3 batches ( 1000, 1000, 250).
Also, if I still have not accumulated 1000 messages within the next 5 minutes, I will send whatever has accumulated so far anyway.
sequence
.chunked(1000)
.map { chunk ->
// do something with chunk
}
What I was expecting to have is something like .chunked(1000, 300), where 300 is the number of seconds, so that a batch is sent at least every 5 minutes.
Thanks in advance
Kotlin Sequence is a synchronous concept and is not supposed to be used in any kind of time-limited fashion. If you ask the sequence for the next element, it blocks the invoking thread until it produces the next element, and there is no way to cancel it.
However, kotlinx.coroutines library introduces the concept of Channel which is a rough analogue of a sequence for an asynchronous world, where operation may take some time to complete and they don't block threads while doing so. You can read more in this guide.
It does not provide a ready-to-use chunked operator, but makes it straightforward to write one. You can use the following code:
import kotlinx.coroutines.experimental.channels.*
import kotlinx.coroutines.experimental.selects.*
fun <T> ReceiveChannel<T>.chunked(size: Int, time: Long) =
    produce<List<T>>(onCompletion = consumes()) {
        while (true) { // this loop goes over each chunk
            val chunk = mutableListOf<T>() // current chunk
            val ticker = ticker(time) // time-limit for this chunk
            try {
                whileSelect {
                    ticker.onReceive {
                        false // done with chunk when timer ticks, takes priority over received elements
                    }
                    this@chunked.onReceive {
                        chunk += it
                        chunk.size < size // continue whileSelect if chunk is not full
                    }
                }
            } catch (e: ClosedReceiveChannelException) {
                return@produce // that is normal exception when the source channel is over -- just stop
            } finally {
                ticker.cancel() // release ticker (we don't need it anymore as we wait for the first tick only)
                if (chunk.isNotEmpty()) send(chunk) // send non-empty chunk on exit from whileSelect
            }
        }
    }
As you can see from this code, it embeds some non-trivial decisions on what to do in corner cases. What should we do if the timer expires but the current chunk is still empty? This code starts a new time interval and does not send the previous (empty) chunk. Do we finish the current chunk on a timeout after the last element, measure time from the first element, or measure time from the beginning of the chunk? This code does the latter.
This code is completely sequential -- its logic is easy to follow in a step-by-step way (there is no concurrency inside the code). One can adjust it to any project-specific requirements.
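The corner-case decisions described above can be checked without any channels by replaying them over pre-recorded arrival times. The following is only an illustrative sketch using the standard library: Timed and chunkBySizeOrTime are hypothetical names, time starts at 0, and the input is assumed to be sorted by arrival time. It is a model of the policy, not a replacement for the channel-based operator.

```kotlin
// Hypothetical type: a value tagged with its arrival time in milliseconds.
data class Timed<T>(val atMs: Long, val value: T)

// Replays the chunking policy over recorded arrivals (sorted by atMs):
// - a chunk is emitted as soon as it holds `size` elements;
// - the time window is measured from the beginning of the chunk;
// - an expired window with an empty chunk emits nothing and starts a new window.
fun <T> chunkBySizeOrTime(items: List<Timed<T>>, size: Int, windowMs: Long): List<List<T>> {
    require(size > 0 && windowMs > 0)
    val chunks = mutableListOf<List<T>>()
    var chunk = mutableListOf<T>()
    var windowStart = 0L
    for (item in items) {
        // Close every window that expired before (or exactly when) this item arrived.
        while (item.atMs >= windowStart + windowMs) {
            if (chunk.isNotEmpty()) {
                chunks.add(chunk)
                chunk = mutableListOf()
            }
            windowStart += windowMs
        }
        chunk.add(item.value)
        if (chunk.size == size) { // chunk is full: emit and start a new window now
            chunks.add(chunk)
            chunk = mutableListOf()
            windowStart = item.atMs
        }
    }
    if (chunk.isNotEmpty()) chunks.add(chunk) // source is over: flush the remainder
    return chunks
}

fun main() {
    val arrivals = listOf(Timed(0L, "a"), Timed(1L, "b"), Timed(2L, "c"))
    println(chunkBySizeOrTime(arrivals, size = 2, windowMs = 100)) // prints [[a, b], [c]]
}
```

Changing the policy, for example measuring the window from the first element instead of from the start of the chunk, would only require moving where windowStart is reset.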