How to use the Batch block with multiple inputs so that each input contributes one and only one agent to the batch?

I have 8 inputs that I would like to combine into one agent using the Batch block. All the inputs have the same flow rate (1 per minute), and I would like each of them to deliver one and only one agent to the batch, so that every input contributes exactly one agent before the batch is complete.
I have tried to use a delay and queue to manually restrict flow, but that has not worked: I got an error saying the flow cannot be restricted, even though the inputs are set to "agents that can't exit are destroyed".
I also looked into using a function, but have not come across one that makes sense for my problem. Any help would be appreciated!

In a very primitive way you can build the following model: each input (Source) leads to its own Queue followed by a Hold block, and all the Hold blocks feed the Batch block.
Define the HOLD blocks as initially blocked.
The function then checks that every queue has at least one agent ready and, if so, releases the agents by unblocking each HOLD block:
// release agents only when every input has at least one agent waiting
if (queue.size()  > 0 &&
    queue1.size() > 0 &&
    queue2.size() > 0 &&
    queue3.size() > 0 &&
    queue4.size() > 0 &&
    queue5.size() > 0 &&
    queue6.size() > 0 &&
    queue7.size() > 0) {
    hold.unblock();
    hold1.unblock();
    hold2.unblock();
    hold3.unblock();
    hold4.unblock();
    hold5.unblock();
    hold6.unblock();
    hold7.unblock();
}
Every time an agent arrives, call this function in the "On exit" action of each of your Source blocks.
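One detail the snippet leaves implicit is that each Hold has to stop the flow again after a single agent has passed; otherwise later arrivals would stream into the Batch block freely. A minimal sketch of one way to do this, as an assumption on top of the answer above (block names follow the snippet):
// In every Hold block's "On exit" action: block the Hold again as soon as one
// agent has passed, so each input contributes exactly one agent per batch.
self.block();
With the Batch block's batch size set to 8, each release then assembles exactly one batch from the eight inputs.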

Related

Why is a conditional channel source causing a downstream process to not execute an instance for each value in a different channel?

I have a Nextflow DSL2 pipeline where an early process generally takes a very long time (~24 hours) and has intermediate products that occupy a lot of storage (~1 TB). Because of the length and resources required for this process, it would be desirable to be able to set a "checkpoint", i.e. save the (relatively small) final output to a safe location, and on subsequent pipeline executions retrieve the output from that location. This means that the intermediate data can be safely deleted without preventing resumption of the pipeline later.
However, I've found that when I implement this and use the checkpoint, a process further downstream that is supposed to run an instance for every value in a list only runs a single instance. Minimal working example and example outputs below:
// foobarbaz.nf
nextflow.enable.dsl=2

params.publish_dir = "$baseDir/output"
params.nofoo = false

xy = ['x', 'y']
xy_chan = Channel.fromList(xy)

process foo {
    publishDir "${params.publish_dir}/", mode: "copy"

    output:
    path "foo.out"

    script:
    """
    touch foo.out
    """
}

process bar {
    input:
    path foo_out

    output:
    path "bar.out"

    script:
    """
    touch bar.out
    """
}

process baz {
    input:
    path bar_out
    val xy

    output:
    tuple val(xy), path("baz_${xy}.out")

    script:
    """
    touch baz_${xy}.out
    """
}

workflow {
    main:
    if( params.nofoo ) {
        foo_out = Channel.fromPath("${params.publish_dir}/foo.out")
    }
    else {
        foo_out = foo() // generally takes a long time and uses lots of storage
    }
    bar_out = bar(foo_out)
    baz_out = baz(bar_out, xy_chan)
    // ... continue to do things with baz_out ...
}
First execution with foo:
$ nextflow foobarbaz.nf
N E X T F L O W ~ version 21.10.6
Launching `foobarbaz.nf` [soggy_gautier] - revision: f4e70a5cd2
executor > local (4)
[77/c65a9a] process > foo [100%] 1 of 1 ✔
[23/846929] process > bar [100%] 1 of 1 ✔
[18/1c4bb1] process > baz (2) [100%] 2 of 2 ✔
(note that baz successfully executes two instances: one where xy==x and one where xy==y)
Later execution using the checkpoint:
$ nextflow foobarbaz.nf --nofoo
N E X T F L O W ~ version 21.10.6
Launching `foobarbaz.nf` [infallible_babbage] - revision: f4e70a5cd2
executor > local (2)
[40/b42ed3] process > bar (1) [100%] 1 of 1 ✔
[d9/76888e] process > baz (1) [100%] 1 of 1 ✔
The checkpointing is successful (bar executes without needing foo), but now baz only executes a single instance where xy==x.
Why is this happening, and how can I get the intended behaviour? I see no reason why whether foo_out comes from foo or is retrieved directly from a file should make any difference to how the xy channel is interpreted by baz.
The problem is that the Channel.fromPath factory method creates a queue channel to provide a single value, whereas the output of process 'foo' implicitly produces a value channel:
A value channel is implicitly created by a process when an input specifies a simple value in the from clause. Moreover, a value channel is also implicitly created as output for a process whose inputs are only value channels.
So without --nofoo, 'foo_out' and 'bar_out' are both value channels. Since 'xy_chan' is a queue channel that provides two values, process 'baz' gets executed twice. With --nofoo, 'foo_out' and 'bar_out' are both queue channels which provide a single value. Since there's only one complete input configuration (i.e. one value from each input channel), process 'baz' gets executed only once. See also: Understand how multiple input channels work.
The solution is to ensure that 'foo_out' is either always a queue channel or always a value channel. Given your 'foo' process declaration, you probably want the latter:
if( params.nofoo ) {
    foo_out = file( "${params.publish_dir}/foo.out" )
}
else {
    foo_out = foo()
}
In my experience, a process is executed according to the input channel with the lowest number of emissions (which is the single path emission from bar in your case).
So in this case the strange behaviour, to my mind, is actually the example without --nofoo.
If you want it executed two times, you may try combining the channels using the combine operator, something like baz_input_ch = bar.out.combine(xy_chan).
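For reference, a minimal sketch of what that could look like, assuming baz is rewritten to take a single tuple input (this rewrite is not part of the original pipeline):
// In the workflow block: pair the bar output with every value of xy_chan, so
// baz receives (bar.out, 'x') and (bar.out, 'y') and runs once per pair.
baz_input_ch = bar.out.combine(xy_chan)
baz_out = baz(baz_input_ch)

// baz then declares one tuple input instead of two separate inputs:
process baz {
    input:
    tuple path(bar_out), val(xy)

    output:
    tuple val(xy), path("baz_${xy}.out")

    script:
    """
    touch baz_${xy}.out
    """
}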

Running jobs concurrently with multiple steps

I need to run 10,000 pipelines, each consisting of 5 steps (processes).
Another requirement is that I want to run about 300 concurrently. Meaning, I want 300 to start, and then, for each pipeline that finishes its 5 steps, I want a new pipeline to start. I couldn't find how to do it using channels.
Some initial thoughts:
Start by splitting the 10,000-item channel into buffers of 300 items.
But that doesn't help with starting a new one when one ends...
proteins = Channel.fromPath( '/some/path/*.fa' ).buffer( size: 300 )

process A {
    input:
    file query_file from proteins

    output:
}

process B {
}
You can achieve approximately what you want using the following in your nextflow.config:
process {
    maxForks = 300
}

executor {
    queueSize = 300
}
The maxForks directive sets the maximum number of process instances that can be executed in parallel. Setting this value in your nextflow.config ensures it is applied across the board to each of your five processes. If you have other processes that you don't want covered by this directive, you can of course use one or more process selectors to limit this configuration to the processes you want, as sketched below. Alternatively, just add the directive to each of your five process definitions.
The executor queueSize just defines the number of tasks the executor will handle in parallel.
This of course won't guarantee the completion of a chunk of five processes before starting a new chunk, but that usually isn't much of a concern.
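For example, a hedged sketch of scoping the limit with a withName selector (the process names here are placeholders for your five steps):
process {
    // limit only the five pipeline steps; other processes keep their defaults
    withName: 'stepA|stepB|stepC|stepD|stepE' {
        maxForks = 300
    }
}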

Unable to exit while loop in UVM monitor

This might be a silly mistake on my side that I have overlooked, but I'm fairly new to UVM and I tried tinkering with my code for a while before asking this. I'm trying to send a stream of 8-bit data within a packet, using a Data Valid Stall protocol, from my UVM driver to the DUT. I'm facing an issue with my input monitor not being able to pick up these transactions that are driven.
I have a while loop with a condition that the valid bit must be high and the stall bit must be low. As long as this condition holds, the monitor needs to pick up the data byte and push it into a queue. I know for a fact that the data is being picked up and pushed into the queue, as I used $display statements along the way. The problem arises once all the data bytes are received and the valid bit goes low. Ideally, this should cause an exit from the while loop, but it isn't doing so. Any help here would be appreciated. I have attached a snippet of the code below. Thanks in advance.
virtual task main_phase (uvm_phase phase);
  $display("Run phase of input monitor");
  collect_transfer();
endtask: main_phase

virtual task collect_transfer();
  fork
    forever begin
      wait_for_valid_transaction_cycle();
      create_and_populate_pkt();
      broadcast_pkt();
      #(iP0_vif.cb_iP0_MON);
    end
  join_none
endtask: collect_transfer

virtual task wait_for_valid_transaction_cycle();
  wait(iP0_vif.cb_iP0_MON.ip_valid && ~iP0_vif.cb_iP0_MON.ip_stall);
endtask: wait_for_valid_transaction_cycle

virtual task create_and_populate_pkt();
  pkt = Router_seq_item :: type_id :: create("pkt");
  pkt.valid = iP0_vif.cb_iP0_MON.ip_valid;
  pkt.sop = iP0_vif.cb_iP0_MON.ip_sop;
  $display("before data collection");
  while(iP0_vif.cb_iP0_MON.ip_valid === `HIGH && iP0_vif.cb_iP0_MON.ip_stall === `LOW) begin
    $display("After checking for stall");
    pkt.data = iP0_vif.cb_iP0_MON.ip_data;
    $display(pkt.data);
    pkt.data_q.push_front(pkt.data);
    pkt.eop = iP0_vif.cb_iP0_MON.ip_eop;
    $display("print check in input monitor # time = %0t", $time);
    #(iP0_vif.cb_iP0_MON);
  end
  $display("before printing input packet from monitor");
  Check_for_port_route_and_populate_packet_field(pkt);
  print_packet(pkt);
endtask: create_and_populate_pkt
The $display statement "before printing input packet from monitor" is not being displayed.
HIGH is defined as a binary 1 and LOW is defined as a binary 0.
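(Presumably via macros along these lines; the actual definitions are not shown in the question:)
`define HIGH 1'b1
`define LOW  1'b0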
The output of the code in terms of display statements is as below.
before data collection
before checking for stall
After checking for stall
2
print check in input monitor # time = 105
before checking for stall
After checking for stall
1
print check in input monitor # time = 115
before checking for stall
After checking for stall
3
print check in input monitor # time = 125
It's possible that the main_phase objection is being dropped elsewhere in your environment; UVM automatically kills any threads that were spawned during a phase when that phase ends.
To fix this, do not object to the main phase in your monitor. Objecting to that phase is the responsibility of the threads creating the stimulus. Instead, you should be launching this monitor during the run_phase, which will ensure that your loop is not killed until the end of simulation.
Also, during the shutdown phase, you will want your monitor to object whenever it is currently seeing a packet. This will ensure that simulation doesn't end as soon as stimulus has been sent in, giving your other monitors time to collect responses from the DUT.
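Below is a hedged sketch of both points, reusing the task names from the snippet above. The packet_in_progress flag is hypothetical (something the monitor would set at the start of create_and_populate_pkt() and clear at its end), and phase_ready_to_end is one common way to implement the "object during shutdown" advice, not the only one:
// Launch collection from run_phase so the forever loop is not killed when the
// main phase's objections drop.
virtual task run_phase (uvm_phase phase);
  collect_transfer();
endtask: run_phase

// When the shutdown phase is about to end, keep it open while a packet is
// still being collected (packet_in_progress is a hypothetical flag).
virtual function void phase_ready_to_end (uvm_phase phase);
  if (phase.get_name() == "shutdown" && packet_in_progress) begin
    phase.raise_objection(this, "input monitor still collecting a packet");
    fork
      begin
        wait (!packet_in_progress);
        phase.drop_objection(this);
      end
    join_none
  end
endfunction: phase_ready_to_end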

Is this redis lua script that deals with key expire race conditions a pure function?

I've been playing around with Redis to keep track of the rate limit of an external API in a distributed system. I've decided to create a key for each route where a limit is present. The value of the key is how many requests I can still make until the limit resets, and the reset is handled by setting the TTL of the key to the time when the limit will reset.
For that I wrote the following Lua script:
if redis.call("EXISTS", KEYS[1]) == 1 then
    local remaining = redis.call("DECR", KEYS[1])
    if remaining < 0 then
        local pttl = redis.call("PTTL", KEYS[1])
        if pttl > 0 then
            --[[
            -- We would exceed the limit if we were to do a call now, so let's send back that a limit exists (1).
            -- Also let's send back by how much we would have exceeded the rate limit if we were to ignore it (remaining)
            -- and how long we need to wait in ms until we can try again (pttl).
            ]]
            return {1, remaining, pttl}
        elseif pttl == -1 then
            -- The key expired the instant after we checked that it existed, so delete it and say there is no rate limit.
            redis.call("DEL", KEYS[1])
            return {0}
        elseif pttl == -2 then
            -- The key expired the instant after we decreased it by one. So let's just send back that there is no limit.
            return {0}
        end
    else
        -- Great, we have a rate limit, but we did not exceed it yet.
        return {1, remaining}
    end
else
    return {0}
end
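For context, here is roughly how the script would be invoked, passing the route's counter key as the single KEYS entry (the file and key names are just illustrations):
# run the script from a file; everything before a "," is treated as a key
redis-cli --eval ratelimit.lua ratelimit:some_route
# or, after SCRIPT LOAD, by SHA with an explicit key count
redis-cli EVALSHA <sha1> 1 ratelimit:some_route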
Since a watched key can expire in the middle of a MULTI transaction without aborting it, I assume the same is the case for Lua scripts. Therefore I put in the cases for when the TTL is -1 or -2.
After I wrote that script I looked a bit more in depth at the EVAL command page and found out that a Lua script has to be a pure function.
In there it says:
The script must always evaluate the same Redis write commands with the same arguments given the same input data set. Operations performed by the script cannot depend on any hidden (non-explicit) information or state that may change as script execution proceeds or between different executions of the script, nor can it depend on any external input from I/O devices.
With this description I'm not sure if my function is a pure function or not.
After Itamar's answer I wanted to confirm that for myself, so I wrote a little Lua script to test it. The script creates a key with a 10 ms TTL and checks the TTL until it's less than 0:
redis.call("SET", KEYS[1], "someVal", "PX", 10)
local tmp = redis.call("PTTL", KEYS[1])
while tmp >= 0
do
    tmp = redis.call("PTTL", KEYS[1])
    redis.log(redis.LOG_WARNING, "PTTL:" .. tmp)
end
return 0
When I ran this script it never terminated. It just went on spamming my logs until I killed the Redis server. However, time doesn't stand still while the script runs; instead, the TTL just stops counting down once it reaches 0.
So the key ages, it just never expires.
Since a watched key can expire in the middle of a MULTI transaction without aborting it, I assume the same is the case for Lua scripts. Therefore I put in the cases for when the TTL is -1 or -2.
AFAIR that isn't the case w/ Lua scripts - time kinda stops (in terms of TTL at least) when the script's running.
With this description I'm not sure if my function is a pure function or not.
Your script's great (without actually trying to understand what it does), don't worry :)

What is a good way to define and declare job dependencies on rundeck?

In Rundeck workflow scheduling, I want to configure a workflow like this:
Job Step 1: returns status "success"
Job Step 2: checks the return status of Job Step 1 (as "success") and proceeds
I am not sure if adding a flow control attribute like this solves the problem.
Question 1:
At Job Step 1, I can return the job status as follows, but how do I check this status at another job step?
(some shell cmd)
if [ $? -eq 0 ]; then
    exit_code="success"
else
    exit_code="failure"
fi
echo $exit_code
Question 2: Is there a way to do this across jobs/workflows in the same/different project?
Flow Control and Job State Conditional allow you to use a custom exit status in one job, and have another job test the exit status of a job. However, these work across independent Jobs, not within a workflow.
If you want Step 2 to depend on Step 1 of a workflow being successful, right now you would have to set the workflow to "keepgoing on failure=false", which will exit the workflow as soon as Step 1 fails.
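For reference, a hedged sketch of that setting in a YAML job definition (the job and step names are placeholders); in the GUI it corresponds to the workflow's "If a step fails: Stop at the failed step" option:
- name: my-two-step-job           # placeholder job name
  sequence:
    keepgoing: false              # stop the workflow as soon as a step fails
    strategy: node-first
    commands:
      - exec: ./step1.sh          # Job Step 1
      - exec: ./step2.sh          # only runs if Step 1 succeeded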