APScheduler (Python): my scheduled job runs multiple times in the same timeslot

from apscheduler.jobstores.mongodb import MongoDBJobStore
from apscheduler.schedulers.background import BackgroundScheduler

job_stores = {
    'default': MongoDBJobStore(database=databasename,
                               client=clientname, collection="schedulejob")
}
executors = {
    'default': {'type': 'threadpool', 'max_workers': 5}
}
job_defaults = {
    'coalesce': True,            # if several run times were missed, run the job only once when catching up
    'max_instances': 3,
    'misfire_grace_time': 3600,  # still run the job up to one hour after its scheduled time
}

global sched
sched = BackgroundScheduler(jobstores=job_stores, executors=executors,
                            job_defaults=job_defaults, daemon=True)
sched.add_job(helloworld, 'interval', hours=3, args=(), name=name)
I can see that the schedulejob collection is created with a single entry, but at the scheduled time the same helloworld function gets executed four times.
Note: I am using a Flask server with 4 nodes. Could the four nodes be the reason?
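If each of the four nodes creates and starts its own BackgroundScheduler, each scheduler fires the job independently; APScheduler 3.x schedulers do not coordinate with one another even when they share a MongoDB job store, which would explain the four executions. Below is a minimal sketch of one common workaround, not from the original post, assuming the nodes are worker processes that share a filesystem; the lock path and helper name are hypothetical.
import fcntl  # POSIX-only; the lock file must live on a filesystem shared by the workers

_scheduler_lock = None  # keep a reference so the lock is held for the life of the process

def start_scheduler_once():
    """Start the scheduler only in the worker that wins a non-blocking file lock."""
    global _scheduler_lock
    _scheduler_lock = open("/tmp/helloworld-scheduler.lock", "w")
    try:
        fcntl.flock(_scheduler_lock, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        return None  # another worker already owns the scheduler
    sched = BackgroundScheduler(jobstores=job_stores, executors=executors,
                                job_defaults=job_defaults, daemon=True)
    # An explicit id plus replace_existing avoids duplicate entries in the job store across restarts.
    sched.add_job(helloworld, 'interval', hours=3, id='helloworld-every-3h',
                  replace_existing=True, name=name)
    sched.start()
    return sched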

Related

Nextflow only processes one of my paired samples in a subworkflow

I have a workflow consisting of 2 subworkflows.
params.reads = "$projectDir/data/raw/reads/*_{1,2}.fastq.gz"
params.kaiju_db = "$projectDir/data/kaijudb/viruses/kaiju_db_viruses.fmi"
params.kaiju_names = "$projectDir/data/kaijudb/viruses/names.dmp"
params.kaiju_nodes = "$projectDir/data/kaijudb/viruses/nodes.dmp"
workflow subworkflow_A {
    take:
    reads   // channel: [ val(sample), [ reads ] ]

    main:
    count_reads(reads)
    trim_reads(reads)

    emit:
    trimmed_reads = trim_reads.out.reads   // channel: [ val(sample), [ trimmed_reads ] ]
}

workflow subworkflow_B {
    take:
    reads   // channel: [ val(sample), [ reads ] ]
    db      // channel: /path/to/kaiju/db.fmi
    nodes   // channel: /path/to/kaiju/nodes/file
    names   // channel: /path/to/kaiju/names/file

    main:
    taxonomic_classification(reads, nodes, db)
    kaiju_to_krona(taxonomic_classification.out, nodes, names)
    krona_import_text(kaiju_to_krona.out)
    kaiju_to_table(taxonomic_classification.out, nodes, names)
}

workflow main {
    ch_reads = Channel.fromFilePairs("$params.reads", checkIfExists: true)
    subworkflow_A(ch_reads)

    ch_db = Channel.fromPath("$params.kaiju_db", checkIfExists: true)
    ch_nodes = Channel.fromPath("$params.kaiju_nodes", checkIfExists: true)
    ch_names = Channel.fromPath("$params.kaiju_names", checkIfExists: true)

    ch_trimmed_reads = subworkflow_A.out.trimmed_reads
    subworkflow_B(ch_trimmed_reads, ch_db, ch_nodes, ch_names)
}
The input for params.reads is a directory like:
reads/
├── test_sample1_1.fastq.gz
├── test_sample1_2.fastq.gz
├── test_sample2_1.fastq.gz
└── test_sample2_2.fastq.gz
The input for subworkflow_A, ch_reads is:
[test_sample1, [~project/data/raw/reads/test_sample1_1.fastq.gz, ~project/data/raw/reads/test_sample1_2.fastq.gz]]
[test_sample2, [~project/data/raw/reads/test_sample2_1.fastq.gz, ~project/data/raw/reads/test_sample2_2.fastq.gz]]
subworkflow_A then emits the following channel into ch_trimmed_reads
[test_sample1, [~project/work/51/240e81f0a30e7e4c1d932abfe97502/test_sample1.trim.R1.fq.gz, ~project/work/51/240e81f0a30e7e4c1d932abfe97502/test_sample1.trim.R2.fq.gz]]
[test_sample2, [~project/work/b2/d38399833f3adf11d4e8c6d85ec293/test_sample2.trim.R1.fq.gz, ~project/work/b2/d38399833f3adf11d4e8c6d85ec293/test_sample2.trim.R2.fq.gz]]
For some reason, subworkflow_B only runs the first sample, test_sample1, and not the second sample, test_sample2, even though I want to run it over both samples.
Note that a value channel is implicitly created by a process when it is invoked with a simple value. This means you can just pass in a plain file object. For example:
workflow main {
    ch_reads = Channel.fromFilePairs( params.reads, checkIfExists: true )

    db = file( params.kaiju_db )
    nodes = file( params.kaiju_nodes )
    names = file( params.kaiju_names )

    subworkflow_B( ch_reads, db, nodes, names )
}
Most of the time, what you want is one queue channel and one or more value channels when your process requires multiple input channels:
When two or more channels are declared as process inputs, the process
waits until there is a complete input configuration, i.e. until it
receives a value from each input channel. When this condition is
satisfied, the process consumes a value from each channel and launches
a new task, repeating this logic until one or more channels are empty.
As a result, channel values are consumed sequentially and any empty
channel will cause the process to wait, even if the other channels
have values.
A different semantic is applied when using a value channel. This kind
of channel is created by the Channel.value factory method or
implicitly when a process is invoked with an argument that is not a
channel. By definition, a value channel is bound to a single value and
it can be read an unlimited number of times without consuming its
content. Therefore, when mixing a value channel with one or more
(queue) channels, it does not affect the process termination because
the underlying value is applied repeatedly.

Hangfire executes job twice

I am using Hangfire.AspNetCore 1.7.17 and Hangfire.MySqlStorage 2.0.3 for software that is currently in production.
Now and then, we get a report of jobs being executed twice, despite the usage of the [DisableConcurrentExecution] attribute with a timeout of 30 seconds.
It seems that as soon as those 30 seconds have passed, another worker picks up that same job again.
The code is fairly straightforward:
public async Task ProcessPicking(HttpRequest incomingRequest)
{
    var filePath = await StoreStreamAsync(incomingRequest, TriggerTypes.Picking);
    var picking = await XmlHelper.DeserializeFileAsync<Picking>(filePath);
    // Delay by 20 minutes so outbound-out gets the chance to be sent first
    BackgroundJob.Schedule(() => StartPicking(picking), TimeSpan.FromMinutes(20));
}

[TriggerAlarming("[IMPORTANT] Failed to parse picking message to **** object.")]
[DisableConcurrentExecution(30)]
public void StartPicking(Picking picking)
{
    var orderlinePickModels = picking.ToSalesOrderlinePickQuantityRequests().ToList();
    var orderlineStatusModels = orderlinePickModels.ToSalesOrderlineStatusRequests().ToList();
    var isParsed = DateTime.TryParse(picking.Order.UnloadingDate, out var unloadingDate);
    for (var i = 0; i < orderlinePickModels.Count; i++)
    {
        // Prevents bugs with the usage of i in the background jobs
        var index = i;
        var id = BackgroundJob.Enqueue(() => SendSalesOrderlinePickQuantityRequest(orderlinePickModels[index], picking.EdiReference));
        BackgroundJob.ContinueJobWith(id, () => SendSalesOrderlineStatusRequest(
            orderlineStatusModels.First(x => x.SalesOrderlineId == orderlinePickModels[index].OrderlineId),
            picking.EdiReference, picking.Order.PrimaryReference, isParsed ? unloadingDate : DateTime.MinValue));
    }
}
[TriggerAlarming("[IMPORTANT] Failed to send order line pick quantity request to ****.")]
[AutomaticRetry(Attempts = 2)]
[DisableConcurrentExecution(30)]
public void SendSalesOrderlinePickQuantityRequest(SalesOrderlinePickQuantityRequest request, string ediReference)
{
var audit = new AuditPostModel
{
Description = $"Finished job to send order line pick quantity request for item {request.Itemcode}, part of ediReference {ediReference}.",
Object = request,
Type = AuditTypes.SalesOrderlinePickQuantity
};
try
{
_logger.LogInformation($"Started job to send order line pick quantity request for item {request.Itemcode}.");
var response = _service.SendSalesOrderLinePickQuantity(request).GetAwaiter().GetResult();
audit.StatusCode = (int)response.StatusCode;
if (!response.IsSuccessStatusCode) throw new TriggerRequestFailedException();
audit.IsSuccessful = true;
_logger.LogInformation("Successfully posted sales order line pick quantity request to ***** endpoint.");
}
finally
{
Audit(audit);
}
}
It schedules the main task (StartPicking) that creates the objects required for the two subtasks:
Send picking details to the customer
Send a status update to the customer
The first job is duplicated. Perhaps the second job is as well, but that is not important enough to care about, as it only concerns a status update. The duplicated first job, however, causes the customer to think that more items have been picked than in reality.
I would assume that Hangfire updates the state of a job to e.g. "in progress" and checks this state before starting a job. Is my timeout on DisableConcurrentExecution too low? Is it possible in this scenario that the database connection used to update the state takes about 30 seconds (to be fair, it is running on a slow server with ~8 GB RAM and 6 vCores), so that the second worker is already picking up the job again?
Or is this a Hangfire-specific issue that must be tackled?

ActiveMQ/STOMP: Clear Scheduled Messages Pointed to a Destination

I would like to remove messages that are scheduled to be delivered to a specific queue, but I'm finding the process to be unnecessarily burdensome.
Here I am sending a message to a queue with a delay:
self._connection.send(body="test", destination="/queue/my-queue", headers={
    "AMQ_SCHEDULED_DELAY": 100_000_000,
    "foo": "bar"
})
And here I would like to clear the scheduled messages for that queue:
self._connection.send(destination="ActiveMQ.Scheduler.Management", headers={
    "AMQ_SCHEDULER_ACTION": "REMOVEALL",
}, body="")
Of course the "destination" here needs to be ActiveMQ.Scheduler.Management instead of my actual queue. But I can't find anyway to delete scheduled messages that are destined for queue/my-queue. I tried using the selector header, but that doesn't seem to work for AMQ_SCHEDULER_ACTION type messages.
The only suggestions I've seen is to write a consumer to browser all of the scheduled messages, inspect each one for its destination, and delete each schedule by its ID. This seems insane to me as I don't have just a handful of messages but many millions of messages that I'd like to delete.
Is there a way I could send a command to ActiveMQ to clear scheduled messages with a custom header value?
Maybe I can define a custom scheduled messages location for each queue?
Edit:
I've written a wrapper around the stomp.py connection to handle purging schedules destined for a queue. The MQStompFacade takes an existing stomp.Connection and the name of the queue you are working with, and provides enqueue, enqueue_many, receive, purge, and move.
When receiving from a queue, if include_delayed is True, it subscribes to both the queue and a topic that consumes the schedules. Assuming the messages were enqueued with this class and have the name of the original destination queue as a custom header, scheduled messages that aren't destined for the receiving queue will be filtered out.
Not yet tested in production. There are probably a lot of optimizations to be made here.
Usage:
stomp = MQStompFacade(connection, "my-queue")
stomp.enqueue_many([
    EnqueueRequest(message="hello"),
    EnqueueRequest(message="goodbye", delay_millis=100_000)
])
stomp.purge()  # <- removes queued and scheduled messages destined for "/queue/my-queue"
import time
from typing import List, Optional

from stomp import Connection, ConnectionListener

# Message, MessageAttributes, EnqueueRequest, AbstractQueueFacade and rand_string
# are helper types/functions from this codebase that are not shown here.

class MQStompFacade(ConnectionListener):
    def __init__(self, connection: Connection, queue: str):
        self._connection = connection
        self._queue = queue
        self._messages: List[Message] = []
        self._connection_id = rand_string(6)
        self._connection.set_listener(self._connection_id, self)

    def __del__(self):
        self._connection.remove_listener(self._connection_id)

    def enqueue_many(self, requests: List[EnqueueRequest]):
        txid = self._connection.begin()
        for request in requests:
            headers = request.headers or {}
            # Used in scheduled message selectors
            headers["queue"] = self._queue
            if request.delay_millis:
                headers['AMQ_SCHEDULED_DELAY'] = request.delay_millis
            if request.priority is not None:
                headers['priority'] = request.priority
            self._connection.send(body=request.message,
                                  destination=f"/queue/{self._queue}",
                                  txid=txid,
                                  headers=headers)
        self._connection.commit(txid)

    def enqueue(self, request: EnqueueRequest):
        self.enqueue_many([request])

    def purge(self, selector: Optional[str] = None):
        num_purged = 0
        for _ in self.receive(idle_timeout=5, selector=selector):
            num_purged += 1
        return num_purged

    def move(self, destination_queue: AbstractQueueFacade,
             selector: Optional[str] = None):
        buffer_size = 500
        move_buffer = []
        for message in self.receive(idle_timeout=5, selector=selector):
            move_buffer.append(EnqueueRequest(
                message=message.body
            ))
            if len(move_buffer) >= buffer_size:
                destination_queue.enqueue_many(move_buffer)
                move_buffer = []
        if move_buffer:
            destination_queue.enqueue_many(move_buffer)

    def receive(self,
                max: Optional[int] = None,
                timeout: Optional[int] = None,
                idle_timeout: Optional[int] = None,
                selector: Optional[str] = None,
                peek: Optional[bool] = False,
                include_delayed: Optional[bool] = False):
"""
Receiving messages until one of following conditions are met
Args:
max: Receive messages until the [max] number of messages are received
timeout: Receive message until this timeout is reached
idle_timeout (seconds): Receive messages until the queue is idle for this amount of time
selector: JMS selector that can be applied to message headers. See https://activemq.apache.org/selector
peek: Set to TRUE to disable automatic ack on matched criteria. Peeked messages will remain the queue
include_delayed: Set to TRUE to return messages scheduled for delivery in the future
"""
        self._connection.subscribe(f"/queue/{self._queue}",
                                   id=self._connection_id,
                                   ack="client",
                                   selector=selector)
        if include_delayed:
            browse_topic = f"topic/scheduled_{self._queue}_{rand_string(6)}"
            schedule_selector = f"queue = '{self._queue}'"
            if selector:
                schedule_selector = f"{schedule_selector} AND ({selector})"
            self._connection.subscribe(browse_topic,
                                       id=self._connection_id,
                                       ack="auto",
                                       selector=schedule_selector)
            self._connection.send(
                destination="ActiveMQ.Scheduler.Management",
                headers={
                    "AMQ_SCHEDULER_ACTION": "BROWSE",
                    "JMSReplyTo": browse_topic
                },
                id=self._connection_id,
                body=""
            )

        listen_start = time.time()
        last_receive = time.time()
        messages_received = 0
        scanning = True
        empty_receive = False

        while scanning:
            try:
                message = self._messages.pop()
                last_receive = time.time()
                if not peek:
                    self._ack(message)
                messages_received += 1
                yield message
            except IndexError:
                empty_receive = True
                time.sleep(0.1)

            if max and messages_received >= max:
                scanning = False
            elif timeout and time.time() > listen_start + timeout:
                scanning = False
            elif empty_receive and idle_timeout and time.time() > last_receive + idle_timeout:
                scanning = False
            else:
                scanning = True

        self._connection.unsubscribe(id=self._connection_id)

    def on_message(self, frame):
        destination = frame.headers.get("original-destination", frame.headers.get("destination"))
        schedule_id = frame.headers.get("scheduledJobId")
        message = Message(
            attributes=MessageAttributes(
                id=frame.headers["message-id"],
                schedule_id=schedule_id,
                timestamp=frame.headers["timestamp"],
                queue=destination.replace("/queue/", "")
            ),
            body=frame.body
        )
        self._messages.append(message)

    def _ack(self, message: Message):
        """
        Deletes the message from the queue.
        If the message has a schedule_id, also removes the associated scheduled job.
        """
        if message.attributes.schedule_id:
            self._connection.send(
                destination="ActiveMQ.Scheduler.Management",
                headers={
                    "AMQ_SCHEDULER_ACTION": "REMOVE",
                    "scheduledJobId": message.attributes.schedule_id
                },
                id=self._connection_id,
                body=""
            )
        self._connection.ack(message.attributes.id, subscription=self._connection_id)
In order to remove specific messages you need to know the ID, which you can get by browsing the scheduled messages. The only other option available is to use the start and stop time options of the remove operations to remove all messages inside a given time range.
MessageProducer producer = session.createProducer(management);
Message request = session.createMessage();
request.setStringProperty(ScheduledMessage.AMQ_SCHEDULER_ACTION, ScheduledMessage.AMQ_SCHEDULER_ACTION_REMOVEALL);
request.setStringProperty(ScheduledMessage.AMQ_SCHEDULER_ACTION_START_TIME, Long.toString(start));
request.setStringProperty(ScheduledMessage.AMQ_SCHEDULER_ACTION_END_TIME, Long.toString(end));
producer.send(request);
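For the stomp.py client used in the question, the same time-ranged removal can be expressed by setting the start/end headers on the management message. This is only a sketch, not part of the original answer; it assumes the header names "ACTION_START_TIME" and "ACTION_END_TIME" (the string values behind the ScheduledMessage constants in the Java snippet above) and an already-connected stomp.Connection.
import time
import stomp

def remove_scheduled_in_range(conn: stomp.Connection, start_ms: int, end_ms: int) -> None:
    # Asks the scheduler to drop every scheduled message whose fire time falls in [start_ms, end_ms].
    conn.send(
        destination="ActiveMQ.Scheduler.Management",
        body="",
        headers={
            "AMQ_SCHEDULER_ACTION": "REMOVEALL",
            # Assumed string equivalents of AMQ_SCHEDULER_ACTION_START_TIME / _END_TIME
            "ACTION_START_TIME": str(start_ms),
            "ACTION_END_TIME": str(end_ms),
        },
    )

# Example: clear everything scheduled to fire within the next 24 hours.
now_ms = int(time.time() * 1000)
# remove_scheduled_in_range(conn, now_ms, now_ms + 24 * 60 * 60 * 1000)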
If that doesn't suit your need I'm sure the project would welcome contributions.

Why are batch jobs working on one server but not on the other?

I have the same code on two servers.
I have a job for adding a batch job to the queue:
static void Job_ScheduleBatch2(Args _args)
{
    BatchHeader       batHeader;
    BatchInfo         batInfo;
    RunBaseBatch      rbbTask;
    str               sParmCaption = "My Demonstration (b351) 2014-22-09T04:09";
    SysRecurrenceData sysRecurrenceData = SysRecurrence::defaultRecurrence();
    ;
    sysRecurrenceData = SysRecurrence::setRecurrenceStartDateTime(sysRecurrenceData, DateTimeUtil::utcNow());
    sysRecurrenceData = SysRecurrence::setRecurrenceUnit(sysRecurrenceData, SysRecurrenceUnit::Minute, 1);

    rbbTask = new Batch4DemoClass();
    batInfo = rbbTask.batchInfo();
    batInfo.parmCaption(sParmCaption);
    batInfo.parmGroupId(""); // The "Empty batch group".

    batHeader = BatchHeader::construct();
    batHeader.addTask(rbbTask);
    batHeader.parmRecurrenceData(sysRecurrenceData);
    //batHeader.parmAlerts(NoYes::Yes, NoYes::Yes, NoYes::Yes, NoYes::Yes, NoYes::Yes);
    batHeader.addUserAlerts(curUserId(), NoYes::No, NoYes::No, NoYes::No, NoYes::Yes, NoYes::No);
    batHeader.save();

    info(strFmt("'%1' batch has been scheduled.", sParmCaption));
}
and I have a batch job
class Batch4DemoClass extends RunBaseBatch
{
    int methodVariable1;
    int metodVariable2;

    #define.CurrentVersion(1)
    #localmacro.CurrentList
        methodVariable1,
        metodVariable2
    #endmacro
}

public container pack()
{
    return [#CurrentVersion, #CurrentList];
}

public void run()
{
    // The purpose of your job.
    info(strFmt("Repeating batch job: Hello from Batch4DemoClass.run at %1",
        DateTimeUtil::toStr(DateTimeUtil::utcNow())));
}

public boolean unpack(container _packedClass)
{
    Version version = RunBase::getVersion(_packedClass);

    switch (version)
    {
        case #CurrentVersion:
            [version, #CurrentList] = _packedClass;
            break;
        default:
            return false;
    }
    return true;
}
On one server it runs, and in the batch history I can see that it is saving messages to the batch log; on the other server it does nothing. One server is R2 (running), the other R3 (not running).
Both code bases are compiled to CIL. Both servers are batch servers. I can see the batch jobs under the USMF/System administration area page, Inquiries. The only difference I can find is that jobs on R2 have the AOS instance name filled in on the General tab, while on R3 it is empty. On R2 I can see the info messages in the log and in the batch history, but there is nothing on R3.
Any idea how to get the batch jobs running?
See how to Configure an AOS instance as a batch server.
There are 3 preconditions:
Marked as a batch server
Time interval set correctly
Batch group set correctly
Update:
Deleting the user of a batch job may make the batch job never complete. Once the execute queue has filled, no further progress will happen.
Deleting the offending batch jobs is problematic, as only the owner can do so! Then consider changing BatchJob.aosValidateDelete.
Yes, it is worth a try to clear the tables. Also, if you could provide a screenshot of the already running batch jobs and their status, that might help further.

play-framework 2.1.2 (Java): thread pool configuration and AsyncResult

We have a typical web-service which serves JSON data read from a remote database. I was trying out returning Result and AsyncResult, each with the following configuration:
play {
    akka {
        event-handlers = ["akka.event.slf4j.Slf4jEventHandler"]
        loglevel = WARNING
        actor {
            default-dispatcher = {
                fork-join-executor {
                    parallelism-factor = 1.0
                    parallelism-max = 1
                }
            }
        }
    }
}
and one with
parallelism-factor = 1.0
parallelism-max = 5
Following are the observations; the time taken to complete 500 requests is given (average of 5 readings):
1. parallelism-max=1 and parallelism-factor=1.0
   Result: completion time = 291662 ms
   AsyncResult: completion time = 55601 ms
2. parallelism-max=5 and parallelism-factor=1.0
   Result: completion time = 46419 ms
   AsyncResult: completion time = 46977 ms
We can see that with parallelism-max=1, AsyncResult takes far less time than Result. However, with parallelism-max=5, Result and AsyncResult give very similar timings.
Shouldn't the time required decrease as the number of threads increases for AsyncResult as well?
I would appreciate help understanding the reasons behind this observation.