What happens when a procedure executed by a JOB has not finished when it is time for the JOB to execute it again? - sql

I want to know what happens when a procedure is executed through a job and, before it finishes, it is time for the job to start the next execution of the procedure. Here is the job I created:
DECLARE
X NUMBER;
BEGIN
SYS.DBMS_JOB.SUBMIT
(
job => x
,what => 'BEGIN PKG_DISTRIBUIDOR_SCHEDULER.PRC_DISTRIBUYE_TRANSACCIONES(5000); END;'
,next_date => SYSDATE
,interval => 'SYSDATE+30/86400'
,no_parse => FALSE
);
DBMS_OUTPUT.PUT_LINE('Job Number is: ' || to_char(x));
COMMIT;
END;
As you can see, the job is scheduled to run every 30 seconds. So if my procedure (PRC_DISTRIBUYE_TRANSACCIONES) takes longer than 30 seconds, what does the job do in this case?

If you use the old, deprecated jobs, i.e. DBMS_JOB:
The start time of the next execution is determined when the current job has finished.
If you specify an interval of SYSDATE+30/86400, it does not mean "The job runs every 30 seconds."
It means "The next job starts 30 seconds after the previous job has finished."
If you use Scheduler jobs, i.e. DBMS_SCHEDULER:
Immediately after a job starts, the repeat_interval (e.g. FREQ=SECONDLY;INTERVAL=30) is evaluated to determine the next scheduled execution time of the job. While this might arrive while the job is still running, a new instance of the job does not start until the current one completes. See About Setting the Repeat Interval
So it means: if a job lasts longer than 30 seconds, the new job starts immediately after the previous job has finished.
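The two behaviours described in this answer can be compared with a small simulation. This is only a sketch of the timing rules as stated above, not Oracle code: the function names, the 30-second interval, and the sample durations are all made up for illustration, and time is counted in seconds from the first start.

```python
def starts_interval_after_finish(durations, interval=30):
    """DBMS_JOB-style rule as described above: next start = previous finish + interval."""
    starts, t = [], 0
    for d in durations:
        starts.append(t)
        t = t + d + interval  # wait for the run to finish, then add the interval
    return starts


def starts_fixed_schedule_no_overlap(durations, interval=30):
    """DBMS_SCHEDULER-style rule as described above: the next due time is computed
    at start, but a new run never begins before the current one has completed."""
    starts, t = [], 0
    for d in durations:
        starts.append(t)
        next_due = t + interval  # evaluated right after the run starts
        finish = t + d
        t = max(next_due, finish)  # overdue runs start as soon as the previous one ends
    return starts


# Hypothetical run durations in seconds; the second run is longer than the interval.
durations = [10, 45, 10]
print(starts_interval_after_finish(durations))
print(starts_fixed_schedule_no_overlap(durations))
```

In the first model every run pushes the following start back by its own duration, while in the second model only the run that exceeds the interval delays the next start.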

Nothing happens!
The next date is calculated from the interval parameter only when the anonymous PL/SQL block in the "what" parameter finishes.

Related

How to add 2 minutes delay between jobs in a queue?

I am using Hangfire in ASP.NET Core with a server that has 20 workers, which means 20 jobs can be enqueued at the same time.
What I need is to enqueue them one by one, with a 2-minute delay between consecutive jobs. Each job can take 1-45 minutes. I don't have a problem running jobs concurrently, but I do have a problem starting 20 jobs at the same time. That's why changing the worker count to 1 is not practical for me (it would slow the process down a lot).
The idea is that I just don't want 2 jobs to start at the same second, since this may cause conflicts in my logic, but if the second job starts 2 minutes after the first one, then I am good.
How can I achieve that?
You can use BackgroundJob.Schedule() to run your job at a specific time:
BackgroundJob.Schedule(() => Console.WriteLine("Hello"), dateTimeToExecute);
Based on that, set a date for the first job to execute, and then increase this date by 2 minutes for each new job.
Something like this:
var dateStartDate = DateTime.Now;
foreach (var j in listOfjobsToExecute)
{
    BackgroundJob.Schedule(() => j.Run(), dateStartDate);
    dateStartDate = dateStartDate.AddMinutes(2);
}
See more here:
https://docs.hangfire.io/en/latest/background-methods/calling-methods-with-delay.html?highlight=delay

Snowflake: how to resume a root task after it gets suspended?

I have the following root and child task. Everything works as intended but how can I automatically have TSK_ROOT resume after it gets suspended? Do I have to make another task that checks if TSK_ROOT is suspended? Doesn't that defeat the purpose of a root task?
CREATE OR REPLACE TASK TSK_ROOT
WAREHOUSE = MYWH
SCHEDULE = '5 MINUTE'
WHEN
SYSTEM$STREAM_HAS_DATA('<stream_name>')
AS
ALTER TASK TSK_ROOT SUSPEND;
CREATE OR REPLACE TASK TKS_ONE
WAREHOUSE = MYWH
AFTER TSK_ROOT
AS
....
What I understand from the comments:
TKS_ONE runs after a run of TSK_ROOT.
The question assumes that TKS_ONE only runs after TSK_ROOT is suspended, hence TSK_ROOT includes code to suspend itself.
That's not necessarily true. TSK_ROOT could run anything, select 1 x for example, and TKS_ONE would run regardless.
So to avoid suspending TSK_ROOT, just don't suspend it from within TSK_ROOT.
Check this example:
https://github.com/fhoffa/snowflake_snippets/blob/main/stream_and_tasks/minimal.sql
Suspend and resume should be used to turn a task on (resume) or off (suspend) for code maintenance, not as an operational function.
Maybe try something like
SCHEDULE = 'USING CRON */5 9 * * * PST8PDT'
Since you have
SYSTEM$STREAM_HAS_DATA('<stream_name>') in your WHEN clause, the task will not resume your warehouse or consume credits if the stream is empty.
If this is a theoretical and not an operational question:
For any task chain with a schedule and following tasks in sequence, like so
root (scheduled) -- task1 (after root) -- task2 (after task1) -- task3 (after task2)
'suspend' must be applied from the left and 'resume' from the right.
It is possible for the sequence to look like
resumed -- suspended -- resumed -- suspended
but in practice it should always look like one of these:
resumed -- resumed -- resumed -- resumed
suspended -- resumed -- resumed -- resumed
suspended -- suspended -- resumed -- resumed
suspended -- suspended -- suspended -- resumed
suspended -- suspended -- suspended -- suspended
So in this case, if your operational goal is to run task1 once:
have root run a do-nothing statement like select current_date;
do what you must with task1;
and then have task2 run 'alter task root suspend;'
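The "suspend from the left, resume from the right" rule above means that, in practice, every suspended task should come before every resumed task in the chain. A minimal sketch of that check follows. It only models the rule from this answer and does not call Snowflake; the function name and the state strings are assumptions chosen for illustration.

```python
def is_consistent_chain(states):
    """Return True when all 'suspended' tasks precede all 'resumed' tasks.

    `states` lists the task states from the root (left) to the last child (right).
    """
    seen_resumed = False
    for state in states:
        if state == "resumed":
            seen_resumed = True
        elif state == "suspended":
            if seen_resumed:
                # A suspended task to the right of a resumed one breaks the rule.
                return False
        else:
            raise ValueError(f"unknown state: {state!r}")
    return True


# One of the recommended layouts from the list above.
print(is_consistent_chain(["suspended", "suspended", "resumed", "resumed"]))
# The possible but discouraged layout.
print(is_consistent_chain(["resumed", "suspended", "resumed", "suspended"]))
```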

Run Job every 4 days but first run should happen now

I am trying to set up APScheduler to run every 4 days, but I need the job to start running now. I tried using the interval trigger, but I discovered that it waits the specified period before running. I also tried using cron in the following way:
from apscheduler.schedulers.blocking import BlockingScheduler

sched = BlockingScheduler()
sched.add_executor('processpool')

@sched.scheduled_job('cron', day='*/4')
def test():
    print('running')
One final idea I got was using a start_date in the past:
@sched.scheduled_job('interval', seconds=10, start_date=datetime.datetime.now() - datetime.timedelta(hours=4))
but that still waits 10 seconds before running.
Try this instead:
@sched.scheduled_job('interval', days=4, next_run_time=datetime.datetime.now())
Similar to the above answer; the only difference is that it uses the add_job method.
scheduler = BlockingScheduler()
scheduler.add_job(dump_data, trigger='interval', days=21, next_run_time=datetime.datetime.now())
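Why a start_date in the past still waits can be seen from a simplified model of how an interval trigger picks its first fire time. The sketch below is an assumption about the behaviour, based on what the question observed, and is not APScheduler's actual code; first_fire_time and its parameters are hypothetical names. The idea is that the trigger steps forward from start_date in whole intervals until it reaches the current time, so the first run lands on the next interval boundary rather than on "now".

```python
import math
from datetime import datetime, timedelta


def first_fire_time(start_date, interval, now):
    """Simplified model: first fire time of an interval trigger with no previous run."""
    if start_date > now:
        return start_date
    elapsed = (now - start_date).total_seconds()
    # Number of whole intervals needed to reach or pass `now`.
    steps = math.ceil(elapsed / interval.total_seconds())
    return start_date + steps * interval


now = datetime(2024, 1, 1, 12, 0, 5)

# start_date about 4 hours in the past, 10-second interval: still a wait until the next boundary.
print(first_fire_time(datetime(2024, 1, 1, 8, 0, 0), timedelta(seconds=10), now))

# start_date just before `now`, 4-day interval: the first run is almost 4 days away.
print(first_fire_time(now - timedelta(seconds=1), timedelta(days=4), now))
```

Passing next_run_time explicitly, as in the answers above, sidesteps this calculation for the first run only; later runs follow the interval.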

Does HAWQ reuse QE processes after a query has finished?

Query Executor (QE) processes are created on segments to execute queries. When I run a query, I can see the working QEs. But when the query has finished, they are still alive in an idle state. Does HAWQ reuse QE processes after a query has finished?
Yes, HAWQ keeps QE processes at the session level. If a query has finished but the session is still alive, the next query sent through the same session reuses the QEs that were already started.
There are two cases:
1) The number of cached QE processes is less than the number of QEs needed for the new query on the same host. In this case, HAWQ reuses the cached QEs and also starts new QEs to make up the difference.
2) The number of cached QE processes is greater than the number of QEs needed for the new query on the same host. In this case, HAWQ picks some of the cached QEs. You'll see some QEs still idle.
The number of QEs needed is decided by the resource manager.
Moreover, if you run the "SET" command and there are cached QEs on the segment hosts, all the QEs will be reused. But if there are no cached QEs, the "SET" command will not start any QEs on the segments.
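The two cases above reduce to simple arithmetic on the cached and needed QE counts for one host. The sketch below only illustrates that arithmetic; plan_qes and the dictionary keys are hypothetical names, and nothing here reflects HAWQ's internal code.

```python
def plan_qes(cached, needed):
    """Split QE usage on one host into reused, newly started, and left-idle counts."""
    return {
        "reused": min(cached, needed),       # cached QEs picked up by the new query
        "started": max(0, needed - cached),  # case 1: extra QEs that must be started
        "idle": max(0, cached - needed),     # case 2: cached QEs that stay idle
    }


# Case 1: fewer cached QEs than the new query needs.
print(plan_qes(cached=4, needed=6))
# Case 2: more cached QEs than the new query needs.
print(plan_qes(cached=6, needed=4))
```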
The cache of QEs in HAWQ is designed for two purposes:
Reusing the QEs between consecutive queries, so as to avoid forking them every time a query runs and thus improve query performance, especially for small queries.
Debugging during feature development and bug fixing.
The QEs of the current query are released if the current session is closed or if they have been idle for gp_vmem_idle_resource_timeout ms. By default, this is 10 minutes in a debug build and 18 seconds in a release build. You may refer to guc.c for details:
{
{"gp_vmem_idle_resource_timeout", PGC_USERSET, CLIENT_CONN_OTHER,
gettext_noop("Sets the time a session can be idle (in milliseconds) before we release gangs on the segment DBs to free resources."),
gettext_noop("A value of 0 turns off the timeout."),
GUC_UNIT_MS | GUC_GPDB_ADDOPT
},
&IdleSessionGangTimeout,
#ifdef USE_ASSERT_CHECKING
600000, 0, INT_MAX, NULL, NULL /* 10 minutes by default on debug builds.*/
#else
18000, 0, INT_MAX, NULL, NULL
#endif
}
Yes. If another query arrives within the interval, the QEs can be reused. If the interval times out, the QEs quit.
Moreover, closing the session quits all the forked QEs, whatever the interval is.
The interval GUC is gp_vmem_idle_resource_timeout; you can set it in your session.

When does the first occurrence of a recurring timed BackgroundTask run?

If you register a BackgroundTask with a recurring TimeTrigger (OneShot set to false), when does the first occurrence run? After the first FreshnessTime minutes or before?
Microsoft documentation states:
If FreshnessTime is set to 15 minutes and OneShot is false, the task
will run every 15 minutes starting between 0 and 15 minutes from the
time it is registered.
Edit:
I tested this a few times, and it seems to run the first occurrence at any time during the 15-minute period after registration. It then runs future occurrences at regular 15-minute periods, counted from the start time of the previous run.
I'm not sure how the OS schedules the timer cycles internally, but the answer to your question is: not after, not before, but during.
NB: You cannot get any timer background task to fire immediately.