Control-M: cyclic child jobs

I have a Control-M workflow of 3 jobs that are supposed to run every 15 minutes, as follows:
Job1-->Job2-->Job3.
Schedule is from 9am to 11am, cyclic every 15 minutes.
Control-M version V9.
Business rule: Job1 starts. As soon as Job1 is completed successfully, Job2 has to start. As soon as Job2 is completed successfully, Job3 has to start.
Therefore, what I did was define Job1 as cyclic every 15 minutes and create an out-condition "job1-OK" that is a precondition for Job2, and the same from Job2 to Job3.
I found that if you don't define Job2 and Job3 as cyclic too, they are never kicked off again after their first execution. Do you guys know why? What I did to solve this was set Job2 and Job3 as cyclic every 15 minutes too, but this messes up my "workflow" < Job1-->Job2-->Job3 >, because Job2's 15-minute cycle is evaluated before its parent Job1 has completed; therefore, Job2 never runs if Job1's cycle is not in sync with Job2's cycle. Do you know what I mean?
Example: Job1 runs in 1 minute, the Job1-OK out-condition is released, and Job2 is kicked off; the same from Job2 to Job3. Job1 now waits for the next 15-minute cycle; the cycle is hit, it runs, and this time it takes 20 minutes to run. In parallel, Job2 is waiting for its 15-minute cycle plus the out-condition from Job1. As soon as Job2's 15-minute cycle is hit, the precondition from Job1 is evaluated, returns false (because Job1 is still running), and Job2 goes back to waiting for its next 15-minute cycle plus the Job1-OK out-condition. Job1 completes after 18 minutes and releases the "Job1-OK" condition, BUT Job2 is not kicked off because its next cycle will only fire 12 minutes later (since Job1 took 18 minutes to run). Therefore, Job2's execution is lost in this cycle.
I basically want Job2 to run as soon as Job1 has completed, regardless of schedule, time, cycle or anything else. Is it possible to accomplish such a task? Otherwise, how can I fulfill this requirement without overlap between the cycles and the precondition checks?
Thanks a lot, I hope I explained myself properly!

To define a cyclic flow, all of the jobs do indeed have to be cyclic.
For your scenario I would suggest setting Job1 as cyclic every 15 minutes; Job2 and Job3 can both be set as cyclic every 1 minute. As long as you are deleting in-conditions upon successful completion of a job, this flow will work perfectly.
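A minimal sketch of why this works, as a minute-by-minute Python simulation (a toy model of the cyclic/condition logic only, not Control-M syntax; the runtimes are made up): Job2's 1-minute cycle means it re-evaluates its in-condition every minute, so it starts at most a minute after Job1 completes, even when Job1 overruns its own 15-minute cycle.

```python
def simulate(job1_runtimes, horizon=60):
    """Minute-by-minute toy model: Job1 is cyclic every 15 minutes,
    Job2 is cyclic every 1 minute but only runs when the in-condition
    'job1-OK' exists, and deletes the condition when it starts."""
    conds = set()
    job2_starts = []
    job1_next = 0           # next minute Job1's 15-minute cycle fires
    job1_busy_until = -1    # minute the current Job1 run completes
    run_idx = 0
    for t in range(horizon):
        # Job1: fire on its cycle, but only if it is not still running
        if t >= job1_next and t > job1_busy_until and run_idx < len(job1_runtimes):
            job1_busy_until = t + job1_runtimes[run_idx]
            run_idx += 1
            job1_next = t + 15
        if t == job1_busy_until:
            conds.add("job1-OK")        # out-condition on success
        # Job2: its 1-minute cycle fires every minute, so it starts as
        # soon as the in-condition exists, then deletes the condition
        if "job1-OK" in conds:
            conds.discard("job1-OK")
            job2_starts.append(t)
    return job2_starts

# Job1 takes 1 min, then overruns to 18 min, then 2 min: Job2 still
# runs once per Job1 completion, within a minute of it finishing.
print(simulate([1, 18, 2]))   # [1, 33, 36]
```

Note how the second Job2 run happens at minute 33, right after the overrunning Job1 finishes, instead of being lost as in the original 15-minute-cycle setup.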
Your alternative solution does work; however, it is very costly when it comes to licensing. Control-M is licensed based on the number of ordered-in jobs, so if you define a flow that infinitely orders in additional jobs, be prepared for a large bill!

After a lot of research, I figured it out:
Control-M needs all jobs to be cyclic if you want them to run multiple times: you cannot rely on a precondition alone to run the same job multiple times in a day.
The only way to control when a job has to be kicked off again is to play around with specific in/out conditions for the job workflow, as well as with quantitative and control resources (no more than X jobs running at the same time, etc.).
ALTERNATIVE: A good alternative to having cyclic jobs is to order child jobs "on the fly" as soon as the condition to kick off the child job is fulfilled. This way there is no need to create tons of out-conditions; just order the job when it is supposed to be kicked off by the parent! In addition, to support complex workflows with multiple scenarios, you can create some dummy jobs that check for preconditions and are themselves evaluated cyclically, every 1 minute for example.
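A sketch of the "order on the fly" idea in Python: the parent's final step orders its child job instead of relying on static conditions. The `ctm run order SERVER FOLDER JOB` command below is only an assumption loosely modeled on the Control-M Automation API CLI; adapt it to whatever ordering interface your site actually uses.

```python
import subprocess

def order_child(server, folder, job, runner=subprocess.run):
    """Order a child job on the fly once the parent has succeeded.
    The `ctm run order` CLI name and arguments are assumptions and
    must be adapted; `runner` is injectable so the call can be
    exercised without a Control-M installation."""
    cmd = ["ctm", "run", "order", server, folder, job]  # hypothetical CLI
    return runner(cmd, check=True)

# In the parent job's script, the very last step after its real work
# would then be something like:
#   order_child("CTMSRV", "DAILY_FOLDER", "Job2")
```

Because the child is only ordered after the parent succeeded, there is no cycle/condition race to worry about.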

Related

How changing the number of workers will affect the Glue job

I have 2,200,000 records to process in a Glue job, which is leading to a timeout, as the timeout is set to 2 days by default and the number of workers is 10. Will increasing the number of workers help the Glue job run faster?
Increasing the number of workers will help the job run faster, provided your job has transformations that can run in parallel, since you are allocating more executor nodes.
2,200,000 records isn't that much, though, and you should check whether something is wrong with the code if it takes more than 2 days.
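The "provided it can run in parallel" caveat can be made concrete with Amdahl's law, which bounds the speedup when only part of the job parallelizes. A small illustrative Python sketch (the 95% parallel fraction is a made-up assumption, not a measured value for this job):

```python
def amdahl_speedup(parallel_fraction, workers):
    """Amdahl's law: upper bound on speedup when only
    `parallel_fraction` of the job scales across workers."""
    return 1.0 / ((1 - parallel_fraction) + parallel_fraction / workers)

# With a 95% parallel job, going from 10 to 40 workers roughly
# doubles throughput rather than quadrupling it:
print(round(amdahl_speedup(0.95, 10), 2))   # 6.9
print(round(amdahl_speedup(0.95, 40), 2))   # 13.56
```

So if the slow part of the job is serial (e.g. a skewed join or a single-threaded driver step), adding workers buys little; that is why checking the code first is good advice.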

How are reserved slots re-allocated between reservation/projects if idle slots are used?

The documentation on Introduction to Reservations: Idle Slots states that idle slots from reservations can be used by other reservations if required
By default, queries running in a reservation automatically use idle slots from other reservations. That means a job can always run as long as there's capacity. Idle capacity is immediately preemptible back to the original assigned reservation as needed, regardless of the priority of the query that needs the resources. This happens automatically in real time.
However, I'm wondering if this can have a negative effect on other reservations in a scenario where idle slots are used but are required shortly afterwards by the "owning" reservation.
To be concrete, I would like to understand whether I can regard assigned slots as a guarantee OR as best effort.
Example:
Reserved slots: 100
Reservation A: 50 Slots
Reservation B: 50 Slots
"A" starts a query at 14:00:00 and the computation takes 300 seconds if 100 slots are used.
All slots are idle at the start of the query, thus all 100 slots are made available to A.
5 seconds later at 14:00:05 "B" starts a query that takes 30 seconds if 50 slots are used.
Note:
For the sake of simplicity, let's assume that both queries have exactly 1 stage and each computation unit ("job") in the stage takes the full time of the query, i.e. the stage is divided into 100 jobs, and once a slot starts the computation it takes the full 300 seconds to finish successfully.
I'm fairly certain that with "multiple stages" or "shorter computation times" (e.g. if the computation can be broken down into 1000 jobs) GBQ would be smart enough to dynamically reassign the freed-up slots to the reservation they belong to.
Questions:
does "B" now have to wait until a slot in "A" finishes?
this would mean ~5 min waiting time
I'm not sure how "realistic" the 5 min are, but I feel this is an important variable since I wouldn't worry about a couple of seconds - but I would worry about a couple of minutes!
or might an already started computation of "A" also be killed mid-flight?
the doc Introduction to Reservations: Slot Scheduling seems to suggest something like this
The goal of the scheduler is to find a medium between being too aggressive with evicting running tasks (which results in wasting slot time) and being too lenient (which results in jobs with long running tasks getting a disproportionate share of the slot time).
Answer via Reddit
A stage may run for quite some time (minutes, even hours in really bad cases), but a stage is run by many workers, and most workers complete their work within a very short time, e.g. milliseconds or seconds. Hence rebalancing, i.e. reallocating slots from one job to another, is very fast.
So if a rebalancing happens and a job loses a large part of its slots, it will run a lot slower, and the one that gains slots will run fast. And this change is quick.
So in the above example, as job B starts 5 seconds in, it would have acquired most of its slots within a second or so.
So bottom line:
a query is broken up into "a lot" of units of work
each unit of work finishes pretty fast
this gives GBQ the opportunity to reassign slots
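This intuition can be sketched as a back-of-the-envelope Python model (a toy, not GBQ's actual scheduler; the unit durations are assumptions): a borrowed slot can only be handed back once its current unit of work finishes, so the reclaim latency is bounded by the unit duration.

```python
import math

def wait_for_reclaim(unit_duration, arrival=5.0):
    """Seconds after B's arrival until a borrowed slot finishes its
    current unit of work and can be handed back. Slots run units
    back-to-back starting at t=0, so the first completion at or
    after `arrival` is the next multiple of `unit_duration`."""
    next_boundary = math.ceil(arrival / unit_duration) * unit_duration
    return next_boundary - arrival

# Pathological case from the question: each unit runs the full 300 s,
# so a borrowed slot cannot be handed back until it finishes.
print(wait_for_reclaim(300))             # 295.0 -> B waits almost 5 minutes

# Realistic case per the answer: units finish in ~0.3 s,
# so slots flow back to B almost immediately.
print(round(wait_for_reclaim(0.3), 2))   # 0.1
```

The model shows why the unit-of-work granularity, not the stage duration, is what determines how "guaranteed" the reserved slots feel in practice.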

How to make the trigger in G1ANT run every 3 minutes, but if it's in the middle of a process, let it complete?

I was told to use a crontab expression to make my schedule trigger run every 3 minutes; however, I notice it's not behaving as expected: it runs a process for a period of 3 minutes and then stops, even if the process is not complete.
I need the trigger to run every 3 minutes, but if it's in the middle of a process, to let the current process complete. Can you advise, please?
I noticed in the schedule trigger you have “start instance” and “stop instance”, currently they are both false. My guess is I need to do something with these?
The triggers never interrupt the script execution.
The special variable timeout is what will solve your problem. According to the manual the timeout special variable:
Defines the maximal robot process duration time (in milliseconds), after which the process terminates; the default value is 180000 (3 minutes).
This means that a script will stop after 3 minutes if its timeout variable keeps the default value.
Add the line of code below at the beginning of your script to prevent it from being terminated after 3 minutes, by increasing its maximal process duration:
♥timeout = 1800000
By the way, if you have set up your triggers to launch a script every 3 minutes and the script's duration is, for example, 4 minutes, the next run will still be launched automatically afterwards: it is 1 minute late, and this creates a queue of scripts waiting to be executed.

Hangfire batching - does each job execute in the order in which it's added to the batch?

With batching, is each job executed in the order in which it's added within the batch?
https://docs.hangfire.io/en/latest/background-methods/using-batches.html
Say I had 3 different jobs but would like them to execute in order 1-2-3, where 2 would only start after 1 was complete. Does batching help in this case? I would still need to at least define the start time with a cron schedule.

How do I do the Delayed::Job equivalent of Process#waitall?

I have a large task that proceeds in several major steps: Step A must complete before Step B can be started, etc. But each major step can be divided up across multiple processes, in my case, using Delayed::Job.
The question: Is there a simple technique for starting Step B only after all the processes have completed working on Step A?
Note 1: I don't know a priori how many external workers have been spun up, so keeping a reference count of completed workers won't help.
Note 2: I'd prefer not to create a worker whose sole job is to busy wait for the other jobs to complete. Heroku workers cost money!
Note 3: I've considered having each worker examine the Delayed::Job queue in the after callback to decide if it's the last one working on Step A, in which case it could initiate Step B. This could work, but seems potentially fraught with gotchas. (In the absence of better answers, this is the approach I'm going with.)
I think it really depends on the specifics of what you are doing, but you could set priority levels such that any jobs from Step A run first. Depending on the specifics, that might be enough. From the GitHub page:
By default all jobs are scheduled with priority = 0, which is top priority. You can change this by setting Delayed::Worker.default_priority to something else. Lower numbers have higher priority.
So if you set Step A to run at priority = 0, and Step B to run at priority = 100, nothing in Step B will run until Step A is complete.
There are some cases where this will be problematic -- in particular, if you have a lot of jobs and are running a lot of workers, you will probably have some workers running Step B before the work in Step A is finished. Ideally, in this setup, Step B has some sort of check to see whether it can run or not.
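Both the priority trick and this caveat can be seen in a tiny Python model of priority-ordered workers (a deliberate simplification of Delayed::Job's behavior, with made-up job names and runtimes):

```python
import heapq

def start_times(jobs, workers, runtime=10):
    """Toy model: each free worker pops the job with the lowest
    priority number (highest priority). Returns {job: start_time}."""
    queue = list(jobs)        # (priority, name) tuples
    heapq.heapify(queue)
    free_at = [0] * workers   # time each worker becomes free
    starts = {}
    while queue:
        w = min(range(workers), key=lambda i: free_at[i])
        _prio, name = heapq.heappop(queue)
        starts[name] = free_at[w]
        free_at[w] += runtime
    return starts

step_a = [(0, f"A{i}") for i in range(4)]     # priority 0: runs first
step_b = [(100, f"B{i}") for i in range(2)]   # priority 100: runs later

one = start_times(step_a + step_b, workers=1)
six = start_times(step_a + step_b, workers=6)
print(one["B0"])   # 40 -> with one worker, B waits for every A job
print(six["B0"])   # 0  -> with spare workers, B starts before A finishes
```

With a single worker the priorities enforce strict A-before-B ordering, but with more workers than Step A jobs, the surplus workers pick up Step B immediately, which is exactly why the answer recommends a readiness check inside Step B.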