Setting CloudWatch Event Cron Job

I'm a little bit confused by the cron job documentation for CloudWatch Events. My goal is to create a cron job that runs every day at 9am, 5pm, and 11pm EST. Does this look correct, or did I do it wrong? It seems like CloudWatch uses 24-hour UTC time, so I tried to convert from EST.
I thought I was right, but I got the following error when trying to deploy the CloudFormation template via sam deploy:
Parameter ScheduleExpression is not valid. (Service: AmazonCloudWatchEvents; Status Code: 400; Error Code: ValidationException)
What is wrong with my cron job? I appreciate any help! This was my first attempt:
(SUN-SAT, 4,0,6)
UPDATED:
The template below gets the same error, Parameter ScheduleExpression is not valid:
Events:
  CloudWatchEvent:
    Type: Schedule
    Properties:
      Schedule: cron(0 0 9,17,23 ? * * *)
MemorySize: 128
Timeout: 100

You have to specify a value for all six required cron fields (minutes, hours, day-of-month, month, day-of-week, year); AWS schedule expressions have no seconds field, so your seven-field expression is rejected.
This should satisfy all your requirements (9am, 5pm, and 11pm EST are 14:00, 22:00, and 04:00 UTC):
0 4,14,22 ? * * *
Generated using:
https://www.freeformatter.com/cron-expression-generator-quartz.html
There are a lot of other cron generators you can find online.
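In SAM terms, a minimal sketch of the fixed event (assuming the rest of your function definition stays unchanged; note that the expression goes inside cron()):

Events:
  CloudWatchEvent:
    Type: Schedule
    Properties:
      Schedule: cron(0 4,14,22 ? * * *)  # 9am/5pm/11pm EST expressed in UTC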

Related

Cronos ServiceWorkerCronJobDemo in .NET Core 5

I'm using the Cronos ServiceWorkerCronJobDemo (https://github.com/dotnet-labs/ServiceWorkerCronJob) in a .NET Core 5 application.
As it stands, it works as expected.
But as soon as I change the cron expression to what I need, "0 0 1 * *" or #monthly (I need it scheduled to run on the first of the month), I get an "invalid value for the interval" error.
Does anyone have a solution?
(screen dumps of the error and of the settings were attached)
I can't find any problems with your cron expression, and I can parse it with Cronos.
Can you show the specific error and the related code?
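For reference, a minimal sketch of how I checked it (assuming the default five-field Cronos format; times are UTC):

using System;
using Cronos;

class Program
{
    static void Main()
    {
        // "0 0 1 * *" = 00:00 on day 1 of every month
        CronExpression expression = CronExpression.Parse("0 0 1 * *");
        DateTime? next = expression.GetNextOccurrence(DateTime.UtcNow);
        Console.WriteLine(next); // prints the first of the next month, 00:00 UTC
    }
}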
I have reviewed three articles which may be helpful to you:
1. Cronos
2. Cron job troubleshooting guide
3. Schedule Cron Jobs using HostedService in ASP.NET Core

How to set up job dependencies in Google BigQuery?

I have a few jobs: one loads a text file from a Google Cloud Storage bucket into a BigQuery table, and another is a scheduled query that copies data from one table to another table with some transformation. I want the second job to depend on the success of the first one. How do we achieve this in BigQuery, if it is possible to do so at all?
Many thanks.
Right now a developer needs to put together the chain of operations.
It can be done either using Cloud Functions (supports Node.js, Go, Python) or via a Cloud Run container (supports the gcloud API, any programming language).
Basically you need to:
issue a job
get the job id
poll for the job id
when the job is finished, trigger the other steps
If using Cloud Functions:
place the file into a dedicated GCS bucket
set up a GCF that monitors that bucket; when a new file is uploaded, it executes a function that imports it into BigQuery - wait until the operation ends
at the end of the GCF you can trigger other functions for the next step
Another use case with Cloud Functions:
A: a trigger starts the GCF
B: the function executes the query (copies data to another table)
C: it gets a job id and fires another function with a bit of delay
I: that function gets the job id
J: polls whether the job is ready
K: if not ready, it fires itself again with a bit of delay
L: if ready, it triggers the next step - this could be a dedicated function or a parameterized function (see the sketch below)
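A minimal sketch of steps I/J/K in Python with the google-cloud-bigquery client (the SQL, table names, and next-step hook are illustrative assumptions):

from google.cloud import bigquery

client = bigquery.Client()

def start_copy():
    # Issue the transformation query as a job and hand back its id.
    job = client.query(
        "INSERT INTO `my_dataset.target` SELECT * FROM `my_dataset.source`"  # placeholder SQL
    )
    return job.job_id

def check_and_continue(job_id):
    # Poll the job by id; when it is DONE, trigger the next step.
    job = client.get_job(job_id)
    if job.state != "DONE":
        return False  # caller re-fires itself with a bit of delay
    if job.error_result:
        raise RuntimeError(job.error_result)
    # ... trigger the next function here ...
    return True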
It is possible to address your scenario either with Cloud Functions (CF) or with a scheduler (Airflow). The first approach is event-driven, so your data gets processed immediately. With a scheduler, expect some data-availability delay.
As has been stated, once you submit a BigQuery job you get back a job ID, which needs to be checked until it completes. Then, based on the status, you can handle the success or failure post-actions respectively.
If you were to develop a CF, note that there are certain limitations, like execution time (max 9 min), which you would have to address in case the BigQuery job takes more than 9 min to complete. Another challenge with CF is idempotency: making sure that if the same data-file event arrives more than once, the processing does not result in duplicates. A sketch of one way to handle that is below.
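One common idempotency trick (a sketch; already_processed, mark_processed, and load_into_bigquery are hypothetical helpers): key the processing on the GCS object generation, which is unique per upload, and skip events you have already seen.

def on_gcs_event(event, context):
    # GCS finalize events include bucket, name, and a per-upload generation number.
    key = f"{event['bucket']}/{event['name']}#{event['generation']}"
    if already_processed(key):   # e.g. a lookup in Firestore (not shown)
        return
    load_into_bigquery(event)    # hypothetical loader for the new file
    mark_processed(key)          # record the key so retries become no-ops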
Alternatively, you can consider using an event-driven serverless open-source project like BqTail - a Google Cloud Storage BigQuery loader with post-load transformation.
Here is an example of a bqtail rule.
rule.yaml
When:
  Prefix: "/mypath/mysubpath"
  Suffix: ".json"
Async: true
Batch:
  Window:
    DurationInSec: 85
Dest:
  Table: bqtail.transactions
  Transient:
    Dataset: temp
    Alias: t
  Transform:
    charge: (CASE WHEN type_id = 1 THEN t.payment + f.value WHEN type_id = 2 THEN t.payment * (1 + f.value) END)
  SideInputs:
    - Table: bqtail.fees
      Alias: f
      'On': t.fee_id = f.id
OnSuccess:
  - Action: query
    Request:
      SQL: SELECT
        DATE(timestamp) AS date,
        sku_id,
        supply_entity_id,
        MAX($EventID) AS batch_id,
        SUM(payment) payment,
        SUM((CASE WHEN type_id = 1 THEN t.payment + f.value WHEN type_id = 2 THEN t.payment * (1 + f.value) END)) charge,
        SUM(COALESCE(qty, 1.0)) AS qty
        FROM $TempTable t
        LEFT JOIN bqtail.fees f ON f.id = t.fee_id
        GROUP BY 1, 2, 3
      Dest: bqtail.supply_performance
      Append: true
    OnFailure:
      - Action: notify
        Request:
          Channels:
            - "#e2e"
          Title: Failed to aggregate data to supply_performance
          Message: "$Error"
    OnSuccess:
      - Action: query
        Request:
          SQL: SELECT CURRENT_TIMESTAMP() AS timestamp, $EventID AS job_id
          Dest: bqtail.supply_performance_batches
          Append: true
      - Action: delete
You want to use an orchestration tool, especially if you want to set up these tasks as recurring jobs.
We use Google Cloud Composer, a managed service based on Airflow, for workflow orchestration, and it works great. It comes with automatic retries, monitoring, alerting, and much more.
You might want to give it a try.
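To give an idea, here is a minimal Airflow sketch of your two-step dependency (a hedged example: the operators come from the apache-airflow-providers-google package, and the bucket, object, and table names are placeholders):

from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.transfers.gcs_to_bigquery import GCSToBigQueryOperator
from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator

with DAG(
    dag_id="load_then_transform",      # placeholder name
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    load = GCSToBigQueryOperator(
        task_id="load_file",
        bucket="my-bucket",                      # placeholder
        source_objects=["incoming/data.csv"],    # placeholder
        destination_project_dataset_table="my_dataset.raw_table",
        source_format="CSV",
    )
    transform = BigQueryInsertJobOperator(
        task_id="transform",
        configuration={
            "query": {
                "query": "INSERT INTO my_dataset.final SELECT * FROM my_dataset.raw_table",
                "useLegacySql": False,
            }
        },
    )
    # the transform runs only if the load task succeeds
    load >> transform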
Basically, you can use Cloud Logging to observe almost all kinds of operations in GCP.
BigQuery is no exception. When a query job completes, you can find the corresponding entry in the Logs Viewer.
The next question is how to pin down the exact query you want. One way to achieve this is to use a labeled query (that is, attach labels to your query) [1].
For example, you can use the bq command below to issue a query with a foo:bar label:
bq query \
--nouse_legacy_sql \
--label foo:bar \
'SELECT COUNT(*) FROM `bigquery-public-data`.samples.shakespeare'
Then, when you go to the Logs Viewer and apply the log filter below, you will find exactly the log entry generated by the above query:
resource.type="bigquery_resource"
protoPayload.serviceData.jobCompletedEvent.job.jobConfiguration.labels.foo="bar"
The next question is how to emit an event based on this log entry for the next workload. This is where Cloud Pub/Sub comes into play.
Two ways to publish an event based on a log pattern are:
Log Routers: set a Pub/Sub topic as the destination [2]
Log-based Metrics: create an alert policy whose notification channel is Pub/Sub [3]
So the next workload can subscribe to the Pub/Sub topic and be triggered when the previous query has completed.
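For the Log Router option, a sketch with gcloud (the sink, project, and topic names are placeholders):

gcloud logging sinks create bq-job-done-sink \
  pubsub.googleapis.com/projects/my-project/topics/bq-job-done \
  --log-filter='resource.type="bigquery_resource" AND protoPayload.serviceData.jobCompletedEvent.job.jobConfiguration.labels.foo="bar"'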
Hope this helps ~
[1] https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#jobconfiguration
[2] https://cloud.google.com/logging/docs/routing/overview
[3] https://cloud.google.com/logging/docs/logs-based-metrics

CloudWatch Event Cron Expression invalid

I'm trying to add a CloudWatch Scheduled Event with the following cron expression:
cron(0 1 * * ? *)
I want this event to trigger every day at one o'clock.
But I always get the following error:
There was an error while saving rule dms-unstage-tibia.
Details: Parameter ScheduleExpression is not valid.
What is wrong with this cron expression?
OK, I fixed it myself. If you create a CloudWatch Scheduled Event directly in the CloudWatch console, you don't need the cron() wrapper, only the expression inside. But if you create the event from Lambda, you have to write cron(...). Not very intuitive.
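To illustrate the two forms (a sketch; the rule name is a placeholder): the console field takes just 0 1 * * ? *, while, for example, the CLI takes the wrapped form:

aws events put-rule \
  --name my-daily-rule \
  --schedule-expression "cron(0 1 * * ? *)"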

How to run job every 15 mins using quartz in mule

How do I configure a cron expression to run the job every 15 mins?
I configured the one below in the code, but it is not working:
fp.cron.expr=0 15 0 ? * *
Can you please help with this?
You can use the Poll Scheduler in this case and give the cron scheduler expression as:
0 0/15 * 1/1 * ? *
For reference:
https://docs.mulesoft.com/mule-user-guide/v/3.6/poll-schedulers
Your expression is wrong; a job which runs every 15 minutes should look like this: 0 0/15 * * * ? *
The Quartz connector is deprecated. When scheduling tasks, it's recommended that you instead use the Poll Scope. It has options for a Fixed Frequency Scheduler and a Cron Scheduler. With fixed frequency you can choose time units of MILLISECONDS, SECONDS, MINUTES, HOURS, and DAYS.
You can try 0 0/15 * 1/1 * ? *. You can visit the Cron Maker site to generate your cron expression.
The right expression is "0 0/15 * * * ?".
Visit https://www.quartz-scheduler.net/documentation/quartz-2.x/tutorial/crontriggers.html for further reference.
If the above expression is not working, then you can also use a trigger like the one below:
scheduler.Start();
IJobDetail job = JobBuilder.Create<JobName>().Build();
// repeat every 15 minutes, starting now
ITrigger trigger = TriggerBuilder.Create()
    .StartNow()
    .WithSimpleSchedule(x => x.WithIntervalInMinutes(15).RepeatForever())
    .Build();
scheduler.ScheduleJob(job, trigger);
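Alternatively (a sketch, assuming the same Quartz.NET 2.x synchronous API as above), the same 15-minute schedule as a cron trigger:

ITrigger cronTrigger = TriggerBuilder.Create()
    .WithCronSchedule("0 0/15 * * * ?")  // fire at :00, :15, :30, :45 of every hour
    .Build();
scheduler.ScheduleJob(job, cronTrigger);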

Salt Stack - Best way to get schedule output

I'm using SaltStack on my servers. I set up a simple schedule:
job1:
  schedule.present:
    - function: state.apply
    - seconds: 1800
    - splay: 5
Now I want to get the output of the scheduled job back on my master
(or on my minion, but I'd just like to know the best way).
I don't really know how to use the Salt mine or returners, or which is preferred for my needs.
Thank you :)
Job data return and job metadata
By default, data about job runs from the Salt scheduler is returned to the master.
It can therefore be useful to include specific data to differentiate one job from other jobs. Using the metadata parameter, a value can be associated with a scheduled job; although these metadata values aren't used in the execution of the job, they can be used to search for the specific job later.
job1:
  schedule.present:
    - function: state.apply
    - seconds: 1800
    - splay: 5
    - metadata:
        foo: bar
An example job search on the salt-master might look similar to this:
salt-run jobs.list_jobs search_metadata='{"foo": "bar"}'
Take a look at the salt.runners.jobs documentation for more examples.
Scheduler with a returner
The schedule.present function includes a parameter to set a returner that is used to return the results of the scheduled job. You could, for example, use the pushover returner:
job1:
  schedule.present:
    - function: state.apply
    - seconds: 1800
    - splay: 5
    - returner: pushover
Take a look at the returners documentation for a full list of built-in returners and usage information.
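Note that the pushover returner also needs credentials configured, typically in the minion config (a sketch based on the pushover returner docs; check the option names for your Salt version, and the values below are placeholders):

pushover.user: your-user-key      # placeholder
pushover.token: your-app-token    # placeholder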