How do I troubleshoot a hanging SQL Server Agent Job - sql

I have a SQL Server Agent Job with 4 steps. If I run it, it displays as "Executing" indefinitely. If I run the code in the four steps sequentially directly into SSMS, they take ~7 seconds to execute. No configuration information (owner, run as, database, etc.) differs from any other Job that runs normally. What else can I examine?

As with any problem that comes in a group, break it down into its individual parts. You have run each step separately, so you know each individual step works. Next, run a job with only steps 1 and 2 and see if it completes. Then add step 3 and see what happens. Eliminate the possible issues step by step. My guess is that one step is not returning success and its flow logic does not tell the job to fail or to move on to the next step.
On the Advanced page of each step's properties, check the On success action and On failure action settings.
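One way to review those actions for all steps at once is to read them as data and look for flow that can never leave the job. The sketch below is only an illustration: it assumes the step rows have already been fetched (for example from msdb.dbo.sysjobsteps, whose column names and action codes it reuses), the sample rows are hypothetical, and a backward jump is merely suspicious, not proof of a hang.

```python
# Action codes as used by sp_add_jobstep / msdb.dbo.sysjobsteps.
QUIT_SUCCESS, QUIT_FAILURE, NEXT_STEP, GOTO_STEP = 1, 2, 3, 4


def next_step(step, succeeded):
    """Return the id of the step that runs next, or None if the job ends."""
    if succeeded:
        action, target = step["on_success_action"], step["on_success_step_id"]
    else:
        action, target = step["on_fail_action"], step["on_fail_step_id"]
    if action in (QUIT_SUCCESS, QUIT_FAILURE):
        return None
    if action == NEXT_STEP:
        return step["step_id"] + 1
    return target  # GOTO_STEP


def find_loops(steps):
    """List (step_id, outcome, target) jumps that go to the same or an earlier step."""
    known_ids = {s["step_id"] for s in steps}
    loops = []
    for s in steps:
        for succeeded in (True, False):
            nxt = next_step(s, succeeded)
            if nxt is not None and nxt in known_ids and nxt <= s["step_id"]:
                loops.append((s["step_id"], "success" if succeeded else "failure", nxt))
    return loops


def make_step(step_id, ok_action, fail_action, ok_target=0, fail_target=0):
    """Build one hypothetical step row using sysjobsteps column names."""
    return {
        "step_id": step_id,
        "on_success_action": ok_action,
        "on_success_step_id": ok_target,
        "on_fail_action": fail_action,
        "on_fail_step_id": fail_target,
    }


# Hypothetical 4-step job: step 3 jumps back to itself whenever it fails.
steps = [
    make_step(1, NEXT_STEP, QUIT_FAILURE),
    make_step(2, NEXT_STEP, QUIT_FAILURE),
    make_step(3, NEXT_STEP, GOTO_STEP, fail_target=3),
    make_step(4, QUIT_SUCCESS, QUIT_FAILURE),
]
loops = find_loops(steps)
print(loops)
```

In this made-up job, a step 3 that keeps failing would keep the job in the "Executing" state, which is the kind of misconfiguration worth ruling out.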

Related

Run a SQL Server job until it succeeds

I have a SQL Server job that has run for almost 2 years.
It connects to an unreliable Oracle database that keeps disconnecting, and its failures are always due to that. When I run it again 10 or 15 minutes later, it succeeds. I'm tired of checking it every day...
Is there a way to make the job keep trying to connect to the Oracle source until it succeeds, or to have another job watch this job's status and rerun it whenever it fails, until it succeeds?
A solution we are using is something like this:
Wrap your Oracle query in an SSIS package, and after the query, have the package update a SQL table that keeps either a history of executions, or just a single row that tracks the last time the job ran successfully. In short, if the Oracle query was successful, then put something in a table saying the query ran successfully today. If it was not successful, then don't put anything in the table for today.
Then at the beginning of the package, BEFORE the Oracle query, check to see if the query has been run successfully today. If it has already run successfully, then do nothing and exit the package. If it has not run successfully today, then go ahead and try to run it, following the post-query steps described above. If you have any other conditions about when the package should run (like "only after 10 am" or anything like that) you would include that logic here.
Finally, schedule the job to call the package, and schedule to run every 15 minutes, or however often you like. It will try every 15 minutes until it runs successfully, and after that it will stop doing anything until the next day.
As a bonus, you can use this same package and job to initiate all tasks that you want handled the same way. You just need to keep meta data about all these tasks in your history/metadata table.
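The check-then-run-then-record logic described above can be sketched outside SSIS as follows. This is a minimal illustration, not the package itself: the task_history table, the function names, and the simulated flaky query are all hypothetical, and an in-memory SQLite database stands in for the SQL Server table.

```python
import sqlite3
from datetime import date


def ensure_table(conn):
    # One row per task per day on which the task succeeded.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS task_history ("
        "task_name TEXT, run_date TEXT, PRIMARY KEY (task_name, run_date))"
    )


def succeeded_today(conn, task_name, today):
    row = conn.execute(
        "SELECT 1 FROM task_history WHERE task_name = ? AND run_date = ?",
        (task_name, today.isoformat()),
    ).fetchone()
    return row is not None


def run_once_per_day(conn, task_name, task, today=None):
    """Run `task` unless it already succeeded today.

    Returns 'skipped', 'succeeded' or 'failed'.
    """
    today = today or date.today()
    if succeeded_today(conn, task_name, today):
        return "skipped"
    try:
        task()
    except Exception:
        # Nothing is recorded, so the next scheduled run tries again.
        return "failed"
    conn.execute(
        "INSERT INTO task_history VALUES (?, ?)", (task_name, today.isoformat())
    )
    conn.commit()
    return "succeeded"


# Simulated Oracle query that fails on the first attempt only.
attempts = {"count": 0}


def flaky_oracle_query():
    attempts["count"] += 1
    if attempts["count"] == 1:
        raise ConnectionError("simulated disconnect")


conn = sqlite3.connect(":memory:")
ensure_table(conn)
day = date(2024, 1, 1)
# Three scheduled runs on the same day, e.g. 15 minutes apart.
results = [
    run_once_per_day(conn, "oracle_load", flaky_oracle_query, day) for _ in range(3)
]
print(results)
```

Because a failed attempt writes nothing, the schedule itself acts as the retry loop, and the third run does no work once the second has succeeded.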
An alternative is to create the job step and leave it unscheduled, then create an SSIS job that acts as the master for all your jobs. It runs every minute, checks a config table for all job steps that have yet to succeed today, and executes any it finds using sp_start_job.
When they run successfully, log the stats to a log table, which prevents them from being launched again until the next day. This way your jobs don't all need to be scheduled every 15 minutes, they launch as soon as possible, and you can add extra logic to handle dependencies, the number running in parallel, importance level, start time, latest start time, maximum number of retries, and so on.
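The selection step of such a master job might look like the sketch below. It is an assumption-laden illustration: JobConfig, jobs_to_start, and the sample jobs are hypothetical names, the sets and dict stand in for the config and log tables, and the real launch would go through sp_start_job, which is not shown here.

```python
from dataclasses import dataclass


@dataclass
class JobConfig:
    name: str
    max_retries: int = 3
    depends_on: tuple = ()


def jobs_to_start(configs, succeeded_today, attempts_today):
    """Pick jobs that have not succeeded today, still have retries left,
    and whose dependencies have all succeeded today."""
    due = []
    for cfg in configs:
        if cfg.name in succeeded_today:
            continue  # already done for today
        if attempts_today.get(cfg.name, 0) >= cfg.max_retries:
            continue  # retry budget used up
        if not all(dep in succeeded_today for dep in cfg.depends_on):
            continue  # a dependency is still pending
        due.append(cfg.name)
    return due


# Hypothetical state at one tick of the master job.
configs = [
    JobConfig("extract"),
    JobConfig("load", depends_on=("extract",)),
    JobConfig("report", max_retries=2),
]
due = jobs_to_start(
    configs,
    succeeded_today={"extract"},
    attempts_today={"report": 2},
)
print(due)
```

Keeping the selection as a pure function of the config and log state makes it easy to add further rules, such as a latest start time, without touching the scheduling itself.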

How to execute X times a Job Executor step

Introduction
To keep it simple, let's imagine a simple transformation.
This transformation gets an input of 4 rows, from a Data Grid step.
The stream passes through a Job Executor, referencing a simple job with a Write Log component.
Expectations
I would like the simple job to execute 4 times, which means 4 log messages.
Results
It turns out that the Job Executor step launches the simple job only once instead of 4 times: I only have one log message.
Hints
The documentation of the Job Executor component specifies the following :
By default the specified job will be executed once for each input row.
This is parametrized in the "Row grouping" tab, with the following field :
The number of rows to send to the job: after every X rows the job will be executed and these X rows will be passed to the job.
Answer
The step actually works well: an input of X rows executes the "Job Executor" step X times. I just wasn't able to see it in the logs.
To verify it, I added a simple transformation inside the "Job Executor" step, which writes into a text file. When I checked this file, it showed that the "Job Executor" had indeed been executed X times.
Research
Trying to understand why I didn't get X log messages after X executions of the "Job Executor", I added a "Wait for" component inside the initial simple job. Adding a two-second wait finally allowed me to see X log messages appear during the execution.
Hope this helps because it's pretty tricky. Please feel free to provide further details.
A little late to the party, as a side note:
Pentaho is a set of programs (Spoon, Kettle, Chef, Pan, Kitchen). The engine is Kettle, and everything inside a transformation is started in parallel. This makes log retrieval a challenging task for Spoon (the UI). You don't actually need a Wait for entry: try writing the logs to a file (by specifying a log file in the Job Executor entry properties) and you'll see everything in place.
Sometimes we need to give Spoon a little time to get everything in place. That's why I personally recommend not relying on the Logging tab of Spoon's Execution Results; it is better to output the logs to a database or to files.

Running same Kettle Job from two different scripts Issue

Is it possible to run a Kettle job more than once at the same time?
What I am Trying
Say we run this script twice at the same time:
sh kitchen.sh -rep="development" -dir="job_directory" -job="job1"
If I run it only once at any given time, the data flow is perfectly fine.
But when I run this command twice at the same time, it throws an error like:
ERROR 09-01 13:34:13,295 - job1 - Error in step, asking everyone to stop because of:
ERROR 09-01 13:34:13,295 - job1 - org.pentaho.di.core.exception.KettleException:
java.lang.Exception: Return code 1 received from statement : mkfifo /tmp/fiforeg
Return code 1 received from statement : mkfifo /tmp/fiforeg
at org.pentaho.di.trans.steps.mysqlbulkloader.MySQLBulkLoader.execute(MySQLBulkLoader.java:140)
at org.pentaho.di.trans.steps.mysqlbulkloader.MySQLBulkLoader.processRow(MySQLBulkLoader.java:267)
at org.pentaho.di.trans.step.RunThread.run(RunThread.java:50)
at java.lang.Thread.run(Thread.java:679)
Caused by: java.lang.Exception: Return code 1 received from statement : mkfifo /tmp/fiforeg
at org.pentaho.di.trans.steps.mysqlbulkloader.MySQLBulkLoader.execute(MySQLBulkLoader.java:95)
... 3 more
It's important for us to run the job twice at the same time. To accomplish this, I could duplicate every job and run the original and the duplicate together, but that is not a good approach in the long run!
Question:
Is Pentaho not maintaining threads?
Am I missing some option, or can I enable some option to make Pentaho create different threads for different job instances?
Of course Kettle maintains threads. A great many of them in fact. It looks like the problem is that the MySQL bulk loader uses a FIFO. You have two instances of a FIFO called /tmp/fiforeg. The first instance to run creates the FIFO just fine; the second then tries to create another instance with the same name and that results in an error.
At the start of the job, you need to generate a unique FIFO name for that instance. I think you can do this by adding a transformation at the start of the job that uses a Generate random value step to generate a random string or even a UUID and store it in a variable in the job via the Set variables step.
Then you can use this variable in the 'Fifo file' field of the MySQL bulk loader.
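The collision and the fix can be reproduced outside Kettle. The sketch below is a plain Python illustration on a Unix-like system, not Kettle configuration: unique_fifo_path is a hypothetical helper, and a temporary directory is used instead of /tmp/fiforeg so nothing real is touched.

```python
import os
import stat
import tempfile
import uuid


def unique_fifo_path(directory, prefix="fiforeg"):
    """Build a FIFO path that differs for every job instance."""
    return os.path.join(directory, f"{prefix}_{uuid.uuid4().hex}")


workdir = tempfile.mkdtemp()

# Two instances sharing one fixed name: the second mkfifo fails,
# which mirrors the "Return code 1 received from statement : mkfifo" error.
shared = os.path.join(workdir, "fiforeg")
os.mkfifo(shared)
try:
    os.mkfifo(shared)
    collided = False
except FileExistsError:
    collided = True

# Two instances with generated names: both FIFOs are created.
first = unique_fifo_path(workdir)
second = unique_fifo_path(workdir)
os.mkfifo(first)
os.mkfifo(second)
both_are_fifos = all(stat.S_ISFIFO(os.stat(p).st_mode) for p in (first, second))

print(collided, both_are_fifos)
```

A UUID is used here because two instances started in the same second could otherwise pick the same timestamp-based name.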
Hope that works for you. I don't use MySQL, so I have no way to make sure.

Pig step execution details

I am new to Pig.
I have written a small Pig script in which I first load data from two different tables and right outer join them; later I also join tables for two other sets of data. It works fine. But I want to see the steps of execution, for example the step in which my data is loaded, so that I can note the time needed for loading, and then the details of the join step, such as how much time it takes to join this many records.
Basically, I want to know which part of my Pig script takes the longest to run so that I can optimize it further.
Is there any way to print from within the script to find out which steps have executed and which have started executing?
Through the JobTracker details link I could not get much information. I could see that a mapper or reducer was running, but I could not find out which part of the script the mapper was running for.
For a Hive job, for example, the JobTracker details link shows which step is currently being executed.
Any information will be really helpful.
Thanks in advance.
I'd suggest you have a look at the following:
Pig's Progress Notification Listener
Penny : this is a monitoring tool but I'm afraid that it hasn't been updated in the recent past (e.g: it won't compile for Pig 0.12.0 unless you do some code changes)
Twitter's Ambrose project. https://github.com/twitter/ambrose
On the other hand, after the script finishes you can see detailed statistics about the execution time of each alias (see: Job Stats (time in seconds)).
Have a look at the EXPLAIN operator. This doesn't give you real-time stats as your code is executing, but it should give you enough information about the MapReduce plan your script generates that you'll be able to match up the MR jobs with the steps in your script.
Also, while your script is running you can inspect the configuration of the Hadoop job. Look at the variables "pig.alias" and "pig.job.feature". These tell you, respectively, which of your aliases (tables/relations) is involved in that job and what Pig operations are being used (e.g., HASH_JOIN for a JOIN step, SAMPLER or ORDER BY for an ORDER BY step, and so on). This information is also available in the job stats that are output to the console upon completion.
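If you copy those two properties out of each job's configuration, matching jobs to script steps becomes a small lookup. The sketch below assumes the configuration is already available as key/value pairs and that both values are comma-separated lists; the job ids, alias names, and the describe_jobs helper are hypothetical.

```python
def describe_jobs(job_confs):
    """Map each Hadoop job id to the Pig aliases and features it runs."""
    summary = {}
    for job_id, conf in job_confs.items():
        # Assumption: both properties hold comma-separated values.
        aliases = [a for a in conf.get("pig.alias", "").split(",") if a]
        features = [f for f in conf.get("pig.job.feature", "").split(",") if f]
        summary[job_id] = {"aliases": aliases, "features": features}
    return summary


# Hypothetical values copied from two running jobs' configuration pages.
job_confs = {
    "job_0001": {"pig.alias": "users,orders,joined", "pig.job.feature": "HASH_JOIN"},
    "job_0002": {"pig.alias": "sorted", "pig.job.feature": "SAMPLER"},
}
summary = describe_jobs(job_confs)
for job_id, info in summary.items():
    print(job_id, info["aliases"], info["features"])
```

With such a mapping next to the JobTracker timings, a slow job can be traced back to the aliases, and therefore the script lines, that produced it.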

Is it possible to execute pentaho step in sequence?

I have a Pentaho transformation which consists of, for example, 10 steps. I want to run it for N input parameters, but not in parallel: each run should start only after the previous run has fully completed (its processing done in a transaction and committed or rolled back). Is it possible with Pentaho?
You can add a 'Block this step until steps finish' step from Flow to your transformation. Or you can combine the 'Wait for SQL' component from Utility with a loop in your job.
Regards
Mateusz
Maybe you should do it using jobs instead of transformations. Jobs run only in sequence, while transformations run in parallel. (Strictly speaking, a transformation has an initialization phase that runs in parallel, and then the flow runs sequentially.)
If you can't use jobs, you can always do what Mateusz said.