Pentaho Task Locked

I have some tasks in Pentaho, and for some reason some of them sometimes stall with the message "Carte - Installing timer to purge stale objects after 1440 minutes." For example, I scheduled one task to run at 05:00 AM; it usually finishes in 10 minutes, but sometimes it never ends and stalls with the aforementioned message. However, when I run the same job from the Pentaho Data Integration canvas, it works.
The command that I use to run it is:
cd c:\data-integration
kitchen.bat /rep:repo /job:jobs//job_ft_acidentes /dir: /level:Minimal
How can I prevent this error?
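This doesn't explain the stall, but while you investigate you can at least stop a stalled run from blocking the schedule by wrapping Kitchen in a watchdog that kills it after a generous timeout. A minimal Python sketch (the 30-minute limit is an assumption; the command is the one from the question):

import subprocess

# Run Kitchen, but give up after 30 minutes (the job normally takes ~10).
cmd = r'c:\data-integration\kitchen.bat /rep:repo /job:jobs//job_ft_acidentes /dir: /level:Minimal'
proc = subprocess.Popen(cmd, shell=True)
try:
    proc.wait(timeout=30 * 60)
except subprocess.TimeoutExpired:
    # Kill the whole process tree: proc.kill() would stop cmd.exe but
    # leave Kitchen's JVM running.
    subprocess.run(["taskkill", "/PID", str(proc.pid), "/T", "/F"])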

Related

IICS: How Do You Orchestrate Scheduled Task Flows?

I would like to run multiple scheduled Task Flows against the same data source but only run one at a time.
Example:
Schedule "Nightly" runs once a day (expected runtime 30 minutes),
Schedule "Hourly" runs once an hour (expected runtime 10 minutes),
Schedule "Minute" runs once a minute (expected runtime 5 seconds).
I would like:
#1 "Nightly" test status of "Hourly" and "Minute":
If they are not running, start "Nightly",
If either are running, loop around until both have stopped.
#2 "Hourly" test status of "Nightly" and "Minute":
If they are not running, start "Hourly",
If "Nightly" is running, exit,
If "Minute" is running, loop around until both have stopped.
#3 "Minute" test status of "Nightly" and "Hourly":
If they are not running, start "Minute",
If either are running, exit.
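In code, the gating I'm after is roughly this (a Python sketch; is_running() and start() are placeholders for whatever status check and trigger mechanism is actually available):

import time

def can_start(flow, is_running):
    """Return 'start', 'wait' or 'exit' for a flow, per the rules above."""
    others = {"Nightly", "Hourly", "Minute"} - {flow}
    busy = {o for o in others if is_running(o)}
    if not busy:
        return "start"
    if flow == "Nightly":
        return "wait"                         # loop until both stop
    if flow == "Hourly":
        return "exit" if "Nightly" in busy else "wait"
    return "exit"                             # "Minute" never waits

def run_gated(flow, is_running, start, poll_seconds=5):
    while True:
        decision = can_start(flow, is_running)
        if decision == "start":
            start(flow)
            return
        if decision == "exit":
            return
        time.sleep(poll_seconds)              # loop around and retest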
So far, I am using handshakes with several JSON files in the cloud.
Meaning, if "Minute" is running, the file minute.json contains information telling a caller "Minute" is running.
When "Minute" ends, it updates its file, minute.json, to reflect the operation has stopped.
As you can imagine, this is very slow.
Also, Informatica always creates a JSON file when JSON is the target. The issue is that, if anything goes wrong, Informatica creates a zero-byte JSON file, which then fails any operation that reads it.
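At minimum, the zero-byte failure mode can be made harmless by treating an empty or unparsable status file as "not running" instead of letting it fail the caller. A sketch, assuming each handshake file carries a boolean "running" flag:

import json
from pathlib import Path

def flow_is_running(status_file):
    """Read a handshake file like minute.json; an empty or corrupt file
    (e.g. the zero-byte files left behind on error) counts as not running."""
    path = Path(status_file)
    if not path.exists() or path.stat().st_size == 0:
        return False
    try:
        status = json.loads(path.read_text())
    except json.JSONDecodeError:
        return False
    return bool(status.get("running"))   # the "running" key is an assumption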
There has got to be a better way.
You could use the Informatica Platform REST API v2 to monitor and control the execution of your jobs programmatically from an external site.
It's a bit involved to set everything up and write the logic (or configure a driving tool), but this setup should give you full control, including error handling, logging, alerting, etc. In outline:
Log in (there are a number of options, like SAML and Salesforce credentials);
Check the status and outcome of your jobs in the activity log or the activity monitor via the API;
Use the job and/or schedule resources via the API to run your jobs.
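For example, the v2 flow is roughly: log in, poll the activity monitor, then start a task when nothing conflicting is running. A minimal Python sketch (the pod URL, credentials, taskId and taskType are placeholders; verify the endpoint paths against the current v2 documentation):

import requests

LOGIN_URL = "https://dm-us.informaticacloud.com/ma/api/v2/user/login"  # pod-specific

resp = requests.post(LOGIN_URL, json={
    "@type": "login",
    "username": "user@example.com",   # placeholder credentials
    "password": "secret",
})
resp.raise_for_status()
body = resp.json()
headers = {"icSessionId": body["icSessionId"], "Accept": "application/json"}
server = body["serverUrl"]            # base URL for all later v2 calls

# What is running right now?
running = requests.get(server + "/api/v2/activity/activityMonitor",
                       headers=headers).json()

# Start a task only when nothing conflicting is running; taskId and
# taskType ("MTT" = mapping task) are placeholders for your own flow.
if not running:
    requests.post(server + "/api/v2/job", headers=headers, json={
        "@type": "job",
        "taskId": "0123456789ABCDEF",
        "taskType": "MTT",
    }).raise_for_status()

The activity monitor reports currently running activities, so an empty response is a reasonable "nothing running" signal; finished runs show up in the activity log instead.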

Pentaho 7.1 Job No Longer Runs after I run it a second time

I'm running into a weird issue with Pentaho 7.1. I run a job that I created and it runs perfectly and quickly the first time I run it.
The job is an ETL Job consisting of a Start widget, 7 Transformations running in a sequence, and a Success widget.
I'm confused as to why the job runs once, but when I try to run it again it says "Spoon - Starting job..." and then just hangs.
If I delete the job and create a brand-new one, I can run it once and then I'm stuck again, with the job unable to run after that. I don't understand why the job keeps hanging after it has executed once, and is then 100% broken after a successful run...
I turned up the logging in Pentaho 7.1 Spoon, and it shows this continuously...
2018/08/14 17:30:24 - Job1 - Triggering heartbeat signal for ported_jobs at every 10 seconds
2018/08/14 17:30:34 - Job1 - Triggering heartbeat signal for ported_jobs at every 10 seconds
2018/08/14 17:30:44 - Job1 - Triggering heartbeat signal for ported_jobs at every 10 seconds
2018/08/14 17:30:54 - Job1 - Triggering heartbeat signal for ported_jobs at every 10 seconds
I can't seem to put my finger on why this is happening.
Any help is appreciated.
Probable answer: check that your transformations are not opening the same database for both input and output. A quick check may be to run the first transformation directly (without the job) and see if it locks.
In my case this happened because the database server being updated was slow to respond, probably due to high CPU and RAM load. I increased the RAM and CPU for the DB server, and now my job runs okay.

SSIS Agent job keeps running as "inprogress"

I have a SSIS package that monitors a folder. This package will run continuously until it's terminated.
I want to schedule this using a SQL Agent job. This SQL Agent job will use two steps; it is a kind of heartbeat job to make sure the SSIS package keeps running.
Step 1: check whether the SSIS package is running (one way to do this check is sketched below). If it is running, quit; else go to step 2.
Step 2: execute the SSIS package. If OK, report success and quit; else report failure and quit.
The job uses a daily schedule, Mon-Fri, every 4 hours.
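For step 1, the check I have in mind looks roughly like this (a Python sketch assuming the package is deployed to the SSISDB catalog; the server, driver, and package name are placeholders, and the same query could run directly as a T-SQL job step instead):

import pyodbc

# In SSISDB.catalog.executions, status = 2 means "running".
QUERY = """
SELECT COUNT(*)
FROM SSISDB.catalog.executions
WHERE package_name = ? AND status = 2
"""

conn = pyodbc.connect(
    "Driver={ODBC Driver 17 for SQL Server};"
    "Server=myserver;Database=SSISDB;Trusted_Connection=yes;"  # placeholder
)
running = conn.execute(QUERY, "FolderWatcher.dtsx").fetchone()[0] > 0
print("running" if running else "not running")   # step 1: quit if running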
When I execute the SQL Agent job, it starts the SSIS package, but the job keeps running, and the job monitor and history show it as "In Progress".
I had to close the dialog to get out of it, but in the background the SSIS package is still running as expected.
Is this normal behavior ? Do I need to approach this in a different way ?
Appreciate any pointers or help on this.
Once the job has begun, the Start Jobs dialog box has no impact whatsoever on the running of the job itself - it exists solely to provide a monitoring window for you. Closing it will have no effect on the running job.
From other phrases in your question, I gather that you do not expect the job to ever 'finish' - therefore I would expect it to always show as In Progress unless it errors out or is stopped.
"This package will run continuously until it's terminated."
"The job keeps running and the job monitor and history shows it as in progress"

queue job all day and execute it at a specified time

Is there a plugin, or can I configure Jenkins somehow, so that a job (triggered by 3 other jobs) queues all day and only executes the whole queue at a specified time?
Our case is this:
we have tests run for 3 branches
each of the 3 build jobs for those branches triggers the same smoke-test-job that runs immediately
each of the 3 build jobs for those branches triggers the same complete-test-job
The first two points work perfectly fine.
The complete-test-job should queue the tests all day long and just execute them in the evening or at night (starting from a defined time like 6 pm), so that the tests are run at night and during the day the job is silent.
Triggering the complete-test-job at a specified time with the newest version is not an option: we absolutely need the trigger from the upstream build job (because of the promotion plugin, and because we do not want to re-run versions that have already been run).
That seems a rather strange request. Why queue a build if you don't want it now... And if you want a build later, then you shouldn't be triggering it now.
You can use the Jenkins Exclusion plugin. Have your test jobs use a certain resource. Make another job whose task is to "hold" the resource during the day. While the resource is in use, the test jobs won't run.
Problem with this: you are going to kill your executors by having queued non-executing jobs, and there won't be free executors for other jobs.
Haven't tried it myself, but this sounds like a solution to your problem.

Oozie start time and submission time delay

I'm working on a workflow that has both Hive and Java actions. Very often we notice a delay of a few minutes between a Java action's start time and its job submission time. We don't see that with Hive jobs; they seem to be submitted almost immediately after they are started.

The Java jobs do not do much, so they finish successfully within seconds of being submitted, but the time between start and submission can be very high (4-5 minutes). We are using the fair scheduler and there are enough mapper/reducer slots available. Even if it were a resource problem, the Hive jobs should also show a delay between start and submission, but they don't.

The Java jobs are very simple: they don't process any files and are basically used to call a web service. They spawn only a single mapper and no reducers, whereas the Hive jobs create hundreds of mapper/reducer tasks, and still there is no delay between start and submission. We are not able to figure out why Oozie is not submitting the Java jobs immediately. Any ideas?