Access tez job history - hive

I want to get job information from the JobHistory server for the Tez execution engine.
Currently all MapReduce jobs are reflected on the JobHistory server, but the Tez ones are not.
The JobHistory server builds its view from some kind of logs. Where can I find those logs? If the information is not available on the JobHistory server, I can parse those logs myself to get the information I need.
I have already tried parsing the pig-tez command-line logs, but they do not contain enough information, and that approach does not work for Hive on Tez.
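Tez does not report to the MapReduce JobHistory server at all; its DAG history goes to the YARN Application Timeline Server (ATS), which exposes a REST API. A minimal sketch of querying it from Python follows; the host name is a placeholder and 8188 is only the common default timeline-server HTTP port, so adjust both to your cluster:

```python
import json
import urllib.request

# Tez publishes DAG history to the YARN Application Timeline Server (ATS),
# not the MapReduce JobHistory server. Host below is an assumption; 8188
# is the common default HTTP port for the timeline server.
TIMELINE_BASE = "http://timelineserver.example.com:8188"

def dag_list_url(base, limit=10):
    """Build the ATS REST URL listing the most recent Tez DAG entities."""
    return f"{base}/ws/v1/timeline/TEZ_DAG_ID?limit={limit}"

def summarize_dags(timeline_json):
    """Pull (dag id, status) pairs out of an ATS entity listing."""
    return [
        (e.get("entity"), e.get("otherinfo", {}).get("status"))
        for e in timeline_json.get("entities", [])
    ]

def fetch_recent_dags(base=TIMELINE_BASE, limit=10):
    """Fetch and summarize the most recent Tez DAGs (network call)."""
    with urllib.request.urlopen(dag_list_url(base, limit)) as resp:
        return summarize_dags(json.load(resp))
```

The same endpoint family also exposes `TEZ_VERTEX_ID` and `TEZ_TASK_ID` entities if you need finer-grained detail than whole DAGs.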

Related

BigQuery - How to see failed scheduled queries in Cloud Logging?

I would like to monitor the state of my BigQuery scheduled queries in a Cloud Monitoring dashboard. I have created several logs-based metrics to track errors in other services/resources, but I am having trouble finding any indication of scheduled query errors in Cloud Logging.
From the Scheduled Queries page in the BigQuery UI, I can check the run details of failed scheduled queries, and it shows some log entries explaining the error, e.g.:
9:02:59 AM Error code 8 : Resources exceeded during query execution: Not enough resources for query planning - too many subqueries or query is too complex..; JobID: PROJECT:12345abc-0000-12a3-1234-123456abcdf
9:00:17 AM Starting to process the query job with no parameters.
9:00:00 AM Dispatched run to data source with id 1234567890
But for some reason I cannot find any of these messages in Cloud Logging. For succeeded jobs, there are some entries in the BigQuery logs, but the failed jobs are missing completely.
Any idea how to view failed scheduled queries in Cloud Logging or Cloud Monitoring?
You can use the following advanced filter to match all the BigQuery errors related to "jobservice.insert":
resource.type="bigquery_resource"
protoPayload.serviceName="bigquery.googleapis.com"
protoPayload.methodName="jobservice.insert"
severity: "ERROR"
Even a simple query like:
resource.type="bigquery_resource"
severity: "ERROR"
is able to retrieve all the BigQuery-related errors.
Once you find the entry related to a failed scheduled query, you can click on the protoPayload of the result and select "Show matching entries" to start constructing your own advanced query.
I was able to assemble this filter using the Advanced logs queries and BigQuery queries documents.
Please verify the permissions you have. In my case I was able to see failed scheduled queries in Cloud Logging in a project where I was set up as Owner; in another project, where I am only an Editor, I was unable to see the errors, just as in your case.
The filter below works (e.g. I am able to get errors related to "Permission denied while getting Drive credentials", i.e. missing permissions on the Google Drive files for the service account running the scheduled query):
resource.type="bigquery_resource"
protoPayload.serviceName="bigquery.googleapis.com"
protoPayload.methodName="jobservice.insert"
severity=ERROR
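If you want to consume these entries programmatically rather than in the console, the same filter can be assembled in Python and handed to the Cloud Logging client. This is a sketch; the client usage is shown only in comments because the `google-cloud-logging` library and the project id are assumptions about your environment:

```python
# Build the Cloud Logging advanced filter for failed BigQuery job inserts,
# matching the console filter above line for line.
def failed_bq_job_filter():
    """Return the advanced filter as a single newline-joined string."""
    return "\n".join([
        'resource.type="bigquery_resource"',
        'protoPayload.serviceName="bigquery.googleapis.com"',
        'protoPayload.methodName="jobservice.insert"',
        "severity=ERROR",
    ])

# Hypothetical usage with the google-cloud-logging client:
#   from google.cloud import logging as cloud_logging
#   client = cloud_logging.Client(project="YOUR_PROJECT")
#   for entry in client.list_entries(filter_=failed_bq_job_filter()):
#       print(entry.timestamp, entry.payload)
```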

ActiveMQ scheduler jobs count

I have ActiveMQ 5.15.* with Jolokia for JMX access, queried from Python.
With this code I can get all scheduled jobs:
j4p.request(type='read', mbean='org.apache.activemq:brokerName=*,name=JMS,service=JobScheduler,type=Broker')
If the number of jobs is too big, the request runs too long and hits the HTTP timeout.
But I don't need all the jobs, only their count. Is there any way to get only the job count?
Because of the architecture of the on-disk storage for the job scheduler, there is no in-memory job count: the in-memory index holds only a cached subset of the total jobs, and you don't always have an accurate view of what is on disk (especially after a broker restart). So the management interface only exposes access to fetch jobs, not statistics in general. To collect the numbers you'd generally be doing just what your code does now, then exposing a single numeric result after all that work.
You could extend the store interface and carefully add such a feature if you wanted; the source code is open. You'd need to test properly that it works both during normal operation and after a restart, or after some cached data is paged out. The project is always looking for contributors.
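Until such a statistic exists, one option is to keep the expensive fetch on the client and reduce it to a single number there. A sketch using plain Jolokia HTTP calls; the broker name, Jolokia URL, and port are assumptions about your setup, and `getAllJobs` is the JobSchedulerView operation that returns the full job map:

```python
import json
import urllib.request

# MBean name with the question's typos fixed (brokerName, JobScheduler);
# "localhost" as broker name and the Jolokia URL are assumptions.
JOB_SCHEDULER_MBEAN = (
    "org.apache.activemq:brokerName=localhost,name=JMS,"
    "service=JobScheduler,type=Broker"
)

def exec_payload(mbean, operation):
    """Jolokia 'exec' request body for invoking an MBean operation."""
    return {"type": "exec", "mbean": mbean, "operation": operation}

def count_jobs(jolokia_response):
    """No job-count attribute exists, so count the entries that
    getAllJobs() returned (a map keyed by job id)."""
    return len(jolokia_response.get("value") or {})

def fetch_job_count(url="http://localhost:8161/api/jolokia"):
    """Round-trip: fetch all jobs once, reduce to a single number."""
    body = json.dumps(exec_payload(JOB_SCHEDULER_MBEAN, "getAllJobs")).encode()
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return count_jobs(json.load(resp))
```

This still pays the full fetch cost on the broker side, so you may need to raise the client's HTTP timeout; it only avoids shipping the job bodies into your monitoring code.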

resource management on spark jobs on Yarn and spark shell jobs

Our company has a 9-node cluster on Cloudera.
We have 41 long-running Spark Streaming jobs [YARN + cluster mode] and some regular spark-shell jobs scheduled to run at 1 PM daily.
All jobs are currently submitted under user A [with root permission].
The issue I encountered is that while all 41 Spark Streaming jobs are running, my scheduled jobs cannot obtain resources to run.
I have tried the YARN fair scheduler, but the scheduled jobs still do not run.
We expect the Spark Streaming jobs to run at all times, but to reduce the resources they occupy whenever other scheduled jobs start.
Please feel free to share your suggestions or possible solutions.
Your Spark Streaming jobs are consuming too many resources for your scheduled jobs to get started. Either they are always scaled to a point where there aren't enough resources left for scheduled jobs, or they aren't scaling back.
For the case where the streaming jobs aren't scaling back you could check whether you have dynamic resource allocation enabled for your streaming jobs. One way of checking is via the spark shell using spark.sparkContext.getConf.get("spark.streaming.dynamicAllocation.enabled"). If dynamic allocation is enabled then you could look at reducing the minimum resources for those jobs.
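If it turns out dynamic allocation is off, the streaming-specific settings look roughly like this (a sketch for spark-defaults.conf or --conf flags; note these spark.streaming.dynamicAllocation.* keys are separate from the batch spark.dynamicAllocation.* keys and are sparsely documented, and the executor counts are placeholders to tune per job):

```
spark.streaming.dynamicAllocation.enabled        true
# lower bound the streaming job can shrink to when idle (placeholder)
spark.streaming.dynamicAllocation.minExecutors   1
# upper bound during load peaks (placeholder)
spark.streaming.dynamicAllocation.maxExecutors   6
```

Verify the exact property names against your Spark version before relying on them; with 41 streaming jobs, even a small reduction in each job's minimum leaves meaningful headroom for the 1 PM batch jobs.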

How to log the SQL Job failure to the Log Table?

SQL jobs are run by SQL Server Agent. At the moment, we don't have email notification configured for SQL job failures.
I am considering fetching the errors from the log table, and then having a stored procedure generate the error report.
select top 10 * from msdb.dbo.sysjobhistory
How do I log SQL job failures to the log table? And when a job is scheduled to run every 5 minutes, will the error be updated in place, or inserted as a new record?
Follow these steps:
Open the Properties of the job
Go to Steps
Open the step by pressing the Edit button
Navigate to Advanced
Change the On failure action to "Quit the job reporting failure"
Then check "Log to table"
I ran into the limitations of the built-in logs, job history, etc., where they don't capture the case where a step fails but the job itself doesn't.
Based upon this sqlshack article, I built and implemented something very similar, which maintains a permanent history of job/step failures even if you delete a job/step. This solution will even send a notification when the reporting job itself fails.
Best of luck!
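On the every-5-minutes question: each run appends new rows to msdb.dbo.sysjobhistory; nothing is updated in place, so repeated failures accumulate as separate records. A sketch of a failure report over the built-in history, as a starting point for the stored procedure (run_status = 0 means failed, 1 means succeeded):

```sql
-- Each job run INSERTs new rows into sysjobhistory; errors are never
-- updated in place. run_status: 0 = Failed, 1 = Succeeded.
SELECT TOP (10)
       j.name      AS job_name,
       h.step_id,
       h.step_name,
       h.run_date,       -- int, yyyymmdd
       h.run_time,       -- int, hhmmss
       h.message
FROM msdb.dbo.sysjobhistory AS h
JOIN msdb.dbo.sysjobs       AS j
  ON j.job_id = h.job_id
WHERE h.run_status = 0
ORDER BY h.instance_id DESC;
```

Bear in mind that Agent trims sysjobhistory by default (the "Limit size of job history log" setting), which is exactly why a permanent custom log table is worth having.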

Pentaho Logging specify Job or Trans for each line

I am running Pentaho Kettle 6.1 through a java application. All of the Pentaho logs are directed through the java app and logged out into the same log file at the java level.
When a job starts or finishes the logs indicate which job is starting or finishing, but when the job is in the middle of running the log output only indicates the specific step it is on without any indication of which job or trans is executing.
This causes confusion and is difficult to follow when there is more than one job running simultaneously. Does anyone know of a way to prepend the name of the job or trans to each log entry?
Not that I know of, and I doubt there is, for the simple reason that the same transformation/job may be split to run on more than one machine, by more than one user, and/or launched in parallel in different job hierarchies of callers.
The general answer is to log to a database (right-click anywhere, Parameters, Logging, then define the logging table and what you want to log). All the logging will be copied to a database table together with a CHANNEL_ID. This is a unique number attributed to each run, linking together all the logging information that comes from all the dependent jobs/transformations. You can then view this info with a SELECT ... WHERE CHANNEL_ID = ...
However, your case seems simpler. Use database logging with a log interval of, say, 2 seconds, and run SELECT TRANSNAME/JOBNAME, LOG_FIELD FROM LOG_TABLE continuously on your terminal.
You can also follow a specific job/transformation by logging in a specific table, but this means you know in advance which is the job/transformation to debug.
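As a concrete sketch of that follow-along query (assuming a logging table named LOG_TABLE using Kettle's default transformation log-table columns; the channel id value is a placeholder you'd take from the log output):

```sql
-- One row per run; LOG_FIELD accumulates that run's log text and is
-- refreshed every log interval, so re-running this acts like a tail.
SELECT TRANSNAME, STATUS, LOG_FIELD
FROM   LOG_TABLE
WHERE  CHANNEL_ID = '<channel id of the run>';
```

Because each row is tagged with its own TRANSNAME and CHANNEL_ID, this also disambiguates interleaved logs from jobs running simultaneously, which the flat Java-level log file cannot do.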