Silk Performer Assignment of Load to agents has failed because at least one of the load agents is locked , unavailable, or has insufficient privileges - microfocus

I had been running my Silk tests on 8 agents. During my last run, I closed the test partway through, and since then, whenever I trigger the test, it does not execute on a few of the agents. Also, when I created a new workload and assigned it a few agents (which were not running), I got the error "Assignment of Load to agents has failed because at least one of the load agents is locked , unavailable, or has insufficient privileges".
I am not sure what is causing this, since the Silk Agent Service is running. The agents also show as connected, and even the Try Agent run goes fine. Any suggestions would be appreciated.
Silk Version - 18.5

Agents may remain locked if there was a problem during the last run and temp files were left behind in C:\Users\Public\Documents\Silk Performer 18.5\LocalResults_
Save the results if they are meaningful to you, then delete the folder to unlock the agent.
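
The save-then-delete step can be scripted if it has to be repeated on several agents. The following is only a sketch in plain Java: the class name, method name, and both directory arguments are hypothetical, and it assumes the agent is idle and the backup directory lies outside the results folder. It copies the folder to a backup location first and only then deletes the original.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class AgentResultsCleanup {

    /** Copies resultsDir into backupDir, then deletes resultsDir recursively. */
    static void archiveAndDelete(Path resultsDir, Path backupDir) throws IOException {
        List<Path> entries;
        try (Stream<Path> walk = Files.walk(resultsDir)) {
            // Files.walk lists a directory before its contents.
            entries = walk.collect(Collectors.toList());
        }
        for (Path entry : entries) {
            Path target = backupDir.resolve(resultsDir.relativize(entry).toString());
            if (Files.isDirectory(entry)) {
                Files.createDirectories(target);
            } else {
                Files.copy(entry, target, StandardCopyOption.REPLACE_EXISTING);
            }
        }
        // Reverse order puts children before their parent directories.
        entries.sort(Comparator.reverseOrder());
        for (Path entry : entries) {
            Files.delete(entry);
        }
    }

    public static void main(String[] args) throws IOException {
        // Both paths are supplied by the caller; nothing is hard-coded here.
        archiveAndDelete(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

The copy runs to completion before anything is deleted, so a failure while saving the results leaves the original folder untouched.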


Pentaho Logging specify Job or Trans for each line

I am running Pentaho Kettle 6.1 through a Java application. All of the Pentaho logs are directed through the Java app and written to the same log file at the Java level.
When a job starts or finishes the logs indicate which job is starting or finishing, but when the job is in the middle of running the log output only indicates the specific step it is on without any indication of which job or trans is executing.
This causes confusion and is difficult to follow when there is more than one job running simultaneously. Does anyone know of a way to prepend the name of the job or trans to each log entry?
Not that I know of, and I doubt there is, for the simple reason that the same transformation/job may be split to run on more than one machine, by more than one user, and/or launched in parallel in different job hierarchies of callers.
The general answer is to log to a database (right-click anywhere, Parameters, Logging, then define the logging table and what you want to log). All the logging will be copied to a database table together with a channel_id. This is a unique number that is attributed to each "run" and links together all the logging information that comes from all the dependent jobs/transformations. You can then view this info with a SELECT...WHERE channel_id=...
However, your case seems to be simpler. Use the database logging with a log interval of, say, 2 seconds and run SELECT TRANSNAME/JOBNAME, LOG_FIELD FROM LOG_TABLE continuously in your terminal.
You can also follow a specific job/transformation by logging in a specific table, but this means you know in advance which is the job/transformation to debug.

When I start My SAP MMC EC6 server one service is not getting to wait mode

Can someone help me get the service selected in the image into wait mode after starting the server?
Please let me know if developer trace is required to be posted for resolving this issue.
That particular process is a BATCH process, a process that runs scheduled background tasks (maintained by transaction SM36/SM37). If the process is busy right after starting the server, that means there were scheduled tasks with status released waiting for execution, and as soon as the server was up, it started those tasks.
If you want to make sure the system doesn't immediately start released background tasks, you'll have to set the status back to scheduled (which, thanks to a bit of weird translation, means they won't be executed because they are not released).
If you want to start the server without having a chance to first change the job status in SM37, you would either have to reset the status at the database level (likely not officially supported by SAP) or first start the server without any BATCH processes (which would give you a number of great big warning messages upon login), change the job status, and then restart the server with the BATCH processes. You can set the number of processes for each type in the profile of your instance (parameter rdisp/wp_no_btc).

Spring Batch restart crashed jobs

Hi Spring Batch users,
regarding the documentation at http://docs.spring.io/spring-batch/reference/htmlsingle/#d5e1320
"If the process died ("kill -9" or server failure) the job is, of course, not running, but the JobRepository has no way of knowing because no-one told it before the process died."
I am trying to find and restart the stale job executions using:
Set<JobExecution> jobExecutions = jobExplorer.findRunningJobExecutions(jobName);
...
jobExecution.setStatus(FAILED);
jobExecution.setEndTime(new Date());
jobRepository.update(jobExecution);
jobOperator.restart(jobExecution.getId());
But this seems to be very inconvenient.
1) I have to do this before other (new) jobs could be started.
2) I have to handle multiple instances of running servers so findRunningJobExecutions will not do the trick.
You can find other questions regarding this topic:
https://jira.spring.io/browse/BATCH-2433?jql=project%20%3D%20BATCH%20AND%20status%20%3D%20Open%20ORDER%20BY%20priority%20DESC
Spring Batch after JVM crash
I would love to see a solution to register a "start up clean jobs listener". This would still not fix the problems caused by the multi-server environment, because Spring Batch does not know whether a JobExecution marked as STARTED is running on another instance.
Thanks for any advice
Alex
Your job cannot and should not recover "automatically" from a kill -9 scenario. A kill -9 is treated very differently than your application throwing a caught Exception. The reason for this is that you've effectively pulled the carpet out from under the application without giving it a chance to reach a synchronization point with the database to commit any necessary information to the ExecutionContext or update the job/step status(es). Therefore, the last status touchpoint with the database will remain and the job will still look STARTED.
"OK, fine" you say, "but if I start another execution, I want it to find that STARTED execution, and pick up where it left off." The problem here is that there is no clean way for the application to distinguish a job that is ACTUALLY RUNNING from one that has failed but couldn't update the database. The framework here correctly errs on the side of caution and prevents you from starting a job that already appears to be running, and this is a GOOD thing.
Why? Let's assume your job was actually still running and you restarted it by accident. As coded, the framework will start to spin up, see your running execution, and fail with the following message: "A job execution for this job is already running". I can't tell you how many times we've been saved by this because someone accidentally launched a job twice!
If you were to implement the listener you suggest, the 2nd execution would instead be allowed to start and you'd have 2 different JVMs repeating the same work, possibly writing to the same files/tables and causing a huge data mess that could be impossible to clean up.
Trust me, in the event the Linux terminal kills your job or your job dies because the connection to the database has been severed, you WANT human eyes on those execution states before you attempt a restart.
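
To make the reasoning concrete, here is a minimal plain-Java model of that guard. It is not Spring Batch code: the Repository and Execution classes are hypothetical stand-ins, and only the error message is taken from the answer above. The point it illustrates is that the repository can only see the last persisted status, so an execution that was killed and one that is still running look identical to it.

```java
import java.util.ArrayList;
import java.util.List;

public class LaunchGuardSketch {

    enum Status { STARTED, COMPLETED, FAILED }

    /** Hypothetical stand-in for a persisted job execution record. */
    static final class Execution {
        final String jobName;
        Status status;

        Execution(String jobName, Status status) {
            this.jobName = jobName;
            this.status = status;
        }
    }

    /** Hypothetical in-memory stand-in for the job repository. */
    static final class Repository {
        private final List<Execution> executions = new ArrayList<>();

        /** Refuses to launch while any execution of the job is recorded as STARTED. */
        Execution launch(String jobName) {
            for (Execution e : executions) {
                if (e.jobName.equals(jobName) && e.status == Status.STARTED) {
                    // A killed process never updated its status, so from here
                    // it cannot be told apart from a live one.
                    throw new IllegalStateException(
                            "A job execution for this job is already running");
                }
            }
            Execution created = new Execution(jobName, Status.STARTED);
            executions.add(created);
            return created;
        }
    }
}
```

In this model a second launch succeeds only after someone has set the stale record to FAILED, which corresponds to the manual review step recommended above.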
Finally, on the off chance you actually wanted to kill your job, you can leverage several other standard patterns for stopping jobs:
Stop via throw Exception
Stop via JobOperator.stop()

SQL Replication Error (On Server Agent)

I have created a new replication. This is the issue I am facing:
When I open 'View Snapshot Agent Status' and start the agent, the first line shows "Starting Agent" and it just keeps working without ever finishing.
..
After sometime it show the following message:
"The replication agent has not logged a progress message in 10 minutes. This might indicate an unresponsive agent or high system activity. Verify that records are being replicated to the destination and that connections to the Subscriber, Publisher, and Distributor are still active."
I tried a solution that I found: I increased the value of the distributor's #HeartBeat_interval property from 10 to 30, but with no success.
I am using SQL Server 2008 R2.
Any help would be really appreciated.
Maybe this will help someone else:
I made the following changes and my replication is now working perfectly.
1 - The job username and job password must belong to an account with full Windows access and permissions.
2 - You must be logged in as the user that you will use in the replication script to create the replication.
That's all.
Thanks!!
I saw the same behavior.
Some of my articles are huge. After the replica's sync was over, the agent hung with the same message as yours.
After ~20 minutes it began running as expected.
I thought this was not normal behavior, but after I created my second subscription, the error appeared again. It was gone after approximately 20 minutes.
I believe the agent encounters a high load of data (if that is the case here) and hangs for a while.
Hope it helps.

SQL Server Agent Job is not running

I have a job that is supposed to run every day at 11 AM and 8 PM. About two weeks ago, it stopped respecting the schedule. The "fix" that I found was to start the job manually; the job would then respect the schedule again for a while, but eventually the issue reappears.
The big problem is that there are no error messages whatsoever. If the job fails, I am supposed to get a notification email, which I do not. In the SQL Server Agent logs and the job history, there are no errors. In the job history, I can clearly see that the job skipped the schedule, since there are no entries. It looks like it did not even start, as if the scheduled time had never arrived.
The schedule is set to run every day and there are no limits on how long it is supposed to run. The SQL Server Agent is set to restart automatically if it stops unexpectedly.
Has anyone had this problem before?
Check the user that is used to run the job. Maybe the user's password has expired or the user itself is no longer active.