Hangfire executes job twice - asp.net-core

I am using Hangfire.AspNetCore 1.7.17 and Hangfire.MySqlStorage 2.0.3 for software that is currently in production.
Now and then, we get a report of jobs being executed twice, despite the usage of the [DisableConcurrentExecution] attribute with a timeout of 30 seconds.
It seems that as soon as those 30 seconds have passed, another worker picks up that same job again.
The code is fairly straightforward:
public async Task ProcessPicking(HttpRequest incomingRequest)
{
var filePath = await StoreStreamAsync(incomingRequest, TriggerTypes.Picking);
var picking = await XmlHelper.DeserializeFileAsync<Picking>(filePath);
// delay by 20 minutes so the outbound-out message gets a chance to be sent first
BackgroundJob.Schedule(() => StartPicking(picking), TimeSpan.FromMinutes(20));
}
[TriggerAlarming("[IMPORTANT] Failed to parse picking message to **** object.")]
[DisableConcurrentExecution(30)]
public void StartPicking(Picking picking)
{
var orderlinePickModels = picking.ToSalesOrderlinePickQuantityRequests().ToList();
var orderlineStatusModels = orderlinePickModels.ToSalesOrderlineStatusRequests().ToList();
var isParsed = DateTime.TryParse(picking.Order.UnloadingDate, out var unloadingDate);
for (var i = 0; i < orderlinePickModels.Count; i++)
{
// prevents bugs with usage of i in the background jobs
var index = i;
var id = BackgroundJob.Enqueue(() => SendSalesOrderlinePickQuantityRequest(orderlinePickModels[index], picking.EdiReference));
BackgroundJob.ContinueJobWith(id, () => SendSalesOrderlineStatusRequest(
orderlineStatusModels.First(x => x.SalesOrderlineId == orderlinePickModels[index].OrderlineId),
picking.EdiReference, picking.Order.PrimaryReference, isParsed ? unloadingDate : DateTime.MinValue));
}
}
[TriggerAlarming("[IMPORTANT] Failed to send order line pick quantity request to ****.")]
[AutomaticRetry(Attempts = 2)]
[DisableConcurrentExecution(30)]
public void SendSalesOrderlinePickQuantityRequest(SalesOrderlinePickQuantityRequest request, string ediReference)
{
var audit = new AuditPostModel
{
Description = $"Finished job to send order line pick quantity request for item {request.Itemcode}, part of ediReference {ediReference}.",
Object = request,
Type = AuditTypes.SalesOrderlinePickQuantity
};
try
{
_logger.LogInformation($"Started job to send order line pick quantity request for item {request.Itemcode}.");
var response = _service.SendSalesOrderLinePickQuantity(request).GetAwaiter().GetResult();
audit.StatusCode = (int)response.StatusCode;
if (!response.IsSuccessStatusCode) throw new TriggerRequestFailedException();
audit.IsSuccessful = true;
_logger.LogInformation("Successfully posted sales order line pick quantity request to ***** endpoint.");
}
finally
{
Audit(audit);
}
}
It schedules the main task (StartPicking) that creates the objects required for the two subtasks:
Send picking details to customer
Send status update to customer
The first job is duplicated. Perhaps the second job as well, but that is not important enough to care about, as it only concerns a status update. The first job, however, causes the customer to think that more items have been picked than actually were.
I would assume that Hangfire updates the state of a job to e.g. "in progress" and checks this state before starting a job. Is my timeout on the disabled concurrent execution too low? Is it possible in this scenario that the database call to update the state takes about 30 seconds (to be fair, it is running on a slow server with ~8 GB RAM, 6 vCores), so that a second worker picks up the job again?
Or is this a Hangfire-specific issue that must be tackled?
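For reference, a minimal sketch of the experiment the question hints at: raising the attribute's timeout (its argument is in seconds, going by the 30-second value above) so the distributed lock outlives the slowest expected run. Whether the timeout is actually the cause is exactly what is being asked here.
[TriggerAlarming("[IMPORTANT] Failed to parse picking message to **** object.")]
[DisableConcurrentExecution(10 * 60)] // lock timeout of 10 minutes instead of 30 seconds
public void StartPicking(Picking picking)
{
// body unchanged
}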

Related

.getcurrentrow (DCIteratorBinding) is returning null after an hour

I am new to ADF (EJB/JPA, not Business Components). When a user is working in our new app, developed in JDeveloper 12.2.1.2.0, the system loses the current record after an hour of activity. Note that the object that is lost is the parent object.
I tried to change the session-timeout (knowing that it will affect the inactivity time).
public List<SelectItem> getSProvMasterSelectItemList(){
List<SelectItem> sProvMasterSelectItemList = new ArrayList<SelectItem>();
DCIteratorBinding lBinding = ADFUtils.findIterator("pByIdIterator");/*After 1 hour I am able to get lBinding is not null*/
Row pRow = lBinding.getCurrentRow();/*But lBinding.getCurrentRow() is null*/
DCDataRow objRow = (DCDataRow) pRow;
Prov prov = (Prov) objRow.getDataProvider();
if (!StringUtils.isEmpty(prov)){
String code = prov.getCode();
if (StringUtils.isEmpty(code)){
return sProvMasterSelectItemList;
}else{
List<Lov> mProvList = getSessionEJBBean().getProvFindMasterProv(code);
sProvMasterSelectItemList.add(new SelectItem(null," "));
for (Lov pMaster:mProvList) {
sProvMasterSelectItemList.add(new SelectItem(pMaster.getId(),pMaster.getDescription()));
}
}
}
return sProvMasterSelectItemList ;
}
I expect to be able to read the current record at any time, especially since it is the master block and one record is available.
This looks like a classic case of a misconfigured application module.
Cause: your application module is timing out and releasing its transaction before the official adfc-config timeout value.
To fix:
Go to the application module containing this VO > Configuration > Edit the default > modify the Idle Instance Timeout to be the same as your ADF session timeout (take time to validate the other configuration settings as well).

EPiServer 9 - Add block to new page programmatically

I have found some suggestions on how to add a block to a page, but can't get it to work the way I want, so perhaps someone can help out.
What I want to do is have a scheduled job that reads through a file, creating new pages of a certain page type and adding some blocks to a content property of each new page. The block's fields will be updated with data from the file that is read.
I have the following code in the scheduled job, but it fails at
repo.Save((IContent) newBlock, SaveAction.Publish);
giving the error
The page name must contain at least one visible character.
This is my code:
public override string Execute()
{
//Call OnStatusChanged to periodically notify progress of job for manually started jobs
OnStatusChanged(String.Format("Starting execution of {0}", this.GetType()));
//Create Person page
PageReference parent = PageReference.StartPage;
//IContentRepository contentRepository = EPiServer.ServiceLocation.ServiceLocator.Current.GetInstance<IContentRepository>();
//IContentTypeRepository contentTypeRepository = EPiServer.ServiceLocation.ServiceLocator.Current.GetInstance<IContentTypeRepository>();
//var repository = EPiServer.ServiceLocation.ServiceLocator.Current.GetInstance<IContentRepository>();
//var slaegtPage = repository.GetDefault<SlaegtPage>(ContentReference.StartPage);
IContentRepository contentRepository = EPiServer.ServiceLocation.ServiceLocator.Current.GetInstance<IContentRepository>();
IContentTypeRepository contentTypeRepository = EPiServer.ServiceLocation.ServiceLocator.Current.GetInstance<IContentTypeRepository>();
SlaegtPage slaegtPage = contentRepository.GetDefault<SlaegtPage>(parent, contentTypeRepository.Load("SlaegtPage").ID);
if (slaegtPage.MainContentArea == null) {
slaegtPage.MainContentArea = new ContentArea();
}
slaegtPage.PageName = "001 Kim";
//Create block
var repo = ServiceLocator.Current.GetInstance<IContentRepository>();
var newBlock = repo.GetDefault<SlaegtPersonBlock1>(ContentReference.GlobalBlockFolder);
newBlock.PersonId = "001";
newBlock.PersonName = "Kim";
newBlock.PersonBirthdate = "01 jan 1901";
repo.Save((IContent) newBlock, SaveAction.Publish);
//Add block
slaegtPage.MainContentArea.Items.Add(new ContentAreaItem
{
ContentLink = ((IContent) newBlock).ContentLink
});
slaegtPage.URLSegment = UrlSegment.CreateUrlSegment(slaegtPage);
contentRepository.Save(slaegtPage, EPiServer.DataAccess.SaveAction.Publish);
_stopSignaled = true;
//For long running jobs periodically check if stop is signaled and if so stop execution
if (_stopSignaled) {
return "Stop of job was called";
}
return "Change to message that describes outcome of execution";
}
You can set the name with:
((IContent) newBlock).Name = "MyName";
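In the job above, that means giving the block a name before the first Save call; a sketch reusing the question's variables:
var newBlock = repo.GetDefault<SlaegtPersonBlock1>(ContentReference.GlobalBlockFolder);
newBlock.PersonId = "001";
newBlock.PersonName = "Kim";
newBlock.PersonBirthdate = "01 jan 1901";
// Blocks are saved as IContent, and IContent.Name is the "page name" the error message complains about.
((IContent) newBlock).Name = "001 Kim";
repo.Save((IContent) newBlock, SaveAction.Publish);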

Why are batch jobs working on one server but not on the other?

I have the same code on two servers.
I have a job that adds a batch job to the queue:
static void Job_ScheduleBatch2(Args _args)
{
BatchHeader batHeader;
BatchInfo batInfo;
RunBaseBatch rbbTask;
str sParmCaption = "My Demonstration (b351) 2014-22-09T04:09";
SysRecurrenceData sysRecurrenceData = SysRecurrence::defaultRecurrence();
;
sysRecurrenceData = SysRecurrence::setRecurrenceStartDateTime(sysRecurrenceData, DateTimeUtil::utcNow());
sysRecurrenceData = SysRecurrence::setRecurrenceUnit(sysRecurrenceData, SysRecurrenceUnit::Minute,1);
rbbTask = new Batch4DemoClass();
batInfo = rbbTask.batchInfo();
batInfo.parmCaption(sParmCaption);
batInfo.parmGroupId(""); // The "Empty batch group".
batHeader = BatchHeader::construct();
batHeader.addTask(rbbTask);
batHeader.parmRecurrenceData(sysRecurrenceData);
//batHeader.parmAlerts(NoYes::Yes, NoYes::Yes, NoYes::Yes, NoYes::Yes, NoYes::Yes);
batHeader.addUserAlerts(curUserId(), NoYes::No, NoYes::No, NoYes::No, NoYes::Yes, NoYes::No);
batHeader.save();
info(strFmt("'%1' batch has been scheduled.", sParmCaption));
}
and I have a batch job
class Batch4DemoClass extends RunBaseBatch
{
int methodVariable1;
int metodVariable2;
#define.CurrentVersion(1)
#localmacro.CurrentList
methodVariable1,
metodVariable2
endmacro
}
public container pack()
{
return [#CurrentVersion,#CurrentList];
}
public void run()
{
// The purpose of your job.
info(strFmt("epeating batch job Hello from Batch4DemoClass .run at %1"
,DateTimeUtil ::toStr(
DateTimeUtil ::utcNow())
));
}
public boolean unpack(container _packedClass)
{
Version version = RunBase::getVersion(_packedClass);
switch (version)
{
case #CurrentVersion:
[version,#CurrentList] = _packedClass;
break;
default:
return false;
}
return true;
}
On one server it runs, and in the batch history I can see it saving messages to the batch log; on the other server it does nothing. One server is R2 (running), the other R3 (not running).
The code is compiled to CIL on both. Both servers are batch servers. I can see the batch jobs in USMF/System administration/Area page inquiries. The only difference I can find is that jobs on R2 have the AOS instance name filled in on the General tab, while on R3 it is empty. On R2 I can see the info messages in the log and the batch history, but there is nothing on R3.
Any idea how to get the batch jobs running?
See how to Configure an AOS instance as a batch server.
There are 3 preconditions:
Marked as a batch server
Time interval set correctly
Batch group set correctly
Update:
Deleting the user of a batch job may make the batch job never complete. Once the execute queue has filled, no further progress will happen.
Deleting the offending batch jobs is problematic, as only the owner can do so! In that case, consider changing BatchJob.aosValidateDelete.
Yes, it is worth a try to clear the tables. Also, if you could provide a screenshot of the already running batch jobs and their status, that might help further.

Longer than expected upload times for big query insert

I am uploading data to BigQuery in CSV format with JSON schemas. What I am seeing are very long load times into BigQuery. I take the start and end load times from pollJob.getStatistics() when the load is DONE and compute a delta time as (endTime - startTime)/1000. Then I look at the number of bytes loaded. The data comes from files stored in Google Cloud Storage that I reprocess in App Engine to do some reformatting. I convert the string into a byte stream and then load it as the contents of the load job as follows:
public static void uploadFileToBigQuerry(TableSchema tableSchema,String tableData,String datasetId,String tableId,boolean formatIsJson,int waitSeconds,String[] fileIdElements) {
/* Init diagnostic */
String projectId = getProjectId();
if (ReadAndroidRawFile.testMode) {
String s = String.format("My project ID at start of upload to BQ:%s datasetID:%s tableID:%s json:%b \nschema:%s tableData:\n%s\n",
projectId,datasetId,tableId,formatIsJson,tableSchema.toString(),tableData);
log.info(s);
}
else {
String s = String.format("Upload to BQ tableID:%s tableFirst60Char:%s\n",
tableId,tableData.substring(0,60));
log.info(s);
}
/* Setup the data each time */
Dataset dataset = new Dataset();
DatasetReference datasetRef = new DatasetReference();
datasetRef.setProjectId(projectId);
datasetRef.setDatasetId(datasetId);
dataset.setDatasetReference(datasetRef);
try {
bigquery.datasets().insert(projectId, dataset).execute();
} catch (IOException e) {
if (ReadAndroidRawFile.testMode) {
String se = String.format("Exception creating datasetId:%s",e);
log.info(se);
}
}
/* Set destination table */
TableReference destinationTable = new TableReference();
destinationTable.setProjectId(projectId);
destinationTable.setDatasetId(datasetId);
destinationTable.setTableId(tableId);
/* Common setup line */
JobConfigurationLoad jobLoad = new JobConfigurationLoad();
/* Handle input format */
if (formatIsJson) {
jobLoad.setSchema(tableSchema);
jobLoad.setSourceFormat("NEWLINE_DELIMITED_JSON");
jobLoad.setDestinationTable(destinationTable);
jobLoad.setCreateDisposition("CREATE_IF_NEEDED");
jobLoad.setWriteDisposition("WRITE_APPEND");
jobLoad.set("Content-Type", "application/octet-stream");
}
else {
jobLoad.setSchema(tableSchema);
jobLoad.setSourceFormat("CSV");
jobLoad.setDestinationTable(destinationTable);
jobLoad.setCreateDisposition("CREATE_IF_NEEDED");
jobLoad.setWriteDisposition("WRITE_APPEND");
jobLoad.set("Content-Type", "application/octet-stream");
}
/* Setup the job config */
JobConfiguration jobConfig = new JobConfiguration();
jobConfig.setLoad(jobLoad);
JobReference jobRef = new JobReference();
jobRef.setProjectId(projectId);
Job outputJob = new Job();
outputJob.setConfiguration(jobConfig);
outputJob.setJobReference(jobRef);
/* Convert input string into byte stream */
ByteArrayContent contents = new ByteArrayContent("application/octet-stream",tableData.getBytes());
int timesToSleep = 0;
try {
Job job = bigquery.jobs().insert(projectId,outputJob,contents).execute();
if (job == null) {
log.info("Job is null...");
throw new Exception("Job is null");
}
String jobIdNew = job.getId();
//log.info("Job is NOT null...id:");
//s = String.format("job ID:%s jobRefId:%s",jobIdNew,job.getJobReference());
//log.info(s);
while (true) {
try{
Job pollJob = bigquery.jobs().get(jobRef.getProjectId(), job.getJobReference().getJobId()).execute();
String status = pollJob.getStatus().getState();
String errors = "";
String workingDataString = "";
if ((timesToSleep % 10) == 0) {
String statusString = String.format("Job status (%dsec) JobId:%s status:%s\n", timesToSleep, job.getJobReference().getJobId(), status);
log.info(statusString);
}
if (pollJob.getStatus().getState().equals("DONE")) {
status = String.format("Job done, processed %s bytes\n", pollJob.getStatistics().toString()); // getTotalBytesProcessed());
log.info(status); // compute load stats with this string
if ((pollJob.getStatus().getErrors() != null)) {
errors = pollJob.getStatus().getErrors().toString();
log.info(errors);
}
The performance I get is as follows: the median upload rate, bytes/deltaTime, is 17 bytes/sec! Yes, bytes, not kilo or mega...
Worse, sometimes just a few hundred bytes, a single row, takes up to 5 minutes. I generally get no errors, but I am thinking that with this performance I will not be able to upload each app's data before more data arrives. I am processing with a task queue in a backends instance. This task queue times out after about an hour of processing.
Is this poor performance because of the contents method?
A couple of things:
If you are loading a small amount of data, you may be better off using TableData.insertAll() rather than a load job, which lets you post the data and have it be available immediately (see the sketch after this list).
Load jobs are Batch oriented jobs. That is, you can insert (more or less) as many as you'd like and they'll be processed when there are resources to do so. Sometimes you create a job and the worker pool is resizing so you have to wait. Sometimes the worker pool is full.
If you provide a project & Job ID we can look into the performance of individual jobs to see what's taking so long.
Load jobs process in parallel; that is, once they start executing they should go very quickly, but the time to start executing may take a long time.
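For the small rows described here, a minimal sketch of that streaming alternative, reusing the question's bigquery client and log; the model classes (TableDataInsertAllRequest, TableDataInsertAllResponse, from com.google.api.services.bigquery.model) are assumed from the v2 Java API, and the column name is made up:
// assumes java.util imports: Map, HashMap, Collections
Map<String, Object> rowContent = new HashMap<String, Object>();
rowContent.put("itemcode", "ABC-123"); // hypothetical column; keys must match the table schema
TableDataInsertAllRequest.Rows row = new TableDataInsertAllRequest.Rows().setJson(rowContent);
TableDataInsertAllRequest insertRequest = new TableDataInsertAllRequest()
    .setRows(Collections.singletonList(row));
TableDataInsertAllResponse insertResponse = bigquery.tabledata()
    .insertAll(projectId, datasetId, tableId, insertRequest)
    .execute();
if (insertResponse.getInsertErrors() != null) {
    log.info("insertAll errors: " + insertResponse.getInsertErrors());
}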
There are three time fields in the job statistics: createTime, startTime, and endTime.
createTime is the moment the BigQuery server receives your request.
startTime is when BigQuery actually starts working on your job.
endTime is when the job is completely done.
I'd expect that most of the time is being spent between create and start. If that is not the case for small jobs, then it means that something strange is going on, and a Job ID would help diagnose the issue.
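To see where the time goes for a given job, here is a sketch built on the question's polling code; the getter names (getCreationTime/getStartTime/getEndTime, all epoch milliseconds) are assumed from the v2 JobStatistics model:
Job pollJob = bigquery.jobs().get(jobRef.getProjectId(), job.getJobReference().getJobId()).execute();
JobStatistics stats = pollJob.getStatistics();
long queuedMs = stats.getStartTime() - stats.getCreationTime(); // create -> start: waiting for resources
long runningMs = stats.getEndTime() - stats.getStartTime(); // start -> end: the actual load work
log.info(String.format("queued %d ms, executing %d ms", queuedMs, runningMs));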

play-framework: 2.1.2-java: thread pool configuration and AsyncResult

We have a typical web-service which serves JSON data read from a remote database. I was trying out returning Result and AsyncResult, each with the following configuration:
play {
akka {
event-handlers = ["akka.event.slf4j.Slf4jEventHandler"]
loglevel = WARNING
actor {
default-dispatcher = {
fork-join-executor {
parallelism-factor = 1.0
parallelism-max = 1
}
}
}
}
}
and one with
parallelism-factor = 1.0
parallelism-max = 5
Following are observations where the time taken to complete 500 requests is given (average of 5 readings):
1. parallelism-max=1 and parallelism-factor=1.0
Result :
Completion time = 291662 ms.
AsyncResult:
Completion time = 55601 ms
2. parallelism-max=5 and parallelism-factor=1.0
Result :
Completion time = 46419 ms.
AsyncResult:
Completion time = 46977 ms
We can see that with parallelism-max=1, AsyncResult clearly takes much less time than Result. However, with parallelism-max=5, Result and AsyncResult give very similar timings.
Shouldn't the time required decrease as the number of threads increases for AsyncResult as well?
I would appreciate help understanding the reasons behind this observation.
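For context, a sketch of how that dispatcher is sized, assuming Akka's default fork-join-executor behaviour: the pool size is roughly min(parallelism-max, max(parallelism-min, ceil(availableProcessors * parallelism-factor))), so with the settings above it effectively equals parallelism-max.
play {
  akka {
    actor {
      default-dispatcher = {
        fork-join-executor {
          # factor 1.0 would scale the pool with the number of cores,
          # but parallelism-max caps it: max = 1 leaves a single thread
          # for all requests and blocking database calls, max = 5 allows five.
          parallelism-factor = 1.0
          parallelism-max = 5
        }
      }
    }
  }
}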