We have a typical web service that serves JSON data read from a remote database. I was trying out returning Result and AsyncResult, each with the following configuration:
play {
  akka {
    event-handlers = ["akka.event.slf4j.Slf4jEventHandler"]
    loglevel = WARNING
    actor {
      default-dispatcher = {
        fork-join-executor {
          parallelism-factor = 1.0
          parallelism-max = 1
        }
      }
    }
  }
}
and one with
parallelism-factor = 1.0
parallelism-max = 5
The following observations show the time taken to complete 500 requests (average of 5 readings):
1. parallelism-max = 1 and parallelism-factor = 1.0
Result:
Completion time = 291662 ms
AsyncResult:
Completion time = 55601 ms
2. parallelism-max = 5 and parallelism-factor = 1.0
Result:
Completion time = 46419 ms
AsyncResult:
Completion time = 46977 ms
We can see that with parallelism-max = 1, AsyncResult clearly takes far less time than Result. However, with parallelism-max = 5, Result and AsyncResult give very similar timings.
Shouldn't the time required also decrease for AsyncResult as the number of threads increases?
I would appreciate help in understanding the reasons behind this observation.
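To make the thread-count intuition concrete, here is a minimal plain-JDK sketch (not Play or Akka itself) of how a fixed-size pool bounds blocking work; the 90 ms sleep is an assumed stand-in for the remote database read, not a figure taken from the measurements above:

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// 500 tasks that each block for ~90 ms (stand-in for the remote DB call).
// With 1 thread they run back to back; with 5 threads roughly five overlap.
public class DispatcherSketch {
    public static void main(String[] args) throws InterruptedException {
        for (int threads : new int[] {1, 5}) {
            ExecutorService pool = Executors.newFixedThreadPool(threads);
            CountDownLatch done = new CountDownLatch(500);
            long start = System.nanoTime();
            for (int i = 0; i < 500; i++) {
                pool.submit(() -> {
                    try {
                        Thread.sleep(90); // simulated blocking call
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                    done.countDown();
                });
            }
            done.await();
            pool.shutdown();
            System.out.printf("threads=%d -> %d ms%n",
                    threads, (System.nanoTime() - start) / 1_000_000);
        }
    }
}

This only illustrates why a single dispatcher thread serializes blocking calls while a handful of threads lets them overlap; it says nothing about where Play schedules the async work.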
I am using Hangfire.AspNetCore 1.7.17 and Hangfire.MySqlStorage 2.0.3 for software that is currently in production.
Now and then, we get a report of jobs being executed twice, despite the usage of the [DisableConcurrentExecution] attribute with a timeout of 30 seconds.
It seems that as soon as those 30 seconds have passed, another worker picks up that same job again.
The code is fairly straightforward:
public async Task ProcessPicking(HttpRequest incomingRequest)
{
var filePath = await StoreStreamAsync(incomingRequest, TriggerTypes.Picking);
var picking = await XmlHelper.DeserializeFileAsync<Picking>(filePath);
// delay by 20 minutes so the outbound-out message gets a chance to be sent first
BackgroundJob.Schedule(() => StartPicking(picking), TimeSpan.FromMinutes(20));
}
[TriggerAlarming("[IMPORTANT] Failed to parse picking message to **** object.")]
[DisableConcurrentExecution(30)]
public void StartPicking(Picking picking)
{
var orderlinePickModels = picking.ToSalesOrderlinePickQuantityRequests().ToList();
var orderlineStatusModels = orderlinePickModels.ToSalesOrderlineStatusRequests().ToList();
var isParsed = DateTime.TryParse(picking.Order.UnloadingDate, out var unloadingDate);
for (var i = 0; i < orderlinePickModels.Count; i++)
{
// capture the loop variable so the background job closures don't all share i
var index = i;
var id = BackgroundJob.Enqueue(() => SendSalesOrderlinePickQuantityRequest(orderlinePickModels[index], picking.EdiReference));
BackgroundJob.ContinueJobWith(id, () => SendSalesOrderlineStatusRequest(
orderlineStatusModels.First(x => x.SalesOrderlineId == orderlinePickModels[index].OrderlineId),
picking.EdiReference, picking.Order.PrimaryReference, isParsed ? unloadingDate : DateTime.MinValue));
}
}
[TriggerAlarming("[IMPORTANT] Failed to send order line pick quantity request to ****.")]
[AutomaticRetry(Attempts = 2)]
[DisableConcurrentExecution(30)]
public void SendSalesOrderlinePickQuantityRequest(SalesOrderlinePickQuantityRequest request, string ediReference)
{
var audit = new AuditPostModel
{
Description = $"Finished job to send order line pick quantity request for item {request.Itemcode}, part of ediReference {ediReference}.",
Object = request,
Type = AuditTypes.SalesOrderlinePickQuantity
};
try
{
_logger.LogInformation($"Started job to send order line pick quantity request for item {request.Itemcode}.");
var response = _service.SendSalesOrderLinePickQuantity(request).GetAwaiter().GetResult();
audit.StatusCode = (int)response.StatusCode;
if (!response.IsSuccessStatusCode) throw new TriggerRequestFailedException();
audit.IsSuccessful = true;
_logger.LogInformation("Successfully posted sales order line pick quantity request to ***** endpoint.");
}
finally
{
Audit(audit);
}
}
It schedules the main task (StartPicking), which creates the objects required for the two subtasks:
Send picking details to the customer
Send a status update to the customer
The first job is the one being duplicated. Perhaps the second is as well, but that is not important enough to care about, since it only concerns a status update. The duplicated first job, however, causes the customer to think that more items have been picked than in reality.
I would assume that Hangfire updates the state of a job to, e.g., "in progress", and checks this state before starting a job. Is my timeout on DisableConcurrentExecution too low? Is it possible in this scenario that the database connection used to update the state takes about 30 seconds (to be fair, it is running on a slow server with ~8 GB RAM and 6 vCores), so that the second worker is already picking up the job again?
Or is this a Hangfire-specific issue that must be tackled?
I am new to ADF (EJB/JPA, not Business Components). When users work with our new app, developed on JDeveloper 12.2.1.2.0, the system loses the current record after about an hour of activity. Note that the object lost is the parent object.
I tried changing the session-timeout (knowing that it will affect the inactivity time).
public List<SelectItem> getSProvMasterSelectItemList(){
List<SelectItem> sProvMasterSelectItemList = new ArrayList<SelectItem>();
DCIteratorBinding lBinding = ADFUtils.findIterator("pByIdIterator"); /* after 1 hour, lBinding is still not null */
Row pRow = lBinding.getCurrentRow(); /* but getCurrentRow() returns null */
DCDataRow objRow = (DCDataRow) pRow;
Prov prov = (Prov) objRow.getDataProvider();
if (!StringUtils.isEmpty(prov)){
String code = prov.getCode();
if (StringUtils.isEmpty(code)){
return sProvMasterSelectItemList;
}else{
List<Lov> mProvList = getSessionEJBBean().getProvFindMasterProv(code);
sProvMasterSelectItemList.add(new SelectItem(null," "));
for (Lov pMaster:mProvList) {
sProvMasterSelectItemList.add(new SelectItem(pMaster.getId(),pMaster.getDescription()));
}
}
}
return sProvMasterSelectItemList ;
}
I expect to be able to read the current record at any time, especially since it is the master block and one record is available.
This looks like a classic case of a misconfigured Application Module.
Cause: your Application Module is timing out and releasing its transaction before the official adfc-config timeout value.
To fix:
Go to the Application Module containing this VO > Configuration > edit the default configuration > set Idle Instance Timeout to the same value as your ADF session timeout (take the time to validate the other configuration settings as well).
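While the Application Module configuration is the real fix, you can also guard the bean so it fails gracefully instead of throwing a NullPointerException when the current row has been lost. This is a minimal sketch reusing the bindings and helper names from the question (ADFUtils, the pByIdIterator iterator); it only hides the symptom:

// Defensive variant of the start of getSProvMasterSelectItemList(): return the
// empty list when the current row is gone instead of dereferencing null.
List<SelectItem> sProvMasterSelectItemList = new ArrayList<SelectItem>();
DCIteratorBinding lBinding = ADFUtils.findIterator("pByIdIterator");
Row pRow = (lBinding != null) ? lBinding.getCurrentRow() : null;
if (pRow == null) {
    // Current row lost (e.g. the AM instance was released after the idle timeout).
    return sProvMasterSelectItemList;
}
Prov prov = (Prov) ((DCDataRow) pRow).getDataProvider();
// ... continue as in the original method ...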
I am running into issues with a streaming scio pipeline on Dataflow that deduplicates messages and performs some counting by key. When I try to drain the pipeline, I get a large number of None.get exceptions, supposedly thrown in my deduplicate step (I am basing this assumption on the label I observe in the Stackdriver log).
We are currently running scio version 0.7.0-beta1 and Beam version 2.8.0. I have tried protecting as much as I can in my code from any potential Nones, but this appears to be occurring further down, inside the deduplicate step.
The error I am getting is the following:
"java.util.NoSuchElementException: None.get
at scala.None$.get(Option.scala:347)
at scala.None$.get(Option.scala:345)
at com.spotify.scio.util.Functions$$anon$2.mergeAccumulators(Functions.scala:227)
at com.spotify.scio.util.Functions$$anon$2.mergeAccumulators(Functions.scala:220)
at org.apache.beam.runners.dataflow.worker.WindmillStateInternals$WindmillCombiningState.getAccum(WindmillStateInternals.java:958)
at org.apache.beam.runners.dataflow.worker.WindmillStateInternals$WindmillCombiningState.read(WindmillStateInternals.java:920)
at org.apache.beam.runners.core.SystemReduceFn.onTrigger(SystemReduceFn.java:125)
at org.apache.beam.runners.core.ReduceFnRunner.onTrigger(ReduceFnRunner.java:1060)
at org.apache.beam.runners.core.ReduceFnRunner.onTimers(ReduceFnRunner.java:768)
at org.apache.beam.runners.dataflow.worker.StreamingGroupAlsoByWindowViaWindowSetFn.processElement(StreamingGroupAlsoByWindowViaWindowSetFn.java:95)
at org.apache.beam.runners.dataflow.worker.StreamingGroupAlsoByWindowViaWindowSetFn.processElement(StreamingGroupAlsoByWindowViaWindowSetFn.java:42)
at org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowFnRunner.invokeProcessElement(GroupAlsoByWindowFnRunner.java:115)
at org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowFnRunner.processElement(GroupAlsoByWindowFnRunner.java:73)
at org.apache.beam.runners.core.LateDataDroppingDoFnRunner.processElement(LateDataDroppingDoFnRunner.java:80)
at org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowsParDoFn.processElement(GroupAlsoByWindowsParDoFn.java:135)
at org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:45)
at org.apache.beam.runners.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:50)
at org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation.runReadLoop(ReadOperation.java:202)
at org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation.start(ReadOperation.java:160)
at org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:77)
at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1226)
at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.access$1000(StreamingDataflowWorker.java:141)
at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker$6.run(StreamingDataflowWorker.java:965)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
As you can see, this never really enters my code, and I am unsure how to go about finding the issue. Perhaps it has something to do with the LateDataDroppingDoFnRunner? Our allowed lateness is relatively large (3 days, with hour-long windows).
val input = PubsubIO.readStrings()
.fromSubscription(subscription)
.withTimestampAttribute("ts")
.withName("Window messages")
.withFixedWindows(
duration = windowSize,
options = WindowOptions(
trigger = AfterWatermark.pastEndOfWindow()
.withEarlyFirings(AfterProcessingTime.pastFirstElementInPane()
.plusDelayOf(earlyFiring))
.withLateFirings(AfterProcessingTime.pastFirstElementInPane()
.plusDelayOf(lateFiring)),
accumulationMode = ACCUMULATING_FIRED_PANES,
allowedLateness = allowedLateness
)
)
.withName(s"Deduplicate messages")
.distinctBy[String](f = getId)
...
// I am being overly cautious here because I have been having
// so much trouble debugging this
def getId(message: Map[String, Any]): String = {
message match {
case null => {
logger.warn("message is null when getting id")
""
}
case message => {
message.get("id") match {
case None => {
logger.warn("id is null in message")
""
}
case id => id.get.toString
}
}
}
}
I am confused about how I could possibly be getting a None.get here, and why that would only occur while draining.
Can I have some advice on how to go about debugging this error, or where I should be looking?
I am uploading data to BigQuery in CSV format with JSON schemas. What I am seeing is very long load times into BigQuery. I take the start and end load times from pollJob.getStatistics() when the load is DONE and compute a delta time as (endTime - startTime)/1000. Then I look at the number of bytes loaded. The data comes from files stored in Google Cloud Storage that I reprocess in App Engine to do some reformatting. I convert the string into a byte stream and then load it as the contents of the job as follows:
public static void uploadFileToBigQuerry(TableSchema tableSchema,String tableData,String datasetId,String tableId,boolean formatIsJson,int waitSeconds,String[] fileIdElements) {
/* Init diagnostic */
String projectId = getProjectId();
if (ReadAndroidRawFile.testMode) {
String s = String.format("My project ID at start of upload to BQ:%s datasetID:%s tableID:%s json:%b \nschema:%s tableData:\n%s\n",
projectId,datasetId,tableId,formatIsJson,tableSchema.toString(),tableData);
log.info(s);
}
else {
String s = String.format("Upload to BQ tableID:%s tableFirst60Char:%s\n",
tableId,tableData.substring(0,60));
log.info(s);
}
/* Setup the data each time */
Dataset dataset = new Dataset();
DatasetReference datasetRef = new DatasetReference();
datasetRef.setProjectId(projectId);
datasetRef.setDatasetId(datasetId);
dataset.setDatasetReference(datasetRef);
try {
bigquery.datasets().insert(projectId, dataset).execute();
} catch (IOException e) {
if (ReadAndroidRawFile.testMode) {
String se = String.format("Exception creating datasetId:%s",e);
log.info(se);
}
}
/* Set destination table */
TableReference destinationTable = new TableReference();
destinationTable.setProjectId(projectId);
destinationTable.setDatasetId(datasetId);
destinationTable.setTableId(tableId);
/* Common setup line */
JobConfigurationLoad jobLoad = new JobConfigurationLoad();
/* Handle input format */
if (formatIsJson) {
jobLoad.setSchema(tableSchema);
jobLoad.setSourceFormat("NEWLINE_DELIMITED_JSON");
jobLoad.setDestinationTable(destinationTable);
jobLoad.setCreateDisposition("CREATE_IF_NEEDED");
jobLoad.setWriteDisposition("WRITE_APPEND");
jobLoad.set("Content-Type", "application/octet-stream");
}
else {
jobLoad.setSchema(tableSchema);
jobLoad.setSourceFormat("CSV");
jobLoad.setDestinationTable(destinationTable);
jobLoad.setCreateDisposition("CREATE_IF_NEEDED");
jobLoad.setWriteDisposition("WRITE_APPEND");
jobLoad.set("Content-Type", "application/octet-stream");
}
/* Setup the job config */
JobConfiguration jobConfig = new JobConfiguration();
jobConfig.setLoad(jobLoad);
JobReference jobRef = new JobReference();
jobRef.setProjectId(projectId);
Job outputJob = new Job();
outputJob.setConfiguration(jobConfig);
outputJob.setJobReference(jobRef);
/* Convert input string into byte stream */
ByteArrayContent contents = new ByteArrayContent("application/octet-stream",tableData.getBytes());
int timesToSleep = 0;
try {
Job job = bigquery.jobs().insert(projectId,outputJob,contents).execute();
if (job == null) {
log.info("Job is null...");
throw new Exception("Job is null");
}
String jobIdNew = job.getId();
//log.info("Job is NOT null...id:");
//s = String.format("job ID:%s jobRefId:%s",jobIdNew,job.getJobReference());
//log.info(s);
while (true) {
try{
Job pollJob = bigquery.jobs().get(jobRef.getProjectId(), job.getJobReference().getJobId()).execute();
String status = pollJob.getStatus().getState();
String errors = "";
String workingDataString = "";
if ((timesToSleep % 10) == 0) {
String statusString = String.format("Job status (%dsec) JobId:%s status:%s\n", timesToSleep, job.getJobReference().getJobId(), status);
log.info(statusString);
}
if (pollJob.getStatus().getState().equals("DONE")) {
status = String.format("Job done, processed %s bytes\n", pollJob.getStatistics().toString()); // getTotalBytesProcessed());
log.info(status); // compute load stats with this string
if ((pollJob.getStatus().getErrors() != null)) {
errors = pollJob.getStatus().getErrors().toString();
log.info(errors);
}
The performance I get is as follows: the median upload rate, BYTES/deltaTime, is 17 bytes/sec! Yes, bytes, not kilo- or megabytes.
Worse, sometimes a load of only a few hundred bytes, just one row, takes up to 5 minutes. I generally get no errors, but I am thinking that with this performance I will not be able to upload each app's data before more data arrives. I am processing with a task queue in a backends instance, and this task queue times out after about an hour of processing.
Is this poor performance because of the contents method?
A couple of things:
If you are loading a small amount of data, you may be better off using TableData.insertAll() rather than a load job; that lets you post the data and have it be available immediately (see the sketch after these points).
Load jobs are batch-oriented jobs. That is, you can insert (more or less) as many as you'd like, and they'll be processed when there are resources to do so. Sometimes you create a job while the worker pool is resizing, so you have to wait. Sometimes the worker pool is full.
If you provide a project ID and a job ID, we can look into the performance of individual jobs to see what's taking so long.
Load jobs process in parallel; that is, once they start executing they should go very quickly, but the time to start executing may take a long time.
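For illustration, here is a minimal streaming-insert sketch written against the same generation of the Java API client the question uses (com.google.api.services.bigquery). The streamRow method name and the assumption that an authorized Bigquery client is already at hand are mine; double-check the model class names against your client version.

import java.io.IOException;
import java.util.Collections;
import java.util.Map;

import com.google.api.services.bigquery.Bigquery;
import com.google.api.services.bigquery.model.TableDataInsertAllRequest;
import com.google.api.services.bigquery.model.TableDataInsertAllResponse;

// Streams a single row into an existing table; rowData is keyed by column name.
public static void streamRow(Bigquery bigquery, String projectId, String datasetId,
                             String tableId, Map<String, Object> rowData) throws IOException {
    TableDataInsertAllRequest.Rows row = new TableDataInsertAllRequest.Rows()
            .setJson(rowData);                        // the row itself
    TableDataInsertAllRequest request = new TableDataInsertAllRequest()
            .setRows(Collections.singletonList(row)); // one row here; batch more in practice
    TableDataInsertAllResponse response = bigquery.tabledata()
            .insertAll(projectId, datasetId, tableId, request)
            .execute();
    if (response.getInsertErrors() != null && !response.getInsertErrors().isEmpty()) {
        // Per-row errors come back in the response rather than as an exception.
        System.err.println("Insert errors: " + response.getInsertErrors());
    }
}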
There are three time fields in the job statistics: createTime, startTime, and endTime.
createTime is the moment the BigQuery server receives your request.
startTime is when BigQuery actually starts working on your job.
endTime is when the job is completely done.
I'd expect that most of the time is being spent between create and start. If that is not the case for small jobs, then something strange is going on, and a job ID would help diagnose the issue.
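Since the question already polls pollJob.getStatistics(), here is a small sketch of splitting the measured delta into queued time versus running time. The getter names are from the v2 API model (epoch milliseconds) and should be verified against your client version:

import com.google.api.services.bigquery.model.Job;
import com.google.api.services.bigquery.model.JobStatistics;

// Splits the total delta into "queued" (create -> start) and "running"
// (start -> end). The fields can be null while the job is still pending,
// so call this only once the job has reached DONE.
static void logJobTiming(Job pollJob) {
    JobStatistics stats = pollJob.getStatistics();
    long created = stats.getCreationTime();
    long started = stats.getStartTime();
    long ended = stats.getEndTime();
    System.out.printf("queued for %d ms, ran for %d ms%n",
            started - created, ended - started);
}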
My sort command is:
"SORT hot_ids by no_keys GET # GET msg:*->msg GET msg:*->count GET msg:*->comments"
It works fine in redis-cli, but it doesn't return data in RedisClient. The result is a byte[][]; its length is correct, but every element of the array is null.
The response from redis is:
...
$-1
$-1
...
The C# code is:
data = redis.Sort("hot_ids ", new SortOptions()
{
GetPattern = "# GET msg:*->msg GET msg:*->count GET msg:*->comments",
Skip = skip,
Take = take,
SortPattern = "not-key"
});
Redis Sort is used in IRedisClient.GetSortedItemsFromList, e.g. from RedisClientListTests.cs:
[Test]
public void Can_AddRangeToList_and_GetSortedItems()
{
Redis.PrependRangeToList(ListId, storeMembers);
var members = Redis.GetSortedItemsFromList(ListId,
new SortOptions { SortAlpha = true, SortDesc = true, Skip = 1, Take = 2 });
AssertAreEqual(members,
storeMembers.OrderByDescending(s => s).Skip(1).Take(2).ToList());
}
You can use the MONITOR command in redis-cli to help diagnose and see what requests the ServiceStack Redis client is sending to redis-server.
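If you want to rule out the command itself, a cross-check from a different client can also help. Here is a sketch using Jedis (Java) rather than the ServiceStack client from the question; the host, port, and the use of Jedis at all are assumptions, while the key and GET patterns are taken from the question:

import java.util.List;

import redis.clients.jedis.Jedis;
import redis.clients.jedis.SortingParams;

// Runs the same SORT ... GET hash->field command outside ServiceStack.
public class SortCrossCheck {
    public static void main(String[] args) throws Exception {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            SortingParams params = new SortingParams()
                    .by("no_keys") // non-matching BY pattern, i.e. no sorting
                    .get("#", "msg:*->msg", "msg:*->count", "msg:*->comments");
            List<String> result = jedis.sort("hot_ids", params);
            // Missing hash fields come back as null entries, matching the $-1
            // replies shown in the question.
            result.forEach(System.out::println);
        }
    }
}

If the fields come back as null here as well, the $-1 replies are coming from redis itself (e.g. a wrong key or pattern) rather than from the ServiceStack client.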