Quartz scheduler for long-running tasks skips jobs - asp.net-core

This is my job. It takes about 3 to 5 minutes to complete each time:
[DisallowConcurrentExecution]
[PersistJobDataAfterExecution]
public class UploadNumberData : IJob
{
    private readonly IServiceProvider serviceProvider;

    public UploadNumberData(IServiceProvider serviceProvider)
    {
        this.serviceProvider = serviceProvider;
    }

    public async Task Execute(IJobExecutionContext context)
    {
        var jobDataMap = context.MergedJobDataMap;
        string flattenedInput = jobDataMap.GetString("FlattenedInput");
        string applicationName = jobDataMap.GetString("ApplicationName");
        var parsedFlattenedInput = JsonSerializer.Deserialize<List<NumberDataUploadViewModel>>(flattenedInput);
        var parsedApplicationName = JsonSerializer.Deserialize<string>(applicationName);

        using (var scope = serviceProvider.CreateScope())
        {
            //Run Process
        }
    }
}
This is the function that calls the job:
try
{
    var flattenedInput = JsonSerializer.Serialize(Input.NumData);
    var triggerKey = Guid.NewGuid().ToString();

    IJobDetail job = JobBuilder.Create<UploadNumberData>()
        .UsingJobData("FlattenedInput", flattenedInput)
        .UsingJobData("ApplicationName", flattenedApplicationName)
        .StoreDurably()
        .WithIdentity("BatchNumberDataJob", $"GP_BatchNumberDataJob")
        .Build();

    await scheduler.AddJob(job, true);

    ITrigger trigger = TriggerBuilder.Create()
        .ForJob(job)
        .WithIdentity(triggerKey, $"GP_BatchNumberDataJob")
        .WithSimpleSchedule(x => x.WithMisfireHandlingInstructionFireNow())
        .StartNow()
        .Build();

    await scheduler.ScheduleJob(trigger);
}
catch (Exception e)
{
    //log
}
Each job consists of 300 rows of data with the total count being about 14000 rows divided into 47 jobs.
This is the configuration:
NameValueCollection quartzProperties = new NameValueCollection
{
    {"quartz.serializer.type", "json"},
    {"quartz.jobStore.type", "Quartz.Impl.AdoJobStore.JobStoreTX, Quartz"},
    {"quartz.jobStore.dataSource", "default"},
    {"quartz.dataSource.default.provider", "MySql"},
    {"quartz.dataSource.default.connectionString", "connectionstring"},
    {"quartz.jobStore.driverDelegateType", "Quartz.Impl.AdoJobStore.MySQLDelegate, Quartz"},
    {"quartz.jobStore.misfireThreshold", "3600000"}
};
The problem now is that when I hit the function/API, only the first and last jobs get inserted into the database. Strangely, the last job also repeats itself multiple times.
I tried changing the Job Identity name to something different, but then I get foreign key errors as my data is inserted into the database.
Example sequence should be:
300,300,300,...,102
However, the sequence ends up being:
300,102,102,102
EDIT:
When I set the thread count to 1 and changed the Job Identity to be dynamic, it worked. However, does this defeat the purpose of DisallowConcurrentExecution?

I reproduced your problem and found how you should rewrite your code to get the expected behaviour, as I understand it.
Make job identity unique
First of all, you are using the same identity for every job you execute. The duplication happens because you reuse that identity together with the 'replace' flag set to 'true' in the AddJob call.
You are on the right track when you decide to generate a dynamic identity for each job; it could be a new GUID or an incrementing int counter per identity. Something like this:
// 'i' variable is a job counter (0, 1, 2 ...)
.WithIdentity($"BatchNumberDataJob-{i}", $"GP_BatchNumberDataJob")
// or
.WithIdentity(Guid.NewGuid().ToString(), $"GP_BatchNumberDataJob")
// Also maybe you want to set 'replace' flag to 'false'
// to enable 'already exists' error if collision occurs.
// You may want handle such cases
await scheduler.AddJob(job, false);
After that you can remove the [DisallowConcurrentExecution] attribute from the job: it is based on the job key, so with such dynamic identities it no longer has any effect.
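To make the overall flow concrete, here is a minimal sketch of how the calling code could schedule each 300-row batch as its own, uniquely identified job. The batching loop and variable names are illustrative additions, not code from the question:
// Hypothetical batching loop; 'scheduler', 'Input.NumData' and the batch size
// come from the question, everything else is illustrative.
const int batchSize = 300;
var batches = Input.NumData
    .Select((row, index) => new { row, index })
    .GroupBy(x => x.index / batchSize, x => x.row);

int i = 0;
foreach (var batch in batches)
{
    IJobDetail job = JobBuilder.Create<UploadNumberData>()
        .UsingJobData("FlattenedInput", JsonSerializer.Serialize(batch.ToList()))
        .UsingJobData("ApplicationName", flattenedApplicationName)
        .StoreDurably()
        .WithIdentity($"BatchNumberDataJob-{i}", "GP_BatchNumberDataJob") // unique per batch
        .Build();

    // 'replace: false' makes Quartz throw if the key already exists instead of silently replacing
    await scheduler.AddJob(job, false);

    ITrigger trigger = TriggerBuilder.Create()
        .ForJob(job)
        .WithIdentity(Guid.NewGuid().ToString(), "GP_BatchNumberDataJob")
        .StartNow()
        .Build();

    await scheduler.ScheduleJob(trigger);
    i++;
}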
Concurrency
Basically, you have a few options for how to execute your jobs; it really depends on what you are trying to achieve.
Parallel execution
This is the fastest way to execute your code. Each job is completely separate from the others.
To do so you should prepare your database for this case (because, as you said, you get foreign key errors when you try to achieve that behaviour).
It is hard to say exactly what you should change in the database to support this, because you say nothing about your schema.
If your jobs need to execute in a particular order, this method is not for you.
Ordered execution
The other way is to use ordered execution. If (for some reason) you are not able to prepare your database to handle parallel job execution, you can use this method. It is much slower than the parallel approach, but the order in which the jobs execute is deterministic.
You can achieve this behaviour in two ways:
use job chaining (see this question), or
set up the maximum concurrency for the scheduler:
var quartzProperties = new NameValueCollection
{
    {"quartz.threadPool.maxConcurrency", "1"},
};
With this setting, jobs will be executed without any parallelism, in exactly the order in which you trigger them (a sketch of wiring these properties into the scheduler follows below).
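For completeness, here is a minimal sketch of how a property collection like this is typically handed to Quartz.NET when the scheduler is built (property values and hosting details omitted):
// Build and start a scheduler from the NameValueCollection shown above.
var factory = new StdSchedulerFactory(quartzProperties);
IScheduler scheduler = await factory.GetScheduler();
await scheduler.Start();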
Summary
It really depends on what you are trying to achieve. If speed is the point, then you should rework your database and your job so that executions are completely independent of each other, no matter in which order they run. If ordering is the point, you should use a non-parallel method of job execution. It is up to you.

Related

How do I execute a new job on success or failure of Hangfire job?

I'm working on a Web API RESTful service that, on a request, needs to perform a task. We're using Hangfire to execute that task as a job, and on failure it will attempt to retry the job up to 10 times.
If the job eventually succeeds I want to run an additional job (to send an event to another service). If the job fails even after all of the retry attempts, I want to run a different additional job (to send a failure event to another service).
However, I can't figure out how to do this. I've created the following JobFilterAttribute:
public class HandleEventsAttribute : JobFilterAttribute, IElectStateFilter
{
    public IBackgroundJobClient BackgroundJobClient { get; set; }

    public void OnStateElection(ElectStateContext context)
    {
        var failedState = context.CandidateState as FailedState;
        if (failedState != null)
        {
            BackgroundJobClient.Enqueue<MyJobClass>(x => x.RunJob());
        }
    }
}
The one problem I'm having is injecting the IBackgroundJobClient into this attribute. I can't pass it as a property to the attribute (I get a "Cannot access non-static field 'backgroundJobClient' in static context" error). We're using Autofac for dependency injection, and I tried figuring out how to use property injection, but I'm at a loss. All of this leads me to believe I may be on the wrong track.
I'd think it would be a fairly common pattern to run some additional cleanup code if a Hangfire job fails. How do most people do this?
Thanks for the help. Let me know if there's any additional details I can provide.
Hangfire can build execution chains. If you want to schedule the next job after the first one succeeds, you need to use ContinueWith(string parentId, Expression<Action> methodCall, JobContinuationOptions options) with JobContinuationOptions.OnlyOnSucceededState so that it runs only after success.
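A rough sketch of that continuation pattern, reusing the MyJobClass from the question (the success handler here is just a placeholder):
// Enqueue the parent job, then attach a continuation that only fires
// once the parent job has succeeded.
var parentId = BackgroundJob.Enqueue<MyJobClass>(x => x.RunJob());

BackgroundJob.ContinueWith(
    parentId,
    () => Console.WriteLine("Parent job succeeded, sending success event..."),
    JobContinuationOptions.OnlyOnSucceededState);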
But you can create a Hangfire extension like a JobExecutor and run tasks inside it to get more possibilities.
Something like this:
public static JobResult<T> Enqueue<T>(Expression<Action> a, string name)
{
    var exprInfo = GetExpressionInfo(a);
    Guid jGuid = Guid.NewGuid();
    var jobId = BackgroundJob.Enqueue(() => JobExecutor.Execute(jGuid, exprInfo.Method.DeclaringType.AssemblyQualifiedName, exprInfo.Method.Name, exprInfo.Parameters, exprInfo.ParameterTypes));

    JobResult<T> result = new JobResult<T>(jobId, name, jGuid, 0, default(T));
    JobRepository.WriteJobState(new JobResult<T>(jobId, name, jGuid, 0, default(T)));
    return result;
}
You can find more detailed information here: https://indexoutofrange.com/Don%27t-do-it-now!-Part-5.-Hangfire-job-continuation,-ContinueWith/
I haven't been able to verify this will work, but BackgroundJobClient has no static methods, so you would need a reference to an instance of it.
When I enqueue tasks, I use the static Hangfire.BackgroundJob.Enqueue, which should work without a reference to a BackgroundJobClient instance. For example:
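Inside the filter from the question, that could look something like this (a sketch, assuming the static API is acceptable in place of the injected client):
public void OnStateElection(ElectStateContext context)
{
    if (context.CandidateState is FailedState)
    {
        // Static API - no injected IBackgroundJobClient required
        BackgroundJob.Enqueue<MyJobClass>(x => x.RunJob());
    }
}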
Steve

RavenDB fails with ConcurrencyException when using new transaction

This code always fails with a ConcurrencyException:
[Test]
public void EventOrderingCode_Fails_WithConcurrencyException()
{
    Guid id = Guid.NewGuid();

    using (var scope1 = new TransactionScope())
    using (var session = DataAccess.NewOpenSession)
    {
        session.Advanced.UseOptimisticConcurrency = true;
        session.Advanced.AllowNonAuthoritativeInformation = false;

        var ent1 = new CTEntity
        {
            Id = id,
            Name = "George"
        };

        using (var scope2 = new TransactionScope(TransactionScopeOption.RequiresNew))
        {
            session.Store(ent1);
            session.SaveChanges();
            scope2.Complete();
        }

        var ent2 = session.Load<CTEntity>(id);
        ent2.Name = "Gina";
        session.SaveChanges();
        scope1.Complete();
    }
}
It fails at the last session.SaveChanges, stating that it is using a non-current ETag. If I use Required instead of RequiresNew for scope2 (i.e. the same transaction), it works.
Now, since I load the entity (ent2), it should be using the newest ETag, unless I am getting some cached value attached to scope1 (but I have disabled caching). So I do not understand why this fails.
I really need this setup. In the production code the outer TransactionScope is created by NServiceBus, and the inner is for controlling an aspect of event ordering. It cannot be the same Transaction.
And I need the optimistic concurrency too, in case other threads use the entity at the same time.
BTW: This is using Raven 2.0.3.0
Since no one else has answered, I had better give it a go myself.
It turns out this was human error. Due to a bad configuration of our IoC container, DataAccess.NewOpenSession gave me the same session every time (even across other tests). In other words, Raven works as expected :)
Before I found this out I also experimented with using TransactionScopeOption.Suppress instead of RequiresNew. That also worked; I just had to make sure that whatever I did in the suppressed scope could not fail, which was a valid option in my case.
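For reference, a minimal sketch of that Suppress variant applied to the test above (only the inner scope changes):
// The inner scope opts out of the ambient transaction entirely,
// so the Store/SaveChanges below are not enlisted in scope1.
using (var scope2 = new TransactionScope(TransactionScopeOption.Suppress))
{
    session.Store(ent1);
    session.SaveChanges();
    scope2.Complete();
}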

RavenDB: The best way to wait for non-stale index first then query stale index if timeout

I'm using RavenDB 2.5. I have a situation where I need to wait for a non-stale index first and, if that times out after 15 seconds, query the stale index rather than throw a timeout exception. Here is my code.
RavenQueryStatistics stats;
var result = queryable.Statistics(out stats).Take(maxPageSize).ToList();

if (stats.IsStale)
{
    try
    {
        return queryable.Customize(x => x.WaitForNonStaleResultsAsOfLastWrite(TimeSpan.FromSeconds(15))).ToList();
    }
    catch (Exception)
    {
        return result;
    }
}
else
{
    return result;
}
I need to add an extension method to make the above code work for all queries, for example:
public static List<T> ToList<T>(this IRavenQueryable<T> queryable)
I may also need to add extension methods to override .All(), .Any(), .Contains(), .Count(), .ToList(), .ToArray(), .ToDictionary(), .First(), .FirstOrDefault(), .Single(), .SingleOrDefault(), .Last(), .LastOrDefault(), etc.
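A minimal sketch of what one such fallback extension could look like; it just wraps the pattern from the code above, and the method name and maxPageSize parameter are illustrative:
public static List<T> ToListAllowStale<T>(this IRavenQueryable<T> queryable, int maxPageSize)
{
    RavenQueryStatistics stats;
    var result = queryable.Statistics(out stats).Take(maxPageSize).ToList();

    if (!stats.IsStale)
        return result;

    try
    {
        // Retry, waiting up to 15 seconds for the index to become non-stale
        return queryable
            .Customize(x => x.WaitForNonStaleResultsAsOfLastWrite(TimeSpan.FromSeconds(15)))
            .Take(maxPageSize)
            .ToList();
    }
    catch (Exception)
    {
        // Timed out waiting; fall back to the stale results we already have
        return result;
    }
}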
I wonder if there is a better solution for this. What's the best practice?
Does RavenDB have an AOP-style hook so that, when the timeout exception is thrown, we can switch the query to the stale index and return stale results?
Depending on your requirements, I would prefer to have this as two separate calls from the end-user client. First issue a query without waiting for non-stale results and show the query results immediately to the end-user. If the results are stale then make that visible to the end-user, and make a second query to the server where you wait for non-stale results.
That way the end-user will always get something to see quickly without having to wait potentially 15 seconds even for stale results.
You may force the document store to always wait for the last write; then you can use queries without customize instructions:
documentStore.Conventions.DefaultQueryingConsistency = ConsistencyOptions.QueryYourWrites;
Note: if the index is very busy, or you write something to the DB and immediately query related data, using async instead of a timeout is better.
//deal with a very busy index
using (var session = documentStore.OpenAsyncSession())
{
    var result = await session.Query<...>()
        .Where(x => ...)
        .ToListAsync();
}

//write, then read
using (var session = documentStore.OpenAsyncSession())
{
    await session.StoreAsync(entity);
    await session.SaveChangesAsync();

    //query data related to the entity
    var result = await session.Query<...>()
        .Where(x => ...)
        .ToListAsync();
}

Delaying writes to SQL Server

I am working on an app and need to keep track of how many views a page has, almost like how SO does it. It is a value used to determine how popular a given page is.
I am concerned that writing to the DB every time a new view needs to be recorded will impact performance. I know this is borderline premature optimization, but I have experienced the problem before. Anyway, the value doesn't need to be real time; it is OK if it is delayed by 10 minutes or so. I was thinking that caching the data and doing one large write every X minutes should help.
I am running on Windows Azure, so the AppFabric cache is available to me. My original plan was to create some sort of compound key (PostID:UserID) and tag the key with "pageview". AppFabric allows you to get all keys by tag. Thus I could let them build up and do one bulk insert into my table instead of many small writes. The table looks like this, but is open to change.
int PageID | guid userID | DateTime ViewTimeStamp
The website would still get the value from the database, writes would just be delayed, make sense?
I just read that the Windows Azure AppFabric cache does not support tag-based searches, which pretty much negates my idea.
My question is: how would you accomplish this? I am new to Azure, so I am not sure what my options are. Is there a way to use the cache without tag-based searches? I am just looking for advice on how to delay these writes to SQL.
You might want to take a look at http://www.apathybutton.com (and the Cloud Cover episode it links to), which talks about a highly scalable way to count things. (It might be overkill for your needs, but hopefully it gives you some options.)
You could keep a queue in memory and, on a timer, drain the queue, collapse the queued items by totaling the counts per page, and write them in one SQL batch/round trip. For example, using a TVP (table-valued parameter) you could write the queued totals with one sproc call; a sketch follows below.
That of course doesn't guarantee the view counts get written, since they sit in memory and are written lazily, but page counts shouldn't be critical data and crashes should be rare.
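As a rough sketch of the TVP round trip (the table type, stored procedure, and connection string are hypothetical, not something from the answer):
// Assumes a user-defined table type dbo.PageCountType (PageID int, Hits int)
// and a stored procedure dbo.AddPageCounts (@counts dbo.PageCountType READONLY).
var table = new DataTable();
table.Columns.Add("PageID", typeof(int));
table.Columns.Add("Hits", typeof(int));
foreach (var kvp in queuedTotals)            // queuedTotals: IDictionary<int, int>
    table.Rows.Add(kvp.Key, kvp.Value);

using (var con = new SqlConnection("connection string here"))
using (var cmd = new SqlCommand("dbo.AddPageCounts", con))
{
    cmd.CommandType = CommandType.StoredProcedure;
    var p = cmd.Parameters.AddWithValue("@counts", table);
    p.SqlDbType = SqlDbType.Structured;
    p.TypeName = "dbo.PageCountType";

    con.Open();
    cmd.ExecuteNonQuery();                   // one round trip for all queued totals
}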
You might want to have a look at how the "diagnostics" feature in Azure works. Not because you would use diagnostics for what you are doing, but because it deals with a similar problem and may provide some inspiration. I am just about to implement a data auditing feature and I want to log that to table storage, so I also want to delay and bunch the updates together; I have taken a lot of inspiration from diagnostics.
Now, the way Diagnostics in Azure works is that each role starts a little background "transfer" thread. So, whenever you write any traces then that gets stored in a list in local memory and the background thread will (by default) bunch all the requests up and transfer them to table storage every minute.
In your scenario, I would let each role instance keep track of a count of hits and then use a background thread to update the database every minute or so.
I would probably use something like a static ConcurrentDictionary (or one hanging off a singleton) on each web role, with each hit incrementing the counter for the page identifier. You'd need some thread handling code to allow multiple requests to update the same counter in the list. Alternatively, just allow each "hit" to add a new record to a shared thread-safe list.
Then have a background thread, once per minute, increment the database with the number of hits per page since last time, and reset the local counter to 0 (or empty the shared list if you go with that approach). Again, be careful about the multithreading and locking.
The important thing is to make sure your database update is atomic; if you read the current count from the database, increment it, and then write it back, you may have two different web role instances doing this at the same time and thus lose one update.
EDIT:
Here is a quick sample of how you could go about this.
using System.Collections.Concurrent;
using System.Data.SqlClient;
using System.Threading;
using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        // You would put this in your Application_Start for the web role
        Thread hitTransfer = new Thread(() => HitCounter.Run(new TimeSpan(0, 0, 1))); // You'd probably want the transfer to happen once a minute rather than once a second
        hitTransfer.Start();

        // Testing code - this just simulates various web threads being hit and adding hits to the counter
        RunTestWorkerThreads(5);
        Thread.Sleep(5000);

        // You would put the following line in your Application shutdown
        HitCounter.StopRunning(); // You could do some cleverer stuff with aborting threads, joining the thread etc but you probably won't need to

        Console.WriteLine("Finished...");
        Console.ReadKey();
    }

    private static void RunTestWorkerThreads(int workerCount)
    {
        Thread[] workerThreads = new Thread[workerCount];
        for (int i = 0; i < workerCount; i++)
        {
            workerThreads[i] = new Thread(
                (tagname) =>
                {
                    Random rnd = new Random();
                    for (int j = 0; j < 300; j++)
                    {
                        HitCounter.LogHit(tagname.ToString());
                        Thread.Sleep(rnd.Next(0, 5));
                    }
                });
            workerThreads[i].Start("TAG" + i);
        }

        foreach (var t in workerThreads)
        {
            t.Join();
        }
        Console.WriteLine("All threads finished...");
    }
}
public static class HitCounter
{
    private static System.Collections.Concurrent.ConcurrentQueue<string> hits;
    private static object transferlock = new object();
    private static volatile bool stopRunning = false;

    static HitCounter()
    {
        hits = new ConcurrentQueue<string>();
    }

    public static void LogHit(string tag)
    {
        hits.Enqueue(tag);
    }

    public static void Run(TimeSpan transferInterval)
    {
        while (!stopRunning)
        {
            Transfer();
            Thread.Sleep(transferInterval);
        }
    }

    public static void StopRunning()
    {
        stopRunning = true;
        Transfer();
    }

    private static void Transfer()
    {
        lock (transferlock)
        {
            var tags = GetPendingTags();
            var hitCounts = from tag in tags
                            group tag by tag into g
                            select new KeyValuePair<string, int>(g.Key, g.Count());
            WriteHits(hitCounts);
        }
    }

    private static void WriteHits(IEnumerable<KeyValuePair<string, int>> hitCounts)
    {
        // NOTE: I don't usually use sql commands directly and have not tested the below
        // The idea is that the update should be atomic so even though you have multiple
        // web servers all issuing similar update commands, potentially at the same time,
        // they should all commit. I do urge you to test this part as I cannot promise this code
        // will work as-is
        //using (SqlConnection con = new SqlConnection("xyz"))
        //{
        //    foreach (var hitCount in hitCounts.OrderBy(h => h.Key))
        //    {
        //        var cmd = con.CreateCommand();
        //        cmd.CommandText = "update hits set count = count + @count where tag = @tag";
        //        cmd.Parameters.AddWithValue("@count", hitCount.Value);
        //        cmd.Parameters.AddWithValue("@tag", hitCount.Key);
        //        cmd.ExecuteNonQuery();
        //    }
        //}

        Console.WriteLine("Writing....");
        foreach (var hitCount in hitCounts.OrderBy(h => h.Key))
        {
            Console.WriteLine(String.Format("{0}\t{1}", hitCount.Key, hitCount.Value));
        }
    }

    private static IEnumerable<string> GetPendingTags()
    {
        List<string> hitlist = new List<string>();
        var currentCount = hits.Count();
        for (int i = 0; i < currentCount; i++)
        {
            string tag = null;
            if (hits.TryDequeue(out tag))
            {
                hitlist.Add(tag);
            }
        }
        return hitlist;
    }
}

NHibernate - flush before querying?

I have a repository class that uses an NHibernate session to persist objects to the database. By default, the repository doesn't use an explicit transaction - that's up to the caller to manage. I have the following unit test to test my NHibernate plumbing:
[Test]
public void NHibernate_BaseRepositoryProvidesRequiredMethods()
{
    using (var unitOfWork = UnitOfWork.Create())
    {
        // test the add method
        TestRepo.Add(new TestObject() { Id = 1, Name = "Testerson" });
        TestRepo.Add(new TestObject() { Id = 2, Name = "Testerson2" });
        TestRepo.Add(new TestObject() { Id = 3, Name = "Testerson3" });

        // test the getall method
        var objects = TestRepo.GetAll();
        Assert.AreEqual(3, objects.Length);

        // test the remove method
        TestRepo.Remove(objects[1]);
        objects = TestRepo.GetAll();
        Assert.AreEqual(2, objects.Length);

        // test the get method
        var obj = TestRepo.Get(objects[1].Id);
        Assert.AreSame(objects[1], obj);
    }
}
The problem is that the line
Assert.AreEqual(3, objects.Length);
fails the test because the object list returned from the GetAll method is empty. If I manually flush the session right after inserting the three objects, that part of the test passes. I'm using the default FlushMode on the session, and according to the documentation it's supposed to flush before running the query that retrieves all the objects, but it obviously isn't. What am I missing?
Edit: I'm using Sqlite for the unit test scenario, if that makes any difference.
You state that
according to the documentation, it's supposed to flush before running the query to retrieve all the objects
But the doc at https://www.hibernate.org/hib_docs/v3/api/org/hibernate/FlushMode.html states that in AUTO flush mode (emphasis is mine):
The Session is sometimes flushed before query execution in order to ensure that queries never return stale state. This is the default flush mode.
So yes, you need to do a flush to save those values before expecting them to show up in your select.
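For instance, here is a minimal sketch of forcing the pending inserts out before querying. It assumes the unit of work exposes the underlying NHibernate ISession (the Session property is illustrative); wrapping the block in an explicit transaction and committing it would work just as well:
// test the add method
TestRepo.Add(new TestObject() { Id = 1, Name = "Testerson" });
TestRepo.Add(new TestObject() { Id = 2, Name = "Testerson2" });
TestRepo.Add(new TestObject() { Id = 3, Name = "Testerson3" });

// Push the pending inserts to the database so the query below can see them
unitOfWork.Session.Flush();

// test the getall method
var objects = TestRepo.GetAll();
Assert.AreEqual(3, objects.Length);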