Concurrent processing of Channels - asp.net-core

I'm following this tutorial to create a hosted service. The program runs as expected. However, I want to process the queued items concurrently.
In my app, there are 4 clients, each of these clients can process 4 items at a time. So at any given time, 16 items should be processed in parallel.
So based on these requirements, I've modified the code a bit:
In the MonitorLoop class:
private int count = 0;
private async ValueTask MonitorAsync()
{
while (!_cancellationToken.IsCancellationRequested)
{
await _taskQueue.QueueAsync(BuildWorkItem);
Interlocked.Increment(ref count);
Console.WriteLine($"Count: {count}");
}
}
and in the same class:
if (delayLoop == 3)
{
_logger.LogInformation("Queued Background Task {Guid} is complete.", guid);
Interlocked.Decrement(ref count);
}
This shows that if I set "Capacity" to 4, the count never goes above 5.
In other words, when the queue is full, QueueAsync waits until there is room for one more item.
The problem is that the items are processed one at a time.
Here's the code for the BackgroundProcessing method on the QueuedHostedService class:
private async Task BackgroundProcessing(CancellationToken stoppingToken)
{
while (!stoppingToken.IsCancellationRequested)
{
var workItem = await TaskQueue.DequeueAsync(stoppingToken);
try
{
//instead of getting a single item from the queue, somehow, here
//we should be able to process them in parallel for 4 clients
//with a limit for maximum items each client can process
await workItem(stoppingToken);
}
catch (Exception ex)
{
_logger.LogError(ex, "Error occurred executing {WorkItem}.", nameof(workItem));
}
}
}
I want to process them in parallel. I'm not sure if using Channel as the queue in the system is the best solution. Maybe I should have a ConcurrentQueue instead. But again, I'm not sure how to achieve a robust implementation that can have 4 clients with 4 threads each.

If you want four processors, then you can refactor the code to use four instances of your main loop, and use Task.WhenAll to (asynchronously) wait for all of them to complete:
private async Task BackgroundProcessing(CancellationToken stoppingToken)
{
var task1 = ProcessAsync(stoppingToken);
var task2 = ProcessAsync(stoppingToken);
var task3 = ProcessAsync(stoppingToken);
var task4 = ProcessAsync(stoppingToken);
await Task.WhenAll(task1, task2, task3, task4);
async Task ProcessAsync(CancellationToken stoppingToken)
{
while (!stoppingToken.IsCancellationRequested)
{
var workItem = await TaskQueue.DequeueAsync(stoppingToken);
try
{
await workItem(stoppingToken);
}
catch (Exception ex)
{
_logger.LogError(ex, "Error occurred executing {WorkItem}.", nameof(workItem));
}
}
}
}
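The same idea can be generalized so the number of processors is a parameter rather than four hand-written variables. The following is only a sketch under assumptions: WorkerPoolDemo, RunAsync, and the sample work items are hypothetical names, the channel is created directly instead of through the tutorial's queue class, and 16 is just the "4 clients x 4 items" figure from the question.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

public static class WorkerPoolDemo
{
    // Runs itemCount sample work items through workerCount concurrent
    // consumer loops and returns how many items completed.
    public static async Task<int> RunAsync(int workerCount, int itemCount)
    {
        // Bounded channel: writers wait when the queue is full.
        var channel = Channel.CreateBounded<Func<CancellationToken, ValueTask>>(workerCount);
        int completed = 0;

        // Start one consumer loop per worker; all of them read from the same channel.
        Task[] workers = Enumerable.Range(0, workerCount)
            .Select(_ => ProcessAsync(channel.Reader, CancellationToken.None))
            .ToArray();

        // Produce sample work items (a stand-in for the real work).
        for (int i = 0; i < itemCount; i++)
        {
            await channel.Writer.WriteAsync(async ct =>
            {
                await Task.Delay(10, ct);
                Interlocked.Increment(ref completed);
            });
        }

        // Signal that no more items are coming, then wait for the workers to drain the queue.
        channel.Writer.Complete();
        await Task.WhenAll(workers);
        return completed;
    }

    private static async Task ProcessAsync(
        ChannelReader<Func<CancellationToken, ValueTask>> reader,
        CancellationToken stoppingToken)
    {
        await foreach (var workItem in reader.ReadAllAsync(stoppingToken))
        {
            try
            {
                await workItem(stoppingToken);
            }
            catch (Exception ex)
            {
                // One failing item should not stop the worker loop.
                Console.Error.WriteLine(ex);
            }
        }
    }

    public static async Task Main()
    {
        int completed = await RunAsync(workerCount: 16, itemCount: 100);
        Console.WriteLine($"Completed: {completed}");
    }
}
```

With this shape, at most workerCount items are in flight at any time, because each loop awaits its current item before dequeuing the next one.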
I'm not sure how to achieve a robust implementation
If you want a robust implementation, then you can't use that tutorial, sorry. The primary problem with that kind of background work is that it will be lost on any app restart. And app restarts are normal: the server can lose power or crash, OS or runtime patches can be installed, IIS will recycle your app periodically, and whenever you deploy your code, the app will restart. And whenever any of these things happen, all in-memory queues like channels will lose all their work.
A production-quality implementation requires a durable queue at the very least. I also recommend a separate background processor. I have a blog series on the subject that may help you get started.

Related

Efficient thread use on high traffic ASP.NET Core Web API with processing timeout

My ASP.NET Core Web API (Linux) endpoint needs to serve a high volume of concurrent requests. If the request takes more than 200ms then it should abort and return a custom piece of JSON. The code is all awaitable. The request must always return HTTP 200 and the HTTP request timeout cannot be reduced from 30 secs to 200ms.
What is the most efficient way to accomplish what I want? Should I use a Task? Should I use Task.Wait or Task.WaitAsync? Or should the work methods run in the HTTP request thread, periodically check Stopwatch.Elapsed and throw a timeout exception?
This is my current code:
var task = Task.Factory.StartNew(async () =>
{
    // Processing part 1
    var result1 = await DoWorkPart1("Param1");
    if (cancellationToken.IsCancellationRequested)
        cancellationToken.ThrowIfCancellationRequested();
    // Processing part 2
    var result2 = await DoWorkPart2(result1);
    return result2;
}).Unwrap(); // Return lambda task, not outer task
// Is it better to use WaitAsync?
task.Wait(TimeSpan.FromMilliseconds(150));
if (task.IsCompleted) // Result within timeout
{
    if (task.Exception == null) // Success
    {
        return Ok(task.Result);
    }
    else
    {
        return Ok(new FailedObject() { Reason = ReasonEnum.UnexpectedError });
    }
}
else // Timeout
{
    return Ok(new FailedObject() { Reason = ReasonEnum.TookTooLong });
}
What is the most efficient way to accomplish what I want?
I recommend using CancellationTokens to cancel. With a very short timeout like 200ms, you might just want to create a CancellationTokenSource with that timeout and ignore the CancellationToken provided to you by ASP.NET, which handles situations like clients disconnecting early.
Should I use a Task? Should I use Task.Wait or Task.WaitAsync? Or should the work methods run in the HTTP request thread, periodically check Stopwatch.Elapsed and throw a timeout exception?
I would say none of these. Instead, pass the CancellationToken down as far as you possibly can, ideally right to the lowest-level APIs your asynchronous code is calling.
If some of those APIs ignore their cancellation tokens, or if it's possible they may complete synchronously (e.g., due to caching), then adding cancellationToken.ThrowIfCancellationRequested(); in-between steps is a good idea.
Side note: Don't use StartNew.
using var cts = new CancellationTokenSource(TimeSpan.FromMilliseconds(200));
try
{
    // Processing part 1
    var result1 = await DoWorkPart1("Param1", cts.Token);
    cts.Token.ThrowIfCancellationRequested();
    // Processing part 2
    var result2 = await DoWorkPart2(result1, cts.Token);
    return Ok(result2);
}
catch (OperationCanceledException)
{
    // The 200ms timeout fired before the work finished
    return Ok(new FailedObject() { Reason = ReasonEnum.TookTooLong });
}
catch
{
    return Ok(new FailedObject() { Reason = ReasonEnum.UnexpectedError });
}
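If you would rather not ignore the token provided by ASP.NET, one possible variation is to link it with the timeout, so the work stops on whichever fires first. This is a sketch, not part of the answer above: TimeoutDemo, HandleAsync, and DoWorkAsync are hypothetical names, and Task.Delay stands in for the real work.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TimeoutDemo
{
    // Hypothetical stand-in for the real work; it observes the token via Task.Delay.
    private static async Task<string> DoWorkAsync(int milliseconds, CancellationToken ct)
    {
        await Task.Delay(milliseconds, ct);
        return "done";
    }

    public static async Task<string> HandleAsync(int workMs, CancellationToken requestAborted)
    {
        // Cancels when the caller's token is cancelled OR after 200ms, whichever comes first.
        using var cts = CancellationTokenSource.CreateLinkedTokenSource(requestAborted);
        cts.CancelAfter(TimeSpan.FromMilliseconds(200));
        try
        {
            return await DoWorkAsync(workMs, cts.Token);
        }
        catch (OperationCanceledException)
        {
            // Distinguish a client disconnect from our own timeout.
            return requestAborted.IsCancellationRequested
                ? "client-disconnected"
                : "took-too-long";
        }
    }

    public static async Task Main()
    {
        Console.WriteLine(await HandleAsync(10, CancellationToken.None));
        Console.WriteLine(await HandleAsync(2000, CancellationToken.None));
    }
}
```

In a controller, requestAborted would typically be the CancellationToken action parameter or HttpContext.RequestAborted.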

How to determine job's queue at runtime

Our web app allows the end-user to set the queue of recurring jobs on the UI. (We create a queue for each server (use server name) and allow users to choose server to run)
How the job is registered:
RecurringJob.AddOrUpdate<IMyTestJob>(input.Id, x => x.Run(), input.Cron, TimeZoneInfo.Local, input.QueueName);
It works properly most of the time, but when we checked the logs on Production we found that the job sometimes runs on the wrong queue (server). We have no further access to Production, so we tried to reproduce the issue on Development, but it did not happen there.
As a temporary fix, we need to get the queue name while the job is running, compare it with the current server name, and stop the job when they are different.
Is it possible to get it from PerformContext, and if so, how?
Note: We use Hangfire version 1.7.9 and ASP.NET Core 3.1.
You may have a look at https://github.com/HangfireIO/Hangfire/pull/502
A dedicated filter intercepts the queue changes and restores the original queue.
I guess you can just stop the execution in a very similar filter, or set a parameter to cleanly stop execution during the IElectStateFilter.OnStateElection phase by changing the CandidateState to FailedState.
Maybe your problem comes from an already existing filter that interferes with the queues.
Here is the code from the link above:
public class PreserveOriginalQueueAttribute : JobFilterAttribute, IApplyStateFilter
{
public void OnStateApplied(ApplyStateContext context, IWriteOnlyTransaction transaction)
{
var enqueuedState = context.NewState as EnqueuedState;
// Activating only when enqueueing a background job
if (enqueuedState != null)
{
// Checking if an original queue is already set
var originalQueue = JobHelper.FromJson<string>(context.Connection.GetJobParameter(
context.BackgroundJob.Id,
"OriginalQueue"));
if (originalQueue != null)
{
// Override any other queue value that is currently set (by other filters, for example)
enqueuedState.Queue = originalQueue;
}
else
{
// Queueing for the first time, we should set the original queue
context.Connection.SetJobParameter(
context.BackgroundJob.Id,
"OriginalQueue",
JobHelper.ToJson(enqueuedState.Queue));
}
}
}
public void OnStateUnapplied(ApplyStateContext context, IWriteOnlyTransaction transaction)
{
}
}
I have found a simple solution: since we know the recurring job Id, we can get its information from JobStorage and compare its queue with the current queue (current server name):
public bool IsCorrectQueue()
{
List<RecurringJobDto> recurringJobs = Hangfire.JobStorage.Current.GetConnection().GetRecurringJobs();
var myJob = recurringJobs.FirstOrDefault(x => x.Id.Equals("My job Id"));
var definedQueue = myJob.Queue;
var currentServerQueue = string.Concat(Environment.MachineName.ToLowerInvariant().Where(char.IsLetterOrDigit));
return definedQueue == "default" || definedQueue == currentServerQueue;
}
Then check it inside the job:
public async Task Run()
{
//Check correct queue
if (!IsCorrectQueue())
{
Logger.Error("Wrong queue detected");
return;
}
//Job logic
}

Ignore old events when processing events of an EventHub consumer group in ASP.Net Core

I have an API with ASP.Net Core 3.1 which uses Azure.Messaging.EventHubs and Azure.Messaging.EventHubs.Processor where I get events from a consumer group and then send them to a SignalR hub. The processor runs only when there are users connected to the hub and stops when the last one gets disconnected updating its checkpoint in a BlobStorage.
The current logic for each event is: if the difference between DateTime.UtcNow and the event timestamp is less than 2 minutes, the event is sent to the SignalR hub; otherwise nothing is done.
The problem is as follows: sometimes the EventProcessorClient is stopped for a long period and many events accumulate in the EventHub, so the processor has to slowly work through them before the SignalR hub starts receiving recent events again. Catching up takes far too long, especially when receiving hundreds of events per minute.
Is there a way to, for example, manually move the checkpoint before starting the processor? Or to get only the events of the last X minutes? Or maybe another idea/solution?
PS: I don't care for events older than 2 to 5 minutes for this consumer group.
PS2: The retention time configured in the EventHub is 1 day.
The code:
/* properties and stuff */
// Constructor
public BusEventHub(ILogger<BusEventHub> logger, IConfiguration configuration, IHubContext<BusHub> hubContext) {
_logger = logger;
Configuration = configuration;
_busExcessHub = hubContext;
/* Connection strings and stuff */
// Create a blob container client that the event processor will use
storageClient = new BlobContainerClient(this.blobStorageConnectionString, this.blobContainerName);
// Create an event processor client to process events in the event hub
processor = new EventProcessorClient(storageClient, consumerGroup, this.ehubNamespaceConnectionString, this.eventHubName);
// Register handlers for processing events and handling errors
processor.ProcessEventAsync += ProcessEventHandler;
processor.ProcessErrorAsync += ProcessErrorHandler;
}
public async Task Start() {
_logger.LogInformation($"Starting event processing for EventHub {eventHubName}");
await processor.StartProcessingAsync();
}
public async Task Stop() {
if (BusHubUserHandler.ConnectedIds.Count < 2) {
_logger.LogInformation($"Stopping event processing for EventHub {eventHubName}");
await processor.StopProcessingAsync();
} else {
_logger.LogDebug("There are still other users connected");
}
}
private async Task ProcessEventHandler(ProcessEventArgs eventArgs) {
try {
string receivedEvent = Encoding.UTF8.GetString(eventArgs.Data.Body.ToArray());
_logger.LogDebug($"Received event: {receivedEvent}");
BusExcessMinified busExcess = BusExcessMinified.FromJson(receivedEvent);
double timeDiff = (DateTime.UtcNow - busExcess.Timestamp).TotalMinutes;
if (timeDiff < 2) {
string responseEvent = busExcess.ToJson();
_logger.LogDebug($"Sending message to BusExcess Hub: {responseEvent}");
await _busExcessHub.Clients.All.SendAsync("UpdateBuses", responseEvent);
}
_logger.LogDebug("Update checkpoint in the blob storage"); // So that the service receives only new events the next time it's run
await eventArgs.UpdateCheckpointAsync(eventArgs.CancellationToken);
} catch (TaskCanceledException) {
_logger.LogInformation("The EventHub event processing was stopped");
} catch (Exception e) {
_logger.LogError($"Exception: {e}");
}
}
/* ProcessErrorHandler */
It is possible to request an initial position for partitions as they're initialized, which will allow you to specify the enqueue time as your starting point. This sample illustrates the details. The caveat is that the initial position is only used when there is no checkpoint for a partition; checkpoints will always take precedence.
From the scenario that you're describing, it sounds as if checkpoints aren't useful to you and are getting in the way of your preferred usage pattern. If there aren't other mitigating factors, I'd recommend never checkpointing and instead overriding the default starting position to dynamically reset to the time that you're interested in.
If you do, for some reason, need to checkpoint in addition to this then your best bet will be deleting the checkpoint data, as checkpoints are based on the offset and won't recognize an enqueue time for positioning.
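A rough sketch of overriding the default starting position might look like the following. ProcessorSetup and UseRecentStartingPosition are hypothetical names; the handler has to be registered before StartProcessingAsync is called, and, as noted above, it only takes effect for partitions that have no checkpoint.

```csharp
using System;
using System.Threading.Tasks;
using Azure.Messaging.EventHubs;
using Azure.Messaging.EventHubs.Consumer;
using Azure.Messaging.EventHubs.Processor;

public static class ProcessorSetup
{
    // Registers a handler that starts each partition at "now minus lookback"
    // when no checkpoint exists for that partition.
    public static void UseRecentStartingPosition(EventProcessorClient processor, TimeSpan lookback)
    {
        processor.PartitionInitializingAsync += (PartitionInitializingEventArgs args) =>
        {
            // Computed each time a partition is initialized, so it tracks the current time.
            args.DefaultStartingPosition =
                EventPosition.FromEnqueuedTime(DateTimeOffset.UtcNow - lookback);
            return Task.CompletedTask;
        };
    }
}
```

For the scenario in the question, this would be called once in the constructor, for example with TimeSpan.FromMinutes(2), and the UpdateCheckpointAsync call in the event handler would be dropped.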

Recursive Bus.Send() with-in a Handler (Transactions, Threading, Tasks)

I have a handler similar to the following, which essentially responds to a command and sends a whole bunch of commands to a different queue.
public void Handle(ISomeCommand message)
{
int i=0;
while (i < 10000)
{
var command = Bus.CreateInstance<IAnotherCommand>();
command.Id = i;
Bus.Send("target.queue#d1555", command);
i++;
}
}
The issue with this block is, until the loop is fully completed none of the messages appear in the target queue or in the outgoing queue. Can someone help me understand this behavior?
Also if I use Tasks to send messages within the Handler as below, messages appear immediately. So two questions on this,
What's the explanation on Task based Sends to go through immediately?
Are there are any ramifications on using Tasks with in message handlers?
public void Handle(ISomeCommand message)
{
int i=0;
while (i < 10000)
{
System.Threading.ThreadPool.QueueUserWorkItem((args) =>
{
var command = Bus.CreateInstance<IAnotherCommand>();
command.Id = i;
Bus.Send("target.queue#d1555", command);
i++;
});
}
}
Your time is much appreciated!
First question: Picking a message from a queue, running all the registered message handlers for it, AND any other transactional action (like sending new messages or writing to a database) is performed in ONE transaction. Either all of it completes or none of it does. So what you are seeing is the expected behaviour: picking a message from the queue, handling ISomeCommand, and writing 10000 new IAnotherCommand messages either happens completely or not at all. To avoid this behaviour you can do one of the following:
1) Configure your NServiceBus endpoint to not be transactional:
public class EndpointConfig : IConfigureThisEndpoint, AsA_Publisher, IWantCustomInitialization
{
    public void Init()
    {
        Configure.With()
            .DefaultBuilder()
            .XmlSerializer()
            .MsmqTransport()
            .IsTransactional(false)
            .UnicastBus();
    }
}
2) Wrap the sending of IAnotherCommand in a transaction scope that suppresses the ambient transaction:
public void Handle(ISomeCommand message)
{
    using (new TransactionScope(TransactionScopeOption.Suppress))
    {
        int i = 0;
        while (i < 10000)
        {
            var command = Bus.CreateInstance<IAnotherCommand>();
            command.Id = i;
            Bus.Send("target.queue#d1555", command);
            i++;
        }
    }
}
3) Issue the Bus.Send on another thread, either by starting a new thread yourself, or by using System.Threading.ThreadPool.QueueUserWorkItem or the Task classes. This works because an ambient transaction is not automatically carried over to a new thread.
Second question: The ramification of using Tasks, or any of the other methods I mentioned, is that you have no transactional guarantee for the whole thing.
How do you handle the case where you have generated 5000 IAnotherCommands and the power suddenly goes out?
If you use 2) or 3), the original ISomeCommand will not complete and will be retried automatically by NServiceBus when you start up the endpoint again. End result: 5000 + 10000 IAnotherCommands.
If you use 1), you will lose the original ISomeCommand completely and end up with only 5000 IAnotherCommands.
Using the recommended transactional way, the initial 5000 IAnotherCommands would be discarded, and the original ISomeCommand would go back on the queue and be retried when the endpoint starts up again. Net result: 10000 IAnotherCommands.
If memory serves, NServiceBus wraps the calls to the message handlers in a TransactionScope if the transaction option is used, and TransactionScope needs some help to be cross-thread friendly:
TransactionScope and multi-threading
If you are trying to reduce overhead, you can also bundle your messages. The signature for the send is Bus.Send(IMessage[] messages). If you can guarantee that you don't exceed the size limit for MSMQ, then you could Send() all the messages at once. If the size limit is an issue, then you can chunk them up or use the Databus.
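Chunking could be sketched with a small helper like the one below. Batching and Batch are hypothetical names, the batch size of 4 is arbitrary, and the place where Bus.Send would be called is only indicated in a comment because the right batch size depends on your message size and the MSMQ limit.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Batching
{
    // Splits a sequence into arrays of at most `size` items, preserving order.
    public static IEnumerable<T[]> Batch<T>(IEnumerable<T> source, int size)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));
        var buffer = new List<T>(size);
        foreach (var item in source)
        {
            buffer.Add(item);
            if (buffer.Count == size)
            {
                yield return buffer.ToArray();
                buffer.Clear();
            }
        }
        // Emit the final, possibly smaller, batch.
        if (buffer.Count > 0) yield return buffer.ToArray();
    }

    public static void Main()
    {
        foreach (int[] batch in Batch(Enumerable.Range(0, 10), 4))
        {
            // In the handler, this is where each batch of commands would be
            // passed to a single Bus.Send call instead of one call per command.
            Console.WriteLine(string.Join(",", batch));
        }
    }
}
```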

Async WCF Service with multiple async calls inside

I have a web service in WCF that consumes some external web services, so what I want to do is make this service asynchronous in order to release the thread, wait for the completion of all the external services, and then return the result to the client.
With Framework 4.0
public class MyService : IMyService
{
public IAsyncResult BeginDoWork(int count, AsyncCallback callback, object serviceState)
{
var proxyOne = new Gateway.BackendOperation.BackendOperationOneSoapClient();
var proxyTwo = new Gateway.BackendOperationTwo.OperationTwoSoapClient();
var taskOne = Task<int>.Factory.FromAsync(proxyOne.BeginGetNumber, proxyOne.EndGetNumber, 10, serviceState);
var taskTwo = Task<int>.Factory.FromAsync(proxyTwo.BeginGetNumber, proxyTwo.EndGetNumber, 10, serviceState);
var tasks = new Queue<Task<int>>();
tasks.Enqueue(taskOne);
tasks.Enqueue(taskTwo);
return Task.Factory.ContinueWhenAll(tasks.ToArray(), innerTasks =>
{
var tcs = new TaskCompletionSource<int>(serviceState);
int sum = 0;
foreach (var innerTask in innerTasks)
{
if (innerTask.IsFaulted)
{
tcs.SetException(innerTask.Exception);
callback(tcs.Task);
return;
}
if (innerTask.IsCompleted)
{
sum += innerTask.Result;
}
}
tcs.SetResult(sum);
callback(tcs.Task);
});
}
public int EndDoWork(IAsyncResult result)
{
try
{
return ((Task<int>)result).Result;
}
catch (AggregateException ex)
{
throw ex.InnerException;
}
}
}
My questions here are:
1. This code is using three threads: one that is instanced in the BeginDoWork, another one that is instanced when the code enter inside the anonymous method ContinueWhenAll, and the last one when the callback is executed, in this case EndDoWork. Is that correct or I’m doing something wrong on the calls? Should I use any synchronization context? Note: The synchronization context is null on the main thread.
2. What happen if I “share” information between threads, for instance, the callback function? Will that cause a performance issue or the anonymous method is like a closure where I can share data?
With Framework 4.5 and Async and Await
Now with Framework 4.5, the code is much simpler than before:
public async Task<int> DoWorkAsync(int count)
{
var proxyOne = new Backend.ServiceOne.ServiceOneClient();
var proxyTwo = new Backend.ServiceTwo.ServiceTwoClient();
var doWorkOne = proxyOne.DoWorkAsync(count);
var doWorkTwo = proxyTwo.DoWorkAsync(count);
var result = await Task.WhenAll(doWorkOne, doWorkTwo);
return doWorkOne.Result + doWorkTwo.Result;
}
But in this case when I debug the application, I always see that the code is executed on the same thread. So my questions here are:
3. When I’m waiting for the “awaitable” code, is that thread released and goes back to the thread pool to execute more requests?
3.1. If So, I suppose that when I get a result from the await Task, the execution completes on the same thread that the call before. Is that possible? What happen if that thread is processing another request?
3.2. If Not, how can I release the thread to send it back to the thread pool with Async and Await pattern?
Thank you!
1. This code is using three threads: one that is instanced in the BeginDoWork, another one that is instanced when the code enter inside the anonymous method ContinueWhenAll, and the last one when the callback is executed, in this case EndDoWork. Is that correct or I’m doing something wrong on the calls? Should I use any synchronization context?
It's better to think in terms of "tasks" rather than "threads". You do have three tasks here, each of which will run on the thread pool, one at a time.
2. What happen if I “share” information between threads, for instance, the callback function? Will that cause a performance issue or the anonymous method is like a closure where I can share data?
You don't have to worry about synchronization because each of these tasks can't run concurrently. BeginDoWork registers the continuation just before returning, so it's already practically done when the continuation can run. EndDoWork will probably not be called until the continuation is complete; but even if it is, it will block until the continuation is complete.
(Technically, the continuation can start running before BeginDoWork completes, but BeginDoWork just returns at that point, so it doesn't matter).
3. When I’m waiting for the “awaitable” code, is that thread released and goes back to the thread pool to execute more requests?
Yes.
3.1. If So, I suppose that when I get a result from the await Task, the execution completes on the same thread that the call before. Is that possible? What happen if that thread is processing another request?
No. Your host (in this case, ASP.NET) may continue the async methods on any thread it happens to have available.
This is perfectly safe because only one thread is executing at a time.
P.S. I recommend
var result = await Task.WhenAll(doWorkOne, doWorkTwo);
return result[0] + result[1];
instead of
var result = await Task.WhenAll(doWorkOne, doWorkTwo);
return doWorkOne.Result + doWorkTwo.Result;
because Task.Result should be avoided in async programming.
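As a small illustration of why indexing the awaited array is safe: the array returned by Task.WhenAll follows the order of the tasks passed in, not the order in which they finish. WhenAllDemo and GetNumberAsync below are hypothetical stand-ins for the backend proxies.

```csharp
using System;
using System.Threading.Tasks;

public static class WhenAllDemo
{
    // Hypothetical stand-in for a backend call.
    private static async Task<int> GetNumberAsync(int value, int delayMs)
    {
        await Task.Delay(delayMs);
        return value;
    }

    public static async Task<int> DoWorkAsync()
    {
        Task<int> one = GetNumberAsync(1, 50); // finishes last
        Task<int> two = GetNumberAsync(2, 10); // finishes first
        int[] results = await Task.WhenAll(one, two);
        // results[0] belongs to `one` and results[1] to `two`,
        // regardless of which task completed first.
        return results[0] * 10 + results[1];
    }

    public static async Task Main()
    {
        Console.WriteLine(await DoWorkAsync()); // prints 12
    }
}
```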