I've recently started working on a new application that will use task parallelism. I had just begun writing a tasking framework when I saw a number of posts on SO about the new System.Threading.Tasks namespace, which may be useful to me (and I would rather use an existing framework than roll my own).
However, looking over MSDN I haven't seen how, or whether, I can implement the functionality I'm looking for:
Dependency on other tasks completing.
The ability to wait on an unknown number of tasks performing the same action (perhaps wrapped in the same task object, which is invoked multiple times)
Setting a maximum number of concurrent instances of a task; since they use a shared resource, there is no point running more than one at once
Hinting at priority, or having the scheduler give tasks with a lower maximum number of concurrent instances a higher priority (so as to keep said resource in use as much as possible)
Edit: the ability to vary the priority of tasks performing the same action (a pretty poor example, but PredictWeather(Tomorrow) would have a higher priority than PredictWeather(NextWeek))
Can someone point me towards an example / tell me how I can achieve this? Cheers.
C# use case (typed in SO, so please forgive any syntax errors / typos):
Note: Do() / DoAfter() shouldn't block the calling thread.
class Application
{
Task LeafTask = new Task (LeafWork) {PriorityHint = High, MaxConcurrent = 1};
var Tree = new TaskTree (LeafTask);
Task TraverseTask = new Task (Tree.Traverse);
Task WorkTask = new Task (MoreWork);
Task RunTask = new Task (Run);
Object SharedLeafWorkObject = new Object ();
void Entry ()
{
RunTask.Do ();
RunTask.Join (); // Use this thread for task processing until all invocations of RunTask are complete
}
void Run(){
TraverseTask.Do ();
// Wait for TraverseTask to make sure all leaf tasks are invoked before waiting on them
WorkTask.DoAfter (new [] {TraverseTask, LeafTask});
if (running){
RunTask.DoAfter (WorkTask); // Keep at least one RunTask alive to prevent Join from 'unblocking'
}
else
{
TraverseTask.Join();
WorkTask.Join ();
}
}
void LeafWork (Object leaf){
lock (SharedLeafWorkObject) // Fake a shared resource
{
Thread.Sleep (200); // 'work'
}
}
void MoreWork ()
{
Thread.Sleep (2000); // this one takes a while
}
}
class TaskTreeNode<TItem>
{
Task LeafTask; // = Application::LeafTask
TItem Item;
void Traverse ()
{
if (isLeaf)
{
// LeafTask set in C-Tor or elsewhere
LeafTask.Do(this.Item);
//Edit
//LeafTask.Do(this.Item, this.Depth); // Deeper items get higher priority
return;
}
foreach (var child in this.children)
{
child.Traverse ();
}
}
}
There are numerous examples here:
http://code.msdn.microsoft.com/ParExtSamples
There's a great white paper that covers a lot of the details you mention, available here:
"Patterns for Parallel Programming: Understanding and Applying Parallel Patterns with the .NET Framework 4"
http://www.microsoft.com/downloads/details.aspx?FamilyID=86b3d32b-ad26-4bb8-a3ae-c1637026c3ee&displaylang=en
Off the top of my head I think you can do all the things you list in your question.
Dependencies etc.: Task.WaitAll(Task[] tasks)
Scheduler: the library supports numerous options for limiting the number of threads in use, and supports providing your own scheduler. I would avoid altering the priority of threads if at all possible; that is likely to have a negative impact on the scheduler, unless you provide your own.
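For example, here is a minimal sketch combining those two points, assuming the LimitedConcurrencyLevelTaskScheduler sample class from the ParExtSamples download linked above (it is a sample class, not part of the framework itself). Leaf work is funnelled through a scheduler capped at one concurrent task (your MaxConcurrent = 1), and the follow-up work is chained with ContinueWhenAll rather than blocking on WaitAll; LeafWork and MoreWork are stand-ins from your use case:
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

class SchedulerSketch
{
    static void LeafWork(int item) { Thread.Sleep(200); }  // 'work' on the shared resource
    static void MoreWork() { Thread.Sleep(2000); }         // runs after all leaves finish

    static void Main()
    {
        // Cap the shared-resource work at one task at a time.
        var leafScheduler = new LimitedConcurrencyLevelTaskScheduler(1);
        var leafFactory = new TaskFactory(leafScheduler);

        // An unknown number of leaf tasks, discovered at run time.
        var leafTasks = new List<Task>();
        for (int i = 0; i < 10; i++)
        {
            int item = i; // don't capture the loop variable directly
            leafTasks.Add(leafFactory.StartNew(() => LeafWork(item)));
        }

        // Dependency: MoreWork starts only when every leaf task has completed.
        // Task.WaitAll(leafTasks.ToArray()) would do the same, but it blocks.
        Task moreWork = Task.Factory.ContinueWhenAll(leafTasks.ToArray(),
                                                     _ => MoreWork());
        moreWork.Wait();
    }
}
Your DoAfter maps fairly directly onto ContinueWhenAll, and Join onto Wait.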
I am building a Capacitated Vehicle Routing Problem with Time Windows, but with one small difference compared to the one provided in the examples from the documentation: I don't have a depot. Instead, each order has a pickup step and a delivery step, in two different locations.
(Like in the Vehicle Routing example from the documentation, the previousStep planning variable has the CHAINED graph type, and its valueRangeProviderRefs includes both Drivers and Steps.)
This difference adds a couple of constraints:
the pickup and delivery steps of a given order must be handled by the same driver
the pickup must be before the delivery
After experimenting with constraints, I have found that it would be more efficient to implement two types of custom moves:
assign both steps of an order to a driver
rearrange the steps of a driver
I am currently implementing that first custom move. My solver's configuration looks like this:
SolverFactory<RoutingProblem> solverFactory = SolverFactory.create(
new SolverConfig()
.withSolutionClass(RoutingProblem.class)
.withEntityClasses(Step.class, StepList.class)
.withScoreDirectorFactory(new ScoreDirectorFactoryConfig()
.withConstraintProviderClass(Constraints.class)
)
.withTerminationConfig(new TerminationConfig()
.withSecondsSpentLimit(60L)
)
.withPhaseList(List.of(
new LocalSearchPhaseConfig()
.withMoveSelectorConfig(CustomMoveListFactory.getConfig())
))
);
My CustomMoveListFactory looks like this (I plan on migrating it to a MoveIteratorFactory later, but for the moment this is easier to read and write):
public class CustomMoveListFactory implements MoveListFactory<RoutingProblem> {
public static MoveListFactoryConfig getConfig() {
MoveListFactoryConfig result = new MoveListFactoryConfig();
result.setMoveListFactoryClass(CustomMoveListFactory.class);
return result;
}
@Override
public List<? extends Move<RoutingProblem>> createMoveList(RoutingProblem routingProblem) {
List<Move<RoutingProblem>> moves = new ArrayList<>();
// 1. Assign moves
for (Order order : routingProblem.getOrders()) {
Driver currentDriver = order.getDriver();
for (Driver driver : routingProblem.getDrivers()) {
if (!driver.equals(currentDriver)) {
moves.add(new AssignMove(order, driver));
}
}
}
// 2. Rearrange moves
// TODO
return moves;
}
}
And finally, the move itself looks like this (never mind the undo or the isDoable for the moment):
@Override
protected void doMoveOnGenuineVariables(ScoreDirector<RoutingProblem> scoreDirector) {
assignStep(scoreDirector, order.getPickupStep());
assignStep(scoreDirector, order.getDeliveryStep());
}
private void assignStep(ScoreDirector<RoutingProblem> scoreDirector, Step step) {
StepList beforeStep = step.getPreviousStep();
Step afterStep = step.getNextStep();
// 1. Insert step at the end of the driver's step list
StepList lastStep = driver.getLastStep();
scoreDirector.beforeVariableChanged(step, "previousStep"); // NullPointerException here
step.setPreviousStep(lastStep);
scoreDirector.afterVariableChanged(step, "previousStep");
// 2. Remove step from current chained list
if (afterStep != null) {
scoreDirector.beforeVariableChanged(afterStep, "previousStep");
afterStep.setPreviousStep(beforeStep);
scoreDirector.afterVariableChanged(afterStep, "previousStep");
}
}
The idea is that at no point am I doing an invalid chained-list manipulation.
However, as the title and the code comment indicate, I am getting a NullPointerException when I call scoreDirector.beforeVariableChanged. None of my variables are null (I've printed them to make sure). The NullPointerException doesn't occur in my code, but deep inside OptaPlanner's inner workings, making it difficult for me to fix:
Exception in thread "main" java.lang.NullPointerException
at org.drools.core.common.NamedEntryPoint.update(NamedEntryPoint.java:353)
at org.drools.core.common.NamedEntryPoint.update(NamedEntryPoint.java:338)
at org.drools.core.impl.StatefulKnowledgeSessionImpl.update(StatefulKnowledgeSessionImpl.java:1579)
at org.drools.core.impl.StatefulKnowledgeSessionImpl.update(StatefulKnowledgeSessionImpl.java:1551)
at org.optaplanner.core.impl.score.stream.drools.DroolsConstraintSession.update(DroolsConstraintSession.java:49)
at org.optaplanner.core.impl.score.director.stream.ConstraintStreamScoreDirector.afterVariableChanged(ConstraintStreamScoreDirector.java:137)
at org.optaplanner.core.impl.domain.variable.inverserelation.SingletonInverseVariableListener.retract(SingletonInverseVariableListener.java:96)
at org.optaplanner.core.impl.domain.variable.inverserelation.SingletonInverseVariableListener.beforeVariableChanged(SingletonInverseVariableListener.java:46)
at org.optaplanner.core.impl.domain.variable.listener.support.VariableListenerSupport.beforeVariableChanged(VariableListenerSupport.java:170)
at org.optaplanner.core.impl.score.director.AbstractScoreDirector.beforeVariableChanged(AbstractScoreDirector.java:430)
at org.optaplanner.core.impl.score.director.AbstractScoreDirector.beforeVariableChanged(AbstractScoreDirector.java:390)
at test.optaplanner.solver.AssignMove.assignStep(AssignMove.java:98)
at test.optaplanner.solver.AssignMove.doMoveOnGenuineVariables(AssignMove.java:85)
at org.optaplanner.core.impl.heuristic.move.AbstractMove.doMove(AbstractMove.java:35)
at org.optaplanner.core.impl.heuristic.move.AbstractMove.doMove(AbstractMove.java:30)
at org.optaplanner.core.impl.score.director.AbstractScoreDirector.doAndProcessMove(AbstractScoreDirector.java:187)
at org.optaplanner.core.impl.localsearch.decider.LocalSearchDecider.doMove(LocalSearchDecider.java:132)
at org.optaplanner.core.impl.localsearch.decider.LocalSearchDecider.decideNextStep(LocalSearchDecider.java:116)
at org.optaplanner.core.impl.localsearch.DefaultLocalSearchPhase.solve(DefaultLocalSearchPhase.java:70)
at org.optaplanner.core.impl.solver.AbstractSolver.runPhases(AbstractSolver.java:98)
at org.optaplanner.core.impl.solver.DefaultSolver.solve(DefaultSolver.java:189)
at test.optaplanner.OptaPlannerService.testOptaplanner(OptaPlannerService.java:68)
at test.optaplanner.App.main(App.java:13)
Is there something I did wrong? It seems I am following the documentation for custom moves fairly closely, aside from the fact that I am using Java code exclusively instead of Drools.
The initial solution I feed to the solver has all of the steps assigned to a single driver. There are 15 drivers and 40 orders.
In order to bypass this error, I have tried a number of different things:
remove the shadow variable annotation, turn Driver into a problem fact, and handle the nextStep field myself => this makes no difference
use Simulated Annealing + First Fit Decreasing construction heuristics, and start with steps not assigned to any driver (this was inspired by looking up the example here, which is more complete than the one from the documentation) => the NullPointerException appears on afterVariableChanged instead, but it still appears.
a number of other things which were probably not very smart
But without a more helpful error message, I can't think of anything else to try.
Thank you for your help.
I need to run some jobs in a cluster, only one at a time.
Because my team uses Hazelcast, I ended up with a solution based on the Hazelcast ILock implementation. For the purpose of the question, I am going to make a generalisation of it. Let's suppose we have the following interfaces (which could easily be implemented by e.g. Hazelcast or Redisson (Redis)):
public interface MyDistributedLock {
boolean lock();
void unlock();
boolean isLockedByCurrentThread();
}
public interface MyLockDistributedFactory {
MyDistributedLock getLock(String name);
}
And a lock method that waits if the lock cannot be acquired:
private Mono<Void> lock(String name, Publisher<?> publisher, MyLockDistributedFactory myLockFactory) {
// important to release the lock on the same thread as
// it was acquired
Scheduler scheduler = Schedulers.newSingle(name.toLowerCase());
return Mono.defer(() -> Mono.just(myLockFactory.getLock(name)))
.publishOn(scheduler)
.doOnNext(MyDistributedLock::lock)
.doOnNext(lock -> LOGGER.info("Process acquired lock for resource {}", name))
.flatMapMany(lock -> Flux.from(publisher))
.publishOn(scheduler)
.doFinally(signalType -> {
MyDistributedLock lock = myLockFactory.getLock(name);
if (signalType == SignalType.CANCEL) {
// cancel ignores publishOn
scheduler.schedule(() -> {
lock.unlock();
LOGGER.info("Process released lock for resource {} due to signal type {}", name, signalType);
});
} else if (lock.isLockedByCurrentThread()) {
lock.unlock();
LOGGER.info("Process released lock for resource {} due to signal type {}", name, signalType);
}
})
.then();
}
And an example of a job:
private Mono<Void> someJobRunEveryOneHourOnEveryNodeInCluster() {
MyLockDistributedFactory hazelcast = ...;
return lock("some-job", Flux.just(1,2), hazelcast)
.repeatWhen(afterOneHour());
}
I wonder whether this is a good approach to using Project Reactor (and a correct implementation), or whether it should be done in a different way. Please advise.
It is a correct approach when using Reactor, because you took care of offloading the blocking portion onto a dedicated Scheduler/Thread.
But I'd say mutually exclusive code like this is not a very good fit for reactive programming in general: you lose one of the key benefits of doing more with fewer threads, and you risk blocking other parts of the application should you forget to publishOn a dedicated thread, etc.
I have a process with 3 sequential user tasks (something like Task 1 -> Task 2 -> Task 3). So, to validate Task 3, I have to validate Task 1, then Task 2.
My goal is to implement a workaround to go back in the execution of a process instance using a Command, as suggested in this link. The problem is that I started to implement the command, but it does not work as I want. The algorithm should be something like:
Retrieve the task with the passed id
Get the process instance of this task
Get the historic tasks of the process instance
From the list of the historic tasks, deduce the previous one
Create a new task from the previous historic task
Make the execution point to this new task
Maybe clean up the task pointed to before the update
So, the code of my command looks like this:
public class MoveTokenCmd implements Command<Void> {
protected String fromTaskId = "20918";
public MoveTokenCmd() {
}
public Void execute(CommandContext commandContext) {
HistoricTaskInstanceEntity currentUserTaskEntity = commandContext.getHistoricTaskInstanceEntityManager()
.findHistoricTaskInstanceById(fromTaskId);
ExecutionEntity currentExecution = commandContext.getExecutionEntityManager()
.findExecutionById(currentUserTaskEntity.getExecutionId());
// Get process Instance
HistoricProcessInstanceEntity historicProcessInstanceEntity = commandContext
.getHistoricProcessInstanceEntityManager()
.findHistoricProcessInstance(currentUserTaskEntity.getProcessInstanceId());
HistoricTaskInstanceQueryImpl historicTaskInstanceQuery = new HistoricTaskInstanceQueryImpl();
historicTaskInstanceQuery.processInstanceId(historicProcessInstanceEntity.getId()).orderByExecutionId().desc();
List<HistoricTaskInstance> historicTaskInstances = commandContext.getHistoricTaskInstanceEntityManager()
.findHistoricTaskInstancesByQueryCriteria(historicTaskInstanceQuery);
int index = 0;
for (HistoricTaskInstance historicTaskInstance : historicTaskInstances) {
if (historicTaskInstance.getId().equals(currentUserTaskEntity.getId())) {
break;
}
index++;
}
if (index > 0) {
HistoricTaskInstance previousTask = historicTaskInstances.get(index - 1);
TaskEntity newTaskEntity = createTaskFromHistoricTask(previousTask, commandContext);
currentExecution.addTask(newTaskEntity);
commandContext.getTaskEntityManager().insert(newTaskEntity);
AtomicOperation.TRANSITION_CREATE_SCOPE.execute(currentExecution);
} else {
// TODO: find the last task of the previous process instance
}
// To overcome the "Task cannot be deleted because is part of a running
// process"
TaskEntity currentUserTask = commandContext.getTaskEntityManager().findTaskById(fromTaskId);
if (currentUserTask != null) {
currentUserTask.setExecutionId(null);
commandContext.getTaskEntityManager().deleteTask(currentUserTask, "jumped to another task", true);
}
return null;
}
private TaskEntity createTaskFromHistoricTask(HistoricTaskInstance historicTaskInstance,
CommandContext commandContext) {
TaskEntity newTaskEntity = new TaskEntity();
newTaskEntity.setProcessDefinitionId(historicTaskInstance.getProcessDefinitionId());
newTaskEntity.setName(historicTaskInstance.getName());
newTaskEntity.setTaskDefinitionKey(historicTaskInstance.getTaskDefinitionKey());
newTaskEntity.setProcessInstanceId(historicTaskInstance.getExecutionId());
newTaskEntity.setExecutionId(historicTaskInstance.getExecutionId());
return newTaskEntity;
}
}
But the problem is that I can see my task is created, yet the execution does not point to it; it still points to the current one.
I had the idea of using the activity (via the ActivityImpl object) and setting it on the execution, but I don't know how to retrieve the activity of my new task.
Can someone help me, please?
Unless something has changed significantly in the engine, the code in the link you reference should still work (I have used it on a number of projects).
That said, when scanning your code I don't see the most important command.
Once you have the current execution, you can move the token by setting the current activity.
Like I said, the code in the referenced article used to work and still should.
Greg
Referring to the same link in your question, I would personally recommend working with the design of your process. Use an exclusive gateway to decide whether the process should end or return to the previous task. If the generation of the task is dynamic, you can point to the same task and delete the local variable. Activiti has constructs to save you from implementing this yourself :).
I have a web service in WCF that consumes some external web services, so what I want to do is make this service asynchronous in order to release the thread, wait for the completion of all the external services, and then return the result to the client.
With Framework 4.0
public class MyService : IMyService
{
public IAsyncResult BeginDoWork(int count, AsyncCallback callback, object serviceState)
{
var proxyOne = new Gateway.BackendOperation.BackendOperationOneSoapClient();
var proxyTwo = new Gateway.BackendOperationTwo.OperationTwoSoapClient();
var taskOne = Task<int>.Factory.FromAsync(proxyOne.BeginGetNumber, proxyOne.EndGetNumber, 10, serviceState);
var taskTwo = Task<int>.Factory.FromAsync(proxyTwo.BeginGetNumber, proxyTwo.EndGetNumber, 10, serviceState);
var tasks = new Queue<Task<int>>();
tasks.Enqueue(taskOne);
tasks.Enqueue(taskTwo);
return Task.Factory.ContinueWhenAll(tasks.ToArray(), innerTasks =>
{
var tcs = new TaskCompletionSource<int>(serviceState);
int sum = 0;
foreach (var innerTask in innerTasks)
{
if (innerTask.IsFaulted)
{
tcs.SetException(innerTask.Exception);
callback(tcs.Task);
return;
}
if (innerTask.IsCompleted)
{
sum += innerTask.Result; // accumulate each service's result
}
}
tcs.SetResult(sum);
callback(tcs.Task);
});
}
public int EndDoWork(IAsyncResult result)
{
try
{
return ((Task<int>)result).Result;
}
catch (AggregateException ex)
{
throw ex.InnerException;
}
}
}
My questions here are:
1. This code is using three threads: one that is instantiated in BeginDoWork, another that is instantiated when the code enters the anonymous method in ContinueWhenAll, and the last when the callback is executed, in this case EndDoWork. Is that correct, or am I doing something wrong in the calls? Should I use any synchronization context? Note: the synchronization context is null on the main thread.
2. What happens if I "share" information between threads, for instance in the callback function? Will that cause a performance issue, or is the anonymous method like a closure where I can share data?
With Framework 4.5 and Async/Await
Now, with Framework 4.5, the code looks much simpler than before:
public async Task<int> DoWorkAsync(int count)
{
var proxyOne = new Backend.ServiceOne.ServiceOneClient();
var proxyTwo = new Backend.ServiceTwo.ServiceTwoClient();
var doWorkOne = proxyOne.DoWorkAsync(count);
var doWorkTwo = proxyTwo.DoWorkAsync(count);
var result = await Task.WhenAll(doWorkOne, doWorkTwo);
return doWorkOne.Result + doWorkTwo.Result;
}
But in this case when I debug the application, I always see that the code is executed on the same thread. So my questions here are:
3. When I'm awaiting the "awaitable" code, is that thread released to go back to the thread pool and execute more requests?
3.1. If so, I suppose that when I get a result from the awaited Task, the execution completes on the same thread as the call before. Is that possible? What happens if that thread is processing another request?
3.2. If not, how can I release the thread and send it back to the thread pool with the async and await pattern?
Thank you!
1. This code is using three threads: one that is instantiated in BeginDoWork, another that is instantiated when the code enters the anonymous method in ContinueWhenAll, and the last when the callback is executed, in this case EndDoWork. Is that correct, or am I doing something wrong in the calls? Should I use any synchronization context?
It's better to think in terms of "tasks" rather than "threads". You do have three tasks here, each of which will run on the thread pool, one at a time.
2. What happens if I "share" information between threads, for instance in the callback function? Will that cause a performance issue, or is the anonymous method like a closure where I can share data?
You don't have to worry about synchronization because each of these tasks can't run concurrently. BeginDoWork registers the continuation just before returning, so it's already practically done when the continuation can run. EndDoWork will probably not be called until the continuation is complete; but even if it is, it will block until the continuation is complete.
(Technically, the continuation can start running before BeginDoWork completes, but BeginDoWork just returns at that point, so it doesn't matter).
3. When I'm awaiting the "awaitable" code, is that thread released to go back to the thread pool and execute more requests?
Yes.
3.1. If so, I suppose that when I get a result from the awaited Task, the execution completes on the same thread as the call before. Is that possible? What happens if that thread is processing another request?
No. Your host (in this case, ASP.NET) may continue the async methods on any thread it happens to have available.
This is perfectly safe because only one thread is executing at a time.
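If you want to see this for yourself, here is a small self-contained console sketch (Task.Delay stands in for the external service calls) that logs the managed thread id before and after the await; with no SynchronizationContext, the two ids will frequently differ:
using System;
using System.Threading;
using System.Threading.Tasks;

class AwaitThreads
{
    static async Task<int> DoWorkAsync(int count)
    {
        Console.WriteLine("Before await: thread {0}",
            Thread.CurrentThread.ManagedThreadId);

        var results = await Task.WhenAll(FakeCallAsync(count), FakeCallAsync(count));

        // With no SynchronizationContext, the method resumes on whatever
        // thread-pool thread happens to complete the awaited task.
        Console.WriteLine("After await: thread {0}",
            Thread.CurrentThread.ManagedThreadId);

        return results[0] + results[1];
    }

    static async Task<int> FakeCallAsync(int count)
    {
        await Task.Delay(100); // stand-in for an external service call
        return count;
    }

    static void Main()
    {
        Console.WriteLine("Result: {0}", DoWorkAsync(10).Result);
    }
}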
P.S. I recommend
var result = await Task.WhenAll(doWorkOne, doWorkTwo);
return result[0] + result[1];
instead of
var result = await Task.WhenAll(doWorkOne, doWorkTwo);
return doWorkOne.Result + doWorkTwo.Result;
because Task.Result should be avoided in async programming.
Basically I have a static custom queue of objects I want to process. From multiple threads, I need to kick off a singular Task that will process the queued objects, stopping the task when all items are dequeued.
Some pseudocode:
static CustomQueue _customqueue;
static Task _processQueuedItems;
public static void EnqueueSomething(object something) {
_customqueue.Enqueue(something);
StartProcessingQueue();
}
static void StartProcessingQueue() {
if(_processQueuedItems == null) { // create the task only when one doesn't exist yet
_processQueuedItems = new Task(() => {
while(_customqueue.Any()) {
var stuffToDequeue = _customqueue.Dequeue();
/* do stuff */
}
});
_processQueuedItems.Start();
}
if(_processQueuedItems.Status != TaskStatus.Running) {
_processQueuedItems.Start();
}
}
If it makes a difference, my custom queue is a queue that essentially holds items until they reach a certain age, then allows them to dequeue. Every time an item is touched, its timer starts again. I know this piece works fine.
The part I'm struggling with is the parallelism. (Clearly, I don't know what I'm doing here.) What I want is to have one thread process the queue until it's complete, then go away. If another call comes in, it shouldn't start a new thread unless it has to.
I hope that explains my issue okay.
You might want to consider using BlockingCollection<T> here. You could make your custom queue implement IProducerConsumerCollection<T>, in which case BlockingCollection could use it directly.
You'd then just need to start a long-running Task that calls blockingCollection.GetConsumingEnumerable() and processes the items in a foreach. The task will automatically block when the collection is empty, and resume when a new item is added.
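Here is a minimal sketch of that arrangement, using a plain ConcurrentQueue<object> as the backing store; to keep your ageing behaviour, you would instead pass your custom queue (implementing IProducerConsumerCollection<object>) to the BlockingCollection constructor:
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

static class QueueProcessor
{
    static readonly BlockingCollection<object> _items =
        new BlockingCollection<object>(new ConcurrentQueue<object>());

    // A single long-running consumer for the life of the application.
    static readonly Task _consumer = Task.Factory.StartNew(() =>
    {
        // GetConsumingEnumerable blocks while the collection is empty
        // and resumes as soon as another thread calls Add.
        foreach (var item in _items.GetConsumingEnumerable())
        {
            /* do stuff */
        }
    }, TaskCreationOptions.LongRunning);

    public static void EnqueueSomething(object something)
    {
        _items.Add(something); // safe to call from any thread
    }
}
With this shape the consumer never really "goes away"; it just blocks while the collection is empty, which sidesteps the start/restart race in StartProcessingQueue.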