I have a usage issue I need some advice on.
I have a process with a main flow which loops, retrying a task every n hours until either a condition is met or a timeout is reached. So far so good.
There is a transactional sub process triggered to run in parallel to this main loop which, for as long as this main loop is active, carries out its own looping behaviour (every x days). This second loop should run for as long as the main loop continues, and be killed as soon as the main loop reaches one of its progression criteria.
The way I'd like to model it would be to use a message/signal throw event from the main flow after it has passed its progress criteria, with a corresponding catch message/signal as a boundary event on the sub process, which then triggers a sub process end/terminate event inside the boundaries of the sub process.
I've looked long and hard at resources and the standard, and I can't see any examples of people using boundary events in this way (as an input from outside the sub process, leading to an end event inside the sub process). Any idea if this is valid?
If not valid, anyone have a better method for having a main flow kill a sub process in this way?
Main Process: Start, parallel gateway (fork), first branch contains subprocess 1, second branch contains subprocess 2, exclusive gateway (join), end.
Subprocess 1: Start, loop, exit from the loop under some condition, then end.
Subprocess 2: Start, loop, no end node.
This way, subprocess 2 can't end the looping on its own. But subprocess 1 can end, and via the exclusive join gateway, subprocess 2 will be ended as well.
I'm not quite sure whether a parallel fork followed by an exclusive join is formally allowed in BPMN, but some tools can handle it; I received this hint from a tool vendor (Bonita).
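For illustration, here is a minimal, hypothetical BPMN 2.0 XML sketch of that shape (ids are invented and the subprocess internals are elided):

<process id="mainProcess">
  <startEvent id="start"/>
  <parallelGateway id="fork"/>   <!-- parallel fork -->
  <subProcess id="sub1"/>        <!-- loop with an exit condition and an end event -->
  <subProcess id="sub2"/>        <!-- loop with no end node -->
  <exclusiveGateway id="join"/>  <!-- exclusive join -->
  <endEvent id="end"/>
  <sequenceFlow id="f1" sourceRef="start" targetRef="fork"/>
  <sequenceFlow id="f2" sourceRef="fork"  targetRef="sub1"/>
  <sequenceFlow id="f3" sourceRef="fork"  targetRef="sub2"/>
  <sequenceFlow id="f4" sourceRef="sub1"  targetRef="join"/>
  <sequenceFlow id="f5" sourceRef="sub2"  targetRef="join"/>
  <sequenceFlow id="f6" sourceRef="join"  targetRef="end"/>
</process>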
I have a situation here in my code where all tasks are running with same priority based on round robin (with fixed time slice of 50ms) scheduling algorithm. Now I want to run one particular task say Task A, exactly within a time period of 10ms to update some communication db. Since,current scheduling is based on round robin with fixed time slice of 50ms due to that the Task A is not able to get called exactly in 10ms. I am not getting any solution to cope up with the current problem.
Not exactly sure what you are asking here. If you do not want Task A to run longer than 10ms, and you know that you will return from your communication functions in less time than that, you can take a time reading at the beginning of Task A and call osThreadYield() from Task A once you hit 10ms (busy loop).
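A minimal sketch of that busy loop, assuming the CMSIS-RTOS v1 API (the task body is an invented placeholder):

#include "cmsis_os.h"  // CMSIS-RTOS v1

// Hypothetical Task A: record a start time, do the (fast) communication
// work, then busy-wait until 10 ms have elapsed and yield the CPU.
void TaskA(void const *argument) {
    uint32_t start = osKernelSysTick();
    // ... communication functions, known to return well within 10 ms ...
    while ((osKernelSysTick() - start) < osKernelSysTickMicroSec(10000))
        ;                    // busy loop until the 10 ms mark
    osThreadYield();         // give up the rest of the time slice
}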
If you are somewhere in Task B and need to call Task A in exactly 10ms, it becomes a bit more complicated, since you don't know which thread may preempt your Task B at that time. What you can try is this: in Task B, keep a handle to Task A. Then, when you are ready to wait 10ms, do the following:
#include "cmsis_os.h"  // CMSIS-RTOS v1

osThreadId id;
id = osThreadGetId();                         // id of the currently running thread
osThreadSetPriority(id, osPriorityRealtime);  // make sure we get back here quickly
osWait(10);                                   // wait 10 ms
osThreadSetPriority(id, osPriorityNormal);    // go back to normal priority
// If you need to create Task A, do so here; otherwise you can
// use osSignalSet here and osSignalWait in Task A.
You can also directly create Task A, set its priority to osPriorityRealtime, yield from Task B, and have the first call in Task A be osWait(10). As soon as it returns, set its priority back to normal.
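A sketch of that alternative, again assuming the CMSIS-RTOS v1 API (TaskA, TaskB, and the db update are invented placeholders):

#include "cmsis_os.h"  // CMSIS-RTOS v1

void TaskA(void const *argument) {
    osWait(10);                                              // first thing: wait 10 ms
    osThreadSetPriority(osThreadGetId(), osPriorityNormal);  // back to normal
    // ... update the communication db here ...
}
osThreadDef(TaskA, osPriorityRealtime, 1, 0);                // 0 = default stack size

void TaskB(void const *argument) {
    osThreadCreate(osThread(TaskA), NULL);  // create Task A at realtime priority
    osThreadYield();                        // let Task A run immediately
    // ...
}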
EDIT: I realized that I, unfortunately, overlooked a semicolon at the end of the while statement in the first example code and misinterpreted the code myself. There is in fact an empty loop for threads with threadIdx.x != s, a convergence point after that loop, and a thread waiting at this point for all the others without incrementing the s variable. I am leaving the original (uncorrected) question below for anyone interested in it. Be aware that a semicolon is missing at the end of the second line of the first example, so s++ is not actually part of the loop body.
--
We were studying serialization in our CUDA lesson, and our teacher told us that code like this:
__shared__ int s = 0;
while (s != threadIdx.x)
    s++; // serialized code
would end up with a HW deadlock because the nvcc compiler puts a reconvergence point between the while (s != threadIdx.x) and s++ statements. If I understand it correctly, this means that once the reconvergence point is reached by a thread, this thread stops execution and waits for the other threads until they reach the point too. In this example, however, this never happens, because thread #0 enters the body of the while loop, reaches the reconvergence point without incrementing the s variable and other threads get stuck in an endless loop.
A working solution should be the following:
__shared__ int s = 0;
while (s < blockDim.x)
    if (threadIdx.x == s)
        s++; // serialized code
Here, all threads within a block enter the body of the loop, all evaluate the condition, and only thread #0 increments the s variable in the first iteration (and the loop goes on).
My question is: why does the second example work if the first hangs? To be more specific, the if statement is just another point of divergence and, at the assembly level, should compile to the same conditional jump instruction as the loop condition. So why isn't there any reconvergence point before s++ in the second example? Has it in fact been placed immediately after the statement?
In other sources I have only found that divergent code is computed independently for every branch. E.g., in an if/else statement, first the if branch is computed with all else-branch threads masked within the same warp, and then the remaining threads compute the else branch while the first group waits. There is a reconvergence point after the if/else statement. Why then does the first example freeze, when the loop could be split into two branches (a true branch for one thread and a waiting false branch for all the others in a warp)?
It does not make sense to put the reconvergence point between while (s != threadIdx.x) and s++;. It disrupts the program flow, since the reconvergence point for a piece of code should be reachable by all threads at compile time. (The original answer included a flowchart of the first piece of code, marking the possible and impossible points of reconvergence.)
Regarding this answer about recording the convergence point via the SSY instruction, I created the simple kernel below, resembling your first piece of code,
__global__ void kernel_1() {
    __shared__ int s;
    if (threadIdx.x == 0)
        s = 0;
    __syncthreads();
    while (s == threadIdx.x)
        s++; // serialized code
}
and compiled it for CC=3.5 with -O3. I then used the cuobjdump binary tool on the output to inspect the generated CUDA assembly.
I'm not an expert in reading CUDA assembly, but I can see the while-loop condition checks at lines 0038 and 00a0. At line 00a8, it branches to 0x80 if the while condition is satisfied and executes the code block again. The reconvergence point is introduced at line 0058, designating line 0xb8, just after the loop condition check near the exit, as the reconvergence point.
Overall, it is not clear what you're trying to achieve with this piece of code. Also, in the second piece of code, the reconvergence point should again be after the while-loop block (I don't mean between the while and the if).
The reason why it "hangs" is neither a HW deadlock nor branching, at least not directly. You produce an endless loop for one or multiple threads (as already suspected).
In your example, there isn't really a convergence point. Since you do not use any synchronization, there aren't any threads that actually wait. What happens here with the while-loop is pretty much a busy-wait.
A kernel only finishes if all threads return. Since you have one (or multiple) endless loops (by accident maybe even none, though this is unlikely), the kernel will never finish.
You declared a shared variable s. This variable is known to all threads within a block.
With your while-statement you basically say (to each thread): increment s until it reaches the value of your (local) thread id. Since all threads are incrementing s in parallel, you introduce race conditions.
Example:
Thread 5 is looping, checking for s to become 5.
s is 4.
Two threads increment s; it becomes 6.
At the same time, thread 5 has only just reached the end of its loop body.
It now starts the next loop iteration, checks s, and it's not 5.
Thread 5 will never be able to finish, since the check uses == and the value of s has already exceeded the value of the thread id.
Your solution is also quite confusing, because each thread executes the serialized code consecutively (which probably was the intention after all, even though that is strange in itself):
Thread 0 will execute the serialized code
After that, thread 1 will execute the serialized code
and so on
Most examples show a program where each thread works on some code, then all threads are synchronized, and only a single thread executes some more code (maybe it needed the results of all threads).
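For instance, here is a minimal sketch of that conventional pattern (a hypothetical kernel, not from the question):

__global__ void conventional_pattern(const int *in, int *blockSum)
{
    __shared__ int partial[256];   // assumes blockDim.x <= 256

    // 1) Every thread works on its own element.
    partial[threadIdx.x] = in[blockIdx.x * blockDim.x + threadIdx.x] * 2;

    // 2) All threads in the block synchronize.
    __syncthreads();

    // 3) Only a single thread executes the serial part, using all results.
    if (threadIdx.x == 0) {
        int sum = 0;
        for (int i = 0; i < blockDim.x; ++i)
            sum += partial[i];
        blockSum[blockIdx.x] = sum;
    }
}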
So, your second example "works" because no thread is stuck in an endless loop. However, I can't think of a reason why anyone would use such code, since it is confusing and, well, not parallel at all.
An example of this problem is when a user creates or deletes a resource: we perform the operation and also increment (or decrement) a counter cache.
In testing, there is sometimes a race condition where the counter cache has not yet been updated by the goroutine.
EDIT: Sorry about the confusion. To clarify: the counter cache is not in memory, it is actually a field in the database. The race condition is not on a variable in memory; it is that the goroutine might be slow to write to the database itself!
I currently use a 1 second sleep after the operation to ensure that the counter cache has been updated before testing it. Is there another way to test the goroutine without the arbitrary 1 second sleep waiting for it to finish?
In testing, there is sometimes a race condition where the counter cache has not been updated by the go routine. I currently use a 1 second sleep after the operation to ensure that the counter cache has been updated before testing the counter cache.
Yikes, I hate to say it, but you're doing it wrong. Go has first-class features to make concurrency easy! If you use them correctly, it's impossible to have race conditions.
In fact, there's a tool that will detect races for you: the Go race detector (run your tests with go test -race). I'll bet it complains about your program.
One simple solution (sketched in code after this list):
Have the main routine create a goroutine for keeping track of the counter.
The goroutine will just do a select and get a message to increment/decrement or read the counter. (For a read, it is passed a channel on which to return the number.)
When you create/delete resources, send an appropriate message to the counter goroutine via its channel.
When you want to read the counter, send a read message, and then read from the return channel.
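A minimal sketch of this design (hypothetical names; the real code would live alongside your resource operations):

package main

import "fmt"

// readReq asks the counter goroutine for the current value,
// which is sent back on resp.
type readReq struct {
	resp chan int
}

func main() {
	change := make(chan int)   // send +1 / -1 here
	read := make(chan readReq) // send read requests here

	// The counter goroutine owns the value, so there are no data races.
	go func() {
		count := 0
		for {
			select {
			case d := <-change:
				count += d
			case r := <-read:
				r.resp <- count
			}
		}
	}()

	change <- 1  // user creates a resource
	change <- -1 // user deletes a resource
	change <- 1  // another create

	r := readReq{resp: make(chan int)}
	read <- r
	fmt.Println(<-r.resp) // prints 1
}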
(Another alternative would be to use locks. It would be a tiny bit more performant, but much more cumbersome to write and ensure it's correct.)
One solution is to let your counter offer a channel which is updated as soon as the value changes. In Go it is common practice to synchronize by communicating the result. For example, your Counter could look like this:
type Counter struct {
	value       int
	ValueChange chan int
}

func (c *Counter) Change(n int) {
	c.value += n
	c.ValueChange <- c.value
}
Whenever Change is called, the new value is passed through the channel, and whoever is waiting for the value unblocks and continues execution, thereby synchronizing with the counter. With this code you can listen on ValueChange for changes like this:
v := <-c.ValueChange
Concurrently calling c.Change is no problem anymore.
There is a runnable example on play.
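For illustration, here is a minimal self-contained usage of that Counter (a hypothetical driver, not the linked playground example):

package main

import "fmt"

type Counter struct {
	value       int
	ValueChange chan int
}

func (c *Counter) Change(n int) {
	c.value += n
	c.ValueChange <- c.value
}

func main() {
	c := &Counter{ValueChange: make(chan int)}

	go c.Change(1) // runs concurrently; blocks until the new value is read

	v := <-c.ValueChange // synchronizes with the counter
	fmt.Println(v)       // prints 1
}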
I am going through process synchronization and facing difficulty in understanding semaphores. Here is my doubt.
The source says that "Semaphore S is an integer variable that is accessed through standard atomic operations, i.e. wait() and signal()."
It also provides a basic definition of wait():
wait(Semaphore S)
{
    while (S <= 0)
        ;  // no operation
    S--;
}
and a definition of signal():
signal(Semaphore S)
{
    S++;
}
Let the initial value of the semaphore be 1, and say there are two concurrent processes P0 and P1 which are not supposed to execute their critical sections simultaneously.
Now say P0 is in its critical section, so the semaphore S must have value 0. Now say P1 wants to enter its critical section, so it executes wait(), and in wait() it loops continuously. To exit the loop, the semaphore value must be incremented, but that may not be possible: according to the source, wait() is an atomic operation that can't be interrupted, so process P0 can't call signal() on a single-processor system.
I want to know whether my understanding so far is correct, and if it is, how can process P0 call signal() while process P1 is stuck in the while loop?
I think the top-voted answer is inaccurate!
The wait() and signal() operations must be completely atomic; no two processes can execute wait() or signal() simultaneously, because they are implemented in the kernel and processes in kernel mode cannot be preempted.
If several processes attempt a P(S) simultaneously, only one process will be allowed to proceed (a non-preemptive kernel that is free of race conditions).
For the above implementation to work, preemption is necessary (a preemptive kernel).
Read about the atomicity of semaphore operations:
http://personal.kent.edu/~rmuhamma/OpSystems/Myos/semaphore.htm
https://en.wikibooks.org/wiki/Operating_System_Design/Processes/Semaphores
I think it's an inaccuracy in your source. Atomic for the wait() operation means each iteration of it is atomic, i.e. S-- is performed without interruption, but the whole operation is interruptible between iterations of the while loop.
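To illustrate that reading, here is a hypothetical sketch using C11 atomics (not from the source): each test-and-decrement attempt is atomic, but signal() can run between attempts.

#include <stdatomic.h>

void sem_wait_sketch(atomic_int *S) {
    for (;;) {
        int v = atomic_load(S);
        // Attempt the check and the decrement as one atomic step.
        if (v > 0 && atomic_compare_exchange_weak(S, &v, v - 1))
            return;  // we atomically performed S--
        // Busy-wait: another process may execute signal() here.
    }
}

void sem_signal_sketch(atomic_int *S) {
    atomic_fetch_add(S, 1);  // S++
}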
I don't think keeping an infinite while loop inside the wait() operation is wise. I would go with Stallings' version:
void semWait(semaphore s) {
    s.count--;
    if (s.count < 0) {
        /* place this process in s.queue and block it */
    }
}
I think what the book means by an atomic operation is that testing S<=0 together with S-- is atomic, just like the testAndSet() it mentions before.
If the two operations, the test S<=0 and S--, are individually atomic but can be interrupted by another process between them, this method won't work.
Imagine two processes P0 and P1. If P0 wants to enter the critical section, finds the test S<=0 false (S is positive), and is then interrupted by P1, which also finds S<=0 false, then both processes will decrement S and enter the critical section. And that's wrong.
The part that is actually not atomic is the waiting inside the while loop: even if the loop body is empty, another process can still interrupt the current one while it spins, which lets that other process continue its work in the critical section and release the lock.
However, I think the code from the book cannot actually be used in an OS, since I don't know how to make the test S<=0 and the decrement S-- atomic together. A more plausible way is to put the S-- inside the while loop, as SomeWittyUsername said.
When a task attempts to acquire a semaphore that is unavailable, the semaphore places the task onto a wait queue and puts the task to sleep. The processor is then free to execute other code. When the semaphore becomes available, one of the tasks on the wait queue is awakened so that it can then acquire the semaphore.
while (S <= 0)
    ;  // no operation
This doesn't mean that the processor actually keeps running this code. The process/task is blocked until it gets the semaphore.
I think that when process P1 is stuck in the while loop, it will be in the wait state. The processor will switch between processes P0 and P1 (context switching), so P0 gets to run: it calls signal(), S is incremented by 1, and P0 exits its critical section. Process P1 can then enter its critical section, and mutual exclusion is preserved.
I want to create a small delay so that my first set of code runs smoothly.
How can I do that in VB.NET?
Edit 1
Suppose I have a few lines of code like this
..................Statement Line 1..............
..................Statement Line 2..............
..................Statement Line 3..............
..................Statement Line 4..............
..................Statement Line 5..............
WAIT UNTIL STATEMENT 5 IS COMPLETED
..................Statement Line 6..............
..................Statement Line 7..............
..................Statement Line 8..............
..................Statement Line 9..............
..................Statement Line 10.............
Only when the execution of the first five statements is complete can the next five be executed.
First: given the code sample you've provided, line 6 will not execute until line 5 finishes. You don't need to do anything, unless line 5 is kicking off an external application or creating a new thread.
Beyond that:
Thread.Sleep will introduce a delay, but more often than not it really isn't what you are looking for.
If you use Thread.Sleep, the executing thread will sleep for however long you tell it. But your sample code indicates you want the thread to wait UNTIL some condition is met. Assuming you are waiting on a condition that happens outside the thread you are sleeping, at best you'd end up with a loop that keeps sleeping for X milliseconds and then checking the condition.
There are other approaches that are easier (in the long run) and more robust than that. If you truly want something to happen on another thread and to be alerted by its completion, consider the BackgroundWorker class:
http://msdn.microsoft.com/en-us/library/system.componentmodel.backgroundworker.aspx
It's very handy for simple multithreading tasks. You create a BackgroundWorker and handle its 'DoWork' event with the logic to happen on the new thread, and handle its 'RunWorkerCompleted' event. You call 'RunWorkerAsync' to start the process, and when the 'RunWorkerCompleted' event fires, you can continue execution of line 6.
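A minimal hypothetical sketch of that approach (StartWork and the statement placeholders are invented):

Imports System.ComponentModel

Public Class Form1
    Private WithEvents worker As New BackgroundWorker()

    Private Sub StartWork()
        worker.RunWorkerAsync() ' start the work on another thread
    End Sub

    Private Sub worker_DoWork(sender As Object, e As DoWorkEventArgs) Handles worker.DoWork
        ' ... Statement Lines 1-5 run here, on the worker thread ...
    End Sub

    Private Sub worker_RunWorkerCompleted(sender As Object, e As RunWorkerCompletedEventArgs) Handles worker.RunWorkerCompleted
        ' ... Statement Lines 6-10 run only after the work has finished ...
    End Sub
End Class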
I hope that makes sense/helps.
Are you looking to make a thread sleep?
Thread.Sleep(100)
Where 100 is the number of milliseconds you want the thread to sleep for.
Also make sure to have Imports System.Threading, which I assume you have if you already have multiple threads.
EDIT: Okay, so you've added a bit of code. Still, this comes down to whether you have more than one thread running, and from your question it looks like it's all in one thread. In that case, statement 5 will always finish before statement 6 runs; that's how code works. The only case where it wouldn't is if one of statements 1-5 spawns something on a new thread.
I think
Application.DoEvents()
should do that.
Use the BackgroundWorker object's "IsBusy" property and do not allow line 6 to execute while the worker process is still busy.
Read more about BackgroundWorker here