Raku .hyper() and .race() example not working - raku

The following example code should accelerate the execution of a Raku program:
for (1..4).race() {
    say "Doing $_";
    sleep 1;
}
say now - INIT now;
I remember that it worked some time ago, but now I always end up with a runtime of 4 seconds. Using .hyper() instead or adding parameters doesn't change anything. What do I have to do to run 2 processes at the same time?

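You should use race with the named argument batch and the statement prefix race. The default batch size is 64, so all four values land in a single batch that is handled by one worker, and a plain for loop walks even a raced sequence one item at a time; the race prefix is what lets the loop body itself run in parallel.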
say race for (1..4).race(batch=>1) {
    say "Doing $_";
    sleep 1.rand; $_
}
say now - INIT now;
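To see why the batch argument matters, here is a minimal sketch in Python (not Raku) of how a list is cut into batches before being handed to workers. The function name split_into_batches is made up for illustration, and the batch size of 64 is assumed to be the default of your Rakudo version.

```python
from itertools import islice


def split_into_batches(items, batch_size):
    """Split items into consecutive batches of at most batch_size elements."""
    iterator = iter(items)
    batches = []
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return batches
        batches.append(batch)


# Four items with a batch size of 64 (assumed default): a single batch,
# so only one worker ever has something to do.
print(split_into_batches(range(1, 5), 64))

# Four items with a batch size of 1: four batches that four workers can share.
print(split_into_batches(range(1, 5), 1))
```

With only one batch there is nothing to distribute, which matches the 4 seconds of runtime seen in the question.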

Related

Go-lang Testing, What is the meaning of Parallel Setting on Benchmark?

The documentation at https://golang.org/pkg/testing/ says that we can use the testing.B.RunParallel() function to run a benchmark in a parallel setting. I tried to write some simple testing code:
func BenchmarkFunctionSome(b *testing.B) {
	for i := 0; i < b.N; i++ {
		SomeFunction()
	}
}
and then I changed it to use RunParallel():
func BenchmarkFunctionSome(b *testing.B) {
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			SomeFunction()
		}
	})
}
The version that uses RunParallel() is slower than the first benchmark.
What does the parallel setting actually mean in benchmarking? Why did it become slower when I used RunParallel()?
The for loop in the first benchmark runs all iterations sequentially, one at a time, and the reported performance is the elapsed time divided by the number of iterations.
The RunParallel benchmark divides the b.N iterations among several goroutines (by default as many as your GOMAXPROCS setting) that run concurrently. The performance is calculated the same way: elapsed wall-clock time divided by the total number of iterations. This is especially helpful in testing functions with shared resources and locking, which may run fine solo but introduce performance issues when run concurrently.
It depends on what is inside SomeFunction(). If it's just an empty function or a simple calculation, the serial benchmark will be faster, since coordinating the goroutines costs more than the work itself. But if it's a heavy computation or I/O, the RunParallel() benchmark will be faster.
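The trade-off can be reproduced outside Go. The following is a rough Python analogue, not how the Go testing package is implemented: the helper names (seconds_per_op, serial, parallel, io_bound) are invented for this sketch, and a thread pool stands in for the goroutines that RunParallel starts.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def seconds_per_op(run, n):
    """Time run(n) and return wall-clock seconds divided by n iterations."""
    start = time.perf_counter()
    run(n)
    return (time.perf_counter() - start) / n


def serial(work):
    """Build a runner that calls work() n times, one call after another."""
    def run(n):
        for _ in range(n):
            work()
    return run


def parallel(work, workers):
    """Build a runner that spreads n calls of work() over a thread pool."""
    def run(n):
        # Leaving the with block waits until every submitted call has finished.
        with ThreadPoolExecutor(max_workers=workers) as pool:
            for _ in range(n):
                pool.submit(work)
    return run


def io_bound():
    """Stand-in for a function that spends its time waiting on I/O."""
    time.sleep(0.01)


if __name__ == "__main__":
    print("serial  :", seconds_per_op(serial(io_bound), 20))
    print("parallel:", seconds_per_op(parallel(io_bound, 4), 20))
```

For a function that does almost nothing, the cost of submitting work to the pool dominates and the parallel figure is usually the worse one, which is the effect described in the question.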

Sleep blocks whole program (Smalltalk Squeak)

I'm solving the N*N queens problem with a GUI.
I want the GUI to pause for x seconds after each move of every queen. The problem is that the program just stacks all the waits together and then runs everything at full speed.
I'm giving the code here: http://pastebin.com/s2VT0E49
EDIT:
This is my workspace:
board := MyBoard new initializeWithStart: 8.
Transcript show:'something'.
3 seconds asDelay wait.
board solve.
3 seconds asDelay wait.
board closeBoard.
This is where I want the wait to happen:
canAttack: testRow x: testColumn
    | columnDifference squareMark |
    columnDifference := testColumn - column.
    ((row = testRow
        or: [row + columnDifference = testRow])
        or: [row - columnDifference = testRow]) ifTrue: [
            squareDraw := squareDraw color: Color red.
            0.2 seconds asDelay wait.
            ^ true ].
    squareDraw := squareDraw color: Color black.
    ^ neighbor canAttack: testRow x: testColumn
Since you're using Morphic you should use stepping for animation, not processes or delays. In your Morph implement a step method. This will be executed automatically and repeatedly. Also implement stepTime to answer the interval in milliseconds, e.g. 4000 for every 4 seconds.
Inside the step method, calculate your new state. If each queen is modeled as a separate Morph and you just move the positions, then Morphic will take care of updating the screen. If you have your own drawOn: method then call self changed in your step method so that Morphic will later invoke your drawing code.
See this tutorial: http://static.squeak.org/tutorials/morphic-tutorial-1.html
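The idea behind stepping can be sketched independently of Morphic. The Python example below is only an illustration under assumptions: solve_steps and SteppedBoard are hypothetical names, and the step method plays the role of the Morph's step, which the framework would call once per stepTime interval. The solver never sleeps; it hands out one board state per call.

```python
def solve_steps(n):
    """Yield the board (queen column per row) after every placement or removal."""
    cols = []

    def safe(col):
        row = len(cols)
        return all(c != col and abs(c - col) != row - r
                   for r, c in enumerate(cols))

    def place():
        if len(cols) == n:
            return True
        for col in range(n):
            if safe(col):
                cols.append(col)
                yield list(cols)      # a queen was placed
                if (yield from place()):
                    return True
                cols.pop()
                yield list(cols)      # the queen was taken back
        return False

    yield from place()


class SteppedBoard:
    """Holds the solver and advances it by exactly one move per step call."""

    def __init__(self, n):
        self._steps = solve_steps(n)
        self.cols = []
        self.done = False

    def step(self):
        """Called once per tick by the UI framework; it never blocks."""
        try:
            self.cols = next(self._steps)
        except StopIteration:
            self.done = True


if __name__ == "__main__":
    board = SteppedBoard(4)
    while not board.done:
        board.step()
        print(board.cols)
```

Because each call returns immediately, the UI process stays free to redraw between moves.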
The process you're suspending is the one your program is running in. This process also happens to be the UI process. So when you suspend your program you also suspend the UI and therefore the UI elements never get a chance to update themselves. Try running your program in a separate process:
[ MyProgram run ] forkAt: Processor userBackgroundPriority.
Note that the UI process usually runs at priority 40. #userBackgroundPriority is 30. This makes sure that you can't lock up the UI.
To make your workspace code work insert this before the delay:
World doOneCycle.
This will cause the Morphic world to be redisplayed.
Note that this is a quick-and-very-dirty hack and not the proper way to do it (see my other answer). Delays block the whole UI process, whereas the whole point of Morphic is that you can do many things simultaneously while your code is executing.

PLC Object Oriented Programming - Using methods

I'm writing a program for a Schneider PLC using structured text, and I'm trying to do it using object oriented programming.
Being a newbie in PLC programming, I wrote a simple test program such as this:
okFlag:=myObject.aMethod();
IF okFlag THEN
// it's ok, go on
ELSE
// error handling
END_IF
aMethod must perform some operations, wait for the result (there is a "time-out" check to avoid deadlocks) and return TRUE or FALSE.
This is what I expected during program execution:
1) when okFlag:=myObject.aMethod(); is reached, the code inside aMethod is executed until a result is returned. When I say "executed" I mean that in the next scan cycle the execution of aMethod continues from the point it had reached before.
2) the result of the method call is checked and the main flow of the program is executed
and this is what happens:
1) aMethod is executed but the program flow continues. That is, when it reaches the end of aMethod a value is returned, even if the events that aMethod should wait for are still executing.
2) on the next cycle, aMethod is called again and restarts from the beginning
This is the first solution I found:
VAR_STATIC
    imBusy: BOOL
END_VAR
METHOD aMethod: INT;
IF NOT(imBusy) THEN
    imBusy:=FALSE;
    aMethod:=-1; // result of method while in progress
ELSE
    aMethod:=-1;
    <rest of code. If everything is ok, the result is 0, otherwise is 1>
END_IF
imBusy:=aMethod<0;
and the main program:
CASE (myObject.aMethod()) OF
0: // it's ok, go on
1: // error handling
ELSE
// still executing...
END_CASE
and this seems to work, but I don't know if it's the right approach.
There are some libraries from Schneider whose methods return a boolean and seem to work the way I expected in my program. That is: when the cycle reaches the call to the method for the first time, the program flow is "deviated" somehow so that in the next cycle it enters the method again until it's finished. Is there a way to get this behaviour?
Generally, OOP isn't the approach that people would take when using IEC 61131 languages. Your best bet is probably to implement your code as a state machine. I've used this approach in the past as a way of simplifying a complex sequence so that it is easier for plant maintainers to interpret.
Typically what I would recommend if you are going to take this approach is to try to segregate the state machine itself from your working code; you can implement a state machine of X steps, and then have your working code reference the state machine step.
A simple example might look like:
stepNo := 0; (* initial value only; do not execute this on every scan *)
IF (start AND stepNo = 0) THEN
    stepNo := 1;
END_IF;
(* there's a shortcut Unity operation for resetting this array to zeroes which is faster, but I can't remember it off the top of my head... *)
ActiveStep := BlankStepArray;
IF stepNo > 0 THEN
    IF StepComplete[stepNo] THEN
        stepNo := stepNo + 1;
    END_IF;
    ActiveStep[stepNo] := true;
END_IF;
Then in other code sections you can put...
IF ActiveStep[1] THEN
    (* Do something *)
    StepComplete[1] := true;
END_IF;
IF ActiveStep[2] THEN
    (* Do Something *)
    StepComplete[2] := true;
END_IF;
(* etc *)
The nice thing about this approach is that you can actually put all of the state machine code (including jumps, resets etc) into a DFB, test it and then shelve it, and then just use the active step, step complete, and any other inputs you require.
Your code is still always going to execute an entire section of logic, but if you really want to avoid that then you'll have to use a lot of IF statements, which will impede readability.
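To make the scan-by-scan behaviour easier to follow, here is a small Python model of the same sequencer. It is only a sketch: the class name StepSequencer and the scan method are made up, and returning to step 0 once the last step completes is an assumption of this sketch that the Structured Text above does not spell out.

```python
class StepSequencer:
    """Scan-cycle model of a step sequencer; call scan() once per PLC scan."""

    def __init__(self, n_steps):
        self.n_steps = n_steps
        self.step_no = 0  # 0 means idle
        # Index 0 is unused so that step numbers match the array indices.
        self.step_complete = [False] * (n_steps + 1)
        self.active_step = [False] * (n_steps + 1)

    def scan(self, start):
        """Advance at most one step and publish which step is active."""
        if start and self.step_no == 0:
            self.step_no = 1
        self.active_step = [False] * (self.n_steps + 1)
        if self.step_no > 0:
            if self.step_complete[self.step_no]:
                self.step_no += 1
            if self.step_no > self.n_steps:
                # Assumption: after the last step the sequence goes idle again.
                self.step_no = 0
                self.step_complete = [False] * (self.n_steps + 1)
            else:
                self.active_step[self.step_no] = True


if __name__ == "__main__":
    seq = StepSequencer(2)
    seq.scan(start=True)
    print(seq.step_no, seq.active_step)
    seq.step_complete[1] = True  # the working code reports step 1 as done
    seq.scan(start=False)
    print(seq.step_no, seq.active_step)
```

Every call to scan returns straight away, so nothing ever waits inside a method; the waiting is expressed by staying in the same step for several scans.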
Hope that helps.
Why not use SFC? It makes your life easier in many cases, since it is itself a state machine language. Do a subprogram, wait for a condition, do another... rinse and repeat. :)
Don't stick to ST alone; the other IEC languages are better suited to some tasks, and you should keep things as clear as possible. There should not be as much "this is my cake" mentality in industrial PLC programming circles as there is in many other programming fields, since an application's lifetime can be 40 years, you may have left the firm 20 years earlier for a better job, and programs are almost always location-, customer- or at least hardware-specific.
http://www.automation.com/pdf_articles/IEC_Programming_Thayer_L.pdf

Understanding CUDA serialization and reconvergence point

EDIT: I realized that I unfortunately overlooked a semicolon at the end of the while statement in the first example code and misinterpreted it myself. So there is in fact an empty loop for threads with threadIdx.x != s, a convergence point after that loop, and a thread waiting at this point for all the others without incrementing the s variable. I am leaving the original (uncorrected) question below for anyone interested in it. Be aware that there is a semicolon missing at the end of the second line in the first example, and thus s++ has nothing to do with the loop body.
--
We were studying serialization in our CUDA lesson and our teacher told us that a code like this:
__shared__ int s = 0;
while (s != threadIdx.x)
s++; // serialized code
would end up with a HW deadlock because the nvcc compiler puts a reconvergence point between the while (s != threadIdx.x) and s++ statements. If I understand it correctly, this means that once the reconvergence point is reached by a thread, this thread stops execution and waits for the other threads until they reach the point too. In this example, however, this never happens, because thread #0 enters the body of the while loop, reaches the reconvergence point without incrementing the s variable and other threads get stuck in an endless loop.
A working solution should be the following:
__shared__ int s = 0;
while (s < blockDim.x)
if (threadIdx.x == s)
s++; // serialized code
Here, all threads within a block enter the body of the loop, all evaluate the condition and only thread #0 increments the s variable in the first iteration (and loop goes on).
My question is, why does the second example work if the first hangs? To be more specific, the if statement is just another point of divergence and in terms of the Assembler language should be compiled into the same conditional jump instruction as the condition in the loop. So why isn't there any reconvergence point before s++ in the second example and has it in fact gone immediately after the statement?
In other sources I have only found that a divergent code is computed independently for every branch - e.g. in an if/else statement, first the if branch is computed with all else-branched threads masked within the same warp and then the other threads compute the else branch while the first wait. There's a reconvergence point after the if/else statement. Why then does the first example freeze, not having the loop split into two branches (a true branch for one thread and a waiting false branch for all the others in a warp)?
Thank you.
It does not make sense to put the reconvergence point between while (s != threadIdx.x) and s++;. That would disrupt the program flow, since the reconvergence point of a piece of code is fixed at compile time and must be reachable by all threads. The picture below shows the flowchart of your first piece of code and the possible and impossible points of reconvergence.
Regarding this answer about recording the convergence point via the SSY instruction, I created the simple kernel below, resembling your first piece of code
__global__ void kernel_1() {
    __shared__ int s;
    if(threadIdx.x==0)
        s = 0;
    __syncthreads();
    while (s == threadIdx.x)
        s++; // serialized code
}
and compiled it for CC=3.5 with -O3. I then ran the cuobjdump tool on the output binary to observe the CUDA assembly. The result is:
I'm not an expert in reading CUDA assembly but I can see while loop condition checks in lines 0038 and 00a0. At line 00a8, it branches to 0x80 if it satisfies the while loop condition and executes the code block again. The introduction of the reconvergence point is at line 0058 introducing line 0xb8 as the reconvergence point which is after the loop condition check near the exit.
Overall, it is not clear what you're trying to achieve with this piece of code. Also in the second piece of code, the reconvergence point should be again after while loop code block (I don't mean between while and if).
The reason why it "hangs" is neither a HW deadlock nor branching, at least not directly. You produce an endless loop for one or multiple threads (as already suspected).
In your example, there isn't really a convergence point. Since you do not use any synchronization, there aren't any threads that actually wait. What happens here with the while-loop is pretty much a busy-wait.
A kernel only finishes if all threads return. Since you have one (or multiple) endless loops (by accident maybe even none - this is unlikely however) the kernel will never finish.
You declared a shared variable s. This variable is known to all threads within a block.
With your while-statement you basically say (to each thread): increment s until it reaches the value of your (local) thread id. Since all threads are incrementing s in parallel, you introduce race conditions.
Example:
Thread 5 is looping and checking for s to become 5
s is 4
Two threads increment s, it becomes 6
At the same time thread 5 only reached the end of its loop.
Now it reaches the next loop iteration and checks for s and it's not 5.
Thread 5 will never be able to finish since you check via == and the value of s already exceeded the value of the thread id.
Also your solution is quite confusing, because each thread executes the serialized code consecutively (which probably was the intention after all - even though that actually is strange):
Thread 0 will execute the serialized code
After that, thread 1 will execute the serialized code
and so on
Most examples show a program where each thread works on some code, then all threads are synchronized and only single thread executes some more code (maybe it needed the results of all threads).
So, your second example "works" because no thread is stuck in an endless loop. However, I can't think of a reason why anyone would use such code, since it is confusing and, well, not parallel at all.
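The order in which the second example lets threads through can be traced with a plain sequential model. The Python sketch below follows the description given in the question (every thread of the block evaluates the if against the same value of s); it is not a model of real warp scheduling, and the name serialized_order is invented for illustration.

```python
def serialized_order(block_dim):
    """Return the order in which thread ids execute the serialized code."""
    s = 0
    order = []
    while s < block_dim:
        # Every thread evaluates `threadIdx.x == s` against the same s.
        active = [tid for tid in range(block_dim) if tid == s]
        for tid in active:
            order.append(tid)  # the serialized code would run here
            s += 1
    return order


print(serialized_order(4))
```

Exactly one thread is active per loop iteration, which is why the construct terminates but does no work in parallel.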

Run loop in TCL shell background

I'm looking for a way to run a loop in TCL shell background (meaning - an infinite loop calling some function), but I can't find a way to do it.
The proc is a simple one that releases all possible licenses held by the user (we have a shortage of licenses here at work and we are constantly fighting over who's going to get them).
The loop itself is simple:
while {1} {
    after 60000
    clr_lic
}
But how can I make it run in the background so I'll be able to run other commands as well?
Are you running an event loop? If you are, you can do this with the every procedure:
proc every {ms script} {
    after $ms [info level 0]
    uplevel "#0" $script
    # Double quotes just for syntax highlighting…
}
every 60000 clr_lic
Otherwise, things get awkward: you'd have to think about spinning up a thread or a subprocess and… well, it all gets a lot more awkward as then you're in a strongly segregated context and have to do a lot more work (well, usually).
When testing the code above, I used this:
every 500 {
    puts "Hi there, world!"
}
vwait forever; # Conventional way to run the event loop in tclsh
And I got quite a lot of messages written out; it's flexible and easy.
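The self-rescheduling trick is not specific to Tcl. As a comparison only, here is a sketch of the same pattern on Python's asyncio event loop: every is a hypothetical helper mirroring the Tcl proc (it first schedules its own next call, then runs the callback), and run_demo exists just to stop the loop after a short while.

```python
import asyncio


def every(loop, interval, callback):
    """Schedule this same call again after `interval` seconds, then run callback."""
    loop.call_later(interval, every, loop, interval, callback)
    callback()


def run_demo(duration=0.35, interval=0.1):
    """Run the event loop for `duration` seconds; return how often callback fired."""
    ticks = []
    loop = asyncio.new_event_loop()
    try:
        loop.call_soon(every, loop, interval, lambda: ticks.append(loop.time()))
        loop.call_later(duration, loop.stop)
        loop.run_forever()
    finally:
        loop.close()
    return len(ticks)


if __name__ == "__main__":
    print("callback fired", run_demo(), "times")
```

As in the Tcl version, nothing blocks between calls, so other events scheduled on the same loop keep being served.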
Try appending an ampersand (&) when invoking your script. This will run it in the background and you'll be able to run any command you like.