Why do + and ~ affect Perl 6 junctions differently? - raku

Add one to a junction of Ints:
put any( 1, 3, 7 ) + 1;
Now you have a junction of those Ints increased by one:
any(2, 4, 8)
So, 2 == any(2, 4, 8) is true.
Make a junction of strings and append to those strings:
put any( <h H> ) ~ 'amadryas';
You get a different result that doesn't equal 'hamadryas' or 'Hamadryas':
any("h", "H")amadryas
I expected something like:
any( 'hamadryas', 'Hamadryas' );
What's the difference in these operations that gives them different behavior even though they should be similar?

On macOS High Sierra 10.13, put fails with:
put any( 1, 3, 7 ) + 1
This type cannot unbox to a native string: P6opaque, Junction
in block at line 1
perl6 -v
This is Rakudo Star version 2017.10 built on MoarVM version 2017.10
implementing Perl 6.c.

Quoting the filed bug report, as progressed by Zoffix++:
Thank you for the report. lizmat++ fixed this.
The put routine does not explicitly handle Junction arguments. As per design, the end result is therefore a call to put for each of its elements:
put any( 1, 3, 7 ) + 1; # 2␤4␤8
put any( <h H> ) ~ 'amadryas'; # hamadryas␤Hamadryas
Per design, the call order of the puts is indeterminate. So other runs of the same code, perhaps with later compilers, may result in:
put any( 1, 3, 7 ) + 1; # 4␤8␤2
put any( <h H> ) ~ 'amadryas'; # Hamadryas␤hamadryas
In contrast to put, the say routine does special case Junctions. So the end result is just a single call to say:
say any( 1, 3, 7 ) + 1; # any(2, 4, 8)
say any( <h H> ) ~ 'amadryas'; # any(hamadryas, Hamadryas)

Related

Junction ~~ Junction behavior

I want to check if all elements of an Array have given type.
$ raku -e 'my @t = 1,2,3; say all(@t) ~~ Int'
True
$ raku -e 'my @t = 1,2,3,"a"; say all(@t) ~~ Int'
False
Works as expected so far. Now I want to allow two types:
$ raku -e 'my @t = 1,2,3,"a"; say all(@t) ~~ Int|Str'
False
Why is that so? If 1 ~~ Int|Str is True for a single element, why does it fail for an all() junction of elements?
BTW: This question is about understanding Junction ~~ Junction behavior (which is also a bit undocumented), not about alternative ways of performing the check from the example (which I know are possible).
A few additional lines may help clarify what's going on:
say all(1, 2, 3) ~~ Int|Str; # OUTPUT: «True»
say all('a', 'b', 'c') ~~ Int|Str; # OUTPUT: «True»
say all(1, 2, 'c') ~~ Int|Str; # OUTPUT: «False»
That is, all(1, 2, 'c') ~~ Int|Str is asking "is it the case that all of 1, 2, 'c' are Ints or, alternatively, is it the case that all of 1, 2, 'c' are Strs?" Since neither of those is the case, it returns False.
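The order of evaluation described above can be mimicked with Python's built-in any and all. This is only an analogy for the reading given in this answer, not a description of how Rakudo implements junctions, and both function names are made up for illustration.

```python
def all_match_one_type(values, types):
    # Mirrors the reading above: for SOME type, do ALL values match it?
    return any(all(isinstance(v, t) for v in values) for t in types)


def each_matches_some_type(values, types):
    # What the asker expected instead: does EVERY value match SOME type?
    return all(any(isinstance(v, t) for t in types) for v in values)


print(all_match_one_type([1, 2, 3], (int, str)))
print(all_match_one_type(["a", "b", "c"], (int, str)))
print(all_match_one_type([1, 2, "c"], (int, str)))
print(each_matches_some_type([1, 2, "c"], (int, str)))
```

The first function gives False for the mixed list, matching the third line of the Raku output above, while the second gives True for it.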

How could I delete sql functions format in awk?

I've got a sql query which looks like this:
SELECT test1(func1(MYFIELD)),
test2(MAX(MYFIELD), LOWER("NOPE")),
test3(MAX(MYFIELD), 1234),
AVG(test1(test2(MYFIELD, func1(4)))),
func2(UPPER("stack"))
SUBSTR(MYFIELD, 2, 4),
test2(MIN(MYFIELD), SUBSTR(LOWER(UPPER("NOPE")), 1, 7)),
SUBSTR('func1(', 2, 4)
FROM MYTABLE;
Then I'm trying to remove all functions called:
test1
test2
test3
func1
func2
But preserving the AVG, MAX, UPPER, SUBSTR... and all native functions.
So the desired output would be:
SELECT MYFIELD,
MAX(MYFIELD),
MAX(MYFIELD),
AVG(MYFIELD),
UPPER("stack")
SUBSTR(MYFIELD, 2, 4),
MIN(MYFIELD)
SUBSTR('func1(', 2, 4)
FROM MYTABLE;
I want to remove the LOWER on the second line because it is an argument of one of the functions to delete (in this case test2, which has two parameters). If we delete the function, we should delete its remaining parameters as well.
I've tried to do it by this way in awk:
{
print gensub(/(test1|test2|func1|func2)\(/,"","gi", $0);
}
But the output doesn't take the closing parentheses into account, and it also doesn't delete the remaining parameters of the custom functions:
SELECT MYFIELD)),
MAX(MYFIELD), LOWER("NOPE")),
MAX(MYFIELD), 1234),
AVG(MYFIELD, 4)))),
UPPER("stack"))
SUBSTR(MYFIELD, 2, 4),
MIN(MYFIELD), SUBSTR(LOWER(UPPER("NOPE")), 1, 7)),
SUBSTR('', 2, 4)
FROM MYTABLE;
Any idea or clue to handle this situation?
You could just rename the custom functions to the built-in function COALESCE while keeping the brackets ( ) and the other parameters of the user-defined functions.
The result is not syntactically the same, but it will behave the same as long as the built-in functions don't return NULL values. It is much easier to achieve because you don't have to worry about brackets.
If file is an input you provide, then:
cat file | sed 's#\(test1\|test2\|func1\|func2\)(#COALESCE(#g'
will produce:
SELECT COALESCE(COALESCE(MYFIELD)),
COALESCE(MAX(MYFIELD), 4),
AVG(COALESCE(COALESCE(MYFIELD, COALESCE(4)))),
COALESCE(UPPER("stack"))
FROM MYTABLE;
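If the wrappers really do have to be removed (keeping only the first argument, as in the desired output of the question), a regular expression alone struggles with nested parentheses. Below is a minimal sketch of a balanced-parenthesis scanner, written in Python rather than awk. The helper names are made up for illustration, and it assumes that string literals contain no escaped quotes and that the parentheses in the input are balanced.

```python
import re

# Function names to strip, taken from the question.
CUSTOM = {"test1", "test2", "test3", "func1", "func2"}
IDENT = re.compile(r"[A-Za-z_]\w*")


def first_arg(s, start):
    """s[start] is the character right after '('.
    Return (text of the first argument, index just past the matching ')')."""
    depth, quote, first_end = 0, None, None
    for i in range(start, len(s)):
        ch = s[i]
        if quote:                      # inside a string literal
            if ch == quote:
                quote = None
        elif ch in "'\"":
            quote = ch
        elif ch == "(":
            depth += 1
        elif ch == ")":
            if depth == 0:             # this closes the custom function call
                if first_end is None:
                    first_end = i
                return s[start:first_end], i + 1
            depth -= 1
        elif ch == "," and depth == 0 and first_end is None:
            first_end = i              # end of the first top-level argument
    raise ValueError("unbalanced parentheses")


def strip_custom(s):
    out, i, quote = [], 0, None
    while i < len(s):
        ch = s[i]
        if quote:                      # copy string literals untouched
            out.append(ch)
            if ch == quote:
                quote = None
            i += 1
            continue
        if ch in "'\"":
            quote = ch
            out.append(ch)
            i += 1
            continue
        m = IDENT.match(s, i)
        if m:
            name, end = m.group(0), m.end()
            if name.lower() in CUSTOM and s[end:end + 1] == "(":
                arg, i = first_arg(s, end + 1)
                out.append(strip_custom(arg.strip()))  # handle nested wrappers
            else:
                out.append(name)
                i = end
            continue
        out.append(ch)
        i += 1
    return "".join(out)


print(strip_custom('AVG(test1(test2(MYFIELD, func1(4))))'))
```

Because string literals are copied verbatim, a call such as SUBSTR('func1(', 2, 4) is left alone.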

Saving state of closure in Groovy

I would like to use a Groovy closure to process data coming from a SQL table. For each new row, the computation would depend on what has been computed previously. However, new rows may become available on further runs of the application, so I would like to be able to reload the closure, initialised with the intermediate state it had when the closure was last executed in the previous run of the application.
For example, a closure intending to compute the moving average over 3 rows would be implemented like this:
def prev2Val = null
def prevVal = null
def prevId = null
Closure c = { row ->
println([ prev2Val, prevVal, prevId])
def latestVal = row['val']
if (prev2Val != null) {
def movMean = (prev2Val + prevVal + latestVal) / 3
sql.execute("INSERT INTO output(id, val) VALUES (?, ?)", [prevId, movMean])
}
sql.execute("UPDATE test_data SET processed=TRUE WHERE id=?", [row['id']])
prev2Val = prevVal
prevVal = latestVal
prevId = row['id']
}
test_data has 3 columns: id (auto-incremented primary key), value and processed. A moving mean is calculated based on the two previous values and inserted into the output table, against the id of the previous row. Processed rows are flagged with processed=TRUE.
If all the data was available from the start, this could be called like this:
sql.eachRow("SELECT id, val FROM test_data WHERE processed=FALSE ORDER BY id", c)
The problem comes when new rows become available after the application has already been run. This can be simulated by processing a small batch each time (e.g. using LIMIT 5 in the previous statement).
I would like to be able to dump the full state of the closure at the end of the execution of eachRow (saving the intermediate data somewhere in the database for example) and re-initialise it again when I re-run the whole application (by loading those intermediate variable from the database).
In this particular example, I can do this manually by storing the values of prev2Val, prevVal and prevId, but I'm looking for a generic solution where knowing exactly which variables are used wouldn't be necessary.
Perhaps something like c.getState() which would return [ prev2Val: 1, prevVal: 2, prevId: 6] (for example), and where I could use c.setState([ prev2Val: 1, prevVal: 2, prevId: 6]) next time the application is executed (if there is a state stored).
I would also need to exclude sql from the list. It seems this can be done using c.@sql=null.
I realise this is unlikely to work in the general case, but I'm looking for something sufficiently generic for most cases. I've tried to dehydrate, serialize and rehydrate the closure, as described in this Groovy issue, but I'm not sure how to save and store all the @ fields in a single operation.
Is this possible? Is there a better way to remember state between executions, assuming the list of variables used by the closure isn't necessarily known in advance?
Not sure this will work in the long run, and you might be better returning a list containing the values to pass to the closure to get the next set of data, but you can interrogate the binding of the closure.
Given:
def closure = { row ->
a = 1
b = 2
c = 4
}
If you execute it:
closure( 1 )
You can then compose a function like:
def extractVarsFromClosure( Closure cl ) {
cl.binding.variables.findAll {
!it.key.startsWith( '_' ) && it.key != 'args'
}
}
Which when executed:
println extractVarsFromClosure( closure )
prints:
['a':1, 'b':2, 'c':4]
However, any 'free' variables defined in the local binding (without a def) will be in the closure's binding too, so:
fish = 42
println extractVarsFromClosure( closure )
will print:
['a':1, 'b':2, 'c':4, 'fish':42]
But
def fish = 42
println extractVarsFromClosure( closure )
will not include fish in the output.
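The alternative mentioned at the start of this answer, passing the state in and getting it back instead of digging it out of the closure, might look roughly like the sketch below. It is written in Python purely for illustration. The state keys reuse the variable names from the question, the step and run functions are hypothetical, and the database calls are replaced by plain lists.

```python
import json


def step(state, row):
    """Process one row. Returns (new_state, output), where output is
    (id, moving mean) or None while fewer than three values have been seen."""
    prev2, prev, prev_id = state.get("prev2Val"), state.get("prevVal"), state.get("prevId")
    out = None
    if prev2 is not None:
        out = (prev_id, (prev2 + prev + row["val"]) / 3)
    new_state = {"prev2Val": prev, "prevVal": row["val"], "prevId": row["id"]}
    return new_state, out


def run(rows, saved=None):
    """Process a batch, resuming from a previously saved JSON state if given."""
    state = json.loads(saved) if saved else {}
    outputs = []
    for row in rows:
        state, out = step(state, row)
        if out is not None:
            outputs.append(out)
    return outputs, json.dumps(state)  # the string could be stored in the database


rows = [{"id": i, "val": float(i)} for i in range(1, 7)]
first, saved = run(rows[:3])
second, _ = run(rows[3:], saved)
print(first + second)
```

Running two batches with the saved state in between produces the same outputs as a single run over all rows, which is the property the question asks for.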

Generating sequential number lists in tcsh

I've been trying to find a workaround to defining lists of sequential numbers extensively in tcsh, i.e. instead of doing:
i = ( 1 2 3 4 5 6 7 8 9 10 )
I would like to do something like this (knowing it doesn't work)
i = ( 1..10 )
This would be especially useful in foreach loops (I know I can use while, I'm just looking for an alternative).
Looking around I found this:
foreach $number (`seq 1 1 9`)
...
end
Found that here. They say it would generate a list of number starting with 1, with increments of 1 ending in 9.
I tried it, but it didn't work. Apparently seq isn't a command. Does it exist or is this plain wrong?
Any other ideas?
seq certainly exists, but perhaps not on your system since it is not in the POSIX standard. I just noticed you have two errors in your command. Does the following work?
foreach number ( `seq 1 9` )
echo $number
end
Notice the omission of the dollar sign and the extra backticks around the seq command.
If that still doesn't work you could emulate seq with awk:
foreach number ( `awk 'BEGIN { for (i=1; i<=9; i++) print i; exit }'` )
Update
Two more alternatives:
If your machine has no seq it might have jot (BSD/OSX):
foreach number ( `jot 9` )
I had never heard of jot before, but it looks like seq on steroids.
Use bash with built-in brace expansion:
for number in {1..9}

How would you format/indent this piece of code?

How would you format/indent this piece of code?
int ID = Blahs.Add( new Blah( -1, -2, -3) );
or
int ID = Blahs.Add( new Blah(
1,2,3,55
)
);
Edit:
My class actually has lots of parameters, so that might affect your response.
I agree with Patrick McElhaney; there is no need to nest it....
Blah aBlah = new Blah( 1, 2, 3, 55 );
int ID = Blahs.Add( aBlah );
There are a couple of small advantages here:
You can set a break point on the second line and inspect 'aBlah'.
Your diffs will be cleaner (changes more obvious) without nesting the statements, e.g. creating the new Blah is in an independent statement from adding it to the list.
I'd go with the one-liner. If the real arguments make one line too long, I would break it up with a variable.
Blah blah = new Blah(1,2,3,55);
int ID = Blahs.Add( blah );
All numbers are being added to a result. No need to comment each number separately. A comment "these numbers are added together" will do it. I'm going to do it like this:
int result = Blahs.Add( new Blah(1, 2, 3, 55) );
But if those numbers carry some meaning on their own (each number could stand for something entirely different, for example if Blah denotes the type for an inventory item), I would go with
int ID = Blahs.Add( new Blah(
1, /* wtf is this */
2, /* wtf is this */
3, /* wtf is this */
55 /* and huh */
));
int ID = Blahs.Add
(
new Blah
(
1, /* When the answer is within this percentage, accept it. */
2, /* Initial seed for algorithm */
3, /* Maximum threads for calculation */
55 /* Limit on number of hours, a thread may iterate */
)
);
or
int ID = Blahs.Add(
new Blah( 1, 2, 3, 55 )
);
I must confess, though, that 76 times out of 77 I do what you did the first time.
The first way, since you are inlining it anyway.
I would use similar formatting as your first example, but without the redundant space delimiters before and after the parenthesis delimiters:
int id = Blahs.Add(new Blah(-1, -2, -3));
Note that I also wouldn't use an all upper-case variable name in this situation, which often implies something special, like a constant.
Either split it into two lines:
Blah new_Blah = new Blah(-1, -2, -3);
int ID = Blahs.Add(new_Blah);
Or indent the new Blah(); call:
int ID = Blahs.Add(
new Blah(-1, -2, -3)
);
Unless the arguments were long, in which case I'd probably do something like..
int ID = Blahs.Add(new Blah(
(-1 * 24) + 9,
-2,
-3
));
As a slightly more practical example, in Python I quite commonly do either of the following:
myArray.append(
someFunction(-1, -2, -3)
)
myArray.append(someFunction(
otherFunction("An Arg"),
(x**2) + 4,
something = True
))
One line, unless there's a lot of data. I'd draw the line at about ten items or sixty to seventy columns in total, whichever comes first.
Whatever Eclipse's auto-formatter gives me, so when the next dev works on that code and formats before committing, there aren't weird issues with the diff.
int ID = Blahs.Add(new Blah(1,2,3,55)); // Numbers n such that the set of base 4 digits of n equals the set of base 6 digits of n.
The problem with
Blah aBlah = new Blah( 1, 2, 3, 55 );
int ID = Blahs.Add( aBlah );
is that it messes with your namespace. If you don't need a reference to the Blah you shouldn't create it.
I'd either do it as a one-liner or assign the new Blah to a variable, depending on whether I'll need to reference that Blah directly again.
As far as the readability issue which a couple answers have addressed by putting each argument on a separate line with comments, I would address that by using named parameters. (But not all languages support named parameters, unfortunately.)
int ID = Blahs.Add(new Blah( foo => -1, bar => -2, baz => -3 ));
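Python is one language where this works out of the box, via keyword arguments. A small sketch follows. The field names are hypothetical, loosely borrowed from the comments in an earlier answer, and a plain list stands in for Blahs.

```python
from dataclasses import dataclass


@dataclass
class Blah:
    # Hypothetical field names, chosen only to make the call site readable.
    tolerance_percent: int
    seed: int
    max_threads: int
    max_hours: int


blahs = []  # stand-in for the Blahs collection

# Each value is labelled at the call site, so no per-line comments are needed.
blahs.append(Blah(tolerance_percent=1, seed=2, max_threads=3, max_hours=55))
blah_id = len(blahs) - 1

print(blah_id, blahs[blah_id])
```

With the names at the call site, the one-liner stays readable even without splitting each argument onto its own commented line.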