Raku zip operator & space

I found this one liner which joins same lines from multiple files.
How to add a space between two lines?
If line 1 from file A is blue and line 1 from file B is sky, I get bluesky,
but I need blue sky.
say $_ for [Z~] @*ARGS.map: *.IO.lines;

This uses the side effect of .Str on a List to add spaces between the elements:
say .Str for [Z] @*ARGS.map: *.IO.lines
The Z will create 2-element List objects, which the .Str will then stringify.
Or even shorter:
.put for [Z] @*ARGS.map: *.IO.lines
where the .put will call the .Str for you and output that.
If you want anything else in between, then you could probably use .join:
say .join(",") for [Z] @*ARGS.map: *.IO.lines
would put commas between the words.
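For a quick sanity check without files, here is roughly what the difference looks like with two inline arrays standing in for the lines (the data is mine, not from the question):
my @a = 'blue', 'green';
my @b = 'sky', 'grass';
.say for [Z~] @a, @b;   # bluesky / greengrass (no space)
.put for [Z]  @a, @b;   # blue sky / green grass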

Note: definitely don't do this in anything approaching real code. Use (one of) the readable ways in Liz's answer.
If you really want to use the same structure as [Z~] – that is, an operator modified by the Zip meta-operator, all inside the Reduce meta-operator – you can. But it's not pretty:
say $_ for [Z[&(*~"\x20"~*)]] @*ARGS.map: *.IO.lines
Here's how that works: Z can take an operator, so we need to give it an operator that concatenates two strings with a space in between. But there's no operator like that built in. No problem – we can turn any function into an infix operator by surrounding it with [ ] (the infix form).
So all we need is a function that joins two strings with a space between them. That also doesn't exist, but we can create one: * ~ ' ' ~ *. So, we should be able to shove that into our infix form and pass the whole thing to the Zip operator Z[* ~ ' ' ~ *].
Except that doesn't work. Because Zip isn't really expecting an infix form, we need to give it a hint that we're passing in a function … that is, we need to put our function into a callable context with &( ), which gets us to Z[&(* ~ ' ' ~ *)].
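To see that Zip expression doing its thing in plain infix position (the example strings are mine, not from the question):
say <blue green> Z[&(* ~ ' ' ~ *)] <sky grass>;   # (blue sky green grass)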
That Zip expression does what we want when used in infix position – but it still doesn't work once we put it back into the Reduce/[ ] operator that we want to use. This time, the problem is due to something that may or may not be a bug – even after discussing it with jnthn on github, I'm still not sure whether this behavior is intended/correct.
Specifically, the issue is that the Reduction meta-operator doesn't allow whitespace – even in strings. Thus, we need to replace * ~ ' ' ~ * with *~"\c[space]"~* or *~"\x20"~* (where \x20 is the hex value of a space in Unicode/ASCII). Since we've come this far into obfuscated code, I figure we might as well go all the way. And that gets us back to
say $_ for [Z[&(*~"\x20"~*)]] @*ARGS.map: *.IO.lines
Again, I'm not recommending that you do this. (And, if you do, you could at least make it slightly more readable by saving the * ~ ' ' ~ * function as a named variable in the previous line, which at least gets you whitespace. But, really, just use one of Liz's suggestions).
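As a sketch of that named-variable variant (the name &with-space is mine, and I'm assuming the Reduce meta-operator is just as happy with a named callable as with the inline &( ) form):
my &with-space = * ~ ' ' ~ *;                        # joins two strings with a space
say $_ for [Z[&with-space]] @*ARGS.map: *.IO.lines;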
I just thought this gives a useful window into some of the darker and more interesting corners of Raku's strangely consistent behavior.

Related

Brace Delimiters with qq Don't Interpolate Code in Raku

Sorry if this is documented somewhere, but I haven't been able to find it. When using brace delimiters with qq, code is not interpolated:
qq.raku
#!/usr/bin/env raku
say qq{"Two plus two": { 2 + 2 }};
say qq["Two plus two": { 2 + 2 }];
$ ./qq.raku
"Two plus two": { 2 + 2 }
"Two plus two": 4
Obviously, this isn't a big deal since I can use a different set of delimiters, but I ran across it and thought I'd ask.
Update
As @raiph pointed out, I forgot to put the actual question: Is this the way it's supposed to work?
The quote language "nibbler" (the bit of the grammar that eats its way through a quoted string) looks like this:
[
    <!stopper>
    [
    || <starter> <nibbler> <stopper>
    || <escape>
    || .
    ]
]*
That is, until we see a stopper, eat whichever comes first of:
A starter (the opening { in your case), followed by some internal stuff, followed by a stopper (the }); this allows for nesting of the construct inside of the string
An escape (and closure interpolation is considered a kind of escape)
Any other character
This ordering in the grammar means that a nesting of the chosen quote starter/stopper will always win over an escape. This issue was discussed during the language design; we could, after all, have reordered the alternation in the grammar to have escapes win. On balance, however, it was felt that the choice of starter/stopper was the more local decision than the general properties of the quoting language, and so should take precedence. (This is also consistent with how quote languages are constructed: we take the base quoted string grammar and mix starter/stopper methods into it.)
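A quick illustration of the nesting rule (the text is mine): because balanced braces are consumed as starter/stopper pairs, they survive intact all the way down.
say qq{balanced braces nest: { outer { inner } } done};
# balanced braces nest: { outer { inner } } done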
Obviously, this isn't a big deal since I can use a different set of delimiters, but I ran across it and thought I'd ask.
You didn't ask anything. :)
Let's say you've got some text. And you want to use double quote processing to get interpolation, except you don't want braced text to be interpolated as code. You could write, say, qq:!c '...'. But don't you think it's a lot easier to remember, write, and read qq{ ... }?
Nice little touch, right?
Which is why it's the way it is -- it's a very nice touch.
And, perhaps, why it's not documented -- it's little, and, once you encounter it, obvious what you need to do.
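A small sketch of the two spellings side by side (the variable and text are mine; the qq:!c form is as raiph spelled it above):
my $n = 4;
say qq:!c 'braces stay literal: { 2 + 2 }, but $n still interpolates';
# braces stay literal: { 2 + 2 }, but 4 still interpolates
say qq{braces stay literal: { 2 + 2 }, but $n still interpolates};
# braces stay literal: { 2 + 2 }, but 4 still interpolates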
That said, the Q lang escapes include ones to recursively re-enter the Q lang:
say qq{"Two plus two": \qq[{ 2 + 2 }] }; # "Two plus two": 4
Does that answer your question? :)

How to add a small bit of context in a grammar?

I am tasked with parsing (and transforming) code in a computer language that has a slight quirk in its rules, or at least I see it that way. To be exact, the compiler treats new lines (as well as semicolons) as statement separators, but elsewhere (e.g. inside a statement) it treats them as spacers (whitespace).
As an example, this code:
try
    local x = 5 / 0
catch (i)
    print(i + "\n")
proves to be equivalent to this:
try local x = 5 / 0 catch (i) print(i + "\n")
I don't see how I can express such a rule in EBNF, or specifically in Lark EBNF dialect. I mean in a sensible way. I probably could define all possible newline positions inside all statements, but it would be cumbersome and error-prone.
I wish to find a way to treat newlines contextually. Is there a proven method for this, preferably within Python/Lark domain? If I have to modify the parser for that purpose, then where should I start?
Or if I have misunderstood something about this language in particular, or about computer-language parsing in general, or my statement of the problem is wrong, I'd also be happy to get educated.
(As you may guess, the language in question has a well proven implementation, but no officially defined grammar. Also, it is Squirrel, for all that it matters.)
The relevant quote from the "specification" is this:
A squirrel program is a simple sequence of statements:
stats := stat [';'|'\n'] stats
[...] Statements can be separated with a new line or ‘;’ (or with the keywords case or default if inside a switch/case statement), both symbols are not required if the statement is followed by ‘}’.
These are relatively complex rules, and in their totality not context-free if newlines can also be ignored everywhere else. Note, however, that in my understanding the text implies that ; or \n is required when none of the other cases apply. That would make your one-line example illegal; since the compiler apparently accepts it, the BNF as written is probably what is actually implemented, i.e. both ; and \n are optional everywhere. In that case you can (for Lark) just put in an %ignore "\n" statement and it should work fine.
Also, Lark should not complain if you both ignore \n and use it in a rule: where useful it will match it in a rule, otherwise it will just ignore it. Note, however, that this breaks if you use a terminal that includes the \n (e.g. WS or /\s/). Just have \n as an extra case.
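For concreteness, here's a minimal, hypothetical Lark fragment showing that shape (the stat rule is a toy stand-in, not Squirrel's real grammar; the point is keeping \n as its own ignored terminal instead of folding it into a whitespace terminal):
start: stat*
stat: NAME "=" NUMBER       // stand-in statement form
NEWLINE: "\n"
%ignore NEWLINE             // newlines disappear both as separators and as spacers
%import common.CNAME -> NAME
%import common.NUMBER
%import common.WS_INLINE
%ignore WS_INLINE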
(For the future: you will probably get a faster response for Lark questions if you ask over on Gitter, or at least put a link to the SO question there.)

Fractions for variables names in Julia

In Julia you can write subscripts with \_ in variable names. I was wondering if there is anything similar for writing fractions in variable names, something like \frac{}{} in LaTeX. I understand this may be harder as it takes two arguments. If there is none, I will use /. But in that case I would like to use some enclosure to make clear what is being differentiated. I assume () is not usable? Would [] or {} be OK?
The subscripts and other non-Latin names you see in Julia code are just normal Unicode characters, the same as "regular" names; the LaTeX-like commands are only a feature of the Julia REPL for remembering and inputting them.
As for Unicode, in principle you can represent some simple fractions like ⁽²⁺ⁱ⁾⁄₍ₛ₊ₜ₎, using the ⁄ (U+2044 FRACTION SLASH) symbol together with subscripts and superscripts. The rendering depends on your font, but do not expect a vertical layout in any current fonts.
However, Julia treats ⁄ (U+2044 FRACTION SLASH, not the / on your keyboard) as an "invalid character" when used alone during parsing. The same applies to \not, which can only be used in conjunction with some operators, so it's not an option either.
As for the brackets and the normal /, they are operators and are parsed as such, not as part of a name. However, there is an (ugly) way to circumvent this: you can use macros to bypass the parsing and use strings as variable names. For example:
julia> macro n_str(name)
           esc(Symbol(name))
       end
@n_str (macro with 1 method)

julia> n"∂(2x + 3)/∂x" = 2
2

julia> 2n"∂(2x + 3)/∂x"
4

Can I modify a literal regex in Perl 6?

Suppose we have a regular inflectional pattern, which cannot be split into segments. E.g. it can be infixation (adding some letters inside the word) or vowel change ('ablaut'). Consider an example from German.
my @words = <Vater Garten Nagel>;
my $search = "/@words.join('|')/".EVAL;
"mein Vater" ~~ $search;
say $/; # 「Vater」
All the three German words form plural by changing their 2nd letter 'a' to 'ä'. So 'Vater' → 'Väter', 'Garten' → 'Gärten', 'Nagel' → 'Nägel'.
Is there a way to modify my $search regex so that it would match the plural forms?
Here's what I'm looking for:
my $search_ä = $search.mymethod;
"ihre Väter" ~~ $search_ä;
say $/; # 「Väter」
Of course, I can modify the @words array and 'precompile' it into a new regex. But it would be better (if possible) to modify the existing regex directly.
You can't.
Regexes are code objects in Perl 6. So your question basically reads "Can I modify subroutines or methods after I've written them?". And the answer is the same for traditional code objects and for regexes: no, write them the way you want them in the first place.
That said, you don't actually need EVAL for your use case. When you use an array variable inside a regex, it is interpolated as a list of alternative branches, so you could just write:
my @words = <Vater Garten Nagel>;
my $search = /@words/;
The regex $search becomes a closure, so if you modify @words, you also change what $search matches.
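For the plural case in the question, that closure behaviour means you could simply update @words in place (a sketch; the .trans transformation is my own choice):
my @words = <Vater Garten Nagel>;
my $search = /@words/;
say "mein Vater" ~~ $search;               # 「Vater」
@words = @words.map: *.trans('a' => 'ä');  # now Väter Gärten Nägel
say "ihre Väter" ~~ $search;               # 「Väter」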
Another approach to this particular example would be to use the :ignoremark modifier, which makes a also match ä (though also lots of other forms, such as ā or ǎ.)
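A sketch of that route, with @words as in the question:
my @words = <Vater Garten Nagel>;
my $search = /:ignoremark @words/;
say "ihre Väter" ~~ $search;   # 「Väter」
say "mein Vater" ~~ $search;   # 「Vater」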

SWI-Prolog predicate for reading in lines from input file

I'm trying to write a predicate to accept a line from an input file. Every time it's used, it should give the next line, until it reaches the end of the file, at which point it should return false. Something like this:
database :-
    see('blah.txt'),
    loop,
    seen.

loop :-
    accept_line(Line),
    write('I found a line.\n'),
    loop.

accept_line([C | Rest]) :-
    get0(C),
    C =\= "\n",
    !,
    accept_line(Rest).
accept_line([]).
Obviously this doesn't work. It works for the first line of the input file and then loops endlessly. I can see that I need to have some line like "C =\= -1" in there somewhere to check for the end of the file, but I can't see where it'd go.
So an example input and output could be...
INPUT
this is
an example
OUTPUT
I found a line.
I found a line.
Or am I doing this completely wrong? Maybe there's a built in rule that does this simply?
In SWI-Prolog, the most elegant way to do this is to first use a DCG to describe what a "line" means, and then use library(pio) to apply the DCG to a file.
An important advantage of this is that you can then easily apply the same DCG also on queries on the toplevel with phrase/2 and do not need to create a file to test the predicate.
There is a DCG tutorial that explains this approach, and you can easily adapt it to your use case.
For example:
:- use_module(library(pio)).
:- set_prolog_flag(double_quotes, codes).
lines --> call(eos), !.
lines --> line, { writeln('I found a line.') }, lines.
line --> ( "\n" ; call(eos) ), !.
line --> [_], line.
eos([], []).
Example usage:
?- phrase_from_file(lines, 'blah.txt').
I found a line.
I found a line.
true.
Example usage, using the same DCG to parse directly from character codes without using a file:
?- phrase(lines, "test1\ntest2").
I found a line.
I found a line.
true.
This approach can be very easily extended to parse more complex file contents as well.
If you want to read into code lists, see library(readutil), in particular read_line_to_codes/2 which does exactly what you need.
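For example, a minimal sketch of the asker's database/loop pair on top of read_line_to_codes/2 (the setup_call_cleanup framing is my choice; the file name and message are from the question):
:- use_module(library(readutil)).

database :-
    setup_call_cleanup(open('blah.txt', read, In),
                       loop(In),
                       close(In)).

loop(In) :-
    read_line_to_codes(In, Line),
    (   Line == end_of_file
    ->  true
    ;   write('I found a line.\n'),
        loop(In)
    ).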
You can of course use the character I/O primitives, but at least use the ISO predicates. "Edinburgh-style" I/O is deprecated, at least for SWI-Prolog. Then:
get_line(L) :-
    get_code(C),
    get_line_1(C, L).

get_line_1(-1, []) :- !.    % EOF
get_line_1(0'\n, []) :- !.  % EOL
get_line_1(C, [C|Cs]) :-
    get_code(C1),
    get_line_1(C1, Cs).
This is of course a lot of unnecessary code; just use read_line_to_codes/2 and the other predicates in library(readutil).
Since strings were introduced to Prolog, there are some new nifty ways of reading. For example, to read all input and split it to lines, you can do:
read_string(user_input, _, S),
split_string(S, "\n", "", Lines)
See the examples in read_string/5 for reading linewise.
PS. Drop the see and seen etc. Instead:
setup_call_cleanup(open(Filename, read, In),
                   read_string(In, N, S),   % or whatever reading you need to do
                   close(In))