Trans, hash, and character classes in Perl 6 - raku

When I use a regex as the first argument of trans, it's OK:
> say 'abc'.trans(/\w <?before b>/ => 1)
1bc
Using a hash as an argument of trans is also OK:
> my %h
> %h{'a'} = '1'
> say 'abc'.trans(%h)
1bc
But when I try to use regexes in a hash, it doesn't work:
> my %h
> %h{'/\w/'} = '1'
> say 'abc'.trans(%h)
abc

'/\w/' is not a regex; it is a string. Use an actual regex object as the key, in a hash declared to accept non-Str keys:
my %h{Any}; # make sure it accepts non-Str keys
%h{/\w/} = 1;
say 'abc'.trans(%h)
111
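For comparison, here is a rough Python analogue of the same distinction. This is not Raku's trans; translate is a hypothetical helper written only for illustration. A key that merely looks like a regex is treated as literal text, while a compiled pattern object actually matches.

```python
import re

def translate(text, table):
    """Apply each key -> replacement pair in turn (hypothetical helper)."""
    for key, replacement in table.items():
        if isinstance(key, re.Pattern):
            # A real pattern object: replace every match.
            text = key.sub(replacement, text)
        else:
            # A plain string: replace literal occurrences only.
            text = text.replace(key, replacement)
    return text

print(translate("abc", {r"/\w/": "1"}))            # abc
print(translate("abc", {re.compile(r"\w"): "1"}))  # 111
```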


Smart search and replace [closed]

I had a few thousand lines of code containing pieces like this:
opencanmanager.GetObjectDict()->ReadDataFrom(0x1234, 1).toInt()
that I needed to convert to another library, which uses syntax like this:
ReadFromOD<int>(0x1234, 1)
Basically, I need to search for
[whatever1]opencanmanager.GetObjectDict()->ReadDataFrom([whatever2]).toInt()[whatever3]
across all the lines of a text file and replace every occurrence of it with
[whatever1]ReadFromOD<int>([whatever2])[whatever3]
and then do the same for a few other data types.
Doing that manually would have taken a few days of tedious work, and none of the editors I know offer an automatic function for this kind of smart refactoring.
I have now solved the problem using GNU AWK with the script below:
#!/usr/bin/awk -f
BEGIN {
    spl1 = "opencanmanager.GetObjectDict()->ReadDataFrom("
    spl2 = ").to"
    spl2_1 = ").toString()"
    spl2_2 = ").toUInt()"
    spl2_3 = ").toInt()"
    min_spl2_len = length(spl2_3)
    repl_start = "ReadFromOD<"
    repl_mid1 = "QString"
    repl_mid2 = "uint"
    repl_mid3 = "int"
    repl_end = ">("
    repl_after = ")"
}
# Rewrite the first old-style call found in str; return str unchanged if there is none.
function replacer(str)
{
    pos1 = index(str, spl1)
    pos2 = index(str, spl2)
    if (!pos1 || !pos2) {
        return str
    }
    # awk strings are 1-indexed, so substr() starts at 1 here, not 0
    strbegin = substr(str, 1, pos1-1)
    mid_start_pos = pos1+length(spl1)
    strkey = substr(str, pos2, min_spl2_len)
    key1 = substr(spl2_1, 1, min_spl2_len)
    key2 = substr(spl2_2, 1, min_spl2_len)
    key3 = substr(spl2_3, 1, min_spl2_len)
    strmid = substr(str, mid_start_pos, pos2-mid_start_pos)
    if (strkey == key1) {
        repl_mid = repl_mid1; spl2_fact = spl2_1;
    } else if (strkey == key2) {
        repl_mid = repl_mid2; spl2_fact = spl2_2;
    } else if (strkey == key3) {
        repl_mid = repl_mid3; spl2_fact = spl2_3;
    } else {
        print "ERROR!!! Found", spl1, "but not any of", spl2_1, spl2_2, spl2_3 "!" > "/dev/stderr"
        exit 1    # EXIT_FAILURE is not defined in awk
    }
    str_remainder = substr(str, pos2+length(spl2_fact))
    return strbegin repl_start repl_mid repl_end strmid repl_after str_remainder
}
{
    resultstr = $0
    do {
        resultstr = replacer(resultstr)
        # Both markers must still be present; with || a line that contains
        # only one of them would never change and the loop would not end.
        more_spl = index(resultstr, spl1) && index(resultstr, spl2)
    } while (more_spl)
    print(resultstr)
}
and everything works fine, but the thing still bugs me somewhat. My solution feels a bit too complicated for a job that must be very common and must have an easy standard solution that I just don't know about for some reason.
I am prepared to just let it go, but if you know a more elegant and quick one-liner solution, or some specific tool for this kind of smart code modification, I would definitely like to know.
If sed is an option, you can try this solution which should match both output examples from input such as this.
$ cat input_file
opencanmanager.GetObjectDict()->ReadDataFrom(0x1234, 1).toInt()
power1 = opencanmanager.GetObjectDict()->ReadDataFrom(0x1234, 1).toInt() * opencanmanager.GetObjectDict()->ReadDataFrom(0x5678, 1).toUInt() * FACTOR1;
power2 = opencanmanager.GetObjectDict()->ReadDataFrom(0x5678, 1).toUInt() / 2;
$ sed -E 's/ReadDataFrom/ReadFromOD<int>/g;s/int/uint/2;s/(.*= )?[^>]*>([^\.]*)[^\*|/]*?(\*|\/.{2,})?[^\.]*?[^>]*?>?([^\.]*)?[^\*]*?(.*)?/\1\2 \3 \4 \5/' input_file
ReadFromOD<int>(0x1234, 1)
power1 = ReadFromOD<int>(0x1234, 1) * ReadFromOD<uint>(0x5678, 1) * FACTOR1;
power2 = ReadFromOD<int>(0x5678, 1) / 2;
Explanation
s/ReadDataFrom/ReadFromOD<int>/g - The first part of the command is a simple global substitution that replaces all occurrences of ReadDataFrom with ReadFromOD<int>.
s/int/uint/2 - The second part replaces only the second occurrence of int with uint, if there is one.
s/(.*= )?[^>]*>([^\.]*)[^\*|/]*?(\*|\/.{2,})?[^\.]*?[^>]*?>?([^\.]*)?[^\*]*?(.*)?/\1\2 \3 \4 \5/ - The third part utilizes sed grouping and back referencing.
(.*= )? - Group one, returned with back reference \1, captures everything up to an = character and the space after it. The ? makes it optional, meaning it does not have to exist for the remaining groups to match.
[^>]*> - This part is not captured, as it is not within parentheses (). It matches everything from the space after the = character up to the first >, and the literal > is included so that it is dropped as well. This is not optional and must match.
([^\.]*) - Continuing from the uncaptured part, this matches everything up to the first . and can be returned with back reference \2. This is not optional and must match.
[^\*|/]*? - Not captured. Matches everything up to a literal *, | or /. The ? makes it optional, so it does not have to match.
(\*|\/.{2,})? - Matches either a literal *, or a / followed by at least two ({2,}) characters. It can be returned with back reference \3 and is optional (?).
[^\.]*?[^>]*?>? - Optional and not captured. Matches everything up to a literal ., then everything up to >, and the > itself.
([^\.]*)? - Optional group matching up to a full stop (.). It can be returned with back reference \4.
[^\*]*? - Not captured. Continues matching up to *.
(.*)? - Everything else after the final * is grouped and returned with back reference \5, if it exists (?).
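If a scripting language is an option, the same rewrite can also be sketched with a single regular expression and a callback. The Python below is only a minimal sketch: convert_line and TYPE_FOR_SUFFIX are names made up for illustration, the suffix-to-type mapping is copied from the awk script in the question, and it assumes every ReadDataFrom(...) call is directly followed by one of the three .toX() suffixes (a call with some other suffix could make the non-greedy match run on into a later call).

```python
import re

# Mapping taken from the awk script in the question: .toX() suffix -> template type.
TYPE_FOR_SUFFIX = {"String": "QString", "UInt": "uint", "Int": "int"}

# Group 1: the argument list; group 2: the conversion suffix.
CALL = re.compile(
    r"opencanmanager\.GetObjectDict\(\)->ReadDataFrom\((.*?)\)\.to(String|UInt|Int)\(\)"
)

def convert_line(line: str) -> str:
    """Rewrite every old-style call on one line (hypothetical helper)."""
    return CALL.sub(
        lambda m: f"ReadFromOD<{TYPE_FOR_SUFFIX[m.group(2)]}>({m.group(1)})",
        line,
    )

print(convert_line("x = opencanmanager.GetObjectDict()->ReadDataFrom(0x1234, 1).toInt();"))
# x = ReadFromOD<int>(0x1234, 1);
```

To process a whole file, convert_line could be applied to each line read from it, with the result written to a new file.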

splitting of email-address in spark-sql

code:
case when length(neutral)>0 then regexp_extract(neutral, '(.*#)', 0) else '' end as neutral
The above query returns the output value including the # symbol. For example, if the input is 1234#gmail.com, the output is 1234#. How can I remove the # symbol using the above query? The resulting output should also be checked for numbers: if it contains any non-numeric characters, it should be rejected.
sample input:1234#gmail.com output: 1234
sample input:123adc#gmail.com output: null
You could phrase the regex as ^[^#]+, which would match all characters in the email address up to, but not including, the # character:
REGEXP_EXTRACT(neutral, '^[^#]+', 0) AS neutral
Note that this approach is also clean and frees us from having to use the bulky CASE expression.
Try this code:
val pattern = """([0-9]+)#([a-zA-Z0-9]+\.[a-z]+)""".r // \. matches a literal dot
val correctEmail = "1234#gmail.com"
val wrongEmail = "1234abcd#gmail.com"
def parseEmail(email: String): Option[String] =
  email match {
    case pattern(id, domain) => Some(id)
    case _ => None
  }
println(parseEmail(correctEmail)) // prints Some(1234)
println(parseEmail(wrongEmail)) // prints None
Also, it is more idiomatic to use Option instead of null.
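The two requirements from the question (drop the # and reject non-numeric prefixes) can be folded into one anchored pattern with a capture group. Here is a minimal Python sketch of that idea; numeric_id is a hypothetical name, and the pattern assumes the separator is the # character used in the sample inputs.

```python
import re
from typing import Optional

# One or more digits at the start, immediately followed by the separator.
NUMERIC_PREFIX = re.compile(r"^([0-9]+)#")

def numeric_id(address: str) -> Optional[str]:
    """Return the all-digit part before '#', or None if it is not purely numeric."""
    m = NUMERIC_PREFIX.match(address)
    return m.group(1) if m else None

print(numeric_id("1234#gmail.com"))    # 1234
print(numeric_id("123adc#gmail.com"))  # None
```

Because the digits must run right up to the #, an input such as 123adc#gmail.com fails to match at all instead of yielding a partial 123.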

How to concatenate two Sets of strings in Perl 6?

Is there an idiomatic way or a built-in method to concatenate two Sets of strings?
Here's what I want:
> my Set $left = <start_ begin_>.Set
set(begin_ start_)
> my Set $right = <end finish>.Set
set(end finish)
> my Set $left_right = ($left.keys X~ $right.keys).Set
set(begin_end begin_finish start_end start_finish)
Or, if there are more than two of them:
> my Set $middle = <center_ base_>.Set
> my Set $all = ([X~] $left.keys, $middle.keys, $right.keys).Set
set(begin_base_end begin_base_finish begin_center_end begin_center_finish start_base_end start_base_finish start_center_end start_center_finish)
You can use the reduce function to go from an arbitrary number of sets to a single set with everything concatenated in it:
my Set @sets = set(<start_ begin_>),
               set(<center_ base_>),
               set(<end finish>);
my $result = @sets.reduce({ set $^a.keys X~ $^b.keys });
say $result.perl
# =>
Set.new("start_base_end","begin_center_finish","start_center_finish",
"start_center_end","start_base_finish","begin_base_end",
"begin_center_end","begin_base_finish")
That seems clean to me.
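For readers more at home in another language, the same cross-product-then-join idea can be sketched in Python with itertools.product. concat_sets is a hypothetical helper name, not part of any library.

```python
from itertools import product

def concat_sets(*sets):
    """Join one element from each set, for every combination, into a new set."""
    return {"".join(parts) for parts in product(*sets)}

left = {"start_", "begin_"}
middle = {"center_", "base_"}
right = {"end", "finish"}

print(sorted(concat_sets(left, right)))
# ['begin_end', 'begin_finish', 'start_end', 'start_finish']
print(len(concat_sets(left, middle, right)))  # 8
```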

How to check if 1 (or more) variables out of a set of variables is equal to a value?

I can't figure out how to check whether one (or more) variables out of a set of variables match a value.
For example:
let linecurr = getline(endlijn-line)
let lineabov = getline(endlijn-line-1)
if lineabov =~ '[!;:.?]\s*$'
\ || (lineabov || linecurr) =~ '^\s*$'
\ || (lineabov || linecurr) =~ '^\s*\(---\|===\)'
etc.
(lineabov || linecurr) --> This doesn't work.
How can I check if 1 (or more) variables out of a set of variables is equal to a value?
It seems you want to match against a group of variables. To do this, you could create a list and then see if anything in that list matches what you want.
let l = [ var1, var2, var3, var4 ]
if match(l, "pattern") != -1
...
match returns the index of the first item that matched, or -1 if none matched.
Check whether the first variable matches the pattern. If it doesn't, check the other variable:
(lineabov =~ '^\s*$') || (linecurr =~ '^\s*$')
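Both answers boil down to "test each variable and succeed if any test succeeds". As a language-neutral illustration, here is that logic as a small Python sketch; any_matches is a hypothetical helper, and the patterns use Python regex syntax, so Vim's \(---\|===\) becomes (---|===).

```python
import re

def any_matches(pattern, *values):
    """True if the pattern matches at least one of the given strings."""
    return any(re.search(pattern, value) for value in values)

line_above = "Some text."
line_current = "   "

print(any_matches(r"^\s*$", line_above, line_current))          # True
print(any_matches(r"^\s*(---|===)", line_above, line_current))  # False
```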

Combine options in inputdialog

I often use the inputdialog to execute a command using:
let n = confirm({msg} [, {choices} [, {default} [, {type}]]])
For example, to search for numbers:
if n == 1 --> e.g. search for all numbers with '.,'
if n == 2 --> e.g. search for all exponential numbers
if n == 3 --> e.g. search for all numbers with 3 digits
etc.
But with this method I can only choose one option.
Is there a way in Vim to choose multiple options together in an input dialog?
You could use input() to prompt the user to enter a string, and then inspect the returned string:
let string = input( {prompt} [, {text} [, {completion}]] )
For example, the user could enter 1,2,3, and you can do a text comparison of this string:
if ( string =~ 1 )
    " do something
endif
if ( string =~ 2 )
    " do something
endif
if ( string =~ 3 )
    " do something
endif
A more sophisticated approach (e.g. if there are more than 9 options) might be to split the string into a list:
let choice_list = split( string, ',' )
for choice in choice_list
    if choice == 1
        " do something
    endif
    if choice == 2
        " do something
    endif
    if choice == 3
        " do something
    endif
endfor
Since the returned string could be anything the user decides to enter, you might want to add some sanity checks that the string is indeed a list of integers.
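The sanity check itself is language-independent: split on commas, make sure every piece is an integer, and make sure it is one of the offered options. A minimal sketch of that validation in Python follows; parse_choices and its valid parameter are hypothetical names chosen for illustration.

```python
def parse_choices(raw, valid):
    """Turn '1, 3' into [1, 3]; raise ValueError on anything unexpected."""
    choices = []
    for token in raw.split(","):
        token = token.strip()
        # Reject empty pieces, non-digits and numbers that are not on the menu.
        if not token.isdigit() or int(token) not in valid:
            raise ValueError(f"invalid choice: {token!r}")
        choices.append(int(token))
    return choices

print(parse_choices("1, 3", {1, 2, 3}))  # [1, 3]
```

Inputs such as "1,x" or "1,,2" raise ValueError instead of being silently ignored.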
As a workaround, use the input() function: let the user choose multiple options and split them into a list to process them. An example:
Add the following function to your vimrc or a similar file:
func My_search()
    let my_grouped_opts = input ( "1.- Search one\n2.- Search two\n3.- Search three\n" )
    let my_list_opts = split( my_grouped_opts, '.\zs' )
    for opt in my_list_opts
        echo "Option number " opt " selected"
    endfor
endfunction
Call it:
:call My_search()
Your options will appear:
1.- Search one
2.- Search two
3.- Search three
Select them like:
23
And the function will split them into a list.