Perl replace with variable - scripting

I'm trying to replace a word in a string. The word is stored in a variable so naturally I do this:
$sentence = "hi this is me";
$foo=~ m/is (.*)/;
$foo = $1;
$sentence =~ s/$foo/you/;
print $newsentence;
But this doesn't work.
Any idea on how to solve this? Why this happens?

Perl lets you interpolate a string into a regular expression, as many of the answers have already shown. After that string interpolation, the result has to be a valid regex.
In your original try, you used the match operator, m//, which immediately tries to perform a match. You could have used the regular expression quoting operator in its place:
$foo = qr/me/;
You can either bind to that directly or interpolate it:
$string =~ $foo;
$string =~ s/$foo/replacement/;
You can read more about qr// in Regexp Quote-Like Operators in perlop.
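As a rough illustration of the point that the interpolated string must itself be a valid regex, the sketch below escapes a search text that contains metacharacters before interpolating it. The sample sentence and the variable names are made up for this example; quotemeta, \Q...\E and qr// are standard Perl.

```perl
use strict;
use warnings;

# Hypothetical sample data; the search text contains regex metacharacters.
my $sentence = "price: 3.50 (approx.)";
my $literal  = "(approx.)";

# quotemeta() backslash-escapes every non-word character, so the
# interpolated text is matched literally rather than as regex syntax.
my $escaped = quotemeta $literal;
(my $copy1 = $sentence) =~ s/$escaped/roughly/;

# \Q...\E applies the same escaping inline.
(my $copy2 = $sentence) =~ s/\Q$literal\E/roughly/;

# qr// builds a regex object that can be bound directly or interpolated.
my $re = qr/\Q$literal\E/;
my $matched = $sentence =~ $re ? 1 : 0;
(my $copy3 = $sentence) =~ s/$re/roughly/;

print "$copy1\n$copy2\n$copy3\n";
```

All three substitutions work on a copy, so $sentence keeps its original value.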

You have to replace the same variable, otherwise $newsentence is not set and Perl doesn't know what to replace:
$sentence = "hi this is me";
$foo = "me";
$sentence =~ s/$foo/you/;
print $sentence;
If you want to keep $sentence with its previous value, you can copy $sentence into $newsentence and perform the substitution, that will be saved into $newsentence:
$sentence = "hi this is me";
$foo = "me";
$newsentence = $sentence;
$newsentence =~ s/$foo/you/;
print $newsentence;

You first need to copy $sentence to $newsentence.
$sentence = "hi this is me";
$foo = "me";
$newsentence = $sentence;
$newsentence =~ s/$foo/you/;
print $newsentence;

Even for small scripts, please 'use strict' and 'use warnings'. Your code snippet uses $foo and $newsentence without initialising them, and 'strict' would have caught this. Remember that '=~' is for matching and substitution, not assignment. Also be aware that regexes in Perl aren't word-bounded by default, so the example expression you've got will set $1 to 'is me', the 'is' having matched the tail of 'this'.
Assuming you're trying to turn the string from 'hi this is me' to 'hi this is you', you'll need something like this:
my $sentence = "hi this is me";
$sentence =~ s/\bme$/you/;
print $sentence, "\n";
In the regex, '\b' is a word boundary, and '$' is end-of-line. Just doing 's/me/you/' will also work in your example, but would probably have unintended effects if you had a string like 'this is merry old me', which would become 'this is yourry old me'.
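When the word to replace is held in a variable, the word-boundary idea can be combined with interpolation. This is only a sketch: replace_word is a hypothetical helper name, and it assumes the word starts and ends with a word character, since \b has no useful meaning next to punctuation.

```perl
use strict;
use warnings;

# replace_word() is a hypothetical helper, not part of any module.
# It replaces whole-word occurrences of $word in $text with $replacement.
sub replace_word {
    my ($text, $word, $replacement) = @_;
    # \b anchors both ends at word boundaries;
    # \Q...\E escapes any metacharacters in $word.
    $text =~ s/\b\Q$word\E\b/$replacement/g;
    return $text;
}

print replace_word("this is merry old me", "me", "you"), "\n";
```

Because the function works on its own copy of the string, the caller's variable is left untouched and the result can be assigned wherever it is needed.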

Related

LIKE operator with a bindvar pattern which works also for special characters

I have such a query:
WHERE x LIKE $1
, where $1 is a bindvar string built in the backend:
$1 = "%" + PATTERN + "%"
Is it possible to build a LIKE PATTERN in that way that special characters (% and _) are escaped, so I have the same functionality, but it works for all possible PATTERN values.
You would want to escape the literal % and _ with backslash. For example, in PHP we might try:
$pattern = "something _10%_ else";
$pattern = preg_replace("/([%_])/", "\\\\$1", $pattern);
echo $pattern; // something \_10\%\_ else
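The same escaping step could be sketched in Perl as follows. like_contains is a hypothetical helper name, and the sketch assumes the database treats backslash as the LIKE escape character (or that the query adds an explicit ESCAPE clause). It also escapes the backslash itself, so a pattern that already contains one is not misread.

```perl
use strict;
use warnings;

# like_contains() is a hypothetical helper name.
# It escapes the LIKE wildcards (% and _) and the escape character itself,
# then wraps the result in % so the bind value means "contains".
sub like_contains {
    my ($pattern) = @_;
    $pattern =~ s/([\\%_])/\\$1/g;
    return '%' . $pattern . '%';
}

print like_contains("something _10%_ else"), "\n";
```

The returned string is meant to be passed as the bind value for $1 in the query, never concatenated into the SQL text.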

Combining regexes using a loop in Perl 6

Here I make a regex manually from Regex elements of an array.
my Regex @reg =
/ foo /,
/ bar /,
/ baz /,
/ pun /
;
my $r0 = @reg[0];
my $r1 = @reg[1];
my Regex $r = / 0 $r0 | 1 $r1 /;
"0foo_1barz" ~~ m:g/<$r>/;
say $/; # (「0foo」 「1bar」)
How to do it with for @reg {...}?
If a variable contains a regex, you can use it without further ado inside another regex.
The second trick is to use an array variable inside a regex, which is equivalent to the disjunction of the array elements:
my @reg =
/foo/,
/bar/,
/baz/,
/pun/
;
my @transformed = @reg.kv.map(-> $i, $rx { rx/ $i $rx /});
my @match = "0foo_1barz" ~~ m:g/ @transformed /;
.say for @match;
my @reg =
/foo/,
/bar/,
/baz/,
/pun/
;
my $i = 0;
my $reg = @reg
.map({ $_ = .perl; $_.substr(1, $_.chars - 2); })
.map({ "{$i++}{$_}" })
.join('|');
my @match = "foo", "0foo_1barz" ~~ m:g/(<{$reg}>) /;
say @match[1][0].Str;
say @match[1][1].Str;
# 0foo
# 1bar
See the docs
Edit: Actually read the docs myself. Changed implicit eval to $() construct.
Edit: Rewrote answer to something that actually works
Edit: Changed answer to a terrible, terrible hack

Change file paths

I want to change [%a/b] to [%a/c].
Basically, the same as Change path or refinement, but with file! instead:
I want to change the a/b inside a block to a/c
test: [a/b]
In this case, either change next test/1 'c or test/1/2: 'c works.
But not when test is a file!:
>> test: [%a/b]
== [%a/b]
>> test/1
== %a/b
>> test/1/2 ; can't access 2nd value
== %a/b/2
>> next first test ; not quite what you expect
== %/b
Trying to change it gives not something you'd expect:
>> change next test/1 'c
== %b
>> test
== [%acb]
You are confusing path! and file! series. They can look similar, but their natures are very different.
A path! is a collection of values (often word! values) separated by a slash symbol, a file! is a collection of char! values. Slash characters in file! series are just characters, so file! has no knowledge about any sub-structures. It has (mostly) the semantics of string! series, while path! has the semantics of a block! series.
Now that this is cleared up, about the test/1/2 result: path notation on a file! series behaves differently than on string!. It does a smart concatenation instead of acting as an accessor. It's called smart because it nicely handles extra slash characters present in the left and right parts. For example:
>> file: %/index.html
== %/index.html
>> path: %www/
== %www/
>> path/file
== %www/file
>> path/:file
== %www/index.html
Same path notation rule applies to url! series too:
>> url: http://red-lang.org
== http://red-lang.org
>> url/index.html
== http://red-lang.org/index.html
>> file: %/index.html
== %/index.html
>> url/:file
== http://red-lang.org/index.html
So for changing the nested content of test: [%a/b], as file! behaves basically as string!, you can use any available method for strings to modify it. For example:
>> test: [%a/b]
== [%a/b]
>> change skip test/1 2 %c
== %""
>> test
== [%a/c]
>> change next find test/1 slash "d"
== %""
>> test
== [%a/d]
>> parse test/1 [thru slash change skip "e"]
== true
>> test
== [%a/e]
Files are string types and can be manipulated in the same way you'd modify a string. For example:
test: [%a/b]
replace test/1 %/b %/c
This is because file! is an any-string!, not any-array!
>> any-string? %a/c
== true
>> any-array? 'a/c
== true
So the directory separator '/' in a file! doesn't mean anything special with the action CHANGE. So 'a', '/', and 'b' in %a/b are treated the same way, and the interpreter isn't trying to parse it into a two segment file path [a b].
While for a path!, because it's an array, each component is a rebol value, and the interpreter knows that. For instance, 'bcd' in a/bcd will be seen as a whole (a word!), instead of three characters 'b', 'c' and 'd'.
I agree that the file! being an any-string! is not convenient.
Here is a maybe cumbersome solution, but suitable for directories treating them as files
test/1: to-file head change skip split-path test/1 1 %c

Break down JSON string in simple perl or simple unix?

ok so i have this
{"status":0,"id":"7aceb216d02ecdca7ceffadcadea8950-1","hypotheses":[{"utterance":"hello how are you","confidence":0.96311796}]}
and at the moment i'm using this shell command to decode it to get the string i need,
echo $x | grep -Po '"utterance":.*?[^\\]"' | sed -e s/://g -e s/utterance//g -e 's/"//g'
but this only works when you have a grep compiled with perl regex support. the script i use to get that JSON string is written in perl, so is there any way i can do this same decoding in a simple perl script or a simpler unix command, or better yet, c or objective-c?
the script i'm using to get the json is here, http://pastebin.com/jBGzJbMk and if you want a file to use then download http://trevorrudolph.com/a.flac
How about:
perl -MJSON -nE 'say decode_json($_)->{hypotheses}[0]{utterance}'
in script form:
use JSON;
while (<>) {
print decode_json($_)->{hypotheses}[0]{utterance}, "\n"
}
Well, I'm not sure if I can deduce what you are after correctly, but this is a way to decode that JSON string in perl.
Of course, you'll need to know the data structure in order to get the data you need. The line that prints the "utterance" string is commented out in the code below.
use strict;
use warnings;
use Data::Dumper;
use JSON;
my $json = decode_json
q#{"status":0,"id":"7aceb216d02ecdca7ceffadcadea8950-1","hypotheses":[{"utterance":"hello how are you","confidence":0.96311796}]}#;
#print $json->{'hypotheses'}[0]{'utterance'};
print Dumper $json;
Output:
$VAR1 = {
'status' => 0,
'hypotheses' => [
{
'utterance' => 'hello how are you',
'confidence' => '0.96311796'
}
],
'id' => '7aceb216d02ecdca7ceffadcadea8950-1'
};
Quick hack:
while (<>) {
say for /"utterance":"?(.*?)(?<!\\)"/;
}
Or as a one-liner:
perl -lnwe 'print for /"utterance":"(.+?)(?<!\\)"/g' inputfile.txt
The one-liner is troublesome if you happen to be using Windows, since " is interpreted by the shell.
Quick hack#2:
This will hopefully go through any hash structure and find keys.
my $json = decode_json $str;
say find_key($json, 'utterance');
sub find_key {
my ($ref, $find) = @_;
if (ref $ref) {
if (ref $ref eq 'HASH' and defined $ref->{$find}) {
return $ref->{$find};
} else {
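for (ref $ref eq 'HASH' ? values %$ref : ref $ref eq 'ARRAY' ? @$ref : ()) { # values on a reference no longer works in current perls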
my $found = find_key($_, $find);
if (defined $found) {
return $found;
}
}
}
}
return;
}
Based on the naming, it's possible to have multiple hypotheses. This prints the utterance of each hypothesis:
echo '{"status":0,"id":"7aceb216d02ecdca7ceffadcadea8950-1","hypotheses":[{"utterance":"hello how are you","confidence":0.96311796}]}' | \
perl -MJSON::XS -n000E'
say $_->{utterance}
for @{ JSON::XS->new->decode($_)->{hypotheses} }'
Or as a script:
use feature qw( say );
use JSON::XS;
my $json = '{"status":0,"id":"7aceb216d02ecdca7ceffadcadea8950-1","hypotheses":[{"utterance":"hello how are you","confidence":0.96311796}]}';
say $_->{utterance}
for @{ JSON::XS->new->decode($json)->{hypotheses} };
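If JSON::XS is not installed, the same loop can be sketched with JSON::PP, which has shipped with Perl since 5.14. The helper name utterances is an assumption made for this example, as is the choice to treat a missing "hypotheses" key as an empty list.

```perl
use strict;
use warnings;
use JSON::PP;   # core module since Perl 5.14

# utterances() is a hypothetical helper name.
# It decodes a JSON document and returns the utterance of every hypothesis.
sub utterances {
    my ($json_text) = @_;
    my $data = JSON::PP->new->decode($json_text);
    # Treat a missing "hypotheses" key as an empty list (an assumption).
    return map { $_->{utterance} } @{ $data->{hypotheses} || [] };
}

my $json = '{"status":0,"id":"7aceb216d02ecdca7ceffadcadea8950-1","hypotheses":[{"utterance":"hello how are you","confidence":0.96311796}]}';
print "$_\n" for utterances($json);
```

Because a real decoder is used, escaped quotes inside an utterance come back already unescaped.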
If you don't want to use any modules from CPAN and try a regex instead there are multiple variants you can try:
# JSON is on a single line:
$json = '{"other":"stuff","hypo":[{"utterance":"hi, this is \"bob\"","moo":0}]}';
# RegEx with negative look behind:
# Match everything up to a double quote without a Backslash in front of it
print "$1\n" if ($json =~ m/"utterance":"(.*?)(?<!\\)"/)
This regex works if there is only one utterance. It doesn't matter what else is in the string around it, since it only searches for the double quoted string following the utterance key.
For a more robust version you could add whitespace where necessary/possible and make the . in the RegEx match newlines: m/"utterance"\s*:\s*"(.*?)(?<!\\)"/s
If there are multiple utterance/confidence objects, inconsistent case, or unusual formatting in the JSON string, try this:
# weird JSON:
$json = <<'EOJSON';
{
"status":0,
"id":"an ID",
"hypotheses":[
{
"UtTeraNcE":"hello my name is \"Bob\".",
"confidence":0.0
},
{
'utterance' : 'how are you?',
"confidence":0.1
},
{
"utterance"
: "
thought
so!
",
"confidence" : 0.9
}
]
}
EOJSON
# RegEx with alternatives:
print "$1\n" while ( $json =~ m/["']utterance["']\s*:\s*["'](([^\\"']|\\.)*)["']/gis);
The main part of this RegEx is "(([^\\"]|\\.)*)". Description in detail as extended regex:
/
["'] # opening quotes
( # start capturing parentheses for $1
( # start of grouping alternatives
[^\\"'] # anything that's not a backslash or a quote
| # or
\\. # a backslash followed by anything
) # end of grouping
* # in any quantity
) # end capturing parentheses
["'] # closing quotes
/xgs
If you have many data sets and speed is a concern, you can use character classes instead of the i modifier. The o modifier is sometimes added as well, but it only affects patterns that interpolate variables, so it makes no difference to this one. You can suppress the capturing of the alternatives to $2 with clustering parentheses (?:pattern). Then you get this final result:
m/["'][uU][tT][tT][eE][rR][aA][nN][cC][eE]["']\s*:\s*["']((?:[^\\"']|\\.)*)["']/gos
Yes, sometimes perl looks like a big explosion in a bracket factory ;-)
Just stumbled upon another nice method of doing this: i finally found how to access the Mac OS X JavaScript engine from the command line. here's the script,
alias jsc='/System/Library/Frameworks/JavaScriptCore.framework/Versions/A/Resources/jsc'
x='{"status":0,"id":"7aceb216d02ecdca7ceffadcadea8950-1","hypotheses":[{"utterance":"hello how are you","confidence":0.96311796}]}'
jsc -e "print(${x}['hypotheses'][0]['utterance'])"
Ugh, yes, i came up with another answer. i'm studying python, and it reads this JSON as one of its own dict literals (as long as the JSON contains no true, false or null), so i just made this one-liner for when your variable is x
python -c "print(${x}['hypotheses'][0]['utterance'])"
figured it out for unix but would love to see your perl and c, objective-c answers...
echo $X | sed -e 's/.*utterance//' -e 's/confidence.*//' -e s/://g -e 's/"//g' -e 's/,//g'
:D
shorter copy of the same sed:
echo $X | sed -e 's/.*utterance//;s/confidence.*//;s/://g;s/"//g;s/,//g'

Variable expansion and escaped characters

In PowerShell, you can expand variables within strings as shown below:
$myvar = "hello"
$myvar1 = "$myvar`world" #without the `, powershell would look for a variable called $myvarworld
Write-Host $myvar1 #prints helloworld
The problem I am having is with escaped characters like `n, `r etc., as shown below:
$myvar3 = "$myvar`albert"
Write-Host $myvar3 #prints hellolbert as `a is an alert
Also, the following doesn't work:
$myvar2 = "$myvar`frank" #doesnt work
Write-Host $myvar2 #prints hellorank.
Question:
How do I combine the strings without worrying about escaped characters when I am using the automatic variable expansion feature?
Or do I have to do it only this way:
$myvar = "hello"
$myvar1 = "$myvar"+"world" #using +
Write-Host $myvar1
$myvar2 = "$myvar"+"frank" #using +
This way is not yet mentioned:
"$($myvar)frank"
And this:
"${myvar}frank"
This seems kind of kludgy, but as another option, you can add a space and a backspace:
$myvar = "hello"
$myvar1 = "$myvar `bworld"
$myvar1
Yet another option is to wrap your variable expression in a $():
$myvar3 = "$($myvar)albert"
Write-Host $myvar3
One other option is through the format operator:
"{0}world" -f $myvar
Another option is a double-quoted here-string:
$myvar = "Hello"
$myvar2 = @"
$myvar$("frank")
"@