how is reduce defined (using foreach) - iteration

I'm having a hard time understanding how to use foreach. I kind of understand the text, but the example given in the manual is a little bit over my head. Can you please show me how to define the reduce operation using foreach?

reduce SOURCE as $VAR (INIT; REDUCTION)
is equivalent to
[ foreach SOURCE as $VAR (INIT; REDUCTION; .) ] | last
Conversely,
foreach SOURCE as $VAR (INIT; UPDATE; EXTRACT)
is equivalent to
reduce SOURCE as $VAR (
{ state: ( INIT ), rv: [] };
.state |= ( UPDATE ) |
.rv += [ .state | EXTRACT ]
) | .rv[]
As you can see, they are very similar. Practically identical, in fact.
Use reduce when you want a single result.
Use foreach when you want a result for each input.
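To make the relationship concrete outside jq, here is a minimal Python sketch that models the two constructs. The helper names jq_foreach and jq_reduce and the EMPTY sentinel are hypothetical, and the model is simplified: it assumes UPDATE and EXTRACT each produce at most one value, whereas in jq they may emit several.

```python
EMPTY = object()  # sentinel standing in for jq's `empty`


def jq_foreach(source, init, update, extract=lambda state, var: state):
    """Yield one extracted value per input while threading a state through."""
    state = init
    for var in source:
        state = update(state, var)
        out = extract(state, var)
        if out is not EMPTY:
            yield out


def jq_reduce(source, init, update):
    """Model reduce as foreach with EXTRACT = `.`, keeping only the last output."""
    result = init
    for result in jq_foreach(source, init, update):
        pass
    return result


numbers = [1, 2, 3, 4]


def add(state, var):
    return state + var


# like: foreach .[] as $n (0; . + $n; .)
running = list(jq_foreach(numbers, 0, add))
# like: reduce .[] as $n (0; . + $n)
total = jq_reduce(numbers, 0, add)
# like: foreach .[] as $n (0; . + $n; if . > 3 then . else empty end)
big = list(jq_foreach(numbers, 0, add, lambda s, v: s if s > 3 else EMPTY))

print(running)  # [1, 3, 6, 10]
print(total)    # 10
print(big)      # [6, 10]
```

The running totals are what foreach emits step by step; reduce keeps only the final one.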
For example, say we have the following input:
["abc", "def", "ghi", "jkl", "mno"]
foreach .[] as $var (false; not; .)
produces
true
false
true
false
true
so
foreach .[] as $var (false; not; if . then $var else empty end)
produces
"abc"
"ghi"
"mno"

awk does not get multiple matches in a line with match

AWK has the match(s, r [, a]) function which according to the manual is capable of recording all occurring patterns into array "a":
...If array a is provided, a is cleared and then elements 1 through n are filled with the portions of s that match the corresponding parenthesized subexpression in r. The 0'th element of a contains the portion of s matched by the entire regular expression r. Subscripts a[n, "start"], and a[n, "length"] provide the starting index in the string and length respectively, of EACH matching substring.
I expect that the following line:
echo 123412341234 | awk '{match($0,"1",arr); print arr[0] arr[1] arr[2]}'
prints 111
But in fact "match" ignores all other matches except the first one.
Could someone please tell me the proper syntax to populate "arr" with all occurrences of "1"?
match only finds the first match and stops there. You will have to run match in a loop, or else split the input on anything that is not 1:
echo '123412341234' | awk -F '[^1]+' '{print $1 $2 $3}'
111
Or using split in gnu-awk:
echo '123412341234' | awk 'split($0, a, /1/, m) {print m[1] m[2] m[3]}'
111
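The "run match in a loop" idea can be sketched as follows. This is a Python illustration rather than awk, and the function name all_matches is hypothetical: each search resumes where the previous match ended, which is the same bookkeeping an awk loop would do with RSTART and RLENGTH.

```python
import re


def all_matches(pattern, text):
    """Collect every non-overlapping match by searching repeatedly."""
    regex = re.compile(pattern)
    found = []
    pos = 0
    while pos <= len(text):
        m = regex.search(text, pos)
        if m is None:
            break
        found.append(m.group(0))
        # Resume after this match; step one extra character on an empty
        # match so the loop always makes progress.
        pos = m.end() if m.end() > m.start() else m.end() + 1
    return found


result = all_matches("1", "123412341234")
print("".join(result))  # 111
```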
I would use GNU AWK's patsplit function for that task in the following way. Let the content of file.txt be
123412341234
then
awk '{patsplit($0,arr,"1");print arr[1] arr[2] arr[3]}' file.txt
gives output
111
Explanation: patsplit is a function that gives an effect similar to the FPAT variable: it puts all matches of the 3rd argument found in the string given as the 1st argument into the array given as the 2nd argument (clearing the array first if it is not empty). Observe that the 1st match goes under key 1, the 2nd under 2, the 3rd under 3, and so on (there is nothing under 0).
(tested in GNU Awk 5.0.1)
If gsub is allowed, you can do a substitution here. Try the following awk code.
awk '{gsub(/[^1]+/,"")} 1' Input_file
patsplit() is basically the same as wrapping each match of the desired regex in a custom pair of SEPs before splitting, which is what anysplit() is emulating here, while being UTF-8 friendly.
echo "123\uC350abc:\uF8FF:|\U1F921#xyz" |
mawk2x '{ print ("\t\f"($0)"\n")>>(STDERR)
anysplit($_, reFLEX_UCode8 "|[[-_!-/3-?]",___=2,__)
OFS="\t"
for(_ in __) { if (!(_%___)) {
printf(" matched_items[ %2d ] = # %-2d = \42%s\42\n",
_,_/___,__[_])
} } } END { printf(ORS) }'
123썐abc::|🤡#xyz
matched_items[ 2 ] = # 1 = "3썐"
matched_items[ 4 ] = # 2 = "::"
matched_items[ 6 ] = # 3 = "🤡#"
In the background, anysplit() is nothing all that complicated either :
xs3pFS is a 3-byte string of \301\032\365 that I assumed would be extremely rare to show up even in binary data.
gsub(patRE, xs3pFS ((pat=="&")?"\\":"") "&" xs3pFS,_)
gsub(xs3pFS "("xs3pFS")+", "",_)
return split(_, ar8, xs3pFS)
By splitting the input string in this manner, all the desired items would exist in even-numbered array indices, while the rest of the string would be distributed along odd-numbered indices,
somewhat similar to the 2nd array i.e. 4th argument in gawk's split() and patsplit() for the seps, but difference being that both the matches and the seps, whichever way you want to see them, are in the same array.
When you print out every cell in the array, you'll see :
_SEPS_[ 1 ] = # 1 = "123"
matched_items[ 2 ] = # 1 = "썐"
_SEPS_[ 3 ] = # 2 = "abc"
matched_items[ 4 ] = # 2 = "::"
_SEPS_[ 5 ] = # 3 = "|"
matched_items[ 6 ] = # 3 = "🤡#"
_SEPS_[ 7 ] = # 4 = "xyz"
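The same interleaved layout can be seen with Python's re.split, which keeps the matches in the result when the pattern is wrapped in a capturing group. This is only an analogy for the array layout described above, not a port of anysplit(); the sample string and pattern are made up for illustration.

```python
import re

text = "123X45abc::|Y#xyz"

# The capturing group makes re.split keep each match in the result list.
parts = re.split(r"(X45|::|Y#)", text)

# With 0-based indexing the matches sit at odd positions and the rest at
# even positions, which corresponds to even/odd keys in awk's 1-based arrays.
matches = parts[1::2]
seps = parts[0::2]

print(parts)    # ['123', 'X45', 'abc', '::', '|', 'Y#', 'xyz']
print(matches)  # ['X45', '::', 'Y#']
print(seps)     # ['123', 'abc', '|', 'xyz']
```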

Tcl : Guaranteed evaluation sequence of a boolean expression?

Let's say I have a conditional Tcl expression that is a boolean combination of steps.
Will the expression always be evaluated left to right (excluding parentheses)?
If the expression becomes true will the rest of the evaluation stop?
I have this piece of code that parses a file and conditionally replaces stuff in the lines.
set fp [ open "file" ]
set data [ read $fp ]
close $fp
foreach line [ split $data \n ] {
if { $enable_patch && [ regsub {<some_pattern>} $line {<some_other_pattern>} line ]} {
puts $outfp $line
<do_some_more_stuff>
}
}
So my issue here is that unless enable_patch is true, I don't want the line to be modified. Now my test shows that the code is deterministic in Tcl 8.5 on Linux. But I am wondering if this would break under other conditions/ versions/ OSes.
Yes, the || and && operators are "short-circuiting" operators in Tcl. That means you can rely on them being evaluated left-to-right, and that evaluation will stop as soon as the value of the expression is known.
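The practical consequence is that the right-hand operand, including any side effects it has, is skipped when the left-hand operand already decides the result. Here is a small Python sketch of the same pattern (Python's and operator short-circuits in the same way). The patch function and the foo/bar patterns are hypothetical stand-ins for the regsub call; the code needs Python 3.8 or later for the := operator.

```python
import re

calls = []  # records each time the right-hand operand is evaluated


def patch(line):
    """Stand-in for regsub: log the call, return (changed, new_line)."""
    calls.append(line)
    new_line, count = re.subn(r"foo", "bar", line)
    return count > 0, new_line


def process(lines, enable_patch):
    out = []
    for line in lines:
        # `and` short-circuits: patch() runs only when enable_patch is true.
        if enable_patch and (result := patch(line))[0]:
            out.append(result[1])
    return out


lines = ["foo 1", "baz 2", "foo 3"]

print(process(lines, False), len(calls))  # [] 0
print(process(lines, True), len(calls))   # ['bar 1', 'bar 3'] 3
```

With enable_patch false, patch is never called, so no line is touched.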

How do I write contents of a variable to a text file in Rebol 2?

Newbie question here...
I'd like to write the output of the "what" function to a text file.
So here is what I have done:
I've created a variable called "text" and assigned the output of "what" to it
text: [what]
Now I want to write the content of the "text" variable to a txt file...
Any help is appreciated. Thanks in advance!
The easiest way to write the output of statements to a file is to use
echo %file.log
what
You end this with echo none.
>> help echo
USAGE:
ECHO target
DESCRIPTION:
Copies console output to a file.
ECHO is a function value.
ARGUMENTS:
target -- (Type: file none logic)
(SPECIAL ATTRIBUTES)
catch
Unfortunately there is not really a value returned from the what function:
Try the following in the console:
print ["Value of `what` is: " what]
So write %filename.txt [what] will not work.
Instead, what you could do is to look at the source of what
source what
which returns:
what: func [
"Prints a list of globally-defined functions."
/local vals args here total
][
total: copy []
vals: second system/words
foreach word first system/words [
if any-function? first vals [
args: first first vals
if here: find args /local [args: copy/part args here]
append total reduce [word mold args]
]
vals: next vals
]
foreach [word args] sort/skip total 2 [print [word args]]
exit
]
Note that this function only prints (it doesn't return the values it finds). We can modify the script to do what you want:
new-what: func [
"Returns a list of globally-defined functions."
/local vals args here total collected
][
collected: copy []
total: copy []
vals: second system/words
foreach word first system/words [
if any-function? first vals [
args: first first vals
if here: find args /local [args: copy/part args here]
append total reduce [word mold args]
]
vals: next vals
]
foreach [word args] sort/skip total 2 [append collected reduce [word tab args newline]]
write %filename.txt collected
exit
]
This function is a little hackish (the filename is hard-coded, but it will write out what you want). You can extend the function to accept a filename or do whatever you want. The tab and newline are there to make the file output prettier.
Important things to notice from this:
Print returns unset
Use source to find out what functions do
write %filename value will write out a value to a file all at once. If you open a file, you can write more times.
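The first point (a function that only prints gives you nothing to store) is not specific to Rebol. As a loose analogy, here is a Python sketch in which a hypothetical what function only prints, and its console output is captured instead, much as echo does; the printed lines are made-up sample data.

```python
import contextlib
import io


def what():
    """Stand-in for a function that only prints and returns nothing."""
    print("append series value")
    print("print value")


# The return value is useless: the function only produces console output.
with contextlib.redirect_stdout(io.StringIO()):
    returned = what()

# Capturing the console output gives us the text we actually want to save.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    what()
captured = buffer.getvalue()

print(returned)          # None
print(captured, end="")  # the two lines printed by what()
```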
Fairly elementary: use write if you just want to save some text, read to recover it; use save if you want to store some data and use load to recover it.
>> write %file.txt "Some Text"
>> read %file.txt
== "Some Text"
>> text: [what]
>> save/all %file.r text
>> load %file.r
== [what]
You can get more information on each word at the prompt: help save or view online: load, save, read and write.

Powershell Code for List of Distinct Directories

I have a file that contains a list such as:
tables\mytable1.sql
tables\myTable2.sql
procedures\myProc1.sql
functions\myFunction1.sql
functions\myFunction2.sql
From this data (and there will always be a path, and it will always be only one level), I want to retrieve a list of distinct paths (e.g. tables\, procedures\, functions\)
To maybe make it easier, the file that contains this data will already have been read into a list (named $fileList), so the new list ($directoryList ??) can likely be derived from it.
I've found references to the -unique parameter, but I need to look from the start of the line up to (and including) the '\', of which there will only be one occurrence.
Assuming you already have the data in $fileList, try this:
$directoryList = $fileList | %{ $_.split("\")[0]} | select -unique
It will do a foreach (the %{}) on the elements of your list, split each one on the \ and keep only the first part (in your case, the folder name); after that, select -unique keeps just the distinct values.
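For comparison, the same two steps (take the part before the first backslash, then de-duplicate while keeping first-seen order) look like this in Python. This is an illustrative sketch, not PowerShell; file_list stands in for $fileList.

```python
file_list = [
    r"tables\mytable1.sql",
    r"tables\myTable2.sql",
    r"procedures\myProc1.sql",
    r"functions\myFunction1.sql",
    r"functions\myFunction2.sql",
]

# Step 1: keep only the text before the first backslash.
first_parts = [path.split("\\")[0] for path in file_list]

# Step 2: drop duplicates. A dict preserves insertion order, so the result
# keeps the order in which each directory first appeared.
directory_list = list(dict.fromkeys(first_parts))

print(directory_list)  # ['tables', 'procedures', 'functions']
```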
Alternatively, you could do it like this:
$fileList | %{ $_ -replace "\\.*$","" } | select -unique
Using -replace to remove everything after the \.
Also, if for some reason you don't have the contents of your text file in $fileList already, you can load them using:
$fileList = Get-Content yourFile.txt
Your file may contain empty lines, and more often than not the last line is empty, so this version accounts for that.
It also uses a slightly different regular expression, which matches the last \ and the run of non-\ characters after it up to the end of the string, so it works for paths with multiple levels as well as your example.
If you have a text file with the following:
Z:\Path to somewhere\Files\some file 1.txt
Z:\Path to somewhere\Files\some file 2.txt
tables\mytable1.sql
tables\myTable2.sql
procedures\myProc1.sql
functions\myFunction1.sql
functions\myFunction2.sql
With this code which also shows the output after the function:
$fileListToProcess = "$([Environment]::GetFolderPath(""Desktop""))\list.txt"
Function Get-UniqueDirectoriesFromFile {
Param
(
[Parameter(Mandatory = $true, HelpMessage = 'The file where the list of files is.')]
[string]$LiteralPath
)
if (Test-Path -LiteralPath $LiteralPath -PathType Leaf) {
$fileList = [IO.File]::ReadAllLines($LiteralPath)
return $fileList | %{ $_ -replace '\\[^\\]*$', '' } | ? { $_.trim() -ne "" } | Select -Unique
}
else {
return $null
}
}
$uniqueDirs = Get-UniqueDirectoriesFromFile -LiteralPath $fileListToProcess
# Display the results:
$uniqueDirs
# PS>
# Z:\Path to somewhere\Files
# tables
# procedures
# functions
$uniqueDirs.count
# PS> 4

Break down JSON string in simple perl or simple unix?

OK, so I have this:
{"status":0,"id":"7aceb216d02ecdca7ceffadcadea8950-1","hypotheses":[{"utterance":"hello how are you","confidence":0.96311796}]}
and at the moment I'm using this shell command to decode it and get the string I need:
echo $x | grep -Po '"utterance":.*?[^\\]"' | sed -e s/://g -e s/utterance//g -e 's/"//g'
but this only works when you have a grep compiled with Perl regex support, and the script I use to get that JSON string is written in Perl anyway. So is there any way I can do this same decoding in a simple Perl script or a simpler Unix command, or better yet, C or Objective-C?
the script i'm using to get the json is here, http://pastebin.com/jBGzJbMk and if you want a file to use then download http://trevorrudolph.com/a.flac
How about:
perl -MJSON -nE 'say decode_json($_)->{hypotheses}[0]{utterance}'
in script form:
use JSON;
while (<>) {
print decode_json($_)->{hypotheses}[0]{utterance}, "\n"
}
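For comparison, here is the equivalent lookup with Python's standard-library json module. It is a sketch that assumes the input has the structure shown in the question.

```python
import json

raw = (
    '{"status":0,"id":"7aceb216d02ecdca7ceffadcadea8950-1",'
    '"hypotheses":[{"utterance":"hello how are you","confidence":0.96311796}]}'
)

data = json.loads(raw)

# Walk the structure: object -> "hypotheses" array -> first item -> "utterance".
utterance = data["hypotheses"][0]["utterance"]

print(utterance)  # hello how are you
```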
Well, I'm not sure if I can deduce what you are after correctly, but this is a way to decode that JSON string in perl.
Of course, you'll need to know the data structure in order to get the data you need. The line that prints the "utterance" string is commented out in the code below.
use strict;
use warnings;
use Data::Dumper;
use JSON;
my $json = decode_json
q#{"status":0,"id":"7aceb216d02ecdca7ceffadcadea8950-1","hypotheses":[{"utterance":"hello how are you","confidence":0.96311796}]}#;
#print $json->{'hypotheses'}[0]{'utterance'};
print Dumper $json;
Output:
$VAR1 = {
'status' => 0,
'hypotheses' => [
{
'utterance' => 'hello how are you',
'confidence' => '0.96311796'
}
],
'id' => '7aceb216d02ecdca7ceffadcadea8950-1'
};
Quick hack:
while (<>) {
say for /"utterance":"?(.*?)(?<!\\)"/;
}
Or as a one-liner:
perl -lnwe 'print for /"utterance":"(.+?)(?<!\\)"/g' inputfile.txt
The one-liner is troublesome if you happen to be using Windows, since " is interpreted by the shell.
Quick hack#2:
This will hopefully go through any hash structure and find keys.
my $json = decode_json $str;
say find_key($json, 'utterance');
sub find_key {
my ($ref, $find) = @_;
if (ref $ref) {
if (ref $ref eq 'HASH' and defined $ref->{$find}) {
return $ref->{$find};
} else {
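for (ref $ref eq 'HASH' ? values %$ref : ref $ref eq 'ARRAY' ? @$ref : ()) {  # explicit dereference; "values $ref" no longer works since Perl 5.24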
my $found = find_key($_, $find);
if (defined $found) {
return $found;
}
}
}
}
return;
}
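The same recursive search can be sketched in Python, where decoded JSON objects are dicts and arrays are lists. The function name mirrors the Perl one above; the sample document is a shortened version of the one in the question.

```python
import json


def find_key(node, key):
    """Return the first non-null value stored under `key` anywhere in `node`."""
    if isinstance(node, dict):
        if node.get(key) is not None:
            return node[key]
        children = node.values()
    elif isinstance(node, list):
        children = node
    else:
        return None  # scalars have nothing to descend into
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


doc = json.loads(
    '{"status":0,"hypotheses":[{"utterance":"hello how are you","confidence":0.96}]}'
)

print(find_key(doc, "utterance"))  # hello how are you
```

Like the Perl version, it returns only the first hit, so it is not suitable when several hypotheses must be listed.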
Based on the naming, it's possible to have multiple hypotheses. This prints the utterance of each hypothesis:
echo '{"status":0,"id":"7aceb216d02ecdca7ceffadcadea8950-1","hypotheses":[{"utterance":"hello how are you","confidence":0.96311796}]}' | \
perl -MJSON::XS -n000E'
say $_->{utterance}
for @{ JSON::XS->new->decode($_)->{hypotheses} }'
Or as a script:
use feature qw( say );
use JSON::XS;
my $json = '{"status":0,"id":"7aceb216d02ecdca7ceffadcadea8950-1","hypotheses":[{"utterance":"hello how are you","confidence":0.96311796}]}';
say $_->{utterance}
for @{ JSON::XS->new->decode($json)->{hypotheses} };
If you don't want to use any modules from CPAN and try a regex instead there are multiple variants you can try:
# JSON is on a single line:
$json = '{"other":"stuff","hypo":[{"utterance":"hi, this is \"bob\"","moo":0}]}';
# RegEx with negative look behind:
# Match everything up to a double quote without a Backslash in front of it
print "$1\n" if ($json =~ m/"utterance":"(.*?)(?<!\\)"/);
This regex works if there is only one utterance. It doesn't matter what else is in the string around it, since it only searches for the double quoted string following the utterance key.
For a more robust version you could add whitespace where necessary/possible and make the . in the RegEx match newlines: m/"utterance"\s*:\s*"(.*?)(?<!\\)"/s
If you have multiple utterance/confidence objects, mixed-case keys, and irregular formatting in the JSON string, try this:
# weird JSON:
$json = <<'EOJSON';
{
"status":0,
"id":"an ID",
"hypotheses":[
{
"UtTeraNcE":"hello my name is \"Bob\".",
"confidence":0.0
},
{
'utterance' : 'how are you?',
"confidence":0.1
},
{
"utterance"
: "
thought
so!
",
"confidence" : 0.9
}
]
}
EOJSON
# RegEx with alternatives:
print "$1\n" while ( $json =~ m/["']utterance["']\s*:\s*["'](([^\\"']|\\.)*)["']/gis);
The main part of this RegEx is ["'](([^\\"']|\\.)*)["']. Description in detail as extended regex:
/
["'] # opening quotes
( # start capturing parentheses for $1
( # start of grouping alternatives
[^\\"'] # anything that's not a backslash or a quote
| # or
\\. # a backslash followed by anything
) # end of grouping
* # in any quantity
) # end capturing parentheses
["'] # closing quotes
/xgs
If you have many data sets and speed is a concern you can add the o modifier to the regex and use character classes instead of the i modifier. You can suppress the capturing of the alternatives to $2 with clustering parentheses (?:pattern). Then you get this final result:
m/["'][uU][tT][tT][eE][rR][aA][nN][cC][eE]["']\s*:\s*["']((?:[^\\"']|\\.)*)["']/gos
Yes, sometimes perl looks like a big explosion in a bracket factory ;-)
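The escape-aware alternation carries over to other regex engines. Below is a Python sketch of the same idea, simplified to accept only double-quoted strings (which is all that valid JSON allows), so it is not a one-to-one port of the pattern above.

```python
import re

raw = r'{"other":"stuff","hypo":[{"utterance":"hi, this is \"bob\"","moo":0}]}'

# The string body is any run of: a character that is neither a backslash
# nor a quote, or a backslash followed by any one character (an escape).
pattern = re.compile(
    r'"utterance"\s*:\s*"((?:[^\\"]|\\.)*)"',
    re.IGNORECASE | re.DOTALL,
)

found = [m.group(1) for m in pattern.finditer(raw)]

for text in found:
    print(text)  # hi, this is \"bob\"
```

The captured text still contains the backslash escapes; turning them into real characters is a separate step that a JSON parser would do for you.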
Just stumbled upon another nice method of doing this: I finally found how to access the Mac OS X JavaScript engine from the command line. Here's the script:
alias jsc='/System/Library/Frameworks/JavaScriptCore.framework/Versions/A/Resources/jsc'
x='{"status":0,"id":"7aceb216d02ecdca7ceffadcadea8950-1","hypotheses":[{"utterance":"hello how are you","confidence":0.96311796}]}'
jsc -e "print(${x}['hypotheses'][0]['utterance'])"
Ugh, yes, I came up with another answer. I'm studying Python, and its list and dict literals look close enough to JSON that it can read this string directly, so I just made this one-liner for when your variable is x:
python -c "print(${x}['hypotheses'][0]['utterance'])"
Figured it out for Unix, but would love to see your Perl and C, Objective-C answers...
echo $X | sed -e 's/.*utterance//' -e 's/confidence.*//' -e s/://g -e 's/"//g' -e 's/,//g'
:D
shorter copy of the same sed:
echo $X | sed -e 's/.*utterance//;s/confidence.*//;s/://g;s/"//g;s/,//g'