How to split a string on commas, but not on commas inside double quotes - raku

I want to split this string on commas, but not on the commas inside double quotes ("):
my $str = '1,2,3,"4,5,6"';
.say for $str.split(/','/) # Or use comb?
The output should be:
1
2
3
"4,5,6"

A fast solution with comb: take anything that is neither " nor ,
or take a whole quoted string
my $str = '1,2,3,"4,5,6",7,8';
.say for $str.comb: / <-[",]>+ | <["]> ~ <["]> <-["]>+ / ;

As @melpomene suggested, using the Text::CSV module works too.
use Text::CSV;
my $str = '123,456,"78,91",abc,"de,f","ikm"';
for csv(in => csv(in => [$str], sep_char => ",")) -> $arr {
.say for @$arr;
}
which outputs:
123
456
78,91
abc
de,f
ikm

This may help:
my $str = ‘1,2,3,"4,5,6",7,8’;
for $str.split(/ \" \d+ % ',' \"/, :v) -> $l {
if $l.contains('"') {
say $l.Str;
} else {
.say for $l.comb(/\d+/);
}
}
Output:
1
2
3
"4,5,6"
7
8

Related

concatenating string with multiple array

I'm trying to rearrange values from specific strings into their respective columns.
Here is the input
String 1: 47/13528
String 2: 55(s)
String 3:
String 4: 114(n)
String 5: 225(s), 26/10533-10541
String 6: 103/13519
String 7: 10(s), 162(n)
String 8: 152/12345,12346
(d=dead, n=null, s=strike)
The alphabet in each value is the flag (d=dead, n=null, s=strike).
The String with value (digit) which is "String 1" will be the 47c1
etc:
String 1: 47/13528
value without any flag will be sorted into the null column along with null tag (n)
String 1 (the integer will be concatenated with 47/13528)
Sorted :
null
47c1#SP13528;114c4;103c6#SP13519;162c7
Str#2: 55(s)
flagged with (s) will be sorted into strike column
Sorted :
strike
55c2;225c5;26c5#SP10533-10541;162c7
I'm trying to parse it by modifying previous code, but with no luck so far:
{
for (i=1; i<=NF; i++) {
num = $i+0
abbr = $i
gsub(/[^[:alpha:]]/,"",abbr)
list[abbr] = list[abbr] num " c " val ORS
}
}
END {
n = split("dead null strike",types)
for (i=1; i<=n; i++) {
name = types[i]
abbr = substr(name,1,1)
printf "name,list[abbr]\n"
}
}
Expected Output (sorted into csv):
dead,null,strike
,47c1#SP13528;114c4; 26c5#SP10533-10541;103c6#SP13519;162c7, 152c8#SP12345;152c8#SP12346,55c2;225c5;162c7;10c7
Breakdown for crosscheck purpose:
dead
none
null
47c1#SP13528;114c4;103c6#SP13519;162c7;152c8#SP12345;152c8#SP12346;26c5#SP10533-10541;;162c7
strike
55c2;225c5;10c7
Here is an awk script for parsing your file.
BEGIN {
types["d"]; types["n"]; types["s"]
deft = "n"; OFS = ","; sep = ";"
}
$1=="String" {
gsub(/[)(]/,""); gsub(",", " ") # general line subs
for (i=3;i<=NF;i++) {
if (!gsub("/","c"$2+0"#SP", $i)) $i = $i"c"$2+0 # make all subs on items
for (t in types) { if (gsub(t, "", $i)) { x=t; break }; x=deft } #find type
items[x] = items[x]? items[x] sep $i: $i # append for type found
}
}
END {
print "dead" OFS "null" OFS "strike"
print items["d"] OFS items["n"] OFS items["s"]
}
Input:
String 1: 47/13528
String 2: 55(s)
String 3:
String 4: 114(n)
String 5: 225(s), 26/10533-10541
String 6: 103/13519
String 7: 10(s), 162(n)
String 8: 152/12345,12346
(d=dead, n=null, s=strike)
Output:
> awk -f tst.awk file
dead,null,strike
,47c1#SP13528;114c4;26c5#SP10533-10541;103c6#SP13519;162c7;152c8#SP12345;12346c8,55c2;225c5;10c7
Your description kept changing on important details, like how the type of an item is decided or how items are separated, and so far your inputs and outputs are not consistent with it, but in general I think you can easily follow what this script does. Keep in mind that gsub() makes the substitutions and also returns how many it made, so it is often convenient to use it as a condition.
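For example (a minimal illustration of that gsub() property, not part of the script above):
$ echo 'a/b/c' | awk '{ if (n = gsub("/", "-")) print n " substitutions made: " $0 }'
2 substitutions made: a-b-c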
My usual approach is:
First preprocess the data so that each piece of information is on its own line.
Then preprocess the data so that each piece of information is in its own column, row-wise.
Then it's easy - just accumulate the columns in an array in awk and print them.
The following code:
cat <<EOF |
String 1: 47/13528
String 2: 55(s)
String 3:
String 4: 114(n)
String 5: 225(s), 26/10533-10541
String 6: 103/13519
String 7: 10(s), 162(n)
String 8: 152/12345,12346
(d=dead, n=null, s=strike)
EOF
sed '
# filter only lines with String
/^String \([0-9]*\): */!d;
# Remove the String
# Remove the : and spaces
s//\1 /
# remove trailing spaces
s/ *$//
# Remove lines with nothing
/^[0-9]* *$/d
# remove the commas and split lines on comma
# by moving them to separate lines
# repeat that until a comma is found
: a
/\([0-9]*\) \(.*\), *\(.*\)/{
s//\1 \2\n\1 \3/
ba
}
' | sed '
# we should be having two fields here
# separated by a single space
/^[^ ]* [^ ]*$/!{
s/.*/ERROR: "&"/
q1
}
# Move the name in braces to separate column
/(\(.\))$/{
s// \1/
b not
} ; {
# default is n
s/$/ n/
} ; : not
# shuffle first and second field
# to that <num>c<num>(#SP<something>)? format
# if second field has a "/"
\~^\([0-9]*\) \([0-9]*\)/\([^ ]*\)~{
# then add a SP
s//\2c\1#SP\3/
b not2
} ; {
# otherwise just do a "c" between
s/\([0-9]*\) \([0-9]*\)/\2c\1/
} ; : not2
' |
sort -n -k1 |
# now it's trivial
awk '
{
out[$2] = out[$2] (!length(out[$2])?"":";") $1
}
function outputit(name, idx) {
print name
if (length(out[idx]) == 0) {
print "none"
} else {
print out[idx]
}
printf "\n"
}
END{
outputit("dead", "d")
outputit("null", "n")
outputit("strike", "s")
}
'
outputs on repl:
dead
none
null
26c5#SP10533-10541;47c1#SP13528;103c6#SP13519;114c4;152c8#SP12345;162c7;12346c8
strike
10c7;55c2;225c5
I believe the output matches yours, apart from the ordering inside the ;-separated lists: you seem to sort by the first column and then the second, whereas I just sorted with sort.

How to make an array of alphabets from a file and update in a new file

I have a single column file.
A
A
A
B
B
B
C
C
D
I want to use this file to make a new one, as below:
command="A" "B" "C" "D"
TYPE=1 1 1 2 2 2 3 3 4,
These A B C D are arbitrary letters and vary from file to file.
I tried to solve this with the shell script below:
#!/bin/bash
NQ=$(cat RLP.txt | wc -l)
ELEMENT='element='
echo "$ELEMENT" > element.txt
TYPE='types='
echo "$TYPE" > types.txt
for i in `seq 1 1 $NQ`
do
RLP=$(echo "$i" | tail -n 1)
cat RLP.txt | head -n "$RLP" | tail -n 1 > el.$RLP.txt
done
paste element.txt el.*.txt
paste types.txt
The output of paste element.txt el.*.txt is element= A A A B B B C C D.
I could not remove the repeated letters and put the remaining letters in double quotes,
and could not move forward with the second command to get
TYPE=1 1 1 2 2 2 3 3 4,
which means that the 1st letter is repeated three times, the 2nd letter three times, the 3rd letter two times, and so on.
$ cat tst.awk
!seen[$1]++ {                      # first time we see this letter
    cmd = cmd sep "\"" $1 "\""     # append it, quoted, to the command list
    cnt++                          # and bump the letter counter
}
{
    type = type sep cnt            # every line contributes the current counter value
    sep = OFS                      # after the first item, separate with a space
}
END {
    print "command=" cmd
    print "TYPE=" type ","
}
$ awk -f tst.awk file
command="A" "B" "C" "D"
TYPE=1 1 1 2 2 2 3 3 4,
Instead of using multiple text processing tools in a pipeline, this can be achieved by one awk command as below
awk '
{
unique[$0]
}
prev !~ $0 {
alpha[NR] = idx++
}
{
prev = $0
alpha[NR] = idx
}
END {
for (i in unique) {
str = str ? (str " " "\"" i "\"") : "\"" i "\""
}
first = "command=" str
str = ""
for (i = 1; i <= NR; i++) {
str = str ? (str " " alpha[i]) : alpha[i]
}
second = "TYPE=" str ","
print(first "\n" second) > "types.txt"
close("types.txt")
}' RLP.txt
The command works as follows
Each unique line in the file is saved as an index into the array unique
The array alpha keeps track of the unique value counter, i.e. every time a value in the file changes, the counter is incremented at the corresponding line number NR
The END block is all about constructing the output from the array to a string value and writing the result to the new file "types.txt"
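One side note of my own (not part of the answer above): the bare reference unique[$0] creates the key with an empty value, which is all that is needed to iterate over the distinct lines later; but for (i in unique) visits keys in an unspecified order, so the command= list is not guaranteed to come out in file order. A minimal sketch of the idiom:
$ printf 'A\nA\nB\n' | awk '{ seen[$0] } END { for (k in seen) print k }'
A
B
(the two output lines may appear in either order)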
Pure bash implementation. Requires at least Bash version 4 for the associative array
#!/bin/bash
outfile="./RLP.txt"
infile="./infile"
declare -A map
while read line; do
(( map["$line"]++ ))
done < "$infile"
command="command="
command+=$(printf "\"%s\" " "${!map[@]}")
type="$(
for i in "${map[#]}"; do
((k++))
for (( j=0; j < i; j++ )); do
printf " %d" "$k"
done
done
),"
echo "$command" >> "$outfile"
echo "TYPE=${type#* }" >> "$outfile"

What is the correct way to scan "Quoted String" in ragel?

I'm trying to learn Ragel with Go, but I am not able to find a proper way to scan a quoted string.
This is what I have defined:
dquote = '"';
quoted_string = dquote (any*?) dquote ;
main := |*
quoted_string =>
{
current_token = QUOTED_STRING;
yylval.stringValue = string(lex.m_unScannedData[lex.m_ts:lex.m_te]);
fmt.Println("quoted string : ", yylval.stringValue)
fbreak;
};
The following input, with a single quoted string, works fine:
if abc == "xyz.123" {
pp
}
If I scan the above input, I get this output:
quoted string : "xyz.123"
But if I have two quoted strings, as shown below, it fails:
if abc == "0003" {
if xyz == "5003" {
pp
}
}
It scans across both quoted strings:
quoted string : "0003" {
if xyz == "5003"
Can someone please help me with this, or suggest a better alternative?
I am using the version below:
# ragel -v
Ragel State Machine Compiler version 6.10 March 2017
Copyright (c) 2001-2009 by Adrian Thurston
This did the trick:
quoted_string = dquote (any - newline)* dquote ;

gsub for substituting translations not working

I have a dictionary dict with records separated by ":" and data fields by new lines, for example:
:one
1
:two
2
:three
3
:four
4
Now I want awk to substitute all occurrences of each record in the input
file, eg
onetwotwotwoone
two
threetwoone
four
My first awk script looked like this and works just fine:
BEGIN { RS = ":" ; FS = "\n"}
NR == FNR {
rep[$1] = $2
next
}
{
for (key in rep)
gsub(key,rep[key])
print
}
giving me:
12221
2
321
4
Unfortunately another dict file contains characters that are special in regular expressions, so I have to escape them in my script. After moving key and rep[key] into separate variables (which can then have the escapes applied), the script only substitutes the second record of the dict. Why? And how can I solve this?
Here's the current second part of the script:
{
for (key in rep)
orig=key
trans=rep[key]
gsub(/[\]\[^$.*?+{}\\()|]/, "\\\\&", orig)
gsub(orig,trans)
print
}
All scripts are run by awk -f translate.awk dict input
Thanks in advance!
Your fundamental problem is using strings in regexp and backreference contexts when you don't want them and then trying to escape the metacharacters in your strings to disable the characters that you're enabling by using them in those contexts. If you want strings, use them in string contexts, that's all.
You won't want this:
gsub(regexp,backreference-enabled-string)
You want something more like this:
index(...,string) substr(string)
I think this is what you're trying to do:
$ cat tst.awk
BEGIN { FS = ":" }
NR == FNR {
if ( NR%2 ) {
key = $2
}
else {
rep[key] = $0
}
next
}
{
for ( key in rep ) {
head = ""
tail = $0
while ( start = index(tail,key) ) {
head = head substr(tail,1,start-1) rep[key]
tail = substr(tail,start+length(key))
}
$0 = head tail
}
print
}
$ awk -f tst.awk dict file
12221
2
321
4
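To see the difference this answer is describing, here is a small illustration of my own (not from the thread) of a string being reinterpreted as a regexp and of & in the replacement being reinterpreted as the matched text:
$ echo 'a.c abc' | awk '{ gsub("a.c", "<&>"); print }'
<a.c> <abc>
$ echo 'a.c abc' | awk '{ i = index($0, "a.c"); print substr($0, i, 3) }'
a.c
With index() and substr() the string "a.c" is taken literally, so only the first word is found.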
Never mind, sorry for asking....
Just some missing braces...?!
{
for (key in rep)
{
orig=key
trans=rep[key]
gsub(/[\]\[^$.*?+{}\\()|]/, "\\\\&", orig)
gsub(orig,trans)
}
print
}
works like a charm.
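As to the "Why?" above (my note, not part of either answer): without braces, only the first statement after for (key in rep) belongs to the loop body; the remaining statements run once, after the loop, with key left at the last value the loop visited, so only that one record ever gets substituted. A minimal illustration:
$ awk 'BEGIN { for (i = 1; i <= 3; i++) print "in loop", i; print "after loop", i }'
in loop 1
in loop 2
in loop 3
after loop 4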

awk count selective combinations only:

I would like to read and count the field values that are equal to "TRUE", but only from the 3rd field to the 5th field.
Input.txt
Locationx,Desc,A,B,C,Locationy
ab123,Name1,TRUE,TRUE,TRUE,ab1234
ab123,Name2,TRUE,FALSE,TRUE,ab1234
ab123,Name2,FALSE,FALSE,TRUE,ab1234
ab123,Name1,TRUE,TRUE,TRUE,ab1234
ab123,Name2,TRUE,TRUE,TRUE,ab1234
ab123,Name3,FALSE,FALSE,FALSE,ab1234
ab123,Name3,TRUE,FALSE,FALSE,ab1234
ab123,Name3,TRUE,TRUE,FALSE,ab1234
ab123,Name3,TRUE,TRUE,FALSE,ab1234
ab123,Name1,TRUE,TRUE,FALSE,ab1234
While reading the headers from the 3rd field to the 5th field, i.e. A, B, C, I want to generate only the unique ordered combinations A, B, C, AB, AC, BC, ABC.
Note: AA, BB, CC, BA, etc. are excluded.
If the "TRUE" values are counted towards the "AB" combination, they should not be counted again towards "A" and "B", to avoid duplicates.
Example#1
Locationx,Desc,A,B,C,Locationy
ab123,Name1,TRUE,TRUE,TRUE,ab1234
Op#1
Desc,A,B,C,AB,AC,BC,ABC
Name1,,,,,,,1
Example#2
Locationx,Desc,A,B,C,Locationy
ab123,Name1,TRUE,TRUE,FALSE,ab1234
Op#2
Desc,A,B,C,AB,AC,BC,ABC
Name1,,,,1,,,
Example#3
Locationx,Desc,A,B,C,Locationy
ab123,Name1,FALSE,TRUE,FALSE,ab1234
Op#3
Desc,A,B,C,AB,AC,BC,ABC
Name1,,1,,,,,
Desired Output:
Desc,A,B,C,AB,AC,BC,ABC
Name1,,,,1,,,2
Name2,,,1,,1,,1
Name3,1,,,2,,,
The actual file is like below:
Input.txt
Locationx,Desc,INCOMING,OUTGOING,SMS,RECHARGE,DEBIT,DATA,Locationy
ab123,Name1,TRUE,TRUE,TRUE,FALSE,FALSE,FALSE,ab1234
ab123,Name2,TRUE,TRUE,FALSE,TRUE,TRUE,TRUE,ab1234
ab123,Name2,TRUE,TRUE,TRUE,TRUE,FALSE,FALSE,ab1234
ab123,Name1,TRUE,TRUE,TRUE,TRUE,FALSE,TRUE,ab1234
ab123,Name2,TRUE,TRUE,TRUE,TRUE,FALSE,TRUE,ab1234
ab123,Name3,FALSE,FALSE,FALSE,TRUE,FALSE,FALSE,ab1234
ab123,Name3,TRUE,TRUE,TRUE,TRUE,TRUE,TRUE,ab1234
ab123,Name3,TRUE,TRUE,FALSE,TRUE,FALSE,FALSE,ab1234
ab123,Name3,TRUE,TRUE,FALSE,TRUE,FALSE,FALSE,ab1234
ab123,Name1,TRUE,TRUE,FALSE,FALSE,FALSE,TRUE,ab1234
I have tried a lot, but nothing has materialised; any suggestions please!!!
Edit: Desired Output from Actual Input:
Desc,INCOMING-OUTGOING-SMS-RECHARGE-DEBIT-DATA,OUTGOING-SMS-RECHARGE-DEBIT-DATA,INCOMING-SMS-RECHARGE-DEBIT-DATA,INCOMING-OUTGOING-RECHARGE-DEBIT-DATA,INCOMING-OUTGOING-SMS-RECHARGE-DATA,INCOMING-OUTGOING-SMS-RECHARGE-DEBIT,SMS-RECHARGE-DEBIT-DATA,OUTGOING-RECHARGE-DEBIT-DATA,OUTGOING-SMS-RECHARGE-DATA,OUTGOING-SMS-RECHARGE-DEBIT,INCOMING-RECHARGE-DEBIT-DATA,INCOMING-SMS-DEBIT-DATA,INCOMING-SMS-RECHARGE-DATA,INCOMING-SMS-RECHARGE-DEBIT,INCOMING-OUTGOING-DEBIT-DATA,INCOMING-OUTGOING-RECHARGE-DATA,INCOMING-OUTGOING-RECHARGE-DEBIT,INCOMING-OUTGOING-SMS-DATA,INCOMING-OUTGOING-SMS-DEBIT,INCOMING-OUTGOING-SMS-RECHARGE,RECHARGE-DEBIT-DATA,SMS-DEBIT-DATA,SMS-RECHARGE-DATA,SMS-RECHARGE-DEBIT,OUTGOING-RECHARGE-DATA,OUTGOING-RECHARGE-DEBIT,OUTGOING-SMS-DATA,OUTGOING-SMS-DEBIT,OUTGOING-SMS-RECHARGE,INCOMING-DEBIT-DATA,INCOMING-RECHARGE-DATA,INCOMING-RECHARGE-DEBIT,INCOMING-SMS-DATA,INCOMING-SMS-DEBIT,INCOMING-SMS-RECHARGE,INCOMING-OUTGOING-DATA,INCOMING-OUTGOING-DEBIT,INCOMING-OUTGOING-RECHARGE,INCOMING-OUTGOING-SMS,DEBIT-DATA,RECHARGE-DATA,RECHARGE-DEBIT,SMS-DATA,SMS-DEBIT,SMS-RECHARGE,OUTGOING-DATA,OUTGOING-DEBIT,OUTGOING-RECHARGE,OUTGOING-SMS,INCOMING-DATA,INCOMING-DEBIT,INCOMING-RECHARGE,INCOMING-SMS,INCOMING-OUTGOING,DATA,DEBIT,RECHARGE,SMS,OUTGOING,INCOMING
Name1,,,,,1,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,1,,,1,,,,,,,,,,,,,,,,,,,,,
Name2,,,,1,1,,,,,,,,,,,,,,,1,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Name3,1,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2,,,,,,,,,,,,,,,,,,,1,,,
I don't have Perl or Python access!!!
I have written a perl script that does this for you. As you can see from the size and comments, it is really simple to get this done.
#!/usr/bin/perl
use strict;
use warnings;
use autodie;
use Algorithm::Combinatorics qw(combinations);
## change the file to the path where your file exists
open my $fh, '<', 'file';
my (%data, @new_labels);
## capture the header line in an array
my @header = split /,/, <$fh>;
## backup the header
my @fields = @header;
## remove first, second and last columns
@header = splice @header, 2, -1;
## generate unique combinations
for my $iter (1 .. +@header) {
my $combination = combinations(\@header, $iter);
while (my $pair = $combination->next) {
push @new_labels, "@$pair";
}
}
## iterate through rest of the file
while(my $line = <$fh>) {
my @line = split /,/, $line;
## identify combined labels that are true
my @is_true = map { $fields[$_] } grep { $line[$_] eq "TRUE" } 0 .. $#line;
## increment counter in hash map keyed at description and then new labels
++$data{$line[1]}{$_} for map { s/ /-/g; $_ } "@is_true";
}
## print the new header
print join ( ",", "Desc", map {s/ /-/g; $_} reverse #new_labels ) . "\n";
## print the description and counter values
for my $desc (sort keys %data){
print join ( ",", $desc, ( map { $data{$desc}{$_} //= "" } reverse #new_labels ) ) . "\n";
}
Output:
Desc,INCOMING-OUTGOING-SMS-RECHARGE-DEBIT-DATA,OUTGOING-SMS-RECHARGE-DEBIT-DATA,INCOMING-SMS-RECHARGE-DEBIT-DATA,INCOMING-OUTGOING-RECHARGE-DEBIT-DATA,INCOMING-OUTGOING-SMS-DEBIT-DATA,INCOMING-OUTGOING-SMS-RECHARGE-DATA,INCOMING-OUTGOING-SMS-RECHARGE-DEBIT,SMS-RECHARGE-DEBIT-DATA,OUTGOING-RECHARGE-DEBIT-DATA,OUTGOING-SMS-DEBIT-DATA,OUTGOING-SMS-RECHARGE-DATA,OUTGOING-SMS-RECHARGE-DEBIT,INCOMING-RECHARGE-DEBIT-DATA,INCOMING-SMS-DEBIT-DATA,INCOMING-SMS-RECHARGE-DATA,INCOMING-SMS-RECHARGE-DEBIT,INCOMING-OUTGOING-DEBIT-DATA,INCOMING-OUTGOING-RECHARGE-DATA,INCOMING-OUTGOING-RECHARGE-DEBIT,INCOMING-OUTGOING-SMS-DATA,INCOMING-OUTGOING-SMS-DEBIT,INCOMING-OUTGOING-SMS-RECHARGE,RECHARGE-DEBIT-DATA,SMS-DEBIT-DATA,SMS-RECHARGE-DATA,SMS-RECHARGE-DEBIT,OUTGOING-DEBIT-DATA,OUTGOING-RECHARGE-DATA,OUTGOING-RECHARGE-DEBIT,OUTGOING-SMS-DATA,OUTGOING-SMS-DEBIT,OUTGOING-SMS-RECHARGE,INCOMING-DEBIT-DATA,INCOMING-RECHARGE-DATA,INCOMING-RECHARGE-DEBIT,INCOMING-SMS-DATA,INCOMING-SMS-DEBIT,INCOMING-SMS-RECHARGE,INCOMING-OUTGOING-DATA,INCOMING-OUTGOING-DEBIT,INCOMING-OUTGOING-RECHARGE,INCOMING-OUTGOING-SMS,DEBIT-DATA,RECHARGE-DATA,RECHARGE-DEBIT,SMS-DATA,SMS-DEBIT,SMS-RECHARGE,OUTGOING-DATA,OUTGOING-DEBIT,OUTGOING-RECHARGE,OUTGOING-SMS,INCOMING-DATA,INCOMING-DEBIT,INCOMING-RECHARGE,INCOMING-SMS,INCOMING-OUTGOING,DATA,DEBIT,RECHARGE,SMS,OUTGOING,INCOMING
Name1,,,,,,1,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,1,,,1,,,,,,,,,,,,,,,,,,,,,
Name2,,,,1,,1,,,,,,,,,,,,,,,,1,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Name3,1,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2,,,,,,,,,,,,,,,,,,,1,,,
Note: please revisit your expected output. It has a few mistakes in it, as you can see from the output generated by the script above.
Here is an attempt at solving this using awk:
Content of script.awk
BEGIN { FS = OFS = "," }
function combinations(flds, itr, i, pre) {
for (i=++cnt; i<=numRecs; i++) {
++n
sep = ""
for (pre=1; pre<=itr; pre++) {
newRecs[n] = newRecs[n] sep (sprintf ("%s", flds[pre]));
sep = "-"
}
newRecs[n] = newRecs[n] sep (sprintf ("%s", flds[i])) ;
}
}
NR==1 {
for (fld=3; fld<NF; fld++) {
recs[++numRecs] = $fld
}
for (iter=0; iter<numRecs; iter++) {
combinations(recs, iter)
}
next
}
!seen[$2]++ { desc[++d] = $2 }
{
y = 0;
var = sep = ""
for (idx=3; idx<NF; idx++) {
if ($idx == "TRUE") {
is_true[++y] = recs[idx-2]
}
}
for (z=1; z<=y; z++) {
var = var sep sprintf ("%s", is_true[z])
sep = "-"
}
data[$2,var]++;
}
END{
printf "%s," , "Desc"
for (k=1; k<=n; k++) {
printf "%s%s", newRecs[k],(k==n?RS:FS)
}
for (name=1; name<=d; name++) {
printf "%s,", desc[name];
for (nR=1; nR<=n; nR++) {
printf "%s%s", (data[desc[name],newRecs[nR]]?data[desc[name],newRecs[nR]]:""), (nR==n?RS:FS)
}
}
}
Sample file
Locationx,Desc,A,B,C,Locationy
ab123,Name1,TRUE,TRUE,TRUE,ab1234
ab123,Name2,TRUE,FALSE,TRUE,ab1234
ab123,Name2,FALSE,FALSE,TRUE,ab1234
ab123,Name1,TRUE,TRUE,TRUE,ab1234
ab123,Name2,TRUE,TRUE,TRUE,ab1234
ab123,Name3,FALSE,FALSE,FALSE,ab1234
ab123,Name3,TRUE,FALSE,FALSE,ab1234
ab123,Name3,TRUE,TRUE,FALSE,ab1234
ab123,Name3,TRUE,TRUE,FALSE,ab1234
ab123,Name1,TRUE,TRUE,FALSE,ab1234
Execution:
$ awk -f script.awk file
Desc,A,B,C,A-B,A-C,A-B-C
Name1,,,,1,,2
Name2,,,1,,1,1
Name3,1,,,2,,
Now, there is a pretty evident bug in the combinations function. It does not recurse to print all combinations. For example, for A B C it will print
A
B
C
AB
AC
ABC
but not BC
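For reference, here is a recursive sketch of my own (hypothetical, not wired into the script above) that would generate every in-order combination, including B-C:
$ cat combos.awk
# emit every non-empty, order-preserving combination of flds[1..n],
# joined with "-", so B-C is produced but C-B is not
function combos(flds, n, start, prefix,    i, label) {
    for (i = start; i <= n; i++) {
        label = (prefix == "" ? flds[i] : prefix "-" flds[i])
        print label
        combos(flds, n, i + 1, label)
    }
}
BEGIN {
    n = split("A B C", f, " ")
    combos(f, n, 1, "")
}
$ awk -f combos.awk
A
A-B
A-B-C
A-C
B
B-C
C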