Use tput colors with awk

How can I specify colours with tput numbers in an awk file? I have not found enough information on setting up colour variables the way I can in bash.
I have tried the following, but the approach is not working.
awk 'BEGIN {
sgr=system("tput sgr0")
wht=system("tput bold; tput setaf 15")
blu=system("tput bold; tput setaf 39")
}
/Code:$/ { kl=1 ; next }
!NF { kl=0 }
kl { printf("%s%s%s\n", blu, $0, sgr) }
!kl { printf("%s%s%s\n", wht, $0, sgr) }
' <<< "$#"

You already have a very nice answer from Ed Morton on your other question.
As an MCVE:
With sgr=system(...), you assign sgr the exit status (0) of the tput command, not its output, so that is not the way to achieve what you need.
Calling the tput commands from within awk, as requested:
awk '
function printblue(text) {
cmd="tput bold; tput setaf 39; echo \047"text"\047; tput sgr0"
system(cmd)
}
BEGIN{
printblue("foobar")
}
'
or
awk 'BEGIN {
system("tput setaf 4")
printf("%s\n", "foobar")
system("tput sgr0")
}
'
or with an ANSI escape sequence:
awk 'BEGIN{ print "\033[34msomething in blue\033[0m";}'
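If you would rather have the escape sequences in awk variables instead of running tput for every line, a common shell idiom (a sketch of an alternative, not part of the answer above) is to capture tput's output with command substitution and pass it in with -v:
awk -v blu="$(tput bold; tput setaf 39)" \
    -v wht="$(tput bold; tput setaf 15)" \
    -v sgr="$(tput sgr0)" \
    'BEGIN { printf("%s%s%s\n", blu, "something in blue", sgr) }'
The blu, wht and sgr variables can then be used in the kl/!kl rules from the question.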


Split multiple columns with awk

I need to split a file with multiple columns that looks like this:
TCONS_00000001 q1:Ovary1.13|Ovary1.13.1|100|32.599877 q2:Ovary2.16|Ovary2.16.1|100|88.36
TCONS_00000002 q1:Ovary1.19|Ovary1.19.1|100|12.876644 q2:Ovary2.15|Ovary2.15.1|100|365.44
TCONS_00000003 q1:Ovary1.19|Ovary1.19.2|0|0.000000 q2:Ovary2.19|Ovary2.19.1|100|64.567
Output needed:
TCONS_00000001 Ovary1.13.1 32.599877 Ovary2.16.1 88.36
TCONS_00000002 Ovary1.19.1 12.876644 Ovary2.15.1 365.44
TCONS_00000003 Ovary1.19.2 0.000000 Ovary2.19.1 64.567
My attempt:
awk 'BEGIN {OFS=FS="\t"}{split($2,two,"|");split($3,thr,"|");print $1,two[2],two[4],thr[2],thr[4]}' in.file
Problem:
I have many more columns to split, like 2 and 3, and I would like to find a shorter solution than splitting every column one by one.
While Sundeep's answer is great, if you are planning a repeated action on a set of records, I suggest using a function and running it on each record.
I would write an awk script as below:
#!/usr/bin/awk -f
function split_args(record) {
n=split(record,split_array,"[:|]")
return (split_array[3]"\t"split_array[n])
}
BEGIN { FS=OFS="\t" }
{
for (i=2;i<=NF;i++) {
$i=split_args($i)
}
print
}
and invoke it as
awk -f script.awk inputfile
An ugly command-line version of it would be
awk 'function split_args(record) {
n=split(record,split_array,"[:|]")
return (split_array[3]"\t"split_array[n])
}
BEGIN { FS=OFS="\t" }
{
for (i=2;i<=NF;i++) {
$i=split_args($i)
}
print
}
' newfile
$ # borrowing simplicity from @Inian's answer ;)
$ awk 'BEGIN{FS=OFS="\t"}
{for(i=2; i<=NF; i++){split($i,a,/[:|]/); $i=a[3] "\t" a[5]}} 1' ip.txt
TCONS_00000001 Ovary1.13.1 32.599877 Ovary2.16.1 88.36
TCONS_00000002 Ovary1.19.1 12.876644 Ovary2.15.1 365.44
TCONS_00000003 Ovary1.19.2 0.000000 Ovary2.19.1 64.567
$ # previous solution which leaves tab character at end
$ awk -F'\t' '{printf "%s\t",$1;
for(i=2; i<=NF; i++){split($i,a,/[:|]/); printf "%s\t%s\t",a[3],a[5]};
print ""}' ip.txt
TCONS_00000001 Ovary1.13.1 32.599877 Ovary2.16.1 88.36
TCONS_00000002 Ovary1.19.1 12.876644 Ovary2.15.1 365.44
TCONS_00000003 Ovary1.19.2 0.000000 Ovary2.19.1 64.567

awk: gsub /pattern1/, but not /pattern1pattern2/

In my work I have to solve a simple problem: change pattern1 to newpattern, but only if it is not followed by pattern2 or pattern3:
"pattern1 pattern1pattern2 pattern1pattern3 pattern1pattern4" → "newpattern pattern1pattern2 pattern1pattern3 newpatternpattern4"
Here is my solution, but I don't like it; I suppose there should be a more elegant and easier way to do it?
$ echo 'pattern1 pattern1pattern2 pattern1pattern3 pattern1pattern4' | awk '
{gsub(/pattern1pattern2/, "###", $0)
gsub(/pattern1pattern3/, "%%%", $0)
gsub(/pattern1/, "newpattern", $0)
gsub(/###/, "pattern1pattern2", $0)
gsub(/%%%/, "pattern1pattern3", $0)
print}'
newpattern pattern1pattern2 pattern1pattern3 newpatternpattern4
So, the sample input file:
pattern1 pattern1pattern2 aaa_pattern1pattern3 pattern1pattern4 pattern1pattern2pattern1
The sample output file should be:
newpattern pattern1pattern2 aaa_pattern1pattern3 newpatternpattern4 pattern1pattern2newpattern
This is trivial in perl, using a negative lookahead:
perl -pe 's/pattern1(?!pattern[23])/newpattern/g' file
Substitute all matches of pattern1 that are not followed by pattern2 or pattern3.
If for some reason you need to do it in awk, then here's one way you could go about it:
{
out = ""
replacement = "newpattern"
while (match($0, /pattern1/)) {
if (substr($0, RSTART + RLENGTH) ~ /^pattern[23]/) {
out = out substr($0, 1, RSTART + RLENGTH - 1)
}
else {
out = out substr($0, 1, RSTART - 1) replacement
}
$0 = substr($0, RSTART + RLENGTH)
}
print out $0
}
Consume the input while pattern1 matches and build the string out, inserting the replacement when the part after each match isn't pattern2 or pattern3. Once there are no more matches, print the string that has been built so far, followed by whatever is left of the input.
With GNU awk for the 4th arg to split():
$ cat tst.awk
{
split($0,flds,/pattern1(pattern2|pattern3)/,seps)
for (i=1; i in flds; i++) {
printf "%s%s", gensub(/pattern1/,"newpattern","g",flds[i]), seps[i]
}
print ""
}
$ awk -f tst.awk file
newpattern pattern1pattern2 aaa_pattern1pattern3 newpatternpattern4 pattern1pattern2newpattern
With other awks you can do the same with a while(match()) loop:
$ cat tst.awk
{
while ( match($0,/pattern1(pattern2|pattern3)/) ) {
tgt = substr($0,1,RSTART-1)
gsub(/pattern1/,"newpattern",tgt)
printf "%s%s", tgt, substr($0,RSTART,RLENGTH)
$0 = substr($0,RSTART+RLENGTH)
}
gsub(/pattern1/,"newpattern",$0)
print
}
$ awk -f tst.awk file
newpattern pattern1pattern2 aaa_pattern1pattern3 newpatternpattern4 pattern1pattern2newpattern
but obviously the gawk solution is simpler and more concise so, as always, get gawk!
An awk solution (GNU awk, since it uses gensub()). Nice question. Basically it does two gensub()s:
$ cat tst.awk
{ for (i=1; i<=NF; i++){
s=gensub(/pattern1/, "newpattern", "g", $i);
t=gensub(/(newpattern)(pattern(2|3))/, "pattern1\\2", "g", s);
$i=t
}
}1
Testing:
echo "pattern1 pattern1pattern2 aaa_pattern1pattern3 pattern1pattern4 pattern1pattern2pattern1" | awk -f tst.awk
newpattern pattern1pattern2 aaa_pattern1pattern3 newpatternpattern4 pattern1pattern2newpattern
However, this will fail whenever you already have something like newpatternpattern2 in your input. But that is not what the OP's input examples suggest, I guess.
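For instance, a quick check of that caveat (my example, run against the tst.awk above):
$ echo "newpatternpattern2" | awk -f tst.awk
pattern1pattern2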

Convert rows into columns using awk

Not all columns (and their data) are present for all records. Hence, whenever fields are missing, they should be replaced with nulls.
My Input format:
.set 1000
EMP_NAME="Rob"
EMP_DES="Developer"
EMP_DEP="Sales"
EMP_DOJ="20-10-2010"
EMR_MGR="Jack"
.set 1001
EMP_NAME="Koster"
EMP_DEP="Promotions"
EMP_DOJ="20-10-2011"
.set 1002
EMP_NAME="Boua"
EMP_DES="TA"
EMR_MGR="James"
My desired output Format:
Rob~Developer~Sales~20-10-2010~Jack
Koster~~Promotions~20-10-2011~
Boua~TA~~~James
I tried the below:
awk 'NR>1{printf "%s"(/^\.set/?RS:"~"),a} {a=substr($0,index($0,"=")+1)} END {print a}' $line
This is printing:
Rob~Developer~Sales~20-10-2010~Jack
Koster~Promotions~20-10-2011~
Boua~TA~James~
This awk script produces the desired output:
BEGIN { FS = "[=\"]+"; OFS = "~" }
/\.set/ { ++records; next }
NR > 1 { f[records,$1] = $2 }
END {
for (i = 1; i <= records; ++i) {
print f[i,"EMP_NAME"], f[i,"EMP_DES"], f[i,"EMP_DEP"], f[i,"EMP_DOJ"], f[i,"EMR_MGR"]
}
}
A two-dimensional array is used to store all of the values that are defined for each record.
After the whole file has been processed, the loop goes through each row of the array and prints all of the values. Elements that are undefined evaluate to an empty string.
Specifying the elements explicitly allows you to control the order in which they are printed. Using print rather than printf lets you make correct use of the OFS variable, which has been set to ~, as well as ORS, which is a newline character by default.
Thanks to @Ed for his helpful comments that pointed out some flaws in my original script.
Output:
Rob~Developer~Sales~20-10-2010~Jack
Koster~~Promotions~20-10-2011~
Boua~TA~~~James
$ cat tst.awk
BEGIN{ FS="[=\"]+"; OFS="~" }
/\.set/ { ++numRecs; next }
{ name2val[numRecs,$1] = $2 }
!seen[$1]++ { names[++numNames] = $1 }
END {
for (recNr=1; recNr<=numRecs; recNr++)
for (nameNr=1; nameNr<=numNames; nameNr++)
printf "%s%s", name2val[recNr,names[nameNr]], (nameNr<numNames?OFS:ORS)
}
$ awk -f tst.awk file
Rob~Developer~Sales~20-10-2010~Jack
Koster~~Promotions~20-10-2011~
Boua~TA~~~James
If you want some pre-defined order of fields in your output rather than building it on the fly from the rows of each record as they're read, just populate the names[] array explicitly in the BEGIN section. If you have that situation AND don't want to hold the whole file in memory:
$ cat tst.awk
BEGIN{
FS="[=\"]+"; OFS="~";
numNames=split("EMP_NAME EMP_DES EMP_DEP EMP_DOJ EMR_MGR",names,/ /)
}
function prtName2val( nameNr, i) {
if ( length(name2val) ) {
for (nameNr=1; nameNr<=numNames; nameNr++)
printf "%s%s", name2val[names[nameNr]], (nameNr<numNames?OFS:ORS)
delete name2val
}
}
/\.set/ { prtName2val(); next }
{ name2val[$1] = $2 }
END { prtName2val() }
$ awk -f tst.awk file
Rob~Developer~Sales~20-10-2010~Jack
Koster~~Promotions~20-10-2011~
Boua~TA~~~James
The above uses GNU awk for length(name2val) and delete name2val; if you don't have that, then use for (i in name2val) { do stuff; break } and split("",name2val) instead, as sketched below.
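For reference, a sketch of that portable variant of prtName2val() (same logic as above, just avoiding the gawk-only constructs):
function prtName2val( nameNr, i) {
    for (i in name2val) {       # enter the body only if the array is non-empty
        for (nameNr=1; nameNr<=numNames; nameNr++)
            printf "%s%s", name2val[names[nameNr]], (nameNr<numNames?OFS:ORS)
        break                   # one pass is enough; we only tested for emptiness
    }
    split("", name2val)         # portable way to clear the array
}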
This is all I can suggest:
awk '{ t = $0; sub(/^[^"]*"/, "", t); gsub(/"[^"]*"/, "~", t); sub(/".*/, "", t); print t }' file
Or sed:
sed -re 's|^[^"]*"||; s|"[^"]*"|~|g; s|".*||' file
Output:
Rob~Developer~Sales~20-10-2010~Jack~Koster~Promotions~20-10-2011~Boua~TA~James

How to translate a column value in the file using awk with tr command in unix

Details:
Input file : file.txt
P123456789,COLUMN2
P123456790,COLUMN2
P123456791,COLUMN2
Expected output:
Z678999999,COLUMN2
Z678999995,COLUMN2
Z678999996,COLUMN2
If I try it with a variable, it gives the proper result, i.e.:
/tmp>echo "P123456789"|tr "0-9" "5-9"|tr "A-Z" "X-Z"
Z678999999
But if I do it with the awk command, it does not give the result; instead it gives an error:
/tmp>$ awk 'BEGIN { FS=OFS="," } { $1=echo $1|tr "0-9" "5-9"|tr "A-Z" "X-Z";$2="COLUMN2"); print }' /tmp/file.txt >/tmp/file.txt.tmp
awk: BEGIN { FS=OFS="," } { $1=echo $1|tr "0-9" "5-9"|tr "A-Z" "X-Z";$2="COLUMN2"); print }
awk: ^ syntax error
awk: BEGIN { FS=OFS="," } { $1=echo $1|tr "0-9" "5-9"|tr "A-Z" "X-Z";$2="COLUMN2"); print }
awk: ^ syntax error
awk: BEGIN { FS=OFS="," } { $1=echo $1|tr "0-9" "5-9"|tr "A-Z" "X-Z";$2="COLUMN2"); print }
awk: ^ syntax error
Can anyone help please?
Just do what you wanted, without changing your logic. The awk line:
awk -F, -v OFS="," '{ "echo \""$1"\"|tr \"0-9\" \"5-9\"|tr \"A-Z\" \"X-Z\"" |getline $1}7'
with your data:
kent$ echo "P123456789,COLUMN2
P123456790,COLUMN2
P123456791,COLUMN2"|awk -F, -v OFS="," '{ "echo \""$1"\"|tr \"0-9\" \"5-9\"|tr \"A-Z\" \"X-Z\"" |getline $1}7'
Z678999999,COLUMN2
Z678999995,COLUMN2
Z678999996,COLUMN2
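One caveat with this getline-from-a-command approach (my note, not part of the answer): every distinct command string opens its own pipe, so on large inputs it is safer to close() the command after reading from it, for example:
awk -F, -v OFS="," '{
    cmd = "echo \"" $1 "\" | tr \"0-9\" \"5-9\" | tr \"A-Z\" \"X-Z\""
    cmd | getline $1    # read the translated value back into field 1
    close(cmd)          # close the pipe so file descriptors are not exhausted
} 1' file.txt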
$ cat tst.awk
function tr(old,new,str, oldA,newA,strA,i,j) {
split(old,oldA,"")
split(new,newA,"")
split(str,strA,"")
str = ""
for (i=1;i in strA;i++) {
for (j=1;(j in oldA) && !sub(oldA[j],newA[j],strA[i]);j++)
;
str = str strA[i]
}
return str
}
BEGIN { FS=OFS="," }
{ print tr("P012345678","Z567899999",$1), $2 }
$ awk -f tst.awk file
Z678999999,COLUMN2
Z678999995,COLUMN2
Z678999996,COLUMN2
Unfortunately, AWK does not have a built-in translation function. You could write one as Ed Morton has done, but I would reach for (and highly recommend) a more powerful tool. Perl, for example, can process fields using the autosplit (-a) command-line switch:
-a turns on autosplit mode when used with a -n or -p. An implicit split command to the @F array is done as the first thing inside the
implicit while loop produced by the -n or -p.
You can type perldoc perlrun for more details.
Here's my solution:
perl -F, -lane '$F[0] =~ tr/0-9/5-9/; $F[0] =~ tr/A-Z/X-Z/; print join (",", @F)' file.txt
Results:
Z678999999,COLUMN2
Z678999995,COLUMN2
Z678999996,COLUMN2

reproducing grep "my pattern" myfile.log | sort | uniq | wc -l in awk

If I perform this grep on my target file I get, e.g., 275 as the result.
But I want to learn awk, so I tried this:
awk 'BEGIN { count=0 } /my pattern/ { count++ } END { print count }' myfile.log
And this prints 275 as expected.
So, getting ambitious, I created an awk script like this:
BEGIN {
print "Log File Analysis";
message=0;
events=0;
}
{
/message/ { messages++; }
/event/ { events++; }
}
END {
print "messages:\t" messages;
print "events:\t" events;
}
I get a syntax error,
$ awk -f test_learn.awk test_log.log
awk: test_learn.awk:16: /message/ { messages++; }
awk: test_learn.awk:16: ^ syntax error
What am I doing wrong?
I am using awk from MinGW shell on windows 7.
try
awk 'BEGIN { count=0 }; /my pattern/{count++ }; END { print count }' myfile.log
OR
awk 'BEGIN { count=0}; { if ($0 ~ /my pattern/) count++ }; END { print count };' myfile.log
Better yet, as variables are initialized as zero by default, you don't need the BEGIN block, so
awk '/my pattern/{count++ }; END { print count };' myfile.log
You can either have a default block applied to all lines in a file, as in the 2nd example with the if, or you can have multiple blocks "filtered" by pattern, as above and in your edited addition.
When writing one-liners as you have, some awks require the semicolon to separate the BEGIN and END blocks from the main loop block.
Edit
Same idea for your 2nd issue, integrating Ed Morton's improvements (thanks). The syntax error comes from nesting the /message/ and /event/ pattern-action rules inside another action block; pattern { action } rules must appear at the top level of the script:
/message/ { messages++ }
/event/ { events++ }
END {
print "Log File Analysis"
print "messages:\t" messages
print "events:\t" events
}
IHTH
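As a side note (my addition): grep "my pattern" myfile.log | sort | uniq | wc -l counts unique matching lines, whereas the counting one-liners above count every match. To reproduce the sort | uniq | wc -l behaviour exactly in awk, something like this would do:
awk '/my pattern/ && !seen[$0]++ { count++ } END { print count+0 }' myfile.log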