GAWK strftime doesn't recognize a passed variable - gawk

I am not able to pass $daysSeconds into gawk; my expression doesn't evaluate correctly:
#___________
days=20
daysSeconds=$(expr $days \* 86400)
DAY=`gawk 'BEGIN{print strftime("%d", systime() - $daysSeconds )}'`
echo $DAY
#____

Not sure what you are trying to do.
%d The day of the month as a decimal number (01–31).
You should not mix $( ) and backticks; use $( ).
You should not use a shell variable directly inside a single-quoted awk program.
The problem is in your BEGIN block: the single quotes stop the shell from expanding $daysSeconds. Do it like this instead:
DAY=$(echo "$daysSeconds" | awk '{print strftime("%d",systime()-$1)}')
Gives 18 for $DAY

Related

How do I check if input is one word or 2 separated by delimiter "-"

I need help with the following ksh script:
ExpResult=`echo "$LoadString" | awk -F"-" '{print NF}'=2`
MinExp=`echo "$ExpResult" | tr -s " " | sed 's/^[ ]//g'| cut -d"-" -f1`
MaxExp=`echo "$ExpResult" | tr -s " " | sed 's/^[ ]//g'| cut -d"-" -f2`
I can get an input as two options : "50-100" or "50" (for example)
I have two questions:
How do I check if the input is "one word" or two words separated by delimiter "-"?
If the input is two words, how can I separate them?
Rather than call an external program to parse your input, you can use the internal case statement to validate input and parameter expansion features to convert your input, i.e.
# set a copy/paste value for $1
set -- 50-100
case "$1" in
*-* )
range="$1"
min="${range%-*}"
max="${range#*-}"
;;
* )
singleNum="$1"
;;
esac
echo min=$min ... max=$max
output
min=50 ... max=100
Try it with a non-pair value:
unset min max
set -- other values
# ... same case statement as above ...
echo min=$min ... max=$max ... singleNum=$singleNum
output
min= ... max= ... singleNum=other
Hopefully the case processing is self-explanatory, but the parameter expansion may require a little explanation.
The statement
min=${range%-*}
says remove from the right side of the expanded value (50-100) anything starting at the last - until the end of the string. This leaves the value 50 remaining.
The reverse happens with
max=${range#*-}
Says remove from the left side of the expanded value anything up to the first - char. This leaves the 100.
As there is only one - char in this string, you don't need to worry about the greedy versions: ${var##*-}, which removes everything from the left up to the last match of -, and the reverse, ${var%%-*}, which removes everything from the right back to the very first - char.
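A quick sketch of all four operators on a hypothetical value with more than one dash, to make the shortest/longest distinction visible:

```shell
v="a-b-c"
echo "${v%-*}"    # shortest suffix removed: a-b
echo "${v%%-*}"   # longest suffix removed:  a
echo "${v#*-}"    # shortest prefix removed: b-c
echo "${v##*-}"   # longest prefix removed:  c
```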
The fanatical minimalists will remind us that this can be done without a temporary variable, i.e.
min=${1%-*} ; max=${1#*-}
And the one-line fantasists can be satisfied with
case "$1" in *-* ) range="$1";min="${range%-*}";max="${range#*-}";;* ) singleNum="$1";;esac; echo min=$min ... max=$max ... singleNum=$singleNum
:-)
IHTH
you can try this;
LoadString=$1
MinExp=`echo "$LoadString" | awk -F"-" '{if (NF==2) print $1}'`
MaxExp=`echo "$LoadString" | awk -F"-" '{if (NF==2) print $2}'`
echo $MinExp
echo $MaxExp
eg:
user#host:/tmp/test$ ksh test.ksh 50-100
50
100
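As a pipe-free sketch, assuming bash or ksh93 (here-string plus a temporary IFS):

```shell
LoadString="50-100"
IFS=- read -r MinExp MaxExp <<< "$LoadString"
echo "$MinExp"   # 50
echo "$MaxExp"   # 100 (empty when the input is a single number like "50")
```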

AWK sum all values of a column keeping all floats

I'm using the following code to grep the lines I'm interested in, keep only the last ones, and sum over column nine:
grep -n -49 'FINAL BlaBla' output | tail -9 | awk 'BEGIN {SUM=0}; {SUM=SUM+$9}; END {printf "%.3f\n" SUM}'
However the sum over column 9 returns 0,000
the selected lines look as follows
84- C -3.42056726 +1 -0.82831327 +1 1.52743549 +1 0.5647
85- N -4.78612760 +1 -1.01185554 +1 1.58894854 +1 -0.5837
86- C -5.19047197 +1 -2.20130686 +1 2.06176295 +1 0.3890
87- N -4.42537785 +1 -3.22689397 +1 2.47304603 +1 -0.4775
88- C -3.03532546 +1 -2.98933854 +1 2.38795560 +1 0.3686
89- N -2.51737448 +1 -1.78267672 +1 1.92262528 +1 -0.5526
90- Cl -6.86455806 +1 -2.45050886 +1 2.15229544 +1 0.0934
91- N -2.24043582 +1 -3.93651444 +1 2.76082642 +1 0.0890
92- N -2.94053526 +1 0.36941710 +1 1.06455738 +1 -0.3274
I can't find out where the mistake is.
I also tried to sum over $1 and I correctly obtain 792,000 but when I sum over $3 I get 31,000 ...
what's wrong?
I think the problem lies in the missing comma in your printf expression:
$ awk 'BEGIN {SUM=0}; {SUM=SUM+$9}; END {printf "%.3f\n", SUM}' file
# ^
# comma!
-0.436
Note by the way that there is no need to set the variable to zero, since this is the default. So drop the BEGIN {} block and leave to just:
awk '{sum+=$9}; END {printf "%.3f\n", sum}' file
For the other fields:
$ awk '{sum+=$nvar}; END {printf "%.3f\n", sum}' nvar=1 file
792.000
$ awk '{sum+=$nvar}; END {printf "%.3f\n", sum}' nvar=3 file
-35.421
$ awk '{sum+=$nvar}; END {printf "%.3f\n", sum}' nvar=9 file
-0.436
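The same can be written with -v, which assigns the variable before the program starts (a trailing var=value assignment, by contrast, only takes effect when awk reaches that argument while reading input, so it is not visible in a BEGIN block):

```shell
awk -v nvar=9 '{sum+=$nvar} END{printf "%.3f\n", sum}' file
```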
Why wasn't it working?
From The GNU Awk user's guide 5.5.1 Introduction to the printf Statement:
A simple printf statement looks like this:
printf format, item1, item2, …
As for print, the entire list of arguments may optionally be enclosed
in parentheses
The difference between printf and print is the format argument. This
is an expression whose value is taken as a string; it specifies how to
output each of the other arguments. It is called the format string.
The format string is very similar to that in the ISO C library
function printf(). Most of format is text to output verbatim.
Scattered among this text are format specifiers—one per item. Each
format specifier says to output the next item in the argument list at
that place in the format.
So if you don't use commas to call the fields you get this error:
$ awk 'BEGIN {printf "%.3f" 3}'
awk: cmd. line:1: fatal: not enough arguments to satisfy format string
`%.3f3'
^ ran out for this one
Using them it works!
$ awk 'BEGIN {printf "%.3f", 3}'
3.000

how to get the output of 'system' command in awk

I have a file and a field is a time stamp like 20141028 20:49:49, I want to get the hour 20, so I use the system command :
hour=system("date -d\""$5"\" +'%H'")
the time stamp is the fifth field in my file so I used $5. But when I executed the program I found the command above just output 20 and return 0 so hour is 0 but not 20, so my question is how to get the hour in the time stamp ?
I know a method which use split function two times like this:
split($5, vec, " " )
split(vec[2], vec2, ":")
But this method is a little inefficient and ugly.
so are there any other solutions? Thanks
Another way using gawk:
gawk 'match($5, " ([0-9]+):", r){print r[1]}' input_file
If you want to know how to capture external process output in awk:
awk '{cmd="date -d \""$5"\" +%H";cmd|getline hour;print hour;close(cmd)}' input_file
You can use the substr function to extract the hour without using system command.
for example:
awk 'BEGIN{print substr("20:49:49",1,2)}'
will produce output as
20
Or, more specifically, as in the question:
$ awk 'BEGIN{print substr("20141028 20:49:49",10,2)}'
20
substr(str, pos, len) extracts a substring of length len from str, starting at position pos.
If the value of $5 is 20141028 20:49:49:
$ awk '{print substr($5,10,2)}'
20

setting default numeric format in awk

I wanted to do a simple parsing of two files with ids and some corresponding numerical values. I didn't want awk to print numbers in scientific notation.
File looks like this:
someid-1 860025 50.0401 4.00022
someid-2 384319 22.3614 1.78758
someid-3 52096 3.03118 0.242314
someid-4 43770 2.54674 0.203587
someid-5 33747 1.96355 0.156967
someid-6 20281 1.18004 0.0943328
someid-7 12231 0.711655 0.0568899
someid-8 10936 0.636306 0.0508665
someid-9 10224.8 0.594925 0.0475585
someid-10 10188.8 0.59283 0.047391
When I use print instead of printf:
awk 'BEGIN{FS=OFS="\t"} NR==FNR{x[$1]=$0;next} ($1 in x){split(x[$1],k,FS); print $1,k[2],k[3],k[4],$2,$3,$4}' OSCAo.txt dme_miRNA_PIWI_OSC.txt | sort -n -r -k 7 | head
i get this result:
dme-miR-iab-4-5p 0.333333 0.000016 0.000001 0.25 0.000605606 9.36543e-07
dme-miR-9c-5p 10987.300000 0.525413 0.048798 160.2 0.388072 0.000600137
dme-miR-9c-3p 731.986000 0.035003 0.003251 2.10714 0.00510439 7.89372e-06
dme-miR-9b-5p 30322.500000 1.450020 0.134670 595.067 1.4415 0.00222922
dme-miR-9b-3p 2628.280000 0.125684 0.011673 48 0.116276 0.000179816
dme-miR-9a-3p 10.365000 0.000496 0.000046 0.25 0.000605606 9.36543e-07
dme-miR-999-5p 103.433000 0.004946 0.000459 0.0769231 0.00018634 2.88167e-07
dme-miR-999-3p 1513.790000 0.072389 0.006723 28 0.0678278 0.000104893
dme-miR-998-5p 514.000000 0.024579 0.002283 73 0.176837 0.000273471
dme-miR-998-3p 3529.000000 0.168756 0.015673 42 0.101742 0.000157339
Notice the scientific notation in the last column
I understand that printf with appropriate format modifier can do the job but the code becomes very lengthy. I have to write something like this:
awk 'BEGIN{FS=OFS="\t"} NR==FNR{x[$1]=$0;next} ($1 in x){split(x[$1],k,FS); printf "%s\t%3.6f\t%3.6f\t%3.6f\t%3.6f\t%3.6f\t%3.6f\n", $1,k[2],k[3],k[4],$2,$3,$4}' file1.txt file2.txt > fileout.txt
This becomes clumsy when I have to parse fileout with another similarly structured file.
Is there any way to specify default numerical output, such that any string will be printed like a string but all numbers follow a particular format.
I think you misinterpreted the meaning of %3.6f. The first number, before the decimal point, is the field width, not the "number of digits before the decimal point". (See printf(3).)
So you should use %10.6f instead. It can be tested easily in bash:
$ printf "%3.6f\n%3.6f\n%3.6f" 123.456 12.345 1.234
123.456000
12.345000
1.234000
$ printf "%10.6f\n%10.6f\n%10.6f" 123.456 12.345 1.234
123.456000
 12.345000
  1.234000
You can see that the latter aligns properly on the decimal point.
As sidharth c nadhan mentioned, you can use the OFMT awk internal variable (see awk(1)). An example:
$ awk 'BEGIN{print 123.456; print 12.345; print 1.234}'
123.456
12.345
1.234
$ awk -vOFMT=%10.6f 'BEGIN{print 123.456; print 12.345; print 1.234}'
123.456000
 12.345000
  1.234000
As I see in your example, the number with the most digits can be 123456.1234567, so the format %15.7f covers everything and gives a nice-looking table.
But unfortunately it will not work if the number has no decimal point in it or even if it does, but it ends with .0.
$ awk -vOFMT=%15.7f 'BEGIN{print 123.456;print 123;print 123.0;print 0.0+123.0}'
    123.4560000
123
123
123
I even tried gawk's strtonum() function, but integral values are still printed as plain integers, not through OFMT. See
awk -vOFMT=%15.7f -vCONVFMT=%15.7f 'BEGIN{print 123.456; print strtonum(123); print strtonum(123.0)}'
It has the same output as before.
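For reference, OFMT only controls how print renders non-integral numbers; CONVFMT governs number-to-string conversion, e.g. during concatenation. A quick sketch of the difference:

```shell
awk 'BEGIN { OFMT="%.2f"; CONVFMT="%.4f"; x=3.14159
             print x        # output conversion uses OFMT
             print x ""     # concatenation converts via CONVFMT first
           }'
```

This prints 3.14 for the first line and 3.1416 for the second.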
So I think, you have to use printf anyway. The script can be a little bit shorter and a bit more configurable:
awk -vf='\t'%15.7f 'NR==FNR{x[$1]=sprintf("%s"f f f,$1,$2,$3,$4);next}$1 in x{printf("%s"f f f"\n",x[$1],$2,$3,$4)}' file1.txt file2.txt
The script will not work properly if there are duplicated IDs in the first file. If it does not happen then the two conditions can be changed and the ;next can be left off.
awk 'NR==FNR{x[$1]=$0;next} ($1 in x){split(x[$1],k,FS); printf "%s\t%9s\t%9s\t%9s\t%9s\t%9s\t%9s\n", $1,k[2],k[3],k[4],$2,$3,$4}' file1.txt file2.txt > fileout.txt

replacing the `'` char using awk

I have lines with : and ' characters in them that I want to get rid of. I want to use awk for this. I've tried:
awk '{gsub ( "[:\\']","" ) ; print $0 }'
and
awk '{gsub ( "[:\']","" ) ; print $0 }'
and
awk '{gsub ( "[:']","" ) ; print $0 }'
None of them worked; they all return the error Unmatched ".. But when I run
awk '{gsub ( "[:_]","" ) ; print $0 }'
it works and removes all : and _ chars. How can I get rid of the ' char?
tr is made for this purpose
echo test\'\'\'\':::string | tr -d \':
teststring
$ echo test\'\'\'\':::string | awk '{gsub(/[:\47]*/,"");print $0}'
teststring
This works:
awk '{gsub( "[:'\'']","" ); print}'
You could use:
Octal code for the single quote:
[:\47]
The single quote inside double quotes, but in that case special
characters will be expanded by the shell:
% print a\': | awk "sub(/[:']/, x)"
a
Use a dynamic regexp, but there are performance implications related
to this approach:
% print a\': | awk -vrx="[:\\\']" 'sub(rx, x)'
a
With bash you cannot insert a single quote inside a literal surrounded with single quotes. Use '"'"' for example.
First ' closes the current literal, then "'" concatenates it with a literal containing only a single quote, and ' reopens a string literal, which will be also concatenated.
What you want is:
awk '{gsub ( "[:'"'"']","" ) ; print $0; }'
ssapkota's alternative is also good ('\'').
I don't know why you are restricting yourself to awk; anyway, you've got many answers from other users. You can also use sed to get rid of : and ':
sed "s/[:']//g"
This will also serve your purpose. Simple and less complex.
This also works:
awk '{gsub("\x27",""); print}'
Simplest:
awk '{gsub(/\047|:/,"")};1'
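A quick check that the octal-escape variant really strips both characters:

```shell
echo "it's: a test" | awk '{gsub(/\047|:/,"")};1'
# its a test
```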