shell: awk to int variable - awk

I have a speedtest-cli script and I am trying to process the result with awk. I want to get the download speed as an integer so that I can compare it with other results in an if/then condition.
Part of my script:
#!/bin/sh
speedtest-cli | awk '/Download:/ {print $2} ' > /root/tmp1
read speed1 < /root/tmp1
speedtest-cli | awk '/Download:/ {print $2} ' > /root/tmp2;
read speed2 < /root/tmp2
if [ $speed1 -gt $speed2 ];then
echo "test";fi
The problem is that my awk result (75.27) isn't saved as an integer! When it gets to the if, I get an error:
sh: 75.27: bad number
I would also prefer to define the variable directly from the awk result, but that doesn't work:
speedtest-cli | var=$(awk '/Download:/ {print $2} ' > /root/tmp1)
How can I "awk" the speedtest-cli result to get a variable that can be compared in an if/then condition?
Please help.
Thanks and greetings, Igor

If you want to compare floating-point numbers, you can do it within awk or with bc -l,
for example:
speed1=$(speedtest-cli | awk '/Download:/{print $2}')
...
speed2=$(speedtest-cli | awk '/Download:/{print $2}')
if (( $(echo "$speed1 > $speed2" | bc -l) )); then ...
Another alternative, if you want to compare them as integers but don't want to lose digits to truncation, is to multiply them by a large value and convert to int:
speed1000=$(speedtest-cli | awk '/Download:/{print int($2*1000)}')
...
Now bash can handle the values with plain integer comparisons.
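The asker's script runs under /bin/sh, where the (( )) form may not be available. A minimal sketch of doing the floating-point comparison inside awk and using its exit status follows. The two sample strings are hypothetical stand-ins for real speedtest-cli runs, so the block works without network access.

```shell
#!/bin/sh
# Hypothetical stand-ins for the output of two speedtest-cli runs.
sample1='Download: 75.27 Mbit/s'
sample2='Download: 68.90 Mbit/s'

speed1=$(printf '%s\n' "$sample1" | awk '/Download:/ {print $2}')
speed2=$(printf '%s\n' "$sample2" | awk '/Download:/ {print $2}')

# awk does the floating-point comparison; exit status 0 means "true" to the shell.
if awk -v a="$speed1" -v b="$speed2" 'BEGIN { exit !(a > b) }'; then
    result="first run was faster"
else
    result="first run was not faster"
fi
echo "$result"
```

In a real script the printf lines would be replaced by the speedtest-cli call itself.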

Related

awk: print each column of a file into separate files

I have a file with 100 columns of data. I want to print the first column and the i-th column into 99 separate files. I am trying to use
for i in {2..99}; do awk '{print $1" " $i }' input.txt > data${i}; done
But I am getting errors
awk: illegal field $(), name "i"
input record number 1, file input.txt
source line number 1
How do I correctly use $i inside the {print }?
The following single awk command may also help here:
awk -v start=2 -v end=99 '{for(i=start;i<=end;i++){print $1,$i > "file"i;close("file"i)}}' Input_file
An all awk solution. First test data:
$ cat foo
11 12 13
21 22 23
Then the awk:
$ awk '{for(i=2;i<=NF;i++) print $1,$i > ("data" i)}' foo
and results:
$ ls data*
data2 data3
$ cat data2
11 12
21 22
The for loop iterates from field 2 to the last field. If there are more fields than you want to process, change NF to the number you'd like. If, for some reason, a hundred open files would be a problem on your system, put the print into a block and add a close call (note the >>, so that reopening a file appends to it instead of truncating it):
$ awk '{for(i=2;i<=NF;i++){f=("data" i); print $1,$i >> f; close(f)}}' foo
If you want to do it the way you were attempting:
for i in {2..99}; do
awk -v x="$i" '{print $1" " $x }' input.txt > "data${i}"
done
Notes:
the -v switch of awk passes shell values in as awk variables
$x is the field whose number is stored in your variable x
Note 2: this is not the fastest solution; a single awk call is faster, but I am just correcting your logic. Ideally, take the time to understand awk; it's never wasted time.
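The original error comes from the single quotes: the shell does not expand $i inside them, so awk sees its own unset variable i. A small sketch of the -v approach on two sample rows (the sample data and the variable name col are only illustrative):

```shell
#!/bin/sh
i=3
# Inside single quotes the shell leaves $col alone; awk resolves it
# using the value handed over with -v.
out=$(printf '11 12 13\n21 22 23\n' | awk -v col="$i" '{ print $1, $col }')
echo "$out"
```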

AWK how to count patterns on the first column?

I was trying to get the total number of "??", " M", "A" and "D" from this:
?? this is a sentence
M this is another one
A more text here
D more and more text
I have this sample line of code but it doesn't work:
awk -v pattern="\?\?" '{$1 == pattern} END{print " "FNR}'
$ awk '{ print $1 }' file | sort | uniq -c
1 ??
1 A
1 D
1 M
If for some reason you want an awk-only solution:
awk '{ ++cnt[$1] } END { for (i in cnt) print cnt[i], i }' file
but I think that's needlessly complicated compared to using the built-in unix tools that already do most of the work.
If you just want to count one particular value:
awk -v value='??' '$1 == value' file | wc -l
If you want to count only a subset of values, you can use a regex:
$ awk -v pattern='A|D|(\\?\\?)' '$1 ~ pattern { print $1 }' file | sort | uniq -c
1 ??
1 A
1 D
Here you do need to send a \ in order that the ?s are escaped within the regular expression. And because the \ is itself a special character within the string being passed to awk, you need to escape it first (hence the double backslash).
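As for why the attempt in the question fails: {$1 == pattern} is an action whose comparison result is simply discarded, and FNR in the END block is the total number of records read. A sketch that counts one value entirely inside awk, using the sample lines from the question as input (the variable name value is illustrative):

```shell
#!/bin/sh
count=$(printf '%s\n' \
    '?? this is a sentence' \
    'M this is another one' \
    'A more text here' \
    'D more and more text' |
    awk -v value='??' '$1 == value { n++ } END { print n + 0 }')
echo "$count"
```

The n + 0 makes awk print 0 rather than an empty line when nothing matches.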

How to extract only numbers with awk

Hello i have the following output:
replication complete (rid=969811 lid=969811)
or sometimes:
no change of listener transaction id for last 0 checks (rid=971489 lid=970863)
Now I want to use awk to get only the numbers from rid and lid. The following works only with the first form:
|awk -F'[^0-9]*' '{print $2-$3}'
$ echo "no change of listener transaction id for last 0 checks (nid=971491 lid=970876)" |
awk -F'[()]' '{gsub(/[^0-9 ]/,"",$2); print $2}'
971491 970876
$ echo "no change of listener transaction id for last 0 checks (nid=971491 lid=970876)" |
awk -F'[()= ]' '{print $(NF-3), $(NF-1)}'
971491 970876
$ echo "no change of listener transaction id for last 0 checks (nid=971491 lid=970876)" |
awk -F'[()= ]' '{for (i=1;i<=NF;i++) m[$i]=$(i+1); print m["nid"], m["lid"]}'
971491 970876
$ echo "no change of listener transaction id for last 0 checks (nid=971491 lid=970876)" |
awk '{gsub(/.*\(|[^0-9 ]+|\).*$/,"")}1'
971491 970876
etc., etc.... The right one for you really depends what else you plan to do with the text.
Hmm, I now see in your question that you MIGHT want to print the subtraction of one number from the other instead of printing the numbers as I thought. Here's one way based on the above:
$ echo "no change of listener transaction id for last 0 checks (nid=971491 lid=970876)" |
awk -F'[()= ]' '{print $(NF-3) - $(NF-1)}'
615
Alternatives left as an exercise!
You can use this awk, if your goal is to work with rid and lid values.
awk -F\(rid=\|lid=\) '{print $2-$3}' yourfile
(OR)
awk 'BEGIN{FS="(rid=|lid=)"} {print $2-$3}' yourfile
awk -F'=' '{print int($2)-int($3)}'
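This works because of the way awk converts strings to numbers: only the leading numeric part of each field is used, so "969811 lid" and "969811)" both become 969811.
Another solution, which works only in GNU awk 4 and later, is defining fields by content with FPAT: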
echo "no change of listener transaction id for last 0 checks (rid=971489 lid=970863)" |
gawk -vFPAT='[0-9]+' '{print $(NF-1), $NF}'
you get,
971489 970863
echo "replication complete (rid=969811 lid=969811)" |
gawk -vFPAT='[0-9]+' '{print $(NF-1), $NF}'
you get,
969811 969811
Note: if you want to do the subtraction:
echo "replication complete (rid=969811 lid=969811)" |
gawk -vFPAT='[0-9]+' '{print $(NF-1)-$NF}'
you get,
0
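For awks without FPAT, a sketch that should work in any POSIX awk is to locate each label with match() and cut the digits out with substr(). The function name extract_ids is hypothetical; the two input lines are the ones from the question.

```shell
#!/bin/sh
# Print rid, lid and their difference for each input line.
extract_ids() {
    awk '{
        rid = lid = ""
        # RSTART/RLENGTH describe the match; skip the 4-character label.
        if (match($0, /rid=[0-9]+/)) rid = substr($0, RSTART + 4, RLENGTH - 4)
        if (match($0, /lid=[0-9]+/)) lid = substr($0, RSTART + 4, RLENGTH - 4)
        print rid, lid, rid - lid
    }'
}

out1=$(echo "replication complete (rid=969811 lid=969811)" | extract_ids)
out2=$(echo "no change of listener transaction id for last 0 checks (rid=971489 lid=970863)" | extract_ids)
echo "$out1"
echo "$out2"
```

Because the labels are matched by name, the stray 0 in "last 0 checks" is not picked up.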

sum occurrence output of uniq -c

I want to sum up occurrence output of "uniq -c" command.
How can I do that on the command line?
For example if I get the following in output, I would need 250.
45 a4
55 a3
1 a1
149 a5
awk '{sum+=$1} END{ print sum}'
This should do the trick:
awk '{s+=$1} END {print s}' file
Or just pipe it into awk with
uniq -c whatever | awk '{s+=$1} END {print s}'
For each line, add the value of the first column to SUM, then print out the value of SUM.
awk is a better choice
uniq -c somefile | awk '{SUM+=$1}END{print SUM}'
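but you can also implement the logic in bash (feed the loop through process substitution rather than a pipe, so that SUM is still set after the loop):
SUM=0
while read -r num other
do
SUM=$((SUM + num))
done < <(uniq -c somefile)
echo "$SUM"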
uniq -c is slow compared to awk. like REALLY slow.
{mawk/mawk2/gawk} 'BEGIN { OFS = "\t" }
{ freqL[$1]++ }   # modify FS for the column you want to uniq -c upon
END { for (x in freqL) { printf("%8s %s\n", freqL[x], x) } }'
If your input isn't large (say under 100 MB), then gawk suffices after adding
PROCINFO["sorted_in"] = "@ind_num_asc"; # gawk specific, just use gawk -b mode
If it's really large, it's far faster to use mawk2 and then pipe to
{ mawk/mawk2 stuff... } | gnusort -t$'\t' -k 2,2
The answer above, uniq -c example-file | awk '{SUM+=$1}END{print SUM}', does sum the left column of the uniq -c output, but so does wc -l somefile, as mentioned in the comment, since the counts add up to the number of input lines.
If what you are looking for is the number of unique lines in your file, you can use this command:
sort -h example-file | uniq | wc -l
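A quick sketch to check the claim that the summed counts equal the plain line count, using a made-up six-line sample instead of a real file:

```shell
#!/bin/sh
# Hypothetical sample input: six lines, three distinct values.
sample=$(printf '%s\n' a b a c a b)

# Sum of the first column of uniq -c ...
sum=$(printf '%s\n' "$sample" | sort | uniq -c | awk '{ s += $1 } END { print s + 0 }')
# ... compared with the number of input lines (tr strips padding some wc versions add).
lines=$(printf '%s\n' "$sample" | wc -l | tr -d ' ')
# Number of distinct lines, for comparison.
distinct=$(printf '%s\n' "$sample" | sort | uniq | wc -l | tr -d ' ')

echo "sum=$sum lines=$lines distinct=$distinct"
```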

How to split a delimited string into an array in awk?

How do I split a string when it contains pipe symbols (|)?
I want to split it into an array.
I tried
echo "12:23:11" | awk '{split($0,a,":"); print a[3] a[2] a[1]}'
Which works fine. If my string is like "12|23|11" then how do I split them into an array?
Have you tried:
echo "12|23|11" | awk '{split($0,a,"|"); print a[3],a[2],a[1]}'
To split a string to an array in awk we use the function split():
awk '{split($0, array, ":")}'
# \/ \___/ \_/
# | | |
# string | delimiter
# |
# array to store the pieces
If no separator is given, it uses FS, which defaults to a single space (meaning any run of whitespace):
$ awk '{split($0, array); print array[2]}' <<< "a:b c:d e"
c:d
We can give a separator, for example a colon (:):
$ awk '{split($0, array, ":"); print array[2]}' <<< "a:b c:d e"
b c
Which is equivalent to setting it through the FS:
$ awk -F: '{split($0, array); print array[2]}' <<< "a:b c:d e"
b c
In GNU Awk you can also provide the separator as a regexp:
$ awk '{split($0, array, ":*"); print array[2]}' <<< "a:::b c::d e"   # note multiple :
b c
And even see what the delimiter was on every step by using its fourth parameter:
$ awk '{split($0, array, ":*", sep); print array[2]; print sep[1]}' <<< "a:::b c::d e"
b c
:::
Let's quote the man page of GNU awk:
split(string, array [, fieldsep [, seps ] ])
Divide string into pieces separated by fieldsep and store the pieces in array and the separator strings in the seps array. The first piece is stored in array[1], the second piece in array[2], and so forth. The string value of the third argument, fieldsep, is a regexp describing where to split string (much as FS can be a regexp describing where to split input records). If fieldsep is omitted, the value of FS is used. split() returns the number of elements created. seps is a gawk extension, with seps[i] being the separator string between array[i] and array[i+1]. If fieldsep is a single space, then any leading whitespace goes into seps[0] and any trailing whitespace goes into seps[n], where n is the return value of split() (i.e., the number of elements in array).
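The return value described above is handy for walking the array when the number of pieces is not known in advance. A minimal sketch with the string from the question (the array name parts is illustrative):

```shell
#!/bin/sh
out=$(echo "12|23|11" | awk '{
    n = split($0, parts, "|")          # n is the number of pieces
    for (i = n; i >= 1; i--)           # walk the array backwards
        printf "%s%s", parts[i], (i > 1 ? " " : "\n")
}')
echo "$out"
```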
Please be more specific! What do you mean by "it doesn't work"?
Post the exact output (or error message), your OS and awk version:
% awk -F\| '{
for (i = 0; ++i <= NF;)
print i, $i
}' <<<'12|23|11'
1 12
2 23
3 11
Or, using split:
% awk '{
n = split($0, t, "|")
for (i = 0; ++i <= n;)
print i, t[i]
}' <<<'12|23|11'
1 12
2 23
3 11
Edit: on Solaris you'll need to use the POSIX awk (/usr/xpg4/bin/awk) in order to process 4000 fields correctly.
I do not like the echo "..." | awk ... solution, as it makes unnecessary fork and exec system calls.
I prefer Dimitre's solution with a little twist:
awk -F\| '{print $3 $2 $1}' <<<'12|23|11'
Or a bit shorter version:
awk -F\| '$0=$3 $2 $1' <<<'12|23|11'
Here the assignment $0=$3 $2 $1 builds the output record; the result is a non-empty string, which counts as a true condition, so the record gets printed.
In this specific case the stdin redirection can be avoided by setting an awk variable:
awk -v T='12|23|11' 'BEGIN{split(T,a,"|");print a[3] a[2] a[1]}'
I have used ksh for quite a while, but in bash this can be handled with built-in string manipulation. In the first case the original string is split at its internal separators. In the second case it is assumed that the string always contains digit pairs separated by a one-character separator.
T='12|23|11';echo -n ${T##*|};T=${T%|*};echo ${T#*|}${T%|*}
T='12|23|11';echo ${T:6}${T:3:2}${T:0:2}
The result in all cases is
112312
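A third shell-only variant, sketched here under the assumption of a POSIX sh, lets the shell do the splitting by setting IFS, so the pieces need not have a fixed width. The function name split_pipe is hypothetical.

```shell
#!/bin/sh
# Split the first argument on | and print the piece count and the
# first three pieces in reverse order.
split_pipe() {
    old_ifs=$IFS
    IFS='|'
    set -f          # disable globbing while the string is being split
    set -- $1       # unquoted on purpose: the pieces become $1, $2, ...
    set +f
    IFS=$old_ifs
    echo "$# $3$2$1"
}

out=$(split_pipe '12|23|11')
echo "$out"
```

Doing the set -- inside a function keeps the caller's own positional parameters untouched.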
Actually, awk has a built-in input field separator variable, FS. This is how to use it. The result is not really an array; it uses awk's internal $ field variables, but for splitting a simple string it is easier.
echo "12|23|11" | awk 'BEGIN {FS="|";} { print $1, $2, $3 }'
I know this is kind of an old question, but I thought someone might like my trick, especially since this solution is not limited to a specific number of items.
# Convert to an array
_ITEMS=($(echo "12|23|11" | tr '|' '\n'))
# Output array items
for _ITEM in "${_ITEMS[@]}"; do
echo "Item: ${_ITEM}"
done
The output will be:
Item: 12
Item: 23
Item: 11
Joke? :)
How about echo "12|23|11" | awk '{split($0,a,"|"); print a[3] a[2] a[1]}'
This is my output:
p2> echo "12|23|11" | awk '{split($0,a,"|"); print a[3] a[2] a[1]}'
112312
so I guess it's working after all..
echo "12|23|11" | awk '{split($0,a,"|"); print a[3] a[2] a[1]}'
should work.
code
awk -F"|" '{split($0,a); print a[1],a[2],a[3]}' <<< '12|23|11'
output
12 23 11
The challenge: parse whitespace-separated strings and store the pieces in variables.
Solution: the simplest choice is to convert each line into an array and then assign its elements to variables by index. Here is an example of how to convert the line and access the array.
Example: parse the disk space statistics on each line:
sudo df -k | awk 'NR>1' | while read -r line; do
#convert into array:
array=($line)
#variables:
filesystem="${array[0]}"
size="${array[1]}"
capacity="${array[4]}"
mountpoint="${array[5]}"
echo "filesystem:$filesystem|size:$size|usage:$capacity|mountpoint:$mountpoint"
done
#output:
filesystem:/dev/dsk/c0t0d0s1|size:4000|usage:40%|mountpoint:/
filesystem:/dev/dsk/c0t0d0s2|size:5000|usage:50%|mountpoint:/usr
filesystem:/proc|size:0|usage:0%|mountpoint:/proc
filesystem:mnttab|size:0|usage:0%|mountpoint:/etc/mnttab
filesystem:fd|size:1000|usage:10%|mountpoint:/dev/fd
filesystem:swap|size:9000|usage:9%|mountpoint:/var/run
filesystem:swap|size:1500|usage:15%|mountpoint:/tmp
filesystem:/dev/dsk/c0t0d0s3|size:8000|usage:80%|mountpoint:/export
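In a shell without arrays, a similar result can be sketched by letting read assign the columns to named variables directly. The input line below is a made-up df -k style row, and the column order (filesystem, size, used, available, capacity, mount point) is an assumption that may not hold on every system.

```shell
#!/bin/sh
# Hypothetical df -k style line; real output varies by system.
line='/dev/sda1 4000 1600 2400 40% /'

out=$(printf '%s\n' "$line" |
    while read -r filesystem size used avail capacity mountpoint; do
        # read splits on whitespace and fills one variable per column.
        echo "filesystem:$filesystem|size:$size|usage:$capacity|mountpoint:$mountpoint"
    done)
echo "$out"
```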
awk -F'[|]' '{print $1"\t"$2"\t"$3}' <<<'12|23|11'