How to increment a counter when a value is found in awk

I am trying to write an awk script. Part of the code needs to count the number of times $10 (in the sample data below it's 256) is a certain value.
The possibilities are 4, 8, 16, 32, 64, 128, 256
Each time one of these values appears I want a corresponding variable to be incremented by one.
My block of code is
{
if ($10 == "4") {bs_4k++}
else if ($10 == "8") {bs_8k++}
if ($10 == "16") {bs_16k++}
if ($10 == "32") {bs_32k++}
if ($10 == "64") {bs_64k++}
if ($10 == "128") {bs_128k++}
if ($10 == "256") {bs_256k++}
};
This is the format of the data:
259,0 23 1 0.000000000 0 C WS 167588096 + 256 [0]
259,0 23 2 0.000002073 0 C WS 167588352 + 256 [0]
259,0 23 3 0.000004040 0 C WS 167587840 + 256 [0]
And my code to print results:
END {
print "Number of 256k blocks: " 256k_bs;
print "Number of 128k blocks: "128k_bs;
print "Number of 64k blocks: " 64k_bs;
print "Number of 32k blocks: " 32k_bs;
print "Number of 16k blocks: " 16k_bs;
print "Number of 8k blocks: " 8k_bs;
print "Number of 4k blocks: " 4k_bs;
};
When I try to print the variables I get:
Number of 256k blocks: 256
Number of 128k blocks: 128
Number of 64k blocks: 64
Number of 32k blocks: 32
Number of 16k blocks: 16
Number of 8k blocks: 8
Number of 4k blocks: 4
Any help is appreciated.

An easier way to handle this in awk might be to use an array instead of a collection of distinct variables.
Using this awk script:
BEGIN{
for (i=4;i<=256;i=i*2) { valid[i] }
}
$10 in valid {
bs[$10]++
}
END {
for (n in bs) {
printf("Number of %dk blocks: %d\n", n, bs[n])
}
}
on the following input data:
259,0 23 1 0.000000000 0 C WS 167588096 + 256 [0]
259,0 23 2 0.000002073 0 C WS 167588352 + 256 [0]
259,0 23 3 0.000004040 0 C WS 167587840 + 256 [0]
259,0 23 3 0.000004040 0 C WS 167587000 + 32 [0]
259,0 23 3 0.000004040 0 C WS 167587001 + 64 [0]
gets me the following results:
$ awk -f i.awk input.txt
Number of 64k blocks: 1
Number of 32k blocks: 1
Number of 256k blocks: 3
Of course, if you want the output sorted, you can pipe it through sort, or structure your awk array differently. Such additions are left as an exercise for the reader. :-)
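If you want the sorting done inside awk itself, one sketch (GNU awk only, since PROCINFO["sorted_in"] is a gawk feature) is to set the traversal order at the top of the END block:
END {
    PROCINFO["sorted_in"] = "@ind_num_asc"   # gawk: visit bs[] indices in ascending numeric order
    for (n in bs) {
        printf("Number of %dk blocks: %d\n", n, bs[n])
    }
}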

You're not using your variables in the END block. An awk variable name can't start with a digit, so 256k_bs is parsed as the numeric constant 256 concatenated with a separate variable k_bs (which is unset, so it contributes an empty string); that's why each line prints the number itself.
Change 256k_bs to bs_256k, etc., to match the names you actually incremented.
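A minimal sketch of the corrected END block, using the counter names from the first block (adding +0 makes an unset counter print as 0 instead of an empty string):
END {
    print "Number of 256k blocks: " (bs_256k+0)
    print "Number of 128k blocks: " (bs_128k+0)
    print "Number of 64k blocks: " (bs_64k+0)
    print "Number of 32k blocks: " (bs_32k+0)
    print "Number of 16k blocks: " (bs_16k+0)
    print "Number of 8k blocks: " (bs_8k+0)
    print "Number of 4k blocks: " (bs_4k+0)
}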

You can achieve the same with the standard Unix toolset:
$ cut -d' ' -f10 file | sort -nr | uniq -c
3 256
1 64
1 32
and format the result as desired.
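For example, one way to format it (reusing the labels from the question) could be:
cut -d' ' -f10 file | sort -nr | uniq -c |
awk '{ printf "Number of %sk blocks: %d\n", $2, $1 }'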

Related

Replacing sequences of space-delimited numbers in huge input with awk

How can I replace sequences of space-delimited numbers when those sequences span multiple lines and the input is too big to fit in RAM?
A sample input would be:
edit: I re-worked the sample input and input parameters for introducing border cases (excluding ones that have to do with the length of the matched sequence or replacement priorities)
3 12 3 4
0 6 7 10
8 9 12 3
4 6 7 8
10 6 6 7
9 199 10 11
11
note: the number of fields per line is homogeneous but not known in advance; the last line might contain fewer fields
From that input I would like to:
replace 3 4 with &
replace 6 7 8 with 9 9
replace 6 7 9 with 8 8
replace 7 10 with 11 12
replace 0 with nothing
replace 10 with 13 10
replace 8 9 12 3 5 with #
The expected output would have one number or replacement per line:
3
12
&
6
11 12
8
9
12
&
9 9
13 10
6
8 8
199
13 10
11
11
I'm trying to do the task with awk but I'm having a hard time implementing a dynamic state machine with a pseudo B-Tree:
tr -s '[:space:]' '\n' < input.txt |
awk '
BEGIN {
for (i = 2; i < ARGC; i += 2) {
n = split(ARGV[i], arr)
k = ""
for (j = 1; j <= n; j++) {
k = j SUBSEP k SUBSEP arr[j]
Tree[k]
}
Tree[k] = "$" ARGV[i+1] #=> now can test "if (Tree[k])"
delete ARGV[i]
delete ARGV[i+1]
}
}
{
Key = (int(Key) + 1) SUBSEP Key SUBSEP $1
if ( Key in Tree ) {
if (Tree[Key]) {
print substr(Tree[Key],2)
Buffer = ""
Key = ""
}
else
Buffer = Buffer $1 "\n"
} else {
print Buffer $1
Buffer = ""
Key = ""
}
}
END { if (Buffer != "") printf ("%s", Buffer) }
' - \
'3 4' '&' \
'6 7 8' '9 9' \
'6 7 9' '8 8' \
'7 10' '11 12' \
'0' '' \
'10' '13 10' \
'8 9 12 3 5' '#'
edit: I realised that the code doesn't backtrack after failing to find a complete match in the B-tree, so it's wrong...
How I'm planning to tackle the problem
I'm emulating a B-tree with an array and keys in the following format:
from the middle to the left of the key are the consecutive depths
from the middle to the right of the key are the consecutive values
When a key exists in Tree:
if it doesn't have an associated value then it's a node
if there's a value then it's a leaf
So, for the current input parameters, the content of the Tree array will be:
# from param: "3 4" => "&"
Tree[ 1,"",3 ]
Tree[2,1,"",3,4] = "$&"
# from param: "6 7 8" => "9 9"
Tree[ 1,"",6 ]
Tree[ 2,1,"",6,7 ]
Tree[3,2,1,"",6,7,8] = "$9 9"
# from param: "6 7 9" => "8 8"
Tree[ 1,"",6 ]
Tree[ 2,1,"",6,7 ]
Tree[3,2,1,"",6,7,9] = "$8 8"
# from param: "7 10" => "11 12"
Tree[ 1,"",7 ]
Tree[2,1,"",7,10] = "$11 12"
# from param: "0" => ""
Tree[1,"",0] = "$"
# from param: "10" => "13 10"
Tree[1,"",10] = "$13 10"
# from param: "8 9 12 3 5" => "#"
Tree[ 1,"",8 ]
Tree[ 2,1,"",8,9 ]
Tree[ 3,2,1,"",8,9,12 ]
Tree[ 4,3,2,1,"",8,9,12,3 ]
Tree[5,4,3,2,1,"",8,9,12,3,5] = "$#"
FWIW I'd approach this by figuring out the maximum number of records you might need to search in, based on the mappings you want, keeping a rolling buffer of that many records, and then doing the comparison part on each buffer, e.g.:
$ cat tst.awk
BEGIN {
RS = "[[:space:]]+"
map("3,4" , "&")
map("6,7,8" , "9")
map("9" , "")
map("0" , "\\000")
map("13,10" , "10")
}
{ buf[((NR-1) % maxRecs) + 1] = $0 }
NR >= maxRecs { prt() }
END { prt() }
function prt( nr,sep,str) {
for ( nr=NR-maxRecs+1; nr<=NR; nr++ ) {
str = str sep buf[((nr-1) % maxRecs) + 1]
sep = ORS
}
print ">>>>" ORS str ORS "<<<<"
# Replace the above with something that loops through the
# strings you want replaced, e.g.
#
# for ( mapNr=1; mapNr<=numMaps; mapNr++ ) {
# old = olds[mapNr]
# if ( str ~ old ) { # add something to avoid partial matches
# new = news[mapNr]
# replace old with new in the output
# }
# }
}
function map(old,new, numRecs) {
++numMaps
numRecs = gsub(/,/,ORS,old) + 1
maxRecs = ( numRecs > maxRecs ? numRecs : maxRecs )
olds[numMaps] = old
news[numMaps] = new
}
$ awk -f tst.awk file
>>>>
112
3
4
<<<<
>>>>
3
4
6
<<<<
>>>>
4
6
7
<<<<
>>>>
6
7
8
<<<<
>>>>
7
8
9
<<<<
>>>>
8
9
12
<<<<
>>>>
9
12
0
<<<<
>>>>
12
0
3
<<<<
>>>>
0
3
4
<<<<
>>>>
3
4
15
<<<<
>>>>
4
15
255
<<<<
>>>>
15
255
13
<<<<
>>>>
255
13
10
<<<<
>>>>
13
10
6
<<<<
>>>>
10
6
7
<<<<
>>>>
6
7
8
<<<<
>>>>
7
8
199
<<<<
>>>>
8
199
9
<<<<
>>>>
199
9
0
<<<<
>>>>
9
0
13
<<<<
>>>>
9
0
13
<<<<
The above just prints the buffer-sized strings; the part to be added is replacing the target strings with the new ones, in a way that the next target doesn't match the already-replaced part. That's a common problem with, I expect, lots of solutions online, so it's mostly left as an exercise (a sketch of the comparison part follows below).
You'll also need to tweak it to make sure it doesn't revisit lines at the end of the input.
The above uses GNU awk for its multi-char RS; if you don't have GNU awk then just pipe the input through tr -s '[:space:]' '\n' as shown in the question.
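A minimal sketch of the comparison part mentioned above. It assumes map() also records how many records each mapping spans (say lens[numMaps] = numRecs, a name not in the original) and that the new variables are added to prt()'s local-variable list; advancing the buffer past a consumed match is still left as the exercise:
for ( mapNr=1; mapNr<=numMaps; mapNr++ ) {
    # rebuild the first lens[mapNr] records of the current window as an ORS-joined string
    cand = sep = ""
    for ( nr=NR-maxRecs+1; nr<NR-maxRecs+1+lens[mapNr]; nr++ ) {
        cand = cand sep buf[((nr-1) % maxRecs) + 1]
        sep = ORS
    }
    if ( cand == olds[mapNr] ) {   # whole records are compared, so "3 4" cannot match inside "13 4"
        print news[mapNr]
        break
    }
}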
UPDATE:
The previous answer (see edit revisions) was woefully slow (several minutes) when run against a ramped-up input (7K mappings in map.txt; 25M tokens in input.txt).
The new answer (below) is a complete rewrite and processes the 7K-mappings/25M-tokens case in ~45 seconds.
The main component of this design centers around a tree-like node structure used to manage the series of tokens (lines of input from map.txt):
tree [ParentNodeNbr] [token] [NodeType] = value
Where:
ParentNodeNbr == 0 for the root
token from map.txt
NodeType has one of two values 'node' or 'leaf'
for NodeType = 'node' the value stored in the array is a numeric node number (implemented as a counter that's incremented each time a new node is added to the tree); this node number becomes the ParentNodeNbr for the next token in the series
for NodeType = 'leaf' this designates the 'end' of a series of tokens (line of input from map.txt) and the value stored in the array is the line number (aka FNR) from map.txt; this line number (FNR) is used as an index into a couple of other arrays and to determine precedence when an input sequence (from input.txt) has multiple matches from map.txt
when processing a series of tokens from a map.txt line of input we start at ParentNodeNbr == 0 looking for a series of matching nodes, adding new nodes as needed
Setup: storing replacements in a comma-delimited file (map.txt), and adding one additional line to input.txt:
$ head map.txt input.txt
==> map.txt <==
2 3 4,X # "2 3 4" has precedence over ...
2 3,Y # "2 3"
3 4,&
6 7 8,9
9,
0,\000
13 10,10
==> input.txt <==
2 3 4 # keep an eye on "2 3" vs "2 3 4" precedence
112 3
4 6 7
8 9 12 0 3
4 15 255 13
10 6
7 8 199 9
0 13
NOTE: here's what tree[][][] looks like when populated from map.txt:
tree [Parent] [Token] [NodeType] = NodeVal
Parent Token NodeType NodeVal MapTo ** MapTo only applies to NodeType = leaf
====== ===== ======== ======= =====
0 0 leaf 6 "\000"
0 2 node 1
0 3 node 3
0 6 node 4
0 9 leaf 5 ""
0 13 node 6
1 3 node 2
1 3 leaf 1 "Y"
2 4 leaf 2 "X"
3 4 leaf 3 "&"
4 7 node 5
5 8 leaf 4 "9"
6 10 leaf 7 "10"
One GNU awk approach (for multidimensional arrays):
awk '
function replace(op) {
while ( ((maxToken - minToken + 1) >= maxlen) || op == "flush" ) {
NodeNbr=root
minOrd=maxOrd
for (j=0 ; j<maxlen; j++) { # loop through tokens in buffer[]
token=buffer[ ((minToken + j - 1) % maxlen) + 1 ]
# if we find a matching "leaf" node then keep track of the ordering (ie, FNR from map.txt; lower order == higher precedence)
if ( token in tree [NodeNbr] && "leaf" in tree[NodeNbr][token] )
minOrd= ( tree[NodeNbr][token]["leaf"] < minOrd ) ? tree[NodeNbr][token]["leaf"] : minOrd
# if we find a matching "node" node then grab the next node to compare against the next token from buffer[]
if ( token in tree[NodeNbr] && "node" in tree[NodeNbr][token] ) {
NodeNbr=tree[NodeNbr][token]["node"]
continue
}
break # if we get here we have a token from buffer[] that does not match any of our replacement mappings so abort checking rest of buffer[]
}
if (minOrd < maxOrd) { # if we found at least one complete match (ie, hit a "leaf" node) then ...
print map[minOrd] # use the associated "ord"er to print the associated replacement string and ...
minToken=minToken + len[minOrd] # update the pointer into the buffer[] array
}
else { # otherwise we did not find a match so ...
print buffer[ ((minToken - 1) % maxlen) + 1 ] # print the first token from buffer[] and ...
minToken++ # update the pointer into the buffer[] array
}
if (minToken > maxToken)
break
}
}
BEGIN { root=maxNodeNbr=maxToken=0
minToken=1
maxOrd=9999999999
}
FNR==NR { split($0,a,",")
map[FNR]=a[2] # save replacement string for this input line from map.txt
n=split(a[1],b) # break our matching pattern into tokens
len[FNR]=n # make note of number of tokens in this line of input
maxlen=(n > maxlen) ? n : maxlen # keep track of longest series of tokens
NodeNbr=root # initiate our tree search
for (i=1 ; i<=n ; i++) { # loop through our list of tokens
token=b[i]
if (i==n) # if the last token for this line then create a "leaf" node and store the line number (aka "order")
tree[NodeNbr][token]["leaf"]=FNR
else
if ( tree[NodeNbr][token]["node"] ) # else if we already have a node at this point in the tree then grab its associated node number for the next level in the tree
NodeNbr=tree[NodeNbr][token]["node"]
else { # else create a new "node" node and populate with the next available node number
tree[NodeNbr][token]["node"]=++maxNodeNbr
NodeNbr=maxNodeNbr # use this as the next level in our tree traversal
}
}
maxrec=FNR # keep track of total number of replacement sets from map.txt (only used if we decide to print the contents of map[] to stdout)
next
}
FNR==1 {
# Uncomment following to display the contents of the map[] array:
# for (i=1;i<=maxrec;i++)
# print "map:" i ":" map[i] ":"
#
# Uncomment following to display the contents of the tree[][][] array:
# fmt="%6s%8s%10s%10s%10s\n"
# printf "tree [Parent] [Token] [NodeType]\n\n"
# printf fmt, "Parent", "Token", "NodeType", "NodeVal", "MapTo"
# printf fmt, "======", "=====", "========", "=======", "====="
#
# for (NodeNbr=root ; NodeNbr<=maxNodeNbr ; NodeNbr++)
# for (token in tree[NodeNbr])
# for (NodeType in tree[NodeNbr][token]) { # ??
# NodeVal=tree[NodeNbr][token][NodeType]
# printf fmt, NodeNbr, token, NodeType, NodeVal, (NodeType=="leaf") ? "\"" map[NodeVal] "\"" : ""
# }
}
{ for (i=1 ; i<=NF ; i++) { # loop through tokens in current line from input.txt
maxToken++
buffer[ ((maxToken - 1) % maxlen) + 1 ] = $i
if ( (maxToken - minToken + 1) >= maxlen ) # if we have a "full" buffer then ...
replace() # look for replacement match
}
}
END { replace("flush") } # flush the rest of buffer[]
' map.txt input.txt
This generates:
X # "2 3 4" has precendence over "2 3"
112
&
9
12
\000
&
15
255
10
9
199
\000
13
If we switch the first 2 lines of map.txt like such:
==> map.txt <==
2 3,Y # "2 3" has precedence over ...
2 3 4,X # "2 3 4"
We now generate:
Y # "2 3" has precendence over "2 3 4" thus ...
4 # leaving "4" by itself
112
&
9
12
\000
&
15
255
10
9
199
\000
13

awk script to group by column with condition

I have a tab-delimited file like the following and I am trying to write an awk script.
aaa_log-000592 2 p STARTED 7027691 21.7 a1
aaa_log-000592 28 r STARTED 7027815 21.7 a2
aaa_log-000592 33 p STARTED 7032607 21.7 a3
aaa_log-000592 33 r STARTED 7032607 21.7 a4
aaa_log-000592 43 p STARTED 7025709 21.7 a5
aaa_log-000592 43 r STARTED 7025709 21.7 a6
aaa_log-000595 2 r STARTED 7027691 21.7 a7
aaa_log-000598 28 p STARTED 7027815 21.7 a8
aaa_log-000599 13 p STARTED 7033090 21.7 a9
I am trying to count the values in the 3rd column (p or r), grouped by column 1.
The output would look like:
Col1 Count-P Count-R
aaa_log-000592 3 3
aaa_log-000595 0 1
aaa_log-000598 1 0
aaa_log-000599 1 0
I can't find an example that combines an IF condition with a group-by in awk.
awk (more specifically, the GNU variant, gawk) has multi-dimensional arrays that can be indexed using input values (including character strings like those in your example). As such, you can count the values the way you want with:
{
values[$3] = 1 # this line records the values seen in column three
counts[$1][$3]++ # and this line counts their frequency per column-one value
}
The first line isn't strictly required, but it simplifies generating the output.
The only remaining part is to have an END clause that outputs the tabulated results.
END {
# Print column headings
printf "Col1 "
for (v in values) {
printf " Count-%s", v
}
printf "\n"
# Print tabulated results
for (i in counts) {
printf "%-20s", i
for (v in values) {
printf " %d", counts[i][v]
}
printf "\n"
}
}
Generating the values array handles the case where the set of values in column three isn't known in advance (e.g., when there's an unexpected value in your input).
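One caveat: for (v in values) and for (i in counts) visit indices in an unspecified order, so the columns and rows may not come out in the order shown in the question. If you are on gawk, a one-line sketch that pins the order down is to set this at the top of the END block:
PROCINFO["sorted_in"] = "@ind_str_asc"   # gawk-only: traverse arrays in sorted index order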
If you're using a different awk implementation (such as the one shipped with macOS), array indexing works differently (arrays are single-dimensional, but can be indexed by a comma-separated list of indices). This adds a little complexity, but the idea is the same:
{
files[$1] = 1
values[$3] = 1
counts[$1,$3]++
}
END {
# Print column headings
printf "Col1 "
for (v in values) {
printf " Count-%s", v
}
printf "\n"
# Print tabulated results
for (f in files) {
printf "%-20s", f
for (v in values) {
printf " %d", counts[f,v]
}
printf "\n"
}
}
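A hypothetical invocation (the file and script names are assumptions, not from the original), with the portable version above saved as count.awk (use gawk for the first variant):
awk -F'\t' -f count.awk input.tsv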

AWK Combine based on area of file

I have a file like
Sever Name aad98722RHEL 20120630 075022
CPU
1 sec 10 sec 15 sec 1 min 1 hour
5 8 0 1 19
TX kbits/sec:
interface 10 sec 1 min 10 min 1 hour 1 day
--------- ------ ----- ------ ------ -----
eth0 32 33 39 40 33
eth1 6 186 321 199 18
eth2 0 0 0 0 0
mgt0 0 0 0 0 0
RX kbits/sec:
interface 10 sec 1 min 10 min 1 hour 1 day
--------- ------ ----- ------ ------ -----
eth0 19 19 25 26 23
eth1 9 26 40 28 10
eth2 0 0 0 0 0
mgt0 0 0 0 0 0
Total memory usage: 1412916 kB
Resident set size : 1256360 kB
Heap usage : 1368212 kB
Stack usage : 84 kB
Library size : 16316 kB
What I would like to produce is
aad98722RHE 20120630 075022 CPU 5 8 0 1 19
aad98722RHE 20120630 075022 TX kbits/sec: 32 33 39 40 33 6 186 321 199 18 0 0 0 0 0 0 0 0 0 0
aad98722RHE 20120630 075022 RX kbits/sec: 19 19 25 26 23 9 26 40 28 10 0 0 0 0 0 0 0 0 0 0
aad98722RHE 20120630 075022 Total memory usage: 1412916 kB Resident set size : 1256360 kB Heap usage : 1368212 kB Stack usage : 84 kB Library size : 16316 kB
Can this be done in Awk/Sed and how?
Perhaps it's not the best solution, but it works.
file: a.awk:
function print_cpu( server_name, cpu )
{
while ( $0 !~ cpu )
{
getline
}
getline
getline
printf "%s %s ", server_name, cpu
for ( i = 1; i < NF + 1; i++ )
{
printf "%s ", $i
}
printf "\n"
}
function print_rx_or_tx( server_name, rx_or_tx )
{
while ( $0 !~ rx_or_tx )
{
getline
}
# skip the "interface ..." header line and the dashed separator line
getline
getline
printf "%s %s ", server_name, rx_or_tx
# print every data field of every interface line until the blank line that ends the block
while ( getline > 0 && $0 != "" )
{
for ( i = 2; i <= NF; i++ )
{
printf "%s ", $i
}
}
printf "\n"
}
function print_stuff( server_name )
{
while ( $0 == "" )
{
getline
}
printf "%s ", server_name
while ( $0 != "" )
{
printf "%s ", $0
if ( getline <= 0 )
{
break
}
}
printf "\n"
}
BEGIN { server = "Server Name"; cpu = "CPU"; tx = "TX kbits/sec:"; rx = "RX kbits/sec:" }
server { server_name = $3 " " $4 " " $5 }
! server
{
print_cpu( server_name, cpu )
print_rx_or_tx( server_name, tx )
print_rx_or_tx( server_name, rx )
print_stuff( server_name )
}
run: awk -f a.awk your_input_file
One way using perl:
Assuming infile has the content from your question and script.pl has the following content:
use warnings;
use strict;
my ($header, $newlines, $trans, @nums);
## Read input in paragraph mode.
local $/ = qq||;
while ( my $par = <> ) {
chomp $par;
## Save data of the header.
if ( $. == 1 ) {
my @header = $par =~ m/\ASer?ver\s+Name\s+(\S+)\s+(\S+)\s+(\S+)\s*\Z/s;
last unless @header;
$header = join qq| |, @header;
next;
}
## Number of '\n' in each paragraph (number of lines minus one).
$newlines = $par =~ tr/\n/\n/;
## Three lines, the CPU info. Extract what I need and print.
if ( $newlines == 2 ) {
printf qq|%s %s %s\n|, $header, $par =~ m/\A([^\n]+).*\n([^\n]+)\Z/s;
next;
}
## Transmission string.
if ( $newlines == 0 ) {
$trans = $par;
next;
}
## Transmission info. Extract numbers and print.
if ( $newlines == 5 ) {
my @lines = split /\n/, $par;
for my $i ( 0 .. $#lines ) {
my @f = split /\s+/, $lines[ $i ];
if ( grep { m/\D/ } @f[ 1 .. $#f ] ) {
next;
}
else {
push @nums, @f[ 1 .. $#f ];
}
}
printf qq|%s %s\n|, $header, join qq| |, @nums;
@nums = ();
}
## Resume info. Extract and print.
if ( $newlines == 4 ) {
$par =~ s/\n/\t/gs;
printf qq|%s %s\n|, $header, $par;
}
}
Run it like:
perl script.pl infile
With following output:
aad98722RHEL 20120630 075022 CPU 5 8 0 1 19
aad98722RHEL 20120630 075022 32 33 39 40 33 6 186 321 199 18 0 0 0 0 0 0 0 0 0 0
aad98722RHEL 20120630 075022 19 19 25 26 23 9 26 40 28 10 0 0 0 0 0 0 0 0 0 0
aad98722RHEL 20120630 075022 Total memory usage: 1412916 kB Resident set size : 1256360 kB Heap usage : 1368212 kB Stack usage : 84 kB Library size : 16316 kB

Get Ascii Code?

To retrieve the ASCII code of every character in the 13th column of a file, I wrote this script:
awk -v ch="'" '{
for (i=1;i<=length(substr($13,6,length($13)));i++)
{cmd = printf \"%d\\n\" \"" ch substr(substr($13,6,length($13)),i,1) "\"" cmd | getline output close(cmd) ;
Number= Number " " output
}
print Number ; Number=""
}' ~/a.test
but it doesn't work correctly. It works fine for a while, then starts producing weird results.
As an example, for this input (assume it's the 13th column):
CQ:Z:%8%%%%0%%%%9%%%%:%%%%%%%%%%%%%%%%%%
I expect to get this:
37 56 37 37 37 37 48 37 37 37 37 57 37 37 37 37 58 37 37 37 37 ...............
But I get this:
37 56 37 37 37 37 48 48 48 48 48 57 57 57 57 57 58 58 58 58 58 ...............
As you can see, the first miscomputation appears after the character "0" (48 in the result).
Do you know which part of my code is responsible for this error?
Try this:
awk '{
str = substr($13, 6)
for (i=1; i<=length(str); i++) {
cmd = "printf %d \42\47" substr(str, i, 1) "\42"
cmd | getline output
close(cmd)
Number= Number " " output
}
print Number
Number=""
}' ~/a.test
\42 is " and \47 is ', so this runs printf %d "'${char}" in the shell for each ${char}, which triggers evaluation as a C constant with the POSIX extension dictating a numeric value as noted in the final bullet of the POSIX printf definition's §Extended Description.
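You can check that leading-quote behaviour directly in a shell; on an ASCII system this prints 65:
printf '%d\n' "'A"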
N.B. The formatting matters!
Don't try to squeeze the code unless you know exactly what you're doing!
And a pure awk solution (I took the ord/chr functions directly from the manual):
printf '%s\n' 'CQ:Z:%8%%%%0%%%%9%%%%:%%%%%%%%%%%%%%%%%%'|
awk 'BEGIN { _ord_init() }
{
str = substr($0, 6)
for (i = 0; ++i <= length(str);)
printf "%s", (ord(substr(str, i, 1)) (i < length(str) ? OFS : ORS))
}
func _ord_init( low, high, i, t) {
low = sprintf("%c", 7) # BEL is ascii 7
if (low == "\a") { # regular ascii
low = 0
high = 127
}
else if (sprintf("%c", 128 + 7) == "\a") {
# ascii, mark parity
low = 128
high = 255
}
else { # ebcdic(!)
low = 0
high = 255
}
for (i = low; i <= high; i++) {
t = sprintf("%c", i)
_ord_[t] = i
}
}
func ord(str, c) {
# only first character is of interest
c = substr(str, 1, 1)
return _ord_[c]
}
func chr(c) {
# force c to be numeric by adding 0
return sprintf("%c", c + 0)
}'
This might work for you:
awk -vSQ="'" -vDQ='"' '{args=space="";n=split($13,a,"");for(i=1;i<=n;i++){args=args space DQ SQ a[i] DQ;format=format space "%d";space=" "};format=DQ format "\\n" DQ;system("printf " format " " args)}'

calculate the difference from flat file

I have a text file and the last 2 lines look like this...
Uptime: 822832 Threads: 32 Questions: 13591705 Slow queries: 722 Opens: 81551 Flush tables: 59 Open tables: 64 Queries per second avg: 16.518
Uptime: 822893 Threads: 31 Questions: 13592768 Slow queries: 732 Opens: 81551 Flush tables: 59 Open tables: 64 Queries per second avg: 16.618
How do I find the difference between the two values of each parameter?
The expected output is:
61 -1 1063 10 0 0 0 0.1
In other words, I would like to subtract the earlier uptime value from the current one.
Then find the difference between the Threads values, the Questions values, and so on.
The purpose of this exercise is to watch this file and alert the user when a difference is out of range, e.g. if the slow queries increase by more than 500 or the "Questions" delta is too low (<100).
(It is MySQL status output, but the question has nothing to do with MySQL itself, so the mysql tag does not apply.)
Just a slight variation on ghostdog74's (original) answer:
tail -2 file | awk ' {
gsub(/[a-zA-Z: ]+/," ")
m=split($0,a," ");
for (i=1;i<=m;i++)
if (NR==1) b[i]=a[i]; else print a[i] - b[i]
} '
Here's one way. tail is used to get the last 2 lines, which is especially efficient if you have a big file.
tail -2 file | awk '
{
gsub(/[a-zA-Z: ]+/," ")
m=split($0,a," ")
if (f) {
for (i=1;i<=m;i++){
print -(b[i]-a[i])
}
# to check for Questions, slow queries etc
if ( -(b[3]-a[3]) < 100 ){
print "Questions parameter too low"
}else if ( -(b[4]-a[4]) > 500 ){
print "Slow queries more than 500"
}else if ( a[1] - b[1] < 0 ){
print "mysql ...... "
}
exit
}
for(i=1;i<=m;i++ ){ b[i]=a[i] ;f=1 }
} '
output
$ ./shell.sh
61
-1
1063
10
0
0
0
0.1
gawk:
BEGIN {
arr[1] = "0"
}
length(arr) > 1 {
print $2-arr[1], $4-arr[2], $6-arr[3], $9-arr[4], $11-arr[5], $14-arr[6], $17-arr[7], $22-arr[8]
}
{
arr[1] = $2
arr[2] = $4
arr[3] = $6
arr[4] = $9
arr[5] = $11
arr[6] = $14
arr[7] = $17
arr[8] = $22
}