Comparing variables in awk - awk

My file has timestamps in its 6th field which looks like this: Mon Jul 7 14:53:16 PDT 2014
I want to get all those lines from this file whose 6th field values are within the last 24 hours.
Sample Input:
abc -> /aa/bbb, hello, /home/user/blah.pl, 516, usc, Mon Jul 4 10:06:33 PDT 2014
abc -> /aa/bbb, hello, /home/user/blah.pl, 516, usc, Mon Jul 5 10:06:33 PDT 2014
abc -> /aa/bbb, hello, /home/user/blah.pl, 516, usc, Mon Jul 7 07:06:33 PDT 2014
abc -> /aa/bbb, hello, /home/user/blah.pl, 516, usc, Mon Jul 7 08:06:33 PDT 2014
abc -> /aa/bbb, hello, /home/user/blah.pl, 516, usc, Mon Jul 7 09:06:33 PDT 2014
abc -> /aa/bbb, hello, /home/user/blah.pl, 516, usc, Mon Jul 7 10:06:33 PDT 2014
The field delimiter is comma.
Sample Code
But it's not working as expected:
awk 'BEGIN {FS = ","};
{ a=$6;
aint=a +"%y%m%d%H%M%S";
yestint=$(date --date='1 day ago' +"%y%m%d%H%M%S");
if (aint>yestint)
print aint;
}' /location/canzee/textfile.txt
Sample Output
I get an output like this:
awk: cmd. line:4: yestint=$(date --date=1
awk: cmd. line:4: ^ syntax error
awk: cmd. line:5: (END OF FILE)
awk: cmd. line:5: syntax error
Desired Output
Mon Jul 7 07:06:33 PDT 2014
Mon Jul 7 08:06:33 PDT 2014
Mon Jul 7 09:06:33 PDT 2014
Mon Jul 7 10:06:33 PDT 2014
I would like to know how to go about this if I can't call shell commands like date within awk command. I hope its clear enough.

Here's a sketch of an idea. Beware that it is gawk-specific.
# An array to convert abbreviated month names to numbers.
BEGIN {m["Jan"]=1; m["Feb"]=2; m["Mar"]=3; m["Apr"]=4; m["May"]=5; m["Jun"]=6
m["Jul"]=7; m["Aug"]=8; m["Sep"]=9; m["Oct"]=10; m["Nov"]=11; m["Dec"]=12;}
# later in your script
{
# systime() gives the number of seconds since the "epoch".
# Subtract 24-hours-worth of seconds from it to get "yesterday".
# (Note that this is yesterday at a specific time, which may not
# really be what you want.)
yest = systime() - 24 * 60 * 60;
a = "Mon Jul 7 14:27:56 PDT 2014" # or however a gets its value
# Split the fields of a into the array f (splitting on spaces).
split(a, f, " ");
# Split the fields of f[4] (the time) into the array t (splitting on colons).
split(f[4], t, ":")
# mktime() converts a date specification into seconds since the epoch.
# The datespec format is: 2014 7 7 14 27 56 [optional dst flag]
# If the daylight savings time flag is left out the system tries to determine
# whether or not dst is in effect.
tm = mktime(f[6] " " m[f[2]] " " f[3] " " t[1] " " t[2] " " t[3])
#Compare the seconds since epochs.
if (tm > yest)
...
}
In the context of your program, it might be done like this:
awk '
BEGIN {
m["Jan"]=1; m["Feb"]=2; m["Mar"]=3; m["Apr"]=4; m["May"]=5; m["Jun"]=6
m["Jul"]=7; m["Aug"]=8; m["Sep"]=9; m["Oct"]=10; m["Nov"]=11; m["Dec"]=12;
FS = "[[:space:]]*,[[:space:]]*"
yest = systime() - 24 * 60 * 60;
}
{
split($6, f, " ")
split(f[4], t, ":")
tm = mktime(f[6] " " m[f[2]] " " f[3] " " t[1] " " t[2] " " t[3])
if (tm > yest)
print $6;
}
' /location/canzee/textfile.txt

Related

Awk array is printing elements in unexpected order

I am using awk to capture timestamp from a dataset, and print out a value (memory) associated with that timestamp.
The following awk code works well to achieve this.
awk '
/show memory compare start/ {getline
start_time = $0;
}
/show memory compare end/ {getline
end_time = $0;
}
/mibd_interface/{
print start_time, $3, "\n" end_time, $4
}' snmpoutput.txt
Thu Sep 19 14:38:06.400 WIB 8670334
Thu Sep 19 14:40:56.123 WIB 8484152
Thu Sep 19 14:43:07.946 WIB 8369050
Thu Sep 19 14:45:27.916 WIB 8514825
Thu Sep 19 14:46:28.464 WIB 8446906
Thu Sep 19 14:50:10.422 WIB 8264885
Thu Sep 19 14:50:44.374 WIB 8264884
Thu Sep 19 14:55:05.760 WIB 8264960
After putting this data into an array and printing it, the elements appear out of order.
I have entered the order of appearance in the right most column, when comparing this output with the desired output above.
awk '
/show memory compare start/ {getline
start_time = $0;
}
/show memory compare end/ {getline
end_time = $0;
}
/mibd_interface/{mem_stats[start_time]=$3; mem_stats[end_time]=$4} END {for (time in mem_stats) {printf "%s => %s\n",time,mem_stats[time]}}' snmpoutput.txt
Thu Sep 19 14:55:05.760 WIB => 8264960 8
Thu Sep 19 14:45:27.916 WIB => 8514825 4
Thu Sep 19 14:43:07.946 WIB => 8369050 3
Thu Sep 19 14:40:56.123 WIB => 8484152 2
Thu Sep 19 14:50:44.374 WIB => 8264884 7
Thu Sep 19 14:38:06.400 WIB => 8670334 1
Thu Sep 19 14:50:10.422 WIB => 8264885 6
Thu Sep 19 14:46:28.464 WIB => 8446906 5
DATASET
(posted a sample as full dataset is too large)
One iteration
xr_lab#show memory compare start
Thu Sep 19 14:38:06.400 WIB
Successfully stored memory snapshot in /var/log/malloc_dump_memcmp_start.out
xr_lab#
xr_lab#
xr_lab#show memory compare end
Thu Sep 19 14:40:56.123 WIB
Successfully stored memory snapshot in /var/log/malloc_dump_memcmp_end.out
xr_lab#
xr_lab#show memory compare report
Thu Sep 19 14:41:08.084 WIB
PID NAME MEM BEFORE MEM AFTER DIFFERENCE MALLOCS-NEW
-------------------------------------------------------------------------------
2550 sysdb_svr_local 7881443 7878256 -3187 87391
7582 mibd_interface 8670334 8484152 -186182 267657
Second iteration
xr_lab#show memory compare start
Thu Sep 19 14:43:07.946 WIB
Successfully stored memory snapshot in /var/log/malloc_dump_memcmp_start.out
xr_lab#
xr_lab#
xr_lab#
xr_lab#show memory compare end
Thu Sep 19 14:45:27.916 WIB
Successfully stored memory snapshot in /var/log/malloc_dump_memcmp_end.out
xr_lab#
xr_lab#
xr_lab#show memory compare report
Thu Sep 19 14:45:42.091 WIB
PID NAME MEM BEFORE MEM AFTER DIFFERENCE MALLOCS-NEW
-------------------------------------------------------------------------------
6777 ospf 24294569 24283592 -10977 227389
7582 mibd_interface 8369050 8514825 145775 126259
Can I know why the elements are printed out of order, and the best way to fix this?
Thanks.
Can I know why the elements are printed out of order, and the best way to fix this?
The standard has the following to say:
The awk language supplies arrays that are used for storing numbers or strings. Arrays need not be declared. They shall initially be empty, and their sizes shall change dynamically. The subscripts, or element identifiers, are strings, providing a type of associative array capability. <snip>
for (variable in array)
which shall iterate, assigning each index of the array to variable in an unspecified order.
So from this, we know that an array in awk is an associative array, nothing more than a key-value-pair combination. A classic example in the programming world is a binary-tree such as C++'s std::map. Usually, ordering needs to be imposed to traverse and search the array efficiently, however standard awk does not give us the option to define such ordering. The standard leaves the key-order a free choice for whoever implements awk. That is also why it states that for (variable in array) will traverse the array in an unspecified order.
GNU awk, on the other hand, allows one to define the key-order on a global level using the array-variable PROCINFO["sorted_in"] and on a local level, using the asorti(source [, dest [, how ] ]) function. The latter will store the keys of array source in an integer-indexed array dest. The latter is populated such that the order of the keys is defined by the function how (dest[1] < dest[2] < dest[3] < ... with how defining <).
If you do not want to use GNU awk features and you know you have sorted input, then you can make use of two arrays. One that keeps track of the key-order and one that keeps track of the key-values:
{ key_order[++c]="key"
data["key"] = "value" }
END { for(i=1;i<=c;++i) print data[key_order[i]] }
Could you please try following, not tested it since lack of sample of actual Input_file.
1st solution: Considering that timestamps will never be same for any other lines if this is the case then simply do:
awk '
/show memory compare start/{
getline
start_time = $0;
b[++count]=start_time
}
/show memory compare end/{
getline
end_time = $0;
b[++count]=end_time
}
/mibd_interface/{
mem_stats[start_time]=$3
mem_stats[end_time]=$4
}
END{
for(i=1;i<=count;i++){
printf "%s => %s\n",b[i],mem_stats[b[i]]
}
}' Input_file
2nd solution: OR following will consider that you may have same timestamp sometimes in logs:
awk '
/show memory compare start/{
getline
start_time = $0;
if(!a[start_time]++){
b[++count]=start_time
}
}
/show memory compare end/{
getline
end_time = $0;
if(!a[end_time]++){
b[++count]=end_time
}
}
/mibd_interface/{
mem_stats[start_time]=$3
mem_stats[end_time]=$4
}
END{
for(i=1;i<=count;i++){
printf "%s => %s\n",b[i],mem_stats[b[i]]
}
}' Input_file
Tested code with Input_file:
cat Input_file
xr_lab#show memory compare start
Thu Sep 19 14:38:06.400 WIB
Successfully stored memory snapshot in /var/log/malloc_dump_memcmp_start.out
xr_lab#
xr_lab#
xr_lab#show memory compare end
Thu Sep 19 14:40:56.123 WIB
Successfully stored memory snapshot in /var/log/malloc_dump_memcmp_end.out
xr_lab#
xr_lab#show memory compare report
Thu Sep 19 14:41:08.084 WIB
PID NAME MEM BEFORE MEM AFTER DIFFERENCE MALLOCS-NEW
-------------------------------------------------------------------------------
2550 sysdb_svr_local 7881443 7878256 -3187 87391
7582 mibd_interface 8670334 8484152 -186182 267657
xr_lab#show memory compare start
Thu Sep 19 14:43:07.946 WIB
Successfully stored memory snapshot in /var/log/malloc_dump_memcmp_start.out
xr_lab#
xr_lab#
xr_lab#
xr_lab#show memory compare end
Thu Sep 19 14:45:27.916 WIB
Successfully stored memory snapshot in /var/log/malloc_dump_memcmp_end.out
xr_lab#
xr_lab#
xr_lab#show memory compare report
Thu Sep 19 14:45:42.091 WIB
PID NAME MEM BEFORE MEM AFTER DIFFERENCE MALLOCS-NEW
-------------------------------------------------------------------------------
6777 ospf 24294569 24283592 -10977 227389
7582 mibd_interface 8369050 8514825 145775 126259
Output is as follows:
Thu Sep 19 14:38:06.400 WIB => 8670334
Thu Sep 19 14:40:56.123 WIB => 8484152
Thu Sep 19 14:43:07.946 WIB => 8369050
Thu Sep 19 14:45:27.916 WIB => 8514825

How to print matching line, 3 lines after, and matching URL

I try to extract information from SMTP mails in text, i.e:
the date (ex: Wed, 9 Oct 2019 01:55:58 -0700 (PDT)
the sender (ex: from xxx.yyy.com (zzz:com. [111.222.333.444])
URLs present in the mail (ex: http://some.thing)
Here's an example of an input:
Delivered-To: SOME#ADDRESS.COM
Received: by X.X.X.X with SMTP id SOMEID;
Wed, 9 Oct 2019 01:55:58 -0700 (PDT)
X-Received: by X.X.X.X with SMTP id SOMEID;
Wed, 09 Oct 2019 01:55:58 -0700 (PDT)
Return-Path: <SOME#ADDRESS.COM>
Received: from SOME.URL.COM (SOME.OTHER.URL.COM. [X.X.X.X])
by SOME.THIRD.URL.COM with ESMTP id SOMEID
for <SOME#ADDRESS.COM>;
Wed, 09 Oct 2019 01:55:58 -0700 (PDT)
SOME_HTML
SOME_HTML
href="http://URL1"><img
SOME_HTML
src="http://URL2"
SOME_HTML
The example is deliberately truncated because the header is longer, but this is for the example
I've tried sed and awk and I managed to do some thing but not as I want.
SED:
sed -e 's/http/\nhttp/g' -n -e '/Received: from/{h;n;n;n;H;x;s/\n \+/;/;p}' a.txt
The first one is to have the URL on one lien but I didn't manage to use it after.
And anyway, it's not in order.
AWK:
BEGIN{
RS = "\n";
FS = "";
}
/Received: from/{
from = $0;
getline;
getline;
getline;
date = $0
}
/"\"https?://[^\"]+"/
{
FS="\"";
print $0;
}
END{
print date";"from;
};
This one works except for the URL. The rexgexp doesn't works while in a oneline yes.
I also tried to find a more elegant way for the date by using the value of NR+3, but it didn't work.
And display this in csv format:
date;sender;URL1;URL2;...
I would prefer pure sed or pure awk, because I think I can do it with grep, tail, sed and awk but as I want to learn, I prefer one or both of them :)
Well, the following longish sed script with comments inside:
sed -nE '
/Received: from /{
# hold mu line!
h
# ach, spagetti code, here we go again
: notdate
${
s/.*/ERROR: INVALID INPUT: DATE NOT FOUND/
p
q1
}
# the next line after the line ending with ; should be the date
/;$/!{
# so search for a line ending with ;
n
b notdate
}
# the next line is the date
n
# remove leading spaces
s/^[[:space:]]*//
# grab the Received: from line
G
# and save it for later
h
}
# headers end with an empty line
/^$/{
# loop over lines
: read_next_line
n
# flag with \x01<URL>\x02 all occurences of URLs
s/"(http[^"]+)"/\x01\1\x02/g
# we found at least one URL if there is \x01 in the pattern space
/\x01/{
# extract each occurence to the end of pattern space with a newline
: again
s/^([^\x01]*)\x01([^\x02]*)\x02(.*)$/\1\3\n\2/
t again
# remove everything in front of separator - the unparsed part of line
s/^[^\n]*\n//
# add URLs to hold space
H
}
# if this is the last line, we should finally print something!, and, exit
${
# grab the hold space
x
# replace the separator for a ;
s/\n/;/g
# print and exit successfully
p
q 0
}
# here we go again!
b read_next_line
}
'
for the following input:
Delivered-To: SOME#ADDRESS.COM
Received: by X.X.X.X with SMTP id SOMEID;
Wed, 9 Oct 2019 01:55:58 -0700 (PDT)
X-Received: by X.X.X.X with SMTP id SOMEID;
Wed, 09 Oct 2019 01:55:58 -0700 (PDT)
Return-Path: <SOME#ADDRESS.COM>
Received: from SOME.URL.COM (SOME.OTHER.URL.COM. [X.X.X.X])
by SOME.THIRD.URL.COM with ESMTP id SOMEID
for <SOME#ADDRESS.COM>;
Wed, 09 Oct 2019 01:55:58 -0700 (PDT)
SOME_HTML
SOME_HTML
href="http://URL1"><img
SOME_HTML
src="http://URL2"
SOME_HTML
SOMEHTML src="http://URL3" SOMEHTML src="http://URL4"
outputs:
Wed, 09 Oct 2019 01:55:58 -0700 (PDT);Received: from SOME.URL.COM (SOME.OTHER.URL.COM. [X.X.X.X]);http://URL1;http://URL2;http://URL3;http://URL4

Text processing to create a list of unique ID's

I Have a file with ID’s and Names applicable to them as below:
1234|abc|cde|fgh
5678|ijk|abc|lmn
9101|cde|fgh|klm
1213|klm|abc|cde
I need a file with only unique Names as a list.
Output File:
abc|sysdate
cde|sysdate
fgh|sysdate
ijk|sysdate
lmn|sysdate
klm|sysdate
Where sysdate is the current timestamp of processing.
Requesting you to help on this. Also requesting for a explanation for the code suggested.
What this code does :
awk -F\| '{ for(i=2; i <= NF; i++) a[$i] = a[$i] FS $1 }' input.csv
-F sets the delimiter to |, awk process line by line your file, creates a map named 'a', reads from column 2 until the end and fill the map using the current cell processed as key and the current cell + file separator + value in the first column as value.
When awk ends processing the first line, 'a' is :
a['abc'] = 'abc|1234'
a['cde'] = 'cde|1234'
a['fgh'] = 'fgh|1234'
This script does not print anything.
What you want is something like this :
awk -F'|' '{for(i=2;i<=NF;i++){if(seen[$i] != 1){print $i, strftime(); seen[$i]=1}}}' OFS='|' input.csv
-F sets the input delimiter to |, OFS does the same for the output delimiter.
For each value from the column 2 to the end of the line, we check if it has already been seen before. If not, we print the value and the time of process. Then we register the value in a map so we can avoid to process it again.
Output :
abc|Thu Oct 18 10:40:13 CEST 2018
cde|Thu Oct 18 10:40:13 CEST 2018
fgh|Thu Oct 18 10:40:13 CEST 2018
ijk|Thu Oct 18 10:40:13 CEST 2018
lmn|Thu Oct 18 10:40:13 CEST 2018
klm|Thu Oct 18 10:40:13 CEST 2018
You can change the format of sysdate. See documentation of gawk strftime here

Remove commas and format dates

I have a large file with entries such as:
<VAL>17,451.26</VAL>
<VAL>353.93</VAL>
<VAL>395.00</VAL>
<VAL>2,405.00</VAL>
<DATE>31 Jul 2013</DATE>
<DATE>31 Jul 2013</DATE>
<DATE>31 Dec 2014</DATE>
<DATE>21 Jun 2002</DATE>
<DATE>10 Jul 2002</DATE>
<MOD>PL</MOD>
<BATCH>13382</BATCH>
<TYPE>Invoice</TYPE>
<REF1>13541/13382</REF1>
<REF2>671042638320</REF2>
<NOTES>a-07 final elec</NOTES>
<SNAME>EDF ENERGY ( Electricity )</SNAME>
<VAL>55.22</VAL>
</CLT>
<CLT>
<CHD>MAT-01</CHD>
<OPN>U5U1</OPN>
<PERIOD>07 2013</PERIOD>
<DATE>13 Jun 2013</DATE>
<DATE>10 Jul 2002</DATE>
<DATE>10 Jul 2002</DATE>
<DATE>21 Aug 2007</DATE>
<DATE>10 Jul 2002</DATE>
<VAL>-4,122,322.03</VAL>
I need to remove the commas in the VAL fields and change the dates to YYYY-MM-DD (e.g. 2013-07-31) in the DATE fields.
Looking for a quick (efficient) way of doing this.
Thanks
This should get you started:
awk -F"[<>]" 'BEGIN {split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec",month," ");for (i=1;i<=12;i++) mdigit[month[i]]=i} /<VAL>/ {gsub(/\,/,"")} /<DATE>/ {split($3,a," ");$0=sprintf("<DATE>%s-%02d-%02d</DATE>",a[3],mdigit[a[2]],a[1])}1' file
<VAL>17451.26</VAL>
<VAL>353.93</VAL>
<VAL>395.00</VAL>
<VAL>2405.00</VAL>
<DATE>2013-07-31</DATE>
<DATE>2013-07-31</DATE>
<DATE>2014-12-31</DATE>
<DATE>2002-06-21</DATE>
<DATE>2002-07-10</DATE>
<MOD>PL</MOD>
<BATCH>13382</BATCH>
<TYPE>Invoice</TYPE>
<REF1>13541/13382</REF1>
<REF2>671042638320</REF2>
<NOTES>a-07 final elec</NOTES>
<SNAME>EDF ENERGY ( Electricity )</SNAME>
<VAL>55.22</VAL>
</CLT>
<CLT>
<CHD>MAT-01</CHD>
<OPN>U5U1</OPN>
<PERIOD>07 2013</PERIOD>
<DATE>2013-06-13</DATE>
<DATE>2002-07-10</DATE>
<DATE>2002-07-10</DATE>
<DATE>2007-08-21</DATE>
<DATE>2002-07-10</DATE>
<VAL>-4122322.03</VAL>
sed '# init month convertor in holding buffer
1{h;s/.*/Jan01Fev02Mar03Apr04May05Jun06Jul07Aug08Sep09Oct10Nov11Dec12/;x;}
# change Val
/^<VAL>/ s/,//g
# Change Date
/^<DATE>/ {
# change month
G
s/[[:space:]]\{1,\}\([A-Z][a-z][a-z]\)[[:space:]]\{1,\}\(.*\)\n.*\1\([0-9][0-9]\).*/-\3-\2/
# reformat order
s/>\(.*\)-\(.*\)-\(.*\)</>\3-\2-\1</
}' YourFile
posix sed with not extra sub shell for dae conversion
reformat date take 2 s///here but could be merged in 1 s/// a bit more unreadeable (already very attractive regex like this)
could easily add some security feature about source date like bad date format
Your input seems like XML. I'd use a proper XML handling tool, e.g. XML::XSH2, a wrapper around Perl's XML::LibXML:
open file.xml ;
for //VAL set . xsh:subst(., ',', '','g') ;
perl { use Time::Piece } ;
for my $d in //DATE {
$t = $d/text() ;
set $d/text() { Time::Piece->strptime($t, '%d %b %Y')->ymd } ;
}
save :b ;
This might work for you (GNU sed & bash):
sed -r '/^<VAL>/s/,//g;/^(<DATE>)(.*)(<\/DATE>)$/s//echo "\1"$(date -d "\2" +%F)"\3"/e' file
This removes all commas on a line starting <VAL> and for those lines that contain date tags, uses the date utility and the evaluate flag in the substitution command to rearrange the date to YYYY-MM-DD.
An alternative solution, using only seds commands:
sed -r '/^<VAL>/s/,//g;/^<DATE>/!b;s/$/\nJan01Feb02Mar03Apr04May05Jun06Jul07Aug08Sep09Oct10Nov11Dec12/;s/^(<DATE>)(..) (...) (....)(<\/DATE>\n).*\3(..)/\1\4-\6-\2\5/;P;d' file
Appends a lookup to the end of the date line and uses regexp to rearrange the output.

convert month from Aaa to xx in little script with awk

I am trying to report on the number of files created on each date. I can do that with this little one liner:
ls -la foo*.bar|awk '{print $7, $6}'|sort|uniq -c
and I get a list how many fooxxx.bar files were created by date, but the month is in the form: Aaa (ie: Apr) and I want xx (ie: 04).
I have feeling the answer is in here:
awk '
BEGIN{
m=split("Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec",d,"|")
for(o=1;o<=m;o++){
months[d[o]]=sprintf("%02d",o)
}
format = "%m/%d/%Y %H:%M"
}
{
split($4,time,":")
date = (strftime("%Y") " " months[$2] " " $3 " " time[1] " " time[2] " 0")
print strftime(format, mktime(date))
}'
But have no to little idea what I need to strip out and no idea how to pass $7 to whatever I carve out of this to convert Apr to 04.
Thanks!
Here's the idiomatic way to convert an abbreviated month name to a number in awk:
$ echo "Feb" | awk '{printf "%02d\n",(index("JanFebMarAprMayJunJulAugSepOctNovDec",$0)+2)/3}'
02
$ echo "May" | awk '{printf "%02d\n",(index("JanFebMarAprMayJunJulAugSepOctNovDec",$0)+2)/3}'
05
Let us know if you need more info to solve your problem.
Assuming the name of the months only appear in the month column, then you could do this:
ls -la foo*.bar|awk '{sub(/Jan/,"01");sub(/Feb/,"02");print $7, $6}'|sort|uniq -c
Just use the field number of your month as an index into the months array.
print months[$6]
Since ls output differs from system to system and sometimes on the same system depending on file age and you didn't give any examples, I have no way of knowing how to guide you further.
Oh, and don't parse ls.
To parse AIX istat, I use:
istat .profile | grep "^Last modified" | read dummy dummy dummy mon day time dummy yr dummy
echo "M: $mon D: $day T: $time Y: $yr"
-> Month: Mar Day: 12 Time: 12:05:36 Year: 2012
To parse AIX istat month, I use this two-liner AIX 6.1 ksh 88:
monstr="???JanFebMarAprMayJunJulAugSepOctNovDec???"
mon="Oct" ; hugo=${monstr%${mon}*} ; hugolen=${#hugo} ; let hugol=hugolen/3 ; echo "Month: $hugol"
-> Month: 10
1..12 : month name ok
If lt 1 or gt 12 : month name not ok
Instead of "hugo" use speaking names ;-))
Adding a version for AIX, that shows how to retrieve all the date elements (in whatever timezone you need it them in), and display an iso8601 output
tempTZ="UTC" ; TZ="$tempTZ" istat /path/to/somefile \
| grep modified \
| awk -v tmpTZ="$tempTZ" '
BEGIN {Mmms="Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec";
n=split(Mmms,Mmm," ") ;
for(i=1;i<=n;i++){ mm[Mmm[i]]=sprintf("%02d",i) }
}
{ printf("%s-%s-%sT%s %s",$NF, mm[$4], $5, $6, tmpTZ ) }
' ## this will output an iso8601 date of the modification date of that file,
## for ex: 2019-04-18T14:16:05 UTC
## you can tempTZ=anything, for ex: tempTZ="UTC+2" to see that date in UTC+2 timezone... or tempTZ="EST" , etc
I show the iso8601 version to make it more known & used, but of course you may only need the "mm" portion, which is easly done : mm[$4]