Merge two files and specify fields using awk - awk

I want to merge two files, file0.txt and file1.txt, and output the result. With my attempt below, however, the columns of file1.txt just get appended to the end of each line.
$ cat file0.txt
Name: vSwitch0 MTU: 1500 Uplinks: vmnic0 Portgroups: VM Network, Management Network
Name: vSwitch1 MTU: 1500 Uplinks: vmnic1 Portgroups: VM Network 2
Name: vSwitch2 MTU: 1500 Uplinks: vmnic2 Portgroups: VM Network 3
$ cat file1.txt
vmnic2 nvmxnet3 Down 0 half
vmnic0 nvmxnet3 Up 10000 full
vmnic1 nvmxnet3 Up 10000 full
$ awk 'NR==FNR { temp[FNR]=$2 FS $3 FS $4 FS $5; next }; {print ($0) " " temp[FNR]}' <(sort file1.txt) <(sort file0.txt)
Name: vSwitch0 MTU: 1500 Uplinks: vmnic0 Portgroups: VM Network, Management Network **nvmxnet3 Up 10000 full**
Name: vSwitch1 MTU: 1500 Uplinks: vmnic1 Portgroups: VM Network 2 **nvmxnet3 Up 10000 full**
Name: vSwitch2 MTU: 1500 Uplinks: vmnic2 Portgroups: VM Network 3 **nvmxnet3 Down 0 half**
This is close, but not the layout I want. I want something like the output below.
However, I don't want to use join or paste; I want to use only awk.
Name: vSwitch0 MTU: 1500 Uplinks: vmnic0 **nvmxnet3 Up 10000 full** Portgroups: VM Network, Management Network
Name: vSwitch1 MTU: 1500 Uplinks: vmnic1 **nvmxnet3 Up 10000 full** Portgroups: VM Network 2
Name: vSwitch2 MTU: 1500 Uplinks: vmnic2 **nvmxnet3 Down 0 half** Portgroups: VM Network 3
OR
My second question: is something like this even possible?
cat file1.txt
vmnic0 nvmxnet3 Up 10000 Full
vmnic1 nvmxnet3 Up 10000 Full
vmnic2 nvmxnet3 Down 0 Half
vmnic3 nvmxnet3 Up 10000 Full
# cat file2.txt
Name: vSwitch0 MTU: 1500 Uplinks: vmnic0 Portgroups: VM Network, Management Network
Name: vSwitch1 MTU: 1500 Uplinks: vmnic1 Portgroups: VM Network 2
Name: vSwitch2 MTU: 1500 Uplinks: vmnic3, vmnic2 Portgroups: VM Network 3
Is it possible to output something like this?
vmnic0 nvmxnet3 Up 10000 Full Name: vSwitch0 MTU: 1500 Portgroups: VM Network, Management Network
vmnic1 nvmxnet3 Up 10000 Full Name: vSwitch1 MTU: 1500 Portgroups: VM Network 2
vmnic2 nvmxnet3 Down 0 Half Name: vSwitch2 MTU: 1500 Portgroups: VM Network 3
vmnic3 nvmxnet3 Up 10000 Full Name: vSwitch2 MTU: 1500 Portgroups: VM Network 3

Use gsub() to replace " Portgroups:" with the text you want to add followed by " Portgroups:":
awk '
NR==FNR { temp[FNR]=$2 FS $3 FS $4 FS $5; next }
{ gsub(" Portgroups:", " " temp[FNR] " Portgroups:"); print }
' <(sort file1.txt) <(sort file0.txt)
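The second layout should also be doable with awk alone. A sketch: read file2.txt first, map every vmnic listed after Uplinks: to its switch line with the Uplinks: field removed, then print each file1.txt line followed by its mapped line. This assumes the uplinks sit between Uplinks: and Portgroups: and are comma-separated, as in your sample; the names sw, uplinks and rest are just illustrative:
awk '
NR==FNR {
    # first file (file2.txt): map every uplink vmnic to its switch line
    uplinks = $0
    sub(/.*Uplinks: */, "", uplinks)       # keep only the text after "Uplinks:"
    sub(/ *Portgroups:.*/, "", uplinks)    # ...and before "Portgroups:"
    rest = $0
    sub(/Uplinks:.*Portgroups:/, "Portgroups:", rest)   # drop the Uplinks: field
    n = split(uplinks, u, /, */)
    for (i = 1; i <= n; i++) sw[u[i]] = rest
    next
}
$1 in sw { print $0, sw[$1] }
' file2.txt file1.txt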

Related

Format dmidecode output

I need only a few values from dmidecode -t 17, grouped onto one line per module.
Here is what I am getting:
dmidecode -t 17 | egrep "Serial Number|Size|Manufacturer:" | egrep -v "No Module|Unknown|None|Volatile"
Size: 32 GB
Manufacturer: X
Serial Number: 1
Size: 32 GB
Manufacturer: X
Serial Number: 2
Size: 32 GB
Manufacturer: X
Serial Number: 3
and I am trying to get something similar to
32GB,X,1
32GB,X,2
32GB,X,3
Any suggestions?
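One way is to keep the existing filters and let awk collect the three values for each module. This is a sketch that assumes each module prints Size, Manufacturer and Serial Number in that order, as in the output above:
dmidecode -t 17 |
egrep "Serial Number|Size|Manufacturer:" |
egrep -v "No Module|Unknown|None|Volatile" |
awk -F': *' '
    $1 ~ /Size/          { size = $2; gsub(/ /, "", size) }   # "32 GB" -> "32GB"
    $1 ~ /Manufacturer/  { mfr = $2 }
    $1 ~ /Serial Number/ { print size "," mfr "," $2 }
'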

Modify awk parameter with other command

I have a file like the one below, with multiple comma-separated fields per line. I want to display some of them, while processing one of them with another command.
TITLE,OpenVPN ...
HEADER,CLIENT_LIST,Common Name,Real Address,Virtual Address,Virtual IPv6 Address,Bytes Received,Bytes Sent,Connected Since,Connected Since (time_t),Username,Client ID,Peer ID
CLIENT_LIST,name1,1.1.1.1:1,10.0.0.1,,2692253,3765861,Wed Jun 23 12:51:08 2021,1624452668,name1,4727,0
CLIENT_LIST,name2,2.2.2.2:2,10.0.0.2,,1571221,2080242,Thu Jul 1 19:24:10 2021,1625167450,name2,5625,0
CLIENT_LIST,name3,3.3.3.3:3,10.0.0.3,,2670410,3736957,Wed Jun 23 16:20:51 2021,1624465251,name3,4747,0
...
The expected output is this:
name1 10.0.0.1 2021-06-23 12:51:08
name2 10.0.0.2 2021-07-01 19:24:10
name3 10.0.0.3 2021-06-23 16:20:51
The command I have now is this:
grep '^CLIENT_LIST,' /var/run/ovpn-server.status |awk -F',' '{print $2 $4 $9}' |sort
It prints the desired fields, but doesn't convert the timestamp to a formatted time. Here's the command for that:
date -d @1624452668 +"%Y-%m-%d %H:%M:%S"
How can I integrate the date command into the awk script? Or what other solution is there to accomplish this?
I also intend to put the output into a columns/table layout with the column command; I've done that before, so that's not part of the question.
You may use this awk:
awk -F, -v OFS='\t' '$1 == "CLIENT_LIST" {
  cmd = "date +\047%Y-%m-%d %H:%M:%S\047 -d\047@" $9 "\047"
  print $2, $4, ((cmd | getline dt) > 0 ? dt : $9)
  close(cmd)
}' file
name1 10.0.0.1 2021-06-23 08:51:08
name2 10.0.0.2 2021-07-01 15:24:10
name3 10.0.0.3 2021-06-23 12:20:51
Explanation:
-F, -v OFS='\t': sets the input field separator to , and the output field separator to a tab
$1 == "CLIENT_LIST": only act when the first field is CLIENT_LIST
cmd = "date +\047%Y-%m-%d %H:%M:%S\047 -d\047@" $9 "\047": builds the date command using $9
cmd | getline dt: invokes the external date command and captures its output in dt
(cmd | getline dt) > 0: true when the date command succeeds
print: prints the 2nd field, the 4th field, and the formatted date (or $9 if date failed)
If you actually just want the date+time from $8 reformatted, rather than the epoch seconds in $9 converted to a date+time, then you can just do the following. It will be orders of magnitude faster than calling date, since that would require awk to spawn a subshell once per input line to call date from, which would be extremely slow.
Using any awk in any shell on every Unix box:
$ cat tst.awk
BEGIN { FS=","; OFS="\t" }
NR > 2 {
    split($8,t," ")
    mthNr = (index("JanFebMarAprMayJunJulAugSepOctNovDec",t[2])+2)/3
    print $2, $4, sprintf("%04d-%02d-%02d %s", t[5], mthNr, t[3], t[4])
}
$ awk -f tst.awk file
name1 10.0.0.1 2021-06-23 12:51:08
name2 10.0.0.2 2021-07-01 19:24:10
name3 10.0.0.3 2021-06-23 16:20:51
or, if you really want to use the epoch seconds from $9, use GNU awk for strftime() so you don't have to spawn subshells to call date (but note that the output now becomes TZ-dependent, just like with date):
$ cat tst.awk
BEGIN { FS=","; OFS="\t" }
NR > 2 {
    print $2, $4, strftime("%F %T",$9)
}
$ awk -f tst.awk file
name1 10.0.0.1 2021-06-23 07:51:08
name2 10.0.0.2 2021-07-01 14:24:10
name3 10.0.0.3 2021-06-23 11:20:51
$ TZ=UTC awk -f tst.awk file
name1 10.0.0.1 2021-06-23 12:51:08
name2 10.0.0.2 2021-07-01 19:24:10
name3 10.0.0.3 2021-06-23 16:20:51
or setting the UTC flag in strftime() if UTC is what you have in your data:
$ cat tst.awk
BEGIN { FS=","; OFS="\t" }
NR > 2 {
    print $2, $4, strftime("%F %T",$9,1)
}
$ awk -f tst.awk file
name1 10.0.0.1 2021-06-23 12:51:08
name2 10.0.0.2 2021-07-01 19:24:10
name3 10.0.0.3 2021-06-23 16:20:51

AWK scripting to find the highest number

#id;clientid;product name;qty;price
1;fooclient;product A;3;100
2;booclient;product B;4;200
3;xyzzycompany;product C;2;35000
4;testclient;product B;1;190
5;fooclient;product A;10;100
6;testclient;product B;1;25000
7;Mouccccccc;product C;2;300
8;Deeccccccc;product C;2;10
9;ICICT;product Z;12;45000
10;AXISX;product D;14;75000
11;Fcebook;product Z;12;65000
I need help finding the clientid with the highest price, using awk.
Options tried:
filename: invoices_input.txt (containing all the values above)
awk 'BEGIN { FS = ";" } !/^#/ {print $2 " " $NF}' invoices_input.txt
Result:
fooclient 100
booclient 200
xyzzycompany 35000
testclient 190
fooclient 100
testclient 25000
Mouccccccc 300
Deeccccccc 10
ICICT 45000
AXISX 75000
Fcebook 65000
I am expecting AXISX to be printed as the client with the highest price.
with awk and friends
$ sort -t';' -k2,2 -k5,5nr file |
awk -F';' '!a[$2]++{print $1 "\t" $2,$NF}' |
sort -n |
cut -f2-
clientid price
fooclient 100
AXISX 75000
Fcebook 65000
booclient 200
xyzzycompany 35000
testclient 25000
Mouccccccc 300
Deeccccccc 10
ICICT 45000
Of course you can do it all in awk as well.
Without maintaining the order (assumes prices > 0):
$ awk -F';' 'a[$2]<$NF {a[$2]=$NF}
END {for(k in a) print k,a[k]}' file
clientid price
fooclient 100
ICICT 45000
Deeccccccc 10
xyzzycompany 35000
Fcebook 65000
testclient 25000
booclient 200
Mouccccccc 300
AXISX 75000
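If you want to keep the clients in their original input order, a minimal all-awk sketch (same assumption that prices are positive; the array names seen, order and max are just illustrative):
awk -F';' '
NR > 1 {
    if (!seen[$2]++) order[++n] = $2   # remember first-seen order of each client
    if ($NF > max[$2]) max[$2] = $NF   # track the highest price per client
}
END {
    for (i = 1; i <= n; i++) print order[i], max[order[i]]
}
' file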
If you just need the single client with the highest price, you don't need all this complexity:
$ sort -t';' -k5,5nr file | sed 1q | cut -d';' -f2
AXISX
Try:
awk -F\; 'NR > 1 {
    if ($5 > price) {
        price = $5
        company = $2
    }
    else if ($5 == price) {
        company = company "\n" $2
    }
}
END {
    print company
}' file

get rid of the end of lines matching pattern linux

I am trying to get a list of IP addresses from the configuration, and I receive them in the format *.*.*.*:*, where the last field is the port number of the established connection.
How can I get rid of the port number?
Here is the command I run now:
ss -ta | tail -n +2 |awk '{print $4}' | sort -u
I understand I need to pipe through sed between awk and sort to remove the part after the colon, but I am not sure of the right way to do it.
The command ss -ta returns the following:
State Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0 128 *:ssh *:*
LISTEN 0 100 127.0.0.1:smtp *:*
CLOSE-WAIT 32 0 192.168.1.7:48474 104.18.35.72:https
CLOSE-WAIT 32 0 192.168.1.7:52879 104.18.34.72:https
CLOSE-WAIT 1 0 192.168.1.7:38492 82.80.211.109:http
LISTEN 0 128 :::ssh :::*
LISTEN 0 100 ::1:smtp :::*
ESTAB 0 52 fe80::a00:27ff:fead:6df2%enp0s3:ssh fe80::e1
This is the output of my command:
> 127.0.0.1:smtp
> 192.168.1.7:38492
> 192.168.1.7:48474
> 192.168.1.7:52879
> ::1:smtp
> fe80::a00:27ff:fead:6df2%enp0s3:ssh
> :::ssh
> *:ssh
The desired output is:
> 127.0.0.1
> 192.168.1.7
Thanks.
Without testable sample input and expected output it's a bit of a guess, but it sounds like all you need is:
ss -ta | awk '{$0=$4;sub(/:[^:]+$/,"")} NR>1 && !seen[$0]++'
e.g. using cat file instead of ss -ta to pipe your posted input to the command:
$ cat file | awk '{$0=$4;sub(/:[^:]+$/,"")} NR>1 && !seen[$0]++'
*
127.0.0.1
192.168.1.7
::
::1
fe80::a00:27ff:fead:6df2%enp0s3
but if we look at your posted expected output then maybe what you really want is more like:
$ cat file | awk '{$0=$4;sub(/:[^:]+$/,"")} NR>1 && /[0-9]+(\.[0-9]+){3}/ && !seen[$0]++'
127.0.0.1
192.168.1.7
You can do the port removal with GNU awk: use awk '{print gensub(/:.*/,"","g",$4)}' in your original pipe.
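Dropped into the original pipeline, that would look like the sketch below. Note it removes everything from the first colon onward, which is fine for the IPv4 addresses in the desired output:
ss -ta | tail -n +2 | awk '{print gensub(/:.*/,"","g",$4)}' | sort -u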
Just use a regex to delete everything after the colon. You can use:
ss -ta | tail -n +2 |awk '{print $4}' | sort -u | sed 's/:.*$//g' | uniq
or you can even use awk with : as the field separator:
ss -ta | tail -n +2 |awk '{print $4}' | sort -u | awk -F : '{print $1}' | uniq
or cut with : as the delimiter:
ss -ta | tail -n +2 |awk '{print $4}' | sort -u | cut -d : -f 1 | uniq

updating a count depending on values in a file fulfilling criteria specified by a second file

I have two files and I want to update file A with a new column containing counts of how many times the number in $2 of file B fell within the range of $2 and $3 of file A, but only when $1 matches in both files.
file A
n01 2000 9000
n01 29000 41000
n01 60000 89000
n05 10000 15000
n80 5000 12000
n80 59000 68000
n80 100000 110000
file B
n01 6000
n01 6800
n01 35000
n05 14000
n80 65000
n80 104000
expected output
n01 2000 9000 2
n01 29000 41000 1
n01 60000 89000 0
n05 10000 15000 1
n80 5000 12000 0
n80 59000 68000 1
n80 100000 110000 1
awk '
FNR==NR {
    # first file (fileB): remember each key/value pair
    A[$1,$2]
    next
}
{
    # second file (fileA): count fileB values with the same key that fall in [$2,$3]
    c = 0
    for (i in A) {
        split(i, X, SUBSEP)
        if (X[1] == $1) {
            if (X[2] >= $2 && X[2] <= $3) {
                c++
            }
        }
    }
    print $0, c
}
' fileB fileA
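An alternative sketch that avoids looping over every fileB entry for each fileA line: collect the fileB values per key first, so each fileA line only checks the values sharing its key (assumes whitespace-separated input as shown; the array name vals is just illustrative):
awk '
FNR==NR { vals[$1] = vals[$1] " " $2; next }   # fileB: collect all values per key
{
    # fileA: count the collected values for this key that fall within [$2, $3]
    c = 0
    n = split(vals[$1], v, " ")
    for (i = 1; i <= n; i++)
        if (v[i] >= $2 && v[i] <= $3) c++
    print $0, c
}
' fileB fileA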
Not strictly awk, but you can help your script along with some other shell utilities, like this:
join fileA fileB -a1 | awk '{ key=$1 " " $2 " " $3; if (! (key in array) ){array[key]=0} } $4>=$2 && $4<=$3{key=$1 " " $2 " " $3; array[key]=array[key] + 1; }END{ for(val in array){print val" "array[val]} }' | sort -n
First, join both files on their first field with the join command (both files are already sorted on that field, which join requires). Then build an array in awk and add 1 each time the desired condition is fulfilled. Finally, you may want to sort your output to get the elements ordered by key.