get rid of the end of lines matching pattern linux - awk

I am trying to get a list of IP addresses from configuration, and I receive them in the format *.*.*.*:* where the last field is the port number of the established connection.
How can I get rid of the port number?
Here is the command I run now:
ss -ta | tail -n +2 |awk '{print $4}' | sort -u
I understand I need to use sed as a pipe stage between awk and sort to remove the part after the colon, but I am not sure how to do it the right way.
the line ss -ta
returns the following:
State Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0 128 *:ssh *:*
LISTEN 0 100 127.0.0.1:smtp *:*
CLOSE-WAIT 32 0 192.168.1.7:48474 104.18.35.72:https
CLOSE-WAIT 32 0 192.168.1.7:52879 104.18.34.72:https
CLOSE-WAIT 1 0 192.168.1.7:38492 82.80.211.109:http
LISTEN 0 128 :::ssh :::*
LISTEN 0 100 ::1:smtp :::*
ESTAB 0 52 fe80::a00:27ff:fead:6df2%enp0s3:ssh fe80::e1
this is the output to my command:
> 127.0.0.1:smtp
> 192.168.1.7:38492
> 192.168.1.7:48474
> 192.168.1.7:52879
> ::1:smtp
> fe80::a00:27ff:fead:6df2%enp0s3:ssh
> :::ssh
> *:ssh
the desired output is:
> 127.0.0.1
> 192.168.1.7
thanks

Without testable sample input and expected output it's a bit of a guess but it sounds like all you need is
ss -ta | awk '{$0=$4;sub(/:[^:]+$/,"")} NR>1 && !seen[$0]++'
e.g. using cat file instead of ss -ta to pipe your expected input to the command:
$ cat file | awk '{$0=$4;sub(/:[^:]+$/,"")} NR>1 && !seen[$0]++'
*
127.0.0.1
192.168.1.7
::
::1
fe80::a00:27ff:fead:6df2%enp0s3
but if we look at your posted expected output then maybe what you really want is more like:
$ cat file | awk '{$0=$4;sub(/:[^:]+$/,"")} NR>1 && /[0-9]+(\.[0-9]+){3}/ && !seen[$0]++'
127.0.0.1
192.168.1.7

You can do the port removal with gnu awk, use awk '{print gensub(/:.*/,"","g",$4)}' in your original pipe.

just use the regex to delete everything after :, you can use:
ss -ta | tail -n +2 |awk '{print $4}' | sort -u | sed 's/:.*$//g' | uniq
or you can even use awk with : as the field separator:
ss -ta | tail -n +2 |awk '{print $4}' | sort -u | awk -F : '{print $1}' | uniq
or cut with : as the delimiter:
ss -ta | tail -n +2 |awk '{print $4}' | sort -u | cut -d : -f 1 | uniq
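Note that deleting everything after the first colon also truncates IPv6 addresses such as ::1. A minimal sketch, assuming a few Local Address:Port values from the question are held in a hypothetical shell variable named sample, that strips only the final :port and keeps the dotted-quad addresses:

```shell
# Hypothetical sample of Local Address:Port values taken from the question
sample='127.0.0.1:smtp
192.168.1.7:48474
192.168.1.7:52879
::1:smtp
*:ssh'

# Remove only the last ":port", keep IPv4-looking addresses, de-duplicate
result=$(printf '%s\n' "$sample" |
    sed 's/:[^:]*$//' |
    grep -E '^[0-9]+(\.[0-9]+){3}$' |
    sort -u)
printf '%s\n' "$result"
```

Dropping the grep stage would keep the IPv6 and wildcard entries as well, with only their port removed.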

Related

AWK Assignment and execute operation with variables

I would like to find out how to assign a value to a variable and then run a command that uses it.
Suppose that I get these files as a result of ls -A *.pdf | grep -v '^d':
firstOne.pdf
ordenSiq.pdf
Now I'm trying to run a command after the assignment, for example:
% ls -lAh ordenSiq.pdf
-rw-r--r--@ 1 joseluisbz staff 47K Jun 29 15:35 ordenSiq.pdf
Here is my attempt (but it is not working!)
awk -v thelast="$(ls -A *.pdf | grep -v '^d' | tail -n 1 | awk '{print}')" 'BEGIN {ls -lAh thelast;}'
EDIT
Obtaining The Last File With Extension!
% thelast=$(ls -A *.pdf | grep -v '^d' | tail -n 1 | awk '{print}'); awk -v result=$thelast 'BEGIN{print result}' Output: ordenSiq.pdf
Extracting Only The Name (split)
% thelast=$(ls -A *.pdf | grep -v '^d' | tail -n 1 | awk '{print}'); thename=$(echo ${thelast} | awk '{split($0,a,"."); print a[1]}'); awk -v result="$thename" 'BEGIN{print result}' Output:ordenSiq
ALL Extensions For name (concatenation)
% thelast=$(ls -A *.pdf | grep -v '^d' | tail -n 1 | awk '{print}'); thename=$(echo ${thelast} | awk '{split($0,a,"."); print a[1]}'); allexts=$(echo ${thename}'.*'); awk -v result="$allnames" 'BEGIN{print result}' Output:ordenSiq.*
or
thelast=$(ls -A *.pdf | grep -v '^d' | tail -n 1 | awk '{print}'); thename=$(echo ${thelast} | awk '{split($0,a,"."); print a[1]}'); awk -v allexts=${thename}".*" 'BEGIN{print allexts}'
Execution of command with variable
I would like to obtain something like:
% ls -lAh ordenSiq.*
-rw-r--r-- 1 joseluisbz staff 0B Jul 15 12:34 ordenSiq.abc
-rw-r--r-- 1 joseluisbz staff 0B Jul 15 12:34 ordenSiq.def
-rw-r--r--@ 1 joseluisbz staff 47K Jun 29 15:35 ordenSiq.pdf
%
ERROR:
% thelast=$(ls -A *.pdf | grep -v '^d' | tail -n 1 | awk '{print}'); thename=$(echo ${thelast} | awk '{split($0,a,"."); print a[1]}'); awk -v allexts=${thename}".*" 'BEGIN{system(ls -lAh allexts)}' Output:
sh: 0ordenSiq.*: command not found
And with
thelast=$(ls -A *.pdf | grep -v '^d' | tail -n 1 | awk '{print}'); thename=$(echo ${thelast} | awk '{split($0,a,"."); print a[1]}'); awk -v allexts=${thename}".*" 'BEGIN{system(ls -lAh $allexts)}' Output:
awk: illegal field $(ordenSiq.*), name "allexts"
source line number 1
Some Working example:
% root1="/webroot"; echo | awk -v r=$root1 '{ print "shell variable $root1 value is " r}' Output:
shell variable $root1 value is /webroot
Statically Works!
% ls -lAh ordenSiq.*
-rw-r--r-- 1 joseluisbz staff 0B Jul 15 12:34 ordenSiq.abc
-rw-r--r-- 1 joseluisbz staff 0B Jul 15 12:34 ordenSiq.def
-rw-r--r--@ 1 joseluisbz staff 47K Jun 29 15:35 ordenSiq.pdf
%
And the variable's value is correct!
% thelast=$(ls -A *.pdf | grep -v '^d' | tail -n 1 | awk '{print}'); thename=$(echo ${thelast} | awk '{split($0,a,"."); print a[1]}'); allexts=$(echo ${thename}'.*'); echo ${allexts} Output:
ordenSiq.*
But this does not work:
% thelast=$(ls -A *.pdf | grep -v '^d' | tail -n 1 | awk '{print}'); thename=$(echo ${thelast} | awk '{split($0,a,"."); print a[1]}'); allexts=$(echo ${thename}'.*'); ls -lAh $allexts Output:
ls: ordenSiq.*: No such file or directory
QUESTION:
What is wrong in my steps in order to perform the final operation with variables (with and without AWK)?
{ls -lAh thelast;}
This is not the correct way to run a shell command in GNU AWK. You should build a string containing the command and then pass it to the system function. For example, if you want to sleep for 5 seconds at the beginning, you might do
awk 'BEGIN{system("sleep 5")}'
Keep in mind that the system function returns the exit status code (see linked documentation for further discussion), not the output of the command.
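To illustrate the difference between the exit status and the output, here is a small sketch. The commands true and echo hello are placeholders chosen only for the demonstration; reading from a command with getline is one common way to capture its output inside awk.

```shell
# system() runs a command and returns its exit status;
# "cmd | getline var" reads one line of the command's output instead.
result=$(awk 'BEGIN {
    status = system("true")    # exit status of the command, 0 on success
    cmd = "echo hello"         # placeholder command whose output we want
    cmd | getline line         # first line printed by cmd
    close(cmd)
    print status ":" line
}')
printf '%s\n' "$result"
```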
I think this is what you're trying to do:
thelast=$(find . -type f -name '*.pdf' -printf '%T@\t%p\n' | sort -n | cut -f2- | tail -n 1)
thename="${thelast%.pdf}"
allexts=( "$thename".* )
ls -lh "${allexts[@]}"
As for what's wrong with your original code:
% thelast=$(ls -A *.pdf | grep -v '^d' | tail -n 1 | awk '{print}'); thename=$(echo ${thelast} | awk '{split($0,a,"."); print a[1]}'); allexts=$(echo ${thename}'.*'); ls -lAh $allexts
It's trying to parse the output of ls, see https://mywiki.wooledge.org/ParsingLs
It's removing the names of files that start with d for unknown reasons.
It's not listing the files in time order
It's using awk '{print}' which does nothing but copy the input to the output.
It's got unquoted variables (copy/paste your code into http://shellcheck.net and it'll tell you about the basic issues)
echo ${thename}'.*' is leaving the part that needs to be double-quoted unquoted and then single-quoting the part that needs to be unquoted.
Using split() in awk as you are would corrupt file names that contain multiple .s.
There may be other issues that aren't as obvious, idk.
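The point about split() and multiple dots can be seen with a short sketch. The file name report.v2.pdf is a made-up example, not one from the question:

```shell
name='report.v2.pdf'   # hypothetical file name containing more than one dot

# split() on "." keeps only the text before the first dot
via_awk=$(printf '%s\n' "$name" | awk '{split($0, a, "."); print a[1]}')

# ${name%.pdf} removes only the trailing ".pdf"
via_shell=${name%.pdf}

printf '%s\n%s\n' "$via_awk" "$via_shell"
```

The awk version yields report, while the parameter expansion yields report.v2, which is the base name the later glob actually needs.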

Bash,Postfix, AWK, Error in filtering deferred mail output

This is what I have tried so far:
cat /var/spool/postfix/deferred/D3B921090 | awk -F"/" '{print $6}' |awk '{$1="" print $0}' | sort | uniq -c | sort -n
and
awk -F"/" '{print $6}' < /var/spool/postfix/deferred/D3B921090 | awk '{$1="" print $0}' | sort | uniq -c | sort -n
I get the following error message when trying to run either command:
awk: line 1: syntax error at or near print
What am I doing wrong?
awk '{$1="" print $0}'
is not a syntactically valid expression, did you mean
awk '{$1=""; print $0}'
which is equal to
awk '{$1=""}1'
?
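A quick way to check that the two forms behave the same, using a made-up input line:

```shell
line='one two three'   # hypothetical input line

# Explicit statement separator followed by print
a=$(printf '%s\n' "$line" | awk '{$1=""; print $0}')

# Same effect: the trailing 1 is an always-true pattern whose default action is print
b=$(printf '%s\n' "$line" | awk '{$1=""}1')

printf '[%s]\n[%s]\n' "$a" "$b"
```

Both leave a leading space, because the emptied first field is still joined to the rest with the output field separator.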

GNU parallel used with xargs and awk

I have two large tab separated files A.tsv and B.tsv, they look like (the header is not in the file):
A.tsv:
ID AGE
User1 18
...
B.tsv:
ID INCOME
User4 49000
...
I want to select the list of IDs in A such that 10 <= AGE <= 20 and select the rows in B that match the list. I also want to use the GNU parallel tool. My attempt has two steps:
cat A.tsv | parallel --pipe -q awk '{ if ($3 >= 10 && $3 <= 20) print $1}' > list.tsv
cat list.tsv | parallel --pipe -q xargs -I% awk 'FNR==NR{a[$1];next}($1 in a)' % B.tsv > result.tsv
The first step works, but the second one fails with an error like:
awk: cannot open User1 (No such file or directory)
How can I fix this? Does this method work even if A.tsv and list.tsv are 2 to 3 times bigger than the memory?
$ for I in $(seq 8 2 22); do echo -e "User$I\t$I" >> A.txt; done; cat A.txt
User8 8
User10 10
User12 12
User14 14
User16 16
User18 18
User20 20
User22 22
$ for I in $(seq 8 2 22); do echo -e "User$I\t100${I}00" >> B.txt; done; cat B.txt
User8 100800
User10 1001000
User12 1001200
User14 1001400
User16 1001600
User18 1001800
User20 1002000
User22 1002200
$ cat A.txt | parallel --pipe -q awk '{if ($2 >= 10 && $2 <= 20) print $1}' > list.txt
$ cat B.txt | parallel --pipe -q grep -f list.txt
User10 1001000
User12 1001200
User14 1001400
User16 1001600
User18 1001800
User20 1002000
I know this: (yes, I saw it)
GNU parallel used with xargs and awk
Asked 8 years, 3 months ago
Modified 8 years, 3 months ago
Viewed 2k times
My solution:
Only xargs and awk, in a single line without an intermediate file, and you don't need to install a new tool:
awk '{if ($2 >= 10 && $2 <= 20) print $1}' A.tsv | xargs -I myItem awk --assign quebuscar=myItem '$1==quebuscar {print}' B.tsv
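For comparison, the error in the question comes from xargs -I% substituting each ID where awk expects a file name, so awk tries to open User1 as a file. A sketch of the same join done by one awk process reading both files, assuming tiny made-up stand-ins for A.tsv and B.tsv in a temporary directory:

```shell
# Hypothetical miniature A.tsv and B.tsv (tab-separated, no header)
tmpdir=$(mktemp -d)
printf 'User8\t8\nUser10\t10\nUser18\t18\nUser22\t22\n' > "$tmpdir/A.tsv"
printf 'User8\t100800\nUser10\t1001000\nUser18\t1001800\nUser22\t1002200\n' > "$tmpdir/B.tsv"

# First file: remember IDs whose age is in range. Second file: print matching rows.
result=$(awk 'FNR == NR { if ($2 >= 10 && $2 <= 20) ids[$1]; next }
              $1 in ids' "$tmpdir/A.tsv" "$tmpdir/B.tsv")
printf '%s\n' "$result"

rm -rf "$tmpdir"
```

This reads each file once and keeps only the selected IDs in memory, rather than starting one awk per ID. Whether that fits in memory for the real files depends on how many IDs pass the age filter.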

Reading a file from line 4 to the end

I want to read a file from line 4 to the very end. Is there any way to do this with awk or something?
This sed command will do:
sed -n '4,$p' file.txt
Or using awk:
awk 'NR>=4' file.txt
Or using tail:
tail -n +4 file.txt
awk 'NR >= 4 {print $0}'
For example
$> seq 101 110 | awk 'NR >= 4 {print $0}'
104
105
106
107
108
109
110
tail -n +4 filename will serve your purpose.
more on tail
Here's a method (it can depend on the type of shell you use; bash should work):
tmpvar=`cat a_file | wc -l `; tail -n $((tmpvar-3)) a_file
Here's another method that should work in more shells:
cat -n a_file | awk '$1 >= 4' | cut -f2-
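As a sanity check, the sed, awk and tail approaches from this thread can be compared on generated input. This sketch assumes seq is available and uses the portable tail -n +4 spelling:

```shell
input=$(seq 101 110)   # ten lines: 101 through 110

via_sed=$(printf '%s\n' "$input" | sed -n '4,$p')
via_awk=$(printf '%s\n' "$input" | awk 'NR >= 4')
via_tail=$(printf '%s\n' "$input" | tail -n +4)

# Each variable now holds lines 4 through 10 of the input
printf '%s\n' "$via_tail"
```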

grep and awk parse line

I have a line that looks like:
Feb 21 1:05:14 host kernel: [112.33000] SRC=192.168.0.1 DST=90.90.90.90 PREC=0x40 TTL=51 ....
I would like to get a list of unique IPs from SRC=
How can I do this? Thanks
This will work, although you could probably simplify it further in a single awk script if you wanted:
awk '{print $7}' <your file> | awk -F= '{print $2}' | sort -u
grep -o 'SRC=\([^ ]\+\)' | cut -d= -f2 | sort -u
cat thefile | grep SRC= | sed -r 's/^.*SRC=([^ ]+).*$/\1/' | sort | uniq
This awk script will do:
{a[$7]=1}
END{for (i in a) print i}
This will print the IP addresses in order without the "SRC=" string (asort requires GNU awk):
awk '{a[$7] = $7} END {n = asort(a); for (i = 1; i <= n; i++) {split(a[i], b, "="); print b[2]}}' inputfile
Example output:
192.168.0.1
192.168.0.2
192.168.1.1
grep -Po "SRC=(.[^\s]*)" file | sed 's/SRC=//' | sort -u
Ruby(1.9+)
ruby -ne 'puts $_.scan(/SRC=(.[^\s]*)/)[0] if /SRC=/' file| sort -u
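Several of the answers above rely on SRC= being the seventh field. A sketch that does not depend on the field position, using awk's match() with RSTART and RLENGTH; the log lines below are made up in the same shape as the one in the question:

```shell
# Hypothetical log lines modelled on the sample in the question
log='Feb 21 1:05:14 host kernel: [112.33000] SRC=192.168.0.2 DST=90.90.90.90 PREC=0x40 TTL=51
Feb 21 1:05:15 host kernel: [113.10000] SRC=192.168.0.1 DST=90.90.90.90 PREC=0x40 TTL=51
Feb 21 1:05:16 host kernel: [114.20000] SRC=192.168.0.2 DST=90.90.90.91 PREC=0x40 TTL=51'

# Find "SRC=<non-space characters>" anywhere in the line, drop the 4-character
# "SRC=" prefix, and print each address only the first time it is seen
result=$(printf '%s\n' "$log" | awk 'match($0, /SRC=[^ ]+/) {
    ip = substr($0, RSTART + 4, RLENGTH - 4)
    if (!seen[ip]++) print ip
}' | sort)
printf '%s\n' "$result"
```

The final sort is a plain lexical sort, which is enough for this sample but does not order addresses numerically by octet.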