Extract unique IPs from live tcpdump capture - awk

I am using the following command to output IPs from a live tcpdump capture:
sudo tcpdump -nn -q ip -l | awk '{print $3; fflush(stdout)}' >> ips.txt
I get the following output:
192.168.0.100.50771
192.168.0.100.50770
192.168.0.100.50759
I need two things:
1. Extract only the IPs, not the ports.
2. Generate a file with unique IPs, no duplicates, and sorted if possible.
Thank you in advance.

To extract unique IPs from tcpdump you can use:
awk '{ ip = gensub(/([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+).*/,"\\1","g",$3); if(!d[ip]) { print ip; d[ip]=1; fflush(stdout) } }' YOURFILE
So your command to see unique IPs live would be:
sudo tcpdump -nn -q ip -l | awk '{ ip = gensub(/([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+)(.*)/,"\\1","g",$3); if(!d[ip]) { print ip; d[ip]=1; fflush(stdout) } }'
This prints each IP as soon as it first appears, so the output cannot be sorted on the fly. If you want the list sorted, save the output to a file and then use the sort tool:
sudo tcpdump -nn -q ip -l | awk '{ ip = gensub(/([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+)(.*)/,"\\1","g",$3); if(!d[ip]) { print ip; d[ip]=1; fflush(stdout) } }' > IPFILE
sort -n -t . -k 1,1 -k 2,2 -k 3,3 -k 4,4 IPFILE
Example output:
34.216.156.21
95.46.98.113
117.18.237.29
151.101.65.69
192.168.1.101
192.168.1.102
193.239.68.8
193.239.71.100
202.96.134.133
NOTE: make sure you are using gawk; gensub() is a GNU awk extension, so this doesn't work with mawk.
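If you cannot use gawk, a rough POSIX-awk equivalent is to split $3 on the dots and keep the first four octets (a sketch; it assumes $3 holds addr.port or a bare addr, and that your awk supports the common no-argument fflush()):
sudo tcpdump -nn -q ip -l | awk '{ split($3, a, "."); ip = a[1] "." a[2] "." a[3] "." a[4]; if (!d[ip]++) { print ip; fflush() } }'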

While I'm a huge Awk fan, it's worthwhile having alternatives. Consider this example using cut:
tcpdump -n ip | cut -d ' ' -f 3 | cut -d '.' -f 1-4 | sort | uniq
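If you also want the numeric per-octet ordering from the answer above, sort -u can replace sort | uniq and apply the sort keys in one step (an untested sketch):
tcpdump -n ip | cut -d ' ' -f 3 | cut -d '.' -f 1-4 | sort -u -t . -k 1,1n -k 2,2n -k 3,3n -k 4,4n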

This is a version using match() (works on macOS):
sudo tcpdump -nn -q ip -l | \
awk '{match($3,/[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/); \
ip = substr($3,RSTART,RLENGTH); \
if (!seen[ip]++) print ip }'
In case want to pre-filter the input you could use something like:
sudo tcpdump -nn -q ip -l | \
awk '$3 !~ /^(192\.168\.|10\.|172\.1[6-9]\.|172\.2[0-9]\.|172\.3[01]\.)/ \
{match($3,/[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/); \
ip = substr($3,RSTART,RLENGTH); \
if (!seen[ip]++) print ip }'
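If you want to watch the unique IPs live and keep a copy in a file at the same time, one option is to pipe through tee (a sketch; ips.txt is just an example filename, and fflush() is added so output is not held back in the pipe buffer):
sudo tcpdump -nn -q ip -l | \
awk '{ match($3, /[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/); \
ip = substr($3, RSTART, RLENGTH); \
if (!seen[ip]++) { print ip; fflush() } }' | tee -a ips.txt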

sudo tcpdump -n ip | cut -d ' ' -f 3 | cut -d '.' -f 1-4 | awk '!x[$0]++'
This is the command that did it for me. Simple and elegant.

How to filter output of a URL

I have a URL, and when I send a request with curl, I get a big output:
curl https://www.aparat.com/video/video/embed/videohash/lXhkG/vt/frame -H "Accept: application/json" -s
I get: https://pastebin.mozilla.org/QM6FN8MZ#L
But I just want to get the URL of the 720p version, I mean just:
https:\/\/caspian1.cdn.asset.aparat.com\/aparat-video\/de54245e862b62249b6b7958c734276547445778-720p.apt?wmsAuthSign=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ0b2tlbiI6IjQ2NDJhYmQ4NGFiN2UzNDJkNGMxZWI3ZTNkMzlmZmQ5IiwiZXhwIjoxNjY5ODA5NzI1LCJpc3MiOiJTYWJhIElkZWEgR1NJRyJ9.havkkhJyXjBt_jHPVv4poEVb65_7tRsLIxO5pCO7tGE
Any idea how to do it?
I'm trying to use grep, but I don't know how to strip away everything other than the 720p URL.
curl https://www.aparat.com/video/video/embed/videohash/lXhkG/vt/frame -H "Accept: application/json" -s | grep -e "720p"
You could go the html-parsing/json-parsing route, e.g.:
curl -s https://www.aparat.com/video/video/embed/videohash/lXhkG/vt/frame |
# Normalize html
xmlstarlet fo -o -H -R 2> /dev/null |
# Extract relevant js bit
xmlstarlet sel -t -v '_:html/_:body/_:div/_:script' 2> /dev/null |
# Extract relevant json
sed -nE '/^ *var +options *= */ { s///; s/;$//p; }' |
# Extract desired url, i.e. the 720p in this case
jq -r '.multiSRC[][] | select( .label == "720p" ) | .src'
I would harness GNU AWK for this the following way:
wget --quiet -O - https://www.aparat.com/video/video/embed/videohash/lXhkG/vt/frame | awk 'match($0, /http[^"]*720[^"]*/){print substr($0,RSTART,RLENGTH)}'
gives output
https:\/\/caspian1.cdn.asset.aparat.com\/aparat-video\/de54245e862b62249b6b7958c734276547445778-720p.apt?wmsAuthSign=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ0b2tlbiI6IjY1OTcxYTRkNGZiMjkyYjk0NjM0Mjk2ODVkOTc3YjEwIiwiZXhwIjoxNjY5ODIxNDM2LCJpc3MiOiJTYWJhIElkZWEgR1NJRyJ9.NI2_6nwOxLEOxhWghsR2bOqzrXINXqqscbduHpCWwok
Explanation: I use wget with informational output such as the progress bar turned off (--quiet) and writing to standard output (-O -), which is piped into awk. Each line is matched against the regular expression http[^"]*720[^"]*, that is: http followed by zero-or-more (*) non-quote characters, followed by 720, followed by zero-or-more non-quote characters. If there is a match, I print the substring of the line containing that match; the match string function sets the RSTART and RLENGTH variables, which I then use in substr. Note: this might give a false positive if there are other URLs containing 720.
(tested in GNU Wget 1.20.3 and GNU Awk 5.0.1)
Using any awk:
$ cat file | awk 'match($0,/"https?:\\\/\\\/[^"]*-720p\.apt\?[^"]*"/) { print substr($0,RSTART+1,RLENGTH-2) }'
https:\/\/caspian1.asset.aparat.com\/aparat-video\/de54245e862b62249b6b7958c734276547445778-720p.apt?wmsAuthSign=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ0b2tlbiI6ImViODhjZDNlYzZhYzk3OTBhZDc3MWJhMzIyNWQ3NmZlIiwiZXhwIjoxNjY5ODE4Mjc5LCJpc3MiOiJTYWJhIElkZWEgR1NJRyJ9.e6do9Ha9EkDS46NZDoHT2dYHSOezu_TbdGAGblfi2tM
The contents of file are what you provided in pastebin, obviously just replace cat file with your curl command.
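Whichever variant you use, the extracted value still contains the JSON-escaped slashes (\/) from the page source. If you need a directly usable URL, a final sed pass can unescape them, e.g. appended to the wget/awk version above (a sketch):
wget --quiet -O - https://www.aparat.com/video/video/embed/videohash/lXhkG/vt/frame | awk 'match($0, /http[^"]*720[^"]*/){print substr($0,RSTART,RLENGTH)}' | sed 's|\\/|/|g'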

Running "ip | grep | awk" within a sed replacement

Problem Set (Raspberry Pi OS):
I have a file example.conf that contains a line IPv4addr=XXXXX. I am attempting to change this to the IP that is generated by the command
ipTest=$(ip --brief a show | grep eth0 | awk '{ print $3 }')
I want to automate this file change from a script, install.sh; the lines I am attempting are:
IPtest=$(ip --brief a show | grep eth0 | awk '{ print $3 }')
sudo sed -e "/IPv4addr/s/[^=]*$/$IPtest/" example.conf
Returns error:
sed: -e expression #1, char 32: unknown option to `s'
A simple value works there, such as SimpleTest='Works'.
Any thoughts? I am open to other solutions as well; however, I am not an experienced Linux user, so I am using the tools I know from other problem sets.
$IPtest contains the / character; try something like this:
IPtest=$(ip --brief a show | grep eth0 | awk '{ print $3 }')
sudo sed -e '/IPv4addr/s#[^=]*$#'"$IPtest"'#' example.conf
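Alternatively, you can keep / as the sed delimiter and escape the slashes inside the variable first with a bash parameter expansion (a sketch of the same idea):
IPtest=$(ip --brief a show | grep eth0 | awk '{ print $3 }')
IPesc=${IPtest//\//\\/}    # replace every / with \/ so sed's s/// does not break
sudo sed -e "/IPv4addr/s/[^=]*$/$IPesc/" example.conf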
You can shorten your variable assignment and let awk do the job of grep at the same time:
IPtest=$(ip --brief a s | awk '/eth0/{print $3}')
Using sed grouping and backreferencing, anchored to the IPv4addr line so other lines are left untouched:
sed -i.bak "/IPv4addr/s|\([^=]*=\).*|\1$IPtest|" example.conf

Ansible grep from shell variable

I am trying to create an Ansible playbook to pull out the MTU size for an exact NIC (unfortunately I have 5k VMs and this NIC does not have the same name on all of them). I need to parse the IP from a file into a variable and grep by that.
The command I will use in the playbook:
/sbin/ifconfig -a | grep -C 1 $IP | grep MTU | awk '{print $5}' | cut -c 5-10
And the output should look like this:
9000
This one GNU awk command should do it:
ifconfig -a | awk -v ip="$IP" -v RS= -F'MTU:' '$0~ip {split($2,a," ");print a[1]}'
9216
Other variations:
ifconfig -a | awk -v ip="$IP" 'f {split($6,a,":");print a[2];exit} $0~ip{f=1}'
ifconfig -a | awk -v ip="$IP" 'f {print substr($6,5,99);exit} $0~ip{f=1}'
9216
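On Linux hosts with iproute2 you can avoid parsing ifconfig output altogether. A sketch that looks the interface up by its IP (assuming $IP holds the address) and reads the MTU from sysfs:
dev=$(ip -o addr show to "$IP" | awk '{print $2; exit}')
cat "/sys/class/net/$dev/mtu"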

Get all keys in Redis cluster

I am using Redis cluster version redis-5.0.5. I want to see all the keys present in my Redis cluster. I know that for a standalone instance we use KEYS * to get all the keys.
What is the way to see all the keys in a Redis cluster?
$ redis-cli -h hostname -p 90001 -c
hostname:90001> KEYS *
(empty list or set)
// I have data on my cluster
Basically, you'd need to run KEYS * (not in production, please!) on every one of the nodes. The cli can do this with the '--cluster call' command, like so:
redis-cli --cluster call hostname:90001 KEYS "*"
Requirements:
redis-cli
awk
grep
xargs
You can try this, assuming the Redis server resides on localhost with the default port 6379:
redis-cli cluster nodes | awk '{print $2" "$3}' | grep master | awk -F # '{print $1}' | awk -F : '{print " -h "$1" -p "$2" --scan"}' | xargs -L 1 redis-cli -c
A longer version based on the question above (port 90001, seriously?); you can also change the pattern (* means no filter) to match a certain key pattern:
redis-cli -h hostname -p 90001 cluster nodes | awk '{print $2" "$3}' | grep master | awk -F # '{print $1}' | awk -F : '{print " -h "$1" -p "$2" --scan --pattern *"}' | xargs -L 1 redis-cli -c
It connects to any one of the Redis nodes to get the cluster info and then executes the key-scanning command on each of the master nodes.
The SCAN command may be what you're looking for, but it's O(N) so the more keys you have, the slower it's going to be. Also, check out this answer by Marc Gravell for another approach using sets: Get values by key pattern in StackExchange.Redis
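For reference, a cursor-based SCAN loop against a single node looks roughly like this (a sketch; in non-interactive mode redis-cli prints the next cursor on the first line of each reply):
cursor=0
while :; do
    reply=$(redis-cli -h hostname -p 90001 SCAN "$cursor" COUNT 1000)
    cursor=$(printf '%s\n' "$reply" | head -n 1)
    printf '%s\n' "$reply" | tail -n +2
    [ "$cursor" = "0" ] && break
done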

piping to awk hangs

I am trying to pipe tshark output to awk. The tshark command works fine on its own, and when piped to other programs such as cat it prints output in real time. However, when piped to awk, it hangs and nothing happens.
sudo tshark -i eth0 -l -f "tcp" -R 'http.request.method=="GET"' -T fields \
  -e ip.src -e ip.dst -e tcp.srcport -e tcp.dstport -e tcp.seq -e tcp.ack \
  | awk '{printf("mz -A %s -B %s -tcp \"s=%s sp=%s dp=%s\"\n", $2, $1, $5, $4, $3)}'
Here is a simpler version:
sudo tshark -i eth0 -f "tcp" -R 'http.request.method=="GET"' | awk '{print $0}'
And to compare, the following works fine (although is not very useful):
sudo tshark -i eth0 -f "tcp" -R 'http.request.method=="GET"' | cat
Thanks in advance.
I had the same problem.
I have found some partial "solutions" that are not completely portable.
Some of them suggest using the awk fflush() function or the -W interactive option:
http://mywiki.wooledge.org/BashFAQ/009
I tried both and neither works, so awk alone is not the appropriate tool here.
A few of them suggest using gawk, but that didn't do the trick for me either.
The cut command has the same problem.
My solution: in my case I just needed to add --line-buffered to grep, without touching the awk command, but in your case I would try:
sed -u
with the proper regular expression. For example:
sed -u 's_\(.*\) \(.*\) \(.*\) DIFF: \(.*\)_\3 \4_'
This expression gives you the 3rd and 4th columns separated by a TAB (typed with the Ctrl+V and TAB combination). With the -u option you get unbuffered output; there is also the -l option, which gives you line-buffered output.
I hope you find this answer useful although it is late.
Per our previous messages in comments, maybe it will work to force closing the input and emitting a linefeed.
sudo tshark -i eth0 -f "tcp" -R 'http.request.method=="GET"' ...... \
| {
awk '{print $0}'
printf "\n"
}
Note, no pipe between awk and printf.
I hope this helps.
I found the solution here https://superuser.com/questions/742238/piping-tail-f-into-awk (by John1024).
It says:
"You don't see it in real time because, for purposes of efficiency, pipes are buffered. tail -f has to fill up the buffer, typically 4 kB, before the output is passed to awk."
The proposed solution is to use the "unbuffer" or "stdbuf -o0" commands to disable buffering. It worked for me like this:
stdbuf -o0 tshark -i ens192 -f "ip" | awk '{print $0}'
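Applied to the original mz-generating pipeline, that would look something like this (a sketch, assuming GNU coreutils stdbuf is available; fflush() is added on the awk side for good measure):
sudo stdbuf -o0 tshark -i eth0 -l -f "tcp" -R 'http.request.method=="GET"' -T fields \
  -e ip.src -e ip.dst -e tcp.srcport -e tcp.dstport -e tcp.seq -e tcp.ack \
  | awk '{ printf("mz -A %s -B %s -tcp \"s=%s sp=%s dp=%s\"\n", $2, $1, $5, $4, $3); fflush() }'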