How to present multiple greps as if they were in different columns? - awk

Let's say I have one file with the name of the computer and some other information, e.g.:
Computer1
There's another file with the ip address and some other information.
192.168.100.2
I have 2 greps for example:
grep -i computer /etc/hosts
grep -i ips /etc/hosts
They give me answers like
Computer1
19.168.100.2
I would like to get a file with headers and the information organized like this:
Name
Ip
oser1313
19.168.100.1
I'm quite lost; I have no idea how I could format this. I usually copy-paste it into Excel, but I don't want to do that anymore, and since I have to do this on several computers from a server, it would be great if I could format it automatically.

Just do something like this:
awk '
    { lc = tolower($0) }
    lc ~ /computer/ { name = $0 }
    lc ~ /ips/ { ip = $0 }
    END {
        print "Name", "Ip"
        print name, ip
    }
' /etc/hosts
The above is untested, since you didn't provide a sample input file to test with. It just mimics what your grep commands do; there may be a better way to do it if we knew what your input looked like.

I suppose that your two files have the same number of lines and that line numbers match between one file and the other: if oser1313 is line n in the output of grep from /etc/hosts then same for 19.168.100.1 in /etc/hosts.
So it becomes pretty simple as a bash script:
grep -i computer /etc/hosts > part1.dat
grep -i ips /etc/hosts > part2.dat
echo "Name,IP" > out.dat
paste -d"," part1.dat part2.dat >> out.dat
rm part1.dat part2.dat
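Or a one-liner, as suggested in comments (the grep output is passed as printf arguments rather than embedded in the format string, so a stray % in the data cannot break it):
printf 'Name,IP\n%s,%s\n' "$(grep -i computer /etc/hosts)" "$(grep -i ips /etc/hosts)" > out.dat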
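If each grep can match more than one line, the two approaches above can be combined into a single awk pass that pairs the n-th name with the n-th address. This is only a sketch: the function name two_column_report is made up for illustration, and it assumes, like the paste answer, that matches appear in corresponding order.

```shell
# Pair the n-th "computer" line with the n-th "ips" line (case-insensitive)
# and print them as CSV with a header. two_column_report is a hypothetical name.
two_column_report() {
    awk '
        BEGIN { print "Name,IP" }
        { lc = tolower($0) }
        lc ~ /computer/ { name[++n] = $0 }
        lc ~ /ips/      { ip[++m] = $0 }
        END {
            max = (n > m) ? n : m
            for (i = 1; i <= max; i++) print name[i] "," ip[i]
        }
    ' "$1"
}
```

It would be called as two_column_report /etc/hosts > out.dat. If one pattern matches more lines than the other, the missing column is simply left empty.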

Related

Validate contents of .ssh/known_hosts file

Is there some cli tool I can use to validate the contents of known_hosts? Maybe try to ping all the hosts in there and see if I can connect to each?
Probably using either ssh-keygen or ssh-keyscan?
If you have list of all hosts available you can do it like this:
ssh-keyscan -t rsa,dsa -f hosts_list > ~/.ssh/known_hosts_revised
This will generate a new known_hosts_revised, which you can diff against your current known_hosts to see the differences.
If you don't need to compare it you can simply do ... > ~/.ssh/known_hosts to overwrite it (WARNING: the original known_hosts will be lost!)
The source of this information is the OpenBSD man page for ssh-keyscan(1).
Edit
The hosts_list file is expected in this format:
1.2.3.4,1.2.4.4 name.my.domain,name,n.my.domain,n,1.2.3.4,1.2.4.4
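The comparison step could look roughly like the sketch below. It assumes both files use the same unhashed host field format, since it compares whole lines; the function name compare_known_hosts is invented for this example.

```shell
# Show entries that differ between two known_hosts-style files, ignoring order.
# Lines only in the first file are prefixed "<", lines only in the second ">".
# compare_known_hosts is a hypothetical helper name.
compare_known_hosts() {
    old_sorted=$(mktemp) || return 1
    new_sorted=$(mktemp) || return 1
    sort -u "$1" > "$old_sorted"
    sort -u "$2" > "$new_sorted"
    diff "$old_sorted" "$new_sorted" | grep '^[<>]'
    rm -f "$old_sorted" "$new_sorted"
}
```

For example: compare_known_hosts ~/.ssh/known_hosts ~/.ssh/known_hosts_revised. No output means the two files contain the same set of lines.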
At least for my setup, using ssh-keyscan is impossible due to my extensive ~/.ssh/config file. I use lots of proxy commands, jump hosts, and alternate Hostname declarations.
For example:
# Connect to Tor nodes
Host *.onion
ProxyCommand socat - SOCKS4A:localhost:%h:%p,socksport=9050
# Work jump box
Host bastion
Hostname bastion.work.com
# Office system, e.g. bob.office -> bastion -> bob.work.com
Host *.office
ProxyCommand ssh bastion nc -w600s $(echo "%h" |sed 's/\.office$/.work.com/') %p
# Home system, e.g. adam -> home.com -> adam-laptop.local
Host adam
Hostname adam-laptop.local
ProxyJump home.com
None of the above will work.
Here's a script that should work for the rest though:
#!/usr/bin/awk -f
!/^#/ && NF > 2 {
    split($1, hosts, ",")
    key_type = $2
    gsub(/^ssh-/, "", key_type)
    gsub(/-.*/, "", key_type)
    for (h in hosts) {
        p = index(hosts[h], "]:")  # [host]:port (supports raw IPv6 hosts)
        skip = 2                   # length of the "]:" separator
        if (!p && hosts[h] ~ /^[^:]+:[0-9]+$/) {  # host:port
            p = index(hosts[h], ":")
            skip = 1               # length of the ":" separator
        }
        if (p > 0) {
            port = substr(hosts[h], p + skip)
            # strip the leading "[" and the trailing "]:port" or ":port"
            gsub("^\\[|\\]?:" port "$", "", hosts[h])
        } else {
            port = 22
        }
        if (seen[key_type,port,hosts[h]]++) continue  # prevent duplicate lookups
        if (port_list[key_type,port]) { comma = "," } else { comma = "" }
        port_list[key_type,port] = port_list[key_type,port] comma hosts[h]
    }
}
END {
    for (tp in port_list) {
        split(tp, a, SUBSEP)
        system("echo ssh-keyscan -t " a[1] " -p " a[2] " " port_list[tp])
    }
}
Remove the echo parts to run once you're convinced this will do what you desire.
This parses non-comment lines with 3+ fields (since the format is host_list key_type key_hash). It splits the host list, since it can be comma-delimited. Further parsing is needed because an entry can contain a port, and ssh-keyscan cannot accept hosts in the format used by known_hosts.
There are two ways a port can be specified:
The old style, which does not work with bare IPv6 addresses, is host:port
The new style, which is required for bare IPv6 addresses, is [host]:port
p is set to the position of ]: if present (the new style). If that string isn't present, we check for the old style and reset p.
If p is positive, we have a port specification. Extract the port and remove it (and the square brackets) from the host name. Otherwise, the port is 22.
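The same host/port split can be sketched in plain shell with parameter expansion, which may be easier to experiment with than the awk version. This is an illustration only: split_host_port is a made-up name, and unlike the awk script it does not check that an old-style port is numeric.

```shell
# Split one known_hosts host entry into "host port" (port defaults to 22).
# split_host_port is a hypothetical helper, not part of OpenSSH.
split_host_port() {
    entry=$1
    case $entry in
        \[*\]:*)                # new style: [host]:port
            host=${entry#\[}
            host=${host%%\]:*}
            port=${entry##*\]:}
            ;;
        *:*:*)                  # bare IPv6 address without a port
            host=$entry
            port=22
            ;;
        *:*)                    # old style: host:port
            host=${entry%:*}
            port=${entry#*:}
            ;;
        *)                      # plain host name or IPv4 address
            host=$entry
            port=22
            ;;
    esac
    printf '%s %s\n' "$host" "$port"
}
```

For instance, split_host_port '[example.com]:2222' prints "example.com 2222", while split_host_port example.com prints "example.com 22".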
Just in case there are duplicate entries, we check for them and continue if we've already seen the type,port,host combination (x++ is false (0) only when first run). Finally, we push the host to a comma-delimited list string in the port_list array as keyed by a tuple of type and port.
After reading in the entire known_hosts file, we iterate on the type,port tuple pairs that key the port_list array, split them into an array named a, and run ssh-keyscan on them.
Run this like awk -f 'this_script.awk' ~/.ssh/known_hosts and if you like the ssh-keyscan commands that it spits out, remove the echo from the system command and re-run.
Do not pipe this output into ~/.ssh/known_hosts! You will want to manually review the results (and probably filter out the comments). Also, you can't redirect output onto one of the files used in the input.

Use scp to send a file to an IP from a file

I need to send a file ( my_file.txt ) to an FTP Printer whose IP address is contained in another file ( printer_ip.txt ). This file contains only one IP address.
$ cat printer_ip.txt
10.111.22.333
What is the simple command I can use? Something like this?
$ scp my_file.txt | cat printer_ip.txt
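I think what you want is the following (note the trailing colon; without it, scp would copy to a local file named after the IP instead of to the remote host):
$ scp my_file.txt "$(cat printer_ip.txt):"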
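A slightly more defensive variant is sketched below. It assumes the address is the first word on the first non-empty line of the file; the function name scp_target_from_file is invented for this example.

```shell
# Read the first non-empty line of a file and print it as an scp destination
# ("address:"). scp_target_from_file is a hypothetical helper name.
scp_target_from_file() {
    ip=$(awk 'NF { print $1; exit }' "$1")
    if [ -z "$ip" ]; then
        echo "no address found in $1" >&2
        return 1
    fi
    printf '%s:\n' "$ip"
}

# Usage (not run here, since it needs a reachable host):
#   scp my_file.txt "$(scp_target_from_file printer_ip.txt)"
```

This fails with an error message instead of silently running scp with an empty destination when the file is empty.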

Grep logs for IPs in the format [client 123.456.78.90]

I'm a grep and sed newbie, and have read through a bunch of answers on SO referring to grepping IPs in apache logs with no luck for my particular situation.
I have megs of error logs from bots and nefarious humans hitting a site, and I need to search through the logs and find the most common IPs so I can confirm they're bad and block them in .htaccess.
But, my error logs don't have the IP as the first item on the line as it seems most Apache logs do, according to the other answers here on SO. In my logs, the IP is within each line and in the format [client 123.456.78.90].
I think this older answer, "Grepping logs for IP adresses", is exactly what I need, as it "will print each IP... sorted prefixed with the count."
But according to the answerer, "It assumes the IP-address is the first thing on each line."
How can I modify the sed command from that answer for the IP format [client 123.456.78.90] rather than the IP on the first line of each log entry?
sed -e 's/\([0-9]\+\.[0-9]\+\.[0-9]\+\.[0-9]\+\).*$/\1/' -e t -e d access.log | sort | uniq -c
8/25/14 This works re: Kent's answer below:
grep -o '[0-9]\+\.[0-9]\+\.[0-9]\+\.[0-9]\+' logfile|sort|uniq -c
Update 9/02/14
To sort by number of occurrences of each IP:
grep -o '[0-9]\+\.[0-9]\+\.[0-9]\+\.[0-9]\+' logfile|sort -n | uniq -c | sort -rn
grep is for Globally finding Regular Expressions on individual lines and Printing them (G/RE/P get it?).
sed is for Stream EDiting (SED get it?), i.e. making simple substitutions on individual lines.
For any other general text manipulation (including anything that spans multiple lines) you should use awk (named after 3 guys who ran out of imagination for naming tools).
awk '
match($0,/[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/) { cnt[substr($0,RSTART,RLENGTH)]++ }
END { for (ip in cnt) print cnt[ip], ip }
' logfile
Quick and dirty:
grep -o '[0-9]\+\.[0-9]\+\.[0-9]\+\.[0-9]\+' logfile|sort|uniq -c
A big difference between sed and grep is that sed can change the input text (e.g. by substitution), but grep can't. :-)
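The greps above match any dotted quad anywhere on the line. If only the address inside [client ...] should be counted, a sed-based sketch along the following lines may work. It assumes IPv4 addresses, and count_client_ips is a made-up function name.

```shell
# Count occurrences of each address found as "[client a.b.c.d...",
# most frequent first. Output format: "count ip".
# count_client_ips is a hypothetical helper name.
count_client_ips() {
    sed -n 's/.*\[client \([0-9]\{1,3\}\(\.[0-9]\{1,3\}\)\{3\}\).*/\1/p' "$1" |
        sort | uniq -c | sort -rn | awk '{ print $1, $2 }'
}
```

Lines without a [client ...] field are skipped, and a trailing port such as [client 10.0.0.1:51234] is ignored because only the dotted quad is captured.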

Apache server log highest traffic using bash

I have an Apache server log and am trying to determine what IP address has generated the most traffic. I've already managed to get it formatted so its just the IPs and their traffic in bytes:
xxx.xxx.xxx.xxx 915925
yyy.yyy.yyy.yyy 1193
zzz.zzz.zzz.zzz 2356
So now I'm looking for a method to combine and add the bytes of identical IPs and then just find the top value.
Any ideas?
If you have the ip and traffic bytes in a file use the following to get the work done.
cat file | perl -ane '$h{ $F[0] } += $F[1]; END { for ( sort keys %h ) { printf qq[%s %d\n], $_, $h{ $_ } } }' | sort -k2 -n -r
awk '{A[$1]+=$2;next}END{for(i in A){print i,A[i]}}' file | sort -k2 -n -r
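Both commands above print every IP sorted by total. To get only the single heaviest IP, the maximum can be tracked inside awk itself, as in this sketch. The name top_talker is invented here, and ties are resolved arbitrarily.

```shell
# Sum the byte counts per IP ("ip bytes" per line) and print only the IP
# with the largest total. top_talker is a hypothetical helper name.
top_talker() {
    awk '
        { bytes[$1] += $2 }
        END {
            for (ip in bytes)
                if (best == "" || bytes[ip] > max) { best = ip; max = bytes[ip] }
            if (best != "") print best, max
        }
    ' "$1"
}
```

Appending | head -n 1 to either of the sorted pipelines above would give the same kind of result.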

How can I write a program (script) to remove obsolete host keys from ~/.ssh/known_hosts?

I use a cluster of about 30 machines that have all recently been reconfigured with new OpenSSH host keys. When I try to log into one, I get this error message (many lines removed for brevity):
# WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! #
The fingerprint for the RSA key sent by the remote host is
52:bb:71:83:7e:d0:e2:66:92:0e:10:78:cf:a6:41:49.
Add correct host key in /home/nr/.ssh/known_hosts to get rid of this message.
Offending key in /home/nr/.ssh/known_hosts:50
I can go remove the offending line manually, in which case I get a different complaint about the IP address, which requires removing another line manually, and I have no desire to repeat this exercise 29 times. I would like to write a program to do it. Unfortunately, the line in the .ssh file no longer contains the host name and IP address in clear text, as it did in earlier versions.
So here's my question:
Given a host name and an IP address, how can I write a program to find out which lines of my ~/.ssh/known_hosts store an SSH host key for that host or IP address?
If I can recover this info, I think I can do the rest myself.
Footnote: I would prefer to code in bash/ksh/sh or C or Lua; my Perl and Python are very rusty.
Clarifications:
I don't want to remove the whole file and repopulate it; it contains over a hundred validated keys that I prefer not to re-validate.
Whether I maintain a single master copy or multiple replicas, the problem of scrubbing away a large group of obsolete host keys remains.
Answer
Here's the Lua script I wrote using ssh-keygen -F:
#!/usr/bin/env lua
require 'osutil'
require 'ioutil'
local known = os.getenv 'HOME' .. '/.ssh/known_hosts'
local function lines(name)
  local lines = { }
  for l in io.lines(name) do
    table.insert(lines, l)
  end
  return lines
end
local function remove_line(host)
  local f = io.popen('ssh-keygen -F ' .. os.quote(host))
  for l in f:lines() do
    local line = l:match '^# Host %S+ found: line (%d+) type %u+$'
    if line then
      local thelines = lines(known)
      table.remove(thelines, assert(tonumber(line)))
      table.insert(thelines, '')
      io.set_contents(known, table.concat(thelines, '\n'))
      return
    end
  end
  io.stderr:write('Host ', host, ' not found in ', known, '\n')
end
for _, host in ipairs(arg) do
  local ip = os.capture('ipaddress ' .. host)
  remove_line(host)
  remove_line(ip)
end
ssh-keygen -R hostname
ssh-keygen -R ipaddress
Personally, I scrub the IP addresses with a loop in Perl and remove the conflicts by hand.
#!/usr/bin/perl
for (1..30){
    `ssh-keygen -R 192.168.0.$_`; # note: backticks aren't apostrophes
}
If I want to find out on what line the entry for a host lives,
ssh-keygen -F hostname
The same trick works with IP addresses.
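For a whole cluster, the two ssh-keygen -R calls can be generated from a list of hosts. The sketch below only prints the commands so they can be reviewed first. It assumes a hypothetical input file with one "hostname ip" pair per line (the IP is optional), and stale_key_commands is an invented name.

```shell
# Print (do not run) the ssh-keygen commands that would remove the stale
# entries for every "hostname ip" pair listed in the given file.
# stale_key_commands and the input file layout are assumptions.
stale_key_commands() {
    while read -r host ip; do
        [ -n "$host" ] || continue
        printf 'ssh-keygen -R %s\n' "$host"
        if [ -n "$ip" ]; then
            printf 'ssh-keygen -R %s\n' "$ip"
        fi
    done < "$1"
}
```

Once the printed commands look right, they can be executed with stale_key_commands cluster.txt | sh, where cluster.txt is a placeholder for your own host list.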
Create and edit "clearkey.sh", or whatever name makes you happy.
#! /bin/bash
# $1 is the first argument supplied after calling the script
sed -i "$1d" ~/.ssh/known_hosts
echo "Deleted line $1 from known_hosts file"
Should be able to do "clearkey.sh 3" and it will delete the offending line!
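Since sed -i takes different arguments on GNU and BSD systems, and a non-numeric argument would produce a confusing sed error, a more cautious version might look like this sketch. The function name delete_line is made up, and it takes the file as a second argument rather than hard-coding known_hosts.

```shell
# Delete line number $1 from file $2 after checking that $1 is a positive
# integer. Avoids "sed -i" so it behaves the same with GNU and BSD sed.
# delete_line is a hypothetical helper name.
delete_line() {
    case $1 in
        ''|0|*[!0-9]*)
            echo "usage: delete_line LINE_NUMBER FILE" >&2
            return 2
            ;;
    esac
    tmp=$(mktemp) || return 1
    # Write the result back with cat so the file keeps its permissions.
    sed "${1}d" "$2" > "$tmp" && cat "$tmp" > "$2"
    rm -f "$tmp"
}
```

It would be used as delete_line 50 ~/.ssh/known_hosts, where 50 is the line number reported in the "Offending key" message.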
I usually do the following in bash script checkssh to automatically remove the line:
#!/bin/bash
# Path to "known_hosts" file
KH=~/.ssh/known_hosts
# Find the host in the file, showing line number
L=$(grep -i -n -- "$1" "$KH")
# If line is not found, exit
[[ $? -ne 0 ]] && exit
# Isolate line number (of the first match, if there are several)
L=$(echo "$L" | head -n 1 | cut -f 1 -d :)
sed -i "${L}d" "$KH"
You can add ssh $1 exit at the end to automatically re-create an entry in the file, if your ssh is configured to do so.
Call it like checkssh <hostname>.
You might like to try the following when script writing:
declare CHANGED_HOST_NAME="host.yourpublic.work";
declare CHANGED_HOST_IP=$(dig +short "${CHANGED_HOST_NAME}");
# Remove old IP address if found
[ -z "${CHANGED_HOST_IP}" ] || ssh-keygen -R "${CHANGED_HOST_IP}";
# Remove old host key
ssh-keygen -R "${CHANGED_HOST_NAME}";
# Add new host key
ssh-keyscan "${CHANGED_HOST_NAME}" >> "$HOME/.ssh/known_hosts";
Big thanks to #Storm Knight (#289844)