Extract multiple data between multiple, same tags - awk

Further to the solution in Extract data between two tags, which extracts a single set of CIDRs, we have now come across something more complex.
The result returned is as follows:
<tr><td class='bold vmiddle'> Owner CIDR: </td><td><span class='jtruncate-text'>80.245.225.0/24, 80.245.226.0/23, 80.245.228.0/22, 80.245.232.0/22, 80.245.236.0/23, 80.245.238.0/24</span></td></tr>
It contains 6 CIDRs:
80.245.225.0/24, 80.245.226.0/23, 80.245.228.0/22, 80.245.232.0/22, 80.245.236.0/23, 80.245.238.0/24
In fact, for other queries we don't know how many CIDRs will be returned.
What solution in bash should we use? Expand the sed string in the linked question? Or something completely different?
Can anyone help?

Maybe this could help you:
grep -oE '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}/[0-9]{1,2}' yourFile
E.g.:
user@host:/tmp$ grep -oE '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}/[0-9]{1,2}' test
80.245.225.0/24
80.245.226.0/23
80.245.228.0/22
80.245.232.0/22
80.245.236.0/23
80.245.238.0/24
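An awk alternative is also possible. This is only a sketch, assuming the CIDR list always sits inside a single jtruncate-text span and the values are separated by ", " as in the sample above:
awk -F'<span[^>]*>|</span>' '{ n = split($2, cidrs, ", "); for (i = 1; i <= n; i++) print cidrs[i] }' yourFile
Here the field separator is a regex matching the opening and closing span tags, so $2 is the comma-separated CIDR list, which split then breaks into one CIDR per output line regardless of how many are returned.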

Related

Capture and parse output of Whateverable bots

Since that is the standard way to present output in the Perl 6 documentation, I'm using the Whateverable bots to evaluate expressions via the #perl6 IRC channel or the #whateverable channel. The produced output looks something like this:
10:28:19 jmerelo | p6: say 333444777 ~~ /(3+)/ │
10:28:19 evalable6 | jmerelo, rakudo-moar 5ce24929f: OUTPUT: «「333」␤ 0 => 「333」␤»
(in the WeeChat console program). From that output, I cut and paste into the document, erasing the parts I'm not interested in.
I was wondering if there is some easy way to parse and save that output directly, either server-based (some Whateverable bots save to gists, for instance) or client-based by scripting the irssi or weechat platform.
I think the most convenient solution in this case would be to bypass the IRC bots and define a bash function. Something like this:
d6() { echo -n '# OUTPUT: «'; perl6 -e "$1" | sed -z 's/\n/␤/g'; echo '»'; }
Then you can use it like this:
d6 'say 42'
Which will produce this output:
# OUTPUT: «42␤»
Of course, you'd need a different solution for other operating systems.
As a bonus, you can also put it into the clipboard automatically:
d6 'say 42' | tee >(xclip -selection clipboard)
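On macOS, for example, pbcopy could stand in for xclip (note that the function itself also relies on GNU sed's -z flag, which the stock BSD sed lacks, so GNU sed would need to be installed as well):
d6 'say 42' | tee >(pbcopy)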

grep/awk - how to filter out a certain keyword

I have the following line of text, where I want to filter out the N from (KEY_N). Keep in mind that the N is not constant; it can be anything, like (KEY_J), (KEY_K), (KEY_L), (KEY_I), (KEY_SPACE), and so on.
Event: time 1442439135.995248, type 1 (EV_KEY), code 49 (KEY_N), value 0
Update:
I hope I understood the question properly; if not, please let me know.
With GNU grep you can use this:
grep -oP '.*\(\K[^)]+' file
An alternative on non-GNU systems might be to use sed:
sed 's/.*(\([^)]\{1,\}\)).*/\1/' file
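Since the title mentions awk, here is an equivalent awk sketch for the sample line above (it assumes the key name is always inside the last pair of parentheses on the line):
awk -F'[()]' '{ print $(NF-1) }' file
Splitting on ( and ) makes the key name the second-to-last field, so this prints KEY_N for the example line.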

Updating Values on json file A using reference on file B - The return

OK, I should feel ashamed of this, but I'm unable to understand how awk works...
A few days ago I posted this question, which asks how to replace fields in file A using file B as a reference (both files have matching IDs).
But after accepting the answer as correct (thanks, Ed!), I'm struggling with how to do it using the following pattern:
File A
{"test_ref":32132112321,"test_id":12345,"test_name":"","test_comm":"test", "null_test": "true"}
{"test_ref":32133321321,"test_id":12346,"test_name":"","test_comm":"test", "test_type": "alfa"}
{"test_ref":32132331321,"test_id":12347,"test_name":"","test_comm":"test", "test_val": 1923}
File B
{"test_id": 12345, "test_name": "Test values for null"}
{"test_id": 12346, "test_name": "alfa tests initiated"}
{"test_id": 12347, "test_name": "discard values"}
Expected result:
{"test_ref":32132112321,"test_id":12345,"test_name":"Test values for null","test_comm":"test", "null_test": "true"}
{"test_ref":32133321321,"test_id":12346,"test_name":"alfa tests initiated","test_comm":"test", "test_type": "alfa"}
{"test_ref":32132331321,"test_id":12347,"test_name":"discard values","test_comm":"test", "test_val": 1923}
I tried some alterations to the original solution but without success. So, based on the question posted before, how could I achieve the same results with this new pattern?
PS: One important note: the lines in file A do not always have the same length.
Big thanks in advance.
EDIT:
After trying the solution posted by Wintermute, it seems it doesn't work with lines such as:
{"test_ref":32132112321,"test_id":12345,"test_name":"","test_comm":"test", "null_test": "true","modifiers":[{"type":3,"value":31}{"type":4,"value":33}]}
Error received:
error: parse error: Expected separator between values at line xxx, column xxx
(Note that the sample line above is itself invalid JSON: there is no comma between the two objects in the "modifiers" array, which is exactly the kind of error jq is reporting.)
Parsing JSON with awk or sed is not a good idea for the same reasons that it's not a good idea to parse XML with them: sed works based on lines, and JSON is not line-based. awk works on vaguely tabular data, and JSON is not vaguely tabular. People don't expect their JSON tools to break when they insert newlines in benign places.
Instead, consider using a tool geared towards JSON processing, such as jq. In this particular case, you could use
jq -c -s 'group_by(.test_id) | map(.[0] + .[1]) | .[]' a.json b.json > c.json
Here jq slurps (-s) the input files into an array of JSON objects, groups these by test_id, merges them and unpacks the array. -c means compact output format, so each JSON object in the result ends up on a single line in the output.
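As a quick illustration of the merge semantics (with jq's object addition, keys from the right-hand operand win), here is a minimal, self-contained run; the sample objects are made up for the demonstration:
echo '{"test_id":1,"test_name":""} {"test_id":1,"test_name":"filled"}' | jq -c -s 'group_by(.test_id) | map(.[0] + .[1]) | .[]'
which prints:
{"test_id":1,"test_name":"filled"}
Because the objects from b.json come after those from a.json within each group, their non-empty test_name values override the empty ones from a.json, while all the other keys from a.json are kept.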

NSLookup Script to update hosts file once a week.

First-time poster, so my apologies if this has been covered in a previous topic that I was unable to locate. Basically, I'm tasked with creating a script to perform an nslookup on 50 domain names, format the results, and pass them to the hosts file. I'll worry about checking and overwriting duplicate entries later.
Example:
Input: nslookup www.cbc.ca
Result:
Name: a1849.gc.akamai.net
Addresses: 184.50.238.64, 184.50.238.89
Aliases: www.cbc.ca, www.cbc.ca.edgesuite.net
Eventual Output: #184.50.238.64 www.cbc.ca a1849.gc.akamai.net
I figured this was possible with grep, awk, and sed, but I have been messing about with switches and haven't gotten the right combination (mostly because I'm not the most learned when it comes to regular expressions). I'm partial to VBS, batch, and cmd suggestions as well.
Thanks in advance for the time and effort! :)
nslookup "$NAME" | awk -v name="$NAME" '
  BEGIN       { hit = 0; addr = ""; alias = "" }
  /answer:/   { hit = 1 }                                # start reading only after the "answer:" header line
  /^Address:/ { if (hit == 1 && addr == "") addr = $2 }  # keep the first address found
  /^Name:/    { alias = alias " " $2 }                   # collect every Name: line
  END         { print(addr, name, alias) }'
This captures only one address, though, and would not handle multiple identical names, as with nslookup google.com...
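For the multi-address case shown in the question, a sketch along these lines might work; it assumes the nslookup output format shown above, with Name: preceding a comma-separated Addresses: line:
NAME=www.cbc.ca
nslookup "$NAME" | awk -v host="$NAME" '
  /answer:/ { in_answer = 1 }        # skip the leading Server:/Address: block
  !in_answer { next }
  /^Name:/ { name = $2 }             # canonical name, e.g. a1849.gc.akamai.net
  /^Address(es)?:/ {                 # one IP, or a comma-separated list
    for (i = 2; i <= NF; i++) {
      ip = $i
      sub(/,$/, "", ip)              # strip the trailing comma
      print "#" ip, host, name       # one hosts-file line per address
    }
  }'
For the example query this prints one line per address in the eventual-output format, e.g. #184.50.238.64 www.cbc.ca a1849.gc.akamai.net.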

How to delete last row in output file generated by nzsql

I am trying to delete the last row in the file generated by nzsql. Please find the query below:
nzsql -A -c "SELECT * FROM AM_MAS_DIVISION_DIM" > abc.out
When I execute this query, the output is generated and stored in abc.out. It includes both the header columns and some time information at the bottom. But I don't need the bottom metadata and want to keep only my header columns. How can I do this using only nzsql? Please help me. Thanks in advance.
Use the -r flag in the nzsql command to avoid getting that row (assuming the metadata referred to in the question is the row-count summary line, e.g. (3 rows)):
-r Suppresses the row count that is displayed at the end of the SQL output.
Reference: http://pic.dhe.ibm.com/infocenter/ntz/v7r0m3/index.jsp?topic=%2Fcom.ibm.nz.adm.doc%2Fr_sysadm_nzsql_command.html
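For example, combined with the query from the question:
nzsql -A -r -c "SELECT * FROM AM_MAS_DIVISION_DIM" > abc.out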
Why don't you just pipe the output through a Unix command to remove it? I think something like this will work:
nzsql -A -c "SELECT * FROM AM_MAS_DIVISION_DIM" | sed '$d' > abc.out
This seems to be the commonly recommended solution for getting rid of the last line (although ed, gawk, and other tools can handle it too).
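If sed is not available, an equivalent awk sketch buffers one line of lookahead so the final line is never printed:
nzsql -A -c "SELECT * FROM AM_MAS_DIVISION_DIM" | awk 'NR > 1 { print prev } { prev = $0 }' > abc.out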