Combine grep output when piping

I use the following command sipcalc to display information about an IP:
sipcalc 192.16.12.1/16 | grep -E 'Network address|Network mask \(bits\)'
The output is:
Network address - 192.16.0.0
Network mask (bits) - 16
Is there a way to combine the above output (only the right part), so the output would be:
192.16.0.0/16
I currently do this by calling grep twice and concatenating the results, but I don't think that is a good solution. Can grep, or another command I can pipe the output into such as awk, produce the output above?

grep is not an ideal tool for anything beyond searching for your expected text. Use awk alone:
sipcalc 192.16.12.1/16 | awk '/Network address/{ ip = $NF } /Network mask \(bits\)/{ print ip "/" $NF}'
Awk programs are written as /pattern/ { action } pairs that are applied to each input record. When the first pattern matches, the action stores the last whitespace-delimited field in ip. NF is the built-in variable that holds the number of fields in the current record, so $NF refers to the last field (See 7.5.1 Built-in Variables That Control awk).
When the second pattern matches, the action prints the value stored in ip, a slash, and the last field of that line. Awk has no explicit concatenation operator: writing the expressions next to each other, as in ip "/" $NF, joins them into one string.
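To see the two pattern-action pairs working together without installing sipcalc, here is a minimal sketch that feeds awk the two sample lines from the question. The variable names sample and cidr are only for illustration.

```shell
# Two lines copied from the question; in real use, pipe sipcalc into awk instead.
sample='Network address - 192.16.0.0
Network mask (bits) - 16'

cidr=$(printf '%s\n' "$sample" | awk '
  /Network address/       { ip = $NF }          # remember the last field
  /Network mask \(bits\)/ { print ip "/" $NF }  # join by juxtaposition
')
echo "$cidr"
```

The mask line has to come after the address line for this to work, which is the order shown in the question's output.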

Related

How to use awk to count the occurrences of a word beginning with something?

I have a file that looks like this:
FID IID
1 RQ50131-0
2 469314
3 469704
4 469712
5 RQ50135-2
6 469720
7 470145
I want to use awk to count the occurrences of IDs beginning with 'RQ' in column 2.
So for the little snapshot, it should be 2. After the RQ, the numbers differ so I want a count with anything that begins with RQ.
I am using this code
awk -F '\t' '{if(match("^RQ$",$2))print}'|wc -l ID.txt > RQ.txt
But I don't get an output.

Tabs are used as field delimiters by default (same as spaces), so you can omit -F '\t'.
You can use
awk '$2 ~ /^RQ/{cnt++} END{print cnt}' ID.txt > RQ.txt
Whenever field 2 starts with RQ, cnt is incremented; once the whole file has been processed, cnt is printed.
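As a quick check, the sketch below runs the same counting logic on the sample data from the question. One small variation, which is an addition and not part of the answer above, is printing cnt + 0 so that a file with no matching IDs prints 0 instead of an empty line. The variable names ids, count and none are illustrative.

```shell
# Sample data copied from the question, including its header line.
ids='FID IID
1 RQ50131-0
2 469314
3 469704
4 469712
5 RQ50135-2
6 469720
7 470145'

# cnt + 0 forces a numeric result, so 0 is printed when nothing matches.
count=$(printf '%s\n' "$ids" | awk '$2 ~ /^RQ/ { cnt++ } END { print cnt + 0 }')
none=$(printf '1 469314\n' | awk '$2 ~ /^RQ/ { cnt++ } END { print cnt + 0 }')
echo "$count $none"
```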

You did
{if(match("^RQ$",$2))print}
but the required arguments to the match function are the string first, then the regexp. Also, do not use $ if you are interested in finding strings that start with RQ, because $ denotes the end of the string. After fixing those issues, the code would be
{if(match($2,"^RQ"))print}
Disclaimer: this answer only fixes the problems in your current code; it does not suggest other ways to improve it.

Apart from the reversed parameters for match, the file name ID.txt should come right after the closing single quote.
As you want to print the whole line, you can omit both the if statement and the print statement: match returns the index at which the match begins, or 0 if there is no match, and awk prints the current line by default whenever a pattern evaluates to a non-zero value.
awk 'match($2,"^RQ")' ID.txt | wc -l > RQ.txt
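The return value of match can be inspected directly. The following sketch prints it together with RSTART and RLENGTH, the two variables match sets as a side effect, for one ID that matches and one that does not. The names hit and miss are illustrative.

```shell
# match() returns the 1-based start of the match and sets RSTART and RLENGTH.
hit=$(printf 'RQ50131-0\n' | awk '{ m = match($0, /^RQ/); print m, RSTART, RLENGTH }')

# With no match it returns 0, and RLENGTH is set to -1.
miss=$(printf '469314\n' | awk '{ m = match($0, /^RQ/); print m, RSTART, RLENGTH }')

echo "$hit"
echo "$miss"
```

Because 0 is false and any other number is true in awk, match($2,"^RQ") can serve as a pattern by itself.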

Finding sequence in data

I want to use awk to find sequences that match a pattern in DNA data, but I cannot figure out how to do it. I have a text file "test.txt" which contains a lot of data, and I want to match any sequence that starts with ATG and ends with TAA, TGA or TAG, and print it.
For instance, if my text file has data that looks like the lines below, I want to find all the matching sequences and output them as shown.
AGACGCCGGAAGGTCCGAACATCGGCCTTATTTCGTCGCTCTCTTGCTTTGCTCGAATAAACGAGTTTGGCTTTATCGAATCTCCGTACCGTAAGGTCGAAAACGGCCGGGTCATTGAGTACGTGAAAGTACAAAATGG
GTCCGCGAATTTTTCGGTTCGTCTCAGCTTTCGCAGTTTATGGATCAGACGAACCCGCTCTCTGAAATTACTCATAAACGCAGGCTCTCGGCGCTCGGGCCCGGCGGACTCTCGCGGGAGCGTGCAGGTTTCGAAGTTC
GGATGATATCGACCATCTCGGCAATCGACGCGTTCGGGCCGTAGGCGAACTGCTCGAAAATCAATTCCGAATCGGGCTTGAGCGAATGGAGCGGGCCATCAAGGAAAAAATGTCTATCCAGCAGGATATGCAAACGACG
AAAGTATGTTTTTCGATCCGCGCCGATTCGACCTCTCAAGAGTCGGAAGGCTTAAATTCAATATCAAAATGGGACGCCCCGAGCGCGACCGTATAGACGATCCGCTGCTTGCGCCGATGGATTTCATCGACGTTGTGAA
ATGAGACCGGGCGATCCGCCGACTGTGCCAACCGCCTACCGGCTTCTGG
Print out matches:
ATGATATCGACCATCTCGGCAATCGACGCGTTCGGGCCGTAG
ATGATATCGACCATCTCGGCAATCGACGCGTTCGGGCCGTAG
ATGTTTTTCGATCCGCGCCGATTCGACCTCTCAAGAGTCGGAAGGCTTAA
I tried something like this, but it only displays the rows that start with ATG, so it doesn't actually solve my problem.
awk '/^AGT/{print $0}' test.txt

Assuming the records do not span multiple lines:
$ grep -oP 'ATG.*?T(AA|AG|GA)' file
ATGGATCAGACGAACCCGCTCTCTGA
ATGATATCGACCATCTCGGCAATCGACGCGTTCGGGCCGTAG
ATGTTTTTCGATCCGCGCCGATTCGACCTCTCAAGAGTCGGAAGGCTTAA
ATGGGACGCCCCGAGCGCGACCGTATAG
ATGGATTTCATCGACGTTGTGA
The non-greedy quantifier .*? requires the -P (PCRE) switch. It makes each match stop at the first stop codon after ATG, not at the last one on the line.

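You could try the following. Note that .* is greedy in awk regular expressions, so this prints at most one match per line, running from the first ATG to the last stop codon on that line.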
awk 'match($0,/ATG.*TAA|ATG.*TGA|ATG.*TAG/){print substr($0,RSTART,RLENGTH)}' Input_file
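Awk regular expressions have no non-greedy quantifier, so getting every shortest match per line needs a small loop. The sketch below is one way to do it, not the method from either answer: it finds each ATG with index, then looks for the nearest stop codon after it with match. The function name find_matches and the short test string are made up for illustration.

```shell
# find_matches is a hypothetical helper name; it reads DNA lines on stdin.
find_matches() {
  awk '{
    s = $0
    while ((i = index(s, "ATG")) > 0) {        # next start codon
      rest = substr(s, i + 3)                  # text after that ATG
      if (!match(rest, /TAA|TAG|TGA/)) break   # no stop codon left on this line
      print "ATG" substr(rest, 1, RSTART + RLENGTH - 1)
      s = substr(rest, RSTART + RLENGTH)       # resume after the match
    }
  }'
}

found=$(printf 'CCATGAAATAACCATGCCTGATT\n' | find_matches)
echo "$found"
```

Like grep -o, this reports non-overlapping matches and resumes scanning after the end of each one.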

How to read a specific column in Unix

Filesystem State 1024-blocks Used Avail Capacity Mounted on
$ZPMON.DELETEMESTARTED 71686344 58788360 12897984 82% /deleteme
Here I want to read the first column, meaning only read up to the text DELETEME.
I tried, but when I read the first column I get ZPMON.DELETEMESTARTED.
The entries of the Filesystem and State columns are treated as one combined column (containing ZPMON.DELETEMESTARTED).
How can I resolve this?
This is what I tried:
df -k DELETEME | tail -1 | awk 'BEGIN{FS=" "};{print NF}'

Are you sure the two column values are combined in the output? I would guess they are separated by tabs, in which case all you need is awk '{print $1}' to print the first column. By default, awk splits fields on runs of spaces and tabs. If you don't want to rely on the default separator, specify it explicitly with -F.
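If the Filesystem and State values really are printed with no whitespace between them, as the sample line in the question suggests, field splitting alone cannot separate them. One possible workaround, sketched below, is to strip the state word from the end of the first field. It assumes the State value is the literal word STARTED, which is a guess based on the single sample line.

```shell
# Sample line copied from the question; single quotes stop the shell expanding $ZPMON.
line='$ZPMON.DELETEMESTARTED 71686344 58788360 12897984 82% /deleteme'

# Assumption: the fused State value is the literal word STARTED.
fs_name=$(printf '%s\n' "$line" | awk '{ sub(/STARTED$/, "", $1); print $1 }')
echo "$fs_name"
```

Other State values would need their own alternatives in the regular expression.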

How to extract the first column from a tsv file?

I have a file containing some data and I want to use only the first column as a stdin for my script, but I'm having trouble extracting it.
I tried using this
awk -F"\t" '{print $1}' inputs.tsv
but it only shows the first letter of the first column. I tried some other things but it either shows the entire file or just the first letter of the first column.
My file looks something like this:
Harry_Potter 1
Lord_of_the_rings 10
Shameless 23
....

You can use cut which is available on all Unix and Linux systems:
cut -f1 inputs.tsv
You don't need to specify the -d option because tab is the default delimiter. From man cut:
-d delim
Use delim as the field delimiter character instead of the tab character.
As Benjamin has rightly stated, your awk command is indeed correct. The shell passes the two characters \t unchanged as the argument, and awk interprets them as a tab, whereas other commands such as cut may not.
Not sure why you are getting just the first character as the output.
You may want to take a look at this post:
Difference between single and double quotes in Bash
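To check both commands against data that is known to contain real tabs, the sketch below builds the sample with printf and compares the results. It also shows what cut does when the separators are spaces, which may be what happened with the sample in the question. The variable names are illustrative.

```shell
# printf writes real tab characters, so this sample is a genuine TSV.
tsv=$(printf 'Harry_Potter\t1\nLord_of_the_rings\t10\nShameless\t23')

with_cut=$(printf '%s\n' "$tsv" | cut -f1)
with_awk=$(printf '%s\n' "$tsv" | awk -F'\t' '{ print $1 }')

# If the columns are separated by spaces, cut finds no tab and prints the whole line.
spaces=$(printf 'Harry_Potter 1\n' | cut -f1)

echo "$with_cut"
echo "$spaces"
```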

Try this (it is better to rely on a real CSV parser):
csvcut -t -c 1 file
See csvkit.
Output :
Harry_Potter
Lord_of_the_rings
Shameless
Note :
As @RomanPerekhrest said, you should fix your broken sample input (we saw spaces where tabs are expected).

text processing: sed to work backwards to delete until string

My AWK script generates 1 of the following 2 outputs depending on what text file it is being used on.
49 1146.469387755102 mongodb 192.168.0.8:27017 -p mongodb.database
1 1243.0 jdbc:mysql 192.168.0.8:3306/ycsb -p db.user
I need a way of deleting everything past the IP address, including the port number.
sed 's/:[^:]*//2g'
This works, apart from the fact that it deletes from left to right, and because one of the outputs contains two colons, it stops there and deletes everything after that. Is there a way to make sed work from right to left?
Just to be clear, desired output of each would be:
49 1146.469387755102 mongodb 192.168.0.8
1 1243.0 jdbc:mysql 192.168.0.8

You could use the below sed command.
sed 's/:[0-9]\{4\}.*//' file
OR
sed 's/:[^:]*$//' file
[^:]* is a negated character class that matches any character except :, zero or more times. $ matches the end of the line. So :[^:]*$ matches everything from the last colon up to the end of the line. Replacing that match with an empty string gives the desired output.
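Here is a small sketch that applies the second command to both sample lines from the question, so the effect of anchoring on the last colon is visible for the line with one colon and the line with two. The variable names lines and trimmed are illustrative.

```shell
# Both sample lines copied from the question.
lines='49 1146.469387755102 mongodb 192.168.0.8:27017 -p mongodb.database
1 1243.0 jdbc:mysql 192.168.0.8:3306/ycsb -p db.user'

# Delete from the last colon to the end of each line.
trimmed=$(printf '%s\n' "$lines" | sed 's/:[^:]*$//')
echo "$trimmed"
```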

You can take advantage of the greedy nature of the Kleene *:
sed 's/\(.*\):.*/\1/' file
The .* consumes as much as it can, while still matching the pattern. The captured part of the line is used in the replacement.
Alternatively, using awk (thanks to glenn jackman for setting me straight):
awk -F: -v OFS=: 'NF{NF--}1' file
Set the input and output field separators to a colon, and remove the final field by decrementing NF. 1 is true, so the default action {print} is performed. The NF condition prevents empty lines from causing an error, which may not be necessary in your case but does no harm.
Output either way:
49 1146.469387755102 mongodb 192.168.0.8
1 1243.0 jdbc:mysql 192.168.0.8
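The two approaches can be compared side by side on the line that contains two colons. This is only a sketch with illustrative variable names. Note that the gawk manual cautions that some awk versions do not rebuild $0 when NF is decremented, so the awk form is best checked on the awk you actually use.

```shell
line='1 1243.0 jdbc:mysql 192.168.0.8:3306/ycsb -p db.user'

# Greedy .* grabs as much as possible, so the literal colon matches the last one.
by_sed=$(printf '%s\n' "$line" | sed 's/\(.*\):.*/\1/')

# Same result by dropping the last colon-separated field.
by_awk=$(printf '%s\n' "$line" | awk -F: -v OFS=: 'NF { NF-- } 1')

echo "$by_sed"
echo "$by_awk"
```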
