Delete lines that contain a given string in the first or second column - awk

I would like to delete every line that contains a given string in the first or second column (the separator is \t). For example:
line 1 uni:1 uni:2 blabla blabla
line 2 uni:3 EBI:1 blbla blabla
I want to delete line 2. The "blabla" text may also contain the string (EBI), but I don't want to match on the rest of the line, only on the first two columns.
I tried: awk -F "\t" '{print $1 $2}' file1 | grep -v EBI > file2
but this keeps only the first and second columns, not the entire line.
I also tried: awk -F "\t" '{print $1 $2}' file1 | grep -n EBI
and sed "numberOfLined" file1 > file2
But there are many matches, so I don't want to type all the line numbers by hand.

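You can use an if statement and regex matching via ~:
awk -F '\t' '{if (! (($1 ~ ".*EBI.*") || ($2 ~ ".*EBI.*"))) {print $0} }' file1 > file2
Thanks to the comments, this can be written more concisely (keep -F '\t' so that the fields are still split on tabs):
awk -F '\t' '!($1 ~ /EBI/ || $2 ~ /EBI/)' file1 > file2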
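
To see that only the first two columns are tested, here is a minimal sketch on made-up data. The three sample rows and the variable name result are assumptions for illustration, not taken from the question. The third row has EBI only in its third column, so it should be kept.

```shell
# Hypothetical tab-separated sample: EBI is in column 2 of the second row
# and only in column 3 of the third row.
result=$(printf 'uni:1\tuni:2\tblabla\nuni:3\tEBI:1\tblabla\nuni:4\tuni:5\tEBI text\n' |
  awk -F '\t' '!($1 ~ /EBI/ || $2 ~ /EBI/)')

# Print the rows that survived the filter.
printf '%s\n' "$result"
```

A pattern with no action prints the whole record, which is why the entire line is kept rather than just the two tested columns.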

Related

awk conditional statement based on a value between colon

I was just introduced to awk and I'm trying to retrieve rows from my file based on the value in column 10.
I need to filter the rows on the third value obtained when column 10 (the last column) is split on ":".
Here is an example of the data in column 10: 0/1:1,9:10:15:337,0,15
I was able to extract the third value using this command: awk '{print $10}' file.txt | awk -F ":" '/1/ {print $3}'
This returns the value 10, but how can I return the whole rows (not just the value in column 10) when this third value is less than or greater than a specific number?
I tried this awk '{if($10 -F ":" "/1/ ($3<10))" print $0;}' file.txt but it returns a syntax error.
Thanks!

Your code:
awk '{print $10}' file.txt | awk -F ":" '/1/ {print $3}'
should be just 1 awk script:
awk '$10 ~ /1/ { split($10,f,/:/); print f[3] }' file.txt
but I'm not sure that code is doing what you think it does. If you want to print the 3rd value of all $10s that contain :s, as it sounds like from your text, that'd be:
awk 'split($10,f,/:/) > 1 { print f[3] }' file.txt
and to print the rows where that value is less than 7 would be:
awk '(split($10,f,/:/) > 1) && (f[3] < 7)' file.txt
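
As a rough check of the last command, the sketch below runs it on two made-up rows. The rows, the variable names, and the explicit f[3]+0 (added here to force a numeric comparison) are assumptions, not part of the original answer.

```shell
# Hypothetical sample: ten whitespace-separated columns per row.
rows='a b c d e f g h i 0/1:1,9:10:15:337,0,15
a b c d e f g h i 0/1:2,3:5:20:100,0,20'

# Keep whole rows whose third colon-separated value in $10 is below 7.
# f[3]+0 makes the comparison numeric rather than string-based.
result=$(printf '%s\n' "$rows" | awk '(split($10,f,/:/) > 1) && (f[3]+0 < 7)')

printf '%s\n' "$result"
```

Only the second row is printed, since its third value is 5 while the first row's is 10.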

How to exclude lines matching a regex pattern in a column by awk?

I want to exclude lines containing a specific string.
header
1:test
2:test
3:none
4:test
Why don't these commands work?
awk -F: 'FNR>1 {$0 !~ /none/} {print $1}' 1.txt
awk -F: 'FNR>1 {$2 !~ /none/} {print $1}' 1.txt
but this works:
awk '$0 !~ /none/ {print $0}' 1.txt
I intend to get
1
2
4

You need to write the regex test as a condition, not as an action. You can use either of:
awk -F: 'FNR>1 && !/none/{print $1}' file
awk -F: 'FNR>1 && $2 !~ /none/{print $1}' file
Details
-F: - sets the field separator to a colon
FNR>1 && !/none/ - true if the number of records read from the current file is greater than 1 and the line does not contain none (with $2 !~ /none/ instead, true if Field 2 does not match the none pattern)
{print $1} - print Field 1 value.
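
In the original attempts, {$0 !~ /none/} is an action: awk evaluates the expression and discards the result, and the separate {print $1} block then runs for every line. A small sketch of the corrected form on the question's sample data follows; the variable names data and result are only for illustration.

```shell
# Sample input from the question, including the header line.
data='header
1:test
2:test
3:none
4:test'

# Skip the header (FNR>1) and any line whose second field matches "none".
result=$(printf '%s\n' "$data" | awk -F: 'FNR>1 && $2 !~ /none/ {print $1}')

printf '%s\n' "$result"
```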

awk to extract data from a file, file name is dump below

I am using awk to extract an IP address, but the output contains a lot of whitespace. How can I get rid of it?
I am using the command below:
awk -F'f5public_ip =' '{print $2}' examples/aws/dump >> some.txt
shitole$ cat some.txt
54.83.174.153

Test that $2 is not empty:
awk -F'f5public_ip =' '$2 != "" {print $2}' examples/aws/dump
$2 might contain only blanks; in that case, test for a non-blank character instead:
awk -F'f5public_ip =' '$2 ~ /[^[:blank:]]/ {print $2}' examples/aws/dump
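
The sketch below illustrates the first test on a made-up dump. The file contents and the extra sub() call that trims the leading blank left after the separator are assumptions added for illustration; they are not part of the answer above.

```shell
# Hypothetical dump: only one line contains the "f5public_ip =" marker,
# so $2 is empty on every other line.
dump='some_key = some_value
f5public_ip = 54.83.174.153
other_key = other_value'

# Print the text after the marker only when $2 is non-empty,
# trimming leading spaces and tabs from a copy of it.
result=$(printf '%s\n' "$dump" |
  awk -F'f5public_ip =' '$2 != "" { ip = $2; sub(/^[ \t]+/, "", ip); print ip }')

printf '%s\n' "$result"
```

Without the condition, awk would print an empty line for every record that lacks the marker, which is one likely source of the unwanted blank output.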

how to get the common rows according to the first column in awk

I have two ',' separated files as follow:
file1:
A,inf
B,inf
C,0.135802
D,72.6111
E,42.1613
file2:
A,inf
B,inf
C,0.313559
D,189.5
E,38.6735
I want to compare the two files and get the common rows based on the first column. For the files above, the output would look like this:
A,inf,inf
B,inf,inf
C,0.135802,0.313559
D,72.6111,189.5
E,42.1613,38.6735
I am trying to do that in awk and tried this:
awk ' NR == FNR {val[$1]=$2; next} $1 in val {print $1, val[$1], $2}' file1 file2
This code returns the following result:
A,inf
B,inf
C,0.135802
D,72.6111
E,42.1613
which is not what I want. Do you know how I can fix it?

$ awk 'BEGIN{FS=OFS=","}NR==FNR{a[$1]=$0;next}$1 in a{print a[$1],$2}' file1 file2
A,inf,inf
B,inf,inf
C,0.135802,0.313559
D,72.6111,189.5
E,42.1613,38.6735
Explained:
$ awk '
BEGIN { FS=OFS="," }     # set separators
NR==FNR {                # first file
    a[$1]=$0             # hash to a, $1 as index
    next                 # next record
}
$1 in a {                # second file, if $1 in a
    print a[$1],$2       # print indexed record from a with $2
}' file1 file2

Your awk code basically works; you just forgot to tell awk to use , as the field delimiter. You can do that by adding BEGIN{FS=OFS=","} to the beginning of the script.
But since the files are sorted, as in the examples in your question, you can simply use the join command:
join -t, file1 file2
This joins the files on the first column. -t, tells join that the columns are separated by commas.
If the files are not sorted, you can sort them on the fly like this (the <(...) process substitution requires a shell such as bash):
join -t, <(sort file1) <(sort file2)
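
To compare the two approaches side by side, here is a sketch that sorts two small unsorted files into a temporary directory and runs both join and the awk script on them. The three-row inputs and the names tmp, joined and awked are assumptions for illustration; temporary files are used instead of process substitution so that the snippet also runs in a plain POSIX shell.

```shell
# Work in a throwaway directory so no existing files are touched.
tmp=$(mktemp -d)

# Hypothetical unsorted inputs, shortened from the question's data.
printf 'C,0.135802\nA,inf\nB,inf\n' > "$tmp/file1"
printf 'B,inf\nC,0.313559\nA,inf\n' > "$tmp/file2"

# join needs both inputs sorted on the join field.
sort "$tmp/file1" > "$tmp/f1.sorted"
sort "$tmp/file2" > "$tmp/f2.sorted"

joined=$(join -t, "$tmp/f1.sorted" "$tmp/f2.sorted")
awked=$(awk 'BEGIN{FS=OFS=","} NR==FNR{a[$1]=$0; next} $1 in a{print a[$1],$2}' \
  "$tmp/f1.sorted" "$tmp/f2.sorted")

printf '%s\n' "$joined"
rm -rf "$tmp"
```

On sorted input both commands should give the same rows. The awk version does not itself require sorted input, whereas join does.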

output the record by comparing string in other file using awk script

I want to output the records that have a matching string in another file, using an awk script.
file1:
849002|48|1208004|1
849007|28|1208004|1
855003|48|1208004|1
855004|28|1208004|1
855006|28|1208004|1
file2:
00990029000000004804470425|ST1400029|0.550|Recurring|1248073|ST1400029
00990029000000008410517183|IM1450029|1.000|Recurring|855003|ST1400029
009900290000000000007800612988|IM3350029|1.000|Recurring|1248063|ST1400029
Notice that 855003 occurs in the middle row of each sample? That's the match I'm looking for, and the output should be:
00990029000000008410517183|IM1450029|1.000|Recurring|855003|ST1400029
I want to search for $1 of file1 in $5 of file2 and, if a match is found, output the line.
I tried this, but it returns zero records:
awk 'NR==FNR{a[$1]=$1;next}a[$5]{print $0}' file2 file1 > outfile
Your help would resolve my issue; I have to search a long list of data.

Don't forget to set the delimiter using the -F flag:
awk -F "|" 'FNR==NR { a[$1]; next } $5 in a' file1 file2
Results:
00990029000000008410517183|IM1450029|1.000|Recurring|855003|ST1400029
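
The attempt in the question fails for two reasons: without -F '|' each line is a single field, so $5 is always empty, and the files are passed in the wrong order, so the keys are collected from file2 instead of file1. A minimal sketch of the corrected lookup on a subset of the question's data follows; the temporary directory and the variable names are assumptions for illustration.

```shell
# Work in a throwaway directory so no existing files are touched.
tmp=$(mktemp -d)

# Subset of the question's data: keys in file1, records in file2.
printf '849002|48|1208004|1\n855003|48|1208004|1\n' > "$tmp/file1"
printf '%s\n' \
  '00990029000000004804470425|ST1400029|0.550|Recurring|1248073|ST1400029' \
  '00990029000000008410517183|IM1450029|1.000|Recurring|855003|ST1400029' \
  > "$tmp/file2"

# First pass (file1): remember each $1 as an array index.
# Second pass (file2): print records whose $5 is one of those indices.
result=$(awk -F '|' 'FNR==NR { a[$1]; next } $5 in a' "$tmp/file1" "$tmp/file2")

printf '%s\n' "$result"
rm -rf "$tmp"
```

Testing with $5 in a, rather than a[$5], avoids creating empty array entries for keys that are not present.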

Try this (untested); note that -F '|' is needed so that $1 and $5 are split on the pipe character:
awk -F '|' 'NR==FNR{a[$1];next}$5 in a' file1 file2
