I am trying to look up $2 of file1 (skipping the header) in $2 of file2; if they match and the value in $10 is > 30 and $11 is > 49, print the line to an output file. The awk below has syntax errors, though shellcheck didn't report any. Both the input and output are tab-delimited. I think it is close, but I'm not sure what is wrong. Thank you :).
awk
awk -F'\t' -v OFS='\t' 'NR==FNR{A[$2];next}$2 in A
{if($10 >.5 OFS $11 > 49)
print ; next
' file1 file2
awk: cmd. line:2: {if($10 >.5 OFS $11 > 49)
awk: cmd. line:2: ^ syntax error
awk: cmd. line:3: print ; next
awk: cmd. line:3: ^ unexpected newline or end of string
file1
Missing in IDP but found in Reference:
2 166848646 G A exonic SCN1A 68 13 16;20 0;0 17;15 0;0 0;0 0;0 c.[5139C>T]+[=] 52.94
file2
chr2 166245425 SCN2A AMPL5155065355 SNP Het C/T C T 54 100 50 23 27
chr2 166848646 SCN1A AMPL1543060606 SNP Het G/A G A 52.9411764706 100 68 32 36
desired output
2 166848646 G A exonic SCN1A 68 13 16;20 0;0 17;15 0;0 0;0 0;0 c.[5139C>T]+[=] 52.94
edit with new awk
awk -F'\t' -v OFS='\t' 'NR==FNR{A[$2];next}$2 in A {
if($10 >.5 OFS $11 > 49) >>> if($10 >.5 && $11 > 49)
print }
' file1 file2 > out
awk: cmd. line:2: if($10 >.5 OFS $11 > 49) >>> if($10 >.5 && $11 > 49)
awk: cmd. line:2: ^ syntax error
here you go...
$ awk 'BEGIN{FS=OFS="\t"} NR==FNR{a[$2]; next}
($2 in a) && $10>30 && $11>49 ' file1 file2
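A minimal sketch of how that command behaves on the sample data, assuming both files are tab-delimited as the question states. The scratch directory and the shortened file1 row are made up for illustration, and the `FNR > 1` guard is an addition that skips the header of file1; the answer above does not include it.

```shell
#!/bin/sh
# Hypothetical scratch directory holding copies of the sample data.
tmp=$(mktemp -d)

# file1: one header line, then a (shortened) row whose 2nd field is the position to look up.
printf 'Missing in IDP but found in Reference:\n' > "$tmp/file1"
printf '2\t166848646\tG\tA\texonic\tSCN1A\n' >> "$tmp/file1"

# file2: $2 is the position, $10 and $11 are the two values being thresholded.
printf 'chr2\t166245425\tSCN2A\tAMPL5155065355\tSNP\tHet\tC/T\tC\tT\t54\t100\t50\t23\t27\n' > "$tmp/file2"
printf 'chr2\t166848646\tSCN1A\tAMPL1543060606\tSNP\tHet\tG/A\tG\tA\t52.9411764706\t100\t68\t32\t36\n' >> "$tmp/file2"

result=$(awk 'BEGIN { FS = OFS = "\t" }
    NR == FNR { if (FNR > 1) seen[$2]; next }   # first file: remember $2, skipping the header
    ($2 in seen) && $10 > 30 && $11 > 49        # second file: no action block, so matching lines are printed
' "$tmp/file1" "$tmp/file2")

printf '%s\n' "$result"
rm -rf "$tmp"
```

As with the command above, the line printed is the matching line of file2, not the line from file1.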
I am trying to use awk to remove the lines in file whose $2 does not match $2 of list, comparing only the part of the list entry from NM_ up to (but not including) the dot. Thank you :).
file
204 NM_003852 chr7 + 138145078 138270332 138145293
204 NM_015905 chr7 + 138145078 138270332 138145293
list
TRIM24 NM_015905.2
awk
awk -v OFS="\t" '{ sub(/\r/, "") } ; NR==FNR { N=$2 ; sub(/\..*/, "", $2); A[$2]=N; next } ; $2 in A { $2=A[$2] } 1' list file > out
current output
204 NM_003852 chr7 + 138145078 138270332 138145293
204 NM_015905.2 chr7 + 138145078 138270332 138145293
desired output (line 1 removed as that is the line that does not match)
204 NM_015905.2 chr7 + 138145078 138270332 138145293
awk 'NR==FNR{split($2,f2,".");a[f2[1]];next} $2 in a' list file
$ awk -F'[ .]' 'NR==FNR{a[$2];next}$2 in a' list file
204 NM_015905 chr7 + 138145078 138270332 138145293
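Neither command above restores the version suffix shown in the desired output: the first prints the matching line of file unchanged, and the second splits fields on spaces and dots. A sketch that keeps the substitution from the question but prints only matching lines, assuming whitespace-separated input; the array name `full` and the scratch directory are hypothetical.

```shell
#!/bin/sh
# Hypothetical scratch directory holding copies of the sample data.
tmp=$(mktemp -d)
printf '204 NM_003852 chr7 + 138145078 138270332 138145293\n' > "$tmp/file"
printf '204 NM_015905 chr7 + 138145078 138270332 138145293\n' >> "$tmp/file"
printf 'TRIM24 NM_015905.2\n' > "$tmp/list"

result=$(awk '
    { sub(/\r$/, "") }                    # tolerate DOS line endings
    NR == FNR {                           # first file (list)
        key = $2
        sub(/\..*/, "", key)              # NM_015905.2 -> NM_015905
        full[key] = $2                    # remember the versioned accession
        next
    }
    $2 in full { $2 = full[$2]; print }   # second file: print matches only, version restored
' "$tmp/list" "$tmp/file")

printf '%s\n' "$result"
rm -rf "$tmp"
```

Replacing the trailing `1` of the original script with an explicit `print` inside the matching rule is what drops the non-matching line. Add `-v OFS="\t"` if the output should be tab-separated.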
I have a file in the following format:
create_terminal \
-name {abc} \
-port {abc} \
-layer metal1 \
-bbox {{2 0.000} {3 0.204}}
I want the output file to look like this:
create_terminal -name {abc} -port {abc} -layer metal1 -bbox {{2 0.000} {3 0.204}}
Is there a quick sed or awk command to do this?
Thanks
With GNU awk for multi-char RS:
$ gawk -v RS='^$' -v ORS= '{gsub(/\\\n/,"")}1' file
create_terminal -name {abc} -port {abc} -layer metal1 -bbox {{2 0.000} {3 0.204}}
With other awks:
$ awk '{rec=rec $0 RS} END{gsub(/\\\n/,"",rec); printf "%s",rec}' file
create_terminal -name {abc} -port {abc} -layer metal1 -bbox {{2 0.000} {3 0.204}}
or:
$ awk 'sub(/\\$/,""){rec=rec $0; next} {print rec $0; rec=""}' file
create_terminal -name {abc} -port {abc} -layer metal1 -bbox {{2 0.000} {3 0.204}}
The above assumes you just want to remove all backslashes-followed-by-newlines. If you want something else then edit your question to clarify.
Using awk you can do it as:
awk '!/\\\s*$/{f=1} {sub(/\\\s*$/,""); a=a""$0 } f{print a; a=""; f=0}' filename
Explanation:
awk '
!/\\\s*$/{f=1} # set f=1 if the line does not end with \ followed by optional whitespace
{sub(/\\\s*$/,""); a=a""$0 } # substitutes the \ from the end of line with "" (nothing) and concatenates a with the current modified line.
f{print a; a=""; f=0} # prints a and reset the value of a and f. Runs only if f=1 is set
' filename
For input:
create_terminal \
-name {abc} \
-port {abc} \
-layer metal1 \
-bbox {{2 0.000} {3 0.204}}
temp_123 \
-sdg
Output:
create_terminal -name {abc} -port {abc} -layer metal1 -bbox {{2 0.000} {3 0.204}}
temp_123 -sdg
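The `\s` used in the command above is a GNU awk extension. A sketch of the same idea for other awks, using the bracket expression `[ \t]` instead; the buffer name `buf` and the scratch directory are made up for illustration.

```shell
#!/bin/sh
# Hypothetical scratch directory holding a copy of the sample input.
tmp=$(mktemp -d)
cat > "$tmp/input" <<'EOF'
create_terminal \
-name {abc} \
-port {abc} \
-layer metal1 \
-bbox {{2 0.000} {3 0.204}}
temp_123 \
-sdg
EOF

result=$(awk '
    /\\[ \t]*$/ {                        # line ends with a backslash plus optional blanks
        sub(/\\[ \t]*$/, "")             # drop the backslash and the blanks after it
        buf = buf $0                     # keep accumulating the command
        next
    }
    { print buf $0; buf = "" }           # last piece of a command: emit it and reset
    END { if (buf != "") print buf }     # input ended on a continuation line
' "$tmp/input")

printf '%s\n' "$result"
rm -rf "$tmp"
```

The space before each backslash is kept, which is why the joined pieces stay separated by a single space.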
You could try tr instead, then pipe the output to sed, like this (assuming the file name is testf.txt):
cat testf.txt |tr '\n' ' ' |sed 's#\\# #g'
*NOTE: This works on OS X with BSD sed.
This might work for you (GNU sed):
sed ':a;/\\$/{$!N;ba};s/\\\n//g' file
fatal: not enough arguments to satisfy format string
`%s SPT=80'
^ ran out for this one
This is my code
for ((h = 1 ; h < 4 ; h++ )); do
    x=$(awk -v i=h -v j=17 'FNR == 2 {printf "%s " $j}' newiptables.log)
    echo $x
done
This is my file
Dec 26 09:17:51 localhost kernel: IN=eth0 OUT= MAC=00:10:c6:a8:da:68:00:90:7f:9c:50:5a:08:00 SRC=198.252.206.16 DST=10.128.1.225 LEN=313 TOS=0x00 PREC=0x00 TTL=64 ID=59334 PROTO=TCP SPT=80 DPT=56506 WINDOW=46535 RES=0x00 ACK PSH URGP=0
Dec 26 09:17:52 localhost kernel: IN=eth0 OUT= MAC=00:10:c6:a8:da:68:00:90:7f:9c:50:5a:08:00 SRC=198.252.206.16 DST=10.128.1.225 LEN=1440 TOS=0x00 PREC=0x00 TTL=64 ID=47303 PROTO=TCP SPT=80 DPT=56506 WINDOW=46535 RES=0x00 ACK URGP=0
Dec 26 09:17:52 localhost kernel: IN=eth0 OUT= MAC=00:10:c6:a8:da:68:00:90:7f:9c:50:5a:08:00 SRC=198.252.206.16 DST=10.128.1.225 LEN=1440 TOS=0x00 PREC=0x00 TTL=64 ID=47559 PROTO=TCP SPT=80 DPT=56506 WINDOW=46535 RES=0x00 ACK URGP=0
The problem is a missing comma in the printf command for awk:
awk -v i=h -v j=17 'FNR == 2 {printf "%s ", $j}' newiptables.log
                                          ^
                                          |== This is needed
Quoting from the manual:
A simple printf statement looks like this:
printf format, item1, item2, ...
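Putting the fix into the loop from the question: a sketch that also passes the shell counter as `"$h"` (the original `-v i=h` hands awk the literal string `h`). Selecting the line by the counter is an assumption about what the loop was meant to do; the variable names `line` and `col`, the scratch directory, and the shortened log lines are made up for illustration.

```shell
#!/bin/sh
# Hypothetical scratch directory and log file.
tmp=$(mktemp -d)
log="$tmp/newiptables.log"

# Three shortened log lines; field 17 is the SPT= entry, as in the question.
for id in 59334 47303 47559; do
    printf 'Dec 26 09:17:51 localhost kernel: IN=eth0 OUT= MAC=00:10 SRC=198.252.206.16 DST=10.128.1.225 LEN=313 TOS=0x00 PREC=0x00 TTL=64 ID=%s PROTO=TCP SPT=80 DPT=56506\n' "$id"
done > "$log"

result=$(
    for h in 1 2 3; do
        # format string, then a comma, then the value substituted for %s
        awk -v line="$h" -v col=17 'FNR == line { printf "%s\n", $col }' "$log"
    done
)

printf '%s\n' "$result"
rm -rf "$tmp"
```

Without the comma, awk concatenates `"%s "` and `$col` into one format string, which leaves `%s` with no argument and produces the "not enough arguments" error.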
I have File1
A,B,C
and File2
D,E,F
I am trying to have
AD, AE, AF, BD, BE, BF, CD, CE, CF
unsuccessfully by
echo {`cat File1`}{`cat File2`}
giving
{A,B,C}{D,E,F}
How can this be done with Zsh or AWK?
awk -F, '
NR==FNR {
    # read lines from File1 into the array f1
    f1[NR]=$0
    next
}
{
    # for each line in File2
    n = split(f1[FNR], words)  # get words from the corresponding line in File1
    sep = ""
    for (i=1; i<=n; i++) {     # indexed loop: "for (i in words)" visits elements in unspecified order
        for (j=1; j<=NF; j++) {
            printf("%s%s%s", sep, words[i], $j)
            sep = ", "
        }
    }
    print ""
}
' File1 File2
If File1 contains
A,B,C
1,2,3
and File2 contains
D,E,F
4,5,6
then the awk script outputs
AD, AE, AF, BD, BE, BF, CD, CE, CF
14, 15, 16, 24, 25, 26, 34, 35, 36
I don't know zsh, here's what I did with bash and sed:
echo "A,B,C" >a
echo "D,E,F" >b
for i in `cat a | sed -e "s#,#\n#g"`;
do for j in `cat b | sed -e "s#,#\n#g"`;
do echo -n "$i$j, ";
done ;
done | sed -e "s#,\s\$##"
The output then is:
AD, AE, AF, BD, BE, BF, CD, CE, CF