Can anybody help me resolve this issue? I'm not sure why it is giving a problem.
ssh root@host1 "tail -f /data1/logs/logger.log | awk '{ if(\$0 ~ /^Mar|^Apr/) { printf(\"\\n%s\",\$0) } if(\$0 \!~ /^Mar|^Apr/) { printf(\"%s\", \$0);} };' "
root@host1's password:
awk: { if($0 ~ /^Mar|^Apr/) { printf("\n%s",$0) } if($0 \!~ /^Mar|^Apr/) { printf("%s", $0);} };
awk: ^ backslash not last character on line
Rather than testing over ssh, you can replicate the behaviour using eval. I made a test file (called month):
Mar line1
line2 line2
Apr line3
You have (at least) three options:
First option
Your two conditions are mutually exclusive, so you can sidestep the issue of escaping the ! entirely by using two blocks, with next in the first block:
eval "awk '/^Mar|^Apr/ { printf(\"\\n%s\",\$0); next } { printf(\"%s\", \$0) }' month"
If the condition is true, the first block is taken and next skips the rest. Note that I have removed the unnecessary $0 ~ from the condition. The match is performed against the whole line by default.
Second option
You could actually just do this:
eval "awk '/^Mar|^Apr/ { \$0 = \"\\n\"\$0 } { printf(\"%s\", \$0) }' month"
If the line matches, precede it with a newline.
In all cases (no condition before the { }), print the line.
Third option
If you wrap the overall command in single quotes, you don't need to do anything fancy with the !:
eval 'awk "{ if(/^Mar|^Apr/) { printf(\"\\n%s\",\$0) } if(!/^Mar|^Apr/) { printf(\"%s\", \$0)} }" month'
I recommend one of the other two solutions; I just thought it was worth showing that you can use ! within the command if you need to.
Output for all three cases:
Mar line1line2 line2
Apr line3
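For what it's worth, the same experiment also works with sh -c in place of eval: the command string is handed to a second shell, which is roughly what happens on the remote side of ssh. A minimal sketch of the first option, assuming a scratch copy of the month file in a temporary directory (the tmp, cmd and result names are only for illustration):

```shell
# Scratch copy of the test file
tmp=$(mktemp -d)
printf '%s\n' 'Mar line1' 'line2 line2' 'Apr line3' > "$tmp/month"

# Same escaping as in the eval examples: \" \\ and \$ survive the outer double quotes
cmd="awk '/^Mar|^Apr/ { printf(\"\\n%s\", \$0); next } { printf(\"%s\", \$0) }' $tmp/month"

# The second shell re-parses the string, much like the remote shell does over ssh
result=$(sh -c "$cmd")
printf '%s\n' "$result"
rm -rf "$tmp"
```

The output starts with an empty line, because the first matching line is also preceded by a newline.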
Here is a cleaned up version of your awk
awk '/^Mar|^Apr/ {printf "\n%s",$0;next} {printf "%s",$0}'
This tests whether $0 (the default, so there is no need to write it) starts with Mar or Apr.
If it does, it runs printf "\n%s",$0;next.
The next ensures that the final block runs only if $0 does not start with Mar or Apr.
That block then runs printf "%s",$0
Or just:
awk '/^Mar|^Apr/ {$0="\n"$0} {printf "%s",$0}'
Related
I want to merge some rows in a file so that each line contains 22 fields separated by ~.
Input file looks like this.
200269~7414~0027001~VALTD~OM3500~963~~~~716~423~2523~Y~UN~~2423~223~~~~A~200423
2269~744~2701~VALD~3500~93~~~~76~423~223~Y~
UN~~243~223~~~~A~200123
209~7414~7001~VALD~OM30~963~~~
~76~23~2523~Y~UN~~223~223~~~~A~123
and so on.
The first line looks fine. The 2nd and 3rd lines need to be merged so that they become one line with 22 fields. The 4th, 5th and 6th lines should be merged, and so on.
Expected output:
200269~7414~0027001~VALTD~OM3500~963~~~~716~423~2523~Y~UN~~2423~223~~~~A~200423
2269~744~2701~VALD~3500~93~~~~76~423~223~Y~UN~~243~223~~~~A~200123
209~7414~7001~VALD~OM30~963~~~~76~23~2523~Y~UN~~223~223~~~~A~123
The file has 10 GB of data, and the code I wrote (using a while loop) takes too much time to execute. How can I solve this problem using an awk/sed command?
Code Used:
IFS=$'\n'
set -f
while read line
do
    count_tild=`echo $line | grep -o '~' | wc -l`
    if [ $count_tild == 21 ]
    then
        echo $line
    else
        checkLine
    fi
done < file.txt
function checkLine
{
    current_line=$line
    read line1
    next_line=$line1
    new_line=`echo "$current_line$next_line"`
    count_tild_mod=`echo $new_line | grep -o '~' | wc -l`
    if [ $count_tild_mod == 21 ]
    then
        echo "$new_line"
    else
        line=$new_line
        checkLine
    fi
}
Using only the shell for this is slow, error-prone, and frustrating. Try Awk instead.
awk -F '~' 'NF==1 { next } # Hack; see below
NF<22 {
    # A fragment continues the last field of the previous fragment
    if (a) f[a] = f[a] $1
    for (i = (a ? 2 : 1); i <= NF; i++) f[++a] = $i
}
a==22 {
    for (i = 1; i <= a; ++i) printf "%s%s", f[i], (i==22 ? "\n" : "~")
    a = 0
}
NF==22
END {
    if (a) for (i = 1; i <= a; i++) printf "%s%s", f[i], (i==a ? "\n" : "~")
}' file.txt >file.new
This assumes that consecutive lines with too few fields will always add up to exactly 22 when you merge them. You might want to check this assumption (or perhaps accept this answer and ask a new question with more and better details). Or maybe just add something like
a>22 {
print FILENAME ":" FNR ": Too many fields " a >"/dev/stderr"
exit 1 }
The NF==1 block is a hack to bypass the weirdness of the completely empty line 5 in your sample.
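One way to check that assumption after the fact is to count the fields in the output. A small sketch, assuming a hypothetical two-line sample in place of the real file.new (the second line is deliberately left incomplete):

```shell
tmp=$(mktemp -d)
# Hypothetical sample standing in for file.new; line 2 is incomplete on purpose
printf '%s\n' \
  '200269~7414~0027001~VALTD~OM3500~963~~~~716~423~2523~Y~UN~~2423~223~~~~A~200423' \
  '2269~744~2701~VALD~3500~93~~~~76~423~223~Y~' > "$tmp/file.new"

# Report every line whose field count is not 22
report=$(awk -F '~' 'NF != 22 { printf "line %d: %d fields\n", FNR, NF }' "$tmp/file.new")
printf '%s\n' "$report"
rm -rf "$tmp"
```

An empty report would mean every line has exactly 22 fields.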
Your attempt contained multiple errors and inefficiencies; for a start, try http://shellcheck.net/ to diagnose many of them.
$ cat tst.awk
BEGIN { FS="~" }
{
    sub(/^[0-9]+\./,"")
    gsub(/[[:space:]]+/,"")
    $0 = prev $0
    if ( NF == 22 ) {
        print ++cnt "." $0
        prev = ""
    }
    else {
        prev = $0
    }
}
$ awk -f tst.awk file
1.200269~7414~0027001~VALTD~OM3500~963~~~~716~423~2523~Y~UN~~2423~223~~~~A~200423
2.2269~744~2701~VALD~3500~93~~~~76~423~223~Y~UN~~243~223~~~~A~200123
3.209~7414~7001~VALD~OM30~963~~~~76~23~2523~Y~UN~~223~223~~~~A~123
The assumption above is that you never have more than 22 fields on 1 line nor do you exceed 22 in any concatenation of the contiguous lines that are each less than 22 fields, just like you show in your sample input.
You can try this awk
awk '
BEGIN {
    FS=OFS="~"
}
{
    while(NF<22) {
        if(NF==0)
            break
        a=$0
        if((getline) <= 0)    # stop at end of input instead of looping forever
            break
        $0=a$0
    }
    if(NF!=0)
        print
}
' infile
or this sed
sed -E '
:A
s/((.*~){21})([^~]*)/\1\3/
tB
N
bA
:B
s/\n//g
' infile
Why does the first statement work but not the second? I'm trying to add an additional two (one shown) variables to do another comparison, but the second instance errors out.
1st Instance
awk 'f1=substr($1,0,9), f2=substr($3,0,9){if(f1==f2)print $1,$2,$3,$4}' file
2nd Instance
awk 'f1=substr($1,0,9), f2=substr($3,0,9), f3=substr($1,5,3){if(f1==f2)print $1,$2,$3,$4}' file
awk: cmd. line:1: f1=substr($1,0,9), f2=substr($3,0,9), f3=substr($1,5,3){if(f1==f2)print $1,$2,$3,$4,16}
awk: cmd. line:1: ^ syntax error
File
TULSHDRJ02 ae0.0 KSCYBBRJ01 ae1.0
MTC3BBRJ02 ae4.0 KSCYBBRJ01 ae6.0
KSCYBBRJ01 ae2.0 KSCYBBRJ02 ae2.0
MTC1BBRJ02 ae4.0 KSCYBBRJ02 ae6.0
Output
KSCYBBRJ01 ae2.0 KSCYBBRJ02 ae2.0
$ awk 'substr($1,1,9)==substr($3,1,9){print $1,$2,$3,$4}' file
Since you're printing every field, you can drop the action part:
$ awk 'substr($1,1,9)==substr($3,1,9)' file
or, for DRY
$ awk 'function s(v) {return substr(v,1,9)}
s($1)==s($3)' file
The general program structure of an awk program is as follows:
condition { action [; action [ ; ... ]] }
Multiple actions are separated by ; or newline.
Both the condition and the block of actions are optional. When you omit the condition
{ action [; action [ ; ... ]] }
... actions will be always executed. If you omit the actions:
condition
... the default action is print.
Multiple of those blocks can be put in a row:
cond1 { action1 } cond2 {action2} ...
Note: newline can be always used as a delimiter (for multiline programs)
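To make those forms concrete, here is a small sketch on three made-up input lines (the data and the variable names are assumptions for illustration only):

```shell
input=$(printf '%s\n' 'a 1' 'b 2' 'c 3')

# Condition only: the default action prints each matching line
only_condition=$(printf '%s\n' "$input" | awk '$2 > 1')

# Action only: runs for every line
only_action=$(printf '%s\n' "$input" | awk '{ print $1 }')

# Several condition { action } blocks in a row
both=$(printf '%s\n' "$input" | awk '$2 == 1 { print "first:", $1 } $2 == 3 { print "last:", $1 }')

printf '%s\n' "$only_condition" "$only_action" "$both"
```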
I guess you wanted:
awk '{f1=substr($1,0,9);f2=substr($3,0,9)} f1==f2{print $1,$2,$3,$4}'
... or in multiline form:
awk '# Runs on every line
{
f1=substr($1,0,9)
f2=substr($3,0,9)
}
# Runs only if condition is met
f1==f2 {
print $1,$2,$3,$4
}'
But not quite!
It should be
awk '{f1=substr($1,1,9);f2=substr($3,1,9)} f1==f2{print $1,$2,$3,$4}'
instead of
awk '{f1=substr($1,0,9);f2=substr($3,0,9)} f1==f2{print $1,$2,$3,$4}'
Note that string positions and field numbers in awk start at 1, not 0 ($0 is the whole line), and arrays filled by split() are also indexed from 1.
Please check also karakfa's answer, which shows how the command can be simplified.
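As for the original goal of adding more variables: they go inside the action block as further assignments. A sketch, assuming the sample file from the question and assuming the third substring is simply meant to be printed (the question does not say what it should be compared against):

```shell
tmp=$(mktemp -d)
printf '%s\n' \
  'TULSHDRJ02 ae0.0 KSCYBBRJ01 ae1.0' \
  'MTC3BBRJ02 ae4.0 KSCYBBRJ01 ae6.0' \
  'KSCYBBRJ01 ae2.0 KSCYBBRJ02 ae2.0' \
  'MTC1BBRJ02 ae4.0 KSCYBBRJ02 ae6.0' > "$tmp/file"

matches=$(awk '
{
    f1 = substr($1, 1, 9)
    f2 = substr($3, 1, 9)
    f3 = substr($1, 5, 3)     # three characters starting at position 5
}
f1 == f2 { print $1, $2, $3, $4, f3 }' "$tmp/file")
printf '%s\n' "$matches"
rm -rf "$tmp"
```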
I have a text file that is comma delimited. The first line is a list of field names, and subsequent lines contain data. I'll get new versions of the file, and I want to extract all the values from a particular column by name rather than by column number. (I.e. the column I want may be in different positions in different versions of the file.)
For example, here are two files:
foo,bar,interesting,junk
1,2,gold,ramjet
2,25,diamonds,superfluous
and
foo,bar,baz,interesting,junk,morejunk
5,3,smurf,platinum,garbage,scrap
6,2.5,mushroom,sodium,liverwurst,eew
I'd like a single script that will go through multiple files, extracting the minerals in the "interesting" column. :-)
What I've got so far is something that works on ONE file, but I know that awk is more elegant than this. How do I clean this up and make it work on multiple files at once?
BEGIN {
    FS=",";
}
NR == 1 {
    for(i=1; i<=NF; i++) {
        if($i=="interesting") {
            col=i;
        }
    }
}
NR > 1 {
    print $col;
}
You're pretty darn close already. Just use FNR instead of NR, for "File NR".
#!/usr/bin/awk -f
BEGIN { FS="," }
FNR==1 {
    for (col=1;col<=NF;col++)
        if ($col=="interesting")
            next
}
{ print $col }
Or if you like:
#!/usr/bin/awk -f
BEGIN { FS="," }
FNR==1 { for (col=1;$col!="interesting";col++); next }
{ print $col }
Or if you prefer one-liners:
$ awk -F, -v txt="interesting" 'FNR==1{for(c=1;$c!=txt;c++);next} {print $c}' file1 file2
Of course, be careful that you actually have the specified column, or you may find yourself in an endless loop. You can probably figure out the extra condition that saves you from that risk.
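One possible form of that extra condition, as a sketch: stop the loop at NF and remember whether the column was found. The second file here is hypothetical and lacks the column on purpose; the found variable is an addition that does not appear in the scripts above.

```shell
tmp=$(mktemp -d)
printf '%s\n' 'foo,bar,interesting,junk' '1,2,gold,ramjet' '2,25,diamonds,superfluous' > "$tmp/file1"
# Hypothetical file without the wanted column
printf '%s\n' 'foo,bar,junk' '5,3,garbage' > "$tmp/file2"

out=$(awk -F, -v txt="interesting" '
FNR==1 {
    # Stop at the last field instead of running past it forever
    for (c = 1; c <= NF && $c != txt; c++) ;
    found = (c <= NF)
    if (!found) print FILENAME ": no column named " txt > "/dev/stderr"
    next
}
found { print $c }' "$tmp/file1" "$tmp/file2")
printf '%s\n' "$out"
rm -rf "$tmp"
```

Files without the column are reported on stderr and contribute nothing to the output.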
Note that in awk, you only need to terminate commands with semicolons if they are followed by another command. Thus, you would do this:
command1; command2
But you can drop the semicolon if you separate commands with newlines:
command1
command2
Do it this way:
$ cat tst.awk
BEGIN { FS=OFS="," }
FNR==1 { for (i=1;i<=NF;i++) f[$i]=i; next }
{ print $(f["interesting"]) }
$ awk -f tst.awk file1 file2
gold
diamonds
platinum
sodium
Creating a name->value array is always the best approach when it's applicable. It keeps every part of the code simple and decoupled from the rest of the code, and it sets you up for doing other things like changing the order of the fields when you output the results, e.g.:
$ cat tst.awk
BEGIN { FS=OFS="," }
FNR==1 { for (i=1;i<=NF;i++) f[$i]=i; next }
{ print $(f["junk"]), $(f["interesting"]), $(f["bar"]) }
$ awk -f tst.awk file1 file2
ramjet,gold,2
superfluous,diamonds,25
garbage,platinum,3
liverwurst,sodium,2.5
awk -F, '{if ($2 == 0) awk '{ total += $3; count++ } END { print total/count }' CLN_Tapes_LON; }' /tmp/CLN_Tapes_LON
awk: {if ($2 == 0) awk {
awk: ^ syntax error
bash: count++: command not found
Just for fun, let's look at what's wrong with your original version and transform it into something that works, step by step. Here's your initial version (I'll call it version 0):
awk -F, '{if ($2 == 0) awk '{ total += $3; count++ } END { print total/count }' CLN_Tapes_LON; }' /tmp/CLN_Tapes_LON
The -F, sets the field separator to be the comma character, but your later comment seems to indicate that the columns (fields) are separated by spaces. So let's get rid of it; whitespace-separation is what awk expects by default. Version 1:
awk '{if ($2 == 0) awk '{ total += $3; count++ } END { print total/count }' CLN_Tapes_LON; }' /tmp/CLN_Tapes_LON
You seem to be attempting to nest a call to awk inside your awk program? There's almost never any call for that, and this wouldn't be the way to do it anyway. Let's also get rid of the mismatched quotes while we're at it: note in passing that you cannot nest single quotes inside another pair of single quotes that way: you'd have to escape them somehow. But there's no need for them at all here. Version 2:
awk '{if ($2 == 0) { total += $3; count++ } END { print total/count } }' /tmp/CLN_Tapes_LON
This is close but not quite right: the END block is only executed when all lines of input are finished processing: it doesn't make sense to have it inside an if. So let's move it outside the braces. I'm also going to tighten up some whitespace. Version 3:
awk '{if ($2==0) {total+=$3; count++}} END{print total/count}' /tmp/CLN_Tapes_LON
Version 3 actually works, and you could stop here. But awk has a handy way of running a block of code only against lines that match a condition: 'condition {code}'. So yours can be written more simply as:
awk '$2==0 {total+=$3; count++} END{print total/count}' /tmp/CLN_Tapes_LON
... which, of course, is pretty much exactly what John1024 suggested.
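One edge case the versions above share: if no line has 0 in the second column, count stays empty and total/count is a division by zero. A hedged sketch of a guard, assuming a few made-up lines in the same three-column layout and a hypothetical shell helper named avg:

```shell
tmp=$(mktemp -d)
printf '%s\n' 'CLH040 0 3' 'CLH041 0 3' 'CLH010 1 0' 'CLH130 1 40' > "$tmp/CLN_Tapes_LON"

# avg VALUE FILE: mean of column 3 over lines whose column 2 equals VALUE
avg() {
    awk -v want="$1" '
    $2 == want { total += $3; count++ }
    END {
        if (count) print total/count
        else       print "no matching lines"
    }' "$2"
}

zero=$(avg 0 "$tmp/CLN_Tapes_LON")
one=$(avg 1 "$tmp/CLN_Tapes_LON")
none=$(avg 7 "$tmp/CLN_Tapes_LON")
printf '%s\n' "$zero" "$one" "$none"
rm -rf "$tmp"
```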
$ awk '$2 == 0 { total += $3; count++;} END { print total/count; }' CLN_Tapes_LON
3
This assumes that your input file looks like:
$ cat CLN_Tapes_LON
CLH040 0 3
CLH041 0 3
CLH042 0 3
CLH043 0 3
CLH010 1 0
CLH011 1 0
CLH012 1 0
CLH013 1 0
CLH130 1 40
CLH131 1 40
CLH132 1 40
CLH133 1 40
Thought I'd try to do this without awk. Awk is clearly the better choice, but it's still a one-liner.
bc<<<"($(grep ' 0 ' file|tee >(wc -l>i)|cut -d\ -f3|tr '\n' '+')0)/"$(<i)
3
It extracts the lines with 0 in the second column using grep. The result is passed to tee, which feeds wc -l to count the lines (written to the file i) and cut to extract the third column. tr replaces the newlines with +, producing a sum that is then divided by the number of lines (i.e., "12 / 4"). The whole expression is then passed to bc.
I'm trying to run the command below, and it's giving me the error shown. Any thoughts on how to fix it? I would rather have this be a one-line command than a script.
grep "id\": \"http://room.event.assist.com/event/room/event/" failed_events.txt |
head -n1217 |
awk -F/ ' { print $7 } ' |
awk -F\" ' { print "url \= \"http\:\/\/room\.event\.assist\.com\/event\/room\/event\/'{ print $1 }'\?schema\=1\.3\.0\&form\=json\&pretty\=true\&token\=582EVTY78-03iBkTAf0JAhwOBx\&account\=room_event\"" } '
awk: non-terminated string url = "ht... at source line 1
context is
>>> <<<
awk: giving up
source line number 2
The line below outputs a single column of IDs:
grep "id\": \"http://room.event.assist.com/event/room/event/" failed_events.txt |
head -n1217 |
awk -F/ ' { print $7 } '
156512145
898545774
454658748
898432413
I'm looking to get the IDs above into a string like so:
" url = "string...'ID'string"
Take a look at what you have in the last awk:
awk -F\"
' #single start here
{ print " #double starts for print, no ends
url \= \"http\:\/\/room\.event\.assist\.com\/event\/room\/event\/
' #single ends here???
{ print $1 }'..... #single again??? ...
(rest codes)
Do you really want to print the literal text { print $1 }? I don't think so. Why were you nesting print?
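If the goal is simply to wrap each ID in that URL, a single pair of single quotes around the whole awk program is enough, with the ID filled in through %s in the printf format. A minimal sketch, assuming the IDs arrive one per line as in the question's output, and with the query string shortened for readability:

```shell
ids=$(printf '%s\n' 156512145 898545774)

urls=$(printf '%s\n' "$ids" | awk '{
    # \" is a literal double quote inside the awk string; %s receives the ID
    printf "url = \"http://room.event.assist.com/event/room/event/%s?schema=1.3.0&form=json&pretty=true\"\n", $1
}')
printf '%s\n' "$urls"
```

Nothing in the URL needs a backslash here; only the double quotes that should appear in the output are escaped.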
Most of the elements of your pipe can be expressed right inside awk.
I can't tell exactly what you want to do with the last awk script, but here are some points:
Your "grep" is really just looking for a string of text, not a regexp.
You can save time and simplify things if you use awk's index() function instead of a RE.
Output formats are almost always best handled using printf().
Since you haven't provided your input data, I can't test this code, so you'll need to adapt it if it doesn't work. But here goes:
awk -F/ '
BEGIN {
    string="id\": \"http://room.event.assist.com/event/room/event/";
    fmt="url = http://example.com/event/room/event/%s?schema=whatever\n";
}
count == 1217 { nextfile; }
index($0, string) {
    split($7, a, "\"");
    printf(fmt, a[1]);    # split() numbers array elements from 1
    count++;
}' failed_events.txt
If you like, you can use awk's -v option to pass in the string variable from a shell script calling this awk script. Or if this is a stand-alone awk script (using #! shebang), you could refer to command line options with ARGV.
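A sketch of the -v variant, assuming an input line of the shape the grep pattern suggests (the sample line is made up, since the real input was not posted):

```shell
# Hypothetical input line; the real failed_events.txt was not shown
line='"id": "http://room.event.assist.com/event/room/event/156512145",'
needle='id": "http://room.event.assist.com/event/room/event/'

# The search string is passed in from the shell instead of being set in BEGIN
id=$(printf '%s\n' "$line" |
    awk -F/ -v string="$needle" 'index($0, string) { split($7, a, "\""); print a[1] }')
printf '%s\n' "$id"
```

Note that awk processes backslash escapes in -v values, so a search string containing backslashes would need them doubled.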