Awk scripting help - Logic Issue - scripting

I'm currently writing a simple .sh script to parse an Exim log file for strings matching " o' ". Currently, output.txt contains nothing but a 0 printed on every line (606 lines). I'm guessing my logic is wrong, as awk does not throw any errors.
Here is my code (updated for the concatenation and counter issues). Edit: I've adopted some new code from dmckee's answer, which I'm now using instead of the old code for the sake of simplicity.
awk '/o'\''/ {
    line = "> ";
    for(i = 20; i <= 33; i++) {
        line = line " " $i;
    }
    print line;
}' /var/log/exim/main.log > output.txt
Any ideas?
EDIT: For clarity's sake, I'm grepping for "o'" in email addresses, because ' is an illegal character in email addresses (and in our databases, it appears only in o'-prefixed names).
EDIT 2: As requested in the comments, here is a sanitized sample of some desired output:
[xxx.xxx.xxx.xxx] kathleen.o'toole@domain.com <kathleen.o'toole@domain.com> routing defer (-51): retry time not reached
[xxx.xxx.xxx.xxx] julie.o'brien@domain.com <julie.o'brien@domain.com> routing defer (-51): retry time not reached
[xxx.xxx.xxx.xxx] james.o'dell@domain.com <james.o'dell@domain.com> routing defer (-51): retry time not reached
[xxx.xxx.xxx.xxx] daniel_o'leary@domain.com <aniel_o'leary@domain.com> routing defer (-51): retry time not reached
The reason I'm starting at 20 in my loop is that everything before the 20th field is just standard log information that isn't needed for my purposes here. All I need is everything from the IP onward (the messages for each 550 error are different for each mail server in use out there, and I'm compiling a list of common ones).

+ means numerical addition in awk. If you want to concatenate, just place the constants and/or expressions separated with spaces.
So, this
line += " " + $i
should become
line = line " " $i
EDIT: If the fields in Exim log files (I am more into Postfix :) are separated by a single space, isn't the following simpler:
grep -F o\' /var/log/exim/main.log | cut -d' ' -f20-33 >output.txt
?

There is no real need for the grep here. Let awk select the matching lines for you (and fix your concatenation bug as per ΤΖΩΤΖΙΟΥ):
awk '/o'\''/ {
    line = "> ";
    for(i = 20; i <= 33; i++) {
        line = line " " $i;
    }
    print line;
}' /var/log/exim/main.log > output.txt
Of course, you end up needing some weird escaping if you do it at the prompt like above. It is cleaner in a script...
Edit: On the first pass I missed the += problem...
I'm also assuming that the line you gave above is partial, as it has only 13ish fields (by default, fields are whitespace delimited).

"'" is not illegal in local parts. From RFC2821, section 4.1.2:
Local-part = Dot-string / Quoted-string
Dot-string = Atom *("." Atom)
Atom = 1*atext
2821 further references RFC2822 for non-locally-defined elements, so:
atext = ALPHA / DIGIT / ; Any character except controls,
        "!" / "#" /     ;  SP, and specials.
        "$" / "%" /     ;  Used for atoms
        "&" / "'" /
        "*" / "+" /
        "-" / "/" /
        "=" / "?" /
        "^" / "_" /
        "`" / "{" /
        "|" / "}" /
        "~"
In other words, "'" is a perfectly legal unquoted character to have in an email local part. Now, it may not be legal at your site, but that's not what you said.
Sorry for not staying directly on topic, but I wanted to correct your assertion.
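To make the grammar above concrete, here is a minimal Python sketch that checks whether a local part is a valid Dot-string. The function name is_dot_string is made up for this illustration, and the sketch deliberately ignores the Quoted-string alternative.

```python
import string

# atext as quoted above from RFC 2822: letters, digits, and these specials
ATEXT = set(string.ascii_letters + string.digits + "!#$%&'*+-/=?^_`{|}~")

def is_dot_string(local_part):
    """Return True if local_part matches Dot-string = Atom *("." Atom)."""
    atoms = local_part.split(".")
    # every atom must be non-empty (1*atext) and contain only atext characters
    return all(atom and set(atom) <= ATEXT for atom in atoms)

print(is_dot_string("kathleen.o'toole"))  # the apostrophe is part of atext
print(is_dot_string("bad..dots"))         # empty atom between the two dots
```

The first call accepts the apostrophe, while the second is rejected only because of the empty atom.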

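Off task, and simpler still: python.
import fileinput
for line in fileinput.input():
    if "'" in line:
        # split() with no argument splits on runs of whitespace, like awk does
        fields = line.split()
        # awk fields $20..$33 are indices 19..32 in a 0-based list
        print("> ", ' '.join(fields[19:33]))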

Related

Simplest way to find text by regex and replace by lookup table

A legacy web application needs to be internationalized. Error messages are currently written inside source code in this way:
addErrorMessage("some text here");
These calls can easily be found and extracted using a regex. They should be replaced with something like this:
addErrorMessage(ResourceBundle.getBundle("/Bundle", lcale).getString("key for text here"));
The correspondence between key for text here and some text here will be in a .properties file.
According to some Linux guru this can be achieved using awk, but I don't know anything about it. I could write a small application to do the task, but that might be overkill. Is there an IDE plugin or an existing application for this?
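awk -v TextOrg='some text here' -v key='key for text here' '
# index() is a literal search, so only lines containing the exact call are touched
index($0, "addErrorMessage(\"" TextOrg "\")") {
    # gsub treats its first argument as a regex, so the parentheses are escaped
    gsub("addErrorMessage\\(\"" TextOrg "\"\\)",
         "addErrorMessage(ResourceBundle.getBundle(\"/Bundle\", lcale).getString(\"" key "\"))")
}
# always-true pattern: print every line, modified or not
7
' YourFile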
This is one way to do it for a specific pair. Be careful with:
the assignment of values (the -v arguments are subject to shell interpretation in this case)
gsub uses a regex for the search, so your text needs to be escaped accordingly (ex: "this f***ing text" -> "this f\*\*\*ing text")
You will certainly want to do this for several pairs.
Here is a version that reads the pairs from a file,
assuming that Trad.txt is a file made of a series of 2 lines: first the original text, then the key (this avoids using a separator character, which would need complex escape handling)
ex: Trad.txt
some text
key text
other text
other key
Sample code (simple, no exhaustive safety checks, ...). Not tested, but it shows the concept with awk:
awk '
# for the first file only
FNR == NR {
    # odd lines hold the text to change
    if ( NR % 2 ) TextOrg = $0
    else {
        # even lines hold the key; both arrays are indexed by the text to change
        Key[ TextOrg ] = $0
        Len[ TextOrg ] = length( "addErrorMessage(\"" TextOrg "\")" )
    }
    # do not go further in the script for this line
    next
}
# from this point on, only lines of the second file are processed
# if an addErrorMessage call is found
/addErrorMessage\(".*"\)/ {
    # try each pair (a smarter loop checking only the needed replacements would be faster, but this one does the job)
    for ( TextOrg in Key ) {
        # index() searches literally, which avoids regex interpretation
        # this sample assumes one replacement per pair and per line (a loop is normally needed)
        Here = index( $0, "addErrorMessage(\"" TextOrg "\")" )
        if ( Here > 0 ) {
            # got a match: rebuild the line around the replaced call
            $0 = substr( $0, 1, Here - 1 ) \
                 "addErrorMessage(ResourceBundle.getBundle(\"/Bundle\", lcale).getString(\"" Key[ TextOrg ] "\"))" \
                 substr( $0, Here + Len[ TextOrg ] )
        }
    }
}
# print the line in its current state (modified or not)
7
' Trad.txt YourFile
Finally, this is a workaround rather than a complete solution, because many special cases can occur: a line like "ref: function addErrorMessage(\" ...\") bla bla" will be an issue, spaces inside the () are not handled here, nor is a call split across lines inside the (), ...
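For comparison, the same lookup-table idea can be sketched in Python. This is only an illustration under assumptions: the function name internationalize and the in-memory dict are made up here, and in practice the table would be loaded from a file such as Trad.txt. A replacement function is used so that nothing in the key is interpreted as a regex escape.

```python
import re

# matches addErrorMessage("...") and captures the text between the quotes
CALL = re.compile(r'addErrorMessage\(\s*"((?:[^"\\]|\\.)*)"\s*\)')

def internationalize(source, table):
    """Replace addErrorMessage("text") calls whose text is a key of table."""
    def swap(match):
        key = table.get(match.group(1))
        if key is None:
            return match.group(0)  # unknown text: leave the call untouched
        return ('addErrorMessage(ResourceBundle.getBundle("/Bundle", lcale)'
                '.getString("%s"))' % key)
    return CALL.sub(swap, source)

table = {"some text here": "key for text here"}
print(internationalize('addErrorMessage("some text here");', table))
```

Calls whose text is not in the table are left as they are, which makes it easy to run the script repeatedly while the table is being filled in.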

using Awk how to convert the count displayed in text to a link to some webpage

I am using awk to generate a report of error counts, but the count displayed should now act as a link to a web page. How do I convert this text count to a link, so that clicking the number takes me to some web page?
My code is:
awk -F "," '{
    names[$4]=$4
    excpCount[$4]+=$5
}END{
    n = asort(names,sorted)
    total=0
    for (i = 1; i <= n; i++) {
        jobName=sorted[i]
        if (jobName != "")
            printf("%43s %9d\n", jobName,<html><body>excpCount[jobName]</body></html>)
        total+=excpCount[jobName]
    }
    printf("%45s\n", " ")
    printf("%43s %15s\n", "-----------------", "----------")
    printf("%43s %15d\n", "Total Errors", total)
}' ~/ode.$$.tmp
excpCount[jobName] gives the number of times the exception occurs after generating the report. Can somebody help with this?
You need to write your static HTML code as a part of the format String. Only the variable replacements are given as the successive arguments. So your line should be as follows:
printf("%43s <html><body> %9d </body></html>\n", jobName, excpCount[jobName])
Note that the quotes that are part of the link are escaped with a '\'. Also, replace the dummy link with your original link.
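The same principle (static markup in the format string, values as arguments) can be sketched in Python. The function name report_row and the URL https://example.com/errors are placeholders invented for this illustration, not part of the original report.

```python
import html

def report_row(job_name, count, url):
    """Format one report row with the count wrapped in an HTML anchor."""
    # the markup is part of the format string; only the values are arguments
    return '%43s <a href="%s">%9d</a>' % (job_name, html.escape(url, quote=True), count)

print(report_row("nightlyJob", 12, "https://example.com/errors"))
```

Escaping the URL keeps a stray quote or ampersand in it from breaking the attribute.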

Using SQL like and % in Perl

When I use the following code, I only seem to print the last results from my array. I think it has something to do with my like clause and the % sign. Any ideas?
my @keywords = <IN>;
my @number = <IN2>;
foreach my $keywords (@keywords)
{
    chomp $keywords;
    my $query = "select *
        from table1 a, table2 b
        where a.offer = b.offer
        and a.number not in (@number)
        and a.title like ('%$keywords%')";
    print $query."\n";
    my $sth = $dbh->prepare($query)
        or die ("Error: Could not prepare sql statement on $server : $sth\n" .
                "Error: $DBI::errstr\n");
    $sth->execute
        or die ("Error: Could not execute sql statement on $server : $sth\n" .
                "Error: $DBI::errstr\n");
    while (my @results = $sth->fetchrow_array())
    {
        print OUT "$results[0]\t$results[1]\t$results[2]\t$results[3]\t",
                  "$results[4]\t$results[5]\t$results[6]\t$results[7]\t",
                  "$results[8]\n";
    }
}
close (OUT);
I'm guessing that your IN file was created on a Windows system, so has CRLF sequences (roughly \r\n) between the lines, but that you're running this script on a *nix system (or in Cygwin or whatnot). So this line:
chomp $keywords;
will remove the trailing \n, but not the \r before it. So you have a stray carriage-return inside your LIKE expression, and no rows match it.
If my guess is right, then you would fix it by changing the above line to this:
$keywords =~ s/\r?\n?\z//;
to remove any carriage-return and/or newline from the end of the line.
(You should also make the changes that innaM suggests above, using bind variables instead of interpolating your values directly into the query. But that change is orthogonal to this one.)
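As a rough illustration of both points (stripping the stray carriage return and using a bind variable), here is a small Python sketch. It uses the standard sqlite3 module as a stand-in for DBI, and the single-column table1 and the function name find_titles are simplifications assumed for the example.

```python
import sqlite3

def find_titles(conn, raw_keyword):
    """Return titles containing the keyword read from a text file line."""
    # rstrip removes a trailing "\n", "\r\n", or lone "\r"
    keyword = raw_keyword.rstrip("\r\n")
    # the ? placeholder binds the value instead of interpolating it into the SQL
    cur = conn.execute(
        "SELECT title FROM table1 WHERE title LIKE ? ORDER BY title",
        ("%" + keyword + "%",),
    )
    return [row[0] for row in cur]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (title TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?)", [("red shoes",), ("blue hat",)])
print(find_titles(conn, "shoes\r\n"))
```

Without the rstrip call, the pattern would contain "\r" and match no rows, which is the symptom described above.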
Show the output of the print $query and maybe we can help you. Better yet, show the output of:
use Data::Dumper;
$Data::Dumper::Useqq=1;
print Dumper($query);
Until then, your comment about "replaces the an of and" makes me think your input has carriage returns, and the use of @number is unlikely to work if there's more than one.

awk: concat string with number in dict value

I have the following awk one-liner:
{dict[$2"#"$6]=($(NF-2)/($(NF-2)+$NF))*100 } END {for (a in dict) { printf "%s %d :" , a, int(dict[a]) }}
What I need is to add to the value of each dictionary key the combination of
($(NF-2)/($(NF-2)+$NF))*100 " out of" $(NF-2)+$NF
So I want awk to do all the math, then compose a string and store it as the dictionary value. I already tried some combinations of spaces and brackets, but still no luck.
Vars are filled from the input stream:
$2 - host, not unique in the input stream
$3 - partition, not unique in the input stream
$NF - space avail
$(NF-2) - space used
$(NF-2)+$NF - gives you the overall capacity of the partition
Output is
80% host1#/local/1
Output expected:
80% host1#/local/1 out of 112G
----------------------Solution-----------------------------------
Thanks to the good catch below, I resolved this. The issue was that I did int() in the printf part, which truncated the output. However, I later ran into other problems with my wrap-around shell part, so my final code is different from what I expected when asking the question.
'{key=($2 "#" $6 " out of " int((($(NF-2)+$NF)/1000)/1000) "GB" ) ; dict[key]=($(NF-2)/($(NF-2)+$NF))*100 } END {for (a in dict) { printf "%s , %d :" , a, int(dict[a]) }}'
I've moved the "out of " and capacity part into the dictionary key, because the dict value cannot be a string in my case: further on I will compare it with an INT.
The concatenation is working fine. It's not the problem.
The problem is that you are calculating the int() of the dictionary value when you print. Since the value is a string, the result is truncated. If you need to use int() do it at the time you perform the calculations rather than at print time.
By the way, if you had provided some sample data it would have been a lot easier to test your code and provide an answer. This is especially important since it's sometimes the case, as it is here, that the problem is in a place that is not where it was anticipated.
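The advice to truncate while the value is still a number, and only then build the string, can be sketched in Python as follows. The function name summarize and the sample row are invented for the illustration, and sizes are assumed to be in a single unit.

```python
def summarize(rows):
    """rows: iterable of (host, partition, used, avail), sizes in one unit."""
    report = {}
    for host, partition, used, avail in rows:
        total = used + avail
        # truncate here, while the percentage is still a number
        percent = int(used / total * 100)
        # only now compose the string that is stored as the dictionary value
        report[host + "#" + partition] = "%d%% out of %d" % (percent, total)
    return report

print(summarize([("host1", "/local/1", 90, 22)]))
```

Because the stored value is already the finished string, the printing step needs no further numeric conversion.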

How to write this in idiomatic awk?

The following program prints out the name of the file, the number of rows, and the number of rows that begin with // in the case that more than one fifth of the rows begin that way.
awk '$1 == "//" { a+=1 } END { if (a * 5 >= NR) {print FILENAME " " NR " " a}}' MyClass.java
This works, but the nested {{}} make me question if I'm doing it right, knowing that the typical structure of an awk program is:
awk 'condition { actions }'
So I suspect that something like
awk '$1 == "//" { a+=1 } END && (a * 5 >= NR) {print FILENAME " " NR " " a}' MyClass.java
would be more appropriate, but every such attempt gives syntax errors. Is there a right way to do this, or is my approach as good as it gets?
There are other ways to express it, but you wrote it idiomatically the first time. Although the authors tend to omit braces whenever they can, you can still find examples of code like that throughout The AWK Programming Language. They should know.
It seems like Aho, Weinberger, and Kernighan have several centuries of development experience in languages whose syntax derives from C. And when they write something like this
if (a * 5 >= NR)
    print FILENAME " " NR " " a
it communicates perfectly that the block following the if statement is supposed to contain one and only one statement.
I have considerably fewer centuries of experience. Whenever I read something like that, it communicates perfectly that a) somebody forgot to type {}, and b) somebody else is about to introduce a bug by adding a statement to that block without adding the braces.
Over the years, I've trained myself to type this whenever I type an if.
if () {}
Then I go back and fill it in, breaking lines if I need to. In my normal editor, "if" expands automatically to "if () {}". I'm pretty sure I haven't omitted braces even once since the mid-1980s.