Find strings with awk in the squid log [duplicate]

This question already has answers here: How do I use shell variables in an awk script?
I want to do the following with the code below: when I find either of these two strings (at least one of them), I want to get the IP from the same line and write it to a .txt file so that I can handle it in squid.conf.
I'm trying to build a splash page in squid, and I only have the features of IPcop. The code I put up does not work because it matches every line, not just the ones with the strings I need. Can anyone help?
#!/bin/sh
TAIL="/usr/bin/tail -f"
SQUID="/var/log/squid/access.log"
PRINCIPAL1="http://cartilha.cert.br/"
PRINCIPAL2="cartilha.cert.br:443"
LOG="/tmp/autenticados.txt"
$TAIL $SQUID | gawk '{if ($7 = $PRINCIPAL1 || $7 = $PRINCIPAL2) {print $3} }' >> $LOG

With the -v variable option it does not accept the conditional.
I do not think this topic is a duplicate, because the question they linked to has no conditional whatsoever.
I also tried putting == instead of just =, but that way it outputs nothing.
The idea is simple: when one of the links above is accessed, I need to capture the IP that accessed it and write it to a txt file, that's all.
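For reference, a minimal sketch of the corrected script: shell variables go into gawk with -v, and == (comparison) replaces = (assignment). This assumes, as in the original script, that the URL is field 7 and the client IP is field 3 of access.log:

#!/bin/sh
TAIL="/usr/bin/tail -f"
SQUID="/var/log/squid/access.log"
PRINCIPAL1="http://cartilha.cert.br/"
PRINCIPAL2="cartilha.cert.br:443"
LOG="/tmp/autenticados.txt"
# -v makes the shell values visible inside the awk program
$TAIL $SQUID | gawk -v p1="$PRINCIPAL1" -v p2="$PRINCIPAL2" \
    '$7 == p1 || $7 == p2 { print $3 }' >> "$LOG"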


Print all lines except those matching a variable with awk [duplicate]

This question already has answers here: How to remove the lines which appear on file B from another file A?
I have the following for getting the lines matching a variable using awk:
for i in `cat file1`; do awk -v id="$i" '$0 ~ id' file2; done
How can I do the opposite?
Getting the lines that DON'T match? I think I should use ! somewhere, but don't know where.
File 1 looks like this:
5NX1D
5NX3D
4NTYB
File 2 looks like this:
2R9PA IVGGYTCEENS
2RA3C RPDFCLEPPYT
6HARE YVDYKDDDDKE
4NTYB EYVDYKDDDDD
Output should look like this:
2R9PA IVGGYTCEENS
2RA3C RPDFCLEPPYT
6HARE YVDYKDDDDKE
This is the standard awk join idiom, except that we print when a match was not found (an anti-join).
awk 'NR==FNR { a[$1]; next }
!($1 in a)' file1 file2 >newfile2
This is a very standard awk idiom, but very briefly: the line number within the current file, FNR, is equal to the overall input line number, NR, only while we are traversing the first input file, file1. While that holds, we add the line's first field as a key in the array a and skip to the next line. Otherwise, when we fall through, we are no longer in the first file, and we print if the first field $1 is not a key in a.
Your original script would be more idiomatic and a lot more efficient phrased similarly (just without the !), too.
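For example, a sketch of that positive-match version, replacing the shell loop with a single awk pass (note this tests field 1 for an exact match rather than matching id anywhere in the line, assuming the IDs always appear in the first field):

awk 'NR==FNR { a[$1]; next }
     ($1 in a)' file1 file2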

How do I use the awk command within a script to save permanent modifications [duplicate]

This question already has answers here: Save modifications in place with awk
How do I use the awk command to make permanent modifications to a file? I have been using:
awk '/'"'"'test'"'"' =>./{c++}(c==2){sub("'"'"'test'"'"' =>.","'"'"'test'"'"' => '"'"'test1'"'"',")}1' testfile
I have been using the above command to make temporary changes that only show up in the output. But I want to use it within a script file and make permanent changes to the file, similar to sed -i.
Before this gets closed as a dup, let's at least clean up your code. This:
awk '/'"'"'test'"'"' =>./{c++}(c==2){sub("'"'"'test'"'"' =>.","'"'"'test'"'"' => '"'"'test1'"'"',")}1' testfile
is extremely hard to read. I assume all those '"'"' sequences are attempts to get single quotes into the code. If so, to improve clarity, and so it'd work if/when the script is stored in a file, use the octal representation \047 for every single quote instead:
awk '/\047test\047 =>./{c++} (c==2){sub("\047test\047 =>.","\047test\047 => \047test1\047,")}1' testfile
Now use regexp delimiters for the regexp that's the first arg to sub():
awk '/\047test\047 =>./{c++} (c==2){sub(/\047test\047 =>./,"\047test\047 => \047test1\047,")}1' testfile
There are several other possible improvements, including using a backreference instead of hard-coding the original sub() string in the replacement, and using match() so you don't need to test for the same regexp in the condition part of the script and then again in the sub(). So something like this (with GNU awk for the 3rd arg to match()) is probably all you need:
awk 'match($0,/(.*\047test\047 =>.)(.*)/,a){c++} c==2{$0=a[1] "\047test1\047" a[2]} 1' testfile
but without sample input/output we can't know for sure. Post a new question with sample input/output if you'd like more help.
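As for making the changes permanent (the original question): standard awk has no -i option, so the usual approaches are a temporary file, or GNU awk's inplace extension. A sketch, where '...' stands for the program above:

awk '...' testfile > testfile.tmp && mv testfile.tmp testfile
gawk -i inplace '...' testfile    # GNU awk 4.1+ only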

user input inside awk -- or -- variable usage in search pattern [duplicate]

This question already has answers here: get the user input in awk
I'm trying to take user input in pure [g]awk code. The requirement is that the user enters either today's date or the current date minus a number of days, to generate a report. I can't find any routine inside awk to read the user's input. Some time back I read a document on awk where it was done using either sprintf or printf, but I don't know how.
OR
in awk, I'm using the BEGIN block to set up a variable and then searching based on it, but the variable-based search doesn't work. Something like below:
awk -F "|" ' BEGIN { PWR="Nov 3"; }
/Deployment started at PWR/ { print $1 + $NF }' /var/log/deployments
this flatly refuses to find the pattern "Deployment started at Nov 3", because PWR inside the slashes is taken literally.
Inside the regex slashes, you don't have access to your variables. What you can do is make a string out of the search phrase then apply that string as a regex.
awk -F "|" ' BEGIN { PWR="Nov 3"; }
$0 ~ "Deployment started at "PWR { print $1 + $NF }' /var/log/deployments

Change FS and RS to parse newline char [duplicate]

This question already has answers here: Read lines from a file into a Bash array [duplicate]
I'm using awk in a shell script to parse a file.
My question has been marked as a duplicate of another one, but I want to use awk, and the question linked there is not the same.
Here is the file format:
Hi everyone I'm new\n
Can u help me please\n
to split this file\n
with awk ?\n
The result I hope:
tab[0]=Hi everyone I'm new
tab[1]=Can u help me please
tab[2]=to split this file
tab[3]=with awk ?
So I tried changing the FS and RS values to get what I wanted, but without success. Here is what I tried:
config=`cat $1`
tab=($(echo $config | awk '
{
for (i = 1; i < (NF); i++)
print $i;
}'))
And what I get:
Hi
everyone
I'm
new
Can
u
help
me
please
to
split
this
file
with
awk
Do u know how to proceed please ? :/
The problem is that however you parse the file in awk, it's returned to the shell as a simple string.
AWK splits a file into records (lines ending in \n, by default), and records are further split into fields (separated by FS, a space by default).
In order to assign the returned string to an array, you need to set the shell's IFS to a newline, or assign the lines to array items one by one (you could select a record with NR, but that would require reading the file several times with AWK).
Your best course of action is to just print the records in AWK and assign them to a bash array using a compound assignment, with IFS set to the newline character:
#!/bin/bash
declare -a tab
IFS='
'
# Compound assignment: array=(words)
# Print record: { print } is the same as { print $0 }
# where $0 is the record and $1 ... $N are the fields in the record
tab=($(awk '{ print }' file))
unset IFS
for index in "${!tab[@]}"; do
echo "${index}: ${tab[index]}"
done
# Output:
# 0: Hi everyone I'm new
# 1: Can u help me please
# 2: to split this file
# 3: with awk ?
Notice that awk is hardly used at all here, and could be replaced with a simple cat.
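As an aside, with bash 4+ the whole thing can be done with the mapfile builtin, which reads a file into an array one line per element without touching IFS:

mapfile -t tab < file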

Linux parsing space delimited log files

I need to parse apache-access log files which have 16 space-delimited columns, that is,
xyz abc ... ... home?querystring
I need to count the total number of hits for each page in that file, that is, the total number of home page hits, ignoring the querystring.
For a few lines the URL is column 16, and for others it's 14 or 15. Hence I need to parse each line in reverse order (get the last column, ignore the query string of the last column, aggregate page hits).
I am new to Linux and shell scripting. How do I approach this? Do I have to look into awk or shell scripting? Can you give a small sample of code that would perform such a task?
ANSWER: perl one liner solved the problem
perl -lane | scalar array
Well for starters, if you are only interested in working on columns 14-16, I would start by running:
cut -d\  -f14-16 <input_file.log> | awk '{ one = match($1,/www/)
    two = match($2,/www/)
    three = match($3,/www/)
    if (one)
        print $1
    else if (two)
        print $2
    else if (three)
        print $3 }'
Note: there are two spaces after the d\
You can then pretty easily just count up the URLs that you see. I also think this would be solved a lot more easily using a few lines of Python or Perl.
You can read line by line of input using the read bash command:
while read my_variable; do
echo "The text is: $my_variable"
done
To get input from a specific file, use the input redirect <:
while read my_variable; do
echo "The text is: $my_variable"
done < my_logfile
Now, to get the last column, you can use the ${var##* } construction. For example, if the variable my_var is the string some_file_name, then ${my_var##*_} is the same string, but with everything before (and including) the last _ deleted.
We come up with:
while read line; do
echo "The last column is: ${line##* }"
done < my_logfile
If you want to echo it to another file, use the >> redirect:
while read line; do
echo "The last column is: ${line##* }" >> another_file
done < my_logfile
Now, to take away the querystring, you can use the same technique:
while read line; do
last_column="${line##* }"
url="${last_column%%\?*}"
echo "The last column without querystring is: $url" >> another_file
done < my_logfile
This time, we have %%\?* instead of ##*\? because we want to delete what's after the first ?, instead of what's before (and including) the last one. (Note that I have escaped the ? character, which is special to bash.) You can read all about it here.
I didn't understand where to get the page hits, but I think the main idea is there.
EDIT: Now the code works. I had forgotten the do bash keyword. Also, we need to use >> instead of > in order not to overwrite another_file every time we do echo "..." > another_file; by using >>, we append to the file. I have also corrected the %% instead of ##.
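Putting the pieces together, here is a sketch that also aggregates the page hits, assuming bash 4+ for the associative array (file and variable names are illustrative):

#!/bin/bash
declare -A hits
while read -r line; do
    last_column="${line##* }"                  # keep only the last column
    url="${last_column%%\?*}"                  # strip the querystring
    hits[$url]=$(( ${hits[$url]:-0} + 1 ))     # count hits per page
done < my_logfile
for url in "${!hits[@]}"; do
    echo "$url: ${hits[$url]}"
done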
It's hard to say without a few lines of concrete sample input and expected output, but it sounds like all you need is:
awk -F'[ ?]' '{sum[$(NF-1)]++} END{for (url in sum) print url, sum[url]}' file
For example:
$ cat file
xyz abc ... ... http://www.google.com?querystring
xyz abc ... ... some other http://www.google.com?querystring1
xyz abc ... some stuff we ignore http://yahoo.com?querystring1
$
$ awk -F'[ ?]' '{sum[$(NF-1)]++} END{for (url in sum) print url, sum[url]}' file
http://www.google.com 2
http://yahoo.com 1