Insert text before the first non-comment-line before a pattern - awk

To update a system configuration file on a Linux server, I needed to add a rule (a line of text) before another given rule (and, if possible, before the comments preceding that other rule).
With the following input file:
# Foo
# Bar
# Comment about rule on the next line
RULE_A
# Comment about rule on the next line
# Continuation of comment
RULE_B
I want to get the following output:
# Foo
# Bar
# Comment about rule on the next line
RULE_A
# ADDED COMMENT
# ADDED COMMENT CONTINUATION
ADDED_RULE
# Comment about rule on the next line
# Continuation of comment
RULE_B
I ended up with the following combination of:
sed: to convert my multi-line text into a single line with \n escapes.
tac: to reverse the file.
awk: to do the actual work.
A temporary file that replaces the original file (because my awk doesn't have an "in-place" option).
CONF_FILEPATH="sample.conf"
# Create sample work file:
cat > "${CONF_FILEPATH}" <<EOT
# Foo
# Bar
# Comment about rule on the next line
RULE_1
# Comment about rule on the next line
# Continuation of comment
RULE_2
RULE_3_WITHOUT_COMMENT
RULE_4_WITHOUT_COMMENT
RULE_5_WITHOUT_COMMENT
# Comment about rule on the next line
RULE_6
# Comment about rule on the next line
# Continuation of comment
RULE_7
EOT
# Text (of new rule) to add:
TEXT_TO_ADD="# ADDED COMMENT
# ADDED COMMENT CONTINUATION
ADDED_RULE
"
# The rule before which we want to add our text:
BEFORE_RULE="RULE_7"
# Temporary file:
TMP_FILEPATH="$(mktemp)"
# Convert newlines to \n:
TEXT_TO_ADD_FOR_AWK="$(echo ${TEXT_TO_ADD} | tac | sed -E ':a;N;$!ba;s/\r{0,1}\n/\\n/g')"
# Process
awk 'BEGIN {
ADD_TO_LINE="";
}
{
if ($0 ~ "^'${BEFORE_RULE}'") {
# DEBUG: Got the "deny all" line
ADD_TO_LINE=NR+1 ;
print $0;
} else {
if (ADD_TO_LINE==NR) {
# DEBUG: Current line is the candidate
if ($0 ~ "#") {
ADD_TO_LINE=NR+1;
# DEBUG: Its a comment, wont add here, taking note to try on the next line
print $0;
} else {
# DEBUG: Not a comment: this is the place!
print "'${TEXT_TO_ADD_FOR_AWK}'";
ADD_TO_LINE="";
print $0;
}
} else {
print $0;
}
}
}' <(tac "${CONF_FILEPATH}") \
| tac > "${TMP_FILEPATH}"
# Overwrite:
cat "${TMP_FILEPATH}" > "${CONF_FILEPATH}"
# Cleaning up:
rm "${TMP_FILEPATH}"
I then get (look just before RULE_7):
# Foo
# Bar
# Comment about rule on the next line
RULE_1
# Comment about rule on the next line
# Continuation of comment
RULE_2
RULE_3_WITHOUT_COMMENT
RULE_4_WITHOUT_COMMENT
RULE_5_WITHOUT_COMMENT
# Comment about rule on the next line
RULE_6
# ADDED COMMENT
# ADDED COMMENT CONTINUATION
ADDED_RULE
# Comment about rule on the next line
# Continuation of comment
RULE_7
Which is OK, but I'm sure there is a cleaner/simpler way of doing that with awk.
Context: I am editing the /etc/security/access.conf to add an allow rule before the deny all rule.

Reading the file paragraph-wise makes things simpler:
awk -v text_to_add="$TEXT_TO_ADD" \
-v before_rule="$BEFORE_RULE" \
-v RS='' \
-v ORS='\n\n' \
'$0 ~ "\n" before_rule {print text_to_add} 1' file
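As a runnable sketch: paragraph mode needs blank lines between blocks, so this demo assumes a blank-line-separated variant of the sample file (the real access.conf layout may differ):

```shell
# Paragraph-mode demo; the blank lines between blocks are an assumption.
cat > sample.conf <<'EOT'
# Foo
# Bar

# Comment about rule on the next line
RULE_A

# Comment about rule on the next line
# Continuation of comment
RULE_B
EOT
awk -v text_to_add='# ADDED COMMENT
# ADDED COMMENT CONTINUATION
ADDED_RULE' \
    -v before_rule='RULE_B' \
    -v RS='' -v ORS='\n\n' \
    '$0 ~ "\n" before_rule {print text_to_add} 1' sample.conf
```

The added text lands before the whole comment block of RULE_B, because the match is tested against the entire paragraph.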
Get out of the habit of using ALLCAPS variable names; leave those as reserved by the shell. One day you'll write PATH=something and then wonder why your script is broken.
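A quick illustration (the directory name is hypothetical):

```shell
# Overwriting PATH makes later external commands unfindable; a subshell
# demonstrates this without breaking the current session:
status=$(sh -c 'PATH=/some/hypothetical/dir; ls' 2>/dev/null; echo $?)
echo "$status"   # 127 = "command not found": ls is no longer on PATH
```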

You never need sed when you're using awk:
text_to_add='# ADDED COMMENT
# ADDED COMMENT CONTINUATION
ADDED_RULE
'
before_rule='RULE_B'
awk -v rule="$before_rule" -v text="$text_to_add" '
/^#/ { cmt = cmt $0 ORS; next }
$0==rule { print text }
{ printf "%s%s\n", cmt, $0; cmt="" }
' file
# Foo
# Bar
# Comment about rule on the next line
RULE_A
# ADDED COMMENT
# ADDED COMMENT CONTINUATION
ADDED_RULE
# Comment about rule on the next line
# Continuation of comment
RULE_B
If you can have comments after the final non-comment line then just add END { printf "%s", cmt } to the end of the script.
Don't use all-caps variable names (see Correct Bash and shell script variable capitalization) and always quote shell variables (see https://mywiki.wooledge.org/Quotes). Copy/paste your original script into http://shellcheck.net and it'll tell you some of the issues.
Regarding ...because I don't have "in-place" option on awk from your question - GNU awk has -i inplace for that.
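For example (a minimal sketch; demo.conf and the inserted text are hypothetical, and -i inplace needs GNU awk 4.1 or newer):

```shell
# Hypothetical demo file; requires GNU awk 4.1+ for -i inplace.
printf 'RULE_A\nRULE_B\n' > demo.conf
if command -v gawk >/dev/null 2>&1; then
  gawk -i inplace '/RULE_B/{print "ADDED_RULE"} 1' demo.conf
fi
cat demo.conf
```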

ed, the standard editor, to the rescue! Because it looks at the entire file, not just a line at a time, it's able to move the current line cursor around forwards and backwards with ease:
ed -s input.txt <<EOF
/RULE_7/;?^[^#]?a
# ADDED COMMENT
# ADDED COMMENT CONTINUATION
ADDED_RULE
.
w
EOF
After this, input.txt looks like your desired result.
It first sets the current line to the first one containing RULE_7, then looks backwards for the first non-empty line above it that doesn't start with # (The line with RULE_6 in this case), and appends the desired text after that line. Then it writes the modified file back to disk.

Related

Printing sections with awk

I am using a bash script to print particular sections in a file defined by beg_ere and end_ere by matching title, subtitle, and keywords.
The user will set faml, subtitle, and keywords. When a match is encountered in beg_ere, the section gets printed.
For instance, consider
faml="DN"
subtitle="AMBIT"
keywords="resource"
Currently everything is printed except upon reaching end_ere.
spc='[[:space:]]*'
ebl='\\[' ; ebr='\\]' # for awk to apply '\[' and '\]'
pn_ere='^[[:space:]]*([#;!]+|#c|//)[[:space:]]+'
## :- modifier, use GPH if parameters are unset or empty (null).
nfaml=${faml:-"[[:graph:]]+"} # Use GPH if FAML null ("" or '')
nasmb=${asmb:-"[[:graph:]]+"} # Use GPH if ASMB null ("" or '')
nkeys=${keys:-".*"} # Use GPH if KEYS null ("" or '')
local pn_ere="^[[:space:]]*([#;!]+|#c|//)[[:space:]]+"
beg_ere="${pn_ere}(${nfaml}) ${ebl}(${nasmb})${ebr}${spc}(${nkeys})$"
end_ere="${pn_ere}END OF ${nfaml} ${ebl}${nasmb}${ebr}${spc}$"
awk -v beg_ere="$beg_ere" -v pn_ere="$pn_ere" -v end_ere="$end_ere" \
'$0 ~ beg_ere {
title=gensub(beg_ere, "\\2", 1, $0);
subtitle=gensub(beg_ere, "\\3", 1, $0);
keywords=gensub(beg_ere, "\\4", 1, $0);
display=1;
next
}
$0 ~ end_ere { display=0 ; print "" }
display { sub(pn_ere, "") ; print }
' "$filename"
An example file would be
## DN [AMBIT] bash
## hodeuiihoedu
## AVAL:
## + ooeueocu
## END OF DN [AMBIT]
## NAVAID: Pattern Matching (Cogent)
## Cogent Convincing by virtue of clear and thorough presentation.
find ~/Opstk/bin/gungadin-1.0/ -name '*.rc'
-exec grep --color -hi -C 8 -e \"EDV\" -e \"GUN\" {} \+
## DN [AMBIT] bash,resource,rysnc
## hodeuiihoedu
## AVAL:
## + ooeueocu
## END OF DN [AMBIT]
## NAVAID: Pattern Matching (Cogent)
## Cogent Convincing by virtue of clear and thorough presentation.
I want to test for the keywords supplied in keywords. Currently the match against beg_ere assumes that the string in keywords is matched exactly. But the user-supplied keywords could be in the wrong order.
I want to be able to specify keys="bash,rsync". If there is a match with the begin section, the corresponding section gets printed.

How to inplace substitute the content between 2 tags with SED (bash)?

I want to inplace edit a file with sed (Oracle-Linux/Bash).
The content between 2 search-tags (in form of "#"-comments) should get commented out.
Example:
Some_Values
#NORMAL_LISTENER_START
LISTENER =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = IPC)
(KEY = LISTENER)
)
)
)
#NORMAL_LISTENER_END
Other_Values
Should result in:
Some_Values
#NORMAL_LISTENER_START
# LISTENER =
# (DESCRIPTION =
# (ADDRESS = (PROTOCOL = IPC)
# (KEY = LISTENER)
# )
# )
# )
#NORMAL_LISTENER_END
Other_Values
The following command already achieves it, but it also puts a comment+blank in front of the search-tags:
sed -i "/#NORMAL_LISTENER_START/,/#NORMAL_LISTENER_END/ s/^/# /" ${my_file}
Now my research told me to exclude those search-tags like:
sed -i '/#NORMAL_LISTENER_START/,/#NORMAL_LISTENER_END/{//!p;} s/^/# /' ${my_file}
But it won't work - with the following message as a result:
sed: -e expression #1, char 56: extra characters after command
I need those SearchTags to be as they are, because I need them afterwards again.
If ed is available/acceptable.
printf '%s\n' 'g/#NORMAL_LISTENER_START/+1;/#NORMAL_LISTENER_END/-1s/^/#/' ,p Q | ed -s file.txt
Change Q to w if you're satisfied with the output and in-place editing will occur.
Remove the ,p If you don't want to see the output.
This might work for you (GNU sed):
sed '/#NORMAL_LISTENER_START/,/#NORMAL_LISTENER_END/{//!s/^/# /}' file
Use a range, delimited by two regexp and insert # before the lines between the regexps but not including the regexps.
Alternative:
sed '/#NORMAL_LISTENER_START/,/#NORMAL_LISTENER_END/{s/^[^#]/# &/}' file
Or if you prefer:
sed '/#NORMAL_LISTENER_START/{:a;n;/#NORMAL_LISTENER_END/!s/^/# /;ta}' file
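A runnable demo of the first variant (GNU sed; the sample file is trimmed from the question):

```shell
cat > listener.ora <<'EOT'
Some_Values
#NORMAL_LISTENER_START
LISTENER =
(DESCRIPTION =
#NORMAL_LISTENER_END
Other_Values
EOT
# Inside the range, // reuses the address regex, so the tag lines
# themselves are excluded from the substitution:
sed '/#NORMAL_LISTENER_START/,/#NORMAL_LISTENER_END/{//!s/^/# /}' listener.ora
```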
With your shown samples only, please try the following awk code. A simple explanation: look for the specific strings and set/unset a variable accordingly, prepend # to the lines while the variable is set, and stop updating lines once we reach the line from which no more lines need to be updated.
awk ' /Other_Values/{found=""} found{$0=$0!~/^#/?"#"$0:$0} /Some_Values/{found=1} 1' Input_file
The above will print the output on the terminal; once you are happy with the results, you could run the following code to save the changes in place into Input_file.
awk ' /Other_Values/{found=""} found{$0=$0!~/^#/?"#"$0:$0} /Some_Values/{found=1} 1' Input_file > temp && mv temp Input_file
Explanation: Adding detailed explanation for above.
awk ' ##Starting awk program from here.
/Other_Values/{ found="" } ##If line contains Other_Values then nullify found here.
found { $0=$0!~/^#/?"#"$0:$0 } ##If found is SET then check if line already has # then leave it as it is OR add # in starting to it.
/Some_Values/{ found=1 } ##If line contains Some_Values then set found here.
1 ##Printing current line here.
' Input_file ##Mentioning Input_file name here.
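A runnable demo of the one-liner on a trimmed version of the question's sample (note it prepends # without a trailing space):

```shell
cat > Input_file <<'EOT'
Some_Values
#NORMAL_LISTENER_START
LISTENER =
(DESCRIPTION =
#NORMAL_LISTENER_END
Other_Values
EOT
awk '/Other_Values/{found=""} found{$0=$0!~/^#/?"#"$0:$0} /Some_Values/{found=1} 1' Input_file
```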

sed match a pattern and insert newline followed by replacement text

Let's say my input file es-service has the following lines:
# Comment 1
key1=value1
# Comment 3
key3=value3
If the pattern key2=value2 is not present in the above file, then add it after key1=value1
Hence, the file should now have:
# Comment 1
key1=value1
# Comment 2
key2=value2
# Comment 3
key3=value3
I came up with the following to achieve it:
if ! grep -qxF 'key2=value2' es-service;
then sed -i "/key1/a \n# Comment 2\nkey2=value2" es-service
fi
The problem is the first \n after /a doesn't insert a newline. Hence I end up getting the below:
key1=value1
n# Comment 2
key2=value2
instead of
key1=value1
# Comment 2
key2=value2
Edit:
I eventually solved it by adding one more sed to match Comment 2 and add a newline before it by using option i.
if ! grep -qxF 'key2=value2' es-service;
then sed -i "/key1/a \n# Comment 2\nkey2=value2" es-service; sed -i '/# Comment 2/i\ ' es-service
fi
All in awk, using a loop:
awk '/key2=/ {f=1} /key1=/ {n=NR} {a[NR]=$0} END {for(i=1;i<=NR;i++) {print a[i];if(i==n && !f) print "\n# Comment 2\nkey2=value2"}}' file
# Comment 1
key1=value1
# Comment 2
key2=value2
# Comment 3
key3=value3
/key2=/ {f=1} if key2= is found, set flag f to prevent double insertion.
/key1=/ {n=NR} if key1 is found, store the line number in n.
a[NR]=$0 store every line in array a.
END after the file is read, do:
for(i=1;i<=NR;i++) loop through all lines, then
print a[i] print the line and
if(i==n && !f) if this is the line where key1 was found and flag f is not set, do:
print "\n# Comment 2\nkey2=value2" print the extra information.
Could you please try the following; this code will take care of any missing keys (if the keys are not continuous in their sequence) and add them with their comment numbers too.
awk '
BEGIN{
FS="="
}
!NF{
print
next
}
/^# Comment/{
val=$0
next
}
/^key/{
first_col=$1
sub(/[a-zA-Z]+/,"",first_col)
while(first_col!=prev+1){
prev++
print "# Comment "prev ORS "key"prev"=value"prev ORS
}
prev=first_col
print val ORS $0
}
' Input_file
A GNU awk solution without a loop:
awk -v RS= -v ORS='\n\n' 'NR>1 && a~/key1/ && !/key2/ {print "# Comment 2\nkey2=value2"} 1; {a=$0}' file
# Comment 1
key1=value1
# Comment 2
key2=value2
# Comment 3
key3=value3
-v RS= -v ORS='\n\n' set the record separator to empty (paragraph mode) and the output record separator to two newlines.
NR>1 && a~/key1/ && !/key2/ skip the first block and test whether the previous block contains key1 and the current block does not contain key2, then
print "# Comment 2\nkey2=value2" add the new block.
1; is always true, so it will print every block.
a=$0 store the block in variable a to use for the test on the next block.
sed's a and i commands are tough to inline. So this just uses an s/// replacement with & for the match data. In other words, s/.*/&\n.../ where ... is the text you want to append.
sed -i '/key1/s/.*/&\n# Comment 2\nkey2=value2/' es-service
Alternately:
You can use s///e to construct a shell command to generate output to be placed in the stream.
sed -i '/key1/s/.*/printf "&\n# Comment 2\nkey2=value2\n"/e' es-service
So I'm replacing .* with printf "&\n followed by what you'd like to insert.
e then executes the printf and sticks the output in the stream. I thought e was GNU-sed-only, but it's working for me with --posix.
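Putting the grep guard and the first s/// form together as a runnable sketch (GNU sed for -i and \n in the replacement; es-service is recreated here for the demo):

```shell
printf '# Comment 1\nkey1=value1\n# Comment 3\nkey3=value3\n' > es-service
grep -qxF 'key2=value2' es-service ||
  sed -i '/key1/s/.*/&\n# Comment 2\nkey2=value2/' es-service
cat es-service
```

Re-running it is safe: once key2=value2 exists, the grep short-circuits and sed never runs.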

Extract specific line of code using awk (or non-awk) from a log file

I am trying to find a way to extract scripts from the log file generated.
I am stuck at a place where a command calls for multiple files and the script separates them with a trailing "\" for line continuity. For example, a sample script is:
my_command -option \
file1 \
file2 \
file3
my_command2 .. ..
It looked easy but somehow the trick is not hitting me at this point.
Please help.
Every line in the log starts with a specific identifier for command, like:
:: Script_Command:: my_command -option \
:: file1 \
:: file2 \
:: file3
:: Info lines....
:: More info lines ...
:: Script_Command:: my_command2 ... ..
:: Info lines ...
So I used:
awk '/Script_Command/ {print }'
And then I tried to combine it with a if condition with:
awk '/Script_Command/ {print substr(length(),1)}'
But the entire thing is not falling in place.
Please help.
Edit:
The closest I got is here:
awk '{if ($NF=="\\" || == "Script_Command::") print ;}' file
It still leaves the file3 line as it does not match anything.
Pure intention is:
1. When Script_Command is matched, print line.
2. When "\" is matched, print the next line.
3. When both are matched, print line and next line.
You can use sed for this:
sed -n '/Script_Command/ {:a;/\\$/!be;N;ba;:e;p;}'
Breakdown
# -n disables auto printing.
sed -n '/Script_Command/ { # Match regex
:a # Define label 'a'
/\\$/!be # Goto 'e' unless pattern space ends with \
N # Append next line to pattern space
ba # Goto 'a'
:e # Define label 'e'
p # Print pattern space
}'
You can add [[:space:]]* to /\\$/!be if you want to match lines ending in a backslash followed by zero or more spaces:
/\\[[:space:]]*$/!be
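A runnable check against the question's log (GNU sed syntax for the ;-separated labels):

```shell
cat > sample.log <<'EOT'
:: Script_Command:: my_command -option \
:: file1 \
:: file2 \
:: file3
:: Info lines....
:: Script_Command:: my_command2 ... ..
:: Info lines ...
EOT
# Prints each Script_Command line plus its backslash-continued lines:
sed -n '/Script_Command/ {:a;/\\$/!be;N;ba;:e;p;}' sample.log
```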
If each continuation line starts with ::, one can use awk like this:
awk '!/^::/ { p = 0 } # Set p = 0 if line does not start with ::
/Script_Command/{ p = 1 } # Set p = 1 when line contains Script_Command
p' # Print if p is true
I finally got this working with the following command:
awk '/\\/ && /Script_Command/ {print $0;getline;print $0;next} /Script_Command/ {print $0;next} /\\/ {getline;print $0}'

Trying to modify awk code

awk 'BEGIN{OFS=","} FNR == 1
{if (NR > 1) {print fn,fnr,nl}
fn=FILENAME; fnr = 1; nl = 0}
{fnr = FNR}
/ERROR/ && FILENAME ~ /\.gz$/ {nl++}
{
cmd="gunzip -cd " FILENAME
cmd; close(cmd)
}
END {print fn,fnr,nl}
' /tmp/appscraps/* > /tmp/test.txt
The above scans all files in a given directory and prints each file's name, the number of lines in it, and the number of lines containing 'ERROR'.
I'm now trying to make the script execute a command if any of the files it reads isn't a regular file; i.e., if the file is a gzip file, then run a particular command.
Above is my attempt to include the gunzip command and do it on my own. Unfortunately, it isn't working. Also, I cannot "gunzip" all the files in the directory beforehand, because not all files in the directory will be gzip files; some will be regular files.
So I need the script to treat any .gz file it finds differently so that it can read it, count and print the number of lines in it, and the number of lines matching the supplied pattern (just as it would if the file had been a regular file).
Any help?
This part of your script makes no sense:
{if (NR > 1) {print fn,fnr,nl}
fn=FILENAME; fnr = 1; nl = 0}
{fnr = FNR}
/ERROR/ && FILENAME ~ /\.gz$/ {nl++}
Let me restructure it a bit and comment it so it's clearer what it does:
{ # for every line of every input file, do the following:
# If this is the 2nd or subsequent line, print the values of these variables:
if (NR > 1) {
print fn,fnr,nl
}
fn = FILENAME # set fn to FILENAME. Since this will occur for the first line of
# every file, this is that value fn will have when printed above,
# so why not just get rid of fn and print FILENAME?
fnr = 1 # set fnr to 1. This is immediately over-written below by
# setting it to FNR so this is pointless.
nl = 0
}
{ # for every line of every input file, also do the following
# (note the unnecessary "}" then "{" above):
fnr = FNR # set fnr to FNR. Since this will occur for the first line of
# every file, this is that value fnr will have when printed above,
# so why not just get rid of fnr and print FNR-1?
}
/ERROR/ && FILENAME ~ /\.gz$/ {
nl++ # increment the value of nl. Since nl is always set to zero above,
# this will only ever set it to 1, so why not just set it to 1?
# I suspect the real intent is to NOT set it to zero above.
}
You also have the code above testing for a file name that ends in ".gz" but then you're running gunzip on every file in the very next block.
Beyond that, just call gunzip from shell as everyone else also suggested. awk is a tool for parsing text, it's not an environment from which to call other tools - that's what a shell is for.
For example, assuming your comment (prints the file name, number of lines in each file and number of lines found containing 'ERROR') accurately describes what you want your awk script to do, and assuming it makes sense to test for the word "ERROR" directly in a ".gz" file using awk:
for file in /tmp/appscraps/*.gz
do
awk -v OFS=',' '/ERROR/{nl++} END{print FILENAME, NR+0, nl+0}' "$file"
gunzip -cd "$file"
done > /tmp/test.txt
Much clearer and simpler, isn't it?
If it doesn't make sense to test for the word ERROR directly in a ".gz" file, then you can do this instead:
for file in /tmp/appscraps/*.gz
do
zcat "$file" | awk -v file="$file" -v OFS=',' '/ERROR/{nl++} END{print file, NR+0, nl+0}'
gunzip -cd "$file"
done > /tmp/test.txt
To handle gz and non-gz files as you've now described in your comment below:
for file in /tmp/appscraps/*
do
case $file in
*.gz ) cmd="zcat" ;;
* ) cmd="cat" ;;
esac
"$cmd" "$file" |
awk -v file="$file" -v OFS=',' '/ERROR/{nl++} END{print file, NR+0, nl+0}'
done > /tmp/test.txt
I left out the gunzip since you don't need it as far as I can tell from your stated requirements. If I'm wrong, explain what you need it for.
I think it could be simpler than that.
With shell expansion you already have the file name (hence you can print it).
So you can do a loop over all the files, and for each do the following:
print the file name
zgrep -c ERROR $file (this outputs the number of lines containing 'ERROR')
zcat $file|wc -l (this will output the line count)
zgrep and zcat work on both plain text files and gzipped ones.
Assuming you don't have any spaces in the paths/filenames:
for f in /tmp/appscraps/*
do
n_lines=$(zcat "$f"|wc -l)
n_errors=$(zgrep -c ERROR "$f")
echo "$f $n_lines $n_errors"
done
This is untested but it should work.
You can execute the following command for each file:
gunzip -t FILENAME; echo $?
It will print the exit code: 0 (for gzip files) or 1 (corrupt/other file). Now you can compare the output in an if to execute the required processing.
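For example (plain.txt and comp.gz are throwaway demo files):

```shell
# gunzip -t tests integrity without extracting; exit status 0 means valid
# gzip data, non-zero (typically 1) means corrupt or not gzip at all.
printf 'hello\n' > plain.txt
printf 'hello\n' | gzip > comp.gz
gunzip -t comp.gz   2>/dev/null && echo "comp.gz: gzip data"
gunzip -t plain.txt 2>/dev/null || echo "plain.txt: not gzip data"
```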