Replace each occurrence of a string in a file, add an additional line at the first line of that file - awk

I searched and found how to replace each occurrence of a string in files. Besides that, I want to add one line to a file only at the first occurrence of the string.
I know this
grep -rl 'windows' ./ | xargs sed -i 's/windows/linux/g'
will replace each occurrence of the string. So how do I add a line to that file at the first match of the string? Does anyone have an idea how to do that? I appreciate your time.
Edited:
Example: replace xxx with TTT in the file, and add a line at the start of the file for the first match.
Input : file1, file2.
file1
abc xxx pp
xxxy rrr
aaaaaaaaaaaddddd
file2
aaaaaaaaaaaddddd
Output
file1
#ADD LINE HERE FOR FIRST MATCH DONT ADD FOR REST OF MATCHES
abc TTT pp
TTTy rrr
aaaaaaaaaaaddddd
file2
aaaaaaaaaaaddddd

Cribbing from the answers to this question.
Something like this would seem to work:
sed -e '0,/windows/{s/windows/linux/g; p; T e; a \new line
;:e;d}; s/windows/linux/g'
From start of the file to the first match of /windows/ do:
replace windows with linux
print the line
if s/windows/linux/ did not replace anything jump to label e
add the line new line
create label e
delete the current pattern space, read the next line and start processing again
Alternatively:
awk '{s=$0; gsub(/windows/, "linux")} 7; (s ~ /windows/) && !w {w=1; print "new line"}' file
save the line in s
replace windows with linux
print the line (7 is true and any true pattern runs the default action of {print})
if the original line contained windows and w is false (variables are empty strings by default and empty strings are false-y in awk)
set w to 1 (truth-y value)
add the new line

If I understand you correctly, all you need is:
find . -type f -print |
while IFS= read -r file; do
awk 'gsub(/windows/,"unix"){if (!f) $0 = $0 ORS "an added line"; f=1} 1' "$file" > tmp &&
mv tmp "$file"
done
Note that the above, like sed and grep would, is working with REs, not strings. To use strings would require the use of index() and substr() in awk, is not possible with sed, and with grep requires an extra flag.
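As a rough sketch of the string-based route mentioned above, index() and substr() can be combined into a literal replace. The function name replace_all and the sample text are made up for illustration, and awk still processes backslash escapes in values passed with -v.

```shell
#!/bin/sh
# Literal (non-regex) replacement in POSIX awk.
# replace_all is a hypothetical helper name, not a built-in.
literal=$(printf 'a.b axb a.b\n' | awk -v old='a.b' -v new='X' '
function replace_all(s, old, new,    out, i) {
    out = ""
    # index() finds a plain substring, so "." is not a wildcard here.
    while ((i = index(s, old)) > 0) {
        out = out substr(s, 1, i - 1) new
        s = substr(s, i + length(old))
    }
    return out s
}
{ print replace_all($0, old, new) }')
printf '%s\n' "$literal"
```

With a regex, a.b would also have matched axb; here only the two literal a.b strings are replaced.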
To add a leading line to the file if a change is made, using GNU awk for multi-char RS (and we may as well do sed-like inplace editing since we're using gawk):
find . -type f -print |
while IFS= read -r file; do
gawk -i inplace -v RS='^$' -v ORS= 'gsub(/windows/,"unix"){print "an added line\n"} 1' "$file"
done
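If gawk is not available, roughly the same effect can be had with any POSIX awk by buffering the file and deciding about the leading line at the end. This is only a sketch: the temporary directory, the file names and the text "#ADDED LINE" are placeholders modelled on the example in the question, and the whole file is held in memory.

```shell
#!/bin/sh
# Sample files modelled on file1/file2 from the question (placeholders).
tmpdir=$(mktemp -d)
printf 'abc xxx pp\nxxxy rrr\naaaaaaaaaaaddddd\n' > "$tmpdir/file1"
printf 'aaaaaaaaaaaddddd\n' > "$tmpdir/file2"

for f in "$tmpdir/file1" "$tmpdir/file2"; do
    awk '
        # Replace on every line and remember whether anything changed.
        { if (gsub(/xxx/, "TTT")) changed = 1; line[NR] = $0 }
        END {
            # Emit the extra line once, at the top, only for changed files.
            if (changed) print "#ADDED LINE"
            for (i = 1; i <= NR; i++) print line[i]
        }
    ' "$f" > "$f.new" && mv "$f.new" "$f"
done

out1=$(cat "$tmpdir/file1")
out2=$(cat "$tmpdir/file2")
printf '%s\n---\n%s\n' "$out1" "$out2"
rm -rf "$tmpdir"
```

file1 gains the leading line and has both matches replaced, while file2 is written back unchanged.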

Related

With sed or awk, move line matching pattern to bottom of file

I have a similar problem. I need to move a line in /etc/sudoers to the end of the file.
The line I am wanting to move:
#includedir /etc/sudoers.d
I have tried with a variable
#creates variable value
templine=$(cat /etc/sudoers | grep "#includedir /etc/sudoers.d")
#delete value
sed '/"${templine}"/d' /etc/sudoers
#write value to the bottom of the file
cat ${templine} >> /etc/sudoers
Not getting any errors nor the result I am looking for.
Any suggestions?
With awk:
awk '$0=="#includedir /etc/sudoers.d"{lastline=$0;next}{print $0}END{print lastline}' /etc/sudoers
That says:
If the line $0 is "#includedir /etc/sudoers.d" then set the variable lastline to this line's value $0 and skip to the next line next.
If you are still here, print the line {print $0}
Once every line in file is processed, print whatever is in the lastline variable.
Example:
$ cat test.txt
hi
this
is
#includedir /etc/sudoers.d
a
test
$ awk '$0=="#includedir /etc/sudoers.d"{lastline=$0;next}{print $0}END{print lastline}' test.txt
hi
this
is
a
test
#includedir /etc/sudoers.d
You could do the whole thing with sed:
sed -e '/#includedir .etc.sudoers.d/ { h; $p; d; }' -e '$G' /etc/sudoers
This might work for you (GNU sed):
sed -n '/regexp/H;//!p;$x;$s/.//p' file
This removes line(s) containing a specified regexp and appends them to the end of the file.
To only move the first line that matches the regexp, use:
sed -n '/regexp/{h;$p;$b;:a;n;p;$!ba;x};p' file
This uses a loop to read/print the remainder of the file and then append the matched line.
If you have multiple entries which you want to move to the end of the file, you can do the following:
awk '/regex/{a[++c]=$0;next}1;END{for(i=1;i<=c;++i) print a[i]}' file
or
sed -n '/regex/!{p;ba};H;:a;${x;s/.//;p}' file
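For completeness, the variable-based attempt from the question can be made to work too. It failed because the single quotes stop the shell from expanding ${templine}, sed without -i only writes to stdout, and cat ${templine} tries to open a file named after the line instead of printing it. A hedged sketch of a corrected version follows; it works on a scratch copy (sudoers.copy is a made-up name) rather than on /etc/sudoers itself.

```shell
#!/bin/sh
tmpdir=$(mktemp -d)
f="$tmpdir/sudoers.copy"          # scratch file standing in for /etc/sudoers
printf 'hi\nthis\n#includedir /etc/sudoers.d\na\ntest\n' > "$f"

line='#includedir /etc/sudoers.d'
# -x matches whole lines only, -F treats the text as a fixed string.
if grep -qxF -e "$line" "$f"; then
    # Everything except the line, followed by the line itself.
    { grep -vxF -e "$line" "$f"; printf '%s\n' "$line"; } > "$f.new" &&
        mv "$f.new" "$f"
fi

moved=$(cat "$f")
printf '%s\n' "$moved"
rm -rf "$tmpdir"
```

For the real /etc/sudoers it would be wise to validate the result (for example with visudo -c) before replacing the file.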

How can I print only lines that are immediately preceded by an empty line in a file using sed?

I have a text file with the following structure:
bla1
bla2

bla3
bla4

bla5
So you can see that some lines of text are preceded by an empty line.
I understand that sed has the concept of two buffers, a pattern space buffer and a hold space buffer, so I'm guessing these need to come in to play here, but I'm unclear how to specify them to accomplish what I need.
In my contrived example above, I'd expect to see the following lines outputted:
bla3
bla5
sed is for doing s/old/new/ on individual lines, that is all. Any time you start talking about buffers or doing anything related to multi-line comparisons you're using the wrong tool.
You could do this with awk:
$ awk -v RS= -F'\n' 'NR>1{print $1}' file
bla3
bla5
but it would fail to print the first non-empty line if the first line(s) in the file were empty. If you want lines consisting only of space chars to be considered empty lines, this may be what you want:
$ awk 'NR>1 && NF && !p{print} {p=NF}' file
bla3
bla5
and this otherwise:
$ awk 'NR>1 && ($0!="") && (p==""){print} {p=$0}' file
bla3
bla5
All of the above will work even if there are multiple empty lines preceding any given non-empty line.
To see the difference between the 3 approaches (which you won't see given the sample input in the question):
PS1> printf '\nfoo\n \nbar\n\netc\n' | cat -E
$
foo$
$
bar$
$
etc$
PS1> printf '\nfoo\n \nbar\n\netc\n' | awk -v RS= -F'\n' 'NR>1{print $1}'
etc
PS1> printf '\nfoo\n \nbar\n\netc\n' | awk 'NR>1 && NF && !p{print} {p=NF}'
foo
bar
etc
PS1> printf '\nfoo\n \nbar\n\netc\n' | awk 'NR>1 && ($0!="") && (p==""){print} {p=$0}'
foo
etc
You can use the hold buffer easily to print the line before the blank like this:
sed -n -e '/^$/{x; p;}' -e h input
But I don't see an easy way to use it for your use case. For your case, instead of using the hold buffer, you could do:
sed -n -e '/^$/ba' -e d -e :a -e n -e p input
But I would do this with awk.
awk 'NR!=1{print $1}' RS= FS=\\n input-file
awk 'p;{p=/^$/}' file
The above command does this for each line:
if p is 1, print the line;
set p to 1 if the line is empty, otherwise to 0.
If lines consisting of one or more spaces are also considered empty:
awk 'p;{p=!NF}' file
to print non-empty lines each coming right after an empty line, you can use this:
awk 'p*!(p=/^$/)' file
if p is 1 and this line is not empty (1*!(0) = 1*1 = 1), print this line;
otherwise (1*!(1) = 1*0 = 0, 0*anything = 0), don't print anything.
note that this one may not work with all awks, a portable version of this would look like:
awk 'p*(/./);{p=/^$/}' file
if lines consisting of one or more spaces are also considered empty:
awk 'p*NF;{p=!NF}' file
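A quick way to see how the two portable variants differ is to feed them a sample that contains a space-only line. The input below is invented for this check and is not the data from the question.

```shell
#!/bin/sh
# Line 2 holds a single space; line 4 is truly empty.
sample=$(printf 'bla1\n \nbla2\n\nbla3\n')

# Strict: only a truly empty line counts as "empty".
strict=$(printf '%s\n' "$sample" | awk 'p*(/./);{p=/^$/}')

# Loose: lines made up only of whitespace count as empty too.
loose=$(printf '%s\n' "$sample" | awk 'p*NF;{p=!NF}')

printf 'strict:\n%s\nloose:\n%s\n' "$strict" "$loose"
```

The strict form reports only bla3, while the loose form also reports bla2 because the space-only line before it is treated as empty.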
If sed/awk is not mandatory, you can do it with grep:
grep -A 1 '^$' input.txt | grep -v -E '^$|^--$'
You can use sed to match a range of lines and do sub-matches inside the matches, like so:
# - use the "-n" option to omit printing of lines
# - match lines between a blank line (/^$/) and a non-blank one (/^./),
# then print only the line that contains at least a character,
# i.e, the non-blank line.
sed -ne '
/^$/,/^./ {
/^./{ p; }
}' input.txt
Tested with GNU sed, with your data in a file named 'a':
$ sed -nE '/^$/{N;s/\n(.+)/\1/p}' a
bla3
bla5
To edit the file in place, add the -i option before -n.

Removing blank lines

I have a csv file in which every other line is blank. I have tried everything, but nothing removes the lines. What should make it easier is that the digits 44 appear in each valid line. Things I have tried:
grep -ir 44 file.csv
sed '/^$/d' <file.csv
cat -A file.csv
sed 's/^ *//; s/ *$//; /^$/d' <file.csv
egrep -v "^$" file.csv
awk 'NF' file.csv
grep '\S' file.csv
sed 's/^ *//; s/ *$//; /^$/d; /^\s*$/d' <file.csv
cat file.csv | tr -s \n
Decided I was imagining the blank lines, but import into Google Sheets and there they are still! Starting to question my sanity! Can anyone help?
sed -n -i '/44/p' file
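-n suppresses the automatic printing of every line
-i edits the file in place (overwrites the same file)
/44/p prints only the lines that contain '44'
If you cannot rely on '44' being present, delete the blank lines instead:
sed -i '/^\s*$/d' file
\s matches whitespace, ^ the start of the line, $ the end of the line, and d deletes the line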
Use the -i option to replace the original file with the edited one.
sed -i '/^[ \t]*$/d' file.csv
Alternatively, output to another file and rename it, which is essentially what -i does.
sed '/^[[:blank:]]*$/d' file.csv > file.csv.out && mv file.csv.out file.csv
Given:
$ cat bl.txt
Line 1 (next line has a tab)
Line 2 (next has several space)
Line 3
You can remove blank lines with Perl:
$ perl -lne 'print unless /^\s*$/' bl.txt
Line 1 (next line has a tab)
Line 2 (next has several space)
Line 3
awk:
$ awk 'NF>0' bl.txt
Line 1 (next line has a tab)
Line 2 (next has several space)
Line 3
sed + tr:
$ cat bl.txt | tr '\t' ' ' | sed '/^ *$/d'
Line 1 (next line has a tab)
Line 2 (next has several space)
Line 3
Just sed:
$ sed '/^[[:space:]]*$/d' bl.txt
Line 1 (next line has a tab)
Line 2 (next has several space)
Line 3
Aside from the fact that your commands do not show that you capture their output in a new file to be used in place of the original, there's nothing wrong with them, EXCEPT that:
cat file.csv | tr -s \n
should be:
cat file.csv | tr -s '\n' # more efficient alternative: tr -s '\n' < file.csv
Otherwise, the shell eats the \ and all that tr sees is n.
Note, however, that the above eliminates only truly empty lines, whereas some of your other commands also eliminate blank lines (empty or all-whitespace).
Also, the -i (for case-insensitive matching) in grep -ir 44 file.csv is pointless, and while using -r (for recursive searches) will not change the fact that only file.csv is searched, it will prepend the filename followed by : to each matching line.
If you have indeed captured the output in a new file and that file truly still has blank lines, the cat -A (cat -et on BSD-like platforms) you already mention in your question should show you if any unusual characters are present in the file, in the form of ^<char> sequences, such as ^M for \r chars.
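If cat -A does show ^M at the end of the lines, the "blank" lines are not really empty: each holds a carriage return, so /^$/ never matches them. That is only one possible cause, and the sample data below is invented, but under that assumption a sketch of the cleanup could look like this:

```shell
#!/bin/sh
tmpdir=$(mktemp -d)
# Hypothetical CSV with Windows (CRLF) line endings, every other line "blank".
printf '1,44,a\r\n\r\n2,44,b\r\n\r\n' > "$tmpdir/file.csv"

# The lines that look blank contain \r, so nothing is deleted here.
kept_naive=$(sed '/^$/d' "$tmpdir/file.csv" | awk 'END { print NR }')

# Strip the carriage returns first, then drop the lines that are now empty.
cleaned=$(tr -d '\r' < "$tmpdir/file.csv" | awk 'NF')

printf 'lines kept by the naive sed: %s\n' "$kept_naive"
printf '%s\n' "$cleaned"
rm -rf "$tmpdir"
```

The naive sed keeps all four lines, while the tr plus awk pipeline leaves just the two data rows.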
If you like awk, this should do:
awk '/44/' file
It will only print lines that contain 44

How to grep from within specified character range in line and then print entire line

I have a file with multiple rows; each row contains 3400 characters. I want to grep for something within a specified character range. Let's say I want to grep for "pavan" between characters 14 and 25 of the line.
To do this I can simply do the following:
cat filename | cut -c 14-25 | grep pavan
but this way the complete line is not printed.
I tried to use an awk command, but it does not work since the lines have more than 3000 characters.
I want to print the complete line as well, so that I can perform further operations on it.
awk -v pattern="pavan" 'match( substr($0, 14, 12), pattern )' file
Will print the matching lines.
A more complicated way of doing the same thing:
awk -v patt="pavan" -v start=14 -v end=25 '
match($0,patt) && start <= RSTART && RSTART <= end-RLENGTH
' file
-- stricken due to valid commentary from Ed Morton.
Some bit of arithmetic and you could use grep:
grep -E '^.{13}.{0,7}pavan' filename
This would match lines containing pavan between the specified character range.
It essentially matches 13 arbitrary characters at the beginning of a line. Then looks for pavan that can be preceded by 0 to 7 arbitrary characters.
This is not very elegant, but does work!
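The same window test can be parameterised in awk so that the arithmetic is done by the program. This is a sketch only: the variable names s, e and pat are arbitrary, the four sample lines are generated for the check, and index() is used so that the search text is taken literally instead of as a regex.

```shell
#!/bin/sh
# Four sample lines: "pavan" starts at column 14 (A), 21 (B), 22 (C) and 13 (D).
input=$(printf '%013dpavan A\n%020dpavan B\n%021dpavan C\n%012dpavan D\n' 0 0 0 0)

# Keep whole lines whose columns s..e contain pat completely.
hits=$(printf '%s\n' "$input" |
    awk -v s=14 -v e=25 -v pat='pavan' 'index(substr($0, s, e - s + 1), pat)')

printf '%s\n' "$hits"
```

Only lines A and B are printed in full; in C the word runs past column 25 and in D it starts before column 14.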
Start off with what you had, but remove the unnecessary cat:
cut -c 14-25 file
now get awk to find the string you want and print the line number:
cut -c 14-25 file | awk '/pavan/{print NR}'
Now you have a list of all the line numbers that you want. You can either process them in a while loop, like this:
cut -c 14-25 file | awk '/pavan/{print NR}' | while read -r line; do
echo "$line"
sed -n "${line}p" file
done
or put them in an array
lines=($(cut -c 14-25 file | awk '/pavan/{print NR}'))
echo "${lines[@]}"

Using grep and awk to search and print the output to new file

I have 100 files and want to search for a specific word in the first column of each file and print the content of all columns for this word to a new file.
I tried this code, but it doesn't work well: it prints the content of only one file, not all of them:
ls -t *.txt > Filelist.tmp
cat Filelist.tmp | while read line do; grep "searchword" | awk '{print $0}' > outputfile.txt; done
This is what you want:
$ awk '$1~/searchword/' *.txt >> output
This compares the first field against searchword and appends the line to output if it matches. The default field separator with awk is whitespace.
The main problem with your attempt is that you are overwriting the file every time with >; you want to be appending with >>.
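Because a single awk invocation reads all the files, one redirection is enough and no loop is needed. A small self-contained check, with made-up file names and contents, might look like this:

```shell
#!/bin/sh
tmpdir=$(mktemp -d)
# Two hypothetical input files; only lines whose FIRST column matches count.
printf 'searchword 1 2\nother 3 4\n' > "$tmpdir/a.txt"
printf 'x searchword 9\nsearchword 5 6\n' > "$tmpdir/b.txt"

# One awk process handles every file, so a single > collects all matches.
awk '$1 ~ /searchword/' "$tmpdir"/*.txt > "$tmpdir/output"

result=$(cat "$tmpdir/output")
printf '%s\n' "$result"
rm -rf "$tmpdir"
```

The line "x searchword 9" is skipped because the word is in the second column. Keeping the output file name outside the *.txt pattern avoids awk reading its own output.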