interview questions on sed or awk [closed] - awk

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 6 years ago.
Hi, I recently crashed and burned on an interview question on file editing, and it has been bugging me. I searched the forum but could not quite get it to work.
The question was:
testfile.txt has the following:
some test text 1
some test text 2
some test text 3
Change testfile.txt so that it looks like this (without using vi or gedit):
some test text 1
some more test text
some test text 2
some more test text
some test text 3
some more test text
I have tried to use sed and awk but the text does not come out where it needs to be. Any help would be greatly appreciated.

This might work for you (GNU sed):
sed -i 'asome more test text' file
This appends the line some more test text after every line and edits the original file in place.
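For seds that lack -i or the one-line form of the a command, the same edit can be sketched with the POSIX spelling (a\ followed by the text on its own line) plus a temporary file. The scratch file from mktemp and the .new suffix are assumptions for illustration, not part of the original answer.

```shell
# Scratch copy standing in for testfile.txt (hypothetical path from mktemp).
tmp=$(mktemp)
printf 'some test text %s\n' 1 2 3 > "$tmp"

# POSIX form of the append command: "a\" and then the text on the next line.
# Write to a second file and move it over the original to emulate -i.
sed 'a\
some more test text' "$tmp" > "$tmp.new" && mv "$tmp.new" "$tmp"

result=$(cat "$tmp")
rm -f "$tmp"
printf '%s\n' "$result"
```

Writing to a second file first means the original is only replaced if sed succeeds.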

Given:
$ echo "$txt"
some test text 1
some test text 2
some test text 3
In awk:
$ echo "$txt" | awk '{print; print "some more test text"}'
some test text 1
some more test text
some test text 2
some more test text
some test text 3
some more test text

Posting this for the original question, which asked about inserting a second line into a text file.
Here is one way to do it with sed:
$ seq 3 | sed 2iinserted
1
inserted
2
3
For inserting text after every line:
$ seq 3 | sed 1~1ainserted
1
inserted
2
inserted
3
inserted
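Here 2i inserts the text before line 2, and 1~1 is the GNU first~step address, which matches every line. If you don't understand what this is, read some of the tutorials; the internet is full of them.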

Given that the output is matched against the repeating phrase some test text, I think
sed 's/some test text.*/&\nsome more test text/' file
will do
some test text 1
some more test text
some test text 2
some more test text
some test text 3
some more test text
The & character in the RHS of the substitute command stands for all the characters matched and inserts them into the output. We add \nsome more test text to put the new text on a new line, and we're good.
Note that non-GNU seds may not recognize \n as a newline in the replacement. The portable form is a backslash followed by a literal newline; from the command line I have used the key combination Ctrl-V Ctrl-M to type the line break. (I've never seen it explained how ^M is converted in the file to ^J, but it goes back to the 80s, so go figure.)
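As a sketch of the portable spelling for seds that do not recognize \n in the replacement: POSIX sed accepts a backslash followed by an actual newline there. The input is generated with printf here only to keep the example self-contained.

```shell
# Same substitution, but the newline in the replacement is a real line break
# escaped with a backslash instead of the GNU-style \n.
out=$(printf 'some test text %s\n' 1 2 3 |
  sed 's/some test text.*/&\
some more test text/')

printf '%s\n' "$out"
```

In a script file the line break is typed directly, so no special key combination is needed.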
IHTH

Related

Print next word after pattern match for two strings in same line for one file in se [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed last month.
I have different lines in one file and want to achieve the output below: for each line, print the word that follows aaaa and the word that follows test, separated by the delimiter ",".
Input is
Line Aaaa orange test match
Colour Aaaa banana test sun
Ball Aaaa guava test Saturday
Basket Aaaa tomato test sunset
Output has to be
Orange ,match
Banana ,sun
Guava, Saturday
Tomato,sunset
Could anyone please help with this?
I have tried using sed and grep commands but I didn't get the expected output.
perl, using look-behinds:
perl -nE 'if (/(?<=aaaa )(\w+).*(?<=test )(\w+)/i) {say "$1,$2"}' file
awk solution
Assuming you want to ignore lines where "Aaaa" does not occur, the following awk process should achieve your needs (if the relative positions of "Aaaa" and "test" are constant - see edit below):
awk '{for(i=1;i<NF;i++) if ($i=="Aaaa") {print $(i+1) ", " $(i+3); next}}' file
explanation
Each line of file is processed (by default) as white-space separated fields. A loop examines each field for the required pattern ("Aaaa") and (if found) prints the values of the next field, the required comma, and the value of the final required field.
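To make the field arithmetic concrete, here is a small sketch on one of the sample lines. The first awk call only numbers the fields (it is an illustration, not part of the answer); the second is the command from above.

```shell
line='Colour Aaaa banana test sun'

# Number each whitespace-separated field to show what $i, $(i+1), $(i+3) refer to.
fields=$(printf '%s\n' "$line" |
  awk '{for (i = 1; i <= NF; i++) printf "%d=%s%s", i, $i, (i < NF ? " " : "\n")}')

# The answer's command: "Aaaa" is field 2, so it prints field 3 and field 5.
picked=$(printf '%s\n' "$line" |
  awk '{for(i=1;i<NF;i++) if ($i=="Aaaa") {print $(i+1) ", " $(i+3); next}}')

printf '%s\n%s\n' "$fields" "$picked"
# 1=Colour 2=Aaaa 3=banana 4=test 5=sun
# banana, sun
```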
Edit
For cases where the position of "test" may also vary, provided "test" never comes before "Aaaa", the following procedure should work:
awk '{for(i=1;i<NF;i++) {if ($i=="Aaaa") {line= $(i+1) ", "; } if ($i=="test") {line=line $(i+1); print line;next}}}' file
It will need reworking if "test" can come before "Aaaa" as follows:
awk '{partA=partB=""; for(i=1;i<NF;i++) {if($i=="Aaaa") partA=$(i+1); if($i=="test") partB=$(i+1); if(partB && partA) {print partA ", " partB; next}}}' file
This version requires that both "Aaaa" and "test" are present, but they can be in any order on the line.
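A quick sketch of the any-order behaviour, using a made-up line in which "test" comes before "Aaaa" (this line is an assumption for illustration and is not in the question's data):

```shell
# Hypothetical reordered input: "test" precedes "Aaaa".
out=$(printf '%s\n' 'Line test match Aaaa orange' |
  awk '{partA=partB=""; for(i=1;i<NF;i++) {if($i=="Aaaa") partA=$(i+1); if($i=="test") partB=$(i+1); if(partB && partA) {print partA ", " partB; next}}}')

printf '%s\n' "$out"
# orange, match
```

The word after "Aaaa" is still printed first because the output order is fixed in the print statement, not by the order of the words on the line.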
(each) tested on file:
Line Aaaa orange test match
Colour Aaaa banana test sun
Ball Aaaa guava test Saturday
Basket Aaaa tomato test sunset
output (for all three versions)
orange, match
banana, sun
guava, Saturday
tomato, sunset
(using GNU Awk 5.1.0)

Delete lines that contain only spaces [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 5 months ago.
The community reviewed whether to reopen this question 5 months ago and left it closed:
Original close reason(s) were not resolved
Consider following file.
Dalot
# Really empty line
Eriksen
# Line containing space
Malacia
# Line containing tab
# Really empty line
Varane
How do I remove the lines that contain ONLY spaces or tabs, while leaving truly empty lines intact? The other answers here mostly remove all empty lines as well as those containing only spaces and tabs.
Following is desired output.
Dalot
# Really empty line
Eriksen
Malacia
# Really empty line
Varane
Using awk:
awk '/^$/ || NF' file
sed -E '/^[\t ]+$/d' file
i.e. "If the line contains spaces and tabs only, remove it".
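A sketch of both ideas on the question's sample, built with printf so the space-only and tab-only lines are explicit. The awk version works because NF is 0 on a whitespace-only line while /^$/ still lets truly empty lines through. The sed line uses the POSIX class [[:blank:]] instead of \t, which is a substitution of mine for portability, not the answer's exact command.

```shell
# Sample: Dalot, empty, Eriksen, one space, Malacia, one tab, empty, Varane.
make_sample() { printf 'Dalot\n\nEriksen\n \nMalacia\n\t\n\nVarane\n'; }

# Print a line if it is truly empty or has at least one field.
awk_out=$(make_sample | awk '/^$/ || NF')

# Delete lines made up of one or more blanks (spaces or tabs) and nothing else.
sed_out=$(make_sample | sed '/^[[:blank:]]\{1,\}$/d')

printf '%s\n' "$awk_out"
```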
This might work for you (GNU sed):
sed '/\S/!d' file
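Delete all lines that do not contain at least one non-whitespace character.
Alternative:
sed '/^\s*$/d' file
Note that both commands also delete truly empty lines. To keep those, as the question asks, require at least one whitespace character:
sed '/^\s\+$/d' file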
$ awk '/^$|[^\t ]/' file
or
$ sed -En '/^$|[^\t ]/p' file
Print the line if it has no characters or contains at least one non-whitespace character.
Note that @choroba did the reverse of this logic to delete lines, which is actually smarter.
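tr -d ' \t' < file > f2
mv f2 file
Note that tr needs -d to delete characters, and that this strips spaces and tabs from every line, so whitespace-only lines become empty lines rather than being removed.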

Print lines in a file with delimiters [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 7 months ago.
This post was edited and submitted for review 7 months ago and failed to reopen the post:
Original close reason(s) were not resolved
input.txt
hello
cruel
world
I want to print all the lines from the above file such that they have a beginning and ending string added along with a delimiter.
BEGIN='
END='
DELIMITER=|
Expected output:
'hello'|'cruel'|'world'
I would use GNU AWK for this task in the following way. Let the content of file.txt be
hello
cruel
world
then
awk '{printf "%s\047%s\047",(NR>1?"|":""),$0}' file.txt
gives output
'hello'|'cruel'|'world'
Explanation: I use printf with 2 places to fill (denoted by %s) and 2 ' characters (as they have special meaning in the shell we cannot use a bare ' but must use the escaped version, \047). The so-called ternary operator (condition?valueiftrue:valueiffalse) supplies | for lines after the first (NR>1) and an empty string otherwise to fill the 1st place, and the content of the whole line ($0) fills the 2nd place.
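Since the question lists BEGIN, END and DELIMITER as separate values, a variant of the same printf idea can take them as awk variables. This is only a sketch: the names b, e and d are mine, and the END block adds the trailing newline that the one-liner above leaves out.

```shell
# b = beginning string, e = ending string, d = delimiter (hypothetical names).
out=$(printf 'hello\ncruel\nworld\n' |
  awk -v b="'" -v e="'" -v d='|' '
    { printf "%s%s%s%s", (NR > 1 ? d : ""), b, $0, e }  # delimiter only before lines 2..n
    END { print "" }                                    # finish the line
  ')

printf '%s\n' "$out"
# 'hello'|'cruel'|'world'
```

Passing the quote through -v avoids the \047 escape inside the program text.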
(tested in gawk 4.2.1)

how to remove part of the string if the condition exists [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 2 years ago.
I have a file similar to this:
A*01:03:05
B1*02:06:08
F2*03:01:06
R5*02:01
S1*02:08
I would like to remove the last 2 digits and the preceding colon, but only when there are 2 colon separators, so that it becomes:
A*01:03
B1*02:06
F2*03:01
R5*02:01
S1*02:08
The last 2 lines remain unchanged because they do not have 2 colon separators after the *, so no changes are made to those values
I used sed and gsub to remove everything after the last underscore but was not sure how to add a condition to skip the lines that do not have 2 colons after the *.
This might work for you (GNU sed):
sed 's/:..//2' file
This removes the second occurrence of a : followed by 2 characters.
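A short sketch of what the numeric flag does on lines with two colons and with one: the trailing 2 tells s to act only on the second match of :.. in each line, so a line with a single match is left alone.

```shell
out=$(printf '%s\n' 'A*01:03:05' 'B1*02:06:08' 'R5*02:01' |
  sed 's/:..//2')   # 1st match is skipped, 2nd match (if any) is deleted

printf '%s\n' "$out"
# A*01:03
# B1*02:06
# R5*02:01
```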
If this is too lax, use:
sed -E 's/^([^:]*:[^:]*):[0-9]{2}/\1/' file
With cut, you can set : as the delimiter and print only up to the first two fields
cut -d: -f-2 ip.txt
Similar logic can be done with awk, assuming the implementation supports manipulating NF
awk 'BEGIN{FS=OFS=":"} NF==3{NF=2} 1' ip.txt
This works:
$ sed -E 's/^([^:]*:[^:]*):[0-9][0-9]$/\1/' file
The [^:] means 'any character other than a :' so it works by making the deletion at the end only if there are two leading colons.
This awk works too:
$ awk 'gsub(/:/,":")==2 {sub(/:[0-9][0-9]$/,"")} 1' file
In this case, gsub returns the number of replacements made. So if there are two colons, delete the ending.
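To see the return value on its own, this sketch prints the count that gsub reports for each line. Replacing ":" with ":" leaves the text unchanged, so the call acts purely as a counter.

```shell
counts=$(printf '%s\n' 'A*01:03:05' 'R5*02:01' |
  awk '{ n = gsub(/:/, ":"); print n, $0 }')   # n = number of colons on the line

printf '%s\n' "$counts"
# 2 A*01:03:05
# 1 R5*02:01
```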
You can also use GNU grep (with PCRE) to only match the template of what you are looking for:
$ grep -oP '^\w+\*\d\d:\d\d' file
Or perl same way:
$ perl -lnE 'say "$1" if /(^\w+\*\d\d:\d\d)/' file

Search for a string which is stretched over 2 lines [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 2 years ago.
I am searching for the string ABCDEFGH in a very large file like the one below, and I don't know at which position the line break falls. My first thought was to remove all \n, but the file is over 3 GB ... I think there is a smart way to do this (sed, awk, ...?)
efhashivia
hjasjdjasd
oqkfoflABC
DEFGHqpclq
pasdpapsda
Assuming that your search string cannot span more than 2 lines, you can use this awk:
awk -v p="ABCDEFGH" 's $0 ~ p {print NR,s $0} {s=$0}' file
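As a sketch of how this behaves on the sample: awk keeps the previous line in s, joins it to the current line, and tests the joined text, so only two lines are held in memory at a time. The parentheses around the concatenation are added here for readability; the command above relies on concatenation binding tighter than ~.

```shell
out=$(printf '%s\n' efhashivia hjasjdjasd oqkfoflABC DEFGHqpclq pasdpapsda |
  awk -v p="ABCDEFGH" '
    (s $0) ~ p { print NR, s $0 }   # previous line + current line contains p
    { s = $0 }                      # remember this line for the next record
  ')

printf '%s\n' "$out"
# 4 oqkfoflABCDEFGHqpclq
```

The reported number is the line on which the match ends. Because p is used as a regular expression, a search string containing regex metacharacters would need escaping.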
Or you can paste each line with its next one and grep the result. This way you have to create a file twice the size of your large input.
tail -n +2 file | paste -d '' file - > output.txt
> cat output.txt
efhashiviahjasjdjasd
hjasjdjasdoqkfoflABC
oqkfoflABCDEFGHqpclq
DEFGHqpclqpasdpapsda
pasdpapsda
> grep -n ABCDEFGH output.txt
3:oqkfoflABCDEFGHqpclq