Search for a string which is stretched over 2 lines [closed] - awk

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 2 years ago.
I am searching for the string ABCDEFGH in a very large file like the one below, and I don't know at which position the line break falls. My first thought was to remove all \n, but the file is over 3 GB. I think there is a smart way to do this (sed, awk, ...?)
efhashivia
hjasjdjasd
oqkfoflABC
DEFGHqpclq
pasdpapsda

Assuming that your search string cannot span more than 2 lines, you can use this awk:
awk -v p="ABCDEFGH" 's $0 ~ p {print NR,s $0} {s=$0}' file
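To see what this one-liner reports, here is a minimal sketch that runs it against the five sample lines from the question. The temporary file and the variable names `sample` and `result` are only for illustration.

```shell
# Recreate the five-line sample from the question in a temporary file.
sample=$(mktemp)
printf '%s\n' efhashivia hjasjdjasd oqkfoflABC DEFGHqpclq pasdpapsda > "$sample"

# s holds the previous line, so (s $0) is the previous and current line joined.
# NR is the number of the line on which the match ends.
result=$(awk -v p="ABCDEFGH" 's $0 ~ p {print NR, s $0} {s = $0}' "$sample")
echo "$result"

rm -f "$sample"
```

Note that the reported number is the second line of the pair, and that a match lying entirely within one line is reported twice (once joined with the previous line, once with the next).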
Or you can paste each line together with the next one and grep the result. This approach creates an intermediate file roughly twice the size of your large input.
tail -n +2 file | paste -d '' file - > output.txt
> cat output.txt
efhashiviahjasjdjasd
hjasjdjasdoqkfoflABC
oqkfoflABCDEFGHqpclq
DEFGHqpclqpasdpapsda
pasdpapsda
> grep -n ABCDEFGH output.txt
3:oqkfoflABCDEFGHqpclq
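If the doubled intermediate file is a concern, the same idea can probably be streamed through a pipe instead. This is a sketch only: the temporary file and the variable name `hits` are illustrative, and the input is still read twice (once by tail, once by paste).

```shell
# Recreate the five-line sample from the question in a temporary file.
sample=$(mktemp)
printf '%s\n' efhashivia hjasjdjasd oqkfoflABC DEFGHqpclq pasdpapsda > "$sample"

# Join each line with its successor and search the stream directly,
# so no doubled copy of the input is written to disk.
# '\0' is the POSIX spelling for an empty paste delimiter.
hits=$(tail -n +2 "$sample" | paste -d '\0' "$sample" - | grep -n ABCDEFGH)
echo "$hits"

rm -f "$sample"
```

Here the line number printed by grep refers to the first line of the matching pair.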

Related

Print lines in a file with delimiters [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 7 months ago.
input.txt
hello
cruel
world
I want to print all the lines from the above file such that they have a beginning and ending string added along with a delimiter.
BEGIN='
END='
DELIMITER=|
Expected output:
'hello'|'cruel'|'world'
I would use GNU AWK for this task in the following way. Let file.txt contain
hello
cruel
world
then
awk '{printf "%s\047%s\047",(NR>1?"|":""),$0}' file.txt
gives output
'hello'|'cruel'|'world'
Explanation: I use printf with two placeholders (denoted by %s) and two single quotes. Because ' already delimits the awk program in the shell, a bare ' cannot be used, so I write its octal escape \047 instead. The so-called ternary operator (condition?valueiftrue:valueiffalse) supplies | for every line after the first (NR>1) and an empty string otherwise. That value fills the first placeholder, and the content of the whole line ($0) fills the second.
(tested in gawk 4.2.1)
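The question describes the quote characters and the delimiter as settings, so one possible variation passes them in with -v rather than hard-coding them. This is a sketch under assumptions: `b`, `e` and `d` are hypothetical awk variable names standing in for the question's BEGIN, END and DELIMITER, and the END block adds a trailing newline that the command above does not print.

```shell
# b, e and d are hypothetical names for the BEGIN, END and DELIMITER settings.
out=$(printf '%s\n' hello cruel world |
  awk -v b="'" -v e="'" -v d="|" '
    { printf "%s%s%s%s", (NR > 1 ? d : ""), b, $0, e }
    END { print "" }   # finish the single output line with a newline
  ')
echo "$out"
```

Keep in mind that awk processes backslash escapes in -v values, so a delimiter containing a backslash would need extra care.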

How do you delete all lines that contain no letters from the alphabet using either grep, sed, or awk? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 4 years ago.
Remove all lines that don't contain a letter from the alphabet (upper or lower case)
Input :
34
76
0hjjAby68xp
H5e
895
Output :
0hjjAby68xp
H5e
With GNU grep:
grep '[[:alpha:]]' file
or GNU sed:
sed '/[[:alpha:]]/!d' file
Output:
0hjjAby68xp
H5e
Using awk:
$ awk '/[[:alpha:]]/' file
0hjjAby68xp
H5e
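All of the commands above print the filtered lines to standard output and leave the file untouched. If the goal is to delete the lines from the file itself, one portable option is to filter into a temporary file and then move it over the original. A minimal sketch, with the file names created by mktemp purely for illustration:

```shell
# Sample input from the question, written to a temporary file.
f=$(mktemp)
printf '%s\n' 34 76 0hjjAby68xp H5e 895 > "$f"

# Filter into a second temporary file; replace the original only if
# grep succeeded (grep exits non-zero when no line matches).
tmp=$(mktemp)
grep '[[:alpha:]]' "$f" > "$tmp" && mv "$tmp" "$f"

kept=$(cat "$f")
echo "$kept"

rm -f "$f" "$tmp"
```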

Script to grep on range of start/end text [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 6 years ago.
I am trying to write a script to grep output from a range based on text, not line numbers.
For instance, in my text file, I want to capture everything from the line containing $hostname up to the line containing $endText, and then write that range to a file named $hostname.txt.
Since you didn't provide any details, here is the boilerplate.
$ sed -n '/start/,/end/p' file > outputfile
or
$ awk '/start/,/end/' file > outputfile
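To connect the boilerplate to the $hostname and $endText variables from the question, here is one possible sketch. Everything specific in it is an assumption: the sample log contents, the values `host1` and `END`, and the awk variable names `start`, `stop` and `inside`. It uses index() so the shell values are matched as literal text rather than as regular expressions.

```shell
# Hypothetical sample input; the real file format is not given in the question.
log=$(mktemp)
printf '%s\n' 'header' 'host1 up' 'cpu ok' 'END' 'host2 up' 'END' > "$log"

hostname='host1'   # hypothetical start text
endText='END'      # hypothetical end text
outdir=$(mktemp -d)

awk -v start="$hostname" -v stop="$endText" '
  !inside && index($0, start) { inside = 1 }   # range opens on a line containing start
  inside { print }
  inside && index($0, stop)   { inside = 0 }   # range closes on a line containing stop
' "$log" > "$outdir/$hostname.txt"

captured=$(cat "$outdir/$hostname.txt")
echo "$captured"

rm -rf "$log" "$outdir"
```

Like the range patterns above, this prints the start and end lines themselves along with everything in between.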

How to use variable in awk substr [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 7 years ago.
n=4;
echo "abcd" | awk '{print substr($0,$n,1);}'
I want to extract the substring using a variable, but it is not working. Please help.
$ n=4; echo "abcd" | awk -v n="$n" '{print substr($0,n,1);}'
d
Possibly, it is clearer to have two different variable names:
$ n=4; echo "abcd" | awk -v m="$n" '{print substr($0,m,1);}'
d
Here, n is a shell variable and m is an awk variable. The -v option is used to assign the awk variable m to have the value of the shell variable n.
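The original attempt fails because the shell does not expand $n inside single quotes, so awk sees its own unset variable n and evaluates $n as $0 rather than as 4. Besides -v, the value can also be handed over through the environment and read from awk's ENVIRON array. A small sketch of that alternative (the variable name `viaenv` is only for illustration):

```shell
n=4

# Put n into the environment of this one awk process and read it there.
viaenv=$(echo "abcd" | n="$n" awk '{print substr($0, ENVIRON["n"], 1)}')
echo "$viaenv"
```

Unlike -v, values read from ENVIRON are not subject to backslash-escape processing, which can matter when the shell value is arbitrary text.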

Add header to line using awk [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 8 years ago.
I have a file with the following format:
AACCCGTAGATCCGAACTTGTG
ACCCGTAGATCCGAACTTGTG
CCGTAGATCCGAACTTGTG
CGTAGATCCGAACTTGT
I want to give a header to each line, using awk, where the header is equal to the line that follows, like this:
>AACCCGTAGATCCGAACTTGTG
AACCCGTAGATCCGAACTTGTG
>ACCCGTAGATCCGAACTTGTG
ACCCGTAGATCCGAACTTGTG
>CCGTAGATCCGAACTTGTG
CCGTAGATCCGAACTTGTG
>CGTAGATCCGAACTTGT
CGTAGATCCGAACTTGT
Simply:
$ awk '{print ">"$0;print}' file
>AACCCGTAGATCCGAACTTGTG
AACCCGTAGATCCGAACTTGTG
>ACCCGTAGATCCGAACTTGTG
ACCCGTAGATCCGAACTTGTG
>CCGTAGATCCGAACTTGTG
CCGTAGATCCGAACTTGTG
>CGTAGATCCGAACTTGT
CGTAGATCCGAACTTGT
Or:
$ awk '{printf ">%s\n%s\n",$0,$0}' file
>AACCCGTAGATCCGAACTTGTG
AACCCGTAGATCCGAACTTGTG
>ACCCGTAGATCCGAACTTGTG
ACCCGTAGATCCGAACTTGTG
>CCGTAGATCCGAACTTGTG
CCGTAGATCCGAACTTGTG
>CGTAGATCCGAACTTGT
CGTAGATCCGAACTTGT
The -v flag lets you set an awk variable. For each line in the file, the command then prints that variable followed by the line, and then the line itself.
awk -v c=">" '{ print c $0; print $0; }' file