I would like to print everything between two lines that match a certain pattern. For example, if my input file looks like
--- START ---
line 1
line 2
--- END ---
I would like to have as output
line 1
line 2
Can this be done (e.g. using grep or awk)?
This is how to do it with awk:
awk '/END/{f=0} f; /START/{f=1}' file.txt
line 1
line 2
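The order of the rules matters: f is cleared on the END line before the bare f pattern gets to print it, and it is only set after the START line has already failed that test, so neither marker appears in the output.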
You should easily find a solution for this using Google.
Another version:
awk '/START/{f=1;next} /END/{f=0} f' file.txt
line 1
line 2
You can do:
sed -n '/--- START ---/,/--- END ---/{/--- START ---\|--- END ---/!p}' < input
or
awk '/--- END ---/{exit}; flag {print}; /--- START ---/{flag=1} ' < input
Using perl:
perl -0777 -ne 'print $1 if /^--- START ---\s*\n(.*?)--- END ---/s' file
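(-0777 makes perl read the whole file as a single record, and the /s modifier lets . match newlines, so the capture group grabs everything between the two markers.)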
Here is a simple solution which uses awk and grep:
awk '/--- START ---/,/--- END ---/ {print $0}' file.txt \
| grep -v -- '--- START ---' \
| grep -v -- '--- END ---'
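(The -- after grep -v marks the end of the options, so patterns that start with dashes are not mistaken for options.)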
I am trying to process the contents of a file from this format:
this1,EUR
that2,USD
other3,GBP
to this format:
this1(EUR),that2(USD),other3(GBP)
The result should be a single line.
So far I have come up with this pipeline of commands, which works fine:
cat myfile | sed -e 's/,/\(/g' | sed -e 's/$/\)/g' | tr '\n' , | awk '{print substr($0, 0, length($0)- 1)}'
Is there a simpler way to do the same with just an awk command?
Another awk:
$ awk -F, '{ printf "%s%s(%s)", c, $1, $2; c = ","} END { print ""}' file
this1(EUR),that2(USD),other3(GBP)
The following awk may also help here:
awk -F, '{val=val?val OFS $1"("$2")":$1"("$2")"} END{print val}' OFS=, Input_file
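With the three sample lines above this should print this1(EUR),that2(USD),other3(GBP): the ternary only prepends OFS once val is non-empty, so there is no leading comma.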
Toying around with separators and gsub:
$ awk 'BEGIN{RS="";ORS=")\n"}{gsub(/,/,"(");gsub(/\n/,"),")}1' file
this1(EUR),that2(USD),other3(GBP)
Explained:
$ awk '
BEGIN {
RS="" # record ends in an empty line, not newline
ORS=")\n" # the last )
}
{
gsub(/,/,"(") # replace commas with (
gsub(/\n/,"),") # and newlines with ),
}1' file # output
Using paste+sed
$ # paste -s will combine all input lines to single line
$ seq 3 | paste -sd,
1,2,3
$ paste -sd, ip.txt
this1,EUR,that2,USD,other3,GBP
$ # post processing to get desired format
$ paste -sd, ip.txt | sed -E 's/,([^,]*)(,?)/(\1)\2/g'
this1(EUR),that2(USD),other3(GBP)
I have a file with 100 columns of data. I want to print the first column and the i-th column into 99 separate files. I am trying to use
for i in {2..99}; do awk '{print $1" " $i }' input.txt > data${i}; done
But I am getting errors
awk: illegal field $(), name "i"
input record number 1, file input.txt
source line number 1
How do I correctly use $i inside {print}?
The following single awk may help here too:
awk -v start=2 -v end=99 '{for(i=start;i<=end;i++){print $1,$i > "file"i;close("file"i)}}' Input_file
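The close() right after each print means at most one output file is open at a time, so a large range of output files will not hit the system's open-file limit.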
An all-awk solution. First, the test data:
$ cat foo
11 12 13
21 22 23
Then the awk:
$ awk '{for(i=2;i<=NF;i++) print $1,$i > ("data" i)}' foo
and results:
$ ls data*
data2 data3
$ cat data2
11 12
21 22
The for loop iterates from 2 to the last field. If there are more fields than you want to process, change NF to the number you'd like. If, for some reason, a hundred open files would be a problem on your system, you'd need to put the print into a block and add a close call:
$ awk '{for(i=2;i<=NF;i++){f=("data" i); print $1,$i >> f; close(f)}}' foo
If you want to do what you were trying to accomplish:
for i in {2..99}; do
awk -v x=$i '{print $1" " $x }' input.txt > data${i}
done
Note
the -v switch of awk passes a shell variable into awk
$x is then the x-th column, as defined by your variable x
Note 2: this is not the fastest solution (a single awk call is faster), but it corrects your original logic. Ideally, take the time to understand awk; it is never time wasted.
I am trying to mask a file with 'tr' and 'awk' but am failing with the error fatal: cannot open pipe (Too many open pipes). The file has approximately 1,000,000 records, quite a large number.
Below is the code I am trying:
awk - F "|" - v OFS="|" '{ "echo \""$1"\" | tr \" 0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ\" \" QWERTYUIOPASDFGHJKLZXCVBNM9876543210mnbvcxzlkjhgfdsapoiuytrewq\"" | get line $1}1' FILE.CSV > test.CSV
It shows this error:
awk: (FILENAME=- FNR=1019) fatal: cannot open pipe `echo ""TTP_123"" | tr "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ" "QWERTYUIOPASDFGHJKLZXCVBNM9876543210mnbvcxzlkjhgfdsapoiuytrewq"' (Too many open pipes)
Please let me know what I am doing wrong here.
Also note that any number of columns could be used for masking, at any positions. In this example I have taken columns 1 and 2, but it could be 3 and 10, or 5, 7 and 25.
Thanks
AJ
First things first, you can't have a space between - and F or v.
I was going to suggest sed, but as you only want to translate the first column, that's not as easy.
Unfortunately, awk doesn't have built-in tr functionality, so you'd have to use the shell like you are and just close the pipe:
awk -F "|" -v OFS="|" '{
command = "echo \"" $1 "\" | tr \" 0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ\" \" QWERTYUIOPASDFGHJKLZXCVBNM9876543210mnbvcxzlkjhgfdsapoiuytrewq\""
command | getline $1
close(command)
}1' FILE.CSV > test.CSV
However, I suggest using perl, which can do field splitting and character translation:
perl -F'\|' -lane '$F[0] =~ tr/0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ/QWERTYUIOPASDFGHJKLZXCVBNM9876543210mnbvcxzlkjhgfdsapoiuytrewq/; print join("|", @F)' FILE.CSV > test.CSV
Or, for a shorter command line, just put the program into a file, drop the e in -lane and use the file name instead of the '...' command.
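For example, assuming the program is saved in a file called mask.pl (the name is just illustrative):
$ cat mask.pl
# example program file; the name mask.pl is only illustrative
$F[0] =~ tr/0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ/QWERTYUIOPASDFGHJKLZXCVBNM9876543210mnbvcxzlkjhgfdsapoiuytrewq/;
print join("|", @F);
$ perl -F'\|' -lan mask.pl FILE.CSV > test.CSV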
You can do the mapping in awk instead of making a system call for each line, or perhaps simply:
paste -d'|' <(cut -d'|' -f1 file | tr '0-9' 'a-z') <(cut -d'|' -f2- file)
Replace the tr arguments with yours.
This does not answer your question, but you can implement tr as an awk function, which would save having to spawn lots of external processes:
$ cat tr.awk
function tr(str, from, to, s,i,c,idx) {
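# the extra parameters s, i, c, idx serve as local variables (a common awk idiom)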
s = ""
for (i=1; i<=length(str); i++) {
c = substr(str, i, 1)
idx = index(from, c)
s = s (idx == 0 ? c : substr(to, idx, 1))
}
return s
}
{
print $1, tr($1,
" 0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ",
" QWERTYUIOPASDFGHJKLZXCVBNM9876543210mnbvcxzlkjhgfdsapoiuytrewq")
}
Example:
$ printf "%s\n" hello wor-ld | awk -f tr.awk
hello KGCCN
wor-ld 3N8-CF
Hoping somebody can help me out.
I have a large number of files with different numbers of lines.
I would like to add new (empty) lines to the files up to a specific number of rows, say 6.
Infile.txt
text1
text2
text3
The out file I would like to have is
Outfile.txt
text1
text2
text3
\n
\n
\n
Short awk solution:
awk -v r=6 'END{ while((r--)-NR>0) print "" }1' file
-v r=6 - variable r indicating total/maximal number of rows
In awk's END block, the built-in variable NR will contain the row number of the last line of the file. From there it's easy to print the needed number of additional empty rows.
$ awk -v lines=6 '1; END {for (i=NR; i<lines; ++i) print ""}' file
text1
text2
text3
$ awk -v lines=6 '1; END {for (i=NR; i<lines; ++i) print ""}' file | wc -l
6
IMHO the clearest and most obvious way to handle this is to simply loop from the last line number plus 1 to the target number of lines:
$ seq 3 | awk -v n=6 '{print} END{for (i=NR+1; i<=n; i++) print ""}'
1
2
3
$
You can also count down if you want to save a variable:
$ seq 3 | awk -v n=6 '{print} END{while (n-- > NR) print ""}'
1
2
3
$
but IMHO that's sacrificing clarity in favor of brevity and not worthwhile.
Can anyone tell me how to print line numbers including zero using awk?
Here is my input file stackfile2.txt
When I run the awk command below I get actual_output.txt
awk '{print NR,$0}' stackfile2.txt | tr " ", "," > actual_output.txt
whereas my expected output is file.txt
How do I print the line numbers starting with zero (0)?
NR starts at 1, so use
awk '{print NR-1 "," $0}'
Using awk:
In awk, an uninitialized variable i starts at 0; i++ increments i but returns the value i held before being incremented.
awk '{print i++ "," $0}' file
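A quick check, using seq to generate a few numbered input lines:
$ seq 3 | awk '{print i++ "," $0}'
0,1
1,2
2,3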
Another option besides awk is nl, which has a -v option for setting the starting value and -n <ln|rn|rz> for left-justified, right-justified, and right-justified with leading zeros. You can also include -s for a field separator, such as -s "," for a comma between the line numbers and your data.
In a Unix environment, this can be done as
cat <infile> | ...other stuff... | nl -v 0 -n rz
or simply
nl -v 0 -n rz <infile>
Example:
echo "Here
are
some
words" > words.txt
cat words.txt | nl -v 0 -n rz
Out:
000000 Here
000001 are
000002 some
000003 words
If Perl is an option, you can try this:
perl -ne 'printf "%s,$_" , $.-1' file
$_ is the line
$. is the line number