printing repeated parts of file with awk

I would like to go through this file:
\chapter{CHAPTER}
TEXT
\e
h454
\e
\e
454
\e
\begin{figure}
\NOTE{figure}
\centering
\includegraphics[width=0.49\textwidth]{f.pdf}
\caption{\NOTEB{The concept}}
\label{fig}
\end{figure}
SOME TEXT
\e
454
\e
SOME TEXT
\begin{figure}
\NOTE{figure}
\centering
\includegraphics[width=0.49\textwidth]{f.pdf}
\caption{\NOTEB{The concept}}
\label{fig}
\end{figure}
\chapter{CHAPTER}
SOME TEXT
and print some parts:
awk '/\\begin\{figure\}/,/\\end\{figure\}/' file.tex
awk '/\\e/,/\\e/' file.tex
awk '/\\chapter/' file.tex
but all into one file and in the same order as in the input file. So the desired output is (empty lines don't matter):
\chapter{CHAPTER}
\e
h454
\e
\e
454
\e
\begin{figure}
\NOTE{figure}
\centering
\includegraphics[width=0.49\textwidth]{f.pdf}
\caption{\NOTEB{The concept}}
\label{fig}
\end{figure}
\e
454
\e
\begin{figure}
\NOTE{figure}
\centering
\includegraphics[width=0.49\textwidth]{f.pdf}
\caption{\NOTEB{The concept}}
\label{fig}
\end{figure}
\chapter{CHAPTER}
How can I combine these commands so that the output follows the order of the input file?

Could you please try the following:
awk '
/\\label/{
next
}
/\\begin\{figure\}|\\beq/{
found=1
}
found;
/\\end\{figure\}|\\eeq/{
found=""
}
/chapter/
' Input_file
I wrote this on my phone and have not tested it, so please feel free to comment with any corrections.
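As written, the answer above skips the \label lines and looks for \beq/\eeq rather than the \e markers, so it does not quite reproduce the desired output from the question. A minimal single-pass sketch that does is below. It assumes each \e marker is alone on its line with no trailing whitespace and that figures are never nested; the function name extract_parts and the shortened sample file are made up for illustration.

```shell
# Hypothetical helper: print chapters, figure blocks and \e ... \e blocks
# in input order.
extract_parts() {
    awk '
    /^\\chapter/         { print; next }             # chapter headings
    /^\\begin\{figure\}/ { infig = 1 }               # a figure block opens
    infig                { print                     # inside a figure: print the line
                           if (/^\\end\{figure\}/) infig = 0
                           next }
    /^\\e$/              { print; ine = !ine; next } # a lone \e opens or closes a block
    ine                                              # body of an \e ... \e block
    ' "$@"
}

# Shortened sample input (assumption: same structure as the file in the question).
sample=$(mktemp)
cat > "$sample" <<'EOF'
\chapter{CHAPTER}
TEXT
\e
h454
\e
\begin{figure}
\centering
\label{fig}
\end{figure}
SOME TEXT
\e
454
\e
\chapter{CHAPTER}
SOME TEXT
EOF

out=$(extract_parts "$sample")
printf '%s\n' "$out"
rm -f "$sample"
```

Because everything happens in one pass over the file, the output order follows the input order automatically.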

Related

LaTeX align word characters to the right

I made a latex document with a line next to the margin
\documentclass[12pt]{article}
\usepackage[doublespacing]{setspace}
\usepackage[left=0.95in,top=1in,right=1in,bottom=0.75in]{geometry}
\usepackage{background}
\pagenumbering{gobble}
\SetBgScale{1}
\SetBgColor{black}
\SetBgAngle{0}
\SetBgHshift{0pt}
\SetBgVshift{0mm}
\SetBgContents{
\hspace{1in}
\rule{1pt}{\paperheight} % right first line
\rule[0.75in]{6.5in}{1pt} % bottom line
\rule{1pt}{\paperheight}
}
\setlength{\marginparwidth}{3.0in}
\begin{document}
\reversemarginpar{\vspace{1em}
\begin{spacing}{1.6} %space vertical between numbers
\noindent Sam \\ Rams\\ Tamim \\ Smartcoi \\ 9d5 \\ lousy99\\
\end{spacing}}
\end{document}
How do I right-align the words so that they end at the vertical line? I tried \begin{flushright}, but it moved everything out of place.

One possible approach is to use a tabular:
\documentclass[12pt]{article}
\usepackage[doublespacing]{setspace}
\usepackage[left=1.5in,top=1in,right=0.5in,bottom=0.75in,showframe]{geometry}
\usepackage{lipsum}
\pagenumbering{gobble}
\begin{document}
\reversemarginpar%
\marginpar{%
\begin{tabular}{@{}r@{}}
Sam \\
Rams\\
Tamim \\
Smartcoi \\
9d5 \\
lousy99\\
\end{tabular}%
}
\lipsum
\end{document}

Print parts of file using awk

I have several conditions for what I want to print: skip \hello lines, even when they fall inside a part I want to print; print from \k{f} to \l{f}; print from \word{g} to \word2{g}; print the row starting with \hello2; and print the part between \b and \bf. There is one problem: the line \bf} contains a } that should not be printed.
awk '
/\\hello/{
next
}
/\\k\{f\}|\\word\{g\}|\\b/{
found=1
}
found;
/\\l\{f\}|\\word2\{g\}|\\bf/{
found=""
}
/\\hello2/
' file.txt
I would like to add a condition that \bf must be alone on its line. How can I do that?
file.txt:
text
text
\hello2
456
565
\word{g}
s
\hello
\word2{g}
\k{f}
fdsfd
fgs
\l{f}
text
\b
7
\hello
\bf}
text
Output now:
\word{g}
s
\word2{g}
\k{f}
fdsfd
fgs
\l{f}
\b
7
\bf}
The desired output:
\word{g}
s
\word2{g}
\k{f}
fdsfd
fgs
\l{f}
\b
7
\bf

Add a rule that replaces \bf} with \bf:
awk '
/\\hello/{
next
}
/\\k\{f\}|\\word\{g\}|\\b/{
found=1
}
# Fix BF lines
/\\bf}/ { $0 = "\\bf" }
#
found;
/\\l\{f\}|\\word2\{g\}|\\bf/{
found=""
}
/\\hello2/
' file.txt
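If the markers should only count when they are alone on a line, the patterns can be anchored with ^ and $. The sketch below is one way to do that, assuming no leading or trailing whitespace around the markers; the function name print_blocks is hypothetical. Note that /\\hello/ also matches \hello2, so in the scripts above the final /\\hello2/ rule never fires; the sketch keeps that behaviour because the desired output does not contain \hello2.

```shell
# Hypothetical helper: same logic as the answer, with whole-line markers.
print_blocks() {
    awk '
    /\\hello/                    { next }         # skips \hello and, as a side effect, \hello2
    /^\\(k\{f\}|word\{g\}|b)$/   { found = 1 }    # start markers, whole line only
    /^\\bf\}$/                   { $0 = "\\bf" }  # drop the stray closing brace
    found                                         # print while inside a block
    /^\\(l\{f\}|word2\{g\}|bf)$/ { found = "" }   # end markers, whole line only
    ' "$@"
}

# The file.txt from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
text
text
\hello2
456
565
\word{g}
s
\hello
\word2{g}
\k{f}
fdsfd
fgs
\l{f}
text
\b
7
\hello
\bf}
text
EOF

out=$(print_blocks "$sample")
printf '%s\n' "$out"
rm -f "$sample"
```

Anchoring also keeps the \b start marker from matching inside \bf.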

How to get only first occurrence in log file using awk

I have a log file like this:
some text line
other text line
<a>
<b>1</b>
<c>2</c>
</a>
another text line
<a>
<b>1</b>
<c>2</c>
</a>
yet another text line
I need to get only the first occurrence of the XML element "a":
<a>
<b>1</b>
<c>2</c>
</a>
I know
awk '/<a>/,/<\/a>/' file.log
will find all occurrences. How can I get just the first? (Adding | head -n1 obviously doesn't work because it captures only the first line, and I can't know in advance how long "a" is: the awk expression must be generic because I have different log files with different "a" contents.)

Another slight variation is to use a flag variable that marks when you are inside the first <a>...</a> block, print that block, and exit afterwards. Here n is the flag, e.g.
awk -v n=0 '$1=="</a>" {print $1; exit} $1=="<a>" {n=1}; n==1' f.xml
Example Use/Output
With your input file as f.xml you would get:
$ awk -v n=0 '$1=="</a>" {print $1; exit} $1=="<a>" {n=1}; n==1' f.xml
<a>
<b>1</b>
<c>2</c>
</a>
(note: the bare n==1 pattern has no action, so it relies on awk's default action (print) to output the record; the $1=="<a>" rule only sets the flag)

This awk:
awk '
match($0,/<a>/) {
$0=substr($0,RSTART)
flag=1
}
match($0,/<\/a/) {
$0=substr($0,1,RSTART+RLENGTH)
print
exit
}
flag' file
can handle a block on a single line:
<a><b>1</b><c>2</c></a>
and this:
<a>
<b>1</b>
<c>2</c>
</a>
and also <a>
<b>1</b>
<c>2</c>
</a> this
the end
Another for GNU awk:
$ gawk -v RS="</?a>" '
NR==1 { printf RT }
NR==2 { print $0 RT }
' file

First:
$ awk '/<a>/{f=1} f; /<\/a>/{exit}' file
<a>
<b>1</b>
<c>2</c>
</a>
Last:
$ tac file | awk '/<\/a>/{f=1} f; /<a>/{exit}' | tac
<a>
<b>1</b>
<c>2</c>
</a>
Nth:
$ awk -v n=2 '/<a>/{c++} c==n{print; if (/<\/a>/) exit}' file
<a>
<b>1</b>
<c>2</c>
</a>

Awk printing more records than are in original file

I have written matches.awk to print from each line of a text file the text which matches my regular expression.
{
line = $0
while (match(line, /([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})|(([A-Za-z][0-9][A-Za-z])|([A-Za-z][A-Ha-hJ-Yj-y][0-9][A-Za-z]?))))[[:space:]]?[0-9][A-Za-z]{2})/)>0) {
print substr(line, RSTART, RLENGTH)
line = substr(line, RSART + RLENGTH) }}
which I then call with
awk -f matches.awk file.txt
It prints the data correctly, but strangely it prints some records far more often than they appear in the text file.
For example, the line '20 Lilac Grove, Leeds LS5 3AG, Lilac Grove' appears 53 times in file.txt, but its match is printed 212 times, four times as often. Any idea why this is?

You have a typo in your code (RSART instead of RSTART).
It should be:
{
line = $0
while (match(line, /([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})|(([A-Za-z][0-9][A-Za-z])|([A-Za-z][A-Ha-hJ-Yj-y][0-9][A-Za-z]?))))[[:space:]]?[0-9][A-Za-z]{2})/)>0) {
print substr(line, RSTART, RLENGTH)
line = substr(line, RSTART + RLENGTH) }}
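Just tested, and it seems to be OK (i.e., the regexp matches your line once).
The typo matters because RSART is an unset variable, which awk evaluates as 0 in arithmetic, so substr(line, RSART + RLENGTH) drops only the first RLENGTH - 1 characters instead of everything up to the end of the match. The postcode stays in the remaining text and is matched again on the next pass of the loop. See the String Functions page of the GAWK manual for what match() and substr() do and what they return.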
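To see the effect of the misspelled variable in isolation, the sketch below counts loop iterations for the line from the question, once with RSART and once with RSTART. The regex is a simplified stand-in for the postcode pattern (an assumption made for readability, not the asker's full expression).

```shell
line='20 Lilac Grove, Leeds LS5 3AG, Lilac Grove'

# Buggy loop: RSART is unset, so RSART + RLENGTH is just RLENGTH.
buggy=$(printf '%s\n' "$line" | awk '{
    s = $0
    while (match(s, /[A-Z][A-Z][0-9] [0-9][A-Z][A-Z]/)) {
        n++
        s = substr(s, RSART + RLENGTH)    # typo kept on purpose
    }
    print n
}')

# Fixed loop: continue right after the end of the match.
fixed=$(printf '%s\n' "$line" | awk '{
    s = $0
    while (match(s, /[A-Z][A-Z][0-9] [0-9][A-Z][A-Z]/)) {
        n++
        s = substr(s, RSTART + RLENGTH)
    }
    print n
}')

echo "matches with typo: $buggy, matches after fix: $fixed"
```

With the typo, each pass removes only six characters from the front of the line, so the same postcode is found four times before it is finally cut off, which is consistent with 212 = 4 x 53.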

using awk to count characters and modify file accordingly

I have a file that looks like this
@FCD17BKACXX:8:1101:2703:2197#0/1
CAGCTTTACTCGTCATTTCCCCCAAGGGTAAAATGCGTCCGTCCATTAAGTTCACAGTCATCGTCT
+FCD17BKACXX:8:1101:2703:2197#0/1
^`^\eggcghheJ`dffhhhffhe`ecd^a^_ceacecfhf\beZegfhh_fghhgfZbdg]c^a`
@FCD17BKACXX:8:1101:4434:2244#0/1
CTGCGTTCATCGCGTTGTTGGGAGGAATCTCTACCCCAGGTTCTCGCTGTGAA
+FCD17BKACXX:8:1101:4434:2244#0/1
eeecgeceeffhhihi_fhhiicdgfghiiihiiihiiihVbcdgfhge`cee
@FCD17BKACXX:8:1101:6394:2107#0/1
CAGCAGGACTAGGGCCTGCAGACGTACTG
+FCD17BKACXX:8:1101:6394:2107#0/1
eeeccggeghhiihiihihihhhhcfghf
I would like to go to every second line and count the number of characters. If the line contains fewer than, e.g., 66 characters, pad it to 66 with 'A' and print it to a new file. If it already contains 66 characters, just print the line as is.
The output file would look like this:
@FCD17BKACXX:8:1101:2703:2197#0/1
CAGCTTTACTCGTCATTTCCCCCAAGGGTAAAATGCGTCCGTCCATTAAGTTCACAGTCATCGTCT
+FCD17BKACXX:8:1101:2703:2197#0/1
^`^\eggcghheJ`dffhhhffhe`ecd^a^_ceacecfhf\beZegfhh_fghhgfZbdg]c^a`
@FCD17BKACXX:8:1101:4434:2244#0/1
CTGCGTTCATCGCGTTGTTGGGAGGAATCTCTACCCCAGGTTCTCGCTGTGAAAAAAAAAAAAAAA
+FCD17BKACXX:8:1101:4434:2244#0/1
eeecgeceeffhhihi_fhhiicdgfghiiihiiihiiihVbcdgfhge`ceeAAAAAAAAAAAAA
@FCD17BKACXX:8:1101:6394:2107#0/1
CAGCAGGACTAGGGCCTGCAGACGTACTGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
+FCD17BKACXX:8:1101:6394:2107#0/1
eeeccggeghhiihiihihihhhhcfghfAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
I have a very basic knowledge of awk so from a learning perspective I would like to use awk to solve the problem.

One way:
awk '!(NR%2) && length<66{for(i=length;i<66;i++)$0=$0 "A"}1' file

This should be faster than the accepted approach:
awk 'NR%2==0 { x = sprintf("%-66s", $0); gsub(/ /,"A",x); $0 = x }1' file
Results:
@FCD17BKACXX:8:1101:2703:2197#0/1
CAGCTTTACTCGTCATTTCCCCCAAGGGTAAAATGCGTCCGTCCATTAAGTTCACAGTCATCGTCT
+FCD17BKACXX:8:1101:2703:2197#0/1
^`^\eggcghheJ`dffhhhffhe`ecd^a^_ceacecfhf\beZegfhh_fghhgfZbdg]c^a`
@FCD17BKACXX:8:1101:4434:2244#0/1
CTGCGTTCATCGCGTTGTTGGGAGGAATCTCTACCCCAGGTTCTCGCTGTGAAAAAAAAAAAAAAA
+FCD17BKACXX:8:1101:4434:2244#0/1
eeecgeceeffhhihi_fhhiicdgfghiiihiiihiiihVbcdgfhge`ceeAAAAAAAAAAAAA
@FCD17BKACXX:8:1101:6394:2107#0/1
CAGCAGGACTAGGGCCTGCAGACGTACTGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
+FCD17BKACXX:8:1101:6394:2107#0/1
eeeccggeghhiihiihihihhhhcfghfAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

Here is another, perhaps strange, one-liner:
awk 'BEGIN{while(++i<66)t=t"A"}!(NR%2){$0=$0substr(t,length)}1' file

awk 'NR%2 == 0{
printf("%s", $0)
for(i=length($0); i<66; i++)printf("A")
print "";next }
{print}'

awk -v FS= '{printf "%s",$0} !(NR%2){for (i=NF+1;i<=66;i++) printf "A"} {print ""}'
or if you don't like loops:
awk -v FS= '{sfx=(NR%2 ? "" : sprintf("%*s",66-NF,"")); gsub(/ /,"A",sfx); print $0 sfx}'
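Since the question asks for something to learn from, here is a more spelled-out sketch of the same idea with the target width as a parameter. The function name pad_even_lines, the read names, and the width of 6 are only for illustration; the question uses 66.

```shell
# Hypothetical helper: pad every second input line to the given width with "A".
pad_even_lines() {
    awk -v width="$1" -v pad="A" '
    NR % 2 == 0 {                     # lines 2, 4, 6, ... (sequence and quality lines)
        while (length($0) < width)    # append one pad character at a time
            $0 = $0 pad
    }
    { print }                         # print every line, padded or not
    '
}

# Tiny made-up FASTQ-like record, padded to 6 characters for readability.
out=$(printf '%s\n' '@read1' 'ACGT' '+read1' 'eeee' | pad_even_lines 6)
printf '%s\n' "$out"
```

Lines that are already at or above the width are printed unchanged, because the while loop never runs for them.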
