Extracting data after a tag and CR with Busybox sed - awk

I have a script that extracts a file from a bash script combined with a binary file. It does so using the following GNU sed syntax
sed -n '/__DATA__/{n;:1;n;p;b1}' /tmp/combined.file > /tmp/binary.file
The files are assembled by cat'ing an ISO file to the end of a bash script. Which is then sent over the network to an embedded device and extracted on the device, piping the ISO file to a temporary dir and executing the bash script to install it.
However, on executing this I get a
sed: unterminated {
Am I missing something here? Is this task possible with BusyBox sed?

It tried the "Second attempt" below with OSX/BSD awk and it failed, just printing up til the first NUL character. So you can't do this job portably with awk or sed.
Here's what should work everywhere given that the POSIX standard says
the input file to tail can be any type
so the input to tail doesn't have to be a POSIX text file (no NULs) and we're exiting from awk before the first NUL is encountered in the input so they should both be happy:
$ tail -n +"$(awk '/^__DATA__$/{print NR+2; exit}' binary.bin)" binary.bin | cat -ev
ER^H^#^#^#M-^PM-^P^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#3��M-^Nռ^#|��f1�f1�fSfQ^FWM-^N�M-^N�R�^#|�^#^F�^#^A��K^F^#^#R�A��U1�0���^Sr^VM-^A�U�u^PM-^C�^At^Kf�^F�^F�B�^U�^B1�ZQ�^H�^S[^O��#PM-^C�?Q��SRP�^#|�^D^#f��^G�D^#^OM-^BM-^#^#f#M-^#�^B��fM-^A>#|��xpu ��{�D|^#^#�M-^C^#isolinux.bin missing or corrupt.^M$
f`f1�f^C^F�{f^S^V�{fRfP^FSj^Aj^PM-^I�f�6�{��^FM-^H�M-^H�M-^R�6�{M-^H�^H�A�^A^BM-^J^V�{�^SM-^Md^Pfa��^^^#Operating system load error.^M$
^��^NM-^J>b^D�^G�^P<$
u��^X���^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#L^D^#^#^#^#^#^#�K�6^#^#M-^#^#^A^#^#?�M-^K^#^#^#^#^#`^\^#^#�������<R^#^#^#^_^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#U�EFI PART^#^#^A^#\^#^#^#]3�.^#^#^#^#^A^#^#^#^#^#^#^#�_^\^#^#^#^#^##^#^#^#^#^#^#^#�_^\^#^#^#^#^#Uc�r^Oqc#M-^Rc^F�$LZ�^L^#^#^#^#^#^#^#�^#^#^#M-^#^#^#^#�t^]F^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#$
Second attempt:
Now that I have a better idea what you're trying to do (process a file consisting of POSIX text lines up to a point and then can contain NUL characters afterwards), try this:
$ cat -ev file
echo "I: Installation finished!"$
exit 0$
$
__DATA__$
$
foo^#bar^#etc
$ cat tst.awk
/^__DATA__$/ { n=NR + 1 }
n && (NR == n) { RS="\0"; ORS="" }
n && (NR > n) { print (c++ ? RS : "") $0 }
$ awk -f tst.awk file | cat -ev
foo^#bar^#etc
The above doesn't try to store any input lines containing NUL in memory, instead it reads \n-terminated text lines until it reaches the line after the one containing __DATA__ and then switches to reading NUL-terminated records into memory and printing NULs between them on output.
It's still undefined behavior per POSIX (see my comments below) but in theory it should work since it just relies on being able to set one variable (RS) to NUL rather than trying to store input strings that contain NULs. Also, setting RS to NUL has been a (flawed) workaround for awk scripts for years to be able to read a whole file into memory at once so being able to set RS to NUL should work in any modern awk.
Using the new sample you provided with the missing blank line after the __DATA__ line added:
$ cat -ev file
#!/bin/bash$
$
echo "I: Awesome Things happened here"$
exit 0$
$
__DATA__$
$
ER^H^#^#^#M-^PM-^P^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#3M-mM-zM-^NM-UM-<^#|M-{M-|f1M-[f1M-IfSfQ^FWM-^NM-]M-^NM-ERM->^#|M-?^#^FM-9^#^AM-sM-%M-jK^F^#^#RM-4AM-;M-*U1M-I0M-vM-yM-M^Sr^VM-^AM-{UM-*u^PM-^CM-a^At^KfM-G^FM-s^FM-4BM-k^UM-k^B1M-IZQM-4^HM-M^S[^OM-6M-F#PM-^CM-a?QM-wM-aSRPM-;^#|M-9^D^#fM-!M-0^GM-hD^#^OM-^BM-^#^#f#M-^#M-G^BM-bM-rfM-^A>#|M-{M-#xpu M-zM-<M-l{M-jD|^#^#M-hM-^C^#isolinux.bin missing or corrupt.^M$
f`f1M-Rf^C^FM-x{f^S^VM-|{fRfP^FSj^Aj^PM-^IM-ffM-w6M-h{M-#M-d^FM-^HM-aM-^HM-EM-^RM-v6M-n{M-^HM-F^HM-aAM-8^A^BM-^J^VM-r{M-M^SM-^Md^PfaM-CM-h^^^#Operating system load error.^M$
^M-,M-4^NM-^J>b^DM-3^GM-M^P<$
uM-qM-M^XM-tM-kM-}^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#L^D^#^#^#^#^#^#M-/KM-66^#^#M-^#^#^A^#^#?M-`M-^K^#^#^#^#^#`^\^#^#M-~M-^?M-^?M-oM-~M-^?M-^?<R^#^#^#^_^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#UM-*EFI PART^#^#^A^#\^#^#^#]3M-%.^#^#^#^#^A^#^#^#^#^#^#^#M-^?_^\^#^#^#^#^##^#^#^#^#^#^#^#M-J_^\^#^#^#^#^#UcM-)r^Oqc#M-^Rc^FM-2$LZM-p^L^#^#^#^#^#^#^#M-P^#^#^#M-^#^#^#^#M-{t^]F^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#$
.
$ awk -f tst.awk file | cat -ev
ER^H^#^#^#M-^PM-^P^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#3M-mM-zM-^NM-UM-<^#|M-{M-|f1M-[f1M-IfSfQ^FWM-^NM-]M-^NM-ERM->^#|M-?^#^FM-9^#^AM-sM-%M-jK^F^#^#RM-4AM-;M-*U1M-I0M-vM-yM-M^Sr^VM-^AM-{UM-*u^PM-^CM-a^At^KfM-G^FM-s^FM-4BM-k^UM-k^B1M-IZQM-4^HM-M^S[^OM-6M-F#PM-^CM-a?QM-wM-aSRPM-;^#|M-9^D^#fM-!M-0^GM-hD^#^OM-^BM-^#^#f#M-^#M-G^BM-bM-rfM-^A>#|M-{M-#xpu M-zM-<M-l{M-jD|^#^#M-hM-^C^#isolinux.bin missing or corrupt.^M$
f`f1M-Rf^C^FM-x{f^S^VM-|{fRfP^FSj^Aj^PM-^IM-ffM-w6M-h{M-#M-d^FM-^HM-aM-^HM-EM-^RM-v6M-n{M-^HM-F^HM-aAM-8^A^BM-^J^VM-r{M-M^SM-^Md^PfaM-CM-h^^^#Operating system load error.^M$
^M-,M-4^NM-^J>b^DM-3^GM-M^P<$
uM-qM-M^XM-tM-kM-}^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#L^D^#^#^#^#^#^#M-/KM-66^#^#M-^#^#^A^#^#?M-`M-^K^#^#^#^#^#`^\^#^#M-~M-^?M-^?M-oM-~M-^?M-^?<R^#^#^#^_^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#UM-*EFI PART^#^#^A^#\^#^#^#]3M-%.^#^#^#^#^A^#^#^#^#^#^#^#M-^?_^\^#^#^#^#^##^#^#^#^#^#^#^#M-J_^\^#^#^#^#^#UcM-)r^Oqc#M-^Rc^FM-2$LZM-p^L^#^#^#^#^#^#^#M-P^#^#^#M-^#^#^#^#M-{t^]F^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#$
Original answer:
Assuming this question is related to your previous question, this will work using any awk in any shell on every UNIX box:
$ awk '/^__DATA__$/{n=NR+1} n && NR>n' file
3<ED>M-^PM-^PM-^PM-^PM-^
When it finds __DATA__ it sets a variable n to the line number to start printing after and then when n is set prints every line for which the line number is greater than n.
The above was run against this input file from your previous question:
$ cat -ev file
echo "I: Installation finished!"$
exit 0$
$
__DATA__$
$
3<ED>M-^PM-^PM-^PM-^PM-^$

Related

SUBSTITUTING VARIBALE IN SED ESCAPING $

I have 20 lines in file i want to remove all lines after nth line
nth line number is set using varible
eg var1=5 then remove line 5 to 20 using sed
Tried
sed ""$var1",$d" file -i
But it produces following error
sed: -e expression #1, char 5: unexpected `,'```
We can try
var="3"
awk -v line="$var" '1; FNR==line{exit}' file > temp && mv temp file
Details:
Here we are creating a shell variable var with value 3, then we are passing that variable to awk program, now the awk program is printing every line till the line number is 3 then the program is exited.
Use this Perl one-liner:
export var1=5
perl -i.bak -pe 'last if $. == $ENV{var1}' in_file
The Perl one-liner uses these command line flags:
-e : Tells Perl to look for code in-line, instead of in a file.
-p : Loop over the input one line at a time, assigning it to $_ by default. Add print $_ after each loop iteration.
-i.bak : Edit input files in-place (overwrite the input file). Before overwriting, save a backup copy of the original file by appending to its name the extension .bak.
$. : Current input line number.
SEE ALSO:
perldoc perlrun: how to execute the Perl interpreter: command line switches
I was beginner to linux thats why i asked such question.
Since no one answered.
I kept trying.
Now i learned how to debug linux shells and hence while using set -x i encountered mistake i was doing while writing up this command.
Now it is solved.
var=2
sed ""$(echo "$var")",\$d" file

Redirecting AWK output back to the input file

Consider the bash command, where file is a file with a single nonempty line.
awk '{print "stuff"}' file >> file
It seems like this should do the following: awk reads a line of file, writes "stuff" to it, and then proceeds to the next line, at which point, it should write stuff to file again, and so on to infinity. But, instead, it just terminates after writing once. Why is this? Is this a property of the file system, unix piping or awk?
It works, you just need a bigger file:
$ echo foo > foo
$ awk '{print $1}' foo >> foo
$ wc -l foo
2 foo
But:
$ for i in {1..4096} ; do echo $i ; done >> foo
$ awk '{print $1}' foo >> foo
^C
$ wc -l foo
19429617 foo
Using GNU awk in this example. I assume it (GNU awk) opens the file and reads one full block of data, not just one record. If there is less than or equal to the size of one block amount of data, it closes the file from reading. If there is more, it keeps the file open for reading until EOF and keeps appending to the end of it.

awk print overwrite strings

I have a problem using awk in the terminal.
I need to move many files in a group from the actual directory to another one and I have the list of the necessary files in a text file, as:
filename.txt
file1
file2
file3
...
I usually digit:
paste filename.txt | awk '{print "mv "$1" ../dir/"}' | sh
and it executes:
mv file1 ../dir/
mv file2 ../dir/
mv file3 ../dir/
It usually works, but now the command changes its behaviour and awk overwrites the last string ../dir/ on the first one, starting again the print command from the initial position, obtaining:
../dire1 ../dir/
../dire2 ../dir/
../dire3 ../dir/
and of course it cannot be executed.
What's happened?
How do I solve it?
Your input file contains carriage returns (\r aka control-M). Run dos2unix on it before running a UNIX tool on it.
idk what you're using paste for though, and you should not be using awk for this at all anyway, it's just a job for a simple shell script, e.g. remove the echo once you've tested this:
$ < file xargs -n 1 -I {} echo mv "{}" "../dir"
mv file1 ../dir
mv file2 ../dir
mv file3 ../dir

Prepend a # to the first line not already having a #

I have a file with options for a command I run. Whenever I run the command I want it to run with the options defined in the first line which is not commented out. I do this using this bash script:
while read run opt c; do
[[ $run == \#* ]] && continue
./submit.py $opt $run -c "$c"
break
done < to_submit.txt
The file to_submit.txt has entries like this:
#167993 options/optionfile.py long description
167995 options/other_optionfile.py other long description
...
After having run the submit script with the options in the last not commented out line, I want to comment out that line after the command ran successfully.
I can find the line number of the options I used adding this to the while loop:
line=$(grep -n "$run" to_submit.txt | grep "$opt" | grep "$c" | cut -f 1 -d ":")
But I'm not sure how to actually prepend a # to that line now. I could probably use head and tail to save the other lines and process that line separately and combine it all back into the file. But this sounds like it's to complicated, there must be an easier sed or awk solution to this.
$ awk '!f && sub(/^[^#]/,"#&"){f=1} 1' file
#167993 options/optionfile.py long description
#167995 options/other_optionfile.py other long description
...
To overwrite the contents of the original file:
awk '!f && sub(/^[^#]/,"#&"){f=1} 1' file > tmp && mv tmp file
just like with any other UNIX command.
Using GNU sed is probably simplest here:
sed '0,/^[^#]/ s//#&/' file
Add option -i if you want to update file in place.
'0,/^[^#]/ matches all lines up to and including the first one that doesn't start with #
s//#&/ then prepends # to that line.
Note that s//.../ (i.e., an empty regex) reuses the last matching regex in the range, which is /^[^#]/ in this case.
Note that the command doesn't work with BSD/OSX sed, unfortunately, because starting a range with 0 so as to allow the range endpoint to match the very first line also is not supported there. It is possible to make the command work with BSD/OSX sed, but it's more cumbersome.
If the input/output file is not very large, you can do it all in Bash:
optsfile=to_submit.txt
has_run_cmd=0
outputlines=()
while IFS= read -r inputline || [[ -n $inputline ]] ; do
read run opt c <<<"$inputline"
if (( has_run_cmd )) || [[ $run == \#* ]] ; then
outputlines+=( "$inputline" )
elif ./submit.py "$opt" "$run" -c "$c" ; then
has_run_cmd=1
outputlines+=( "#$inputline" )
else
exit $?
fi
done < "$optsfile"
(( has_run_cmd )) && printf '%s\n' "${outputlines[#]}" > "$optsfile"
The lines of the file are put in the outputlines array, with a hash prepended to the line that was used in the ./submit.py command. If the command runs successfully, the file is overwritten with the lines in outputlines.
After some searching around I found that
awk -v run="$run" -v opt="$opt" '{if($1 == run && $2 == opt) {print "#" $0} else print}' to_submit.txt > temp
mv -b -f temp to_submit.txt
seems to solve this (without needing to find the line number first, just comparing $ run and $opt). This assumes that the combination of run and opt is enough to identify a line and the comment is not needed (which happens to be true in my case). Not sure how the comment which is spanning multiple fields in awk would also be taken into account.

awk won't print new line characters

I am using the below code to change an existing awk script so that I can add more and more cases with a simple command.
echo `awk '{if(/#append1/){print "pref'"$1"'=0\n" $0 "\n"} else{print $0 "\n"}}' tf.a
note that the first print is "pref'"$1"'=0\n" so it is referring to the variable $1 in its environment, not in awk itself.
The command ./tfb.a "c" should change the code from:
BEGIN{
#append1
}
...
to:
BEGIN{
prefc=0
#append1
}
...
However, it gives me everything on one line.
Does anyone know why this is?
If you take awk right out of the equation you can see what's going on:
# Use a small test file instead of an awk script
$ cat xxx
hello
there
$ echo `cat xxx`
hello there
$ echo "`cat xxx`"
hello
there
$ echo "$(cat xxx)"
hello
there
$
The backtick operator expands the output into shell "words" too soon. You could play around with the $IFS variable in the shell (yikes), or you could just use double-quotes.
If you're running a modern sh (e.g. ksh or bash, not the "classic" Bourne sh), you may also want to use the $() syntax (it's easier to find the matching start/end delimiter).
do it like this. pass the variable from shell to awk properly using -v
#!/bin/bash
toinsert="$1"
awk -v toinsert=$toinsert '
/#append1/{
$0="pref"toinsert"=0\n"$0
}
{print}
' file > temp
mv temp file
output
$ cat file
BEGIN{
#append1
}
$ ./shell.sh c
BEGIN{
prefc=0
#append1
}