Awk replacement pieces size limit

I'm trying to find a single word and replace it with the contents of a file. This works on macOS, but not under Linux.
Here is the awk command that fails under Linux:
awk -v var="${blah}" '{sub(/%WORD%/,var)}1' file.xml
(file.xml is 122 lines, 4.7K)
Error is:
awk: program limit exceeded: replacement pieces size=255
The same file.xml under macOS, using a slightly different awk invocation, works fine:
awk -v var="${blah//$'\n'/\\n}" '{sub(/%WORD%/,var)}1' file.xml
Recompiling awk is not an option. This is Ubuntu 12.04, 32-bit.

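You could use sed, provided the file's contents contain no newlines, slashes, or ampersands (sed would interpret those inside the replacement):
FILE=$(cat Filename)
sed "s/%WORD%/${FILE}/g" file.xml > newfile.xml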

It turns out that good old 'replace' outperforms awk in this use case. Who would have thought?
replace -v "%WORD%" "$blah" -- file.xml

Using GNU Awk version 4 and the readfile extension:
gawk -f a.awk file.xml
where a.awk is:
@load "readfile"
BEGIN {
    var = readfile("blah")
    if (var == "" && ERRNO != "")
        print("problem reading file", ERRNO) > "/dev/stderr"
}
{
    sub(/%WORD%/, var)
    print
}
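
If gawk is not an option either, one portable route is to skip sub() altogether and splice the text in with index() and substr(). The following is only a minimal sketch, assuming the token is a literal string and that only the first occurrence on each line should be replaced, as sub() would do. The function name replace_token and its argument order are made up for illustration. Since it never builds a sub() replacement string, it should sidestep the "replacement pieces" limit, although I have not tried it on that exact awk build:

```shell
# Hypothetical helper: splice the contents of a file in place of a literal token.
# Usage: replace_token TOKEN REPLACEMENT_FILE INPUT_FILE
replace_token() {
    awk -v token="$1" -v repfile="$2" '
        BEGIN {
            # Read the replacement text line by line, joining lines with "\n"
            # (no trailing newline is added after the last line).
            while ((getline line < repfile) > 0)
                rep = rep (n++ ? "\n" : "") line
            close(repfile)
        }
        {
            # Literal search: no regex and no special meaning for & or \ in rep.
            pos = index($0, token)
            if (pos > 0)
                $0 = substr($0, 1, pos - 1) rep substr($0, pos + length(token))
            print
        }
    ' "$3"
}
```

For the question's case this would be called as replace_token '%WORD%' blah.txt file.xml, where blah.txt is an assumed name for the file whose contents should be inserted.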

Related

Adding hashtag to files

I have an awk program, add_hashtag.awk:
BEGIN{printf("#")}1
and a bash loop
for file in *.asc; do awk -f add_hashtag.awk "$file" > "$file"_in; done
that adds a # to the start of each file. It works; however, I would like the output files to keep the same names. When I run
for file in *.asc; do awk -f add_hashtag.awk "$file" > "$file"; done
I get files containing only #.
How can I do that? Thank you.

Could you please try the following:
for file in *.asc; do awk -f add_hashtag.awk "$file" > "temp_file" && mv "temp_file" "$file"; done
Redirecting straight to "$file" fails because the shell truncates the file before awk gets to read it. This approach writes the output to temp_file and renames it to the input file afterwards, so there is no danger of losing or truncating the original. Because of the &&, temp_file replaces the input file only if the awk command succeeds.
With gawk 4.1.0 or later, try this (untested, since no samples were given):
awk -i inplace -f add_hashtag.awk *.asc
Or, if you want to edit the files in place and keep backups:
awk -i inplace -v INPLACE_SUFFIX=.backup -f add_hashtag.awk *.asc
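
For awks without the inplace extension, the temp-file approach can be wrapped up a little more defensively. This is only a sketch: the function name add_hashtag_inplace is made up, the awk program is inlined instead of read from add_hashtag.awk, and mktemp is assumed to be available so that each file gets its own uniquely named temporary file instead of a shared temp_file:

```shell
# Hypothetical helper: prepend "#" to each file given, replacing it in place.
add_hashtag_inplace() {
    for file in "$@"; do
        # Create the temporary file next to the original so mv stays on one filesystem.
        tmp=$(mktemp "${file}.XXXXXX") || return 1
        if awk 'BEGIN{printf("#")}1' "$file" > "$tmp"; then
            # Only overwrite the original once awk has succeeded.
            mv "$tmp" "$file"
        else
            rm -f "$tmp"
            return 1
        fi
    done
}
```

It would be called as add_hashtag_inplace *.asc. Note that the replaced file takes on the permissions of the temporary file created by mktemp, which may differ from the original's.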

How does gawk -e 'BEGIN {' -e 'print "hello" }' work?

Gawk 5.0.0 was released on April 12, 2019. Going through the announcement I found this:
Changes from 4.2.1 to 5.0.0
(...) 11. Namespaces have been implemented! See the manual. One consequence of this is that files included with -i, read with -f, and command line program segments must all be self-contained syntactic units. E.g., you can no longer do something like this:
gawk -e 'BEGIN {' -e 'print "hello" }'
I was curious about this behaviour that is no longer supported, but unfortunately my Gawk 4.1.3 did not offer much output out of it:
$ gawk -e 'BEGIN {' -e 'print "hello" }'
gawk: cmd. line:1: BEGIN {
gawk: cmd. line:1: ^ unexpected newline or end of string
From what I see in the manual of GAWK 4.2, the -e option was marked as problematic already:
GNU Awk User's Guide, on Options
-e program-text
--source program-text
Provide program source code in the program-text. This option allows you to mix source code in files with source code that you enter on the command line. This is particularly useful when you have library functions that you want to use from your command-line programs (see AWKPATH Variable).
Note that gawk treats each string as if it ended with a newline character (even if it doesn’t). This makes building the total program easier.
CAUTION: At the moment, there is no requirement that each program-text be a full syntactic unit. I.e., the following currently works:
$ gawk -e 'BEGIN { a = 5 ;' -e 'print a }'
-| 5
However, this could change in the future, so it’s not a good idea to rely upon this feature.
But, again, this fails in my console:
$ gawk -e 'BEGIN {a=5; ' -e 'print a }'
gawk: cmd. line:1: BEGIN {a=5;
gawk: cmd. line:1: ^ unexpected newline or end of string
So what is gawk -e 'BEGIN {' -e 'print "hello" }' doing exactly on Gawk < 5?

It's doing just what you'd expect - concatenating the parts to form gawk 'BEGIN {print "hello" }' and then executing it. You can actually see how gawk is combining the code segments by pretty-printing it:
$ gawk -o- -e 'BEGIN {' -e 'print "hello" }'
BEGIN {
print "hello"
}
There is little point in writing that particular script in sections and concatenating them, but consider something like:
$ cat usea.awk
{ a++ }
$ echo foo | gawk -e 'BEGIN{a=5}' -f usea.awk -e 'END{print a}'
6
then you can see the intended functionality might be useful for mixing some command-line code with scripts stored in files to run:
$ gawk -o- -e 'BEGIN{a=5}' -f usea.awk -e 'END{print a}'
BEGIN {
a = 5
}
{
a++
}
END {
print a
}

AWK (igawk) @include statement fails

Here’s what I’m currently trying as a base case with the function definition written manually (which works):
igawk 'function tripleit(x) {return x*3} {print tripleit($1)}' <(echo 5)
Here is a theoretically more practical version calling a function library (which fails):
igawk '@include $HOME/code/thefunc {print tripleit($1)}' <(echo 5)
Here's "thefunc" :
function tripleit(x){return x*3}
If anyone knows HOW or WHY this is failing, and how I can get something like this to work, it would be super-helpful. I love AWK, but I'm not about to type and retype UDFs each and every time I need them.
I have tried to create foo.awk:
function foo(){print "Hello World"}
And call this as suggested:
$ cat foo.awk
function foo(){print "Hello World"}
$ igawk '@include "foo.awk"; BEGIN{foo()}'
igawk:/dev/stdin:0: cannot find "foo.awk";
$ igawk '@include "$PWD/foo.awk"; BEGIN{foo()}'
$ igawk '@include "./foo.awk"; BEGIN{foo()}'
$
No output yet.

awk has no idea what the shell variable $HOME contains (inside single quotes the shell does not expand it either), and @include requires a string as its argument.
$ cat foo.awk
function foo() {
print "Hello World"
}
$ gawk '@include $PWD/foo.awk; BEGIN{foo()}'
gawk: cmd. line:1: @include $PWD/foo.awk; BEGIN{foo()}
gawk: cmd. line:1: ^ syntax error
$ gawk '@include "$PWD/foo.awk"; BEGIN{foo()}'
gawk: cmd. line:1: error: can't open source file `$PWD/foo.awk' for reading (No such file or directory)
$ gawk '@include "./foo.awk"; BEGIN{foo()}'
Hello World
You can also use AWKPATH instead of explicitly providing the library directory path every time:
$ echo "$AWKPATH"
$ gawk '@include "foo.awk"; BEGIN{foo()}'
Hello World
$ mkdir blob
$ mv foo.awk blob
$ gawk '@include "foo.awk"; BEGIN{foo()}'
gawk: cmd. line:1: error: can't open source file `foo.awk' for reading (No such file or directory)
$ AWKPATH="$PWD/blob:$AWKPATH" gawk '@include "foo.awk"; BEGIN{foo()}'
Hello World
Alternatively, try:
gawk -f foo.awk -f - <<<'BEGIN{foo()}'
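
If neither igawk nor gawk's include mechanism is available, note that POSIX awk accepts more than one -f option and treats the files, in order, as a single program, so a function library can be loaded without any include directive. A small sketch using throwaway files in a temporary directory (the names thefunc.awk and main.awk are only for illustration):

```shell
# Work in a scratch directory so nothing in the current directory is touched.
workdir=$(mktemp -d)

# The function library, as in the question.
cat > "$workdir/thefunc.awk" <<'EOF'
function tripleit(x) { return x * 3 }
EOF

# The main program that calls the library function.
cat > "$workdir/main.awk" <<'EOF'
{ print tripleit($1) }
EOF

# Both files are concatenated by awk into one program.
result=$(echo 5 | awk -f "$workdir/thefunc.awk" -f "$workdir/main.awk")
echo "$result"

rm -rf "$workdir"
```

The trade-off is that the main program has to live in a file too, since a command-line program text cannot be combined with -f in plain POSIX awk.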

(plopping this here in case I run into this again ...)
It took me some fiddling about to get this right, but you can encode AWKPATH (or any other environment variable) into any script like this:
#!/usr/bin/env -S AWKPATH=${HOME}/bin awk -f
@include "utilities.awk"
...
Don't forget to chmod +x the script.
The tricky part was the man page documentation for -S which says
-S, --split-string=S
which seems to imply the following (which fails):
#!/usr/bin/env -S AWKPATH=${HOME}/bin awk -f

awk ignore case isn't working

I am using this code to get IP entries from a hosts file, ignoring case, and it doesn't seem to work on AIX.
Input file
172.23.1.230 enboprtpapzp04.digjam.com enboprtpapzp04
#172.23.0.33 enboprtpapzp04.digjam.com enboprt enboprtpapzp04
172.23.1.230 enboprtpapzp04.fixture.com enboprtpap enboprtpapzp04
awk -v client="$client" 'BEGIN {IGNORECASE = 1}{k=0; for (i=1;i<=NF;i++){if ($i==client){print $1}; k++}}' file
See the output below
client=ENBOPRTPAPZP04
awk -v client="$client" 'BEGIN {IGNORECASE = 1}{k=0; for (i=1;i<=NF;i++){if ($i==client){print $1}; k++}}' file
Nothing comes up
expected output
grep -i ENBOPRTPAPZP04 /etc/hosts | awk '{print $1}' | grep -v "^#"
172.23.1.230
172.23.1.230

It works here:
$ awk -v client="$client" 'BEGIN{IGNORECASE = 1} $2==client && /^[^#]/{print $1}' your_hosts
172.23.1.230
172.23.1.230
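Are you sure you are using GNU awk? IGNORECASE is a gawk extension; other awks treat it as an ordinary variable with no special effect. If you are not using gawk, you could use tolower() instead: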
$ awk -v client="$client" 'tolower($2)==tolower(client) && /^[^#]/{print $1}' your_hosts
In light of the recent edits to the question and the mention of the loop in the comments, I'll add this:
$ awk -v client="$client" '{for(i=1;i<=NF;i++) if(tolower($i)==tolower(client) && $1!~/^#/)print $1}' your_new_hosts
172.23.1.230
172.23.1.230
Also, check @EdMorton's last comment below for a non-looping version.
The check for /^#/ could instead go in the condition part, outside the action block:
$ awk ... '!/^#/ {for(i=1;i<=NF;i++) if(tolower($i)==tolower(client)) print $1}' your_new_hosts

awk: passing variables from bash

I am getting syntax errors with the following code. Is there an awk version that does not support the "-v" option or am I missing something? Thanks.
#!/usr/local/bin/bash
f_name="crap.stat"
S_Date="2012-02-10"
E_Date="2012-02-13"
awk -F "\t" -v s_date="$S_Date" -v e_date="$E_Date" 'BEGIN {print s_date,e_date}' $f_name

Your code works fine with my awk (GNU Awk 3.1.6).
There is another way, though: if you export your variables, you can read them from the ENVIRON array.
$ export f_name="crap.stat"
$ awk '{ print ENVIRON["f_name"] }' anyfile
crap.stat

The default awk program on Solaris 10 (aka oawk) does not seem to support the -v option; the alternative nawk program does support it. Some people switch the name awk so it is a link to nawk, so you can't readily predict which you'll find as awk.
The awk programs on HP-UX 11.x, AIX 6.x and Mac OS X (10.7.x) all support the -v notation, which isn't very surprising since POSIX expects support for -v.
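
One practical difference between the two ways of passing values is worth knowing: awk processes escape sequences such as \n and \t in a -v assignment, whereas values read from ENVIRON arrive unchanged. A short sketch (the variable names path, via_v and via_env are just for illustration):

```shell
# A value containing backslashes that awk would treat as escape sequences.
path='C:\new\table'

# With -v, awk interprets \n as a newline and \t as a tab.
via_v=$(awk -v p="$path" 'BEGIN { print p }')

# Through the environment, the string is passed through untouched.
via_env=$(p="$path" awk 'BEGIN { print ENVIRON["p"] }')

printf 'via -v:      %s\n' "$via_v"
printf 'via ENVIRON: %s\n' "$via_env"
```

So for arbitrary text, such as file contents or Windows-style paths, ENVIRON is usually the safer choice, while -v is fine for simple values like the dates in the question.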
