How do I dump the contents of SYMTAB in gawk?

I've tried things like the following, which displays scalars just fine. It also displays the array names and indices, but it doesn't display the value of each array element.
for (i in SYMTAB) {
if (isarray(SYMTAB[i])) {
for (j in SYMTAB[i]) {
printf "%s[%s] = %s\r\n", i, j, SYMTAB[i, j]
}
} else {
printf "%s = %s\r\n", i, SYMTAB[i]
}
}
which gives results like:
OFS =
ARGC = 1
PREC = 53
ARGIND = 0
ERRNO =
ARGV[0] =
For example, I would expect to see a value after ARGV[0], but I don't.

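Use SYMTAB[i][j] instead of SYMTAB[i,j]. Your loops already use gawk's array-of-arrays syntax (for (j in SYMTAB[i])) to reach the indices, so keep using it for the values; SYMTAB[i,j] instead looks up the single subscript i SUBSEP j, which doesn't exist.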
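A minimal sketch of why the comma form fails, runnable in any POSIX awk: a[i, j] is stored under one flat string key, i SUBSEP j, rather than as element j of a sub-array i. The function name subsep_demo and the sample key are illustrative only.

```shell
subsep_demo() {
    awk 'BEGIN {
        a["ARGV", 0] = "x"                 # comma subscript: one flat key
        for (k in a) {
            n = split(k, parts, SUBSEP)    # recover the two halves of the key
            print n, parts[1], parts[2]
        }
    }'
}
subsep_demo    # prints: 2 ARGV 0
```

So SYMTAB[i, j] asks SYMTAB for a variable whose name is that joined string, which is why the value came back empty.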
Here's a recursive function to dump SYMTAB or any other array or scalar:
$ cat tst.awk
function dump(name,val, i) {
if ( isarray(val) ) {
printf "%*s%s %s%s", indent, "", name, "{", ORS
indent += 3
for (i in val) {
dump(i,val[i])
}
indent -= 3
printf "%*s%s %s%s", indent, "", name, "}", ORS
}
else {
printf "%*s%s = <%s>%s", indent, "", name, val, ORS
}
}
BEGIN {
dump("SYMTAB",SYMTAB)
}
$ awk -f tst.awk
SYMTAB {
ARGV {
0 = <awk>
ARGV }
ROUNDMODE = <N>
ORS = <
>
OFS = < >
LINT = <0>
FNR = <0>
ERRNO = <>
NR = <0>
IGNORECASE = <0>
TEXTDOMAIN = <messages>
NF = <0>
ARGIND = <0>
indent = <3>
ARGC = <1>
PROCINFO {
argv {
0 = <awk>
1 = <-f>
2 = <tst.awk>
argv }
group9 = <15>
ppid = <2212>
...
strftime = <%a %b %e %H:%M:%S %Z %Y>
group8 = <11>
PROCINFO }
FIELDWIDTHS = <>
CONVFMT = <%.6g>
SUBSEP = <>
PREC = <53>
ENVIRON {
SHLVL = <1>
ENV = <.env>
...
INFOPATH = </usr/local/info:/usr/share/info:/usr/info>
TEMP = </tmp>
ProgramData = <C:\ProgramData>
ENVIRON }
RS = <
>
FPAT = <[^[:space:]]+>
RT = <>
RLENGTH = <0>
OFMT = <%.6g>
FS = < >
RSTART = <0>
FILENAME = <>
BINMODE = <0>
SYMTAB }
Massage to suit...

Thank you Ed Morton. Looks like a recursive process would be required if I needed to support arbitrary levels of nested arrays, but for now this code dumps my gawk SYMTAB without errors:
for (i in SYMTAB) {
if (!isarray(SYMTAB[i])) {
printf "%s = %s\r\n", i, SYMTAB[i]
} else {
for (j in SYMTAB[i]) {
if (!isarray(SYMTAB[i][j])) {
printf "%s[%s] = %s\r\n", i, j, SYMTAB[i][j]
} else {
for (k in SYMTAB[i][j]) {
if (!isarray(SYMTAB[i][j][k])) {
printf "%s[%s][%s] = %s\r\n", i, j, k, SYMTAB[i][j][k]
} else {
printf "Skipping highly nested array.\r\n"
}
}
}
}
}
}
Thanks again!

Related

awk: if a user defined function returns 1, start from the very beginning

Let's say I have the following script:
function helper1() {
if (NR==3 && !/PATTERN/) {
return 1
} else {
if (NR>=13) {
print $0
}
return 0
}
}
BEGIN {
if (helper1() == 1) {
print $0
}
}
Which means, I have a user-defined helper function, which checks a file if the 3rd line contains some PATTERN, and if that's true, then it prints out all the other lines starting from line 13.
But if it's not true (the helper function returns 1), then I'd like awk to print all the lines starting from line 1. Which is not happening :)
Would be grateful for any advice here,
Thank you.
You may use this awk:
awk 'NR < 3 { # for first 2 lines
s = s $0 ORS # store all lines in a variable s
next # skip to next record
}
NR == 3 { # for record number 3
if (/PATTERN/) # if PATTERN is found
p = 1 # set flag p to 1
else # else
printf "%s", s # print first 2 lines
}
(p && NR >= 13) || !p # print if flag is not set or else if NR >= 13
' file
Using a function:
awk '
function helper1() {
if (NR < 3) {
s = s $0 ORS
return 0
}
else if (NR == 3) {
if (/PATTERN/)
p = 1
else
printf "%s", s
}
return (p && NR >= 13) || !p
}
helper1()
' file
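As a quick check of the logic above, here is a sketch that wraps the first answer's rules in a shell function and feeds it two small inputs. The name filter_lines and the sample data are assumptions made for the demonstration.

```shell
filter_lines() {
    awk '
    NR < 3  { s = s $0 ORS; next }                         # buffer lines 1-2
    NR == 3 { if (/PATTERN/) p = 1; else printf "%s", s }  # decide on line 3
    (p && NR >= 13) || !p                                  # default action: print
    '
}

# Line 3 matches: only lines 13 and later are printed.
printf '%s\n' 1 2 PATTERN 4 5 6 7 8 9 10 11 12 13 14 | filter_lines
# Line 3 does not match: every line is printed.
printf '%s\n' 1 2 3 4 | filter_lines
```

The first call should print 13 and 14; the second should print all four lines, because the buffered first two lines are flushed as soon as line 3 fails to match.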

Awk create a new array of unique values from another array

I have my array:
array = [1:"PLCH2", 2:"PLCH1", 3:"PLCH2"]
I want to loop over array to create a new array, unique, that holds only the unique values, and obtain:
unique = [1:"PLCH2", 2:"PLCH1"]
How can I achieve that?
EDIT: as per @Ed Morton's request, I show below how my array is populated. In fact, this post is the key to solving my previous post.
in my file.txt, I have:
PLCH2:A1007int&PLCH1:D987int&PLCH2:P977L
INTS11:P446P&INTS11:P449P&INTS11:P518P&INTS11:P547P&INTS11:P553P
I use split to obtain array:
awk '{
split($0,a,"&")
for ( i in a ) {
split(a[i], b, ":");
array[i] = b[1];
}
}' file.txt
This might be what you're trying to do:
$ cat tst.awk
BEGIN {
split("PLCH2 PLCH1 PLCH2",array)
printf "array ="
for (i=1; i in array; i++) {
printf " %s:\"%s\"", i, array[i]
}
print ""
for (i=1; i in array; i++) {
if ( !seen[array[i]]++ ) {
unique[++j] = array[i]
}
}
printf "unique ="
for (i=1; i in unique; i++) {
printf " %s:\"%s\"", i, unique[i]
}
print ""
}
$ awk -f tst.awk
array = 1:"PLCH2" 2:"PLCH1" 3:"PLCH2"
unique = 1:"PLCH2" 2:"PLCH1"
EDIT: given your updated question, here's how I'd really approach that:
$ cat tst.awk
BEGIN { FS="[:&]" }
{
numVals=0
delete vals # drop values left over from the previous record
for (i=1; i<NF; i+=2) {
vals[++numVals] = $i
}
print "vals =" arr2str(vals)
delete seen
delete uniq # likewise, so stale entries are not printed
numUniq=0
for (i=1; i<=numVals; i++) {
if ( !seen[vals[i]]++ ) {
uniq[++numUniq] = vals[i]
}
}
print "uniq =" arr2str(uniq)
}
function arr2str(arr, str, i) {
for (i=1; i in arr; i++) {
str = str sprintf(" %s:\"%s\"", i, arr[i])
}
return str
}
$ awk -f tst.awk file
vals = 1:"PLCH2" 2:"PLCH1" 3:"PLCH2"
uniq = 1:"PLCH2" 2:"PLCH1"
vals = 1:"INTS11" 2:"INTS11" 3:"INTS11" 4:"INTS11" 5:"INTS11"
uniq = 1:"INTS11"
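For comparison, a compact sketch of the same !seen[x]++ idiom that prints the unique names of each line directly instead of building a second array. The function name uniq_names and the space-separated output format are assumptions for illustration; split("", seen) is used as a portable way to empty the array for each record.

```shell
uniq_names() {
    awk -F'[:&]' '{
        split("", seen)                     # portable way to empty an array
        out = ""
        for (i = 1; i < NF; i += 2)         # odd fields hold the names
            if (!seen[$i]++)
                out = out (out == "" ? "" : " ") $i
        print out
    }'
}
printf '%s\n' 'PLCH2:A1007int&PLCH1:D987int&PLCH2:P977L' \
              'INTS11:P446P&INTS11:P449P' | uniq_names
```

This should print PLCH2 PLCH1 for the first line and INTS11 for the second, with the names kept in first-seen order.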

Awk input variable as a rule

Good day!
I have the next code:
BLOCK=`awk '
/\/\* R \*\// {
level=1
count=0
}
level {
n = split($0, c, "");
for (i = 1; i <= n; i++)
{
printf(c[i]);
if (c[i] == ";")
{
if(level==1)
{
level = 0;
if (count != 0)
printf("\n");
};
}
else if (c[i] == "{")
{
level++;
count++;
}
else if (c[i] == "}")
{
level--;
count++;
}
}
printf("\n")
}' $i`
That code cuts out the piece of the file from the /* R */ mark to the ';' symbol, taking details such as nested braces into account. But that isn't important. I want to replace the hard-coded /* R */ with a variable:
RECORDSEQ="/* R */"
...
BLOCK=`awk -v rec="$RECORDSEQ" '
rec {
level=1
count=0
}
But that doesn't work.
How can I fix it?
Thank you in advance.
Found the solution:
RECORDSEQ="/* R */"
# Construct regexp for awk
RECORDSEQREG=`echo "$RECORDSEQ" | sed 's:\/:\\\/:g;s:\*:\\\*:g'`
# Cycle for files
for i in $SOURCE;
do
# Find RECORDSEQ and cut out the block
BLOCK=`awk -v rec="$RECORDSEQREG" '
$0 ~ rec {
level=1
count=0
}
...
Many thanks to people who helped.
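If the marker is meant to be matched as literal text rather than as a regular expression, a possible alternative that avoids the sed escaping step is awk's index() function, which does a plain substring search. The sketch below assumes that literal matching is what is wanted; find_marker and the sample input are hypothetical names and data.

```shell
find_marker() {
    # $1 is the literal marker text; index() returns 0 when it is absent.
    awk -v rec="$1" 'index($0, rec) { print NR ": " $0 }'
}
printf '%s\n' 'int a;' '/* R */ struct r {' '};' | find_marker '/* R */'
```

This should report only line 2. Note that awk still interprets backslash escapes in a value passed with -v, so a marker containing backslashes would need extra care.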

awk nesting curly brackets

I have the following awk script where I seem to need to nest curly brackets. But this is not allowed in awk. How can I fix this issue in my script here?
The problem is in the if(inqueued == 1).
BEGIN {
print "Log File Analysis Sequencing for " + FILENAME;
inqueued=0;
connidtext="";
thisdntext="";
}
/message EventQueued/ {
inqueued=1;
print $0;
}
if(inqueued == 1) {
/AttributeConnID/ { connidtext = $0; }
/AttributeThisDN / { thisdntext = $2; } #space removes DNRole
}
#if first chars are a timetamp we know we are out of queued text
/\#?[0-9]+:[0-9}+:[0-9]+/
{
if(thisdntext != 0) {
print connidtext;
print thisdntext;
}
inqueued = 0; connidtext=""; thisdntext="";
}
Try changing
if(inqueued == 1) {
/AttributeConnID/ { connidtext = $0; }
/AttributeThisDN / { thisdntext = $2; } #space removes DNRole
}
to
inqueued == 1 {
if($0~ /AttributeConnID/) { connidtext = $0; }
if($0~/AttributeThisDN /) { thisdntext = $2; } #space removes DNRole
}
or
inqueued == 1 && /AttributeConnID/{connidtext = $0;}
inqueued == 1 && /AttributeThisDN /{ thisdntext = $2; } #space removes DNRole
An awk program is made up of <condition> { <action> } rules. Within an <action> you can test conditions with if or while, just as in C, but a <condition> { <action> } rule cannot be nested inside another action. Your script has a few other problems too, so just rewrite it as:
BEGIN {
# FILENAME is still empty inside BEGIN, so no file name is printed here
print "Log File Analysis Sequencing for", FILENAME
}
/message EventQueued/ {
inqueued=1
print
}
inqueued == 1 {
if (/AttributeConnID/) { connidtext = $0 }
if (/AttributeThisDN /) { thisdntext = $2 } #space removes DNRole
}
#if first chars are a timestamp we know we are out of queued text
/\#?[0-9]+:[0-9]+:[0-9]+/ {
if (thisdntext != "") {
print connidtext
print thisdntext
}
inqueued=connidtext=thisdntext=""
}
I don't know if that'll do what you want or not, but it's syntactically correct at least.
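To illustrate the flag-as-condition idea in isolation, here is a small sketch against a made-up log format; the sample lines, the rule order, and the function name queued_ids are all assumptions and not taken from the real log file.

```shell
queued_ids() {
    awk '
    /^[0-9]+:[0-9]+:[0-9]+/       { inqueued = 0 }   # timestamp ends a block
    /message EventQueued/         { inqueued = 1 }   # this event starts one
    inqueued && /AttributeConnID/ { print $2 }
    '
}
printf '%s\n' '12:00:01 message EventQueued' ' AttributeConnID 0071' \
              '12:00:02 message Other' ' AttributeConnID 9999' | queued_ids
```

Only 0071 should be printed, because the second AttributeConnID line arrives after a timestamp line has cleared the flag. In this sketch the reset rule is placed first so that a line carrying both a timestamp and EventQueued still opens a block.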

awk '/range start/,/range end/' within script

How do I use the awk range pattern '/begin regex/,/end regex/' within a self-contained awk script?
To clarify, given program csv.awk:
#!/usr/bin/awk -f
BEGIN {
FS = "\""
}
/TREE/,/^$/
{
line="";
for (i=1; i<=NF; i++) {
if (i != 2) line=line $i;
}
split(line, v, ",");
if (v[5] ~ "FOAM") {
print NR, v[5];
}
}
and file chunk:
TREE
10362900,A,INSTL - SEAL,Revise
,10362901,A,ASSY / DETAIL - PANEL,Revise
,,-203,ASSY - PANEL,Qty -,Add
,,,-309,PANEL,Qty 1,Add
,,,,"FABRICATE FROM TEKLAM NE1G1-02-250 PER TPS-CN-500, TYPE A"
,,,-311,PANEL,Qty 1,Add
,,,,"FABRICATE FROM TEKLAM NE1G1-02-750 PER TPS-CN-500, TYPE A"
,,,-313,FOAM SEAL,1.00 X 20.21 X .50 THK,Qty 1,Add
,,,,"BMS1-68, GRADE B, FORM II, COLOR BAC706 (BLACK)"
,,,-315,FOAM SEAL,1.50 X 8.00 X .25 THK,Qty 1,Add
,,,,"BMS1-68, GRADE B, FORM II, COLOR BAC706 (BLACK)"
,PN HERE,Dual Lock,Add
,
10442900,IR,INSTL - SEAL,Update (not released)
,10362901,A,ASSY / DETAIL - PANEL,Revise
,PN HERE,Dual Lock,Add
I want to have this output:
27 FOAM SEAL
29 FOAM SEAL
What is the syntax for adding the command line form '/begin regex/,/end regex/' to the script to operate on those lines only? All my attempts lead to syntax errors and googling only gives me the cli form.
Why not use two steps:
% awk '/start/,/end/' < input.csv | awk -f csv.awk
Simply do:
#!/usr/bin/awk -f
BEGIN {
FS = "\""
}
/from/,/to/ {
line="";
for (i=1; i<=NF; i++) {
if (i != 2) line=line $i;
}
split(line, v, ",");
if (v[5] ~ "FOAM") {
print NR, v[5];
}
}
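The detail that matters here is that the opening brace sits on the same line as the range pattern. In awk, a pattern followed by a newline is a complete rule with the default action (print), and a brace block on the next line becomes a separate rule that runs for every record. A minimal sketch of the difference, using made-up START/END markers and hypothetical function names:

```shell
range_same_line() { awk '/START/,/END/ { print "in:", $0 }'; }

range_next_line() {
    awk '/START/,/END/
    { print "in:", $0 }'
}

printf '%s\n' a START b END c | range_same_line
printf '%s\n' a START b END c | range_next_line
```

The first call should print "in:" lines for START, b and END only. The second prints an "in:" line for all five records and additionally echoes START, b and END unchanged.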
If the from and to regexes are dynamic:
#!/usr/bin/awk -f
BEGIN {
FS = "\""
FROM=ARGV[1]
TO=ARGV[2]
# Blank out the two pattern arguments so awk does not try to open them as
# files; with no file operand left, awk reads from standard input.
ARGV[1] = ARGV[2] = ""
}
{ if ($0 ~ FROM) { p = 1 ; l = 0} }
{ if ($0 ~ TO) { p = 0 ; l = 1} }
{
if (p == 1 || l == 1) {
line="";
for (i=1; i<=NF; i++) {
if (i != 2) line=line $i;
}
split(line, v, ",");
if (v[5] ~ "FOAM") {
print NR, v[5];
}
l = 0 }
}
Now you have to call it like: ./scriptname.awk "FROM_REGEX" "TO_REGEX" INPUTFILE. The last parameter is optional; if it is missing, the script reads from STDIN.
HTH
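Another possible way to make the two regexes dynamic, sketched here as an alternative rather than as part of the answer above, is to pass them with -v and keep the range pattern, since each side of a range can be any expression such as $0 ~ var. The function name range_between and the sample markers are assumptions.

```shell
range_between() {
    # $1 and $2 are the start and end regexes, passed in as awk variables
    awk -v from="$1" -v to="$2" '$0 ~ from, $0 ~ to { print NR ": " $0 }'
}
printf '%s\n' a TREE b END c | range_between 'TREE' 'END'
```

This should print lines 2 through 4 with their record numbers. Remember that -v interprets backslash escapes, so regexes containing backslashes need them doubled.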
You need to show us what you have tried. Is there something about /begin regex/ or /end regex/ you're not telling us? Otherwise, your script with the range pattern added should work, i.e.
#!/usr/bin/awk -f
BEGIN {
FS = "\""
}
/begin regex/,/end regex/{
line="";
for (i=1; i<=NF; i++) {
if (i != 2) line=line $i;
}
split(line, v, ",");
if (v[5] ~ "FOAM") {
print NR, v[5];
}
}
Or are you using an old Unix, where /usr/bin/awk is the old awk and the new awk is /usr/bin/nawk? Also check whether you have /usr/xpg4/bin/awk or gawk (the path could be anything).
Finally, show us the error messages you are getting.
I hope this helps.