Binary numbers in gawk

How can one specify a number as binary in gawk?
According to the manual, gawk interprets all numbers as decimal unless they are preceded by a 0 (octal) or by a 0x (hexadecimal). Unlike in certain other languages, 0b does not do the trick.
For instance, the following lines do not give the desired output (010000 or 10000) because the values are interpreted as octal/decimal or decimal/decimal, respectively:
gawk '{print and(010000,110000)}'
0
gawk '{print and(10000,110000)}'
9488
I suspect that gawk may not support base-2 and that a user-defined function will be required to generate binary representations.

You're right, there's no internal support for binary conversion in gawk. And incredibly, there isn't even any in printf(). So you're stuck with functions.
Remember that awk is weakly typed, which is why functions have insane behaviours like recognizing that "0x" at the beginning of a string means it's a hexadecimal number. In a language like this, it's better to control your own types.
Here's a couple of functions I've had sitting around for years...
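#!/usr/local/bin/gawk -f
# Convert a string of binary digits to a number.
# Input containing anything other than 0 or 1 is returned unchanged.
# result and i are declared as extra parameters so they stay local.
function bin2dec(n,    result, i) {
    result = 0;
    if (n ~ /[^01]/) {
        return n;
    }
    for (i = length(n); i; i--) {
        result += 2^(length(n)-i) * substr(n,i,1);
    }
    return result;
}
# Convert a non-negative integer to a string of binary digits.
function dec2bin(n,    result) {
    if (n == 0) {
        return "0";    # without this, zero would come back as an empty string
    }
    result = "";
    while (n) {
        if (n%2) {
            result = "1" result;
        } else {
            result = "0" result;
        }
        n = int(n/2);
    }
    return result;
}
{
    print dec2bin( and(bin2dec($1),bin2dec($2)) );
}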
And the result:
$ echo "1101 1011" | ./doit.awk
1001
$ echo "11110 10011" | ./doit.awk
10010
$
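If the values only need to be combined, not converted, the AND can also be done directly on the digit strings, which sidesteps awk's octal/decimal parsing altogether. What follows is only a minimal sketch, assuming any POSIX awk; the shell wrapper bin_and and the awk function binand are hypothetical names, not something gawk provides:

```shell
# bin_and A B: bitwise AND of two strings of binary digits (hypothetical helper).
bin_and() {
    awk -v a="$1" -v b="$2" '
    function binand(x, y,    out, bit, i) {
        # Left-pad the shorter operand with zeros so both have the same length.
        while (length(x) < length(y)) x = "0" x
        while (length(y) < length(x)) y = "0" y
        out = ""
        for (i = 1; i <= length(x); i++) {
            # A result digit is 1 only if both input digits are 1.
            bit = (substr(x, i, 1) == "1" && substr(y, i, 1) == "1") ? "1" : "0"
            out = out bit
        }
        return out
    }
    BEGIN { print binand(a, b) }'
}

bin_and 010000 110000
```

With the operands from the question this prints 010000. Leading zeros are kept because the result has the width of the longer operand.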

How to return 0 if awk returns null from processing an expression?

I currently have an awk script that checks whether the output of an expression contains more than one line. If it does, it aggregates and prints the sum. For example:
someexpression=$'JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)'
might be the one-liner where it DOESN'T yield any information. Then,
echo "$someexpression" | awk '
NR>1 {a[$4]++}
END {
for (i in a) {
printf "%d\n", a[i]
}
}'
This yields no output at all. Instead, I would like it to output a numeric value of $0$ when the result is empty. How can I modify the above to do this?
Nothing in UNIX "returns" anything (despite the unfortunately named keyword for setting the exit status of a function), everything (tools, functions, scripts) outputs X and exits with status Y.
Consider these 2 identical functions named foo(), one in C and one in shell:
C (x=foo() means set x to the return code of foo()):
foo() {
printf "7\n"; // this is outputting 7 from the full program
return 3; // this is returning 3 from this function
}
x=foo(); <- 7 is output on screen and x has value '3'
shell (x=foo means set x to the output of foo()):
foo() {
printf "7\n"; # this is outputting 7 from just this function
return 3; # this is setting this function's exit status to 3
}
x=foo <- nothing is output on screen, x has value '7', and '$?' has value '3'
Note that what the return statement does is vastly different in each. Within an awk script, printing and return codes from functions behave the same as they do in C but in terms of a call to the awk tool, externally it behaves the same as every other UNIX tool and shell script and produces output and sets an exit status.
So when discussing anything in UNIX avoid using the term "return" as it's imprecise and ambiguous and so different people will think you mean "output" while others think you mean "exit status".
In this case I assume you mean "output" BUT you should instead consider setting a non-zero exit status when there's no match like grep does, e.g.:
echo "$someexpression" | awk '
NR>1 {a[$4]++}
END {
for (i in a) {
print a[i]
}
exit (NR < 2)
}'
and then your code that uses the above can test for the success/fail exit status rather than testing for a specific output value, just like if you were doing the equivalent with grep.
You can of course tweak the above to:
echo "$someexpression" | awk '
NR>1 {a[$4]++}
END {
if ( NR > 1 ) {
for (i in a) {
print a[i]
}
}
else {
print "$0$"
exit 1
}
}'
if necessary and then you have both a specific output value and a success/fail exit status.
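The calling code can then branch on the exit status instead of inspecting the output. A minimal sketch, assuming a POSIX shell; the wrapper name count_by_user and the variables header and counts are made up for illustration:

```shell
# count_by_user: hypothetical wrapper around the first awk variant above.
# Prints one count per user and exits with status 1 if only the header was read.
count_by_user() {
    awk '
    NR>1 {a[$4]++}
    END {
        for (i in a) print a[i]
        exit (NR < 2)
    }'
}

header='JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)'

# The status of the assignment is the status of the pipeline inside $(...).
if counts=$(printf '%s\n' "$header" | count_by_user); then
    printf 'jobs per user:\n%s\n' "$counts"
else
    echo "no jobs"
fi
```

With only the header line as input, the else branch runs and "no jobs" is printed.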
You can set a flag inside the for loop to detect whether the loop executed at all:
echo "$someexpression" |
awk 'NR>1 {
a[$4]++
}
END {
for (i in a) {
p = 1
printf "%d\n", a[i]
}
if (!p)
print "$0$"
}'
$0$

Are strings in AWK treated as arrays?

I'm pretty new to awk/gawk and I found this a little confusing. I hope you can help me shed some light on this question: are strings in awk treated as arrays of chars?
I've read something about the stringy-ness of this computing language, so I tried to make a script like this:
BEGIN {
myArray[1] = "foo"
myArray[2] = "bar"
myArray[3] = 42
awkward()
}
function awkward() {
for (j in myArray) {
for (d in myArray[j]) {
# error!
}
}
}
Was I naive in thinking something like this could work? It actually does not work with my version of gawk, giving me this error:
for (d in myArray[j]) {
^ syntax error
Can anybody help me to understand more about why this should not work? Bonus: can anybody share their workaround on this?
To clarify a bit, I'm trying to access the content of myArray[j] char by char, using a for loop on index d.
No, strings in awk are not treated as arrays of chars. Use substr() to walk through a string one character at a time:
$ cat foo.awk
BEGIN {
myArray[1] = "foo"
myArray[2] = "bar"
myArray[3] = 42
awkward()
}
function awkward() {
for (j in myArray) {
for (i=1;i<=length(myArray[j]);i++) { # here
print substr(myArray[j],i,1) # and here
}
}
}
Run it:
$ awk -f foo.awk
f
o
o
b
a
r
4
2
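Another option is to turn the string into a real array first. This is only a sketch: splitting on an empty separator yields single characters in gawk, mawk and busybox awk, but POSIX leaves that behaviour unspecified, so treat it as an assumption about your awk. The helper name chars is hypothetical:

```shell
# chars STR: print each character of STR on its own line (hypothetical helper).
chars() {
    awk -v s="$1" 'BEGIN {
        # An empty separator splits s into single characters in common awks;
        # POSIX does not guarantee this.
        n = split(s, c, "")
        for (i = 1; i <= n; i++) print c[i]
    }'
}

chars foo
```

Unlike for (j in myArray), the counted loop over 1..n visits the characters in a guaranteed order.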

Print smallest integer from file using awk custom function?

My awk function, in a file named fun.awk, looks like this:
{
print small()
}
function small()
{
a[NR]=$0
smal=0
for(i=1;i<=3;i++)
{
if( a[i]<a[i+1])
smal=a[i]
else
smal=a[i+1]
}
return smal
}
The contents of awk.write:
1
23
32
The awk command is:
awk -f fun.awk awk.write
It gives me no result. Why?
I think you are going about this the wrong way. In awk, one approach might be:
NR == 1 {
small = $0
}
$0 < small {
small = $0
}
END {
print small
}
This simply updates small on each line to the smallest integer seen so far and prints it at the end. (Note: you need to start by initializing small on the first line.)
A simpler approach might just be to sort the lines as numbers with sort, and pick the first one.
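The sort-based approach could look like the sketch below, assuming the standard sort and head utilities; the function name smallest is made up for illustration:

```shell
# smallest: read numbers on stdin, one per line, and print the smallest one.
smallest() {
    # -n sorts numerically rather than lexically, so 23 sorts after 3.
    sort -n | head -n 1
}

printf '1\n23\n32\n' | smallest
```

For the three values from awk.write this prints 1.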

Busybox awk: How to treat each character in String as integer to perform bitwise operations?

I want to change the SSID (Wi-Fi network name) dynamically in OpenWRT via a script which grabs information from the internet.
Because the information grabbed from the internet may contain multi-byte characters, it can easily be truncated to an invalid UTF-8 byte sequence, so I want to use awk (busybox) to fix it. However, when I try to use the bitwise function and() on a string and an integer, the result is always 0.
awk 'BEGIN{v="a"; print and(v,0xC0)}'
How can I treat a character in a string as an integer in awk, like we can in C/C++? For example: char p[]="abc"; printf ("%d",*(p+1) & 0xC0);
You can write your own ord function like this (heavily borrowed from the GNU Awk User's Guide):
#!/bin/bash
awk '
BEGIN { _ord_init()
printf("ord(a) = %d\n", ord("a"))
}
function _ord_init( low, high, i, t)
{
low = sprintf("%c", 7) # BEL is ascii 7
if (low == "\a") { # regular ascii
low = 0
high = 127
} else if (sprintf("%c", 128 + 7) == "\a") {
# ascii, mark parity
low = 128
high = 255
} else { # ebcdic(!)
low = 0
high = 255
}
for (i = low; i <= high; i++) {
t = sprintf("%c", i)
_ord_[t] = i
}
}
function ord(str,c)
{
# only first character is of interest
c = substr(str, 1, 1)
return _ord_[c]
}'
Output
ord(a) = 97
I don't know if this is what you mean, since you didn't provide sample input and expected output, but take a look at this with GNU awk and maybe it'll help:
$ gawk -lordchr 'BEGIN{v="a"; print v " -> " ord(v) " -> " chr(ord(v))}'
a -> 97 -> a
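The reason and(v,0xC0) printed 0 in the question is that awk converts the string "a" to the number 0, not to its character code, so a lookup like ord() has to come first. Once you have the code, the mask itself can even be done with plain arithmetic. A minimal sketch, assuming ASCII input only (bytes above 127 depend on the awk and locale in use); the helper name top_bits and the array name code are hypothetical:

```shell
# top_bits STR: print (code of first character of STR) & 0xC0, ASCII only.
top_bits() {
    awk -v s="$1" 'BEGIN {
        # Build a character -> code table for the ASCII range.
        for (i = 1; i < 128; i++) code[sprintf("%c", i)] = i
        n = code[substr(s, 1, 1)]
        # For 0 <= n <= 255, int(n / 64) * 64 keeps only the top two bits,
        # which is the same value as n & 0xC0.
        print int(n / 64) * 64
    }'
}

top_bits b
```

This prints 64, matching the C expression *(p+1) & 0xC0 for p = "abc", since 'b' is 98.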

AWK multiple patterns substitution

Using AWK I'd like to process this text:
#replace count 12
in in
#replace in 77
main()
{printf("%d",count+in);
}
Into:
in in
ma77()
{pr77tf("%d",12+77);
}
When a '#replace' declaration occurs, only the code below it is affected. I've got:
/#replace/ { co=$2; czym=$3 }
!/#replace/ { gsub(co,czym); print }
However I'm getting only
in in
ma77()
{pr77tf("%d",count+77);
}
in return. As you can see, only the second gsub works. Is there a simple way to remember all the substitutions?
You just need to use an array to store the substitutions:
$ awk '/#replace/{a[$2]=$3;next}{for(k in a)gsub(k,a[k])}1' file
in in
ma77()
{pr77tf("%d",12+77);
}
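Spelled out with comments, the same idea might look like the sketch below. It assumes any POSIX awk; the wrapper name apply_replacements and the array name repl are made up for illustration. Two caveats: gsub() treats each key as a regular expression, and the order of for (k in repl) is unspecified, so rules whose replacement text contains another rule's key could interact.

```shell
# apply_replacements: read text on stdin and apply every "#replace FROM TO"
# rule to the lines that follow it (hypothetical helper).
apply_replacements() {
    awk '
    /^#replace/ { repl[$2] = $3; next }    # remember the rule, print nothing
    {
        for (k in repl) gsub(k, repl[k])   # apply every rule seen so far
        print
    }'
}

printf '%s\n' '#replace count 12' 'in in' '#replace in 77' 'main()' | apply_replacements
```

Here "in in" is printed unchanged because the in rule has not been declared yet, and main() becomes ma77().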