Using awk, how do I convert a decimal number to hexadecimal?

If I have an input stream of decimal numbers, e.g.
100 2000 599 232
and I pass them to awk, how do I print them in hexadecimal notation?
For example:
0x64 0x7D0 0x257 0xE8
Starting script:
echo "100 2000 599 232" | awk '{ print $1 }' #here print in hexa instead of decimal

You can use printf in awk with a format string to convert to hex:
awk '{ printf "%x\n", $1 }'
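The one-liner above converts only the first field. A minimal sketch that walks every field on the line and adds the 0x prefix asked for in the question, assuming uppercase digits are wanted (%X); to_hex is a hypothetical helper name, not part of awk:

```shell
# to_hex is a made-up wrapper name used only for this illustration.
to_hex() {
  awk '{
    # Print each field as hex; separate with spaces, end the line with a newline.
    for (i = 1; i <= NF; i++)
      printf "0x%X%s", $i, (i < NF ? " " : "\n")
  }'
}

echo "100 2000 599 232" | to_hex
```

For the sample input this prints 0x64 0x7D0 0x257 0xE8.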

Quick caveat: mawk 1.3.4 has severe limitations when it comes to printing octal and hex codes:
$ gawk 'BEGIN{ printf("%\043.16x\n",8^8*-1-2) }'
0xfffffffffefffffe
$ nawk 'BEGIN{ printf("%\043.16x\n",8^8*-1-2) }'
0xfffffffffefffffe
$ mawk 'BEGIN{ printf("%\043.16x\n",8^8*-1-2) }'
0000000000000000
$ mawk2 'BEGIN{ printf("%\043.16x\n",8^8*-1-2) }'
0xfffffffffefffffe
It's not even that large a value (-16777218), and mawk 1.3.4 completely belly-flops. On the flip side, it can directly decipher some hex constants (only gawk, when in neither POSIX nor traditional mode, can directly decipher octal constants):
$ mawk 'BEGIN { OFMT="%.f"; print +"0xDEADBEEF" }'
3735928559
$ nawk 'BEGIN { OFMT="%.f"; print +"0xDEADBEEF" }'
3735928559
$ mawk2 'BEGIN { OFMT="%.f"; print +"0xDEADBEEF" }'
0
$ gawk --posix 'BEGIN{ OFMT="%.f"; print +"0xDEADBEEF" }'
3735928559 <==== note the difference: in POSIX mode gawk only deciphers hex inside strings;
the "+" in front is also necessary because gawk would otherwise
just print it as a string.
$ gawk -e 'BEGIN { OFMT="%.f"; print 0xDEADBEEF }'
3735928559 <==== in standard mode gawk only deciphers hex constants written in the program text
- mawk2 is the only one among those above that
even prints anything out with %p in printf(),
but it still errors out, as such:
mawk2: line 1: invalid control character 'p'
in [s]printf format ("0x10f0099da
- both gawk and nawk properly print out %a
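Where printf's %x cannot be trusted for larger values, the hex digits can be built by hand with repeated division. A rough sketch, assuming non-negative integers below 2^53 (the range a double represents exactly); dec2hex and dec_to_hex are made-up names:

```shell
# dec_to_hex and dec2hex are hypothetical names, not awk built-ins.
dec_to_hex() {
  awk '
    # Build the hex digits by repeated division by 16, lowest digit first.
    function dec2hex(n,    h) {
      if (n == 0) return "0"
      h = ""
      while (n > 0) {
        h = substr("0123456789abcdef", n % 16 + 1, 1) h
        n = int(n / 16)
      }
      return h
    }
    { print "0x" dec2hex($1) }
  '
}

echo 3735928559 | dec_to_hex
```

This avoids printf's integer conversion altogether, at the cost of not handling negative or fractional input.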

Related

Portable way to split an external variable containing newlines in awk?

Consider these awk commands:
#!/bin/bash
awk 'BEGIN { print split("X\nX",a,"\n") }'
awk -v s=$'X\nX' 'BEGIN { print split(s,a,"\n") }'
Results:
Linux:
2
2
macOS, FreeBSD:
2
/usr/bin/awk: newline in string X
X... at source line 1
Solaris:
2
/usr/xpg4/bin/awk: file "(null)": line 1: Newline in string
Context is:
>>> X
>>> <<<
Is there a way to work around that?
Edit:
There's not even the need to use an external variable, the following will also fail in all awk implementations but the GNU one:
awk 'BEGIN { s = "X\nX"; print split(s,a,"\n") }'
POSIX awk does not allow physical newlines in string values.
When you use C/Bash string notation like $'a\nb', any POSIX-compliant awk implementation will fail.
Even with GNU awk, enabling the --posix option produces the following error:
awk --posix -v s=$'X\nX' 'BEGIN { print split(s,a,"\n") }'
awk: fatal: POSIX does not allow physical newlines in string values
However, if you drop the $'...' notation, awk receives the two-character escape sequence \n, expands it itself, and the error goes away:
awk --posix -v s="X\nX" 'BEGIN { print split(s,a,"\n") }'
2
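Another way around the -v limitation is to skip -v entirely: values read through ENVIRON are not scanned for escape sequences, so an embedded newline is taken literally. A sketch, assuming a POSIX shell and reusing the variable name s from the question (count_parts is a hypothetical wrapper):

```shell
# count_parts is a made-up name for illustration.
count_parts() {
  # $1 may contain real newlines; it reaches awk through the environment.
  s=$1 awk 'BEGIN { print split(ENVIRON["s"], a, "\n") }'
}

count_parts 'X
X'
```

For the two-line value above this prints 2.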

How to control the format of float numbers in gawk?

The following two runs are different. How to make the first run the same as the second run (I still want print without any explicit arguments)? Is there a way to control the number of digits in $1 = 1/3?
$ gawk -v OFMT='%.20g' -e 'BEGIN { $1 = 1/3; print }'
0.333333
$ gawk -v OFMT='%.20g' -e 'BEGIN { print 1/3}'
0.33333333333333331483
EDIT: The following comparison is also unexpected. Ideally, if there is just one field, print $1 and print should be just the same. I think it could be considered as a bug?
$ gawk -v OFMT='%.20g' -e 'BEGIN { $1 = 1/3; print $1}'
0.33333333333333331483
$ gawk -v OFMT='%.20g' -e 'BEGIN { $1 = 1/3; print}'
0.333333
There is a subtlety here. There are two variables, OFMT and CONVFMT. The variable OFMT is used to control how numbers are converted to strings in the print statement while the variable CONVFMT is used to define how numbers are converted to strings in general (outside of the print statement):
Prior to the POSIX standard, awk used the value of OFMT for converting numbers to strings. OFMT specifies the output format to use when printing numbers with print. CONVFMT was introduced in order to separate the semantics of conversion from the semantics of printing. Both CONVFMT and OFMT have the same default value: "%.6g". In the vast majority of cases, old awk programs do not change their behaviour.
source: GNU awk manual
More detailed information about this reasoning can be found in the Rationale section of the POSIX awk standard.
numeric value in print statement:
$ awk 'BEGIN{print 1/3}'
0.333333
$ awk 'BEGIN{OFMT="%.20g"; print 1/3 }'
0.33333333333333331483
$ awk 'BEGIN{CONVFMT="%.20g"; print 1/3 }'
0.333333
variable with a numeric value in print statement:
$ awk 'BEGIN{a=1/3; print a}'
0.333333
$ awk 'BEGIN{OFMT="%.20g"; a=1/3; print a }'
0.33333333333333331483
$ awk 'BEGIN{CONVFMT="%.20g"; a=1/3; print a }'
0.333333
variable with a numeric value converted to string in print statement:
$ awk 'BEGIN{a=1/3; a=a""; print a}'
0.333333
$ awk 'BEGIN{OFMT="%.20g"; a=1/3; a=a""; print a }'
0.333333
$ awk 'BEGIN{CONVFMT="%.20g"; a=1/3; a=a""; print a }'
0.33333333333333331483
I am not sure if it's a bug, but try setting a variable rather than the first field:
gawk -v OFMT='%.20g' -e 'BEGIN { a = 1/3; print a}'
0.33333333333333331483

Awk print string with variables

How do I print a string with variables?
Trying this
awk -F ',' '{printf /p/${3}_abc/xyz/${5}_abc_def/}' file
Need this at output
/p/APPLE_abc/xyz/MANGO_abc_def/
where ${3} = APPLE
and ${5} = MANGO
printf allows interpolation of variables. With this as the test file:
$ cat file
a,b,APPLE,d,MANGO,f
We can use printf to achieve the output you want as follows:
$ awk -F, '{printf "/p/%s_abc/xyz/%s_abc_def/\n",$3,$5;}' file
/p/APPLE_abc/xyz/MANGO_abc_def/
In printf, the string %s means insert-a-variable-here-as-a-string. We have two occurrences of %s, one for $3 and one for $5.
Not as readable, but the printf isn't necessary here. Awk can insert the variables directly into the strings if you quote the string portion.
$ cat file.txt
1,2,APPLE,4,MANGO,6,7,8
$ awk -F, '{print "/p/" $3 "_abc/xyz/" $5 "_abc_def/"}' file.txt
/p/APPLE_abc/xyz/MANGO_abc_def/

Using variables in printf format

Suppose I have a file like this:
$ cat a
hello this is a sentence
and this is another one
And I want to print the first two columns with some padding in between them. As this padding may change, I can for example use 7:
$ awk '{printf "%7-s%s\n", $1, $2}' a
hello  this
and    this
Or 17:
$ awk '{printf "%17-s%s\n", $1, $2}' a
hello            this
and              this
Or 25, or... you see the point: the number may vary.
Then a question popped up: is it possible to use a variable for this N, instead of hardcoding the integer in the %N-s format?
I tried these things without success:
$ awk '{n=7; printf "%{n}-s%s\n", $1, $2}' a
%{n}-shello
%{n}-sand
$ awk '{n=7; printf "%n-s%s\n", $1, $2}' a
%n-shello
%n-sand
Ideally I would like to know if it is possible to do this. If it is not, what would be the best workaround?
If you use * in your format string, it gets a number from the arguments
awk '{printf "%*-s%s\n", 17, $1, $2}' file
hello            this
and              this
awk '{printf "%*-s%s\n", 7, $1, $2}' file
hello  this
and    this
As read in The GNU Awk User’s Guide #5.5.3 Modifiers for printf Formats:
The C library printf’s dynamic width and prec capability (for example,
"%*.*s") is supported. Instead of supplying explicit width and/or prec
values in the format string, they are passed in the argument list. For
example:
w = 5
p = 3
s = "abcdefg"
printf "%*.*s\n", w, p, s
is exactly equivalent to:
s = "abcdefg"
printf "%5.3s\n", s
Does this count?
The idea is to build the format string dynamically and pass it to printf.
kent$ awk '{n=7;fmt="%"n"-s%s\n"; printf fmt, $1, $2}' f
hello  this
and    this
Using simple string concatenation.
Here "%", n and "-s%s\n" are concatenated into a single string for the format. In the example below, the format string produced is %7-s%s\n.
awk -v n=7 '{ printf "%" n "-s%s\n", $1, $2}' file
awk '{ n = 7; printf "%" n "-s%s\n", $1, $2}' file
Output:
hello  this
and    this
You can use eval (maybe not the most beautiful with all the escape characters, but it works):
i=15
eval "awk '{printf \"%$i-s%s\\n\", \$1, \$2}' a"
output:
hello          this
and            this
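A side note on spelling: the conventional form puts the - flag before the width (%-7s). The %7-s form is accepted by the awk used in these answers, but that is not something to rely on. A sketch of the two main approaches in the conventional spelling, assuming an awk whose printf supports * (gawk documents it; other implementations may differ). pad_star and pad_concat are hypothetical names:

```shell
# pad_star and pad_concat are made-up names for illustration.
pad_star() {
  # Width taken from the argument list via *.
  awk -v n="$1" '{ printf "%-*s%s\n", n, $1, $2 }'
}

pad_concat() {
  # Width spliced into the format string by concatenation.
  awk -v n="$1" '{ printf "%-" n "s%s\n", $1, $2 }'
}

printf 'hello this is a sentence\n' | pad_star 7
printf 'and this is another one\n' | pad_concat 7
```

The concatenation variant depends only on ordinary string handling, so it is the safer choice when the awk implementation is unknown.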

How to make calculations on hexadecimal numbers with awk?

I have a file containing a list of hexadecimal numbers, as 0x12345678 one per line.
I want to make a calculation on them. For this, I thought of using awk. But while printing a hexadecimal number with awk is easy with the printf function, I haven't found a way to interpret the hexadecimal input other than as text (or 0, since conversion to integer stops at the x).
awk '{ print $1; }' // 0x12345678
awk '{ printf("%x\n", $1)}' // 0
awk '{ printf("%x\n", $1+1)}' // 1 // DarkDust answer
awk '{ printf("%s: %x\n", $1, $1)}' // 0x12345678: 0
Is it possible to print, e.g. the value +1?
awk '{ printf("%x\n", ??????)}' // 0x12345679
Edit: One liners on other languages welcomed! (if reasonable length ;-) )
In the original nawk and mawk implementations the hexadecimal (and octal) numbers are recognised. gawk (which I guess you are using) has the feature/bug of not doing this. It has a command line switch to get the behaviour you want: --non-decimal-data.
echo 0x12345678 | mawk '{ printf "%s: %x\n", $1, $1 }'
0x12345678: 12345678
echo 0x12345678 | gawk '{ printf "%s: %x\n", $1, $1 }'
0x12345678: 0
echo 0x12345678 | gawk --non-decimal-data '{ printf "%s: %x\n", $1, $1 }'
0x12345678: 12345678
gawk has the strtonum function:
% echo 0x12345678 | gawk '{ printf "%s: %x - %x\n", $1, $1, strtonum($1) }'
0x12345678: 0 - 12345678
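strtonum() is a gawk extension. Where only a plain awk is available, the hex string can be decoded by hand. A minimal sketch, assuming well-formed input with an optional 0x prefix and values small enough for the local printf %x; hex2dec and hex_plus_one are hypothetical names:

```shell
# hex_plus_one and hex2dec are made-up names, not awk built-ins.
hex_plus_one() {
  awk '
    # Decode a hex string digit by digit; returns -1 on a non-hex character.
    function hex2dec(s,    i, d, n) {
      s = tolower(s)
      sub(/^0x/, "", s)
      n = 0
      for (i = 1; i <= length(s); i++) {
        d = index("0123456789abcdef", substr(s, i, 1))
        if (d == 0) return -1
        n = n * 16 + d - 1
      }
      return n
    }
    { printf "0x%x\n", hex2dec($1) + 1 }
  '
}

echo 0x12345678 | hex_plus_one
```

For the sample value this prints 0x12345679, the "value + 1" asked for in the question.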
Maybe you don't need awk at all, as string/number conversion is hairy. Bash versions 3 and 4 are very powerful. It is often simpler, clearer and more portable to stay in Bash, and maybe use grep and cut etc.
For example, in Bash hexadecimal numbers are converted naturally:
$ printf "%d" 0xDeadBeef
3735928559
$ x='0xE'; printf "%d %d %d" $x "$x" $((x + 1))
14 14 15
Hope this helps.
here are the different combinations of their behaviors :
using this command
'BEGIN { print 0xFACECAFEFEED^2, -0xFEEDCAFEFACEBEEFDEADFACEFEED7,
-"0xCAFECAFECAFECAFECAFECAFECAFECAFECAFECAFECAFECAFECAFECAFE" }'
gawk -e (GNU Awk 5.1.0, API: 3.0 (GNU MPFR 4.1.0, GNU MP 6.2.1))
76046928626116243263483543552 -82729151009071240233065844435845120 0
gawk -P -e
00 0
-21377898657284658184582485743897013874545437686817998522919218577408
gawk -c -e
00 0 0
gawk -n -e
76046928626116243263483543552 -82729151009071240233065844435845120
-21377898657284658184582485743897013874545437686817998522919218577408
gawk -S -e
76046928626116243263483543552 -82729151009071240233065844435845120 0
gawk -M
76046928626116245157029816169
-82729151009071239007500567260950231 0
gawk -l mpfr
76046928626116243263483543552 -82729151009071240233065844435845120 0
nawk (macos awk version 20200816)
00 0 -2.13778986572846581845824857439e+67
mawk 1.3.4
00 0 -2.13779e+67
mawk2-beta (1.9.9.6)
00 0 0
In fact, if one has a custom awk-script library that works across multiple awk variants but also wants to take their idiosyncrasies into account, one approach would be to use the difference in outputs here to auto-detect the variant, with relatively few combinations left where one needs a tie-breaker.
*** this is only an extension of my comment following schot's response, strictly for proper formatting purposes.
echo FACEBEACEBEFACEEFFFACEEEFFFACEFACFACEB |
mawk '{ printf("%s\n%.f\n%x\n%.f\n",$0,$0,"0x"$0,"0x"$0) }'
FACEBEACEBEFACEEFFFACEEEFFFACEFACFACEB
0
ffffffff
5593196314036579851314282024549245003233230848
5593196314036579608368524797845507287542639851 #exact