Hex string to integer conversion in Amazon Redshift - sql

Amazon Redshift is based on ParAccel which is based on Postgres. From my research it seems that the preferred way to perform hexadecimal string to integer conversion in Postgres is via a bit field, as outlined in this answer.
In the case of bigint, this would be:
select ('x'||lpad('123456789abcdef',16,'0'))::bit(64)::bigint
Unfortunately, this fails on Redshift with:
ERROR: cannot cast type text to bit [SQL State=42846]
What other ways are there to perform this conversion in Postgres 8.1ish (that's close to the Redshift level of compatibility)? UDFs are not supported in Redshift and neither are array, regex functions or set generating functions...

It looks like they added a function for this at some point: STRTOL
Syntax
STRTOL(num_string, base)
Return type
BIGINT. If num_string is null, returns NULL.
For example
SELECT strtol('deadbeef', 16);
Returns: 3735928559

Assuming that you want a simple digit-by-digit ordinal position conversion (i.e. you're not worried about two's compliment negatives, etc) I think this should work on an 8.1-equivalent DB:
CREATE OR REPLACE FUNCTION hex2dec(text) RETURNS bigint AS $$
SELECT sum(CASE WHEN v >= ascii('a') THEN v - ascii('a') + 10 ELSE v - ascii('0') END * 16^ordpos)::bigint
FROM (
SELECT n-1, ascii(substring(reverse($1), n, 1))
FROM generate_series(1, length($1)) n
) AS x(ordpos, v);
$$ LANGUAGE sql IMMUTABLE;
The function form is optional, it just makes it easier to avoid repeating the argument a bunch of times. It should get inlined anyway. Efficiency will probably be awful, but most of the tools available to do this smarter don't seem to be available on versions that old, and this at least works:
regress=> CREATE TABLE t AS VALUES ('c13b'), ('a'), ('f');
regress=> SELECT hex2dec(column1) FROM t;
hex2dec
---------
49467
10
15
(3 rows)
If you can use regexp_split_to_array and generate_subscripts it might be faster. Or slower. I haven't tried. Another possible trick is to use a digit mapping array instead of the CASE, like:
'[48:102]={0,1,2,3,4,5,6,7,8,9,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,10,11,12,13,14,15}'::integer[]
which you can use with:
CREATE OR REPLACE FUNCTION hex2dec(text) RETURNS bigint AS $$
SELECT sum(
('[48:102]={0,1,2,3,4,5,6,7,8,9,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,10,11,12,13,14,15}'::integer[])[ v ]
* 16^ordpos
)::bigint
FROM (
SELECT n-1, ascii(substring(reverse($1), n, 1))
FROM generate_series(1, length($1)) n
) AS x(ordpos, v);
$$ LANGUAGE sql IMMUTABLE;
Personally, I'd do it client-side instead, rather than wrangling the limited capabilities of an old PostgreSQL fork, especially one you can't load your own sensible user-defined C functions on, or use PL/Perl, etc.
In real PostgreSQL I'd just use this:
hex2dec.c:
#include "postgres.h"
#include "fmgr.h"
#include "utils/builtins.h"
#include "errno.h"
#include "limits.h"
#include <stdlib.h>
PG_MODULE_MAGIC;
Datum from_hex(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(hex2dec);
Datum
hex2dec(PG_FUNCTION_ARGS)
{
char *endpos;
const char *hexstr = text_to_cstring(PG_GETARG_TEXT_PP(0));
long decval = strtol(hexstr, &endpos, 16);
if (endpos[0] != '\0')
{
ereport(ERROR, (ERRCODE_INVALID_PARAMETER_VALUE, errmsg("Could not decode input string %s as hex", hexstr)));
}
if (decval == LONG_MAX && errno == ERANGE)
{
ereport(ERROR, (ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE, errmsg("Input hex string %s overflows int64", hexstr)));
}
PG_RETURN_INT64(decval);
}
Makefile:
MODULES = hex2dec
DATA = hex2dec--1.0.sql
EXTENSION = hex2dec
PG_CONFIG = pg_config
PGXS := $(shell $(PG_CONFIG) --pgxs)
include $(PGXS)
hex2dec.control:
comment = 'Utility function to convert hex strings to decimal'
default_version = '1.0'
module_pathname = '$libdir/hex2dec'
relocatable = true
hex2dec--1.0.sql:
CREATE OR REPLACE FUNCTION hex2dec(hexstr text) RETURNS bigint
AS 'hex2dec','hex2dec'
LANGUAGE c IMMUTABLE STRICT;
COMMENT ON FUNCTION hex2dec(hexstr text)
IS 'Decode the hex string passed, which may optionally have a leading 0x, as a bigint. Does not attempt to consider negative hex values.';
Usage:
CREATE EXTENSION hex2dec;
postgres=# SELECT hex2dec('7fffffffffffffff');
hex2dec
---------------------
9223372036854775807
(1 row)
postgres=# SELECT hex2dec('deadbeef');
hex2dec
------------
3735928559
(1 row)
postgres=# SELECT hex2dec('12345');
hex2dec
---------
74565
(1 row)
postgres=# select hex2dec(to_hex(-1));
hex2dec
------------
4294967295
(1 row)
postgres=# SELECT hex2dec('8fffffffffffffff');
ERROR: Input hex string 8fffffffffffffff overflows int64
postgres=# SELECT hex2dec('0x7abcz123');
ERROR: Could not decode input string 0x7abcz123 as hex
The performance difference is ... noteworthy. Given sample data:
CREATE TABLE randhex AS
SELECT '0x'||to_hex( abs(random() * (10^((random()-.5)*10)) * 10000000)::bigint) AS h
FROM generate_series(1,1000000);
conversion from hex to decimal takes about 1.3 from a warm cache using the C extension, which isn't great for a million rows. Reading them without any transformation takes 0.95s. It took 36 seconds for the SQL based hex2dec approach to process the same rows. Frankly I'm really impressed that the SQL approach was as fast as that, and surprised the C ext was that slow.

A likely explanation is that the cast from text to bit(n) relies on undocumented behavior, I repeat the quote from Tom Lane:
This is relying on some undocumented behavior of the bit-type input
converter, but I see no reason to expect that would break. A possibly
bigger issue is that it requires PG >= 8.3 since there wasn't a text
to bit cast before that.
And Amazon derivate is obviously not allowing this undocumented feature. Not surprising, since it is based off of Postgres 8.1 where there was no cast at all.
Previously quoted in this closely related answer:
Convert hex in text representation to decimal number

Related

Rounding SUM (Float ) to one Decimal in PostGreSQL [duplicate]

I am using PostgreSQL via the Ruby gem 'sequel'.
I'm trying to round to two decimal places.
Here's my code:
SELECT ROUND(AVG(some_column),2)
FROM table
I get the following error:
PG::Error: ERROR: function round(double precision, integer) does
not exist (Sequel::DatabaseError)
I get no error when I run the following code:
SELECT ROUND(AVG(some_column))
FROM table
Does anyone know what I am doing wrong?
PostgreSQL does not define round(double precision, integer). For reasons #Mike Sherrill 'Cat Recall' explains in the comments, the version of round that takes a precision is only available for numeric.
regress=> SELECT round( float8 '3.1415927', 2 );
ERROR: function round(double precision, integer) does not exist
regress=> \df *round*
List of functions
Schema | Name | Result data type | Argument data types | Type
------------+--------+------------------+---------------------+--------
pg_catalog | dround | double precision | double precision | normal
pg_catalog | round | double precision | double precision | normal
pg_catalog | round | numeric | numeric | normal
pg_catalog | round | numeric | numeric, integer | normal
(4 rows)
regress=> SELECT round( CAST(float8 '3.1415927' as numeric), 2);
round
-------
3.14
(1 row)
(In the above, note that float8 is just a shorthand alias for double precision. You can see that PostgreSQL is expanding it in the output).
You must cast the value to be rounded to numeric to use the two-argument form of round. Just append ::numeric for the shorthand cast, like round(val::numeric,2).
If you're formatting for display to the user, don't use round. Use to_char (see: data type formatting functions in the manual), which lets you specify a format and gives you a text result that isn't affected by whatever weirdness your client language might do with numeric values. For example:
regress=> SELECT to_char(float8 '3.1415927', 'FM999999999.00');
to_char
---------------
3.14
(1 row)
to_char will round numbers for you as part of formatting. The FM prefix tells to_char that you don't want any padding with leading spaces.
        ((this is a Wiki! please edit to enhance!))
Try also the old syntax for casting,
SELECT ROUND( AVG(some_column)::numeric, 2 ) FROM table;
works with any version of PostgreSQL. ...But, as definitive solution, you can overload the ROUND function.
Overloading as casting strategy
CREATE FUNCTION ROUND(float,int) RETURNS NUMERIC AS $f$
SELECT ROUND( CAST($1 AS numeric), $2 )
$f$ language SQL IMMUTABLE;
Now your instruction will works fine, try this complete comparison:
SELECT trunc(n,3), round(n,3) n_round, round(f,3) f_round,
pg_typeof(n) n_type, pg_typeof(f) f_type, pg_typeof(round(f,3)) f_round_type
FROM (SELECT 2.0/3.0, 2/3::float) t(n,f);
trunc
n_round
f_round
n_type
f_type
f_round_type
0.666
0.667
0.667
numeric
double precision
numeric
The ROUND(float,int) function is f_round, it returns a (decimal) NUMERIC datatype, that is fine for some applications: problem solved!
In another applications we need a float also as result. An alternative is to use round(f,3)::float or to create a round_tofloat() function.
Other alternative, overloading ROUND function again, and using all range of accuracy-precision of a floating point number, is to return a float when the accuracy is defined (see IanKenney's answer),
CREATE FUNCTION ROUND(
input float, -- the input number
accuracy float -- accuracy, the "counting unit"
) RETURNS float AS $f$
SELECT ROUND($1/accuracy)*accuracy
$f$ language SQL IMMUTABLE;
Try
SELECT round(21.04, 0.05); -- 21.05 float!
SELECT round(21.04, 5::float); -- 20
SELECT round(1/3., 0.0001); -- 0.3333
SELECT round(2.8+1/3., 0.5); -- 3.15
SELECT round(pi(), 0.0001); -- 3.1416
PS: the command \df round, on psql after overloadings, will show something like this table
Schema | Name | Result | Argument
------------+-------+---------+------------------
myschema | round | numeric | float, int
myschema | round | float | float, float
pg_catalog | round | float | float
pg_catalog | round | numeric | numeric
pg_catalog | round | numeric | numeric, int
where float is synonymous of double precision and myschema is public when you not use a schema. The pg_catalog functions are the default ones, see at Guide the build-in math functions.
Rounding and formating
The to_char function apply internally the round procedure, so, when your aim is only to show a final result in the terminal, you can use the FM modifier as a prefix to a numeric format pattern:
SELECT round(x::numeric,2), trunc(x::numeric,2), to_char(x, 'FM99.99')
FROM (SELECT 2.0/3) t(x);
round
trunc
to_char
0.67
0.66
.67
NOTES
Cause of the problem
There are a lack of overloads in some PostgreSQL functions, why (???): I think "it is a lack" (!), but #CraigRinger, #Catcall and the PostgreSQL team agree about "pg's historic rationale".
Note about performance and reuse
The build-in functions, such as ROUND of the pg_catalog, can be overloaded with no performance loss, when compared to direct cast encoding. Two precautions must be taken when implementing user-defined cast functions for high performance:
The IMMUTABLE clause is very important for code snippets like this, because, as said in the Guide: "allows the optimizer to pre-evaluate the function when a query calls it with constant arguments"
PLpgSQL is the preferred language, except for "pure SQL". For JIT optimizations (and sometimes for parallelism) language SQL can obtain better optimizations. Is something like copy/paste small piece of code instead of use a function call.
Conclusion: the above ROUND(float,int) function, after optimizations, is so fast than #CraigRinger's answer; it will compile to (exactly) the same internal representation. So, although it is not standard for PostgreSQL, it can be standard for your projects, by a centralized and reusable "library of snippets", like pg_pubLib.
Round to the nth bit or other numeric representation
Some people argue that it doesn't make sense for PostgreSQL to round a number of float datatype, because float is a binary representation, it requires rounding the number of bits or its hexadecimal representation.
Well, let's solve the problem, adding an exotic suggestion... The aim here is to return a float type in another overloaded function,   ROUND(float, text, int) RETURNS float The text is to offer a choice between
'dec' for "decimal representation",
'bin' for "binary" representation and
'hex' for hexadecimal representation.
So, in different representations we have a different interpretation about the number of digits to be rounded. Rounding a number x with an approximate shorter value, with less "fractionary digits" (tham its original d digits), will be shorter when d is couting binary digits instead decimal or hexadecimal.
It is not easy without C++, using "pure SQL", but this code snippets will illustrate and can be used as workaround:
-- Looking for a round_bin() function! this is only a workaround:
CREATE FUNCTION trunc_bin(x bigint, t int) RETURNS bigint AS $f$
SELECT ((x::bit(64) >> t) << t)::bigint;
$f$ language SQL IMMUTABLE;
CREATE FUNCTION ROUND(
x float,
xtype text, -- 'bin', 'dec' or 'hex'
xdigits int DEFAULT 0
)
RETURNS FLOAT AS $f$
SELECT CASE
WHEN xtype NOT IN ('dec','bin','hex') THEN 'NaN'::float
WHEN xdigits=0 THEN ROUND(x)
WHEN xtype='dec' THEN ROUND(x::numeric,xdigits)
ELSE (s1 ||'.'|| s2)::float
END
FROM (
SELECT s1,
lpad(
trunc_bin( s2::bigint, CASE WHEN xd<bin_bits THEN bin_bits - xd ELSE 0 END )::text,
l2,
'0'
) AS s2
FROM (
SELECT *,
(floor( log(2,s2::numeric) ) +1)::int AS bin_bits, -- most significant bit position
CASE WHEN xtype='hex' THEN xdigits*4 ELSE xdigits END AS xd
FROM (
SELECT s[1] AS s1, s[2] AS s2, length(s[2]) AS l2
FROM (SELECT regexp_split_to_array(x::text,'\.')) t1a(s)
) t1b
) t1c
) t2
$f$ language SQL IMMUTABLE;
Try
SELECT round(1/3.,'dec',4); -- 0.3333 float!
SELECT round(2.8+1/3.,'dec',1); -- 3.1 float!
SELECT round(2.8+1/3.,'dec'); -- ERROR, need to cast string
SELECT round(2.8+1/3.,'dec'::text); -- 3 float
SELECT round(2.8+1/3.,'dec',0); -- 3 float
SELECT round(2.8+1/3.,'hex',0); -- 3 float (no change)
SELECT round(2.8+1/3.,'hex',1); -- 3.1266
SELECT round(2.8+1/3.,'hex',3); -- 3.13331578486784
SELECT round(2.8+1/3.,'bin',1); -- 3.1125899906842625
SELECT round(2.8+1/3.,'bin',6); -- 3.1301821767286784
SELECT round(2.8+1/3.,'bin',12); -- 3.13331578486784
And \df round have also:
Schema | Name | Result | Argument
------------+-------+---------+---------------
myschema | round | float | x float, xtype text, xdigits int DEFAULT 0
Try with this:
SELECT to_char (2/3::float, 'FM999999990.00');
-- RESULT: 0.67
Or simply:
SELECT round (2/3::DECIMAL, 2)::TEXT
-- RESULT: 0.67
you can use the function below
SELECT TRUNC(14.568,2);
the result will show :
14.56
you can also cast your variable to the desire type :
SELECT TRUNC(YOUR_VAR::numeric,2)
SELECT ROUND(SUM(amount)::numeric, 2) AS total_amount
FROM transactions
Gives: 200234.08
Try casting your column to a numeric like:
SELECT ROUND(cast(some_column as numeric),2) FROM table
According to Bryan's response you can do this to limit decimals in a query. I convert from km/h to m/s and display it in dygraphs but when I did it in dygraphs it looked weird. Looks fine when doing the calculation in the query instead. This is on postgresql 9.5.1.
select date,(wind_speed/3.6)::numeric(7,1) from readings;
Error:function round(double precision, integer) does not exist
Solution: You need to addtype cast then it will work
Ex: round(extract(second from job_end_time_t)::integer,0)

How to process bitand operation in Informix with column in hex string format

In table I have string column which contains a hex value. For example value '000000000000000a' means 10. Now I need to process bitand operation: bitand(tableName.hexColumn, ?). When I read the Informix specification of this function it needs 2 int. So my question is: what is the simpler way to process this operation?
PS: Probably there is no solution in Informix so I will have to create my own bitandhexstring function where input will be 2 string and hex form but I have no idea where to start.
There are a variety of issues to be dealt with:
Your hex string has 16 digits, so the values are presumably (in general) 64-bit quantities. That means you need to be sure that the BITAND function has a variant that handles BIGINT (or perhaps INT8 — I'm not going to mention INT8 again, but it is nominally an option when BIGINT is mentioned) data.
You need to convert your hex string to a BIGINT.
It is not clear whether you'll need to convert the result BIGINT back to a hex string.
Some testing with Informix 11.70.FC6 on Mac OS X 10.10.4 shows that BITAND is safe with 64-bit numbers. That's good news!
The HEX function, when passed a BIGINT, returns a CHAR(20) string that starts with 0x and contains a hex representation of the number, so that more or less addresses point 3. The residual issue is 'how to convert 16-byte strings of hex digits to a BIGINT value'. Nominally, a cast operation like:
CAST('0xde3962e8c68a8001' AS BIGINT)
should do the job (but see below). There may be a better way of doing it than a brute-force and ignorance stored procedure, but I'm not immediately sure what it is.
Caveat Lector.
While testing this, I tried two queries:
SELECT bi, HEX(bi) FROM Test_BigInt;
SELECT bi, HEX(bi), SUBSTR(HEX(bi), 3, 16) FROM Test_BigInt;
on a table Test_BigInt with a single column bi of type BIGINT (not null, as it happened, but that's not material).
The first query worked fine. The type of the HEX(bi) expression was CHAR(20) and the values were like
0 0x0000000000000000
6898532535585831936 0x5fbc82ca87117c00
-2300268458811555839 0xe013ce0628808001
The second query sort of worked for small values of bi (0, 1, 2), but generated an error -1215: Value exceeds limit of INTEGER precision when the values got large. The problem is not the SUBSTR function directly. This was testing with Informix 11.70.FC6 on Mac OS X 10.10.4 — tested on 2015-07-08. The following pair of queries worked as expected (which is my justification for claiming that the problem is not in the SUBSTR function per se).
SELECT bi, HEX(bi) AS hex_bi FROM Test_BigInt INTO TEMP t;
SELECT bi, hex_bi, SUBSTR(hex_bi, 3, 16) FROM t;
It seems to be an interaction problem when the result of HEX is used in a string operation context. I first got the problem when trying to concatenate an empty string to the result of HEX: HEX(bi) || ''. That turns out to be unnecessary given that the result of HEX is reported as CHAR(20), but also indicates SUBSTR is not directly at fault.
I also tried CAST to get the hex string converted to BIGINT:
SELECT CAST('0xde3962e8c68a8001' AS BIGINT) FROM dual;
BIGINT
-964001791
SELECT HEX(CAST('0xde3962e8c68a8001' AS BIGINT)) FROM dual;
CHAR(18)
0xffffffffc68a8001
Grrr! Something is mishandling the conversion. This is not new software (well over 2 years old), but the chances are that unless someone else has spotted the bug, it has not yet been fixed, even in the latest version.
I've reported this through back-channels to IBM/Informix.
Stored procedures to convert hex string to BIGINT
CREATE PROCEDURE hexval(c CHAR(1)) RETURNING INTEGER;
RETURN INSTR("0123456789abcdef", lower(c)) - 1;
END PROCEDURE;
CREATE PROCEDURE hexstr_to_bigint(ival VARCHAR(18)) RETURNING bigint;
DEFINE oval DECIMAL(20,0);
DEFINE i,j,len INTEGER;
LET ival = LOWER(ival);
IF (ival[1,2] = '0x') THEN LET ival = ival[3,18]; END IF;
LET len = LENGTH(ival);
LET oval = 0;
FOR i = 1 TO len
LET j = hexval(SUBSTR(ival, i, 1));
LET oval = oval * 16 + j;
END FOR;
IF (oval > 9223372036854775807) THEN
LET oval = oval - 18446744073709551616;
END IF;
RETURN oval;
END PROCEDURE;
Casual testing:
execute procedure hexstr_to_bigint('000A');
10
execute procedure hexstr_to_bigint('FFff');
65535
execute procedure hexstr_to_bigint('FFFFffffFFFFffff');
-1
execute procedure hexstr_to_bigint('0XFFFFffffFFFFffff');
-1
execute procedure hexstr_to_bigint('000000000000000A');
10
Those values are correct.

What does %% in PL/pgSQL mean?

I was reading over Instagrams sharding solution and I noticed the following line:
SELECT nextval('insta5.table_id_seq') %% 1024 INTO seq_id;
What does the %% in the SELECT line above do?
I looked up PostgreSQL and the only thing I found was that %% is utilized when you want to use a literal percent character.
CREATE OR REPLACE FUNCTION insta5.next_id(OUT result bigint) AS $$
DECLARE
our_epoch bigint := 1314220021721;
seq_id bigint;
now_millis bigint;
shard_id int := 5;
BEGIN
SELECT nextval('insta5.table_id_seq') %% 1024 INTO seq_id;
SELECT FLOOR(EXTRACT(EPOCH FROM clock_timestamp()) * 1000) INTO now_millis;
result := (now_millis - our_epoch) << 23;
result := result | (shard_id << 10);
result := result | (seq_id);
END;
$$ LANGUAGE PLPGSQL;
The only place I can think of, where a % would be doubled up in standard Postgres is inside the format() function, commonly used for producing a query string for dynamic SQL. Compare examples here on SO.
The manual:
In addition to the format specifiers described above, the special
sequence %% may be used to output a literal % character.
Tricky when using the modulo operator % in a dynamic statement!
I suspect they are running dynamic SQL behind the curtains - which they generalized and simplified for the article. (The schema-qualified name of the sequence is 'insta5.table_id_seq' and the table wouldn't be named "table".) In the process they forgot to "unescape" the modulo operator.
That's what they may actually be running:
EXECUTE format($$SELECT nextval('%I') %% 1024$$, seq_name)
INTO seq_id;
With default installation (on 9.2):
ERROR: operator does not exist: bigint %% integer
SQL state: 42883
So i would say it could be
a custom operator
or a typo, and they want to write the modulo operator: %
Looks like an escaped modulo operator to me.

How to round an average to 2 decimal places in PostgreSQL?

I am using PostgreSQL via the Ruby gem 'sequel'.
I'm trying to round to two decimal places.
Here's my code:
SELECT ROUND(AVG(some_column),2)
FROM table
I get the following error:
PG::Error: ERROR: function round(double precision, integer) does
not exist (Sequel::DatabaseError)
I get no error when I run the following code:
SELECT ROUND(AVG(some_column))
FROM table
Does anyone know what I am doing wrong?
PostgreSQL does not define round(double precision, integer). For reasons #Mike Sherrill 'Cat Recall' explains in the comments, the version of round that takes a precision is only available for numeric.
regress=> SELECT round( float8 '3.1415927', 2 );
ERROR: function round(double precision, integer) does not exist
regress=> \df *round*
List of functions
Schema | Name | Result data type | Argument data types | Type
------------+--------+------------------+---------------------+--------
pg_catalog | dround | double precision | double precision | normal
pg_catalog | round | double precision | double precision | normal
pg_catalog | round | numeric | numeric | normal
pg_catalog | round | numeric | numeric, integer | normal
(4 rows)
regress=> SELECT round( CAST(float8 '3.1415927' as numeric), 2);
round
-------
3.14
(1 row)
(In the above, note that float8 is just a shorthand alias for double precision. You can see that PostgreSQL is expanding it in the output).
You must cast the value to be rounded to numeric to use the two-argument form of round. Just append ::numeric for the shorthand cast, like round(val::numeric,2).
If you're formatting for display to the user, don't use round. Use to_char (see: data type formatting functions in the manual), which lets you specify a format and gives you a text result that isn't affected by whatever weirdness your client language might do with numeric values. For example:
regress=> SELECT to_char(float8 '3.1415927', 'FM999999999.00');
to_char
---------------
3.14
(1 row)
to_char will round numbers for you as part of formatting. The FM prefix tells to_char that you don't want any padding with leading spaces.
        ((this is a Wiki! please edit to enhance!))
Try also the old syntax for casting,
SELECT ROUND( AVG(some_column)::numeric, 2 ) FROM table;
works with any version of PostgreSQL. ...But, as definitive solution, you can overload the ROUND function.
Overloading as casting strategy
CREATE FUNCTION ROUND(float,int) RETURNS NUMERIC AS $f$
SELECT ROUND( CAST($1 AS numeric), $2 )
$f$ language SQL IMMUTABLE;
Now your instruction will works fine, try this complete comparison:
SELECT trunc(n,3), round(n,3) n_round, round(f,3) f_round,
pg_typeof(n) n_type, pg_typeof(f) f_type, pg_typeof(round(f,3)) f_round_type
FROM (SELECT 2.0/3.0, 2/3::float) t(n,f);
trunc
n_round
f_round
n_type
f_type
f_round_type
0.666
0.667
0.667
numeric
double precision
numeric
The ROUND(float,int) function is f_round, it returns a (decimal) NUMERIC datatype, that is fine for some applications: problem solved!
In another applications we need a float also as result. An alternative is to use round(f,3)::float or to create a round_tofloat() function.
Other alternative, overloading ROUND function again, and using all range of accuracy-precision of a floating point number, is to return a float when the accuracy is defined (see IanKenney's answer),
CREATE FUNCTION ROUND(
input float, -- the input number
accuracy float -- accuracy, the "counting unit"
) RETURNS float AS $f$
SELECT ROUND($1/accuracy)*accuracy
$f$ language SQL IMMUTABLE;
Try
SELECT round(21.04, 0.05); -- 21.05 float!
SELECT round(21.04, 5::float); -- 20
SELECT round(1/3., 0.0001); -- 0.3333
SELECT round(2.8+1/3., 0.5); -- 3.15
SELECT round(pi(), 0.0001); -- 3.1416
PS: the command \df round, on psql after overloadings, will show something like this table
Schema | Name | Result | Argument
------------+-------+---------+------------------
myschema | round | numeric | float, int
myschema | round | float | float, float
pg_catalog | round | float | float
pg_catalog | round | numeric | numeric
pg_catalog | round | numeric | numeric, int
where float is synonymous of double precision and myschema is public when you not use a schema. The pg_catalog functions are the default ones, see at Guide the build-in math functions.
Rounding and formating
The to_char function apply internally the round procedure, so, when your aim is only to show a final result in the terminal, you can use the FM modifier as a prefix to a numeric format pattern:
SELECT round(x::numeric,2), trunc(x::numeric,2), to_char(x, 'FM99.99')
FROM (SELECT 2.0/3) t(x);
round
trunc
to_char
0.67
0.66
.67
NOTES
Cause of the problem
There are a lack of overloads in some PostgreSQL functions, why (???): I think "it is a lack" (!), but #CraigRinger, #Catcall and the PostgreSQL team agree about "pg's historic rationale".
Note about performance and reuse
The build-in functions, such as ROUND of the pg_catalog, can be overloaded with no performance loss, when compared to direct cast encoding. Two precautions must be taken when implementing user-defined cast functions for high performance:
The IMMUTABLE clause is very important for code snippets like this, because, as said in the Guide: "allows the optimizer to pre-evaluate the function when a query calls it with constant arguments"
PLpgSQL is the preferred language, except for "pure SQL". For JIT optimizations (and sometimes for parallelism) language SQL can obtain better optimizations. Is something like copy/paste small piece of code instead of use a function call.
Conclusion: the above ROUND(float,int) function, after optimizations, is so fast than #CraigRinger's answer; it will compile to (exactly) the same internal representation. So, although it is not standard for PostgreSQL, it can be standard for your projects, by a centralized and reusable "library of snippets", like pg_pubLib.
Round to the nth bit or other numeric representation
Some people argue that it doesn't make sense for PostgreSQL to round a number of float datatype, because float is a binary representation, it requires rounding the number of bits or its hexadecimal representation.
Well, let's solve the problem, adding an exotic suggestion... The aim here is to return a float type in another overloaded function,   ROUND(float, text, int) RETURNS float The text is to offer a choice between
'dec' for "decimal representation",
'bin' for "binary" representation and
'hex' for hexadecimal representation.
So, in different representations we have a different interpretation about the number of digits to be rounded. Rounding a number x with an approximate shorter value, with less "fractionary digits" (tham its original d digits), will be shorter when d is couting binary digits instead decimal or hexadecimal.
It is not easy without C++, using "pure SQL", but this code snippets will illustrate and can be used as workaround:
-- Looking for a round_bin() function! this is only a workaround:
CREATE FUNCTION trunc_bin(x bigint, t int) RETURNS bigint AS $f$
SELECT ((x::bit(64) >> t) << t)::bigint;
$f$ language SQL IMMUTABLE;
CREATE FUNCTION ROUND(
x float,
xtype text, -- 'bin', 'dec' or 'hex'
xdigits int DEFAULT 0
)
RETURNS FLOAT AS $f$
SELECT CASE
WHEN xtype NOT IN ('dec','bin','hex') THEN 'NaN'::float
WHEN xdigits=0 THEN ROUND(x)
WHEN xtype='dec' THEN ROUND(x::numeric,xdigits)
ELSE (s1 ||'.'|| s2)::float
END
FROM (
SELECT s1,
lpad(
trunc_bin( s2::bigint, CASE WHEN xd<bin_bits THEN bin_bits - xd ELSE 0 END )::text,
l2,
'0'
) AS s2
FROM (
SELECT *,
(floor( log(2,s2::numeric) ) +1)::int AS bin_bits, -- most significant bit position
CASE WHEN xtype='hex' THEN xdigits*4 ELSE xdigits END AS xd
FROM (
SELECT s[1] AS s1, s[2] AS s2, length(s[2]) AS l2
FROM (SELECT regexp_split_to_array(x::text,'\.')) t1a(s)
) t1b
) t1c
) t2
$f$ language SQL IMMUTABLE;
Try
SELECT round(1/3.,'dec',4); -- 0.3333 float!
SELECT round(2.8+1/3.,'dec',1); -- 3.1 float!
SELECT round(2.8+1/3.,'dec'); -- ERROR, need to cast string
SELECT round(2.8+1/3.,'dec'::text); -- 3 float
SELECT round(2.8+1/3.,'dec',0); -- 3 float
SELECT round(2.8+1/3.,'hex',0); -- 3 float (no change)
SELECT round(2.8+1/3.,'hex',1); -- 3.1266
SELECT round(2.8+1/3.,'hex',3); -- 3.13331578486784
SELECT round(2.8+1/3.,'bin',1); -- 3.1125899906842625
SELECT round(2.8+1/3.,'bin',6); -- 3.1301821767286784
SELECT round(2.8+1/3.,'bin',12); -- 3.13331578486784
And \df round have also:
Schema | Name | Result | Argument
------------+-------+---------+---------------
myschema | round | float | x float, xtype text, xdigits int DEFAULT 0
Try with this:
SELECT to_char (2/3::float, 'FM999999990.00');
-- RESULT: 0.67
Or simply:
SELECT round (2/3::DECIMAL, 2)::TEXT
-- RESULT: 0.67
you can use the function below
SELECT TRUNC(14.568,2);
the result will show :
14.56
you can also cast your variable to the desire type :
SELECT TRUNC(YOUR_VAR::numeric,2)
SELECT ROUND(SUM(amount)::numeric, 2) AS total_amount
FROM transactions
Gives: 200234.08
Try casting your column to a numeric like:
SELECT ROUND(cast(some_column as numeric),2) FROM table
According to Bryan's response you can do this to limit decimals in a query. I convert from km/h to m/s and display it in dygraphs but when I did it in dygraphs it looked weird. Looks fine when doing the calculation in the query instead. This is on postgresql 9.5.1.
select date,(wind_speed/3.6)::numeric(7,1) from readings;
Error:function round(double precision, integer) does not exist
Solution: You need to addtype cast then it will work
Ex: round(extract(second from job_end_time_t)::integer,0)

postgresql function confusion

if I write a query as such:
with WordBreakDown (idx, word, wordlength) as (
select
row_number() over () as idx,
word,
character_length(word) as wordlength
from
unnest(string_to_array('yo momma so fat', ' ')) as word
)
select
cast(wbd.idx + (
select SUM(wbd2.wordlength)
from WordBreakDown wbd2
where wbd2.idx <= wbd.idx
) - wbd.wordlength as integer) as position,
cast(wbd.word as character varying(512)) as part
from
WordBreakDown wbd;
... I get a table of 4 rows like so:
1;"yo"
4;"momma"
10;"so"
13;"fat"
... this is what I want. HOWEVER, if I wrap this into a function like so:
drop type if exists split_result cascade;
create type split_result as(
position integer,
part character varying(512)
);
drop function if exists split(character varying(512), character(1));
create function split(
_s character varying(512),
_sep character(1)
) returns setof split_result as $$
begin
return query
with WordBreakDown (idx, word, wordlength) as (
select
row_number() over () as idx,
word,
character_length(word) as wordlength
from
unnest(string_to_array(_s, _sep)) as word
)
select
cast(wbd.idx + (
select SUM(wbd2.wordlength)
from WordBreakDown wbd2
where wbd2.idx <= wbd.idx
) - wbd.wordlength as integer) as position,
cast(wbd.word as character varying(512)) as part
from
WordBreakDown wbd;
end;
$$ language plpgsql;
select * from split('yo momma so fat', ' ');
... I get:
1;"yo momma so fat"
I'm scratching my head on this. What am I screwing up?
UPDATE
Per the suggestions below, I have replaced the function as such:
CREATE OR REPLACE FUNCTION split(_string character varying(512), _sep character(1))
RETURNS TABLE (postition int, part character varying(512)) AS
$BODY$
BEGIN
RETURN QUERY
WITH wbd AS (
SELECT (row_number() OVER ())::int AS idx
,word
,length(word) AS wordlength
FROM unnest(string_to_array(_string, rpad(_sep, 1))) AS word
)
SELECT (sum(wordlength) OVER (ORDER BY idx))::int + idx - wordlength
,word::character varying(512) -- AS part
FROM wbd;
END;
$BODY$ LANGUAGE plpgsql;
... which keeps my original function signature for maximum compatibility, and the lion's share of the performance gains. Thanks to the answerers, I found this to be a multifaceted learning experience. Your explanations really helped me understand what was going on.
Observe this:
select length(' '::character(1));
length
--------
0
(1 row)
A cause of this confusion is a bizarre definition of character type in SQL standard. From Postgres documentation for character types:
Values of type character are physically padded with spaces to the specified width n, and are stored and displayed that way. However, the padding spaces are treated as semantically insignificant. Trailing spaces are disregarded when comparing two values of type character, and they will be removed when converting a character value to one of the other string types.
So you should use string_to_array(_s, rpad(_sep,1)).
You had several constructs that probably did not do what you think they would.
Here is a largely simplified version of your function, that is also quite a bit faster:
CREATE OR REPLACE FUNCTION split(_string text, _sep text)
RETURNS TABLE (postition int, part text) AS
$BODY$
BEGIN
RETURN QUERY
WITH wbd AS (
SELECT (row_number() OVER ())::int AS idx
,word
,length(word) AS wordlength
FROM unnest(string_to_array(_string, _sep)) AS word
)
SELECT (sum(wordlength) OVER (ORDER BY idx))::int + idx - wordlength
,word -- AS part
FROM wbd;
END;
$BODY$ LANGUAGE plpgsql;
Explanation
Use another window function to sum up the word lengths. Faster, simpler and cleaner. This makes for most of the performance gain. A lot of sub-queries slow you down.
Use the data type text instead of character varying or even character(). character varying and character are awful types, mostly just there for compatibility with the SQL standard and historical reasons. There is hardly anything you can do with those that could not better be done with text. In the meantime #Tometzky has explained why character(1) was a particularly bad choice for the parameter type. I fixed that by using text instead.
As #Tometzky demonstrated, unnest(string_to_array(..)) is faster than regexp_split_to_table(..) - even if just a tiny bit for small strings like we use here (max. 512 characters). So I switched back to your original expression.
length() does the same as character_length().
In a query with only one table source (and no other possible naming conflicts) you might as well not table-qualify column names. Simplifies the code.
We need an integer value in the end, so I cast all numerical values (bigint in this case) to integer right away, so additions and subtractions are done with integer arithmetic which is generally fastest.
'value'::int is just shorter syntax for cast('value' as integer) and otherwise equivalent.
I found the answer, but I don't understand it.
The string_to_array(_s, _sep) function does not split with a non-varying character; even if I wrote it like so it would not work:
string_to_array(_s, cast(_sep as character_varying(1)))
BUT if I redefined the parameters as such:
drop function if exists split(character varying(512), character(1));
create function split(
_s character varying(512),
_sep character varying(1)
... all of a sudden it works as I expected. Dunno what to make of this, and really not the answer I wanted... now I have changed the signature of the function, which is not what I wanted to do.