Column definitions in ROWS FROM() with multiple unnest calls - sql

I would like to use multiple arrays within a select clause. The obvious approach didn't work, and PostgreSQL points to ROWS FROM() ...
select * from unnest(array[1,2], array[3,4]) as (a int, b int);
ERROR:  UNNEST() with multiple arguments cannot have a column definition list
LINE 1: select * from unnest(array[1,2], array[3,4]) as (a int, b in...
^
HINT: Use separate UNNEST() calls inside ROWS FROM(), and attach a column definition list to each one.
...
select * from rows from (unnest(array[1,2]), unnest(array[3,4])) as (a int, b int);
ERROR:  ROWS FROM() with multiple functions cannot have a column definition list
LINE 1: ...from (unnest(array[1,2]), unnest(array[3,4])) as (a int, b i...
^
HINT: Put a separate column definition list for each function inside ROWS FROM().
The manual explains this as well, but how does one define these 'separate column definition lists'?

You can define the column names without their types using just AS t(a, b):
=# SELECT * FROM unnest(array[1,2], array[3,4,5]) AS t(a, b);
a | b
---+---
1 | 3
2 | 4
∅ | 5
To define types, do it on the arrays themselves:
=# SELECT a / 2 AS half_a, b / 2 AS half_b
FROM unnest(array[1,2]::float[], array[3,4,5]::integer[]) AS t(a, b);
half_a | half_b
--------+--------
0.5 | 1
1 | 2
∅ | 2
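For completeness, the ROWS FROM() construct mentioned in the hint also accepts plain column aliases without types; the per-function column definition lists are only required for functions that return record. A sketch mirroring the arrays above:

```sql
-- Separate unnest() calls inside ROWS FROM(); the shorter array is NULL-padded.
SELECT *
FROM ROWS FROM (unnest(array[1,2]), unnest(array[3,4,5])) AS t(a, b);
```

This returns the same three rows as the multi-argument unnest() form.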

Related

matching array in Postgres with string manipulation

I was working with the "<@" operator and two arrays of strings.
anyarray <@ anyarray → boolean
Every string is formed in this way: ${name}_${number}, and I would like to check if the name part is included and the number is equal or lower than the one in the other array.
['elementOne_10'] & [['elementOne_7' , 'elementTwo20']] → true
['elementOne_10'] & [['elementOne_17', 'elementTwo20']] → false
what would be an efficient way to do this?
Assuming your sample data elementTwo20 in fact follows your described schema and should be elementTwo_20:
step-by-step demo:db<>fiddle
SELECT
    id
FROM (
    SELECT
        *,
        split_part(u, '_', 1) AS name,                               -- 3
        split_part(u, '_', 2)::int AS num,
        split_part(compare, '_', 1) AS comp_name,
        split_part(compare, '_', 2)::int AS comp_num
    FROM
        t,
        unnest(data) u,                                              -- 1
        (SELECT unnest('{elementOne_10}'::text[]) AS compare) s      -- 2
) s
GROUP BY id                                                          -- 4
HAVING
    ARRAY_AGG(name) @> ARRAY_AGG(comp_name)                          -- 5
    AND MAX(comp_num) BETWEEN MIN(num) AND MAX(num)
1. unnest() your array elements into one element per record
2. Join and unnest() your comparison data
3. Split the element strings into their name and num parts
4. unnest() creates several records per original array; they can be grouped back together by an identifier (best is an id column)
5. Filter with your criteria in the HAVING clause: compare the name parts, for example with array operators; for the BETWEEN comparison you can use MIN and MAX on the num part
Note:
As @a_horse_with_no_name correctly mentioned: if possible, think about your database design and normalize it:
Don't store arrays -> you don't need to unnest them on every operation
Distinct pieces of data should be stored separately, not concatenated into a string -> you don't need to split them on every operation
id | name       | num
---+------------+----
 1 | elementOne |   7
 1 | elementTwo |  20
 2 | elementOne |  17
 2 | elementTwo |  20
This is exactly the result of the inner subquery: you have to recreate it every time you need this data. It's better to store the data in this form directly.
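A sketch of what the normalized design could look like (the table and column names here are illustrative, not from the question):

```sql
-- Hypothetical normalized table: one row per element, name and number split.
CREATE TABLE elements (
    id   int,
    name text,
    num  int
);

-- The comparison then needs neither unnest() nor split_part():
SELECT DISTINCT id
FROM elements
WHERE name = 'elementOne' AND num <= 10;
```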

SQL Update table with all corresponding data from another table concatenated?

Running PostgreSQL 12.4. I am trying to accomplish this, but the syntax given there doesn't seem to work in psql, and I could not find another approach.
I have the following data:
Table 1
ID Trait
1 X
1 Y
1 Z
2 A
2 B
Table 2
ID Traits, Listed
1
2
3
4
I would like to create the following result:
Table 2
ID Traits, Listed
1 X + Y + Z
2 A + B
Concatenating with + would be ideal, as some traits have inherent commas.
Thank you for any help!
Try something like:
UPDATE table2
SET traits = agg.t
FROM (
    SELECT id, string_agg(trait, ',') AS t
    FROM table1
    GROUP BY id
) agg
WHERE table2.id = agg.id;
dbfiddle
Concatenating with + would be ideal, as some traits have inherent commas.
You can use whatever delimiter you like (just change the second argument to string_agg).
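So for the ' + ' separator the asker wanted, the statement could look like this (a sketch; the ORDER BY inside string_agg additionally makes the concatenation order deterministic):

```sql
UPDATE table2
SET traits = agg.t
FROM (
    SELECT id, string_agg(trait, ' + ' ORDER BY trait) AS t
    FROM table1
    GROUP BY id
) agg
WHERE table2.id = agg.id;
```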

EAV on postgres using jsonb

I'm trying to implement the EAV pattern using Attribute->Value tables, but unlike the standard way, the values are stored in a jsonb field like {"attrId":[values]}. This makes search requests easy, like:
SELECT * FROM products p WHERE p.attributes @> '{"1":[2]}' AND p.attributes @> '{"1":[4]}'
Now I'm wondering whether this is a good approach, and what an efficient way would be to calculate the count of available variations, for example:
-p1- {"width":[1]}
-p2- {"width":[2],"height":[3]}
-p3- {"width":[1]}
The output would be:
width: 1 (count 2); 2 (count 1)
height: 3 (count 1)
and when width 2 is selected:
width: 1 (count 0); 2 (count 1)
height: 3 (count 1)
"Flat is better than nested" -- the zen of python
I think you would be better served by simple key/value pairs and, in the rare event you have a complex value, making it a list. But I don't see that use case here.
Here is an example which answers your question. It could be modified to use your structure, but let's keep it simple:
First create a table and insert some JSON:
# create table foo (a jsonb);
# insert into foo values ('{"a":"1", "b":"2"}');
# insert into foo values ('{"c":"3", "d":"4"}');
# insert into foo values ('{"e":"5", "a":"6"}');
Here are the records:
# select * from foo;
a
----------------------
{"a": "1", "b": "2"}
{"c": "3", "d": "4"}
{"a": "6", "e": "5"}
(3 rows)
Here is the output of the jsonb_each_text() function from https://www.postgresql.org/docs/9.6/static/functions-json.html
# select jsonb_each_text(a) from foo;
jsonb_each_text
-----------------
(a,1)
(b,2)
(c,3)
(d,4)
(a,6)
(e,5)
(6 rows)
Now we need to put it in a table expression to be able to get access to the individual fields:
# with t1 as (select jsonb_each_text(a) as rec from foo)
select (rec).key, (rec).value from t1;
key | value
-----+-------
a | 1
b | 2
c | 3
d | 4
a | 6
e | 5
(6 rows)
And lastly, here is a grouping with the SUM function. Notice that the a key, which appeared in the data twice, has been properly summed.
# with t1 as (select jsonb_each_text(a) as rec from foo)
select (rec).key, sum((rec).value::int) from t1 group by (rec).key;
key | sum
-----+-----
c | 3
b | 2
a | 7
e | 5
d | 4
(5 rows)
As a final note, (rec) is wrapped in parentheses because otherwise it would incorrectly be treated as a table name, resulting in this error:
ERROR: missing FROM-clause entry for table "rec"
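As a side note, on PostgreSQL 9.4 and later the (rec) field-selection gymnastics can be avoided by moving the set-returning function into the FROM clause with LATERAL (a sketch against the foo table above):

```sql
SELECT e.key, sum(e.value::int)
FROM foo
CROSS JOIN LATERAL jsonb_each_text(a) AS e(key, value)
GROUP BY e.key;
```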

kdb: Add a column showing sum of rows in a table with dynamic headers while ignoring nulls

I have a table whose columns are dynamic, except for one column: A. The table also has some null values (0n) in it. How do I add another column that shows the total of each row and either ignores the columns holding 0n in that particular row or takes 0 in their place?
Here is my code; it fails on sum and also does not ignore nulls.
addTotalCol:{[]
table:flip`A`B`C`D!4 4#til 9;
colsToSum: string (cols table) except `A; / don't sum A
table: update Total: sum (colsToSum) from table; / type error here. Also check for nulls
:table;
}
I think it is better to use a functional update in your case:
addTotalCol:{[]
table:flip`A`B`C`D!4 4#til 9;
colsToSum:cols[table] except `A; / don't sum A
table:![table;();0b;enlist[`Total]!enlist(sum;enlist,colsToSum)];
:table;
}
The reason it does not work is that your fourth line is parsed as:
table: update Total: sum (enlist"B";enlist"C";enlist"D") from table;
Since sum only works on numbers, it signals a 'type error because your inputs are strings.
Another solution is to use colsToSum as string input:
addTotalCol:{[]
table:flip`A`B`C`D!4 4#til 9;
colsToSum:string cols[table] except `A; / don't sum A
table:get"update Total:sum(",sv[";";colsToSum],") from table";
:table;
}
Basically this will build the query in string before it is executed in q.
Still, the functional update is preferred.
EDIT: Full answer to sum 0n:
addTotalCol:{[]
table:flip`A`B`C`D!4 4#0n,til 9;
colsToSum:cols[table] except `A; / don't sum A
table:![table;();0b;enlist[`Total]!enlist(sum;(^;0;enlist,colsToSum))];
:table;
}
I think there is a cleaner version here without a functional form.
q)//let us build a table where our first col is symbols and the rest are numerics,
/// we will exclude first from row sums
q)t:flip `c0`c1`c2`c3!(`a`b`c`d;1 2 3 0N;0n 4 5 6f;1 2 3 0Nh)
q)//columns for sum
q)sc:cols[t] except `c0
q)///now let us make sure we fill in each column with zero,
/// add across rows and append as a new column
q)show t1:t,'flip enlist[`sumRows]!enlist sum each flip 0^t sc
c0 c1 c2 c3 sumRows
-------------------
a  1     1  2
b  2  4  2  8
c  3  5  3  11
d     6     6
q)meta t1
c      | t f a
-------| -----
c0     | s
c1     | i
c2     | f
c3     | h
sumRows| f

How to avoid multiple function evals with the (func()).* syntax in a query?

Context
When a function returns a TABLE or a SETOF composite-type, like this one:
CREATE FUNCTION func(n int) returns table(i int, j bigint) as $$
BEGIN
RETURN QUERY select 1,n::bigint
union all select 2,n*n::bigint
union all select 3,n*n*n::bigint;
END
$$ language plpgsql;
the results can be accessed by various methods:
select * from func(3) will produce these output columns:
i | j
---+---
1 | 3
2 | 9
3 | 27
select func(3) will produce only one output column of ROW type.
func
-------
(1,3)
(2,9)
(3,27)
select (func(3)).* will produce the same output as #1:
i | j
---+---
1 | 3
2 | 9
3 | 27
When the function argument comes from a table or a subquery, the syntax #3 is the only possible one, as in:
select N, (func(N)).*
from (select 2 as N union select 3 as N) s;
or as in this related answer. If we had LATERAL JOIN we could use that, but until PostgreSQL 9.3 is out, it's not supported, and the previous versions will still be used for years anyway.
Problem
The problem with syntax #3 is that the function is called as many times as there are columns in the result. There's no apparent reason for that, but it happens.
We can see it in version 9.2 by adding a RAISE NOTICE 'called for %', n in the function. With the query above, it outputs:
NOTICE: called for 2
NOTICE: called for 2
NOTICE: called for 3
NOTICE: called for 3
Now, if the function is changed to return 4 columns, like this:
CREATE FUNCTION func(n int) returns table(i int, j bigint,k int, l int) as $$
BEGIN
raise notice 'called for %', n;
RETURN QUERY select 1,n::bigint,1,1
union all select 2,n*n::bigint,1,1
union all select 3,n*n*n::bigint,1,1;
END
$$ language plpgsql stable;
then the same query outputs:
NOTICE: called for 2
NOTICE: called for 2
NOTICE: called for 2
NOTICE: called for 2
NOTICE: called for 3
NOTICE: called for 3
NOTICE: called for 3
NOTICE: called for 3
2 function calls were needed, 8 were actually made. The ratio is the number of output columns.
With syntax #2, which produces the same result apart from the output column layout, these multiple calls don't happen:
select N, func(N)
from (select 2 as N union select 3 as N) s;
gives:
NOTICE: called for 2
NOTICE: called for 3
followed by the 6 resulting rows:
n | func
---+------------
2 | (1,2,1,1)
2 | (2,4,1,1)
2 | (3,8,1,1)
3 | (1,3,1,1)
3 | (2,9,1,1)
3 | (3,27,1,1)
Questions
Is there a syntax or a construct with 9.2 that would achieve the expected result by doing only the minimum required function calls?
Bonus question: why do the multiple evaluations happen at all?
You can wrap it up in a subquery but that's not guaranteed safe without the OFFSET 0 hack. In 9.3, use LATERAL. The problem is caused by the parser effectively macro-expanding * into a column list.
Workaround
Where:
SELECT (my_func(x)).* FROM some_table;
will evaluate my_func n times for n result columns from the function, while this formulation:
SELECT (mf).* FROM (
SELECT my_func(x) AS mf FROM some_table
) sub;
generally will not, and tends not to add an additional scan at runtime. To guarantee that multiple evaluation won't be performed you can use the OFFSET 0 hack or abuse PostgreSQL's failure to optimise across CTE boundaries:
SELECT (mf).* FROM (
SELECT my_func(x) AS mf FROM some_table OFFSET 0
) sub;
or:
WITH tmp(mf) AS (
SELECT my_func(x) FROM some_table
)
SELECT (mf).* FROM tmp;
In PostgreSQL 9.3 you can use LATERAL to get a saner behaviour:
SELECT mf.*
FROM some_table
LEFT JOIN LATERAL my_func(some_table.x) AS mf ON true;
LEFT JOIN LATERAL ... ON true retains all rows like the original query, even if the function call returns no row.
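Applied to the query from the question, the 9.3+ form would read (a sketch):

```sql
SELECT s.N, f.*
FROM (SELECT 2 AS N UNION SELECT 3 AS N) s
LEFT JOIN LATERAL func(s.N) AS f ON true;
```

Here func is evaluated once per row of s, so the notice appears only twice.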
Demo
Create a function that isn't inlineable as a demonstration:
CREATE OR REPLACE FUNCTION my_func(integer)
RETURNS TABLE(a integer, b integer, c integer) AS $$
BEGIN
RAISE NOTICE 'my_func(%)',$1;
RETURN QUERY SELECT $1, $1, $1;
END;
$$ LANGUAGE plpgsql;
and a table of dummy data:
CREATE TABLE some_table AS SELECT x FROM generate_series(1,10) x;
then try the above versions. You'll see that the first form raises three notices per input row; the others raise only one.
Why?
Good question. It's horrible.
It looks like:
(func(x)).*
is expanded as:
(func(x)).i, (func(x)).j, (func(x)).k, (func(x)).l
in parsing, according to a look at debug_print_parse, debug_print_rewritten and debug_print_plan. The (trimmed) parse tree looks like this:
:targetList (
{TARGETENTRY
:expr
{FIELDSELECT
:arg
{FUNCEXPR
:funcid 57168
...
}
:fieldnum 1
:resulttype 23
:resulttypmod -1
:resultcollid 0
}
:resno 1
:resname i
...
}
{TARGETENTRY
:expr
{FIELDSELECT
:arg
{FUNCEXPR
:funcid 57168
...
}
:fieldnum 2
:resulttype 20
:resulttypmod -1
:resultcollid 0
}
:resno 2
:resname j
...
}
{TARGETENTRY
:expr
{FIELDSELECT
:arg
{FUNCEXPR
:funcid 57168
...
}
:fieldnum 3
:...
}
:resno 3
:resname k
...
}
{TARGETENTRY
:expr
{FIELDSELECT
:arg
{FUNCEXPR
:funcid 57168
...
}
:fieldnum 4
...
}
:resno 4
:resname l
...
}
)
So basically, we're using a dumb parser hack to expand wildcards by cloning nodes.