Generate a JSON array of values for each row

Assuming the following CTE:
with mytable as (
select column1 as foo, column2 as bar, column3 as baz
from (values
('a', 'b', 1),
('c', 'd', 2)
) v
)
Using array_agg() outputs an array of values:
select
array_agg(v)
from mytable v;
-- {"(a,b,1)","(c,d,2)"}
But surprisingly (to me at least), using to_json() on this array restores the field names, producing an object for each row:
select
to_json(array_agg(v))
from mytable v;
-- [{"foo":"a","bar":"b","baz":1},{"foo":"c","bar":"d","baz":2}]
How can we make PostgreSQL output an array of arrays instead, rendering each row as an array of values?
select
something(v)
from mytable v;
-- [["a", "b", 1],["c", "d", 2]]

You can convert each row to JSON, unnest the key/value pairs, and then aggregate the values back:
with mytable (foo, bar, baz) as (
values
('a', 'b', 1),
('c', 'd', 2)
)
select jsonb_agg(x.vals)
from mytable m
cross join lateral (
select jsonb_agg(value order by idx) as vals
from json_each(row_to_json(m)) with ordinality as t(key,value,idx)
) x
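With the sample data from the question, this should return:
-- [["a", "b", 1], ["c", "d", 2]]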
If the order of the column values in the array is important for you, it's important to use json (not jsonb) to convert the row, because jsonb does not preserve key order.
If you need this often, you can put this into a function (a sketch follows below).
If the order of the column values in the array isn't important, you can use a JSON path function:
select jsonb_path_query_array(to_jsonb(m), '$.keyvalue().value')
from mytable m;
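If you do put it into a function, a minimal sketch could look like this (the name row_to_value_array is my own choice, not from the answer):
create function row_to_value_array(p anyelement)
returns jsonb
language sql
stable
as $$
select jsonb_agg(value order by idx)
from json_each(to_json(p)) with ordinality as t(key, value, idx)
$$;
-- usage:
select jsonb_agg(row_to_value_array(m))
from mytable m;
-- [["a", "b", 1], ["c", "d", 2]]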

Besides the answer from a_horse_with_no_name, I just found a way to achieve this, assuming column names are known:
with mytable as (
select column1 as foo, column2 as bar, column3 as baz
from (values
('a', 'b', 1),
('c', 'd', 2)
) v
)
select
to_json(array_agg(x.vals))
from (
select
json_build_array(
v.foo,
v.bar,
v.baz
) as vals
from mytable v
) x
;
-- [["a", "b", 1],["c", "d", 2]]

Related

prestoSQL aggregate columns and rows into one column

I would like to aggregate some columns and rows into one column in a prestoSQL table.
with example_table as (
select * from (
values ('A', 'nh', 7), ('A', 'mn', 4), ('A', 'sv', 3),
('B', 'tb', 6), ('B', 'ty', 5), ('A', 'rw', 2),
('C', 'op', 9), ('C', 'au', 8)
) example_table("id", "time", "value")
)
select id, array_agg(value, time) -- Unexpected parameters (integer, VARCHAR(2)) for function array_agg. Expected: array_agg(T) T
from example_table
group by id
I would like to combine the columns "time" and "value" into one column and then aggregate all rows by "id", such that:
id  time_value_agg
A   [['nh', 7], ['mn', 4], ['sv', 3], ['rw', 2]]
B   [['tb', 6], ['ty', 5]]
C   [['op', 9], ['au', 8]]
The column time_value_agg should be an array of str; if the "time" col is not str, cast it to str.
I am not sure which function can be used for this. Thanks.
array_agg can be applied to a single column only. If times are unique per id, you can turn the data into a map:
select id, map(array_agg(time), array_agg(value)) time_value_agg
from example_table
group by id
Output:
id  time_value_agg
C   {op=9, au=8}
A   {mn=4, sv=3, rw=2, nh=7}
B   {ty=5, tb=6}
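If times can repeat within an id, Presto's multimap_agg collects all values per key instead (a sketch, same sample data assumed):
select id, multimap_agg(time, value) time_value_agg
from example_table
group by id
-- e.g. A -> {nh=[7], mn=[4], sv=[3], rw=[2]}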
Or turn the data into a ROW type (or map) before aggregation:
select id,
array_agg(arr) time_value_agg
from (
select id, cast(row(time, value) as row(time varchar, value integer)) arr
from example_table
)
group by id
Output:
id  time_value_agg
C   [{time=op, value=9}, {time=au, value=8}]
A   [{time=nh, value=7}, {time=mn, value=4}, {time=sv, value=3}, {time=rw, value=2}]
B   [{time=tb, value=6}, {time=ty, value=5}]
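And if you literally want the array-of-arrays-of-str shape from the question, you can cast value to varchar and aggregate two-element arrays (a sketch along the same lines):
select id, array_agg(ARRAY[time, cast(value as varchar)]) time_value_agg
from example_table
group by id
-- e.g. A -> [[nh, 7], [mn, 4], [sv, 3], [rw, 2]]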

COALESCE function won't return CHAR(1)

I am using the COALESCE function but getting the following error:
Conversion failed when converting the varchar value 'X' to data type int.
I have to join two tables on two conditions. If the second condition doesn't hold but there is a blank cell (not null, but blank '') in Table 1, I want to join to that row; if there is no such row either, return a zero.
Join Table 1 and Table 2 - return Table 2 and column 3 from Table 1.
Table 1
(A, 1, X),
(A, 2, Y),
(A, 3, Z),
(A, , X),
(B, 1, X),
(B, 2, Z),
(B, 3, Y),
Table 2
(A, 1),
(A, 2),
(A, 3),
(A, 5),
(B, 1),
(B, 2),
(B, 3),
(B, 5)
I want to get a return of
(A, 1, X),
(A, 2, Y),
(A, 3, Z),
(A, 5, X),
(B, 1, X),
(B, 2, Z),
(B, 3, Y),
(B, 5, NULL)
Code:
DECLARE @table1 TABLE (letter1 CHAR(1), num1 INT, letter2 CHAR(1))
DECLARE @table2 TABLE (letter1 CHAR(1), num1 INT)
INSERT INTO @table1 VALUES
('A', 1, 'X'),
('A', 2, 'Y'),
('A', 3, 'Z'),
('A', null, 'X'),
('B', 1, 'X'),
('B', 2, 'Y'),
('B', 3, 'Z')
INSERT INTO @table2 VALUES
('A', 1),
('A', 2),
('A', 3),
('A', 5),
('B', 1),
('B', 2),
('B', 3),
('B', 5)
SELECT t2.*,
COALESCE(
(SELECT TOP 1 letter2 FROM @table1 WHERE letter1 = t2.letter1 AND num1 = t2.num1),
(SELECT TOP 1 letter2 FROM @table1 WHERE letter1 = t2.letter1 AND num1 IS NULL),
0
) AS missing_letter
FROM @table2 t2
Perhaps you need:
select t1.*, t2.*
from table1 t1 outer apply
( select top (1) t2.*
from table2 t2
where t2.col1 = t1.col1 and t1.col2 in ('', t2.col2)
order by t2.col2 desc
) t2;
If I understand correctly, this has less to do with coalesce() and more to do with the joins:
select t2.*, coalesce(t1.letter2, t1def.letter2) as letter2
from table2 t2 left join
table1 t1
on t2.letter1 = t1.letter1 and t2.num1 = t1.num1 left join
table1 t1def
on t2.letter1 = t1def.letter1 and t1def.num1 is null;
The problem here is your data type. COALESCE is shorthand for a CASE expression. For example, COALESCE('a',1,'c') would be shorthand for:
CASE WHEN 'a' IS NOT NULL THEN 'a'
WHEN 1 IS NOT NULL THEN 1
ELSE 'c'
END
The documentation (COALESCE (Transact-SQL)) describes this as well:
The COALESCE expression is a syntactic shortcut for the CASE
expression. That is, the code COALESCE(expression1,...n) is
rewritten by the query optimizer as the following CASE expression:
CASE
WHEN (expression1 IS NOT NULL) THEN expression1
WHEN (expression2 IS NOT NULL) THEN expression2
...
ELSE expressionN
END
A CASE expression follows data type precedence, and int has a higher precedence than varchar; thus everything will be implicitly cast to an int. This is why both the COALESCE and the CASE expression fail: neither 'a' nor 'c' can be converted to an int.
You'll need to therefore explicitly CONVERT your int to a varchar:
COALESCE('a',CONVERT(char(1),1),'c')
The documentation (cited above), however, also goes on to state:
This means that the input values (expression1, expression2,
expressionN, etc.) are evaluated multiple times. Also, in compliance
with the SQL standard, a value expression that contains a subquery is
considered non-deterministic and the subquery is evaluated twice. In
either case, different results can be returned between the first
evaluation and subsequent evaluations.
For example, when the code COALESCE((subquery), 1) is executed, the
subquery is evaluated twice. As a result, you can get different
results depending on the isolation level of the query. For example,
the code can return NULL under the READ COMMITTED isolation level in a
multi-user environment. To ensure stable results are returned, use the
SNAPSHOT ISOLATION isolation level, or replace COALESCE with the
ISNULL function.
Considering you are using a subquery, a (nested) ISNULL might be the better choice here.
It's worth noting that, although people seem to confuse them because they are functionally similar, COALESCE and ISNULL do not behave the same. COALESCE uses data type precedence, whereas ISNULL implicitly casts the second value to the data type of the first parameter. Thus ISNULL('a',1) works fine, but COALESCE('a',1) does not.
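For example, a nested ISNULL version of the query from the question might look like this (a sketch; ISNULL takes exactly two arguments, hence the nesting):
SELECT t2.*,
ISNULL(
(SELECT TOP 1 letter2 FROM @table1 WHERE letter1 = t2.letter1 AND num1 = t2.num1),
ISNULL(
(SELECT TOP 1 letter2 FROM @table1 WHERE letter1 = t2.letter1 AND num1 IS NULL),
'0'
)
) AS missing_letter
FROM @table2 t2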
Just change the zero to a null; you can't mix data types in a COALESCE:
SELECT t2.*,
COALESCE(
(SELECT TOP 1 letter2 FROM @table1 WHERE letter1 = t2.letter1 AND num1 = t2.num1),
(SELECT TOP 1 letter2 FROM @table1 WHERE letter1 = t2.letter1 AND num1 IS NULL),
null
) AS missing_letter
FROM @table2 t2
The query works if the 0 in the COALESCE is replaced by '0'.
That way the COALESCE doesn't contain mixed data types.
SELECT t2.*,
COALESCE(
(SELECT TOP 1 letter2 FROM @table1 t1 WHERE t1.letter1 = t2.letter1 AND t1.num1 = t2.num1),
(SELECT TOP 1 letter2 FROM @table1 t1 WHERE t1.letter1 = t2.letter1 AND t1.num1 IS NULL),
'0'
) AS missing_letter
FROM @table2 t2
ORDER BY t2.letter1, t2.num1;
And by using an OUTER APPLY, you can avoid having to retrieve data from table1 twice.
Since the expected result has a NULL for ('B', 5), the COALESCE isn't even needed this way.
SELECT t2.letter1, t2.num1, t1.letter2 AS missing_letter
FROM @table2 AS t2
OUTER APPLY (
select top 1 t.letter2
from @table1 AS t
where t.letter1 = t2.letter1
and (t.num1 is null or t.num1 = t2.num1)
order by t.num1 desc
) AS t1
ORDER BY t2.letter1, t2.num1;
Result:
letter1 num1 missing_letter
------- ---- --------------
A 1 X
A 2 Y
A 3 Z
A 5 X
B 1 X
B 2 Y
B 3 Z
B 5 NULL

How to use a subset of the row columns when converting to JSON?

I have a table t with some columns a, b and c. I use the following query to convert rows to a JSON array of objects:
SELECT COALESCE(JSON_AGG(t ORDER BY c), '[]'::json)
FROM t
This returns as expected:
[
{
"a": ...,
"b": ...,
"c": ...
},
{
"a": ...,
"b": ...,
"c": ...
}
]
Now I want the same result, but with only columns a and b in the output. I will still use column c for ordering. The best I came up with is the following:
SELECT COALESCE(JSON_AGG(JSON_BUILD_OBJECT('a', a, 'b', b) ORDER BY c), '[]'::json)
FROM t
[
{
"a": ...,
"b": ...
},
{
"a": ...,
"b": ...
}
]
Although this works fine, I am wondering if there is a more elegant way to do this. It frustrates me that I have to manually define the JSON properties. Of course, I understand that I have to enumerate the columns a and b, but it's odd that I have to copy/paste the corresponding JSON property name, which is exactly the same as the column name anyway.
Is there another way to do this?
You can use row_to_json instead of manually building the object:
CREATE TABLE foobar (a text, b text, c text);
INSERT INTO foobar VALUES
('1', 'LOREM', 'A'),
('2', 'LOREM', 'B'),
('3', 'LOREM', 'C');
--Using CTE
WITH tmp AS (
SELECT a, b FROM foobar ORDER BY c
)
SELECT json_agg(row_to_json(t)) FROM tmp t
--Using subquery
SELECT
json_agg(row_to_json(t))
FROM
(SELECT a, b FROM foobar ORDER BY c) t;
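With the sample data above, both variants should return:
-- [{"a":"1","b":"LOREM"}, {"a":"2","b":"LOREM"}, {"a":"3","b":"LOREM"}]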
EDIT: As you stated, result order is a strict requirement. In this case you could use a row constructor to build the JSON object:
--Using a type to build json with desired keys
CREATE TYPE mytype AS (a text, b text);
SELECT
json_agg(
to_json(
CAST(
ROW(a, b) AS mytype
)
)
ORDER BY c)
FROM
foobar;
--Ignoring column names (the keys then default to f1, f2)...
SELECT
json_agg(
to_json(
ROW(a, b)
)
ORDER BY c)
FROM
foobar;
Perform the ordering in a subquery or CTE and then apply json_agg:
SELECT COALESCE(JSON_AGG(t2), '[]'::json)
FROM (SELECT a, b FROM t ORDER BY c) t2
Alternatively, use jsonb: the jsonb type allows deleting items by specifying their key with the - operator:
SELECT coalesce(jsonb_agg(row_to_json(t)::jsonb - 'c'
order by c), '[]'::jsonb)
FROM t

Conditionally include fields in a Presto query

I have a presto query which works as expected:
SELECT json_format(cast(MAP(ARRAY['random_name'],
ARRAY[
MAP(
ARRAY['a', 'b', 'c'],
ARRAY[a, b, c]
)]
) as JSON)) as metadata from a_table_with_a_b_c; -- a, b, c are all ints
Now I only want to include a, b and c when they are larger than 0. How do I change the query? I can add a CASE WHEN, but it seems I will then get "a": null instead of the key being omitted.
You can try this:
with rows as (
select row_number() over() as row_id, ARRAY['a', 'b', 'c'] as keys, ARRAY[a, b, c] as vals
from a_table_with_a_b_c
)
select
json_format(cast(MAP(ARRAY['random_name'],
ARRAY[
MAP(
array_agg(key), array_agg(value)
)]
) as JSON)) as metadata
from rows
cross join unnest (keys, vals) as t (key, value)
where value is not null and value > 0
group by row_id;
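Alternatively, assuming your Presto version has map_filter, you can build the full map and then drop the non-positive entries (a sketch):
select json_format(cast(MAP(ARRAY['random_name'],
ARRAY[
map_filter(
MAP(ARRAY['a', 'b', 'c'], ARRAY[a, b, c]),
(k, v) -> v is not null and v > 0
)]
) as JSON)) as metadata
from a_table_with_a_b_c;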

sql generate code based on three column values

Suppose I have three columns:
row no  column1  column2  column3
1       A        B        C
2       A        B        C
3       D        E        F
4       G        H        I
5       G        H        C
I want to generate a code by combining these three column values, e.g.:
1) ABC001
2) ABC002
3) DEF001
4) GHI001
5) GHC001
The logic, checking the combination of the three columns, is that when the three column values are the same, the first occurrence shows 'ABC001', the second occurrence shows 'ABC002', and so on.
You can try this. I don't know what logic you want for the zero padding, but you can add the zeros manually or let the rn decide for you:
declare @mytable table (rowno int, col1 nvarchar(50), col2 nvarchar(50), col3 nvarchar(50))
insert into @mytable
values
(1,'A', 'B', 'C'),
(2,'A', 'B', 'C'),
(3,'D', 'E', 'F'),
(4,'G', 'H', 'I'),
(5,'G', 'H', 'C')
select rowno, col1, col2, col3,
case when rn >= 10 and rn < 100 then concatcol + '0' + cast(rn as nvarchar(50))
when rn >= 100 then concatcol + cast(rn as nvarchar(50))
else concatcol + '00' + cast(rn as nvarchar(50))
end as ConcatCol
from (
select rowno, col1, col2, col3,
col1 + col2 + col3 as ConcatCol,
row_number() over (partition by col1, col2, col3 order by rowno) as rn
from @mytable
) x
order by rowno
The CASE expression makes sure that when rn hits 10 it writes ABC010, when it hits 100 or above it writes ABC100, and when it's under 10 it writes ABC001, and so on.
Result:
rowno  col1  col2  col3  ConcatCol
1      A     B     C     ABC001
2      A     B     C     ABC002
3      D     E     F     DEF001
4      G     H     I     GHI001
5      G     H     C     GHC001
TSQL: CONCAT(column1, column2, column3, RIGHT(REPLICATE('0', 3) + LEFT(row_no, 3), 3))
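Combined with the per-combination ROW_NUMBER() the question asks for, that one-liner might look like this (a sketch; row_no and <table_name> are placeholders):
SELECT CONCAT(column1, column2, column3,
RIGHT(REPLICATE('0', 3) + CAST(ROW_NUMBER() OVER (PARTITION BY column1, column2, column3 ORDER BY row_no) AS VARCHAR(10)), 3)) AS code
FROM <table_name>;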
You can combine your columns like below:
SELECT CONVERT(VARCHAR(MAX), ROW_NUMBER() OVER (ORDER BY (SELECT NULL))) + ') ' + DATA AS Data
FROM
(
SELECT column1 + column2 + column3 + '00' + CONVERT(VARCHAR(MAX), ROW_NUMBER() OVER (PARTITION BY column1, column2, column3 ORDER BY (SELECT NULL))) DATA
FROM <table_name>
) T;
Result:
1)ABC001
2)ABC002
3)DEF001
4)GHI001
5)GHC001
MySQL:
CONCAT(column1,column2,column3,LPAD(row_no, 3, '0'))
(You will need to enclose row no in backticks if there is a space in the name of the field instead of an underscore.)
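On MySQL 8.0+, which has window functions, the full query might look like this (a sketch; <table_name> is a placeholder):
SELECT CONCAT(column1, column2, column3,
LPAD(ROW_NUMBER() OVER (PARTITION BY column1, column2, column3 ORDER BY `row no`), 3, '0')) AS code
FROM <table_name>;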