My table column has nested arrays in a Snowflake database. I want to perform some aggregations using SQL (Snowflake SQL).
My table name is: DATA
The PROJ column is of VARIANT data type. A row will not always contain exactly three nested arrays, as the sample DATA table shows.
| ID | PROJ | LOCATION |
|----|-------------------------------|----------|
| 1 |[[0, 4], [1, 30], [10, 20]] | S |
| 2 |[[0, 2], [1, 20]] | S |
| 3 |[[0, 8], [1, 10], [10, 100]] | S |
Desired Output:
| Index | LOCATION | Min | Max | Mean|
|-------|----------|------|-----|-----|
| 0 | S | 2 | 8 | 4.66|
| 1 | S | 10 | 30 | 20 |
| 10 | S | 20 | 100| 60 |
First flatten the nested array. Index is then the first element of each subarray and Value is the second (arrays are 0-based):
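-- Sample data, including the LOCATION column from the question
CREATE OR REPLACE TABLE DATA
AS
SELECT 1 AS ID, [[0, 4], [1, 30], [10, 20]] AS PROJ, 'S' AS LOCATION UNION ALL
SELECT 2 AS ID, [[0, 2], [1, 20]] AS PROJ, 'S' AS LOCATION UNION ALL
SELECT 3 AS ID, [[0, 8], [1, 10], [10, 100]] AS PROJ, 'S' AS LOCATION;
Query:
-- FLATTEN yields one row per inner array; VALUE[0] is the index, VALUE[1] the value
SELECT s.VALUE[0]::INT AS Index,
       LOCATION,
       MIN(s.VALUE[1]::INT) AS MinValue,
       MAX(s.VALUE[1]::INT) AS MaxValue,
       AVG(s.VALUE[1]::INT) AS MeanValue
FROM DATA,
LATERAL FLATTEN(input => PROJ) s
GROUP BY s.VALUE[0]::INT, LOCATION
ORDER BY Index;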
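To check the arithmetic outside Snowflake, the same flatten-then-group logic can be sketched in plain Python. This is only an illustration: the `rows` list and the `aggregate` helper are made up for this example, and the LOCATION values are taken from the question's table.

```python
from collections import defaultdict
from statistics import mean

# Sample rows mirroring the question's DATA table: (ID, PROJ, LOCATION).
rows = [
    (1, [[0, 4], [1, 30], [10, 20]], "S"),
    (2, [[0, 2], [1, 20]], "S"),
    (3, [[0, 8], [1, 10], [10, 100]], "S"),
]

def aggregate(rows):
    """Flatten PROJ into (index, value) pairs, then aggregate per (index, location)."""
    groups = defaultdict(list)
    for _id, proj, location in rows:
        # One (index, value) pair per inner array, like one FLATTEN output row.
        for index, value in proj:
            groups[(index, location)].append(value)
    return [
        (index, location, min(vals), max(vals), mean(vals))
        for (index, location), vals in sorted(groups.items())
    ]

result = aggregate(rows)
for row in result:
    print(row)
```

The three result rows correspond to indexes 0, 1 and 10, with the minimum, maximum and mean of the values found at each index.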
Given table of enums
|id |reaction |
|-- |-------- |
|1 |laugh |
|2 |love |
|3 |love |
|4 |like |
|5 |like |
|6 |surprised|
|7 |like |
|8 |love |
|9 |like |
|10 |surprised|
How can I select from it to get the following JSON array of tuples [reaction, count()]?
[
[laugh, 1],
[love, 3],
[like, 4],
[surprised, 2]
]
You can aggregate the result of a group by query:
select jsonb_agg(jsonb_build_object(reaction, count))
from (
select reaction, count(*)
from the_table
group by reaction
) t;
This would return:
[
{"surprised": 2},
{"like": 4},
{"laugh": 1},
{"love": 3}
]
Or if you really want the inner key/value pairs as a JSON array (the count is cast to text because all elements of a Postgres array must share one type):
select jsonb_agg(array[reaction, "count"])
from (
select reaction, count(*)::text as "count"
from the_table
group by reaction
) t;
This would return:
[
["surprised","2"],
["like","4"],
["laugh","1"],
["love","3"]
]
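For comparison, here is a minimal Python sketch of the target shape, an array of [reaction, count] pairs with numeric counts. The `reactions` list and the `reaction_counts` helper are illustrative names, not part of the original schema.

```python
import json
from collections import Counter

# The reaction column from the question, in id order.
reactions = ["laugh", "love", "love", "like", "like",
             "surprised", "like", "love", "like", "surprised"]

def reaction_counts(values):
    """Group equal values and return [value, count] pairs in first-seen order."""
    counts = Counter(values)  # plays the role of GROUP BY reaction + count(*)
    return [[reaction, n] for reaction, n in counts.items()]

pairs = reaction_counts(reactions)
print(json.dumps(pairs))
```

Unlike the SQL result, which has no guaranteed order without an ORDER BY, this sketch keeps the order in which each reaction first appears.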
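You can also use a window function (count(*) OVER (PARTITION BY ...)) together with jsonb_build_array, which keeps the count numeric. Note that this returns one JSON array per reaction; wrap the expression in jsonb_agg to get a single array: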
SELECT
jsonb_build_array(json_reactions.reaction, count)
FROM
(
SELECT
DISTINCT reaction, count(*) OVER (PARTITION BY reaction)
FROM
reactions r ) AS json_reactions ;
I'm using DB2 SQL. I have the following:
select * from mytable order by Var,Varseq
ID Var Varseq
-- --- ------
1 A 1
1 A 2
1 B 1
1 A 3
2 A 1
2 C 1
but would like to get:
ID Var Varseq NewSeq
-- --- ------ ------
1 A 1 1
1 A 2 2
1 B 1 1
1 A 3 1
2 A 1 1
2 C 1 1
However, dense_rank just reproduces the original Varseq values. The difference is in the 4th line of the desired output: when ID=1 returns to Var=A, I want the index reset to 1 instead of carrying on as 3, i.e. the index should be reset every time Var changes for a given ID.
For reference, here is my query:
SELECT *, DENSE_RANK() OVER (PARTITION BY ID, VAR ORDER BY VARSEQ) FROM MYTABLE
This is an example of a gaps-and-islands problem. However, SQL tables represent unordered sets. Without a column that specifies the overall ordering, your question does not make sense.
In this case, the difference of row numbers will do what you want. But you need an overall ordering column:
select t.*,
row_number() over (partition by id, var, seqnum - seqnum2 order by <ordering col>) as newseq
from (select t.*,
row_number() over (partition by id order by <ordering col>) as seqnum,
row_number() over (partition by id, var order by <ordering col>) as seqnum2
from t
) t
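The difference-of-row-numbers idea can be tried out in SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later). This is a sketch under one loud assumption: the `ord` column is hypothetical and stands in for the missing overall ordering column; its values simply follow the row order shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# `ord` is a hypothetical ordering column; the original table does not have one.
conn.execute("CREATE TABLE t (ord INTEGER, id INTEGER, var TEXT, varseq INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", [
    (1, 1, "A", 1), (2, 1, "A", 2), (3, 1, "B", 1),
    (4, 1, "A", 3), (5, 2, "A", 1), (6, 2, "C", 1),
])

query = """
SELECT id, var, varseq,
       -- seqnum - seqnum2 is constant within each run of equal var values
       ROW_NUMBER() OVER (PARTITION BY id, var, seqnum - seqnum2 ORDER BY ord) AS newseq
FROM (SELECT t.*,
             ROW_NUMBER() OVER (PARTITION BY id ORDER BY ord) AS seqnum,
             ROW_NUMBER() OVER (PARTITION BY id, var ORDER BY ord) AS seqnum2
      FROM t) AS sub
ORDER BY ord
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

For the fourth row (ID=1, Var=A, Varseq=3) seqnum is 4 and seqnum2 is 3, so the difference is 1 rather than 0 and the row starts a new island with newseq 1.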
Not an answer yet, but just to have better formatting.
WITH TAB (ID, Var, Varseq) AS
(
VALUES
(1, 'A', 1)
, (1, 'A', 2)
, (1, 'A', 3)
, (1, 'B', 1)
, (2, 'A', 1)
, (2, 'C', 1)
)
SELECT *
FROM TAB
ORDER BY ID, <order keys>;
You specified Var, Varseq as <order keys> in the query above.
The result is:
|ID |VAR|VARSEQ |
|-----------|---|-----------|
|1 |A |1 |
|1 |A |2 |
|1 |A |3 |
|1 |B |1 |
|2 |A |1 |
|2 |C |1 |
But you need the following according to your question:
|ID |VAR|VARSEQ |
|-----------|---|-----------|
|1 |A |1 |
|1 |A |2 |
|1 |B |1 |
|1 |A |3 |
|2 |A |1 |
|2 |C |1 |
So, please, edit your question to specify such a <order keys> clause to get the result you need. And please, run your query getting such an order on your system first before posting here...
I'm building a forum, very much like Reddit/Slashdot, i.e.
Unlimited reply nesting levels
Popular comments (ordered by likes/votes) will rise to the top (within their own nesting/depth level), but the tree structure needs to be retained (parent is always shown directly above children)
Here's a sample table & data:
DROP TABLE IF EXISTS "comments";
CREATE TABLE comments (
id BIGINT PRIMARY KEY,
parent_id BIGINT,
body TEXT NOT NULL,
like_score BIGINT,
depth BIGINT
);
INSERT INTO comments VALUES ( 0, NULL, 'Main top of thread post', 5 , 0 );
INSERT INTO comments VALUES ( 1, 0, 'comment A', 5 , 1 );
INSERT INTO comments VALUES ( 2, 1, 'comment A.A', 3, 2 );
INSERT INTO comments VALUES ( 3, 1, 'comment A.B', 1, 2 );
INSERT INTO comments VALUES ( 9, 3, 'comment A.B.A', 10, 3 );
INSERT INTO comments VALUES ( 10, 3, 'comment A.B.B', 5, 3 );
INSERT INTO comments VALUES ( 11, 3, 'comment A.B.C', 8, 3 );
INSERT INTO comments VALUES ( 4, 1, 'comment A.C', 5, 2 );
INSERT INTO comments VALUES ( 5, 0, 'comment B', 10, 1 );
INSERT INTO comments VALUES ( 6, 5, 'comment B.A', 7, 2 );
INSERT INTO comments VALUES ( 7, 5, 'comment B.B', 5, 2 );
INSERT INTO comments VALUES ( 8, 5, 'comment B.C', 2, 2 );
Here's the recursive query I've come up with so far, but I can't figure out how to order children, but retain tree structure (parent should always be above children)...
WITH RECURSIVE tree AS (
SELECT
ARRAY[]::BIGINT[] AS sortable,
id,
body,
like_score,
depth
FROM "comments"
WHERE parent_id IS NULL
UNION ALL
SELECT
tree.sortable || "comments".like_score || "comments".id,
"comments".id,
"comments".body,
"comments".like_score,
"comments".depth
FROM "comments", tree
WHERE "comments".parent_id = tree.id
)
SELECT * FROM tree
ORDER BY sortable DESC
This outputs...
+----------------------------------------------------------+
|sortable |id|body |like_score|depth|
+----------------------------------------------------------+
|{10,5,7,6} |6 |comment B.A |7 |2 |
|{10,5,5,7} |7 |comment B.B |5 |2 |
|{10,5,2,8} |8 |comment B.C |2 |2 |
|{10,5} |5 |comment B |10 |1 |
|{5,1,5,4} |4 |comment A.C |5 |2 |
|{5,1,3,2} |2 |comment A.A |3 |2 |
|{5,1,1,3,10,9}|9 |comment A.B.A |10 |3 |
|{5,1,1,3,8,11}|11|comment A.B.C |8 |3 |
|{5,1,1,3,5,10}|10|comment A.B.B |5 |3 |
|{5,1,1,3} |3 |comment A.B |1 |2 |
|{5,1} |1 |comment A |5 |1 |
| |0 |Main top of thread post|5 |0 |
+----------------------------------------------------------+
...however notice that "comment B", "comment A" and "Main top of thread post" are below their children? How do I keep the contextual order? i.e. The output I want is:
+----------------------------------------------------------+
|sortable |id|body |like_score|depth|
+----------------------------------------------------------+
| |0 |Main top of thread post|5 |0 |
|{10,5} |5 |comment B |10 |1 |
|{10,5,7,6} |6 |comment B.A |7 |2 |
|{10,5,5,7} |7 |comment B.B |5 |2 |
|{10,5,2,8} |8 |comment B.C |2 |2 |
|{5,1} |1 |comment A |5 |1 |
|{5,1,5,4} |4 |comment A.C |5 |2 |
|{5,1,3,2} |2 |comment A.A |3 |2 |
|{5,1,1,3} |3 |comment A.B |1 |2 |
|{5,1,1,3,10,9}|9 |comment A.B.A |10 |3 |
|{5,1,1,3,8,11}|11|comment A.B.C |8 |3 |
|{5,1,1,3,5,10}|10|comment A.B.B |5 |3 |
+----------------------------------------------------------+
I actually want the users to be able to sort by a number of methods:
Most popular first
Least popular first
Newest first
Oldest first
etc
...but in all cases the parents need to be shown above their children. But I'm just using "like_score" here as the example, and I should be able to figure out the rest from there.
I've spent many hours researching the web and trying things myself, and it feels like I'm getting close, but I can't figure out this last part.
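Two changes are needed:
1. Negate like_score when building the sort key, so that higher scores come first in an ascending sort:
tree.sortable || -"comments".like_score || "comments".id
2. Sort ascending instead of descending:
ORDER BY sortable
Arrays are compared element by element, and an array that is a prefix of another one sorts before it, so with an ascending sort every parent lands directly above its children. The full query: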
WITH RECURSIVE tree AS (
SELECT
ARRAY[]::BIGINT[] AS sortable,
id,
body,
like_score,
depth
FROM "comments"
WHERE parent_id IS NULL
UNION ALL
SELECT
tree.sortable || -"comments".like_score || "comments".id,
"comments".id,
"comments".body,
"comments".like_score,
"comments".depth
FROM "comments", tree
WHERE "comments".parent_id = tree.id
)
SELECT * FROM tree
ORDER BY sortable
+-------------------+----+-------------------------+------------+-------+
| sortable | id | body | like_score | depth |
+-------------------+----+-------------------------+------------+-------+
| (null) | 0 | Main top of thread post | 5 | 0 |
+-------------------+----+-------------------------+------------+-------+
| {-10,5} | 5 | comment B | 10 | 1 |
+-------------------+----+-------------------------+------------+-------+
| {-10,5,-7,6} | 6 | comment B.A | 7 | 2 |
+-------------------+----+-------------------------+------------+-------+
| {-10,5,-5,7} | 7 | comment B.B | 5 | 2 |
+-------------------+----+-------------------------+------------+-------+
| {-10,5,-2,8} | 8 | comment B.C | 2 | 2 |
+-------------------+----+-------------------------+------------+-------+
| {-5,1} | 1 | comment A | 5 | 1 |
+-------------------+----+-------------------------+------------+-------+
| {-5,1,-5,4} | 4 | comment A.C | 5 | 2 |
+-------------------+----+-------------------------+------------+-------+
| {-5,1,-3,2} | 2 | comment A.A | 3 | 2 |
+-------------------+----+-------------------------+------------+-------+
| {-5,1,-1,3} | 3 | comment A.B | 1 | 2 |
+-------------------+----+-------------------------+------------+-------+
| {-5,1,-1,3,-10,9} | 9 | comment A.B.A | 10 | 3 |
+-------------------+----+-------------------------+------------+-------+
| {-5,1,-1,3,-8,11} | 11 | comment A.B.C | 8 | 3 |
+-------------------+----+-------------------------+------------+-------+
| {-5,1,-1,3,-5,10} | 10 | comment A.B.B | 5 | 3 |
+-------------------+----+-------------------------+------------+-------+
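Python lists compare the same way (element by element, with a prefix sorting first), so the ordering can be reproduced in a few lines. This is a sketch for illustration only; the `comments` list mirrors the sample data and `sorted_tree` is a made-up helper.

```python
# (id, parent_id, body, like_score), mirroring the sample INSERT statements.
comments = [
    (0, None, "Main top of thread post", 5),
    (1, 0, "comment A", 5),
    (2, 1, "comment A.A", 3),
    (3, 1, "comment A.B", 1),
    (9, 3, "comment A.B.A", 10),
    (10, 3, "comment A.B.B", 5),
    (11, 3, "comment A.B.C", 8),
    (4, 1, "comment A.C", 5),
    (5, 0, "comment B", 10),
    (6, 5, "comment B.A", 7),
    (7, 5, "comment B.B", 5),
    (8, 5, "comment B.C", 2),
]

def sorted_tree(comments):
    """Sort so that popular siblings come first and parents precede children."""
    by_id = {c[0]: c for c in comments}

    def sort_key(comment_id):
        cid, parent_id, _body, likes = by_id[comment_id]
        if parent_id is None:
            return []  # the root has an empty key, like ARRAY[]::BIGINT[]
        # Parent key + negated score + id, as in the recursive CTE.
        return sort_key(parent_id) + [-likes, cid]

    return sorted(comments, key=lambda c: sort_key(c[0]))

ordered_ids = [c[0] for c in sorted_tree(comments)]
print(ordered_ids)
```

For "least popular first", the same key works without the negation; the parent still sorts first because its key is a prefix of every descendant's key.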
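Alternatively, build a zero-padded path of ids and order by it. This keeps each parent directly above its children, although siblings are then ordered by id rather than by like_score: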
WITH RECURSIVE tree AS (
SELECT
ARRAY[]::BIGINT[] AS sortable,
id,
body,
like_score,
depth,
lpad(id::text, 2, '0') as path
FROM "comments"
WHERE parent_id IS NULL
UNION ALL
SELECT
tree.sortable || "comments".like_score || "comments".id,
"comments".id,
"comments".body,
"comments".like_score,
"comments".depth,
tree.path || '/' || lpad("comments".id::text, 2, '0') as path
FROM "comments", tree
WHERE "comments".parent_id = tree.id
)
SELECT * FROM tree
ORDER BY path
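Note that the second argument of lpad (2 here) is the number of digits each id is padded to. Choose it at least as large as the longest id, because lpad truncates strings that are longer than the given length.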
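Why the padding matters can be seen in a short Python sketch. The `padded_path` helper and the sample `paths` are illustrative; `str.zfill` is used in place of lpad and, unlike lpad, never truncates.

```python
def padded_path(ids, width=2):
    """Join ids into a '/'-separated path, zero-padding each id to `width` digits."""
    return "/".join(str(i).zfill(width) for i in ids)

# Ancestor chains (root first) for a few comments from the sample data.
paths = [[0], [0, 1], [0, 1, 3], [0, 1, 3, 9], [0, 1, 3, 10], [0, 5]]

# Without padding, text comparison puts id 10 before id 9.
unpadded = sorted("/".join(str(i) for i in p) for p in paths)

# With padding, text order matches numeric order at every level.
padded = sorted(padded_path(p) for p in paths)

print(unpadded)
print(padded)
```

With two digits the scheme works for ids up to 99; a larger table needs a larger width.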
My table currently holds data like the sample below.
I want to number the rows based on the ChildLoadStart and ChildLoadEnd columns using window functions.
Data Sample
LoadNumber |DispatchNumber|ChildLoadStart|ChildLoadEnd |
---------------------------------------------------------
123 | A |1 |1 |
---------------------------------------------------------
123 |B |1 |0 |
---------------------------------------------------------
123 |C |0 |0 |
---------------------------------------------------------
123 |D |0 |1 |
---------------------------------------------------------
In the data above, load 123 has two child loads: dispatch A is one child load, and dispatches B, C and D together form another.
I need to number the rows within each child load, so the result should look like the table below. Can someone help me with this?
LoadNumber |DispatchNumber|ChildLoadStart|ChildLoadEnd |Order |
-----------------------------------------------------------------------
123 | A |1 |1 |1 |
------------------------------------------------------------------------
123 |B |1 |0 |1 |
------------------------------------------------------------------------
123 |C |0 |0 |2 |
------------------------------------------------------------------------
123 |D |0 |1 |3 |
------------------------------------------------------------------------
If DispatchNumber can be used to order the data, Teradata's RESET WHEN clause restarts the numbering at each child load start:
ROW_NUMBER()
OVER (PARTITION BY LoadNumber
ORDER BY DispatchNumber
RESET WHEN ChildLoadStart = 1)
The RESET WHEN clause proposed by @dnoeth is probably exactly what is needed here. I'm not familiar with Teradata, though, so below is an Oracle alternative that may be useful for someone.
First divide the data into groups using a cumulative sum of childloadstart, then use that column (grp) in the partition by clause of row_number():
select loadnumber, dispatchnumber, childloadstart, childloadend,
row_number() over (partition by loadnumber, grp order by dispatchnumber) as "ORDER"
from (
select data.*,
sum(childloadstart) over (partition by loadnumber order by dispatchnumber) grp
from data )
Test data and output:
create table data (LoadNumber number(4), DispatchNumber varchar2(2),
ChildLoadStart number(1), ChildLoadEnd number(1));
insert into data values (123, 'A', 1, 1);
insert into data values (123, 'B', 1, 0);
insert into data values (123, 'C', 0, 0);
insert into data values (123, 'D', 0, 1);
LOADNUMBER DISPATCHNUMBER CHILDLOADSTART CHILDLOADEND ORDER
---------- -------------- -------------- ------------ ----------
123 A 1 1 1
123 B 1 0 1
123 C 0 0 2
123 D 0 1 3
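The cumulative-sum approach is not Oracle-specific. As a rough cross-check, the same query runs in SQLite through Python's sqlite3 module (SQLite 3.25 or later is assumed for window functions). The column alias `child_order` and the subquery alias `grouped` are names chosen for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data (LoadNumber INTEGER, DispatchNumber TEXT, "
    "ChildLoadStart INTEGER, ChildLoadEnd INTEGER)"
)
conn.executemany("INSERT INTO data VALUES (?, ?, ?, ?)", [
    (123, "A", 1, 1), (123, "B", 1, 0), (123, "C", 0, 0), (123, "D", 0, 1),
])

query = """
SELECT LoadNumber, DispatchNumber, ChildLoadStart, ChildLoadEnd,
       ROW_NUMBER() OVER (PARTITION BY LoadNumber, grp
                          ORDER BY DispatchNumber) AS child_order
FROM (SELECT data.*,
             -- the running total grows by 1 at every child load start
             SUM(ChildLoadStart) OVER (PARTITION BY LoadNumber
                                       ORDER BY DispatchNumber) AS grp
      FROM data) AS grouped
ORDER BY LoadNumber, DispatchNumber
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Here grp is 1 for dispatch A and 2 for B, C and D, so the numbering restarts at B.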