Order comments by thread path and by number of total votes - sql

I'm having trouble ordering comments by their thread path and by the number of upvotes on each comment.
Right now they are ordered only by thread path. I've searched and tried many things, but nothing works.
This is my query:
WITH RECURSIVE first_comments AS (
(
(
SELECT id, text, level, parent_id, array[id] AS thread_path, total_votes FROM comments
WHERE comments."postId" = 1 AND comments."level" = 0
)
)
UNION
(
SELECT e.id, e.text, e.level, e.parent_id, (fle.thread_path || e.id), e.total_votes
FROM
(
SELECT id, text, level, parent_id, total_votes FROM comments
WHERE comments."postId" = 1
) e, first_comments fle
WHERE e.parent_id = fle.id
)
)
SELECT id, text, level, total_votes, thread_path from first_comments ORDER BY 5 ASC
This query results in:
--------------------------------------------------
| id | level | total_votes | thread_path |
--------------------------------------------------
| 1 | 0 | 5 | {1} |
| 3 | 1 | 9 | {1,3} |
| 7 | 2 | 5 | {1,3,7} |
| 9 | 2 | 7 | {1,3,9} |
| 11 | 3 | 0 | {1,3,9,11} |
| 12 | 4 | 0 | {1,3,9,11,12} |
| 13 | 5 | 0 | {1,3,9,11,12,13} |
| 10 | 1 | 20 | {1,10} |
| 2 | 0 | 10 | {2} |
| 6 | 1 | 1 | {2,6} |
| 4 | 0 | 8 | {4} |
| 8 | 1 | 6 | {4,8} |
| 5 | 0 | 3 | {5} |
--------------------------------------------------
And the result should be
--------------------------------------------------
| id | level | total_votes | thread_path |
--------------------------------------------------
| 2 | 0 | 10 | {2} |
| 6 | 1 | 1 | {2,6} |
| 4 | 0 | 8 | {4} |
| 8 | 1 | 6 | {4,8} |
| 1 | 0 | 5 | {1} |
| 10 | 1 | 20 | {1,10} |
| 3 | 1 | 9 | {1,3} |
| 9 | 2 | 7 | {1,3,9} |
| 11 | 3 | 0 | {1,3,9,11} |
| 12 | 4 | 0 | {1,3,9,11,12} |
| 13 | 5 | 0 | {1,3,9,11,12,13} |
| 7 | 2 | 5 | {1,3,7} |
| 5 | 0 | 3 | {5} |
--------------------------------------------------
What am I missing here?
Thanks for the help.

Just accumulate another array next to the path, which contains not only the id of each comment in its path but also the total_votes (as a negative number) before each id. After that, you can order by that column.
WITH RECURSIVE first_comments AS (
(
(
SELECT id, text, level, parent_id, array[id] AS path, total_votes,
array[-total_votes, id] AS path_and_votes
FROM comments
WHERE comments."postId" = 1 AND comments."level" = 0
)
)
UNION
(
SELECT e.id, e.text, e.level, e.parent_id, (fle.path || e.id), e.total_votes,
(fle.path_and_votes || -e.total_votes || e.id)
FROM
(
SELECT id, text, level, parent_id, total_votes FROM comments
WHERE comments."postId" = 1
) e, first_comments fle
WHERE e.parent_id = fle.id
)
)
SELECT id, text, level, total_votes, path from first_comments ORDER BY path_and_votes ASC
SQLFiddle (only data -- without the recursive CTE)
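To see why the interleaved array sorts the way the question wants, here is a minimal sketch in plain Python rather than SQL. Python lists compare element by element, much like PostgreSQL arrays, so the same key can be built and sorted outside the database. The `comments` dictionary and the `path_and_votes` helper are made up for this illustration and only mirror the sample rows from the question.

```python
# Sample rows from the question: id -> (parent_id, total_votes).
comments = {
    1: (None, 5), 3: (1, 9), 7: (3, 5), 9: (3, 7), 11: (9, 0),
    12: (11, 0), 13: (12, 0), 10: (1, 20), 2: (None, 10), 6: (2, 1),
    4: (None, 8), 8: (4, 6), 5: (None, 3),
}


def path_and_votes(comment_id):
    """Build [-votes, id, -votes, id, ...] from the root down to comment_id."""
    key = []
    while comment_id is not None:
        parent_id, votes = comments[comment_id]
        key = [-votes, comment_id] + key  # prepend so the root comes first
        comment_id = parent_id
    return key


# Lexicographic comparison: a parent's key is a prefix of its replies' keys,
# so the parent sorts first, and siblings sort by descending votes.
ordered_ids = sorted(comments, key=path_and_votes)
print(ordered_ids)
```

Negating the votes is what lets a single ascending sort put higher-voted siblings first while each parent still precedes its replies.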

You want to order by the total votes at the top level. I would approach this by using a window function.
Instead of:
SELECT id, text, level, total_votes, path
from first_comments
ORDER BY 5 ASC;
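which explicitly orders by the path, try this:
select id, text, level, total_votes, path,
first_value(total_votes) over (partition by path[1] order by path) as toplevel_votes
from first_comments
order by toplevel_votes desc, path;
This looks up the total votes of the top-level comment of each thread and sorts the threads by that value. Within a thread the rows still follow the path, so replies are not re-sorted by their own votes.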

Related

Get the count of longest streak including the break point

I am working on a problem where I have to get the length of the longest streak, but to get the exact result I also have to count the row where the streak breaks. My table looks like this:
+-----------------+--------+-------+
| customer_number | Months | Flags |
+-----------------+--------+-------+
| 1 | 12 | 1 |
| 1 | 1 | 1 |
| 1 | 2 | 1 |
| 1 | 3 | 1 |
| 1 | 4 | 1 |
| 1 | 5 | 1 |
| 1 | 8 | 1 |
| 1 | 9 | 1 |
| 1 | 10 | 1 |
| 1 | 11 | 1 |
| 6 | 12 | 1 |
| 6 | 1 | 1 |
| 6 | 2 | 1 |
| 6 | 3 | 1 |
| 6 | 4 | 1 |
| 6 | 5 | 4 |
| 6 | 9 | 1 |
| 6 | 10 | 1 |
| 6 | 11 | 1 |
| 7 | 5 | 1 |
| 8 | 9 | 1 |
| 8 | 10 | 1 |
| 8 | 11 | 1 |
| 9 | 9 | 1 |
| 9 | 10 | 1 |
| 9 | 11 | 1 |
| 10 | 11 | 1 |
+-----------------+--------+-------+
and my desired output is
+----------+--------------------+
| Customer | Consecutive streak |
+----------+--------------------+
| 1 | 10 |
| 6 | 6 |
| 7 | 1 |
| 8 | 3 |
| 9 | 3 |
| 10 | 1 |
+----------+--------------------+
the code I have
SELECT customer_number, max(streak) max_consecutive_streak FROM (
SELECT customer_number, COUNT(*) as streak
FROM
(select *,
(row_number() over (order by customer_number) -
row_number() over (order by customer_number)
) as counts
from table1
) cc
group by customer_number, counts
)
GROUP BY 1;
It works well, but for customer_number 6 it returns 5 and I want 6. In other words, the row with flag 4 should also be counted in the longest streak, because that is where the streak breaks. Any idea how I can achieve that?
You can use a cte with row_number:
with cte(r, id, flag) as (
select row_number() over (order by c.customer_number), c.* from customers c
),
freq(id, t, f) as (
select c2.id, c2.f, count(*) from
(select c.id, (select sum(c1.flag!=c.flag) from cte c1 where c1.id=c.id and c1.r <= c.r) f from cte c)
c2 group by c2.id, c2.f
)
select id, max(f) from freq group by id;
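For comparison, here is a rough sketch of one way to count the breaking row, run through Python's sqlite3 module (window functions need SQLite 3.25 or newer). It rests on assumptions that are not in the question: a hypothetical `seq` column that fixes the row order within each customer, and the convention that any flag other than 1 ends a streak. Each row is assigned to a group numbered by how many breaking rows came strictly before it, so the breaking row itself still lands in the streak it ends.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# `seq` is a hypothetical ordering column; the question's table has none.
conn.execute("CREATE TABLE table1 (customer_number INT, seq INT, flag INT)")
flags_by_customer = {
    1: [1] * 10,
    6: [1, 1, 1, 1, 1, 4, 1, 1, 1],
    7: [1],
    8: [1, 1, 1],
}
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [(customer, seq, flag)
     for customer, flags in flags_by_customer.items()
     for seq, flag in enumerate(flags)],
)

QUERY = """
WITH marked AS (
    SELECT customer_number,
           -- grp = number of breaking rows strictly before this row
           COALESCE(SUM(flag <> 1) OVER (
               PARTITION BY customer_number ORDER BY seq
               ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
           ), 0) AS grp
    FROM table1
)
SELECT customer_number, MAX(streak)
FROM (
    SELECT customer_number, grp, COUNT(*) AS streak
    FROM marked
    GROUP BY customer_number, grp
)
GROUP BY customer_number
ORDER BY customer_number
"""

longest = dict(conn.execute(QUERY).fetchall())
print(longest)
```

If the real table has a date or sequence column, that would take the place of `seq`; without some ordering column the notion of a streak is not well defined.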

Get ID alongside max value ORACLE SQL

I currently have the following:
TABLE "QUARTO":
CREATE TABLE Quarto (
Id number(2) NOT NULL,
LotacaoMaxima number(1) NOT NULL,
TipoQuartoId number(1) NOT NULL,
NumeroQuartoNumSequencial number(3) NOT NULL,
NumeroQuartoAndarId varchar2(1) NOT NULL,
PRIMARY KEY (Id));
TABLE RESERVA:
CREATE TABLE Reserva (
Id number(3) NOT NULL,
ClienteNif number(9) NOT NULL,
QuartoId number(2) NOT NULL,
DataInicio date NOT NULL,
DataFim date NOT NULL,
NumPessoas number(1) NOT NULL,
Estado varchar2(15) NOT NULL,
DataCancelamento date,
PRIMARY KEY (Id));
And some data I have in both is:
QUARTO:
| ID | LOTACAOMAXIMA | TIPOQUARTOID | NUMEROQUARTONUMSEQUENCIAL | NUMEROQUARTOANDARID |
| :--- | :--- | :--- | :--- | :--- |
| 1 | 1 | 1 | 1 | 1 |
| 2 | 1 | 1 | 2 | 1 |
| 3 | 1 | 1 | 3 | 1 |
| 4 | 1 | 1 | 4 | 1 |
| 5 | 1 | 1 | 5 | 1 |
| 6 | 1 | 1 | 6 | 1 |
| 7 | 1 | 1 | 7 | 1 |
| 8 | 1 | 1 | 8 | 1 |
| 9 | 1 | 1 | 9 | 1 |
| 10 | 1 | 1 | 10 | 1 |
| 11 | 2 | 2 | 11 | 1 |
| 12 | 2 | 2 | 12 | 1 |
| 13 | 2 | 2 | 13 | 1 |
| 14 | 2 | 2 | 14 | 1 |
| 15 | 2 | 2 | 15 | 1 |
| 16 | 2 | 2 | 16 | 1 |
| 17 | 2 | 2 | 17 | 1 |
| 18 | 2 | 2 | 18 | 1 |
| 19 | 2 | 2 | 19 | 1 |
| 20 | 2 | 2 | 20 | 1 |
RESERVA:
| ID | CLIENTENIF | QUARTOID | DATAINICIO | DATAFIM | NUMPESSOAS | ESTADO | DATACANCELAMENTO |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| 1 | 296837970 | 11 | 2020-06-01 00:00:00 | 2020-06-12 00:00:00 | 1 | Finalizada | NULL |
| 2 | 275784703 | 17 | 2020-06-13 00:00:00 | 2020-06-21 00:00:00 | 1 | Finalizada | NULL |
| 3 | 220347654 | 11 | 2020-07-07 00:00:00 | 2020-07-15 00:00:00 | 2 | Finalizada | NULL |
| 4 | 294772545 | 12 | 2020-08-01 00:00:00 | 2020-08-15 00:00:00 | 2 | Finalizada | NULL |
| 5 | 220347654 | 3 | 2020-01-01 00:00:00 | 2020-01-16 00:00:00 | 1 | Finalizada | NULL |
WITH CONTAGEM_QUARTO_POR_ID AS (SELECT q.ID, COUNT(r.QUARTOID) AS NUM_RESERVAS, q.TIPOQUARTOID, q.NUMEROQUARTOANDARID
FROM RESERVA r
INNER JOIN QUARTO q on q.ID = r.QUARTOID
GROUP BY q.ID, q.TIPOQUARTOID, q.NUMEROQUARTOANDARID)
SELECT t.TIPOQUARTOID, t.NUMEROQUARTOANDARID, MAX(t.NUM_RESERVAS) AS MAX
FROM CONTAGEM_QUARTO_POR_ID t
GROUP BY t.TIPOQUARTOID, t.NUMEROQUARTOANDARID
And the Output is the following:
| TIPOQUARTOID | NUMEROQUARTOANDARID | MAX |
| :----------- | :------------------ | :-- |
| 1 | 2 | 2 |
| 2 | 1 | 8 |
| 1 | 1 | 1 |
Alongside the data I currently have, I also want to show the ID of each row, but when I add t.ID to the SELECT it forces me to add it to the GROUP BY, and the output is this:
| TIPOQUARTOID | NUMEROQUARTOANDARID | MAX | ID |
| :----------- | :------------------ | :-- | :- |
| 2 | 1 | 2 | 11 |
| 1 | 1 | 1 | 1 |
| 1 | 1 | 1 | 3 |
| 1 | 2 | 2 | 21 |
| 2 | 1 | 1 | 17 |
| 2 | 1 | 1 | 12 |
| 2 | 1 | 8 | 16 |
I only want to get the max value and the ID associated with that MAX.
You can use the KEEP clause in your query without changing it much as follows:
WITH CONTAGEM_QUARTO_POR_ID AS
(SELECT q.ID, COUNT(r.QUARTOID) AS NUM_RESERVAS, q.TIPOQUARTOID, q.NUMEROQUARTOANDARID
FROM RESERVA r
INNER JOIN QUARTO q on q.ID = r.QUARTOID
GROUP BY q.ID, q.TIPOQUARTOID, q.NUMEROQUARTOANDARID)
SELECT t.TIPOQUARTOID, t.NUMEROQUARTOANDARID, MAX(t.NUM_RESERVAS) AS MAX,
max(t.ID) keep(dense_rank first order by t.NUM_RESERVAS desc nulls last) as ID -- this
FROM CONTAGEM_QUARTO_POR_ID t
GROUP BY t.TIPOQUARTOID, t.NUMEROQUARTOANDARID
You need the MAX() OVER () analytic function on the NUM_RESERVAS column, with PARTITION BY TIPOQUARTOID, NUMEROQUARTOANDARID so that the maximum is computed per group of those columns, such as
WITH CONTAGEM_QUARTO_POR_ID AS
(
SELECT q.ID,
COUNT(r.QUARTOID) AS NUM_RESERVAS,
q.TIPOQUARTOID,
q.NUMEROQUARTOANDARID
FROM RESERVA r
JOIN QUARTO q
on q.ID = r.QUARTOID
GROUP BY q.ID, q.TIPOQUARTOID, q.NUMEROQUARTOANDARID
), t AS
(
SELECT t.TIPOQUARTOID, t.NUMEROQUARTOANDARID, t.NUM_RESERVAS,
MAX(t.NUM_RESERVAS)
OVER (PARTITION BY t.TIPOQUARTOID, t.NUMEROQUARTOANDARID) AS MAX,
t.ID
FROM CONTAGEM_QUARTO_POR_ID t
)
SELECT TIPOQUARTOID, NUMEROQUARTOANDARID, NUM_RESERVAS, ID
FROM t
WHERE NUM_RESERVAS = MAX
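or, without the extra CTE, by comparing each count against its group's maximum in a HAVING clause
SELECT q.TIPOQUARTOID,
q.NUMEROQUARTOANDARID,
COUNT(r.QUARTOID) AS NUM_RESERVAS,
q.ID
FROM RESERVA r
JOIN QUARTO q
on q.ID = r.QUARTOID
GROUP BY q.ID, q.TIPOQUARTOID, q.NUMEROQUARTOANDARID
HAVING COUNT(r.QUARTOID) = (
SELECT MAX(COUNT(r2.QUARTOID)) -- highest per-room count within the same group
FROM RESERVA r2
JOIN QUARTO q2
on q2.ID = r2.QUARTOID
WHERE q2.TIPOQUARTOID = q.TIPOQUARTOID
AND q2.NUMEROQUARTOANDARID = q.NUMEROQUARTOANDARID
GROUP BY q2.ID)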
I wouldn't suggest two levels of aggregation. Just use window functions:
WITH CONTAGEM_QUARTO_POR_ID AS (
SELECT q.ID, COUNT(*) AS NUM_RESERVAS, q.TIPOQUARTOID, q.NUMEROQUARTOANDARID,
ROW_NUMBER() OVER (PARTITION BY q.TIPOQUARTOID, q.NUMEROQUARTOANDARID ORDER BY COUNT(*) DESC) as seqnum
FROM RESERVA r INNER JOIN
QUARTO q
ON q.ID = r.QUARTOID
GROUP BY q.ID, q.TIPOQUARTOID, q.NUMEROQUARTOANDARID
)
SELECT cq.*
FROM CONTAGEM_QUARTO_POR_ID cq
WHERE seqnum = 1;
I think this would have slightly better performance than two aggregations (but it is worth checking).
One advantage of this approach is that it is more flexible. If you want ties, just change the ROW_NUMBER() to RANK() in the subquery.
Perhaps more importantly, ROW_NUMBER() is an "idiom" in SQL for returning one row (or a specific number of rows) per group. Learning how to use it is very valuable.
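The difference between ROW_NUMBER() and RANK() is easiest to see on data with a tie. The sketch below uses Python's sqlite3 module (SQLite 3.25 or newer) instead of Oracle, and a hypothetical table `contagem` that already holds the per-room counts shown in the question's second output, so it illustrates the idiom only, not the exact Oracle query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding the already aggregated per-room counts.
conn.execute(
    "CREATE TABLE contagem (id INT, tipoquartoid INT, andarid INT, num_reservas INT)"
)
conn.executemany(
    "INSERT INTO contagem VALUES (?, ?, ?, ?)",
    [(11, 2, 1, 2), (1, 1, 1, 1), (3, 1, 1, 1), (21, 1, 2, 2),
     (17, 2, 1, 1), (12, 2, 1, 1), (16, 2, 1, 8)],
)


def top_per_group(ranking_function):
    """Return the top row(s) per (tipoquartoid, andarid) group."""
    query = f"""
    SELECT tipoquartoid, andarid, num_reservas, id
    FROM (
        SELECT c.*,
               {ranking_function}() OVER (
                   PARTITION BY tipoquartoid, andarid
                   ORDER BY num_reservas DESC
               ) AS seqnum
        FROM contagem AS c
    )
    WHERE seqnum = 1
    ORDER BY tipoquartoid, andarid, id
    """
    return conn.execute(query).fetchall()


one_per_group = top_per_group("ROW_NUMBER")  # exactly one row per group
with_ties = top_per_group("RANK")            # every row tied for first place
print(one_per_group)
print(with_ties)
```

Rooms 1 and 3 tie in their group, so RANK() returns both while ROW_NUMBER() picks one of them arbitrarily. Adding a tiebreaker such as the id to the ORDER BY would make that choice deterministic.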

Parent Child Hierarchy to Return All Descendants with Corresponding Primary ID

I have a parent child hierarchy table. I am trying to return a list of all of the descendant IDs for each organization ID. My table is defined as follows:
CREATE TABLE Organization_Hierarchy_Test (ORGANIZATION_ID INT, PARENT_ORG_ID INT);
INSERT INTO Organization_Hierarchy_Test (ORGANIZATION_ID, PARENT_ORG_ID)
VALUES(1,0), (2,1), (3,1), (4,2), (5,2), (6,2), (7,3), (8,3), (9,3), (10,3);
The results that I am after would look like this:
+-----------------+---------------+--------------------------+
| ORGANIZATION_ID | PARENT_ORG_ID | ORIGINAL_ORGANIZATION_ID |
+-----------------+---------------+--------------------------+
| 1 | 0 | 1 |
| 2 | 1 | 1 |
| 3 | 1 | 1 |
| 4 | 2 | 1 |
| 5 | 2 | 1 |
| 6 | 2 | 1 |
| 7 | 3 | 1 |
| 8 | 3 | 1 |
| 9 | 3 | 1 |
| 10 | 3 | 1 |
| 2 | 0 | 2 |
| 3 | 0 | 2 |
| 4 | 1 | 2 |
| 5 | 1 | 2 |
| 6 | 1 | 2 |
| 7 | 1 | 2 |
| 8 | 1 | 2 |
| 9 | 1 | 2 |
| 10 | 1 | 2 |
| 4 | 0 | 4 |
| 5 | 0 | 4 |
| 6 | 0 | 4 |
| 7 | 0 | 4 |
| 8 | 0 | 4 |
| 9 | 0 | 4 |
| 10 | 0 | 4 |
+-----------------+---------------+--------------------------+
The query that I have written gets me a list of all of the descendants for each organization_id, but I can not figure out how to return the same organization_id that is in fact related to all of the descendants.
I have tried adding a group by and returning the max id with little luck. I have a delivery date tomorrow and I am worried that I am not going to be able to work through this in time.
with descendants as
( select PARENT_ORG_ID, ORGANIZATION_ID, 1 as level
from Organization_Hierarchy_Test OH
union all
select d.PARENT_ORG_ID , OH1.ORGANIZATION_ID, d.level + 1
from descendants as d
join Organization_Hierarchy_Test OH1 on d.ORGANIZATION_ID = OH1.PARENT_ORG_ID
)
select ORGANIZATION_ID, PARENT_ORG_ID, level
from descendants
order by level, PARENT_ORG_ID, ORGANIZATION_ID
Any ideas on how to return the original Organization_ID along with all of the descendant organization_id's?
I am trying to push this to a tabular model and this will save me loads of time in processing the data.
Thanks very much in advance.
Change your CTE to carry an extra column Orig: seed it with ORGANIZATION_ID in the anchor member and pass it along unchanged as d.Orig in the recursive member:
with descendants as
( select PARENT_ORG_ID, ORGANIZATION_ID, ORGANIZATION_ID AS Orig, 1 as level
from Organization_Hierarchy_Test OH
union all
select d.PARENT_ORG_ID , OH1.ORGANIZATION_ID, d.Orig, d.level + 1
from descendants as d
join Organization_Hierarchy_Test OH1 on d.ORGANIZATION_ID = OH1.PARENT_ORG_ID
)
select ORGANIZATION_ID, PARENT_ORG_ID, Orig, level
from descendants
order by Orig, level, ORGANIZATION_ID
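As a quick check of the idea of carrying the starting organization through the recursion, here is a small sketch using Python's sqlite3 module with the sample rows from the question. The table name `org` and the column names `orig` and `depth` are chosen for this illustration. SQLite needs the RECURSIVE keyword, which SQL Server omits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE org (organization_id INT, parent_org_id INT)")
conn.executemany(
    "INSERT INTO org VALUES (?, ?)",
    [(1, 0), (2, 1), (3, 1), (4, 2), (5, 2), (6, 2),
     (7, 3), (8, 3), (9, 3), (10, 3)],
)

QUERY = """
WITH RECURSIVE descendants(organization_id, orig, depth) AS (
    -- Anchor: every organization starts its own subtree.
    SELECT organization_id, organization_id, 1 FROM org
    UNION ALL
    -- Recursive step: attach children, keeping the starting id unchanged.
    SELECT child.organization_id, d.orig, d.depth + 1
    FROM descendants AS d
    JOIN org AS child ON child.parent_org_id = d.organization_id
)
SELECT orig, organization_id FROM descendants ORDER BY orig, organization_id
"""

# Map each starting organization to itself plus all of its descendants.
subtree = {}
for orig, organization_id in conn.execute(QUERY):
    subtree.setdefault(orig, []).append(organization_id)
print(subtree)
```

Every organization shows up once per ancestor plus once for itself, which is the shape a tabular model would need for filtering by any node in the hierarchy.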

group by top two results based on order

I have been trying to get this to work with some row_number, group by, top, sort of things, but I am missing some fundamental concept. I have a table like so:
+-------+-------+-------+
| name | ord | f_id |
+-------+-------+-------+
| a | 1 | 2 |
| b | 5 | 2 |
| c | 6 | 2 |
| d | 2 | 1 |
| e | 4 | 1 |
| a | 2 | 3 |
| c | 50 | 4 |
+-------+-------+-------+
And my desired output would be:
+-------+---------+--------+-------+
| f_id | ord_n | ord | name |
+-------+---------+--------+-------+
| 2 | 1 | 1 | a |
| 2 | 2 | 5 | b |
| 1 | 1 | 2 | d |
| 1 | 2 | 4 | e |
| 3 | 1 | 2 | a |
| 4 | 1 | 50 | c |
+-------+---------+--------+-------+
Where data is ordered by the ord value, and only up to two results per f_id. Should I be working on a Stored Procedure for this or can I just do it with SQL? I have experimented with some select TOP subqueries, but nothing has even come close..
Here are some statements to create the test table:
create table help(name varchar(255),ord tinyint,f_id tinyint);
insert into help values
('a',1,2),
('b',5,2),
('c',6,2),
('d',2,1),
('e',4,1),
('a',2,3),
('c',50,4);
You can use the RANK or DENSE_RANK window functions to limit the rows per f_id; ROW_NUMBER supplies the ord_n column.
select A.name, A.ord_n, A.ord , A.f_id from
(
select
RANK() OVER (partition by f_id ORDER BY ord asc) AS "Rank",
ROW_NUMBER() OVER (partition by f_id ORDER BY ord asc) AS "ord_n",
help.*
from help
) A where A.rank <= 2
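If ties on ord should never produce more than two rows per f_id, filtering on the ROW_NUMBER column directly is enough. Here is a minimal sketch of that variant using Python's sqlite3 module (SQLite 3.25 or newer) with the test data from the question. It is an alternative to the RANK filter above, not a copy of it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE help (name TEXT, ord INT, f_id INT)")
conn.executemany(
    "INSERT INTO help VALUES (?, ?, ?)",
    [("a", 1, 2), ("b", 5, 2), ("c", 6, 2), ("d", 2, 1),
     ("e", 4, 1), ("a", 2, 3), ("c", 50, 4)],
)

QUERY = """
SELECT f_id, ord_n, ord, name
FROM (
    -- Number the rows of each f_id by ascending ord.
    SELECT h.*, ROW_NUMBER() OVER (PARTITION BY f_id ORDER BY ord) AS ord_n
    FROM help AS h
)
WHERE ord_n <= 2
ORDER BY f_id, ord_n
"""

top_two = conn.execute(QUERY).fetchall()
for row in top_two:
    print(row)
```

No stored procedure is needed; the window function does the per-group numbering and the outer WHERE keeps the first two rows.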

Sequential Group By in sql server

For this Table:
+----+--------+-------+
| ID | Status | Value |
+----+--------+-------+
| 1 | 1 | 4 |
| 2 | 1 | 7 |
| 3 | 1 | 9 |
| 4 | 2 | 1 |
| 5 | 2 | 7 |
| 6 | 1 | 8 |
| 7 | 1 | 9 |
| 8 | 2 | 1 |
| 9 | 0 | 4 |
| 10 | 0 | 3 |
| 11 | 0 | 8 |
| 12 | 1 | 9 |
| 13 | 3 | 1 |
+----+--------+-------+
I need to sum sequential groups with the same Status to produce this result.
+--------+------------+
| Status | Sum(Value) |
+--------+------------+
| 1 | 20 |
| 2 | 8 |
| 1 | 17 |
| 2 | 1 |
| 0 | 15 |
| 1 | 9 |
| 3 | 1 |
+--------+------------+
How can I do that in SQL Server?
NB: The values in the ID column are contiguous.
Per the tag I added to your question this is a gaps and islands problem.
The best performing solution will likely be
WITH T
AS (SELECT *,
ID - ROW_NUMBER() OVER (PARTITION BY [STATUS] ORDER BY [ID]) AS Grp
FROM YourTable)
SELECT [STATUS],
SUM([VALUE]) AS [SUM(VALUE)]
FROM T
GROUP BY [STATUS],
Grp
ORDER BY MIN(ID)
If the ID values were not guaranteed to be contiguous, you would instead need to use the following expression in the CTE definition:
ROW_NUMBER() OVER (ORDER BY [ID]) -
ROW_NUMBER() OVER (PARTITION BY [STATUS] ORDER BY [ID]) AS Grp
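To make the grouping trick concrete, here is a small sketch that runs the two-ROW_NUMBER variant through Python's sqlite3 module (SQLite 3.25 or newer) on the sample rows from the question. The table name `your_table` is a placeholder. Within a run of equal Status values both row numbers advance together, so their difference stays constant and can serve as the group key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (id INT, status INT, value INT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?)",
    [(1, 1, 4), (2, 1, 7), (3, 1, 9), (4, 2, 1), (5, 2, 7), (6, 1, 8),
     (7, 1, 9), (8, 2, 1), (9, 0, 4), (10, 0, 3), (11, 0, 8), (12, 1, 9),
     (13, 3, 1)],
)

QUERY = """
WITH t AS (
    SELECT *,
           -- Constant within each consecutive run of the same status.
           ROW_NUMBER() OVER (ORDER BY id)
             - ROW_NUMBER() OVER (PARTITION BY status ORDER BY id) AS grp
    FROM your_table
)
SELECT status, SUM(value)
FROM t
GROUP BY status, grp
ORDER BY MIN(id)
"""

islands = conn.execute(QUERY).fetchall()
print(islands)
```

Grouping by both status and grp matters, because two different statuses can happen to share the same difference value.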