Sum all sub group last value by group - sql

Consider the following table:
ID | ITEM | GROUP_ID | VAL | COST
---+------+----------+-----------+-------
1 | A | 1 | 1 | 12
2 | B | 1 | 2 | 12
3 | C | 1 | 3 | 12
4 | D | 1 | 4 | 13
5 | D | 1 | 5 | 12
6 | E | 2 | 1 | 17
7 | E | 2 | 2 | 10
8 | E | 2 | 3 | 11
9 | E | 2 | 4 | 12
10 | F | 2 | 5 | 15
11 | F | 2 | 6 | 13
12 | F | 2 | 7 | 11
13 | F | 2 | 8 | 12
How can I get the following result?
GROUP_ID | VAL | COST
----------+-----------+-------
1 | 15 | 48
2 | 36 | 24
VAL is the sum of VAL per GROUP_ID.
COST is the sum, per GROUP_ID, of the COST on the last row (highest ID) of each ITEM.

Use the analytic function ROW_NUMBER(), which is available in PostgreSQL, Oracle, and SQL Server:
WITH last_item AS (
    -- cost of the newest row (highest id) of each item, summed per group
    SELECT t.group_id, SUM(t.cost) AS sum_cost
    FROM (
        SELECT t1.*,
               ROW_NUMBER() OVER (PARTITION BY item ORDER BY id DESC) AS rn
        FROM Table1 t1
    ) t   -- no AS here: Oracle rejects AS before a table alias
    WHERE t.rn = 1
    GROUP BY t.group_id
),
val_sum AS (
    -- plain per-group total of val
    SELECT t.group_id, SUM(t.val) AS sum_val
    FROM Table1 t
    GROUP BY t.group_id
)
SELECT v.group_id, v.sum_val, l.sum_cost
FROM val_sum v
INNER JOIN last_item l
    ON v.group_id = l.group_id
OUTPUT
| group_id | sum_val | sum_cost |
|----------|---------|----------|
| 1 | 15 | 48 |
| 2 | 36 | 24 |
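As a rough cross-check of the approach above, here is a minimal sketch that loads the question's rows into an in-memory SQLite database from Python and computes both sums in one pass. It is a variant, not the answer's exact query: it uses conditional aggregation instead of joining two CTEs, and it partitions by group_id as well as item (an assumption, in case an item could appear in more than one group). The names `group_totals`, `QUERY`, and `ranked` are made up for the illustration, and SQLite 3.25 or later is assumed for window functions.

```python
import sqlite3

# Rows from the question: (id, item, group_id, val, cost)
rows = [
    (1, "A", 1, 1, 12), (2, "B", 1, 2, 12), (3, "C", 1, 3, 12),
    (4, "D", 1, 4, 13), (5, "D", 1, 5, 12), (6, "E", 2, 1, 17),
    (7, "E", 2, 2, 10), (8, "E", 2, 3, 11), (9, "E", 2, 4, 12),
    (10, "F", 2, 5, 15), (11, "F", 2, 6, 13), (12, "F", 2, 7, 11),
    (13, "F", 2, 8, 12),
]

QUERY = """
WITH ranked AS (
    SELECT t.*,
           -- rn = 1 marks the newest row of each item within its group
           ROW_NUMBER() OVER (PARTITION BY group_id, item ORDER BY id DESC) AS rn
    FROM table1 t
)
SELECT group_id,
       SUM(val) AS sum_val,
       SUM(CASE WHEN rn = 1 THEN cost ELSE 0 END) AS sum_cost
FROM ranked
GROUP BY group_id
ORDER BY group_id
"""


def group_totals(data):
    """Return (group_id, sum_val, sum_cost) tuples for the given rows."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE table1 "
        "(id INTEGER, item TEXT, group_id INTEGER, val INTEGER, cost INTEGER)"
    )
    con.executemany("INSERT INTO table1 VALUES (?, ?, ?, ?, ?)", data)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result


print(group_totals(rows))  # [(1, 15, 48), (2, 36, 24)]
```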

Try this
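WITH LastRow (id)
AS (
    -- id of the last row of every item within its group
    SELECT MAX(id)
    FROM TheTable
    GROUP BY item, group_id
)
SELECT group_id,
       SUM(val) AS sum_val,
       -- cost only counts on rows that are the last one for their item
       SUM(CASE WHEN B.id IS NULL THEN 0 ELSE cost END) AS sum_cost
FROM TheTable A
LEFT OUTER JOIN LastRow B ON A.id = B.id
GROUP BY group_id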
Thanks to @Juan Carlos Oropeza for creating the SQL Fiddle test data.

Related

Get the count of longest streak including the break point

I need the length of each customer's longest streak, and to get the exact result the count must also include the row where the streak breaks. My table looks like this:
+-----------------+--------+-------+
| customer_number | Months | Flags |
+-----------------+--------+-------+
| 1 | 12 | 1 |
| 1 | 1 | 1 |
| 1 | 2 | 1 |
| 1 | 3 | 1 |
| 1 | 4 | 1 |
| 1 | 5 | 1 |
| 1 | 8 | 1 |
| 1 | 9 | 1 |
| 1 | 10 | 1 |
| 1 | 11 | 1 |
| 6 | 12 | 1 |
| 6 | 1 | 1 |
| 6 | 2 | 1 |
| 6 | 3 | 1 |
| 6 | 4 | 1 |
| 6 | 5 | 4 |
| 6 | 9 | 1 |
| 6 | 10 | 1 |
| 6 | 11 | 1 |
| 7 | 5 | 1 |
| 8 | 9 | 1 |
| 8 | 10 | 1 |
| 8 | 11 | 1 |
| 9 | 9 | 1 |
| 9 | 10 | 1 |
| 9 | 11 | 1 |
| 10 | 11 | 1 |
+-----------------+--------+-------+
and my desired output is
+----------+--------------------+
| Customer | Consecutive streak |
+----------+--------------------+
| 1 | 10 |
| 6 | 6 |
| 7 | 1 |
| 8 | 3 |
| 9 | 3 |
| 10 | 1 |
+----------+--------------------+
the code I have
SELECT customer_number, max(streak) max_consecutive_streak FROM (
SELECT customer_number, COUNT(*) as streak
FROM
(select *,
(row_number() over (order by customer_number) -
row_number() over (order by customer_number)
) as counts
from table1
) cc
group by customer_number, counts
)
GROUP BY 1;
It works, but for customer_number 6 it returns 5 and I want 6: the row with flag 4 should be counted in the longest streak as well, because the streak breaks at that point. How can I achieve that?
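You can use a CTE with row_number(). The version below assumes that a flag of 1 marks a streak row, and it relies on MySQL evaluating a comparison to 0 or 1 so that it can be summed. Ordering row_number() by customer_number alone does not guarantee the order within a customer, so add a real ordering column if the table has one:
with cte(r, id, months, flag) as (
    select row_number() over (order by c.customer_number), c.* from customers c
),
freq(id, grp, cnt) as (
    select c2.id, c2.grp, count(*) from
        (select c.id,
                -- breaks strictly before this row, so the breaking row stays in the streak it ends
                (select coalesce(sum(c1.flag != 1), 0) from cte c1 where c1.id = c.id and c1.r < c.r) grp
         from cte c) c2
    group by c2.id, c2.grp
)
select id, max(cnt) from freq group by id;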
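For readers who want to try the "count the breaking row too" idea, here is a minimal sketch in Python with an in-memory SQLite database. It rests on two assumptions that are not in the question: a flag of 1 means the streak continues, and there is an explicit ordering column (called `seq` here, a hypothetical name) so that row order is deterministic. The Months column is left out because it plays no part in the count. Each row is assigned the number of breaking rows strictly before it, so the breaking row lands in the streak it ends.

```python
import sqlite3

# Flags per customer, in the order shown in the question.
FLAGS = {
    1: [1] * 10,
    6: [1, 1, 1, 1, 1, 4, 1, 1, 1],
    7: [1],
    8: [1, 1, 1],
    9: [1, 1, 1],
    10: [1],
}

QUERY = """
WITH marked AS (
    SELECT customer_number,
           -- breaks strictly before the current row; NULL on the first row
           COALESCE(SUM(CASE WHEN flags <> 1 THEN 1 ELSE 0 END) OVER (
               PARTITION BY customer_number
               ORDER BY seq
               ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
           ), 0) AS grp
    FROM table1
),
streaks AS (
    SELECT customer_number, grp, COUNT(*) AS streak
    FROM marked
    GROUP BY customer_number, grp
)
SELECT customer_number, MAX(streak) AS max_consecutive_streak
FROM streaks
GROUP BY customer_number
ORDER BY customer_number
"""


def longest_streaks(flags_by_customer):
    """Return (customer_number, longest streak including the break row)."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE table1 (customer_number INTEGER, seq INTEGER, flags INTEGER)"
    )
    con.executemany(
        "INSERT INTO table1 VALUES (?, ?, ?)",
        [(cust, seq, flag)
         for cust, flags in flags_by_customer.items()
         for seq, flag in enumerate(flags, start=1)],
    )
    result = con.execute(QUERY).fetchall()
    con.close()
    return result


print(longest_streaks(FLAGS))
# [(1, 10), (6, 6), (7, 1), (8, 3), (9, 3), (10, 1)]
```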

BigQuery: Get top 3 records for each group

I'm new to BigQuery. I need the top 3 scores for each group.
| Name | Group | Score |
| A | 1 | 100 |
| B | 2 | 80 |
| C | 3 | 101 |
| D | 1 | 53 |
| X | 2 | 8 |
| Y | 3 | 61 |
| Z | 1 | 97 |
| W | 2 | 20 |
Consider below
select * except(pos)
from (
select *, row_number() over(partition by `group` order by score desc) pos
from `project.dataset.table`
)
where pos <= 3
Another option (more BigQuery'ish)
select arr.*
from (
select array_agg(t order by score desc limit 3) arr
from `project.dataset.table` t
group by `group`
) a, a.arr
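The row_number() variant is portable beyond BigQuery. As a small sketch, the same pattern can be run against the question's data in SQLite from Python. The table name `scores` and the function `top_n` are invented for the example, and `select * except(pos)` is replaced by an explicit column list because that syntax is BigQuery-specific.

```python
import sqlite3

# Rows from the question: (name, group, score)
SCORES = [
    ("A", 1, 100), ("B", 2, 80), ("C", 3, 101), ("D", 1, 53),
    ("X", 2, 8), ("Y", 3, 61), ("Z", 1, 97), ("W", 2, 20),
]

QUERY = """
SELECT name, "group", score
FROM (
    SELECT name, "group", score,
           -- rank rows inside each group, best score first
           ROW_NUMBER() OVER (PARTITION BY "group" ORDER BY score DESC) AS pos
    FROM scores
) AS ranked
WHERE pos <= ?
ORDER BY "group", score DESC
"""


def top_n(data, n):
    """Return the n highest-scoring rows of every group."""
    con = sqlite3.connect(":memory:")
    con.execute('CREATE TABLE scores (name TEXT, "group" INTEGER, score INTEGER)')
    con.executemany("INSERT INTO scores VALUES (?, ?, ?)", data)
    result = con.execute(QUERY, (n,)).fetchall()
    con.close()
    return result


print(top_n(SCORES, 2))
# [('A', 1, 100), ('Z', 1, 97), ('B', 2, 80), ('W', 2, 20), ('C', 3, 101), ('Y', 3, 61)]
```

With the sample data no group has more than three rows, so `top_n(SCORES, 3)` returns every row; the call above uses 2 to make the filtering visible.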

Cte within Cte in SQL

I have run into a situation where I need to apply a WHERE and GROUP BY condition to the result of a CTE, inside another CTE.
Table 1 as follows
+---+---+---+---+
| x | y | z | w |
+---+---+---+---+
| 1 | 2 | 3 | 1 |
| 2 | 3 | 4 | 2 |
| 3 | 2 | 5 | 3 |
| 1 | 2 | 6 | 2 |
+---+---+---+---+
Table 2 as follows
+---+---+-----+---+
| a | b | c | d |
+---+---+-----+---+
| 1 | m | 100 | 1 |
| 2 | n | 23 | 2 |
| 4 | o | 34 | 4 |
| 1 | m | 23 | 2 |
+---+---+-----+---+
Assume the result of the following SQL query is available as a table called TAB:
with cte as (
select x,y,z,w from table1),
cte1 as (select a,b,c,d from table2)
select cte.x,cte.y,cte.z,cte.w,cte1.b,cte1.c from cte left join cte1 on cte.x=cte1.a and cte.w=cte1.d
Result of above CTE would be as follows
+---+---+---+---+---+-----+
| x | y | z | w | b | c |
+---+---+---+---+---+-----+
| 1 | 2 | 3 | 1 | m | 100 |
| 2 | 3 | 4 | 2 | n | 23 |
| 1 | 2 | 6 | 2 | m | 23 |
+---+---+---+---+---+-----+
I would like to query the following from the table TAB
select * from TAB where (X||b) in (select (X||b) from TAB group by (X||Y) having sum(c)=123)
I tried to write the SQL query as follows, but it does not work as I expected:
select * from (
with cte as (
select x,y,z from table1),
cte1 as (select a,b,c from table2)
select cte.x,cte1.y,cte1.z,cte2.b,cte2.c from cte left join cte1 on cte.x=cte.a) as TAB
where ((X||b) in (select (X||b) from TAB group by (X||Y) having sum(c)=123))
The final result must be as follows
+---+---+---+---+---+-----+
| x | y | z | w | b | c |
+---+---+---+---+---+-----+
| 1 | 2 | 3 | 1 | m | 100 |
| 1 | 2 | 6 | 2 | m | 23 |
+---+---+---+---+---+-----+
I don't think DB2 allows CTEs in subqueries or to be nested. Why not just write this using another CTE?
with cte as (
      select x,y,z,w
      from table1
     ),
     cte1 as (
      select a,b,c,d
      from table2
     ),
     tab as (
      select cte.x,cte.y,cte.z,cte.w,cte1.b,cte1.c
      from cte left join
           cte1
           on cte.x=cte1.a and cte.w=cte1.d
     )
select *
from TAB
where (X||b) in (select (X||b) from TAB group by (X||Y) having sum(c)=123);
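To illustrate chaining a third CTE instead of nesting, here is a minimal sketch run against the question's two tables in an in-memory SQLite database from Python. Two things are assumptions rather than part of the original: the inner query groups by the same expression it selects (`x || b`) so that it is valid under strict GROUP BY rules, and the function name `filtered_rows` is made up for the example. SQLite is only a stand-in here; DB2 behaviour is not tested.

```python
import sqlite3

TABLE1 = [(1, 2, 3, 1), (2, 3, 4, 2), (3, 2, 5, 3), (1, 2, 6, 2)]  # x, y, z, w
TABLE2 = [(1, "m", 100, 1), (2, "n", 23, 2), (4, "o", 34, 4), (1, "m", 23, 2)]  # a, b, c, d

QUERY = """
WITH cte AS (
    SELECT x, y, z, w FROM table1
),
cte1 AS (
    SELECT a, b, c, d FROM table2
),
tab AS (
    -- the joined result is just one more CTE in the chain
    SELECT cte.x, cte.y, cte.z, cte.w, cte1.b, cte1.c
    FROM cte
    LEFT JOIN cte1 ON cte.x = cte1.a AND cte.w = cte1.d
)
SELECT *
FROM tab
WHERE (x || b) IN (
    SELECT x || b FROM tab GROUP BY x || b HAVING SUM(c) = 123
)
ORDER BY z
"""


def filtered_rows(table1, table2):
    """Return the rows of tab whose x||b key has a total c of 123."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE table1 (x INTEGER, y INTEGER, z INTEGER, w INTEGER)")
    con.execute("CREATE TABLE table2 (a INTEGER, b TEXT, c INTEGER, d INTEGER)")
    con.executemany("INSERT INTO table1 VALUES (?, ?, ?, ?)", table1)
    con.executemany("INSERT INTO table2 VALUES (?, ?, ?, ?)", table2)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result


print(filtered_rows(TABLE1, TABLE2))
# [(1, 2, 3, 1, 'm', 100), (1, 2, 6, 2, 'm', 23)]
```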

How to sum rows before a condition is met in SQL

I have a table which has multiple records for the same id. Looks like this, and the rows are sorted by sequence number.
+----+--------+----------+----------+
| id | result | duration | sequence |
+----+--------+----------+----------+
| 1 | 12 | 7254 | 1 |
+----+--------+----------+----------+
| 1 | 12 | 2333 | 2 |
+----+--------+----------+----------+
| 1 | 11 | 1000 | 3 |
+----+--------+----------+----------+
| 1 | 6 | 5 | 4 |
+----+--------+----------+----------+
| 1 | 3 | 20 | 5 |
+----+--------+----------+----------+
| 2 | 1 | 230 | 1 |
+----+--------+----------+----------+
| 2 | 9 | 10 | 2 |
+----+--------+----------+----------+
| 2 | 6 | 0 | 3 |
+----+--------+----------+----------+
| 2 | 1 | 5 | 4 |
+----+--------+----------+----------+
| 2 | 12 | 3 | 5 |
+----+--------+----------+----------+
For example, for id=1 I would like to sum the duration of all rows up to and including the row where result=6, which is 7254+2333+1000+5. Likewise, for id=2 it would be 230+10+0. Any row after the one where result=6 is left out.
My expected output:
+----+----------+
| id | duration |
+----+----------+
| 1 | 10592 |
+----+----------+
| 2 | 240 |
+----+----------+
The sequence has to be in ascending order.
I'm not sure how I can do this in sql.
Thank you in advance!
I think you want:
select t.id, sum(t.duration)
from t
where t.sequence <= (select t2.sequence
                     from t t2
                     where t2.id = t.id and t2.result = 6
                    )
group by t.id;
In PrestoDB, I would recommend window functions:
select id, sum(duration)
from (select t.*,
             min(case when result = 6 then sequence end) over (partition by id) as sequence_6
      from t
     ) t
where sequence <= sequence_6
group by id;
You can use a simple aggregate query with a condition that uses a subquery to recover the sequence of the record whose result is 6:
SELECT t.id, SUM(t.duration) total_duration
FROM mytable t
WHERE t.sequence <= (
SELECT sequence
FROM mytable
WHERE id = t.id AND result = 6
)
GROUP BY t.id
This demo on DB Fiddle with your test data returns:
| id | total_duration |
| --- | -------------- |
| 1 | 10592 |
| 2 | 240 |
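As a quick check that the correlated-subquery form and the window-function form agree, here is a small sketch that runs both against the question's data in an in-memory SQLite database from Python. The function name `durations_until_six` and the query constant names are invented for the example; the table is called `t` as in the answers above. Both queries assume each id has exactly one row with result = 6.

```python
import sqlite3

# Rows from the question: (id, result, duration, sequence)
ROWS = [
    (1, 12, 7254, 1), (1, 12, 2333, 2), (1, 11, 1000, 3), (1, 6, 5, 4), (1, 3, 20, 5),
    (2, 1, 230, 1), (2, 9, 10, 2), (2, 6, 0, 3), (2, 1, 5, 4), (2, 12, 3, 5),
]

CORRELATED = """
SELECT t.id, SUM(t.duration) AS total_duration
FROM t
WHERE t.sequence <= (
    -- sequence number of the row where result = 6 for the same id
    SELECT t2.sequence FROM t t2 WHERE t2.id = t.id AND t2.result = 6
)
GROUP BY t.id
ORDER BY t.id
"""

WINDOWED = """
SELECT id, SUM(duration) AS total_duration
FROM (
    SELECT t.*,
           MIN(CASE WHEN result = 6 THEN sequence END)
               OVER (PARTITION BY id) AS sequence_6
    FROM t
) AS s
WHERE sequence <= sequence_6
GROUP BY id
ORDER BY id
"""


def durations_until_six(data, query):
    """Run one of the two queries above and return (id, total_duration)."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE t (id INTEGER, result INTEGER, duration INTEGER, sequence INTEGER)"
    )
    con.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", data)
    result = con.execute(query).fetchall()
    con.close()
    return result


print(durations_until_six(ROWS, CORRELATED))  # [(1, 10592), (2, 240)]
print(durations_until_six(ROWS, WINDOWED))    # [(1, 10592), (2, 240)]
```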
Basic group by query should solve your issue
select
id,
sum(duration) duration
from t
group by id
For a specific id:
select
id,
sum(duration) duration
from t
where id = 1
group by id
If you want to include the totals in the same result set as the detail rows:
select id, duration, sequence from t
union all
select
    id,
    sum(duration) duration,
    null sequence
from t
group by id

SQL Select top frequent records

I have the following table:
Table
+----+------+-------+
| ID | Name | Group |
+----+------+-------+
| 0 | a | 1 |
| 1 | a | 1 |
| 2 | a | 2 |
| 3 | a | 1 |
| 4 | b | 1 |
| 5 | b | 2 |
| 6 | b | 1 |
| 7 | c | 2 |
| 8 | c | 2 |
| 9 | c | 1 |
+----+------+-------+
I would like to select the top 20 distinct names from a specific group, ordered by how frequently each name occurs in that group. For this example, group 1 would return a, b, c (a: 3 occurrences, b: 2 occurrences, c: 1 occurrence).
Thank you.
SELECT TOP (20) [Name], COUNT(*) AS Occurrences
FROM [Table]
WHERE [Group] = 1
GROUP BY [Name]
ORDER BY COUNT(*) DESC;
SELECT TOP (20)
    [Name], [Group], COUNT(*) AS occurrences
FROM yourtable
GROUP BY [Name], [Group]
ORDER BY COUNT(*) DESC;
SELECT
    TOP 20
    [Name],
    [Group],
    COUNT(1) AS [Count]
FROM
    MyTable
GROUP BY
    [Name],
    [Group]
ORDER BY
    [Count] DESC;
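For a runnable illustration of the filter, group, and order pattern, here is a minimal sketch using an in-memory SQLite database from Python. SQLite has no TOP, so the sketch uses LIMIT instead. The table name `names`, the function `top_names`, and the tie-break on name are assumptions added for the example.

```python
import sqlite3

# Rows from the question: (id, name, group)
NAMES = [
    (0, "a", 1), (1, "a", 1), (2, "a", 2), (3, "a", 1), (4, "b", 1),
    (5, "b", 2), (6, "b", 1), (7, "c", 2), (8, "c", 2), (9, "c", 1),
]

QUERY = """
SELECT name, COUNT(*) AS occurrences
FROM names
WHERE "group" = ?
GROUP BY name
ORDER BY occurrences DESC, name  -- name breaks ties deterministically
LIMIT ?
"""


def top_names(data, group, limit=20):
    """Return up to `limit` (name, occurrences) pairs for one group."""
    con = sqlite3.connect(":memory:")
    con.execute('CREATE TABLE names (id INTEGER, name TEXT, "group" INTEGER)')
    con.executemany("INSERT INTO names VALUES (?, ?, ?)", data)
    result = con.execute(QUERY, (group, limit)).fetchall()
    con.close()
    return result


print(top_names(NAMES, 1))  # [('a', 3), ('b', 2), ('c', 1)]
```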