Accumulate a summarized column - sql

I could use some help with a SQL statement. I have the table "cont", which looks like this:
cont_id name weight
----------- ---------- -----------
1 1 10
2 1 20
3 2 40
4 2 15
5 2 20
6 3 15
7 3 40
8 4 60
9 5 10
10 6 5
I then summed up the weight column and grouped it by the name:
name wsum
---------- -----------
2 75
4 60
3 55
1 30
5 10
6 5
And the result should have an accumulated column and should look like this:
name wsum acc_wsum
---------- ----------- ------------
2 75 75
4 60 135
3 55 190
1 30 220
5 10 230
6 5 235
But I didn't manage to get the last statement working.
Edit: this statement did it (thanks, Gordon):
select t.*,
(select sum(wsum) from (select name, SUM(weight) wsum
from cont
group by name)
t2 where t2.wsum > t.wsum or (t2.wsum = t.wsum and t2.name <= t.name)) as acc_wsum
from (select name, SUM(weight) wsum
from cont
group by name) t
order by wsum desc

The best way to do this is with a cumulative sum:
select t.*,
sum(wsum) over (order by wsum desc) as acc_wsum
from (<your summarized query>) t
The order by clause makes this cumulative.
If you don't have that capability (it is available in SQL Server 2012+ and Oracle), a correlated subquery is an easy way to do it, assuming the summed weights are distinct values:
select t.*,
(select sum(wsum) from (<your summarized query>) t2 where t2.wsum >= t.wsum) as acc_wsum
from (<your summarized query>) t
This should work in all dialects of SQL. To handle situations where the summed weights might have duplicates, break ties on name:
select t.*,
(select sum(wsum) from (<your summarized query>) t2 where t2.wsum > t.wsum or (t2.wsum = t.wsum and t2.name <= t.name)) as acc_wsum
from (<your summarized query>) t
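As a rough check of the tie-safe correlated subquery against the question's data, here is a minimal sketch using Python's built-in sqlite3 module. The in-memory database and the CTE named t (used so the summarized query is written only once) are my own choices for illustration, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cont (cont_id INTEGER, name TEXT, weight INTEGER)")
conn.executemany(
    "INSERT INTO cont VALUES (?, ?, ?)",
    [(1, "1", 10), (2, "1", 20), (3, "2", 40), (4, "2", 15), (5, "2", 20),
     (6, "3", 15), (7, "3", 40), (8, "4", 60), (9, "5", 10), (10, "6", 5)],
)

# Correlated-subquery running total with the name tie-breaker.
# The summarized query lives in the CTE "t" instead of being repeated.
running_total_sql = """
WITH t AS (SELECT name, SUM(weight) AS wsum FROM cont GROUP BY name)
SELECT t.name, t.wsum,
       (SELECT SUM(t2.wsum)
          FROM t AS t2
         WHERE t2.wsum > t.wsum
            OR (t2.wsum = t.wsum AND t2.name <= t.name)) AS acc_wsum
FROM t
ORDER BY t.wsum DESC, t.name
"""
rows = conn.execute(running_total_sql).fetchall()
for row in rows:
    print(row)
```

The last column should reproduce the acc_wsum values from the question (75, 135, 190, 220, 230, 235).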

Try this:
;WITH CTE
AS
(
SELECT *,
ROW_NUMBER() OVER(ORDER BY wsum DESC) rownum -- DESC so the largest wsum is accumulated first
FROM #table1
)
SELECT
c1.name,
c1.wsum,
acc_wsum= (SELECT SUM(c2.wsum)
FROM cte c2
WHERE c2.rownum <= c1.rownum)
FROM CTE c1;
Or you can use a join instead of a subquery:
;WITH CTE
AS
(
SELECT *,
ROW_NUMBER() OVER(ORDER BY wsum DESC) rownum
FROM #table1
)
SELECT
c1.name,
c1.wsum,
acc_wsum= SUM(c2.wsum)
FROM CTE c1
INNER JOIN CTE c2 ON c2.rownum <= c1.rownum
GROUP BY c1.name, c1.wsum;

Related

Query to restrict results from left join

I have the following query
select S.id, X.id, 15,15,1 from schema_1.tbl_2638 S
JOIN schema_1.tbl_2634_customid X on S.field_1=x.fullname
That returns the following results, where you can see the first column is duplicated on matches to the 2nd table.
1 1 15 15 1
2 3 15 15 1
2 2 15 15 1
3 5 15 15 1
3 4 15 15 1
I'm trying to get a query that would just give me a single row per 1st ID, and the min value from 2nd ID. So I want a result that would be:
1 1 15 15 1
2 2 15 15 1
3 4 15 15 1
I'm a little rusty on my SQL skills; how would I write the query to provide the above result?
Based on your result, you can do this to get what you want; for more complicated structures, you can always take a look at window functions:
select S.id, MIN(X.id) x_id, 15,15,1 from schema_1.tbl_2638 S
JOIN schema_1.tbl_2634_customid X on S.field_1=x.fullname
GROUP BY 1,3,4,5
A window function can also be used; it always needs an outer SELECT:
SELECT
s_id, x_id, a, b, c
FROM
(select S.id as s_id, X.id as x_id, 15 a ,15 b,1 c
, ROW_NUMBER() OVER (PARTITION BY S.id ORDER BY X.id ASC) rn
from schema_1.tbl_2638 S
JOIN schema_1.tbl_2634_customid X on S.field_1=x.fullname) t -- many databases require an alias on the derived table
WHERE rn = 1
Or as a CTE:
WITH CTE as (select S.id as s_id, X.id as x_id, 15 a ,15 b,1 c
, ROW_NUMBER() OVER (PARTITION BY S.id ORDER BY X.id ASC) rn
from schema_1.tbl_2638 S
JOIN schema_1.tbl_2634_customid X on S.field_1=x.fullname)
SELECT s_id,x_id,a,b,c FROM CTE WHERE rn = 1
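To see the GROUP BY / MIN variant produce one row per first id, here is a small sketch using Python's sqlite3 module. The table names s and x and their contents are made up to reproduce the five joined rows from the question; the real tables are schema_1.tbl_2638 and schema_1.tbl_2634_customid.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE s (id INTEGER, field_1 TEXT);   -- hypothetical stand-in for tbl_2638
CREATE TABLE x (id INTEGER, fullname TEXT);  -- hypothetical stand-in for tbl_2634_customid
INSERT INTO s VALUES (1, 'a'), (2, 'b'), (3, 'c');
INSERT INTO x VALUES (1, 'a'), (2, 'b'), (3, 'b'), (4, 'c'), (5, 'c');
""")

# One row per s.id, keeping the smallest matching x.id.
rows = conn.execute("""
SELECT s.id, MIN(x.id) AS x_id, 15, 15, 1
FROM s
JOIN x ON s.field_1 = x.fullname
GROUP BY s.id
ORDER BY s.id
""").fetchall()
for row in rows:
    print(row)
```

Grouping by s.id alone is enough here because the remaining columns are constants.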

Rolling Average in SQL with Partition [duplicate]

create table #t
(
id int,
SomeNumt int
)
insert into #t
select 1,10
union
select 2,12
union
select 3,3
union
select 4,15
union
select 5,23
select * from #t
The select above returns the following:
id SomeNumt
1 10
2 12
3 3
4 15
5 23
How do I get the following:
id srome CumSrome
1 10 10
2 12 22
3 3 25
4 15 40
5 23 63
select t1.id, t1.SomeNumt, SUM(t2.SomeNumt) as sum
from #t t1
inner join #t t2 on t1.id >= t2.id
group by t1.id, t1.SomeNumt
order by t1.id
Output
| ID | SOMENUMT | SUM |
-----------------------
| 1 | 10 | 10 |
| 2 | 12 | 22 |
| 3 | 3 | 25 |
| 4 | 15 | 40 |
| 5 | 23 | 63 |
Edit: this is a generalized solution that will work across most db platforms. When there is a better solution available for your specific platform (e.g., gareth's), use it!
SQL Server 2012 and later permit the following.
SELECT
RowID,
Col1,
SUM(Col1) OVER(ORDER BY RowId ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Col2
FROM tablehh
ORDER BY RowId
or
SELECT
GroupID,
RowID,
Col1,
SUM(Col1) OVER(PARTITION BY GroupID ORDER BY RowId ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Col2
FROM tablehh
ORDER BY RowId
This is even faster. Partitioned version completes in 34 seconds over 5 million rows for me.
Thanks to Peso, who commented on the SQL Team thread referred to in another answer.
For SQL Server 2012 onwards it could be easy:
SELECT id, SomeNumt, sum(SomeNumt) OVER (ORDER BY id) as CumSrome FROM #t
because, when an ORDER BY clause is given for SUM without ROWS or RANGE, the window frame defaults to RANGE UNBOUNDED PRECEDING AND CURRENT ROW (see "General Remarks" at https://msdn.microsoft.com/en-us/library/ms189461.aspx)
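One consequence of the default RANGE frame is worth seeing: rows that tie on the ORDER BY key all get the same running total, while an explicit ROWS frame advances row by row. The sketch below shows this in SQLite (3.25 or later, which documents the same default frame) through Python's sqlite3 module; the table t and its values are invented for the illustration, and I am assuming SQL Server behaves the same way per the documentation cited above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (k INTEGER, v INTEGER);
INSERT INTO t VALUES (1, 10), (2, 5), (2, 7), (3, 1);  -- k = 2 appears twice
""")

# Default frame (RANGE): both k = 2 rows see the total of all rows up to k = 2.
range_sums = [r[0] for r in conn.execute(
    "SELECT SUM(v) OVER (ORDER BY k) FROM t ORDER BY k")]

# Explicit ROWS frame: the total grows one physical row at a time.
rows_sums = [r[0] for r in conn.execute(
    "SELECT SUM(v) OVER (ORDER BY k ROWS BETWEEN UNBOUNDED PRECEDING"
    " AND CURRENT ROW) FROM t ORDER BY k")]

print(range_sums)
print(sorted(rows_sums))
```

With ROWS, the order in which the two tied rows are visited is not defined, so the intermediate value can differ between runs or engines. If the ordering column is unique, both frames give the same result.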
Let's first create a table with dummy data:
Create Table CUMULATIVESUM (id tinyint , SomeValue tinyint)
Now let's insert some data into the table;
Insert Into CUMULATIVESUM
Select 1, 10 union
Select 2, 2 union
Select 3, 6 union
Select 4, 10
Here I am joining the table to itself (a self join):
Select c1.ID, c1.SomeValue, c2.SomeValue
From CumulativeSum c1, CumulativeSum c2
Where c1.id >= c2.ID
Order By c1.id Asc
Result:
ID SomeValue SomeValue
-------------------------
1 10 10
2 2 10
2 2 2
3 6 10
3 6 2
3 6 6
4 10 10
4 10 2
4 10 6
4 10 10
Now we just sum the SomeValue of c2 and we'll get the answer:
Select c1.ID, c1.SomeValue, Sum(c2.SomeValue) CumulativeSumValue
From CumulativeSum c1, CumulativeSum c2
Where c1.id >= c2.ID
Group By c1.ID, c1.SomeValue
Order By c1.id Asc
For SQL Server 2012 and above (much better performance):
Select
c1.ID, c1.SomeValue,
Sum (SomeValue) Over (Order By c1.ID )
From CumulativeSum c1
Order By c1.id Asc
Desired result:
ID SomeValue CumulativeSumValue
---------------------------------
1 10 10
2 2 12
3 6 18
4 10 28
Drop Table CumulativeSum
A CTE version, just for fun:
;
WITH abcd
AS ( SELECT id
,SomeNumt
,SomeNumt AS MySum
FROM #t
WHERE id = 1
UNION ALL
SELECT t.id
,t.SomeNumt
,t.SomeNumt + a.MySum AS MySum
FROM #t AS t
JOIN abcd AS a ON a.id = t.id - 1
)
SELECT * FROM abcd
OPTION ( MAXRECURSION 1000 ) -- limit recursion here, or 0 for no limit.
Returns:
id SomeNumt MySum
----------- ----------- -----------
1 10 10
2 12 22
3 3 25
4 15 40
5 23 63
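One caveat with the recursive version: the join a.id = t.id - 1 walks from one id to the next, so it assumes the ids are contiguous and start at 1. The sketch below shows what happens when an id is missing, using SQLite's WITH RECURSIVE through Python's sqlite3 module (the table name t and the gap at id 4 are mine, chosen for the demonstration).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (id INTEGER, SomeNumt INTEGER);
INSERT INTO t VALUES (1, 10), (2, 12), (3, 3), (5, 23);  -- id 4 is missing
""")

# Same shape as the recursive CTE above: anchor on id = 1, then step to id + 1.
rows = conn.execute("""
WITH RECURSIVE abcd(id, SomeNumt, MySum) AS (
    SELECT id, SomeNumt, SomeNumt FROM t WHERE id = 1
    UNION ALL
    SELECT t.id, t.SomeNumt, t.SomeNumt + a.MySum
    FROM t
    JOIN abcd AS a ON a.id = t.id - 1
)
SELECT id, SomeNumt, MySum FROM abcd ORDER BY id
""").fetchall()
for row in rows:
    print(row)
```

The recursion stops at id 3 and the row with id 5 never appears. If gaps are possible, numbering the rows first (for example with ROW_NUMBER()) and recursing on that number is one way around it.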
Late answer but showing one more possibility...
Cumulative sum generation can be optimized further with CROSS APPLY.
In the actual query plan, it showed a lower relative cost than the INNER JOIN and OVER clause versions:
/* Create table & populate data */
IF OBJECT_ID('tempdb..#TMP') IS NOT NULL
DROP TABLE #TMP
SELECT * INTO #TMP
FROM (
SELECT 1 AS id
UNION
SELECT 2 AS id
UNION
SELECT 3 AS id
UNION
SELECT 4 AS id
UNION
SELECT 5 AS id
) Tab
/* Using CROSS APPLY
Query cost relative to the batch 17%
*/
SELECT T1.id,
T2.CumSum
FROM #TMP T1
CROSS APPLY (
SELECT SUM(T2.id) AS CumSum
FROM #TMP T2
WHERE T1.id >= T2.id
) T2
/* Using INNER JOIN
Query cost relative to the batch 46%
*/
SELECT T1.id,
SUM(T2.id) CumSum
FROM #TMP T1
INNER JOIN #TMP T2
ON T1.id > = T2.id
GROUP BY T1.id
/* Using OVER clause
Query cost relative to the batch 37%
*/
SELECT T1.id,
SUM(T1.id) OVER (ORDER BY T1.id) AS CumSum -- ORDER BY, not PARTITION BY, makes the sum cumulative
FROM #TMP T1
Output:-
id CumSum
------- -------
1 1
2 3
3 6
4 10
5 15
Select
*,
(Select Sum(SOMENUMT)
From #t S
Where S.id <= M.id)
From #t M
You can use this simple query for a progressive (running) calculation:
select
id
,SomeNumt
,sum(SomeNumt) over(order by id ROWS between UNBOUNDED PRECEDING and CURRENT ROW) as CumSrome
from #t
There is a much faster CTE implementation available in this excellent post:
http://weblogs.sqlteam.com/mladenp/archive/2009/07/28/SQL-Server-2005-Fast-Running-Totals.aspx
The problem in this thread can be expressed like this:
DECLARE @RT INT
SELECT @RT = 0
;
WITH abcd
AS ( SELECT TOP 100 percent
id
,SomeNumt
,MySum
FROM #t -- assumption: the source table, with a MySum column to receive the running total
order by id
)
update abcd
set @RT = MySum = @RT + SomeNumt
output inserted.*
For example, if you have a table with two columns, ID and Number, and want to find the cumulative sum:
SELECT ID,Number,SUM(Number)OVER(ORDER BY ID) FROM T
Once the table is created -
select
A.id, A.SomeNumt, SUM(B.SomeNumt) as sum
from #t A, #t B where A.id >= B.id
group by A.id, A.SomeNumt
order by A.id
The SQL solution which combines "ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW" and "SUM" did exactly what I wanted to achieve.
Thank you so much!
If it can help anyone, here was my case. I wanted to cumulate +1 in a column whenever a maker is found as "Some Maker" (example). If not, no increment but show previous increment result.
So this piece of SQL:
SUM( CASE [rmaker] WHEN 'Some Maker' THEN 1 ELSE 0 END)
OVER
(PARTITION BY UserID ORDER BY UserID,[rrank] ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Cumul_CNT
Allowed me to get something like this:
User 1 Rank1 MakerA 0
User 1 Rank2 MakerB 0
User 1 Rank3 Some Maker 1
User 1 Rank4 Some Maker 2
User 1 Rank5 MakerC 2
User 1 Rank6 Some Maker 3
User 2 Rank1 MakerA 0
User 2 Rank2 Some Maker 1
Explanation of the above: the count of "Some Maker" starts at 0, and each time Some Maker is found we add 1. For User 1, when MakerC is found we don't add 1, so the running count of Some Maker stays at 2 until the next matching row.
Partitioning is by user, so when the user changes, the cumulative count goes back to zero.
I am at work and I don't want any credit for this answer; I just want to say thank you and show my example in case someone is in the same situation. I was trying to combine SUM and PARTITION, but the syntax "ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW" completed the task.
Thanks!
Groaker
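For anyone who wants to try the conditional running count, here is a minimal sketch in SQLite (3.25 or later) through Python's sqlite3 module. The table name ranks and the integer encoding of the user and rank columns are assumptions made to mirror the sample output above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ranks (UserID INTEGER, rrank INTEGER, rmaker TEXT);
INSERT INTO ranks VALUES
  (1, 1, 'MakerA'), (1, 2, 'MakerB'), (1, 3, 'Some Maker'),
  (1, 4, 'Some Maker'), (1, 5, 'MakerC'), (1, 6, 'Some Maker'),
  (2, 1, 'MakerA'), (2, 2, 'Some Maker');
""")

# Running count of 'Some Maker' rows, restarted for each user.
rows = conn.execute("""
SELECT UserID, rrank, rmaker,
       SUM(CASE rmaker WHEN 'Some Maker' THEN 1 ELSE 0 END)
         OVER (PARTITION BY UserID ORDER BY rrank
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Cumul_CNT
FROM ranks
ORDER BY UserID, rrank
""").fetchall()
for row in rows:
    print(row)
```

Since the window is already partitioned by UserID, ordering by rrank alone inside the OVER clause is sufficient.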
Above (pre-SQL Server 2012) we see examples like this:
SELECT
T1.id, SUM(T2.id) AS CumSum
FROM
#TMP T1
JOIN #TMP T2 ON T2.id < = T1.id
GROUP BY
T1.id
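More efficient (note the LEFT JOIN and COALESCE, so that the first row, which has no smaller id, is kept):
SELECT
T1.id, COALESCE(SUM(T2.id), 0) + T1.id AS CumSum
FROM
#TMP T1
LEFT JOIN #TMP T2 ON T2.id < T1.id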
GROUP BY
T1.id
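A strict `<` join only matches rows that have at least one smaller id, so with an inner join the first row disappears from the result. The sketch below compares the inner-join form with a LEFT JOIN plus COALESCE, using Python's sqlite3 module and a plain table tmp standing in for #TMP (the table name is my substitution, since SQLite has no # temp tables).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tmp (id INTEGER);
INSERT INTO tmp VALUES (1), (2), (3), (4), (5);
""")

# Inner join on strictly smaller ids: id 1 has no match and is dropped.
inner_rows = conn.execute("""
SELECT t1.id, SUM(t2.id) + t1.id AS CumSum
FROM tmp t1
JOIN tmp t2 ON t2.id < t1.id
GROUP BY t1.id
ORDER BY t1.id
""").fetchall()

# LEFT JOIN keeps id 1; COALESCE turns its NULL sum into 0.
left_rows = conn.execute("""
SELECT t1.id, COALESCE(SUM(t2.id), 0) + t1.id AS CumSum
FROM tmp t1
LEFT JOIN tmp t2 ON t2.id < t1.id
GROUP BY t1.id
ORDER BY t1.id
""").fetchall()

print(inner_rows)
print(left_rows)
```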
Try this
select
t.id,
t.SomeNumt,
sum(t.SomeNumt) Over (Order by t.id asc Rows Between Unbounded Preceding and Current Row) as cum
from
#t t
group by
t.id,
t.SomeNumt
order by
t.id asc;
Try this:
CREATE TABLE #t(
[name] varchar NULL,
[val] [int] NULL,
[ID] [int] NULL
) ON [PRIMARY]
insert into #t (id,name,val) values
(1,'A',10), (2,'B',20), (3,'C',30)
select t1.id, t1.val, SUM(t2.val) as cumSum
from #t t1 inner join #t t2 on t1.id >= t2.id
group by t1.id, t1.val order by t1.id
Without using any type of JOIN, the cumulative salary for each person can be fetched with the following query:
SELECT * , (
SELECT SUM( salary )
FROM `abc` AS table1
WHERE table1.ID <= `abc`.ID
AND table1.name = `abc`.Name
) AS cum
FROM `abc`
ORDER BY Name

How to use this in sql -- > max(sum (paid * quantity )) to solve a query

How do I get the highest-value order of each customer?
select num, max(sum(paid*quantity))
from orders join
pizza
using (order#)
group by customer#;
table
num orderN price
-------- --- -------
1 109 30
1 118 25
3 101 30
3 115 27
4 107 23
5 100 17
5 129 16
Required output:
num Pnum price
-------- --- -------
1 109 30
3 101 30
4 107 23
5 100 17
You want to select the record having the highest price in each group of nums.
If your RDBMS supports window functions, that's straightforward with ROW_NUMBER():
SELECT num, pnum, price
FROM (
SELECT t.*, ROW_NUMBER() OVER(PARTITION BY num ORDER BY price DESC) rn
FROM mytable t
) x
WHERE rn = 1
Otherwise, you can take the following approach, which uses a NOT EXISTS condition with a correlated subquery to ensure that the selected record is the one with the highest price for the current num:
SELECT num, pnum, price
FROM mytable t
WHERE NOT EXISTS (
SELECT 1 FROM mytable t1 WHERE t1.num = t.num AND t1.price > t.price
)
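Here is the NOT EXISTS approach run against the sample data from the question, as a small sketch using Python's sqlite3 module. The column name pnum follows the answer's query (the question's table header calls it orderN).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mytable (num INTEGER, pnum INTEGER, price INTEGER);
INSERT INTO mytable VALUES
  (1, 109, 30), (1, 118, 25), (3, 101, 30), (3, 115, 27),
  (4, 107, 23), (5, 100, 17), (5, 129, 16);
""")

# Keep a row only if no other row for the same num has a higher price.
rows = conn.execute("""
SELECT num, pnum, price
FROM mytable t
WHERE NOT EXISTS (
    SELECT 1 FROM mytable t1
    WHERE t1.num = t.num AND t1.price > t.price
)
ORDER BY num
""").fetchall()
for row in rows:
    print(row)
```

Note that if two orders of the same customer tie on the highest price, this form returns both, whereas the ROW_NUMBER() form returns exactly one.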

How can I select top 3 for each group based on another column in sqlite?

I'm trying to get top 3 most profitable UserIDs in each country in one table using sqlite. I'm not sure where to use LIMIT 3.
Here is the table I have:
Country | UserID | Profit
US 1 100
US 12 98
US 13 10
US 5 8
US 2 5
IR 9 95
IR 3 90
IR 8 70
IR 4 56
IR 15 40
the result should look like this:
Country | UserID | Profit
US 1 100
US 12 98
US 13 10
IR 9 95
IR 3 90
IR 8 70
One pretty simple method is:
select t.*
from t
where t.profit >= (select t2.profit
from t t2
where t2.country = t.country
order by t2.profit desc
limit 1 offset 2
);
This assumes at least three records for each country. You can get around that with coalesce():
select t.*
from t
where t.profit >= coalesce((select t2.profit
from t t2
where t2.country = t.country
order by t2.profit desc
limit 1 offset 2
), t.profit
);
SQLite versions before 3.25 don't support window functions, so you can compute a per-country sequence number with a correlated subquery and then keep the top 3.
You can try this query.
select t.Country,t.UserID,t.Profit
from(
select t.*,
(select count(*)
from T t2
where t2.Country = t.Country and t2.Profit >= t.Profit
) as seqnum
from T t
)t
where t.seqnum <=3
sqlfiddle:https://www.db-fiddle.com/f/tmNhRLGG2oKqCKXJEDsjfe/0
LIMIT won't be useful here, as it applies to the whole result set.
I would create an auxiliary column "CountryRank" like this:
SELECT *, (SELECT COUNT() FROM Data AS d WHERE d.Country=Data.Country AND d.Profit>Data.Profit)+1 AS CountryRank
FROM Data;
And query on that result:
SELECT Country, UserID, Profit
FROM (
SELECT *, (SELECT COUNT() FROM Data AS d WHERE d.Country=Data.Country AND d.Profit>Data.Profit)+1 AS CountryRank FROM Data)
WHERE CountryRank<=3
ORDER BY Country, CountryRank;
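The CountryRank idea can be checked against the question's data with a short sketch using Python's sqlite3 module. The explicit aliases o (outer row) and d (rows being counted) are my addition to make the correlation easier to read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Data (Country TEXT, UserID INTEGER, Profit INTEGER)")
conn.executemany(
    "INSERT INTO Data VALUES (?, ?, ?)",
    [("US", 1, 100), ("US", 12, 98), ("US", 13, 10), ("US", 5, 8), ("US", 2, 5),
     ("IR", 9, 95), ("IR", 3, 90), ("IR", 8, 70), ("IR", 4, 56), ("IR", 15, 40)],
)

# Rank = 1 + number of rows in the same country with a strictly higher profit.
rows = conn.execute("""
SELECT Country, UserID, Profit
FROM (
    SELECT o.*,
           (SELECT COUNT(*) FROM Data AS d
             WHERE d.Country = o.Country AND d.Profit > o.Profit) + 1 AS CountryRank
    FROM Data AS o
)
WHERE CountryRank <= 3
ORDER BY Country, CountryRank
""").fetchall()
for row in rows:
    print(row)
```

Rows that tie on Profit share a rank under this scheme, so a country can return more than three rows when there is a tie at the cutoff.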

SQL: How do I display all records per unique id, but not the first record ever recorded in SQL

Example:
id Pricemoney time/date
1 100 01/20/2017
1 10 01/21/2017
1 1000 01/21/2017
2 10 01/23/2017
2 100 01/24/2017
3 1000 01/19/2017
3 100 01/22/2017
3 10 01/24/2017
I want to run a SQL query that displays every id and its Pricemoney but does NOT include the first record (based on time/date) per unique id.
Just to clarify what I do not want to be displayed
userid Pricemoney issuedate
1 100 01/20/2017 -- not included
1 10 01/21/2017
1 1000 01/21/2017
2 10 01/23/2017 -- not included
2 100 01/24/2017
3 1000 01/19/2017 -- not included
3 100 01/22/2017
3 10 01/24/2017
Expected result:
id Pricemoney time/date
1 10 01/21/2017
1 1000 01/21/2017
2 100 01/24/2017
3 100 01/22/2017
3 10 01/24/2017
You can use row_number():
select t.*
from (select t.*,
row_number() over (partition by id order by time_date asc) as seqnum
from <tablename> t
) t
where seqnum > 1;
If you want to keep ids that have only a single row, you can do:
select t.*
from (select t.*,
row_number() over (partition by id order by time_date asc) as seqnum,
count(*) over (partition by id) as cnt
from <tablename> t
) t
where seqnum > 1 or cnt = 1;
You may use EXISTS
select t1.*
from data t1
where exists (
select 1
from data t2
where t1.id = t2.id and t2.time_date < t1.time_date
)
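A quick run of the EXISTS version on the question's data, sketched with Python's sqlite3 module. The dates are stored as ISO strings (YYYY-MM-DD) so that they compare correctly as text, which is an assumption about how time_date is stored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE data (id INTEGER, Pricemoney INTEGER, time_date TEXT);
INSERT INTO data VALUES
  (1, 100, '2017-01-20'), (1, 10, '2017-01-21'), (1, 1000, '2017-01-21'),
  (2, 10, '2017-01-23'), (2, 100, '2017-01-24'),
  (3, 1000, '2017-01-19'), (3, 100, '2017-01-22'), (3, 10, '2017-01-24');
""")

# Keep a row only if an earlier row exists for the same id.
rows = conn.execute("""
SELECT t1.id, t1.Pricemoney, t1.time_date
FROM data t1
WHERE EXISTS (
    SELECT 1 FROM data t2
    WHERE t2.id = t1.id AND t2.time_date < t1.time_date
)
ORDER BY t1.id, t1.time_date, t1.Pricemoney
""").fetchall()
for row in rows:
    print(row)
```

If an id has several rows sharing its earliest date, this form drops all of them, while the row_number() form drops only one.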
you can try this :
select data1.id,data1.Date,data1.Pricemoney from data1
left join (
select id ,min(Date) date from data1
group by id
) as t
on data1.date= t.date and t.id = data1.id
where t.id is null
group by data1.id,data1.Date,data1.Pricemoney
The query above also drops ids that have only a single record. To keep those,
add HAVING COUNT(id) > 1 to the subquery in the left join, e.g.:
select data1.id,data1.Date,data1.Pricemoney from data1
left join (
select id ,min(Date) date from data1
group by id
having COUNT(id) > 1
) as t
on data1.date= t.date and t.id = data1.id
where t.id is null
group by data1.id,data1.Date,data1.Pricemoney