declare #t table
(
id int,
SomeNumt int
)
insert into #t
select 1,10
union
select 2,12
union
select 3,3
union
select 4,15
union
select 5,23
select * from #t
the above select returns me the following.
id SomeNumt
1 10
2 12
3 3
4 15
5 23
How do I get the following:
id srome CumSrome
1 10 10
2 12 22
3 3 25
4 15 40
5 23 63
select t1.id, t1.SomeNumt, SUM(t2.SomeNumt) as sum
from #t t1
inner join #t t2 on t1.id >= t2.id
group by t1.id, t1.SomeNumt
order by t1.id
SQL Fiddle example
Output
| ID | SOMENUMT | SUM |
-----------------------
| 1 | 10 | 10 |
| 2 | 12 | 22 |
| 3 | 3 | 25 |
| 4 | 15 | 40 |
| 5 | 23 | 63 |
Edit: this is a generalized solution that will work across most db platforms. When there is a better solution available for your specific platform (e.g., gareth's), use it!
The latest version of SQL Server (2012) permits the following.
SELECT
RowID,
Col1,
SUM(Col1) OVER(ORDER BY RowId ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Col2
FROM tablehh
ORDER BY RowId
or
SELECT
GroupID,
RowID,
Col1,
SUM(Col1) OVER(PARTITION BY GroupID ORDER BY RowId ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Col2
FROM tablehh
ORDER BY RowId
This is even faster. Partitioned version completes in 34 seconds over 5 million rows for me.
Thanks to Peso, who commented on the SQL Team thread referred to in another answer.
For SQL Server 2012 onwards it could be easy:
SELECT id, SomeNumt, sum(SomeNumt) OVER (ORDER BY id) as CumSrome FROM #t
because ORDER BY clause for SUM by default means RANGE UNBOUNDED PRECEDING AND CURRENT ROW for window frame ("General Remarks" at https://msdn.microsoft.com/en-us/library/ms189461.aspx)
Let's first create a table with dummy data:
Create Table CUMULATIVESUM (id tinyint , SomeValue tinyint)
Now let's insert some data into the table;
Insert Into CUMULATIVESUM
Select 1, 10 union
Select 2, 2 union
Select 3, 6 union
Select 4, 10
Here I am joining same table (self joining)
Select c1.ID, c1.SomeValue, c2.SomeValue
From CumulativeSum c1, CumulativeSum c2
Where c1.id >= c2.ID
Order By c1.id Asc
Result:
ID SomeValue SomeValue
-------------------------
1 10 10
2 2 10
2 2 2
3 6 10
3 6 2
3 6 6
4 10 10
4 10 2
4 10 6
4 10 10
Here we go now just sum the Somevalue of t2 and we`ll get the answer:
Select c1.ID, c1.SomeValue, Sum(c2.SomeValue) CumulativeSumValue
From CumulativeSum c1, CumulativeSum c2
Where c1.id >= c2.ID
Group By c1.ID, c1.SomeValue
Order By c1.id Asc
For SQL Server 2012 and above (much better performance):
Select
c1.ID, c1.SomeValue,
Sum (SomeValue) Over (Order By c1.ID )
From CumulativeSum c1
Order By c1.id Asc
Desired result:
ID SomeValue CumlativeSumValue
---------------------------------
1 10 10
2 2 12
3 6 18
4 10 28
Drop Table CumulativeSum
A CTE version, just for fun:
;
WITH abcd
AS ( SELECT id
,SomeNumt
,SomeNumt AS MySum
FROM #t
WHERE id = 1
UNION ALL
SELECT t.id
,t.SomeNumt
,t.SomeNumt + a.MySum AS MySum
FROM #t AS t
JOIN abcd AS a ON a.id = t.id - 1
)
SELECT * FROM abcd
OPTION ( MAXRECURSION 1000 ) -- limit recursion here, or 0 for no limit.
Returns:
id SomeNumt MySum
----------- ----------- -----------
1 10 10
2 12 22
3 3 25
4 15 40
5 23 63
Late answer but showing one more possibility...
Cumulative Sum generation can be more optimized with the CROSS APPLY logic.
Works better than the INNER JOIN & OVER Clause when analyzed the actual query plan ...
/* Create table & populate data */
IF OBJECT_ID('tempdb..#TMP') IS NOT NULL
DROP TABLE #TMP
SELECT * INTO #TMP
FROM (
SELECT 1 AS id
UNION
SELECT 2 AS id
UNION
SELECT 3 AS id
UNION
SELECT 4 AS id
UNION
SELECT 5 AS id
) Tab
/* Using CROSS APPLY
Query cost relative to the batch 17%
*/
SELECT T1.id,
T2.CumSum
FROM #TMP T1
CROSS APPLY (
SELECT SUM(T2.id) AS CumSum
FROM #TMP T2
WHERE T1.id >= T2.id
) T2
/* Using INNER JOIN
Query cost relative to the batch 46%
*/
SELECT T1.id,
SUM(T2.id) CumSum
FROM #TMP T1
INNER JOIN #TMP T2
ON T1.id > = T2.id
GROUP BY T1.id
/* Using OVER clause
Query cost relative to the batch 37%
*/
SELECT T1.id,
SUM(T1.id) OVER( PARTITION BY id)
FROM #TMP T1
Output:-
id CumSum
------- -------
1 1
2 3
3 6
4 10
5 15
Select
*,
(Select Sum(SOMENUMT)
From #t S
Where S.id <= M.id)
From #t M
You can use this simple query for progressive calculation :
select
id
,SomeNumt
,sum(SomeNumt) over(order by id ROWS between UNBOUNDED PRECEDING and CURRENT ROW) as CumSrome
from #t
There is a much faster CTE implementation available in this excellent post:
http://weblogs.sqlteam.com/mladenp/archive/2009/07/28/SQL-Server-2005-Fast-Running-Totals.aspx
The problem in this thread can be expressed like this:
DECLARE #RT INT
SELECT #RT = 0
;
WITH abcd
AS ( SELECT TOP 100 percent
id
,SomeNumt
,MySum
order by id
)
update abcd
set #RT = MySum = #RT + SomeNumt
output inserted.*
For Ex: IF you have a table with two columns one is ID and second is number and wants to find out the cumulative sum.
SELECT ID,Number,SUM(Number)OVER(ORDER BY ID) FROM T
Once the table is created -
select
A.id, A.SomeNumt, SUM(B.SomeNumt) as sum
from #t A, #t B where A.id >= B.id
group by A.id, A.SomeNumt
order by A.id
The SQL solution wich combines "ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW" and "SUM" did exactly what i wanted to achieve.
Thank you so much!
If it can help anyone, here was my case. I wanted to cumulate +1 in a column whenever a maker is found as "Some Maker" (example). If not, no increment but show previous increment result.
So this piece of SQL:
SUM( CASE [rmaker] WHEN 'Some Maker' THEN 1 ELSE 0 END)
OVER
(PARTITION BY UserID ORDER BY UserID,[rrank] ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Cumul_CNT
Allowed me to get something like this:
User 1 Rank1 MakerA 0
User 1 Rank2 MakerB 0
User 1 Rank3 Some Maker 1
User 1 Rank4 Some Maker 2
User 1 Rank5 MakerC 2
User 1 Rank6 Some Maker 3
User 2 Rank1 MakerA 0
User 2 Rank2 SomeMaker 1
Explanation of above: It starts the count of "some maker" with 0, Some Maker is found and we do +1. For User 1, MakerC is found so we dont do +1 but instead vertical count of Some Maker is stuck to 2 until next row.
Partitioning is by User so when we change user, cumulative count is back to zero.
I am at work, I dont want any merit on this answer, just say thank you and show my example in case someone is in the same situation. I was trying to combine SUM and PARTITION but the amazing syntax "ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW" completed the task.
Thanks!
Groaker
Above (Pre-SQL12) we see examples like this:-
SELECT
T1.id, SUM(T2.id) AS CumSum
FROM
#TMP T1
JOIN #TMP T2 ON T2.id < = T1.id
GROUP BY
T1.id
More efficient...
SELECT
T1.id, SUM(T2.id) + T1.id AS CumSum
FROM
#TMP T1
JOIN #TMP T2 ON T2.id < T1.id
GROUP BY
T1.id
Try this
select
t.id,
t.SomeNumt,
sum(t.SomeNumt) Over (Order by t.id asc Rows Between Unbounded Preceding and Current Row) as cum
from
#t t
group by
t.id,
t.SomeNumt
order by
t.id asc;
Try this:
CREATE TABLE #t(
[name] varchar NULL,
[val] [int] NULL,
[ID] [int] NULL
) ON [PRIMARY]
insert into #t (id,name,val) values
(1,'A',10), (2,'B',20), (3,'C',30)
select t1.id, t1.val, SUM(t2.val) as cumSum
from #t t1 inner join #t t2 on t1.id >= t2.id
group by t1.id, t1.val order by t1.id
Without using any type of JOIN cumulative salary for a person fetch by using follow query:
SELECT * , (
SELECT SUM( salary )
FROM `abc` AS table1
WHERE table1.ID <= `abc`.ID
AND table1.name = `abc`.Name
) AS cum
FROM `abc`
ORDER BY Name
my problem looks like this. I have a large table with 1M+ records.
It looks something like this
ID STEP_ID
1 abc
1 dce
1 bbv
2 abc
2 ddb
3 bbv
4 asd
The thing is I want to write select query that would exclude all IDs if step_id is equal to abc. So it would look like
ID STEP_ID
3 bbv
4 asd
You can use not exists:
select t.*
from mytable t
where not exists (select 1 from mytable t1 where t1.id = t.id and t1.step_id = 'abc')
For performance, consider an index on (id, step_id).
Use not exists:
select t.*
from t
where not exists (select 1 from t t2 where t2.id = t.id and t2.step_id = 'abc');
For performance, you want an index on (id, step_id) -- both columns, in that order.
If you cannot create an index, it might be faster to use window functions:
select t.*
from (select t.*,
sum(case when step_id = 'abc' then 1 else 0 end) over (partition by id) as num_abc
from t
) t
where num_abc = 0;
Can't see the wood for the trees on this and I'm sure it's simple.
I'm trying to return the max ID for a related record in a joined table
Table1
NiD
Name
1
Peter
2
John
3
Arthur
Table2
ID
NiD
Value
1
1
5
2
2
10
3
3
10
4
1
20
5
2
15
Max Results
NiD
ID
Value
1
4
20
2
5
15
3
3
10
You can use row_number() for this:
select NiD, ID, Value
from (select t2.*,
row_number() over (partition by NiD order by ID desc) as seqnum
from table2 t2
) t2
where seqnum = 1;
As the question is stated, you do not need table1, because table2 has all the ids.
This is how I'd do it, I think ID and Value will be NULL when Table2 does not have a corresponding entry for a Table1 record:
SELECT NiD, ID, [Value]
FROM Table1
OUTER APPLY (
SELECT TOP 1 ID, [Value]
FROM Table2
WHERE Table1.NiD = Table2.NiD
ORDER BY [Value] DESC
) AS Top_Table2
CREATE TABLE Names
(
NID INT,
[Name] VARCHAR(MAX)
)
CREATE TABLE Results
(
ID INT,
NID INT,
VALUE INT
)
INSERT INTO Names VALUES (1,'Peter'),(2,'John'),(3,'Arthur')
INSERT INTO Results VALUES (1,1,5),(2,2,10),(3,3,10),(4,1,20),(5,2,15)
SELECT a.NID,
r.ID,
a.MaxVal
FROM (
SELECT NID,
MAX(VALUE) as MaxVal
FROM Results r
GROUP BY NID
) a
JOIN Results r
ON a.NID = r.NID AND a.MaxVal = r.VALUE
ORDER BY NID
Here's what I have used in similar situations, performance was fine, provided that the data set wasn't too large (under 1M rows).
SELECT
table1.nid
,table2.id
,table2.value
FROM table1
INNER JOIN table2 ON table1.nid = table2.nid
WHERE table2.value = (
SELECT MAX(value)
FROM table2
WHERE nid = table1.nid)
ORDER BY 1
My question is this: Is it possible to output multiple rows when joining from the same table?
With this code for example, I would like it to output 2 rows, one for each table. Instead, what it does is gives me 1 row with all of the data.
SELECT t1.*, t2.*
FROM table t1
JOIN table t2
ON t2.id = t1.oldId
WHERE t1.id = '1'
UPDATE
Well the problem that I have with the UNION/UNION ALL is this: I don't know what the t1.oldId value is equal to. All I know is the id for t1. I am trying to avoid using 2 queries so is there a way I could do something like this:
SELECT t1.*
FROM table t1
WHERE t1.id = '1'
UNION
SELECT t2.*
FROM table t2
WHERE t2.id = t1.oldId
SAMPLE DATA
messages_users
id message_id user_id box thread_id latest_id
--------------------------------------------------------
8 1 1 1 NULL NULL
9 2 1 2 NULL 16
10 2 65 1 NULL 15
11 3 65 2 2 NULL
12 3 1 1 2 NULL
13 4 1 2 2 NULL
14 4 65 1 2 NULL
15 5 65 2 2 NULL
16 6 1 1 2 NULL
Query:
SELECT mu.id FROM messages_users mu
JOIN messages_users mu2 ON mu2.latest_id IS NOT NULL
WHERE mu.user_id = '1' AND mu2.user_id = '1' AND ((mu.box = '1'
AND mu.thread_id IS NULL AND mu.latest_id IS NULL) OR mu.id = mu2.latest_id)
This query fixes my problem. But it seems the answer to my question is to not use a JOIN but a UNION.
You mean one row for t1 and one row from t2?
You're looking for UNION, not JOIN.
select * from table where id = 1
union
select * from table where oldid = 1
If you are trying to multiply rows in a table, you need UNION ALL (not UNION):
select *
from ((select * from t) union all
(select * from t)
) t
I also sometimes use a cross join to do this:
select *
from t cross join
(select 1 as seqnum union all select 2) vals
The cross join is explicitly multiplying the number of rows, in this case, with a sequencenumber attached.
Well, since it's the same table, you could do:
SELECT t2.*
FROM table t1
JOIN table t2
ON t2.id = t1.oldId
OR t2.id = t1.id
WHERE t1.id = '1'
If I have a table which is of the following format:
ID NAME NUM TIMESTAMP BOOL
1 A 5 09:50 TRUE
1 B 6 13:01 TRUE
1 A 1 10:18 FALSE
2 A 3 12:20 FALSE
1 A 1 05:30 TRUE
1 A 12 06:00 TRUE
How can I get the ID, NAME and NUM for each unique ID, NAME pair with the latest Timestamp and BOOL=TRUE.
So for the above table the output should be:
ID NAME NUM
1 A 5
1 B 6
I tried using Group By but I cannot seem to get around that either I need to put an aggregator function around num (max, min will not work when applied to this example) or specifying it in group by (which will end up matching on ID, NAME, and NUM combined). Both as far as I can think will break in some case.
PS: I am using SQL Developer (that is the SQL developed by Oracle I think, sorry I am a newbie at this)
If you're using at least SQL-Server 2005 you can use the ROW_NUMBER function:
WITH CTE AS
(
SELECT ID, NAME, NUM,
RN = ROW_NUMBER()OVER(PARTITION BY ID, NAME ORDER BY TIMESTAMP DESC)
FROM Table
WHERE BOOL='TRUE'
)
SELECT ID, NAME, NUM FROM CTE
WHERE RN = 1
Result:
ID NAME NUM
1 A 5
1 B 6
Here's the fiddle: http://sqlfiddle.com/#!3/a1dc9/10/0
select t1.* from table as t1 inner join
(
select NAME, NUM, max(TIMESTAMP) as TIMESTAMP from table
where BOOL='TRUE'
) as t2
on t1.name=t2.name and t1.num=t2.num and t1.timestamp=t2.timestamp
where t1.BOOL='TRUE'
select t1.*
from TABLE1 as t1
left join
TABLE1 as t2
on t1.name=t2.name and t1.TIMESTAMP>t2.TIMESTAMP
where t1.BOOL='TRUE' and t2.id is null
should do it for you.