Use a standard single sql to group data (SQL Server) - sql

Raw data with 2 columns:
0 33
2 null
0 44
2 null
2 null
2 null
0 55
2 null
2 null
.....
Results I want:
2 33
2 44
2 44
2 44
2 55
2 55
....
Can I use a SQL statement to accomplish this? (return the rows with 2 only but fill with values come from the previous row that is 0), there could be many '2 null' between 0.

This way
with s as (
select *
from
(values
(1,0,33 ),
(2,2,null),
(3,0,44 ),
(4,2,null),
(5,2,null),
(6,2,null),
(7,0,55 ),
(8,2,null),
(9,2,null)
) T(id,a,b)
)
select s1.a, t.b
from s s1
cross apply (
select top(1) s2.b
from s s2
where s2.id < s1.id and s2.b is not null and s2.a = 0
order by s2.id desc ) t
where s1.a = 2
order by s1.id;
I use CROSS APPLY so the query may be easily extended to get other columns from the relevant '0' row.

First of all, select value for every row with null:
SELECT col2 FROM (SELECT MAX(ID) FROM your_tbl t WHERE t.ID < ID AND col2 IS NOT NULL);
Then write a condition for your table with that subquery:
SELECT col1, (
SELECT col2 FROM your_tbl where id = (SELECT MAX(ID) FROM your_tbl t
WHERE t.ID < tbl.ID AND col2 IS NOT NULL))
FROM your_tbl tbl WHERE col1 <> 0;

Related

Rolling Average in SQL with Partition [duplicate]

declare #t table
(
id int,
SomeNumt int
)
insert into #t
select 1,10
union
select 2,12
union
select 3,3
union
select 4,15
union
select 5,23
select * from #t
the above select returns me the following.
id SomeNumt
1 10
2 12
3 3
4 15
5 23
How do I get the following:
id srome CumSrome
1 10 10
2 12 22
3 3 25
4 15 40
5 23 63
select t1.id, t1.SomeNumt, SUM(t2.SomeNumt) as sum
from #t t1
inner join #t t2 on t1.id >= t2.id
group by t1.id, t1.SomeNumt
order by t1.id
SQL Fiddle example
Output
| ID | SOMENUMT | SUM |
-----------------------
| 1 | 10 | 10 |
| 2 | 12 | 22 |
| 3 | 3 | 25 |
| 4 | 15 | 40 |
| 5 | 23 | 63 |
Edit: this is a generalized solution that will work across most db platforms. When there is a better solution available for your specific platform (e.g., gareth's), use it!
The latest version of SQL Server (2012) permits the following.
SELECT
RowID,
Col1,
SUM(Col1) OVER(ORDER BY RowId ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Col2
FROM tablehh
ORDER BY RowId
or
SELECT
GroupID,
RowID,
Col1,
SUM(Col1) OVER(PARTITION BY GroupID ORDER BY RowId ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Col2
FROM tablehh
ORDER BY RowId
This is even faster. Partitioned version completes in 34 seconds over 5 million rows for me.
Thanks to Peso, who commented on the SQL Team thread referred to in another answer.
For SQL Server 2012 onwards it could be easy:
SELECT id, SomeNumt, sum(SomeNumt) OVER (ORDER BY id) as CumSrome FROM #t
because ORDER BY clause for SUM by default means RANGE UNBOUNDED PRECEDING AND CURRENT ROW for window frame ("General Remarks" at https://msdn.microsoft.com/en-us/library/ms189461.aspx)
Let's first create a table with dummy data:
Create Table CUMULATIVESUM (id tinyint , SomeValue tinyint)
Now let's insert some data into the table;
Insert Into CUMULATIVESUM
Select 1, 10 union
Select 2, 2 union
Select 3, 6 union
Select 4, 10
Here I am joining same table (self joining)
Select c1.ID, c1.SomeValue, c2.SomeValue
From CumulativeSum c1, CumulativeSum c2
Where c1.id >= c2.ID
Order By c1.id Asc
Result:
ID SomeValue SomeValue
-------------------------
1 10 10
2 2 10
2 2 2
3 6 10
3 6 2
3 6 6
4 10 10
4 10 2
4 10 6
4 10 10
Here we go now just sum the Somevalue of t2 and we`ll get the answer:
Select c1.ID, c1.SomeValue, Sum(c2.SomeValue) CumulativeSumValue
From CumulativeSum c1, CumulativeSum c2
Where c1.id >= c2.ID
Group By c1.ID, c1.SomeValue
Order By c1.id Asc
For SQL Server 2012 and above (much better performance):
Select
c1.ID, c1.SomeValue,
Sum (SomeValue) Over (Order By c1.ID )
From CumulativeSum c1
Order By c1.id Asc
Desired result:
ID SomeValue CumlativeSumValue
---------------------------------
1 10 10
2 2 12
3 6 18
4 10 28
Drop Table CumulativeSum
A CTE version, just for fun:
;
WITH abcd
AS ( SELECT id
,SomeNumt
,SomeNumt AS MySum
FROM #t
WHERE id = 1
UNION ALL
SELECT t.id
,t.SomeNumt
,t.SomeNumt + a.MySum AS MySum
FROM #t AS t
JOIN abcd AS a ON a.id = t.id - 1
)
SELECT * FROM abcd
OPTION ( MAXRECURSION 1000 ) -- limit recursion here, or 0 for no limit.
Returns:
id SomeNumt MySum
----------- ----------- -----------
1 10 10
2 12 22
3 3 25
4 15 40
5 23 63
Late answer but showing one more possibility...
Cumulative Sum generation can be more optimized with the CROSS APPLY logic.
Works better than the INNER JOIN & OVER Clause when analyzed the actual query plan ...
/* Create table & populate data */
IF OBJECT_ID('tempdb..#TMP') IS NOT NULL
DROP TABLE #TMP
SELECT * INTO #TMP
FROM (
SELECT 1 AS id
UNION
SELECT 2 AS id
UNION
SELECT 3 AS id
UNION
SELECT 4 AS id
UNION
SELECT 5 AS id
) Tab
/* Using CROSS APPLY
Query cost relative to the batch 17%
*/
SELECT T1.id,
T2.CumSum
FROM #TMP T1
CROSS APPLY (
SELECT SUM(T2.id) AS CumSum
FROM #TMP T2
WHERE T1.id >= T2.id
) T2
/* Using INNER JOIN
Query cost relative to the batch 46%
*/
SELECT T1.id,
SUM(T2.id) CumSum
FROM #TMP T1
INNER JOIN #TMP T2
ON T1.id > = T2.id
GROUP BY T1.id
/* Using OVER clause
Query cost relative to the batch 37%
*/
SELECT T1.id,
SUM(T1.id) OVER( PARTITION BY id)
FROM #TMP T1
Output:-
id CumSum
------- -------
1 1
2 3
3 6
4 10
5 15
Select
*,
(Select Sum(SOMENUMT)
From #t S
Where S.id <= M.id)
From #t M
You can use this simple query for progressive calculation :
select
id
,SomeNumt
,sum(SomeNumt) over(order by id ROWS between UNBOUNDED PRECEDING and CURRENT ROW) as CumSrome
from #t
There is a much faster CTE implementation available in this excellent post:
http://weblogs.sqlteam.com/mladenp/archive/2009/07/28/SQL-Server-2005-Fast-Running-Totals.aspx
The problem in this thread can be expressed like this:
DECLARE #RT INT
SELECT #RT = 0
;
WITH abcd
AS ( SELECT TOP 100 percent
id
,SomeNumt
,MySum
order by id
)
update abcd
set #RT = MySum = #RT + SomeNumt
output inserted.*
For Ex: IF you have a table with two columns one is ID and second is number and wants to find out the cumulative sum.
SELECT ID,Number,SUM(Number)OVER(ORDER BY ID) FROM T
Once the table is created -
select
A.id, A.SomeNumt, SUM(B.SomeNumt) as sum
from #t A, #t B where A.id >= B.id
group by A.id, A.SomeNumt
order by A.id
The SQL solution wich combines "ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW" and "SUM" did exactly what i wanted to achieve.
Thank you so much!
If it can help anyone, here was my case. I wanted to cumulate +1 in a column whenever a maker is found as "Some Maker" (example). If not, no increment but show previous increment result.
So this piece of SQL:
SUM( CASE [rmaker] WHEN 'Some Maker' THEN 1 ELSE 0 END)
OVER
(PARTITION BY UserID ORDER BY UserID,[rrank] ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Cumul_CNT
Allowed me to get something like this:
User 1 Rank1 MakerA 0
User 1 Rank2 MakerB 0
User 1 Rank3 Some Maker 1
User 1 Rank4 Some Maker 2
User 1 Rank5 MakerC 2
User 1 Rank6 Some Maker 3
User 2 Rank1 MakerA 0
User 2 Rank2 SomeMaker 1
Explanation of above: It starts the count of "some maker" with 0, Some Maker is found and we do +1. For User 1, MakerC is found so we dont do +1 but instead vertical count of Some Maker is stuck to 2 until next row.
Partitioning is by User so when we change user, cumulative count is back to zero.
I am at work, I dont want any merit on this answer, just say thank you and show my example in case someone is in the same situation. I was trying to combine SUM and PARTITION but the amazing syntax "ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW" completed the task.
Thanks!
Groaker
Above (Pre-SQL12) we see examples like this:-
SELECT
T1.id, SUM(T2.id) AS CumSum
FROM
#TMP T1
JOIN #TMP T2 ON T2.id < = T1.id
GROUP BY
T1.id
More efficient...
SELECT
T1.id, SUM(T2.id) + T1.id AS CumSum
FROM
#TMP T1
JOIN #TMP T2 ON T2.id < T1.id
GROUP BY
T1.id
Try this
select
t.id,
t.SomeNumt,
sum(t.SomeNumt) Over (Order by t.id asc Rows Between Unbounded Preceding and Current Row) as cum
from
#t t
group by
t.id,
t.SomeNumt
order by
t.id asc;
Try this:
CREATE TABLE #t(
[name] varchar NULL,
[val] [int] NULL,
[ID] [int] NULL
) ON [PRIMARY]
insert into #t (id,name,val) values
(1,'A',10), (2,'B',20), (3,'C',30)
select t1.id, t1.val, SUM(t2.val) as cumSum
from #t t1 inner join #t t2 on t1.id >= t2.id
group by t1.id, t1.val order by t1.id
Without using any type of JOIN cumulative salary for a person fetch by using follow query:
SELECT * , (
SELECT SUM( salary )
FROM `abc` AS table1
WHERE table1.ID <= `abc`.ID
AND table1.name = `abc`.Name
) AS cum
FROM `abc`
ORDER BY Name

SQL Server - Sum of difference between rows in a table

I have a table in the format :
SomeID SomeData
1 3
2 7
3 9
4 10
5 14
6 16
. .
. .
I want to find sum of difference between rows in this table. i.e ( (7-3) + (10-9) + (16-14) + ....)
Which is the best way to do this
Using a self join along with the modulus:
SELECT SUM(t1.SomeData - t2.SomeData) AS total_diff
FROM yourTable t1
INNER JOIN yourTable t2
ON t1.SomeID = t2.SomeID + 1
WHERE t1.SomeID % 2 = 0;
Demo
This answer assumes that the SomeID sequence in fact starts with 1 and increments by 1 with each subsequent row. If not, then we might be able to first apply ROW_NUMBER over SomeID and generate a 1 to N sequence.
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (ORDER BY SomeID) rn
FROM yourTable
)
SELECT SUM(t1.SomeData - t2.SomeData) AS total_diff
FROM cte t1
INNER JOIN cte t2
ON t1.SomeID = t2.SomeID + 1
WHERE t1.rn % 2 = 0;
You can try to use ROW_NUMBER window function to make a serial number then MOD by 2 to get your expected group then use condition aggregate function.
Query 1:
SELECT SUM(CASE WHEN rn = 0 THEN SomeData END) - SUM(CASE WHEN rn = 1 THEN SomeData END)
FROM (
SELECT SomeData,ROW_NUMBER() over(order by SomeID) % 2 rn
FROM t t1
) t1
Results:
| |
|---|
| 7 |

How can I get the first result for each account in this SQL query?

I'm trying to write a query that follows this logic:
Find the first following status code of an account that had a previous status code of X.
So if I have a table of:
id account_num status_code
64 1 X
82 1 Y
72 2 Y
87 1 Z
91 2 X
103 2 Z
The results would be:
id account_num status_code
82 1 Y
103 2 Z
I've come up with a couple of solutions but I'm not all that great with SQL and so they've been pretty inelegeant thus far. I was hoping that someone here might be able to point me in the right direction.
View:
SELECT account_number, id
FROM table
WHERE status_code = 'X'
Query:
SELECT account_number, min(id)
FROM table
INNER JOIN view
ON table.account_number = view.account_number
WHERE table.id > view.id
At this point I have the id that I need but I'd have to write ANOTHER query that uses the id tolook up the status_code.
Edit: To add some context, I'm trying to find calls that have a status_code of X. If a call has a status_code of X we want to dial it a different way the next time we make an attempt. The aim of this query is to provide a report that will show the results of the second dial if the first dial resulted an X status code.
Here's a SQL Server solution.
UPDATE
The idea is to avoid a number of NESTED LOOP joins as proposed by Olaf because they roughly have O(N * M) complexity and thus extremely bad for your performance. MERGED JOINS complexity is O(NLog(N) + MLog(M)) which is much better for real world scenarios.
The query below works as follows:
RankedCTE is a subquery that assigns a row number to each id partioned by account and sorted by id which represents the time. So for the data below the output of this
SELECT
id,
account_num,
status_code,
ROW_NUMBER() OVER (PARTITION BY account_num ORDER BY id DESC) AS item_rank
FROM dbo.Test
would be:
id account_num status_code item_rank
----------- ----------- ----------- ----------
87 1 Z 1
82 1 Y 2
64 1 X 3
103 2 Z 1
91 2 X 2
72 2 Y 3
Once we have them numbered we join the result on itself like this:
WITH RankedCTE AS
(
SELECT
id,
account_num,
status_code,
ROW_NUMBER() OVER (PARTITION BY account_num ORDER BY id DESC) AS item_rank
FROM dbo.Test
)
SELECT
*
FROM
RankedCTE A
INNER JOIN RankedCTE B ON
A.account_num = B.account_num
AND A.item_rank = B.item_rank - 1
which will give us an event and a preceding event in the same table
id account_num status_code item_rank id account_num status_code item_rank
----------- ----------- ----------- ----------- ----------- ----------- ----------- -----------
87 1 Z 1 82 1 Y 2
82 1 Y 2 64 1 X 3
103 2 Z 1 91 2 X 2
91 2 X 2 72 2 Y 3
Finally, we just have to take the preceding event with code "X" and the event with code not "X":
WITH RankedCTE AS
(
SELECT
id,
account_num,
status_code,
ROW_NUMBER() OVER (PARTITION BY account_num ORDER BY id DESC) AS item_rank
FROM dbo.Test
)
SELECT
A.id,
A.account_num,
A.status_code
FROM
RankedCTE A
INNER JOIN RankedCTE B ON
A.account_num = B.account_num
AND A.item_rank = B.item_rank - 1
AND A.status_code <> 'X'
AND B.status_code = 'X'
Query plans for this query and #Olaf Dietsche solution (one of the versions) are below.
Data setup script
CREATE TABLE dbo.Test
(
id int not null PRIMARY KEY,
account_num int not null,
status_code nchar(1)
)
GO
INSERT dbo.Test (id, account_num, status_code)
SELECT 64 , 1, 'X' UNION ALL
SELECT 82 , 1, 'Y' UNION ALL
SELECT 72 , 2, 'Y' UNION ALL
SELECT 87 , 1, 'Z' UNION ALL
SELECT 91 , 2, 'X' UNION ALL
SELECT 103, 2, 'Z'
SQL Fiddle with subselect
select id, account_num, status_code
from mytable
where id in (select min(t1.id)
from mytable t1
join mytable t2 on t1.account_num = t2.account_num
and t1.id > t2.id
and t2.status_code = 'X'
group by t1.account_num)
and SQL Fiddle with join, both for MS SQL Server 2012, both returning the same result.
select id, account_num, status_code
from mytable
join (select min(t1.id) as min_id
from mytable t1
join mytable t2 on t1.account_num = t2.account_num
and t1.id > t2.id
and t2.status_code = 'X'
group by t1.account_num) t on id = min_id
SELECT MIN(ID), ACCOUNT_NUM, STATUS_CODE FROM (
SELECT ID, ACCOUNT_NUM, STATUS_CODE
FROM ACCOUNT A1
WHERE EXISTS
(SELECT 1
FROM ACCOUNT A2
WHERE A1.ACCOUNT_NUM = A2.ACCOUNT_NUM
AND A2.STATUS_CODE = 'X'
AND A2.ID < A1.ID)
) SUB
GROUP BY ACCOUNT_NUM
Here's an SQLFIDDLE
Here's query, with your data, checked under PostgreSQL:
SELECT t0.*
FROM so13594339 t0 JOIN
(SELECT min(t1.id), t1.account_num
FROM so13594339 t1, so13594339 t2
WHERE t1.account_num = t2.account_num AND t1.id > t2.id AND t2.status_code = 'X'
GROUP BY t1.account_num
) z
ON t0.id = z.min AND t0.account_num = z.account_num;

SQL: Outputting Multiple Rows When Joining From Same Table

My question is this: Is it possible to output multiple rows when joining from the same table?
With this code for example, I would like it to output 2 rows, one for each table. Instead, what it does is gives me 1 row with all of the data.
SELECT t1.*, t2.*
FROM table t1
JOIN table t2
ON t2.id = t1.oldId
WHERE t1.id = '1'
UPDATE
Well the problem that I have with the UNION/UNION ALL is this: I don't know what the t1.oldId value is equal to. All I know is the id for t1. I am trying to avoid using 2 queries so is there a way I could do something like this:
SELECT t1.*
FROM table t1
WHERE t1.id = '1'
UNION
SELECT t2.*
FROM table t2
WHERE t2.id = t1.oldId
SAMPLE DATA
messages_users
id message_id user_id box thread_id latest_id
--------------------------------------------------------
8 1 1 1 NULL NULL
9 2 1 2 NULL 16
10 2 65 1 NULL 15
11 3 65 2 2 NULL
12 3 1 1 2 NULL
13 4 1 2 2 NULL
14 4 65 1 2 NULL
15 5 65 2 2 NULL
16 6 1 1 2 NULL
Query:
SELECT mu.id FROM messages_users mu
JOIN messages_users mu2 ON mu2.latest_id IS NOT NULL
WHERE mu.user_id = '1' AND mu2.user_id = '1' AND ((mu.box = '1'
AND mu.thread_id IS NULL AND mu.latest_id IS NULL) OR mu.id = mu2.latest_id)
This query fixes my problem. But it seems the answer to my question is to not use a JOIN but a UNION.
You mean one row for t1 and one row from t2?
You're looking for UNION, not JOIN.
select * from table where id = 1
union
select * from table where oldid = 1
If you are trying to multiply rows in a table, you need UNION ALL (not UNION):
select *
from ((select * from t) union all
(select * from t)
) t
I also sometimes use a cross join to do this:
select *
from t cross join
(select 1 as seqnum union all select 2) vals
The cross join is explicitly multiplying the number of rows, in this case, with a sequencenumber attached.
Well, since it's the same table, you could do:
SELECT t2.*
FROM table t1
JOIN table t2
ON t2.id = t1.oldId
OR t2.id = t1.id
WHERE t1.id = '1'

Get next minimum, greater than or equal to a given value for each group

given the following Table1:
RefID intVal SomeVal
----------------------
1 10 val01
1 20 val02
1 30 val03
1 40 val04
1 50 val05
2 10 val06
2 20 val07
2 30 val08
2 40 val09
2 50 val10
3 12 val11
3 14 val12
4 10 val13
5 100 val14
5 150 val15
5 1000 val16
and Table2 containing some RefIDs and intVals like
RefID intVal
-------------
1 11
1 28
2 9
2 50
2 51
4 11
5 1
5 150
5 151
need an SQL Statement to get the next greater intValue for each RefID and NULL if not found in Table1
following is the expected result
RefID intVal nextGt SomeVal
------------------------------
1 11 20 val01
1 28 30 val03
2 9 10 val06
2 50 50 val10
2 51 NULL NULL
4 11 NULL NULL
5 1 100 val14
5 150 150 val15
5 151 1000 val16
help would be appreciated !
Derived table a retrieves minimal values from table1 given refid and intVal from table2; outer query retrieves someValue only.
select a.refid, a.intVal, a.nextGt, table1.SomeVal
from
(
select table2.refid, table2.intval, min (table1.intVal) nextGt
from table2
left join table1
on table2.refid = table1.refid
and table2.intVal <= table1.intVal
group by table2.refid, table2.intval
) a
-- table1 is joined again to retrieve SomeVal
left join table1
on a.refid = table1.refid
and a.nextGt = table1.intVal
Here is Sql Fiddle with live test.
You can solve this using the ROW_NUMBER() function:
SELECT
RefID,
intVal,
NextGt,
SomeVal,
FROM
(
SELECT
t2.RefID,
t2.intVal,
t1.intVal AS NextGt,
t1.SomeVal,
ROW_NUMBER() OVER (PARTITION BY t2.RefID, t2.intVal ORDER BY t1.intVal) AS rn
FROM
dbo.Table2 AS t2
LEFT JOIN dbo.Table1 AS t1 ON t1.RefID = t2.RefID AND t1.intVal >= t2.intVal
) s
WHERE
rn = 1
;
The derived table matches each Table2 row with all Table1 rows that have the same RefID and an intVal that is greater than or equal to Table2.intVal. Each subset of matches is ranked and the first row is returned by the main query.
The nested query uses an outer join, so that those Table2 rows that have no Table1 matches are still returned (with nulls substituted for the Table1 columns).
Alternatively you can use OUTER APPLY:
SELECT
t2.RefID,
t2.intVal,
t1.intVal AS NextGt,
t1.SomeVal
FROM
dbo.Table2 AS t2
OUTER APPLY
(
SELECT TOP (1)
t1.intVal
FROM
dbo.Table1 AS t1
WHERE
t1.RefID = t2.RefID
AND t1.intVal >= t2.intVal
ORDER BY
t1.intVal ASC
) AS t1
;
This method is arguably more straightforward: for each Table2 row, get all matches from Table1 based on the same set of conditions, sort the matches in the ascending order of Table1.intVal and take the topmost intVal.
This can be done with a join, group by, and a case statement, and a trick:
select t1.refid, t2.intval,
min(case when t1.intval > t2.intval then t1.intval end) as min_greater_than_ref,
substring(min(case when t1.intval > t2.intval
then right('00000000'+cast(t1.intval as varchar(255)), 8)+t1.SomeVal)
end)), 9, 1000)
from table1 t1 left join
table2 t2
on t1.refid = t2.refid
group by t1.refid, t2.intval
SO, the trick is to prepend the integer value to SomeValue, zero-padding the integer value (in this case to 8 characters). You get something like: "00000020val01". The minimum on this column is based on the minimum of the integer. The final step is to extract the value.
For this example, I used SQL Server syntax for the concatenation. In other databases you might use CONCAT() or ||.