SQL - Calculate next row based on previous in the same column - sql

I have spent hours trying to solve this with loops, the lag function but it doesn't solve my problem. I have a table where the first row of a particular field is populated, the next row is calculated based on a subtraction of the previous row of data from 2 columns, the next row is then based on the result of this. The example is below of the original table and the result set:
a b a b
502.5 33.85 502.5 33.85
25.46 468.65 25.46
20.83 443.19 20.83
133.07 422.36 133.07
144.65 289.29 144.65
144.65 144.64 144.65
I have tried several different methods with stored procedures and can get the 2nd row result set but I can't get it to continue and calculate the rest of the fields, it's easy in excel but not so in SQL. Any suggestions?

If your RDBMS supports windowed aggregate functions:
Assuming you have an id or some such thing that is determining the order of your rows (as you indicated there is a first).
You can use the max() over() (in this case min() works instead of max() as well) and sum() over() windowed aggregate functions
select
id
, max(a) over (order by id) - (sum(b) over (order by id) - b) as a
, b
from t
rextester demo: http://rextester.com/MGKM17497
returns:
+----+--------+--------+
| id | a | b |
+----+--------+--------+
| 1 | 502,50 | 33,85 |
| 2 | 468,65 | 25,46 |
| 3 | 443,19 | 20,83 |
| 4 | 422,36 | 133,07 |
| 5 | 289,29 | 144,65 |
| 6 | 144,64 | 144,65 |
+----+--------+--------+

In case, as I saw data before editing )
This solution also assumes that you have id column and order depends on this column
with t(id, a, b) as(
select 1, 502.5, 33.85 union all
select 2, 25.46, null union all
select 3, 20.83, null union all
select 4, 133.07, null union all
select 5, 144.65, null union all
select 6, 144.65, null
)
select case when id = 1 then a else b end as a, case when id = 1 then (select b from t order by id offset 0 rows fetch next 1 rows only) else a end as b from (
select id, a, lag((select a from t order by id offset 0 rows fetch next 1 rows only)-s) over(order by id) as b from (
select id, a, sum(case when b is null then a else b end ) over(order by id) s
from t
) tt
) ttt

Related

Find the count of IDs that have the same value

I'd like to get a count of all of the Ids that have have the same value (Drops) as other Ids. For instance, the illustration below shows you that ID 1 and 3 have A drops so the query would count them. Similarly, ID 7 & 18 have B drops so that's another two IDs that the query would count totalling in 4 Ids that share the same values so that's what my query would return.
+------+-------+
| ID | Drops |
+------+-------+
| 1 | A |
| 2 | C |
| 3 | A |
| 7 | B |
| 18 | B |
+------+-------+
I've tried the several approaches but the following query was my last attempt.
With cte1 (Id1, D1) as
(
select Id, Drops
from Posts
),
cte2 (Id2, D2) as
(
select Id, Drops
from Posts
)
Select count(distinct c1.Id1) newcnt, c1.D1
from cte1 c1
left outer join cte2 c2 on c1.D1 = c2.D2
group by c1.D1
The result if written out in full would be a single value output but the records that the query should be choosing should look as follows:
+------+-------+
| ID | Drops |
+------+-------+
| 1 | A |
| 3 | A |
| 7 | B |
| 18 | B |
+------+-------+
Any advice would be great. Thanks
You can use a CTE to generate a list of Drops values that have more than one corresponding ID value, and then JOIN that to Posts to find all rows which have a Drops value that has more than one Post:
WITH CTE AS (
SELECT Drops
FROM Posts
GROUP BY Drops
HAVING COUNT(*) > 1
)
SELECT P.*
FROM Posts P
JOIN CTE ON P.Drops = CTE.Drops
Output:
ID Drops
1 A
3 A
7 B
18 B
If desired you can then count those posts in total (or grouped by Drops value):
WITH CTE AS (
SELECT Drops
FROM Posts
GROUP BY Drops
HAVING COUNT(*) > 1
)
SELECT COUNT(*) AS newcnt
FROM Posts P
JOIN CTE ON P.Drops = CTE.Drops
Output
newcnt
4
Demo on SQLFiddle
You may use dense_rank() to resolve your problem. if drops has the same ID then dense_rank() will provide the same rank.
Here is the demo.
with cte as
(
select
drops,
count(distinct rnk) as newCnt
from
( select
*,
dense_rank() over (partition by drops order by id) as rnk
from myTable
) t
group by
drops
having count(distinct rnk) > 1
)
select
sum(newCnt) as newCnt
from cte
Output:
|newcnt |
|------ |
| 4 |
First group the count of the ids for your drops and then sum the values greater than 1.
select sum(countdrops) as total from
(select drops , count(id) as countdrops from yourtable group by drops) as temp
where countdrops > 1;

SQL select all rows per group after a condition is met

I would like to select all rows for each group after the last time a condition is met for that group. This related question has an answer using correlated subqueries.
In my case I will have millions of categories and hundreds of millions/billions of rows. Is there a way to achieve the same results using a more performant query?
Here is an example. The condition is all rows (per group) after the last 0 in the conditional column.
category | timestamp | condition
--------------------------------------
A | 1 | 0
A | 2 | 1
A | 3 | 0
A | 4 | 1
A | 5 | 1
B | 1 | 0
B | 2 | 1
B | 3 | 1
The result I would like to achieve is
category | timestamp | condition
--------------------------------------
A | 4 | 1
A | 5 | 1
B | 2 | 1
B | 3 | 1
If you want everything after the last 0, you can use window functions:
select t.*
from (select t.*,
max(case when condition = 0 then timestamp end) over (partition by category) as max_timestamp_0
from t
) t
where timestamp > max_timestamp_0 or
max_timestamp_0 is null;
With an index on (category, condition, timestamp), the correlated subquery version might also perform quite well:
select t.*
from t
where t.timestamp > all (select t2.timestamp
from t t2
where t2.category = t.category and
t2.condition = 0
);
You might want to try window functions:
select category, timestamp, condition
from (
select
t.*,
min(condition) over(partition by category order by timestamp desc) min_cond
from mytable t
) t
where min_cond = 1
The window min() with the order by clause computes the minimum value of condition over the current and following rows of the same category: we can use it as a filter to eliminate rows for which there is a more recent row with a 0.
Compared to the correlated subquery approach, the upside of using window functions is that it reduces the number of scans needed on the table. Of course this computing also has a cost, so you'll need to assess both solutions against your sample data.

Get row which matched in each group

I am trying to make a sql query. I got some results from 2 tables below. Below results are good for me. Now I want those values which is present in each group. for example, A and B is present in each group(in each ID). so i want only A and B in result. and also i want make my query dynamic. Could anyone help?
| ID | Value |
|----|-------|
| 1 | A |
| 1 | B |
| 1 | C |
| 1 | D |
| 2 | A |
| 2 | B |
| 2 | C |
| 3 | A |
| 3 | B |
In the following query, I have placed your current query into a CTE for further use. We can try selecting those values for which every ID in your current result appears. This would imply that such values are associated with every ID.
WITH cte AS (
-- your current query
)
SELECT Value
FROM cte
GROUP BY Value
HAVING COUNT(DISTINCT ID) = (SELECT COUNT(DISTINCT ID) FROM cte);
Demo
The solution is simple - you can do this in two ways at least. Group by letters (Value), aggregate IDs with SUM or COUNT (distinct values in ID). Having that, choose those letters that have the value for SUM(ID) or COUNT(ID).
select Value from MyTable group by Value
having SUM(ID) = (SELECT SUM(DISTINCT ID) from MyTable)
select Value from MyTable group by Value
having COUNT(ID) = (SELECT COUNT(DISTINCT ID) from MyTable)
Use This
WITH CTE
AS
(
SELECT
Value,
Cnt = COUNT(DISTINCT ID)
FROM T1
GROUP BY Value
)
SELECT
Value
FROM CTE
WHERE Cnt = (SELECT COUNT(DISTINCT ID) FROM T1)

SELECT First Group

Problem Definition
I have an SQL query that looks like:
SELECT *
FROM table
WHERE criteria = 1
ORDER BY group;
Result
I get:
group | value | criteria
------------------------
A | 0 | 1
A | 1 | 1
B | 2 | 1
B | 3 | 1
Expected Result
However, I would like to limit the results to only the first group (in this instance, A). ie,
group | value | criteria
------------------------
A | 0 | 1
A | 1 | 1
What I've tried
Group By
SELECT *
FROM table
WHERE criteria = 1
GROUP BY group;
I can aggregate the groups using a GROUP BY clause, but that would give me:
group | value
-------------
A | 0
B | 2
or some aggregate function of EACH group. However, I don't want to aggregate the rows!
Subquery
I can also specify the group by subquery:
SELECT *
FROM table
WHERE criteria = 1 AND
group = (
SELECT group
FROM table
WHERE criteria = 1
ORDER BY group ASC
LIMIT 1
);
This works, but as always, subqueries are messy. Particularly, this one requires specifying my WHERE clause for criteria twice. Surely there must be a cleaner way to do this.
You can try following query:-
SELECT *
FROM table
WHERE criteria = 1
AND group = (SELECT MIN(group) FROM table)
ORDER BY value;
If your database supports the WITH clause, try this. It's similar to using a subquery, but you only need to specify the criteria input once. It's also easier to understand what's going on.
with main_query as (
select *
from table
where criteria = 1
order by group, value
),
with min_group as (
select min(group) from main_query
)
select *
from main_query
where group in (select group from min_group);
-- this where clause should be fast since there will only be 1 record in min_group
Use DENSE_RANK()
DECLARE #yourTbl AS TABLE (
[group] NVARCHAR(50),
value INT,
criteria INT
)
INSERT INTO #yourTbl VALUES ( 'A', 0, 1 )
INSERT INTO #yourTbl VALUES ( 'A', 1, 1 )
INSERT INTO #yourTbl VALUES ( 'B', 2, 1 )
INSERT INTO #yourTbl VALUES ( 'B', 3, 1 )
;WITH cte AS
(
SELECT i.* ,
DENSE_RANK() OVER (ORDER BY i.[group]) AS gn
FROM #yourTbl AS i
WHERE i.criteria = 1
)
SELECT *
FROM cte
WHERE gn = 1
group | value | criteria
------------------------
A | 0 | 1
A | 1 | 1

Grouping SQL Results based on order

I have table with data something like this:
ID | RowNumber | Data
------------------------------
1 | 1 | Data
2 | 2 | Data
3 | 3 | Data
4 | 1 | Data
5 | 2 | Data
6 | 1 | Data
7 | 2 | Data
8 | 3 | Data
9 | 4 | Data
I want to group each set of RowNumbers So that my result is something like this:
ID | RowNumber | Group | Data
--------------------------------------
1 | 1 | a | Data
2 | 2 | a | Data
3 | 3 | a | Data
4 | 1 | b | Data
5 | 2 | b | Data
6 | 1 | c | Data
7 | 2 | c | Data
8 | 3 | c | Data
9 | 4 | c | Data
The only way I know where each group starts and stops is when the RowNumber starts over. How can I accomplish this? It also needs to be fairly efficient since the table I need to do this on has 52 Million Rows.
Additional Info
ID is truly sequential, but RowNumber may not be. I think RowNumber will always begin with 1 but for example the RowNumbers for group1 could be "1,1,2,2,3,4" and for group2 they could be "1,2,4,6", etc.
For the clarified requirements in the comments
The rownumbers for group1 could be "1,1,2,2,3,4" and for group2 they
could be "1,2,4,6" ... a higher number followed by a lower would be a
new group.
A SQL Server 2012 solution could be as follows.
Use LAG to access the previous row and set a flag to 1 if that row is the start of a new group or 0 otherwise.
Calculate a running sum of these flags to use as the grouping value.
Code
WITH T1 AS
(
SELECT *,
LAG(RowNumber) OVER (ORDER BY ID) AS PrevRowNumber
FROM YourTable
), T2 AS
(
SELECT *,
IIF(PrevRowNumber IS NULL OR PrevRowNumber > RowNumber, 1, 0) AS NewGroup
FROM T1
)
SELECT ID,
RowNumber,
Data,
SUM(NewGroup) OVER (ORDER BY ID
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Grp
FROM T2
SQL Fiddle
Assuming ID is the clustered index the plan for this has one scan against YourTable and avoids any sort operations.
If the ids are truly sequential, you can do:
select t.*,
(id - rowNumber) as grp
from t
Also you can use recursive CTE
;WITH cte AS
(
SELECT ID, RowNumber, Data, 1 AS [Group]
FROM dbo.test1
WHERE ID = 1
UNION ALL
SELECT t.ID, t.RowNumber, t.Data,
CASE WHEN t.RowNumber != 1 THEN c.[Group] ELSE c.[Group] + 1 END
FROM dbo.test1 t JOIN cte c ON t.ID = c.ID + 1
)
SELECT *
FROM cte
Demo on SQLFiddle
How about:
select ID, RowNumber, Data, dense_rank() over (order by grp) as Grp
from (
select *, (select min(ID) from [Your Table] where ID > t.ID and RowNumber = 1) as grp
from [Your Table] t
) t
order by ID
This should work on SQL 2005. You could also use rank() instead if you don't care about consecutive numbers.