Calculate difference between rows and keep the first row always 0? - sql

I have to calculate the difference between row values in Table X (SQL Server)
Table X
ID | A
---+----
 1 | 100
 2 | 200
 3 | 300
 4 | 400
So I wrote the following SQL query
SELECT ID,
       A,
       A - COALESCE(LAG(A) OVER (ORDER BY [Date]), 0) AS Difference
FROM [Table X]
And the result is
ID | A   | Difference
---+-----+-----------
 1 | 100 |        100
 2 | 200 |       -100
 3 | 300 |       -100
 4 | 400 |       -100
What I want is to keep the first row's Difference always as 0:
ID | A   | Difference
---+-----+-----------
 1 | 100 |          0
 2 | 200 |       -100
 3 | 300 |       -100
 4 | 400 |       -100
But I have no idea how to do it.

You may try to pass a value for the default parameter of the LAG() window function. As explained in the documentation, the default parameter is the value to return when the offset goes beyond the scope of the partition (and for the first row, the previous row is beyond that scope).
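For example, a minimal sketch that applies the default argument directly to the query from the question (assuming the table really has a [Date] column to order by, as in your original ORDER BY):
SELECT ID,
       A,
       A - LAG(A, 1, A) OVER (ORDER BY [Date]) AS Difference
FROM [Table X]
Because LAG(A, 1, A) returns A itself on the first row, the first Difference becomes A - A = 0 and no COALESCE is needed. The statement below subtracts in the other direction (previous minus current) to match the signs in your expected output.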
Table:
CREATE TABLE Data (ID int, A int, [Date] date)
INSERT INTO Data (ID, A, [Date])
VALUES
(1, 100, '20200701'),
(2, 200, '20200702'),
(3, 300, '20200703'),
(4, 400, '20200704')
Statement:
SELECT
ID,
A,
LAG(A, 1, A) OVER (ORDER BY [Date]) - A AS Difference
FROM Data
Result:
ID A Difference
------------------
1 100 0
2 200 -100
3 300 -100
4 400 -100

Thanks to @zhorov for the table schema and data.
You can use ISNULL or COALESCE to arrive at the difference.
DECLARE @Data table (ID int, A int, [Date] date)
INSERT INTO @Data (ID, A, [Date])
VALUES
(1, 100, '20200701'),
(2, 200, '20200702'),
(3, 300, '20200703'),
(4, 400, '20200704')
SELECT ID, A, ISNULL(LAG(A, 1) OVER (ORDER BY [Date]), A) - A AS difference FROM @Data
--or you can use COALESCE
SELECT ID, A, COALESCE(LAG(A, 1) OVER (ORDER BY [Date]), A) - A AS difference FROM @Data
+----+-----+------------+
| ID | A   | difference |
+----+-----+------------+
|  1 | 100 |          0 |
|  2 | 200 |       -100 |
|  3 | 300 |       -100 |
|  4 | 400 |       -100 |
+----+-----+------------+

You can try the following query.
For this type of query the ORDER BY clause is important: depending on the column used and whether it is sorted ascending or descending, the result can differ.
create table Test(ID int,
A int)
insert into Test values
(1, 100),
(2, 200),
(3, 300),
(4, 400)
SELECT ID
,A
,Difference
FROM (
SELECT ID
,A
,isnull(A - LAG(A) OVER (
ORDER BY A DESC
), 0) Difference
FROM test
) t
Live Demo

Related

Daily record count based on status allocation

I have a table named Books and a table named Transfer with the following structure:
CREATE TABLE Books
(
BookID int,
Title varchar(150),
PurchaseDate date,
Bookstore varchar(150),
City varchar(150)
);
INSERT INTO Books VALUES (1, 'Cujo', '2022-02-01', 'CentralPark1', 'New York');
INSERT INTO Books VALUES (2, 'The Hotel New Hampshire', '2022-01-08', 'TheStrip1', 'Las Vegas');
INSERT INTO Books VALUES (3, 'Gorky Park', '2022-05-19', 'CentralPark2', 'New York');
CREATE TABLE Transfer
(
BookID int,
BookStatus varchar(50),
TransferDate date
);
INSERT INTO Transfer VALUES (1, 'Rented', '2022-11-01');
INSERT INTO Transfer VALUES (1, 'Returned', '2022-11-05');
INSERT INTO Transfer VALUES (1, 'Rented', '2022-11-06');
INSERT INTO Transfer VALUES (1, 'Returned', '2022-11-09');
INSERT INTO Transfer VALUES (2, 'Rented', '2022-11-03');
INSERT INTO Transfer VALUES (2, 'Returned', '2022-11-09');
INSERT INTO Transfer VALUES (2, 'Rented', '2022-11-15');
INSERT INTO Transfer VALUES (2, 'Returned', '2022-11-23');
INSERT INTO Transfer VALUES (3, 'Rented', '2022-11-14');
INSERT INTO Transfer VALUES (3, 'Returned', '2022-11-21');
INSERT INTO Transfer VALUES (3, 'Rented', '2022-11-25');
INSERT INTO Transfer VALUES (3, 'Returned', '2022-11-29');
See fiddle.
I want to do a query for a date interval (in this case 01.11 - 09.11) that returns the book count for each day based on BookStatus from Transfer, like so:
+────────────+────────+────────+────────+────────+────────+────────+────────+────────+────────+
| Status | 01.11 | 02.11 | 03.11 | 04.11 | 05.11 | 06.11 | 07.11 | 08.11 | 09.11 |
+────────────+────────+────────+────────+────────+────────+────────+────────+────────+────────+
| Rented | 2 | 1 | 2 | 2 | 0 | 2 | 3 | 3 | 1 |
+────────────+────────+────────+────────+────────+────────+────────+────────+────────+────────+
| Returned | 1 | 2 | 1 | 1 | 3 | 1 | 0 | 0 | 2 |
+────────────+────────+────────+────────+────────+────────+────────+────────+────────+────────+
A book remains rented as long as it was not returned, and is counted as 'Returned' every day until it is rented out again.
This is what the query result would look like for one book (BookID 1):
I see two possible solutions.
Dynamic solution
Use a (recursive) common table expression to generate a list of all the dates that fall within the requested range.
Use two cross apply statements that each perform a count() aggregation to count the number of book transfers.
-- generate date range
with Dates as
(
select convert(date, '2022-11-01') as TransferDate
union all
select dateadd(day, 1, d.TransferDate)
from Dates d
where d.TransferDate < '2022-11-10'
)
select d.TransferDate,
c1.CountRented,
c2.CountReturned
from Dates d
-- count all rented books up till today, that have not been returned before today
cross apply ( select count(1) as CountRented
from Transfer t1
where t1.BookStatus = 'Rented'
and t1.TransferDate <= d.TransferDate
and not exists ( select 'x'
from Transfer t2
where t2.BookId = t1.BookId
and t2.BookStatus = 'Returned'
and t2.TransferDate > t1.TransferDate
and t2.TransferDate <= d.TransferDate ) ) c1
-- count all returned books for today
cross apply ( select count(1) as CountReturned
from Transfer t1
where t1.BookStatus = 'Returned'
and t1.TransferDate = d.TransferDate ) c2;
Result:
TransferDate CountRented CountReturned
------------ ----------- -------------
2022-11-01 1 0
2022-11-02 1 0
2022-11-03 2 0
2022-11-04 2 0
2022-11-05 1 1
2022-11-06 2 0
2022-11-07 2 0
2022-11-08 2 0
2022-11-09 0 2
2022-11-10 0 0
This result is not the pivoted outcome described in the question. However, pivoting this dynamic solution requires dynamic sql, which is not trivial!
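For reference, a rough sketch of what that dynamic-SQL pivot could look like (this assumes SQL Server 2016+ for DROP TABLE IF EXISTS and 2017+ for STRING_AGG, and it simply materializes the same PivotInput logic used below into a temp table first; treat it as an outline rather than a tested drop-in solution):
DECLARE @from date = '2022-11-01', @to date = '2022-11-10';
DECLARE @cols nvarchar(max), @sql nvarchar(max);

-- 1. materialize the per-day / per-status counts once (same logic as the PivotInput CTE below)
DROP TABLE IF EXISTS #PivotInput;
WITH Dates AS
(
    SELECT @from AS TransferDate
    UNION ALL
    SELECT DATEADD(day, 1, d.TransferDate) FROM Dates d WHERE d.TransferDate < @to
)
SELECT replace(str(datepart(day, d.TransferDate), 2), space(1), '0') + '.' +
       replace(str(datepart(month, d.TransferDate), 2), space(1), '0') AS TransferDate,
       s.BookStatus AS [Status],
       CASE WHEN s.BookStatus = 'Rented' THEN sc1.CountRented ELSE sc2.CountReturned END AS BookStatusCount
INTO #PivotInput
FROM Dates d
CROSS JOIN (VALUES ('Rented'), ('Returned')) s(BookStatus)
CROSS APPLY ( SELECT COUNT(1) AS CountRented
              FROM Transfer t1
              WHERE t1.BookStatus = s.BookStatus
                AND t1.TransferDate <= d.TransferDate
                AND NOT EXISTS ( SELECT 'x'
                                 FROM Transfer t2
                                 WHERE t2.BookID = t1.BookID
                                   AND t2.BookStatus = 'Returned'
                                   AND t2.TransferDate > t1.TransferDate
                                   AND t2.TransferDate <= d.TransferDate ) ) sc1
CROSS APPLY ( SELECT COUNT(1) AS CountReturned
              FROM Transfer t3
              WHERE t3.TransferDate = d.TransferDate
                AND t3.BookStatus = 'Returned' ) sc2;

-- 2. build the [01.11], [02.11], ... column list from the generated labels
--    (lexicographic order of dd.MM is fine while the range stays within one month)
SELECT @cols = STRING_AGG(QUOTENAME(TransferDate), ', ') WITHIN GROUP (ORDER BY TransferDate)
FROM (SELECT DISTINCT TransferDate FROM #PivotInput) x;

-- 3. inject the column list into the PIVOT and execute it
SET @sql = N'SELECT piv.* FROM #PivotInput pivi
             PIVOT (SUM(pivi.BookStatusCount) FOR pivi.TransferDate IN (' + @cols + N')) piv;';
EXEC sys.sp_executesql @sql;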
Static solution
This will deliver the exact outcome described in the question (including the date formatting), but it requires the date range to be fully typed out once.
The essential building blocks are similar to the dynamic solution above:
A recursive common table expression to generate a date range.
Two cross apply blocks to perform the counting calculations as before.
There is also:
An extra cross join to duplicate the date range for each BookStatus (to avoid NULL values in the result).
Some replace(), str() and datepart() functions to format the dates.
A case expression to merge the two counts to a single column.
The solution is probably not the most performant, but it does deliver the requested result. If you want to validate for BookID=1 then just uncomment the extra WHERE filter clauses.
with Dates as
(
select convert(date, '2022-11-01') as TransferDate
union all
select dateadd(day, 1, d.TransferDate)
from Dates d
where d.TransferDate < '2022-11-10'
),
PivotInput as
(
select replace(str(datepart(day, d.TransferDate), 2), space(1), '0') + '.' + replace(str(datepart(month, d.TransferDate), 2), space(1), '0') as TransferDate,
s.BookStatus as [Status],
case when s.BookStatus = 'Rented' then sc1.CountRented else sc2.CountReturned end as BookStatusCount
from Dates d
cross join (values('Rented'), ('Returned')) s(BookStatus)
cross apply ( select count(1) as CountRented
from Transfer t1
where t1.BookStatus = s.BookStatus
and t1.TransferDate <= d.TransferDate
--and t1.BookID = 1
and not exists ( select 'x'
from Transfer t2
where t2.BookId = t1.BookId
and t2.BookStatus = 'Returned'
and t2.TransferDate > t1.TransferDate
and t2.TransferDate <= d.TransferDate ) ) sc1
cross apply ( select count(1) as CountReturned
from Transfer t3
where t3.TransferDate = d.TransferDate
--and t3.BookID = 1
and t3.BookStatus = 'Returned' ) sc2
)
select piv.*
from PivotInput pivi
pivot (sum(pivi.BookStatusCount) for pivi.TransferDate in (
[01.11],
[02.11],
[03.11],
[04.11],
[05.11],
[06.11],
[07.11],
[08.11],
[09.11],
[10.11])) piv;
Result:
Status 01.11 02.11 03.11 04.11 05.11 06.11 07.11 08.11 09.11 10.11
Rented 1 1 2 2 1 2 2 2 0 0
Returned 0 0 0 0 1 0 0 0 2 0
Fiddle to see things in action.

How can I duplicate records with T-SQL and keep track of the progressive number?

How can I duplicate the records of table1 and store them in table2 along with the progressive number calculated from startnum and endnum?
Thanks
The first row must be duplicated into 4 records, i.e. num: 80, 81, 82, 83.
Startnum | Endnum | Data
---------+-------------+----------
80 | 83 | A
10 | 11 | C
14 | 16 | D
Result:
StartEndNum | Data
------------+-----------
80 | A
81 | A
82 | A
83 | A
10 | C
11 | C
14 | D
15 | D
16 | D
A simple method uses a recursive CTE:
with cte as (
select startnum, endnum, data
from t
union all
select startnum + 1, endnum, data
from cte
where startnum < endnum
)
select startnum, data
from cte;
If you have ranges that exceed 100, you need option (maxrecursion 0).
Note: There are other solutions as well, using numbers tables (either built-in or generated). I like this solution as a gentle introduction to recursive CTEs.
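For example, the same query with the recursion limit lifted would look like this (the OPTION clause goes at the end of the whole statement, not inside the CTE):
with cte as (
select startnum, endnum, data
from t
union all
select startnum + 1, endnum, data
from cte
where startnum < endnum
)
select startnum, data
from cte
option (maxrecursion 0);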
Without recursion:
declare @t table(Startnum int, Endnum int, Data varchar(20))
insert into @t values
(80, 83, 'A'),
(10, 11, 'C'),
(14, 16, 'D');
select a.StartEndNum, t.Data
from @t t cross apply (select top (t.Endnum - t.Startnum + 1)
t.Startnum + row_number() over(order by getdate()) - 1 as StartEndNum
from sys.all_columns) a;
You can use any other table with enough rows instead of sys.all_columns
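For instance, on SQL Server 2022 or later (an assumption about your version) the built-in GENERATE_SERIES function can replace the sys.all_columns trick entirely:
select s.value as StartEndNum, t.Data
from @t t
cross apply generate_series(t.Startnum, t.Endnum) s;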

How to SUM column 1 and select column 2 by condition?

I'm stuck on how to sum column A and select column B with a condition: if column B >= 50, select that row's id.
Example table like this:
+----+-----------+---------+
| ID | PRICE | PERCENT |
+----+-----------+---------+
| 1 | 5 | 5 |
| 2 | 18 | 20 |
| 3 | 7 | 50 |
| 4 | 16 | 56 |
| 5 | 50 | 87 |
| 6 | 17 | 95 |
| 7 | 40 | 107 |
+----+-----------+---------+
SELECT ID, SUM(PRICE) AS PRICE, PERCENT FROM Table
For the ID and PERCENT columns, I want to select the values from a row with PERCENT >= 50.
The result should be
Any suggestions?
Try the below query:
declare @tbl table(ID int, PRICE int, [PERCENT] int);
insert into @tbl values
(1, 5, 5),
(2, 18, 20),
(3, 7, 50),
(4, 16, 56),
(5, 50, 87),
(6, 17, 95),
(7, 40, 107);
select top 1 ID,
(select sum(PRICE) from #tbl) PRICE,
[PERCENT]
from @tbl
where [PERCENT] > 50
You could include the total in a subquery in the SELECT clause of your query like this:
SELECT
[ID],
(SELECT SUM([PRICE]) FROM T) AS [PRICE],
[PERCENT]
FROM
T
WHERE
[PERCENT] >= 50
However, it remains unclear which of the five valid records should be picked. You indicated it should be the record where PERCENT has value 56, but IMHO value 50 would be possible too, just like 87, 95, and 107 (?). It is unclear why you pick value 56 as the correct one. If it doesn't matter, you could use TOP (1) in the SELECT clause, but if it does matter, you should extend the WHERE clause with appropriate conditions/filters.
Mixing aggregate data from groups back with individual elements/records like this is often fuzzy. I consider it to be a "code smell" and here in your question on StackOverflow, it might indicate an XY-problem. Anyway, these query results might get misinterpreted quite easily if you are not careful. Always remember that such aggregated data in the result (in this case the PRICE field) has practically nothing to do with the detail data in the result (in this case the ID and PERCENT fields). Unless you want to combine your aggregate data with your detail data (in a calculation for example), but you do not indicate you want anything like that in your question...
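For example, a sketch of that TOP (1) variant with an explicit ORDER BY so the picked row becomes deterministic (here I simply assume, for illustration, that the smallest qualifying PERCENT is wanted):
SELECT TOP (1)
[ID],
(SELECT SUM([PRICE]) FROM T) AS [PRICE],
[PERCENT]
FROM
T
WHERE
[PERCENT] >= 50
ORDER BY
[PERCENT]; -- replace with whatever rule defines "the" row you want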
You can do this trick to have the result of 2 queries in 1 query:
select ID as ID,T.[PERCENT] AS B, 0 as sumA
from Table_1 as T
where T.[PERCENT]>=50
union All
select 0 as ID,0 AS B, sum(t.[PRICE]) as sumA
from Table_1 as T
I am not sure why you need this but, certainly, you can achieve the above output using the below query.
Sample Data
declare @data table
(Id int, Price int, [Percent] int)
insert @data
VALUES (1,5,5),
(2,18,20),
(3,7,50),
(4,16,56),
(5,50,87),
(6,17,95),
(7,40,107)
Query
select top 1 ID, (select sum(Price) from @data) as Price, [Percent]
from @data
where [Percent] > 50
You can try the following code:
SELECT TOP (1) [ID], SUM(PRICE) OVER (), [PERCENT]
FROM @tbl
ORDER BY CASE WHEN [PERCENT] > 50 THEN 0 ELSE 1 END, [ID];
I am using the OVER clause in order to read the data from the table only once - one table scan.

Script required

Suppose we have a table #temp1. What we require here is an additional column ABC, in which we want to print the output as (10-10 = 0), (20-10 = 10), (30-10 = 20), (40-10 = 30) and (50-10 = 40).
So, we have created a table and insert script below.
Create table #temp1 (ID Int Identity(1,1),Name varchar(10),Series bigint)
insert into #temp1 values('A',10)
insert into #temp1 values('B',20)
insert into #temp1 values('C',30)
insert into #temp1 values('D',40)
insert into #temp1 values('E',50)
I tried as below, but it increments the values row by row.
select ID,Name,Series, SUM(series) over(order by series asc Rows Between Unbounded Preceding and Current Row) ranking from #temp1
Output should be:
ID|Name|Series|ABC
1 | A | 10 | 0
2 | B | 20 | 10
3 | C | 30 | 20
4 | D | 40 | 30
5 | E | 50 | 40
Can anyone help me with how to do that?
It's simply Series - 10:
SELECT ID, Name, Series, Series - 10 AS ABC
FROM #Temp1;
Try the below query:
select ID,Name,Series,(series - (select top 1 series from #temp1)) as abc from #temp1
Agree with Sami. Unless something else is intended from the query that was attempted, it seems like it should not be Series - 10, but rather series - the first series value in the table?
If that is the case, then the answer should be
select ID,Name,Series, Series - MIN(series) over
(order by series asc Rows Between unbounded Preceding and Current Row) ABC from #temp1
select ID,Name,Series,(series - 10) as abc from #temp1
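For completeness, a shorter equivalent of the running-MIN window approach above uses FIRST_VALUE (a sketch; it assumes the row with the lowest Series is the intended baseline):
select ID, Name, Series,
       Series - first_value(Series) over (order by Series asc) as ABC
from #temp1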

Get id of max value in group

I have a table and I would like to gather the id of the item with the max value in a column from each group, but I have a problem.
SELECT group_id, MAX(time)
FROM mytable
GROUP BY group_id
This way I get the correct rows, but I need the id:
SELECT id,group_id,MAX(time)
FROM mytable
GROUP BY id,group_id
This way I got all the rows. How could I get the ID of the row with the max time from each group?
Sample Data
id = 1, group_id = 1, time = 2014.01.03
id = 2, group_id = 1, time = 2014.01.04
id = 3, group_id = 2, time = 2014.01.04
id = 4, group_id = 2, time = 2014.01.02
id = 5, group_id = 3, time = 2014.01.01
and from that I should get ids: 2, 3, 5
Thanks!
Use your working query as a sub-query, like this:
SELECT `id`
FROM `mytable`
WHERE (`group_id`, `time`) IN (
SELECT `group_id`, MAX(`time`) as `time`
FROM `mytable`
GROUP BY `group_id`
)
Have a look at the below demo
DROP TABLE IF EXISTS mytable;
CREATE TABLE mytable(id INT , group_id INT , time_st DATE);
INSERT INTO mytable VALUES(1, 1, '2014-01-03'),(2, 1, '2014-01-04'),(3, 2, '2014-01-04'),(4, 2, '2014-01-02'),(5, 3, '2014-01-01');
/** Check all data **/
SELECT * FROM mytable;
+------+----------+------------+
| id | group_id | time_st |
+------+----------+------------+
| 1 | 1 | 2014-01-03 |
| 2 | 1 | 2014-01-04 |
| 3 | 2 | 2014-01-04 |
| 4 | 2 | 2014-01-02 |
| 5 | 3 | 2014-01-01 |
+------+----------+------------+
/** Query for Actual output**/
SELECT
id
FROM
mytable
JOIN
(
SELECT group_id, MAX(time_st) as max_time
FROM mytable GROUP BY group_id
) max_time_table
ON mytable.group_id = max_time_table.group_id AND mytable.time_st = max_time_table.max_time;
+------+
| id |
+------+
| 2 |
| 3 |
| 5 |
+------+
When multiple groups may contain the same value, you could use
SELECT subq.id
FROM (SELECT id,
time,
MAX(time) OVER (PARTITION BY group_id) as max_time
FROM mytable) as subq
WHERE subq.time = subq.max_time
The subquery here generates a new column (max_time) that contains the maximum time per group. We can then filter on time and max_time being identical. Note that this still returns multiple rows per group if the maximum value occurs multiple times within the same group.
Full example:
CREATE TABLE test (
id INT,
group_id INT,
value INT
);
INSERT INTO test (id, group_id, value) VALUES (1, 1, 100);
INSERT INTO test (id, group_id, value) VALUES (2, 1, 200);
INSERT INTO test (id, group_id, value) VALUES (3, 1, 300);
INSERT INTO test (id, group_id, value) VALUES (4, 2, 100);
INSERT INTO test (id, group_id, value) VALUES (5, 2, 300);
INSERT INTO test (id, group_id, value) VALUES (6, 2, 200);
INSERT INTO test (id, group_id, value) VALUES (7, 3, 300);
INSERT INTO test (id, group_id, value) VALUES (8, 3, 200);
INSERT INTO test (id, group_id, value) VALUES (9, 3, 100);
select * from test;
id | group_id | value
----+----------+-------
1 | 1 | 100
2 | 1 | 200
3 | 1 | 300
4 | 2 | 100
5 | 2 | 300
6 | 2 | 200
7 | 3 | 300
8 | 3 | 200
9 | 3 | 100
(9 rows)
SELECT subq.id
FROM (SELECT id,
value,
MAX(value) OVER (partition by group_id) as max_value
FROM test) as subq
WHERE subq.value = subq.max_value;
id
----
3
5
7
(3 rows)
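If exactly one id per group is required even when the maximum is tied within a group, a ROW_NUMBER() variant (a sketch, not part of the answers above) avoids the duplicate rows:
SELECT subq.id
FROM (SELECT id,
             ROW_NUMBER() OVER (PARTITION BY group_id ORDER BY value DESC) AS rn
      FROM test) AS subq
WHERE subq.rn = 1;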