Running total by date/ID based on latest change to value in SQL

I have a unique case where I want to calculate the running total of quantities day over day. I have been searching a lot but couldn't find the right answer. Code-wise, there is nothing much I can share as it refers to a lot of sensitive data
Below is the table of dummy data:
As you can see, there are multiple duplicate IDs by date. I want to be able to calculate the running total of a date as follows:
For 2022/03/24, the running total would be 9+33 = 42, on 2022/03/26 the running total should be 9+31 = 40. Essentially, the running total for any given day should pick the last value by ID if it changed or the value that exists. In this case on 2022/03/26 for that date, for ID 2072, we pick 31 and not 33 because that's the latest value available.
Expected Output:
There may be many days in the data, and the running total needs to be computed day over day.
Possible related question: SQL Server running total based on change of state of a column
PS: For context, ID is just a unique identifier for an inventory of items. Each item's quantity changes day by day. In this example, ID 1's inventory last changed on 2022/03/24, whereas ID 2072's changed multiple times. The running total for 2022/03/24 is the sum of the quantities of the inventory items on that day. On the 26th there are no changes for ID 1, but ID 2072 changed, so the inventory pool should reflect the current size of ID 2072 plus the current size of ID 1, in this case 40. Essentially, it is just the current size of the inventory with day-over-day change.
Any help would be really appreciated! Thanks.

I added a few more rows to check whether this is what you really want.
I used T-SQL.
declare @orig table (
    id int,
    quantity int,
    rundate date
)
insert into @orig
values (1,9,'20220324'),(2072,33,'20220324'),(2072,31,'20220326'),(2072,31,'20220327'),
       (2,10,'20220301'),(2,20,'20220325'),(2,30,'20220327')
declare @dates table (
    runningdate date
)
insert into @dates
select distinct rundate from @orig
order by rundate
declare @result table (
    dates date,
    running_quality int
)
DECLARE @mydate date
DECLARE @sum int
-- CURSOR definition (ordered explicitly so the dates are processed chronologically)
DECLARE my_cursor CURSOR FOR
    SELECT runningdate FROM @dates ORDER BY runningdate
OPEN my_cursor
-- Perform the first fetch
FETCH NEXT FROM my_cursor into @mydate
-- Check @@FETCH_STATUS to see if there are any more rows to fetch
WHILE @@FETCH_STATUS = 0
BEGIN
    -- cte: all rows up to the current date
    -- cte2: the latest date per id within that window
    -- cte3: the row holding the latest quantity per id
    ;with cte as (
        select * from @orig
        where rundate <= @mydate
    ), cte2 as (
        select id, max(rundate) as maxrundate
        from cte
        group by id
    ), cte3 as (
        select a.*
        from cte as a join cte2 as b
        on a.id = b.id and a.rundate = b.maxrundate
    )
    select @sum = sum(quantity)
    from cte3
    insert into @result
    select @mydate, @sum
    -- This is executed as long as the previous fetch succeeds
    FETCH NEXT FROM my_cursor into @mydate
END -- cursor
CLOSE my_cursor
DEALLOCATE my_cursor
select * from @result
Result:
dates running_quality
2022-03-01 10
2022-03-24 52
2022-03-25 62
2022-03-26 60
2022-03-27 70
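The same "latest quantity per ID as of each date" logic can also be written without a cursor. The following is only a sketch, not the answerer's method: it runs the query through Python's built-in sqlite3 module so it is easy to try, and the table name orig and the ISO date strings are assumptions made for the demo. It also assumes at most one row per (id, rundate).

```python
import sqlite3

# Sample rows from the answer above: (id, quantity, rundate as ISO text)
rows = [
    (1, 9, "2022-03-24"), (2072, 33, "2022-03-24"), (2072, 31, "2022-03-26"),
    (2072, 31, "2022-03-27"), (2, 10, "2022-03-01"), (2, 20, "2022-03-25"),
    (2, 30, "2022-03-27"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orig (id INTEGER, quantity INTEGER, rundate TEXT)")
conn.executemany("INSERT INTO orig VALUES (?, ?, ?)", rows)

# For every distinct date d, keep only the row of each id whose rundate is
# the latest one on or before d, then sum those quantities.
query = """
SELECT d.rundate, SUM(o.quantity) AS running_quantity
FROM (SELECT DISTINCT rundate FROM orig) AS d
JOIN orig AS o
  ON o.rundate = (SELECT MAX(o2.rundate)
                  FROM orig AS o2
                  WHERE o2.id = o.id AND o2.rundate <= d.rundate)
GROUP BY d.rundate
ORDER BY d.rundate
"""
totals = conn.execute(query).fetchall()
for rundate, total in totals:
    print(rundate, total)
```

On the sample rows this gives the same totals as the cursor version (10, 52, 62, 60, 70). The SELECT itself uses only a correlated subquery and GROUP BY, so it should carry over to T-SQL with little change.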

Related

How do I return lets say 100 distinct ticket Id data from a view which returns two records per ticket Id using pagination?

I have a view that returns two columns Ticket_Id and Price. Each ticket can have up to 2 different prices. Along with this, I have a stored procedure that returns the data from the view to the caller based on input parameters for pagination.
@page: indicates the page number
@pageSize: indicates the number of records per page
When a user requests 100 rows (unique tickets), I have to return at most 200 rows of data.
For this I am using pagination as follows:
OFFSET ', @pageSize,' * (',@page,' - 1) ROWS FETCH NEXT ', @pageSize,' ROWS ONLY
But it returns only 100 rows of data, including duplicates. Is there a way I can modify the pagination parameters to retrieve all 200 rows of data?
Example :
The view returns the following:
ticket_id | price
ticket1 | 10
ticket1 | 12
ticket2 | 11
ticket2 | 13
ticket3 | 12
ticket3 | 14
When the user makes a request with the input parameters:
@page = 1, @pageSize = 3
I need to return all 6 rows of data.
View (I am using a view because the stored procedure does not have direct access to the tickets table):
select tck.ticket_id, tck.cost as 'price'
--,RANK() OVER(ORDER BY tck.ticket_id) 'Rank'
from tickets tck with (NOLOCK)
Stored procedure:
ALTER PROCEDURE [dbo].[p_trans_history_srch]
    -- Add the parameters for the stored procedure here
    @page int = 1,      -- optional
    @pageSize int = 20  -- optional
AS
BEGIN
    -- Note: @search, @startDate and @endDate are used below but are not declared in this excerpt
    declare @finalsqlstmt nvarchar(max)
    declare @pageString nvarchar(max)
    declare @pageCount nvarchar(max) = ''
    declare @viewName nvarchar(max)
    set @pageString = concat(' OFFSET ', @pageSize,' * (',@page,' - 1) ROWS FETCH NEXT ', @pageSize,' ROWS ONLY')
    set @finalsqlstmt = concat('select * from ',dbo.f_get_dbname(),@viewName,'where ',@search ,' and created_date between ''',@startDate,''' and ''',@endDate,''' order by created_date desc ',@pageString)
    set @pageCount = concat('select count(distinct ticket_id) from ',dbo.f_get_dbname(),@viewName,'where ',@search,' and created_date between ''',@startDate,''' and ''',@endDate,'''' )
    exec (@finalsqlstmt)
    exec (@pageCount)
END
Note: I tried using RANK() OVER(ORDER BY ticket_id) 'Rank' and returning data based on rank, but because of the huge table size the performance of the query reduced drastically.
You can get "100 rows" if you pivot your two prices into two columns (for example "A" and "B"). I don't know if this is an option in your situation but here is an example:
DECLARE @t6 TABLE (A VARCHAR(100), B INT)
INSERT INTO @t6 (A,B)
SELECT 'ticket1',10
UNION ALL SELECT 'ticket1',12
UNION ALL SELECT 'ticket2',11
UNION ALL SELECT 'ticket2',13
UNION ALL SELECT 'ticket3',12
UNION ALL SELECT 'ticket3',14
-- Label the lower price of each ticket 'ticketA' and the other one 'ticketB', then pivot on that label
;WITH cte_toPivot AS
(
    SELECT
        [A],
        CASE WHEN ROW_NUMBER() OVER (PARTITION BY A ORDER BY B ASC) = 1
             THEN 'ticketA' ELSE 'ticketB' END [pivotCol],
        [B]
    FROM @t6 t
)
SELECT p.*
FROM cte_toPivot tp
PIVOT(MIN(B) FOR pivotCol IN ([ticketA],[ticketB])) p
Otherwise, if you are always getting exactly half the rows you want, can you multiply the user supplied page size by 2?
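To see how pivoting interacts with paging, here is a small sketch that collapses each ticket to one row (two price columns) and then pages over tickets rather than over raw rows. It runs on SQLite through Python's sqlite3 module, so it uses LIMIT/OFFSET and conditional aggregation instead of T-SQL's OFFSET ... FETCH and PIVOT. The names ticket_view, price_a, price_b and fetch_page are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ticket_view (ticket_id TEXT, price INTEGER)")
conn.executemany(
    "INSERT INTO ticket_view VALUES (?, ?)",
    [("ticket1", 10), ("ticket1", 12), ("ticket2", 11),
     ("ticket2", 13), ("ticket3", 12), ("ticket3", 14)],
)

# Number the prices within each ticket, fold them into two columns,
# and only then apply the page window, so one page row equals one ticket.
PIVOT_PAGE_SQL = """
WITH ranked AS (
    SELECT ticket_id, price,
           ROW_NUMBER() OVER (PARTITION BY ticket_id ORDER BY price) AS rn
    FROM ticket_view
)
SELECT ticket_id,
       MIN(CASE WHEN rn = 1 THEN price END) AS price_a,
       MIN(CASE WHEN rn = 2 THEN price END) AS price_b
FROM ranked
GROUP BY ticket_id
ORDER BY ticket_id
LIMIT :page_size OFFSET :page_size * (:page - 1)
"""

def fetch_page(page, page_size):
    """Return one page of tickets, each with up to two prices."""
    params = {"page": page, "page_size": page_size}
    return conn.execute(PIVOT_PAGE_SQL, params).fetchall()

print(fetch_page(1, 3))
print(fetch_page(2, 2))
```

With @page = 1 and @pageSize = 3 from the question, this returns three rows that together carry all six prices. A ticket with a single price simply gets NULL in the second column.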

Convert Excel formula ' =COUNTIF($B$2:B2,[@[reg_no]]) ' to SQL

My Excel sheet has a column, Count, that counts how many times a registration number has appeared so far, as you can see in the given picture.
Whenever I add a new record to my Excel table, this column looks up through the rows above and counts how many records have the same reg_no.
Let us take Example:
If we add new record at 17th id with
Reg_no = 3591
Name = 'dani'
grade = 'A'
Count ?
Now it will be like Count = 4
I want to convert this table into a SQL query, and I am having trouble working out how to calculate the Count column in SQL.
Does anyone know? Please help.
Step 1: create a temp table with an empty column
SELECT *, null as desired_column
into #yourTable_t1
FROM #yourTable j;
Step 2: create a cursor to calculate your desired_column and update the temp table
begin
    declare @row int, @order int, @prod varchar(100), @prod_count int = 0;
    declare prod_cur cursor for
        SELECT row_num, MyColumn1, MyColumn2
        FROM #yourTable_t1;
    open prod_cur;
    fetch next from prod_cur into @row, @order, @prod;
    while (@@FETCH_STATUS = 0)
    begin
        -- count the rows up to and including the current one that share the same value
        set @prod_count = (select count(MyColumn2) from #yourTable_t1 where
            MyColumn2 = @prod and ROW_NUM <= @row);
        update #yourTable_t1
        set desired_column = @prod_count
        where ROW_NUM = @row;
        fetch next from prod_cur into @row, @order, @prod;
    end;
    close prod_cur;
    deallocate prod_cur;
    --select * from #yourTable_t1 order by MyColumn2;
end;
Good Luck!
This can be done using a window function:
count(*) over (partition by reg_no order by id) as count
Online example
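As a quick check of the window-function approach, the sketch below runs the same COUNT(*) OVER (PARTITION BY ... ORDER BY ...) expression on a few made-up rows using Python's sqlite3 module. The table name students and the sample data are assumptions for the demo, not the asker's sheet.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students (id INTEGER PRIMARY KEY, reg_no INTEGER, name TEXT)"
)
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [(1, 3591, "ali"), (2, 4000, "sara"), (3, 3591, "omar"),
     (4, 3591, "dani"), (5, 4000, "noor")],
)

# For each row, count the rows with the same reg_no whose id is <= this id.
# This mirrors =COUNTIF($B$2:B2, [@[reg_no]]) filled down the column.
rows = conn.execute(
    """
    SELECT id, reg_no,
           COUNT(*) OVER (PARTITION BY reg_no ORDER BY id) AS running_count
    FROM students
    ORDER BY id
    """
).fetchall()

for row in rows:
    print(row)
```

Each reg_no gets its own counter that increases with id, which is what the expanding Excel range produces.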

SQL Query help needed - Multiple rows in 1st table should match to multiple table in 2nd table

Problem Illustration
I am trying to find a query that generates summary information. I have mapped my problem onto a fictitious illustration. I have a 'WaterLeakage%' table, which records the leakage that occurred in hotel rooms over several years.
I have another table, which records WaterConsumption in liters for each room.
Now I have to find the actual water leakage in liters for a given room number over a given date range.
Basically, I have to match several rows in the 'WaterLeakage%' table to several rows in the 'WaterConsumption' table. I am trying to figure out an efficient query to do this, but I have been unable to find one. Please help.
DECLARE @START_DATE_PARAM DATE = '01/10/2017';
DECLARE @END_DATE_PARAM DATE = '01/31/2017';
DECLARE @ROOM_NUMBER INT = 101;
-- Temp tables live in tempdb, so check for them with OBJECT_ID rather than INFORMATION_SCHEMA
IF OBJECT_ID('tempdb..#WATER_CONSUMPTION') IS NOT NULL
    DROP TABLE #WATER_CONSUMPTION;
IF OBJECT_ID('tempdb..#WATER_LEAKAGE_PER') IS NOT NULL
    DROP TABLE #WATER_LEAKAGE_PER;
--Table for daily water consumption per room
CREATE TABLE #WATER_CONSUMPTION(
ROOM_NUMBER INT,
UDAY DATE,
WATER_CONSUMPTION_LITER INT
)
--Table for water leakage percent per room for date range
CREATE TABLE #WATER_LEAKAGE_PER
(
ROOM_NUMBER INT,
START_DATE DATE,
END_DATE DATE,
WATER_LEAKAGE_PERCENT INT
)
-- Raw Data
INSERT INTO #WATER_LEAKAGE_PER(ROOM_NUMBER,START_DATE,END_DATE,WATER_LEAKAGE_PERCENT)
VALUES(101,'2017/01/01','2017/01/02',5),
(102,'2017/01/01','2017/01/05',10),
(101,'2017/01/04','2017/02/06',10);
-- Raw Data
INSERT INTO #WATER_CONSUMPTION
VALUES(101,'2017/01/01',100),
(101,'2017/01/02',100),
(101,'2017/01/03',100),
(101,'2017/01/04',100),
(101,'2017/01/05',100),
(101,'2017/01/06',100),
(102,'2017/01/01',100),
(102,'2017/01/02',100),
(102,'2017/01/03',100),
(102,'2017/01/04',100),
(102,'2017/01/05',100);
DECLARE @TotalLeak REAL = 0;
SELECT * FROM #WATER_CONSUMPTION;
SELECT * FROM #WATER_LEAKAGE_PER;
SELECT * FROM #WATER_CONSUMPTION T1 JOIN (SELECT * FROM #WATER_LEAKAGE_PER WHERE ROOM_NUMBER=@ROOM_NUMBER) T2
ON (T1.ROOM_NUMBER=T2.ROOM_NUMBER AND T1.UDAY >= T2.START_DATE AND T1.UDAY <= T2.END_DATE);
DROP TABLE #WATER_CONSUMPTION;
DROP TABLE #WATER_LEAKAGE_PER;
I am very close to a solution now. Basically, I changed my thinking: I now join in the reverse direction.
BEGIN
--Input parameters for calculating water wastage within a date range
DECLARE @START_DATE_PARAM DATE = '01/10/2017';
DECLARE @END_DATE_PARAM DATE = '01/31/2017';
--Table for daily water consumption per room
CREATE TABLE #WATER_CONSUMPTION(
ROOM_NUMBER INT,
UDAY DATE,
WATER_CONSUMPTION_LITER INT
)
--Table for water leakage percent per room for date range
CREATE TABLE #WATER_LEAKAGE_PER
(
ROOM_NUMBER INT,
START_DATE DATE,
END_DATE DATE,
WATER_LEAKAGE_PERCENT INT,
LEAKAGE_PER_DAY_IN_LITER INT
)
-- Leakage in liters per room for each day. This will have multiple entries for a room and date if that room and date fall in multiple date ranges, e.g. in the #WATER_LEAKAGE_PER table room number 101 has multiple entries with overlapping dates
CREATE TABLE #DAY_WISE_LEAKAGE
(
ROOM_NUMBER INT,
LDATE DATE,
LEAKAGE_IN_LITER INT
)
-- Raw Data
INSERT INTO #WATER_LEAKAGE_PER(ROOM_NUMBER,START_DATE,END_DATE,WATER_LEAKAGE_PERCENT)
VALUES(101,'2017/01/15','2017/01/18',30),
(102,'2017/01/15','2017/01/18',10),
(101,'2017/01/15','2017/02/13',5);
-- Raw Data
INSERT INTO #WATER_CONSUMPTION
VALUES(101,'01/01/2017',1001),
(101,'01/02/2017',1001),
(101,'01/03/2017',1001),
(101,'01/04/2017',1001),
(101,'01/05/2017',1001),
(101,'01/06/2017',1001),
(101,'01/07/2017',1001),
(101,'01/08/2017',1001),
(101,'01/09/2017',1001),
(101,'01/10/2017',1001),
(101,'01/11/2017',1001),
(101,'01/12/2017',1001),
(101,'01/13/2017',1001),
(101,'01/14/2017',1001),
(101,'01/15/2017',1001),
(101,'01/16/2017',1001),
(101,'01/17/2017',1001),
(101,'01/18/2017',1001),
(101,'01/19/2017',1001),
(101,'01/20/2017',1001),
(101,'01/21/2017',1001),
(101,'01/22/2017',1001),
(101,'01/23/2017',1001),
(101,'01/24/2017',1001),
(101,'01/25/2017',1001),
(101,'01/26/2017',1001),
(101,'01/27/2017',1001),
(101,'01/28/2017',1001),
(101,'01/29/2017',1001),
(101,'01/30/2017',1001),
(101,'01/31/2017',1001);
DECLARE @ROOM_NUMBER INT
DECLARE @START_DATE DATE
DECLARE @END_DATE DATE
DECLARE @WATER_LEAKAGE_PERCENT INT
-- Cursor over each date range in the #WATER_LEAKAGE_PER table, used to calculate the water wasted per day in that range
DECLARE WATER_LEAKAGE_PER_CURSOR CURSOR FOR
SELECT ROOM_NUMBER,START_DATE,END_DATE,WATER_LEAKAGE_PERCENT FROM #WATER_LEAKAGE_PER
OPEN WATER_LEAKAGE_PER_CURSOR
FETCH NEXT FROM WATER_LEAKAGE_PER_CURSOR
INTO @ROOM_NUMBER, @START_DATE, @END_DATE, @WATER_LEAKAGE_PERCENT
WHILE @@FETCH_STATUS = 0
BEGIN
    DECLARE @TOTAL_WATER_USED_FOR_DATE_RANGE INT=0;
    DECLARE @NUMBER_OF_DAYS INT=0;
    DECLARE @LEAKAGE_PER_DAY_IN_LITER INT=0;
    -- Total liters of water used for one date range
    SELECT @TOTAL_WATER_USED_FOR_DATE_RANGE = SUM(WATER_CONSUMPTION_LITER), @NUMBER_OF_DAYS = COUNT(1) FROM #WATER_CONSUMPTION WHERE ROOM_NUMBER=@ROOM_NUMBER AND UDAY BETWEEN @START_DATE AND @END_DATE;
    -- Liters of water leakage per day for the date range selected by the cursor
    -- (NULLIF guards against dividing by zero when a range has no consumption rows)
    SELECT @LEAKAGE_PER_DAY_IN_LITER = ((@TOTAL_WATER_USED_FOR_DATE_RANGE*@WATER_LEAKAGE_PERCENT)/100)/NULLIF(@NUMBER_OF_DAYS, 0);
    UPDATE #WATER_LEAKAGE_PER SET LEAKAGE_PER_DAY_IN_LITER = @LEAKAGE_PER_DAY_IN_LITER WHERE ROOM_NUMBER=@ROOM_NUMBER AND START_DATE = @START_DATE AND END_DATE=@END_DATE AND WATER_LEAKAGE_PERCENT=@WATER_LEAKAGE_PERCENT;
    -- Generate one row per date in the range with its leakage; this is used for the actual calculation of water leakage in the requested date range.
    ;WITH n AS
    (
        SELECT TOP (DATEDIFF(DAY, @START_DATE, @END_DATE) + 1)
            n = ROW_NUMBER() OVER (ORDER BY [object_id])
        FROM sys.all_objects
    )
    INSERT INTO #DAY_WISE_LEAKAGE SELECT @ROOM_NUMBER, DATEADD(DAY, n-1, @START_DATE), @LEAKAGE_PER_DAY_IN_LITER
    FROM n;
    FETCH NEXT FROM WATER_LEAKAGE_PER_CURSOR
    INTO @ROOM_NUMBER, @START_DATE, @END_DATE, @WATER_LEAKAGE_PERCENT
END
CLOSE WATER_LEAKAGE_PER_CURSOR;
DEALLOCATE WATER_LEAKAGE_PER_CURSOR;
-- Total liters of water leakage per room number within the requested date range
SELECT ROOM_NUMBER,SUM(LEAKAGE_IN_LITER) FROM #DAY_WISE_LEAKAGE WHERE LDATE BETWEEN @START_DATE_PARAM AND @END_DATE_PARAM GROUP BY ROOM_NUMBER;
DROP TABLE #WATER_CONSUMPTION;
DROP TABLE #WATER_LEAKAGE_PER;
DROP TABLE #DAY_WISE_LEAKAGE
END
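For comparison, the range join from the question can be turned into a single aggregate query: match each daily consumption row to every leakage range that covers that day, and sum consumption times percent. This is a sketch under assumptions, not the poster's final method. It applies the percentage per day rather than averaging over the range, it adds up overlapping ranges, and it runs on SQLite through Python's sqlite3 module. The lowercase table and column names, the function leaked_liters, and the query dates are chosen only for the demo; the rows are the first sample data set.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE water_consumption (room_number INTEGER, uday TEXT, liters INTEGER);
    CREATE TABLE water_leakage_per (
        room_number INTEGER, start_date TEXT, end_date TEXT, leakage_percent INTEGER
    );
    INSERT INTO water_leakage_per VALUES
        (101, '2017-01-01', '2017-01-02', 5),
        (102, '2017-01-01', '2017-01-05', 10),
        (101, '2017-01-04', '2017-02-06', 10);
    """
)
# 100 liters per day: room 101 for Jan 1-6, room 102 for Jan 1-5
days = [(101, f"2017-01-{d:02d}") for d in range(1, 7)]
days += [(102, f"2017-01-{d:02d}") for d in range(1, 6)]
conn.executemany("INSERT INTO water_consumption VALUES (?, ?, 100)", days)

LEAK_SQL = """
SELECT SUM(c.liters * l.leakage_percent / 100.0)
FROM water_consumption AS c
JOIN water_leakage_per AS l
  ON l.room_number = c.room_number
 AND c.uday BETWEEN l.start_date AND l.end_date
WHERE c.room_number = :room
  AND c.uday BETWEEN :start AND :end
"""

def leaked_liters(room, start, end):
    """Total leaked liters for one room between two ISO dates (inclusive)."""
    params = {"room": room, "start": start, "end": end}
    total = conn.execute(LEAK_SQL, params).fetchone()[0]
    return total if total is not None else 0.0

print(leaked_liters(101, "2017-01-01", "2017-01-06"))
```

For room 101 over 1 to 6 January, two days leak 5 liters each and three days leak 10 liters each, while 3 January falls in no leakage range and contributes nothing.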

Create a random selection weighted on number of points, SQL

I have a table of winners for a prize draw, where each winner has earned a number of points over the year. There are 1300 registered users, with points varying between 50 and 43,000. I need to be able to select a random winner, which is straight forward, but the challenge I am having is building the logic where each point counts as an entry ticket into the prize draw. Would appreciate any help.
John
Your script would look something similar to this:
Script 1 :
DECLARE @Name varchar(100),
        @Points int,
        @i int
DECLARE Cursor1 CURSOR FOR SELECT Name, Points FROM Table1
OPEN Cursor1
FETCH NEXT FROM Cursor1
INTO @Name, @Points
WHILE @@FETCH_STATUS = 0
BEGIN
    SET @i = 0
    -- insert the name once per point
    WHILE @i < @Points
    BEGIN
        INSERT INTO Table2 (Name)
        VALUES (@Name)
        SET @i = @i + 1
    END
    FETCH NEXT FROM Cursor1 INTO @Name, @Points
END
CLOSE Cursor1
DEALLOCATE Cursor1
I have created a table (Table1) with only a Name and Points column (varchar(100) and int), I have created a cursor in order to look through all the records within Table1 and then loop through the Points and then inserted each record into another table (Table2).
This then imports the Name depending on the Points column.
Script 2 :
DECLARE @Name varchar(100),
        @Points int,
        @i int,
        @Count int
CREATE TABLE #temptable(
    UserEmailID nvarchar(200),
    Points int)
DECLARE Cursor1 CURSOR FOR SELECT UserEmailID, Points FROM Table1_TEST
OPEN Cursor1
FETCH NEXT FROM Cursor1
INTO @Name, @Points
WHILE @@FETCH_STATUS = 0
BEGIN
    SET @i = 0
    -- insert the user once per point
    WHILE @i < @Points
    BEGIN
        INSERT INTO #temptable (UserEmailID, Points)
        VALUES (@Name, @Points)
        SET @i = @i + 1
    END
    FETCH NEXT FROM Cursor1 INTO @Name, @Points
END
CLOSE Cursor1
DEALLOCATE Cursor1
SELECT * FROM #temptable
DROP TABLE #temptable
In Script2 I have imported the result into a TEMP table as requested.
The script now runs through each record in your Table1 and inserts the individual's UserEmailID and Points into the TEMP table once per point that the individual has in Table1.
So if John has a total of 3 points and Sarah has 2, the script will insert John's UserEmailID 3 times into the TEMP table and Sarah's 2 times.
If you apply the random selector to the TEMP table, it will then randomly select an individual.
John stands a better chance of winning because he has 3 records in the TEMP table whereas Sarah only has 2.
Suppose John's UserEmailID is 1 and Sarah's is 2:
The OUTPUT of TEMP table would then be:
UserEmailID | Points
1 | 3
1 | 3
1 | 3
2 | 2
2 | 2
Please let me know if you need any clarity.
Hope this helps.
You can do a weighted draw using the following method:
Calculate the cumulative sum of points.
Divide by the total number of points to get a value between 0 and 1
Each row in the original data will have a range, such as [0, 0.1), [0.1, 0.3), [0.3, 1]
Calculate a random number and choose the row where the value falls in the range
Here is standard'ish SQL for this approach:
with u as (
    select u.*,
           rangeend - points * 1.0 / totalpoints as rangestart
    from (select u.*,
                 sum(points * 1.0) over (order by points rows between unbounded preceding and current row)
                     / sum(points) over () as rangeend,
                 sum(points) over () as totalpoints
          from users u
         ) u
),
r as (
    select random() as rand
)
select u.*
from u cross join r
where r.rand >= u.rangestart and r.rand < u.rangeend;
In addition to using window functions (which can be handled by correlated subqueries in many cases), the exact format depends on whether the random number generator is deterministic for a query (such as SQL Server, where rand() returns one value no matter how often it is called in a query) or non-deterministic (such as in other databases). This method only requires one value from the random number generator, so it will work with either kind.
So you want a winner with 1000 points to have double the chance of another with only 500 points.
Sort the winners by whatever order and create a running total for the points:
id points
winner1 100
winner2 50
winner3 150
gives:
id points from to
winner1 100 1 100
winner2 50 101 150
winner3 150 151 300
Then compare with a random number from 1 to sum(points), in the example a number between 1 and 300. Find the winner with that number range and you're done.
select winpoints.id_winner
from
(
select
id as id_winner,
coalesce(sum(points) over(order by id rows between unbounded preceding and 1 preceding), 0) + 1 as from_points,
sum(points) over(order by id rows between unbounded preceding and current row) as to_points
from winners
) winpoints
where (select floor(rand() * sum(points)) + 1 from winners)
between winpoints.from_points and winpoints.to_points;
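The from/to range idea is easy to check outside the database. The sketch below keeps only the upper bound of each range (the running total) and uses a binary search to find the winner for a given ticket number. The function names are made up for the illustration, and the data is the three-winner example above.

```python
import bisect
import itertools
import random

def build_upper_bounds(points):
    """Running totals: the 'to' value of each winner's ticket range."""
    return list(itertools.accumulate(points))

def winner_for_ticket(ids, upper_bounds, ticket):
    """Return the id whose range contains the 1-based ticket number."""
    # bisect_left finds the first running total that is >= ticket
    index = bisect.bisect_left(upper_bounds, ticket)
    return ids[index]

def draw_winner(ids, points, rng=random):
    """Pick one id at random, weighted by points (one ticket per point)."""
    upper_bounds = build_upper_bounds(points)
    ticket = rng.randint(1, upper_bounds[-1])  # inclusive on both ends
    return winner_for_ticket(ids, upper_bounds, ticket)

ids = ["winner1", "winner2", "winner3"]
points = [100, 50, 150]
bounds = build_upper_bounds(points)

print(bounds)
print(winner_for_ticket(ids, bounds, 101))
print(draw_winner(ids, points))
```

Tickets 1 to 100 map to winner1, 101 to 150 to winner2, and 151 to 300 to winner3, matching the from/to table. An entry with zero points gets an empty range and is never picked.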
This solution also works with fractional points/weights. It creates a helper table usersum.
create table user (id int primary key, points float);
insert into user values (1, 0.5), (2, 0), (3, 1);
create table usersum (id int primary key, pointsum float);
insert into usersum
select id, (select sum(points) from user b where b.id <= a.id)
from user a;
set @r = rand() * (select max(pointsum) from usersum);
select @r, usersum.* from usersum where pointsum >= @r order by id limit 1;
http://sqlfiddle.com/#!2/ae539e/1

Recursive SQL- How can I get this table with a running total?

ID debit credit sum_debit
---------------------------------
1 150 0 150
2 100 0 250
3 0 50 200
4 0 100 100
5 50 0 150
I have this table. My problem is how to get the sum_debit column, which is the previous row's sum_debit plus debit minus credit (sum_debit = sum_debit + debit - credit).
For each new row I enter either a debit value with credit set to zero, or a credit value with debit set to zero. How do I get sum_debit?
In SQL-Server 2012, you can use the newly added ROWS or RANGE clause:
SELECT
ID, debit, credit,
sum_debit =
SUM(debit - credit)
OVER (ORDER BY ID
ROWS BETWEEN UNBOUNDED PRECEDING
AND CURRENT ROW
)
FROM
CreditData
ORDER BY
ID ;
Tested in SQL-Fiddle
We could just use OVER(ORDER BY ID) there and the result would be the same. But then the default would be used, which is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW and there are efficiency differences (ROWS should be preferred with running totals.)
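The difference between the two frames only shows up in the results when the ORDER BY column has ties. The sketch below, run on SQLite through Python's sqlite3 module as a stand-in for SQL Server, first computes the running total for the question's data with both frames, then shows how the default RANGE frame treats tied rows as peers. The ties table is made up for the demo, and the efficiency point from the answer is not something this sketch measures.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CreditData (ID INTEGER PRIMARY KEY, debit INTEGER, credit INTEGER)"
)
conn.executemany(
    "INSERT INTO CreditData VALUES (?, ?, ?)",
    [(1, 150, 0), (2, 100, 0), (3, 0, 50), (4, 0, 100), (5, 50, 0)],
)

def column(sql):
    """Run a one-column query and return the values as a list."""
    return [row[0] for row in conn.execute(sql)]

# Explicit ROWS frame, as in the answer
rows_frame = column(
    """SELECT SUM(debit - credit) OVER (
           ORDER BY ID ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
       FROM CreditData ORDER BY ID"""
)
# Default frame (RANGE ... CURRENT ROW): same result because ID is unique
default_frame = column(
    "SELECT SUM(debit - credit) OVER (ORDER BY ID) FROM CreditData ORDER BY ID"
)

# With ties in the ORDER BY column, RANGE includes all peer rows at once
conn.execute("CREATE TABLE ties (grp INTEGER, amount INTEGER)")
conn.executemany("INSERT INTO ties VALUES (?, ?)", [(1, 10), (1, 20), (2, 5)])
range_with_ties = column(
    "SELECT SUM(amount) OVER (ORDER BY grp) FROM ties ORDER BY grp"
)

print(rows_frame)
print(default_frame)
print(range_with_ties)
```

Both frames reproduce the sum_debit column from the question. In the tie example the two rows with grp = 1 both show 30, because RANGE adds every peer of the current row.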
There is a great article by @Aaron Bertrand that has a thorough test of various methods to calculate a running total: Best approaches for running totals – updated for SQL Server 2012
For previous versions of SQL-Server, you'll have to use some other method, like a self-join, a recursive CTE or a cursor. Here is a cursor solution, blindly copied from Aaron's blog, with tables and columns adjusted to your problem:
DECLARE @cd TABLE
( [ID] int PRIMARY KEY,
  [debit] int,
  [credit] int,
  [sum_debit] int
);
DECLARE
    @ID INT,
    @debit INT,
    @credit INT,
    @RunningTotal INT = 0 ;
DECLARE c CURSOR
    LOCAL STATIC FORWARD_ONLY READ_ONLY
    FOR
    SELECT ID, debit, credit
    FROM CreditData
    ORDER BY ID ;
OPEN c ;
FETCH NEXT FROM c INTO @ID, @debit, @credit ;
WHILE @@FETCH_STATUS = 0
BEGIN
    -- carry the running total forward row by row
    SET @RunningTotal = @RunningTotal + (@debit - @credit) ;
    INSERT @cd (ID, debit, credit, sum_debit )
    SELECT @ID, @debit, @credit, @RunningTotal ;
    FETCH NEXT FROM c INTO @ID, @debit, @credit ;
END
CLOSE c;
DEALLOCATE c;
SELECT ID, debit, credit, sum_debit
FROM @cd
ORDER BY ID ;
Tested in SQL-Fiddle-cursor
Assuming "have" is your data table, this should be an ANSI SQL solution:
select h.*, sum(i.debit) as debsum, sum(i.credit) as credsum, sum(i.debit) - sum(i.credit) as rolling_sum
from have h inner join have i
on h.id >= i.id
group by h.id, h.debit, h.credit
order by h.id
In general, the solution is to join the row to all rows preceding the row, and extract the sum of those rows, then group by everything to get back to one row per what you expect. Like this question for example.