Rank by Date and Qty in SQL - sql

I have a table called SALES, with Person, Date and Qty.
Person Date Qty
Jim 2016-08-01 1
Jim 2016-08-02 3
Jim 2016-08-03 1
Bob 2016-08-01 5
Bob 2016-08-02 1
Bob 2016-08-03 6
Sheila 2016-08-01 4
Sheila 2016-08-02 0
Sheila 2016-08-03 2
I'm looking to rank by qty for each date, with the following output:
Person Date Qty Rank
Bob 2016-08-01 5 1
Sheila 2016-08-01 4 2
Jim 2016-08-01 1 3
Jim 2016-08-02 3 1
Bob 2016-08-02 1 2
Sheila 2016-08-02 0 3
Bob 2016-08-03 6 1
Sheila 2016-08-03 2 2
Jim 2016-08-03 1 3
How do I use the Rank Function here?

Try this
DECLARE #SALES TABLE(
Person NVARCHAR(50),
[Date] NVARCHAR(50),
Qty INT
)
INSERT INTO #SALES VALUES('Jim','2016-08-01',1)
INSERT INTO #SALES VALUES('Jim','2016-08-02',3)
INSERT INTO #SALES VALUES('Jim','2016-08-03',1)
INSERT INTO #SALES VALUES('Bob','2016-08-01',5)
INSERT INTO #SALES VALUES('Bob','2016-08-02',1)
INSERT INTO #SALES VALUES('Bob','2016-08-03',6)
INSERT INTO #SALES VALUES('Sheila','2016-08-01',4)
INSERT INTO #SALES VALUES('Sheila','2016-08-02',0)
INSERT INTO #SALES VALUES('Sheila','2016-08-03',2)
SELECT * ,
ROW_NUMBER() OVER( PARTITION BY [Date]
ORDER BY Qty DESC) N'Rank'
FROM #SALES
GROUP BY [Date], Person, Qty
ORDER BY [Date] ASC,Qty DESC
Result
==============================================
Update: select top 2 for each day
SELECT *
FROM(
SELECT * ,
ROW_NUMBER() OVER( PARTITION BY [Date]
ORDER BY Qty DESC) N'Rank'
FROM #SALES
GROUP BY [Date], Person, Qty
) AS NewTable
WHERE NewTable.Rank < 3
Result
NOTE:
You can't use WHERE Rank < 3 directly, because WHERE clause is before SELECT statement, WHERE clause can't recognize Rank column. You have to use subquery.

Related

SQL server group / partition to condense history table

Got a table of dates someone was in a particular category like this:
drop table if exists #category
create table #category (personid int, categoryid int, startdate datetime, enddate datetime)
insert into #category
select * from
(
select 1 Personid, 1 CategoryID, '01/04/2010' StartDate, '31/07/2016' EndDate union
select 1 Personid, 5 CategoryID, '07/08/2016' StartDate, '31/03/2019' EndDate union
select 1 Personid, 5 CategoryID, '01/04/2019' StartDate, '01/04/2019' EndDate union
select 1 Personid, 5 CategoryID, '02/04/2019' StartDate, '11/08/2019' EndDate union
select 1 Personid, 4 CategoryID, '12/08/2019' StartDate, '03/11/2019' EndDate union
select 1 Personid, 5 CategoryID, '04/11/2019' StartDate, '22/03/2020' EndDate union
select 1 Personid, 5 CategoryID, '23/03/2020' StartDate, NULL EndDate union
select 2 Personid, 1 CategoryID, '01/04/2010' StartDate, '09/04/2015' EndDate union
select 2 Personid, 4 CategoryID, '10/04/2015' StartDate, '31/03/2018' EndDate union
select 2 Personid, 4 CategoryID, '01/04/2018' StartDate, '31/03/2019' EndDate union
select 2 Personid, 4 CategoryID, '01/04/2019' StartDate, '23/06/2019' EndDate union
select 2 Personid, 4 CategoryID, '24/06/2019' StartDate, NULL EndDate
) x
order by personid, startdate
I'm trying to condense it so I get this:
PersonID
categoryid
startdate
EndDate
1
1
01/04/2010
31/07/2016
1
5
07/08/2016
11/08/2019
1
4
12/08/2019
03/11/2019
1
5
04/11/2019
NULL
2
1
01/04/2010
09/04/2015
2
4
01/04/2015
NULL
I'm having issues with people like personid 1 where they are in (e.g.) category 5, then go into category 4 and them back into category 5.
So doing something like:
select
personid,
categoryid,
min(startdate) startdate,
max(enddate) enddate
from #category
group by
personid, categoryid
gives me the earliest date from category 5's first period, and the latest date from the second period - and means it creates an overlapping period.
So I tried partitioning it with a rownum or rank, but it still does the same thing - i.e. treats the 'category 5's as the same group:
select
rank() over (partition by personid, categoryid order by personid, startdate) rank,
c.*
from #category c
order by personid, startdate
rank
personid
categoryid
startdate
enddate
1
1
1
2010-04-01 00:00:00.000
2016-07-31 00:00:00.000
1
1
5
2016-08-07 00:00:00.000
2019-03-31 00:00:00.000
2
1
5
2019-04-01 00:00:00.000
2019-04-01 00:00:00.000
3
1
5
2019-04-02 00:00:00.000
2019-08-11 00:00:00.000
1
1
4
2019-08-12 00:00:00.000
2019-11-03 00:00:00.000
4
1
5
2019-11-04 00:00:00.000
2020-03-22 00:00:00.000
5
1
5
2020-03-23 00:00:00.000
NULL
1
2
1
2010-04-01 00:00:00.000
2015-04-09 00:00:00.000
1
2
4
2015-04-10 00:00:00.000
2018-03-31 00:00:00.000
2
2
4
2018-04-01 00:00:00.000
2019-03-31 00:00:00.000
3
2
4
2019-04-01 00:00:00.000
2019-06-23 00:00:00.000
4
2
4
2019-06-24 00:00:00.000
NULL
You can see in the rank column that the category 5's start off 1,2,3, miss a row and carry on 4, 5 so obvs in the same partition - I thought that adding the order by clause would force it to start a new partition when the category changed from 5 to 4 and back again.
Any thoughts?
This is a type of gaps and islands problem. However, if your data tiles perfectly (no gaps) as it does in your example data, then you can do this without any aggregation at all -- which should be the most efficient method:
select personid, categoryid, startdate,
dateadd(day, -1, lead(startdate) over (partition by personid order by startdate)) as enddate
from (select c.*,
lag(categoryid) over (partition by personid order by startdate) as prev_categoryid
from #category c
) c
where prev_categoryid is null or prev_categoryid <> categoryid;
The where clause only selects the rows where the category changes. The lead() then gets the next start date -- and subtracts 1 for your desired enddate.

Date range with minimum and maximum dates from dataset having records with continuous date range

I have a dataset with id ,Status and date range of employees.
The input dataset given below are the details of one employee.
The date ranges in the records are continuous(in exact order) such that startdate of second row will be the next date of enddate of first row.
If an employee takes leave continuously for different months, then the table is storing the info with date range as separated for different months.
For example: In the input set, the employee has taken Sick leave from '16-10-2016' to '31-12-2016' and joined back on '1-1-2017'.
So there are 3 records for this item but the dates are continuous.
In the output I need this as one record as shown in the expected output dataset.
INPUT
Id Status StartDate EndDate
1 Active 1-9-2007 15-10-2016
1 Sick 16-10-2016 31-10-2016
1 Sick 1-11-2016 30-11-2016
1 Sick 1-12-2016 31-12-2016
1 Active 1-1-2017 4-2-2017
1 Unpaid 5-2-2017 9-2-2017
1 Active 10-2-2017 11-2-2017
1 Unpaid 12-2-2017 28-2-2017
1 Unpaid 1-3-2017 31-3-2017
1 Unpaid 1-4-2017 30-4-2017
1 Active 1-5-2017 13-10-2017
1 Sick 14-10-2017 11-11-2017
1 Active 12-11-2017 NULL
EXPECTED OUTPUT
Id Status StartDate EndDate
1 Active 1-9-2007 15-10-2016
1 Sick 16-10-2016 31-12-2016
1 Active 1-1-2017 4-2-2017
1 Unpaid 5-2-2017 9-2-2017
1 Active 10-2-2017 11-2-2017
1 Unpaid 12-2-2017 30-4-2017
1 Active 1-5-2017 13-10-2017
1 Sick 14-10-2017 11-11-2017
1 Active 12-11-2017 NULL
I can't take min(startdate) and max(EndDate) group by id,status because if the same employee has taken another Sick leave then that end date ('11-11-2017' in the example) will come as the End date.
can anyone help me with the query in SQL server 2014?
It suddenly hit me that this is basically a gaps and islands problem - so I've completely changed my solution.
For this solution to work, the dates does not have to be consecutive.
First, create and populate sample table (Please save us this step in your future questions):
DECLARE #T AS TABLE
(
Id int,
Status varchar(10),
StartDate date,
EndDate date
);
SET DATEFORMAT DMY; -- This is needed because how you specified your dates.
INSERT INTO #T (Id, Status, StartDate, EndDate) VALUES
(1, 'Active', '1-9-2007', '15-10-2016'),
(1, 'Sick', '16-10-2016', '31-10-2016'),
(1, 'Sick', '1-11-2016', '30-11-2016'),
(1, 'Sick', '1-12-2016', '31-12-2016'),
(1, 'Active', '1-1-2017', '4-2-2017'),
(1, 'Unpaid', '5-2-2017', '9-2-2017'),
(1, 'Active', '10-2-2017', '11-2-2017'),
(1, 'Unpaid', '12-2-2017', '28-2-2017'),
(1, 'Unpaid', '1-3-2017', '31-3-2017'),
(1, 'Unpaid', '1-4-2017', '30-4-2017'),
(1, 'Active', '1-5-2017', '13-10-2017'),
(1, 'Sick', '14-10-2017', '11-11-2017'),
(1, 'Active', '12-11-2017', NULL);
The (new) common table expression:
;WITH CTE AS
(
SELECT Id,
Status,
StartDate,
EndDate,
ROW_NUMBER() OVER(PARTITION BY Id ORDER BY StartDate)
- ROW_NUMBER() OVER(PARTITION BY Id, Status ORDER BY StartDate) As IslandId,
ROW_NUMBER() OVER(PARTITION BY Id ORDER BY StartDate DESC)
- ROW_NUMBER() OVER(PARTITION BY Id, Status ORDER BY StartDate DESC) As ReverseIslandId
FROM #T
)
The (new) query:
SELECT DISTINCT Id,
Status,
MIN(StartDate) OVER(PARTITION BY IslandId, ReverseIslandId) As StartDate,
NULLIF(MAX(ISNULL(EndDate, '9999-12-31')) OVER(PARTITION BY IslandId, ReverseIslandId), '9999-12-31') As EndDate
FROM CTE
ORDER BY StartDate
(new) Results:
Id Status StartDate EndDate
1 Active 01.09.2007 15.10.2016
1 Sick 16.10.2016 31.12.2016
1 Active 01.01.2017 04.02.2017
1 Unpaid 05.02.2017 09.02.2017
1 Active 10.02.2017 11.02.2017
1 Unpaid 12.02.2017 30.04.2017
1 Active 01.05.2017 13.10.2017
1 Sick 14.10.2017 11.11.2017
1 Active 12.11.2017 NULL
You can see a live demo on rextester.
Please note that string representation of dates in SQL should be acccording to ISO 8601 - meaning either yyyy-MM-dd or yyyyMMdd as it's unambiguous and will always be interpreted correctly by SQL Server.
It's an example of GROUPING AND WINDOW.
First you set a reset point for each Status
Sum to set a group
Then get max/min dates of each group.
;with x as
(
select Id, Status, StartDate, EndDate,
iif (lag(Status) over (order by Id, StartDate) = Status, null, 1) rst
from emp
), y as
(
select Id, Status, StartDate, EndDate,
sum(rst) over (order by Id, StartDate) grp
from x
)
select Id,
MIN(Status) as Status,
MIN(StartDate) StartDate,
MAX(EndDate) EndDate
from y
group by Id, grp
order by Id, grp
GO
Id | Status | StartDate | EndDate
-: | :----- | :------------------ | :------------------
1 | Active | 01/09/2007 00:00:00 | 15/10/2016 00:00:00
1 | Sick | 16/10/2016 00:00:00 | 31/12/2016 00:00:00
1 | Active | 01/01/2017 00:00:00 | 04/02/2017 00:00:00
1 | Unpaid | 05/02/2017 00:00:00 | 09/02/2017 00:00:00
1 | Active | 10/02/2017 00:00:00 | 11/02/2017 00:00:00
1 | Unpaid | 12/02/2017 00:00:00 | 30/04/2017 00:00:00
1 | Active | 01/05/2017 00:00:00 | 13/10/2017 00:00:00
1 | Sick | 14/10/2017 00:00:00 | 11/11/2017 00:00:00
1 | Active | 12/11/2017 00:00:00 | null
dbfiddle here
Here's an alternative answer that doesn't use LAG.
First I need to take a copy of your test data:
DECLARE #table TABLE (Id INT, [Status] VARCHAR(50), StartDate DATE, EndDate DATE);
INSERT INTO #table SELECT 1, 'Active', '20070901', '20161015';
INSERT INTO #table SELECT 1, 'Sick', '20161016', '20161031';
INSERT INTO #table SELECT 1, 'Sick', '20161101', '20161130';
INSERT INTO #table SELECT 1, 'Sick', '20161201', '20161231';
INSERT INTO #table SELECT 1, 'Active', '20170101', '20170204';
INSERT INTO #table SELECT 1, 'Unpaid', '20170205', '20170209';
INSERT INTO #table SELECT 1, 'Active', '20170210', '20170211';
INSERT INTO #table SELECT 1, 'Unpaid', '20170212', '20170228';
INSERT INTO #table SELECT 1, 'Unpaid', '20170301', '20170331';
INSERT INTO #table SELECT 1, 'Unpaid', '20170401', '20170430';
INSERT INTO #table SELECT 1, 'Active', '20170501', '20171013';
INSERT INTO #table SELECT 1, 'Sick', '20171014', '20171111';
INSERT INTO #table SELECT 1, 'Active', '20171112', NULL;
Then the query is:
WITH add_order AS (
SELECT
*,
ROW_NUMBER() OVER (ORDER BY StartDate) AS order_id
FROM
#table),
links AS (
SELECT
a1.Id,
a1.[Status],
a1.order_id,
MIN(a1.order_id) AS start_order_id,
MAX(ISNULL(a2.order_id, a1.order_id)) AS end_order_id,
MIN(a1.StartDate) AS StartDate,
MAX(ISNULL(a2.EndDate, a1.EndDate)) AS EndDate
FROM
add_order a1
LEFT JOIN add_order a2 ON a2.Id = a1.Id AND a2.[Status] = a1.[Status] AND a2.order_id = a1.order_id + 1 AND a2.StartDate = DATEADD(DAY, 1, a1.EndDate)
GROUP BY
a1.Id,
a1.[Status],
a1.order_id),
merged AS (
SELECT
l1.Id,
l1.[Status],
l1.[StartDate],
ISNULL(l2.EndDate, l1.EndDate) AS EndDate,
ROW_NUMBER() OVER (PARTITION BY l1.Id, l1.[Status], ISNULL(l2.EndDate, l1.EndDate) ORDER BY l1.order_id) AS link_id
FROM
links l1
LEFT JOIN links l2 ON l2.order_id = l1.end_order_id)
SELECT
Id,
[Status],
StartDate,
EndDate
FROM
merged
WHERE
link_id = 1
ORDER BY
StartDate;
Results are:
Id Status StartDate EndDate
1 Active 2007-09-01 2016-10-15
1 Sick 2016-10-16 2016-12-31
1 Active 2017-01-01 2017-02-04
1 Unpaid 2017-02-05 2017-02-09
1 Active 2017-02-10 2017-02-11
1 Unpaid 2017-02-12 2017-04-30
1 Active 2017-05-01 2017-10-13
1 Sick 2017-10-14 2017-11-11
1 Active 2017-11-12 NULL
How does it work? First I add a sequence number, to assist with merging contiguous rows together. Then I determine the rows that can be merged together, add a number to identify the first row in each set that can be merged, and finally pick the first rows out of the final CTE. Note that I also have to handle rows that can't be merged, hence the LEFT JOINs and ISNULL statements.
Just for interest, this is what the output from the final CTE looks like, before I filter out all but the rows with a link_id of 1:
Id Status StartDate EndDate link_id
1 Active 2007-09-01 2016-10-15 1
1 Sick 2016-10-16 2016-12-31 1
1 Sick 2016-11-01 2016-12-31 2
1 Sick 2016-12-01 2016-12-31 3
1 Active 2017-01-01 2017-02-04 1
1 Unpaid 2017-02-05 2017-02-09 1
1 Active 2017-02-10 2017-02-11 1
1 Unpaid 2017-02-12 2017-04-30 1
1 Unpaid 2017-03-01 2017-04-30 2
1 Unpaid 2017-04-01 2017-04-30 3
1 Active 2017-05-01 2017-10-13 1
1 Sick 2017-10-14 2017-11-11 1
1 Active 2017-11-12 NULL 1
You could use lag() and lead() function together to check the previous and next status
WITH CTE AS
(
select *,
COALESCE(LEAD(status) OVER(ORDER BY (select 1)), '0') Nstatus,
COALESCE(LAG(status) OVER(ORDER BY (select 1)), '0') Pstatus
from table
)
SELECT * FROM CTE
WHERE (status <> Nstatus AND status <> Pstatus) OR
(status <> Pstatus)

Sum of Top N in SQL

I have a SALES table with Person, Date and Qty:
Person Date Qty
Jim 2016-08-01 1
Jim 2016-08-02 3
Jim 2016-08-03 2
Sheila 2016-08-01 1
Sheila 2016-08-02 1
Sheila 2016-08-03 1
Bob 2016-08-03 6
Bob 2016-08-02 2
Bob 2016-08-01 5
I can rank the top 2 by Date with the following code:
/****** Top 2 Salespersons ******/
SELECT *
FROM(
SELECT * ,
ROW_NUMBER() OVER( PARTITION BY [Date]
ORDER BY Qty DESC) N'Rank'
FROM [Coinmarketcap].[dbo].[sales]
GROUP BY [Date], Person, Qty
) AS NewTable
WHERE NewTable.Rank < 3
Person Date Qty Rank
Bob 2016-08-01 5 1
Jim 2016-08-01 1 2
Jim 2016-08-02 3 1
Bob 2016-08-02 2 2
Bob 2016-08-03 6 1
Jim 2016-08-03 2 2
My two questions are:
1) How can I just see the total qty for the top 2 for each date, such as:
Date Total Qty
2016-08-01 6
2016-08-02 5
2016-08-03 8
2) How can I get the total Qty each day for different ranking groups, such as:
Date Ranking Group Total Qty
2018-08-01 1-2 6
2018-08-01 3-4 1
2018-08-01 5-6 0
2018-08-02 1-2 5
2018-08-02 3-4 1
2018-08-02 5-6 0
2018-08-03 1-2 8
2018-08-03 3-4 1
2018-08-03 5-6 0
First:
SELECT NewTable.Date, Sum(NewTable.Qty)
FROM(
SELECT * ,
ROW_NUMBER() OVER( PARTITION BY [Date]
ORDER BY Qty DESC) N'Rank'
FROM [Coinmarketcap].[dbo].[sales]
GROUP BY [Date], Person, Qty
) AS NewTable
WHERE NewTable.Rank < 3
group by NewTable.Date
Second try this:
SELECT NewTable.Date,
Trunc((NewTable.Rank - 1) / 2) * 2 + 1, -- lower rank
Trunc((NewTable.Rank - 1) / 2) * 2 + 2, -- upper rank
Sum(NewTable.Qty)
FROM(
SELECT * ,
ROW_NUMBER() OVER( PARTITION BY [Date]
ORDER BY Qty DESC) N'Rank'
FROM [Coinmarketcap].[dbo].[sales]
GROUP BY [Date], Person, Qty
) AS NewTable
group by NewTable.Date,
Trunc((NewTable.Rank - 1) / 2) * 2 + 1,
Trunc((NewTable.Rank - 1) / 2) * 2 + 2

Sql group by latest repeated field

I don't even know what's a good title for this question.
But I'm having a table:
create table trans
(
[transid] INT IDENTITY (1, 1) NOT NULL,
[customerid] int not null,
[points] decimal(10,2) not null,
[date] datetime not null
)
and records:
--cus1
INSERT INTO trans ( customerid , points , date )
VALUES ( 1, 10, '2016-01-01' ) , ( 1, 20, '2017-02-01' ) , ( 1, 22, '2017-03-01' ) ,
( 1, 24, '2018-02-01' ) , ( 1, 50, '2018-02-25' ) , ( 2, 44, '2016-02-01' ) ,
( 2, 20, '2017-02-01' ) , ( 2, 32, '2017-03-01' ) , ( 2, 15, '2018-02-01' ) ,
( 2, 10, '2018-02-25' ) , ( 3, 10, '2018-02-25' ) , ( 4, 44, '2015-02-01' ) ,
( 4, 20, '2015-03-01' ) , ( 4, 32, '2016-04-01' ) , ( 4, 15, '2016-05-01' ) ,
( 4, 10, '2017-02-25' ) , ( 4, 10, '2018-02-27' ) ,( 4, 20, '2018-02-28' ) ,
( 5, 44, '2015-02-01' ) , ( 5, 20, '2015-03-01' ) , ( 5, 32, '2016-04-01' ) ,
( 5, 15, '2016-05-01' ) ,( 5, 10, '2017-02-25' );
-- selecting the data
select * from trans
Produces:
transid customerid points date
----------- ----------- --------------------------------------- -----------------------
1 1 10.00 2016-01-01 00:00:00.000
2 1 20.00 2017-02-01 00:00:00.000
3 1 22.00 2017-03-01 00:00:00.000
4 1 24.00 2018-02-01 00:00:00.000
5 1 50.00 2018-02-25 00:00:00.000
6 2 44.00 2016-02-01 00:00:00.000
7 2 20.00 2017-02-01 00:00:00.000
8 2 32.00 2017-03-01 00:00:00.000
9 2 15.00 2018-02-01 00:00:00.000
10 2 10.00 2018-02-25 00:00:00.000
11 3 10.00 2018-02-25 00:00:00.000
12 4 44.00 2015-02-01 00:00:00.000
13 4 20.00 2015-03-01 00:00:00.000
14 4 32.00 2016-04-01 00:00:00.000
15 4 15.00 2016-05-01 00:00:00.000
16 4 10.00 2017-02-25 00:00:00.000
17 4 10.00 2018-02-27 00:00:00.000
18 4 20.00 2018-02-28 00:00:00.000
19 5 44.00 2015-02-01 00:00:00.000
20 5 20.00 2015-03-01 00:00:00.000
21 5 32.00 2016-04-01 00:00:00.000
22 5 15.00 2016-05-01 00:00:00.000
23 5 10.00 2017-02-25 00:00:00.000
I'm trying to group all the customerid and sum their points. But here's the catch, If the trans is not active for 1 year(the next tran is 1 year and above), the points will be expired.
For this case:
Points for each customers should be:
Customer1 20+22+24+50
Customer2 20+32+15+10
Customer3 10
Customer4 10+20
Customer5 0
Here's what I have so far:
select
t1.transid as transid1,
t1.customerid as customerid1,
t1.date as date1,
t1.points as points1,
t1.rank1 as rank1,
t2.transid as transid2,
t2.customerid as customerid2,
t2.points as points2,
isnull(t2.date,getUTCDate()) as date2,
isnull(t2.rank2,t1.rank1+1) as rank2,
cast(case when(t1.date > dateadd(year,-1,isnull(t2.date,getUTCDate()))) Then 0 ELSE 1 END as bit) as ShouldExpire
from
(
select transid,CustomerID,Date,points,
RANK() OVER(PARTITION BY CustomerID ORDER BY date ASC) AS RANK1
from trans
)t1
left join
(
select transid,CustomerID,Date,points,
RANK() OVER(PARTITION BY CustomerID ORDER BY date ASC) AS RANK2
from trans
)t2 on t1.RANK1=t2.RANK2-1
and t1.customerid=t2.customerid
which gives
from the above table,how do I check for ShouldExpire field having max(rank1) for customer, if it's 1, then totalpoints will be 0, otherwise,sum all the consecutive 0's until there are no more records or a 1 is met?
Or is there a better approach to this problem?
The following query uses LEAD to get the date of the next record withing the same CustomerID slice:
;WITH CTE AS (
SELECT transid, CustomerID, [Date], points,
LEAD([Date]) OVER (PARTITION BY CustomerID
ORDER BY date ASC) AS nextDate,
CASE
WHEN [date] > DATEADD(YEAR,
-1,
-- same LEAD() here as above
ISNULL(LEAD([Date]) OVER (PARTITION BY CustomerID
ORDER BY date ASC),
getUTCDate()))
THEN 0
ELSE 1
END AS ShouldExpire
FROM trans
)
SELECT transid, CustomerID, [Date], points, nextDate, ShouldExpire
FROM CTE
ORDER BY CustomerID, [Date]
Output:
transid CustomerID Date points nextDate ShouldExpire
-------------------------------------------------------------
1 1 2016-01-01 10.00 2017-02-01 1 <-- last exp. for 1
2 1 2017-02-01 20.00 2017-03-01 0
3 1 2017-03-01 22.00 2018-02-01 0
4 1 2018-02-01 24.00 2018-02-25 0
5 1 2018-02-25 50.00 NULL 0
6 2 2016-02-01 44.00 2017-02-01 1 <-- last exp. for 2
7 2 2017-02-01 20.00 2017-03-01 0
8 2 2017-03-01 32.00 2018-02-01 0
9 2 2018-02-01 15.00 2018-02-25 0
10 2 2018-02-25 10.00 NULL 0
11 3 2018-02-25 10.00 NULL 0 <-- no exp. for 3
12 4 2015-02-01 44.00 2015-03-01 0
13 4 2015-03-01 20.00 2016-04-01 1
14 4 2016-04-01 32.00 2016-05-01 0
15 4 2016-05-01 15.00 2017-02-25 0
16 4 2017-02-25 10.00 2018-02-27 1 <-- last exp. for 4
17 4 2018-02-27 10.00 2018-02-28 0
18 4 2018-02-28 20.00 NULL 0
19 5 2015-02-01 44.00 2015-03-01 0
20 5 2015-03-01 20.00 2016-04-01 1
21 5 2016-04-01 32.00 2016-05-01 0
22 5 2016-05-01 15.00 2017-02-25 0
23 5 2017-02-25 10.00 NULL 1 <-- last exp. for 5
Now, you seem to want to calculate the sum of points after the last expiration.
Using the above CTE as a basis you can achieve the required result with:
;WITH CTE AS (
... above query here ...
)
SELECT CustomerID,
SUM(CASE WHEN rnk = 0 THEN points ELSE 0 END) AS sumOfPoints
FROM (
SELECT transid, CustomerID, [Date], points, nextDate, ShouldExpire,
SUM(ShouldExpire) OVER (PARTITION BY CustomerID ORDER BY [Date] DESC) AS rnk
FROM CTE
) AS t
GROUP BY CustomerID
Output:
CustomerID sumOfPoints
-----------------------
1 116.00
2 77.00
3 10.00
4 30.00
5 0.00
Demo here
The tricky part here is to dump all points when they expire, and start accumulating them again. I assumed that if there was only one transaction that we don't expire the points until there's a new transaction, even if that first transaction was over a year ago now?
I also get a different answer for customer #5, as they do appear to have a "transaction chain" that hasn't expired?
Here's my query:
WITH ordered AS (
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY customerid ORDER BY [date]) AS order_id
FROM
trans),
max_transid AS (
SELECT
customerid,
MAX(transid) AS max_transid
FROM
trans
GROUP BY
customerid),
not_expired AS (
SELECT
t1.customerid,
t1.points,
t1.[date] AS t1_date,
CASE
WHEN m.customerid IS NOT NULL THEN GETDATE()
ELSE t2.[date]
END AS t2_date
FROM
ordered t1
LEFT JOIN ordered t2 ON t2.customerid = t1.customerid AND t1.transid != t2.transid AND t2.order_id = t1.order_id + 1 AND t1.[date] > DATEADD(YEAR, -1, t2.[date])
LEFT JOIN max_transid m ON m.customerid = t1.customerid AND m.max_transid = t1.transid
),
max_not_expired AS (
SELECT
customerid,
MAX(t1_date) AS max_expired
FROM
not_expired
WHERE
t2_date IS NULL
GROUP BY
customerid)
SELECT
n.customerid,
SUM(n.points) AS points
FROM
not_expired n
LEFT JOIN max_not_expired m ON m.customerid = n.customerid
WHERE
ISNULL(m.max_expired, '19000101') < n.t1_date
GROUP BY
n.customerid;
It could be refactored to be simpler, but I wanted to show the steps to get to the final answer:
customerid points
1 116.00
2 77.00
3 10.00
4 30.00
5 57.00
can you try this:
SELECT customerid,
Sum(t1.points)
FROM trans t1
WHERE NOT EXISTS (SELECT 1
FROM trans t2
WHERE Datediff(year, t1.date, t2.date) >= 1)
GROUP BY t1.customerid
Hope it helps!
try this:
select customerid,Sum(points)
from trans where Datediff(year, date, GETDATE()) < 1
group by customerid
output:
customerid Points
1 - 74.00
2 - 25.00
3 - 10.00
4 - 30.00

Self join with aggregate function or some better way

I am using SQL-Server and have a table of my Purchase orders (stock). But stuck in a query while I was trying to get my All Available stock with its Latest Cost Price and Latest Selling Price.
I made a query it run successfully, but i need some better and optimized way to do this, because it will get slow when table have n number of records.
Query Sample:
SELECT
po.ProductID, sum(po.AvailableQty) as AvailableQty,
(select top 1 po2.CostPrice from Sales_PurchaseOrders po2 where po2.PurchasedAt=max(po.PurchasedAt)) as CostPrice,
(select top 1 po2.SellingPrice from Sales_PurchaseOrders po2 where po2.PurchasedAt=max(po.PurchasedAt)) as SellingPrice
FROM
Sales_PurchaseOrders po
INNER JOIN Sales_Products p on p.ProductID=po.ProductID
GROUP BY po.ProductID
Table Data:
PurchaseOrderID ProductID CostPrice SellingPrice AvailableQty PurchasedAt
--------------- ----------- --------------------------------------- --------------------------------------- --------------------------------------- -----------------------
1 1 90.000000 100.000000 2.000000 2016-07-28 00:00:00.000
2 1 33.580000 50.000000 0.000000 2016-06-28 00:00:00.000
3 2 200.000000 240.000000 15.000000 2016-07-30 00:00:00.000
4 1 50.000000 60.000000 0.000000 2016-08-02 00:00:00.000
5 1 50.000000 60.000000 1.000000 2016-08-03 00:00:00.000
6 1 100.000000 110.000000 6.000000 2016-08-04 00:00:00.000
7 1 25.000000 30.000000 3.000000 2016-08-03 00:00:00.000
8 1 20.000000 30.000000 0.000000 2016-07-30 00:00:00.000
1007 1 100.000000 200.000000 2.000000 2016-09-24 00:00:00.000
Query Result:
ProductID AvailableQty CostPrice SellingPrice
----------- --------------------------------------- --------------------------------------- ---------------------------------------
1 14.000000 100.000000 200.000000
2 15.000000 200.000000 240.000000
May be via some kind of aggregate function, or any other better optimized way to do this.
Thanks,
I think this does what you want:
SELECT po.ProductID, sum(po.AvailableQty) as AvailableQty,
MAX(last_CostPrice), MAX(last_SellingPrice)
FROM (SELECT po.*,
FIRST_VALUE(CostPrice) OVER (PARTITION BY ProductId ORDER BY PurchasedAt DESC) as last_CostPrice,
FIRST_VALUE(SellingPrice) OVER (PARTITION BY ProductId ORDER BY PurchasedAt DESC) as last_SellingPrice
FROM Sales_PurchaseOrders po
) po
GROUP BY po.ProductID;
Notes:
The table Sales_Products seems totally unnecessary for the query.
You probably want the most recent cost and selling price for the product, not for all products.
You can use FIRST_VALUE() in the subquery to get this information.
Dear Mehmood Try this.
;with wt_table
as
(
select ROW_NUMBER() over(partition by po.ProductID order by PurchasedAt desc) as Num,
AvailableQty=sum(po.AvailableQty) over(partition by po.ProductID),
po.ProductID,
po.CostPrice,
po.SellingPrice,
po.PurchasedAt
From #Sales_PurchaseOrders po)
select * from wt_table where Num=1
try this:
with Sales_PurchaseOrders(PurchaseOrderID,ProductID,CostPrice,SellingPrice,AvailableQty,PurchasedAt)AS(
select 1,1,90.000000,100.000000,2.000000,'2016-07-28 00:00:00.000' union all
select 2,1,33.580000,50.000000,0.000000,'2016-06-28 00:00:00.000' union all
select 3,2,200.000000,240.000000,15.000000,'2016-07-30 00:00:00.000' union all
select 4,1,50.000000,60.000000,0.000000,'2016-08-02 00:00:00.000' union all
select 5,1,50.000000,60.000000,1.000000,'2016-08-03 00:00:00.000' union all
select 6,1,100.000000,110.000000,6.000000,'2016-08-04 00:00:00.000' union all
select 7,1,25.000000,30.000000,3.000000,'2016-08-03 00:00:00.000' union all
select 8,1,20.000000,30.000000,0.000000,'2016-07-30 00:00:00.000' union all
select 1007,1,100.000000,200.000000,2.000000,'2016-09-24 00:00:00.000'
)
select * from (
SELECT
po.ProductID, sum(po.AvailableQty)over(partition by po.ProductID) as AvailableQty,CostPrice,SellingPrice,
row_number()over(partition by po.ProductID order by po.PurchasedAt desc) as seq
FROM Sales_PurchaseOrders po
) as t where t.seq=1
ProductID AvailableQty CostPrice SellingPrice seq
1 1 14,000000 100,000000 200,000000 1
2 2 15,000000 200,000000 240,000000 1