How to group data on every change of a value, without merging groups when the value repeats later, in SQL - sql

I have to group data on the basis of the amount column, but if an amount repeats after some interval it should be treated as a new group, e.g.
CREATE TABLE [dbo].[TEST](
[ID] [INT] NULL,
[DLRCODE] [VARCHAR](20) NULL,
[AMN] [DECIMAL](21, 5) NULL,
[RATE] [DECIMAL](7, 5) NULL,
[DTE] [DATETIME] NULL
) ON [NFS_DATA]
-----this should be first group
1 123 10.00000 5.00000 2019-11-01 00:00:00.000
2 123 10.00000 5.00000 2019-11-02 00:00:00.000
3 123 10.00000 5.00000 2019-11-03 00:00:00.000
-----this should be second group
4 123 15.00000 5.00000 2019-11-04 00:00:00.000
-----this should be third group
5 123 10.00000 5.00000 2019-11-05 00:00:00.000
6 123 10.00000 5.00000 2019-11-06 00:00:00.000
-----this should be fourth group
7 123 20.00000 5.00000 2019-11-07 15:02:07.537
As the data above shows, a new group should be created every time the amount changes, even if the new amount has already appeared earlier.
The result should look like this (group number and total amount):
1 30 --- first three records
2 15 --- fourth record
3 20 --- fifth and sixth records
4 20 --- seventh record

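You can do this by combining LAG with a conditional running sum. The windowed SUM below adds 1 whenever the amount differs from the previous row's amount, so Grp numbers the groups starting from 0: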
WITH CTE AS
(
SELECT Id
, DLRCode
, Amn
, Rate
, DTE
, ISNULL(LAG(Amn) OVER(ORDER BY DTE), Amn) As PreviousAmount
FROM dbo.Test
)
SELECT Id
, DLRCode
, Amn
, Rate
, DTE
, SUM(IIF(Amn = PreviousAmount, 0, 1)) OVER(ORDER BY DTE) As Grp
FROM CTE
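To check the logic outside SQL Server, the running-sum idea can be sketched with Python's sqlite3 module. This is only a sketch under assumptions: SQLite (3.25 or newer, for window functions) stands in for SQL Server, the CTE names `flagged` and `grouped` are made up for illustration, and a plain CASE replaces ISNULL/IIF, so the groups are numbered from 1 here.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test (id INTEGER, dlrcode TEXT, amn REAL, rate REAL, dte TEXT)"
)
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?, ?, ?)",
    [
        (1, "123", 10.0, 5.0, "2019-11-01 00:00:00.000"),
        (2, "123", 10.0, 5.0, "2019-11-02 00:00:00.000"),
        (3, "123", 10.0, 5.0, "2019-11-03 00:00:00.000"),
        (4, "123", 15.0, 5.0, "2019-11-04 00:00:00.000"),
        (5, "123", 10.0, 5.0, "2019-11-05 00:00:00.000"),
        (6, "123", 10.0, 5.0, "2019-11-06 00:00:00.000"),
        (7, "123", 20.0, 5.0, "2019-11-07 15:02:07.537"),
    ],
)

query = """
WITH flagged AS (
    -- 1 when the amount differs from the previous row (or there is no previous row)
    SELECT id, amn, dte,
           CASE WHEN amn = LAG(amn) OVER (ORDER BY dte) THEN 0 ELSE 1 END AS is_new_group
    FROM test
),
grouped AS (
    -- the running sum of the flags gives each island its own number
    SELECT id, amn, SUM(is_new_group) OVER (ORDER BY dte) AS grp
    FROM flagged
)
SELECT grp, SUM(amn) AS total, COUNT(*) AS n_rows
FROM grouped
GROUP BY grp
ORDER BY grp
"""
group_totals = conn.execute(query).fetchall()
print(group_totals)
```

Each tuple is (group number, total amount, row count), which corresponds to the four groups asked for in the question.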

To get your result set, you only need lag(), taking both the date and the amount into account:
select t.*
from (select t.*,
lag(amn) over (partition by dlrcode, rate order by dte) as prev_amn,
lag(dte) over (partition by dlrcode, rate order by dte) as prev_dte
from test t
) t
where prev_amn is null or
prev_amn <> amn or
prev_dte < dateadd(day, -1, dte);
If you want to incorporate this into a group id and then summarize the groups -- with information from multiple rows -- then we'll add a group id as the cumulative sum of the group changes and aggregate:
select dlrcode, rate, amn, min(dte) as first_dte, max(dte) as last_dte,
       count(*) as cnt
from (select t.*,
             -- "order by dte" makes this a cumulative sum; without it every row
             -- in the partition would get the same total and hence the same grp
             sum(case when prev_amn = amn and prev_dte >= dateadd(day, -1, dte)
                      then 0 else 1
                 end) over (partition by dlrcode, rate order by dte) as grp
      from (select t.*,
                   lag(amn) over (partition by dlrcode, rate order by dte) as prev_amn,
                   lag(dte) over (partition by dlrcode, rate order by dte) as prev_dte
            from test t
           ) t
     ) t
group by dlrcode, rate, amn, grp;

Related

How to get date counts while considering n consecutive date ranges as single count in SQL?

For example there is some table with dates:
2022-01-01
2022-01-02
2022-01-03
2022-01-06
2022-01-07
2022-01-11
I have to write SQL query, which will return count of dates between date ranges and consider consecutive dates as a single count. So the result will be like:
consider single count if 2 consecutive dates
2022-01-01 1
2022-01-02 1
2022-01-03 2
2022-01-06 3
2022-01-07 3
2022-01-11 4
consider single count if 3 consecutive dates
2022-01-01 1
2022-01-02 1
2022-01-03 1
2022-01-06 2
2022-01-07 3
2022-01-11 4
consider single count if 4 consecutive dates
2022-01-01 1
2022-01-02 2
2022-01-03 3
2022-01-06 4
2022-01-07 4
2022-01-08 4
2022-01-09 4
2022-01-10 5
2022-01-13 6
consider single count if n consecutive dates
n is configurable
The solution first identifies groups of consecutive dates, and then breaks each group into blocks of the required "n" consecutive dates.
The explanation is in the comments. The query is broken up into a few CTEs so that it is easier to examine the values at each stage.
declare @n int = 3;
with cte as
(
    -- find when the date is not consecutive
    -- using lag() to compare with current row
    select [date],
           g = case when dateadd(day, -1, [date])
                         <> lag([date]) over (order by [date])
                    then 1
                    else 0
               end
    from dates
),
cte2 as
(
    -- group consecutive dates together
    -- by summing up g
    select *, grp = sum(g) over (order by [date])
    from cte
),
cte3 as
(
    -- break the group into blocks of @n rows
    select *, rn = (row_number() over (partition by grp order by [date]) - 1) / @n
    from cte2
)
-- dense_rank() to number it
select *, [count] = dense_rank() over (order by grp, rn)
from cte3
EDIT: I misunderstood your requirement. This should give you the required result:
with cte as
(
-- find when the date is not consecutive
-- using lag() to compare with current row
select [date],
g = case when dateadd(day, -1, [date])
<> lag([date]) over (order by [date])
then 1
else 0
end
from dates
),
cte2 as
(
-- group consecutive dates together
-- by sum up g
select *, grp = sum(g) over (order by [date])
from cte
),
cte3 as
(
-- count number of consecutive dates in a group
select *,
c = count(*) over (partition by grp)
from cte2
),
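cte4 as
(
    -- @n is the variable declared for the first query;
    -- groups shorter than @n are not merged, every date keeps its own number
    select *,
           rn = case when c >= @n
                     then (row_number() over (partition by grp order by [date]) - 1) / @n
                     else row_number() over (partition by grp order by [date])
                end
    from cte3
)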
select *, [count] = dense_rank() over (order by grp, rn)
from cte4
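The same rule can be sketched in plain Python to check the expected counts from the question. This is an illustrative re-implementation, not the answer's code: the function name `count_with_runs` is made up, and it assumes the input is a list of distinct `datetime.date` values.

```python
from datetime import date, timedelta


def count_with_runs(dates, n):
    """Number the dates so that each block of n consecutive days shares one count.

    Runs shorter than n are not merged: every date in them gets its own count.
    """
    # Step 1: split the sorted dates into runs of consecutive calendar days.
    runs = []
    for d in sorted(dates):
        if runs and d - runs[-1][-1] == timedelta(days=1):
            runs[-1].append(d)
        else:
            runs.append([d])

    # Step 2: number the dates run by run.
    result = []
    counter = 0
    for run in runs:
        for i, d in enumerate(run):
            # In a long enough run, a new count starts every n-th date;
            # in a short run, every date starts a new count.
            if len(run) < n or i % n == 0:
                counter += 1
            result.append((d, counter))
    return result


sample = [date(2022, 1, day) for day in (1, 2, 3, 6, 7, 11)]
for d, count in count_with_runs(sample, 3):
    print(d, count)
```

With the sample dates and n = 3 this prints the counts 1, 1, 1, 2, 3, 4, the same as the second expected result in the question.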

How to merge rows startdate enddate based on column values using Lag Lead or window functions?

I have a table with 4 columns: ID, STARTDATE, ENDDATE and BADGE. I want to merge rows based on ID and BADGE values but make sure that only consecutive rows will get merged.
For example, If input is:
Output will be:
I have tried LAG, LEAD, and unbounded and bounded PRECEDING frames, but I am unable to achieve the output:
SELECT ID,
STARTDATE,
MAX(ENDDATE),
NAME
FROM (SELECT USERID,
IFF(LAG(NAME) over(Partition by USERID Order by STARTDATE) = NAME,
LAG(STARTDATE) over(Partition by USERID Order by STARTDATE),
STARTDATE) AS STARTDATE,
ENDDATE,
NAME
from myTable )
GROUP BY USERID,
STARTDATE,
NAME
We have to make sure that we merge only consecutive rows having the same ID and Badge.
Help will be appreciated, Thanks.
You can split the problem into two steps:
1. creating the right partitions
2. aggregating on the partitions with direct aggregation functions (MIN and MAX)
You can approach the first step with a flag that is 1 when a row does not directly continue the previous one, i.e. when the previous row's ENDDATE + 1 day differs from the current row's STARTDATE (or there is no previous row). This flag marks where a new partition should start, so a running sum over it gives you correctly numbered partitions.
WITH cte AS (
SELECT *,
IFF(LAG(ENDDATE) OVER(PARTITION BY ID, Badge ORDER BY STARTDATE) + INTERVAL 1 DAY = STARTDATE , 0, 1) AS boolval
FROM tab
)
SELECT *,
SUM(COALESCE(boolval, 0)) OVER(ORDER BY ID DESC, STARTDATE) AS rn
FROM cte
Then the second step can be summarized in the direct aggregation of "STARTDATE" and "ENDDATE" using the MIN and MAX function respectively, grouping on your ranking value. For syntax correctness, you need to add "ID" and "Badge" too in the GROUP BY clause, even though their range of action is already captured by the computed ranking value.
WITH cte AS (
SELECT *,
IFF(LAG(ENDDATE) OVER(PARTITION BY ID, Badge ORDER BY STARTDATE) + INTERVAL 1 DAY = STARTDATE , 0, 1) AS boolval
FROM tab
), cte2 AS (
SELECT *,
SUM(COALESCE(boolval, 0)) OVER(ORDER BY ID DESC, STARTDATE) AS rn
FROM cte
)
SELECT ID,
MIN(STARTDATE) AS STARTDATE,
MAX(ENDDATE) AS ENDDATE,
Badge
FROM cte2
GROUP BY ID,
Badge,
rn
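The merge rule itself can be sketched in a few lines of Python. This is a simplified illustration, not a translation of the query: the function name `merge_ranges` and the sample rows are made up, and each row is compared with the immediately preceding row of the same ID (in STARTDATE order) rather than with the previous row of the same ID and Badge.

```python
from datetime import date, timedelta


def merge_ranges(rows):
    """Merge (id, start, end, badge) rows that directly continue each other.

    A row is merged into the previous one when the id and badge match and
    its start date is exactly one day after the previous end date.
    """
    merged = []
    for id_, start, end, badge in sorted(rows, key=lambda r: (r[0], r[1])):
        if merged:
            p_id, p_start, p_end, p_badge = merged[-1]
            if p_id == id_ and p_badge == badge and start == p_end + timedelta(days=1):
                # Extend the current island instead of starting a new one.
                merged[-1] = (p_id, p_start, max(p_end, end), p_badge)
                continue
        merged.append((id_, start, end, badge))
    return merged


# Hypothetical sample rows, not taken from the question.
rows = [
    (51, date(2020, 1, 1), date(2020, 1, 31), 1),
    (51, date(2020, 2, 1), date(2020, 2, 29), 1),
    (51, date(2020, 3, 1), date(2020, 3, 31), 2),
    (51, date(2020, 4, 1), date(2020, 4, 30), 1),
    (10, date(2020, 1, 1), date(2020, 6, 30), 3),
]
merged_rows = merge_ranges(rows)
for row in merged_rows:
    print(row)
```

The first two rows for ID 51 collapse into one range, while the later badge 1 row stays separate because a badge 2 row sits in between.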
In Snowflake, this kind of gaps-and-islands problem can be solved with the function conditional_true_event, as in the query below.
The first CTE creates a column that flags a change event (true or false) whenever the value of the badge column changes.
The next CTE (cte_1) passes this flag to conditional_true_event, which produces a counter (incremented each time the flag is TRUE) that the final query uses for grouping.
The final query is then just MIN and MAX with a GROUP BY.
with cte as (
    select
        m.*,
        -- order by startdate so that "previous row" is well defined
        -- (order by null leaves the row order unspecified)
        case when badge <> lag(badge) over (partition by id order by startdate)
             then true
             else false end flag
    from merge_tab m
), cte_1 as (
    select c.*,
           conditional_true_event(flag) over (partition by id order by startdate) cn
    from cte c
)
select id, min(startdate) ms, max(enddate) me, badge
from cte_1
group by id, badge, cn
order by id desc, ms asc, me asc, badge asc;
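What the counter does can be sketched in Python as a running count of true flags. This only mirrors the way the function is used in the query above; the Python helper below is a stand-in written for illustration, and the `badges` list is the BADGE column for ID 51 from the sample data, in STARTDATE order.

```python
from itertools import accumulate


def conditional_true_event(flags):
    """Running count of True flags: starts at 0 and grows by 1 on each True."""
    return list(accumulate(int(bool(flag)) for flag in flags))


# BADGE values for ID 51, in STARTDATE order.
badges = [1, 2, 2, 2, 3, 1, 2, 2, 3]

# The flag is True when the badge differs from the previous row's badge;
# the first row has no previous row, so its flag is False.
flags = [False] + [cur != prev for prev, cur in zip(badges, badges[1:])]

group_ids = conditional_true_event(flags)
print(group_ids)
```

Rows that share a group id are the ones that the final GROUP BY collapses into a single output row.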
Final output -
ID  MS          ME          BADGE
51  1985-02-01  2019-04-28  1
51  2019-04-29  2020-08-16  2
51  2020-08-17  2021-04-03  3
51  2021-04-04  2021-04-05  1
51  2021-04-06  2022-08-20  2
51  2022-08-21  9999-12-31  3
10  2020-02-06  9999-12-31  3
With data -
select * from merge_tab;
ID  STARTDATE   ENDDATE     BADGE
51  1985-02-01  2019-04-28  1
51  2019-04-29  2019-04-28  2
51  2019-09-16  2019-11-16  2
51  2019-11-17  2020-08-16  2
51  2020-08-17  2021-04-03  3
51  2021-04-04  2021-04-05  1
51  2021-04-06  2022-05-05  2
51  2022-05-06  2022-08-20  2
51  2022-08-21  9999-12-31  3
10  2020-02-06  2019-04-28  3
10  2021-03-21  9999-12-31  3

Creating a new RANK based on delta of previous row

I've been working on an issue for a few days now, and I can't seem to find the right fix. Does anybody have an idea?
Case
We want to create a new sequence number whenever an employee has been out of employment for more than 1 day. We have the delta between the current employment record and the previous one, so we can check the sequence. We want to calculate the min(Start) and max(End) over each run of employment records that are not more than 1 day apart.
Data
Employee  Contract  Unit    Start       End         Delta
John Doe  1         Unit A  2014-01-01  2017-12-31  NULL
John Doe  2         Unit A  2018-02-01  2018-12-31  31
John Doe  3         Unit B  2019-01-01  2020-05-31  1
John Doe  4         Unit A  2020-06-01  NULL        1
With the query it should give back:
Employee  Contract  Unit    Start       End         Delta  Sequence
John Doe  1         Unit A  2014-01-01  2017-12-31  NULL   1
John Doe  2         Unit A  2018-02-01  2018-12-31  31     2
John Doe  3         Unit B  2019-01-01  2020-05-31  1      2
John Doe  4         Unit A  2020-06-01  NULL        1      2
That is because sequence 1 ends on 2017-12-31 and a new record starts in February 2018, so there is more than 1 day of separation between the records. The following records all have a sequence of 2 because the employment continues without a gap.
Query
I've tried a few things already with lag() and lead(), but I keep working myself into a corner with the data sample that I have. When I run it on the full set it won't work.
SELECT
Employee,
Start,
End,
DeltaPrevious,
Delta,
DeltaNext,
case
when DeltaPrevious IS NULL AND Delta = 1 then 1
when DeltaPrevious = 1 AND Delta > 1 then min(Contract) OVER (PARTITION BY Employee ORDER BY Contract ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
when DeltaPrevious > 1 AND Delta = 1 then min(Contract) OVER (PARTITION BY Employee ORDER BY Contract ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
end as Sequence
FROM
Contracts
ORDER BY
Employee, Start ASC
Hope that someone has a great idea.
Thanks,
Basically, you want to use lag() to get the previous date and then do a cumulative sum. This looks like:
select c.*,
       -- [Start] and [End] are bracketed because END is a reserved word in T-SQL
       sum(case when prev_end >= dateadd(day, -1, [Start]) then 0 else 1
           end) over (partition by employee order by [Start]) as ranking
from (select c.*,
             lag([End]) over (partition by employee order by [Start]) as prev_end
      from contracts c
     ) c;
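As a quick check of the rule, the same lag-and-running-sum logic can be sketched in Python on the sample rows from the question. The function name `assign_sequences` is made up for this sketch; like the query, it starts a new sequence when there is no previous end date to compare against.

```python
from datetime import date, timedelta

# (employee, contract, start, end) rows from the question; None stands for NULL.
contracts = [
    ("John Doe", 1, date(2014, 1, 1), date(2017, 12, 31)),
    ("John Doe", 2, date(2018, 2, 1), date(2018, 12, 31)),
    ("John Doe", 3, date(2019, 1, 1), date(2020, 5, 31)),
    ("John Doe", 4, date(2020, 6, 1), None),
]


def assign_sequences(rows):
    """Return (employee, contract, sequence) with a new sequence after each gap."""
    result = []
    prev_employee = None
    prev_end = None
    sequence = 0
    for employee, contract, start, end in sorted(rows, key=lambda r: (r[0], r[2])):
        if employee != prev_employee:
            # Equivalent of PARTITION BY employee: numbering restarts at 1.
            sequence = 1
        elif prev_end is None or prev_end < start - timedelta(days=1):
            # The previous contract ended more than one day before this one starts.
            sequence += 1
        result.append((employee, contract, sequence))
        prev_employee, prev_end = employee, end
    return result


sequences = assign_sequences(contracts)
print(sequences)
```

Contract 1 gets sequence 1 and contracts 2 to 4 get sequence 2, matching the expected table in the question.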
You mention that you might want to recalculate the new start and end. You would just use the above as a subquery/CTE and aggregate on employee and ranking.
If I understood correctly from the definition of Sequence in your second table, you are more interested in DeltaNext than in Delta (DeltaPrevious). Here is an attempt, including the code to create sample input data with two more employees:
CREATE TABLE #input_table (Employee VARCHAR(255), [Contract] INT, Unit VARCHAR(6), [Start] DATE, [End] DATE)
INSERT INTO #input_table
VALUES
('John Doe', 1, 'Unit A', '2014-01-01', '2017-12-31'),
('John Doe', 2, 'Unit A', '2018-02-01', '2018-12-31'),
('John Doe', 3, 'Unit B', '2019-01-01', '2020-05-31'),
('John Doe', 4, 'Unit A', '2020-06-01', NULL),
('Alice', 1, 'Unit A', '2020-01-01', NULL),
('Bob', 1, 'Unit C', '2020-01-01', '2020-02-20')
First we create the deltas:
SELECT *
, DeltaPrev = DATEDIFF(DAY, LAG([End], 1, NULL) OVER(PARTITION BY Employee
ORDER BY [Start]), [Start]) -- Not relevant (?)
, DeltaNext = DATEDIFF(DAY, [End], LEAD([Start], 1, NULL) OVER(PARTITION BY Employee ORDER BY [Start]))
INTO #cte_delta -- I'll create a CTE at the end
FROM #input_table
Then we define Sequence:
SELECT *
, [Sequence] = CASE WHEN DeltaNext > 1 THEN 1 ELSE 2 END
INTO #cte_sequence
FROM #cte_delta
We then group the same Sequences by assigning a unique ROW_NUMBER for each employee with consecutive/ same Sequences:
SELECT *
, GRP = ROW_NUMBER() OVER(PARTITION BY Employee ORDER BY [Start]) - ROW_NUMBER() OVER(PARTITION BY Employee, [Sequence] ORDER BY [Start])
INTO #cte_grp
FROM #cte_sequence
Finally we calculate the min and max of the contract duration:
SELECT *
, MIN([Start]) OVER(PARTITION BY Employee, GRP) AS ContractStart
, CASE WHEN COUNT(*) OVER(PARTITION BY Employee, GRP) = COUNT([End])
OVER(PARTITION BY Employee, GRP) THEN MAX([End]) OVER(PARTITION BY Employee, GRP) ELSE NULL END AS ContractEnd
FROM #cte_grp
The COUNT(*) and COUNT([End]) comparison is necessary; otherwise ContractEnd would be the max non-NULL value (2020-05-31 for John Doe's still-open group) instead of NULL.
The whole code with CTEs here:
WITH cte_delta AS (
SELECT *
, DeltaPrev = DATEDIFF(DAY, LAG([End], 1, NULL) OVER(PARTITION BY Employee ORDER BY [Start]), [Start]) -- Not relevant (?)
, DeltaNext = DATEDIFF(DAY, [End], LEAD([Start], 1, NULL) OVER(PARTITION BY Employee ORDER BY [Start]))
FROM #input_table
)
, cte_sequence AS (
SELECT *
, [Sequence] = CASE WHEN DeltaNext > 1 THEN 1 ELSE 2 END
FROM cte_delta
)
, cte_grp AS (
SELECT *
, GRP = ROW_NUMBER() OVER(PARTITION BY Employee ORDER BY [Start]) - ROW_NUMBER() OVER(PARTITION BY Employee, [Sequence] ORDER BY [Start])
FROM cte_sequence
)
SELECT *
, MIN([Start]) OVER(PARTITION BY Employee, GRP) AS ContractStart
, CASE WHEN COUNT(*) OVER(PARTITION BY Employee, GRP) = COUNT([End]) OVER(PARTITION BY Employee, GRP) THEN MAX([End]) OVER(PARTITION BY Employee, GRP) ELSE NULL END AS ContractEnd
FROM cte_grp
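Two ideas in this query can be sketched in Python: the difference of two row numbers as a group id, and an end date that stays NULL while any contract in the group is open. Both helper names below are made up for illustration, and None stands in for NULL.

```python
from collections import defaultdict
from datetime import date


def row_number_difference(flags):
    """ROW_NUMBER() over all rows minus ROW_NUMBER() within each flag value."""
    seen = defaultdict(int)
    result = []
    for overall, flag in enumerate(flags, start=1):
        seen[flag] += 1
        result.append(overall - seen[flag])
    return result


def span_end(ends):
    """Latest end date, or None if any contract in the group is still open."""
    return None if any(end is None for end in ends) else max(ends)


# Sequence values for John Doe in [Start] order, as in the output table.
grp = row_number_difference([1, 2, 2, 2])
print(grp)

# End dates of John Doe's group 1 (contracts 2, 3 and 4).
open_group_end = span_end([date(2018, 12, 31), date(2020, 5, 31), None])
print(open_group_end)
```

One caveat worth keeping in mind: in general the row-number difference identifies an island only together with the flag value. For the flags 1, 1, 2, 2, 1 it yields 0, 0, 2, 2, 2, so the last two islands share the same difference; grouping by both the flag and the difference avoids that collision.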
Here is the output:
Employee  Contract  Unit    Start       End         DeltaPrev  DeltaNext  Sequence  GRP  ContractStart  ContractEnd
Alice     1         Unit A  2020-01-01  NULL        NULL       NULL       2         0    2020-01-01     NULL
Bob       1         Unit C  2020-01-01  2020-02-20  NULL       NULL       2         0    2020-01-01     2020-02-20
John Doe  1         Unit A  2014-01-01  2017-12-31  NULL       32         1         0    2014-01-01     2017-12-31
John Doe  2         Unit A  2018-02-01  2018-12-31  32         1          2         1    2018-02-01     NULL
John Doe  3         Unit B  2019-01-01  2020-05-31  1          1          2         1    2018-02-01     NULL
John Doe  4         Unit A  2020-06-01  NULL        1          NULL       2         1    2018-02-01     NULL
Feel free to select DISTINCT records according to your needs.

Caller whose first and last call was to the same person

I have a phonelog table that has information about callers' call history. I'd like to find out callers whose first and last call was to the same person on a given day.
Callerid Recipientid DateCalled
1 2 2019-01-01 09:00:00.000
1 3 2019-01-01 17:00:00.000
1 4 2019-01-01 23:00:00.000
2 5 2019-07-05 09:00:00.000
2 5 2019-07-05 17:00:00.000
2 3 2019-07-05 23:00:00.000
2 5 2019-07-06 17:00:00.000
2 3 2019-08-01 09:00:00.000
2 3 2019-08-01 17:00:00.000
2 4 2019-08-02 09:00:00.000
2 5 2019-08-02 10:00:00.000
2 4 2019-08-02 11:00:00.000
Expected Output
Callerid Recipientid Datecalled
2 5 2019-07-05
2 3 2019-08-01
2 4 2019-08-02
I wrote the below query but can't get it to return recipientid. Any help on this will be appreciated!
select pl.callerid,cast(pl.datecalled as date) as datecalled
from phonelog pl inner join (select callerid, cast(datecalled as date) as datecalled,
min(datecalled) as firstcall, max(datecalled) as lastcall
from phonelog
group by callerid, cast(datecalled as date)) as x
on pl.callerid = x.callerid and cast(pl.datecalled as date) = x.datecalled
and (pl.datecalled = x.firstcall or pl.datecalled = x.lastcall)
group by pl.callerid, cast(pl.datecalled as date)
having count(distinct recipientid) = 1
In my prequery (alias PQ), I get the min and max call time per caller per day, with a HAVING clause to make sure the caller made at least 2 phone calls on that day. From that, I re-join to the phone log table on the FIRST (MIN) call for the caller on the given day. Then I join one more time for the LAST (MAX) call for the same caller on the same day and make sure the recipient of the first call is the same as that of the last.
I do not have to join on the stripped-down "JustDate" column used for the grouping, as the MIN/MAX values carry the FULL date/time.
select
PQ.JustDate,
PQ.CallerID,
pl1.RecipientID
from
( select
callerID,
convert( date, dateCalled ) JustDate,
min( DateCalled ) minDateCall,
max( DateCalled ) maxDateCall
from
PhoneLog pl
group by
callerID,
convert( date, dateCalled )
having
count(*) > 1) PQ
JOIN PhoneLog pl1
on PQ.CallerID = pl1.CallerID
AND PQ.minDateCall = pl1.dateCalled
JOIN PhoneLog pl2
on PQ.CallerID = pl2.CallerID
AND PQ.maxDateCall = pl2.dateCalled
AND pl1.RecipientID = pl2.RecipientID
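It's very easy with window functions:
WITH cte AS (
    SELECT Callerid, CAST(DateCalled AS DATE) AS DateCalledDay
        -- order by the full timestamp and use an explicit frame, so that
        -- LAST_VALUE sees the whole day and not only the rows up to the current one
        , FIRST_VALUE(Recipientid) OVER (PARTITION BY Callerid, CAST(DateCalled AS DATE) ORDER BY DateCalled
                                         ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) f
        , LAST_VALUE(Recipientid) OVER (PARTITION BY Callerid, CAST(DateCalled AS DATE) ORDER BY DateCalled
                                        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) l
    FROM phonelog
)
SELECT DISTINCT Callerid, f AS Recipientid, DateCalledDay AS DateCalled FROM cte
WHERE f = l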
Since SQL Server 2012 you can use the first_value() and last_value() window functions.
SELECT DISTINCT
x1.callerid,
x1.fri,
x1.datecalled
FROM (SELECT pl1.callerid,
pl1.recipientid,
convert(date, pl1.datecalled) datecalled,
first_value(pl1.recipientid) OVER (PARTITION BY pl1.callerid,
convert(date, pl1.datecalled)
ORDER BY pl1.datecalled
RANGE BETWEEN UNBOUNDED PRECEDING
AND UNBOUNDED FOLLOWING) fri,
last_value(pl1.recipientid) OVER (PARTITION BY pl1.callerid,
convert(date, pl1.datecalled)
ORDER BY pl1.datecalled
RANGE BETWEEN UNBOUNDED PRECEDING
AND UNBOUNDED FOLLOWING) lri
FROM phonelog pl1) x1
WHERE x1.fri = x1.lri;
In older versions you can use correlated subqueries with TOP 1.
SELECT DISTINCT
x1.callerid,
x1.fri,
x1.datecalled
FROM (SELECT pl1.callerid,
pl1.recipientid,
convert(date, pl1.datecalled) datecalled,
(SELECT TOP 1
pl2.recipientid
FROM phonelog pl2
WHERE pl2.callerid = pl1.callerid
AND pl2.datecalled >= convert(date, pl1.datecalled)
AND pl2.datecalled < dateadd(day, 1, convert(date, pl1.datecalled))
ORDER BY pl2.datecalled ASC) fri,
(SELECT TOP 1
pl2.recipientid
FROM phonelog pl2
WHERE pl2.callerid = pl1.callerid
AND pl2.datecalled >= convert(date, pl1.datecalled)
AND pl2.datecalled < dateadd(day, 1, convert(date, pl1.datecalled))
ORDER BY pl2.datecalled DESC) lri
FROM phonelog pl1) x1
WHERE x1.fri = x1.lri;
If you don't want to return log rows where somebody just made one call on a day, which of course means the first and the last call of the day were to the same person, you can use GROUP BY and HAVING count(*) > 1 instead of DISTINCT.
SELECT x1.callerid,
x1.fri,
x1.datecalled
FROM (...) x1
WHERE x1.fri = x1.lri
GROUP BY x1.callerid,
x1.fri,
x1.datecalled
HAVING count(*) > 1;
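The difference between the DISTINCT variant and the HAVING count(*) > 1 variant can be sketched in Python. The function name `same_first_last` and its `require_multiple` parameter are made up for this illustration, and the rows are a subset of the sample data from the question.

```python
from collections import defaultdict
from datetime import datetime


def same_first_last(calls, require_multiple=False):
    """Return (caller, recipient, day) where the first and last call of the day match.

    calls: iterable of (caller_id, recipient_id, called_at) tuples.
    require_multiple: skip days on which the caller made only one call.
    """
    by_day = defaultdict(list)
    for caller, recipient, called_at in calls:
        by_day[(caller, called_at.date())].append((called_at, recipient))

    result = []
    for (caller, day), entries in sorted(by_day.items()):
        entries.sort()  # chronological order within the day
        if require_multiple and len(entries) < 2:
            continue
        first_recipient = entries[0][1]
        last_recipient = entries[-1][1]
        if first_recipient == last_recipient:
            result.append((caller, first_recipient, day))
    return result


calls = [
    (1, 2, datetime(2019, 1, 1, 9)),
    (1, 3, datetime(2019, 1, 1, 17)),
    (1, 4, datetime(2019, 1, 1, 23)),
    (2, 5, datetime(2019, 7, 6, 17)),
    (2, 3, datetime(2019, 8, 1, 9)),
    (2, 3, datetime(2019, 8, 1, 17)),
    (2, 4, datetime(2019, 8, 2, 9)),
    (2, 5, datetime(2019, 8, 2, 10)),
    (2, 4, datetime(2019, 8, 2, 11)),
]
all_days = same_first_last(calls)
multi_call_days = same_first_last(calls, require_multiple=True)
print(all_days)
print(multi_call_days)
```

The single call on 2019-07-06 appears only in the first result, which is the behaviour the HAVING clause is meant to filter out.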
You can use a CTE to compute the first and last call of each day by Callerid, and then self-JOIN that CTE to find callers whose first and last calls were to the same Recipientid:
WITH CTE AS (
SELECT Callerid, RecipientId, CONVERT(DATE, Datecalled) AS Datecalled,
ROW_NUMBER() OVER (PARTITION BY Callerid, CONVERT(DATE, Datecalled) ORDER BY Datecalled) AS rna,
ROW_NUMBER() OVER (PARTITION BY Callerid, CONVERT(DATE, Datecalled) ORDER BY Datecalled DESC) AS rnb
FROM phonelog
)
SELECT c1.Callerid, c1.RecipientId, c1.Datecalled
FROM CTE c1
JOIN CTE c2 ON c1.Callerid = c2.Callerid AND c1.Recipientid = c2.Recipientid AND c1.Datecalled = c2.Datecalled
WHERE c1.rna = 1 AND c2.rnb = 1
Output:
Callerid RecipientId Datecalled
2 5 2019-07-05
2 3 2019-08-01
2 4 2019-08-02
As I understand it, you want to select each Callerid and Recipientid pair that occurs more than once on a day, so that there is both a first call and a last call. So you just need to group by the 3 columns, combined with HAVING COUNT(Recipientid) > 1, like this:
SELECT Callerid, Recipientid, CAST(Datecalled AS DATE) AS Datecalled
FROM phonelog
GROUP BY Callerid, Recipientid, CAST(Datecalled AS DATE)
HAVING COUNT(Recipientid) > 1
As per my understanding we have to rank Caller_id as well as Recipient_id along with the Date.
Below is my solution which is working well for this case.
with CTE as
(select *,
row_number() over (partition by callerid, convert(VARCHAR,datecalled,23) order by convert(VARCHAR,datecalled,23)) as first_recipient_id,
row_number() over (partition by receipientid, convert(VARCHAR,datecalled,23) order by convert(VARCHAR,datecalled,23) desc) as last_recipient_id
from activity
)
select t.callerid,t.receipientid,CONVERT(VARCHAR,t.datecalled) as DateCalled from CTE t
where t.first_recipient_id >1 AND t.last_recipient_id>1;
I think we need to identify the first and last call made by the caller on a day and then compare them with the first and last call by that caller to each recipient on that day. The code below computes firstcall and lastcall for the caller per day, then the first and last call by the caller to the respective recipient, and then compares them.
SELECT DISTINCT
callerid,
recipientid,
CONVERT(date,firstcall)
FROM
(
Select
callerid,
recipientid,
MIN(dateCalled) OVER(PARTITION BY callerid,CONVERT(date,DateCalled)) as firstcall,
MAX(DateCalled) OVER(PARTITION BY callerid,CONVERT(date,DateCalled)) as lastcall,
MIN(DateCalled) OVER(PARTITION BY callerid,recipientid,convert(date,DateCalled)) as recipfirstcall,
MAX(DateCalled) OVER(PARTITION BY callerid,recipientid,convert(date,DateCalled)) as reciplastcall
from phonelog
) as A
where A.firstcall=A.recipfirstcall and A.lastcall=A.reciplastcall

Get latest entry in each week over several years period

I have the following table to store history for entities:
Date Id State
-------------------------------------
2017-10-10 1 0
2017-10-12 1 4
2018-5-30 1 8
2019-4-1 2 0
2018-3-6 2 4
2018-3-7 2 0
I want to get the last entry for each Id within each one-week period, e.g.
Date Id State
-------------------------------------
2017-10-12 1 4
2018-5-30 1 8
2019-4-1 2 0
2018-3-7 2 0
I tried to use PARTITION BY:
select
ID
,Date
,State
,DatePart(week,Date) as weekNumber
from TableA
where Date = (
select max(Date) over (Partition by Id Order by DatePart(week, Date) Desc)
)
order by ID
but it still gives me more than one result per week.
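You can use ROW_NUMBER(). Because the data spans several years, partition by the year as well as the week number, so that the same week number in different years is not merged into one group:
SELECT a.*
FROM (SELECT a.*, ROW_NUMBER() OVER (PARTITION BY a.id, DATEPART(YEAR, a.Date), DATEPART(WK, a.Date) ORDER BY a.Date DESC) AS Seq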
FROM tablea a
) a
WHERE seq = 1
ORDER BY id, Date;
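The "latest row per id and week" idea can be sketched in Python with the sample rows from the question. Note the assumptions: the function name `latest_per_week` is made up, and the sketch uses ISO weeks from `date.isocalendar()`, which do not always coincide with the week numbers that DATEPART(WK, ...) returns.

```python
from datetime import date


def latest_per_week(rows):
    """Keep the latest (day, id, state) row per id, ISO year and ISO week."""
    latest = {}
    for day, id_, state in rows:
        iso_year, iso_week, _ = day.isocalendar()
        key = (id_, iso_year, iso_week)
        # Replace the stored row only when this one is later in the same week.
        if key not in latest or day > latest[key][0]:
            latest[key] = (day, id_, state)
    return sorted(latest.values(), key=lambda r: (r[1], r[0]))


rows = [
    (date(2017, 10, 10), 1, 0),
    (date(2017, 10, 12), 1, 4),
    (date(2018, 5, 30), 1, 8),
    (date(2019, 4, 1), 2, 0),
    (date(2018, 3, 6), 2, 4),
    (date(2018, 3, 7), 2, 0),
]
weekly_latest = latest_per_week(rows)
for row in weekly_latest:
    print(row)
```

Including the year in the key plays the same role as partitioning by the year in SQL: week 10 of 2018 and week 10 of 2019 stay in separate groups.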