How can I get distinct data from one col - sql

I need to get member personal data for all our members whose subscriptions have lapsed, i.e. have a subscription end date before 31/03/2020. However, I want to show only one record per member (distinct by membership number), ideally the most recent one.
I've tried a ROW_NUMBER() solution ("SQL - Distinct One Col, Select Multiple other?") and a CROSS APPLY solution ("sql distinct, getting 2 columns"), but I can't get either to work.
SELECT membershipnumber AS Id,
subscription.enddate
FROM [dbo].[userprofile]
INNER JOIN dbo.subscription
ON userprofile.id = subscription.userprofileid
INNER JOIN dbo.subscriptiontype
ON subscriptiontype.id = subscription.subscriptiontypeid
Output is
Id Enddate
1 2006-04-01 00:00:00.000
1 2001-04-01 00:00:00.000
1 1999-04-01 00:00:00.000
1 1998-04-01 00:00:00.000
1 2008-04-01 00:00:00.000
1 2007-04-01 00:00:00.000
1 2011-04-01 00:00:00.000
1 2005-04-01 00:00:00.000
1 2000-04-01 00:00:00.000
1 1997-04-01 00:00:00.000
2 1999-04-01 00:00:00.000
2 2012-04-01 00:00:00.000
2 2004-04-01 00:00:00.000
2 2001-04-01 00:00:00.000
2 2018-04-01 00:00:00.000
2 2009-04-01 00:00:00.000
2 2005-04-01 00:00:00.000
2 1997-04-01 00:00:00.000
Desired output
Id Enddate
1 2011-04-01 00:00:00.000
2 2018-04-01 00:00:00.000

Solved sql answer
;WITH cte
AS (SELECT membershipnumber AS Id,
subscription.enddate,
Row_number()
OVER (
partition BY membershipnumber
ORDER BY subscription.enddate DESC) AS rownumber
FROM [dbo].[userprofile]
INNER JOIN dbo.subscription
ON userprofile.id = subscription.userprofileid
INNER JOIN dbo.subscriptiontype
ON subscriptiontype.id = subscription.subscriptiontypeid
)
SELECT *
FROM cte
WHERE rownumber = 1
https://stackoverflow.com/a/6841644/5859743
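For anyone who wants to try the ROW_NUMBER() pattern without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. The table name sub, the column member_id and the helper latest_end_date_per_member are made up for illustration, the joins from the original query are left out, and it assumes the bundled SQLite is version 3.25 or later (the first release with window functions).

```python
import sqlite3


def latest_end_date_per_member(rows):
    """Return [(member_id, latest_enddate)] by keeping row number 1 per member."""
    con = sqlite3.connect(":memory:")
    # Hypothetical single-table stand-in for userprofile joined to subscription.
    con.execute("CREATE TABLE sub (member_id INTEGER, enddate TEXT)")
    con.executemany("INSERT INTO sub VALUES (?, ?)", rows)
    query = """
        WITH cte AS (
            SELECT member_id,
                   enddate,
                   ROW_NUMBER() OVER (
                       PARTITION BY member_id
                       ORDER BY enddate DESC
                   ) AS rownumber
            FROM sub
        )
        SELECT member_id, enddate
        FROM cte
        WHERE rownumber = 1
        ORDER BY member_id
    """
    result = con.execute(query).fetchall()
    con.close()
    return result


# A few of the rows from the question, stored as ISO date strings.
sample = [
    (1, "2006-04-01"), (1, "2011-04-01"), (1, "1997-04-01"),
    (2, "1999-04-01"), (2, "2018-04-01"), (2, "2012-04-01"),
]
print(latest_end_date_per_member(sample))
```

To keep only lapsed members, adding a condition such as AND enddate < '2020-03-31' to the outer SELECT should work, because it is then evaluated against each member's most recent row only.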

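Not sure if I got your question right, but you can use DISTINCT in the SELECT.
Note that DISTINCT only collapses rows that are identical in every selected column, so it shows one record per member only when the other columns do not vary.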
SELECT DISTINCT Membershipnumber as Id
,'P' as PartyType
,'A' as Status
,case
when Name = 'Standard Membership paid annually.' and EndDate > '2020-03-31' then 'Member'
when Name = 'Lapsed subscription renewal' and EndDate > '2020-03-31' then 'Member'
when Name = '3 Year Subscription (members outside of UK and Ireland, Jersey, Guernsey and the Channel Islands)' and EndDate > '2020-03-31' then 'Overseas member'
when Name = '1 Year Subscription (members outside of UK and Ireland, Jersey, Guernsey and the Channel Islands).' and EndDate > '2020-03-31' then 'Overseas member'
when Name = 'Lifetime membership' then 'Lifetime member'
when Name = 'Retired membership paid annually' and EndDate > '2020-03-31' then 'Retired member'
else 'Non member'
end As MemberType
,Title as NamePrefix
,FirstName as FirstName
,Surname as LastName
,DateOfBirth as BirthDate
,'Home' as AddressPurpose
,'Default' as CommunicationReasons
,AddressLine1
,AddressLine2
,AddressLine3
,Addressline4 as CityName
,'' as CountrySubEntityName
,Country as CountryCode
,'' as CountryName
,Postcode as PostalCode
,EmailAddress as Email
FROM [dbo].[UserProfile]
inner join dbo.Subscription on
UserProfile.Id = Subscription.UserProfileId
inner join dbo.SubscriptionType on
SubscriptionType.id = Subscription.SubscriptionTypeId

If your query returns the output shown above, you can get the desired output from it by taking the maximum end date per id.
; with cte as (
----- query which gives you above mentioned output
)
select id, max(Enddate) as Enddate from cte group by id

I suspect you want something like this:
select *
from (select . . ., -- all the columns you want
row_number() over (partition by Membershipnumber order by s.Enddate desc) as seqnum
from [dbo].[UserProfile] up inner join
dbo.Subscription s
on up.Id = s.UserProfileId inner join
dbo.SubscriptionType st
on st.id = s.SubscriptionTypeId
) x
where seqnum = 1;

Related

Get max date for each from either of 2 columns

I have a table like below
AID BID CDate
-----------------------------------------------------
1 2 2018-11-01 00:00:00.000
8 1 2018-11-08 00:00:00.000
1 3 2018-11-09 00:00:00.000
7 1 2018-11-15 00:00:00.000
6 1 2018-12-24 00:00:00.000
2 5 2018-11-02 00:00:00.000
2 7 2018-12-15 00:00:00.000
And I am trying to get a result set as follows
ID MaxDate
-------------------
1 2018-12-24 00:00:00.000
2 2018-12-15 00:00:00.000
Each value in the id columns (AID, BID) should return its max CDate.
For example, for 1 the max CDate is 2018-12-24 00:00:00.000 (here 1 appears under BID),
and for 2 the max date is 2018-12-15 00:00:00.000 (here 2 appears under AID).
I tried the following.
1.
select
g.AID,g.BID,
max(g.CDate) as 'LastDate'
from dbo.TT g
inner join
(select AID,BID,max(CDate) as maxdate
from dbo.TT
group by AID,BID)a
on (a.AID=g.AID or a.BID=g.BID)
and a.maxdate=g.CDate
group by g.AID,g.BID
and 2.
SELECT
AID,
CDate
FROM (
SELECT
*,
max_date = MAX(CDate) OVER (PARTITION BY [AID])
FROM dbo.TT
) AS s
WHERE CDate= max_date
Please suggest a 3rd solution.
You can assemble the data in a table expression first; computing the max for each value is then simple. For example:
select
id, max(cdate)
from (
select aid as id, cdate from t
union all
select bid, cdate from t
) x
group by id
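As a quick check of the UNION ALL plus GROUP BY idea against the question's sample rows, here is a small sketch that runs the same query shape through Python's built-in sqlite3 module. The helper name max_date_per_id is made up for illustration, and dates are stored as ISO strings so that MAX compares them correctly.

```python
import sqlite3


def max_date_per_id(rows):
    """Return {id: max cdate}, treating AID and BID as one pool of ids."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (aid INTEGER, bid INTEGER, cdate TEXT)")
    con.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    query = """
        SELECT id, MAX(cdate) AS max_date
        FROM (
            SELECT aid AS id, cdate FROM t
            UNION ALL
            SELECT bid AS id, cdate FROM t
        ) AS x
        GROUP BY id
        ORDER BY id
    """
    result = dict(con.execute(query).fetchall())
    con.close()
    return result


# Sample rows from the question (AID, BID, CDate).
rows = [
    (1, 2, "2018-11-01"), (8, 1, "2018-11-08"), (1, 3, "2018-11-09"),
    (7, 1, "2018-11-15"), (6, 1, "2018-12-24"), (2, 5, "2018-11-02"),
    (2, 7, "2018-12-15"),
]
max_dates = max_date_per_id(rows)
print(max_dates[1], max_dates[2])
```

Note that this returns a row for every id that appears in either column, not just 1 and 2; restricting the result is a separate filter.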
You seem to only care about values that are in both columns. If this interpretation is correct, then:
select id, max(cdate)
from ((select aid as id, cdate, 1 as is_a, 0 as is_b
from t
) union all
(select bid as id, cdate, 0 as is_a, 1 as is_b
from t
)
) ab
group by id
having max(is_a) = 1 and max(is_b) = 1;

SQL - Find if column dates include at least partially a date range

I need to create a report and I am struggling with the SQL script.
The table I want to query is a company_status_history table which has entries like the following (the ones that I can't figure out)
Table company_status_history
Columns:
| id | company_id | status_id | effective_date |
Data:
| 1 | 10 | 1 | 2016-12-30 00:00:00.000 |
| 2 | 10 | 5 | 2017-02-04 00:00:00.000 |
| 3 | 11 | 5 | 2017-06-05 00:00:00.000 |
| 4 | 11 | 1 | 2018-04-30 00:00:00.000 |
I want to answer the question "Get all companies that have been in status 1 for at least some point inside the time period 01/01/2017 - 31/12/2017"
Above are the cases that I don't know how to handle, since I need to add some logic of this type:
"If this row is status 1 and its date is before the date range, check the next row if it has a date inside the date range."
"If this row is status 1 and its date is after the date range, check the row before if it has a date inside the date range."
I think this can be handled as a gaps and islands problem. Consider the following input data: (same as sample data of OP plus two additional rows)
id company_id status_id effective_date
-------------------------------------------
1 10 1 2016-12-15
2 10 1 2016-12-30
3 10 5 2017-02-04
4 10 4 2017-02-08
5 11 5 2017-06-05
6 11 1 2018-04-30
You can use the following query:
SELECT t.id, t.company_id, t.status_id, t.effective_date, x.cnt
FROM company_status_history AS t
OUTER APPLY
(
SELECT COUNT(*) AS cnt
FROM company_status_history AS c
WHERE c.status_id = 1
AND c.company_id = t.company_id
AND c.effective_date < t.effective_date
) AS x
ORDER BY company_id, effective_date
to get:
id company_id status_id effective_date cnt
-----------------------------------------------
1 10 1 2016-12-15 0
2 10 1 2016-12-30 1
3 10 5 2017-02-04 2
4 10 4 2017-02-08 2
5 11 5 2017-06-05 0
6 11 1 2018-04-30 0
Now you can identify status = 1 islands using:
;WITH CTE AS
(
SELECT t.id, t.company_id, t.status_id, t.effective_date, x.cnt
FROM company_status_history AS t
OUTER APPLY
(
SELECT COUNT(*) AS cnt
FROM company_status_history AS c
WHERE c.status_id = 1
AND c.company_id = t.company_id
AND c.effective_date < t.effective_date
) AS x
)
SELECT id, company_id, status_id, effective_date,
ROW_NUMBER() OVER (PARTITION BY company_id ORDER BY effective_date) -
cnt AS grp
FROM CTE
Output:
id company_id status_id effective_date grp
-----------------------------------------------
1 10 1 2016-12-15 1
2 10 1 2016-12-30 1
3 10 5 2017-02-04 1
4 10 4 2017-02-08 2
5 11 5 2017-06-05 1
6 11 1 2018-04-30 2
Calculated field grp will help us identify those islands:
;WITH CTE AS
(
SELECT t.id, t.company_id, t.status_id, t.effective_date, x.cnt
FROM company_status_history AS t
OUTER APPLY
(
SELECT COUNT(*) AS cnt
FROM company_status_history AS c
WHERE c.status_id = 1
AND c.company_id = t.company_id
AND c.effective_date < t.effective_date
) AS x
), CTE2 AS
(
SELECT id, company_id, status_id, effective_date,
ROW_NUMBER() OVER (PARTITION BY company_id ORDER BY effective_date) -
cnt AS grp
FROM CTE
)
SELECT company_id,
MIN(effective_date) AS start_date,
CASE
WHEN COUNT(*) > 1 THEN DATEADD(DAY, -1, MAX(effective_date))
ELSE MIN(effective_date)
END AS end_date
FROM CTE2
GROUP BY company_id, grp
HAVING COUNT(CASE WHEN status_id = 1 THEN 1 END) > 0
Output:
company_id start_date end_date
-----------------------------------
10 2016-12-15 2017-02-03
11 2018-04-30 2018-04-30
All you want now is those records from above that overlap with the specified interval.
Demo here with somewhat more complicated use case.
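The final overlap filter is not spelled out above, so here is a minimal sketch of it in Python, applied to the two islands from the last output. The function name overlaps and the variable names are made up for illustration, and both interval ends are assumed to be inclusive. In SQL the same test would be a condition like start_date <= range end AND end_date >= range start on the grouped result.

```python
from datetime import date


def overlaps(start, end, range_start, range_end):
    """True when [start, end] and [range_start, range_end] share at least one day."""
    return start <= range_end and end >= range_start


# (company_id, start_date, end_date) islands taken from the output above.
islands = [
    (10, date(2016, 12, 15), date(2017, 2, 3)),
    (11, date(2018, 4, 30), date(2018, 4, 30)),
]
range_start, range_end = date(2017, 1, 1), date(2017, 12, 31)

companies_in_range = sorted(
    {company_id
     for company_id, start, end in islands
     if overlaps(start, end, range_start, range_end)}
)
print(companies_in_range)  # [10]
```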
Maybe this is what you are looking for? For this kind of question, you need to join two instances of your table. In this case I am just joining with the next record by id, which is probably not entirely correct. To do it better, you can create a new id using a window function like row_number, ordering the table by your required criteria.
If this row is status 1 and its date is before the date range check
the next row if it has a date inside the date range
declare @range_st date = '2017-01-01'
declare @range_en date = '2017-12-31'
select
case
when csh1.status_id=1 and csh1.effective_date<@range_st
then
case
when csh2.effective_date between @range_st and @range_en then 1 -- true
else 0 -- false
end
else NULL
end
from company_status_history csh1
left join company_status_history csh2
on csh2.id=csh1.id+1 -- csh2 is the next record by id
Implementing the second criterion:
"If this row is status 1 and its date is after the date range check
the row before if it has a date inside the date range."
declare @range_st date = '2017-01-01'
declare @range_en date = '2017-12-31'
select
case
when csh1.status_id=1 and csh1.effective_date<@range_st
then
case
when csh2.effective_date between @range_st and @range_en then 1 -- true
else 0 -- false
end
when csh1.status_id=1 and csh1.effective_date>@range_en
then
case
when csh3.effective_date between @range_st and @range_en then 1 -- true
else 0 -- false
end
else null -- ¿?
end
from company_status_history csh1
left join company_status_history csh2
on csh2.id=csh1.id+1 -- csh2 is the next record by id
left join company_status_history csh3
on csh3.id=csh1.id-1 -- csh3 is the previous record by id
I would suggest using a CTE and the window function ROW_NUMBER. With this you can find the desired records. An example:
DECLARE @t TABLE(
id INT
,company_id INT
,status_id INT
,effective_date DATETIME
)
INSERT INTO @t VALUES
(1, 10, 1, '2016-12-30 00:00:00.000')
,(2, 10, 5, '2017-02-04 00:00:00.000')
,(3, 11, 5, '2017-06-05 00:00:00.000')
,(4, 11, 1, '2018-04-30 00:00:00.000')
DECLARE @StartDate DATETIME = '2017-01-01';
DECLARE @EndDate DATETIME = '2017-12-31';
WITH cte AS(
SELECT *
,ROW_NUMBER() OVER (PARTITION BY company_id ORDER BY effective_date) AS rn
FROM @t
),
cteLeadLag AS(
SELECT c.*, ISNULL(c2.effective_date, c.effective_date) LagEffective, ISNULL(c3.effective_date, c.effective_date)LeadEffective
FROM cte c
LEFT JOIN cte c2 ON c2.company_id = c.company_id AND c2.rn = c.rn-1
LEFT JOIN cte c3 ON c3.company_id = c.company_id AND c3.rn = c.rn+1
)
SELECT 'Included' AS RangeStatus, *
FROM cteLeadLag
WHERE status_id = 1
AND effective_date BETWEEN @StartDate AND @EndDate
UNION ALL
SELECT 'Following' AS RangeStatus, *
FROM cteLeadLag
WHERE status_id = 1
AND effective_date > @EndDate
AND LagEffective BETWEEN @StartDate AND @EndDate
UNION ALL
SELECT 'Trailing' AS RangeStatus, *
FROM cteLeadLag
WHERE status_id = 1
AND effective_date < @StartDate -- before the range, so 'Included' rows are not repeated
AND LeadEffective BETWEEN @StartDate AND @EndDate
I first select all records with their leading and lagging Dates and then I perform your checks on the inclusion in the desired timespan.
Try this; it should be self-explanatory. It responds to this part of your question:
I want to answer to the question "Get all companies that have been at
least for some point in status 1 inside the time period 01/01/2017 -
31/12/2017"
Case where you want to find the companies that have been in status 1 at any moment and have records in the requested period:
SELECT *
FROM company_status_history
WHERE company_id IN
( SELECT company_id
FROM company_status_history
WHERE status_id=1 )
AND effective_date BETWEEN '2017-01-01' AND '2017-12-31'
Case where you want to find the rows that are in status 1 and inside the period:
SELECT *
FROM company_status_history
WHERE status_id=1
AND effective_date BETWEEN '2017-01-01' AND '2017-12-31'

How do I insert left joins in a nested query?

First of all I would like to thank the friends who helped with this complex and difficult query.
I have three tables
Table 1
StaffId FirstName LastName staffType
---------------------------------------
1 Adam Sorme Student
2 Lara Sandra Teacher
3 Jack Jones Student
Table 2
GateId GateName
---------------------------------------
1 frontDoor
2 superDoor
Table 3
Id transitionDate GateId StaffId
---------------------------------------
1 2018-01-1 08:00:00 1 1
2 2018-01-1 10:00:00 2 1
3 2018-01-1 20:00:00 2 1
4 2018-01-2 07:00:00 1 2
5 2018-01-2 10:00:00 1 3
6 2018-01-9 12:00:00 2 2
I want the first and last movements of students for each day. Value must be set to null if no movement is available between the specified dates
transitionDate> '2018-01-1 00:00:00 000'
and transitionDate< '2018-01-03 00:00:00 000'
OUTPUT:
Id Date MinTransitionDate MaxTransitionDate FirstGateName LastGateName StaffId StaffType
1 2018-01-01 2018-01-1 08:00:00 2018-01-1 20:00:00 frontDoor superDoor 1 Student
2 2018-01-01 null null null null 3 student
3 2018-01-02 null null null null 1 student
4 2018-01-02 2018-01-2 10:00:00 null frontDoor null 3 student
The following query is partially working.
select s.staffId, d.dte,
min(t.transitionDate) as first_change,
max(t.transitionDate) as last_change,
max(case when seqnum_asc = 1 then gateId end) as first_gateid,
max(case when seqnum_desc = 1 then gateId end) as last_gateid
from (select s.* from Staff s where stafftype = 'Student') s cross join
(select distinct cast(transitionDate as date) as dte from Transitions) d left join
(select t.*,
row_number() over (partition by StaffId, cast(transitionDate as date) order by transitionDate) as seqnum_asc,
row_number() over (partition by StaffId, cast(transitionDate as date) order by transitionDate desc) as seqnum_desc
from Transitions t
) t
on cast(t.transitiondate as date) = d.dte and
t.staffId = s.staffId and
1 in (t.seqnum_asc, t.seqnum_desc)
group by s.staffId, d.dte;
Here is a SQL Fiddle.
How do I add firstGateName and LastGateName to this query result?
You can just join your existing query to the Gates table to get those names, i.e.
<existing query>
inner join Gates g1 on g1.gateId = (required gate id)
In your case you can join on the aggregated first_gateid and last_gateid values:
select
q.*,
g1.GateName as first_gate_name,
g2.GateName as last_gate_name
from
-- use existing query as a subquery, so we can easily use the first/last_gateid values
(
select s.staffId, d.dte,
min(t.transitionDate) as first_change,
max(t.transitionDate) as last_change,
max(case when seqnum_asc = 1 then gateId end) as first_gateid,
max(case when seqnum_desc = 1 then gateId end) as last_gateid
from (select s.* from Staff s where stafftype = 'Student') s cross join
(select distinct cast(transitionDate as date) as dte from Transitions) d left join
(select t.*,
row_number() over (partition by StaffId, cast(transitionDate as date) order by transitionDate) as seqnum_asc,
row_number() over (partition by StaffId, cast(transitionDate as date) order by transitionDate desc) as seqnum_desc
from Transitions t
) t
on cast(t.transitiondate as date) = d.dte and
t.staffId = s.staffId and
1 in (t.seqnum_asc, t.seqnum_desc)
group by s.staffId, d.dte
) q
-- left join on the gate ids so that rows without movements (null gate ids) are kept
left join Gates g1 on g1.gateId = q.first_gateid
left join Gates g2 on g2.gateId = q.last_gateid

Overlapping date spans

I have the following table. How can I find only the overlapping spans? In the example below, memberid 3 should not be in scope since its date spans do not overlap with each other.
Any help is highly appreciated.
MemberID fromdate todate
1 1/1/2018 12/31/2018
1 1/1/2018 12/31/2018
2 12/1/2017 1/1/2019
2 1/2/2018 2/2/2019
3 1/1/2015 12/31/2015
3 1/1/2016 12/31/2016
3 1/1/2017 12/31/2017
4 1/1/2018 1/1/2018
4 1/1/2018 1/1/2018
5 1/1/2015 1/31/2016
5 1/1/2016 7/31/2016
5 07/01/2016 12/31/2016
The expected results should be the data associated with Member IDs 1, 2, 4 and 5. Member ID 3 should not be in the result set because its date spans do not overlap.
Hmmm. You can get the overlapping spans by doing:
select m.*
from members m
where exists (select 1
from members m2
where m2.memberid = m.memberid and
m2.todate > m.fromdate and m2.fromdate < m.todate
);
If you want members that don't overlap, let's use except:
select m.memberid
from members m
except
select m.memberid
from members m
where exists (select 1
from members m2
where m2.memberid = m.memberid and
m2.todate >= m.fromdate and m2.fromdate <= m.todate
);
Except removes duplicates. But if you wanted to be extra sure and redundant, you could write select distinct for each query.
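One thing to watch with a row-against-row EXISTS check is that the table has no unique row key, so a row can match itself. As a cross-check of the expected result, here is a small Python sketch of a different technique: sort each member's spans by start date and compare each start with the latest end seen so far. The function name members_with_overlaps is made up for illustration, and spans are treated as inclusive on both ends, so identical one-day spans count as overlapping.

```python
from collections import defaultdict
from datetime import date


def members_with_overlaps(spans):
    """Return sorted member ids that have at least two overlapping spans."""
    by_member = defaultdict(list)
    for member_id, start, end in spans:
        by_member[member_id].append((start, end))

    overlapping = []
    for member_id, member_spans in by_member.items():
        member_spans.sort()
        latest_end = member_spans[0][1]
        for start, end in member_spans[1:]:
            if start <= latest_end:  # starts before an earlier span has ended
                overlapping.append(member_id)
                break
            latest_end = max(latest_end, end)
    return sorted(overlapping)


# The rows from the question as (MemberID, fromdate, todate).
spans = [
    (1, date(2018, 1, 1), date(2018, 12, 31)),
    (1, date(2018, 1, 1), date(2018, 12, 31)),
    (2, date(2017, 12, 1), date(2019, 1, 1)),
    (2, date(2018, 1, 2), date(2019, 2, 2)),
    (3, date(2015, 1, 1), date(2015, 12, 31)),
    (3, date(2016, 1, 1), date(2016, 12, 31)),
    (3, date(2017, 1, 1), date(2017, 12, 31)),
    (4, date(2018, 1, 1), date(2018, 1, 1)),
    (4, date(2018, 1, 1), date(2018, 1, 1)),
    (5, date(2015, 1, 1), date(2016, 1, 31)),
    (5, date(2016, 1, 1), date(2016, 7, 31)),
    (5, date(2016, 7, 1), date(2016, 12, 31)),
]
print(members_with_overlaps(spans))  # [1, 2, 4, 5]
```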
Try this:
;with cte as
(select memberid, convert(Varchar,fromdate,101)fromdate,convert(Varchar,todate,101)todate from #tb),
cte2 as
(select Num,memberid,todate,fromdate,Num + 1 as num2 from
(select ROW_NUMBER() over(partition by memberid order by fromdate) as Num,memberid,fromdate,todate from cte) as a),
cte3 as
(select memberid,fromdate,todate, DATEDIFF(day,fromdate,todate) as date_diff from
(select ISNULL(memberid,bnum)memberid , isnull(fromdate1,fromdate2)fromdate,isnull(fromdate2,fromdate1)todate,bnum from
(select a.num,a.fromdate,a.todate,a.num2 as num1,a.memberid,case when a.Num=b.num2 then b.todate else a.fromdate end as fromdate1,
case when a.Num=b.num2 then a.fromdate else b.todate end as fromdate2,b.num2,b.todate as todate2,b.Num as bnum from cte2 as a
full join cte2 as b
on a.num = b.num2 and a.memberid = b.memberid) as a) as a)
select distinct memberid from cte3 where date_diff<0

Contiguous Dates

Here is the table that I am working with:
MemberID MembershipStartDate MembershipEndDate
=================================================================
123 2010-01-01 00:00:00.000 2012-12-31 00:00:00.000
123 2011-01-01 00:00:00.000 2012-12-31 00:00:00.000
123 2013-05-01 00:00:00.000 2013-12-31 00:00:00.000
123 2014-01-01 00:00:00.000 2014-12-31 00:00:00.000
123 2015-01-01 00:00:00.000 2015-03-31 00:00:00.000
What I want is to create one row that shows continuous membership,
and a second row if the membership breaks by more than 2 days, with a new start and end date..
So the output I am looking for is like:
MemberID MembershipStartDate MembershipEndDate
=================================================================
123 2010-01-01 00:00:00.000 2012-12-31 00:00:00.000
123 2013-05-01 00:00:00.000 2015-03-31 00:00:00.000
There is a memberID field attached to these dates which is how they are grouped.
I've had to deal with this kind of thing before
I use something like this
USE tempdb
--Create test Data
DECLARE @Membership TABLE (MemberID int ,MembershipStartDate date,MembershipEndDate date)
INSERT @Membership
(MemberID,MembershipStartDate,MembershipEndDate)
VALUES (123,'2010-01-01','2012-12-31'),
(123,'2011-01-01','2012-12-31'),
(123,'2013-05-01','2013-12-31'),
(123,'2014-01-01','2014-12-31'),
(123,'2015-01-01','2015-03-31')
--Create a table to hold all the dates that might be turning points
DECLARE @SignificantDates Table(MemberID int, SignificantDate date, IsMember bit DEFAULT 0)
--Populate table with the start and end dates as well as the days just before and just after each period
INSERT @SignificantDates (MemberID ,SignificantDate)
SELECT MemberID, MembershipStartDate FROM @Membership
UNION
SELECT MemberID,DATEADD(day,-1,MembershipStartDate ) FROM @Membership
UNION
SELECT MemberID,MembershipEndDate FROM @Membership
UNION
SELECT MemberID,DATEADD(day,1,MembershipEndDate) FROM @Membership
--Set the is member flag for each date that is covered by a membership of the same member
UPDATE sd SET IsMember = 1
FROM @SignificantDates sd
JOIN @Membership m ON m.MemberID = sd.MemberID AND MembershipStartDate<= SignificantDate AND SignificantDate <= MembershipEndDate
--To demonstrate what we're about to do, Select all the dates and show the IsMember Flag and the previous value
SELECT sd.MemberID, sd.SignificantDate,sd.IsMember, prv.prevIsMember
FROM
@SignificantDates sd
JOIN (SELECT
MemberId,
SignificantDate,
IsMember,
Lag(IsMember,1) OVER (PARTITION BY MemberId ORDER BY SignificantDate) AS prevIsMember FROM @SignificantDates
) as prv
ON sd.MemberID = prv.MemberID
AND sd.SignificantDate = prv.SignificantDate
ORDER BY sd.MemberID, sd.SignificantDate
--Delete the ones where the flag is the same as the previous value
delete sd
FROM
@SignificantDates sd
JOIN (SELECT MemberId, SignificantDate,IsMember, Lag(IsMember,1) OVER (PARTITION BY MemberId ORDER BY SignificantDate) AS prevIsMember FROM @SignificantDates ) as prv
ON sd.MemberID = prv.MemberID
AND sd.SignificantDate = prv.SignificantDate
AND prv.IsMember = prv.prevIsMember
--SELECT the Start date for each period of membership and the day before the following period of non membership
SELECT
nxt.MemberId,
nxt.SignificantDate AS MembershipStartDate,
DATEADD(day,-1,nxt.NextSignificantDate) AS MembershipEndDate
FROM
(
SELECT
MemberID,
SignificantDate,
LEAD(SignificantDate,1) OVER (PARTITION BY MemberId ORDER BY SignificantDate) AS NextSignificantDate,
IsMember
FROM @SignificantDates
) nxt
WHERE nxt.IsMember = 1
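For comparison, the same merge can be sketched procedurally. The Python below is not the approach above but a simpler sort-and-sweep: it orders each member's periods by start date and opens a new period only when the next start is more than max_gap_days days after the current end. The function name merge_memberships and the max_gap_days parameter are made up for illustration, and reading "breaks by more than 2 days" as a start-minus-end difference greater than 2 is an assumption.

```python
from collections import defaultdict
from datetime import date


def merge_memberships(rows, max_gap_days=2):
    """Merge each member's periods; a gap above max_gap_days starts a new period."""
    by_member = defaultdict(list)
    for member_id, start, end in rows:
        by_member[member_id].append((start, end))

    merged = []
    for member_id in sorted(by_member):
        current_start, current_end = None, None
        for start, end in sorted(by_member[member_id]):
            if current_start is None:
                current_start, current_end = start, end
            elif (start - current_end).days > max_gap_days:
                # Gap is too large: close the running period and start a new one.
                merged.append((member_id, current_start, current_end))
                current_start, current_end = start, end
            else:
                # Overlapping or near-adjacent: extend the running period.
                current_end = max(current_end, end)
        merged.append((member_id, current_start, current_end))
    return merged


# The rows from the question.
rows = [
    (123, date(2010, 1, 1), date(2012, 12, 31)),
    (123, date(2011, 1, 1), date(2012, 12, 31)),
    (123, date(2013, 5, 1), date(2013, 12, 31)),
    (123, date(2014, 1, 1), date(2014, 12, 31)),
    (123, date(2015, 1, 1), date(2015, 3, 31)),
]
for member_id, start, end in merge_memberships(rows):
    print(member_id, start, end)
```

On the sample rows this yields the two periods asked for in the question, 2010-01-01 to 2012-12-31 and 2013-05-01 to 2015-03-31.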