Select duplicate persons with duplicate memberships - sql

SQL Fiddle with schema and my initial attempt.
CREATE TABLE person
([firstname] varchar(10), [surname] varchar(10), [dob] date, [personid] int);
INSERT INTO person
([firstname], [surname], [dob] ,[personid])
VALUES
('Alice', 'AA', '1/1/1990', 1),
('Alice', 'AA', '1/1/1990', 2),
('Bob' , 'BB', '1/1/1990', 3),
('Carol', 'CC', '1/1/1990', 4),
('Alice', 'AA', '1/1/1990', 5),
('Kate' , 'KK', '1/1/1990', 6),
('Kate' , 'KK', '1/1/1990', 7)
;
CREATE TABLE person_membership
([personid] int, [personstatus] varchar(1), [memberid] int);
INSERT INTO person_membership
([personid], [personstatus], [memberid])
VALUES
(1, 'A', 10),
(2, 'A', 20),
(3, 'A', 30),
(3, 'A', 40),
(4, 'A', 50),
(4, 'A', 60),
(5, 'T', 70),
(6, 'A', 80),
(7, 'A', 90);
CREATE TABLE membership
([membershipid] int, [memstatus] varchar(1));
INSERT INTO membership
([membershipid], [memstatus])
VALUES
(10, 'A'),
(20, 'A'),
(30, 'A'),
(40, 'A'),
(50, 'T'),
(60, 'A'),
(70, 'A'),
(80, 'A'),
(90, 'T');
There are three tables (as per the fiddle above). The person table contains duplicates: the same people entered more than once. For the purpose of this exercise, we assume that the combination of first name, surname and DoB is enough to uniquely identify a person.
I am trying to build a query which will show duplicate people (first name + surname + DoB) with two or more active entries in the person table (person_membership.personstatus = 'A') AND two or more active memberships (membership.memstatus = 'A').
Using the example from SQL Fiddle, the result of the query should be just Alice (two active person IDs, two active membership IDs).
I think I'm making progress with the following effort, but it looks rather cumbersome and I need to remove Kate from the final result, as she doesn't have a duplicate membership.
SELECT q.firstname, q.surname, q.dob, p1.personid, m.membershipid
FROM
(SELECT
p.firstname,p.surname,p.dob, count(*) as cnt
FROM
person p
GROUP BY
p.firstname,p.surname,p.dob
HAVING COUNT(1) > 1) as q
INNER JOIN person p1 ON q.firstname=p1.firstname AND q.surname=p1.surname AND q.dob=p1.dob
INNER JOIN person_membership pm ON p1.personid=pm.personid
INNER JOIN membership m ON pm.memberid = m.membershipid
WHERE pm.personstatus = 'A' AND m.memstatus = 'A'

Since you are using SQL Server, window functions will be handy for this scenario. The following will give you the expected output.
SELECT firstname,surname,dob,personid,memberid
from(
SELECT firstname,surname,dob,p.personid,memberid
,Rank() over(partition by p.firstname,p.surname,p.dob order by p.personid) rnasc
,Rank() over(partition by p.firstname,p.surname,p.dob order by p.personid desc) rndesc
FROM person p
INNER JOIN person_membership pm ON p.personid=pm.personid
INNER JOIN membership m ON pm.memberid = m.membershipid
where personstatus='A' and memstatus='A')a
where a.rnasc+rndesc>2

You have to add GROUP BY and HAVING clauses to return only the duplicate items:
SELECT
person.firstname,person.surname,person.dob
FROM
person, person_membership, membership
WHERE
person.personid=person_membership.personid AND person_membership.memberid = membership.membershipid
AND
person_membership.personstatus = 'A' AND membership.memstatus = 'A'
GROUP BY
person.firstname,person.surname,person.dob
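-- COUNT(1) > 1 alone would also return Bob (one person ID with two active memberships)
HAVING COUNT(DISTINCT person.personid) > 1 AND COUNT(DISTINCT membership.membershipid) > 1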
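If you also want the person and membership IDs for each qualifying person (as the query in the question selects them), one option is to join the grouped result back to the active rows. This is only a sketch: it runs the fiddle data through Python's sqlite3 module rather than SQL Server, the dates are rewritten as ISO strings, and the CTE names active and dupes are made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server here (an assumption for portability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (firstname TEXT, surname TEXT, dob TEXT, personid INTEGER);
INSERT INTO person VALUES
 ('Alice','AA','1990-01-01',1), ('Alice','AA','1990-01-01',2),
 ('Bob','BB','1990-01-01',3),   ('Carol','CC','1990-01-01',4),
 ('Alice','AA','1990-01-01',5), ('Kate','KK','1990-01-01',6),
 ('Kate','KK','1990-01-01',7);
CREATE TABLE person_membership (personid INTEGER, personstatus TEXT, memberid INTEGER);
INSERT INTO person_membership VALUES
 (1,'A',10),(2,'A',20),(3,'A',30),(3,'A',40),(4,'A',50),
 (4,'A',60),(5,'T',70),(6,'A',80),(7,'A',90);
CREATE TABLE membership (membershipid INTEGER, memstatus TEXT);
INSERT INTO membership VALUES
 (10,'A'),(20,'A'),(30,'A'),(40,'A'),(50,'T'),
 (60,'A'),(70,'A'),(80,'A'),(90,'T');
""")

query = """
WITH active AS (
    -- every active person row paired with an active membership
    SELECT p.firstname, p.surname, p.dob, p.personid, m.membershipid
    FROM person p
    JOIN person_membership pm ON pm.personid = p.personid
    JOIN membership m ON m.membershipid = pm.memberid
    WHERE pm.personstatus = 'A' AND m.memstatus = 'A'
),
dupes AS (
    -- people with at least two distinct person IDs and two distinct memberships
    SELECT firstname, surname, dob
    FROM active
    GROUP BY firstname, surname, dob
    HAVING COUNT(DISTINCT personid) > 1
       AND COUNT(DISTINCT membershipid) > 1
)
SELECT a.firstname, a.personid, a.membershipid
FROM active a
JOIN dupes d
  ON d.firstname = a.firstname AND d.surname = a.surname AND d.dob = a.dob
ORDER BY a.personid
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Counting distinct person IDs is what keeps Bob out: he has two active memberships but only one person row.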

Related

How to write a running total based on criteria in T-SQL

I'm building a report which gives me the total count of unique accounts within a calendar month.
However, this total is based on the number of active accounts (accounts subscribed to a service), and once their contract ends they will be excluded from the total count.
For example, Company A subscribed to the service on 1/1/2018 and their contract ends on 1/1/2020. So Company A should be included in the total count of unique accounts for every month they're under contract, until the contract ends.
The end result would look something like this:
Here is the SQL query that I have so far. How can I write the code so that it gives me this cumulative/running total? I added the columns for reference.
SELECT A.Name, CA.Name, CA.Start_Date__c, CA.End_Date__c, CA.Product_Code_CPQ__c
FROM [salesforce].[Client_Asset__c] AS CA
INNER JOIN salesforce.Account AS A
ON CA.Account__c = A.Id
WHERE Product_Code_CPQ__c IN(
'DSWPSTRSUB','DSWPESSSUB','DSWPPROSUB','DSWPHOSTSUB','DSWPMULTIHOSTSUB','DSWPOLXWRAPFPE',
'DSWPOLXWRAPSUB','WPCALENDARFORALT','WPCALHOSTINGBUN','IMWPTM','SBWPRET','SBWPRETNR','WORDPLUMWEBSUCCESS',
'WORDPWEBSUCCESS','WORDPOGS','FDSTRWORDPDESGNSUB','FDWPFPE','WORDPEMERGHOST','WORDPSUBBUN','WPOLXPLUGIN',
'POSTSTARTWORDPAF','POSTWORDPSTARTBUN','LUMWORDPSSUBBUN','WORDPLUMOGS','LUMFDSTRWPDESGNSUB',
'LUMPSTWORDPSTRBUN','LUMPOSTSTRTWORDPAF','FDWPEMERGFPE')
AND End_Date__c > GETDATE()
AND Active__c = 1
Try something like this:
CREATE TABLE #tmp ([month] INT, [group] VARCHAR(10), [value] REAL)
INSERT INTO #tmp ([month], [group], [value]) VALUES
(1, 'A', 1), (2, 'A', 5), (3, 'A', 3), (4, 'A', 2), (5, 'A', 8),
(1, 'B', 7), (2, 'B', 3), (3, 'B', 2), (4, 'B', 4), (5, 'B', 6)
SELECT c.[month], c.[group], c.current_total, r.running_total
FROM
(
SELECT [month],[group], SUM([value]) current_total
FROM #tmp
GROUP BY [month],[group]
) C JOIN
(
SELECT [month],[group], SUM([value]) OVER (partition BY [group] ORDER BY [month]) running_total
FROM #tmp
) R ON C.[month]=R.[month] AND C.[group]=R.[group]
ORDER BY 2,1
Tested on SQL Server 2016. You will need to handle potential missing values yourself.
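Since the question is really about counting accounts whose contract covers each month, a plain running total may not be enough, because accounts have to drop out once their contract ends. One possible approach is to join a list of months to the contract date ranges. The sketch below is only an illustration: the contracts table, its columns and the three sample companies are invented, it runs on SQLite through Python's sqlite3 module, and treating a contract as still active on its end date is my assumption. On SQL Server the month list would need DATEADD and a CTE without the RECURSIVE keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical stand-in for the Client_Asset__c / Account join in the question.
CREATE TABLE contracts (account TEXT, start_date TEXT, end_date TEXT);
INSERT INTO contracts VALUES
 ('Company A', '2018-01-01', '2018-04-01'),
 ('Company B', '2018-02-15', '2018-03-15'),
 ('Company C', '2018-03-01', '2018-06-01');
""")

query = """
WITH RECURSIVE months(month_start) AS (
    -- one row per calendar month in the reporting window
    SELECT date('2018-01-01')
    UNION ALL
    SELECT date(month_start, '+1 month') FROM months
    WHERE month_start < '2018-05-01'
)
SELECT substr(m.month_start, 1, 7) AS month,
       COUNT(DISTINCT c.account)   AS active_accounts
FROM months m
LEFT JOIN contracts c
  ON c.start_date < date(m.month_start, '+1 month')  -- started before the month ended
 AND c.end_date >= m.month_start                     -- not yet ended when the month began
GROUP BY m.month_start
ORDER BY m.month_start
"""
rows = conn.execute(query).fetchall()
for month, active in rows:
    print(month, active)
```

The LEFT JOIN keeps months with no active accounts in the output with a count of 0.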

Find the Biggest Number of Consecutive Occurrence of values in Table

I have the following table
create table Launches (Id int, Name char)
insert into Launches values
(1, 'A'),
(2, 'A'),
(3, 'B'),
(4, 'B'),
(5, 'B'),
(6, 'B'),
(7, 'C'),
(8, 'B'),
(9, 'B')
The result should be
4 - B
From 3 to 6
Similar question -
Count Number of Consecutive Occurrence of values in Table
You can subtract an enumerated value for each name to get a constant for adjacent values that are the same. The rest is aggregation:
select top (1) name, count(*), min(id), max(id)
from (select l.*,
row_number() over (partition by name order by id) as seqnum
from Launches l
) l
group by (id - seqnum), name
order by count(*) desc;
Here is a db<>fiddle.
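To see why the subtraction works, it can help to list every run rather than only the longest one. The sketch below runs the same idea against the question's data using Python's sqlite3 module (an assumption on my part, since the answer targets SQL Server; it needs SQLite 3.25 or later for window functions). The alias grp is just an illustrative name, and the trick relies on Id having no gaps.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Launches (Id INTEGER, Name TEXT);
INSERT INTO Launches VALUES
 (1,'A'),(2,'A'),(3,'B'),(4,'B'),(5,'B'),(6,'B'),(7,'C'),(8,'B'),(9,'B');
""")

islands = conn.execute("""
SELECT Name, COUNT(*) AS run_length, MIN(Id) AS first_id, MAX(Id) AS last_id
FROM (
    SELECT Id, Name,
           -- Id minus the per-name row number stays constant within a run
           Id - ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Id) AS grp
    FROM Launches
) AS numbered
GROUP BY Name, grp
ORDER BY first_id
""").fetchall()

for island in islands:
    print(island)

# The longest run is the one the question asks for.
longest = max(islands, key=lambda r: r[1])
print(longest)
```

For B the difference is 2 for ids 3 to 6 and 3 for ids 8 and 9, which is what splits the two B runs apart.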

Retrieving consecutive rows (and the counts) with the same values

I've got a table with almost 10 million views and would like to run this query on the latest million or hundred thousand or so.
Here's a SQL fiddle with example data and input/output: http://sqlfiddle.com/#!9/340a41
Is this even possible?
CREATE TABLE object (`id` int, `name` varchar(7), `value` int);
INSERT INTO object (`id`, `name`, `value`)
VALUES
(1, 'a', 1),
(2, 'b', 2),
(3, 'c', 100),
(4, 'a', 1),
(5, 'b', 2),
(6, 'c', 200),
(7, 'a', 2),
(8, 'b', 2),
(9, 'c', 300),
(10, 'a', 2),
(11, 'b', 2),
(12, 'a', 2),
(13, 'b', 2),
(14, 'c', 400)
;
-- Want:
-- name, max(id), count(id)
-- 'a', 4, 2
-- 'b', 14, 5
-- 'a', 12, 3
If you want the latest rows and the id is assigned sequentially, then you can do this using limit or top. In SQL Server:
select top 100000 o.*
from object o
order by id desc;
In MySQL, you would use limit:
select o.*
from object o
order by id desc
limit 100000
select name, count(id) cnt, max(id) max_id, max(value) max_v
from
(select
top 1000000 -- MS SQL Server only
id,name,value
from myTable
order by id desc
limit 1000000 -- MySQL only
) t
group by name
Remove whichever line (top or limit) doesn't match your server.

How do you join tables sharing the same column?

I made an SQL Fiddle and what I would like to do is join these two queries by using the departmentid.
What I would like to show is the departmentname and not_approved_manager.
Would it be best to use a union or join in this case?
Tables
create table cserepux
(
status int,
comment varchar(25),
departmentid int,
approveddate datetime
);
insert into cserepux (status, comment, departmentid, approveddate)
values (1, 'testing1', 1, NULL), (1, 'testing2', 1, NULL),
(1, 'testing2', 2, NULL), (0, 'testing2', 1, NULL),
(0, 'tesitng2', 1, NULL), (0, 'testing2', 1, NULL),
(0, 'tesitng2', 1, NULL), (0, 'testing3', 2, NULL),
(0, 'testing3', 3, NULL);
create table cseDept
(
departmentid int,
department_name varchar(25)
);
insert into cseDept (departmentid,department_name)
values (1, 'department one'), (2, 'department two'),
(3, 'department three'), (4, 'department four');
Query
select
departmentid,
COUNT(*) AS 'not_approved_manager'
from
cserepux
where
approveddate is null
group by
departmentid
SELECT * FROM cseDept
You need to do a join. A union will not get you what you want.
select d.department_name, COUNT(*) AS 'not_approved_manager'
from cserepux c
inner join cseDept d on c.departmentid = d.departmentid
where approveddate is null
group by d.department_name
You just need a join and a correct GROUP BY:
select dep.department_name, COUNT(*) AS 'not_approved_manager'
from cseDept dep
join cserepux cs on cs.departmentid = dep.departmentid
where approveddate is null
group by dep.department_name
Fiddle: http://sqlfiddle.com/#!3/5cf4e/30
Since joins and GROUP BY are really basic things in SQL, I suggest you take a look at some tutorials to get a bit more proficient with them. You can try the SQL Server Central Stairway article series.
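One thing to be aware of with the inner joins above: department four has no rows in cserepux, so it drops out of the result. If it should appear with a count of 0, a LEFT JOIN starting from cseDept is one way to get that. The sketch below checks this on the question's data using Python's sqlite3 module (my assumption, as a stand-in for SQL Server; the SQL itself is standard).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cserepux (status INTEGER, comment TEXT, departmentid INTEGER, approveddate TEXT);
INSERT INTO cserepux VALUES
 (1,'testing1',1,NULL),(1,'testing2',1,NULL),(1,'testing2',2,NULL),
 (0,'testing2',1,NULL),(0,'tesitng2',1,NULL),(0,'testing2',1,NULL),
 (0,'tesitng2',1,NULL),(0,'testing3',2,NULL),(0,'testing3',3,NULL);
CREATE TABLE cseDept (departmentid INTEGER, department_name TEXT);
INSERT INTO cseDept VALUES
 (1,'department one'),(2,'department two'),
 (3,'department three'),(4,'department four');
""")

rows = conn.execute("""
SELECT d.department_name,
       COUNT(c.departmentid) AS not_approved_manager  -- counts matched rows only
FROM cseDept d
LEFT JOIN cserepux c
  ON c.departmentid = d.departmentid
 AND c.approveddate IS NULL        -- filter in ON so unmatched departments survive
GROUP BY d.departmentid, d.department_name
ORDER BY d.departmentid
""").fetchall()

for name, pending in rows:
    print(name, pending)
```

Counting c.departmentid instead of * matters here, because COUNT(*) would report 1 for a department that only has the NULL-padded row from the outer join.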

SQL IF condition from other tables

I'm new to the site and I need some help from you guys. Below is the schema I have, which can be run at http://sqlfiddle.com/#!3/134c3. The name of my database is vehicle inspections. My question comes after this schema.
CREATE TABLE Car
([CarID] varchar(36),
[PlateNo] varchar(6),
[Package] int);
INSERT INTO Car([CarID], [PlateNo], [Package])
VALUES('A57D4151-BD49-4B44-AF10-000F1C298E05', '8112AG', 4);
CREATE TABLE Event
([EventID] int,
[CarID] varchar(36),
[EventTime] smalldatetime,
TicketStatus varchar (10)) ;
INSERT INTO Event([EventID], [CarID], [EventTime], TicketStatus)
VALUES (1, 'A57D4151-BD49-4B44-AF10-000F1C298E05', '20130701', 'Open'),
(2, 'A57D4151-BD49-4B44-AF10-000F1C298E05', '20130702', 'Close') ;
CREATE TABLE EventDefects
([EventDefectsID] int,
[EventID] int,
[Status] varchar(15),
[DefectID] int) ;
INSERT INTO EventDefects ([EventDefectsID], [EventID], [Status], [DefectID])
VALUES (1, 1, 'YES', 1),
(2, 1, 'NO', 2),
(3, 1, 'N/A', 3),
(4, 1, 'N/A', 4),
(5, 2, 'N/A', 1),
(6, 2, 'N/A', 2),
(7, 2, 'N/A', 5),
(8, 2, 'YES', 3),
(9, 2, 'NO', 4) ;
CREATE TABLE Defects
([DefectID] int,
[DefectsName] varchar (36),
[DefectClassID] int) ;
INSERT INTO Defects ([DefectID], [DefectsName], [DefectClassID])
VALUES (1, 'TYRE', 1),
(2, 'BRAKING SYSTEM', 1),
(3, 'OVER SPEEDING', 3),
(4, 'NOT WEARING SEATBELTS', 3),
(5, 'MIRRORS AND WINDSCREEN', 2) ;
CREATE TABLE DefectClass
([Description] varchar (15),
[DefectClassID] int) ;
INSERT INTO DefectClass ([DefectClassID], [Description])
VALUES (1, 'CATEGORY A'),
(2, 'CATEGORY B'),
(3, 'CATEGORY C')
To clarify things: there are two conditions under which we issue a ticket to the driver.
Condition No. 1 is where the vehicle is inspected. If defects are found on any items under Class A or B (ticked 'YES'), the ticket status is OPEN. If all items under Class A and B are ticked 'NO', no defects were found and the ticket status is CLOSE. Items under Class C (traffic violations) are ticked 'N/A', meaning it is a mere vehicle inspection.
Condition No. 2 is where the vehicle is stopped because of a traffic violation (e.g. over speeding). The vehicle will NOT be inspected. What distinguishes this kind of ticket is that all items under Class A and B are marked 'N/A', while items under Class C are ticked either 'YES' or 'NO'.
Now I have the SQL code below, which can be used with the schema above; it extracts each vehicle's row at its MAX(EventTime) with the corresponding ticket status.
Select
PlateNo, TicketStatus, [EventTime]
FROM
(SELECT
ROW_NUMBER() OVER (PARTITION BY Event.CarID ORDER BY [EventTime] DESC) AS [index],
Event.CarID,
TicketStatus,
[EventTime],
plateNo
FROM
[Event]
Join
[Car] ON Event.CarID = Car.CarID) A
WHERE [index] = 1
Result:
PlateNo - 8112AG; EventTime - July 2, 2013; TicketStatus - Close.
This is NOT correct, since on this particular date there was no inspection at all; the driver was only caught for OVER SPEEDING (see the schema above) and the items under Class A and B are marked N/A.
The correct result should be one step back, which is July 1, 2013 with Ticket Status OPEN, since that was a real inspection: items under Class A and B were inspected, the TYRE was found defective and the BRAKING SYSTEM had NO defects.
I was thinking of code where, if Event.TicketStatus is CLOSE, it examines whether the ticket is closed because the vehicle was inspected or because of a traffic violation.
Try this.
SELECT
PlateNo,
TicketStatus,
MAX(EventTime)
FROM
[Event] E
LEFT OUTER JOIN
[EventDefects] ED ON E.EventID = ED.EventID
LEFT OUTER JOIN
[Defects] D ON ED.DefectID = D.DefectID
LEFT OUTER JOIN
[Car] C ON E.CarID = C.CarID
WHERE ED.Status = 'YES' AND D.DefectClassID <> 3
GROUP BY PlateNo, TicketStatus
I think you can solve it this way:
SELECT C.PlateNo, E.EventID, E.TicketStatus, E.EventTime
FROM Car C
INNER JOIN Event E ON C.CarID = E.CarID
INNER JOIN (
SELECT CarID, MAX(E.EventTime) EventTime FROM Event E
LEFT JOIN EventDefects ED ON E.EventID = ED.EventID
LEFT JOIN Defects D ON ED.DefectID = D.DefectID
WHERE D.DefectClassID IN (1,2) AND ED.Status <> 'N/A'
GROUP BY CarID
) T ON E.CarID = T.CarID AND E.EventTime = T.EventTime
The subquery keeps only events with class 1 and 2 items (inspection) where something was recorded (<> 'N/A') and takes the maximum date, so it returns the last occurrence of a real inspection for each car. The outer join then brings in the ticket status on that date. From what I understood, that's what you want, right?
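If you would rather keep the ROW_NUMBER() shape from the question, the same filter can probably be expressed with EXISTS before ranking. This is a sketch under a few assumptions: it runs on SQLite via Python's sqlite3 module instead of SQL Server, the CarID is shortened to the placeholder 'car-1', the unused Package column is dropped, and the alias ranked is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Car (CarID TEXT, PlateNo TEXT);
INSERT INTO Car VALUES ('car-1', '8112AG');  -- placeholder ID instead of the GUID
CREATE TABLE "Event" (EventID INTEGER, CarID TEXT, EventTime TEXT, TicketStatus TEXT);
INSERT INTO "Event" VALUES
 (1, 'car-1', '2013-07-01', 'Open'),
 (2, 'car-1', '2013-07-02', 'Close');
CREATE TABLE EventDefects (EventDefectsID INTEGER, EventID INTEGER, Status TEXT, DefectID INTEGER);
INSERT INTO EventDefects VALUES
 (1,1,'YES',1),(2,1,'NO',2),(3,1,'N/A',3),(4,1,'N/A',4),
 (5,2,'N/A',1),(6,2,'N/A',2),(7,2,'N/A',5),(8,2,'YES',3),(9,2,'NO',4);
CREATE TABLE Defects (DefectID INTEGER, DefectsName TEXT, DefectClassID INTEGER);
INSERT INTO Defects VALUES
 (1,'TYRE',1),(2,'BRAKING SYSTEM',1),(3,'OVER SPEEDING',3),
 (4,'NOT WEARING SEATBELTS',3),(5,'MIRRORS AND WINDSCREEN',2);
""")

rows = conn.execute("""
SELECT PlateNo, TicketStatus, EventTime
FROM (
    SELECT c.PlateNo, e.TicketStatus, e.EventTime,
           ROW_NUMBER() OVER (PARTITION BY e.CarID ORDER BY e.EventTime DESC) AS rn
    FROM "Event" e
    JOIN Car c ON c.CarID = e.CarID
    -- keep only events where at least one Class A/B item was actually checked
    WHERE EXISTS (
        SELECT 1
        FROM EventDefects ed
        JOIN Defects d ON d.DefectID = ed.DefectID
        WHERE ed.EventID = e.EventID
          AND d.DefectClassID IN (1, 2)
          AND ed.Status <> 'N/A'
    )
) AS ranked
WHERE rn = 1
""").fetchall()
print(rows)
```

The July 2 event is filtered out before ranking because all of its Class A and B items are 'N/A', so the July 1 inspection becomes the latest row for the car.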