Identifying records with inconsistent relationships - sql

You can run all this SQL and see the results here.
Skip to the results and the ensuing problems to get to the meat of the issue.
I have a table of Clubs (a club as in a group or organization of people, like a "swim club" or a "knitting club").
CREATE TABLE #club (
Id INT
,Name NVARCHAR(255)
);
INSERT INTO #club VALUES
(1, 'Swim Club')
,(2, 'Knitting Club')
,(3, 'Bridge Club');
I have a table of Members.
CREATE TABLE #member (
Id INT
,Name NVARCHAR(255)
);
INSERT INTO #member VALUES
(1, 'John Jones')
,(2, 'Sally Smith')
,(3, 'Rod Roosevelt')
,(4, 'Bobby Burns')
,(5, 'Megan Moore');
Members can belong to many Clubs, and so there is a Membership table that connects Clubs to Members (and also describes the membership dues price).
CREATE TABLE #membership (
Id INT
,Member INT --FK to #member
,Club INT --FK to #club
,Dues INT --the cost of membership
);
INSERT INTO #membership VALUES
(1,1,1,10)
,(2,1,2,5)
,(3,2,1,10)
,(4,2,3,20)
,(5,3,1,10)
,(6,3,2,5)
,(7,4,2,5)
,(8,4,3,20)
,(9,5,1,10)
,(10,5,3,20);
Most Members just pay their associated Dues. Some Members, however, are Sponsored by other Members. And so those Sponsored members will have their Dues paid by another Member (the Sponsor). Therefore, we have a Sponsorship table. The Sponsorship table connects Sponsors (paying for the Dues) to a Sponsee (having their Dues paid by a Sponsor) for a particular Club. Because a Sponsorship is specific to a Club, a Sponsorship record connects two Membership records and NOT two Member records.
CREATE TABLE #sponsorship (
Id INT
,Sponsee_Membership INT --FK to Sponsee's #membership record
,Sponsor_Membership INT --FK to Sponsor's #membership record
);
INSERT INTO #sponsorship VALUES
(1,5,1)
,(2,8,4)
,(3,9,3)
,(4,10,4);
To get a full view of our Clubs/Memberships/Sponsors, we have:
SELECT
mship.Id AS 'Mship'
,mem.Name AS 'Member'
,c.Name AS 'Club'
,mship.Dues
,spons_mem.Name AS 'Sponsor'
FROM
#membership AS mship
JOIN #member AS mem
ON mship.Member = mem.Id
JOIN #club AS c
ON mship.Club = c.Id
LEFT JOIN #sponsorship AS spons
ON spons.Sponsee_Membership = mship.Id
LEFT JOIN #membership AS spons_mship
ON spons_mship.Id = spons.Sponsor_Membership
LEFT JOIN #member AS spons_mem
ON spons_mem.Id = spons_mship.Member;
which gives us these Results:
Mship  Member         Club           Dues  Sponsor
1      John Jones     Swim Club      10    NULL
2      John Jones     Knitting Club  5     NULL
3      Sally Smith    Swim Club      10    NULL
4      Sally Smith    Bridge Club    20    NULL
5      Rod Roosevelt  Swim Club      10    John Jones
6      Rod Roosevelt  Knitting Club  5     NULL
7      Bobby Burns    Knitting Club  5     NULL
8      Bobby Burns    Bridge Club    20    Sally Smith
9      Megan Moore    Swim Club      10    Sally Smith
10     Megan Moore    Bridge Club    20    Sally Smith
Sponsorships SHOULD span all shared memberships.
That is, if Sally sponsors Bobby, any time they are both in the same club, Sally will be identified as Bobby's sponsor.
We can see this in lines Mship=7 and Mship=8.
Bobby and Sally are both in the Bridge Club, so Sally is identified as Bobby's sponsor for his Bridge Club membership.
Sally is NOT a member of the Knitting Club, and so Bobby's Knitting Club membership does not show Sally as a sponsor.
Sorry for the long setup. Here's my actual question:
How can I identify where a Sponsorship is missing?
From the example, we have lines Mship=5 and Mship=6.
John is Rod's sponsor.
We can see the sponsorship for Rod's Swim Club membership.
Rod and John are also both Knitting Club members,
but Rod does not show John as a sponsor for his Knitting Club membership.
This is incorrect and this is what I am after.
I want to query all such missing sponsorships.
I can accomplish this using cursors / WHILE loops, but I know that such solutions are usually not taking a proper set-based approach. What would a proper query for this look like?
Many thanks.

Here is a SQL query that might respond to your requirement.
The logic is to use a subquery to generate a mapping between sponsors and sponsees based on member.Id instead of membership.Id; for this, we use aggregation. The outer query then searches for clubs where both the sponsor and the sponsee participate, but for which no relation is declared in the sponsorship table.
The query returns one record for each offending membership, with the sponsee and sponsor names.
SELECT mship1.Id, m1.Name Member, m2.Name Sponsor, c.Name Club, mship1.Dues
FROM
#membership mship1
INNER JOIN #club c ON c.Id = mship1.Club
INNER JOIN (
SELECT ms1.Member Sponsee_Member , MAX(ms2.Member) Sponsor_Member
FROM #sponsorship ss
INNER JOIN #membership ms1 ON ms1.Id = Sponsee_Membership
INNER JOIN #membership ms2 ON ms2.Id = Sponsor_Membership
GROUP BY ms1.Member
) rels ON rels.Sponsee_Member = mship1.Member
INNER JOIN #membership mship2 ON mship2.Member = rels.Sponsor_Member AND mship2.Club = mship1.Club
INNER JOIN #member m1 ON m1.Id = mship1.Member
INNER JOIN #member m2 ON m2.Id = mship2.Member
LEFT JOIN #sponsorship sship ON sship.Sponsor_Membership = mship2.Id AND sship.Sponsee_Membership = mship1.Id
WHERE sship.Id IS NULL
;
In the rextester that you provided, this returns:
Id | Member | Sponsor | Club | Dues
-----|-----------------|--------------|----------------|-----
6 | Rod Roosevelt | John Jones | Knitting Club | 5
Working on this query got me thinking that you could improve your database design. The current model makes it hard to maintain consistency: your question itself demonstrates that. In the future, what happens if a sponsor registers in a new club where one of their sponsees already participates? Once again you will need to detect the missing sponsorship relation, and somehow create it.
The sponsorship is really a relationship between two members rather than two memberships, since you stated that sponsorships SHOULD span all shared memberships. It does not look like you would allow a sponsee to have several sponsors, even across different clubs.
I would suggest that you drop the sponsorship table and store a self-referencing foreign key to the sponsor directly in the member table. Starting from there, it is easy to check which clubs both members have in common, and properly assign the dues using a SQL query.
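A minimal sketch of that redesign, run through Python's sqlite3 module rather than SQL Server so it is self-contained. The Sponsor column on member is a hypothetical name, and the sketch assumes each member has at most one sponsor and at most one membership per club. The sponsor shown for each membership is derived from shared clubs, so it cannot go missing:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE member (
    Id INTEGER PRIMARY KEY,
    Name TEXT,
    Sponsor INTEGER REFERENCES member(Id)  -- hypothetical column, NULL when unsponsored
);
CREATE TABLE club (Id INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE membership (Id INTEGER PRIMARY KEY, Member INT, Club INT, Dues INT);

INSERT INTO member VALUES
    (1, 'John Jones', NULL), (2, 'Sally Smith', NULL), (3, 'Rod Roosevelt', 1),
    (4, 'Bobby Burns', 2), (5, 'Megan Moore', 2);
INSERT INTO club VALUES (1, 'Swim Club'), (2, 'Knitting Club'), (3, 'Bridge Club');
INSERT INTO membership VALUES
    (1,1,1,10), (2,1,2,5), (3,2,1,10), (4,2,3,20), (5,3,1,10),
    (6,3,2,5), (7,4,2,5), (8,4,3,20), (9,5,1,10), (10,5,3,20);
""")

# A membership is sponsored whenever the member's sponsor
# also holds a membership in the same club.
rows = conn.execute("""
SELECT ms.Id, m.Name, c.Name, ms.Dues, s.Name
FROM membership ms
JOIN member m ON m.Id = ms.Member
JOIN club c ON c.Id = ms.Club
LEFT JOIN membership sms ON sms.Member = m.Sponsor AND sms.Club = ms.Club
LEFT JOIN member s ON s.Id = sms.Member
ORDER BY ms.Id
""").fetchall()

for row in rows:
    print(row)
```

With this layout Rod's Knitting Club membership (Id 6) picks up John as sponsor without any extra record, while Bobby's Knitting Club membership (Id 7) stays unsponsored because Sally is not in that club.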

I found that I can gather this in a single query by gathering all Sponsored Relationships with a CTE, finding all memberships that SHOULD have a sponsorship based on that, then removing all existing sponsorships with an EXCEPT. I'm left with sponsorships that should exist but do not.
WITH sponsored_relationships AS (
SELECT DISTINCT
sponsee_member.Id AS Sponsee
,sponsor_member.Id AS Sponsor
FROM
#sponsorship AS s
JOIN #membership AS sponsee_mship
ON s.Sponsee_Membership = sponsee_mship.Id
JOIN #member AS sponsee_member
ON sponsee_mship.Member = sponsee_member.Id
JOIN #membership AS sponsor_mship
ON s.Sponsor_Membership = sponsor_mship.Id
JOIN #member AS sponsor_member
ON sponsor_mship.Member = sponsor_member.Id
)
SELECT
see_mem.Name AS Sponsee
,sor_mem.Name AS Sponsor
,c.Name AS Club
FROM
sponsored_relationships AS sr
JOIN #member AS see_mem
ON sr.Sponsee = see_mem.Id
JOIN #membership AS see_mship
ON see_mship.Member = see_mem.Id
JOIN #member AS sor_mem
ON sr.Sponsor = sor_mem.Id
JOIN #membership AS sor_mship
ON sor_mship.Member = sor_mem.Id
JOIN #club AS c
ON (see_mship.Club = c.Id
AND sor_mship.Club = c.Id
)
EXCEPT
SELECT
see_mem.Name AS Sponsee
,sor_mem.Name AS Sponsor
,c.Name AS Club
FROM
sponsored_relationships AS sr
JOIN #member AS see_mem
ON sr.Sponsee = see_mem.Id
JOIN #membership AS see_mship
ON see_mship.Member = see_mem.Id
JOIN #member AS sor_mem
ON sr.Sponsor = sor_mem.Id
JOIN #membership AS sor_mship
ON sor_mship.Member = sor_mem.Id
JOIN #club AS c
ON (see_mship.Club = c.Id
AND sor_mship.Club = c.Id
)
JOIN #sponsorship AS sship
ON (sship.Sponsee_Membership = see_mship.Id
AND sship.Sponsor_Membership = sor_mship.Id
);
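For comparison, the same idea can be sketched with NOT EXISTS instead of EXCEPT, which avoids repeating the join chain. This is only a sketch under assumptions: it runs on SQLite through Python's sqlite3 module rather than SQL Server, and the table names drop the # prefix:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE club (Id INT, Name TEXT);
CREATE TABLE member (Id INT, Name TEXT);
CREATE TABLE membership (Id INT, Member INT, Club INT, Dues INT);
CREATE TABLE sponsorship (Id INT, Sponsee_Membership INT, Sponsor_Membership INT);

INSERT INTO club VALUES (1, 'Swim Club'), (2, 'Knitting Club'), (3, 'Bridge Club');
INSERT INTO member VALUES
    (1, 'John Jones'), (2, 'Sally Smith'), (3, 'Rod Roosevelt'),
    (4, 'Bobby Burns'), (5, 'Megan Moore');
INSERT INTO membership VALUES
    (1,1,1,10), (2,1,2,5), (3,2,1,10), (4,2,3,20), (5,3,1,10),
    (6,3,2,5), (7,4,2,5), (8,4,3,20), (9,5,1,10), (10,5,3,20);
INSERT INTO sponsorship VALUES (1,5,1), (2,8,4), (3,9,3), (4,10,4);
""")

missing = conn.execute("""
WITH sponsored_relationships AS (
    -- one row per (sponsee member, sponsor member) pair
    SELECT DISTINCT see.Member AS Sponsee, sor.Member AS Sponsor
    FROM sponsorship s
    JOIN membership see ON see.Id = s.Sponsee_Membership
    JOIN membership sor ON sor.Id = s.Sponsor_Membership
)
SELECT see_mem.Name, sor_mem.Name, c.Name
FROM sponsored_relationships sr
JOIN membership see_ms ON see_ms.Member = sr.Sponsee
JOIN membership sor_ms ON sor_ms.Member = sr.Sponsor
                      AND sor_ms.Club = see_ms.Club   -- shared club
JOIN member see_mem ON see_mem.Id = sr.Sponsee
JOIN member sor_mem ON sor_mem.Id = sr.Sponsor
JOIN club c ON c.Id = see_ms.Club
WHERE NOT EXISTS (                                    -- no sponsorship recorded
    SELECT 1 FROM sponsorship x
    WHERE x.Sponsee_Membership = see_ms.Id
      AND x.Sponsor_Membership = sor_ms.Id
)
""").fetchall()

print(missing)
```

On the sample data this reports only Rod Roosevelt / John Jones / Knitting Club.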

Related

Common records from SQL table

I have 3 tables called CompanyInfo , IDInfo and PersonalInfo.
The objective is to fetch P.[First Name], P.[Last Name], I.PAN, C.DOB.
I can use a left join or right join for this, but the sequence of tables may change because it is an input to a tool, and the user may enter the table names in any order: user1 might enter "CompanyInfo, IDInfo, PersonalInfo", user2 "IDInfo, PersonalInfo, CompanyInfo", and so on.
The table data is below. Is there a way I can fetch the data through a single SQL query (maybe with a union)? For example, if I want to fetch the data for ID = 4, I should get:
FirstName  LastName  PAN       DOB
User       Four      PAN44444  NULL
If ID = 3:
FirstName  LastName  PAN   DOB
User       Three     NULL  1987-12-08
CompanyInfo Table
ID  Company  OfficialEmail        EmpID     DOB         Department  Joining Date  Status
1   RBS      userone@rbs.co.uk    UK222222  1980-11-15  HR          2012-11-20    Inactive
3   Infosys  userthree@infy.com   IN333333  1987-12-08  Admin       2016-08-18    Inactive
5   IBM      userfive@us.ibm.com  US55555   1986-03-26  Finance     2014-06-26    Active
10  Samsung  userten@samsung.com  SK101010  1988-04-04  Admin       2013-04-07    Active
IDInfo Table
ID  UAN       TIN        PAN
2   UAN22222  TIN222222  PAN22222
4   UAN44444  TIN444444  PAN44444
5   UAN55555  TIN555555  PAN55555
PersonalInfo Table
ID  FirstName  LastName  PhoneNumber      City       Address         ZipCode   Email
1   User       One       +44-7432564125   London     United Kingdom  RG231FDT  userone@gmail.com
2   User       Two       +91-987654321    New Delhi  India           110006    usertwo@gmail.com
3   User       Three     +44-782136425    Guildford  United Kingdom  GUI74DS   userthree@gmail.com
4   User       Four      +1-230156428     Atlanta    United States   GA 30337  userfour@gmail.com
5   User       Five      +1-650324152     Houston    United States   TX 77077  userfive@gmail.com
6   User       Six       +91-8885552223   Mumbai     India           400012    usersix@gmail.com
7   User       Seven     +91-9998887771   Bangalore  India           560021    userseven@gmail.com
A simple LEFT JOIN will resolve your problem.
SELECT P.FirstName, P.LastName, I.PAN, C.DOB
FROM PersonalInfo P
LEFT JOIN IDInfo I ON I.ID = P.ID
LEFT JOIN CompanyInfo C ON C.ID = P.ID
WHERE P.ID = @YourInput
You can use left joins:
SELECT P.FirstName, P.LastName, I.PAN, C.DOB
FROM IDInfo I
LEFT JOIN PersonalInfo P ON P.ID = I.ID
LEFT JOIN CompanyInfo C ON C.ID = I.ID
WHERE I.ID = 3
-- returns a row only if the ID exists in IDInfo, the driving table
Create table variables to hold the data of the 3 tables:
DECLARE @CompanyInfo TABLE
(ID INT,Company NVARCHAR(50),OfficialEmail NVARCHAR(50),EmpID NVARCHAR(50),DOB NVARCHAR(50),Department NVARCHAR(50),[Joining Date] NVARCHAR(50),[Status] NVARCHAR(50))
DECLARE @IDInfo TABLE
(ID INT,UAN NVARCHAR(50),TIN NVARCHAR(50),PAN NVARCHAR(50))
DECLARE @PersonalInfo TABLE
(ID INT,FirstName NVARCHAR(50),LastName NVARCHAR(50),PhoneNumber NVARCHAR(50),City NVARCHAR(50),[Address] NVARCHAR(50),ZipCode NVARCHAR(50),Email NVARCHAR(50))
Populate the tables with data:
INSERT INTO @CompanyInfo VALUES
(1,'RBS','userone@rbs.co.uk','UK222222','1980-11-15','HR','2012-11-20','Inactive'),
(3,'Infosys','userthree@infy.com','IN333333','1987-12-08','Admin','2016-08-18','Inactive'),
(5,'IBM','userfive@us.ibm.com','US55555','1986-03-26','Finance','2014-06-26','Active'),
(10,'Samsung','userten@samsung.com','SK101010','1988-04-04','Admin','2013-04-07','Active')
INSERT INTO @IDInfo VALUES
(2,'UAN22222','TIN222222','PAN22222'),
(4,'UAN44444','TIN444444','PAN44444'),
(5,'UAN55555','TIN555555','PAN55555')
INSERT INTO @PersonalInfo VALUES
(1,'User','One','+44-7432564125','London','United Kingdom','RG231FDT','userone@gmail.com'),
(2,'User','Two','+91-987654321','New Delhi','India','110006','usertwo@gmail.com'),
(3,'User','Three','+44-782136425','Guildford','United Kingdom','GUI74DS','userthree@gmail.com'),
(4,'User','Four','+1-230156428','Atlanta','United States','GA 30337','userfour@gmail.com'),
(5,'User','Five','+1-650324152','Houston','United States','TX 77077','userfive@gmail.com'),
(6,'User','Six','+91-8885552223','Mumbai','India','400012','usersix@gmail.com'),
(7,'User','Seven','+91-9998887771','Bangalore','India','560021','userseven@gmail.com')
Query to obtain the required output:
SELECT
P.FirstName,
P.LastName,
I.PAN,
C.DOB
FROM
@PersonalInfo P LEFT JOIN @IDInfo I ON I.ID = P.ID
LEFT JOIN @CompanyInfo C ON C.ID = P.ID
WHERE
P.ID = 4
Output:
FirstName  LastName  PAN       DOB
User       Four      PAN44444  NULL

Would I need to use a Union here, a Join, or something else?

I can't figure out whether what I need here is a Join statement or a Union.
Pets table
Id  name      color
1   wiskers   grey
2   midnight  black
3   ralph     yellow
4   Bob       brown
Shots table
Id  Rabbies  a123
2   Yes      No
4   No       No
Notes table
Id  Notes
4   This pet is blind
2   This pet has no owner
The result I'm looking for:
Id  Name      Color  Rabbies  A123  Notes
1   Wiskers   grey   Null     Null  Null
2   midnight  black  Yes      No    This pet has no owner
......
I think you want left joins:
select p.*, s.rabbies, s.a123, n.notes
from pets p left join
shots s
on s.id = p.id left join
notes n
on n.id = p.id;
Joins and Unions are both used to combine data, and both could potentially be used here. However, I would recommend using a Join. Joins combine columns from different tables, which seems to be what you want here: all columns included in a single row per animal ID.
https://www.codeproject.com/Articles/1068500/What-Is-the-Difference-Between-a-Join-and-a-UNION
Try that link for more information.
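A small sketch of the difference, using the Pets and Notes data from the question. It assumes SQLite via Python's sqlite3 module rather than whatever database the asker uses. The join widens each pet row with the notes column, while the union stacks rows from both tables on top of each other:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pets (id INT, name TEXT, color TEXT);
CREATE TABLE notes (id INT, notes TEXT);
INSERT INTO pets VALUES
    (1, 'wiskers', 'grey'), (2, 'midnight', 'black'),
    (3, 'ralph', 'yellow'), (4, 'Bob', 'brown');
INSERT INTO notes VALUES (4, 'This pet is blind'), (2, 'This pet has no owner');
""")

# JOIN: one row per pet, with extra columns taken from notes
joined = conn.execute("""
SELECT p.id, p.name, n.notes
FROM pets p
LEFT JOIN notes n ON n.id = p.id
ORDER BY p.id
""").fetchall()

# UNION: rows of both tables stacked into the same two columns
stacked = conn.execute("""
SELECT id, name AS value FROM pets
UNION
SELECT id, notes FROM notes
""").fetchall()

print(len(joined), len(stacked))
```

The join still returns four rows (one per pet), whereas the union returns six rows and never puts a pet's name and its note on the same row.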
If this is the case
create table Shots (id serial primary key, Rabbies varchar, a123 varchar, pet_id int);
insert into shots (pet_id, Rabbies, a123) values (2, 'Yes','No'), (4, 'No','No');
create table notes (id serial primary key, notes varchar, pet_id int);
insert into notes (pet_id, notes) values (4, 'This pet is blind'), (2, 'This pet has no owner');
select p.id, p.name, p.color, s.rabbies, s.a123, n.notes
from pets p
left join shots s on p.id = s.pet_id
left join notes n on p.id = n.pet_id;

Inner join an inner join with another inner join

I'm wondering if it is possible to inner join an inner join with another inner join.
I have a database of 3 tables:
people
athletes
coaches
Every athlete or coach must exist in the people table, but there are some people who are neither coaches nor athletes.
What I am trying to do is find a list of people who are active (meaning play or coach) in at least 3 different sports. The definition of active is they are either coaches, athletes or both a coach and an athlete for that sport.
The person table would consist of (id, name, height)
the athlete table would be (id, sport)
the coaching table would be (id, sport)
I have created 3 inner joins which tell me who is both a coach and an athlete, who is just a coach and who is just an athlete.
This is done via inner joins.
For example,
1) who is both a coach and an athlete
select
person.id,
person.name,
coach.sport as 'Coaches and plays this sport'
from coach
inner join athlete
on coach.id = athlete.id
and coach.sport = athlete.sport
inner join person
on athlete.id = person.id
That brings up a list of everyone who both coaches and plays the same sport.
2) To find out who only coaches sports, I have used inner joins as below:
select
person.id,
person.name,
coach.sport as 'Coaches this sport'
from coach
inner join person
on coach.id = person.id
3) Then to find out who only plays sports, I've got the same as 2) but just tweaked the words
select
person.id,
person.name,
athlete.sport as 'Plays this sport'
from athlete
inner join person
on athlete.id = person.id
The end result is now I've got:
1) persons who both play and coach the same sport
2) persons who coach a sport
3) persons who play a sport
What I would like to know is how to find a list of people who play or coach at least 3 different sports? I can't figure it out because if someone plays and coaches a sport like hockey in table 1, then I don't want to count them in table 2 and 3.
I tried using these 3 inner joins to make a massive join table so that I could pick the distinct values but it is not working.
Is there an easier way to go about this without making sub-sub-queries?
What I would like to know is how to find a list of people who play /
coach at least 3 different sports? I can't figure it out because if
someone plays and coaches a sport like hockey in table 1, then I don't
want to count them in table 2 and 3.
you can do something like this
select p.id,min(p.name) name
from
person p inner join
(
select id,sport from athlete
union
select id,sport from coach
)
ca
on ca.id=p.id
group by p.id
having count(ca.sport)>2
CREATE TABLE #person (Id INT, Name VARCHAR(50));
CREATE TABLE #athlete (Id INT, Sport VARCHAR(50));
CREATE TABLE #coach (Id INT, Sport VARCHAR(50));
INSERT INTO #person (Id, Name) VALUES(1, 'Bob');
INSERT INTO #person (Id, Name) VALUES(2, 'Carol');
INSERT INTO #person (Id, Name) VALUES(3, 'Sam');
INSERT INTO #athlete (Id, Sport) VALUES(1, 'Golf');
INSERT INTO #athlete (Id, Sport) VALUES(1, 'Football');
INSERT INTO #coach (Id, Sport) VALUES(1, 'Tennis');
INSERT INTO #athlete (Id, Sport) VALUES(2, 'Tennis');
INSERT INTO #coach (Id, Sport) VALUES(2, 'Tennis');
INSERT INTO #athlete (Id, Sport) VALUES(2, 'Swimming');
-- so Bob has 3 sports, Carol has only 2 (she both coaches and plays Tennis)
SELECT p.Id, p.Name
FROM
(
SELECT Id, Sport
FROM #athlete
UNION -- this has an implicit "distinct"
SELECT Id, Sport
FROM #coach
) a
INNER JOIN #person p ON a.Id = p.Id
GROUP BY p.Id, p.Name
HAVING COUNT(*) >= 3
-- returns 1, Bob
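To see why the implicit "distinct" of UNION matters, here is a sketch that runs the same count once with UNION and once with UNION ALL. It assumes SQLite via Python's sqlite3 module instead of temp tables, and active_in_three is just a hypothetical helper name:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE athlete (Id INT, Sport TEXT);
CREATE TABLE coach (Id INT, Sport TEXT);
-- 1 = Bob, 2 = Carol (same test data as above)
INSERT INTO athlete VALUES (1, 'Golf'), (1, 'Football'), (2, 'Tennis'), (2, 'Swimming');
INSERT INTO coach VALUES (1, 'Tennis'), (2, 'Tennis');
""")

def active_in_three(set_op):
    # set_op is only ever one of the two fixed strings passed below
    sql = f"""
    SELECT Id FROM (
        SELECT Id, Sport FROM athlete
        {set_op}
        SELECT Id, Sport FROM coach
    ) AS a
    GROUP BY Id
    HAVING COUNT(*) >= 3
    ORDER BY Id
    """
    return [row[0] for row in conn.execute(sql)]

with_union = active_in_three("UNION")          # duplicates removed before counting
with_union_all = active_in_three("UNION ALL")  # Carol's Tennis is counted twice

print(with_union, with_union_all)
```

With UNION only Bob (Id 1) qualifies. With UNION ALL Carol (Id 2) is wrongly included, because playing and coaching Tennis counts as two sports.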
I have created some SQL with test data that should work in your case.
The two results are connected in the subselect with UNION. UNION returns only non-duplicate rows, so every sport is counted just once per person.
Finally, the result set is grouped by person.Person_id and person.name.
Due to the HAVING clause, only persons with 3 or more sports are returned.
CREATE TABLE person
(
Person_id int
,name varchar(50)
,height int
)
CREATE TABLE coach
(
id int
,sport varchar(50)
)
CREATE TABLE athlete
(
id int
,sport varchar(50)
)
INSERT INTO person VALUES
(1,'John', 130),
(2,'Jack', 150),
(3,'William', 170),
(4,'Averel', 190),
(5,'Lucky Luke', 180),
(6,'Jolly Jumper', 250),
(7,'Rantanplan ', 90)
INSERT INTO coach VALUES
(1,'Football'),
(1,'Hockey'),
(1,'Skiing'),
(2,'Tennis'),
(2,'Curling'),
(4,'Tennis'),
(5,'Volleyball')
INSERT INTO athlete VALUES
(1,'Football'),
(1,'Hockey'),
(2,'Tennis'),
(2,'Volleyball'),
(2,'Hockey'),
(4,'Tennis'),
(5,'Volleyball'),
(3,'Tennis'),
(6,'Volleyball'),
(6,'Tennis'),
(6,'Hockey'),
(6,'Football'),
(6,'Cricket')
SELECT person.Person_id
,person.name
FROM person
INNER JOIN (
SELECT id
,sport
FROM athlete
UNION
SELECT id
,sport
FROM coach
) sports
ON sports.id = person.Person_id
GROUP BY person.Person_id
,person.name
HAVING COUNT(*) >= 3
ORDER BY Person_id
The coaches & athletes, i.e. people who are coaches or athletes, are relevant to your answer. That is a union (rows in one or another), not an (inner) join (rows in one and another). (Although outer join involves a union, so there is a complicated way to use it here.) But there's no point in getting that by unioning only-coaches, only-athletes & coach-athletes.
Idiomatic is to group & count the union of Athletes & Coaches.
select id
from (select * from Athletes union select * from Coaches) as u
group by id
having COUNT(*) >= 3
Alternatively, you want ids of people who coach or play a 1st sport and coach or play a 2nd sport and coach or play a 3rd sport where the sports are all different.
with u as (select * from Athletes union select * from Coaches)
select distinct u1.id
from u u1
join u u2 on u1.id = u2.id
join u u3 on u2.id = u3.id
where u1.sport <> u2.sport and u2.sport <> u3.sport and u1.sport <> u3.sport
If you wanted names you would join that with People.
See also: Is there any rule of thumb to construct SQL query from a human-readable description? (https://stackoverflow.com/a/33952141/3404097)

Join ruins the select

I have two tables that contain People who are working at the company and their Employment information (so People is one table, Employment is another).
The People table contains information on where the person lives, emergency contact, phone number bla bla bla. The Employment table contains information on where he works, closest boss and more.
These tables have been corrupted and now contain a few duplicates by mistake. Both tables have a Person id, but the Employment id exists only in Employment. I want both numbers for all people that have been duplicated.
This works perfectly:
SELECT DISTINCT
pp.Personid,
pp.Firstname,
pp.Lastname,
pp.Address
FROM People pp
JOIN People pp2
ON pp.Firstname = pp2.Firstname
AND pp.Lastname = pp2.Lastname
AND pp.Address = pp2.Address
AND pp.Personid <> pp2.Personid
ORDER BY pp.Firstname, pp.Lastname, pp.Personid
returning the following values (but does not include Employment number as you can see):
1001 Carl Johnsson Bigstreet 1
1002 Carl Johnsson Bigstreet 1
1003 Carl Johnsson Bigstreet 1
1010 Andrew Wilkinsson Smallstreet 2
1011 Andrew Wilkinsson Smallstreet 2
Now, to add the employment id I join in that table like this:
SELECT DISTINCT
pp.Personid,
e.Employmentid,
pp.Firstname,
pp.Lastname,
pp.Address
FROM People pp
JOIN People pp2
ON pp.Firstname = pp2.Firstname
AND pp.Lastname = pp2.Lastname
AND pp.Address = pp2.Address
AND pp.Personid <> pp2.Personid
JOIN Employment e on pp.Personid = e.Personid
ORDER BY pp.Firstname, pp.Lastname, pp.Personid
And everything goes to h**l in a handbasket with the following result:
1001 1111 Carl Johnsson Bigstreet 1
1001 1111 Carl Johnsson Bigstreet 1
1001 1111 Carl Johnsson Bigstreet 1
1010 1234 Andrew Wilkinsson Smallstreet 2
1010 1234 Andrew Wilkinsson Smallstreet 2
As you can see I get both Personid and Employmentid but now I only get one of each (repeated the correct number of times) so I don't have all the different Personid and Employmentid in my list.
Why?
What happened with my join that crashed the party?
Ok, let's make some sample data;
CREATE TABLE #People (PersonID int, FirstName varchar(50), LastName varchar(50), Address1 varchar(50))
INSERT INTO #People (PersonID, FirstName, LastName, Address1)
VALUES
('1','Mike','Hunt','Cockburn Crescent')
,('2','Mike','Hunt','Cockburn Crescent')
,('3','Mike','Hunt','Cockburn Crescent')
,('4','Connie','Lingus','Dyke Close')
,('5','Connie','Lingus','Dyke Close')
,('6','Eric','Shun','Tickle Avenue')
,('7','Ivana','Humpalot','Bottom Street')
CREATE TABLE #Employment (PersonID int, EmploymentID int)
INSERT INTO #Employment (PersonID, EmploymentID)
VALUES
('1','10')
,('2','11')
,('3','12')
,('4','13')
,('5','14')
,('6','15')
,('7','16')
I'd do the first query differently: if you work out the duplicates in a sub-select it is easier, and you can then join to the employment table with no problems.
SELECT pp.PersonID
,em.EmploymentID
,pp.FirstName
,pp.LastName
,pp.Address1
FROM #People pp
JOIN (
SELECT FirstName
,LastName
,Address1
,COUNT(1) records
FROM #People
GROUP BY FirstName
,LastName
,Address1
HAVING COUNT(1) > 1
) pp2 ON pp.FirstName = pp2.FirstName
AND pp.LastName = pp2.LastName
AND pp.Address1 = pp2.Address1
LEFT JOIN #Employment em ON pp.PersonID = em.PersonID
Remember to clean up the temp tables;
DROP TABLE #People
DROP TABLE #Employment
I think you should try this
SELECT DISTINCT
ep.Personid,
ep.Employmentid,
ep.Firstname,
ep.Lastname,
ep.Address
FROM People P join
(SELECT
pp.Personid,
e.Employmentid,
pp.Firstname,
pp.Lastname,
pp.Address
from People pp
JOIN Employment e on pp.Personid = e.Personid ) ep
on
P.Firstname = ep.Firstname
AND P.Lastname = ep.Lastname
AND P.Address = ep.Address
AND P.Personid <> ep.Personid
ORDER BY ep.Firstname, ep.Lastname, ep.Personid
Sir please Check and reply to me
Your code should work, and I am unable to reproduce your issue using data I have made up. The outcome you are seeing suggests to me that there are multiple person IDs for Carl Johnsson in the employment table and that the employment ID is different, even though it looks the same in the output.
Can you supply your table definitions and sample data?

Complicated table join

I thought I had a good grasp on table joins but there is one problem here I can't figure out.
I am trying to track the progress of students on specifically required courses. Some students are required to complete an exact list of courses before further qualification.
Tables (simplified):
students
--------
id INT PRIMARY KEY
name VARCHAR(50)
student_courses
---------------
student_id INT PRIMARY KEY
course_id TINYINT PRIMARY KEY
course_status TINYINT (Not done, Started, Completed)
steps_done TINYINT
total_steps TINYINT
date_created DATETIME
date_modified DATETIME
courses
-------
id TINYINT PRIMARY KEY
name VARCHAR(50)
I want to insert a list of required courses, for example 5 different courses in the courses table and then select a specific student and get list of all the courses required, whether a row exists for that course in the student_courses table or not.
I guess I could insert all rows from the courses table in the student_courses table for each student, but I don't want that because not all students need to do these courses. And what if new courses are added later.
I just want a result which is something like this:
students table:
id name
--- ------------------
1 George Smith
2 Dana Jones
3 Maria Cobblestone
SELECT * FROM students (JOIN bla bla bla - this is the point where I'm lost...)
WHERE students.id = 1
Result:
id name course_id courses.name course_status steps_done
--- ------------------ --------- ------------ ------------- ----------
1 George Smith 1 Botany Not started 0
1 George Smith 2 Biology NULL NULL
1 George Smith 3 Physics NULL NULL
1 George Smith 4 Algebra Completed 34
1 George Smith 5 Sewing Started 2
If the course_status or steps_done is NULL it means that no row exists for this student for this course in the student_courses table.
The idea is then using this in MS Access (or some other system) and have the row automatically inserted in the student_courses table once you enter a value in the NULL field.
You can't just use an outer join to do this; you need to create a list of all student/course combinations that you're interested in first, then use that list in a LEFT JOIN. This can be done in a CTE/subquery using CROSS JOIN:
;WITH cte AS (SELECT DISTINCT s.id Student_ID
,s.name
,c.id Course_ID
,c.name Class_Name
FROM Students s
CROSS JOIN Courses c)
SELECT cte.*,sc.course_status,sc.steps_done
FROM cte
LEFT JOIN student_courses sc
ON cte.Course_ID = sc.course_id
AND cte.Student_ID = sc.student_id
A subquery can also be used if this needs to be done in Access (not 100% on syntax in Access):
SELECT sub.*,sc.course_status,sc.steps_done
FROM (SELECT DISTINCT s.id Student_ID
,s.name
,c.id Course_ID
,c.name Class_Name
FROM Students s
CROSS JOIN Courses c
) AS sub
LEFT JOIN student_courses sc
ON sub.Course_ID = sc.course_id
AND sub.Student_ID = sc.student_id
Demo: SQL Fiddle
You want a left outer join. The first table is the courses table, which supplies the required courses (defined in the where clause).
select s.id, s.name, c.id, c.name, sc.course_status, sc.steps_done
from (courses as c left join
student_courses as sc
on sc.course_id = c.id and
sc.student_id = 1
) left join
students as s
on sc.student_id = s.id
where c.id in (<list of required courses>)
order by s.id, c.id;
I think I have all the "Access"isms in there.
Actually, the above will be missing the student name when s/he is missing a course. The following is more correct:
select s.id, s.name, c.id, c.name, sc.course_status, sc.steps_done
from (courses as c left join
student_courses as sc
on sc.course_id = c.id and
sc.student_id = 1
) inner join
students as s
on s.id = 1
where c.id in (<list of required courses>)
order by s.id, c.id;
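The cross-join-then-left-join idea can be checked end to end with a small sketch. It assumes SQLite via Python's sqlite3 module rather than Access, stores course_status as text for readability, treats every course as required, and includes one made-up row for student 2 to show that the join must match on student_id as well as course_id:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE courses (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE student_courses (
    student_id INT,
    course_id INT,
    course_status TEXT,   -- assumption: text instead of TINYINT
    steps_done INT,
    PRIMARY KEY (student_id, course_id)
);
INSERT INTO students VALUES (1, 'George Smith'), (2, 'Dana Jones'), (3, 'Maria Cobblestone');
INSERT INTO courses VALUES (1, 'Botany'), (2, 'Biology'), (3, 'Physics'), (4, 'Algebra'), (5, 'Sewing');
INSERT INTO student_courses VALUES
    (1, 1, 'Not started', 0), (1, 4, 'Completed', 34), (1, 5, 'Started', 2),
    (2, 2, 'Started', 7);   -- made-up row for another student
""")

# Every (student, course) pair first, then whatever progress rows exist for it.
rows = conn.execute("""
SELECT s.id, s.name, c.id, c.name, sc.course_status, sc.steps_done
FROM students s
CROSS JOIN courses c
LEFT JOIN student_courses sc
       ON sc.student_id = s.id
      AND sc.course_id = c.id
WHERE s.id = ?
ORDER BY c.id
""", (1,)).fetchall()

for row in rows:
    print(row)
```

George gets one row per course, with NULL status and steps for Biology and Physics. Dana's Biology row does not leak into his result because the join also matches on student_id.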