How do I find the rows of two tables where a row element from TableA is different from TableB - sql

I am using SQL Server.
I am trying to find the groups that a student is not part of. Table A is a list of all the groups in the database. Table B shows the groups that each student is a part of. When I try joining the two tables and then using WHERE to see the differences, it does not work. I think I have to use EXCEPT, but I have no idea how.
The goal
What I want is a query that takes a student, for example studentA, and returns all the groups that studentA is NOT a part of. For the example tables below, it should return group C, because it is the only group that studentA is NOT a part of.
TableA:
groupName | groupDescription | groupTags
----------+------------------+--------------
group A   | doesnt matter    | doesnt matter
group B   | doesnt matter    | doesnt matter
group C   | doesnt matter    | doesnt matter
TableB:
username | groupName
---------+----------
studentA | group A
studentA | group B
studentB | group B
studentB | group C
What I tried:
SELECT DISTINCT TableA.groupName, TableA.groupDescription
FROM TableA
INNER JOIN TableB
ON TableA.groupName=TableB.groupName
WHERE TableB.username != 'studentA';
What should happen:
username | groupName
---------+----------
studentA | group C

SELECT DISTINCT T.USERNAME,A.groupName
FROM TABLEB AS T
CROSS JOIN TABLEA AS A
EXCEPT
SELECT X.USERNAME,X.GROUPNAME
FROM TABLEB AS X
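The EXCEPT query above lists the missing (username, groupName) pairs for every student at once. If only one student is of interest, a NOT EXISTS filter is one alternative. The sketch below is an illustration rather than the answerer's method: it uses Python's built-in sqlite3 module as a stand-in for SQL Server so that it can be run as-is, and the helper name groups_not_joined is hypothetical. The SELECT itself is plain SQL that should carry over to SQL Server, but that has not been verified here.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (groupName TEXT, groupDescription TEXT, groupTags TEXT);
    CREATE TABLE TableB (username TEXT, groupName TEXT);
    INSERT INTO TableA VALUES
        ('group A', 'doesnt matter', 'doesnt matter'),
        ('group B', 'doesnt matter', 'doesnt matter'),
        ('group C', 'doesnt matter', 'doesnt matter');
    INSERT INTO TableB VALUES
        ('studentA', 'group A'), ('studentA', 'group B'),
        ('studentB', 'group B'), ('studentB', 'group C');
""")


def groups_not_joined(username):
    """Return the groups in TableA that `username` has no row for in TableB."""
    sql = """
        SELECT a.groupName
        FROM TableA AS a
        WHERE NOT EXISTS (
            SELECT 1
            FROM TableB AS b
            WHERE b.groupName = a.groupName
              AND b.username = ?
        )
        ORDER BY a.groupName
    """
    return [row[0] for row in conn.execute(sql, (username,))]


print(groups_not_joined("studentA"))  # ['group C']
```

The original attempt fails because the INNER JOIN keeps every (group, other student) pair, so a group that studentA shares with studentB still passes the `!=` filter.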

Related

SQL Server : get unique ID from one table, another table, or both

I have two tables, TA and CMI, that contain a person_ID. The ID may exist in TA, it may exist in CMI, or it may exist in both. I want a distinct list of ALL person_ID's regardless whether they are in TA, CMI, or both tables.
I also want to be able to select them where their question_ID's are the same. However, the question_id's have different column names: TA.question and CMI.sco = question_id.
EDIT:
So, if I also wanted to do the select on question as I stated earlier AND a join to the person table, it would look something like:
select ta.person_id, person_key
from ta
left join person on person.person_id = ta.person_id
where question=7033
union -- on purpose to remove duplicates
select cmi.person_id, person_key
from cmi
left join person on person.person_id = cmi.person_id
where sco=7033
You would use union:
select person_id
from ta
union -- on purpose to remove duplicates
select person_id
from cmi;
You can use this as a CTE or subquery in a query.
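As a rough sketch of the CTE form, the union can be named and then joined to the person table once, instead of repeating the join in each branch. The example runs on SQLite through Python's sqlite3 module (an assumption made so the code is runnable; the question is about SQL Server), and the sample rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ta     (person_id INTEGER, question INTEGER);
    CREATE TABLE cmi    (person_id INTEGER, sco INTEGER);
    CREATE TABLE person (person_id INTEGER, person_key TEXT);
    -- Made-up sample rows: person 2 appears in both tables (twice in ta).
    INSERT INTO ta     VALUES (1, 7033), (2, 7033), (2, 7033);
    INSERT INTO cmi    VALUES (2, 7033), (3, 7033), (4, 9999);
    INSERT INTO person VALUES (1, 'k1'), (2, 'k2'), (3, 'k3'), (4, 'k4');
""")

sql = """
    WITH ids AS (
        SELECT person_id FROM ta  WHERE question = 7033
        UNION  -- UNION rather than UNION ALL, so duplicates are removed
        SELECT person_id FROM cmi WHERE sco = 7033
    )
    SELECT ids.person_id, p.person_key
    FROM ids
    LEFT JOIN person AS p ON p.person_id = ids.person_id
    ORDER BY ids.person_id
"""
rows = conn.execute(sql).fetchall()
print(rows)  # [(1, 'k1'), (2, 'k2'), (3, 'k3')]
```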

I expect these 2 sql statements to return same number of rows

In my mind these 2 SQL statements are equivalent.
My understanding is:
In the first one, I am pulling all rows from tmpPerson and keeping those that do not have a matching person id. This query returns 211 records.
The second one says: give me all tmpPersons whose id isn't in person. This returns null.
Obviously they are not equivalent, or they'd have the same results. So what am I missing? Thanks.
select p.id, bp.id
From person p
right join(
select distinct id
from tmpPerson
) bp
on p.id= bp.id
where p.id is null
select id
from tmpPerson
where id not in (select id from person)
I pulled some ids from the first result set and found no matching records for them in Person, so I'm guessing the first one is accurate, but I'm still surprised they're different.
I much prefer left joins to right joins, so let's write the first query as:
select p.id, bp.id
From (select distinct id
from tmpPerson
) bp left join
person p
on p.id = bp.id
where p.id is null;
(The preference is because the result set keeps all the rows in the first table rather than the last table. When reading the from clause, I immediately know what the first table is.)
The second is:
select id
from tmpPerson
where id not in (select id from person);
These are not equivalent for two reasons. The most likely reason in your case is that you have duplicate ids in tmpPerson. The first version removes the duplicates. The second doesn't. This is easily fixed by putting distincts in the right place.
The more subtle reason has to do with the semantics of not in. If any person.id has a NULL value, then all rows will be filtered out. I don't think that is the case with your query, but it is a difference.
I strongly recommend using not exists instead of not in for the reason just described:
select tp.id
from tmpPerson tp
where not exists (select 1 from person p where p.id = tp.id);
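Both differences can be reproduced on a tiny data set. The sketch below uses Python's sqlite3 module as a stand-in for the asker's database (an assumption; SQLite follows the same NULL rules for NOT IN here), with made-up rows: person contains one NULL id, and tmpPerson contains a duplicate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person    (id INTEGER);
    CREATE TABLE tmpPerson (id INTEGER);
    INSERT INTO person    VALUES (1), (NULL);    -- one NULL id
    INSERT INTO tmpPerson VALUES (1), (2), (2);  -- id 2 is duplicated
""")

# NOT IN: `2 NOT IN (1, NULL)` evaluates to NULL, so the row is filtered out.
not_in = conn.execute(
    "SELECT id FROM tmpPerson WHERE id NOT IN (SELECT id FROM person)"
).fetchall()

# NOT EXISTS: the NULL in person simply never matches, so id 2 is kept.
not_exists = conn.execute("""
    SELECT tp.id
    FROM tmpPerson AS tp
    WHERE NOT EXISTS (SELECT 1 FROM person AS p WHERE p.id = tp.id)
""").fetchall()

# LEFT JOIN over DISTINCT ids: id 2 is kept, and only once.
left_join = conn.execute("""
    SELECT bp.id
    FROM (SELECT DISTINCT id FROM tmpPerson) AS bp
    LEFT JOIN person AS p ON p.id = bp.id
    WHERE p.id IS NULL
""").fetchall()

print(not_in)      # []
print(not_exists)  # [(2,), (2,)]
print(left_join)   # [(2,)]
```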
select id
from tmpPerson
where id not in (select id from person)
If there is a null id in tmpPerson, that row will not be captured by this query, but it will be captured by your first query. Using isnull should resolve the issue:
where isnull(id, 'N') not in (select id from person)

Getting data from 2 or more tables: select or join?

Let's say there are 2 or more tables.
Table A: aID, name, birthday
Table B: bID, petType, petName
Table C: cID, stackOverFlowUsername
I want to get something like aID, name, birthday, number of cats a person has, stack overflow's username
We can
use joins to join all 3 tables select * from tableA... tableB... tableC...
use multiple select statements, select a.*, (select count(*) from tableB where petType = 'cat') as numberOfCats, (select...) as stackUsername from tableA a
or other ways that I didn't know
My question is when is the right situation to use select, joins or is there even better methods?
Update:
Here is another question. If I have 3 stackoverflow accounts, Tom has 1 and Peter has 2,
using
A left join B left join C
will return a total of 6 rows
select a.*, select count(*) from tableB where..., select top 1 stackOverFlowUsername from tableC
returns 3 rows because there are 3 people
Can I use joins to achieve something similar if I only want one row of data for each person in tableA regardless how many stackoverflow accounts he/she has?
Thanks
A sub-select in the select list (case 2) might be evaluated once for every result row, while joined tables, views, and subselects are calculated only once, which can save memory and join time (with pre-built indices). Once you are used to SQL, you will find that the JOIN syntax is often easier to read.
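To get one row per person with joins, one option is to aggregate each child table in a derived table first and then join the result. The sketch below makes several assumptions that are not in the question: tableB and tableC each carry a hypothetical aID column linking back to tableA, MIN(...) is used to pick a single username (in place of top 1), the sample rows are made up, and SQLite via Python's sqlite3 module stands in for the real database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (aID INTEGER, name TEXT, birthday TEXT);
    -- The aID columns in tableB and tableC are hypothetical link columns.
    CREATE TABLE tableB (bID INTEGER, aID INTEGER, petType TEXT, petName TEXT);
    CREATE TABLE tableC (cID INTEGER, aID INTEGER, stackOverFlowUsername TEXT);
    INSERT INTO tableA VALUES
        (1, 'Me', '1990-01-01'), (2, 'Tom', '1991-02-02'), (3, 'Peter', '1992-03-03');
    INSERT INTO tableB VALUES
        (1, 1, 'cat', 'Tiger'), (2, 1, 'cat', 'Misty'), (3, 1, 'dog', 'Rex'),
        (4, 3, 'cat', 'Smoky');
    INSERT INTO tableC VALUES
        (1, 1, 'me_a'), (2, 1, 'me_b'), (3, 1, 'me_c'),
        (4, 2, 'tom'),
        (5, 3, 'peter_a'), (6, 3, 'peter_b');
""")

# A plain join to tableC repeats each person once per account: 3 + 1 + 2 rows.
plain_join_count = conn.execute("""
    SELECT COUNT(*)
    FROM tableA AS a
    LEFT JOIN tableC AS c ON c.aID = a.aID
""").fetchone()[0]

# Aggregating each child table first keeps one row per person.
one_row_each = conn.execute("""
    SELECT a.aID, a.name, COALESCE(cats.numberOfCats, 0), acct.username
    FROM tableA AS a
    LEFT JOIN (
        SELECT aID, COUNT(*) AS numberOfCats
        FROM tableB
        WHERE petType = 'cat'
        GROUP BY aID
    ) AS cats ON cats.aID = a.aID
    LEFT JOIN (
        SELECT aID, MIN(stackOverFlowUsername) AS username
        FROM tableC
        GROUP BY aID
    ) AS acct ON acct.aID = a.aID
    ORDER BY a.aID
""").fetchall()

print(plain_join_count)  # 6
print(one_row_each)      # [(1, 'Me', 2, 'me_a'), (2, 'Tom', 0, 'tom'), (3, 'Peter', 1, 'peter_a')]
```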

Self Join bringing too many records

I have this query to express a set of business rules.
To get the information I need, I tried joining the table on itself but that brings back many more records than are actually in the table. Below is the query I've tried. What am I doing wrong?
SELECT DISTINCT a.rep_id, a.rep_name, count(*) AS 'Single Practitioner'
FROM [SE_Violation_Detection] a inner join [SE_Violation_Detection] b
ON a.rep_id = b.rep_id and a.hcp_cid = b.hcp_cid
group by a.rep_id, a.rep_name
having count(*) >= 2
You can accomplish this with the having clause:
select a, b, count(*) c
from etc
group by a, b
having count(*) >= some number
I figured out a simpler way to get the information I need for one of the queries. The one above is still wrong.
--Rep violation for different HCP more than 5 times
select distinct rep_id,rep_name,count(distinct hcp_cid)
AS 'Multiple Practitioners'
from dbo.SE_Violation_Detection
group by rep_id,rep_name
having count(distinct hcp_cid)>4
order by count(distinct hcp_cid)
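The extra rows in the self join come from each row matching itself and every other row with the same rep_id and hcp_cid, so a group of n rows produces n * n joined rows. The sketch below reproduces this on made-up data using Python's sqlite3 module (an assumption; the question targets SQL Server). The second query assumes the intent was to find reps with two or more violations for the same practitioner, which the question does not state explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SE_Violation_Detection (rep_id INTEGER, rep_name TEXT, hcp_cid INTEGER);
    -- Made-up rows: rep 1 has three violations with practitioner 10.
    INSERT INTO SE_Violation_Detection VALUES
        (1, 'Ann', 10), (1, 'Ann', 10), (1, 'Ann', 10),
        (2, 'Bob', 20);
""")

table_rows = conn.execute(
    "SELECT COUNT(*) FROM SE_Violation_Detection"
).fetchone()[0]

# Self join: rep 1 contributes 3 * 3 = 9 rows, rep 2 contributes 1 * 1 = 1.
self_join_rows = conn.execute("""
    SELECT COUNT(*)
    FROM SE_Violation_Detection AS a
    INNER JOIN SE_Violation_Detection AS b
        ON a.rep_id = b.rep_id AND a.hcp_cid = b.hcp_cid
""").fetchone()[0]

# Grouping the table directly counts each violation once (assumed intent).
repeat_offenders = conn.execute("""
    SELECT rep_id, rep_name, hcp_cid, COUNT(*) AS violations
    FROM SE_Violation_Detection
    GROUP BY rep_id, rep_name, hcp_cid
    HAVING COUNT(*) >= 2
""").fetchall()

print(table_rows)        # 4
print(self_join_rows)    # 10
print(repeat_offenders)  # [(1, 'Ann', 10, 3)]
```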

Select Distinct Value from Multiple tables, Where One has Multiples

I have searched for this problem but have not found anything quite like it. I have two tables, both with a field, FormID. In table A, FormID is unique. In table B, there can be multiple records with the same FormID -- table B is a problem tracker table, so if there are multiple data entry problems in the form, there will be multiple records in there.
The following query works:
select distinct b.FormID
from b, a
where b.FormID = a.FormID
and a.status='done'
insofar as it generates a result list of unique FormIDs. However, I also need to get some other columns in this query, and it's when I add those columns to the select or the join that I get ALL the duplicate FormIDs.
I have tried:
select distinct (b.FormID), a.FormType, a.Site, a.uid, b.ProbID, b.Date
from b, a
where b.FormID = a.FormID
and a.status='done'
as well as a couple of variations using joins, but they all end up with all the rows with duplicate FormIDs.
Suggestions?
Try
SELECT b.FormID,
MAX(a.FormType) FormType,
MAX(a.Site) Site,
MAX(a.uid) uid,
MAX(b.ProbID) ProbID,
MAX(b.Date) Date
FROM b INNER JOIN
a ON b.FormID = a.FormID
WHERE a.status='done'
GROUP BY b.FormID
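A small run of the GROUP BY approach shows both what it fixes and one caveat: each MAX is computed per column, so the ProbID and Date in a result row can come from different rows of table b. The sketch uses made-up rows and SQLite through Python's sqlite3 module as a stand-in for the asker's database (an assumption).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (FormID INTEGER, FormType TEXT, Site TEXT, uid TEXT, status TEXT);
    CREATE TABLE b (FormID INTEGER, ProbID INTEGER, Date TEXT);
    -- Made-up rows: form 1 has two problem records in b.
    INSERT INTO a VALUES (1, 'X', 'S1', 'u1', 'done'), (2, 'Y', 'S2', 'u2', 'open');
    INSERT INTO b VALUES
        (1, 100, '2020-01-05'),
        (1, 101, '2020-01-02'),
        (2, 200, '2020-02-01');
""")

rows = conn.execute("""
    SELECT b.FormID,
           MAX(a.FormType) AS FormType,
           MAX(a.Site)     AS Site,
           MAX(a.uid)      AS uid,
           MAX(b.ProbID)   AS ProbID,
           MAX(b.Date)     AS Date
    FROM b
    INNER JOIN a ON b.FormID = a.FormID
    WHERE a.status = 'done'
    GROUP BY b.FormID
""").fetchall()

# One row per FormID; ProbID 101 and Date 2020-01-05 come from different b rows.
print(rows)  # [(1, 'X', 'S1', 'u1', 101, '2020-01-05')]
```

If the ProbID and Date must belong to the same problem record, a different approach is needed, such as picking one row per FormID by a chosen ordering.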