Find all sets/entities that are in another set [duplicate] - sql

This question already has answers here:
Need a way to find matches between two many-to-many-relationships
(3 answers)
Closed 4 years ago.
The answer is found in the abstract here but I'm looking for the concrete SQL solution.
Given the following tables:
-------------   -------------
|  F_Roles  |   |  T_Roles  |
------+------   ------+------
| FId | RId |   | TId | RId |
------+------   ------+------
| f1  |  2  |   | t1  |  1  |
| f1  |  3  |   | t1  |  2  |
| f2  |  2  |   | t1  |  3  |
| f2  |  4  |   | t1  |  4  |
| f2  |  9  |   | t1  |  5  |
| f3  |  6  |   | t1  |  6  |
| f3  |  7  |   | t1  |  7  |
-------------   -------------
(F_Roles) is a join table between F (not shown) and Roles (also not shown)
(T_Roles) is a join table between T (not shown) and Roles (not shown)
I need to return:
1. All (DISTINCT) FIds where the set of RIds for a given FId is a subset of (or 'IN') T_Roles. (I know I'm mixing set theory with database terms, but only in the interest of better conveying the idea, I hope.) So f1 and f3 should be returned in this case, because the sets of RIds for f1, {2,3}, and for f3, {6,7}, are subsets of T_Roles.
2. The list of RIds in T_Roles NOT found in any of the functions returned above: (T_Roles - (f1 Union f3)), or {1,4,5} in this example.

Let's define the following sample data:
CREATE TABLE #F_Roles
(
    [FID] CHAR(2)
   ,[RID] TINYINT
);
CREATE TABLE #Roles
(
    [RID] TINYINT
);
INSERT INTO #F_Roles ([FID], [RID])
VALUES ('f1', 2)
,('f1', 3)
,('f2', 2)
,('f2', 4)
,('f2', 9)
,('f3', 6)
,('f3', 7);
INSERT INTO #Roles ([RID])
VALUES (1), (2), (3), (4), (5), (6), (7);
Now, the first query can be solved using the T-SQL statement below:
SELECT F.[FID]
FROM #F_Roles F
LEFT JOIN #Roles R
ON F.[RID] = R.[RID]
GROUP BY F.[FID]
HAVING SUM(CASE WHEN R.[RID] IS NULL THEN 0 ELSE 1 END) = COUNT(F.[RID]);
The idea is pretty simple. We use a LEFT JOIN to check which RIDs from the #F_Roles table have a corresponding RID in the #Roles table. If one does not, the value returned for the corresponding row is NULL. So we just need to count the RIDs for each FID and check whether this count equals the number of matched values from the second table (the CASE expression counts NULL rows as 0).
The second query is simple, too. Having the FIDs from the first, we can use EXCEPT to find the RIDs that are not matched:
SELECT [RID]
FROM #Roles
EXCEPT
SELECT [RID]
FROM #F_Roles
WHERE [FID] IN
(
SELECT F.[FID]
FROM #F_Roles F
LEFT JOIN #Roles R
ON F.[RID] = R.[RID]
GROUP BY F.[FID]
HAVING SUM(CASE WHEN R.[RID] IS NULL THEN 0 ELSE 1 END) = COUNT(F.[RID])
);
Here is the result of the execution of the queries:
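The two queries can also be checked outside SQL Server. This is a minimal sketch, assuming Python's built-in sqlite3 module and ordinary tables named F_Roles and Roles standing in for the temporary tables (SQLite has no # tables, so those names are stand-ins, not part of the answer):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE F_Roles (FID TEXT, RID INTEGER);
CREATE TABLE Roles (RID INTEGER);
INSERT INTO F_Roles VALUES
    ('f1', 2), ('f1', 3), ('f2', 2), ('f2', 4), ('f2', 9), ('f3', 6), ('f3', 7);
INSERT INTO Roles VALUES (1), (2), (3), (4), (5), (6), (7);
""")

# Query 1: FIDs whose every RID has a match in Roles.
SUBSET_FIDS = """
SELECT F.FID
FROM F_Roles F
LEFT JOIN Roles R ON F.RID = R.RID
GROUP BY F.FID
HAVING SUM(CASE WHEN R.RID IS NULL THEN 0 ELSE 1 END) = COUNT(F.RID)
"""
subset_fids = sorted(row[0] for row in conn.execute(SUBSET_FIDS))

# Query 2: RIDs in Roles not used by any FID returned by query 1.
unmatched = sorted(row[0] for row in conn.execute(f"""
SELECT RID FROM Roles
EXCEPT
SELECT RID FROM F_Roles WHERE FID IN ({SUBSET_FIDS})
"""))

print(subset_fids)  # ['f1', 'f3']
print(unmatched)    # [1, 4, 5]
```

f2 drops out because its RID 9 has no match in Roles, which makes the matched count (2) smaller than its row count (3).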

For query 1:
with x as (
    -- count, per fid, the roles that have no match in roles
    select f.fid, sum(case when r.rid is null then 1 else 0 end) as missing
    from f_roles f
    left join roles r on r.rid = f.rid
    group by f.fid
)
select distinct f.fid
from f_roles f
join x on f.fid = x.fid
where x.missing = 0
For query 2:
with x as (
    -- count, per fid, the roles that have no match in roles
    select f.fid, sum(case when r.rid is null then 1 else 0 end) as missing
    from f_roles f
    left join roles r on r.rid = f.rid
    group by f.fid
),
y as (
    select distinct f.fid
    from f_roles f
    join x on f.fid = x.fid
    where x.missing = 0
)
select r.rid
from roles r
where r.rid not in (
    select f.rid from y join f_roles f on f.fid = y.fid
)
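One detail worth checking in this style of query is how SUM behaves when the CASE expression has no ELSE branch: for a group with nothing missing, every CASE result is NULL, so the sum is NULL rather than 0 and a `missing = 0` filter never matches. A small sketch of that behaviour, assuming Python's sqlite3 module and a reduced, hypothetical data set:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE f_roles (fid TEXT, rid INTEGER);
CREATE TABLE roles (rid INTEGER);
INSERT INTO f_roles VALUES ('f1', 2), ('f1', 3), ('f2', 2), ('f2', 9);
INSERT INTO roles VALUES (1), (2), (3);
""")

QUERY = """
SELECT f.fid, SUM(CASE WHEN r.rid IS NULL THEN 1 {else_clause} END) AS missing
FROM f_roles f
LEFT JOIN roles r ON r.rid = f.rid
GROUP BY f.fid
ORDER BY f.fid
"""

# Without ELSE, a fully matched group sums only NULLs and yields NULL (None).
without_else = conn.execute(QUERY.format(else_clause="")).fetchall()
# With ELSE 0, the same group yields 0 and can be filtered with "missing = 0".
with_else = conn.execute(QUERY.format(else_clause="ELSE 0")).fetchall()

print(without_else)  # [('f1', None), ('f2', 1)]
print(with_else)     # [('f1', 0), ('f2', 1)]
```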

Related

How to make a comparison for the record that has rows to another rows? [closed]

I have first table that has columns:
1. id
2. key
3. value
And a second table (more like a list):
key
I need to get the distinct ids that contain all keys from the second table.
I have tried a self join but it is very slow. I also tried COUNT = COUNT, but the performance is the same.
Self join:
select f.id from first f
join first f2 on f.id = f2.id AND f2.key = f.key
COUNT:
select a.key from #a a
where ( select SUM(CASE WHEN s.[key] is not NULL THEN 1 ELSE 0 END) from [b] b
LEFT JOIN Second s on s.key = b.[Key]
where b.[Key] = a.key) = @KeyCount
You can also try this:
SELECT A.id
FROM TAB1 A
INNER JOIN TAB2 B ON A.[key] = B.[Key]
GROUP BY A.id
HAVING COUNT(DISTINCT A.[key])
= (SELECT COUNT(DISTINCT [Key]) FROM TAB2)
This is somewhat of a stab in the dark, but perhaps this is what you're after...?
SELECT I.ID
FROM TableB B
CROSS APPLY (SELECT DISTINCT ca.ID
FROM dbo.TableA ca) I
LEFT JOIN TableA A ON B.[key] = A.[key]
AND I.ID = A.ID
GROUP BY I.ID
HAVING COUNT(CASE WHEN A.[Key] IS NULL THEN 1 END) = 0;
Assuming:
Your 2nd table lists all possible keys, and
Your first table (containing entity IDs, keys, and key values) can only contain 1 entity-key combination,
something like this may work:
SELECT [id], COUNT(*)
FROM Table1
GROUP BY [id]
HAVING COUNT(*) = (SELECT COUNT(*) FROM keys)
Now with some sample data. Assume the following keys:
+--------+----------+
| key_id | key_name |
+--------+----------+
| 1 | Key1 |
| 2 | Key2 |
| 3 | Key3 |
+--------+----------+
And the following entities:
+----+-----+-------+
| id | key | value |
+----+-----+-------+
| 1 | 1 | 1 |
| 1 | 2 | 2 |
| 1 | 3 | 3 |
| 2 | 2 | 2 |
| 2 | 3 | 3 |
+----+-----+-------+
Note how Entity 1 has all keys, but Entity 2 is missing Key 1. So, as expected, the query returns only Entity 1.
You can use aggregation for the counting:
select f.id
from first f
where exists (select 1 from second s where s.key = f.key)
group by f.id
having count(*) = (select count(*) from second);
This assumes that there are no duplicates in the table. It also assumes that extra keys in first are ok. If not, use left join:
select f.id
from first f left join
second s
on s.key = f.key
group by f.id
having count(s.key) = (select count(*) from second) and
count(*) = count(s.key);
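The difference between the two variants shows up only when an id carries a key that is not in the second table. A minimal sketch, assuming Python's sqlite3 module; the table names entity_keys and required_keys, and entity 3 with its extra key 4, are hypothetical stand-ins for first and second:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entity_keys (id INTEGER, [key] INTEGER, value INTEGER);
CREATE TABLE required_keys ([key] INTEGER);
INSERT INTO required_keys VALUES (1), (2), (3);
INSERT INTO entity_keys VALUES
    (1, 1, 1), (1, 2, 2), (1, 3, 3),             -- all required keys
    (2, 2, 2), (2, 3, 3),                        -- missing key 1
    (3, 1, 1), (3, 2, 2), (3, 3, 3), (3, 4, 4);  -- all required keys plus an extra
""")

# Variant 1: extra keys are tolerated (EXISTS filters them out before counting).
has_all = [row[0] for row in conn.execute("""
SELECT f.id
FROM entity_keys f
WHERE EXISTS (SELECT 1 FROM required_keys s WHERE s.[key] = f.[key])
GROUP BY f.id
HAVING COUNT(*) = (SELECT COUNT(*) FROM required_keys)
ORDER BY f.id
""")]

# Variant 2: the key set must match exactly (unmatched rows make COUNT(*) larger).
exact = [row[0] for row in conn.execute("""
SELECT f.id
FROM entity_keys f
LEFT JOIN required_keys s ON s.[key] = f.[key]
GROUP BY f.id
HAVING COUNT(s.[key]) = (SELECT COUNT(*) FROM required_keys)
   AND COUNT(*) = COUNT(s.[key])
ORDER BY f.id
""")]

print(has_all)  # [1, 3]
print(exact)    # [1]
```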

bitwise comparison in bit columns

I have a database table with columns shaped as following:
| ID | name | A | B | C | D |
| 1 | foo | 1 | 0 | 0 | 1 |
| 2 | bar | 0 | 0 | 1 | 1 |
| 3 | foo | 1 | 1 | 0 | 0 |
| 4 | bar | 1 | 1 | 0 | 0 |
A, B, C and D are bit columns.
I need to get the name values for which there are at least two rows that both have at least one identical bit column set to true. The result set I want for the given example is as follows:
| name |
| foo |
I can do the following:
SELECT l.name
FROM dummy l
INNER JOIN dummy r ON l.name = r.name
WHERE (l.A = 1 AND r.A = 1)
OR (l.B = 1 AND r.B = 1)
OR (l.C = 1 AND r.C = 1)
OR (l.D = 1 AND r.D = 1)
GROUP BY l.name
HAVING COUNT(*) > 1
But this gets unreadable quickly since the table is massive. I was wondering if there is a bitwise solution to this.
I suspect that your data model is wrong. It feels like A-D represent the same "type" of thing and so the data ought to be represented using a single column that contains the data values A-D and (if necessary) one column to store the 1 or 0, with separate rows for each A-D value. (But then, of course, we can use the presence of a row to indicate a 1 and the absence of the row to represent a 0).
We can use UNPIVOT to get this "better" structure for the data and then the query becomes trivial:
declare @t table (ID int not null, name char(3) not null, A bit not null, B bit not null,
C bit not null, D bit not null)
insert into @t(ID,name,A,B,C,D) values
(1,'foo',1,0,0,1),
(2,'bar',0,0,1,1),
(3,'foo',1,1,0,0),
(4,'bar',1,1,0,0)
;With ProperLayout as (
select ID,Name,Property,Value
from @t t
unpivot (Value for Property in (A,B,C,D)) u
where Value = 1
)
select name,Property
from ProperLayout
group by name,Property
having COUNT(*) > 1
Result:
name Property
---- ---------
foo A
(Note also that the top of my script is not much different in size to the sample data in your question but has the massive benefit that it's runnable)
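The same reshaping idea can be tried on engines without UNPIVOT. A rough sketch, assuming Python's sqlite3 module, a table named dummy as in the question, and a UNION ALL standing in for UNPIVOT (which SQLite does not have):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dummy (ID INTEGER, name TEXT, A INTEGER, B INTEGER, C INTEGER, D INTEGER);
INSERT INTO dummy VALUES
    (1, 'foo', 1, 0, 0, 1),
    (2, 'bar', 0, 0, 1, 1),
    (3, 'foo', 1, 1, 0, 0),
    (4, 'bar', 1, 1, 0, 0);
""")

# One row per (name, property) that is set; UNION ALL plays the role of UNPIVOT.
rows = conn.execute("""
WITH ProperLayout AS (
    SELECT name, 'A' AS Property FROM dummy WHERE A = 1
    UNION ALL SELECT name, 'B' FROM dummy WHERE B = 1
    UNION ALL SELECT name, 'C' FROM dummy WHERE C = 1
    UNION ALL SELECT name, 'D' FROM dummy WHERE D = 1
)
SELECT name, Property
FROM ProperLayout
GROUP BY name, Property
HAVING COUNT(*) > 1
""").fetchall()

print(rows)  # [('foo', 'A')]
```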
In a similar way, you could also use the APPLY operator:
SELECT a.name FROM dummy t
CROSS APPLY (
VALUES (name, 'A', A), (name, 'B', B), (name, 'C', C), (name, 'D', D)
)a(name , names , value)
WHERE a.value = 1
GROUP BY a.name, a.Names, a.value
HAVING COUNT(*) > 1
From your description, you seem to want:
SELECT l.name
FROM dummy l
GROUP BY l.name
HAVING SUM( CAST(A as int) ) >= 2 OR
SUM( CAST(B as int) ) >= 2 OR
SUM( CAST(C as int) ) >= 2 OR
SUM( CAST(D as int) ) >= 2 ;
This is based on the description. I don't know what the same result row has to do with the question.
It is not hard to read. It is just long.
This would be more efficient:
SELECT distinct l.name
FROM dummy l
INNER JOIN dummy r
ON l.name = r.name
and l.id < r.id
and ( (l.A = 1 AND r.A = 1)
OR (l.B = 1 AND r.B = 1)
OR (l.C = 1 AND r.C = 1)
OR (l.D = 1 AND r.D = 1)
)
order by l.name
You could build it up reading sys.columns
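T-SQL does have bitwise operators (&, |, ^, ~), but they combine values within a single row, so the columns would first have to be packed into one value before rows could be compared that way.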
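For what it's worth, a bitwise AND can express "at least one common column set" once the flags are packed into a single integer. This is only a sketch, assuming Python's sqlite3 module and an arbitrary weighting (A=8, B=4, C=2, D=1) that is not part of the question; in T-SQL the bit columns may need a CAST to int before the arithmetic:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dummy (ID INTEGER, name TEXT, A INTEGER, B INTEGER, C INTEGER, D INTEGER);
INSERT INTO dummy VALUES
    (1, 'foo', 1, 0, 0, 1),
    (2, 'bar', 0, 0, 1, 1),
    (3, 'foo', 1, 1, 0, 0),
    (4, 'bar', 1, 1, 0, 0);
""")

names = [row[0] for row in conn.execute("""
WITH masks AS (
    -- Pack the four flags into one integer per row.
    SELECT ID, name, (A * 8 + B * 4 + C * 2 + D) AS mask
    FROM dummy
)
SELECT DISTINCT l.name
FROM masks l
JOIN masks r ON l.name = r.name AND l.ID < r.ID
-- A non-zero AND means the two rows share at least one set flag.
WHERE (l.mask & r.mask) <> 0
ORDER BY l.name
""")]

print(names)  # ['foo']
```

The two bar rows have masks 3 and 12, whose AND is 0, so bar is not returned. The packing expression still has to list every column, so this shortens the join condition rather than the whole query.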

Need T-SQL query to get multiple choice answer if matches

Example:
Table Question_Answers:
+------+--------+
| q_id | ans_id |
+------+--------+
| 1 | 2 |
| 1 | 4 |
| 2 | 1 |
| 3 | 1 |
| 3 | 2 |
| 3 | 3 |
+------+--------+
Table User_Submited_Answers:
+------+------------+
| q_id | sub_ans_id |
+------+------------+
| 1 | 2 |
| 1 | 4 |
| 2 | 1 |
| 3 | 1 |
| 3 | 2 |
| 3 | 4 |
+------+------------+
I need a T-SQL query that returns 1 for a question if the submitted rows match the correct answers, else 0.
SELECT
t1.q_id,
CASE WHEN COUNT(t2.sub_ans_id) = COUNT(*)
THEN 1
ELSE 0 END AS is_correct
FROM Question_Answers t1
LEFT JOIN User_Submited_Answers t2
ON t1.q_id = t2.q_id AND
t1.ans_id = t2.sub_ans_id
GROUP BY t1.q_id
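A caveat worth checking with this query: the LEFT JOIN starts from Question_Answers, so it verifies that every correct answer was submitted but does not notice an extra wrong submission. A small sketch of that behaviour, assuming Python's sqlite3 module; the extra submitted row (2, 3) is hypothetical and not part of the question's data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Question_Answers (q_id INTEGER, ans_id INTEGER);
CREATE TABLE User_Submited_Answers (q_id INTEGER, sub_ans_id INTEGER);
INSERT INTO Question_Answers VALUES (1, 2), (1, 4), (2, 1), (3, 1), (3, 2), (3, 3);
INSERT INTO User_Submited_Answers VALUES (1, 2), (1, 4), (2, 1), (3, 1), (3, 2), (3, 4);
""")

ONE_SIDED = """
SELECT t1.q_id,
       CASE WHEN COUNT(t2.sub_ans_id) = COUNT(*) THEN 1 ELSE 0 END AS is_correct
FROM Question_Answers t1
LEFT JOIN User_Submited_Answers t2
       ON t1.q_id = t2.q_id AND t1.ans_id = t2.sub_ans_id
GROUP BY t1.q_id
ORDER BY t1.q_id
"""

before = conn.execute(ONE_SIDED).fetchall()

# Hypothetical: the user also ticks a wrong answer (3) for question 2.
conn.execute("INSERT INTO User_Submited_Answers VALUES (2, 3)")
after = conn.execute(ONE_SIDED).fetchall()

print(before)  # [(1, 1), (2, 1), (3, 0)]
print(after)   # [(1, 1), (2, 1), (3, 0)]  question 2 is still marked correct
```

If extra submissions should count as wrong, the comparison has to run in both directions.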
Try the following code:
select qa.q_id,case when qa.ans_id=sqa.ans_id then 1 else 0 end as result from questionans qa
left join subquestionans sqa
on qa.q_id=sqa.q_id and qa.ans_id=sqa.ans_id
This should give you the expected result for every question:
select q_id, min(Is_Correct) as Is_Correct
from (
    select Q.q_id, case when count(A.sub_ans_id) = count(*) then 1 else 0 end as Is_Correct
    from #Q Q
    left join #A A on Q.q_id = A.q_id and Q.ans_id = A.sub_ans_id
    group by Q.q_id
    UNION ALL
    select A.q_id, case when count(Q.ans_id) = count(*) then 1 else 0 end as Is_Correct
    from #Q Q
    right join #A A on Q.q_id = A.q_id and Q.ans_id = A.sub_ans_id
    group by A.q_id
) I
group by q_id
MySQL solution:
SELECT tmp.q_id, MIN(c) as correct
FROM (
SELECT qa.q_id, IF(qa.q_id = usa.q_id, 1, 0) as c
FROM question_answers qa
LEFT JOIN user_submited_answers usa
ON qa.q_id = usa.q_id AND qa.ans_id = usa.sub_ans_id
UNION
SELECT usa.q_id, IF(qa.q_id = usa.q_id, 1, 0) as c
FROM question_answers qa
RIGHT JOIN user_submited_answers usa
ON qa.q_id = usa.q_id AND qa.ans_id = usa.sub_ans_id
) tmp
GROUP BY tmp.q_id;
Now, step by step explanation:
In order to get the right output we will need to:
extract from question_answers table the answers which were not filled in by the user (in your example: q_id = 3 with ans_id = 3)
extract from user_submited_answers table the wrong answers which were filled in by the user (in your example: q_id = 3 with sub_ans_id = 4)
To do that we can use a full outer join (for mysql left join + right join):
SELECT *
FROM question_answers qa
LEFT JOIN user_submited_answers usa
ON qa.q_id = usa.q_id AND qa.ans_id = usa.sub_ans_id
UNION
SELECT *
FROM question_answers qa
RIGHT JOIN user_submited_answers usa
ON qa.q_id = usa.q_id AND qa.ans_id = usa.sub_ans_id;
In the results of the previous query, the rows we are looking for (wrong answers) contain NULL values (depending on the case, in the question_answers columns or in the user_submited_answers columns).
The next step is to mark those rows with 0 (wrong answer) using an IF or CASE statement: IF(qa.q_id = usa.q_id, 1, 0).
To get the final output we need to group by q_id and look for 0 values in the grouped rows. If there is at least one 0, the answer for that question is wrong and it should be marked as that.
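The steps above can be sketched end to end. This is a minimal version assuming Python's sqlite3 module; the right join is written as a second left join with the tables swapped (older SQLite versions lack RIGHT JOIN), and CASE replaces MySQL's IF:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE question_answers (q_id INTEGER, ans_id INTEGER);
CREATE TABLE user_submited_answers (q_id INTEGER, sub_ans_id INTEGER);
INSERT INTO question_answers VALUES (1, 2), (1, 4), (2, 1), (3, 1), (3, 2), (3, 3);
INSERT INTO user_submited_answers VALUES (1, 2), (1, 4), (2, 1), (3, 1), (3, 2), (3, 4);
""")

rows = conn.execute("""
SELECT tmp.q_id, MIN(c) AS correct
FROM (
    -- Correct answers, marked 0 when the user did not submit them.
    SELECT qa.q_id, CASE WHEN usa.q_id IS NULL THEN 0 ELSE 1 END AS c
    FROM question_answers qa
    LEFT JOIN user_submited_answers usa
           ON qa.q_id = usa.q_id AND qa.ans_id = usa.sub_ans_id
    UNION
    -- Submitted answers, marked 0 when they are not among the correct ones.
    SELECT usa.q_id, CASE WHEN qa.q_id IS NULL THEN 0 ELSE 1 END AS c
    FROM user_submited_answers usa
    LEFT JOIN question_answers qa
           ON qa.q_id = usa.q_id AND qa.ans_id = usa.sub_ans_id
) tmp
GROUP BY tmp.q_id
ORDER BY tmp.q_id
""").fetchall()

print(rows)  # [(1, 1), (2, 1), (3, 0)]
```

Question 3 gets 0 from both halves: ans_id 3 was never submitted, and sub_ans_id 4 is not a correct answer.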

Join tables with distinct highest ranked row

I have three tables defined like this:
[tbMember]
memberID | memberName
1 | John
2 | Peter
[tbGroup]
groupID | groupName
1 | Alpha
2 | Beta
3 | Gamma
[tbMemberGroupRelation]
memberID | groupID | memberRank (larger number is higher)
1 | 1 | 0
1 | 2 | 1
2 | 1 | 5
2 | 2 | 3
2 | 3 | 1
And now I want to perform a table-join selection to get a result that contains each (distinct) member with his highest-ranked group in each row. For the example above, the desired query result is:
memberID | memberName | groupName | memberRank
1 | John | Beta | 1
2 | Peter | Alpha | 5
Is there a way to implement it in a single SQL statement in the following style?
select * from tbMember m
left join tbMemberGroupRelation mg on (m.MemberID = mg.MemberID and ......)
left join tbGroup g on (mg.GroupID = g.GroupID)
Any other solutions are also appreciated if it is impossible to write in a simple query.
========= UPDATED =========
Only ONE highest rank is allowed in table
One solution would be to create an inverted sequence/rank of the memberRank so that the highest rank per member is always equal to 1.
This is how I achieved it using a sub-query:
SELECT
m.memberID,
m.memberName,
g.groupName,
mg.memberRank
FROM
tbMember m
LEFT JOIN
(
SELECT
memberID,
groupID,
memberRank,
RANK() OVER(PARTITION BY memberID ORDER BY memberRank DESC) AS invRank
FROM
tbMemberGroupRelation
) mg
ON (mg.memberID = m.memberID)
AND (mg.invRank = 1)
LEFT JOIN
tbGroup g
ON (g.groupID = mg.groupID);
An alternative method:
SELECT
M.memberID,
M.memberName,
G.groupName,
MG.memberRank
FROM
tbMember M
LEFT OUTER JOIN tbMemberGroupRelation MG ON MG.memberID = M.memberID
LEFT OUTER JOIN tbMemberGroupRelation MG2 ON
MG2.memberID = M.memberID AND
MG2.memberRank > MG.memberRank
INNER JOIN tbGroup G ON G.groupid = MG.groupid
WHERE
MG2.memberid IS NULL
Might perform better in some situations due to indexing, etc.
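The self-join keeps a relation row only when no row with a higher rank exists for the same member. A minimal sketch on the question's sample data, assuming Python's sqlite3 module and the question's table names (tbMember, tbGroup, tbMemberGroupRelation), with a LEFT JOIN to tbGroup so members without groups would still appear:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbMember (memberID INTEGER, memberName TEXT);
CREATE TABLE tbGroup (groupID INTEGER, groupName TEXT);
CREATE TABLE tbMemberGroupRelation (memberID INTEGER, groupID INTEGER, memberRank INTEGER);
INSERT INTO tbMember VALUES (1, 'John'), (2, 'Peter');
INSERT INTO tbGroup VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
INSERT INTO tbMemberGroupRelation VALUES (1, 1, 0), (1, 2, 1), (2, 1, 5), (2, 2, 3), (2, 3, 1);
""")

rows = conn.execute("""
SELECT m.memberID, m.memberName, g.groupName, mg.memberRank
FROM tbMember m
LEFT JOIN tbMemberGroupRelation mg ON mg.memberID = m.memberID
-- mg2 looks for any row of the same member with a strictly higher rank.
LEFT JOIN tbMemberGroupRelation mg2
       ON mg2.memberID = m.memberID AND mg2.memberRank > mg.memberRank
LEFT JOIN tbGroup g ON g.groupID = mg.groupID
-- No such row found means mg is the member's top-ranked relation.
WHERE mg2.memberID IS NULL
ORDER BY m.memberID
""").fetchall()

print(rows)  # [(1, 'John', 'Beta', 1), (2, 'Peter', 'Alpha', 5)]
```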
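Another option is ROW_NUMBER() in a CTE. Setup and query:
create table [tbMember] (memberid int, membername varchar(8000))
Insert [tbMember] Values (1, 'John')
Insert [tbMember] Values (2, 'Peter')
create table [tbGroup] (groupid int, groupname varchar(8000))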
Insert [tbGroup] Values (1, 'Alpha')
Insert [tbGroup] Values (2, 'Beta')
Insert [tbGroup] Values (3, 'Gamma')
create table [tbMemberGroupRelation] (memberid int, groupid int, memberrank int)
Insert [tbMemberGroupRelation] Values (1,1,0)
Insert [tbMemberGroupRelation] Values (1,2,1)
Insert [tbMemberGroupRelation] Values (2,1,5)
Insert [tbMemberGroupRelation] Values (2,2,3)
Insert [tbMemberGroupRelation] Values (2,3,1)
;With cteMemberGroupRelation As
(
Select *, Row_Number() Over (Partition By MemberID Order By MemberRank Desc) SortOrder
From [tbMemberGroupRelation]
)
Select *
From tbMember M
Join (Select * From cteMemberGroupRelation Where SortOrder = 1) R On R.memberid = M.memberid
Join tbGroup G On G.groupid = R.groupid

SQL searching for rows that contain multiple criteria

I have 3 tables
Customer
Groups
CustomerGroupJoins
Fields to be used
Customer:Key
Groups:Key
CustomerGroupJoins:KeyCustomer, KeyGroup
I need to search for all users that are in all of the groups with keys 1, 2, 3.
I was thinking something like (but have no idea whether this is the right/best way to go):
SELECT
*
FROM
Customer
WHERE
Key = (
SELECT KeyCustomer
FROM CustomerGroupJoins
WHERE KeyGroup = a
) = (
SELECT KeyCustomer
FROM CustomerGroupJoins
WHERE KeyGroup = b
) = (
SELECT KeyCustomer
FROM CustomerGroupJoins
WHERE KeyGroup = c
)
I created this test data:
srh@srh@[local] =# select * from customer join customergroupjoins on customer.key = customergroupjoins.keycustomer join groups on groups.key = customergroupjoins.keygroup;
key | name | keycustomer | keygroup | key | name
-----+--------+-------------+----------+-----+---------
1 | fred | 1 | 1 | 1 | alpha
1 | fred | 1 | 2 | 2 | beta
1 | fred | 1 | 3 | 3 | gamma
2 | jim | 2 | 1 | 1 | alpha
2 | jim | 2 | 2 | 2 | beta
2 | jim | 2 | 4 | 4 | delta
2 | jim | 2 | 5 | 5 | epsilon
3 | shelia | 3 | 1 | 1 | alpha
3 | shelia | 3 | 3 | 3 | gamma
3 | shelia | 3 | 5 | 5 | epsilon
(10 rows)
So "fred" is the only customer in all of (alpha, beta, gamma). To determine that:
srh@srh@[local] =# select * from customer
where exists (select 1 from customergroupjoins where keycustomer = customer.key and keygroup = 1)
and exists (select 1 from customergroupjoins where keycustomer = customer.key and keygroup = 2)
and exists (select 1 from customergroupjoins where keycustomer = customer.key and keygroup = 3);
key | name
-----+------
1 | fred
(1 row)
This is one approach. The (1,2,3) - your known group keys - are the parameters in the subqueries. Someone already mentioned you don't actually need to join to the groups table at all.
Another way:
select customer.*
from customer
join customergroupjoins g1 on g1.keycustomer = customer.key
join customergroupjoins g2 on g2.keycustomer = customer.key
join customergroupjoins g3 on g3.keycustomer = customer.key
where g1.keygroup = 1 and g2.keygroup = 2 and g3.keygroup = 3
The general problem of finding users with all groups (g_1, g_2 .. g_N) is a bit trickier. The queries above join to the link table (customergroupjoins) N times, so the query differs depending on the number of groups you're checking against.
One approach to that is to create a temporary table to use as a query parameter: the table contains the list of groups that the customers must have all of. So for instance create a temp table called "paramgroup" (or "#paramgroup" on SQL Server to mark it as temporary), populate it with the group keys you're interested in and then do this:
select * from customer where key in (
select keycustomer
from customergroupjoins
join paramgroup on paramgroup.keygroup = customergroupjoins.keygroup
group by keycustomer
having count(*) = (select count(*) from paramgroup))
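As a quick check of the parameter-table approach, here is a small sketch using the test data above. It assumes Python's sqlite3 module; the helper customers_in_all is a hypothetical name, and the paramgroup table is refilled on each call to act as the query parameter:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer ([key] INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE customergroupjoins (keycustomer INTEGER, keygroup INTEGER);
CREATE TABLE paramgroup (keygroup INTEGER);
INSERT INTO customer VALUES (1, 'fred'), (2, 'jim'), (3, 'shelia');
INSERT INTO customergroupjoins VALUES
    (1, 1), (1, 2), (1, 3),
    (2, 1), (2, 2), (2, 4), (2, 5),
    (3, 1), (3, 3), (3, 5);
""")


def customers_in_all(group_keys):
    """Return customers that belong to every group in group_keys."""
    conn.execute("DELETE FROM paramgroup")
    conn.executemany("INSERT INTO paramgroup VALUES (?)", [(k,) for k in group_keys])
    return conn.execute("""
        SELECT * FROM customer WHERE [key] IN (
            SELECT keycustomer
            FROM customergroupjoins
            JOIN paramgroup ON paramgroup.keygroup = customergroupjoins.keygroup
            GROUP BY keycustomer
            HAVING COUNT(*) = (SELECT COUNT(*) FROM paramgroup))
        ORDER BY [key]
    """).fetchall()


all_three = customers_in_all([1, 2, 3])
alpha_epsilon = customers_in_all([1, 5])
print(all_three)      # [(1, 'fred')]
print(alpha_epsilon)  # [(2, 'jim'), (3, 'shelia')]
```

The same query text serves any number of groups, which is the point of moving the list into a table.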
Also, as a beginner, I strongly recommend you look into advice about naming conventions for database tables and columns. Everyone has different ideas (and they can spark off holy wars), but pick some standards (if they aren't dictated to you) and stick to them. For instance you named one table "customer" (singular) and one table "groups" (plural) which looks bad. It's more usual to use "id" rather than "key", and to use it as a suffix ("customer_id" or "CustomerID") than a prefix. The whole CamelCase vs old_skool argument is more a matter of style, as is the primary-key-is-just-"id"-not-"table_id".
The above solutions will work if the customer is in any of the three groups, but won't check for membership in all of them.
Try this instead:
SELECT a.*
FROM (SELECT c.*, substring((SELECT (', ' + CAST(cg.KeyGroup AS varchar(10)))
FROM CustomerGroupJoins cg
WHERE cg.KeyCustomer = c.[Key]
AND cg.KeyGroup IN (1,2,3)
ORDER BY cg.KeyGroup ASC
FOR XML PATH('')), 3, 2000) AS GroupList
FROM Customer AS c) AS a
WHERE a.GroupList = ('1, 2, 3')
This will also work:
SELECT c.*
FROM Customer c
WHERE c.[Key] IN (SELECT cg.KeyCustomer
FROM CustomerGroupJoins cg
WHERE cg.KeyGroup IN (1,2,3)
GROUP BY cg.KeyCustomer
HAVING count(*) = 3)
Maybe something like this?
SELECT c.Key, g.Key, cgj.KeyCustomer, cgj.KeyGroup
FROM Customer c
LEFT JOIN CustomerGroupJoins cgj ON cgj.KeyCustomer = c.Key
LEFT JOIN Groups g ON g.Key = cgj.KeyGroup
WHERE g.key IN (1, 2, 3)
From what you described, try this:
SELECT * FROM Customer c
INNER JOIN CustomerGroupJoins cgj
ON c.key = cgj.keyCustomer
INNER JOIN groups g
ON cgj.keyGroup = g.key
WHERE g.key IN (1,2,3)
SELECT *
FROM customer c
INNER JOIN customerGroupJoins j ON(j.KeyCustomer = c.key)
WHERE j.keyGroup IN (1, 2, 3)
You don't need to join against groups-table, as long as you are only interested in the group key, which is found in your join table.
Here's a possible answer, not tested:
select custid
from CustomerGroupJoins
where groupid in (1,2,3)
group by custid
having count(*) = 3
This searches for customers that have 3 rows with groupid 1, 2, or 3, which means that they are in all 3 groups, because I assume you have a primary key on (custid,groupid).
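The primary-key assumption matters. If duplicate (custid, groupid) rows can exist, COUNT(*) may reach 3 without all three groups being present, and COUNT(DISTINCT groupid) is the safer test. A short sketch, assuming Python's sqlite3 module; customer 9 and its duplicate row are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CustomerGroupJoins (custid INTEGER, groupid INTEGER);
INSERT INTO CustomerGroupJoins VALUES
    (1, 1), (1, 2), (1, 3),   -- in all three groups
    (2, 1), (2, 2), (2, 4),   -- missing group 3
    (9, 1), (9, 1), (9, 2);   -- hypothetical duplicate row, still missing group 3
""")


def customers(count_expr):
    """Run the grouped query with the given counting expression in HAVING."""
    sql = f"""
        SELECT custid
        FROM CustomerGroupJoins
        WHERE groupid IN (1, 2, 3)
        GROUP BY custid
        HAVING {count_expr} = 3
        ORDER BY custid
    """
    return [row[0] for row in conn.execute(sql)]


with_count_star = customers("COUNT(*)")
with_count_distinct = customers("COUNT(DISTINCT groupid)")
print(with_count_star)      # [1, 9]
print(with_count_distinct)  # [1]
```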