SQL - Intersections

I have a table like the one below:
|Group|User|
| 1 | X |
| 1 | Y |
| 1 | Z |
| 2 | X |
| 2 | Y |
| 2 | Z |
| 3 | X |
| 3 | Z |
| 4 | X |
I want to calculate intersections of groups: 1&2, 1&2&3, 1&2&3&4.
Then I want to see the users in each intersection and how many of them there are.
For the given example, the result should be:
1&2 -> 3
1&2&3 -> 2
1&2&3&4 -> 1
Is it possible using SQL?
If yes, how can I start?
I work with Python on a daily basis, but now I have to translate this code into SQL and I am having a lot of trouble with it.

In PostgreSQL you can use JOIN or INTERSECT. Example with JOIN:
select count(*)
from t a
join t b on b.usr = a.usr
where a.grp = 1 and b.grp = 2
Then:
select count(*)
from t a
join t b on b.usr = a.usr
join t c on c.usr = a.usr
where a.grp = 1 and b.grp = 2 and c.grp = 3
And:
select count(*)
from t a
join t b on b.usr = a.usr
join t c on c.usr = a.usr
join t d on d.usr = a.usr
where a.grp = 1 and b.grp = 2 and c.grp = 3 and d.grp = 4
These queries return 3, 2, and 1 respectively.
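The INTERSECT variant mentioned above is not spelled out, so here is a minimal sketch of it. It runs against an in-memory SQLite database from Python purely for convenience (the answer targets PostgreSQL, where the same SELECT ... INTERSECT chain should behave the same way). The table name t and the columns grp and usr follow the answer's queries; the helper users_in_all is a hypothetical name introduced for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (grp INTEGER, usr TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "X"), (1, "Y"), (1, "Z"), (2, "X"), (2, "Y"), (2, "Z"),
     (3, "X"), (3, "Z"), (4, "X")],
)

def users_in_all(groups):
    """Return the users present in every listed group (hypothetical helper)."""
    # One SELECT per group, chained with INTERSECT.
    query = " INTERSECT ".join("SELECT usr FROM t WHERE grp = ?" for _ in groups)
    return sorted(row[0] for row in conn.execute(query, groups))

for groups in ([1, 2], [1, 2, 3], [1, 2, 3, 4]):
    users = users_in_all(groups)
    print(groups, users, len(users))
```

Each SELECT yields the users of one group, and INTERSECT keeps only those that appear in all of them, so the length of the result is the intersection size.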
EDIT - You can also do:
select count(*)
from (
select usr, count(*)
from t
where grp in (1, 2, 3, 4)
group by usr
having count(*) = 4 -- number of groups
) x
See running example at db<>fiddle.
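The GROUP BY / HAVING form generalises to all three intersections with one parameterised query. The sketch below assumes the groups are numbered consecutively from 1, so "groups 1..n" can be written as grp <= n; it uses SQLite from Python only so that it is runnable as-is, and the table and column names (t, grp, usr) are taken from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (grp INTEGER, usr TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "X"), (1, "Y"), (1, "Z"), (2, "X"), (2, "Y"), (2, "Z"),
     (3, "X"), (3, "Z"), (4, "X")],
)

# A user is in the intersection of groups 1..n when it occurs in n distinct
# groups among those n groups (assumes groups are numbered 1, 2, 3, ...).
QUERY = """
SELECT usr
FROM t
WHERE grp <= :n
GROUP BY usr
HAVING COUNT(DISTINCT grp) = :n
ORDER BY usr
"""

results = {}
for n in (2, 3, 4):
    results[n] = [row[0] for row in conn.execute(QUERY, {"n": n})]
    label = "&".join(str(g) for g in range(1, n + 1))
    print(label, "->", results[n], len(results[n]))
```

COUNT(DISTINCT grp) rather than COUNT(*) keeps the result correct even if the same (group, user) pair is stored twice.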

Related

How to query all rows where a given column matches at least all the values in a given array with PostgreSQL?

The request below:
SELECT foos.id,bars.name
FROM foos
JOIN bar_foo ON (bar_foo.foo_id = id )
JOIN bars ON (bars.id = bar_foo.bar_id )
returns a list like this:
id | name
---+-----
1 | a
1 | b
2 | a
2 | y
2 | z
3 | a
3 | b
3 | c
3 | d
How to get the ids for which id must have at least a and b, and more generally the content of a given array ?
From the example above, I would get:
id | name
---+-----
1 | a
1 | b
3 | a
3 | b
3 | c
3 | d
For two values, you can use windowing boolean aggregation:
select *
from (
select f.id, b.name,
bool_or(b.name = 'a') over(partition by f.id) has_a,
bool_or(b.name = 'b') over(partition by f.id) has_b
from foos f
join bar_foo bf on bf.foo_id = f.id
join bars b on b.id = bf.bar_id
) t
where has_a and has_b
A more generic approach uses array aggregation:
select *
from (
select f.id, b.name,
array_agg(b.name) over(partition by f.id) arr_names
from foos f
join bar_foo bf on bf.foo_id = f.id
join bars b on b.id = bf.bar_id
) t
where arr_names @> array['a', 'b']
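The "has at least all of these names" condition can also be sketched without PostgreSQL-specific arrays, using GROUP BY / HAVING. The example below is an illustration in SQLite driven from Python, not the answer's method: the single table foo_names is a hypothetical stand-in for the result of the three-table join in the question, and the wanted list plays the role of the given array.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# foo_names is a hypothetical table standing in for the foos/bar_foo/bars join.
conn.execute("CREATE TABLE foo_names (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO foo_names VALUES (?, ?)",
    [(1, "a"), (1, "b"), (2, "a"), (2, "y"), (2, "z"),
     (3, "a"), (3, "b"), (3, "c"), (3, "d")],
)

wanted = ["a", "b"]
placeholders = ", ".join("?" for _ in wanted)

# Inner query: ids that match every wanted name.
# Outer query: all rows belonging to those ids.
query = f"""
SELECT id, name
FROM foo_names
WHERE id IN (
    SELECT id
    FROM foo_names
    WHERE name IN ({placeholders})
    GROUP BY id
    HAVING COUNT(DISTINCT name) = ?
)
ORDER BY id, name
"""
rows = conn.execute(query, wanted + [len(wanted)]).fetchall()
print(rows)
```

Only ids 1 and 3 carry both a and b, so id 2 is filtered out while all rows of the matching ids are kept.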

Need T-SQL query to get multiple choice answer if matches

Example:
Table Question_Answers:
+------+--------+
| q_id | ans_id |
+------+--------+
| 1 | 2 |
| 1 | 4 |
| 2 | 1 |
| 3 | 1 |
| 3 | 2 |
| 3 | 3 |
+------+--------+
User_Submited_Answers:
| q_id | sub_ans_id |
+------+------------+
| 1 | 2 |
| 1 | 4 |
| 2 | 1 |
| 3 | 1 |
| 3 | 2 |
| 3 | 4 |
+------+------------+
I need a T-SQL query that returns 1 for a question if the submitted rows match its answers, and 0 otherwise.
SELECT
t1.q_id,
CASE WHEN COUNT(t2.sub_ans_id) = COUNT(*)
THEN 1
ELSE 0 END AS is_correct
FROM Question_Answers t1
LEFT JOIN User_Submited_Answers t2
ON t1.q_id = t2.q_id AND
t1.ans_id = t2.sub_ans_id
GROUP BY t1.q_id
Try the following code:
select qa.q_id,case when qa.ans_id=sqa.ans_id then 1 else 0 end as result from questionans qa
left join subquestionans sqa
on qa.q_id=sqa.q_id and qa.ans_id=sqa.ans_id
This should give you expected result for every question.
select q_id, min(Is_Correct)Is_Correct from (
select Q.q_id,case when count(A.sub_ans_id)=count(*) then 1 else 0 end as Is_Correct
from #Q Q left join #A A on Q.q_id=A.q_id and Q.ans_id=A.sub_ans_id
group by Q.q_id
UNION ALL
select A.q_id,case when count(Q.ans_id)=count(*) then 1 else 0 end as Is_Correct
from #Q Q right join #A A on Q.q_id=A.q_id and Q.ans_id=A.sub_ans_id
group by A.q_id ) I group by q_id
MySQL solution (sql fiddle):
SELECT tmp.q_id, MIN(c) as correct
FROM (
SELECT qa.q_id, IF(qa.q_id = usa.q_id, 1, 0) as c
FROM question_answers qa
LEFT JOIN user_submited_answers usa
ON qa.q_id = usa.q_id AND qa.ans_id = usa.sub_ans_id
UNION
SELECT usa.q_id, IF(qa.q_id = usa.q_id, 1, 0) as c
FROM question_answers qa
RIGHT JOIN user_submited_answers usa
ON qa.q_id = usa.q_id AND qa.ans_id = usa.sub_ans_id
) tmp
GROUP BY tmp.q_id;
Now, step by step explanation:
In order to get the right output we will need to:
extract from question_answers table the answers which were not filled in by the user (in your example: q_id = 3 with ans_id = 3)
extract from user_submited_answers table the wrong answers which were filled in by the user (in your example: q_id = 3 with sub_ans_id = 4)
To do that we can use a full outer join (for mysql left join + right join):
SELECT *
FROM question_answers qa
LEFT JOIN user_submited_answers usa
ON qa.q_id = usa.q_id AND qa.ans_id = usa.sub_ans_id
UNION
SELECT *
FROM question_answers qa
RIGHT JOIN user_submited_answers usa
ON qa.q_id = usa.q_id AND qa.ans_id = usa.sub_ans_id;
From the previous query results, the rows we are looking for (wrong answers) contain NULL values (in the question_answers columns or in the user_submited_answers columns, depending on the case).
The next step is to mark those rows with 0 (wrong answer) using an IF or CASE statement: IF(qa.q_id = usa.q_id, 1, 0).
To get the final output we need to group by q_id and look for 0 values in the grouped rows. If there is at least one 0, the answer for that question is wrong and it should be marked as that.
Check sql fiddle: SQL Fiddle
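The two-sided check explained above can be tried out with the sample data from the question. This is a sketch in SQLite driven from Python, not the MySQL query itself: it replaces the RIGHT JOIN with a second LEFT JOIN whose tables are swapped (older SQLite versions have no RIGHT JOIN) and uses CASE with IS NULL instead of IF.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE question_answers (q_id INTEGER, ans_id INTEGER);
CREATE TABLE user_submited_answers (q_id INTEGER, sub_ans_id INTEGER);
INSERT INTO question_answers VALUES (1,2),(1,4),(2,1),(3,1),(3,2),(3,3);
INSERT INTO user_submited_answers VALUES (1,2),(1,4),(2,1),(3,1),(3,2),(3,4);
""")

QUERY = """
SELECT q_id, MIN(c) AS correct
FROM (
    -- expected answers, flagged 0 when the user did not submit them
    SELECT qa.q_id AS q_id,
           CASE WHEN usa.q_id IS NULL THEN 0 ELSE 1 END AS c
    FROM question_answers qa
    LEFT JOIN user_submited_answers usa
      ON qa.q_id = usa.q_id AND qa.ans_id = usa.sub_ans_id
    UNION ALL
    -- submitted answers, flagged 0 when they are not among the expected ones
    SELECT usa.q_id,
           CASE WHEN qa.q_id IS NULL THEN 0 ELSE 1 END
    FROM user_submited_answers usa
    LEFT JOIN question_answers qa
      ON qa.q_id = usa.q_id AND qa.ans_id = usa.sub_ans_id
) AS tmp
GROUP BY q_id
ORDER BY q_id
"""
correct = dict(conn.execute(QUERY).fetchall())
print(correct)
```

Question 3 comes out as 0 for two reasons at once: the expected answer 3 is missing and the submitted answer 4 is not expected.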

Hierarchy SQL server joining two different tables

I'm having a problem making a hierarchy query.
I have the following in MS SQL DB:
Table A - Orders with order code, article, qty:
OP | ART | QTY
A | X |100
B | Y |200
Table B holds the assembly references of articles; an article CAN itself be made from other articles if a child reference exists (this may need to go 3 levels deep):
ART | ART2 |QTY
X | U | 20
X | O | 10
X | Z | 30
Y | Q | 20
Y | W | 15
Y | E | 30
U | Z | 10
And I want to get something like this:
A.OP |LEVEL| ART | B.ART2 |QTY
A | 2 | X | Z |(100*20*10)=20000
A | 1 | X | O |(100*10) =1000
A | 1 | X | Z |(100*30) = 3000
B | 1 | Y | Q |(200*20) = 4000
B | 1 | Y | W |(200*15) = 3000
B | 1 | Y | E |(200*30) = 6000
B | 1 | Y | Z |(200*10) = 2000
I've already made one thing:
WITH X AS (
SELECT
firstlvl.ART,
1 AS LEVEL,
firstlvl.ART2,
firstlvl.QTY,
QTY AS PARENTQTY
FROM B AS firstlvl
WHERE firstlvl.ART='X'
UNION ALL
SELECT secondlevel.ART,
EL.LEVEL +1,
secondlevel.ART2,
secondlevel.QTY,
EL.PARENTQTY AS PARENTQTY
FROM B AS secondlevel
INNER JOIN X AS EL
ON secondlevel.ART = EL.ART2)
SELECT * FROM X
But now I don't know how to join quantities with table A nor how to run this query for all items on the first table.
Can anyone help me please?
Many Thanks!
with AllData as
( select op as art, art as art2, qty from a
union all
select * from b),
Tree(RootLvl, Father, Child, qty, lvl) as (
select art, art, art2, qty, 0 from AllData
where art in ( 'A' , 'B')
union all
select C.RootLvl, C.Child, AllData.art2, AllData.qty * c.qty, C.lvl + 1 from Tree C
join AllData
on C.Child = AllData.art
)
select C1.RootLvl as OP, C1.Lvl, C1.Child, C1.Qty from Tree c1
left join Tree c2
on c1.child = c2.father and
c1.rootLvl = c2.RootLvl
where C2.RootLvl is null
order by 1, 2 desc
SQL Fiddle demo
The A and B tables have almost the same structure, so I combined them into one to simplify things. After that we use recursion to build the tree, and at the end I use a left join to keep only the leaves of the tree.
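The recursion-plus-leaf-filter idea can be reproduced on the sample data as follows. This is a variation on the answer rather than a copy of it: it runs in SQLite from Python, keeps the two tables separate under the hypothetical names orders and bom, starts the recursion from the orders joined to their direct components, and filters leaves with NOT EXISTS instead of a left join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# orders and bom are hypothetical names for the question's tables A and B.
conn.executescript("""
CREATE TABLE orders (op TEXT, art TEXT, qty INTEGER);
CREATE TABLE bom (art TEXT, art2 TEXT, qty INTEGER);
INSERT INTO orders VALUES ('A','X',100),('B','Y',200);
INSERT INTO bom VALUES ('X','U',20),('X','O',10),('X','Z',30),
                       ('Y','Q',20),('Y','W',15),('Y','E',30),
                       ('U','Z',10);
""")

QUERY = """
WITH RECURSIVE tree(op, lvl, art, art2, qty) AS (
    -- level 1: direct components of each ordered article
    SELECT o.op, 1, b.art, b.art2, o.qty * b.qty
    FROM orders o
    JOIN bom b ON b.art = o.art
    UNION ALL
    -- deeper levels: components of components, multiplying quantities
    SELECT t.op, t.lvl + 1, b.art, b.art2, t.qty * b.qty
    FROM tree t
    JOIN bom b ON b.art = t.art2
)
SELECT op, lvl, art, art2, qty
FROM tree t
-- keep only leaves: components that are not assembled from anything else
WHERE NOT EXISTS (SELECT 1 FROM bom b WHERE b.art = t.art2)
ORDER BY op, lvl DESC, art2
"""
leaves = conn.execute(QUERY).fetchall()
for row in leaves:
    print(row)
```

In this sketch the art column holds the direct parent of each leaf (U for the level-2 row), not the ordered article; carrying the root article along would need one more column in the CTE.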

SQL query uses "wrong" join

I have a query which gives me the wrong result.
Tables:
A
+----+
| id |
+----+
| 1 |
| 2 |
+----+
B
+----+----+
| id | x | B.id = A.id
+----+----+
| 1 | 1 |
| 1 | 1 |
| 1 | 0 |
+----+----+
C
+----+----+
| id | y | C.id = A.id
+----+----+
| 1 | 1 |
| 1 | 2 |
+----+----+
What I want to do: Select all rows from A. For each row in A count in B all x with value 1 and all x with value 0 with B.id = A.id. For each row in A get the minimum y from C with C.id = A.id.
The result I am expecting is:
+----+------+--------+---------+
| id | min | count1 | count 2 |
+----+------+--------+---------+
| 1 | 1 | 2 | 1 |
| 2 | NULL | 0 | 0 |
+----+------+--------+---------+
First Try:
This doesn't work.
SELECT a.id,
MIN(c.y),
SUM(IF(b.x = 1, 1, 0)),
SUM(IF(b.x = 0, 1, 0))
FROM a
LEFT JOIN b
ON ( a.id = b.id )
LEFT JOIN c
ON ( a.id = c.id )
GROUP BY a.id
+----+------+--------+---------+
| id | min | count1 | count 2 |
+----+------+--------+---------+
| 1 | 1 | 4 | 2 |
| 2 | NULL | 0 | 0 |
+----+------+--------+---------+
Second Try:
This works but I am sure it has a bad performance.
SELECT a.id,
MIN(c.y),
b.x,
b.y
FROM a
LEFT JOIN (SELECT b.id, SUM(IF(b.x = 1, 1, 0)) x, SUM(IF(b.x = 0, 1, 0)) y FROM b GROUP BY b.id) b
ON ( a.id = b.id )
LEFT JOIN c
ON ( a.id = c.id )
GROUP BY a.id
+----+------+--------+---------+
| id | min | count1 | count 2 |
+----+------+--------+---------+
| 1 | 1 | 2 | 1 |
| 2 | NULL | 0 | 0 |
+----+------+--------+---------+
Last Try:
This works too.
SELECT x.*,
SUM(IF(b.x = 1, 1, 0)),
SUM(IF(b.x = 0, 1, 0))
FROM (SELECT a.id,
MIN(c.y)
FROM a
LEFT JOIN c
ON ( a.id = c.id )
GROUP BY a.id) x
LEFT JOIN b
ON ( b.id = x.id )
GROUP BY x.id
Now my question is: is the last one the best choice, or is there a way to write this query with just one select statement (like in the first try)?
Your joins are doing cartesian products for a given value, because there are multiple rows in each table.
You can fix this by using count(distinct) rather than sum():
SELECT a.id, MIN(c.y),
count(distinct (case when b.x = 1 then b.id end)),
count(distinct (case when b.x = 0 then b.id end))
FROM a
LEFT JOIN b
ON ( a.id = b.id )
LEFT JOIN c
ON ( a.id = c.id )
GROUP BY a.id;
You can also fix this by pre-aggregating b (and/or c). And you would need to take that approach if your aggregation function were something like the sum of a column in b.
EDIT:
You are correct. The above query counts the distinct values of B, but B contains rows that are exact duplicates. (Personally, I think having a column with the name id that has duplicates is a sign of poor design, but that is another issue.)
You could solve it by having a real id in the b table, because then the count(distinct) would count the correct values. You can also solve it by aggregating the two tables before joining them in:
SELECT a.id, c.y, x1, x0
FROM a
LEFT JOIN (select b.id,
sum(b.x = 1) as x1,
sum(b.x = 0) as x0
from b
group by b.id
) b
ON ( a.id = b.id )
LEFT JOIN (select c.id, min(c.y) as y
from c
group by c.id
) c
ON ( a.id = c.id );
Here is a SQL Fiddle for the problem.
EDIT II:
You can get it in one statement, but I'm not so sure that it would work on similar data. The idea is that you can count all the cases where x = 1 and then divide by the number of rows in the C table to get the real distinct count:
SELECT a.id, MIN(c.y),
coalesce(sum(b.x = 1), 0) / count(distinct coalesce(c.y, -1)),
coalesce(sum(b.x = 0), 0) / count(distinct coalesce(c.y, -1))
FROM a
LEFT JOIN b
ON ( a.id = b.id )
LEFT JOIN c
ON ( a.id = c.id )
GROUP BY a.id;
It is a little tricky, because you have to handle NULLs to get the right values. Note that this is counting the y value to get a distinct count from the C table. Your question reinforces why it is a good idea to have a unique integer primary key in every table.
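To make the row multiplication and the pre-aggregation fix concrete, here is a small sketch on the question's data. It uses SQLite from Python instead of MySQL, so SUM(CASE ...) stands in for SUM(IF(...)), and COALESCE is added so that id 2 shows 0 rather than NULL; the subquery aliases bx and cy are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (id INTEGER);
CREATE TABLE b (id INTEGER, x INTEGER);
CREATE TABLE c (id INTEGER, y INTEGER);
INSERT INTO a VALUES (1),(2);
INSERT INTO b VALUES (1,1),(1,1),(1,0);
INSERT INTO c VALUES (1,1),(1,2);
""")

# Joining b and c directly pairs every b row with every c row for id = 1.
fan_out = conn.execute("""
SELECT COUNT(*)
FROM a
JOIN b ON a.id = b.id
JOIN c ON a.id = c.id
WHERE a.id = 1
""").fetchone()[0]
print("joined rows for id 1:", fan_out)

# Aggregating each table to one row per id before joining avoids that.
PREAGG = """
SELECT a.id, cy.y, COALESCE(bx.x1, 0), COALESCE(bx.x0, 0)
FROM a
LEFT JOIN (SELECT id,
                  SUM(CASE WHEN x = 1 THEN 1 ELSE 0 END) AS x1,
                  SUM(CASE WHEN x = 0 THEN 1 ELSE 0 END) AS x0
           FROM b GROUP BY id) bx ON a.id = bx.id
LEFT JOIN (SELECT id, MIN(y) AS y FROM c GROUP BY id) cy ON a.id = cy.id
ORDER BY a.id
"""
rows = conn.execute(PREAGG).fetchall()
print(rows)
```

The three b rows times the two c rows give six joined rows for id 1, which is why the first attempt reported 4 and 2 instead of 2 and 1.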

Sql query with many to many tables

I'm using VB.Net express with an Access file, and I have the following tables:
table Formula
id | name
-------------------
1 | formula 1
2 | formula 2
3 | formula 3
table Component
id | name
--------------------
1 | A
2 | B
3 | C
4 | D
table FormulaComponents
formula_id | component_id
-------------------------
1 | 1
1 | 2
1 | 4
2 | 1
2 | 3
2 | 4
3 | 1
3 | 2
3 | 3
So each formula has one or more components.
Which query should I use if I want all the formulas that contain, for example, Component A AND Component D (Result: formula 1, formula 2)? I tried something with INTERSECT, but it doesn't seem to work in VB...
Thanks!
Update:
select f.*
from (
-- formulas that contain both named components
select fc.formula_id
from FormulaComponents fc
inner join Component c on fc.component_id = c.id
where c.name in ('A', 'D')
group by fc.formula_id
-- assumes each (formula_id, component_id) pair appears only once
having count(*) = 2
) f2
inner join Formula f on f2.formula_id = f.id
SELECT f.*
FROM Formula f
WHERE EXISTS (SELECT * FROM FormulaComponents fc1
INNER JOIN Component c1 ON c1.id = fc1.component_id
WHERE fc1.formula_id = f.id AND c1.name = 'A')
AND EXISTS (SELECT * FROM FormulaComponents fc2
INNER JOIN Component c2 ON c2.id = fc2.component_id
WHERE fc2.formula_id = f.id AND c2.name = 'D')
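As a quick check of the "formula has both components" logic on the sample data, here is a sketch that runs the GROUP BY / HAVING form in SQLite from Python. Access SQL may need different join parenthesisation, so treat this as an illustration of the logic only; it assumes each (formula_id, component_id) pair is stored once, which is what makes COUNT(*) = 2 mean "both components present".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Formula (id INTEGER, name TEXT);
CREATE TABLE Component (id INTEGER, name TEXT);
CREATE TABLE FormulaComponents (formula_id INTEGER, component_id INTEGER);
INSERT INTO Formula VALUES (1,'formula 1'),(2,'formula 2'),(3,'formula 3');
INSERT INTO Component VALUES (1,'A'),(2,'B'),(3,'C'),(4,'D');
INSERT INTO FormulaComponents VALUES
    (1,1),(1,2),(1,4),(2,1),(2,3),(2,4),(3,1),(3,2),(3,3);
""")

# Keep only rows for the wanted components, then require both per formula.
QUERY = """
SELECT f.name
FROM Formula f
JOIN FormulaComponents fc ON fc.formula_id = f.id
JOIN Component c ON c.id = fc.component_id
WHERE c.name IN ('A', 'D')
GROUP BY f.id, f.name
HAVING COUNT(*) = 2
ORDER BY f.id
"""
names = [row[0] for row in conn.execute(QUERY)]
print(names)
```

Formula 3 contains A but not D, so it contributes only one matching row and is dropped by the HAVING clause.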