SQL query with many-to-many tables

I'm using VB.Net express with an Access file, and I have the following tables:
table Formula
id | name
-------------------
1 | formula 1
2 | formula 2
3 | formula 3
table Component
id | name
--------------------
1 | A
2 | B
3 | C
4 | D
table FormulaComponents
formula_id | component_id
-------------------------
1 | 1
1 | 2
1 | 4
2 | 1
2 | 3
2 | 4
3 | 1
3 | 2
3 | 3
So each formula has one or more components.
Which query should I use if I want all the formulas that contain, for example, Component A AND Component D (Result: formula 1, formula 2)? I tried something with INTERSECT, but it seems it doesn't work in VB...
Thanks!

Update:
select f.*
from Formula f
inner join (
    select fc.formula_id
    from FormulaComponents fc
    inner join Component c on fc.component_id = c.id
    where c.name in ('A', 'D')
    group by fc.formula_id
    having count(*) = 2
) m on m.formula_id = f.id

SELECT f.*
FROM Formula f
WHERE EXISTS (SELECT 1
              FROM FormulaComponents fc1
              INNER JOIN Component c1 ON c1.id = fc1.component_id
              WHERE fc1.formula_id = f.id AND c1.name = 'A')
  AND EXISTS (SELECT 1
              FROM FormulaComponents fc2
              INNER JOIN Component c2 ON c2.id = fc2.component_id
              WHERE fc2.formula_id = f.id AND c2.name = 'D')
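The grouping idea can be checked against the sample rows from the question. The following is only a sketch: it assumes SQLite (through Python's sqlite3 module) as a stand-in for the Access file, and it assumes each (formula_id, component_id) pair appears at most once, so that COUNT(*) = 2 means both components were found.

```python
import sqlite3

# In-memory SQLite database standing in for the Access file (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Formula (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Component (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE FormulaComponents (formula_id INTEGER, component_id INTEGER);
    INSERT INTO Formula VALUES (1, 'formula 1'), (2, 'formula 2'), (3, 'formula 3');
    INSERT INTO Component VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D');
    INSERT INTO FormulaComponents VALUES
        (1, 1), (1, 2), (1, 4), (2, 1), (2, 3), (2, 4), (3, 1), (3, 2), (3, 3);
""")

# Keep only the rows for components A and D, then keep the formulas
# that still have two rows left, i.e. one row per wanted component.
query = """
    SELECT f.name
    FROM Formula f
    INNER JOIN FormulaComponents fc ON fc.formula_id = f.id
    INNER JOIN Component c ON c.id = fc.component_id
    WHERE c.name IN ('A', 'D')
    GROUP BY f.id, f.name
    HAVING COUNT(*) = 2
    ORDER BY f.id
"""
formulas_with_a_and_d = [row[0] for row in conn.execute(query)]
print(formulas_with_a_and_d)
```

On the sample data this prints formula 1 and formula 2; formula 3 drops out because it has component A but not D.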

Related

SQL - Intersections

I have table like below:
|Group|User|
| 1 | X |
| 1 | Y |
| 1 | Z |
| 2 | X |
| 2 | Y |
| 2 | Z |
| 3 | X |
| 3 | Z |
| 4 | X |
I want to calculate intersections of groups: 1&2, 1&2&3, 1&2&3&4.
Then I want to see users in each intersection and how many of them are there.
In the given example should be:
1&2 -> 3
1&2&3 -> 2
1&2&3&4 -> 1
Is it possible using SQL?
If yes, how can I start?
I work with Python on a daily basis, but now I have to translate this code into SQL and I am having a huge problem with it.
In PostgreSQL you can use JOIN or INTERSECT. Example with JOIN:
select count(*)
from t a
join t b on b.usr = a.usr
where a.grp = 1 and b.grp = 2
Then:
select count(*)
from t a
join t b on b.usr = a.usr
join t c on c.usr = a.usr
where a.grp = 1 and b.grp = 2 and c.grp = 3
And:
select count(*)
from t a
join t b on b.usr = a.usr
join t c on c.usr = a.usr
join t d on d.usr = a.usr
where a.grp = 1 and b.grp = 2 and c.grp = 3 and d.grp = 4
These return 3, 2, and 1 respectively.
EDIT - You can also do:
select count(*)
from (
select usr, count(*)
from t
where grp in (1, 2, 3, 4)
group by usr
having count(*) = 4 -- number of groups
) x
See running example at db<>fiddle.
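Since the question also asks which users are in each intersection, the GROUP BY / HAVING form can be wrapped in a small helper that takes the list of groups. This is a sketch under assumptions: SQLite (via Python's sqlite3 module) stands in for PostgreSQL, and the helper name users_in_all is made up for illustration. The table and column names t, grp and usr follow the queries above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (grp INTEGER, usr TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "X"), (1, "Y"), (1, "Z"), (2, "X"), (2, "Y"), (2, "Z"),
     (3, "X"), (3, "Z"), (4, "X")],
)

def users_in_all(groups):
    """Return the users that appear in every group listed in `groups`."""
    placeholders = ", ".join("?" for _ in groups)
    query = f"""
        SELECT usr
        FROM t
        WHERE grp IN ({placeholders})
        GROUP BY usr
        HAVING COUNT(DISTINCT grp) = ?
        ORDER BY usr
    """
    # The last parameter is the number of groups the user must belong to.
    return [row[0] for row in conn.execute(query, (*groups, len(groups)))]

intersections = {
    "1&2": users_in_all([1, 2]),
    "1&2&3": users_in_all([1, 2, 3]),
    "1&2&3&4": users_in_all([1, 2, 3, 4]),
}
for label, users in intersections.items():
    print(label, "->", len(users), users)
```

COUNT(DISTINCT grp) is used instead of COUNT(*) so that a duplicated (grp, usr) row would not inflate the count.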

How to query all rows where a given column matches at least all the values in a given array with PostgreSQL?

The request below:
SELECT foos.id,bars.name
FROM foos
JOIN bar_foo ON (bar_foo.foo_id = id )
JOIN bars ON (bars.id = bar_foo.bar_id )
returns a list like this:
id | name
---+-----
1 | a
1 | b
2 | a
2 | y
2 | z
3 | a
3 | b
3 | c
3 | d
How can I get the rows for the ids that have at least a and b, and more generally at least every value in a given array?
From the example above, I would get:
id | name
---+-----
1 | a
1 | b
3 | a
3 | b
3 | c
3 | d
For two values, you can use windowing boolean aggregation:
select *
from (
select f.id, b.name,
bool_or(b.name = 'a') over(partition by f.id) has_a,
bool_or(b.name = 'b') over(partition by f.id) has_b
from foos f
join bar_foo bf on bf.foo_id = f.id
join bars b on b.id = bf.bar_id
) t
where has_a and has_b
A more generic approach uses array aggregation:
select *
from (
select f.id, b.name,
array_agg(b.name) over(partition by f.id) arr_names
from foos f
join bar_foo bf on bf.foo_id = f.id
join bars b on b.id = bf.bar_id
) t
where arr_names @> array['a', 'b']

SQL Join to Get Row with Maximum Value from Right table

I am having a problem with a SQL join (Oracle/MS SQL).
I have two tables
A
ID | B_ID
---|------
1 | 1
1 | 4
2 | 3
2 | 2
----------
B
B_ID | B_VA| B_VB
-------|--------|-------
1 | 1 | a
2 | 2 | b
3 | 5 | c
4 | 2 | d
-----------------------
From these two tables I need A.ID, B.B_ID, B.B_VA (MAX), B.B_VB (with max B.B_VA)
So result table would be like
ID | B_ID | B_VA| B_VB
-------|--------|--------|-------
1 | 4 | 2 | d
2 | 3 | 5 | c
I tried some joins without success. Can anyone help me with a query to get the result I want?
Thank you
Your logic as described doesn't quite correspond to the data. For instance, b_va is numeric, but the column in the output is a string.
Perhaps you want this: aggregate the data in a to get the maximum b_id value for each id, then join to b twice to pick up the corresponding b_vb values. That, at least, conforms to your desired output:
select a.id, a.b_id, b1.b_vb as b_va, b2.b_vb
from (select id, max(b_id) as b_id
from a
group by id
) a join
b b1
on a.id = b1.b_id join
b b2
on a.b_id = b2.b_id;
EDIT:
For the corrected data, I think this is what you want:
select id, b_id, b_va, b_vb
from (select a.id, b.b_id, b.b_va, b.b_vb,
             row_number() over (partition by a.id order by b.b_va desc) as seqnum
      from a join
           b
           on a.b_id = b.b_id
     ) ab
where seqnum = 1;
Try this
SELECT X.ID, Y.B_ID, X.B_VA, Y.B_VB
FROM (SELECT A.ID, MAX(B_VA) AS B_VA
FROM A INNER JOIN B ON A.B_ID = B.B_ID
GROUP BY A.ID) AS X INNER JOIN
A AS Z ON X.ID = Z.ID INNER JOIN
B AS Y ON Z.B_ID=Y.B_ID AND X.B_VA=Y.B_VA
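The last query first finds the maximum B_VA per ID and then joins back to pick up the row that carries it. It can be run against the sample tables as a check. This is a minimal sketch, assuming SQLite (through Python's sqlite3 module) as a stand-in for Oracle or MS SQL; the query text itself is unchanged apart from layout and an ORDER BY added to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (ID INTEGER, B_ID INTEGER);
    CREATE TABLE B (B_ID INTEGER, B_VA INTEGER, B_VB TEXT);
    INSERT INTO A VALUES (1, 1), (1, 4), (2, 3), (2, 2);
    INSERT INTO B VALUES (1, 1, 'a'), (2, 2, 'b'), (3, 5, 'c'), (4, 2, 'd');
""")

# X holds the maximum B_VA per ID; Z and Y then locate the B row
# that belongs to that ID and carries exactly that maximum.
query = """
    SELECT X.ID, Y.B_ID, X.B_VA, Y.B_VB
    FROM (SELECT A.ID, MAX(B_VA) AS B_VA
          FROM A INNER JOIN B ON A.B_ID = B.B_ID
          GROUP BY A.ID) AS X
    INNER JOIN A AS Z ON X.ID = Z.ID
    INNER JOIN B AS Y ON Z.B_ID = Y.B_ID AND X.B_VA = Y.B_VA
    ORDER BY X.ID
"""
max_rows = conn.execute(query).fetchall()
print(max_rows)
```

One design point to keep in mind: if two B rows for the same ID tie on the maximum B_VA, this join-back form returns both of them.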

SQL query uses "wrong" join

I have a query which gives me the wrong result.
Tables:
A
+----+
| id |
+----+
| 1 |
| 2 |
+----+
B
+----+----+
| id | x | B.id = A.id
+----+----+
| 1 | 1 |
| 1 | 1 |
| 1 | 0 |
+----+----+
C
+----+----+
| id | y | C.id = A.id
+----+----+
| 1 | 1 |
| 1 | 2 |
+----+----+
What I want to do: Select all rows from A. For each row in A count in B all x with value 1 and all x with value 0 with B.id = A.id. For each row in A get the minimum y from C with C.id = A.id.
The result I am expecting is:
+----+------+--------+---------+
| id | min | count1 | count 2 |
+----+------+--------+---------+
| 1 | 1 | 2 | 1 |
| 2 | NULL | 0 | 0 |
+----+------+--------+---------+
First Try:
This doesn't work.
SELECT a.id,
MIN(c.y),
SUM(IF(b.x = 1, 1, 0)),
SUM(IF(b.x = 0, 1, 0))
FROM a
LEFT JOIN b
ON ( a.id = b.id )
LEFT JOIN c
ON ( a.id = c.id )
GROUP BY a.id
+----+------+--------+---------+
| id | min | count1 | count 2 |
+----+------+--------+---------+
| 1 | 1 | 4 | 2 |
| 2 | NULL | 0 | 0 |
+----+------+--------+---------+
Second Try:
This works, but I am sure it performs badly.
SELECT a.id,
MIN(c.y),
b.x,
b.y
FROM a
LEFT JOIN (SELECT b.id, SUM(IF(b.x = 1, 1, 0)) x, SUM(IF(b.x = 0, 1, 0)) y FROM b GROUP BY b.id) b
ON ( a.id = b.id )
LEFT JOIN c
ON ( a.id = c.id )
GROUP BY a.id
+----+------+--------+---------+
| id | min | count1 | count 2 |
+----+------+--------+---------+
| 1 | 1 | 2 | 1 |
| 2 | NULL | 0 | 0 |
+----+------+--------+---------+
Last Try:
This works too.
SELECT x.*,
SUM(IF(b.x = 1, 1, 0)),
SUM(IF(b.x = 0, 1, 0))
FROM (SELECT a.id,
MIN(c.y)
FROM a
LEFT JOIN c
ON ( a.id = c.id )
GROUP BY a.id) x
LEFT JOIN b
ON ( b.id = x.id )
GROUP BY x.id
Now my question is: Is the last one the best choice, or is there a way to write this query with just one select statement (like in the first try)?
Your joins are doing cartesian products for a given value, because there are multiple rows in each table.
You can fix this by using count(distinct) rather than sum():
SELECT a.id, MIN(c.y),
count(distinct (case when b.x = 1 then b.id end)),
count(distinct (case when b.x = 0 then b.id end))
FROM a
LEFT JOIN b
ON ( a.id = b.id )
LEFT JOIN c
ON ( a.id = c.id )
GROUP BY a.id;
You can also fix this by pre-aggregating b (and/or c). And you would need to take that approach if your aggregation function were something like the sum of a column in b.
EDIT:
You are correct. The above query counts the distinct values of B, but B contains rows that are exact duplicates. (Personally, I think having a column with the name id that has duplicates is a sign of poor design, but that is another issue.)
You could solve it by having a real id in the b table, because then the count(distinct) would count the correct values. You can also solve it by aggregating the two tables before joining them in:
SELECT a.id, c.y, coalesce(x1, 0) as x1, coalesce(x0, 0) as x0
FROM a
LEFT JOIN (select b.id,
sum(b.x = 1) as x1,
sum(b.x = 0) as x0
from b
group by b.id
) b
ON ( a.id = b.id )
LEFT JOIN (select c.id, min(c.y) as y
from c
group by c.id
) c
ON ( a.id = c.id );
Here is a SQL Fiddle for the problem.
EDIT II:
You can get it in one statement, but I'm not so sure that it would work on similar data. The idea is that you can count all the cases where x = 1 and then divide by the number of rows in the C table to get the real distinct count:
SELECT a.id, MIN(c.y),
coalesce(sum(b.x = 1), 0) / count(distinct coalesce(c.y, -1)),
coalesce(sum(b.x = 0), 0) / count(distinct coalesce(c.y, -1))
FROM a
LEFT JOIN b
ON ( a.id = b.id )
LEFT JOIN c
ON ( a.id = c.id )
GROUP BY a.id;
It is a little tricky, because you have to handle NULLs to get the right values. Note that this is counting the y value to get a distinct count from the C table. Your question reinforces why it is a good idea to have a unique integer primary key in every table.
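The row multiplication and the pre-aggregation fix can be seen side by side on the sample tables. The following is only a sketch: it assumes SQLite (through Python's sqlite3 module) in place of MySQL, which is workable here because SQLite also evaluates a comparison such as b.x = 1 to 1 or 0. The derived-table aliases bs and cs are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER);
    CREATE TABLE b (id INTEGER, x INTEGER);
    CREATE TABLE c (id INTEGER, y INTEGER);
    INSERT INTO a VALUES (1), (2);
    INSERT INTO b VALUES (1, 1), (1, 1), (1, 0);
    INSERT INTO c VALUES (1, 1), (1, 2);
""")

# Joining b and c directly pairs every b row with every c row for the
# same id (3 x 2 = 6 rows for id 1), so the sums are doubled.
fan_out = conn.execute("""
    SELECT a.id, MIN(c.y),
           COALESCE(SUM(b.x = 1), 0), COALESCE(SUM(b.x = 0), 0)
    FROM a
    LEFT JOIN b ON a.id = b.id
    LEFT JOIN c ON a.id = c.id
    GROUP BY a.id
    ORDER BY a.id
""").fetchall()

# Aggregating each table to one row per id before joining avoids that.
pre_aggregated = conn.execute("""
    SELECT a.id, cs.y, COALESCE(bs.x1, 0), COALESCE(bs.x0, 0)
    FROM a
    LEFT JOIN (SELECT id, SUM(x = 1) AS x1, SUM(x = 0) AS x0
               FROM b GROUP BY id) bs ON a.id = bs.id
    LEFT JOIN (SELECT id, MIN(y) AS y
               FROM c GROUP BY id) cs ON a.id = cs.id
    ORDER BY a.id
""").fetchall()

print(fan_out)
print(pre_aggregated)
```

For id 1 the direct join reports counts of 4 and 2, while the pre-aggregated version reports the expected 2 and 1. Both give NULL, 0, 0 for id 2.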

Return count(*) even if 0

I have the following query:
select bb.Name, COUNT(*) as Num from BOutcome bo
JOIN BOffers bb ON bo.ID = bb.BOutcomeID
WHERE bo.EventID = 123 AND bo.OfferTypeID = 321 AND bb.NumA > bb.NumB
GROUP BY bb.Name
The table looks like:
Name | Num A | Num B
A | 10 | 3
B | 2 | 3
C | 10 | 3
A | 9 | 3
B | 2 | 3
C | 9 | 3
The expected output should be:
Name | Count
A | 2
B | 0
C | 2
Because when Name is A or C, Num A is bigger than Num B two times, and when Name is B, Num A is lower than Num B in both records.
My current output is:
Name | Count
A | 2
C | 2
Because B's output is 0, I am not getting it back in my query.
What is wrong with my query? How should I get it back?
Here is my guess. I think this is a much simpler approach than all of the left/right join hoops people have been spinning their wheels on. Since the output of the query relies only on columns in the left table, there is no need for an explicit join at all:
SELECT
bb.Name,
[Count] = SUM(CASE WHEN bb.NumA > bb.NumB THEN 1 ELSE 0 END)
-- just FYI, the above could also be written as:
-- [Count] = COUNT(CASE WHEN bb.NumA > bb.NumB THEN 1 END)
FROM dbo.BOffers AS bb
WHERE EXISTS
(
SELECT 1 FROM dbo.BOutcome
WHERE ID = bb.BOutcomeID
AND EventID = 123
AND OfferTypeID = 321
)
GROUP BY bb.Name;
Of course, we're not really sure that both Name and NumA/NumB are in the left table, since the OP talks about two tables but only shows one table in the sample data. My guess is based on the query he says is "working" but missing rows because of the explicit join.
Another wild guess. Feel free to downvote:
SELECT ba.Name, COUNT(bb.BOutcomeID) as Num
FROM
( SELECT DISTINCT ba.Name
FROM
BOutcome AS b
JOIN
BOffers AS ba
ON ba.BOutcomeID = b.ID
WHERE b.EventID = 123
AND b.OfferTypeID = 321
) AS ba
LEFT JOIN
BOffers AS bb
ON bb.Name = ba.Name
AND bb.NumA > bb.NumB
GROUP BY ba.Name ;
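The difference between filtering in WHERE and counting conditionally can be reproduced on the sample rows. This is a sketch under assumptions: SQLite (via Python's sqlite3 module) stands in for SQL Server, and a single hypothetical table offers(name, num_a, num_b) replaces the BOutcome/BOffers pair, since only one table's data is shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical single table holding only the columns shown in the question.
conn.execute("CREATE TABLE offers (name TEXT, num_a INTEGER, num_b INTEGER)")
conn.executemany(
    "INSERT INTO offers VALUES (?, ?, ?)",
    [("A", 10, 3), ("B", 2, 3), ("C", 10, 3),
     ("A", 9, 3), ("B", 2, 3), ("C", 9, 3)],
)

# Filtering in WHERE removes every B row before grouping,
# so no group for B is ever formed.
filtered = conn.execute("""
    SELECT name, COUNT(*)
    FROM offers
    WHERE num_a > num_b
    GROUP BY name
    ORDER BY name
""").fetchall()

# Moving the condition into the aggregate keeps all rows in their groups
# and counts only the rows that satisfy it.
conditional = conn.execute("""
    SELECT name, SUM(CASE WHEN num_a > num_b THEN 1 ELSE 0 END)
    FROM offers
    GROUP BY name
    ORDER BY name
""").fetchall()

print(filtered)
print(conditional)
```

The first result has no row for B, while the second includes B with a count of 0.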