Return rows only if matches all list values - sql

Let's say I have a table customers:
-----------------
|id|name|country|
|1 |Joe |Mexico |
|2 |Mary|USA |
|3 |Jim |France |
-----------------
And a table languages:
-------------
|id|language|
|1 |English |
|2 |Spanish |
|3 |French |
-------------
And a table cust_lang:
------------------
|id|custId|langId|
|1 |1 |1 |
|2 |1 |2 |
|3 |2 |1 |
|4 |3 |3 |
------------------
Given a list: ["English", "Spanish", "Portuguese"]
Using a WHERE IN for the list, it will still return customers with ids 1,2 because they match "English" and "Spanish".
However, the results should be 0 rows returned since no customer matches ALL three terms.
I only want a customer id returned if that customer matches every language in the list, according to the cust_lang table.
For instance, Given a list: ["English", "Spanish"]
I would want the results to be customer Id 1, since he alone speaks both languages.
EDIT: @GordonLinoff - That works!!
Now to make it more complex, what's wrong with this additional related query:
Let's assume I also have a table degrees:
-----------
|id|degree|
|1 |PHD |
|2 |BA |
|3 |MD |
-----------
A corresponding join table cust_deg:
------------------
|id|custId|degId |
|1 |1 |1 |
|2 |1 |2 |
|3 |2 |1 |
|4 |3 |3 |
------------------
The following query does not work. However, it is two of the same queries combined. The results should be only rows that match both lists, instead of the one list.
SELECT * FROM customers C
WHERE C.id IN (
SELECT CL.langId FROM cust_lang CL
JOIN languages L on CL.langId = L.id
WHERE L.language IN ("English", "Spanish")
GROUP BY CL.langID
HAVING COUNT(*) = 2)
AND C.id IN (
SELECT CD.custId FROM cust_deg CD
JOIN degrees D ON CD.degID = D.id
WHERE D.degree IN ("PHD", "BA")
GROUP BY CD.custId HAVING COUNT(*) = 2));
EDIT2: I think I fixed it. I accidentally had an extra select statement in there.
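For reference, here is one way the combined query could look once both subqueries compare against custId. This is only a sketch, assuming Python's built-in sqlite3 module and an in-memory copy of the sample tables; build_db and matching_customers are hypothetical helper names, and this is not necessarily the fix the asker ended up with.

```python
import sqlite3

# Both subqueries return customer ids, so C.id is compared like with like.
COMBINED_SQL = """
SELECT C.id
FROM customers AS C
WHERE C.id IN (
        SELECT CL.custId
        FROM cust_lang AS CL
        JOIN languages AS L ON CL.langId = L.id
        WHERE L.language IN ('English', 'Spanish')
        GROUP BY CL.custId
        HAVING COUNT(*) = 2)
  AND C.id IN (
        SELECT CD.custId
        FROM cust_deg AS CD
        JOIN degrees AS D ON CD.degId = D.id
        WHERE D.degree IN ('PHD', 'BA')
        GROUP BY CD.custId
        HAVING COUNT(*) = 2)
ORDER BY C.id
"""


def build_db():
    """Create an in-memory database holding the sample rows from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, country TEXT);
        CREATE TABLE languages (id INTEGER PRIMARY KEY, language TEXT);
        CREATE TABLE cust_lang (id INTEGER PRIMARY KEY, custId INTEGER, langId INTEGER);
        CREATE TABLE degrees (id INTEGER PRIMARY KEY, degree TEXT);
        CREATE TABLE cust_deg (id INTEGER PRIMARY KEY, custId INTEGER, degId INTEGER);
        INSERT INTO customers VALUES (1,'Joe','Mexico'),(2,'Mary','USA'),(3,'Jim','France');
        INSERT INTO languages VALUES (1,'English'),(2,'Spanish'),(3,'French');
        INSERT INTO cust_lang VALUES (1,1,1),(2,1,2),(3,2,1),(4,3,3);
        INSERT INTO degrees VALUES (1,'PHD'),(2,'BA'),(3,'MD');
        INSERT INTO cust_deg VALUES (1,1,1),(2,1,2),(3,2,1),(4,3,3);
    """)
    return conn


def matching_customers(conn):
    """Return ids of customers that match both the language and degree lists."""
    return [row[0] for row in conn.execute(COMBINED_SQL)]
```

With the sample data, only customer 1 has both languages and both degrees.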

You can do this with group by and having:
select cl.custid
from cust_lang cl join
languages l
on cl.langid = l.id
where l.language in ('English', 'Spanish', 'Portuguese')
group by cl.custid
having count(*) = 3;
If, for example, you only wanted to check for two languages, then you need only change your WHERE ... IN and HAVING conditions, e.g.:
where l.language in ('English', 'Spanish')
and
having count(*) = 2
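To try the GROUP BY / HAVING approach against the sample data, here is a small sketch. It assumes Python's built-in sqlite3 module; customers_speaking_all and make_sample_db are hypothetical names. It also swaps in COUNT(DISTINCT ...) so that duplicate rows in cust_lang would not inflate the count, which is an addition to the query above.

```python
import sqlite3


def make_sample_db():
    """In-memory copy of the languages and cust_lang tables from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE languages (id INTEGER PRIMARY KEY, language TEXT);
        CREATE TABLE cust_lang (id INTEGER PRIMARY KEY, custId INTEGER, langId INTEGER);
        INSERT INTO languages VALUES (1,'English'),(2,'Spanish'),(3,'French');
        INSERT INTO cust_lang VALUES (1,1,1),(2,1,2),(3,2,1),(4,3,3);
    """)
    return conn


def customers_speaking_all(conn, wanted):
    """Return ids of customers linked to every language named in `wanted`."""
    placeholders = ",".join("?" * len(wanted))
    sql = f"""
        SELECT cl.custId
        FROM cust_lang AS cl
        JOIN languages AS l ON l.id = cl.langId
        WHERE l.language IN ({placeholders})
        GROUP BY cl.custId
        HAVING COUNT(DISTINCT l.language) = ?
        ORDER BY cl.custId
    """
    # The required count is derived from the list, so it never drifts out of sync.
    params = [*wanted, len(set(wanted))]
    return [row[0] for row in conn.execute(sql, params)]
```

Passing ["English", "Spanish"] yields customer 1 only, while adding "Portuguese" yields no rows, because no customer reaches a count of 3.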

This is pretty much Gordon's answer but it has the benefit of being a little more flexible on the language list and it doesn't require any change to the having clause.
with my_languages as (
select id as langId from languages
where language in ('English', 'Spanish')
)
select cl.custId
from cust_lang as cl inner join my_languages as l on l.langId = cl.langId
group by cl.custId
having count(*) = (select count(*) from my_languages)

Related

In SQL, query a table by transposing column results

Background
Forgive the title of this question, as I'm not really sure how to describe what I'm trying to do.
I have a SQL table, d, that looks like this:
+--+---+------------+------------+
|id|sex|event_type_1|event_type_2|
+--+---+------------+------------+
|a |m |1 |1 |
|b |f |0 |1 |
|c |f |1 |0 |
|d |m |0 |1 |
+--+---+------------+------------+
The Problem
I'm trying to write a query that yields the following summary of counts of event_type_1 and event_type_2 cut (grouped?) by sex:
+-------------+-----+-----+
| | m | f |
+-------------+-----+-----+
|event_type_1 | 1 | 1 |
+-------------+-----+-----+
|event_type_2 | 2 | 1 |
+-------------+-----+-----+
The thing is, this seems to involve some kind of transposition of the 2 event_type columns into rows of the query result that I'm not familiar with as a novice SQL user.
What I've tried
I've so far come up with the following query:
SELECT event_type_1, event_type_2, count(sex)
FROM d
group by event_type_1, event_type_2
But that only gives me this:
+------------+------------+-----+
|event_type_1|event_type_2|count|
+------------+------------+-----+
|1 |1 |1 |
|1 |0 |1 |
|0 |1 |2 |
+------------+------------+-----+
You can use a lateral join to unpivot the data, then use conditional aggregation to calculate m and f:
select v.which,
count(*) filter (where d.sex = 'm') as m,
count(*) filter (where d.sex = 'f') as f
from d cross join lateral
(values (d.event_type_1, 'event_type_1'),
(d.event_type_2, 'event_type_2')
) v(val, which)
where v.val = 1
group by v.which;
Here is a db<>fiddle.
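If your database lacks LATERAL or FILTER, the same unpivot-then-aggregate idea can be written with UNION ALL and SUM(CASE ...). The sketch below assumes Python's built-in sqlite3 module and an in-memory copy of table d; event_counts_by_sex is a hypothetical helper name.

```python
import sqlite3

# Unpivot the two event columns into rows, then count per sex with CASE.
PIVOT_SQL = """
SELECT v.which,
       SUM(CASE WHEN v.sex = 'm' THEN 1 ELSE 0 END) AS m,
       SUM(CASE WHEN v.sex = 'f' THEN 1 ELSE 0 END) AS f
FROM (
    SELECT sex, event_type_1 AS val, 'event_type_1' AS which FROM d
    UNION ALL
    SELECT sex, event_type_2 AS val, 'event_type_2' AS which FROM d
) AS v
WHERE v.val = 1
GROUP BY v.which
ORDER BY v.which
"""


def event_counts_by_sex():
    """Load the sample rows from the question and return (which, m, f) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE d (id TEXT, sex TEXT, event_type_1 INTEGER, event_type_2 INTEGER);
        INSERT INTO d VALUES ('a','m',1,1),('b','f',0,1),('c','f',1,0),('d','m',0,1);
    """)
    return conn.execute(PIVOT_SQL).fetchall()
```

On the sample data this returns one row per event type, matching the summary table in the question.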

How to disregard a row in a returned MS Access query when all fields bar one are distinct

I'm trying to create a query on Access 2010 which only produces a single row per patient. There are a really small number of patients (each patient represented by a unique nhs_number in the table n) who are listed as having 2 practices in the table pp and so two rows are generated for them. Is there a way I can arbitrarily select one of the practices and ignore the other?
This is the query:
SELECT DISTINCT
n.nhs_number,
IIF(ch.care_home_date>#2/1/1900#, "TRUE", "FALSE") AS care_home,
pp.practice
FROM (nhs_no_tbl AS n
LEFT JOIN patient_practice_tbl AS pp ON n.nhs_number = pp.nhs_number)
LEFT JOIN patient_care_home_tbl AS ch ON n.nhs_number = ch.nhs_number;
The tables the query is using contains data along these lines:
nhs_no_tbl:
|nhs_number|
| -------- |
|1 |
|2 |
|3 |
|4 |
patient_practice_tbl:
|nhs_number|practice|
| -------- | ------ |
|1 |GP_A |
|2 |GP_A |
|3 |GP_B |
|4 |GP_A |
|4 |GP_B |
patient_care_home_tbl:
|nhs_number|care_home_date|
| -------- | ------------ |
|1 |1/5/2000 |
|1 |1/10/2010 |
|4 |26/10/2017 |
At the end, I'd like the query to return the following:
|nhs_number|Care_home|practice|
| -------- | ------- | ------ |
|1 |TRUE |GP_A |
|2 |FALSE | |
|3 |FALSE | |
|4 |TRUE |GP_A [or GP_B] |
I've updated the query to use CTEs (note that this is SQL Server syntax, not Access):
;WITH cte1 AS ---select all results
(
SELECT DISTINCT nnt.nhs_number,
CASE WHEN pcht.care_home_date IS NULL
THEN 'FALSE'
ELSE 'TRUE'
END AS CareHome,
ppt.practice,
rank()OVER(PARTITION BY nnt.nhs_number ORDER BY ppt.practice) AS R
FROM #nhs_no_tbl nnt
LEFT JOIN #patient_practice_tbl ppt ON nnt.nhs_number = ppt.nhs_number
LEFT JOIN #patient_care_home_tbl pcht ON nnt.nhs_number = pcht.nhs_number
),
CTE2 AS ---choose who may have multiple practices
(
SELECT nhs_number
FROM CTE1
WHERE R = 2
),
CTE3 AS --- combine GP_A and GP_B
(
SELECT t.nhs_number,STRING_AGG(val,',') AS Practices
FROM
(
SELECT DISTINCT val = cte1.practice, CTE1.nhs_number
FROM #nhs_no_tbl nnt
INNER JOIN CTE2 ON nnt.nhs_number = CTE2.nhs_number
INNER JOIN cte1 ON nnt.nhs_number = CTE1.nhs_number
) t
--RIGHT JOIN #nhs_no_tbl nnt ON t.nhs_number = nnt.nhs_number
GROUP BY t.nhs_number
)
SELECT cte1.nhs_number,cte1.carehome,cte1.practice, cte3.Practices
FROM CTE1
LEFT JOIN cte3 ON cte1.nhs_number = CTE3.nhs_number
The result would be
Next, you could store the result in a temp table and update the temp table where Practices is not null.
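A simpler alternative, if any one practice will do, is to group by patient and take MIN(practice). The sketch below is an assumption-laden illustration: it uses Python's built-in sqlite3 module rather than Access, stores the dates as plain text, and one_row_per_patient is a hypothetical helper name. Note that it returns the listed practice for patients 2 and 3, whereas the expected output in the question leaves those blank.

```python
import sqlite3

# Grouping by patient collapses the duplicate rows; MIN() picks one practice
# in a repeatable way, and MAX() detects whether any care home date exists.
ONE_ROW_SQL = """
SELECT n.nhs_number,
       CASE WHEN MAX(ch.care_home_date) IS NULL THEN 'FALSE' ELSE 'TRUE' END AS care_home,
       MIN(pp.practice) AS practice
FROM nhs_no_tbl AS n
LEFT JOIN patient_practice_tbl AS pp ON n.nhs_number = pp.nhs_number
LEFT JOIN patient_care_home_tbl AS ch ON n.nhs_number = ch.nhs_number
GROUP BY n.nhs_number
ORDER BY n.nhs_number
"""


def one_row_per_patient():
    """Load the sample rows from the question and return one tuple per patient."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE nhs_no_tbl (nhs_number INTEGER);
        CREATE TABLE patient_practice_tbl (nhs_number INTEGER, practice TEXT);
        CREATE TABLE patient_care_home_tbl (nhs_number INTEGER, care_home_date TEXT);
        INSERT INTO nhs_no_tbl VALUES (1),(2),(3),(4);
        INSERT INTO patient_practice_tbl VALUES
            (1,'GP_A'),(2,'GP_A'),(3,'GP_B'),(4,'GP_A'),(4,'GP_B');
        INSERT INTO patient_care_home_tbl VALUES
            (1,'1/5/2000'),(1,'1/10/2010'),(4,'26/10/2017');
    """)
    return conn.execute(ONE_ROW_SQL).fetchall()
```

Patient 4 comes back once, with GP_A chosen because it sorts first.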

Join 2 Lookup Tables to a Detail table

I have 3 tables:
Products
Groups
Sales
The products table contains the following information:
|**Product ID**|**Product Description**|
|--------------|-----------------------|
|1 |Wine |
|2 |Ruler |
|3 |Gas |
|4 |Water |
The Groups table contains the following information:
|**Group ID**|**Group Description**|
|------------|---------------------|
|1 |Cheetahs |
|2 |Elephants |
|3 |Cougars |
The Sales table contains the following information:
|**GroupID**|**Product ID**|**Amount Sold**|**Day Sold**|
|-----------|--------------|---------------|------------|
|1 |2 | 3|07-31-2016 |
|1 |1 | 1|07-31-2016 |
|2 |3 | 5|07-31-2016 |
|1 |4 | 2|08-01-2016 |
Now I have to produce a query that could bring me a result set as follows (with the condition that I want only results from 07-31-2016):
|**Group ID**|**Product ID**|**Amount Sold**|
|------------|--------------|---------------|
|1 |1 |1 |
|1 |2 |3 |
|1 |3 |0 |
|1 |4 |0 |
|2 |1 |0 |
|2 |2 |0 |
|2 |3 |5 |
|2 |4 |0 |
|3 |1 |0 |
|3 |2 |0 |
|3 |3 |0 |
|3 |4 |0 |
I thought this was going to be just a matter of using left joins, but it appears it wouldn't bring me back the result I was looking for (I don't want to omit products nor groups which weren't sold).
So in summary, I need to display all groups and all products no matter if they had an appearance in the Sales table.
I would appreciate any feedback on this matter, directions on where to look at or any logic that I may be missing!
EDIT
I've marked Matt's (big thanks) post as the answer, turns out I've never used a cross join.
I only added the where clause inside the left join of the Sales table in order to get just the sales made on 07-31-2016
SELECT
g.GroupId
,p.ProductId
,SUM(COALESCE(s.AmountSold,0)) as AmountSold
FROM
Products p
CROSS JOIN Groups g
LEFT JOIN Sales s
ON p.ProductId = s.ProductId
AND g.GroupId = s.GroupId
AND daySold = '07-31-2016'
GROUP BY
g.GroupId
,p.ProductId
ORDER BY
g.GroupId
,p.ProductId
SELECT
g.GroupId
,p.ProductId
,SUM(COALESCE(s.AmountSold,0)) as AmountSold
FROM
Products p
CROSS JOIN Groups g
LEFT JOIN Sales s
ON p.ProductId = s.ProductId
AND g.GroupId = s.GroupId
AND s.daySold = '07-31-2016'
GROUP BY
g.GroupId
,p.ProductId
ORDER BY
g.GroupId
,p.ProductId
Note that the expected results you provided show 0 for group 1, product 4 only because of the date filter: 2 of those were sold, but on 08-01-2016.
You could join all the Products with all the Groups (so you get a list of all the combinations of the two) and then add the additional information, applying your date condition in the ON clause of the LEFT JOIN. A WHERE filter on [Day Sold] would discard the combinations that have no sale.
SELECT A.[Group ID]
, B.[Product ID]
, ISNULL(C.[Amount Sold], 0) AS [Amount Sold]
FROM Groups A
INNER JOIN Products B
ON 1 = 1
LEFT JOIN Sales C
ON C.[Group ID] = A.[Group ID]
AND C.[Product ID] = B.[Product ID]
AND C.[Day Sold] = '07-31-2016'
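To see why the date condition belongs in the join rather than in a WHERE clause, here is a runnable sketch of the cross join approach. It assumes Python's built-in sqlite3 module, column names without spaces, and ISO-formatted date strings; all of those, and the helper name sales_grid, are illustrative choices rather than the asker's actual schema.

```python
import sqlite3

# Every group/product pair is generated first; sales are attached afterwards,
# so pairs without a sale on the chosen day still appear with a total of 0.
GRID_SQL = """
SELECT g.GroupId, p.ProductId, COALESCE(SUM(s.AmountSold), 0) AS AmountSold
FROM "Groups" AS g
CROSS JOIN Products AS p
LEFT JOIN Sales AS s
       ON s.GroupId = g.GroupId
      AND s.ProductId = p.ProductId
      AND s.DaySold = ?
GROUP BY g.GroupId, p.ProductId
ORDER BY g.GroupId, p.ProductId
"""


def sales_grid(day):
    """Return (group, product, amount) for every combination on the given day."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Products (ProductId INTEGER, Description TEXT);
        CREATE TABLE "Groups" (GroupId INTEGER, Description TEXT);
        CREATE TABLE Sales (GroupId INTEGER, ProductId INTEGER,
                            AmountSold INTEGER, DaySold TEXT);
        INSERT INTO Products VALUES (1,'Wine'),(2,'Ruler'),(3,'Gas'),(4,'Water');
        INSERT INTO "Groups" VALUES (1,'Cheetahs'),(2,'Elephants'),(3,'Cougars');
        INSERT INTO Sales VALUES
            (1,2,3,'2016-07-31'),(1,1,1,'2016-07-31'),
            (2,3,5,'2016-07-31'),(1,4,2,'2016-08-01');
    """)
    return conn.execute(GRID_SQL, (day,)).fetchall()
```

Calling sales_grid('2016-07-31') returns all 12 combinations, with non-zero amounts only for the three sales made that day.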

Complicated min/max multi-table query

I need to get the min and max score of group ids, but only if they are enabled:
cdu_group_sl: cdu_group_cc: cdu_group_ph:
-------------------- -------------------- --------------------
|id |name |enabled | |id |name |enabled | |id |name |enabled |
-------------------- -------------------- --------------------
|1 |sl_1 |1 | |1 |cc_1 |1 | |1 |ph_1 |0 |
|2 |sl_3 |1 | |2 |cc_2 |0 | |2 |ph_2 |1 |
|3 |sl_4 |1 | |3 |cc_3 |1 | |3 |ph_3 |1 |
-------------------- -------------------- --------------------
Scores are found in a separate table:
cdu_user_progress
----------------------------------
|id |group_type |group_id |score |
----------------------------------
|1 |sl |1 |50 |
|1 |cc |1 |10 |
|1 |ph |1 |20 |
|1 |sl |2 |80 |
|1 |sl |3 |20 |
|1 |cc |3 |30 |
|1 |sl |1 |40 |
|1 |ph |1 |50 |
|1 |cc |1 |40 |
|1 |ph |2 |90 |
----------------------------------
I need to get a max and min score for each type of group for only enabled groups (for each type):
---------------------------------------------
|group_type |group_id |min_score |max_score |
---------------------------------------------
|sl |1 |40 |50 |
|sl |2 |80 |80 |
|sl |3 |20 |20 |
|cc |1 |10 |40 |
|cc |3 |30 |30 |
|ph |1 |20 |50 |
|ph |2 |90 |90 |
---------------------------------------------
Any idea what the query might be??? So far I have:
SELECT * FROM cdu_user_progress
JOIN cdu_group_sl ON (cdu_group_sl.id = cdu_user_progress.group_id AND cdu_user_progress.group_type = 'sl')
JOIN cdu_group_cc ON (cdu_group_cc.id = cdu_user_progress.group_id AND cdu_user_progress.group_type = 'cc')
JOIN cdu_group_ph ON (cdu_group_ph.id = cdu_user_progress.group_id AND cdu_user_progress.group_type = 'ph')
WHERE cdu_user_progress.uid = $student->uid
AND (cdu_user_progress.group_type = 'sl' AND cdu_group_sl.enabled = 1)
AND (cdu_user_progress.group_type = 'cc' AND cdu_group_cc.enabled = 1)
AND (cdu_user_progress.group_type = 'ph' AND cdu_group_ph.enabled = 1)
Probably completely wrong...
What about using a union to pick the groups you are interested in? Something like:
select scr.group_type, scr.group_id, min(scr.score) min_score, max(scr.score) max_score
from (
select id, 'sl' grp from cdu_group_sl where enabled = 1
union all
select id, 'cc' from cdu_group_cc where enabled = 1
union all
select id, 'ph' from cdu_group_ph where enabled = 1
) grps join cdu_user_progress scr
on grps.id = scr.group_id and grps.grp = scr.group_type
group by scr.group_type, scr.group_id
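The union approach can be checked against the sample data with the sketch below. It assumes Python's built-in sqlite3 module; enabled_score_ranges is a hypothetical helper name. Note that ph group 1 is disabled in the sample tables, so it drops out here even though the expected output in the question lists it.

```python
import sqlite3

# The derived table lists every enabled group with its type tag; joining it
# to the scores keeps only rows that belong to an enabled group.
RANGE_SQL = """
SELECT scr.group_type, scr.group_id,
       MIN(scr.score) AS min_score, MAX(scr.score) AS max_score
FROM (
    SELECT id, 'sl' AS grp FROM cdu_group_sl WHERE enabled = 1
    UNION ALL
    SELECT id, 'cc' FROM cdu_group_cc WHERE enabled = 1
    UNION ALL
    SELECT id, 'ph' FROM cdu_group_ph WHERE enabled = 1
) AS grps
JOIN cdu_user_progress AS scr
  ON grps.id = scr.group_id AND grps.grp = scr.group_type
GROUP BY scr.group_type, scr.group_id
ORDER BY scr.group_type, scr.group_id
"""


def enabled_score_ranges():
    """Load the sample rows and return (type, group, min, max) per enabled group."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE cdu_group_sl (id INTEGER, name TEXT, enabled INTEGER);
        CREATE TABLE cdu_group_cc (id INTEGER, name TEXT, enabled INTEGER);
        CREATE TABLE cdu_group_ph (id INTEGER, name TEXT, enabled INTEGER);
        CREATE TABLE cdu_user_progress
            (id INTEGER, group_type TEXT, group_id INTEGER, score INTEGER);
        INSERT INTO cdu_group_sl VALUES (1,'sl_1',1),(2,'sl_3',1),(3,'sl_4',1);
        INSERT INTO cdu_group_cc VALUES (1,'cc_1',1),(2,'cc_2',0),(3,'cc_3',1);
        INSERT INTO cdu_group_ph VALUES (1,'ph_1',0),(2,'ph_2',1),(3,'ph_3',1);
        INSERT INTO cdu_user_progress VALUES
            (1,'sl',1,50),(1,'cc',1,10),(1,'ph',1,20),(1,'sl',2,80),(1,'sl',3,20),
            (1,'cc',3,30),(1,'sl',1,40),(1,'ph',1,50),(1,'cc',1,40),(1,'ph',2,90);
    """)
    return conn.execute(RANGE_SQL).fetchall()
```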
The following is probably the fastest way to filter the progress rows down to enabled groups; you can then aggregate over the result. To optimize this, you should have an index on (id, enabled) on each of the three "sl", "cc", and "ph" tables:
select cup.*
from cdu_user_progress cup
where (cup.group_type = 'sl' and
exists (select 1
from cdu_group_sl sl
where sl.id = cup.group_id and
sl.enabled = 1
)
) or
(cup.group_type = 'cc' and
exists (select 1
from cdu_group_cc cc
where cc.id = cup.group_id and
cc.enabled = 1
)
) or
(cup.group_type = 'ph' and
exists (select 1
from cdu_group_ph ph
where ph.id = cup.group_id and
ph.enabled = 1
)
)
As a note, having three tables with the same structure is usually a sign of a poor database schema. These three tables should probably be combined into a single table, which would make this query much easier to write.
If you are just starting up this project, I would recommend refining your data structure. Based on what you showed, you could benefit from only one cdu_groups table with a reference to a new cdu_group_types table, and removing the group_type column from cdu_user_progress.
If this is an established project, where changing the structure would be too disruptive... then one of the other answers showing a query would be a better/easier fit.
Otherwise, you could simplify things with restructured tables and end up with a query like:
SELECT group_type,
group_id,
MIN(score) as min_score,
MAX(score) as max_score
FROM cdu_user_progress c
INNER JOIN cdu_groups g
ON c.group_id=g.id
INNER JOIN cdu_group_types t
ON g.group_type_id=t.id
WHERE enabled=1
GROUP BY group_type, group_id
This is shown, with expected results, in this SQLFiddle. With this structure you can add new group types as you want (and also cut down on the number of tables and joins). The tables would be (simplified in the code below, no FKs or anything):
CREATE TABLE cdu_user_progress
(id INT, group_id INT, score INT);
CREATE TABLE cdu_group_types
(id INT, group_type VARCHAR(3));
CREATE TABLE cdu_groups
(id INT, group_type_id INT, name VARCHAR(10), enabled BIT NOT NULL DEFAULT 1);
Granted, moving data to a new structure may be a pain or not reasonable, but I wanted to throw this out there as a possibility or just something to chew on.

SQL Server recursive query with associated table

I have a typical parent/child relationship table to represent folders. My challenge is using it in conjunction with another table.
The folder table is like this:
+--+----+--------+
|id|name|parentid|
+--+----+--------+
|1 |a |null |
+--+----+--------+
|2 |b |1 |
+--+----+--------+
|3 |c1 |2 |
+--+----+--------+
|4 |c2 |2 |
+--+----+--------+
The association table is like this:
+--+--------+
|id|folderid|
+--+--------+
|66|2 |
+--+--------+
|77|3 |
+--+--------+
so that association.id = 66 is related to folder.id = 2.
What I need to do is find the association.id of the first ancestor with a record in the association table. Using the example data above, given folder.id of 3 I expect to find 77; given folder.id of 2 or 4 I expect to find 66; any other folder.id value would find null.
Finding folder ancestry can be done with a common table expression like this:
WITH [recurse] (id,name,parentid,lvl) AS
(
select a.id,a.name,a.parentid,0 FROM folder AS a
WHERE a.id='4'
UNION ALL
select r.id,r.name,r.parentid,lvl+1 FROM folder as r
INNER JOIN [recurse] ON recurse.parentid = r.id
)
SELECT * from [recurse] ORDER BY lvl DESC
yielding the results:
+--+----+--------+---+
|id|name|parentid|lvl|
+--+----+--------+---+
|1 |a | |2 |
+--+----+--------+---+
|2 |b |1 |1 |
+--+----+--------+---+
|4 |c2 |2 |0 |
+--+----+--------+---+
To include the association.id I've tried using a LEFT JOIN in the recursive portion of the CTE, but this is not allowed by SQL Server.
What workaround do I have for this?
Or better yet, is there a way to query directly for the particular association.id? (e.g., without walking through the results of the CTE query that I have been attempting)
Instead of joining inside the recursive part, join the association table to the CTE's result in the final SELECT:
SELECT r.id, r.name, r.parentid, r.lvl, a.folderid, a.id as associationid
FROM [recurse] r
LEFT JOIN [association] a
ON r.id = a.folderid
WHERE a.folderId IS NOT NULL
ORDER BY r.lvl ASC
This will give you the records that have values in the association table, nearest folder first (lvl 0 is the starting folder). You can then limit it to the first record, for example with TOP 1.
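To check the idea end to end, here is a sketch that walks up from a folder and returns the nearest association. It assumes Python's built-in sqlite3 module (so WITH RECURSIVE and LIMIT 1 stand in for SQL Server's WITH and TOP 1); nearest_association and make_folder_db are hypothetical helper names.

```python
import sqlite3

# Walk from the starting folder up to the root, then keep the association
# attached to the folder with the smallest lvl (the nearest one).
NEAREST_SQL = """
WITH RECURSIVE recurse(id, parentid, lvl) AS (
    SELECT f.id, f.parentid, 0 FROM folder AS f WHERE f.id = ?
    UNION ALL
    SELECT f.id, f.parentid, r.lvl + 1
    FROM folder AS f
    JOIN recurse AS r ON r.parentid = f.id
)
SELECT a.id
FROM recurse AS r
JOIN association AS a ON a.folderid = r.id
ORDER BY r.lvl
LIMIT 1
"""


def make_folder_db():
    """In-memory copy of the folder and association tables from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE folder (id INTEGER PRIMARY KEY, name TEXT, parentid INTEGER);
        CREATE TABLE association (id INTEGER PRIMARY KEY, folderid INTEGER);
        INSERT INTO folder VALUES (1,'a',NULL),(2,'b',1),(3,'c1',2),(4,'c2',2);
        INSERT INTO association VALUES (66,2),(77,3);
    """)
    return conn


def nearest_association(conn, folder_id):
    """Return the association id of the nearest folder in the chain, or None."""
    row = conn.execute(NEAREST_SQL, (folder_id,)).fetchone()
    return row[0] if row else None
```

With the sample data, folder 3 resolves to 77, folders 2 and 4 resolve to 66, and folder 1 has no association.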