SQL searching for rows that contain multiple criteria - sql

I have 3 tables
Customer
Groups
CustomerGroupJoins
Fields to be used
Customer:Key
Groups:Key
CustomerGroupJoins:KeyCustomer, KeyGroup
I need to search for all customers that are in all of the groups with keys 1, 2, 3.
I was thinking something like (but have no idea whether this is the right/best way to go):
SELECT
*
FROM
Customer
WHERE
Key = (
SELECT KeyCustomer
FROM CustomerGroupJoins
WHERE KeyGroup = a
) = (
SELECT KeyCustomer
FROM CustomerGroupJoins
WHERE KeyGroup = b
) = (
SELECT KeyCustomer
FROM CustomerGroupJoins
WHERE KeyGroup = c
)

I created this test data:
srh#srh#[local] =# select * from customer join customergroupjoins on customer.key = customergroupjoins.keycustomer join groups on groups.key = customergroupjoins.keygroup;
key | name | keycustomer | keygroup | key | name
-----+--------+-------------+----------+-----+---------
1 | fred | 1 | 1 | 1 | alpha
1 | fred | 1 | 2 | 2 | beta
1 | fred | 1 | 3 | 3 | gamma
2 | jim | 2 | 1 | 1 | alpha
2 | jim | 2 | 2 | 2 | beta
2 | jim | 2 | 4 | 4 | delta
2 | jim | 2 | 5 | 5 | epsilon
3 | shelia | 3 | 1 | 1 | alpha
3 | shelia | 3 | 3 | 3 | gamma
3 | shelia | 3 | 5 | 5 | epsilon
(10 rows)
So "fred" is the only customer in all of (alpha, beta, gamma). To determine that:
srh#srh#[local] =# select * from customer
where exists (select 1 from customergroupjoins where keycustomer = customer.key and keygroup = 1)
and exists (select 1 from customergroupjoins where keycustomer = customer.key and keygroup = 2)
and exists (select 1 from customergroupjoins where keycustomer = customer.key and keygroup = 3);
key | name
-----+------
1 | fred
(1 row)
This is one approach. The (1,2,3) - your known group keys - are the parameters in the subqueries. Someone already mentioned you don't actually need to join to the groups table at all.
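Since the EXISTS query needs one subquery per group key, it can be generated for any number of keys. Here is a minimal sketch using Python's sqlite3 module and the test data above; the helper name customers_in_all_groups is made up for illustration, and the group keys are passed as bound parameters rather than pasted into the SQL:

```python
import sqlite3

def customers_in_all_groups(conn, group_keys):
    """Return customers that belong to every group in group_keys (one EXISTS per key)."""
    if not group_keys:
        raise ValueError("group_keys must not be empty")
    exists_clause = (
        "EXISTS (SELECT 1 FROM customergroupjoins j "
        "WHERE j.keycustomer = c.key AND j.keygroup = ?)"
    )
    sql = (
        "SELECT c.key, c.name FROM customer c WHERE "
        + " AND ".join([exists_clause] * len(group_keys))
        + " ORDER BY c.key"
    )
    return conn.execute(sql, list(group_keys)).fetchall()

# Same rows as the test data shown above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (key INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE customergroupjoins (keycustomer INTEGER, keygroup INTEGER);
    INSERT INTO customer VALUES (1, 'fred'), (2, 'jim'), (3, 'shelia');
    INSERT INTO customergroupjoins VALUES
        (1, 1), (1, 2), (1, 3),
        (2, 1), (2, 2), (2, 4), (2, 5),
        (3, 1), (3, 3), (3, 5);
""")

result_123 = customers_in_all_groups(conn, [1, 2, 3])  # only fred
result_15 = customers_in_all_groups(conn, [1, 5])      # jim and shelia
print(result_123, result_15)
```

The query text still changes with the number of keys, but the caller no longer has to write it by hand.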
Another way:
select customer.*
from customer
join customergroupjoins g1 on g1.keycustomer = customer.key
join customergroupjoins g2 on g2.keycustomer = customer.key
join customergroupjoins g3 on g3.keycustomer = customer.key
where g1.keygroup = 1 and g2.keygroup = 2 and g3.keygroup = 3
The general problem of finding users with all groups (g_1, g_2 .. g_N) is a bit trickier. The queries above join to the link table (customergroupjoins) N times, so the query is different depending on the number of groups you're checking against.
One approach to that is to create a temporary table to use as a query parameter: the table contains the list of groups that the customers must have all of. So for instance create a temp table called "ParamGroup" (or "#ParamGroup" on SQL Server to mark it as temporary) with a single keygroup column, populate it with the group keys you're interested in and then do this:
select * from customer where key in (
select keycustomer
from customergroupjoins
join paramgroup on paramgroup.keygroup = customergroupjoins.keygroup
group by keycustomer
having count(*) = (select count(*) from paramgroup))
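To see the parameter-table query run, here is a small sketch with Python's sqlite3 module and the same test data. The function name customers_with_all_param_groups is hypothetical, and it assumes each (keycustomer, keygroup) pair appears at most once in the link table, otherwise count(*) would over-count:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (key INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE customergroupjoins (keycustomer INTEGER, keygroup INTEGER);
    INSERT INTO customer VALUES (1, 'fred'), (2, 'jim'), (3, 'shelia');
    INSERT INTO customergroupjoins VALUES
        (1, 1), (1, 2), (1, 3),
        (2, 1), (2, 2), (2, 4), (2, 5),
        (3, 1), (3, 3), (3, 5);
    -- The "query parameter" table: the groups a customer must have all of.
    CREATE TEMP TABLE paramgroup (keygroup INTEGER PRIMARY KEY);
""")

def customers_with_all_param_groups(conn, group_keys):
    """Refill paramgroup, then run the same query regardless of how many keys there are."""
    conn.execute("DELETE FROM paramgroup")
    conn.executemany("INSERT INTO paramgroup VALUES (?)", [(k,) for k in group_keys])
    return conn.execute("""
        SELECT key, name FROM customer WHERE key IN (
            SELECT j.keycustomer
            FROM customergroupjoins j
            JOIN paramgroup p ON p.keygroup = j.keygroup
            GROUP BY j.keycustomer
            HAVING COUNT(*) = (SELECT COUNT(*) FROM paramgroup))
        ORDER BY key
    """).fetchall()

all_three = customers_with_all_param_groups(conn, [1, 2, 3])
alpha_beta = customers_with_all_param_groups(conn, [1, 2])
print(all_three, alpha_beta)
```

The SQL text stays fixed here; only the contents of paramgroup change between calls.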
Also, as a beginner, I strongly recommend you look into advice about naming conventions for database tables and columns. Everyone has different ideas (and they can spark off holy wars), but pick some standards (if they aren't dictated to you) and stick to them. For instance you named one table "customer" (singular) and one table "groups" (plural) which looks bad. It's more usual to use "id" rather than "key", and to use it as a suffix ("customer_id" or "CustomerID") than a prefix. The whole CamelCase vs old_skool argument is more a matter of style, as is the primary-key-is-just-"id"-not-"table_id".

Solutions that filter only with KeyGroup IN (1,2,3) will match a customer who is in any of the three groups, but won't check for membership in all of them.
Try this instead:
SELECT a.*
FROM (SELECT c.*, substring((SELECT (', ' + CAST(cg.KeyGroup AS varchar(10)))
FROM CustomerGroupJoins cg
WHERE cg.KeyCustomer = c.[Key]
AND cg.KeyGroup IN (1,2,3)
ORDER BY cg.KeyGroup ASC
FOR XML PATH('')), 3, 2000) AS GroupList
FROM Customer AS c) AS a
WHERE a.GroupList = ('1, 2, 3')
This will also work:
SELECT c.*
FROM Customer c
WHERE c.[Key] IN (SELECT cg.KeyCustomer
FROM CustomerGroupJoins cg
WHERE cg.KeyGroup IN (1,2,3)
GROUP BY cg.KeyCustomer
HAVING count(*) = 3)

Maybe something like this?
SELECT c.Key, g.Key, cgj.KeyCustomer, cgj.KeyGroup
FROM Customer c
LEFT JOIN CustomerGroupJoins cgj ON cgj.KeyCustomer = c.Key
LEFT JOIN Groups g ON g.Key = cgj.KeyGroup
WHERE g.key IN (1, 2, 3)

From what you described, try this:
SELECT * FROM Customer c
INNER JOIN CustomerGroupJoins cgj
ON c.key = cgj.keyCustomer
INNER JOIN groups g
ON cgj.keyGroup = g.key
WHERE g.key IN (1,2,3)

SELECT *
FROM customer c
INNER JOIN customerGroupJoins j ON (j.keyCustomer = c.key)
WHERE j.keyGroup IN (1, 2, 3)
You don't need to join against groups-table, as long as you are only interested in the group key, which is found in your join table.

Here's a possible answer, not tested:
select custid
from CustomerGroupJoins
where groupid in (1,2,3)
group by custid
having count(*) = 3
This searches for customers that have 3 rows with groupid 1, 2, or 3, which means that they are in all 3 groups, because I assume you have a primary key on (custid, groupid).
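The primary key assumption matters. As a rough illustration (Python's sqlite3 module, made-up rows, and the custid/groupid names from the query above), here is what happens when the link table has no such key and a membership is stored more than once; COUNT(DISTINCT groupid) is one way to stay correct in that case:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Deliberately no primary key, so duplicate memberships are possible.
    CREATE TABLE CustomerGroupJoins (custid INTEGER, groupid INTEGER);
    INSERT INTO CustomerGroupJoins VALUES
        (1, 1), (1, 2), (1, 3),   -- customer 1: really in all three groups
        (2, 1), (2, 1), (2, 1),   -- customer 2: group 1 stored three times
        (3, 1), (3, 2);           -- customer 3: only two of the groups
""")

QUERY = """
    SELECT custid FROM CustomerGroupJoins
    WHERE groupid IN (1, 2, 3)
    GROUP BY custid
    HAVING {count_expr} = 3
    ORDER BY custid
"""

# COUNT(*) counts rows, so the duplicated customer 2 slips through.
plain = [r[0] for r in conn.execute(QUERY.format(count_expr="COUNT(*)"))]
# COUNT(DISTINCT groupid) counts different groups, so only customer 1 qualifies.
distinct = [r[0] for r in conn.execute(QUERY.format(count_expr="COUNT(DISTINCT groupid)"))]
print(plain, distinct)
```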

Related

Bigquery SQL - WHERE IN Col Value

So I have a reference table (table A) like this:
| cust_id | prod |
| 1 | A, B |
| 2 | C, D, E|
This reference table will be joined to a transaction-history-like table (table B):
| trx_id | cust_id | prod | amount
| 1 | 1 | A | 10
| 2 | 1 | B | 5
| 3 | 1 | C | 1
| 4 | 1 | D | 6
I want to get the sum of the amount in table B, but only for the products listed in table A.
I tried something like this, but it doesn't work:
SELECT A.cust_id
, SUM(B.amount) AS amount
FROM A
INNER JOIN B ON A.cust_id = B.cust_id
AND B.prod IN(A.prod)
GROUP BY 1
Hmmm . . . Try splitting the prod and join on that:
SELECT A.cust_id, SUM(B.amount) AS amount
FROM A CROSS JOIN
UNNEST(SPLIT(a.prod, ', ')) p JOIN
B
ON A.cust_id = B.cust_id AND B.prod = p
GROUP BY 1;
Note: Storing multiple values in a string is a really bad idea. You can use a separate junction table (one row per customer and product) or use an array.
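To illustrate the junction table suggestion, here is a sketch in Python's sqlite3 module rather than BigQuery, using the sample rows from the question. The table name A_products is hypothetical. Once there is one row per customer and product, the sum becomes an ordinary equality join:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (cust_id INTEGER, prod TEXT);
    CREATE TABLE B (trx_id INTEGER, cust_id INTEGER, prod TEXT, amount INTEGER);
    CREATE TABLE A_products (cust_id INTEGER, prod TEXT);  -- hypothetical junction table
    INSERT INTO A VALUES (1, 'A, B'), (2, 'C, D, E');
    INSERT INTO B VALUES (1, 1, 'A', 10), (2, 1, 'B', 5), (3, 1, 'C', 1), (4, 1, 'D', 6);
""")

# One-off normalization: split each comma-separated string into
# one row per (customer, product).
pairs = [
    (cust_id, prod.strip())
    for cust_id, prods in conn.execute("SELECT cust_id, prod FROM A").fetchall()
    for prod in prods.split(",")
]
conn.executemany("INSERT INTO A_products VALUES (?, ?)", pairs)

# With the junction table in place, no string handling is needed in the query.
totals = conn.execute("""
    SELECT p.cust_id, SUM(t.amount) AS amount
    FROM A_products p
    JOIN B t ON t.cust_id = p.cust_id AND t.prod = p.prod
    GROUP BY p.cust_id
""").fetchall()
print(totals)
```

Customer 1's transactions for C and D are left out because those products are not in the reference list, and customer 2 has no transactions in the sample at all.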
Below is for BigQuery Standard SQL
#standardSQL
select cust_id,
sum(if(b.prod in unnest(split(a.prod, ', ')), amount, 0)) as amount
from `project.dataset.tableB` b
join `project.dataset.tableA` a
using(cust_id)
group by cust_id
Also, note: for BigQuery - in general - storing multiple values in a string is a really good idea. :o)
See Denormalize data whenever possible for more on this

Multiple select from CTE with different number of rows in a StoredProcedure

How to do two select with joins from the cte's which returns total number of columns in the two selects?
I tried doing union but that appends to the same list and there is no way to differentiate for further use.
WITH campus AS
(SELECT DISTINCT CampusName, DistrictName
FROM dbo.file
),creditAcceptance AS
(SELECT CampusName, EligibilityStatusFinal, CollegeCreditAcceptedFinal, COUNT(id) AS N
FROM dbo.file
WHERE (EligibilityStatusFinal LIKE 'Eligible%') AND (CollegeCreditEarnedFinal = 'Yes') AND (CollegeCreditAcceptedFinal = 'Yes')
GROUP BY CampusName, EligibilityStatusFinal, CollegeCreditAcceptedFinal
),eligibility AS
(SELECT CampusName, EligibilityStatusFinal, COUNT(id) AS N, CollegeCreditAcceptedFinal
FROM dbo.file
WHERE (EligibilityStatusFinal LIKE 'Eligible%')
GROUP BY CampusName, EligibilityStatusFinal, CollegeCreditAcceptedFinal
)
SELECT a.CampusName, c.[EligibilityStatusFinal], SUM(c.N) AS creditacceptCount
FROM campus as a FULL OUTER JOIN creditAcceptance as c ON a.CampusName=c.CampusName
WHERE (a.DistrictName = 'xy')
group by a.CampusName ,c.EligibilityStatusFinal
Union ALL
SELECT a.CampusName , b.[EligibilityStatusFinal], SUM(b.N) AS eligible
From Campus as a FULL OUTER JOIN eligibility as b ON a.CampusName = b.CampusName
WHERE (a.DistrictName = 'xy')
group by a.CampusName,b.EligibilityStatusFinal
Expected output:
+------------+------------------------+--------------------+
| CampusName | EligibilityStatusFinal | creditacceptCount |
+------------+------------------------+--------------------+
| M | G | 1 |
| E | NULL | NULL |
| A | G | 4 |
| B | G | 8 |
+------------+------------------------+--------------------+
+------------+------------------------+----------+
| CampusName | EligibilityStatusFinal | eligible |
+------------+------------------------+----------+
| A | G | 8 |
| C | G | 9 |
| A | T | 9 |
+------------+------------------------+----------+
As you can see here CTEs can be used in a single statement only, so you can't get the expected output with CTEs.
Here is an excerpt from Microsoft docs:
A CTE must be followed by a single SELECT, INSERT, UPDATE, or DELETE
statement that references some or all the CTE columns. A CTE can also
be specified in a CREATE VIEW statement as part of the defining SELECT
statement of the view.
You can use table variables (declare @campus table(...)) or temp tables (create table #campus (...)) instead.
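The temp table idea can be sketched outside SQL Server too. The following uses Python's sqlite3 module, so the syntax is SQLite's CREATE TEMP TABLE rather than T-SQL's #campus, and the file_rows table and its rows are made up as a stand-in for dbo.file. The intermediate result is materialized once and then read by two separate statements, each with its own result set:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-in for dbo.file, with made-up rows.
    CREATE TABLE file_rows (
        id INTEGER PRIMARY KEY,
        CampusName TEXT,
        EligibilityStatusFinal TEXT,
        CollegeCreditAcceptedFinal TEXT);
    INSERT INTO file_rows (CampusName, EligibilityStatusFinal, CollegeCreditAcceptedFinal)
    VALUES ('A', 'Eligible', 'Yes'), ('A', 'Eligible', 'No'), ('B', 'Eligible', 'Yes');

    -- Materialize the shared intermediate result once...
    CREATE TEMP TABLE eligibility AS
        SELECT CampusName, CollegeCreditAcceptedFinal, COUNT(id) AS N
        FROM file_rows
        WHERE EligibilityStatusFinal LIKE 'Eligible%'
        GROUP BY CampusName, CollegeCreditAcceptedFinal;
""")

# ...then run as many separate statements against it as needed.
eligible = conn.execute(
    "SELECT CampusName, SUM(N) FROM eligibility "
    "GROUP BY CampusName ORDER BY CampusName"
).fetchall()
accepted = conn.execute(
    "SELECT CampusName, SUM(N) FROM eligibility "
    "WHERE CollegeCreditAcceptedFinal = 'Yes' "
    "GROUP BY CampusName ORDER BY CampusName"
).fetchall()
print(eligible, accepted)
```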

SQL - Select with specific conditions in sub-query

I'm trying to use a query to select all the data of a company that answers a specific condition but I have trouble doing so. The following is what I have done so far:
SELECT *
FROM company a
WHERE a.id IN (SELECT b.company_id
FROM provider b
WHERE b.service_id IN (2, 4));
What I intend for the role of the sub-query (using the table below) to be, is to select the company_id that possess the service_id 2 and 4.
So in this example, it would only return the company_id 5:
+----------------+
| provider TABLE |
+----------------+
+----------------+----------------+----------------+
| id | company_id | service_id |
+--------------------------------------------------+
| 1 | 3 | 2 |
| 2 | 5 | 2 |
| 3 | 5 | 4 |
| 4 | 9 | 6 |
| 5 | 9 | 7 |
| ... | ... | ... |
As you may have guessed, the use of the IN in the sub-query does not fulfill my needs, it will select the company_id 5 but also the company_id 3.
I understand why, IN exists to check if a value matches any value in a list of values so it is not really what I need.
So my question is:
How can I replace the IN in my sub-query to select company_id
having the service_id 2 and 4?
The subquery should be:
SELECT b.company_id
FROM provider b
WHERE b.service_id IN (2, 4)
GROUP BY b.company_id
HAVING COUNT(DISTINCT b.service_id) = 2
You can self-JOIN the provider table to find companies that own both needed services.
SELECT p1.company_id
FROM provider p1
INNER JOIN provider p2 on p2.company_id = p1.company_id and p2.service_id = 2
WHERE p1.service_id = 4
The other answers are correct, but if you want to see all of the company details instead of only the company_id, you can use one EXISTS() per service_id:
SELECT *
FROM Company C
WHERE EXISTS (SELECT 1
FROM Provider P1
WHERE C.id = P1.company_id
AND P1.service_id = 2)
AND EXISTS (SELECT 1
FROM Provider P2
WHERE C.id = P2.company_id
AND P2.service_id = 4)
I offer this option for the subquery...
SELECT b.company_id
FROM provider b WHERE b.service_id = 2
INTERSECT
SELECT b.company_id
FROM provider b WHERE b.service_id = 4
Often, I find the performance of these operations to be outstanding even with very large data sets...
UNION
INTERSECT
EXCEPT (or MINUS in Oracle)
This article has some good insights:
You Probably Don't Use SQL Intersect or Except Often Enough
Hope it helps.
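For anyone who wants to compare the approaches, here is a quick check using Python's sqlite3 module and the five provider rows from the question. It is only a sketch on the sample data, not a performance comparison, and the helper name company_ids is made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE provider (id INTEGER PRIMARY KEY, company_id INTEGER, service_id INTEGER);
    INSERT INTO provider VALUES (1, 3, 2), (2, 5, 2), (3, 5, 4), (4, 9, 6), (5, 9, 7);
""")

def company_ids(sql):
    """Run a query that returns company_id in its first column; return the ids sorted."""
    return sorted(row[0] for row in conn.execute(sql))

# The original subquery: matches companies with service 2 OR service 4.
any_of = company_ids(
    "SELECT DISTINCT company_id FROM provider WHERE service_id IN (2, 4)")

# GROUP BY / HAVING: keeps only companies that have both services.
all_of_having = company_ids("""
    SELECT company_id FROM provider
    WHERE service_id IN (2, 4)
    GROUP BY company_id
    HAVING COUNT(DISTINCT service_id) = 2""")

# INTERSECT: companies with service 2 that also appear with service 4.
all_of_intersect = company_ids("""
    SELECT company_id FROM provider WHERE service_id = 2
    INTERSECT
    SELECT company_id FROM provider WHERE service_id = 4""")

print(any_of, all_of_having, all_of_intersect)
```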

Need T-SQL query to get multiple choice answer if matches

Example:
Table Question_Answers:
+------+--------+
| q_id | ans_id |
+------+--------+
| 1 | 2 |
| 1 | 4 |
| 2 | 1 |
| 3 | 1 |
| 3 | 2 |
| 3 | 3 |
+------+--------+
User_Submited_Answers:
+------+------------+
| q_id | sub_ans_id |
+------+------------+
| 1 | 2 |
| 1 | 4 |
| 2 | 1 |
| 3 | 1 |
| 3 | 2 |
| 3 | 4 |
+------+------------+
I need a T-SQL query that returns 1 for a question if the submitted rows match the answer rows, else 0.
SELECT
t1.q_id,
CASE WHEN COUNT(t2.sub_ans_id) = COUNT(*)
THEN 1
ELSE 0 END AS is_correct
FROM Question_Answers t1
LEFT JOIN User_Submited_Answers t2
ON t1.q_id = t2.q_id AND
t1.ans_id = t2.sub_ans_id
GROUP BY t1.q_id
Try the following code:
select qa.q_id,case when qa.ans_id=sqa.ans_id then 1 else 0 end as result from questionans qa
left join subquestionans sqa
on qa.q_id=sqa.q_id and qa.ans_id=sqa.ans_id
This should give you expected result for every question.
select q_id, min(Is_Correct)Is_Correct from (
select Q.q_id,case when count(A.sub_ans_id)=count(*) then 1 else 0 end as Is_Correct
from #Q Q left join #A A on Q.q_id=A.q_id and Q.ans_id=A.sub_ans_id
group by Q.q_id
UNION ALL
select A.q_id,case when count(Q.ans_id)=count(*) then 1 else 0 end as Is_Correct
from #Q Q right join #A A on Q.q_id=A.q_id and Q.ans_id=A.sub_ans_id
group by A.q_id ) I group by q_id
MySQL solution (sql fiddle):
SELECT tmp.q_id, MIN(c) as correct
FROM (
SELECT qa.q_id, IF(qa.q_id = usa.q_id, 1, 0) as c
FROM question_answers qa
LEFT JOIN user_submited_answers usa
ON qa.q_id = usa.q_id AND qa.ans_id = usa.sub_ans_id
UNION
SELECT usa.q_id, IF(qa.q_id = usa.q_id, 1, 0) as c
FROM question_answers qa
RIGHT JOIN user_submited_answers usa
ON qa.q_id = usa.q_id AND qa.ans_id = usa.sub_ans_id
) tmp
GROUP BY tmp.q_id;
Now, step by step explanation:
In order to get the right output we will need to:
extract from question_answers table the answers which were not filled in by the user (in your example: q_id = 3 with ans_id = 3)
extract from user_submited_answers table the wrong answers which were filled in by the user (in your example: q_id = 3 with sub_ans_id = 4)
To do that we can use a full outer join (for mysql left join + right join):
SELECT *
FROM question_answers qa
LEFT JOIN user_submited_answers usa
ON qa.q_id = usa.q_id AND qa.ans_id = usa.sub_ans_id
UNION
SELECT *
FROM question_answers qa
RIGHT JOIN user_submited_answers usa
ON qa.q_id = usa.q_id AND qa.ans_id = usa.sub_ans_id;
In the previous query's results, the rows we are looking for (wrong answers) contain NULL values (depending on the case, in the question_answers columns or in the user_submited_answers columns).
The next step is to mark those rows with 0 (wrong answer) using an IF or CASE statement: IF(qa.q_id = usa.q_id, 1, 0).
To get the final output we need to group by q_id and look for 0 values in the grouped rows. If there is at least one 0, the answer for that question is wrong and it should be marked as that.
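The steps above can be tried out with Python's sqlite3 module and the sample rows from the question. This is a sketch of the same idea, not the exact MySQL query: it uses two LEFT JOINs with the tables swapped (instead of LEFT plus RIGHT JOIN), UNION ALL, and a CASE expression in place of IF():

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE question_answers (q_id INTEGER, ans_id INTEGER);
    CREATE TABLE user_submited_answers (q_id INTEGER, sub_ans_id INTEGER);
    INSERT INTO question_answers VALUES (1, 2), (1, 4), (2, 1), (3, 1), (3, 2), (3, 3);
    INSERT INTO user_submited_answers VALUES (1, 2), (1, 4), (2, 1), (3, 1), (3, 2), (3, 4);
""")

results = conn.execute("""
    SELECT q_id, MIN(c) AS correct
    FROM (
        -- Correct answers the user did not submit get c = 0.
        SELECT qa.q_id, CASE WHEN usa.q_id IS NULL THEN 0 ELSE 1 END AS c
        FROM question_answers qa
        LEFT JOIN user_submited_answers usa
            ON qa.q_id = usa.q_id AND qa.ans_id = usa.sub_ans_id
        UNION ALL
        -- Submitted answers that are not correct get c = 0.
        SELECT usa.q_id, CASE WHEN qa.q_id IS NULL THEN 0 ELSE 1 END AS c
        FROM user_submited_answers usa
        LEFT JOIN question_answers qa
            ON qa.q_id = usa.q_id AND qa.ans_id = usa.sub_ans_id
    ) AS tmp
    GROUP BY q_id
    ORDER BY q_id
""").fetchall()
print(results)
```

Question 3 comes out as 0 for both reasons at once: ans_id 3 was never submitted, and sub_ans_id 4 is not a correct answer.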

Return count(*) even if 0

I have the following query:
select bb.Name, COUNT(*) as Num from BOutcome bo
JOIN BOffers bb ON bo.ID = bb.BOutcomeID
WHERE bo.EventID = 123 AND bo.OfferTypeID = 321 AND bb.NumA > bb.NumB
GROUP BY bb.Name
The table looks like:
Name | Num A | Num B
A | 10 | 3
B | 2 | 3
C | 10 | 3
A | 9 | 3
B | 2 | 3
C | 9 | 3
The expected output should be:
Name | Count
A | 2
B | 0
C | 2
Because for names A and C, Num A is bigger than Num B in two records each, and for name B, Num A is lower than Num B in both records.
My current output is:
Name | Count
A | 2
C | 2
Because B's count is 0, I am not getting it back from my query.
What is wrong with my query? How should I get it back?
Here is my guess. I think this is a much simpler approach than all of the left/right join hoops people have been spinning their wheels on. Since the output of the query relies only on columns in the left table, there is no need for an explicit join at all:
SELECT
bb.Name,
[Count] = SUM(CASE WHEN bb.NumA > bb.NumB THEN 1 ELSE 0 END)
-- just FYI, the above could also be written as:
-- [Count] = COUNT(CASE WHEN bb.NumA > bb.NumB THEN 1 END)
FROM dbo.BOffers AS bb
WHERE EXISTS
(
SELECT 1 FROM dbo.BOutcome
WHERE ID = bb.BOutcomeID
AND EventID = 123
AND OfferTypeID = 321
)
GROUP BY bb.Name;
Of course, we're not really sure that both Name and NumA/NumB are in the left table, since the OP talks about two tables but only shows one table in the sample data. My guess is based on the query he says is "working" but missing rows because of the explicit join.
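To show why moving the comparison out of the WHERE clause brings B back, here is a small check with Python's sqlite3 module. It assumes, as guessed above, that Name, NumA and NumB all live in BOffers, and it leaves out the BOutcome filter since the sample data has no such columns:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE BOffers (Name TEXT, NumA INTEGER, NumB INTEGER);
    INSERT INTO BOffers VALUES
        ('A', 10, 3), ('B', 2, 3), ('C', 10, 3),
        ('A', 9, 3),  ('B', 2, 3), ('C', 9, 3);
""")

# Filtering in WHERE removes every B row before grouping, so B has no group at all.
filtered = conn.execute(
    "SELECT Name, COUNT(*) FROM BOffers WHERE NumA > NumB "
    "GROUP BY Name ORDER BY Name"
).fetchall()

# Conditional aggregation keeps every row and only counts the matching ones.
conditional = conn.execute("""
    SELECT Name, SUM(CASE WHEN NumA > NumB THEN 1 ELSE 0 END)
    FROM BOffers
    GROUP BY Name
    ORDER BY Name
""").fetchall()
print(filtered, conditional)
```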
Another wild guess. Feel free to downvote:
SELECT ba.Name, COUNT(bb.BOutcomeID) as Num
FROM
( SELECT DISTINCT ba.Name
FROM
BOutcome AS b
JOIN
BOffers AS ba
ON ba.BOutcomeID = b.ID
WHERE b.EventID = 123
AND b.OfferTypeID = 321
) AS ba
LEFT JOIN
BOffers AS bb
ON bb.Name = ba.Name
AND bb.NumA > bb.NumB
GROUP BY ba.Name ;