Where statement for exact match on Many to Many SQL tables - sql

I am trying to construct a SQL statement to search in two tables that are in a many to many relation.
Problem : SQL statement to search for products with exact stones.
For example, in the below tables, I need a statement that will search for product with Ruby and Emerald stone ONLY. In all my attempts I get both Ring and Necklace because they both have Ruby and Emerald even though Necklace has one additional stone. It should only give Ring product.
I need a way to implement the AND operator on the stone table so that the result contains products that have the exact stones. Please help.
Table stone
s_id  s_name
----  -------
1     Ruby
2     Emerald
3     Onyx
Table product
p_id  p_name
----  --------
1     Ring
2     Necklace
3     Pendent
Relation table - product_stone
p_s_id  p_id  s_id
------  ----  ----
1       1     1
1       1     2
1       2     1
1       2     2
1       2     3
1       3     3

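This is a relational division question. We need the products whose set of stones, "divided" by our list, leaves no remainder, i.e. the product has every stone in the list and no other stone.
We will assume that p_id and s_id are unique keys and that each (p_id, s_id) pair appears at most once in product_stone: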
;WITH StonesToFind AS ( -- we could also use a table variable etc here
SELECT *
FROM stone
WHERE s_name IN ('Ruby','Emerald')
)
SELECT p.p_name
FROM product AS p -- let's get all products...
JOIN product_stone AS ps ON ps.p_id = p.p_id -- ...cross join all their stones
LEFT JOIN StonesToFind AS s ON s.s_id = ps.s_id -- they may have stones in the list
GROUP BY p.p_id, p_name
HAVING COUNT(CASE WHEN s.s_id IS NULL THEN 1 END) = 0
-- the number of non matching stones in product must be zero
AND COUNT(*) = (SELECT COUNT(*) FROM StonesToFind);
-- the total number of stones must be the same as the list
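To check the query above against the question's sample data, here is a minimal sketch that loads the rows into an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for the real database, the p_s_id column is left out for brevity, and the variable names are made up for the illustration.

```python
import sqlite3

# In-memory copy of the question's sample data (SQLite stands in for the real database).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stone (s_id INTEGER PRIMARY KEY, s_name TEXT);
CREATE TABLE product (p_id INTEGER PRIMARY KEY, p_name TEXT);
CREATE TABLE product_stone (p_id INTEGER, s_id INTEGER, UNIQUE (p_id, s_id));
INSERT INTO stone VALUES (1, 'Ruby'), (2, 'Emerald'), (3, 'Onyx');
INSERT INTO product VALUES (1, 'Ring'), (2, 'Necklace'), (3, 'Pendent');
INSERT INTO product_stone VALUES (1, 1), (1, 2), (2, 1), (2, 2), (2, 3), (3, 3);
""")

EXACT_MATCH = """
WITH StonesToFind AS (
    SELECT s_id FROM stone WHERE s_name IN ('Ruby', 'Emerald')
)
SELECT p.p_name
FROM product AS p
JOIN product_stone AS ps ON ps.p_id = p.p_id
LEFT JOIN StonesToFind AS s ON s.s_id = ps.s_id
GROUP BY p.p_id, p.p_name
HAVING COUNT(CASE WHEN s.s_id IS NULL THEN 1 END) = 0   -- no stone outside the list
   AND COUNT(*) = (SELECT COUNT(*) FROM StonesToFind)   -- and every listed stone is present
"""

exact_products = [row[0] for row in conn.execute(EXACT_MATCH)]
print(exact_products)  # ['Ring']
```

Necklace is rejected by the first HAVING condition because of its Onyx row, and Pendent is rejected for the same reason, so only Ring remains.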

Related

Inner join + group by - select common columns and aggregate functions

Let's say I have two tables
Customer
---
Id Name
1 Foo
2 Bar
and
CustomerPurchase
---
CustomerId, Amount, AmountVAT, Accountable(bit)
1 10 11 1
1 20 22 0
2 5 6 0
2 2 3 0
I need a single record for every joined and grouped Customer and CustomerPurchase group.
Every record would contain
columns from table Customer
some aggregation functions like SUM
a 'calculated' column. For example difference of other columns
result of subquery to CustomerPurchase table
An example of the result I would like to get
CustomerPurchases
---
Name Total TotalVAT VAT TotalAccountable
Foo 30 33 3 10
Bar 7 9 2 0
I was able to get a single row only by grouping by all the common columns, which I don't think is the right way to do it. Plus, I have no idea how to do the 'VAT' column and the 'TotalAccountable' column, which filters only certain rows of CustomerPurchase and then runs some kind of aggregate function on the result. The following example doesn't work, of course, but I wanted to show what I would like to achieve:
select C.Name,
SUM(CP.Amount) as 'Total',
SUM(CP.AmountVAT) as 'TotalVAT',
diff? as 'VAT',
subquery? as 'TotalAccountable'
from Customer C
inner join CustomerPurchase CR
on C.Id = CR.CustomerId
group by C.Id
I would suggest you just need the following slight changes to your query. For clarity, I would also consider, if you can, using the terms net and gross, which are typical for prices excluding and including VAT.
select c.[Name],
Sum(cp.Amount) as Total,
Sum(cp.AmountVAT) as TotalVAT,
Sum(cp.AmountVAT) - Sum(CP.Amount) as VAT,
Sum(case when cp.Accountable = 1 then cp.Amount else 0 end) as TotalAccountable
from Customer c
join CustomerPurchase cp on cp.CustomerId = c.Id
group by c.[Name];
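As a quick check of this approach against the question's sample rows, here is a sketch using Python's sqlite3 module with an in-memory database. SQLite is just a stand-in for SQL Server here, and the query groups by both Id and Name in case two customers share a name. The ELSE 0 branch is used so that a customer with no accountable rows shows 0 rather than NULL, which is what the expected output in the question asks for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (Id INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE CustomerPurchase (CustomerId INTEGER, Amount INTEGER,
                               AmountVAT INTEGER, Accountable INTEGER);
INSERT INTO Customer VALUES (1, 'Foo'), (2, 'Bar');
INSERT INTO CustomerPurchase VALUES (1, 10, 11, 1), (1, 20, 22, 0),
                                    (2, 5, 6, 0), (2, 2, 3, 0);
""")

SUMMARY = """
SELECT c.Name,
       SUM(cp.Amount)                      AS Total,
       SUM(cp.AmountVAT)                   AS TotalVAT,
       SUM(cp.AmountVAT) - SUM(cp.Amount)  AS VAT,
       -- conditional aggregation: only accountable rows contribute to this sum
       SUM(CASE WHEN cp.Accountable = 1 THEN cp.Amount ELSE 0 END) AS TotalAccountable
FROM Customer c
JOIN CustomerPurchase cp ON cp.CustomerId = c.Id
GROUP BY c.Id, c.Name
ORDER BY c.Id
"""

summary_rows = conn.execute(SUMMARY).fetchall()
for row in summary_rows:
    print(row)
# ('Foo', 30, 33, 3, 10)
# ('Bar', 7, 9, 2, 0)
```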

Why did the 'NOT IN' work but not the 'NOT EXISTS'?

I've been trying to improve my SQL and was playing around with a 'NOT EXISTS' function. I needed to find the names of salespeople who did not have any sales to company 'RED'.
I tried this and it did not work:
SELECT DISTINCT
sp.name
FROM salesperson sp
WHERE NOT EXISTS (
SELECT
ord.sales_id
FROM
company cmp
LEFT JOIN orders ord
on cmp.com_id=ord.com_id
WHERE cmp.name = 'RED')
This query ran but returned a NULL. Then I changed it to this and it worked fine:
SELECT DISTINCT
sp.name
FROM salesperson sp
WHERE sp.sales_id NOT IN (
SELECT
ord.sales_id as sales_id
FROM
company cmp
left join orders ord
on cmp.com_id=ord.com_id
WHERE cmp.name = 'RED')
Can someone explain why 'NOT EXISTS' did not work in this instance?
Just in case, here is the exercise in full:
Given three tables: salesperson, company, orders
Output all the names in the table salesperson, who didn’t have sales to company 'RED'.
Table: salesperson
sales_id  name  salary  commission_rate  hire_date
--------  ----  ------  ---------------  ----------
1         John  100000  6                4/1/2006
2         Amy   120000  5                5/1/2010
3         Mark  65000   12               12/25/2008
4         Pam   25000   25               1/1/2005
5         Alex  50000   10               2/3/2007
The table salesperson holds the salesperson information. Every salesperson has a sales_id and a name.
Table: company
com_id  name    city
------  ------  --------
1       RED     Boston
2       ORANGE  New York
3       YELLOW  Boston
4       GREEN   Austin
The table company holds the company information. Every company has a com_id and a name.
Table: orders
order_id  order_date  com_id  sales_id  amount
--------  ----------  ------  --------  ------
1         1/1/2014    3       4         100000
2         2/1/2014    4       5         5000
3         3/1/2014    1       1         50000
4         4/1/2014    1       4         25000
The table orders holds the sales record information, salesperson and customer company are represented by sales_id and com_id.
expected output
name
Amy
Mark
Alex
Explanation:
According to order '3' and '4' in table orders, it is easy to tell only salesperson 'John' and 'Pam' have sales to company 'RED', so we need to output all the other names in the table salesperson.
I think your two queries are totally different.
NOT EXISTS - this returns rows only when the subquery returns no rows. Your subquery is not tied to the outer row, and it always returns some data (there are orders for 'RED'), so NOT EXISTS is always false and you get an empty result. You need to correlate the subquery with the main query using WHERE sp.sales_id = ord.sales_id AND cmp.name = 'RED'
NOT IN - this is what you need for your purpose. You can see that it's clearly giving you data for not in (subquery) condition.
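One caveat with the NOT IN form is worth sketching: if the subquery can return a NULL, NOT IN matches nothing. With the LEFT JOIN from the question, that happens for any company that has no orders. The example below uses Python's sqlite3 module with an in-memory database as a stand-in. The company 'BLUE' and the helper names_without_sales are made up for the illustration and are not part of the exercise.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE salesperson (sales_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE company (com_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, com_id INTEGER, sales_id INTEGER);
INSERT INTO salesperson VALUES (1, 'John'), (2, 'Amy'), (3, 'Mark'), (4, 'Pam'), (5, 'Alex');
-- 'BLUE' is a made-up company with no orders, added only for this illustration
INSERT INTO company VALUES (1, 'RED'), (5, 'BLUE');
INSERT INTO orders VALUES (3, 1, 1), (4, 1, 4);
""")

NOT_IN_TEMPLATE = """
SELECT sp.name FROM salesperson sp
WHERE sp.sales_id NOT IN (SELECT ord.sales_id
                          FROM company cmp
                          {join} orders ord ON cmp.com_id = ord.com_id
                          WHERE cmp.name = ?)
ORDER BY sp.sales_id
"""

def names_without_sales(join, company):
    """Run the NOT IN query with the given join type and company name."""
    sql = NOT_IN_TEMPLATE.format(join=join)
    return [row[0] for row in conn.execute(sql, (company,))]

left_join_blue = names_without_sales("LEFT JOIN", "BLUE")  # subquery yields a single NULL
inner_join_blue = names_without_sales("JOIN", "BLUE")      # subquery yields no rows
left_join_red = names_without_sales("LEFT JOIN", "RED")    # no NULLs, since RED has orders

print(left_join_blue)   # []
print(inner_join_blue)  # ['John', 'Amy', 'Mark', 'Pam', 'Alex']
print(left_join_red)    # ['Amy', 'Mark', 'Alex']
```

With the plain JOIN nobody has sold to 'BLUE', so every salesperson is returned. With the LEFT JOIN the lone NULL makes every NOT IN comparison unknown and the result is empty.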
The equivalent NOT EXISTS requires a correlation clause:
SELECT sp.name
FROM salesperson sp
WHERE NOT EXISTS (SELECT ord.sales_id
FROM company cmp JOIN
orders ord
ON cmp.com_id = ord.com_id
WHERE sp.sales_id = ord.sales_id AND
cmp.name = 'RED'
);
Neither the NOT IN nor NOT EXISTS versions requires a LEFT JOIN in the subquery. In fact, the LEFT JOIN somewhat defeats the purpose of the logic.
Without the correlation clause, the subquery runs and it will return rows if any cmp.name is 'RED'. That appears to be the case and so NOT EXISTS always returns false.
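The difference between the uncorrelated and the correlated subquery can be seen on the exercise data. This is a sketch using Python's sqlite3 module with an in-memory database as a stand-in. Only the columns the queries need are loaded, and the variable names are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE salesperson (sales_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE company (com_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, com_id INTEGER, sales_id INTEGER);
INSERT INTO salesperson VALUES (1, 'John'), (2, 'Amy'), (3, 'Mark'), (4, 'Pam'), (5, 'Alex');
INSERT INTO company VALUES (1, 'RED'), (2, 'ORANGE'), (3, 'YELLOW'), (4, 'GREEN');
INSERT INTO orders VALUES (1, 3, 4), (2, 4, 5), (3, 1, 1), (4, 1, 4);
""")

# The subquery never mentions sp, so it gives the same answer for every salesperson.
UNCORRELATED = """
SELECT sp.name FROM salesperson sp
WHERE NOT EXISTS (SELECT 1
                  FROM company cmp
                  JOIN orders ord ON cmp.com_id = ord.com_id
                  WHERE cmp.name = 'RED')
ORDER BY sp.sales_id
"""

# The extra predicate on sp.sales_id ties the subquery to the current outer row.
CORRELATED = """
SELECT sp.name FROM salesperson sp
WHERE NOT EXISTS (SELECT 1
                  FROM company cmp
                  JOIN orders ord ON cmp.com_id = ord.com_id
                  WHERE ord.sales_id = sp.sales_id
                    AND cmp.name = 'RED')
ORDER BY sp.sales_id
"""

uncorrelated_names = [row[0] for row in conn.execute(UNCORRELATED)]
correlated_names = [row[0] for row in conn.execute(CORRELATED)]

print(uncorrelated_names)  # []
print(correlated_names)    # ['Amy', 'Mark', 'Alex']
```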

More efficient SQL statement to eliminate my n^2 algorithm?

Let's say the following are my SQL tables:
My first table is called [Customer].
CustomerID CustomerName CustomerAddress
---------- ------------ ---------------
1 Name1 1 Infinity Loop
2 Name2 2 Infinity Loop
3 Name3 3 Infinity Loop
My next table is called [Group].
GroupID GroupName
------- ---------
1 Group1
2 Group2
3 Group3
Then, to link the two, I have a table called [GroupCustomer].
GroupID CustomerID
------- ----------
1 2
1 3
2 1
3 1
So on the ASP.NET page, I have two tables I want to display. The first table is essentially all Customers that are in a particular group. So in a drop down list, if I select Group1, it would display the following table:
CustomerID CustomerName CustomerAddress
---------- ------------ ---------------
2 Name2 2 Infinity Loop
3 Name3 3 Infinity Loop
The table above is for all customers that are "associated" with the selected group (which in this case is Group1). Then, in the other table, I want it to display this:
CustomerID CustomerName CustomerAddress
---------- ------------ ---------------
1 Name1 1 Infinity Loop
Essentially, for this table, I want it to display all customers that are NOT in the selected group.
To generate the table for all customers that are in the selected group, I wrote the following SQL:
SELECT Customer.CustomerID, Customer.CustomerName, Customer.CustomerAddress
FROM Customer
INNER JOIN GroupCustomer ON
Customer.CustomerID = GroupCustomer.CustomerID
INNER JOIN [Group] ON
GroupCustomer.GroupID = [Group].GroupID
WHERE [Group].GroupID = @selectedGroupParameter
So when I mentioned my n^2 algorithm, I essentially used the SQL statement above and compared its result against a SQL statement where I just SELECT * from the Customer table. Where there was a match, I simply did not display the row. This is incredibly inefficient, and something I'm not proud of.
This leads to my current question, what's the most efficient SQL statement I can write that will eliminate my n^2?
You can use NOT EXISTS to get Customers not in a particular Group:
SELECT *
FROM Customer c
WHERE
NOT EXISTS(
SELECT 1
FROM GroupCustomer
WHERE
CustomerID = c.CustomerID
AND GroupID = @selectedGroupParameter
)
Read this article by Aaron Bertrand for different ways to solve this kind of problem and their performance comparisons, with NOT EXISTS being the fastest according to his test.
SQL Fiddle
Select * from Customer
where CustomerID not in
(select CustomerID
from GroupCustomer
where GroupID = @selectedGroupParameter)
You can use not in for this check. That said, you can probably just get rid of the join to the Group table for some increased performance, since you don't appear to actually use the group name.
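Both result sets from the question can be produced with one query each, with no comparison loop in application code. Here is a sketch with Python's sqlite3 module and an in-memory database as a stand-in for SQL Server. The helper customer_ids is made up for the illustration, the selected group is passed as a bound parameter, and the join to the Group table is dropped as suggested above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT, CustomerAddress TEXT);
CREATE TABLE GroupCustomer (GroupID INTEGER, CustomerID INTEGER);
INSERT INTO Customer VALUES (1, 'Name1', '1 Infinity Loop'),
                            (2, 'Name2', '2 Infinity Loop'),
                            (3, 'Name3', '3 Infinity Loop');
INSERT INTO GroupCustomer VALUES (1, 2), (1, 3), (2, 1), (3, 1);
""")

# Customers in the selected group: the link table alone is enough.
IN_GROUP = """
SELECT c.CustomerID
FROM Customer c
JOIN GroupCustomer gc ON gc.CustomerID = c.CustomerID
WHERE gc.GroupID = ?
ORDER BY c.CustomerID
"""

# Customers not in the selected group: an anti-join via NOT EXISTS.
NOT_IN_GROUP = """
SELECT c.CustomerID
FROM Customer c
WHERE NOT EXISTS (SELECT 1
                  FROM GroupCustomer gc
                  WHERE gc.CustomerID = c.CustomerID
                    AND gc.GroupID = ?)
ORDER BY c.CustomerID
"""

def customer_ids(sql, group_id):
    """Return the CustomerID values produced by sql for the given group."""
    return [row[0] for row in conn.execute(sql, (group_id,))]

members = customer_ids(IN_GROUP, 1)
non_members = customer_ids(NOT_IN_GROUP, 1)
print(members)      # [2, 3]
print(non_members)  # [1]
```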

SQL Server - tsql join/filtering issue

Probably the wrong title, but I can't summarise what I'm trying to do nicely. Which is probably why my googling hasn't helped.
I have a list of Discounts, and a list of TeamExclusiveDiscounts (DiscountId, TeamId)
I call a stored procedure passing in @TeamID (int).
What I want is all Discounts except if they're in TeamExclusiveDiscounts and don't have TeamID matching @TeamId.
So the data is something like
Table Discount:
DiscountID Name
-----------------------
1 Test 1
2 Test 2
3 Test 3
4 Test 4
5 Test 5
Table TeamExclusiveDiscount:
DiscountID TeamID
-----------------------
1 10
2 10
2 4
3 8
Expected results:
searching for TeamID = 10 I should get discounts 1,2,4,5
searching for TeamID = 5 I should get discounts 4, 5
searching for TeamID = 8 I should get discounts 3, 4, 5
I've tried a variety of joins, or trying to update a temp table to set whether the discount is allowed or not, but I just can't seem to get my head around this issue.
So I'm after the T-SQL for my stored procedure that will select the correct discounts (SQL Server). Thanks!
SELECT D.DiscountID FROM Discounts D
LEFT JOIN TeamExclusiveDiscount T
ON D.DiscountID=T.DiscountID
WHERE T.TeamID=@TeamID OR T.TeamID IS NULL
SQLFIDDLE for TEST
Can you try this - it only selects records where there is a teamdiscount record with the team or no teamdiscount record at all.
SELECT * FROM Discounts D
WHERE
EXISTS (
SELECT 1
FROM TeamExclusiveDiscount T
WHERE T.DiscountID = D.DiscountID
AND TeamID = @TeamID
)
OR
NOT EXISTS (
SELECT 1
FROM TeamExclusiveDiscount T
WHERE T.DiscountID = D.DiscountID
)
I like to translate the English description directly into SQL (at least as a first pass):
"All Discounts except if they're in TeamExclusiveDiscounts and don't have TeamID matching @TeamId."
SELECT *
FROM Discounts D -- All Discounts
WHERE D.DiscountID NOT IN -- except if they're in TeamExclusiveDiscounts
(SELECT T.DiscountID
FROM TeamExclusiveDiscount T
WHERE T.DiscountID NOT IN -- and don't have TeamID matching @TeamId.
(SELECT Match.DiscountID
FROM TeamExclusiveDiscount Match
WHERE Match.TeamID = @TeamID)
)
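The three formulations in this thread (LEFT JOIN, EXISTS / NOT EXISTS, and nested NOT IN) can be compared on the question's sample data. This is a sketch with Python's sqlite3 module and an in-memory database as a stand-in for SQL Server. The table is named Discount as in the question, the team id is a bound parameter, and the helper discounts_for is made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Discount (DiscountID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE TeamExclusiveDiscount (DiscountID INTEGER, TeamID INTEGER);
INSERT INTO Discount VALUES (1, 'Test 1'), (2, 'Test 2'), (3, 'Test 3'),
                            (4, 'Test 4'), (5, 'Test 5');
INSERT INTO TeamExclusiveDiscount VALUES (1, 10), (2, 10), (2, 4), (3, 8);
""")

LEFT_JOIN_FORM = """
SELECT d.DiscountID
FROM Discount d
LEFT JOIN TeamExclusiveDiscount t ON d.DiscountID = t.DiscountID
WHERE t.TeamID = ? OR t.TeamID IS NULL
ORDER BY d.DiscountID
"""

EXISTS_FORM = """
SELECT d.DiscountID
FROM Discount d
WHERE EXISTS (SELECT 1 FROM TeamExclusiveDiscount t
              WHERE t.DiscountID = d.DiscountID AND t.TeamID = ?)
   OR NOT EXISTS (SELECT 1 FROM TeamExclusiveDiscount t
                  WHERE t.DiscountID = d.DiscountID)
ORDER BY d.DiscountID
"""

NESTED_NOT_IN_FORM = """
SELECT d.DiscountID
FROM Discount d
WHERE d.DiscountID NOT IN
      (SELECT t.DiscountID
       FROM TeamExclusiveDiscount t
       WHERE t.DiscountID NOT IN
             (SELECT m.DiscountID FROM TeamExclusiveDiscount m WHERE m.TeamID = ?))
ORDER BY d.DiscountID
"""

def discounts_for(sql, team_id):
    """Return the DiscountID values that sql allows for the given team."""
    return [row[0] for row in conn.execute(sql, (team_id,))]

for team_id in (10, 5, 8):
    print(team_id, discounts_for(EXISTS_FORM, team_id))
# 10 [1, 2, 4, 5]
# 5 [4, 5]
# 8 [3, 4, 5]
```

On this data all three forms return the same ids. The LEFT JOIN form could return a discount more than once if TeamExclusiveDiscount held duplicate (DiscountID, TeamID) rows, which the sample data does not.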

Excluding null entries from multiples values with SQL

From 3 different tables, I want to know whether a person (table1), with multiple visits to a store (table2), has purchased toys and enjoyed them (table3). In table3, 0 means either negative (not enjoyed) or not bought, and 1 means positive. Every visit has its own identification number.
My problem is that for every ID in table1, I have multiple entries in table2, for each of which I have multiple entries in table3, and only one of them is null.
Person
ID  age
--  ---
1   12
2   10
(goes on for every id)
Visit
Number  Visit  ID
------  -----  --
1       1      1
2       1      2
3       2      1
4       2      2
Toy
number  name   value
------  -----  -----
1       Plane
1       Train  1
2       Plane  1
2       Train  0
3       Plane  0
3       Train  1
(goes on for every visit)
I want to know how many people have enjoyed a certain toy. However, since some of the information is null, I have trouble selecting only the people who have a value for both of their visits. For instance, the following code works only if the null condition is placed on just one of the visits:
Select p.id, max(t.value) as value
from person p
join visit v on p.id = v.id
join toy t on v.number = t.number
where
((t.name='plane' and v.visit=1)
or (t.name='plane' and v.visit=2))
and (
(v.visit=1 and ((t.value=1 or t.value=0) is not null))
---and (v.visit=2 and ((t.value=1 or t.value=0) is not null))
)
group by p.id
order by p.id
I have tried many ways of writing this. It works if I apply either null condition on its own, but if I remove the -- and apply the condition to both visit 1 and visit 2, it doesn't work. Note that I am using max on the value because I want a positive value if possible.
If you want to know how many people have enjoyed a certain toy, you may simply write this:
select count(*) from toy t where t.name='TOY NAME' and t.value=1;
If you want something else, kindly clarify.
Edited query:
Select p.id, max(t.value) as value
from person p
join visit v on p.id = v.id
join toy t on v.number = t.number
where
t.name='plane'
and t.value is not null
group by p.id
order by p.id
I used count as a way to eliminate all the null entries. The sum of null and a value is always null, so adding the restriction count=2 eliminates the nulls.
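The count=2 restriction is mentioned here without the query that uses it. One way it might look, as a sketch only and not necessarily what the poster ran, is a HAVING clause on COUNT(t.value), which skips NULLs. The example uses Python's sqlite3 module with an in-memory database. The table layout (person, visit keyed by visit number, toy keyed by visit number) is my reading of the question's sample, and the toy row for visit 4 is made up because the question does not show it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, age INTEGER);
CREATE TABLE visit (number INTEGER PRIMARY KEY, visit INTEGER, id INTEGER);
CREATE TABLE toy (number INTEGER, name TEXT, value INTEGER);
INSERT INTO person VALUES (1, 12), (2, 10);
INSERT INTO visit VALUES (1, 1, 1), (2, 1, 2), (3, 2, 1), (4, 2, 2);
INSERT INTO toy VALUES (1, 'Plane', NULL), (1, 'Train', 1),
                       (2, 'Plane', 1),    (2, 'Train', 0),
                       (3, 'Plane', 0),    (3, 'Train', 1),
                       (4, 'Plane', 0);  -- visit 4 is not in the question; made up here
""")

BOTH_VISITS_KNOWN = """
SELECT p.id, MAX(t.value) AS value
FROM person p
JOIN visit v ON p.id = v.id
JOIN toy t ON v.number = t.number
WHERE t.name = 'Plane'
  AND v.visit IN (1, 2)
GROUP BY p.id
HAVING COUNT(t.value) = 2   -- COUNT(column) ignores NULLs: both visits must have a value
ORDER BY p.id
"""

plane_rows = conn.execute(BOTH_VISITS_KNOWN).fetchall()
print(plane_rows)  # [(2, 1)]
```

Person 1 is dropped because the Plane value for their first visit is NULL, so only one of their two visits counts. Person 2 has values for both visits, and MAX picks the positive one.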