many to many select query - sql

I'm trying to write code to pull a list of product items from a SQL Server database and display the results on a webpage.
A requirement of the project is that a list of categories is displayed on the right-hand side of the page as a list of checkboxes (all categories selected by default), and a user can uncheck categories and re-query the database to view products in only the categories they want.
Here's where it starts to get a bit hairy.
Each product can be assigned to multiple categories using a product categories table, as below...
Product table
[product_id](PK),[product_name],[product_price],[isEnabled],etc...
Category table
[CategoryID](PK),[CategoryName]
ProductCategory table
[id](PK),[CategoryID](FK),[ProductID](FK)
I need to select a list of products that match a set of category IDs passed to my stored procedure, where the products have multiple assigned categories.
The category IDs are passed to the proc as a comma-delimited varchar, e.g. ( 3,5,8,12 ).
The SQL breaks this varchar value into a result set in a temp table for processing.
How would I go about writing this query?

One problem is passing the array or list of selected categories into the server. The subject was covered at length by Erland Sommarskog in the series of articles Arrays and Lists in SQL Server. Passing the list as a comma-separated string and building a temp table is one option. There are alternatives, like using XML, a table-valued parameter (in SQL Server 2008), or a table @variable instead of a #temp table. The pros and cons of each are covered in the article(s) I linked.
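As a minimal sketch of the comma-separated-string option, the snippet below splits the list on the client and loads it into a temp table. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, so the temp table is called selected_categories rather than #selectedCategories, and the helper name load_selected_categories is hypothetical.

```python
import sqlite3


def load_selected_categories(conn, csv_ids):
    """Split a comma-delimited id string and load it into a temp table."""
    # int() tolerates surrounding spaces; empty fragments are skipped.
    ids = [int(part) for part in csv_ids.split(",") if part.strip()]
    conn.execute(
        "CREATE TEMP TABLE IF NOT EXISTS selected_categories "
        "(CategoryID INTEGER PRIMARY KEY)"
    )
    conn.execute("DELETE FROM selected_categories")
    # Parameterized inserts: the ids are never spliced into the SQL text.
    conn.executemany(
        "INSERT OR IGNORE INTO selected_categories (CategoryID) VALUES (?)",
        [(i,) for i in ids],
    )
    return ids


conn = sqlite3.connect(":memory:")
loaded = load_selected_categories(conn, " 3,5,8,12 ")
rows = [r[0] for r in conn.execute(
    "SELECT CategoryID FROM selected_categories ORDER BY CategoryID")]
```

Parsing to integers before touching the database also rejects malformed input early, since int() raises ValueError on anything that is not a number.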
Now on to how to retrieve the products. First things first: if all categories are selected, use a different query that simply retrieves all products without bothering with categories at all. This avoids the join entirely, and considering that all users will probably first see a page without any category unselected, the saving can be significant.
When categories are selected, then building a query that joins products, categories and selected categories is fairly easy. Making it scale and perform is a different topic, and is entirely dependent on your data schema and actual pattern of categories selected. A naive approach is like this:
select ...
from Products p
where p.IsEnabled = 1
and exists (
select 1
from ProductCategories pc
join #selectedCategories sc on sc.CategoryID = pc.CategoryID
where pc.ProductID = p.ProductID);
The ProductCategories table must have an index on (ProductID, CategoryID) and one on (CategoryID, ProductID) (one of them is the clustered index, the other nonclustered). This is true for every solution, by the way. This query works well if most categories are always selected and the result contains most products anyway. But if the list of selected categories is restrictive, it is better to avoid the scan on the potentially large Products table and start from the selected categories:
with distinctProducts as (
select distinct pc.ProductID
from ProductCategories pc
join #selectedCategories sc on pc.CategoryID = sc.CategoryID)
select p.*
from Products p
join distinctProducts dc on p.ProductID = dc.ProductID;
Again, the best solution depends largely on the shape of your data. For example, if you have a very skewed category (one category alone covers 99% of products), then the best solution would have to account for this skew.
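To check that the two query shapes above select the same products, here is a small sketch against an in-memory SQLite database (standing in for SQL Server, so the temp table is named selected_categories instead of #selectedCategories). The sample rows are invented, and the IsEnabled filter is applied to both queries here so the results are directly comparable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Products (ProductID INTEGER PRIMARY KEY, ProductName TEXT, IsEnabled INTEGER);
CREATE TABLE ProductCategories (
    ProductID INTEGER NOT NULL,
    CategoryID INTEGER NOT NULL,
    PRIMARY KEY (ProductID, CategoryID)            -- index on (ProductID, CategoryID)
);
CREATE INDEX ix_pc_category_product
    ON ProductCategories (CategoryID, ProductID);  -- index on (CategoryID, ProductID)
CREATE TEMP TABLE selected_categories (CategoryID INTEGER PRIMARY KEY);

INSERT INTO Products VALUES (1, 'Widget', 1), (2, 'Gadget', 1), (3, 'Gizmo', 1), (4, 'Retired', 0);
INSERT INTO ProductCategories VALUES (1, 3), (1, 5), (2, 5), (3, 9), (4, 3);
INSERT INTO selected_categories VALUES (3), (5);
""")

# Shape 1: scan Products and probe the link table with EXISTS.
exists_form = [r[0] for r in conn.execute("""
    SELECT p.ProductID
    FROM Products p
    WHERE p.IsEnabled = 1
      AND EXISTS (SELECT 1
                  FROM ProductCategories pc
                  JOIN selected_categories sc ON sc.CategoryID = pc.CategoryID
                  WHERE pc.ProductID = p.ProductID)
    ORDER BY p.ProductID
""")]

# Shape 2: start from the selected categories, then look up the products.
cte_form = [r[0] for r in conn.execute("""
    WITH distinctProducts AS (
        SELECT DISTINCT pc.ProductID
        FROM ProductCategories pc
        JOIN selected_categories sc ON pc.CategoryID = sc.CategoryID)
    SELECT p.ProductID
    FROM Products p
    JOIN distinctProducts dp ON p.ProductID = dp.ProductID
    WHERE p.IsEnabled = 1
    ORDER BY p.ProductID
""")]
```

Which shape is faster on real data is a question for the actual execution plans; the sketch only confirms the two forms agree on the result.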

This gets all products that are at least in all of the desired categories (no less):
select * from product p1 join (
select p.product_id from product p
join ProductCategory pc on pc.product_id = p.product_id
where pc.category_id in (3,5,8,12)
group by p.product_id having count(p.product_id) = 4
) p2 on p1.product_id = p2.product_id
4 is the number of categories in the set.
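When the selected IDs already sit in a table, the hard-coded 4 can be replaced by a count of that table. A minimal sketch, using SQLite through Python's sqlite3 module in place of SQL Server; the temp table selected_categories and the sample rows are assumptions, and the link table is assumed to hold each (product, category) pair at most once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ProductCategory (
    product_id INTEGER NOT NULL,
    category_id INTEGER NOT NULL,
    PRIMARY KEY (product_id, category_id)   -- no duplicate pairs, so COUNT(*) is safe
);
CREATE TEMP TABLE selected_categories (category_id INTEGER PRIMARY KEY);

INSERT INTO ProductCategory VALUES
    (1, 3), (1, 5), (1, 8), (1, 12),           -- exactly the wanted set
    (2, 3), (2, 5), (2, 8), (2, 12), (2, 20),  -- the wanted set plus one extra
    (3, 3), (3, 5);                            -- only part of the wanted set
INSERT INTO selected_categories VALUES (3), (5), (8), (12);
""")

# Keep products whose number of matching links equals the number of wanted ids.
in_all_selected = [row[0] for row in conn.execute("""
    SELECT pc.product_id
    FROM ProductCategory pc
    JOIN selected_categories sc ON sc.category_id = pc.category_id
    GROUP BY pc.product_id
    HAVING COUNT(*) = (SELECT COUNT(*) FROM selected_categories)
    ORDER BY pc.product_id
""")]
```

Product 2 qualifies despite its extra category, which is the "at least" behaviour described above.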
This gets all products that are exactly in all of the desired categories (no more, no less):
select * from product p1 join (
select p.product_id from product p
-- join the link table so the count below counts categories, not products
join ProductCategory pc on pc.product_id = p.product_id
where not exists (
select * from ProductCategory pc2
where pc2.product_id = p.product_id
and pc2.category_id not in (3,5,8,12)
)
group by p.product_id having count(p.product_id) = 4
) p2 on p1.product_id = p2.product_id
The double negative can be read as: get all products for which there are no categories that are not in the desired category list.
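A sketch of the "exactly these categories" check with the ID list passed as parameters rather than typed into the query. It runs on SQLite via Python's sqlite3 module as a stand-in for SQL Server; the function name products_in_exactly and the sample rows are hypothetical, and the list is assumed to contain no duplicate IDs.

```python
import sqlite3


def products_in_exactly(conn, wanted):
    """Return ids of products linked to every id in `wanted` and to nothing else."""
    marks = ",".join("?" * len(wanted))  # one placeholder per wanted id
    sql = f"""
        SELECT pc.product_id
        FROM ProductCategory pc
        WHERE NOT EXISTS (              -- no category outside the wanted list
            SELECT 1
            FROM ProductCategory other
            WHERE other.product_id = pc.product_id
              AND other.category_id NOT IN ({marks}))
        GROUP BY pc.product_id
        HAVING COUNT(*) = ?             -- and every wanted category is present
        ORDER BY pc.product_id
    """
    return [row[0] for row in conn.execute(sql, [*wanted, len(wanted)])]


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ProductCategory (
    product_id INTEGER NOT NULL,
    category_id INTEGER NOT NULL,
    PRIMARY KEY (product_id, category_id)
);
INSERT INTO ProductCategory VALUES
    (1, 3), (1, 5), (1, 8), (1, 12),
    (2, 3), (2, 5), (2, 8), (2, 12), (2, 20),
    (3, 3), (3, 5);
""")

exact_four = products_in_exactly(conn, [3, 5, 8, 12])
exact_two = products_in_exactly(conn, [3, 5])
```

Only the placeholders are formatted into the SQL text; the ID values themselves travel as bound parameters.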
For the products in any of the desired categories, it's as simple as:
select * from product p1 where exists (
select * from product p2
join ProductCategory pc on pc.product_id = p2.product_id
where
p1.product_id = p2.product_id and
pc.category_id in (3,5,8,12)
)

This should do. You don't have to break up the comma-delimited category IDs.
select distinct p.*
from product p, productcategory pc
where p.product_id = pc.productid
and pc.categoryid in ( place your comma delimited category ids here)
This will give the products which are in any of the passed-in category IDs; i.e., as per JNK's comment, it's an OR, not an ALL. Please specify if you want an AND, i.e., the product should be selected only if it is in ALL the categories specified in the comma-separated list.

If you need anything else than product_id from products then you can write something like this (and adding the extra fields that you need):
SELECT distinct(p.product_id)
FROM product_table p
JOIN productcategory_table pc
ON p.product_id=pc.product_id
WHERE pc.category_id in (3,5,8,12);
On the other hand, if you really need just the product_id, you can simply select it from productcategory_table:
SELECT distinct(product_id)
FROM productcategory_table
WHERE category_id in (3,5,8,12);

This should be fairly close to what you are looking for
SELECT product.*
FROM product
JOIN ProductCategory ON ProductCategory.ProductID = Product.product_id
JOIN #my_temp ON #my_temp.category_id = ProductCategory.CategoryID
EDIT
As noted in the comments, this will produce duplicates for products that appear in more than one of the selected categories. To correct this, specify DISTINCT before the column list. I have included all product columns via product.* because I do not know which columns you need, but you should probably change that to the specific columns you want.
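The duplicate rows are easy to reproduce. A minimal sketch on SQLite through Python's sqlite3 module, standing in for SQL Server (so the temp table is my_temp rather than #my_temp); the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (product_id INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE ProductCategory (ProductID INTEGER, CategoryID INTEGER);
CREATE TEMP TABLE my_temp (category_id INTEGER PRIMARY KEY);

INSERT INTO product VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO ProductCategory VALUES (1, 3), (1, 5), (2, 5);
INSERT INTO my_temp VALUES (3), (5);
""")

join_sql = """
    SELECT {distinct} product.product_id
    FROM product
    JOIN ProductCategory ON ProductCategory.ProductID = product.product_id
    JOIN my_temp ON my_temp.category_id = ProductCategory.CategoryID
    ORDER BY product.product_id
"""

# Product 1 sits in two selected categories, so the plain join returns it twice.
plain = [r[0] for r in conn.execute(join_sql.format(distinct=""))]
deduped = [r[0] for r in conn.execute(join_sql.format(distinct="DISTINCT"))]
```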

Related

SQL Server need to find suppliers who supply the most different products

I have two tables I need information from, Products and Suppliers. Both these tables have a SupplierID column I am trying to use to join them together to retrieve the right info.
The output I need is SupplierID and ContactName from the Suppliers table. The correct output should contain only two suppliers, so I attempted something like this, but ran into a conversion error converting nvarchar value to a data type int. I am not supposed to count how many products they supply but aggregate functions seem like the best method to me.
SELECT TOP 2 ContactName, COUNT(Products.SupplierID) AS Supply
FROM Products
LEFT JOIN Suppliers ON Suppliers.ContactName = Products.SupplierID
GROUP BY Products.SupplierID, Suppliers.ContactName
ORDER BY Supply;
I have tried many different queries but none will work. I am confused on how to join these tables without running into errors. All the products have a unique ProductID as well. The correct output should look something like this:
7 John Smith
12 John Sample
"Both these tables have a SupplierID column I am trying to use to join them together to retrieve the right info"
If so, you should be joining on that column across tables.
Also, it is a good practice to use table aliases and prefix each column with the table it belongs to.
Another remark is that if you want suppliers that sell the most different products, then you want to order by descending count (not ascending).
Finally, if you want to left join, then you should start from the suppliers and then bring in the products, not the other way around.
Consider:
select top 2
s.SupplierID,
s.ContactName,
COUNT(p.ProductID) as Supply -- counts matched products only; COUNT(*) would report 1 for a supplier with none
from Suppliers s
left join Products p on p.SupplierID = s.SupplierID
group by s.SupplierID, s.ContactName
order by Supply desc;
You're currently joining on two different fields:
on Suppliers.ContactName = Products.SupplierID
Presumably this should be as follows?
on Suppliers.SupplierID = Products.SupplierID
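A small sketch of the corrected join, run on SQLite via Python's sqlite3 module rather than SQL Server, so LIMIT 2 replaces TOP 2. The supplier and product rows are invented for illustration, and the extra ORDER BY key is only there to make ties deterministic. It also shows why the choice of COUNT argument matters after a left join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Suppliers (SupplierID INTEGER PRIMARY KEY, ContactName TEXT);
CREATE TABLE Products (ProductID INTEGER PRIMARY KEY, SupplierID INTEGER);

INSERT INTO Suppliers VALUES
    (5, 'Jane Doe'), (7, 'John Smith'), (12, 'John Sample'), (20, 'No Stock');
INSERT INTO Products VALUES (1, 7), (2, 7), (3, 7), (4, 12), (5, 12), (6, 5);
""")

# Join on SupplierID on both sides and sort by the count, largest first.
top_two = conn.execute("""
    SELECT s.SupplierID, s.ContactName, COUNT(p.ProductID) AS Supply
    FROM Suppliers s
    LEFT JOIN Products p ON p.SupplierID = s.SupplierID
    GROUP BY s.SupplierID, s.ContactName
    ORDER BY Supply DESC, s.SupplierID
    LIMIT 2
""").fetchall()

# For a supplier with no products the left join still yields one (all-NULL) row:
# COUNT(*) counts that row, COUNT(p.ProductID) does not.
star_count, product_count = conn.execute("""
    SELECT COUNT(*), COUNT(p.ProductID)
    FROM Suppliers s
    LEFT JOIN Products p ON p.SupplierID = s.SupplierID
    WHERE s.SupplierID = 20
""").fetchone()
```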

How do I show all records from one table for each semi-related record in another?

This problem is extremely difficult to explain and I did not even know how to title it correctly so I do apologise about that in advance.
I have a view of products which is as follows:
Product
ProductId
ProductName
In my database, I have a ratecard table and a ratecard product table. A ratecard may be titled "Tier 1 Customers" and the corresponding RatecardProduct records would be prices for products for that particular ratecard. It may only contain prices for a few products and not all of them.
Ratecard
RatecardId
RatecardName
RatecardProduct
RatecardProductId
RatecardId
ProductId
UnitPrice
The problem is that I need to create a view which displays all products for all ratecards. If the ratecard / product combination does not have a corresponding unit price in my ratecardproduct table, it should show NULL or 0.
Imagine I have 10 products and 4 ratecards; the view would contain 40 records, even if the RatecardProduct table was completely empty.
The reason I need to do this is because I am populating a gridview on viewing a ratecard and I do not want to have to do a round trip for each row to ascertain if there is a corresponding price.
Thank you so much in advance.
Generate all the rows. Then use left join to bring in the data:
select p.*, r.*, coalesce(rp.unitprice, 0) as unitprice
from products p cross join
ratecards r left join
ratecardproduct rp
on rp.productid = p.productid and rp.ratecardid = r.ratecardid;
Or don't use coalesce() if you want NULL.
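A sketch of that cross join plus left join on a tiny data set, using SQLite through Python's sqlite3 module instead of SQL Server. Table and column names follow the question; the rows are invented. Three products and two ratecards should give six rows whether or not any prices exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Product (ProductId INTEGER PRIMARY KEY, ProductName TEXT);
CREATE TABLE Ratecard (RatecardId INTEGER PRIMARY KEY, RatecardName TEXT);
CREATE TABLE RatecardProduct (
    RatecardProductId INTEGER PRIMARY KEY,
    RatecardId INTEGER,
    ProductId INTEGER,
    UnitPrice REAL
);
INSERT INTO Product VALUES (1, 'A'), (2, 'B'), (3, 'C');
INSERT INTO Ratecard VALUES (1, 'Tier 1 Customers'), (2, 'Tier 2 Customers');
INSERT INTO RatecardProduct VALUES (1, 1, 2, 9.5);  -- the only price on file
""")

grid_sql = """
    SELECT r.RatecardId, p.ProductId, rp.UnitPrice
    FROM Product p
    CROSS JOIN Ratecard r                 -- every product for every ratecard
    LEFT JOIN RatecardProduct rp          -- price where one exists, else NULL
        ON rp.ProductId = p.ProductId AND rp.RatecardId = r.RatecardId
    ORDER BY r.RatecardId, p.ProductId
"""

grid = conn.execute(grid_sql).fetchall()

# Even with no prices at all, the grid keeps its full size.
conn.execute("DELETE FROM RatecardProduct")
grid_without_prices = conn.execute(grid_sql).fetchall()
```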

Faceted search count in SQL

I'm trying to implement faceted search count in SQL. For simplicity, I'll take the data that already exists on https://www.w3schools.com/sql/trysql.asp?filename=trysql_select_all. A product has a category and a category belongs to many products, so it's a one-to-many relationship. I'm interested in filtering products by category, so if there are multiple categories selected, the query will get products whose category Id can be found in the list of Id's that the user filtered by (So it's an OR operation between categories). But this is not the challenge that I'm currently facing.
The query below tries to answer the question: For every category that exists, how many products would I get if that category was among the selected categories?
SELECT
cat.CategoryId,
p.Count
FROM Categories AS cat
LEFT JOIN (SELECT
COUNT(DISTINCT ProductId) AS Count
FROM Products AS p
WHERE p.CategoryId IN #CategoryIds
OR p.CategoryId = cat.CategoryId) AS p
The #CategoryIds is a parameter that is going to be handled by an ORM. For a more concrete scenario, you can just replace it with the list (1, 2) (so you can consider the case in which the user wants to filter all products that have the category 1 or 2).
The issue is that the word "cat" (on the last line) is not recognised so the query just throws an error.
Is there a way to make the second table recognise the first table's alias "cat" that I want to LEFT JOIN with? Or is there a better solution to this problem that I didn't take into consideration?
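LEFT JOIN requires a join predicate (an ON clause), and a derived table cannot reference an alias from the outer query. Some DBMSs, such as MS SQL Server, support CROSS APPLY, which lifts that restriction. Your query should be equivalent to the following one, which should run on every SQL database known to me: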
SELECT
cat.CategoryId,
COUNT(ProductId)
FROM Categories AS cat
LEFT JOIN Products P ON p.CategoryId=cat.CategoryId OR p.CategoryId IN [list]
GROUP BY cat.CategoryId
Or, if you are using SQL Server:
SELECT
cat.CategoryId,
p.Count
FROM Categories AS cat
CROSS APPLY (SELECT COUNT(DISTINCT ProductId) AS Count
FROM Products AS p
WHERE p.CategoryId IN #CategoryIds
OR p.CategoryId = cat.CategoryId) AS p
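A sketch of the portable LEFT JOIN form with the selected IDs bound as parameters. SQLite (via Python's sqlite3 module) is used as the test bed because it has no CROSS APPLY; the function name facet_counts and the sample rows are hypothetical.

```python
import sqlite3


def facet_counts(conn, selected):
    """Per category: products matched if it were added to the selected set."""
    marks = ",".join("?" * len(selected))  # one placeholder per selected id
    sql = f"""
        SELECT cat.CategoryId, COUNT(p.ProductId)
        FROM Categories AS cat
        LEFT JOIN Products AS p
            ON p.CategoryId = cat.CategoryId
            OR p.CategoryId IN ({marks})
        GROUP BY cat.CategoryId
        ORDER BY cat.CategoryId
    """
    return dict(conn.execute(sql, selected).fetchall())


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Categories (CategoryId INTEGER PRIMARY KEY);
CREATE TABLE Products (ProductId INTEGER PRIMARY KEY, CategoryId INTEGER);
INSERT INTO Categories VALUES (1), (2), (3), (4);
INSERT INTO Products VALUES (1, 1), (2, 1), (3, 2), (4, 3), (5, 3), (6, 3);
""")

counts = facet_counts(conn, [1, 2])
```

With categories 1 and 2 selected, three products already match; category 3 would add its three products, while the empty category 4 adds none.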

Ensure in many-to-many relationship at least one relationship is primary

Sorry for the bad title, if you can think of a better one, let me know.
Many-to-many relationship using tables.
Product
ProductCategory
Category
In the ProductCategory table I have a boolean column, primarycategory.
Each product must have a primary category.
I want to find all products in my database which don't have a primarycategory.
Note: I have assumed field names in tables other than the one you specified.
This should return a distinct list of product IDs that have no primary category. SQL Server's max() does not accept a bit operand, so cast the flag to int before aggregating.
select
pc.product
from
ProductCategory pc
group by
pc.product
having
max(cast(pc.primarycategory as int)) = 0
The above query assumes that all products have at least one category. If not, try the following, which starts from Product so that products with no category rows are also returned:
select
p.id
from
Product p
left join
ProductCategory pc on p.id = pc.product
group by
p.id
having
max(isnull(cast(pc.primarycategory as int), 0)) = 0
Assuming true = value 1, try this:
Select Product From Product p
Where Not Exists (Select * From ProductCategory
Where Product = p.Product
And primarycategory = 1 )
But if you have control over this database, move the PrimaryCategory column to the Product table (and populate it with the category identifier itself, not a boolean); that is where it belongs in a properly normalized schema.
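A sketch comparing the grouped query with the NOT EXISTS query on three cases: a product with a primary category, one with categories but no primary flag, and one with no categories at all. It runs on SQLite via Python's sqlite3 module, where the flag is a plain integer; the column names id and product are the same assumptions the answers make, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Product (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE ProductCategory (product INTEGER, category INTEGER, primarycategory INTEGER);

INSERT INTO Product VALUES (1, 'has primary'), (2, 'no primary flag'), (3, 'no categories');
INSERT INTO ProductCategory VALUES (1, 10, 1), (1, 11, 0), (2, 10, 0), (2, 12, 0);
""")

# Grouping the link table alone cannot see products that have no rows in it.
grouped_only = [r[0] for r in conn.execute("""
    SELECT pc.product
    FROM ProductCategory pc
    GROUP BY pc.product
    HAVING MAX(pc.primarycategory) = 0
    ORDER BY pc.product
""")]

# NOT EXISTS starts from Product, so uncategorized products are reported too.
not_exists = [r[0] for r in conn.execute("""
    SELECT p.id
    FROM Product p
    WHERE NOT EXISTS (SELECT 1
                      FROM ProductCategory pc
                      WHERE pc.product = p.id
                        AND pc.primarycategory = 1)
    ORDER BY p.id
""")]
```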

SQL Syntax for Complex Scenario (Deals)

I have a complex query to write but cannot figure it out.
Here are my tables:
Sales --one row for each sale made in the system
SaleProducts --one row for each line in the invoice (similar to OrderDetails in NW)
Deals --a list of possible deals/offers that a sale may be entitled to
DealProducts --a list of quantities of products that must be purchased in order to get a deal
Now I'm trying to write a query which will tell me, for each sale, which deals it may qualify for.
The relevant fields are:
Sales: SaleID (PK)
SaleProducts: SaleID (FK), ProductID (FK)
Deals: DealID (PK)
DealProducts: DealID(FK), ProductID(FK), Mandatories (int) for required qty
I believe that I should be able to use some sort of cross join or outer join, but it isn't working.
Here is one sample (of about 30 things I tried):
SELECT DealProducts.DealID, DealProducts.ProductID, DealProducts.Mandatories,
viwSaleProductCount.SaleID, viwSaleProductCount.ProductCount
FROM DealProducts
LEFT OUTER JOIN viwSaleProductCount
ON DealProducts.ProductID = viwSaleProductCount.ProductID
GROUP BY DealProducts.DealID, DealProducts.ProductID, DealProducts.Mandatories,
viwSaleProductCount.SaleID, viwSaleProductCount.ProductCount
The problem is that it doesn't show any product deals that are not fulfilled (probably because of the ProductID join). I also need sales that don't meet the requirements to show up; then I can filter out any SaleID that exists in this query where AmountBought < Mandatories, etc.
Thank you for your help
I'm not sure how well I follow your question (where does viwSaleProductCount fit in?) but it sounds like you will want an outer join to a subquery that returns a list of deals along with their associated products. I think it would go something like this:
Select *
From Sales s Inner Join SaleProducts sp on s.SaleID = sp.SaleID
Left Join (
Select *
From Deals d Inner Join DealProducts dp on d.DealID = dp.DealId
) as sub on sp.ProductID = sub.ProductID
You may need to add logic to ensure that deals don't appear twice, and of course replace * with the specific column names you'd need in all cases.
edit: if you don't actually need any information from the sale or deal tables, something like this could be used:
Select sp.SaleID, sp.ProductID, sp.ProductCount, dp.DealID, dp.Mandatories
From SaleProducts sp
Left Join DealProducts as dp on sp.ProductID = dp.ProductID
If you need to do grouping/aggregation on this result you will need to be careful to ensure that deals aren't counted multiple times for a given sale (Count Distinct may be appropriate, depending on your grouping). Because it is a Left Join, you don't need to worry about excluding sales that don't have a match in DealProducts.
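One possible way to get a row for every sale and deal pair, fulfilled or not, is to cross join sales with the deal requirements and left join the purchased quantities. This is a sketch under stated assumptions, not the answer's own query: it runs on SQLite via Python's sqlite3 module, it assumes SaleProducts has a ProductCount column with one row per (SaleID, ProductID), and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sales (SaleID INTEGER PRIMARY KEY);
CREATE TABLE SaleProducts (SaleID INTEGER, ProductID INTEGER, ProductCount INTEGER);
CREATE TABLE DealProducts (DealID INTEGER, ProductID INTEGER, Mandatories INTEGER);

INSERT INTO Sales VALUES (100), (101);
INSERT INTO SaleProducts VALUES (100, 10, 2), (100, 11, 1), (101, 10, 1), (101, 12, 3);
-- Deal 1 needs 2 of product 10 and 1 of product 11; deal 2 needs 1 of product 12.
INSERT INTO DealProducts VALUES (1, 10, 2), (1, 11, 1), (2, 12, 1);
""")

# One row per (sale, deal). qualifies is 1 only if every required product was
# bought in at least the mandatory quantity; a missing product counts as 0.
sale_deals = conn.execute("""
    SELECT s.SaleID,
           dp.DealID,
           MIN(CASE WHEN COALESCE(sp.ProductCount, 0) >= dp.Mandatories
                    THEN 1 ELSE 0 END) AS qualifies
    FROM Sales s
    CROSS JOIN DealProducts dp
    LEFT JOIN SaleProducts sp
        ON sp.SaleID = s.SaleID AND sp.ProductID = dp.ProductID
    GROUP BY s.SaleID, dp.DealID
    ORDER BY s.SaleID, dp.DealID
""").fetchall()
```

Filtering on qualifies = 1 gives the deals each sale is entitled to, and qualifies = 0 gives the unfulfilled ones the question asks to keep visible.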